ZXA10 C6XX V1.1.0 Troubleshooting Guide V1.1


Confidential▲

ZXA10 C6XX V1.1.0


Troubleshooting Guide V1.1
For more documents, please refer to Document Map:

Fixed Network Product Service Strategy

http://tsm.zte.com.cn/tsm/FileCenter/File.aspx?Mode=read&FileID=30662900

All rights reserved. No spreading without permission of ZTE. 1



Revision History

Product Version    Document Version    Serial Number    Reason for Revision
                                                         First published
                                                         Modules added

Author

Document Version    Date           Prepared by    Reviewed by    Approved by
1.0                 March, 2019    Lin Guoxin
1.1


Applicable to: PON OLT commissioning and maintenance personnel

Recommendation: Before reading this document, you should be familiar with the
following knowledge and skills.
SEQ    Knowledge and skills    Reference material

Follow-up documents: After reading this document, you may need the following
information.
SEQ    Reference material    Information


Contents
Overview
Troubleshooting methods when SNMP management is blocked
    1. Confirm EMS matching version
    2. Check SNMP-related configurations
        (1)Check whether the version that the NE snmp corresponds to is enabled
        (2)Check SNMP community names configured on EMS and NE
        (3)Check whether view on EMS is consistent with that defined
    3. Check network states
        (1)Check whether the route entry is set up
        (2)Check whether it is connective with the EMS server’s address
        (3) Perform snmp ping test on EMS
    4. Check states of SNMP packets receiving / transmission statistics
    5. Enable SNMP log to observe
    6. View statistics of the main control’s control-panel SNMP protocol
    7. View the value of control-panel SNMP rate limit
    8. View CPU queue rate limit configuration
Troubleshoot when the whole C600 NE is disconnected
    1. Symptoms and features of such disconnection problems
    2. Collect information for such disconnection faults
        (1)Collect information of card, version, patch and cpu usage ratio and so on
        (2)Collect information of all alarms and logs on the NE
        (3)All alarms on EMS
        (4)Suspension File Explanation
        (5)Query suspension files
        (6)Remotely view suspension files
        (7)Upload suspension files
        (8)Collect CLI and diagnosis mode log file
        (9)Collect information under OAM diag shell mode
Troubleshooting when C600 line cards are HWONLINE
    1. Features and corresponding state to light on the RUN indicator
    2. Collect fault information when the card is HWONLINE
    3. The card becomes from INSERVICE into HWONLINE state
        (1)Inter-card communication is interrupted
        (2)The card is repeatedly restarted
        (3)Abnormal card tasks
        (4)System control process or service support process is faulty
C600 NE Service Troubleshooting Guide
    1. Introduction of C600 NE service channel model
    2. Unicast Service Troubleshooting Guide
        (1)A brief introduction of upstream unicast data flow forwarding
        (2)Unicast troubleshooting idea
        (3)Unicast troubleshooting steps
    3. Multicast Service Troubleshooting Guide
        (1)Multicast fault diagnosis idea
        (2)Upstream protocol packets handling procedures
        (3)Downstream protocol packets handling procedures
        (4)Downstream data flow forwarding procedures
        (5)Multicast debug function
        (6)Multicast log function
        (7) Multicast troubleshooting method
    4. GPON ONUs Fail To Register
        (1)Troubleshooting procedures when GPON ONUs fail to register
        (2)Symptoms when GPON ONUs fail to register
        (3)Possible reasons for GPON ONUs to fail to register
        (4)Method to collect information when GPON ONU registration is abnormal
    5. EPON ONUs fail to register
        (1)Troubleshooting procedures when EPON ONUs fail to register
        (2)Symptoms when EPON ONUs fail to register
        (3)Troubleshooting steps when EPON ONUs fail to register
Introduce C600 Maintenance & Locating Method
    1. Capture packets of CPU remotely
        (1)Main control card-based protocol packet’s cpu packet capturing function
        (2)Line card protocol packet’s cpu packet capturing function
    2. Packet statistics function
        (1)Unicast flow statistics function
        (2)CPU protocol packet statistics
        (3)Protocol packet statistics shown as entering the line card’s CPU
        (4)Multicast service flow statistics


Overview

ZXA10 C6xx V1.1.0 is the main software version of the newly launched C6xx series.
Apart from the similar customer-facing EMS operating interfaces, work order
processes, and PON-side maintenance methods and mechanisms, the C6xx differs
greatly from the C300 in system platform, equipment hardware and operating
commands, so maintenance and troubleshooting also differ. This manual is intended
to familiarize after-sales maintenance personnel with the equipment and to provide
support and guidance for on-site maintenance. It summarizes maintenance and
troubleshooting for NE SNMP management, the device system, hardware and the
various services.

Contents in the document:

1. NE SNMP management troubleshooting guide

2. Service troubleshooting methods, including unicast faults, multicast faults and
forwarding plane faults

3. ONU registration troubleshooting guide

4. System troubleshooting guide

5. NE hardware troubleshooting guide

6. Summary of C600 NE fault locating methods


Troubleshooting methods when SNMP management is blocked

1. Confirm EMS matching version

First, make sure that the EMS version matches the NE version. The EMS version
must be T125, patched T135, or newer; otherwise, C600 NEs may not be managed
normally on the EMS, or some functions may be unavailable. For details, consult
the EMS support personnel.

2. Check SNMP-related configurations

Run the show snmp config command on the NE to display all SNMP configurations,
and confirm the following items step by step.

(1)Check whether the version that the NE snmp corresponds to is enabled

First, confirm that the SNMP version used by snmp-server is enabled. If the
version used is v2c, the following configuration is needed:

snmp-server version v2c enable

(2)Check SNMP community names configured on EMS and NE

Check whether the community names configured on the EMS and on the NE are
consistent.

The community name on the NE is displayed in encrypted form. If you have
forgotten the community name on the NE, reconfigure a community name on the NE
that matches the one configured on the EMS, as shown in the following:

ZXAN(config)#snmp-server community public1 view AllView rw


(3)Check whether view on EMS is consistent with that defined

On the NE, the default snmp-server view is AllView, as shown in the following:

snmp-server view AllView internet included

Therefore, when a community name is configured, its view must be AllView. The
view name is case sensitive and must match exactly.
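
The three configuration checks above can be sketched as a small offline script that is run against a saved copy of the configuration. This is only a sketch: the guide does not show the output of show snmp config, so the code assumes that the captured text contains lines in the same form as the commands quoted above. The function name check_snmp_config is hypothetical. Because the community name is encrypted on the NE, only the view bound to it is checked.

```python
import re

def check_snmp_config(config_text, version="v2c", view="AllView"):
    """Return a list of problems found in captured SNMP configuration text.

    Assumption: lines look like the commands quoted in this guide, e.g.
    'snmp-server version v2c enable' and
    'snmp-server community <name> view AllView rw'.
    """
    lines = [ln.strip() for ln in config_text.splitlines()]
    problems = []

    if f"snmp-server version {version} enable" not in lines:
        problems.append(f"SNMP version {version} is not enabled")

    # View names are case sensitive, so they are compared exactly.
    bound_views = []
    for ln in lines:
        m = re.match(r"snmp-server community .* view (\S+) (ro|rw)$", ln)
        if m:
            bound_views.append(m.group(1))
    if not bound_views:
        problems.append("no snmp-server community line found")
    elif view not in bound_views:
        problems.append(f"no community is bound to view {view}")

    if not any(ln.startswith(f"snmp-server view {view} ") for ln in lines):
        problems.append(f"view {view} is not defined")
    return problems
```

An empty list means that none of the three checks failed; the community name itself still has to be compared with the EMS by hand.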

3. Check network states

(1)Check whether the route entry is set up

Check whether the entry that corresponds to the NE’s management address has been
generated and whether its state is up (Admin, Phy and Prot columns), as shown in
the following:


ZXAN(config)#show ip interface brief
Interface IP-Address Mask Admin Phy Prot
mgmt_eth 192.168.124.100 255.255.255.0 up up up
vlan100 10.1.1.1 255.255.255.0 up up down
ZXAN(config)#
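
When many NEs have to be checked, the table above can be parsed from a saved CLI capture. The following is a minimal sketch that assumes the six-column layout shown in the sample output; the function names are hypothetical.

```python
def parse_ip_interface_brief(output):
    """Parse 'show ip interface brief' text into a list of row dicts."""
    rows = []
    for line in output.splitlines():
        parts = line.split()
        # A data row has six columns and ends with three up/down states.
        if len(parts) == 6 and all(p in ("up", "down") for p in parts[3:]):
            rows.append({
                "interface": parts[0], "ip": parts[1], "mask": parts[2],
                "admin": parts[3], "phy": parts[4], "prot": parts[5],
            })
    return rows

def interfaces_not_up(rows):
    """Return the names of interfaces whose three states are not all 'up'."""
    return [r["interface"] for r in rows
            if (r["admin"], r["phy"], r["prot"]) != ("up", "up", "up")]
```

Applied to the sample output above, interfaces_not_up reports vlan100, whose Prot column is down.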

(2)Check whether it is connective with the EMS server’s address

Ping the IP address of the EMS server from the NE and check whether the ping
succeeds, or ping the NE from the EMS server to check connectivity. If the OLT NE
is connected through the out-band interface, the interface must be specified in
the ping command when pinging the EMS server from the NE (vrf mng in the example
below).

The following example pings the EMS from the NE’s out-band address 10.1.1.1,
assuming that the EMS IP address is 10.1.1.2. In this sample output the success
rate is 0 percent, which means that the EMS could not be reached:

ZXAN#ping vrf mng 10.1.1.2
sending 5,100-byte ICMP echo(es) to 10.1.1.2,timeout is 2 second(s).
.....
Success rate is 0 percent(0/5).
[finish]
ZXAN#

If the OLT is connected through the in-band management address, ping the IP
address directly.
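
The ping result can also be read from a saved capture. The following sketch extracts the summary line and assumes the wording "Success rate is N percent(a/b)" shown in the sample output; the function name ping_success is hypothetical.

```python
import re

def ping_success(output):
    """Extract the success rate from NE ping output.

    Returns a dict with percent, received and sent, or None when the
    summary line is missing from the captured text.
    """
    m = re.search(r"Success rate is (\d+) percent\s*\((\d+)/(\d+)\)", output)
    if m is None:
        return None
    percent, received, sent = (int(g) for g in m.groups())
    return {"percent": percent, "received": received, "sent": sent}
```

A result with received equal to 0 corresponds to the failed ping in the sample output.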


(3) Perform snmp ping test on EMS

Run the snmp ping test in the EMS interface to check SNMP interworking between
the EMS and the NE. For example, set the EMS to send 100 snmp ping packets with a
timeout of 2 s, and observe whether SNMP reply packets are received on the EMS.

4. Check states of SNMP packets receiving / transmission statistics

Use the show snmp command to check the SNMP packets received and transmitted by
the NE. In general, every increase in the request counters (Get-request, Get-next
and Set-request PDUs) should be matched one to one by an increase in Response
PDUs. Focus mainly on these counters:
ZXAN(config)#show snmp
27571 SNMP packets input
0 Bad SNMP version errors
30 Unknown community name
0 Illegal operation for community name supplied
76245 Number of requested variables
300 Number of altered variables
17060 Get-request PDUs
10392 Get-next PDUs
89 Set-request PDUs
570806 SNMP packets output
0 Too big errors (Maximum packet size 8192)
1 No such name errors
0 Bad values errors
257 General errors
27524 Response PDUs
543282 Trap PDUs SNMP
0 Input ASN parse errors packets
0 Proxy drops packets
0 Unknown security model packets
0 Unknown PDU handler packets
0 Unsupported security level packets
0 Not in time-window packets
0 Unknown user name packets
0 Unknown engine ID packets
0 Wrong digest packets
0 Decryption error packets
SNMP version v1: enable
SNMP version v2c: enable
SNMP version v3: enable
SNMP agent listen port: 161
SNMP notification listen port: 162
SNMP command-responser: enable
SNMP proxy-forwarder: disable
ZXAN(config)# show snmp
27571 SNMP packets input
0 Bad SNMP version errors
30 Unknown community name
0 Illegal operation for community name supplied
76245 Number of requested variables
300 Number of altered variables
17060 Get-request PDUs
10392 Get-next PDUs
89 Set-request PDUs
570817 SNMP packets output
0 Too big errors (Maximum packet size 8192)
1 No such name errors
0 Bad values errors
257 General errors
27524 Response PDUs
543293 Trap PDUs SNMP
0 Input ASN parse errors packets
0 Proxy drops packets
0 Unknown security model packets
0 Unknown PDU handler packets
0 Unsupported security level packets
0 Not in time-window packets
0 Unknown user name packets
0 Unknown engine ID packets
0 Wrong digest packets
0 Decryption error packets
SNMP version v1: enable
SNMP version v2c: enable
SNMP version v3: enable
SNMP agent listen port: 161
SNMP notification listen port: 162
SNMP command-responser: enable
SNMP proxy-forwarder: disable
In addition, when the EMS performs an snmp ping to the NE, the number of packets
sent by the EMS should normally equal the increase in requests counted on the NE,
which shows that the NE has received the packets. When there are no other
requests, the increase in requests and the increase in responses on the NE must
also be equal.
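
The comparison of two show snmp snapshots can be sketched as follows. The code assumes the "value name" line layout and the counter names printed in the sample output; the function names are hypothetical. The sample output has no separate counter for GetBulk requests, so the request total used here may not cover every request type.

```python
import re

# Counter names as printed in the sample output of 'show snmp'.
REQUEST_KEYS = ("Get-request PDUs", "Get-next PDUs", "Set-request PDUs")

def parse_show_snmp(output):
    """Map each counter name to its integer value."""
    counters = {}
    for line in output.splitlines():
        m = re.match(r"\s*(\d+)\s+(\S.*?)\s*$", line)
        if m:
            counters[m.group(2)] = int(m.group(1))
    return counters

def request_response_delta(before, after):
    """Return (request increase, response increase) between two snapshots."""
    requests = sum(after.get(k, 0) - before.get(k, 0) for k in REQUEST_KEYS)
    responses = after.get("Response PDUs", 0) - before.get("Response PDUs", 0)
    return requests, responses
```

If the request increase is zero during an snmp ping test, the packets are not reaching the SNMP agent. If requests increase but responses do not, the NE receives the requests but does not answer them.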


5. Enable SNMP log to observe

To observe whether the NE receives SNMP packets, enable the SNMP log. The SNMP
log for the period of the disconnection can then be retrieved.
ZXAN(config)#snmp-server log enable all
ZXAN(config)#snmp-server log enable get
ZXAN(config)#snmp-server log enable set

In general, enabling the get log alone is enough. To view the SNMP log:


ZXAN(config)#logging snmp
ZXAN(config-log-snmp)#accept on

ZXAN(config)#show logging buffer snmp log


Log acceptance: on
Log matched: 0
Log in buffer: 0
Buffer occupied: 0.00%
Log file path: /datadisk0/LOG/SNMP/

6. View statistics of the main control’s control-panel SNMP protocol

Observe whether snmp packets are received:


ZXAN(config)#show control-panel rate limit statistics inband
pktType pktTotal pktDrop pps ppsPeak dropSec peakTime
all 0 0 0 0 0 -
icmp 0 0 0 0 0 -
ssh 0 0 0 0 0 -
other 0 0 0 0 0 -
igmp 0 0 0 0 0 -
dhcp 0 0 0 0 0 -
pppoe 0 0 0 0 0 -
snmp 0 0 0 0 0 -
telnet 0 0 0 0 0 -
bfd 0 0 0 0 0 -
stp 0 0 0 0 0 -
lacp 0 0 0 0 0 -
lldp 0 0 0 0 0 -
rip 0 0 0 0 0 -
bgp 0 0 0 0 0 -
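
The snmp row of this table can be checked from a saved capture. The sketch below assumes the seven-column layout shown above. Reading a non-zero pktDrop value as rate-limit drops is an interpretation of the column name, not something the guide states. The function names are hypothetical.

```python
def parse_rate_limit_statistics(output):
    """Parse 'show control-panel rate limit statistics' rows by packet type."""
    stats = {}
    for line in output.splitlines():
        parts = line.split()
        # Data rows: packet type, five integer counters, then peakTime.
        if len(parts) >= 7 and all(p.isdigit() for p in parts[1:6]):
            total, drop, pps, peak, drop_sec = (int(p) for p in parts[1:6])
            stats[parts[0]] = {
                "pktTotal": total, "pktDrop": drop, "pps": pps,
                "ppsPeak": peak, "dropSec": drop_sec,
                "peakTime": " ".join(parts[6:]),
            }
    return stats

def diagnose_snmp(stats):
    """Give a one-line reading of the snmp row."""
    snmp = stats.get("snmp")
    if snmp is None:
        return "no snmp row found"
    if snmp["pktTotal"] == 0:
        return "no SNMP packets counted"
    if snmp["pktDrop"] > 0:
        return "SNMP packets counted, some dropped"
    return "SNMP packets counted without drops"
```

For the sample output above, where every counter is 0, diagnose_snmp returns "no SNMP packets counted".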


7. View the value of control-panel SNMP rate limit

Observe whether control-panel SNMP rate limit is too small:


ZXAN(config)#show control-panel rate limit
rate limit(pps) icmp 250
rate limit(pps) ssh 512
rate limit(pps) other 2000
rate limit(pps) igmp 300
rate limit(pps) dhcp 160
rate limit(pps) pppoe 300
rate limit(pps) snmp 200
rate limit(pps) telnet 300
rate limit(pps) bfd 300
rate limit(pps) stp 500
rate limit(pps) lacp 600
rate limit(pps) lldp 500
rate limit(pps) rip 500
rate limit(pps) bgp 500
rate limit(pps) ospf 400
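
To judge whether a limit is too low, the configured value can be compared with a packet rate observed elsewhere, for example the ppsPeak column of the statistics command. The sketch below assumes the "rate limit(pps) type value" line layout shown above. The comparison rule itself is an assumption for illustration and is not taken from the guide; the function names are hypothetical.

```python
import re

def parse_rate_limits(output):
    """Map packet type to its configured rate limit in pps."""
    limits = {}
    for line in output.splitlines():
        m = re.match(r"\s*rate limit\(pps\)\s+(\S+)\s+(\d+)\s*$", line)
        if m:
            limits[m.group(1)] = int(m.group(2))
    return limits

def limit_exceeded(limits, pkt_type, observed_pps):
    """True when an observed packet rate is above the configured limit."""
    limit = limits.get(pkt_type)
    return limit is not None and observed_pps > limit
```

With the sample output above, parse_rate_limits gives 200 pps for snmp, so an observed peak of 250 pps would exceed the limit.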

8. View CPU queue rate limit configuration

Check whether the CPU queue rate limit is too low. This can be checked together
with the control-panel packet statistics.


ZXAN(config)#show control-panel cpu queue
QueueId Ratelimit(pps)
0 NONE
1 1024
2 256
3 1024
4 1024
5 256
6 256
7 NONE

If the problem is still not solved by the methods above, capture packets for
analysis. First capture packets on the EMS side. If the EMS sends packets but
receives no reply from the NE, or the NE’s replies are abnormal, mirror and
capture packets on the NE side to determine where the packets are lost or why
the replies are abnormal.


Troubleshoot when the whole C600 NE is disconnected

1. Symptoms and features of such disconnection problems

The connection between the EMS and the NE is lost in one of two ways: either the
EMS receives alarms and the NE then goes link-down, or the NE suddenly goes
link-down without any alarm on the EMS. In both cases the NE cannot be pinged
from the EMS server and cannot be accessed remotely.

2. Collect information for such disconnection faults

(1)Collect information of card, version, patch and cpu usage ratio and so on

View card information:


ZXAN#show card
Shelf Slot CfgType Port HardVer Status
---------------------------------------------------
1 4 GFGH 16 V1.0.0 INSERVICE
1 6 XFTO 8 V1.0.0 INSERVICE
1 8 XFPH 16 V1.0.0 INSERVICE
1 10 SFUL 0 V1.0.0 INSERVICE
1 11 SFUL 0 V1.0.0 STANDBY
1 17 PFGK 48 V1.0.0 INSERVICE
1 20 PRVR 0 V1.0.0 INSERVICE
1 23 FCVDB 0 V1.0.0 INSERVICE
ZXAN#

ZXAN#show equipment
Slot CfgType RealType HardVer McuVer BootVer EpldVer FpgaVer M-Code SN
--------------------------------------------------------------------------------
4 GFGH GFGHK V1.0.0 V1.0.3 V1.0.12 V1.7 N/A 000100 sn0004_GFGHK
6 XFTO XFTOA V1.0.0 V1.0.3 V1.0.12 V1.3 N/A 000100 100000000000
8 XFPH XFPHR V1.0.0 V1.0.3 V1.0.12 V1.7 N/A 000100 sn0000_XFPHR
10 SFUL SFUL V1.0.0 N/A V1.0.12 V2.2 V1.0.1 000100 sn0010__SFUL
11 SFUL SFUL V1.0.0 N/A V1.0.12 V2.2 V1.0.1 000100 sn0011__SFUL
17 PFGK PFGKA V1.0.0 V1.0.3 V1.0.12 V1.0 V1.0.0 000100 sn1234567_03
20 PRVR PRVR V1.0.0 V1.0.2 N/A N/A N/A 000001 709505600021
23 FCVDB FCVDB V1.0.0 V1.0.1 N/A N/A N/A 000100 sn0000000_23
ZXAN#

Version information and patch information:


ZXAN#show software
ZXA10 C600
ZTE ZXA10 Software, Version: V1.1.0, Release software
Copyright (c) 2018 by ZTE Corporation
Build on 2018/08/13 10:15:59
System image file is <sysdisk0: verset/base.set>, file size is 895,066,397 Bytes
[GFGH, shelf 1, slot 4]:
Active Packages on Node PFU-1/4/0:
Base-1.1.0:
Package: Base
Version: V1.1.0

[EFTH, shelf 1, slot 7]:


Active Packages on Node PFU-1/7/0:
Base-1.1.0:
Package: Base
Version: V1.1.0

[SFUL, shelf 1, slot 10]:


Active Packages on Node MPU-1/10/0:
Base-1.1.0:
Package: Base
Version: V1.1.0

[SFUL, shelf 1, slot 11]:


Active Packages on Node MPU-1/11/0:
Base-1.1.0:
Package: Base
Version: V1.1.0

ZXAN#show install
Fpga-1.1.0-70820:
Package: Fpga
Version: 1.1.0
Install from: Fpga_V1.1.0.45332.pkg, size 81,563,583 Bytes, build on 2018-08-08 03:29:55

Boot-1.1.0:
Package: Boot
Version: 1.1.0
Install from: Boot_V1.1.0.41.pkg, size 48,096,297 Bytes, build on 2018-07-24 01:28:27

Epld-1.1.0:
Package: Epld
Version: 1.1.0
Install from: Epld_V1.1.0.45332.pkg, size 10,455,918 Bytes, build on 2018-08-08 03:27:24

Mcu-1.1.0:
Package: Mcu
Version: 1.1.0
Install from: Mcu_V1.1.0.44.pkg, size 18,770,517 Bytes, build on 2018-08-04 01:43:29

Base-1.1.0-1:
Package: Base
Version: V1.1.0
Install from: base.set, size 895,066,397 Bytes, build on 2018-08-13 10:15:59

ZXAN#show install active


Boot-1.1.0:
Package: Boot
Version: 1.1.0

Epld-1.1.0:
Package: Epld
Version: 1.1.0

Mcu-1.1.0:
Package: Mcu
Version: 1.1.0

Base-1.1.0-1:
Package: Base
Version: V1.1.0

Fpga-1.1.0-1238:
Package: Fpga
Version: 1.1.0

ZXAN#
Read CPU usage ratio information:
If the CPU usage ratio of the main control card or of a line card is very high,
collect information from that card directly.
ZXAN#show processor

================================================================================
================================================================================
Character: CPU current character in system
MSC : Master-SC in Cluster System
SSC : Slave-SC in Cluster System
N/A : None-SC in Cluster System
CPU(5s) : CPU usage ratio measured in 5 seconds
CPU(1m) : CPU usage ratio measured in 1 minute
CPU(5m) : CPU usage ratio measured in 5 minutes
Peak : CPU peak usage ratio measured in 1 minute
PhyMem : Physical memory (megabyte)
FreeMem : Free memory (megabyte)
Mem : Memory usage ratio
================================================================================
Character CPU(5s) CPU(1m) CPU(5m) Peak PhyMem FreeMem Mem
================================================================================
PFU-1/4/0 N/A 20% 20% 20% 22% 2048 1144 44.141%
--------------------------------------------------------------------------------
PFU-1/6/0 N/A 6% 5% 5% 6% 1024 253 75.293%
--------------------------------------------------------------------------------
PFU-1/7/0 N/A 7% 7% 7% 7% 2048 1419 30.713%
--------------------------------------------------------------------------------
MPU-1/10/0 MSC 9% 9% 9% 9% 8192 2726 66.724%
--------------------------------------------------------------------------------
MPU-1/11/0 SSC 6% 5% 5% 6% 8192 5257 35.828%
--------------------------------------------------------------------------------
ZXAN#
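
Finding the cards with high CPU usage can be sketched as a parser over a saved capture of show processor. The sketch assumes the nine-column data rows shown above (node, character, four CPU percentages, two memory sizes, memory percentage). The default threshold of 60 is an assumption chosen for illustration, and the function names are hypothetical.

```python
def parse_show_processor(output):
    """Parse the data rows of 'show processor' into dicts."""
    rows = []
    for line in output.splitlines():
        parts = line.split()
        if (len(parts) == 9
                and all(p.endswith("%") for p in parts[2:6])
                and parts[6].isdigit() and parts[7].isdigit()
                and parts[8].endswith("%")):
            rows.append({
                "node": parts[0], "character": parts[1],
                "cpu_5s": int(parts[2].rstrip("%")),
                "cpu_1m": int(parts[3].rstrip("%")),
                "cpu_5m": int(parts[4].rstrip("%")),
                "peak": int(parts[5].rstrip("%")),
                "phy_mem_mb": int(parts[6]),
                "free_mem_mb": int(parts[7]),
                "mem_pct": float(parts[8].rstrip("%")),
            })
    return rows

def busy_nodes(rows, cpu_threshold=60):
    """Return nodes whose 5-minute or peak CPU usage reaches the threshold."""
    return [r["node"] for r in rows
            if r["cpu_5m"] >= cpu_threshold or r["peak"] >= cpu_threshold]
```

For the sample output above no node reaches the threshold, so busy_nodes returns an empty list.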
Version running time:
ZXAN#show clock
13:58:51 Tue Apr 25 2017 UTC

ZXAN#show card slotno 4


Config-Type : GFGH
Status : INSERVICE
Port-Number : 16
Serial-Number : sn0004_GFGHK
Part-Number : pn0004_GFGHK
Hardware-VER : V1.0.0
C-Code : FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
SubCard : N/A
Uptime : 6 Days, 23 Hours, 14 Minutes, 11 Seconds
LastResetReason : Card reset due to software reason.
ZXAN#
Fan and temperature information:
ZXAN#show fan
[shelf 1 LCC]:
Fan speed mini level: 1
Slot 23: online
Fan speed mode: auto
Fan name: FAN
Index Status Speed
0 work normally 34%
1 work normally 34%
2 work normally 34%
3 work normally 34%
4 work normally 34%
5 work normally 34%
6 work normally 34%
7 work normally 34%

ZXAN#
ZXAN#show temperature detail
-------------------------------------------------------------------------
BoardType : Type of board
I2C : Inter integrated circuit
Addr : Address of check point
Description: Temperature point information
Status : Status of check point
Minor : Slight alarm value (celsius)
Major : Serious alarm value (celsius)
Fatal : Fatal alarm value (celsius)
Overheat : The high-threshold of temperature point (celsius)
Temper : Current temperature (celsius)
-------------------------------------------------------------------------

[shelf 1, slot 4]
BoardType I2C Addr Description Status Minor Major Fatal Overheat Temper
-------------------------------------------------------------------------
GFGH 0 1 cpu Normal 100 108 115 120 40
GFGH 0 2 np temp Normal 100 108 115 120 54

[shelf 1, slot 6]
BoardType I2C Addr Description Status Minor Major Fatal Overheat Temper
-------------------------------------------------------------------------
XFTO 0 1 cpu Normal 100 108 115 120 42
XFTO 0 2 np temp Normal 100 108 115 120 57

[shelf 1, slot 10]


BoardType I2C Addr Description Status Minor Major Fatal Overheat Temper
-------------------------------------------------------------------------
SFUL 0 1 mp cpu Normal 100 108 115 120 49
SFUL 0 3 sf3600 Normal 100 108 115 120 65

[shelf 1, slot 11]


BoardType I2C Addr Description Status Minor Major Fatal Overheat Temper
-------------------------------------------------------------------------
SFUL 0 1 mp cpu Normal 100 108 115 120 51
SFUL 0 3 sf3600 Normal 100 108 115 120 55
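
The temperature table can be screened for check points that are close to the Minor alarm value. The sketch below assumes the row layout shown above, in which each data row ends with the five integers Minor, Major, Fatal, Overheat and Temper, and the description may contain spaces (for example "np temp"). The margin of 10 degrees is an arbitrary illustrative value, and the function names are hypothetical.

```python
import re

def parse_temperature_detail(output):
    """Parse 'show temperature detail' data rows, tagging each with its slot."""
    rows, slot = [], None
    for line in output.splitlines():
        m = re.match(r"\[shelf (\d+), slot (\d+)\]", line.strip())
        if m:
            slot = int(m.group(2))
            continue
        parts = line.split()
        # Data rows end with: Minor Major Fatal Overheat Temper.
        if len(parts) < 10 or not all(re.fullmatch(r"-?\d+", p) for p in parts[-5:]):
            continue
        minor, major, fatal, overheat, temper = (int(p) for p in parts[-5:])
        rows.append({
            "slot": slot, "board": parts[0],
            "point": " ".join(parts[3:-6]), "status": parts[-6],
            "minor": minor, "temper": temper, "headroom": minor - temper,
        })
    return rows

def points_near_threshold(rows, margin=10):
    """Return rows whose temperature is within 'margin' of the Minor value."""
    return [r for r in rows if r["headroom"] <= margin]
```

In the sample output above every check point is at least 35 degrees below its Minor value, so points_near_threshold returns an empty list.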

(2)Collect information of all alarms and logs on the NE

Cmdlog (the output may be very long; for NEs that have been running for a
particularly long time, it is recommended to collect the cmdlog log file instead):
ZXAN#show logging buffer cmdlog
Log acceptance: on
Log matched: 301
Log in buffer: 301
Buffer occupied: 32.38%
Log file path: /sysdisk0/usrcmd_log/

LogID:9691 2018-08-20 13:57:10.875 Class:Cmd [StartTime: 13:57:10 08-20-2018 EndTime: 13:57:10 08-20-2018 FlowID: 301 VtyNo: vty0 UserName: zte UserLevel: 15 IP: 10.60.182.160 HostName: ZXAN Result: success CMDLevel: 5 CMDLine: show temperature detail]
LogID:9690 2018-08-20 13:57:04.710 Class:Cmd [StartTime: 13:57:02 08-20-2018 EndTime: 13:57:04 08-20-2018 FlowID: 300 VtyNo: vty0 UserName: zte UserLevel: 15 IP: 10.60.182.160 HostName: ZXAN Result: success CMDLevel: 5 CMDLine: show fan]
Current alarms in the alarm pool:
ZXAN#show alarm current
An alarm 722468 ID 5672 level 2 occurred at 09:48:20 08-20-2018 sent by ZXAN PFU-1/4/0
%PON% PON alarm Olt Rx Power Low (rack 1 shelf 1 slot 4 port 1 onu 2)

An alarm 400315 ID 2307 level 3 occurred at 12:49:17 08-19-2018 sent by ZXAN MPU-1/10/0
%POWER% Power voltage fault! Shelf = 1, Group = 0, Slot = 20 DC Power voltage fault alarm
History alarms:
ZXAN#show alarm history

An alarm 722452 ID 5725 level 3 occurred at 14:01:34 08-20-2018, cleared at 14:01:37 08-20-2018 sent by ZXAN PFU-1/4/0
%GPON% GPON alarm link olt rdii (rack 1 shelf 1 slot 4 port 1 onu 2)

An alarm 722452 ID 5724 level 3 occurred at 13:57:32 08-20-2018, cleared at 13:57:35 08-20-2018 sent by ZXAN PFU-1/4/0
%GPON% GPON alarm link olt rdii (rack 1 shelf 1 slot 4 port 1 onu 2)
Collect SNMP log:
On some NEs the SNMP log is disabled by default, in which case no SNMP log
information is available. Enable it manually with the snmp-server log enable all
command in configuration mode.
ZXAN#show logging buffer snmp log
Log acceptance: on
Log matched: 1
Log in buffer: 1
Buffer occupied: 0.14%
Log file path: /datadisk0/LOG/SNMP/

LogID:9722 2018-08-20 14:12:38.372 Class:Snmp [log-type:LOCAL IP:10.60.182.160 version:v2c community/user:public pdu-type:GetBulk request-id:5 OID:1.3.6.1.4.1.3902.1082.10.1.2.4.1.2 error-status:0<NO_ERROR> error-index:0 start-time:2018-08-20 14:12:38.347 end-time:2018-08-20 14:12:38.371]
ZXAN#
Retrieve Snmp statistics information:
ZXAN#show snmp
3 SNMP packets input
0 Bad SNMP version errors
0 Unknown community name
0 Illegal operation for community name supplied
1 Number of requested variables
0 Number of altered variables
0 Get-request PDUs
1 Get-next PDUs
0 Set-request PDUs
3 SNMP packets output
0 Too big errors (Maximum packet size 8192)
0 No such name errors
0 Bad values errors
0 General errors
3 Response PDUs
0 Trap PDUs SNMP
0 Input ASN parse errors packets
0 Proxy drops packets
0 Unknown security model packets
0 Unknown PDU handler packets
0 Unsupported security level packets
0 Not in time-window packets
0 Unknown user name packets
0 Unknown engine ID packets
0 Wrong digest packets
0 Decryption error packets
SNMP version v1: enable
SNMP version v2c: enable
SNMP version v3: disable
SNMP agent listen port: 161
SNMP notification listen port: 162
SNMP command-responser: enable
SNMP proxy-forwarder: disable
ZXAN#

(3)All alarms on EMS

Collect from the EMS all alarms and operation logs from before and after the NE
fault.

(4)Suspension File Explanation

Collect the suspension files related to the main control card and the abnormal
slots on the master / slave servers (the /datadisk0/run_log,
/datadisk0/run_log/EXCINFO and /datadisk0/run_log/CPULOAD directories).
The files to collect are listed below. In the file names, slot stands for the
slot number of the card.
/datadisk0/run_log/Exc_Omp.txt -- General suspension file of the main control card.
/datadisk0/run_log/Exc_pp.txt -- General suspension file of line cards.
/datadisk0/run_log/EXCINFO/Exc_0_1_slot_0.txt -- Log of a card. Records important logs generated while the card is running.
/datadisk0/run_log/EXCINFO/exception_kernel_0_1_slot_0.txt -- Kernel suspension file of a card. Records the card’s kernel suspension logs and the logs of resets performed through the general reset interface.
/datadisk0/run_log/EXCINFO/bspstartuplog_0_1_slot_0.txt -- Startup log of a card. Records the UBOOT and kernel startup logs.
/datadisk0/run_log/EXCINFO/commLostInfo_0_1_slot_0.txt -- Inter-card communication interruption log of a card. Records logs related to interruption of the communication between the card and the main control card.
/datadisk0/run_log/CPULOAD/Exc_CpuLoad_0_1_slot_0.txt -- CPU usage surge log of a card. Records the task and stack information when the card’s CPU usage ratio stays at 60% or higher continuously for 30 s.
/datadisk0/run_log/EXCDUMP/Dump_0_1_slot_0.txt -- Abnormal memory log of a card. Records information about suspected memory abnormalities of the card.
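
The per-slot file names listed above follow one pattern, so the list of files to collect for a given slot can be generated. The sketch below fills in only the slot field. Treating the other three fields as rack, shelf and cpu is an assumption based on the order "rack:0,shelf:1,slot:11,cpu:0" printed in the exception record header of Exc_Omp.txt; the guide itself only states that slot is the slot number. The function name is hypothetical.

```python
def suspension_files(slot, rack=0, shelf=1, cpu=0, base="/datadisk0/run_log"):
    """Build the per-slot suspension file paths for one card.

    Assumption: the name suffix is <rack>_<shelf>_<slot>_<cpu>, with the
    defaults 0, 1 and 0 taken from the file names listed in the guide.
    """
    tag = f"{rack}_{shelf}_{slot}_{cpu}"
    return [
        f"{base}/EXCINFO/Exc_{tag}.txt",
        f"{base}/EXCINFO/exception_kernel_{tag}.txt",
        f"{base}/EXCINFO/bspstartuplog_{tag}.txt",
        f"{base}/EXCINFO/commLostInfo_{tag}.txt",
        f"{base}/CPULOAD/Exc_CpuLoad_{tag}.txt",
        f"{base}/EXCDUMP/Dump_{tag}.txt",
    ]
```

For slot 10 the list includes /datadisk0/run_log/CPULOAD/Exc_CpuLoad_0_1_10_0.txt. Not every file necessarily exists on a given NE, and the general files Exc_Omp.txt and Exc_pp.txt are collected in addition to the per-slot files.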

(5)Query suspension files

ZXAN#dir /datadisk0/run_log
Directory of MPU-1/10/0: /datadisk0/run_log
30644060 KB total (23669384 KB free)

attribute size date time name


1 <DIR> 4096 03-08-2019 09:22 .
2 <DIR> 4096 03-08-2019 09:22 ..
3 ---- 79354 03-08-2019 10:50 curserialportlog.txt
4 ---- 164929 03-08-2019 09:22 lastserialportlog.txt
5 <DIR> 4096 03-05-2019 14:07 EXCDUMP
6 <DIR> 4096 02-03-2019 13:38 CPULOAD
7 ---- 49969 03-04-2019 15:58 ushell_process_log.txt
8 <DIR> 4096 03-05-2019 09:47 util
9 ---- 531835 03-08-2019 09:25 Exc_pp.txt
10 <DIR> 4096 01-27-2019 00:47 ODB
11 <DIR> 4096 12-26-2017 01:41 processlog
12 ---- 1069390 03-07-2019 15:45 bspstartuplog_bak.txt
13 <DIR> 4096 03-04-2019 22:30 Loader
14 ---- 2098062 03-07-2019 03:33 Exc_Omp.txt.bak
15 ---- 94531 03-08-2019 09:28 fnsc.log
16 ---- 324560 03-08-2019 09:22 bspstartuplog.txt
17 ---- 572181 03-08-2019 09:23 exception_kernel.txt
18 ---- 1212022 03-08-2019 09:25 Exc_Omp.txt
19 <DIR> 4096 03-08-2019 10:15 EXCINFO
ZXAN#dir /datadisk0/run_log/EXCINFO
Directory of MPU-1/10/0: /datadisk0/run_log/EXCINFO
30644060 KB total (23669252 KB free)

attribute size date time name


1 <DIR> 4096 03-08-2019 10:15 .
2 <DIR> 4096 03-08-2019 10:15 ..
3 ---- 2097503 03-07-2019 08:32 Exc_0_1_11_0.txt.bak
4 ---- 2097620 03-07-2019 13:25 Exc_0_1_10_0.txt.bak
5 ---- 262144 03-08-2019 09:35 bspstartuplog_0_1_11_0.txt.bak
6 ---- 262232 03-08-2019 09:25 bspstartuplog_0_1_2_0.txt
......


ZXAN#dir /datadisk0/run_log/CPULOAD
Directory of MPU-1/10/0: /datadisk0/run_log/CPULOAD
30644060 KB total (23669348 KB free)

attribute size date time name


1 <DIR> 4096 02-03-2019 13:38 .
2 <DIR> 4096 02-03-2019 13:38 ..
3 ---- 186478 03-08-2019 09:23 Exc_CpuLoad_0_1_10_0.txt
4 ---- 1730414 03-04-2019 16:24 Exc_CpuLoad_0_1_2_0.txt

ZXAN#dir /datadisk0/run_log/EXCDUMP
Directory of MPU-1/10/0: /datadisk0/run_log/EXCDUMP
30644060 KB total (23669308 KB free)

attribute size date time name


1 <DIR> 4096 03-05-2019 14:07 .
2 <DIR> 4096 03-05-2019 14:07 ..
3 ---- 1140674 03-05-2019 14:07 Dump_0_1_5_0.txt
4 ---- 4195511 03-05-2019 14:07 Dump_0_1_5_0.txt.bak
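
When many directories have to be checked, the listings above can be processed offline from a saved terminal capture. The following is a minimal sketch, assuming the row layout shown in the listings (index, attribute, size, date, time, name). The names DirEntry and parse_dir are hypothetical helpers, not part of the NE software, and file names wrapped by the terminal must be rejoined before parsing.

```python
import re
from typing import List, NamedTuple


class DirEntry(NamedTuple):
    index: int
    is_dir: bool
    size: int
    date: str
    time: str
    name: str


# One listing row: index, attribute (<DIR> or ----), size, date, time, name.
_ROW = re.compile(
    r"^\s*(\d+)\s+(<DIR>|----)\s+(\d+)\s+(\d{2}-\d{2}-\d{4})\s+(\d{2}:\d{2})\s+(\S+)\s*$"
)


def parse_dir(output: str) -> List[DirEntry]:
    """Return the rows of a saved `dir` listing; all other lines are ignored."""
    entries = []
    for line in output.splitlines():
        match = _ROW.match(line)
        if match:
            index, attr, size, date, time, name = match.groups()
            entries.append(DirEntry(int(index), attr == "<DIR>", int(size), date, time, name))
    return entries


# Sample taken from the EXCDUMP listing above.
SAMPLE_DIR = """\
ZXAN#dir /datadisk0/run_log/EXCDUMP
Directory of MPU-1/10/0: /datadisk0/run_log/EXCDUMP
30644060 KB total (23669308 KB free)
attribute size date time name
1 <DIR> 4096 03-05-2019 14:07 .
2 <DIR> 4096 03-05-2019 14:07 ..
3 ---- 1140674 03-05-2019 14:07 Dump_0_1_5_0.txt
4 ---- 4195511 03-05-2019 14:07 Dump_0_1_5_0.txt.bak
"""

dump_files = [entry for entry in parse_dir(SAMPLE_DIR) if not entry.is_dir]
```

Here dump_files holds only the two Dump_0_1_5_0 entries, which can then be sorted by size or date to decide which files to upload first.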

(6)Remotely view suspension files

Suspension files can be viewed directly on the CLI with the more command followed by the file
name. Start capturing the terminal session before viewing so that the output is recorded.
For example:
more /datadisk0/run_log/Exc_Omp.txt
To print the whole log without paging, set the terminal property:
terminal length 0
Use this with caution if the file is large.
To restore paging, use: no terminal length
Example:
ZXAN#more /datadisk0/run_log/Exc_Omp.txt
************************ End of Record ************************

************************ Begin of Record ************************


Record Time: 2019-03-07 03:52:05(UTC: 2019-03-06 19:52:05)
exception record from rack:0,shelf:1,slot:11,cpu:0.
Used Time: 2019-03-07 03:52:04(UTC: 2019-03-06 19:52:04)
Find dead loop in thread 0x3fff68df8170(SCHE24_1)!
task id : 0x3fff68df8170
task name: SCHE24_1
task tid: 3163
Job Name : Fnscmanage
Job Type : 1
Job ID : 2446589953
Job state: 1
Job RunStatus: 0
Job DLoopDur: 300000
Job DloopStartTick: 32109
Job ThisMonitorTick: 62058
Msg ID : 2684493279
Headver : 1
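
A captured record such as the one above can be summarized with a short script before it is sent for analysis. This is a minimal sketch that assumes the field layout shown in the example record; parse_exception_record is a hypothetical helper, and records of other exception types may use different wording.

```python
import re


def parse_exception_record(record: str) -> dict:
    """Extract the origin card, dead-loop task and job name from one record."""
    info = {"dead_loop": False}
    origin = re.search(r"rack:(\d+),shelf:(\d+),slot:(\d+),cpu:(\d+)", record)
    if origin:
        rack, shelf, slot, cpu = (int(value) for value in origin.groups())
        info.update(rack=rack, shelf=shelf, slot=slot, cpu=cpu)
    loop = re.search(r"Find dead loop in thread (0x[0-9a-fA-F]+)\((\w+)\)", record)
    if loop:
        info["dead_loop"] = True
        info["task_name"] = loop.group(2)
    job = re.search(r"Job Name\s*:\s*(\S+)", record)
    if job:
        info["job_name"] = job.group(1)
    return info


# Lines copied from the example record above.
SAMPLE_RECORD = """\
Record Time: 2019-03-07 03:52:05(UTC: 2019-03-06 19:52:05)
exception record from rack:0,shelf:1,slot:11,cpu:0.
Find dead loop in thread 0x3fff68df8170(SCHE24_1)!
task name: SCHE24_1
Job Name : Fnscmanage
Job Type : 1
"""

summary = parse_exception_record(SAMPLE_RECORD)
```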

(7)Upload suspension files

To send a complete suspension file to the R&D Institute, first upload it to the EMS server, and
then copy it from there.

Command format for uploading a suspension file:

ZXAN#copy ftp root: /datadisk0/run_log/Exc_Omp.txt //*.*.*.*/Exc_Omp.txt@xx:xx ------- in-band

ZXAN#copy ftp vrf mng root: /datadisk0/run_log/Exc_Omp.txt //*.*.*.*/Exc_Omp.txt@xx:xx ------- out-band

Example:

ZXAN#copy ftp vrf mng root: /datadisk0/run_log/Exc_Omp.txt //10.63.173.181/Exc_Omp.txt@uni:uni

Start copying file

Put file successfully!Sent 1212022 bytes!

For in-band management, remove the vrf mng parameter. In the command,
/datadisk0/run_log/Exc_Omp.txt is the path of the suspension file to upload, *.*.*.* is the IP
address of the FTP server, and xx:xx is the user name and password of the FTP server.
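
When several files have to be uploaded, the command lines can be generated instead of typed. The sketch below only assembles the command text in the format shown above; build_upload_command and its parameter names are hypothetical, and the result still has to be entered on the NE CLI.

```python
from typing import Optional


def build_upload_command(local_path: str, ftp_ip: str, user: str, password: str,
                         out_of_band: bool = True,
                         remote_name: Optional[str] = None) -> str:
    """Assemble a `copy ftp` command line in the format shown in this section."""
    if remote_name is None:
        # By default keep the file name used on the NE.
        remote_name = local_path.rsplit("/", 1)[-1]
    vrf = "vrf mng " if out_of_band else ""
    return f"copy ftp {vrf}root: {local_path} //{ftp_ip}/{remote_name}@{user}:{password}"


# Reproduces the out-band example above.
example_command = build_upload_command(
    "/datadisk0/run_log/Exc_Omp.txt", "10.63.173.181", "uni", "uni"
)
```

Passing out_of_band=False drops the vrf mng parameter, which matches the in-band form.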

(8)Collect CLI and diagnosis mode log file

Collect the CLI and diagnosis mode log files of the main control card (the /datadisk0/usrcmd_log
and /datadisk0/syscmd_log directories).

Details:

/datadisk0/usrcmd_log/cmdlog_xxxxxx.log--Common CLI operation log. xxxxxx represents the date and
index. It records the commands entered in basic modes such as the CLI common mode and
configuration mode.

/datadisk0/syscmd_log/systemlog_xxxxxx.log--Special CLI operation log. xxxxxx represents the date
and index. It records the commands entered in special modes such as the CLI diagnosis mode and
vendor mode.

1)Query CLI log files


ZXAN#dir /datadisk0/usrcmd_log
Directory of MPU-1/10/0: /datadisk0/usrcmd_log
30644060 KB total (23669140 KB free)

attribute size date time name


1 <DIR> 4096 12-26-2017 01:38 .
2 <DIR> 4096 12-26-2017 01:38 ..
3 ---- 4124665 03-08-2019 16:38 cmdlog_20171225173830_0.cmd.log
ZXAN#dir /datadisk0/syscmd_log
Directory of MPU-1/10/0: /datadisk0/syscmd_log
30644060 KB total (23669276 KB free)

attribute size date time name


1 <DIR> 4096 12-26-2017 02:21 .
2 <DIR> 4096 12-26-2017 02:21 ..
3 ---- 98285 03-08-2019 08:43 systemlog_20171225182153_0.sys.log

2)Upload CLI log files


CLI log files are generally large, so do not view them with the more command. Upload them
directly instead.

Example of uploading a CLI log file:

ZXAN#copy ftp vrf mng root: /datadisk0/usrcmd_log/cmdlog_20180528110623_0.cmd.log //*.*.*.*/usrcmd.log@xx:xx

For in-band management, remove the vrf mng parameter. In the command, the first path is the CLI
log file to upload, *.*.*.* is the IP address of the FTP server, and xx:xx is the user name and
password of the FTP server.


(9)Collect information under OAM diag shell mode

1)Operation methods under OAM diag shell mode

For V1.1.0, refer to ZXA10 C6xx V1.1.0 Operation Methods to Enter Shell under Diagnosis Mode for
how to enter the shell mode from the CLI.

2)Collect general information

Check whether any task is suspended: In the current general engineering version, a card on which
a task is suspended is restarted and switched over. If no switchover occurs, run
XOS_DbgShowSuspInfo() under the ADM process of the corresponding card's diag-shell to check
whether any task is suspended. If so, switch to the corresponding process and run
XOS_DbgTt2("suspended task name") to trace the call stack.

For example:


ZXAN(diag)#diag shell MPU-1/10/0
ZXAN(diag-shell-MPU-1/10/0)#
ZXAN(diag-shell-MPU-1/10/0)#exec sh 0
shell 0
Now switch to ADM shell...
[ADM]#
ZXAN(diag-shell-MPU-1/10/0)#exec XOS_DbgShowSuspInfo()
XOS_DbgShowSuspInfo()
[ADM]no suspended task
[ADM]value = 0 = 0x0(32);value = 1=0x1(64)
[ADM]
[ADM]ushell command finished
[ADM]start time: 2018-08-20 15:28:20(607762 s, 264 ms)
[ADM] end time: 607762 s, 265 ms
[ADM]

ZXAN(diag-shell-MPU-1/10/0)#
ZXAN(diag-shell-MPU-1/10/0)#exec XOS_DbgTt2("SCS_MCM_MGT_2")
XOS_DbgTt2("SCS_MCM_MGT_2")
[ADM]Show task context by proc...
[ADM]Track function call list...
[ADM]0x9d045658 __recvmsg(PC)+(0x58/0xac)
[ADM]0x1050cdd0 L1WaitForMSG+(0x5c/0x10c)
[ADM]0x1050c6bc L1ScheTaskEntry+(0x270/0x928)
[ADM]0x104d4858 Vos_UniThreadEntry+(0xdc4/0xed8)
[ADM]value = 0 = 0x0(32);value = 0=0x0(64)
[ADM]
[ADM]ushell command finished
[ADM]start time: 2018-08-20 16:11:16(610338 s, 218 ms)
[ADM] end time: 610338 s, 219 ms
[ADM]

Check whether there are abnormal suspension logs.

To view the suspension logs of a card, execute the related function under the ADM process of the
card's diag-shell. Contact the R&D Institute for the specific function name.

Troubleshooting when C600 line cards are HWONLINE

1. Features of the RUN indicator and the corresponding states

The behavior of the RUN indicator is divided into the BOOT phase and the version phase, according
to the running phase of the card's software. In the BOOT phase, the RUN indicator is usually lit
by the BSP to show the user that the BOOT software has started running, which is an important
prompt. At present, the RUN indicator of C600 cards behaves in the BOOT phase as on the C300: the
green LED is solid on.

Note: If the yellow / orange LED is solid on as soon as the card is powered on, the DDR memory
chip is generally faulty. While BOOT is running and before it switches to DDR, the LED is orange.

In the version running phase, the RUN indicator corresponds one to one with the card's system
control state. The present correspondence is:


State          Flashing   Color    Remark (Flash Cycle)
INSERVICE      Yes        Green    1s
CONFIGING      Yes        Green    0.5s
TYPEMISMATCH   Yes        Yellow   1s
HWONLINE       No         Green    -


LED Definition:
State of the RUN indicator        Definition
Slowly flashes in green (1s)      The card can provide services normally (INSERVICE)
Quickly flashes in green (0.5s)   The card is being configured (CONFIGING)
Solid on in green                 (1) The card is in the version startup process
                                  (2) The card is in the HWONLINE state
Slowly flashes in yellow (1s)     The inserted card does not match the configured type
                                  (TYPEMISMATCH)

Note: The above are the conventional states. When a line card becomes HWONLINE because its
inter-card communication link with the main control card is down, the line card may remain in
its previous state and keep the RUN indicator lit according to that state.
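
For quick reference, the LED definitions above can be kept as a lookup table. This is a minimal sketch that only encodes the rows of the table; RUN_INDICATOR and candidate_states are hypothetical names, and because of the note above a result is a hint to be confirmed with show card, not a diagnosis.

```python
# (color, flash cycle in seconds or None for solid on) -> possible card states.
RUN_INDICATOR = {
    ("green", 1.0): ["INSERVICE"],
    ("green", 0.5): ["CONFIGING"],
    ("green", None): ["version starting", "HWONLINE"],
    ("yellow", 1.0): ["TYPEMISMATCH"],
}


def candidate_states(color, flash_cycle_s=None):
    """Return the card states that match an observed RUN indicator pattern."""
    return RUN_INDICATOR.get((color.lower(), flash_cycle_s), [])


solid_green = candidate_states("Green")
```

A solid green indicator maps to two candidates, which is why the CLI state must be checked as well.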

2. Collect fault information when the card is HWONLINE

Collect the following information from the main control card via CLI:


show card
show equipment
show processor
show install
show install activate
show software
show alarm current
show alarm history

3. The card changes from INSERVICE to HWONLINE state

(1)Inter-card communication is interrupted

1)General symptoms

• On the CLI, show card shows that the card in the corresponding slot is in the HWONLINE state,
both the NE and the EMS report the Board not running alarm, and the RUN indicator of the card is
solid on in green.

• show processor returns no information about the HWONLINE line card.

• Under the OAM diag-shell, entering the line card's shell fails, and pinging the IP address of
the line card's slot fails.

2)Collect information

• Collect the linkup and ping information between the main control card and each card under the
FNSC_SVR process of the main control card. Functions involved: **PeerShow(),
**Ping(ip,pingTimes), etc. (Function names differ between NE versions; ask the R&D Institute for
the exact commands before operating.)

• Collect the port information of the main control card's Ethernet port and SSP switching chip.
Function involved: **QueryNetPort("eth2")

• Log in to the HWONLINE card in ODB remote mode or file mode to collect the serial port prints of
the faulty card.

• Collect the suspension files, slot log, inter-card communication interruption log, and CPU
usage soaring log.

3)Check linkup information and ping information of cards under the main control card

• Use the **PeerShow() function to view the inter-card communication linkup relationship of the
cards.

Execute the **PeerShow() function under the FNSC_SVR process of the diag-shell:


ZXAN(diag-shell-MPU-1/10/0)#execute **PeerShow()
diagRslPeerShow()
[FNSC_SVR]Slot ipAddr
[FNSC_SVR]1 168.1.130.0
[FNSC_SVR]2 168.1.131.0
[FNSC_SVR]3 168.1.132.0
[FNSC_SVR]4 168.1.133.0
[FNSC_SVR]10 168.1.139.2
[FNSC_SVR]11 168.1.140.2
[FNSC_SVR]value = 0 = 0x0(32);value = 0=0x0(64)
[FNSC_SVR]
[FNSC_SVR]ushell command finished
[FNSC_SVR]start time: 2019-03-08 22:24:35(46952 s, 738 ms)
[FNSC_SVR] end time: 46952 s, 743 ms
[FNSC_SVR]-

ZXAN(diag-shell-MPU-1/10/0)#show card
Shelf Slot CfgType Port HardVer Status
---------------------------------------------------
1 1 GFGH 16 V1.0.0 INSERVICE
1 2 GFTH 16 V1.0.0 INSERVICE
1 3 XFTH 16 V1.0.0 INSERVICE
1 4 GFTH 16 V1.0.0 INSERVICE
1 10 SFUL 0 V1.0.0 INSERVICE
1 11 SFUL 0 V1.0.0 STANDBY
1 22 PRVR 0 V1.0.0 INSERVICE
1 23 FCVD 0 V1.0.0 INSERVICE
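
The two outputs above can be compared offline to list the slots that show card reports but the peer list does not. This is a minimal sketch assuming the output formats shown above; parse_peers, parse_card_status and slots_without_peer are hypothetical helpers. In the sample, slots 22 and 23 appear in show card but not in the peer list although they are INSERVICE, so a missing peer entry is only a hint and must be judged together with the card type and state.

```python
import re


def parse_peers(output):
    """Map slot number -> IP address from the peer list printed by FNSC_SVR."""
    peers = {}
    for line in output.splitlines():
        match = re.match(r"\s*\[FNSC_SVR\](\d+)\s+(\d+\.\d+\.\d+\.\d+)\s*$", line)
        if match:
            peers[int(match.group(1))] = match.group(2)
    return peers


def parse_card_status(output):
    """Map slot number -> status from `show card` rows (six columns per row)."""
    cards = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) == 6 and fields[0].isdigit() and fields[1].isdigit():
            cards[int(fields[1])] = fields[5]
    return cards


def slots_without_peer(peers, cards):
    """Slots listed by `show card` that have no entry in the peer list."""
    return sorted(slot for slot in cards if slot not in peers)


# Samples abridged from the outputs above.
PEER_OUTPUT = """\
[FNSC_SVR]Slot ipAddr
[FNSC_SVR]1 168.1.130.0
[FNSC_SVR]2 168.1.131.0
[FNSC_SVR]10 168.1.139.2
[FNSC_SVR]value = 0 = 0x0(32);value = 0=0x0(64)
"""
CARD_OUTPUT = """\
Shelf Slot CfgType Port HardVer Status
---------------------------------------------------
1 1 GFGH 16 V1.0.0 INSERVICE
1 2 GFTH 16 V1.0.0 INSERVICE
1 10 SFUL 0 V1.0.0 INSERVICE
1 22 PRVR 0 V1.0.0 INSERVICE
"""

missing = slots_without_peer(parse_peers(PEER_OUTPUT), parse_card_status(CARD_OUTPUT))
```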

• Pinging the IP address of the HWONLINE card further confirms whether inter-card communication
is blocked.

Ping the IP address of the target card's slot from the main control card. Slot IP addresses are
assigned in sequence; in the **PeerShow() output above, 168.1.130.0 corresponds to slot 1,
168.1.131.0 corresponds to slot 2, and so on. For example, under the main control card's diag
shell, enter the FNSC_SVR process (process number 31) and execute **Ping("168.1.132.0",4). This
function has two parameters: the first is the IP address of the HWONLINE card, and the second is
the number of ping packets (mandatory).

Format:
ZXAN(diag-shell-MPU-1/10/0)#execute **Ping("168.1.132.0",4)
diagRslPing("168.1.132.0",4)
[FNSC_SVR]PING 168.1.132.0 (168.1.132.0): 56 data bytes
64 bytes from 168.1.132.0: seq=0 ttl=64 time=0.457 ms
64 bytes from 168.1.132.0: seq=1 ttl=64 time=0.450 ms
64 bytes from 168.1.132.0: seq=2 ttl=64 time=0.420 ms
64 bytes from 168.1.132.0: seq=3 ttl=64 time=0.438 ms

--- 168.1.132.0 ping statistics ---


4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.420/0.441/0.457 ms
command execute finished!
[FNSC_SVR]value = 0 = 0x0(32);value = 0=0x0(64)
[FNSC_SVR]
[FNSC_SVR]ushell command finished
[FNSC_SVR]start time: 2019-03-08 22:32:42(47439 s, 681 ms)
[FNSC_SVR] end time: 47442 s, 685 ms

If no reply packets are shown above, the ping has failed and inter-card communication is blocked.
In that case, get the log /datadisk0/run_log/EXCINFO/commLostInfo_0_1_slot_0.txt to continue the
diagnosis, and further analyze the recorded logs together with information such as the CPU
Ethernet port, PHY, MAC, L2 switching, and kernel protocol stack.
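
The pass / fail decision described above can be read from the statistics line of a saved ping output. This is a minimal sketch assuming the summary wording shown in the example; parse_ping_summary is a hypothetical helper.

```python
import re


def parse_ping_summary(output):
    """Return sent/received/loss figures from the ping statistics line, or None."""
    match = re.search(
        r"(\d+) packets transmitted, (\d+) packets received, (\d+)% packet loss", output
    )
    if not match:
        return None
    sent, received, loss = (int(value) for value in match.groups())
    # No reply at all means inter-card communication is blocked.
    return {"sent": sent, "received": received, "loss_percent": loss,
            "reachable": received > 0}


# Statistics line copied from the example above.
PING_OUTPUT = """\
--- 168.1.132.0 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.420/0.441/0.457 ms
"""

ping_result = parse_ping_summary(PING_OUTPUT)
```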

4)Query the information of the CPU Ethernet port and SSP switching chip port under the OAM
diag-shell of the main control card

Log in to the main control card, enter the FNSC_SVR process under the diag shell, and run the
port query interface repeatedly to check the received / transmitted packet counters, including
packet loss.

[FNSC_SVR] RX packets:1914709 errors:0 dropped:0 overruns:0 frame:0

[FNSC_SVR] TX packets:7146379 errors:0 dropped:0 overruns:0 carrier:0

Keep watching the RX packets and TX packets counters above. If either counter stays unchanged,
the inter-card communication of the main control card may be faulty.
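
The counter check above amounts to comparing two samples taken some time apart. This is a minimal sketch assuming the RX / TX line format shown above; parse_counters and stalled_directions are hypothetical helpers, and the second sample below uses made-up values for illustration.

```python
import re


def parse_counters(output):
    """Map 'RX' / 'TX' -> packet counter from the port statistics lines."""
    return {direction: int(count)
            for direction, count in re.findall(r"(RX|TX) packets:(\d+)", output)}


def stalled_directions(first_sample, second_sample):
    """Directions whose packet counter did not change between two samples."""
    first, second = parse_counters(first_sample), parse_counters(second_sample)
    return sorted(d for d in ("RX", "TX")
                  if d in first and d in second and first[d] == second[d])


# First sample copied from the output above; second sample is illustrative only.
SAMPLE_1 = """\
[FNSC_SVR] RX packets:1914709 errors:0 dropped:0 overruns:0 frame:0
[FNSC_SVR] TX packets:7146379 errors:0 dropped:0 overruns:0 carrier:0
"""
SAMPLE_2 = """\
[FNSC_SVR] RX packets:1914709 errors:0 dropped:0 overruns:0 frame:0
[FNSC_SVR] TX packets:7146502 errors:0 dropped:0 overruns:0 carrier:0
"""

stalled = stalled_directions(SAMPLE_1, SAMPLE_2)
```

In this illustration only the RX counter is unchanged, which would point to the receive direction of the main control card.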

5)Log in to the HWONLINE card in ODB remote or file mode, and collect the serial port prints of
the faulty card.

Use ODB file mode to collect the serial port prints of the HWONLINE card, and check whether there
are abnormal prints.

First, use the following command to delete the previously saved ODB serial port log of the
HWONLINE card (slot represents the slot number of the HWONLINE card).

delete /datadisk0/run_log/ODB/odb_log_0_1_slot_0.txt

Then use the following command to enable ODB file mode for the HWONLINE card (xx represents the
slot number of the HWONLINE card).

odb-switch enable slotno xx mode file

Then periodically view the ODB serial port log of the HWONLINE card with the following command
(slot represents the slot number of the HWONLINE card).

more /datadisk0/run_log/ODB/odb_log_0_1_slot_0.txt

For detailed usage, refer to ZXA10 C6xx V1.1.0 Operation Methods to Enter Shell under Diagnosis
Mode.

6)Collect suspension files for further analysis

Collect the NE suspension file, kernel suspension file, BSP startup log, inter-card
communication interruption log, slot log, and CPU usage soaring log.
Collect the following files (slot represents the slot number of the faulty HWONLINE card):
/datadisk0/run_log/Exc_Omp.txt
/datadisk0/run_log/Exc_pp.txt
/datadisk0/run_log/EXCINFO/exception_kernel_0_1_slot_0.txt
/datadisk0/run_log/EXCINFO/commLostInfo_0_1_slot_0.txt
/datadisk0/run_log/EXCINFO/bspstartuplog_0_1_slot_0.txt
/datadisk0/run_log/EXCINFO/Exc_0_1_slot_0.txt
/datadisk0/run_log/CPULOAD/Exc_CpuLoad_0_1_slot_0.txt
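
The file list above can be generated for a given slot so that no file is forgotten. This is a minimal sketch; hwonline_log_files is a hypothetical helper, and reading the 0_1_slot_0 part of the names as rack_shelf_slot_cpu is an assumption inferred from the example record ("rack:0,shelf:1,slot:11,cpu:0") and the listed file names.

```python
def hwonline_log_files(slot, rack=0, shelf=1, cpu=0):
    """Return the log paths listed in this section for the given slot."""
    # Assumption: the numeric tag in the file names is rack_shelf_slot_cpu.
    tag = f"{rack}_{shelf}_{slot}_{cpu}"
    base = "/datadisk0/run_log"
    return [
        f"{base}/Exc_Omp.txt",
        f"{base}/Exc_pp.txt",
        f"{base}/EXCINFO/exception_kernel_{tag}.txt",
        f"{base}/EXCINFO/commLostInfo_{tag}.txt",
        f"{base}/EXCINFO/bspstartuplog_{tag}.txt",
        f"{base}/EXCINFO/Exc_{tag}.txt",
        f"{base}/CPULOAD/Exc_CpuLoad_{tag}.txt",
    ]


slot2_files = hwonline_log_files(2)
```

Each returned path can then be checked with dir and uploaded with the copy ftp command described earlier.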

(2)The card is repeatedly restarted

1)General symptoms

• show card shows the card state cycling through HWONLINE---->CONFIGING---->INSERVICE many
times.

• The RUN indicator of the card cycles through orange---->solid on in green---->quickly flashes
in green---->slowly flashes in green repeatedly within a period of time.

• diagRslPeerShow(), which shows the linkup state between each card and the main control card,
sometimes lists the IP address of this card and sometimes does not.

• The NE and the EMS repeatedly report the Board not running alarm for the card within a period
of time.
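
If the card state is sampled periodically with show card, the first symptom above can be confirmed by counting complete state cycles. This is a minimal sketch; count_restart_cycles is a hypothetical helper and the sampled sequence below is illustrative, not taken from a real NE.

```python
def count_restart_cycles(states):
    """Count HWONLINE -> CONFIGING -> INSERVICE cycles in a sampled state list."""
    cycle = ("HWONLINE", "CONFIGING", "INSERVICE")
    # Collapse consecutive duplicates so that slow sampling does not hide a cycle.
    collapsed = [s for i, s in enumerate(states) if i == 0 or s != states[i - 1]]
    return sum(
        1 for i in range(len(collapsed) - 2) if tuple(collapsed[i:i + 3]) == cycle
    )


# Illustrative samples of the Status column for one slot.
observed = ["INSERVICE", "HWONLINE", "HWONLINE", "CONFIGING", "INSERVICE",
            "HWONLINE", "CONFIGING", "CONFIGING", "INSERVICE"]
cycles = count_restart_cycles(observed)
```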


2)Collect information

• Collect the line card restart prints in remote ODB mode and file ODB mode.

• Disable the user-state exception restart policy: when the card is UP, log in to the card as
quickly as possible and, under the ADM process, run the function that forbids the card from
restarting. Then log in to the line card to query the specific cause of the exception.

• Run the function under the HWONLINE card's ADM process to get the suspension file.

• Run the function under the HWONLINE card's ADM process to check whether any process failed to
power on. If so, get the power-on failure log information.

• Collect the user-state suspension files, kernel suspension files, BSP startup logs, and slot
logs.

3)View line card startup prints in remote ODB mode and file ODB mode

In file ODB mode, capture the startup prints of the card that restarts repeatedly, and check
whether, when the HWONLINE card restarts, it fails to get the version or the FPGA under UBOOT, a
process fails to power on, tasks are suspended, and so on.

4)View whether there are tasks suspended

Under the ADM process of the HWONLINE card via remote ODB, run the related functions to check
whether the card has suspended tasks or kernel exceptions. Under the FTM process of the HWONLINE
card via remote ODB, run the related functions to view the kernel logs.

5)View whether a process has failed to power on or stays in the power-on state

Enter the HWONLINE card's diag-shell and, under the ADM process (remote ODB also works), run the
process check interface function.

To get the power-on information of a specific process, use the power-on log function.


(3)Abnormal card tasks

Abnormal card tasks usually lead to HWONLINE. Typical causes are a busy high-priority task, a
deadlock or dead loop, or a suspended task.

1)General symptoms

• The card restarts repeatedly.

• show processor shows that the CPU usage of a certain card is very high.

• Information such as suspension, deadlock, dead loop, and PCIE suspension is recorded in the
suspension file.

2)Collect information

• Run the related functions under the HWONLINE card's ADM process.

• Enter the HWONLINE card's diag-shell and, under the ADM process, view information such as the
CPU usage.

• Collect the user-state suspension file, kernel suspension file, BSP startup log, slot log, and
CPU usage soaring log.

• Check whether any task is suspended.

• Check whether any task has high CPU usage.

• Check whether any message queue is full.

(4)System control process or service support process is faulty

1)General symptoms

• The HWONLINE card has no suspended tasks and no process power-on failure.

• The slot log, user-state suspension file, and kernel-state suspension file have no contents,
but the card is still HWONLINE.

• The IAP_LP and XPON_LP processes are not started.

2)Collect information

• Enter the main control card's diag shell and execute the related functions under the IAP_RP
process, XPON_RP process, and TDM process;

• Enter the main control card's diag shell and execute the related functions under the FNSC_SVR
process;

• Enter the active main control card's diag shell and execute the related functions under the
OFFCFGBRDMNG process;

• Enter the HWONLINE card's diag shell and execute the related functions under the FNSC_CLT_LP
process;

• Enter the HWONLINE card's diag shell and execute the related functions under the IAP_LP,
XPON_LP, and TDM_LP processes (an uplink card has no XPON_LP process, so it can be skipped
there);

• Execute the related functions under the HWONLINE card's RMAGENT process;

C600 NE Service Troubleshooting Guide

1. Introduction of C600 NE service channel model

The overall service channel model of the C600 is shown in the following figure.
The C600 and C300 differ greatly in L2 switching. On the C300, both the main control card and the
line cards use an Ethernet L2 forwarding architecture with MAC+VLAN addressing. The biggest
change on the C600 is that the main control card has no Ethernet switching chip; instead it
carries the ZTE-developed SF3600 switching chip (also called a cross-connect chip). The SF3600
switches cells rather than Ethernet packets: the SSP-1 chip on the line card cuts each incoming
Ethernet packet into several fixed-length cells (the cell length is configurable), the SF3600 on
the main control card sends each cell to the destination SSP-1 chip according to the cell
header, and the destination chip sends the cells out as Ethernet packets at the exit.


[Figure: C600 service channel model. GPON line card (XPP, SSP-1) -- SF3600 -- uplink card
(SSP-1). Each SSP-1 performs inlet handling, forwarding, and exit handling.]

2. Unicast Service Troubleshooting Guide

(1)A brief introduction of upstream unicast data flow forwarding

GPON data frames are converted into Ethernet packets by the XPP chip and sent to the SSP-1 chip.
The SSP-1 chip first looks up the MAC table with the destination MAC and VLAN of the Ethernet
packet to get the egress port, and then its microcode adds two tags to the packet header: the
destination chip number (in the SAHI header) and the destination port (in the MF header). The
Ethernet packet is divided into several cells, each with a cell header that contains the
destination chip information, and the cells are sent to the SF3600, which forwards them to the
uplink card's SSP-1 chip according to the cell header. The uplink card's SSP-1 chip then
reassembles the received cells into the Ethernet packet, retrieves the destination port from the
MF header, and sends the packet out of the destination port.
The downstream unicast forwarding process is similar to the upstream process.
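
The cut-and-reassemble step described above can be illustrated with a toy model. This is only a sketch of the idea: the Cell fields, the 64-byte cell length, and the sequence number are hypothetical, since this guide does not describe the real cell format or the SAHI / MF header layout, and padding of the last cell to the fixed length is omitted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cell:
    dest_chip: int   # destination chip number carried in every cell header
    seq: int         # hypothetical sequence number used for reassembly
    last: bool       # marks the final cell of a packet
    payload: bytes


def segment(packet: bytes, dest_chip: int, cell_len: int = 64) -> List[Cell]:
    """Cut one packet into cells of at most cell_len payload bytes."""
    chunks = [packet[i:i + cell_len] for i in range(0, len(packet), cell_len)] or [b""]
    return [Cell(dest_chip, i, i == len(chunks) - 1, chunk)
            for i, chunk in enumerate(chunks)]


def reassemble(cells: List[Cell]) -> bytes:
    """Rebuild the packet at the destination chip from its cells."""
    ordered = sorted(cells, key=lambda cell: cell.seq)
    return b"".join(cell.payload for cell in ordered)


packet = bytes(range(150))
cells = segment(packet, dest_chip=3)
restored = reassemble(list(reversed(cells)))
```

The switching chip in this model only needs dest_chip from each cell header, which mirrors the statement that the SF3600 forwards by cell header and never inspects the Ethernet packet itself.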

(2)Unicast troubleshooting idea

For abnormal unicast services, the basic idea is to check each node of the data forwarding
channel in the upstream and downstream directions to find where services are interrupted.
Specifically:

1)Identify the service forwarding channels and the main nodes on the OLT;

2)At each node, confirm whether services have reached the node and are forwarded normally by
viewing port state, MAC addresses, and statistics and by capturing packets;

3)If the OLT has service-related protocol processing enabled (such as DHCP option82, DHCP relay,
PPPOE+, ARP proxy, and MFF), the CPU is also a node of the service channel, although it processes
only protocol packets. In that case, check the whole protocol procedure, from receiving packets
through processing to transmitting, to locate the fault.

(3)Unicast troubleshooting steps

Troubleshooting is divided into 3 main steps: analyze fault symptoms, check configuration data,
and troubleshoot key nodes.

1)Analyzing fault symptoms

Before troubleshooting, identify the fault symptom. For example, when unicast is blocked,
determine the scope of the problem: a single ONU under a PON port, all ONUs under a PON port,
several PON ports, all ONUs under all PON ports of the faulty card, users on multiple slots of
the NE, multiple ONUs scattered over different PON ports, slots, or NEs, or ONUs of certain
types. Also determine under what circumstances the problem occurs, whether services are blocked
all the time or intermittently, and under what circumstances services are blocked and unblocked.
Clarifying these symptoms makes it possible to pick the node most likely to be faulty from the
general troubleshooting steps first, which improves locating efficiency.
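
The scope analysis above can be sketched as a small classification over the list of faulty ONUs. This is a minimal illustration under stated assumptions: classify_scope is a hypothetical helper, each ONU is identified by a (slot, PON port, ONU ID) tuple, the inventory of configured ONUs is known, and scopes such as "ONUs of certain types" or multiple NEs are not covered.

```python
def classify_scope(faulty, inventory):
    """Classify where the faulty ONUs sit; ONUs are (slot, pon, onu) tuples."""
    faulty = set(faulty)
    if not faulty:
        return "no fault"
    ports = {(slot, pon) for slot, pon, _ in faulty}
    slots = {slot for slot, _ in ports}

    def onus_on(port):
        return {onu for onu in inventory if (onu[0], onu[1]) == port}

    # A port is fully affected when every configured ONU on it is faulty.
    if not all(onus_on(port) <= faulty for port in ports):
        return "single ONU" if len(faulty) == 1 else "scattered ONUs"
    if len(slots) > 1:
        return "multiple slots"
    slot = next(iter(slots))
    slot_ports = {(s, p) for s, p, _ in inventory if s == slot}
    if ports == slot_ports:
        return "whole card"
    return "whole PON port" if len(ports) == 1 else "multiple PON ports"


# Illustrative inventory: slot 1 has PON ports 1 and 2, slot 2 has PON port 1.
inventory = {(1, 1, 1), (1, 1, 2), (1, 2, 1), (2, 1, 1)}
scope = classify_scope({(1, 1, 1), (1, 1, 2)}, inventory)
```

The result suggests which level of the key node checks to start with (NE, card, PON port, or ONU/user).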

2)Checking configuration data

Before executing specific troubleshooting commands, check the configuration data first and make
sure it is correct. The configuration here refers to the general configuration visible on the
CLI, such as clock configuration, MAC address aging time, MAC anti-spoofing, loop detection and
anti-DoS configuration, port protection, flood suppression, VLAN configuration, PON port
configuration, and ONU configuration (including VLAN, bandwidth, FEC, encryption, and remote
management). When checking, compare the configuration with that of NEs whose services are
normal.

3)Troubleshooting key nodes

The methods to check key node information fall into 4 kinds: check state, view address, view
statistics, and capture (retrieve) packets. Apply them according to the fault symptoms.

• NE-level problems:
Check state: Check the state of the NE's CPU, memory and traffic;
Environmental factors: View fans, card temperature and card clock;
Card and port state: View the physical state of the uplink port, the state of line cards,
the state of the inner port at the main control card side and cascading port, the state of the
inner port at the line card side and cascading port;
ONU state: View ONU state, including on-line state and remote management state;
Abnormal alarm: View alarm and notification logs;
View address: View the MAC address table, ARP table and routing table of the main control card,
and view whether there is MAC spoofing and hash conflict;
View statistics: View SSP-1 statistics, and view PON MAC statistics of line cards;
Capture packets: Capture packets at the uplink port to analyze;

• Card-level problems:
Check state: Check the state of card’s CPU, memory and traffic
Environmental factors: View fans, card temperature and card clock;
Card and port state: View the physical state of the uplink port, the state of line cards,
the state of the inner port at the main control card side and cascading port, the state of the
inner port at the line card side and cascading port;
ONU state: View ONU state, including on-line state and remote management state;
Abnormal alarm: View alarm and notification logs;
View address: View address tables of the main control card and line cards, ARP table
and routing table of the main control card, and view whether there is MAC spoofing and
hash conflict between main control card and line card;
View statistics: View SSP-1 statistics, and view PON MAC statistics of line cards;
Capture packets: When services of the whole card are blocked, packet capture is not needed;

• PON port-level problems:
Check state:
Environmental factors: View fans, card temperature and card clock;
Card and port state: View the physical state of the uplink port, the state of line cards,
the state of the inner port at the main control card side and cascading port, the state of the
inner port at the line card side and cascading port, Rx / Tx optical power states of the PON
link;
ONU state: View ONU state, including on-line state and remote management state;
Abnormal alarm: View alarm and notification logs;
View address: View address tables of the main control card and line cards, ARP table
and routing table of the main control card, and view whether there is MAC spoofing and
hash conflict between main control card and line card;
View statistics: View SSP-1 statistics, and view PON MAC statistics of line cards;
Capture packets: When PON port-level services are blocked, consider mirroring the inner port of
the main control card to capture packets;

• ONU/user-level problems:
Check state:
Environmental factors: View fans, card temperature and card clock;
Card and port state: View the state of the inner port at the line card side and
cascading port, Rx / Tx optical power states of the PON link that the ONU corresponds to;
ONU state: View ONU state, including on-line state and remote management state;
Abnormal alarm: View alarm and notification logs;
View address: View address tables of the main control card and line cards, ARP table
and routing table of the main control card, and view whether there is MAC spoofing and
hash conflict between main control card and line card;
View statistics: View SSP-1 statistics, and view PON MAC statistics of line cards;
Capture packets: Flow mirroring to capture packets;
In addition, if service-related protocol handling is involved, also check how the CPU retrieves
packets, transmits packets, and handles the protocols.
Steps:
Check state: View the working state of service-related protocols, such as uplink port LACP, port
locating (DHCP Option82, PPPOE+), DHCP RELAY, ARP PROXY, and MFF, and of the tasks that handle
these protocols, and check whether there are obviously abnormal prints under the shell;
View statistics: View the statistics of protocol packets retrieved to the CPU and transmitted by
the CPU, and the various statistics inside the protocol module;
View contents: View the contents of received / transmitted protocol packets through the packet
retrieving and transmitting prints;
Capture packets: View the specific contents of protocol packets through CPU mirroring and packet
capturing;

3. Multicast Service Troubleshooting Guide

(1)Multicast fault diagnosis idea

The basic idea of C600 multicast fault diagnosis is to troubleshoot in two directions: the
upstream / downstream protocol packet handling procedures and the downstream data flow
forwarding.

Overall idea:

1) Check whether protocol packets are received, transmitted, and handled normally at each node
(including its sub-nodes), and whether the multicast entries and forwarding bitmap of the
protocol and the driver are correct;

2) Check whether the downstream forwarding of data flows is consistent with the driver's
forwarding entries and forwarding bitmap, and whether any forwarding node (including its
sub-nodes) loses packets or is broken.


(2)Upstream protocol packets handling procedures

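[Figure: upstream protocol packet handling procedure, from the ONU through the PON line card and
the main control / SF to the uplink card and the SERVER. Node labels: Pon Mac, FPP&SSP (PUB&LIF,
PKTRX, ODMA, PPU), driver, L-CPU, R-CPU, line card packet Rx/Tx, main control packet Rx/Tx,
multicast protocol, IAP, FTM, SW, SF.]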

Upstream protocol packets include:

Proxy mode:

• Join / leave packets transmitted by the user, or report packets with which the user responds
to a query;

• Report packets transmitted by the host at the line card side when its state changes, or report
packets with which it responds to a query from the main control card;

• Report packets transmitted by the host at the main control side when its state changes, or
report packets with which it responds to a query from the server;

Snooping mode:

• Join / leave packets transmitted by the user, or report packets with which the user responds
to a query from the server; these are transparently transmitted to the server.


(3)Downstream protocol packets handling procedures

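[Figure: downstream protocol packet handling path, from the SERVER through the uplink card and the main control / SF to the PON line card and the ONU. Node labels recoverable from the figure:
Uplink card: Pon Mac, PPU, PKTRX, PUB&LIF, FPP&SSP, driver, line card packet Rx/Tx, multicast protocol, IAP, FTM, SW, L-CPU;
Main control / SF: SW, SF, driver, main control packet Rx/Tx, multicast protocol, IAP, FTM, R-CPU;
PON card: Pon Mac, PPU, ODMA, PUB&LIF, FPP&SSP, driver, line card packet Rx/Tx, multicast protocol, IAP, FTM, SW, L-CPU.]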

Downstream protocol packets include:

Proxy mode:

• Query packets sent by the server and received on the source port, which are transparently transmitted to the main control via the uplink protocol;

• Query packets sent from the router on the main control to the host on the line card side;

• Query packets sent from the router on the line card to the user side;

Snooping mode:

• Query packets sent by the server, which are transparently transmitted to the user side.


(4)Downstream data flow forwarding procedures

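[Figure: downstream multicast data flow forwarding path, from the SERVER through the uplink card and the main control / SF to the PON line card and the ONU. Node labels recoverable from the figure:
Uplink card: Pon Mac, PUB&LIF, PPU, PKTRX, ODMA, FTM, SAIP, FPP&SSP, driver, line card packet Rx/Tx, multicast protocol, SW, L-CPU;
Main control / SF: SF, SW, driver, main control packet Rx/Tx, multicast protocol, R-CPU;
PON card: SAIP, ODMA, ETM, PKTRX, PPU, PUB&LIF, Pon Mac, FPP&SSP, driver, line card packet Rx/Tx, multicast protocol, SW, L-CPU.]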

The figure above shows each node in the multicast downstream data flow forwarding procedure. The procedure uses a two-level copy.

Forwarding procedures:

When the uplink NP receives a multicast data flow, it first looks up the corresponding MID by VSIID+DIP+SIP (SSM mode). If this lookup misses, it looks up the MID by VSIID+DIP (ASM mode). If both lookups miss, the packet is handled as an unknown multicast packet: if the flood property of the vlan is drop-unkown, the driver drops the packet; otherwise, the packet is flooded in the corresponding vlan. If a MID is hit, the NP encapsulates the SAIH and FMF headers, adds them to the packet header, and transfers the packet to the SF of the main control card or switching card to perform the Level-1 SA copy;

The main control card or switching card looks up the SA bitmap by MID and copies the multicast data flow once to the NP of each corresponding SA for forwarding;


Each line card looks up the corresponding egress ports by MID and copies one packet to each egress port. If multicast VLAN translation is configured, the vlan is translated at the egress before the packet is forwarded out.

In this process, the uplink card’s SA IP slices the packet into cells and sends them to the SF, while the PON card’s SA IP receives and reassembles the cells.
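
The lookup order and the two-level copy described above can be sketched in Python as follows. This is only an illustration of the logic, not the NP or driver implementation: the table names (SSM_TABLE, ASM_TABLE, SA_BITMAP, EGRESS_PORTS), the sample entries, the port names, and the function names are all hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical sample tables (assumptions for illustration only).
# SSM entries are keyed by VSIID+DIP+SIP, ASM entries by VSIID+DIP.
SSM_TABLE: Dict[Tuple[int, str, str], int] = {(100, "232.1.1.1", "10.0.0.1"): 7}
ASM_TABLE: Dict[Tuple[int, str], int] = {(100, "224.1.1.2"): 9}
# Level-1 copy: MID -> slots (SAs) that must receive one copy.
SA_BITMAP: Dict[int, Set[int]] = {7: {3}, 9: {3, 5}}
# Level-2 copy: (slot, MID) -> egress ports on that line card.
EGRESS_PORTS: Dict[Tuple[int, int], Set[str]] = {
    (3, 7): {"pon-3/1"},
    (3, 9): {"pon-3/1", "pon-3/2"},
    (5, 9): {"pon-5/4"},
}


def lookup_mid(vsi_id: int, dip: str, sip: str) -> Optional[int]:
    """Try the SSM key first, then fall back to the ASM key."""
    mid = SSM_TABLE.get((vsi_id, dip, sip))
    if mid is None:
        mid = ASM_TABLE.get((vsi_id, dip))
    return mid


def forward(vsi_id: int, dip: str, sip: str,
            drop_unknown: bool) -> Tuple[str, List[Tuple[int, str]]]:
    """Return the action and the (slot, port) copies that would be produced."""
    mid = lookup_mid(vsi_id, dip, sip)
    if mid is None:
        # Both lookups missed: treat as unknown multicast.
        return ("drop" if drop_unknown else "flood", [])
    copies: List[Tuple[int, str]] = []
    for slot in sorted(SA_BITMAP.get(mid, set())):                # Level-1 copy
        for port in sorted(EGRESS_PORTS.get((slot, mid), set())):  # Level-2 copy
            copies.append((slot, port))
    return ("forward", copies)


if __name__ == "__main__":
    print(forward(100, "224.1.1.2", "10.9.9.9", drop_unknown=True))
```

When a channel is missing for some users only, comparing the expected copies with the entries actually present at each level helps narrow the fault to the uplink lookup, the SA bitmap, or the line card egress list.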

(5)Multicast debug function

The multicast debug function prints protocol information about packet interaction, protocol handling and FTM-layer driver settings on the CLI. Analyze the protocol interaction step by step to locate multicast problems.

Related commands:

1)Enable multicast debug print

CLI form: debug {igmp | mld} {all | error | event}

Command mode: Privilege mode

2)Disable multicast debug print

CLI form: no debug {igmp | mld} or no debug all

Command mode: Privilege mode
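
When the debug commands are issued from a script, the CLI forms above can be assembled and validated before they are sent. The following Python sketch only builds the command strings from the documented forms; the function names are hypothetical, and how the strings reach the NE (telnet, SSH, EMS) is outside its scope.

```python
from typing import Optional

# Keywords taken from the CLI forms listed above.
PROTOCOLS = ("igmp", "mld")
LEVELS = ("all", "error", "event")


def debug_enable(protocol: str, level: str) -> str:
    """Build 'debug {igmp | mld} {all | error | event}'."""
    if protocol not in PROTOCOLS:
        raise ValueError(f"unsupported protocol: {protocol}")
    if level not in LEVELS:
        raise ValueError(f"unsupported level: {level}")
    return f"debug {protocol} {level}"


def debug_disable(protocol: Optional[str] = None) -> str:
    """Build 'no debug {igmp | mld}', or 'no debug all' when no protocol is given."""
    if protocol is None:
        return "no debug all"
    if protocol not in PROTOCOLS:
        raise ValueError(f"unsupported protocol: {protocol}")
    return f"no debug {protocol}"


if __name__ == "__main__":
    print(debug_enable("igmp", "event"))
    print(debug_disable("igmp"))
```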

(6)Multicast log function

The multicast protocol records key information on the main control card and line cards, such as user join / leave events, aging information and related failures, and stores it so that users can query it on the NE when locating multicast faults.

The function is enabled by default for the igmp protocol and disabled by default for the mld protocol; memory is allocated when it is enabled. Each of the two protocols has its own enable, manual report and server configuration commands, and its log files are saved separately.

Related commands:

Enable or disable the multicast log

CLI form: {igmp | mld} log {enable|disable}

Command mode: Configuration mode

Real-time report of multicast logs

CLI form: {igmp | mld} log report

Command mode: Configuration mode

Explanation: This command is not saved as configuration. It reports the log in real time.

Record the packet interaction process in multicast logs

CLI form: {igmp | mld} log record-packet {enable|disable}

Command mode: Configuration mode

Explanation: Disabled by default

Configure the upload server for multicast logs

CLI form: {igmp | mld} log server [vrf <vrfname>] server-ip <ipaddr> host-ip <ipaddr> type <ftp|sftp> user <username> password <password>

Command mode: Configuration mode

Explanation: server-ip: IP address of the server

host-ip: IP address carried in the header of the file that the NE uploads. The EMS writes the host ip under the mvlan and configures it on the NE.

Show multicast log

CLI form: show {igmp | mld} log...

Command mode: All

(7)Multicast troubleshooting method

Fault Range | Specific Symptom | Possible Reason | Handling Suggestion
The whole NE | All channels fail to be watched | Versions are mismatched | Check whether the versions of the line card and main control card match
The whole NE | All channels fail to be watched | Configuration error | Check the multicast configuration and port vlan configuration, and whether the iptv cac function is enabled, whether L3 multicast is enabled, and whether isolation between the uplink port and user is enabled
The whole NE | All channels fail to be watched | Uplink device configuration is abnormal | Check whether the uplink device-related configurations are normal
The whole NE | Part of channels fail to be watched | Hash conflict | Check whether there is a hash conflict print
The whole NE | Part of channels fail to be watched | Program source is abnormal | Check whether the uplink delivers the data flows of the corresponding channel; capture packets after configuring with the drop-flow command
A single card | All users fail to watch | Line card and main control ports are down | Check the port states of the line cards and main control. Confirm whether services other than multicast are normal. Confirm whether the inner port of the line card has joined the corresponding multicast vlan
A single card | All users fail to watch | Digitmap setting is abnormal | Check whether the forwarding digitmaps of the service and the driver are consistent
Scattered users | Individual user fails to watch | User downstream forwarding is abnormal | Under the same PON port, for the same channel, some users can watch while others cannot. Check the forwarding state of the user’s ONU
Scattered users | Never watch successfully | Protocol interaction is abnormal | Check the dynamic entries, enable debug print, and check whether packets are received and whether the packets are correct
Scattered users | Abnormal at regular intervals | Entries are aged | Check the dynamic entries and confirm whether the line card entries or main control entries have aged out; enable debug print when the fault occurs and check whether the protocols interact normally; query packet statistics and check whether other packets have an impact
Scattered users | Abnormal after multiple times of channel switching | Port bandwidth exceeds the limit | Check whether fast leave is enabled on the port
Scattered users | Abnormal after multiple times of channel switching | Maximum group limit has been set | Check whether a maximum group limit is configured on the port
Scattered users | Any channels are blurred | The network has packet loss | Check whether upstream traffic is normal
Scattered users | Any channels are blurred | Traffic limit has been set | Check the multicast bandwidth and QOS configuration
Scattered users | Irregularly buffering | Entries are aged | Check the dynamic entries and confirm whether the line card entries or main control entries have aged out; enable debug print when the fault occurs and check whether the protocols interact normally

4. GPON ONUs Fail To Register

(1)Troubleshooting procedures when GPON ONUs fail to register

As shown in the following figure:


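[Figure: troubleshooting flowchart for GPON ONU registration failure. The layout and the Y/N branch lines are not recoverable from the extracted text; the checks and outcomes that are recoverable are listed below. Items marked * are marked in the figure.]

Confirm the fault symptom, then follow the matching branch.

Branch 1: All ONUs on the card fail to register
- Check the line card software: log in to the line card shell and check whether a task is suspended*. If a task is suspended, it is an OLT software problem; collect information and contact R&D for analysis*.
- Check for hardware problems: check the backplane clock lock state*; check whether resetting, re-seating or replacing the line card restores registration.
- If the clock state is not locked, or registration is restored by resetting, re-seating or replacing the card, it is a card problem; record the card SN and return it for repair, or contact R&D for analysis.

Branch 2: All ONUs on the PON port fail to register
- ONU in OFFLINE state: check the configuration: PON port is shut down*; the information of the ONU connected to the PON port does not match the ONU authentication information configured on the PON port (SN authentication)*; whether a TYPE B protection group is configured*; whether the ONU is powered on. Outcome: configuration problem.
- ONU changes between LOGGING-LOS: check the configuration: the ranging mode configured on the PON port does not match the actual fiber distance*. Check the optical path: trunk fiber broken or bent; measured upstream / downstream optical power too low or too high; fiber adapter or optical module dirty or damaged; optical return loss (ORL) exceeds the limit; long laser alarm exists*; LOSi/LOFi/SDi/SFi/LOAMi alarms. Outcome: optical path problem.
- Optical module: if re-seating the optical module restores registration, the optical module is abnormal; if replacing or cleaning the optical module restores registration, the optical module is damaged. In both cases record the optical module vendor, model and SN.
- ONU changes among LOGGING-SYNCMIB-LOS: check for software problems: collect information under the line card shell*; check the LOS reason of the ONU*; check whether OMCI messages are sent and received normally*. If the software is abnormal, it is an OLT software problem; contact R&D for analysis.
- Check whether all ONUs are of the same model, and whether replacing them with ONUs of another model restores registration. If so, it is an interoperability problem; collect the ONU model and version information and contact ONU R&D for analysis.
- Check whether another PON port works normally. If so, it is a card hardware problem; contact R&D for analysis, or record the SN and return the card for repair.

Branch 3: Individual ONUs on the PON port fail to register
- ONU in OFFLINE state: check the configuration: the information of the ONU connected to the PON port does not match the ONU authentication information configured on the PON port (SN authentication)*; whether the ONU is powered on. Outcome: configuration problem.
- ONU changes between LOGGING-LOS: check the optical path: trunk fiber broken or bent; measured upstream / downstream optical power too low or too high; fiber adapter or optical module dirty or damaged; LOSi/LOFi/SDi/SFi/LOAMi alarms. If replacing or cleaning the branch fiber restores registration, it is an optical path problem.
- ONU changes among LOGGING-SYNCMIB-LOS: check for software problems: collect information under the line card shell*; check the LOS reason of the ONU*; check whether OMCI messages are sent and received normally*. If the software is abnormal, it is an OLT/ONU software problem; contact R&D for analysis.
- Check whether re-plugging the ONU fiber or rebooting the ONU restores registration.
- Check for hardware problems: replace the ONU with one of the same model, then with one of a different model. If a same-model ONU restores registration, it is an individual ONU problem. If a different-model ONU restores registration, it is an interoperability problem; collect the ONU model and version information and contact ONU R&D for analysis.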

(2)Symptoms when GPON ONUs fail to register

When ONUs fail to register and do not enter the WORKING state, the general symptoms are:
• ONU state is OFFLINE;
• ONU state changes between LOGGING-LOS;
• ONU state changes among LOGGING-SYNCMIB-LOS;
• ONU state is AUTHFAILED.

Extended description: scenarios in which the ONU is in a non-WORKING state:

• OFFLINE
  - ONU has never been online
  - ONU is RESET
  - ONU is in the backup port of the protection group
• LOGGING
  - ONU authentication has passed, and the ONU is being activated
  - ONU is in the ranging process
• SYNCMIB
  - ONU is activated successfully and enters the OMCI interaction
  - After an alarm scan has moved the ONU to LOS, the re-activated ONU migrates to SYNCMIB
  - After protection switching, ONU migrates from OFFLINE to SYNCMIB
• LOS
  - An alarm is detected on the ONU
  - ONU goes offline because of DEACTIVE/DISABLE/REBOOT
  - The authentication mode of the ONU is modified
  - The SN binding relationship of the ONU is modified

The symptoms above can also be divided into card-level problems, PON port-level problems and ONU-level problems, which need to be differentiated.
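
The symptom list above can be applied mechanically to a series of sampled ONU states. The Python sketch below is a hedged illustration: the function name and the idea of passing in a list of sampled state strings are assumptions, and how the states are collected from the NE is not shown.

```python
from typing import List


def classify_symptom(states: List[str]) -> str:
    """Map a chronological list of sampled ONU states to one of the symptoms above."""
    if not states:
        return "unclassified"
    if states[-1] == "WORKING":
        return "registered"
    seen = set(states)
    if "AUTHFAILED" in seen:
        return "AUTHFAILED"
    if "SYNCMIB" in seen and "LOS" in seen:
        # The ONU got as far as OMCI interaction before dropping to LOS.
        return "LOGGING-SYNCMIB-LOS"
    if "LOGGING" in seen and "LOS" in seen:
        # The ONU never got past activation / ranging.
        return "LOGGING-LOS"
    if seen == {"OFFLINE"}:
        return "OFFLINE"
    return "unclassified"


if __name__ == "__main__":
    print(classify_symptom(["LOGGING", "LOS", "LOGGING", "LOS"]))
```

Running the same classification over every ONU of a PON port and of a card shows whether the symptom is card-level, PON port-level or ONU-level.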

(3)Possible reasons for GPON ONUs to fail to register

Common reasons:

1)All ONUs on the whole card fail to register:

2)All ONUs under a certain PON port are in the OFFLINE state:

3)An individual ONU under a certain PON port is in the OFFLINE state:

4)All ONUs under a certain PON port change between LOGGING-LOS states:

5)An individual ONU under a certain PON port changes between LOGGING-LOS states:

6)All ONUs under a certain PON port change among LOGGING-SYNCMIB-LOS states:

7)An individual ONU under a certain PON port changes among LOGGING-SYNCMIB-LOS states:

(4)Method to collect information when GPON ONU registration is abnormal

1)Current CLI-related configuration information

ZXAN#show running-config-interface gpon_onu-1/X/X:X
ZXAN#show running-config-interface gpon_olt-1/X/X
ZXAN#show gpon olt config gpon_olt-1/X/X
ZXAN#show gpon onu detail-info gpon_onu-1/X/X:X
ZXAN#show optical-module-info gpon_olt-1/X/X
ZXAN#show pon power attenuation interface gpon_onu-1/X/X:X
ZXAN#show performance pon-onu mac gpon_onu-1/X/X:X current (run this command several times)
ZXAN#show card slot xx
ZXAN#show clock

2)Information got under oam shell

The commands may differ between versions; obtain the commands for your version from the R&D Institute.
execute smto(oltId-1,onuId)
execute smso(oltId,onuId,1000)
execute smeo(oltId,onuId,1000)
execute smao(oltId,onuId,1000)
execute sme(2000)
execute sma(2000)
execute sms(2000)
execute showOltSaveAlm(oltId-1)
execute showOnuSaveAlm(oltId-1, onuId)

5. EPON ONUs Fail To Register

(1)Troubleshooting procedures when EPON ONUs fail to register

As shown in the following figure:

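[Figure: troubleshooting flowchart for EPON ONU registration failure. The layout and the Y/N branch lines are not recoverable from the extracted text; the checks and outcomes that are recoverable are listed below. Items marked * are marked in the figure.]

Confirm the fault symptom, then follow the matching branch.

Branch 1: All ONUs on the card fail to register
- Check the line card software: log in to the line card shell and check whether a task is suspended*. If a task is suspended, collect information and contact R&D for analysis*.
- Check for hardware problems: check the clock state*; check whether resetting, re-seating or replacing the line card restores registration.
- If the clock state is not locked, or registration is restored by resetting, re-seating or replacing the card, it is a card problem; record the SN and return the card for repair, or contact R&D for analysis.

Branch 2: All ONUs on the PON port fail to register
- Check the configuration: PON port is shut down*; max rtt is too small*; hardware authentication is enabled*. Outcome: configuration problem.
- Check the optical path: trunk fiber broken; measured downstream optical power too low or too high; long laser alarm exists*. Outcome: optical path problem.
- Optical module: if re-seating the optical module restores registration, the optical module is abnormal; if replacing the optical module restores registration, the optical module is damaged. In both cases record the optical module vendor, model and SN.
- Check for software problems: collect information under the line card shell and check for an OLT firmware problem*. Outcome: software problem; contact R&D for analysis.
- Check for hardware problems: check whether other PON ports are normal. Outcome: card hardware problem; contact R&D for analysis, or record the SN and return the card for repair.

Branch 3: Individual ONUs on the PON port fail to register
- Check the configuration: max rtt is too small*; hardware authentication is enabled*; MAC address conflict. Outcome: configuration problem.
- Check the optical path: measure the optical power; check whether there are statistics of short-duration long laser emission*. Outcome: optical path problem.
- Check for software problems: collect the related information with commands under the line card shell*; check whether re-plugging the ONU fiber or rebooting the ONU restores registration. Outcomes: contact R&D for analysis; ONU problem, contact R&D for analysis.
- Check for hardware problems: replace the ONU with one of the same model, then with one of a different model. If a same-model ONU restores registration, it is an individual ONU problem. If a different-model ONU restores registration, it is an interoperability problem; contact R&D for analysis.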


(2)Symptoms when EPON ONUs fail to register

When an EPON ONU fails to register, the CLI shows that the ONU is in the OFFLINE state or the power-off state, and the local PON LED of the ONU is off or flashing. The symptoms can also be divided into:

• All ONUs on the card fail to register;

• All ONUs under one or more PON ports fail to register;

• One or several ONUs fail to register.

(3)Troubleshooting steps when EPON ONUs fail to register

Step 1: Confirm the fault symptoms. Determine whether all ONUs on the card fail to register, all ONUs under a PON port fail to register, or only several ONUs fail to register;

Step 2: Follow the troubleshooting steps that match the fault symptoms:

1) All ONUs on the card fail to register:

   a) Log in to the line card through its serial port and check under the shell whether any tasks are suspended. If so, collect the necessary information and contact R & D for analysis.

   b) Check the clock state of the card. If the clock state is non-locked, it can be judged as a clock problem, which is usually caused by a hardware fault;

   c) If resetting or re-seating the line card restores registration, the card hardware has a problem but the symptom is unstable; continue to observe the card afterwards;

   d) If the problem can only be solved by replacing the card, the card hardware has a problem. Record the SN and contact R & D to confirm whether to return the card to R & D for analysis;

2) All ONUs under the PON port fail to register:

   a) Check the configuration data;

   b) Check the physical optical path;

   c) Check whether there is a long laser alarm;

   d) Log in to the line card through its serial port, collect the lower-layer ONU registration information, and send it to R & D;

   e) If R & D judges that the software is normal, check whether other PON ports on the same card have the same problem. Then reset the card on the CLI and check whether registration is restored. If not, plug / unplug the card and check again. If it is still not restored, replace the card and check again. If it is confirmed that the card has a problem, record the SN and contact R & D to confirm whether to return the card to R & D for analysis;

3) Only some of the ONUs fail to register:

   a) Check the configuration data;

   b) Check the physical optical path;

   c) Check whether there is a long laser alarm;

   d) Log in to the line card through its serial port, collect the lower-layer registration information, and send it to R & D for analysis;

   e) If R & D judges that the software is normal, troubleshoot by replacing ONUs. First replace the ONU with one of the same type, and then with ONUs of other types.
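
Step 1 above (deciding whether the fault is at card, PON port or ONU level) can be sketched in Python. This is an illustrative assumption of how registration results might be tabulated, not a tool shipped with the NE: the function name, the port names, and the input layout (PON port mapped to ONU ID mapped to a registered flag) are hypothetical.

```python
from typing import Dict


def classify_scope(card: Dict[str, Dict[int, bool]]) -> str:
    """Return 'card', 'pon-port', 'onu' or 'none' for one line card.

    card maps a PON port name to {onu_id: True if registered}.
    PON ports without configured ONUs are ignored.
    """
    ports = {port: onus for port, onus in card.items() if onus}
    # Ports on which every configured ONU failed to register.
    failed_ports = [p for p, onus in ports.items() if not any(onus.values())]
    any_failed = any(not ok for onus in ports.values() for ok in onus.values())
    if ports and len(failed_ports) == len(ports):
        # With a single populated PON port this cannot be told apart from a
        # port-level fault; check the card first, then the port.
        return "card"
    if failed_ports:
        return "pon-port"
    if any_failed:
        return "onu"
    return "none"


if __name__ == "__main__":
    sample = {"1/3/1": {1: False, 2: False}, "1/3/2": {1: True}}
    print(classify_scope(sample))
```

The result selects which of the three procedures in Step 2 to follow.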


Introduction to C600 Maintenance & Locating Methods

1. Capture packets of CPU remotely

(1)Main control card-based protocol packet’s cpu packet capturing function

Enter the control-panel security mode (control-panel):

ZXAN#config t
Enter configuration commands, one per line. End with CTRL/Z.
ZXAN(config)#control-panel
ZXAN(config-control-panel)#
ZXAN(config-control-panel)#snatch-cpu-packet mp enable //Enable the main control card’s cpu packet capturing function
ZXAN(config-control-panel)#write snatch-packet mp //Saving the capture automatically disables the packet capturing function
ZXAN(config-control-panel)#end
ZXAN#dir //View the name of the packet capturing file
ZXAN#copy ftp root: /datadisk0/capture/packetmp.pcap MPU-1/10/0 //10.60.181.21/packetmp.pcap@zte:zte //Upload the captured packets to the local PC. In MPU-1/x/0, x is the main control’s slot number

(2)Line card protocol packet’s cpu packet capturing function

Enter the control-panel security mode (control-panel):

ZXAN(control-panel)#
ZXAN(control-panel)#snatch-cpu-packet slot 3 enable //Enable the line card’s cpu packet capturing function
ZXAN(control-panel)#write snatch-packet slot 3 //Saving the capture automatically disables the packet capturing function
ZXAN(control-panel)#end
ZXAN#dir //View the name of the packet capturing file
ZXAN#copy ftp root: /datadisk0/capture/packetnp.pcap MPU-1/10/0 //10.60.181.21/packetnp.pcap@zte:zte //Upload the captured packets to the local PC


2. Packet statistics function

(1)Unicast flow statistics function

Flow statistics can be performed according to IP, MAC, port:


ZXAN(diag)#statistics flow 1 ?
cvlan Cvlan configuration
dst-ip Destination ip configuration
dst-mac Destination mac configuration
src-ip Source ip configuration
src-mac Source mac configuration
src-port Source port configuration
svlan Svlan configuration

(2)CPU protocol packet statistics

ZXAN#show control-panel rate limit statistics inband


pktType               pktTotal  pktDrop  pps  ppsPeak  dropSec  peakTime
all                   2906      0        0    1        0        12:21:10 12/18/2018
icmp                  0         0        0    0        0        -
ssh                   0         0        0    0        0        -
other                 0         0        0    0        0        -
igmp                  0         0        0    0        0        -
dhcp                  0         0        0    0        0        -
pppoe                 0         0        0    0        0        -
snmp                  0         0        0    0        0        -
telnet                0         0        0    0        0        -
bfd                   0         0        0    0        0        -
stp                   0         0        0    0        0        -
lacp                  0         0        0    0        0        -
lldp                  0         0        0    0        0        -
rip                   0         0        0    0        0        -
bgp                   0         0        0    0        0        -
ospf                  0         0        0    0        0        -
isis                  0         0        0    0        0        -
ldp                   0         0        0    0        0        -
cfm                   0         0        0    0        0        -
ipv6_icmp             0         0        0    0        0        -
ipv6_dhcp             0         0        0    0        0        -
raps                  0         0        0    0        0        -
mld                   0         0        0    0        0        -
erps                  0         0        0    0        0        -
ieee1588              0         0        0    0        0        -
pim                   0         0        0    0        0        -
ntp                   0         0        0    0        0        -
dhcp_bcast            0         0        0    0        0        -
padi                  0         0        0    0        0        -
arp_network_side      0         0        0    0        0        -
arp_user_side         2906      0        0    1        0        12:21:10 12/18/2018
ipv6_rs_network_side  0         0        0    0        0        -
ipv6_rs_user_side     0         0        0    0        0        -
ipv6_ra_network_side  0         0        0    0        0        -
ipv6_ra_user_side     0         0        0    0        0        -
ipv6_ns_network_side  0         0        0    0        0        -
ipv6_ns_user_side     0         0        0    0        0        -
ipv6_na_network_side  0         0        0    0        0        -
ipv6_na_user_side     0         0        0    0        0        -

(3)Protocol packet statistics shown as entering the line card’s CPU

ZXAN#show control-panel rate limit statistics slot 2


pktType               pktTotal  pktDrop  pps  ppsPeak  dropSec  peakTime
all                   37656     0        0    9        0        12:20:30 12/18/2018
icmp                  0         0        0    0        0        -
ssh                   0         0        0    0        0        -
other                 0         0        0    0        0        -
igmp                  37656     0        0    9        0        12:20:30 12/18/2018
dhcp                  0         0        0    0        0        -
pppoe                 0         0        0    0        0        -
snmp                  0         0        0    0        0        -
telnet                0         0        0    0        0        -
bfd                   0         0        0    0        0        -
stp                   0         0        0    0        0        -
lacp                  0         0        0    0        0        -
lldp                  0         0        0    0        0        -
rip                   0         0        0    0        0        -
bgp                   0         0        0    0        0        -
ospf                  0         0        0    0        0        -
isis                  0         0        0    0        0        -
ldp                   0         0        0    0        0        -
cfm                   0         0        0    0        0        -
ipv6_icmp             0         0        0    0        0        -
ipv6_dhcp             0         0        0    0        0        -
raps                  0         0        0    0        0        -
mld                   0         0        0    0        0        -
erps                  0         0        0    0        0        -
ieee1588              0         0        0    0        0        -
pim                   0         0        0    0        0        -
ntp                   0         0        0    0        0        -
dhcp_bcast            0         0        0    0        0        -
padi                  0         0        0    0        0        -
arp_network_side      0         0        0    0        0        -
arp_user_side         0         0        0    0        0        -
ipv6_rs_network_side  0         0        0    0        0        -
ipv6_rs_user_side     0         0        0    0        0        -
ipv6_ra_network_side  0         0        0    0        0        -
ipv6_ra_user_side     0         0        0    0        0        -
ipv6_ns_network_side  0         0        0    0        0        -
ipv6_ns_user_side     0         0        0    0        0        -
ipv6_na_network_side  0         0        0    0        0        -
ipv6_na_user_side     0         0        0    0        0        -
ZXAN#
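
When the statistics output is saved to a text file, the rows with traffic or drops can be picked out with a small script. The Python sketch below is an illustration under assumptions: it expects the column layout shown above (pktType, pktTotal, pktDrop, pps, ppsPeak, dropSec, peakTime), tolerates long packet type names and peak times that wrap onto a following line, and uses hypothetical function names.

```python
from typing import Dict, List


def parse_rate_limit_stats(text: str) -> List[Dict[str, object]]:
    """Parse 'show control-panel rate limit statistics' output into rows."""
    rows: List[Dict[str, object]] = []
    for line in text.splitlines():
        tokens = line.split()
        # Skip blank lines, the header row and CLI prompt lines.
        if not tokens or tokens[0] == "pktType" or tokens[0].startswith("ZXAN"):
            continue
        if len(tokens) >= 7 and all(t.isdigit() for t in tokens[1:6]):
            rows.append({
                "pktType": tokens[0],
                "pktTotal": int(tokens[1]),
                "pktDrop": int(tokens[2]),
                "pps": int(tokens[3]),
                "ppsPeak": int(tokens[4]),
                "dropSec": int(tokens[5]),
                "peakTime": " ".join(tokens[6:]),
            })
        elif rows:
            # Continuation line: a wrapped name suffix and/or the peak date.
            for tok in tokens:
                if "/" in tok:
                    rows[-1]["peakTime"] = f"{rows[-1]['peakTime']} {tok}"
                else:
                    rows[-1]["pktType"] = f"{rows[-1]['pktType']}{tok}"
    return rows


def active_types(rows: List[Dict[str, object]]) -> List[str]:
    """Packet types (excluding the 'all' summary) with traffic or drops."""
    return [str(r["pktType"]) for r in rows
            if r["pktType"] != "all" and (r["pktTotal"] or r["pktDrop"])]


SAMPLE = """pktType pktTotal pktDrop pps ppsPeak dropSec peakTime
all 2906 0 0 1 0 12:21:10
12/18/2018
icmp 0 0 0 0 0 -
arp_networ 0 0 0 0 0 -
k_side
arp_user_s 2906 0 0 1 0 12:21:10
ide 12/18/2018
"""

if __name__ == "__main__":
    print(active_types(parse_rate_limit_stats(SAMPLE)))
```

A non-zero pktDrop or dropSec for a packet type points to rate limiting on that protocol, which is worth checking when another protocol's packets are suspected of affecting multicast.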

(4)Multicast service flow statistics

The commands are listed below. Choose the IP and port information according to the specific problem in the live network:
ZXAN(diag)#statistics multicast-flow 1 mvlan 4000 group 224.1.1.2 source ?
:: Ipv6 source address
A.B.C.D Ipv4 source address
ZXAN(diag)#statistics multicast-flow 1 mvlan 4000 group 224.1.1.2 source
