HPUX SW Recovery Handbook - HPUX IO Addressing (5): the tools fcmsutil, tdutil, etc.

Diagnostic tools: fcmsutil, tdutil, tdlist, tddiag
fcmsutil
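The fcmsutil tool lives in /opt/fc/bin/ or /opt/fcms/bin. It can be used to diagnose problems on an FC loop. It must be started with the device file of the Fibre Channel card (use ioscan -fnk to find it).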
Tachyon Example:
# fcmsutil /dev/fcms2
Local N_Port_ID is = 0x000001
N_Port Node World Wide Name = 0x10000060B03EF669
N_Port Port World Wide Name = 0x10000060B03EF669
Topology = IN_LOOP
Speed = 1062500000 (bps)
HPA of card = 0xFFB4C000
EIM of card = 0xFFFA2009
Driver state = READY
Number of EDB's in use = 0
Number of OIB's in use = 0
Number of Active Outbound Exchanges = 1
Number of Active Login Sessions = 3
Tachyon TL/TS/XL2 Example:
# fcmsutil /dev/td0
Vendor ID is = 0x00103c
Device ID is = 0x001029
XL2 Chip Revision No is = 2.2
PCI Sub-system Vendor ID is = 0x00103c
PCI Sub-system ID is = 0x00128c
Topology = PTTOPT_FABRIC
Link Speed = 1Gb
Local N_Port_id is = 0x0d1200
N_Port Node World Wide Name = 0x50060b000010dcef
N_Port Port World Wide Name = 0x50060b000010dcee
Driver state = ONLINE
Hardware Path is = 0/4/1/0
Number of Assisted IOs = 471
Number of Active Login Sessions = 0
Dino Present on Card = NO
Maximum Frame Size = 1024
Driver Version = @(#) libtd.a HP Fibre Channel
Tachyon XL2 Driver B.11.23.0512 $Date: 2005/09/20 12:22:47 $Revision:
r11.23/1
Another possible topology is PRIVATE_LOOP.
Once the fibre cable is unplugged, the state changes to:
Driver state = AWAITING_LINK_UP
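When the output has to be checked on several hosts, the "key = value" lines shown above are easy to turn into a dictionary. The following is only a minimal sketch: the function names parse_fcmsutil and link_is_up are made up for illustration, and treating READY and ONLINE as the "link up" states is an assumption drawn from the two examples in this post, not from the driver documentation.

```python
def parse_fcmsutil(text):
    """Turn the 'key = value' lines of fcmsutil output into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # wrapped or header line without '='
        key = key.strip()
        # Some keys are printed as "X is = value"; drop the trailing "is".
        if key.endswith(" is"):
            key = key[:-3].rstrip()
        info[key] = value.strip()
    return info


def link_is_up(info):
    """Assumption: READY (Tachyon) and ONLINE (TL/TS/XL2) mean the link is up."""
    return info.get("Driver state") in ("READY", "ONLINE")


sample = """Topology = PTTOPT_FABRIC
Local N_Port_id is = 0x0d1200
Driver state = AWAITING_LINK_UP
"""
info = parse_fcmsutil(sample)
print(info["Topology"], info["Local N_Port_id"], link_is_up(info))
```

On a real system the text would come from running fcmsutil against the card's device file; here a short excerpt is pasted in so the sketch runs anywhere.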
Link statistics can be retrieved as well:
# fcmsutil /dev/td0 stat -s
Fri Apr 26 16:05:55 2002
Channel Statistics
Statistics From Link Status Registers ...
Loss of signal
Loss of Sync
Received EOFa
Bad CRC
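Do not judge a counter by how high its value is, because some counters increment on every boot or LIP. Only counters that keep growing over time point to a problem. A typical error message looks like this: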
0/4/0/0: Unable to access previously accessed device at nport ID 0xae.
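The advice about the link counters (watch for growth over time, not for absolute values) amounts to comparing two snapshots taken some time apart. A minimal sketch follows; the counter names are taken from the stat output above, while the function name growing_counters and all numeric values are invented for illustration, since the original values did not survive in this post.

```python
def growing_counters(before, after):
    """Return {name: delta} for counters that increased between two snapshots."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] > before[name]
    }


# Hypothetical snapshots taken some time apart (values are made up).
before = {"Loss of signal": 4, "Loss of Sync": 9, "Received EOFa": 0, "Bad CRC": 12}
after = {"Loss of signal": 4, "Loss of Sync": 9, "Received EOFa": 0, "Bad CRC": 57}

print(growing_counters(before, after))
```

In this made-up case only Bad CRC is reported, because the other counters are non-zero but stable.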
How to diagnose:
[Figure: two screenshots in which the 8.0.255 part of the hardware path is highlighted in blue]
Note the 8.0.255 part highlighted in blue: it indicates a private loop. In this case the N_Port ID equals the AL_PA.
# fcmsutil /dev/td0 devstat all | grep -e Nport -e Failed
Device Statistics for Nport_id 0x0000ae(Loop_id 34)
Failed Open of previously opened device
Device Statistics for Nport_id 0x0000b9(Loop_id 27)
Failed Open of previously opened device
Device Statistics for Nport_id 0x0000ba(Loop_id 26)
Failed Open of previously opened device
Device Statistics for Nport_id 0x0000bc(Loop_id 25)
Failed Open of previously opened device
Device Statistics for Nport_id 0x0000c3(Loop_id 24)
Failed Open of previously opened device
Device Statistics for Nport_id 0x0000c6(Loop_id 22)
Failed Open of previously opened device
Device Statistics for Nport_id 0x0000ce(Loop_id 15)
Failed Open of previously opened device
Device Statistics for Nport_id 0x0000d1(Loop_id 14)
Failed Open of previously opened device
Device Statistics for Nport_id 0x0000d2(Loop_id 13)
Failed Open of previously opened device
Device Statistics for Nport_id 0x0000d3(Loop_id 12)
Failed Open of previously opened device
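The "Device Statistics" header lines carry both the N_Port ID and the Loop ID, so they can be pulled apart mechanically. The sketch below assumes the line format shown in this post; parse_devstat is a made-up helper name, and taking the low byte of the N_Port ID as the AL_PA is valid only for the private-loop case described above.

```python
import re

# Matches e.g. "Nport_id 0x0000ae(Loop_id 34)" with or without a space before "(".
DEVICE_RE = re.compile(r"Nport_id\s+0x([0-9a-fA-F]+)\s*\(Loop_id\s+(\d+)\)")


def parse_devstat(text):
    """Extract N_Port ID, Loop ID and AL_PA from devstat header lines."""
    devices = []
    for match in DEVICE_RE.finditer(text):
        nport_id = int(match.group(1), 16)
        devices.append({
            "nport_id": nport_id,
            "loop_id": int(match.group(2)),
            "al_pa": nport_id & 0xFF,  # private loop only: low byte is the AL_PA
        })
    return devices


sample = """Device Statistics for Nport_id 0x0000ae(Loop_id 34)
Device Statistics for Nport_id 0x0000b9(Loop_id 27)
"""
for dev in parse_devstat(sample):
    print(f"loop_id {dev['loop_id']} -> AL_PA 0x{dev['al_pa']:02x}")
```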
This is a private loop that uses the PDA (Peripheral Device Addressing) mode (8.0.255), where
LoopID = 16*Bus+Target
==> Bus = 34 DIV 16 = 2
==> Target = 34 MOD 16 = 2
So the corresponding HW path is:
HBA Domain Area Port Bus Target Lun
0/4/0/0. 8 . 0 . 255 . 2 . 2 . 0
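The DIV/MOD arithmetic above can be written as a small helper. This is only a sketch of the private-loop PDA case described here: the function name loop_id_to_hw_path is made up, and the fixed 8.0.255 prefix and the default LUN 0 are taken from this example rather than from a general rule.

```python
def loop_id_to_hw_path(hba_path, loop_id, lun=0, prefix="8.0.255"):
    """Build the HW path for a private-loop device addressed in PDA mode."""
    # LoopID = 16*Bus + Target, so divmod yields (Bus, Target).
    bus, target = divmod(loop_id, 16)
    return f"{hba_path}.{prefix}.{bus}.{target}.{lun}"


print(loop_id_to_hw_path("0/4/0/0", 34))  # 0/4/0/0.8.0.255.2.2.0
```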
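This path does not appear in the ioscan output, because the I/O node is reported as NO_HW. Someone has probably disconnected the device, and the host has not been rebooted since.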
ext_bus 6 0/4/0/0.8.0.255.2 fcpdev NO_HW INTERFACE FCP Device Interface
The A5236A is an FC10 JBOD disk enclosure.
Overview of fcmsutil options
T: applies to the Tachyon chip; TL: applies to Tachyon TL/TS/XL2; entries in red: disruptive operations that may interrupt communication.
[Figure: table of fcmsutil options]
tdutil, tdlist, tddiag
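/opt/fcms/bin contains some further tools:
tdutil
This is the fcmsutil counterpart dedicated to the td driver.
tdlist
A shell script that uses the ioscan and tdutil commands to list all devices driven by the td driver. A handy feature of this script is that it converts the Loop ID to the AL_PA.
tddiag
A shell script that collects the following information: hostname, host model, memory information, mounted file systems, TachLite version information, ioscan output, tdlist output, patches, the device file names belonging to TachLite, running processes, the information for each /dev/td#, and so on.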
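How to replace a disk - Tachyon TL/TS/XL2 HBAs
Before the server accesses a target, it first authenticates the target by its WWN. This authentication (PLOGI) ensures that the host is accessing the correct device and prevents data corruption when a user inadvertently connects a different device at the same nport_id.
The HBA keeps a table that records the address of every device it has accessed (S_ID or AL_PA, and WWN). The table is built during link initialization or on each communication.
NOTE: This authentication is supported only on TL/TS/XL2 adapters; Tachyon adapters do not perform it. (The replace_dsk option is supported only on TL/TS/XL2.)
The replace_dsk option of fcmsutil is used when a device at the same nport_id is replaced. Run this command when replacing a disk: it disables authentication of the device on the next access and so avoids the following error message (syslog):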
0/4/0/0: 'World-wide name' (unique identifier) for device at Loop ID
0x5 has changed. If the device has been replaced intentionally, please
use the fcmsutil replace_dsk command to allow the new device to be
used.
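The interplay between the address table, the WWN check and replace_dsk can be pictured with a toy model. The following is purely illustrative and is not how the td driver is implemented: the class LoginTable, its methods and the WWN strings are all hypothetical, and the only behavior modeled is the one described above (a changed WWN at a known nport_id is rejected unless replace_dsk was run first).

```python
class LoginTable:
    """Toy model of the HBA's table of known devices (not the real driver)."""

    def __init__(self):
        self.known = {}          # nport_id -> WWN seen at the last accepted login
        self.skip_once = set()   # nport_ids for which the next check is skipped

    def replace_dsk(self, nport_id):
        """Model of 'fcmsutil ... replace_dsk': skip the next WWN check."""
        self.skip_once.add(nport_id)

    def login(self, nport_id, wwn):
        """Return True if access is allowed, False if the WWN has changed."""
        if nport_id in self.skip_once:
            self.skip_once.discard(nport_id)
            self.known[nport_id] = wwn  # the new WWN replaces the old entry
            return True
        expected = self.known.setdefault(nport_id, wwn)
        return expected == wwn


table = LoginTable()
print(table.login(0xDA, "WWN-OLD"))   # first contact: accepted and recorded
print(table.login(0xDA, "WWN-NEW"))   # different WWN at the same nport_id: rejected
table.replace_dsk(0xDA)
print(table.login(0xDA, "WWN-NEW"))   # check skipped once, table updated
```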
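In this example, the disk with loop_id=5 attached to the Tachyon TL/TS/XL2 adapter /dev/td0 is to be replaced.
· Determine the nport_id or loop_id of the disk to be replaced (not needed if syslog or another log has already reported it).
· Remove the old device.
· List all devices that have been accessed successfully, using the devstat all option: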
# fcmsutil /dev/td0 devstat all | grep Loop
Device Statistics for Nport_id 0x0000E8 (Loop_id 1)
Device Statistics for Nport_id 0x0000DA (Loop_id 5)
# fcmsutil /dev/td0 echo -l 1
Data came back intact
...
# fcmsutil /dev/td0 echo -l 5
Unable to login
· Run replace_dsk with the nport_id:
# fcmsutil /dev/td0 replace_dsk 0x0000DA
In a private loop topology the loop_id can be used instead:
# fcmsutil /dev/td0 replace_dsk -l 5
After the command has run, it prints a message:
Disk at nportid 0x0000da (Loop_id 5) will not be authenticated
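ATTENTION: This step must be carried out on every Tachyon TL/TS/XL2 adapter that accesses the device!
· Replace the disk.
The new disk (loop_id=5) will then not trigger an authentication error. The next time the host accesses the device, its WWN is updated and associated with the nport_id.
Entering the correct nport_id or loop_id matters. If you do mistype it, however, nothing bad happens.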
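Because replace_dsk has to be run on every adapter that sees the device, it can help to generate the command lines first and review them before typing anything. The sketch below only builds strings in the two forms shown above and executes nothing; the function name replace_dsk_commands and the second device file /dev/td1 are hypothetical.

```python
def replace_dsk_commands(device_files, nport_id=None, loop_id=None):
    """Build one 'fcmsutil <dev> replace_dsk ...' line per adapter device file."""
    if (nport_id is None) == (loop_id is None):
        raise ValueError("give exactly one of nport_id or loop_id")
    if nport_id is not None:
        target = f"0x{nport_id:06X}"   # e.g. 0x0000DA
    else:
        target = f"-l {loop_id}"       # loop_id form, private loop only
    return [f"fcmsutil {dev} replace_dsk {target}" for dev in device_files]


# /dev/td1 stands for a hypothetical second adapter that accesses the same disk.
for cmd in replace_dsk_commands(["/dev/td0", "/dev/td1"], nport_id=0xDA):
    print(cmd)
```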
Fibre Channel Storage Devices
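A brief look at the existing FC storage devices. For details, see
Skipped: everything described there is obsolete hardware, and only the EVA and XP are still in use.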