Sunday, 31 March 2013

lsof - List open files

lsof lists information about files opened by processes. It is especially useful when you are troubleshooting an issue and need more detail about a process or a connection. Linux treats almost everything as a file. An open file may be a regular file, a directory, a block special file, a character special file, an executing text reference, a library, a stream, or a network file (Internet socket, NFS file, or UNIX domain socket). A specific file, or all the files in a file system, may be selected by path. When a process or application interacts with these files, it has to "open" them. With this command you can dig in and see what your process is doing.

To show all open TCP files. This shows which service is running, who is running it, the process ID, and the connections on all TCP ports:

# lsof -i TCP
To show the open files or programs using TCP port 80:


# lsof -i TCP:80
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
httpd 21867 root 3u IPv4 98670393 TCP *:http (LISTEN)
httpd 21891 apache 3u IPv4 98670393 TCP *:http (LISTEN)
httpd 21892 apache 3u IPv4 98670393 TCP *:http (LISTEN)
httpd 21893 apache 3u IPv4 98670393 TCP *:http (LISTEN)
httpd 21894 apache 3u IPv4 98670393 TCP *:http (LISTEN)
httpd 21895 apache 3u IPv4 98670393 TCP *:http (LISTEN)
httpd 21896 apache 3u IPv4 98670393 TCP *:http (LISTEN)

To list which processes and users are actively using /tmp/:

# lsof /tmp/
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
bash 4756 root cwd DIR 8,2 36864 212577 /tmp/
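
When the listing is long, a small filter can summarize it. The following is a minimal sketch and not part of lsof itself: the function name count_by_user is made up for illustration, and it assumes the column layout shown above, with a header on the first line, COMMAND in the first field and USER in the third.

```shell
# count_by_user: hypothetical helper that reads lsof output on stdin and
# prints one line per COMMAND/USER pair with the number of matching rows.
count_by_user() {
    awk 'NR > 1 { n[$1 " " $3]++ }          # skip the header line
         END    { for (k in n) print k, n[k] }' | sort
}

# Usage: lsof -i TCP:80 | count_by_user
```

Fed the port 80 listing shown earlier, this would print "httpd apache 6" and "httpd root 1".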

Dynamically detecting new disks in Linux

When new LUNs have been created on the SAN fabric, then zoned and mapped to the server, how can you detect them on the Linux server online, without rebooting it?

When you dynamically add new disks to a Linux VM running on an ESX server, how do you detect those disks on the Linux virtual machine?

Here are the steps to do that:
  1. Install the sg3_utils and lsscsi packages.
    [root@uminux ~]# yum install -y sg3_utils lsscsi
  2. The “lsscsi” command lists the disks attached to the system. If you have just attached a disk, you will not see it yet. You can also list the disks using “fdisk -l”.
    [root@uminux ~]# lsscsi
    [0:0:0:0]    disk    VMware   Virtual disk     1.0   /dev/sda
    [root@uminux ~]#

    As you can see above, I currently have one disk connected to the system. To scan for the device I just added, run rescan-scsi-bus.sh on the host.

  3. Run the command “/usr/bin/rescan-scsi-bus.sh” to dynamically detect and activate the new disk.
    [root@uminux ~]# /usr/bin/rescan-scsi-bus.sh -l
    Host adapter 0 (mptspi) found.
    Scanning SCSI subsystem for new devices
    Scanning host 0 for  SCSI target IDs  0 1 2 3 4 5 6 7, LUNs  0 1 2 3 4 5 6 7
    Scanning for device 0 0 0 0 ...
    OLD: Host: scsi0 Channel: 00 Id: 00 Lun: 00
          Vendor: VMware   Model: Virtual disk   Rev: 1.0
          Type:   Direct-Access                  ANSI SCSI revision: 02
    Scanning for device 0 0 1 0 ...
    NEW: Host: scsi0 Channel: 00 Id: 01 Lun: 00
          Vendor: VMware   Model: Virtual disk   Rev: 1.0
          Type:   Direct-Access                  ANSI SCSI revision: 02
    1 new device(s) found.
    0 device(s) removed.
    [root@uminux ~]#

    [root@uminux ~]# lsscsi
    [0:0:0:0]    disk    VMware   Virtual disk     1.0   /dev/sda
    [0:0:1:0]    disk    VMware   Virtual disk     1.0   /dev/sdb
    [root@uminux ~]#
You see the new disk is visible. Now you can create a partition or filesystem on it.

After running those commands, check dmesg and /var/log/messages to see if there are any device detections. You can also do "fdisk -l" or "cat /proc/scsi/scsi" to see the attached LUNs. This works fine in RHEL5, SuSE 10, CentOS5, OEL5.
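
If rescan-scsi-bus.sh is not installed, a rescan can also be triggered through sysfs by writing "- - -" (wildcards for channel, target and LUN) to each host adapter's scan file. The function below is a minimal sketch of that approach. The name rescan_scsi_hosts and its optional directory argument are illustrative assumptions, and against the real /sys tree it has to be run as root.

```shell
# rescan_scsi_hosts: hypothetical helper that asks every SCSI host adapter
# to rescan all channels, targets and LUNs ("- - -" is a wildcard for each).
# The optional argument overrides the sysfs directory, which allows a dry
# run against a scratch directory.
rescan_scsi_hosts() {
    base=${1:-/sys/class/scsi_host}
    for host in "$base"/host*; do
        [ -e "$host/scan" ] || continue     # nothing matched, or no scan file
        echo "Rescanning ${host##*/}"
        echo "- - -" > "$host/scan"
    done
}

# Usage (as root): rescan_scsi_hosts
```

This only adds newly visible devices. It does not remove devices that have gone away, so lsscsi should still be checked afterwards.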

Listing mksysb image details

lsmksysb

There's a simple command to list information about a mksysb image, called lsmksysb:
# lsmksysb -lf mksysb.image
VOLUME GROUP:      rootvg
BACKUP DATE/TIME:  Mon Jun 6 04:00:06 MST 2011
UNAME INFO:        AIX testaix1 1 6 0008CB1A4C00
BACKUP OSLEVEL:    6.1.6.0
MAINTENANCE LEVEL: 6100-06
BACKUP SIZE (MB):  49920
SHRINK SIZE (MB):  17377
VG DATA ONLY:      no

rootvg:

LV NAME    TYPE     LPs  PPs  PVs  LV STATE      MOUNT POINT
hd5        boot     1    2    2    closed/syncd  N/A
hd6        paging   32   64   2    open/syncd    N/A
hd8        jfs2log  1    2    2    open/syncd    N/A
hd4        jfs2     8    16   2    open/syncd    /
hd2        jfs2     40   80   2    open/syncd    /usr
hd9var     jfs2     40   80   2    open/syncd    /var
hd3        jfs2     40   80   2    open/syncd    /tmp
hd1        jfs2     8    16   2    open/syncd    /home
hd10opt    jfs2     8    16   2    open/syncd    /opt
dumplv1    sysdump  16   16   1    open/syncd    N/A
dumplv2    sysdump  16   16   1    open/syncd    N/A
hd11admin  jfs2     1    2    2    open/syncd    /admin

AIX ODM Commands

Basic ODM Commands:

ODM stands for Object Data Manager.

NOTE: VERY IMPORTANT!

Use these commands with EXTREME CAUTION! Make backup copies of the individual ODM class files (CuAt, CuDv, CuDvDr, CuDep, and CuVPD) before you attempt to use these commands.

First, take a backup of the ODM files by issuing:
cd /etc/objrepos
for i in CuAt CuDv CuDvDr CuDep CuVPD
do
odmget "$i" > "/tmp/$i.orig"
done

1. How to find disk drive in ODM customized database:

odmget -q name=hdisk# CuAt           ==> CuAt = Customized Attribute
odmget -q value=hdisk# CuAt
odmget -q name=hdisk# CuDv           ==> CuDv = Customized Device
odmget -q value3=hdisk# CuDvDr      ==> CuDvDr = Customized Device Driver
odmget -q name=hdisk# CuDep        ==> CuDep = Customized Dependency
odmget -q name=hdisk# CuVPD       ==> CuVPD = Customized Vital Product Data

2. How to remove disk drive entries from ODM customized database:

odmdelete -q name=hdisk# -o CuAt
odmdelete -q value=hdisk# -o CuAt
odmdelete -q name=hdisk# -o CuDv
odmdelete -q value3=hdisk# -o CuDvDr
odmdelete -q name=hdisk# -o CuDep
odmdelete -q name=hdisk# -o CuVPD

3. How to find VG (rootvg) in ODM database:

odmget -q name=rootvg CuAt
odmget -q name=rootvg CuDv
odmget -q parent=rootvg CuDv
odmget -q value1=rootvg CuDvDr
odmget -q value3=rootvg CuDvDr
odmget -q name=rootvg CuDep

4. How to find LV in ODM database:

odmget -q name=LV CuAt
odmget -q name=LV CuDv
odmget -q value3=LV CuDvDr
odmget -q dependency=LV CuDep

5. How to find an object in CuDvDr by major and minor number:

Example: if major number = 10 and minor number = 1
odmget -q "value1=10 and value2=1" CuDvDr

6. How to find value (may be pvid of an old disk left in CuAt):

odmget -q value=00001165d6faf66b0000000000000000 CuAt
(Append 16 zeros to the 16-character PVID. The value
should be 32 characters in length.)
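
The padding can be scripted so the zeros do not have to be counted by hand. This is a minimal sketch; pad_pvid is a made-up helper name, and it assumes the PVID is given as it appears in lspv output (16 hexadecimal characters).

```shell
# pad_pvid: hypothetical helper that right-pads a PVID with zeros until it
# is 32 characters long, the form stored in the CuAt "value" field.
pad_pvid() {
    pvid=$1
    while [ "${#pvid}" -lt 32 ]; do
        pvid="${pvid}0"
    done
    printf '%s\n' "$pvid"
}

# Usage: odmget -q "value=$(pad_pvid 00001165d6faf66b)" CuAt
```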

7. How to search the ODM for a specific item:

odmget CuAt | grep (Specific Item) -> record the number of items
odmget CuDv | grep (Specific Item) -> record the number of items
odmget CuDvDr | grep (Specific Item) -> record the number of items
odmget CuDep | grep (Specific Item) -> record the number of items
odmget CuVPD | grep (Specific Item) -> record the number of items

Now you can use the odmdelete command above to remove the specific item that you searched for.

How can I directly read out the VGDA of a PV (hdisk)?

Information about VGs, LVs, filesystems, etc. is stored in the ODM. This information is also written to the VGDA of the disks themselves. You can read it directly from a disk's VGDA with a command like this:
# lqueryvg -Atp hdisk100
You can use
# redefinevg -d hdisk100 myvg
to synchronize the ODM with the information of the VGDA. You can also synchronize the VGDA with the information stored in the ODM:
# synclvodm myvg

NIM Commands

A. INTRODUCTION: OBJECTS AND CLASSES

NIM (the Network Installation Manager) stores all information needed for the installation of servers in objects. Objects are organized in object types and object classes. Here is an overview of the most important object types and classes:

Class      Type        Description
machines   standalone  the client LPAR to be installed via NIM
networks   ent         network definition (network address, gateway)
resources  lpp_source  a set of AIX filesets
resources  mksysb      a mksysb image
resources  spot        a /usr filesystem
resources  fb_script   script to be executed during the first boot after installation
resources  script      a postinstall script

B. COMMAND OVERVIEW

1. LISTING ALL DEFINED NIM OBJECTS

# lsnim

2. LISTING ALL DEFINED OBJECTS OF A SPECIFIC TYPE

# lsnim -t <type>

3. SHOWING AN OBJECT'S DEFINITION

# lsnim -l <object>

4. DEFINING AN LPP SOURCE

# nim -o define -t lpp_source \
            -a server=master \
            -a location=</path/to/bffs> \
            -a comments="<free text>" \
        <lpp source>

5. DEFINING A NETWORK

# nim -o define -t ent \
          -a net_addr=<netaddress>  \
          -a snm=<netmask>  \
          -a routing1="default <gateway>" \
       <network>

6. DEFINING A NIM CLIENT

# nim -o define -t standalone \
           -a platform=chrp \
           -a netboot_kernel=64 \
           -a if1="<network> <ip label> 0 ent" \
           -a cable_type1=tp \
        <client>

You could also use an IP address instead of an IP label here.

7. DEFINING AN MKSYSB RESOURCE

# nim -o define -t mksysb \
        -a server=master \
        -a comments="<free text>" \
        -a location=<directory> \
    <mksysb>

8. DEFINING AN IMAGE_DATA RESOURCE

# nim -o define -t image_data \
        -a server=master \
        -a comments="<free text>" \
        -a location=</path/to/image_data> \
    <image_data>

9. CREATING A SPOT FROM AN LPP SOURCE

# nim -o define -t spot \
            -a server=master \
            -a source=<lpp source> \
            -a location=<directory> \
            -a comments="<free text>" \
        <spot>

10. CREATING A SPOT FROM AN MKSYSB

# nim -o define -t spot \
        -a server=master \
        -a source=<mksysb> \
        -a location=<directory> \
        -a comments="<free text>" \
    <spot>

Use the base directory for your spots here rather than a spot specific directory. NIM automatically creates a subdirectory with the name of the spot object: <spot>

11. PREPARE SPOT AND LPP SOURCE FOR AN ALTERNATE DISK MIGRATION

# nimadm -M -s <spot> -l <lpp source> -d <source directory>

In <source directory>, NIM searches for the two filesets «bos.alt_disk_install.rte» and «bos.alt_disk_install.boot_images». nimadm then updates the spot and LPP source with these two filesets. This way you can migrate a client to a lower AIX level than the level of the NIM server itself. This feature was added to NIM with AIX 7.1.

12. MODIFYING A CLIENT DEFINITION

# nim -o change -a <attribute>=<value> <client>

You find the exact names of valid attributes in the output of lsnim -l <client>. The option change is used to change the value of an attribute, e.g. if you want to change a client's netboot kernel from 64 to mp you would type:

# nim -o change -a netboot_kernel=mp <client>

13. RE-INITIALIZING A CLIENT

If a client's /etc/niminfo is out of date, it can be rewritten with the following procedure:

client# rm /etc/niminfo
client# niminit -a name=<client> -a master=<nimserver> -a connect=nimsh¹

This procedure is useful if you want to move a client from one NIM server to another. In this case remember to first create the client on the server before running this procedure.

¹ "-a connect=nimsh" is optional and only required if you don't want the NIM server to communicate via rsh with the client.

14. INSTALLING A CLIENT

# nim -o bos_inst \
           -a spot=<spot> \
           -a lpp_source=<lpp source> \
           -a fb_script=<script> \
           -a script=<postinstall script> \
           -a no_client_boot=yes \
           -a accept_licenses=yes \
        <client>

Use the option no_client_boot=yes if you don't want NIM to initiate a reboot of your LPAR over rsh. You then have to boot the LPAR manually from the SMS menu, which is probably what you want.

15. INSTALLING A CLIENT WITH AN MKSYSB IMAGE

# nim -o bos_inst \
           -a source=mksysb \
           -a spot=<spot> \
           -a mksysb=<mksysb> \
           -a lpp_source=<lpp source> \
           -a fb_script=<script> \
           -a script=<postinstall script> \
           -a no_client_boot=yes \
           -a accept_licenses=yes \
        <client>

16. RESET A NIM CLIENT

# nim -F -o reset <client>

This resets a NIM client so that new operations can be performed. Note that it is often not enough to just reset a NIM object, because resources may still be allocated to the client. You can find all resources still allocated to the client with lsnim -l <client>. They can be removed with:
# nim -o deallocate -a spot=<spot> -a ...=... <client>

To remove all resources from a client simply run:
# nim -o deallocate -a subclass=all <client>

17. QUERY A CLIENT FOR INSTALLED APARS

# nim -o fix_query <client>

This command is useful for checking whether your NIM server can reach the client.

18. ENABLING A MAINTENANCE BOOT

# nim -o maint_boot -a spot=<spot> <client>

Now you can boot your client over the network into a maintenance shell.

19. START AN ALTERNATE DISK MIGRATION

# nimadm -c <client> -l <lpp source> -s <spot> -d <hdisk> -Y

Configuring Tape Devices on AIX with NPIV and VIOS

This quick tip provides some insight into what to expect when configuring tape drives on the AIX operating system with Virtual Fibre Channel adapters, the Virtual I/O Server (VIOS), and NPIV.
In this environment, the 8Gb Fibre Channel (FC) adapters (Feature Code 5735) have been assigned to the Virtual I/O Servers. A single dual-port 8Gb FC adapter is assigned to each VIOS. These adapters are dedicated for use with tape only. The tape library in question is an IBM TS3310.
The AIX LPARs were initially configured with Virtual FC adapters for connectivity to FC SAN disk (XIV). As shown in the lspath output below, fcs0 through fcs3 are used exclusively for access to disk.
# lsdev -Cc adapter | grep fcs
fcs0 Available 30-T1 Virtual Fibre Channel Client Adapter
fcs1 Available 31-T1 Virtual Fibre Channel Client Adapter
fcs2 Available 32-T1 Virtual Fibre Channel Client Adapter
fcs3 Available 33-T1 Virtual Fibre Channel Client Adapter

# lspath
Enabled hdisk0  fscsi0
Enabled hdisk0  fscsi0
Enabled hdisk0  fscsi0
Enabled hdisk0  fscsi0
Enabled hdisk0  fscsi1
Enabled hdisk0  fscsi1
Enabled hdisk0  fscsi1
Enabled hdisk0  fscsi1
Enabled hdisk0  fscsi2
Enabled hdisk0  fscsi2
Enabled hdisk0  fscsi2
Enabled hdisk0  fscsi2
Enabled hdisk0  fscsi3
Enabled hdisk0  fscsi3
Enabled hdisk0  fscsi3
Enabled hdisk0  fscsi3
..etc.. for the other disks on the system
In order to connect to our tape drives (of which there were four in total in the TS3310), we configured four additional virtual FC adapters for the LPAR.
First we ensured that the physical adapters were available and had fabric connectivity. On both VIOS, we used the lsnports command to determine the state of the adapters and their NPIV capability. As shown in the following output, the physical adapters fcs4 and fcs5 were both available and NPIV ready (i.e., there was a 1 in the fabric column; if it were 0, the adapter might not be connected to an NPIV-capable SAN).
$ lsnports
name             physloc                        fabric tports aports swwpns  awwpns
fcs0             U78A0.001.DNWK4W9-P1-C3-T1          1     64     52   2048    1988
fcs1             U78A0.001.DNWK4W9-P1-C3-T2          1     64     52   2048    1988
fcs2             U5877.001.0084548-P1-C1-T1          1     64     61   2048    2033
fcs3             U5877.001.0084548-P1-C1-T2          1     64     61   2048    2033
fcs4             U5877.001.0084548-P1-C2-T1          1     64     64   2048    2048
fcs5             U5877.001.0084548-P1-C2-T2          1     64     64   2048    2048
When I initially checked the state of the adapters on both VIOS, I encountered the following output from lsnports:
$ lsnports
name             physloc                        fabric tports aports swwpns  awwpns
fcs0             U78A0.001.DNWK4W9-P1-C3-T1          1     64     52   2048    1988
fcs1             U78A0.001.DNWK4W9-P1-C3-T2          1     64     52   2048    1988
fcs2             U5877.001.0084548-P1-C1-T1          1     64     61   2048    2033
fcs3             U5877.001.0084548-P1-C1-T2          1     64     61   2048    2033
fcs4             U5877.001.0084548-P1-C2-T1          0     64     64   2048    2048
As you can see, only the fcs4 adapter was discovered; fcs5 was missing, and the fabric value for fcs4 was 0. Both of these issues were the result of physical connectivity problems to the SAN: the cables were unplugged and/or had a loopback adapter plugged into the interface. The error report showed link errors on fcs4 but none for fcs5.
$ errlog
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
7BFEEA1F   0502104011 T H fcs4           LINK ERROR
Once the ports were physically connected to the SAN switches, I removed the entry for fcs4 from the ODM (as shown below) and then ran cfgmgr on the VIOS.
$ oem_setup_env
# rmdev -dRl fcs4
fcnet4 deleted
sfwcomm4 deleted
fscsi4 deleted
fcs4 deleted
# cfgmgr
# exit
$
Then both fcs4 and fcs5 were discovered and configured correctly.
$ lsnports
name             physloc                        fabric tports aports swwpns  awwpns
fcs0             U78A0.001.DNWK4W9-P1-C3-T1          1     64     52   2048    1988
fcs1             U78A0.001.DNWK4W9-P1-C3-T2          1     64     52   2048    1988
fcs2             U5877.001.0084548-P1-C1-T1          1     64     61   2048    2033
fcs3             U5877.001.0084548-P1-C1-T2          1     64     61   2048    2033
fcs4             U5877.001.0084548-P1-C2-T1          1     64     64   2048    2048
fcs5             U5877.001.0084548-P1-C2-T2          1     64     64   2048    2048
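
The fabric check used above is easy to automate when there are many ports to look at. The filter below is a minimal sketch that works on saved or piped lsnports output. The name ports_without_fabric is hypothetical, and it assumes the column order shown above, with the header on the first line and fabric in the third column.

```shell
# ports_without_fabric: hypothetical helper that reads lsnports output on
# stdin and prints the name of every port whose fabric column is not 1.
ports_without_fabric() {
    awk 'NR > 1 && NF >= 3 && $3 != 1 { print $1 }'
}

# Usage: ports_without_fabric < lsnports.out
```

Run against the earlier problem output, it would report fcs4. A port that is missing altogether, as fcs5 was, still has to be spotted by comparing against the expected adapter list.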
Our next step was to configure the new virtual FC host adapters on the VIOS and the new virtual FC adapters on the client LPAR. Below is a conceptual diagram of how the LPAR would connect to the tape drives.

As you can see, each VIO server has a single two port FC adapter which is dedicated for tape. These adapter ports appear as fcs4 and fcs5 on each VIO server (vio1 and vio2).
The AIX LPAR (tsm1) has additional Virtual FC adapters, dedicated for tape as well. These adapters appear as fcs4, fcs5, fcs6 and fcs7.
The plan was for fcs4 on tsm1 to map to fcs4 on vio1, fcs5 to fcs5 on vio1, fcs6 to fcs4 on vio2, and fcs7 to fcs5 on vio2.
The virtual adapter slot configuration was as follows:
LPAR: tsm1                      VIOS: vio1
U8233.E8B.06XXXXX-V4-C34-T1 >   U8233.E8B.06XXXXX-V1-C60      
U8233.E8B.06XXXXX-V4-C35-T1 >   U8233.E8B.06XXXXX-V1-C61      

LPAR: tsm1                      VIOS: vio2
U8233.E8B.06XXXXX-V4-C36-T1 >   U8233.E8B.06XXXXX-V2-C60      
U8233.E8B.06XXXXX-V4-C37-T1 >   U8233.E8B.06XXXXX-V2-C61
We created two new virtual FC host adapters on vio1 and two on vio2. This was done by updating the LPAR profiles (on the HMC) with the new adapters and then adding them with a DLPAR operation on each VIOS. After running the cfgdev command on each VIOS to bring in the new virtual FC host adapters, we needed to map them to the physical FC ports.
Using the vfcmap command on each of the VIOS, we mapped the physical ports to the virtual host adapters as follows:
1.     Map tsm1 vfchost30 adapter to physical FC adapter fcs4 on vio1.
$ vfcmap -vadapter vfchost30 -fcp fcs4
2.     Map tsm1 vfchost31 adapter to physical FC adapter fcs5 on vio1.
$ vfcmap -vadapter vfchost31 -fcp fcs5
3.     Map tsm1 vfchost30 adapter to physical FC adapter fcs4 on vio2.
$ vfcmap -vadapter vfchost30 -fcp fcs4
4.     Map tsm1 vfchost31 adapter to physical FC adapter fcs5 on vio2.
$ vfcmap -vadapter vfchost31 -fcp fcs5
Next we used DLPAR (using the following procedure) to update the client LPAR with four new virtual FC adapters. After running the cfgmgr command on the LPAR, we confirmed we had four new virtual FC adapters. We ensured that we saved the LPAR's current configuration, as outlined in the procedure.
# lsdev -Cc adapter | grep fcs
fcs0       Available 30-T1       Virtual Fibre Channel Client Adapter
fcs1       Available 31-T1       Virtual Fibre Channel Client Adapter
fcs2       Available 32-T1       Virtual Fibre Channel Client Adapter
fcs3       Available 33-T1       Virtual Fibre Channel Client Adapter
fcs4       Available 34-T1       Virtual Fibre Channel Client Adapter
fcs5       Available 35-T1       Virtual Fibre Channel Client Adapter
fcs6       Available 36-T1       Virtual Fibre Channel Client Adapter
fcs7       Available 37-T1       Virtual Fibre Channel Client Adapter
On both VIOS, we confirmed that the physical-to-virtual mapping on the FC adapters was correct using the lsmap -all -npiv command. We also checked that the client LPAR had successfully logged into the SAN by noting the Status:LOGGED_IN entry in the lsmap output for each adapter.
vio1:
Name          Physloc                            ClntID ClntName       ClntOS
------------- ---------------------------------- ------ -------------- -------
vfchost30     U8233.E8B.06XXXXX-V1-C60                4 tsm1           AIX

Status:LOGGED_IN
FC name:fcs4                    FC loc code:U5877.001.0084548-P1-C2-T1
Ports logged in:1
Flags:a<LOGGED_IN,STRIP_MERGE>
VFC client name:fcs4            VFC client DRC:U8233.E8B.06XXXXX-V4-C34-T1

Name          Physloc                            ClntID ClntName       ClntOS
------------- ---------------------------------- ------ -------------- -------
vfchost31     U8233.E8B.06XXXXX-V1-C61                4 tsm1           AIX

Status:LOGGED_IN
FC name:fcs5                    FC loc code:U5877.001.0084548-P1-C2-T2
Ports logged in:1
Flags:a<LOGGED_IN,STRIP_MERGE>
VFC client name:fcs5            VFC client DRC:U8233.E8B.06XXXXX-V4-C35-T1


vio2:
Name          Physloc                            ClntID ClntName       ClntOS
------------- ---------------------------------- ------ -------------- -------
vfchost30     U8233.E8B.06XXXXX-V2-C60                4 tsm1           AIX

Status:LOGGED_IN
FC name:fcs4                    FC loc code:U5877.001.0084548-P1-C5-T1
Ports logged in:1
Flags:a<LOGGED_IN,STRIP_MERGE>
VFC client name:fcs6            VFC client DRC:U8233.E8B.06XXXXX-V4-C36-T1

Name          Physloc                            ClntID ClntName       ClntOS
------------- ---------------------------------- ------ -------------- -------
vfchost31     U8233.E8B.06XXXXX-V2-C61                4 tsm1           AIX

Status:LOGGED_IN
FC name:fcs5                    FC loc code:U5877.001.0084548-P1-C5-T2
Ports logged in:1
Flags:a<LOGGED_IN,STRIP_MERGE>
VFC client name:fcs7            VFC client DRC:U8233.E8B.06XXXXX-V4-C37-T1
We were able to capture the WWPNs for the new adapters at this point. This information was required in order to zone the tape drives to the system.
# for i in 4 5 6 7
> do
> echo fcs$i
> lscfg -vpl fcs$i | grep Net
> echo
> done
fcs4
        Network Address.............C0507603A2920088

fcs5
        Network Address.............C0507603A292008A

fcs6
        Network Address.............C0507603A292008C

fcs7
        Network Address.............C0507603A292008E
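
Many SAN switch interfaces expect WWPNs in colon-separated notation rather than the run of hex digits that lscfg prints. The filter below is a minimal sketch for converting them. The name wwpn_colon is hypothetical, and it assumes the "Network Address" line format shown above.

```shell
# wwpn_colon: hypothetical helper that reads "lscfg -vpl fcsN" output on
# stdin and prints each Network Address as a lowercase, colon-separated WWPN.
wwpn_colon() {
    awk '/Network Address/ {
             w = tolower($0)
             sub(/.*\./, "", w)             # keep what follows the dot leader
             gsub(/[^0-9a-f]/, "", w)       # drop anything that is not hex
             gsub(/../, "&:", w)            # insert a colon after every byte
             sub(/:$/, "", w)               # drop the trailing colon
             print w
         }'
}

# Usage: lscfg -vpl fcs4 | wwpn_colon
```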
The IBM Atape device drivers were installed prior to zoning in the TS3310 tape drives.
# lslpp -l | grep -i atape
  Atape.driver              12.2.4.0  COMMITTED  IBM AIX Enhanced Tape and
Then, once the drives had been zoned to the new WWPNs, we ran cfgmgr on the AIX LPAR to configure the tape drives.
# lsdev -Cc tape
#
# cfgmgr
# lsdev -Cc tape
rmt0 Available 34-T1-01-PRI IBM 3580 Ultrium Tape Drive (FCP)
rmt1 Available 34-T1-01-PRI IBM 3580 Ultrium Tape Drive (FCP)
rmt2 Available 35-T1-01-ALT IBM 3580 Ultrium Tape Drive (FCP)
rmt3 Available 35-T1-01-ALT IBM 3580 Ultrium Tape Drive (FCP)
rmt4 Available 36-T1-01-PRI IBM 3580 Ultrium Tape Drive (FCP)
rmt5 Available 36-T1-01-PRI IBM 3580 Ultrium Tape Drive (FCP)
rmt6 Available 37-T1-01-ALT IBM 3580 Ultrium Tape Drive (FCP)
rmt7 Available 37-T1-01-ALT IBM 3580 Ultrium Tape Drive (FCP)
smc0 Available 34-T1-01-PRI IBM 3576 Library Medium Changer (FCP)
smc1 Available 35-T1-01-ALT IBM 3576 Library Medium Changer (FCP)
smc2 Available 37-T1-01-ALT IBM 3576 Library Medium Changer (FCP)
Our new tape drives were now available on our AIX system. These drives were to be used with Tivoli Storage Manager (TSM).

Easy Ways to Trace VSCSI Configuration with AIX

Virtual SCSI (VSCSI) disks are great for sharing physical disks without requiring physical adapters, but mapping the configuration can be painful. I recently came across a couple of tools for tracing and mapping out the VSCSI configuration that have been huge time savers.

Identify VSCSI Adapters Using kdb Command

There's a quick way of checking which Virtual I/O Server (VIOS) Virtual SCSI (VSCSI) adapter is being used for a VSCSI client adapter. You can do it all from the VIO client logical partition in one line:
# echo "cvai" | kdb | grep vscsi
read vscsi_scsi_ptrs OK, ptr = 0x59A03C0
vscsi0 0x000007 0x0000000000 0x0 vios1->vhost9
vscsi1 0x000007 0x0000000000 0x0 vios2->vhost9

As you can see, the command not only shows you the VIOS host names -- in this example, vios1 and vios2 -- it also tells you which vhost number you're using on each VIO server. This is especially helpful when the vhost number on VIOS1 is different from VIOS2's vhost.

If you've ever tried to walk through the procedure to trace virtual disks, you'll quickly see the benefit of this shortcut. To get the information the slow way, you have to run several commands on the VIO client, then some more on the Hardware Management Console (HMC), then log onto the VIOS and run some more commands.

Instead of all of that, the command:
# echo "cvai" | kdb
gives you all you want. Pipe it to “grep vscsi” and you've got the VIOS and the vhost adapter.

vhosts for Dual VIOS

If you're using dual VIOS, sometimes the vhost adapter number on VIOS1 doesn't correspond to the vhost adapter number on VIOS2. Not a problem:
# echo "cvai"|kdb|grep vscsi
read vscsi_scsi_ptrs OK, ptr = 0x4240398
vscsi0 0x000007 0x0000000000 0x0 vios1->vhost8
vscsi1 0x000007 0x0000000000 0x0 vios2->vhost5
Here you can see that vscsi0 (on the AIX client) corresponds to vios1's VSCSI server adapter known as vhost8. But vios2's adapter is called vhost5.
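
If you want the mapping in a form that is easy to feed into other scripts, the kdb output can be reduced to three columns. This is a minimal sketch; vscsi_map is a made-up name, and it assumes the output format shown above, where the last field of each adapter line reads vios->vhost.

```shell
# vscsi_map: hypothetical helper that reads `echo "cvai" | kdb` output on
# stdin and prints "client-adapter vios vhost" for each VSCSI adapter.
vscsi_map() {
    awk '$1 ~ /^vscsi[0-9]+$/ && $NF ~ /->/ {
             split($NF, part, "->")         # e.g. vios1->vhost8
             print $1, part[1], part[2]
         }'
}

# Usage: echo "cvai" | kdb | vscsi_map
```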

Armed with this information, you can log onto the VIOS restricted shell as the user padmin and list the devices using the lsmap command. You can use lsmap -vadapter to identify a single Virtual SCSI server adapter.

In this example, on vios1, you would list the disks on vhost8. Here's an extract from one IBM system I was working on:
$ lsmap -vadapter vhost8
SVSA            Physloc                            Client Partition ID
--------------- ---------------------------------- ------------------
vhost8          U9119.595.021A34E-V30-C36          0x00000024

VTD                   tst_boot_a
Status                Available
LUN                   0x8100000000000000
Backing device        hdisk53
Physloc               U5791.001.9920070-P2-C10-T1-W500507680130239F-L32000000000000
And then on vios2, you'd run the lsmap command for vhost5.

Mapping VSCSI Configuration via a Script

If you're looking to map out all the VSCSI disks presented to AIX clients from Storage Area Network (SAN) LUNs, have a read of this blog post written by Brian Smith on IBM developerWorks. It includes a Perl script that presents the output in an HTML page. You can put that page on a web server or view it locally on your PC or laptop. The script takes very little time to install, and you don't need to be a Perl expert to do it.

You can download the script from a link in the blog post I noted above, and it's always best to check the original blog post in case there are updates. There's a disclaimer that you should read, and because the script accesses vital parts of your infrastructure, the author warns that you use it at your own risk.

If, like me, you hesitate to create a new spreadsheet to document your configuration, you'll find utilities such as these are worth looking into. They can make a huge difference to the time and effort it takes you to trace your VSCSI configuration.

AIX System Recovery Tips and Techniques



An AIX recovery can be necessary as a result of a number of events: the loss of some system files, an unexplained system crash, a site environmental problem, or simply a request for a system recovery test. Either way, be prepared to hit the ground running and get the recovery done—or be ready to pack your bags and say goodbye.

AIX recovery is a basic skill; there are no excuses for not having it or not being prepared to use it as part of a disaster recovery (DR) plan. AIX system recovery isn’t rocket science, but you need to have your wits about you. This article will help you prepare to perform a recovery quickly and with confidence.

Prepare, Prepare, Prepare

Key requirements for a successful recovery are an up-to-date configuration listing of the target machine, a current system backup, and application backups or re-installation media. Whether you’re dealing with a full or partial restore, or a simulated or real disaster, the processes involved are the same. If you’re prepared with these prerequisites, your recovery will go smoothly; if not, you’re in for a difficult time.

The best way to ensure that you’re prepared is to routinely (at least weekly) create a system-bootable backup of your AIX servers to capture the sort of periodic changes that occur on a regular basis, such as PTFs and minor file changes. Also, track the status of applications and data being backed up daily, because these components are much more volatile than the OS itself. Typically, application backup is the responsibility of an operational team, but as an AIX systems admin, you should be informed that the data is being backed up successfully; after all, the applications do reside on your machine. You should also take a configuration report for each server. At minimum, this should include the output of the following commands:
  • lspv
  • lsvg -l <vgname>, lsvg -p <vgname>, lsvg <vgname> (for all volume groups)
  • lsslot -c slot
  • lscfg -vp
  • lsdev
  • lsattr -El sys0
A script can collect this information for you automatically and archive it off the machine by, for example, emailing its output file to you. With the information these commands provide, you’ll be on a good footing for a confident recovery.
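
Such a collection script could look roughly like the sketch below. It runs the commands from the list above and writes everything to one report file. The function name collect_config, the default report path, and the section header format are assumptions made for illustration, not an established tool.

```shell
# collect_config: hypothetical helper that writes the output of the
# configuration commands listed above to a single report file.
# Commands that are not installed are noted in the report and skipped.
collect_config() {
    report=${1:-/tmp/$(hostname)_config.txt}

    # Print a section header, then the command output or a short note.
    run() {
        echo "===== $* ====="
        if command -v "$1" >/dev/null 2>&1; then
            "$@" 2>&1 || echo "(exit status $?)"
        else
            echo "(command not found: $1)"
        fi
        echo
    }

    {
        run lspv
        # Repeat the three lsvg views for every volume group on the system.
        if command -v lsvg >/dev/null 2>&1; then
            for vg in $(lsvg); do
                run lsvg "$vg"
                run lsvg -l "$vg"
                run lsvg -p "$vg"
            done
        fi
        run lsslot -c slot
        run lscfg -vp
        run lsdev
        run lsattr -El sys0
    } > "$report"

    echo "$report"                          # tell the caller where it went
}

# Usage: collect_config /tmp/myhost_config.txt
```

Scheduled from cron and followed by a mail or scp of the report file, this keeps a recent copy of the configuration off the machine.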

Expect the Unexpected

Recovering a system to a new server at a remote site typically involves restoring the OS from a tape or DVD bootable backup. You can perform a boot restore via the network if you’ve taken remote network system saves with netboots (e.g., Storix or NIM), but this process is much slower than restoring from a tape or DVD, and only the largest “hot site” facilities have netboot host capabilities. The rest of us must make do with bare-bones recovery from the trenches.

The restore-from-bootable-media process is straightforward. First, because it’s best to start up without a network attachment, make sure all Ethernet and other network cables (other than storage) are unplugged. Next, insert the bootable media—tape or DVD—into a boot-capable drive and start the system. It’s best if the server you’re restoring to closely matches the specs of the failed server, but some differences can be accommodated. For example, the root volume group (rootvg) disk(s) might not be the same size, but as long as they’re larger, not smaller, the restore will complete. You should be prepared to alter some of the logical volume copies or re-size the logical volumes during the AIX recovery process if your restore product allows.

Confirm Settings in New Environment

Confirm with the network team or DR manager the settings you’ll be using for the following:
  • Host and gateway IP addresses (IPv4 and IPv6)
  • Subnet mask
  • DNS servers
  • DNS entries (forward and reverse for all addresses owned by the host)
  • Firewall, ipfilter, and/or tcpwrapper rules
  • Printed copies of all customized directories showing ownership and permission settings
  • Mail relay host (if your machine forwards mail)
  • xntpd server
You might be on a different LAN or VLAN for the duration of the disaster, so be sure to document the IP environment for the recovery site so that you’re not fighting network issues during recovery operations. And, of course, if your system interacts with other servers or services, ensure that those are accessible from the recovery site.

Review Operational Parameters

Remember that all Ethernet cables should be disconnected at startup. If the machine comes up with the network interface disabled, that’s good; if it comes up enabled, you forgot to take out the Ethernet cables, which can complicate startup troubleshooting. (You don’t want some automated application process kicking off uncontrolled sessions.) When the AIX recovery boot-up completes, it’s time to check all the operational parameters, and then check them again. Review the /etc/inittab file, comment out any non-required services you don’t want started, then refresh the inittab with telinit -q. Check out root’s crontab and review any non-required periodic jobs that might start. Once you’re satisfied that all application processes and undesired mail sending processes are commented out, stop or kill any processes that might have been kicked off before you reviewed /etc/inittab and crontabs. You might want to delete any outbound queued email files held in /var/spool/mqueue because the mail system might try to send those messages, which you might not want until you’re ready for full production operation.

Next, stop and re-start sendmail so you have a clean mail agent running. Review any firewall, ipfilter, and tcp wrapper rules you have; these will undoubtedly have to be amended now that you’re in recovery mode and in a geographically different environment. If your machine’s database applications use raw devices, be sure to check the ownerships of these devices in /dev, because these likely would have been changed on a system restore. Most databases use async I/O; check that the AIO servers are running using pstat -a. If your machine is on AIX 6.1 or later, the AIO servers are started automatically. On AIX 5.3, you’ll probably need to start them up.

On the Network

Bring the machine onto the network by connecting the Ethernet cables (you should have already configured the net interfaces). Verify that you can ping the network gateway (both IPv4 and IPv6 if you use it), your DNS server, and any necessary collaborative servers. Validate that your configured DNS correctly resolves local and global names, and give special attention to reverse name resolution for the IP addresses owned by the AIX system you’re recovering. One of the most common root causes of startup failure is missing DNS entries for the new network environment.

If static routes are required to reach any internal or WAN networks other than through the default gateway, use the netstat -rn command to verify that the routes exist, and add them if needed. Stop and start the sshd service if it’s present (from the console, or you’ll cut off your command-line session). Test a remote connection, such as Telnet or ssh, to ensure you have remote access capabilities. Next, begin the xntpd service to start getting the machine time synced, and verify it with the date command. You should now be able to send a test email to make sure sendmail forwarding works:
# echo "test mail" | mail admin@unixmantra.com
Now you're ready to configure your data volumes.

Bring In the Disks

Internal data volumes won’t typically be saved with the system bootable backup. You must restore them separately, so be sure your DR plan includes the instructions for this step. If you use a Storage Area Network (SAN), the SAN volumes might reside at a remote site. If so, be sure to get iSCSI or FC zoning correct—there’s no time to mess around—then run cfgmgr to bring them in. The same goes for locally attached disks. Be sure to create your disk raid configuration, if required. If you’re only going to be at DR for a few days, you can generally forgo RAID altogether—the complexity isn’t worth the risk of a disk failure during DR operations.

Create the volume groups and file systems based on the configuration reports you captured previously. It might be advantageous to create a script when you're gathering your reports of the host configurations; this lets you automatically create the file systems and saves you a lot of time, as I’ve learned from experience.

Restore the Application Data

As I noted earlier, your application data must be backed up separately from the bootable OS media, and thus must be restored separately. If you’re using a third-party product for your application backups, check that the client is running and talking back to the remote backup server. Next, restore the applications and the data (if you do incremental backups, ensure the operational team has the full list of tapes required). This is typically the operational team’s responsibility, so be sure to hurry them along. When all the data is recovered, review the permissions of the base directories or file systems, then review them again. Once you’re satisfied, prepare to start up the services in a controlled manner, one by one. If you have databases to restore, make sure you have the latest dumps before restoring them. Review the processes running and consult with the applications’ support teams so that there are no issues. If everything looks good, stop all applications.

A Reboot with Pause

Now’s the time to test that the machine can reboot. You might be thinking, “Why do this; let’s just get the machine recovered?” Well, if the machine goes down at a working DR site, it doesn’t reflect well on you or your team, so run this test now before you release the machine to the users. There are many factors that could stop an automatic boot, and because your initial boot was closely attended, you might not have encountered or noticed them. Simple things such as an incorrectly seated Ethernet cable or an IP address conflict can cause a reboot to stop and wait for manual intervention, so a trial reboot is essential.

First, clear the errorlog with errclear 0 so that you have a clean error logging sheet. Issue the bosboot and then the reboot commands. You should always issue a bosboot before any reboot or shut down because it’s a good habit to have. If for some reason the boot hangs, count your lucky stars that you discovered the problem now.

A Final Cross-Check, Please

Once the machine comes back up, check that all services are up. Get the support team to connect to the applications. Then relax and wait for the phone calls to come in on some other tinkering that needs to be done. This is inevitable, I’m afraid; however, the bulk of your work is now done.

How can I apply an efix or ifix?


You don't apply interim fixes (ifix) or emergency fixes (efix) with installp; instead you use the Efix Manager (emgr). IBM provides these fixes in a compressed epkg format (suffix: .epkg.Z). To apply one:
# emgr -e <EFIX>.epkg.Z

You get a list of all installed fixes with
# emgr -l
ID  STATE LABEL    INSTALL TIME      UPDATED BY ABSTRACT
=== ===== ======== ================= ========== ================
1    S    IZ79677  09/16/10 16:09:52            iFix for IZ79677

The label from the table above is needed whenever you want to remove an efix from the system:
# emgr -r -L <LABEL>

During a TL or SP upgrade, installp automatically removes an interim fix only if the service pack already contains it. If not, the upgrade fails and you have to remove the fix with the Efix Manager before upgrading.
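Before an upgrade it can help to pull just the labels out of the emgr -l listing, so each fix can be reviewed and removed if needed. The helper below is a small sketch that assumes the column layout shown in the sample listing above (ID, STATE, LABEL as the first three columns); efix_labels is an illustrative name, not part of emgr.

```shell
#!/bin/sh
# Sketch: print the LABEL column from `emgr -l` style output read on stdin.
# Assumes the layout shown above: ID, STATE, LABEL, ...
efix_labels() {
    # Data rows start with a numeric ID; header and ruler lines do not.
    awk '$1 ~ /^[0-9]+$/ { print $3 }'
}

# Possible use on an AIX host (prints the removal commands, does not run them):
#   emgr -l | efix_labels | while read -r label; do
#       echo "emgr -r -L $label"
#   done
```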

IBM AIX TCP Traffic Regulation

AIX TCP Traffic Regulation

Introduction

TCP network services and subsystems running on AIX automatically and transparently take advantage of this powerful DoS mitigation technology using simple administrative tuning. This new feature provides a simplified approach to increased network security by leveraging centralized management and firewall-based customization. 

In addition to providing effective service-level and system-level TCP DoS mitigation, IBM AIX TCP Traffic Regulation provides system-wide TCP connection resource diversity across source Internet protocol addresses initiating connections. 

Due to the mass adoption of Internet technology by governments, banks, universities, hospitals, and businesses around the world, our society has transformed to depend on the availability of network services for daily operation. It is imperative that our society's network infrastructure become resilient to active attacks on this availability. 

IBM AIX TCP Traffic Regulation provides a low-cost solution for network service attack resiliency. Availability is assured at the operating system level, allowing for transparent mitigation of active and passive network denial-of-service attacks. To activate protection, an administrator defines a firewall profile and customizes it to protect the specific TCP ports handling critical services. These centralized custom firewall profiles provide the security administrator greater power and flexibility in tailoring network security solutions.

Operating system architecture

IBM AIX TCP Traffic Regulation provides a new architectural layer within the AIX operating system. The goal of this new layer is two-fold: 
  • Provide a centralized management framework for defining custom TCP firewall profiles.
  • Actively manage incoming TCP socket connections and resource diversity in accordance with the current firewall policy.

Figure 1. IBM AIX TCP Traffic Regulation (TR) Architecture 



The firewall policy itself is governed by the profile definitions added, removed, or modified by a systems administrator.

 Each profile consists of three elements: 
  • TCP port or port-range requiring protection.
  • Maximum number of incoming socket connections allowed for this profile's TCP port(s).
  • Diversity value (a numerical quantity used to tune the overall diversity of shared TCP resources across the pool of maximum incoming socket connections).
TCP TR actively manages incoming socket connection requests at the kernel level, so the mitigation works transparently and requires no change to existing applications (see Figure 1). Thus, any network service software running on AIX and operating on the TCP ports covered by these firewall profiles is automatically protected from denial-of-service attacks.

Firewall profiles are defined using the tcptr command-line utility. This utility provides interactive administration and scripted manipulation of TCP TR policies. The entire TCP TR system can be turned on or off with the tcptr_enable network option. For example, to activate the subsystem, use the following no command:

no -p -o tcptr_enable=1

The tcptr command assigns a maximum limit of incoming TCP connections to a given network port or a range of ports. Administrative users control system resources related to TCP TR by adding or removing pools of connection resources to be shared collectively by incoming socket requests remotely accessing the AIX TCP layer.

Optionally, a diversity tunable can be specified allowing for increased resource sharing policy control.
Once in effect, these TCP TR profiles become the active policy governing connections. The operating system automatically ensures that resources are shared across multiple remote IP addresses that are attempting to connect through TCP to a specific port. 

Attack overview

Network services are generally agnostic to the underlying operating system resources that are available and allocated to them for TCP communication. Most TCP services simply attempt to accept new socket connection requests as they are received. If left uncapped, a continuous barrage of TCP connection requests and the subsequent consumption of TCP resources by these network services will eventually use up all the available system resources.

Figure 2. Topology for TCP resource exhaustion

A malicious attacker can make use of this behavior and launch a remote denial-of-service attack against a vulnerable network service over the Internet. The attack eventually makes the service unavailable by establishing thousands of socket connections with the vulnerable system, either by bringing down the system itself or by exhausting the sockets available to the vulnerable service. Once the system or service has been made unavailable, legitimate clients are blocked from using the network service hosted by the system under attack (see Figure 2).

TCP TR utility

The TCP TR utility configures or displays TCP TR policy information to control the maximum incoming socket connections for ports. The syntax of the utility follows:
tcptr -add <start port> <end port> <max connection> [divisor]
tcptr -delete <start port> <end port>
tcptr -show

where:
  • -add adds new TCP TR policies to the system. You should specify the maximum allowable connections for the current policy, the start port, and the end port with this flag. The start port and the end port can be the same port when a port range is not specified. Optionally, you can specify a divisor to allow a greater diversity of resource sharing on the pool of available TCP connections.
  • -delete deletes existing TCP TR policies that are defined for the system. This flag requires that you specify the start port and the end port of the policy (the end port can be the same as the start port if you are not specifying a port range).
  • -show displays all existing TCP TR policies defined on the system. You might use the -show flag to see the active policies before using the -delete flag.
The parameters are:

<max connection> Specifies the maximum incoming TCP connections for the given TR policy.
<start port> Specifies the beginning port for the current TR policy.
<end port> Specifies the end port for the current TR policy. If the port is a range, the value specified must be larger than the start port. If the TR policy is for a single port, the value specified must be equal to the value specified for the start port.
<divisor> Specifies a divisor to compare the number of available incoming TCP connections with the number of consumed incoming TCP connections for an IP, and corresponds to a division of the overall available connections by a power of two. The divisor is the power of two that is used in the division. This parameter is optional, and if it is not specified, the default value is one. In that case, half of the number of available connections are used.
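To make the divisor arithmetic concrete, here is a small sketch based only on the description above: the available connections are divided by two raised to the divisor. The function name tr_share is illustrative, and this is a reading of the documented behaviour rather than the kernel's actual accounting.

```shell
#!/bin/sh
# Sketch: tr_share AVAILABLE [DIVISOR] prints AVAILABLE / 2^DIVISOR
# (integer division). DIVISOR defaults to 1, as described above.
tr_share() {
    avail=$1
    divisor=${2:-1}
    share=$avail
    i=0
    while [ "$i" -lt "$divisor" ]; do
        share=$((share / 2))             # one halving per power of two
        i=$((i + 1))
    done
    echo "$share"
}
```

With 256 available connections and a divisor of 3, tr_share 256 3 prints 32; with the default divisor, tr_share 256 prints 128, which matches the "half of the available connections" case.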

Examples

To add a TCP Traffic Regulation Policy that covers only TCP port 23, and to set a maximum incoming connection pool of 256 with an available connections divisor of 3, enter the following command: 
# tcptr -add 23 23 256 3
To add a TCP Traffic Regulation Policy that covers the TCP port range from 5000 to 6000, and to set a maximum incoming connection pool of 5000 with an available connections divisor of 2, enter the following command: 
# tcptr -add 5000 6000 5000 2
To show TCP Traffic Regulation Policies set for the system, enter the following command:
# tcptr -show 
To delete the TCP Traffic Regulation Policy that covers the TCP port range from 5000 to 6000, enter the following command: 
# tcptr -delete 5000 6000

Summary

IBM AIX TCP Traffic Regulation provides a low-cost solution for network service attack resiliency. Availability is assured at the operating system level allowing for transparent mitigation of active and/or passive network denial-of-service attacks. Network services requiring security and availability should benefit from this powerful operating system technology.

Some tips and techniques for managing GPFS file systems


Mount a file system on some nodes
Set special mount options for a specific node
Specify mount order (GPFS Version 3.4 or later only)

MOUNT A FILE SYSTEM ON SOME NODES

If you do not want any of the file systems that are configured for automount to be mounted on a specific node, create the file /var/mmfs/etc/ignoreStartupMount on that node:
# echo "some text" > /var/mmfs/etc/ignoreStartupMount
If you want to mount only some of the file systems that are configured for automount, create a /var/mmfs/etc/ignoreStartupMount file for each file system you want to skip by adding “.devicename” to the end of the file name:
# echo "some text" > /var/mmfs/etc/ignoreStartupMount.<devicename>
For example, if you have file systems fs1 and fs2 configured for automount and you do not want to mount fs2 on this node, you would create a file called “/var/mmfs/etc/ignoreStartupMount.fs2”:
# echo "some text" > /var/mmfs/etc/ignoreStartupMount.fs2
(With a period between the keyword and the device name)

If you do not want the file system to mount at all, you can create an ignoreAnyMount file instead of ignoreStartupMount, using the same syntax. This keeps the file system from being mounted on this node when mmmount is used, in addition to not mounting it on startup.

SET SPECIAL MOUNT OPTIONS FOR A SPECIFIC NODE

You may want to set special file system mount options for a specific node. To set special mount options create a file called /var/mmfs/etc/localMountOptions.$fsname where $fsname is the name of the file system. You can create a file for each file system that requires special mount options.

The file /var/mmfs/etc/localMountOptions.$fsname contains a single line with the mount options you wish to use. The mount options available are the same as those accepted by your operating system's mount command with the -o flag.

For example, to mount the file system fs1 as read only, create the file /var/mmfs/etc/localMountOptions.fs1 containing the value ro on the first line.

If you want all the file systems on a particular node to mount with the same options, create a file without the file system name at the end, containing a single line of text with the desired mount options:
/var/mmfs/etc/localMountOptions
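The two cases (per file system and node-wide) can be sketched as one small helper. The function name set_local_mount_options and the GPFS_ETC override are assumptions made for this example, so it can be tried in a scratch directory; by default it writes to /var/mmfs/etc as described above.

```shell
#!/bin/sh
# Sketch: set_local_mount_options OPTIONS [FSNAME]
# Writes a single line of mount options to localMountOptions[.FSNAME].
# GPFS_ETC is a hypothetical override of the /var/mmfs/etc directory.
set_local_mount_options() {
    dir=${GPFS_ETC:-/var/mmfs/etc}
    opts=$1
    fs=$2
    if [ -n "$fs" ]; then
        file=$dir/localMountOptions.$fs  # options for one file system
    else
        file=$dir/localMountOptions      # options for every file system
    fi
    printf '%s\n' "$opts" > "$file" && echo "$file"
}
```

For instance, set_local_mount_options ro fs1 creates the read-only example from the text, and set_local_mount_options ro with no file system name creates the node-wide file.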

SPECIFY MOUNT ORDER

You may want to set the order of file system mounts when the daemon starts and successfully joins the cluster (if mmlsconfig autoload shows yes) or when an mmmount all command is executed. This is especially useful when you have nested GPFS file systems.

To do so, use the --mount-priority NumericPriority option on the mmcrfs, mmchfs or mmremotefs commands.

Configuring Persistent static route in Linux

Static routing

Static routing is a form of routing that occurs when a router uses a manually configured routing entry, rather than information from a dynamic routing protocol, to forward traffic. In many cases, static routes are manually configured by a network administrator by adding entries to a routing table, though this may not always be the case.

Unlike dynamic routing, static routes are fixed and do not change if the network is changed or reconfigured. Static routing and dynamic routing are not mutually exclusive. Both dynamic routing and static routing are usually used on a router to maximize routing efficiency and to provide backups in the event that dynamic routing information fails to be exchanged. Static routing can also be used in stub networks, or to provide a gateway of last resort.

Static routes are usually added with the "route add" command. The drawback of the 'route' command is that Linux forgets the static routes when it reboots. To make a route persistent across reboots, you have to add it to /etc/sysconfig/network-scripts/route-<eth>.

To add static route using "route add": 

# route add -net 192.168.100.0 netmask 255.255.255.0 gw 192.168.10.1 dev eth0 

Adding Persistent static route:

You need to edit /etc/sysconfig/network-scripts/route-eth0 file to define static routes for eth0 interface. 
GATEWAY0=192.168.10.1
NETMASK0=255.255.255.0
ADDRESS0=192.168.100.0

GATEWAY1=10.64.34.1
NETMASK1=255.255.255.240
ADDRESS1=10.64.34.10
Save and close the file. Restart networking: 
# service network restart 
Verify new routing table: 
# route -n
# netstat -nr
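If you maintain several routes, generating the numbered entries avoids typos such as a stray space after the equals sign. The helper below is only a sketch; route_entry is an illustrative name, and it just prints the three lines in the GATEWAYn/NETMASKn/ADDRESSn format shown above so they can be appended to the route-eth0 file.

```shell
#!/bin/sh
# Sketch: route_entry INDEX NETWORK NETMASK GATEWAY
# Prints one numbered route definition in the format used above.
route_entry() {
    n=$1
    address=$2
    netmask=$3
    gateway=$4
    printf 'GATEWAY%s=%s\nNETMASK%s=%s\nADDRESS%s=%s\n' \
        "$n" "$gateway" "$n" "$netmask" "$n" "$address"
}

# Possible use (appends the first route from the example above):
#   route_entry 0 192.168.100.0 255.255.255.0 192.168.10.1 \
#       >> /etc/sysconfig/network-scripts/route-eth0
```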

AIX LVM Cheat Sheet

This is a quick and dirty cheat sheet on LVM using AIX. I have highlighted many of the common attributes for each command; however, this is not an exhaustive list, so make sure you look up the command.

First a quick review on some of the terminology that AIX LVM uses
Examples
What it means
PHYSICAL VOLUME (PV)
Represents a hard disk (hdisk0).
PHYSICAL PARTITION (PP)
The smallest allocation unit in the LVM. All PPs within a VG are the same size, usually 4 or 8 MB.
VOLUME GROUP (VG)
A set of one or more PVs which form a single storage pool. You can define multiple VGs on each AIX system.
LOGICAL VOLUME (LV)
One or more PPs. A file system resides on top of an LV. Only one LV is mapped to a file system. An LV can't span VGs. Up to 255 LVs in a VG.
LOGICAL PARTITION (LP)
One or more PPs. An LP represents a mirrored copy of a PP. Up to two copies of a PP can be mirrored, resulting in an LP count of three (2 mirrors plus original).
Volume Group Descriptor Area(VGDA)
Information about all the LVs and PVs within a VG. The first 64K of a PV is reserved for this area - defined in <sys/bootrecord.h>.

The VGDA consists of
·  BOOTRECORD - first 512 bytes. Allows the Read Only Storage (ROS) to boot the system
·  BAD BLK DIRECTORY - found in <sys/bddir.h>
·  LVM RECORD - found in <lvmrec.h>
Volume Group Status Area(VGSA)
Information about which PPs are stale and which PVs are missing within a VG. The LVM and SCSI driver reserve somewhere between 7-10% of the available disk space for LVM maps, etc.
Physical Volume ID (PVID)
The PVID is an amalgamation of the machine’s serial number (from the system's EPROMs) and the date that the PVID is generated. This combination ensures an extremely low chance of two disks being created with the same PVID. Finally, when a system is booted, the disk configurator looks at the PVID sitting on each disk platter and compares it to an entry in the ODM. If the entry is found, the disk is given the hdiskX name that is associated with the ODM entry for the PVID.
Quorum
Quorum is a sort of “sanity” check that LVM uses to resolve possible data conflicts and prevent data corruption. Quorum is a method by which 51% or more of the quorum votes must be available to a volume group before LVM actions can continue. Quorum votes are issued to a disk in a volume group according to how the disk was created within the volume group. When a volume group consists of one disk, there are two VGDAs on that disk. Thus, this single-disk volume group has a quorum vote of 2. When another disk is added to the volume group with “extendvg”, this new disk gets one VGDA, but the original, first disk still retains its two VGDAs. When the volume group is extended to three disks, the third disk gets the spare VGDA sitting on the first disk, and then each disk has a quorum vote of 1. Every disk after the third is automatically given one VGDA, and thus one vote.
Volume Group ID (VGID)
Just as the PVID is a soft serial number for a disk, the VGID is the soft serial number for the volume group. It is this serial number, not the volume group’s ascii name, which all low level LVM commands reference. Additionally, it is the basis for the LVIDs created on that VGID.
Logical Volume Control Block (LVCB)
The logical volume control block (LVCB) consists of the first 512 bytes of a logical volume. This area holds important information such as the creation date of the logical volume, information about mirrored copies, and possible mount points in a journaled filesystem.
Logical Volume ID (LVID)
The LVID is the soft serial number used to represent the logical volume to the LVM libraries and low level commands. The LVID is created from the VGID of the volume group, a decimal point, and a number which represents the order which the logical volume was created on the volume group.
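The VGDA allocation described under Quorum above can be worked through numerically. The sketch below encodes only what that description says (two VGDAs on a single disk, two plus one for two disks, one per disk from three disks onward) and treats "51% or more" as "more than half". The function names are illustrative.

```shell
#!/bin/sh
# Sketch: vgda_count DISKS prints the total number of VGDAs (quorum votes)
# in a volume group with that many disks, per the description above.
vgda_count() {
    disks=$1
    case $disks in
        1) echo 2 ;;                     # single disk carries two VGDAs
        2) echo 3 ;;                     # first disk keeps two, second gets one
        *) echo "$disks" ;;              # three or more: one VGDA per disk
    esac
}

# quorum_needed DISKS prints the smallest vote count that is more than half.
quorum_needed() {
    total=$(vgda_count "$1")
    echo $(( total / 2 + 1 ))
}
```

For a two-disk volume group this gives 3 votes with 2 needed, which shows why the two disks are not equal: losing the first disk (two VGDAs) leaves only one vote and quorum is lost, while losing the second disk leaves two votes and quorum holds.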
Now for the cheat sheet
Directory and Files
Directories and Files
Tools
diagnostic
diag - used to hot swap the disk
cfgmgr - used to make sure the new disk is seen

# to add new disk from the scsi0 controller
cfgmgr -l scsi0
Create/Remove hard disk
cfgmgr -l scsi0
mkdev -c disk -l <pv>
rmdev -dl <pv>
Physical Volumes
display
lspv
lspv <pv> (detailed)
lspv -l <pv> (list logical volumes)
lspv -p <pv> (physical partition usage)
PVID
chdev -l <pv> -a pv=yes
chdev -l <pv> -a pv=clear

Note: PVID's are automatically added when the disk is placed into a vg
adding
chdev -l <pv> -a pv=yes (new)
chpv -v a <pv> (adds back the removed disk)
removing
chpv -v r <pv>
change physical attributes
chpv -a y <pv> (changes allocatable state to YES)
chpv -a n <pv> (changes allocatable state to NO)
moving
migratepv <old pv> <new pv>
Volume Groups
display
lsvg
lsvg <vg> (detailed)
lsvg -l <vg> (list all logical volumes in group)
lsvg -p <vg> (list all physical volumes in group)
lsvg -o (lists all varied on)
lsvg -M <vg> (lists associated disks and state)

## Details volume group info for the hard disk
lqueryvg -Atp <pv>
lqueryvg -p <disk> -v (Determine the VG ID# on disk)
lqueryvg -p <disk> -L (Show all the LV ID#/names in the VG on disk)
lqueryvg -p <disk> -P (Show all the PV ID# that reside in the VG on disk)
varyon
varyonvg <vg>
varyonvg -f <vg> (force)
varyonvg -s <vg> (maintenance mode: VG commands can be used, but LVs cannot be opened for I/O access)
varyoffvg <vg>
Note: the varyon command activates the volume group, which means it is available for use
ODM related
## Determine if the ODM and VGDA are correct (in sync)
getlvodm -u <vg>

## tries to resync VGDA, LV control blocks and ODM
synclvodm <vg>

## If the message "0516-366 lsvg: Volume group <vg> is locked" is ever seen
putlvodm -K `getlvodm -v <vg>`
creating
mkvg -y <vg> -s <PP size> <pv>

mkvg -y datavg -s 4 hdisk1

Note: the PP size is the physical partition size you want, in MB (for example 4 or 8)
extending
extendvg <vg> <pv>
reducing
reducevg -d <vg> <pv>

## removes the PVID from the VGDA when a disk has vanished without using the reducevg command
reducevg <vg> <PVID>
removing
varyoffvg <vg>
exportvg <vg>

Note: the exportvg command removes everything regarding the volume group from the ODM and /etc/filesystems
checking
## check to see if underlying disk has grown in size
chvg -g <vg>
Note: use this command if you are using SAN LUN's that have increased in size
change volume attributes
## auto vary on a volume at system start
chvg -a y <vg>

# Turns on/off quorum checking on a volume group
chvg -Q [y|n] <vg>
renaming
varyoffvg <old vg name>
lsvg -p <old vg name> (obtain disk names)
exportvg <old vg name>
importvg -y <new vg name> <pv>
varyonvg <new vg name>
mount -a
importing
importvg -y <vg> <pv>
importvg <pv> (the system generates a default vg name)
exporting
varyoffvg <vg>
exportvg <vg>

Note: if the volume group has an active paging space, this must be turned off before the volume group can be varied off
Logical Volumes
display
lslv <lv>
lslv -l <lv> (list all physical volumes in logical volume)
lslv -m <lv> (list partition mapping)

## Display lv control block information
getlvcb -AT <lv>
creating
mklv <vg> <# of PP's> <pv>
mklv -y <lv name> <vg> <# of PP's> <pv>
## Create a mirrored named logical volume
mklv -y <lv> -c <copies 2 or 3> <vg> <# of PP's> <pv>

## create a JFSlog logical Volume
mklv -y <lv name> -t jfslog <vg> <# of PP's> <pv>
extending
extendlv <lv> <additional # of PP's>
extendlv <lv> <size of volume in B||M|G>
reducing/resizing
see filesystem below
removing
rmlv <lv>
moving
migratepv -l <lv> <old pv> <new pv>
adding a mirror to a non-mirrored volume
mklvcopy -s n <lv> <copies 2 or 3> <pv>
removing a mirror copy from a mirrored volume
rmlvcopy <lv> <copies 1 or 2>
rmlvcopy <lv> <copies 1 or 2> <pv> (specified pv)

unmirrorvg <vg> <pv>
synchronize logical volume
syncvg -p <pv>
syncvg -v <vg>
syncvg -l <lv>
mirror any unmirrored volumes
mirrorvg <vg> <pv>
change volume attributes
## Enable the bad-block relocation policy

chlv -b [y|n] <lv>
renaming
chlv -n <new lv name> <old lv name>
Miscellaneous
## Initialises an LV for use as a JFSlog
logform </dev/lv>
Filesystems
display
lsfs
lsfs -q <fs> (detailed)

Note: use the '-q' to see if the logical volume size is bigger than the filesystem size
create
## create new filesystem, -A means to mount after restart
crfs -v jfs -d <lv> -m <mountpoint> -A yes

## Create logical volume, filesystem, mountpoint, add entry to /etc/filesystems at the specified size
crfs -v jfs2 -g <vg> -m <mountpoint> -a size=<size in 512-byte blocks|M|G> -A yes

Note: there are two types of filesystems, jfs and jfs2; jfs2 allows you to decrease the filesystem size, but you cannot reduce a jfs filesystem
remove
rmfs <fs>

Note: if all filesystems have been removed from a logical volume then the logical volume is removed as well.
resize
chfs -a size=<new size> <fs>

chfs -a size=1G /var (specific size, can be used to increase and decrease)
chfs -a size=+1G /var (increase by 1GB)
chfs -a size=-1G /var (reduce by 1GB)
Note: this will automatically increase or decrease the underlying logical volume as well
freeze/unfreeze
chfs -a freeze=<time in seconds> <fs>
chfs -a freeze=off <fs>
split mirrored copy
chfs -a splitcopy=<split copy mountpoint> -a copy=2 <fs>

chfs -a splitcopy=/backup -a copy=2 /testfs
change
## Change the mountpoint
chfs -m <new mountpoint> <fs>

## Do not mount after a restart
chfs -A no <fs>

## Mount read-only
chfs -p ro <fs>
mount
mount
mount [<fs>|<lv>]
mount -a
mount all
defrag
defragfs -q <fs> (report defrag status)
defragfs -r <fs> (runs in report only mode - no action)
defragfs <fs> (actually defrags the filesystem)
checking and repairing
fsck [-y|-n] <fs> (check a filesystem)
fsck -p <fs> (restores primary superblock from backup copy if corrupt)
Miscellaneous
Complete VG, LV and FS with mirroring example
## Create the volume group
mkvg -s 256 -y datavg hdisk2
## Create the jfs2 log logical volume and initialize it for the volume group
mklv -t jfs2log -y dataloglv datavg 1
logform /dev/dataloglv

## Create the logical volume
mklv -t jfs2 -y data01lv datavg 8

## Create the filesystems that will use the logical volume
crfs -v jfs2 -d data01lv -m /data01 -A yes

## Add an additional hard disk to the volume group
extendvg datavg hdisk3

## Now mirror both the volume group log logical volume and the logical volume
mklvcopy dataloglv 2
mklvcopy data01lv 2

## Make sure everything is sync'ed both the log and the logical volume
syncvg -v datavg

## Make sure everything is OK
lsvg -l datavg

## a quick way to perform the above in two steps
mklv -c 2 -t jfs2 -y data02lv datavg 8
crfs -v jfs2 -d data02lv -m /data02 -A yes

## mount everything and check
mount -a
Replace failed mirror drive
## break the mirror (two ways to do this)
rmlvcopy <lv name> 1 <broken disk>
unmirrorvg <vg> <broken pv>
## remove the disk from the vg
reducevg <vgname> <broken pv>
## remove the hdisk from ODM
rmdev -dl <broken pv>

## physically replace the disk
diag -> function select -> task selection -> hot plug task -> scsi and scsi raid hot plug manager -> replace/remove a device attached to an scsi hot swap enclosure device -> select disk and follow instructions

## configure new disk and check the new number (hopefully the same)
cfgmgr -v
lsdev -Cc disk
## add back to volume group
extendvg <vg> <pv>
## create mirror (two ways to do this)
mklvcopy <lv> 2 <pv>
mirrorvg <vg>

## sync mirror
syncvg -l <lv>

## If this is the rootvg there are additional steps to take
bosboot -ad /dev/<pv>
bootlist -m normal <pv> <pv>
bootlist -m normal -o
Accidentally removed a mirrored disk or SAN LUN disappeared off the network
## This procedure brings back a mirror disk that you have accidentally pulled, or a SAN LUN that disappeared off the network,
## and whose state is classed as "missing"

## see that the disk is in a missing state (see PV state column), also see stale volumes
lsvg -p <vg>
lsvg -M <vg>

## To make the disk active again we use the varyonvg command
varyonvg <vg>

## see that the disk is in an active state (see PV state column)
lsvg -p <vg>

## Now re-sync the volumes in that volume group
syncvg -v <vg>

## Make sure that no volumes are stale
lsvg -M <vg>

## Determine if the ODM and VGDA are correct (in sync)
getlvodm -u <vg>
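The "check the PV state column" steps above can be scripted. This is a sketch that assumes lsvg -p prints the volume group name on the first line, a header on the second, and then one row per disk with the state in the second column; check that against the output on your own system before relying on it. inactive_pvs is an illustrative name.

```shell
#!/bin/sh
# Sketch: read `lsvg -p <vg>` style output on stdin and print the name of
# every PV whose state is not "active". Assumes two header lines followed
# by rows of: PV_NAME  PV_STATE  TOTAL_PPs  FREE_PPs  FREE_DISTRIBUTION
inactive_pvs() {
    awk 'NR > 2 && NF >= 2 && $2 != "active" { print $1 }'
}

# Possible use on an AIX host:
#   lsvg -p datavg | inactive_pvs
```

An empty result means every disk in the group is reported as active; any names printed are the disks to chase before running syncvg.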