The HMC (Hardware Management Console) is a technology created by IBM to provide a standard interface for configuring and operating logical partitions (also known as LPARs or virtualized systems) and for managing SMP (symmetric multiprocessing) systems such as IBM System i/z/p and IBM Power Systems.
Basically, the HMC is a customized Linux system combined with Java and other graphical components. According to Wikipedia: "The HMC is a Linux kernel using Busybox to provide the base utilities and X Window using the Fluxbox window manager to provide graphical logins. The HMC also utilizes Java applications to provide additional functionality."
AIX admins like me rely heavily on the HMC in day-to-day operations. The HMC provides features that
enable a system administrator to manage the configuration and operation of
partitions in a system, as well as to monitor the system for hardware
problems. It consists of a 32-bit Intel-based desktop PC with a DVD-RAM
drive.
The diagram below shows how the HMC connects to different managed systems.
What does the HMC do?
- Creates and maintains a multiple-partitioned environment
- Displays a virtual operating system session terminal for each partition
- Displays virtual operator panel values for each partition
- Detects, reports, and stores changes in hardware conditions
- Powers managed systems on and off
- Powers logical partitions on and off
- Boots systems in maintenance mode and performs dump reboots
- Acts as a service focal point for service representatives to determine an appropriate service strategy and enable the Service Agent Call-Home capability
- Activates additional resources on demand (known as CoD, Capacity on Demand)
- Performs DLPAR operations
- Performs firmware upgrades on managed systems
- Provides remote management of managed systems
HMC Facts:
- A single HMC can manage multiple physical frames (managed systems).
- You cannot open more than one virtual console for a given LPAR at a time.
- If the HMC goes down, nothing happens to the managed systems and their LPARs. They keep running as usual; you just cannot manage them through the HMC if something goes wrong in the meantime.
- There is no direct root login. By default you log in as hscroot (you need to engage IBM support to get the root password).
HMC Operating Modes:
You can operate the HMC in two modes:
- Command Line Interface (CLI)
- Graphical Interface
With the CLI you can retrieve information very quickly using commands and scripts.
The figure below shows the graphical interface.
HMC Version Evolution:
- HMC V7, for POWER5, POWER6 and POWER7 models
  - HMC V7R7.2.0 (Initial support for Power 710, Power 720, Power 730, Power 740 and Power 795 models)
  - HMC V7R7.1.0 (Initial support for POWER7)
  - HMC V7R3.5.0 (released Oct. 30, 2009)
  - HMC V7R3.4.0
  - HMC V7R3.3.0
  - HMC V7R3.2.0
  - HMC V7R3.1.0 (Initial support for POWER6 models)
- HMC V6
  - HMC V6R1.3
  - HMC V6R1.2
- 5.2.1
- 5.1.0
- 4.5.0
- 4.4.0
- 4.3.1
- 4.2.1
- 4.2.0, for POWER5 models
- 4.1.x
- 3.x, for POWER4 models
RMC (Resource Monitoring and Control) & Association with HMC:
RMC is a distributed framework and architecture that allows the HMC to communicate with a managed logical partition. For example, "IBM.DRM" is a daemon that must be running on the LPAR in order to perform DLPAR operations on that LPAR through the HMC.
The daemons in the LPARs and on the HMC communicate with each other over the external network, not through the service processor. This means both sides must have access to the same external network for RMC-related commands to work.
For RMC to work, port 657 (UDP and TCP) must be open in both directions between the HMC public interface and the LPAR.
The RMC daemons are part of the Reliable, Scalable Cluster Technology
(RSCT) and are controlled by the System Resource Controller (SRC). These
daemons run in all LPARs and communicate with equivalent RMC daemons
running on the HMC. The daemons start automatically when the operating
system starts and synchronize with the HMC RMC daemons.
Note: Apart from rebooting, there is no way to stop and start the RMC daemons on the HMC!
Things to check at the HMC:
- checking the status of the managed nodes: /usr/sbin/rsct/bin/rmcdomainstatus -s ctrmc (you must be root on the HMC)
- checking connection between HMC and LPAR:
hscroot@umhmc1:~> lspartition -dlpar
<#0> Partition:<2, aix10.domain.com, 10.10.50.18>
Active:<1>, OS:, DCaps:<0x4f9f>, CmdCaps:<0x1b, 0x1b>, PinnedMem:<1452>
<#1> Partition:<4, aix20.domain.com, 10.10.50.71>
Active:<0>, OS:, DCaps:<0x0>, CmdCaps:<0x1b, 0x1b>, PinnedMem:<656>
For DLPAR to function correctly:
- the partition entry must show the correct IP address of the LPAR,
- the active value (Active:...) must be higher than zero,
- the DCaps value (DCaps:...) must be higher than 0x0.
(The first entry shows a DLPAR-capable LPAR, the second entry is a non-working LPAR.)
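The checks above can be sketched as a small filter. This is only an illustration, not an HMC tool: check_dlpar is a hypothetical helper name, and it assumes the two-line-per-partition layout shown in the sample output. Since it is not assumed here that awk is available in the HMC restricted shell, the sketch is meant to run on a workstation that receives the output over ssh.

```shell
# check_dlpar (hypothetical helper): reads `lspartition -dlpar` style text on
# stdin and prints OK/FAIL per partition based on the Active and DCaps values.
check_dlpar() {
  awk '
    /Partition:</ {
      # Keep the text between "Partition:<" and the next ">" for reporting.
      part = $0
      sub(/^.*Partition:</, "", part)
      sub(/>.*$/, "", part)
      next
    }
    /Active:</ {
      active = $0
      sub(/^.*Active:</, "", active)
      sub(/>.*$/, "", active)
      if ($0 ~ /DCaps:</) {
        dcaps = $0
        sub(/^.*DCaps:</, "", dcaps)
        sub(/>.*$/, "", dcaps)
      } else {
        dcaps = "0x0"   # no DCaps field at all: treat as not capable
      }
      # Active must be above zero and DCaps must not be all zeros.
      if (active + 0 > 0 && dcaps !~ /^0x0+$/)
        print "OK   " part
      else
        print "FAIL " part
    }
  '
}
```

A possible invocation, assuming ssh access as hscroot: ssh hscroot@umhmc1 lspartition -dlpar | check_dlpar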
----------------------------------------
Things to check at the LPAR:
- checking the status of the managed nodes: /usr/sbin/rsct/bin/rmcdomainstatus -s ctrmc
- Checking RMC status:
# lssrc -a | grep rsct
ctrmc            rsct       8847376    active        <== it is an RMC subsystem
IBM.DRM          rsct_rm    6684802    active        <== it is for executing the DLPAR command on the partition
IBM.DMSRM        rsct_rm    7929940    active        <== it is for tracking statuses of partitions
IBM.ServiceRM    rsct_rm    10223780   active
IBM.CSMAgentRM   rsct_rm    4915254    active        <== it is for handshaking between the partition and HMC
ctcas            rsct                  inoperative   <== it is for security verification
IBM.ERRM         rsct_rm               inoperative
IBM.AuditRM      rsct_rm               inoperative
IBM.LPRM         rsct_rm               inoperative
IBM.HostRM       rsct_rm               inoperative   <== it is for obtaining OS information
Some subsystems will show as active and some as inoperative. (The key one for DLPAR is IBM.DRM.)
- Stopping and starting RMC without erasing configuration:
# /usr/sbin/rsct/bin/rmcctrl -z <== it stops the daemons
# /usr/sbin/rsct/bin/rmcctrl -A <== adds entry to /etc/inittab and it starts the daemons
# /usr/sbin/rsct/bin/rmcctrl -p <== enables the daemons for remote client connections
(This is the correct method to stop and start RMC without erasing the configuration.)
Do not use stopsrc and startsrc for these daemons; use the rmcctrl commands instead!
- recfgct: deletes the RMC database, does a discovery, and recreates the RMC configuration
# /usr/sbin/rsct/install/bin/recfgct (Wait several minutes)
# lssrc -a | grep rsct
(If you see IBM.DRM active, then you have probably resolved the issue)
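The IBM.DRM check can also be scripted on the LPAR. The following is a minimal sketch, assuming the usual lssrc column layout (subsystem name first, status last); drm_active is a hypothetical helper name, not part of RSCT.

```shell
# drm_active (hypothetical helper): reads `lssrc -a` style lines on stdin and
# returns success only if the IBM.DRM subsystem is listed with status "active".
drm_active() {
  awk '
    $1 == "IBM.DRM" && $NF == "active" { found = 1 }
    END { exit !found }   # exit status 0 when found, 1 otherwise
  '
}
```

For example: lssrc -a | drm_active && echo "IBM.DRM is active" || echo "IBM.DRM is not active"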
Getting Information About LPARs and the HMC from Either Side:
Note: For these commands to work, the RSCT daemons must be running on the servers. In other words, make sure RMC communication between the HMC and the LPAR is working.
1) Getting HMC IP information from LPAR:
You can find out which HMC (or HMCs) the managed system (frame) of your LPAR is connected to by using the "lsrsrc" command.
Command: finding the HMC IP address
$ lsrsrc IBM.ManagementServer
Resource Persistent Attributes for IBM.ManagementServer
resource 1:
        Name             = "192.168.1.2"
        Hostname         = "192.168.1.2"
        ManagerType      = "HMC"
        LocalHostname    = "ldap1-en1"
        ClusterTM        = "9078-160"
        ClusterSNum      = ""
        ActivePeerDomain = ""
        NodeNameList     = {"lpar1"}
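If you only need the address for use in a script, the output above can be filtered. This is a sketch under two assumptions: the attribute layout matches the sample above, and Hostname appears before ManagerType within each resource. hmc_addresses is a hypothetical helper name.

```shell
# hmc_addresses (hypothetical helper): reads `lsrsrc IBM.ManagementServer`
# output on stdin and prints the Hostname of each resource whose
# ManagerType is "HMC".
hmc_addresses() {
  awk -F'"' '
    # With a double quote as field separator, $2 is the quoted value.
    /^[ \t]*Hostname[ \t]*=/    { host = $2 }
    /^[ \t]*ManagerType[ \t]*=/ {
      if ($2 == "HMC" && host != "") print host
      host = ""
    }
  '
}
```

For example: lsrsrc IBM.ManagementServer | hmc_addresses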
2) Getting Managed System & LPAR information:
The script below lists each frame in the Operating state and its LPARs, together with the CPU and memory values from each LPAR's default profile.
Script to get Frame & LPAR information
# Loop over every managed system (frame) that is in the Operating state
for system in `lssyscfg -r sys -F "name,state" | sort | grep ",Operating" | sed 's/,Operating//'`; do
  echo $system
  echo " LPAR            CPU    VCPU   MEM    OS"
  # Loop over every LPAR defined on this frame
  for lpar in `lssyscfg -m $system -r lpar -F "name" | sort`; do
    # The desired values are read from the LPAR's default profile
    default_prof=`lssyscfg -r lpar -m $system --filter "lpar_names=$lpar" -F default_profile`
    procs=`lssyscfg -r prof -m $system --filter "profile_names=$default_prof,lpar_names=$lpar" -F desired_proc_units`
    vcpu=`lssyscfg -r prof -m $system --filter "profile_names=$default_prof,lpar_names=$lpar" -F desired_procs`
    mem=`lssyscfg -r prof -m $system --filter "profile_names=$default_prof,lpar_names=$lpar" -F desired_mem`
    os=`lssyscfg -r lpar -m $system --filter "lpar_names=$lpar" -F os_version`
    printf " %-15s %-6s %-6s %-6s %-30s\n" $lpar $procs $vcpu $mem "$os"
  done
done
People generally assume there is no way to run scripts on the HMC, but it is possible: use the "rnvi" command to create the script file, i.e. "rnvi -f hmcscriptfile".
To run the script, use the "source" command, for example "source hmcscriptfile". This runs the script in your current shell.
Here "hmcscriptfile" is the script name. Running it as shown below produces output like the following.
How to run the script, and its output
hscroot@umhmc1:~> source hmcscriptfile
p570_frame5_ms
 LPAR            CPU    VCPU   MEM    OS
 umlpar1         0.1    3      512    AIX 6.1 6100-07-05-1228
 umlpar2         0.1    3      512    AIX 6.1 6100-07-05-1228
 umlpar3         0.1    3      512    Unknown
 linux1          0.1    3      512    Unknown
 vio1            0.2    2      512    VIOS 2.2.1.4
 vio2            0.1    2      352    Unknown
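A variation on the same idea is to wrap the commands in a shell function, source the file once, and then call the function by name whenever you need it. The sketch below assumes the HMC shell accepts function definitions in a sourced file; lpar_states is a hypothetical function name, and it prints only the name and state of each LPAR rather than the full report above.

```shell
# lpar_states (hypothetical function): for every frame in the Operating state,
# print the frame name followed by an indented "name,state" line per LPAR.
lpar_states() {
  for system in `lssyscfg -r sys -F "name,state" | grep ",Operating" | sed 's/,Operating//'`; do
    echo "$system"
    # Indent the LPAR lines by two spaces so they group under the frame name
    lssyscfg -r lpar -m "$system" -F "name,state" | sort | sed 's/^/  /'
  done
}
```

After "source hmcscriptfile", typing lpar_states at the prompt runs the whole block.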