Dynamic Reconfiguration (DR) has seen a variety of changes over the past years.
Below is a quick guide that can be used to help set up and use DR in the Sun Enterprise 10000 (E10K) and Sun Fire 12K/15K/E20K/E25K server environments.
Solution
Steps to Follow
Sun Enterprise 10000 (E10K)
The method used to enable DR differs according to the Solaris[TM] Operating System (OS) release. This applies to all versions of DR.
For Solaris 2.5.1 OS, DR is enabled by setting the OpenBoot PROM (OBP) parameter 'dr-max-mem' to any non-zero number via 'setenv' or 'eeprom'. See the following examples.
ok setenv dr-max-mem 1
or
# eeprom dr-max-mem=1
NOTE: If 'dr-max-mem' is set to 0, DR attach/detach is DISABLED. If 'dr-max-mem' is set to anything other than 0 (non-zero), DR attach/detach is ENABLED. This value denotes the maximum memory configuration permitted for the domain after all DR attaches have been completed. For example, a value of 16384 would allow for a maximum of 16GB of memory. However, be careful not to set this variable too high, as it unnecessarily enlarges the kernel and wastes memory that might be better used elsewhere.
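The rule in the NOTE above can be sketched as a small check on a line of 'eeprom dr-max-mem' output. The function name dr_state is hypothetical, and the MB unit is inferred from the 16384 = 16GB example. The sketch only classifies a value and does not change any OBP setting.

```shell
#!/bin/sh
# dr_state: hypothetical helper; classifies a dr-max-mem value using the
# Solaris 2.5.1 rule (0 = disabled, non-zero = enabled).
# Accepts a bare number or the "dr-max-mem=N" form printed by eeprom.
dr_state() {
    value=${1#dr-max-mem=}
    case "$value" in
        ''|*[!0-9]*)
            echo "unknown"
            return 1
            ;;
    esac
    if [ "$value" -eq 0 ]; then
        echo "DR attach/detach disabled"
    else
        # Unit assumed to be MB, based on 16384 => 16GB in the note above.
        echo "DR attach/detach enabled, memory ceiling ${value} MB"
    fi
}

# On a live domain this might be called as: dr_state "$(eeprom dr-max-mem)"
dr_state "dr-max-mem=16384"
```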
For Solaris 2.6 OS (similar to 2.5.1), DR is enabled by setting the OBP parameter 'dr-max-mem' to any non-zero number via 'setenv' or 'eeprom'. See the following examples.
ok setenv dr-max-mem 1
or
# eeprom dr-max-mem=1
NOTE: If 'dr-max-mem' is set to 0, DR attach/detach is DISABLED. If 'dr-max-mem' is set to anything other than 0 (non-zero), DR attach/detach is ENABLED. If the value is specifically set to 2, the number of DR kernel pages at boot time is made five times larger than the normal value. Be aware that in environments with large configurations (for example, terabytes of storage), it is possible to exhaust kernel resources before the system becomes fully active. Review Bug ID 4218687 for details.
For Solaris 7 through 10 OS, DR is enabled with a 'kernel_cage_enable' entry in the /etc/system file. When this variable is set to 1, the function is enabled; when it is set to 0, the function is disabled. The 'dr-max-mem' OBP parameter is obsolete in these releases. The following is an example entry in the /etc/system file to enable DR:
* DR enabled
set kernel_cage_enable=1
* DR entry complete
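Because lines that begin with '*' are comments in /etc/system, the 'set' directive has to sit on a line of its own to take effect. The following is a minimal sketch of a check for that. The function name cage_enabled is hypothetical, and treating the last uncommented assignment as the effective one is an assumption of this sketch.

```shell
#!/bin/sh
# cage_enabled: hypothetical helper; succeeds if the effective
# kernel_cage_enable setting in an /etc/system-style file is 1.
cage_enabled() {
    file=${1:-/etc/system}
    # Only lines that start with "set" are considered, so '*' comment lines
    # are skipped. Assumption: the last assignment in the file wins.
    value=$(sed -n 's/^[[:space:]]*set[[:space:]]\{1,\}kernel_cage_enable[[:space:]]*=[[:space:]]*\([0-9]\{1,\}\).*$/\1/p' "$file" | tail -n 1)
    [ "$value" = "1" ]
}

# Possible use on a domain:
#   if cage_enabled /etc/system; then echo "DR enabled"; else echo "DR disabled"; fi
```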
There are three versions of DR that can be used on an E10K platform:
Legacy DR (DR) - This was the initial release of DR, seen in SSP 3.1 through SSP 3.3. Each DR operation consisted of a three-step manual process.
1. To add a board (ex. SB6):
ssp:domain% dr
dr> init_attach 6
dr> drshow 6 obp (to verify board inventory)
dr> complete_attach 6
dr> exit
2. To remove a board (ex. SB6):
NOTE: Stop edd so that no Recordstops can occur during a DR detach operation.
If a Recordstop occurs during a DR operation, the domain will have to be STOPPED!
Therefore, stop 'edd' with the 'edd_cmd' command before the detach, and restart it after DR is finished:
ssp% edd_cmd -x stop
ssp:domain% dr
dr> drain 6
dr> drshow 6 IO (determine if there is active I/O on board being detached)
dr> complete_detach 6
dr> reconfig
dr> exit
Restart edd again:
ssp% edd_cmd -x start
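The stop/detach/restart sequence above can be wrapped so that edd is restarted even when the detach step fails. This is only a sketch: the wrapper name with_edd_stopped and the RUN dry-run variable are hypothetical. The edd_cmd invocations are the ones shown above.

```shell
#!/bin/sh
# with_edd_stopped: hypothetical wrapper. Stops edd, runs the given command,
# then restarts edd even if that command fails.
# RUN is an assumption of this sketch: set RUN=echo to print the edd_cmd
# calls instead of executing them (dry run).
with_edd_stopped() {
    ${RUN:-} edd_cmd -x stop || return 1
    status=0
    "$@" || status=$?
    ${RUN:-} edd_cmd -x start
    return "$status"
}

# Dry-run demonstration: prints the edd_cmd calls around a placeholder step.
RUN=echo
with_edd_stopped echo "(interactive dr detach session runs here)"
```

On the SSP, the wrapped command would presumably be the interactive 'dr' shell itself, with the drain and complete_detach steps entered by hand.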
Automated DR (ADR) - Introduced in SSP 3.3, ADR had a new command structure that allowed DR to be used in scripts, so that each DR operation is completed by one command instead of the three required in the previous release.
New Generation DR (ngdr) - Introduced with the Sun Fire 12K/15K and backported to the E10K in SSP 3.4 and SSP 3.5 running Solaris 8 and Solaris 9 OS. This new command structure allows for remote DR capabilities as well.
These automated methods may be used for DR operations:
1. addboard -d
2. moveboard -d
3. deleteboard -d
Adding a board (ex. SB6):
ssp% addboard -b 6 -d domain_name -r 2 -t 600
where -b is the system board number, -d is the domain name, -r is the number of retries, and -t is the timeout
Removing a board (ex. SB6):
ssp% deleteboard -b 6 -r 2 -t 900
Moving a board (ex. SB6):
ssp% moveboard -b 6 -d domain_name -r 2 -t 900
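When these commands are driven from a script, it can help to validate the arguments before anything is executed. The sketch below only prints the addboard and deleteboard command lines from the examples above. The helper name adr_cmd is hypothetical, and the 0-15 board range assumes the 16 system board slots of an E10K.

```shell
#!/bin/sh
# adr_cmd: hypothetical helper that prints (does not run) an ADR command line.
# Retry and timeout values are copied from the examples above.
adr_cmd() {
    op=${1:-}
    board=${2:-}
    domain=${3:-}
    case "$board" in
        ''|*[!0-9]*) echo "invalid board: $board" >&2; return 1 ;;
    esac
    # Assumption: system boards are numbered 0-15 on an E10K.
    [ "$board" -le 15 ] || { echo "invalid board: $board" >&2; return 1; }
    case "$op" in
        add)
            [ -n "$domain" ] || { echo "add requires a domain name" >&2; return 1; }
            echo "addboard -b $board -d $domain -r 2 -t 600"
            ;;
        delete)
            echo "deleteboard -b $board -r 2 -t 900"
            ;;
        *)
            echo "unknown operation: $op" >&2
            return 1
            ;;
    esac
}

adr_cmd add 6 domain_name
adr_cmd delete 6
```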
If any real-time (RT) processes are running on a domain, they will prevent a DR operation from completing. If DR complains about them, these processes must be stopped before DR can work properly. On the domain, use the command:
domain# ps -eo pid,class,comm | grep RT
to identify which PIDs (process IDs) to kill if necessary. Be aware of which RT processes are running and exactly what they do, and be sure you understand any adverse effects that may arise if they are killed manually.
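A plain grep for RT can also match a command name that happens to contain those letters. As a minimal sketch, the filter below matches on the scheduling class column only, assuming the input has the PID, CLS, COMMAND layout produced by 'ps -eo pid,class,comm'. The function name rt_pids is hypothetical, and the sample PIDs and command names are made up.

```shell
#!/bin/sh
# rt_pids: hypothetical filter. Reads "PID CLS COMMAND" lines and prints the
# PID and command of every process whose scheduling class is exactly RT.
rt_pids() {
    awk '$2 == "RT" { print $1, $3 }'
}

# On a domain this might be used as: ps -eo pid,class,comm | rt_pids
# Illustrative sample input only:
printf '%s\n' \
    '  PID  CLS COMMAND' \
    '    1   TS /etc/init' \
    '  412   RT /usr/local/bin/rtdaemon' \
    '  630   TS /usr/sbin/inetd' | rt_pids
```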
________________________________________
Sun Fire 12K/15K/E20K/E25K Servers
________________________________________
Syntax for SMS (System Management Services) 1.x DR commands from the SC (System Controller):
1. addboard -d
2. moveboard -d
3. deleteboard [-q] [-f]
Examples:
sms> addboard -d A SB10
sms> moveboard -d B SB7
sms> deleteboard SB0
If running 'rcfgadm' (remote configadm) commands from the SC, the usage may be as follows:
sc0:sms-user:> rcfgadm -d
function - assign | unassign, configure | unconfigure, or connect | disconnect
APIDs - can be either logical or physical, and are either static or dynamic.
PHYSICAL EXAMPLES:
/devices/pseudo/dr@0:IO4
/devices/pseudo/dr@0:IO6
/devices/pseudo/dr@0:IO14
/devices/pseudo/dr@0:SB4
/devices/pseudo/dr@0:SB6
LOGICAL EXAMPLES:
IO4, IO6, IO14, SB4, SB6
STATIC AP TYPES:
HPCI, CPU, MCPU, pci-pci/hp
DYNAMIC AP TYPES:
cpu, mem, io
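The relationship between the logical and physical board APIDs listed above can be sketched as a simple mapping. The helper name physical_apid is hypothetical. The /devices/pseudo/dr@0: prefix is taken from the examples above and is assumed to apply to the domain in question, and the 0-17 board range assumes a fully configured 15K-class chassis.

```shell
#!/bin/sh
# physical_apid: hypothetical helper mapping a logical board APID (SBn or IOn)
# to the physical form used in the examples above.
physical_apid() {
    case "${1:-}" in
        # Assumption: board numbers 0-17 (18 expander positions).
        SB[0-9]|SB1[0-7]|IO[0-9]|IO1[0-7])
            echo "/devices/pseudo/dr@0:$1"
            ;;
        *)
            echo "not a board APID: ${1:-}" >&2
            return 1
            ;;
    esac
}

physical_apid SB4
physical_apid IO14
```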
Examples:
sc0:sms-user:> rcfgadm -d a -f -c configure SB6
sc0:sms-user:> rcfgadm -d a -c unconfigure IO14
sc0:sms-user:> rcfgadm -d a -c configure SB6
sc0:sms-user:> rcfgadm -d a -c configure pcisch3:e06B1slot2 <--DR an I/O component (See Below)
Breakdown of specific I/O card to DR:
Example from above:
sc0:sms-user:> rcfgadm -d a -c configure pcisch3:e06B1slot2
pcisch<#>: This represents the pcisch device instance number. The example shows that the device being configured is pcisch3, that is, instance number 3 of the pcisch device for this domain. Prior to configuring a new device instance, run 'grep pcisch /etc/path_to_inst' on the domain to confirm which instances of the device are currently configured. Choose the next available instance to configure into the domain.
e<#>: This indicates the Expander Board location of this device. The example shows that this is e06, indicating the device is located on Expander 06.
B1: Indicates a slot1-type board.
NOTE: The board type will always be B1 on a Sun Fire[TM] 12K/15K/E20K/E25K for the I/O devices, because a slot1 board is the only type of board where these devices can be installed.
slot<#>: Indicates the cassette slot number (1-4) in which the device is located on a slot1 board. The example above shows slot2, which is the bottom-left cassette slot on the I/O board.
See Technical Instruction Document: 1017493.1 for a diagram.
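The breakdown above can be sketched as a small parser that splits an I/O attachment point into its parts. The function name parse_io_apid and the output field labels are hypothetical. The pattern accepts only the pcisch<#>:e<##>B<#>slot<#> shape described above.

```shell
#!/bin/sh
# parse_io_apid: hypothetical helper that splits an I/O attachment point of
# the form pcisch<N>:e<NN>B<N>slot<N> into labelled fields.
parse_io_apid() {
    parsed=$(printf '%s\n' "${1:-}" | sed -n \
        's/^pcisch\([0-9]\{1,\}\):e\([0-9]\{2\}\)\(B[0-9]\)slot\([0-9]\)$/instance=\1 expander=\2 board=\3 slot=\4/p')
    if [ -n "$parsed" ]; then
        echo "$parsed"
    else
        echo "unrecognized I/O APID: ${1:-}" >&2
        return 1
    fi
}

parse_io_apid pcisch3:e06B1slot2
# prints: instance=3 expander=06 board=B1 slot=2
```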
Useful information gathering commands:
rcfgadm -d a lists all attachment points except dynamic points.
rcfgadm -d a -al lists all current configurable hardware information (including dynamic).
rcfgadm -d a -avl lists all current configurable hardware in verbose mode.
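Output from these listing commands can be filtered to find candidates for a configure operation. The sketch below assumes the standard cfgadm column order (Ap_Id, Type, Receptacle, Occupant, Condition). The function name unconfigured_apids is hypothetical, and the sample listing is illustrative rather than captured from a real system.

```shell
#!/bin/sh
# unconfigured_apids: hypothetical filter. Reads a cfgadm-style listing and
# prints the Ap_Id of every attachment point whose occupant state (fourth
# column) is "unconfigured". The header line is skipped.
unconfigured_apids() {
    awk '$1 != "Ap_Id" && $4 == "unconfigured" { print $1 }'
}

# From the SC this might be used as: rcfgadm -d a -al | unconfigured_apids
# Illustrative sample listing only:
printf '%s\n' \
    'Ap_Id   Type   Receptacle     Occupant       Condition' \
    'IO4     HPCI   connected      configured     ok' \
    'IO6     HPCI   disconnected   unconfigured   unknown' \
    'SB4     CPU    connected      configured     ok' \
    'SB6     CPU    disconnected   unconfigured   unknown' | unconfigured_apids
```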
If the 'cfgadm' (configadm) command on the domain is used:
cfgadm [-f] [-v] -c
The command follows the same syntax rules and examples shown above for 'rcfgadm'. The difference is that 'cfgadm' is executed on the domain itself, whereas 'rcfgadm' is run from the SC. There is no '-d
http://download.oracle.com/docs/cd/E19065-01/servers.10k/816-3627-10/index.html