I recently had the joy of helping deal with an All Paths Down (APD) situation, which presented itself when removing a LUN from all the hosts in a cluster. If you do not first unmount the datastore and detach the device before you physically unpresent the LUN from the ESXi hosts, an APD situation results. APD is when the ESXi host no longer has any active paths to a device. When the device is no longer present and you rescan the adapters, the ESXi host still retains the information on the removed device, and hostd continues trying to open a connection to the disk device by issuing commands such as read capacity and read requests to validate that the partition tables are intact. If SCSI sense codes are not returned from the device (you are unable to contact the storage array, or the storage array does not return the supported sense codes), then the device is in an All Paths Down state, and the ESXi host continues to send I/O requests until it times out.
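If you suspect you are in this state, a quick check from the ESXi shell shows whether every path to the device has really gone dead. A minimal sketch; the naa identifier is a placeholder for your own LUN:

```
# List all paths for a specific device; in an APD state every
# path shows up as dead rather than active.
esxcli storage core path list -d naa.60000000000000000000000000000001

# A broader view: list all devices and their overall status.
esxcli storage core device list
```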
When this happens in vSphere 5, your hosts disconnect from vCenter and you cannot connect to them directly with the vSphere Client either. Your virtual machines continue to run, but you lose all control of the hosts until the timeout threshold has been reached. When we discovered that we could no longer connect to the hosts, our initial thought was that we would have to raise an emergency change to reboot the hosts and restart the virtual machines.
Although the hosts were unresponsive to any connection attempt, after a period of time the timeout threshold was reached and we were able to connect to the hosts again in a reasonable amount of time, with the exception of one host that did not recover like the rest and happened to have the most virtual machines on it. It seems that when an ESXi host is unable to determine whether the loss of the device is permanent, it retries the SCSI I/O from the user world (the host management agents) as well as the virtual machines' guest I/O. This is where it got interesting: during the rescan process, the service that showed up before the hang was usbarbitrator. This service supports USB device passthrough from the ESXi host to a virtual machine, and it was the process that took the longest to time out before the hostd service could continue and communication and connections could happen again. (In order to enable USB storage within the console for either ESX or ESXi, you first need to disable USB passthrough.)
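If you want to take usbarbitrator out of the picture, it can be stopped from the ESXi 5.x shell. A minimal sketch; note that stopping the service disables USB passthrough for running virtual machines, so treat it as a temporary measure:

```
# Stop the USB arbitrator service for the current boot.
/etc/init.d/usbarbitrator stop

# Keep it from starting again on subsequent reboots.
chkconfig usbarbitrator off
```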
What do you do when you find yourself in an APD situation? Although it appears all is lost while the hosts are disconnecting, if you are running vSphere 5 the SCSI commands will eventually time out. If you are running vSphere 4, a reboot may be your only hope, because the SCSI reconnect attempts will continue indefinitely.
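As a side note, vSphere 5.1 later made this timeout explicit and tunable through two advanced settings; if you are on 5.1 or newer you can inspect them from the shell (the values shown are, as far as I know, the defaults):

```
# Check whether APD handling is enabled (1 = enabled).
esxcli system settings advanced list -o /Misc/APDHandlingEnable

# View or adjust the APD timeout in seconds (140 by default); after this
# period the host fast-fails new non-VM I/O instead of retrying forever.
esxcli system settings advanced list -o /Misc/APDTimeout
esxcli system settings advanced set -o /Misc/APDTimeout -i 140
```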
There is a checklist to follow to prevent this from happening to you, and it would be very useful for something like vCenter to perform such checks before allowing a LUN to be unpresented. Perhaps the integrations EMC and other storage vendors are working on could be bi-directional instead of unidirectional, such as VASA, which is queried by vCenter. If the storage administration tools realized they were presenting to vSphere, they could ensure these steps were done before unpresenting the LUN from a vSphere host. At the very least, there needs to be good teamwork between the storage administrators and the virtualization administrators.
Before unpresenting a LUN, run through this checklist:
- Review the log files for hardware errors related to the fibre channel and iSCSI hardware.
- Disable USB passthrough (at least temporarily).
- If the LUN is being used as a VMFS datastore, unregister or move to another datastore all objects (such as virtual machines and templates) stored on it.
- Note: All CD/DVD images located on the VMFS datastore must also be removed from the virtual machines.
- If the LUN is being used as an RDM, remove the RDM from the virtual machine: click Edit Settings, highlight the RDM hard disk, and select Remove. Ensure that Delete from disk is selected and click OK.
- Note: This destroys the mapping file, but not the LUN content.
- The datastore is not part of a datastore cluster.
- The datastore is not managed by Storage DRS.
- Storage I/O Control is disabled for the datastore.
- The datastore is not used for vSphere HA heartbeat.
- No third party scripts or utilities running on the ESXi host can access the LUN in question.
- If the LUN is an RDM, there is nothing to unmount and you can skip ahead. Otherwise, in the Configuration tab of the ESXi host, click Storage, right-click the datastore being removed, and click Unmount. (A command-line equivalent is sketched just after this list.)
- A Confirm Datastore Unmount window appears. When the prerequisite criteria have all been met, click OK.
- Note: To unmount a datastore from multiple hosts, switch the vSphere Client to the Datastores and Datastore Clusters view (Ctrl+Shift+D), perform the unmount task, and select the appropriate hosts that should no longer access the datastore.
- Under the Operational State of the device, the LUN is now listed as Unmounted.
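If you prefer the command line, the unmount and detach can also be done from the ESXi shell. This is a minimal sketch for ESXi 5.x; the datastore name and naa identifier are placeholders for your own values:

```
# List mounted VMFS volumes to find the datastore's label and UUID.
esxcli storage filesystem list

# Unmount the datastore by label (-u <uuid> works as well).
esxcli storage filesystem unmount -l MyDatastore

# Detach the underlying device so the host stops issuing I/O to it.
esxcli storage core device set --state=off -d naa.60000000000000000000000000000001

# Confirm the device now appears in the detached list.
esxcli storage core device detached list
```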
The LUN can now be unpresented from the SAN, so it is time to get the storage team involved to unpresent the LUN at the array level. Once that is done, perform a rescan on all of the ESXi hosts that had visibility to the LUN; the device is automatically removed from the Storage Adapters view.
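The rescan can also be kicked off from the shell on each affected host; again, the naa identifier below is a placeholder:

```
# Rescan all storage adapters so the host drops the unpresented LUN.
esxcli storage core adapter rescan --all

# Optionally clear the stale entry from the detached-device list once the LUN is gone.
esxcli storage core device detached remove -d naa.60000000000000000000000000000001
```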
As an aside, there is at least one other way to get an All Paths Down situation, and that is via faulty fibre channel or iSCSI hardware. So be sure to check those logs for any fibre channel or iSCSI errors and investigate each one.
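On ESXi 5.x those storage messages land in /var/log/vmkernel.log, so a quick grep makes a reasonable first pass (the patterns here are illustrative, not exhaustive):

```
# Scan the vmkernel log for common storage path and APD error markers.
grep -iE "apd|iscsi|fc |path.*dead" /var/log/vmkernel.log | tail -n 50
```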