All Paths Down!

I recently had the joy of helping deal with an All Paths Down (APD) situation, which presented itself when a LUN was removed from all the hosts in a cluster. If you do not detach the device first (which also initiates an unmount operation) before you physically unpresent the LUN from the ESXi hosts, you cause an APD situation. APD is when an ESXi server no longer has any active paths to a device.

When the device is no longer present and you rescan the adapters, the ESXi server still retains the information on the removed device, and hostd keeps trying to open a connection to the disk device by issuing commands such as read capacity and read requests to validate the partition tables. If SCSI sense codes are not returned from a device (you are unable to contact the storage array, or the storage array does not return the supported SCSI codes), then the device is in an All-Paths-Down (APD) state, and the ESXi host continues to send I/O requests until they time out.
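
To make the safe removal order concrete, here is a minimal Python sketch that only builds the esxcli command lines for one host, in the order they would be run. Treat it as an illustration, not a tested procedure: the function name `lun_removal_commands`, the device ID and the datastore label are hypothetical, the explicit unmount step is my assumption about how you would do this from the command line, and the esxcli options are written as I recall them from ESXi 5.x, so check them against your own version before running anything.

```python
def lun_removal_commands(device_id, datastore_label=None):
    """Return the esxcli commands, in order, for removing a LUN from one host.

    Nothing is executed here; the caller decides how and where to run them.
    """
    commands = []
    if datastore_label:
        # Unmount the VMFS datastore that lives on the device, if there is one.
        commands.append(f"esxcli storage filesystem unmount -l {datastore_label}")
    # Detach the device so the host stops issuing I/O to it.
    commands.append(f"esxcli storage core device set --state=off -d {device_id}")
    # The LUN is unpresented on the storage array at this point (not an esxcli step).
    # Rescan so the host notices that the LUN is gone.
    commands.append("esxcli storage core adapter rescan --all")
    # Drop the stale entry from the host's list of detached devices.
    commands.append(f"esxcli storage core device detached remove -d {device_id}")
    return commands


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    for cmd in lun_removal_commands("naa.600000000000000000000000000000aa", "datastore01"):
        print(cmd)
```

The point of the ordering is that every host in the cluster has detached the device before the array stops presenting it, so the hosts are never left sending I/O to a device with no paths.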