Ran into this on one of our ESX hosts recently after shutting it down for some maintenance:
* vsd-mount ... [!!]
You have entered the recovery shell. The situation you are in may be recoverable. If you are able to fix this situation the boot process will continue normally after you exit this terminal
/bin/sh: can't access tty; job control turned off
/ #
I was unable to get into troubleshooting mode on the ESX box, so I couldn't run any commands of consequence. I poked around and determined pretty quickly that reloading ESX was my quickest and easiest option. After all, we've got a Distributed Switch set up, and the rest is just basics (IP, etc.), so it wasn't a very hard decision.
------
Dustin Shaw
VCP
I had the same issue today. I un-presented the SAN storage and it came up fine.
Was your boot partition on the SAN, or just your datastore? Just curious, in case others have the same issue.
Part of what we traced the problem to was BIOS issues with the Dell R710 hardware that the box was running on. There is a particular BIOS build that had serious issues with ESX 4 and 4.1. It took Dell a little while to straighten themselves out.
At least we only had the issue with one box. Once it was resolved and we had updated the BIOS on that box, we updated the rest and haven't had a problem since. I have a client that happened to move their office datacenter (a small-scale setup of 2-4 ESX servers) during this timeframe. When they brought their boxes back up at the new location, every box had issues. It took them about 10-20 hours to do what should have taken 2.
Ran into a similar problem at a customer's site a few days ago. A UPS failed, and it turned out that some of the SAN drive enclosures were not cabled properly to take advantage of the redundant UPS power. One ESX 4.0 cluster was not affected (other than a loss of redundant power), while the other had all of its host servers & VMs marked as "disconnected". There was very little you could do in vCenter to figure out what was going on. I did notice that some of the datastores would not return a directory listing, but I couldn't determine why.
Rebooting the affected host servers caused all to hang during boot-up at "vsd-mount". The servers boot from local storage. The datastores are on the SAN.
I disabled all of the FC switch ports & rebooted. All of the ESX hosts booted up. I re-enabled the FC switch ports & rescanned the storage. It was then that I discovered several missing datastores. Some were now grayed out, while others appeared normally and I could browse them.
The SAN is owned by a different vendor, but the sys admin had entrusted me with the password for just such an occasion. I got into Navisphere and discovered several LUNs were offline, then figured out which enclosures they were hosted in. A quick look at the physical enclosures revealed they were not powered on. I'm not sure why the SAN didn't throw an alert or light up any fault indicators, but once we fixed the PDU problem and brought the enclosures back "online" in Navisphere, ESX resumed normal operation.
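For anyone retracing the "which enclosures are the offline LUNs in" step, here is a rough Python sketch of the grouping. The LUN IDs, states, and enclosure names are all made up for illustration, and this is not Navisphere output or any Navisphere API. It just assumes you have copied down a list of LUNs with their state and hosting enclosure.

```python
from collections import defaultdict

# Hypothetical inventory: (LUN id, state, enclosure). None of these values
# come from the actual array; they only illustrate the grouping step.
luns = [
    (0, "online", "enclosure-0"),
    (1, "offline", "enclosure-2"),
    (2, "offline", "enclosure-2"),
    (3, "online", "enclosure-1"),
    (4, "offline", "enclosure-3"),
]

def offline_by_enclosure(inventory):
    """Group the IDs of offline LUNs by the enclosure that hosts them."""
    grouped = defaultdict(list)
    for lun_id, state, enclosure in inventory:
        if state == "offline":
            grouped[enclosure].append(lun_id)
    return dict(grouped)

suspects = offline_by_enclosure(luns)
for enclosure, lun_ids in sorted(suspects.items()):
    print(f"{enclosure}: offline LUNs {lun_ids}")
```

If every offline LUN lands in the same one or two enclosures, that points at something physical (power, cabling) on those enclosures rather than at ESX itself, which is what happened here.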