Question: We are looking for a VMware Administrator with VSAN experience. Have you ever configured a VSAN cluster? What challenges did you run into during VSAN deployment? What tools did you use to troubleshoot those issues? Help me understand your VSAN knowledge and implementation experience.
Answer: Most VMware Administrators answer this question with the theory of VSAN fundamentals rather than explaining the real-world challenges of a deployment. The interviewer is looking for real issues in your answer, because they help the interviewer trust your VSAN implementation skills. Today let’s see how to answer such tough questions by describing a production issue: disks that were not detected when you tried to configure VSAN at the cluster level. Many blogs cover the steps required to configure and enable VSAN, and VMware publishes a feature walkthrough for VSAN beginners.
We are going to use the Ruby vSphere Console (RVC) to troubleshoot local hard disks that are not visible during VSAN configuration even though they are available on the ESXi host for regular disk operations. RVC is a console user interface for VMware ESXi and vCenter Server. It comes bundled with both the vCenter Server Appliance (VCSA) and the Windows version of vCenter Server. Most importantly, RVC is one of the primary tools for managing and troubleshooting a Virtual SAN environment. VMware has published a white paper with the Ruby vSphere Console command reference for Virtual SAN.
Production issue: The customer asked you to enable VSAN on a cluster whose hosts are populated with flash and HDD disks. When you start the VSAN wizard, you can’t see the local disks attached to a couple of the ESXi servers.
Connect to RVC using the procedure that matches your vCenter version (VCSA or Windows vCenter Server).
Run the command: vsan.disks_info ~/computers/Clustername/hosts/hostname
/10.30.40.61/CICD-SDDC/computers> vsan.disks_info ~/computers/vDC-NSX/hosts/10.30.40.51
2017-05-16 14:52:55 -0700: Gathering disk information for host 10.30.40.51
2017-05-16 14:53:08 -0700: Done gathering disk information
Disks on host 10.30.40.51:
| DisplayName | isSSD | Size | State |
| Local ATA Disk (naa.55cd2e404c09019c) | SSD | 223 GB | eligible |
| ATA INTEL SSDSC2BB24 | | | |
| Local ATA Disk (naa.55cd2e404c090c3b) | SSD | 223 GB | ineligible (Existing partitions found on disk 'naa.55cd2e404c090c3b'.) |
| ATA INTEL SSDSC2BB24 | | | |
| | | | Partition table: |
| | | | 1: 223.57 GB, type = vmfs ('51_SSD-Datastore') |
From the output you can clearly see that one disk is ineligible because an existing partition was found on it. The disk still carries a previous file system (a VMFS datastore in this case), and VSAN only looks for disks without any file system or partition, so it does not detect this one. The next step is to delete the old partition so that the disk becomes usable for VSAN. You may run into the same challenge when you decommission a VSAN cluster, because the file system and disk partitions are not deleted when VSAN is disabled from vCenter Server. Now you have a proper real-world production scenario to explain to the interviewer. Start your answer with the number of ESXi servers in the cluster that take part in VSAN, the number of disks used, and the FTT (failures to tolerate) values.
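When several hosts are involved, reading the State column by eye gets tedious. The following is a minimal Python sketch that scans pasted vsan.disks_info output for ineligible disks. It assumes the pipe-delimited layout shown in the excerpt above; the names SAMPLE_OUTPUT and find_ineligible_disks are hypothetical helpers and not part of RVC.

```python
import re

# Sample text copied from the vsan.disks_info excerpt above (assumed layout).
SAMPLE_OUTPUT = """\
| DisplayName | isSSD | Size | State |
| Local ATA Disk (naa.55cd2e404c09019c) | SSD | 223 GB | eligible |
| ATA INTEL SSDSC2BB24 | | | |
| Local ATA Disk (naa.55cd2e404c090c3b) | SSD | 223 GB | ineligible (Existing partitions found on disk 'naa.55cd2e404c090c3b'.) |
| ATA INTEL SSDSC2BB24 | | | |
| | | | Partition table: |
| | | | 1: 223.57 GB, type = vmfs ('51_SSD-Datastore') |
"""

DEVICE_RE = re.compile(r"\((naa\.[0-9a-f]+)\)")  # device ID inside DisplayName
PARTITION_RE = re.compile(r"^(\d+):\s")          # "1: 223.57 GB, type = ..."


def find_ineligible_disks(output):
    """Return one dict per disk whose State column starts with 'ineligible'."""
    disks = []
    current = None  # ineligible disk whose continuation rows are being read
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip timestamps, headings, and border lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 4:
            continue
        name, _is_ssd, _size, state = cells
        device = DEVICE_RE.search(name)
        if device:
            # A new disk row starts here; forget the previous disk.
            current = None
            if state.startswith("ineligible"):
                current = {"device": device.group(1),
                           "reason": state,
                           "partitions": []}
                disks.append(current)
        elif current is not None:
            # Continuation row: collect partition numbers if present.
            part = PARTITION_RE.match(state)
            if part:
                current["partitions"].append(int(part.group(1)))
    return disks


if __name__ == "__main__":
    for disk in find_ineligible_disks(SAMPLE_OUTPUT):
        print(disk["device"], disk["partitions"], disk["reason"])
```

For the sample above, the sketch reports only naa.55cd2e404c090c3b with partition 1, which is the disk that has to be cleaned before VSAN can claim it.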
Summary: When you answer this question, explain the VSAN infrastructure planning and sizing decisions taken by the architect. You were responsible for implementing the solution and came across the ineligible disk issue during the deployment. You used RVC to find the problem and then partedUtil to delete the existing partition from the disk. After that, the local disks became visible for VSAN enablement and the VSAN datastore was created successfully.
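The partedUtil step can be sketched as follows. This Python helper only builds the command strings that you would then run yourself in the ESXi shell of the affected host; it does not execute anything. The function name parted_util_commands is hypothetical, and the device ID and partition number are taken from the vsan.disks_info output above. Deleting a partition destroys the data on it, so the sketch assumes the old datastore is no longer needed and has already been evacuated.

```python
import re

# Accept only plain NAA identifiers so nothing unexpected ends up in a command.
NAA_RE = re.compile(r"^naa\.[0-9a-f]+$")


def parted_util_commands(device, partitions):
    """Build (not run) ESXi shell commands to inspect and delete partitions.

    device     -- NAA identifier, e.g. 'naa.55cd2e404c090c3b'
    partitions -- partition numbers reported for that disk
    """
    if not NAA_RE.match(device):
        raise ValueError(f"unexpected device identifier: {device!r}")
    path = f"/vmfs/devices/disks/{device}"
    # Show the partition table first so it can be checked before deleting.
    commands = [f"partedUtil getptbl {path}"]
    # One delete per partition, highest partition number first.
    for number in sorted(partitions, reverse=True):
        commands.append(f"partedUtil delete {path} {int(number)}")
    # Show the table again to confirm that the disk is now empty.
    commands.append(f"partedUtil getptbl {path}")
    return commands


if __name__ == "__main__":
    for command in parted_util_commands("naa.55cd2e404c090c3b", [1]):
        print(command)
```

Printing the commands instead of running them keeps a human in the loop for a destructive operation. After the partition is removed, rerunning vsan.disks_info should show whether the disk is now reported as eligible.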