vSphere Diagnostic Tool – Quick health checks via python script

This is a quick post about the health checks you can run using a script shared by the VMware support team. We recently had issues with vCenter services crashing and opened a support case with VMware. They suggested running this short version of VMware Skyline Health Diagnostics, which is really handy when vCenter is having problems. It reminds me of the Nutanix health check scripts, although the number of checks in this version is limited.

 

VMware Skyline Health Diagnostics (SHD) is a self-service tool that analyzes log bundles to detect issues and suggest relevant Knowledge Base articles or remediation steps for vSphere and vSAN products. vSphere administrators can use it to troubleshoot issues before contacting VMware Support. There is also a shorter version of the health check, called VDT (vSphere Diagnostic Tool). It writes its report to /var/log/vmware/vdt/ and mirrors the output to the screen. The steps to run the script are listed below, followed by a sample report.

 

-Download the latest version of vSphere Diagnostic Tool: https://via.vmw.com/vdt
Note: Let me know if you have any issues downloading it.

-Use the file transfer utility of your choice (WinSCP, for example) to copy the entire ZIP file to /root on the node where you want to run it.
Note: If you have trouble connecting to a vCenter appliance using WinSCP, see the KB article "Error when uploading files to vCenter Server Appliance using WinSCP".

-Change to the directory that contains the file and unzip it:
cd /root/
unzip vdt-version_number.zip

-Run the tool with the command:
python vdt.py

You will be prompted for the password for administrator@sso.domain. Many checks will still run even if credentials are not supplied.

The tool will then run. You can review the output by scrolling up and down in the window. Each test should be self-explanatory in its meaning, findings, and directions.
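If you keep a copy of the output, a few lines of Python can pull out the counts and the failures so you do not have to scroll. This is only a minimal sketch: it assumes the report looks like the sample in this post, with "Running <script>" lines opening each check and result lines starting with [PASS], [FAIL] or [INFO]. The function name summarize and both regular expressions are my own and are not part of VDT.

```python
import re
from collections import Counter, defaultdict

# Assumption: each check starts with a log line such as
# "2021-08-25T18:23:42 INFO Vdt: Running vc_ntp.sh" (format taken from the sample report).
RUNNING = re.compile(r"INFO Vdt: Running (\S+)")

# Assumption: individual results are lines that begin with a bracketed status,
# e.g. "[FAIL] NTP and Host time are both disabled!".
# Summary lines such as "RESULT: [PASS]" do not start with "[" and are skipped.
STATUS = re.compile(r"^\[([A-Z]+)\]\s*(.*)$")


def summarize(report_text):
    """Return (status counts, failures grouped by the check script that reported them)."""
    counts = Counter()
    failures = defaultdict(list)
    current = "(before first check)"
    for raw in report_text.splitlines():
        line = raw.strip()
        started = RUNNING.search(line)
        if started:
            # Remember which check script the following results belong to.
            current = started.group(1)
            continue
        result = STATUS.match(line)
        if result:
            status, message = result.groups()
            counts[status] += 1
            if status == "FAIL":
                failures[current].append(message)
    return counts, dict(failures)
```

To use it, read a saved report into a string and call summarize(text). The report file name under /var/log/vmware/vdt/ depends on the run, so none is hard-coded here.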

 

Benefits of Skyline Health Diagnostics:

  • SHD can perform diagnostics on the vCenter Appliance even if the services are offline.
  • SHD can also perform diagnostics, over SSH, on ESXi hosts that are disconnected or not responding in the vCenter Server inventory.
  • SHD supports offline health diagnostics: in a dark-site environment, the appliance can be installed on a standalone ESXi host and logs can be uploaded to it manually for analysis.
  • SHD performs the following diagnostic checks as of today:
    VMware Security Advisory alerts based on the environment status
    VMware Compatibility health checks for hardware and vSAN
    Log analysis based on the available plugins, with recommended steps or Knowledge Base articles to fix the reported issue. SHD has more than 500 plugins as of today.
  • The SHD report can be used for audit and documentation purposes for any issue reported in the environment. It comes with a detailed and comprehensive list of checks and results that can be presented to a wider audience within the organization.
  • The security team can also benefit from this report by validating whether the environment is up to date with the available VMware Security Advisories and fixes.

 

Sample Report:

2021-08-25T18:22:58 INFO Vdt: Today: Wednesday, August 25 18:22:58 Version: 1.0.9 Log Level: INFO
2021-08-25T18:23:28 INFO Vdt: Running __vc_info_auth.py
2021-08-25T18:23:31 INFO Vdt:
________________________
VCENTER BASIC INFO

Current Time: 2021-08-25 18:23:31.559527
vCenter Uptime: up 1000 days
vCenter Load Average: 2.41, 2.60, 2.45
Number of CPUs: 24
Total Memory: 48 GB
vCenter Hostname: NAME
vCenter PNID: NAME
vCenter IP Address: IPADDRESS
Proxy Configured: "no"
NTP Servers: Your NTP Servers
vCenter Node Type: vCenter with Embedded PSC
vCenter Version: 6.x.0.48000 – Build version
vCenter SSO Domain: domainname
vCenter AD Domain: No DOMAIN

Number of ESXi Hosts: 20,000
Number of Virtual Machines: 200,000
Number of Clusters: 2000
Disabled Plugins: None

2021-08-25T18:23:31 INFO Vdt: Running _vc_dns.sh
2021-08-25T18:23:32 INFO Vdt:
__________________
VC DNS CHECK

Nameservers
DNS1

DNS2

Entries in /etc/hosts
127.0.0.1 vcenter  vcenter localhost
::1 vcenter vcenter localhost ipv6-localhost ipv6-loopback

Non-standard entries in /etc/hosts
[PASS] None

Basic Port Testing
[PASS] Port TCP 53 open to nameserver DNS1
[PASS] Port TCP 53 open to nameserver DNS2

Nameserver Queries
DNS1
[PASS] DNS with UDP – resolved vcenter to DNS1
[PASS] Reverse DNS – resolved DNS1 to vcenter
[PASS] DNS with TCP – resolved vcenter to DNS1
DNS2
[PASS] DNS with UDP – resolved vcenter to DNS2
[PASS] Reverse DNS – resolved DNS2 to vcenter
[PASS] DNS with TCP – resolved vcenter to DNS2

Commands used:
dig +short <fqdn> <nameserver>
dig +noall +answer -x <ip> <namserver>
dig +short +tcp <fqdn> <nameserver>

RESULT: [PASS]

2021-08-25T18:23:32 INFO Vdt: Running lsreport.py
2021-08-25T18:23:32 INFO _svc_log: Get services status, svcnames=['vmdird']
2021-08-25T18:23:35 INFO live_checkCerts: Checking services for trust mismatches…
2021-08-25T18:23:35 INFO Vdt:
__________________________
Lookup Service Check

Please remember to check if a node shows up in more than one SSO site.
If a node exists in more than one SSO site, you will need to run
lsdoctor.py -r option 2 (https://kb.vmware.com/s/article/80469)

MACHINE ID CHECK

[PASS] Machine ID matches vpxd solution user in vpxd.cfg

REGISTRATION CHECK

SSO Site: default-site
[FAIL] Node: vcenter
– PROBLEM: Duplicates Found: Ignore if this is the PSC HA VIP. Otherwise, you must unregister the extra endpoints.

2021-08-25T18:23:35 INFO Vdt: Running vc_ad_check.py
2021-08-25T18:23:35 INFO _svc_log: Get services status, svcnames=['lwsmd']
2021-08-25T18:23:36 INFO Vdt:
______________
AD Check

Domain Report:
No domain(s) detected

Domain Exclusion List:

None

DC Exclusion List:

None

2021-08-25T18:23:36 INFO Vdt: Running vc_auth_cert_check.py
2021-08-25T18:23:36 INFO _svc_log: Get services status, svcnames=['vmafdd']
2021-08-25T18:23:37 INFO checkCerts: Found vpxd-extension.
2021-08-25T18:23:39 INFO Vdt:
__________________________
VC CERTIFICATE CHECK

[PASS] ESXi Certificate Management Mode: vmca

Checking MACHINE_SSL_CERT

[PASS] Supported Signature Algorithm
[PASS] Certificate trust check
[PASS] Certificate expiration check
[INFO] Certificate SAN check

Checking Other Certificate Stores

VPXD-EXTENSION
[PASS] Supported Signature Algorithm
[PASS] Certificate trust check
[PASS] Certificate expiration check
[INFO] Certificate SAN check
DETAILS: SAN contains hostname but not IP.
Checking VC Extension Thumbprints
[PASS] com.vmware.vim.eam Thumbprint Check
[PASS] com.vmware.rbd Thumbprint Check
[PASS] com.vmware.imagebuilder Thumbprint Check

SMS
[PASS] Supported Signature Algorithm
[PASS] Certificate expiration check

VPXD
[PASS] Supported Signature Algorithm
[PASS] Certificate trust check
[PASS] Certificate expiration check
[INFO] Certificate SAN check

MACHINE
[PASS] Supported Signature Algorithm
[PASS] Certificate trust check
[PASS] Certificate expiration check
[INFO] Certificate SAN check

DATA-ENCIPHERMENT
[PASS] Supported Signature Algorithm
[PASS] Certificate trust check
[PASS] Certificate expiration check
[INFO] Certificate SAN check

KMS_ENCRYPTION
[PASS] Supported Signature Algorithm
[FAIL] Certificate trust check
DETAILS: Signing authority does not exist in TRUSTED_ROOTS!
[PASS] Certificate expiration check
[FAIL] Certificate SAN check
DETAILS: xxxxxx – SAN contains neither hostname nor IP!

[PASS] Supported Signature Algorithm
[FAIL] Certificate trust check
DETAILS: Signing authority does not exist in TRUSTED_ROOTS!
[PASS] Certificate expiration check
[FAIL] Certificate SAN check
DETAILS: xxxxxx – SAN contains neither hostname nor IP!

[PASS] Supported Signature Algorithm
[PASS] Certificate is self-signed
[PASS] Certificate expiration check
[FAIL] Certificate SAN check
DETAILS: xxxxxx – SAN contains neither hostname nor IP!

[PASS] Supported Signature Algorithm
[FAIL] Certificate trust check
DETAILS: Signing authority does not exist in TRUSTED_ROOTS!
[PASS] Certificate expiration check
[FAIL] Certificate SAN check
DETAILS: xxxxxx – SAN contains neither hostname nor IP!

VSPHERE-WEBCLIENT
[PASS] Supported Signature Algorithm
[PASS] Certificate trust check
[PASS] Certificate expiration check
[INFO] Certificate SAN check
DETAILS: SAN contains hostname but not IP.

Checking TRUSTED_ROOTS certificates

Alias: xxxxx
[PASS] Supported Signature Algorithm
[PASS] Certificate is self-signed
[PASS] Certificate expiration check
Child certificates:
MACHINE_SSL_CERT
vpxd-extension
vpxd
machine
data-encipherment
vsphere-webclient
[PASS] Certificate is a CA

Checking STS Certs

[PASS] Certificate expiration check

2021-08-25T18:23:39 INFO Vdt: Running vc_auth_vmdir_check.py
2021-08-25T18:23:39 INFO _svc_log: Get services status, svcnames=['vmdird']
2021-08-25T18:23:39 INFO Vdt:
_________________
VMdir Check

[INFO] VMdir database size: 42.69MB

[INFO] VMdir Status Check (No partners)

[PASS] VMdir State Check

[PASS] VMdir Arguments Check

2021-08-25T18:23:39 INFO Vdt: Running vc_corefile_check.py
2021-08-25T18:23:40 INFO Vdt:
_____________________
CORE FILE CHECK

[PASS] Number of core files: 0
[PASS] Number of hprof files: 0

2021-08-25T18:23:40 INFO Vdt: Running vc_db_check.py
2021-08-25T18:23:41 INFO _svc_log: Get services status, svcnames=['vmware-vpostgres']
2021-08-25T18:23:41 INFO Vdt:
______________________________
vCenter PostgresDB Check

Top 10 Largest Tables:

tablename        | size
-----------------+--------
vpx_task         | 1760 MB
vpx_event_arg_91 | 856 MB
vpx_event_arg_92 | 839 MB
vpx_event_arg_3  | 827 MB
vpx_event_arg_4  | 826 MB
vpx_event_arg_1  | 821 MB
vpx_event_arg_2  | 821 MB
vpx_event_arg_90 | 753 MB
vpx_event_arg_89 | 721 MB
vpx_event_arg_76 | 687 MB

Total Postgres Size:
8.5G /storage/db/vpostgres/
99G /storage/seat/vpostgres/
107G Interpreted by vPostgres

2021-08-25T18:23:41 INFO Vdt: Running vc_disk_space.py
2021-08-25T18:23:42 INFO Vdt:
________________
DISK CHECK

[PASS] DISK CAPACITY

[PASS] INODE USAGE

RESULT: [PASS]
Please see KB: https://kb.vmware.com/s/article/1003564

2021-08-25T18:23:42 INFO Vdt: Running vc_ntp.sh
2021-08-25T18:23:42 INFO Vdt:
__________________
VC NTP CHECK

[FAIL] NTP and Host time are both disabled!

2021-08-25T18:23:42 INFO Vdt: Running vc_ports.py
2021-08-25T18:23:45 INFO Vdt:
________________________
vCenter Port Check

Checking ports: 443, 389, 2012, 2020
For port information, please see KB: https://kb.vmware.com/s/article/52963

[PASS] Port check for host vcenter

2021-08-25T18:23:45 INFO Vdt: Running vc_root_check.py
2021-08-25T18:23:45 INFO Vdt:
________________________
Root Account Check

[PASS] Root password never expires

2021-08-25T18:23:45 INFO Vdt: Running vc_services.py
2021-08-25T18:23:45 INFO _svc_log: Get services status, svcnames=None
2021-08-25T18:23:48 INFO Vdt:
_______________________
VC SERVICES CHECK

Printing only services that are stopped and should be started.
KB: https://kb.vmware.com/s/article/2109887

RESULT: [PASS]

2021-08-25T18:23:48 INFO Vdt: Running vc_syslog_check.py
2021-08-25T18:23:48 INFO Vdt:
__________________
Syslog Check

Remote Syslog config: @logs:1517

[PASS] DNS lookup for logs

We’ve detected you have a remote syslog server configured.
Please search your remote syslog server for this string to
validate syslog is working correctly:

vdt-2021-08-25-182348

[PASS] Local Syslog Functional Check

2021-08-25T18:23:48 INFO Vdt: Running vc_vcha_check_auth.py
2021-08-25T18:23:49 INFO _svc_log: Get services status, svcnames=['vmware-vcha']
2021-08-25T18:23:49 INFO Vdt:
________________
VCHA CHECK

[INFO] VCHA is enabled.
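
The VC DNS CHECK section of the report lists the three dig commands it used. If one of those checks fails, you may want to repeat the queries by hand, for example from another machine. The sketch below builds the same three commands. The function names, the example values, and the @ prefix on the nameserver (dig's usual way of naming the server to query, which the report's placeholder does not show) are assumptions on my part and are not taken from the VDT source. It also assumes Python 3.7 or later and dig available on the PATH.

```python
import shutil
import subprocess


def dig_commands(fqdn, ip, nameserver):
    """Build the three queries listed under 'Commands used' in the report."""
    server = "@" + nameserver  # assumption: dig expects the server as @<nameserver>
    return [
        ["dig", "+short", fqdn, server],               # forward lookup over UDP
        ["dig", "+noall", "+answer", "-x", ip, server],  # reverse lookup
        ["dig", "+short", "+tcp", fqdn, server],       # forward lookup over TCP
    ]


def run_dns_checks(fqdn, ip, nameserver, timeout=10):
    """Run each query and return (command line, trimmed output) pairs."""
    if shutil.which("dig") is None:
        raise RuntimeError("dig was not found on the PATH")
    results = []
    for argv in dig_commands(fqdn, ip, nameserver):
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
        results.append((" ".join(argv), proc.stdout.strip()))
    return results
```

For example, run_dns_checks("vcenter.example.com", "192.0.2.10", "192.0.2.53") returns the output of each query. The host name and addresses here are placeholders, so substitute your own vCenter FQDN, IP address, and nameserver. An empty output string for a query means dig returned no answer for it.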

 

 
