You may have read my blog post, Isolation of Intermittent Network Issue. That post explains that ESXi ships with a neat utility, pktcap-uw, that can capture network traffic at various points within the host.
Like most troubleshooting tools, pktcap-uw needs a user at the system to start the data collection. For environments running at scale, that is a staffing challenge as well as an operational one.
The code below is a sample script that you can customize to collect this data in 15-minute batches.
Code:
#!/bin/sh
capture()
{
fileid=$1
pktcap-uw --uplink vmnic4 -o /vmfs/volumes/local-ds-86/vmnic4-inbound-file$fileid.pcap >/dev/null &
pktcap-uw --uplink vmnic4 --dir 1 -o /vmfs/volumes/local-ds-86/vmnic4-outbound-file$fileid.pcap >/dev/null &
pktcap-uw --uplink vmnic5 -o /vmfs/volumes/local-ds-86/vmnic5-inbound-file$fileid.pcap >/dev/null &
pktcap-uw --uplink vmnic5 --dir 1 -o /vmfs/volumes/local-ds-86/vmnic5-outbound-file$fileid.pcap >/dev/null &
}
#Change the number of iterations to reduce the total time of capture
for i in 0 1 2 3 4 5 6 7 8 9;
do
#Kill any existing session. This makes sure that the script can recover from a previous dirty exit, for example one caused by Ctrl+C
kill -9 $(lsof | grep -i pktcap | grep vmfs | awk '{print $1}');
#Start captures
capture $i >/dev/null &
#Change the sleep time to reduce the time of capture per batch
sleep 900;
#Kill captures
kill -9 $(lsof | grep -i pktcap | grep vmfs | awk '{print $1}');
done
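One rough edge in the script above: when no capture is running, the command substitution expands to nothing and kill is called without a process ID, which makes it complain. The sketch below is one possible way to guard against that, not part of the original script. The function names pktcap_ids and stop_captures are hypothetical, and the filter keeps the original assumption that the first column of lsof output identifies the process.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not from the original script.

# Read lsof-style output on stdin and print the unique first-column IDs
# of pktcap-uw entries that have a file open under /vmfs.
pktcap_ids()
{
    grep -i pktcap | grep vmfs | awk '{print $1}' | sort -u
}

# Kill running captures, but only if at least one was found.
stop_captures()
{
    ids=$(lsof | pktcap_ids)
    if [ -n "$ids" ]; then
        # $ids is intentionally unquoted so that each ID becomes its own argument
        kill -9 $ids
    fi
}
```

With these defined, both kill lines in the loop could be replaced by a call to stop_captures.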
Hmmm nice code….
Hi Akshay, so we need to create a file with some name and execute it from anywhere, then collect the logs from the location mentioned in your script, correct?
Also, I was wondering: isn't executing the script still a manual process? In other words, an admin has to run the script at the time of the issue, while they have access to the production environment. Am I correct?
Thanks Kamal. Yes, your understanding is correct.
The script is indeed not a fully automated solution. However, it can be made fully automated by using any of the following methods:
1) A cron job that runs the script every two hours (the script as posted runs for 10 × 15 minutes = 2.5 hours, so reduce it to 8 iterations or lengthen the interval to keep runs from overlapping)
2) Calling the script in an infinite while loop as a background process
3) Calling the script in a while loop as a background process, with a termination file
I would recommend option 1, as it does not involve infinite loops.
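For anyone who still wants to try option 3, here is a minimal sketch of a wrapper loop with a termination file. It is only an illustration: the function name run_until_stopped, the script location and the stop-file path /tmp/stop-capture are all assumptions, not something from the script above.

```shell
#!/bin/sh
# Hypothetical wrapper for option 3; all names and paths here are assumptions.

# Rerun a script until a stop file appears, then report how many runs completed.
run_until_stopped()
{
    script=$1      # script to run repeatedly
    stop_file=$2   # create this file to end the loop
    runs=0
    # The stop file is checked before every run, so a run in progress finishes first
    while [ ! -e "$stop_file" ]; do
        sh "$script"
        runs=$((runs + 1))
    done
    # Remove the stop file so that the next invocation starts cleanly
    rm -f "$stop_file"
    echo "stopped after $runs run(s)"
}

# Example invocation, commented out so that sourcing this file has no side effects:
# run_until_stopped /vmfs/volumes/local-ds-86/capture.sh /tmp/stop-capture &
```

To end the loop, create the stop file (for example with touch /tmp/stop-capture). Because the check happens between runs, the loop exits only after the current run of the capture script has finished.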
Some of you might see “&amp;” at various places in the above script. This seems to be happening because of the HTML translation of the shell code. It is supposed to be just &.