After recently upgrading vCenter and ESXi to 6.0U1a and installing all patches (now build numbers 3018524 and 3073146 respectively), we began experiencing random host disconnects from vCenter. The host itself and all guests on it are still alive when it's disconnected; I can SSH to the host and RDP/SSH to guests. If I literally do nothing, it eventually fixes itself and rejoins vCenter within 15-20 minutes. We did not have this issue prior to upgrading to 6.0U1a. This is not the "NETDEV WATCHDOG: vmnic4: transmit timed out" issue. In fact, the reason we upgraded to the latest build was to get the fix for that particular issue.
I've now personally witnessed this on three different hosts, and as far as we have noticed it has not yet recurred on the same host. The vmkernel.log simply shows:
2015-11-18T20:56:42.662Z cpu12:173086)User: 3816: wantCoreDump:vpxa-worker signal:6 exitCode:0 coredump:enabled
2015-11-18T20:56:42.819Z cpu15:173086)UserDump: 1907: Dumping cartel 172357 (from world 173086) to file /var/core/vpxa-worker-zdump.000 ...
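For anyone who wants to check whether their own hosts show the same crash signature, here is a rough Python sketch that scans vmkernel.log lines for the wantCoreDump entry quoted above. The regex is modelled only on that one line, and the function name find_core_dumps is my own invention, so treat it as a starting point rather than a complete parser. (Signal 6 is conventionally SIGABRT on POSIX-style systems; I'm assuming the same numbering applies here.)

```python
import re

# Pattern modelled on the single wantCoreDump line quoted above (an assumption:
# other builds may format this line differently).
CORE_DUMP_RE = re.compile(
    r"^(?P<ts>\S+)\s+cpu\d+:\d+\)User: \d+: wantCoreDump:(?P<proc>\S+) "
    r"signal:(?P<signal>\d+)"
)


def find_core_dumps(lines):
    """Return (timestamp, process, signal) for each wantCoreDump line found."""
    hits = []
    for line in lines:
        m = CORE_DUMP_RE.match(line)
        if m:
            hits.append((m.group("ts"), m.group("proc"), int(m.group("signal"))))
    return hits


# The two vmkernel.log lines from this post, used as sample input.
sample = [
    "2015-11-18T20:56:42.662Z cpu12:173086)User: 3816: wantCoreDump:vpxa-worker signal:6 exitCode:0 coredump:enabled",
    "2015-11-18T20:56:42.819Z cpu15:173086)UserDump: 1907: Dumping cartel 172357 (from world 173086) to file /var/core/vpxa-worker-zdump.000 ...",
]

for ts, proc, sig in find_core_dumps(sample):
    print(ts, proc, "signal", sig)
```

To run it against a real log, copy vmkernel.log off the host and pass the open file object in place of sample.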
The vpxa.log doesn't show anything building up to the disconnection and leaves a large gap after the agent crashes, like so:
2015-11-18T20:56:42.638Z info vpxa[FFF2AB70] [Originator@6876 sub=vpxLro opID=QS-host-311567-2883ed8a-1e-SWI-42a5654a] [VpxLroList::ForgetTask] Unregistering vim.Task:sessio
2015-11-18T20:56:42.641Z verbose vpxa[FFF6CB70] [Originator@6876 sub=VpxaHalCnxHostagent opID=QS-host-311567-2883ed8a-1e] [VpxaHalCnxHostagent::DoCheckForUpdates] CheckForUp
2015-11-18T20:56:42.641Z verbose vpxa[FFF6CB70] [Originator@6876 sub=vpxaMoService opID=QS-host-311567-2883ed8a-1e] [VpxaMoService] GetChanges: 97820 -> 97820
2015-11-18T20:56:42.641Z verbose vpxa[FFF6CB70] [Originator@6876 sub=VpxProfiler opID=QS-host-311567-2883ed8a-1e] [2+] VpxaStatsMetadata::PrepareStatsChanges
2015-11-18T21:10:20.328Z Section for VMware ESX, pid=3326854, version=6.0.0, build=3073146, option=Release
2015-11-18T21:10:20.329Z verbose vpxa[FF8A6A60] [Originator@6876 sub=Default] Dumping early logs:
2015-11-18T21:10:20.329Z info vpxa[FF8A6A60] [Originator@6876 sub=Default] Logging uses fast path: false
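The silent stretch in vpxa.log is the easiest thing to look for on other hosts. Below is a minimal Python sketch that flags any jump between consecutive timestamps larger than a threshold. The function name find_gaps and the 5-minute default are my own assumptions, and it only understands the timestamp format seen in the lines above.

```python
import re
from datetime import datetime, timedelta

# Leading timestamp as it appears in the vpxa.log excerpt above, e.g.
# 2015-11-18T20:56:42.641Z
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})Z")


def find_gaps(lines, threshold=timedelta(minutes=5)):
    """Return (previous_ts, next_ts, gap) for each gap longer than threshold."""
    gaps = []
    prev = None
    for line in lines:
        m = TS_RE.match(line)
        if not m:
            continue  # skip continuation lines that carry no timestamp
        ts = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S.%f")
        if prev is not None and ts - prev > threshold:
            gaps.append((prev, ts, ts - prev))
        prev = ts
    return gaps


# Two lines from the excerpt above, on either side of the crash.
sample = [
    "2015-11-18T20:56:42.641Z verbose vpxa[FFF6CB70] [Originator@6876 sub=VpxProfiler opID=QS-host-311567-2883ed8a-1e] [2+] VpxaStatsMetadata::PrepareStatsChanges",
    "2015-11-18T21:10:20.328Z Section for VMware ESX, pid=3326854, version=6.0.0, build=3073146, option=Release",
]

for before, after, gap in find_gaps(sample):
    print("gap of", gap, "between", before, "and", after)
```

On the excerpt above this reports a single gap of roughly 13 and a half minutes, which lines up with how long the host stays disconnected before it rejoins.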
The vCenter logs simply show the host becoming unreachable, so the problem is evidently host-side.
Is anyone else seeing similar behavior? This has all the feel of another "known issue", but I don't see any talk about it. I have opened a case with VMware support and am awaiting contact now.