We’ll shortly be moving to OpenLDAP 2.4 as our primary LDAP server at NTUA. I wanted to post a few details on my testing, so here goes:
In general the server is quite stable, fast and reliable. The only thing an administrator should really keep an eye on is making sure that the database files have the correct ownership/permissions after running database administration tools (slapadd, slapindex etc).

Features

OpenLDAP has a really nice set of features which can prove quite useful like:

  • Per user limits. The administrator can set time and size limits on a per user or user group basis apart from setting it globally. One nice application is to limit anonymous access to only a few entries per query while allowing much broader limits for authenticated users. In my case I found that you have to set both soft and hard limits for things to work correctly.
  • MemberOf Overlay. Maintains a memberOf attribute on user entries that are members of static groups.
  • Dynamic Group Overlay. Dynamically creates the member attribute for dynamic groups based on the memberURL attribute value. Since the memberURL is evaluated on every group entry access, care should be taken so that only the proper users can access the group entries.
  • Audit Log Overlay. Maintains an audit text log file with all entry changes.
  • Referential Integrity and Attribute Uniqueness Overlays
  • Constraint Overlay. A very handy overlay which allows the administrator to set various constraints on attribute values. Examples are count constraint (for instance on the userpassword attribute), size constraint (jpegPhoto attribute), regular expression constraint (mail attribute) and even value list constraint (through an LDAP URI) which can be very handy for attributes with a specific value set (like edupersonaffiliation). Make sure you use 2.4.13 though since value list constraint will crash the server in earlier versions.
  • Monitor Backend, which exposes server status, parameters and operations directly through LDAP.
  • Dynamic Configuration, which provides an LDAP view of the whole server configuration, thus allowing the administrator to change most of it dynamically. I would not recommend using only dynamic config (although it is possible) since it’s a bit cryptic and hard to administer. What I did was enable it on top of the regular text slapd.conf in order to be able to dynamically change a few parameters, especially the database read-only feature (which can be used to perform online maintenance tasks like indexing or backing up data).
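To make the Dynamic Group Overlay item above more concrete, here is a minimal Python sketch of how a memberURL value breaks down into a search base, scope and filter, following the standard LDAP URL layout (base?attributes?scope?filter). The example URL and the function name parse_member_url are made up for illustration, and percent-decoding is deliberately left out.

```python
def parse_member_url(url):
    """Split an 'ldap:///base?attrs?scope?filter' URL into its parts.

    Simplified sketch: no host part and no percent-decoding.
    """
    prefix = "ldap:///"
    if not url.startswith(prefix):
        raise ValueError("expected a URL starting with ldap:///")
    parts = url[len(prefix):].split("?")
    parts += [""] * (4 - len(parts))          # pad missing trailing fields
    base, attrs, scope, search_filter = parts[:4]
    return {
        "base": base,
        "attrs": attrs.split(",") if attrs else [],
        "scope": scope or "base",              # LDAP URL default scope
        "filter": search_filter or "(objectClass=*)",  # LDAP URL default filter
    }


# Hypothetical dynamic group: all staff members under ou=people
example = "ldap:///ou=people,dc=ntua,dc=gr??sub?(eduPersonAffiliation=staff)"
parsed = parse_member_url(example)
print(parsed)
```

Every access to such a group entry triggers the search described by these fields, which is why access to dynamic group entries is worth restricting.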

Benchmarks

The administration guide provides a few hints on how to measure cache size needs depending on your database. The important points are the following:

  • cachesize should be as large as possible (optimally the sum of all the BDB file sizes – id2entry and indexes).
  • For decent performance cachesize should be enough to contain all internal nodes of the database B-Trees. db_stat -d can be used to show that information. id2entry uses 16KB pages while dn2id uses the filesystem page size (4KB or 8KB). The same should be done for the index files as well.
  • The entry cache should ideally hold all entries but at least be able to hold the working set. Keep in mind that even if you have 100,000 users in your database, only a small percent of them will use your services daily so your working set could be as small as 10% of your user population.
  • For back-hdb the index IDL cache should be 3 times the entry cache.
  • You should set up 4 threads per available CPU core.
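The last three rules of thumb are simple arithmetic, so here is a rough Python sketch that applies them. The function name, the 10% working set and the 2-core machine are assumptions picked for illustration, not measurements from my setup.

```python
def suggest_settings(total_entries, working_set_percent, cpu_cores):
    """Apply the rules of thumb above and return suggested values."""
    # Entry cache sized for the working set (rounded up, integer math only)
    entry_cache = (total_entries * working_set_percent + 99) // 100
    return {
        "entry_cache": entry_cache,      # entries held in the entry cache
        "idl_cache": 3 * entry_cache,    # back-hdb: 3 times the entry cache
        "threads": 4 * cpu_cores,        # 4 threads per CPU core
    }


# 100,000 users, 10% of them active, on a hypothetical 2-core server
settings = suggest_settings(100_000, 10, cpu_cores=2)
print(settings)
```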

My testing shows that you can probably use the bare minimum cachesize and still get nice performance. I loaded the LDAP server with two different working sets. One with a 20,000 users database (which resembled our user population) and one with 200,000 users (so that the test system memory was not large enough to hold all entries in cache). My cache calculations were the following:

Entries 20.000

du -c -h /var/db/openldap-data/*.bdb
400K    /var/db/openldap-data/businessCategory.bdb
608K    /var/db/openldap-data/cn.bdb
608K    /var/db/openldap-data/description.bdb
1.8M    /var/db/openldap-data/dn2id.bdb
240K    /var/db/openldap-data/eduPersonAffiliation.bdb
960K    /var/db/openldap-data/eduPersonOrgUnitDN.bdb
8.0K    /var/db/openldap-data/eduPersonPrimaryOrgUnitDN.bdb
576K    /var/db/openldap-data/employeeNumber.bdb
432K    /var/db/openldap-data/entryCSN.bdb
560K    /var/db/openldap-data/entryUUID.bdb
576K    /var/db/openldap-data/givenName.bdb
35M    /var/db/openldap-data/id2entry.bdb
1.0M    /var/db/openldap-data/objectClass.bdb
608K    /var/db/openldap-data/sn.bdb
5.8M    /var/db/openldap-data/telephoneNumber.bdb
640K    /var/db/openldap-data/title.bdb
608K    /var/db/openldap-data/uid.bdb
50M    total

db_stat -d dn2id and id2entry:
dn2id: 7 internal pages + 251 leaf pages, 4KB page size: 4 * 7 = 28KB
id2entry: 4 internal pages + 2223 leaf pages, 16KB page size: 4 * 16 = 64KB

Indexes: ~70 internal pages, 4KB page size: 70 * 4 = 280KB

Minimum cache: 1MB
Perfect cache: 60MB

Entries 200.000

3:45pm  /var/db/openldap-data # du -c -h *.bdb
2.1M    businessCategory.bdb
5.7M    cn.bdb
5.6M    description.bdb
18M    dn2id.bdb
1.3M    eduPersonAffiliation.bdb
4.6M    eduPersonOrgUnitDN.bdb
8.0K    eduPersonPrimaryOrgUnitDN.bdb
5.7M    employeeNumber.bdb
4.3M    entryCSN.bdb
5.6M    entryUUID.bdb
5.7M    givenName.bdb
362M    id2entry.bdb
3.2M    objectClass.bdb
5.6M    sn.bdb
36M    telephoneNumber.bdb
2.6M    title.bdb
5.6M    uid.bdb
474M    total

db_stat -d dn2id and id2entry
dn2id: 40 internal pages + 2589 leaf pages: 40 * 4 = 160KB
id2entry: 27 internal pages + 23112 leaf pages: 27 * 16 = 432KB

Indexes: ~640 internal pages: 640 * 4 = 2560KB
Minimum cache: ~6MB
Perfect cache: 500MB (too much!)
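The internal-page arithmetic for both databases can be reproduced with a few lines of Python. This is only a sketch: the page counts are the db_stat -d figures quoted above, and the function name is mine.

```python
def internal_pages_kb(dn2id_pages, id2entry_pages, index_pages,
                      fs_page_kb=4, id2entry_page_kb=16):
    """Sum the space taken by B-Tree internal pages, in KB."""
    return (dn2id_pages * fs_page_kb
            + id2entry_pages * id2entry_page_kb
            + index_pages * fs_page_kb)


small_db = internal_pages_kb(7, 4, 70)      # 28 + 64 + 280 = 372 KB
large_db = internal_pages_kb(40, 27, 640)   # 160 + 432 + 2560 = 3152 KB

print(f"20,000 entries:  {small_db} KB of internal pages")
print(f"200,000 entries: {large_db} KB of internal pages")
```

The minimum cache figures I quote (1MB and ~6MB) sit above these raw sums, which leaves some headroom over the internal pages alone.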

As you can see in the 200K database the perfect cache is quite large (and there’s not enough system memory to keep it), while the minimum cache is still tiny. Here are some bare minimum benchmark results:

Entries: 20000

settings:
cachesize: 60MB
entry cache: 500 entries
index cache: 1500 entries

Memory footprint: 128MB / resident: 66MB

7:18pm  /usr/local/etc/openldap/data # time slapadd -b dc=ntua,dc=gr -w < 20000.users
36.122u 3.330s 1:41.86 38.7%    1383+1695k 0+82583io 0pf+0w

7:23pm  /usr/local/etc/openldap/data # time slapcat -b dc=ntua,dc=gr -l /dev/null
1.644u 0.059s 0:01.70 99.4%     1381+5167k 0+0io 0pf+0w

7:26pm  /usr/local/etc/openldap/data # time ldapsearch -h localhost -D 'cn=manager,dc=ntua,dc=gr' -x -w manager -b dc=ntua,dc=gr 'objectclass=*' dn > /dev/null
0.635u 0.137s 0:03.46 21.9%     84+330k 0+0io 0pf+0w

Increase cache:
entry cache: 20000 entries
index cache: 60000 entries

Memory footprint: 214MB / resident: 160MB

7:30pm  /usr/local/etc/openldap # time ldapsearch -h localhost -D 'cn=manager,dc=ntua,dc=gr' -x -w manager -b dc=ntua,dc=gr 'objectclass=*' dn > /dev/null
0.508u 0.093s 0:01.70 34.7%     85+334k 0+0io 0pf+0w

Drop cachesize to 1MB:

Memory footprint: 47MB / resident: 16MB

7:36pm  /usr/local/etc/openldap # time slapcat -b dc=ntua,dc=gr -l /dev/null
1.675u 0.067s 0:01.74 99.4%     1379+5144k 0+0io 0pf+0w
7:37pm  /usr/local/etc/openldap # time ldapsearch -h localhost -D 'cn=manager,dc=ntua,dc=gr' -x -w manager -b dc=ntua,dc=gr 'objectclass=*' dn > /dev/null
0.590u 0.230s 0:03.62 22.6%     78+307k 0+0io 0pf+0w

Entries: 200000

settings:
cachesize: 100MB
entry cache: 500 entries
index cache: 1500 entries

3:30pm  /usr/local/etc/openldap/data # time slapadd -b dc=ntua,dc=gr -w <200000.users
394.239u 33.645s 18:23.79 38.7% 1367+4726k 2562+773886io 16pf+0w

11:52am  /var/db/openldap-data # time slapcat -b dc=ntua,dc=gr -l /dev/null
17.808u 1.260s 0:19.11 99.7%    1365+39492k 0+0io 0pf+0w

Drop cachesize to 6MB:
11:48am  /var/db/openldap-data # time slapcat -b dc=ntua,dc=gr -l /dev/null
17.562u 1.109s 0:18.73 99.6%    1365+39867k 0+0io 0pf+0w

It’s very interesting that the slapcat time stays mostly the same as long as the cachesize is at least the calculated bare minimum, while increasing the entry cache to hold all entries makes ldapsearch run in the same time as slapcat (a nice demonstration that OpenLDAP’s operation and network processing overhead is minimal).
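To compare the runs side by side, the elapsed field of the shell’s time output can be converted to seconds. The Python snippet below is a small sketch that assumes the csh/tcsh line format shown in the transcripts (user, system, elapsed, CPU percentage); the timing strings are copied from the 20,000-entry runs above.

```python
def elapsed_seconds(time_line):
    """Return the elapsed (third) field of a csh/tcsh `time` line in seconds."""
    elapsed = time_line.split()[2]       # e.g. '0:03.46' or '18:23.79'
    seconds = 0.0
    for part in elapsed.split(":"):      # handles m:ss and h:mm:ss alike
        seconds = seconds * 60 + float(part)
    return seconds


runs = {
    "ldapsearch, entry cache 500":   "0.635u 0.137s 0:03.46 21.9%",
    "ldapsearch, entry cache 20000": "0.508u 0.093s 0:01.70 34.7%",
    "slapcat":                       "1.644u 0.059s 0:01.70 99.4%",
}

for label, line in runs.items():
    print(f"{label}: {elapsed_seconds(line):.2f}s")
```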

Replication

Replication (through SyncRepl) is quite easy to setup and maintain and runs like a charm.

Things I’d like to see

I’d really like to see an implementation of Class of Service (as in Sun Java DS) in OpenLDAP for dynamically creating attribute values. I’ve found this feature lacking, especially if you are running a Shibboleth Single Sign-On service. In Shibboleth there are many attributes which could use such a feature, like edupersonprincipalname (usually equal to <uid>@<domain>) and edupersonentitlement (which usually holds the same attribute value for a large user set). Class of Service is also a very elegant way to minimize your database size, thus allowing a larger database to be stored on a lighter (and cheaper) server, as long as you don’t perform searches based on the dynamically created attributes.
