In my last post i went through some tips on RADIUS performance. This time i ‘ll look into issues with RADIUS server redundancy. I ‘ll focus on accounting redundancy. You can easily use things like LVS, VRRP, LDAP and SQL replication to achieve radius server and authentication database redundancy. Having a redundant accounting database is a bit trickier. The reason:

  • Accounting is several times larger than authentication information and is ever growing and changing (instead of pretty much static data in the authentication database).
  • Regardless of it’s size you still want accounting to be synchronized between all your redundant radius servers so that you can perform network wide double login detection, ip and resource management.

So what i would suggest to do is the following:

  • Keep the live accounting table (the one holding online users, already discussed in my post on performance), the nas table (holding the radius server clients) and the ip pool table (used by rlm_sqlippool) in a replicated sql database. MySQL does not support Multimaster replication (it would be very difficult to do that anyway) so you must make sure to:
    • Perform all sql queries to a single Virtual SQL server IP
    • Configure High Availability between all replicated MySQL servers so that only one, working server holds the virtual IP at any time.
  • Don’t use modules like rlm_ippool which depend on local files for storage and therefore cannot be easily replicated.
  • You could also keep the radacct table replicated and only log to the Virtual IP but i would suggest against it. You can usually tolerate a small delay in accounting synchronization between your multiple radius servers (as long as you keep information like currently logged in users on a replicated table). Just imagine the scenario that replication fails and you have to manually synchronize all MySQL servers with a multimillion row radacct table. Nightmare! Instead just log to a detail file and have radrelay (or the radius server in radrelay mode in recent versions) send accounting to the rest of the redundant servers. You will need one detail file for each redundant server so this solution will not scale if you have more than a couple of radius servers. On the other hand, if you are looking for a multi-server setup, you will need to use an SQL server cluster. The above mechanism will also keep things like rlm_counter synchronized.

The above suggestions assume that you have a redundant permanent store on each radius server (raid-1/5). If you permanently lose data stored on a disk, no replication strategy can provide 100% guarantee. SQL (or RADIUS) server will first store data locally, acknowledge the fact to the client and afterwards replicate data to the rest of the servers. If this data is lost before it is replicated, it is lost forever.