FreeRADIUS 2.0 has been released after a long and productive development cycle. It’s much more scalable, faster and simpler, while providing even more powerful features such as a policy language, virtual hosting and IPv6 support. More information is available on the FreeRADIUS website.
In my last post I went through some tips on RADIUS performance. This time I’ll look into issues with RADIUS server redundancy, focusing on accounting redundancy. You can easily use things like LVS, VRRP, LDAP and SQL replication to achieve RADIUS server and authentication database redundancy. Having a redundant accounting database is a bit trickier. The reasons:
- Accounting data is several times larger than authentication data and is constantly growing and changing (whereas the authentication database is pretty much static).
- Regardless of its size, you still want accounting to be synchronized between all your redundant RADIUS servers so that you can perform network-wide double login detection and IP and resource management.
So what I would suggest is the following:
- Keep the live accounting table (the one holding online users, already discussed in my post on performance), the nas table (holding the RADIUS server clients) and the IP pool table (used by rlm_sqlippool) in a replicated SQL database. MySQL does not support multi-master replication (it would be very difficult to do that anyway), so you must make sure to:
- Send all SQL queries to a single virtual SQL server IP.
- Configure high availability between all replicated MySQL servers so that only one working server holds the virtual IP at any time.
- Don’t use modules like rlm_ippool which depend on local files for storage and therefore cannot be easily replicated.
- You could also keep the radacct table replicated and only log to the virtual IP, but I would advise against it. You can usually tolerate a small delay in accounting synchronization between your RADIUS servers (as long as you keep information like currently logged-in users in a replicated table). Just imagine the scenario where replication fails and you have to manually synchronize all MySQL servers with a multimillion-row radacct table. Nightmare! Instead, just log to a detail file and have radrelay (or, in recent versions, the RADIUS server in radrelay mode) send accounting to the rest of the redundant servers. You will need one detail file for each redundant server, so this solution will not scale beyond a couple of RADIUS servers. If you are looking for a larger multi-server setup, you will need to use an SQL server cluster instead. The detail file mechanism will also keep things like rlm_counter synchronized.
The above suggestions assume that you have a redundant permanent store on each RADIUS server (RAID-1/5). If you permanently lose data stored on a disk, no replication strategy can provide a 100% guarantee. The SQL (or RADIUS) server first stores data locally, acknowledges it to the client and only afterwards replicates it to the rest of the servers. If this data is lost before it is replicated, it is lost forever.
Here are a few basic guidelines on things to keep an eye on when setting up a large-scale RADIUS service (most are FreeRADIUS specific):
First of all, when you see performance problems with your RADIUS service always, ALWAYS blame the database first. I’ve seen FreeRADIUS crash on me and leak memory (in the early days), but the server was never the actual bottleneck. As for more precise tips:
- Create an in-memory (HEAP) table to hold ‘live’ accounting. By live I mean online sessions. You perform an INSERT on Accounting-Start and a DELETE on Accounting-Stop. That way you have a really fast and small table which you use for all RADIUS operations as well as for double login detection. If you need to retain sessions across SQL/server restarts you could just create a normal table (instead of HEAP) and still see big performance gains compared with a large (and always growing) radacct table.
- You obviously still need to keep historical accounting. You can do that near-online by using the detail file/radrelay mechanism. In the accounting section, also log to a detail file and have a separate RADIUS process read through it and log to a radacct table. That way your main RADIUS server only has to perform writes to the detail file (which should normally always succeed and be fast enough) and only the secondary RADIUS server (the radrelay process) has to deal with any problems with the radacct table. As a result you can perform housekeeping operations (deleting old entries, extracting statistics) on the main accounting table (radacct) without disturbing your actual RADIUS service (one of the most frequent problems I’ve had in my own installations). You can also give this radrelay process additional sql modules to create online statistics.
- Alternatively, you can just use the sql_log module, log the actual SQL queries and run them through radsqlrelay instead of using radrelay, if that suits your needs.
- If your database supports views you can do the following: only add entries to the radacct table on stop packets and create a combined view of the liveacct and radacct tables. That way you only have to do one SQL query per session on the radacct table instead of two and still keep a full accounting overview.
- Possibly the most frequent problem I’ve faced was dealing with double logins. Normally, FreeRADIUS uses an external process (checkrad) which has to be executed on every possible double login in order to query the access server and determine whether the user is actually already logged on or the session is stale. The problem is that this process is time consuming (depending on how long it takes for the access server to respond), involves numerous process creations and is usually a source of big headaches (involving waitpid() calls from the RADIUS server to wait for the checkrad process). You can eliminate all this by moving double login detection offline. The RADIUS server should just trust the accounting database and immediately reject any already logged-on user. The important addition is that, for each such reject, it also logs a corresponding entry in an in-memory double-login table. That table can be checked by an outside process running every minute. This process can then call checkrad in order to determine whether the user is actually online or not. If the session is found to be stale, a corresponding fake accounting stop can be sent to the RADIUS service so that the session is cleared. That way double login detection (in the RADIUS process) always takes a predictable amount of time to complete and does not depend on access server response and process creation time.
- Another thing that can help in keeping your online accounting table current with the actual sessions is to use accounting updates. Set up a large accounting-update interval on your access servers (for instance, 4 hours) and add a column to the liveacct table called AcctUpdateTime. On each accounting update, set the column to the time of the packet. Create an index on this column and have a separate process run every hour and scan for entries whose last update is older than twice the accounting-update interval plus a few minutes. These entries are stale, and after checking with the access server you can just send a fake accounting stop so that they get closed in the accounting tables.
- I’ve had a few cases where the access server would send an accounting start multiple times and that would end up being inserted multiple times in the accounting tables. In order to avoid this scenario you can do the following: use the acct_unique module to create a unique id based on the accounting packet attributes. Create a unique key constraint on the AcctUniqueId column so that any INSERT for the same unique id will fail and fall back to the accounting_start_query_alt, which is just an update.
- Try using modules like rlm_perl or rlm_python instead of calling outside scripts. That way you don’t have to wait for process initialization or depend on error-prone functions like waitpid() in a threaded application. I’ve had a client achieve a response time of a few ms with an rlm_perl script which had to perform various SQL queries and processing (the client was running a VoIP application, so the total response time was a very important factor).
- If you observe SQL query slowdowns, 99% of the time the culprit is your indexing. Run EXPLAIN SELECT on your queries and add any needed indexes.
- When performing large deletes (like deleting old records from the accounting table) I’ve found that it’s always better (at least in MySQL) to take an explicit write lock on the table (LOCK TABLES with the WRITE option) around the DELETE.
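The live accounting ideas from the list above (INSERT on start, DELETE on stop, a unique key that turns a duplicate start into an update, and the hourly stale-session scan) can be sketched as follows. This is only an illustration under assumptions: Python's sqlite3 in-memory database stands in for the MySQL HEAP table, the function names are made up for this sketch, and only liveacct, AcctUniqueId and AcctUpdateTime come from the text above. In a real setup the equivalent queries would live in the FreeRADIUS sql module configuration.

```python
import sqlite3


def open_db():
    """Create the in-memory live accounting table (stand-in for a HEAP table)."""
    db = sqlite3.connect(":memory:")
    db.execute("""
        CREATE TABLE liveacct (
            AcctUniqueId   TEXT PRIMARY KEY,  -- unique key rejects duplicate starts
            UserName       TEXT NOT NULL,
            AcctStartTime  INTEGER NOT NULL,  -- Unix timestamps, for simplicity
            AcctUpdateTime INTEGER NOT NULL
        )""")
    # Indexes for the double login check and the stale-session scan.
    db.execute("CREATE INDEX liveacct_user ON liveacct (UserName)")
    db.execute("CREATE INDEX liveacct_update ON liveacct (AcctUpdateTime)")
    return db


def on_start(db, unique_id, user, now):
    """Accounting-Start: INSERT, falling back to an UPDATE on a duplicate start."""
    try:
        with db:
            db.execute("INSERT INTO liveacct VALUES (?, ?, ?, ?)",
                       (unique_id, user, now, now))
    except sqlite3.IntegrityError:
        # Same role as the alternate start query: the row already exists.
        on_update(db, unique_id, now)


def on_update(db, unique_id, now):
    """Accounting update: remember when we last heard about this session."""
    with db:
        db.execute("UPDATE liveacct SET AcctUpdateTime = ? WHERE AcctUniqueId = ?",
                   (now, unique_id))


def on_stop(db, unique_id):
    """Accounting-Stop: the session is no longer live."""
    with db:
        db.execute("DELETE FROM liveacct WHERE AcctUniqueId = ?", (unique_id,))


def is_logged_in(db, user):
    """Double login check that trusts the live table only."""
    row = db.execute("SELECT 1 FROM liveacct WHERE UserName = ? LIMIT 1",
                     (user,)).fetchone()
    return row is not None


def stale_sessions(db, now, update_interval, slack=300):
    """Sessions not updated for more than 2 * interval + slack seconds."""
    cutoff = now - (2 * update_interval + slack)
    rows = db.execute(
        "SELECT AcctUniqueId FROM liveacct WHERE AcctUpdateTime < ? "
        "ORDER BY AcctUniqueId", (cutoff,))
    return [r[0] for r in rows]
```

The periodic cleanup job would call stale_sessions() with the accounting-update interval configured on the access servers (4 hours in the example above), verify each candidate with the access server and then send a fake accounting stop for the ones that are really gone.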
Most of you have already heard of RADIUS and many of you use it in your infrastructure (usually to provide wi-fi or dialup/DSL access). Have you ever wondered what the main difference is between RADIUS and user authentication databases like LDAP (and also what they have in common)? Here are a few points:
- LDAP and RADIUS have something in common. They’re both mainly a protocol (more than a database) which uses attributes to carry information back and forth. They’re clearly defined in RFC documents, so you can expect products from different vendors to work properly together.
- RADIUS is NOT a database. It’s a protocol for asking intelligent questions of a user database. LDAP is just a database. In recent offerings it contains a bit of intelligence (like Roles, Class of Service and so on) but it is still mainly just a rather stupid database. RADIUS (actually, RADIUS servers like FreeRADIUS) gives the administrator the tools not only to perform user authentication but also to authorize users based on extremely complex checks and logic. For instance, you can allow access on a specific NAS only if the user belongs to a certain category, is a member of a specific group and an outside script allows access. There’s no way to make such complex decisions in a user database alone.
- RADIUS also includes accounting. That means that you can use accounting history when making authorization decisions and get functionality like quotas (a user is only allowed 4 hours of dialup access per day regardless of how many times they connect).
- With the introduction of Extensible Authentication Protocol (EAP) you can use almost any authentication protocol known to man 🙂
- RADIUS is extensible. You can easily extend the RADIUS schema with attributes of your choice (as long as you have a Vendor number). RADIUS servers are extensible too. You can use almost any backend for authentication and accounting (LDAP, SQL, password files, outside scripts). The same holds for the LDAP protocol (one of the major factors in its popularity) and for LDAP servers, although they don’t come close to the level of extensibility offered by RADIUS servers.
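To make the quota point above concrete, here is a minimal sketch of an authorization decision that uses accounting history. Everything in it is an assumption for illustration: the session records are plain dictionaries with hypothetical keys (user, day, session_time), and the function names are made up. A real server would read this from its accounting tables (in FreeRADIUS, modules like rlm_counter cover this kind of check).

```python
DAILY_QUOTA_SECONDS = 4 * 3600  # the "4 hours per day" example from the text


def seconds_used(sessions, user, day):
    """Total connected time for one user on one day, across all sessions."""
    return sum(s["session_time"] for s in sessions
               if s["user"] == user and s["day"] == day)


def authorize(sessions, user, day, quota=DAILY_QUOTA_SECONDS):
    """Return ("accept", seconds_left) or ("reject", 0) based on accounting history."""
    remaining = quota - seconds_used(sessions, user, day)
    if remaining <= 0:
        return ("reject", 0)
    return ("accept", remaining)


# Hypothetical accounting history: alice connected twice today, bob once.
history = [
    {"user": "alice", "day": "2008-01-10", "session_time": 2 * 3600},
    {"user": "alice", "day": "2008-01-10", "session_time": 1 * 3600},
    {"user": "bob", "day": "2008-01-10", "session_time": 4 * 3600},
]
```

With this history, authorize(history, "alice", "2008-01-10") accepts with one hour left, while bob is rejected. The remaining time is the kind of value a server could hand back to the NAS as a session time limit, so that the user is disconnected when the quota runs out.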