Created on March 26, 2011
One of the most common questions for memcached, just after "how do I list all the keys?", is "how do I put my sessions in memcached?" The usual answer is somewhere between "please don't" and "here, use this library". Below I explain my position on the matter.
The usual session pattern:
* Blobs (250 bytes to 5k+)
* Read from datastore on every single page load
* Usually written to the datastore on every page load
* Reaped from the DB after inactivity
The usual memcached session pattern:
* Read from memcached on page load
* 'set' over the existing session with a new expiration time
* Cache misses mean user is logged out
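As a rough sketch of that memcached-only pattern, assuming a hypothetical in-memory `FakeCache` standing in for a real memcached client (the class, the `session:` key prefix, and the 30-minute `SESSION_TTL` are all illustrative assumptions, not any particular library's API):

```python
import time

SESSION_TTL = 1800  # seconds; assumed value for illustration


class FakeCache:
    """In-memory stand-in for a memcached client (hypothetical)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value, ttl):
        # Store the value together with its absolute expiry time.
        self._data[key] = (value, time.time() + ttl)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None or entry[1] < time.time():
            return None
        return entry[0]

    def flush(self):
        # Simulates a restart, an eviction, or a server leaving the cluster.
        self._data.clear()


def load_session(cache, session_id):
    key = "session:" + session_id
    session = cache.get(key)
    if session is None:
        return None  # cache miss: the user is logged out
    # 'set' over the existing session to push the expiration forward.
    cache.set(key, session, SESSION_TTL)
    return session
```

Calling `flush()` between two `load_session` calls shows the failure mode: the second call returns `None` and the user has to log in again.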
This pattern is dangerous to a memcached cluster. The simple design of memcached is a gift of flexibility. Once you rely on memcached to be the end stop for important user data, you lose half of the reason why it exists. Memcached will still be fast, but every other feature is nulled.
- Run your instances out of memory and people get logged out early, or can't log in at all
- Upgrading memcached, the OS, hardware, etc. now kicks people off
- Adding servers to the cluster, or removing them, now kicks people off
As your site grows, so will pressure to avoid maintenance on the cluster. Then you'll find a need for replication, for persisting memcached to disk, for dumping the keys for session analysis...
It's much easier with a simple memcached + RDBMS pattern (NoSQL works too in some cases). The goal is to use memcached to avoid reads to your datastore, as well as greatly reduce the amount of writing.
- When a user logs in, 'set' into memcached and write to the database.
- Add a field to your session noting the last time the session was sent to the database
- Every page load fetches from memcached first, database second
- Every N page loads or Y minutes, send another write to the datastore
- Pull expired sessions from the DB, fetching the latest data from memcached first
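The steps above can be sketched roughly as follows. This is a minimal illustration, not a finished library: plain dicts stand in for both memcached and the database table, the names `SessionStore`, `FLUSH_INTERVAL`, `last_db_write`, and `last_seen` are assumptions made up for the example, and only the "every Y minutes" trigger is shown (a page-load counter would work the same way).

```python
import time

FLUSH_INTERVAL = 300  # the "Y minutes" above, in seconds (assumed value)


class SessionStore:
    def __init__(self, cache, db, now=time.time):
        self.cache = cache  # dict standing in for memcached
        self.db = db        # dict standing in for a sessions table
        self.now = now      # injectable clock, handy for testing

    def login(self, sid, data):
        # On login, 'set' into the cache and write to the database.
        session = dict(data, last_db_write=self.now(), last_seen=self.now())
        self.cache[sid] = session
        self.db[sid] = dict(session)
        return session

    def load(self, sid):
        # Every page load fetches from the cache first, database second.
        session = self.cache.get(sid)
        if session is None:
            row = self.db.get(sid)
            if row is None:
                return None  # no such session anywhere
            session = dict(row)
        session["last_seen"] = self.now()
        # Only write back to the database once the interval has passed.
        if self.now() - session["last_db_write"] >= FLUSH_INTERVAL:
            session["last_db_write"] = self.now()
            self.db[sid] = dict(session)
        self.cache[sid] = session
        return session

    def reap(self, max_idle):
        # Pull expired sessions from the DB, preferring the cached copy
        # because it may be newer than the last database write.
        finalized = {}
        for sid in list(self.db):
            latest = self.cache.get(sid) or self.db[sid]
            if self.now() - latest["last_seen"] > max_idle:
                finalized[sid] = dict(latest)
                del self.db[sid]
                self.cache.pop(sid, None)
        return finalized
```

Clearing the `cache` dict in this sketch only costs one extra database read per session; nobody is logged out, and `reap` returns the finalized sessions for later analysis.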
Now you can upgrade everything, reboot everything, expand or contract your cluster, and even run analysis over finalized sessions. The hit to your database will be minuscule compared to not using memcached at all. Best of all, it's trivial to write.
It's been a while since my original post on this, and I've made it terse, but I'd still love to see more session libraries coming out using this pattern.