Using MongoDB as a queue

For some reason, I was very curious about NoSQL technologies when they started gaining hype against traditional RDBMSs.

Maybe that was because I firmly believed that RDBMSs simply excel at certain tasks. I am talking, of course, about tasks where ACID properties are crucial, or where transactions are a must.

But all those fanatics trying to prove that MongoDB is a high performance “web scale” database got me curious.

So I started playing with it.

When I tried to use it for “web scale” or “high availability” stuff, frankly, it failed miserably and made my life a nightmare. But throughout my experiment, I realised that it had some potential.

It could insert data very fast. Of course, this is because you could be inserting rubbish as far as integrity is concerned, or simply because what you think you inserted was never written to disk and would be lost if the system crashed.

But if you don’t mind that too much, it could be perfect. I know that this begs for the answer “pipe your data to /dev/null”, but honestly, it is not that bad in the end.

So the way I ended up using it is as a non-critical, very versatile and easy-to-use queue system. Performance-wise, it does very well. Convenience-wise, you don’t have a schema, so you can add almost anything, in any format, as long as your code has cases to handle it.

One of these queues records every single request made to Niume by visitors. Those requests come from three different front-end servers behind a load balancer. Niume is a website with more than one million unique visitors per month, so we are talking about a lot of requests.

Despite the load (it can exceed 25 requests per second), it performs really well, whilst running on a modest virtual machine that also runs other services.

Of course, it only acts as a queue: the raw requests are written to it, and another application sorts out the data and inserts it into a massive MariaDB database. This happens asynchronously (hence the need for a queue), so that it does not become a bottleneck during traffic spikes.
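The consumer side of such a queue can be sketched roughly like this in Python. This is not the actual Niume code, just a minimal illustration under a few assumptions: `queue` is a PyMongo-style collection (only `find` and `delete_many` are used), `store_rows` is a hypothetical callable that sorts out the data and writes it to the relational database, and there is a single worker, so nobody else touches the same documents.

```python
def drain_batch(queue, store_rows, batch_size=500):
    """Move up to batch_size raw requests out of the queue.

    queue      -- PyMongo-style collection holding the raw request documents
    store_rows -- hypothetical callable that inserts the documents into MariaDB
    Returns the number of documents moved.
    """
    # Oldest first: a default ObjectId _id sorts roughly by insertion time.
    docs = list(queue.find({}, sort=[("_id", 1)], limit=batch_size))
    if not docs:
        return 0

    store_rows(docs)

    # Delete only after the store step returned without raising, so a
    # failure leaves the batch in the queue for the next run.
    queue.delete_many({"_id": {"$in": [d["_id"] for d in docs]}})
    return len(docs)
```

A worker would simply call `drain_batch` in a loop and sleep for a bit whenever it returns 0. The front-end servers never wait for any of this; they only insert into the queue.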

But this is not the only queue running on this machine, in the same database.

There is also a queue for sending event messages to the back end, and another one, which comes in really handy, for emails.

All emails are queued in MongoDB and then sent asynchronously, according to their priority. Losing an email is fine; sending it twice is a more serious issue. That is why a MariaDB database is used to keep a log of all sent emails for a couple of days.
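To give an idea of how this fits together, here is a minimal Python sketch. It is not the production code. It assumes `emails` is a PyMongo-style collection and that each queued document has `status`, `priority` and `dedupe_key` fields (all hypothetical names). The MariaDB log is stood in for by a plain `set` called `sent_log`, and `send` is whatever callable actually delivers the email.

```python
from datetime import datetime, timezone


def claim_next_email(emails):
    """Atomically claim the highest-priority queued email, oldest first."""
    return emails.find_one_and_update(
        {"status": "queued"},
        {"$set": {"status": "sending",
                  "claimed_at": datetime.now(timezone.utc)}},
        sort=[("priority", -1), ("_id", 1)],
    )


def process_one(emails, sent_log, send):
    """Handle one queued email; return False when the queue is empty."""
    doc = claim_next_email(emails)
    if doc is None:
        return False

    key = doc["dedupe_key"]
    if key not in sent_log:
        # Log first, send second: if we crash in between, the email is
        # lost rather than sent twice, which is the lesser evil here.
        sent_log.add(key)
        send(doc)

    emails.delete_one({"_id": doc["_id"]})
    return True
```

The claim goes through `find_one_and_update`, so two workers cannot pick up the same document. The sent log covers the remaining case, where the same email ends up in the queue more than once.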

So, to sum up, my favourite features of MongoDB are the quick writes of non-critical data, the fact that the default _id field (an ObjectId) embeds the timestamp of when the document was created, and the flexibility of adding fields by only changing the code, without a full schema refactoring as in a traditional RDBMS.
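The timestamp part is easy to see for yourself. A default ObjectId is 12 bytes, and the first four are the creation time as big-endian Unix seconds. The helper below is my own illustration (the function name is made up) and works on the 24-character hex form of the id. If you use PyMongo, its `ObjectId` class gives you the same value as `generation_time`.

```python
from datetime import datetime, timezone


def objectid_created_at(object_id_hex):
    """Return the creation time embedded in a 24-character hex ObjectId."""
    if len(object_id_hex) != 24:
        raise ValueError("an ObjectId is 12 bytes, i.e. 24 hex characters")
    # First 4 bytes (8 hex characters): seconds since the Unix epoch.
    seconds = int(object_id_hex[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


print(objectid_created_at("507f1f77bcf86cd799439011").isoformat())
# prints 2012-10-17T21:13:27+00:00
```

Keep in mind that the resolution is one second, and that the id is normally generated by the driver on the client, so the timestamp reflects the client's clock.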

MongoDB is a good tool for certain tasks. But using it for critical data sounds like a disaster waiting to happen to me. Then again, really hating it for not being ACID is moronic as well, because it was never designed to replace MySQL. That is just marketing BS. It is a product in its own right.

It pisses me off when developers become “religious” about languages, databases, frameworks, or whatever. I just consider them tools, and depending on the job, you pick the most suitable one you can find.

