
Showing posts with label MongoDB. Show all posts

CouchDB vs MongoDB

After posting about Scott Motte’s comparison of MongoDB and CouchDB, I thought there should be some more informative sources out there, so I’ve started to dig.

The first one I came upon is an article about the Raindrop requirements, the issues faced while tackling them with CouchDB, and the pros and cons of possibly replacing CouchDB with MongoDB:

[Pros]
  • Uses update-in-place, so the file system impact and the need for compaction are lower, and storing our schemas in one document is likely to work better.
  • Queries are done at runtime. Some indexes are still helpful to set up ahead of time though.
  • Has a binary format for passing data around. One of the issues we have seen is the JSON encode/decode times as data passes around through couch and to our API layer. This may be improving though.
  • Uses language-specific drivers. While the simplicity of REST with CouchDB sounds nice, due to our data model, the megaview and now needing a server API layer means that querying the raw couch with REST calls is actually not that useful. The harder issue is trying to figure out the right queries to do and how to do the “joins” effectively in our API app code.
[Cons]
  • No easy master-master replication. However, for me personally, this is not so important. […] So while we need backups, we probably are fine with master-slave. To support the sometimes-offline case, I think it is more likely that using HTML5 local storage is the path there. But again, that is just my opinion.
  • ad-hoc query cost may still be too high. It is nice to be able to pass back a JavaScript function to do the query work. However, it is not clear how expensive that really is. On the other hand, at least it is a formalized query language — right now we are on the path to inventing our own with the server API with a “query language” made up of other API calls.
While some of the points above are generic, you should consider them from the perspective of the Raindrop requirements, which you can read more about here.

Another article comparing MongoDB and CouchDB is hosted in the MongoDB docs. I find it well balanced, and you should read all of it, as it covers a lot of different aspects: horizontal scalability, query expressions, atomicity, durability, MapReduce support, JavaScript, performance, etc.

I’d also mention this benchmark comparing the performance of MongoDB, CouchDB, and Tokyo Cabinet/Tyrant, with MySQL results used as a reference. (Note: the author of the benchmark categorizes Tokyo Cabinet as a document database, while it is actually a key-value store.)

If you have other resources that you think would be worth including, do not hesitate to send them over.

How to: Translate SQL to MongoDB MapReduce

I keep hearing people complain that MapReduce is not as easy as SQL. But there are others saying SQL is not easy to grok. I’ll keep myself away from this possible flame war and just point you to this SQL to MongoDB translation PDF put together by Rick Osborne, and also to his post providing some more details.



As regards the SQL and MapReduce comparison, here’s what Rick has to say:
It seems kindof silly to go through all this, right? SQL does all of this, but with much less complexity. However, this approach has some huge advantages over SQL:
  1. Programmers who don’t know SQL or relational theory may find it easier to understand and get using quickly. (Newbies especially, such as my students.)
  2. The map and reduce functions can be heavily parallelized on commodity hardware.
It’s really that second one that is the key.
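To make that second point a bit more concrete, here is a minimal sketch in plain Python (not MongoDB's actual API, and not taken from Rick's PDF) of how something like `SELECT customer, SUM(amount) FROM orders GROUP BY customer` could be expressed as a map step and a reduce step. The `orders` data and all function names are made up for illustration. The thing to notice is that the reduce function can be applied to partial results, which is what allows the work to be split across machines.

```python
from collections import defaultdict

# Hypothetical sample documents standing in for an "orders" collection.
orders = [
    {"customer": "alice", "amount": 30},
    {"customer": "bob", "amount": 15},
    {"customer": "alice", "amount": 20},
    {"customer": "carol", "amount": 5},
    {"customer": "bob", "amount": 10},
]

def map_order(doc):
    """Map step: emit one (key, value) pair per document (the GROUP BY key)."""
    yield doc["customer"], doc["amount"]

def reduce_amounts(key, values):
    """Reduce step: combine all values for one key (the SUM aggregate)."""
    return sum(values)

def map_reduce(docs, map_fn, reduce_fn):
    """Run map over every document, group emitted values by key, then reduce."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}

def merge_partials(partials, reduce_fn):
    """Re-reduce results computed independently on separate chunks."""
    groups = defaultdict(list)
    for partial in partials:
        for key, value in partial.items():
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}

# Single pass over all the data.
totals = map_reduce(orders, map_order, reduce_amounts)

# Same job split into two chunks, as if each chunk lived on a different machine.
chunks = [orders[:2], orders[2:]]
partials = [map_reduce(chunk, map_order, reduce_amounts) for chunk in chunks]
sharded_totals = merge_partials(partials, reduce_amounts)

print(totals)
print(sharded_totals)
```

Both runs produce the same totals per customer. That only works because summing is associative, so the reduce function gives the same answer whether it sees raw values or already-reduced partial sums.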

I’d also like to share something that I’ve learned lately: SQL parallel execution is supported in different forms by some RDBMSs. So at the end of the day, it will probably become just a matter of what better fits the problem and your team.