Lately, I have been looking into MongoDB and other alternatives to traditional relational databases. MongoDB seems to be the best fit for the kinds of scenarios I am working with here at BancVue.
What I am looking for…
The project I am currently working on is using messaging to transfer large amounts of data from many remote source systems. This data then hydrates a master database that represents a consolidated view of all the data from those remote systems. This master system can then feed other systems the data they need. Based on this system, here is a list of the qualities I need:
Fast Inserts
Because there is little querying, but large amounts of inserts, they need to be fast.
Handle Large Data Volumes
The consolidated dataset can get very large. Therefore we need a system that can handle massive amounts of data.
Horizontally Scalable
Housing this much data requires us to scale to multiple machines. Ideally, the system would easily scale to multiple machines as needed (something SQL-based databases don't do very well).
Parallelized Queries
Since we do have a need to query the database to get the data out, it would be nice if the system could parallelize the queries so they would be more performant when scaled out.
Easy to Setup and Use
I hate the thought of working on something that takes a UNIX guru to set up (because I am not that guru), and I don't have a whole heck of a lot of time to devote to learning a new system at the moment.
MongoDB fulfills all these needs. It has a very fast insert speed, can scale to thousands of machines, can automatically shard the data across those machines to store large volumes of data, and can run map/reduce queries in parallel across those machines to produce results. It is also easier to get running than any database I have ever used.
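To make the sharding and map/reduce idea a bit more concrete, here is a rough sketch in plain JavaScript. It is a toy model and not how MongoDB implements either feature: the hash function, the choice of customerName as the shard key, and all function names are assumptions made up for illustration. The shards are processed one after another here, where a real cluster would work on them in parallel.

```javascript
// Illustrative only: a toy model of sharding plus map/reduce, not MongoDB's implementation.

// Hypothetical shard-key hash: sums the character codes of the key, modulo the shard count.
function shardFor(key, shardCount) {
  let hash = 0;
  for (const ch of String(key)) {
    hash = (hash + ch.charCodeAt(0)) % shardCount;
  }
  return hash;
}

// Distribute documents across shards by a chosen shard key field.
function shardDocuments(docs, shardKey, shardCount) {
  const shards = Array.from({ length: shardCount }, () => []);
  for (const doc of docs) {
    shards[shardFor(doc[shardKey], shardCount)].push(doc);
  }
  return shards;
}

// Run map and reduce on a single shard, producing partial results per emitted key.
function mapReduceShard(shard, mapFn, reduceFn) {
  const groups = new Map();
  for (const doc of shard) {
    const [key, value] = mapFn(doc);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  const partial = new Map();
  for (const [key, values] of groups) partial.set(key, reduceFn(key, values));
  return partial;
}

// Merge the per-shard partial results by reducing again.
// This only works when reduceFn is associative and commutative (a sum is).
function mapReduce(shards, mapFn, reduceFn) {
  const merged = new Map();
  for (const shard of shards) {
    for (const [key, value] of mapReduceShard(shard, mapFn, reduceFn)) {
      merged.set(key, merged.has(key) ? reduceFn(key, [merged.get(key), value]) : value);
    }
  }
  return merged;
}

// Hypothetical sample data.
const orders = [
  { customerName: "Bob Smith", orderAmount: 25 },
  { customerName: "Ann Jones", orderAmount: 10 },
  { customerName: "Bob Smith", orderAmount: 5 },
];
const shards = shardDocuments(orders, "customerName", 2);
const totals = mapReduce(
  shards,
  (doc) => [doc.customerName, doc.orderAmount],
  (key, values) => values.reduce((a, b) => a + b, 0)
);
console.log(totals.get("Bob Smith")); // 30
```

The point of the sketch is the shape of the work: each shard produces partial totals on its own, and only those small partial results need to be combined at the end.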
Setting up MongoDB
Setting up an instance of MongoDB could not be much easier. Simply follow the instructions on the MongoDB Quickstart page:
- Download the binaries and extract them.
- Create a folder for the data. (C:\data\db or /data/db depending on OS).
- Execute mongod.exe (or mongod on non-Windows systems) in the extracted bin folder.
Now you have a running instance of MongoDB!
Connecting to the MongoDB Instance
MongoDB comes with its own interactive shell. Run mongo.exe from the extracted bin folder to start it up. It will automatically connect to the instance we started on our local machine. (You can use command line arguments to connect to instances on other machines, of course.)
From here, you can interactively execute commands against the database.
Working in the Shell
MongoDB does not use SQL; instead, it uses JavaScript as its query language. This is all very well documented on the MongoDB website. Let's walk through a few commands to get you started. Lines that begin with > are the commands you type in; the other lines are the server's response.
Getting a list of databases
The show dbs command will display a list of all the databases on this server.
> show dbs
admin
local
mongo_session
test
>
This is the list of all the databases in your mongo server.
Switching to use a different database
To switch to another database you use the use command just like in SQL.
> use myorders
switched to db myorders
>
Notice that you may pass a new database name to the use command and it will work. However, MongoDB will not create the database until you actually insert something. You can see this by executing show dbs again: the myorders database does not show in the list.
Inserting data into a collection
MongoDB uses the concept of collections in the same way SQL uses tables. However, since MongoDB is a document database, it does not constrain all the objects in a collection to the same structure like tables do. Each object can have its own structure (or schema). Thus, MongoDB is called a "schema-less" database.
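As a small illustration of what "schema-less" means, here is a sketch in plain JavaScript. A plain array stands in for a collection, so nothing here is a MongoDB API, and the second document and the fieldNames helper are hypothetical, made up for the example.

```javascript
// A plain array stands in for a collection here; this is not a MongoDB API.
const orders = [
  { orderAmount: 25.0, customerName: "Bob Smith" },
  // Hypothetical second document: it has extra fields and lacks customerName.
  { orderAmount: 12.5, items: ["widget", "gadget"], rush: true },
];

// List the distinct field names used across all documents, sorted alphabetically.
function fieldNames(docs) {
  const names = new Set();
  for (const doc of docs) {
    for (const name of Object.keys(doc)) names.add(name);
  }
  return [...names].sort();
}

console.log(fieldNames(orders));
```

Both documents live side by side even though only orderAmount is common to them. A SQL table would need nullable columns or a separate table to hold the same data.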
Let's insert some data into a collection now.
> order1 = {orderAmount:25.00, customerName:"Bob Smith"};
{ "orderAmount" : 25, "customerName" : "Bob Smith" }
> db.orders.save(order1);
>
So what did we just do?
We created a variable called order1 and assigned it a JavaScript object written in JSON-style notation. (Did I mention it is all JavaScript based?) Then we inserted order1 into the orders collection.
Where did the orders collection come from?
It was created automatically by MongoDB. Likewise, the database was created at the same time. If you run show dbs again now, you will see the myorders database.
> show dbs
admin
local
mongo_session
myorders
test
>
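The create-on-first-insert behavior seen in the session above can be modeled with a few lines of plain JavaScript. This is only a toy: the ToyServer class and its method names are hypothetical and say nothing about how MongoDB does this internally.

```javascript
// Toy in-memory model of lazy creation; the class and method names are hypothetical.
class ToyServer {
  constructor() {
    // database name -> Map of collection name -> array of documents
    this.databases = new Map();
    this.current = "test";
  }

  use(name) {
    // Switching only records the name; nothing is created yet.
    this.current = name;
  }

  save(collectionName, doc) {
    // The database and the collection come into existence on the first write.
    if (!this.databases.has(this.current)) {
      this.databases.set(this.current, new Map());
    }
    const db = this.databases.get(this.current);
    if (!db.has(collectionName)) db.set(collectionName, []);
    db.get(collectionName).push(doc);
  }

  showDbs() {
    return [...this.databases.keys()].sort();
  }
}

const server = new ToyServer();
server.use("myorders");
const before = server.showDbs(); // myorders is not listed yet
server.save("orders", { orderAmount: 25.0, customerName: "Bob Smith" });
const after = server.showDbs(); // myorders is now listed
console.log(before, after);
```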
Querying the database
Let's now look at what was inserted. First, we'll look at all the records in the collection. We do this by calling the find() function on the collection with no arguments.
> db.orders.find();
{ "_id" : ObjectId("4bd…"), "orderAmount" : 25, "customerName" : "Bob Smith" }
>
You will notice the introduction of a new field called _id. This is an autogenerated id that serves as the primary key. You can override this key if you like, but that is beyond the scope of this test drive.
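For the curious, here is a rough plain JavaScript model of how save treats the _id field, based on my understanding that the shell's save inserts a document when its _id is new and replaces the stored document when the _id already exists. The counter-based generated key is a stand-in and not the real ObjectId algorithm, and the order-1001 key is a made-up example.

```javascript
// Toy model of how save treats _id; the generated key is NOT a real ObjectId.
let nextId = 1;

function save(collection, doc) {
  // Assign a generated key only when the caller did not supply one.
  if (doc._id === undefined) doc._id = "generated-" + nextId++;

  const index = collection.findIndex((existing) => existing._id === doc._id);
  if (index === -1) {
    collection.push(doc); // new key: insert
  } else {
    collection[index] = doc; // existing key: replace the stored document
  }
  return doc;
}

const orders = [];
save(orders, { orderAmount: 25.0, customerName: "Bob Smith" }); // gets a generated _id
save(orders, { _id: "order-1001", orderAmount: 10.0 }); // caller-chosen _id is kept
save(orders, { _id: "order-1001", orderAmount: 12.0 }); // same _id replaces, no duplicate
console.log(orders.map((order) => order._id));
```

After the three calls the collection holds two documents, and the one keyed order-1001 carries the later amount.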
This is the simplest kind of query. We can also query by giving a prototype object to match on:
> db.orders.find( { orderAmount: 25 } );
{ "_id" : ObjectId("4bd…"), "orderAmount" : 25, "customerName" : "Bob Smith" }
>
Notice we got the same result: it found the one record. If you supply an object to the find method, it will search for all items that match all the fields provided.
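The "match all the fields provided" rule can be sketched in a few lines of plain JavaScript. This is a simplification that only compares top-level fields with strict equality; MongoDB's real matching does much more. The matches and find functions and the extra sample documents are hypothetical.

```javascript
// Simplified matching: top-level fields compared with strict equality only.
function matches(doc, query) {
  return Object.keys(query).every((field) => doc[field] === query[field]);
}

// An empty query object matches every document, like find() with no arguments.
function find(collection, query = {}) {
  return collection.filter((doc) => matches(doc, query));
}

const orders = [
  { orderAmount: 25, customerName: "Bob Smith" },
  { orderAmount: 25, customerName: "Ann Jones" }, // hypothetical extra documents
  { orderAmount: 40, customerName: "Bob Smith" },
];

console.log(find(orders).length); // 3
console.log(find(orders, { orderAmount: 25 }).length); // 2
console.log(find(orders, { orderAmount: 25, customerName: "Bob Smith" }).length); // 1
```

Each field added to the prototype object narrows the result, because a document has to agree with every field in it.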
Conclusion
Well, I hope you found this test drive useful. Check out the documentation for more information.
I am impressed with the ease of setup and use of MongoDB so far. Next time, we'll look at accessing MongoDB through C#.
-Chris