0

I am building a web application that needs to be scalable. In a nutshell:

We have users, and users have friends, so each user has a friend list. Users can create messages, and messages from your friends are displayed on the homepage. Each message is linked to a location, and the messages can be filtered by date or location: for example, show all messages from my friends that were posted yesterday, or show all messages from location X.

I am currently building the application entirely in MongoDB, but I am running into trouble at the moment. For example:

  • On the main page we show the message list of the user's friends. No problem, we use:

$db->messages->find(array('users._id' => array('$in' => $userFriendListGoesHere)));

So then we have our messages. However, each message has a location, so I have to loop through all the messages and get the location from another collection. Multiple users can also be bound to a single message, so we also have to get all the user data from another collection. In MySQL this is simply a join query; in MongoDB it is 2 loops. This is my first question: is this a problem? Does this require a lot of resources, the looping?

So my idea is to split the data between MySQL and MongoDB: MongoDB to store all the locations (since there are over 350,000 locations and I use lat/long calculations), and MySQL for the messages, users, and friends of the users. So my second question: can you help me with my decision? Should I keep using MongoDB with the loops, or use a combination?

Thanks for reading and your time.

2 Answers

2

"... in MySQL simply a join query, in MongoDB 2 loops, and this is my first question: is this a problem?"

This is par for the course with MongoDB; in fact, it's a core MongoDB trade-off.

MongoDB is based on the precept that joins do not scale. So it has no joins and leaves you to "roll your own". Some libraries like Morphia (for Java) provide built-in logic for loading references.

PHP has the Doctrine project, which should help with some of this.

"Does this require a lot of resources, the looping?"

Kind of? This will really depend on implementation.

It's obviously going to involve a bunch of back and forth with the DB, but it may be less network traffic than the SQL version. You will need memory space for all of the data coming back. But again, that's not terribly different from SQL.

Really, it's up to you to make all of the trade-offs about how this is implemented and who is keeping what in memory.
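
One way to keep that back and forth down is to batch the lookups: collect the referenced ids from the messages, fetch them with a single `$in` query per collection, and stitch the results together in memory. The following is only a sketch under assumptions: `attachLocations`, the `location_id` field, and the `$fetchLocationsByIds` callable are hypothetical names, and the fetcher here is a stand-in over an in-memory array so the example runs without a database.

```php
<?php
// Hypothetical sketch: an application-side "join" that uses one batched
// lookup for the referenced collection instead of one query per message.

/**
 * Attach location documents to messages.
 *
 * $fetchLocationsByIds is an assumed callable that takes a list of ids and
 * returns the matching location documents. Against a real database it could
 * wrap a query such as find(array('_id' => array('$in' => $ids))).
 */
function attachLocations(array $messages, callable $fetchLocationsByIds): array
{
    // 1. Collect the distinct location ids referenced by the messages.
    $ids = array_values(array_unique(array_column($messages, 'location_id')));

    // 2. One round trip for all locations, then index them by id.
    //    The (string) cast is there so object-like ids can be array keys.
    $byId = array();
    foreach ($fetchLocationsByIds($ids) as $location) {
        $byId[(string) $location['_id']] = $location;
    }

    // 3. In-memory loop: no further database calls happen here.
    foreach ($messages as &$message) {
        $message['location'] = $byId[(string) $message['location_id']] ?? null;
    }
    unset($message);

    return $messages;
}

// Stand-in data and fetcher (assumptions, not real driver calls).
$locations = array(
    array('_id' => 10, 'name' => 'Amsterdam'),
    array('_id' => 20, 'name' => 'Utrecht'),
);
$queries = 0; // counts how many "round trips" the fetcher receives
$fetch = function (array $ids) use ($locations, &$queries): array {
    $queries++;
    return array_values(array_filter($locations, function ($l) use ($ids) {
        return in_array($l['_id'], $ids, true);
    }));
};

$messages = array(
    array('_id' => 1, 'text' => 'hi',  'location_id' => 10),
    array('_id' => 2, 'text' => 'yo',  'location_id' => 20),
    array('_id' => 3, 'text' => 'hey', 'location_id' => 10),
);

$joined = attachLocations($messages, $fetch);
```

With this shape the number of round trips depends on how many collections are referenced, not on how many messages came back; the loop itself is plain in-memory work. The same pattern would apply to the users bound to each message.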

2 Comments

In SQL, you make one query and stream N results. With this setup, you make one query, get a blob of N results, and make N further RPCs to look up extra data. That is almost certainly a latency and traffic hit compared with doing it locally in SQL, by the sheer number of extra round trips. That said, each of those lookups should be relatively cheap, and it may well be feasible to stream the data back anyhow (i.e. show the first N, then do an AJAX request to get the next N as you scroll, etc.), which may offset it.

If I join 4 tables, the server has to do 4 index lookups + data lookups + a bunch of in-memory for loops. If you do it in Mongo, you still do the 4 index lookups + data lookups + for loops. The SQL response will generally carry more data because of the repeats inherent in the join. The Mongo version will require more queries. The slowest part of the puzzle is the drives, but that cost looks about the same in both cases, so this becomes a question of 4 small queries vs 1 large query, or latency vs throughput. I'm honestly not sure which is going to be better.

0

"Should I keep using MongoDB with the loops?"

MongoDB is a great idea when your data is not inherently relational.

In the example you provided, it kinda seems like your data is relational. MySQL and other relational DBs (such as Postgres) are better data stores than MongoDB for relational data. This blog post covers this topic in more detail.

In summary, I'd recommend the following:

  1. Please spend some time analyzing whether your data is inherently relational or not.
  2. If it is not, then MongoDB can give you benefits over using MySQL.
  3. If it is relational, then MySQL is the better solution.
  4. Using both is, of course, possible - but it will create additional work & complexity for you. In the long term - is that worth the effort? Only you will know the answer.

Best of luck with your web app!
