[mysql] Best way to store chat messages in a database?

I'm building a chat app and I want a full history off all messages ever sent in the chat conversation. At the moment I am storing each message as a single row in a table called 'messages'. I am aware that this table could grow huge as even small messages like 'Hi' would have their own database record.

Can anyone recommend a more scalable mysql solution? I don't require the individual messages to be searchable, editable or deletable. Could the whole conversation be stored in one huge field?

Would love to hear your ideas!

This question is related to mysql scalability chat

The answer is


There's nothing wrong with saving the whole history in the database, they are prepared for that kind of tasks.

Actually you can find here in Stack Overflow a link to an example schema for a chat: example

If you are still worried for the size, you could apply some optimizations to group messages, like adding a buffer to your application that you only push after some time (like 1 minute or so); that way you would avoid having only 1 line messages


If you can avoid the need for concurrent writes to a single file, it sounds like you do not need a database to store the chat messages.

Just append the conversation to a text file (1 file per user\conversation). and have a directory/ file structure

Here's a simplified view of the file structure:

chat-1-bob.txt
        201101011029, hi
        201101011030, fine thanks.

chat-1-jen.txt
        201101011030, how are you?
        201101011035, have you spoken to bill recently?

chat-2-bob.txt
        201101021200, hi
        201101021222, about 12:22
chat-2-bill.txt
        201101021201, Hey Bob,
        201101021203, what time do you call this?

You would then only need to store the userid, conversation id (guid ?) & a reference to the file name.

I think you will find it hard to get a more simple scaleable solution.

You can use LOAD_FILE to get the data too see: http://dev.mysql.com/doc/refman/5.0/en/string-functions.html

If you have a requirement to rebuild a conversation you will need to put a value (date time) alongside your sent chat message (in the file) to allow you to merge & sort the files, but at this point it is probably a good idea to consider using a database.


You could create a database for x conversations which contains all messages of these conversations. This would allow you to add a new Database (or server) each time x exceeds. X is the number conversations your infrastructure supports (depending on your hardware,...).

The problem is still, that there may be big conversations (with a lot of messages) on the same database. e.g. you have database A and database B an each stores e.g. 1000 conversations. It my be possible that there are far more "big" conversations on server A than on server B (since this is user created content). You could add a "master" database that contains a lookup, on which database/server the single conversations can be found (or you have a schema to assign a database from hash/modulo or something).

Maybe you can find real world architectures that deal with the same problems (you may not be the first one), and that have already been solved.