Understanding MongoDB BSON Document size limit

Question

From MongoDB: The Definitive Guide:

"Documents larger than 4MB (when converted to BSON) cannot be saved to the database. This is a somewhat arbitrary limit (and may be raised in the future); it is mostly to prevent bad schema design and ensure consistent performance."

I don't understand this limit. Does this mean that a document containing a blog post with a lot of comments, which just so happens to be larger than 4MB, cannot be stored as a single document?

Also, does this count the nested documents too?

What if I wanted a document which audits the changes to a value? It may eventually grow beyond the 4MB limit.

Hope someone explains this correctly. I have just started reading about MongoDB (the first NoSQL database I'm learning about).

Thank you.

User · Answer

Posting a clarifying answer here for those who get directed here by Google.

The document size includes everything in the document including the subdocuments, nested objects etc.

So a document of:

{
  "_id": {},
  "na": [1, 2, 3],
  "naa": [
    { "w": 1, "v": 2, "b": [1, 2, 3] },
    { "w": 5, "b": 2, "h": [{ "d": 5, "g": 7 }, {}] }
  ]
}

Has a maximum size of 16 MB.

Subdocuments and nested objects are all counted towards the size of the document.
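
To make the "everything counts" point concrete, here is a rough sketch that estimates the encoded size of a document by walking it recursively. It follows the basic BSON layout (a 4-byte length prefix, typed elements, a trailing null byte), but `bson_size` is a hypothetical helper written for illustration and handles only a handful of types. A real driver's encoder remains the authoritative measure.

```python
MAX_BSON_SIZE = 16 * 1024 * 1024  # the 16 MB limit discussed above


def _value_size(value):
    """Size in bytes of an encoded value (simplified subset of BSON types)."""
    if value is None:
        return 0                      # null carries no payload
    if isinstance(value, bool):       # check bool before int: bool is an int subclass
        return 1
    if isinstance(value, int):
        return 4 if -2**31 <= value < 2**31 else 8   # int32 or int64
    if isinstance(value, float):
        return 8                      # double
    if isinstance(value, str):
        return 4 + len(value.encode("utf-8")) + 1    # length + bytes + null
    if isinstance(value, dict):
        return bson_size(value)       # embedded document counts in full
    if isinstance(value, (list, tuple)):
        # arrays are encoded as documents keyed "0", "1", "2", ...
        return bson_size({str(i): v for i, v in enumerate(value)})
    raise TypeError(f"type not covered by this sketch: {type(value).__name__}")


def bson_size(doc):
    """Estimated encoded size of a document, nested content included."""
    size = 4                          # int32 total-length prefix
    for key, value in doc.items():
        size += 1                               # element type byte
        size += len(key.encode("utf-8")) + 1    # key as a null-terminated string
        size += _value_size(value)
    return size + 1                   # trailing null byte


doc = {
    "_id": {},
    "na": [1, 2, 3],
    "naa": [
        {"w": 1, "v": 2, "b": [1, 2, 3]},
        {"w": 5, "b": 2, "h": [{"d": 5, "g": 7}, {}]},
    ],
}
print(bson_size(doc), "bytes; fits:", bson_size(doc) <= MAX_BSON_SIZE)
```

Every nested array and subdocument adds to the total of its parent, which is why an ever-growing embedded list eventually runs into the limit.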

User · Answer

First off, this actually is being raised in the next version to 8MB or 16MB... but I think to put this into perspective, Eliot from 10gen (who developed MongoDB) puts it best:

EDIT: The size has been officially "raised" to 16MB.

"So, on your blog example, 4MB is actually a whole lot. For example, the full uncompressed text of "War of the Worlds" is only 364k (html): http://www.gutenberg.org/etext/36

If your blog post is that long with that many comments, I for one am not going to read it.

For trackbacks, if you dedicated 1MB to them, you could easily have more than 10k (probably closer to 20k).

So except for truly bizarre situations, it'll work great. And in the exception case or spam, I really don't think you'd want a 20mb object anyway. I think capping trackbacks as 15k or so makes a lot of sense no matter what for performance. Or at least special casing if it ever happens.

-Eliot"

I think you'd be pretty hard pressed to reach the limit... and over time, if you upgrade... you'll have to worry less and less.

The main point of the limit is so you don't use up all the RAM on your server (as you need to load all MBs of the document into RAM when you query it).

So the limit is some % of normal usable RAM on a common system... which will keep growing year on year.

Note on Storing Files in MongoDB

If you need to store documents (or files) larger than 16MB you can use the GridFS API, which will automatically break up the data into segments and stream them back to you (thus avoiding the issue with size limits/RAM).

"Instead of storing a file in a single document, GridFS divides the file into parts, or chunks, and stores each chunk as a separate document."

"GridFS uses two collections to store files. One collection stores the file chunks, and the other stores file metadata."

You can use this method to store images, files, videos, etc. in the database much as you might in a SQL database. I have even used this to store multi-gigabyte video files.
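
The chunking idea behind GridFS can be sketched without any driver at all. The code below is only an illustration of the "one metadata document plus many chunk documents" layout, not the GridFS API itself: `split_into_chunks` and `reassemble` are hypothetical helpers, the two "collections" are plain Python objects, and the 255 KB chunk size is assumed to match the documented GridFS default.

```python
CHUNK_SIZE = 255 * 1024  # assumed default GridFS chunk size (255 KB)


def split_into_chunks(file_id, data, chunk_size=CHUNK_SIZE):
    """Return (files_doc, chunk_docs) for a blob of bytes."""
    # One small metadata document describes the whole file.
    files_doc = {"_id": file_id, "length": len(data), "chunkSize": chunk_size}
    # Each chunk document carries its sequence number and a slice of the data,
    # so no single document ever comes close to the 16 MB limit.
    chunk_docs = [
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunk_docs


def reassemble(chunk_docs):
    """Rebuild the original bytes by concatenating chunks in order of n."""
    ordered = sorted(chunk_docs, key=lambda c: c["n"])
    return b"".join(c["data"] for c in ordered)


payload = bytes(600 * 1024)  # 600 KB of zero bytes as stand-in file content
files_doc, chunk_docs = split_into_chunks("file-1", payload)
print(len(chunk_docs), "chunks for", files_doc["length"], "bytes")
```

Because the chunks can be read one at a time, a consumer can stream a large file instead of holding all of it in RAM at once.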

User · Answer

I have not yet seen a problem with the limit that did not involve large files stored within the document itself. There are already a variety of databases which are very efficient at storing/retrieving large files; they are called operating systems. The database exists as a layer over the operating system. If you are using a NoSQL solution for performance reasons, why would you want to add additional processing overhead to the access of your data by putting the DB layer between your application and your data?

JSON is a text format. So if you are accessing your data through JSON, the overhead is especially high for binary files, because they have to be encoded in uuencode, hexadecimal, or Base64. The conversion path might look like:

binary file <-> JSON (encoded) <-> BSON (encoded)

It would be more efficient to put the path (URL) to the data file in your document and keep the data itself in binary.

If you really want to keep these files of unknown length in your DB, then you would probably be better off putting them in GridFS and not risking killing your concurrency when the large files are accessed.
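
The cost of pushing binary data through a text encoding is easy to measure. This small sketch, using only the Python standard library, compares the raw size of a byte string with its Base64 and hexadecimal forms; `encoded_sizes` is a hypothetical helper name and the 3 MB payload is an arbitrary example.

```python
import base64


def encoded_sizes(raw):
    """Return the size in bytes of raw data and of its text encodings."""
    return {
        "raw": len(raw),
        "base64": len(base64.b64encode(raw)),  # 4 output bytes per 3 input bytes
        "hex": len(raw.hex()),                 # 2 characters per input byte
    }


sizes = encoded_sizes(bytes(3_000_000))
for name, size in sizes.items():
    print(f"{name:>6}: {size:>9} bytes ({size / sizes['raw']:.2f}x)")
```

Base64 inflates the payload by about a third and hexadecimal doubles it, so a file that fits under a size limit as raw bytes may not fit once it has been text-encoded.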

User · Answer

Perhaps storing a blog post -> comments relation in a non-relational database is not really the best design.

You should probably store comments in a separate collection from blog posts anyway.

[edit]

See comments below for further discussion.

User · Answer

According to https://www.mongodb.com/blog/post/6-rules-of-thumb-for-mongodb-schema-design-part-1:

If you expect that a blog post may exceed the 16MB document limit, you should extract the comments into a separate collection, reference the blog post from the comment, and do an application-level join.

// posts
{
  _id: ObjectID('AAAA'),
  text: 'a post'
}

// comments
{
  text: 'a comment',
  post: ObjectID('AAAA')
}
{
  text: 'another comment',
  post: ObjectID('AAAA')
}
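
An application-level join simply means fetching the post and its comments separately and stitching them together in code. The sketch below shows the idea with in-memory lists standing in for the two collections. No driver calls are used, the string ids replace real ObjectIDs, and `load_post_with_comments` is a hypothetical helper. With a real database, each lookup would be its own query.

```python
# In-memory stand-ins for the "posts" and "comments" collections.
posts = [
    {"_id": "AAAA", "text": "a post"},
]
comments = [
    {"text": "a comment", "post": "AAAA"},
    {"text": "another comment", "post": "AAAA"},
    {"text": "unrelated comment", "post": "BBBB"},
]


def load_post_with_comments(post_id, posts, comments):
    """Look up one post, then attach the comments that reference it."""
    # Step 1: find the post itself.
    post = next((p for p in posts if p["_id"] == post_id), None)
    if post is None:
        return None
    # Step 2: find every comment whose "post" field points at it.
    joined = dict(post)  # copy so the stored post is left untouched
    joined["comments"] = [c for c in comments if c["post"] == post_id]
    return joined


result = load_post_with_comments("AAAA", posts, comments)
print(result["text"], "has", len(result["comments"]), "comments")
```

Since each comment is its own small document, the post document stays a fixed size no matter how many comments accumulate.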

User · Answer

Nested Depth for BSON Documents: MongoDB supports no more than 100 levels of nesting for BSON documents.
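
If you want to check a document against the nesting limit before saving it, a small recursive walk is enough. This is a sketch under an assumption: it counts every embedded document or array as one level, which may not match exactly how the server counts. `nesting_depth` and `MAX_NESTING_DEPTH` are hypothetical names.

```python
MAX_NESTING_DEPTH = 100  # the nesting limit quoted above


def nesting_depth(value):
    """Depth of nested dicts/lists; scalars count as 0, a flat document as 1."""
    if isinstance(value, dict):
        children = value.values()
    elif isinstance(value, (list, tuple)):
        children = value
    else:
        return 0  # scalar values add no nesting level
    # One level for this container plus the deepest level beneath it.
    return 1 + max((nesting_depth(child) for child in children), default=0)


doc = {"naa": [{"h": [{"d": 5, "g": 7}, {}]}]}
depth = nesting_depth(doc)
print(depth, "levels; within limit:", depth <= MAX_NESTING_DEPTH)
```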

User · Answer

Many in the community would prefer no limit, with warnings about performance. See this comment for a well-reasoned argument: https://jira.mongodb.org/browse/SERVER-431?focusedCommentId=22283&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-22283

My take: the lead developers are stubborn about this issue because they decided it was an important "feature" early on. They're not going to change it anytime soon because their feelings are hurt that anyone questioned it. It is another example of personality and politics detracting from a product in open source communities, but this is not really a crippling issue.
