[python] Reading rather large json files in Python

The issue here is that JSON, as a format, is generally parsed in full and then handled in-memory, which for such a large amount of data is clearly problematic.

The solution to this is to work with the data as a stream - reading part of the file, working with it, and then repeating.

The best option appears to be using something like ijson - a module that will work with JSON as a stream, rather than as a block file.

Edit: Also worth a look - kashif's comment about json-streamer and Henrik Heino's comment about bigjson.