[python] What are the differences between json and simplejson Python modules?

I have seen many projects using simplejson module instead of json module from the Standard Library. Also, there are many different simplejson modules. Why would use these alternatives, instead of the one in the Standard Library?

This question is related to python json simplejson

The answer is


Here's (a now outdated) comparison of Python json libraries:

Comparing JSON modules for Python (archive link)

Regardless of the results in this comparison you should use the standard library json if you are on Python 2.6. And.. might as well just use simplejson otherwise.


I have to disagree with the other answers: the built in json library (in Python 2.7) is not necessarily slower than simplejson. It also doesn't have this annoying unicode bug.

Here is a simple benchmark:

import json
import simplejson
from timeit import repeat

NUMBER = 100000
REPEAT = 10

def compare_json_and_simplejson(data):
    """Compare json and simplejson - dumps and loads"""
    compare_json_and_simplejson.data = data
    compare_json_and_simplejson.dump = json.dumps(data)
    assert json.dumps(data) == simplejson.dumps(data)
    result = min(repeat("json.dumps(compare_json_and_simplejson.data)", "from __main__ import json, compare_json_and_simplejson", 
                 repeat = REPEAT, number = NUMBER))
    print "      json dumps {} seconds".format(result)
    result = min(repeat("simplejson.dumps(compare_json_and_simplejson.data)", "from __main__ import simplejson, compare_json_and_simplejson", 
                 repeat = REPEAT, number = NUMBER))
    print "simplejson dumps {} seconds".format(result)
    assert json.loads(compare_json_and_simplejson.dump) == data
    result = min(repeat("json.loads(compare_json_and_simplejson.dump)", "from __main__ import json, compare_json_and_simplejson", 
                 repeat = REPEAT, number = NUMBER))
    print "      json loads {} seconds".format(result)
    result = min(repeat("simplejson.loads(compare_json_and_simplejson.dump)", "from __main__ import simplejson, compare_json_and_simplejson", 
                 repeat = REPEAT, number = NUMBER))
    print "simplejson loads {} seconds".format(result)


print "Complex real world data:" 
COMPLEX_DATA = {'status': 1, 'timestamp': 1362323499.23, 'site_code': 'testing123', 'remote_address': '212.179.220.18', 'input_text': u'ny monday for less than \u20aa123', 'locale_value': 'UK', 'eva_version': 'v1.0.3286', 'message': 'Successful Parse', 'muuid1': '11e2-8414-a5e9e0fd-95a6-12313913cc26', 'api_reply': {"api_reply": {"Money": {"Currency": "ILS", "Amount": "123", "Restriction": "Less"}, "ProcessedText": "ny monday for less than \\u20aa123", "Locations": [{"Index": 0, "Derived From": "Default", "Home": "Default", "Departure": {"Date": "2013-03-04"}, "Next": 10}, {"Arrival": {"Date": "2013-03-04", "Calculated": True}, "Index": 10, "All Airports Code": "NYC", "Airports": "EWR,JFK,LGA,PHL", "Name": "New York City, New York, United States (GID=5128581)", "Latitude": 40.71427, "Country": "US", "Type": "City", "Geoid": 5128581, "Longitude": -74.00597}]}}}
compare_json_and_simplejson(COMPLEX_DATA)
print "\nSimple data:"
SIMPLE_DATA = [1, 2, 3, "asasd", {'a':'b'}]
compare_json_and_simplejson(SIMPLE_DATA)

And the results on my system (Python 2.7.4, Linux 64-bit):

Complex real world data:
json dumps 1.56666707993 seconds
simplejson dumps 2.25638604164 seconds
json loads 2.71256899834 seconds
simplejson loads 1.29233884811 seconds

Simple data:
json dumps 0.370109081268 seconds
simplejson dumps 0.574181079865 seconds
json loads 0.422876119614 seconds
simplejson loads 0.270955085754 seconds

For dumping, json is faster than simplejson. For loading, simplejson is faster.

Since I am currently building a web service, dumps() is more important—and using a standard library is always preferred.

Also, cjson was not updated in the past 4 years, so I wouldn't touch it.


I came across this question as I was looking to install simplejson for Python 2.6. I needed to use the 'object_pairs_hook' of json.load() in order to load a json file as an OrderedDict. Being familiar with more recent versions of Python I didn't realize that the json module for Python 2.6 doesn't include the 'object_pairs_hook' so I had to install simplejson for this purpose. From personal experience this is why i use simplejson as opposed to the standard json module.


Some values are serialized differently between simplejson and json.

Notably, instances of collections.namedtuple are serialized as arrays by json but as objects by simplejson. You can override this behaviour by passing namedtuple_as_object=False to simplejson.dump, but by default the behaviours do not match.

>>> import collections, simplejson, json
>>> TupleClass = collections.namedtuple("TupleClass", ("a", "b"))
>>> value = TupleClass(1, 2)
>>> json.dumps(value)
'[1, 2]'
>>> simplejson.dumps(value)
'{"a": 1, "b": 2}'
>>> simplejson.dumps(value, namedtuple_as_object=False)
'[1, 2]'

I've been benchmarking json, simplejson and cjson.

  • cjson is fastest
  • simplejson is almost on par with cjson
  • json is about 10x slower than simplejson

http://pastie.org/1507411:

$ python test_serialization_speed.py 
--------------------
   Encoding Tests
--------------------
Encoding: 100000 x {'m': 'asdsasdqwqw', 't': 3}
[      json] 1.12385 seconds for 100000 runs. avg: 0.011239ms
[simplejson] 0.44356 seconds for 100000 runs. avg: 0.004436ms
[     cjson] 0.09593 seconds for 100000 runs. avg: 0.000959ms

Encoding: 10000 x {'m': [['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19]], 't': 3}
[      json] 7.76628 seconds for 10000 runs. avg: 0.776628ms
[simplejson] 0.51179 seconds for 10000 runs. avg: 0.051179ms
[     cjson] 0.44362 seconds for 10000 runs. avg: 0.044362ms

--------------------
   Decoding Tests
--------------------
Decoding: 100000 x {"m": "asdsasdqwqw", "t": 3}
[      json] 3.32861 seconds for 100000 runs. avg: 0.033286ms
[simplejson] 0.37164 seconds for 100000 runs. avg: 0.003716ms
[     cjson] 0.03893 seconds for 100000 runs. avg: 0.000389ms

Decoding: 10000 x {"m": [["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19]], "t": 3}
[      json] 37.26270 seconds for 10000 runs. avg: 3.726270ms
[simplejson] 0.56643 seconds for 10000 runs. avg: 0.056643ms
[     cjson] 0.33007 seconds for 10000 runs. avg: 0.033007ms

Another reason projects use simplejson is that the builtin json did not originally include its C speedups, so the performance difference was noticeable.


json seems faster than simplejson in both cases of loads and dumps in latest version

Tested versions:

  • python: 3.6.8
  • json: 2.0.9
  • simplejson: 3.16.0

Results:

>>> def test(obj, call, data, times):
...   s = datetime.now()
...   print("calling: ", call, " in ", obj, " ", times, " times") 
...   for _ in range(times):
...     r = getattr(obj, call)(data)
...   e = datetime.now()
...   print("total time: ", str(e-s))
...   return r

>>> test(json, "dumps", data, 10000)
calling:  dumps  in  <module 'json' from 'C:\\Users\\jophine.antony\\AppData\\Local\\Programs\\Python\\Python36-32\\lib\\json\\__init__.py'>   10000  times
total time:  0:00:00.054857

>>> test(simplejson, "dumps", data, 10000)
calling:  dumps  in  <module 'simplejson' from 'C:\\Users\\jophine.antony\\AppData\\Local\\Programs\\Python\\Python36-32\\lib\\site-packages\\simplejson\\__init__.py'>   10000  times
total time:  0:00:00.419895
'{"1": 100, "2": "acs", "3.5": 3.5567, "d": [1, "23"], "e": {"a": "A"}}'

>>> test(json, "loads", strdata, 1000)
calling:  loads  in  <module 'json' from 'C:\\Users\\jophine.antony\\AppData\\Local\\Programs\\Python\\Python36-32\\lib\\json\\__init__.py'>   1000  times
total time:  0:00:00.004985
{'1': 100, '2': 'acs', '3.5': 3.5567, 'd': [1, '23'], 'e': {'a': 'A'}}

>>> test(simplejson, "loads", strdata, 1000)
calling:  loads  in  <module 'simplejson' from 'C:\\Users\\jophine.antony\\AppData\\Local\\Programs\\Python\\Python36-32\\lib\\site-packages\\simplejson\\__init__.py'>   1000  times
total time:  0:00:00.040890
{'1': 100, '2': 'acs', '3.5': 3.5567, 'd': [1, '23'], 'e': {'a': 'A'}}

For versions:

  • python: 3.7.4
  • json: 2.0.9
  • simplejson: 3.17.0

json was faster than simplejson during dumps operation but both maintained the same speed during loads operations


simplejson module is simply 1,5 times faster than json (On my computer, with simplejson 2.1.1 and Python 2.7 x86).

If you want, you can try the benchmark: http://abral.altervista.org/jsonpickle-bench.zip On my PC simplejson is faster than cPickle. I would like to know also your benchmarks!

Probably, as said Coady, the difference between simplejson and json is that simplejson includes _speedups.c. So, why don't python developers use simplejson?


An API incompatibility I found, with Python 2.7 vs simplejson 3.3.1 is in whether output produces str or unicode objects. e.g.

>>> from json import JSONDecoder
>>> jd = JSONDecoder()
>>> jd.decode("""{ "a":"b" }""")
{u'a': u'b'}

vs

>>> from simplejson import JSONDecoder
>>> jd = JSONDecoder()
>>> jd.decode("""{ "a":"b" }""")
{'a': 'b'}

If the preference is to use simplejson, then this can be addressed by coercing the argument string to unicode, as in:

>>> from simplejson import JSONDecoder
>>> jd = JSONDecoder()
>>> jd.decode(unicode("""{ "a":"b" }""", "utf-8"))
{u'a': u'b'}

The coercion does require knowing the original charset, for example:

>>> jd.decode(unicode("""{ "a": "?????ßß?f?e?" }"""))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 8: ordinal not in range(128)

This is the won't fix issue 40


In python3, if you a string of b'bytes', with json you have to .decode() the content before you can load it. simplejson takes care of this so you can just do simplejson.loads(byte_string).


All of these answers aren't very helpful because they are time sensitive.

After doing some research of my own I found that simplejson is indeed faster than the builtin, if you keep it updated to the latest version.

pip/easy_install wanted to install 2.3.2 on ubuntu 12.04, but after finding out the latest simplejson version is actually 3.3.0, so I updated it and reran the time tests.

  • simplejson is about 3x faster than the builtin json at loads
  • simplejson is about 30% faster than the builtin json at dumps

Disclaimer:

The above statements are in python-2.7.3 and simplejson 3.3.0 (with c speedups) And to make sure my answer also isn't time sensitive, you should run your own tests to check since it varies so much between versions; there's no easy answer that isn't time sensitive.

How to tell if C speedups are enabled in simplejson:

import simplejson
# If this is True, then c speedups are enabled.
print bool(getattr(simplejson, '_speedups', False))

UPDATE: I recently came across a library called ujson that is performing ~3x faster than simplejson with some basic tests.


The builtin json module got included in Python 2.6. Any projects that support versions of Python < 2.6 need to have a fallback. In many cases, that fallback is simplejson.