How to JSON serialize sets

Question

I have a Python set that contains objects with __hash__ and __eq__ methods in order to make certain no duplicates are included in the collection.

I need to JSON-encode this result set, but passing even an empty set to the json.dumps method raises a TypeError:

      File "/usr/lib/python2.7/json/encoder.py", line 201, in encode
        chunks = self.iterencode(o, _one_shot=True)
      File "/usr/lib/python2.7/json/encoder.py", line 264, in iterencode
        return _iterencode(o, 0)
      File "/usr/lib/python2.7/json/encoder.py", line 178, in default
        raise TypeError(repr(o) + " is not JSON serializable")
    TypeError: set([]) is not JSON serializable

I know I can create an extension to the json.JSONEncoder class that has a custom default method, but I'm not even sure where to begin in converting the set. Should I create a dictionary out of the set values within the default method, and then return the encoding of that? Ideally, I'd like the default method to handle all the datatypes that the original encoder chokes on (I'm using Mongo as a data source, so dates seem to raise this error too).

Any hint in the right direction would be appreciated.

EDIT:

Thanks for the answer! Perhaps I should have been more precise.

I utilized (and upvoted) the answers here to get around the limitations of the set being translated, but there are internal keys that are an issue as well.

The objects in the set are complex objects that translate to __dict__, but they themselves can also contain values for their properties that could be ineligible for the basic types in the JSON encoder.

There are a lot of different types coming into this set, and the hash basically calculates a unique id for the entity, but in the true spirit of NoSQL there's no telling exactly what the child object contains.

One object might contain a date value for starts, whereas another may have some other schema that includes no keys containing "non-primitive" objects.

That is why the only solution I could think of was to extend the JSONEncoder and replace the default method to switch on different cases - but I'm not sure how to go about this, and the documentation is ambiguous. In nested objects, does the value returned from default go by key, or is it just a generic include/discard that looks at the whole object? How does that method accommodate nested values? I've looked through previous questions and can't seem to find the best approach to case-specific encoding (which unfortunately seems like what I'm going to need to do here).

User · Answer

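If you just need a quick dump and don't want to implement a custom encoder, you can use the following:

    json_string = json.dumps(data, iterable_as_array=True)

This converts all sets (and other iterables) into arrays. Note that iterable_as_array is an option of the third-party simplejson package (for example, import simplejson as json); the standard-library json.dumps does not accept it. Beware that those fields will stay arrays when you parse the JSON back. If you want to preserve the types, you need to write a custom encoder.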
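A similar quick dump is possible with only the standard library by passing a default function to json.dumps. The following is a minimal sketch, not a drop-in replacement for the option above; iterable_to_list is a hypothetical helper name, and the sample data is made up for illustration.

```python
import json


def iterable_to_list(obj):
    """Hypothetical helper: turn any otherwise unsupported iterable into a list."""
    try:
        return list(obj)
    except TypeError:
        # Not iterable either, so report it as unserializable.
        raise TypeError(
            f"Object of type {type(obj).__name__} is not JSON serializable"
        ) from None


# Illustrative data: a set and a range are not handled natively by json.
data = {"tags": {"a"}, "counts": (1, 2), "gen": range(3)}
json_string = json.dumps(data, default=iterable_to_list)
print(json_string)  # {"tags": ["a"], "counts": [1, 2], "gen": [0, 1, 2]}
```

As with the one-liner, the type information is lost: everything comes back as a plain list when the JSON is parsed.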

User · Answer

A shortened version of @AnttiHaapala's answer:

    json.dumps(dict_with_sets, default=lambda x: list(x) if isinstance(x, set) else x)

User · Answer

Only dictionaries, lists, and primitive types (int, float, string, bool, None) are available in JSON.
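To make that concrete, here is a small sketch using only the standard-library json module. The sample values are invented for illustration; the point is that the natively supported types encode directly, while a set is rejected.

```python
import json

# Types the json module maps directly onto JSON values
# (a tuple is written out as a JSON array).
native = json.dumps({"n": 1, "s": "text", "b": True, "none": None, "seq": [1.5, (2, 3)]})
print(native)  # {"n": 1, "s": "text", "b": true, "none": null, "seq": [1.5, [2, 3]]}

# A set has no JSON counterpart, so the encoder refuses it.
try:
    json.dumps({1, 2, 3})
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```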

User · Answer

I adapted Raymond Hettinger's solution to Python 3.

Here is what has changed:

- unicode disappeared
- updated the call to the parent's default with super()
- using base64 to serialize the bytes type into str (because it seems that bytes in Python 3 can't be converted to JSON)

    from decimal import Decimal
    from base64 import b64encode, b64decode
    from json import dumps, loads, JSONEncoder
    import pickle

    class PythonObjectEncoder(JSONEncoder):
        def default(self, obj):
            # Natively supported types are handed to the parent implementation.
            if isinstance(obj, (list, dict, str, int, float, bool, type(None))):
                return super().default(obj)
            # Anything else is pickled, then base64-encoded into a str.
            return {'_python_object': b64encode(pickle.dumps(obj)).decode('utf-8')}

    def as_python_object(dct):
        if '_python_object' in dct:
            return pickle.loads(b64decode(dct['_python_object'].encode('utf-8')))
        return dct

    data = [1, 2, 3, set(['knights', 'who', 'say', 'ni']), {'key': 'value'}, Decimal('3.14')]
    j = dumps(data, cls=PythonObjectEncoder)
    print(loads(j, object_hook=as_python_object))
    # prints: [1, 2, 3, {'knights', 'who', 'say', 'ni'}, {'key': 'value'}, Decimal('3.14')]
    # (the order of the set elements may vary)

User · Answer

JSON notation has only a handful of native datatypes (objects, arrays, strings, numbers, booleans, and null), so anything serialized in JSON needs to be expressed as one of these types.

As shown in the json module docs, this conversion can be done automatically by a JSONEncoder and JSONDecoder, but then you would be giving up some other structure you might need (if you convert sets to a list, then you lose the ability to recover regular lists; if you convert sets to a dictionary using dict.fromkeys(s), then you lose the ability to recover dictionaries).

A more sophisticated solution is to build out a custom type that can coexist with the other native JSON types. This lets you store nested structures that include lists, sets, dicts, decimals, datetime objects, etc. (the code below targets Python 2; note the unicode type):

    from json import dumps, loads, JSONEncoder, JSONDecoder
    from decimal import Decimal  # needed for the sample session below
    import pickle

    class PythonObjectEncoder(JSONEncoder):
        def default(self, obj):
            if isinstance(obj, (list, dict, str, unicode, int, float, bool, type(None))):
                return JSONEncoder.default(self, obj)
            # Anything else is stored as a pickle under a marker key.
            return {'_python_object': pickle.dumps(obj)}

    def as_python_object(dct):
        if '_python_object' in dct:
            return pickle.loads(str(dct['_python_object']))
        return dct

Here is a sample session showing that it can handle lists, dicts, and sets:

    >>> data = [1,2,3, set(['knights', 'who', 'say', 'ni']), {'key':'value'}, Decimal('3.14')]

    >>> j = dumps(data, cls=PythonObjectEncoder)

    >>> loads(j, object_hook=as_python_object)
    [1, 2, 3, set(['knights', 'say', 'who', 'ni']), {u'key': u'value'}, Decimal('3.14')]

Alternatively, it may be useful to use a more general-purpose serialization technique such as YAML, Twisted Jelly, or Python's pickle module. These each support a much greater range of datatypes.

User · Answer

You don't need to make a custom encoder class to supply the default method - it can be passed in as a keyword argument:

    import json

    def serialize_sets(obj):
        if isinstance(obj, set):
            return list(obj)

        return obj

    json_str = json.dumps(set([1,2,3]), default=serialize_sets)
    print(json_str)

This results in [1, 2, 3] in all supported Python versions.
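The same keyword argument can cover the dates mentioned in the question. The sketch below is one possible shape, not a canonical implementation: serialize_extras is a hypothetical name, the record is invented sample data, and writing dates as ISO 8601 strings is an assumption about the desired format. Returning an object unchanged leaves the encoder with a value it still cannot serialize, so this version raises TypeError for anything it does not recognize.

```python
import json
from datetime import date, datetime


def serialize_extras(obj):
    """Hypothetical default hook covering sets and dates."""
    if isinstance(obj, (set, frozenset)):
        # sorted() gives stable output; assumes the elements are mutually comparable.
        return sorted(obj)
    if isinstance(obj, (datetime, date)):
        # Assumption: ISO 8601 strings are an acceptable representation.
        return obj.isoformat()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


# Invented sample record with a set and a datetime value.
record = {"ids": {3, 1, 2}, "starts": datetime(2020, 1, 2, 3, 4, 5)}
json_str = json.dumps(record, default=serialize_extras)
print(json_str)  # {"ids": [1, 2, 3], "starts": "2020-01-02T03:04:05"}
```

The hook is only consulted for objects the encoder cannot handle itself, and it applies at any nesting depth, so the set and the datetime inside the dictionary are both converted.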

User · Answer

If you only need to encode sets, not general Python objects, and want to keep the output easily human-readable, a simplified version of Raymond Hettinger's answer can be used. (The abstract base class is referenced as collections.abc.Set here, because the collections.Set alias was removed in Python 3.10.)

    import json
    import collections.abc

    class JSONSetEncoder(json.JSONEncoder):
        """Use with json.dumps to allow Python sets to be encoded to JSON

        Example
        -------

        import json

        data = dict(aset=set([1,2,3]))

        encoded = json.dumps(data, cls=JSONSetEncoder)
        decoded = json.loads(encoded, object_hook=json_as_python_set)
        assert data == decoded     # Should assert successfully

        Any object that is matched by isinstance(obj, collections.abc.Set) will
        be encoded, but the decoded value will always be a normal Python set.

        """

        def default(self, obj):
            if isinstance(obj, collections.abc.Set):
                return dict(_set_object=list(obj))
            else:
                return json.JSONEncoder.default(self, obj)

    def json_as_python_set(dct):
        """Decode json {'_set_object': [1,2,3]} to set([1,2,3])

        Example
        -------
        decoded = json.loads(encoded, object_hook=json_as_python_set)

        Also see :class:`JSONSetEncoder`

        """
        if '_set_object' in dct:
            return set(dct['_set_object'])
        return dct

User · Answer

One shortcoming of the accepted solution is that its output is very Python-specific, i.e. its raw JSON output cannot be read by a human or loaded by another language (e.g. JavaScript). Example:

    db = {
            "a": [ 44, set((4,5,6)) ],
            "b": [ 55, set((4,3,2)) ]
            }

    j = dumps(db, cls=PythonObjectEncoder)
    print(j)

Will get you:

    {"a": [44, {"_python_object": "gANjYnVpbHRpbnMKc2V0CnEAXXEBKEsESwVLBmWFcQJScQMu"}], "b": [55, {"_python_object": "gANjYnVpbHRpbnMKc2V0CnEAXXEBKEsCSwNLBGWFcQJScQMu"}]}

I can propose a solution which downgrades the set to a dict containing a list on the way out, and back to a set when loaded into Python with the matching object hook, therefore preserving observability and language agnosticism:

    from decimal import Decimal
    from base64 import b64encode, b64decode
    from json import dumps, loads, JSONEncoder
    import pickle

    class PythonObjectEncoder(JSONEncoder):
        def default(self, obj):
            if isinstance(obj, (list, dict, str, int, float, bool, type(None))):
                return super().default(obj)
            elif isinstance(obj, set):
                # Sets stay readable: a one-key dict wrapping a plain list.
                return {"__set__": list(obj)}
            return {'_python_object': b64encode(pickle.dumps(obj)).decode('utf-8')}

    def as_python_object(dct):
        if '__set__' in dct:
            return set(dct['__set__'])
        elif '_python_object' in dct:
            return pickle.loads(b64decode(dct['_python_object'].encode('utf-8')))
        return dct

    db = {
            "a": [ 44, set((4,5,6)) ],
            "b": [ 55, set((4,3,2)) ]
            }

    j = dumps(db, cls=PythonObjectEncoder)
    print(j)
    # Loaded here without object_hook, so the set stays in its dict form;
    # pass object_hook=as_python_object to get a real set back.
    ob = loads(j)
    print(ob["a"])

Which gets you:

    {"a": [44, {"__set__": [4, 5, 6]}], "b": [55, {"__set__": [2, 3, 4]}]}
    [44, {'__set__': [4, 5, 6]}]

Note that serializing a dictionary which has an element with a key "__set__" will break this mechanism, so __set__ has now become a reserved dict key. Obviously, feel free to use another, more deeply obfuscated key.
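The reserved-key caveat can be seen in a short sketch. It uses a stripped-down decoder with the same rule as this answer's object hook; as_set and the sample dictionary are hypothetical names and data chosen for illustration.

```python
import json


def as_set(dct):
    # Same rule as the answer's hook: any dict carrying the "__set__" key becomes a set.
    if "__set__" in dct:
        return set(dct["__set__"])
    return dct


# An ordinary dictionary that merely happens to use the reserved key.
original = {"config": {"__set__": [1, 2], "other": "dropped"}}
decoded = json.loads(json.dumps(original), object_hook=as_set)
print(decoded)  # {'config': {1, 2}}
```

The inner dictionary never contained a set, yet it comes back as one and its other key is silently lost, which is why the marker key should be one that real data is unlikely to use.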

User · Answer

You can create a custom encoder that returns a list when it encounters a set. Here's an example:

    >>> import json
    >>> class SetEncoder(json.JSONEncoder):
    ...    def default(self, obj):
    ...       if isinstance(obj, set):
    ...          return list(obj)
    ...       return json.JSONEncoder.default(self, obj)
    ...
    >>> json.dumps(set([1,2,3,4,5]), cls=SetEncoder)
    '[1, 2, 3, 4, 5]'

You can detect other types this way too. If you need to retain the fact that the list was actually a set, you could use a custom encoding. Something like return {'type':'set', 'list':list(obj)} might work.

To illustrate nested types, consider serializing this:

    >>> class Something(object):
    ...    pass
    >>> json.dumps(set([1,2,3,4,5,Something()]), cls=SetEncoder)

This raises the following error:

    TypeError: <__main__.Something object at 0x1691c50> is not JSON serializable

This indicates that the encoder takes the list returned from default and recursively calls the serializer on its children. To add a custom serializer for multiple types, you can do this:

    >>> class SetEncoder(json.JSONEncoder):
    ...    def default(self, obj):
    ...       if isinstance(obj, set):
    ...          return list(obj)
    ...       if isinstance(obj, Something):
    ...          return 'CustomSomethingRepresentation'
    ...       return json.JSONEncoder.default(self, obj)
    ...
    >>> json.dumps(set([1,2,3,4,5,Something()]), cls=SetEncoder)
    '[1, 2, 3, 4, 5, "CustomSomethingRepresentation"]'
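The tagged encoding suggested in this answer can be paired with an object_hook to get the set back on load. This is a rough sketch under that assumption: TaggedSetEncoder and restore_sets are hypothetical names, and the "type"/"list" keys follow the suggestion in the text rather than any established convention.

```python
import json


class TaggedSetEncoder(json.JSONEncoder):
    """Hypothetical encoder that records that a list started life as a set."""

    def default(self, obj):
        if isinstance(obj, set):
            return {"type": "set", "list": list(obj)}
        return super().default(obj)


def restore_sets(dct):
    # Called for every decoded JSON object, innermost first.
    if dct.get("type") == "set" and "list" in dct:
        return set(dct["list"])
    return dct


# Invented sample data: one set and one genuine list with the same items.
data = {"outer": {1, 2, 3}, "plain": [1, 2, 3]}
encoded = json.dumps(data, cls=TaggedSetEncoder)
decoded = json.loads(encoded, object_hook=restore_sets)
print(decoded == data)  # True
```

Unlike the plain list conversion, the round trip keeps the set and the genuine list distinct, at the cost of reserving the "type" and "list" key combination for this purpose.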
