[python] Python - How to convert JSON File to Dataframe

How can I convert a JSON File as such into a dataframe to do some transformations.

For Example if the JSON file reads:

{"FirstName":"John",

"LastName":"Mark",

"MiddleName":"Lewis",

"username":"johnlewis2",

"password":"2910"}

How can I convert it to a table like such

Column -> FirstName | LastName | MiddleName | username | password



Row ----->    John | Mark |Lewis | johnlewis2 |2910

This question is related to python json pandas dataframe

The answer is


Creating dataframe from dictionary object.

import pandas as pd
data = [{'name': 'vikash', 'age': 27}, {'name': 'Satyam', 'age': 14}]
df = pd.DataFrame.from_dict(data, orient='columns')

df
Out[4]:
   age  name
0   27  vikash
1   14  Satyam

If you have nested columns then you first need to normalize the data:

data = [
  {
    'name': {
      'first': 'vikash',
      'last': 'singh'
    },
    'age': 27
  },
  {
    'name': {
      'first': 'satyam',
      'last': 'singh'
    },
    'age': 14
  }
]

df = pd.DataFrame.from_dict(pd.json_normalize(data), orient='columns')

df    
Out[8]:
age name.first  name.last
0   27  vikash  singh
1   14  satyam  singh

Source:


import pandas as pd
print(pd.json_normalize(your_json))

This will Normalize semi-structured JSON data into a flat table

Output

  FirstName LastName MiddleName password    username
      John     Mark      Lewis     2910  johnlewis2

There are 2 inputs you might have and you can also convert between them.

  1. input: listOfDictionaries --> use @VikashSingh solution

example: [{"":{"...

The pd.DataFrame() needs a listOfDictionaries as input.

  1. input: jsonStr --> use @JustinMalinchak solution

example: '{"":{"...

If you have jsonStr, you need an extra step to listOfDictionaries first. This is obvious as it is generated like:

jsonStr = json.dumps(listOfDictionaries)

Thus, switch back from jsonStr to listOfDictionaries first:

listOfDictionaries = json.loads(jsonStr)

jsondata = '{"0001":{"FirstName":"John","LastName":"Mark","MiddleName":"Lewis","username":"johnlewis2","password":"2910"}}'
import json
import pandas as pd
jdata = json.loads(jsondata)
df = pd.DataFrame(jdata)
print df.T

This should look like this:.

         FirstName LastName MiddleName password    username
0001      John     Mark      Lewis     2910  johnlewis2

Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to json

Use NSInteger as array index Uncaught SyntaxError: Unexpected end of JSON input at JSON.parse (<anonymous>) HTTP POST with Json on Body - Flutter/Dart Importing json file in TypeScript json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 190) Angular 5 Service to read local .json file How to import JSON File into a TypeScript file? Use Async/Await with Axios in React.js Uncaught SyntaxError: Unexpected token u in JSON at position 0 how to remove json object key and value.?

Examples related to pandas

xlrd.biffh.XLRDError: Excel xlsx file; not supported Pandas Merging 101 How to increase image size of pandas.DataFrame.plot in jupyter notebook? Trying to merge 2 dataframes but get ValueError Python Pandas User Warning: Sorting because non-concatenation axis is not aligned How to show all of columns name on pandas dataframe? Pandas/Python: Set value of one column based on value in another column Python Pandas - Find difference between two data frames Pandas get the most frequent values of a column Python convert object to float

Examples related to dataframe

Trying to merge 2 dataframes but get ValueError How to show all of columns name on pandas dataframe? Python Pandas - Find difference between two data frames Pandas get the most frequent values of a column Display all dataframe columns in a Jupyter Python Notebook How to convert column with string type to int form in pyspark data frame? Display/Print one column from a DataFrame of Series in Pandas Binning column with python pandas Selection with .loc in python Set value to an entire column of a pandas dataframe