[json] Importing data from a JSON file into R

Is there a way to import data from a JSON file into R? More specifically, the file is an array of JSON objects with string fields, objects, and arrays. The RJSON Package isn't very clear on how to deal with this http://cran.r-project.org/web/packages/rjson/rjson.pdf.

This question is related to json r

The answer is


If the URL is https, like used for Amazon S3, then use getURL

json <- fromJSON(getURL('https://s3.amazonaws.com/bucket/my.json'))

packages:

  • library(httr)
  • library(jsonlite)

I have had issues converting json to dataframe/csv. For my case I did:

Token <- "245432532532"
source <- "http://......."
header_type <- "applcation/json"
full_token <- paste0("Bearer ", Token)
response <- GET(n_source, add_headers(Authorization = full_token, Accept = h_type), timeout(120), verbose())
text_json <- content(response, type = 'text', encoding = "UTF-8")
jfile <- fromJSON(text_json)
df <- as.data.frame(jfile)

then from df to csv.

In this format it should be easy to convert it to multiple .csvs if needed.

The important part is content function should have type = 'text'.


First install the RJSONIO and RCurl package:

_x000D_
_x000D_
install.packages("RJSONIO")_x000D_
install.packages("(RCurl")
_x000D_
_x000D_
_x000D_

Try below code using RJSONIO in console

_x000D_
_x000D_
library(RJSONIO)_x000D_
library(RCurl)_x000D_
json_file = getURL("https://raw.githubusercontent.com/isrini/SI_IS607/master/books.json")_x000D_
json_file2 = RJSONIO::fromJSON(json_file)_x000D_
head(json_file2)
_x000D_
_x000D_
_x000D_


import httr package

library(httr)

Get the url

url <- "http://www.omdbapi.com/?apikey=72bc447a&t=Annie+Hall&y=&plot=short&r=json"
resp <- GET(url)

Print content of resp as text

content(resp, as = "text")

Print content of resp

content(resp)

Use content() to get the content of resp, but this time do not specify a second argument. R figures out automatically that you're dealing with a JSON, and converts the JSON to a named R list.


An alternative package is RJSONIO. To convert a nested list, lapply can help:

l <- fromJSON('[{"winner":"68694999",  "votes":[ 
   {"ts":"Thu Mar 25 03:13:01 UTC 2010", "user":{"name":"Lamur","user_id":"68694999"}},   
   {"ts":"Thu Mar 25 03:13:08 UTC 2010", "user":{"name":"Lamur","user_id":"68694999"}}],   
  "lastVote":{"timestamp":1269486788526,"user":
   {"name":"Lamur","user_id":"68694999"}},"startPrice":0}]'
)
m <- lapply(
    l[[1]]$votes, 
    function(x) c(x$user['name'], x$user['user_id'], x['ts'])
)
m <- do.call(rbind, m)

gives information on the votes in your example.


jsonlite will import the JSON into a data frame. It can optionally flatten nested objects. Nested arrays will be data frames.

> library(jsonlite)
> winners <- fromJSON("winners.json", flatten=TRUE)
> colnames(winners)
[1] "winner" "votes" "startPrice" "lastVote.timestamp" "lastVote.user.name" "lastVote.user.user_id"
> winners[,c("winner","startPrice","lastVote.user.name")]
    winner startPrice lastVote.user.name
1 68694999          0              Lamur
> winners[,c("votes")]
[[1]]
                            ts user.name user.user_id
1 Thu Mar 25 03:13:01 UTC 2010     Lamur     68694999
2 Thu Mar 25 03:13:08 UTC 2010     Lamur     68694999