[json] Is there a query language for JSON?

Is there a (roughly) SQL or XQuery-like language for querying JSON?

I'm thinking of very small datasets that map nicely to JSON where it would be nice to easily answer queries such as "what are all the values of X where Y > 3" or to do the usual SUM / COUNT type operations.

As a completely made-up example, something like this:

[{"x": 2, "y": 0}, {"x": 3, "y": 1}, {"x": 4, "y": 1}]

SUM(X) WHERE Y > 0     (would equate to 7)
LIST(X) WHERE Y > 0    (would equate to [3,4])

I'm thinking this would work both client-side and server-side with results being converted to the appropriate language-specific data structure (or perhaps kept as JSON)

A quick Googling suggests that people have thought about it and implemented a few things (JAQL), but it doesn't seem like a standard usage or set of libraries has emerged yet. While each function is fairly trivial to implement on its own, if someone has already done it right I don't want to re-invent the wheel.

Any suggestions?

Edit: This may indeed be a bad idea or JSON may be too generic a format for what I'm thinking.. The reason for wanting a query language instead of just doing the summing/etc functions directly as needed is that I hope to build the queries dynamically based on user-input. Kinda like the argument that "we don't need SQL, we can just write the functions we need". Eventually that either gets out of hand or you end up writing your own version of SQL as you push it further and further. (Okay, I know that is a bit of a silly argument, but you get the idea..)

This question is related to json nosql web-standards querying dynamic-queries

The answer is


  1. Google has a project called lovefield; just found out about it, and it looks interesting, though it is more involved than just dropping in underscore or lodash.

    https://github.com/google/lovefield

Lovefield is a relational query engine written in pure JavaScript. It also provides help with persisting data on the browser side, e.g. using IndexedDB to store data locally. It provides SQL-like syntax and works cross-browser (currently supporting Chrome 37+, Firefox 31+, IE 10+, and Safari 5.1+...


  2. Another interesting recent entry in this space is called jinqJs.

    http://www.jinqjs.com/

    Briefly reviewing the examples, it looks promising, and the API document appears to be well written.


function isChild(row) {
  return (row.Age < 18 ? 'Yes' : 'No');
}

var people = [
  {Name: 'Jane', Age: 20, Location: 'Smithtown'},
  {Name: 'Ken', Age: 57, Location: 'Islip'},
  {Name: 'Tom', Age: 10, Location: 'Islip'}
];

var result = new jinqJs()
  .from(people)
  .orderBy('Age')
  .select([{field: 'Name'}, 
     {field: 'Age', text: 'Your Age'}, 
     {text: 'Is Child', value: isChild}]);

jinqJs is a small, simple, lightweight and extensible JavaScript library that has no dependencies. jinqJs provides a simple way to perform SQL-like queries on JavaScript arrays, collections and web services that return a JSON response. jinqJs is similar to Microsoft's Lambda expression for .NET, and it provides similar capabilities to query collections using a SQL-like syntax and predicate functionality. jinqJs’s purpose is to provide a SQL-like experience to programmers familiar with LINQ queries.


jq is a JSON query language, mainly intended for the command-line but with bindings to a wide range of programming languages (Java, node.js, php, ...) and even available in the browser via jq-web.

Here are some illustrations based on the original question, which gave this JSON as an example:

 [{"x": 2, "y": 0}, {"x": 3, "y": 1}, {"x": 4, "y": 1}]

SUM(X) WHERE Y > 0 (would equate to 7)

map(select(.y > 0) | .x) | add

LIST(X) WHERE Y > 0 (would equate to [3,4])

map(select(.y > 0) | .x)

jq syntax extends JSON syntax

Every JSON expression is a valid jq expression, and expressions such as [1, (1+1)] and {"a": (1+1)} illustrate how jq extends JSON syntax.

A more useful example is the jq expression:

{a,b}

which, given the JSON value {"a":1, "b":2, "c": 3}, evaluates to {"a":1, "b":2}.


Another way to look at this would be to use MongoDB. You can store your JSON in Mongo and then query it via the MongoDB query syntax.


JMESPath (http://jmespath.org/) is easy to use and works well. It is used by Amazon in the AWS command line interface, so it has to be quite stable.


I'd recommend my project I'm working on called jLinq. I'm looking for feedback so I'd be interested in hearing what you think.

It lets you write queries similar to how you would in LINQ...

var results = jLinq.from(records.users)

    //you can join records
    .join(records.locations, "location", "locationId", "id")

    //write queries on the data
    .startsWith("firstname", "j")
    .or("k") //automatically remembers field and command names

    //even query joined items
    .equals("location.state", "TX")

    //and even do custom selections
    .select(function(rec) {
        return {
            fullname : rec.firstname + " " + rec.lastname,
            city : rec.location.city,
            ageInTenYears : (rec.age + 10)
        };
    });

It's fully extensible too!

The documentation is still in progress, but you can still try it online.


Here are some simple JavaScript libraries that will also do the trick:

  • Dollar Q is a nice lightweight library. It has a familiar feel to the chaining syntax made popular by jQuery and is only 373 SLOC.
  • SpahQL is a fully featured query language with a syntax similar to XPath (Homepage, GitHub).
  • jFunk is an in-progress query language with a syntax similar to CSS/jQuery selectors. It looked promising, but hasn't had any development beyond its initial commit.

  • (added 2014): the jq command-line tool has a neat syntax, but unfortunately it is a C library. Example usage:

    < package.json jq '.dependencies | to_entries | .[] | select(.value | startswith("git")) | .key'


The built-in array.filter() method makes most of these so-called JavaScript query libraries obsolete.

You can put as many conditions inside the delegate as you can imagine: simple comparison, startsWith, etc. I haven't tested but you could probably nest filters too for querying inner collections.
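As a minimal sketch of that idea, here are the two queries from the question written with only built-in array methods. The variable names and the `orders` data in the nested example are made up for illustration.

```javascript
// Sample data from the question.
const data = [{ x: 2, y: 0 }, { x: 3, y: 1 }, { x: 4, y: 1 }];

// WHERE Y > 0
const matching = data.filter((row) => row.y > 0);

// LIST(X) WHERE Y > 0
const listX = matching.map((row) => row.x);

// SUM(X) WHERE Y > 0 (the initial value 0 keeps reduce safe on an empty array)
const sumX = listX.reduce((total, x) => total + x, 0);

console.log(listX); // [ 3, 4 ]
console.log(sumX);  // 7

// Hypothetical nested data: keep orders that contain at least one item over 10.
const orders = [
  { id: 1, items: [{ price: 5 }, { price: 50 }] },
  { id: 2, items: [{ price: 3 }] },
];
const idsWithExpensiveItem = orders
  .filter((order) => order.items.some((item) => item.price > 10))
  .map((order) => order.id);

console.log(idsWithExpensiveItem); // [ 1 ]
```

For inner collections, `some` (or `every`) inside the outer `filter` callback is usually enough; a nested `filter` is only needed when the inner matches themselves are wanted.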


In MongoDB, this is how it would work (in the mongo shell, there exist drivers for a language of your choice).

db.collection.insert({"x": 2, "y": 0}); // notice the ':' instead of ','
db.collection.insert({"x": 3, "y": 1});
db.collection.insert({"x": 4, "y": 1});

db.collection.aggregate([{$match: {"y": {$gt: 0}}}, 
                         {$group: {_id: "sum", sum: {$sum: "$x"}}}]);
db.collection.aggregate([{$match: {"y": {$gt: 0}}}, 
                         {$group: {_id: "list", list: {$push: "$x"}}}]);

The first three commands insert the data into your collection. (Just start the mongod server and connect with the mongo client.)

The next two process the data. $match filters, $group applies the sum and list, respectively.


I'll second the notion of just using your own javascript, but for something a bit more sophisticated you might look at dojo data. Haven't used it but it looks like it gives you roughly the kind of query interface you're looking for.


Whenever possible I would shift all of the querying to the backend on the server (to the SQL DB or other native database type), because querying there will be quicker and better optimized.

I know that JSON can stand alone and there may be pros and cons to having a query language, but I cannot see the advantage if you are retrieving data from the backend to a browser, as in most JSON use cases. Query and filter at the backend to get as small a data set as is needed.

If for whatever reason you need to query at the front-end (mostly in a browser) then I would suggest just using array.filter (why invent something else?).
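If the conditions have to come from user input, as the question's edit describes, a plain filter can still be driven by data. The following is only a rough sketch under assumptions: the condition shape `{ field, op, value }`, the `operators` table and the `buildPredicate` helper are all hypothetical names, not part of any library.

```javascript
// Whitelist of supported comparison operators (a Map avoids accidental
// lookups of inherited object properties such as "constructor").
const operators = new Map([
  [">", (a, b) => a > b],
  ["<", (a, b) => a < b],
  ["==", (a, b) => a === b],
]);

// Turn a list of conditions into one predicate; conditions are ANDed together.
function buildPredicate(conditions) {
  return (row) =>
    conditions.every(({ field, op, value }) => {
      const compare = operators.get(op);
      if (!compare) throw new Error(`Unsupported operator: ${op}`);
      return compare(row[field], value);
    });
}

// Sample data from the question.
const data = [{ x: 2, y: 0 }, { x: 3, y: 1 }, { x: 4, y: 1 }];

// e.g. assembled from form fields: "y > 0"
const userConditions = [{ field: "y", op: ">", value: 0 }];

const rows = data.filter(buildPredicate(userConditions));
console.log(rows.map((row) => row.x)); // [ 3, 4 ]
```

Once OR, grouping or aggregation are needed, this starts to look like a small query language of its own, which is the trade-off the question raises.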

That said, I think a transformation API for JSON would be more useful, since once you have the data you may want to display it in a number of ways. Again, though, much of this can be done on the server (which can be much easier to scale) rather than on the client, if you are using a server<-->client model.

Just my 2 pence worth!


OK, this post is a little old, but... if you want to do SQL-like queries on JS objects, with the query itself written in native JSON (or JS objects), take a look at https://github.com/deitch/searchjs

It is both a jsql language written entirely in JSON and a reference implementation. You can say, "I want to find all objects in an array that have name==="John" && age===25" as:

{name:"John",age:25,_join:"AND"}

The reference implementation, searchjs, works in the browser as well as a node npm package:

npm install searchjs

It can also do things like complex joins and negation (NOT). It natively ignores case.

It doesn't yet do summation or count, but it is probably easier to do those outside.


You can also use Underscore.js which is basically a swiss-knife library to manipulate collections. Using _.filter, _.pluck, _.reduce you can do SQL-like queries.

var data = [{"x": 2, "y": 0}, {"x": 3, "y": 1}, {"x": 4, "y": 1}];

var posData = _.filter(data, function(elt) { return elt.y > 0; });
// [{"x": 3, "y": 1}, {"x": 4, "y": 1}]

var values = _.pluck(posData, "x");
// [3, 4]

var sum = _.reduce(values, function(a, b) { return a+b; });
// 7

Underscore.js works both client-side and server-side and is a notable library.

You can also use Lo-Dash, which is a fork of Underscore.js with better performance.


ObjectPath is a simple and lightweight query language for JSON documents of complex or unknown structure. It's similar to XPath or JSONPath, but much more powerful thanks to embedded arithmetic calculations, comparison mechanisms and built-in functions.

Example

Python version is mature and used in production. JS is still in beta.

Probably in the near future we will provide a full-fledged Javascript version. We also want to develop it further, so that it could serve as a simpler alternative to Mongo queries.


Update: XQuery 3.1 can query either XML or JSON - or both together. And XPath 3.1 can too.

The list is growing:


If you want to use pure JavaScript, try this:

var object = { result: { data: { point1: "x", value: 2 }, foo: { bar: 7 } } },
    path = 'result.data.value',
    getValue = (o, p) => p.split('.').reduce((r, k) => r[k], o);

console.log(getValue(object, path));
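One caveat with a reduce over `r[k]` is that it throws a TypeError as soon as an intermediate key is missing. A possible variant, sketched here under the hypothetical name `getValueSafe`, returns undefined instead:

```javascript
// Walk a dotted path, stopping early (with undefined) when a step is missing.
const getValueSafe = (obj, path) =>
  path
    .split('.')
    .reduce((acc, key) => (acc == null ? undefined : acc[key]), obj);

const object = { result: { data: { point1: "x", value: 2 }, foo: { bar: 7 } } };

console.log(getValueSafe(object, 'result.data.value'));    // 2
console.log(getValueSafe(object, 'result.missing.value')); // undefined
```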


SpahQL is the most promising and well thought out of these, as far as I can tell. I highly recommend checking it out.


PythonQL offers an embedded syntax that IMHO is an improvement on SQL, principally because group, window, where, let, etc. can be freely intermixed.

$ cat x.py
#coding: pythonql
data = [{"x": 2, "y": 0}, {"x": 3, "y": 1}, {"x": 4, "y": 1}]
q = [x match {'x': as x, 'y': as y} in data where y > 0]
print(sum(q))
print(list(q))

q = [x match {'x': as x, 'y': as y} as d in data where d['y'] > 0]
print(sum(q))

This code shows two different answers to your question, depending on your need to handle the entire structure or just the value. Execution gives you the expected result.

$ python x.py
7
[3, 4]
7

You could use linq.js.

It allows aggregations and selections over a data set of objects, as well as over other data structures.

var data = [{ x: 2, y: 0 }, { x: 3, y: 1 }, { x: 4, y: 1 }];

// SUM(X) WHERE Y > 0     -> 7
console.log(Enumerable.From(data).Where("$.y > 0").Sum("$.x"));

// LIST(X) WHERE Y > 0    -> [3, 4]
console.log(Enumerable.From(data).Where("$.y > 0").Select("$.x").ToArray());

<script src="https://cdnjs.cloudflare.com/ajax/libs/linq.js/2.2.0.2/linq.js"></script>


Check out https://github.com/niclasko/Cypher.js (note: I'm the author)

It's a zero-dependency Javascript implementation of the Cypher graph database query language along with a graph database. It runs in the browser (tested with Firefox, Chrome, IE).

With relevance to the question, it can be used to query JSON endpoints:

load json from "http://url/endpoint" as l return l limit 10

Here's an example of querying a complex JSON document and performing analysis on it:

Cypher.js JSON query example



I've just finished a releasable version of a client-side JS lib (defiant.js) that does what you're looking for. With defiant.js, you can query a JSON structure with the XPath expressions you're familiar with (no new syntax expressions as in JSONPath).

Example of how it works (see it in browser here http://defiantjs.com/defiant.js/demo/sum.avg.htm):

var data = [
       { "x": 2, "y": 0 },
       { "x": 3, "y": 1 },
       { "x": 4, "y": 1 },
       { "x": 2, "y": 1 }
    ],
    res = JSON.search( data, '//*[ y > 0 ]' );

console.log( res.sum('x') );
// 9
console.log( res.avg('x') );
// 3
console.log( res.min('x') );
// 2
console.log( res.max('x') );
// 4

As you can see, DefiantJS extends the global object JSON with a search function and the returned array is delivered with aggregate functions. DefiantJS contains a few other functionalities but those are out of the scope for this subject. Anywho, you can test the lib with a clientside XPath Evaluator. I think people not familiar with XPath will find this evaluator useful.
http://defiantjs.com/#xpath_evaluator

More information about defiant.js
http://defiantjs.com/
https://github.com/hbi99/defiant.js

I hope you find it useful... Regards


The current Jaql implementation targets large data processing using a Hadoop cluster, so it might be more than you need. However, it runs easily without a Hadoop cluster (but still requires the Hadoop code and its dependencies to get compiled, which are mostly included). A small implementation of Jaql that could be embedded in JavaScript and a browser would be a great addition to the project.

Your examples above are easily written in jaql:

$data = [{"x": 2, "y": 0}, {"x": 3, "y": 1}, {"x": 4, "y": 1}];

$data -> filter $.y > 0 -> transform $.x -> sum(); // 7

$data -> filter $.y > 0 -> transform $.x; // [3,4]

Of course, there's much more too. For example:

// Compute multiple aggregates and change nesting structure:
$data -> group by $y = $.y into { $y, s:sum($[*].x), n:count($), xs:$[*].x}; 
    // [{ "y": 0, "s": 2, "n": 1, "xs": [2]   },
    //  { "y": 1, "s": 7, "n": 2, "xs": [3,4] }]

// Join multiple data sets:
$more = [{ "y": 0, "z": 5 }, { "y": 1, "z": 6 }];
join $data, $more where $data.y == $more.y into {$data, $more};
    // [{ "data": { "x": 2, "y": 0 }, "more": { "y": 0, "z": 5 }},
    //  { "data": { "x": 3, "y": 1 }, "more": { "y": 1, "z": 6 }},
    //  { "data": { "x": 4, "y": 1 }, "more": { "y": 1, "z": 6 }}]

Jaql can be downloaded/discussed at http://code.google.com/p/jaql/


If you are using .NET then Json.NET supports LINQ queries over the top of JSON. This post has some examples. It supports filtering, mapping, grouping, etc.

