[node.js] Synchronous request in Node.js

If I need to call 3 http API in sequential order, what would be a better alternative to the following code:

http.get({ host: 'www.example.com', path: '/api_1.php' }, function(res) { 
  res.on('data', function(d) { 

    http.get({ host: 'www.example.com', path: '/api_2.php' }, function(res) { 
      res.on('data', function(d) { 

        http.get({ host: 'www.example.com', path: '/api_3.php' }, function(res) { 
          res.on('data', function(d) { 


          });
        });
        }
      });
    });
    }
  });
});
}

This question is related to node.js synchronization

The answer is


I actually got exactly what you (and me) wanted, without the use of await, Promises, or inclusions of any (external) library (except our own).

Here's how to do it:

We're going to make a C++ module to go with node.js, and that C++ module function will make the HTTP request and return the data as a string, and you can use that directly by doing:

var myData = newModule.get(url);

ARE YOU READY to get started?

Step 1: make a new folder somewhere else on your computer, we're only using this folder to build the module.node file (compiled from C++), you can move it later.

In the new folder (I put mine in mynewFolder/src for organize-ness):

npm init

then

npm install node-gyp -g

now make 2 new files: 1, called something.cpp and for put this code in it (or modify it if you want):

#pragma comment(lib, "urlmon.lib")
#include <sstream>
#include <WTypes.h>  
#include <node.h>
#include <urlmon.h> 
#include <iostream>
using namespace std;
using namespace v8;

Local<Value> S(const char* inp, Isolate* is) {
    return String::NewFromUtf8(
        is,
        inp,
        NewStringType::kNormal
    ).ToLocalChecked();
}

Local<Value> N(double inp, Isolate* is) {
    return Number::New(
        is,
        inp
    );
}

const char* stdStr(Local<Value> str, Isolate* is) {
    String::Utf8Value val(is, str);
    return *val;
}

double num(Local<Value> inp) {
    return inp.As<Number>()->Value();
}

Local<Value> str(Local<Value> inp) {
    return inp.As<String>();
}

Local<Value> get(const char* url, Isolate* is) {
    IStream* stream;
    HRESULT res = URLOpenBlockingStream(0, url, &stream, 0, 0);

    char buffer[100];
    unsigned long bytesReadSoFar;
    stringstream ss;
    stream->Read(buffer, 100, &bytesReadSoFar);
    while(bytesReadSoFar > 0U) {
        ss.write(buffer, (long long) bytesReadSoFar);
        stream->Read(buffer, 100, &bytesReadSoFar);
    }
    stream->Release();
    const string tmp = ss.str();
    const char* cstr = tmp.c_str();
    return S(cstr, is);
}

void Hello(const FunctionCallbackInfo<Value>& arguments) {
    cout << "Yo there!!" << endl;

    Isolate* is = arguments.GetIsolate();
    Local<Context> ctx = is->GetCurrentContext();

    const char* url = stdStr(arguments[0], is);
    Local<Value> pg = get(url,is);

    Local<Object> obj = Object::New(is);
    obj->Set(ctx,
        S("result",is),
        pg
    );
    arguments.GetReturnValue().Set(
       obj
    );

}

void Init(Local<Object> exports) {
    NODE_SET_METHOD(exports, "get", Hello);
}

NODE_MODULE(cobypp, Init);

Now make a new file in the same directory called something.gyp and put (something like) this in it:

{
   "targets": [
       {
           "target_name": "cobypp",
           "sources": [ "src/cobypp.cpp" ]
       }
   ]
}

Now in the package.json file, add: "gypfile": true,

Now: in the console, node-gyp rebuild

If it goes through the whole command and says "ok" at the end with no errors, you're (almost) good to go, if not, then leave a comment..

But if it works then go to build/Release/cobypp.node (or whatever its called for you), copy it into your main node.js folder, then in node.js:

var myCPP = require("./cobypp")
var myData = myCPP.get("http://google.com").result;
console.log(myData);

..

response.end(myData);//or whatever

Super Request

This is another synchronous module that is based off of request and uses promises. Super simple to use, works well with mocha tests.

npm install super-request

request("http://domain.com")
    .post("/login")
    .form({username: "username", password: "password"})
    .expect(200)
    .expect({loggedIn: true})
    .end() //this request is done 
    //now start a new one in the same session 
    .get("/some/protected/route")
    .expect(200, {hello: "world"})
    .end(function(err){
        if(err){
            throw err;
        }
    });

This code can be used to execute an array of promises synchronously & sequentially after which you can execute your final code in the .then() call.

const allTasks = [() => promise1, () => promise2, () => promise3];

function executePromisesSync(tasks) {
  return tasks.reduce((task, nextTask) => task.then(nextTask), Promise.resolve());
}

executePromisesSync(allTasks).then(
  result => console.log(result),
  error => console.error(error)
);

This has been answered well by Raynos. Yet there have been changes in the sequence library since the answer has been posted.

To get sequence working, follow this link: https://github.com/FuturesJS/sequence/tree/9daf0000289954b85c0925119821752fbfb3521e.

This is how you can get it working after npm install sequence:

var seq = require('sequence').Sequence;
var sequence = seq.create();

seq.then(function call 1).then(function call 2);

Here's my version of @andy-shin sequently with arguments in array instead of index:

function run(funcs, args) {
    var i = 0;
    var recursive = function() {
        funcs[i](function() {
            i++;
            if (i < funcs.length)
                recursive();
        }, args[i]);
    };
    recursive();
}

You could do this using my Common Node library:

function get(url) {
  return new (require('httpclient').HttpClient)({
    method: 'GET',
      url: url
    }).finish().body.read().decodeToString();
}

var a = get('www.example.com/api_1.php'), 
    b = get('www.example.com/api_2.php'),
    c = get('www.example.com/api_3.php');

As of 2018 and using ES6 modules and Promises, we can write a function like that :

import { get } from 'http';

export const fetch = (url) => new Promise((resolve, reject) => {
  get(url, (res) => {
    let data = '';
    res.on('end', () => resolve(data));
    res.on('data', (buf) => data += buf.toString());
  })
    .on('error', e => reject(e));
});

and then in another module

let data;
data = await fetch('http://www.example.com/api_1.php');
// do something with data...
data = await fetch('http://www.example.com/api_2.php');
// do something with data
data = await fetch('http://www.example.com/api_3.php');
// do something with data

The code needs to be executed in an asynchronous context (using async keyword)


I landed here because I needed to rate-limit http.request (~10k aggregation queries to elastic search to build an analytical report). The following just choked my machine.

for (item in set) {
    http.request(... + item + ...);
}

My URLs are very simple so this may not trivially apply to the original question but I think it's both potentially applicable and worth writing here for readers that land here with issues similar to mine and who want a trivial JavaScript no-library solution.

My job wasn't order dependent and my first approach to bodging this was to wrap it in a shell script to chunk it (because I'm new to JavaScript). That was functional but not satisfactory. My JavaScript resolution in the end was to do the following:

var stack=[];
stack.push('BOTTOM');

function get_top() {
  var top = stack.pop();
  if (top != 'BOTTOM')
    collect(top);
}

function collect(item) {
    http.request( ... + item + ...
    result.on('end', function() {
      ...
      get_top();
    });
    );
}

for (item in set) {
   stack.push(item);
}

get_top();

It looks like mutual recursion between collect and get_top. I'm not sure it is in effect because the system is asynchronous and the function collect completes with a callback stashed for the event at on.('end'.

I think it is general enough to apply to the original question. If, like my scenario, the sequence/set is known, all URLs/keys can be pushed on the stack in one step. If they are calculated as you go, the on('end' function can push the next url on the stack just before get_top(). If anything, the result has less nesting and might be easier to refactor when the API you're calling changes.

I realise this is effectively equivalent to the @generalhenry's simple recursive version above (so I upvoted that!)


...4 years later...

Here is an original solution with the framework Danf (you don't need any code for this kind of things, only some config):

// config/common/config/sequences.js

'use strict';

module.exports = {
    executeMySyncQueries: {
        operations: [
            {
                order: 0,
                service: 'danf:http.router',
                method: 'follow',
                arguments: [
                    'www.example.com/api_1.php',
                    'GET'
                ],
                scope: 'response1'
            },
            {
                order: 1,
                service: 'danf:http.router',
                method: 'follow',
                arguments: [
                    'www.example.com/api_2.php',
                    'GET'
                ],
                scope: 'response2'
            },
            {
                order: 2,
                service: 'danf:http.router',
                method: 'follow',
                arguments: [
                    'www.example.com/api_3.php',
                    'GET'
                ],
                scope: 'response3'
            }
        ]
    }
};

Use the same order value for operations you want to be executed in parallel.

If you want to be even shorter, you can use a collection process:

// config/common/config/sequences.js

'use strict';

module.exports = {
    executeMySyncQueries: {
        operations: [
            {
                service: 'danf:http.router',
                method: 'follow',
                // Process the operation on each item
                // of the following collection.
                collection: {
                    // Define the input collection.
                    input: [
                        'www.example.com/api_1.php',
                        'www.example.com/api_2.php',
                        'www.example.com/api_3.php'
                    ],
                    // Define the async method used.
                    // You can specify any collection method
                    // of the async lib.
                    // '--' is a shorcut for 'forEachOfSeries'
                    // which is an execution in series.
                    method: '--'
                },
                arguments: [
                    // Resolve reference '@@.@@' in the context
                    // of the input item.
                    '@@.@@',
                    'GET'
                ],
                // Set the responses in the property 'responses'
                // of the stream.
                scope: 'responses'
            }
        ]
    }
};

Take a look at the overview of the framework for more informations.


Another possibility is to set up a callback that tracks completed tasks:

function onApiResults(requestId, response, results) {
    requestsCompleted |= requestId;

    switch(requestId) {
        case REQUEST_API1:
            ...
            [Call API2]
            break;
        case REQUEST_API2:
            ...
            [Call API3]
            break;
        case REQUEST_API3:
            ...
            break;
    }

    if(requestId == requestsNeeded)
        response.end();
}

Then simply assign an ID to each and you can set up your requirements for which tasks must be completed before closing the connection.

const var REQUEST_API1 = 0x01;
const var REQUEST_API2 = 0x02;
const var REQUEST_API3 = 0x03;
const var requestsNeeded = REQUEST_API1 | REQUEST_API2 | REQUEST_API3;

Okay, it's not pretty. It is just another way to make sequential calls. It's unfortunate that NodeJS does not provide the most basic synchronous calls. But I understand what the lure is to asynchronicity.


There are lots of control flow libraries -- I like conseq (... because I wrote it.) Also, on('data') can fire several times, so use a REST wrapper library like restler.

Seq()
  .seq(function () {
    rest.get('http://www.example.com/api_1.php').on('complete', this.next);
  })
  .seq(function (d1) {
    this.d1 = d1;
    rest.get('http://www.example.com/api_2.php').on('complete', this.next);
  })
  .seq(function (d2) {
    this.d2 = d2;
    rest.get('http://www.example.com/api_3.php').on('complete', this.next);
  })
  .seq(function (d3) {
    // use this.d1, this.d2, d3
  })

I'd use a recursive function with a list of apis

var APIs = [ '/api_1.php', '/api_2.php', '/api_3.php' ];
var host = 'www.example.com';

function callAPIs ( host, APIs ) {
  var API = APIs.shift();
  http.get({ host: host, path: API }, function(res) { 
    var body = '';
    res.on('data', function (d) {
      body += d; 
    });
    res.on('end', function () {
      if( APIs.length ) {
        callAPIs ( host, APIs );
      }
    });
  });
}

callAPIs( host, APIs );

edit: request version

var request = require('request');
var APIs = [ '/api_1.php', '/api_2.php', '/api_3.php' ];
var host = 'www.example.com';
var APIs = APIs.map(function (api) {
  return 'http://' + host + api;
});

function callAPIs ( host, APIs ) {
  var API = APIs.shift();
  request(API, function(err, res, body) { 
    if( APIs.length ) {
      callAPIs ( host, APIs );
    }
  });
}

callAPIs( host, APIs );

edit: request/async version

var request = require('request');
var async = require('async');
var APIs = [ '/api_1.php', '/api_2.php', '/api_3.php' ];
var host = 'www.example.com';
var APIs = APIs.map(function (api) {
  return 'http://' + host + api;
});

async.eachSeries(function (API, cb) {
  request(API, function (err, res, body) {
    cb(err);
  });
}, function (err) {
  //called when all done, or error occurs
});

use sequenty.

sudo npm install sequenty

or

https://github.com/AndyShin/sequenty

very simple.

var sequenty = require('sequenty'); 

function f1(cb) // cb: callback by sequenty
{
  console.log("I'm f1");
  cb(); // please call this after finshed
}

function f2(cb)
{
  console.log("I'm f2");
  cb();
}

sequenty.run([f1, f2]);

also you can use a loop like this:

var f = [];
var queries = [ "select .. blah blah", "update blah blah", ...];

for (var i = 0; i < queries.length; i++)
{
  f[i] = function(cb, funcIndex) // sequenty gives you cb and funcIndex
  {
    db.query(queries[funcIndex], function(err, info)
    {
       cb(); // must be called
    });
  }
}

sequenty.run(f); // fire!

Using the request library can help minimize the cruft:

var request = require('request')

request({ uri: 'http://api.com/1' }, function(err, response, body){
    // use body
    request({ uri: 'http://api.com/2' }, function(err, response, body){
        // use body
        request({ uri: 'http://api.com/3' }, function(err, response, body){
            // use body
        })
    })
})

But for maximum awesomeness you should try some control-flow library like Step - it will also allow you to parallelize requests, assuming that it's acceptable:

var request = require('request')
var Step    = require('step')

// request returns body as 3rd argument
// we have to move it so it works with Step :(
request.getBody = function(o, cb){
    request(o, function(err, resp, body){
        cb(err, body)
    })
}

Step(
    function getData(){
        request.getBody({ uri: 'http://api.com/?method=1' }, this.parallel())
        request.getBody({ uri: 'http://api.com/?method=2' }, this.parallel())
        request.getBody({ uri: 'http://api.com/?method=3' }, this.parallel())
    },
    function doStuff(err, r1, r2, r3){
        console.log(r1,r2,r3)
    }
)

I like Raynos' solution as well, but I prefer a different flow control library.

https://github.com/caolan/async

Depending on whether you need the results in each subsequent function, I'd either use series, parallel, or waterfall.

Series when they have to be serially executed, but you don't necessarily need the results in each subsequent function call.

Parallel if they can be executed in parallel, you don't need the results from each during each parallel function, and you need a callback when all have completed.

Waterfall if you want to morph the results in each function and pass to the next

endpoints = 
 [{ host: 'www.example.com', path: '/api_1.php' },
  { host: 'www.example.com', path: '/api_2.php' },
  { host: 'www.example.com', path: '/api_3.php' }];

async.mapSeries(endpoints, http.get, function(results){
    // Array of results
});

sync-request

By far the most easiest one I've found and used is sync-request and it supports both node and the browser!

var request = require('sync-request');
var res = request('GET', 'http://google.com');
console.log(res.body.toString('utf-8'));

That's it, no crazy configuration, no complex lib installs, although it does have a lib fallback. Just works. I've tried other examples here and was stumped when there was much extra setup to do or installs didn't work!

Notes:

The example that sync-request uses doesn't play nice when you use res.getBody(), all get body does is accept an encoding and convert the response data. Just do res.body.toString(encoding) instead.


It seems solutions for this problem is never-ending, here's one more :)

// do it once.
sync(fs, 'readFile')

// now use it anywhere in both sync or async ways.
var data = fs.readFile(__filename, 'utf8')

http://alexeypetrushin.github.com/synchronize