[javascript] Parse XLSX with Node and create json

Ok so I found this really well documented node_module called js-xlsx

Question: How can I parse an xlsx to output json?

Here is what the excel sheet looks like:

enter image description here

In the end the json should look like this:

[
   {
   "id": 1,
   "Headline": "Team: Sally Pearson",
   "Location": "Austrailia",
   "BodyText": "...",
   "Media: "..."
   },
   {
   "id": 2,
   "Headline": "Team: Rebeca Andrade",
   "Location": "Brazil",
   "BodyText": "...",
   "Media: "..."
   }
]

index.js:

if(typeof require !== 'undefined') {
    console.log('hey');
    XLSX = require('xlsx');
}
var workbook = XLSX.readFile('./assets/visa.xlsx');
var sheet_name_list = workbook.SheetNames;
sheet_name_list.forEach(function(y) { /* iterate through sheets */
  var worksheet = workbook.Sheets[y];
  for (z in worksheet) {
    /* all keys that do not begin with "!" correspond to cell addresses */
    if(z[0] === '!') continue;
    // console.log(y + "!" + z + "=" + JSON.stringify(worksheet[z].v));

  }

});
XLSX.writeFile(workbook, 'out.xlsx');

This question is related to javascript json node.js excel xlsx

The answer is


here's angular 5 method version of this with unminified syntax for those who struggling with that y, z, tt in accepted answer. usage: parseXlsx().subscribe((data)=> {...})

parseXlsx() {
    let self = this;
    return Observable.create(observer => {
        this.http.get('./assets/input.xlsx', { responseType: 'arraybuffer' }).subscribe((data: ArrayBuffer) => {
            const XLSX = require('xlsx');
            let file = new Uint8Array(data);
            let workbook = XLSX.read(file, { type: 'array' });
            let sheetNamesList = workbook.SheetNames;

            let allLists = {};
            sheetNamesList.forEach(function (sheetName) {
                let worksheet = workbook.Sheets[sheetName];
                let currentWorksheetHeaders: object = {};
                let data: Array<any> = [];
                for (let cellName in worksheet) {//cellNames example: !ref,!margins,A1,B1,C1

                    //skipping serviceCells !margins,!ref
                    if (cellName[0] === '!') {
                        continue
                    };

                    //parse colName, rowNumber, and getting cellValue
                    let numberPosition = self.getCellNumberPosition(cellName);
                    let colName = cellName.substring(0, numberPosition);
                    let rowNumber = parseInt(cellName.substring(numberPosition));
                    let cellValue = worksheet[cellName].w;// .w is XLSX property of parsed worksheet

                    //treating '-' cells as empty on Spot Indices worksheet
                    if (cellValue.trim() == "-") {
                        continue;
                    }

                    //storing header column names
                    if (rowNumber == 1 && cellValue) {
                        currentWorksheetHeaders[colName] = typeof (cellValue) == "string" ? cellValue.toCamelCase() : cellValue;
                        continue;
                    }

                    //creating empty object placeholder to store current row
                    if (!data[rowNumber]) {
                        data[rowNumber] = {}
                    };

                    //if header is date - for spot indices headers are dates
                    data[rowNumber][currentWorksheetHeaders[colName]] = cellValue;

                }

                //dropping first two empty rows
                data.shift();
                data.shift();
                allLists[sheetName.toCamelCase()] = data;
            });

            this.parsed = allLists;

            observer.next(allLists);
            observer.complete();
        })
    });
}

You can also use

var XLSX = require('xlsx');
var workbook = XLSX.readFile('Master.xlsx');
var sheet_name_list = workbook.SheetNames;
console.log(XLSX.utils.sheet_to_json(workbook.Sheets[sheet_name_list[0]]))

I think this code will do what you want. It stores the first row as a set of headers, then stores the rest in a data object which you can write to disk as JSON.

var XLSX = require('xlsx');
var workbook = XLSX.readFile('test.xlsx');
var sheet_name_list = workbook.SheetNames;
sheet_name_list.forEach(function(y) {
    var worksheet = workbook.Sheets[y];
    var headers = {};
    var data = [];
    for(z in worksheet) {
        if(z[0] === '!') continue;
        //parse out the column, row, and value
        var col = z.substring(0,1);
        var row = parseInt(z.substring(1));
        var value = worksheet[z].v;

        //store header names
        if(row == 1) {
            headers[col] = value;
            continue;
        }

        if(!data[row]) data[row]={};
        data[row][headers[col]] = value;
    }
    //drop those first two rows which are empty
    data.shift();
    data.shift();
    console.log(data);
});

prints out

[ { id: 1,
    headline: 'team: sally pearson',
    location: 'Australia',
    'body text': 'majority have…',
    media: 'http://www.youtube.com/foo' },
  { id: 2,
    headline: 'Team: rebecca',
    location: 'Brazil',
    'body text': 'it is a long established…',
    media: 'http://s2.image.foo/' } ]

I found a better way of doing this

  function genrateJSONEngine() {
    var XLSX = require('xlsx');
    var workbook = XLSX.readFile('test.xlsx');
    var sheet_name_list = workbook.SheetNames;
    sheet_name_list.forEach(function (y) {
      var array = workbook.Sheets[y];

      var first = array[0].join()
      var headers = first.split(',');

      var jsonData = [];
      for (var i = 1, length = array.length; i < length; i++) {

        var myRow = array[i].join();
        var row = myRow.split(',');

        var data = {};
        for (var x = 0; x < row.length; x++) {
          data[headers[x]] = row[x];
        }
        jsonData.push(data);

      }

**podria ser algo asi en react y electron**

 xslToJson = workbook => {
        //var data = [];
        var sheet_name_list = workbook.SheetNames[0];
        return XLSX.utils.sheet_to_json(workbook.Sheets[sheet_name_list], {
            raw: false,
            dateNF: "DD-MMM-YYYY",
            header:1,
            defval: ""
        });
    };

    handleFile = (file /*:File*/) => {
        /* Boilerplate to set up FileReader */
        const reader = new FileReader();
        const rABS = !!reader.readAsBinaryString;

        reader.onload = e => {
            /* Parse data */
            const bstr = e.target.result;
            const wb = XLSX.read(bstr, { type: rABS ? "binary" : "array" });
            /* Get first worksheet */
            let arr = this.xslToJson(wb);

            console.log("arr ", arr)
            var dataNueva = []

            arr.forEach(data => {
                console.log("data renaes ", data)
            })
            // this.setState({ DataEESSsend: dataNueva })
            console.log("dataNueva ", dataNueva)

        };


        if (rABS) reader.readAsBinaryString(file);
        else reader.readAsArrayBuffer(file);
    };

    handleChange = e => {
        const files = e.target.files;
        if (files && files[0]) {
            this.handleFile(files[0]);
        }
    };

Examples related to javascript

need to add a class to an element How to make a variable accessible outside a function? Hide Signs that Meteor.js was Used How to create a showdown.js markdown extension Please help me convert this script to a simple image slider Highlight Anchor Links when user manually scrolls? Summing radio input values How to execute an action before close metro app WinJS javascript, for loop defines a dynamic variable name Getting all files in directory with ajax

Examples related to json

Use NSInteger as array index Uncaught SyntaxError: Unexpected end of JSON input at JSON.parse (<anonymous>) HTTP POST with Json on Body - Flutter/Dart Importing json file in TypeScript json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 190) Angular 5 Service to read local .json file How to import JSON File into a TypeScript file? Use Async/Await with Axios in React.js Uncaught SyntaxError: Unexpected token u in JSON at position 0 how to remove json object key and value.?

Examples related to node.js

Hide Signs that Meteor.js was Used Querying date field in MongoDB with Mongoose SyntaxError: Cannot use import statement outside a module Server Discovery And Monitoring engine is deprecated How to fix ReferenceError: primordials is not defined in node UnhandledPromiseRejectionWarning: This error originated either by throwing inside of an async function without a catch block dyld: Library not loaded: /usr/local/opt/icu4c/lib/libicui18n.62.dylib error running php after installing node with brew on Mac internal/modules/cjs/loader.js:582 throw err DeprecationWarning: Buffer() is deprecated due to security and usability issues when I move my script to another server Please run `npm cache clean`

Examples related to excel

Python: Pandas pd.read_excel giving ImportError: Install xlrd >= 0.9.0 for Excel support Converting unix time into date-time via excel How to increment a letter N times per iteration and store in an array? 'Microsoft.ACE.OLEDB.16.0' provider is not registered on the local machine. (System.Data) How to import an Excel file into SQL Server? Copy filtered data to another sheet using VBA Better way to find last used row Could pandas use column as index? Check if a value is in an array or not with Excel VBA How to sort dates from Oldest to Newest in Excel?

Examples related to xlsx

How to save .xlsx data to file as a blob Parse XLSX with Node and create json Easy way to export multiple data.frame to multiple Excel worksheets Using Pandas to pd.read_excel() for multiple worksheets of the same workbook Python convert csv to xlsx How to Bulk Insert from XLSX file extension? Convert xlsx to csv in Linux with command line Importing Excel files into R, xlsx or xls Convert xlsx file to csv using batch Excel "External table is not in the expected format."