[node.js] Read file from AWS S3 bucket using node fs

I am attempting to read a file that is in an AWS S3 bucket using

fs.readFile(file, function (err, contents) {
  var myLines = contents.Body.toString().split('\n')
})

I've been able to download and upload a file using the node aws-sdk, but I am at a loss as to how to simply read it and parse the contents.

Here is an example of how I am reading the file from S3:

var s3 = new AWS.S3();
var params = {Bucket: 'myBucket', Key: 'myKey.csv'}
var s3file = s3.getObject(params)

Tags: node.js, amazon-web-services, amazon-s3, fs

The answers:


var AWS = require('aws-sdk');
var fs = require('fs');
var s3 = new AWS.S3();

var fileStream = fs.createWriteStream('/path/to/file.jpg');
var s3Stream = s3.getObject({Bucket: 'myBucket', Key: 'myImageFile.jpg'}).createReadStream();

// Listen for errors returned by the service
s3Stream.on('error', function(err) {
    // NoSuchKey: The specified key does not exist
    console.error(err);
});

s3Stream.pipe(fileStream).on('error', function(err) {
    // capture any errors that occur when writing data to the file
    console.error('File Stream:', err);
}).on('close', function() {
    console.log('Done.');
});

Reference: https://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/requests-using-stream-objects.html
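
Not from the original answer, but worth noting: on Node 10+ the built-in stream.pipeline helper forwards errors from both streams to a single callback, which avoids wiring up the two separate error handlers above. A minimal sketch, reusing the same placeholder bucket, key, and path:

const { pipeline } = require('stream');
const fs = require('fs');
const AWS = require('aws-sdk');

const s3 = new AWS.S3();

pipeline(
    s3.getObject({Bucket: 'myBucket', Key: 'myImageFile.jpg'}).createReadStream(),
    fs.createWriteStream('/path/to/file.jpg'),
    function(err) {
        // fires once, with the first error from either stream, or null on success
        if (err) console.error('Pipeline failed:', err);
        else console.log('Done.');
    }
);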


Since you seem to want to process an S3 text file line by line, here is a Node version that uses the standard readline module and AWS's createReadStream():

const readline = require('readline');

const rl = readline.createInterface({
    // s3 and params as defined in the question
    input: s3.getObject(params).createReadStream()
});

rl.on('line', function(line) {
    console.log(line);
})
.on('close', function() {
    // all lines have been read
});
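
On newer Node versions (11.4+) the readline interface is also async-iterable, so the same loop can be written with for await...of. A minimal sketch, assuming s3 and params are set up as in the question:

const readline = require('readline');

async function printLines() {
    const rl = readline.createInterface({
        input: s3.getObject(params).createReadStream()
    });

    // each iteration yields one line of the S3 object
    for await (const line of rl) {
        console.log(line);
    }
}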

This will do it:

new AWS.S3().getObject({ Bucket: this.awsBucketName, Key: keyName }, function(err, data)
{
    if (!err)
        console.log(data.Body.toString()); // data.Body is a Buffer
});

I had exactly the same issue when downloading very large files from S3.

The example solution from AWS docs just does not work:

var file = fs.createWriteStream(options.filePath);
file.on('close', function(){
    if(self.logger) self.logger.info("S3Dataset file download saved to %s", options.filePath );
    return callback(null,done);
});
s3.getObject({ Bucket: this._options.s3.Bucket, Key: documentKey }).createReadStream().on('error', function(error) {
    if(self.logger) self.logger.error("S3Dataset download error key:%s error:%@", options.fileName, error);
    return callback(error);
}).pipe(file);

While this solution will work:

var file = fs.createWriteStream(options.filePath);
s3.getObject({ Bucket: this._options.s3.Bucket, Key: documentKey })
.on('error', function(error) {
    if(self.logger) self.logger.error("S3Dataset download error key:%s error:%@", options.fileName, error);
    return callback(error);
})
.on('httpData', function(chunk) { file.write(chunk); })
.on('httpDone', function() {
    file.end();
    if(self.logger) self.logger.info("S3Dataset file download saved to %s", options.filePath );
    return callback(null,done);
})
.send();

The createReadStream attempt just does not fire the end, close, or error callback for some reason.

I'm also using that solution for downloading and unpacking gzip archives, since the first one (the AWS example) does not work in this case either:

var gunzip = zlib.createGunzip();
var file = fs.createWriteStream( options.filePath );

s3.getObject({ Bucket: this._options.s3.Bucket, Key: documentKey })
.on('error', function (error) {
    if(self.logger) self.logger.error("%@",error);
    return callback(error);
})
.on('httpData', function (chunk) {
    file.write(chunk);
})
.on('httpDone', function () {

    file.end();

    if(self.logger) self.logger.info("downloadArchive downloaded %s", options.filePath);

    fs.createReadStream( options.filePath )
    .on('error', (error) => {
        return callback(error);
    })
    .on('end', () => {
        if(self.logger) self.logger.info("downloadArchive unarchived %s", options.fileDest);
        return callback(null, options.fileDest);
    })
    .pipe(gunzip)
    .pipe(fs.createWriteStream(options.fileDest))
})
.send();
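
Not part of the original answer, but since the httpData/httpDone pattern is what fires reliably here, the chunks can also be written straight into the gunzip stream, skipping the intermediate file. A sketch under the same assumptions (options, documentKey, self, and callback come from the author's surrounding code):

var gunzip = zlib.createGunzip();

gunzip.on('error', function (error) { return callback(error); });

// decompress while downloading; no temporary archive on disk
gunzip.pipe(fs.createWriteStream(options.fileDest))
.on('error', function (error) { return callback(error); })
.on('finish', function () {
    if(self.logger) self.logger.info("downloadArchive unarchived %s", options.fileDest);
    return callback(null, options.fileDest);
});

s3.getObject({ Bucket: this._options.s3.Bucket, Key: documentKey })
.on('error', function (error) { return callback(error); })
.on('httpData', function (chunk) { gunzip.write(chunk); })
.on('httpDone', function () { gunzip.end(); })
.send();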

I prefer Buffer.from(data.Body).toString('utf8'), since it supports encoding parameters. With other AWS services (e.g. Kinesis Streams) someone may want to replace the 'utf8' encoding with 'base64'.

new AWS.S3().getObject(
  { Bucket: this.awsBucketName, Key: keyName }, 
  function(err, data) {
    if (!err) {
      const body = Buffer.from(data.Body).toString('utf8');
      console.log(body);
    }
  }
);

I haven't figured out why yet, but the createReadStream/pipe approach didn't work for me. I was trying to download a large CSV file (300MB+) and got duplicated lines. It seemed to be a random issue; the final file size varied on each attempt to download it.

I ended up using another way, based on AWS JS SDK examples:

var AWS = require('aws-sdk');
var s3 = new AWS.S3();
var params = {Bucket: 'myBucket', Key: 'myImageFile.jpg'};
var file = require('fs').createWriteStream('/path/to/file.jpg');

s3.getObject(params).
    on('httpData', function(chunk) { file.write(chunk); }).
    on('httpDone', function() { file.end(); }).
    send();

This way, it worked like a charm.


If you are looking to avoid the callbacks, you can take advantage of the SDK's .promise() function like this:

const s3 = new AWS.S3();
const params = {Bucket: 'myBucket', Key: 'myKey.csv'};
const response = await s3.getObject(params).promise(); // await the promise (must be inside an async function)
const fileContent = response.Body.toString('utf-8');   // can also do 'base64' here if desired

I'm sure the other ways mentioned here have their advantages but this works great for me. Sourced from this thread (see the last response from AWS): https://forums.aws.amazon.com/thread.jspa?threadID=116788
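
Since await is only valid inside an async function, a complete minimal sketch of the above might look like this (the bucket and key are placeholders):

const AWS = require('aws-sdk');
const s3 = new AWS.S3();

async function readFileFromS3() {
    try {
        const response = await s3.getObject({Bucket: 'myBucket', Key: 'myKey.csv'}).promise();
        // Body is a Buffer; pick whichever encoding you need
        return response.Body.toString('utf-8');
    } catch (err) {
        console.error('getObject failed:', err);
        throw err;
    }
}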


Here is an example I used to retrieve and parse JSON data from S3.

var params = {Bucket: BUCKET_NAME, Key: KEY_NAME};
new AWS.S3().getObject(params, function(err, json_data)
{
    if (!err) {
        var json = JSON.parse(Buffer.from(json_data.Body).toString("utf8"));

        // PROCESS JSON DATA
        ......
    }
});

With the new version of the SDK, the accepted answer does not work: it does not wait for the object to be downloaded. The following code snippet will help with the new version:

// dependencies
const AWS = require('aws-sdk');

// get reference to S3 client
const s3 = new AWS.S3();

exports.handler = async (event, context, callback) => {

    var bucket = "TestBucket";
    var key = "TestKey";

    try {
        const params = {
            Bucket: bucket,
            Key: key
        };

        var theObject = await s3.getObject(params).promise();
    } catch (error) {
        console.log(error);
        return;
    }
};
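
For completeness, there is also the v3 JavaScript SDK (@aws-sdk/client-s3), where getObject becomes a GetObjectCommand and Body comes back as a stream you consume yourself. A hedged sketch, assuming a recent v3 release (the region, bucket, and key are placeholders):

const { S3Client, GetObjectCommand } = require('@aws-sdk/client-s3');

const client = new S3Client({ region: 'us-east-1' });

async function getObjectAsString(bucket, key) {
    const response = await client.send(new GetObjectCommand({ Bucket: bucket, Key: key }));

    // in Node, response.Body is a readable stream; collect it into one Buffer
    const chunks = [];
    for await (const chunk of response.Body) {
        chunks.push(chunk);
    }
    return Buffer.concat(chunks).toString('utf-8');
}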

If you want to save memory and obtain each row as a JSON object, you can use fast-csv to create a read stream over the S3 object, as follows:

const csv = require('fast-csv');
const AWS = require('aws-sdk');

const credentials = new AWS.Credentials("ACCESSKEY", "SECRETKEY", "SESSIONTOKEN");
AWS.config.update({
    credentials: credentials, // credentials required for local execution
    region: 'your_region'
});
const dynamoS3Bucket = new AWS.S3();
const stream = dynamoS3Bucket.getObject({ Bucket: 'your_bucket', Key: 'example.csv' }).createReadStream();

var parser = csv.fromStream(stream, { headers: true }).on("data", function (data) {
    parser.pause();  //can pause reading using this at a particular row
    parser.resume(); // to continue reading
    console.log(data);
}).on("end", function () {
    console.log('process finished');
});
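
Note that csv.fromStream is the fast-csv 2.x API; in fast-csv 3 and later the entry point is csv.parseStream. A sketch of the same idea against the newer API (assuming a recent fast-csv release, reusing the stream variable from above):

const csv = require('fast-csv');

// reuse the same S3 read stream as above
csv.parseStream(stream, { headers: true })
    .on('error', error => console.error(error))
    .on('data', row => console.log(row)) // each row arrives as a JSON object
    .on('end', rowCount => console.log(`process finished, ${rowCount} rows`));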
