Quick way to list all files in Amazon S3 bucket?

Question

I have an Amazon S3 bucket that has tens of thousands of filenames in it. What's the easiest way to get a text file that lists all the filenames in the bucket?

User · Answer

The EASIEST way to get a very usable text file is to download S3 Browser http://s3browser.com/ and use the Web URLs Generator to produce a list of complete link paths. It is very handy and involves about 3 clicks.

-Browse to Folder
-Select All
-Generate Urls

Best of luck to you.

User · Answer

Here's a way to use the stock AWS CLI to generate a diff-able list of just object names:

    aws s3api list-objects --bucket "$BUCKET" --query "Contents[].{Key: Key}" --output text

(based on https://stackoverflow.com/a/54378943/53529)

This gives you the full object name of every object in the bucket, separated by new lines. Useful if you want to diff between the contents of an S3 bucket and a GCS bucket, for example.
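Once two such listings are saved, the comparison itself takes only a few lines. This is a minimal sketch, assuming each listing holds one object key per line; the function name diff_key_lists and the file names in the usage comment are made up for illustration.

```python
def diff_key_lists(left_lines, right_lines):
    """Return (only_in_left, only_in_right) as sorted lists of keys.

    Each argument is an iterable of lines, one object key per line.
    """
    # Strip only line terminators so keys containing spaces survive intact.
    left = {line.rstrip("\r\n") for line in left_lines}
    right = {line.rstrip("\r\n") for line in right_lines}
    # Ignore blank lines, which are not valid keys.
    left.discard("")
    right.discard("")
    return sorted(left - right), sorted(right - left)


# Usage (hypothetical file names):
#   with open("s3_keys.txt") as a, open("gcs_keys.txt") as b:
#       missing_in_gcs, missing_in_s3 = diff_key_lists(a, b)
```

Using sets keeps the comparison independent of the order in which each service lists its objects.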

User · Answer

You can list all the files in the AWS S3 bucket using the command

    aws s3 ls path/to/file

and to save it in a file, use

    aws s3 ls path/to/file >> save_result.txt

if you want to append your result to a file, otherwise:

    aws s3 ls path/to/file > save_result.txt

if you want to clear what was written before.

It will work both on Windows and Linux.

User · Answer

For Scala developers, here is a recursive function to execute a full scan and map the contents of an AmazonS3 bucket using the official AWS SDK for Java:

    import com.amazonaws.services.s3.AmazonS3Client
    import com.amazonaws.services.s3.model.{S3ObjectSummary, ObjectListing, GetObjectRequest}
    import scala.collection.JavaConversions.{collectionAsScalaIterable => asScala}

    def map[T](s3: AmazonS3Client, bucket: String, prefix: String)(f: (S3ObjectSummary) => T) = {

      def scan(acc: List[T], listing: ObjectListing): List[T] = {
        val summaries = asScala[S3ObjectSummary](listing.getObjectSummaries())
        val mapped = (for (summary <- summaries) yield f(summary)).toList

        // On the last page, return everything accumulated so far plus this page.
        if (!listing.isTruncated) acc ::: mapped
        else scan(acc ::: mapped, s3.listNextBatchOfObjects(listing))
      }

      scan(List(), s3.listObjects(bucket, prefix))
    }

To invoke the above curried map() function, simply pass the already constructed (and properly initialized) AmazonS3Client object (refer to the official AWS SDK for Java API Reference), the bucket name and the prefix name in the first parameter list. Also pass the function f() you want to apply to map each object summary in the second parameter list.

For example

    val keyOwnerTuples = map(s3, bucket, prefix)(s => (s.getKey, s.getOwner))

will return the full list of (key, owner) tuples in that bucket/prefix

or

    map(s3, "bucket", "prefix")(s => println(s))

as you would normally approach by Monads in Functional Programming.

User · Answer

In Java you can get the keys using ListObjects (see AWS documentation):

    FileWriter fileWriter;
    BufferedWriter bufferedWriter;
    // [...]

    AmazonS3 s3client = new AmazonS3Client(new ProfileCredentialsProvider());

    ListObjectsRequest listObjectsRequest = new ListObjectsRequest()
        .withBucketName(bucketName)
        .withPrefix("myprefix");
    ObjectListing objectListing;

    do {
        objectListing = s3client.listObjects(listObjectsRequest);
        for (S3ObjectSummary objectSummary :
            objectListing.getObjectSummaries()) {
            // write to file with e.g. a bufferedWriter
            bufferedWriter.write(objectSummary.getKey());
        }
        listObjectsRequest.setMarker(objectListing.getNextMarker());
    } while (objectListing.isTruncated());

User · Answer

s3cmd is invaluable for this kind of thing:

    $ s3cmd ls -r s3://yourbucket/ | awk '{print $4}' > objects_in_bucket

User · Answer

I know it's an old topic, but I'd like to contribute too.

With the newer version of boto3 and Python, you can get the files as follows:

    import os
    import boto3
    from botocore.exceptions import ClientError

    client = boto3.client('s3')

    bucket = client.list_objects(Bucket=BUCKET_NAME)
    for content in bucket["Contents"]:
        key = content["Key"]  # the response field is "Key", with a capital K

Keep in mind that this solution does not handle pagination.

For more information: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.list_objects
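For buckets with more objects than a single response returns, boto3's built-in paginator can follow the continuation for you. The following is a sketch rather than a definitive implementation: the function name iter_keys and the bucket name in the usage comment are assumptions, and the client is passed in so that credentials and region stay the caller's concern.

```python
def iter_keys(client, bucket_name, prefix=None):
    """Yield every object key in the bucket, following pagination.

    `client` is expected to behave like boto3.client("s3").
    """
    kwargs = {"Bucket": bucket_name}
    if prefix:
        kwargs["Prefix"] = prefix
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(**kwargs):
        # "Contents" is absent from a page that holds no objects.
        for item in page.get("Contents", []):
            yield item["Key"]


# Usage (assumes boto3 is installed and credentials are configured;
# "my-bucket" is a placeholder):
#   import boto3
#   for key in iter_keys(boto3.client("s3"), "my-bucket"):
#       print(key)
```

Because the function is a generator, keys can be written to a file as they arrive instead of being held in memory all at once.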

User · Answer

Be careful, the Amazon list call only returns 1000 files at a time. If you want to iterate over all files you have to paginate the results using markers.

In Ruby using aws-s3:

    bucket_name = 'yourBucket'
    marker = ""

    AWS::S3::Base.establish_connection!(
      :access_key_id => 'your_access_key_id',
      :secret_access_key => 'your_secret_access_key'
    )

    loop do
      objects = Bucket.objects(bucket_name, :marker=>marker, :max_keys=>1000)
      break if objects.size == 0
      # The last key of this page is where the next page starts.
      marker = objects.last.key

      objects.each do |obj|
          puts "#{obj.key}"
      end
    end

Hope this helps, vincent
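The same marker loop can be sketched in Python. This assumes a client that behaves like boto3.client("s3") and uses its list_objects call with the Marker and MaxKeys parameters; the function name iter_keys_with_marker is made up for illustration.

```python
def iter_keys_with_marker(client, bucket_name, page_size=1000):
    """Yield all keys by requesting one page at a time with a marker."""
    marker = None
    while True:
        kwargs = {"Bucket": bucket_name, "MaxKeys": page_size}
        if marker is not None:
            # Ask only for keys that sort after the last one already seen.
            kwargs["Marker"] = marker
        response = client.list_objects(**kwargs)
        contents = response.get("Contents", [])
        for item in contents:
            yield item["Key"]
        # Stop when the service reports no further pages (or sent nothing).
        if not contents or not response.get("IsTruncated"):
            break
        marker = contents[-1]["Key"]
```

Each request resumes after the last key of the previous page, which is what the marker in the Ruby loop above does.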

User · Answer

The command below will get all the file names from your AWS S3 bucket and write them into a text file in your current directory:

    aws s3 ls s3://Bucketdirectory/Subdirectory/ | cat >> FileNames.txt

User · Answer

There are a couple of ways you can go about it.

Using Python:

    import boto3

    session = boto3.Session(aws_access_key_id, aws_secret_access_key)

    s3 = session.resource('s3')

    bucketName = 'testbucket133'
    bucket = s3.Bucket(bucketName)

    for obj in bucket.objects.all():
        print(obj.key)

Another way is using the AWS CLI for it:

    aws s3 ls s3://{bucketname}

example:

    aws s3 ls s3://testbucket133

User · Answer

After zach I would also recommend boto, but I needed to make a slight difference to his code:

    import boto

    conn = boto.connect_s3('access-key', 'secret_key')
    bucket = conn.lookup('bucket-name')
    for key in bucket:
        print key.name

User · Answer

Alternatively you can use Minio Client, aka mc. It's open source and compatible with AWS S3. It is available for Linux, Windows, Mac, FreeBSD.

All you have to do is run the mc ls command to list the contents.

    $ mc ls s3/kline/
    [2016-04-30 13:20:47 IST] 1.1MiB 1.jpg
    [2016-04-30 16:03:55 IST] 7.5KiB docker.png
    [2016-04-30 15:16:17 IST]  50KiB pi.png
    [2016-05-10 14:34:39 IST] 365KiB upton.pdf

Note:

- s3: Alias for Amazon S3
- kline: AWS S3 bucket name

Installing Minio Client on Linux. Download mc for:

- 64-bit Intel from https://dl.minio.io/client/mc/release/linux-amd64/mc
- 32-bit Intel from https://dl.minio.io/client/mc/release/linux-386/mc
- 32-bit ARM from https://dl.minio.io/client/mc/release/linux-arm/mc

    $ chmod 755 mc
    $ ./mc --help

Setting up AWS credentials with Minio Client:

    $ mc config host add mys3 https://s3.amazonaws.com BKIKJAA5BMMU2RHO6IBB V7f1CwQqAcwo80UEIJEjc5gVQUSSx5ohQ9GSrr12

Note: Please replace mys3 with the alias you would like for this account, and BKIKJAA5BMMU2RHO6IBB, V7f1CwQqAcwo80UEIJEjc5gVQUSSx5ohQ9GSrr12 with your AWS ACCESS-KEY and SECRET-KEY.

Hope it helps.

Disclaimer: I work for Minio

User · Answer

In PHP you can get the complete list of AWS S3 objects inside a specific bucket using the following call:

    $S3 = \Aws\S3\S3Client::factory(array('region' => $region,));
    $iterator = $S3->getIterator('ListObjects', array('Bucket' => $bucket));
    foreach ($iterator as $obj) {
        echo $obj['Key'];
    }

You can redirect the output of the above code into a file to get the list of keys.

User · Answer

Code in Python using the awesome "boto" lib. The code returns a list of files in a bucket and also handles exceptions for missing buckets.

    import boto

    conn = boto.connect_s3(<ACCESS_KEY>, <SECRET_KEY>)
    try:
        bucket = conn.get_bucket(<BUCKET_NAME>, validate=True)
    except boto.exception.S3ResponseError as e:
        do_something()  # The bucket does not exist, choose how to deal with it or raise the exception

    return [key.name.encode("utf-8") for key in bucket.list()]

Don't forget to replace the <PLACE_HOLDERS> with your values.

User · Answer

For Python's boto3, after having used aws configure:

    import boto3
    s3 = boto3.resource('s3')

    bucket = s3.Bucket('name')
    for obj in bucket.objects.all():
        print(obj.key)
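Since the question asks for a text file, the loop over bucket.objects.all() can write each key to an open file instead of printing it. A minimal sketch, assuming a bucket object like the one returned by boto3.resource('s3').Bucket(...); the function name write_keys and the output file name are hypothetical.

```python
def write_keys(bucket, out):
    """Write one object key per line to the text stream `out`.

    Returns the number of keys written.
    """
    count = 0
    for obj in bucket.objects.all():
        out.write(obj.key + "\n")
        count += 1
    return count


# Usage (assumes boto3 is installed and `aws configure` has been run):
#   import boto3
#   bucket = boto3.resource("s3").Bucket("name")
#   with open("keys.txt", "w", encoding="utf-8") as out:
#       print(write_keys(bucket, out), "keys written")
```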

User · Answer

You can use the standard s3 api:

    aws s3 ls s3://root/folder1/folder2/

User · Answer

AWS CLI

Documentation for aws s3 ls

AWS have recently released their Command Line Tools. This works much like boto and can be installed using sudo easy_install awscli or sudo pip install awscli.

Once you have installed it, you can then simply run

    aws s3 ls

which will show you all of your available buckets:

    CreationTime Bucket
           ------------ ------
    2013-07-11 17:08:50 mybucket
    2013-07-24 14:55:44 mybucket2

You can then query a specific bucket for files.

Command:

    aws s3 ls s3://mybucket

Output:

    Bucket: mybucket
    Prefix:

          LastWriteTime     Length Name
          -------------     ------ ----
                               PRE somePrefix/
    2013-07-25 17:06:27         88 test.txt

This will show you all of your files.

User · Answer

AWS CLI can let you see all files of an S3 bucket quickly and help in performing other operations too.

To use AWS CLI follow the steps below:

1. Install AWS CLI.
2. Configure AWS CLI to use default security credentials and a default AWS Region.
3. To see all files of an S3 bucket use the command

    aws s3 ls s3://your_bucket_name --recursive

Reference for using the AWS CLI with different AWS services: https://docs.aws.amazon.com/cli/latest/reference/

User · Answer

First make sure you are on an instance terminal and that the IAM identity you are using has full access to S3. For example, I used an EC2 instance.

    pip3 install awscli

Then configure aws:

    aws configure

Then fill out the credentials, e.g.:

    $ aws configure
    AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
    AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
    Default region name [None]: us-west-2
    Default output format [None]: json (or just press enter)

Now, see all buckets:

    aws s3 ls

Store all bucket names:

    aws s3 ls > output.txt

See the whole file structure in a bucket:

    aws s3 ls bucket-name --recursive

Store the file structure of a bucket:

    aws s3 ls bucket-name --recursive > file_Structure.txt

Hope this helps.

User · Answer

Please try this bash script. It uses the curl command with no need for any external dependencies.

    bucket=<bucket_name>
    region=<region_name>
    awsAccess=<access_key>
    awsSecret=<secret_key>
    awsRegion="${region}"
    baseUrl="s3.${awsRegion}.amazonaws.com"

    m_sed() {
      if which gsed > /dev/null 2>&1; then
        gsed "$@"
      else
        sed "$@"
      fi
    }

    awsStringSign4() {
      kSecret="AWS4$1"
      kDate=$(printf         '%s' "$2" | openssl dgst -sha256 -hex -mac HMAC -macopt "key:${kSecret}"     2>/dev/null | m_sed 's/^.* //')
      kRegion=$(printf       '%s' "$3" | openssl dgst -sha256 -hex -mac HMAC -macopt "hexkey:${kDate}"    2>/dev/null | m_sed 's/^.* //')
      kService=$(printf      '%s' "$4" | openssl dgst -sha256 -hex -mac HMAC -macopt "hexkey:${kRegion}"  2>/dev/null | m_sed 's/^.* //')
      kSigning=$(printf 'aws4_request' | openssl dgst -sha256 -hex -mac HMAC -macopt "hexkey:${kService}" 2>/dev/null | m_sed 's/^.* //')
      signedString=$(printf  '%s' "$5" | openssl dgst -sha256 -hex -mac HMAC -macopt "hexkey:${kSigning}" 2>/dev/null | m_sed 's/^.* //')
      printf '%s' "${signedString}"
    }

    if [ -z "${region}" ]; then
      region="${awsRegion}"
    fi

    # Initialize helper variables

    authType='AWS4-HMAC-SHA256'
    service="s3"
    dateValueS=$(date -u +'%Y%m%d')
    dateValueL=$(date -u +'%Y%m%dT%H%M%SZ')

    # 0. Hash the file to be uploaded

    # 1. Create canonical request

    # NOTE: order significant in ${signedHeaders} and ${canonicalRequest}

    signedHeaders='host;x-amz-content-sha256;x-amz-date'

    canonicalRequest="\
    GET
    /

    host:${bucket}.s3.amazonaws.com
    x-amz-content-sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
    x-amz-date:${dateValueL}

    ${signedHeaders}
    e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

    # Hash it

    canonicalRequestHash=$(printf '%s' "${canonicalRequest}" | openssl dgst -sha256 -hex 2>/dev/null | m_sed 's/^.* //')

    # 2. Create string to sign

    stringToSign="\
    ${authType}
    ${dateValueL}
    ${dateValueS}/${region}/${service}/aws4_request
    ${canonicalRequestHash}"

    # 3. Sign the string

    signature=$(awsStringSign4 "${awsSecret}" "${dateValueS}" "${region}" "${service}" "${stringToSign}")

    # Upload

    curl -g -k "https://${baseUrl}/${bucket}" \
      -H "x-amz-content-sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855" \
      -H "x-amz-Date: ${dateValueL}" \
      -H "Authorization: ${authType} Credential=${awsAccess}/${dateValueS}/${region}/${service}/aws4_request,SignedHeaders=${signedHeaders},Signature=${signature}"

User · Answer

Use plumbum to wrap the CLI and you will have a clear syntax:

    import plumbum as pb
    folders = pb.local['aws']('s3', 'ls')

User · Answer

In JavaScript you can use

    s3.listObjects(params, function (err, result) {});

to get all objects inside a bucket. You have to pass the bucket name inside params (Bucket: name).

User · Answer

This is an old question, but the number of responses tells me many people hit this page. The easiest way I found is to use the built-in AWS console to create an inventory. It's easy to set up, but the first CSV file can take up to 48 hours to show up. After that you can create either a daily or weekly output to a bucket of your choosing.

User · Answer

Update 15-02-2019:

This command will give you a list of all buckets in AWS S3:

    aws s3 ls

This command will give you a list of all top-level objects inside an AWS S3 bucket:

    aws s3 ls bucket-name

This command will give you a list of ALL objects inside an AWS S3 bucket:

    aws s3 ls bucket-name --recursive

This command will place a list of ALL objects inside an AWS S3 bucket into a text file in your current directory:

    aws s3 ls bucket-name --recursive | cat >> file-name.txt
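The recursive listing contains a date, a time, and a size in front of each name. If only the names are wanted, the saved file can be post-processed. The sketch below assumes each object line has the form "date time size name", as in the aws s3 ls output shown elsewhere in this thread; the function name keys_from_ls_output is made up for illustration.

```python
def keys_from_ls_output(lines):
    """Extract object names from saved `aws s3 ls --recursive` output."""
    keys = []
    for line in lines:
        # Split into at most four fields so that spaces inside a name
        # are preserved (leading spaces in a name are still lost).
        parts = line.rstrip("\r\n").split(None, 3)
        # Skip blank lines and "PRE prefix/" rows, which lack a size field.
        if len(parts) == 4 and parts[2].isdigit():
            keys.append(parts[3])
    return keys


# Usage (file name taken from the command above):
#   with open("file-name.txt") as listing:
#       for key in keys_from_ls_output(listing):
#           print(key)
```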

User · Answer

I'd recommend using boto. Then it's a quick couple of lines of Python:

    from boto.s3.connection import S3Connection

    conn = S3Connection('access-key', 'secret-access-key')
    bucket = conn.get_bucket('bucket')
    for key in bucket.list():
        print key.name.encode('utf-8')

Save this as list.py, open a terminal, and then run:

    $ python list.py > results.txt

User · Answer

    public static Dictionary<string, DateTime> ListBucketsByCreationDate(string AccessKey, string SecretKey)
    {
        return AWSClientFactory.CreateAmazonS3Client(AccessKey,
            SecretKey).ListBuckets().Buckets.ToDictionary(s3Bucket => s3Bucket.BucketName,
            s3Bucket => DateTime.Parse(s3Bucket.CreationDate));
    }

User · Answer

Simplified and updated version of the Scala answer by Paolo:

    import scala.collection.JavaConversions.{collectionAsScalaIterable => asScala}
    import com.amazonaws.services.s3.AmazonS3
    import com.amazonaws.services.s3.model.{ListObjectsRequest, ObjectListing, S3ObjectSummary}

    def buildListing(s3: AmazonS3, request: ListObjectsRequest): List[S3ObjectSummary] = {
      def buildList(listIn: List[S3ObjectSummary], bucketList: ObjectListing): List[S3ObjectSummary] = {
        val latestList: List[S3ObjectSummary] = bucketList.getObjectSummaries.toList

        if (!bucketList.isTruncated) listIn ::: latestList
        else buildList(listIn ::: latestList, s3.listNextBatchOfObjects(bucketList))
      }

      buildList(List(), s3.listObjects(request))
    }

Stripping out the generics and using the ListObjectsRequest generated by the SDK builders.

User · Answer

    aws s3api list-objects --bucket bucket-name

For more details see here: http://docs.aws.amazon.com/cli/latest/reference/s3api/list-objects.html
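If the output of this command is saved to a file, the key names can be pulled out of it afterwards. This sketch assumes the CLI output format is JSON (the CLI default unless configured otherwise) with a top-level "Contents" list; the function name keys_from_list_objects_json and the file name are hypothetical.

```python
import json


def keys_from_list_objects_json(text):
    """Return the object keys found in saved `aws s3api list-objects` JSON."""
    # Treat empty output as "no objects" rather than a parse error.
    if not text.strip():
        return []
    data = json.loads(text)
    return [item["Key"] for item in data.get("Contents", [])]


# Usage (hypothetical file name):
#   with open("objects.json") as f:
#       print("\n".join(keys_from_list_objects_json(f.read())))
```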

User · Answer

A find-like file listing for S3 files:

    aws s3api --profile <<profile-name>> \
    --endpoint-url=<<end-point-url>> list-objects \
    --bucket <<bucket-name>> --query 'Contents[].{Key: Key}'

User · Answer

    function showUploads(){
        if (!class_exists('S3')) require_once 'S3.php';
        // AWS access info
        if (!defined('awsAccessKey')) define('awsAccessKey', '234567665464tg');
        if (!defined('awsSecretKey')) define('awsSecretKey', 'dfshgfhfghdgfhrt463457');
        $bucketName = 'my_bucket1234';
        $s3 = new S3(awsAccessKey, awsSecretKey);
        $contents = $s3->getBucket($bucketName);
        echo "<hr/>List of Files in bucket : {$bucketName} <hr/>";
        $n = 1;
        foreach ($contents as $p => $v):
            echo $p."<br/>";
            $n++;
        endforeach;
    }
