How to list all AWS S3 objects in a bucket using Java

Question

What is the simplest way to get a list of all items within an S3 bucket using Java?

    List<S3ObjectSummary> s3objects = s3.listObjects(bucketName, prefix).getObjectSummaries();

This example only returns 1000 items.

User · Answer

Gray, your solution was strange, but you seem like a nice guy.

    AmazonS3Client s3Client = new AmazonS3Client(new BasicAWSCredentials("...", "..."));

    ObjectListing images = s3Client.listObjects(bucketName);

    List<S3ObjectSummary> list = images.getObjectSummaries();
    for (S3ObjectSummary image : list) {
        S3Object obj = s3Client.getObject(bucketName, image.getKey());
        writeToFile(obj.getObjectContent());
    }

User · Answer

This worked for me:

    Thread thread = new Thread(new Runnable() {
        @Override
        public void run() {
            try {
                List<String> listing = getObjectNamesForBucket(bucket, s3Client);
                Log.e(TAG, "listing " + listing);
            } catch (Exception e) {
                e.printStackTrace();
                Log.e(TAG, "Exception found while listing " + e);
            }
        }
    });
    thread.start();

    private List<String> getObjectNamesForBucket(String bucket, AmazonS3 s3Client) {
        ObjectListing objects = s3Client.listObjects(bucket);
        List<String> objectNames = new ArrayList<String>(objects.getObjectSummaries().size());
        Iterator<S3ObjectSummary> oIter = objects.getObjectSummaries().iterator();
        while (oIter.hasNext()) {
            objectNames.add(oIter.next().getKey());
        }
        // Keep fetching batches until the listing is no longer truncated.
        while (objects.isTruncated()) {
            objects = s3Client.listNextBatchOfObjects(objects);
            oIter = objects.getObjectSummaries().iterator();
            while (oIter.hasNext()) {
                objectNames.add(oIter.next().getKey());
            }
        }
        return objectNames;
    }

User · Answer

As a slightly more concise solution to listing S3 objects when they might be truncated:

    ListObjectsRequest request = new ListObjectsRequest().withBucketName(bucketName);
    ObjectListing listing = null;

    while ((listing == null) || (request.getMarker() != null)) {
        listing = s3Client.listObjects(request);
        // do stuff with listing
        request.setMarker(listing.getNextMarker());
    }
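To see why the marker loop terminates and collects every key, here is a minimal sketch that runs without AWS. The `MarkerPagingSketch` class, its `Page` type, and `listPage` are hypothetical stand-ins for a truncated listing, not SDK classes; the only behaviour assumed is that each call returns at most `maxKeys` keys plus a marker to resume from.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical stand-in for a paged bucket listing; not part of the AWS SDK. */
class MarkerPagingSketch {

    /** One page of keys plus the marker to resume from (null when nothing is left). */
    static final class Page {
        final List<String> keys;
        final String nextMarker;

        Page(List<String> keys, String nextMarker) {
            this.keys = keys;
            this.nextMarker = nextMarker;
        }
    }

    /** Returns up to maxKeys (assumed >= 1) keys sorting strictly after the marker. */
    static Page listPage(TreeSet<String> bucket, String marker, int maxKeys) {
        List<String> keys = new ArrayList<>();
        for (String key : (marker == null ? bucket : bucket.tailSet(marker, false))) {
            if (keys.size() == maxKeys) {
                // More keys remain: the last key returned becomes the next marker.
                return new Page(keys, keys.get(keys.size() - 1));
            }
            keys.add(key);
        }
        return new Page(keys, null);
    }

    /** Follows markers until the simulated listing reports no further page. */
    static List<String> listAllKeys(TreeSet<String> bucket, int maxKeys) {
        List<String> all = new ArrayList<>();
        String marker = null;
        do {
            Page page = listPage(bucket, marker, maxKeys);
            all.addAll(page.keys);
            marker = page.nextMarker;
        } while (marker != null);
        return all;
    }

    public static void main(String[] args) {
        TreeSet<String> bucket = new TreeSet<>();
        for (int i = 0; i < 2500; i++) {
            bucket.add(String.format("key-%04d", i));
        }
        System.out.println(listAllKeys(bucket, 1000).size()); // prints 2500
    }
}
```

With 2500 simulated keys and a page size of 1000, the loop makes three calls; a single call would have stopped at the first 1000, which is the symptom described in the question.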

User · Answer

You don't want to list all 1000 objects in your bucket at a time. A more robust solution is to fetch at most 10 objects at a time, which you can do with the withMaxKeys method.

The following code creates an S3 client, fetches 10 or fewer objects at a time, filters them on a prefix, and generates a pre-signed URL for each fetched object.

    import com.amazonaws.HttpMethod;
    import com.amazonaws.SdkClientException;
    import com.amazonaws.auth.AWSStaticCredentialsProvider;
    import com.amazonaws.auth.BasicAWSCredentials;
    import com.amazonaws.regions.Regions;
    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.AmazonS3ClientBuilder;
    import com.amazonaws.services.s3.model.*;

    import java.net.URL;
    import java.util.Date;

    /**
     * @author shabab
     * @since 21 Sep, 2020
     */
    public class AwsMain {

        static final String ACCESS_KEY = "";
        static final String SECRET = "";
        static final Regions BUCKET_REGION = Regions.DEFAULT_REGION;
        static final String BUCKET_NAME = "";

        public static void main(String[] args) {
            BasicAWSCredentials awsCreds = new BasicAWSCredentials(ACCESS_KEY, SECRET);

            try {
                final AmazonS3 s3Client = AmazonS3ClientBuilder
                        .standard()
                        .withRegion(BUCKET_REGION)
                        .withCredentials(new AWSStaticCredentialsProvider(awsCreds))
                        .build();

                ListObjectsV2Request req = new ListObjectsV2Request().withBucketName(BUCKET_NAME).withMaxKeys(10);
                ListObjectsV2Result result;

                do {
                    result = s3Client.listObjectsV2(req);

                    result.getObjectSummaries()
                            .stream()
                            // Keep keys under the prefix, but not the prefix "folder" entry itself.
                            .filter(s3ObjectSummary -> {
                                return s3ObjectSummary.getKey().contains("Market-subscriptions/")
                                        && !s3ObjectSummary.getKey().equals("Market-subscriptions/");
                            })
                            .forEach(s3ObjectSummary -> {
                                GeneratePresignedUrlRequest generatePresignedUrlRequest =
                                        new GeneratePresignedUrlRequest(BUCKET_NAME, s3ObjectSummary.getKey())
                                                .withMethod(HttpMethod.GET)
                                                .withExpiration(getExpirationDate());

                                URL url = s3Client.generatePresignedUrl(generatePresignedUrlRequest);
                                System.out.println(s3ObjectSummary.getKey() + " Pre-Signed URL: " + url.toString());
                            });

                    // Resume the next request where this page ended.
                    String token = result.getNextContinuationToken();
                    req.setContinuationToken(token);

                } while (result.isTruncated());
            } catch (SdkClientException e) {
                e.printStackTrace();
            }
        }

        private static Date getExpirationDate() {
            Date expiration = new java.util.Date();
            long expTimeMillis = expiration.getTime();
            expTimeMillis += 1000 * 60 * 60; // one hour from now
            expiration.setTime(expTimeMillis);

            return expiration;
        }
    }
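The one-hour expiration arithmetic in getExpirationDate can also be written with java.time, which avoids the hand-multiplied millisecond constant. This is only a sketch: `ExpirationSketch` and `expirationAfter` are hypothetical names, and the result is converted back to java.util.Date because that is the type the answer passes to withExpiration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Date;

/** Hypothetical helper; computes an expiration time relative to a given instant. */
class ExpirationSketch {

    /** Returns the moment `validity` after `now`, as a java.util.Date. */
    static Date expirationAfter(Instant now, Duration validity) {
        return Date.from(now.plus(validity));
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        Date expiration = expirationAfter(now, Duration.ofHours(1));
        // Difference in milliseconds between the expiration and the starting instant.
        System.out.println(expiration.getTime() - now.toEpochMilli()); // prints 3600000
    }
}
```

Passing the starting instant in as a parameter, rather than reading the clock inside the helper, keeps the calculation easy to check in isolation.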

User · Answer

For those who are reading this in 2018: there are two new pagination-hassle-free APIs available, one in AWS SDK for Java 1.x and another in 2.x.

1.x

There is a new API in the Java SDK that lets you iterate through the objects in an S3 bucket without dealing with pagination:

    AmazonS3 s3 = AmazonS3ClientBuilder.standard().build();

    S3Objects.inBucket(s3, "the-bucket").forEach((S3ObjectSummary objectSummary) -> {
        // TODO: Consume `objectSummary` the way you need
        System.out.println(objectSummary.getKey());
    });

This iteration is lazy:

"The list of S3ObjectSummarys will be fetched lazily, a page at a time, as they are needed. The size of the page can be controlled with the withBatchSize(int) method."

2.x

The API changed, so here is an SDK 2.x version:

    S3Client client = S3Client.builder().region(Region.US_EAST_1).build();
    ListObjectsV2Request request = ListObjectsV2Request.builder().bucket("the-bucket").prefix("the-prefix").build();
    ListObjectsV2Iterable response = client.listObjectsV2Paginator(request);

    for (ListObjectsV2Response page : response) {
        page.contents().forEach((S3Object object) -> {
            // TODO: Consume `object` the way you need
            System.out.println(object.key());
        });
    }

ListObjectsV2Iterable is lazy as well:

"When the operation is called, an instance of this class is returned. At this point, no service calls are made yet and so there is no guarantee that the request is valid. As you iterate through the iterable, SDK will start lazily loading response pages by making service calls until there are no pages left or your iteration stops. If there are errors in your request, you will see the failures only after you start iterating through the iterable."
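The lazy behaviour described above can be illustrated without the SDK. The sketch below is not how S3Objects or ListObjectsV2Iterable are implemented; `LazyPagesSketch`, `lazily`, and the page-fetching function are hypothetical, and the sketch assumes an empty page signals the end of the listing. It shows only the idea that no page is fetched until iteration asks for one.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.IntFunction;

/** Hypothetical illustration of lazy, page-at-a-time iteration; not SDK code. */
class LazyPagesSketch {

    /** Wraps a page-fetching function; a page is requested only when the current one runs out. */
    static <T> Iterable<T> lazily(IntFunction<List<T>> fetchPage) {
        return () -> new Iterator<T>() {
            private int nextPage = 0;
            private Iterator<T> current = null;
            private boolean done = false;

            @Override
            public boolean hasNext() {
                while (!done && (current == null || !current.hasNext())) {
                    List<T> page = fetchPage.apply(nextPage++);
                    if (page.isEmpty()) {
                        done = true; // assumption: an empty page means no more data
                    } else {
                        current = page.iterator();
                    }
                }
                return !done;
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }

    public static void main(String[] args) {
        int[] fetches = {0};
        // Hypothetical source: three pages of two keys each, then an empty page.
        Iterable<String> keys = lazily(page -> {
            fetches[0]++;
            List<String> out = new ArrayList<>();
            if (page < 3) {
                out.add("key-" + (2 * page));
                out.add("key-" + (2 * page + 1));
            }
            return out;
        });
        System.out.println("fetches before iterating: " + fetches[0]);
        for (String key : keys) {
            System.out.println(key);
        }
        System.out.println("fetches after iterating: " + fetches[0]);
    }
}
```

Because nothing is fetched when the iterable is created, any failure in the fetch function would surface only during iteration, which matches the caveat quoted for ListObjectsV2Iterable.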

User · Answer

I am processing a large collection of objects generated by our system; we changed the format of the stored data and needed to check each file, determine which ones were in the old format, and convert them. There are other ways to do this, but this one relates to your question.

    ObjectListing list = amazonS3Client.listObjects(contentBucketName, contentKeyPrefix);

    while (true) {
        List<S3ObjectSummary> summaries = list.getObjectSummaries();

        for (S3ObjectSummary summary : summaries) {
            String summaryKey = summary.getKey();

            /* Retrieve object */

            /* Process it */
        }

        // Stop once the final page has been processed. Checking isTruncated()
        // only after fetching the next batch would skip that last page.
        if (!list.isTruncated()) {
            break;
        }
        list = amazonS3Client.listNextBatchOfObjects(list);
    }

User · Answer

It might be a workaround, but this solved my problem:

    ObjectListing listing = s3.listObjects(bucketName, prefix);
    List<S3ObjectSummary> summaries = listing.getObjectSummaries();

    while (listing.isTruncated()) {
        listing = s3.listNextBatchOfObjects(listing);
        summaries.addAll(listing.getObjectSummaries());
    }

User · Answer

Listing Keys Using the AWS SDK for Java

http://docs.aws.amazon.com/AmazonS3/latest/dev/ListingObjectKeysUsingJava.html

    import java.io.IOException;
    import com.amazonaws.AmazonClientException;
    import com.amazonaws.AmazonServiceException;
    import com.amazonaws.auth.profile.ProfileCredentialsProvider;
    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.AmazonS3Client;
    import com.amazonaws.services.s3.model.ListObjectsRequest;
    import com.amazonaws.services.s3.model.ListObjectsV2Request;
    import com.amazonaws.services.s3.model.ListObjectsV2Result;
    import com.amazonaws.services.s3.model.ObjectListing;
    import com.amazonaws.services.s3.model.S3ObjectSummary;

    public class ListKeys {
        private static String bucketName = "*** bucket name ***";

        public static void main(String[] args) throws IOException {
            AmazonS3 s3client = new AmazonS3Client(new ProfileCredentialsProvider());
            try {
                System.out.println("Listing objects");
                final ListObjectsV2Request req = new ListObjectsV2Request().withBucketName(bucketName);
                ListObjectsV2Result result;
                do {
                    result = s3client.listObjectsV2(req);

                    for (S3ObjectSummary objectSummary :
                            result.getObjectSummaries()) {
                        System.out.println(" - " + objectSummary.getKey() + "  " +
                                "(size = " + objectSummary.getSize() +
                                ")");
                    }
                    System.out.println("Next Continuation Token : " + result.getNextContinuationToken());
                    req.setContinuationToken(result.getNextContinuationToken());
                } while (result.isTruncated() == true);

            } catch (AmazonServiceException ase) {
                System.out.println("Caught an AmazonServiceException, " +
                        "which means your request made it " +
                        "to Amazon S3, but was rejected with an error response " +
                        "for some reason.");
                System.out.println("Error Message:    " + ase.getMessage());
                System.out.println("HTTP Status Code: " + ase.getStatusCode());
                System.out.println("AWS Error Code:   " + ase.getErrorCode());
                System.out.println("Error Type:       " + ase.getErrorType());
                System.out.println("Request ID:       " + ase.getRequestId());
            } catch (AmazonClientException ace) {
                System.out.println("Caught an AmazonClientException, " +
                        "which means the client encountered " +
                        "an internal error while trying to communicate " +
                        "with S3, " +
                        "such as not being able to access the network.");
                System.out.println("Error Message: " + ace.getMessage());
            }
        }
    }

User · Answer

I know this is an old post, but this might still be useful to someone: version 2.1 of the Java/Android SDK provides a method called setMaxKeys. Like this:

    s3objects.setMaxKeys(arg0);

You probably found a solution by now, but please mark one answer as correct so that it can help others in the future.
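Note that the maximum-keys setting controls the page size rather than removing the need to page: S3 returns at most 1000 keys per request, so a smaller value means more requests, not a complete listing in one call. A rough sketch of that arithmetic follows; `PageCountSketch` and `requestsNeeded` are hypothetical names used only for illustration.

```java
/** Hypothetical helper estimating how many list requests a full listing needs. */
class PageCountSketch {

    /** Ceiling of objectCount / maxKeys, with at least one request even for an empty bucket. */
    static long requestsNeeded(long objectCount, int maxKeys) {
        if (maxKeys < 1) {
            throw new IllegalArgumentException("maxKeys must be at least 1");
        }
        return Math.max(1, (objectCount + maxKeys - 1) / maxKeys);
    }

    public static void main(String[] args) {
        System.out.println(requestsNeeded(2500, 1000)); // prints 3
        System.out.println(requestsNeeded(2500, 10));   // prints 250
    }
}
```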

User · Answer

This is direct from the AWS documentation:

    AmazonS3 s3client = new AmazonS3Client(new ProfileCredentialsProvider());

    ListObjectsRequest listObjectsRequest = new ListObjectsRequest()
        .withBucketName(bucketName)
        .withPrefix("m");
    ObjectListing objectListing;

    do {
        objectListing = s3client.listObjects(listObjectsRequest);
        for (S3ObjectSummary objectSummary :
                objectListing.getObjectSummaries()) {
            System.out.println(" - " + objectSummary.getKey() + "  " +
                    "(size = " + objectSummary.getSize() +
                    ")");
        }
        listObjectsRequest.setMarker(objectListing.getNextMarker());
    } while (objectListing.isTruncated());

User · Answer

Try this one out:

    public void getObjectList() {
        System.out.println("Listing objects");
        ObjectListing objectListing = s3.listObjects(new ListObjectsRequest()
                .withBucketName(bucketName)
                .withPrefix("ads"));
        for (S3ObjectSummary objectSummary : objectListing.getObjectSummaries()) {
            System.out.println(" - " + objectSummary.getKey() + "  " +
                    "(size = " + objectSummary.getSize() + ")");
        }
    }

This lists the objects in the bucket that have a specific prefix.
