[java] Listing files in a specific "folder" of an AWS S3 bucket

I need to list all files in a certain folder of my S3 bucket.

The folder structure is the following:

/my-bucket/users/<user-id>/contacts/<contact-id>

I have files related to users and files related to a certain user's contact. I need to list both.

To list files I'm using this code:

ListObjectsRequest listObjectsRequest = new ListObjectsRequest().withBucketName("my-bucket")
                .withPrefix("some-prefix").withDelimiter("/");
ObjectListing objects = transferManager.getAmazonS3Client().listObjects(listObjectsRequest);

To list a certain user's files I'm using this prefix:

users/<user-id>/

and I'm correctly getting all files in the directory, excluding the contacts subdirectory, for example:

users/<user-id>/file1.txt
users/<user-id>/file2.txt
users/<user-id>/file3.txt

To list the files of a certain user's contact, I'm using this prefix instead:

users/<user-id>/contacts/<contact-id>/

but in this case I'm also getting the directory itself as a returned object:

users/<user-id>/contacts/<contact-id>/file1.txt
users/<user-id>/contacts/<contact-id>/file2.txt
users/<user-id>/contacts/<contact-id>/

Why am I getting this behaviour? What's different between the two listing requests? I need to list only the files in the directory, excluding sub-directories.

This question is related to java amazon-web-services amazon-s3

The answer is


As others have already said, everything in S3 is an object. To you, it may be files and folders. But to S3, they're just objects.

If you don't need the objects whose keys end with a '/', you can safely delete them, e.g. via the REST API or the AWS SDK for Java (I assume you have write access). You will not lose "nested files": there are no files, only objects, so deleting this key does not remove the objects whose keys are prefixed with it.

AmazonS3 amazonS3 = AmazonS3ClientBuilder.standard().withCredentials(new ProfileCredentialsProvider()).withRegion("region").build();
amazonS3.deleteObject(new DeleteObjectRequest("my-bucket", "users/<user-id>/contacts/<contact-id>/"));

Please note that I'm using ProfileCredentialsProvider so that my requests are not anonymous. Otherwise, you will not be able to delete an object. I have my AWS access keys stored in the ~/.aws/credentials file.


S3 does not have directories. While you can list files in a pseudo-directory manner, as you demonstrated, there is no directory "file" per se.
You may have inadvertently created a data object called users/<user-id>/contacts/<contact-id>/.
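
To make the difference between the two requests concrete, here is a small in-memory sketch of how a prefix plus delimiter listing groups keys. It is a simplification and not the S3 implementation: the class name DelimiterListingSketch and the sample keys are made up for illustration, and pagination and key ordering are ignored. A key that is exactly equal to the prefix has nothing left after the prefix, so it is reported as an ordinary object, which is what happens with the placeholder.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory model of a prefix + delimiter listing (not the AWS SDK).
class DelimiterListingSketch {
    final List<String> objectKeys = new ArrayList<>();
    final Set<String> commonPrefixes = new LinkedHashSet<>();

    static DelimiterListingSketch list(List<String> allKeys, String prefix, String delimiter) {
        DelimiterListingSketch result = new DelimiterListingSketch();
        for (String key : allKeys) {
            if (!key.startsWith(prefix)) {
                continue; // outside the requested "folder"
            }
            String rest = key.substring(prefix.length());
            int cut = rest.indexOf(delimiter);
            if (cut < 0) {
                // No delimiter after the prefix: reported as an object.
                // This includes a key that is exactly equal to the prefix.
                result.objectKeys.add(key);
            } else {
                // Delimiter found: rolled up into a single common prefix.
                result.commonPrefixes.add(prefix + rest.substring(0, cut + delimiter.length()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList(
                "users/u1/file1.txt",
                "users/u1/contacts/c1/",          // zero-byte placeholder object
                "users/u1/contacts/c1/file1.txt");

        DelimiterListingSketch user = list(keys, "users/u1/", "/");
        System.out.println(user.objectKeys + " " + user.commonPrefixes);

        DelimiterListingSketch contact = list(keys, "users/u1/contacts/c1/", "/");
        System.out.println(contact.objectKeys + " " + contact.commonPrefixes);
    }
}
```

With the prefix users/u1/ the placeholder is rolled up into the common prefix users/u1/contacts/, so it never appears as an object. With the prefix users/u1/contacts/c1/ the placeholder key matches the prefix exactly and is listed next to the files.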


Based on @davioooh's answer. This code worked for me.

ListObjectsRequest listObjectsRequest = new ListObjectsRequest().withBucketName("your-bucket")
            .withPrefix("your/folder/path/").withDelimiter("/");
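
If the placeholder object cannot be deleted, one option is to drop it on the client side after listing. The following is a minimal sketch that works on plain key strings, so it does not depend on any SDK type; the class name PlaceholderFilter, the method filesOnly and the sample keys are hypothetical. In real code the keys would come from the object summaries returned by the listing call.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: keeps only keys that look like files.
class PlaceholderFilter {
    static List<String> filesOnly(List<String> keys, String prefix) {
        return keys.stream()
                .filter(key -> !key.equals(prefix)) // the "folder" placeholder itself
                .filter(key -> !key.endsWith("/"))  // any other key ending with the delimiter
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String prefix = "users/u1/contacts/c1/";
        List<String> keys = Arrays.asList(
                prefix + "file1.txt",
                prefix + "file2.txt",
                prefix);
        filesOnly(keys, prefix).forEach(System.out::println);
    }
}
```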

While everybody says that there are no directories and files in S3, only objects (and buckets), which is absolutely true, I would suggest taking advantage of CommonPrefixes, described in this answer. You can do the following to get the list of "folders" (commonPrefixes) and "files" (objectSummaries):

ListObjectsV2Request req = new ListObjectsV2Request()
        .withBucketName(bucket.getName())
        .withPrefix(prefix)
        .withDelimiter(DELIMITER);
ListObjectsV2Result listing = s3Client.listObjectsV2(req);
// "Folders": key prefixes rolled up at the delimiter
for (String commonPrefix : listing.getCommonPrefixes()) {
    System.out.println(commonPrefix);
}
// "Files": objects directly under the prefix
for (S3ObjectSummary summary : listing.getObjectSummaries()) {
    System.out.println(summary.getKey());
}

In your case, for objectSummaries (files) it should return (in case of correct prefix):
users/user-id/contacts/contact-id/file1.txt
users/user-id/contacts/contact-id/file2.txt

for commonPrefixes (returned when the prefix is one level up, e.g. users/user-id/contacts/):
users/user-id/contacts/contact-id/

Reference: https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html


You can check the content type: S3 uses a special application/x-directory type for folders. For example, in Ruby:

bucket.objects({:delimiter=>"/", :prefix=>"f1/"}).each { |obj| p obj.object.content_type }

If your goal is to get only the files and not the folder, the approach I took was to use the file size as a filter. This property is the current size of the object stored in S3, and all the folders return 0 for it. Note that this also drops genuinely empty files. The following is C# code using LINQ, but it shouldn't be hard to translate to Java.

var amazonClient = new AmazonS3Client(key, secretKey, region);
var listObjectsRequest = new ListObjectsRequest
            {
                BucketName = "someBucketName",
                Delimiter = "someDelimiter",
                Prefix = "somePrefix"
            };
var objects = amazonClient.ListObjects(listObjectsRequest);
// Keep only entries with a non-zero size, which excludes folder placeholders
var objectsInFolder = objects.S3Objects.Where(file => file.Size > 0).ToList();
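
A rough Java equivalent of the size filter could look like the sketch below. To keep it self-contained it uses a hypothetical ObjectInfo class as a stand-in for the SDK's object summary type (which exposes a key and a size); the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class SizeFilterSketch {
    // Hypothetical stand-in for an SDK object summary (key + size in bytes).
    static final class ObjectInfo {
        final String key;
        final long size;

        ObjectInfo(String key, long size) {
            this.key = key;
            this.size = size;
        }
    }

    // Returns the keys of all objects with a non-zero size.
    // Caveat: genuinely empty files are filtered out as well.
    static List<String> nonEmptyKeys(List<ObjectInfo> objects) {
        return objects.stream()
                .filter(o -> o.size > 0)
                .map(o -> o.key)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<ObjectInfo> listing = Arrays.asList(
                new ObjectInfo("users/u1/contacts/c1/", 0L),
                new ObjectInfo("users/u1/contacts/c1/file1.txt", 120L),
                new ObjectInfo("users/u1/contacts/c1/file2.txt", 45L));
        nonEmptyKeys(listing).forEach(System.out::println);
    }
}
```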
