[python] check if a key exists in a bucket in s3 using boto3

I would like to know if a key exists in S3 using boto3. I can loop over the bucket's contents and check whether each key matches.
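
For reference, the brute-force loop I mean looks roughly like this (the bucket and key names are placeholders):

import boto3

s3 = boto3.resource('s3')
# scan every object in the bucket just to find one key
for obj in s3.Bucket('mybucket').objects.all():
    if obj.key == 'mykey':
        print('found')
        break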

But that seems longer and like overkill, and the official Boto3 docs don't explicitly state how to do this.

Maybe I am missing the obvious. Can anybody point me to how I can achieve this?

This question is related to: python, amazon-s3, boto3

The answers are below.


You can use S3Fs (install it with pip install s3fs), which is essentially a wrapper around boto3 that exposes typical file-system-style operations:

import s3fs
s3 = s3fs.S3FileSystem()
s3.exists('mybucket/myfile.txt')  # note: s3fs paths include the bucket name

import boto3
from botocore.exceptions import ClientError

client = boto3.client('s3')
s3_key = 'your key without the bucket name, e.g. abc/bcd.txt'
bucket = 'your bucket name'
try:
    # head_object raises a ClientError (404) when the key is missing
    client.head_object(Bucket=bucket, Key=s3_key)
    print("File exists - s3://%s/%s" % (bucket, s3_key))
except ClientError:
    print("File does not exist - s3://%s/%s" % (bucket, s3_key))

Use this concise one-liner; it's less intrusive when you have to drop it into an existing project without modifying much of the code.

s3_file_exists = lambda filename: bool(list(bucket.objects.filter(Prefix=filename)))

The above function assumes the bucket variable has already been declared. Note that because this filters by Prefix, it also returns True for any key that merely starts with filename.
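
For example, bucket could be declared like this (the bucket name is a placeholder):

import boto3
bucket = boto3.resource('s3').Bucket('mybucket')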

You can extend the lambda to support an additional parameter, like so:

s3_file_exists = lambda filename, bucket: bool(list(bucket.objects.filter(Prefix=filename)))

You can use Boto3 for this:

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('my-bucket')
key = 'dootdoot.jpg'  # the key (path) to check
objs = list(bucket.objects.filter(Prefix=key))
if len(objs) > 0:
    print("key exists!!")
else:
    print("key doesn't exist!")

Here key is the path whose existence you want to check.


The easiest way I found (and probably the most efficient) is this:

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client('s3')
try:
    s3.head_object(Bucket='bucket_name', Key='file_path')
except ClientError as e:
    if e.response['Error']['Code'] == '404':
        # Not found
        pass
    else:
        raise  # something other than a missing key (e.g. permissions)

I'm not a big fan of using exceptions for control flow. This is an alternative approach that works in boto3:

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('my-bucket')
key = 'dootdoot.jpg'
objs = list(bucket.objects.filter(Prefix=key))
if any(w.key == key for w in objs):
    print("Exists!")
else:
    print("Doesn't exist")

import boto3
from botocore.client import Config

S3_REGION = "eu-central-1"
bucket = "mybucket1"
name = "objectname"

client = boto3.client('s3', region_name=S3_REGION, config=Config(signature_version='s3v4'))

def key_exists():
    # list objects matching the prefix and look for an exact key match
    response = client.list_objects_v2(Bucket=bucket, Prefix=name)
    for obj in response.get('Contents', []):
        if obj['Key'] == name:
            return True
    return False

print(key_exists())

This works not only with a client but with a bucket resource too:

import boto3
import botocore

bucket = boto3.resource('s3', region_name='eu-west-1').Bucket('my-bucket')

try:
    bucket.Object('my-file').get()
except botocore.exceptions.ClientError as ex:
    if ex.response['Error']['Code'] == 'NoSuchKey':
        print('NoSuchKey')

I noticed that just to catch the exception via botocore.exceptions.ClientError we need to install botocore, and botocore takes up 36M of disk space. This is particularly impactful if we use AWS Lambda functions. If we just catch the plain Exception instead, we can skip the extra import!

  • I am validating that the file extension is '.csv'
  • This will not throw an exception if the bucket does not exist!
  • This will not throw an exception if the bucket exists but the object does not exist!
  • This throws an exception if the bucket is empty!
  • This throws an exception if the bucket has no permissions!

The code looks like this. Please share your thoughts:

import boto3
import traceback

def download4mS3(s3bucket, s3Path, localPath):
    s3 = boto3.resource('s3')

    print('Looking for the csv data file ending with .csv in bucket: ' + s3bucket + ' path: ' + s3Path)
    if s3Path.endswith('.csv') and s3Path != '':
        try:
            s3.Bucket(s3bucket).download_file(s3Path, localPath)
        except Exception as e:
            print(e)
            print(traceback.format_exc())
            # a missing key surfaces as a botocore ClientError whose response carries a 404 code;
            # getattr keeps this safe for exception types that have no .response attribute
            if getattr(e, 'response', {}).get('Error', {}).get('Code') == "404":
                print("Downloading the file from: [", s3Path, "] failed")
                exit(12)
            else:
                raise
        print("Downloading the file from: [", s3Path, "] succeeded")
    else:
        print("The given S3 path does not end with .csv: [", s3Path, "]")
        exit(12)

If you seek a key that is equivalent to a directory, then you might want this approach:

import boto3

session = boto3.session.Session()
resource = session.resource("s3")
bucket = resource.Bucket('mybucket')

key = 'dir-like-or-file-like-key'
objects = [o for o in bucket.objects.filter(Prefix=key).limit(1)]
has_key = len(objects) > 0

This works for a parent key, a key that equates to a file, or a key that does not exist. I tried the favored approach above and it failed on parent keys.
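
Wrapped into a small reusable helper (a sketch; bucket is a boto3 Bucket resource as above):

def key_or_prefix_exists(bucket, key):
    """True if any object key starts with key (covers 'directory'-style keys too)."""
    return len(list(bucket.objects.filter(Prefix=key).limit(1))) > 0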


Here is a solution that works for me. One caveat is that I know the exact format of the key ahead of time, so I am only listing the single file:

import boto3

# The s3 base class to interact with S3
class S3(object):
    def __init__(self):
        self.s3_client = boto3.client('s3')

    def check_if_object_exists(self, s3_bucket, s3_key):
        response = self.s3_client.list_objects(
            Bucket=s3_bucket,
            Prefix=s3_key
        )
        # a non-empty listing includes an 'ETag' for each object found
        return 'ETag' in str(response)

if __name__ == '__main__':
    bucket = 'my-bucket'          # placeholder bucket name
    key = 'path/to/file.txt'      # placeholder key
    s3 = S3()
    if s3.check_if_object_exists(bucket, key):
        print("Found S3 object.")
    else:
        print("No object found.")

There is one simple way to check whether a file exists in an S3 bucket. We do not need to use an exception for this:

import boto3

session = boto3.Session(aws_access_key_id, aws_secret_access_key)
s3 = session.client('s3')

object_name = 'filename'
bucket = 'bucketname'
obj_status = s3.list_objects(Bucket=bucket, Prefix=object_name)
if obj_status.get('Contents'):
    print("File exists")
else:
    print("File does not exist")

It's really simple with the get() method:

import botocore
from boto3.session import Session

session = Session(aws_access_key_id='AWS_ACCESS_KEY',
                  aws_secret_access_key='AWS_SECRET_ACCESS_KEY')
s3 = session.resource('s3')
bucket_s3 = s3.Bucket('bucket_name')

def not_exist(file_key):
    try:
        file_details = bucket_s3.Object(file_key).get()
        # print(file_details)  # This line prints the file details
        return False
    except botocore.exceptions.ClientError as e:
        if e.response['Error']['Code'] == "NoSuchKey":
            # alternatively, check e.response['ResponseMetadata']['HTTPStatusCode'] == 404
            return True
        # For any other error it's hard to determine whether the key exists or not,
        # so feel free to change this to True/False or raise, based on your requirements
        return False

print(not_exist('hello_world.txt'))

For boto3, ObjectSummary can be used to check if an object exists.

Contains the summary of an object stored in an Amazon S3 bucket. This object doesn't contain the object's full metadata or any of its contents

import boto3
from botocore.exceptions import ClientError

def path_exists(path, bucket_name):
    """Check to see if an object exists on S3"""
    s3 = boto3.resource('s3')
    try:
        s3.ObjectSummary(bucket_name=bucket_name, key=path).load()
    except ClientError as e:
        if e.response['Error']['Code'] == "404":
            return False
        raise e
    return True

path_exists('path/to/file.html', 'my-bucket')

In ObjectSummary.load

Calls s3.Client.head_object to update the attributes of the ObjectSummary resource.

This shows that you can use ObjectSummary instead of Object if you are planning on not using get(). The load() function does not retrieve the object; it only obtains the summary.


Try this, it's simple:

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('mybucket_name')   # just the bucket name
file_name = 'A/B/filename.txt'        # the full key (path) within the bucket
objs = list(bucket.objects.filter(Prefix=file_name))
if len(objs) > 0:
    print("Exists")
else:
    print("Not Exists")

FWIW, here are the very simple functions that I am using:

import os

import boto3

def get_resource(config: dict={}):
    """Loads the s3 resource.

    Expects AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to be in the environment
    or in a config dictionary.
    Looks in the environment first."""

    s3 = boto3.resource('s3',
                        aws_access_key_id=os.environ.get(
                            "AWS_ACCESS_KEY_ID", config.get("AWS_ACCESS_KEY_ID")),
                        aws_secret_access_key=os.environ.get("AWS_SECRET_ACCESS_KEY", config.get("AWS_SECRET_ACCESS_KEY")))
    return s3


def get_bucket(s3, s3_uri: str):
    """Get the bucket from the resource.
    A thin wrapper, use with caution.

    Example usage:

    >>> bucket = get_bucket(get_resource(), s3_uri_prod)"""
    return s3.Bucket(s3_uri)


def isfile_s3(bucket, key: str) -> bool:
    """Returns T/F whether the file exists."""
    objs = list(bucket.objects.filter(Prefix=key))
    return len(objs) == 1 and objs[0].key == key


def isdir_s3(bucket, key: str) -> bool:
    """Returns T/F whether the directory exists."""
    objs = list(bucket.objects.filter(Prefix=key))
    return len(objs) > 1
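
Usage might look like this (the bucket name and keys are placeholders; credentials are read from the environment):

s3 = get_resource()
bucket = get_bucket(s3, 'my-bucket')
print(isfile_s3(bucket, 'reports/2020.csv'))
print(isdir_s3(bucket, 'reports/'))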

If you have fewer than 1,000 objects in a directory or bucket (list_objects_v2 returns at most 1,000 keys per call), you can get a set of their names and then check whether your key is in that set:

import boto3

s3_client = boto3.client('s3')
files_in_dir = {d['Key'].split('/')[-1]
                for d in s3_client.list_objects_v2(Bucket='mybucket',
                                                   Prefix='my/dir').get('Contents') or []}

This code works even if my/dir does not exist.
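
Checking a particular file is then just a set-membership test:

print('myfile.txt' in files_in_dir)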

http://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.Client.list_objects_v2


This can check either a prefix or a full key, and it fetches at most one key:

import boto3

def prefix_exists(bucket, prefix):
    s3_client = boto3.client('s3')
    res = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=1)
    return 'Contents' in res
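
Usage (the bucket name and prefix are placeholders):

print(prefix_exists('my-bucket', 'some/prefix/'))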

Assuming you just want to check if a key exists (instead of quietly over-writing it), do this check first:

import boto3

def key_exists(mykey, mybucket):
    s3_client = boto3.client('s3')
    response = s3_client.list_objects_v2(Bucket=mybucket, Prefix=mykey)
    # 'Contents' is absent from the response when nothing matches the prefix
    for obj in response.get('Contents', []):
        if mykey == obj['Key']:
            return True
    return False

if key_exists('someprefix/myfile-abc123', 'my-bucket-name'):
    print("key exists")
else:
    print("safe to put new bucket object")
    # try:
    #     resp = s3_client.put_object(Body="Your string or file-like object",
    #                                 Bucket=mybucket,Key=mykey)
    # ...check resp success and ClientError exception for errors...

Just following the thread: can someone conclude which is the most efficient way to check whether an object exists in S3?

I think head_object might win, as it only fetches the object's metadata, which is much lighter than the object itself.
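
A minimal timing sketch for comparing the two calls yourself (the bucket and key names are placeholders; substitute an object your credentials can read):

import timeit
import boto3

client = boto3.client('s3')
BUCKET, KEY = 'my-bucket', 'some/key.txt'  # placeholders

def via_head_object():
    client.head_object(Bucket=BUCKET, Key=KEY)

def via_list_objects():
    client.list_objects_v2(Bucket=BUCKET, Prefix=KEY, MaxKeys=1)

print('head_object    :', timeit.timeit(via_head_object, number=50))
print('list_objects_v2:', timeit.timeit(via_list_objects, number=50))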


Check out (note: this is the older boto 2 API, not boto3):

bucket.get_key(
    key_name, 
    headers=None, 
    version_id=None, 
    response_headers=None, 
    validate=True
)

Check to see if a particular key exists within the bucket. This method uses a HEAD request to check for the existence of the key. Returns: An instance of a Key object or None

from Boto S3 Docs

You can just call bucket.get_key(keyname) and check if the returned object is None.
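
A minimal boto 2 sketch of that check (the bucket and key names are placeholders):

import boto

conn = boto.connect_s3()
bucket = conn.get_bucket('my-bucket', validate=False)
key = bucket.get_key('my/key.txt')  # issues a HEAD request; returns a Key or None
print(key is not None)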


In Boto3, if you're checking for either a folder (prefix) or a file using list_objects, you can use the existence of 'Contents' in the response dict as a check for whether the object exists. It's another way to avoid the try/except catches, as @EvilPuppetMaster suggests:

import boto3

client = boto3.client('s3')
results = client.list_objects(Bucket='my-bucket', Prefix='dootdoot.jpg')
print('Contents' in results)
