Is there a function to extract the extension from a filename?
Tags: python, filenames, file-extension
You can use split on a filename:
f_extns = filename.split(".")
print ("The extension of the file is : " + repr(f_extns[-1]))
This does not require any additional library.
This is the simplest method to get both the filename and the extension in a single line (it assumes the file name contains exactly one dot).
fName, ext = 'C:/folder name/Flower.jpeg'.split('/')[-1].split('.')
>>> print(fName)
Flower
>>> print(ext)
jpeg
Unlike other solutions, you don't need to import any package for this.
Just join all pathlib suffixes.
>>> import pathlib
>>> x = 'file/path/archive.tar.gz'
>>> y = 'file/path/text.txt'
>>> ''.join(pathlib.Path(x).suffixes)
'.tar.gz'
>>> ''.join(pathlib.Path(y).suffixes)
'.txt'
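A minimal sketch of the same idea wrapped in a helper; the name `full_suffix` is hypothetical. Note that `suffixes` counts every dot-separated part after the first, so a dot inside the stem is picked up as well.

```python
import pathlib


def full_suffix(path):
    """Return all suffixes of the final path component joined together."""
    # PurePath does no filesystem access, so the file need not exist.
    return ''.join(pathlib.PurePath(path).suffixes)


print(full_suffix('file/path/archive.tar.gz'))   # .tar.gz
print(full_suffix('file/path/text.txt'))         # .txt
print(full_suffix('file/path/report.v2.txt'))    # .v2.txt (dot in the stem counts too)
print(repr(full_suffix('file/path/README')))     # ''
```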
Try this:
files = ['file.jpeg','file.tar.gz','file.png','file.foo.bar','file.etc']
pen_ext = ['foo', 'tar', 'bar', 'etc']  # penultimate parts that belong to a double extension
for file in files:
    parts = file.split(".")
    if parts[-2] in pen_ext:
        ext = parts[-2] + "." + parts[-1]  # double extension such as tar.gz
    else:
        ext = parts[-1]  # single extension
    print(ext)
Another solution with right split:
# to get extension only
s = 'test.ext'
if '.' in s: ext = s.rsplit('.', 1)[1]
# or, to get file name and extension
def split_filepath(s):
    """
    Get filename and extension from a filepath.
    filepath -> (filename, extension)
    """
    if '.' not in s:
        return (s, '')
    r = s.rsplit('.', 1)
    return (r[0], r[1])
For funsies... just collect the extensions in a dict, and track all of them in a folder. Then just pull the extensions you want.
import os
search = {}
for f in os.listdir(os.getcwd()):
    fn, fe = os.path.splitext(f)
    try:
        search[fe].append(f)
    except KeyError:
        search[fe] = [f]
extensions = ('.png', '.jpg')
for ex in extensions:
    found = search.get(ex, '')
    if found:
        print(found)
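The same grouping can be sketched with `collections.defaultdict`, which avoids the try/except. The function name `group_by_extension` is hypothetical, and lower-casing the extension is an added assumption so that `.JPG` and `.jpg` land in the same bucket.

```python
import os
from collections import defaultdict


def group_by_extension(names):
    """Map each lower-cased extension (with leading dot) to the names that carry it."""
    groups = defaultdict(list)
    for name in names:
        _, ext = os.path.splitext(name)
        groups[ext.lower()].append(name)  # lower() is an assumption, not in the original
    return groups


groups = group_by_extension(['a.png', 'b.JPG', 'c.txt', 'd.png', 'README'])
for ext in ('.png', '.jpg'):
    print(ext, groups.get(ext, []))
```

To scan a real folder, pass `os.listdir(os.getcwd())` instead of the literal list.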
filename='ext.tar.gz'
extension = filename[filename.rfind('.'):]
import os.path
extension = os.path.splitext(filename)[1]
Surprised this wasn't mentioned yet:
import os
fn = '/some/path/a.tar.gz'
basename = os.path.basename(fn)  # os independent; gives 'a.tar.gz'
base = basename.split('.')[0]  # 'a'
ext = '.'.join(basename.split('.')[1:])  # <-- main part
# if you want a leading '.', and `None` if there is no extension:
ext = '.' + ext if ext else None  # '.tar.gz'
Benefits:
As function:
def get_extension(filename):
    basename = os.path.basename(filename)  # os independent
    ext = '.'.join(basename.split('.')[1:])
    return '.' + ext if ext else None
import os.path
extension = os.path.splitext(filename)[1][1:]
To get only the text of the extension, without the dot.
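You can find some great stuff in the pathlib module (available in Python 3.4 and later).
import pathlib
# PureWindowsPath parses the backslash-separated path without touching the filesystem
x = pathlib.PureWindowsPath("C:\\Path\\To\\File\\myfile.txt").suffix
print(x)
# Output: .txt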
Although this is an old topic, I wonder why no one has mentioned a very simple Python string method, rpartition, for this case.
To get the extension from a file's absolute path, you can simply type:
filepath.rpartition('.')[-1]
Example:
path = '/home/jersey/remote/data/test.csv'
print(path.rpartition('.')[-1])
This prints: csv
a = ".bashrc"
b = "text.txt"
extension_a = a.split(".")
extension_b = b.split(".")
print(extension_a[-1]) # bashrc
print(extension_b[-1]) # txt
Even though this question is already answered, I'd add a regex solution.
>>> import re
>>> file_suffix = r".*(\..*)"
>>> result = re.search(file_suffix, "somefile.ext")
>>> result.group(1)
'.ext'
import os
def NewFileName(fichier):
    cpt = 0  # counter appended to the name until the name is unused
    fic, *ext = fichier.split('.')
    ext = '.'.join(ext)
    while os.path.isfile(fichier):
        cpt += 1
        fichier = '{0}-({1}).{2}'.format(fic, cpt, ext)
    return fichier
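For comparison, a sketch of the same "find an unused name" idea using pathlib. The name `unique_path` is hypothetical, and this variant differs from the function above in two ways: it splits at the last dot (`stem` and `suffix`) rather than the first, and it checks `exists()` rather than `isfile()`.

```python
from pathlib import Path


def unique_path(path):
    """Return path itself, or a 'stem-(n).ext' variant that does not exist yet."""
    path = Path(path)
    candidate = path
    counter = 0
    while candidate.exists():
        counter += 1
        # suffix keeps its leading dot and is '' when there is no extension
        candidate = path.with_name(f'{path.stem}-({counter}){path.suffix}')
    return candidate


print(unique_path('report.txt'))  # result depends on what is in the current directory
```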
Any of the solutions above work, but on Linux I have found that there can be a newline at the end of the extension string, which prevents matches from succeeding. Add the strip() method to the end. For example:
import os.path
extension = os.path.splitext(filename)[1][1:].strip()
It is worth adding lower() in there so you don't find yourself wondering why the JPGs aren't showing up in your list.
os.path.splitext(filename)[1][1:].strip().lower()
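Put together as a small helper, this might look like the sketch below. The name `normalized_extension` is hypothetical, and stripping before splitting (rather than after) is a choice made here so that trailing whitespace never reaches `splitext`.

```python
import os.path


def normalized_extension(filename):
    """Return the extension without the dot, stripped and lower-cased."""
    # strip() first so a trailing newline or space cannot end up in the extension
    _, ext = os.path.splitext(filename.strip())
    return ext[1:].lower()


print(normalized_extension('PHOTO.JPG\n'))      # jpg
print(normalized_extension('notes.txt'))        # txt
print(repr(normalized_extension('Makefile')))   # ''
```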
name_only = filename[:filename.index(".")]
That will give you the file name up to the first ".", which would be the most common.
This is a direct string manipulation technique. I see a lot of solutions mentioned, but most of them rely on split. Split, however, splits at every occurrence of ".". What you would rather be looking for is rpartition.
string = "folder/to_path/filename.ext"
extension = string.rpartition(".")[-1]
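One caveat: when the string contains no dot, `rpartition(".")[-1]` returns the whole string, and a dot in a directory name is also picked up. A hedged sketch that guards against both follows; the function name is hypothetical, and treating backslashes as path separators is an assumption that suits mixed Windows and POSIX input.

```python
def extension_via_rpartition(path):
    """Return the text after the last dot of the final path component, or ''."""
    # Only look at the last path component so dots in directory names are ignored.
    name = path.replace('\\', '/').rpartition('/')[-1]
    head, sep, tail = name.rpartition('.')
    # sep is '' when there is no dot; head is '' for names like '.bashrc'
    if not sep or not head:
        return ''
    return tail


print(extension_via_rpartition('folder/to_path/filename.ext'))   # ext
print(repr(extension_via_rpartition('folder.d/filename')))       # ''
print(repr(extension_via_rpartition('.bashrc')))                 # ''
```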
New in version 3.4.
import pathlib
print(pathlib.Path('yourPath.example').suffix) # '.example'
I'm surprised no one has mentioned pathlib yet, pathlib IS awesome!
If you need all the suffixes (e.g. if you have a .tar.gz), .suffixes will return a list of them!
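A short sketch of how these attributes relate for a double extension. `PurePath` is used so nothing has to exist on disk, and the path itself is made up for illustration.

```python
from pathlib import PurePath

p = PurePath('backups/archive.tar.gz')  # hypothetical path

print(p.name)       # archive.tar.gz
print(p.suffix)     # .gz
print(p.suffixes)   # ['.tar', '.gz']
print(p.stem)       # archive.tar

# A leading dot alone does not count as a suffix.
print(repr(PurePath('.bashrc').suffix))  # ''
```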
For simple use cases one option may be splitting from dot:
>>> filename = "example.jpeg"
>>> filename.split(".")[-1]
'jpeg'
No error when file doesn't have an extension:
>>> "filename".split(".")[-1]
'filename'
But you must be careful:
>>> "png".split(".")[-1]
'png' # But file doesn't have an extension
Also will not work with hidden files in Unix systems:
>>> ".bashrc".split(".")[-1]
'bashrc' # But this is not an extension
For general use, prefer os.path.splitext
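As a quick check, here is how `os.path.splitext` behaves on the problem cases listed above. The wrapper name `extension_of` is hypothetical.

```python
import os.path


def extension_of(name):
    """Return the extension including the leading dot, or '' if there is none."""
    return os.path.splitext(name)[1]


for name in ('example.jpeg', 'filename', 'png', '.bashrc', 'dir.d/notes'):
    print(repr(name), '->', repr(extension_of(name)))
```

Only `'example.jpeg'` yields a non-empty result (`'.jpeg'`); the other four names give `''`.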
With splitext there are problems with files with a double extension (e.g. file.tar.gz, file.tar.bz2, etc.):
>>> fileName, fileExtension = os.path.splitext('/path/to/somefile.tar.gz')
>>> fileExtension
'.gz'
but should be: .tar.gz
The possible solutions are here
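One possible approach, sketched under the assumption that you know in advance which double extensions matter: check for those first and fall back to `os.path.splitext` otherwise. Both `COMPOUND_EXTENSIONS` and `split_extension` are hypothetical names, and the list is only illustrative.

```python
import os.path

# Hypothetical list for illustration; extend it with the compound suffixes you care about.
COMPOUND_EXTENSIONS = ('.tar.gz', '.tar.bz2', '.tar.xz')


def split_extension(path):
    """Like os.path.splitext, but keeps known compound extensions together."""
    lowered = path.lower()  # compare case-insensitively, return the original casing
    for compound in COMPOUND_EXTENSIONS:
        if lowered.endswith(compound):
            return path[:-len(compound)], path[-len(compound):]
    return os.path.splitext(path)


print(split_extension('/path/to/somefile.tar.gz'))   # ('/path/to/somefile', '.tar.gz')
print(split_extension('photo.jpeg'))                 # ('photo', '.jpeg')
```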
A true one-liner, if you like regex. And it doesn't matter even if you have additional "." in the middle
import re
file_ext = re.search(r"\.([^.]+)$", filename).group(1)
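Note that `re.search` returns `None` when the name has no dot, so calling `.group(1)` directly raises `AttributeError`. A minimal sketch that handles this case follows; the function name is hypothetical, and the pattern is tweaked to also exclude path separators, which is an assumption not present in the one-liner above.

```python
import re

# Text after the last dot, as long as no further dot or path separator follows it.
EXTENSION_RE = re.compile(r"\.([^./\\]+)$")


def regex_extension(filename):
    """Return the text after the last dot, or None if there is no match."""
    match = EXTENSION_RE.search(filename)
    return match.group(1) if match else None


print(regex_extension('archive.tar.gz'))   # gz
print(regex_extension('README'))           # None
print(regex_extension('dir.d/file'))       # None
```

Like the one-liner, this still reports `bashrc` for `.bashrc`.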
# Try this; it works for any number and any length of extensions,
# e.g. www.google.com/downloads/file1.gz.rs -> .gz.rs
import os.path
class LinkChecker:
    @staticmethod
    def get_link_extension(link: str) -> str:
        if link is None or link == "":
            return ""
        else:
            paths = os.path.splitext(link)
            ext = paths[1]
            new_link = paths[0]
            if ext != "":
                return LinkChecker.get_link_extension(new_link) + ext
            else:
                return ""
Source: Stackoverflow.com