[python] Using Python's ftplib to get a directory listing, portably

You can use ftplib for full FTP support in Python. However the preferred way of getting a directory listing is:

# File: ftplib-example-1.py

import ftplib

ftp = ftplib.FTP("www.python.org")
ftp.login("anonymous", "ftplib-example-1")

data = []

ftp.dir(data.append)

ftp.quit()

for line in data:
    print "-", line

Which yields:

$ python ftplib-example-1.py
- total 34
- drwxrwxr-x  11 root     4127         512 Sep 14 14:18 .
- drwxrwxr-x  11 root     4127         512 Sep 14 14:18 ..
- drwxrwxr-x   2 root     4127         512 Sep 13 15:18 RCS
- lrwxrwxrwx   1 root     bin           11 Jun 29 14:34 README -> welcome.msg
- drwxr-xr-x   3 root     wheel        512 May 19  1998 bin
- drwxr-sr-x   3 root     1400         512 Jun  9  1997 dev
- drwxrwxr--   2 root     4127         512 Feb  8  1998 dup
- drwxr-xr-x   3 root     wheel        512 May 19  1998 etc
...

I guess the idea is to parse the results to get the directory listing. However this listing is directly dependent on the FTP server's way of formatting the list. It would be very messy to write code for this having to anticipate all the different ways FTP servers might format this list.

Is there a portable way to get an array filled with the directory listing?

(The array should only have the folder names.)

This question is related to python ftp portability

The answer is


Try to use ftp.nlst(dir).

However, note that if the folder is empty, it might throw an error:

files = []

try:
    files = ftp.nlst()
except ftplib.error_perm, resp:
    if str(resp) == "550 No files found":
        print "No files in this directory"
    else:
        raise

for f in files:
    print f

The reliable/standardized way to parse FTP directory listing is by using MLSD command, which by now should be supported by all recent/decent FTP servers.

import ftplib
f = ftplib.FTP()
f.connect("localhost")
f.login()
ls = []
f.retrlines('MLSD', ls.append)
for entry in ls:
    print entry

The code above will print:

modify=20110723201710;perm=el;size=4096;type=dir;unique=807g4e5a5; tests
modify=20111206092323;perm=el;size=4096;type=dir;unique=807g1008e0; .xchat2
modify=20111022125631;perm=el;size=4096;type=dir;unique=807g10001a; .gconfd
modify=20110808185618;perm=el;size=4096;type=dir;unique=807g160f9a; .skychart
...

Starting from python 3.3, ftplib will provide a specific method to do this:


That helped me with my code.

When I tried feltering only a type of files and show them on screen by adding a condition that tests on each line.

Like this

elif command == 'ls':
    print("directory of ", ftp.pwd())
    data = []
    ftp.dir(data.append)

    for line in data:
        x = line.split(".")
        formats=["gz", "zip", "rar", "tar", "bz2", "xz"]
        if x[-1] in formats:
            print ("-", line)

I found my way here while trying to get filenames, last modified stamps, file sizes etc and wanted to add my code. It only took a few minutes to write a loop to parse the ftp.dir(dir_list.append) making use of python std lib stuff like strip() (to clean up the line of text) and split() to create an array.

ftp = FTP('sick.domain.bro')
ftp.login()
ftp.cwd('path/to/data')

dir_list = []
ftp.dir(dir_list.append)

# main thing is identifing which char marks start of good stuff
# '-rw-r--r--   1 ppsrt    ppsrt      545498 Jul 23 12:07 FILENAME.FOO
#                               ^  (that is line[29])

for line in dir_list:
   print line[29:].strip().split(' ') # got yerself an array there bud!
   # EX ['545498', 'Jul', '23', '12:07', 'FILENAME.FOO']

This is from Python docs

>>> from ftplib import FTP_TLS
>>> ftps = FTP_TLS('ftp.python.org')
>>> ftps.login()           # login anonymously before securing control 
channel
>>> ftps.prot_p()          # switch to secure data connection
>>> ftps.retrlines('LIST') # list directory content securely
total 9
drwxr-xr-x   8 root     wheel        1024 Jan  3  1994 .
drwxr-xr-x   8 root     wheel        1024 Jan  3  1994 ..
drwxr-xr-x   2 root     wheel        1024 Jan  3  1994 bin
drwxr-xr-x   2 root     wheel        1024 Jan  3  1994 etc
d-wxrwxr-x   2 ftp      wheel        1024 Sep  5 13:43 incoming
drwxr-xr-x   2 root     wheel        1024 Nov 17  1993 lib
drwxr-xr-x   6 1094     wheel        1024 Sep 13 19:07 pub
drwxr-xr-x   3 root     wheel        1024 Jan  3  1994 usr
-rw-r--r--   1 root     root          312 Aug  1  1994 welcome.msg

There's no standard for the layout of the LIST response. You'd have to write code to handle the most popular layouts. I'd start with Linux ls and Windows Server DIR formats. There's a lot of variety out there, though.

Fall back to the nlst method (returning the result of the NLST command) if you can't parse the longer list. For bonus points, cheat: perhaps the longest number in the line containing a known file name is its length.


I happen to be stuck with an FTP server (Rackspace Cloud Sites virtual server) that doesn't seem to support MLSD. Yet I need several fields of file information, such as size and timestamp, not just the filename, so I have to use the DIR command. On this server, the output of DIR looks very much like the OP's. In case it helps anyone, here's a little Python class that parses a line of such output to obtain the filename, size and timestamp.

import datetime

class FtpDir:
    def parse_dir_line(self, line):
        words = line.split()
        self.filename = words[8]
        self.size = int(words[4])
        t = words[7].split(':')
        ts = words[5] + '-' + words[6] + '-' + datetime.datetime.now().strftime('%Y') + ' ' + t[0] + ':' + t[1]
        self.timestamp = datetime.datetime.strptime(ts, '%b-%d-%Y %H:%M')

Not very portable, I know, but easy to extend or modify to deal with various different FTP servers.


Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to ftp

Uploading into folder in FTP? Wordpress plugin install: Could not create directory How to setup FTP on xampp Google Drive as FTP Server Fatal error: Call to undefined function mysqli_connect() Filezilla FTP Server Fails to Retrieve Directory Listing FTP/SFTP access to an Amazon S3 Bucket 200 PORT command successful. Consider using PASV. 425 Failed to establish connection PowerShell Connect to FTP server and get files How to use passive FTP mode in Windows command prompt?

Examples related to portability

How do SO_REUSEADDR and SO_REUSEPORT differ? OS specific instructions in CMAKE: How to? Multi-character constant warnings Is there a portable way to get the current username in Python? How should I print types like off_t and size_t? How to measure time in milliseconds using ANSI C? Is there a replacement for unistd.h for Windows (Visual C)? Using Python's ftplib to get a directory listing, portably