[python] Python concatenate text files

I have a list of 20 file names, like ['file1.txt', 'file2.txt', ...]. I want to write a Python script to concatenate these files into a new file. I could open each file by f = open(...), read line by line by calling f.readline(), and write each line into that new file. It doesn't seem very "elegant" to me, especially the part where I have to read//write line by line.

Is there a more "elegant" way to do this in Python?

This question is related to python file-io concatenation

The answer is


Check out the .read() method of the File object:

http://docs.python.org/2/tutorial/inputoutput.html#methods-of-file-objects

You could do something like:

concat = ""
for file in files:
    concat += open(file).read()

or a more 'elegant' python-way:

concat = ''.join([open(f).read() for f in files])

which, according to this article: http://www.skymind.com/~ocrow/python_string/ would also be the fastest.


If you have a lot of files in the directory then glob2 might be a better option to generate a list of filenames rather than writing them by hand.

import glob2

filenames = glob2.glob('*.txt')  # list of all .txt files in the directory

with open('outfile.txt', 'w') as f:
    for file in filenames:
        with open(file) as infile:
            f.write(infile.read()+'\n')

  import os
  files=os.listdir()
  print(files)
  print('#',tuple(files))
  name=input('Enter the inclusive file name: ')
  exten=input('Enter the type(extension): ')
  filename=name+'.'+exten
  output_file=open(filename,'w+')
  for i in files:
    print(i)
    j=files.index(i)
    f_j=open(i,'r')
    print(f_j.read())
    for x in f_j:
      outfile.write(x)

I don't know about elegance, but this works:

    import glob
    import os
    for f in glob.glob("file*.txt"):
         os.system("cat "+f+" >> OutFile.txt")

That's exactly what fileinput is for:

import fileinput
with open(outfilename, 'w') as fout, fileinput.input(filenames) as fin:
    for line in fin:
        fout.write(line)

For this use case, it's really not much simpler than just iterating over the files manually, but in other cases, having a single iterator that iterates over all of the files as if they were a single file is very handy. (Also, the fact that fileinput closes each file as soon as it's done means there's no need to with or close each one, but that's just a one-line savings, not that big of a deal.)

There are some other nifty features in fileinput, like the ability to do in-place modifications of files just by filtering each line.


As noted in the comments, and discussed in another post, fileinput for Python 2.7 will not work as indicated. Here slight modification to make the code Python 2.7 compliant

with open('outfilename', 'w') as fout:
    fin = fileinput.input(filenames)
    for line in fin:
        fout.write(line)
    fin.close()

An alternative to @inspectorG4dget answer (best answer to date 29-03-2016). I tested with 3 files of 436MB.

@inspectorG4dget solution: 162 seconds

The following solution : 125 seconds

from subprocess import Popen
filenames = ['file1.txt', 'file2.txt', 'file3.txt']
fbatch = open('batch.bat','w')
str ="type "
for f in filenames:
    str+= f + " "
fbatch.write(str + " > file4results.txt")
fbatch.close()
p = Popen("batch.bat", cwd=r"Drive:\Path\to\folder")
stdout, stderr = p.communicate()

The idea is to create a batch file and execute it, taking advantage of "old good technology". Its semi-python but works faster. Works for windows.


def concatFiles():
    path = 'input/'
    files = os.listdir(path)
    for idx, infile in enumerate(files):
        print ("File #" + str(idx) + "  " + infile)
    concat = ''.join([open(path + f).read() for f in files])
    with open("output_concatFile.txt", "w") as fo:
        fo.write(path + concat)

if __name__ == "__main__":
    concatFiles()

Use shutil.copyfileobj.

It automatically reads the input files chunk by chunk for you, which is more more efficient and reading the input files in and will work even if some of the input files are too large to fit into memory:

import shutil

with open('output_file.txt','wb') as wfd:
    for f in ['seg1.txt','seg2.txt','seg3.txt']:
        with open(f,'rb') as fd:
            shutil.copyfileobj(fd, wfd)

outfile.write(infile.read()) # time: 2.1085190773010254s
shutil.copyfileobj(fd, wfd, 1024*1024*10) # time: 0.60599684715271s

A simple benchmark shows that the shutil performs better.


What's wrong with UNIX commands ? (given you're not working on Windows) :

ls | xargs cat | tee output.txt does the job ( you can call it from python with subprocess if you want)


If the files are not gigantic:

with open('newfile.txt','wb') as newf:
    for filename in list_of_files:
        with open(filename,'rb') as hf:
            newf.write(hf.read())
            # newf.write('\n\n\n')   if you want to introduce
            # some blank lines between the contents of the copied files

If the files are too big to be entirely read and held in RAM, the algorithm must be a little different to read each file to be copied in a loop by chunks of fixed length, using read(10000) for example.


Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to file-io

Python, Pandas : write content of DataFrame into text File Saving response from Requests to file How to while loop until the end of a file in Python without checking for empty line? Getting "java.nio.file.AccessDeniedException" when trying to write to a folder How do I add a resources folder to my Java project in Eclipse Read and write a String from text file Python Pandas: How to read only first n rows of CSV files in? Open files in 'rt' and 'wt' modes How to write to a file without overwriting current contents? Write objects into file with Node.js

Examples related to concatenation

Pandas Merging 101 What does ${} (dollar sign and curly braces) mean in a string in Javascript? Concatenate two NumPy arrays vertically Import multiple csv files into pandas and concatenate into one DataFrame How to concatenate columns in a Postgres SELECT? Concatenate string with field value in MySQL Most efficient way to concatenate strings in JavaScript? How to force a line break on a Javascript concatenated string? How to concatenate two IEnumerable<T> into a new IEnumerable<T>? How to concat two ArrayLists?