How to merge 200 CSV files in Python

Question

I have 200 separate CSV files named SH (1) to SH (200). I want to merge them into a single CSV file. How can I do it?

User · Answer

    fout = open("out.csv", "a")
    for num in range(1, 201):
        for line in open("sh" + str(num) + ".csv"):
            fout.write(line)
    fout.close()

User · Answer

Building on the solution by @Adders, later improved by @varun, I made a small improvement of my own so that the merged CSV keeps only the main header:

    from glob import glob

    filename = 'main.csv'

    with open(filename, 'a') as singleFile:
        first_csv = True
        for csv in glob('*.csv'):
            if csv == filename:
                continue  # do not merge the output file into itself
            header = True
            for line in open(csv, 'r'):
                if first_csv and header:
                    # keep the header of the first file only
                    singleFile.write(line)
                    first_csv = False
                    header = False
                elif header:
                    header = False  # skip the header of every later file
                else:
                    singleFile.write(line)

User · Answer

If you are working on Linux or macOS you can do this:

    from subprocess import call

    script = "cat *.csv > merge.csv"
    call(script, shell=True)

User · Answer

You could import csv, then loop through all the CSV files, reading them into a list. Then write the list back out to disk.

    import csv

    rows = []

    # (file1, file2, ...) stands for your own list of file names
    for f in (file1, file2, ...):
        with open(f, newline="") as src:
            for row in csv.reader(src):
                rows.append(row)

    with open("some.csv", "w", newline="") as dst:
        csv.writer(dst).writerows(rows)

The above is not very robust, as it has no error handling. It should work whether or not the individual files have one or more rows of CSV data in them. Also, I did not run this code, but it should give you an idea of what to do.

User · Answer

Or you could just do:

    cat sh*.csv > merged.csv

User · Answer

An easy-to-use function:

    def csv_merge(destination_path, *source_paths):
        '''
        Merges all csv files on source_paths to destination_path.
        :param destination_path: Path of a single csv file, doesn't need to exist
        :param source_paths: Paths of csv files to be merged into, need to exist
        :return: None
        '''
        with open(destination_path, "a") as dest_file:
            # first file: copy everything, header included
            with open(source_paths[0]) as src_file:
                for src_line in src_file:
                    dest_file.write(src_line)
            # remaining files: skip the header line
            # (source_paths is a tuple, so slice it instead of calling pop)
            for source_path in source_paths[1:]:
                with open(source_path) as src_file:
                    next(src_file, None)
                    for src_line in src_file:
                        dest_file.write(src_line)

User · Answer

Here is a script:

    """
    Concatenating csv files named SH1.csv to SH200.csv
    Keeping the headers
    """
    import glob
    import re

    # Looking for filenames like 'SH1.csv' ... 'SH200.csv'
    pattern = re.compile(r"^SH([1-9]|[1-9][0-9]|1[0-9][0-9]|200)\.csv$")
    file_parts = [name for name in glob.glob('*.csv') if pattern.match(name)]

    with open("file_merged.csv", "wb") as file_merged:
        for (i, name) in enumerate(file_parts):
            with open(name, "rb") as file_part:
                if i != 0:
                    next(file_part)  # skip headers if not first file
                file_merged.write(file_part.read())

User · Answer

As ghostdog74 said, but this time with headers:

    fout = open("out.csv", "a")
    # first file:
    for line in open("sh1.csv"):
        fout.write(line)
    # now the rest:
    for num in range(2, 201):
        f = open("sh" + str(num) + ".csv")
        f.next()  # skip the header (Python 2 syntax)
        for line in f:
            fout.write(line)
        f.close()  # not really needed
    fout.close()

User · Answer

Why can't you just use sed?

    sed 1d sh*.csv > merged.csv

Sometimes you don't even have to use Python!

User · Answer

I modified what @wisty said so that it works with Python 3.x, for those of you who have encoding problems. I also use the os module to avoid hard-coding the number of files.

    import os

    def merge_all():
        os.chdir(r"C:\python\data")
        with open("merged_files.csv", "ab") as fout:
            # first file: copied whole, header included
            with open("file_1.csv", "rb") as f:
                for line in f:
                    fout.write(line)
            # now the rest; the directory listing also counts merged_files.csv,
            # so the number of source files is one less
            number_files = len(os.listdir()) - 1
            for num in range(2, number_files + 1):
                with open("file_" + str(num) + ".csv", "rb") as f:
                    next(f)  # skip the header
                    for line in f:
                        fout.write(line)

User · Answer

I'm just going to throw another code example in the basket:

    from glob import glob

    with open('singleDataFile.csv', 'a') as singleFile:
        for csvFile in glob('*.csv'):
            for line in open(csvFile, 'r'):
                singleFile.write(line)

User · Answer

    import os
    import pandas as pd

    # directory that holds the monthly CSV files
    data_dir = r"e:\data science\kaggle assign\monthly sales\Pandas-Data-Science-Tasks-master\SalesAnalysis\Sales_Data"

    df = pd.read_csv(os.path.join(data_dir, "Sales_April_2019.csv"))

    files = [file for file in os.listdir(data_dir)]
    for file in files:
        print(file)

    # read every file and stack the rows into one DataFrame
    all_data = pd.DataFrame()
    for file in files:
        df = pd.read_csv(os.path.join(data_dir, file))
        all_data = pd.concat([all_data, df])

    all_data.head()

User · Answer

A slight change to the code above, as it does not actually work correctly. It should be as follows:

    from glob import glob

    with open('main.csv', 'a') as singleFile:
        for csv in glob('*.csv'):
            if csv == 'main.csv':
                pass
            else:
                for line in open(csv, 'r'):
                    singleFile.write(line)

User · Answer

Updating wisty's answer for Python 3:

    fout = open("out.csv", "a")
    # first file:
    for line in open("sh1.csv"):
        fout.write(line)
    # now the rest:
    for num in range(2, 201):
        f = open("sh" + str(num) + ".csv")
        next(f)  # skip the header
        for line in f:
            fout.write(line)
        f.close()  # not really needed
    fout.close()

User · Answer

Use the accepted Stack Overflow answer to create a list of the CSV files that you want to append, and then run this code:

    import pandas as pd

    combined_csv = pd.concat([pd.read_csv(f) for f in filenames])

And if you want to export it to a single CSV file, use this:

    combined_csv.to_csv("combined_csv.csv", index=False)
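One way to build the `filenames` list, as a minimal sketch. It assumes the files sit in a single directory and that each name contains one number, as in the question. `numbered_csv_files` and its `prefix` parameter are hypothetical names introduced here, not part of pandas.

```python
import re
from pathlib import Path


def numbered_csv_files(directory, prefix="SH"):
    """Return the CSV paths in `directory` whose names start with `prefix`,
    sorted by the first number in the file name (so 2 comes before 10)."""
    def number_of(path):
        match = re.search(r"\d+", path.stem)
        # Files without any number in their name sort first.
        return int(match.group()) if match else -1

    paths = [p for p in Path(directory).glob("*.csv") if p.name.startswith(prefix)]
    return sorted(paths, key=number_of)
```

The result can be passed on as `filenames`, since `pd.read_csv` accepts path objects. Sorting numerically matters only if the row order of the merged file should follow the file numbering.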

User · Answer

It depends what you mean by "merging": do they have the same columns? Do they have headers?

For example, if they all have the same columns and no headers, simple concatenation is sufficient: open the destination file for writing, loop over the sources opening each for reading, use shutil.copyfileobj from the open-for-reading source into the open-for-writing destination, close the source, and keep looping (use the with statement to do the closing on your behalf).

If they have the same columns but also headers, you'll need a readline on each source file except the first, after you open it for reading and before you copy it into the destination, to skip the header line.

If the CSV files don't all have the same columns, then you need to define in what sense you're "merging" them (like a SQL JOIN? or "horizontally", if they all have the same number of lines? etc.). It's hard for us to guess what you mean in that case.
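The first two cases can be sketched as follows. This is only an illustration of the shutil.copyfileobj approach described above. The function name `concat_csv` and the `has_header` flag are assumptions made for the example.

```python
import shutil


def concat_csv(sources, destination, has_header=True):
    """Concatenate the files listed in `sources` into `destination`.

    With has_header=True, the header line is kept from the first file only.
    """
    with open(destination, "wb") as dest:
        for index, source in enumerate(sources):
            with open(source, "rb") as src:
                if has_header and index > 0:
                    src.readline()  # skip this file's header line
                # copy the rest of the file from the current position
                shutil.copyfileobj(src, dest)
```

Files are copied byte for byte, so a source file that does not end with a newline would have its last row glued to the first row of the next file.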

User · Answer

You can simply use the built-in csv library. This solution will work even if some of your CSV files have slightly different column names or headers, unlike the other top-voted answers.

    import csv
    import glob

    filenames = [i for i in glob.glob("SH*.csv")]
    header_keys = []
    merged_rows = []

    for filename in filenames:
        with open(filename, newline="") as f:
            reader = csv.DictReader(f)
            merged_rows.extend(list(reader))
            # remember every column name, in first-seen order
            header_keys.extend([key for key in reader.fieldnames if key not in header_keys])

    with open("combined.csv", "w", newline="") as f:
        w = csv.DictWriter(f, fieldnames=header_keys)
        w.writeheader()
        w.writerows(merged_rows)

The merged file will contain all possible columns (header_keys) that can be found in the files. Any columns absent from a file are rendered as blank/empty, while the rest of that file's data is preserved.

Note: This won't work if your CSV files have no headers. In that case you can still use the csv library, but instead of using DictReader & DictWriter, you'll have to work with the basic reader & writer. This may also run into issues when you are dealing with massive data, since the entirety of the content is being stored in memory (the merged_rows list).
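For the headerless case mentioned in the note, a minimal sketch with the basic reader and writer might look like this. It assumes every file has the same columns in the same order, and it streams rows instead of collecting them in a list. `merge_headerless` is a hypothetical name.

```python
import csv


def merge_headerless(sources, destination):
    """Write the rows of every file in `sources` to `destination`, in order."""
    with open(destination, "w", newline="") as out:
        writer = csv.writer(out)
        for source in sources:
            with open(source, newline="") as src:
                # csv.reader yields one row at a time, so only the current
                # row is held in memory
                writer.writerows(csv.reader(src))
```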

User · Answer

Let's say you have 2 CSV files like these:

csv1.csv:

    id,name
    1,Armin
    2,Sven

csv2.csv:

    id,place,year
    1,Reykjavik,2017
    2,Amsterdam,2018
    3,Berlin,2019

and you want the result to be like this csv3.csv:

    id,name,place,year
    1,Armin,Reykjavik,2017
    2,Sven,Amsterdam,2018
    3,,Berlin,2019

Then you can use the following snippet to do that:

    import csv
    import pandas as pd

    # the file names
    f1 = "csv1.csv"
    f2 = "csv2.csv"
    out_f = "csv3.csv"

    # read the files
    df1 = pd.read_csv(f1)
    df2 = pd.read_csv(f2)

    # get the keys
    keys1 = list(df1)
    keys2 = list(df2)

    # merge both files
    for idx, row in df2.iterrows():
        data = df1[df1['id'] == row['id']]

        # if row with such id does not exist, add the whole row
        if data.empty:
            next_idx = len(df1)
            for key in keys2:
                df1.at[next_idx, key] = df2.at[idx, key]

        # if row with such id exists, add only the missing keys with their values
        else:
            i = int(data.index[0])
            for key in keys2:
                if key not in keys1:
                    df1.at[i, key] = df2.at[idx, key]

    # save the merged files
    df1.to_csv(out_f, index=False, encoding='utf-8', quoting=csv.QUOTE_NONE)

With the help of a loop you can achieve the same result for multiple files, as in your case (200 CSV files).
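The loop over many files can be sketched with the standard csv module instead of pandas. This is a swapped-in technique, not the code above. It assumes every file has an `id` column. `join_on_key` is a hypothetical helper name.

```python
import csv


def join_on_key(sources, destination, key="id"):
    """Merge rows that share the same `key` value across all `sources`."""
    rows = {}        # key value -> merged row, in first-seen order
    fieldnames = []  # union of all column names, in first-seen order
    for source in sources:
        with open(source, newline="") as f:
            reader = csv.DictReader(f)
            for name in reader.fieldnames or []:
                if name not in fieldnames:
                    fieldnames.append(name)
            for row in reader:
                # later files win if the same column appears twice
                rows.setdefault(row[key], {}).update(row)
    with open(destination, "w", newline="") as f:
        # restval fills columns that a given id never received
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows.values())
```

Calling `join_on_key(["csv1.csv", "csv2.csv"], "csv3.csv")` on the two sample files gives the four-column result shown above, with an empty name for id 3.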

User · Answer

If the files aren't numbered in order, take the hassle-free approach below (Python 3.6 on a Windows machine):

    import pandas as pd
    from glob import glob

    # it grabs all the csv files from the directory you mention here
    interesting_files = glob("C:/temp/*.csv")

    df_list = []
    for filename in sorted(interesting_files):
        df_list.append(pd.read_csv(filename))
    full_df = pd.concat(df_list)

    # save the final file in the same or a different directory
    full_df.to_csv("C:/temp/merged_pandas.csv", index=False)

User · Answer

If the merged CSV is going to be used in Python, then just use glob to get a list of the files to pass to fileinput.input() via the files argument, then use the csv module to read it all in one go.
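A minimal sketch of that idea follows. It assumes every file starts with the same header line and that no CSV record spans several lines. `read_all_rows` is a hypothetical name.

```python
import csv
import fileinput
import glob


def read_all_rows(pattern="*.csv"):
    """Read every file matching `pattern` as one stream of CSV rows."""
    files = sorted(glob.glob(pattern))
    if not files:
        # with an empty list, fileinput would fall back to sys.argv / stdin
        return []
    rows = []
    with fileinput.input(files=files) as stream:
        for row in csv.reader(stream):
            # keep the header of the first file, drop the repeated ones
            if stream.isfirstline() and stream.lineno() > 1:
                continue
            rows.append(row)
    return rows
```

Nothing is written to disk here, which fits the case where the merged data is only needed inside the program.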

User · Answer

It is quite easy to combine all files in a directory and merge them:

    import glob
    import csv

    # Open result file
    with open('output.txt', 'w', newline='') as fout:
        wout = csv.writer(fout, delimiter=',')
        interesting_files = glob.glob("*.csv")
        h = True
        for filename in interesting_files:
            print('Processing', filename)
            # Open and process file
            with open(filename, newline='') as fin:
                if h:
                    h = False  # keep the header of the first file
                else:
                    next(fin)  # skip header
                for line in csv.reader(fin, delimiter=','):
                    wout.writerow(line)
