[python] How to add a new column to a CSV file?

I have several CSV files that look like this:

Input
Name        Code
blackberry  1
wineberry   2
rasberry    1
blueberry   1
mulberry    2

I would like to add a new column to all CSV files so that it would look like this:

Output
Name        Code    Berry
blackberry  1   blackberry
wineberry   2   wineberry
rasberry    1   rasberry
blueberry   1   blueberry
mulberry    2   mulberry

The script I have so far is this:

import csv
with open(input.csv,'r') as csvinput:
    with open(output.csv, 'w') as csvoutput:
        writer = csv.writer(csvoutput)
        for row in csv.reader(csvinput):
            writer.writerow(row+['Berry'])

(Python 3.2)

But in the output, the script skips every line and the new column has only Berry in it:

Output
Name        Code    Berry
blackberry  1   Berry

wineberry   2   Berry

rasberry    1   Berry

blueberry   1   Berry

mulberry    2   Berry

This question is related to python csv python-3.x

The answer is


Append new column in existing csv file using python without header name

  default_text = 'Some Text'
# Open the input_file in read mode and output_file in write mode
    with open('problem-one-answer.csv', 'r') as read_obj, \
    open('output_1.csv', 'w', newline='') as write_obj:
# Create a csv.reader object from the input file object
    csv_reader = reader(read_obj)
# Create a csv.writer object from the output file object
    csv_writer = csv.writer(write_obj)
# Read each row of the input csv file as list
    for row in csv_reader:
# Append the default text in the row / list
        row.append(default_text)
# Add the updated row / list to the output file
        csv_writer.writerow(row)

Thankyou


I used pandas and it worked well... While I was using it, I had to open a file and add some random columns to it and then save back to same file only.

This code adds multiple column entries, you may edit as much you need.

import pandas as pd

csv_input = pd.read_csv('testcase.csv')         #reading my csv file
csv_input['Phone1'] = csv_input['Name']         #this would also copy the cell value 
csv_input['Phone2'] = csv_input['Name']
csv_input['Phone3'] = csv_input['Name']
csv_input['Phone4'] = csv_input['Name']
csv_input['Phone5'] = csv_input['Name']
csv_input['Country'] = csv_input['Name']
csv_input['Website'] = csv_input['Name']
csv_input.to_csv('testcase.csv', index=False)   #this writes back to your file

If you want that cell value doesn't gets copy, so first of all create a empty Column in your csv file manually, like you named it as Hours then, Now for this you can add this line in above code,

csv_input['New Value'] = csv_input['Hours']

or simply we can, without adding the manual column, we can

csv_input['New Value'] = ''    #simple and easy

I Hope it helps.


import csv
with open('input.csv','r') as csvinput:
    with open('output.csv', 'w') as csvoutput:
        writer = csv.writer(csvoutput)

        for row in csv.reader(csvinput):
            if row[0] == "Name":
                writer.writerow(row+["Berry"])
            else:
                writer.writerow(row+[row[0]])

Maybe something like that is what you intended?

Also, csv stands for comma separated values. So, you kind of need commas to separate your values like this I think:

Name,Code
blackberry,1
wineberry,2
rasberry,1
blueberry,1
mulberry,2

I don't see where you're adding the new column, but try this:

    import csv
    i = 0
    Berry = open("newcolumn.csv","r").readlines()
    with open(input.csv,'r') as csvinput:
        with open(output.csv, 'w') as csvoutput:
            writer = csv.writer(csvoutput)
            for row in csv.reader(csvinput):
                writer.writerow(row+","+Berry[i])
                i++

I'm surprised no one suggested Pandas. Although using a set of dependencies like Pandas might seem more heavy-handed than is necessary for such an easy task, it produces a very short script and Pandas is a great library for doing all sorts of CSV (and really all data types) data manipulation. Can't argue with 4 lines of code:

import pandas as pd
csv_input = pd.read_csv('input.csv')
csv_input['Berries'] = csv_input['Name']
csv_input.to_csv('output.csv', index=False)

Check out Pandas Website for more information!

Contents of output.csv:

Name,Code,Berries
blackberry,1,blackberry
wineberry,2,wineberry
rasberry,1,rasberry
blueberry,1,blueberry
mulberry,2,mulberry

Yes Its a old question but it might help some

import csv
import uuid

# read and write csv files
with open('in_file','r') as r_csvfile:
    with open('out_file','w',newline='') as w_csvfile:

        dict_reader = csv.DictReader(r_csvfile,delimiter='|')
        #add new column with existing
        fieldnames = dict_reader.fieldnames + ['ADDITIONAL_COLUMN']
        writer_csv = csv.DictWriter(w_csvfile,fieldnames,delimiter='|')
        writer_csv.writeheader()


        for row in dict_reader:
            row['ADDITIONAL_COLUMN'] = str(uuid.uuid4().int >> 64) [0:6]
            writer_csv.writerow(row)

In case of a large file you can use pandas.read_csv with the chunksize argument which allows to read the dataset per chunk:

import pandas as pd

INPUT_CSV = "input.csv"
OUTPUT_CSV = "output.csv"
CHUNKSIZE = 1_000 # Maximum number of rows in memory

header = True
mode = "w"
for chunk_df in pd.read_csv(INPUT_CSV, chunksize=CHUNKSIZE):
    chunk_df["Berry"] = chunk_df["Name"]
    # You apply any other transformation to the chunk
    # ...
    chunk_df.to_csv(OUTPUT_CSV, header=header, mode=mode)
    header = False # Do not save the header for the other chunks
    mode = "a" # 'a' stands for append mode, all the other chunks will be appended

If you want to update the file inplace, you can use a temporary file and erase it at the end

import pandas as pd

INPUT_CSV = "input.csv"
TMP_CSV = "tmp.csv"
CHUNKSIZE = 1_000 # Maximum number of rows in memory

header = True
mode = "w"
for chunk_df in pd.read_csv(INPUT_CSV, chunksize=CHUNKSIZE):
    chunk_df["Berry"] = chunk_df["Name"]
    # You apply any other transformation to the chunk
    # ...
    chunk_df.to_csv(TMP_CSV, header=header, mode=mode)
    header = False # Do not save the header for the other chunks
    mode = "a" # 'a' stands for append mode, all the other chunks will be appended

os.replace(TMP_CSV, INPUT_CSV)

This code will suffice your request and I have tested on the sample code.

import csv

with open(in_path, 'r') as f_in, open(out_path, 'w') as f_out:
    csv_reader = csv.reader(f_in, delimiter=';')
    writer = csv.writer(f_out)

    for row in csv_reader:
    writer.writerow(row + [row[0]]

For adding a new column to an existing CSV file(with headers), if the column to be added has small enough number of values, here is a convenient function (somewhat similar to @joaquin's solution). The function takes the

  1. Existing CSV filename
  2. Output CSV filename (which will have the updated content) and
  3. List with header name&column values
def add_col_to_csv(csvfile,fileout,new_list):
    with open(csvfile, 'r') as read_f, \
        open(fileout, 'w', newline='') as write_f:
        csv_reader = csv.reader(read_f)
        csv_writer = csv.writer(write_f)
        i = 0
        for row in csv_reader:
            row.append(new_list[i])
            csv_writer.writerow(row)
            i += 1 

Example:

new_list1 = ['test_hdr',4,4,5,5,9,9,9]
add_col_to_csv('exists.csv','new-output.csv',new_list1)

Existing CSV file: enter image description here

Output(updated) CSV file: enter image description here


Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to csv

Pandas: ValueError: cannot convert float NaN to integer Export result set on Dbeaver to CSV Convert txt to csv python script How to import an Excel file into SQL Server? "CSV file does not exist" for a filename with embedded quotes Save Dataframe to csv directly to s3 Python Data-frame Object has no Attribute (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape How to write to a CSV line by line? How to check encoding of a CSV file

Examples related to python-3.x

Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation Replace specific text with a redacted version using Python Upgrade to python 3.8 using conda "Permission Denied" trying to run Python on Windows 10 Python: 'ModuleNotFoundError' when trying to import module from imported package What is the meaning of "Failed building wheel for X" in pip install? How to downgrade python from 3.7 to 3.6 I can't install pyaudio on Windows? How to solve "error: Microsoft Visual C++ 14.0 is required."? Iterating over arrays in Python 3 How to upgrade Python version to 3.7?