[python] Creating a dictionary from a CSV file

I am in the process of trying to write a python script that will take input from a CSV file and then push it into a dictionary format (I am using Python 3.x).

I use the code below to read in the CSV file and that works:

import csv

reader = csv.reader(open('C:\\Users\\Chris\\Desktop\\test.csv'), delimiter=',', quotechar='|')

for row in reader:
    print(', '.join(row))

But now I want to place the results into a dictionary. I would like the first row of the CSV file to be used as the "key" field for the dictionary with the subsequent rows in the CSV file filling out the data portion.

Sample Data:

     Date        First Name     Last Name     Score
12/28/2012 15:15        John          Smith        20
12/29/2012 15:15        Alex          Jones        38
12/30/2012 15:15      Michael       Carpenter      25

There are additional things I would like to do with this code but for now just getting the dictionary to work is what I am looking for.

Can anyone help me with this?

EDITED Version 2:

import csv
reader = csv.DictReader(open('C:\\Users\\Chris\\Desktop\\test.csv'))

result = {}

for row in reader:
    for column, value in row.items():
        result.setdefault(column, []).append(value)
        print('Column -> ', column, '\nValue -> ', value)
print(result)

fieldnames = result.keys()

csvwriter = csv.DictWriter(open('C:\\Users\\Chris\\Desktop\\test_out.csv', 'w'), delimiter=',', fieldnames=result.keys())

csvwriter.writerow(dict((fn,fn) for fn in fieldnames))

for row in result.items():
    print('Values -> ', row)
    #csvwriter.writerow(row)

'''
Test output

'''
test_array = []
test_array.append({'fruit': 'apple', 'quantity': 5, 'color': 'red'});
test_array.append({'fruit': 'pear', 'quantity': 8, 'color': 'green'});
test_array.append({'fruit': 'banana', 'quantity': 3, 'color': 'yellow'});
test_array.append({'fruit': 'orange', 'quantity': 11, 'color': 'orange'});
fieldnames = ['fruit', 'quantity', 'color']
test_file = open('C:\\Users\\Chris\\Desktop\\test_out.csv','w')
csvwriter = csv.DictWriter(test_file, delimiter=',', fieldnames=fieldnames)
csvwriter.writerow(dict((fn,fn) for fn in fieldnames))
for row in test_array:
    print(row)
    csvwriter.writerow(row)
test_file.close()

This question is related to python csv dictionary

The answer is


If you have:

  1. Only 1 key and 1 value as key,value in your csv
  2. Do not want to import other packages
  3. Want to create a dict in one shot

Do this:

mydict = {y[0]: y[1] for y in [x.split(",") for x in open('file.csv').read().split('\n') if x]}

What does it do?

It uses list comprehension to split lines and the last "if x" is used to ignore blank line (usually at the end) which is then unpacked into a dict using dictionary comprehension.


One-liner solution

import pandas as pd

dict = {row[0] : row[1] for _, row in pd.read_csv("file.csv").iterrows()}

Many solutions have been posted and I'd like to contribute with mine, which works for a different number of columns in the CSV file. It creates a dictionary with one key per column, and the value for each key is a list with the elements in such column.

    input_file = csv.DictReader(open(path_to_csv_file))
    csv_dict = {elem: [] for elem in input_file.fieldnames}
    for row in input_file:
        for key in csv_dict.keys():
            csv_dict[key].append(row[key])

You have to just convert csv.reader to dict:

~ >> cat > 1.csv
key1, value1
key2, value2
key2, value22
key3, value3

~ >> cat > d.py
import csv
with open('1.csv') as f:
    d = dict(filter(None, csv.reader(f)))

print(d)

~ >> python d.py
{'key3': ' value3', 'key2': ' value22', 'key1': ' value1'}

Assuming you have a CSV of this structure:

"a","b"
1,2
3,4
5,6

And you want the output to be:

[{'a': '1', ' "b"': '2'}, {'a': '3', ' "b"': '4'}, {'a': '5', ' "b"': '6'}]

A zip function (not yet mentioned) is simple and quite helpful.

def read_csv(filename):
    with open(filename) as f:
        file_data=csv.reader(f)
        headers=next(file_data)
        return [dict(zip(headers,i)) for i in file_data]

You can use this, it is pretty cool:

import dataconverters.commas as commas
filename = 'test.csv'
with open(filename) as f:
      records, metadata = commas.parse(f)
      for row in records:
            print 'this is row in dictionary:'+rowenter code here

If you are OK with using the numpy package, then you can do something like the following:

import numpy as np

lines = np.genfromtxt("coors.csv", delimiter=",", dtype=None)
my_dict = dict()
for i in range(len(lines)):
   my_dict[lines[i][0]] = lines[i][1]

For simple csv files, such as the following

id,col1,col2,col3
row1,r1c1,r1c2,r1c3
row2,r2c1,r2c2,r2c3
row3,r3c1,r3c2,r3c3
row4,r4c1,r4c2,r4c3

You can convert it to a Python dictionary using only built-ins

with open(csv_file) as f:
    csv_list = [[val.strip() for val in r.split(",")] for r in f.readlines()]

(_, *header), *data = csv_list
csv_dict = {}
for row in data:
    key, *values = row   
    csv_dict[key] = {key: value for key, value in zip(header, values)}

This should yield the following dictionary

{'row1': {'col1': 'r1c1', 'col2': 'r1c2', 'col3': 'r1c3'},
 'row2': {'col1': 'r2c1', 'col2': 'r2c2', 'col3': 'r2c3'},
 'row3': {'col1': 'r3c1', 'col2': 'r3c2', 'col3': 'r3c3'},
 'row4': {'col1': 'r4c1', 'col2': 'r4c2', 'col3': 'r4c3'}}

Note: Python dictionaries have unique keys, so if your csv file has duplicate ids you should append each row to a list.

for row in data:
    key, *values = row

    if key not in csv_dict:
            csv_dict[key] = []

    csv_dict[key].append({key: value for key, value in zip(header, values)})

with pandas, it is much easier, for example. assuming you have the following data as CSV and let's call it test.txt / test.csv (you know CSV is a sort of text file )

a,b,c,d
1,2,3,4
5,6,7,8

now using pandas

import pandas as pd
df = pd.read_csv("./text.txt")
df_to_doct = df.to_dict()

for each row, it would be

df.to_dict(orient='records')

and that's it.


You need a Python DictReader class. More help can be found from here

import csv

with open('file_name.csv', 'rt') as f:
    reader = csv.DictReader(f)
    for row in reader:
        print row

import csv
reader = csv.reader(open('filename.csv', 'r'))
d = {}
for row in reader:
   k, v = row
   d[k] = v

I believe the syntax you were looking for is as follows:

import csv

with open('coors.csv', mode='r') as infile:
    reader = csv.reader(infile)
    with open('coors_new.csv', mode='w') as outfile:
        writer = csv.writer(outfile)
        mydict = {rows[0]:rows[1] for rows in reader}

Alternately, for python <= 2.7.1, you want:

mydict = dict((rows[0],rows[1]) for rows in reader)

This isn't elegant but a one line solution using pandas.

import pandas as pd
pd.read_csv('coors.csv', header=None, index_col=0, squeeze=True).to_dict()

If you want to specify dtype for your index (it can't be specified in read_csv if you use the index_col argument because of a bug):

import pandas as pd
pd.read_csv('coors.csv', header=None, dtype={0: str}).set_index(0).squeeze().to_dict()

Help from @phil-frost was very helpful, was exactly what I was looking for.

I have made few tweaks after that so I'm would like to share it here:

def csv_as_dict(file, ref_header, delimiter=None):

    import csv
    if not delimiter:
        delimiter = ';'
    reader = csv.DictReader(open(file), delimiter=delimiter)
    result = {}
    for row in reader:
        print(row)
        key = row.pop(ref_header)
        if key in result:
            # implement your duplicate row handling here
            pass
        result[key] = row
    return result

You can call it:

myvar = csv_as_dict(csv_file, 'ref_column')

Where ref_colum will be your main key for each row.


You can also use numpy for this.

from numpy import loadtxt
key_value = loadtxt("filename.csv", delimiter=",")
mydict = { k:v for k,v in key_value }

I'd suggest adding if rows in case there is an empty line at the end of the file

import csv
with open('coors.csv', mode='r') as infile:
    reader = csv.reader(infile)
    with open('coors_new.csv', mode='w') as outfile:
        writer = csv.writer(outfile)
        mydict = dict(row[:2] for row in reader if row)

Try to use a defaultdict and DictReader.

import csv
from collections import defaultdict
my_dict = defaultdict(list)

with open('filename.csv', 'r') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for line in csv_reader:
        for key, value in line.items():
            my_dict[key].append(value)

It returns:

{'key1':[value_1, value_2, value_3], 'key2': [value_a, value_b, value_c], 'Key3':[value_x, Value_y, Value_z]}

Open the file by calling open and then using csv.DictReader.

input_file = csv.DictReader(open("coors.csv"))

You may iterate over the rows of the csv file dict reader object by iterating over input_file.

for row in input_file:
    print(row)

OR To access first line only

dictobj = csv.DictReader(open('coors.csv')).next() 

UPDATE In python 3+ versions, this code would change a little:

reader = csv.DictReader(open('coors.csv'))
dictobj = next(reader) 

Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to csv

Pandas: ValueError: cannot convert float NaN to integer Export result set on Dbeaver to CSV Convert txt to csv python script How to import an Excel file into SQL Server? "CSV file does not exist" for a filename with embedded quotes Save Dataframe to csv directly to s3 Python Data-frame Object has no Attribute (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape How to write to a CSV line by line? How to check encoding of a CSV file

Examples related to dictionary

JS map return object python JSON object must be str, bytes or bytearray, not 'dict Python update a key in dict if it doesn't exist How to update the value of a key in a dictionary in Python? How to map an array of objects in React C# Dictionary get item by index Are dictionaries ordered in Python 3.6+? Split / Explode a column of dictionaries into separate columns with pandas Writing a dictionary to a text file? enumerate() for dictionary in python