[python] how to import csv data into django models

I have some CSV data and I want to import into django models using the example CSV data:

1;"02-01-101101";"Worm Gear HRF 50";"Ratio 1 : 10";"input shaft, output shaft, direction A, color dark green";
2;"02-01-101102";"Worm Gear HRF 50";"Ratio 1 : 20";"input shaft, output shaft, direction A, color dark green";
3;"02-01-101103";"Worm Gear HRF 50";"Ratio 1 : 30";"input shaft, output shaft, direction A, color dark green";
4;"02-01-101104";"Worm Gear HRF 50";"Ratio 1 : 40";"input shaft, output shaft, direction A, color dark green";
5;"02-01-101105";"Worm Gear HRF 50";"Ratio 1 : 50";"input shaft, output shaft, direction A, color dark green";

I have some django models named Product. In Product there are some fields like name, description and price. I want something like this:

product=Product()
product.name = "Worm Gear HRF 70(02-01-101116)"
product.description = "input shaft, output shaft, direction A, color dark green"
product.price = 100

This question is related to python django csv django-models export

The answer is


You can also use, django-adaptors

>>> from adaptor.model import CsvModel
>>> class MyCSvModel(CsvModel):
...     name = CharField()
...     age = IntegerField()
...     length = FloatField()
...
...     class Meta:
...         delimiter = ";"

You declare a MyCsvModel which will match to a CSV file like this:

Anthony;27;1.75

To import the file or any iterable object, just do:

>>> my_csv_list = MyCsvModel.import_data(data = open("my_csv_file_name.csv"))
>>> first_line = my_csv_list[0]
>>> first_line.age
    27

Without an explicit declaration, data and columns are matched in the same order:

Anthony --> Column 0 --> Field 0 --> name
27      --> Column 1 --> Field 1 --> age
1.75    --> Column 2 --> Field 2 --> length

If you want to use a library, a quick google search for csv and django reveals two libraries - django-csvimport and django-adaptors. Let's read what they have to say about themselves...

  • django-adaptors:

Django adaptor is a tool which allow you to transform easily a CSV/XML file into a python object or a django model instance.

  • django-importcsv:

django-csvimport is a generic importer tool to allow the upload of CSV files for populating data.

The first requires you to write a model to match the csv file, while the second is more of a command-line importer, which is a huge difference in the way you work with them, and each is good for a different type of project.

So which one to use? That depends on which of those will be better suited for your project in the long run.

However, you can also avoid a library altogether, by writing your own django script to import your csv file, something along the lines of (warning, pseudo-code ahead):

# open file & create csvreader
import csv, yada yada yada

# import the relevant model
from myproject.models import Foo

#loop:
for line in csv file:
     line = parse line to a list
     # add some custom validation\parsing for some of the fields

     foo = Foo(fieldname1=line[1], fieldname2=line[2] ... etc. )
     try:
         foo.save()
     except:
         # if the're a problem anywhere, you wanna know about it
         print "there was a problem with line", i 

It's super easy. Hell, you can do it interactively through the django shell if it's a one-time import. Just - figure out what you want to do with your project, how many files do you need to handle and then - if you decide to use a library, try figuring out which one better suits your needs.


The Python csv library can do your parsing and your code can translate them into Products().


If you're working with new versions of Django (>10) and don't want to spend time writing the model definition. you can use the ogrinspect tool.

This will create a code definition for the model .

python manage.py ogrinspect [/path/to/thecsv] Product

The output will be the class (model) definition. In this case the model will be called Product. You need to copy this code into your models.py file.

Afterwards you need to migrate (in the shell) the new Product table with:

python manage.py makemigrations
python manage.py migrate

More information here: https://docs.djangoproject.com/en/1.11/ref/contrib/gis/tutorial/

Do note that the example has been done for ESRI Shapefiles but it works pretty good with standard CSV files as well.

For ingesting your data (in CSV format) you can use pandas.

import pandas as pd
your_dataframe = pd.read_csv(path_to_csv)
# Make a row iterator (this will go row by row)
iter_data = your_dataframe.iterrows()

Now, every row needs to be transformed into a dictionary and use this dict for instantiating your model (in this case, Product())

# python 2.x
map(lambda (i,data) : Product.objects.create(**dict(data)),iter_data

Done, check your database now.


For django 1.8 that im using,

I made a command that you can create objects dynamically in the future, so you can just put the file path of the csv, the model name and the app name of the relevant django application, and it will populate the relevant model without specified the field names. so if we take for example the next csv:

field1,field2,field3
value1,value2,value3
value11,value22,value33

it will create the objects [{field1:value1,field2:value2,field3:value3}, {field1:value11,field2:value22,field3:value33}] for the model name you will enter to the command.

the command code:

from django.core.management.base import BaseCommand
from django.db.models.loading import get_model
import csv


class Command(BaseCommand):
    help = 'Creating model objects according the file path specified'

    def add_arguments(self, parser):
        parser.add_argument('--path', type=str, help="file path")
        parser.add_argument('--model_name', type=str, help="model name")
        parser.add_argument('--app_name', type=str, help="django app name that the model is connected to")

    def handle(self, *args, **options):
        file_path = options['path']
        _model = get_model(options['app_name'], options['model_name'])
        with open(file_path, 'rb') as csv_file:
            reader = csv.reader(csv_file, delimiter=',', quotechar='|')
            header = reader.next()
            for row in reader:
                _object_dict = {key: value for key, value in zip(header, row)}
                _model.objects.create(**_object_dict)

note that maybe in later versions

from django.db.models.loading import get_model

is deprecated and need to be change to

from django.apps.apps import get_model

You want to use the csv module that is part of the python language and you should use Django's get_or_create method

 with open(path) as f:
        reader = csv.reader(f)
        for row in reader:
            _, created = Teacher.objects.get_or_create(
                first_name=row[0],
                last_name=row[1],
                middle_name=row[2],
                )
            # creates a tuple of the new object or
            # current object and a boolean of if it was created

In my example the model teacher has three attributes first_name, last_name and middle_name.

Django documentation of get_or_create method


Consider using Django's built-in deserializers. Django's docs are well-written and can help you get started. Consider converting your data from csv to XML or JSON and using a deserializer to import the data. If you're doing this from the command line (rather than through a web request), the loaddata manage.py command will be especially helpful.


You can use the django-csv-importer package. http://pypi.python.org/pypi/django-csv-importer/0.1.1

It works like a django model

MyCsvModel(CsvModel):
    field1 = IntegerField()
    field2 = CharField()
    etc

    class Meta:
        delimiter = ";"
        dbModel = Product

And you just have to: CsvModel.import_from_file("my file")

That will automatically create your products.


You can give a try to django-import-export. It has nice admin integration, changes preview, can create, update, delete objects.


Use the Pandas library to create a dataframe of the csv data.
Name the fields either by including them in the csv file's first line or in code by using the dataframe's columns method.
Then create a list of model instances.
Finally use the django method .bulk_create() to send your list of model instances to the database table.

The read_csv function in pandas is great for reading csv files and gives you lots of parameters to skip lines, omit fields, etc.

import pandas as pd

tmp_data=pd.read_csv('file.csv',sep=';')
#ensure fields are named~ID,Product_ID,Name,Ratio,Description
#concatenate name and Product_id to make a new field a la Dr.Dee's answer
products = [
    Product(
        name = tmp_data.ix[row]['Name'] 
        description = tmp_data.ix[row]['Description'],
        price = tmp_data.ix[row]['price'],
    )
    for row in tmp_data['ID']
]
Product.objects.bulk_create(products)

I was using the answer by mmrs151 but saving each row (instance) was very slow and any fields containing the delimiting character (even inside of quotes) were not handled by the open() -- line.split(';') method.

Pandas has so many useful caveats, it is worth getting to know


This is based off of Erik's answer from earlier, but I've found it easiest to read in the .csv file using pandas and then create a new instance of the class for every row in the in data frame.

This example is updated using iloc as pandas no longer uses ix in the most recent version. I don't know about Erik's situation but you need to create the list outside of the for loop otherwise it will not append to your array but simply overwrite it.

import pandas as pd
df = pd.read_csv('path_to_file', sep='delimiter')
products = []
for i in range(len(df)):
    products.append(
        Product(
        name=df.iloc[i][0]
        description=df.iloc[i][1]
        price=df.iloc[i][2]
        )
    )
Product.objects.bulk_create(products)

This is just breaking the DataFrame into an array of rows and then selecting each column out of that array off the zero index. (i.e. name is the first column, description the second, etc.)

Hope that helps.


something like this:

f = open('data.txt', 'r')  
for line in f:  
   line =  line.split(';')  
   product = Product()  
   product.name = line[2] + '(' + line[1] + ')'  
   product.description = line[4]  
   product.price = '' #data is missing from file  
   product.save()  

f.close()  

Here's a django egg for it:

django-csvimport


define class in models.py and a function in it.

class all_products(models.Model):
    def get_all_products():
        items = []
        with open('EXACT FILE PATH OF YOUR CSV FILE','r') as fp:
            # You can also put the relative path of csv file
            # with respect to the manage.py file
            reader1 = csv.reader(fp, delimiter=';')
            for value in reader1:
                items.append(value)
        return items

You can access ith element in the list as items[i]


Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to django

How to fix error "ERROR: Command errored out with exit status 1: python." when trying to install django-heroku using pip Pylint "unresolved import" error in Visual Studio Code Is it better to use path() or url() in urls.py for django 2.0? Unable to import path from django.urls Error loading MySQLdb Module 'Did you install mysqlclient or MySQL-python?' ImportError: Couldn't import Django Django - Reverse for '' not found. '' is not a valid view function or pattern name Class has no objects member Getting TypeError: __init__() missing 1 required positional argument: 'on_delete' when trying to add parent table after child table with entries How to switch Python versions in Terminal?

Examples related to csv

Pandas: ValueError: cannot convert float NaN to integer Export result set on Dbeaver to CSV Convert txt to csv python script How to import an Excel file into SQL Server? "CSV file does not exist" for a filename with embedded quotes Save Dataframe to csv directly to s3 Python Data-frame Object has no Attribute (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape How to write to a CSV line by line? How to check encoding of a CSV file

Examples related to django-models

Getting TypeError: __init__() missing 1 required positional argument: 'on_delete' when trying to add parent table after child table with entries "Post Image data using POSTMAN" What does on_delete do on Django models? Django values_list vs values What's the difference between select_related and prefetch_related in Django ORM? Django Model() vs Model.objects.create() 'NOT NULL constraint failed' after adding to models.py django - get() returned more than one topic How to convert Django Model object to dict with its fields and values? How do I make an auto increment integer field in Django?

Examples related to export

Export multiple classes in ES6 modules Why Is `Export Default Const` invalid? How to properly export an ES6 class in Node 4? ES6 export all values from object Export a list into a CSV or TXT file in R How to export database schema in Oracle to a dump file Excel VBA to Export Selected Sheets to PDF How to export all data from table to an insertable sql format? Export pictures from excel file into jpg using VBA -bash: export: `=': not a valid identifier