Importing a CSV file into a sqlite3 database table using Python

Question

I have a CSV file and I want to bulk-import this file into my sqlite3 database using Python. The command is ".import ...", but it seems that it cannot work like this. Can anyone give me an example of how to do it in sqlite3? I am using Windows, just in case. Thanks.

User · Answer

If the CSV file must be imported as part of a python program, then for simplicity and efficiency, you could use os.system along the lines suggested by the following:

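import os

# Pass the dot-commands as arguments rather than through a bash-only
# here-string (<<<), so the same command also runs under cmd.exe on Windows.
# ".mode csv" makes .import split the input on commas.
cmd = 'sqlite3 database.db ".mode csv" ".import input.csv mytable"'
rc = os.system(cmd)
print(rc)  # 0 means the sqlite3 shell exited without an error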

The point is that by specifying the filename of the database, the data will automatically be saved, assuming there are no errors reading it.

User · Answer

Creating an sqlite connection to a file on disk is left as an exercise for the reader, but there is now a two-liner made possible by the pandas library:

df = pandas.read_csv(csvfile)
df.to_sql(table_name, conn, if_exists='append', index=False)
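For the part left as an exercise, a minimal sketch using only the standard library might look like the following. The helper names save_rows and load_rows, and the items table, are made up for illustration; the point is only that sqlite3.connect() with a file path gives a connection that persists to disk, and that same kind of connection is what to_sql expects as conn.

```python
import sqlite3
from contextlib import closing


def save_rows(db_path, rows):
    """Write (name, value) pairs to a table called `items` in the file `db_path`.

    `save_rows` and the `items` table are hypothetical names for this sketch.
    """
    # sqlite3.connect creates the database file if it does not exist yet.
    with closing(sqlite3.connect(db_path)) as conn:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS items (name TEXT, value INTEGER)"
            )
            conn.executemany("INSERT INTO items VALUES (?, ?)", rows)


def load_rows(db_path):
    """Read the rows back through a fresh connection to the same file."""
    with closing(sqlite3.connect(db_path)) as conn:
        return conn.execute(
            "SELECT name, value FROM items ORDER BY rowid"
        ).fetchall()
```

Calling save_rows("my_data.db", [("a", 1)]) and then load_rows("my_data.db") from a separate run should return the stored rows, since the data lives in the file rather than in memory.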

User · Answer

The .import command is a feature of the sqlite3 command-line tool. To do it in Python, you should simply load the data using whatever facilities Python has, such as the csv module, and insert the data as per usual. This way, you also have control over what types are inserted, rather than relying on sqlite3's seemingly undocumented behaviour.
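As a rough illustration of that approach, the sketch below reads a CSV stream with the csv module and converts each field before inserting it. The people table, its columns, and the sample data are assumptions invented for the example.

```python
import csv
import io
import sqlite3


def import_people(conn, csv_text_stream):
    """Load rows with columns name, age, height into a `people` table.

    Table and column names are illustrative assumptions.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS people (name TEXT, age INTEGER, height REAL)"
    )
    reader = csv.DictReader(csv_text_stream)
    # Convert each field explicitly so the stored types are under our control.
    rows = (
        (row["name"], int(row["age"]), float(row["height"]))
        for row in reader
    )
    with conn:  # commit on success, roll back on error
        conn.executemany("INSERT INTO people VALUES (?, ?, ?)", rows)


conn = sqlite3.connect(":memory:")
sample = io.StringIO("name,age,height\nAda,36,1.65\nLinus,28,1.80\n")
import_people(conn, sample)
```

For a real file, open it with open(path, newline="") and pass the file object in place of the StringIO sample.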

User · Answer

The following can also create the column names from the CSV header:

import sqlite3

def csv_sql(file_dir, table_name, database_name):
    con = sqlite3.connect(database_name)
    cur = con.cursor()
    # Drop the current table by:
    # cur.execute("DROP TABLE IF EXISTS %s;" % table_name)

    with open(file_dir, 'r') as fl:
        # rstrip('\n') also copes with a last line that has no trailing newline
        hd = fl.readline().rstrip('\n').split(',')
        # Note: a plain split(',') does not handle quoted fields containing commas
        db = [tuple(line.rstrip('\n').split(',')) for line in fl]

    header = ','.join(hd)
    cur.execute("CREATE TABLE IF NOT EXISTS %s (%s);" % (table_name, header))
    cur.executemany("INSERT INTO %s (%s) VALUES (%s);" % (table_name, header, ('?,' * len(hd))[:-1]), db)
    con.commit()
    con.close()

# Example:
csv_sql('./surveys.csv', 'survey', 'eco.db')
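If the file may contain quoted fields (for example a value such as "Doe, Jane") or header names with spaces, a variant built on the csv module is one option. This is only a sketch: the function names csv_to_table and quote_identifier and the sample data are invented here, and the columns are created without declared types.

```python
import csv
import io
import sqlite3


def quote_identifier(name):
    """Quote a table or column name for SQLite by doubling embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def csv_to_table(conn, stream, table_name):
    """Create `table_name` from the CSV header and insert the remaining rows."""
    reader = csv.reader(stream)
    header = next(reader)  # first row holds the column names
    columns = ", ".join(quote_identifier(h) for h in header)
    placeholders = ", ".join("?" for _ in header)
    table = quote_identifier(table_name)
    with conn:  # commit on success, roll back on error
        conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({columns})")
        conn.executemany(
            f"INSERT INTO {table} ({columns}) VALUES ({placeholders})", reader
        )


conn = sqlite3.connect(":memory:")
data = io.StringIO('id,full name\n1,"Doe, Jane"\n2,"Roe, Rick"\n')
csv_to_table(conn, data, "survey")
```

The csv reader keeps "Doe, Jane" as a single field, and quoting the identifiers lets a header such as "full name" be used as a column name.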

User · Answer

My 2 cents (more generic):

import csv, sqlite3

def _get_col_datatypes(fin):
    dr = csv.DictReader(fin)  # comma is default delimiter
    fieldTypes = {}
    for entry in dr:
        fieldsLeft = [f for f in dr.fieldnames if f not in fieldTypes]
        if not fieldsLeft:
            break  # We're done
        for field in fieldsLeft:
            data = entry[field]
            # Need data to decide
            if len(data) == 0:
                continue
            if data.isdigit():
                fieldTypes[field] = "INTEGER"
            else:
                fieldTypes[field] = "TEXT"
        # TODO: Currently there's no support for DATE in sqlite

    # Re-check after the loop so that a type found on the last row still counts
    fieldsLeft = [f for f in (dr.fieldnames or []) if f not in fieldTypes]
    if fieldsLeft:
        raise Exception("Failed to find all the columns data types - Maybe some are empty?")
    return fieldTypes


def escapingGenerator(f):
    # Replace non-ASCII characters with XML character references
    for line in f:
        yield line.encode("ascii", "xmlcharrefreplace").decode("ascii")


def csvToDb(csvFile, outputToFile=False):
    # TODO: implement output to file
    with open(csvFile, mode='r', encoding="ISO-8859-1", newline='') as fin:
        dt = _get_col_datatypes(fin)
        fin.seek(0)
        reader = csv.DictReader(fin)

        # Keep the order of the columns name just as in the CSV
        fields = reader.fieldnames
        cols = []

        # Set field and type
        for f in fields:
            cols.append("%s %s" % (f, dt[f]))

        # Generate create table statement:
        stmt = "CREATE TABLE ads (%s)" % ",".join(cols)
        con = sqlite3.connect(":memory:")
        cur = con.cursor()
        cur.execute(stmt)

        fin.seek(0)
        reader = csv.reader(escapingGenerator(fin))
        next(reader)  # skip the header row so it is not inserted as data

        # Generate insert statement:
        stmt = "INSERT INTO ads VALUES(%s);" % ','.join('?' * len(cols))
        cur.executemany(stmt, reader)
        con.commit()

    return con
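One limitation of the isdigit() check is that values such as "-5" or "3.14" are classified as TEXT. A possible refinement, sketched below under the assumption that a column should be INTEGER, REAL, or TEXT, tries int() and float() on every non-empty value. The function name infer_sqlite_type is made up for this example, and note that Python's int() and float() accept a few forms (such as "1_000" or "nan") that you may not want to treat as numbers.

```python
def infer_sqlite_type(values):
    """Guess INTEGER, REAL or TEXT for a column from its non-empty values.

    `infer_sqlite_type` is a hypothetical helper, not part of the answer above.
    """
    seen = False
    kind = "INTEGER"
    for value in values:
        if value == "":
            continue  # empty cells carry no type information
        seen = True
        try:
            int(value)
            continue  # still compatible with the current guess
        except ValueError:
            pass
        try:
            float(value)
            kind = "REAL"  # at least one value needs a fractional type
        except ValueError:
            return "TEXT"  # one non-numeric value makes the column TEXT
    # A column with no data at all falls back to TEXT
    return kind if seen else "TEXT"
```

Unlike the first-non-empty-value approach, this looks at the whole column, which costs one full pass over the data but avoids typing a column as INTEGER when a later row holds text.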

User · Answer

I've found that it can be necessary to break up the transfer of data from the CSV to the database into chunks so as not to run out of memory. This can be done like this:

import csv
import sqlite3
from operator import itemgetter

# Establish connection
conn = sqlite3.connect("mydb.db")

# Create the table
conn.execute(
    """
    CREATE TABLE persons(
        person_id INTEGER,
        last_name TEXT,
        first_name TEXT,
        address TEXT
    )
    """
)

# These are the columns from the csv that we want
cols = ["person_id", "last_name", "first_name", "address"]

# If the csv file is huge, we instead add the data in chunks
chunksize = 10000

insert_stmt = """
    INSERT INTO persons
        VALUES(?, ?, ?, ?)
    """

# Parse csv file and populate db in chunks
with conn, open("persons.csv", newline="") as f:
    reader = csv.DictReader(f)

    chunk = []
    for i, row in enumerate(reader):
        if i % chunksize == 0 and i > 0:
            conn.executemany(insert_stmt, chunk)
            chunk = []

        items = itemgetter(*cols)(row)
        chunk.append(items)

    # Insert whatever is left in the last, partially filled chunk
    if chunk:
        conn.executemany(insert_stmt, chunk)
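The chunking idea can also be factored into a small reusable helper. The sketch below uses itertools.islice to pull a fixed number of rows at a time from any iterator; the names chunks and load_in_chunks, the two-column persons table, and the sample data are assumptions made for the example.

```python
import csv
import io
import sqlite3
from itertools import islice


def chunks(iterable, size):
    """Yield lists of up to `size` items until `iterable` is exhausted."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def load_in_chunks(conn, stream, size):
    """Insert (person_id, last_name) rows from a CSV stream, `size` rows at a time."""
    reader = csv.DictReader(stream)
    rows = ((r["person_id"], r["last_name"]) for r in reader)
    inserted = 0
    for chunk in chunks(rows, size):
        with conn:  # one transaction per chunk
            conn.executemany("INSERT INTO persons VALUES (?, ?)", chunk)
        inserted += len(chunk)
    return inserted


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persons (person_id INTEGER, last_name TEXT)")
sample = io.StringIO("person_id,last_name\n1,Ng\n2,Diaz\n3,Okoye\n")
total = load_in_chunks(conn, sample, size=2)
```

Because the helper yields the final short chunk as well, there is no separate step for leftover rows, and only one chunk is held in memory at a time.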

User · Answer

import csv, sqlite3

con = sqlite3.connect(":memory:")  # change to 'your_filename.db' to store the data on disk
cur = con.cursor()
cur.execute("CREATE TABLE t (col1, col2);")  # use your column names here

with open('data.csv', 'r', newline='') as fin:
    # csv.DictReader uses first line in file for column headings by default
    dr = csv.DictReader(fin)  # comma is default delimiter
    to_db = [(i['col1'], i['col2']) for i in dr]

cur.executemany("INSERT INTO t (col1, col2) VALUES (?, ?);", to_db)
con.commit()
con.close()

User · Answer

#!/usr/bin/python
# -*- coding: utf-8 -*-

import sys, csv, sqlite3

def main():
    con = sqlite3.connect(sys.argv[1])  # database file input
    cur = con.cursor()
    cur.executescript("""
        DROP TABLE IF EXISTS t;
        CREATE TABLE t (COL1 TEXT, COL2 TEXT);
        """)  # drops the table if it exists and makes a fresh table

    # Python 3: open in text mode as UTF-8, so no unicode() conversion is needed
    with open(sys.argv[2], "r", encoding="utf8", newline="") as f:  # CSV file input
        reader = csv.reader(f, delimiter=',')  # no header information with delimiter
        for row in reader:
            to_db = [row[0], row[1]]
            # Insert into t, the table created above
            cur.execute("INSERT INTO t (COL1, COL2) VALUES(?, ?);", to_db)
    con.commit()  # commit once, after all rows are inserted
    con.close()  # closes connection to database

if __name__ == '__main__':
    main()

User · Answer

Many thanks for bernie's answer! Had to tweak it a bit - here's what worked for me:

import csv, sqlite3

conn = sqlite3.connect("pcfc.sl3")
curs = conn.cursor()
curs.execute("CREATE TABLE PCFC (id INTEGER PRIMARY KEY, type INTEGER, term TEXT, definition TEXT);")
# Python 3: open in text mode as UTF-8, so no unicode() conversion is needed
with open('PC.txt', 'r', encoding='utf8', newline='') as f:
    reader = csv.reader(f, delimiter='|')
    for row in reader:
        # strip() removes the spaces around each "|" separator
        to_db = [field.strip() for field in row[:3]]
        curs.execute("INSERT INTO PCFC (type, term, definition) VALUES (?, ?, ?);", to_db)
conn.commit()

My text file (PC.txt) looks like this:

1 | Term 1 | Definition 1
2 | Term 2 | Definition 2
3 | Term 3 | Definition 3

User · Answer

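Here are solutions that'll work if your CSV file is really big.

Use to_sql as suggested by another answer, but work in chunks so it doesn't try to process the whole file at once. Note that the chunksize argument of to_sql only batches the writes; read_csv still loads the whole file unless it is given its own chunksize:

import sqlite3
import pandas as pd

conn = sqlite3.connect('my_data.db')
# With chunksize, read_csv yields DataFrames of at most 10000 rows each
for users in pd.read_csv('users.csv', chunksize=10000):
    users.to_sql('users', conn, if_exists='append', index=False)
conn.close()

You can also use Dask to write a lot of Pandas DataFrames in parallel:

import dask
import pandas as pd

# ddf is an existing Dask DataFrame; db_url is a SQLAlchemy-style database URL
dto_sql = dask.delayed(pd.DataFrame.to_sql)
out = [dto_sql(d, 'table_name', db_url, if_exists='append', index=True)
       for d in ddf.to_delayed()]
dask.compute(*out)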

User · Answer

Based on Guy L's solution (love it), but this version can handle escaped fields:

import csv, sqlite3

def _get_col_datatypes(fin):
    dr = csv.DictReader(fin)  # comma is default delimiter
    fieldTypes = {}
    for entry in dr:
        fieldsLeft = [f for f in dr.fieldnames if f not in fieldTypes]
        if not fieldsLeft:
            break  # We're done
        for field in fieldsLeft:
            data = entry[field]
            # Need data to decide
            if len(data) == 0:
                continue
            if data.isdigit():
                fieldTypes[field] = "INTEGER"
            else:
                fieldTypes[field] = "TEXT"
        # TODO: Currently there's no support for DATE in sqlite

    # Re-check after the loop so that a type found on the last row still counts
    fieldsLeft = [f for f in (dr.fieldnames or []) if f not in fieldTypes]
    if fieldsLeft:
        raise Exception("Failed to find all the columns data types - Maybe some are empty?")
    return fieldTypes


def escapingGenerator(f):
    # Replace non-ASCII characters with XML character references
    for line in f:
        yield line.encode("ascii", "xmlcharrefreplace").decode("ascii")


def csvToDb(csvFile, dbFile, tablename, outputToFile=False):
    # TODO: implement output to file
    with open(csvFile, mode='r', encoding="ISO-8859-1", newline='') as fin:
        dt = _get_col_datatypes(fin)
        fin.seek(0)
        reader = csv.DictReader(fin)

        # Keep the order of the columns name just as in the CSV
        fields = reader.fieldnames
        cols = []

        # Set field and type; column names are double-quoted
        for f in fields:
            cols.append("\"%s\" %s" % (f, dt[f]))

        # Generate create table statement:
        stmt = "create table if not exists \"" + tablename + "\" (%s)" % ",".join(cols)
        print(stmt)
        con = sqlite3.connect(dbFile)
        cur = con.cursor()
        cur.execute(stmt)

        fin.seek(0)
        reader = csv.reader(escapingGenerator(fin))
        next(reader)  # skip the header row so it is not inserted as data

        # Generate insert statement:
        stmt = "INSERT INTO \"" + tablename + "\" VALUES(%s);" % ','.join('?' * len(cols))
        cur.executemany(stmt, reader)
        con.commit()
        con.close()

User · Answer

In the interest of simplicity, you could use the sqlite3 command line tool from the Makefile of your project (recipe lines must be indented with a tab):

%.sql3: %.csv
	rm -f $@
	sqlite3 $@ -echo -cmd ".mode csv" ".import $< $*"

%.dump: %.sql3
	sqlite3 $< "select * from $*"

make test.sql3 then creates the sqlite database from an existing test.csv file, with a single table "test". You can then make test.dump to verify the contents.

User · Answer

With this you can do joins on CSVs as well:

import sqlite3
import os
import pandas as pd
from typing import List

class CSVDriver:
    def __init__(self, table_dir_path: str):
        self.table_dir_path = table_dir_path  # where tables (ie. csv files) are located
        self._con = None

    @property
    def con(self) -> sqlite3.Connection:
        """Make a singleton connection to an in-memory SQLite database"""
        if not self._con:
            self._con = sqlite3.connect(":memory:")
        return self._con

    def _exists(self, table: str) -> bool:
        query = """
        SELECT name
        FROM sqlite_master
        WHERE type ='table'
        AND name NOT LIKE 'sqlite_%';
        """
        tables = self.con.execute(query).fetchall()
        # fetchall() returns one-element tuples, so compare against a tuple
        return (table,) in tables

    def _load_table_to_mem(self, table: str, sep: str = None) -> None:
        """
        Load a CSV into an in-memory SQLite database
        sep is set to None in order to force pandas to auto-detect the delimiter
        """
        if self._exists(table):
            return
        file_name = table + ".csv"
        path = os.path.join(self.table_dir_path, file_name)
        if not os.path.exists(path):
            raise ValueError(f"CSV table {table} does not exist in {self.table_dir_path}")
        df = pd.read_csv(path, sep=sep, engine="python")  # set engine to python to skip pandas' warning
        df.to_sql(table, self.con, if_exists='replace', index=False, chunksize=10000)

    def query(self, query: str) -> List[tuple]:
        """
        Run an SQL query on CSV file(s).
        Tables are loaded from table_dir_path
        """
        tables = extract_tables(query)
        for table in tables:
            self._load_table_to_mem(table)
        cursor = self.con.cursor()
        cursor.execute(query)
        records = cursor.fetchall()
        return records

extract_tables():

import sqlparse
from sqlparse.sql import IdentifierList, Identifier, Function
from sqlparse.tokens import Keyword, DML
from collections import namedtuple
import itertools

class Reference(namedtuple('Reference', ['schema', 'name', 'alias', 'is_function'])):
    __slots__ = ()

    def has_alias(self):
        return self.alias is not None

    @property
    def is_query_alias(self):
        return self.name is None and self.alias is not None

    @property
    def is_table_alias(self):
        return self.name is not None and self.alias is not None and not self.is_function

    @property
    def full_name(self):
        if self.schema is None:
            return self.name
        else:
            return self.schema + '.' + self.name

def _is_subselect(parsed):
    if not parsed.is_group:
        return False
    for item in parsed.tokens:
        if item.ttype is DML and item.value.upper() in ('SELECT', 'INSERT',
                                                        'UPDATE', 'CREATE', 'DELETE'):
            return True
    return False


def _identifier_is_function(identifier):
    return any(isinstance(t, Function) for t in identifier.tokens)


def _extract_from_part(parsed):
    tbl_prefix_seen = False
    for item in parsed.tokens:
        if item.is_group:
            for x in _extract_from_part(item):
                yield x
        if tbl_prefix_seen:
            if _is_subselect(item):
                for x in _extract_from_part(item):
                    yield x
            # An incomplete nested select won't be recognized correctly as a
            # sub-select. eg: 'SELECT * FROM (SELECT id FROM user'. This causes
            # the second FROM to trigger this elif condition resulting in a
            # StopIteration. So we need to ignore the keyword if the keyword
            # FROM.
            # Also 'SELECT * FROM abc JOIN def' will trigger this elif
            # condition. So we need to ignore the keyword JOIN and its variants
            # INNER JOIN, FULL OUTER JOIN, etc.
            elif item.ttype is Keyword and (
                    not item.value.upper() == 'FROM') and (
                    not item.value.upper().endswith('JOIN')):
                tbl_prefix_seen = False
            else:
                yield item
        elif item.ttype is Keyword or item.ttype is Keyword.DML:
            item_val = item.value.upper()
            if (item_val in ('COPY', 'FROM', 'INTO', 'UPDATE', 'TABLE') or
                    item_val.endswith('JOIN')):
                tbl_prefix_seen = True
        # 'SELECT a, FROM abc' will detect FROM as part of the column list.
        # So this check here is necessary.
        elif isinstance(item, IdentifierList):
            for identifier in item.get_identifiers():
                if (identifier.ttype is Keyword and
                        identifier.value.upper() == 'FROM'):
                    tbl_prefix_seen = True
                    break


def _extract_table_identifiers(token_stream):
    for item in token_stream:
        if isinstance(item, IdentifierList):
            for ident in item.get_identifiers():
                try:
                    alias = ident.get_alias()
                    schema_name = ident.get_parent_name()
                    real_name = ident.get_real_name()
                except AttributeError:
                    continue
                if real_name:
                    yield Reference(schema_name, real_name,
                                    alias, _identifier_is_function(ident))
        elif isinstance(item, Identifier):
            yield Reference(item.get_parent_name(), item.get_real_name(),
                            item.get_alias(), _identifier_is_function(item))
        elif isinstance(item, Function):
            yield Reference(item.get_parent_name(), item.get_real_name(),
                            item.get_alias(), _identifier_is_function(item))


def extract_tables(sql):
    # let's handle multiple statements in one sql string
    extracted_tables = []
    statements = list(sqlparse.parse(sql))
    for statement in statements:
        stream = _extract_from_part(statement)
        extracted_tables.append([ref.name for ref in _extract_table_identifiers(stream)])
    return list(itertools.chain(*extracted_tables))

Example (assuming account.csv and tojoin.csv exist in /path/to/files):

db_path = r"/path/to/files"
driver = CSVDriver(db_path)
query = """
SELECT tojoin.col_to_join
FROM account
LEFT JOIN tojoin
ON account.a = tojoin.a
"""
driver.query(query)

User · Answer

You're right that .import is the way to go, but that's a command from the SQLite3.exe shell. A lot of the top answers to this question involve native Python loops, but if your files are large (mine are 10^6 to 10^7 records), you want to avoid reading everything into pandas or using a native Python list comprehension or loop (though I did not time them for comparison).

For large files, I believe the best option is to create the empty table in advance by calling execute("CREATE TABLE ...") on an sqlite3 connection, strip the headers from your CSV files, and then use subprocess.run() to execute sqlite's import statement. Since the last part is, I believe, the most pertinent, I will start with that.

subprocess.run()

import subprocess
from pathlib import Path

db_name = Path('my.db').resolve()
csv_file = Path('file.csv').resolve()
result = subprocess.run(['sqlite3',
                         str(db_name),
                         '-cmd',
                         '.mode csv',
                         '.import ' + str(csv_file).replace('\\', '\\\\')
                                 + ' <table_name>'],
                        capture_output=True)

Explanation

From the command line, the command you're looking for is sqlite3 my.db -cmd ".mode csv" ".import file.csv table". subprocess.run() runs a command line process. The argument to subprocess.run() is a sequence of strings which are interpreted as a command followed by all of its arguments.

sqlite3 my.db opens the database.

The -cmd flag after the database allows you to pass multiple follow-on commands to the sqlite program. In the shell, each command has to be in quotes, but here, they just need to be their own element of the sequence.

'.mode csv' does what you'd expect.

'.import ' + str(csv_file).replace('\\', '\\\\') + ' <table_name>' is the import command. Unfortunately, since subprocess passes all follow-ons to -cmd as quoted strings, you need to double up your backslashes if you have a Windows directory path.

Stripping Headers

Not really the main point of the question, but here's what I used. Again, I didn't want to read the whole files into memory at any point:

import shutil

with open(csv_file, "r") as source:
    source.readline()  # consume the header line
    with open(str(csv_file) + "_nohead", "w") as target:
        shutil.copyfileobj(source, target)  # copy the rest in buffered blocks
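As a possible alternative to writing a header-stripped copy: newer versions of the sqlite3 shell (3.32 and later, as far as I know) accept --csv and --skip N options on .import, so the header row can be skipped during the import itself. The sketch below only builds the argument list for subprocess.run(); the function name build_import_args is made up, and the approach assumes your installed shell supports those options and that the path contains no spaces.

```python
from pathlib import Path


def build_import_args(db_path, csv_path, table, skip_rows=1):
    """Build an argument list for subprocess.run() that imports a CSV file.

    Assumes an sqlite3 shell that supports `.import --csv --skip N`.
    `build_import_args` is a hypothetical helper name.
    """
    # as_posix() gives forward slashes on Windows, which avoids having to
    # double up backslashes inside the dot-command.
    csv_arg = Path(csv_path).as_posix()
    return [
        "sqlite3",
        str(db_path),
        f".import --csv --skip {skip_rows} {csv_arg} {table}",
    ]
```

It would be used as subprocess.run(build_import_args("my.db", "file.csv", "persons"), check=True), with the table created in advance as described above.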

User · Answer

You can do this efficiently using blaze & odo:

import blaze as bz

csv_path = 'data.csv'
bz.odo(csv_path, 'sqlite:///data.db::data')

Odo will store the CSV file in data.db (an sqlite database), in a table named data.

Or you can use odo directly, without blaze. Either way is fine. See the odo documentation for details.
