Import CSV file as a pandas DataFrame

Question

What's the Python way to read a CSV file into a pandas DataFrame (which I can then use for statistical operations, can have differently-typed columns, etc.)?

My CSV file "value.txt" has the following content:

    Date,"price","factor_1","factor_2"
    2012-06-11,1600.20,1.255,1.548
    2012-06-12,1610.02,1.258,1.554
    2012-06-13,1618.07,1.249,1.552
    2012-06-14,1624.40,1.253,1.556
    2012-06-15,1626.15,1.258,1.552
    2012-06-16,1626.15,1.263,1.558
    2012-06-17,1626.15,1.264,1.572

In R we would read this file in using:

    price <- read.csv("value.txt")

and that would return an R data.frame:

    > price <- read.csv("value.txt")
    > price
            Date   price factor_1 factor_2
    1 2012-06-11 1600.20    1.255    1.548
    2 2012-06-12 1610.02    1.258    1.554
    3 2012-06-13 1618.07    1.249    1.552
    4 2012-06-14 1624.40    1.253    1.556
    5 2012-06-15 1626.15    1.258    1.552
    6 2012-06-16 1626.15    1.263    1.558
    7 2012-06-17 1626.15    1.264    1.572

Is there a Pythonic way to get the same functionality?

User · Answer

    %cd C:\Users\asus\Desktop\python
    import pandas as pd
    df = pd.read_csv('value.txt')
    df.head()

             Date    price  factor_1  factor_2
    0  2012-06-11  1600.20     1.255     1.548
    1  2012-06-12  1610.02     1.258     1.554
    2  2012-06-13  1618.07     1.249     1.552
    3  2012-06-14  1624.40     1.253     1.556
    4  2012-06-15  1626.15     1.258     1.552

User · Answer

Not quite as clean, but:

    import csv

    with open("value.txt", "r", newline="") as f:
        csv_reader = csv.reader(f)
        num = " "  # blank label for the header row
        for row in csv_reader:
            print(num, "\t".join(row))
            if num == " ":
                num = 0  # start numbering once the header has been printed
            num = num + 1

Not as compact, but it does the job:

      Date        price    factor_1  factor_2
    1 2012-06-11  1600.20  1.255     1.548
    2 2012-06-12  1610.02  1.258     1.554
    3 2012-06-13  1618.07  1.249     1.552
    4 2012-06-14  1624.40  1.253     1.556
    5 2012-06-15  1626.15  1.258     1.552
    6 2012-06-16  1626.15  1.263     1.558
    7 2012-06-17  1626.15  1.264     1.572

User · Answer

pandas to the rescue:

    import pandas as pd
    print(pd.read_csv('value.txt'))

             Date    price  factor_1  factor_2
    0  2012-06-11  1600.20     1.255     1.548
    1  2012-06-12  1610.02     1.258     1.554
    2  2012-06-13  1618.07     1.249     1.552
    3  2012-06-14  1624.40     1.253     1.556
    4  2012-06-15  1626.15     1.258     1.552
    5  2012-06-16  1626.15     1.263     1.558
    6  2012-06-17  1626.15     1.264     1.572

This returns a pandas DataFrame, which is similar to R's data.frame.
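Since the question also asks about statistical operations and differently-typed columns, here is a minimal sketch of both. It assumes the file content shown in the question, inlined through `io.StringIO` so that it runs without a file on disk; the name `CSV_TEXT` and the use of `parse_dates` are illustrative choices, not part of the original answer.

```python
import io

import pandas as pd

# Sample content from the question, inlined so the sketch is self-contained.
CSV_TEXT = '''Date,"price","factor_1","factor_2"
2012-06-11,1600.20,1.255,1.548
2012-06-12,1610.02,1.258,1.554
2012-06-13,1618.07,1.249,1.552
2012-06-14,1624.40,1.253,1.556
2012-06-15,1626.15,1.258,1.552
2012-06-16,1626.15,1.263,1.558
2012-06-17,1626.15,1.264,1.572
'''

# parse_dates turns the Date column into a datetime column;
# the numeric columns are inferred as floats.
df = pd.read_csv(io.StringIO(CSV_TEXT), parse_dates=["Date"])

print(df.dtypes)           # one dtype per column
print(df["price"].mean())  # a column-wise statistic
print(df.describe())       # summary statistics
```

With a real file, the `io.StringIO(CSV_TEXT)` argument would simply be replaced by the file path, for example `'value.txt'`.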

User · Answer

Try this:

    import pandas as pd
    data = pd.read_csv('C:/Users/Downloads/winequality-red.csv')

Replace the file location with wherever your data set is found. For reference, see this URL: https://medium.com/@kanchanardj/jargon-in-python-used-in-data-science-to-laymans-language-part-one-12ddfd31592f

User · Answer

Here's an alternative to the pandas library using Python's built-in csv module:

    import csv
    from pprint import pprint

    # In Python 3, open the file in text mode with newline=''
    with open('foo.csv', newline='') as f:
        reader = csv.reader(f)
        headers = next(reader)  # first row holds the column names
        column = {h: [] for h in headers}
        for row in reader:
            for h, v in zip(headers, row):
                column[h].append(v)
        pprint(column)  # pretty printer

This will print:

    {'Date': ['2012-06-11',
              '2012-06-12',
              '2012-06-13',
              '2012-06-14',
              '2012-06-15',
              '2012-06-16',
              '2012-06-17'],
     'factor_1': ['1.255', '1.258', '1.249', '1.253', '1.258', '1.263', '1.264'],
     'factor_2': ['1.548', '1.554', '1.552', '1.556', '1.552', '1.558', '1.572'],
     'price': ['1600.20',
               '1610.02',
               '1618.07',
               '1624.40',
               '1626.15',
               '1626.15',
               '1626.15']}
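One thing to keep in mind with the csv-module approach is that every value comes back as a string, so numeric columns need converting before any arithmetic. A minimal sketch of that step, assuming the column names from the question; the two inlined sample rows and the rule "everything except Date is a float" are assumptions made for this illustration.

```python
import csv
import io

# Two rows of the question's data, inlined so the sketch runs on its own.
CSV_TEXT = '''Date,"price","factor_1","factor_2"
2012-06-11,1600.20,1.255,1.548
2012-06-12,1610.02,1.258,1.554
'''

columns = {}
# csv.DictReader maps each row to {header: value}; all values are strings.
for row in csv.DictReader(io.StringIO(CSV_TEXT)):
    for name, value in row.items():
        # Keep Date as text; convert every other column to float
        # (an assumption that holds for this particular data).
        converted = value if name == "Date" else float(value)
        columns.setdefault(name, []).append(converted)

mean_price = sum(columns["price"]) / len(columns["price"])
print(mean_price)
```

pandas performs this type inference automatically, which is the main reason `read_csv` is the closer match to R's `read.csv`.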

User · Answer

You can use the csv module found in the Python standard library to manipulate CSV files.

Example:

    import csv

    # In Python 3, open the file in text mode with newline=''
    with open('some.csv', newline='') as f:
        reader = csv.reader(f)
        for row in reader:
            print(row)

User · Answer

    import pandas as pd
    df = pd.read_csv('/PathToFile.txt', sep=',')

This will import your .txt or .csv file into a DataFrame.

User · Answer

    import pandas as pd
    dataset = pd.read_csv('/home/nspython/Downloads/movie_metadata1.csv')

User · Answer

To read a CSV file as a pandas DataFrame, you'll need to use pd.read_csv.

But this isn't where the story ends; data exists in many different formats and is stored in different ways, so you will often need to pass additional parameters to read_csv to ensure your data is read in properly.

Here's a table listing common scenarios encountered with CSV files along with the appropriate argument you will need to use. You will usually need all or some combination of the arguments below to read in your data.

    | Scenario                                            | Argument              | Example                                              |
    |-----------------------------------------------------|-----------------------|------------------------------------------------------|
    | Read CSV with different separator [1]               | sep/delimiter         | read_csv(..., sep=';')                               |
    | Read CSV with tab/whitespace separator              | delim_whitespace      | read_csv(..., delim_whitespace=True)                 |
    | Fix UnicodeDecodeError while reading [2]            | encoding              | read_csv(..., encoding='latin-1')                    |
    | Read CSV without headers [3]                        | header and names      | read_csv(..., header=None, names=['x', 'y', 'z'])    |
    | Specify which column to set as the index [4]        | index_col             | read_csv(..., index_col=[0])                         |
    | Read subset of columns                              | usecols               | read_csv(..., usecols=['x', 'y'])                    |
    | Numeric data is in European format (e.g. 1.234,56)  | thousands and decimal | read_csv(..., thousands='.', decimal=',')            |

Note that delim_whitespace is deprecated as of pandas 2.2; sep=r'\s+' is the equivalent.

Footnotes

[1] By default, read_csv uses a C parser engine for performance. The C parser can only handle single-character separators. If your CSV has a multi-character separator, you will need to modify your code to use the 'python' engine. You can also pass regular expressions:

    df = pd.read_csv(..., sep=r'\s*\|\s*', engine='python')

[2] UnicodeDecodeError occurs when the data was stored in one encoding format but read in a different, incompatible one. The most common encoding schemes are 'utf-8' and 'latin-1'; your data is likely to fit into one of these.

[3] header=None specifies that the first row in the CSV is a data row rather than a header row (passing header=False is rejected by pandas), and names=[...] allows you to specify a list of column names to assign to the DataFrame when it is created.

[4] An "Unnamed: 0" column occurs when a DataFrame with an unnamed index is saved to CSV and then re-read afterwards. Instead of having to fix the issue while reading, you can also fix it when writing by using df.to_csv(..., index=False).

There are other arguments I've not mentioned here, but these are the ones you'll encounter most frequently.
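To make several of the arguments above concrete, here is a minimal sketch that combines them on small inlined inputs. The data in `RAW` and `PIPE`, and the column names `key`, `amount` and `tag`, are hypothetical and chosen only for illustration; `header=None` is used for the headerless case.

```python
import io

import pandas as pd

# Hypothetical input: semicolon-separated, no header row,
# numbers in European format (1.234,56).
RAW = "a;1.234,56;x\nb;7.890,12;y\n"

df = pd.read_csv(
    io.StringIO(RAW),
    sep=";",                         # non-default separator
    header=None,                     # first row is data, not a header
    names=["key", "amount", "tag"],  # column names to assign
    thousands=".",                   # European thousands separator
    decimal=",",                     # European decimal mark
    usecols=["key", "amount"],       # read only a subset of columns
    index_col="key",                 # use the "key" column as the index
)
print(df)

# Hypothetical input with a multi-character separator ("|" padded by spaces).
# A regular-expression separator requires the 'python' engine.
PIPE = "x | y\n1 | 2\n"
df2 = pd.read_csv(io.StringIO(PIPE), sep=r"\s*\|\s*", engine="python")
print(df2)
```

In practice only the arguments that match a given file's quirks are needed; the defaults already handle a plain comma-separated file with a header row.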
