Read Excel File in Python

Question

I have an Excel file:

    Arm_id   DSPName   DSPCode   HubCode   PinCode   PPTL
    1        JaVAS     01        AGR       282001    1,2
    2        JaVAS     01        AGR       282002    3,4
    3        JaVAS     01        AGR       282003    5,6

I want to save a string in the form Arm_id,DSPCode,Pincode. This format is configurable, i.e. it might change to DSPCode,Arm_id,Pincode. I save it in a list like:

    FORMAT = ['Arm_id', 'DSPName', 'Pincode']

How do I read the content of a specific column by name, given that FORMAT is configurable?

This is what I tried. Currently I'm able to read all the content in the file:

    from xlrd import open_workbook

    wb = open_workbook('sample.xls')
    for s in wb.sheets():
        # print('Sheet:', s.name)
        values = []
        for row in range(s.nrows):
            col_value = []
            for col in range(s.ncols):
                value = s.cell(row, col).value
                try:
                    value = str(int(value))
                except:
                    pass
                col_value.append(value)
            values.append(col_value)
    print(values)

My output is:

    [[u'Arm_id', u'DSPName', u'DSPCode', u'HubCode', u'PinCode', u'PPTL'],
     ['1', u'JaVAS', '1', u'AGR', '282001', u'1,2'],
     ['2', u'JaVAS', '1', u'AGR', '282002', u'3,4'],
     ['3', u'JaVAS', '1', u'AGR', '282003', u'5,6']]

Then I loop over values[0], look for each FORMAT entry in it, and record the index of Arm_id, DSPName and Pincode. From the next row onwards I know the index of every FORMAT entry, so I know which values to pick.

But this is such a poor solution. How do I get the values of a specific column by name in an Excel file?

User · Answer

Although I almost always just use pandas for this, my current little tool is being packaged into an executable and including pandas is overkill. So I created a version of poida's solution that results in a list of named tuples. His code with this change would look like this:

    from xlrd import open_workbook
    from collections import namedtuple
    from pprint import pprint

    wb = open_workbook('sample.xls')

    FORMAT = ['Arm_id', 'DSPName', 'PinCode']
    OneRow = namedtuple('OneRow', ' '.join(FORMAT))
    all_rows = []

    for s in wb.sheets():
        headerRow = s.row(0)
        # Column index of each FORMAT name, in FORMAT order
        columnIndex = [x for y in FORMAT for x in range(len(headerRow)) if y == headerRow[x].value]

        for row in range(1, s.nrows):
            currentRow = s.row(row)
            currentRowValues = [currentRow[x].value for x in columnIndex]
            all_rows.append(OneRow(*currentRowValues))

    pprint(all_rows)
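To illustrate how the named tuples might be consumed, here is a minimal sketch that needs no Excel file. The `SHEET` list is made-up stand-in data for what xlrd would return, and `select_columns` is a hypothetical helper name, not part of xlrd.

```python
from collections import namedtuple

# Stand-in data (assumption): in the answer above these rows come from xlrd.
SHEET = [
    ["Arm_id", "DSPName", "DSPCode", "HubCode", "PinCode", "PPTL"],
    [1.0, "JaVAS", 1.0, "AGR", 282001.0, "1,2"],
    [2.0, "JaVAS", 1.0, "AGR", 282002.0, "3,4"],
]


def select_columns(sheet, fmt):
    """Return one named tuple per data row, with fields in fmt order."""
    header = sheet[0]
    # list.index raises ValueError if a requested name is not in the header
    indexes = [header.index(name) for name in fmt]
    row_type = namedtuple("OneRow", fmt)
    return [row_type(*(row[i] for i in indexes)) for row in sheet[1:]]


rows = select_columns(SHEET, ["DSPName", "Arm_id", "PinCode"])
first = rows[0]
```

Each selected value is then reachable by name, for example `first.DSPName` or `first.PinCode`, and reordering the list passed as `fmt` reorders the fields.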

User · Answer

With pandas, reading an Excel file is straightforward:

    import pandas as pd

    DataF = pd.read_excel("Test.xlsx", sheet_name='Sheet1')

    print("Column headings:")
    print(DataF.columns)

Test at: https://repl.it
Reference: https://pythonspot.com/read-excel-with-pandas

User · Answer

The key parts are to grab the header (col_names = s.row(0)) and, when iterating through the rows, to skip the first row, which isn't needed: for row in range(1, s.nrows) starts the range at 1 instead of the implicit 0. You then use zip to step through each row, with name holding the header cell of the current column.

    from xlrd import open_workbook

    wb = open_workbook('Book2.xls')
    values = []
    for s in wb.sheets():
        # print('Sheet:', s.name)
        col_names = s.row(0)  # header cells, read once per sheet
        for row in range(1, s.nrows):
            col_value = []
            for name, col in zip(col_names, range(s.ncols)):
                value = s.cell(row, col).value
                try:
                    value = str(int(value))
                except ValueError:
                    pass
                # Store (column name, cell value) pairs
                col_value.append((name.value, value))
            values.append(col_value)
    print(values)

User · Answer

This is one approach:

    from xlrd import open_workbook

    class Arm(object):
        def __init__(self, id, dsp_name, dsp_code, hub_code, pin_code, pptl):
            self.id = id
            self.dsp_name = dsp_name
            self.dsp_code = dsp_code
            self.hub_code = hub_code
            self.pin_code = pin_code
            self.pptl = pptl

        def __str__(self):
            return ("Arm object:\n"
                    "  Arm_id = {0}\n"
                    "  DSPName = {1}\n"
                    "  DSPCode = {2}\n"
                    "  HubCode = {3}\n"
                    "  PinCode = {4}\n"
                    "  PPTL = {5}"
                    .format(self.id, self.dsp_name, self.dsp_code,
                            self.hub_code, self.pin_code, self.pptl))

    wb = open_workbook('sample.xls')
    for sheet in wb.sheets():
        number_of_rows = sheet.nrows
        number_of_columns = sheet.ncols

        items = []

        # Start at 1 to skip the header row
        for row in range(1, number_of_rows):
            values = []
            for col in range(number_of_columns):
                value = sheet.cell(row, col).value
                try:
                    value = str(int(value))
                except ValueError:
                    pass
                values.append(value)
            item = Arm(*values)
            items.append(item)

    for item in items:
        print(item)
        print("Accessing one single value (eg. DSPName): {0}".format(item.dsp_name))
        print()

You don't have to use a custom class; you can simply use a dict. If you use a class, however, you can access all values via dot notation, as you see above.

Here is the output of the script above:

    Arm object:
      Arm_id = 1
      DSPName = JaVAS
      DSPCode = 1
      HubCode = AGR
      PinCode = 282001
      PPTL = 1
    Accessing one single value (eg. DSPName): JaVAS

    Arm object:
      Arm_id = 2
      DSPName = JaVAS
      DSPCode = 1
      HubCode = AGR
      PinCode = 282002
      PPTL = 3
    Accessing one single value (eg. DSPName): JaVAS

    Arm object:
      Arm_id = 3
      DSPName = JaVAS
      DSPCode = 1
      HubCode = AGR
      PinCode = 282003
      PPTL = 5
    Accessing one single value (eg. DSPName): JaVAS
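The dict alternative mentioned in this answer could look roughly like the sketch below. It assumes the cell values have already been read into plain lists (the `SHEET` data is made up for illustration), and `rows_as_dicts` is a hypothetical helper name.

```python
# Stand-in data (assumption): cell values already converted as in the answer.
SHEET = [
    ["Arm_id", "DSPName", "DSPCode", "HubCode", "PinCode", "PPTL"],
    ["1", "JaVAS", "1", "AGR", "282001", "1,2"],
    ["2", "JaVAS", "1", "AGR", "282002", "3,4"],
]


def rows_as_dicts(sheet):
    """Pair every data row with the header names from the first row."""
    header = sheet[0]
    return [dict(zip(header, row)) for row in sheet[1:]]


records = rows_as_dicts(SHEET)
# Look a value up by its column name instead of by attribute
dsp_names = [record["DSPName"] for record in records]
```

Compared with the class, this trades dot notation for key lookups, but it needs no changes when columns are added to or removed from the sheet.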

User · Answer

A somewhat late answer, but with pandas it is possible to get a column of an Excel file directly:

    import pandas

    df = pandas.read_excel('sample.xls')
    # print the column names
    print(df.columns)
    # get the values for a given column
    values = df['Arm_id'].values
    # get a data frame with selected columns
    # (names must match the sheet's headers exactly, hence 'PinCode')
    FORMAT = ['Arm_id', 'DSPName', 'PinCode']
    df_selected = df[FORMAT]

Make sure you have installed xlrd and pandas:

    pip install pandas xlrd

User · Answer

Here is code that reads an Excel file and prints the cells in the first column: the header cell first, then every remaining row.

    import xlrd

    file_location = "C:/pythonprog/xxx.xlsv"
    workbook = xlrd.open_workbook(file_location)
    sheet = workbook.sheet_by_index(0)

    # Header cell of the first column
    print(sheet.cell_value(0, 0))

    # Remaining cells of the first column
    for row in range(1, sheet.nrows):
        print(sheet.cell_value(row, 0))

User · Answer

The approach I took reads the header information from the first row to determine the indexes of the columns of interest.

You mentioned in the question that you also want the values output to a string. I dynamically build a format string for the output from the FORMAT column list. Rows are appended to the values string, separated by a newline character.

The output column order is determined by the order of the column names in the FORMAT list.

In my code below, the case of the column names in the FORMAT list is important. In the question above you've got 'Pincode' in your FORMAT list, but 'PinCode' in your Excel file. This wouldn't work below; it would need to be 'PinCode'.

    from xlrd import open_workbook

    wb = open_workbook('sample.xls')

    FORMAT = ['Arm_id', 'DSPName', 'PinCode']
    values = ""

    for s in wb.sheets():
        headerRow = s.row(0)
        # Column index of each FORMAT name, in FORMAT order
        columnIndex = [x for y in FORMAT for x in range(len(headerRow)) if y == headerRow[x].value]
        # One "%s" per selected column, comma-separated, newline-terminated
        formatString = ("%s," * len(columnIndex))[0:-1] + "\n"

        for row in range(1, s.nrows):
            currentRow = s.row(row)
            currentRowValues = [currentRow[x].value for x in columnIndex]
            values += formatString % tuple(currentRowValues)

    print(values)

For the sample input you gave above, this code outputs:

    1.0,JaVAS,282001.0
    2.0,JaVAS,282002.0
    3.0,JaVAS,282003.0

And because I'm a Python noob, props be to: this answer, this answer, this question, this question and this answer.
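If the case mismatch between 'Pincode' and 'PinCode' is a concern, one option is to compare the names case-insensitively. The sketch below is only an illustration under assumptions: `SHEET` is made-up stand-in data for the cell values xlrd would return, and `column_indexes` and `format_rows` are hypothetical helper names.

```python
# Stand-in data (assumption): numeric cells arrive as floats, as in the output above.
SHEET = [
    ["Arm_id", "DSPName", "DSPCode", "HubCode", "PinCode", "PPTL"],
    [1.0, "JaVAS", 1.0, "AGR", 282001.0, "1,2"],
    [2.0, "JaVAS", 1.0, "AGR", 282002.0, "3,4"],
]


def column_indexes(header, fmt):
    """Map each FORMAT name to its column index, ignoring case."""
    lookup = {name.lower(): i for i, name in enumerate(header)}
    # Raises KeyError if a requested name is not in the header at all
    return [lookup[name.lower()] for name in fmt]


def format_rows(sheet, fmt):
    """Build one comma-separated line per data row, columns in fmt order."""
    indexes = column_indexes(sheet[0], fmt)
    lines = [",".join(str(row[i]) for i in indexes) for row in sheet[1:]]
    return "\n".join(lines)


# 'Pincode' now matches the 'PinCode' header despite the different case
output = format_rows(SHEET, ["Arm_id", "DSPName", "Pincode"])
```

Using `",".join(...)` here replaces the hand-built format string; the column order still follows the order of the names passed in.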
