[java] Using Apache POI how to read a specific excel column

I'm having a problem in excel while using Apache POI. I can read across rows, but sometimes I'm in a situation where I would like to read a particular column only.

So is it possible to read any particular column like only the 'A' column only or the column 'C' only.

I'm using the Java language for this.

This question is related to java excel apache-poi

The answer is


You could just loop the rows and read the same cell from each row (doesn't this comprise a column?).


Here is the code to read the excel data by column.

public ArrayList<String> extractExcelContentByColumnIndex(int columnIndex){
        ArrayList<String> columndata = null;
        try {
            File f = new File("sample.xlsx")
            FileInputStream ios = new FileInputStream(f);
            XSSFWorkbook workbook = new XSSFWorkbook(ios);
            XSSFSheet sheet = workbook.getSheetAt(0);
            Iterator<Row> rowIterator = sheet.iterator();
            columndata = new ArrayList<>();

            while (rowIterator.hasNext()) {
                Row row = rowIterator.next();
                Iterator<Cell> cellIterator = row.cellIterator();
                while (cellIterator.hasNext()) {
                    Cell cell = cellIterator.next();

                    if(row.getRowNum() > 0){ //To filter column headings
                        if(cell.getColumnIndex() == columnIndex){// To match column index
                            switch (cell.getCellType()) {
                            case Cell.CELL_TYPE_NUMERIC:
                                columndata.add(cell.getNumericCellValue()+"");
                                break;
                            case Cell.CELL_TYPE_STRING:
                                columndata.add(cell.getStringCellValue());
                                break;
                            }
                        }
                    }
                }
            }
            ios.close();
            System.out.println(columndata);
        } catch (Exception e) {
            e.printStackTrace();
        }
        return columndata;
    }

Okay, from your question, you just simply want to read a particular column. So, while iterating over a row and then on its cells, your can simply check the index of the column.

Iterator<Row> rowIterator = mySheet.iterator(); // Traversing over each row of XLSX file
        while (rowIterator.hasNext()) {
            Row row = rowIterator.next(); // For each row, iterate through each columns
            Iterator<Cell> cellIterator = row.cellIterator();
            while (cellIterator.hasNext()) {
                Cell cell = cellIterator.next();
                println "column index"+cell.getColumnIndex()//You will have your columns fixed in Excel file
                if(cell.getColumnIndex()==3)//for example of c
                {
                print "done"            
                }
          }
     }     

I am using POI 3.12-- 'org.apache.poi:poi:3.12' Hope it helps. Cheers!


import java.io.*;

import org.apache.poi.hssf.util.CellReference;
import org.apache.poi.ss.usermodel.*;
import java.text.*;

public class XSLXReader {
    static DecimalFormat df = new DecimalFormat("#####0");

    public static void main(String[] args) {
        FileWriter fostream;
        PrintWriter out = null;
        String strOutputPath = "H:\\BLR_Team\\Kavitha\\Excel-to-xml\\";
        String strFilePrefix = "Master_5.2-B";

        try {
            InputStream inputStream = new FileInputStream(new File("H:\\BLR_Team\\Kavitha\\Excel-to-xml\\Stack-up 20L pure storage 11-0039-01 ISU_USA-A 1-30-17-Rev_exm.xls"));
            Workbook wb = WorkbookFactory.create(inputStream);
           // Sheet sheet = wb.getSheet(0);
            Sheet sheet =null;
            Integer noOfSheets= wb.getNumberOfSheets();

            for(int i=0;i<noOfSheets;i++){
                sheet = wb.getSheetAt(i);
                System.out.println("Sheet : "+i + " " + sheet.getSheetName());
                System.out.println("Sheet : "+i + " " + sheet.getFirstRowNum());
                System.out.println("Sheet : "+i + " " + sheet.getLastRowNum());

            //Column 29
            fostream = new FileWriter(strOutputPath + "\\" + strFilePrefix+i+ ".xml");
            out = new PrintWriter(new BufferedWriter(fostream));

            out.println("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");
            out.println("<Bin-code>");

            boolean firstRow = true;
            for (Row row : sheet) {
                if (firstRow == true) {
                    firstRow = false;
                    continue;
                }
                out.println("\t<DCT>");
                out.println(formatElement("\t\t", "ID", formatCell(row.getCell(0))));
                out.println(formatElement("\t\t", "Table_name", formatCell(row.getCell(1))));
                out.println(formatElement("\t\t", "isProddaten", formatCell(row.getCell(2))));
                out.println(formatElement("\t\t", "isR3P01Data", formatCell(row.getCell(3))));

                out.println(formatElement("\t\t", "LayerNo", formatCell(row.getCell(29))));
                out.println("\t</DCT>");
            }
            CellReference ref = new CellReference("A13");
          Row r = sheet.getRow(ref.getRow());
          if (r != null) {
             Cell c = r.getCell(ref.getCol());
           System.out.println(c.getRichStringCellValue().getString());
          }

            for (Row row : sheet) {
                  for (Cell cell : row) {

                      CellReference cellRef = new CellReference(row.getRowNum(), cell.getColumnIndex());


                      switch (cell.getCellType()) {
                      case Cell.CELL_TYPE_STRING:
                          System.out.println(cell.getRichStringCellValue().getString());
                          break;
                      case Cell.CELL_TYPE_NUMERIC:
                          if (DateUtil.isCellDateFormatted(cell)) {
                              System.out.println(cell.getDateCellValue());
                          } else {
                              System.out.println(cell.getNumericCellValue());
                          }
                          break;
                      case Cell.CELL_TYPE_BOOLEAN:
                          System.out.println(cell.getBooleanCellValue());
                          break;
                      case Cell.CELL_TYPE_FORMULA:
                          System.out.println(cell.getCellFormula());
                          break;
                      case Cell.CELL_TYPE_BLANK:
                          System.out.println();
                          break;
                      default:
                          System.out.println();
                  }
                  }

            }
            out.write("</Bin-code>");
            out.flush();
            out.close();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    private static String formatCell(Cell cell)
    {
        if (cell == null) {
            return "";
        }
        switch(cell.getCellType()) {
            case Cell.CELL_TYPE_BLANK:
                return "";
            case Cell.CELL_TYPE_BOOLEAN:
                return Boolean.toString(cell.getBooleanCellValue());
            case Cell.CELL_TYPE_ERROR:
                return "*error*";
            case Cell.CELL_TYPE_NUMERIC:
                return XSLXReader.df.format(cell.getNumericCellValue());
            case Cell.CELL_TYPE_STRING:
                return cell.getStringCellValue();
            default:
                return "<unknown value>";
        }
    }

    private static String formatElement(String prefix, String tag, String value) {
        StringBuilder sb = new StringBuilder(prefix);
        sb.append("<");
        sb.append(tag);
        if (value != null && value.length() > 0) {
            sb.append(">");
            sb.append(value);
            sb.append("</");
            sb.append(tag);
            sb.append(">");
        } else {
            sb.append("/>");
        }
        return sb.toString();
    }
}

This code does 3 things:

  1. Excel to XML file generation. Eng. Name Dong Kim
  2. Prints the content of a particular cell : A13
  3. Also print the excel content into normal text format. Jars to be imported: poi-3.9.jar,poi-ooxml-3.9.jar,poi-ooxml-schemas-3.9.jar,xbea??n-2.3.0.jar,xmlbeans??-xmlpublic-2.4.0.jar??,dom4j-1.5.jar

Please be aware, that iterating through the columns using row cell iterator ( Iterator<Cell> cellIterator = row.cellIterator();) may lead to silent skipping columns. I have just encountered a document that was exposing such behaviour.

Iterating using indexes in a for loop and using row.getCell(i) was not skipping columns and was returning values at the correct column indexes.


  Sheet sheet  = workBook.getSheetAt(0); // Get Your Sheet.

  for (Row row : sheet) { // For each Row.
      Cell cell = row.getCell(0); // Get the Cell at the Index / Column you want.
  }

My solution, a bit simpler code wise.


Examples related to java

Under what circumstances can I call findViewById with an Options Menu / Action Bar item? How much should a function trust another function How to implement a simple scenario the OO way Two constructors How do I get some variable from another class in Java? this in equals method How to split a string in two and store it in a field How to do perspective fixing? String index out of range: 4 My eclipse won't open, i download the bundle pack it keeps saying error log

Examples related to excel

Python: Pandas pd.read_excel giving ImportError: Install xlrd >= 0.9.0 for Excel support Converting unix time into date-time via excel How to increment a letter N times per iteration and store in an array? 'Microsoft.ACE.OLEDB.16.0' provider is not registered on the local machine. (System.Data) How to import an Excel file into SQL Server? Copy filtered data to another sheet using VBA Better way to find last used row Could pandas use column as index? Check if a value is in an array or not with Excel VBA How to sort dates from Oldest to Newest in Excel?

Examples related to apache-poi

Alternative to deprecated getCellType Apache POI error loading XSSFWorkbook class Cannot get a text value from a numeric cell “Poi” org.apache.poi.POIXMLException: org.apache.poi.openxml4j.exceptions.InvalidFormatException: Merging cells in Excel using Apache POI get number of columns of a particular row in given excel using Java How to get row count in an Excel file using POI library? How to check if an excel cell is empty using Apache POI? What is the better API to Reading Excel sheets in java - JXL or Apache POI Using Apache POI how to read a specific excel column