How can I read and parse CSV files in C++?

Question

I need to load and use CSV file data in C++. At this point it can really just be a comma-delimited parser (i.e. don't worry about escaping new lines and commas). The main need is a line-by-line parser that will return a vector for the next line each time the method is called.

I found this article which looks quite promising: http://www.boost.org/doc/libs/1_35_0/libs/spirit/example/fundamental/list_parser.cpp

I've never used Boost's Spirit, but am willing to try it. But only if there isn't a more straightforward solution I'm overlooking.

User · Answer

When using the Boost Tokenizer escaped_list_separator for CSV files, one should be aware of the following:

1. It requires an escape character (default: backslash, \)
2. It requires a separator character (default: comma, ,)
3. It requires a quote character (default: double quote, ")

The CSV format specified by wiki states that data fields can contain separators in quotes (supported):

1997,Ford,E350,"Super, luxurious truck"

The CSV format specified by wiki states that a literal double quote inside a quoted field is written as two double quotes (escaped_list_separator will strip away all quote characters):

1997,Ford,E350,"Super ""luxurious"" truck"

The CSV format doesn't specify that any backslash characters should be stripped away (escaped_list_separator will strip away all escape characters).

A possible work-around to fix the default behavior of the Boost escaped_list_separator:

1. First replace all backslash characters (\) with two backslash characters (\\) so they are not stripped away.
2. Secondly replace all doubled double quotes ("") with a single backslash character and a quote (\").

This work-around has the side effect that empty data fields written as two double quotes ("") are transformed into a token consisting of a single double quote. When iterating through the tokens, one must check whether the token is a single double quote and treat it like an empty string.

Not pretty, but it works, as long as there are no newlines within the quotes.
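The two replacement steps of the work-around can be sketched as a small pre-processing function. This is only an illustration: `prepareForEscapedListSeparator` is a made-up helper name, it uses nothing but the standard library, and the returned line is assumed to be passed on to a tokenizer built on escaped_list_separator with its default escape, separator and quote characters.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Boost): rewrites one CSV line so that
// escaped_list_separator with default settings keeps backslashes and
// embedded double quotes.
std::string prepareForEscapedListSeparator(const std::string& line) {
    std::string out;
    out.reserve(line.size());
    for (std::size_t i = 0; i < line.size(); ++i) {
        const char c = line[i];
        if (c == '\\') {
            out += "\\\\";  // step 1: double every backslash
        } else if (c == '"' && i + 1 < line.size() && line[i + 1] == '"') {
            out += "\\\"";  // step 2: a doubled quote becomes backslash + quote
            ++i;            // skip the second quote of the pair
        } else {
            out += c;
        }
    }
    return out;
}
```

As described above, an empty field written as "" also becomes backslash + quote, so the token that comes out of the tokenizer is a single double quote and still has to be mapped back to an empty string by the caller.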

User · Answer

As parsing Excel-formatted CSVs is not a trivial task, I would like to add code of mine here:

http://fgw.ddnss.de/CSV_Endlicher_Automat.html

This is a deterministic state machine, which handles all cases (plain, quoting, escaping of quotes) explicitly and without any strange hacks.

I do think several "solutions" above are incorrect and some are excessively complex.

User · Answer

I've got a way quicker solution; it was originally intended for this question:

How to pull specific part of different strings?

But it was closed, obviously. I'm not about to throw this away though:

#include <iostream>
#include <string>
#include <regex>

std::string text = "\"4\",\"3\",\"Mon May 11 03:17:40 UTC 2009\",\"kindle2\",\"tpryan\",\"TEXT HERE\"";

int main()
{
    std::regex r(/* the pattern did not survive in this copy of the post */);
    std::smatch m;
    std::regex_search(text, m, r);
    std::cout << "FOUND: " << m[9] << std::endl;

    return 0;
}

Just pick out whichever match you want from the smatch collection by index.

Regex is bliss.
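For reference, here is a self-contained sketch of the same idea that compiles on its own. The pattern and the helper name `quotedFields` are assumptions made for illustration, not the expression from the post above, and it only handles lines where every field is wrapped in double quotes and contains no embedded quotes.

```cpp
#include <regex>
#include <string>
#include <vector>

// Assumed helper: returns the contents of every "..." field on one line.
std::vector<std::string> quotedFields(const std::string& line) {
    // One capture group: everything between a pair of double quotes.
    static const std::regex field("\"([^\"]*)\"");
    std::vector<std::string> out;
    for (std::sregex_iterator it(line.begin(), line.end(), field), end;
         it != end; ++it) {
        out.push_back((*it)[1].str());
    }
    return out;
}
```

Each field is then available by index in the returned vector, much like picking a sub-match out of the smatch collection.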

User · Answer

Since everyone is posting a solution, here is mine, using templates, lambdas and tuples.

It can convert any CSV with the wanted columns to a C++ vector of tuple.

It works by defining each CSV line element type in a tuple.

You also need to define the std::string to type conversion Formatter lambda for each element (using std::strtod for example).

Then you get a vector of this struct corresponding to your CSV data.

You can reuse this easily to match any CSV structure.

StringHelpers.hpp

#include <string>
#include <fstream>
#include <vector>
#include <functional>

namespace StringHelpers
{
    template<typename Tuple>
    using Formatter = std::function<Tuple(const std::vector<std::string> &)>;

    std::vector<std::string> split(const std::string &string, const std::string &delimiter);

    template<typename Tuple>
    std::vector<Tuple> readCsv(const std::string &path, const std::string &delimiter, Formatter<Tuple> formatter);
}

StringHelpers.cpp

#include "StringHelpers.hpp"

#include <stdexcept>
#include <tuple>

namespace StringHelpers
{
    /**
     * Split a string with the given delimiter into several strings
     *
     * @param string - The string to extract the substrings from
     * @param delimiter - The substrings delimiter
     *
     * @return The substrings
     */
    std::vector<std::string> split(const std::string &string, const std::string &delimiter)
    {
        std::vector<std::string> result;
        size_t                   last = 0,
                                 next = 0;

        while ((next = string.find(delimiter, last)) != std::string::npos) {
            result.emplace_back(string.substr(last, next - last));
            last = next + delimiter.size(); // skip the whole delimiter, not just one character
        }

        result.emplace_back(string.substr(last));

        return result;
    }

    /**
     * Read a CSV file and store its values into the given structure (Tuple with Formatter constructor)
     *
     * @tparam Tuple - The CSV line structure format
     *
     * @param path - The CSV file path
     * @param delimiter - The CSV values delimiter
     * @param formatter - The CSV values formatter that take a vector of strings in input and return a Tuple
     *
     * @return The CSV as vector of Tuple
     */
    template<typename Tuple>
    std::vector<Tuple> readCsv(const std::string &path, const std::string &delimiter, Formatter<Tuple> formatter)
    {
        std::ifstream      file(path, std::ifstream::in);
        std::string        line;
        std::vector<Tuple> result;

        if (file.fail()) {
            throw std::runtime_error("The file " + path + " could not be opened");
        }

        while (std::getline(file, line)) {
            result.emplace_back(formatter(split(line, delimiter)));
        }

        file.close();

        return result;
    }

    // Forward template declarations

    template std::vector<std::tuple<double, double, double>> readCsv<std::tuple<double, double, double>>(const std::string &, const std::string &, Formatter<std::tuple<double, double, double>>);

} // End of StringHelpers namespace

main.cpp (some usage)

#include "StringHelpers.hpp"

#include <cstdlib>
#include <tuple>

/**
 * Example of use with a CSV file which have (number,Red,Green,Blue) as line values. We do not want to use the 1st value
 * of the line.
 */
int main(int argc, char **argv)
{
    // Declare CSV line type, formatter and template type
    typedef std::tuple<double, double, double>                          CSV_format;
    typedef std::function<CSV_format(const std::vector<std::string> &)> formatterT;

    enum RGB { Red = 1, Green, Blue };

    const std::string COLOR_MAP_PATH = "/some/absolute/path";

    // Load the color map
    auto colorMap = StringHelpers::readCsv<CSV_format>(COLOR_MAP_PATH, ",", [](const std::vector<std::string> &values) {
        return CSV_format {
                // Here is the formatter lambda that convert each value from string to what you want
                std::strtod(values[Red].c_str(), nullptr),
                std::strtod(values[Green].c_str(), nullptr),
                std::strtod(values[Blue].c_str(), nullptr)
        };
    });

    // Use your colorMap as you wish...
}

User · Answer

For what it is worth, here is my implementation. It deals with wstring input, but could be adjusted to string easily. It does not handle newlines in fields (as my application does not either, but adding support isn't too difficult) and it does not comply with the "\r\n" end of line as per the RFC (assuming you use std::getline), but it does handle whitespace trimming and double-quotes correctly (hopefully).

#include <string>
#include <utility>
#include <vector>

using namespace std;

// trim whitespaces around field or double-quotes, remove double-quotes and replace escaped double-quotes (double double-quotes)
wstring trimquote(const wstring& str, const wstring& whitespace, const wchar_t quotChar)
{
    wstring ws;
    wstring::size_type strBegin = str.find_first_not_of(whitespace);
    if (strBegin == wstring::npos)
        return L"";

    wstring::size_type strEnd = str.find_last_not_of(whitespace);
    wstring::size_type strRange = strEnd - strBegin + 1;

    if ((str[strBegin] == quotChar) && (str[strEnd] == quotChar))
    {
        ws = str.substr(strBegin+1, strRange-2);
        strBegin = 0;
        while ((strEnd = ws.find(quotChar, strBegin)) != wstring::npos)
        {
            ws.erase(strEnd, 1);
            strBegin = strEnd+1;
        }
    }
    else
        ws = str.substr(strBegin, strRange);
    return ws;
}

// size_type (not unsigned) so that comparisons against wstring::npos also work on 64-bit targets
pair<wstring::size_type, wstring::size_type> nextCSVQuotePair(const wstring& line, const wchar_t quotChar, wstring::size_type ofs = 0)
{
    pair<wstring::size_type, wstring::size_type> r;
    r.first = line.find(quotChar, ofs);
    r.second = wstring::npos;
    if (r.first != wstring::npos)
    {
        r.second = r.first;
        while (((r.second = line.find(quotChar, r.second+1)) != wstring::npos)
            && (line[r.second+1] == quotChar)) // WARNING: assumes null-terminated string such that line[r.second+1] always exist
            r.second++;
    }
    return r;
}

size_t parseLine(vector<wstring>& fields, const wstring& line)
{
    wstring::size_type ofs, ofs0, np;
    const wchar_t delim = L',';
    const wstring whitespace = L" \t\xa0\x3000\x2000\x2001\x2002\x2003\x2004\x2005\x2006\x2007\x2008\x2009\x200a\x202f\x205f";
    const wchar_t quotChar = L'\"';
    pair<wstring::size_type, wstring::size_type> quot;

    fields.clear();

    ofs = ofs0 = 0;
    quot = nextCSVQuotePair(line, quotChar);
    while ((np = line.find(delim, ofs)) != wstring::npos)
    {
        if ((np > quot.first) && (np < quot.second))
        { // skip delimiter inside quoted field
            ofs = quot.second+1;
            quot = nextCSVQuotePair(line, quotChar, ofs);
            continue;
        }
        fields.push_back( trimquote(line.substr(ofs0, np-ofs0), whitespace, quotChar) );
        ofs = ofs0 = np+1;
    }
    fields.push_back( trimquote(line.substr(ofs0), whitespace, quotChar) );

    return fields.size();
}

User · Answer

It is possible to use std::regex.

Depending on the size of your file and the memory available to you, it is possible to read it either line by line or entirely into a std::string.

To read the file one can use:

#include <fstream>
#include <iterator>
#include <string>

std::ifstream t("file.txt");
std::string sin((std::istreambuf_iterator<char>(t)),
                 std::istreambuf_iterator<char>());

Then you can match with this, which is customizable to your needs:

#include <regex>
#include <vector>

// one or more characters that are neither a comma nor whitespace
std::regex word_regex("[^,\\s]+");
auto what =
    std::sregex_iterator(sin.begin(), sin.end(), word_regex);
auto wend = std::sregex_iterator();

std::vector<std::string> v;
for (; what != wend; ++what) {
    std::smatch match = *what;
    v.push_back(match.str());
}

User · Answer

Another solution similar to Loki Astari's answer, in C++11. Rows here are std::tuples of a given type. The code scans one line, then scans until each delimiter, and then converts and dumps the value directly into the tuple (with a bit of template code).

for (auto row : csv<std::string, int, float>(file, ',')) {
    std::cout << "first col: " << std::get<0>(row) << std::endl;
}

Advantages:

- quite clean and simple to use, only C++11.
- automatic type conversion into std::tuple<t1, ...> via operator>>.

What's missing:

- escaping and quoting
- no error handling in case of malformed CSV.

The main code:

#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <tuple>

namespace csvtools {
    /// Read the last element of the tuple without calling recursively
    template <std::size_t idx, class... fields>
    typename std::enable_if<idx >= std::tuple_size<std::tuple<fields...>>::value - 1>::type
    read_tuple(std::istream &in, std::tuple<fields...> &out, const char delimiter) {
        std::string cell;
        std::getline(in, cell, delimiter);
        std::stringstream cell_stream(cell);
        cell_stream >> std::get<idx>(out);
    }

    /// Read the @p idx-th element of the tuple and then calls itself with @p idx + 1 to
    /// read the next element of the tuple. Automatically falls in the previous case when
    /// reaches the last element of the tuple thanks to enable_if
    template <std::size_t idx, class... fields>
    typename std::enable_if<idx < std::tuple_size<std::tuple<fields...>>::value - 1>::type
    read_tuple(std::istream &in, std::tuple<fields...> &out, const char delimiter) {
        std::string cell;
        std::getline(in, cell, delimiter);
        std::stringstream cell_stream(cell);
        cell_stream >> std::get<idx>(out);
        read_tuple<idx + 1, fields...>(in, out, delimiter);
    }
}

/// Iterable csv wrapper around a stream. @p fields the list of types that form up a row.
template <class... fields>
class csv {
    std::istream &_in;
    const char _delim;
public:
    typedef std::tuple<fields...> value_type;
    class iterator;

    /// Construct from a stream.
    inline csv(std::istream &in, const char delim) : _in(in), _delim(delim) {}

    /// Status of the underlying stream
    /// @{
    inline bool good() const {
        return _in.good();
    }
    inline const std::istream &underlying_stream() const {
        return _in;
    }
    /// @}

    inline iterator begin();
    inline iterator end();
private:

    /// Reads a line into a stringstream, and then reads the line into a tuple, that is returned
    inline value_type read_row() {
        std::string line;
        std::getline(_in, line);
        std::stringstream line_stream(line);
        std::tuple<fields...> retval;
        csvtools::read_tuple<0, fields...>(line_stream, retval, _delim);
        return retval;
    }
};

/// Iterator; just calls recursively @ref csv::read_row and stores the result.
template <class... fields>
class csv<fields...>::iterator {
    csv::value_type _row;
    csv *_parent;
public:
    typedef std::input_iterator_tag iterator_category;
    typedef csv::value_type         value_type;
    typedef std::size_t             difference_type;
    typedef csv::value_type *       pointer;
    typedef csv::value_type &       reference;

    /// Construct an empty/end iterator
    inline iterator() : _parent(nullptr) {}
    /// Construct an iterator at the beginning of the @p parent csv object.
    inline iterator(csv &parent) : _parent(parent.good() ? &parent : nullptr) {
        ++(*this);
    }

    /// Read one row, if possible. Set to end if parent is not good anymore.
    inline iterator &operator++() {
        if (_parent != nullptr) {
            _row = _parent->read_row();
            if (!_parent->good()) {
                _parent = nullptr;
            }
        }
        return *this;
    }

    inline iterator operator++(int) {
        iterator copy = *this;
        ++(*this);
        return copy;
    }

    inline csv::value_type const &operator*() const {
        return _row;
    }

    inline csv::value_type const *operator->() const {
        return &_row;
    }

    bool operator==(iterator const &other) {
        return (this == &other) or (_parent == nullptr and other._parent == nullptr);
    }
    bool operator!=(iterator const &other) {
        return not (*this == other);
    }
};

template <class... fields>
typename csv<fields...>::iterator csv<fields...>::begin() {
    return iterator(*this);
}

template <class... fields>
typename csv<fields...>::iterator csv<fields...>::end() {
    return iterator();
}

I put a tiny working example on GitHub; I've been using it for parsing some numerical data and it served its purpose.

User · Answer

I wrote a nice way of parsing CSV files and I thought I should add it as an answer:

#include <algorithm>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include <stdlib.h>
#include <stdio.h>

struct CSVDict
{
  std::vector< std::string > inputImages;
  std::vector< double > inputLabels;
};

/**
\brief Splits the string

\param str String to split
\param delim Delimiter on the basis of which splitting is to be done
\return results Output in the form of vector of strings
*/
std::vector<std::string> stringSplit( const std::string &str, const std::string &delim )
{
  std::vector<std::string> results;

  for (size_t i = 0; i < str.length(); i++)
  {
    std::string tempString = "";
    // check the bound first, then compare against the first delimiter character
    while ((i < str.length()) && (str[i] != *delim.c_str()))
    {
      tempString += str[i];
      i++;
    }
    results.push_back(tempString);
  }
  return results;
}

/**
\brief Parse the supplied CSV File and obtain Row and Column information.

Assumptions:
1. Header information is in first row
2. Delimiters are only used to differentiate cell members

\param csvFileName The full path of the file to parse
\param inputColumns The string of input columns which contain the data to be used for further processing
\param inputLabels The string of input labels based on which further processing is to be done
\param delim The delimiters used in inputColumns and inputLabels
\return Vector of Vector of strings: Collection of rows and columns
*/
std::vector< CSVDict > parseCSVFile( const std::string &csvFileName, const std::string &inputColumns, const std::string &inputLabels, const std::string &delim )
{
  std::vector< CSVDict > return_CSVDict;
  std::vector< std::string > inputColumnsVec = stringSplit(inputColumns, delim), inputLabelsVec = stringSplit(inputLabels, delim);
  std::vector< std::vector< std::string > > returnVector;
  std::ifstream inFile(csvFileName.c_str());
  int row = 0;
  std::vector< size_t > inputColumnIndeces, inputLabelIndeces;
  for (std::string line; std::getline(inFile, line, '\n');)
  {
    CSVDict tempDict;
    std::vector< std::string > rowVec;
    // strip all double quotes before splitting
    line.erase(std::remove(line.begin(), line.end(), '"'), line.end());
    rowVec = stringSplit(line, delim);

    // for the first row, record the indeces of the inputColumns and inputLabels
    if (row == 0)
    {
      for (size_t i = 0; i < rowVec.size(); i++)
      {
        for (size_t j = 0; j < inputColumnsVec.size(); j++)
        {
          if (rowVec[i] == inputColumnsVec[j])
          {
            inputColumnIndeces.push_back(i);
          }
        }
        for (size_t j = 0; j < inputLabelsVec.size(); j++)
        {
          if (rowVec[i] == inputLabelsVec[j])
          {
            inputLabelIndeces.push_back(i);
          }
        }
      }
    }
    else
    {
      for (size_t i = 0; i < inputColumnIndeces.size(); i++)
      {
        tempDict.inputImages.push_back(rowVec[inputColumnIndeces[i]]);
      }
      for (size_t i = 0; i < inputLabelIndeces.size(); i++)
      {
        tempDict.inputLabels.push_back(std::atof(rowVec[inputLabelIndeces[i]].c_str()));
      }
      return_CSVDict.push_back(tempDict);
    }
    row++;
  }
  return return_CSVDict;
}

User · Answer

My version is not using anything but the standard C++11 library. It copes well with Excel CSV quotation:

spam eggs,"foo,bar","""fizz buzz"""
1.23,4.567,-8.00E+09

The code is written as a finite-state machine and consumes one character at a time. I think it's easier to reason about.

#include <istream>
#include <string>
#include <vector>

enum class CSVState {
    UnquotedField,
    QuotedField,
    QuotedQuote
};

std::vector<std::string> readCSVRow(const std::string &row) {
    CSVState state = CSVState::UnquotedField;
    std::vector<std::string> fields {""};
    size_t i = 0; // index of the current field
    for (char c : row) {
        switch (state) {
            case CSVState::UnquotedField:
                switch (c) {
                    case ',': // end of field
                              fields.push_back(""); i++;
                              break;
                    case '"': state = CSVState::QuotedField;
                              break;
                    default:  fields[i].push_back(c);
                              break; }
                break;
            case CSVState::QuotedField:
                switch (c) {
                    case '"': state = CSVState::QuotedQuote;
                              break;
                    default:  fields[i].push_back(c);
                              break; }
                break;
            case CSVState::QuotedQuote:
                switch (c) {
                    case ',': // , after closing quote
                              fields.push_back(""); i++;
                              state = CSVState::UnquotedField;
                              break;
                    case '"': // "" -> "
                              fields[i].push_back('"');
                              state = CSVState::QuotedField;
                              break;
                    default:  // end of quote
                              state = CSVState::UnquotedField;
                              break; }
                break;
        }
    }
    return fields;
}

/// Read CSV file, Excel dialect. Accept "quoted fields ""with quotes"""
std::vector<std::vector<std::string>> readCSV(std::istream &in) {
    std::vector<std::vector<std::string>> table;
    std::string row;
    while (std::getline(in, row)) {
        auto fields = readCSVRow(row);
        table.push_back(fields);
    }
    return table;
}

User · Answer

You might want to look at my FOSS project CSVfix (updated link), which is a CSV stream editor written in C++. The CSV parser is no prize, but does the job and the whole package may do what you need without you writing any code.

See alib/src/a_csv.cpp for the CSV parser, and csvlib/src/csved_ioman.cpp (IOManager::ReadCSV) for a usage example.

User · Answer

I needed an easy-to-use C++ library for parsing CSV files but couldn't find any available, so I ended up building one. Rapidcsv is a C++11 header-only library which gives direct access to parsed columns (or rows) as vectors, in datatype of choice. For example:

#include <iostream>
#include <vector>
#include <rapidcsv.h>

int main()
{
  rapidcsv::Document doc("../tests/msft.csv");

  std::vector<float> close = doc.GetColumn<float>("Close");
  std::cout << "Read " << close.size() << " values." << std::endl;
}

User · Answer

You can use this library: https://github.com/vadamsky/csvworker

Code for example:

#include <iostream>
#include "csvworker.h"

using namespace std;

int main()
{
    //
    CsvWorker csv;
    csv.loadFromFile("example.csv");
    cout << csv.getRowsNumber() << " " << csv.getColumnsNumber() << endl;

    csv.getFieldRef(0, 2) = "0";
    csv.getFieldRef(1, 1) = "0";
    csv.getFieldRef(1, 3) = "0";
    csv.getFieldRef(2, 0) = "0";
    csv.getFieldRef(2, 4) = "0";
    csv.getFieldRef(3, 1) = "0";
    csv.getFieldRef(3, 3) = "0";
    csv.getFieldRef(4, 2) = "0";

    for(unsigned int i=0;i<csv.getRowsNumber();++i)
    {
        //cout << csv.getRow(i) << endl;
        for(unsigned int j=0;j<csv.getColumnsNumber();++j)
        {
            cout << csv.getField(i, j) << " ";
        }
        cout << endl;
    }

    csv.saveToFile("test.csv");

    //
    CsvWorker csv2(4,4);

    csv2.getFieldRef(0, 0) = "a";
    csv2.getFieldRef(0, 1) = "b";
    csv2.getFieldRef(0, 2) = "r";
    csv2.getFieldRef(0, 3) = "a";
    csv2.getFieldRef(1, 0) = "c";
    csv2.getFieldRef(1, 1) = "a";
    csv2.getFieldRef(1, 2) = "d";
    csv2.getFieldRef(2, 0) = "a";
    csv2.getFieldRef(2, 1) = "b";
    csv2.getFieldRef(2, 2) = "r";
    csv2.getFieldRef(2, 3) = "a";

    csv2.saveToFile("test2.csv");

    return 0;
}

User · Answer

Another CSV I/O library can be found here:

http://code.google.com/p/fast-cpp-csv-parser/

#include "csv.h"

int main(){
  io::CSVReader<3> in("ram.csv");
  in.read_header(io::ignore_extra_column, "vendor", "size", "speed");
  std::string vendor; int size; double speed;
  while(in.read_row(vendor, size, speed)){
    // do stuff with the data
  }
}

User · Answer

As all the CSV questions seem to get redirected here, I thought I'd post my answer here. This answer does not directly address the asker's question. I wanted to be able to read in a stream that is known to be in CSV format, where the type of each field is also known in advance. Of course, the method below could be used to treat every field as a string.

As an example of how I wanted to be able to use a CSV input stream, consider the following input (taken from Wikipedia's page on CSV):

const char input[] =
"Year,Make,Model,Description,Price\n"
"1997,Ford,E350,\"ac, abs, moon\",3000.00\n"
"1999,Chevy,\"Venture \"\"Extended Edition\"\"\",\"\",4900.00\n"
"1999,Chevy,\"Venture \"\"Extended Edition, Very Large\"\"\",\"\",5000.00\n"
"1996,Jeep,Grand Cherokee,\"MUST SELL!\n\
air, moon roof, loaded\",4799.00\n"
;

Then, I wanted to be able to read in the data like this:

std::istringstream ss(input);
std::string title[5];
int year;
std::string make, model, desc;
float price;
csv_istream(ss)
    >> title[0] >> title[1] >> title[2] >> title[3] >> title[4];
while (csv_istream(ss)
       >> year >> make >> model >> desc >> price) {
    //...do something with the record...
}

This was the solution I ended up with.

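#include <cstdlib>   // strtoll, strtoull
#include <istream>
#include <sstream>
#include <string>

// Note: the is_int and is_signed_int helpers shown further down must be declared before this struct.
struct csv_istream {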
    std::istream &is_;
    csv_istream (std::istream &is) : is_(is) {}
    void scan_ws () const {
        while (is_.good()) {
            int c = is_.peek();
            if (c != ' ' && c != '\t') break;
            is_.get();
        }
    }
    void scan (std::string *s = 0) const {
        std::string ws;
        int c = is_.get();
        if (is_.good()) {
            do {
                if (c == ',' || c == '\n') break;
                if (s) {
                    ws += c;
                    if (c != ' ' && c != '\t') {
                        *s += ws;
                        ws.clear();
                    }
                }
                c = is_.get();
            } while (is_.good());
            if (is_.eof()) is_.clear();
        }
    }
    template <typename T, bool> struct set_value {
        void operator () (std::string in, T &v) const {
            std::istringstream(in) >> v;
        }
    };
    template <typename T> struct set_value<T, true> {
        template <bool SIGNED> void convert (std::string in, T &v) const {
            if (SIGNED) v = ::strtoll(in.c_str(), 0, 0);
            else v = ::strtoull(in.c_str(), 0, 0);
        }
        void operator () (std::string in, T &v) const {
            convert<is_signed_int<T>::val>(in, v);
        }
    };
    template <typename T> const csv_istream & operator >> (T &v) const {
        std::string tmp;
        scan(&tmp);
        set_value<T, is_int<T>::val>()(tmp, v);
        return *this;
    }
    const csv_istream & operator >> (std::string &v) const {
        v.clear();
        scan_ws();
        if (is_.peek() != '"') scan(&v);
        else {
            std::string tmp;
            is_.get();
            std::getline(is_, tmp, '"');
            while (is_.peek() == '"') {
                v += tmp;
                v += is_.get();
                std::getline(is_, tmp, '"');
            }
            v += tmp;
            scan();
        }
        return *this;
    }
    template <typename T>
    const csv_istream & operator >> (T &(*manip)(T &)) const {
        is_ >> manip;
        return *this;
    }
    operator bool () const { return !is_.fail(); }
};

With the following helpers that may be simplified by the new integral traits templates in C++11:

template <typename T> struct is_signed_int { enum { val = false }; };
template <> struct is_signed_int<short> { enum { val = true}; };
template <> struct is_signed_int<int> { enum { val = true}; };
template <> struct is_signed_int<long> { enum { val = true}; };
template <> struct is_signed_int<long long> { enum { val = true}; };

template <typename T> struct is_unsigned_int { enum { val = false }; };
template <> struct is_unsigned_int<unsigned short> { enum { val = true}; };
template <> struct is_unsigned_int<unsigned int> { enum { val = true}; };
template <> struct is_unsigned_int<unsigned long> { enum { val = true}; };
template <> struct is_unsigned_int<unsigned long long> { enum { val = true}; };

template <typename T> struct is_int {
    enum { val = (is_signed_int<T>::val || is_unsigned_int<T>::val) };
};


User · Answer

You can use Boost Tokenizer with escaped_list_separator.

escaped_list_separator parses a superset of the CSV format (see Boost::tokenizer).

This only uses Boost tokenizer header files, no linking to Boost libraries required.

Here is an example (see Parse CSV File With Boost Tokenizer In C++ for details, or Boost::tokenizer):

#include <iostream>     // cout, endl
#include <fstream>      // fstream
#include <vector>
#include <string>
#include <algorithm>    // copy
#include <iterator>     // ostream_iterator
#include <boost/tokenizer.hpp>

int main()
{
    using namespace std;
    using namespace boost;
    string data("data.csv");

    ifstream in(data.c_str());
    if (!in.is_open()) return 1;

    typedef tokenizer< escaped_list_separator<char> > Tokenizer;
    vector< string > vec;
    string line;

    while (getline(in,line))
    {
        Tokenizer tok(line);
        vec.assign(tok.begin(),tok.end());

        // vector now contains strings from one row, output to cout here
        copy(vec.begin(), vec.end(), ostream_iterator<string>(cout, "|"));

        cout << "\n----------------------" << endl;
    }
}

User · Answer

Excuse me, but this all seems like a great deal of elaborate syntax to hide a few lines of code.

Why not this:

#include <cstdio>
#include <iostream>
#include <string>
#include <vector>

/**

  Read line from a CSV file

  @param[in] fp file pointer to open file
  @param[in] vls reference to vector of strings to hold next line

  */
void readCSV( FILE *fp, std::vector<std::string>& vls )
{
    vls.clear();
    if( ! fp )
        return;
    char buf[10000];
    if( ! fgets( buf, sizeof buf, fp ) )
        return;
    std::string s = buf;
    int p,q;
    q = -1;
    // loop over columns; a field counts only once it is closed by ',' or '\n'
    while( 1 ) {
        p = q;
        q = s.find_first_of(",\n",p+1);
        if( q == -1 )
            break;
        vls.push_back( s.substr(p+1,q-p-1) );
    }
}

int main(int argc, char* argv[])
{
    std::vector<std::string> vls;
    FILE * fp = fopen( argv[1], "r" );
    if( ! fp )
        return 1;
    readCSV( fp, vls );
    readCSV( fp, vls );
    readCSV( fp, vls );
    std::cout << "row 3, col 4 is " << vls[3].c_str() << "\n";

    return 0;
}

User · Answer

If you're using Visual Studio / MFC, the following solution may make your life easier. It supports both Unicode and MBCS, has comments, doesn't have dependencies other than CString, and works well enough for me. It doesn't support line breaks embedded within a quoted string, but I don't care so long as it doesn't crash in that case, which it doesn't.

The overall strategy is: handle quoted and empty strings as special cases, and use Tokenize for the rest. For quoted strings, the strategy is: find the real closing quote, keeping track of whether pairs of consecutive quotes were encountered. If they were, use Replace to convert the pairs to singles. No doubt there are more efficient methods, but performance wasn't sufficiently critical in my case to justify further optimization.

class CParseCSV {
public:
// Construction
    CParseCSV(const CString& sLine);

// Attributes
    bool    GetString(CString& sDest);

protected:
    CString m_sLine;    // line to extract tokens from
    int     m_nLen;     // line length in characters
    int     m_iPos;     // index of current position
};

CParseCSV::CParseCSV(const CString& sLine) : m_sLine(sLine)
{
    m_nLen = m_sLine.GetLength();
    m_iPos = 0;
}

bool CParseCSV::GetString(CString& sDest)
{
    if (m_iPos < 0 || m_iPos > m_nLen)  // if position out of range
        return false;
    if (m_iPos == m_nLen) { // if at end of string
        sDest.Empty();  // return empty token
        m_iPos = -1;    // really done now
        return true;
    }
    if (m_sLine[m_iPos] == '"') {   // if current char is double quote
        m_iPos++;   // advance to next char
        int iTokenStart = m_iPos;
        bool    bHasEmbeddedQuotes = false;
        while (m_iPos < m_nLen) {   // while more chars to parse
            if (m_sLine[m_iPos] == '"') {   // if current char is double quote
                // if next char exists and is also double quote
                if (m_iPos < m_nLen - 1 && m_sLine[m_iPos + 1] == '"') {
                    // found pair of consecutive double quotes
                    bHasEmbeddedQuotes = true;  // request conversion
                    m_iPos++;   // skip first quote in pair
                } else  // next char doesn't exist or is normal
                    break;  // found closing quote; exit loop
            }
            m_iPos++;   // advance to next char
        }
        sDest = m_sLine.Mid(iTokenStart, m_iPos - iTokenStart);
        if (bHasEmbeddedQuotes) // if string contains embedded quote pairs
            sDest.Replace(_T("\"\""), _T("\""));    // convert pairs to singles
        m_iPos += 2;    // skip closing quote and trailing delimiter if any
    } else if (m_sLine[m_iPos] == ',') {    // else if char is comma
        sDest.Empty();  // return empty token
        m_iPos++;   // advance to next char
    } else {    // else get next comma-delimited token
        sDest = m_sLine.Tokenize(_T(","), m_iPos);
    }
    return true;
}

// calling code should look something like this:

    CStdioFile  fIn(pszPath, CFile::modeRead);
    CString sLine, sToken;
    while (fIn.ReadString(sLine)) { // for each line of input file
        if (!sLine.IsEmpty()) { // ignore blank lines
            CParseCSV   csv(sLine);
            while (csv.GetString(sToken)) {
                // do something with sToken here
            }
        }
    }
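For readers without MFC, the same strategy (special-case quoted fields, turn doubled quotes into single ones, split the rest on commas) can be sketched with std::string. This is only an illustration under my own assumptions, not the code above: splitCsvLine is a hypothetical helper name, and like the MFC version it does not handle line breaks inside quotes.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of the MFC answer): splits one CSV line
// into fields. Inside a quoted field, "" is read as one literal quote.
// Line breaks embedded in quoted fields are NOT supported.
std::vector<std::string> splitCsvLine(const std::string& line)
{
    std::vector<std::string> fields;
    std::string field;
    std::size_t i = 0;
    const std::size_t n = line.size();

    while (true) {
        field.clear();
        if (i < n && line[i] == '"') {
            ++i;                                // skip opening quote
            while (i < n) {
                if (line[i] == '"') {
                    if (i + 1 < n && line[i + 1] == '"') {
                        field += '"';           // quote pair -> one quote
                        i += 2;
                        continue;
                    }
                    ++i;                        // closing quote
                    break;
                }
                field += line[i++];
            }
            while (i < n && line[i] != ',')     // ignore junk before delimiter
                ++i;
        } else {
            while (i < n && line[i] != ',')     // plain (possibly empty) field
                field += line[i++];
        }
        fields.push_back(field);
        if (i >= n)
            break;                              // end of line reached
        ++i;                                    // skip the comma
    }
    return fields;
}
```

A trailing comma yields a final empty field, and an empty line yields a single empty field; whether that is what you want depends on your data.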

User · Answer

Here is another implementation of a Unicode CSV parser (works with wchar_t). I wrote part of it, while Jonathan Leffler wrote the rest.

Note: This parser is aimed at replicating Excel's behavior as closely as possible, specifically when importing broken or malformed CSV files.

This is the original question - Parsing CSV file with multiline fields and escaped double quotes

This is the code as a SSCCE (Short, Self-Contained, Correct Example).

#include <stdbool.h>
#include <stdio.h>
#include <wchar.h>
#include <wctype.h>

extern const wchar_t *nextCsvField(const wchar_t *p, wchar_t sep, bool *newline);

// Returns a pointer to the start of the next field,
// or zero if this is the last field in the CSV
// p is the start position of the field
// sep is the separator used, i.e. comma or semicolon
// newline says whether the field ends with a newline or with a comma
const wchar_t *nextCsvField(const wchar_t *p, wchar_t sep, bool *newline)
{
    // Parse quoted sequences
    if ('"' == p[0]) {
        p++;
        while (1) {
            // Find next double-quote
            p = wcschr(p, L'"');
            // If we don't find it or it's the last symbol
            // then this is the last field
            if (!p || !p[1])
                return 0;
            // Check for "", it is an escaped double-quote
            if (p[1] != '"')
                break;
            // Skip the escaped double-quote
            p += 2;
        }
    }

    // Find next newline or comma.
    wchar_t newline_or_sep[4] = L"\n\r ";
    newline_or_sep[2] = sep;
    p = wcspbrk(p, newline_or_sep);

    // If no newline or separator, this is the last field.
    if (!p)
        return 0;

    // Check if we had newline.
    *newline = (p[0] == '\r' || p[0] == '\n');

    // Handle "\r\n", otherwise just increment
    if (p[0] == '\r' && p[1] == '\n')
        p += 2;
    else
        p++;

    return p;
}

static wchar_t *csvFieldData(const wchar_t *fld_s, const wchar_t *fld_e, wchar_t *buffer, size_t buflen)
{
    wchar_t *dst = buffer;
    wchar_t *end = buffer + buflen - 1;
    const wchar_t *src = fld_s;

    if (*src == L'"')
    {
        const wchar_t *p = src + 1;
        while (p < fld_e && dst < end)
        {
            if (p[0] == L'"' && p+1 < fld_e && p[1] == L'"')
            {
                *dst++ = p[0];
                p += 2;
            }
            else if (p[0] == L'"')
            {
                p++;
                break;
            }
            else
                *dst++ = *p++;
        }
        src = p;
    }
    while (src < fld_e && dst < end)
        *dst++ = *src++;
    if (dst >= end)
        return 0;
    *dst = L'\0';
    return(buffer);
}

static void dissect(const wchar_t *line)
{
    const wchar_t *start = line;
    const wchar_t *next;
    bool     eol;
    wprintf(L"Input %3zu: [%.*ls]\n", wcslen(line), (int)(wcslen(line)-1), line);
    while ((next = nextCsvField(start, L',', &eol)) != 0)
    {
        wchar_t buffer[1024];
        wprintf(L"Raw Field: [%.*ls] (eol = %d)\n", (int)(next - start - eol), start, eol);
        if (csvFieldData(start, next-1, buffer, sizeof(buffer)/sizeof(buffer[0])) != 0)
            wprintf(L"Field %3zu: [%ls]\n", wcslen(buffer), buffer);
        start = next;
    }
}

static const wchar_t multiline[] =
   L"First field of first row,\"This field is multiline\n"
    "\n"
    "but that's OK because it's enclosed in double quotes, and this\n"
    "is an escaped \"\" double quote\" but this one \"\" is not\n"
    "   \"This is second field of second row, but it is not multiline\n"
    "   because it doesn't start \n"
    "   with an immediate double quote\"\n"
    ;

int main(void)
{
    wchar_t line[1024];

    while (fgetws(line, sizeof(line)/sizeof(line[0]), stdin))
        dissect(line);
    dissect(multiline);

    return 0;
}

User · Answer

Here is code for reading a matrix; note you also have a csvwrite function in MATLAB:

void loadFromCSV( const std::string& filename )
{
    std::ifstream       file( filename.c_str() );
    std::vector< std::vector<std::string> >   matrix;
    std::vector<std::string>   row;
    std::string                line;
    std::string                cell;

    while( file )
    {
        std::getline(file,line);
        std::stringstream lineStream(line);
        row.clear();

        while( std::getline( lineStream, cell, ',' ) )
            row.push_back( cell );

        if( !row.empty() )
            matrix.push_back( row );
    }

    for( int i=0; i<int(matrix.size()); i++ )
    {
        for( int j=0; j<int(matrix[i].size()); j++ )
            std::cout << matrix[i][j] << " ";

        std::cout << std::endl;
    }
}

User · Answer

If you don't want to deal with including Boost in your project (it is considerably large if all you are going to use it for is CSV parsing), I have had luck with the CSV parsing here:

http://www.zedwood.com/article/112/cpp-csv-parser

It handles quoted fields, but does not handle inline \n characters (which is probably fine for most uses).

User · Answer

Another quick and easy way is to use Boost.Fusion I/O:

#include <iostream>
#include <sstream>

#include <boost/fusion/adapted/boost_tuple.hpp>
#include <boost/fusion/sequence/io.hpp>

namespace fusion = boost::fusion;

struct CsvString
{
    std::string value;

    // Stop reading a string once a CSV delimiter is encountered.
    friend std::istream& operator>>(std::istream& s, CsvString& v) {
        v.value.clear();
        for(;;) {
            auto c = s.peek();
            if(std::istream::traits_type::eof() == c || ',' == c || '\n' == c)
                break;
            v.value.push_back(c);
            s.get();
        }
        return s;
    }

    friend std::ostream& operator<<(std::ostream& s, CsvString const& v) {
        return s << v.value;
    }
};

int main() {
    std::stringstream input("abc,123,true,3.14\n"
                            "def,456,false,2.718\n");

    typedef boost::tuple<CsvString, int, bool, double> CsvRow;

    using fusion::operator<<;
    std::cout << std::boolalpha;

    using fusion::operator>>;
    input >> std::boolalpha;
    input >> fusion::tuple_open("") >> fusion::tuple_close("\n") >> fusion::tuple_delimiter(',');

    for(CsvRow row; input >> row;)
        std::cout << row << '\n';
}

Outputs:

(abc 123 true 3.14)
(def 456 false 2.718)

User · Answer

This solution detects these 4 cases (the complete class is at https://github.com/pedro-vicente/csv-parser):

1,field 2,field 3,
1,field 2,"field 3 quoted, with separator",
1,field 2,"field 3
with newline",
1,field 2,"field 3
with newline and separator,",

It reads the file character by character, and reads 1 row at a time into a vector (of strings), so it is suitable for very large files.

Usage: iterate until an empty row is returned (end of file). A row is a vector where each entry is a CSV column.

read_csv_t csv;
csv.open("../test.csv");
std::vector<std::string> row;
while (true)
{
  row = csv.read_row();
  if (row.size() == 0)
  {
    break;
  }
}

The class declaration:

class read_csv_t
{
public:
  read_csv_t();
  int open(const std::string &file_name);
  std::vector<std::string> read_row();
private:
  std::ifstream m_ifs;
};

The implementation:

std::vector<std::string> read_csv_t::read_row()
{
  bool quote_mode = false;
  std::vector<std::string> row;
  std::string column;
  char c;
  while (m_ifs.get(c))
  {
    switch (c)
    {
      // separator ',' detected.
      // in quote mode add character to column
      // push column if not in quote mode
    case ',':
      if (quote_mode == true)
      {
        column += c;
      }
      else
      {
        row.push_back(column);
        column.clear();
      }
      break;

      // quote '"' detected.
      // toggle quote mode
    case '"':
      quote_mode = !quote_mode;
      break;

      // line end detected
      // in quote mode add character to column
      // return row if not in quote mode
    case '\n':
    case '\r':
      if (quote_mode == true)
      {
        column += c;
      }
      else
      {
        return row;
      }
      break;

      // default, add character to column
    default:
      column += c;
      break;
    }
  }

  // return empty vector if end of file detected
  m_ifs.close();
  std::vector<std::string> v;
  return v;
}

User · Answer

You could also take a look at capabilities of Qt library.

It has regular expression support, and the QString class has nice methods, e.g. split(), which returns a QStringList: the list of strings obtained by splitting the original string at a given delimiter. That should suffice for a simple CSV file.

To get a column with a given header name, I use the approach from this question: c++ inheritance Qt problem qstring

User · Answer

Here is a ready-to-use function if all you need is to load a data file of doubles (no integers, no text).

#include <sstream>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>
#include <algorithm>

using namespace std;

/**
 * Parse a CSV data file and fill the 2d STL vector "data".
 * Limits: only "pure data" of doubles, not encapsulated by " and without \n inside.
 * Further, no formatting in the data (e.g. scientific notation).
 * It however handles both dots and commas as decimal separators and removes thousand separators.
 *
 * returnCodes[0]: file access 0-> ok 1-> not able to read; 2-> decimal separator equal to comma separator
 * returnCodes[1]: number of records
 * returnCodes[2]: number of fields. -1 if rows have different field size
 */
vector<int>
readCsvData (vector <vector <double>>& data, const string& filename, const string& delimiter, const string& decseparator){

  int vv[3] = { 0,0,0 };
  vector<int> returnCodes(&vv[0], &vv[0]+3);

  string rowstring, stringtoken;
  double doubletoken;
  int rowcount=0;
  int fieldcount=0;
  data.clear();

  ifstream iFile(filename, ios_base::in);
  if (!iFile.is_open()){
    returnCodes[0] = 1;
    return returnCodes;
  }
  while (getline(iFile, rowstring)) {
    if (rowstring=="") continue; // empty line
    rowcount ++; // let's start with 1
    if(delimiter == decseparator){
      returnCodes[0] = 2;
      return returnCodes;
    }
    if(decseparator != "."){
      // remove dots (used as thousand separators)
      string::iterator end_pos = remove(rowstring.begin(), rowstring.end(), '.');
      rowstring.erase(end_pos, rowstring.end());
      // replace decimal separator with dots.
      replace(rowstring.begin(), rowstring.end(), decseparator.c_str()[0], '.');
    } else {
      // remove commas (used as thousand separators)
      string::iterator end_pos = remove(rowstring.begin(), rowstring.end(), ',');
      rowstring.erase(end_pos, rowstring.end());
    }
    // tokenize..
    vector<double> tokens;
    // Skip delimiters at beginning.
    string::size_type lastPos = rowstring.find_first_not_of(delimiter, 0);
    // Find first "non-delimiter".
    string::size_type pos     = rowstring.find_first_of(delimiter, lastPos);
    while (string::npos != pos || string::npos != lastPos){
      // Found a token, convert it to double and add it to the vector.
      stringtoken = rowstring.substr(lastPos, pos - lastPos);
      if (stringtoken == "") {
        tokens.push_back(0.0);
      } else {
        istringstream totalSString(stringtoken);
        totalSString >> doubletoken;
        tokens.push_back(doubletoken);
      }
      // Skip delimiters.  Note the "not_of"
      lastPos = rowstring.find_first_not_of(delimiter, pos);
      // Find next "non-delimiter"
      pos = rowstring.find_first_of(delimiter, lastPos);
    }
    if(rowcount == 1){
      fieldcount = tokens.size();
      returnCodes[2] = tokens.size();
    } else {
      if ( tokens.size() != fieldcount){
        returnCodes[2] = -1;
      }
    }
    data.push_back(tokens);
  }
  iFile.close();
  returnCodes[1] = rowcount;
  return returnCodes;
}

User · Answer

If you don't care about escaping comma and newline,
AND you can't embed comma and newline in quotes (if you can't escape then...),
then it's only about three lines of code (OK, 14, but it's only 15 to read the whole file).

std::vector<std::string> getNextLineAndSplitIntoTokens(std::istream& str)
{
    std::vector<std::string>   result;
    std::string                line;
    std::getline(str,line);

    std::stringstream          lineStream(line);
    std::string                cell;

    while(std::getline(lineStream,cell, ','))
    {
        result.push_back(cell);
    }
    // This checks for a trailing comma with no data after it.
    if (!lineStream && cell.empty())
    {
        // If there was a trailing comma then add an empty element.
        result.push_back("");
    }
    return result;
}

I would just create a class representing a row, then stream into that object:

#include <iterator>
#include <iostream>
#include <fstream>
#include <sstream>
#include <vector>
#include <string>
#include <string_view>

class CSVRow
{
    public:
        std::string_view operator[](std::size_t index) const
        {
            return std::string_view(&m_line[m_data[index] + 1], m_data[index + 1] -  (m_data[index] + 1));
        }
        std::size_t size() const
        {
            return m_data.size() - 1;
        }
        void readNextRow(std::istream& str)
        {
            std::getline(str, m_line);

            m_data.clear();
            m_data.emplace_back(-1);
            std::string::size_type pos = 0;
            while((pos = m_line.find(',', pos)) != std::string::npos)
            {
                m_data.emplace_back(pos);
                ++pos;
            }
            // This checks for a trailing comma with no data after it.
            pos   = m_line.size();
            m_data.emplace_back(pos);
        }
    private:
        std::string         m_line;
        std::vector<int>    m_data;
};

std::istream& operator>>(std::istream& str, CSVRow& data)
{
    data.readNextRow(str);
    return str;
}
int main()
{
    std::ifstream       file("plop.csv");

    CSVRow              row;
    while(file >> row)
    {
        std::cout << "4th Element(" << row[3] << ")\n";
    }
}

But with a little work we could technically create an iterator:

class CSVIterator
{
    public:
        typedef std::input_iterator_tag     iterator_category;
        typedef CSVRow                      value_type;
        typedef std::size_t                 difference_type;
        typedef CSVRow*                     pointer;
        typedef CSVRow&                     reference;

        CSVIterator(std::istream& str)  :m_str(str.good()?&str:NULL) { ++(*this); }
        CSVIterator()                   :m_str(NULL) {}

        // Pre Increment
        CSVIterator& operator++()               {if (m_str) { if (!((*m_str) >> m_row)){m_str = NULL;}}return *this;}
        // Post increment
        CSVIterator operator++(int)             {CSVIterator    tmp(*this);++(*this);return tmp;}
        CSVRow const& operator*()   const       {return m_row;}
        CSVRow const* operator->()  const       {return &m_row;}

        bool operator==(CSVIterator const& rhs) {return ((this == &rhs) || ((this->m_str == NULL) && (rhs.m_str == NULL)));}
        bool operator!=(CSVIterator const& rhs) {return !((*this) == rhs);}
    private:
        std::istream*       m_str;
        CSVRow              m_row;
};

int main()
{
    std::ifstream       file("plop.csv");

    for(CSVIterator loop(file); loop != CSVIterator(); ++loop)
    {
        std::cout << "4th Element(" << (*loop)[3] << ")\n";
    }
}

Now that we are in 2020, let's add a CSVRange object:

class CSVRange
{
    std::istream&   stream;
    public:
        CSVRange(std::istream& str)
            : stream(str)
        {}
        CSVIterator begin() const {return CSVIterator{stream};}
        CSVIterator end()   const {return CSVIterator{};}
};

int main()
{
    std::ifstream       file("plop.csv");

    for(auto& row: CSVRange(file))
    {
        std::cout << "4th Element(" << row[3] << ")\n";
    }
}

User · Answer

The C++ String Toolkit Library (StrTk) has a token grid class that allows you to load data either from text files, strings or char buffers, and to parse/process them in a row-column fashion.

You can specify the row delimiters and column delimiters or just use the defaults.

void foo()
{
   std::string data = "1,2,3,4,5\n"
                      "0,2,4,6,8\n"
                      "1,3,5,7,9\n";

   strtk::token_grid grid(data,data.size(),",");

   for(std::size_t i = 0; i < grid.row_count(); ++i)
   {
      strtk::token_grid::row_type r = grid.row(i);
      for(std::size_t j = 0; j < r.size(); ++j)
      {
         std::cout << r.get<int>(j) << "\t";
      }
      std::cout << std::endl;
   }
   std::cout << std::endl;
}

More examples can be found Here

User · Answer

Parsing CSV file lines with a stream

I wrote a small example of parsing CSV file lines; it can be extended with for and while loops if desired:

#include <iostream>
#include <fstream>
#include <string>

using namespace std;

int main() {
    ifstream fin("Infile.csv");
    ofstream fout("OutFile.csv");
    string strline, strremain, strCol1, strout;
    string delimeter = ",";
    int d1;

    // continue until the end of the file
    while (getline(fin, strline)) {     // get next line from InFile
        // find delimeter position in line
        d1 = strline.find(delimeter);

        // and parse first column
        strCol1 = strline.substr(0, d1);    // parse first column
        d1++;
        strremain = strline.substr(d1);     // remaining line

        // create output line in CSV format
        strout.clear();                     // start each output line fresh
        strout.append(strCol1);
        strout.append(delimeter);

        // write line to Out File
        fout << strout << endl;             // out file line
    }

    fin.close();
    fout.close();
    return 0;
}

This code compiles and runs. Good luck!
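The example above extracts only the first column of each line. As a rough sketch of how the same find/substr idea extends to every column, the loop below keeps searching from just past the previous delimiter. The function name splitColumns and its default delimiter are my own assumptions, and quoted fields are not handled.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits a line on a single-character delimiter
// by repeatedly calling find() from just past the previous delimiter.
// Quoted fields are NOT handled.
std::vector<std::string> splitColumns(const std::string& line, char delimiter = ',')
{
    std::vector<std::string> columns;
    std::string::size_type start = 0;
    while (true) {
        const std::string::size_type pos = line.find(delimiter, start);
        if (pos == std::string::npos) {
            columns.push_back(line.substr(start));   // last column
            break;
        }
        columns.push_back(line.substr(start, pos - start));
        start = pos + 1;                              // continue after delimiter
    }
    return columns;
}
```

Empty columns are preserved, so two adjacent delimiters produce an empty string between them.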

User · Answer

You gotta feel proud when you use something so beautiful as boost::spirit

Here is my attempt at a parser that (almost) complies with the CSV specification linked as CSV specs (I didn't need line breaks within fields, and the spaces around the commas are discarded).

After you overcome the shocking experience of waiting 10 seconds for compiling this code :), you can sit back and enjoy.

// csvparser.cpp
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>

#include <iostream>
#include <string>
#include <vector>

namespace qi = boost::spirit::qi;
namespace bascii = boost::spirit::ascii;

template <typename Iterator>
struct csv_parser : qi::grammar<Iterator, std::vector<std::string>(), 
    bascii::space_type>
{
    qi::rule<Iterator, char()                                           > COMMA;
    qi::rule<Iterator, char()                                           > DDQUOTE;
    qi::rule<Iterator, std::string(),               bascii::space_type  > non_escaped;
    qi::rule<Iterator, std::string(),               bascii::space_type  > escaped;
    qi::rule<Iterator, std::string(),               bascii::space_type  > field;
    qi::rule<Iterator, std::vector<std::string>(),  bascii::space_type  > start;

    csv_parser() : csv_parser::base_type(start)
    {
        using namespace qi;
        using qi::lit;
        using qi::lexeme;
        using bascii::char_;

        start       = field % ',';
        field       = escaped | non_escaped;
        escaped     = lexeme['"' >> *( char_ -(char_('"') | ',') | COMMA | DDQUOTE)  >> '"'];
        non_escaped = lexeme[       *( char_ -(char_('"') | ',')                  )        ];
        DDQUOTE     = lit("\"\"")       [_val = '"'];
        COMMA       = lit(",")          [_val = ','];
    }

};

int main()
{
    std::cout << "Enter CSV lines [empty] to quit\n";

    using bascii::space;
    typedef std::string::const_iterator iterator_type;
    typedef csv_parser<iterator_type> csv_parser;

    csv_parser grammar;
    std::string str;
    int fid;
    while (getline(std::cin, str))
    {
        fid = 0;

        if (str.empty())
            break;

        std::vector<std::string> csv;
        std::string::const_iterator it_beg = str.begin();
        std::string::const_iterator it_end = str.end();
        bool r = phrase_parse(it_beg, it_end, grammar, space, csv);

        if (r && it_beg == it_end)
        {
            std::cout << "Parsing succeeded\n";
            for (auto& field: csv)
            {
                std::cout << "field " << ++fid << ": " << field << std::endl;
            }
        }
        else
        {
            std::cout << "Parsing failed\n";
        }
    }

    return 0;
}

Compile:

make csvparser

Test (example stolen from Wikipedia):

./csvparser
Enter CSV lines [empty] to quit

1999,Chevy,"Venture ""Extended Edition, Very Large""",,5000.00
Parsing succeeded
field 1: 1999
field 2: Chevy
field 3: Venture "Extended Edition, Very Large"
field 4: 
field 5: 5000.00

1999,Chevy,"Venture ""Extended Edition, Very Large""",,5000.00"
Parsing failed

User · Answer

I wrote a header-only, C++11 CSV parser. It's well tested, fast, supports the entire CSV spec (quoted fields, delimiter/terminator in quotes, quote escaping, etc.), and is configurable to account for the CSVs that don't adhere to the specification.

Configuration is done through a fluent interface:

// constructor accepts any input stream
CsvParser parser = CsvParser(std::cin)
  .delimiter(';')    // delimited by ; instead of ,
  .quote('\'')       // quoted fields use ' instead of "
  .terminator('\0'); // terminated by \0 instead of by \r\n, \n, or \r

Parsing is just a range-based for loop:

#include <fstream>
#include <iostream>
#include "../parser.hpp"

using namespace aria::csv;

int main() {
  std::ifstream f("some_file.csv");
  CsvParser parser(f);

  for (auto& row : parser) {
    for (auto& field : row) {
      std::cout << field << " | ";
    }
    std::cout << std::endl;
  }
}

User · Answer

If you DO care about parsing CSV correctly, this will do it... relatively slowly, as it works one char at a time.

void ParseCSV(const string& csvSource, vector<vector<string> >& lines)
{
   bool inQuote(false);
   bool newLine(false);
   string field;
   lines.clear();
   vector<string> line;

   string::const_iterator aChar = csvSource.begin();
   while (aChar != csvSource.end())
   {
      switch (*aChar)
      {
      case '"':
         newLine = false;
         inQuote = !inQuote;
         break;

      case ',':
         newLine = false;
         if (inQuote == true)
         {
            field += *aChar;
         }
         else
         {
            line.push_back(field);
            field.clear();
         }
         break;

      case '\n':
      case '\r':
         if (inQuote == true)
         {
            field += *aChar;
         }
         else
         {
            if (newLine == false)
            {
               line.push_back(field);
               lines.push_back(line);
               field.clear();
               line.clear();
               newLine = true;
            }
         }
         break;

      default:
         newLine = false;
         field.push_back(*aChar);
         break;
      }

      aChar++;
   }

   if (field.size())
      line.push_back(field);

   if (line.size())
      lines.push_back(line);
}

User · Answer

This is an old thread, but it's still at the top of search results, so I'm adding my solution using std::stringstream and a simple string replace method by Yves Baumes I found here.

The following example will read a file line by line, ignore comment lines starting with // and parse the other lines into a combination of strings, ints and doubles. Stringstream does the parsing, but expects fields to be delimited by whitespace, so I use stringreplace to turn commas into spaces first. It handles tabs ok, but doesn't deal with quoted strings.

Bad or missing input is simply ignored, which may or may not be good, depending on your circumstance.

#include <string>
#include <sstream>
#include <fstream>

void StringReplace(std::string& str, const std::string& oldStr, const std::string& newStr)
// code by  Yves Baumes
// http://stackoverflow.com/questions/1494399/how-do-i-search-find-and-replace-in-a-standard-string
{
  size_t pos = 0;
  while((pos = str.find(oldStr, pos)) != std::string::npos)
  {
     str.replace(pos, oldStr.length(), newStr);
     pos += newStr.length();
  }
}

void LoadCSV(std::string &filename) {
   std::ifstream stream(filename);
   std::string in_line;
   std::string Field;
   std::string Chan;
   int ChanType;
   double Scale;
   int Import;
   while (std::getline(stream, in_line)) {
      StringReplace(in_line, ",", " ");
      std::stringstream line(in_line);
      line >> Field >> Chan >> ChanType >> Scale >> Import;
      if (Field.substr(0,2)!="//") {
         // do your stuff 
         // this is CBuilder code for demonstration, sorry
         ShowMessage((String)Field.c_str() + "\n" + Chan.c_str() + "\n" + IntToStr(ChanType) + "\n" +FloatToStr(Scale) + "\n" +IntToStr(Import));
      }
   }
}
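If silently ignoring bad or missing input is not acceptable, the stream state after extraction can be checked. The sketch below assumes the same five-field layout as LoadCSV; the ChannelRow struct and parseChannelRow function are hypothetical names introduced only for this illustration, and quoted strings are still not handled.

```cpp
#include <algorithm>
#include <sstream>
#include <string>

// Hypothetical record type mirroring the five fields read by LoadCSV.
struct ChannelRow {
    std::string field;
    std::string chan;
    int chanType = 0;
    double scale = 0.0;
    int importFlag = 0;
};

// Returns true only if all five fields were extracted successfully.
// On failure, 'out' is left unchanged.
bool parseChannelRow(std::string line, ChannelRow& out)
{
    // Turn commas into spaces so operator>> can split the fields.
    std::replace(line.begin(), line.end(), ',', ' ');
    std::istringstream in(line);
    ChannelRow row;
    if (!(in >> row.field >> row.chan >> row.chanType >> row.scale >> row.importFlag))
        return false;   // a field was missing or not convertible
    out = row;
    return true;
}
```

The caller can then decide whether to skip the line, log it, or abort.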

User · Answer

Solution using Boost Tokenizer:

std::vector<std::string> vec;
using namespace boost;
tokenizer<escaped_list_separator<char> > tk(
   line, escaped_list_separator<char>('\\', ',', '\"'));
for (tokenizer<escaped_list_separator<char> >::iterator i(tk.begin());
   i!=tk.end();++i)
{
   vec.push_back(*i);
}

User · Answer

You can use the header-only Csv::Parser library. It fully supports RFC 4180, including quoted values, escaped quotes, and newlines in field values. It requires only standard C++ (C++17). It supports reading CSV data from std::string_view at compile time. It's extensively tested using Catch2.

User · Answer

The first thing you need to do is make sure the file exists. To accomplish this you just need to try and open the file stream at the path. After you have opened the file stream use stream.fail() to see if it worked as expected, or not.

bool fileExists(string fileName)
{

ifstream test;

test.open(fileName.c_str());

if (test.fail())
{
    test.close();
    return false;
}
else
{
    test.close();
    return true;
}
}

You must also verify that the file provided is the correct type of file. To accomplish this you need to look through the file path provided until you find the file extension. Once you have the file extension make sure that it is a .csv file.

bool verifyExtension(string filename)
{
int period = 0;

for (unsigned int i = 0; i < filename.length(); i++)
{
    if (filename[i] == '.')
        period = i;
}

string extension;

for (unsigned int i = period; i < filename.length(); i++)
    extension += filename[i];

if (extension == ".csv")
    return true;
else
    return false;
}

This function will return the file extension which is used later in an error message.

string getExtension(string filename)
{
int period = 0;

for (unsigned int i = 0; i < filename.length(); i++)
{
    if (filename[i] == '.')
        period = i;
}

string extension;

if (period != 0)
{
    for (unsigned int i = period; i < filename.length(); i++)
        extension += filename[i];
}
else
    extension = "NO FILE";

return extension;
}
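The two loops above can also be written with the standard string search functions. The following is a sketch under my own assumptions (the name fileExtension is hypothetical): it returns the text from the last '.' onward, but treats a dot that appears before the last path separator as belonging to a directory name rather than to the file.

```cpp
#include <string>

// Hypothetical helper: returns the extension including the leading dot,
// or an empty string if the file name has no extension.
std::string fileExtension(const std::string& filename)
{
    const std::string::size_type slash = filename.find_last_of("/\\");
    const std::string::size_type dot = filename.rfind('.');
    if (dot == std::string::npos)
        return "";                      // no dot at all
    if (slash != std::string::npos && dot < slash)
        return "";                      // dot belongs to a directory name
    return filename.substr(dot);
}
```

A check such as fileExtension(name) == ".csv" could then stand in for verifyExtension. Note that this comparison is case-sensitive, just like the original.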

This function will actually call the error checks created above and then parse through the file.

void parseFile(string fileName)
{
    if (fileExists(fileName) && verifyExtension(fileName))
    {
        ifstream fs;
        fs.open(fileName.c_str());
        string fileCommand;

        while (getline(fs, fileCommand))
        {
            string temp;

            for (unsigned int i = 0; i < fileCommand.length(); i++)
            {
                if (fileCommand[i] != ',')
                    temp += fileCommand[i];
                else
                    temp += " ";
            }

            if (!temp.empty())
            {
                // Place your code here to run the file.
            }
        }
        fs.close();
    }
    else if (!fileExists(fileName))
    {
        cout << "Error: The provided file does not exist: " << fileName << endl;

        if (!verifyExtension(fileName))
        {
            if (getExtension(fileName) != "NO FILE")
                cout << "\tCheck the file extension." << endl;
            else
                cout << "\tThere is no file in the provided path." << endl;
        }
    }
    else if (!verifyExtension(fileName)) 
    {
        if (getExtension(fileName) != "NO FILE")
            cout << "Incorrect file extension provided: " << getExtension(fileName) << endl;
        else
            cout << "There is no file in the following path: " << fileName << endl;
    }
}

User · Answer

You can open and read a .csv file using the fopen and fscanf functions, but the important thing is to parse the data. The simplest way to parse the data is by delimiter; in the case of .csv, the delimiter is ','.

Suppose your data1.csv file is as follows:

A,45,76,01
B,77,67,02
C,63,76,03
D,65,44,04

You can tokenize the data, store it in char arrays, and later use atoi() etc. for the appropriate conversions.

FILE *fp;
char str1[10], str2[10], str3[10], str4[10];

fp = fopen("G:\\data1.csv", "r");
if(NULL == fp)
{
    printf("\nError in opening file.");
    return 0;
}
/* one conversion per buffer; the width 9 keeps each read inside its 10-char array */
while(4 == fscanf(fp, " %9[^,],%9[^,],%9[^,],%9s", str1, str2, str3, str4))
{
    printf("\n%s %s %s %s", str1, str2, str3, str4);
}
fclose(fp);

In [^,], the ^ inverts the logic: it means match any string that does not contain a comma. The , that follows then matches the comma that terminated the previous string.

User · Answer

It is not overkill to use Spirit for parsing CSVs. Spirit is well suited for micro-parsing tasks. For instance, with Spirit 2.1, it is as easy as:

bool r = phrase_parse(first, last,

    //  Begin grammar
    (
        double_ % ','
    )
    ,
    //  End grammar

    space, v);

The vector, v, gets stuffed with the values. There is a series of tutorials touching on this in the new Spirit 2.1 docs that have just been released with Boost 1.41.

The tutorial progresses from simple to complex. The CSV parsers are presented somewhere in the middle and touch on various techniques in using Spirit. The generated code is as tight as hand-written code. Check out the generated assembler!
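
For readers new to Spirit, a hand-written counterpart may help show what the grammar double_ % ',' with a space skipper accepts: one number, followed by any count of comma-number pairs, with whitespace ignored between tokens. The sketch below is an approximation in plain standard C++, not Spirit code. parseDoubleList is a hypothetical name, and std::strtod accepts some inputs (hexadecimal floats, for example) that Spirit's double_ may treat differently.

```cpp
#include <cctype>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical hand-written counterpart of `double_ % ','` with whitespace skipping.
// Returns true if at least one number was parsed; parsed values are appended to out.
bool parseDoubleList(const std::string& text, std::vector<double>& out) {
    const char* p = text.c_str();
    char* end = nullptr;

    // The first element is mandatory (strtod skips leading whitespace itself).
    double value = std::strtod(p, &end);
    if (end == p) return false;
    out.push_back(value);
    p = end;

    // Then zero or more ", number" pairs.
    for (;;) {
        const char* q = p;
        while (std::isspace(static_cast<unsigned char>(*q))) ++q;
        if (*q != ',') break;          // no separator: the list is finished
        ++q;
        value = std::strtod(q, &end);
        if (end == q) break;           // dangling comma: stop before it
        out.push_back(value);
        p = end;
    }
    return true;
}
```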

User · Answer

Since I'm not used to Boost right now, I will suggest a simpler solution. Let's suppose that your .csv file has 100 lines with 10 numbers in each line, separated by a ','. You could load this data into an array with the following code:

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
using namespace std;

int main()
{
    int A[100][10];
    ifstream ifs;
    ifs.open("name_of_file.csv");
    string s1;
    char c;
    for (int k = 0; k < 100; k++)
    {
        getline(ifs, s1);
        stringstream stream(s1);
        int j = 0;
        // Stop after 10 values or at the first failed read,
        // so a long line cannot write past the end of the row.
        while (j < 10 && stream >> A[k][j])
        {
            stream >> c;   // consume the ',' separator
            j++;
        }
    }
}
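
If the number of rows and columns is not known in advance, the same line-by-line idea can fill nested vectors instead of a fixed array. A minimal sketch, assuming every cell is a valid integer (std::stoi throws otherwise); loadIntTable is a hypothetical name, not part of the answer above.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads comma-separated integers, one row per line.
// Blank lines are skipped; rows may have different lengths.
std::vector<std::vector<int>> loadIntTable(std::istream& in) {
    std::vector<std::vector<int>> table;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) continue;
        std::vector<int> row;
        std::istringstream cells(line);
        std::string cell;
        // The three-argument getline splits on ',' instead of '\n'.
        while (std::getline(cells, cell, ',')) {
            row.push_back(std::stoi(cell));
        }
        table.push_back(row);
    }
    return table;
}
```

Passing a std::ifstream opened on the .csv file works the same way as any other std::istream.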

User · Answer

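A minor edit to sastanin's solution, so that it can deal with newlines within quotes.

// CSVState is the enum from sastanin's solution:
// UnquotedField, QuotedField, QuotedQuote.
std::vector<std::vector<std::string>> readCSV(std::istream &in) {
    std::vector<std::vector<std::string>> table;
    CSVState state = CSVState::UnquotedField;
    std::vector<std::string> fields {""};
    size_t i = 0; // index of the current field
    char c;
    // Read character by character so a newline inside quotes stays in the field.
    while (in.get(c)) {
        switch (state) {
            case CSVState::UnquotedField:
                switch (c) {
                    case ',': // end of field
                              fields.push_back(""); i++;
                              break;
                    case '"': state = CSVState::QuotedField;
                              break;
                    case '\n': // end of row
                              table.push_back(fields);
                              fields = {""};
                              i = 0;
                              break;
                    default:  fields[i].push_back(c);
                              break; }
                break;
            case CSVState::QuotedField:
                switch (c) {
                    case '"': state = CSVState::QuotedQuote;
                              break;
                    default:  fields[i].push_back(c); // includes '\n'
                              break; }
                break;
            case CSVState::QuotedQuote:
                switch (c) {
                    case ',': // , after closing quote
                              fields.push_back(""); i++;
                              state = CSVState::UnquotedField;
                              break;
                    case '"': // "" -> "
                              fields[i].push_back('"');
                              state = CSVState::QuotedField;
                              break;
                    case '\n': // newline after closing quote
                              table.push_back(fields);
                              state = CSVState::UnquotedField;
                              fields = {""};
                              i = 0;
                              break;
                    default:  // end of quote
                              state = CSVState::UnquotedField;
                              break; }
                break;
        }
    }
    // Keep a final row that has no trailing newline.
    if (i > 0 || !fields[0].empty())
        table.push_back(fields);
    return table;
}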
