[php] How to parse a CSV file using PHP

Suppose I have a .csv file with the following content:

 "text, with commas","another text",123,"text",5; 
 "some    without commas","another text",123,"text";
 "some text with  commas or no",,123,"text"; 

How can I parse the content through PHP?

This question is related to php csv fgetcsv

The answer is


I had been looking for the same thing without relying on some unsupported PHP class. Excel CSV doesn't always enclose fields in quotes, and it escapes quotes by doubling them (""), probably because the format dates back to the '80s or so. After looking at several .csv parsers in the comments section on PHP.NET, I saw some that even used callbacks or eval'd code, and they either didn't work as needed or simply didn't work at all. So I wrote my own routines for this, and they work in the most basic PHP configuration. The array keys can be either numeric or named after the fields given in the header row. Hope this helps.

    function SW_ImplodeCSV(array $rows, $headerrow=true, $mode='EXCEL', $fmt='2D_FIELDNAME_ARRAY')
    // SW_ImplodeCSV - returns 2D array as string of csv(MS Excel .CSV supported)
    // AUTHOR: [email protected]
    // RELEASED: 9/21/13 BETA
      { $r=1; $row=array(); $fields=array(); $csv="";
        $escapes=array('\r', '\n', '\t', '\\', '\"');  //two byte escape codes
        $escapes2=array("\r", "\n", "\t", "\\", "\""); //actual code

        if($mode=='EXCEL')// escape code = ""
         { $delim=','; $enclos='"'; $rowbr="\r\n"; }
        else //mode=STANDARD all fields enclosed
           { $delim=','; $enclos='"'; $rowbr="\r\n"; }

          $csv=""; $i=-1; $i2=0; $imax=count($rows);

          while( $i < $imax )
          {
            // get field names
            if($i == -1)
             { $row=$rows[0];
               if($fmt=='2D_FIELDNAME_ARRAY')
                { $i2=0; $i2max=count($row);
                  foreach ($row as $k => $v) // each() was removed in PHP 8
                   { $fields[$i2]=$k;
                     $i2++;
                   }
                }
               else //if($fmt='2D_NUMBERED_ARRAY')
                { $i2=0; $i2max=(count($rows[0]));
                  while($i2<$i2max)
                   { $fields[$i2]=$i2;
                     $i2++;
                   }
                }

               if($headerrow==true) { $row=$fields; }
               else                 { $i=0; $row=$rows[0];}
             }
            else
             { $row=$rows[$i];
             }

            $i2=0;  $i2max=count($row); 
            while($i2 < $i2max)// numeric loop (order really matters here)
            //while( list($k, $v) = each($row) )
             { if($i2 != 0) $csv=$csv.$delim;

               // on the header row ($i == -1) $row holds the field names themselves
               $v=($i == -1) ? $fields[$i2] : $row[$fields[$i2]];

               if($mode=='EXCEL') //EXCEL 2quote escapes
                    { $newv = '"'.(str_replace('"', '""', $v)).'"'; }
               else  //STANDARD
                    { $newv = '"'.(str_replace($escapes2, $escapes, $v)).'"'; }
               $csv=$csv.$newv;
               $i2++;
             }

            $csv=$csv."\r\n";

            $i++;
          }

         return $csv;
       }

    function SW_ExplodeCSV($csv, $headerrow=true, $mode='EXCEL', $fmt='2D_FIELDNAME_ARRAY')
     { // SW_ExplodeCSV - parses CSV into 2D array(MS Excel .CSV supported)
       // AUTHOR: [email protected]
       // RELEASED: 9/21/13 BETA
       //SWMessage("SW_ExplodeCSV() - CALLED HERE -");
       $rows=array(); $row=array(); $fields=array();// rows = array of arrays

       //escape code = '\'
       $escapes=array('\r', '\n', '\t', '\\', '\"');  //two byte escape codes
       $escapes2=array("\r", "\n", "\t", "\\", "\""); //actual code

       if($mode=='EXCEL')
        {// escape code = ""
          $delim=','; $enclos='"'; $esc_enclos='""'; $rowbr="\r\n";
        }
       else //mode=STANDARD 
        {// all fields enclosed
          $delim=','; $enclos='"'; $rowbr="\r\n";
        }

       $indxf=0; $indxl=0; $encindxf=0; $encindxl=0; $enc=0; $enc1=0; $enc2=0; $brk1=0; $rowindxf=0; $rowindxl=0; $encflg=0;
       $rowcnt=0; $colcnt=0; $rowflg=0; $colflg=0; $cell="";
       $headerflg=0; $quoteflg=0;
       $i=0; $i2=0; $imax=strlen($csv);   

       while($indxf < $imax)
         {
           //find first *possible* cell delimiters
           $indxl=strpos($csv, $delim, $indxf);  if($indxl===false) { $indxl=$imax; }
           $encindxf=strpos($csv, $enclos, $indxf); if($encindxf===false) { $encindxf=$imax; }//first open quote
           $rowindxl=strpos($csv, $rowbr, $indxf); if($rowindxl===false) { $rowindxl=$imax; }

           if(($encindxf>$indxl)||($encindxf>$rowindxl))
            { $quoteflg=0; $encindxf=$imax; $encindxl=$imax;
              if($rowindxl<$indxl) { $indxl=$rowindxl; $rowflg=1; }
            }
           else 
            { //find cell enclosure area (and real cell delimiter)
              $quoteflg=1;
              $enc=$encindxf; 
              while($enc<$indxl) //$enc = next open quote
               {// loop till unquoted delim. is found
                 $enc=strpos($csv, $enclos, $enc+1); if($enc===false) { $enc=$imax; }//close quote
                 $encindxl=$enc; //last close quote
                 $indxl=strpos($csv, $delim, $enc+1); if($indxl===false)  { $indxl=$imax; }//last delim.
                 $enc=strpos($csv, $enclos, $enc+1); if($enc===false) { $enc=$imax; }//open quote
                 if(($indxl==$imax)||($enc==$imax)) break;
               }
              $rowindxl=strpos($csv, $rowbr, $enc+1); if($rowindxl===false) { $rowindxl=$imax; }
              if($rowindxl<$indxl) { $indxl=$rowindxl; $rowflg=1; }
            }

           if($quoteflg==0)
            { //no enclosured content - take as is
              $colflg=1;
              //get cell 
             // $cell=substr($csv, $indxf, ($indxl-$indxf)-1);
              $cell=substr($csv, $indxf, ($indxl-$indxf));
            }
           else// if($rowindxl > $encindxf)
            { // cell enclosed
              $colflg=1;

             //get cell - decode cell content
              $cell=substr($csv, $encindxf+1, ($encindxl-$encindxf)-1);

              if($mode=='EXCEL') //remove EXCEL 2quote escapes
                { $cell=str_replace($esc_enclos, $enclos, $cell);
                }
              else //remove STANDARD esc. scheme
                { $cell=str_replace($escapes, $escapes2, $cell);
                }
            }

           if($colflg)
            {// read cell into array
              if( ($fmt=='2D_FIELDNAME_ARRAY') && ($headerflg==1) )
               { $row[$fields[$colcnt]]=$cell; }
              else if(($fmt=='2D_NUMBERED_ARRAY')||($headerflg==0))
               { $row[$colcnt]=$cell; } //$rows[$rowcnt][$colcnt] = $cell;

              $colcnt++; $colflg=0; $cell="";
              $indxf=$indxl+1;//strlen($delim);
            }
           if($rowflg)
            {// read row into big array
              if(($headerrow) && ($headerflg==0))
                {  $fields=$row;
                   $row=array();
                   $headerflg=1;
                }
              else
                { $rows[$rowcnt]=$row;
                  $row=array();
                  $rowcnt++; 
                }
               $colcnt=0; $rowflg=0; $cell="";
               $rowindxf=$rowindxl+2;//strlen($rowbr);
               $indxf=$rowindxf;
            }

           $i++;
           //SWMessage("SW_ExplodeCSV() - colcnt = ".$colcnt."   rowcnt = ".$rowcnt."   indxf = ".$indxf."   indxl = ".$indxl."   rowindxf = ".$rowindxf);
           //if($i>20) break;
         }

       return $rows;
     }

...bob can now go back to his spreadsheets


A somewhat shorter answer, for PHP >= 5.3.0 (the short array syntax [] used here needs 5.4):

    $csvFile = file('../somefile.csv');
    $data = [];
    foreach ($csvFile as $line) {
        $data[] = str_getcsv($line);
    }
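
One caveat with reading the file line by line: file() splits on physical lines, so a quoted field that contains a line break ends up spread over two array entries. A minimal sketch of an fgetcsv()-based reader that avoids this is below. The helper name readCsvFile and the sample data are made up for illustration; the delimiter, enclosure and escape arguments are just the defaults written out.

```php
<?php
// Hypothetical helper (not from the answer above): reads a CSV file row by row.
function readCsvFile(string $path): array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $rows = [];
    // fgetcsv() returns false at end of file. Length 0 means "no line length limit".
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if ($row === [null]) {
            continue; // fgetcsv() reports a blank line as an array holding one null
        }
        $rows[] = $row;
    }
    fclose($handle);

    return $rows;
}

// Usage: a small sample file with a line break inside a quoted field.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "\"text, with commas\",\"two\nlines\",123\nplain,,5\n");
$rows = readCsvFile($path);
unlink($path);
```

Reading through a file handle also keeps memory use flat if you process each row inside the loop instead of collecting them all.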

Handy one liner to parse a CSV file into an array

    $csv = array_map('str_getcsv', file('data.csv'));
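
If the first line of the file is a header, the same idea can be extended to key each row by field name. This is only a sketch: the helper name csvLinesToAssoc and the sample lines are assumptions, and rows whose column count differs from the header are simply skipped.

```php
<?php
// Hypothetical helper: turns CSV lines into rows keyed by the header names.
function csvLinesToAssoc(array $lines): array
{
    // Parse every line; delimiter, enclosure and escape are the defaults, written out.
    $rows = array_map(function ($line) {
        return str_getcsv($line, ',', '"', '\\');
    }, $lines);

    $header = array_shift($rows); // first parsed line holds the field names
    $out = [];
    foreach ($rows as $row) {
        if (count($row) !== count($header)) {
            continue; // skip blank or ragged lines instead of failing in array_combine()
        }
        $out[] = array_combine($header, $row);
    }

    return $out;
}

// Usage with in-memory lines; for a real file you could pass
// file('data.csv', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) instead.
$lines = ['name,qty', '"Smith, J",3', 'Doe,5'];
$records = csvLinesToAssoc($lines);
```

Note that all values come back as strings, so "3" stays a string unless you cast it yourself.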

Just discovered a handy way to get an index while parsing. My mind was blown.

    $handle = fopen("test.csv", "r");
    for ($i = 0; $row = fgetcsv($handle); ++$i) {
        // Do something with the $row array; $i is the zero-based row index
    }
    fclose($handle);



I love this

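    $data = str_getcsv($CsvString, "\n"); // parse the rows
    foreach ($data as &$row) {
        // parse the items in each row; the delimiter must be a single character (";", "," or whatever your data uses)
        $row = str_getcsv($row, ";");
        $this->debug($row); // my own debug helper; replace with whatever you need
    }
    unset($row); // break the reference left over from the foreach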

In my case I get the CSV from a web service, so this way I don't need to create a file. If you need to parse a file instead, just read its contents into a string and pass that in.
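
Splitting on "\n" first may mangle quoted fields, since the first pass already consumes the quotes, and it cannot cope with line breaks inside a field. As an alternative sketch for the same "CSV arrives as a string" situation, the string can be written to an in-memory stream and read back with fgetcsv(). The helper name parseCsvString and the sample input are assumptions for illustration.

```php
<?php
// Hypothetical helper: parses a whole CSV string without creating a file on disk.
function parseCsvString(string $csv, string $delimiter = ','): array
{
    // php://temp is an in-memory stream (it spills to a temp file only for large data).
    $handle = fopen('php://temp', 'r+');
    fwrite($handle, $csv);
    rewind($handle);

    $rows = [];
    while (($row = fgetcsv($handle, 0, $delimiter, '"', '\\')) !== false) {
        if ($row !== [null]) { // ignore blank lines
            $rows[] = $row;
        }
    }
    fclose($handle);

    return $rows;
}

// Usage: semicolon-delimited input with a line break inside a quoted field.
$rows = parseCsvString("a;\"x\ny\";1\r\nb;;2\r\n", ';');
```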