duplicate 'row.names' are not allowed error

Question

I am trying to load a CSV file that has 14 columns like this:

StartDate, var1, var2, var3, ..., var14

When I issue this command:

systems <- read.table("http://getfile.pl?test.csv", header=TRUE, sep=",")

I get an error message:

duplicate row.names are not allowed

It seems to me that the first column name is causing the issue. When I manually download the file and remove the StartDate name from the file, R successfully reads the file and replaces the first column name with X. Can someone tell me what is going on? The file is a (comma separated) CSV file.

User · Answer

I had this error when opening a CSV file in which one of the fields had commas embedded in it. The field had quotes around it, and I had copied and pasted a read.table call with quote="" in it. Once I took quote="" out, the default behavior of read.table took over and the problem went away. So I went from this:

systems <- read.table("http://getfile.pl?test.csv", header=TRUE, sep=",", quote="")

to this:

systems <- read.table("http://getfile.pl?test.csv", header=TRUE, sep=",")
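A minimal sketch of why quote="" can trigger the error, assuming toy inline data (the column names and values here are made up for illustration and are not from the original file). With quoting disabled, the embedded commas add an extra field to each data row, so the header ends up one field short.

```r
# Toy data (hypothetical): the quoted name field contains a comma,
# and the id column repeats.
csv_text <- 'id,name,score
1,"Smith, John",10
1,"Doe, Jane",20'

# With quote = "", quote characters are ordinary text, so each data row
# splits into 4 fields while the header has 3. read.table then uses the
# first column as row names, and the repeated id raises the error.
err <- tryCatch(
  read.table(text = csv_text, header = TRUE, sep = ",", quote = ""),
  error = function(e) conditionMessage(e)
)
print(err)

# With the default quote characters, the embedded comma stays in the field.
ok <- read.table(text = csv_text, header = TRUE, sep = ",")
print(ok)
```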

User · Answer

I used read_csv from the readr package. In my experience, the parameter row.names=NULL in the read.csv function will lead to a wrong reading of the file if a column name is missing, i.e. every column will be shifted. read_csv solves this.

User · Answer

Another possible reason for this error is that you have entire rows duplicated. If that is the case, the problem is solved by removing the duplicate rows.
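A rough sketch of the duplicate-row case, assuming toy data and assuming the first column is being used for row names (forced here with row.names = 1 for illustration). In practice the lines might come from readLines() on your own file.

```r
# Toy lines (hypothetical); the last row is an exact duplicate of the first data row.
lines <- c("id,val", "a,1", "b,2", "a,1")

# Using column 1 as row names fails because "a" appears twice.
err <- tryCatch(
  read.table(text = paste(lines, collapse = "\n"),
             header = TRUE, sep = ",", row.names = 1),
  error = function(e) conditionMessage(e)
)
print(err)

# Dropping exact duplicate lines before parsing removes the clash.
deduped <- unique(lines)
ok <- read.table(text = paste(deduped, collapse = "\n"),
                 header = TRUE, sep = ",", row.names = 1)
print(ok)
```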

User · Answer

The answer here (https://stackoverflow.com/a/22408965/2236315) by @adrianoesch should help (e.g., it solves "If you know of a solution that does not require the awkward workaround mentioned in your comment (shift the column names, copy the data), that would be great" and "...requiring that the data be copied", proposed by @Frank).

Note that if you open the file in a text editor, you should see that the number of header fields is less than the number of columns below the header row. In my case, the data set had a "," missing at the end of the last header field.
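Instead of inspecting the file by eye, the field counts can be checked with count.fields. This is a small sketch on made-up lines; for a real file you would pass its path instead of a text connection.

```r
# Toy lines (hypothetical): the data rows end with a trailing comma, the header does not.
csv_lines <- c("v1,v2,v3", "a1,a2,a3,", "b1,b2,b3,")

con <- textConnection(csv_lines)
n_fields <- count.fields(con, sep = ",")  # number of fields on each line
close(con)
print(n_fields)

# TRUE when the header has fewer fields than the widest data row.
header_short <- n_fields[1] < max(n_fields[-1])
print(header_short)
```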

User · Answer

It seems the problem can arise for more than one reason. The following two steps worked when I was having the same error:

1. I saved my file as MS-DOS CSV (earlier it was saved as just CSV, in Excel Starter 2010). I opened the CSV in Notepad and found no inconsistent commas (consistency as described above by @Brian).
2. I noticed I was not using the argument sep=",". I added it and it worked (even though "," is already the default separator for read.csv).

User · Answer

This related question points out a part of the ?read.table documentation that explains your problem:

"If there is a header and the first row contains one fewer field than the number of columns, the first column in the input is used for the row names. Otherwise if row.names is missing, the rows are numbered."

Your header row likely has one fewer column than the rest of the file, so read.table assumes that the first column is the row.names (which must all be unique), not a column (which can contain duplicated values). You can fix this by using one of the following two solutions:

1. adding a delimiter (i.e. \t or ,) to the front or end of your header row in the source file, or
2. removing any trailing delimiters in your data.

The choice will depend on the structure of your data.

Example: Here the header row is interpreted as having one fewer column than the data because the delimiters don't match:

v1,v2,v3   # 3 items!!
a1,a2,a3,  # 4 items
b1,b2,b3,  # 4 items

This is how it is interpreted by default:

   v1,v2,v3   # 3 items!!
a1,a2,a3,  # 4 items
b1,b2,b3,  # 4 items

The values in the first column (which has no header) are interpreted as row.names: a1 and b1. If this column contains duplicates, which is entirely possible, then you get the duplicate 'row.names' are not allowed error.

If you set row.names = NULL, the rows are simply numbered and the error goes away, but you still have a mismatching number of items in the header and in the data because the delimiters don't match.

Solution 1: Add a trailing delimiter to the header:

v1,v2,v3,  # 4 items!!
a1,a2,a3,  # 4 items
b1,b2,b3,  # 4 items

Solution 2: Remove the excess trailing delimiter from the non-header rows:

v1,v2,v3   # 3 items
a1,a2,a3   # 3 items!!
b1,b2,b3   # 3 items!!
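The mismatch described in this answer can be reproduced with inline text. This is only a sketch: the data is made up, and the first column deliberately repeats a value so that the error appears. The second call shows the "add a trailing delimiter to the header" fix.

```r
# Header has 3 fields; data rows have 4 because of the trailing comma.
# The first column repeats "a1" (toy data, chosen to trigger the error).
bad <- "v1,v2,v3
a1,a2,a3,
a1,b2,b3,"

err <- tryCatch(
  read.table(text = bad, header = TRUE, sep = ","),
  error = function(e) conditionMessage(e)
)
print(err)

# Same data with a trailing delimiter added to the header row,
# so the header and the data rows both have 4 fields.
fixed <- "v1,v2,v3,
a1,a2,a3,
a1,b2,b3,"

ok <- read.table(text = fixed, header = TRUE, sep = ",")
print(ok)  # the extra, empty fourth column is read as NA
```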

User · Answer

In my case, there was a comma at the end of every line. Removing it fixed the problem.
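If editing the file by hand is impractical, the trailing commas can be stripped in R before parsing. A hedged sketch on made-up lines; for a real file, raw_lines would typically come from readLines() on your own path.

```r
# Toy lines (hypothetical) standing in for readLines("your_file.csv").
raw_lines <- c("v1,v2,v3", "a1,a2,a3,", "a1,b2,b3,")

# Drop a single trailing comma (and any trailing whitespace) from each line.
clean_lines <- sub(",\\s*$", "", raw_lines)

ok <- read.table(text = paste(clean_lines, collapse = "\n"),
                 header = TRUE, sep = ",")
print(ok)
```

Note that this also removes a trailing comma that marks a genuinely empty last field, so it is only appropriate when the trailing delimiter is known to be spurious.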

User · Answer

Then tell read.table not to use row.names:

systems <- read.table("http://getfile.pl?test.csv", header=TRUE, sep=",", row.names=NULL)

and now your rows will simply be numbered.

Also look at read.csv, which is a wrapper for read.table that already sets the sep=',' and header=TRUE arguments, so that your call simplifies to:

systems <- read.csv("http://getfile.pl?test.csv", row.names=NULL)
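A small sketch of row.names=NULL on made-up inline data whose header is one field short. It avoids the error, but it is worth inspecting the column names afterwards, since the header labels may no longer line up with the data they were meant for.

```r
# Toy data (hypothetical): header has 3 fields, data rows have 4,
# and the first column repeats "a1".
csv_text <- "v1,v2,v3
a1,a2,a3,
a1,b2,b3,"

df <- read.table(text = csv_text, header = TRUE, sep = ",", row.names = NULL)

print(rownames(df))  # rows are numbered rather than taken from the first column
print(names(df))     # check whether the labels still match the intended columns
print(df)
```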
