[r] How to add header to a dataset in R?

I need to read the ''wdbc.data' in the following data folder: http://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/

Doing this in R is easy using command read.csv but as the header is missing how can I add it? I have the information but don't know how to do this and I'd prefer do not edit the data file.

This question is related to r dataset statistics

The answer is


You can do the following:

Load the data:

test <- read.csv(
          "http://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.data",
          header=FALSE)

Note that the default value of the header argument for read.csv is TRUE so in order to get all lines you need to set it to FALSE.

Add names to the different columns in the data.frame

names(test) <- c("A","B","C","D","E","F","G","H","I","J","K")

or alternative and faster as I understand (not reloading the entire dataset):

colnames(test) <- c("A","B","C","D","E","F","G","H","I","J","K")

You can also solve this problem by creating an array of values and assigning that array:

newheaders <- c("a", "b", "c", ... "x")
colnames(data) <- newheaders

in case you are interested in reading some data from a .txt file and only extract few columns of that file into a new .txt file with a customized header, the following code might be useful:

# input some data from 2 different .txt files:
civit_gps <- read.csv(file="/path2/gpsFile.csv",head=TRUE,sep=",")
civit_cam <- read.csv(file="/path2/cameraFile.txt",head=TRUE,sep=",")

# assign the name for the output file:
seqName <- "seq1_data.txt"

#=========================================================
# Extract data from imported files
#=========================================================
# From Camera:
frame_idx <- civit_cam$X.frame
qx        <- civit_cam$q.x.rad.
qy        <- civit_cam$q.y.rad.
qz        <- civit_cam$q.z.rad.
qw        <- civit_cam$q.w

# From GPS:
gpsT      <- civit_gps$X.gpsTime.sec.
latitude  <- civit_gps$Latitude.deg.
longitude <- civit_gps$Longitude.deg.
altitude  <- civit_gps$H.Ell.m.
heading   <- civit_gps$Heading.deg.
pitch     <- civit_gps$pitch.deg.
roll      <- civit_gps$roll.deg.
gpsTime_corr <- civit_gps[frame_idx,1]

#=========================================================
# Export new data into the output txt file
#=========================================================
myData <- data.frame(c(gpsTime_corr),
                     c(frame_idx),
                     c(qx),
                     c(qy),
                     c(qz),
                     c(qw))
# Write :
cat("#GPSTime,frameIdx,qx,qy,qz,qw\n", file=seqName)
write.table(myData, file = seqName,row.names=FALSE,col.names=FALSE,append=TRUE,sep = ",")

Of course, you should modify this sample script based on your own application.


this should work out,

      kable(dt) %>%
      kable_styling("striped") %>%
      add_header_above(c(" " = 1, "Group 1" = 2, "Group 2" = 2, "Group 3" = 2))
#OR
kable(dt) %>%
  kable_styling(c("striped", "bordered")) %>%
  add_header_above(c(" ", "Group 1" = 2, "Group 2" = 2, "Group 3" = 2)) %>%
  add_header_above(c(" ", "Group 4" = 4, "Group 5" = 2)) %>%
  add_header_above(c(" ", "Group 6" = 6))

for more you can check the link


You can also use colnames instead of names if you have data.frame or matrix


Examples related to r

How to get AIC from Conway–Maxwell-Poisson regression via COM-poisson package in R? R : how to simply repeat a command? session not created: This version of ChromeDriver only supports Chrome version 74 error with ChromeDriver Chrome using Selenium How to show code but hide output in RMarkdown? remove kernel on jupyter notebook Function to calculate R2 (R-squared) in R Center Plot title in ggplot2 R ggplot2: stat_count() must not be used with a y aesthetic error in Bar graph R multiple conditions in if statement What does "The following object is masked from 'package:xxx'" mean?

Examples related to dataset

How to convert a Scikit-learn dataset to a Pandas dataset? Convert floats to ints in Pandas? Convert DataSet to List C#, Looping through dataset and show each record from a dataset column How to add header to a dataset in R? How I can filter a Datatable? Stored procedure return into DataSet in C# .Net Looping through a DataTable adding a datatable in a dataset How to fill Dataset with multiple tables?

Examples related to statistics

Function to calculate R2 (R-squared) in R pandas: find percentile stats of a given column What exactly does numpy.exp() do? Find p-value (significance) in scikit-learn LinearRegression How to plot ROC curve in Python Pandas - Compute z-score for all columns Calculating percentile of dataset column How to normalize an array in NumPy to a unit vector? How to find row number of a value in R code np.mean() vs np.average() in Python NumPy?