I need to read the ''wdbc.data' in the following data folder: http://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/
Doing this in R is easy using command read.csv but as the header is missing how can I add it? I have the information but don't know how to do this and I'd prefer do not edit the data file.
This question is related to
r
dataset
statistics
You can do the following:
Load the data:
test <- read.csv(
"http://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.data",
header=FALSE)
Note that the default value of the header
argument for read.csv
is TRUE
so in order to get all lines you need to set it to FALSE
.
Add names to the different columns in the data.frame
names(test) <- c("A","B","C","D","E","F","G","H","I","J","K")
or alternative and faster as I understand (not reloading the entire dataset):
colnames(test) <- c("A","B","C","D","E","F","G","H","I","J","K")
You can also solve this problem by creating an array of values and assigning that array:
newheaders <- c("a", "b", "c", ... "x")
colnames(data) <- newheaders
in case you are interested in reading some data from a .txt
file and only extract few columns of that file into a new .txt
file with a customized header, the following code might be useful:
# input some data from 2 different .txt files:
civit_gps <- read.csv(file="/path2/gpsFile.csv",head=TRUE,sep=",")
civit_cam <- read.csv(file="/path2/cameraFile.txt",head=TRUE,sep=",")
# assign the name for the output file:
seqName <- "seq1_data.txt"
#=========================================================
# Extract data from imported files
#=========================================================
# From Camera:
frame_idx <- civit_cam$X.frame
qx <- civit_cam$q.x.rad.
qy <- civit_cam$q.y.rad.
qz <- civit_cam$q.z.rad.
qw <- civit_cam$q.w
# From GPS:
gpsT <- civit_gps$X.gpsTime.sec.
latitude <- civit_gps$Latitude.deg.
longitude <- civit_gps$Longitude.deg.
altitude <- civit_gps$H.Ell.m.
heading <- civit_gps$Heading.deg.
pitch <- civit_gps$pitch.deg.
roll <- civit_gps$roll.deg.
gpsTime_corr <- civit_gps[frame_idx,1]
#=========================================================
# Export new data into the output txt file
#=========================================================
myData <- data.frame(c(gpsTime_corr),
c(frame_idx),
c(qx),
c(qy),
c(qz),
c(qw))
# Write :
cat("#GPSTime,frameIdx,qx,qy,qz,qw\n", file=seqName)
write.table(myData, file = seqName,row.names=FALSE,col.names=FALSE,append=TRUE,sep = ",")
Of course, you should modify this sample script based on your own application.
this should work out,
kable(dt) %>%
kable_styling("striped") %>%
add_header_above(c(" " = 1, "Group 1" = 2, "Group 2" = 2, "Group 3" = 2))
#OR
kable(dt) %>%
kable_styling(c("striped", "bordered")) %>%
add_header_above(c(" ", "Group 1" = 2, "Group 2" = 2, "Group 3" = 2)) %>%
add_header_above(c(" ", "Group 4" = 4, "Group 5" = 2)) %>%
add_header_above(c(" ", "Group 6" = 6))
for more you can check the link
You can also use colnames
instead of names if you have data.frame
or matrix
Source: Stackoverflow.com