On Sun, Mar 27, 2011 at 9:40 PM, AjayT <ajaytalati at googlemail.com> wrote:
> Hi, I'm new to R and I'm stuck trying to import some data from a .dat file
> I've been given. The tricky bit for me is that the data has both variable
> values and labels?
>
> The data looks like this,
>
> Id=1 time=2011-03-27 19:23:40 start=1.4018       end=1.4017
> Id=2 time=2011-03-27 19:23:40 start=1.8046       end=1.8047
> Id=1 time=2011-03-27 19:23:50 start=1.4017       end=1.4018
> Id=2 time=2011-03-27 19:23:50 start=1.8047       end=1.8046
>
> Is there a way to read the file into a dataframe or matrix, so each line of
> the file is read into a row, and the data labels are the columns? I'm
> trying to get it to look like this:
>
> Id  time                 start   end
> 1   2011-03-27 19:23:40  1.4018  1.4017
> 2   2011-03-27 19:23:40  1.8046  1.8047
> 1   2011-03-27 19:23:50  1.4017  1.4018
> 2   2011-03-27 19:23:50  1.8047  1.8046
>
> It's driving me nuts. Any help appreciated.

Here are a few ways. (You may need to adjust the widths in the first two
solutions.)

1. read.fwf can read fixed-width data:

widths <- c(3, 2, 5, 20, 6, 13, 4, 6)
# colClasses = "NULL" drops the label fields ("Id=", "time=", ...),
# so only the four value columns are kept.
read.fwf("myfile.dat", widths = widths,
         col.names = c(NA, "Id", NA, "time", NA, "start", NA, "end"),
         colClasses = c("NULL", "character", "NULL", "character",
                        "NULL", "numeric", "NULL", "numeric"))

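2. Or a variation which sets the names automatically. Here each label, each
"=" and each value is read as a separate field:

widths <- c(2, 1, 2, 4, 1, 20, 5, 1, 13, 3, 1, 6)
DF <- read.fwf("myfile.dat", widths = widths, as.is = TRUE)
# values are in columns 3, 6, 9, 12; their labels are two columns to the left
ix <- seq(3, 12, 3)
setNames(DF[ix], DF[1, ix-2])
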
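3. Or read it, change the delimiters and read it again with the new
delimiters. This also sets the names automatically and does not need to
know the widths.

L <- readLines("myfile.dat")
# turn each "label=" into ",label," so every line becomes comma-separated
L <- gsub(" *(\\w*)=", ",\\1,", L)
DF <- read.table(textConnection(L), sep = ",", as.is = TRUE)
# column 1 is empty; values are in columns 3, 5, 7, 9 and their labels
# are one column to the left
ix <- seq(3, 9, 2)
setNames(DF[ix], DF[1, ix-1])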
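
To try the third approach without the file, here is a minimal sketch that runs it on the four sample lines from the question. The exact run of spaces before "end=" and the UTC time zone given to as.POSIXct are assumptions, not something stated in the original post. read.table(text = ...) needs a reasonably recent R; textConnection(L2) should work the same way on older versions.

```r
# Sample lines from the question (the run of spaces before "end=" is assumed).
L <- c(
  "Id=1 time=2011-03-27 19:23:40 start=1.4018       end=1.4017",
  "Id=2 time=2011-03-27 19:23:40 start=1.8046       end=1.8047",
  "Id=1 time=2011-03-27 19:23:50 start=1.4017       end=1.4018",
  "Id=2 time=2011-03-27 19:23:50 start=1.8047       end=1.8046"
)

# Rewrite each "label=" as ",label," so a line becomes
# ",Id,1,time,2011-03-27 19:23:40,start,1.4018,end,1.4017"
L2 <- gsub(" *(\\w*)=", ",\\1,", L)

# Column 1 is empty; labels are in columns 2, 4, 6, 8, values in 3, 5, 7, 9.
DF <- read.table(text = L2, sep = ",", as.is = TRUE)
ix <- seq(3, 9, 2)
res <- setNames(DF[ix], unlist(DF[1, ix - 1], use.names = FALSE))

# Assumption: the timestamps are UTC; change tz to whatever the data uses.
res$time <- as.POSIXct(res$time, tz = "UTC")

str(res)
```

str(res) should then report Id as integer, time as POSIXct, and start and end as numeric, which is a quick way to check that the labels and values lined up as intended.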
--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com