Lauri Nikkinen
2009-Sep-08 11:53 UTC
[R] Data separated by spaces, getting data into R using field lengths
I have a text file similar to this (separated by spaces):

x <- "DF12 This is an example 1 This
DF12 This is an 1232 This is
DF14 This is 12334 This is an
DF15 This 23 This is an example
"

and I know the field lengths of each variable (there are 5 variables in this data set), which are:

varlength <- c(2, 2, 18, 5, 18)

How can I import this kind of data into R, using the varlength variable as a field separator indicator?
Duncan Murdoch
2009-Sep-08 11:59 UTC
[R] Data separated by spaces, getting data into R using field lengths
See ?read.fwf.

Duncan Murdoch
Barry Rowlingson
2009-Sep-08 11:59 UTC
[R] Data separated by spaces, getting data into R using field lengths
?read.fwf

Read Fixed Width Format Files

Description:

     Read a table of *f*ixed *w*idth *f*ormatted data into a 'data.frame'.

Usage:

     read.fwf(file, widths, header = FALSE, sep = "\t",
              skip = 0, row.names, col.names, n = -1,
              buffersize = 2000, ...)
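A minimal sketch of how read.fwf behaves on data that really is fixed-width, assuming every field is padded with spaces to its full width and there are no separators between fields. The padded lines, the helper `pad`, and the names `rows`, `fixed_lines` and `df` are made up for illustration; they are not the poster's actual file.

```r
# Field widths from the original post
widths <- c(2, 2, 18, 5, 18)

# Hypothetical field values, one character vector per record
rows <- list(
  c("DF", "12", "This is an example", "1",     "This"),
  c("DF", "12", "This is an",         "1232",  "This is"),
  c("DF", "14", "This is",            "12334", "This is an"),
  c("DF", "15", "This",               "23",    "This is an example")
)

# Hypothetical helper: right-pad each value with spaces to its field width
pad <- function(x, w) paste0(x, strrep(" ", w - nchar(x)))

# Build true fixed-width lines: each is exactly sum(widths) characters long
fixed_lines <- vapply(rows,
                      function(r) paste(pad(r, widths), collapse = ""),
                      character(1))

# read.fwf accepts a connection; extra arguments are passed on to read.table
con <- textConnection(fixed_lines)
df <- read.fwf(con, widths = widths,
               stringsAsFactors = FALSE, strip.white = TRUE)
close(con)

print(df)
```

With input laid out this way, each column boundary falls exactly between two fields, so `widths` alone is enough to split the records.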
Lauri Nikkinen
2009-Sep-08 12:07 UTC
[R] Data separated by spaces, getting data into R using field lengths
Thanks, I tried it but I got

> varlength <- c(2, 2, 18, 5, 18)
> read.fwf("c:temppi.txt", widths=varlength)
  V1 V2                 V3    V4   V5
1 DF 12  This is an exampl e 1 T  his
2 DF 12  This is an 1232 T his i    s
3 DF 14  This is 12334 Thi s is   an
4 DF 15  This 23 This is a n exa mple

Which is not the way I want it.

structure(list(V1 = structure(c(1L, 1L, 1L, 1L), .Label = "DF", class = "factor"),
    V2 = c(12L, 12L, 14L, 15L), V3 = structure(c(4L, 3L, 2L,
    1L), .Label = c(" This 23 This is a", " This is 12334 Thi",
    " This is an 1232 T", " This is an exampl"), class = "factor"),
    V4 = structure(c(1L, 2L, 4L, 3L), .Label = c("e 1 T", "his i",
    "n exa", "s is "), class = "factor"), V5 = structure(c(2L,
    4L, 1L, 3L), .Label = c("an ", "his", "mple", "s"), class = "factor")), .Names = c("V1",
"V2", "V3", "V4", "V5"), class = "data.frame", row.names = c(NA,
-4L))

Any ideas?
-L
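The output above can be reproduced by hand, which shows why the columns are cut in the middle of words. This is a small sketch, assuming the first line of the file is spaced exactly as shown in the post; the names `first_line`, `starts`, `ends` and `pieces` are illustrative.

```r
# First record as posted: fields are separated by single spaces, not padded
first_line <- "DF12 This is an example 1 This"
widths <- c(2, 2, 18, 5, 18)

# Character positions where each fixed-width field would start and end
ends   <- cumsum(widths)
starts <- ends - widths + 1

# Cut the line at those positions, as a fixed-width reader would
pieces <- substring(first_line, starts, ends)
print(pieces)

# The line is much shorter than the declared record length
nchar(first_line)   # 30
sum(widths)         # 45
```

The third piece stops at "exampl" because the 18-character field starts at the separating space rather than at "This". In other words, the widths describe maximum field lengths, while the file itself is delimited by spaces.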
jim holtman
2009-Sep-08 12:11 UTC
[R] Data separated by spaces, getting data into R using field lengths
Can you post how you would like it?

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Lauri Nikkinen
2009-Sep-08 12:15 UTC
[R] Data separated by spaces, getting data into R using field lengths
Sure, here you go

structure(list(V1 = structure(c(1L, 1L, 1L, 1L), .Label = "DF", class = "factor"),
    V2 = c(12L, 12L, 14L, 15L), V3 = structure(c(4L, 3L, 2L,
    1L), .Label = c("This", "This is", "This is an", "This is an example"
    ), class = "factor"), V4 = c(1L, 1232L, 12334L, 23L), V5 = structure(1:4, .Label = c("This",
    "This is", "This is an", "This is an example"), class = "factor")), .Names = c("V1",
"V2", "V3", "V4", "V5"), class = "data.frame", row.names = c(NA,
-4L))
jim holtman
2009-Sep-08 12:20 UTC
[R] Data separated by spaces, getting data into R using field lengths
This bears no relationship to what you were first asking. It looks like you want to split the leading 4 characters into two groups of two and then split the remaining data into three parts based on the numerics in the middle. Is this correct?
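One way to sketch that interpretation is with a regular expression instead of read.fwf. This is only an illustration under stated assumptions: the two text fields contain no digits, fields are separated by at least one space, and every line matches the pattern. The names `lines_in`, `pattern`, `parts` and `result` are hypothetical.

```r
# The four records as posted (space-separated, variable-length fields)
lines_in <- c("DF12 This is an example 1 This",
              "DF12 This is an 1232 This is",
              "DF14 This is 12334 This is an",
              "DF15 This 23 This is an example")

# Two leading 2-character codes, a text field without digits,
# an integer in the middle, then the remaining text
pattern <- "^(..)(..) +([^0-9]+) +([0-9]+) +(.*)$"

# regexec() returns the full match plus one entry per capture group;
# a line that does not match would yield character(0) and break the rbind
m <- regmatches(lines_in, regexec(pattern, lines_in))
parts <- do.call(rbind, m)   # column 1 is the full match, 2..6 the groups

result <- data.frame(V1 = parts[, 2],
                     V2 = as.integer(parts[, 3]),
                     V3 = trimws(parts[, 4]),
                     V4 = as.integer(parts[, 5]),
                     V5 = trimws(parts[, 6]),
                     stringsAsFactors = FALSE)
print(result)
```

The sketch returns character columns rather than the factors in the posted target structure; wrapping V1, V3 and V5 in factor() would match it more closely.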
Philipp Pagel
2009-Sep-08 12:33 UTC
[R] Data separated by spaces, getting data into R using field lengths
I am not totally sure what exactly the expected result is. From your description I got the impression that your data file uses a mixture of separation characters and fixed-width formatting. Maybe I misinterpreted your example.

Have a look at read.fwf() and if that does not solve your problem maybe explain the structure and expected result a little further.

cu
Philipp

--
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
85350 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/