Lauri Nikkinen
2009-Sep-08 11:53 UTC
[R] Data separated by spaces, getting data into R using field lengths
I have a text file similar to this (separated by spaces):

x <- "DF12 This is an example 1 This
DF12 This is an 1232 This is
DF14 This is 12334 This is an
DF15 This 23 This is an example
"

and I know the field lengths of each variable (there are 5 variables in this data set), which are:

varlength <- c(2, 2, 18, 5, 18)

How can I import this kind of data into R, using the varlength variable as a field separator indicator?
Duncan Murdoch
2009-Sep-08 11:59 UTC
[R] Data separated by spaces, getting data into R using field lengths
See ?read.fwf.

Duncan Murdoch
Barry Rowlingson
2009-Sep-08 11:59 UTC
[R] Data separated by spaces, getting data into R using field lengths
?read.fwf

Read Fixed Width Format Files

Description:

     Read a table of *f*ixed *w*idth *f*ormatted data into a 'data.frame'.

Usage:

     read.fwf(file, widths, header = FALSE, sep = "\t", skip = 0,
              row.names, col.names, n = -1, buffersize = 2000, ...)
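As a minimal sketch of how read.fwf uses the widths vector, assume a file in which every field really is padded to its stated width. The padded sample below is made up for illustration and is not the poster's actual file. The strip.white and stringsAsFactors arguments are passed through to read.table.

```r
varlength <- c(2, 2, 18, 5, 18)

# Hypothetical data: each field is padded to exactly its width,
# so every row is 2 + 2 + 18 + 5 + 18 = 45 characters long.
rows <- sprintf("%-2s%-2s%-18s%5s%-18s",
                "DF",
                c("12", "12", "14", "15"),
                c("This is an example", "This is an", "This is", "This"),
                c("1", "1232", "12334", "23"),
                c("This", "This is", "This is an", "This is an example"))

# Write the sample to a temporary file and read it back by width.
tmp <- tempfile(fileext = ".txt")
writeLines(rows, tmp)

df <- read.fwf(tmp, widths = varlength,
               strip.white = TRUE, stringsAsFactors = FALSE)
unlink(tmp)

str(df)
```

read.fwf cuts each line purely by character position, so this only gives clean columns when the file is padded like the sample above.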
Lauri Nikkinen
2009-Sep-08 12:07 UTC
[R] Data separated by spaces, getting data into R using field lengths
Thanks, I tried it but I got

> varlength <- c(2, 2, 18, 5, 18)
> read.fwf("c:temppi.txt", widths=varlength)
  V1 V2                 V3    V4   V5
1 DF 12  This is an exampl e 1 T  his
2 DF 12  This is an 1232 T his i    s
3 DF 14  This is 12334 Thi s is    an
4 DF 15  This 23 This is a n exa mple

Which is not the way I want it.

structure(list(V1 = structure(c(1L, 1L, 1L, 1L), .Label = "DF", class = "factor"),
    V2 = c(12L, 12L, 14L, 15L), V3 = structure(c(4L, 3L, 2L, 1L),
    .Label = c(" This 23 This is a", " This is 12334 Thi", " This is an 1232 T",
    " This is an exampl"), class = "factor"), V4 = structure(c(1L, 2L, 4L, 3L),
    .Label = c("e 1 T", "his i", "n exa", "s is "), class = "factor"),
    V5 = structure(c(2L, 4L, 1L, 3L), .Label = c("an ", "his", "mple", "s"),
    class = "factor")), .Names = c("V1", "V2", "V3", "V4", "V5"),
    class = "data.frame", row.names = c(NA, -4L))

Any ideas?
-L
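The output above follows from read.fwf cutting each line strictly by character position. A small sketch with substring() reproduces the cuts for the first row. It assumes the line is separated by single spaces, which is what the read.fwf output suggests; the names starts, ends and fields are only for illustration.

```r
line <- "DF12 This is an example 1 This"
varlength <- c(2, 2, 18, 5, 18)

# Turn the widths into start/end character positions.
ends   <- cumsum(varlength)        # 2, 4, 22, 27, 45
starts <- ends - varlength + 1     # 1, 3, 5, 23, 28

# substring() is vectorised over the positions; the last field is
# simply cut short because the line has only 30 characters.
fields <- substring(line, starts, ends)
print(fields)
```

The third field starts at the space after "DF12" and runs 18 characters regardless of word boundaries, which is why "example" is split between V3 and V4.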
jim holtman
2009-Sep-08 12:11 UTC
[R] Data separated by spaces, getting data into R using field lengths
Can you post how you would like it?

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Lauri Nikkinen
2009-Sep-08 12:15 UTC
[R] Data separated by spaces, getting data into R using field lengths
Sure, here you go

structure(list(V1 = structure(c(1L, 1L, 1L, 1L), .Label = "DF", class = "factor"),
    V2 = c(12L, 12L, 14L, 15L), V3 = structure(c(4L, 3L, 2L, 1L),
    .Label = c("This", "This is", "This is an", "This is an example"),
    class = "factor"), V4 = c(1L, 1232L, 12334L, 23L), V5 = structure(1:4,
    .Label = c("This", "This is", "This is an", "This is an example"),
    class = "factor")), .Names = c("V1", "V2", "V3", "V4", "V5"),
    class = "data.frame", row.names = c(NA, -4L))
jim holtman
2009-Sep-08 12:20 UTC
[R] Data separated by spaces, getting data into R using field lengths
This bears no relationship to what you were first asking. It looks like you want to split the leading 4 characters into two groups of two and then split the remaining data into three parts based on the numerics in the middle. Is this correct?
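If that reading is right, one possible way to get the requested columns is a regular expression rather than read.fwf. This is a sketch only, not something proposed in the thread. It assumes every line has the form: two characters, two digits, text, one run of digits, text, all separated by single spaces, and that the text fields contain no digits. The names pattern, parts and m are illustrative.

```r
lines <- c("DF12 This is an example 1 This",
           "DF12 This is an 1232 This is",
           "DF14 This is 12334 This is an",
           "DF15 This 23 This is an example")

# Drop stray leading/trailing blanks before matching.
lines <- trimws(lines)

# Groups: 2 chars, 2 chars, text, a run of digits, text.
pattern <- "^(.{2})(.{2}) +(.+) +([0-9]+) +(.+)$"
parts <- regmatches(lines, regexec(pattern, lines))

# Each element holds the full match followed by the five groups.
m <- do.call(rbind, parts)

df <- data.frame(V1 = m[, 2],
                 V2 = as.integer(m[, 3]),
                 V3 = m[, 4],
                 V4 = as.integer(m[, 5]),
                 V5 = m[, 6],
                 stringsAsFactors = FALSE)
print(df)
```

A line that does not match the pattern yields an empty element in parts, so real data would need a check on lengths(parts) before the rbind step.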
Philipp Pagel
2009-Sep-08 12:33 UTC
[R] Data separated by spaces, getting data into R using field lengths
I am not totally sure what exactly the expected result is. From your description I got the impression that your data file uses a mixture of separation characters and fixed-width formatting. Maybe I misinterpreted your example.

Have a look at read.fwf() and, if that does not solve your problem, maybe explain the structure and expected result a little further.

cu
Philipp

--
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
85350 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/