thr3ads.net - R help - [R] a simple problem [Mar 2011]


Asan Ramzan

2011-Mar-04 14:50 UTC

[R] a simple problem

Hello R-help
 
I am working with a large data table that has an occasional label
marking a particular time point in an experiment, e.g.:

"Time (min)", "R1 R1", "R2 R1", "R3 R1", "R4 R1"
.909, 1.117, 1.225, 1.048, 1.258
3.942, 1.113, 1.230, 1.049, 1.262
3.976, 1.105, 1.226, 1.051, 1.259
4.009, 1.114, 1.231, 1.053, 1.259
4.042, 1.107, 1.230, 1.048, 1.262
4.076, 1.108, 1.226, 1.045, 1.257
4.109, 1.109, 1.227, 1.047, 1.259
4.142, 1.108, 1.225, 1.052, 1.260
4.176, 1.105, 1.222, 1.046, 1.260
4.209, 1.106, 1.226, 1.050, 1.258
4.242, 1.105, 1.224, 1.047, 1.258
4.276, 1.104, 1.223, 1.048, 1.259
4.309, 1.106, 1.228, 1.050, 1.260
4.342, 1.103, 1.219, 1.049, 1.260
4.376, 1.107, 1.225, 1.052, 1.259
4.409, 1.105, 1.222, 1.047, 1.258
4.442, 1.106, 1.227, 1.048, 1.262
4.476, 1.105, 1.222, 1.049, 1.261
4.509, 1.102, 1.222, 1.047, 1.259
4.555, "Gly sar"
4.555, 1.107, 1.224, 1.048, 1.261
4.576, 1.109, 1.228, 1.053, 1.259
4.609, 1.103, 1.218, 1.046, 1.258
4.642, 1.105, 1.223, 1.048, 1.256
4.676, 1.108, 1.217, 1.048, 1.260
4.709, 1.124, 1.222, 1.047, 1.258

When I try to read in the table, I get:

> try<-read.table("200810_01.R",header=T,sep=",")
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  line 136 did not have 5 elements

Is there any way to tell R to ignore these labels or, better
still, to interpret them as labels for particular time
points, so that a line graph drawn from the data is annotated
with them?



      

David Winsemius

2011-Mar-04 16:03 UTC


On Mar 4, 2011, at 9:50 AM, Asan Ramzan wrote:
> Hello R-help
>
> I am working with large data table that have the occasional label,
> a particular time point in an experiment. E.g:
>
> "Time (min)", "R1 R1", "R2 R1", "R3 R1", "R4 R1"
> .909, 1.117, 1.225, 1.048, 1.258
> 3.942, 1.113, 1.230, 1.049, 1.262
> 3.976, 1.105, 1.226, 1.051, 1.259
> 4.009, 1.114, 1.231, 1.053, 1.259
> 4.042, 1.107, 1.230, 1.048, 1.262
> 4.076, 1.108, 1.226, 1.045, 1.257
> 4.109, 1.109, 1.227, 1.047, 1.259
> 4.142, 1.108, 1.225, 1.052, 1.260
> 4.176, 1.105, 1.222, 1.046, 1.260
> 4.209, 1.106, 1.226, 1.050, 1.258
> 4.242, 1.105, 1.224, 1.047, 1.258
> 4.276, 1.104, 1.223, 1.048, 1.259
> 4.309, 1.106, 1.228, 1.050, 1.260
> 4.342, 1.103, 1.219, 1.049, 1.260
> 4.376, 1.107, 1.225, 1.052, 1.259
> 4.409, 1.105, 1.222, 1.047, 1.258
> 4.442, 1.106, 1.227, 1.048, 1.262
> 4.476, 1.105, 1.222, 1.049, 1.261
> 4.509, 1.102, 1.222, 1.047, 1.259
> 4.555, "Gly sar"
> 4.555, 1.107, 1.224, 1.048, 1.261
> 4.576, 1.109, 1.228, 1.053, 1.259
> 4.609, 1.103, 1.218, 1.046, 1.258
> 4.642, 1.105, 1.223, 1.048, 1.256
> 4.676, 1.108, 1.217, 1.048, 1.260
> 4.709, 1.124, 1.222, 1.047, 1.258
> When I try to read in the table, I get:
>> try<-read.table("200810_01.R",header=T,sep=",")
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines,  
> na.strings,  :
>  line 136 did not have 5 elements
>
> Is there any way to tell R to ignore these labels or better
> still interpret them as being label for particular time
> points, so when it comes to draw a line graph it is annotated
> with these labels.

Option 1:
Prepare your data properly with an editor before reading it in.

Option 2:
You could read the file with readLines, identify the offending lines
with grep or grepl, then separate the offenders from the non-offenders:

lines <- readLines(textConnection('"Time (min)", "R1 R1", "R2 R1", "R3 R1", "R4 R1"
.909, 1.117, 1.225, 1.048, 1.258
3.942, 1.113, 1.230, 1.049, 1.262
3.976, 1.105, 1.226, 1.051, 1.259
4.009, 1.114, 1.231, 1.053, 1.259
4.042, 1.107, 1.230, 1.048, 1.262
4.076, 1.108, 1.226, 1.045, 1.257
4.109, 1.109, 1.227, 1.047, 1.259
4.142, 1.108, 1.225, 1.052, 1.260
4.176, 1.105, 1.222, 1.046, 1.260
4.209, 1.106, 1.226, 1.050, 1.258
4.242, 1.105, 1.224, 1.047, 1.258
4.276, 1.104, 1.223, 1.048, 1.259
4.309, 1.106, 1.228, 1.050, 1.260
4.342, 1.103, 1.219, 1.049, 1.260
4.376, 1.107, 1.225, 1.052, 1.259
4.409, 1.105, 1.222, 1.047, 1.258
4.442, 1.106, 1.227, 1.048, 1.262
4.476, 1.105, 1.222, 1.049, 1.261
4.509, 1.102, 1.222, 1.047, 1.259
4.555, "Gly sar"
4.555, 1.107, 1.224, 1.048, 1.261
4.576, 1.109, 1.228, 1.053, 1.259
4.609, 1.103, 1.218, 1.046, 1.258
4.642, 1.105, 1.223, 1.048, 1.256
4.676, 1.108, 1.217, 1.048, 1.260
4.709, 1.124, 1.222, 1.047, 1.258'))

  read.table(textConnection(
         lines[ c(TRUE, !grepl("[[:alpha:]]", lines)[-1]) ]),
              skip=1)

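  # skip=1 drops the header line, because the quotes and spaces don't
  # work well with R column naming conventions; no sep is given, so the
  # commas stay attached to the values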

        V1     V2     V3     V4    V5
1   .909, 1.117, 1.225, 1.048, 1.258
2  3.942, 1.113, 1.230, 1.049, 1.262
3  3.976, 1.105, 1.226, 1.051, 1.259

snipped
23 4.642, 1.105, 1.223, 1.048, 1.256
24 4.676, 1.108, 1.217, 1.048, 1.260
25 4.709, 1.124, 1.222, 1.047, 1.258

So even more compact would be:

read.table(textConnection(
         lines[  !grepl("[[:alpha:]]", lines) ] ) )
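
Neither call above passes sep, so the values come back as text with the commas still attached. Here is a minimal sketch of the same filter with sep = "," added, so that the columns are numeric. It assumes the file really is comma-separated as in the sample. The shortened sample, the object names (`sample_lines`, `dat`) and the column names are illustrative only, not taken from the original file.

```r
# Shortened, illustrative sample in the same layout as the posted data
sample_lines <- c('"Time (min)", "R1 R1", "R2 R1", "R3 R1", "R4 R1"',
                  '4.509, 1.102, 1.222, 1.047, 1.259',
                  '4.555, "Gly sar"',
                  '4.555, 1.107, 1.224, 1.048, 1.261',
                  '4.576, 1.109, 1.228, 1.053, 1.259')

# Drop every line that contains a letter (the header and the label lines)
numeric_lines <- sample_lines[!grepl("[[:alpha:]]", sample_lines)]

# sep = "," splits on the commas; strip.white removes the blank after each
# comma; col.names supplies syntactically valid names (assumed, not original)
dat <- read.table(text = numeric_lines, sep = ",", strip.white = TRUE,
                  col.names = c("time", "R1", "R2", "R3", "R4"))
str(dat)
```

Supplying col.names by hand sidesteps the quoted header with embedded spaces. For a real file, `sample_lines` would come from readLines() on the file instead.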

Using the non-negated grepl expression should get you all the "label"
lines (plus the header line, which also contains letters).
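
The original question also asked about annotating a line graph with the labels. One way that could be sketched is shown below. It assumes each label line has exactly two comma-separated fields (time, then a quoted label with no comma inside it). The shortened sample, the object names (`sample_lines`, `labs`, `dat`, `y`), the column names and the "Signal" axis title are all illustrative assumptions.

```r
# Shortened, illustrative sample in the same layout as the posted data
sample_lines <- c('"Time (min)", "R1 R1", "R2 R1", "R3 R1", "R4 R1"',
                  '4.509, 1.102, 1.222, 1.047, 1.259',
                  '4.555, "Gly sar"',
                  '4.555, 1.107, 1.224, 1.048, 1.261',
                  '4.576, 1.109, 1.228, 1.053, 1.259')

has_alpha <- grepl("[[:alpha:]]", sample_lines)
is_label  <- has_alpha
is_label[1] <- FALSE                  # first line is the header, not a label

# Split each label line at the comma: field 1 is the time, field 2 the label
parts <- strsplit(sample_lines[is_label], ",", fixed = TRUE)
labs <- data.frame(
  time  = as.numeric(sapply(parts, `[`, 1)),
  # strip surrounding blanks and double quotes from the label text
  label = gsub('^[[:space:]"]+|[[:space:]"]+$', "", sapply(parts, `[`, 2)),
  stringsAsFactors = FALSE
)

# Numeric rows, read with an explicit separator and hand-made column names
dat <- read.table(text = sample_lines[!has_alpha], sep = ",",
                  strip.white = TRUE,
                  col.names = c("time", "R1", "R2", "R3", "R4"))
y <- as.matrix(dat[-1])

# One line per data column, a dashed vertical line and text at each label
matplot(dat$time, y, type = "l", lty = 1,
        xlab = "Time (min)", ylab = "Signal")
abline(v = labs$time, lty = 2)
text(labs$time, max(y), labs$label, pos = 4)
```

The label is drawn at the top of the plotted range, just to the right of the dashed line. The placement arguments to text() can be adjusted to taste.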


David Winsemius, MD
Heritage Laboratories
West Hartford, CT
