thr3ads.net - R help - [R] Skipping specified rows in scan or read.table [Apr 2008]

Ravi Varadhan

2008-Apr-09 18:37 UTC

[R] Skipping specified rows in scan or read.table

Hi,

 

I have a data file, certain lines of which are character fields.  I would
like to skip these rows, and read the data file as a numeric data frame.  I
know that I can skip lines at the beginning with read.table and scan, but is
there a way to skip a specified sequence of lines (e.g., 1, 2, 10, 11, 19,
20, 28, 29, etc.) ?  

 

If I read the entire data file, and then delete the character fields, the
values are still kept as factors, with each value denoted by its level.
Since I have continuous variables, there are as many levels as there are
values.  I am unable to coerce this to "numeric" mode.  Is there a way to do
this so that I can then manipulate the numeric data frame?

 

Thanks for any help.

Best,

Ravi.

-----------------------------------------------------------------------------------

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: rvaradhan@jhmi.edu

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 

-----------------------------------------------------------------------------------


jim holtman

2008-Apr-09 18:53 UTC

[R] Skipping specified rows in scan or read.table

Read the file in as lines of text (readLines), 'grep' through the
character vector to delete the lines you don't want, and then use:

read.table(textConnection(yourvector))

to read the corrected data in.
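
A minimal sketch of the steps above. The file contents, the regular expression, and the variable names are assumptions made for illustration; here an unwanted line is taken to be any line that contains a letter.

```r
# Hypothetical example file: two text lines precede each block of numbers.
path <- tempfile(fileext = ".txt")
writeLines(c("Block A", "x y",
             "1.5 2.5",
             "3.5 4.5",
             "Block B", "x y",
             "5.5 6.5"), path)

all_lines <- readLines(path)

# Assumption: a line is unwanted if it contains any letter. Note that this
# would also drop numbers written in scientific notation such as 1e-5.
keep <- !grepl("[A-Za-z]", all_lines)

# textConnection() returns an already open connection, so read.table()
# does not close it; close it explicitly afterwards.
con <- textConnection(all_lines[keep])
mydata <- read.table(con)
close(con)

str(mydata)  # both columns are read as numeric
```

In later versions of R, read.table(text = all_lines[keep]) does the same without managing the connection by hand.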

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

Abhijit Dasgupta

2008-Apr-09 19:37 UTC

[R] Skipping specified rows in scan or read.table

Hi Ravi,

One thing I tend to do when using read.table is to specify the option
colClasses = "character". This forces everything to be read as character
strings. From there, as.numeric works fine, and you don't have to deal
with factors and converting them back.
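
A short sketch of this idea, using a made-up file. The file contents and variable names are hypothetical, and dropping the rows that fail to convert is one possible follow-up step, not something stated above.

```r
# Hypothetical example file with a repeated text line among the numbers.
path <- tempfile(fileext = ".txt")
writeLines(c("x y", "1.5 2.5", "x y", "3.5 4.5"), path)

# Read every column as character, so no factors are created.
raw <- read.table(path, colClasses = "character")

# as.numeric() turns non-numeric text into NA (with a warning).
num <- suppressWarnings(as.data.frame(lapply(raw, as.numeric)))

# Assumption: any row containing an NA was a text row and can be dropped.
# This would also drop rows with genuinely missing values.
mydata <- num[complete.cases(num), ]

str(mydata)
```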

Hope this helps
Abhijit

Prof Brian Ripley

2008-Apr-09 19:43 UTC

[R] Skipping specified rows in scan or read.table

On Wed, 9 Apr 2008, Ravi Varadhan wrote:

> I have a data file, certain lines of which are character fields.  I would
> like to skip these rows, and read the data file as a numeric data frame.  I
> know that I can skip lines at the beginning with read.table and scan, but is
> there a way to skip a specified sequence of lines (e.g., 1, 2, 10, 11, 19,
> 20, 28, 29, etc.) ?

Not within scan, but you can do it within the connection that scan reads.

If the file is small, just read it all with readLines, select the lines
you want (mydata[-c(1,2,10,11...)]) and use that as the input to a
textConnection.  If it is large, read a line at a time, discard the line if
it is one to be skipped and otherwise write it to an anonymous file()
connection.  Then call read.table on the anonymous connection.
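
Both variants can be sketched as follows. The file contents, the skip vector, and all variable names are assumptions made for illustration.

```r
# Hypothetical example file and the line numbers to skip.
path <- tempfile(fileext = ".txt")
writeLines(c("Block A", "x y", "1.5 2.5", "3.5 4.5",
             "Block B", "x y", "5.5 6.5"), path)
skip <- c(1, 2, 5, 6)

## Small file: read everything, drop lines by negative index.
all_lines <- readLines(path)
con <- textConnection(all_lines[-skip])
small <- read.table(con)
close(con)

## Large file: copy the wanted lines, one at a time, to an anonymous file.
infile  <- file(path, open = "r")
scratch <- file()   # anonymous file connection, opened for reading and writing

line_no <- 0
while (length(line <- readLines(infile, n = 1)) > 0) {
  line_no <- line_no + 1
  if (!(line_no %in% skip)) writeLines(line, scratch)
}
close(infile)

mydata <- read.table(scratch)  # reads back what was just written
close(scratch)                 # closing also removes the anonymous file
```

Reading one line per call keeps memory use low but is slow in R; reading in larger chunks (a bigger n in readLines) and filtering each chunk would likely be faster.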

Or use perl/awk within a pipe() connection.
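
A sketch of the pipe() variant with awk. It assumes a Unix-like system with awk on the PATH; the file, the skip vector, and the variable names are hypothetical.

```r
# Hypothetical example file and the line numbers to skip.
path <- tempfile(fileext = ".txt")
writeLines(c("Block A", "x y", "1.5 2.5", "3.5 4.5",
             "Block B", "x y", "5.5 6.5"), path)
skip <- c(1, 2, 5, 6)

# Build an awk condition such as: NR != 1 && NR != 2 && ...
# NR is awk's current line number; lines where the condition holds are printed.
cond <- paste(sprintf("NR != %d", skip), collapse = " && ")
cmd  <- sprintf("awk '%s' %s", cond, shQuote(path))

# read.table() opens the pipe, reads awk's output, and closes it again.
mydata <- read.table(pipe(cmd))
```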

> If I read the entire data file, and then delete the character fields, the
> values are still kept as factors, with each value denoted by its level.
> Since, I have continuous variables, there are as many levels as there are
> values.  I am unable to coerce this to "numeric" mode.  Is there a way to do
> this so that I can then manipulate the numeric data frame?

Why does FAQ Q7.10 not apply?
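
For reference, R FAQ 7.10 covers converting factors to numeric. A small sketch of the pitfall and the usual conversions, with made-up values:

```r
# A column that was read as a factor because one entry is text.
f <- factor(c("3.5", "x", "10.25", "3.5"))

# Pitfall: as.numeric() on a factor returns the internal level codes.
codes <- as.numeric(f)

# Drop the text entry, then convert via the level labels.
f2 <- factor(f[f != "x"])
vals <- as.numeric(as.character(f2))

# Equivalent form, converting each level only once.
vals2 <- as.numeric(levels(f2))[as.integer(f2)]
```

Here codes holds level positions rather than the data, while vals and vals2 hold the numbers themselves.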
-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Paul Smith

2008-Apr-13 08:50 UTC

[R] Skipping specified rows in scan or read.table

Read the entire data file into a data frame, mydata, and then delete
the character fields. Afterwards, run

mydata <- edit(mydata)

and, inside the data editor, coerce the columns you want to numeric.

Paul
