thr3ads.net - R help - [R] Unwanted Levels in R [May 2002]

MATT BORKOWSKI

2002-May-21 13:34 UTC

[R] Unwanted Levels in R

I searched the mailing list archives and found one post relevant to my problem, 
but it was not specific enough to solve my problem.

I am reading in a large file of data (I believe it is in ASCII format).  I have
tried reading the file in as a data frame and as a list, and both methods
result in unwanted levels being created, which leads to problems when I try to
copy or reference certain cells.  I suspect the problem arises because there
are character strings randomly interspersed with the numeric data.  If this is
the problem, is there any way to overcome it?  Here are a few lines of the data
I'm attempting to read in:

A  900003024 ODEN   SWEDEN  ODEN91 NSIDC.ORG/PROJE 
B     900003         -9  1 NAN OBS         0
C 1991  9  7 13 -9 XX   90.0000     .0000 XX
D    36   10.0   10.1 4183.0 4270.7 4219.0 Z 13  0 OBSERV
E    -9.0   -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 
   10.0   10.1 -1.6970 -1.6971 31.4940 25.3287 34.8044 43.8535 -9.0000 

Here are the commands I have tried using to read in the data:
> alldata <- read.table("/home/mattb/xxx.dat", fill = TRUE, quote = "")
> alldata <- as.list(read.table("/home/mattb/xxx.dat", fill = TRUE, quote = ""))

Thanks for your time,

Matt
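
The suspicion above can be checked on a small scale. The following is only a sketch, not taken from the real file: the three input lines are made up from fragments of the sample, and `stringsAsFactors = TRUE` is set explicitly because R 4.0.0 and later no longer convert strings to factors by default. It shows that a single non-numeric entry turns a whole column into a factor, and that the numbers have to be recovered through `as.character()`.

```r
# Made-up mini input: two numeric records with one header-like record between.
txt <- c("10.0 10.1 -1.6970",
         "D 36 10.0",
         "25.0 25.3 -1.7050")

# stringsAsFactors = TRUE is an assumption that mimics the old default.
df <- read.table(text = txt, fill = TRUE, quote = "",
                 stringsAsFactors = TRUE)

# Because of the "D", the first column is a factor, not a numeric vector.
first_is_factor <- is.factor(df$V1)

# as.numeric() on a factor returns the internal level codes, so convert
# through character first; entries that are not numbers become NA.
first_values <- suppressWarnings(as.numeric(as.character(df$V1)))
```

Columns that contain only numbers (here `V2` and `V3`) stay numeric, which is why the levels appear only where header text shares a column with data.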


-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Peter Dalgaard BSA

2002-May-21 13:46 UTC

[R] Unwanted Levels in R

"MATT BORKOWSKI" <mpb170 at psu.edu> writes:
> is there anyway to overcome it?  Here are a few lines of the data I'm 
> attempting to read in:
> 
> A  900003024 ODEN   SWEDEN  ODEN91 NSIDC.ORG/PROJE 
> B     900003         -9  1 NAN OBS         0
> C 1991  9  7 13 -9 XX   90.0000     .0000 XX
> D    36   10.0   10.1 4183.0 4270.7 4219.0 Z 13  0 OBSERV
> E    -9.0   -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 
>    10.0   10.1 -1.6970 -1.6971 31.4940 25.3287 34.8044 43.8535 -9.0000 
> 
> Here are the commands I have tried using to read in the data:
> 
> > alldata <- read.table("/home/mattb/xxx.dat", fill = TRUE, quote = "")
> 
> > alldata <- as.list(read.table("/home/mattb/xxx.dat", fill = TRUE, quote = ""))

As far as I can see, there is no connection between values in the same
position in different lines? If so, trying to make a data frame out of
the file is simply inappropriate and you should rather use readLines
and postprocess the lines according to whatever logic they are
supposed to obey.
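
A minimal sketch of that approach follows. It is an illustration under stated assumptions, not a tested solution for the real file: the temporary file stands in for /home/mattb/xxx.dat, and the rule "a record starting with a letter A-E in column 1 is a header, anything else is numeric data" is inferred from the sample lines only.

```r
# Stand-in for the real file: the sample lines from the post, written to a
# temporary file so that readLines() has something to read.
sample_lines <- c(
  "A  900003024 ODEN   SWEDEN  ODEN91 NSIDC.ORG/PROJE ",
  "B     900003         -9  1 NAN OBS         0",
  "C 1991  9  7 13 -9 XX   90.0000     .0000 XX",
  "D    36   10.0   10.1 4183.0 4270.7 4219.0 Z 13  0 OBSERV",
  "E    -9.0   -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 ",
  "   10.0   10.1 -1.6970 -1.6971 31.4940 25.3287 34.8044 43.8535 -9.0000 "
)
path <- tempfile(fileext = ".dat")
writeLines(sample_lines, path)

# Read every line as plain text; nothing is converted to a factor here.
lines <- readLines(path)

# Assumption: header records start with a letter A-E in column 1,
# data records start with white space.
is_header <- grepl("^[A-E]", lines)
header_lines <- lines[is_header]

# Parse each data record into its own numeric vector.
data_values <- lapply(lines[!is_header],
                      function(x) scan(text = x, quiet = TRUE))
```

Each kind of header record can then be taken apart separately, for example with `substr()` or `strsplit()`, once its field layout is known.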

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907

MATT BORKOWSKI

2002-May-21 14:15 UTC

[R] Unwanted Levels in R

To clarify:
The lines beginning with A, B, C, D, and E are part of a header.  Below each
header are the lines of data values that correspond to it.  The problem is
that a number of data sets are combined in one file, so the header repeats
after a varying number of data lines.  Would it solve the problem to simply
treat the lines that begin with A, B, C, D, or E differently?  If so, how do
they need to be treated?
I've copied a bit more of the data below to demonstrate more clearly how the
data is arranged within the file.

A  900003024 ODEN     SWEDEN          ODEN91          NSIDC.ORG/PROJE 
B     900003     -9  1 NAN OBS         0
C 1991  9  7 13 -9 XX   90.0000     .0000 XX
D    36   10.0   10.1 4183.0 4270.7 4219.0 Z 13  0 OBSERV
E    -9.0   -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000
   25.0   25.3 -1.7050 -1.7054 31.4970 25.3313 34.8074 43.8571 -9.0000  8.630  
   50.0   50.6 -1.7400 -1.7408 32.3660 26.0382 35.5010 44.5377 -9.0000  8.280  
   89.0   90.0 -1.6550 -1.6566 32.8530 26.4320 35.8807 44.9043 -9.0000  7.430  
   109.0  110.3 -1.5420 -1.5444 33.8830 27.2659 36.6893 45.6886 -9.0000  7.360 
...
...
...
A  900002034 LOUIS ST: LAURENT   UNITED STATES   AO1994   NSIDC.ORG/PROJE 
B     900002         -9  1 NAN OBS         0
C 1994  8 20 22 -9 XX   89.0167  137.1517 XX
D    36   13.0   13.1 4075.0 4159.4 4075.0 Z 13  0 LASTLE
E    -9.0   -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 
  13.0   13.1 -1.7650 -1.7652 32.9160 26.4856 35.9403 44.9690 -9.0000  8.580 

Matt
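
One way the layout described above could be handled is sketched here. It rests on assumptions drawn only from the excerpt: every data set begins with an "A" record, header records start with a letter A-E in column 1, and all data records within one set have the same number of fields. The names `block_id` and `blocks` are made up for the illustration, and the "..." placeholder lines of the excerpt are left out.

```r
# The excerpt from the post, as it would come back from readLines().
lines <- c(
  "A  900003024 ODEN     SWEDEN          ODEN91          NSIDC.ORG/PROJE ",
  "B     900003     -9  1 NAN OBS         0",
  "C 1991  9  7 13 -9 XX   90.0000     .0000 XX",
  "D    36   10.0   10.1 4183.0 4270.7 4219.0 Z 13  0 OBSERV",
  "E    -9.0   -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000",
  "   25.0   25.3 -1.7050 -1.7054 31.4970 25.3313 34.8074 43.8571 -9.0000  8.630  ",
  "   50.0   50.6 -1.7400 -1.7408 32.3660 26.0382 35.5010 44.5377 -9.0000  8.280  ",
  "   89.0   90.0 -1.6550 -1.6566 32.8530 26.4320 35.8807 44.9043 -9.0000  7.430  ",
  "   109.0  110.3 -1.5420 -1.5444 33.8830 27.2659 36.6893 45.6886 -9.0000  7.360 ",
  "A  900002034 LOUIS ST: LAURENT   UNITED STATES   AO1994   NSIDC.ORG/PROJE ",
  "B     900002         -9  1 NAN OBS         0",
  "C 1994  8 20 22 -9 XX   89.0167  137.1517 XX",
  "D    36   13.0   13.1 4075.0 4159.4 4075.0 Z 13  0 LASTLE",
  "E    -9.0   -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 ",
  "  13.0   13.1 -1.7650 -1.7652 32.9160 26.4856 35.9403 44.9690 -9.0000  8.580 "
)

# Every "A" record opens a new data set, so a running count of "A" records
# labels each line with the data set it belongs to.
block_id  <- cumsum(grepl("^A", lines))
is_header <- grepl("^[A-E]", lines)

# For each data set keep the header records as text and read the remaining
# records as a purely numeric data frame.
blocks <- lapply(split(seq_along(lines), block_id), function(idx) {
  hdr <- is_header[idx]
  list(header = lines[idx][hdr],
       data   = read.table(text = lines[idx][!hdr], header = FALSE))
})
```

Because the header text never reaches `read.table()`, every column of each `data` element is numeric and no levels are created. A data set that consisted of header records only would need separate handling, since `read.table()` stops with an error on empty input.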


On Tue, 21 May 2002 15:46:58 +0200, Peter Dalgaard BSA
<p.dalgaard at biostat.ku.dk> wrote:
> "MATT BORKOWSKI" <mpb170 at psu.edu> writes:
> 
> > is there anyway to overcome it?  Here are a few lines of the data I'm
> > attempting to read in:
> > 
> > A  900003024 ODEN   SWEDEN  ODEN91 NSIDC.ORG/PROJE 
> > B     900003         -9  1 NAN OBS         0
> > C 1991  9  7 13 -9 XX   90.0000     .0000 XX
> > D    36   10.0   10.1 4183.0 4270.7 4219.0 Z 13  0 OBSERV
> > E    -9.0   -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000
> >    10.0   10.1 -1.6970 -1.6971 31.4940 25.3287 34.8044 43.8535 -9.0000
> > 
> > Here are the commands I have tried using to read in the data:
> > 
> > > alldata <- read.table("/home/mattb/xxx.dat", fill = TRUE, quote = "")
> > 
> > > alldata <- as.list(read.table("/home/mattb/xxx.dat", fill = TRUE, quote = ""))
> 
> As far as I can see, there is no connection between values in the same
> position in different lines? If so, trying to make a data frame out of
> the file is simply inappropriate and you should rather use readLines
> and postprocess the lines according to whatever logic they are
> supposed to obey.
> 
> -- 
>    O__  ---- Peter Dalgaard             Blegdamsvej 3  
>   c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
>  (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
