thr3ads.net - R help - [R] Question about levels/as.numeric [Apr 2011]

Thibault Vatter

2011-Apr-10 15:47 UTC

[R] Question about levels/as.numeric

Hi,

I am still new to R and this is my first post on this mailing-list.

I have two .csv (each one being a column of real numbers) coming from the
same database (the first one is just longer than the second) and I read them
in R the following way:

returns  <- read.csv("test.csv", header = FALSE)
returns2  <- read.csv("test2.csv", header = FALSE)

However, the two objects clearly don't seem to be equivalent:

> returns[2528:2537,1]
 [1] -0.002206 0.115696  -0.015192 0.008719  -0.004654 -0.010688 0.009453
0.002676  0.001334  -0.011326
7470 Levels: -0.000078 -0.000085 -0.000086 -0.0001 -0.000112 -0.000115
-0.000152 -0.000154 -0.000157 -0.00016 -0.000171 -0.000185 -0.000212
-0.000238 -0.000256 -0.000259 -0.000263 -0.000273 ... C

> returns2[1:10,1]
 [1] -0.002206  0.115696 -0.015192  0.008719 -0.004654 -0.010688  0.009453
0.002676  0.001334 -0.011326

> as.numeric(returns[2528:2537,1])
 [1]  341 7444 2244 5149  787 1717 5251 4122 3878 1811

> as.numeric(returns2[1:10,1])
 [1] -0.002206  0.115696 -0.015192  0.008719 -0.004654 -0.010688  0.009453
0.002676  0.001334 -0.011326

I would like to understand what's happening and how to handle the longer
one. This problem may seem stupid, but I've been trying to figure it out for
a while and nothing seems to work. I checked in Excel, and both seem to be
completely normal lists of real numbers.

What am I missing here? What are those "levels", and why doesn't as.numeric
work the same way on the longer one?

My final goal is to extract small parts of those columns the following way:

> cbind(returns[which(names == id)[2528:2537],1])
      [,1]
 [1,]  341
 [2,] 7444
 [3,] 2244
 [4,] 5149
 [5,]  787
 [6,] 1717
 [7,] 5251
 [8,] 4122
 [9,] 3878
[10,] 1811

Which should be equivalent to:

> cbind(returns2[which(names == id)[1:10],1])
           [,1]
 [1,] -0.002206
 [2,]  0.115696
 [3,] -0.015192
 [4,]  0.008719
 [5,] -0.004654
 [6,] -0.010688
 [7,]  0.009453
 [8,]  0.002676
 [9,]  0.001334
[10,] -0.011326

Thanks a lot,
Thibault

---------
*Thibault Vatter*
 EPFL- Master, 1ère année
Laboratory of Statistical Biophysics <lbs.epfl.ch>

Tel: +41 78 820 18 64
 @: thibault.vatter@epfl.ch
Web: personnes.epfl.ch/thibault.vatter


Petr Savicky

2011-Apr-10 21:22 UTC

On Sun, Apr 10, 2011 at 05:47:59PM +0200, Thibault Vatter wrote:
> Hi,
> 
> I am still new to R and this is my first post on this mailing-list.
> 
> I have two .csv (each one being a column of real numbers) coming from the
> same database (the first one is just longer than the second) and I read them
> in R the following way:
> 
> returns  <- read.csv("test.csv", header = FALSE)
> returns2  <- read.csv("test2.csv", header = FALSE)
> 
> However, the two objects clearly don't seem to be equivalent:
> 
> > returns[2528:2537,1]
>  [1] -0.002206 0.115696  -0.015192 0.008719  -0.004654 -0.010688 0.009453
> 0.002676  0.001334  -0.011326
> 7470 Levels: -0.000078 -0.000085 -0.000086 -0.0001 -0.000112 -0.000115
> -0.000152 -0.000154 -0.000157 -0.00016 -0.000171 -0.000185 -0.000212
> -0.000238 -0.000256 -0.000259 -0.000263 -0.000273 ... C

Hi.

It seems that the first file contains a non-numeric row. It may contain "C",
which is the last of the levels. In this case, the whole column is
considered as a character vector and is converted to a factor.
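
A minimal sketch of this effect, not taken from the original thread. The in-memory text is a hypothetical stand-in for test.csv, and stringsAsFactors = TRUE is passed explicitly because R 4.0.0 and later no longer convert strings to factors by default, unlike the R versions in use in 2011.

```r
# Hypothetical stand-in for test.csv: three numbers and one stray "C".
csv_text <- "-0.002206\n0.115696\nC\n-0.015192"

# stringsAsFactors = TRUE reproduces the default behaviour of R before 4.0.0.
returns <- read.csv(text = csv_text, header = FALSE, stringsAsFactors = TRUE)

class(returns[, 1])     # the column is a factor, not numeric
levels(returns[, 1])    # the distinct strings found in the column, including "C"
```

A single entry that cannot be parsed as a number is enough to make the whole column non-numeric.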
> > returns2[1:10,1]
>  [1] -0.002206  0.115696 -0.015192  0.008719 -0.004654 -0.010688  0.009453
> 0.002676  0.001334 -0.011326
> 
> > as.numeric(returns[2528:2537,1])
>  [1]  341 7444 2244 5149  787 1717 5251 4122 3878 1811

These are indices to the levels of the factor.
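
As an illustration only, the following sketch uses a small hypothetical factor f (not the poster's data) to show the difference between the integer codes and the values they stand for. Converting through as.character() first is one common way to get the numbers back.

```r
# Hypothetical factor built from strings, one of which is not a number.
f <- factor(c("-0.002206", "0.115696", "C", "-0.015192"))

codes <- as.numeric(f)   # positions in levels(f), not the printed values
labels <- levels(f)[f]   # the original strings, looked up via the codes

# Go through character to recover the numbers; "C" becomes NA.
vals <- suppressWarnings(as.numeric(as.character(f)))
vals
```

The warning that as.numeric() would normally give for "C" is suppressed here on purpose; in interactive work it is a useful hint that something in the column is not a number.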

Petr Savicky.

Peter Ehlers

2011-Apr-10 22:08 UTC

Thibault,

Your questions indicate that you would benefit enormously
from reading 'An Introduction to R'.

A very useful function is str().
Understanding the concept of "factors" is crucial in R.
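
A short sketch of what str() would reveal in a case like this one. The two data frames are hypothetical stand-ins for the poster's files, and stringsAsFactors = TRUE is set explicitly to mimic the default of R versions before 4.0.0.

```r
# Hypothetical stand-ins for test.csv (with a stray "C") and test2.csv.
returns  <- read.csv(text = "-0.002206\nC\n0.115696", header = FALSE,
                     stringsAsFactors = TRUE)
returns2 <- read.csv(text = "-0.002206\n0.115696", header = FALSE)

str(returns)    # V1 is reported as a Factor
str(returns2)   # V1 is reported as num
```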

"Checking" anything with Excel is never much use.

Peter Ehlers

Rolf Turner

2011-Apr-11 01:48 UTC

On 11/04/11 10:08, Peter Ehlers wrote:

<SNIP>
> "Checking" anything with Excel is never much use.
<SNIP>

Fortune?

     cheers,

         Rolf Turner

Petr Savicky

2011-Apr-11 06:18 UTC

On Sun, Apr 10, 2011 at 05:47:59PM +0200, Thibault Vatter wrote:
> Hi,
> 
> I am still new to R and this is my first post on this mailing-list.
> 
> I have two .csv (each one being a column of real numbers) coming from the
> same database (the first one is just longer than the second) and I read them
> in R the following way:
> 
> returns  <- read.csv("test.csv", header = FALSE)
> returns2  <- read.csv("test2.csv", header = FALSE)
> 
> However, the two objects clearly don't seem to be equivalent:
> 
> > returns[2528:2537,1]
>  [1] -0.002206 0.115696  -0.015192 0.008719  -0.004654 -0.010688 0.009453
> 0.002676  0.001334  -0.011326
> 7470 Levels: -0.000078 -0.000085 -0.000086 -0.0001 -0.000112 -0.000115
> -0.000152 -0.000154 -0.000157 -0.00016 -0.000171 -0.000185 -0.000212
> -0.000238 -0.000256 -0.000259 -0.000263 -0.000273 ... C

There is probably a non-numeric row in the data. To locate this
row, try the following:

  which(is.na(as.numeric(as.character(returns[, 1]))))

This will show the indices of the rows that cannot be converted
to numeric type.
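
A minimal sketch of how this check might be used end to end, assuming a small hypothetical data frame in place of the poster's file. Dropping the offending rows, as done at the end, is only one possible way to handle them.

```r
# Hypothetical stand-in for test.csv; pre-4.0.0 behaviour is requested explicitly.
returns <- read.csv(text = "-0.002206\n0.115696\nC\n-0.015192",
                    header = FALSE, stringsAsFactors = TRUE)

# Convert via character; entries that are not numbers become NA.
x <- suppressWarnings(as.numeric(as.character(returns[, 1])))

bad <- which(is.na(x))          # row indices that failed to convert
as.character(returns[bad, 1])   # the offending entries themselves

clean <- x[!is.na(x)]           # numeric vector without the bad rows
```

Note that a genuine missing value in the file would also show up in bad, so it is worth looking at the offending entries before dropping anything.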

Petr Savicky.

John Kane

2011-Apr-11 10:59 UTC

--- On Sun, 4/10/11, Rolf Turner <rolf.turner at xtra.co.nz> wrote:
> From: Rolf Turner <rolf.turner at xtra.co.nz>
> Subject: Re: [R] Question about levels/as.numeric
> To: r-help at r-project.org
> Received: Sunday, April 10, 2011, 9:48 PM
> On 11/04/11 10:08, Peter Ehlers
> wrote:
> 
> <SNIP>
> > "Checking" anything with Excel is never much use.
> 
> <SNIP>
> 
> Fortune?
> 
>     cheers,
> 
>         Rolf Turner

Definitely!
