thr3ads.net - R help - [R] log2() and -min() very quick question [Jun 2011]

Ben Ganzfried

2011-Jun-13 15:59 UTC

[R] log2() and -min() very quick question

I'm looking over good-code a post-doc in my lab wrote and trying to learn
how it works.  I came across the following:
rel.abundance <- as.matrix(read.delim("rel.abundance.csv", row.names=1, as.is=TRUE))
rel.abundance <- log2(rel.abundance-min(rel.abundance)+1)

I'm not sure what the second line is doing.  I ran each line in R and
couldn't see a noticeable difference in the output.  I assume log2() takes
the log base 2 of the values?  I'm not clear what -min(rel.abundance) is
doing either...my hunch would be that it would take the smallest value in
each row?
I'd really like to figure out:
1) What's actually going on?
2) Is there a good way to run a command over a large dataset in R and better
be able to tell what is going on?  More specifically, when I run each line
in R it looks something like this (w/ dif. values per row):
Archaea|Euryarchaeota|Methanobacteria|Methanobacteriales|Methanobacteriaceae|Methanobrevibacter,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,23,0,3,0,0,0


There are a lot of cells w/ values per row, which is one reason why I think
it is difficult to detect a pattern....

Thanks in advance!

Ben


jim holtman

2011-Jun-13 16:08 UTC


The second line rescales the data onto a log2 scale.  It subtracts the
minimum of the entire matrix (not of each row), so the smallest value
becomes 0, and then adds 1 so that no value is zero when the log is
taken: log2(0) is -Inf in R, whereas log2(1) is 0.  Here is an example
with sample data:
> x <- matrix(runif(25, -50, 50), 5)
> x
          [,1]      [,2]       [,3]       [,4]       [,5]
[1,] 29.730883  15.47239 -28.679186  47.617069 -48.692242
[2,] -4.472555 -14.68027 -37.062765  23.179251  21.556607
[3,] -8.991592 -22.97399  -2.188197 -14.327309 -39.681576
[4,] 31.087024  49.26841  42.407447  -6.852631  -5.371565
[5,] 10.493329  13.34933   9.876097 -35.178844  14.010105
> # scale to log2
> x <- log2(x - min(x) + 1)
> x
         [,1]     [,2]     [,3]     [,4]     [,5]
[1,] 6.311487 6.026017 4.393214 6.604506 0.000000
[2,] 5.498879 5.129776 3.658723 6.187283 6.154795
[3,] 5.346980 4.739754 5.569978 5.144248 3.323466
[4,] 6.335913 6.628783 6.525124 5.420873 5.469908
[5,] 5.911346 5.978232 5.896474 3.859313 5.993275

You should see a noticeable change between the data read in and the
result of the second statement.
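
Count-like data such as the row quoted in the question behaves differently from the random example above. The following is a minimal sketch, assuming the real matrix has an overall minimum of 0 (the quoted row suggests this, but the thread does not confirm it). The `counts` matrix is made up for illustration and is not taken from rel.abundance.csv.

```r
# Hypothetical stand-in for a slice of rel.abundance: mostly zeros, a few counts.
# matrix() fills column by column, so the columns are (0,0,1), (0,3,0), (0,23,0).
counts <- matrix(c(0, 0, 1, 0, 3, 0, 0, 23, 0), nrow = 3)

log2(0)                  # -Inf, which is why a plain log2() is avoided

shift <- min(counts)     # overall minimum of the whole matrix, 0 here
transformed <- log2(counts - shift + 1)

# Zeros map to log2(1) = 0, so a mostly-zero matrix still prints mostly zeros.
transformed == 0         # TRUE exactly where the original count was 0
transformed[3, 1]        # original value 1 becomes log2(2) = 1
transformed[2, 2]        # original value 3 becomes log2(4) = 2
```

Under that assumption the second line reduces to log2(x + 1), which may be why the printed output looks almost unchanged at a glance: only the few non-zero cells move.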


-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

Petr PIKAL

2011-Jun-13 16:14 UTC

[R] Odp: log2() and -min() very quick question

Hi

r-help-bounces at r-project.org wrote on 13.06.2011 17:59:03:
> From: Ben Ganzfried <ben.ganzfried at gmail.com>
> Sent by: r-help-bounces at r-project.org
> Date: 13.06.2011 17:59
> To: r-help at r-project.org
> Subject: [R] log2() and -min() very quick question
>
> I'm looking over good-code a post-doc in my lab wrote and trying to learn
> how it works.  I came across the following:
> rel.abundance <- as.matrix(read.delim("rel.abundance.csv", row.names=1, as.is=TRUE))
> rel.abundance <- log2(rel.abundance-min(rel.abundance)+1)
>
> I'm not sure what the second line is doing.  I ran each line in R and
> couldn't see a noticeable difference in the output.  I assume log2() takes
> the log base 2 of the values?  I'm not clear what -min(rel.abundance) is
> doing either...my hunch would be that it would take the smallest value in
> each row?

No. If rel.abundance is a matrix, min(rel.abundance) is the overall minimum:
> mat <- matrix(1:12, 3, 4)
> min(mat)
[1] 1

so
log2(rel.abundance-min(rel.abundance)+1)

subtracts the minimum value from every number, then adds 1 to every
number, takes the log base 2 of each number, and returns a matrix with the
same dimensions as the input matrix.
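
The same steps can be spelled out one at a time. This is only an illustrative sketch on the small `mat` from the reply; the intermediate names `shifted`, `plus1` and `result` are made up here, and the `apply()` call is included purely to contrast with the per-row minimum the question guessed at.

```r
mat <- matrix(1:12, 3, 4)

min(mat)              # a single minimum over all 12 cells: 1
apply(mat, 1, min)    # per-row minima, for comparison only: 1 2 3

# The one-liner split into its three steps
shifted <- mat - min(mat)   # smallest cell becomes 0
plus1   <- shifted + 1      # smallest cell becomes 1
result  <- log2(plus1)      # smallest cell becomes log2(1) = 0

dim(result)           # same shape as mat: 3 4
```

Because `min()` returns a single number, R recycles it across every cell of the matrix, so no row or column is treated separately.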
> I'd really like to figure out:
> 1) What's actually going on?
> 2) Is there a good way to run a command over a large dataset in R and better
> be able to tell what is going on?  More specifically, when I run each line
> in R it looks something like this (w/ dif. values per row):
> Archaea|Euryarchaeota|Methanobacteria|Methanobacteriales|Methanobacteriaceae|Methanobrevibacter,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,23,0,3,0,0,0
>
>
> There are a lot of cells w/ values per row, which is one reason why I think
> it is difficult to detect a pattern....

There are some summary and structure commands,

summary(data) or str(data)

which can give you some overall information about your data.
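
For a wide, mostly-zero matrix, a few more base R calls can help alongside summary() and str(). The sketch below uses simulated data as a hypothetical stand-in for rel.abundance; the dimensions, the Poisson rate and the name `x` are assumptions chosen only to give a sparse matrix of counts.

```r
set.seed(1)  # reproducible made-up data
# Hypothetical stand-in for rel.abundance: 50 rows x 200 columns, mostly zeros
x <- matrix(rpois(50 * 200, lambda = 0.05), nrow = 50)

str(x)                  # type and dimensions in one line
dim(x)                  # number of rows and columns
range(x)                # overall minimum and maximum
mean(x == 0)            # fraction of cells that are zero
table(x)                # how often each distinct value occurs
x[1:5, 1:8]             # a small corner instead of the full printout
summary(as.vector(x))   # one summary over all cells
```

Calling summary() directly on a matrix summarises each column separately, which gets long with hundreds of columns; wrapping the matrix in as.vector() gives a single summary instead. Running the same calls before and after the log2 line makes the change visible without reading every cell.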

Regards
Petr
