Dear Bert,
thank you for your response. Here is the piece of R code. Given the 3 data
frames below:
N <- data.frame(N=c("n1","n2","n3","n4"))
M <- data.frame(M=c("m1","m2","m3","m4","m5"))
C <- data.frame(n=c("n1","n2","n3"), m=c("m1","m1","m3"), I=c(100,300,400))
how shall I integrate N, M, and C so that in the end we have a data frame
with :
- the entries of N as the column names
- the entries of M as the row names
- the values in the cells of N * M taken from the numerical values in
the data frame C.
More precisely, the result shall be :
     n1   n2   n3   n4
m1  100  300    -    -
m2    -    -    -    -
m3    -    -  400    -
m4    -    -    -    -
m5    -    -    -    -
thank you !
On Mon, Jun 5, 2017 at 6:57 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
> Reproducible example, please. -- In particular, what exactly does C look
> like?
>
> (You should know this by now).
>
> -- Bert
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
>
>
> On Mon, Jun 5, 2017 at 6:45 PM, Bogdan Tanasa <tanasa at gmail.com> wrote:
> > Dear all,
> >
> > please could you advise on the R code I could use in order to do the
> > following operation :
> >
> > 1 -- I have 2 lists of "genome coordinates" : each list is composed of
> > numbers that represent genome coordinates;
> >
> > let's say list N :
> >
> > n1
> > n2
> > n3
> > n4
> >
> > and a list M:
> >
> > m1
> > m2
> > m3
> > m4
> > m5
> >
> > 2 -- and a data frame C, where for some pairs of coordinates (n,m) from
> > the lists above, we have a numerical intensity;
> >
> > for example :
> >
> > n1; m1; 100
> > n1; m2; 300
> >
> > The question would be : what is the most efficient R code I could use in
> > order to integrate the list N, the list M, and the data frame C, in order
> > to obtain a DATA FRAME with
> >
> > -- list N as the column names
> > -- list M as the row names
> > -- the values in the cells of N * M, corresponding to the numerical
> > values in the data frame C.
> >
> > A little example would be :
> >
> >      n1   n2   n3   n4
> > m1  100    -    -    -
> > m2  300    -    -    -
> > m3    -    -    -    -
> > m4    -    -    -    -
> > m5    -    -    -    -
> >
> > I wrote a script in perl, although I would like to do this in R.
> > Many thanks ;)
> > -- bogdan
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
Hi Bogdan,
Kinda messy, but:
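N <- data.frame(N=c("n1","n2","n3","n4"))
M <- data.frame(M=c("m1","m2","m3","m4","m5"))
C <- data.frame(n=c("n1","n2","n3"), m=c("m1","m1","m3"), I=c(100,300,400))
# rows are the M entries and columns the N entries, as requested
MN <- as.data.frame(matrix(NA, nrow=length(M[,1]), ncol=length(N[,1])))
names(MN) <- as.character(N[,1])
rownames(MN) <- as.character(M[,1])
# make sure the keys are character so that they index by name
C[,1] <- as.character(C[,1])
C[,2] <- as.character(C[,2])
# fill each (m, n) cell listed in C with its intensity
for(row in seq_len(nrow(C))) MN[C[row,2],C[row,1]] <- C[row,3]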
Jim
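The same fill can be done without a loop. This is only a sketch, not part of Jim's reply: it assumes it is acceptable to build a plain numeric matrix first and convert it to a data frame at the end. The names `rows`, `cols` and `mat` are made up for the illustration. It relies on base R allowing a two-column character matrix as an index into a matrix that has dimnames.

```r
N <- data.frame(N = c("n1", "n2", "n3", "n4"))
M <- data.frame(M = c("m1", "m2", "m3", "m4", "m5"))
C <- data.frame(n = c("n1", "n2", "n3"),
                m = c("m1", "m1", "m3"),
                I = c(100, 300, 400))

# as.character() keeps this working whether the columns are factors or strings
rows <- as.character(M$M)
cols <- as.character(N$N)

# empty M x N matrix, labelled by the two lists
mat <- matrix(NA_real_, nrow = length(rows), ncol = length(cols),
              dimnames = list(rows, cols))

# each row of the index matrix is one (row name, column name) pair
mat[cbind(as.character(C$m), as.character(C$n))] <- C$I

MN <- as.data.frame(mat)
```

Cells that do not appear in C stay NA, so a missing pair is not confused with an intensity of 0.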
Thank you Jim !
Here's another approach:
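N <- data.frame(N=c("n1","n2","n3","n4"))
M <- data.frame(M=c("m1","m2","m3","m4","m5"))
C <- data.frame(n=c("n1","n2","n3"), m=c("m1","m1","m3"), I=c(100,300,400))
# Rebuild the factors using M and N
# (as.character() rather than levels(), so this also works when M$M and N$N
# are character columns, the data.frame() default since R 4.0.0)
C$m <- factor(as.character(C$m), levels=as.character(M$M))
C$n <- factor(as.character(C$n), levels=as.character(N$N))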
MN <- xtabs(I~m+n, C)
print(MN, zero.print="-")
# n
# m n1 n2 n3 n4
# m1 100 300 - -
# m2 - - - -
# m3 - - 400 -
# m4 - - - -
# m5 - - - -
class(MN)
# [1] "xtabs" "table"
# MN is a table. If you want a data.frame
MN <- as.data.frame.matrix(MN)
class(MN)
# [1] "data.frame"
-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
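One thing to keep in mind with xtabs is that a cell with no data comes out as 0, which cannot be told apart from a measured intensity of 0. If that distinction matters, a possible variant (a sketch only, swapping in tapply for xtabs, and not something proposed in the thread) leaves empty cells as NA. The name `MN_df` is made up for the illustration.

```r
N <- data.frame(N = c("n1", "n2", "n3", "n4"))
M <- data.frame(M = c("m1", "m2", "m3", "m4", "m5"))
C <- data.frame(n = c("n1", "n2", "n3"),
                m = c("m1", "m1", "m3"),
                I = c(100, 300, 400))

# give both keys the full set of levels so that unused rows/columns are kept
C$m <- factor(as.character(C$m), levels = as.character(M$M))
C$n <- factor(as.character(C$n), levels = as.character(N$N))

# sum the intensities per (m, n) cell; cells with no rows in C become NA
MN <- tapply(C$I, list(m = C$m, n = C$n), sum)

# tapply returns a matrix; convert it if a data frame is needed
MN_df <- as.data.frame(MN)
```

As with xtabs, repeated (m, n) pairs in C are summed; replace `sum` with another function if a different rule is wanted.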
`xtabs` offers another route:

C$m <- factor(C$m, levels=M$M)
C$n <- factor(C$n, levels=N$N)

Option 1: Zeroes in the empty positions:

> (X <- xtabs(I ~ m+n , C, addNA=TRUE))
    n
m     n1  n2  n3  n4
  m1 100 300   0   0
  m2   0   0   0   0
  m3   0   0 400   0
  m4   0   0   0   0
  m5   0   0   0   0

Option 2: Sparse matrix:

> (X <- xtabs(I ~ m+n , C, sparse=TRUE))
5 x 4 sparse Matrix of class "dgCMatrix"
    n
m     n1  n2  n3 n4
  m1 100 300   .  .
  m2   .   .   .  .
  m3   .   . 400  .
  m4   .   .   .  .
  m5   .   .   .  .

I wasn't sure if the sparse results of xtabs would make a distinction
between 0 and NA, but happily it does:

> C <- data.frame(n=c("n1","n2","n3", "n3", "n4"), m=c("m1","m1","m3", "m4", "m5"), I=c(100,300,400, NA, 0))
> C
   n  m   I
1 n1 m1 100
2 n2 m1 300
3 n3 m3 400
4 n3 m4  NA
5 n4 m5   0
> (X <- xtabs(I ~ m+n , C, sparse=TRUE))
4 x 4 sparse Matrix of class "dgCMatrix"
    n
m     n1  n2  n3 n4
  m1 100 300   .  .
  m3   .   . 400  .
  m4   .   .   .  .
  m5   .   .   .  0

(In the example I forgot to repeat the lines that augmented the factor
levels, so m2 is not seen.)

-- David

David Winsemius
Alameda, CA, USA
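For completeness, here is a sketch of the second example with the factor levels restored, so that m2 gets its row back. It is not from the original reply. It uses the dense form of xtabs (no `sparse=TRUE`) to avoid depending on the Matrix package, and it wraps the level vectors in `as.character()` on the assumption that M$M and N$N may be either factors or strings.

```r
N <- data.frame(N = c("n1", "n2", "n3", "n4"))
M <- data.frame(M = c("m1", "m2", "m3", "m4", "m5"))

# the extended example data, including an NA and an explicit 0
C <- data.frame(n = c("n1", "n2", "n3", "n3", "n4"),
                m = c("m1", "m1", "m3", "m4", "m5"),
                I = c(100, 300, 400, NA, 0))

# re-apply the full level sets after C has been rebuilt
C$m <- factor(as.character(C$m), levels = as.character(M$M))
C$n <- factor(as.character(C$n), levels = as.character(N$N))

# dense cross-tabulation: every level of m and n now gets a row or column
X <- xtabs(I ~ m + n, C)
dim(X)
```

The point is only that the levels have to be set again whenever C is recreated; otherwise the table is limited to the values that happen to occur in C.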