thr3ads.net - R help - [R] how to create a txt file with parsed columns [Dec 2019]


Ana Marija

2019-Dec-09 00:12 UTC

[R] how to create a txt file with parsed columns

Hello,

I have two data frames:

head(a)
              GENE        rs       BETA
1  ENSG00000154803 rs2605134  0.0360182
2  ENSG00000154803 rs7405677  0.0525463
3  ENSG00000154803 rs7211573  0.0525531
4  ENSG00000154803 rs2746026  0.0466392
5  ENSG00000141030 rs2605134  0.0806140
6  ENSG00000141030 rs7405677  0.0251654
7  ENSG00000141030 rs7211573  0.0252775
8  ENSG00000141030 rs2746026  0.0976396
9  ENSG00000205309 rs2605134  0.0838975
10 ENSG00000205309 rs7405677 -0.2148500
11 ENSG00000205309 rs7211573 -0.2148170
12 ENSG00000205309 rs2746026  0.1013920
13 ENSG00000215030 rs2605134  0.1261050
14 ENSG00000215030 rs7405677  0.0165236
15 ENSG00000215030 rs7211573  0.0163509
16 ENSG00000215030 rs2746026  0.1201180
17 ENSG00000141026 rs2605134  0.0485897
18 ENSG00000141026 rs7405677 -0.0929964
19 ENSG00000141026 rs7211573 -0.0930321
20 ENSG00000141026 rs2746026  0.0623033

head(b)
          rs       GWAS
1  rs2605134  0.0315177
2  rs7405677 -0.0816389
3  rs7211573 -0.0797796
4  rs2746026  0.0199350
5 rs11658521  0.0728377
6  rs9914107  0.0720096
7 rs56964223  0.0723903

Data frame a has:
> length(unique(a$GENE))
[1] 51
> dim(a)
[1] 287   3

and the whole data frame b is shown

I would like to create a txt file that has one row for each rs in
data frame b and one column for each ENSG, holding the matching BETA
value. If a particular ENSG has no matching rs from data frame b, the
value under it would be zero. So the txt file would have 7 rows (one
for each unique rs in data frame b) and 53 columns (51 for the ENSGs,
one for the rs and one for GWAS).

So one row of that txt file would look like this.

GENES      ENSG00000154803  ENSG00000141030  ENSG00000205309  ENSG00000215030  ENSG00000141026  GWAS
rs2605134  0.0360182        0.0806140        0.0838975        0.1261050        0.0485897        0.0315177
?

Please advise,
Ana

Jim Lemon

2019-Dec-09 05:03 UTC


Hi Ana,
Is this what you want?

a<-read.table(text="GENE        rs       BETA
1  ENSG00000154803 rs2605134  0.0360182
2  ENSG00000154803 rs7405677  0.0525463
3  ENSG00000154803 rs7211573  0.0525531
4  ENSG00000154803 rs2746026  0.0466392
5  ENSG00000141030 rs2605134  0.0806140
6  ENSG00000141030 rs7405677  0.0251654
7  ENSG00000141030 rs7211573  0.0252775
8  ENSG00000141030 rs2746026  0.0976396
9  ENSG00000205309 rs2605134  0.0838975
10 ENSG00000205309 rs7405677 -0.2148500
11 ENSG00000205309 rs7211573 -0.2148170
12 ENSG00000205309 rs2746026  0.1013920
13 ENSG00000215030 rs2605134  0.1261050
14 ENSG00000215030 rs7405677  0.0165236
15 ENSG00000215030 rs7211573  0.0163509
16 ENSG00000215030 rs2746026  0.1201180
17 ENSG00000141026 rs2605134  0.0485897
18 ENSG00000141026 rs7405677 -0.0929964
19 ENSG00000141026 rs7211573 -0.0930321
20 ENSG00000141026 rs2746026  0.0623033",
header=TRUE,stringsAsFactors=FALSE)
b<-read.table(text="rs       GWAS
1  rs2605134  0.0315177
2  rs7405677 -0.0816389
3  rs7211573 -0.0797796
4  rs2746026  0.0199350
5 rs11658521  0.0728377
6  rs9914107  0.0720096
7 rs56964223  0.0723903",
header=TRUE,stringsAsFactors=FALSE)
ab<-merge(a,b,by="rs")
library(prettyR)
abc<-stretch_df(ab,idvar="rs",to.stretch=c("GENE","BETA"))
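
For comparison, the long-to-wide step on its own can also be sketched in base R with reshape(), with no extra package. This is not from the thread: the two-gene data frame below is a cut-down subset of the values posted above, and the name `wide` is only illustrative.

```r
# Toy subset of the posted data: two genes, two SNPs each.
a <- data.frame(
  GENE = c("ENSG00000154803", "ENSG00000154803",
           "ENSG00000141030", "ENSG00000141030"),
  rs   = c("rs2605134", "rs7405677", "rs2605134", "rs7405677"),
  BETA = c(0.0360182, 0.0525463, 0.0806140, 0.0251654),
  stringsAsFactors = FALSE
)

# Long -> wide: one row per rs, one BETA column per GENE.
wide <- reshape(a, idvar = "rs", timevar = "GENE", direction = "wide")

# reshape() names the new columns "BETA.<GENE>"; strip that prefix
# so the columns are named by gene, as in the requested layout.
names(wide) <- sub("^BETA\\.", "", names(wide))
rownames(wide) <- NULL

print(wide)
```

The columns come out in the order the genes first appear in `a`, which may or may not be the order wanted in the final file.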

Jim


Ana Marija

2019-Dec-09 16:38 UTC


Thanks for getting back to me. I resolved my problem with this:

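library(reshape2)
# Reshape to wide: one row per rs, one BETA column per GENE
c=dcast(a, rs ~ GENE, value.var="BETA")
# Append the GWAS column from b
d=merge(c,b,by="rs")
# Replace missing values with zero
d[is.na(d)] <- 0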
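
One thing to watch: merge() with only by= keeps just the rs values found in both tables, so any rs in b that never occurs in a would be dropped rather than filled with zeros. The sketch below is a possible variation, not the poster's code. It assumes all.y=TRUE is what is wanted to keep every row of b, uses a cut-down toy version of the data, and writes to a hypothetical file name in a temporary directory.

```r
library(reshape2)

# Toy subset of the posted data; b has one rs that is absent from a.
a <- data.frame(
  GENE = c("ENSG00000154803", "ENSG00000154803",
           "ENSG00000141030", "ENSG00000141030"),
  rs   = c("rs2605134", "rs7405677", "rs2605134", "rs7405677"),
  BETA = c(0.0360182, 0.0525463, 0.0806140, 0.0251654),
  stringsAsFactors = FALSE
)
b <- data.frame(
  rs   = c("rs2605134", "rs7405677", "rs11658521"),
  GWAS = c(0.0315177, -0.0816389, 0.0728377),
  stringsAsFactors = FALSE
)

# One row per rs, one BETA column per GENE.
wide <- dcast(a, rs ~ GENE, value.var = "BETA")

# all.y = TRUE keeps every rs from b, even those with no row in a.
out <- merge(wide, b, by = "rs", all.y = TRUE)

# Genes with no BETA for a given rs become zero, as requested.
out[is.na(out)] <- 0

# Write a tab-separated txt file (hypothetical file name).
out_file <- file.path(tempdir(), "genes_by_rs.txt")
write.table(out, out_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

With the default inner join the toy result would have two rows. With all.y=TRUE it has three, one per rs in b.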

