thr3ads.net - R help - [R] How to loop over two files ... [Jun 2020]

If this information is useful, please help other people find it:
Share via:

Ana Marija

2020-Jun-19 19:34 UTC

[R] How to loop over two files ...

Hello,

I have two files (each has 300 lines)like this:

head 1g.txt
rs6792369
rs1414517
rs16857712
rs16857703
rs12239392
...

head 1n.txt
rs1042779
rs2360630
rs10753597
rs7549096
rs2343491
...

For each pair of rs# from those two files I can run this command in R

library(httr)
library(jsonlite)
library(xml2)

server <- "http://rest.ensembl.org"
ext <-
"/ld/human/pairwise/rs6792369/rs1042779?population_name=1000GENOMES:phase_3:KHV"

r <- GET(paste(server, ext, sep = ""),
content_type("application/json"))

stop_for_status(r)
head(fromJSON(toJSON(content(r))))
   d_prime       r2 variation1 variation2         population_name
1 0.975513 0.951626  rs6792369  rs1042779 1000GENOMES:phase_3:KHV

What I would like to do is to do is to run this command for every SNP
in one list (1g.txt) to each SNP in another list (1n.txt). Where SNP#
is rs# and output every line of result in list.txt

The process is illustrated in the attachment.

Please help,
Ana

-------------- next part --------------
A non-text attachment was scrubbed...
Name: lists.pdf
Type: application/pdf
Size: 27519 bytes
Desc: not available
URL:
<https://stat.ethz.ch/pipermail/r-help/attachments/20200619/45aca40e/attachment.pdf>

cpoiw@rt m@iii@g oii chemo@org@uk

2020-Jun-19 20:28 UTC

head link

[R] How to loop over two files ...

so (untested) if you did something like

f1 <- read.text("1g.txt")
f2 <- read.text("1n.txt")

for ( a in as.list(f1) ) {

   for ( b in as.list(f2) ) {

ext <- paste0( "/ld/human/pairwise/",
                a,
                "/",
                b,
                "?population_name=1000GENOMES:phase_3:KHV")

                r <- GET(paste(server, ext, sep = ""), 
content_type("application/json"))

                # You presumably need to do something with 'r' at the 
moment its over written by the next loop..  were
                # you appending it to list.txt?  Possibly its just a bit 
of the R output you want.?

                write(r,file="list.txt",append=TRUE)


   }

}


Are we doing your PhD for you ;-)  Do we get to share ;-)


On 2020-06-19 20:34, Ana Marija wrote:> Hello,
> 
> I have two files (each has 300 lines)like this:
> 
> head 1g.txt
> rs6792369
> rs1414517
> rs16857712
> rs16857703
> rs12239392
> ...
> 
> head 1n.txt
> rs1042779
> rs2360630
> rs10753597
> rs7549096
> rs2343491
> ...
> 
> For each pair of rs# from those two files I can run this command in R
> 
> library(httr)
> library(jsonlite)
> library(xml2)
> 
> server <- "http://rest.ensembl.org"
> ext <-
>
"/ld/human/pairwise/rs6792369/rs1042779?population_name=1000GENOMES:phase_3:KHV"
> 
> r <- GET(paste(server, ext, sep = ""), 
> content_type("application/json"))
> 
> stop_for_status(r)
> head(fromJSON(toJSON(content(r))))
>    d_prime       r2 variation1 variation2         population_name
> 1 0.975513 0.951626  rs6792369  rs1042779 1000GENOMES:phase_3:KHV
> 
> What I would like to do is to do is to run this command for every SNP
> in one list (1g.txt) to each SNP in another list (1n.txt). Where SNP#
> is rs# and output every line of result in list.txt
> 
> The process is illustrated in the attachment.
> 
> Please help,
> Ana
> 
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Rasmus Liland

2020-Jun-19 20:49 UTC

head link

[R] How to loop over two files ...

On 2020-06-19 14:34 -0500, Ana Marija wrote:> 
> server <- "http://rest.ensembl.org"
> ext <-
"/ld/human/pairwise/rs6792369/rs1042779?population_name=1000GENOMES:phase_3:KHV"
> 
> r <- GET(paste(server, ext, sep = ""),
content_type("application/json"))
> 
> stop_for_status(r)
> head(fromJSON(toJSON(content(r))))
>    d_prime       r2 variation1 variation2         population_name
> 1 0.975513 0.951626  rs6792369  rs1042779 1000GENOMES:phase_3:KHV
> 
> What I would like to do is to do is to run this command for every SNP
> in one list (1g.txt) to each SNP in another list (1n.txt). Where SNP#
> is rs# and output every line of result in list.txt
Dear Ana,

I tried, but for some reason I get only a 
response for the first URL you supplied.  

I wrote this:

	files <- c("1g.txt", "1n.txt")
	files <- lapply(files, readLines)
	server <- "http://rest.ensembl.org"
	population.name <- "1000GENOMES:phase_3:KHV"
	ext <- apply(expand.grid(files), 1, function(x) {
	  return(paste0(server, "/ld/human/pairwise/",
	    x[1], "/", x[2],
	    "?population_name=", population.name))
	})
	
	# r <- lapply(ext, function(x) {
	#   httr::GET(x, httr::content_type("application/json"))
	# })
	# names(r) <- ext
	# file <- paste0(population.name, ".rds")
	# saveRDS(object=r, compress="xz", file=file)
	
	r <- readRDS(paste0(population.name, ".rds"))
	lapply(r[1:4], function(x) {
	  jsonlite::fromJSON(jsonlite::toJSON(httr::content(x)))
	})


Which if you are able to run it (saving the 
output in that rds file), yields this: 

	$`http://rest.ensembl.org/ld/human/pairwise/rs6792369/rs1042779?population_name=1000GENOMES:phase_3:KHV`
	  variation2         population_name  d_prime       r2 variation1
	1  rs1042779 1000GENOMES:phase_3:KHV 0.975513 0.951626  rs6792369
	
	$`http://rest.ensembl.org/ld/human/pairwise/rs1414517/rs1042779?population_name=1000GENOMES:phase_3:KHV`
	list()
	
	$`http://rest.ensembl.org/ld/human/pairwise/rs16857712/rs1042779?population_name=1000GENOMES:phase_3:KHV`
	list()
	
	$`http://rest.ensembl.org/ld/human/pairwise/rs16857703/rs1042779?population_name=1000GENOMES:phase_3:KHV`
	list()

For some reason, only the first url works ...

I am a bit unfamiliar working with REST 
API's.  Or web scraping in general.  Daniel 
Cegie?ka knows something in this thread some 
days ago, where it might be similar to the 
API of borsaitaliana.it, where you can supply 
headers with curl like he quickly did [2].

You might be able to supply the list of SNPs 
in a header to Ensemble in httr::GET somehow 
if you read some docs on their API? 

Best,
Rasmus

[1] https://marc.info/?t=159249246100002&r=1&w=2
[2] https://marc.info/?l=r-sig-finance&m=159249894208684&w=2

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL:
<https://stat.ethz.ch/pipermail/r-help/attachments/20200619/65f5d896/attachment.sig>

Ana Marija

2020-Jun-19 21:02 UTC

head link

[R] How to loop over two files ...

Hi,

thanks for getting back to me, it is just for my job :)

so I tried it:

library(httr)
library(jsonlite)
library(xml2)
library(SparkR, lib.loc = c(file.path(Sys.getenv("SPARK_HOME"),
"R", "lib")))
sparkR.session(master = "local[*]", sparkConfig
list(spark.driver.memory = "2g"))

server <- "http://rest.ensembl.org"

f1 <- read.text("1g.txt")
f2 <- read.text("1n.txt")

for ( a in as.list(f1) ) {

   for ( b in as.list(f2) ) {

ext <- paste0( "/ld/human/pairwise/",
                a,
                "/",
                b,
                "?population_name=1000GENOMES:phase_3:KHV")

                r <- GET(paste(server, ext, sep = ""),
content_type("application/json"))

                write(r,file="list.txt",append=TRUE)


   }

}

and I got this error:
Error in as.list.default(f1) :
  no method for coercing this S4 class to a vector

Please advise

On Fri, Jun 19, 2020 at 3:28 PM <cpolwart at chemo.org.uk>
wrote:>
> so (untested) if you did something like
>
> f1 <- read.text("1g.txt")
> f2 <- read.text("1n.txt")
>
> for ( a in as.list(f1) ) {
>
>    for ( b in as.list(f2) ) {
>
> ext <- paste0( "/ld/human/pairwise/",
>                 a,
>                 "/",
>                 b,
>                 "?population_name=1000GENOMES:phase_3:KHV")
>
>                 r <- GET(paste(server, ext, sep = ""),
> content_type("application/json"))
>
>                 # You presumably need to do something with 'r' at
the
> moment its over written by the next loop..  were
>                 # you appending it to list.txt?  Possibly its just a bit
> of the R output you want.?
>
>                 write(r,file="list.txt",append=TRUE)
>
>
>    }
>
> }
>
>
> Are we doing your PhD for you ;-)  Do we get to share ;-)
>
>
> On 2020-06-19 20:34, Ana Marija wrote:
> > Hello,
> >
> > I have two files (each has 300 lines)like this:
> >
> > head 1g.txt
> > rs6792369
> > rs1414517
> > rs16857712
> > rs16857703
> > rs12239392
> > ...
> >
> > head 1n.txt
> > rs1042779
> > rs2360630
> > rs10753597
> > rs7549096
> > rs2343491
> > ...
> >
> > For each pair of rs# from those two files I can run this command in R
> >
> > library(httr)
> > library(jsonlite)
> > library(xml2)
> >
> > server <- "http://rest.ensembl.org"
> > ext <-
> >
"/ld/human/pairwise/rs6792369/rs1042779?population_name=1000GENOMES:phase_3:KHV"
> >
> > r <- GET(paste(server, ext, sep = ""),
> > content_type("application/json"))
> >
> > stop_for_status(r)
> > head(fromJSON(toJSON(content(r))))
> >    d_prime       r2 variation1 variation2         population_name
> > 1 0.975513 0.951626  rs6792369  rs1042779 1000GENOMES:phase_3:KHV
> >
> > What I would like to do is to do is to run this command for every SNP
> > in one list (1g.txt) to each SNP in another list (1n.txt). Where SNP#
> > is rs# and output every line of result in list.txt
> >
> > The process is illustrated in the attachment.
> >
> > Please help,
> > Ana
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.

Ana Marija

2020-Jun-19 21:07 UTC

head link

[R] How to loop over two files ...

HI Rasmus,

I tried it:

library(base)

files <- c("1g.txt", "1n.txt")
        files <- lapply(files, readLines)
        server <- "http://rest.ensembl.org"
        population.name <- "1000GENOMES:phase_3:KHV"
        ext <- apply(expand.grid(files), 1, function(x) {
          return(paste0(server, "/ld/human/pairwise/",
            x[1], "/", x[2],
            "?population_name=", population.name))
        })

r <- readRDS(paste0(population.name, ".rds"))
        lapply(r[1:4], function(x) {
          jsonlite::fromJSON(jsonlite::toJSON(httr::content(x)))
        })

and I got this error:> r <- readRDS(paste0(population.name, ".rds"))Error in gzfile(file, "rb") : cannot open the connection
In addition: Warning message:
In gzfile(file, "rb") :
  cannot open compressed file '1000GENOMES:phase_3:KHV.rds', probable
reason 'No such file or directory'>         lapply(r[1:4], function(x) {+           jsonlite::fromJSON(jsonlite::toJSON(httr::content(x)))
+         })
Error in lapply(r[1:4], function(x) { : object 'r' not found

Am I am doing here something wrong?
Do I need any other libraries loaded?

Thanks
Ana

On Fri, Jun 19, 2020 at 3:49 PM Rasmus Liland <jral at posteo.no>
wrote:>
> On 2020-06-19 14:34 -0500, Ana Marija wrote:
> >
> > server <- "http://rest.ensembl.org"
> > ext <-
"/ld/human/pairwise/rs6792369/rs1042779?population_name=1000GENOMES:phase_3:KHV"
> >
> > r <- GET(paste(server, ext, sep = ""),
content_type("application/json"))
> >
> > stop_for_status(r)
> > head(fromJSON(toJSON(content(r))))
> >    d_prime       r2 variation1 variation2         population_name
> > 1 0.975513 0.951626  rs6792369  rs1042779 1000GENOMES:phase_3:KHV
> >
> > What I would like to do is to do is to run this command for every SNP
> > in one list (1g.txt) to each SNP in another list (1n.txt). Where SNP#
> > is rs# and output every line of result in list.txt
>
> Dear Ana,
>
> I tried, but for some reason I get only a
> response for the first URL you supplied.
>
> I wrote this:
>
>         files <- c("1g.txt", "1n.txt")
>         files <- lapply(files, readLines)
>         server <- "http://rest.ensembl.org"
>         population.name <- "1000GENOMES:phase_3:KHV"
>         ext <- apply(expand.grid(files), 1, function(x) {
>           return(paste0(server, "/ld/human/pairwise/",
>             x[1], "/", x[2],
>             "?population_name=", population.name))
>         })
>
>         # r <- lapply(ext, function(x) {
>         #   httr::GET(x, httr::content_type("application/json"))
>         # })
>         # names(r) <- ext
>         # file <- paste0(population.name, ".rds")
>         # saveRDS(object=r, compress="xz", file=file)
>
>         r <- readRDS(paste0(population.name, ".rds"))
>         lapply(r[1:4], function(x) {
>           jsonlite::fromJSON(jsonlite::toJSON(httr::content(x)))
>         })
>
>
> Which if you are able to run it (saving the
> output in that rds file), yields this:
>
>        
$`http://rest.ensembl.org/ld/human/pairwise/rs6792369/rs1042779?population_name=1000GENOMES:phase_3:KHV`
>           variation2         population_name  d_prime       r2 variation1
>         1  rs1042779 1000GENOMES:phase_3:KHV 0.975513 0.951626  rs6792369
>
>        
$`http://rest.ensembl.org/ld/human/pairwise/rs1414517/rs1042779?population_name=1000GENOMES:phase_3:KHV`
>         list()
>
>        
$`http://rest.ensembl.org/ld/human/pairwise/rs16857712/rs1042779?population_name=1000GENOMES:phase_3:KHV`
>         list()
>
>        
$`http://rest.ensembl.org/ld/human/pairwise/rs16857703/rs1042779?population_name=1000GENOMES:phase_3:KHV`
>         list()
>
> For some reason, only the first url works ...
>
> I am a bit unfamiliar working with REST
> API's.  Or web scraping in general.  Daniel
> Cegie?ka knows something in this thread some
> days ago, where it might be similar to the
> API of borsaitaliana.it, where you can supply
> headers with curl like he quickly did [2].
>
> You might be able to supply the list of SNPs
> in a header to Ensemble in httr::GET somehow
> if you read some docs on their API?
>
> Best,
> Rasmus
>
> [1] https://marc.info/?t=159249246100002&r=1&w=2
> [2] https://marc.info/?l=r-sig-finance&m=159249894208684&w=2

Rasmus Liland

2020-Jun-19 23:42 UTC

head link

[R] How to loop over two files ...

On 2020-06-19 14:34 -0500, Ana Marija wrote:> 
> I have two files (each has 300 lines)like this:
The example looks quite similar to the R example in 
https://rest.ensembl.org/documentation/info/ld_pairwise_get#ra 
...

The question becomes: how did you query the 
600 variant names in 1g.txt and 1n.txt?

  curl 'https://rest.ensembl.org/ld/human/pairwise/rs6792369/rs1042779?'
-H 'Content-type:application/json'

shows the 26 population_names for the 
rs6792369/rs1042779 combination ... 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL:
<https://stat.ethz.ch/pipermail/r-help/attachments/20200620/07eaf84e/attachment.sig>

Ana Marija

2020-Jun-20 00:34 UTC

head link

[R] How to loop over two files ...

Hi Rasmus,

I got those SNPs from two GWAS-es which I run with different
phenotypes and I would like to compare weather the top SNPs in both of
them are in LD.
So 1n.txt and 1g.txt are just top SNPs from those two GWAS-es.
Unfortunately https://ldlink.nci.nih.gov/?tab=ldpair works for only
two SNPs at the time and I need to do that for 300 pairs

On Fri, Jun 19, 2020 at 6:42 PM Rasmus Liland <jral at posteo.no>
wrote:>
> On 2020-06-19 14:34 -0500, Ana Marija wrote:
> >
> > I have two files (each has 300 lines)like this:
>
> The example looks quite similar to the R example in
> https://rest.ensembl.org/documentation/info/ld_pairwise_get#ra
> ...
>
> The question becomes: how did you query the
> 600 variant names in 1g.txt and 1n.txt?
>
>   curl
'https://rest.ensembl.org/ld/human/pairwise/rs6792369/rs1042779?' -H
'Content-type:application/json'
>
> shows the 26 population_names for the
> rs6792369/rs1042779 combination ...

R help - Jun 2020 - How to loop over two files ...

[R] How to loop over two files ...

[R] How to loop over two files ...

[R] How to loop over two files ...

[R] How to loop over two files ...

[R] How to loop over two files ...

[R] How to loop over two files ...

[R] How to loop over two files ...