My follow-up question went unanswered in another thread, so I am
starting a new one and rephrasing it.
I need to use CGI POST to retrieve data from a server, so, following
Duncan's suggestion, I used the httpRequest package as example code
for doing so. However, I have another problem: the data read from
the socket is occasionally corrupted. I include example code below to
demonstrate the problem, which occurs in R v2.3.1 (2006-06-01)
on both Linux and OS X.
For comparison, here is code that produces the correct output (I used
paste on the URL to prevent word-wrapping from clobbering it):
full.url<-paste("http://genomics11.bu.edu/cgi-bin/",
"Tractor_dev/external/get_msa.cgi?user_id=0&",
"table=seqs_ucsc_hg18&len=350&gene_set_ids=",
"NM_000029,NM_000064,NM_000066&orgs=Hs,",
"mm8,canFam2",sep="")
readLines(full.url)
To reproduce the incorrect results, the socket-based code is:
host<-"genomics11.bu.edu"
path<-"/cgi-bin/Tractor_dev/external/get_msa.cgi"
dat<-paste("user_id=0&table=seqs_ucsc_hg18&len=350&",
"gene_set_ids=NM_000029,NM_000064,",
"NM_000066&orgs=Hs,mm8,canFam2",sep="")
len <- nchar(dat) # number of characters in the POST body
request<-paste("POST ",path," HTTP/1.1\nHost: ",host,
"\nReferer:\nContent-type: application/x-www-form-urlencoded\n",
"Content-length: ",len,
"\nConnection: Keep-Alive\n\n",dat,sep="")
fp <- socketConnection(host=host,port=80,server=FALSE,blocking=TRUE)
write(request,fp)
socketSelect(list(fp)) # Wait until results are ready
readLines(fp)
close(fp)
So my question is: why do I get lines that are split up by hexadecimal
strings? For example, element 13 ('a1') splits what should be a single
line into elements 12 and 14. Am I misunderstanding what is needed to
read correctly from a socket, or is this a bug? Can anyone offer assistance?
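One possible cause worth checking (an assumption, not confirmed against
this server): because the request declares HTTP/1.1, the server is
allowed to reply with "Transfer-Encoding: chunked", where each chunk of
the body is preceded by a line giving its size in hexadecimal. A value
such as 'a1' (161 decimal) would fit that pattern. The sketch below shows
how such a body could be decoded once it is held in a single string.
decode_chunked is a hypothetical helper, not part of httpRequest or base
R, and it assumes single-byte (ASCII) content and CRLF line endings.

```r
# Hypothetical helper: decode an HTTP/1.1 chunked body held in one string.
# Assumes ASCII content, so character counts equal byte counts.
decode_chunked <- function(body) {
  out <- character(0)
  pos <- 1L
  repeat {
    # The chunk-size line ends at the next CRLF.
    rest <- substr(body, pos, nchar(body))
    eol <- as.integer(regexpr("\r\n", rest, fixed = TRUE))
    if (eol < 0L) stop("malformed chunked body: no CRLF after chunk size")
    size_field <- substr(rest, 1L, eol - 1L)
    # Drop any chunk extension after ';' and parse the hexadecimal size.
    size <- strtoi(sub(";.*$", "", size_field), base = 16L)
    if (is.na(size)) stop("malformed chunk size: ", size_field)
    if (size == 0L) break          # a zero-size chunk ends the body
    start <- pos + eol + 1L        # first character after the CRLF
    out <- c(out, substr(body, start, start + size - 1L))
    pos <- start + size + 2L       # skip the chunk data and its trailing CRLF
  }
  paste(out, collapse = "")
}

# Usage with a made-up body (not real server output):
example_body <- "5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n"
decoded <- decode_chunked(example_body)
```

If this is indeed the cause, sending the request as HTTP/1.0 instead of
HTTP/1.1 should also avoid chunked replies, since chunked encoding is an
HTTP/1.1 feature.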
Thanks.
--
Mike