Hi,
So you want to randomly throw away data? Doesn't sound like a good idea to
me...
You can get the combined data set using
data3 <- merge(data2, data1, all=TRUE)
From there it's just a matter of randomly deleting rows in which the
combination of areaid, x1 and y1 is duplicated. I'll leave that to
you, but I encourage you to think about whether this is really what
you want.
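In case it helps, here is a rough sketch of both steps on your example
data. It assumes the columns are built directly with data.frame() (so the
numbers stay numeric), set.seed() is there only to make the run
reproducible, and the names shuffled, picked and matched are just ones I
made up. Part (a) is the literal "shuffle, then drop duplicated controls"
deletion; note that it does not force each case to end up with exactly two
controls. Part (b) is one possible way to get a balanced split within each
areaid, not necessarily the best one.

```r
set.seed(42)  # assumption: fixed seed, only so the example is reproducible

data1 <- data.frame(areaid = c(1, 1, 2, 3),
                    x = c(1.2, 1.5, 0.2, 1.5),
                    y = c(1.3, 2.3, 3.3, 1.3),
                    date = c("3/23/2004", "3/22/2004", "4/23/2004", "5/22/2004"))
data2 <- data.frame(areaid = c(1, 1, 1, 1, 2, 2, 3, 3),
                    x1 = c(1.22, 1.53, 1.21, 1.52, 0.21, 0.23, 1.57, 1.59),
                    y1 = c(1.32, 2.34, 1.37, 2.35, 3.33, 3.35, 1.31, 1.33))

# Every control paired with every case in the same areaid
data3 <- merge(data2, data1, all = TRUE)

# (a) Literal random deletion: shuffle the rows, then keep only the first
#     row seen for each control (areaid, x1, y1)
shuffled <- data3[sample.int(nrow(data3)), ]
picked <- shuffled[!duplicated(shuffled[c("areaid", "x1", "y1")]), ]
picked <- picked[order(picked$areaid, picked$x1), ]

# (b) Balanced variant: within each areaid, recycle the case row indices
#     over the controls, then shuffle which control gets which case
matched <- do.call(rbind, lapply(split(data2, data2$areaid), function(ctrl) {
  cases <- data1[data1$areaid == ctrl$areaid[1], ]
  idx <- rep(seq_len(nrow(cases)), length.out = nrow(ctrl))
  idx <- idx[sample.int(length(idx))]
  data.frame(ctrl, cases[idx, c("date", "x", "y")], row.names = NULL)
}))
rownames(matched) <- NULL
```

Printing matched should give one row per control, with each case used the
same number of times inside its areaid. Which controls land on which case
depends on the seed.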
-Ista
On Thu, Nov 5, 2009 at 11:34 PM, rusers.sh <rusers.sh at gmail.com> wrote:
> Hi there,
>
> data1<-matrix(data=c(1,1.2,1.3,"3/23/2004",1,1.5,2.3,"3/22/2004",2,0.2,3.3,"4/23/2004",3,1.5,1.3,"5/22/2004"),nrow=4,ncol=4,byrow=TRUE)
> data1<-data.frame(data1)
> names(data1)<-c("areaid","x","y","date")
> data1
>
>   areaid   x   y      date
> 1      1 1.2 1.3 3/23/2004
> 2      1 1.5 2.3 3/22/2004
> 3      2 0.2 3.3 4/23/2004
> 4      3 1.5 1.3 5/22/2004
> data2<-matrix(data=c(1,1.22,1.32,1,1.53,2.34,1,1.21,1.37,1,1.52,
> 2.35,2,0.21,3.33,2,0.23,3.35,3,1.57,1.31,3,1.59,
> 1.33),nrow=8,ncol=3,byrow=TRUE)
> data2<-data.frame(data2)
> names(data2)<-c("areaid","x1","y1")
> data2
>
>   areaid   x1   y1
> 1      1 1.22 1.32
> 2      1 1.53 2.34
> 3      1 1.21 1.37
> 4      1 1.52 2.35
> 5      2 0.21 3.33
> 6      2 0.23 3.35
> 7      3 1.57 1.31
> 8      3 1.59 1.33
> Some explanation of the two datasets: you can treat data1 as the case
> dataset and data2 as the control dataset. Note that data2 has twice as
> many records as data1, that is, two for each data1 record, something like
> a 1:2 matched case-control study design. I want to merge data1 and data2.
> Take areaid=1 as an example. From the two datasets, we can see that data1
> has two points (x,y) in areaid=1, and data2 has four points (x1,y1) in
> areaid=1. Each record in data1 will have two matched records in data2. I
> want to randomly select half of the points of areaid=1 in data2 to link to
> one record of areaid=1 in data1, and the other half of the points of
> areaid=1 in data2 to link to the other record of areaid=1 in data1. In the
> actual dataset, the number of records in the same areaid will be more than
> 2; this is only an example to explain the problem.
> The cases of areaid=2 or 3 are a little easier than areaid=1 because
> there is only one record in data1.
> The final result would be something like the following dataset.
> areaid    x1    y1       date    x    y
>      1  1.22  1.32  3/23/2004  1.2  1.3
>      1  1.53  2.34  3/22/2004  1.2  1.3
>      1  1.21  1.37  3/23/2004  1.5  2.3
>      1  1.52  2.35  3/22/2004  1.5  2.3
>      2  0.21  3.33  4/23/2004  0.2  3.3
>      2  0.23  3.35  4/23/2004  0.2  3.3
>      3  1.57  1.31  5/22/2004  1.5  1.3
>      3  1.59  1.33  5/22/2004  1.5  1.3
>
> Any suggestions or help are greatly appreciated.
> Thanks a lot.
--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org