arun
2012-Nov-22  15:57 UTC
[R] selecting a random sample and saving it in a separate data frame and also the remaining part in another data frame
Hi Madhu,

I guess you got your solution from Rui:

dat1<-data.frame(x=c(1,1,2,2,2,3,4,4,4),y=c(23,45,87,46,78,12,87,79,76))
s<-sample(unique(dat1[,1]),length(unique(dat1[,1]))*0.8)
s
#[1] 3 4 2

You can have a list containing both the data frames:

list1<-list(dat1[dat1$x%in%s,],dat1[!dat1$x%in%s,])
list1
#[[1]]
#  x  y
#3 2 87
#4 2 46
#5 2 78
#6 3 12
#7 4 87
#8 4 79
#9 4 76
#
#[[2]]
#  x  y
#1 1 23
#2 1 45

is.data.frame(list1[[1]])
#[1] TRUE

A.K.

----- Original Message -----
From: Madhu Ganganapalli <mganganapalli at upstreamsoftware.com>
To: arun <smartpink111 at yahoo.com>
Cc:
Sent: Thursday, November 22, 2012 2:53 AM
Subject: selecting a random sample and saving it in a separate data frame and also the remaining part in another data frame

My question is: I have the following data frame, and the distinct values of variable x are 1, 2, 3, 4.

data<-data.frame(x=c(1,1,2,2,2,3,4,4,4),y=c(23,45,87,46,78,12,87,79));

Here my data has 8 observations, but there are only 4 distinct values, so 80% of the data means I have to draw a random sample from these 4 values only. Suppose that while selecting an 80% random sample from x I got 1, 2 and 3 (80% means 80/100*4 = 3, roughly); then I want the following output in a separate data frame:

X  y
1  23
1  45
2  87
2  46
2  78
3  12

That means if 1 is in the 80% random sample, then the data corresponding to the remaining 1's should also be in my data frame.

One more thing: after creating this data frame, only one distinct value, 4, is left in the original data frame. What I mean is that we have to get two data sets simultaneously in two different data frames, both in the above output format. In this case the second data frame is:

X  y
4  12
4  87
4  79

This will help while building a model, because we use only 80% of the data for modeling and the remaining 20% for validation; that is why I want the two data sets simultaneously in two different data frames.

Please help me.

Thanks,
Madhu.
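For reuse, the group-wise split discussed above could be wrapped in a small function. This is only a sketch under a few assumptions: the name `split_by_group`, its `frac` argument and the `train`/`test` element names are made up here and do not come from the thread, and `set.seed()` is used only so the draw is repeatable. It uses `floor()` to make the rounding of 0.8 * 4 explicit, and it samples positions with `sample.int()` because `sample(x, n)` treats a single number `x` as `1:x`.

```r
# Hypothetical helper (not from the thread): split a data frame so that
# all rows sharing a value of `group_col` end up in the same part.
split_by_group <- function(df, group_col, frac = 0.8) {
  groups <- unique(df[[group_col]])
  # floor() makes the truncation explicit: 0.8 * 4 = 3.2 -> 3 groups
  n_train <- floor(length(groups) * frac)
  # Sample positions, not values, so a single remaining group value
  # is never interpreted as the range 1:x
  picked <- groups[sample.int(length(groups), n_train)]
  in_train <- df[[group_col]] %in% picked
  list(train = df[in_train, , drop = FALSE],
       test  = df[!in_train, , drop = FALSE])
}

set.seed(42)  # assumption: fixed seed only to make the split repeatable
dat1 <- data.frame(x = c(1, 1, 2, 2, 2, 3, 4, 4, 4),
                   y = c(23, 45, 87, 46, 78, 12, 87, 79, 76))
parts <- split_by_group(dat1, "x")
parts$train  # rows for the three sampled values of x
parts$test   # rows for the one remaining value of x
```

Naming the list elements means the two data frames can be picked out as `parts$train` and `parts$test` rather than by position.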
