thr3ads.net - R help - [R] select a subset from a sample [Jan 2011]

Wei Yang

2011-Jan-23 12:43 UTC

[R] select a subset from a sample

Dear all,

I would like to ask whether anyone has experience with the problem below.


I want to select a subset of the sample (see data below) so that each level
(1,2,3,4 in the example) for every variable (v1,v2,v3,v4 in the example) is
shown at least once in the subset.  I also want the sample size of the
subset to be as small as possible.  Any help on it is greatly appreciated.


      Id v1 v2 v3 v4
 [1,]  1  1  2  4  3
 [2,]  2  2  1  3  4
 [3,]  3  4  2  4  2
 [4,]  4  1  1  2  3
 [5,]  5  3  2  3  4
 [6,]  6  3  1  1  1
 [7,]  7  3  4  3  1
 [8,]  8  4  4  4  4
 [9,]  9  1  2  2  1
[10,] 10  4  1  1  2
[11,] 11  2  4  3  2
[12,] 12  1  4  2  3
[13,] 13  2  3  3  4
[14,] 14  4  3  1  2
[15,] 15  3  2  1  2
[16,] 16  2  3  2  3
[17,] 17  1  4  1  4
[18,] 18  2  3  4  3
[19,] 19  4  1  4  1
[20,] 20  3  3  2  1



Thanks,

Peter


Ista Zahn

2011-Jan-23 16:12 UTC

I think there are multiple solutions that match your criteria. Here is one:

# The posted data, rebuilt as a data frame (v4 is 2 for Ids 11 and 14, as in the question).
dat <- structure(list(Id = 1:20,
  v1 = c(1L, 2L, 4L, 1L, 3L, 3L, 3L, 4L, 1L, 4L, 2L, 1L, 2L, 4L, 3L, 2L, 1L, 2L, 4L, 3L),
  v2 = c(2L, 1L, 2L, 1L, 2L, 1L, 4L, 4L, 2L, 1L, 4L, 4L, 3L, 3L, 2L, 3L, 4L, 3L, 1L, 3L),
  v3 = c(4L, 3L, 4L, 2L, 3L, 1L, 3L, 4L, 2L, 1L, 3L, 2L, 3L, 1L, 1L, 2L, 1L, 4L, 4L, 2L),
  v4 = c(3L, 4L, 2L, 3L, 4L, 1L, 1L, 4L, 1L, 2L, 2L, 3L, 4L, 2L, 2L, 3L, 4L, 3L, 1L, 1L)),
  .Names = c("Id", "v1", "v2", "v3", "v4"),
  class = "data.frame", row.names = c(NA, -20L))

# Flag the first occurrence of each level in every column, then keep
# each row that holds at least one such first occurrence.
keep <- rowSums(apply(dat[, -1], 2, function(x) !duplicated(x)))
dat.sub <- dat[keep > 0, ]
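
The first-occurrence rule above guarantees that every level shows up, but it makes no attempt to keep the subset small. Picking the fewest rows is a set-cover problem, and a greedy pass is one common heuristic for it: repeatedly take the row that covers the most still-missing variable/level pairs. Below is a minimal sketch under that assumption. greedy_cover is a hypothetical helper name (not from any package), the data are retyped from the table in the question, and the result is not guaranteed to be the true minimum. Since each row supplies only one level per variable, no solution can have fewer than 4 rows.

```r
# Data retyped from the question, one row of v1..v4 per Id.
vals <- c(1,2,4,3, 2,1,3,4, 4,2,4,2, 1,1,2,3, 3,2,3,4,
          3,1,1,1, 3,4,3,1, 4,4,4,4, 1,2,2,1, 4,1,1,2,
          2,4,3,2, 1,4,2,3, 2,3,3,4, 4,3,1,2, 3,2,1,2,
          2,3,2,3, 1,4,1,4, 2,3,4,3, 4,1,4,1, 3,3,2,1)
dat <- data.frame(Id = 1:20,
                  matrix(vals, ncol = 4, byrow = TRUE,
                         dimnames = list(NULL, c("v1", "v2", "v3", "v4"))))

# Hypothetical helper: greedy set cover over variable/level pairs.
greedy_cover <- function(d) {
  # Character matrix with one "variable=level" label per cell, e.g. "v1=3".
  items <- sapply(names(d), function(v) paste(v, d[[v]], sep = "="))
  uncovered <- unique(as.vector(items))
  chosen <- integer(0)
  while (length(uncovered) > 0) {
    # How many still-uncovered pairs would each row add?
    gain <- apply(items, 1, function(r) sum(r %in% uncovered))
    best <- which.max(gain)            # ties go to the earliest row
    chosen <- c(chosen, best)
    uncovered <- setdiff(uncovered, items[best, ])
  }
  sort(chosen)
}

dat.min <- dat[greedy_cover(dat[, -1]), ]
nrow(dat.min)   # size of the greedy subset
dat.min
```

If the exact minimum matters, the same variable/level pairs could be fed to an integer-programming solver instead; the greedy version is only meant as a cheap starting point.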
Best,
Ista



-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

Den

2011-Jan-23 18:21 UTC

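Maybe this:

# Count how often each level occurs in v1..v4
su <- lapply(dat[2:5], function(x) table(x))
su
mode(su)   # "list": one table per variable
# Bind the per-variable counts side by side into one data frame
myBYdata <- data.frame(do.call(cbind, lapply(su, as.data.frame)))
myBYdata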
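
The same kind of tabulation can also be used to check a candidate subset: tabulate the subset against the levels seen in the full data, and any zero count marks a missing level. A small sketch along those lines follows. covers_all is a hypothetical helper name, and the data are retyped from the table in the question.

```r
# Data retyped from the question, one row of v1..v4 per Id.
vals <- c(1,2,4,3, 2,1,3,4, 4,2,4,2, 1,1,2,3, 3,2,3,4,
          3,1,1,1, 3,4,3,1, 4,4,4,4, 1,2,2,1, 4,1,1,2,
          2,4,3,2, 1,4,2,3, 2,3,3,4, 4,3,1,2, 3,2,1,2,
          2,3,2,3, 1,4,1,4, 2,3,4,3, 4,1,4,1, 3,3,2,1)
dat <- data.frame(Id = 1:20,
                  matrix(vals, ncol = 4, byrow = TRUE,
                         dimnames = list(NULL, c("v1", "v2", "v3", "v4"))))

# Hypothetical helper: TRUE if every level found in a column of `full`
# also occurs in the matching column of `sub`.
covers_all <- function(full, sub) {
  all(mapply(function(f, s) {
    counts <- table(factor(s, levels = unique(f)))  # zero count = missing level
    all(counts > 0)
  }, full, sub))
}

covers_all(dat[, -1], dat[1:4, -1])               # FALSE: v1 never takes the value 3
covers_all(dat[, -1], dat[c(1, 2, 4, 7, 14), -1]) # TRUE: these five rows cover everything
```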


