Dear all,
I have a table like this:
a <- read.csv("test.csv", header = TRUE, sep = ";")
a
      UTM      pUrb pUrb_class     pAgri pAgri_class  pNatFor pNatFor_class
 1 NF1885 20.160307         NA 79.921386          NA 0.000000            NA
 2 NF1886 51.965649         NA 46.657713          NA 0.000000            NA
 3 NF1893 26.009581         NA 40.269204          NA 0.000000            NA
 4 NF1894  3.141484         NA  0.000000          NA 0.000000            NA
 5 NF1895 64.296826         NA  0.440691          NA 0.000000            NA
 6 NF1896 14.174068         NA 25.613839          NA 0.000000            NA
 7 NF1897 40.985589         NA 37.680521          NA 0.000000            NA
 8 NF1898 34.054325         NA 66.027334          NA 0.000000            NA
 9 NF1899 20.657632         NA 79.424024          NA 0.000000            NA
10 NF1982 94.857605         NA 45.368606          NA 0.000000            NA
...
And I executed the following code:
#data classification#
a$pUrb_class<-cut(a$pUrb, c(-Inf,80,Inf), labels = c(0,1))
a$pAgri_class<-cut(a$pAgri, c(-Inf,80,Inf), labels = c(0,1))
a$pNatFor_class<-cut(a$pNatFor, c(-Inf,80,Inf), labels = c(0,1))
a
      UTM      pUrb pUrb_class     pAgri pAgri_class  pNatFor pNatFor_class
 1 NF1885 20.160307          0 79.921386           0 0.000000             0
 2 NF1886 51.965649          0 46.657713           0 0.000000             0
 3 NF1893 26.009581          0 40.269204           0 0.000000             0
 4 NF1894  3.141484          0  0.000000           0 0.000000             0
 5 NF1895 64.296826          0  0.440691           0 0.000000             0
 6 NF1896 14.174068          0 25.613839           0 0.000000             0
 7 NF1897 40.985589          0 37.680521           0 0.000000             0
 8 NF1898 34.054325          0 66.027334           0 0.000000             0
 9 NF1899 20.657632          0 79.424024           0 0.000000             0
10 NF1982 94.857605          1 45.368606           0 0.000000             0
...
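For reference, here is a small check of how cut() treats the 80 boundary in the classification above. With the default right = TRUE the intervals are (-Inf, 80] and (80, Inf], so a value of exactly 80 falls in class 0. The vector x and the names cls_right and cls_left are made up for illustration and are not part of test.csv.

```r
# Hypothetical values, not taken from test.csv
x <- c(79.9, 80, 80.1)

# Default right = TRUE: intervals are (-Inf, 80] and (80, Inf]
cls_right <- cut(x, c(-Inf, 80, Inf), labels = c(0, 1))

# right = FALSE: the value 80 now belongs to the upper interval
cls_left <- cut(x, c(-Inf, 80, Inf), labels = c(0, 1), right = FALSE)

as.character(cls_right)
as.character(cls_left)
```

If a value of exactly 80 should count as class 1, right = FALSE would be the variant to use.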
#obtaining the number of combinations present in the database#
library(survival)
b<-strata(a$pUrb_class,a$pAgri_class,a$pNatFor_class, sep=",")
table(b)
b
a$pUrb_class=0,a$pAgri_class=0,a$pNatFor_class=0 
                                           17698 
a$pUrb_class=0,a$pAgri_class=0,a$pNatFor_class=1 
                                             112 
a$pUrb_class=0,a$pAgri_class=1,a$pNatFor_class=0 
                                            4360 
a$pUrb_class=1,a$pAgri_class=0,a$pNatFor_class=0 
                                             160
median(table(b))
[1] 2260
At this stage I have 3 questions:
1st:
How can I obtain the combinations which occur more often than the median (in
this case the first and the third combinations, with 17698 and 4360 rows)?
2nd:
How can I obtain the combinations which occur more often than the median and
have at least one condition present (in this case only the third combination,
the one with pAgri_class=1)?
3rd:
How can I select/extract from the original table the rows which comply with the
2nd question, returned in a layout like this:
      UTM      pUrb pUrb_class     pAgri pAgri_class  pNatFor pNatFor_class
10 NF1982 94.857605          1 45.368606           0 0.000000             0
...
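To make the three questions concrete, here is a rough sketch of the kind of result I am after, on made-up data. It builds the combination labels with paste() rather than strata(), so it runs without the survival package. The toy data frame and the names cls, combo, counts, over, over_present and selected are assumptions for illustration only, and this may well not be the best way to do it.

```r
# Toy data (hypothetical), already classified as 0/1 like the *_class columns
a <- data.frame(
  UTM           = paste0("X", 1:10),
  pUrb_class    = c(0, 0, 0, 0, 0, 0, 1, 1, 1, 0),
  pAgri_class   = c(0, 0, 0, 0, 1, 0, 0, 0, 0, 0),
  pNatFor_class = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1)
)

# One label per row, e.g. "1,0,0" (paste() used here instead of strata())
cls   <- c("pUrb_class", "pAgri_class", "pNatFor_class")
combo <- do.call(paste, c(a[cls], sep = ","))

# Number of rows per combination
counts <- table(combo)
n      <- as.vector(counts)

# 1st: combinations that occur more often than the median count
over <- names(counts)[n > median(n)]

# 2nd: of those, combinations with at least one class equal to 1
over_present <- over[grepl("1", over, fixed = TRUE)]

# 3rd: rows of the original table that belong to those combinations
selected <- a[combo %in% over_present, ]
selected
```

In the toy data the counts are 5, 1, 1 and 3, so the median is 2; the combinations "0,0,0" and "1,0,0" are above it, and only "1,0,0" has a condition present.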
Thanks in advance,
Carlos Guerra