thr3ads.net - R help - [R] Identifying records with the correct number of repeated measures [Dec 2011]


Keith Larson

2011-Dec-18 22:38 UTC

[R] Identifying records with the correct number of repeated measures

Dear list,

I have a dataset where we sampled multiple individuals either 1 or 9
times. Our measurement variable is 'Delta13C' (see below sample
dataset). I cannot figure out how to efficiently use a vector command
(preferably) or a loop to create a new vector of the names of the
individuals sampled 9 times. Note that the 'FeatherPosition' variable
will only be "P1" for individuals sampled only once, while it will be
%in% c('P1', 'P2', 'P3', 'P4', 'P5', 'P6', 'P7', 'P8', 'P9') for
individuals sampled 9 times. In my sample data below the new vector
(e.g. WW_Names) would include only 'WW_08I_01' and 'WW_08I_03'.

Two other quick questions: 1) how can I re-number my 'ROWID', as when
I subset my complete dataset to a smaller dataset the old ROWIDs are
no longer meaningful, and 2) when I subset my dataset my 'factor'
variables contain all the levels from the complete dataset, how can I
reset these factor variables to condense my 'dump' file as much as
possible?

Many Holiday Cheers from a NEW R user!
Keith

Sample data:

WW_Sample_SI <-
structure(list(Individual_ID = structure(c(1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 5L
), .Label = c("WW_08I_01", "WW_08I_02", "WW_08I_03", "WW_08I_04",
"WW_08I_05"), class = "factor"), Site_Name = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L), .Label = "Anjan", class = "factor"), Latitude = c(63.72935,
63.72935, 63.72935, 63.72935, 63.72935, 63.72935, 63.72935, 63.72935,
63.72935, 63.72935, 63.72935, 63.72935, 63.72935, 63.72935, 63.72935,
63.72935, 63.72935, 63.72935, 63.72935, 63.72935, 63.72935),
    Longitude = c(12.54022, 12.54022, 12.54022, 12.54022, 12.54022,
    12.54022, 12.54022, 12.54022, 12.54022, 12.54022, 12.54022,
    12.54022, 12.54022, 12.54022, 12.54022, 12.54022, 12.54022,
    12.54022, 12.54022, 12.54022, 12.54022), FeatherPosition = structure(c(1L,
    2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 1L, 1L, 2L, 3L, 4L, 5L, 6L,
    7L, 8L, 9L, 1L, 1L), .Label = c("P1", "P2", "P3", "P4", "P5",
    "P6", "P7", "P8", "P9"), class = "factor"), Delta13C = c(-18.3,
    -18.53, -19.55, -20.18, -20.96, -21.08, -21.5, -17.42, -13.18,
    -19.95, -22.3, -22.2, -22.18, -22.14, -21.55, -20.85, -23.1,
    -20.75, -20.9, -21.61, -22.24)), .Names = c("Individual_ID",
"Site_Name", "Latitude", "Longitude", "FeatherPosition", "Delta13C"
), class = "data.frame", row.names = c("1282", "1277", "1279",
"1270", "1272", "1274", "1280", "1276", "1271", "1284", "1289",
"1290", "1295", "1293", "1292", "1288", "1291", "1285", "1297",
"1298", "1299"))

*******************************************************************************************
Keith Larson, PhD Student
Evolutionary Ecology, Lund University
Sölvegatan 37
223 62 Lund Sweden
Phone: +46 (0)46 2229014 Mobile: +46 (0)73 0465016 Fax: +46 (0)46 2224716
Skype: sternacaspia FB: keith.w.larson at gmail.com

Sarah Goslee

2011-Dec-18 23:35 UTC

[R] Identifying records with the correct number of repeated measures

Thank you for asking a clear question and including a reproducible
small example.

Here's one possible (2-line) solution to your main question, and
answers to both the others:
> WW_Names <- table(WW_Sample_SI$Individual_ID)
> WW_Names <- names(WW_Names)[WW_Names == 9]
> WW_Names
[1] "WW_08I_01" "WW_08I_03"
>
> # by ROWID you mean row names? If so:
> row.names(WW_Sample_SI) <- 1:nrow(WW_Sample_SI)
> head(WW_Sample_SI)
  Individual_ID Site_Name Latitude Longitude FeatherPosition Delta13C
1     WW_08I_01     Anjan 63.72935  12.54022              P1   -18.30
2     WW_08I_01     Anjan 63.72935  12.54022              P2   -18.53
3     WW_08I_01     Anjan 63.72935  12.54022              P3   -19.55
4     WW_08I_01     Anjan 63.72935  12.54022              P4   -20.18
5     WW_08I_01     Anjan 63.72935  12.54022              P5   -20.96
6     WW_08I_01     Anjan 63.72935  12.54022              P6   -21.08
>
> # factor() can be used to eliminate unused levels
> # your sample data doesn't have any, but here's an example:
> testdata <- factor(c("a", "a", "b", "c", "d"))
> str(testdata)
 Factor w/ 4 levels "a","b","c","d": 1 1 2 3 4
> testdata <- testdata[1:3]
> str(testdata)
 Factor w/ 4 levels "a","b","c","d": 1 1 2
> testdata <- factor(testdata)
> str(testdata)
 Factor w/ 2 levels "a","b": 1 1 2
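
Editor's sketch (not part of the original reply): the three answers can be combined into one pass over a data frame. The example below is only an illustration under stated assumptions. It builds a compact stand-in for the sample data (the object names WW, WW_Complete and WW_Sub are hypothetical), repeats the table() count, adds an optional stricter check that each individual has P1 through P9 exactly once, and then uses droplevels() and rownames(x) <- NULL as one possible way to reset every factor and renumber the rows after subsetting.

```r
# Compact stand-in for the sample data (assumption: only the two columns
# needed here; two birds with P1-P9 and three birds with P1 only).
WW <- data.frame(
  Individual_ID = factor(c(rep("WW_08I_01", 9), "WW_08I_02",
                           rep("WW_08I_03", 9), "WW_08I_04", "WW_08I_05")),
  FeatherPosition = factor(c(paste0("P", 1:9), "P1",
                             paste0("P", 1:9), "P1", "P1"))
)

# Names of individuals with exactly 9 records (the table() idea from the reply).
counts <- table(WW$Individual_ID)
WW_Names <- names(counts)[counts == 9]

# Optional stricter check: all nine feather positions present, none repeated.
expected <- paste0("P", 1:9)
complete <- tapply(as.character(WW$FeatherPosition), WW$Individual_ID,
                   function(p) setequal(p, expected) && !anyDuplicated(p))
WW_Complete <- names(complete)[complete]

# Keep only those individuals, drop unused levels in every factor column,
# and reset the row names to 1..n.
WW_Sub <- droplevels(WW[WW$Individual_ID %in% WW_Names, ])
rownames(WW_Sub) <- NULL

print(WW_Names)
print(levels(WW_Sub$Individual_ID))
```

droplevels() applies to a whole data frame, so it saves calling factor() on each column in turn; whether the stricter completeness check is needed depends on how much the raw counts can be trusted.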


Sarah


-- 
Sarah Goslee
http://www.sarahgoslee.com
