Weidong Gu
2008-Feb-11 16:30 UTC
[R] how to generate a column based on other columns in a data frame
Hi,

I am working on a data set with multiple collections of mosquitoes at sampling sites. Each row represents a collection of individual samples, with coordinates for each collection.

    ...  X       Y       ...
    1    36.435  30.118
    2    36.435  30.118
    3    36.435  30.118
    4    35.329  29.657
    5    35.329  29.657
    6    36.431  30.111
    7    36.431  30.111
    8    35.421  29.797
    9    35.421  29.797
    10   35.421  29.797

Unfortunately, there is no 'site' entry. I would like to add a 'site' column based on the sample coordinates, so that samples from the same site share the same site ID (S1, S2, ...).

What is the idiomatic way to do this in R? Thanks.

Weidong Gu,
Department of Medicine
University of Alabama, Birmingham
1900 University Blvd., Birmingham, Alabama 35294
Email: wgu@uab.edu
PH: (205)-975-9053
Henrique Dallazuanna
2008-Feb-11 16:46 UTC
[R] how to generate a column based on other columns in a data frame
Try this:

    x2 <- merge(x, cbind(unique(x), Site = sprintf("S%d", seq_len(nrow(unique(x))))), by = c("X", "Y"))
    x2[order(x2$Site), ]

-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
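A self-contained sketch of the merge approach above, run on the sample coordinates from the original post. It assumes the real data frame has columns besides X and Y (the post shows "..." on both sides), so the lookup table is built from the two coordinate columns only. The `id` column and the names `sites` and `x2` are hypothetical, added for illustration.

```r
# Sample coordinates from the original post; `id` is a hypothetical extra
# column standing in for whatever else the real data frame contains.
x <- data.frame(
  id = 1:10,
  X = c(36.435, 36.435, 36.435, 35.329, 35.329,
        36.431, 36.431, 35.421, 35.421, 35.421),
  Y = c(30.118, 30.118, 30.118, 29.657, 29.657,
        30.111, 30.111, 29.797, 29.797, 29.797)
)

# Lookup table: one row per distinct (X, Y) pair, labelled S1, S2, ...
# in order of first appearance.
sites <- unique(x[c("X", "Y")])
sites$Site <- sprintf("S%d", seq_len(nrow(sites)))

# Attach the labels. merge() sorts its result by the key columns,
# so restore the original row order afterwards.
x2 <- merge(x, sites, by = c("X", "Y"))
x2 <- x2[order(x2$id), ]
```

Sorting on the character column `Site` would place "S10" before "S2" once there are ten or more sites, which is why this sketch restores the order from `id` instead.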
Gabor Grothendieck
2008-Feb-11 17:04 UTC
[R] how to generate a column based on other columns in a data frame
Assuming this data frame:

    DF <- data.frame(X = c(36.435, 36.435, 36.435, 35.329, 35.329, 36.431, 36.431, 35.421, 35.421, 35.421),
                     Y = c(30.118, 30.118, 30.118, 29.657, 29.657, 30.111, 30.111, 29.797, 29.797, 29.797))

    # Try this:
    DF$site <- as.numeric(factor(interaction(DF$X, DF$Y)))

If X and Y can vary slightly while still referring to the same site, round them to k decimal places first. See ?round
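The rounding suggestion can be sketched as follows, as one possible reading rather than the only way to do it. It swaps `interaction()` for a pasted key plus `match()`, so that IDs follow order of first appearance and carry the "S" prefix the original post asked for. The precision `k = 3` and the name `key` are assumptions for illustration.

```r
DF <- data.frame(
  X = c(36.435, 36.435, 36.435, 35.329, 35.329,
        36.431, 36.431, 35.421, 35.421, 35.421),
  Y = c(30.118, 30.118, 30.118, 29.657, 29.657,
        30.111, 30.111, 29.797, 29.797, 29.797)
)

# Assumed precision: coordinates that agree to k decimal places
# are treated as the same site.
k <- 3

# One text key per row, built from the rounded coordinates.
key <- paste(round(DF$X, k), round(DF$Y, k))

# match() returns the position of each key among the distinct keys,
# so sites are numbered in order of first appearance.
DF$site <- paste0("S", match(key, unique(key)))
```

With this sample data and `k = 3`, rows 1 to 3 get S1, rows 4 and 5 get S2, rows 6 and 7 get S3, and rows 8 to 10 get S4. Rounding is only a rough grouping: two nearby points that fall on opposite sides of a rounding boundary will still end up as different sites.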