Hi all,

I'm trying to identify a particular digit or value within a vector of factors. Specifically, this is environmental data where in some cases the minimum value reported is "<" a particular number (and I want to manipulate only these). For example:

x <- c("1","2","3","4","<1")

For a dataset that is hundreds or thousands of lines long, I'd like to find or identify only those values that have a "<" symbol. (R automatically stores the entire vector in factor format because of these symbols when it imports the data; I don't mind converting if necessary.) Eventually, I'd like to divide the number in half for these cases, but I think I have that coding lined up once I can just identify them from the stew.

I've exhausted help and net resources so far...

Thanks,
Ryan

--
Ryan Utz, Ph.D.
Aquatic Ecologist/STREON Scientist
National Ecological Observatory Network
Hi Ryan,

The key to this is the grep() function. It's easiest if you import your data as character rather than factor, since these aren't properly factors. You don't say how the import is done, but if you are using read.table() then as.is=TRUE will prevent conversion to factor.

For a character vector, here's how to find out which elements contain a <.

> x <- c("1","2","3","4","<1")
> str(x)
 chr [1:5] "1" "2" "3" "4" "<1"
> grep("<", x)
[1] 5
> x[grep("<", x)]
[1] "<1"

If there's some other compelling reason that your data are stored as factors, one approach would be to identify which factor levels have < in them, then identify the values that are of those levels.

> x <- factor(x)
> x
[1] 1  2  3  4  <1
Levels: <1 1 2 3 4
> levels(x)
[1] "<1" "1"  "2"  "3"  "4"
> grep("<", levels(x))
[1] 1
> x[x %in% levels(x)[grep("<", levels(x))]]
[1] <1
Levels: <1 1 2 3 4

All of this is assuming that there are more possibilities than "<1"; if that's the only concern then == should work just fine.

Sarah

On Mon, Jul 25, 2011 at 1:29 PM, Ryan Utz <utz.ryan at gmail.com> wrote:
> I'm trying to identify a particular digit or value within a vector of
> factors. [...]

--
Sarah Goslee
http://www.functionaldiversity.org
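A minimal sketch of the import route described above, assuming the data come in through read.table(). The inline text input and the column name `conc` are invented for illustration and stand in for a real file. Note that since R 4.0.0 read.table() no longer converts strings to factors by default, so as.is = TRUE mainly matters on older versions.

```r
# Hypothetical stand-in for a data file; `conc` is an invented column name
raw <- "conc\n1\n2\n3\n4\n<1"

# as.is = TRUE stops read.table() from converting strings to factors
dat <- read.table(text = raw, header = TRUE, as.is = TRUE)

str(dat$conc)   # a character column, because "<1" cannot be read as a number

# Logical flag: TRUE where the value contains a literal "<"
censored <- grepl("<", dat$conc, fixed = TRUE)

dat$conc[censored]   # the flagged values
which(censored)      # their row positions
```

grepl() returns a logical vector of the same length as the column, which is convenient if the flag is to be stored as an extra column next to the data; grep() returns the positions instead.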
b <- c("1","2","3","4","<1")
grep('<', b)   # positions of the elements that contain "<"

HTH,
Daniel

Ryan Utz-2 wrote:
> I'm trying to identify a particular digit or value within a vector of
> factors. [...]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Hi,

you provided a character vector as an example. I guess you meant something like:

x <- factor(c("1","2","3","4","<1"))

# You can identify the elements with a "<" using ?grep or ?grepl:
indices <- grep("<", as.character(x))

# To get numbers, strip the "<" and convert the labels with ?sub and
# ?as.numeric. Go through as.character() first: as.numeric() applied
# directly to a factor returns the internal level codes.
as.numeric(sub("<", "", as.character(x[indices])))

HTH,
Denes
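For the follow-up step Ryan mentions (halving the flagged values), here is one possible sketch. It assumes every flagged entry has the form "<number" with nothing else in the string; the extra "<0.5" entry and the names `x_chr`, `censored` and `values` are added here for illustration only.

```r
# Example data; the "<0.5" entry is an added illustration
x <- factor(c("1", "2", "3", "4", "<1", "<0.5"))

# Work on the labels, not the factor itself: as.numeric() on a factor
# returns the internal level codes rather than the printed values
x_chr <- as.character(x)

# Flag the entries that contain a literal "<"
censored <- grepl("<", x_chr, fixed = TRUE)

# Strip the "<" and convert everything to numeric
values <- as.numeric(sub("<", "", x_chr, fixed = TRUE))

# Halve only the flagged values; the rest are left unchanged
values[censored] <- values[censored] / 2

values   # 1, 2, 3, 4, 0.5, 0.25
```

Keeping `censored` as a separate logical vector preserves the information about which values were originally reported as below a limit, which is lost once the "<" is stripped.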