thr3ads.net - R help - [R] Extracting part of a factor [Mar 2016]

KMNanus

2016-Mar-04 18:00 UTC

[R] Extracting part of a factor

Here's the dataset I'm working with, called test -

subject	group	wk1	wk2	wk3	wk4	place
001-002	boys	2	3	4	5  	
002-003	boys	7	6	5	4	
003-004	boys	9	4	6	1	
004-005	girls	5	7	8	9	
005-006	girls	2	6	3	8	
006-007	girls	1	4	7	4	


If I call mutate(test, place = substr(subject,1,3)), 001 is the first
observation in the place column.

But it's a character and "subject" is a factor.  I need place to be a factor,
too, but I need the observations to be ONLY the first three numbers of
"subject."

Does that make my request more understandable?

Ken
kmnanus at gmail.com
914-450-0816 (tel)
347-730-4813 (fax)


> On Mar 4, 2016, at 12:49 PM, Sarah Goslee <sarah.goslee at gmail.com> wrote:
> 
> Hi Ken,
> 
> You do that with as.factor(), as has already been suggested. You'll
> need to provide a reproducible example to show us what's going wrong. Using
> fake data is fine, we just need to see some data that look like yours and the
> code you're using.
> 
> Sarah
> 
> On Fri, Mar 4, 2016 at 11:57 AM, KMNanus <kmnanus at gmail.com> wrote:
> Let me see if I can ask the question more clearly - I am trying to extract
> a section of a hyphenated factor. For example, 001-004 is one observation of
> test$ken, which is a factor, and I want to set up a new factor variable called
> place that would have 001 as an observation. If I call mutate(place =
> (as.character(test$ken))), I can extract 001 from 001-004, but don't
> know how to subsequently convert that character string back into a factor.
> 
> 
> Or can 001 be extracted from a factor as a factor?
> 
> Do you know how to execute either of these approaches?
> 
> Ken
> kmnanus at gmail.com
> 914-450-0816 (tel)
> 347-730-4813 (fax)
> 
> 
> 
> > On Mar 3, 2016, at 8:33 PM, Hervé Pagès <hpages at fredhutch.org> wrote:
> >
> > On 03/03/2016 02:13 PM, KMNanus wrote:
> >> When I do that,
> >
> > When you do what exactly?
> >
> > It's impossible for anyone here to know what you're doing if you
> > don't show the code.
> >
> >> I get "Error in `$<-.data.frame`(`*tmp*`, "site", value
> >> = integer(0)) :
> >>   replacement has 0 rows, data has 6"
> >>
> >> The data frame has 6 rows.
> >
> > You said you had a factor variable, you never mentioned you had a
> > data.frame. If the factor variable is part of a data.frame 'df',
> > then first extract it with something like df$myvar or df[["myvar"]],
> > and then call substr() followed by as.factor() on it.
> >
> > H.
> >
> >>
> >> Ken
> >> kmnanus at gmail.com
> >> 914-450-0816 (tel)
> >> 347-730-4813 (fax)
> >>
> >>
> >>> On Mar 3, 2016, at 4:52 PM, Hervé Pagès <hpages at fredhutch.org> wrote:
> >>>
> >>> Hi,
> >>>
> >>> On 03/03/2016 12:18 PM, KMNanus wrote:
> >>>> I have a factor variable that is 6 digits and hyphenated. For
> >>>> example, 001-014.
> >>>>
> >>>> I need to extract the first 3 digits to a new variable using mutate
> >>>> in dplyr - in this case 001 - but can't find a function to do it.
> >>>>
> >>>> substr will do this for character strings, but I need the variable to
> >>>> remain as a factor.
> >>>
> >>> What prevents you from calling as.factor() on the result to turn it
> >>> back into a factor?
> >>>
> >>> H.
> >>>
> >>>>
> >>>> Is there an R function  or workaround to do this?
> >>>>
> >>>>
> >>>> Ken
> >>>> kmnanus at gmail.com
> >>>> 914-450-0816 (tel)
> >>>> 347-730-4813 (fax)
> >>>>
> >>>>
> >>>>
> >>>> ______________________________________________
> >>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>>> PLEASE do read the posting guide
> >>>> http://www.R-project.org/posting-guide.html
> >>>> and provide commented, minimal, self-contained, reproducible code.
> >>>>
> >>>
> >>> --
> >>> Hervé Pagès
> >>>
> >>> Program in Computational Biology
> >>> Division of Public Health Sciences
> >>> Fred Hutchinson Cancer Research Center
> >>> 1100 Fairview Ave. N, M1-B514
> >>> P.O. Box 19024
> >>> Seattle, WA 98109-1024
> >>>
> >>> E-mail: hpages at fredhutch.org
> >>> Phone:  (206) 667-5791
> >>> Fax:    (206) 667-1319
> >>
> >
> > --
> > Hervé Pagès
> >
> > Program in Computational Biology
> > Division of Public Health Sciences
> > Fred Hutchinson Cancer Research Center
> > 1100 Fairview Ave. N, M1-B514
> > P.O. Box 19024
> > Seattle, WA 98109-1024
> >
> > E-mail: hpages at fredhutch.org
> > Phone:  (206) 667-5791
> > Fax:    (206) 667-1319
>

Sarah Goslee

2016-Mar-04 18:07 UTC

[R] Extracting part of a factor

As everyone has been telling you, as.factor().
If you like the mutate approach, you can call as.factor(test$subject)
to convert it.

Here's a one-liner with reproducible data.


testdata <- structure(list(subject = structure(1:6, .Label = c("001-002",
    "002-003", "003-004", "004-005", "005-006", "006-007"), class = "factor"),
    group = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("boys",
    "girls"), class = "factor"), wk1 = c(2L, 7L, 9L, 5L, 2L,
    1L), wk2 = c(3L, 6L, 4L, 7L, 6L, 4L), wk3 = c(4L, 5L, 6L,
    8L, 3L, 7L), wk4 = c(5L, 4L, 1L, 9L, 8L, 4L)), .Names = c("subject",
    "group", "wk1", "wk2", "wk3", "wk4"), class = "data.frame",
    row.names = c(NA, -6L))

testdata$subject <- as.factor(substring(as.character(testdata$subject), 1, 3))

> testdata
  subject group wk1 wk2 wk3 wk4
1     001  boys   2   3   4   5
2     002  boys   7   6   5   4
3     003  boys   9   4   6   1
4     004 girls   5   7   8   9
5     005 girls   2   6   3   8
6     006 girls   1   4   7   4
> str(testdata)
'data.frame': 6 obs. of  6 variables:
 $ subject: Factor w/ 6 levels "001","002","003",..: 1 2 3 4 5 6
 $ group  : Factor w/ 2 levels "boys","girls": 1 1 1 2 2 2
 $ wk1    : int  2 7 9 5 2 1
 $ wk2    : int  3 6 4 7 6 4
 $ wk3    : int  4 5 6 8 3 7
 $ wk4    : int  5 4 1 9 8 4

Sarah
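
The original question asked for a new column called place rather than an overwritten subject column. A minimal sketch of that variant is below, in base R so it runs without extra packages. The data frame is a cut-down copy of the thread's data (only wk1 is kept), and the commented dplyr line is an untested equivalent that assumes dplyr is installed.

```r
# Toy data mirroring part of the 'test' data frame from the thread.
test <- data.frame(
  subject = factor(c("001-002", "002-003", "003-004",
                     "004-005", "005-006", "006-007")),
  group   = factor(rep(c("boys", "girls"), each = 3)),
  wk1     = c(2L, 7L, 9L, 5L, 2L, 1L)
)

# Leave 'subject' untouched and add 'place' as a new factor column.
# as.character() makes the factor-to-string step explicit before substr().
test$place <- factor(substr(as.character(test$subject), 1, 3))

# Assumed dplyr equivalent (not run here):
# test <- dplyr::mutate(test, place = factor(substr(as.character(subject), 1, 3)))

str(test$place)
levels(test$place)
```

Wrapping the substr() result in factor() (or as.factor()) inside the same expression is what keeps place from ending up as a character column.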


Jeff Newmiller

2016-Mar-04 20:46 UTC

[R] Extracting part of a factor

I much prefer the factor function over the as.factor function for converting
character to factor, since you can set the levels in the order you want them to
be.
-- 
Sent from my phone. Please excuse my brevity.
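
To illustrate the difference described above, here is a small sketch comparing as.factor() with factor() and an explicit levels argument. The vector place_chr and the chosen level order are made up for illustration and are not from the thread.

```r
# Hypothetical character values, e.g. the result of substr() on subject IDs.
place_chr <- c("003", "001", "002", "001")

# as.factor() picks the levels itself, in sorted order.
f_default <- as.factor(place_chr)
levels(f_default)

# factor() accepts an explicit 'levels' argument, so the order can be chosen
# and levels that do not occur in the data (here "004") can be declared.
f_custom <- factor(place_chr, levels = c("003", "002", "001", "004"))
levels(f_custom)

# The underlying integer codes follow the chosen level order.
as.integer(f_custom)
```

The level order matters for anything that depends on it, such as the order of groups in plots and tables or the reference level in a model.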

