Having had to face this problem myself more than once, I sympathize with
Ted's argument. First let me confess that I regard sex as a measure of the
reproductive phenotype. Given the ongoing experimentation with both sex and
gender, I have had to add "U" (Unstated - includes all those acronyms that
can be mistaken for gamma hydroxy butyrate) to "M" and "F" in a dataset or
two.
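
As a minimal sketch of the three-code scheme just described, the R lines
below keep "U" as an explicit factor level next to "M" and "F". The vector
`sex` and the choice to map missing entries to "U" are illustrative
assumptions, not taken from any of the datasets mentioned.

```r
# Hypothetical input: blank or missing sex values arrive as NA.
sex <- c(NA, NA, "M", "F", "M", "F", "F")

# Assumption: record missing entries as "U" (Unstated), an explicit category.
sex[is.na(sex)] <- "U"
sex_f <- factor(sex, levels = c("M", "F", "U"))

# Counts per category, with "U" tabulated like any other level.
print(table(sex_f))

# Integer codes follow the order of `levels`: M = 1, F = 2, U = 3.
codes <- as.integer(sex_f)
print(codes)
```

Because "U" is an ordinary level here, it is tabulated with the other
categories rather than dropped the way an NA would be under table()'s
defaults.
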
Even worse is the crapshoot of sex chromosomes. While XYY is not much of a
problem at all, people with Turner's Syndrome (XO) are neither female
(although they appear to be) nor male. Given a reasonably large sample (the
dream of some), nature usually provides a few permutations that, while we
know what they are, don't really fit comfortably in either "M" or "F".
Jim
On Sun, Nov 1, 2015 at 9:02 AM, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
> On 31/10/2015 3:47 PM, (Ted Harding) wrote:
> > [Apologies if the message below should arrive twice. When first
> > sent there was apparently something wrong with the email address
> > to r-help, and it was held for moderation because "Message has
> > implicit destination" (whatever that means). I have made sure
> > that this time the email address is correct.]
> >
> > John Fox has given a neat expression to achieve the desired result!
> >
> > I would like to comment, however, on the somewhat insistent criticism
> > of Val's request from several people.
> >
> > It can make sense to have three "sex"es. Suppose, for example,
> > that the data are records of street crime reported by victims.
> > The victim may be able to identify the sex of the perpetrator
> > as definitely "M", or definitely "F". One of the aims of the
> > analysis is to investigate whether there is an association
> > between the gender of the offender and the type of crime.
> >
> > But in some cases the victim may not have been able to recognise
> > the offender's sex. Then it would have to go in the record as "NA"
> > (or equivalent). There can be two kinds of reason why the victim
> > was unable to recognise the sex. One kind is where the victim
> > simply did not see the offender (e.g. their purse was stolen
> > while they were concentrating on something else, and they only
> > found out later). Another kind is where the offender deliberately
> > disguises their gender, so that it cannot be determined from their
> > appearance. This second kind could be associated with a particular
> > category of crime (and I leave it to people's lurid imaginations
> > to think of possible examples ... ).
>
> I'm not convinced by your example. I'm quite happy to say that the sex
> is M or F or unobserved, but unobserved is not a third sex; under that
> model it just means "M or F but I don't know which". It is an
> incomplete observation, not a third sex.
>
> I can imagine 3 sexes in a case of multiple individuals: "all M",
> "all F", "mixed".
>
> I can also imagine more complicated definitions of "sex" that include
> more than 2 categories, but I think that's not what we're talking about
> here.
>
> >
> > Then one indeed has three "sex"es: Male, Female, and Indeterminate,
> > for each of which there is a potential association with type of crime.
> > With most analyses, however, a category of "NA" would be ignored
> > (at least by R).
>
> That claim is nonsense. R never ignores *anything* unless the analyst
> tells it to. The analyst may choose to ignore something, but don't
> blame R if the analyst makes a bad decision.
>
> Duncan Murdoch
>
>
> > And then one has a variable which is a factor with 3 levels, all
> > of which can (as above) be meaningful, and "NA" would not be
> > ignored.
> >
> > Hoping this helps to clarify! (And, Val, does the above somehow
> > correspond to your objectives?)
> >
> > Best wishes to all,
> > Ted.
> >
> > On 31-Oct-2015 17:41:02 Jeff Newmiller wrote:
> >> Rolf gave you two ways. There are others. They all misrepresent the data
> >> (there are only two sexes but you are effectively acting as if there are
> >> three); hence the inquisition in hopes of diverting you to a more correct
> >> method of analysis. However, this is not the support forum for whatever
> >> other software you plan to proceed with, so never mind.
> >>
> >> ---------------------------------------------------------------------------
> >> Jeff Newmiller                        The     .....       .....  Go Live...
> >> DCN:<jdnewmil at dcn.davis.ca.us>     Basics: ##.#.       ##.#.  Live Go...
> >>                                       Live:   OO#.. Dead: OO#..  Playing
> >> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> >> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
> >> ---------------------------------------------------------------------------
> >> Sent from my phone. Please excuse my brevity.
> >>
> >> On October 31, 2015 10:15:33 AM PDT, Val <valkremk at gmail.com> wrote:
> >>> Hi Jeff,
> >>>
> >>> I thought I answered. Yes, I was not clear about it. The further
> >>> analysis will not be done by R. It is another software that will not
> >>> accept a character response variable.
> >>>
> >>> Why is R so complicated for this? In SAS I can do it in one
> >>> statement.
> >>>
> >>>
> >>> On Sat, Oct 31, 2015 at 11:39 AM, Jeff Newmiller
> >>> <jdnewmil at dcn.davis.ca.us> wrote:
> >>>
> >>>> You haven't actually answered John's question as to the type of
> >>>> analysis you plan to do. It still looks from here like you should be
> >>>> using factor data rather than numeric, but since you are not being
> >>>> clear we cannot give specifics as to how to proceed.
> >>>>
> >>>> On October 31, 2015 8:23:05 AM PDT, Val <valkremk at gmail.com> wrote:
> >>>>> Hi All,
> >>>>>
> >>>>>
> >>>>> Yes, I need to change to numeric because I am preparing a data set
> >>>>> for further analysis. The variable to be changed from character to
> >>>>> numeric (in this case, sex) will be a response variable. Some records
> >>>>> have a missing observation on sex, and it is blank.
> >>>>> id sex
> >>>>> 1
> >>>>> 2
> >>>>> 3 M
> >>>>> 4 F
> >>>>> 5 M
> >>>>> 6 F
> >>>>> 7 F
> >>>>>
> >>>>> I am reading the data like this
> >>>>>
> >>>>> mydata <- read.csv(header=TRUE, sep="", text='
> >>>>> id sex
> >>>>> 1 NA
> >>>>> 2 NA
> >>>>> 3 M
> >>>>> 4 F
> >>>>> 5 M
> >>>>> 6 F
> >>>>> 7 F
> >>>>> ')
> >>>>>
> >>>>> The data set is huge (>250,000)
> >>>>>
> >>>>>
> >>>>> I want the output like this
> >>>>>
> >>>>> id sex sex1
> >>>>> 1 NA 0
> >>>>> 2 NA 0
> >>>>> 3 M 1
> >>>>> 4 F 2
> >>>>> 5 M 1
> >>>>> 6 F 2
> >>>>> 7 F 2
> >>>>>
> >>>>> Thank you in advance
> >>>>>
> >>>>>
> >>>>> On Sat, Oct 31, 2015 at 5:59 AM, John Kane <jrkrideau at inbox.com> wrote:
> >>>>>
> >>>>>> In line.
> >>>>>>
> >>>>>> John Kane
> >>>>>> Kingston ON Canada
> >>>>>>
> >>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: valkremk at gmail.com
> >>>>>>> Sent: Fri, 30 Oct 2015 20:40:03 -0500
> >>>>>>> To: istazahn at gmail.com
> >>>>>>> Subject: Re: [R] If else
> >>>>>>>
> >>>>>>> I am trying to change the mydata$sex from character to numeric
> >>>>>>
> >>>>>> Why?
> >>>>>> As Ista (mydata$confusingWillCauseProblemsLater) has pointed out,
> >>>>>> this is a very unusual thing to do in R.
> >>>>>>
> >>>>>> Is there a very specific reason for doing this in your analysis?
> >>>>>> Otherwise it may be better to leave the coding as NA. Some of the
> >>>>>> data mungers here may be able to suggest which is the best strategy
> >>>>>> in R.
> >>>>>>
> >>>>>> R is 'weird' compared to more mundane stats packages such as SAS or
> >>>>>> SPSS, and common techniques that one would use with them often are
> >>>>>> not appropriate in R.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>> I want the output like
> >>>>>>> id sex sex1
> >>>>>>> 1 NA 0
> >>>>>>> 2 NA 0
> >>>>>>> 3 M 1
> >>>>>>> 4 F 2
> >>>>>>> 5 M 1
> >>>>>>> 6 F 2
> >>>>>>> 7 F 2
> >>>>>>>
> >>>>>>> mydata$sex1 <- 0
> >>>>>>> if(mydata$sex =="M " ){
> >>>>>>> mydata$sex1<-1
> >>>>>>> } else {
> >>>>>>> mydata$sex1<-2
> >>>>>>> }
> >>>>>>>
> >>>>>>> mydata$sex1
> >>>>>>>
> >>>>>>> Warning message:
> >>>>>>> In if (mydata$sex == "M ") { :
> >>>>>>>   the condition has length > 1 and only the first element will be used
> >>>>>>> > mydata$sex1
> >>>>>>> [1] 2 2 2 2 2 2 2 2
> >>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Fri, Oct 30, 2015 at 8:28 PM, Ista Zahn <istazahn at gmail.com> wrote:
> >>>>>>>
> >>>>>>>> Using numeric for missing sounds like asking for trouble. But if
> >>>>>>>> you must, something like
> >>>>>>>>
> >>>>>>>> mydata$confusingWillCauseProblemsLater <-
> >>>>>>>>   ifelse(
> >>>>>>>>     is.na(mydata$sex),
> >>>>>>>>     0,
> >>>>>>>>     as.numeric(factor(mydata$sex,
> >>>>>>>>                       levels = c("M", "F"))))
> >>>>>>>>
> >>>>>>>> should do it.
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Ista
> >>>>>>>>
> >>>>>>>> On Fri, Oct 30, 2015 at 9:15 PM, Val <valkremk at gmail.com> wrote:
> >>>>>>>>> Hi all,
> >>>>>>>>> I am trying to change character to numeric but have a problem
> >>>>>>>>>
> >>>>>>>>> mydata <- read.table(header=TRUE, text='
> >>>>>>>>> id sex
> >>>>>>>>> 1 NA
> >>>>>>>>> 2 NA
> >>>>>>>>> 3 M
> >>>>>>>>> 4 F
> >>>>>>>>> 5 M
> >>>>>>>>> 6 F
> >>>>>>>>> 7 F
> >>>>>>>>> ')
> >>>>>>>>>
> >>>>>>>>> if sex is missing then sex=0;
> >>>>>>>>> if sex is"M" then sex=1;
> >>>>>>>>> if sex is"F" then sex=2;
> >>>>>>>>>
> >>>>>>>>> Any help please ?
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> ______________________________________________
> >>>>>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>>>>>>>> PLEASE do read the posting guide
> >>>>>>>>> http://www.R-project.org/posting-guide.html
> >>>>>>>>> and provide commented, minimal, self-contained, reproducible code.
> >
> > -------------------------------------------------
> > E-Mail: (Ted Harding) <Ted.Harding at wlandres.net>
> > Date: 31-Oct-2015 Time: 19:29:50
> > This message was sent by XFMail
> >