Tom,
You may have a very different impression of what was asked! LOL!
Unless Janet clarifies, what looks a bit like a homework assignment seems to
be a fairly simple and straightforward task with exactly three
rows/columns, asking how to replace the values, in a sense, by finding the
high and low and perhaps thus identifying the medium, and to do this for each
row without changing the order of the resulting data.frame.
I note most techniques people have used focus on columns, not rows, but an
all-numeric data.frame can be transposed, or converted to a matrix and later
converted back.
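A minimal sketch of that matrix round trip, using Janet's example data. The
names orig, levels_out and new are mine, and rank() is only one of several
ways to do it, so treat this as an illustration rather than the expected
answer:

```r
# Janet's example data, typed in from her message
orig <- data.frame(tree  = c(32, 23, 49),
                   shrub = c(11, 41, 23),
                   grass = c(47, 26, 18))

# Replacement values for the lowest, middle and highest entry of a row
levels_out <- c(10, 30, 50)

# Work row by row on a matrix copy. apply() over rows returns one
# column per input row, so the result is transposed back with t().
m <- t(apply(as.matrix(orig), 1,
             function(r) levels_out[rank(r, ties.method = "first")]))

# Convert back to a data.frame and restore the column names
new <- as.data.frame(m)
names(new) <- names(orig)
new
```

The ties.method = "first" argument is an arbitrary choice that only matters
if a row contains equal values.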
If this is HW, the question becomes what has been taught so far and is supposed
to be used in solving it. Can they make their own functions, perhaps to be called
three times, once per row or column, to replace that row/column, or can they use
some form of loop to iterate over the columns? Does it need to be done
in place, or can they gradually create a second data.frame and then move the
pointer to it? There are lots of other similar ideas.
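As a rough sketch of the function-plus-loop idea, assuming a second
data.frame is allowed, something like the following would do. The helper
recode_row() is a name I made up, not anything from base R:

```r
# Janet's example data, typed in from her message
orig <- data.frame(tree  = c(32, 23, 49),
                   shrub = c(11, 41, 23),
                   grass = c(47, 26, 18))

# Hypothetical helper: recode one numeric vector of length 3
recode_row <- function(r, out = c(10, 30, 50)) {
  res <- numeric(length(r))
  # order(r) lists positions from smallest to largest value, so the
  # smallest position receives out[1] and the largest receives out[3]
  res[order(r)] <- out
  res
}

# Build a second data.frame of the same shape, one row at a time
new <- orig
for (i in seq_len(nrow(orig))) {
  new[i, ] <- recode_row(unlist(orig[i, ]))
}
new
```

Once the loop is done, new can simply be assigned back over orig if the
result is meant to replace the original.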
I am not sure, other than as a HW assignment, why this transformation would need
to be done, but of course, there may well be a reason.
I note that the particular example shown just happens to create almost a magic
square, as the sums of the rows, the columns and the major diagonal all happen
to be 90, albeit the reverse diagonal is all 50's.
Again, there are many solutions imaginable but the goal may be more specific and
I shudder to supply one given that too often questions here are not detailed
enough and are misunderstood. In this case, I thought I understood until I saw
what Tom wrote! LOL!
I will add this. Is it guaranteed that no two items in the same row are ever
equal, or is there some requirement for how to handle a tie? And note there are
base R functions called min() and max(), and you can ask for things like:
if ( current == min(mydata[1,])) ...
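To make the tie question concrete, here is a small sketch, assuming mydata
holds Janet's example. The names has_tie, is_low and is_high are just mine:

```r
# Janet's example data, typed in from her message
mydata <- data.frame(tree  = c(32, 23, 49),
                     shrub = c(11, 41, 23),
                     grass = c(47, 26, 18))

# TRUE for any row that contains a repeated value, i.e. a tie
has_tie <- apply(mydata, 1, function(r) any(duplicated(r)))

# Compare every element of row 1 against that row's min() and max()
row1    <- unlist(mydata[1, ])
is_low  <- row1 == min(row1)
is_high <- row1 == max(row1)
```

For this data no row has a tie, so exactly one element per row is flagged as
low and one as high. With ties, more than one element could match min() or
max(), and the assignment would need a rule for that case.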
-----Original Message-----
From: Tom Woolman <twoolman at ontargettek.com>
To: Janet Choate <jsc.eco at gmail.com>
Cc: r-help at r-project.org
Sent: Sun, May 29, 2022 3:42 pm
Subject: Re: [R] categorizing data
Some ideas:
You could create a cluster model with k=3 for each of the 3 variables,
to determine what constitutes high/medium/low centroid values for each
of the 3 types of plant types. Centroid values could then be used as the
upper/lower boundary ranges for high/med/low.
Or utilize a histogram for each variable, and use quantiles or
densities, etc. to determine the natural breaks for the high/med/low
ranges for each of the IVs.
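A minimal sketch of the quantile variant for a single variable. The values
in tree are made up for illustration (three rows would be too few to show
anything), and splitting at terciles is an assumption, not a recommendation:

```r
# Hypothetical values for one variable; not taken from the original post
tree <- c(32, 23, 49, 15, 40, 28)

# Tercile boundaries, used here as the low/medium/high break points
brks <- quantile(tree, probs = c(0, 1/3, 2/3, 1))

# Label each value by the range it falls into
tree_cat <- cut(tree, breaks = brks,
                labels = c("low", "medium", "high"),
                include.lowest = TRUE)
tree_cat
```

Note that this labels each value relative to the other values of the same
variable, not relative to the other values in its row.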
On 2022-05-29 15:28, Janet Choate wrote:
> Hi R community,
> I have a data frame with three variables, where each row adds up to 90.
> I want to assign a category of low, medium, or high to the values in each
> row - where the lowest value per row will be set to 10, the medium value
> set to 30, and the high value set to 50 - so each row still adds up to 90.
>
> For example:
> Data: Orig
> tree  shrub  grass
> 32    11     47
> 23    41     26
> 49    23     18
>
> Data: New
> tree  shrub  grass
> 30    10     50
> 10    50     30
> 50    30     10
>
> I am not attaching any code here as I have not been able to write anything
> effective! appreciate help with this!
> thank you,
> JC
>
> --
>
>     [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.