thr3ads.net - R help - [R] Data reshaping with conditions [Apr 2016]

If this information is useful, please help other people find it:
Share via:

sri vathsan

2016-Apr-20 05:03 UTC

[R] Data reshaping with conditions

Dear All,

I am trying to reshape the data with some conditions. A small part of the
data looks like below. Like this there will be more data with repeating ID.

Count id name type
117 335 sally A
19 335 sally A
167 335 sally B
18 340 susan A
56 340 susan A
22 340 susan B
53 340 susan B
135 351 lee A
114 351 lee A
84 351 lee A
80 351 lee A
19 351 lee A
8 351 lee A
21 351 lee A
88 351 lee B
111 351 lee B
46 351 lee B
108 351 lee B
>From the above data I am expecting an output like below.
id name type count_of_B Max of count B     x               y
335 sally B 167 167 117,19      NA
340 susan B 22,53 53 18              56
351 lee B 88,111,46,108  111 84,80,19,8,2   135,114

Where, the column x and column y are:

x = Count_A_less_than_max of (Count type B)
y = Count_A_higher_than_max of (Count type B)?.

*1)* I tried with dplyr with the following code for the initial step to get
the values for each column.
*2)*  I thought to transpose the columns which has the unique ID alone.

I tried with the following code and I am struck with the intial step
itself. The code is executed but higher and lower value of A is not coming.

Expected_output= data %>%
  group_by(id, Type) %>%
  mutate(Count_of_B = paste(unlist(count[Type=="B"]), collapse =
","))%>%
  mutate(Max_of_count_B = ifelse(Type == "B", max(count[Type
="B"]),max(count[Type == "A"]))) %>%
  mutate(count_type_A_lesser = ifelse
(Type=="B",(paste(unlist(count[Type=="A"]) <
Max_of_count_B[Type=="B"],
collapse = ",")), "NA"))%>%
  mutate(count_type_A_higher
ifelse(Type=="B",(paste(unlist(count[Type=="A"]) >
Max_of_count_B[Type=="B"], collapse = ",")),
"NA"))

I hope I make my point clear. Please bare with the code, as I am new to
this.

Regards,
?sri

	[[alternative HTML version deleted]]

Jim Lemon

2016-Apr-20 23:22 UTC

head link

[R] Data reshaping with conditions

Hi sri,
As your problem involves a few logical steps, I found it easier to
approach it in a stepwise way. Perhaps there are more elegant ways to
accomplish this.

svdat<-read.table(text="Count id name type
117 335 sally A
19 335 sally A
167 335 sally B
18 340 susan A
56 340 susan A
22 340 susan B
53 340 susan B
135 351 lee A
114 351 lee A
84 351 lee A
80 351 lee A
19 351 lee A
8 351 lee A
21 351 lee A
88 351 lee B
111 351 lee B
46 351 lee B
108 351 lee B",header=TRUE)
# you can also do this with other reshape functions
library(prettyR)
svdatstr<-stretch_df(svdat,"id",c("Count","type"))
count_ind<-grep("Count",names(svdatstr))
type_ind<-grep("type",names(svdatstr))
svdatstr$maxA<-NA
svdatstr$maxB<-NA
svdatstr$x<-NA
svdatstr$y<-NA
for(row in 1:nrow(svdatstr)) {
 svdatstr[row,"maxA"]<-
 
max(svdatstr[row,count_ind[as.logical(match(svdatstr[1,type_ind],"A",0))]])
 svdatstr[row,"maxB"]<-
 
max(svdatstr[row,count_ind[as.logical(match(svdatstr[1,type_ind],"B",0))]])
 svdatstr[row,"x"]<-svdatstr[row,"maxA"] <
svdatstr[row,"maxB"]
 svdatstr[row,"y"]<-!svdatstr[row,"x"]
}
svdatstr

You can then just extract the columns that you need.

Jim


On Wed, Apr 20, 2016 at 3:03 PM, sri vathsan <srivibish at gmail.com>
wrote:> Dear All,
>
> I am trying to reshape the data with some conditions. A small part of the
> data looks like below. Like this there will be more data with repeating ID.
>
> Count id name type
> 117 335 sally A
> 19 335 sally A
> 167 335 sally B
> 18 340 susan A
> 56 340 susan A
> 22 340 susan B
> 53 340 susan B
> 135 351 lee A
> 114 351 lee A
> 84 351 lee A
> 80 351 lee A
> 19 351 lee A
> 8 351 lee A
> 21 351 lee A
> 88 351 lee B
> 111 351 lee B
> 46 351 lee B
> 108 351 lee B
>
> >From the above data I am expecting an output like below.
>
> id name type count_of_B Max of count B     x               y
> 335 sally B 167 167 117,19      NA
> 340 susan B 22,53 53 18              56
> 351 lee B 88,111,46,108  111 84,80,19,8,2   135,114
>
> Where, the column x and column y are:
>
> x = Count_A_less_than_max of (Count type B)
> y = Count_A_higher_than_max of (Count type B).
>
> *1)* I tried with dplyr with the following code for the initial step to get
> the values for each column.
> *2)*  I thought to transpose the columns which has the unique ID alone.
>
> I tried with the following code and I am struck with the intial step
> itself. The code is executed but higher and lower value of A is not coming.
>
> Expected_output= data %>%
>   group_by(id, Type) %>%
>   mutate(Count_of_B = paste(unlist(count[Type=="B"]), collapse =
","))%>%
>   mutate(Max_of_count_B = ifelse(Type == "B", max(count[Type
=> "B"]),max(count[Type == "A"]))) %>%
>   mutate(count_type_A_lesser = ifelse
> (Type=="B",(paste(unlist(count[Type=="A"]) <
Max_of_count_B[Type=="B"],
> collapse = ",")), "NA"))%>%
>   mutate(count_type_A_higher >
ifelse(Type=="B",(paste(unlist(count[Type=="A"]) >
> Max_of_count_B[Type=="B"], collapse = ",")),
"NA"))
>
> I hope I make my point clear. Please bare with the code, as I am new to
> this.
>
> Regards,
> sri
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

sri vathsan

2016-Apr-21 04:23 UTC

head link

[R] Data reshaping with conditions

Hi Jim,

Thanks for your time. But somehow this code did not help me to achieve my
expected output.
Problems: 1) x, y are coming as logical rather than values as I mentioned
in my post
               2) The values that I get for Max A and Max B not correct
               3) It looks like a pretty big data, but I just need to
concatenate the values with a comma, the final output will be a character
variable.

Regards,
Sri

On Thu, Apr 21, 2016 at 4:52 AM, Jim Lemon <drjimlemon at gmail.com>
wrote:
> Hi sri,
> As your problem involves a few logical steps, I found it easier to
> approach it in a stepwise way. Perhaps there are more elegant ways to
> accomplish this.
>
> svdat<-read.table(text="Count id name type
> 117 335 sally A
> 19 335 sally A
> 167 335 sally B
> 18 340 susan A
> 56 340 susan A
> 22 340 susan B
> 53 340 susan B
> 135 351 lee A
> 114 351 lee A
> 84 351 lee A
> 80 351 lee A
> 19 351 lee A
> 8 351 lee A
> 21 351 lee A
> 88 351 lee B
> 111 351 lee B
> 46 351 lee B
> 108 351 lee B",header=TRUE)
> # you can also do this with other reshape functions
> library(prettyR)
>
svdatstr<-stretch_df(svdat,"id",c("Count","type"))
> count_ind<-grep("Count",names(svdatstr))
> type_ind<-grep("type",names(svdatstr))
> svdatstr$maxA<-NA
> svdatstr$maxB<-NA
> svdatstr$x<-NA
> svdatstr$y<-NA
> for(row in 1:nrow(svdatstr)) {
>  svdatstr[row,"maxA"]<-
>
>
max(svdatstr[row,count_ind[as.logical(match(svdatstr[1,type_ind],"A",0))]])
>  svdatstr[row,"maxB"]<-
>
>
max(svdatstr[row,count_ind[as.logical(match(svdatstr[1,type_ind],"B",0))]])
>  svdatstr[row,"x"]<-svdatstr[row,"maxA"] <
svdatstr[row,"maxB"]
>  svdatstr[row,"y"]<-!svdatstr[row,"x"]
> }
> svdatstr
>
> You can then just extract the columns that you need.
>
> Jim
>
>
> On Wed, Apr 20, 2016 at 3:03 PM, sri vathsan <srivibish at gmail.com>
wrote:
> > Dear All,
> >
> > I am trying to reshape the data with some conditions. A small part of
the
> > data looks like below. Like this there will be more data with
repeating
> ID.
> >
> > Count id name type
> > 117 335 sally A
> > 19 335 sally A
> > 167 335 sally B
> > 18 340 susan A
> > 56 340 susan A
> > 22 340 susan B
> > 53 340 susan B
> > 135 351 lee A
> > 114 351 lee A
> > 84 351 lee A
> > 80 351 lee A
> > 19 351 lee A
> > 8 351 lee A
> > 21 351 lee A
> > 88 351 lee B
> > 111 351 lee B
> > 46 351 lee B
> > 108 351 lee B
> >
> > >From the above data I am expecting an output like below.
> >
> > id name type count_of_B Max of count B     x               y
> > 335 sally B 167 167 117,19      NA
> > 340 susan B 22,53 53 18              56
> > 351 lee B 88,111,46,108  111 84,80,19,8,2   135,114
> >
> > Where, the column x and column y are:
> >
> > x = Count_A_less_than_max of (Count type B)
> > y = Count_A_higher_than_max of (Count type B).
> >
> > *1)* I tried with dplyr with the following code for the initial step
to
> get
> > the values for each column.
> > *2)*  I thought to transpose the columns which has the unique ID
alone.
> >
> > I tried with the following code and I am struck with the intial step
> > itself. The code is executed but higher and lower value of A is not
> coming.
> >
> > Expected_output= data %>%
> >   group_by(id, Type) %>%
> >   mutate(Count_of_B = paste(unlist(count[Type=="B"]),
collapse = ","))%>%
> >   mutate(Max_of_count_B = ifelse(Type == "B", max(count[Type
=> > "B"]),max(count[Type == "A"]))) %>%
> >   mutate(count_type_A_lesser = ifelse
> > (Type=="B",(paste(unlist(count[Type=="A"]) <
Max_of_count_B[Type=="B"],
> > collapse = ",")), "NA"))%>%
> >   mutate(count_type_A_higher > >
ifelse(Type=="B",(paste(unlist(count[Type=="A"]) >
> > Max_of_count_B[Type=="B"], collapse = ",")),
"NA"))
> >
> > I hope I make my point clear. Please bare with the code, as I am new
to
> > this.
> >
> > Regards,
> > sri
> >
> >         [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>


-- 

Regards,
Srivathsan.K
Phone : 9600165206

	[[alternative HTML version deleted]]

Reasonably Related Threads

Search for more maybe matching threads

R help - Apr 2016 - Data reshaping with conditions

[R] Data reshaping with conditions

[R] Data reshaping with conditions

[R] Data reshaping with conditions

Reasonably Related Threads