Hi Jim, Thanks for your time. But somehow this code did not help me to achieve my expected output. Problems: 1) x, y are coming as logical rather than values as I mentioned in my post 2) The values that I get for Max A and Max B not correct 3) It looks like a pretty big data, but I just need to concatenate the values with a comma, the final output will be a character variable. Regards, Sri On Thu, Apr 21, 2016 at 4:52 AM, Jim Lemon <drjimlemon at gmail.com> wrote:> Hi sri, > As your problem involves a few logical steps, I found it easier to > approach it in a stepwise way. Perhaps there are more elegant ways to > accomplish this. > > svdat<-read.table(text="Count id name type > 117 335 sally A > 19 335 sally A > 167 335 sally B > 18 340 susan A > 56 340 susan A > 22 340 susan B > 53 340 susan B > 135 351 lee A > 114 351 lee A > 84 351 lee A > 80 351 lee A > 19 351 lee A > 8 351 lee A > 21 351 lee A > 88 351 lee B > 111 351 lee B > 46 351 lee B > 108 351 lee B",header=TRUE) > # you can also do this with other reshape functions > library(prettyR) > svdatstr<-stretch_df(svdat,"id",c("Count","type")) > count_ind<-grep("Count",names(svdatstr)) > type_ind<-grep("type",names(svdatstr)) > svdatstr$maxA<-NA > svdatstr$maxB<-NA > svdatstr$x<-NA > svdatstr$y<-NA > for(row in 1:nrow(svdatstr)) { > svdatstr[row,"maxA"]<- > > max(svdatstr[row,count_ind[as.logical(match(svdatstr[1,type_ind],"A",0))]]) > svdatstr[row,"maxB"]<- > > max(svdatstr[row,count_ind[as.logical(match(svdatstr[1,type_ind],"B",0))]]) > svdatstr[row,"x"]<-svdatstr[row,"maxA"] < svdatstr[row,"maxB"] > svdatstr[row,"y"]<-!svdatstr[row,"x"] > } > svdatstr > > You can then just extract the columns that you need. > > Jim > > > On Wed, Apr 20, 2016 at 3:03 PM, sri vathsan <srivibish at gmail.com> wrote: > > Dear All, > > > > I am trying to reshape the data with some conditions. A small part of the > > data looks like below. Like this there will be more data with repeating > ID. > > > > Count id name type > > 117 335 sally A > > 19 335 sally A > > 167 335 sally B > > 18 340 susan A > > 56 340 susan A > > 22 340 susan B > > 53 340 susan B > > 135 351 lee A > > 114 351 lee A > > 84 351 lee A > > 80 351 lee A > > 19 351 lee A > > 8 351 lee A > > 21 351 lee A > > 88 351 lee B > > 111 351 lee B > > 46 351 lee B > > 108 351 lee B > > > > >From the above data I am expecting an output like below. > > > > id name type count_of_B Max of count B x y > > 335 sally B 167 167 117,19 NA > > 340 susan B 22,53 53 18 56 > > 351 lee B 88,111,46,108 111 84,80,19,8,2 135,114 > > > > Where, the column x and column y are: > > > > x = Count_A_less_than_max of (Count type B) > > y = Count_A_higher_than_max of (Count type B). > > > > *1)* I tried with dplyr with the following code for the initial step to > get > > the values for each column. > > *2)* I thought to transpose the columns which has the unique ID alone. > > > > I tried with the following code and I am struck with the intial step > > itself. The code is executed but higher and lower value of A is not > coming. > > > > Expected_output= data %>% > > group_by(id, Type) %>% > > mutate(Count_of_B = paste(unlist(count[Type=="B"]), collapse = ","))%>% > > mutate(Max_of_count_B = ifelse(Type == "B", max(count[Type => > "B"]),max(count[Type == "A"]))) %>% > > mutate(count_type_A_lesser = ifelse > > (Type=="B",(paste(unlist(count[Type=="A"]) < Max_of_count_B[Type=="B"], > > collapse = ",")), "NA"))%>% > > mutate(count_type_A_higher > > ifelse(Type=="B",(paste(unlist(count[Type=="A"]) > > > Max_of_count_B[Type=="B"], collapse = ",")), "NA")) > > > > I hope I make my point clear. Please bare with the code, as I am new to > > this. > > > > Regards, > > sri > > > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. >-- Regards, Srivathsan.K Phone : 9600165206 [[alternative HTML version deleted]]
Hi sri, I think that I see what you mean. Your statements: x = Count_A_less_than_max of (Count type B) y = Count_A_higher_than_max of (Count type B). I took to mean that you wanted a logical value for x and y. Looking more closely at your initial message, I see that you wanted _all_ values of A with respect to maxB in x and y. The error with maximum values was due to a typo. Perhaps this will do what you want: svdat<-read.table(text="Count id name type 117 335 sally A 19 335 sally A 167 335 sally B 18 340 susan A 56 340 susan A 22 340 susan B 53 340 susan B 135 351 lee A 114 351 lee A 84 351 lee A 80 351 lee A 19 351 lee A 8 351 lee A 21 351 lee A 88 351 lee B 111 351 lee B 46 351 lee B 108 351 lee B",header=TRUE) # you can also do this with other reshape functions library(prettyR) svdatstr<-stretch_df(svdat,"id",c("Count","type")) count_ind<-grep("Count",names(svdatstr)) type_ind<-grep("type",names(svdatstr)) svdatstr$maxA<-NA svdatstr$maxB<-NA svdatstr$x<-NA svdatstr$y<-NA for(row in 1:nrow(svdatstr)) { indicesA<-count_ind[as.logical(match(svdatstr[row,type_ind],"A",0))] svdatstr[row,"maxA"]<-max(svdatstr[row,indicesA]) indicesB<-count_ind[as.logical(match(svdatstr[row,type_ind],"B",0))] svdatstr[row,"maxB"]<-max(svdatstr[row,indicesB]) AltB<-svdatstr[row,indicesA][svdatstr[row,indicesA]<svdatstr[row,"maxB"]] svdatstr[row,"x"]<-paste(AltB,collapse=",") AgeB<-svdatstr[row,indicesA][svdatstr[row,indicesA]>=svdatstr[row,"maxB"]] svdatstr[row,"y"]<-paste(AgeB,collapse=",") } svdatstr[,c("id","name","maxB","x","y")] Jim On Thu, Apr 21, 2016 at 2:23 PM, sri vathsan <srivibish at gmail.com> wrote:> Hi Jim, > > Thanks for your time. But somehow this code did not help me to achieve my > expected output. > Problems: 1) x, y are coming as logical rather than values as I mentioned in > my post > 2) The values that I get for Max A and Max B not correct > 3) It looks like a pretty big data, but I just need to > concatenate the values with a comma, the final output will be a character > variable. > > Regards, > Sri > > On Thu, Apr 21, 2016 at 4:52 AM, Jim Lemon <drjimlemon at gmail.com> wrote: >> >> Hi sri, >> As your problem involves a few logical steps, I found it easier to >> approach it in a stepwise way. Perhaps there are more elegant ways to >> accomplish this. >> >> svdat<-read.table(text="Count id name type >> 117 335 sally A >> 19 335 sally A >> 167 335 sally B >> 18 340 susan A >> 56 340 susan A >> 22 340 susan B >> 53 340 susan B >> 135 351 lee A >> 114 351 lee A >> 84 351 lee A >> 80 351 lee A >> 19 351 lee A >> 8 351 lee A >> 21 351 lee A >> 88 351 lee B >> 111 351 lee B >> 46 351 lee B >> 108 351 lee B",header=TRUE) >> # you can also do this with other reshape functions >> library(prettyR) >> svdatstr<-stretch_df(svdat,"id",c("Count","type")) >> count_ind<-grep("Count",names(svdatstr)) >> type_ind<-grep("type",names(svdatstr)) >> svdatstr$maxA<-NA >> svdatstr$maxB<-NA >> svdatstr$x<-NA >> svdatstr$y<-NA >> for(row in 1:nrow(svdatstr)) { >> svdatstr[row,"maxA"]<- >> >> max(svdatstr[row,count_ind[as.logical(match(svdatstr[1,type_ind],"A",0))]]) >> svdatstr[row,"maxB"]<- >> >> max(svdatstr[row,count_ind[as.logical(match(svdatstr[1,type_ind],"B",0))]]) >> svdatstr[row,"x"]<-svdatstr[row,"maxA"] < svdatstr[row,"maxB"] >> svdatstr[row,"y"]<-!svdatstr[row,"x"] >> } >> svdatstr >> >> You can then just extract the columns that you need. >> >> Jim >> >> >> On Wed, Apr 20, 2016 at 3:03 PM, sri vathsan <srivibish at gmail.com> wrote: >> > Dear All, >> > >> > I am trying to reshape the data with some conditions. A small part of >> > the >> > data looks like below. Like this there will be more data with repeating >> > ID. >> > >> > Count id name type >> > 117 335 sally A >> > 19 335 sally A >> > 167 335 sally B >> > 18 340 susan A >> > 56 340 susan A >> > 22 340 susan B >> > 53 340 susan B >> > 135 351 lee A >> > 114 351 lee A >> > 84 351 lee A >> > 80 351 lee A >> > 19 351 lee A >> > 8 351 lee A >> > 21 351 lee A >> > 88 351 lee B >> > 111 351 lee B >> > 46 351 lee B >> > 108 351 lee B >> > >> > >From the above data I am expecting an output like below. >> > >> > id name type count_of_B Max of count B x y >> > 335 sally B 167 167 117,19 NA >> > 340 susan B 22,53 53 18 56 >> > 351 lee B 88,111,46,108 111 84,80,19,8,2 135,114 >> > >> > Where, the column x and column y are: >> > >> > x = Count_A_less_than_max of (Count type B) >> > y = Count_A_higher_than_max of (Count type B). >> > >> > *1)* I tried with dplyr with the following code for the initial step to >> > get >> > the values for each column. >> > *2)* I thought to transpose the columns which has the unique ID alone. >> > >> > I tried with the following code and I am struck with the intial step >> > itself. The code is executed but higher and lower value of A is not >> > coming. >> > >> > Expected_output= data %>% >> > group_by(id, Type) %>% >> > mutate(Count_of_B = paste(unlist(count[Type=="B"]), collapse >> > ","))%>% >> > mutate(Max_of_count_B = ifelse(Type == "B", max(count[Type =>> > "B"]),max(count[Type == "A"]))) %>% >> > mutate(count_type_A_lesser = ifelse >> > (Type=="B",(paste(unlist(count[Type=="A"]) < Max_of_count_B[Type=="B"], >> > collapse = ",")), "NA"))%>% >> > mutate(count_type_A_higher >> > ifelse(Type=="B",(paste(unlist(count[Type=="A"]) > >> > Max_of_count_B[Type=="B"], collapse = ",")), "NA")) >> > >> > I hope I make my point clear. Please bare with the code, as I am new to >> > this. >> > >> > Regards, >> > sri >> > >> > [[alternative HTML version deleted]] >> > >> > ______________________________________________ >> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see >> > https://stat.ethz.ch/mailman/listinfo/r-help >> > PLEASE do read the posting guide >> > http://www.R-project.org/posting-guide.html >> > and provide commented, minimal, self-contained, reproducible code. > > > > > -- > > Regards, > Srivathsan.K > Phone : 9600165206
In R, square brackets [] are called "extraction operators" as they are interpreted so as to "extract" the parts of an object specified by the information within them. Your message contained only part of the line below: AltB<-svdatstr[row,indicesA][svdatstr[row,indicesA]<svdatstr[row,"maxB"]] Extraction operators may be specified sequentially or nested. Both are used in this line. However, to understand how it works, first look at the line that defines the value of "indicesA": indicesA<-count_ind[as.logical(match(svdatstr[row,type_ind],"A",0))] This defines a vector of logical (TRUE/FALSE) values for the valid "Count" values in the current row by finding the matches in the "type" values for the character string "A". The reason I have used the "match" function instead of "=="" is to get rid of the NA values that would have been generated by specifying "0" (zero) in the "incomparables" argument. I then used the "as.logical" function to transform the "1" AND "0" to TRUE and FALSE..This logical vector can then be placed within square brackets to "extract" the valid "Count" values that have a "type" of "A" in the current row of svdatstr. Having done this, the expression within the latter pair of extraction operators can be translated as "Return a logical vector in which the TRUE values correspond to the "Count" values with a type of "A" that are less than the maximum value for type "B" count values". The expression within the first pair of extraction operators can be translated, "Extract all of the values of "Count" with a type of "A" in the current row". So, the second extraction then operates on those values, producing a vector of all of the "Count" values that are less than the value of "maxB". The line following this in the code uses the same logic to extract those values that are greater than or equal to "maxB". Jim On Thu, Apr 21, 2016 at 11:25 PM, sri vathsan <srivibish at gmail.com> wrote:> Dear Jim, > > I hope, I am not pestering you. When I was studying your code to grasp and > learn and found one thing which I can not understand the following line. I > would be grateful If you could tell what it exactly denote and this helps me > to grow better in R. > > I understand the first part of the code will check the rows of indicesA, but > I finding difficulty to grasp the second part starting with the square > bracket. > > "AltB<-svdatstr[row,indicesA][svdatstr[row,indicesA]" >