Hello All:

I would like some help with the following problem, and I would like to understand how it can be done efficiently in R. The data frame has the header shown in the first row.

Component, TLA
C1, TLA1
C2, TLA1
C1, TLA2
C3, TLA2
C4, TLA3
C5, TLA3

Notice that C1 is a component of both TLA1 and TLA2.

I would like to split the rows into mutually exclusive subsets and create a new column called Group that identifies the subset each row belongs to. For the above data, the subsets and the new Group column values would be:

Component, TLA, Group
C1, TLA1, 1
C2, TLA1, 1
C1, TLA2, 1
C3, TLA2, 1
C4, TLA3, 2
C5, TLA3, 2

I appreciate any help on this. I could loop through the observations and apply some logic, but I have not tried that yet.

--
Satish Vadlamani
It isn't at all clear to me how you are creating the groups. They aren't the unique combinations of Component and TLA. They might be based only on TLA value: in your example TLA1 and TLA2 form one group, and TLA3 the other.

Without understanding your logic, I can't replicate it with R code.

Sarah

On Sun, Mar 27, 2016 at 8:56 PM, Satish Vadlamani <satish.vadlamani at gmail.com> wrote:
> [...]
> [[alternative HTML version deleted]]

And please don't post in HTML.
Satish,

If you rearrange your data into a network of nodes and edges, you can use the igraph package to identify disconnected (mutually exclusive) groups.

# example data
df <- data.frame(
  Component = c("C1", "C2", "C1", "C3", "C4", "C5"),
  TLA = c("TLA1", "TLA1", "TLA2", "TLA2", "TLA3", "TLA3")
)

# characterize data as a network of nodes and edges
# nodes: every distinct label, components and TLAs together
nodes <- unique(unlist(df))
# edges: one row per data row, holding the node index of its Component and of its TLA
edges <- apply(df, 2, match, nodes)

# use the igraph package to identify disconnected groups
library(igraph)
# build an undirected graph with one edge per row of the two-column matrix
g <- graph_from_edgelist(edges, directed = FALSE)
# membership gives the connected-component number of each node
ngroup <- components(g)$membership
df$Group <- ngroup[match(df$Component, nodes)]
df

  Component  TLA Group
1        C1 TLA1     1
2        C2 TLA1     1
3        C1 TLA2     1
4        C3 TLA2     1
5        C4 TLA3     2
6        C5 TLA3     2

Jean

On Sun, Mar 27, 2016 at 7:56 PM, Satish Vadlamani <satish.vadlamani at gmail.com> wrote:
> [...]
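The same node-and-edge idea can also be sketched by letting igraph build the graph from the data frame by name, so that no index matching is needed. This is only an illustration under stated assumptions: it relies on `graph_from_data_frame()` treating the first two columns as an edge list, and it assumes that no Component label is identical to a TLA label (otherwise the two would collapse into a single node).

```r
library(igraph)

# Example data from the original question.
df <- data.frame(
  Component = c("C1", "C2", "C1", "C3", "C4", "C5"),
  TLA       = c("TLA1", "TLA1", "TLA2", "TLA2", "TLA3", "TLA3"),
  stringsAsFactors = FALSE
)

# Each row becomes one undirected edge between a component and its TLA.
g <- graph_from_data_frame(df, directed = FALSE)

# Connected components of the graph are the mutually exclusive groups.
# For a named graph, membership is a vector named by vertex label.
membership <- components(g)$membership

# Look up each row's group through its TLA label.
df$Group <- unname(membership[df$TLA])
df
```

Looking rows up by TLA or by Component gives the same answer, because both ends of an edge always sit in the same connected component.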
Jean:

Wow. Thank you so much for this. I will read up on igraph and then see whether this will work for my larger dataset. Thanks for the wonderful code snippet you wrote.

Basically, the requirement is this: a TLA (Top Level Assembly), say TLA1, and its components should belong to the same group. If one of those components also belongs to a different TLA (say TLA2), then TLA2 and all of its components should belong to the same group as TLA1.

Are these types of questions appropriate for this group?

Thanks,
Satish

On Mar 28, 2016 9:10 AM, "Adams, Jean" <jvadams at usgs.gov> wrote:
> [...]
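The grouping rule described in this thread (rows that share a TLA belong together, and a component shared between TLAs merges their groups) can also be sketched in base R without any package. The function name `label_groups` is hypothetical and introduced here only for illustration. The approach is a simple label propagation, not the igraph method: every row starts with its TLA's label, and rows sharing a component or a TLA repeatedly adopt the smallest label among them until nothing changes.

```r
# Hypothetical helper: assign a group number to each (component, tla) row.
label_groups <- function(component, tla) {
  # Start with one provisional label per distinct TLA.
  group <- match(tla, unique(tla))
  repeat {
    previous <- group
    # Rows sharing a component take the smallest label among them.
    group <- ave(group, component, FUN = min)
    # Rows sharing a TLA take the smallest label among them.
    group <- ave(group, tla, FUN = min)
    # Stop once a full pass changes nothing.
    if (identical(group, previous)) break
  }
  # Renumber the surviving labels as 1, 2, 3, ... in order of appearance.
  match(group, unique(group))
}

# Example data from the original question.
df <- data.frame(
  Component = c("C1", "C2", "C1", "C3", "C4", "C5"),
  TLA       = c("TLA1", "TLA1", "TLA2", "TLA2", "TLA3", "TLA3"),
  stringsAsFactors = FALSE
)
df$Group <- label_groups(df$Component, df$TLA)
df
```

The number of passes grows with the length of the longest chain of TLAs linked through shared components, so for a large dataset a graph library is likely to be the faster choice.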