Frederik Lyngsaa Lang
2011-Nov-16 09:25 UTC
[R] Replacing values in matrix/dataframe according to changing criteria
Hi there,

I am doing some network analysis working with k-cliques, and over time I want to see which nodes are members of which cliques and how big these cliques are. I have managed to produce a matrix which shows which k-cliques each node is part of over the 100 time periods (slow though), but I cannot seem to calculate the size of each k-clique, which is actually just a count.

Basically I have a dataframe like this, with V1 being the node IDs and V2-V4 showing which k-cliques the nodes are part of:

df <- as.data.frame(cbind(c(1,2,3,4,5,6,7,8),c(1,1,1,1,2,2,2,2),c(0,0,0,2,0,0,0,0),c(0,0,0,0,0,0,0,0)))

# What I want is a dataframe like this where each k-clique value is replaced by its size:

wanted <- as.data.frame(cbind(c(1,2,3,4,5,6,7,8),c(4,4,4,4,5,5,5,5),c(0,0,0,5,0,0,0,0),c(0,0,0,0,0,0,0,0)))

It seems simple, but I cannot get it working since these dataframes grow for each time period and the k-cliques change. I have tried using a loop that refers to a table of the values, but it does not work. I am sure there is an easy way, however.

Hope to get some help,

Kind regards,

Frederik
R. Michael Weylandt
2011-Nov-16 17:39 UTC
[R] Replacing values in matrix/dataframe according to changing criteria
This is a little ugly, but I think it should work pretty robustly:

T <- table(unlist(df[,-1]))
# Take a close look at this to see how it works: it's the key to the whole
# thing. It creates something we will use roughly like a hash table (or, more
# accurately, like a Python dictionary), if that term is familiar to you.

T[names(T) == "0"] <- 0 # Set zero back to zero if desired

df[,-1] <- T[match(unlist(df[,-1]), names(T))]

A heads up: if your data are floating point numbers rather than integers, you might run into R FAQ 7.31 trouble (numbers that look the same but do not compare as equal). That is a much tougher problem to work around, but hopefully still possible.

Michael
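For readers who want to step through the lookup idea from the reply, here is a minimal sketch on the example data from the question. The names `sizes` and `result` are illustrative and not from the thread. It also fills in one column at a time with `lapply()` instead of assigning one long vector to `df[,-1]`, which is a small variation on the posted code and not the poster's exact method.

```r
# Example data from the question: V1 holds node IDs,
# V2-V4 hold k-clique labels (0 means "no clique").
df <- as.data.frame(cbind(c(1,2,3,4,5,6,7,8),
                          c(1,1,1,1,2,2,2,2),
                          c(0,0,0,2,0,0,0,0),
                          c(0,0,0,0,0,0,0,0)))

# Count how often each label occurs across all clique columns.
# The names of the table are the labels as character strings.
sizes <- table(unlist(df[, -1]))

# Label 0 is a placeholder rather than a clique, so reset its count to 0.
# The logical index is a no-op if no 0 label is present.
sizes[names(sizes) == "0"] <- 0

# Look each label up by name and replace it with its count,
# one column at a time. as.vector() drops the table attributes.
result <- df
result[-1] <- lapply(df[-1], function(col) {
  as.vector(sizes[match(col, names(sizes))])
})

print(result)
```

On this data, clique 1 has size 4 and clique 2 has size 5, so `result` should agree with the `wanted` data frame from the question. The lookup relies on `match()` comparing the numeric labels to the character names of the table, which is why non-integer labels can run into the floating point caveat mentioned above.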