Hi all,

I am *completely* lost in trying to solve a relatively simple task.

I want to compute the relative number of occurrences of an event, the data for which sits in a large table (read from file).

I have the occurrences of the events in a table 'tt':

     0  2 10 11 13 14 15
    15  6  1  3  8 15 10

meaning that an event of type '0' occurs 15 times, type '2' occurs 6 times, etc.

Now I want to divide the occurrence counts by the total number of events of that type, which is given in the table tt2:

      0   1   2  10  11  12  13  14  15
    817 119 524  96 700  66 559 358 283

meaning that event type '0' occurred 817 times, type '1' occurred 119 times, etc.

The obvious problem is that not all events in tt2 are present in tt. That is the result of the experiment, so it cannot be changed.

What needs to be done is to loop over tt, take the occurrence count, and divide it by the corresponding count in tt2. This corresponding tt2 count is *not* at the same index in tt2, so I need a reverse lookup of the type number. For example:

    event type 10:
    occurs 1 time (from table tt)
    occurs 96 times in total (from table tt2)  <- found by looking up type '10' in tt2 and reading out 96

    result: 1/96

I have tried programming this as follows:

    tt  <- table(V32[V48 == 0])  # this is taking the events I want counted
    tt2 <- table(V32)            # this is taking the total event count per type
    df  <- as.data.frame(tt)     # convert to dataframe to allow access to type-numbers .. ?
    df2 <- as.data.frame(tt2)    # same here

    print(tt);
    print(df);

    print(tt2);
    print(df2);

    for( i in 1:length(tt) ) {   # loop over smallest table tt
        print("i:");             # index
        print(i);
        print( "denominator ");  # corresponds to the "1" in the example
        print( df$Freq[i] );
        denomtag = ( df$Var1[ i ] );  # corresponds to the "10" in the example, being the type number of the event
        print("denomtag ");
        print( denomtag );
        print( "nominator: " );
        print( df2[2][ df[1] == as.numeric(denomtag) ] );  # this fails ....
        # result would then be something like: denominator / nominator
    }

The problem is that the factor names extracted in 'denomtag' are not usable as an index into the dataframe in the last line. I have tried converting to numeric using 'as.numeric', but that fails, since it returns the index in the factor rather than the factor name I need from the list.

Any suggestions .. ? I am sure it's dead simple, as always.

Thanks,

Piet (Belgium)

PS: please reply to pvremortNOSPAM at vub.ac.be
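The reverse lookup described in the question can also be done by name instead of by position: a table carries its type labels as names, and indexing with a character vector looks entries up by label. The following is only a minimal sketch, with the counts from the post typed in by hand as named vectors (an assumption for illustration; the original data come from `table()` on a file that is not shown). It also shows the usual way to turn a factor into its labels rather than its internal codes.

```r
# Counts from the post, entered by hand as named vectors (illustrative stand-ins
# for the tables produced by table() in the original code)
tt  <- c("0" = 15, "2" = 6, "10" = 1, "11" = 3, "13" = 8, "14" = 15, "15" = 10)
tt2 <- c("0" = 817, "1" = 119, "2" = 524, "10" = 96, "11" = 700,
         "12" = 66, "13" = 559, "14" = 358, "15" = 283)

# Index tt2 by the *names* of tt: a lookup by label, not by position
ratio <- tt / tt2[names(tt)]

# Type '10': 1 occurrence out of 96
print(ratio["10"])

# as.numeric() on a factor gives the internal level codes;
# going through as.character() first gives the labels as numbers
f <- factor(c("0", "2", "10"))
codes  <- as.numeric(f)
labels <- as.numeric(as.character(f))
print(labels)
```

The same name-based indexing should work on one-dimensional `table` objects, so no explicit loop is needed for this kind of lookup.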
PvR wrote:
> I want to compute the relative number of occurrences of an event, the
> data of which sits in a large table (read from file).
> [...]
> I have tried programming this as follows:

It's *much* easier. Just make V32 a factor. After that, table() knows all the levels and also counts the zeros:

    V32 <- factor(V32)
    table(V32[V48 == 0]) / table(V32)

Uwe Ligges
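To see why the factor approach lines the two tables up, here is a small sketch on made-up data. The values of `V32` and `V48` below are hypothetical stand-ins for the columns in the original file, chosen so that some types have no selected events.

```r
# Hypothetical toy data standing in for the V32 and V48 columns
V32 <- c(0, 0, 1, 2, 2, 2, 10, 10)
V48 <- c(0, 1, 1, 0, 0, 1, 1, 1)

# Converting to a factor fixes the full set of levels once
V32 <- factor(V32)

# Subsetting a factor keeps all its levels, so types without any
# selected event still appear in the table, with a count of 0
selected <- table(V32[V48 == 0])
total    <- table(V32)

# Both tables now have the same length and order, so elementwise division works
ratio <- selected / total
print(ratio)
```

Because both tables are built from the same factor, they share the same levels in the same order, which is what makes the plain division safe here.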
Hi Piet:

I considered an example based on the problem you posed. How about:

    # create three vectors, bind them as rows, then transpose into a data frame
    # (a holds the counts from tt, with 0 filled in for the types missing from tt)
    x <- c(0, 1, 2, 10, 11, 12, 13, 14, 15)
    a <- c(15, 0, 6, 1, 3, 0, 8, 15, 10)
    b <- c(817, 119, 524, 96, 700, 66, 559, 358, 283)
    xab <- rbind(x, a, b)
    xabdf <- as.data.frame(t(xab))

    # now get the a/b values
    abyb <- xabdf[, 2] / xabdf[, 3]

HTH,
Arin Basu

>Message: 1
>Date: Sun, 18 Jul 2004 13:17:42 +0200
>From: PvR <pvremort@vub.ac.be>
>Subject: [R] a problem: factors, names, tables ..
>To: r-help@stat.math.ethz.ch
>
>I want to compute the relative number of occurrences of an event, the data
>of which sits in a large table (read from file).
>[...]

Arindam Basu MD MPH DBI
Assistant Director
Fogarty International Program on Environmental Health in India
IPGMER
244 AJC Bose Road, Kolkata 700027
India
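The example above assumes the counts from tt have already been padded with zeros for the missing types. One possible way to do that padding is `match()`, sketched below. The vector names `types_all`, `types_sel` and `a_sel` are made up for this illustration; the numbers are the ones from the original post.

```r
# All event types and their totals (from tt2 in the original post)
types_all <- c(0, 1, 2, 10, 11, 12, 13, 14, 15)
b <- c(817, 119, 524, 96, 700, 66, 559, 358, 283)

# Types that actually occur in the selection, and their counts (from tt)
types_sel <- c(0, 2, 10, 11, 13, 14, 15)
a_sel <- c(15, 6, 1, 3, 8, 15, 10)

# Start with zeros, then drop each selected count into the slot of its type
a <- numeric(length(types_all))
a[match(types_sel, types_all)] <- a_sel

# Same layout as the data frame above, built directly with data.frame()
xabdf <- data.frame(x = types_all, a = a, b = b)
xabdf$ratio <- xabdf$a / xabdf$b
print(xabdf)
```

Building the data frame with `data.frame()` keeps each column's own type, whereas the `rbind()` plus `t()` route goes through a numeric matrix first; for all-numeric data like this the result is equivalent.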