thr3ads.net - R help - [R] a problem: factors, names, tables .. [Jul 2004]

PvR

2004-Jul-18 11:17 UTC

[R] a problem: factors, names, tables ..

Hi all,

I am *completely* lost in trying to solve a relatively simple task.

I want to compute the relative number of occurrences of an event. The
data sit in a large table (read from file).

I have the occurrences of the events in a table 'tt':

 0  2 10 11 13 14 15
15  6  1  3  8 15 10

.. meaning that event type '0' occurs 15 times, type '2' occurs 6 times,
etc.

Now I want to divide the occurrence counts by the total number of events
of that type, which is given in the table tt2:

  0   1   2  10  11  12  13  14  15
817 119 524  96 700  66 559 358 283

This says that event type '0' occurs 817 times, type '1' occurs 119 times,
etc.

The obvious problem is that not all event types in tt2 are present in tt.
That is a result of the experiment, so it cannot be changed.

What needs to be done is to loop over tt, take each occurrence count, and
divide it by the corresponding count in tt2.  The corresponding tt2 count
is *not* at the same index in tt2, so I need a reverse lookup by type
number.  For example:

event type 10:
occurs 1 time (from table tt)
occurs 96 times in total (from table tt2)  <- found by looking up type '10' in tt2 and reading out 96

result: 1/96
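
The lookup described above can be done by name instead of by position,
because a table carries its type labels as names. A minimal sketch,
assuming the two tables are rebuilt by hand from the counts quoted in this
post (the c()/as.table() construction is only for illustration and is not
how the data were actually read in):

```r
# Counts copied from the post; built by hand for illustration only.
tt  <- as.table(c("0" = 15, "2" = 6, "10" = 1, "11" = 3,
                  "13" = 8, "14" = 15, "15" = 10))
tt2 <- as.table(c("0" = 817, "1" = 119, "2" = 524, "10" = 96, "11" = 700,
                  "12" = 66, "13" = 559, "14" = 358, "15" = 283))

# Index tt2 by the *names* of tt, so each count is divided by the total
# for its own event type, wherever that type sits in tt2.
ratio <- tt / tt2[names(tt)]

ratio["10"]   # the 1/96 case from the example
```

Types that are missing from tt (here '1' and '12') simply do not appear in
the result with this approach.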



I have tried programming this as follows:


tt <- table(V32[V48 == 0])   # the events I want counted
tt2 <- table(V32)            # the total event count per type
df <- as.data.frame(tt)      # convert to data frame to allow access to type numbers .. ?
df2 <- as.data.frame(tt2)    # same here

print(tt);
print(df);

print(tt2);
print(df2);

for( i in 1:length(tt) ) { # loop over smallest table tt
	print("i:"); # index
	print(i);
	print( "numerator: "); # corresponds to the "1" in the example
	print( df$Freq[i] );
	denomtag = ( df$Var1[ i ] );	# corresponds to the "10" in the example, being the type number of the event
	print("denomtag ");
	print( denomtag );
	print( "denominator: " );
	print( df2[2][ df[1] == as.numeric(denomtag) ] );  # this fails ....
	# result would then be something like:  numerator / denominator
}

The problem is that the factor levels extracted into 'denomtag' are not
usable as an index into the data frame in the last line.  I have tried
converting to numeric using 'as.numeric', but that fails because it
returns the position of the level within the factor rather than the
level's label, which is what I need for the lookup.
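
The behaviour described here can be reproduced in a few lines. This is a
small sketch with a made-up factor `f` (not taken from the poster's data):
as.numeric() on a factor returns the internal level codes, while going
through as.character() first recovers the labels.

```r
# Hypothetical factor holding a few of the event types from the post.
f <- factor(c(0, 2, 10, 11))

codes  <- as.numeric(f)                 # 1 2 3 4: positions in levels(f)
values <- as.numeric(as.character(f))   # 0 2 10 11: the labels themselves
```

For a lookup by type, as.character(f) on its own is usually enough, since
tables and named vectors can be indexed by name.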

Any suggestions .. ?   I am sure it's dead simple, as always.


Thanks,


Piet (Belgium)

PS: please reply to pvremortNOSPAM at vub.ac.be

Uwe Ligges

2004-Jul-18 12:05 UTC


[R] a problem: factors, names, tables ..


It's *much* easier. Just make V32 a factor. After that, table() knows
all the levels and also counts the zeros, so both tables have the same
length and can be divided directly:

V32 <- factor(V32)
table(V32[V48 == 0]) / table(V32)
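
As an illustration of why this works, here is a small sketch with
invented toy vectors standing in for V32 and V48 (the values are
hypothetical, chosen so that one type drops out of the subset):

```r
# Hypothetical toy data standing in for the columns V32 and V48.
V32 <- c(0, 0, 1, 2, 2, 2, 10, 10)
V48 <- c(0, 1, 1, 0, 0, 1, 0, 1)

V32 <- factor(V32)                 # levels: "0" "1" "2" "10"
selected <- table(V32[V48 == 0])   # type 1 is still listed, with count 0
total    <- table(V32)

# Both tables have one entry per level, in the same order.
rel <- selected / total
print(rel)
```

Without the factor() step, the subset table would have no entry for
type 1, and the two tables would not line up.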

Uwe Ligges




Arin Basu

2004-Jul-20 04:03 UTC


[R] a problem: factors, names, tables ..

Hi Piet:

I considered an example based on the problem you posed.

How about:

# create three vectors, bind them as rows, then transpose into a data frame
x <- c(0,1,2,10,11,12,13,14,15)             # event types
a <- c(15,0,6,1,3,0,8,15,10)                # counts from tt, 0 where a type is absent
b <- c(817,119,524,96,700,66,559,358,283)   # totals from tt2
xab <- rbind(x,a,b)
xabdf <- as.data.frame(t(xab))
# now get the a/b values
abyb <- xabdf[,2]/xabdf[,3]
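
In the example above the zero-filled vector a is typed in by hand. One
possible way to derive it instead is match(). This is only a sketch, and
the names xs and cnt are introduced here for illustration:

```r
x   <- c(0, 1, 2, 10, 11, 12, 13, 14, 15)   # all event types (as in tt2)
b   <- c(817, 119, 524, 96, 700, 66, 559, 358, 283)
xs  <- c(0, 2, 10, 11, 13, 14, 15)          # types actually present in tt
cnt <- c(15, 6, 1, 3, 8, 15, 10)            # their counts

a <- cnt[match(x, xs)]   # NA wherever a type is absent from tt
a[is.na(a)] <- 0         # treat absent types as zero occurrences
abyb <- a / b
```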

HTH,
Arin Basu  


Arindam Basu MD MPH DBI
Assistant Director
Fogarty International Program on Environmental Health in India
IPGMER
244 AJC Bose Road, Kolkata 700027
India