2010 Sep 29
Transforming/appending data (words in IMDB)
...w for each word in the IMDB, whether it is over- or under-represented in a particular category (Rating x Genre). I was planning to estimate this with a G-test, fwiw. But the basic question I'm asking here is about data transformation/appending: how do I go from these columns:
Film | Genre1 | Genre2 | Genre3 | Reviewer | Rating | Word | Word_ct
to these:
Word | Genre | Rating | Word_ct | Word_ct_in_genre | Word_ct_in_Rating | Expected_word_ct | G-test-score
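One way to sketch the reshaping described above, in Python with only the standard library, is to stream the rows, unpivot Genre1..Genre3 into a single Genre value, and accumulate counts as you go, so that no file has to fit in memory. Several things here are assumptions, not part of the original post: the function names (read_rows, aggregate, score), the tab delimiter, and above all the definition of Expected_word_ct. It is taken to be the count expected if Genre and Rating were independent for that word (Word_ct_in_genre * Word_ct_in_Rating / the word's total count), and G-test-score is taken to be the per-cell term 2 * O * ln(O / E), which is negative when the word is under-represented.

```python
import csv
import math
from collections import Counter

GENRE_COLS = ("Genre1", "Genre2", "Genre3")


def read_rows(paths, delimiter="\t"):
    """Yield one dict per input row, file by file (delimiter is an assumption)."""
    for path in paths:
        with open(path, newline="") as fh:
            yield from csv.DictReader(fh, delimiter=delimiter)


def aggregate(rows):
    """Sum Word_ct per (Word, Genre, Rating), unpivoting the three genre columns."""
    cell = Counter()
    for row in rows:
        n = int(row["Word_ct"])
        if n <= 0:
            continue
        for col in GENRE_COLS:
            genre = row.get(col)
            if genre:  # skip empty genre slots
                cell[(row["Word"], genre, row["Rating"])] += n
    return cell


def score(cell):
    """Build the target table from the aggregated counts."""
    by_genre, by_rating, by_word = Counter(), Counter(), Counter()
    for (word, genre, rating), n in cell.items():
        by_genre[(word, genre)] += n
        by_rating[(word, rating)] += n
        by_word[word] += n

    table = []
    for (word, genre, rating), n in sorted(cell.items()):
        in_genre = by_genre[(word, genre)]
        in_rating = by_rating[(word, rating)]
        # Assumed definition: expected count under Genre/Rating independence.
        expected = in_genre * in_rating / by_word[word]
        table.append({
            "Word": word,
            "Genre": genre,
            "Rating": rating,
            "Word_ct": n,
            "Word_ct_in_genre": in_genre,
            "Word_ct_in_Rating": in_rating,
            "Expected_word_ct": expected,
            # Per-cell G term; sum over a word's cells for the full statistic.
            "G_test_score": 2 * n * math.log(n / expected),
        })
    return table
```

With real files this would be called as score(aggregate(read_rows(paths))). Only the aggregated counter is held in memory, and its size depends on the number of distinct (Word, Genre, Rating) combinations, not on the size of the input. Two caveats: a film listed under several genres contributes its word counts once per genre, and combinations that never occur (an observed count of zero) are not emitted as rows.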
The actual amount of data is enormous (I have 10 files of ~1.5 GB each) and I suspect I'm going to have to learn how to use the bi...