I have a data frame with a character field of the form "ACUTE URI NOS", "OPEN WOUND OF FOREHEAD", "CROUP", "STREP SORE THROAT", .... How can I get counts of all the words and their co-occurrences? I've spent a long time searching on Google, but it just takes me on a wild goose chase through dozens of modules involving advanced natural language processing theory. All I want is word counts and co-occurrences. Thanks
Henry, Have a look at the qdap package's termco, wfm, adjacency_matrix, and (possibly) word_associate functions. I'm not sure whether they'll work, as you don't say much about what the data looks like or give an example of the desired output. Cheers, Tyler Rinker
> From: HTRobertson at seton.org
> To: r-help at r-project.org
> Date: Fri, 6 Sep 2013 21:14:42 +0000
> Subject: [R] How do I parse text?
> I have a data frame with a character field of the form "ACUTE
> URI NOS", "OPEN WOUND OF FOREHEAD", "CROUP", "STREP SORE THROAT", ....
>
> How can I get counts of all the words and their
> co-occurrences? I've spent a long time searching on Google,
> but it just takes me on a wild goose chase of dozens of
> modules involving advanced natural language processing
> theory. All I want is word counts and co-occurrences.

Perhaps a combination of strsplit(), unlist() and table() would do the job? Example:

    sometext <- c("ACUTE URI NOS", "OPEN WOUND OF FOREHEAD", "CROUP",
                  "STREP SORE THROAT", "ACUTE STREP SORE THROAT")
    st <- strsplit(sometext, " ")
    table(unlist(st))

S Ellison
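The strsplit()/table() idea above covers the word counts. For the co-occurrence half of the question, one possible base-R sketch follows. It assumes that "co-occurrence" means two words appearing in the same entry of the field, which the original post does not actually specify, and the names words, long, incidence and cooc are made up for illustration.

```r
sometext <- c("ACUTE URI NOS", "OPEN WOUND OF FOREHEAD", "CROUP",
              "STREP SORE THROAT", "ACUTE STREP SORE THROAT")

# Split each entry on runs of spaces; unique() so that a word repeated
# within one entry is only counted once for that entry.
words <- lapply(strsplit(sometext, " +"), unique)

# Long format: one row per (entry, word) pair.
long <- data.frame(entry = rep(seq_along(words), lengths(words)),
                   word  = unlist(words))

# Entry-by-word incidence table (1 if the entry contains the word, else 0).
incidence <- table(long$entry, long$word)

# Word-by-word co-occurrence counts. The diagonal holds the number of
# entries containing each word; off-diagonal cells hold the number of
# entries containing both words.
cooc <- crossprod(incidence)

cooc["STREP", "SORE"]   # entries containing both STREP and SORE
```

With real data, sometext would be replaced by the character column of the data frame. For a large vocabulary the dense matrix can get big, so a sparse representation might be worth considering.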