search for: sentence

Displaying 20 results from an estimated 2378 matches for "sentence".

2017 Jul 12
0
Extracting sentences with combinations of target words/terms from cancer patient text medical records
Hi Paul, Sounds like you have your answer, but for fun I thought I'd try solving your problem using only a regular expression query and base R. I believe this works: > txt <- "Patient had stage IV breast cancer. Nothing matches this sentence. Metastatic and breast match this sentence. French bike champion takes stage IV victory in Tour de France." > pattern <- "([^.?!]*(?=[^.?!]*\\bbreast\\b)(?=[^.?!]*\\b(metastatic|stage IV)\\b)(?=[\\s.?!])[^.?!]*[.?!])" > regmatches(txt, gregexpr(pattern, txt, perl=TRUE, ign...
2017 Jul 13
1
Extracting sentences with combinations of target words/terms from cancer patient text medical records
...5 million text records and I think that number will only rise. So efficiency really should matter. I've pasted the latest version of my sample code below. This shows how I'd like to add the result of the text search as a column in a data frame. It also shows how I'd like to append the sentence number to each identified sentence. The single colon that appears where there is no match is not by design. It's something that I need to tidy. My sense is that if I used your regular expression as written, I'd lose the information about the sentence number when I added the result as a col...
2017 Jul 12
2
Extracting sentences with combinations of target words/terms from cancer patient text medical records
...nt "x" in my function. I've corrected that and now my code seems to be working fine. Paul ________________________________ From: Bert Gunter <bgunter.4567 at gmail.com> Cc: R-help <r-help at r-project.org> Sent: Tuesday, July 11, 2017 2:00 PM Subject: Re: [R] Extracting sentences with combinations of target words/terms from cancer patient text medical records Have you looked at the CRAN Natural Language Processing Task View? If not, why not? If so, why were the resources described there inadequate? Bert On Jul 11, 2017 10:49 AM, "Paul Miller via R-help" <...
2017 Jul 11
2
Extracting sentences with combinations of target words/terms from cancer patient text medical records
Hello All, I need some help figuring out how to extract combinations of target words/terms from cancer patient text medical records. I've provided some sample data and code below to illustrate what I'm trying to do. At the moment, I'm trying to extract sentences that contain the word "breast" plus either "metastatic" or "stage IV". It's been some time since I used R and I feel a bit rusty. I wrote a function called "sentence_match" that seemed to work well when applied to a single piece of text. You can see th...
2017 Jul 13
0
Extracting sentences with combinations of target words/terms from cancer patient text medical records
Hi Paul, No need to collapse the information into a single text string, gregexpr() can take a vector of strings (sentences in your case). You can split your sentences up, number them how you want, then search for your pattern either via regex or via these extra packages you use which probably use the PCRE regex library anyway. However, as this is basically what you did, I'm not sure why you're not happy with y...
2017 Jul 11
0
Extracting sentences with combinations of target words/terms from cancer patient text medical records
...te: > Hello All, > > I need some help figuring out how to extract combinations of target > words/terms from cancer patient text medical records. I've provided some > sample data and code below to illustrate what I'm trying to do. At the > moment, I'm trying to extract sentences that contain the word "breast" plus > either "metastatic" or "stage IV". > > It's been some time since I used R and I feel a bit rusty. I wrote a > function called "sentence_match" that seemed to work well when applied to a > single piece...
2012 Jun 05
1
Trouble with Functions
...tweet, our score.sentiment() function uses laply() to iterate through the input text. It strips punctuation and control characters from each line using R's regular expression-powered substitution function, gsub() and uses match() against each word list to find matches:/ score.sentiment = function(sentences, pos.words, neg.words, .progress='none') { require(plyr) require(stringr) # we got a vector of sentences. plyr will handle a list # or a vector as an 'l' for us # we want a simple array of scores back, so we use # 'l' + 'a' + 'ply' = 'laply': scores = laply(sentences, function(sentence, pos.wor...
2007 Oct 02
2
Ordering of names on X- and Y-axis
Hi, I am new to R. I have a bit of data looking like this: SemType, Length GeoLocation, Sentence GeneralInfo, Paragraphs GeneralInfo, Paragraphs GeneralInfo, Sentence GeneralInfo, Paragraphs NatLang, Phrase Advice, Article GeneralInfo Advice, Article Resource, Sentence ... (roughly 40,000 lines in total) I am interested in how many counts of each item in the second row I get for each item in...
2012 Jun 13
2
separate the sentence after finding a particular word
hello, I want to know ..how we can separate the sentence after finding a particular word... for example I love to watch movies of Hollywood but should not be romantic...I want to join you school but due to bad financial condition I cant.. I want output in following format I love to watch movies of Hollywood should not be romantic I want to join you s...
2012 Jul 13
1
Need Suggestions for Sentence Breaking Implementation
...2f8da> I have added a simple example in the xapian-core/examples directory, that shows the outcome and results of this feature. The example is present at < https://github.com/sehaj-sk/xapian/commit/75c2e4749e9084fca5f390b88d565cb117e90d38> At present it is capable of indexing only single sentences. So to index a large text, I need to break it into sentences. So I need suggestions for doing the Sentence Boundary Disambiguation. Please suggest any paper/algorithm that could be coded or any existing library that can be used. The focus at present is on English language only. I have done some...
2010 Aug 15
2
problems with which
Dear all, I'm quite new in R and I have a problem with the function which. When I use it to select a subset of a dataframe it works well but somewhere R takes trace of the past dataframe and this creates problems with following operations. For example: sentences <- read.xls("frasi.tot.march.3.xls", header=TRUE) head(sentences) fam subjID Cond Code reg total first second 1 f 30 an fDan1 1 0.2812500 0.2812500 0.0000000 2 f 30 an fDan1 2 1.7851562 0.5390625 1.2460938 3 f 30 an fDan1 3 1.2304688 0.6679...
2010 Nov 02
1
splitting First 10 words in a string
Hi Steven, Thank you for the help. I get an error though when i do this : >lit<-read.csv("litologija.csv", sep=";", dec=".") >sent <-data.frame(sentence=lit$Opis,stringsAsFactors=FALSE) >str(sent) >sentV<-rep(sent,10) >str(sentV) >first=second=third=fourth=fifth=sixth=seventh=eighth=ninth=tenth<-vector(length=10) >DF <-data.frame(Sentence=sent,first,second,third,fourth,fifth,sixth,seventh,eighth,ninth,tenth,stringsAs...
2014 Feb 28
2
GSoC 2014
...ted > scoring system. > Now I basically plan on developing a generic QA system which encompasses a > large number of questions. The biggest drawback of my previous QA system > was the lack of relevance measuring mechanism. I want to develop a > relevance measure between a query and a sentence. I believe there already > exist many relevance measuring codes but those relate a query to a > document( as far as I know). The term "document" is what the literature uses, but the mental image that might conjure up of a multi-page printout with a staple through the corner is misl...
2010 Mar 08
1
How can I understand this sentence,and express it by means of Mathematical approach?
This topic refers to reducing the number of independent variables. As we know, many methods can do this. However, for pre-processing independent variables, a method like the one in the sentence below can remove many variables. How should I understand it? What does significant correlation at the 5% level mean, and what is the criterion? The p-value, or something else? "Independent variables whose correlation with the response variable was not significant at 5% level were removed" How can I calculate the corre...
2006 Jul 03
5
How do I code this conditional statement in Ruby
Hi, I am a COBOL programmer and I am busy teaching myself Rails and Ruby. In COBOL I can code this conditional If x = 1 next sentence else .......................... The "next sentence" statement enables me to get out of the conditional. How would I code the same thing in Ruby? In C you could use break but I understand that Ruby has no break statement. Regards, Paul
2005 Jul 13
1
Can I introduce sql sentences in the DialPlan (Asterisk Realtime)??
Hello all! Does anybody know whether there are Dialplan commands (specifically SQL statements) for Asterisk Realtime? For example: I have users defined in a MySQL database. In the dialplan, I would like to select one field of a table. select email from sip_buddies where name=200 I tried to use DBget, but I got an error. I think this is because DBget uses the internal database and can't connect to mysql...
2014 Feb 26
2
GSoC 2014
The Letor project involves a decent amount of Machine Learning, while all the ranking-related projects are around IR. It's better to introduce your idea on the mailing list, where all the mentors can have a detailed look at it, potential mentors can respond, and the idea is kind of registered under your name. Cheers, Parth. On Wed, Feb 26, 2014 at 10:20 AM, Olly Betts <olly at survex.com> wrote:
2009 Nov 21
1
p.value OR F.value?
Hi all friends, Please help me understand the sentence below: "From this set, 858 columns not significantly correlated with the response variable TBG at the 5% level were removed, leaving a set of 390 columns." Is it equivalent to "the F-test's value for the one-parameter correlation with the descriptor is below 1.0"? I want to perform this above s...
2009 Oct 15
1
"Complex?" import of pdf files (criminal records) into R table
...t for us | Date: xx.xx.xxxx (relevant for us) Anonymous person number: xxxxxxxxxxx Entries in the register 1. xx.xx.1902 -City- Be in force since: xx.xx.1902 Date of offense:xx.xx.xxxx Elements of the offence: For example "Rape" Section in law: §176, §178 Abs. 1 Sentenced to 5 years imprisonment "Irrelevant text for us" Accommodation in an forensic psychiatry Accommodation sentenced on probation Rest of sentence sentenced on probation until the xx.xx.xxxx 2. xx.xx.1910 Be in force since: .... ..... -------------------------------...
2009 Nov 12
1
How can this code be improved?
...cat(i.d, "reports completed at ", date(), ".\n") } cat("Terminating at ", date(), ".\n") The object in the innermost loop are: * tokens: a list of lists. In the expression tokens[[i.d]][[i.s]], the first index runs over 1697 reports, the second over the sentences in the report, each of which consists of a vector of tokens, i.e., the character strings between the white spaces in the sentence. One of the largest reports takes up 58MB on the harddisk. Thus, the number of sentences can be quite large, and some of the sentences are quite long (measu...