Daniel Malter
2009-Dec-28 12:46 UTC
[R] apply loop - using/providing a data frame to loop over
Hi,

I want to extract individual names from a single string that contains all names. My problem is not the extraction itself, but the looping over the extraction start and end points, which I try to realize with apply.

#Say, I have a string with names.
authors=c("Schleyer T, Spallek H, Butler BS, Subramanian S, Weiss D, Poythress ML, Rattanathikun P, Mueller G")

#Since I only want the surname and the initial of the first name, I create respective indices
starts=c(1, 13, 24, 35, 50, 59, 73, 90)
ends=c(10, 21, 31, 47, 56, 69, 87, 98)

#Now I can extract the names, e.g. the third one, with
substr(authors,start=starts[3],stop=ends[3])

#So far so good, but I want to loop over all indices using apply
#For that I wrote a function g, that takes "a" as the author string, and "data" as the start and end points for extraction
g=function(a,data){substr(a,data[,1],data[,2])}

#If provided with a specific row of the data frame, g works
g(authors,data.frame(starts,ends)[3,])

#If I try to loop g through the rows of the starts/ends data frame, it does not work.
apply(data.frame(starts,ends),1,g,a=authors)

#Interestingly, if the data frame to loop over is just a vector, it works also (e.g. for extracting just the first initial)
g=function(e,a){substr(a,e,e)}
apply(data.frame(ends),1,g,a=authors)

So the problem probably lies in correctly supplying "apply" with the data frame. I would greatly appreciate your help.

Daniel

-----------------------------------------------
"Who has visions should see a doctor,"
Helmut Schmidt, German Chancellor (1974-1982).
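A note on why the apply call in the question fails, offered as a sketch rather than as an answer given in the thread: apply() coerces the data frame to a matrix and hands each row to the function as a plain vector, so data[,1] has no second dimension to index. The call g(authors,data.frame(starts,ends)[3,]) works because it receives a one-row data frame, and the one-column example works because it never indexes by column. Assuming the authors, starts and ends from the post, a variant of g that uses single-bracket indexing runs under apply (the name g_row is made up for illustration):

```r
authors <- "Schleyer T, Spallek H, Butler BS, Subramanian S, Weiss D, Poythress ML, Rattanathikun P, Mueller G"
starts <- c(1, 13, 24, 35, 50, 59, 73, 90)
ends   <- c(10, 21, 31, 47, 56, 69, 87, 98)

# Hypothetical helper: 'row' is one row of the start/end table, passed by
# apply() as a vector of length 2, so use row[1] and row[2], not row[, 1].
g_row <- function(row, a) substr(a, row[1], row[2])

# apply() supplies each row as the first unnamed argument; 'a' is passed by name.
res <- apply(data.frame(starts, ends), 1, g_row, a = authors)
print(res)
```

Because a is supplied by name, apply fills the remaining argument (row) with each row in turn, which is why the argument order of the function does not matter here.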
Gabor Grothendieck
2009-Dec-28 17:33 UTC
[R] apply loop - using/providing a data frame to loop over
Try this. It picks out each string of word characters (\w+) followed by a space followed by a word character:

> library(gsubfn)
> strapply(authors, "\\w+ \\w", c)[[1]]
[1] "Schleyer T"      "Spallek H"       "Butler B"        "Subramanian S"
[5] "Weiss D"         "Poythress M"     "Rattanathikun P" "Mueller G"

You might need to adjust the regular expression slightly depending on what the general case is. See http://gsubfn.googlecode.com for more.
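For readers without gsubfn installed, the same regular expression can be tried with base R alone. This is only a sketch of an alternative, not the approach suggested in the reply; it assumes the authors string from the original post and uses gregexpr() with regmatches() in place of strapply():

```r
authors <- "Schleyer T, Spallek H, Butler BS, Subramanian S, Weiss D, Poythress ML, Rattanathikun P, Mueller G"

# gregexpr() locates every match of "word characters, space, one word character";
# regmatches() then extracts the matched substrings (one list element per input string).
hits <- regmatches(authors, gregexpr("\\w+ \\w", authors, perl = TRUE))[[1]]
print(hits)
```

As with the strapply version, the pattern may need adjusting for names containing hyphens, apostrophes or several words.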
Henrique Dallazuanna
2009-Dec-28 17:38 UTC
[R] apply loop - using/providing a data frame to loop over
Try this:

mapply(substr, x = authors, start = starts, stop = ends)

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
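A small sketch around the mapply suggestion, assuming the authors, starts and ends from the original post. mapply() may attach names derived from its first argument, so USE.NAMES = FALSE is added here to get a plain character vector. Base R's substring() is also shown as a possible shortcut, since it recycles its start and stop arguments over a single string, whereas substr() uses only as many positions as there are input strings:

```r
authors <- "Schleyer T, Spallek H, Butler BS, Subramanian S, Weiss D, Poythress ML, Rattanathikun P, Mueller G"
starts <- c(1, 13, 24, 35, 50, 59, 73, 90)
ends   <- c(10, 21, 31, 47, 56, 69, 87, 98)

# mapply() calls substr() once per (start, stop) pair, recycling 'authors'.
via_mapply <- mapply(substr, x = authors, start = starts, stop = ends,
                     USE.NAMES = FALSE)

# substring() is vectorised over first/last, so no explicit loop is needed.
via_substring <- substring(authors, starts, ends)

print(via_mapply)
print(via_substring)
```

Both calls should give the same eight names; which one reads better is a matter of taste.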
Daniel Malter
2009-Dec-28 17:44 UTC
[R] apply loop - using/providing a data frame to loop over
Works like a charm, thanks.

Daniel

-------------------------
cuncta stricte discussurus
-------------------------

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Gabor Grothendieck
Sent: Monday, December 28, 2009 12:34 PM
To: Daniel Malter
Cc: r-help at stat.math.ethz.ch
Subject: Re: [R] apply loop - using/providing a data frame to loop over