similar to: gsub does not support \b?

Displaying 20 results from an estimated 20000 matches similar to: "gsub does not support \b?"

2012 May 30
1
gsub/strsplit with multiple patterns/splits
Hi, I have a vector like this: DF <- c("Aetna, Inc.", "Alexander's Inc.", "Allegheny Energy, Inc") For each element in the vector I would like to remove the "incorporated" info, so that my vector looks like this: DF <- c("Aetna", "Alexander's", "Allegheny Energy") That means that I have to strip: strip <-
2010 Mar 09
5
data frame select max group by like function
Hi, I have a data frame with 3 columns: ID, year and score. How can I select, for each unique ID, the year that has the max score? For example, for the data frame ID, year, score tom, 1995, 88 rick, 1994, 90 mary, 2000, 97 tom, 1998, 60 mary, 1998,100 I should get ID, year, score tom, 1995, 88 rick, 1994, 90 mary, 1998,100 Thanks, Richard
2009 Apr 13
3
toupper does not work in sub + regex
Hi, I don't know what I am doing wrong, but toupper does not seem to work in sub + regex. The following returns 's', not the uppercase 'S' I expect: sub("q_([a-z])[a-zA-Z]*",toupper('\\1'),"q_sviRaw") Can someone tell me what I did wrong? Thanks, Richard
2009 Feb 12
3
get top 50 correlated item from a correlation matrix for each item
Hi, I have a correlation matrix of about 3000 items, i.e., a 3000*3000 matrix. For each of the 3000 items, I want to get the top 50 items that have the highest correlation with it (excluding itself) and generate a data frame with 3 columns like ("ID", "ID2", "cor"), where ID is those 3000 items, each repeated 50 times, and ID2 is the top 50 correlated items with ID,
2010 Oct 07
3
aggregate text column by a few rows
Hi, R function aggregate can only take summary stats functions, can I aggregate text columns? For example, for the dataframe below, > a <- rbind(data.frame(id=1, name='Tom', hobby='fishing'),data.frame(id=1, name='Tom', hobby='reading'),data.frame(id=2, name='Mary', hobby='reading'),data.frame(id=3, name='John',
2010 Sep 16
3
get top n rows group by a column from a dataframe
Hi, is there an R function like SQL's TOP keyword? I have a dataframe that has 3 columns: company, person, salary. How do I get the 5 highest-paid people for each company, and if a company has fewer than 5 people, just return all of them? Thanks, Richard
2009 Jun 08
1
Regex question to find a string that contains 5-9 alpha-numeric characters, at least one of which is a number
Hi, This is not exactly an R question but I am trying to use gsub to replace a string that contains 5-9 alpha-numeric characters, at least one of which is a number. Is there a good way to write it in a one line regex? Thanks, Richard
2006 May 26
2
combinatorial programming problem
Hola! I am programming a class (S3) "symarray" for storing the results of functions symmetric in their k arguments. Intended use is for association indices for more than two variables, for instance coresistivity against antibiotics. There is one programming problem I haven't solved, making an inverse of the index function indx() --- see code below. It could for instance return the
2012 Nov 08
1
Extract cell of many values from dataframe cells and sample from them.
Hi, First, my apologies for a non-working piece of code in a previous submission; I have corrected this error. What I'm doing is individual-based modelling of a pathogen and its host. The way I've thought of doing this is with two dataframes, one of the pathogen and its genes and effector genes, and one of the host and its resistance genes. During the simulation, these things
2004 Feb 02
2
Nearest Neighbor Algorithm in R -- again.
Several of the methods I use for analyzing large data sets, such as WinGamma (determining the level of noise in data) and Relief-F (estimating the influence of variables), depend on finding the k nearest neighbors of a point in a data frame or matrix efficiently. (For large data sets it is not feasible to compute the 'dist' matrix anyway.) Seeing the proposed solution to "[R] distance
2009 Mar 14
2
gsub and regex to tidy comma-limited values
I am cleaning up comma-limited values, so that only one comma separates each value. Using the example below, as much as I try with regex, I can't remove the last comma. I hope to have a one-liner solution, if possible. gsub("^,*|,*$|(,)*", "\\1", ",,,apple,,orange,,,,,lemon,strawberry,,,,") [1] "apple,orange,lemon,strawberry,"
2009 Apr 21
2
multiple plots in same graph window
Hi, I'm trying to make multiple plots in the same graph window in R. The multiple graphs are showing up in the right positions on the window, but the graphics window is refreshed every time a new plot is drawn, so that I end up with only the last graph coming up; the previous ones are all erased. If I try to print in a .eps file directly, then I end up
2010 Nov 22
2
aggregate a Date column does not work?
Hi, I am trying to take the max of a Date column with aggregate but get a weird result. How do I fix this? > a <- rbind( + data.frame(name='Tom', payday=as.Date('1999-01-01')), + data.frame(name='Tom', payday=as.Date('2000-01-01')), + data.frame(name='Pete', payday=as.Date('1998-01-01')), + data.frame(name='Pete',
2006 Apr 11
1
pattern in history
Hi, Sometimes I need to consult the history of commands that are matching a regex, so I modified the utils::history function for that purpose. I found it useful. I append the code ( I only added the two lines with #**) Romain. history2 <- function (pattern="", max.show = 25, reverse = FALSE, unique = pattern!="", ...) { file1 <- tempfile("Rrawhist")
2012 Nov 06
1
sample from list
Hi all, I have a list of genes present in 500 individuals, the individuals are the elements: Genes <- lapply(1:nrow(inds),function(x) sample(1:10000,inds$No_of_Genes,replace=TRUE)) (This was later written to a dataframe as well as kept as the list object: inds2 <- data.frame(inds,Genes=I(Genes))) I also have a vector of how many of those genes are expressed in the individuals, this can
2010 Oct 08
3
Efficiency Question - Nested lapply or nested for loop
My data looks like this: > data name G_hat_0_0 G_hat_1_0 G_hat_2_0 G_0 G_hat_0_1 G_hat_1_1 G_hat_2_1 G_1 1 rs0 0.488000 0.448625 0.063375 1 0.480875 0.454500 0.064625 1 2 rs1 0.002375 0.955375 0.042250 1 0.000000 0.062875 0.937125 2 3 rs2 0.050375 0.835875 0.113750 1 0.877250 0.115875 0.006875 0 4 rs3 0.000000 0.074750 0.925250 2 0.897750 0.102000
2002 Apr 30
1
followup -- deficiencies in readline capability
Why would R lack history capability? Someone in a private electronic mail message suggested the possibility that I was running R in a non-writable directory. This is not the case, as the following logfile shows (where "$ " is my shell prompt): $ ls -ld `pwd` drwxrwxrwx 15 sys sys 2560 Apr 30 08:10 /tmp $ R --vanilla R : Copyright 2002, The R Development Core Team
2009 Mar 13
1
search for string insider a string
Hi, sorry if this is a stupid question, but how do I do a string search in R? I have a dataframe A with A$test like: test1 bcdtestblabla2.1bla cdtestblablabla3.88blabla and I want to search for a string that starts with 'dtest' and ends with a number, and return the location of that substring and the number, so the end result would be: NA NA 3 2.1 2 3.88 I find grep can
2009 May 30
1
arithmetic problem
Hello list, I have a problem with a dataset (see toy example below) where I am trying to find the difference between two (or more) numbers and discard those observations which fall outside a set interval. An example and further explanation: values ind 1 2655 7A5 2 3028 7A5 3 689 ABBA-1 4 1336 ABBA-1 5 1560 ABBA-1 6 2820 ABLIM1 7 3339 ABLIM1 8
2010 Dec 02
2
Hmisc label function applied to data frame
Hello, I'm attempting to create a data frame with correlations between every pair of variables in a data frame, so that I can then sort by the value of the correlation coefficient and see which pairs of variables are most strongly correlated. The sm2vec function in the corpcor library works very nicely as shown here: library(Hmisc) library(corpcor) # Create example data x1 = runif(50) x2 =