thr3ads.net - similar to: "Equivalent in R of the Contains operator in SAS"

Equivalent in R of the Contains operator in SAS

2016 Apr 16

Check the string matching functions, e.g. grepl().

-pd

> On 16 Apr 2016, at 15:18, Dan Abner <dan.abner99 at gmail.com> wrote:
>
> Hi all,
>
> I want to select all variables in the data.frame with a name that
> includes a certain string. Something like the following:
>
> merge3[,names(merge3) %in% c("Email","Email.x")]
>
> But there
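The grepl() suggestion above can be sketched as follows. This is a minimal illustration rather than the thread's accepted answer: the contents of merge3 are invented here, and only the column names "Email" and "Email.x" come from the post.

```r
# Hypothetical stand-in for merge3; only the names Email and Email.x
# appear in the original post.
merge3 <- data.frame(
  id      = 1:3,
  Email   = c("a@example.org", "b@example.org", "c@example.org"),
  Email.x = c("d@example.org", "e@example.org", "f@example.org"),
  Phone   = c("555-0100", "555-0101", "555-0102"),
  stringsAsFactors = FALSE
)

# grepl() returns TRUE for every column name that contains the substring.
# fixed = TRUE treats "Email" as a literal string, not a regular expression.
has_email <- grepl("Email", names(merge3), fixed = TRUE)

# drop = FALSE keeps the result a data frame even when one column matches.
selected <- merge3[, has_email, drop = FALSE]
print(names(selected))
```

Unlike %in%, which tests for exact equality with the listed names, grepl() matches any name that contains the string, which is closer to what CONTAINS does in SAS.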

Writing a function to return column position XXXX

2012 Jan 24

Hello everyone,

I am writing my own function to return the column index of all variables (these are currently character vectors) in a data frame that contain a dollar sign ($). A small piece of the data look like this:

can_sta  can_zip  ind_ite_con  ind_uni_con
AL       36106    $251,895.80  $22,874.43
AL       35802    $141,373.60  $7,100.00
AL       35201    $273,208.50  $18,193.66
AR       72404    $186,918.00  $25,391.00
AR

%in% operator - NOT IN

2011 May 08

Hello everyone,

I am attempting to use the %in% operator with the ! to produce a NOT IN type of operation. Why does this not work? Suggestions?

> data2[data1$char1 %in% c("string1","string2"),1]<-min(data1$x1)
> data2[data1$char1 ! %in% c("string1","string2"),1]<-max(data1$x1)+1000
Error: unexpected '!' in "data2[data1$char1
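The parse error in the post arises because ! is a prefix operator in R: it has to come before the expression it negates, not between the operands of %in%. A minimal sketch of the NOT IN pattern follows; the contents of data1 and data2 are made up for illustration, and only the column names char1 and x1 come from the post.

```r
# Hypothetical data; only the names data1, data2, char1 and x1 are from the post.
data1 <- data.frame(
  char1 = c("string1", "other", "string2", "other"),
  x1    = c(5, 2, 9, 4),
  stringsAsFactors = FALSE
)
data2 <- data.frame(flag = rep(NA_real_, nrow(data1)))

# Compute the membership test once, then negate the whole logical vector.
hit <- data1$char1 %in% c("string1", "string2")

data2[hit, 1]  <- min(data1$x1)          # rows whose char1 IS in the set
data2[!hit, 1] <- max(data1$x1) + 1000   # rows whose char1 is NOT in the set

print(data2)
```

Writing !(data1$char1 %in% c("string1", "string2")) inline gives the same result; the parentheses are optional because %in% binds more tightly than !, but they make the intent easier to read.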

Memory issue. XXXX

2012 Mar 02

Hi everyone,

Any ideas on troubleshooting this memory issue:

> d1<-read.csv("arrears.csv")
Error: cannot allocate vector of size 77.3 Mb
In addition: Warning messages:
1: In class(data) <- "data.frame" :
  Reached total allocation of 1535Mb: see help(memory.size)
2: In class(data) <- "data.frame" :
  Reached total allocation of 1535Mb: see

Placing a column name in a variable XXXX

2011 Aug 27

Hi everyone,

How does one place an object name (in this case a vector name) into another object (while essentially masking the values of the first object)? For example:

> JOBSAT<-rnorm(40)
>
> CI<-function(x,alpha){
+ result<-cbind(x,mean=mean(x),alpha)
+ print(result)
+ }
> CI(JOBSAT,.05)

I want this to return:

Variable mean alpha
JOBSAT 0.02844131 0.05

Importing data from MS EXCEL (.xls) to R XXXX

2011 Aug 24

Hello everyone,

What is the simplest, most RELIABLE way to import data from MS EXCEL (.xls) format to R? In the past I have used the read.xls() function from the xlsReadWrite package; however, I have been wrestling with it all afternoon with no success. I continue to receive the following error message:

> {widge<-read.xls("F:\\Classes\\Z1.Data\\stat.3010\\WidgeOne.xls",
+

Efficient Binning

2017 Jul 14

Hi all,

I have a situation where I have 16 bins. I generate a random number and then want to know which bin number the random number falls in. Right now, I am using a series of 16 if() else {} statements, which get very complicated with the embedded curly braces. Is there a more efficient (i.e., easier) way to go about this?

boundaries<-(0:16)/16
rand<-runif(1)

Which bin number (1:16)
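One way to avoid the chain of if() else {} statements is base R's findInterval(), which returns the index of the interval each value falls in. The sketch below reuses boundaries and rand from the post; the seed and the fixed check values are assumptions added only to make the example reproducible.

```r
boundaries <- (0:16) / 16

set.seed(42)      # assumption: fixed seed so the example is repeatable
rand <- runif(1)

# findInterval() counts how many boundaries are <= rand, which is the
# 1-based bin number because boundaries[1] is 0 and runif() stays below 1.
bin <- findInterval(rand, boundaries)
print(bin)

# The call is vectorised, so many values can be binned at once.
bins <- findInterval(c(0.03, 0.5, 0.99), boundaries)
print(bins)
```

cut(rand, boundaries, labels = FALSE) is a close alternative; note that its intervals are closed on the right by default, whereas findInterval() treats them as closed on the left.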

2 Y-AXIS labels on the same (left-hand side) Y-AXIS XXXX

2011 Nov 28

Hello everyone,

Is it possible to specify a 2-line y-axis label on the same left-hand side y-axis? I am using the \n regular expression, but only the 2nd line appears (I assume the 1st line is printed off the page...)

plot(PRE_SHB,R1, main="Figure 1.1: Scatterplot of Residualized Post Score", xlab = "Pre Score", ylab = "Residualized Post Score \n (Adjusted for Age

Using !is.na() in a HAVING clause in sqldf() XXXX

2012 Jan 17

Hi everyone,

I have the following:

sqldf("select Premie,count(tpounds) N,avg(tpounds) Avg_Weight, stddev_samp(tpounds) StdDev from children group by Premie having !is.na(Premie)")

sqldf() does not like the !is.na(Premie) specification. How does one exclude a "missing" group in an aggregated query using sqldf()?

Thanks!

Dan

Obtaining the internal integer codes of a factor XXXX

2013 Mar 25

Hi everyone, I understand the process of obtaining the internal integer codes for the raw values of a factor (using as.numeric() as below), but what is the best way to obtain these codes for the levels() of a factor (since the as.numeric() results don't really make clear which code maps to which level)?
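Because a factor's integer codes are simply positions in levels(), the mapping can be listed directly. A minimal sketch, assuming a made-up factor f (the post's own data is not shown in this excerpt):

```r
# Hypothetical factor; levels are sorted alphabetically by default.
f <- factor(c("low", "high", "medium", "low"))

# The code for each level is its position in levels(f).
code_map <- setNames(seq_along(levels(f)), levels(f))
print(code_map)

# The per-observation codes index into that same vector of levels.
obs_codes <- as.integer(f)
print(obs_codes)
print(levels(f)[obs_codes])   # recovers the original labels
```

A two-column lookup table, data.frame(level = levels(f), code = seq_along(levels(f))), carries the same information in a form that is easier to merge or print.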

Using a mathematical expression in sapply() XXXX

2012 Jan 04

Hello everyone,

I have the following call to sapply() and error message. Is the most efficient way to deal with this to make sum(!is.na(x)) a function in a separate line prior to this call? If not, please advise.

N.Valid=sapply(x,sum(!is.na(x)))
Error in match.fun(FUN) :
  'sum(!is.na(x))' is not a function, character or symbol

Thanks!

Dan
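The error occurs because sapply() expects a function as its second argument, while sum(!is.na(x)) is an expression that gets evaluated to a number before sapply() ever sees it. An anonymous function written inline avoids the separate definition. A minimal sketch, with a made-up data frame standing in for x:

```r
# Hypothetical data frame; the post does not show the contents of x.
x <- data.frame(
  a = c(1, NA, 3),
  b = c(NA, NA, "z"),
  c = 1:3,
  stringsAsFactors = FALSE
)

# Pass a function: it is called once per column, with the column as `col`.
N.Valid <- sapply(x, function(col) sum(!is.na(col)))
print(N.Valid)

# colSums() on the logical matrix gives the same counts without sapply().
N.Valid2 <- colSums(!is.na(x))
print(N.Valid2)
```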

Remove a word from a character vector value XXXX

2012 Mar 07

Hi everyone,

What is the easiest way to remove the word Average and strip leading and trailing blanks from the character vector (d5.Region) below?

  .nrow.d5.  d5.Region
1 1          Central Average
2 2          Coastal Average
3 3          East Average
4 4          Metro East Average
5 5          Metro North Average
6 6          Metro South Average
7
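One base-R approach to the question above is to delete the word with sub() and then strip the surrounding blanks with trimws(). This is a sketch under assumptions: the sample values and their padding are invented, since the original spacing is not recoverable from this excerpt.

```r
# Hypothetical values with stray leading and trailing blanks.
d5.Region <- c(" Central Average ", "Coastal Average", "  Metro East Average")

# sub() removes the first occurrence of the literal word "Average";
# trimws() then strips leading and trailing whitespace.
cleaned <- trimws(sub("Average", "", d5.Region, fixed = TRUE))
print(cleaned)
```

trimws() is available in base R from version 3.2.0; on older versions, gsub("^\\s+|\\s+$", "", x) does the same trimming.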

Including only a subset of the levels of a factor XXXX

2011 Sep 01

Hello everyone,

I have the following factor:

levels(pp_income)
[1] "" "1" "2" "3" "4" "5" "6" "7"
[9] "8" "9" "Renter"

I want to subset so that only values 1:9 are included. I have the following:

> income<-pp_income[pp_income %in%

Reading in tab (and space) delimited data within a script XXXX

2012 Jan 19

Hello everyone,

I use Bob Muenchen's approach for reading in "in-stream" (to use SAS parlance) delimited data within a script. This works great:

mystring <- "id,workshop,gender,q1,q2,q3,q4
1,1,f,1,1,5,1
2,2,f,2,1,4,1
3,1,f,2,2,4,3
4,2, ,3,1, ,3
5,1,m,4,5,2,4
6,2,m,5,4,5,5
7,1,m,5,3,4,4
8,2,m,4,5,5,5"

mydata <- read.table( textConnection(mystring),

Applying mode() or class() to each column of a data.frame XXXX

2011 Dec 30

Hi everyone,

I am attempting to use the apply() function to obtain the mode and class of each column in a data frame; however, I am encountering unexpected results. I have the following example data:

v13<-1:6
v14<-c(1,2,3,3,NA,1)
v15<-c("Good","Bad",NA,"Good","Bad","Bad")
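A likely source of the unexpected results is that apply() first converts the data frame to a matrix, and a matrix holds a single type, so one character column turns every column into character. sapply() instead visits each column as it is stored. A minimal sketch using the three vectors from the post; the data frame name df is an assumption, since the rest of the post is cut off.

```r
v13 <- 1:6
v14 <- c(1, 2, 3, 3, NA, 1)
v15 <- c("Good", "Bad", NA, "Good", "Bad", "Bad")

# `df` is a hypothetical name for the combined data frame.
df <- data.frame(v13, v14, v15, stringsAsFactors = FALSE)

# apply() works on as.matrix(df), which is all character here.
via_apply <- apply(df, 2, class)
print(via_apply)

# sapply() treats the data frame as a list of columns, preserving types.
col_class <- sapply(df, class)
col_mode  <- sapply(df, mode)
print(col_class)
print(col_mode)
```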

Summary functions in sqldf() XXXX

2013 Oct 08

Hi everyone, Is it possible to obtain the 1st & 3rd quartiles & the median in a sqldf() select statement? If so, can you please provide the summary fn code? Thanks! Dan

Options for print()

2011 May 05

Hello everyone,

I have a few questions about the print() fn:

1) I have the following code that does not center the character string:

print("The TITLE",quote=FALSE,justify="center")

2) How can I get R to not print the leading [1], etc. when using print()? (Sorry, I don't know what the leading [1] is called. I tried looking it up in "An Introduction", but

R package for segmentation with both continuous and categorical input variables XXXX

2011 Nov 10

Hello everyone, Can anyone suggest a decently documented (with good examples in the documentation) R package/function that performs segmentation (cluster, mixture modeling) of a population using both continuous and categorical input variables? Thank you, Dan

Convert components of a list to separate columns in a data frame or matrix XXXX

2012 Jan 08

Hello everyone, What is the most efficient & simplest way to convert all components of a list to separate columns in a matrix? Is there an easy way to programmatically "pad" the length of the resulting shorter character vectors so that they can be easily combined into a data frame? I have the following code that stores the 2 components (of differing lengths) in the same character

SAPPLY function XXXX

2011 May 04

Hello everyone,

I am attempting to write a function to count the number of non-missing values of each column in a data frame using the sapply function. I have the following code, which is receiving the error message below.

> n.valid<-sapply(data1,sum(!is.na))
Error in !is.na : invalid argument type

Ultimately, I would like for this to be 1 component in a larger function that will produce
