thr3ads.net - similar to: "Efficient Binning"

Displaying 20 results from an estimated 10000 matches similar to: "Efficient Binning"

2003 Nov 05

objects inside curly braces

Hello, I am running a program in R that calls a function, which calls another function, which calls another, etc. These functions are of the form:
example<- function(x,y,z) {x, y, and z are defined within curly braces like this}
Here's my question. To start the main function, I input as an initial parameter a matrix of regressors of the form:
MyMatrix<-cbind(this.one,that.one)

vlookup in R?

2010 May 28

Hi R-users, I would like to search for the values of seq that match my rand values. In Excel I would use =VLOOKUP(G2,$E$2:$F$32,2). For example, rand=0.262 gives approximately seq=120, rand=0.964293344 gives seq=460, and so on.
E (cdf)    F (seq)    G (rand)
0.00E+00   0          0.262123478
1.56E-03   20         0.964293344
1.55E-02   40         0.494827113
5.30E-02   60

Memory issue. XXXX

2012 Mar 02

Hi everyone, Any ideas on troubleshooting this memory issue:
> d1<-read.csv("arrears.csv")
Error: cannot allocate vector of size 77.3 Mb
In addition: Warning messages:
1: In class(data) <- "data.frame" : Reached total allocation of 1535Mb: see help(memory.size)
2: In class(data) <- "data.frame" : Reached total allocation of 1535Mb: see

Placing a column name in a variable XXXX

2011 Aug 27

Hi everyone, How does one place an object name (in this case a vector name) into another object (while essentially masking the values of the first object)? For example:
> JOBSAT<-rnorm(40)
>
> CI<-function(x,alpha){
+ result<-cbind(x,mean=mean(x),alpha)
+ print(result)
+ }
> CI(JOBSAT,.05)
I want this to return:
Variable mean alpha
JOBSAT 0.02844131 0.05

Importing data from MS EXCEL (.xls) to R XXXX

2011 Aug 24

Hello everyone, What is the simplest, most RELIABLE way to import data from MS EXCEL (.xls) format to R? In the past I have used the read.xls() function from the xlsReadWrite package; however, I have been wrestling with it all afternoon with no success. I continue to receive the following error message:
> {widge<-read.xls("F:\\Classes\\Z1.Data\\stat.3010\\WidgeOne.xls",
+

Equivalent in R of the Contains operator in SAS

2016 Apr 16

Hi all, I want to select all variables in the data.frame with a name that includes a certain string. Something like the following:
merge3[,names(merge3) %in% c("Email","Email.x")]
But there are too many variations on the Email variable names to list them all. Can anyone advise? Thanks! Dan

2 Y-AXIS labels on the same (left-hand side) Y-AXIS XXXX

2011 Nov 28

Hello everyone, Is it possible to specify a 2-line y-axis label on the same left-hand side y-axis? I am using the \n escape sequence, but only the 2nd line appears (I assume the 1st line is printed off the page...)
plot(PRE_SHB,R1, main="Figure 1.1: Scatterplot of Residualized Post Score", xlab = "Pre Score", ylab = "Residualized Post Score \n (Adjusted for Age

Binning or grouping data

2009 Jun 04

Newbie here. Many apologies in advance for using the incorrect lingo. I'm new to statistics and VERY new to R. I'm attempting to "group" or "bin" data together in order to analyze them as a combined group rather than as a discrete set. I'll provide a simple example of the data for illustrative purposes. Patient ID | Charges | Age | Race 1 |

Fastest non-overlapping binning mean function out there?

2012 Oct 03

Hi, I'm looking for a super-duper fast mean/sum binning implementation available in R, and before implementing z = binnedMeans(x, y) in native code myself, does anyone know of an existing function/package for this? I'm sure it already exists. So, given data (x,y) and B bins bx[1] < bx[2] < ... < bx[B] < bx[B+1], I'd like to calculate the binned means (or sums)

Using !is.na() in a HAVING clause in sqldf() XXXX

2012 Jan 17

Hi everyone, I have the following:
sqldf("select Premie,count(tpounds) N,avg(tpounds) Avg_Weight, stddev_samp(tpounds) StdDev from children group by Premie having !is.na(Premie)")
sqldf() does not like the !is.na(Premie) specification. How does one exclude a "missing" group in an aggregated query using sqldf()? Thanks! Dan

Obtaining the internal integer codes of a factor XXXX

2013 Mar 25

Hi everyone, I understand the process of obtaining the internal integer codes for the raw values of a factor (using as.numeric() as below), but what is the best way to obtain these codes for the levels() of a factor (since the as.numeric() results don't really make clear which code maps to which level)?

Using a mathematical expression in sapply() XXXX

2012 Jan 04

Hello everyone, I have the following call to sapply() and error message. Is the most efficient way to deal with this to make sum(!is.na(x)) a function in a separate line prior to this call? If not, please advise.
N.Valid=sapply(x,sum(!is.na(x)))
Error in match.fun(FUN) : 'sum(!is.na(x))' is not a function, character or symbol
Thanks! Dan

Equivalent in R of the Contains operator in SAS

2016 Apr 16

Check the string matching functions, e.g. grepl(). -pd
> On 16 Apr 2016, at 15:18 , Dan Abner <dan.abner99 at gmail.com> wrote:
>
> Hi all,
>
> I want to select all variables in the data.frame with a name that
> includes a certain string. Something like the following:
>
> merge3[,names(merge3) %in% c("Email","Email.x")]
>
> But there

%in% operator - NOT IN

2011 May 08

Hello everyone, I am attempting to use the %in% operator with the ! to produce a NOT IN type of operation. Why does this not work? Suggestions?
> data2[data1$char1 %in% c("string1","string2"),1]<-min(data1$x1)
> data2[data1$char1 ! %in% c("string1","string2"),1]<-max(data1$x1)+1000
Error: unexpected '!' in "data2[data1$char1

Remove a word from a character vector value XXXX

2012 Mar 07

Hi everyone, What is the easiest way to remove the word Average and strip leading and trailing blanks from the character vector (d5.Region) below?
  .nrow.d5. d5.Region
1 1         Central Average
2 2         Coastal Average
3 3         East Average
4 4         Metro East Average
5 5         Metro North Average
6 6         Metro South Average
7

Writing a function to return column position XXXX

2012 Jan 24

Hello everyone, I am writing my own function to return the column index of all variables (these are currently character vectors) in a data frame that contain a dollar sign ($). A small piece of the data looks like this:
can_sta  can_zip  ind_ite_con  ind_uni_con
AL       36106    $251,895.80  $22,874.43
AL       35802    $141,373.60  $7,100.00
AL       35201    $273,208.50  $18,193.66
AR       72404    $186,918.00  $25,391.00
AR

Binning data

2011 Mar 13

Hello, I have a large series of data values: effectively, say, the point across the x-axis where a pitch crosses home plate. What I want to do is find the % of ground balls at various distances across home plate. I therefore need to 'bin' the two data sets I have: plate location for ground balls and plate location for all other outcomes. Question is how can I set up a series of bins

Including only a subset of the levels of a factor XXXX

2011 Sep 01

Hello everyone, I have the following factor:
levels(pp_income)
[1] "" "1" "2" "3" "4" "5" "6" "7"
[9] "8" "9" "Renter"
I want to subset so that only values 1:9 are included. I have the following:
> income<-pp_income[pp_income %in%

Reading in tab (and space) delimited data within a script XXXX

2012 Jan 19

Hello everyone, I use Bob Muenchen's approach for reading in "in-stream" (to use SAS parlance) delimited data within a script. This works great:
mystring <- "id,workshop,gender,q1,q2,q3,q4
1,1,f,1,1,5,1
2,2,f,2,1,4,1
3,1,f,2,2,4,3
4,2, ,3,1, ,3
5,1,m,4,5,2,4
6,2,m,5,4,5,5
7,1,m,5,3,4,4
8,2,m,4,5,5,5"
mydata <- read.table( textConnection(mystring),

Applying mode() or class() to each column of a data.frame XXXX

2011 Dec 30

Hi everyone, I am attempting to use the apply() function to obtain the mode and class of each column in a data frame; however, I am encountering unexpected results. I have the following example data:
v13<-1:6
v14<-c(1,2,3,3,NA,1)
v15<-c("Good","Bad",NA,"Good","Bad","Bad")
