similar to: partitioning data

Displaying 20 results from an estimated 20000 matches similar to: "partitioning data"

2007 Oct 16
0
partitioning data [SEC=UNCLASSIFIED]
Hi Stephen, Check the help for predict.glm(). The argument for passing new data is actually 'newdata', as in: pred = predict(glm.model, newdata=form[150001:200000,-1], type="response") Cheers Joe -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of stephenc at ics.mq.edu.au Sent: Tuesday, 16
2007 Oct 02
1
problems with glm
I am having a couple of problems someone may be able to cast some light on. Question 1: I am making a logistic model but when I do this: glm.model = glm(as.factor(form$finished) ~ ., family=binomial, data=form[1:150000,]) I get this: Error in model.frame(formula, rownames, variables, varnames, extras, extranames, : variable lengths differ (found for 'barrier') which is
2006 May 27
2
boosting - second posting
Hi I am using boosting for a classification and prediction problem. For some reason it is giving me an outcome that doesn't fall between 0 and 1 for the predictions. I have tried type="response" but it made no difference. Can anyone see what I am doing wrong? Screen output shown below: > boost.model <- gbm(as.factor(train$simNuance) ~ ., # formula +
2010 Aug 27
1
Band-wise Sum
Hi I have a large credit portfolio (exceeding 50000 borrowers). For a particular process I need to add up the exposures based on the bands. I am giving a small test data set below. rating <- c("A", "AAA", "A", "BBB","AA","A","BB", "BBB", "AA", "AA", "AA", "A", "A",
2011 Dec 26
2
glm predict issue
Hello, I have tried reading the documentation and googling for the answer but reviewing the online matches I end up more confused than before. My problem is apparently simple. I fit a glm model (2^k experiment), and then I would like to predict the response variable (Throughput) for unseen factor levels. When I try to predict I get the following error: > throughput.pred <-
2009 Feb 16
1
Overdispersion with binomial distribution
I am attempting to run a glm with a binomial model to analyze proportion data. I have been following Crawley's book closely and am wondering if there is an accepted standard for how much is too much overdispersion? (e.g. change in AIC has an accepted standard of 2). In the example, he fits several models, binomial and quasibinomial and then accepts the quasibinomial. The output for residual
2010 May 26
1
how to Store loop output from a function
Hi, dear R community, I am writing the following function to create one data set (*tree.pred*) and one vector (*valid.out*) from loops. Later, I want to use the data set from this loop to plot curves. I have tried return and list, but I cannot use the *tree.pred* data and *valid.out* vector. auc.tree<- function(msplit,mbucket) { * tree.pred<-data.frame()
2012 Dec 11
1
Rprof causing R to crash
I'm trying to use Rprof() to identify bottlenecks and speed up a particularly slow section of code which reads in a portion of a tif file and compares each of the values to values of predictors used for model fitting. I've written up an example that anyone can run. Generally temp would be a section of a tif read into a data.frame and used later for other processing. The first portion
2017 Jul 06
0
svm.formula versus svm.default - different results
Dear community, I'm performing SVM regression with svm from the e1071 library. As I wrote in another post: "svm e1071 call - different results", I get different results if I use svm.default rather than svm.formula, with the svm.formula results being better. I've debugged both options. While debugging svm.formula, I've seen that when I reach the call: ret <-
2018 Feb 27
0
Random Seed Location
In case you don't get an answer from someone more knowledgeable: 1. I don't know. 2. But it is possible that other packages that are loaded after set.seed() fool with the RNG. 3. So I would call set.seed just before you invoke each random number generation to be safe. Cheers, Bert Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking
2018 Feb 26
3
Random Seed Location
Hi all, For some odd reason when running naïve Bayes, k-NN, etc., I get slightly different results (e.g., error rates, classification probabilities) from run to run even though I am using the same random seed. Nothing else (input-wise) is changing, but my results are somewhat different from run to run. The only randomness should be in the partitioning, and I have set the seed before this
2012 Jun 15
0
argument "x" is missing, with no default - Please help find argument x
R programming question, not machine learning, although that's the content. Apologies to all for whom the following code is eye-burning. I am using foreach() to run a simulation on a randomForest model (actually conditional randomForest ... "party" package). The simulation is in two dimensions, examining how "mtry" and "ntrees" are related in terms of predictive
2012 Dec 04
3
list to matrix?
How do I convert a list to a matrix? --8<---------------cut here---------------start------------->8--- list(c(50000, 101), c(1e+05, 46), c(150000, 31), c(2e+05, 17), c(250000, 19), c(3e+05, 11), c(350000, 12), c(4e+05, 25), c(450000, 19), c(5e+05, 16)) as.matrix(a) [,1] [1,] Numeric,2 [2,] Numeric,2 [3,] Numeric,2 [4,] Numeric,2 [5,] Numeric,2 [6,] Numeric,2 [7,]
2010 Aug 30
2
Band-wise Conditional Sum - Actual problem
Dear R helpers, Thanks a lot for your earlier guidance, esp. Mr David Winsemius Sir. However, there seems to be miscommunication from my end regarding my requirement. As I had mentioned in my earlier mail, I am dealing with a very large database of borrowers and I had given a part of it in my earlier mail as given below. For a given rating say "A", I needed to have the band-wise
2004 Sep 23
0
followup: Re: Issue with predict() for glm models
Could you just use lines(newX, myPred, col=2) -----Original Message----- From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch]On Behalf Of Paul Johnson Sent: Thursday, September 23, 2004 10:3 AM To: r help Subject: followup: Re: [R] Issue with predict() for glm models I have a follow up question that fits with this thread. Can you force an overlaid plot
2017 Sep 02
0
problem in testing data with e1071 package (SVM Multiclass)
Hello all, this is the first time I'm using R and the e1071 package and SVM multiclass (and I'm not a statistician)! I'm very confused, then. The goal is: I have a sentence with sunny; it will be classified as a "yes" sentence; I have a sentence with cloud, it will be classified as "maybe"; I have a sentence with rainy, it will be classified as "no". The
2004 Feb 23
2
outputs of KNN prediction
Hello there: I got 13 variables in my training/target set, the first 12 variables are a mixture of numerical and categorical variables. The last one is the one I need to predict, and it is a numerical variable. >train<-read.table("train.txt") >test<-read.table("test.txt") >cl<-factor(train[,13]) >pred<-knn(train, test, cl, k=3, prob=TRUE) >pred I got
2010 Jun 30
1
how to tabulate the prediction value using table function for naive baiyes in R
Hi, I have written code in R for classifying microarray data using naive Bayes; the code is given below: library(e1071) train<-read.table("Z:/Documents/train.txt",header=T); test<-read.table("Z:/Documents/test.txt",header=T); cl <- c(c(rep("ALL",10), rep("AML",10))); cl <- factor(cl) model <- naiveBayes(train,cl);
2010 Sep 30
1
Can this code be written more efficiently?
Dear users, I'm working on a binary classification problem using Support Vector Machines (SVM). My objective is to train a series of SVM models on a grid of hyperparameters and then select those that maximize the AUC based on an independent validation sample. My attempted code is shown below. It runs well on "small" data sets but when I use it on a slightly larger sample (e.g., my
2018 Mar 04
0
Random Seed Location
Thank you, everybody, who replied! I appreciate your valuable advice! I will move the location of the set.seed() command to after all packages have been installed and loaded. Best regards, Gary Sent from my iPad > On Mar 4, 2018, at 12:18 PM, Paul Gilbert <pgilbert902 at gmail.com> wrote: > > On Mon, Feb 26, 2018 at 3:25 PM, Gary Black <gwblack001 at sbcglobal.net> >