thr3ads.net - similar to: "tree model with at most one split point per variable"

Displaying 20 results from an estimated 140 matches similar to: "tree model with at most one split point per variable"

2017 Jul 28

problem with "unique" function

Most likely, previous computations have ended up giving slightly different values of say 0.13333. A pragmatic way out is to round to, say, 5 digits before applying unique. In this particular case, it seems like all numbers are multiples of 1/30, so another idea could be to multiply by 30, round, and divide by 30. -pd > On 28 Jul 2017, at 17:17 , li li <hannah.hlx at gmail.com> wrote:
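The rounding workaround suggested above can be sketched in R as follows. The vector `x` is a made-up illustration (not the poster's data): three ways of arriving at 0.3, a multiple of 1/30, that do not all yield the same floating-point number.

```r
# Hypothetical data: values that should all equal 9/30 = 0.3,
# but were reached by different computations.
x <- c(0.3, 0.1 + 0.2, 9/30)

# unique() compares exact floating-point values, so near-duplicates survive.
n_raw <- length(unique(x))

# Option 1: round to 5 digits before applying unique().
u_round <- unique(round(x, 5))

# Option 2: the values are assumed to be multiples of 1/30, so snap them
# to the nearest multiple: multiply by 30, round, divide by 30.
u_snap <- unique(round(x * 30) / 30)

print(n_raw)
print(u_round)
print(u_snap)
```

Option 2 only makes sense when the grid spacing (here 1/30) is known in advance; rounding to a fixed number of digits is the more general choice.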

2017 Jul 28

problem with "unique" function

I have the joint distribution of three discrete random variables z1, z2 and z3, which is captured by "z" and "prob" as described below. For example, the probability for z1=-0.46667, z2=-1 and z3=-1 is 2.752e-13. Also, the probabilities add up to 1.
> head(z)
           z1      z2      z3
[1,] -0.46667 -1.0000 -1.0000
[2,] -0.33333 -0.9333 -0.9333
[3,] -0.20000 -0.8667 -0.8667

2010 Dec 13

How does R compute sums of squares?

Consider the following missing data problem:
y = c(1, 2, 2, 2, 3)
a = factor(c(1, 1, 1, 2, 2))
b = factor(c(1, 2, 3, 1, 2))
fit = lm(y ~ a + b)
anova(fit)
Analysis of Variance Table
Response: y
          Df  Sum Sq Mean Sq    F value    Pr(>F)
a          1 0.83333 0.83333 1.3637e+33 < 2.2e-16 ***
b          2 1.16667 0.58333 9.5461e+32 < 2.2e-16 ***
Residuals  1 0.00000 0.00000
---

2012 Sep 29

Unexpected behavior with weights in binomial glm()

Hi useRs, I'm experiencing something quite weird with glm() and weights, and maybe someone can explain what I'm doing wrong. I have a dataset where each row represents a single case, and I run glm(...,family="binomial") and get my coefficients. However, some of my cases have the exact same values for predictor variables, so I should be able to aggregate up my data frame and

2012 Oct 08

arima.sim

Hi, I have been using arima.sim from the stats package recently, and I'm wondering why I get different results when using what seem to be the same parameters. For example, I've given examples of three different ways to run arima.sim with what I believe are the same parameters. It's my understanding from the R documentation that rnorm is the default function for rand.gen if not

2008 Apr 06

row by row similarity

Hello all and thanks in advance for any advice. I am very new to R and have searched my question but have not come up with anything quite like what I would like to do. My problem is: I have a data set for individuals (rows) and values for behaviours (columns). I would like to know the proportion of shared behaviours for all possible pairs of individuals. The sum of shared behaviours divided by

2006 Nov 13

Confidence intervals for relative risk

Wolfgang, It is common to handle relative risk problems using Poisson regression. In your example you have 8 events out of 508 tries, and 0/500 in the second data set.
> tdata <- data.frame(y=c(8,0), n=c(508,500), group=1:0)
> fit <- glm(y ~ group + offset(log(n)), data=tdata, family=poisson)
Because of the zero, the standard beta/se(beta) confidence intervals don't work.

2006 Jan 04

Difficulty with 'merge'

Dear R-helpers, Happy New Year to all the helpful members of the list. Here is the behavior I'm looking for:
> v1 <- c("a","b","c")
> n1 <- c(0, 1, 2)
> v2 <- c("c", "a", "b")
> n2 <- c(0, 1 , 2)
> (f1 <- data.frame(v1, n1))
  v1 n1
1  a  0
2  b  1
3  c  2
> (f2 <- data.frame(v2, n2))

2009 Dec 21

Signif. codes

My question is about the "Signif. codes" and the p-value, specifically, the output when I run summary(nameofregression.lm) So you get this little key: Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 And on a regression I ran, next to the intercept data, I get '***' Coefficients: > > Estimate Std. Error t value Pr(>|t|) > >

2005 Feb 22

ERROR NaNs produced; when comparing two logistic regression models with the ANOVA CHI test

Dear R-list, when comparing two logistic regression models with the anova Chi test, I obtain the following error (there are no NA's in the time series). How can this be solved such that I can compare two models on the same dataset where different explanatory variables are used?
l.KBDI <- glm(zna.arson2 ~ zna.KBDI,family = binomial)
l.NDWI <- glm(zna.arson2 ~ zna.NDWI,family

2010 May 11

how to extract the variables used in decision tree

Hi, Dear R community, How can I extract the variables actually used in tree construction? I want to extract these variables and combine them with other variables as my features in the next step of model building.
> printcp(fit.dimer)
Classification tree:
rpart(formula = outcome ~ ., data = p_df, method = "class")
Variables actually used in tree construction:
[1] CT DP DY FC NE NW QT SK TA WC WD WG WW

2012 Jul 04

Error in hclust?

Dear R users, I have noted a difference in the merge distances given by hclust using centroid method. For the following data:
x<-c(1009.9,1012.5,1011.1,1011.8,1009.3,1010.6)
and using Euclidean distance, hclust using centroid method gives the following results:
> x.dist<-dist(x)
> x.aah<-hclust(x.dist,method="centroid")
> x.aah$merge
     [,1] [,2]
[1,]   -3   -6

2007 Mar 02

significant anova but no distinct groups ?

Dear all, I am studying a dataset using the aov() function. The independent variable 'cds' is a factor() with 8 levels and here is the result in studying the dependent variable 'rta' with aov():
> summary(aov(rta ~ cds))
            Df  Sum Sq Mean Sq F value  Pr(>F)
cds          7 0.34713 0.04959  2.3807 0.02777
Residuals   92 1.91635 0.02083
The dependent variable

2012 Sep 07

Producing a table with mean values

Hi All, I have a data set with three size classes (pico, nano and micro) and 12 different sites (Seamounts). I want to produce a table with the mean and standard deviation values for each site.
     Seamount   Pico    Nano   Micro Total_Ch
1 Off_Mount 1 0.0691 0.24200 0.00100  0.31210
2 Off_Mount 1 0.0938 0.00521 0.02060  0.11961
3 Off_Mount 1 0.1130 0.20000 0.06620  0.37920
4 Off_Mount 1

2024 Feb 27

converting MATLAB -> R | element-wise operation

So, trying to convert a very long, somewhat technical bit of lin alg MATLAB code to R. Most of it working, but ran into a stumbling block that is probably simple enough for someone to explain. Basically, trying to 'line up' MATLAB results from an element-wise division of a matrix by a vector with R output. Here is a simplified version of the MATLAB code I'm translating: NN = [1,

2024 Feb 27

[External] converting MATLAB -> R | element-wise operation

> t(t(NN)/lambda)
     [,1]      [,2] [,3]
[1,]  0.5 0.6666667 0.75
[2,]  2.0 1.6666667 1.50
> R matrices are column-based. MATLAB matrices are row-based.
> On Feb 27, 2024, at 14:54, Evan Cooch <evan.cooch at gmail.com> wrote:
> > So, trying to convert a very long, somewhat technical bit of lin alg
> MATLAB code to R. Most of it working, but ran into a stumbling block
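As a sketch of why the double transpose is needed: R recycles a vector down the columns of a matrix, so a plain `NN / lambda` pairs the elements of `lambda` with positions running down each column rather than with the columns themselves. The values of `NN` and `lambda` below are assumptions chosen to be consistent with the output shown above, since the original definitions are cut off in this excerpt. `sweep()` is a base R alternative offered here for comparison, not something the excerpt itself recommends.

```r
# Assumed inputs (hypothetical; the original definitions are truncated).
NN <- matrix(c(1, 2, 3,
               4, 5, 6), nrow = 2, byrow = TRUE)
lambda <- c(2, 3, 4)

# Divide each column j of NN by lambda[j] via the double transpose.
by_transpose <- t(t(NN) / lambda)

# Same operation with sweep(): MARGIN = 2 applies lambda across columns.
by_sweep <- sweep(NN, 2, lambda, "/")

# Plain division recycles lambda down the columns, giving a different result.
recycled <- NN / lambda

print(by_transpose)
print(by_sweep)
print(recycled)
```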

2007 Aug 28

help with aggregate(): tables of means for terms in an mlm

I'm trying to extend some work in the car and heplots packages that requires getting a table of multivariate means for one (or later, more) terms in an mlm object. I can do this for concrete examples, using aggregate(), but can't figure out how to generalize it. I want to return a result that has the factor-level combinations as rownames, and the means as the body of the table

2012 Jul 06

Anova Type II and Contrasts

the study design of the data I have to analyse is simple. There is 1 control group (CTRL) and 2 different treatment groups (TREAT_1 and TREAT_2). The data also includes 2 covariates COV1 and COV2. I have been asked to check if there is a linear or quadratic treatment effect in the data. I created a dummy data set to explain my situation: df1 <- data.frame( Observation =

2010 Nov 06

variable type assignment in daisy

Dear Rhelp, I did a daisy on 5 lifestyle variables, 3 of which were nominal and 2 were ordinal and assigned types “nominal” and “ordinal” for the variables, respectively. I got an output indicating their types as “I” for interval(?). Doing it on the Rdata example “flower” gave the same types in the output as the types they were assigned to. Why is this so? Below are the codes and outputs.

2009 Jun 14

time function behavior for ts class objects

Hi all- I am trying to use the time function for ts class objects and do not understand the return value. I want to use it to set up a time trend in arima fits. It does not seem to return a correct linear sequence that matches the underlying time series. I am running: R version 2.8.1 (2008-12-22). For example:
R> ## create a time series
R> x <- rnorm(24)
R> (xts <-
