similar to: Hints for Data Mining

Displaying 20 results from an estimated 6000 matches similar to: "Hints for Data Mining"

2011 Sep 02
1
Hints for Data Clustering
Dear All, I will be confronted (relatively soon) with the following problem: given a set of known statistical indicators {s_i} , i=1,2...N for a N countries I would like to be able to do some data clustering i.e. determining the best way to partition the N countries according to their known properties, encoded by the {s_i} set of indicators for those countries. Some properties of these
2004 Apr 18
2
lm with data=(means,sds,ns)
Hi Folks, I am dealing with data which have been presented as, at each x_i: the mean m_i of the y-values at x_i, the sd s_i of the y-values at x_i, and the number n_i of the y-values at x_i; and I want to linearly regress y on x. There does not seem to be an option to 'lm' which can deal with such data directly, though the regression problem could be algebraically
2009 Oct 01
1
Help for 3D Plotting Data on 'Irregular' Grid
Dear All, Here is what I am trying to achieve: I would like to plot some data in 3D. Usually, one has a matrix of the kind
y_1(x_1), y_1(x_2), ..., y_1(x_i)
y_2(x_1), y_2(x_2), ..., y_2(x_i)
...
y_n(x_1), y_n(x_2), ..., y_n(x_i)
where e.g. y_2(x_1) is the value of y at time 2 at point x_1 (note that the grid in x is the same for the y values at all times).
2011 Oct 31
1
Question on estimating standard errors with noisy signals using the quantreg package
Dear all, My question might be more of a statistics question than a question on R, although it's on how to apply the 'quantreg' package. Please accept my apologies if you believe I am strongly misusing this list. To be very brief, the problem is that I have data on only a random draw, not all of doctors' patients. I am interested in the, say, median number of patients of
1998 Jun 24
1
SPAM: Important Legislative Alert (fwd)
this has serious ramifications for the "nt domains for unix" project. luke. ---------- Forwarded message ---------- Date: Tue, 23 Jun 1998 13:25:57 -0500 From: Simple Nomad <thegnome@NMRC.ORG> To: NTBUGTRAQ@LISTSERV.NTBUGTRAQ.COM Subject: SPAM: Important Legislative Alert June 23rd, 1998 - The World Intellectual Property Organization treaty has already passed the US Senate and is
2002 Apr 09
3
expressions on graphs
Hello, I am trying to get a time derivative on a plot title. I prefer to have it in the form \dot{s_i}, but \partial s_i/\partial t would be O.K. In the graphics demo I cannot find either a dot or a partial equivalent. Thanks, John. -- ========================================== John Janmaat Department of Economics Acadia University, Wolfville, NS, B0P 1X0 (902)585-1461 All opinions stated
2017 Aug 28
5
"Improvement with the R code"
Hello, I am trying to implement a formula a_ij = (number of transitions from state S_i to S_j) / (number of transitions at state S_i). The code I have written works with three states {1,2,3}, but if the number of states becomes {1,2,3,4,...,n} then the code will not work, so can someone help me with this? For and some rows of my data frame look like
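The ratio in this question can be sketched for any number of states as follows. This is only an illustration in Python, not the poster's code: the function name `transition_matrix`, the integer labels 1..n_states, and the choice to return a row of zeros for a state with no outgoing transitions are all assumptions.

```python
from collections import Counter


def transition_matrix(states, n_states):
    """Estimate a[i][j] = (# transitions i -> j) / (# transitions leaving i).

    `states` is a list of integer labels in 1..n_states (an assumption).
    """
    # Count every consecutive pair (current state, next state).
    counts = Counter(zip(states, states[1:]))
    matrix = []
    for i in range(1, n_states + 1):
        row_total = sum(counts[(i, j)] for j in range(1, n_states + 1))
        # A state with no outgoing transitions gets a row of zeros (assumed convention).
        matrix.append([
            counts[(i, j)] / row_total if row_total else 0.0
            for j in range(1, n_states + 1)
        ])
    return matrix
```

Because the pairs are counted once and the rows are built in a loop over the labels, nothing in the sketch depends on there being exactly three states.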
2007 Feb 17
1
Constraint maximum (likelihood) using nlm
Hi, I'm trying to find the maximum (likelihood) of a function. Therefore, I'm trying to minimize the negative likelihood function: # params: vector containing values of mu and sigma # params[1] - mu, params[2]- sigma # dat: matrix of data pairs y_i and s_i # dat[,1] - column of y_i , dat[,2] column of s_i negll <- function(params,dat,constant=0) { for(i in 1:length(dat[,1])) {
2010 Feb 06
1
Canberra distance
Hi the list, According to what I know, the Canberra distance between X and Y is: sum[ (|x_i - y_i|) / (|x_i|+|y_i|) ] (with | | denoting the function 'absolute value'). In the source code of the Canberra distance in the file distance.c, we find: sum = fabs(x[i1] + x[i2]); diff = fabs(x[i1] - x[i2]); dev = diff/sum; which corresponds to the formula: sum[ (|x_i - y_i|) /
2010 Nov 28
1
faster base::sequence
Hello, Based on yesterday's R-help thread (help: program efficiency), and following Bill's suggestions, it appeared that sequence: > sequence function (nvec) unlist(lapply(nvec, seq_len)) <environment: namespace:base> could benefit from being written in C to avoid unnecessary memory allocations. I made this version using inline: require( inline ) sequence_c <- local( {
2000 Oct 26
1
competing risks survival analysis
I will have data in the following form:
Time  resp type  stim type
300   a          A
200   b          A
155   a          B
250   b          B
80    c          A
1000  d          B
...
c is a left-censored observation; d is right censored. This sort of problem is discussed in Chap 9 of Cox & Oakes Analysis of Survival Data under the name
2017 Aug 28
0
"Improvement with the R code"
Hi, I think you overthought this one a little bit, I don't know if this is the kind of code you are expecting but I came up with something like that: generate_transition_matrix <- function(data, n_states) { #To be sure I imagine you should check n_states is right at this point transitions <- matrix(0, n_states, n_states) #we could improve a little bit here because at
2006 Dec 08
1
MAXIMIZATION WITH CONSTRAINTS
Dear R users, I'm a graduate student and in my master's thesis I must obtain the values of the parameters x_i which maximize this multinomial log-likelihood function log(n!)-sum_{i=1}^4 log(n_i!)+sum_{i=1}^4 n_i log(x_i) under the following constraints: a) sum_i x_i=1, x_i>=0, b) x_1<=x_2+x_3+x_4 c) x_2<=x_3+x_4 I have been using the 'ConstrOptim' R-function with the instructions
2007 May 08
5
Weighted least squares
Dear all, I'm struggling with weighted least squares, where something that I had assumed to be true appears not to be the case. Take the following data set as an example: df <- data.frame(x = runif(100, 0, 100)) df$y <- df$x + 1 + rnorm(100, sd=15) I had expected that: summary(lm(y ~ x, data=df, weights=rep(2, 100))) summary(lm(y ~ x, data=rbind(df,df))) would be equivalent, but
2009 Sep 11
2
[PATCH] generator.ml: Fix string list memory leak
Parsed string lists are allocated by malloc, but were never freed. --- src/generator.ml | 16 +++++++++++++++- 1 files changed, 15 insertions(+), 1 deletions(-) diff --git a/src/generator.ml b/src/generator.ml index 7571f95..c72c329 100755 --- a/src/generator.ml +++ b/src/generator.ml @@ -6320,7 +6320,7 @@ and generate_fish_cmds () = | OptString n | FileIn n |
2010 Sep 24
3
boundary check
Dear R, I have a covariates matrix with 10 observations, e.g. > X <- matrix(rnorm(50), 10, 5) > X [,1] [,2] [,3] [,4] [,5] [1,] 0.24857135 0.30880745 -1.44118657 1.10229027 1.0526010 [2,] 1.24316806 0.36275370 -0.40096866 -0.24387888 -1.5324384 [3,] -0.33504014 0.42996246 0.03902479 -0.84778875 -2.4754644 [4,] 0.06710229 1.01950917
2007 Feb 01
3
Help with efficient double sum of max (X_i, Y_i) (X & Y vectors)
Greetings. For R gurus this may be a no brainer, but I could not find pointers to efficient computation of this beast in past help files. Background - I wish to implement a Cramer-von Mises type test statistic which involves double sums of max(X_i,Y_j) where X and Y are vectors of differing length. I am currently using ifelse pointwise in a vector, but have a nagging suspicion that there is a
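One possible way to avoid the pointwise comparison for the double sum of max(X_i, Y_j) is to sort one vector and use prefix sums. The sketch below is in Python rather than R and is not taken from the thread; the name `double_sum_max` is hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def double_sum_max(xs, ys):
    """Return the sum over all i, j of max(xs[i], ys[j])."""
    ys_sorted = sorted(ys)
    # prefix[k] is the sum of the k smallest values in ys.
    prefix = [0] + list(accumulate(ys_sorted))
    total_y = prefix[-1]
    total = 0
    for x in xs:
        k = bisect_right(ys_sorted, x)  # number of ys that are <= x
        # Those k pairs contribute x each; every remaining pair contributes its y.
        total += x * k + (total_y - prefix[k])
    return total
```

With n = len(xs) and m = len(ys), this costs one sort of ys plus one binary search per element of xs, instead of n * m comparisons.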
2001 Mar 05
1
Canberra dist and double zeros
Canberra distance is defined in function `dist' (standard library `mva') as sum(|x_i - y_i| / |x_i + y_i|) Obviously this is undefined for cases where both x_i and y_i are zeros. Since double zeros are common in many data sets, this is a nuisance. In our field (from which the distance is coming), it is customary to remove double zeros: contribution to distance is zero when both x_i
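The double-zero convention described in this post can be sketched as below. This is a Python illustration, not the code of `dist`: it uses the |x_i + y_i| denominator quoted in the post, and the function name `canberra` is an assumption. Pairs such as x_i = -y_i, which also give a zero denominator in this form, are not handled and raise `ZeroDivisionError`.

```python
def canberra(x, y):
    """Canberra distance with the |x_i + y_i| denominator, skipping double zeros."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    total = 0.0
    for xi, yi in zip(x, y):
        if xi == 0 and yi == 0:
            # Double zero: the term would be 0/0, so it contributes nothing.
            continue
        total += abs(xi - yi) / abs(xi + yi)
    return total
```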
2011 Jan 21
2
ordering a vector
Hi, is there an R function that orders a matrix according to some criterion based on the rows (or cols) of that matrix? For example, let's say that my matrix S is composed of n rows S_1, S_2,.., S_n and that I compute some real value g_i=g(S_i) for each row. Then I want to order this set of g_i (from smallest to largest) and move the corresponding rows to their new positions. Is it possible (apart
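The reordering asked about here amounts to sorting row indices by g_i and then applying that permutation to the rows. A minimal Python sketch, assuming the matrix is a list of rows and `g` is any function of a row (both names hypothetical):

```python
def order_rows(rows, g):
    """Return the rows sorted by ascending g(row), plus the permutation used."""
    # Sort the row indices by the value g assigns to each row (ties keep their order).
    order = sorted(range(len(rows)), key=lambda i: g(rows[i]))
    return [rows[i] for i in order], order
```

Returning the permutation as well makes it possible to reorder any other object that is aligned with the rows.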