thr3ads.net - similar to: "terms weight access"

Displaying 20 results from an estimated 10000 matches similar to: "terms weight access"

2004 Dec 14

stopwords

Hi! I would like to use the lists of stopwords provided with Xapian. Is there a standard way to remove stopwords automatically, or should I implement it myself in the indexer? Regards, Georges Dupret

2012 Nov 17

survfit & number of variables != number of variable names

This works OK:

> cox = coxph(surv ~ bucket*(today + accor + both) + activity, data = data)
> fit = survfit(cox, newdata=data[1:100,])

but using strata leads to problems:

> cox.s = coxph(surv ~ bucket*(today + accor + both) + strata(activity), data = data)
> fit.s = survfit(cox.s, newdata=data[1:100,])
Error in model.frame.default(data = data[1:100, ], formula = ~bucket + :

2007 Sep 12

k-means clustering

Dear list, first, apologies, as this is not strictly an R question but a theoretical one. I have read that the use of k-means clustering assumes sphericity of the data distribution. Can anyone explain to me what this means? My statistical background is too poor. Is it another kind of distribution, like Gaussian or binomial? What happens if the distribution is not spherical? Could you give me an

2007 Apr 22

distance method in kmeans

I am trying to cluster some binary data using k-means. The regular "kmeans" available from the stats package in R doesn't provide an option to change the distance method, so I was wondering whether there is any package that lets me specify the type of distance measure used in k-means clustering in R, especially distances like "Jaccard", which is good for binary data.
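
As a hedged illustration of the kind of approach the question is after (not a reply from the thread): base R's dist() with method = "binary" computes the Jaccard distance between 0/1 rows, and pam() from the cluster package (k-medoids, swapped in here because kmeans() itself takes no distance argument) accepts such a dist object. The toy matrix below is made up for the example.

```r
# Toy binary data (hypothetical): rows 1-3 share features 1-3, rows 4-6 share features 4-6.
x <- rbind(
  c(1, 1, 1, 0, 0, 0),
  c(1, 1, 0, 0, 0, 0),
  c(1, 0, 1, 0, 0, 0),
  c(0, 0, 0, 1, 1, 1),
  c(0, 0, 0, 1, 1, 0),
  c(0, 0, 0, 0, 1, 1)
)

# "binary" in stats::dist is the Jaccard distance for 0/1 data.
d <- dist(x, method = "binary")

# k-medoids on the precomputed dissimilarities (cluster ships with standard R installs).
fit <- cluster::pam(d, k = 2)
fit$clustering
```

Because pam() works from dissimilarities only, any other distance that can be expressed as a dist object could be substituted in the same way.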

2013 Jul 17

Why last doesn't return an ActiveRecord::Relation

Hello, Sorry if this has already been answered; I haven't found anything on it. I would love to know why ActiveRecord::Base#last doesn't return an ActiveRecord::Relation just like all or where, since an ActiveRecord::Relation can act more or less like an array (as specified here <https://github.com/rails/rails/commit/0a6833b6f701c8c8febadfe2f45e25df29493602>)? Thanks, have

2012 May 02

coxph reference hazard rate

Hi, In the following results I interpret exp(coef) as the factor that multiplies the base hazard rate if the corresponding variable is TRUE. For example, when the bucket is ks008 and fidelity <= 3, then the rate, compared to the base rate h_0(t), is h(t) = 0.200 h_0(t). My question is then, to what case does the base hazard rate correspond? I would expect the reference to be the first

2005 Mar 31

Using kmeans given cluster centroids and data with NAs

Hello, I have used the functions agnes and cutree to cluster my data (4977 objects x 22 variables) into 8 clusters. I would like to refine the solution using a k-means or similar algorithm, setting the initial cluster centres as the group means from agnes. However, my data matrix has NAs in it and the function kmeans does not appear to accept this.

> dim(centres)
[1] 8 22
> dim(data)
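
A minimal sketch of one way to set this up, assuming it is acceptable to drop rows containing NAs before the refinement step (the post does not say so). The data are simulated, and hclust stands in for agnes so that the example needs only base R; kmeans() accepts a matrix of starting centres through its centers argument.

```r
set.seed(42)

# Simulated data (hypothetical): three well-separated groups in 4 dimensions.
x <- rbind(matrix(rnorm(60 * 4, mean = 0),  ncol = 4),
           matrix(rnorm(60 * 4, mean = 5),  ncol = 4),
           matrix(rnorm(60 * 4, mean = 10), ncol = 4))
x[sample(length(x), 20)] <- NA          # sprinkle in missing values

# Assumption: rows with any NA are simply dropped.
ok <- complete.cases(x)
xc <- x[ok, , drop = FALSE]

# Hierarchical solution and its group means (one row per group).
groups  <- cutree(hclust(dist(xc), method = "average"), k = 3)
centres <- apply(xc, 2, function(col) tapply(col, groups, mean))

# Refine with k-means, starting from the hierarchical group means.
fit <- kmeans(xc, centers = centres)
table(hierarchical = groups, kmeans = fit$cluster)
```

Dropping incomplete rows is only one option; imputing the missing values first would keep every object in the analysis, at the cost of an extra modelling choice.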

2008 Jun 18

Cluster on both categorical and numerical data

Hello there. Is there any function in R that can cluster a data set that has both categorical and numerical variables? Thanks. siangli

2013 May 21

keep the centre fixed in K-means clustering

Dear R users, I have the matrix of the centres of some clusters, e.g. 20 clusters each with 100 dimensions, so this matrix contains 20 rows * 100 columns of numeric values. I have collected new data (each with 100 numeric values) and would like to keep the above 20 centres fixed/'unmoved' while just seeing how my new data fit in this grouping system, e.g. if the data is close to cluster 1
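
The task described here amounts to a nearest-centre assignment with no update step. A minimal base-R sketch, assuming Euclidean distance is the intended measure (the snippet does not say) and using simulated matrices in place of the poster's data:

```r
set.seed(1)

# Hypothetical stand-ins: 20 fixed centres and 5 new observations, 100 dimensions each.
centres <- matrix(rnorm(20 * 100), nrow = 20)
newdata <- matrix(rnorm(5 * 100),  nrow = 5)

# Squared Euclidean distance from every new row to every centre
# (rows = new observations, columns = centres), via |a|^2 + |b|^2 - 2ab.
d2 <- outer(rowSums(newdata^2), rowSums(centres^2), "+") - 2 * newdata %*% t(centres)

# Index of the nearest centre for each new observation; the centres are never changed.
nearest <- max.col(-d2, ties.method = "first")
nearest
```

The matrix d2 can also be inspected directly when the distance to each centre matters, not just the closest one.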

2003 May 24

predicting fuzzy cluster membership

Dear all, I'm trying to obtain a fuzzy clustering with fanny from the cluster package, using a given set of data. That worked just fine. I have another separate sample of data from the same problem. For each case in this new sample I would like to know their membership coefficients with respect to the clustering obtained with the first dataset. In effect I want to have a kind of prediction

2001 Sep 06

Array as time series?

Dear R-helpers, I have 4-dimensional atmospheric data (x,y,z,t), which I want to analyse for spatio-temporal diversity. As far as I understand, time series can only be constructed as two-dimensional matrices (mts). For the moment, I hold the data in different objects: 1. a four-dimensional array for the spatially related analyses 2. a two-dimensional mts time series, which was

2016 Mar 07

GSOC-2016 Project : Clustering of search results

On Mon, Mar 07, 2016 at 01:36:43AM +0530, Richhiey Thomas wrote: > My questions are: > 1) Can you direct me on how to convert this raw idea into a proposal in > context to Xapian with more detail? What areas do I focus on? Our GSoC guide has an application template <https://trac.xapian.org/wiki/GSoCApplicationTemplate> which you should use to structure your proposal. It has some

2007 Dec 05

Information criteria for kmeans

Hello, how is, for example, the Schwarz criterion defined for kmeans? It should be something like:

k <- 2
vars <- 4
nobs <- 100
dat <- rbind(matrix(rnorm(nobs, sd = 0.3), ncol = vars),
             matrix(rnorm(nobs, mean = 1, sd = 0.3), ncol = vars))
colnames(dat) <- paste("var", 1:4)
(cl <- kmeans(dat, k))
schwarz <- sum(cl$withinss) + vars*k*log(nobs)

Thanks
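
One possible formalisation, offered as a sketch under stated assumptions rather than as an established definition: treat the k-means solution as a hard-assignment Gaussian model with a single shared spherical variance. The -2 log-likelihood is then n*d*(log(2*pi*sigma2) + 1) with sigma2 = tot.withinss/(n*d), and the penalty counts k*d centre coordinates plus one variance. The helper name kmeans_bic and the simulated data are hypothetical.

```r
set.seed(1)

# Simulated data (hypothetical): two groups of 50 observations in 4 variables.
n_per_group <- 50
vars <- 4
dat <- rbind(matrix(rnorm(n_per_group * vars, sd = 0.3), ncol = vars),
             matrix(rnorm(n_per_group * vars, mean = 1, sd = 0.3), ncol = vars))

# BIC-style criterion under the assumed model: hard assignments,
# one spherical variance shared by all clusters.
kmeans_bic <- function(fit, x) {
  n <- nrow(x)
  d <- ncol(x)
  k <- nrow(fit$centers)
  sigma2 <- fit$tot.withinss / (n * d)              # ML estimate of the shared variance
  neg2loglik <- n * d * (log(2 * pi * sigma2) + 1)  # -2 * maximised log-likelihood
  neg2loglik + (k * d + 1) * log(n)                 # penalty: k*d centres + 1 variance
}

bic <- sapply(1:4, function(k) kmeans_bic(kmeans(dat, centers = k, nstart = 10), dat))
bic
```

Lower values are better under this convention. Because the likelihood uses hard assignments, the criterion can be optimistic about extra clusters, so it is best read as a rough guide.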

2004 May 28

distance in the function kmeans

Hi, I want to know which distance is used in the function kmeans and whether we can change this distance. Indeed, in the function pam, we can pass a distance matrix as a parameter (with the line "pam<-pam(dist(matrixdata),k=7)"), but we can't do that in the function kmeans; we have to pass the data matrix directly ... Thanks in advance, Nicolas BOUGET

2013 Mar 19

Cluster analysis on weighted survey data with continuous and categorical variables

I am trying to perform cluster analysis on survey data where each respondent has answered several questions, some of which have categorical answers ("blue", "pink", "green", etc.) and some of which have scale answers (rating from 1 to 10, etc.). My problem is that certain age groups were over-sampled and I need to weight the data collected in order to accurately reflect the

2016 Apr 04

Using final sample weight in survey package

I have the final sample weight (expansion factor) from a socioeconomic survey. I don't know the exact design used in the study (it is probably a stratified two-stage design). To illustrate my problem I will use the following dataset, which has a sample weight (but the design is not specified), and incorporate the design with svydesign and create some bootstrap replicates in order to be able to

2005 Apr 22

algorithm used in k-mean clustering

Hi, I have used the kmeans function in R to produce some results for my analysis. I would like to know the specific underlying algorithm used for the implementation of the function kmeans in R. I tried looking for some documents but could not find any. I obtained the kmeans result for k ranging from 2 to 10. When I did this initially it worked perfectly. When I tried running it again I get the error

2016 Mar 06

GSOC-2016 Project : Clustering of search results

On Sun, Mar 6, 2016 at 7:17 AM, James Aylett <james-xapian at tartarus.org> wrote: > On Sat, Mar 05, 2016 at 10:58:43PM +0530, Richhiey Thomas wrote: > > K-Means or something related certainly seems like a viable approach, > so what you'll need to do is to come up with a proposal of how you'd > implement this in Xapian (either with reference to the previous work, >

2016 Apr 04

Using final sample weight in survey package

hi, probably not.. if your survey dataset has a complex design (like clusters/strata), you need to include them in the `svydesign` call. coercing an incorrect survey design into a replicate-weighted design will not fix the problem of failing to account for the sampling strategy On Mon, Apr 4, 2016 at 12:01 AM, José Fernando Zea <jfzeac at gmail.com> wrote: > I have the final sample

2005 Jun 16

Survey - Cluster Sampling

Dear WizaRds, I am struggling to compute correctly a cluster sampling design. I want to do one stage clustering with different parametric changes: Let M be the total number of clusters in the population, and m the number sampled. Let N be the total of elements in the population and n the number sampled. y are the values sampled. This is my example data: clus1 <-
