thr3ads.net - similar to: "k-means: should columns in dataset be in same scale?"

Displaying 20 results from an estimated 10000 matches similar to: "k-means: should columns in dataset be in same scale?"

Exporting to file: passing source name to file name in loop

2004 Nov 14

Hi, I'm having a mental block as to how I can automatically assign filenames to the output of the following code. I wish to create a separate .png file for every image created, each of them having a sequential filename, i.e. "sourcefile_index.png", so that I can create a movie from them. Please could someone tell me where I am going wrong? The following code works fine and

[ subscripting sometimes loses names (PR#8192)

2005 Oct 09

R, like recent versions of S-Plus, sometimes - but not always - loses names when subscripting objects with "[". (Earlier versions of S and S-Plus had the correct, name-preserving behavior.) This seems bad, it would be better to remove names only by explicit request, not as an accidental

distance coefficient for amatrix with ngative valus

2011 Oct 03

Hi, I need to run a PCoA (PCO) for a data set which has both positive and negative values for variables. I could not find any distance coefficient other than Euclidean distance that runs on the data set. Are there any other coefficients that work with negative values? Also, I cannot get summary output (the eigenvalues) for PCO as for PCA. Thanks. Dilshan

Comparing 2 dale columns

2017 Aug 23

Patrick,
## Run the following script and notice the different values of the data frame "data" in each instance.
# I understand you have done something like the following:
data <- data.frame(COL1 = c("6/1/14", "7/1/14"), COL2 = c("5/1/15", "5/1/15"), stringsAsFactors = FALSE)
data$Date_Flag <- ifelse(data$COL2 >

data(eurodist) and PCA ??

2004 Oct 13

If I perform PCA on the 'eurodist' data, should I get an accurate geographic layout of the cities with biplot? (barring inversions, i.e. there is no way to define north.. but you get the idea...) I have a complex distance matrix, and I am thinking about how to cluster it and how to visualize the quality of the resulting clusters. If I could 'see' the clusters in space I could

Converting from a dataset to a single "column"

2006 Jan 23

I have a dataset of 3 "columns" and 5 "rows".
temp<-data.frame(col1=c(5,10,14,56,7),col2=c(4,2,8,3,34),col3=c(28,4,52,34,67))
I wish to convert this to a single "column", with column 1 on "top" and column 3 on "bottom", i.e. 5 10 14 56 7 4 2 8 3 34 28 4 52 34 67. Are there any functions that do this, and that will work well on much larger datasets (e.g. 1000 rows, 6000 columns)?

selecting dataframe columns based on substring of col name(s)

2017 Jun 21

> On Jun 21, 2017, at 9:11 AM, Evan Cooch <evan.cooch at gmail.com> wrote: > > Suppose I have the following sort of dataframe, where each column name has a common structure: prefix, followed by a number (for this example, col1, col2, col3 and col4): > > d = data.frame( col1=runif(10), col2=runif(10), col3=runif(10),col4=runif(10)) > > What I haven't been able to

modify a data frame by values in the columns

2011 Jun 03

I have a data frame like this:
    col1 col2
r1     2    1
r2     4    3
r3     6    5
r4     8    7
r5    10    9
r6    12   11
r7    14   13
r8    16   15
r9    18   17
r10   20   19
I want to modify this data frame, for example, assign -1 to every row in columns col1 and col2 if the value in col1 is less than 12 and the value in col2 is greater than 10. The result should look like this: col1

How to loop through all the columns in dataframe

2008 Mar 16

Hi: Can anyone advise me on how to loop and perform a calculation through all the columns? Here's my data:
xd<- c(2.2024,2.4216,1.4672,1.4817,1.4957,1.4431,1.5676)
pd<- c(0.017046,0.018504,0.012157,0.012253,0.012348,0.011997,0.012825)
td<- c(160524,163565,143973,111956,89677,95269,81558)
mydf<-data.frame(xd,pd,td)
trans<-t(mydf)
trans
I have these values that I need to

cmdscale k=1

2002 Feb 15

In applying multidimensional scaling, it seems to me that sometimes the underlying dimensionality of the matrix is 1. However I found a case where cmdscale failed when I tried k=1. Here it is:
m<-matrix( c(.5,.81,.23,.47,.61,
.19,.5,.06,.17,.28,
.77,.94,.5,.74,.85,
.53,.83,.26,.5,.64,
.39,.72,.15,.36,.5), nrow=5)
# BTW I think cmdscale uses only the lower triangle--how to enter only
# that

selecting dataframe columns based on substring of col name(s)

2017 Jun 21

Suppose I have the following sort of dataframe, where each column name has a common structure: prefix, followed by a number (for this example, col1, col2, col3 and col4): d = data.frame( col1=runif(10), col2=runif(10), col3=runif(10),col4=runif(10)) What I haven't been able to suss out is how to efficiently 'extract/manipulate/play with' columns from the data frame, making use

Comparing 2 dale columns

2017 Aug 23

Hi, your code is wrong. I get
> test<-read.table("clipboard", header=T)
> str(test)
'data.frame': 2 obs. of 2 variables:
 $ COL1: Factor w/ 2 levels "6/1/14","7/1/14": 1 2
 $ COL2: Factor w/ 1 level "5/1/15": 1 1
> test$COL2<- as.Date(as.character(test$COL2, format="%y/%m/%d"))
> test$COL1<-

Comparing 2 dale columns

2017 Aug 23

Thanks. But when I apply your code I get all NA instead of TRUE and FALSE.
From: PIKAL Petr <petr.pikal at precheza.cz>
Sent: Wednesday, August 23, 2017 11:20:00 AM
To: Patrick Casimir; r-help at r-project.org
Subject: RE: Comparing 2 dale columns
Hi your code is wrong. I get
> test<-read.table("clipboard", header=T)
> str(test)

help! kennard-stone algorithm in soil.spec packages does not work for my dataset!!!

2010 Nov 07

http://r.789695.n4.nabble.com/file/n3031344/RSV.Rdata RSV.Rdata I want to split my dataset into a training set and a test set using the Kennard-Stone (KS) algorithm. Luckily, the R package soil.spec implements it, but when I apply it to my dataset it does not work. Can anyone help me find the reason? Below is my code; my data is in the attachment.

k-means with euclidian distance but no coordinates

2001 Dec 13

Hi, I'm trying to build a thesaurus that will sensible values for rare words. I suspect the best algorithm to use is k-means although I'm not sure about that -- I would have preferred a k dimensional space with a binary cluster in each dimension so a word can belong to 0..k clusters, but I digress... I can measure the strength of correlation between words fairly easily by counting

Comparing 2 dale columns

2017 Aug 23

Dear R fellows, I created a new column Date_flag to compare the dates of COL1 and COL2 using the code below. But it showed that 5/1/15 is greater than 6/1/2014 and 5/1/2015 greater than 7/1/2014 despite the year is greater. How do I fix that? I did try to format as %y/%m/%d but it does not fix that. data$Date_Flag <- ifelse(data$COL2 > data$COL1, 0,1) COL1 COL2 6/1/14

Significance of Principal Coordinates

2005 Mar 14

Dear all, I was looking for methods in R that allow assessing the number of significant principal coordinates. Unfortunately I was not very successful. I expanded my search to the web and Current Contents; however, the information I found is very limited. Therefore, I tried to write code for doing a randomization. I would highly appreciate it if somebody could comment on the following approach.

row-wise means

2009 Nov 18

I have a dataframe with 3 columns. The first column stores an index. I would like to calculate the mean of the numbers stored in each of the rest of the columns. So, here is my data matrix:
col1 col2 col3
1    23   34
2    45   56
3    23   56
4    34   68
For each row I would like to calculate the means of the numbers stored in col2 and col3. How can this be done in R? TIA, Anjan

how to draw a 45 degree line on qqnorm() plot?

2005 Apr 03

# I cannot draw a 45 degree line on a qqnorm() plot,
jj <- sample(c(1:100), 10)
qqnorm(jj)
abline() doesn't work. Thank you.

K MEANS clustering

2016 Jul 27

Hey Parth, Thanks for the reply. I am considering implementing a cosine distance metric too, along with Euclidean distance, because of the dimensionality issue that comes with K-Means and the Euclidean distance metric. That does help when we deal with sparse vectors for documents. The particular problem I'm having is representing centroids in an efficient way. For example, when we find the mean
