Displaying 20 results from an estimated 8000 matches similar to: "translate grouped data to their centroid"
2008 Sep 02
2
cluster a distance(analogue)-object using agnes(cluster)
I am trying to perform a clustering using an existing dissimilarity matrix that I
calculated using distance (analogue).
I tried two different things. One of them worked and one did not, and I don't
understand why.
Here is the code:
Non-working example:
library(cluster)
library(analogue)
iris2<-as.data.frame(iris)
str(iris2)
'data.frame': 150 obs. of 5 variables:
$ Sepal.Length: num 5.1 4.9 4.7
2018 Jan 28
0
Newbie wants to compare 2 huge RDSs row by row.
The diffobj package (https://cran.r-project.org/package=diffobj) is
really helpful here. It provides "diff" functions diffPrint(),
diffStr(), and diffChr() to compare two objects 'x' and 'y' and provide
neat colorized summary output.
Example:
> iris2 <- iris
> iris2[122:125,4] <- iris2[122:125,4] + 0.1
> diffobj::diffPrint(iris2, iris)
< iris2
>
2018 Jan 28
1
Newbie wants to compare 2 huge RDSs row by row.
Thanks, I think I've found the most succinct expression of differences in two data.frames...
length(which( rowSums( x1 != x2 ) > 0))
gives a count of the # of records in two data.frames that do not match.
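As a rough illustration of the one-liner above, the sketch below applies it to two small made-up data frames. The contents of x1 and x2 here are assumptions for demonstration only; the original poster's data is not shown. Note that a row whose comparison yields NA is silently left out by which(), so real data with missing values may need extra handling.

```r
# Hypothetical example data: same shape, rows 2 and 3 differ.
x1 <- data.frame(a = c(1, 2, 3), b = c("u", "v", "w"), stringsAsFactors = FALSE)
x2 <- data.frame(a = c(1, 2, 4), b = c("u", "z", "w"), stringsAsFactors = FALSE)

# x1 != x2 is a logical matrix; rowSums() counts mismatching cells per row.
n_diff <- length(which(rowSums(x1 != x2) > 0))

# Row indices of the mismatching records, in case they are needed as well.
diff_rows <- which(rowSums(x1 != x2) > 0)
```

Both data frames are assumed to have the same dimensions and the same row order; the comparison is positional, not key-based.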
//
________________________________________
From: Henrik Bengtsson [henrik.bengtsson at gmail.com]
Sent: Sunday, January 28, 2018 11:12 AM
To: Ulrik Stervbo
Cc: Marsh Hardy ARA/RISK;
2005 Apr 07
2
axis colors in pairs plot
The following command produces red axis lines in a pairs
plot:
pairs(iris[1:4], main = "Anderson's Iris Data -- 3 species",
pch = "+", col = c("red", "green3", "blue")[unclass(iris$Species)])
Trying to fool pairs in the following way produces the
same plot as above:
pairs(iris[1:4], main = "Anderson's Iris Data -- 3
2006 Jul 11
2
R newbie: logical subsets
Hello! I'm a newcomer to R hoping to replace some convoluted database
code with an R script. Unfortunately, I haven't been able to figure out
how to implement the following logic.
Essentially, we have a database of transactions that are coded with a
geographic locale and a type. These are being loaded into a data.frame
with named variables city, type, and price. E.g., trans$city
2012 Apr 15
2
xyplot type="l"
Probably a stupidly simple question, but I wouldn't know how to google it:
xyplot(neuro ~ time | UserID, data=data_sub)
creates a proper plot.
However, if I add
type = "l"
the lines do not go through time1 first, then time2, then time3, etc.; in
about 50% of all subjects the lines pass through the points in a seemingly random order
(e.g. from 1 to 4 to 2 to 5 to 3).
The lines always start at time
2010 Sep 21
5
removed data is still there!
I'm confused, hope someone can point out what is not obvious to me.
I thought I was creating a new data frame by 'deleting' rows from an
existing dataframe - I've tried 2 methods.
But this new data frame seems to remember values from its parent, even
though there are no occurrences of them.
Where does it get the values versicolor and virginica from, and why does it give them a
count of 0?
What
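The behaviour described in this post can be reproduced with a short sketch. The poster's own two methods are not shown in the snippet, so the subsetting below is an assumption; it only illustrates that a factor keeps its full set of levels after rows are removed, and that droplevels() is one way to discard the unused ones.

```r
# Keep only the setosa rows; the Species factor still carries all three levels.
setosa_only <- iris[iris$Species == "setosa", ]
counts_before <- table(setosa_only$Species)   # versicolor and virginica show 0

# droplevels() removes factor levels that no longer occur in the data.
setosa_clean <- droplevels(setosa_only)
counts_after <- table(setosa_clean$Species)   # only setosa remains
```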
2012 Dec 10
3
splitting dataset based on variable and re-combining
I have a dataset and I wish to use two different models to predict. Both models are SVM. The reason for two different models is based
on the sex of the observation. I wish to be able to make predictions and have the results be in the same order as my original dataset. To
illustrate I will use iris:
# Take Iris and create a dataframe of just two Species, setosa and versicolor, shuffle them
2006 Sep 22
3
extract data from lm object and then use again?
Hi list,
I want to write a general function that takes an lm object,
extracts its data element, and then uses the data in another R function (e.g., glm).
I searched the R-help list and found that this does the first part:
a.lm$call$data
This returns a name object, but it is not recognized as a
data.frame by glm. I also tried
call(as.character(a.lm$call$data))
or
2012 Aug 01
3
Neuralnet Error
I require some help in debugging this code
library(neuralnet)
ir<-read.table(file="iris_data.txt",header=TRUE,row.names=NULL)
ir1 <- data.frame(ir[1:100,2:6])
ir2 <- data.frame(ifelse(ir1$Species=="setosa",1,ifelse(ir1$Species=="versicolor",0,"")))
colnames(ir2)<-("Output")
ir3 <- data.frame(rbind(ir1[1:4],ir2))
2018 Jan 28
2
Newbie wants to compare 2 huge RDSs row by row.
The anti_join function from the dplyr package might also be handy.
install.packages("dplyr")
library(dplyr)
anti_join(x1, x2)
You can get help on the different functions by ?function.name(), so
?anti_join() will bring you help - and examples - on the anti_join
function.
It might be worth testing your approach on a small subset of the data. That
makes it easier for you to follow what happens
2004 Aug 21
2
more on apply on data frame
Hi R People:
Several of you pointed out that using "tapply" on a data frame will work on
the iris data frame.
I'm still having a problem.
The iris data frame has 150 rows, 5 variables. The first 4 are numeric,
while the last is a factor, which has the Species names.
I can use tapply for 1 variable at a time:
>tapply(iris[,1],iris[,5],mean)
setosa versicolor virginica
2008 Mar 27
2
colMeans in a data.frame with numeric and character data
Hi all,
I would like to know whether it is possible, in some way, to get colMeans from
a data.frame with numeric as well as character data dispersed all over
the object. Note that I would like to get colMeans while ignoring the character
data.
I am really in need of a function that works that way.
All the best
Diogo André Alagador
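One possible approach to the question above, sketched under the assumption that the character data occupies whole columns (the data frame df below is made up for illustration), is to select the numeric columns first and pass only those to colMeans().

```r
# Hypothetical data frame mixing numeric and character columns.
df <- data.frame(a = c(1, 2, 3),
                 name = c("x", "y", "z"),
                 b = c(10, 20, 30),
                 stringsAsFactors = FALSE)

# Logical vector marking which columns are numeric.
num_cols <- sapply(df, is.numeric)

# Column means over the numeric columns only; drop = FALSE keeps a data frame
# even if a single column is selected.
col_means <- colMeans(df[, num_cols, drop = FALSE])
```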
2009 Dec 10
1
question about centroid-linkage (cluster analysis)
Dear R community,
I would be grateful if somebody could shed light on the following.
I have created a set of 6 points to check how centroid
agglomeration works in cluster analysis:
> Y <- data.frame(x=c(-1,1,1,-1,10,12),y=c(1,1,-1,-1,0,0))
It is quite intuitive to understand that the last clusters to be joined will be
{1,2,3,4} with {5,6}. Now, the centroid for the first cluster has
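The centroids of the two clusters named in this post can be checked directly from the six points given. This is only a sketch of the arithmetic; the names c1, c2 and d are introduced here for illustration, and it does not reproduce what any particular clustering function does internally.

```r
Y <- data.frame(x = c(-1, 1, 1, -1, 10, 12), y = c(1, 1, -1, -1, 0, 0))

# Centroid = coordinate-wise mean of the points in each cluster.
c1 <- colMeans(Y[1:4, ])   # cluster {1,2,3,4}
c2 <- colMeans(Y[5:6, ])   # cluster {5,6}

# Euclidean distance between the two centroids.
d <- sqrt(sum((c1 - c2)^2))
```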
2008 Jul 03
1
Optimal initial centroid in kmeans
Hello there. I am using kmeans from the base package to cluster my customers. As
the results of kmeans are dependent on the initial centroids, may I know:
1) how can we specify the centroids in the R function? (I don't want random
starting points)
2) how to determine the optimal (if not, a good) centroid to start with? (I
am not after the fixed seed solution as it only ensures that the
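For the first question, kmeans() accepts a matrix of starting centroids through its centers argument. The sketch below uses the iris measurements as stand-in data and picks rows 1, 51 and 101 as starting points; both choices are assumptions made purely for illustration. For the second question, one common option when centers is given as a number is nstart, which runs several random starts and keeps the best result.

```r
# Stand-in data: the four numeric iris columns (not the poster's customer data).
x <- as.matrix(iris[, 1:4])

# Hypothetical choice of starting centroids: three distinct rows of the data.
init <- x[c(1, 51, 101), ]

# Passing a matrix as 'centers' fixes the starting points (one row per cluster).
fit_fixed <- kmeans(x, centers = init)

# Alternative: let kmeans try 25 random starts and keep the best solution.
fit_multi <- kmeans(x, centers = 3, nstart = 25)
```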
2012 Aug 28
1
Don't dput() data frames?
/src/main/attrib.c contains this comment in row_names_gets():
/* This should not happen, but if a careless user dput()s a
data frame and sources the result, it will */
which svn blame says Prof Ripley placed there in r39830 with the
commit message "correct the work of dput() on the row names of a data
frame with compact representation."
Is there a problem / better way to
2012 Nov 18
1
centroid of hclust
Dear UseRs,
I want to find the centroids of clusters which I generated with hclust. Is there a way of doing that? I took the mean of the elements in each cluster to get the centroid, but I am not sure if that is right.
Thanks in advance,
eliza
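The approach described in this post, taking the mean of the elements in each cluster, can be sketched as follows. The iris data, the default linkage, and k = 3 are all assumptions chosen for illustration; the poster's data and settings are not shown.

```r
# Stand-in data: the four numeric iris columns.
x <- iris[, 1:4]

# Hierarchical clustering with hclust()'s default (complete) linkage.
hc <- hclust(dist(x))

# Cut the tree into k = 3 groups (k is an arbitrary choice here).
groups <- cutree(hc, k = 3)

# Centroid of each group = column means of its members.
centroids <- aggregate(x, by = list(cluster = groups), FUN = mean)
```

Each row of centroids holds the cluster label followed by the mean of every variable within that cluster.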
2008 Aug 07
6
multiple tapply
Hi folk,
I tried this and it works just perfectly
tapply(iris[,1],iris[5],mean)
but, how to obtain a single table from multiple variables?
In tapply, x must be an atomic object, so this code doesn't work:
tapply(iris[,1:4],iris[5],mean)
Thanx and great summer holidays
Gianandrea
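Two possible ways to get a single table of group means for several variables are sketched below. Neither is taken from the thread's replies, which are not shown here; the names means_by_species and means_matrix are introduced for illustration.

```r
# Option 1: aggregate() applies the function to every column, per group,
# and returns a data frame with one row per Species.
means_by_species <- aggregate(iris[, 1:4], by = iris[5], FUN = mean)

# Option 2: call tapply() once per column via sapply(); the result is a
# matrix with species as rows and variables as columns.
means_matrix <- sapply(iris[, 1:4], function(col) tapply(col, iris[, 5], mean))
```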
2006 Dec 06
2
Usage of apply
Dear R Users,
Are there any documents on the usage of apply, tapply, and
sapply, so that I can avoid explicit loops? I have found these
three functions quite hard to understand. Thank you
in advance.
--
Jin
2009 Feb 05
1
Does the "labpt" object in the Polygons-class represent the centroid of the polygon
Hello,
I need to calculate the centroids of some spatial polygons that I have
placed into a Polygons-class object. Is the labeling point in the
Polygons-class the centroid of the polygon?
Thank you for your help.