thr3ads.net - similar to: "Checking for duplicate rows in data frame efficiently"

Displaying 20 results from an estimated 300 matches similar to: "Checking for duplicate rows in data frame efficiently"

help need on working in subset within a dataframe

2011 Mar 22

Dear R-experts, Excuse me for an easy question, but I need help, sorry for that. For days I have been working with a large dataset, where operations are needed within a component of the dataset. Here is my question: I have a big dataset where x1:.....x1000 or so. What I need to do is to work on 4 consecutive variables to calculate a statistic and output. So far so good. There are more vector

Adding RcppFrame to RcppResultSet causes segmentation fault

2010 Mar 30

Hi, I'm a bit puzzled. I use exactly the same code as in the RcppExamples package to try adding an RcppFrame object to an RcppResultSet. When running, it gives me a segmentation fault. I'm using gcc 4.1.2 on redhat 64bit. I'm not sure if this is the cause of the problem. Any advice would be greatly appreciated. Thank you. Rob. int numCol=4; std::vector<std::string>

Generating groupings of ordered observations

2008 Jun 21

Dear List, I have a problem I'm finding it difficult to make headway with. Say I have 6 ordered observations, and I want to find all combinations of splitting these 6 ordered observations in g groups, where g = 1, ..., 6. Groups can only be formed by adjacent observations, so observations 1 and 4 can't be in a group on their own, only if 1,2,3&4 are all in the group. For example,

Problem with the fdim package

2017 Aug 10

Hi, I'm new to R but I'm interested in using the fdim package to find the fractal dimension of a dataset. I downloaded the package from https://cran.r-project.org/src/contrib/Archive/fdim/ and successfully installed it together with xgobi. However, when I try to run the first example, after df <- fdim(mydata,q=0,Alpha=0.3) I get the following error: Error in .C("pointdif",

IFELSE across a 3D array?

2004 Nov 22

Dear all, We are trying to clean multiple realizations of a pattern. Erroneous NODATA and spurious DATA occur in the realizations. As we have to do 1000 realizations for many patterns, efficiency of the code is important. We need to correct the realizations with a 'mask' pattern of DATA/NODATA. We think an ifelse should do the job. Spurious DATA will be simply removed using the

Problem running R from within a script

2008 Nov 25

Howdy Folks, I am running R version 2.7.2 (2008-08-25) on CentOS 5.2 - the standard RPM distribution. I am having a curious occurrence trying to run an R script from within a shell script. Basically, I have a small R script that processes a file. It takes two parameters - the input file, and the output file. I then have a shell script that runs the R script for each file matching a glob. R

Problem with the fdim package

2017 Aug 11

Hm, I am not an expert in this field, but trying to use an obviously old package that was removed from CRAN about 5 years ago is asking for problems. There is probably some incompatibility between the recent R version and the obsolete package. You either 1. need to install/compile an R version from 2012 and install/compile this old package for that version. 2. Or you could try to find some similar

Gcc 3.4.0 and syslinux-2.09 menu

2004 Apr 27

The menu directory won't compile with gcc 3.4.0 because gcc complains that "ebp" cannot be used as a constraint. If preserving ebp is necessary then it will have to be done in the assembler code. Gcc 3.3.3 just seems to ignore the constraint. It didn't do anything special to preserve the register. I also noticed that getnumrows returns the contents of 0x484 which is the

naming things in functions

2002 Jun 21

Hello, I'm working with R version 1.5.0 in Windows. I've written a function (SummaryMat, segment below) which uses a loop to repeatedly call another function (PercentsMat, segment below). PercentsMat creates a matrix and adds rows to it each time it is called. I use deparse(substitute(...)) to get the names of the lists sent to PercentsMat to use them as row names in the generated

reorder a matrix

2012 Apr 26

Hi, Here is part of my data,

> pr16.5
  origin dest ldco time.slot distortion
1      1    1    1         1        4.3
2      1    1    1         2        4.7
3      1    1    1         3        5.6
4      1    1    1         4        7.7
5      1    1    2         1        6.8
6      1    1    2         2        7.6
7      1    1    2         3        9.2
8      1    1

Extracting column name in apply/lapply

2006 Aug 28

Hi, any good trick to get the column names for title() aside from running lapply on the column indexes? Thanks Nick. apply(X[,numCols],2,function(x){ nunqs <- length(unique(x)) nnans <- sum(is.na(x)) info <- paste("uniques:",nunqs,"(",nunqs/n,")","NAs:",nnans,"(",nnans/n,")") hist(x,xlab=info) #

constrOptim - error: initial value not feasible

2010 Mar 17

Hello at all, working with a dataset I try to optimize a non-linear function with constraint. test<-read.csv2("C:/Users/Herb/Desktop/Opti/NORM.csv") fkt<- function(x){ a<-c(0) s<-c(0) #Minimizing square error for(j in 1:107){ s<-(test[j,2] - (x[1] * test[j,3]) - (x[2] * test[j,4]) - (x[3]*test[j,5]) - (x[4]*test[j,6]) - (x[5]*test[j,7]))^2 a<- a+s} a<-as.double(a)

plotCI error when trying to omit upper or lower bars (PR#7764)

2005 Apr 01

Full_Name: Volker Franz
Version: 2.0.1 (2004-11-15)
OS: Mac OSX / Debian
Submission from: (NULL) (84.58.8.232)

Hi there, the new version of plotCI (Version: 2.0.3 of gplots) produces errors if the upper or lower error bars should be omitted by passing NULL as an argument. Older versions of plotCI had no problem with this. Here is an example: library(gplots) means <- c(1,2,3,4,5) upperw

nlme graphics in a loop problem

2004 Jun 17

Hi, I'm fitting mixed effects models using the lme function of the nlme package. This involves using the various associated plot functions. However, when I attempt to fit a number of models using a loop, whilst the models work, the plot functions fail. As a trivial example, the following works: library(nlme) graphics.off() x<-c(1:10) y<-c(1:4,7:12)

What is wrong with this contrast matrix?

2008 Jul 24

Dear all, I am fitting a multivariate linear model with 7 response variables and 1 explanatory variable. The following matrix P: P <- cbind( c(1,-1,0,0,0,0,0), c(2,2,2,2,2,-5,-5), c(1,0,0,-1,0,0,0), c(-2,-2,0,-2,2,2,2), c(-2,1,0,1,0,0,0), c(0,-1,0,1,0,0,0)) should consist of orthogonal elements (as can be shown using %*% on the individual columns). However, when I use

Proba( Ut+2=1 / ((Ut+1==1) && (Ut==1))) ?

2005 Apr 25

Dear all, First I apologize if my question is quite simple, but I'm very new to R. I have vectors of the form v = c(1,1,-1,-1,-1,1,1,1,1,-1,1) (longer than this one of course). The elements are only +1 or -1. I would like to calculate: - the frequency of -1 occurrences after 2 consecutive -1 - the frequency of +1 occurrences after 2 consecutive +1 It looks probably something like

which on array

2004 Apr 02

Good morning! Today I found a strange, for my poor knowledge of R, behaviour of 'which' on a matrix:

HAL9000> str(cluster.matrix)
 num [1:227, 1:6300] 2 2 2 2 2 2 2 2 2 2 ...
HAL9000> class(cluster.matrix)
[1] "matrix"
HAL9000> ase <- cluster.matrix[1:5,1:5]
HAL9000> ase
     [,1] [,2] [,3] [,4] [,5]
[1,]    2    2    2    0   -2
[2,]    2    2    2    0   -2
[3,]

list of lists question

2004 Nov 24

Hello all, As a general programming question, I can't seem to figure out how to make a list of lists in R. Matrices won't work, as they have to be rectangular. I am sure that there is an easy solution but... the specific situation is this: - I have created a Tukey confidence interval table and have listed the means that are not significantly different - then using these not

Distance between a vector and matrix rows

2011 Aug 08

I am trying to find the distance between a vector and each row of a dataframe. I am using the function "distancevector" in the package "hopach" as follows:

mydata<-as.data.frame(matrix(c(1,1,1,1,0,1,1,1,1,0),nrow=2))
  V1 V2 V3 V4 V5
1  1  1  0  1  1
2  1  1  1  1  0
vec <- c(1,1,1,1,1)
d2<-distancevector(mydata,vec,d="euclid")

The Euclidean distance

creating objects of class "xtabs" "table" in R

2007 Oct 05

I have an application that would generate a cross-tabulation in array format in R. In particular, my application would give me a result similar to that of: array(5,c(2,2,2,2,2)) The above could be seen as a cross-tabulation of 5 variables with 2 levels each (could be 0 and 1). In this case, the data were such that each cell has exactly 5 observations. Now, I want the output to look like the
