thr3ads.net - similar to: "Compare two data frames"

2010 Oct 04

Help with apply

Suppose I have the following data: tmp <- data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 = sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace = TRUE)) I can run the following double loop and yield what I want in the end (rr1) as: library(statmod) Q <- 2 b <- runif(3) qq <- gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1) rr1 <- matrix(0,

nlminb and optim

2010 Sep 29

I am using both nlminb and optim to get MLEs from a likelihood function I have developed. AFAIK, the model has not previously been used in this way, so I am struggling a bit to unit test my code, since I don't have another data set to compare this kind of estimation against. The likelihood I have is (in tex below) \begin{equation} \label{eqn:marginal} L(\beta) = \prod_{s=1}^N \int
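
One way to unit test an optimizer-based MLE routine when no reference data set exists is to run the same code on a toy likelihood whose answer is known in closed form. The sketch below does that for a plain normal sample; it is not the poster's marginal likelihood (which is truncated above), and the data, starting values, and the name `negloglik` are assumptions made for illustration.

```r
# Simulated data with known generating parameters (illustrative only)
set.seed(1)
x <- rnorm(200, mean = 3, sd = 2)

# Negative log-likelihood of a normal sample; par = c(mu, log(sigma)).
# Working on log(sigma) keeps the standard deviation positive.
negloglik <- function(par, x) {
  -sum(dnorm(x, mean = par[1], sd = exp(par[2]), log = TRUE))
}

# Rough but sensible starting values
start <- c(median(x), log(mad(x)))

fit_optim  <- optim(start, negloglik, x = x, method = "BFGS")
fit_nlminb <- nlminb(start, negloglik, x = x)

# Closed-form MLEs: sample mean and the (biased) sample standard deviation
mle <- c(mean(x), log(sqrt(mean((x - mean(x))^2))))

rbind(optim = fit_optim$par, nlminb = fit_nlminb$par, closed_form = mle)
```

If both optimizers agree with the closed-form values on the toy problem, disagreement on the real likelihood is more likely to come from the likelihood code than from the optimizer calls.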

Integral of PDF

2010 Dec 02

The integral of any probability density from -Inf to Inf should equal 1, correct? I don't understand the last result below.
> integrate(function(x) dnorm(x, 0,1), -Inf, Inf)
1 with absolute error < 9.4e-05
> integrate(function(x) dnorm(x, 100,10), -Inf, Inf)
1 with absolute error < 0.00012
> integrate(function(x) dnorm(x, 500,50), -Inf, Inf)
8.410947e-11 with absolute error <
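
A likely explanation, offered as a sketch rather than a confirmed diagnosis of this post, is that the adaptive quadrature samples the infinite range at a limited set of points and can miss a peak located far from zero. Integrating over a finite window centred on the mass of the density avoids that. The window of ten standard deviations is an arbitrary choice made for this example.

```r
mu <- 500
sigma <- 50

# Infinite range: the quadrature points may all miss the peak near x = 500
wide <- integrate(dnorm, -Inf, Inf, mean = mu, sd = sigma)

# Finite window of +/- 10 standard deviations around the mean
near <- integrate(dnorm, mu - 10 * sigma, mu + 10 * sigma,
                  mean = mu, sd = sigma)

# Reference value from the distribution function
exact <- pnorm(Inf, mu, sigma) - pnorm(-Inf, mu, sigma)

c(wide = wide$value, near = near$value, exact = exact)
```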

Use apply only on non-missing values

2010 Jun 02

I have a function that I am currently using very inefficiently. The following are needed to illustrate the problem: set.seed(12345) dat <- matrix(sample(c(0,1), 110, replace = TRUE), nrow = 11, ncol=10) mis <- sample(1:110, 5) dat[mis] <- NA theta <- rnorm(11) b_vector <- runif(10, -4,4) empty <- which(is.na(t(dat))) So, I have a matrix (dat) with some values within the matrix

puzzle with integrate over infinite range

2010 Sep 21

Dear list, I'm calculating the integral of a Gaussian function from 0 to infinity. I understand from ?integrate that it's usually better to specify Inf explicitly as a limit rather than an arbitrary large number, as in this case integrate() performs a trick to do the integration better. However, I do not understand the following: if I shift the Gauss function by some amount the integral

Proper use of grep

2010 Jul 15

I just need to confirm something with pattern matching folks. I have a factor with the following levels in a very large data set: > levels(all$Classical.Statistic) [1] "" "AB;ABD" "CollapsedSteps" "CR_P" "CR_Prop;CR_P;AB" [6] "NMK"

Basic vector operations was: Function to approximate complex integral

2006 Apr 19

Dear List, I apologize for the multiple postings. After being in the weeds on this problem for a while I think my original post may have been a little cryptic. I think I can be clearer. Essentially, I need the following:
a <- c(2,3)
b <- c(4,5,6)
(2*4) + (2*5) + (2*6) + (3*4) + (3*5) + (3*6)
But I do not know of a built-in function that would do this. Any suggestions? -----Original
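
The sum written out in this post is the sum of all pairwise products of `a` and `b`. A minimal sketch of two base R ways to get it follows; the variable names `total_outer` and `total_product` are introduced here for illustration.

```r
a <- c(2, 3)
b <- c(4, 5, 6)

# outer() builds the 2 x 3 table of pairwise products; sum() adds them up
total_outer <- sum(outer(a, b))

# Algebraically the same quantity factorizes as sum(a) * sum(b)
total_product <- sum(a) * sum(b)

c(total_outer, total_product)
```

The factorized form avoids building the intermediate matrix, which matters when the vectors are long.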

Error handling with frozen RCurl function calls + Identification of frozen R processes

2011 Jan 26

Dear list, I'm tackling an empirical research problem that requires me to address a whole bunch of conceptual and technical details at the same time, which leaves little time for the nitty-gritty details of the "components" involved. Having said this, I lack the time at the moment to dive deeply into parallel computing and HTTP requests via RCurl and I hope you can help me

Comparing rows in a dataframe

2004 Aug 06

Hello, I have a longitudinal dataframe organized in the long format and would like to make comparisons between successive rows if certain conditions apply. Specifically, I have four variables of interest: grade, score, year, and schid, associated with each school, with 3 measurements per school per grade. The rows are therefore temporally ordered and each school occupies multiple rows. For example,
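
A comparison between successive rows within each school and grade can be sketched with `ave()` and `diff()` in base R. The data below are invented for illustration, since the post's example is truncated; only the column names grade, score, year, and schid come from the post, and `change` and `improved` are hypothetical helper columns.

```r
# Hypothetical long-format data: 2 schools, 1 grade, 3 years each
dat <- data.frame(
  schid = rep(c(1, 2), each = 3),
  grade = 3,
  year  = rep(2001:2003, times = 2),
  score = c(10, 12, 11, 20, 20, 25)
)

# Make sure rows are in temporal order within school and grade
dat <- dat[order(dat$schid, dat$grade, dat$year), ]

# Difference from the previous row in the same school/grade group;
# the first row of each group has no predecessor, so it gets NA
dat$change <- ave(dat$score, dat$schid, dat$grade,
                  FUN = function(s) c(NA, diff(s)))

# Example condition applied to successive rows
dat$improved <- dat$change > 0

dat
```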

By() with method = spearman

2007 Sep 19

I have a data set where I want the correlations between 2 variables conditional on a student's grade level. This code works just fine: by(tmp[,c('mtsc07', 'DCBASmathscoreSPRING')], tmp$Grade, cor, use='complete', method='pearson') However, this generates an error: by(tmp[,c('mtsc07', 'DCBASmathscoreSPRING')], tmp$Grade, cor, use='complete',

rowSums()

2008 Sep 24

Say I have the following data:
testDat <- data.frame(A = c(1,NA,3), B = c(NA, NA, 3))
> testDat
   A  B
1  1 NA
2 NA NA
3  3  3
rowSums() with na.rm=TRUE generates the following, which is not desired:
> rowSums(testDat[, c('A', 'B')], na.rm=T)
[1] 1 0 6
rowSums() with na.rm=F generates the following, which is also not desired:
> rowSums(testDat[, c('A',
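
The post is cut off before it states the desired result. Assuming the goal is a row sum that ignores missing values but returns NA when every value in the row is missing (that is, 1, NA, 6), a minimal base R sketch is shown below. The names `cols`, `rs`, and `all_missing` are introduced here for illustration.

```r
testDat <- data.frame(A = c(1, NA, 3), B = c(NA, NA, 3))
cols <- c("A", "B")

# Sum while ignoring NAs; all-NA rows come out as 0 at this point
rs <- rowSums(testDat[, cols], na.rm = TRUE)

# Flag rows with no observed values and reset them to NA
all_missing <- rowSums(!is.na(testDat[, cols])) == 0
rs[all_missing] <- NA

unname(rs)
```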

--compare-dest; I'm missing the boat

2009 Jan 15

I must be seriously misunderstanding the man page coverage of --compare-dest. My take was that if a file in compare-dest=dir matches a file in SOURCE/ then it won't be transferred to DEST/. I tried this test (d1 has single files and 2 subdirs with files):
cp -a d1 d1a
mkdir d2
rsync -avv --compare-dest="./d1a" d1/ d2/
d1a is a carbon copy of d1 but still every last file in

Commutative Matrix Multiplication

2012 Aug 14

Friends, I'm not seeing why the following occurs:
> T1 <- (A1 - A2) %*% D
> T2 <- (A1 %*% D) - (A2 %*% D)
> identical(T1, T2)
[1] FALSE
Harold
> dput(A1)
new("dsCMatrix" , i = c(0L, 1L, 2L, 3L, 0L, 1L, 4L, 2L, 3L, 5L) , p = c(0L, 1L, 2L, 3L, 4L, 7L, 10L) , Dim = c(6L, 6L) , Dimnames = list(NULL, NULL) , x = c(5, 5, 5, 5, 5, 5, 10, 5, 5, 10)
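
The identity being tested is distributivity of matrix multiplication over subtraction, which holds exactly in algebra but only up to rounding in floating-point arithmetic. The sketch below uses dense random base R matrices as stand-ins, because the post's sparse `dsCMatrix` objects are truncated; whether `identical()` returns FALSE for any particular input depends on rounding, so only the tolerance-based check is relied on.

```r
# Stand-in dense matrices (assumption: not the poster's sparse data)
set.seed(42)
A1 <- matrix(rnorm(36), 6, 6)
A2 <- matrix(rnorm(36), 6, 6)
D  <- matrix(rnorm(36), 6, 6)

T1 <- (A1 - A2) %*% D
T2 <- (A1 %*% D) - (A2 %*% D)

# Bitwise comparison: sensitive to the order of floating-point operations
exact_match <- identical(T1, T2)

# Comparison up to a numerical tolerance: the appropriate test here
close_match <- isTRUE(all.equal(T1, T2))

# Size of the largest discrepancy, typically near machine precision
max_diff <- max(abs(T1 - T2))

list(exact_match = exact_match, close_match = close_match, max_diff = max_diff)
```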

Adding an Sweave Vignette to a package

2008 Jan 21

I'm finalizing development of a package that will include a vignette. Without the vignette, the package builds fine with no warnings and is ready for distribution. Now, I am following the directions for developing vignettes in "Sweave, Part II: Package Vignettes" by Friedrich Leisch. I am using a Windows XP machine (other session info below). Here is what I have done. 1) I add the

Behavior of apply()

2011 Dec 08

Suppose I have the following matrix
> class(cov_50)
[1] "matrix"
> cov_50
          [,1]     [,2]
[1,] 0.3201992 2.308084
[2,] 6.7312928 5.719641
I then use the following function via apply and get the desired output, a list
signif <- function(x) which(abs(x) > 1.96)
apply(cov_50, 1, signif)
> apply(cov_50, 1, signif)
[[1]]
[1] 2
[[2]]
[1] 1 2
However, I can't

Print methods

2009 Nov 09

I've built a package that contains only two functions for a test run. They are:
g <- function(x){
  x <- x^2
  class(x) <- "foo"
  x
}
print.foo <- function(x, ...){
  cat("This is a test:\n")
  cat(x, "\n")
  invisible(x)
}
Simply testing these functions in the R workspace prior to a build yields:
> g(1:5)
This is a test:
1 4 9 16 25
Now, I

Data Extraction - benchmark()

2012 Nov 22

Hi Berend, I see you are one of the contributors to the rbenchmark package. I am sorry that I am bothering you again. I have tried to run your code (slightly tweaked) involving the benchmark function, and I am getting the following error message. What am I doing wrong? Error in benchmark(d1 <- s1(df), d2 <- s2(df), d3 <- s3(df), d4 <- s4(df), : could not find function

Recode factors

2008 Mar 27

I know this comes up, but I didn't see my exact issue in the archives. I have variables in a dataframe that need to be recoded. Here is what I'm dealing with. I have a factor called aa:
> class(aa)
[1] "factor"
> table(aa)
aa * 0 1 2 3 A B C D L N T 0 0 1908 725 2089 0 0 67 0 0 2 1 6
I need to recode

Spectral Decomposition

2007 Jun 29

All of my resources for numerical analysis show that the spectral decomposition is A = CBC', where C holds the eigenvectors and B is a diagonal matrix of eigenvalues. Now, using the eigen function in R # Original matrix aa <- matrix(c(1,-1,-1,1), ncol=2) ss <- eigen(aa) # This result yields back the original matrix according to the formula above ss$vectors %*% diag(ss$values) %*%
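
The formula A = CBC' can be checked numerically for the matrix in this post. The sketch below assumes the transpose of the eigenvector matrix as the final factor, which is valid because `aa` is symmetric and `eigen()` then returns orthonormal eigenvectors. It is an independent illustration, not a completion of the truncated line above, and the names `recon` and `orthonormal` are introduced here.

```r
# Original symmetric matrix from the post
aa <- matrix(c(1, -1, -1, 1), ncol = 2)
ss <- eigen(aa)

# Rebuild A as C B C' with C = eigenvectors, B = diag(eigenvalues)
recon <- ss$vectors %*% diag(ss$values) %*% t(ss$vectors)

# Compare up to numerical tolerance rather than bit for bit
rebuilt_ok <- isTRUE(all.equal(recon, aa))

# For a symmetric matrix the eigenvectors are orthonormal: C'C = I
orthonormal <- isTRUE(all.equal(crossprod(ss$vectors), diag(2)))

c(rebuilt_ok = rebuilt_ok, orthonormal = orthonormal)
```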

Conditional Density Plot from different data

2010 Dec 09

I'm not certain I am using the lattice plot correctly here. Below is reproducible code. Suppose I have two data frames, such as: set.seed(1234) datA <- data.frame(condition = gl(3, 100), scores = c(rnorm(100), rnorm(100, 1,1), rnorm(100, 2,1))) datB <- data.frame(condition = gl(3, 1000), scores = c(rnorm(1000, 3,1), rnorm(1000, 4,1), rnorm(1000, 5,1))) I would like to plot the
