thr3ads.net - similar to: "removing duplicate rows"

Displaying 20 results from an estimated 5000 matches similar to: "removing duplicate rows"

splitting character strings and converting to numeric vectors

2010 May 06


This seemingly should be quite simple but I can't solve it: I have a long character vector of geographic data (data frame column named "XY") whose elements vary in length (from 11 to 14 chars). Each element is structured as a set of digits, then an underscore, then more digits, e.g.:

    > data.frame(head(as.character(XY)))
      head.as.character.XY..
    1         -448623_854854
    2
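One possible approach to the question above, sketched in base R: split each element on the underscore with `strsplit()` and convert the pieces with `as.numeric()`. Only the first value of `XY` below comes from the post; the other two are hypothetical stand-ins, and the column names `X` and `Y` are assumptions.

```r
# Only the first element appears in the post; the rest are made-up examples.
XY <- c("-448623_854854", "-44862_85485", "-4486_854854")

# Split on the literal underscore, giving a list of two-element character vectors.
parts <- strsplit(XY, "_", fixed = TRUE)

# Convert each pair to numeric and stack the pairs into a two-column matrix.
coords <- do.call(rbind, lapply(parts, as.numeric))
colnames(coords) <- c("X", "Y")
coords
```

Because the split is on the delimiter rather than on character positions, the varying element lengths do not matter.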

problem selecting rows meeting a criterion

2009 Aug 10


When I try to select only those rows from the following data frame, called "data", in which X > Y

       X Y       V3
    2  2 1 8.062258
    3  3 1 2.236068
    4  4 1 6.324555
    5  5 1 5.000000
    6  1 2 8.062258
    8  3 2 9.486833
    9  4 2 2.236068
    10 5 2 5.656854
    11 1 3 2.236068
    12 2 3 9.486833
    14 4 3 8.062258
    15 5 3 5.099020
    16 1 4 6.324555
    17 2 4 2.236068
    18 3 4 8.062258
    20 5 4 5.385165
    21 1 5 5.000000
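The snippet is cut off before the failing command, so the following is only a sketch of the usual base R idiom for this kind of selection, using four rows copied from the post. The object name `selected` is hypothetical.

```r
# Four rows copied from the data frame in the post.
data <- data.frame(
  X  = c(2, 3, 1, 3),
  Y  = c(1, 1, 2, 2),
  V3 = c(8.062258, 2.236068, 8.062258, 9.486833)
)

# Check the column types first; a comparison on factor columns would not
# behave like a numeric comparison (whether that was the issue in the post
# is an assumption, since the snippet is truncated).
str(data)

# Keep the rows where X exceeds Y; the trailing comma keeps all columns.
selected <- data[data$X > data$Y, ]
selected
```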

quantiles on rows of a matrix

2010 Jul 07


I'm trying to obtain the mean of the middle 95% of the values from each row of a matrix (that is, the highest and lowest 2.5% of values in each row are removed before calculating the mean). I am having all sorts of problems with this; for example the command: apply(matrix1,1,function(x) quantile(c(.05,.90),na.rm=T)) returns the exact same quantile values for each row, which is clearly
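In the command quoted above, `quantile()` receives `c(.05,.90)` as its data argument and never sees `x`, which would explain identical values for every row. A minimal sketch of the intended calculation follows; `matrix1` is filled with simulated data and the helper name `mid95_mean` is hypothetical.

```r
set.seed(1)
matrix1 <- matrix(rnorm(400), nrow = 4)  # hypothetical data: 4 rows of 100 values

# Mean of the values lying between the 2.5% and 97.5% quantiles of one row.
mid95_mean <- function(x) {
  q <- quantile(x, probs = c(0.025, 0.975), na.rm = TRUE)
  keep <- !is.na(x) & x >= q[1] & x <= q[2]
  mean(x[keep])
}

# Apply the helper to each row (MARGIN = 1).
row_means <- apply(matrix1, 1, mid95_mean)
row_means
```

Base R's `mean()` also accepts a `trim` argument (for example `mean(x, trim = 0.025)`), which drops a fraction of observations from each end. It is similar but not identical, since it trims by count rather than by a quantile cut-off.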

linear regression on groups of consecutive rows of a matrix

2009 Nov 24


I want to perform linear regression on groups of consecutive rows--say 5 to 10 such--of two matrices. There are many such potential groups because the matrices have thousands of rows. The matrices are both of the form:

    > shp[1:5,16:20]
          SL495B SL004C SL005C SL005A SL017A
    -2649   1.06   0.56     NA     NA     NA
    -2648   0.97   0.57     NA     NA     NA
    -2647   0.46   0.30     NA     NA

R's database capabilities

2009 Aug 04


I admit that I've not done a thorough search on this topic, but from the several instructional manuals and/or tutorials I've looked at, I don't see any mention of relational database capabilities in R? Have I missed something, and if so, can someone point me in the right direction to get started? Thanks! Jim Bouldin, PhD Research Ecologist Department of Plant Sciences, UC Davis

error message: .Random.seed is not an integer vector but of type 'list'

2009 Jul 23


I'm trying to run this simple random sample procedure and keep getting the error message shown. I don't understand this; I've designated x as a numeric vector, so what is going on here? Thanks.

    > x = as.vector(c(1:12));x
     [1]  1  2  3  4  5  6  7  8  9 10 11 12
    > mode(x)
    [1] "numeric"
    > sample(x, 3)
    Error in sample(x, 3) : .Random.seed is not an integer vector
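The snippet does not show how the problem was resolved. One common cause of this message, offered here as an assumption rather than a diagnosis, is an object named `.Random.seed` in the workspace that is not a valid integer seed (for example one restored from a saved workspace). A sketch of clearing it and sampling again:

```r
# Remove any existing .Random.seed from the global environment so that
# R creates a fresh, valid one on the next random-number call.
if (exists(".Random.seed", envir = .GlobalEnv, inherits = FALSE)) {
  rm(list = ".Random.seed", envir = .GlobalEnv)
}

set.seed(42)       # optional: makes the draw reproducible
x <- 1:12
s <- sample(x, 3)  # three values drawn without replacement
s
```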

unloading loaded packages

2009 Apr 30


I can't seem to find info on how to unload packages that have been loaded. My goal in doing so is to gain access to functions that have been masked out by those packages. Or is there another way to do so? Thanks in advance. Jim Bouldin, PhD Research Ecologist Department of Plant Sciences, UC Davis Davis CA, 95616 530-554-1740
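Two base R mechanisms relate to this question: `detach()` removes a package from the search path, and the `::` operator reaches a function in a specific namespace even when it is masked. The sketch below uses `splines` (shipped with R) purely as a stand-in for "some loaded package", and the masking definition of `sd` is hypothetical.

```r
# Attach a package, then confirm it is on the search path.
library(splines)
"package:splines" %in% search()

# Detach it again; the name is passed as a character string.
detach("package:splines", character.only = TRUE)
still_attached <- "package:splines" %in% search()

# A masked function stays reachable through its namespace with `::`.
sd <- function(x) "masked"             # hypothetical definition that masks stats::sd
original_sd <- stats::sd(c(1, 2, 3))   # calls the original function
rm(sd)                                 # remove the masking definition
original_sd
```

Using `pkg::fun()` is often simpler than detaching, since it leaves the rest of the session untouched.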

nls error message

2009 Dec 28


When I try to run the following non-linear regression with variables index1 and prl3:

    > beta = 4
    > nls(index1~beta*(1/prl3),start = list(beta = 4))

I get this error message:

    Error in nls(index1 ~ beta * (1/prl3), start = list(beta = 4)) :
      REAL() can only be applied to a 'numeric', not a 'logical'

I've got no clue as to the REAL() to which this is referring. Any

calculations on columns with partially matching names

2010 Jan 03


Is there a command for partial matching of character strings? Specifically, I'd like to be able to calculate the mean of the values in any columns in a data frame or matrix that have identity in part of their column names. For example, columns labeled "mpw06a" and "mpw06b" match on the first five characters; their mean would be taken whereas any columns beginning with
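A sketch of one way to do this in base R, assuming (as in the post's example) that the shared part of each name is its first five characters. The data frame `df`, the third column `mpw07a`, and all values are hypothetical.

```r
# Hypothetical data; only the names "mpw06a" and "mpw06b" come from the post.
df <- data.frame(
  mpw06a = c(1, 2, 3),
  mpw06b = c(3, 4, 5),
  mpw07a = c(10, 20, 30)
)

# Group the column names by their first five characters.
prefix <- substr(names(df), 1, 5)
groups <- split(names(df), prefix)

# For each group, average its columns row by row.
group_means <- sapply(groups, function(cols) {
  rowMeans(df[, cols, drop = FALSE])
})
group_means
```

For a pattern that is not a fixed-width prefix, `grep()` on `names(df)` with a regular expression such as `"^mpw06"` selects matching columns in the same spirit.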

no html help upon upgrading to 2.10

2009 Dec 04


I just upgraded from 2.8.1 to 2.10 on Windows Vista. BIG MISTAKE apparently because now when I type:

    > help(functionname)

or ?functionname I get only a small text window giving some very basic info on the topic, e.g.:

    base-package              package:base              R Documentation

    The R Base Package

    Description:
         Base R functions

    Details:
         This package contains the
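The display format of help pages is controlled by the `help_type` option. A minimal sketch, assuming an R version that supports this option and that HTML help is what the poster wants back:

```r
# Make HTML the default help format for the rest of the session.
options(help_type = "html")
getOption("help_type")

# Or request it for a single call (opens a browser when run interactively):
# help("mean", help_type = "html")
```

Placing the `options()` call in a startup file such as `.Rprofile` would make the setting persist across sessions.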

NAs and row/column calculations

2010 Mar 11


I continue to have great frustrations with NA values--in particular making summary calculations on rows or cols of a matrix containing them. For example, why does:

    > a = matrix(1:30,nrow=5)
    > is.na(a[c(1:2),c(3:4)]);a
         [,1] [,2] [,3] [,4] [,5] [,6]
    [1,]    1    6   NA   NA   21   26
    [2,]    2    7   NA   NA   22   27
    [3,]    3    8   13   18   23   28
    [4,]    4    9   14   19   24   29
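The question is truncated, but the matrix shown can be rebuilt and summarized while skipping missing values. The sketch below sets the same four cells to `NA` by direct assignment (an assumption about how the poster created them) and uses the `na.rm = TRUE` argument of the row and column summary functions.

```r
# Rebuild the matrix from the post and blank out rows 1-2 of columns 3-4.
a <- matrix(1:30, nrow = 5)
a[1:2, 3:4] <- NA

# Without na.rm, any row or column containing NA yields NA.
rowSums(a)

# With na.rm = TRUE the missing cells are dropped before summarizing.
row_totals <- rowSums(a, na.rm = TRUE)
row_avgs   <- rowMeans(a, na.rm = TRUE)
col_avgs   <- colMeans(a, na.rm = TRUE)
row_avgs
```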

consecutive numbering of elements in a matrix

2009 Nov 21


Within a very large matrix composed of a mix of values and NAs, e.g., matrix A:

         [,1] [,2] [,3]
    [1,]    1   NA   NA
    [2,]    3   NA   NA
    [3,]    3   10   17
    [4,]    4   12   18
    [5,]    6   16   19
    [6,]    6   22   20
    [7,]    5   11   NA

I need to be able to consecutively number, in new columns, the non-NA values within each column (i.e. A[1,1], A[3,2] and A[3,3] would all be set to one, and
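A sketch of one way to produce such numbering for the matrix shown, using `apply()` over columns. The helper name `number_non_na` and the result name `num` are hypothetical.

```r
# Matrix A as given in the post.
A <- cbind(
  c(1, 3, 3, 4, 6, 6, 5),
  c(NA, NA, 10, 12, 16, 22, 11),
  c(NA, NA, 17, 18, 19, 20, NA)
)

# Number the non-NA entries of one column 1, 2, 3, ...; NA positions stay NA.
number_non_na <- function(col) {
  out <- rep(NA_integer_, length(col))
  ok <- !is.na(col)
  out[ok] <- seq_len(sum(ok))
  out
}

# Apply column by column (MARGIN = 2) and bind the result next to A.
num <- apply(A, 2, number_non_na)
cbind(A, num)
```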

Random # generator accuracy

2009 Jul 23


Dan Nordlund wrote: "It would be necessary to see the code for your 'brief test' before anyone could meaningfully comment on your results. But your results for a single test could have been a valid "random" result." I've re-created what I did below. The problem appears to be with the weighting process: the unweighted sample came out much closer to the actual

error message: .Random.seed is not an integer vector but

2009 Jul 23


Thanks much Ted. I actually had just tried what you suggest here before you posted, and resolved the problem. Thanks also for the other tips. I wrote x = as.vector(c(1:12)) because I thought that the mode of x might be the problem, the error message pointing to .Random.seed notwithstanding. On a related note, I did a brief test a couple weeks back where I ran a million random samples of 3 from

converting object elements to variable names and making subsequent assignments thereto

2011 Sep 23


This has got to be incredibly simple but I nevertheless can't figure it out as I am apparently brain dead. I just want to convert the elements of a character vector to variable names, so as to then assign formulas to them, e.g: z = c("model1","model2"); I want to assign formulas, such as lm(y~x[,1]) and lm(y~x[,2]), to the variables "model1" and
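Base R's `assign()` creates a variable from a name held in a character string, and `get()` retrieves it. The sketch below uses simulated `x` and `y` (hypothetical data) and also shows a named list, which is an alternative the post does not mention.

```r
set.seed(1)
x <- matrix(rnorm(40), ncol = 2)  # hypothetical predictors
y <- rnorm(20)                    # hypothetical response
z <- c("model1", "model2")

# Option 1: create one variable per name with assign().
for (i in seq_along(z)) {
  assign(z[i], lm(y ~ x[, i]))
}
fit1 <- get("model1")             # look a variable up by its name

# Option 2: keep the fits together in a list named after z.
models <- setNames(lapply(seq_along(z), function(i) lm(y ~ x[, i])), z)
coef(models$model2)
```

A named list is usually easier to loop over later than a set of separately named variables.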

object names from character strings

2010 Dec 26


I realize this is probably pretty basic but I can't figure it out. I'm looping through an array, doing various calculations and producing a resulting data frame in each loop iteration. I need to give each data frame a different name. Although I can easily create a new character string for writing each frame to an output file, I cannot figure out how to convert such strings to
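A sketch of collecting per-iteration data frames under generated names, assuming a loop over the third dimension of an array. The array `arr`, the name prefix `frame`, and the calculation are all hypothetical placeholders.

```r
# Hypothetical 2 x 3 x 4 array to loop over.
arr <- array(1:24, dim = c(2, 3, 4))

results <- list()
for (k in seq_len(dim(arr)[3])) {
  nm <- paste0("frame", k)                 # build the name as a string
  results[[nm]] <- data.frame(             # store the frame under that name
    slice = k,
    total = sum(arr[, , k])
  )
}

names(results)
results[["frame1"]]
```

The same string `nm` can also be reused when building an output file name for each frame.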

functions on rows or columns of two (or more) arrays

2011 Aug 04


I realize this should be simple, but even after reading over the several help pages several times, I still cannot decide between the myriad "apply" functions to address it. I simply want to apply a function to all the rows (or columns) of the same index from two (or more) identically sized arrays (or data frames). For example:

    > a=matrix(1:50,nrow=10)
    >
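One way to pair up rows of two matrices is to iterate over the row index with `sapply()`. In the sketch below, `a` is taken from the post, while `b` and the choice of `cor()` as the function are hypothetical.

```r
a <- matrix(1:50, nrow = 10)   # from the post
b <- matrix(50:1, nrow = 10)   # hypothetical second matrix of the same size

# For each row index i, apply a two-argument function to row i of a and row i of b.
row_cor <- sapply(seq_len(nrow(a)), function(i) cor(a[i, ], b[i, ]))
row_cor
```

Swapping `a[i, ]` for `a[, i]` and `nrow` for `ncol` gives the column-wise version.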

extract the p value

2011 Oct 24


OK, what is the trick to extracting the overall p value from an lm object? It shows up in the summary(lm(model)) output but I can't seem to extract it:

    > test2 = apply(aa, 1, function(x) summary(lm(x[,1] ~ 0 + x[,3] + x[,6])))
    > test2[[1]]
    Call:
    lm(formula = x[, 1] ~ 0 + x[, 3] + x[, 6])
    [omitted summary output]
    F-statistic: 40.94 on 2 and 7 DF,  p-value: 0.0001371

It does not seem
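The summary object stores the F statistic and its degrees of freedom in the `fstatistic` component, but not the p-value itself, which is computed when the summary is printed. A sketch of recomputing it with `pf()`, using simulated data (hypothetical) in place of the poster's `aa`:

```r
set.seed(1)
x <- rnorm(30)            # hypothetical predictor
y <- 2 * x + rnorm(30)    # hypothetical response
fit <- lm(y ~ x)

# fstatistic is a named vector: value, numdf, dendf.
f <- summary(fit)$fstatistic

# Upper-tail probability of the F distribution gives the overall p-value.
p <- pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)
p_value <- unname(p)
p_value
```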

subsetting from a vector or matrix

2009 Sep 24


I realize this should be simple but I'm having trouble subsetting vectors and matrices, for example extracting all values meeting a certain criterion, from a vector. Cannot seem to figure out the correct syntax and help page not very helpful. Or should I be using some other function than subset. Thanks for any help. Jim Bouldin
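A short sketch of logical indexing, which is the usual base R alternative to `subset()` for vectors and matrices. All values and object names below are hypothetical.

```r
v <- c(3, 8, 1, 12, 7)

# A logical condition inside [ ] keeps the elements where it is TRUE.
big <- v[v > 5]

# which() returns the positions instead of the values.
idx <- which(v > 5)

# For a matrix, put the condition in the row slot and leave the column slot empty.
m <- matrix(1:12, nrow = 4)
rows <- m[m[, 1] > 2, , drop = FALSE]
rows
```

`drop = FALSE` keeps the result a matrix even when only one row matches.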

nls error regarding numerics vs logicals

2010 Jul 09


I am trying to perform an nls for a valid negative exponential function: zz=nls(y~constant+a.est*2.7183^(b.est*x),start=list(constant=4.0,a.est=-4,b.est = -.005),trace=T) and am getting a number of different error messages, the most problematic of which is "Error in nls(ring.area ~ constant + a.est * 2.7183^(b.est * ba.beg), start = list(constant = 4, : REAL() can only be applied to a
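The cause of the error is not visible in the truncated snippet. As a sketch only, the same model form can be fitted to simulated data (hypothetical, generated from the post's start values), writing the exponential with `exp()` rather than `2.7183^`. Checking that every variable is numeric beforehand is an assumption about where to look, since an all-`NA` vector has mode logical.

```r
set.seed(1)

# Simulated data; the generating parameters mirror the start values in the post.
x <- seq(0, 1000, by = 10)
y <- 4 - 4 * exp(-0.005 * x) + rnorm(length(x), sd = 0.05)

# Confirm both variables are numeric before fitting.
stopifnot(is.numeric(x), is.numeric(y))

# Negative exponential model with the parameter names used in the post.
zz <- nls(y ~ constant + a.est * exp(b.est * x),
          start = list(constant = 4.0, a.est = -4, b.est = -0.005))
coef(zz)
```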
