thr3ads.net - similar to: "Logical subset of the columns in a dataframe"

Probability of exceedance function question

2006 Oct 08

I'm trying to calculate a cumulative area distribution (graph) of drainage areas. This is defined as P(A > A*). Simple in principle. I can do this in Excel with "COUNTIF", which will count the number of cells in the row "area" that have area A, then determine, for each cell in the row "area", how many cells exceed that area, then dividing that number by the total
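A minimal R sketch of the exceedance calculation described in this post, assuming the areas sit in a plain numeric vector. The vector `area` and its values are hypothetical, made up for illustration; for each value it computes the fraction of entries that are strictly larger.

```r
# Hypothetical drainage areas (illustrative values only)
area <- c(1, 2, 2, 5, 10)

# P(A > A*): for each area, the share of all areas strictly greater than it
p_exceed <- sapply(area, function(a) mean(area > a))

# Pair each area with its exceedance probability
result <- data.frame(area = area, p_exceed = p_exceed)
print(result)
```

The same numbers can be obtained with `1 - ecdf(area)(area)`, since `ecdf` gives the fraction of values less than or equal to each point.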

Combining logical operators to extract columns from a dataframe

2008 Mar 25

Hi R-helpers, I have a dataframe (called data) with 100 columns, the columns of which are named with integers ranging from 1900 to 1999. I wish to extract those columns whose names are >=1950 and <=1970. I tried: data2<-subset(data,select=(names(data)>=1950 & names(data)<=1970)) but that doesn't work. Any ideas? Thanks! Mark
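One possible approach to the question above, sketched under the assumption that the column names are exactly the strings "1900" to "1999". Column names are character, so the sketch converts them to integers before comparing. The example data frame is hypothetical; if the real names carry a prefix such as "X1950", the prefix would need to be stripped first.

```r
# Hypothetical data frame with 100 columns named "1900" ... "1999"
data <- as.data.frame(matrix(seq_len(200), nrow = 2))
names(data) <- as.character(1900:1999)

# Convert the names to integers, then build one logical mask
yrs <- as.integer(names(data))
data2 <- data[, yrs >= 1950 & yrs <= 1970, drop = FALSE]

print(names(data2))
```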

MDS with ranking data (and transformation)

2009 Feb 15

Dear Sirs and Madams :-) I am trying to teach myself multidimensional scaling. To that effect I have collected a survey asking people to rank 10 philosophers and politicians according to their preference. I have collected 61 answers. The data is organized in ten columns and 61 rows. The columns are "choice_1", "choice_2", "choice_3" etc. The cells are the name of the

Ordering Duplicates for Selection

2010 Oct 05

Hi all, I've found a lot of helpful info regarding identifying and deleting duplicates but I'd like to do something a little different - I'd like to identify the duplicate values but instead of deletion, label them with a value. I am working with historical data regarding school courses: Student Number Course Final Mark

My surprising experience in trying out REvolution's R

2009 Apr 21

I care a lot about R's speed. So I decided to give REvolution's R (http://revolution-computing.com/) a try, which bills itself as an optimized R. Note that I used the free version. My machine is an Intel Core 2 Duo under Windows XP Professional. The code I run is at the end of this post. First, the regular R 1.9. It takes 2 minutes and 6 seconds, CPU usage 50% Next, REvolution's R.

Summing Select Columns of a Data Frame?

2009 Jan 20

Hi, I would like to operate on certain columns in a dataframe, but not others. My data looks like this: x1 x2 x3 1 2 3 4 5 6 7 8 9 I want to create a new column named x4 that is the sum of x1 and x2, but NOT x3. I looked at colSums and apply, but those functions seem to use all the columns in a dataframe. How do I only use select columns? If it helps, in Stata this would be gen x4
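A short sketch of one way to sum only selected columns, using the three-row example from the post. Selecting the columns by name before calling `rowSums` keeps x3 out of the total.

```r
# The example data from the post
df <- data.frame(x1 = c(1, 4, 7), x2 = c(2, 5, 8), x3 = c(3, 6, 9))

# Sum x1 and x2 row by row, leaving x3 out
df$x4 <- rowSums(df[, c("x1", "x2")])

print(df)
```

For just two columns, `df$x4 <- df$x1 + df$x2` does the same thing; the `rowSums` form scales more easily to a longer list of column names.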

DataFrame help

2009 Jul 16

Alright, so I am trying to write my own function to calculate column sums in a matrix. I want the result as a single list with the values. So far I have: csum<-function(m) { a = data.frame(m) s = lapply(a,sum) return(s) } What is the easiest way to have it return in a format such as [1] 6 15 24 ? Thanks.
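A possible variant of the poster's `csum` that returns a plain vector rather than a list: `sapply` simplifies the per-column results, and `unname` drops the column labels. The 3 x 3 test matrix is an assumption chosen to match the `6 15 24` output mentioned in the post.

```r
# Column sums returned as an unnamed vector instead of a list
csum <- function(m) {
  a <- data.frame(m)
  unname(sapply(a, sum))
}

m <- matrix(1:9, nrow = 3)  # assumed input that yields 6 15 24
print(csum(m))
```

The built-in `colSums(m)` gives the same values directly, without the conversion to a data frame.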

logical masking of a matrix converts it to a vector

2009 Dec 04

One problem I've been having is the special case in which only one row/column remains and the variable gets converted into a vector when entries are removed by logical masking. This is a problem because subsequent code may rely on matrix operations (apply, colSums, dim, etc.). For example: > a <- matrix(c(1, 2, 3, 4), nrow = 2) > a [,1] [,2] [1,] 1 3 [2,] 2 4 >
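The conversion described here can usually be avoided with the `drop = FALSE` argument of `[`. The sketch below reuses the 2 x 2 matrix from the post; the mask `keep` is a hypothetical example that leaves a single row.

```r
a <- matrix(c(1, 2, 3, 4), nrow = 2)
keep <- c(TRUE, FALSE)  # hypothetical mask leaving one row

# Default indexing drops the dimension and returns a plain vector
v <- a[keep, ]

# drop = FALSE keeps the result as a 1 x 2 matrix
b <- a[keep, , drop = FALSE]

print(dim(v))  # NULL, because v is a vector
print(dim(b))
```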

looking for a faster way to compare two columns of a matrix

2010 Sep 23

Please consider this matrix: x <- structure(c(5, 4, 3, 2, 1, 6, 3, 2, 1, 0, 3, 2, 1, 0, 0, 2, 1, 1, 0, 0, 2, 0, 0, 0, 0), .Dim = c(5L, 5L)) For each pair of columns, I want to calculate the proportion of entries different than 0 in column j (i > j) that have lower values than the entries in the same row in column i: x[, 1:2] sum((x[,1] > x[,2]) & (x[,2] > 0))/sum(x[,2] > 0)

Deleting columns based on the number of non-blank observations

2009 Jan 18

Hello, I have a dataset (named "x") with many (966) columns. What I would like to do is delete any columns that do not have at least 375 non-blank observations (i.e., the cells have some value in them besides NA). How can I do this? I have come up with the following code to _count_ the non-blank observations in each column, but how would I adapt this code to _delete_ columns from the
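One way to go from counting to deleting, sketched on a small hypothetical data frame. The threshold `min_obs` is set to 3 here so the toy example is meaningful; the post's value would be 375.

```r
# Hypothetical data standing in for the 966-column dataset "x"
x <- data.frame(a = c(1, NA, 3, 4),
                b = c(NA, NA, NA, 4),
                c = 1:4)

min_obs <- 3  # the post uses 375

# Count non-NA cells per column, then keep only columns meeting the threshold
n_obs <- colSums(!is.na(x))
x_kept <- x[, n_obs >= min_obs, drop = FALSE]

print(names(x_kept))
```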

Hello!

2011 Feb 26

Hi, I would like to know what is the best way to write a procedure in R. It seems that when I run a script it doesn't wait for previous code lines to be executed. I would like to know how to force R to hold on a code line until it is processed. For example: I want to choose from a list the name of the sheet in which to do an analysis, then I want to run a goodness of fit test for the data on that

compare one field of dataframe with excel sheet using R

2012 Jun 26

I have a data frame consisting of three columns (name of compound, ppm and frequency). Name contains string values. ppm and frequency contain numeric values with decimal points up to four digits. I have an Excel sheet which is like a library. The first column contains the name of compounds and the remaining columns contain the ppm values of the compound which satisfy certain rules. The number of ppm values

rowMean, specify subset of columns within Dataframe?

2007 Nov 25

I would like to calculate the mean of tree leader increment growth over 5 years (I1 through I5) where each tree is a row and each row has 5 columns. So far I have achieved this using rowMeans when all columns are numeric type and used in the calculation: Data1 <- data.frame(cbind(I1 = 3, I2 = c(0,3:1, 2:5,NA), I3 =c(1:4,NA,5:2),I4=2,I5=3)) Data1 Data1$mean_5 <- rowMeans(Data1, na.rm =T)
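A sketch of restricting `rowMeans` to the increment columns only, assuming the data frame also holds a non-numeric column. The `species` column and its value are hypothetical, added only to show why a column subset is needed.

```r
# Data from the post, plus a hypothetical non-numeric column
Data1 <- data.frame(I1 = 3, I2 = c(0, 3:1, 2:5, NA),
                    I3 = c(1:4, NA, 5:2), I4 = 2, I5 = 3,
                    species = "hypothetical")

# Select I1..I5 by name so other columns are ignored
inc_cols <- paste0("I", 1:5)
Data1$mean_5 <- rowMeans(Data1[, inc_cols], na.rm = TRUE)

print(Data1$mean_5)
```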

Computing sums of the columns of an array

2005 Aug 05

Hi, I have a 5x731 array A, and I want to compute the sums of the columns. Currently I do: apply(A, 2, sum) But it turns out, this is slow: 70% of my CPU time is spent here, even though there are many complicated steps in my computation. Is there a faster way? Thanks, Martin
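The built-in `colSums` is the usual replacement for `apply(A, 2, sum)` and is generally much faster because it avoids a per-column function call. A minimal sketch on a hypothetical 5 x 731 matrix, checking that both give the same result:

```r
# Hypothetical 5 x 731 matrix standing in for A
A <- matrix(seq_len(5 * 731), nrow = 5)

fast <- colSums(A)        # vectorised column sums
slow <- apply(A, 2, sum)  # the approach from the post

print(all(fast == slow))
```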

t.test in a loop

2009 Jan 28

Hi All, I've been having a little trouble with creating a loop that will run a series of t.tests for inspection. Below is the code I've tried, and some checks I've looked at. I've used the get(paste()) idea as I was told previously that the use of eval should be avoided. I've run a single syntax to check that my syntax is correct and works without any problems

Warning: matrix by vector division

2008 Mar 07

Dear list, I just made a very simple mistake, but it was hard to spot. And I think that I should warn other people, because it is probably so simple to make... === R code === # Let us create a matrix: (a <- cbind(c(0,1,1), rep(1,3))) # [,1] [,2] # [1,] 0 1 # [2,] 1 1 # [3,] 1 1 # That is a MISTAKE: a/colSums(a) # [,1] [,2] # [1,] 0.0000000 0.3333333
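The pitfall shown here comes from R recycling the divisor down the columns, so `a/colSums(a)` does not divide each column by its own sum. The post is cut off before its own fix, so the following is only a sketch of two common ways to get column-wise proportions, using the matrix from the post.

```r
a <- cbind(c(0, 1, 1), rep(1, 3))

# sweep() applies the divisor along the column margin (MARGIN = 2)
p <- sweep(a, 2, colSums(a), "/")

# Equivalent: transpose so that recycling lines up with the columns
p2 <- t(t(a) / colSums(a))

print(p)
print(colSums(p))  # each column now sums to 1
```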

operating on a subset of a dataframe

2002 Jul 03

Hi everyone, I've got a dataframe with columns of different types. A certain number of columns in the dataframe hold the results of a series of Likert-type items. I've got a function that will print a simple table of frequencies and I want to apply that function to those columns of the dataframe only. What's the best approach? -Tim

sort matrix by sum of columns

2006 Jun 21

Hi all, I would like to know how can I sort the cols of a matrix by the sum of their elements. a <- matrix(as.integer(rnorm(25,4,2)),10,5) colnames(a) = c("alfa","bravo","charlie","delta","echo") I guess I should use colSums, and then rearrange the matrix somehow according to the result. My idea is to display a "sorted" barplot:
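Following the poster's own idea of using `colSums`, one sketch is to pass the sums to `order` and use the result as a column index. The small matrix below is a hypothetical stand-in with fixed values so the outcome is reproducible.

```r
# Hypothetical matrix with named columns (fixed values, not random)
a <- matrix(c(3, 1, 2,
              9, 9, 9,
              0, 1, 0),
            nrow = 3,
            dimnames = list(NULL, c("alfa", "bravo", "charlie")))

# Reorder the columns from largest to smallest column sum
a_sorted <- a[, order(colSums(a), decreasing = TRUE), drop = FALSE]

print(colnames(a_sorted))
```

The sorted sums can then be passed to a plot, for example `barplot(colSums(a_sorted))`.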

Multiple graphs > boxplot

2012 Oct 05

Dear all I am trying to represent a dependent variable (treatment) against different independent variables (v1, v2, v3....v20). I am using the following command: boxplot(v1~treatment,data=y, main="xxxxxx",xlab="xxxxxx", ylab="xxxxxx") However, it provides me only one graph for v1~treatment. For the other comparisons, I have to repeat the same command but changing

search species with all absence in a presence-absence matrix

2013 Sep 20

Dear list I have a matrix composed of islandID as rows and speciesID as columns. IslandID: Island A, B, C….O (15 islands in total) SpeciesID: D0001, D0002, D0003….D0100 (100 species in total) The cell of the matrix describes presence (1) or absence (0) of the species in an island. Now I would like to search the species with absence (0) in all the islands (Island A to Island O.)
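Since the cells are 0 or 1, a species absent from every island has a column sum of zero. A minimal sketch on a hypothetical 3 x 3 slice of such a matrix (the real one would be 15 x 100):

```r
# Hypothetical presence-absence matrix: islands as rows, species as columns
m <- matrix(c(1, 0, 0,
              0, 0, 0,
              0, 1, 1),
            nrow = 3,
            dimnames = list(c("A", "B", "C"),
                            c("D0001", "D0002", "D0003")))

# Species whose column contains only zeros
absent <- colnames(m)[colSums(m) == 0]

print(absent)
```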
