similar to: Logical subset of the columns in a dataframe

Displaying 20 results from an estimated 20000 matches similar to: "Logical subset of the columns in a dataframe"

2006 Oct 08
1
Probability of exceedance function question
I'm trying to calculate a cumulative area distribution (graph) of drainage areas. This is defined as P(A > A*). Simple in principle. I can do this in Excel with "COUNTIF", which will count the number of cells in the row "area" that have area A, then determine, for each cell in the row "area", how many cells exceed that area, then dividing that number by the total
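A minimal sketch of the exceedance probability P(A > A*) in R, assuming the drainage areas are held in a numeric vector. The name `area` and its values are made up for illustration and are not from the original post.

```r
# Hypothetical drainage areas; substitute the real "area" values.
area <- c(1.2, 3.4, 0.8, 5.6, 3.4, 2.1)

# ecdf() gives the empirical CDF, P(A <= a); its complement is P(A > a).
Fn <- ecdf(area)
p_exceed <- 1 - Fn(area)

# The same quantity by explicit counting, mirroring the COUNTIF approach:
# for each area, count the cells that exceed it and divide by the total.
p_count <- sapply(area, function(a) sum(area > a) / length(area))
```

Plotting `p_exceed` against `area` (for example on log axes) would then give the cumulative area distribution graph.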
2008 Mar 25
2
Combining logical operators to extract columns from a dataframe
Hi R-helpers, I have a dataframe (called data) with 100 columns, the columns of which are named with integers ranging from 1900 to 1999. I wish to extract those columns whose names are >=1950 and <=1970. I tried: data2<-subset(data,select=(names(data)>=1950 & names(data)<=1970)) but that doesn't work. Any ideas? Thanks! Mark
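One possible approach, sketched under assumptions: column names are character strings, so converting them to numbers before comparing avoids string comparison. The example data frame below is hypothetical, and the "X" prefix handling is only a guess at what the poster's names might look like.

```r
# Hypothetical data frame with columns named "1900" ... "1999".
data <- as.data.frame(matrix(0, nrow = 2, ncol = 100))
names(data) <- 1900:1999

# names() returns character strings, so convert them before comparing.
# If the names carry an "X" prefix (e.g. "X1950"), strip it first.
yrs <- as.numeric(sub("^X", "", names(data)))

# Logical index over the columns; drop = FALSE keeps a data frame
# even if only one column matches.
data2 <- data[, yrs >= 1950 & yrs <= 1970, drop = FALSE]
```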
2009 Feb 15
1
MDS with ranking data (and transformation)
Dear Sirs and Madams :-) I am trying to teach myself multidimensional scaling. To that effect I have collected a survey asking people to rank 10 philosophers and politicians according to their preference. I have collected 61 answers. The data is organized in ten columns and 61 rows. The columns are "choice_1", "choice_2", "choice_3" etc. The cells contain the name of the
2010 Oct 05
1
Ordering Duplicates for Selection
Hi all, I've found a lot of helpful info regarding identifying and deleting duplicates but I'd like to do something a little different - I'd like to identify the duplicate values but instead of deletion, label them with a value. I am working with historical data regarding school courses: Student Number Course Final Mark
2009 Apr 21
4
My surprising experience in trying out REvolution's R
I care a lot about R's speed. So I decided to give REvolution's R (http://revolution-computing.com/) a try, which bills itself as an optimized R. Note that I used the free version. My machine is an Intel Core 2 Duo under Windows XP Professional. The code I ran is at the end of this post. First, the regular R 1.9. It takes 2 minutes and 6 seconds, CPU usage 50%. Next, REvolution's R.
2009 Jan 20
2
Summing Select Columns of a Data Frame?
Hi, I would like to operate on certain columns in a dataframe, but not others. My data looks like this: x1 x2 x3 1 2 3 4 5 6 7 8 9 I want to create a new column named x4 that is the sum of x1 and x2, but NOT x3. I looked at colSums and apply, but those functions seem to use all the columns in a dataframe. How do I only use select columns? If it helps, in Stata this would be gen x4
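A short sketch of one way to do this, using the example data from the question: index the data frame by the wanted column names and pass only those columns to rowSums().

```r
# The example data from the question.
df <- data.frame(x1 = c(1, 4, 7), x2 = c(2, 5, 8), x3 = c(3, 6, 9))

# Select only x1 and x2 by name, then sum across each row.
df$x4 <- rowSums(df[, c("x1", "x2")])
```

For just two columns, `df$x4 <- df$x1 + df$x2` would do the same thing; the rowSums() form scales better when the list of columns is long.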
2009 Jul 16
3
DataFrame help
Alright, so I am trying to write my own function to calculate column sums in a matrix. I want the result as a single list with the values. So far I have: csum<-function(m) { a = data.frame(m) s = lapply(a,sum) return(s) } What is the easiest way to have it return in a format such as [1] 6 15 24 ? Thanks.
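A possible variant of the function above, as a sketch: sapply() simplifies the per-column results to a plain vector where lapply() returns a list. The 3 x 3 matrix is an assumed example chosen to reproduce the sums in the question.

```r
# Assumed example input whose column sums are 6, 15 and 24.
m <- matrix(1:9, nrow = 3)

csum <- function(m) {
  a <- data.frame(m)
  # sapply() returns a vector rather than a list;
  # unname() drops the column names so the result prints as a bare vector.
  unname(sapply(a, sum))
}

csum(m)

# The built-in colSums() gives the same values directly on the matrix.
colSums(m)
```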
2009 Dec 04
5
logical masking of a matrix converts it to a vector
One problem I've been having is the special case in which only one row/column remains and the variable gets converted into a vector when entries are removed by logical masking. This is a problem because subsequent code may rely on matrix operations (apply, colSums, dim, etc.). For example: > a <- matrix(c(1, 2, 3, 4), nrow = 2) > a [,1] [,2] [1,] 1 3 [2,] 2 4 >
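A small sketch of the usual remedy: the `drop = FALSE` argument to `[` keeps the result a matrix even when a single row or column remains. The mask `keep` is a made-up example.

```r
a <- matrix(c(1, 2, 3, 4), nrow = 2)

# Hypothetical logical mask that leaves only the first row.
keep <- c(TRUE, FALSE)

v <- a[keep, ]                # dimensions are dropped: a plain vector
b <- a[keep, , drop = FALSE]  # still a 1 x 2 matrix

dim(v)  # NULL
dim(b)  # 1 2
```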
2010 Sep 23
1
looking for a faster way to compare two columns of a matrix
Please consider this matrix: x <- structure(c(5, 4, 3, 2, 1, 6, 3, 2, 1, 0, 3, 2, 1, 0, 0, 2, 1, 1, 0, 0, 2, 0, 0, 0, 0), .Dim = c(5L, 5L)) For each pair of columns, I want to calculate the proportion of entries different than 0 in column j (i > j) that have lower values than the entries in the same row in column i: x[, 1:2] sum((x[,1] > x[,2]) & (x[,2] > 0))/sum(x[,2] > 0)
2009 Jan 18
2
Deleting columns based on the number of non-blank observations
Hello, I have a dataset (named "x") with many (966) columns. What I would like to do is delete any columns that do not have at least 375 non-blank observations (i.e., the cells have some value in them besides NA). How can I do this? I have come up with the following code to _count_ the non-blank observations in each column, but how would I adapt this code to _delete_ columns from the
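One way this could be done, sketched on a tiny made-up data frame: count the non-NA values per column with colSums(!is.na(x)) and keep only the columns that reach the threshold. The threshold `min_obs` is set to 2 here so the toy data shows the effect; the question uses 375.

```r
# Hypothetical small dataset; the real one has 966 columns.
x <- data.frame(a = c(1, NA, 3), b = c(NA, NA, 1), c = c(1, 2, 3))

min_obs <- 2  # would be 375 for the dataset in the question

# Number of non-NA observations in each column.
n_obs <- colSums(!is.na(x))

# Keep only columns with enough observations; drop = FALSE keeps a
# data frame even if a single column survives.
x_kept <- x[, n_obs >= min_obs, drop = FALSE]
```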
2011 Feb 26
1
Hello!
Hi, I would like to know what is the best way to write a procedure in R. It seems that when I run a script it doesn't wait for previous code lines to be executed. I would like to know how to force R to hold on a code line until it is processed. For example: I want to choose from a list the name of the sheet in which to do an analysis, then I want to run a goodness of fit test for the data on that
2012 Jun 26
1
compare one field of dataframe with excel sheet using R
I have a data frame consisting of three columns (name of compound, ppm and frequency). Name contains string values. ppm and frequency contain numeric values with decimal points up to four digits. I have an Excel sheet which is like a library. The first column contains the names of compounds and the remaining columns contain the ppm values of the compound which satisfy certain rules. The number of ppm values
2007 Nov 25
2
rowMean, specify subset of columns within Dataframe?
I would like to calculate the mean of tree leader increment growth over 5 years (I1 through I5) where each tree is a row and each row has 5 columns. So far I have achieved this using rowMeans when all columns are numeric type and used in the calculation: Data1 <- data.frame(cbind(I1 = 3, I2 = c(0,3:1, 2:5,NA), I3 =c(1:4,NA,5:2),I4=2,I5=3)) Data1 Data1$mean_5 <- rowMeans(Data1, na.rm =T)
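A sketch of restricting rowMeans() to a subset of columns by name. The extra `species` column is hypothetical, added only to show a non-numeric column being left out of the calculation.

```r
# Data from the question, plus a hypothetical non-numeric column.
Data1 <- data.frame(I1 = 3, I2 = c(0, 3:1, 2:5, NA),
                    I3 = c(1:4, NA, 5:2), I4 = 2, I5 = 3,
                    species = "fir")

# Names of the increment columns to average over.
growth_cols <- paste0("I", 1:5)

# Only the selected columns go into rowMeans(); NAs are ignored.
Data1$mean_5 <- rowMeans(Data1[, growth_cols], na.rm = TRUE)
```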
2005 Aug 05
6
Computing sums of the columns of an array
Hi, I have a 5x731 array A, and I want to compute the sums of the columns. Currently I do: apply(A, 2, sum) But it turns out, this is slow: 70% of my CPU time is spent here, even though there are many complicated steps in my computation. Is there a faster way? Thanks, Martin
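A brief sketch comparing the two calls: colSums() computes the same column totals as apply(A, 2, sum) and is documented as being considerably faster. The random matrix stands in for the poster's array.

```r
# Stand-in for the 5 x 731 array from the question.
A <- matrix(runif(5 * 731), nrow = 5)

s_apply <- apply(A, 2, sum)  # the approach from the question
s_fast  <- colSums(A)        # dedicated function for the same result
```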
2009 Jan 28
2
t.test in a loop
Hi All, I've been having a little trouble with creating a loop that will run a series of t.tests for inspection. Below is the code I've tried, and some checks I've looked at. I've used the get(paste()) idea as I was told previously that the use of eval should be avoided. I've run a single syntax to check that my syntax is correct and works without any problems
2008 Mar 07
4
Warning: matrix by vector division
Dear list, I just made a very simple mistake, but it was hard to spot. And I think that I should warn other people, because it is probably so simple to make... === R code === # Let us create a matrix: (a <- cbind(c(0,1,1), rep(1,3))) # [,1] [,2] # [1,] 0 1 # [2,] 1 1 # [3,] 1 1 # That is a MISTAKE: a/colSums(a) # [,1] [,2] # [1,] 0.0000000 0.3333333
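For comparison, a sketch of two ways to divide each column by its own sum. The issue with `a/colSums(a)` is that the vector is recycled down the columns in column-major order, so elements are not matched to their column.

```r
a <- cbind(c(0, 1, 1), rep(1, 3))

# sweep() applies the division along margin 2, so each column is
# divided by its own sum.
by_col <- sweep(a, 2, colSums(a), "/")

# Equivalent using a double transpose: after t(), recycling runs
# along what were the columns of a.
by_col2 <- t(t(a) / colSums(a))

colSums(by_col)  # each column now sums to 1
```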
2002 Jul 03
2
operating on a subset of a dataframe
Hi everyone, I've got a dataframe with columns of different types. A certain number of columns in the dataframe hold the results of a series of Likert-type items. I've got a function that will print a simple table of frequencies and I want to apply that function to those columns of the dataframe only. What's the best approach? -Tim
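One possible approach, as a sketch: select the Likert columns by name and lapply() a function over just those. The data frame, the column names, and the use of table() in place of the poster's own frequency function are all assumptions for illustration.

```r
# Hypothetical survey data with mixed column types.
survey <- data.frame(id = 1:4,
                     q1 = c(1, 2, 2, 5),
                     q2 = c(3, 3, 4, 4),
                     name = c("a", "b", "c", "d"))

# Names of the Likert-type columns (assumed).
likert_cols <- c("q1", "q2")

# Apply the frequency function to those columns only;
# table() stands in for the poster's own function.
freqs <- lapply(survey[likert_cols], table)
```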
2006 Jun 21
3
sort matrix by sum of columns
Hi all, I would like to know how I can sort the columns of a matrix by the sum of their elements. a <- matrix(as.integer(rnorm(25,4,2)),10,5) colnames(a) = c("alfa","bravo","charlie","delta","echo") I guess I should use colSums, and then rearrange the matrix somehow according to the result. My idea is to display a "sorted" barplot:
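A sketch of the colSums idea from the question: order() on the column sums gives a column permutation that can be used directly as an index. The seed and the use of 50 random values (enough to fill the 10 x 5 matrix without recycling) are choices made for this example.

```r
set.seed(1)  # only to make the example reproducible
a <- matrix(as.integer(rnorm(50, 4, 2)), 10, 5)
colnames(a) <- c("alfa", "bravo", "charlie", "delta", "echo")

# order() returns the column indices sorted by column sum;
# decreasing = TRUE puts the largest sum first.
a_sorted <- a[, order(colSums(a), decreasing = TRUE)]
```

Passing `colSums(a_sorted)` to barplot() would then draw the bars in sorted order.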
2012 Oct 05
3
Multiple graphs > boxplot
Dear all, I am trying to represent a dependent variable (treatment) against different independent variables (v1, v2, v3....v20). I am using the following command: boxplot(v1~treatment,data=y, main="xxxxxx",xlab="xxxxxx", ylab="xxxxxx") However, it provides me only one graph for v1~treatment. For the other comparisons, I have to repeat the same command but changing
2004 May 27
4
extract columns using their names
Hello, Is there a way to extract multiple columns from a dataframe using their names instead of their numbers? Currently I use: data2 <- data1[, c(1,3,9)] And I am looking for something like data2 <- data1[, c("XX","YY","ZZ")] I use the same dataframe for many purposes, and I run code that changes the order of the columns every time. Many thanks, Adrian
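A quick sketch showing that indexing by a character vector of names works as written in the question and does not depend on column order. The data frame and the extra `AA` column are made up for illustration.

```r
# Hypothetical data frame; only the names XX, YY, ZZ come from the question.
data1 <- data.frame(XX = 1:3, AA = 4:6, YY = 7:9, ZZ = 10:12)

# Select columns by name.
data2 <- data1[, c("XX", "YY", "ZZ")]

# Reordering the columns does not affect name-based selection.
shuffled <- data1[, c("ZZ", "AA", "YY", "XX")]
data3 <- shuffled[, c("XX", "YY", "ZZ")]

identical(data2, data3)
```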