Displaying 20 results from an estimated 7000 matches similar to: "Performing a function on columns specified in another dataframe"

2010 Jun 09
1
Subset columns by prefix
Hello R listserve, I would appreciate someone's help with this problem. Consider the following toy dataset: x <- read.table(textConnection("worldclim.1 worldclim.2 cru.1 cru.2 indv.1 7 8 32 658 indv.2 7 7 39 422"), header = TRUE) How could I create a subset of the data based on the column prefix? For instance, let's say I wanted to subset only the columns with the
2004 Sep 15
5
replacing NA's with 0 in a dataframe for specified columns
I know that there must be a cool way of doing this, but I can't think of it. Let's say I have a dataframe with NA's. > x <- data.frame(a = c(0,1,2,NA), b = c(0,NA,1,2), c = c(NA, 0, 1, 2)) > x a b c 1 0 0 NA 2 1 NA 0 3 2 1 1 4 NA 2 2 > I know it is easy to replace all the NA's with zeroes. > x[is.na(x)] <- 0 > x a b c 1 0 0 0 2 1 0 0 3 2 1
2011 Feb 17
4
Find and replace all the elements in a data frame
Hi all, I'm having a problem once again, trying to do something very simple. Consider the following data frame: x <- read.table(textConnection("locus1 locus2 locus3 A T C A T NA T C C A T G"), header = TRUE) closeAllConnections() I am trying to make a new data frame, replacing "A" with "A/A", "T" with "T/T", "G" with
2009 Apr 16
2
Translate the elements of a dataframe
The second beginner question. I want to create a new dataframe, where each element of the original dataframe is translated to 1 if it was "+", to 0 if it was "-", and to -1 otherwise. I could do it with: Lines <- "a b c d + - + + + + + - + 1 - '+ ' - + + + + N - +" DF <-
2010 Jun 09
2
Change the name of one column ONLY
Hi all, I have a very simple problem that I cannot seem to find the answer to. Consider the following toy dataset: x <- read.table(textConnection("V1 apples bananas cherries indv.1 7 8 4 3 indv.2 7 7 4 9"), header = TRUE) How would I change the column name of ONLY the first column, not the others? Surely I should not have to re-specify the names of ALL the columns -- e.g.,
2010 Sep 26
2
Splitting a data frame into several completely separate data frames
Hello again, How do I split a data frame into smaller, completely separate data frames (rather than separate data frames comprising a single "list")? Consider the following data, and my coding attempt: x <- read.table(textConnection("id type number indv.1 bagel 6 indv.2 bagel 1 indv.3 donuts 10 indv.4 donuts 9"), header = TRUE) closeAllConnections() x.split <-
2011 Apr 11
2
Plotting a quadratic line on top of an xy scatterplot
Dear Listserv, Here is my latest in a series of simple-seeming questions that dog me. Consider the following data: x <- read.table(textConnection("temperature probability 0.11 9.4 0 2.3 0.38 8.7 0.43 9.2 0.6 15.6 0.47 8.7 0.09 12.8 0.11 9.4 0.01 7.7 0.83 8 0.65 9.3 0.05 7.4 0.34 10.1 0.02 4.8 0.07 9.1 0.6 15.6 0.01 8.4 0.9 9.6 0.83 8 0.12 8.4 0.01 8 0 5 0.11 9.7 0.41 7.4 0.05 9.4 0.09
2010 Mar 13
2
Indexing a matrix within loops
Hi, I was hoping someone could help me with the following problem. Consider this toy example. For the input dataset there are four individuals (rows "indv.1" through "indv.4"), measured for two different variables (columns "var.1" and "var.2") at two different levels of a factor (column "factor.level"). I want to calculate a matrix that has the
2007 Nov 25
2
rowMean, specify subset of columns within Dataframe?
I would like to calculate the mean of tree leader increment growth over 5 years (I1 through I5) where each tree is a row and each row has 5 columns. So far I have achieved this using rowMeans when all columns are numeric type and used in the calculation: Data1 <- data.frame(cbind(I1 = 3, I2 = c(0,3:1, 2:5,NA), I3 =c(1:4,NA,5:2),I4=2,I5=3)) Data1 Data1$mean_5 <- rowMeans(Data1, na.rm =T)
2011 Dec 07
1
removing specified length of text after a period in dataframe of char's
Dear all, I'm trying to remove some text after the period (a decimal point) in the data frame 'hi', below. This is one step in formatting a table. So I would like e.g. "2.0" to become "2" and "5.3" to be "5.3", where the variable digordered contains the number of digits after the decimal that I would like to display, in the same order in which
2009 Nov 05
3
performing operations on a dataframe
Hey all, I feel like the solution to this problem should be relatively simple, but for some reason I can't find answers or come up with my own solution. Given the dataframe: (SpA and SpB not important, want to look at distribution of co-occurrence for each year) Year SpA SpB Coocc 2000 0 2000 2 2000 1 2001 8 2001 2 2001 0 2001 0 2002 1 2002 2 How can I apply different functions to
2009 Aug 04
2
Calculate first difference from a dataframe; write a simulation
Dear R Users I'm writing my first simulation in R. I've put across my problems with a smaller example in the attachment along with the questions. Please help. Best regards Meenu -------------- next part -------------- mydat<-read.table(textConnection("Level spread change State 4.57 1.6 BlF NA 4.45 2.04 BrS NA 3.07 2.49 BlS NA 3.26 -0.26 BlF NA 2.80 0.22 BrF NA 3.22 2.5 BrS NA
2010 Apr 16
2
Scanning only specific columns into R from a VERY large file
Hi, I turn to you, the R Sages, once again for help. You've never let me down! (1) Please make the following toy files: x <- read.table(textConnection("var.1 var.2 var.3 var.1000 indv.1 1 5 9 7 indv.210000 2 9 3 8"), header = TRUE) y <- read.table(textConnection("var.3 var.1000"), header = TRUE) write.csv(x, file = "x.csv") write.csv(y, file =
2011 Jan 31
5
Finding a Diff within a Dataframe columns
Hi, I have a Dataframe. A B C D 0.1 0.7 0.9 0.8 0.20 0.60 0.80 0.70 0.40 0.80 0.70 0.76 I need a resultant dataframe (A-B) (C-D) -0.6 0.1 -0.40 0.1 -0.40 -0.06 Any suggestion would be a great help. Thanks, Ramya
2007 Sep 04
2
How to sort dataframe columns by colMeans
I read data from an external source containing several columns. Each column holds the values of one metric as a time series. I want to sort the resulting dataframe so that the column with the largest mean is the leftmost column, with colMeans values descending to the right. I see many solutions for sorting rows based on some column characteristic, but haven't found any
2008 Jul 20
3
Order of columns (variables) in dataframe
Dear R experts, I have a dataframe with 4 columns (variables). I want to reorder (or reposition) these columns on the basis of a value in its last row. e.g. df1<-data.frame( v1= c(2,3,1,9,5), v2=c(8,5,12,4,11), v3=c(7,8,2,6,9), v4=c(1,4,6,3,6)) > df1 v1 v2 v3 v4 1 2 8 7 1 2 3 5 8 4 3 1 12 2 6 4 9 4 6 3 5 5 11 9 6 I want to get the order of df1 on the basis of
2011 Mar 10
1
about textConnection
I need to read a table from a string with a special format. I used the read.csv and textConnection functions, but I am confused about textConnection in the following code. Case A: It is OK! str0 <- '{"abc",{"def","X,1&Y,2&Z,3"}}' str1 <- strsplit(str0,'"')[[1]][6] str2 <- gsub("&","\n", str1) con <-
2017 Dec 14
1
match and new columns
Hi Bill, I set stringsAsFactors = FALSE, but it still did not work. tdat <- read.table(textConnection("A B C Y A12 B03 C04 0.70 A23 B05 C06 0.05 A14 B06 C07 1.20 A25 A23 A12 3.51 A16 A25 A14 2,16"),header = TRUE ,stringsAsFactors = FALSE) tdat$D <- 0 tdat$E <- 0 tdat$D <- (ifelse(tdat$B %in% tdat$A, tdat$A[tdat$B], 0)) tdat$E <- (ifelse(tdat$B %in% tdat$A, tdat$A[tdat$C], 0))
2005 Sep 18
0
Updated rawConnection() patch
Here's an update of my rawConnection() implementation. In addition to providing a raw version of textConnection(), this fixes two existing issues with textConnection(): one is that the current textConnection() implementation carries around unprotected SEXP pointers, the other is a performance problem due to prolific copying of the output buffer as output is accumulated line by line. This new
2009 Jul 29
0
Maximizing values in subsetted dataframe
Dear List, I am trying to sub-sample some data by taking a data point every x minutes. The data contains missing values, and I would like to take the sub-sample that maximizes the number of valid points in the sample. I.e. minimizes the number of NA's in the data set. For example, given the following: da<-seq(Sys.time(),by=1,length.out=10) x<-c(1,2,NA,4,NA,6,NA,8,9,10)