thr3ads.net - similar to: "Multiple subsetting of a dataframe based on many conditions"

Displaying 20 results from an estimated 6000 matches similar to: "Multiple subsetting of a dataframe based on many conditions"

Subset dataframe based on condition

2006 Apr 17

Hi, I am trying to extract a subset of data from my original data frame based on some condition. For example (mydf: original data frame, submydf: subset data frame): > submydf = subset(mydf, a > 1 & b <= a). Here column a contains values ranging from 0.01 to 100000. I want to extract only those matching condition 1, i.e. a > . But when I execute this command it is

subsetting by cell value with a list

2012 Mar 15

I would like to subset my dataframe by matching all rows that have any value from a list of values. I can get it to work if I have exactly one value, but I'm not sure how to do it with a list of values. This works and gives me exactly one line: my.df[ which( mydf$IDX==17), ] I would like to do something like this: my.df[ which( mydf$IDX==c(17, 42)), ] Obviously that won't work, but
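A minimal sketch of one common way to match against several values, assuming the goal is to keep every row whose IDX equals any entry in the list. `%in%` is base R; the sample data and the name `wanted` are made up for illustration.

```r
# Hypothetical data; only the IDX column name comes from the post
mydf <- data.frame(IDX = c(3, 17, 42, 99),
                   val = c("a", "b", "c", "d"))

# Values to match (assumed example list)
wanted <- c(17, 42)

# %in% tests membership per row and returns one TRUE/FALSE per element of IDX
hits <- mydf[mydf$IDX %in% wanted, ]
hits
```

With `==`, a length-2 right-hand side is recycled element by element instead of being tested for membership, which is probably why the second form in the post does not give the intended rows.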

filesystem becoming read only

2007 Jan 28

Hi list, I'm looking for advice/help in tracking down a problem with a new system I've purchased. I have a beige box server with a Gigabyte GA-M51GM-S2G motherboard. It has the nVidia MCP51 SATA controller with 3 250 gig Western Digital hard drives attached to it. It seems that when doing a considerable amount of file writing, the filesystem will become read-only. See attached dmesg

calculating p-values of columns in a dataframe

2007 Jul 07

I have a dataframe ("mydf") that contains "differences of means". I wish to test whether these differences are significantly different from zero. Below, I calculate the t-statistic for each column. What is a "good" method to calculate/look-up the p-value for each column? mydf=data.frame(a=c(1,-22,3,-4),b=c(5,-6,-7,9)) mymean=mean(mydf) mysd=sd(mydf)
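One possible approach, sketched under the assumption that a one-sample, two-sided t-test against zero is what is wanted for each column. `t.test`, `pt` and `sapply` are base/stats functions; the names `pvals`, `tstat` and `pvals_manual` are illustrative only.

```r
mydf <- data.frame(a = c(1, -22, 3, -4), b = c(5, -6, -7, 9))

# One-sample, two-sided t-test of each column against mu = 0
pvals <- sapply(mydf, function(x) t.test(x, mu = 0)$p.value)

# The same p-values derived by hand from the t-statistic
tstat <- sapply(mydf, function(x) mean(x) / (sd(x) / sqrt(length(x))))
pvals_manual <- 2 * pt(-abs(tstat), df = nrow(mydf) - 1)

pvals
pvals_manual
```

Working column by column with `sapply` also avoids calling `mean()` on the whole data frame, which current versions of R do not support.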

Reference to dataframe and contents

2007 Feb 04

This is probably easy for experienced users but I could not find a solution. I have several R scripts that process several columns of a dataframe (several dataframes and columns actually, but simplified for my question). References such as: myDF$myCol are all over. I would like to automate this for other dataframes and columns by defining a reference only once in the beginning of the script. One

retaining "POSIXct" formatting when using apply(muff, FUN=MAX) on POSIXct dataframe?

2008 Jan 08

How do I retain "POSIXct" formatting when using apply, with FUN=max? #example: mydata <- rep(Sys.time(), 10) mydf <- data.frame(matrix(data=mydata, nrow=2, ncol=length(mydata) ) ) for(i in seq(mydf))class(mydf[[i]]) <- class(mydata) str(mydf) maxdates <- apply(mydf,2,max,na.rm=T) str(maxdates) #Why is the formatting now "chr", and not
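A hedged sketch of one way around this: `apply()` first converts a data frame to a matrix, and date-time columns become character in that conversion. Working column by column with `lapply()` avoids the matrix step. The data below is a small made-up stand-in for the example in the post.

```r
# Hypothetical two-column data frame of POSIXct values
t0 <- as.POSIXct("2008-01-08 12:00:00", tz = "UTC")
mydf <- data.frame(x = t0 + c(0, 3600),
                   y = t0 + c(7200, 60))

# lapply keeps each column as POSIXct; c() then combines the per-column maxima
maxdates <- do.call(c, lapply(mydf, max, na.rm = TRUE))
str(maxdates)
```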

add factor to dataframe given ranges

2005 Dec 22

Hi all, I would like to factorize the entries in a dataframe given some groupings. E.g: mydf = data.frame( a = rnorm(100,10), b = rnorm(100,10), c = rgamma(100, 1, scale=1)) group = hist(mydf$c, breaks="FD") group$breaks The idea is to create a factor "mydf$d" with levels corresponding to the ranges in group$breaks. There must be an easy way to do this that I
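A minimal sketch using base R's `cut()`, assuming the factor levels should be exactly the intervals defined by `group$breaks`. `set.seed()` and `plot = FALSE` are additions made here only to keep the example reproducible and quiet.

```r
set.seed(1)  # assumption: fixed seed for reproducibility
mydf <- data.frame(a = rnorm(100, 10),
                   b = rnorm(100, 10),
                   c = rgamma(100, 1, scale = 1))

# Same break computation as in the post, without drawing the plot
group <- hist(mydf$c, breaks = "FD", plot = FALSE)

# cut() turns the numeric column into a factor of intervals;
# include.lowest = TRUE mirrors the default binning used by hist()
mydf$d <- cut(mydf$c, breaks = group$breaks, include.lowest = TRUE)
table(mydf$d)
```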

How to loop through all the columns in dataframe

2008 Mar 16

Hi: Can anyone advise me on how to loop and perform a calculation through all the columns? Here's my data: xd<- c(2.2024,2.4216,1.4672,1.4817,1.4957,1.4431,1.5676) pd<- c(0.017046,0.018504,0.012157,0.012253,0.012348,0.011997,0.012825) td<- c(160524,163565,143973,111956,89677,95269,81558) mydf<-data.frame(xd,pd,td) trans<-t(mydf) trans I have these values that I need to

Dataframe help

2008 Nov 05

Hi there, I have a dataframe length.unique.info > length.unique.info abc 12 345 def 16 550 lmn 6 600 I want those names that fall under the condition (length.unique.info[,2][i] <=5 && length.unique.info[,3][i] >=500) abcder<-length.unique.info[which(length.unique.info[,2][i] <=5 && length.unique.info[,3][i] >= 500),1] will "&&" look for
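A short sketch of the vectorised form, assuming the intent is to test every row at once. `&` compares element by element, whereas `&&` is meant for single TRUE/FALSE values (and, as far as I know, raises an error on longer vectors in recent R releases). The column names `name`, `n` and `len` are made up, since the post shows none.

```r
# Data from the post, with hypothetical column names
length.unique.info <- data.frame(name = c("abc", "def", "lmn"),
                                 n    = c(12, 16, 6),
                                 len  = c(345, 550, 600))

# Element-wise AND: one logical value per row, no loop index needed
keep <- length.unique.info[, 2] <= 5 & length.unique.info[, 3] >= 500

# Names of the rows that satisfy both conditions
abcder <- length.unique.info[keep, 1]
abcder
```

For the three sample rows shown in the post no row has a second column of 5 or less, so the result is empty.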

Transform values from one column into column names of new dataframe

2008 May 02

Hi, I have a question about reformatting data. It looks like it should be simple, but I've been working at it for a while now and it's about time I ask for help. My data look like this: ITEM VALUE STEP item1 A first item2 C first item2 D second item1 A second item3 A first item3 B second item3 A third I just want to transform

Output a dataframe from R to excel

2005 Mar 13

Hi, I am trying to output a dataframe from R to an Excel file. Can anyone tell me how to do it? Thanks a lot. E.g. R dataframe: A B C 1 2 1 3 4 2 . . .

compare one field of dataframe with excel sheet using R

2012 Jun 26

I have a data frame consisting of three columns (name of compound, ppm and frequency). Name contains string values. ppm and frequency contain numeric values with up to four decimal digits. I have an Excel sheet which is like a library. The first column contains the names of compounds and the remaining columns contain the ppm values of the compound which satisfy certain rules. The number of ppm values

Reading multiple text files and then combining into a dataframe

2011 Dec 03

I have multiple text files, separated by a single space, that I need to combine into a single data.frame. Below is the track I'm on using list.files to capture the names of the files and then lapply with read.table. But I run into problems making a usable dataframe out of the data. #Creating example data in similar format to data I have sub <- rep(1,10) trial <- seq(1,10,1) size
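A rough sketch of the list.files / lapply / read.table route described above, assuming every file has a header row and the same columns. The temporary directory, the file names and the `size` values are invented for the demonstration.

```r
# Write three small space-separated files into a temporary folder (made-up data)
tmp <- file.path(tempdir(), "combine_demo")
dir.create(tmp, showWarnings = FALSE)
for (s in 1:3) {
  write.table(data.frame(sub = s, trial = 1:10, size = s * 10),
              file.path(tmp, paste0("sub", s, ".txt")),
              sep = " ", row.names = FALSE, quote = FALSE)
}

# Collect the file names, read each file, then stack the pieces row-wise
files <- list.files(tmp, pattern = "^sub[0-9]+\\.txt$", full.names = TRUE)
pieces <- lapply(files, read.table, header = TRUE, sep = " ")
combined <- do.call(rbind, pieces)
str(combined)
```

`do.call(rbind, ...)` needs matching column names across the files; if they differ, the pieces have to be aligned first.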

Problem with subset() function?

2009 Jan 20

Hi all, Can anyone explain why the following use of the subset() function produces a different outcome than the use of the "[" extractor? The subset() function as used in density(subset(mydf, ht >= 150.0 & wt <= 150.0, select = c(age))) appears to me from documentation to be equivalent to density(mydf[mydf$ht >= 150.0 & mydf$wt <= 150.0, "age"])
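One likely source of the difference, sketched with made-up data: `subset()` with `select` returns a data frame even for a single column, while `[` with a single column name drops to a plain vector by default. Whether this is the exact issue in the post is an assumption, since the excerpt is cut off.

```r
# Hypothetical data using the column names from the post
mydf <- data.frame(ht  = c(140, 155, 160, 170),
                   wt  = c(120, 140, 150, 160),
                   age = c(30, 35, 40, 45))

s1 <- subset(mydf, ht >= 150.0 & wt <= 150.0, select = c(age))
s2 <- mydf[mydf$ht >= 150.0 & mydf$wt <= 150.0, "age"]

class(s1)  # a one-column data frame
class(s2)  # a numeric vector

# Pulling the column out of the subset() result gives the same values
same <- identical(s1$age, s2)
```

Functions such as `density()` expect a numeric vector, so passing `s1$age` (or `s1[[1]]`) rather than `s1` itself should make the two calls comparable.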

merging or joining 2 dataframes: merge, rbind.fill, etc.?

2013 Feb 26

#I want to "merge" or "join" 2 dataframes (df1 & df2) into a 3rd (mydf). I want the 3rd dataframe to contain 1 row for each row in df1 & df2, and all the columns in both df1 & df2. The solution should "work" even if the 2 dataframes are identical, and even if the 2 dataframes do not have the same column names. The rbind.fill function seems to work. For

save conditions in a list

2012 Jul 02

Hi, how would you save conditions like a = "day > 100"; b = "val < 50"; c = "year == 2012" in a list? I would like to have variables like "day", "val", "year" and a list of conditions list(a,b,c). Then I want to check if a & b & c is true or if a | b | c is true or similar things. Greetings Christof
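A minimal sketch, assuming the conditions stay as character strings as in the post and that `day`, `val` and `year` exist as ordinary variables. `parse()` and `eval()` are base R; the sample values are made up.

```r
# Hypothetical values for the variables named in the post
day  <- 120
val  <- 30
year <- 2012

# Conditions stored as strings in a named list
conds <- list(a = "day > 100", b = "val < 50", c = "year == 2012")

# Evaluate each string to a single TRUE/FALSE
results <- vapply(conds, function(s) eval(parse(text = s)), logical(1))

all_true <- all(results)  # a & b & c
any_true <- any(results)  # a | b | c
results
```

Storing the conditions with `quote(day > 100)` instead of strings would avoid the parsing step, at the cost of changing how the list is written.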

subset select within a function

2004 Jan 21

Dear all, I'd like to subset a df within a function, and use select for choosing the variable. Something like (simplified example): mydf <- data.frame(a= 0:9, b= 10:19) ttt <- function(vv) { tmpdf <- subset(mydf, select= vv) mean(tmpdf$vv) } ttt(mydf$b) But this is not the correct way. Any help? Thanks in advance Juli
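One way this is often handled, sketched under the assumption that passing the column name as a string is acceptable. The two-argument signature of `ttt` is a change from the post, made here for illustration.

```r
mydf <- data.frame(a = 0:9, b = 10:19)

# Take the data frame and a column name; [[ ]] looks the column up by string,
# whereas tmpdf$vv would search for a column literally called "vv"
ttt <- function(df, vv) {
  mean(df[[vv]])
}

res <- ttt(mydf, "b")
res
```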

Extracting lists in the dataframe $ format

2007 Jun 04

I'm new to R and am trying to extract the factors of a dataframe using numeric indices (e.g. df[1]) that are input to a function definition instead of the other types of references (e.g. df$out). df[1] is a list(?) whose class is "dataframe". These indexed lists can be printed successfully but are not agreeable to the plot() and lm() functions shown below as are their df$out

I need to create new variables based on two numeric variables and one dichotomized conditional category variable.

2023 Nov 04

I might have factored the gender. I'm not sure it would in any way be quicker. But might be to some extent easier to develop variations of. And is sort of what factors should be doing... # make dummy data gender <- c("Male", "Female", "Male", "Female") WC <- c(70,60,75,65) TG <- c(0.9, 1.1, 1.2, 1.0) myDf <- data.frame( gender, WC, TG ) #

Subset

2017 Sep 25

myDF <- data.frame(a = c("<0.1", NA, 0.3, 5, "Nil"), b = c("<0.1", 1, 0.3, 5, "Nil"), stringsAsFactors = FALSE) # you can subset the b-column in several ways myDF[ , 2] myDF[ , "b"] myDF$b # using the column, you make a logical vector ! is.na(as.numeric(myDF$b)) # This can be used to select the