Displaying 20 results from an estimated 10000 matches similar to: "difference of two data frames"
2008 Feb 10
data frame question
I have 2 data frames df1 and df2. I would like to create a
new data frame new_df which will contain only the common rows based on the first 2
columns (chrN and start). The score column in the new data frame should
be replaced with a column containing the average score (average_score) from df1
and df2.
df1= data.frame(chrN= c("chr1", "chr1", "chr1", "chr1", "chr2",
"chr2", "chr2"),
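One possible approach, sketched with made-up data because the original example is cut off: an inner join with merge() on the two key columns keeps only the common rows, and the two score columns can then be averaged. The values below and the suffix names are assumptions for illustration only.

```r
# Hypothetical data: the original example is truncated, so these values are made up.
df1 <- data.frame(chrN = c("chr1", "chr1", "chr2"),
                  start = c(10, 20, 10),
                  score = c(1, 2, 3))
df2 <- data.frame(chrN = c("chr1", "chr2", "chr2"),
                  start = c(20, 10, 30),
                  score = c(4, 5, 6))

# Inner join on the two key columns; the clashing score columns get suffixes.
m <- merge(df1, df2, by = c("chrN", "start"), suffixes = c(".1", ".2"))

# Average the two scores and keep only the keys plus the average.
m$average_score <- (m$score.1 + m$score.2) / 2
new_df <- m[, c("chrN", "start", "average_score")]
print(new_df)
```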
2011 Aug 22
Selecting cases from matrices stored in lists
I have two lists (c and h - see below) containing matrices with similar
cases but different values. I want to split these matrices into multiple
matrices based on the values in h. So, I did the following:
for (t in 1:length(years))
h[[year]]<-sapply(colnames(h[[year]]), function(var)
2008 Sep 03
subsetting a data frame
I have a data frame that looks like this:
V1 V2 V3
a b 0:1:12
d f 1:2:1
c d 1:0:9
where V3 is in the form x:y:z
Can someone show me how to subset the rows where the values of x, y and z are all <= 10:
V1 V2 V3
d f 1:2:1
c d 1:0:9
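A minimal sketch of one way this could be done, assuming V3 is stored as character strings (not factors) and always holds numbers separated by ":": split each string, convert the pieces to numbers, and keep the rows where every piece is at most 10.

```r
df <- data.frame(V1 = c("a", "d", "c"),
                 V2 = c("b", "f", "d"),
                 V3 = c("0:1:12", "1:2:1", "1:0:9"),
                 stringsAsFactors = FALSE)

# Split each V3 string on ":" and convert the pieces to numbers.
parts <- lapply(strsplit(df$V3, ":", fixed = TRUE), as.numeric)

# Keep a row only if all of its pieces are <= 10.
keep <- vapply(parts, function(p) all(p <= 10), logical(1))
result <- df[keep, ]
print(result)
```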
2018 Feb 25
reshaping column items into rows per unique ID
Hi All
I have a data frame which looks like this:
CustomerID DietType
1 a
1 c
1 b
2 f
2 a
3 j
4 c
4 c
4 f
And I would like to reshape this so I can
2010 Sep 10
Counting occurrences of a letter by a factor
I'm trying to find a more elegant way of doing this. What I'm trying to accomplish is to count the frequency of letters (major / minor alleles) in a string grouped by the factor levels in another column of my data frame.
> DF<-data.frame(c("CC", "CC", NA, "CG", "GG", "GC"), c("L", "U", "L",
2018 Feb 25
reshaping column items into rows per unique ID
I believe you need to spend time with an R tutorial or two: a data frame
(presumably the "table" data structure you describe) can *not* contain
"blanks" -- all columns must be the same length, which means NA's are
filled in as needed.
Also, 8e5 * 7e4 = 5.6e10, which almost certainly will not fit into any
local version of R (maybe it would in some server version --
2008 Feb 18
remove column names from a data frame
I want to remove the column names from a data frame. I do
it the long way; can anybody show me a better way?
df= data.frame(chrN= c("chr1", "chr2", "chr3"), start= c(1,
2, 3), end= c(4, 5, 6), score= c(7, 8, 9))
#I write a txt file without row or column names
#then I read it with the header = F
2010 Dec 16
Compare two dataframes
I have two dataframes DF1 and DF2 that should be identical but are not
(DF1 has some rows that aren't in DF2, and vice versa). I would like
to produce a new dataframe DF3 containing rows in DF1 that aren't in
DF2 (and similarly DF4 would contain rows in DF2 that aren't in DF1).
I have a solution for this problem (see self contained example below)
but it's awkward and
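One base-R sketch of this kind of comparison, assuming both data frames have the same columns in the same order: build one key string per row and use %in% to find the rows missing from the other frame. The data, the helper name row_key and the "\r" separator are made up for illustration; the separator is assumed not to occur in the data.

```r
# Hypothetical data standing in for the poster's DF1 and DF2.
DF1 <- data.frame(id = c(1, 2, 3), val = c("a", "b", "c"), stringsAsFactors = FALSE)
DF2 <- data.frame(id = c(2, 3, 4), val = c("b", "x", "d"), stringsAsFactors = FALSE)

# Paste all columns of each row into a single key string.
row_key <- function(d) do.call(paste, c(d, sep = "\r"))

# Rows of DF1 that do not appear in DF2, and vice versa.
DF3 <- DF1[!(row_key(DF1) %in% row_key(DF2)), ]
DF4 <- DF2[!(row_key(DF2) %in% row_key(DF1)), ]
print(DF3)
print(DF4)
```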
2011 Oct 25
question regarding intersect function
I have probably a very simple question but I'm going crazy trying to find
the solution.
I have two data.frames with headers and I'm doing an intersection between
them by names, such that the intersected data.frames are returned by:
df1[intersect(names (df1), names(df2))] and the same for df2
Now, I want to have all the opposite data that did not intersect. I tried to
2011 May 16
rbind with partially overlapping column names
I would like to merge two data frames with partially overlapping column
names with an rbind-like operation.
For the following data frames,
df1 <- data.frame(a=c("A","A"),b=c("B","B"))
df2 <- data.frame(b=c("b","b"),c=c("c","c"))
I would like the output frame to be (with NAs where the frames don't
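A minimal base-R sketch of such an rbind-like operation, assuming the columns are plain character vectors (stringsAsFactors = FALSE): pad each frame with NA columns for the names it lacks, put the columns in a common order, then rbind. The helper name pad is hypothetical.

```r
df1 <- data.frame(a = c("A", "A"), b = c("B", "B"), stringsAsFactors = FALSE)
df2 <- data.frame(b = c("b", "b"), c = c("c", "c"), stringsAsFactors = FALSE)

all_cols <- union(names(df1), names(df2))

# Add each missing column as NA, then return the columns in a common order.
pad <- function(d, cols) {
  for (nm in setdiff(cols, names(d))) d[[nm]] <- NA
  d[cols]
}

out <- rbind(pad(df1, all_cols), pad(df2, all_cols))
print(out)
```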
2012 Jul 14
Can't understand syntax
OK, I need help!!
I've been searching, but I don't understand the logic of some of this
dataframe addressing syntax.
What is this type of code called?
test[["v3"]][is.na(test[["v2"]])] <- 10 #choose column v3 where column v2 is == 4 and replace with 10
and where is it documented?
The code below works for what I want to do (find the non-missing value in a
2010 Nov 05
assignment operator saving factor level as number
Hi all,
I have a dataframe (df1) that I am trying to select values from to a second
dataframe that at the current time is only for the selected items from df1
(df2). The values that I am trying to save from df1 are factors with
alphanumeric names
df1 looks like this:
'data.frame': 3014 obs. of 13 variables:
$ Num : int 1 1 1 2 2 2 3 3 3 4 ...
$ Tag_Num : int 1195
2011 Jul 13
UNIX diff function
(R: 2.13.0; OS X)
I often receive sequential datasets in which there are new rows interposed between existing rows. For example:
SET1 <- data.frame(list(LETTERS=LETTERS[c(1:4, 6:10)], NUMBERS=c(1:4, 6:10)))
SET2 <- data.frame(list(LETTERS=LETTERS[1:10], NUMBERS=1:10))
> SET1
1 A 1
2 B 2
3 C 3
4 D 4
2003 Apr 07
subsetting a dataframe
How does one remove a column from a data frame when the name of
the column to remove is stored in a variable?
For Example:
colname <- "LOT"
newdf <- subset(olddf,select = - colname)
The above statement will give an error, but that's what I'm trying to do.
If I had used:
newdf <- subset(olddf,select = - LOT)
then it would have worked, but as I said the column
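One way this could be handled, sketched with a made-up olddf: index by name and drop the stored name with setdiff(), which sidesteps the non-standard evaluation in subset().

```r
# Hypothetical data frame standing in for the poster's olddf.
olddf <- data.frame(LOT = 1:3, X = 4:6, Y = 7:9)
colname <- "LOT"

# Keep every column whose name is not the one stored in colname.
newdf <- olddf[, setdiff(names(olddf), colname), drop = FALSE]
print(names(newdf))
```

The drop = FALSE argument keeps the result a data frame even if only one column remains.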
2006 Sep 19
Union of two data frames
I have two data frames each with 5 columns and different number of
rows. some of the row names in one data frame are the same as the row
names in the other. I want to be able to merge the two data frames to
get a new data frame in which the duplicated row names are only shown
once with the data for the rest of the columns used from the first
data frame.
Essentially, I want to make a union
2013 Feb 26
merging or joining 2 dataframes: merge, rbind.fill, etc.?
#I want to "merge" or "join" 2 dataframes (df1 & df2) into a 3rd
(mydf). I want the 3rd dataframe to contain 1 row for each row in df1
& df2, and all the columns in both df1 & df2. The solution should
"work" even if the 2 dataframes are identical, and even if the 2
dataframes do not have the same column names. The rbind.fill function
seems to work. For
2011 Aug 18
Best way/practice to create a new data frame from two given ones with last column computed from the two data frames?
Dear expeRts,
What is the best approach to create a third data frame from two given ones, when
the new/third data frame has last column computed from the last columns of the two given
data frames?
## Okay, sounds complicated, so here is an example. Assume we have the two data frames:
df1 <- data.frame(Year=rep(2001:2010, each=2), Group=c("Group 1","Group 2"), Value=1:20)
2012 Jul 11
Help with loop
I have two dataframes:
The first, df1, contains some missing data:
cola colb colc cold cole
1 NA 5 9 NA 17
2 NA 6 NA 14 NA
3 3 NA 11 15 19
4 4 8 12 NA 20
The second, df2, contains the following:
cola colb colc cold cole
1 1.4 0.8 0.02 1.6 0.6
I'm wanting all missing data in df1$cola to be replaced by the value of
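The question is cut off, so the following is only a guess at the intent: assuming each NA in a column of df1 should be replaced by the value in the same-named column of df2, a short loop over the column names would do it.

```r
df1 <- data.frame(cola = c(NA, NA, 3, 4), colb = c(5, 6, NA, 8),
                  colc = c(9, NA, 11, 12), cold = c(NA, 14, 15, NA),
                  cole = c(17, NA, 19, 20))
df2 <- data.frame(cola = 1.4, colb = 0.8, colc = 0.02, cold = 1.6, cole = 0.6)

# Assumption: NAs in each df1 column take the single value from the
# same-named column of df2.
for (nm in names(df1)) {
  df1[[nm]][is.na(df1[[nm]])] <- df2[[nm]]
}
print(df1)
```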
2018 Jan 08
Replace NAs in split lists
Why do you want to modify df1?
Why not just reassemble the parts as a new data frame and use that going forward in your calculations? That is generally the preferred approach in R so you can re-do your calculations easily if you find a mistake later.
On January 7, 2018 7:35:59 PM PST, Ek Esawi <esawiek at gmail.com> wrote:
>I just came
2013 Jan 02
rbind: inconsistent behaviour with empty data frames?
The rbind on empty and nonempty data frames behaves inconsistently. I am
not sure if by design.
In the first example, first row is deleted, which may or may not be on
df1 <- data.frame()
df2 <- data.frame(foo=c(1, 2), bar=c("a", "b"))
rbind(df1, df2)
foo bar
2 2 b
Now if we continue:
df1 <- data.frame(matrix(0, 0, 2))
names(df1) <- names(df2)