thr3ads.net - R help - [R] possible reason for merge not working [Aug 2011]


world peace

2011-Aug-01 16:17 UTC

[R] possible reason for merge not working

Hi Guys,

working on a "merge" for 2 data frames.

Using the command:

x <- merge(annotatedData, UCSCgenes, by.x="names",
by.y="Ensembl.Gene.ID", all.x=TRUE)

names and Ensembl.Gene.ID are columns with similar elements from the x
and y data frames.

annotatedData has 8909 entries, and so does x (as expected). x has the
columns from UCSCgenes, but there is no data in them; they are all NA,
as if no match exists.
This is not true as I can manually see and find many similarities
between the names and UCSCgenes columns.

I am wondering if there is any syntax error, or logical.

comments appreciated.

Thanks
Dan

jim holtman

2011-Aug-01 16:25 UTC


What you "see" and what the data really are may be two different
things.  You should at least have included the output of 'str' for the
two data frames; even better would be a subset of the data produced
with 'dput'.  Most likely your problem is that your data are not what
you 'expect' them to be.
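
For illustration, here is a minimal sketch of the kind of information being asked for. The two data frames below are small made-up stand-ins (the real data was never posted), and the column values are purely hypothetical:

```r
# Hypothetical stand-ins for the poster's data frames (not the real data).
annotatedData <- data.frame(names = c("ENSG00000000003 ", "ENSG00000000005"),
                            score = c(1.2, 3.4),
                            stringsAsFactors = FALSE)
UCSCgenes <- data.frame(Ensembl.Gene.ID = c("ENSG00000000003", "ENSG00000000005"),
                        symbol = c("geneA", "geneB"),
                        stringsAsFactors = FALSE)

# str() shows the type of each column (character, factor, numeric, ...).
str(annotatedData)
str(UCSCgenes)

# dput() prints an exact, pasteable representation of a small subset.
# Unlike the printed table, it shows stray spaces inside the strings.
dput(head(annotatedData, 2))
dput(head(UCSCgenes, 2))
```

Pasting the dput() output into a message lets others rebuild exactly the same objects, including details such as the trailing space in the first key above.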



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

Jean V Adams

2011-Aug-01 16:33 UTC


Dan,

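If the variables you are merging by are character variables, there may be
subtle differences that you haven't noticed, e.g., capitalization or
spacing.  You can look for differences by tabulating the combined values;
values that differ only in case or spacing will show up as separate
entries.  (The as.character() calls guard against factor columns being
combined as integer codes.)

table(c(as.character(annotatedData$names),
        as.character(UCSCgenes$Ensembl.Gene.ID)))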
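
As a rough sketch of how such differences might be tracked down, the following compares two made-up key vectors. The values and the helper name `normalize` are assumptions for illustration only; in practice the vectors would be the two key columns:

```r
# Made-up keys: one has a trailing space, one differs in capitalization.
left  <- c("ENSG00000000003 ", "ensg00000000005", "ENSG00000000419")
right <- c("ENSG00000000003", "ENSG00000000005", "ENSG00000000419")

# How many keys on the left have an exact match on the right?
n_exact <- sum(left %in% right)
print(n_exact)

# Keys on the left with no exact match, printed with quotes so spaces show.
unmatched <- setdiff(left, right)
print(unmatched, quote = TRUE)

# Hypothetical helper: strip surrounding whitespace and unify case.
normalize <- function(x) toupper(trimws(as.character(x)))

# Count matches again after normalizing both sides.
n_clean <- sum(normalize(left) %in% normalize(right))
print(n_clean)
```

If the count of matches rises after normalizing, whitespace or case is the likely culprit.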

Jean


`·.,,  ><(((º>   `·.,,  ><(((º>   `·.,,  ><(((º>

Jean V. Adams
Statistician
U.S. Geological Survey
Great Lakes Science Center
223 East Steinfest Road
Antigo, WI 54409  USA





David Winsemius

2011-Aug-01 16:35 UTC


On Aug 1, 2011, at 12:17 PM, world peace wrote:
> This is not true as I can manually see and find many similarities
> between the names and UCSCgenes columns.

The merge function does not work on "similarities". Matches need to be
exact.

> I am wondering if there is any syntax error, or logical.

Probably logical.
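
To illustrate the point about exact matching, here is a small sketch with invented data. The data frames, their values, and the trimming step are assumptions for illustration, not the poster's actual data or a confirmed diagnosis:

```r
# Invented example: one key in 'annotated' carries a trailing space.
annotated <- data.frame(names = c("ENSG00000000003 ", "ENSG00000000005"),
                        score = c(1.2, 3.4),
                        stringsAsFactors = FALSE)
genes <- data.frame(Ensembl.Gene.ID = c("ENSG00000000003", "ENSG00000000005"),
                    symbol = c("geneA", "geneB"),
                    stringsAsFactors = FALSE)

# The key with the trailing space finds no partner, so its symbol is NA.
m1 <- merge(annotated, genes, by.x = "names", by.y = "Ensembl.Gene.ID",
            all.x = TRUE)
print(m1)

# After trimming whitespace the keys match exactly and every row is filled.
annotated$names <- trimws(annotated$names)
m2 <- merge(annotated, genes, by.x = "names", by.y = "Ensembl.Gene.ID",
            all.x = TRUE)
print(m2)
```

With all.x=TRUE every row of the first data frame is kept either way, which is why the row count alone does not reveal whether any keys actually matched.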

-- 

David Winsemius, MD
West Hartford, CT
