thr3ads.net - R help - [R] Help with Merge - unexpected loss of factor level [Dec 2009]

Zoe van Havre

2009-Dec-17 05:17 UTC

[R] Help with Merge - unexpected loss of factor level

Hi, thanks in advance for any advice you can give me. I am very stumped on this
problem...

I use R every day and consider myself a confident user, but this seems to be an
elementary problem.

Outline of problem: I am analysing the results of a study on protein expression
in cancer tissues. I have raw intensities from 2 different types of cancer and
from normal tissue, which can be taken from several different parts of the cell,
as well as patient information. Part of the analysis calls for a fold-change
calculation. In order to do this I am sub-setting the dataset by cancer type,
merging each cancer dataset with the data from the Normal tissue, then
calculating fold change for matching individuals and cell sections.

The problem is that I have been tracking one factor in particular
('branch', values 2 or 3) and once the final merge occurs, the second
level of this factor seems to disappear in the last dataset, even though it was
present before.
See code & output below:
> dim(tma)
> names(tma)
[1] "Code"       "marker"     "cell"
"tumourA"    "tumourEXP"  "int"
"stain"      "tumourPERC" "branch"
> levels(tma$tumourA)
[1] "DCIS"                       "LN Metastasis"
"Normal"                     "Primary Invasive Carcinoma"
#split into cancer and normal tissue
> tma1<-subset(tma, tumourA=="Primary Invasive Carcinoma")
> tma2<-subset(tma, tumourA=="LN Metastasis")
> tmaN<-subset(tma, tumourA=="Normal")
#size of datasets
> dim(tma1)
[1] 587   9
> dim(tma2)
[1] 323   9
> dim(tmaN)
[1] 142   9

#merge back with normal type
> tma1.1<-merge(tmaN, tma1, by="Code")
> tma2.1<-merge(tmaN, tma2, by="Code")
#new dimensions (seem excessively large)
> dim(tma1.1)
[1] 2439   17
> dim(tma2.1)
[1] 625  17

#progression of "branch" factor in datasets. Note last one where it disappears...
> table(tma$branch)
  2   3
450 613
> table(tma1$branch)
  2   3
314 273
> table(tma2$branch)
  2   3
 39 284
> table(tmaN$branch)
 2  3
91 51
> table(tma1.1$branch.x)
   2    3
1806  633
> table(tma2.1$branch.x)
  3
625
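
For readers following along, output of this shape can be reproduced on made-up data. The sketch below is only an assumption about what may be happening, not a diagnosis of the real tma data: normal and tumour are hypothetical stand-ins for tmaN and tma2, and every value in them is invented. It relies on two documented properties of merge(): with the default all = FALSE only keys found in both data frames are kept, and when a key is repeated on both sides every matching pair of rows is returned.

```r
# Hypothetical stand-ins for tmaN and tma2; all values are invented.
normal <- data.frame(
  Code   = c("A", "A", "B", "C"),
  branch = c(3, 3, 2, 2),
  int    = c(10, 12, 8, 9)
)
tumour <- data.frame(
  Code   = c("A", "A", "A", "D"),
  branch = c(3, 3, 3, 2),
  int    = c(20, 25, 30, 16)
)

# Default merge is an inner join: Codes B and C (normal only) and
# D (tumour only) have no partner and are dropped.
merged <- merge(normal, tumour, by = "Code")

# Code "A" occurs twice in normal and three times in tumour,
# so 2 * 3 = 6 rows come back, more than either input has.
nrow(merged)

# Non-key columns present in both inputs get .x / .y suffixes;
# branch.x is the value from the first argument (normal).
names(merged)

# Only branch 3 remains, because no Code with branch 2 found a match.
table(merged$branch.x)

# Codes from the normal data that the merge discards
setdiff(normal$Code, tumour$Code)
```

If one row per individual and cell section is what is wanted, merging on both columns, by = c("Code", "cell"), may be closer to the intent, although whether that pair is unique in the real data cannot be told from the post.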


Please, can someone tell me what's going on?

Thank you very much,
Zoe van Havre


Patrick Connolly

2009-Dec-17 06:51 UTC

[R] Help with Merge - unexpected loss of factor level

On Thu, 17-Dec-2009 at 03:17PM +1000, Zoe van Havre wrote:

[...]

|> The problem is that I have been tracking one factor in particular
|> ('branch', values 2 or 3) and once the final merge occurs, the
|> second level of this factor seems to disappear in the last dataset,
|> even though it was present before.  See code & output below:


|> 
|> >  dim(tma)

You didn't tell us that one.  What size is it?

|> >  names(tma)
|> [1] "Code"       "marker"     "cell"
|> "tumourA"    "tumourEXP"  "int"
|> "stain"      "tumourPERC" "branch"
|> > levels(tma$tumourA)
|> [1] "DCIS"                       "LN Metastasis"
|> "Normal"                     "Primary Invasive Carcinoma"
|> #split into cancer and normal tissue
|> >  tma1<-subset(tma, tumourA=="Primary Invasive Carcinoma")
|> >   tma2<-subset(tma, tumourA=="LN Metastasis")
|> >   tmaN<-subset(tma, tumourA=="Normal")
|> 

[...]

|>  2  3
|> 91 51
|> > table(tma1.1$branch.x)
|> 
|>    2    3
|> 1806  633
|> > table(tma2.1$branch.x)
|> 
|>   3
|> 625
|> 
|> 
|> Please, can someone tell me what's going on?


I suspect you'd have a lot of NAs in there.  Try this:
 sapply(tma, function(x)
    sum(is.na(x)))

If that doesn't tell you something interesting, try with the subsets.
Or maybe when you use table(), try the exclude=NULL argument.
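
A small sketch of these checks on made-up data may help. The data frame d below is hypothetical and only stands in for tma; its values are invented. Both exclude = NULL and the useNA argument are documented ways to make table() report missing values.

```r
# Hypothetical data frame standing in for tma; values are invented.
d <- data.frame(
  Code   = c("A", "B", NA, "D"),
  branch = c(2, 3, NA, 3)
)

# Number of missing values in each column
na_counts <- sapply(d, function(x) sum(is.na(x)))
na_counts

# By default table() silently leaves NA out of the counts ...
no_na <- table(d$branch)
no_na

# ... whereas exclude = NULL (or useNA = "ifany") adds an NA entry
with_na <- table(d$branch, exclude = NULL)
with_na
table(d$branch, useNA = "ifany")
```

The same calls can be run on each subset and on the merged data frames to see at which step, if any, missing values appear.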

HTH

-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
   ___    Patrick Connolly   
 {~._.~}                   Great minds discuss ideas    
 _( Y )_  	         Average minds discuss events 
(:_~*~_:)                  Small minds discuss people  
 (_)-(_)  	                      ..... Eleanor Roosevelt
	  
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
