thr3ads.net - R help - [R] Help with Loops [May 2010]

Amit Patel

2010-May-13 14:49 UTC

[R] Help with Loops

Hi

I have made many attempts but can't get the loop right, as I am not a strong
programmer. What I am basically trying to do is compare 2 spreadsheets. The
problem is that one of them contains only a portion of the overall data
(TESTSAMP), whereas the other has the full dataset (FULLSAMP). From the complete
set I would like to remove the rows of data which are not in TESTSAMP. Column 1
contains the sample numbers, which can be used to identify samples. Does anyone
have any suggestions?

I have tried various things like double loops and so on, but I am sure there is
an easier way or function to do this.

I tried this method, but I'm not sure how to keep looping only until a match is
found. I don't understand how repeat loops work in R.

for (i in 1:length(FULLSAMP[,1])) {
  if (FULLSAMP[i,1] != TESTSAMP[i,1]) {
    FULLSAMP <- FULLSAMP[-i,]
  }
}


Thanks in advance

Sarah Goslee

2010-May-13 15:00 UTC

[R] Help with Loops

You don't need a loop for this, I think. Since you don't provide an
example it's hard to know how your data are set up, but look at this:

> FULLSAMP <- data.frame(A = 1:10, B=letters[1:10])
> TESTSAMP <- data.frame(A = c(2,4,5,8), C=1:4)
> FULLSAMP
    A B
1   1 a
2   2 b
3   3 c
4   4 d
5   5 e
6   6 f
7   7 g
8   8 h
9   9 i
10 10 j
> TESTSAMP
  A C
1 2 1
2 4 2
3 5 3
4 8 4
> FULLSAMP[FULLSAMP$A %in% TESTSAMP$A,]
  A B
2 2 b
4 4 d
5 5 e
8 8 h
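
To make the subsetting step explicit, here is a minimal sketch on the same toy data. The `%in%` call returns one TRUE/FALSE per row of FULLSAMP, and negating it gives the rows that would be dropped. The names `keep`, `kept`, and `dropped` are only for illustration and do not come from the thread.

```r
FULLSAMP <- data.frame(A = 1:10, B = letters[1:10])
TESTSAMP <- data.frame(A = c(2, 4, 5, 8), C = 1:4)

# One logical value per row of FULLSAMP: is this sample number in TESTSAMP?
keep <- FULLSAMP$A %in% TESTSAMP$A

kept    <- FULLSAMP[keep, ]    # rows whose sample number appears in TESTSAMP
dropped <- FULLSAMP[!keep, ]   # rows that would be removed

nrow(kept)     # 4
nrow(dropped)  # 6
```

Keeping the logical vector in its own variable makes it easy to check how many rows will be removed before overwriting FULLSAMP.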

Sarah

-- 
Sarah Goslee
http://www.functionaldiversity.org

Steve Lianoglou

2010-May-13 15:12 UTC

[R] Help with Loops

Hi,

On Thu, May 13, 2010 at 10:49 AM, Amit Patel <amitrhelp at yahoo.co.uk> wrote:
> i tried this method, but Im not sure how to only keep looping until a match
> is found. I dont understand how repeat loops work in R.
>
> for (i in 1:length(FULLSAMP[,1])) {
>
> if (FULLSAMP[i,1] != TESTSAMP[i,1]) {
> FULLSAMP <- FULLSAMP[-i,]
> }

You want to avoid for loops as much as possible.

Imagine your samples are identified by letters, so FULLSAMP[,1] will
be the letters A..Z, and TESTSAMP[,1] will be some random 15 letters. Now
the job is to match the rows in TESTSAMP to the rows in FULLSAMP, and
remove any "extra" rows in FULLSAMP that don't appear in TESTSAMP.

## Making some data
R> fullsamp <- data.frame(id=LETTERS, something=sample(1:100, length(LETTERS)),
+                        stringsAsFactors=FALSE)
R> testsamp <- data.frame(id=sample(LETTERS, 15), something=sample(1:100, 15),
+                        stringsAsFactors=FALSE)

## Let's find where the "testsamp" rows appear in "fullsamp"
R> xref <- match(testsamp[,1], fullsamp[,1])

## Now reduce fullsamp to have only the data corresponding to testsamp
## (and in the same order)
R> fullsamp.sub <- fullsamp[xref,]

Notice that fullsamp.sub now has only rows with IDs appearing in
testsamp, and they are in the same order as testsamp.
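
One caveat, sketched here on made-up data: if some IDs in testsamp do not occur in fullsamp at all (the thread does not say whether that can happen), match() returns NA for them, and indexing with NA produces rows filled with NA. The variable `found` is a hypothetical helper name.

```r
fullsamp <- data.frame(id = c("A", "B", "C", "D"), value = 1:4,
                       stringsAsFactors = FALSE)
testsamp <- data.frame(id = c("D", "Z", "B"), stringsAsFactors = FALSE)

# Position of each testsamp id within fullsamp; NA where there is no match
xref <- match(testsamp$id, fullsamp$id)   # 4 NA 2

# Drop the unmatched positions before subsetting
found <- xref[!is.na(xref)]
fullsamp.sub <- fullsamp[found, ]

fullsamp.sub$id   # "D" "B", in testsamp order
```

When every testsamp ID is guaranteed to be present in fullsamp, the NA filter is unnecessary.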

Now go ahead and read the help you'll find in ?match

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

David Winsemius

2010-May-13 15:22 UTC

[R] Help with Loops

On May 13, 2010, at 10:49 AM, Amit Patel wrote:
> for (i in 1:length(FULLSAMP[,1])) {
>
> if (FULLSAMP[i,1] != TESTSAMP[i,1]) {
> FULLSAMP <- FULLSAMP[-i,]
> }

Abandon the loop. Use merge()
  ... or the %in% operator.
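
For reference, a sketch of the merge() route on made-up data, assuming the sample number sits in a column named A in both data frames (the column names and the name `reduced` are hypothetical). By default merge() keeps only the keys present in both inputs, and passing just the key column of TESTSAMP avoids pulling in its other columns.

```r
FULLSAMP <- data.frame(A = 1:10, B = letters[1:10], stringsAsFactors = FALSE)
TESTSAMP <- data.frame(A = c(2, 4, 5, 8), C = 1:4)

# Inner join on the shared key column A; TESTSAMP["A"] keeps only the key
reduced <- merge(FULLSAMP, TESTSAMP["A"], by = "A")

reduced$A   # 2 4 5 8
reduced$B   # "b" "d" "e" "h"
```

Note that if TESTSAMP contained repeated sample numbers, merge() would repeat the matching rows, which subsetting with %in% does not.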
-- 

David Winsemius, MD
West Hartford, CT
