thr3ads.net - R help - [R] Near function? [Feb 2007]


Bart Joosen

2007-Feb-10 07:43 UTC

[R] Near function?

Hi,

I have an integer vector extracted from a data frame, which is sorted by
another column of the data frame.
Now I would like to remove those elements of the vector whose values are near
those of earlier elements. For example, c(1,20,2,21) should become c(1,20).

I tried to write a function, but for some reason something doesn't work:

x <- 1:20
near <- function(x, th) {
    nr <- NROW(x)
    for (i in 1:(nr-1)) {
        for (j in (i+1):nr) {
            if (j > nr) break
            t = 0
            if (abs(x[i] - x[j]) < th) t = 1
            if (t == 1) x <- x[-j]
            if (t == 1) nr <- nr-1
            if (t == 1) j <- (j-1)
            cat(" i", i, " j", j, "\n")
        }
    }
    x
}
near(x, 10)


This gives 1  3  7 13 17, while I was expecting 1, 20 as the outcome.
If you look at the intermediate results printed by cat, you see that
after it removes a number, it skips the next one.

Sorting the vector is not an option; the order is important.
I used 1:20 as an example. x <- sample((1:20),20) is
maybe a bit more representative of our data, but its output is not
reproducible.

Maybe there is already an R function that does this, or what is wrong
with my code?


thanks a lot for your time


Bart

Dieter Menne

2007-Feb-10 10:38 UTC

[R] Near function?

Bart Joosen <bartjoosen <at> hotmail.com> writes:
>
> Hi,
>
> I have an integer which is extracted from a dataframe, which is sorted by
> another column of the dataframe.
> Now I would like to remove some elements of the integer, which are near to
> others by their value. For example: integer: c(1,20,2,21) should be c(1,20).
....
> Sorting the integer is not an option, the order is important.

Why not? It's extremely efficient for large series and the only method that
would work with large arrays. The idea: keep the indexes of the sort order, mark
the "near others", for example by making their index NA, and restore the
original order.
No for-loop needed.
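
One possible reading of this idea, as a sketch only: sort, start a new group wherever the gap between sorted neighbours reaches the threshold, map the group labels back to the original order, and keep the first element of each group. The names near_sorted, gaps and grp are made up for this example. It also assumes that a chain of close values counts as a single group, which is not necessarily what the loop in the original post does.

```r
# Sketch of a sort-based approach (hypothetical helper, not from the thread).
near_sorted <- function(x, th) {
  if (length(x) == 0) return(x)
  ord  <- order(x)                    # indexes of the sort order
  gaps <- diff(x[ord])                # gaps between sorted neighbours
  grp_sorted <- cumsum(c(TRUE, gaps >= th))  # new group at each gap >= th
  grp <- integer(length(x))
  grp[ord] <- grp_sorted              # group labels back in original order
  x[!duplicated(grp)]                 # first element of each group
}

near_sorted(c(1, 20, 2, 21), 10)  # 1 20
```

Because chains are merged, near_sorted(c(1, 2, 3, 5), 3) keeps only 1, since every neighbouring gap in that vector is below 3.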

Dieter

jim holtman

2007-Feb-10 13:05 UTC

[R] Near function?

One of the reasons it might not be working is that you are changing the
index of the 'for' within the loop.  The following is from the help page
for 'for':

The index seq in a for loop is evaluated at the start of the loop; changing
it subsequently does not affect the loop. The variable var has the same type
as seq, and is read-only: assigning to it does not alter seq. If seq is a
factor (which is not strictly allowed) then its internal codes are used: the
effect is that of
as.integer not as.vector.
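
To illustrate the point from the help page, here is a minimal sketch of the same pairwise removal written with while loops, so that the counters can be changed inside the loop. The name near_while is made up for this example and does not come from the thread. The comparison abs(x[i] - x[j]) < th is taken from the original function.

```r
# Sketch: same removal rule as the original function, but with while loops.
near_while <- function(x, th) {
  i <- 1
  while (i < length(x)) {
    j <- i + 1
    while (j <= length(x)) {
      if (abs(x[i] - x[j]) < th) {
        x <- x[-j]     # drop x[j]; the next candidate slides into position j
      } else {
        j <- j + 1     # only advance when nothing was removed
      }
    }
    i <- i + 1
  }
  x
}

near_while(c(1, 20, 2, 21), 10)  # 1 20
```

With this version near_while(1:20, 10) returns 1 11: the value 11 is kept because its distance to 1 is exactly 10, which is not below the threshold.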





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Bart Joosen

2007-Feb-11 07:18 UTC

[R] Near function?

All,

thanks for your help.

Dieter,

thanks, it's a different way of tackling the problem.
But don't I still need a for loop to step through the vector?

For example, take
c(1,2,3,5)
and a threshold of 3; then c(1,5) should remain. If I build a vector of the
differences between each element and the previous one,
then 5 would be eliminated, although it shouldn't be.

Or am I wrong with this assumption?
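
The difference between the two rules in this example can be checked directly. The sketch below compares each element with its predecessor (the diff approach) and then with the elements kept so far. The names by_diff and near_kept are made up for this illustration.

```r
x  <- c(1, 2, 3, 5)
th <- 3

# Rule 1: compare each element with the previous element.
by_diff <- x[c(TRUE, diff(x) >= th)]
by_diff            # 1  (5 is dropped because 5 - 3 = 2 is below th)

# Rule 2: compare each element with every element kept so far.
near_kept <- function(x, th) {
  keep <- logical(length(x))
  for (i in seq_along(x)) {
    # keep x[i] only if no already-kept value is closer than th
    if (!any(abs(x[i] - x[keep]) < th)) keep[i] <- TRUE
  }
  x[keep]
}
near_kept(x, th)   # 1 5
```

The loop here runs over the elements once and never changes its own index, so it avoids the skipping seen in the original function.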

Thanks anyway

Bart


Wolfgang Huber

2007-Feb-11 15:09 UTC

[R] Near function?

Dear Bart,

"hclust" might be useful for this as well:

   dat = c(1,20,2,21)

   hc = hclust(dist(dat))

   thresh = 2
   ct = cutree(hc, h=thresh)

   clusteredNumbers = split(dat, ct)
   firstOne = dat[!duplicated(ct)]

 >  clusteredNumbers
$`1`
[1] 1 2
$`2`
[1] 20 21


 > firstOne
[1]  1 20
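
The steps above could be wrapped in a small function, sketched here under some assumptions. The name near_hclust and the method argument are additions for this example. "complete" is the default linkage of hclust, and with it cutting at height thresh should give clusters in which no two values differ by more than thresh.

```r
# Sketch: keep the first element (in original order) of each cluster.
near_hclust <- function(dat, thresh, method = "complete") {
  if (length(dat) < 2) return(dat)     # hclust needs at least two values
  hc <- hclust(dist(dat), method = method)
  ct <- cutree(hc, h = thresh)         # cluster label for each element
  dat[!duplicated(ct)]                 # first element of each cluster
}

near_hclust(c(1, 20, 2, 21), 2)  # 1 20
```

Other linkage methods group the values differently, so the choice of method changes which elements count as "near".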


  Best wishes
   Wolfgang


-- 
------------------------------------------------------------------
Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber
