madr
2010-Dec-04 22:11 UTC
[R] what is this averaging function called ?, has R a built in function for it ?
I know little about statistics and wrote this function from intuition. Since the algorithm is so basic, I wonder what its proper name is and whether it is built into R.

Here is some PHP code to illustrate what the function does. It uses some helper functions I wrote myself (aic, getcsv), but their meaning should be obvious:

# read the CSV file and swap rows with columns to get two arrays (x and y)
$csv = aic(getcsv(file_get_contents("out.csv")));
# sort the two arrays, which are held in one outer array
array_multisort($csv[0], SORT_NUMERIC, $csv[1], SORT_NUMERIC);

# build a second array keyed by the unique x (index 0) values; every y (index 1)
# value is stored under its x, so that a mean (arithmetic, geometric or
# harmonic) can be computed from it later
$sum = array();
foreach ($csv[0] as $k => $x) {
    $sum[$x][] = $csv[1][$k];
}

# keep the x values in a separate array for later use
$x = array_keys($sum);
$rang = $sum = array_values($sum);

# Here is the key feature: to smooth the line, the function gathers up to
# (in this case) 500 values below and above the given point, if they exist.
# The search in each direction stops when it runs off the end of the array,
# or when adding the next group of values would bring the count to 500 or more.
# Without this precaution, a large spike in the data would affect the points
# near it.
foreach ($rang as $k => &$v) {
    if (!($k % 100)) echo $k."\n";   # progress output
    $up = $down = array();
    $walk = 0;
    while (true) {
        ++$walk;
        if (isset($sum[$k-$walk]) and count($v)+count($up)+count($sum[$k-$walk]) < 500)
            $up = array_merge($up, $sum[$k-$walk]);
        else break;
    }
    $walk = 0;
    while (true) {
        ++$walk;
        if (isset($sum[$k+$walk]) and count($v)+count($down)+count($sum[$k+$walk]) < 500)
            $down = array_merge($down, $sum[$k+$walk]);
        else break;
    }

    $rang[$k] = array_merge($up, $rang[$k], $down);
    # after gathering the data for the given point, take a mean (arithmetic here)
    $rang[$k] = array_sum($rang[$k]) / count($rang[$k]);
}
unset($v);   # drop the reference left behind by the foreach

# now the array of x values can be added back, and the flipped array is ready
# to be written to a file
$csv = aic(array($x, $rang));

# In PHP this is awfully slow, but I like it because it is sensitive to the
# density of the data and does not go off in strange directions when the data
# density becomes very low.

--
View this message in context: http://r.789695.n4.nabble.com/what-is-this-averaging-function-called-has-R-a-built-in-function-for-it-tp3072826p3072826.html
Sent from the R help mailing list archive at Nabble.com.
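For readers who want to try the idea in R, the PHP routine above can be roughly sketched as follows. This is only an illustrative translation under assumptions: the function name adaptive_window_mean and its limit argument are made up here and are not part of base R, and x and y are assumed to be numeric vectors of equal length. The procedure resembles a running mean over a window of nearest neighbours in x, though that label is an interpretation, not something stated in the post.

```r
# Hypothetical helper (not a base R function): for every unique x, pool the
# y values of neighbouring x groups on each side while the running count
# stays below `limit`, then take the arithmetic mean of the pooled values.
adaptive_window_mean <- function(x, y, limit = 500) {
  ux <- sort(unique(x))                   # unique x values, increasing
  groups <- unname(split(y, match(x, ux)))  # y values grouped per unique x
  n <- length(groups)
  smoothed <- numeric(n)
  for (k in seq_len(n)) {
    own <- groups[[k]]
    # Walk away from k in one direction (step = -1 or +1), adding whole
    # groups until the array ends or the next group would reach `limit`.
    gather <- function(step) {
      pooled <- numeric(0)
      j <- k + step
      while (j >= 1 && j <= n &&
             length(own) + length(pooled) + length(groups[[j]]) < limit) {
        pooled <- c(pooled, groups[[j]])
        j <- j + step
      }
      pooled
    }
    smoothed[k] <- mean(c(gather(-1), own, gather(1)))
  }
  data.frame(x = ux, y = smoothed)
}
```

For example, adaptive_window_mean(c(1, 1, 2, 3), c(2, 4, 6, 8), limit = 3) gives y values 3, 7, 7 for x = 1, 2, 3. As in the PHP version, each side is capped separately, so the pooled window can hold close to twice `limit` values in total.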
Tal Galili
2010-Dec-05 15:15 UTC
[R] what is this averaging function called ?, has R a built in function for it ?
Hi there,

I didn't fully follow your code, but allow me to ask: *What is your input, and what do you want the output to be?*

Is your input a vector (array) of points? Is your output some sort of average of the points according to their order?

If you are looking for a function to "smooth" a sequence of points, have a look at this:
http://en.wikipedia.org/wiki/Lowess

In R you can use ?loess

Cheers,
Tal
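A minimal sketch of the loess suggestion, for illustration only: the data are simulated, and span = 0.3 is an arbitrary choice here, not a recommendation. loess() and predict() are in the stats package that ships with R.

```r
set.seed(1)                          # made-up example data
x <- sort(runif(200, 0, 10))
y <- sin(x) + rnorm(200, sd = 0.3)

# span is the fraction of the points used for each local fit;
# smaller values follow the data more closely.
fit <- loess(y ~ x, span = 0.3)
smoothed <- predict(fit)             # fitted values at the observed x
```

The result can be inspected with plot(x, y) followed by lines(x, smoothed). Because span is a fraction of the points rather than a fixed width in x, the window widens where the data are sparse, which is close in spirit to the density-sensitive behaviour described in the question.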