Hello dear R-list members,

I have a question about normalizing data. The goal is to normalize the dataset per column, so all the data in each column is scaled to the interval (0,1), will have a mean of 0 and a standard deviation of 1.

I know the way to do this is to take each datapoint, subtract the mean of the column it is located in and divide this by the standard deviation of the column. Now my question is: is there a function in R that does this, and if so, which function?

Thanks very much,
Jonck

On 06/14/03 14:57, Jonck van der Kogel wrote:
>Hello dear R-list members,
>I have a question about normalizing data. The goal is to normalize the
>dataset per column, so all the data in each column is scaled to the
>interval (0,1), will have a mean of 0 and a standard deviation of 1.

In psychology, we usually call this standardizing. "Normalizing" is subtracting the mean but NOT dividing by the s.d.

>I know the way to do this is to take each datapoint, subtract the mean
>of the column it is located in and divide this by the standard
>deviation of the column. Now my question is: is there a function in R
>that does this, and if so, which function?

scale()

It will standardize (the default) or, with scale = FALSE, only subtract the mean (normalize in the sense above). And the default is to do it by column, as you describe.

--
Jonathan Baron, Professor of Psychology, University of Pennsylvania
R page: http://finzi.psych.upenn.edu/

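To make the suggestion concrete, here is a minimal sketch comparing scale() with the by-hand calculation described in the question. The matrix x and its values are made up purely for illustration; they are not from the original thread.

```r
# Small made-up data matrix: 5 observations, 2 columns (illustrative only).
x <- cbind(a = c(1, 2, 3, 4, 5),
           b = c(10, 20, 40, 80, 160))

# Standardize by hand: subtract each column's mean, then divide by its sd.
manual <- sweep(x, 2, colMeans(x), "-")
manual <- sweep(manual, 2, apply(x, 2, sd), "/")

# scale() with its defaults (center = TRUE, scale = TRUE) does the same,
# column by column.
z <- scale(x)

# Centering only: subtract the column means, keep the original spread.
centered <- scale(x, scale = FALSE)

colMeans(z)        # each column mean is 0, up to rounding error
apply(z, 2, sd)    # each column standard deviation is 1
```

Note that scale() returns a matrix with extra attributes ("scaled:center" and "scaled:scale") recording the means and standard deviations it used, so a comparison with the by-hand result should ignore attributes, e.g. all.equal(manual, z, check.attributes = FALSE).
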
> >I have a question about normalizing data. The goal is to normalize the
> >dataset per column, so all the data in each column is scaled to the
> >interval (0,1), will have a mean of 0 and a standard deviation of 1.

Note that standardizing (i.e., taking a bunch of observations, subtracting the mean, and dividing by the standard deviation) does not by itself ensure that the data will fall into any particular interval. In fact, standardized data cannot all fall into the interval (0,1): the standardized values have a mean of 0, so unless all observations are identical, some of them lie above the mean and some below it, which makes some values positive and some negative.

--
Wolfgang Viechtbauer
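
The point can be checked with a short sketch. The vector v is made up for illustration. The second half shows min-max rescaling, which is a different technique from standardizing and is not something proposed in the thread; it is included only as one common way to get values into the closed interval [0,1], assuming the values are not all identical.

```r
# Made-up sample (illustrative only).
v <- c(2, 4, 4, 5, 10)

# Standardized values: mean 0 and sd 1, but not confined to (0, 1).
z <- (v - mean(v)) / sd(v)
range(z)   # the minimum is negative, the maximum is positive

# Min-max rescaling: maps the smallest value to 0 and the largest to 1.
# The result lies in [0, 1], but its mean and sd are generally not 0 and 1.
# (If all values were identical, max(v) - min(v) would be 0.)
r <- (v - min(v)) / (max(v) - min(v))
range(r)
```

So the two requirements in the original question pull in different directions: a column can be given mean 0 and standard deviation 1, or it can be squeezed into [0,1], but not both at once.
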