At 19:36 on 20/04/2023, AbouEl-Makarim Aboueissa wrote:
> Dear All:
>
> *Re:* detect and replace outliers by the average
>
> The dataset, please see attached, contains a group factoring column
> "*factor*" and two columns of data "x1" and "x2" with some NA values. I need
> some help to detect the outliers and replace them and the NAs with the
> average within each level (0,1,2) for each variable "x1" and "x2".
>
> I tried the below code, but it did not accomplish what I want to do.
>
> data<-read.csv("G:/20-Spring_2023/Outliers/data.csv", header=TRUE)
> data
>
> replace_outlier_with_mean <- function(x) {
>   replace(x, x %in% boxplot.stats(x)$out, mean(x, na.rm=TRUE)) #### , na.rm=TRUE NOT working
> }
>
> data[] <- lapply(data, replace_outlier_with_mean)
>
> Thank you all very much for your help in advance.
>
> with many thanks
> abou
> ______________________
>
> *AbouEl-Makarim Aboueissa, PhD*
> *Professor, Mathematics and Statistics*
> *Graduate Coordinator*
> *Department of Mathematics and Statistics*
> *University of Southern Maine*
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Hello,

There is no data set attached, see the posting guide on what file
extensions are allowed as attachments.

As for the question, try to compute mean(x, na.rm = TRUE) first, then
use this value in the replace instruction. Without data I'm just guessing.

Hope this helps,

Rui Barradas
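A minimal sketch of the suggestion above, not code from the thread: the mean is computed first with `na.rm = TRUE` and only then passed to `replace()`. Extending the replacement mask with `is.na(x)` so that NAs are overwritten as well is an assumption based on the original request, and the sample vector is only illustrative.

```r
# Sketch: compute the mean up front, then use it as the replacement value.
replace_outlier_with_mean <- function(x) {
  m <- mean(x, na.rm = TRUE)             # mean of the non-missing values
  is_out <- x %in% boxplot.stats(x)$out  # TRUE for values flagged by the boxplot rule
  replace(x, is_out | is.na(x), m)       # overwrite outliers and NAs with the mean
}

# Illustrative vector with one extreme value and one NA
x <- c(700, 700, 470, 710, 5555, 610, 710, 610, NA)
replace_outlier_with_mean(x)
```

Note that this mean still includes the flagged value itself (5555 here). Whether outliers should be excluded before averaging is not settled in the thread.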
AbouEl-Makarim Aboueissa
2023-Apr-20 18:46 UTC
[R] detect and replace outliers by the average
Hi Rui:

here is the dataset

factor  x1    x2
0       700   700
0       700   500
0       470   470
0       710   560
0       5555  520
0       610   720
0       710   670
0       610   9999
1       690   620
1       580   540
1       690   690
1       NA    401
1       450   580
1       700   700
1       400   8888
1       6666  600
1       500   400
1       680   650
2       117   63
2       120   68
2       130   73
2       120   69
2       125   54
2       999   70
2       165   62
2       130   987
2       123   70
2       78
2       98
2       5
2       321   NA

with many thanks
abou

On Thu, Apr 20, 2023 at 2:44 PM Rui Barradas <ruipbarradas at sapo.pt> wrote:
> Hello,
>
> There is no data set attached, see the posting guide on what file
> extensions are allowed as attachments.
>
> As for the question, try to compute mean(x, na.rm = TRUE) first, then
> use this value in the replace instruction. Without data I'm just guessing.
>
> Hope this helps,
>
> Rui Barradas
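One possible way to apply the replacement within each level of `factor`, sketched with base R's `ave()`. This is not a solution posted in the thread. Two assumptions are made here: the replacement mean is computed after dropping the values flagged by `boxplot.stats()`, and only the rows with both x1 and x2 entries are used, because the column of the lone values in the three short rows cannot be recovered from the message. The helper name `replace_with_clean_mean` is hypothetical.

```r
# Complete rows only, copied from the posted data; the three rows with an
# empty cell are omitted because their column is ambiguous.
dat <- read.table(header = TRUE, text = "
factor x1 x2
0 700 700
0 700 500
0 470 470
0 710 560
0 5555 520
0 610 720
0 710 670
0 610 9999
1 690 620
1 580 540
1 690 690
1 NA 401
1 450 580
1 700 700
1 400 8888
1 6666 600
1 500 400
1 680 650
2 117 63
2 120 68
2 130 73
2 120 69
2 125 54
2 999 70
2 165 62
2 130 987
2 123 70
2 321 NA
")

# Hypothetical helper. Assumption: flagged outliers are excluded from the mean.
replace_with_clean_mean <- function(x) {
  is_out <- x %in% boxplot.stats(x)$out   # values beyond the boxplot whiskers
  m <- mean(x[!is_out], na.rm = TRUE)     # mean of the remaining non-NA values
  replace(x, is_out | is.na(x), m)        # overwrite outliers and NAs
}

cleaned <- dat
for (v in c("x1", "x2")) {
  # ave() applies the function separately within each level of the group column
  cleaned[[v]] <- ave(dat[[v]], dat$factor, FUN = replace_with_clean_mean)
}
cleaned
```

The grouping column is passed to `ave()` rather than looping over `split()` pieces so that the result stays aligned with the original row order. The boxplot rule with the default `coef = 1.5` is only one definition of an outlier; with small groups it can flag values a reader may consider ordinary.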