thr3ads.net - R help - [R] Thank you your help. [Jan 2013]


arun

2013-Jan-28 14:48 UTC

[R] Thank you your help.

Hi,
temp3<- read.table(text="
ID CTIME WEIGHT
HM001 1223 24.0
HM001 1224 25.2
HM001 1225 23.1
HM001 1226 NA
HM001 1227 32.1
HM001 1228 32.4
HM001 1229 1323.2
HM001 1230 27.4
HM001 1231 22.4236 #changed here to test the previous solution
",sep="",header=TRUE,stringsAsFactors=FALSE)
tempnew<- na.omit(temp3)


grep("\\d{4}",temp3$WEIGHT)
#[1] 7 9 #not correct


temp3[,3][grep("\\d{4}..*",temp3$WEIGHT)]<-NA #match 4 digit numbers before the decimals
tail(temp3)
#     ID CTIME  WEIGHT
#4 HM001  1226      NA
#5 HM001  1227 32.1000
#6 HM001  1228 32.4000
#7 HM001  1229      NA
#8 HM001  1230 27.4000
#9 HM001  1231 22.4236

Based on the variance, you could set up some limit, for example 50, and use:
tempnew$WEIGHT<- ifelse(tempnew$WEIGHT>50,NA,tempnew$WEIGHT)
A.K.
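
To illustrate why the first pattern flags row 9 as well, here is a minimal sketch on the same nine weights. The anchored pattern `^\\d{4,}(\\.|$)` and the variable names are assumptions of this example, not part of the reply above; the pattern only accepts four or more digits in front of the decimal point.

```r
# The nine weights from the example above, assumed to be stored as numbers
weight <- c(24.0, 25.2, 23.1, NA, 32.1, 32.4, 1323.2, 27.4, 22.4236)

# Unanchored: any run of four digits matches, including the decimals of 22.4236
unanchored <- grep("\\d{4}", weight)
unanchored
# [1] 7 9

# Anchored: four or more digits at the start, then a decimal point or the end
anchored <- grep("^\\d{4,}(\\.|$)", weight)
anchored
# [1] 7
```

If the weights are numeric anyway, a plain comparison such as `WEIGHT > 50` avoids the conversion to text altogether.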





________________________________
From: ??? <jamansymptom at naver.com>
To: arun <smartpink111 at yahoo.com> 
Sent: Monday, January 28, 2013 2:20 AM
Subject: Re: Thank you your help.



Thank you for your reply again. Your understanding is exactly right.
I attached a picture that shows the dataset.
'weight' is a dependent variable, and CTIME means hour/minute. This data
will accumulate for years.
Speaking of the accepted variance range, it would be from 10 to 50.
Actually, I am a Java programmer, so I am unfamiliar with the R language.
Can you give me some examples of using the grep function?
-----Original Message-----
From: "arun"<smartpink111 at yahoo.com> 
To: "jamansymptom at naver.com"<jamansymptom at naver.com>; 
Cc: 
Sent: 2013-01-28 (?) 15:27:12
Subject: Re: Thank you your help.

Hi,
Your original post was that 
"...it was evaluated from 20kg -40kg. But By some errors, it is evaluated
2000 kg".

So, my understanding was that you occasionally get readings of 2000, or 2000-4000, in place of
20-40 due to some misreading.

If your dataset contains observed values, strange values and NA, and you want to
replace the strange values with NA, could you mention the range of the strange values?
If the strange values range anywhere between 1000-9999, they should get replaced
with the ?grep() solution. But if it depends upon something else, you need to
specify. Also, regarding the variance, what is your accepted range of variance?
A.K.





----- Original Message -----
From: "jamansymptom at naver.com" <jamansymptom at naver.com>
To: smartpink111 at yahoo.com
Cc: 
Sent: Monday, January 28, 2013 1:15 AM
Subject: Thank you your help.

Thank you for answering my question.
It is not exactly what I want. I should have described the situation in detail.
There is a sensor that gets data every minute, and that data will be accumulated and
become part of the dataset.
The dataset contains observed values, strange values and NA.
Namely, I am not sure where a strange value will occur.
And I can't predict when a strange value will occur.

I need a procedure like the one below.
1. using a method, set the range of variance
2. using a for(i) statement, check whether variance(weight) is in the range.
3. when the variance is out of range, set weight[i] to NA.

Thank you.
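
The three steps listed above could be sketched roughly as follows. This assumes that the "range of variance" simply means an allowed interval for each reading; the limits 10 and 50 and the names `lower`, `upper`, `cleaned` and `vectorised` are hypothetical choices for illustration.

```r
# Step 1: hypothetical limits for an acceptable reading
lower <- 10
upper <- 50

# Sample readings: observed values, one NA and one strange value
weight <- c(24.0, 25.2, 23.1, NA, 32.1, 32.4, 1323.2, 27.4, 22.4236)

# Steps 2 and 3: walk through the readings and blank out values outside the range
cleaned <- weight
for (i in seq_along(cleaned)) {
  if (!is.na(cleaned[i]) && (cleaned[i] < lower || cleaned[i] > upper)) {
    cleaned[i] <- NA
  }
}

# Vectorised equivalent; which() ignores the NA produced by comparing NA
vectorised <- weight
vectorised[which(weight < lower | weight > upper)] <- NA

identical(cleaned, vectorised)
# [1] TRUE
```

In R the vectorised form is usually preferred over the explicit loop, but both give the same result here.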

arun

2013-Jan-29 02:25 UTC


[R] Thank you your help and one more question.

HI,

How do you want to combine the results?
It looks like the 5 datasets are list elements.

If I take the first three list elements,
imput1_2_3<-list(imp1=structure(list(ID = c("HM001",
"HM001", "HM001", "HM001", "HM001",
"HM001", "HM001", "HM001", "HM001",
"HM001", "HM001", "HM001",
"HM001", "HM001", "HM001"), CTIME = 1223:1237,
WEIGHT = c(24.9,
25.2, 25.5, 25.24132, 25.7, 27.1, 27.3, 27.4, 28.4, 29.2, 30.1377, 
31.17251, 32.4, 33.7, 34.3)), .Names = c("ID", "CTIME",
"WEIGHT"
), class = "data.frame", row.names = c("1", "2",
"3", "4", "5",
"6", "7", "8", "9", "10",
"11", "12", "13", "14",
"15")),
imp2=structure(list(ID = c("HM001", "HM001",
"HM001", "HM001", "HM001",
"HM001", "HM001", "HM001", "HM001",
"HM001", "HM001", "HM001",
"HM001", "HM001", "HM001"), CTIME = 1223:1237,
WEIGHT = c(24.9,
25.2, 25.5, 25.54828, 25.7, 27.1, 27.3, 27.4, 28.4, 29.2, 29.8977, 
31.35045, 32.4, 33.7, 34.3)), .Names = c("ID", "CTIME",
"WEIGHT"
), class = "data.frame", row.names = c("1", "2",
"3", "4", "5",
"6", "7", "8", "9", "10",
"11", "12", "13", "14",
"15")),
imp3=structure(list(ID = c("HM001", "HM001",
"HM001", "HM001", "HM001",
"HM001", "HM001", "HM001", "HM001",
"HM001", "HM001", "HM001",
"HM001", "HM001", "HM001"), CTIME = 1223:1237,
WEIGHT = c(24.9,
25.2, 25.5, 25.46838, 25.7, 27.1, 27.3, 27.4, 28.4, 29.2, 30.88185, 
31.57952, 32.4, 33.7, 34.3)), .Names = c("ID", "CTIME",
"WEIGHT"
), class = "data.frame", row.names = c("1", "2",
"3", "4", "5",
"6", "7", "8", "9", "10",
"11", "12", "13", "14",
"15")))
#It could be combined by:
do.call(rbind, imput1_2_3) # But if you do this, the total number of rows will be
# the sum of the number of rows of each dataset.

I guess you want something like this:

res<-Reduce(function(...)
merge(...,by=c("ID","CTIME")),imput1_2_3)
names(res)[3:5]<-
paste("WEIGHT","IMP",1:3,sep="")
res
#      ID CTIME WEIGHTIMP1 WEIGHTIMP2 WEIGHTIMP3
#1  HM001  1223   24.90000   24.90000   24.90000
#2  HM001  1224   25.20000   25.20000   25.20000
#3  HM001  1225   25.50000   25.50000   25.50000
#4  HM001  1226   25.24132   25.54828   25.46838
#5  HM001  1227   25.70000   25.70000   25.70000
#6  HM001  1228   27.10000   27.10000   27.10000
#7  HM001  1229   27.30000   27.30000   27.30000
#8  HM001  1230   27.40000   27.40000   27.40000
#9  HM001  1231   28.40000   28.40000   28.40000
#10 HM001  1232   29.20000   29.20000   29.20000
#11 HM001  1233   30.13770   29.89770   30.88185
#12 HM001  1234   31.17251   31.35045   31.57952
#13 HM001  1235   32.40000   32.40000   32.40000
#14 HM001  1236   33.70000   33.70000   33.70000
#15 HM001  1237   34.30000   34.30000   34.30000
A.K.
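
As a side note on why the columns are renamed after the merge, the following small sketch shows the names that a chained `merge()` produces. The three stand-in data frames and the names `imps`, `wide` and `raw_names` are made up for this illustration; only `merge()`, `Reduce()` and the column names come from the reply above.

```r
# Three small stand-in data frames sharing the ID, CTIME and WEIGHT columns
imps <- list(
  imp1 = data.frame(ID = "HM001", CTIME = 1223:1225, WEIGHT = c(24.9, 25.2, 25.24132)),
  imp2 = data.frame(ID = "HM001", CTIME = 1223:1225, WEIGHT = c(24.9, 25.2, 25.54828)),
  imp3 = data.frame(ID = "HM001", CTIME = 1223:1225, WEIGHT = c(24.9, 25.2, 25.46838))
)

# Merge pairwise on the key columns; clashing WEIGHT columns receive suffixes
wide <- Reduce(function(x, y) merge(x, y, by = c("ID", "CTIME")), imps)
raw_names <- names(wide)
raw_names
# [1] "ID"       "CTIME"    "WEIGHT.x" "WEIGHT.y" "WEIGHT"

# Give the three weight columns consistent names
names(wide)[3:5] <- paste0("WEIGHTIMP", seq_along(imps))
```

The first merge adds the `.x` and `.y` suffixes; the second finds no clash, so the last column keeps the plain name `WEIGHT`.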







________________________________
From: ??? <jamansymptom at naver.com>
To: arun <smartpink111 at yahoo.com> 
Sent: Monday, January 28, 2013 7:35 PM
Subject: Thank you your help and one more question.

I deeply appreciate your help. Answering your question, I am a software engineer,
and I am developing a system that accumulates data to draw charts and tables.
For higher performance, I have to deal with missing value treatment, so I use
the Amelia package. Below is the result of following your answer.
----------------------------------------------------------------
> temp2    # original data
      ID CTIME WEIGHT
1  HM001  1223   24.9
2  HM001  1224   25.2
3  HM001  1225   25.5
4  HM001  1226     NA
5  HM001  1227   25.7
6  HM001  1228   27.1
7  HM001  1229   27.3
8  HM001  1230   27.4
9  HM001  1231   28.4
10 HM001  1232   29.2
11 HM001  1233 1221.0
12 HM001  1234     NA
13 HM001  1235   32.4
14 HM001  1236   33.7
15 HM001  1237   34.3
> temp2$WEIGHT<- ifelse(temp2$WEIGHT>50,NA,temp2$WEIGHT)
> temp2    # After eliminating strange values
      ID CTIME WEIGHT
1  HM001  1223   24.9
2  HM001  1224   25.2
3  HM001  1225   25.5
4  HM001  1226     NA
5  HM001  1227   25.7
6  HM001  1228   27.1
7  HM001  1229   27.3
8  HM001  1230   27.4
9  HM001  1231   28.4
10 HM001  1232   29.2
11 HM001  1233     NA
12 HM001  1234     NA
13 HM001  1235   32.4
14 HM001  1236   33.7
15 HM001  1237   34.3
--------------------------------------------------------------
I have one more question. Below are the code and results.
--------------------------------------------------------------
> a.out2<-amelia(temp2, m=5, ts="CTIME", cs="ID", polytime=1)
-- Imputation 1 --
 1  2  3  4
-- Imputation 2 --
 1  2  3
-- Imputation 3 --
 1  2  3  4
-- Imputation 4 --
 1  2  3
-- Imputation 5 --
 1  2  3
> a.out2$imputations
$imp1
      ID CTIME   WEIGHT
1  HM001  1223 24.90000
2  HM001  1224 25.20000
3  HM001  1225 25.50000
4  HM001  1226 25.24132
5  HM001  1227 25.70000
6  HM001  1228 27.10000
7  HM001  1229 27.30000
8  HM001  1230 27.40000
9  HM001  1231 28.40000
10 HM001  1232 29.20000
11 HM001  1233 30.13770
12 HM001  1234 31.17251
13 HM001  1235 32.40000
14 HM001  1236 33.70000
15 HM001  1237 34.30000
$imp2
      ID CTIME   WEIGHT
1  HM001  1223 24.90000
2  HM001  1224 25.20000
3  HM001  1225 25.50000
4  HM001  1226 25.54828
5  HM001  1227 25.70000
6  HM001  1228 27.10000
7  HM001  1229 27.30000
8  HM001  1230 27.40000
9  HM001  1231 28.40000
10 HM001  1232 29.20000
11 HM001  1233 29.89770
12 HM001  1234 31.35045
13 HM001  1235 32.40000
14 HM001  1236 33.70000
15 HM001  1237 34.30000
$imp3
      ID CTIME   WEIGHT
1  HM001  1223 24.90000
2  HM001  1224 25.20000
3  HM001  1225 25.50000
4  HM001  1226 25.46838
5  HM001  1227 25.70000
6  HM001  1228 27.10000
7  HM001  1229 27.30000
8  HM001  1230 27.40000
9  HM001  1231 28.40000
10 HM001  1232 29.20000
11 HM001  1233 30.88185
12 HM001  1234 31.57952
13 HM001  1235 32.40000
14 HM001  1236 33.70000
15 HM001  1237 34.30000
$imp4
      ID CTIME   WEIGHT
1  HM001  1223 24.90000
2  HM001  1224 25.20000
3  HM001  1225 25.50000
4  HM001  1226 25.86703
5  HM001  1227 25.70000
6  HM001  1228 27.10000
7  HM001  1229 27.30000
8  HM001  1230 27.40000
9  HM001  1231 28.40000
10 HM001  1232 29.20000
11 HM001  1233 30.61241
12 HM001  1234 30.17042
13 HM001  1235 32.40000
14 HM001  1236 33.70000
15 HM001  1237 34.30000
$imp5
      ID CTIME   WEIGHT
1  HM001  1223 24.90000
2  HM001  1224 25.20000
3  HM001  1225 25.50000
4  HM001  1226 26.05747
5  HM001  1227 25.70000
6  HM001  1228 27.10000
7  HM001  1229 27.30000
8  HM001  1230 27.40000
9  HM001  1231 28.40000
10 HM001  1232 29.20000
11 HM001  1233 31.03894
12 HM001  1234 30.90960
13 HM001  1235 32.40000
14 HM001  1236 33.70000
15 HM001  1237 34.30000
----------------------------------------
I got 5 datasets including imputed values. But what I want is not five datasets,
only one dataset that combines those 5 imputed datasets.
I want to combine $imp1, $imp2 ... $imp5 to get a final result set. This result set
is also a (3 X 15) matrix.
Would you help me once more, please?



arun

2013-Jan-29 03:20 UTC


[R] Thank you your help and one more question.

HI,

I don't have Amelia package installed.


If you want to get the mean value, you could use either ?aggregate() or
?ddply() from library(plyr)

library(plyr)
imputNew<-do.call(rbind,imput1_2_3)
res1<-ddply(imputNew,.(ID,CTIME),function(x) mean(x$WEIGHT))
names(res1)[3]<-"WEIGHT"
head(res1)
#     ID CTIME   WEIGHT
#1 HM001  1223 24.90000
#2 HM001  1224 25.20000
#3 HM001  1225 25.50000
#4 HM001  1226 25.41933
#5 HM001  1227 25.70000
#6 HM001  1228 27.10000


#or
res2<-aggregate(.~ID+CTIME,data=imputNew,mean)
#or
res3<- do.call(rbind,lapply(split(imputNew,imputNew$CTIME),function(x)
{x$WEIGHT<-mean(x[,3]);head(x,1)}))
row.names(res3)<-1:nrow(res3)
identical(res1,res2)
#[1] TRUE
identical(res1,res3)
#[1] TRUE
A.K.
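
Another way to get the same per-row mean, shown here only as a hedged sketch, is to add the WEIGHT columns element by element. It assumes every imputed data frame has its rows in the same order; the shortened stand-in data and the names `imps`, `pooled` and `agg` are made up for this example.

```r
# Shortened stand-ins for three imputed data frames (rows in the same order)
imps <- list(
  imp1 = data.frame(ID = "HM001", CTIME = 1223:1226, WEIGHT = c(24.9, 25.2, 25.5, 25.24132)),
  imp2 = data.frame(ID = "HM001", CTIME = 1223:1226, WEIGHT = c(24.9, 25.2, 25.5, 25.54828)),
  imp3 = data.frame(ID = "HM001", CTIME = 1223:1226, WEIGHT = c(24.9, 25.2, 25.5, 25.46838))
)

# Copy the first data frame and overwrite WEIGHT with the element-wise mean
pooled <- imps[[1]]
pooled$WEIGHT <- Reduce(`+`, lapply(imps, function(d) d$WEIGHT)) / length(imps)

# Cross-check against the aggregate() route on the stacked data
agg <- aggregate(WEIGHT ~ ID + CTIME, data = do.call(rbind, imps), FUN = mean)
all.equal(agg$WEIGHT, pooled$WEIGHT)
# [1] TRUE
```

This keeps the result at one row per CTIME without stacking, but it silently gives wrong answers if the row order differs between the data frames, so the grouped versions above are the safer choice.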


________________________________
From: ??? <jamansymptom at naver.com>
To: arun <smartpink111 at yahoo.com> 
Sent: Monday, January 28, 2013 9:47 PM
Subject: Re: Thank you your help and one more question.


Thank you for replying to my question.
What I want is a matrix like the one below.
I have 3 data sets named weightimp1, 2, 3.
To get the matrix below, I have to combine the 3 data sets (named
weightimp1, 2, 3).
I don't know how to combine the 3 data sets. It could be the mean of the 3 data sets. Or
there might be a value (temp2$imputations$...) in the Amelia package.
I prefer to use an Amelia package method, but if one doesn't exist, can you
recommend how to set it as a mean value?

#      ID CTIME WEIGHT (It represents the 3 data sets (weightimp1, 2, 3))
#1  HM001  1223   24.90000
#2  HM001  1224   25.20000
#3  HM001  1225   25.50000
#4  HM001  1226   25.24132
#5  HM001  1227   25.70000
#6  HM001  1228   27.10000
#7  HM001  1229   27.30000
#8  HM001  1230   27.40000
#9  HM001  1231   28.40000
#10 HM001  1232   29.20000
#11 HM001  1233   30.13770
#12 HM001  1234   31.17251
#13 HM001  1235   32.40000
#14 HM001  1236   33.70000
#15 HM001  1237   34.30000

arun

2013-Jan-29 05:48 UTC


[R] Thank you your help and one more question.

Hi,

I think I understand your mistake.
imput1_2_3<-list(imp1=structure(list(ID = c("HM001",
"HM001", "HM001", "HM001", "HM001",
"HM001", "HM001", "HM001", "HM001",
"HM001", "HM001", "HM001",
"HM001", "HM001", "HM001"), CTIME = 1223:1237,
WEIGHT = c(24.9,
25.2, 25.5, 25.24132, 25.7, 27.1, 27.3, 27.4, 28.4, 29.2, 30.1377,
31.17251, 32.4, 33.7, 34.3)), .Names = c("ID", "CTIME",
"WEIGHT"
), class = "data.frame", row.names = c("1", "2",
"3", "4", "5",
"6", "7", "8", "9", "10",
"11", "12", "13", "14",
"15")),
imp2=structure(list(ID = c("HM001", "HM001",
"HM001", "HM001", "HM001",
"HM001", "HM001", "HM001", "HM001",
"HM001", "HM001", "HM001",
"HM001", "HM001", "HM001"), CTIME = 1223:1237,
WEIGHT = c(24.9,
25.2, 25.5, 25.54828, 25.7, 27.1, 27.3, 27.4, 28.4, 29.2, 29.8977,
31.35045, 32.4, 33.7, 34.3)), .Names = c("ID", "CTIME",
"WEIGHT"
), class = "data.frame", row.names = c("1", "2",
"3", "4", "5",
"6", "7", "8", "9", "10",
"11", "12", "13", "14",
"15")),
imp3=structure(list(ID = c("HM001", "HM001",
"HM001", "HM001", "HM001",
"HM001", "HM001", "HM001", "HM001",
"HM001", "HM001", "HM001",
"HM001", "HM001", "HM001"), CTIME = 1223:1237,
WEIGHT = c(24.9,
25.2, 25.5, 25.46838, 25.7, 27.1, 27.3, 27.4, 28.4, 29.2, 30.88185,
31.57952, 32.4, 33.7, 34.3)), .Names = c("ID", "CTIME",
"WEIGHT"
), class = "data.frame", row.names = c("1", "2",
"3", "4", "5",
"6", "7", "8", "9", "10",
"11", "12", "13", "14",
"15")))
imput<- list(imput1_2_3[1],imput1_2_3[2],imput1_2_3[3]) #what you tried.
#You should use [[ ]] instead of [ ]. Here, it is not necessary.
aggregate(.~ID+CTIME,data=imput,mean)
#Error in eval(expr, envir, enclos) : object 'ID' not found

#You don't need the above step.
class(imput1_2_3) #already a list
#[1] "list"
imput<-do.call(rbind,imput1_2_3)
aggregate(.~ID+CTIME,data=imput,mean)
#      ID CTIME   WEIGHT
#1  HM001  1223 24.90000
#2  HM001  1224 25.20000
#3  HM001  1225 25.50000
#4  HM001  1226 25.41933
#5  HM001  1227 25.70000
#6  HM001  1228 27.10000
#7  HM001  1229 27.30000
#8  HM001  1230 27.40000
#9  HM001  1231 28.40000
#10 HM001  1232 29.20000
#11 HM001  1233 30.30575
#12 HM001  1234 31.36749
#13 HM001  1235 32.40000
#14 HM001  1236 33.70000
#15 HM001  1237 34.30000
A.K.
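
The difference between `[ ]` and `[[ ]]` mentioned above can be seen on a small made-up list. This is only a sketch: the two tiny data frames and the names `imps`, `nested`, `stacked` and `res` are invented for illustration and stand in for the imputation list.

```r
# A named list of two small data frames, standing in for the imputations
imps <- list(
  imp1 = data.frame(ID = "HM001", CTIME = 1223:1224, WEIGHT = c(24.9, 25.2)),
  imp2 = data.frame(ID = "HM001", CTIME = 1223:1224, WEIGHT = c(24.9, 25.4))
)

class(imps[1])    # "list": still a list that holds one data frame
class(imps[[1]])  # "data.frame": the element itself

# list(imps[1], imps[2]) nests lists, so rbind finds no data frames to stack
nested <- list(imps[1], imps[2])
is.data.frame(do.call(rbind, nested))
# [1] FALSE

# Passing the list of data frames directly stacks their rows
stacked <- do.call(rbind, imps)
res <- aggregate(WEIGHT ~ ID + CTIME, data = stacked, FUN = mean)
```

Because the imputations are already stored as a list of data frames, that list can go straight into `do.call(rbind, ...)` without being rebuilt.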










________________________________
From: ??? <jamansymptom at naver.com>
To: arun <smartpink111 at yahoo.com> 
Sent: Tuesday, January 29, 2013 12:04 AM
Subject: Re: Thank you your help and one more question.


I decided to follow the aggregate() approach, so I installed library(plyr).
But while executing the statement 'res <- aggregate(.~ID+CTIME,
data=imput,mean)', an error occurred.
What should I do next?
> library(plyr)
> a.out2$imputations
$imp1
      ID       CTIME ACTIVE_KWH
1  HM001 2.01212e+11   24.20000
2  HM001 2.01212e+11   25.50000
3  HM001 2.01212e+11   25.60000
4  HM001 2.01212e+11   25.90065
5  HM001 2.01212e+11   26.60000
6  HM001 2.01212e+11   26.70000
7  HM001 2.01212e+11   27.10000
8  HM001 2.01212e+11   27.40000
9  HM001 2.01212e+11   27.50000
10 HM001 2.01212e+11   27.80000
11 HM001 2.01212e+11   28.20000
12 HM001 2.01212e+11   28.44605
13 HM001 2.01212e+11   28.70000
14 HM001 2.01212e+11   28.90000
15 HM001 2.01212e+11   29.10000
$imp2
      ID       CTIME ACTIVE_KWH
1  HM001 2.01212e+11   24.20000
2  HM001 2.01212e+11   25.50000
3  HM001 2.01212e+11   25.60000
4  HM001 2.01212e+11   25.87163
5  HM001 2.01212e+11   26.60000
6  HM001 2.01212e+11   26.70000
7  HM001 2.01212e+11   27.10000
8  HM001 2.01212e+11   27.40000
9  HM001 2.01212e+11   27.50000
10 HM001 2.01212e+11   27.80000
11 HM001 2.01212e+11   28.20000
12 HM001 2.01212e+11   28.68048
13 HM001 2.01212e+11   28.70000
14 HM001 2.01212e+11   28.90000
15 HM001 2.01212e+11   29.10000
> imput <- list(a.out2$imputations[1], a.out2$imputations[2])
> do.call(rbind, imput)
[[1]]
[[1]]$imp1
      ID       CTIME ACTIVE_KWH
1  HM001 2.01212e+11   24.20000
2  HM001 2.01212e+11   25.50000
3  HM001 2.01212e+11   25.60000
4  HM001 2.01212e+11   25.90065
5  HM001 2.01212e+11   26.60000
6  HM001 2.01212e+11   26.70000
7  HM001 2.01212e+11   27.10000
8  HM001 2.01212e+11   27.40000
9  HM001 2.01212e+11   27.50000
10 HM001 2.01212e+11   27.80000
11 HM001 2.01212e+11   28.20000
12 HM001 2.01212e+11   28.44605
13 HM001 2.01212e+11   28.70000
14 HM001 2.01212e+11   28.90000
15 HM001 2.01212e+11   29.10000

[[2]]
[[2]]$imp2
      ID       CTIME ACTIVE_KWH
1  HM001 2.01212e+11   24.20000
2  HM001 2.01212e+11   25.50000
3  HM001 2.01212e+11   25.60000
4  HM001 2.01212e+11   25.87163
5  HM001 2.01212e+11   26.60000
6  HM001 2.01212e+11   26.70000
7  HM001 2.01212e+11   27.10000
8  HM001 2.01212e+11   27.40000
9  HM001 2.01212e+11   27.50000
10 HM001 2.01212e+11   27.80000
11 HM001 2.01212e+11   28.20000
12 HM001 2.01212e+11   28.68048
13 HM001 2.01212e+11   28.70000
14 HM001 2.01212e+11   28.90000
15 HM001 2.01212e+11   29.10000
> res <- aggregate(.~ID+CTIME, data=imput,mean)
Following error: eval(expr, envir, enclos) : no element 'ID'      # I translated this line into English because it was written in my native language.

-----Original Message-----
From: "arun"<smartpink111 at yahoo.com> 
To: "???"<jamansymptom at naver.com>; 
Cc: "R help"<r-help at r-project.org>; 
Sent: 2013-01-29 (?) 12:20:10
Subject: Re: Thank you your help and one more question.



HI,

I don't have Amelia package installed.


If you want to get the mean value, you could use either ?aggregate(),? or
?ddply() from library(plyr)

library(plyr)
imputNew<-do.call(rbind,imput1_2_3)
?res1<-ddply(imputNew,.(ID,CTIME),function(x) mean(x$WEIGHT))
?names(res1)[3]<-"WEIGHT"
?head(res1)
?# ?? ID CTIME?? WEIGHT
#1 HM001? 1223 24.90000
#2 HM001? 1224 25.20000
#3 HM001? 1225 25.50000
#4 HM001? 1226 25.41933
#5 HM001? 1227 25.70000
#6 HM001? 1228 27.10000


#or
res2<-aggregate(.~ID+CTIME,data=imputNew,mean)
#or
res3<-? do.call(rbind,lapply(split(imputNew,imputNew$CTIME),function(x)
{x$WEIGHT<-mean(x[,3]);head(x,1)}))
row.names(res3)<-1:nrow(res3)
identical(res1,res2)
#[1] TRUE
?identical(res1,res3)
#[1] TRUE
A.K.


________________________________
From: ??? <jamansymptom at naver.com>
To: arun <smartpink111 at yahoo.com>
Sent: Monday, January 28, 2013 9:47 PM
Subject: Re: Thank you your help and one more question.


Thank you for replying to my question.
What I want is a matrix like the one below.
I have 3 datasets named weightimp1, 2, 3,
and to get that matrix I have to combine them.
I don't know how to combine the 3 datasets. It could be the mean of the 3 datasets. Or
there might be a value (temp2$imputations$...) in the Amelia package.
I would prefer to use an Amelia package method, but if one doesn't exist, can you
recommend how to set it as a mean value?

#      ID CTIME   WEIGHT   (represents the 3 datasets weightimp1, 2, 3)
#1  HM001  1223   24.90000
#2  HM001  1224   25.20000
#3  HM001  1225   25.50000
#4  HM001  1226   25.24132
#5  HM001  1227   25.70000
#6  HM001  1228   27.10000
#7  HM001  1229   27.30000
#8  HM001  1230   27.40000
#9  HM001  1231   28.40000
#10 HM001  1232   29.20000
#11 HM001  1233   30.13770
#12 HM001  1234   31.17251
#13 HM001  1235   32.40000
#14 HM001  1236   33.70000
#15 HM001  1237   34.30000
-----Original Message-----
From: "arun"<smartpink111 at yahoo.com>
To: "???"<jamansymptom at naver.com>;
Cc: "R help"<r-help at r-project.org>;
Sent: 2013-01-29 (Tue) 11:25:38
Subject: Re: Thank you your help and one more question.

HI,

How do you want to combine the results?
It looks like the 5 datasets are list elements.

If I take the first three list elements,
imput1_2_3<-list(imp1=structure(list(ID = c("HM001",
"HM001", "HM001", "HM001", "HM001",
"HM001", "HM001", "HM001", "HM001",
"HM001", "HM001", "HM001",
"HM001", "HM001", "HM001"), CTIME = 1223:1237,
WEIGHT = c(24.9,
25.2, 25.5, 25.24132, 25.7, 27.1, 27.3, 27.4, 28.4, 29.2, 30.1377, 
31.17251, 32.4, 33.7, 34.3)), .Names = c("ID", "CTIME",
"WEIGHT"
), class = "data.frame", row.names = c("1", "2",
"3", "4", "5",
"6", "7", "8", "9", "10",
"11", "12", "13", "14",
"15")),
imp2=structure(list(ID = c("HM001", "HM001",
"HM001", "HM001", "HM001",
"HM001", "HM001", "HM001", "HM001",
"HM001", "HM001", "HM001",
"HM001", "HM001", "HM001"), CTIME = 1223:1237,
WEIGHT = c(24.9,
25.2, 25.5, 25.54828, 25.7, 27.1, 27.3, 27.4, 28.4, 29.2, 29.8977, 
31.35045, 32.4, 33.7, 34.3)), .Names = c("ID", "CTIME",
"WEIGHT"
), class = "data.frame", row.names = c("1", "2",
"3", "4", "5",
"6", "7", "8", "9", "10",
"11", "12", "13", "14",
"15")),
imp3=structure(list(ID = c("HM001", "HM001",
"HM001", "HM001", "HM001",
"HM001", "HM001", "HM001", "HM001",
"HM001", "HM001", "HM001",
"HM001", "HM001", "HM001"), CTIME = 1223:1237,
WEIGHT = c(24.9,
25.2, 25.5, 25.46838, 25.7, 27.1, 27.3, 27.4, 28.4, 29.2, 30.88185, 
31.57952, 32.4, 33.7, 34.3)), .Names = c("ID", "CTIME",
"WEIGHT"
), class = "data.frame", row.names = c("1", "2",
"3", "4", "5",
"6", "7", "8", "9", "10",
"11", "12", "13", "14",
"15")))
#It could be combined by:
do.call(rbind, imput1_2_3) # But if you do this, the total number of rows will be
# the sum of the number of rows of each dataset.

I guess you want something like this:

res <- Reduce(function(...) merge(..., by=c("ID","CTIME")), imput1_2_3)
names(res)[3:5] <- paste("WEIGHT","IMP",1:3,sep="")
res
#      ID CTIME WEIGHTIMP1 WEIGHTIMP2 WEIGHTIMP3
#1  HM001  1223   24.90000   24.90000   24.90000
#2  HM001  1224   25.20000   25.20000   25.20000
#3  HM001  1225   25.50000   25.50000   25.50000
#4  HM001  1226   25.24132   25.54828   25.46838
#5  HM001  1227   25.70000   25.70000   25.70000
#6  HM001  1228   27.10000   27.10000   27.10000
#7  HM001  1229   27.30000   27.30000   27.30000
#8  HM001  1230   27.40000   27.40000   27.40000
#9  HM001  1231   28.40000   28.40000   28.40000
#10 HM001  1232   29.20000   29.20000   29.20000
#11 HM001  1233   30.13770   29.89770   30.88185
#12 HM001  1234   31.17251   31.35045   31.57952
#13 HM001  1235   32.40000   32.40000   32.40000
#14 HM001  1236   33.70000   33.70000   33.70000
#15 HM001  1237   34.30000   34.30000   34.30000
A.K.
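
If a single WEIGHT column is wanted from a wide table of this shape, one possible follow-up is to average across the imputation columns with rowMeans(). This is only a sketch on a two-row toy table; `wide` and `imp_cols` are hypothetical names, and the values are illustrative rather than taken from the output above.

```r
# Small wide table shaped like the merged result (illustrative values only).
wide <- data.frame(
  ID = "HM001", CTIME = 1226:1227,
  WEIGHTIMP1 = c(25.2, 25.7),
  WEIGHTIMP2 = c(25.5, 25.7),
  WEIGHTIMP3 = c(25.6, 25.7)
)

# Pick the imputation columns by name and average them row by row.
imp_cols <- grep("^WEIGHTIMP", names(wide))
wide$WEIGHT <- rowMeans(wide[, imp_cols])

wide[, c("ID", "CTIME", "WEIGHT")]
```

Selecting the columns by name pattern rather than by position keeps the code working if the number of imputations changes.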







________________________________
From: ??? <jamansymptom at naver.com>
To: arun <smartpink111 at yahoo.com>
Sent: Monday, January 28, 2013 7:35 PM
Subject: Thank you your help and one more question.

I deeply appreciate your help. To answer your question, I am a software engineer,
and I am developing a system that accumulates data to draw charts and tables.
For higher performance, I have to deal with missing-value treatment. So, I use the
Amelia package. Below is the result of following your answer.
----------------------------------------------------------------
> temp2    # original data
      ID CTIME WEIGHT
1  HM001  1223   24.9
2  HM001  1224   25.2
3  HM001  1225   25.5
4  HM001  1226     NA
5  HM001  1227   25.7
6  HM001  1228   27.1
7  HM001  1229   27.3
8  HM001  1230   27.4
9  HM001  1231   28.4
10 HM001  1232   29.2
11 HM001  1233 1221.0
12 HM001  1234     NA
13 HM001  1235   32.4
14 HM001  1236   33.7
15 HM001  1237   34.3
> temp2$WEIGHT <- ifelse(temp2$WEIGHT>50, NA, temp2$WEIGHT)
> temp2    # after eliminating the strange value
      ID CTIME WEIGHT
1  HM001  1223   24.9
2  HM001  1224   25.2
3  HM001  1225   25.5
4  HM001  1226     NA
5  HM001  1227   25.7
6  HM001  1228   27.1
7  HM001  1229   27.3
8  HM001  1230   27.4
9  HM001  1231   28.4
10 HM001  1232   29.2
11 HM001  1233     NA
12 HM001  1234     NA
13 HM001  1235   32.4
14 HM001  1236   33.7
15 HM001  1237   34.3
--------------------------------------------------------------
I have one more question. Below are the code and results.
--------------------------------------------------------------
> a.out2 <- amelia(temp2, m=5, ts="CTIME", cs="ID", polytime=1)
-- Imputation 1 --
 1  2  3  4
-- Imputation 2 --
 1  2  3
-- Imputation 3 --
 1  2  3  4
-- Imputation 4 --
 1  2  3
-- Imputation 5 --
 1  2  3
> a.out2$imputations$imp1
      ID CTIME   WEIGHT
1  HM001  1223 24.90000
2  HM001  1224 25.20000
3  HM001  1225 25.50000
4  HM001  1226 25.24132
5  HM001  1227 25.70000
6  HM001  1228 27.10000
7  HM001  1229 27.30000
8  HM001  1230 27.40000
9  HM001  1231 28.40000
10 HM001  1232 29.20000
11 HM001  1233 30.13770
12 HM001  1234 31.17251
13 HM001  1235 32.40000
14 HM001  1236 33.70000
15 HM001  1237 34.30000
$imp2
      ID CTIME   WEIGHT
1  HM001  1223 24.90000
2  HM001  1224 25.20000
3  HM001  1225 25.50000
4  HM001  1226 25.54828
5  HM001  1227 25.70000
6  HM001  1228 27.10000
7  HM001  1229 27.30000
8  HM001  1230 27.40000
9  HM001  1231 28.40000
10 HM001  1232 29.20000
11 HM001  1233 29.89770
12 HM001  1234 31.35045
13 HM001  1235 32.40000
14 HM001  1236 33.70000
15 HM001  1237 34.30000
$imp3
      ID CTIME   WEIGHT
1  HM001  1223 24.90000
2  HM001  1224 25.20000
3  HM001  1225 25.50000
4  HM001  1226 25.46838
5  HM001  1227 25.70000
6  HM001  1228 27.10000
7  HM001  1229 27.30000
8  HM001  1230 27.40000
9  HM001  1231 28.40000
10 HM001  1232 29.20000
11 HM001  1233 30.88185
12 HM001  1234 31.57952
13 HM001  1235 32.40000
14 HM001  1236 33.70000
15 HM001  1237 34.30000
$imp4
      ID CTIME   WEIGHT
1  HM001  1223 24.90000
2  HM001  1224 25.20000
3  HM001  1225 25.50000
4  HM001  1226 25.86703
5  HM001  1227 25.70000
6  HM001  1228 27.10000
7  HM001  1229 27.30000
8  HM001  1230 27.40000
9  HM001  1231 28.40000
10 HM001  1232 29.20000
11 HM001  1233 30.61241
12 HM001  1234 30.17042
13 HM001  1235 32.40000
14 HM001  1236 33.70000
15 HM001  1237 34.30000
$imp5
      ID CTIME   WEIGHT
1  HM001  1223 24.90000
2  HM001  1224 25.20000
3  HM001  1225 25.50000
4  HM001  1226 26.05747
5  HM001  1227 25.70000
6  HM001  1228 27.10000
7  HM001  1229 27.30000
8  HM001  1230 27.40000
9  HM001  1231 28.40000
10 HM001  1232 29.20000
11 HM001  1233 31.03894
12 HM001  1234 30.90960
13 HM001  1235 32.40000
14 HM001  1236 33.70000
15 HM001  1237 34.30000
----------------------------------------
I got 5 datasets including imputed values. But what I want is not five datasets,
only one dataset that combines those 5 imputed datasets.
I want to combine $imp1, $imp2 ... $imp5 to get a final result set. This result set
is also a matrix of the same size (15 rows x 3 columns).
Would you help me once more, please?


arun

2013-Jan-29 14:28 UTC

[R] I succeed to get result dataset.

HI,

temp <- read.table(text="
 ID        CTIME   ACTIVE_KWH REACTIVE_KWH
1  HM001 201212121301 1201.9 1115.5
2  HM001 201212121302 1202.2 1115.8
3  HM001 201212121303 1202.8 1115.8
4  HM001 201212121304     NA 1116.1
5  HM001 201212121305 1203.9 1116.7
6  HM001 201212121306     NA 1116.7
7  HM001 201212121307     NA 1116.7
8  HM001 201212121308   12.0   31.0
9  HM001 201212121309 1206.0 1118.2
10 HM001 201212121310 1206.3 1118.6
11 HM001 201212121311 1206.5 1118.8
12 HM001 201212121312     NA     NA
13 HM001 201212121313 1207.3     NA
14 HM001 201212121314 1207.9 1121.1
15 HM001 201212121315 1208.4 1121.3
",sep="",header=TRUE,stringsAsFactors=FALSE)

#Here, I assume that you consider <1000 as low values. You can change it
#accordingly.
temp[,3:4][temp[,3]<1000 & !is.na(temp[,3]),] <- NA
temp
#      ID        CTIME ACTIVE_KWH REACTIVE_KWH
#1  HM001 201212121301     1201.9       1115.5
#2  HM001 201212121302     1202.2       1115.8
#3  HM001 201212121303     1202.8       1115.8
#4  HM001 201212121304         NA       1116.1
#5  HM001 201212121305     1203.9       1116.7
#6  HM001 201212121306         NA       1116.7
#7  HM001 201212121307         NA       1116.7
#8  HM001 201212121308         NA           NA
#9  HM001 201212121309     1206.0       1118.2
#10 HM001 201212121310     1206.3       1118.6
#11 HM001 201212121311     1206.5       1118.8
#12 HM001 201212121312         NA           NA
#13 HM001 201212121313     1207.3           NA
#14 HM001 201212121314     1207.9       1121.1
#15 HM001 201212121315     1208.4       1121.3


#Suppose your dataset is like this:
temp1 <- read.table(text="
 ID        CTIME   ACTIVE_KWH REACTIVE_KWH
1  HM001 201212121301 1201.9 1115.5
2  HM001 201212121302 1202.2 1115.8
3  HM001 201212121303 1202.8 1115.8
4  HM001 201212121304     NA 1116.1
5  HM001 201212121305 1203.9 1116.7
6  HM001 201212121306     NA 1116.7
7  HM001 201212121307     NA 1116.7
8  HM001 201212121308   12.0   31.0
9  HM001 201212121309 1206.0 1118.2
10 HM001 201212121310 21.0 1118.6
11 HM001 201212121311 1206.5 1118.8
12 HM001 201212121312     NA     NA
13 HM001 201212121313 1207.3     NA
14 HM001 201212121314 1207.9 1121.1
15 HM001 201212121315 1208.4 22.0
",sep="",header=TRUE,stringsAsFactors=FALSE)
#The NA check must look at temp1 itself (not temp), so each column is tested on its own values
temp1[,3][temp1[,3]<1000 & !is.na(temp1[,3])] <- NA
temp1[,4][temp1[,4]<1000 & !is.na(temp1[,4])] <- NA

Hope it helps.

A.K.
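
The per-column threshold step can also be wrapped in a small function so the same rule is applied to several columns at once. This is a sketch only: `na_below()` is a hypothetical helper, the limit of 1000 is the assumption made in the message above, and the data are a shortened toy version of the table.

```r
# Hypothetical helper: set readings below `limit` to NA, leaving existing NAs alone.
na_below <- function(x, limit) {
  x[!is.na(x) & x < limit] <- NA
  x
}

# Shortened toy data with the same column names as in the thread.
dat <- data.frame(
  ACTIVE_KWH   = c(1201.9, NA, 12.0, 1206.0, 21.0),
  REACTIVE_KWH = c(1115.5, 1116.1, 31.0, NA, 22.0)
)

# Apply the same rule to each measurement column independently.
cols <- c("ACTIVE_KWH", "REACTIVE_KWH")
dat[cols] <- lapply(dat[cols], na_below, limit = 1000)
dat
```

Testing `!is.na(x)` on the same vector that is being modified avoids mixing up two datasets when the code is copied from one to another.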






________________________________
From: ??? <jamansymptom at naver.com>
To: arun <smartpink111 at yahoo.com> 
Sent: Tuesday, January 29, 2013 3:36 AM
Subject: Re: I succeed to get result dataset.


Arun ~ I am having difficulty using R again.
The dataset 'temp' contains NA and strange values (like row 8: 12.0, 31.0,
which are out of the range of values).

**What I want is to set the strange values to NA.**
Then I'll impute the dataset 'temp' by myself.
Since 'WIDTH' and 'HEIGHT' can never decrease,
I defined a procedure like the one below.
> for(i in 2:m){
  ex$WIDTH[i] <- ifelse(ex$WIDTH[i] - ex$WIDTH[i-1] < 0, NA, ex$WIDTH[i])
  ex$HEIGHT[i] <- ifelse(ex$HEIGHT[i] - ex$HEIGHT[i-1] < 0, NA, ex$HEIGHT[i])
}

But the result is wrong. Do you have a better idea for defining a procedure that works well?

There is a dataset named 'temp'.

      ID        CTIME   ACTIVE_KWH REACTIVE_KWH
1  HM001 201212121301 1201.9 1115.5
2  HM001 201212121302 1202.2 1115.8
3  HM001 201212121303 1202.8 1115.8
4  HM001 201212121304     NA 1116.1
5  HM001 201212121305 1203.9 1116.7
6  HM001 201212121306     NA 1116.7
7  HM001 201212121307     NA 1116.7
8  HM001 201212121308   12.0   31.0
9  HM001 201212121309 1206.0 1118.2
10 HM001 201212121310 1206.3 1118.6
11 HM001 201212121311 1206.5 1118.8
12 HM001 201212121312     NA     NA
13 HM001 201212121313 1207.3     NA
14 HM001 201212121314 1207.9 1121.1
15 HM001 201212121315 1208.4 1121.3
> m <- 15
> for(i in 2:m){
  temp$ACTIVE_KWH[i] <- ifelse(temp$ACTIVE_KWH[i] - temp$ACTIVE_KWH[i-1] < 0, NA, temp$ACTIVE_KWH[i])
  temp$REACTIVE_KWH[i] <- ifelse(temp$REACTIVE_KWH[i] - temp$REACTIVE_KWH[i-1] < 0, NA, temp$REACTIVE_KWH[i])
}

**result of for statement**

   ID        CTIME ACTIVE_KWH REACTIVE_KWH
1  HM001 201212121301     1201.9       1115.5
2  HM001 201212121302     1202.2       1115.8
3  HM001 201212121303     1202.8       1115.8
4  HM001 201212121304         NA       1116.1
5  HM001 201212121305         NA       1116.7
6  HM001 201212121306         NA       1116.7
7  HM001 201212121307         NA       1116.7
8  HM001 201212121308         NA           NA
9  HM001 201212121309         NA           NA
10 HM001 201212121310         NA           NA
11 HM001 201212121311         NA           NA
12 HM001 201212121312         NA           NA
13 HM001 201212121313         NA           NA
14 HM001 201212121314         NA           NA
15 HM001 201212121315         NA           NA

**What I expect (row 8 WIDTH=NA, HEIGHT=NA)**
ID        CTIME  WIDTH HEIGHT
1  HM001 201212121301 1201.9 1115.5
2  HM001 201212121302 1202.2 1115.8
3  HM001 201212121303 1202.8 1115.8
4  HM001 201212121304     NA 1116.1
5  HM001 201212121305 1203.9 1116.7
6  HM001 201212121306     NA 1116.7
7  HM001 201212121307     NA 1116.7
8  HM001 201212121308     NA     NA
9  HM001 201212121309 1206.0 1118.2
10 HM001 201212121310 1206.3 1118.6
11 HM001 201212121311 1206.5 1118.8
12 HM001 201212121312     NA     NA
13 HM001 201212121313 1207.3     NA
14 HM001 201212121314 1207.9 1121.1
15 HM001 201212121315 1208.4 1121.3
-----Original Message-----
From: "arun"<smartpink111 at yahoo.com> 
To: "???"<jamansymptom at naver.com>; 
Cc: 
Sent: 2013-01-29 (Tue) 15:23:56
Subject: Re: I succeed to get result dataset.

HI,

I am glad that it got fixed.

You can ask for help.
Thank you for the kind words.
Good night!
Arun

arun

2013-Jan-30 01:37 UTC

[R] I think you misunderstood my explantation.

Hi,
Sorry, I didn't check your code previously.

I hope this works for you (especially the <0).
Using the first dataset temp:
temp$ACTIVE_KWH[!is.na(temp$ACTIVE_KWH)][c(FALSE,diff(temp$ACTIVE_KWH[!is.na(temp$ACTIVE_KWH)])<0)] <- NA
temp$REACTIVE_KWH[!is.na(temp$REACTIVE_KWH)][c(FALSE,diff(temp$REACTIVE_KWH[!is.na(temp$REACTIVE_KWH)])<0)] <- NA
temp
#      ID        CTIME ACTIVE_KWH REACTIVE_KWH
#1  HM001 201212121301     1201.9       1115.5
#2  HM001 201212121302     1202.2       1115.8
#3  HM001 201212121303     1202.8       1115.8
#4  HM001 201212121304         NA       1116.1
#5  HM001 201212121305     1203.9       1116.7
#6  HM001 201212121306         NA       1116.7
#7  HM001 201212121307         NA       1116.7
#8  HM001 201212121308         NA           NA
#9  HM001 201212121309     1206.0       1118.2
#10 HM001 201212121310     1206.3       1118.6
#11 HM001 201212121311     1206.5       1118.8
#12 HM001 201212121312         NA           NA
#13 HM001 201212121313     1207.3           NA
#14 HM001 201212121314     1207.9       1121.1
#15 HM001 201212121315     1208.4       1121.3

#Similarly with the second dataset:
temp1$ACTIVE_KWH[!is.na(temp1$ACTIVE_KWH)][c(FALSE,diff(temp1$ACTIVE_KWH[!is.na(temp1$ACTIVE_KWH)])<0)] <- NA
temp1$REACTIVE_KWH[!is.na(temp1$REACTIVE_KWH)][c(FALSE,diff(temp1$REACTIVE_KWH[!is.na(temp1$REACTIVE_KWH)])<0)] <- NA


A.K.
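
Because the one-liners above are hard to read, the same diff()-based idea can be sketched as a small function. `na_if_drop()` is a hypothetical name, and the vector is a shortened toy version of ACTIVE_KWH, not the full dataset.

```r
# Hypothetical helper wrapping the diff()-based approach:
# blank any reading that is lower than the previous non-missing reading.
na_if_drop <- function(x) {
  ok <- !is.na(x)                     # positions holding real readings
  drop <- c(FALSE, diff(x[ok]) < 0)   # TRUE where a reading fell below its predecessor
  x[which(ok)[drop]] <- NA            # map back to positions in the full vector
  x
}

kwh <- c(1201.9, 1202.2, NA, 1203.9, NA, 12.0, 1206.0, 1206.3)
cleaned <- na_if_drop(kwh)
cleaned
```

One limitation worth keeping in mind: the check only looks for decreases, so a reading that is spuriously too high would pass, and the correct reading after it would then be flagged instead.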






________________________________
From: ??? <jamansymptom at naver.com>
To: arun <smartpink111 at yahoo.com> 
Sent: Tuesday, January 29, 2013 7:42 PM
Subject: I think you misunderstood my explantation.


Hi,

Assume that the first CTIME value is '201201010000'. It
means ACTIVE_KWH has been measured from '201201010000' to the present.
See the example row below.

1  HM001 201212121301 1201.9 1115.5

Row 1's ACTIVE_KWH is the accumulated value measured from
'201201010000' to '201212121301'.
When CTIME is '201212121301', ACTIVE_KWH is '1201.9'. And
when CTIME is '201212121302', ACTIVE_KWH is '1202.2'.
It means that 0.3 was measured during 1 minute, and ACTIVE_KWH is an
accumulated value.
Thus, ACTIVE_KWH must increase as CTIME increases.
Do you get it? So, I have to define a strange value by a difference such as
(temp$ACTIVE_KWH[i] - temp$ACTIVE_KWH[i-1]) > 50. The '50' can be
changed.
---------------------------------------------------------------------
> for(i in 2:m){
  temp$ACTIVE_KWH[i] <- ifelse(temp$ACTIVE_KWH[i] - temp$ACTIVE_KWH[i-1] < 0, NA, temp$ACTIVE_KWH[i])
}
----------------------------------------------------------------------
But in this case a critical error occurred. If temp$ACTIVE_KWH[3] is NA,
the subsequent data (temp$ACTIVE_KWH[4], [5], [6]...) are all set to NA.
My last mail contains the detailed source code and result.
Can you recommend a better idea to avoid filling the dataset with successive NAs?
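
The run of NAs comes from comparing against a missing neighbour: `x[i] - x[i-1]` is NA when `x[i-1]` is NA, `ifelse()` returns NA for an NA condition, and the loop then carries that NA forward. A minimal loop-style sketch that instead compares each reading with the last accepted one is shown below. `flag_jumps()` and `max_step` are hypothetical names, the default of 50 is the adjustable bound mentioned above, and the sketch assumes the first non-missing reading can be trusted.

```r
# Hypothetical helper: loop version that remembers the last accepted reading.
# `max_step` is an assumed upper bound on the increase between accepted readings.
flag_jumps <- function(x, max_step = 50) {
  last <- NA_real_
  for (i in seq_along(x)) {
    if (is.na(x[i])) next            # already missing: skip it, keep `last`
    if (is.na(last)) {               # first real reading becomes the baseline
      last <- x[i]
      next
    }
    step <- x[i] - last
    if (step < 0 || step > max_step) {
      x[i] <- NA                     # strange value: blank it, keep `last`
    } else {
      last <- x[i]                   # accepted: becomes the new reference
    }
  }
  x
}

active <- c(1201.9, 1202.2, 1202.8, NA, 1203.9, NA, NA, 12.0, 1206.0, 1206.3)
checked <- flag_jumps(active)
checked
```

Since the comparison is against the last accepted reading, a long stretch of missing values can make a legitimate increase exceed `max_step`, so the bound may need to scale with the time gap in real data.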
arun

2013-Jan-30 11:53 UTC

[R] I think you misunderstood my explantation.

Hi,

Your dataset already had some missing values.  So, I need to subset only those
rows that are not missing.
!is.na(temp$ACTIVE_KWH)
# [1]  TRUE  TRUE  TRUE FALSE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE FALSE
#[13]  TRUE  TRUE  TRUE
temp$ACTIVE_KWH[!is.na(temp$ACTIVE_KWH)]
# [1] 1201.9 1202.2 1202.8 1203.9   12.0 1206.0 1206.3 1206.5 1207.3 1207.9
#[11] 1208.4

?diff() will get the differences between successive values
diff(temp$ACTIVE_KWH[!is.na(temp$ACTIVE_KWH)])
# [1]     0.3     0.6     1.1 -1191.9  1194.0     0.3     0.2     0.8     0.6
#[10]     0.5

#Here, the length is 1 less than in the previous case, as the first value is
#removed.
diff(temp$ACTIVE_KWH[!is.na(temp$ACTIVE_KWH)])<0
# [1] FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE

#Added `FALSE` at the beginning to make the length equal to the subset data
indx <- c(FALSE,diff(temp$ACTIVE_KWH[!is.na(temp$ACTIVE_KWH)])<0)
indx
# [1] FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE

#Using this index, further subset the already subset data for differences of
#values <0
temp$ACTIVE_KWH[!is.na(temp$ACTIVE_KWH)][indx]
#[1] 12
temp$ACTIVE_KWH[!is.na(temp$ACTIVE_KWH)][indx] <- NA #changed to NA

#Similarly for REACTIVE_KWH
Hope this helps.
A.K.









________________________________
From: ??? <jamansymptom at naver.com>
To: arun <smartpink111 at yahoo.com> 
Sent: Wednesday, January 30, 2013 12:51 AM
Subject: Re: I think you misunderstood my explantation.


Oh, I forgot to ask about this code.
Can you explain what it means?

Using the first dataset temp:
temp$ACTIVE_KWH[!is.na(temp$ACTIVE_KWH)][c(FALSE,diff(temp$ACTIVE_KWH[!is.na(temp$ACTIVE_KWH)])<0)] <- NA
temp$REACTIVE_KWH[!is.na(temp$REACTIVE_KWH)][c(FALSE,diff(temp$REACTIVE_KWH[!is.na(temp$REACTIVE_KWH)])<0)] <- NA
-----Original Message-----
From: "arun"<smartpink111 at yahoo.com> 
To: "???"<jamansymptom at naver.com>; 
Cc: "R help"<r-help at r-project.org>; 
Sent: 2013-01-30 (?) 10:37:18
Subject: Re: I think you misunderstood my explantation.

Hi,
Sorry, I didn't check your codes previously.

I hope this works for you (especially the <0).
Using the first dataset temp:
temp$ACTIVE_KWH[!is.na(temp$ACTIVE_KWH)][c(FALSE,diff(temp$ACTIVE_KWH[!is.na(temp$ACTIVE_KWH)])<
0)]<-NA
temp$REACTIVE_KWH[!is.na(temp$REACTIVE_KWH)][c(FALSE,diff(temp$REACTIVE_KWH[!is.na(temp$REACTIVE_KWH)])<
0)]<-NA
temp
#????? ID??????? CTIME ACTIVE_KWH REACTIVE_KWH
#1? HM001 201212121301???? 1201.9?????? 1115.5
#2? HM001 201212121302???? 1202.2?????? 1115.8
#3? HM001 201212121303???? 1202.8?????? 1115.8
#4? HM001 201212121304???????? NA?????? 1116.1
#5? HM001 201212121305???? 1203.9?????? 1116.7
#6? HM001 201212121306???????? NA?????? 1116.7
#7? HM001 201212121307???????? NA?????? 1116.7
#8? HM001 201212121308???????? NA?????????? NA
#9? HM001 201212121309???? 1206.0?????? 1118.2
#10 HM001 201212121310???? 1206.3?????? 1118.6
#11 HM001 201212121311???? 1206.5?????? 1118.8
#12 HM001 201212121312???????? NA?????????? NA
#13 HM001 201212121313???? 1207.3?????????? NA
#14 HM001 201212121314???? 1207.9?????? 1121.1
#15 HM001 201212121315???? 1208.4?????? 1121.3
temp1$ACTIVE_KWH[!is.na(temp1$ACTIVE_KWH)][c(FALSE,diff(temp1$ACTIVE_KWH[!is.na(temp1$ACTIVE_KWH)])<
0)]<-NA

#Similarly with the second dataset:
temp1$ACTIVE_KWH[!is.na(temp1$ACTIVE_KWH)][c(FALSE,diff(temp1$ACTIVE_KWH[!is.na(temp1$ACTIVE_KWH)])<
0)]<-NA
temp1$REACTIVE_KWH[!is.na(temp1$REACTIVE_KWH)][c(FALSE,diff(temp1$REACTIVE_KWH[!is.na(temp1$REACTIVE_KWH)])<
0)]<-NA


A.K.






________________________________
From: ??? <jamansymptom>@naver.com>
To: arun <smartpink111>@yahoo.com> 
Sent: Tuesday, January 29, 2013 7:42 PM
Subject: I think you misunderstood my explantation.


Hi,

Assume that first CTIME value is '201201010000'. It
means?ACTIVE_KWH?measured from??'201201010000' to present.
show example below row.

1? HM001 201212121301 1201.9 1115.5

1 row's? ACTIVE_KWH?? accumulated?value that measured from
'201201010000' to '201212121301'.
when CTIME is '201212121301',??ACTIVE_KWH? is '1201.9'.? And,
when? CTIME is? '201212121302', ACTIVE_KWH? is?'1202.2'.
It?means that?0.3 is measured?during 1 minute.? And??ACTIVE_KWH? is a
accumulated value.
Thus, ACTIVE_KWH? must increase, as CTIME? increases.
You got it?? So, I have to define strange value?as subtraction?value like
(?temp$ACTIVE_KWH[i] -??temp$ACTIVE_KWH[i-1]) > 50). '50' can be
chagned.
---------------------------------------------------------------------> for(i in 2:m){?temp$ACTIVE_KWH[i]<- ifelse(temp$ACTIVE_KWH[i]-
temp$ACTIVE_KWH[i-1]<0,NA, temp$ACTIVE_KWH[i])
}
----------------------------------------------------------------------
But, in this case, ?critical error occured.?If??temp$ACTIVE_KWH[3]?is NA,
posterior data (temp$ACTIVE_KWH[4], [5], [6]...) ?is imputed as NA.
Last mail contains Detailed source code and result. 
Can you recommend better idea to avoid imputed dataset as a successive NA. 
-----Original Message-----
From: "arun"<smartpink111>@yahoo.com> 
To: "???"<jamansymptom>@naver.com>; 
Cc: "R help"<r-help>@r-project.org>; 
Sent: 2013-01-29 (?) 23:28:30
Subject: Re: I succeed to get result dataset.

HI,

temp<-read.table(text="
?ID??????? CTIME?? ACTIVE_KWH REACTIVE_KWH
1? HM001 201212121301 1201.9 1115.5
2? HM001 201212121302 1202.2 1115.8
3? HM001 201212121303 1202.8 1115.8
4? HM001 201212121304???? NA 1116.1
5? HM001 201212121305 1203.9 1116.7
6? HM001 201212121306???? NA 1116.7
7? HM001 201212121307???? NA 1116.7
8? HM001 201212121308?? 12.0?? 31.0
9? HM001 201212121309 1206.0 1118.2
10 HM001 201212121310 1206.3 1118.6
11 HM001 201212121311 1206.5 1118.8
12 HM001 201212121312???? NA???? NA
13 HM001 201212121313 1207.3???? NA
14 HM001 201212121314 1207.9 1121.1
15 HM001 201212121315 1208.4 1121.3
",sep="",header=TRUE,stringsAsFactors=F)

#Here, I assume that you consider <1000 as low values, You can change it
accordingly.
?temp[,3:4][temp[,3]<1000& !is.na(temp[,3]),]<-NA
?temp
#      ID        CTIME ACTIVE_KWH REACTIVE_KWH
#1  HM001 201212121301     1201.9       1115.5
#2  HM001 201212121302     1202.2       1115.8
#3  HM001 201212121303     1202.8       1115.8
#4  HM001 201212121304         NA       1116.1
#5  HM001 201212121305     1203.9       1116.7
#6  HM001 201212121306         NA       1116.7
#7  HM001 201212121307         NA       1116.7
#8  HM001 201212121308         NA           NA
#9  HM001 201212121309     1206.0       1118.2
#10 HM001 201212121310     1206.3       1118.6
#11 HM001 201212121311     1206.5       1118.8
#12 HM001 201212121312         NA           NA
#13 HM001 201212121313     1207.3           NA
#14 HM001 201212121314     1207.9       1121.1
#15 HM001 201212121315     1208.4       1121.3


#Suppose your dataset is like this:
temp1<-read.table(text="
 ID        CTIME   ACTIVE_KWH REACTIVE_KWH
1  HM001 201212121301 1201.9 1115.5
2  HM001 201212121302 1202.2 1115.8
3  HM001 201212121303 1202.8 1115.8
4  HM001 201212121304     NA 1116.1
5  HM001 201212121305 1203.9 1116.7
6  HM001 201212121306     NA 1116.7
7  HM001 201212121307     NA 1116.7
8  HM001 201212121308   12.0   31.0
9  HM001 201212121309 1206.0 1118.2
10 HM001 201212121310   21.0 1118.6
11 HM001 201212121311 1206.5 1118.8
12 HM001 201212121312     NA     NA
13 HM001 201212121313 1207.3     NA
14 HM001 201212121314 1207.9 1121.1
15 HM001 201212121315 1208.4   22.0
",sep="",header=TRUE,stringsAsFactors=FALSE)
#Test each column against itself (is.na() must be checked on temp1, not temp)
temp1[,3][temp1[,3]<1000 & !is.na(temp1[,3])]<-NA
temp1[,4][temp1[,4]<1000 & !is.na(temp1[,4])]<-NA
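If more columns need the same treatment, the per-column threshold can be written once and applied with lapply(). This is only a sketch: the cut-off of 1000 is the assumed limit from the reply above, the names d, cols and low_limit are illustrative, and the data frame holds just a few of the temp1 values (rows 8-10 and 13-15) to keep the example short.

```r
# A few ACTIVE_KWH / REACTIVE_KWH values taken from temp1 (rows 8-10, 13-15)
d <- data.frame(
  ACTIVE_KWH   = c(12.0, 1206.0, 21.0, 1207.3, 1207.9, 1208.4),
  REACTIVE_KWH = c(31.0, 1118.2, 1118.6, NA, 1121.1, 22.0)
)

low_limit <- 1000                          # assumed cut-off; adjust as needed
cols <- c("ACTIVE_KWH", "REACTIVE_KWH")    # columns to clean

# Replace low, non-missing values with NA, one column at a time
d[cols] <- lapply(d[cols], function(x) replace(x, !is.na(x) & x < low_limit, NA))
d
```

Each column is tested against its own values, so a low reading in one column does not blank the other column in the same row.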

Hope it helps.

A.K.






________________________________
From: ??? <jamansymptom at naver.com>
To: arun <smartpink111 at yahoo.com>
Sent: Tuesday, January 29, 2013 3:36 AM
Subject: Re: I succeed to get result dataset.


Arun, I am having difficulty using R again.
The dataset 'temp' contains NA and strange values (like row 8: 12.0 and 31.0,
which are out of the value range).

**What I want is to set the strange values to NA.**
Then I'll impute dataset 'temp' by myself.
Since 'WIDTH' and 'HEIGHT' can never decrease,
I defined a procedure like the one below.
> for(i in 2:m){ ex$WIDTH[i]<- ifelse(ex$WIDTH[i]- ex$WIDTH[i-1]<0,NA, ex$WIDTH[i])
 ex$HEIGHT[i]<- ifelse(ex$HEIGHT[i]- ex$HEIGHT[i-1]<0,NA, ex$HEIGHT[i])
}

But the result is wrong. Do you have a better idea for a procedure that works well?

There is a dataset named 'temp'.

      ID        CTIME ACTIVE_KWH REACTIVE_KWH
1  HM001 201212121301     1201.9       1115.5
2  HM001 201212121302     1202.2       1115.8
3  HM001 201212121303     1202.8       1115.8
4  HM001 201212121304         NA       1116.1
5  HM001 201212121305     1203.9       1116.7
6  HM001 201212121306         NA       1116.7
7  HM001 201212121307         NA       1116.7
8  HM001 201212121308       12.0         31.0
9  HM001 201212121309     1206.0       1118.2
10 HM001 201212121310     1206.3       1118.6
11 HM001 201212121311     1206.5       1118.8
12 HM001 201212121312         NA           NA
13 HM001 201212121313     1207.3           NA
14 HM001 201212121314     1207.9       1121.1
15 HM001 201212121315     1208.4       1121.3

> m<- 15
> for(i in 2:m){
temp$ACTIVE_KWH[i]<- ifelse(temp$ACTIVE_KWH[i]-
temp$ACTIVE_KWH[i-1]<0,NA, temp$ACTIVE_KWH[i])
temp$REACTIVE_KWH[i]<- ifelse(temp$REACTIVE_KWH[i]-
temp$REACTIVE_KWH[i-1]<0,NA, temp$REACTIVE_KWH[i])
}

**Result of the for statement**

      ID        CTIME ACTIVE_KWH REACTIVE_KWH
1  HM001 201212121301     1201.9       1115.5
2  HM001 201212121302     1202.2       1115.8
3  HM001 201212121303     1202.8       1115.8
4  HM001 201212121304         NA       1116.1
5  HM001 201212121305         NA       1116.7
6  HM001 201212121306         NA       1116.7
7  HM001 201212121307         NA       1116.7
8  HM001 201212121308         NA           NA
9  HM001 201212121309         NA           NA
10 HM001 201212121310         NA           NA
11 HM001 201212121311         NA           NA
12 HM001 201212121312         NA           NA
13 HM001 201212121313         NA           NA
14 HM001 201212121314         NA           NA
15 HM001 201212121315         NA           NA

**What I expect (row 8: WIDTH=NA, HEIGHT=NA)**

      ID        CTIME  WIDTH HEIGHT
1  HM001 201212121301 1201.9 1115.5
2  HM001 201212121302 1202.2 1115.8
3  HM001 201212121303 1202.8 1115.8
4  HM001 201212121304     NA 1116.1
5  HM001 201212121305 1203.9 1116.7
6  HM001 201212121306     NA 1116.7
7  HM001 201212121307     NA 1116.7
8  HM001 201212121308     NA     NA
9  HM001 201212121309 1206.0 1118.2
10 HM001 201212121310 1206.3 1118.6
11 HM001 201212121311 1206.5 1118.8
12 HM001 201212121312     NA     NA
13 HM001 201212121313 1207.3     NA
14 HM001 201212121314 1207.9 1121.1
15 HM001 201212121315 1208.4 1121.3
-----Original Message-----
From: "arun" <smartpink111 at yahoo.com>
To: "???" <jamansymptom at naver.com>;
Cc:
Sent: 2013-01-29 (Tue) 15:23:56
Subject: Re: I succeed to get result dataset.

HI,

I am glad that it got fixed.

You can ask for help.
Thank you for the kind words.
Good night!
Arun
