#Let's say I have the following data set:
dat3 = data.frame(A_CaseID = c(1881, 1882, 1883, 1884, 1885),
B_MW_EEsDue1 = c(2, 2, 1, 4, 6),
C_MW_EEsDue2 = c(5, 5, 4, 1, 6),
D_MW_EEsDueTotal = c(7, 9, 5, 6, 112))
dat3
#   A_CaseID B_MW_EEsDue1 C_MW_EEsDue2 D_MW_EEsDueTotal
# 1     1881            2            5                7
# 2     1882            2            5                9
# 3     1883            1            4                5
# 4     1884            4            1                6
# 5     1885            6            6              112
# I want to:
# CREATE A NEW 1-COLUMN MATRIX (of unknown #rows) LISTING ONLY "A"'s WHERE "D != B + C"
# THIS COLUMN CAN BE LABELED "MW_EEsDue_ERRORS", and output for this example should be:
#   MW_EEsDue_ERRORS
# 1             1882
# 2             1884
# 3             1885
#What is the best way to do this? Thanks for your time. BNC
--
View this message in context:
http://r.789695.n4.nabble.com/create-new-matrix-from-user-defined-function-tp4671250.html
Sent from the R help mailing list archive at Nabble.com.
Hi,
You could try:
mat1<-matrix(dat3[rowSums(dat3[,2:3])!=dat3[,4],1],ncol=1,dimnames=list(NULL,"MW_EEsDue_ERRORS"))
mat1
#     MW_EEsDue_ERRORS
#[1,]             1882
#[2,]             1884
#[3,]             1885
A.K.
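For readers who find the one-liner above dense, it may help to split it into steps. The following is only a sketch using the `dat3` example from the question; the intermediate names `parts_sum`, `is_error` and `bad_ids` are made up here for illustration and do not appear in the thread.

```r
# Example data from the original question
dat3 <- data.frame(A_CaseID = c(1881, 1882, 1883, 1884, 1885),
                   B_MW_EEsDue1 = c(2, 2, 1, 4, 6),
                   C_MW_EEsDue2 = c(5, 5, 4, 1, 6),
                   D_MW_EEsDueTotal = c(7, 9, 5, 6, 112))

# Step 1: row-wise sum of the two component columns (B + C)
parts_sum <- rowSums(dat3[, 2:3])

# Step 2: logical vector, TRUE where the recorded total differs from B + C
is_error <- parts_sum != dat3[, 4]

# Step 3: keep the case IDs of the flagged rows
bad_ids <- dat3[is_error, 1]

# Step 4: wrap the IDs in a one-column matrix with the requested column name
mat1 <- matrix(bad_ids, ncol = 1,
               dimnames = list(NULL, "MW_EEsDue_ERRORS"))
mat1
```

Because the number of flagged rows is only known after Step 2, the matrix is built last, which is how the "unknown number of rows" requirement is handled.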
Nordlund, Dan (DSHS/RDA)
2013-Jul-10 21:42 UTC
[R] create new matrix from user-defined function
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of bcrombie
> Sent: Wednesday, July 10, 2013 12:19 PM
> To: r-help at r-project.org
> Subject: [R] create new matrix from user-defined function

Here is one option; there are many others. Only you can decide what is "best".

data.frame(MW_EEsDue_ERRORS=dat3[dat3[[4]] != dat3[[2]]+dat3[[3]],][[1]])

Hope this is helpful,
Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204
Hi BNC,
No problem.
You could also use ?with()
data.frame(MW_EEsDue_ERRORS=with(dat3,A_CaseID[D_MW_EEsDueTotal!=rowSums(cbind(B_MW_EEsDue1,C_MW_EEsDue2))]))
#  MW_EEsDue_ERRORS
#1             1882
#2             1884
#3             1885
A.K.
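A side note that is not part of the answers in this thread: `!=` is an exact comparison. That is fine for whole-number counts like the ones in `dat3`, but if the columns held decimal amounts, floating-point rounding could flag rows that are actually consistent. A minimal sketch, assuming made-up decimal data `datDec` and an arbitrarily chosen tolerance `tol` (both hypothetical):

```r
# Hypothetical data with decimal values (not from the original post)
datDec <- data.frame(A_CaseID = c(1, 2, 3),
                     B = c(0.1, 0.5, 1.0),
                     C = c(0.2, 0.25, 2.0),
                     D = c(0.3, 0.75, 3.5))

# Exact comparison: row 1 is flagged because 0.1 + 0.2 is not exactly 0.3
# in double-precision arithmetic
exact_ids <- with(datDec, A_CaseID[D != B + C])

# Tolerant comparison: flag only differences larger than tol (an assumed value)
tol <- 1e-8
tol_ids <- with(datDec, A_CaseID[abs(D - (B + C)) > tol])

exact_ids
tol_ids
```

Whether a tolerance is appropriate, and how large it should be, depends on the units of the real data.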
----- Original Message -----
From: "Crombie, Burnette N" <bcrombie at utk.edu>
To: arun <smartpink111 at yahoo.com>
Cc: R help <r-help at r-project.org>
Sent: Thursday, July 11, 2013 4:40 PM
Subject: RE: [R] create new matrix from user-defined function
You understood me perfectly, and I agree it is easier to index using numbers
than names. I'm just afraid that if my dataset gets too big I'll mess up
which index numbers I'm supposed to be using. "data.table()"
looks very useful and a good way to approach the issue. Thanks. I really
appreciate your (everyone's) help. BNC
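Regarding the worry about losing track of index numbers: one way to keep name-based indexing manageable is to wrap the check in a small function that takes the column names as arguments. This is only a sketch; the function name `find_total_errors` and its argument names are hypothetical and were not proposed in the thread.

```r
# Hypothetical helper: return a one-column matrix of IDs whose total
# column does not equal the row-wise sum of the component columns.
find_total_errors <- function(df, id_col, part_cols, total_col,
                              label = "MW_EEsDue_ERRORS") {
  # Fail early if any column name is misspelled or missing
  stopifnot(all(c(id_col, part_cols, total_col) %in% names(df)))
  # Row-wise sum of the component columns, selected by name
  parts_sum <- rowSums(df[part_cols])
  # [[ ]] extracts the total column as a plain vector
  is_error <- parts_sum != df[[total_col]]
  matrix(df[[id_col]][is_error], ncol = 1,
         dimnames = list(NULL, label))
}

# Example data from the original question
dat3 <- data.frame(A_CaseID = c(1881, 1882, 1883, 1884, 1885),
                   B_MW_EEsDue1 = c(2, 2, 1, 4, 6),
                   C_MW_EEsDue2 = c(5, 5, 4, 1, 6),
                   D_MW_EEsDueTotal = c(7, 9, 5, 6, 112))

errs <- find_total_errors(dat3,
                          id_col = "A_CaseID",
                          part_cols = c("B_MW_EEsDue1", "C_MW_EEsDue2"),
                          total_col = "D_MW_EEsDueTotal")
errs
```

Since the columns are selected by name, adding or reordering columns in a larger data frame should not change which columns are compared.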
-----Original Message-----
From: arun [mailto:smartpink111 at yahoo.com]
Sent: Thursday, July 11, 2013 4:29 PM
To: Crombie, Burnette N
Cc: R help
Subject: Re: [R] create new matrix from user-defined function
Hi,
Not sure I understand you correctly.
I found it easier to index by number than to type out the lengthy column names.
You could do something similar to the one below.
matNew<-matrix(dat3[rowSums(dat3[c("B_MW_EEsDue1","C_MW_EEsDue2")])!=dat3["D_MW_EEsDueTotal"],1],ncol=1,dimnames=list(NULL,"MW_EEsDue_ERRORS"))
matNew
#     MW_EEsDue_ERRORS
#[1,]             1882
#[2,]             1884
#[3,]             1885
If you have a very large dataset, you could also check ?data.table().
library(data.table)
dt3<- data.table(dat3)
dtNew<-subset(dt3[D_MW_EEsDueTotal!=B_MW_EEsDue1+C_MW_EEsDue2],select=1)
dtNew
#   A_CaseID
#1:     1882
#2:     1884
#3:     1885
#Some speed comparisons:
set.seed(1254)
datTest<- data.frame(A=sample(1000:15000,1e7,replace=TRUE),
                     B=sample(1:10,1e7,replace=TRUE),
                     C=sample(5:15,1e7,replace=TRUE),
                     D=sample(5:25,1e7,replace=TRUE))
system.time(res1<- data.frame(MW_EEsDue_ERRORS=datTest[datTest[[4]] != datTest[[2]]+datTest[[3]],][[1]]))
#   user  system elapsed
#  2.256   0.000   2.145
system.time(mat1<-matrix(datTest[rowSums(datTest[,2:3])!=datTest[,4],1],ncol=1,dimnames=list(NULL,"MW_EEsDue_ERRORS")))
#   user  system elapsed
#  0.756   0.088   0.849
system.time(res2<- data.frame(MW_EEsDue_ERRORS=datTest[addmargins(as.matrix(datTest[,2:3]),2)[,3]!=datTest[,4],1]))
#    user  system elapsed
#115.740   0.000 105.778
dtTest<- data.table(datTest)
system.time(res3<- subset(dtTest[D!=B+C],select=1))
#   user  system elapsed
#  0.508   0.000   0.477
identical(res1,res2)
#[1] TRUE
setnames(res3,"A","MW_EEsDue_ERRORS")
identical(res1,as.data.frame(res3))
#[1] TRUE
A.K.
----- Original Message -----
From: bcrombie <bcrombie at utk.edu>
To: r-help at r-project.org
Cc:
Sent: Thursday, July 11, 2013 3:54 PM
Subject: Re: [R] create new matrix from user-defined function
Dan and Arun, thank you very much for your replies. They are both very helpful
and I love to get different versions of an answer so I can learn more R code.
You both used indexing to refer to the columns needed in the function, but since
my real data frame will be much larger I'm assuming I can replace the index
numbers with the names of the columns in quotes instead? I'll try this on
my own if you're busy with other forum questions. Thanks, again.
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.