(1) I have a master data frame that reads

ClientID | date | value

(2) I also have a control data frame that reads

ClientID | Min date | Max date | control parameters

The control data set may not have all client IDs.

I want to use the control data frame on the master data frame to remove client IDs that don't exist in the control data set and, for those that do, remove dates outside the required range.

(3) We can either put the control parameters on all rows corresponding to a client ID or look them up from the control data frame.

(4) The basic function call looks like

do.something(df, control parameters)

where df is the subset of the master data set that corresponds to a single client with unwanted dates removed, and the control parameters pertain to that client.

Any help would be appreciated.
#dummy data:
master=as.data.frame(list(clientId=c(1:4,2), date=1001:1005, value=10001:10005))
control=as.data.frame(list(clientId=c(2,3), mindate=c(100,1005), maxdate=c(1005,1005), control.params=c(1,2)))

#reducing master df:
#generating "TRUE FALSE index":
idIndex=master$clientId %in% control$clientId
#choose only those lines where index==TRUE
master_reduced=master[idIndex,]
master_reduced

#merging dfs:
#(drop the first control column so clientId does not appear twice)
mergingIndex= match(master_reduced$clientId, control$clientId)
master_reduced=cbind(master_reduced, control[mergingIndex,-1])
master_reduced

#finally choose those lines where date is in range
#(>= and <= keep dates that are equal to mindate or maxdate)
dateIndex=master_reduced$date>=master_reduced$mindate & master_reduced$date<=master_reduced$maxdate
finalDF=master_reduced[dateIndex,]
finalDF

Hope this helps
Moritz
_________________________
Moritz Grenke
http://www.360mix.de

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of analyst41 at hotmail.com
Sent: Friday, 21 January 2011 03:02
To: r-help at r-project.org
Subject: [R] data and parameters
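The same three steps (drop unknown client IDs, attach the control columns, filter by date) can probably also be written with base R's merge(). This is only a sketch on the dummy data from this thread, and it assumes the date range is meant to include mindate and maxdate themselves:

```r
# Dummy data as used in this thread
master  <- data.frame(clientId = c(1:4, 2), date = 1001:1005, value = 10001:10005)
control <- data.frame(clientId = c(2, 3), mindate = c(100, 1005),
                      maxdate = c(1005, 1005), control.params = c(1, 2))

# merge() does an inner join by default, so clients that are
# missing from control are dropped and the control columns are attached
merged <- merge(master, control, by = "clientId")

# Keep only rows whose date lies in [mindate, maxdate]
# (assumption: the boundaries count as inside the range)
finalDF <- merged[merged$date >= merged$mindate & merged$date <= merged$maxdate, ]
finalDF
```

On the dummy data this should leave only the two rows for client 2; client 3's single date, 1003, lies outside its range of 1005 to 1005.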
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
try 'sqldf'

> master=as.data.frame(list(clientId=c(1:4,2), date=1001:1005,
+ value=10001:10005))
> control=as.data.frame(list(clientId=c(2,3), mindate=c(100,1005),
+ maxdate=c(1005,1005), control.params=c(1,2)))
> master
  clientId date value
1        1 1001 10001
2        2 1002 10002
3        3 1003 10003
4        4 1004 10004
5        2 1005 10005
> control
  clientId mindate maxdate control.params
1        2     100    1005              1
2        3    1005    1005              2
> require(sqldf)
> sqldf("
+ select m.*
+ from master m, control c
+ where m.clientId = c.clientID
+ and m.date between c.mindate and c.maxdate
+ ")
  clientId date value
1        2 1002 10002
2        2 1005 10005
>

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
forgot the control parameters:

> sqldf("
+ select m.*, c.control_params
+ from master m, control c
+ where m.clientId = c.clientID
+ and m.date between c.mindate and c.maxdate
+ ")
  clientId date value control_params
1        2 1002 10002              1
2        2 1005 10005              1
>

--
Jim Holtman
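Point (4) of the question, calling do.something once per client, is not covered by the replies above. One way it might be done in base R is to split the filtered rows by client and apply the function to each piece. This is a sketch only: the body of do.something here is a made-up stand-in, since the poster's real function is not shown, and the date range is assumed to include both boundaries.

```r
# Dummy data as used in this thread
master  <- data.frame(clientId = c(1:4, 2), date = 1001:1005, value = 10001:10005)
control <- data.frame(clientId = c(2, 3), mindate = c(100, 1005),
                      maxdate = c(1005, 1005), control.params = c(1, 2))

# Hypothetical stand-in for the poster's do.something(df, control parameters)
do.something <- function(df, params) sum(df$value) * params

# Attach control columns, dropping clients absent from control,
# then keep dates within [mindate, maxdate] (boundaries included by assumption)
kept <- merge(master, control, by = "clientId")
kept <- kept[kept$date >= kept$mindate & kept$date <= kept$maxdate, ]

# One data frame per remaining client
pieces <- split(kept, kept$clientId)

# Call do.something on each client's rows with that client's parameter
results <- lapply(pieces, function(piece) {
  do.something(piece[, c("clientId", "date", "value")],
               piece$control.params[1])
})
results
```

A client whose rows all fall outside its date range (client 3 in the dummy data) simply does not appear in results, so do.something is never called on an empty data frame.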