thr3ads.net - R help - [R] Adding Records to a Table in R [Nov 2017]

If this information is useful, please help other people find it:
Share via:

Paul Bernal

2017-Nov-01 19:45 UTC

[R] Adding Records to a Table in R

Dear R friends,

I am currently working with time series data, and I have a table(as data
frame) that has looks like this (TransitDate are in format =
"%e-%B-%Y") :

TransitDate       Transits      CargoTons
1985-04-01        100            2500
1985-05-01        135            4500
1985-06-01        120            1750
1985-07-01        100            3750
1985-08-01        200            1250

The problem is, that there are several periods that don?t exist in the
table, so it has the following behavior:

TransitDate        Transits      CargoTons
1985-04-01        100             1000
1985-07-01        100             1080
1985-12-01        500             3785
1986-04-01        325             4200
.
.
2017-09-01        400             2350 (*this is the last observation)

You can see in the last table fragment that the series jumps from
1985-04-01 to 1985-07-01, then it jumps from there to 1985-12-01 making the
time series quite irregular (non-constant chronologically speaking).

What I want to do is create a dummy table that has the sequence from the
first observation (1985-04-01) up to the last one (2017-09-01) and then
develop a code that checks if the dates contained in the dummy table exist
in the original table, if they don?t exist then add those dates and put
zeroes on the fields.

How can I achieve this?

Any help will be greatly appreciated,

Best regards,

Paul

	[[alternative HTML version deleted]]

Eric Berger

2017-Nov-01 20:21 UTC

head link

[R] Adding Records to a Table in R

Hi Paul,

#First I set up some sample data since I don't have a copy of your data
dtOrig <- as.Date(
c("1985-04-01","1985-07-01","1985-12-01","1986-04-01"))
dfOrig <- data.frame( TransitDate=dtOrig, Transits=c(100,100,500,325),
CargoTons=c(1000,1080,3785,4200) )

#Generate the complete set of dates as a data frame
dfDates<- data.frame(
TransitDate=seq(from=as.Date("1985-04-01"),by="1
month",length=13) )

# do the merge adding the "missing" rows (where NA will appear)
dfNew  <- merge(dfDates, dfOrig, by="TransitDate", all.x=TRUE )

# replace the NA's by zero
dfNew[is.na(dfNew)] <- 0

HTH,
Eric


On Wed, Nov 1, 2017 at 9:45 PM, Paul Bernal <paulbernal07 at gmail.com>
wrote:
> Dear R friends,
>
> I am currently working with time series data, and I have a table(as data
> frame) that has looks like this (TransitDate are in format =
"%e-%B-%Y") :
>
> TransitDate       Transits      CargoTons
> 1985-04-01        100            2500
> 1985-05-01        135            4500
> 1985-06-01        120            1750
> 1985-07-01        100            3750
> 1985-08-01        200            1250
>
> The problem is, that there are several periods that don?t exist in the
> table, so it has the following behavior:
>
> TransitDate        Transits      CargoTons
> 1985-04-01        100             1000
> 1985-07-01        100             1080
> 1985-12-01        500             3785
> 1986-04-01        325             4200
> .
> .
> 2017-09-01        400             2350 (*this is the last observation)
>
> You can see in the last table fragment that the series jumps from
> 1985-04-01 to 1985-07-01, then it jumps from there to 1985-12-01 making the
> time series quite irregular (non-constant chronologically speaking).
>
> What I want to do is create a dummy table that has the sequence from the
> first observation (1985-04-01) up to the last one (2017-09-01) and then
> develop a code that checks if the dates contained in the dummy table exist
> in the original table, if they don?t exist then add those dates and put
> zeroes on the fields.
>
> How can I achieve this?
>
> Any help will be greatly appreciated,
>
> Best regards,
>
> Paul
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
	[[alternative HTML version deleted]]

Paul Bernal

2017-Nov-08 12:54 UTC

head link

[R] Adding Records to a Table in R

Dear Eric, thank you for your kind reply,

Assume dataset1Frame is the table containing the missing dates,

and TransitDateFrame <- seq(as.Date(dataset1Frame[1,1]),
as.Date(dataset1Frame[nrow(dataset1Frame),1]), "months")

#dataset1Frame is basically reading some fields  from a SQL Server table,
the first one being the date

#here dataset1Frame[1,1] is the first date that appears in the table
#dataset1Frame[nrow(dataset1Frame),1] is the last date available in the
table

if(nrow(dataset1Frame)!=nrow(TransitDateFrame)){
    for(i in 1:nrow(dataset1Frame)){
        if(!TransitDateFrame[i,1] %in% dataset1Frame){
            dataset1Frame <- rbind(dataset1Frame,
TransitDateFrame[i,])}else{
                dataset1Frame}
    }
}

I used this code but didn?t work, maybe I am doing something wrong here?

Best regards,

Paul

2017-11-01 15:21 GMT-05:00 Eric Berger <ericjberger at gmail.com>:
> Hi Paul,
>
> #First I set up some sample data since I don't have a copy of your data
> dtOrig <- as.Date(
c("1985-04-01","1985-07-01","1985-12-01","1986-04-01"))
> dfOrig <- data.frame( TransitDate=dtOrig, Transits=c(100,100,500,325),
> CargoTons=c(1000,1080,3785,4200) )
>
> #Generate the complete set of dates as a data frame
> dfDates<- data.frame(
TransitDate=seq(from=as.Date("1985-04-01"),by="1
> month",length=13) )
>
> # do the merge adding the "missing" rows (where NA will appear)
> dfNew  <- merge(dfDates, dfOrig, by="TransitDate", all.x=TRUE
)
>
> # replace the NA's by zero
> dfNew[is.na(dfNew)] <- 0
>
> HTH,
> Eric
>
>
> On Wed, Nov 1, 2017 at 9:45 PM, Paul Bernal <paulbernal07 at
gmail.com>
> wrote:
>
>> Dear R friends,
>>
>> I am currently working with time series data, and I have a table(as
data
>> frame) that has looks like this (TransitDate are in format =
"%e-%B-%Y") :
>>
>> TransitDate       Transits      CargoTons
>> 1985-04-01        100            2500
>> 1985-05-01        135            4500
>> 1985-06-01        120            1750
>> 1985-07-01        100            3750
>> 1985-08-01        200            1250
>>
>> The problem is, that there are several periods that don?t exist in the
>> table, so it has the following behavior:
>>
>> TransitDate        Transits      CargoTons
>> 1985-04-01        100             1000
>> 1985-07-01        100             1080
>> 1985-12-01        500             3785
>> 1986-04-01        325             4200
>> .
>> .
>> 2017-09-01        400             2350 (*this is the last observation)
>>
>> You can see in the last table fragment that the series jumps from
>> 1985-04-01 to 1985-07-01, then it jumps from there to 1985-12-01 making
>> the
>> time series quite irregular (non-constant chronologically speaking).
>>
>> What I want to do is create a dummy table that has the sequence from
the
>> first observation (1985-04-01) up to the last one (2017-09-01) and then
>> develop a code that checks if the dates contained in the dummy table
exist
>> in the original table, if they don?t exist then add those dates and put
>> zeroes on the fields.
>>
>> How can I achieve this?
>>
>> Any help will be greatly appreciated,
>>
>> Best regards,
>>
>> Paul
>>
>>         [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posti
>> ng-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
>
	[[alternative HTML version deleted]]

Paul Bernal

2017-Nov-08 13:46 UTC

head link

[R] Adding Records to a Table in R

Dear Eric,

Hope you are doing great. I also tried the following:

#First I created the complete date sequence

TransitDateFrame <- data.frame(TransitDate=seq(as.Date(dataset1[1,1]),
as.Date(dataset1[nrow(dataset1),1]), by = "month"))

#Then I did the merging

 dataset1NEW <- merge(TransitDateFrame, dataset1, by="TransitDate",
all.x=TRUE)

Now it has, as expected the total number of rows. The problem is, it filled
absolutely everything with NAs, and this shouldn?t be the case since there
are dates that actually have data.

why is this happening?

I am attaching the dataset1 table as a .csv document for your reference.
Basically what I want is to bring all the values in dataset1 and only add
the dates missing with value 0.

Best regards,

Paul

2017-11-01 15:21 GMT-05:00 Eric Berger <ericjberger at gmail.com>:
> Hi Paul,
>
> #First I set up some sample data since I don't have a copy of your data
> dtOrig <- as.Date(
c("1985-04-01","1985-07-01","1985-12-01","1986-04-01"))
> dfOrig <- data.frame( TransitDate=dtOrig, Transits=c(100,100,500,325),
> CargoTons=c(1000,1080,3785,4200) )
>
> #Generate the complete set of dates as a data frame
> dfDates<- data.frame(
TransitDate=seq(from=as.Date("1985-04-01"),by="1
> month",length=13) )
>
> # do the merge adding the "missing" rows (where NA will appear)
> dfNew  <- merge(dfDates, dfOrig, by="TransitDate", all.x=TRUE
)
>
> # replace the NA's by zero
> dfNew[is.na(dfNew)] <- 0
>
> HTH,
> Eric
>
>
> On Wed, Nov 1, 2017 at 9:45 PM, Paul Bernal <paulbernal07 at
gmail.com>
> wrote:
>
>> Dear R friends,
>>
>> I am currently working with time series data, and I have a table(as
data
>> frame) that has looks like this (TransitDate are in format =
"%e-%B-%Y") :
>>
>> TransitDate       Transits      CargoTons
>> 1985-04-01        100            2500
>> 1985-05-01        135            4500
>> 1985-06-01        120            1750
>> 1985-07-01        100            3750
>> 1985-08-01        200            1250
>>
>> The problem is, that there are several periods that don?t exist in the
>> table, so it has the following behavior:
>>
>> TransitDate        Transits      CargoTons
>> 1985-04-01        100             1000
>> 1985-07-01        100             1080
>> 1985-12-01        500             3785
>> 1986-04-01        325             4200
>> .
>> .
>> 2017-09-01        400             2350 (*this is the last observation)
>>
>> You can see in the last table fragment that the series jumps from
>> 1985-04-01 to 1985-07-01, then it jumps from there to 1985-12-01 making
>> the
>> time series quite irregular (non-constant chronologically speaking).
>>
>> What I want to do is create a dummy table that has the sequence from
the
>> first observation (1985-04-01) up to the last one (2017-09-01) and then
>> develop a code that checks if the dates contained in the dummy table
exist
>> in the original table, if they don?t exist then add those dates and put
>> zeroes on the fields.
>>
>> How can I achieve this?
>>
>> Any help will be greatly appreciated,
>>
>> Best regards,
>>
>> Paul
>>
>>         [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posti
>> ng-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
>

Reasonably Related Threads

Search for more possibly parallel threads

R help - Nov 2017 - Adding Records to a Table in R

[R] Adding Records to a Table in R

[R] Adding Records to a Table in R

[R] Adding Records to a Table in R

[R] Adding Records to a Table in R

Reasonably Related Threads