Katherine,
Split the rate names into their currency and tenor parts and assign a
numeric value to each tenor. Choose a model to do your approximations (I
used linear regression in the example below). Use this model to generate
estimates for all combinations of currency and tenor.
For example:
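# split the rate names into currency and tenor
# (as.character() guards against rate_name being stored as a factor,
# which strsplit() does not accept)
splitnames <- do.call(rbind, strsplit(as.character(df$rate_name), "_"))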
df$currency <- as.factor(splitnames[, 1])
df$tenor <- splitnames[, 2]
# assign numeric value to each tenor
uniquetenors <- c("1w", "2w", "1m", "2m")
uniquedays <- c(7, 14, 30.5, 61)
df$tenordays <- uniquedays[match(df$tenor, uniquetenors)]
# fit a linear model of rate on tenordays for each currency
fit <- lm(rates ~ currency*tenordays, data=df)
# estimate rates for all combinations of currency and tenor
fulldf <- expand.grid(tenordays=unique(df$tenordays),
                      currency=unique(df$currency))
fulldf$est.rates <- predict(fit, newdata=fulldf)
# merge observed rates with estimated rates
dfwithest <- merge(df, fulldf, all=TRUE)
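
If you would rather interpolate strictly between the observed tenors, as
you mention with approx, something along these lines might work. It is
only a minimal sketch under a few assumptions: the helper tenor_to_days
and the names alltenors, alldays and interp are made up for illustration,
a month is counted as 30.5 days as above, and the data are just the first
two observations per rate name for USD and EURO from your post.

```r
# Hypothetical helper: convert tenor labels such as "2w" or "1m" to days.
# Assumes only week ("w") and month ("m") units, with 30.5 days per month.
tenor_to_days <- function(tenor) {
  n    <- as.numeric(sub("[wm]$", "", tenor))
  unit <- sub("^[0-9]+", "", tenor)
  stopifnot(unit %in% c("w", "m"))
  n * ifelse(unit == "w", 7, 30.5)
}

# Subset of the data from the question (two observations per rate name)
df <- data.frame(
  rate_name = c("USD_1w", "USD_1w", "USD_1m", "USD_1m", "USD_2m", "USD_2m",
                "EURO_1w", "EURO_1w", "EURO_2w", "EURO_2w", "EURO_2m", "EURO_2m"),
  rates = c(2.05, 2.07, 2.22, 2.24, 2.31, 2.33,
            1.82, 1.82, 1.98, 1.98, 2.10, 2.09),
  stringsAsFactors = FALSE
)

# Split the rate names and attach a numeric tenor
parts <- do.call(rbind, strsplit(as.character(df$rate_name), "_"))
df$currency  <- parts[, 1]
df$tenordays <- tenor_to_days(parts[, 2])

# The full set of tenors wanted for every currency
alltenors <- c("1w", "2w", "1m", "2m")
alldays   <- tenor_to_days(alltenors)

# Interpolate linearly within each currency; ties = mean averages the
# repeated observations at the same tenor before interpolating.
# With the default rule = 1, tenors outside the observed range give NA.
interp <- do.call(rbind, lapply(split(df, df$currency), function(d) {
  est <- approx(d$tenordays, d$rates, xout = alldays, ties = mean)$y
  data.frame(rate_name = paste(d$currency[1], alltenors, sep = "_"),
             est.rates = est, stringsAsFactors = FALSE)
}))
rownames(interp) <- NULL
print(interp)
```

Unlike the regression above, this reproduces the mean observed rate at
each tenor that is present and only fills in the gaps between them, so it
cannot estimate a tenor longer or shorter than any observed one unless
you pass rule = 2 to approx.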
Jean
On Thu, Apr 25, 2013 at 12:33 AM, Katherine Gobin
<katherine_gobin@yahoo.com> wrote:
> Dear R forum
>
> I have data.frame as
>
> df = data.frame(rate_name = c(
>     "USD_1w", "USD_1w", "USD_1w", "USD_1w",
>     "USD_1m", "USD_1m", "USD_1m", "USD_1m",
>     "USD_2m", "USD_2m", "USD_2m", "USD_2m",
>     "GBP_1w", "GBP_1w", "GBP_1w", "GBP_1w",
>     "GBP_1m", "GBP_1m", "GBP_1m", "GBP_1m",
>     "GBP_2m", "GBP_2m", "GBP_2m", "GBP_2m",
>     "EURO_1w", "EURO_1w", "EURO_1w", "EURO_1w",
>     "EURO_2w", "EURO_2w", "EURO_2w", "EURO_2w",
>     "EURO_2m", "EURO_2m", "EURO_2m", "EURO_2m"), rates = c(2.05,
> 2.07, 2.06, 2.06, 2.22, 2.24, 2.23, 2.23, 2.31, 2.33, 2.33, 2.31, 1.06,
> 1.08, 1.08, 1.08, 1.21, 1.21, 1.23, 1.21, 1.41, 1.39, 1.39, 1.37, 1.82,
> 1.82, 1.81, 1.80, 1.98, 1.98, 1.97, 1.97, 2.1, 2.09, 2.09, 2.11))
>
> currency = c("EURO", "GBP", "USD")
> tenor = c("1w", "2w", "1m", "2m", "3m")
>
> # _________________________________________
>
> > df
> rate_name rates
> 1 USD_1w 2.05
> 2 USD_1w 2.07
> 3 USD_1w 2.06
> 4 USD_1w 2.06
> 5 USD_1m 2.22
> 6 USD_1m 2.24
> 7 USD_1m 2.23
> 8 USD_1m 2.23
> 9 USD_2m 2.31
> 10 USD_2m 2.33
> 11 USD_2m 2.33
> 12 USD_2m 2.31
> 13 GBP_1w 1.06
> 14 GBP_1w 1.08
> 15 GBP_1w 1.08
> 16 GBP_1w 1.08
> 17 GBP_1m 1.21
> 18 GBP_1m 1.21
> 19 GBP_1m 1.23
> 20 GBP_1m 1.21
> 21 GBP_2m 1.41
> 22 GBP_2m 1.39
> 23 GBP_2m 1.39
> 24 GBP_2m 1.37
> 25 EURO_1w 1.82
> 26 EURO_1w 1.82
> 27 EURO_1w 1.81
> 28 EURO_1w 1.80
> 29 EURO_2w 1.98
> 30 EURO_2w 1.98
> 31 EURO_2w 1.97
> 32 EURO_2w 1.97
> 33 EURO_2m 2.10
> 34 EURO_2m 2.09
> 35 EURO_2m 2.09
> 36 EURO_2m 2.11
>
> As can be seen, USD_2w, GBP_2w and EURO_1m are missing, and I need to
> INTERPOLATE these rates, which can be done using approx or approxfun. In
> reality I can have many currencies with many tenors. The problem is that
> when the data.frame "df" is read or accessed in R, I do not know which
> tenor is missing. For a given currency, it is possible that more than one
> consecutive tenor is missing, e.g. in the case of EURO, I may have
> EURO_1w, EURO_2w and then EURO_4m, so EURO_1m, EURO_2m and EURO_3m are
> missing.
>
>
> I understand this is a somewhat vague question and I apologize for that.
> Any suggestions, please.
>
> Regards
>
> Katherine
>
> ______________________________________________
> R-help@r-project.org mailing list
> stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.