I want to calculate "expansion factors" for the records in my data frame
based on a 2-D cross-classification. Since I'll have "missing values"
(many combinations will have no records), I'll need a second "expansion
factor" for each "row". I've included my work to date below, but I'm
not very close to getting this right.
My first question: why does CurrentX2Sums seem OK while NewTargetX2Sums
has no totals? Instead, the total appears to have been added to the
"ID". How do I do this correctly?
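One possible explanation, offered as a sketch rather than a confirmed diagnosis: NewTargetData still carries TripID as an ordinary column, so as.matrix() keeps it and each row sum includes the ID (9011880 + 34 = 9011914, the first value printed for NewTargetX2Sums below). The toy data frame here is an assumption for illustration; it uses only the first two rows and the first two station columns from the listing, so its totals apply to that subset only.

```r
# Toy stand-in for NewTargetData: first two rows, first two stations only
NewTargetData <- data.frame(
  TripID        = c(9011880, 9011890),
  Warner.Center = c(5, 1),
  De.Soto       = c(2, 1)
)

# Keep the ID out of the numeric matrix and use it as row names instead
NewTargetX1Sums <- as.matrix(NewTargetData[, -1])
rownames(NewTargetX1Sums) <- as.character(NewTargetData$TripID)

# Row totals now cover the station columns only
NewTargetX2Sums <- rowSums(NewTargetX1Sums)
NewTargetX2Sums
```

With the IDs as row names, the result is a named vector like CurrentX2Sums.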
I expected CurrentX2Sums (and NewTargetX2Sums) to print in column form.
How do I make it do so? t(CurrentX2Sums) doesn't seem to do the trick.
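A minimal sketch of one way to get a column layout: apply(..., 1, sum) returns a named vector, and t() on a vector gives a one-row matrix, which still prints across the page. Wrapping the vector in cbind() gives a one-column matrix instead. The column name Total is only an illustrative choice, and the three values are the first three totals from the output in this post.

```r
# First three totals from CurrentX2Sums, as a named vector
CurrentX2Sums <- c("9011880" = 8, "9011890" = 2, "9011960" = 13)

# cbind() on a single named vector yields a one-column matrix;
# the vector names become the row names
CurrentX2Col <- cbind(Total = CurrentX2Sums)
CurrentX2Col
```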
How do I avoid getting e.g. "De.Soto" when I read "De Soto" into
NewTargetData?
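By default read.table() passes column names through make.names(), which turns spaces into dots; setting check.names = FALSE leaves them as written. The sketch below assumes the header row of NewTarget.csv really contains names such as "De Soto" with a space, and the inline text is a made-up stand-in for that file.

```r
# Inline stand-in for the CSV file (assumed header with spaces in names)
csv_text <- c("TripID,Warner Center,De Soto",
              "9011880,5,2",
              "9011890,1,1")

con <- textConnection(csv_text)
NewTargetData <- read.table(con, header = TRUE, sep = ",",
                            strip.white = TRUE, check.names = FALSE)
close(con)

# Column names are kept exactly as they appear in the header
names(NewTargetData)
```

Names containing spaces then need quoting or backticks when used, e.g. NewTargetData[["De Soto"]].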
How do I put this back into my dataframe? I'd like something like
SurveyData$NewX1 = NewTargetX1Sums/CurrentX1Sums, but how do I
specify the indices (tripid_nu, lineon)?
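One way this lookup might be done, sketched under assumptions: both matrices carry the trip IDs as row names and the station names as column names, spelled identically to the values in SurveyData. Each survey record is then matched to its (row, column) cell with match(), and the ratio matrix is indexed with a two-column integer matrix. The small matrices and the three survey records are toy data built from the first values in this post.

```r
stations <- c("Warner Center", "De Soto", "Pierce College")
trips    <- c("9011880", "9011960")

# Toy versions of the two tables (filled column by column)
CurrentX1Sums   <- matrix(c(1, 1, 0, 1, 2, 2), nrow = 2,
                          dimnames = list(trips, stations))
NewTargetX1Sums <- matrix(c(5, 2, 2, 2, 2, 2), nrow = 2,
                          dimnames = list(trips, stations))

# Cell-by-cell ratio; cells with no records give Inf, but no survey
# record can fall in such a cell
Factor <- NewTargetX1Sums / CurrentX1Sums

# Three hypothetical survey records
SurveyData <- data.frame(
  tripid_nu = c(9011880, 9011880, 9011960),
  lineon    = c("Warner Center", "Pierce College", "De Soto"),
  stringsAsFactors = FALSE
)

# One (row, column) index pair per survey record
idx <- cbind(match(as.character(SurveyData$tripid_nu), rownames(Factor)),
             match(as.character(SurveyData$lineon),    colnames(Factor)))
SurveyData$NewX1 <- Factor[idx]
SurveyData
```

Any record whose trip or station has no match in the dimnames would get NA, which makes mismatched labels easy to spot.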
If I'm only a ?LookHere or ??LookThere away I'd appreciate being pointed
in the right direction.
Thanks in advance.
All the gory details:
> sessionInfo() # List loaded packages
R version 2.8.0 (2008-10-20)
i386-pc-mingw32
locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
attached base packages:
[1] graphics grDevices utils datasets stats methods base
other attached packages:
[1] fortunes_1.3-5 prettyR_1.3-3 survey_3.9-1 foreign_0.8-29
> SurveyData <- read.spss("C:/Data/R/orange_delivery.sav",
+     use.value.labels=TRUE, max.value.labels=Inf, to.data.frame=TRUE)
> NewTargetData <- read.table("C:/Data/R/NewTarget.csv", header=TRUE,
+     sep=",", na.strings="NA", dec=".", strip.white=TRUE)
> #-------------------------------------------------------------------------------
> temp <- sub(' +$', '', SurveyData$direction_)  # Remove trailing spaces from the values
> SurveyData$direction_ <- temp
> #-------------------------------------------------------------------------------
> SurveyData$StnNum=as.numeric(SurveyData$lineon)
> CurrentX1Sums <- as.matrix(xtabs(~tripid_nu+lineon, data=SurveyData))
> CurrentX2Sums <- apply(CurrentX1Sums, 1, sum)
> NewTargetX1Sums <- as.matrix(NewTargetData)
> NewTargetX2Sums <- apply(NewTargetX1Sums, 1, sum)
>
> CurrentX1Sums
          lineon
tripid_nu Warner Center De Soto Pierce College Tampa Reseda Balboa Woodley Sepulveda Van Nuys Woodman Valley College Laurel Canyon North Hollywood
9011880               1       0              2     1      0      2       1         0        0       0              1             0               0
9011890               0       0              0     0      0      0       1         0        0       0              0             1               0
9011960               1       1              2     0      1      1       0         1        3       2              1             0               0
.. {Snip} ..
9015640               0       0              0     0      0      0       0         0        0       0              0             0               1
9015650               0       0              0     0      0      0       0         0        1       0              0             0               5
9015840               0       5              0     0      0      0       0         0        0       0              0             0               0
> CurrentX2Sums
9011880 9011890 9011960 9011970 9012040 9012050 9012130 9012280 9012290
      8       2      13      25      18       8      13      28      20
9012720 9012730 9012760 9012770 9012840 9012850 9012880 9012890 9013000
      9       5      22      14      19       8      13      11      10
9013010 9013240 9013250 9013280 9013290 9013320 9013330 9013360 9013440
      5       6       9      11       9      10       9      13       5
9013450 9013800 9013810 9013880 9013890 9013960 9013970 9014080 9014090
      8      17      14      16       3       5       8      17      16
9014120 9014130 9014240 9014250 9014440 9014450 9014640 9014650 9014760
     23       8      15      18       7       9      16      14       6
9014770 9014960 9014970 9015280 9015290 9015520 9015530 9015640 9015650
     19       5      19       7      11      20      16       1       6
9015840
      5
> NewTargetX1Sums
       TripID Warner.Center De.Soto Pierce.College Tampa Reseda Balboa Woodley Sepulveda Van.Nuys Woodman Valley.College Laurel.Canyon North.Hollywood
 [1,] 9011880             5       2              2     2      2      2       2         2        2       2              6             4               1
 [2,] 9011890             1       1              1     1      1      1       2         1        1       1              1             2               1
 [3,] 9011960             2       2              2     1      2      2       1         2        3       2              2             1               1
.. {Snip} ..
[53,] 9015640             1       1              1     1      1      1       1         1        1       1              1             1               2
[54,] 9015650             1       1              1     1      1      1       1         1        2       1              1             1               5
[55,] 9015840             1       5              1     1      1      1       1         1        1       1              1             1               1
> NewTargetX2Sums
 [1] 9011914 9011905 9011983 9012016 9012068 9012070 9012153 9012326 9012319
[10] 9012739 9012747 9012788 9012806 9012866 9012870 9012902 9012913 9013020
[19] 9013027 9013257 9013270 9013301 9013310 9013340 9013351 9013382 9013456
[28] 9013469 9013825 9013847 9013904 9013906 9013977 9013989 9014107 9014128
[37] 9014149 9014149 9014277 9014290 9014459 9014470 9014666 9014686 9014778
[46] 9014798 9014976 9015011 9015297 9015324 9015549 9015568 9015654 9015668
[55] 9015857
>
> NewTargetData
    TripID Warner.Center De.Soto Pierce.College Tampa Reseda Balboa Woodley Sepulveda Van.Nuys Woodman Valley.College Laurel.Canyon North.Hollywood
1  9011880             5       2              2     2      2      2       2         2        2       2              6             4               1
2  9011890             1       1              1     1      1      1       2         1        1       1              1             2               1
3  9011960             2       2              2     1      2      2       1         2        3       2              2             1               1
.. {Snip} ..
Robert Farley
Metro
1 Gateway Plaza
Mail Stop 99-23-7
Los Angeles, CA 90012-2952
Voice: (213)922-2532
Fax: (213)922-2868
www.Metro.net