Hi:
Here's one approach using the reshape() function in base R:
# Read in your data:
d <- read.table(textConnection("
Candidate.ID Specialty Office Score
110002 C London 47
110002 C East 48
110003 RM West 45
110003 RM Southwest 39
110003 C Southwest 38
110004 H South 42
110006 G East 47
110006 G London 45"), header =
TRUE)
closeAllConnections()
# Create a variable to distinguish positions
d$Position <- c(1, 2, 1, 2, 1, 1, 1, 2)
reshape(d, idvar = 'Candidate.ID', timevar = 'Position',
v.names = c('Specialty', 'Office', 'Score'),
direction = 'wide')
HTH,
Dennis
On Sun, Sep 25, 2011 at 6:47 PM, Jen M <jmstatshelp at gmail.com>
wrote:> Hi all,
>
> I'm having a problem restructuring my data the way I'd like it. ?I
have data
> that look like this:
>
> ?Candidate.ID ? ? ? ?Specialty ? ? ? Office ? ? ? ? Score
> ? ? ? 110002 ? ? ? ? C ? ? ? ? ? ? ? ? ?London ? ? ? ? ? 47
> ? ? ? 110002 ? ? ? ? C ? ? ? ? ? ? ? ? ?East ? ? ? ? ? ? ?48
> ? ? ? 110003 ? ? ? ? RM ? ? ? ? ? ? ? West ? ? ? ? ? ? ?45
> ? ? ? 110003 ? ? ? ? RM ? ? ? ? ? ? ? Southwest ? ? ?39
> ? ? ? 110003 ? ? ? ? C ? ? ? ? ? ? ? ? ?Southwest ? ? ?38
> ? ? ? 110004 ? ? ? ? H ? ? ? ? ? ? ? ? ?South ? ? ? ? ? ? 42
> ? ? ? 110006 ? ? ? ? G ? ? ? ? ? ? ? ? ?East ? ? ? ? ? ? ? 47
> ? ? ? 110006 ? ? ? ? G ? ? ? ? ? ? ? ? ?London ? ? ? ? ? 45
>
> Candidates can apply for the same job specialty in up to 2 offices (never
> more). ?They can apply for different specialties in further centres. ?I
> would like to look at score differences when candidates apply for the same
> specialty in two different offices. ?With the help of the archives I have
> tried various stack/unstack and reshape/melt/cast combinations, and
I've
> managed to get a huge matrix where the columns are all possible
combinations
> of Specialties & Offices - and there are many. ?This leaves a very
sparse
> matrix ?with mainly null values, and this is not what I want. ?I'd like
the
> scores from the two attempts in two columns so I can do scatterplots,
> calculate differences by specialty etc. In SPSS I'd use
'restructure' to get
> what I want. ?I'm working to order with specific requests here so I
have to
> do it this way (as opposed to a modelling approach).
>
> I would like it restructured to look something like this:
>
> ?Candidate.ID ? ? ? ?Specialty ? ? ? Office.1 ? ? ? Score.1 ? ? ?Office.2
> Score.2
> ? ? ? 110002 ? ? ? ? C ? ? ? ? ? ? ? ? ?London ? ? ? ? ? 47
> ?East ? ? ? ? ?48
> ? ? ? 110003 ? ? ? ? RM ? ? ? ? ? ? ? West ? ? ? ? ? ? ?45
> ?Southwest ?39
> ? ? ? 110003 ? ? ? ? C ? ? ? ? ? ? ? ? ?Southwest ? ? ?38
> ? ? ? 110004 ? ? ? ? H ? ? ? ? ? ? ? ? ?South ? ? ? ? ? ? 42
> ? ? ? 110006 ? ? ? ? G ? ? ? ? ? ? ? ? ?East ? ? ? ? ? ? ? 47
> London ? ? ?45
> ? ? ? 110006 ? ? ? ? G ? ? ? ? ? ? ? ? ?London ? ? ? ? ? 45
>
> So one row per candidate/specialty combination, with 2 sets of
> offices/scores and null values in the second set if they've only
applied
> once for that specialty. ?Can anyone help me out with this? Is it possible
> using stack or reshape?
>
> Many thanks for reading,
> Jan
>
> PS Closest I've come to what I need is the sparse matrix produced by
this:
>
> recast(spec.scores, Candidate.ID ~ Specialty + Office,
measure.var="Score")
>
> ? ? ? ?[[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>