Dear r-helpers,
I am trying to extract quantities of interest from my iTunes library XML file.
For example, I'd like to be able to run a simple regression of play count on
track number, under the theory that tracks near the beginning of albums get
played more (either because they are "better" or because people listen
to the beginnings of albums).
I have an XML file of the following form:
<key>13162</key>
<dict>
<key>Track ID</key><integer>13162</integer>
<key>Name</key><string>I'm A Wheel</string>
<key>Artist</key><string>Wilco</string>
<key>Composer</key><string>Jeff Tweedy</string>
<key>Album</key><string>A Ghost is Born</string>
<key>Genre</key><string>Rock</string>
<key>Kind</key><string>Matched AAC audio file</string>
<key>Size</key><integer>6248701</integer>
<key>Total Time</key><integer>154648</integer>
<key>Disc Number</key><integer>1</integer>
<key>Disc Count</key><integer>1</integer>
<key>Track Number</key><integer>9</integer>
<key>Track Count</key><integer>12</integer>
<key>Year</key><integer>2004</integer>
<key>Date Modified</key><date>2012-07-26T22:29:15Z</date>
<key>Date Added</key><date>2010-01-27T00:02:21Z</date>
<key>Bit Rate</key><integer>256</integer>
<key>Sample Rate</key><integer>44100</integer>
<key>Play Count</key><integer>3</integer>
<key>Play Date</key><integer>3434905791</integer>
<key>Play Date UTC</key><date>2012-11-05T00:29:51Z</date>
<key>Artwork Count</key><integer>1</integer>
<key>Sort Album</key><string>Ghost is Born</string>
<key>Persistent ID</key><string>A8B0E5CF2E86A4C6</string>
<key>Track Type</key><string>File</string>
<key>Location</key><string>file://localhost/Users/Alex/Music/iTunes/iTunes%20Media/Music/Wilco/A%20Ghost%20is%20Born/09%20I'm%20A%20Wheel.m4a</string>
<key>File Folder Count</key><integer>5</integer>
<key>Library Folder Count</key><integer>1</integer>
</dict>
From each entry, I'd like to extract the Track ID, Track Number and Play
Count. In this case, that would be
13162, 9, 3
My guess is that this can be done using library(XML).
If anyone has any guidance, it would be appreciated. Please note:
a) I do not understand XML data structures, so please explain what you mean by
"children" etc…
b) Not every entry in my database has a track number and a play count -- I'd
like to have NAs associated with the appropriate Track ID, which all entries
have.
c) It'd also be OK if this XML database just got turned into a normal R data
frame.
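For concreteness, here is a minimal sketch of the kind of result I am after, written with the XML package. It rests on several assumptions that are not confirmed above: that the full file wraps the track entries as plist > dict > "Tracks" key > dict, that the value belonging to a <key> is always the element immediately after it (its next "sibling", i.e. the next element at the same nesting level inside the same <dict>), and that the second track (13164) and the helper name get_field are made up purely for illustration.

```r
library(XML)  # assumes the XML package is installed

# Small stand-in for the library file. The outer plist/dict/Tracks wrapper
# and the second track (13164) are assumptions for illustration only.
xml_text <- '<plist><dict><key>Tracks</key><dict>
  <key>13162</key>
  <dict>
    <key>Track ID</key><integer>13162</integer>
    <key>Track Number</key><integer>9</integer>
    <key>Play Count</key><integer>3</integer>
  </dict>
  <key>13164</key>
  <dict>
    <key>Track ID</key><integer>13164</integer>
    <key>Name</key><string>Hypothetical track</string>
  </dict>
</dict></dict></plist>'

doc <- xmlParse(xml_text, asText = TRUE)

# Each track is a <dict> that is a child of the "Tracks" <dict>.
tracks <- getNodeSet(
  doc, "/plist/dict/key[.='Tracks']/following-sibling::dict[1]/dict"
)

# Hypothetical helper: find the <key> with the given text inside one track,
# take the element right after it, and return its value as a number.
# Returns NA when the track has no such key.
get_field <- function(track, field) {
  xp  <- sprintf("./key[.='%s']/following-sibling::*[1]", field)
  hit <- getNodeSet(track, xp)
  if (length(hit) == 0) NA_real_ else as.numeric(xmlValue(hit[[1]]))
}

result <- data.frame(
  track_id     = sapply(tracks, get_field, field = "Track ID"),
  track_number = sapply(tracks, get_field, field = "Track Number"),
  play_count   = sapply(tracks, get_field, field = "Play Count")
)
print(result)
```

If something along these lines works on the real file (passing the file path to xmlParse without asText), the regression would presumably just be lm(play_count ~ track_number, data = result), with rows containing NAs dropped by lm's default NA handling.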
Thanks!