I am trying to extract data from an XML file. A sample of its structure is below:
<pwx creator="PerfPRO" version="1.0">
<workout>
<athlete></athlete>
<title></title>
<sportType>Bike</sportType>
<cmt></cmt>
<device id=""></device>
<time>2016-01-19T08:01:00</time>
<summarydata>
<beginning>0</beginning>
<duration>3600.012</duration>
</summarydata>
<segment>
<summarydata>
<beginning>0</beginning>
<duration>120</duration>
</summarydata>
</segment>
<segment>
<summarydata>
<beginning>120</beginning>
<duration>120</duration>
</summarydata>
</segment>
<segment>
<summarydata>
<beginning>240</beginning>
<duration>120</duration>
</summarydata>
</segment>
I would like to access the data in the 'segment' blocks (both beginning and duration), ideally as a data frame with one row per segment. There are numerous segment blocks.
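To make the target concrete, this is roughly the shape I am after for the three segments shown in the sample. It is only a hand-built illustration: the name `segments` and the choice of numeric columns are my own assumptions, and nothing here is read from the file.

```r
# Hand-built illustration of the desired result (values copied from the sample above).
# One row per <segment>, one column per child of its <summarydata>.
segments <- data.frame(
  beginning = c(0, 120, 240),
  duration  = c(120, 120, 120)
)
```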
I have tried numerous things and still can't extract it; all I get is an empty list. Here is what I have done (pwx is the file name):
xmlData <- xmlInternalTreeParse(pwx, useInternalNodes = TRUE)
xmltop <- xmlRoot(xmlData)
d <- xpathSApply(doc = xmlData, path = "//pwx/workout/segment/summarydata/beginning", fun = xmlValue)
I can also access all the segments with:
segment <- xmltop[[1]]["segment"]
but I can't get the values out of them. I have tried numerous variations on the above.
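For anyone who wants to reproduce this, the attempt above can be condensed into a self-contained sketch that runs against a trimmed inline copy of the sample rather than the real file. The closing `</workout>` and `</pwx>` tags and the names `sample_xml`, `doc`, `begin`, `dur` and `segs` are assumptions added for illustration. The real file may differ from the sample in ways not shown here (for example, a namespace declared on the root element), so this is a sketch of the intent, not a confirmed fix.

```r
library(XML)

# Trimmed inline copy of the sample; closing tags added so it parses on its own.
sample_xml <- '<pwx creator="PerfPRO" version="1.0">
<workout>
<summarydata><beginning>0</beginning><duration>3600.012</duration></summarydata>
<segment><summarydata><beginning>0</beginning><duration>120</duration></summarydata></segment>
<segment><summarydata><beginning>120</beginning><duration>120</duration></summarydata></segment>
</workout>
</pwx>'

# Parse from a string instead of a file name.
doc <- xmlParse(sample_xml, asText = TRUE)

# Restrict the path to <segment> so the workout-level <summarydata> is skipped.
begin <- xpathSApply(doc, "//segment/summarydata/beginning", xmlValue)
dur   <- xpathSApply(doc, "//segment/summarydata/duration", xmlValue)

# xmlValue returns character strings, so convert before building the data frame.
segs <- data.frame(beginning = as.numeric(begin), duration = as.numeric(dur))
```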
Any help is greatly appreciated, thanks.
edit:
> summary(xmlData)
$nameCounts
        cad        dist          hr         pwr      sample         spd  timeoffset   beginning
       3274        3274        3274        3274        3274        3274        3274          16
   duration summarydata     segment     athlete         cmt      device        make       model
         16          16          15           1           1           1           1           1
       name         pwx   sportType        time       title     workout
          1           1           1           1           1           1
$numNodes
[1] 22992