Thanks Ben, I got it working. I'd like help with one more thing.

Suppose one city has a node like

<precipitation mode="no"/>

and another city has

<precipitation unit="3h" value="0.0925" type="rain"/>

How can I make my code handle this dynamically? I am sorry to ask such novice questions, but it would be extremely helpful if you could help me with this.

I get the attributes with this code:

ppt <- (x %>% xml_find_all("precipitation") %>% xml_attrs())

If mode is "no", the resulting data set should still have the three columns, with NA values; if the attributes are populated, the values should be kept as is:

Unit  Value   Type
NA    NA      NA
3h    0.0925  rain

Thanks again, and in advance!

Archit

On Thu, Apr 27, 2017 at 6:27 PM, Ben Tupper <btupper at bigelow.org> wrote:

> Hi,
>
> There might be an easy solution out there already, but I suspect that you will need to parse the XML yourself. The example below uses package xml2, not XML, but you could do this with either. The example simply shows how to get values out of the XML hierarchy. Once you have the attributes you want in hand, you can assemble the elements into a data frame (or a tibble from package tibble).
>
> By the way, I had to prepend your example with '<current>'.
>
> Cheers,
> Ben
>
> ### START
>
> library(tidyverse)
> library(xml2)
>
> txt <- "<current><city id=\"2643743\" name=\"London\"><coord lon=\"-0.13\" lat=\"51.51\"/><country>GB</country><sun rise=\"2017-01-30T07:40:36\" set=\"2017-01-30T16:47:56\"/></city><temperature value=\"280.15\" min=\"278.15\" max=\"281.15\" unit=\"kelvin\"/><humidity value=\"81\" unit=\"%\"/><pressure value=\"1012\" unit=\"hPa\"/><wind><speed value=\"4.6\" name=\"Gentle Breeze\"/><gusts/><direction value=\"90\" code=\"E\" name=\"East\"/></wind><clouds value=\"90\" name=\"overcast clouds\"/><visibility value=\"10000\"/><precipitation mode=\"no\"/><weather number=\"701\" value=\"mist\" icon=\"50d\"/><lastupdate value=\"2017-01-30T15:50:00\"/></current>"
>
> x <- read_xml(txt)
>
> windspeed <- x %>%
>   xml_find_first("wind/speed") %>%
>   xml_attrs()
>
> winddir <- x %>%
>   xml_find_first("wind/direction") %>%
>   xml_attrs()
>
> windspeed
> #   value            name
> #   "4.6" "Gentle Breeze"
>
> winddir
> #  value   code   name
> #   "90"    "E" "East"
>
> ### END
>
> > On Apr 27, 2017, at 6:08 AM, Archit Soni <soni.archit1989 at gmail.com> wrote:
> >
> > Hi All,
> >
> > I have an XML file like:
> >
> > <city id="2643743" name="London">
> > <coord lon="-0.13" lat="51.51"/>
> > <country>GB</country>
> > <sun rise="2017-01-30T07:40:36" set="2017-01-30T16:47:56"/>
> > </city>
> > <temperature value="280.15" min="278.15" max="281.15" unit="kelvin"/>
> > <humidity value="81" unit="%"/>
> > <pressure value="1012" unit="hPa"/>
> > <wind>
> > <speed value="4.6" name="Gentle Breeze"/>
> > <gusts/>
> > <direction value="90" code="E" name="East"/>
> > </wind>
> > <clouds value="90" name="overcast clouds"/>
> > <visibility value="10000"/>
> > <precipitation mode="no"/>
> > <weather number="701" value="mist" icon="50d"/>
> > <lastupdate value="2017-01-30T15:50:00"/>
> > </current>
> >
> > I want to create a data frame out of this XML, but obviously xmlToDataFrame() is not working.
> >
> > It has dynamic attributes. For example, the precipitation node could have both value and mode attributes if there is precipitation in some city.
> >
> > My basic issue now is to read the XML attributes of different nodes and convert them into a data frame. I have searched many forums but could not find any help on this.
> >
> > For starters, please suggest a solution to parse the value of the city node and the corresponding id, name, lat, lon, etc.
> >
> > I know I am asking a lot, thanks for reading and cheers! :)
> >
> > --
> > Regards
> > Archit
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> Ben Tupper
> Bigelow Laboratory for Ocean Sciences
> 60 Bigelow Drive, P.O. Box 380
> East Boothbay, Maine 04544
> http://www.bigelow.org

--
Regards
Archit
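
For the original request quoted above (getting the city's id, name, lat and lon into a data frame), something along these lines might work. It is only a minimal sketch: the shortened XML string and the names `city`, `coord` and `city_df` are made up for illustration, and it assumes each document holds exactly one city node. It relies on xml2::xml_attr(), which looks up a single attribute by name.

```r
library(xml2)

# Shortened, hypothetical copy of the weather XML (only the city part).
txt <- '<current><city id="2643743" name="London"><coord lon="-0.13" lat="51.51"/><country>GB</country></city></current>'
x <- read_xml(txt)

# Locate the nodes that carry the attributes of interest.
city  <- xml_find_first(x, "city")
coord <- xml_find_first(x, "city/coord")

# Pull each attribute by name and assemble a one-row data frame.
city_df <- data.frame(
  id      = xml_attr(city, "id"),
  name    = xml_attr(city, "name"),
  lon     = as.numeric(xml_attr(coord, "lon")),
  lat     = as.numeric(xml_attr(coord, "lat")),
  country = xml_text(xml_find_first(x, "city/country")),
  stringsAsFactors = FALSE
)

print(city_df)
```

With several XML files, the same steps could be wrapped in a function and the one-row data frames combined with rbind().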
Hi again,

It would be super easy if xml2::xml_attrs() accepted a list of attribute names and default values like xml2::xml_attr() does, but it doesn't. Once you have a list of character vectors like the one returned by your ...

ppt <- x %>% xml_find_all("precipitation") %>% xml_attrs()

... then you need only try to extract the fields you want. Indexing a named character vector by a name it does not have returns NA, so missing attributes come out as NA. Perhaps something like the following untested steps ...

# one row per precipitation node; absent attributes become NA
m <- do.call(rbind, lapply(ppt, '[', c('unit', 'value', 'type')))
# name the columns before converting, since the names may be NA at this point
colnames(m) <- c('unit', 'value', 'type')
precip <- tibble::as_tibble(m)

Bonne chance!
Ben

P.S. Don't forget to change your email client to send plain text messages to this list. Typically rich text and HTML emails get turned into hash by the R-help list services.

> On Apr 28, 2017, at 4:25 AM, Archit Soni <soni.archit1989 at gmail.com> wrote:
>
> Thanks Ben, got it working, just want one more help on this, [...]
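
An alternative to subsetting the xml_attrs() result is to ask for each wanted attribute by name with xml2::xml_attr(), which returns NA when the attribute is absent. The sketch below is an illustration only, not the code from this thread: the two XML strings stand in for two different cities, and the names `fields`, `rows` and `precip` are assumptions made for the example.

```r
library(xml2)

# Two hypothetical city records: one without precipitation, one with rain.
txt <- c(
  '<current><precipitation mode="no"/></current>',
  '<current><precipitation unit="3h" value="0.0925" type="rain"/></current>'
)

# Attribute names that should always appear as columns.
fields <- c("unit", "value", "type")

# For each document, look up every wanted attribute by name.
# xml_attr() returns NA_character_ when the attribute is missing.
rows <- lapply(txt, function(s) {
  node <- xml_find_first(read_xml(s), "precipitation")
  vapply(fields, function(f) xml_attr(node, f), character(1))
})

# Bind the named vectors into one row per city.
precip <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
precip$value <- as.numeric(precip$value)

print(precip)
```

The first row holds NA in all three columns and the second row holds the populated values, which is the layout asked for above.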
Thanks Ben, I'll give it a shot. Thanks again :)

On Apr 28, 2017 18:54, "Ben Tupper" <btupper at bigelow.org> wrote:

> Hi again, [...]