Hello, All:

What would you suggest I do to parse the following XML file into a list that I can understand:

XMLfile <- "https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/ndnp_Alabama_all-yrs_e_0001_0050.xml"

This is the first of 6666 XML files containing the "U.S. Newspaper Directory" maintained by the US Library of Congress, discussed in the thread below. I've tried various things using the XML and xml2 packages.

XMLdata <- xml2::read_xml(XMLfile)
str(XMLdata)
XMLdat <- XML::xmlParse(XMLdata)
str(XMLdat)
XMLtxt <- xml2::xml_text(XMLdata)
nchar(XMLtxt)
#[1] 29415

Someplace there's a schema for this. I don't know if it's embedded in this XML file or in a separate file. If it's in a separate file, how could I describe it to my contacts with the Library of Congress so they would understand what I needed and could help me get it?

Thanks,
Spencer Graves

p.s. All 29415 characters in XMLtxt appear in the thread below.

-------- Forwarded Message --------
Subject: [Newspapers and Current Periodicals] How can I get counts of the numbers of newspapers by year in the US, and preferably also elsewhere? A search of "U.S. Newspaper Directory,
Date: Wed, 27 Jul 2022 14:59:03 +0000
From: Kerry Huller <serials at ask.loc.gov>
To: Spencer Graves <spencer.graves at effectivedefense.org>
CC: twes at loc.gov

------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jul 27 2022, 10:59am via System

Hello Spencer,

So, when I view the xml, I'm actually looking at it in XML editor software, so I can view the tags and it's structured neatly.
I've copied and pasted the text from the beginning of the file and the first newspaper title below from my XML editor:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type='text/xsl' href='/webservices/catalog/xsl/searchRetrieveResponse.xsl'?>
<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/" xmlns:oclcterms="http://purl.org/oclc/terms/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:diag="http://www.loc.gov/zing/srw/diagnostic/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <version>1.1</version>
  <numberOfRecords>2250</numberOfRecords>
  <records>
    <record>
      <recordSchema>info:srw/schema/1/marcxml</recordSchema>
      <recordPacking>xml</recordPacking>
      <recordData>
        <record xmlns="http://www.loc.gov/MARC21/slim">
          <leader>00000nas a22000007i 4500</leader>
          <controlfield tag="001">1030438981</controlfield>
          <controlfield tag="008">180404c20159999aluwr n       0   a0eng  </controlfield>
          <datafield ind1=" " ind2=" " tag="010">
            <subfield code="a">  2018200464</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="040">
            <subfield code="a">DLC</subfield>
            <subfield code="e">rda</subfield>
            <subfield code="c">DLC</subfield>
            <subfield code="b">eng</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="012">
            <subfield code="m">1</subfield>
          </datafield>
          <datafield ind1="0" ind2=" " tag="022">
            <subfield code="a">2577-5316</subfield>
            <subfield code="2">1</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="032">
            <subfield code="a">021110</subfield>
            <subfield code="b">USPS</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="037">
            <subfield code="b">711 Alabama Avenue, Selma, AL 36701</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="042">
            <subfield code="a">nsdp</subfield>
            <subfield code="a">pcc</subfield>
          </datafield>
          <datafield ind1="1" ind2="0" tag="050">
            <subfield code="a">ISSN RECORD</subfield>
          </datafield>
          <datafield ind1="1" ind2="0" tag="082">
            <subfield code="a">071</subfield>
            <subfield code="2">15</subfield>
          </datafield>
          <datafield ind1=" " ind2="0" tag="222">
            <subfield code="a">Selma sun</subfield>
          </datafield>
          <datafield ind1="0" ind2="0" tag="245">
            <subfield code="a">Selma sun.</subfield>
          </datafield>
          <datafield ind1=" " ind2="1" tag="264">
            <subfield code="a">Selma, AL :</subfield>
            <subfield code="b">North Shore Press, LLC</subfield>
            <subfield code="c">2016-</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="310">
            <subfield code="a">Weekly</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="336">
            <subfield code="a">text</subfield>
            <subfield code="b">txt</subfield>
            <subfield code="2">rdacontent</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="337">
            <subfield code="a">unmediated</subfield>
            <subfield code="b">n</subfield>
            <subfield code="2">rdamedia</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="338">
            <subfield code="a">volume</subfield>
            <subfield code="b">nc</subfield>
            <subfield code="2">rdacarrier</subfield>
          </datafield>
          <datafield ind1="1" ind2=" " tag="362">
            <subfield code="a">Began in 2015.</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="588">
            <subfield code="a">Description based on: Volume 2, Issue 40 (October 5, 2017) (surrogate); title from caption.</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="588">
            <subfield code="a">Latest issue consulted: Volume 2, Issue 40 (October 5, 2017).</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="752">
            <subfield code="a">United States</subfield>
            <subfield code="b">Alabama</subfield>
            <subfield code="c">Dallas</subfield>
            <subfield code="d">Selma.</subfield>
          </datafield>
        </record>
      </recordData>
    </record>

When I view the records in the XML editor, the 2 lines below begin the record for each individual title, though of course this includes the xml tags:

<recordSchema>info:srw/schema/1/marcxml</recordSchema>
<recordPacking>xml</recordPacking>

Hopefully this helps you decide where to break or parse each record.

On another note, I just noticed as well that at the top of this first file it lists the total number of records for the Alabama grouping: 2250. This also appeared to be the case for the Alaska records when I took a look at the first one for that state. I imagine that should be consistent throughout each "grouping" of records.

Let me know if you have follow-up questions!

Best wishes,
Kerry Huller
Newspaper & Current Periodical Reading Room
Serial & Government Publications Division
Library of Congress

------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jul 27 2022, 10:21am via Email

Hi, Kerry:

Thanks. I understand the chunking in files of at most 50. I've read the first file "ndnp_Alabama_all-yrs_e_0001_0050.xml" into a string of 29415 characters, copied below. Might you have any suggestions on the next step in parsing this?

Staring at it now, it looks like splitting on "info:srw/schema/1/marcxmlxml" might convert the 29415 characters into shorter chunks, each of which could then be parsed further. This is not as bad as reading ancient Egyptian hieroglyphics without the Rosetta Stone, but I wondered if you might have something that could make this work easier and more reliable?
I guess I could compare with what I already read as JSON ;-) Thanks, Spencer Graves "1.12250info:srw/schema/1/marcxmlxml00000nas a22000007i 45001030438981180404c20159999aluwr n 0 a0eng 2018200464DLCrdaDLCeng12577-53161021110USPS711 Alabama Avenue, Selma, AL 36701nsdppccISSN RECORD07115Selma sunSelma sun.Selma, AL :North Shore Press, LLC2016-WeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan in 2015.Description based on: Volume 2, Issue 40 (October 5, 2017) (surrogate); title from caption.Latest issue consulted: Volume 2, Issue 40 (October 5, 2017).United StatesAlabamaDallasSelma.info:srw/schema/1/marcxmlxml00000cas a22000007a 4500502150053100127c20109999aluwr n 0 a0eng 2010200019DLCengDLCDLCOCLCQ112153-18111750USPSB & C Publishing, LLC, 3514 Martin St. S. Ste 104, Cropwell, AL 35054pccnsdpISSN RECORDSt. Clair County news (Cropwell, Ala.)St. Clair County news(Cropwell, Ala.)St. Clair County news.Cropwell, AL :B & C Pub.WeeklyBegan in 2010.Description based on: Nov. 4, 2010 (surrogate); title from caption.info:srw/schema/1/marcxmlxml00000cas a22000007a 4500426491872090720c20099999alumr n 0 a0eng 2009203372DLCengDLCOCLCQ12150-346X2150-346X1AU at 000044489617NZ116076352Devon Applewhite/Applewhite Publishing Co., 1910 Honeysuckle Rd., #N183, Dothan, AL 36305mscnsdpISSN RECORD30514Triangle tribune(Dothan, Ala.)Triangle tribune.Dothan, AL :Applewhite Pub. CoMonthlyBegan with vol. 1, issue 1 (May 2009).\"Connecting the Tri-State African -American Community.\"Description based on: Vol. 1, issue 1 (May 2009); title from masthead.Applewhite, Devon.United StatesAlabama.United StatesGeorgia.United StatesFlorida.info:srw/schema/1/marcxmlxml00000cas a22000007a 4500289017315081219c20089999aluwr n | a0eng c 2008213218NSDengNSDOCLCQDLCOCLCQ111945-93191945-93191005270USPSSpringhill Publications, LLC, P.O. 
Box 186, Greenville, AL 36037nsdppccISSN RECORD07014Greenville standardThe Greenville standard.Greenville, AL :Springhill PublicationsWeeklytexttxtrdacontentunmediatednrdamediaBegan with vol. 1, issue 1 (Sept. 3, 2008)Description based on surrogate of: Vol. 1, no. 15 (Dec. 18, 2008); title from masthead (publisher's Web site, viewed Dec. 19, 2008).Latest issue consulted: Vol. 1, no. 99 (July 27, 2011) (surrogate).info:srw/schema/1/marcxmlxml00000cas a22000007a 4500123539969070426c20079999aluwr ne 0 a0eng c 2007212138NSDengNSDNSDOCLCQ101936-95571936-95571The Western Tribune, 1530 Third Ave. N., Bessemer, AL 35020mscnsdpISSN RECORDWestern tribune (Bessemer, Ala.)The Western tribune(Bessemer, Ala.)The Western tribune.Bessemer, Ala. :D-Med, Inc.v.WeeklyBegan in 2007.Description based on: May 23, 2007 (surrogate); title from caption.AU at 000041575341info:srw/schema/1/marcxmlxml00000cas a22000007a 4500226300653080425c20079999aluwr ne | a0eng 2008212112NSDengNSDNSDOCLCQ11942-20751942-20751nsdppccISSN RECORDThe corridor messengerThe corridor messenger.Carbon Hill, AL :Corridor Messenger, Inc.WeeklyBegan with vol. 1, issue (10.03.2007).Description based on: 1st issue.United StatesAlabamaWalkerCarbon Hill.http://www.corridormessenger.cominfo:srw/schema/1/marcxmlxml00000cas a22000007a 450077560432070109c20069999aluwr ne 0 a0eng c 2007213400NSDengNSDOCLCQAUBRNOCLCOOCLCFa01935-37901935-37901AU at 000041190283The Auburn Villager, P.O. Box 1633, Auburn, AL 36831-1633pccnsdpISSN RECORDThe Auburn villagerThe Auburn villager.Auburn, AL :Auburn Villagerv.WeeklyBegan in 2006.Description based on: Vol. 1, no. 
4 (July 20, 2006) (surrogate); title from caption.Auburn (Ala.)Newspapers.Lee County (Ala.)Newspapers.AlabamaAuburn.fast(OCoLC)fst01209634AlabamaLee County.fast(OCoLC)fst01211930Newspapers.fast(OCoLC)fst01423814United StatesAlabamaLeeAuburn.info:srw/schema/1/marcxmlxml00000cas a2200000Ii 4500872286785m o d s cr mn|---a||||140311c20069999alucr n o b s0 a0eng cABCengrdaABCABCOCLCFLD59.13University of Alabama at Birmingham.The eReporter.[Birmingham, Alabama] :The University of Alabama at Birmingham,[2006]-[Birmingham, Alabama] :Offices of Public Relations & Marketing and Information Technology1 online resource2 issues weeklytexttxtrdacontentcomputercrdamediaonline resourcecrrdacarrierSeptember 19, 2006-\"The eReporter is an official communication of The University of Alabama at Birmingham, companion to the UAB Reporter and recommended alternative to mass e-mails.\"Issues for <March 11, 2014- published and distributed via e-mail subscription on Tuesdays and Fridays.Description based on: September 19, 2006; title from title screen (viewed March 12, 2014).University of Alabama at BirminghamPeriodicals.Periodicals.fast(OCoLC)fst01411641University of Alabama at Birmingham.fast(OCoLC)fst00645114University of Alabama at Birmingham.Office of Public Relations and Marketing.University of Alabama at Birmingham.Information Technology.2006-2012, companion to:University of Alabama at Birmingham.UAB reporter.(OCoLC)32435748Archived issueshttp://hatteras.dpo.uab.edu/cgi-bin/ereporter.cgiinfo:srw/schema/1/marcxmlxml00000cas a22000007a 4500166387050070829c20059999aluwr ne | a0eng c 2007215501NSDengNSDOCLCQ11939-68991939-68991The Wilkie Clark Memorial Foundation, P.O. Box 514, Roanoke, AL 36274$30.00nsdpmscISSN RECORD305.89614People's voice (Roanoke, Ala.)The people's voice(Roanoke, Ala.)The people's voice.Roanoke, AL :Wilkie Clark Memorial Foundationv.WeeklyBegan with vol. 1, no. 1 in 2005.Description based on: Vol. 2, no. 20 (Apr. 
20, 2007); title from caption.Wilkie Clark Memorial Foundation.United StatesAlabamaRandolphRoanoke.AU at 000042141390info:srw/schema/1/marcxmlxml00000nas a22000007i 45001124677787191021c20uu9999aluwr ne | a0eng 2019202521DLCengrdaDLC12689-3258122730USPSNorth Jackson Press, 42950 Hwy 72, Suite 406, Stevenson, AL 35772nsdppccISSN RECORD071.323North Jackson pressNorth Jackson press.Stevenson, AL :Caney Creek Publications LLCWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierDescription based on surrogate of: Volume 1, number 36 (October 11, 2019); title from masthead.Latest issue consulted: Volume 1, number 36 (October 11, 2019) (Surrogate).United StatesAlabamaJacksonStevensoninfo:srw/schema/1/marcxmlxml00000cas a2200000 a 4500226315099080428d19981998aluwr ne | 0eng c 2008233691GUAengGUAOCLCQOCLCFOCLCO39911644pccn-us-gaThe Dekalb news.Birmingham, Ala. :Community newspaper holdings Inc.v.WeeklyBegan with 1st year, no. 1 (Apr. 1, 1998); ceased with 1st year, no. 31 (Oct. 28, 1998).Final issue consulted.Description based on first issue; title from caption.Decatur (Ga.)Newspapers.DeKalb County (Ga.)Newspapers.Newspapers.fast(OCoLC)fst01423814GeorgiaDecatur.fast(OCoLC)fst01226234GeorgiaDeKalb County.fast(OCoLC)fst01215288United StatesGeorgiaDeKalbDecatur.Decatur-DeKalb news/era(DLC)sn 89053661(OCoLC)19946163info:srw/schema/1/marcxmlxml00000cas a2200000 i 450050263311m o d cr cn|||||||||020730c19979999alu x neo 0 a0eng c 2015238492AMHengrdapnAMHOCLCQOCLCFOCLCOIULOCLHTMOCLCQCOODLC66460694810970435082687-93791AU at 000050711528OCLCS45109pccnsdpn-us---AP2.B5707023Birmingham weekly (Online)Birmingham weekly(Online)Birmingham weekly.Birmingham, AL :Birmingham Weekly1 online resourceIrregular,Feb. 16-28, 2012-Weekly,Sept. 4-11, 1997-Feb. 9-16, 2012texttxtrdacontentcomputercrdamediaonline resourcecrrdacarrierBegan with vol. 1, issue 1 (Sept. 4-11, 1997).\"City news, views & entertainment\"--Cover.Numbering dropped in Mar. 
2012.Also issued in print.Description based on: Publication information from ProQuest; title from web page (viewed June 18, 2015).Latest issue consulted: Aug. 15-20, 2012.Birmingham (Ala.)Newspapers.Internet resources.Electronic journals.AlabamaBirmingham.fast(OCoLC)fst01204958Newspapers.fast(OCoLC)fst01423814United StatesAlabamaBirmingham.Print version:Birmingham Weekly(OCoLC)39271050http://apw.softlineweb.com/http://WC2VB5MT8E.search.serialssolutions.com/?sid=sersol&SS_jc=JC_000051895&title=Birmingham+Weeklyinfo:srw/schema/1/marcxmlxml00000cas a22000007a 450031471314941116d19941995aluwr ne 0 a0eng csn 94003083 NSDengNSDANEOCLCQOCLCFOCLCOOCLCQ11079-65411079-65411nsdppccn-us-akSoutheast shopperSoutheast shopper.Juneau, Alaska :Kemper Communications,1994-volumesWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 1, no. 1 (Nov. 16, 1994)-Ceased in Feb. 1995.Juneau (Alaska)Newspapers.AlaskaJuneau.fast(OCoLC)fst01213587Newspapers.fast(OCoLC)fst01423814United StatesAlaskaJuneau.AU at 000011356572info:srw/schema/1/marcxmlxml00000cas a22000008a 450027910515930413c19949999alumr n 0 a0eng dsn 93002581 NSDengNSDOCLCQ11069-06621Birmingham Tribune, 216 Ave. T. Pratt City, Birmingham, AL 35214nsdpBirmingham tribuneBirmingham tribune.Birmingham, Ala. :Kervin Fondren9501volumesMonthlytexttxtrdacontentunmediatednrdamediavolumencrdacarrierPREPUB: publication expected Jan. 1995AU at 000025863987info:srw/schema/1/marcxmlxml00000cas a22000007a 450026199931920716d19922013alumr ne 0 a0eng csn 92003357 NSDengNSDOCLOCLCQDLC011064-01341064-01341Black & White, POB 13215, Birmingham, AL 35202-3215nsdppccBlack & white (Birmingham, Ala.)Black & white(Birmingham, Ala.)Black & white.Black and whiteBirmingham, Ala. :Black & White, Inc.v.Biweekly,Oct. 2, 1997-Monthly,May 1, 1992-Sept. 1997Began in May 1992; ceased with Jan. 10, 2013.\"Birmingham's New City paper.\"Description based on: June 1992.Latest issue consulted: No. 67 (Oct. 
16, 1997) (surrogate).info:srw/schema/1/marcxmlxml00000cas a2200000 a 450032145723950314d19901999alumr ne 0 a0eng csn 95068755 MGNengMGNNSDCLUOCLCQOCLCFOCLCOOCLCA971211082-34841082-34841AU at 000011579542nsdppccn-us-alF335.J5S68The Southern shofarThe Southern shofar.Birmingham, AL :L. Brook,-[1999]v. :ill. ;35 cm.MonthlyBegan in 1990.-v. 9, issue 9 (Aug./Sept. 1999).\"The monthly newspaper of Alabama's Jewish community.\"Some issues also available on the Internet via the World Wide Web.Description based on: Vol. 3, issue 11 (Oct. 1993).Jewish newspapersAlabama.Jewish newspapers.fast(OCoLC)fst00982872Alabama.fast(OCoLC)fst01204694United StatesAlabamaJeffersonBirmingham.Deep South Jewish voice(DLC)sn 99018499(OCoLC)42431704CLUhttp://bibpurl.oclc.org/web/719http://www.bham.net/shofar/info:srw/schema/1/marcxmlxml00000cas a22000007a 450021265141900326c19909999aluwr ne 0 a0eng csn 90099004 AARengAARCPNNSDOCLCQ11050-08981050-08981005022USPSE.O.N., Inc., Main St., Eclectic, AL 36024pccnsdpISSN RECORDThe Eclectic observerThe Eclectic observer.Eclectic, Ala. :E.O.N., Inc.,1990-v.WeeklyVol. 1, no. 1 (Feb. 22, 1990)-Published by: Price Publications, Inc., <2006->Latest issue consulted: Vol. 17, no. 1 (Jan. 5, 2006).United StatesAlabamaElmoreEclectic.AU at 000040212446info:srw/schema/1/marcxmlxml00000cas a22000007a 450021214781900314c19909999aluir ne 0 a0eng csn 90002457 AAAengAAANSDOCLCQ111050-20841050-20841931180USPSClanton Newspapers, 1109 Seventh St., N., PO Box 1379, Clanton, AL 35045nsdppccn-us-alThe Clanton advertiserThe Clanton advertiser.AdvertiserClanton, Ala. :Clanton Newspapersv. :ill. ;58 cm.Three no. a week,<May 13, 1992->Semiweekly,<Apr. 4, 1990->Began in Jan. 1990.Description based on: Vol. 19, no. 27 (Wed., Apr. 4, 1990).Latest issue consulted: Vol. 22, no. 
58 (May 13, 1992).United StatesAlabamaChiltonClanton.Independent advertiser (Clanton, Ala.)(OCoLC)21214732AU at 000025908452info:srw/schema/1/marcxmlxml00000cas a2200000 a 450021214814900314c19909999aluwr ne 0 a0eng dsn 90099009 AAAengAAACPNNSDOCLCQ11056-32881056-32881505740USPSThe Blount Countian, 3rd St. at Washington Ave., PO Box 310, Oneonta, AL 35121mscnsdpn-us-alThe Blount countianThe Blount countian.Oneonta, Ala. :Southern Democrat, Inc.,1990-v. :ill.WeeklyVol. 1, no. 1 (Jan. 3, 1990)-Editor: Molly Howard Ryan, 1990-Latest issue consulted: Vol. 1, no. 36 (Sept. 5, 1990).Ryan, Molly Howard.United StatesAlabamaBlountOneonta.Southern Democrat(DLC)sn 85044741(OCoLC)12038577AU at 000025884049info:srw/schema/1/marcxmlxml00000cas a22000007a 450022413044900920c19909999aluwr ne 0 a0eng dsn 90099011 AARengAARCPNNSDNSTOCLCQ92081707011191053-91231053-91231314240USPSmscnsdpThe Clay times-journalThe Clay times-journal.Lineville, Ala. :C.L. Proctor,1990-v.WeeklyVol. 1, no. 1 (Sept. 6, 1990)-United StatesAlabamaClayLineville.Ashland progress(DLC)sn 85044701Lineville tribune(DLC)sn 85044702AUinfo:srw/schema/1/marcxmlxml00000cas a22000007a 450021265218900326c19909999aluwr ne 0 0eng dsn 90099005 AARengAARCPNOCLCQmscTrussville news-journal.Trussville, Ala. :Mike Mitchell,1990-v.BimonthlyVol. 1, no. 1 (Feb. 20, 1990)-United StatesAlabamaJeffersonTrussville.info:srw/schema/1/marcxmlxml00000cas a22000007a 450022301035900831c19909999aluwr ne 0 0eng dsn 90099010 AARengAARCPNOCLCQmscWeaver tribune.Oxford, Ala. :Cheaha Pub.,1990-v.WeeklyVol. 1, no. 1 (July 19, 1990)-United StatesAlabamaCalhounWeaver.United StatesAlabamaCalhounOxford.info:srw/schema/1/marcxmlxml00000cas a22000007a 450015155895870205c19879999aludr ne 0 a0eng csn 87050045 AAAengAAACPNNSDDLCCPNNSDDLCCPNDLCOCLDLCOCLCQOCLCFOCLCQ19261126829944596670892-44570892-44571AU at 000020456714360980USPSThe Advertiser, P.O. Box 1000, Montgomery, AL 36192pccnsdpn-us-alNewspaperMontgomery advertiser (Montgomery, Ala. 
: 1987)The Montgomery advertiser(1987)The Montgomery advertiser.Montgomery advertiser & the Alabama journalSunday Montgomery advertiserMontgomery, Ala. :Advertiser Co.,1987-volumes :illustrationsDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrier160th year, no. 1 (Jan. 2, 1987)-On Saturdays, Sundays and holidays a combined edition is published with the Alabama journal, and called: Montgomery advertiser and the Alabama journal, Jan. 3, 1987, and: Alabama journal and Montgomery advertiser, Jan. 4, 1987-Feb. 25, 1990.Issues for Sunday called: Sunday Montgomery advertiser, Mar. 4, 1990-Issues for Saturday, Sunday and holidays have their own numbering, Jan. 3, 1987-Feb. 25, 1990.Montgomery (Ala.)Newspapers.AlabamaMontgomery.fast(OCoLC)fst01202689Newspapers.fast(OCoLC)fst01423814United StatesAlabamaMontgomeryMontgomery.Advertiser (Montgomery, Ala.)0745-3221(DLC)sn 82008412(OCoLC)9049482Alabama journal (Montgomery, Ala. : 1940)0745-323X(DLC)sn 87062018(OCoLC)2666111info:srw/schema/1/marcxmlxml00000cas a2200000 a 450016942287871105c19879999aludn ne 0 a0eng dsn 88050149 AAAengAAACPNNSDOCLCQy1044-00701044-0070746--32780746-32781565580USPSTroy Publications, Inc., 113 North Market St., Troy, AL 36081mscnsdpMessenger (Troy, Ala.)The Messenger(Troy, Ala.)The Messenger.Troy, Ala. :Troy Pub.,1987-v.Daily (Sunday, Tuesday, Thursday and Friday)Vol. 121, no. 166 (July 1, 1987)-Sunday, Apr. 2, 1989 misprinted as v. 113.Latest issue consulted: Vol. 113 [sic 123], no. 96 (Sunday, Apr. 2, 1989).United StatesAlabamaPikeTroy.Troy messenger0746-3278(DLC)sn 83009935(OCoLC)9921908info:srw/schema/1/marcxmlxml00000cas a22000007a 450017799786880415c19879999aluir ne 0 a0eng dsn 88050086 AARengAARCPNNSDOCLCQ1p1044-03801044-03800745-75961441520USPSThe Prattville Progress, 152 W. 3rd St., Prattville, AL 36067mscnsdpPrattville progress (Prattville, Ala. : 1987)The Prattville progress(Prattville, Ala.)The Prattville progress.Prattville, Ala. :James C. Seymour,1987-v.Three times a weekVol. 
102, no. 8 (Jan. 20, 1987)-Latest issue consulted: Vol. 105, no. 153 (Wednesday, Dec. 26, 1990).United StatesAlabamaAutaugaPrattville.Progress (Prattville, Ala.)0745-7596(DLC)sn 83007623(OCoLC)9428489info:srw/schema/1/marcxmlxml00000cas a22000007a 450015344667870319c19869999aluwr ne 0 a0eng dsn 87000284 NSDengNSDCPNOCLCQy0893-07670893-07671431800USPSPickens County Herald, P.O. Drawer E, Carrollton, AL 35447nsdpPickens County heraldPickens County herald.Pickens County herald and west AlabamianCarrollton, Ala. :Pickens Newspapers, Inc.,1986-WeeklyVol. 138, no. 40 (Oct. 2, 1986)-United StatesAlabamaPickensCarrollton.Pickens County herald and west Alabamian0746-0473(DLC)sn 83008141AU at 000040635809info:srw/schema/1/marcxmlxml00000cas a22000007a 450018917586881217c19869999aluwr ne 0 0eng dsn 88050225 CPNengCPNOCLCQmscThe Oxford sun/times.Oxford, Ala. :[s.n.],1986-v.WeeklyVol. 1, no. 1 (Jan. 16, 1986)-Editor: Andy Goggans.Numbering is irregular.United StatesAlabamaCalhounOxford.Oxford sun (Oxford, Ala.)(DLC)sn 85045023AU at 000025803813info:srw/schema/1/marcxmlxml00000cas a22000007a 450013991168860731c19869999aluwr ne 0 0eng dsn 86050322 CPNengCPNOCLCQmscIndependent (Brewton, Ala.)The Independent.Brewton, Ala. :Jim Thornton,1986-v. :ill. ;58 cm.WeeklyVol. 1, no. 1 (June 19, 1986)-United StatesAlabamaEscambiaBrewton.info:srw/schema/1/marcxmlxml00000cas a22000007a 450018957493881231c19859999aluwr ne 0 0eng dsn 88050247 CPNengCPNOCLCQmscPiedmont journal-independent (Piedmont, Ala.)The Piedmont journal-independent.Journal independentPiedmont, Ala. :Lane Weatherbee,1985-v.WeeklyVol. 4, no. 52 (Dec. 24, 1985)-Sometimes published as: Journal independent.United StatesAlabamaCalhounPiedmont.Journal-independent(DLC)sn 85045014info:srw/schema/1/marcxmlxml00000cas a22000007a 450012715821851024d19841985aluwr ne 0 a0eng dsn 85045014 CPNengCPNNSDCPNOCLCQmscThe Journal-independent.Piedmont, Ala. 
:Journal-Independent, Inc.,1984-1985.volumes :illustrations ;58 cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 3, no. 27 (July 3, 1984)- v. 4, no. 51 (Dec. 18, 1985).Carries the same vol. numbering as the Piedmont journal-independent.United StatesAlabamaCalhounPiedmont.Piedmont journal-independent0890-6017(DLC)sn 85045013Piedmont journal-independent (Piedmont, Ala.)(DLC)sn 88050247info:srw/schema/1/marcxmlxml00000cas a22000007a 450012691448851018c19839999aludr ne 0 0eng dsn 85045007 CPNengCPNOCLCQmscTimesDaily.Times dailyFlorence, Ala. :T.S.P. Newspapers, Inc.,1983-volumes :illustrations ;58 cmDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 114, no. 226 (Aug. 14, 1983)-United StatesAlabamaLauderdaleFlorence.Florence times + tri-cities daily(DLC)sn 85044995info:srw/schema/1/marcxmlxml00000cas a22000007a 45009428489830420d19831987aluir ne 0 a0eng dsn 83007623 NSDengNSDCPNNSDNSTOCLCQ89090d0745-75960745-75961The Progress, 152 W. 3rd St., Prattville, AL 36067nsdpmscProgress (Prattville, Ala.)The Progress(Prattville, Ala.)The Progress.Prattville, Ala. :The Prattville Progress,1983-1987.volumes :illustrations ;58 cmThree times a weektexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 98, no. 32 (Mar. 17, 1983)-v. 102, no. 7 (Jan. 17, 1987).United StatesAlabamaAutaugaPrattville.Prattville progress(DLC)sn 85044740Prattville progress (Prattville, Ala.)1044-0380(DLC)sn 88050086(OCoLC)12254317AAPinfo:srw/schema/1/marcxmlxml00000cas a2200000 a 45009867255830831c19839999aludr ne 0 a0eng dsn 84008052 AAAengAAANSDOCLOCLCQX0743-15110743-15111617760USPST.S.P. Newspapers, Inc., 219 W. Tennessee St., Florence, AL 35630nsdpTimesDaily (Shoals edition)TimesDaily(Shoals ed.)TimesDaily.Times dailyShoals ed.Florence, Ala. :T.S.P. Newspapersvolumes :illustrationsDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan with: Vol. 114, no. 226 (Aug. 14, 1983).\"Florence/Sheffield/Tuscumbia/Muscle Shoals.\"Shoals ed. and Regional ed. 
combined on Sundays.Description based on: Vol. 114, no. 346 (Monday, Dec. 12, 1983).United StatesAlabamaLauderdaleFlorence.TimesDaily (Regional edition)0743-152XTimes Tri-cities dailyUnknownDec. 12, 1983info:srw/schema/1/marcxmlxml00000cas a22000007a 450010536023840319c19839999aludr ne 0 a0eng dsn 84008051 NSDengNSDOCLCQ1x0743-152X0743-152X1617760USPST.S.P. Newspapers, Inc., 219 W. Tennessee St., Florence, AL 35630nsdpTimesDaily (Regional edition)TimesDaily(Regional ed.)TimesDaily.Times dailyRegional ed.Florence, Ala. :T.S.P. NewspapersDailytexttxtrdacontentunmediatednrdamediaBegan with: Vol. 114, no. 226 (Aug. 14, 1983).Shoals ed. and Regional ed. combined on Sundays.Description based on: Vol. 114, no. 346 (Monday, Dec. 12, 1983).United StatesAlabamaLauderdaleFlorence.TimesDaily (Shoals edition)0743-1511Times Tri-cities dailyDec. 12, 1983AU at 000025818125info:srw/schema/1/marcxmlxml00000cas a22000007a 45009049482821213d19821987aludn ne 0 a0eng csn 82008412 AAAengAAANSDNPWCPNDLCCPNNSDDLCNSDDLCCPNNVFDLCOCLCQCRLOCLCFOCLCQ1d0745-32210745-32211nsdppccn-us-alNewspaperAdvertiser (Montgomery, Ala.)The Advertiser(Montgomery, Ala.)The advertiser.Alabama journal and advertiserMontgomery, Ala. :Advertiser Co.,1982-1987.volumes :illustrationsDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrier155th year, no. 232 (Nov. 22, 1982)- ; -v. 14-3, Jan. 1, 1987.On Saturdays, Sundays and holidays published as: The Alabama journal and advertiser, Nov. 27, 1982-Jan. 1, 1987.Saturday, Sunday and holiday issues have their own numbering.Montgomery (Ala.)Newspapers.AlabamaMontgomery.fast(OCoLC)fst01202689Newspapers.fast(OCoLC)fst01423814United StatesAlabamaMontgomeryMontgomery.Montgomery advertiser (Montgomery, Ala. : Daily)(DLC)sn 84020645(OCoLC)2685433Montgomery advertiser (Montgomery, Ala. 
: 1987)0892-4457(DLC)sn 87050045(OCoLC)15155895AU at 000020281746info:srw/schema/1/marcxmlxml00000cas a2200000 a 45009237931830218c19829999aluwr ne 0 0eng dsn 86050139 AAAengAAACPNOCLOCLCQmscThe Randolph leader.Roanoke, Ala. :David S. Stevenson,1982-volumes :illustrations ;58 cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 91, no. 1 (Oct. 6, 1982)-United StatesAlabamaRandolphRoanoke.Roanoke leader(DLC)sn 86050137Randolph press(DLC)sn 86050138info:srw/schema/1/marcxmlxml00000cas a22000007a 450012715815851024d19821984aluwr ne 0 a0eng dsn 85045013 CPNengCPNNSDCPNOCLCQ110890-60170890-60171432080USPSThe Piedmont Journal-Independent, 115 N. Center Ave., Piedmont, AL 36272mscnsdpThe Piedmont journal-independentThe Piedmont journal-independent.Piedmont, Ala. :Piedmont Journal-Independent, Inc.,1982-1984.volumes :illustrations ;58 cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 1, no. 1 (Mar. 31, 1982)-v. 3, no. 26 (June 27, 1984).Latest issue consulted: Vol. 5, no. 31 (August 20, 1986).United StatesAlabamaCalhounPiedmont.Piedmont journal(DLC)sn 85045012Journal-independent(DLC)sn 85045014(OCoLC)12715821AU at 000045312916info:srw/schema/1/marcxmlxml00000cas a22000007a 45009183905830202c19829999aluwr n 0 a0eng dsn 85044580 AAAengAAACPNNSDOCLOCLCQ11098-58671098-58671016409USPSNo. 4, Rucker Plaza, Enterprise, AL 36331P.O. Box 1536, Enterprise, AL 36331mscnsdpSoutheast sun (Enterprise, Ala.)The southeast sun(Enterprise, Ala.)The Southeast sun.Enterprise, Ala. :QST Publicationsvolumes :illustrations ;58 cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan in 1982.Description based on: Vol. 1, no. 25 (Oct. 21, 1982).Latest issue consulted: Vol. 16, no. 43 (Mar. 
4, 1998).United StatesAlabamaCoffeeEnterprise.AU at 000025827687info:srw/schema/1/marcxmlxml00000cas a22000007a 450010487314840305c19819999aluwr ne 0 a0eng dsn 85044906 AAAengAAACPNNSDNSTCPNOCLOCLCQOCLCFOCLCOOCLCAOCLCQ900410885-16620885-16621749310USPSThe New Times, 1618 1/2 St. Stephens Rd., Mobile, AL 36603mscnsdpn-us-alNew times (Mobile, Ala.)The New times(Mobile, Ala.)The new times.Mobile, Ala. :New Times Groupvolumes :illustrationsWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan in 1981.Vol. 3, no. 49 (Dec. 15-21, 1983) and vol. 3, no. 50 (Dec. 22-28, 1983) are both called vol. 3, no. 49 (Dec. 15-21, 1983).Description based on: Vol. 2, no. 3 (Jan. 28-Feb. 3, 1982).African AmericansAlabamaNewspapers.African Americans.fast(OCoLC)fst00799558Alabama.fast(OCoLC)fst01204694Newspapers.fast(OCoLC)fst01423814United StatesAlabamaMobileMobile.AAPUnknownAug. 15, 1985AU at 000024686659info:srw/schema/1/marcxmlxml00000cas a22000007a 450018922463881219d19811983alucr ne 0 0eng dsn 88050233 AARengAARCPNNSDOCLCQmscThe Sylacauga daily advance.Advance/Sylacauga dailySylacauga advanceSunday advanceAdvanceSylacauga, Ala. :Mrs. W.A. Moody,1981-1893.v.Semiweekly,<Nov. 24, 1982-Feb. 13, 1983>Daily (except Mon., Tues. & Sat.),<May 26, 1982-Nov. 21, 1982>Daily (except Sat. & Mon.),<Jan. 1, 1981-May 23, 1982>74th Year, no. 123 (Jan. 1, 1981)-76th year, no. 83 (Feb. 13, 1983).Days of publication vary.Published as: The Advance/Sylacauga daily, <Aug. 28, 1981-May 23, 1982>.Published as: Sylacauga advance, <Nov. 24, 1982-Feb. 13, 1983>.On Sunday, published as: Sunday advance.United StatesAlabamaTalladegaSylacauga.Childersburg star(DLC)sn 88050232Coosa press(DLC)sn 86050293Daily home1059-6461(DLC)sn 88050234info:srw/schema/1/marcxmlxml00000cas a22000007a 450021026715cr un|||||||||900209c19809999aluwr ne 0 0eng dsn 90099002 AARengAARCPNCUSOCLOCLCQTJCOCLCQOCLCFOCLCOOCLCA926143844AU at 000020585756mscn-us-alSpeakin' out news.Speaking out newsDecatur, Ala. 
:Minority Network, Inc.v.WeeklyBegan in 1980.Published in Huntsville, Ala., <1987>-Also issued by subscription via the World Wide Web.Description based on: Vol. 7, no. 8 (Jan. 7-13, 1987).African AmericansAlabamaNewspapers.African American newspapersAlabama.AlabamaNewspapers.Newspapers.fast(OCoLC)fst01423814African American newspapers.fast(OCoLC)fst00799278African Americans.fast(OCoLC)fst00799558Alabama.fast(OCoLC)fst01204694United StatesAlabamaMorganDecatur.United StatesAlabamaMadisonHuntsville.Speakin' out weekly news(DLC)sn 88050097http://www.softlineweb.com/softlineweb/ethnic.htminfo:srw/schema/1/marcxmlxml00000cas a22000007a 450014996511861219c19809999aluwr ne 0 a0eng csn 86050472 AARengAARCPNNSDOCLCQ11080-15021080-15021328110USPSnsdppccWest-Alabama gazetteWest-Alabama gazette.GazetteMillport, Ala. :Millport Pub. Co.,1980-volumesWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrier4th year, no. 32 (Jan. 3, 1980)-United StatesAlabamaLamarMillport.Gazette (Millport, Ala.)(DLC)sn 86050471info:srw/schema/1/marcxmlxml00000cas a2200000 a 450011828156850320c19809999aluwr ne 0 0eng dsn 86050314 AAAengAAACPNOCLOCLCQmscThe Hartford news-herald.Hartford, Ala. :Geneva Publications,1980-volumes :illustrations ;57-59 cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 80, no. 20 (Feb. 14, 1980)-United StatesAlabamaGenevaHartford.News-herald (Hartford, Ala.)(DLC)sn 86050313info:srw/schema/1/marcxmlxml00000cas a22000007a 450017857788880427d198u198ualusr ne 0 0eng dsn 88050097 AARengAARCPNOCLOCLCQOCLCFOCLCOOCLCAmscn-us-alSpeakin' out weekly news.Decatur, Ala. :Smothers PublicationsPublished every first and third Wed. of each monthDescription based on: Vol. 3, no. 
13 (May 4-17, 1983).African AmericansAlabamaNewspapers.Newspapers.fast(OCoLC)fst01423814African Americans.fast(OCoLC)fst00799558Alabama.fast(OCoLC)fst01204694United StatesAlabamaMorganDecatur.Weekly news (Huntsville, Ala.)(DLC)sn 87050012Speakin' out news(DLC)sn 90099002info:srw/schema/1/marcxmlxml00000cas a2200000 a 450017807936880418c198u9999aluwr ne 0 a0eng dsn 90099001 AAAengAAACPNOCLOCLCQThe Daleville Sun-Courier, 310 Daleville Ave., Daleville, AL 36322mscn-us-alDaleville sun-courier.Daleville, Ala. :QST Publicationsv. :ill. ;58 cm.WeeklyDescription based on: Vol. 2, no. 28 (Wed., Feb. 17, 1988).United StatesAlabamaDaleDaleville.AU at 000020585749info:srw/schema/1/marcxmlxml00000cas a22000007a 450015580838870423c198u9999aluwr ne 0 0eng dsn 87050128 AARengAARCPNOCLCQmscGreene County independent.Eutaw, Ala. :Greene County Independent, Inc.v.WeeklyDescription based on: Vol. 2, no. 10 (Mar. 12, 1987).United StatesAlabamaGreeneEutaw.info:srw/schema/1/marcxmlxml00000cas a22000007a 450010125135831114d198u198ualucr ne 0 a0eng dsn 83003221 NSDengNSDOCLCQ0d0746-55210746-55211Auburn Bulletin & Lee County Eagle, PO Box 2111, Auburn, Ala. 36830nsdpThe Auburn bulletin & the Lee County eagleThe Auburn bulletin & the Lee County eagle.Lee County eagleAuburn bulletin and the Lee County eagleAuburn, Ala. :[publisher not identified]Semiweekly,<Sept. 5, 1984->WeeklytexttxtrdacontentunmediatednrdamediaDescription based on: Oct. 19, 1983.United StatesAlabamaLeeAuburn.Auburn bulletin(DLC)sn 89050006Eagle (Auburn, Ala.)(OCoLC)18435663Sept. 5, 1984info:srw/schema/1/marcxmlxml00000cas a22000007a 450018370324880818c198u9999aluwr ne 0 0eng dsn 88050147 CPNengCPNOCLCQmscTri-city times (Geraldine, Ala.)The Tri-City times.Geraldine, Ala. :Wanda Nelsonv.WeeklyDescription based on: Vol. 2, no. 24 (Jan. 
6, 1982).United StatesAlabamaDeKalbGeraldine.info:srw/schema/1/marcxmlxml00000cas a22000007a 450010199338831208c198u9999aluwr ne 0 a0eng dsn 83005367 NSDengNSDCPNOCLCQ10746-62770746-62771707590USPSSpringville Pub. Co., 539 Main St., Springville, AL 35146nsdpThe St. Clair clarionThe St. Clair clarion.Saint Clair clarionSpringville, AL :Gary L. ShultsWeeklytexttxtrdacontentunmediatednrdamediaDescription based on: Vol. 2, no. 1 (Jan. 5, 1982).United StatesAlabamaSt. ClairSpringville.AU at 000025783743info:srw/schema/1/marcxmlxml00000cas a22000007a 450013787251860627c198u9999aluwr ne 0 a0eng dsn 86001923 NSDengNSDCPNOCLCQ10889-00800889-00801The Westerner Star, P.O. Box 2060, Bessemer, AL 35021nsdpWestern star (Bessemer, Ala.)The Western star(Bessemer, Ala.)The western star.Bessemer, Ala. :Hal HodgensWeeklytexttxtrdacontentunmediatednrdamediaDescription based on: Vol. 3, no. 15 (Wednesday, June 11, 1986).United StatesAlabamaJeffersonBessemer.Bessemer advertiser(DLC)sn 87050117AU at 000025805174511.1srw.pc any \"y\" and srw.mt any \"newspaper\" and srw.cp exact \"Alabama\"50info:srw/schema/1/marcxmlxml1Date,,0mq1lME887FoIbjulKUV6bx9ImwWQNCv9GqZzGS92IKS31lEbcpRJBNHgcE1l29tFaHP9CHe0Yexk1uWQofffull" ------------------------------------------------------------------------ Newspapers and Current Periodicals Reference Librarian Jul 27 2022, 09:22am via System Hello Spencer, Thank you for reaching out about the bulk xml files for the US Newspaper Directory. We don't have documentation specific to these bulk xml files, but upon further inspection I can say that each of those files don't necessarily contain info for 50 newspaper titles. The structure of the titles for California and New York for instance are different from say, Alabama. If you look at California for example, the file naming structure indicates the year the title started, and then the number of titles included in that xml file. 
So for instance, the files below include info for newspapers that started in 2000, 2001, and 2002 respectively. And there is info for 30 titles in the xml file from 2000, and 14 in the file for 2001, and so on. * ndnp_California_2000_e_0001_0030.xml * ndnp_California_2001_e_0001_0014.xml * ndnp_California_2002_e_0001_0012.xml If there's more than 50 titles for a given year, say for California starting in 1880, then the next 50 titles will roll into the next xml file, and so on. And the last xml file for that year may not include 50 titles. Many of the states seem to group all the years together, so each xml file contains 50 titles, until possibly the last one for a given state, which may contain less. I hope this information helps explain the total number of records and structure a bit better. Let me know if you have any further questions. Best wishes, Kerry Huller Newspaper & Current Periodical Reading Room Serial & Government Publications Division Library of Congress ------------------------------------------------------------------------ Newspapers and Current Periodicals Reference Librarian Jul 25 2022, 02:22pm via Email Hi, Kerry: Might there be documentation on the XML files you mentioned? I've successfully read 'https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/', extracted the names of 6666 XML files, and read the first one, "ndnp_Alabama_all-yrs_e_0001_0050.xml". It contains 29415 characters, beginning, "1.12250info:srw/schema/1/marcxmlxml00000nas a22000007i 45001030438981180404c20159999aluwr n 0 a0eng ". With a bit more effort, I will likely be able to parse all 6666 of these. The names suggest that each contains information on 50 newspapers, totaling 333,300. The main page "https://chroniclingamerica.loc.gov/search/titles/" says there are only 157,521 "Titles currently listed". 
This suggests that these XML files include placeholders for a little more than double the number of entries currently in "https://chroniclingamerica.loc.gov/search/titles/". Thanks for this. Progress. ------------------------------------------------------------------------ Newspapers and Current Periodicals Reference Librarian Jul 07 2022, 08:55am via System Hi Spencer, I thought of one more option after I emailed you yesterday that I wanted to make you aware of. I had explained the other day how we pull the records from OCLC into our U.S. Newspaper Directory. You can also access all of the raw MARC records found in the directory in xml format from here if you choose: https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/ These will provide you all of the data from the record fields in MARC format, so you'd get all the data you see here for example: https://chroniclingamerica.loc.gov/lccn/sn98059792/marc/ but in xml. I don't know if this might be more data and info than you want to work with, but wanted to make sure you were aware of this option as well. Best wishes, Kerry Huller Newspaper & Current Periodical Reading Room Serial & Government Publications Division Library of Congress ------------------------------------------------------------------------ Newspapers and Current Periodicals Reference Librarian Jul 06 2022, 10:55am via System Hi Spencer, Thanks for reaching out again. I have been looking at the json view a bit closer this morning and your example of "9999." After talking with a colleague this morning and looking at various examples, I see there is some variation in how the titles with either an unknown starting/ending date or currently published titles are being handled - depending on the view. As an example, I completed a search in the directory for Alaska and the city of Anchorage. 
There are 80 results, and on the first page of results you'll see # 4. Fort Richardson news, which was published from 1952-19??. The csv view of this state/city search result will show the ending date of 19??. But if I append &format=json to this search result, this specific title will show an ending date of 1999. After talking with a colleague this morning, I discovered an integer had to be used in these cases where dates were "?" so that the search based on year range would work. Similarly, if you look at # 12 Alaska digest, which was published 1994-current, the "current" becomes "9999" in the json view. So, the records you are seeing with "9999" would most likely be titles with an ending date of "current." However, there is an issue with the unknown dates, like "1999" being used for "19??" in the example above. The "9" does not get inserted in place of "?" when you are looking at the title/LCCN view of a specific newspaper. So for instance, if you view the #4 title: Fort Richardson news at this url: https://chroniclingamerica.loc.gov/lccn/sn98059792/ but append .json to the end of the url, after the LCCN, like this: https://chroniclingamerica.loc.gov/lccn/sn98059792.json you'll see that the end_year is "19??." Viewing the title/LCCN json view for titles that are currently published will also show the end_year as "current." The Alaska digest example from above can be viewed here: https://chroniclingamerica.loc.gov/lccn/sn97060056.json I wasn't aware of the difference between the directory search json view and the title/LCCN view. But I think it would be possible to grab the data from the title/LCCN json url through an additional script potentially. The json url is included in the view under the "url" field. 
Of course, there are unknowns with publishing dates, but better to know where the question marks are, and what titles are considered to be current. I hope this clarifies the data a bit more - let me know if any of it needs more clarification though. And let me know if you have follow-up questions. Thank you, Kerry Huller Newspaper & Current Periodical Reading Room Serial & Government Publications Division Library of Congress ------------------------------------------------------------------------ Newspapers and Current Periodicals Reference Librarian Jul 05 2022, 04:42pm via Email Hi, Kerry: What would you suggest I do to get a count of the numbers of newspapers and publishers operating by year from, say, 1790 to 2021? I just determined that 20630 (13 percent) of the 157520 records in the US Newspaper database I downloaded a week ago have end_year = 9999. I don't think it's feasible to assume that all or even most of those are still publishing. Might there be some other database that might have this kind of information? I ask, because Robert McChesney (2004) The Problem of the Media (Monthly Review Pr., esp. pp. 34-35) suggests that in the first half of the nineteenth century, the US had more newspapers and newspaper publishers per capita than any other place or time. He suggests that that diversity of newspapers helped encourage literacy and limit political corruption, both of which helped propel the young US to its current dominance of the international political economy. I'm hoping to get some data to evaluate this claim. Sadly, it looks like there is too much missing and questionable data in this dataset for me to use this without a fairly substantive data cleaning effort. ------------------------------------------------------------------------ Newspapers and Current Periodicals Reference Librarian Jul 05 2022, 09:05am via System Hello Spencer, Thank you for reaching out about your additional questions. 
I was looking at the records you mention above, and yes, you are correct - those 9 records with the date inconsistencies and the one record for The New Mexican mining news <https://chroniclingamerica.loc.gov/lccn/sn93061507/> containing "Santa Fe.\" have typos in them. Thanks for spotting these - it may be possible to have the cataloger in our division correct those typos. I will look into this further. The U.S. Newspaper Directory doesn't have a connection with Wikimedia or Wikipedia. The Library of Congress periodically pulls the records for the Directory from OCLC Worldcat <https://www.oclc.org/en/worldcat.html>. And those newspaper records in OCLC Worldcat have been created by catalogers at various institutions around the U.S. over the span of several years. So, occasionally, you will find a typo in the records. Corrections can be made by OCLC and library staff at the various institutions. Every time we complete a new pull on the OCLC records, any corrected records will then populate our Directory. Regarding your question on the New-York weekly journal - yes, that is also correct that it has two records. There is actually a record for each format of the newspaper, so this record is for the microfilm format <https://chroniclingamerica.loc.gov/lccn/2009252748/> and this one is for the original print format <https://chroniclingamerica.loc.gov/lccn/sn83030211/>. You can see in the heading for the microfilm record where it says [microfilm reel] and the print version shows [volume]. You are likely to see this for other titles as well because each format has been cataloged with its own LCCN. You are also likely to see additional records with [online resource] identified as the format as more and more titles are available as ePrints or online. I hope this helps answer your additional questions a bit more. Please reach out if you have any other questions. 
Thank you, Kerry Huller Newspaper & Current Periodical Reading Room Serial & Government Publications Division Library of Congress ------------------------------------------------------------------------ Newspapers and Current Periodicals Reference Librarian Jul 04 2022, 01:47pm via Email Hi, Kelly: At the risk of bombing your inbox with more emails than you want, what is your relationship with Wikipedia and other Wikimedia Foundation projects like Wikidata? I ask, because I've logged over 20,000 edits in Wikimedia Foundation projects since 2010, and I would happily try to answer questions about Wikidata and other Wikimedia Foundation projects. I have NOT organized an edit-a-thon, but I've made presentations at conferences with people who have, and I would happily try to help organize such if you could find a group of people who want to work to improve this US Newspaper database. I think it would be good to establish links between this US Newspaper database and Wikidata, with appropriate procedures so changes to one could be evaluated for acceptance into the other. FYI, John Peter Zenger's famous "New-York weekly journal" (1733-1751) appears TWICE in your database with lccn = 2009252748 and sn83030211 and ONCE in Wikidata WITHOUT an lccn, even though many other Wikidata items have an lccn. See: https://www.wikidata.org/wiki/Q23091960 There's a "WikiProject Newspapers" on Wikipedia and a companion "WikiProject Periodicals" on Wikidata: https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Newspapers/Wikidata https://www.wikidata.org/wiki/Wikidata:WikiProject_Periodicals I've tried to connect with others on those projects, so far with only limited success. However, you may know that almost anyone can change almost anything on Wikipedia and other Wikimedia Foundation projects. What stays tends to be written from a neutral point of view citing credible sources. They have problems with vandals, but the problems are usually easily controlled. 
This makes Wikipedia and Wikidata very useful platforms for cleaning up databases like your US Newspaper dataset. Spencer Graves ########## Hello, Kelly: In addition to the invalid JSON, discussed below [NOTE: The "below" contains a slight addition to the report of the I sent last Friday.], I found 9 (NINE!) cases where start_year was AFTER end_year. These have lccn = "sn86071531" "sn95069213" "sn90059096" "sn86058451" "sn90060926" "sn99065409" "sn89065002" "sn98069857" "sn91059179" See: https://chroniclingamerica.loc.gov/lccn/sn86071531/ https://chroniclingamerica.loc.gov/lccn/sn95069213/ https://chroniclingamerica.loc.gov/lccn/sn90059096/ https://chroniclingamerica.loc.gov/lccn/sn86058451/ https://chroniclingamerica.loc.gov/lccn/sn90060926/ https://chroniclingamerica.loc.gov/lccn/sn99065409/ https://chroniclingamerica.loc.gov/lccn/sn89065002/ https://chroniclingamerica.loc.gov/lccn/sn98069857/ https://chroniclingamerica.loc.gov/lccn/sn91059179/ These all have obvious coding errors that can be easily fixed. The data may not be completely accurate after the fix, but at least they are not obviously wrong ;-) ################## I got invalid JSON from: https://chroniclingamerica.loc.gov/search/titles/results/?rows=500&page=103&format=json After some experimentation, I was able to replicate the problem with a request for rows=10: https://chroniclingamerica.loc.gov/search/titles/results/?rows=10&page=5117&format=json Duncan Temple Lang <dtemplelang at ucdavis.edu>, Professor of Statistics and Associate Dean for Graduate Programs at the University of California - Davis, confirmed that it was a JSON error using: https://codebeautify.org/jsonvalidator He is part of the core team developing the R free, open-source programming language. 
He said, that starting at offsets 161070 and 161502 in the character string you get from [the R code RCurl::getURL()] we have: Santa Fe.\" and these are in an entry such as "city": ["Santa Fe.\"] So the final " is escaped and therefore there is no closing " for the string. The parser continues to consume characters looking for the end of that string. If one "repairs" the text from getURL() with ftxt= gsub('Santa Fe.\\\\"', 'Santa Fe."', txt) then the rest of my code worked fine. You may wish to do something to implement other checks for valid JSON and repair this problem. I've scanned all the 157520 records that were in that database a couple of days ago, and this is the only JSON error identified by the code I used. NOTE: I was NOT able to replicate this error when downloading records one at a time. That suggests a problem NOT in the database itself but in the download algorithm. ??? Thank you for your help. I will almost certainly have other questions ;-) ------------------------------------------------------------------------ Newspapers and Current Periodicals Reference Librarian Jul 03 2022, 10:39pm via Email Hello, Kelly: In addition to the invalid JSON, discussed below [NOTE: The "below" contains a slight addition to the report of the I sent last Friday.], I found 9 (NINE!) cases where start_year was AFTER end_year. 
These have lccn = "sn86071531" "sn95069213" "sn90059096" "sn86058451" "sn90060926" "sn99065409" "sn89065002" "sn98069857" "sn91059179" See: https://chroniclingamerica.loc.gov/lccn/sn86071531/ https://chroniclingamerica.loc.gov/lccn/sn95069213/ https://chroniclingamerica.loc.gov/lccn/sn90059096/ https://chroniclingamerica.loc.gov/lccn/sn86058451/ https://chroniclingamerica.loc.gov/lccn/sn90060926/ https://chroniclingamerica.loc.gov/lccn/sn99065409/ https://chroniclingamerica.loc.gov/lccn/sn89065002/ https://chroniclingamerica.loc.gov/lccn/sn98069857/ https://chroniclingamerica.loc.gov/lccn/sn91059179/ These all have obvious coding errors that can be easily fixed. The data may not be completely accurate after the fix, but at least they are not obviously wrong ;-) ################## I got invalid JSON from: https://chroniclingamerica.loc.gov/search/titles/results/?rows=500&page=103&format=json After some experimentation, I was able to replicate the problem with a request for rows=10: https://chroniclingamerica.loc.gov/search/titles/results/?rows=10&page=5117&format=json Duncan Temple Lang <dtemplelang at ucdavis.edu>, Professor of Statistics and Associate Dean for Graduate Programs at the University of California - Davis, confirmed that it was a JSON error using: https://codebeautify.org/jsonvalidator He is part of the core team developing the R free, open-source programming language. He said, that starting at offsets 161070 and 161502 in the character string you get from [the R code RCurl::getURL()] we have: Santa Fe.\" and these are in an entry such as "city": ["Santa Fe.\"] So the final " is escaped and therefore there is no closing " for the string. The parser continues to consume characters looking for the end of that string. If one "repairs" the text from getURL() with ftxt= gsub('Santa Fe.\\\\"', 'Santa Fe."', txt) then the rest of my code worked fine. You may wish to do something to implement other checks for valid JSON and repair this problem. 
I've scanned all the 157520 records that were in that database a couple of days ago, and this is the only JSON error identified by the code I used. NOTE: I was NOT able to replicate this error when downloading records one at a time. That suggests a problem NOT in the database itself but in the download algorithm. ??? Thank you for your help. I will almost certainly have other questions ;-) ------------------------------------------------------------------------ Newspapers and Current Periodicals Reference Librarian Jul 01 2022, 11:46am via Email Hello, Kelly: I got invalid JSON from: https://chroniclingamerica.loc.gov/search/titles/results/?rows=500&page=103&format=json After some experimentation, I was able to replicate the problem with a request for rows=10: https://chroniclingamerica.loc.gov/search/titles/results/?rows=10&page=5117&format=json Duncan Temple Lang <dtemplelang at ucdavis.edu>, Professor of Statistics and Associate Dean for Graduate Programs at the University of California - Davis, confirmed that it was a JSON error using: https://codebeautify.org/jsonvalidator He is part of the core team developing the R free, open-source programming language. He said, that starting at offsets 161070 and 161502 in the character string you get from [the R code RCurl::getURL()] we have: Santa Fe.\" and these are in an entry such as "city": ["Santa Fe.\"] So the final " is escaped and therefore there is no closing " for the string. The parser continues to consume characters looking for the end of that string. If one "repairs" the text from getURL() with ftxt= gsub('Santa Fe.\\\\"', 'Santa Fe."', txt) then the rest of my code worked fine. You may wish to do something to implement other checks for valid JSON and repair this problem. I've scanned all the 157520 records that were in that database a couple of days ago, and this is the only JSON error identified by the code I used. Thank you for your help. 
I will almost certainly have other questions ;-) ------------------------------------------------------------------------ Newspapers and Current Periodicals Reference Librarian Jun 28 2022, 02:20pm via System Hello Spencer, Thank you for sending along your follow-up questions. I'm glad to hear the json view will work for you. It was recommended to me that you limit your requests to 500 rows at a time. And a developer here at LC suggests the following regarding rate limiting: "To avoid being blocked by the server, the current rate-limiting rules restrict un-cached requests to URLs starting with https://chroniclingamerica.loc.gov/search/ to 120 requests every 10 minutes from a single IP address." So, I think if you limited each of your requests to 500 rows at a time with the proper pauses, then you should be able to access what you need. As for the csv view, I checked on this as well, and was informed that the csv view was not implemented for all url formats. The csv view was only implemented for this view: https://chroniclingamerica.loc.gov/newspapers/ and urls resulting from US Directory search results - e.g., if you wanted to narrow down your search results by state, city, date range, etc. found at this link: https://chroniclingamerica.loc.gov/search/titles/. So, if you wanted a csv and limited your search by state (for example: https://chroniclingamerica.loc.gov/search/titles/results/?state=Alaska&county=&city=&year1=1690&year2=2022&terms=&frequency=&language=&ethnicity=&labor=&material_type=&lccn=&rows=20&format=csv ), you could append &format=csv to the search result url and get the csv to automatically download. 
But, if your search results ended up being over a couple thousand titles, then the system would probably time out. I hope this info helps! Let me know if you have any other questions. Best wishes, Kerry Huller Newspaper & Current Periodical Reading Room Serial & Government Publications Division Library of Congress ------------------------------------------------------------------------ Newspapers and Current Periodicals Reference Librarian Jun 27 2022, 04:15pm via Email Hello, Kerry: Thanks for the reply. Can you please give me some further guidance on two things "so that the system is not overwhelmed"? 1. The max size in a small batch? 2. Any limit on the number of small batches in a second or minute? I've found that I can download small batches under program control using "RCurl::getURL" in R (programming language) using, e.g.: https://chroniclingamerica.loc.gov/search/titles/results/?rows=20&page=2&format=json With this, I can control the batch size with "rows=20" vs. "rows=50" vs., e.g., "rows=1000". A naive search says there are 157520 "results". With "rows=1000", this would require 158 calls. With "rows=20", it would require 7876 calls. Before I start, I need to decide which fields I want; I don't need them all. Thanks, Spencer Graves p.s. I tried appending "&format=csv" and got "Error 504 Ray ID: 7220896da85e86e7 2022-06-27 19:19:53 UTC Gateway time-out". I used: https://chroniclingamerica.loc.gov/search/titles/results/?state=&county=&city=&year1=1690&year2=2022&terms=&frequency=&language=&ethnicity=&labor=&material_type=&lccn=&rows=20&format=csv I can get what I want using json so do not need csv. However, I thought you might want to know that I was unable to get csv to work. ------------------------------------------------------------------------ Newspapers and Current Periodicals Reference Librarian Jun 27 2022, 10:54am via System Hello Spencer, Thank you for contacting the Library of Congress about searching the US Newspaper Directory. 
I wanted to follow up with you regarding your request to output the data in a machine readable format. It looks like you were provided the link to the API documentation for the website: About the Site and API <https://chroniclingamerica.loc.gov/about/api/>. Scroll down to the section with the heading, Searching the directory and newspaper pages using OpenSearch. This section describes the search functionality and structure for the US Newspaper Directory in more detail. It is possible to return your directory searches in json format by appending &format=json to the end of the url. It is also possible to return search results in csv format by appending &format=csv to the end of the url, but I would strongly suggest that you do this in small batches by putting limits on your search so that the system is not overwhelmed. So, from the search page for the US Newspaper Directory <https://chroniclingamerica.loc.gov/search/titles/> you could potentially limit your search based on state and city, or date range, and/or even frequency. Then once you've completed the search, you can add &format=csv to the end of the url to automatically download a csv of those records. The resulting csv will contain several fields/headers: lccn, title, place of publication, start year, end year, publisher, edition, frequency, subject, state, city, country, language, oclc number, and holding type. I think these fields include the information you were looking for. But, again, I would like to stress that you put limits on your search before creating the csv so as not to overwhelm the system. Please let me know if you have any other additional questions. Best wishes, Kerry Huller Newspaper & Current Periodical Reading Room Serial & Government Publications Division Library of Congress ------------------------------------------------------------------------ Newspapers and Current Periodicals Reference Librarian Jun 23 2022, 01:55pm via System Mr. 
Graves, I'm going to transfer your request to a member of our digital collections team who may be of more assistance to you than me. Mike ------------------------------------------------------------------------ Newspapers and Current Periodicals Reference Librarian Jun 23 2022, 01:51pm via Email Dear Mr. Queen: Thanks for the reply. I'm still confused. I downloaded and installed Docker Desktop and "docker-compose.yml" and ran their "Getting Started" Tutorial, but I don't see what to do next. I repeat: I'd like to analyze "U.S. Newspaper Directory, 1690-Present" (https://chroniclingamerica.loc.gov/search/titles/), which ------------------------------------------------------------------------ Newspapers and Current Periodicals Reference Librarian Jun 22 2022, 07:15pm via System Mr. Graves, Programmatic access to the data for Chronicling America <https://chroniclingamerica.loc.gov/> and possibly the U.S. Newspaper Directory <https://chroniclingamerica.loc.gov/search/titles/> can be found on the About the Site and API <https://chroniclingamerica.loc.gov/about/api/> page in various formats. Also, please note that Chronicling America contains newspapers published from 1777-1963, but does not include every U.S. newspaper published in that time period. Please let me know if I can be of further assistance. ------------------------------------------------------------------------ Newspapers and Current Periodicals Reference Librarian Jun 22 2022, 06:14pm via Email Dear Mr. Queen: Can we simplify this to just giving me the data behind "U.S. Newspaper Directory, 1690-Present" (https://chroniclingamerica.loc.gov/search/titles/) in a machine readable format, e.g., csv or xlsx or a MySQL database? As I mentioned in my original email, a naive search of that without restrictions returned 157520 titles in 7876 pages with up to 20 titles per page giving date ranges in at least some cases. 
I could probably write software to scrape those 7876 pages from your web site and combine them into a data file. I have a PhD in statistics, and I have been using the R programming language and similar software for decades. This includes publishing tutorials on how to analyze data like this on Wikiversity.[1] I'd like to do something similar with this. I could help make your data more useful to others and discuss with you how we might prioritize improvements like accessing the other sources you mentioned. Thanks very much for your reply. Sincerely, Spencer Graves, PhD Founder, EffectiveDefense.org 4550 Warwick Blvd 508 Kansas City, MO 64111 m: 408-655-4567 [1] e.g.: https://en.wikiversity.org/wiki/US_Gross_Domestic_Product_(GDP)_per_capita ------------------------------------------------------------------------ Newspapers and Current Periodicals Reference Librarian Jun 22 2022, 05:27pm via System Mr. Graves Your request is a little more complex than it first appears and requires extensive research. A variety of resources should be consulted to determine the circulation statistics of newspapers published prior to 1851. You will need to check newspaper union lists and newspaper histories. Union lists present lists of newspapers in geographic arrangement according to place of publication, and specify which libraries or other institutions hold collections of those newspapers and the dates of their holdings. These can also be useful for tracking title changes throughout a newspaper's history. Newspaper histories like American Journalism: A History: 1690-1960 <https://lccn.loc.gov/62007157> (Mott), The Penny Press <https://lccn.loc.gov/2004043078> (Thompson), and The Press and America <https://lccn.loc.gov/99044295> (Emery et al.) may not include circulation statistics, but they do document the diversity and progress of newspaper publishing, including notable newspapers of the era. 
Newspaper histories also cover the history of the printers and printing of newspapers in a state, county, or region more generally, and provide more condensed histories of the editors, journalists, and evolution of the newspapers in a specific area. Newspaper histories and union lists should be available at most large public or university libraries. More information about union lists, newspaper histories, and researching newspapers in general can be found in the U.S. Newspaper Collections at the Library of Congress <https://guides.loc.gov/united-states-newspapers/introduction> research guide (see Reference Sources). Please let me know if I can be of further assistance. ------------------------------------------------------------------------ Original Question Jun 20 2022, 02:34pm via System How can I get counts of the numbers of newspapers by year in the US, and preferably also elsewhere? A search of "U.S. Newspaper Directory, 1690-Present" (https://chroniclingamerica.loc.gov/search/titles/) returned 157520 titles in 7876 pages with up to 20 titles per page giving date ranges to the extent that it's known. If I can get a data file (e.g., csv or xls), I can summarize. I could also use data on circulation and frequency and especially parent company for multiple newspapers published by the same company, to the extent that such is available. I'm interested in this, because McChesney quoted Tocqueville in suggesting that the US had more newspapers per person (or per million population) prior to 1851 than at any other time or place in history. I'd like to evaluate that claim with data to the extent that I can. See "https://en.wikiversity.org/wiki/Social_construction_of_crime_and_what_we_can_do_about_it#Newspapers_1790_-_present". 
Thanks, Spencer Graves, PhD m: 408-655-4567 ------------------------------------------------------------------------ Thank you for using Newspapers & Current Periodicals Ask a Librarian Service! This email is sent from Ask a Librarian in relationship to ticket #9625195. Read our privacy policy. <https://springshare.com/privacy.html>
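On the parsing question at the top of this thread, here is a minimal sketch in R rather than a tested solution for all 6666 files. The recordSchema value "info:srw/schema/1/marcxml" and the namespace "http://www.loc.gov/MARC21/slim" on the inner record indicate that each title is a MARCXML record, so the relevant documentation is the MARC 21 bibliographic format (for example, field 245 is the title statement, 310 the current frequency, and positions 07-10 of control field 008 the first date). The sketch assumes the xml2 package; the namespace prefixes "srw" and "marc", the helper marc_field(), and the cut-down inline sample (taken from the first record quoted in this thread) are illustrative choices, not part of the Library of Congress files.

```r
library(xml2)

# Cut-down copy of the first record quoted in this thread, kept inline so the
# sketch runs without network access. For a real file, pass its URL or path
# to read_xml() instead.
xml_txt <- '<?xml version="1.0" encoding="UTF-8"?>
<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
  <version>1.1</version>
  <numberOfRecords>2250</numberOfRecords>
  <records>
    <record>
      <recordSchema>info:srw/schema/1/marcxml</recordSchema>
      <recordPacking>xml</recordPacking>
      <recordData>
        <record xmlns="http://www.loc.gov/MARC21/slim">
          <controlfield tag="001">1030438981</controlfield>
          <controlfield tag="008">180404c20159999aluwr n 0 a0eng </controlfield>
          <datafield ind1="0" ind2="0" tag="245">
            <subfield code="a">Selma sun.</subfield>
          </datafield>
          <datafield ind1=" " ind2="1" tag="264">
            <subfield code="a">Selma, AL :</subfield>
            <subfield code="b">North Shore Press, LLC</subfield>
            <subfield code="c">2016-</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="310">
            <subfield code="a">Weekly</subfield>
          </datafield>
        </record>
      </recordData>
    </record>
  </records>
</searchRetrieveResponse>'

doc <- read_xml(xml_txt)

# Both namespaces are declared as defaults (no prefix) in the file, so XPath
# needs explicit prefixes. "srw" and "marc" are arbitrary labels chosen here.
ns <- c(srw  = "http://www.loc.gov/zing/srw/",
        marc = "http://www.loc.gov/MARC21/slim")

# Total number of records in the state grouping, as reported by the file.
n_total <- xml_integer(
  xml_find_first(doc, "/srw:searchRetrieveResponse/srw:numberOfRecords", ns))

# One node per newspaper title.
recs <- xml_find_all(doc, "//marc:record", ns)

# Hypothetical helper: text of the first subfield `code` of datafield `tag`
# in each record (NA where a record lacks that field).
marc_field <- function(recs, tag, code) {
  xpath <- sprintf("./marc:datafield[@tag='%s']/marc:subfield[@code='%s']",
                   tag, code)
  xml_text(xml_find_first(recs, xpath, ns))
}

f008 <- xml_text(xml_find_first(recs, "./marc:controlfield[@tag='008']", ns))

titles <- data.frame(
  oclc       = xml_text(xml_find_first(recs, "./marc:controlfield[@tag='001']", ns)),
  title      = marc_field(recs, "245", "a"),
  place      = marc_field(recs, "264", "a"),
  frequency  = marc_field(recs, "310", "a"),
  # 008 positions 07-10 (1-based characters 8-11) hold the first date.
  # Partly unknown years such as "198u" become NA.
  start_year = suppressWarnings(as.integer(substr(f008, 8, 11))),
  stringsAsFactors = FALSE
)
print(titles)
```

Applied to a full file, the same code should return one row per title. Looping over the file names and combining the results with rbind() would then give a single table that can be tabulated by start_year.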
XML in general is not meant to be parsed directly into a list, but there are plenty of tools for extracting particular patterns from XML in list-like form. Your example file seemed huge, and I was falling asleep waiting to see if it would load. On a second look, the file itself is not that big; my browser may just be trying to render it as a web page. How about copying and pasting a sample of, say, the first few dozen lines, so we can see what is in it for the purpose of ... If there is a schema, it would be mentioned in an attribute, if you know what you are looking for, and it may be an external file. So decide what you want, such as a list of all titles, and extract it with an XPath query, for example via xml2::xml_find_all() or XML::xpathSApply(). -----Original Message----- From: R-help <r-help-bounces at r-project.org> On Behalf Of Spencer Graves Sent: Wednesday, July 27, 2022 4:51 PM To: 'R-help' <r-help at r-project.org> Subject: [R] Parsing XML? Hello, All: What would you suggest I do to parse the following XML file into a list that I can understand: XMLfile <- "https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/ndnp_Alabama_all-yrs_e_0001_0050.xml" This is the first of 6666 XML files containing "U.S. Newspaper Directory" maintained by the US Library of Congress discussed in the thread below. I've tried various things using the XML and xml2. XMLdata <- xml2::read_xml(XMLfile) str(XMLdata) XMLdat <- XML::xmlParse(XMLdata) str(XMLdat) XMLtxt <- xml2::xml_text(XMLdata) nchar(XMLtxt) #[1] 29415 Someplace there's a schema for this. I don't know if it's embedded in this XML file or in a separate file. If it's in a separate file, how could I describe it to my contacts with the Library of Congress so they would understand what I needed and could help me get it. Thanks, Spencer Graves p.s. All 29415 characters in XMLtext appear in the thread below. -------- Forwarded Message -------- Subject: [Newspapers and Current Periodicals] How can I get counts of the numbers of newspapers by year in the US, and preferably also elsewhere? A search of "U.S. 
Newspaper Directory, Date: Wed, 27 Jul 2022 14:59:03 +0000 From: Kerry Huller <serials at ask.loc.gov> To: Spencer Graves <spencer.graves at effectivedefense.org> CC: twes at loc.gov --# Type your reply above this line #-- ------------------------------------------------------------------------ Newspapers and Current Periodicals Reference Librarian Jul 27 2022, 10:59am via System Hello Spencer, So, when I view the xml, I'm actually looking at it in XML editor software, so I can view the tags and it's structured neatly. I've copied and pasted the text from the beginning of the file and the first newspaper title below from my XML editor:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type='text/xsl' href='/webservices/catalog/xsl/searchRetrieveResponse.xsl'?>
<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/" xmlns:oclcterms="http://purl.org/oclc/terms/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:diag="http://www.loc.gov/zing/srw/diagnostic/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <version>1.1</version>
  <numberOfRecords>2250</numberOfRecords>
  <records>
    <record>
      <recordSchema>info:srw/schema/1/marcxml</recordSchema>
      <recordPacking>xml</recordPacking>
      <recordData>
        <record xmlns="http://www.loc.gov/MARC21/slim">
          <leader>00000nas a22000007i 4500</leader>
          <controlfield tag="001">1030438981</controlfield>
          <controlfield tag="008">180404c20159999aluwr n 0 a0eng </controlfield>
          <datafield ind1=" " ind2=" " tag="010">
            <subfield code="a"> 2018200464</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="040">
            <subfield code="a">DLC</subfield>
            <subfield code="e">rda</subfield>
            <subfield code="c">DLC</subfield>
            <subfield code="b">eng</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="012">
            <subfield code="m">1</subfield>
          </datafield>
          <datafield ind1="0" ind2=" " tag="022">
            <subfield code="a">2577-5316</subfield>
            <subfield code="2">1</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="032">
            <subfield code="a">021110</subfield>
            <subfield code="b">USPS</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="037">
            <subfield code="b">711 Alabama Avenue, Selma, AL 36701</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="042">
            <subfield code="a">nsdp</subfield>
            <subfield code="a">pcc</subfield>
          </datafield>
          <datafield ind1="1" ind2="0" tag="050">
            <subfield code="a">ISSN RECORD</subfield>
          </datafield>
          <datafield ind1="1" ind2="0" tag="082">
            <subfield code="a">071</subfield>
            <subfield code="2">15</subfield>
          </datafield>
          <datafield ind1=" " ind2="0" tag="222">
            <subfield code="a">Selma sun</subfield>
          </datafield>
          <datafield ind1="0" ind2="0" tag="245">
            <subfield code="a">Selma sun.</subfield>
          </datafield>
          <datafield ind1=" " ind2="1" tag="264">
            <subfield code="a">Selma, AL :</subfield>
            <subfield code="b">North Shore Press, LLC</subfield>
            <subfield code="c">2016-</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="310">
            <subfield code="a">Weekly</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="336">
            <subfield code="a">text</subfield>
            <subfield code="b">txt</subfield>
            <subfield code="2">rdacontent</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="337">
            <subfield code="a">unmediated</subfield>
            <subfield code="b">n</subfield>
            <subfield code="2">rdamedia</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="338">
            <subfield code="a">volume</subfield>
            <subfield code="b">nc</subfield>
            <subfield code="2">rdacarrier</subfield>
          </datafield>
          <datafield ind1="1" ind2=" " tag="362">
            <subfield code="a">Began in 2015.</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="588">
            <subfield code="a">Description based on: Volume 2, Issue 40 (October 5, 2017) (surrogate); title from caption.</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="588">
            <subfield code="a">Latest issue consulted: Volume 2, Issue 40 (October 5, 2017).</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="752">
            <subfield code="a">United States</subfield>
            <subfield code="b">Alabama</subfield>
            <subfield code="c">Dallas</subfield>
            <subfield code="d">Selma.</subfield>
          </datafield>
        </record>
      </recordData>
    </record>

When I view the records in the XML editor, these 2 lines below do begin each of the records for each individual title, but of course this is including the xml tags: <recordSchema>info:srw/schema/1/marcxml</recordSchema> <recordPacking>xml</recordPacking> Hopefully this helps you decide where to break or parse each record. On another note, I just noticed as well that at the top of this first file it lists the total number of records for the Alabama grouping - 2250. This also appeared to be the case for the Alaska records when I took a look at the first one for that state. I imagine that should be consistent throughout each "grouping" of records. Let me know if you have follow-up questions! Best wishes, Kerry Huller Newspaper & Current Periodical Reading Room Serial & Government Publications Division Library of Congress ------------------------------------------------------------------------ Newspapers and Current Periodicals Reference Librarian Jul 27 2022, 10:21am via Email Hi, Kerry: Thanks. I understand the chunking in files of at most 50. I've read the first file "ndnp_Alabama_all-yrs_e_0001_0050.xml" into a string of 29415 characters, copied below. Might you have any suggestions on the next step in parsing this? Staring at it now, it looks like splitting on "info:srw/schema/1/marcxmlxml" might convert the 29415 characters into shorter chunks, each of which could then be parsed further. This is not as bad as reading ancient Egyptian hieroglyphics without the Rosetta Stone, but I wondered if you might have something that could make this work easier and more reliable? 
I guess I could compare with what I already read as JSON ;-) Thanks, Spencer Graves "1.12250info:srw/schema/1/marcxmlxml00000nas a22000007i 45001030438981180404c20159999aluwr n 0 a0eng 2018200464DLCrdaDLCeng12577-53161021110USPS711 Alabama Avenue, Selma, AL 36701nsdppccISSN RECORD07115Selma sunSelma sun.Selma, AL :North Shore Press, LLC2016-WeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan in 2015.Description based on: Volume 2, Issue 40 (October 5, 2017) (surrogate); title from caption.Latest issue consulted: Volume 2, Issue 40 (October 5, 2017).United StatesAlabamaDallasSelma.info:srw/schema/1/marcxmlxml00000cas a22000007a 4500502150053100127c20109999aluwr n 0 a0eng 2010200019DLCengDLCDLCOCLCQ112153-18111750USPSB & C Publishing, LLC, 3514 Martin St. S. Ste 104, Cropwell, AL 35054pccnsdpISSN RECORDSt. Clair County news (Cropwell, Ala.)St. Clair County news(Cropwell, Ala.)St. Clair County news.Cropwell, AL :B & C Pub.WeeklyBegan in 2010.Description based on: Nov. 4, 2010 (surrogate); title from caption.info:srw/schema/1/marcxmlxml00000cas a22000007a 4500426491872090720c20099999alumr n 0 a0eng 2009203372DLCengDLCOCLCQ12150-346X2150-346X1AU at 000044489617NZ116076352Devon Applewhite/Applewhite Publishing Co., 1910 Honeysuckle Rd., #N183, Dothan, AL 36305mscnsdpISSN RECORD30514Triangle tribune(Dothan, Ala.)Triangle tribune.Dothan, AL :Applewhite Pub. CoMonthlyBegan with vol. 1, issue 1 (May 2009).\"Connecting the Tri-State African -American Community.\"Description based on: Vol. 1, issue 1 (May 2009); title from masthead.Applewhite, Devon.United StatesAlabama.United StatesGeorgia.United StatesFlorida.info:srw/schema/1/marcxmlxml00000cas a22000007a 4500289017315081219c20089999aluwr n | a0eng c 2008213218NSDengNSDOCLCQDLCOCLCQ111945-93191945-93191005270USPSSpringhill Publications, LLC, P.O. 
Box 186, Greenville, AL 36037nsdppccISSN RECORD07014Greenville standardThe Greenville standard.Greenville, AL :Springhill PublicationsWeeklytexttxtrdacontentunmediatednrdamediaBegan with vol. 1, issue 1 (Sept. 3, 2008)Description based on surrogate of: Vol. 1, no. 15 (Dec. 18, 2008); title from masthead (publisher's Web site, viewed Dec. 19, 2008).Latest issue consulted: Vol. 1, no. 99 (July 27, 2011) (surrogate).info:srw/schema/1/marcxmlxml00000cas a22000007a 4500123539969070426c20079999aluwr ne 0 a0eng c 2007212138NSDengNSDNSDOCLCQ101936-95571936-95571The Western Tribune, 1530 Third Ave. N., Bessemer, AL 35020mscnsdpISSN RECORDWestern tribune (Bessemer, Ala.)The Western tribune(Bessemer, Ala.)The Western tribune.Bessemer, Ala. :D-Med, Inc.v.WeeklyBegan in 2007.Description based on: May 23, 2007 (surrogate); title from caption.AU at 000041575341info:srw/schema/1/marcxmlxml00000cas a22000007a 4500226300653080425c20079999aluwr ne | a0eng 2008212112NSDengNSDNSDOCLCQ11942-20751942-20751nsdppccISSN RECORDThe corridor messengerThe corridor messenger.Carbon Hill, AL :Corridor Messenger, Inc.WeeklyBegan with vol. 1, issue (10.03.2007).Description based on: 1st issue.United StatesAlabamaWalkerCarbon Hill.http://www.corridormessenger.cominfo:srw/schema/1/marcxmlxml00000cas a22000007a 450077560432070109c20069999aluwr ne 0 a0eng c 2007213400NSDengNSDOCLCQAUBRNOCLCOOCLCFa01935-37901935-37901AU at 000041190283The Auburn Villager, P.O. Box 1633, Auburn, AL 36831-1633pccnsdpISSN RECORDThe Auburn villagerThe Auburn villager.Auburn, AL :Auburn Villagerv.WeeklyBegan in 2006.Description based on: Vol. 1, no. 
4 (July 20, 2006) (surrogate); title from caption.Auburn (Ala.)Newspapers.Lee County (Ala.)Newspapers.AlabamaAuburn.fast(OCoLC)fst01209634AlabamaLee County.fast(OCoLC)fst01211930Newspapers.fast(OCoLC)fst01423814United StatesAlabamaLeeAuburn.info:srw/schema/1/marcxmlxml00000cas a2200000Ii 4500872286785m o d s cr mn|---a||||140311c20069999alucr n o b s0 a0eng cABCengrdaABCABCOCLCFLD59.13University of Alabama at Birmingham.The eReporter.[Birmingham, Alabama] :The University of Alabama at Birmingham,[2006]-[Birmingham, Alabama] :Offices of Public Relations & Marketing and Information Technology1 online resource2 issues weeklytexttxtrdacontentcomputercrdamediaonline resourcecrrdacarrierSeptember 19, 2006-\"The eReporter is an official communication of The University of Alabama at Birmingham, companion to the UAB Reporter and recommended alternative to mass e-mails.\"Issues for <March 11, 2014- published and distributed via e-mail subscription on Tuesdays and Fridays.Description based on: September 19, 2006; title from title screen (viewed March 12, 2014).University of Alabama at BirminghamPeriodicals.Periodicals.fast(OCoLC)fst01411641University of Alabama at Birmingham.fast(OCoLC)fst00645114University of Alabama at Birmingham.Office of Public Relations and Marketing.University of Alabama at Birmingham.Information Technology.2006-2012, companion to:University of Alabama at Birmingham.UAB reporter.(OCoLC)32435748Archived issueshttp://hatteras.dpo.uab.edu/cgi-bin/ereporter.cgiinfo:srw/schema/1/marcxmlxml00000cas a22000007a 4500166387050070829c20059999aluwr ne | a0eng c 2007215501NSDengNSDOCLCQ11939-68991939-68991The Wilkie Clark Memorial Foundation, P.O. Box 514, Roanoke, AL 36274$30.00nsdpmscISSN RECORD305.89614People's voice (Roanoke, Ala.)The people's voice(Roanoke, Ala.)The people's voice.Roanoke, AL :Wilkie Clark Memorial Foundationv.WeeklyBegan with vol. 1, no. 1 in 2005.Description based on: Vol. 2, no. 20 (Apr. 
20, 2007); title from caption.Wilkie Clark Memorial Foundation.United StatesAlabamaRandolphRoanoke.AU at 000042141390info:srw/schema/1/marcxmlxml00000nas a22000007i 45001124677787191021c20uu9999aluwr ne | a0eng 2019202521DLCengrdaDLC12689-3258122730USPSNorth Jackson Press, 42950 Hwy 72, Suite 406, Stevenson, AL 35772nsdppccISSN RECORD071.323North Jackson pressNorth Jackson press.Stevenson, AL :Caney Creek Publications LLCWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierDescription based on surrogate of: Volume 1, number 36 (October 11, 2019); title from masthead.Latest issue consulted: Volume 1, number 36 (October 11, 2019) (Surrogate).United StatesAlabamaJacksonStevensoninfo:srw/schema/1/marcxmlxml00000cas a2200000 a 4500226315099080428d19981998aluwr ne | 0eng c 2008233691GUAengGUAOCLCQOCLCFOCLCO39911644pccn-us-gaThe Dekalb news.Birmingham, Ala. :Community newspaper holdings Inc.v.WeeklyBegan with 1st year, no. 1 (Apr. 1, 1998); ceased with 1st year, no. 31 (Oct. 28, 1998).Final issue consulted.Description based on first issue; title from caption.Decatur (Ga.)Newspapers.DeKalb County (Ga.)Newspapers.Newspapers.fast(OCoLC)fst01423814GeorgiaDecatur.fast(OCoLC)fst01226234GeorgiaDeKalb County.fast(OCoLC)fst01215288United StatesGeorgiaDeKalbDecatur.Decatur-DeKalb news/era(DLC)sn 89053661(OCoLC)19946163info:srw/schema/1/marcxmlxml00000cas a2200000 i 450050263311m o d cr cn|||||||||020730c19979999alu x neo 0 a0eng c 2015238492AMHengrdapnAMHOCLCQOCLCFOCLCOIULOCLHTMOCLCQCOODLC66460694810970435082687-93791AU at 000050711528OCLCS45109pccnsdpn-us---AP2.B5707023Birmingham weekly (Online)Birmingham weekly(Online)Birmingham weekly.Birmingham, AL :Birmingham Weekly1 online resourceIrregular,Feb. 16-28, 2012-Weekly,Sept. 4-11, 1997-Feb. 9-16, 2012texttxtrdacontentcomputercrdamediaonline resourcecrrdacarrierBegan with vol. 1, issue 1 (Sept. 4-11, 1997).\"City news, views & entertainment\"--Cover.Numbering dropped in Mar. 
2012.Also issued in print.Description based on: Publication information from ProQuest; title from web page (viewed June 18, 2015).Latest issue consulted: Aug. 15-20, 2012.Birmingham (Ala.)Newspapers.Internet resources.Electronic journals.AlabamaBirmingham.fast(OCoLC)fst01204958Newspapers.fast(OCoLC)fst01423814United StatesAlabamaBirmingham.Print version:Birmingham Weekly(OCoLC)39271050http://apw.softlineweb.com/http://WC2VB5MT8E.search.serialssolutions.com/?sid=sersol&SS_jc=JC_000051895&title=Birmingham+Weeklyinfo:srw/schema/1/marcxmlxml00000cas a22000007a 450031471314941116d19941995aluwr ne 0 a0eng csn 94003083 NSDengNSDANEOCLCQOCLCFOCLCOOCLCQ11079-65411079-65411nsdppccn-us-akSoutheast shopperSoutheast shopper.Juneau, Alaska :Kemper Communications,1994-volumesWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 1, no. 1 (Nov. 16, 1994)-Ceased in Feb. 1995.Juneau (Alaska)Newspapers.AlaskaJuneau.fast(OCoLC)fst01213587Newspapers.fast(OCoLC)fst01423814United StatesAlaskaJuneau.AU at 000011356572info:srw/schema/1/marcxmlxml00000cas a22000008a 450027910515930413c19949999alumr n 0 a0eng dsn 93002581 NSDengNSDOCLCQ11069-06621Birmingham Tribune, 216 Ave. T. Pratt City, Birmingham, AL 35214nsdpBirmingham tribuneBirmingham tribune.Birmingham, Ala. :Kervin Fondren9501volumesMonthlytexttxtrdacontentunmediatednrdamediavolumencrdacarrierPREPUB: publication expected Jan. 1995AU at 000025863987info:srw/schema/1/marcxmlxml00000cas a22000007a 450026199931920716d19922013alumr ne 0 a0eng csn 92003357 NSDengNSDOCLOCLCQDLC011064-01341064-01341Black & White, POB 13215, Birmingham, AL 35202-3215nsdppccBlack & white (Birmingham, Ala.)Black & white(Birmingham, Ala.)Black & white.Black and whiteBirmingham, Ala. :Black & White, Inc.v.Biweekly,Oct. 2, 1997-Monthly,May 1, 1992-Sept. 1997Began in May 1992; ceased with Jan. 10, 2013.\"Birmingham's New City paper.\"Description based on: June 1992.Latest issue consulted: No. 67 (Oct. 
16, 1997) (surrogate).info:srw/schema/1/marcxmlxml00000cas a2200000 a 450032145723950314d19901999alumr ne 0 a0eng csn 95068755 MGNengMGNNSDCLUOCLCQOCLCFOCLCOOCLCA971211082-34841082-34841AU at 000011579542nsdppccn-us-alF335.J5S68The Southern shofarThe Southern shofar.Birmingham, AL :L. Brook,-[1999]v. :ill. ;35 cm.MonthlyBegan in 1990.-v. 9, issue 9 (Aug./Sept. 1999).\"The monthly newspaper of Alabama's Jewish community.\"Some issues also available on the Internet via the World Wide Web.Description based on: Vol. 3, issue 11 (Oct. 1993).Jewish newspapersAlabama.Jewish newspapers.fast(OCoLC)fst00982872Alabama.fast(OCoLC)fst01204694United StatesAlabamaJeffersonBirmingham.Deep South Jewish voice(DLC)sn 99018499(OCoLC)42431704CLUhttp://bibpurl.oclc.org/web/719http://www.bham.net/shofar/info:srw/schema/1/marcxmlxml00000cas a22000007a 450021265141900326c19909999aluwr ne 0 a0eng csn 90099004 AARengAARCPNNSDOCLCQ11050-08981050-08981005022USPSE.O.N., Inc., Main St., Eclectic, AL 36024pccnsdpISSN RECORDThe Eclectic observerThe Eclectic observer.Eclectic, Ala. :E.O.N., Inc.,1990-v.WeeklyVol. 1, no. 1 (Feb. 22, 1990)-Published by: Price Publications, Inc., <2006->Latest issue consulted: Vol. 17, no. 1 (Jan. 5, 2006).United StatesAlabamaElmoreEclectic.AU at 000040212446info:srw/schema/1/marcxmlxml00000cas a22000007a 450021214781900314c19909999aluir ne 0 a0eng csn 90002457 AAAengAAANSDOCLCQ111050-20841050-20841931180USPSClanton Newspapers, 1109 Seventh St., N., PO Box 1379, Clanton, AL 35045nsdppccn-us-alThe Clanton advertiserThe Clanton advertiser.AdvertiserClanton, Ala. :Clanton Newspapersv. :ill. ;58 cm.Three no. a week,<May 13, 1992->Semiweekly,<Apr. 4, 1990->Began in Jan. 1990.Description based on: Vol. 19, no. 27 (Wed., Apr. 4, 1990).Latest issue consulted: Vol. 22, no. 
58 (May 13, 1992).United StatesAlabamaChiltonClanton.Independent advertiser (Clanton, Ala.)(OCoLC)21214732AU at 000025908452info:srw/schema/1/marcxmlxml00000cas a2200000 a 450021214814900314c19909999aluwr ne 0 a0eng dsn 90099009 AAAengAAACPNNSDOCLCQ11056-32881056-32881505740USPSThe Blount Countian, 3rd St. at Washington Ave., PO Box 310, Oneonta, AL 35121mscnsdpn-us-alThe Blount countianThe Blount countian.Oneonta, Ala. :Southern Democrat, Inc.,1990-v. :ill.WeeklyVol. 1, no. 1 (Jan. 3, 1990)-Editor: Molly Howard Ryan, 1990-Latest issue consulted: Vol. 1, no. 36 (Sept. 5, 1990).Ryan, Molly Howard.United StatesAlabamaBlountOneonta.Southern Democrat(DLC)sn 85044741(OCoLC)12038577AU at 000025884049info:srw/schema/1/marcxmlxml00000cas a22000007a 450022413044900920c19909999aluwr ne 0 a0eng dsn 90099011 AARengAARCPNNSDNSTOCLCQ92081707011191053-91231053-91231314240USPSmscnsdpThe Clay times-journalThe Clay times-journal.Lineville, Ala. :C.L. Proctor,1990-v.WeeklyVol. 1, no. 1 (Sept. 6, 1990)-United StatesAlabamaClayLineville.Ashland progress(DLC)sn 85044701Lineville tribune(DLC)sn 85044702AUinfo:srw/schema/1/marcxmlxml00000cas a22000007a 450021265218900326c19909999aluwr ne 0 0eng dsn 90099005 AARengAARCPNOCLCQmscTrussville news-journal.Trussville, Ala. :Mike Mitchell,1990-v.BimonthlyVol. 1, no. 1 (Feb. 20, 1990)-United StatesAlabamaJeffersonTrussville.info:srw/schema/1/marcxmlxml00000cas a22000007a 450022301035900831c19909999aluwr ne 0 0eng dsn 90099010 AARengAARCPNOCLCQmscWeaver tribune.Oxford, Ala. :Cheaha Pub.,1990-v.WeeklyVol. 1, no. 1 (July 19, 1990)-United StatesAlabamaCalhounWeaver.United StatesAlabamaCalhounOxford.info:srw/schema/1/marcxmlxml00000cas a22000007a 450015155895870205c19879999aludr ne 0 a0eng csn 87050045 AAAengAAACPNNSDDLCCPNNSDDLCCPNDLCOCLDLCOCLCQOCLCFOCLCQ19261126829944596670892-44570892-44571AU at 000020456714360980USPSThe Advertiser, P.O. Box 1000, Montgomery, AL 36192pccnsdpn-us-alNewspaperMontgomery advertiser (Montgomery, Ala. 
: 1987)The Montgomery advertiser(1987)The Montgomery advertiser.Montgomery advertiser & the Alabama journalSunday Montgomery advertiserMontgomery, Ala. :Advertiser Co.,1987-volumes :illustrationsDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrier160th year, no. 1 (Jan. 2, 1987)-On Saturdays, Sundays and holidays a combined edition is published with the Alabama journal, and called: Montgomery advertiser and the Alabama journal, Jan. 3, 1987, and: Alabama journal and Montgomery advertiser, Jan. 4, 1987-Feb. 25, 1990.Issues for Sunday called: Sunday Montgomery advertiser, Mar. 4, 1990-Issues for Saturday, Sunday and holidays have their own numbering, Jan. 3, 1987-Feb. 25, 1990.Montgomery (Ala.)Newspapers.AlabamaMontgomery.fast(OCoLC)fst01202689Newspapers.fast(OCoLC)fst01423814United StatesAlabamaMontgomeryMontgomery.Advertiser (Montgomery, Ala.)0745-3221(DLC)sn 82008412(OCoLC)9049482Alabama journal (Montgomery, Ala. : 1940)0745-323X(DLC)sn 87062018(OCoLC)2666111info:srw/schema/1/marcxmlxml00000cas a2200000 a 450016942287871105c19879999aludn ne 0 a0eng dsn 88050149 AAAengAAACPNNSDOCLCQy1044-00701044-0070746--32780746-32781565580USPSTroy Publications, Inc., 113 North Market St., Troy, AL 36081mscnsdpMessenger (Troy, Ala.)The Messenger(Troy, Ala.)The Messenger.Troy, Ala. :Troy Pub.,1987-v.Daily (Sunday, Tuesday, Thursday and Friday)Vol. 121, no. 166 (July 1, 1987)-Sunday, Apr. 2, 1989 misprinted as v. 113.Latest issue consulted: Vol. 113 [sic 123], no. 96 (Sunday, Apr. 2, 1989).United StatesAlabamaPikeTroy.Troy messenger0746-3278(DLC)sn 83009935(OCoLC)9921908info:srw/schema/1/marcxmlxml00000cas a22000007a 450017799786880415c19879999aluir ne 0 a0eng dsn 88050086 AARengAARCPNNSDOCLCQ1p1044-03801044-03800745-75961441520USPSThe Prattville Progress, 152 W. 3rd St., Prattville, AL 36067mscnsdpPrattville progress (Prattville, Ala. : 1987)The Prattville progress(Prattville, Ala.)The Prattville progress.Prattville, Ala. :James C. Seymour,1987-v.Three times a weekVol. 
102, no. 8 (Jan. 20, 1987)-Latest issue consulted: Vol. 105, no. 153 (Wednesday, Dec. 26, 1990).United StatesAlabamaAutaugaPrattville.Progress (Prattville, Ala.)0745-7596(DLC)sn 83007623(OCoLC)9428489info:srw/schema/1/marcxmlxml00000cas a22000007a 450015344667870319c19869999aluwr ne 0 a0eng dsn 87000284 NSDengNSDCPNOCLCQy0893-07670893-07671431800USPSPickens County Herald, P.O. Drawer E, Carrollton, AL 35447nsdpPickens County heraldPickens County herald.Pickens County herald and west AlabamianCarrollton, Ala. :Pickens Newspapers, Inc.,1986-WeeklyVol. 138, no. 40 (Oct. 2, 1986)-United StatesAlabamaPickensCarrollton.Pickens County herald and west Alabamian0746-0473(DLC)sn 83008141AU at 000040635809info:srw/schema/1/marcxmlxml00000cas a22000007a 450018917586881217c19869999aluwr ne 0 0eng dsn 88050225 CPNengCPNOCLCQmscThe Oxford sun/times.Oxford, Ala. :[s.n.],1986-v.WeeklyVol. 1, no. 1 (Jan. 16, 1986)-Editor: Andy Goggans.Numbering is irregular.United StatesAlabamaCalhounOxford.Oxford sun (Oxford, Ala.)(DLC)sn 85045023AU at 000025803813info:srw/schema/1/marcxmlxml00000cas a22000007a 450013991168860731c19869999aluwr ne 0 0eng dsn 86050322 CPNengCPNOCLCQmscIndependent (Brewton, Ala.)The Independent.Brewton, Ala. :Jim Thornton,1986-v. :ill. ;58 cm.WeeklyVol. 1, no. 1 (June 19, 1986)-United StatesAlabamaEscambiaBrewton.info:srw/schema/1/marcxmlxml00000cas a22000007a 450018957493881231c19859999aluwr ne 0 0eng dsn 88050247 CPNengCPNOCLCQmscPiedmont journal-independent (Piedmont, Ala.)The Piedmont journal-independent.Journal independentPiedmont, Ala. :Lane Weatherbee,1985-v.WeeklyVol. 4, no. 52 (Dec. 24, 1985)-Sometimes published as: Journal independent.United StatesAlabamaCalhounPiedmont.Journal-independent(DLC)sn 85045014info:srw/schema/1/marcxmlxml00000cas a22000007a 450012715821851024d19841985aluwr ne 0 a0eng dsn 85045014 CPNengCPNNSDCPNOCLCQmscThe Journal-independent.Piedmont, Ala. 
:Journal-Independent, Inc.,1984-1985.volumes :illustrations ;58 cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 3, no. 27 (July 3, 1984)- v. 4, no. 51 (Dec. 18, 1985).Carries the same vol. numbering as the Piedmont journal-independent.United StatesAlabamaCalhounPiedmont.Piedmont journal-independent0890-6017(DLC)sn 85045013Piedmont journal-independent (Piedmont, Ala.)(DLC)sn 88050247info:srw/schema/1/marcxmlxml00000cas a22000007a 450012691448851018c19839999aludr ne 0 0eng dsn 85045007 CPNengCPNOCLCQmscTimesDaily.Times dailyFlorence, Ala. :T.S.P. Newspapers, Inc.,1983-volumes :illustrations ;58 cmDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 114, no. 226 (Aug. 14, 1983)-United StatesAlabamaLauderdaleFlorence.Florence times + tri-cities daily(DLC)sn 85044995info:srw/schema/1/marcxmlxml00000cas a22000007a 45009428489830420d19831987aluir ne 0 a0eng dsn 83007623 NSDengNSDCPNNSDNSTOCLCQ89090d0745-75960745-75961The Progress, 152 W. 3rd St., Prattville, AL 36067nsdpmscProgress (Prattville, Ala.)The Progress(Prattville, Ala.)The Progress.Prattville, Ala. :The Prattville Progress,1983-1987.volumes :illustrations ;58 cmThree times a weektexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 98, no. 32 (Mar. 17, 1983)-v. 102, no. 7 (Jan. 17, 1987).United StatesAlabamaAutaugaPrattville.Prattville progress(DLC)sn 85044740Prattville progress (Prattville, Ala.)1044-0380(DLC)sn 88050086(OCoLC)12254317AAPinfo:srw/schema/1/marcxmlxml00000cas a2200000 a 45009867255830831c19839999aludr ne 0 a0eng dsn 84008052 AAAengAAANSDOCLOCLCQX0743-15110743-15111617760USPST.S.P. Newspapers, Inc., 219 W. Tennessee St., Florence, AL 35630nsdpTimesDaily (Shoals edition)TimesDaily(Shoals ed.)TimesDaily.Times dailyShoals ed.Florence, Ala. :T.S.P. Newspapersvolumes :illustrationsDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan with: Vol. 114, no. 226 (Aug. 14, 1983).\"Florence/Sheffield/Tuscumbia/Muscle Shoals.\"Shoals ed. and Regional ed. 
combined on Sundays.Description based on: Vol. 114, no. 346 (Monday, Dec. 12, 1983).United StatesAlabamaLauderdaleFlorence.TimesDaily (Regional edition)0743-152XTimes Tri-cities dailyUnknownDec. 12, 1983info:srw/schema/1/marcxmlxml00000cas a22000007a 450010536023840319c19839999aludr ne 0 a0eng dsn 84008051 NSDengNSDOCLCQ1x0743-152X0743-152X1617760USPST.S.P. Newspapers, Inc., 219 W. Tennessee St., Florence, AL 35630nsdpTimesDaily (Regional edition)TimesDaily(Regional ed.)TimesDaily.Times dailyRegional ed.Florence, Ala. :T.S.P. NewspapersDailytexttxtrdacontentunmediatednrdamediaBegan with: Vol. 114, no. 226 (Aug. 14, 1983).Shoals ed. and Regional ed. combined on Sundays.Description based on: Vol. 114, no. 346 (Monday, Dec. 12, 1983).United StatesAlabamaLauderdaleFlorence.TimesDaily (Shoals edition)0743-1511Times Tri-cities dailyDec. 12, 1983AU at 000025818125info:srw/schema/1/marcxmlxml00000cas a22000007a 45009049482821213d19821987aludn ne 0 a0eng csn 82008412 AAAengAAANSDNPWCPNDLCCPNNSDDLCNSDDLCCPNNVFDLCOCLCQCRLOCLCFOCLCQ1d0745-32210745-32211nsdppccn-us-alNewspaperAdvertiser (Montgomery, Ala.)The Advertiser(Montgomery, Ala.)The advertiser.Alabama journal and advertiserMontgomery, Ala. :Advertiser Co.,1982-1987.volumes :illustrationsDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrier155th year, no. 232 (Nov. 22, 1982)- ; -v. 14-3, Jan. 1, 1987.On Saturdays, Sundays and holidays published as: The Alabama journal and advertiser, Nov. 27, 1982-Jan. 1, 1987.Saturday, Sunday and holiday issues have their own numbering.Montgomery (Ala.)Newspapers.AlabamaMontgomery.fast(OCoLC)fst01202689Newspapers.fast(OCoLC)fst01423814United StatesAlabamaMontgomeryMontgomery.Montgomery advertiser (Montgomery, Ala. : Daily)(DLC)sn 84020645(OCoLC)2685433Montgomery advertiser (Montgomery, Ala. 
: 1987)0892-4457(DLC)sn 87050045(OCoLC)15155895AU at 000020281746info:srw/schema/1/marcxmlxml00000cas a2200000 a 45009237931830218c19829999aluwr ne 0 0eng dsn 86050139 AAAengAAACPNOCLOCLCQmscThe Randolph leader.Roanoke, Ala. :David S. Stevenson,1982-volumes :illustrations ;58 cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 91, no. 1 (Oct. 6, 1982)-United StatesAlabamaRandolphRoanoke.Roanoke leader(DLC)sn 86050137Randolph press(DLC)sn 86050138info:srw/schema/1/marcxmlxml00000cas a22000007a 450012715815851024d19821984aluwr ne 0 a0eng dsn 85045013 CPNengCPNNSDCPNOCLCQ110890-60170890-60171432080USPSThe Piedmont Journal-Independent, 115 N. Center Ave., Piedmont, AL 36272mscnsdpThe Piedmont journal-independentThe Piedmont journal-independent.Piedmont, Ala. :Piedmont Journal-Independent, Inc.,1982-1984.volumes :illustrations ;58 cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 1, no. 1 (Mar. 31, 1982)-v. 3, no. 26 (June 27, 1984).Latest issue consulted: Vol. 5, no. 31 (August 20, 1986).United StatesAlabamaCalhounPiedmont.Piedmont journal(DLC)sn 85045012Journal-independent(DLC)sn 85045014(OCoLC)12715821AU at 000045312916info:srw/schema/1/marcxmlxml00000cas a22000007a 45009183905830202c19829999aluwr n 0 a0eng dsn 85044580 AAAengAAACPNNSDOCLOCLCQ11098-58671098-58671016409USPSNo. 4, Rucker Plaza, Enterprise, AL 36331P.O. Box 1536, Enterprise, AL 36331mscnsdpSoutheast sun (Enterprise, Ala.)The southeast sun(Enterprise, Ala.)The Southeast sun.Enterprise, Ala. :QST Publicationsvolumes :illustrations ;58 cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan in 1982.Description based on: Vol. 1, no. 25 (Oct. 21, 1982).Latest issue consulted: Vol. 16, no. 43 (Mar. 
4, 1998).United StatesAlabamaCoffeeEnterprise.AU at 000025827687info:srw/schema/1/marcxmlxml00000cas a22000007a 450010487314840305c19819999aluwr ne 0 a0eng dsn 85044906 AAAengAAACPNNSDNSTCPNOCLOCLCQOCLCFOCLCOOCLCAOCLCQ900410885-16620885-16621749310USPSThe New Times, 1618 1/2 St. Stephens Rd., Mobile, AL 36603mscnsdpn-us-alNew times (Mobile, Ala.)The New times(Mobile, Ala.)The new times.Mobile, Ala. :New Times Groupvolumes :illustrationsWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan in 1981.Vol. 3, no. 49 (Dec. 15-21, 1983) and vol. 3, no. 50 (Dec. 22-28, 1983) are both called vol. 3, no. 49 (Dec. 15-21, 1983).Description based on: Vol. 2, no. 3 (Jan. 28-Feb. 3, 1982).African AmericansAlabamaNewspapers.African Americans.fast(OCoLC)fst00799558Alabama.fast(OCoLC)fst01204694Newspapers.fast(OCoLC)fst01423814United StatesAlabamaMobileMobile.AAPUnknownAug. 15, 1985AU at 000024686659info:srw/schema/1/marcxmlxml00000cas a22000007a 450018922463881219d19811983alucr ne 0 0eng dsn 88050233 AARengAARCPNNSDOCLCQmscThe Sylacauga daily advance.Advance/Sylacauga dailySylacauga advanceSunday advanceAdvanceSylacauga, Ala. :Mrs. W.A. Moody,1981-1893.v.Semiweekly,<Nov. 24, 1982-Feb. 13, 1983>Daily (except Mon., Tues. & Sat.),<May 26, 1982-Nov. 21, 1982>Daily (except Sat. & Mon.),<Jan. 1, 1981-May 23, 1982>74th Year, no. 123 (Jan. 1, 1981)-76th year, no. 83 (Feb. 13, 1983).Days of publication vary.Published as: The Advance/Sylacauga daily, <Aug. 28, 1981-May 23, 1982>.Published as: Sylacauga advance, <Nov. 24, 1982-Feb. 13, 1983>.On Sunday, published as: Sunday advance.United StatesAlabamaTalladegaSylacauga.Childersburg star(DLC)sn 88050232Coosa press(DLC)sn 86050293Daily home1059-6461(DLC)sn 88050234info:srw/schema/1/marcxmlxml00000cas a22000007a 450021026715cr un|||||||||900209c19809999aluwr ne 0 0eng dsn 90099002 AARengAARCPNCUSOCLOCLCQTJCOCLCQOCLCFOCLCOOCLCA926143844AU at 000020585756mscn-us-alSpeakin' out news.Speaking out newsDecatur, Ala. 
:Minority Network, Inc.v.WeeklyBegan in 1980.Published in Huntsville, Ala., <1987>-Also issued by subscription via the World Wide Web.Description based on: Vol. 7, no. 8 (Jan. 7-13, 1987).African AmericansAlabamaNewspapers.African American newspapersAlabama.AlabamaNewspapers.Newspapers.fast(OCoLC)fst01423814African American newspapers.fast(OCoLC)fst00799278African Americans.fast(OCoLC)fst00799558Alabama.fast(OCoLC)fst01204694United StatesAlabamaMorganDecatur.United StatesAlabamaMadisonHuntsville.Speakin' out weekly news(DLC)sn 88050097http://www.softlineweb.com/softlineweb/ethnic.htminfo:srw/schema/1/marcxmlxml00000cas a22000007a 450014996511861219c19809999aluwr ne 0 a0eng csn 86050472 AARengAARCPNNSDOCLCQ11080-15021080-15021328110USPSnsdppccWest-Alabama gazetteWest-Alabama gazette.GazetteMillport, Ala. :Millport Pub. Co.,1980-volumesWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrier4th year, no. 32 (Jan. 3, 1980)-United StatesAlabamaLamarMillport.Gazette (Millport, Ala.)(DLC)sn 86050471info:srw/schema/1/marcxmlxml00000cas a2200000 a 450011828156850320c19809999aluwr ne 0 0eng dsn 86050314 AAAengAAACPNOCLOCLCQmscThe Hartford news-herald.Hartford, Ala. :Geneva Publications,1980-volumes :illustrations ;57-59 cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 80, no. 20 (Feb. 14, 1980)-United StatesAlabamaGenevaHartford.News-herald (Hartford, Ala.)(DLC)sn 86050313info:srw/schema/1/marcxmlxml00000cas a22000007a 450017857788880427d198u198ualusr ne 0 0eng dsn 88050097 AARengAARCPNOCLOCLCQOCLCFOCLCOOCLCAmscn-us-alSpeakin' out weekly news.Decatur, Ala. :Smothers PublicationsPublished every first and third Wed. of each monthDescription based on: Vol. 3, no. 
13 (May 4-17, 1983).African AmericansAlabamaNewspapers.Newspapers.fast(OCoLC)fst01423814African Americans.fast(OCoLC)fst00799558Alabama.fast(OCoLC)fst01204694United StatesAlabamaMorganDecatur.Weekly news (Huntsville, Ala.)(DLC)sn 87050012Speakin' out news(DLC)sn 90099002info:srw/schema/1/marcxmlxml00000cas a2200000 a 450017807936880418c198u9999aluwr ne 0 a0eng dsn 90099001 AAAengAAACPNOCLOCLCQThe Daleville Sun-Courier, 310 Daleville Ave., Daleville, AL 36322mscn-us-alDaleville sun-courier.Daleville, Ala. :QST Publicationsv. :ill. ;58 cm.WeeklyDescription based on: Vol. 2, no. 28 (Wed., Feb. 17, 1988).United StatesAlabamaDaleDaleville.AU at 000020585749info:srw/schema/1/marcxmlxml00000cas a22000007a 450015580838870423c198u9999aluwr ne 0 0eng dsn 87050128 AARengAARCPNOCLCQmscGreene County independent.Eutaw, Ala. :Greene County Independent, Inc.v.WeeklyDescription based on: Vol. 2, no. 10 (Mar. 12, 1987).United StatesAlabamaGreeneEutaw.info:srw/schema/1/marcxmlxml00000cas a22000007a 450010125135831114d198u198ualucr ne 0 a0eng dsn 83003221 NSDengNSDOCLCQ0d0746-55210746-55211Auburn Bulletin & Lee County Eagle, PO Box 2111, Auburn, Ala. 36830nsdpThe Auburn bulletin & the Lee County eagleThe Auburn bulletin & the Lee County eagle.Lee County eagleAuburn bulletin and the Lee County eagleAuburn, Ala. :[publisher not identified]Semiweekly,<Sept. 5, 1984->WeeklytexttxtrdacontentunmediatednrdamediaDescription based on: Oct. 19, 1983.United StatesAlabamaLeeAuburn.Auburn bulletin(DLC)sn 89050006Eagle (Auburn, Ala.)(OCoLC)18435663Sept. 5, 1984info:srw/schema/1/marcxmlxml00000cas a22000007a 450018370324880818c198u9999aluwr ne 0 0eng dsn 88050147 CPNengCPNOCLCQmscTri-city times (Geraldine, Ala.)The Tri-City times.Geraldine, Ala. :Wanda Nelsonv.WeeklyDescription based on: Vol. 2, no. 24 (Jan. 
6, 1982).United StatesAlabamaDeKalbGeraldine.info:srw/schema/1/marcxmlxml00000cas a22000007a 450010199338831208c198u9999aluwr ne 0 a0eng dsn 83005367 NSDengNSDCPNOCLCQ10746-62770746-62771707590USPSSpringville Pub. Co., 539 Main St., Springville, AL 35146nsdpThe St. Clair clarionThe St. Clair clarion.Saint Clair clarionSpringville, AL :Gary L. ShultsWeeklytexttxtrdacontentunmediatednrdamediaDescription based on: Vol. 2, no. 1 (Jan. 5, 1982).United StatesAlabamaSt. ClairSpringville.AU at 000025783743info:srw/schema/1/marcxmlxml00000cas a22000007a 450013787251860627c198u9999aluwr ne 0 a0eng dsn 86001923 NSDengNSDCPNOCLCQ10889-00800889-00801The Westerner Star, P.O. Box 2060, Bessemer, AL 35021nsdpWestern star (Bessemer, Ala.)The Western star(Bessemer, Ala.)The western star.Bessemer, Ala. :Hal HodgensWeeklytexttxtrdacontentunmediatednrdamediaDescription based on: Vol. 3, no. 15 (Wednesday, June 11, 1986).United StatesAlabamaJeffersonBessemer.Bessemer advertiser(DLC)sn 87050117AU at 000025805174511.1srw.pc any \"y\" and srw.mt any \"newspaper\" and srw.cp exact \"Alabama\"50info:srw/schema/1/marcxmlxml1Date,,0mq1lME887FoIbjulKUV6bx9ImwWQNCv9GqZzGS92IKS31lEbcpRJBNHgcE1l29tFaHP9CHe0Yexk1uWQofffull" ------------------------------------------------------------------------ Newspapers and Current Periodicals Reference Librarian Jul 27 2022, 09:22am via System Hello Spencer, Thank you for reaching out about the bulk xml files for the US Newspaper Directory. We don't have documentation specific to these bulk xml files, but upon further inspection I can say that each of those files don't necessarily contain info for 50 newspaper titles. The structure of the titles for California and New York for instance are different from say, Alabama. If you look at California for example, the file naming structure indicates the year the title started, and then the number of titles included in that xml file. 
So for instance, the files below include info for newspapers that started in 2000, 2001, and 2002 respectively. And there is info for 30 titles in the xml file from 2000, and 14 in the file for 2001, and so on. * ndnp_California_2000_e_0001_0030.xml * ndnp_California_2001_e_0001_0014.xml * ndnp_California_2002_e_0001_0012.xml If there's more than 50 titles for a given year, say for California starting in 1880, then the next 50 titles will roll into the next xml file, and so on. And the last xml file for that year may not include 50 titles. Many of the states seem to group all the years together, so each xml file contains 50 titles, until possibly the last one for a given state, which may contain less. I hope this information helps explain the total number of records and structure a bit better. Let me know if you have any further questions. Best wishes, Kerry Huller Newspaper & Current Periodical Reading Room Serial & Government Publications Division Library of Congress ------------------------------------------------------------------------ Newspapers and Current Periodicals Reference Librarian Jul 25 2022, 02:22pm via Email Hi, Kerry: Might there be documentation on the XML files you mentioned? I've successfully read 'https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/', extracted the names of 6666 XML files, and read the first one, "ndnp_Alabama_all-yrs_e_0001_0050.xml". It contains 29415 characters, beginning, "1.12250info:srw/schema/1/marcxmlxml00000nas a22000007i 45001030438981180404c20159999aluwr n 0 a0eng ". With a bit more effort, I will likely be able to parse all 6666 of these. The names suggest that each contains information on 50 newspapers, totaling 333,300. The main page "https://chroniclingamerica.loc.gov/search/titles/" says there are only 157,521 "Titles currently listed". 
This suggests that these XML files include placeholders for a little more than double the number of entries currently in "https://chroniclingamerica.loc.gov/search/titles/". Thanks for this. Progress.
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian Jul 07 2022, 08:55am via System
Hi Spencer, I thought of one more option after I emailed you yesterday that I wanted to make you aware of. I had explained the other day how we pull the records from OCLC into our U.S. Newspaper Directory. You can also access all of the raw MARC records found in the directory in xml format from here if you choose: https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/ These will provide you all of the data from the record fields in MARC format, so you'd get all the data you see here for example: https://chroniclingamerica.loc.gov/lccn/sn98059792/marc/ but in xml. I don't know if this might be more data and info than you want to work with, but wanted to make sure you were aware of this option as well. Best wishes, Kerry Huller Newspaper & Current Periodical Reading Room Serial & Government Publications Division Library of Congress
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian Jul 06 2022, 10:55am via System
Hi Spencer, Thanks for reaching out again. I have been looking at the json view a bit closer this morning and your example of "9999." After talking with a colleague this morning and looking at various examples, I see there is some variation in how the titles with either an unknown starting/ending date or currently published titles are being handled - depending on the view. As an example, I completed a search in the directory for Alaska and the city of Anchorage.
There are 80 results, and on the first page of results you'll see # 4. Fort Richardson news, which was published from 1952-19??. The csv view of this state/city search result will show the ending date of 19??. But if I append &format=json to this search result, this specific title will show an ending date of 1999. After talking with a colleague this morning, I discovered an integer had to be used in these cases where dates were "?" so that the search based on year range would work. Similarly, if you look at # 12 Alaska digest, which was published 1994-current, the "current" becomes "9999" in the json view. So, the records you are seeing with "9999" would most likely be titles with an ending date of "current." However, there is an issue with the unknown dates, like "1999" being used for "19??" in the example above. The "9" does not get inserted in place of "?" when you are looking at the title/LCCN view of a specific newspaper. So for instance, if you view the #4 title: Fort Richardson news at this url: https://chroniclingamerica.loc.gov/lccn/sn98059792/ but append .json to the end of the url, after the LCCN, like this: https://chroniclingamerica.loc.gov/lccn/sn98059792.json you'll see that the end_year is "19??." Viewing the title/LCCN json view for titles that are currently published will also show the end_year as "current." The Alaska digest example from above can be viewed here: https://chroniclingamerica.loc.gov/lccn/sn97060056.json I wasn't aware of the difference between the directory search json view and the title/LCCN view. But I think it would be possible to grab the data from the title/LCCN json url through an additional script potentially. The json url is included in the view under the "url" field.
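A minimal base-R sketch of how the end_year values described above ("current", "19??", or a plain four-digit year) might be sorted out once they have been fetched from the title/LCCN json view. The function name classify_end_year and the sample vector are hypothetical, made up for illustration; only the three value patterns come from the message above.

```r
# Hypothetical helper: classify end_year strings as reported by the
# title/LCCN json view ("current", "19??", or a four-digit year).
classify_end_year <- function(x) {
  ifelse(x == "current", "current",
         ifelse(grepl("?", x, fixed = TRUE), "uncertain",
                ifelse(grepl("^[0-9]{4}$", x), "known", "other")))
}

# Illustrative values only, not downloaded data.
end_years <- c("1751", "19??", "current")
status <- classify_end_year(end_years)

# For uncertain years, bracket the possible range by replacing "?"
# with 0 (earliest) and 9 (latest).
uncertain <- end_years[status == "uncertain"]
earliest <- as.integer(gsub("?", "0", uncertain, fixed = TRUE))
latest   <- as.integer(gsub("?", "9", uncertain, fixed = TRUE))
```

Keeping the "uncertain" and "current" cases separate from known years avoids treating the search view's substituted 1999 or 9999 as real dates.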
Of course, there are unknowns with publishing dates, but better to know where the question marks are, and what titles are considered to be current. I hope this clarifies the data a bit more - let me know if any of it needs more clarification though. And let me know if you have follow-up questions. Thank you, Kerry Huller Newspaper & Current Periodical Reading Room Serial & Government Publications Division Library of Congress ------------------------------------------------------------------------ Newspapers and Current Periodicals Reference Librarian Jul 05 2022, 04:42pm via Email Hi, Kerry: What would you suggest I do to get a count of the numbers of newspapers and publishers operating by year from, say, 1790 to 2021? I just determined that 20630 (13 percent) of the 157520 records in the US Newspaper database I downloaded a week ago have end_year = 9999. I don't think it's feasible to assume that all or even most of those are still publishing. Might there be some other database that might have this kind of information? I ask, because Robert McChesney (2004) The Problem of the Media (Monthly Review Pr., esp. pp. 34-35) suggests that in the first half of the nineteenth century, the US had more newspapers and newspaper publishers per capita than any other place or time. He suggests that that diversity of newspapers helped encourage literacy and limit political corruption, both of which helped propel the young US to its current dominance of the international political economy. I'm hoping to get some data to evaluate this claim. Sadly, it looks like there is too much missing and questionable data in this dataset for me to use this without a fairly substantive data cleaning effort. ------------------------------------------------------------------------ Newspapers and Current Periodicals Reference Librarian Jul 05 2022, 09:05am via System Hello Spencer, Thank you for reaching out about your additional questions. 
I was looking at the records you mention above, and yes, you are correct - those 9 records with the date inconsistencies and the one record for the The New Mexican mining news <https://chroniclingamerica.loc.gov/lccn/sn93061507/> containing "Santa Fe.\" have typos in them. Thanks for spotting these - it may be possible to have the cataloger in our division correct those typos. I will look into this further. The U.S. Newspaper Directory doesn't have a connection with Wikimedia or Wikipedia. The Library of Congress periodically pulls the records for the Directory from OCLC Worldcat <https://www.oclc.org/en/worldcat.html>. And those newspaper records in OCLC Worldcat have been created by catalogers at various institutions around the U.S. over the span of several years. So, occasionally, you will find a typo in the records. Corrections can be made by OCLC and library staff at the various institutions. Every time we complete a new pull on the OCLC records, any corrected records will then populate our Directory. Regarding your question on the New-York weekly journal - yes, that is also correct that it has two records. There is actually a record for each format of the newspaper, so this record is for the microfilm format <https://chroniclingamerica.loc.gov/lccn/2009252748/> and this one is for the original print format <https://chroniclingamerica.loc.gov/lccn/sn83030211/>. You can see in the heading for the microfilm record where it says [microfilm reel] and the print version shows [volume]. You are likely to see this for other titles as well because each format has been cataloged with its own LCCN. You are also likely to see additional records with [online resource] identified as the format as more and more titles are available as ePrints or online. I hope this helps answer your additional questions a bit more. Please reach out if you have any other questions. 
Thank you, Kerry Huller Newspaper & Current Periodical Reading Room Serial & Government Publications Division Library of Congress ------------------------------------------------------------------------ Newspapers and Current Periodicals Reference Librarian Jul 04 2022, 01:47pm via Email Hi, Kelly: At the risk of bombing your inbox with more emails than you want, what is your relationship with Wikipedia and other Wikimedia Foundation projects like Wikidata? I ask, because I've logged over 20,000 edits in Wikimedia Foundation projects since 2010, and I would happily try to answer questions about Wikidata and other Wikimedia Foundation projects. I have NOT organized an edit-a-thon, but I've made presentations at conferences with people who have, and I would happily try to help organize such if you could find a group of people who want to work to improve this US Newspaper database. I think it would be good to establish links between this US Newspaper database and Wikidata, with appropriate procedures so changes to one could be evaluated for acceptance into the other. FYI, John Peter Zenger's famous "New-York weekly journal" (1733-1751) appears TWICE in your database with lccn = 2009252748 and sn83030211 and ONCE in Wikidata WITHOUT an lccn, even though many other Wikidata items have an lccn. See: https://www.wikidata.org/wiki/Q23091960 There's a "WikiProject Newspapers" on Wikipedia and a companion "WikiProject Periodicals" on Wikidata: https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Newspapers/Wikidata https://www.wikidata.org/wiki/Wikidata:WikiProject_Periodicals I've tried to connect with others on those projects, so far with only limited success. However, you may know that almost anyone can change almost anything on Wikipedia and other Wikimedia Foundation projects. What stays tends to be written from a neutral point of view citing credible sources. They have problems with vandals, but the problems are usually easily controlled. 
This makes Wikipedia and Wikidata very useful platforms for cleaning up databases like your US Newspaper dataset. Spencer Graves ------------------------------------------------------------------------ Newspapers and Current Periodicals Reference Librarian Jul 03 2022, 10:39pm via Email Hello, Kelly: In addition to the invalid JSON, discussed below [NOTE: The "below" contains a slight addition to the report of the I sent last Friday.], I found 9 (NINE!) cases where start_year was AFTER end_year.
These have lccn = "sn86071531" "sn95069213" "sn90059096" "sn86058451" "sn90060926" "sn99065409" "sn89065002" "sn98069857" "sn91059179" See: https://chroniclingamerica.loc.gov/lccn/sn86071531/ https://chroniclingamerica.loc.gov/lccn/sn95069213/ https://chroniclingamerica.loc.gov/lccn/sn90059096/ https://chroniclingamerica.loc.gov/lccn/sn86058451/ https://chroniclingamerica.loc.gov/lccn/sn90060926/ https://chroniclingamerica.loc.gov/lccn/sn99065409/ https://chroniclingamerica.loc.gov/lccn/sn89065002/ https://chroniclingamerica.loc.gov/lccn/sn98069857/ https://chroniclingamerica.loc.gov/lccn/sn91059179/ These all have obvious coding errors that can be easily fixed. The data may not be completely accurate after the fix, but at least they are not obviously wrong ;-) ################## I got invalid JSON from: https://chroniclingamerica.loc.gov/search/titles/results/?rows=500&page=103&format=json After some experimentation, I was able to replicate the problem with a request for rows=10: https://chroniclingamerica.loc.gov/search/titles/results/?rows=10&page=5117&format=json Duncan Temple Lang <dtemplelang at ucdavis.edu>, Professor of Statistics and Associate Dean for Graduate Programs at the University of California - Davis, confirmed that it was a JSON error using: https://codebeautify.org/jsonvalidator He is part of the core team developing the R free, open-source programming language. He said, that starting at offsets 161070 and 161502 in the character string you get from [the R code RCurl::getURL()] we have: Santa Fe.\" and these are in an entry such as "city": ["Santa Fe.\"] So the final " is escaped and therefore there is no closing " for the string. The parser continues to consume characters looking for the end of that string. If one "repairs" the text from getURL() with ftxt= gsub('Santa Fe.\\\\"', 'Santa Fe."', txt) then the rest of my code worked fine. You may wish to do something to implement other checks for valid JSON and repair this problem. 
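A small sketch of the repair described above, using a made-up fragment rather than the real download. The string txt is a hypothetical stand-in that reproduces the stray backslash before the closing quote; the validation step assumes the jsonlite package is installed and is skipped otherwise.

```r
# Hypothetical fragment reproducing the reported problem: the closing
# quote of "Santa Fe." is preceded by a backslash, so the string never ends.
txt <- '{"city": ["Santa Fe.\\"], "state": ["New Mexico"]}'

# Same repair as in the message above, written with fixed = TRUE so that
# neither the backslash nor the period is treated as a regular expression.
ftxt <- gsub('Santa Fe.\\"', 'Santa Fe."', txt, fixed = TRUE)

# Optional check, assuming jsonlite is available.
if (requireNamespace("jsonlite", quietly = TRUE)) {
  print(jsonlite::validate(txt))   # validity of the broken text
  print(jsonlite::validate(ftxt))  # validity after the repair
}
```

Validating each downloaded page before parsing it would flag any other page with a similar problem instead of failing partway through the run.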
I've scanned all the 157520 records that were in that database a couple of days ago, and this is the only JSON error identified by the code I used. NOTE: I was NOT able to replicate this error when downloading records one at a time. That suggests a problem NOT in the database itself but in the download algorithm. ??? Thank you for your help. I will almost certainly have other questions ;-) ------------------------------------------------------------------------ Newspapers and Current Periodicals Reference Librarian Jul 01 2022, 11:46am via Email Hello, Kelly: I got invalid JSON from: https://chroniclingamerica.loc.gov/search/titles/results/?rows=500&page=103&format=json After some experimentation, I was able to replicate the problem with a request for rows=10: https://chroniclingamerica.loc.gov/search/titles/results/?rows=10&page=5117&format=json Duncan Temple Lang <dtemplelang at ucdavis.edu>, Professor of Statistics and Associate Dean for Graduate Programs at the University of California - Davis, confirmed that it was a JSON error using: https://codebeautify.org/jsonvalidator He is part of the core team developing the R free, open-source programming language. He said, that starting at offsets 161070 and 161502 in the character string you get from [the R code RCurl::getURL()] we have: Santa Fe.\" and these are in an entry such as "city": ["Santa Fe.\"] So the final " is escaped and therefore there is no closing " for the string. The parser continues to consume characters looking for the end of that string. If one "repairs" the text from getURL() with ftxt= gsub('Santa Fe.\\\\"', 'Santa Fe."', txt) then the rest of my code worked fine. You may wish to do something to implement other checks for valid JSON and repair this problem. I've scanned all the 157520 records that were in that database a couple of days ago, and this is the only JSON error identified by the code I used. Thank you for your help. 
I will almost certainly have other questions ;-)
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian Jun 28 2022, 02:20pm via System
Hello Spencer, Thank you for sending along your follow-up questions. I'm glad to hear the json view will work for you. It was recommended to me that you limit your requests to 500 rows at a time. And a developer here at LC suggests the following regarding rate limiting: "To avoid being blocked by the server, the current rate-limiting rules restrict un-cached requests to URLs starting with https://chroniclingamerica.loc.gov/search/ to 120 requests every 10 minutes from a single IP address." So, I think if you limited each of your requests to 500 rows at a time with the proper pauses, then you should be able to access what you need. As for the csv view, I checked on this as well, and was informed that the csv view was not implemented for all url formats. The csv view was only implemented for this view: https://chroniclingamerica.loc.gov/newspapers/ and urls resulting from US Directory search results, e.g., if you wanted to narrow down your search results by state, city, date range, etc. found at this link: https://chroniclingamerica.loc.gov/search/titles/. So, if you wanted a csv and limited your search by state (for example: https://chroniclingamerica.loc.gov/search/titles/results/?state=Alaska&county=&city=&year1=1690&year2=2022&terms=&frequency=&language=&ethnicity=&labor=&material_type=&lccn=&rows=20&format=csv), you could append &format=csv to the search result url and get the csv to automatically download.
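A rough sketch of what the 500-row and 120-requests-per-10-minutes guidance above works out to for the json view. The total of 157520 titles is the count reported elsewhere in this thread; fetch_all and its fetch argument are hypothetical names, and the default reader (base readLines) is only one possible choice.

```r
# Assumptions: 157520 titles, 500 rows per request, and at most
# 120 requests per 10 minutes from one IP address.
n_titles  <- 157520
rows      <- 500
n_pages   <- ceiling(n_titles / rows)   # number of requests needed
min_pause <- (10 * 60) / 120            # seconds between requests

base_url <- "https://chroniclingamerica.loc.gov/search/titles/results/"
urls <- sprintf("%s?rows=%d&page=%d&format=json",
                base_url, rows, seq_len(n_pages))

# Hypothetical downloader: fetches each URL in turn and pauses between
# requests. 'fetch' can be swapped for RCurl::getURL or similar.
fetch_all <- function(urls, pause = min_pause,
                      fetch = function(u) readLines(u, warn = FALSE)) {
  out <- vector("list", length(urls))
  for (i in seq_along(urls)) {
    out[[i]] <- fetch(urls[i])
    if (i < length(urls)) Sys.sleep(pause)
  }
  out
}

# pages <- fetch_all(urls)   # not run here: this would contact the server
```

A pause somewhat longer than the computed minimum leaves a margin in case retries or other traffic from the same address count against the limit.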
But, if your search results ended up being over a couple thousand titles, then the system would probably time out. I hope this info helps! Let me know if you have any other questions. Best wishes, Kerry Huller Newspaper & Current Periodical Reading Room Serial & Government Publications Division Library of Congress
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian Jun 27 2022, 04:15pm via Email
Hello, Kerry: Thanks for the reply. Can you please give me some further guidance on two things "so that the system is not overwhelmed"? 1. The max size in a small batch? 2. Any limit on the number of small batches in a second or minute? I've found that I can download small batches under program control using "RCurl::getURL" in R (programming language) using, e.g.: https://chroniclingamerica.loc.gov/search/titles/results/?rows=20&page=2&format=json With this, I can control the batch size with "rows=20" vs. "rows=50" vs., e.g., "rows=1000". A naive search says there are 157520 "results". With "rows=1000", this would require 158 calls. With "rows=20", it would require 7876 calls. Before I start, I need to decide which fields I want; I don't need them all. Thanks, Spencer Graves p.s. I tried appending "&format=csv" and got "Error 504 Ray ID: 7220896da85e86e7 • 2022-06-27 19:19:53 UTC Gateway time-out". I used: https://chroniclingamerica.loc.gov/search/titles/results/?state=&county=&city=&year1=1690&year2=2022&terms=&frequency=&language=&ethnicity=&labor=&material_type=&lccn=&rows=20&format=csv I can get what I want using json so do not need csv. However, I thought you might want to know that I was unable to get csv to work.
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian Jun 27 2022, 10:54am via System
Hello Spencer, Thank you for contacting the Library of Congress about searching the US Newspaper Directory.
I wanted to follow up with you regarding your request to output the data in a machine readable format. It looks like you were provided the link to the API documentation for the website: About the Site and API <https://chroniclingamerica.loc.gov/about/api/>. Scroll down to the section with the heading, Searching the directory and newspaper pages using OpenSearch. This section describes the search functionality and structure for the US Newspaper Directory in more detail. It is possible to return your directory searches in json format by appending &format=json to the end of the url. It is also possible to return search results in csv format by appending &format=csv to the end of the url, but I would strongly suggest that you do this in small batches by putting limits on your search so that the system is not overwhelmed. So, from the search page for the US Newspaper Directory <https://chroniclingamerica.loc.gov/search/titles/> you could potentially limit your search based on state and city, or date range, and/or even frequency. Then once you've completed the search, you can add &format=csv to the end of the url to automatically download a csv of those records. The resulting csv will contain several fields/headers: lccn, title, place of publication, start year, end year, publisher, edition, frequency, subject, state, city, country, language, oclc number, and holding type. I think these fields include the information you were looking for. But, again, I would like to stress that you put limits on your search before creating the csv so as not to overwhelm the system. Please let me know if you have any other additional questions. Best wishes, Kerry Huller Newspaper & Current Periodical Reading Room Serial & Government Publications Division Library of Congress
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian Jun 23 2022, 01:55pm via System
Mr.
Graves, I'm going to transfer your request to a member of our digital collections team who may be of more assistance to you than me. Mike
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian Jun 23 2022, 01:51pm via Email
Dear Mr. Queen: Thanks for the reply. I'm still confused. I downloaded and installed Docker Desktop and "docker-compose.yml" and ran their "Getting Started" Tutorial, but I don't see what to do next. I repeat: I'd like to analyze "U.S. Newspaper Directory, 1690-Present" (https://chroniclingamerica.loc.gov/search/titles/), which
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian Jun 22 2022, 07:15pm via System
Mr. Graves, Programmatic access to the data for Chronicling America <https://chroniclingamerica.loc.gov/> and possibly the U.S. Newspaper Directory <https://chroniclingamerica.loc.gov/search/titles/> can be found on the About the Site and API <https://chroniclingamerica.loc.gov/about/api/> page in various formats. Also, please note that Chronicling America contains newspapers published from 1777-1963, but does not include every U.S. newspaper published in that time period. Please let me know if I can be of further assistance.
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian Jun 22 2022, 06:14pm via Email
Dear Mr. Queen: Can we simplify this to just giving me the data behind "U.S. Newspaper Directory, 1690-Present" (https://chroniclingamerica.loc.gov/search/titles/) in a machine readable format, e.g., csv or xlsx or a MySQL database? As I mentioned in my original email, a naive search of that without restrictions returned 157520 titles in 7876 pages with up to 20 titles per page giving date ranges in at least some cases.
I could probably write software to scrape those 7876 pages from your web site and combine them into a data file. I have a PhD in statistics, I have been using the R programming language and similar software for decades. This includes publishing tutorials on how to analyze data like this on Wikiversity.[1] I'd like to do something similar with this. I could help make your data more useful to others and discuss with you how we might prioritize improvements like accessing the other sources you mentioned. Thanks very much for your reply. Sincerely, Spencer Graves, PhD Founder, EffectiveDefense.org 4550 Warwick Blvd 508 Kansas City, MO 64111 m: 408-655-4567 [1] e.g.: https://en.wikiversity.org/wiki/US_Gross_Domestic_Product_(GDP)_per_capita
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian Jun 22 2022, 05:27pm via System
Mr. Graves Your request is a little more complex than it first appears and requires extensive research. A variety of resources should be consulted to determine the circulation statistics of newspapers published prior to 1851. You will need to check newspaper union lists and newspaper histories. Union lists present lists of newspapers in geographic arrangement according to place of publication, and specify which libraries or other institutions hold collections of those newspapers and the dates of their holdings. These can also be useful for tracking title changes throughout a newspaper's history. Newspaper histories like American Journalism: A History: 1690-1960 <https://lccn.loc.gov/62007157> (Mott), The Penny Press <https://lccn.loc.gov/2004043078> (Thompson), and The Press and America <https://lccn.loc.gov/99044295> (Emery et al.) may not include circulation statistics, but they do document the diversity and progress of newspaper publishing, including notable newspapers of the era.
Newspaper histories also cover the history of the printers and printing of newspapers in a state, county, or region more generally, and provide more condensed histories of the editors, journalists, and evolution of the newspapers in a specific area.

Newspaper histories and union lists should be available at most large public or university libraries. More information about union lists, newspaper histories, and researching newspapers in general can be found in the U.S. Newspaper Collections at the Library of Congress <https://guides.loc.gov/united-states-newspapers/introduction> research guide (see Reference Sources).

Please let me know if I can be of further assistance.

------------------------------------------------------------------------
Original Question
Jun 20 2022, 02:34pm via System

How can I get counts of the numbers of newspapers by year in the US, and preferably also elsewhere? A search of "U.S. Newspaper Directory, 1690-Present" (https://chroniclingamerica.loc.gov/search/titles/) returned 157520 titles in 7876 pages with up to 20 titles per page giving date ranges to the extent that they are known. If I can get a data file (e.g., csv or xls), I can summarize. I could also use data on circulation and frequency and especially parent company for multiple newspapers published by the same company, to the extent that such is available.

I'm interested in this because McChesney quoted Tocqueville in suggesting that the US had more newspapers per person (or per million population) prior to 1851 than at any other time or place in history. I'd like to evaluate that claim with data to the extent that I can. See "https://en.wikiversity.org/wiki/Social_construction_of_crime_and_what_we_can_do_about_it#Newspapers_1790_-_present".
Thanks,
Spencer Graves, PhD
m: 408-655-4567

------------------------------------------------------------------------
Thank you for using Newspapers & Current Periodicals Ask a Librarian Service! This email is sent from Ask a Librarian in relationship to ticket #9625195. Read our privacy policy. <https://springshare.com/privacy.html>

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
On Wed, 27 Jul 2022 15:50:55 -0500 Spencer Graves <spencer.graves at effectivedefense.org> wrote:

> What would you suggest I do to parse the following XML file into a
> list that I can understand:
>
> XMLfile <-
> "https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/ndnp_Alabama_all-yrs_e_0001_0050.xml"

> XMLdat <- XML::xmlParse(XMLdata)
> str(XMLdat)

Isn't XMLdat already a tree-like list? For example, XMLdat[[1]][[1]][[3]][[1]] is the first <record> tag in the file, which you can further pick apart.

What information do you need from this file and how would you like to access it? Parsing XML files is typically achieved with XPath expressions (e.g. 'under every <record> tag, extract the <datafield> tags containing attribute tag="042"' would look like 'record/datafield[@tag="042"]') and/or handlers on specific tags, not by extracting all text nodes and performing string operations on them.

--
Best regards,
Ivan
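To make the XPath suggestion concrete, here is a minimal sketch in R using xml2. It runs on a small inline stand-in for the real file, cut down from the record the librarian pasted, and it assumes the real files follow the same layout. The namespace prefix "marc" and the helper first_subfield() are names made up for this example. The field meanings (245 $a as title, 310 $a as current frequency, and characters 8-11 and 12-15 of controlfield 008 as the two dates) are taken from the MARC 21 bibliographic format, not from anything stated in this thread, so please check them against the data before relying on them.

```r
library(xml2)

# Inline stand-in for one file (assumption: the real files share this
# searchRetrieveResponse / MARCXML layout).
xml_txt <- '<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
  <records>
    <record>
      <recordData>
        <record xmlns="http://www.loc.gov/MARC21/slim">
          <controlfield tag="008">180404c20159999aluwr n      0   a0eng  </controlfield>
          <datafield ind1="0" ind2="0" tag="245">
            <subfield code="a">Selma sun.</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="310">
            <subfield code="a">Weekly</subfield>
          </datafield>
        </record>
      </recordData>
    </record>
  </records>
</searchRetrieveResponse>'

doc <- read_xml(xml_txt)

# The MARC records sit in an unprefixed (default) namespace, so XPath needs
# an explicit prefix for it; "marc" is simply a name chosen here.
ns <- c(marc = "http://www.loc.gov/MARC21/slim")

# One node per newspaper title.
recs <- xml_find_all(doc, ".//marc:record", ns)

# Hypothetical helper: text of the first subfield `code` in datafield `tag`,
# one value per record (NA where the field is absent).
first_subfield <- function(recs, tag, code) {
  xp <- sprintf("./marc:datafield[@tag='%s']/marc:subfield[@code='%s']",
                tag, code)
  xml_text(xml_find_first(recs, xp, ns))
}

# Fixed-width control field 008; characters 8-11 and 12-15 hold the dates.
f008 <- xml_text(xml_find_first(recs, "./marc:controlfield[@tag='008']", ns))

titles <- data.frame(
  title      = first_subfield(recs, "245", "a"),
  frequency  = first_subfield(recs, "310", "a"),
  start_year = substr(f008, 8, 11),
  end_year   = substr(f008, 12, 15),
  stringsAsFactors = FALSE
)
print(titles)
```

On the real data, replacing read_xml(xml_txt) with read_xml(XMLfile) should give one row per title, and table(titles$start_year) would be a first step towards counts by year. Note that the date fields in this file are not always plain years: the text dump shows values such as "20uu" (partly unknown) and "9999" (apparently still being published), so they need cleaning before being converted to numbers.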
What do you mean by "a list that I can understand"?

A quick tally of the number of XML elements by identifier:

     1 echoedSearchRetrieveRequest
     1 frbrGrouping
     1 maximumRecords
     1 nextRecordPosition
     1 numberOfRecords
     1 query
     1 records
     1 resultSetIdleTime
     1 searchRetrieveResponse
     1 servicelevel
     1 sortKeys
     1 startRecord
     1 wskey
     2 version
    50 leader
    50 recordData
    51 recordPacking
    51 recordSchema
   100 record
   105 controlfield
   923 datafield
  1900 subfield

What of this information do you actually want? The elements of the list should be what?

On Thu, 28 Jul 2022 at 08:52, Spencer Graves <spencer.graves at effectivedefense.org> wrote:

> Hello, All:
>
> What would you suggest I do to parse the following XML file into a
> list that I can understand:
>
> XMLfile <-
> "https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/ndnp_Alabama_all-yrs_e_0001_0050.xml"
>
> This is the first of 6666 XML files containing "U.S. Newspaper
> Directory" maintained by the US Library of Congress discussed in the
> thread below. I've tried various things using the XML and xml2.
>
> XMLdata <- xml2::read_xml(XMLfile)
> str(XMLdata)
> XMLdat <- XML::xmlParse(XMLdata)
> str(XMLdat)
> XMLtxt <- xml2::xml_text(XMLdata)
> nchar(XMLtxt)
> #[1] 29415
>
> Someplace there's a schema for this. I don't know if it's embedded
> in this XML file or in a separate file. If it's in a separate file, how
> could I describe it to my contacts with the Library of Congress so they
> would understand what I needed and could help me get it.
>
> Thanks,
> Spencer Graves
>
> p.s. All 29415 characters in XMLtext appear in the thread below.
>
> -------- Forwarded Message --------
> Subject: [Newspapers and Current Periodicals] How can I get counts of
> the numbers of newspapers by year in the US, and preferably also
> elsewhere? A search of "U.S.
Newspaper Directory, > Date: Wed, 27 Jul 2022 14:59:03 +0000 > From: Kerry Huller <serials at ask.loc.gov> > To: Spencer Graves <spencer.graves at effectivedefense.org> > CC: twes at loc.gov > > > > --# Type your reply above this line #-- > > ------------------------------------------------------------------------ > > Newspapers and Current Periodicals Reference Librarian > > Jul 27 2022, 10:59am via System > > Hello Spencer, > > So, when I view the xml, I'm actually looking at it in XML editor > software, so I can view the tags and it's structured neatly. I've copied > and pasted the text from the beginning of the file and the first > newspaper title below from my XML editor: > > <?xml version="1.0" encoding="UTF-8" standalone="no"?> > <?xml-stylesheet type='text/xsl' > href='/webservices/catalog/xsl/searchRetrieveResponse.xsl'?> > > <searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/" > xmlns:oclcterms="http://purl.org/oclc/terms/" > xmlns:dc="http://purl.org/dc/elements/1.1/" > xmlns:diag="http://www.loc.gov/zing/srw/diagnostic/" > xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> > <version>1.1</version> > <numberOfRecords>2250</numberOfRecords> > <records> > <record> > <recordSchema>info:srw/schema/1/marcxml</recordSchema> > <recordPacking>xml</recordPacking> > <recordData> > <record xmlns="http://www.loc.gov/MARC21/slim"> > <leader>00000nas a22000007i 4500</leader> > <controlfield tag="001">1030438981</controlfield> > <controlfield tag="008">180404c20159999aluwr n 0 a0eng > </controlfield> > <datafield ind1=" " ind2=" " tag="010"> > <subfield code="a"> 2018200464</subfield> > </datafield> > <datafield ind1=" " ind2=" " tag="040"> > <subfield code="a">DLC</subfield> > <subfield code="e">rda</subfield> > <subfield code="c">DLC</subfield> > <subfield code="b">eng</subfield> > </datafield> > <datafield ind1=" " ind2=" " tag="012"> > <subfield code="m">1</subfield> > </datafield> > <datafield ind1="0" ind2=" " tag="022"> > <subfield 
code="a">2577-5316</subfield> > <subfield code="2">1</subfield> > </datafield> > <datafield ind1=" " ind2=" " tag="032"> > <subfield code="a">021110</subfield> > <subfield code="b">USPS</subfield> > </datafield> > <datafield ind1=" " ind2=" " tag="037"> > <subfield code="b">711 Alabama Avenue, Selma, AL 36701</subfield> > </datafield> > <datafield ind1=" " ind2=" " tag="042"> > <subfield code="a">nsdp</subfield> > <subfield code="a">pcc</subfield> > </datafield> > <datafield ind1="1" ind2="0" tag="050"> > <subfield code="a">ISSN RECORD</subfield> > </datafield> > <datafield ind1="1" ind2="0" tag="082"> > <subfield code="a">071</subfield> > <subfield code="2">15</subfield> > </datafield> > <datafield ind1=" " ind2="0" tag="222"> > <subfield code="a">Selma sun</subfield> > </datafield> > <datafield ind1="0" ind2="0" tag="245"> > <subfield code="a">Selma sun.</subfield> > </datafield> > <datafield ind1=" " ind2="1" tag="264"> > <subfield code="a">Selma, AL :</subfield> > <subfield code="b">North Shore Press, LLC</subfield> > <subfield code="c">2016-</subfield> > </datafield> > <datafield ind1=" " ind2=" " tag="310"> > <subfield code="a">Weekly</subfield> > </datafield> > <datafield ind1=" " ind2=" " tag="336"> > <subfield code="a">text</subfield> > <subfield code="b">txt</subfield> > <subfield code="2">rdacontent</subfield> > </datafield> > <datafield ind1=" " ind2=" " tag="337"> > <subfield code="a">unmediated</subfield> > <subfield code="b">n</subfield> > <subfield code="2">rdamedia</subfield> > </datafield> > <datafield ind1=" " ind2=" " tag="338"> > <subfield code="a">volume</subfield> > <subfield code="b">nc</subfield> > <subfield code="2">rdacarrier</subfield> > </datafield> > <datafield ind1="1" ind2=" " tag="362"> > <subfield code="a">Began in 2015.</subfield> > </datafield> > <datafield ind1=" " ind2=" " tag="588"> > <subfield code="a">Description based on: Volume 2, Issue 40 > (October 5, 2017) (surrogate); title from caption.</subfield> > </datafield> > 
<datafield ind1=" " ind2=" " tag="588"> > <subfield code="a">Latest issue consulted: Volume 2, Issue 40 > (October 5, 2017).</subfield> > </datafield> > <datafield ind1=" " ind2=" " tag="752"> > <subfield code="a">United States</subfield> > <subfield code="b">Alabama</subfield> > <subfield code="c">Dallas</subfield> > <subfield code="d">Selma.</subfield> > </datafield> > </record> > </recordData> > </record> > > When I view the records in the XML editor, these 2 lines below do begin > each of the records for each individual title, but of course this is > including the xml tags: > > <recordSchema>info:srw/schema/1/marcxml</recordSchema> > <recordPacking>xml</recordPacking> > > Hopefully this helps you decide where to break or parse each record. > > On another note, I just noticed as well that at the top of this first > file it lists the total number of records for the Alabama grouping - > 2250. This also appeared to be the case for the Alaska records when I > took a look at the first one for that state. I imagine that should be > consistent throughout each "grouping" of records. > > Let me know if you have follow-up questions! > > Best wishes, > > Kerry Huller > Newspaper & Current Periodical Reading Room > Serial & Government Publications Division > Library of Congress > > ------------------------------------------------------------------------ > > Newspapers and Current Periodicals Reference Librarian > > Jul 27 2022, 10:21am via Email > > Hi, Kerry: > > > Thanks. I understand the chunking in files of at most 50. I've read > the first file "ndnp_Alabama_all-yrs_e_0001_0050.xml" into a string of > 29415 characters, copied below. Might you have any suggestions on the > next step in parsing this? Staring at it now, it looks splitting on > "info:srw/schema/1/marcxmlxml" might convert the 29415 characters into > shorter chunks, each of which could then be parsed further. 
> > > This is not as bad as reading ancient Egyptian heiroglyphics without > the Rosetta Stone, but I wondered if you might have something that could > make this work easier and more reliable? I guess I could compare with > what I already read as JSON ;-) > > > Thanks, > Spencer Graves > > > "1.12250info:srw/schema/1/marcxmlxml00000nas a22000007i > 45001030438981180404c20159999aluwr n 0 a0eng > 2018200464DLCrdaDLCeng12577-53161021110USPS711 Alabama Avenue, Selma, AL > 36701nsdppccISSN RECORD07115Selma sunSelma sun.Selma, AL :North Shore > Press, > LLC2016-WeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan > in > 2015.Description based on: Volume 2, Issue 40 (October 5, 2017) > (surrogate); title from caption.Latest issue consulted: Volume 2, Issue > 40 (October 5, 2017).United > StatesAlabamaDallasSelma.info:srw/schema/1/marcxmlxml00000cas a22000007a > 4500502150053100127c20109999aluwr n 0 a0eng > 2010200019DLCengDLCDLCOCLCQ112153-18111750USPSB & C Publishing, LLC, > 3514 Martin St. S. Ste 104, Cropwell, AL 35054pccnsdpISSN RECORDSt. > Clair County news (Cropwell, Ala.)St. Clair County news(Cropwell, > Ala.)St. Clair County news.Cropwell, AL :B & C Pub.WeeklyBegan in > 2010.Description based on: Nov. 4, 2010 (surrogate); title from > caption.info:srw/schema/1/marcxmlxml00000cas a22000007a > 4500426491872090720c20099999alumr n 0 a0eng > 2009203372DLCengDLCOCLCQ12150-346X2150-346X1AU at 000044489617NZ116076352Devon > Applewhite/Applewhite Publishing Co., 1910 Honeysuckle Rd., #N183, > Dothan, AL 36305mscnsdpISSN RECORD30514Triangle tribune(Dothan, > Ala.)Triangle tribune.Dothan, AL :Applewhite Pub. CoMonthlyBegan with > vol. 1, issue 1 (May 2009).\"Connecting the Tri-State African -American > Community.\"Description based on: Vol. 
1, issue 1 (May 2009); title from > masthead.Applewhite, Devon.United StatesAlabama.United > StatesGeorgia.United StatesFlorida.info:srw/schema/1/marcxmlxml00000cas > a22000007a 4500289017315081219c20089999aluwr n | a0eng c > 2008213218NSDengNSDOCLCQDLCOCLCQ111945-93191945-93191005270USPSSpringhill > Publications, > LLC, P.O. Box 186, Greenville, AL 36037nsdppccISSN RECORD07014Greenville > standardThe Greenville standard.Greenville, AL :Springhill > PublicationsWeeklytexttxtrdacontentunmediatednrdamediaBegan with vol. 1, > issue 1 (Sept. 3, 2008)Description based on surrogate of: Vol. 1, no. 15 > (Dec. 18, 2008); title from masthead (publisher's Web site, viewed Dec. > 19, 2008).Latest issue consulted: Vol. 1, no. 99 (July 27, 2011) > (surrogate).info:srw/schema/1/marcxmlxml00000cas a22000007a > 4500123539969070426c20079999aluwr ne 0 a0eng c > 2007212138NSDengNSDNSDOCLCQ101936-95571936-95571The Western Tribune, > 1530 Third Ave. N., Bessemer, AL 35020mscnsdpISSN RECORDWestern tribune > (Bessemer, Ala.)The Western tribune(Bessemer, Ala.)The Western > tribune.Bessemer, Ala. :D-Med, Inc.v.WeeklyBegan in 2007.Description > based on: May 23, 2007 (surrogate); title from > caption.AU at 000041575341info:srw/schema/1/marcxmlxml00000cas a22000007a > 4500226300653080425c20079999aluwr ne | a0eng > 2008212112NSDengNSDNSDOCLCQ11942-20751942-20751nsdppccISSN RECORDThe > corridor messengerThe corridor messenger.Carbon Hill, AL :Corridor > Messenger, Inc.WeeklyBegan with vol. 1, issue (10.03.2007).Description > based on: 1st issue.United StatesAlabamaWalkerCarbon > Hill.http://www.corridormessenger.cominfo:srw/schema/1/marcxmlxml00000cas > a22000007a > 450077560432070109c20069999aluwr ne 0 a0eng c > > 2007213400NSDengNSDOCLCQAUBRNOCLCOOCLCFa01935-37901935-37901AU at 000041190283The > > Auburn Villager, P.O. Box 1633, Auburn, AL 36831-1633pccnsdpISSN > RECORDThe Auburn villagerThe Auburn villager.Auburn, AL :Auburn > Villagerv.WeeklyBegan in 2006.Description based on: Vol. 1, no. 
4 (July > 20, 2006) (surrogate); title from caption.Auburn (Ala.)Newspapers.Lee > County (Ala.)Newspapers.AlabamaAuburn.fast(OCoLC)fst01209634AlabamaLee > County.fast(OCoLC)fst01211930Newspapers.fast(OCoLC)fst01423814United > StatesAlabamaLeeAuburn.info:srw/schema/1/marcxmlxml00000cas a2200000Ii > 4500872286785m o d s cr mn|---a||||140311c20069999alucr n o b > s0 a0eng cABCengrdaABCABCOCLCFLD59.13University of Alabama at > Birmingham.The eReporter.[Birmingham, Alabama] :The University of > Alabama at Birmingham,[2006]-[Birmingham, Alabama] :Offices of Public > Relations & Marketing and Information Technology1 online resource2 > issues weeklytexttxtrdacontentcomputercrdamediaonline > resourcecrrdacarrierSeptember 19, 2006-\"The eReporter is an official > communication of The University of Alabama at Birmingham, companion to > the UAB Reporter and recommended alternative to mass e-mails.\"Issues > for <March 11, 2014- published and distributed via e-mail subscription > on Tuesdays and Fridays.Description based on: September 19, 2006; title > from title screen (viewed March 12, 2014).University of Alabama at > BirminghamPeriodicals.Periodicals.fast(OCoLC)fst01411641University of > Alabama at Birmingham.fast(OCoLC)fst00645114University of Alabama at > Birmingham.Office of Public Relations and Marketing.University of > Alabama at Birmingham.Information Technology.2006-2012, companion > to:University of Alabama at Birmingham.UAB > reporter.(OCoLC)32435748Archived > issueshttp:// > hatteras.dpo.uab.edu/cgi-bin/ereporter.cgiinfo:srw/schema/1/marcxmlxml00000cas > > a22000007a 4500166387050070829c20059999aluwr ne | a0eng c > 2007215501NSDengNSDOCLCQ11939-68991939-68991The Wilkie Clark Memorial > Foundation, P.O. Box 514, Roanoke, AL 36274$30.00nsdpmscISSN > RECORD305.89614People's voice (Roanoke, Ala.)The people's voice(Roanoke, > Ala.)The people's voice.Roanoke, AL :Wilkie Clark Memorial > Foundationv.WeeklyBegan with vol. 1, no. 1 in 2005.Description based on: > Vol. 
2, no. 20 (Apr. 20, 2007); title from caption.Wilkie Clark Memorial > Foundation.United > StatesAlabamaRandolphRoanoke.AU at 000042141390info:srw/schema/1/marcxmlxml00000nas > > > a22000007i 45001124677787191021c20uu9999aluwr ne | a0eng > 2019202521DLCengrdaDLC12689-3258122730USPSNorth Jackson Press, 42950 Hwy > 72, Suite 406, Stevenson, AL 35772nsdppccISSN RECORD071.323North Jackson > pressNorth Jackson press.Stevenson, AL :Caney Creek Publications > LLCWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierDescription > based on surrogate of: Volume 1, number 36 (October 11, 2019); title > from masthead.Latest issue consulted: Volume 1, number 36 (October 11, > 2019) (Surrogate).United > StatesAlabamaJacksonStevensoninfo:srw/schema/1/marcxmlxml00000cas > a2200000 a 4500226315099080428d19981998aluwr ne | 0eng c > 2008233691GUAengGUAOCLCQOCLCFOCLCO39911644pccn-us-gaThe Dekalb > news.Birmingham, Ala. :Community newspaper holdings Inc.v.WeeklyBegan > with 1st year, no. 1 (Apr. 1, 1998); ceased with 1st year, no. 31 (Oct. > 28, 1998).Final issue consulted.Description based on first issue; title > from caption.Decatur (Ga.)Newspapers.DeKalb County > (Ga.)Newspapers.Newspapers.fast(OCoLC)fst01423814GeorgiaDecatur.fast(OCoLC)fst01226234GeorgiaDeKalb > > > County.fast(OCoLC)fst01215288United > StatesGeorgiaDeKalbDecatur.Decatur-DeKalb news/era(DLC)sn > 89053661(OCoLC)19946163info:srw/schema/1/marcxmlxml00000cas a2200000 i > 450050263311m o d cr cn|||||||||020730c19979999alu x neo > 0 a0eng c > > 2015238492AMHengrdapnAMHOCLCQOCLCFOCLCOIULOCLHTMOCLCQCOODLC66460694810970435082687-93791AU at 000050711528OCLCS45109pccnsdpn-us---AP2.B5707023Birmingham > > weekly (Online)Birmingham weekly(Online)Birmingham weekly.Birmingham, AL > :Birmingham Weekly1 online resourceIrregular,Feb. 16-28, > 2012-Weekly,Sept. 4-11, 1997-Feb. 9-16, > 2012texttxtrdacontentcomputercrdamediaonline resourcecrrdacarrierBegan > with vol. 1, issue 1 (Sept. 
4-11, 1997).\"City news, views & > entertainment\"--Cover.Numbering dropped in Mar. 2012.Also issued in > print.Description based on: Publication information from ProQuest; title > from web page (viewed June 18, 2015).Latest issue consulted: Aug. 15-20, > 2012.Birmingham (Ala.)Newspapers.Internet resources.Electronic > journals.AlabamaBirmingham.fast(OCoLC)fst01204958Newspapers.fast(OCoLC)fst01423814United > > > StatesAlabamaBirmingham.Print version:Birmingham > Weekly(OCoLC)39271050 > http://apw.softlineweb.com/http://WC2VB5MT8E.search.serialssolutions.com/?sid=sersol&SS_jc=JC_000051895&title=Birmingham+Weeklyinfo:srw/schema/1/marcxmlxml00000cas > > a22000007a 450031471314941116d19941995aluwr ne 0 a0eng csn > 94003083 > NSDengNSDANEOCLCQOCLCFOCLCOOCLCQ11079-65411079-65411nsdppccn-us-akSoutheast > shopperSoutheast shopper.Juneau, Alaska :Kemper > Communications,1994-volumesWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. > > > 1, no. 1 (Nov. 16, 1994)-Ceased in Feb. 1995.Juneau > (Alaska)Newspapers.AlaskaJuneau.fast(OCoLC)fst01213587Newspapers.fast(OCoLC)fst01423814United > > > StatesAlaskaJuneau.AU at 000011356572info:srw/schema/1/marcxmlxml00000cas > a22000008a 450027910515930413c19949999alumr n 0 a0eng dsn > 93002581 NSDengNSDOCLCQ11069-06621Birmingham Tribune, 216 Ave. T. Pratt > City, Birmingham, AL 35214nsdpBirmingham tribuneBirmingham > tribune.Birmingham, Ala. :Kervin > Fondren9501volumesMonthlytexttxtrdacontentunmediatednrdamediavolumencrdacarrierPREPUB: > > > publication expected Jan. > 1995AU at 000025863987info:srw/schema/1/marcxmlxml00000cas a22000007a > 450026199931920716d19922013alumr ne 0 a0eng csn 92003357 > NSDengNSDOCLOCLCQDLC011064-01341064-01341Black & White, POB 13215, > Birmingham, AL 35202-3215nsdppccBlack & white (Birmingham, Ala.)Black & > white(Birmingham, Ala.)Black & white.Black and whiteBirmingham, Ala. > :Black & White, Inc.v.Biweekly,Oct. 2, 1997-Monthly,May 1, 1992-Sept. > 1997Began in May 1992; ceased with Jan. 
10, 2013.\"Birmingham's New City > paper.\"Description based on: June 1992.Latest issue consulted: No. 67 > (Oct. 16, 1997) (surrogate).info:srw/schema/1/marcxmlxml00000cas > a2200000 a 450032145723950314d19901999alumr ne 0 a0eng csn > 95068755 > > MGNengMGNNSDCLUOCLCQOCLCFOCLCOOCLCA971211082-34841082-34841AU at 000011579542nsdppccn-us-alF335.J5S68The > > Southern shofarThe Southern shofar.Birmingham, AL :L. Brook,-[1999]v. > :ill. ;35 cm.MonthlyBegan in 1990.-v. 9, issue 9 (Aug./Sept. 1999).\"The > monthly newspaper of Alabama's Jewish community.\"Some issues also > available on the Internet via the World Wide Web.Description based on: > Vol. 3, issue 11 (Oct. 1993).Jewish newspapersAlabama.Jewish > newspapers.fast(OCoLC)fst00982872Alabama.fast(OCoLC)fst01204694United > StatesAlabamaJeffersonBirmingham.Deep South Jewish voice(DLC)sn > 99018499(OCoLC)42431704CLUhttp:// > bibpurl.oclc.org/web/719http://www.bham.net/shofar/info:srw/schema/1/marcxmlxml00000cas > > a22000007a 450021265141900326c19909999aluwr ne 0 a0eng csn > 90099004 AARengAARCPNNSDOCLCQ11050-08981050-08981005022USPSE.O.N., Inc., > Main St., Eclectic, AL 36024pccnsdpISSN RECORDThe Eclectic observerThe > Eclectic observer.Eclectic, Ala. :E.O.N., Inc.,1990-v.WeeklyVol. 1, no. > 1 (Feb. 22, 1990)-Published by: Price Publications, Inc., <2006->Latest > issue consulted: Vol. 17, no. 1 (Jan. 5, 2006).United > StatesAlabamaElmoreEclectic.AU at 000040212446info:srw/schema/1/marcxmlxml00000cas > > > a22000007a 450021214781900314c19909999aluir ne 0 a0eng csn > 90002457 AAAengAAANSDOCLCQ111050-20841050-20841931180USPSClanton > Newspapers, 1109 Seventh St., N., PO Box 1379, Clanton, AL > 35045nsdppccn-us-alThe Clanton advertiserThe Clanton > advertiser.AdvertiserClanton, Ala. :Clanton Newspapersv. :ill. ;58 > cm.Three no. a week,<May 13, 1992->Semiweekly,<Apr. 4, 1990->Began in > Jan. 1990.Description based on: Vol. 19, no. 27 (Wed., Apr. 4, > 1990).Latest issue consulted: Vol. 22, no. 
58 (May 13, 1992).United > StatesAlabamaChiltonClanton.Independent advertiser (Clanton, > Ala.)(OCoLC)21214732AU at 000025908452info:srw/schema/1/marcxmlxml00000cas > a2200000 a 450021214814900314c19909999aluwr ne 0 a0eng dsn > 90099009 AAAengAAACPNNSDOCLCQ11056-32881056-32881505740USPSThe Blount > Countian, 3rd St. at Washington Ave., PO Box 310, Oneonta, AL > 35121mscnsdpn-us-alThe Blount countianThe Blount countian.Oneonta, Ala. > :Southern Democrat, Inc.,1990-v. :ill.WeeklyVol. 1, no. 1 (Jan. 3, > 1990)-Editor: Molly Howard Ryan, 1990-Latest issue consulted: Vol. 1, > no. 36 (Sept. 5, 1990).Ryan, Molly Howard.United > StatesAlabamaBlountOneonta.Southern Democrat(DLC)sn > 85044741(OCoLC)12038577AU at 000025884049info:srw/schema/1/marcxmlxml00000cas > a22000007a 450022413044900920c19909999aluwr ne 0 a0eng dsn > 90099011 > AARengAARCPNNSDNSTOCLCQ92081707011191053-91231053-91231314240USPSmscnsdpThe > Clay times-journalThe Clay times-journal.Lineville, Ala. :C.L. > Proctor,1990-v.WeeklyVol. 1, no. 1 (Sept. 6, 1990)-United > StatesAlabamaClayLineville.Ashland progress(DLC)sn 85044701Lineville > tribune(DLC)sn 85044702AUinfo:srw/schema/1/marcxmlxml00000cas a22000007a > 450021265218900326c19909999aluwr ne 0 0eng dsn 90099005 > AARengAARCPNOCLCQmscTrussville news-journal.Trussville, Ala. :Mike > Mitchell,1990-v.BimonthlyVol. 1, no. 1 (Feb. 20, 1990)-United > StatesAlabamaJeffersonTrussville.info:srw/schema/1/marcxmlxml00000cas > a22000007a 450022301035900831c19909999aluwr ne 0 0eng dsn > 90099010 AARengAARCPNOCLCQmscWeaver tribune.Oxford, Ala. :Cheaha > Pub.,1990-v.WeeklyVol. 1, no. 1 (July 19, 1990)-United > StatesAlabamaCalhounWeaver.United > StatesAlabamaCalhounOxford.info:srw/schema/1/marcxmlxml00000cas > a22000007a 450015155895870205c19879999aludr ne 0 a0eng csn > 87050045 > > AAAengAAACPNNSDDLCCPNNSDDLCCPNDLCOCLDLCOCLCQOCLCFOCLCQ19261126829944596670892-44570892-44571AU at 000020456714360980USPSThe > > Advertiser, P.O. 
Box 1000, Montgomery, AL > 36192pccnsdpn-us-alNewspaperMontgomery advertiser (Montgomery, Ala. : > 1987)The Montgomery advertiser(1987)The Montgomery advertiser.Montgomery > advertiser & the Alabama journalSunday Montgomery advertiserMontgomery, > Ala. :Advertiser Co.,1987-volumes > :illustrationsDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrier160th > > > year, no. 1 (Jan. 2, 1987)-On Saturdays, Sundays and holidays a combined > edition is published with the Alabama journal, and called: Montgomery > advertiser and the Alabama journal, Jan. 3, 1987, and: Alabama journal > and Montgomery advertiser, Jan. 4, 1987-Feb. 25, 1990.Issues for Sunday > called: Sunday Montgomery advertiser, Mar. 4, 1990-Issues for Saturday, > Sunday and holidays have their own numbering, Jan. 3, 1987-Feb. 25, > 1990.Montgomery > (Ala.)Newspapers.AlabamaMontgomery.fast(OCoLC)fst01202689Newspapers.fast(OCoLC)fst01423814United > > > StatesAlabamaMontgomeryMontgomery.Advertiser (Montgomery, > Ala.)0745-3221(DLC)sn 82008412(OCoLC)9049482Alabama journal (Montgomery, > Ala. : 1940)0745-323X(DLC)sn > 87062018(OCoLC)2666111info:srw/schema/1/marcxmlxml00000cas a2200000 a > 450016942287871105c19879999aludn ne 0 a0eng dsn 88050149 > AAAengAAACPNNSDOCLCQy1044-00701044-0070746--32780746-32781565580USPSTroy > Publications, Inc., 113 North Market St., Troy, AL 36081mscnsdpMessenger > (Troy, Ala.)The Messenger(Troy, Ala.)The Messenger.Troy, Ala. :Troy > Pub.,1987-v.Daily (Sunday, Tuesday, Thursday and Friday)Vol. 121, no. > 166 (July 1, 1987)-Sunday, Apr. 2, 1989 misprinted as v. 113.Latest > issue consulted: Vol. 113 [sic 123], no. 96 (Sunday, Apr. 2, > 1989).United StatesAlabamaPikeTroy.Troy messenger0746-3278(DLC)sn > 83009935(OCoLC)9921908info:srw/schema/1/marcxmlxml00000cas a22000007a > 450017799786880415c19879999aluir ne 0 a0eng dsn 88050086 > AARengAARCPNNSDOCLCQ1p1044-03801044-03800745-75961441520USPSThe > Prattville Progress, 152 W. 
3rd St., Prattville, AL > 36067mscnsdpPrattville progress (Prattville, Ala. : 1987)The Prattville > progress(Prattville, Ala.)The Prattville progress.Prattville, Ala. > :James C. Seymour,1987-v.Three times a weekVol. 102, no. 8 (Jan. 20, > 1987)-Latest issue consulted: Vol. 105, no. 153 (Wednesday, Dec. 26, > 1990).United StatesAlabamaAutaugaPrattville.Progress (Prattville, > Ala.)0745-7596(DLC)sn > 83007623(OCoLC)9428489info:srw/schema/1/marcxmlxml00000cas a22000007a > 450015344667870319c19869999aluwr ne 0 a0eng dsn 87000284 > NSDengNSDCPNOCLCQy0893-07670893-07671431800USPSPickens County Herald, > P.O. Drawer E, Carrollton, AL 35447nsdpPickens County heraldPickens > County herald.Pickens County herald and west AlabamianCarrollton, Ala. > :Pickens Newspapers, Inc.,1986-WeeklyVol. 138, no. 40 (Oct. 2, > 1986)-United StatesAlabamaPickensCarrollton.Pickens County herald and > west Alabamian0746-0473(DLC)sn > 83008141AU at 000040635809info:srw/schema/1/marcxmlxml00000cas a22000007a > 450018917586881217c19869999aluwr ne 0 0eng dsn 88050225 > CPNengCPNOCLCQmscThe Oxford sun/times.Oxford, Ala. > :[s.n.],1986-v.WeeklyVol. 1, no. 1 (Jan. 16, 1986)-Editor: Andy > Goggans.Numbering is irregular.United StatesAlabamaCalhounOxford.Oxford > sun (Oxford, Ala.)(DLC)sn > 85045023AU at 000025803813info:srw/schema/1/marcxmlxml00000cas a22000007a > 450013991168860731c19869999aluwr ne 0 0eng dsn 86050322 > CPNengCPNOCLCQmscIndependent (Brewton, Ala.)The Independent.Brewton, > Ala. :Jim Thornton,1986-v. :ill. ;58 cm.WeeklyVol. 1, no. 1 (June 19, > 1986)-United > StatesAlabamaEscambiaBrewton.info:srw/schema/1/marcxmlxml00000cas > a22000007a 450018957493881231c19859999aluwr ne 0 0eng dsn > 88050247 CPNengCPNOCLCQmscPiedmont journal-independent (Piedmont, > Ala.)The Piedmont journal-independent.Journal independentPiedmont, Ala. > :Lane Weatherbee,1985-v.WeeklyVol. 4, no. 52 (Dec. 
24, 1985)-Sometimes > published as: Journal independent.United > StatesAlabamaCalhounPiedmont.Journal-independent(DLC)sn > 85045014info:srw/schema/1/marcxmlxml00000cas a22000007a > 450012715821851024d19841985aluwr ne 0 a0eng dsn 85045014 > CPNengCPNNSDCPNOCLCQmscThe Journal-independent.Piedmont, Ala. > :Journal-Independent, Inc.,1984-1985.volumes :illustrations ;58 > cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 3, > no. 27 (July 3, 1984)- v. 4, no. 51 (Dec. 18, 1985).Carries the same > vol. numbering as the Piedmont journal-independent.United > StatesAlabamaCalhounPiedmont.Piedmont > journal-independent0890-6017(DLC)sn 85045013Piedmont journal-independent > (Piedmont, Ala.)(DLC)sn 88050247info:srw/schema/1/marcxmlxml00000cas > a22000007a 450012691448851018c19839999aludr ne 0 0eng dsn > 85045007 CPNengCPNOCLCQmscTimesDaily.Times dailyFlorence, Ala. :T.S.P. > Newspapers, Inc.,1983-volumes :illustrations ;58 > cmDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 114, > no. 226 (Aug. 14, 1983)-United StatesAlabamaLauderdaleFlorence.Florence > times + tri-cities daily(DLC)sn > 85044995info:srw/schema/1/marcxmlxml00000cas a22000007a > 45009428489830420d19831987aluir ne 0 a0eng dsn 83007623 > NSDengNSDCPNNSDNSTOCLCQ89090d0745-75960745-75961The Progress, 152 W. 3rd > St., Prattville, AL 36067nsdpmscProgress (Prattville, Ala.)The > Progress(Prattville, Ala.)The Progress.Prattville, Ala. :The Prattville > Progress,1983-1987.volumes :illustrations ;58 cmThree times a > weektexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 98, no. > 32 (Mar. 17, 1983)-v. 102, no. 7 (Jan. 17, 1987).United > StatesAlabamaAutaugaPrattville.Prattville progress(DLC)sn > 85044740Prattville progress (Prattville, Ala.)1044-0380(DLC)sn > 88050086(OCoLC)12254317AAPinfo:srw/schema/1/marcxmlxml00000cas a2200000 > a 45009867255830831c19839999aludr ne 0 a0eng dsn 84008052 > AAAengAAANSDOCLOCLCQX0743-15110743-15111617760USPST.S.P. Newspapers, > Inc., 219 W. 
Tennessee St., Florence, AL 35630nsdpTimesDaily (Shoals > edition)TimesDaily(Shoals ed.)TimesDaily.Times dailyShoals ed.Florence, > Ala. :T.S.P. Newspapersvolumes > :illustrationsDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan > > > with: Vol. 114, no. 226 (Aug. 14, > 1983).\"Florence/Sheffield/Tuscumbia/Muscle Shoals.\"Shoals ed. and > Regional ed. combined on Sundays.Description based on: Vol. 114, no. 346 > (Monday, Dec. 12, 1983).United > StatesAlabamaLauderdaleFlorence.TimesDaily (Regional > edition)0743-152XTimes Tri-cities dailyUnknownDec. 12, > 1983info:srw/schema/1/marcxmlxml00000cas a22000007a > 450010536023840319c19839999aludr ne 0 a0eng dsn 84008051 > NSDengNSDOCLCQ1x0743-152X0743-152X1617760USPST.S.P. Newspapers, Inc., > 219 W. Tennessee St., Florence, AL 35630nsdpTimesDaily (Regional > edition)TimesDaily(Regional ed.)TimesDaily.Times dailyRegional > ed.Florence, Ala. :T.S.P. > NewspapersDailytexttxtrdacontentunmediatednrdamediaBegan with: Vol. 114, > no. 226 (Aug. 14, 1983).Shoals ed. and Regional ed. combined on > Sundays.Description based on: Vol. 114, no. 346 (Monday, Dec. 12, > 1983).United StatesAlabamaLauderdaleFlorence.TimesDaily (Shoals > edition)0743-1511Times Tri-cities dailyDec. 12, > 1983AU at 000025818125info:srw/schema/1/marcxmlxml00000cas a22000007a > 45009049482821213d19821987aludn ne 0 a0eng csn 82008412 > AAAengAAANSDNPWCPNDLCCPNNSDDLCNSDDLCCPNNVFDLCOCLCQCRLOCLCFOCLCQ1d0745-32210745-32211nsdppccn-us-alNewspaperAdvertiser > > > (Montgomery, Ala.)The Advertiser(Montgomery, Ala.)The advertiser.Alabama > journal and advertiserMontgomery, Ala. :Advertiser Co.,1982-1987.volumes > :illustrationsDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrier155th > > > year, no. 232 (Nov. 22, 1982)- ; -v. 14-3, Jan. 1, 1987.On Saturdays, > Sundays and holidays published as: The Alabama journal and advertiser, > Nov. 27, 1982-Jan. 
1, 1987.Saturday, Sunday and holiday issues have > their own numbering.Montgomery > (Ala.)Newspapers.AlabamaMontgomery.fast(OCoLC)fst01202689Newspapers.fast(OCoLC)fst01423814United > > > StatesAlabamaMontgomeryMontgomery.Montgomery advertiser (Montgomery, > Ala. : Daily)(DLC)sn 84020645(OCoLC)2685433Montgomery advertiser > (Montgomery, Ala. : 1987)0892-4457(DLC)sn > 87050045(OCoLC)15155895AU at 000020281746info:srw/schema/1/marcxmlxml00000cas > a2200000 a 45009237931830218c19829999aluwr ne 0 0eng dsn > 86050139 AAAengAAACPNOCLOCLCQmscThe Randolph leader.Roanoke, Ala. :David > S. Stevenson,1982-volumes :illustrations ;58 > cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 91, > no. 1 (Oct. 6, 1982)-United StatesAlabamaRandolphRoanoke.Roanoke > leader(DLC)sn 86050137Randolph press(DLC)sn > 86050138info:srw/schema/1/marcxmlxml00000cas a22000007a > 450012715815851024d19821984aluwr ne 0 a0eng dsn 85045013 > CPNengCPNNSDCPNOCLCQ110890-60170890-60171432080USPSThe Piedmont > Journal-Independent, 115 N. Center Ave., Piedmont, AL 36272mscnsdpThe > Piedmont journal-independentThe Piedmont journal-independent.Piedmont, > Ala. :Piedmont Journal-Independent, Inc.,1982-1984.volumes > :illustrations ;58 > cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 1, > no. 1 (Mar. 31, 1982)-v. 3, no. 26 (June 27, 1984).Latest issue > consulted: Vol. 5, no. 31 (August 20, 1986).United > StatesAlabamaCalhounPiedmont.Piedmont journal(DLC)sn > 85045012Journal-independent(DLC)sn > 85045014(OCoLC)12715821AU at 000045312916info:srw/schema/1/marcxmlxml00000cas > a22000007a 45009183905830202c19829999aluwr n 0 a0eng dsn > 85044580 AAAengAAACPNNSDOCLOCLCQ11098-58671098-58671016409USPSNo. 4, > Rucker Plaza, Enterprise, AL 36331P.O. Box 1536, Enterprise, AL > 36331mscnsdpSoutheast sun (Enterprise, Ala.)The southeast > sun(Enterprise, Ala.)The Southeast sun.Enterprise, Ala. 
:QST > Publicationsvolumes :illustrations ;58 > cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan in > 1982.Description based on: Vol. 1, no. 25 (Oct. 21, 1982).Latest issue > consulted: Vol. 16, no. 43 (Mar. 4, 1998).United > StatesAlabamaCoffeeEnterprise.AU at 000025827687info:srw/schema/1/marcxmlxml00000cas > > > a22000007a 450010487314840305c19819999aluwr ne 0 a0eng dsn > 85044906 > AAAengAAACPNNSDNSTCPNOCLOCLCQOCLCFOCLCOOCLCAOCLCQ900410885-16620885-16621749310USPSThe > > > New Times, 1618 1/2 St. Stephens Rd., Mobile, AL 36603mscnsdpn-us-alNew > times (Mobile, Ala.)The New times(Mobile, Ala.)The new times.Mobile, > Ala. :New Times Groupvolumes > :illustrationsWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan > > > in 1981.Vol. 3, no. 49 (Dec. 15-21, 1983) and vol. 3, no. 50 (Dec. > 22-28, 1983) are both called vol. 3, no. 49 (Dec. 15-21, > 1983).Description based on: Vol. 2, no. 3 (Jan. 28-Feb. 3, 1982).African > AmericansAlabamaNewspapers.African > Americans.fast(OCoLC)fst00799558Alabama.fast(OCoLC)fst01204694Newspapers.fast(OCoLC)fst01423814United > > > StatesAlabamaMobileMobile.AAPUnknownAug. 15, > 1985AU at 000024686659info:srw/schema/1/marcxmlxml00000cas a22000007a > 450018922463881219d19811983alucr ne 0 0eng dsn 88050233 > AARengAARCPNNSDOCLCQmscThe Sylacauga daily advance.Advance/Sylacauga > dailySylacauga advanceSunday advanceAdvanceSylacauga, Ala. :Mrs. W.A. > Moody,1981-1893.v.Semiweekly,<Nov. 24, 1982-Feb. 13, 1983>Daily (except > Mon., Tues. & Sat.),<May 26, 1982-Nov. 21, 1982>Daily (except Sat. & > Mon.),<Jan. 1, 1981-May 23, 1982>74th Year, no. 123 (Jan. 1, 1981)-76th > year, no. 83 (Feb. 13, 1983).Days of publication vary.Published as: The > Advance/Sylacauga daily, <Aug. 28, 1981-May 23, 1982>.Published as: > Sylacauga advance, <Nov. 24, 1982-Feb. 
13, 1983>.On Sunday, published > as: Sunday advance.United StatesAlabamaTalladegaSylacauga.Childersburg > star(DLC)sn 88050232Coosa press(DLC)sn 86050293Daily > home1059-6461(DLC)sn 88050234info:srw/schema/1/marcxmlxml00000cas > a22000007a 450021026715cr un|||||||||900209c19809999aluwr ne 0 > 0eng dsn 90099002 > > AARengAARCPNCUSOCLOCLCQTJCOCLCQOCLCFOCLCOOCLCA926143844AU at 000020585756mscn-us-alSpeakin' > > > out news.Speaking out newsDecatur, Ala. :Minority Network, > Inc.v.WeeklyBegan in 1980.Published in Huntsville, Ala., <1987>-Also > issued by subscription via the World Wide Web.Description based on: Vol. > 7, no. 8 (Jan. 7-13, 1987).African AmericansAlabamaNewspapers.African > American > newspapersAlabama.AlabamaNewspapers.Newspapers.fast(OCoLC)fst01423814African > > > American newspapers.fast(OCoLC)fst00799278African > Americans.fast(OCoLC)fst00799558Alabama.fast(OCoLC)fst01204694United > StatesAlabamaMorganDecatur.United > StatesAlabamaMadisonHuntsville.Speakin' out weekly news(DLC)sn > 88050097 > http://www.softlineweb.com/softlineweb/ethnic.htminfo:srw/schema/1/marcxmlxml00000cas > > a22000007a 450014996511861219c19809999aluwr ne 0 a0eng csn > 86050472 > AARengAARCPNNSDOCLCQ11080-15021080-15021328110USPSnsdppccWest-Alabama > gazetteWest-Alabama gazette.GazetteMillport, Ala. :Millport Pub. > Co.,1980-volumesWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrier4th > > > year, no. 32 (Jan. 3, 1980)-United StatesAlabamaLamarMillport.Gazette > (Millport, Ala.)(DLC)sn 86050471info:srw/schema/1/marcxmlxml00000cas > a2200000 a 450011828156850320c19809999aluwr ne 0 0eng dsn > 86050314 AAAengAAACPNOCLOCLCQmscThe Hartford news-herald.Hartford, Ala. > :Geneva Publications,1980-volumes :illustrations ;57-59 > cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 80, > no. 20 (Feb. 
14, 1980)-United StatesAlabamaGenevaHartford.News-herald > (Hartford, Ala.)(DLC)sn 86050313info:srw/schema/1/marcxmlxml00000cas > a22000007a 450017857788880427d198u198ualusr ne 0 0eng dsn > 88050097 AARengAARCPNOCLOCLCQOCLCFOCLCOOCLCAmscn-us-alSpeakin' out > weekly news.Decatur, Ala. :Smothers PublicationsPublished every first > and third Wed. of each monthDescription based on: Vol. 3, no. 13 (May > 4-17, 1983).African > AmericansAlabamaNewspapers.Newspapers.fast(OCoLC)fst01423814African > Americans.fast(OCoLC)fst00799558Alabama.fast(OCoLC)fst01204694United > StatesAlabamaMorganDecatur.Weekly news (Huntsville, Ala.)(DLC)sn > 87050012Speakin' out news(DLC)sn > 90099002info:srw/schema/1/marcxmlxml00000cas a2200000 a > 450017807936880418c198u9999aluwr ne 0 a0eng dsn 90099001 > AAAengAAACPNOCLOCLCQThe Daleville Sun-Courier, 310 Daleville Ave., > Daleville, AL 36322mscn-us-alDaleville sun-courier.Daleville, Ala. :QST > Publicationsv. :ill. ;58 cm.WeeklyDescription based on: Vol. 2, no. 28 > (Wed., Feb. 17, 1988).United > StatesAlabamaDaleDaleville.AU at 000020585749info:srw/schema/1/marcxmlxml00000cas > > > a22000007a 450015580838870423c198u9999aluwr ne 0 0eng dsn > 87050128 AARengAARCPNOCLCQmscGreene County independent.Eutaw, Ala. > :Greene County Independent, Inc.v.WeeklyDescription based on: Vol. 2, > no. 10 (Mar. 12, 1987).United > StatesAlabamaGreeneEutaw.info:srw/schema/1/marcxmlxml00000cas a22000007a > 450010125135831114d198u198ualucr ne 0 a0eng dsn 83003221 > NSDengNSDOCLCQ0d0746-55210746-55211Auburn Bulletin & Lee County Eagle, > PO Box 2111, Auburn, Ala. 36830nsdpThe Auburn bulletin & the Lee County > eagleThe Auburn bulletin & the Lee County eagle.Lee County eagleAuburn > bulletin and the Lee County eagleAuburn, Ala. :[publisher not > identified]Semiweekly,<Sept. 5, > 1984->WeeklytexttxtrdacontentunmediatednrdamediaDescription based on: > Oct. 19, 1983.United StatesAlabamaLeeAuburn.Auburn bulletin(DLC)sn > 89050006Eagle (Auburn, Ala.)(OCoLC)18435663Sept. 
5, > 1984info:srw/schema/1/marcxmlxml00000cas a22000007a > 450018370324880818c198u9999aluwr ne 0 0eng dsn 88050147 > CPNengCPNOCLCQmscTri-city times (Geraldine, Ala.)The Tri-City > times.Geraldine, Ala. :Wanda Nelsonv.WeeklyDescription based on: Vol. 2, > no. 24 (Jan. 6, 1982).United > StatesAlabamaDeKalbGeraldine.info:srw/schema/1/marcxmlxml00000cas > a22000007a 450010199338831208c198u9999aluwr ne 0 a0eng dsn > 83005367 NSDengNSDCPNOCLCQ10746-62770746-62771707590USPSSpringville Pub. > Co., 539 Main St., Springville, AL 35146nsdpThe St. Clair clarionThe St. > Clair clarion.Saint Clair clarionSpringville, AL :Gary L. > ShultsWeeklytexttxtrdacontentunmediatednrdamediaDescription based on: > Vol. 2, no. 1 (Jan. 5, 1982).United StatesAlabamaSt. > ClairSpringville.AU at 000025783743info:srw/schema/1/marcxmlxml00000cas > a22000007a 450013787251860627c198u9999aluwr ne 0 a0eng dsn > 86001923 NSDengNSDCPNOCLCQ10889-00800889-00801The Westerner Star, P.O. > Box 2060, Bessemer, AL 35021nsdpWestern star (Bessemer, Ala.)The Western > star(Bessemer, Ala.)The western star.Bessemer, Ala. :Hal > HodgensWeeklytexttxtrdacontentunmediatednrdamediaDescription based on: > Vol. 3, no. 15 (Wednesday, June 11, 1986).United > StatesAlabamaJeffersonBessemer.Bessemer advertiser(DLC)sn > 87050117AU at 000025805174511.1srw.pc any \"y\" and srw.mt any > \"newspaper\" and srw.cp exact > \"Alabama\"50info:srw/schema/1/marcxmlxml1Date,,0mq1lME887FoIbjulKUV6bx9ImwWQNCv9GqZzGS92IKS31lEbcpRJBNHgcE1l29tFaHP9CHe0Yexk1uWQofffull" > > > ------------------------------------------------------------------------ > > Newspapers and Current Periodicals Reference Librarian > > Jul 27 2022, 09:22am via System > > Hello Spencer, > > Thank you for reaching out about the bulk xml files for the US Newspaper > Directory. > > We don't have documentation specific to these bulk xml files, but upon > further inspection I can say that each of those files don't necessarily > contain info for 50 newspaper titles. 
The structure of the titles for
> California and New York, for instance, is different from, say, Alabama.
>
> If you look at California, for example, the file naming structure
> indicates the year the title started, and then the number of titles
> included in that xml file. So, for instance, the files below include info
> for newspapers that started in 2000, 2001, and 2002 respectively. And
> there is info for 30 titles in the xml file from 2000, and 14 in the
> file for 2001, and so on.
>
> * ndnp_California_2000_e_0001_0030.xml
> * ndnp_California_2001_e_0001_0014.xml
> * ndnp_California_2002_e_0001_0012.xml
>
> If there are more than 50 titles for a given year, say for California
> starting in 1880, then the next 50 titles will roll into the next xml
> file, and so on. And the last xml file for that year may not include 50
> titles.
>
> Many of the states seem to group all the years together, so each xml
> file contains 50 titles, until possibly the last one for a given state,
> which may contain fewer.
>
> I hope this information helps explain the total number of records and
> structure a bit better. Let me know if you have any further questions.
>
> Best wishes,
>
> Kerry Huller
> Newspaper & Current Periodical Reading Room
> Serial & Government Publications Division
> Library of Congress
>
> ------------------------------------------------------------------------
>
> Newspapers and Current Periodicals Reference Librarian
>
> Jul 25 2022, 02:22pm via Email
>
> Hi, Kerry:
>
> Might there be documentation on the XML files you mentioned?
>
> I've successfully read
> 'https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/',
> extracted the names of 6666 XML files, and read the first one,
> "ndnp_Alabama_all-yrs_e_0001_0050.xml". It contains 29415 characters,
> beginning, "1.12250info:srw/schema/1/marcxmlxml00000nas a22000007i
> 45001030438981180404c20159999aluwr n 0 a0eng ". With a bit
> more effort, I will likely be able to parse all 6666 of these.
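One possible way to do that parsing, sketched under assumptions: the xml2 package is installed, and every file has the layout shown in the excerpt quoted earlier in this thread (an SRW searchRetrieveResponse wrapping MARCXML records). The helper name marc_records() is hypothetical, the two sample records are rebuilt by hand from the flattened text above, and the field choices (010 $a for the LCCN, 245 $a for the title, characters 8-11 and 12-15 of field 008 for the two dates) follow a reading of the MARC 21 format that should be spot-checked against real records.

```r
library(xml2)

# Hand-built sample: two records reconstructed from the flattened text in
# this thread. Whitespace inside field 008 is approximate.
sample_xml <- '<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
 <records>
  <record><recordData>
   <record xmlns="http://www.loc.gov/MARC21/slim">
    <controlfield tag="001">11828156</controlfield>
    <controlfield tag="008">850320c19809999aluwr ne 0 0eng d</controlfield>
    <datafield tag="010"><subfield code="a">sn 86050314</subfield></datafield>
    <datafield tag="245">
     <subfield code="a">The Hartford news-herald.</subfield>
    </datafield>
   </record>
  </recordData></record>
  <record><recordData>
   <record xmlns="http://www.loc.gov/MARC21/slim">
    <controlfield tag="001">9237931</controlfield>
    <controlfield tag="008">830218c19829999aluwr ne 0 0eng d</controlfield>
    <datafield tag="010"><subfield code="a">sn 86050139</subfield></datafield>
    <datafield tag="245">
     <subfield code="a">The Randolph leader.</subfield>
    </datafield>
   </record>
  </recordData></record>
 </records>
</searchRetrieveResponse>'

# Hypothetical helper: one data-frame row per MARC record in the response.
marc_records <- function(doc) {
  # Declare prefixes for both namespaces so the outer SRW <record> and the
  # inner MARC <record> stay distinct in the XPath expressions.
  ns <- c(srw = "http://www.loc.gov/zing/srw/",
          marc = "http://www.loc.gov/MARC21/slim")
  recs <- xml_find_all(doc, ".//srw:recordData/marc:record", ns)
  # First match per record; records without the field yield NA.
  first_text <- function(xpath) {
    trimws(xml_text(xml_find_first(recs, xpath, ns)))
  }
  f008 <- xml_text(xml_find_first(recs, "./marc:controlfield[@tag='008']", ns))
  data.frame(
    control_no = first_text("./marc:controlfield[@tag='001']"),
    lccn = first_text("./marc:datafield[@tag='010']/marc:subfield[@code='a']"),
    title = first_text("./marc:datafield[@tag='245']/marc:subfield[@code='a']"),
    date1 = substr(f008, 8, 11),   # assumed: start year
    date2 = substr(f008, 12, 15),  # assumed: end year, 9999 if ongoing
    stringsAsFactors = FALSE
  )
}

out <- marc_records(read_xml(sample_xml))
print(out)
```

For a real file, read_xml() can be given the URL from the first message in place of sample_xml. Repeated fields (several 010 or 245 subfields) would need xml_find_all() per record rather than xml_find_first().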
The > names suggest that each contains information on 50 newspapers, totaling > 333,300. The main page > "https://chroniclingamerica.loc.gov/search/titles/" says there are only > 157,521 "Titles currently listed". This suggests that these XML files > include place holders for a little more than double the number of > entries currently in "https://chroniclingamerica.loc.gov/search/titles/". > > > Thanks for this. > > > Progress. > ------------------------------------------------------------------------ > > Newspapers and Current Periodicals Reference Librarian > > Jul 07 2022, 08:55am via System > > Hi Spencer, > > I thought of one more option after I emailed you yesterday that I wanted > to make you aware of. > > I had explained the other day how we pull the records from OCLC into our > U.S. Newspaper Directory. You can also access all of the raw MARC > records found in the directory in xml format from here if you choose: > https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/ > <https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/> These will > > provide you all of the data from the record fields in MARC format, so > you'd get all the data you see here for example: > https://chroniclingamerica.loc.gov/lccn/sn98059792/marc/ > <https://chroniclingamerica.loc.gov/lccn/sn98059792/marc/> but in xml. I > don't know if this might be more data and info than you want to work > with, but wanted to make sure you were aware of this option as well. > > Best wishes, > > Kerry Huller > Newspaper & Current Periodical Reading Room > Serial & Government Publications Division > Library of Congress > > ------------------------------------------------------------------------ > > Newspapers and Current Periodicals Reference Librarian > > Jul 06 2022, 10:55am via System > > Hi Spencer, > > Thanks for reaching out again. I have been looking at the json view a > bit closer this morning and your example of "9999." 
> > After talking with a colleague this morning and looking at various > examples, I see there is some variation in how the titles with either an > unknown starting/ending date or currently published titles are being > handled - depending on the view. > > As an example, I completed a search in the directory for Alaska and the > city of Anchorage. There are 80 results, and on the first page of > results you'll see # 4. Fort Richardson news, which was published from > 1952-19??. The csv view of this state/city search result will show the > ending date of 19??. But if I append &format=json to this search result, > this specific title will show an ending date of 1999. After talking with > a colleague this morning, I discovered an integer had to be used in > these cases where dates were "?" so that the search based on year range > would work. Similarly, if you look at # 12 Alaska digest, which was > published 1994-current, the "current" becomes "9999" in the json view. > So, the records you are seeing with "9999" would most likely be titles > with an ending date of "current." > > However, there is an issue with the unknown dates, like "1999" being > used for "19??" in the example above. The "9" does not get inserted in > place of "?" when you are looking at the title/LCCN view of a specific > newspaper. So for instance, if you view the #4 title: Fort Richardson > news at this url: https://chroniclingamerica.loc.gov/lccn/sn98059792/ > <https://chroniclingamerica.loc.gov/lccn/sn98059792/> but append .json > to the end of the url, after the LCCN, like this: > https://chroniclingamerica.loc.gov/lccn/sn98059792.json > <https://chroniclingamerica.loc.gov/lccn/sn98059792.json> you'll see > that the end_year is "19??." Viewing the title/LCCN json view for titles > that are currently published will also show the end_year as "current." 
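Given the behaviour described above, end_year is probably safer to treat as a code than as a number. A small base-R sketch, assuming only the forms mentioned in this message ("9999" or "current", a partly unknown year such as "19??", and a plain four-digit year); the function name classify_end_year() and the status labels are hypothetical:

```r
# Hypothetical helper: label each end_year value and keep a numeric year
# only where the value is a plain four-digit year.
classify_end_year <- function(x) {
  x <- trimws(as.character(x))
  status <- rep("other", length(x))
  status[grepl("^[0-9]{4}$", x)] <- "known"
  # "9999" (search view) and "current" (title view) both mean ongoing.
  status[x %in% c("9999", "current")] <- "current"
  # Four characters, digits or "?", with at least one "?".
  partly <- grepl("^[0-9?]{4}$", x) & grepl("?", x, fixed = TRUE)
  status[partly] <- "partly_unknown"
  year <- suppressWarnings(as.integer(x))
  year[status != "known"] <- NA_integer_
  data.frame(raw = x, status = status, year = year, stringsAsFactors = FALSE)
}

res <- classify_end_year(c("1987", "9999", "current", "19??", "1999"))
print(res)
```

Note the limit of this approach: a "1999" taken from the search view may stand for "19??", and nothing in the value itself reveals that. Only the title/LCCN json view would settle it.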
> The Alaska digest example from above can be viewed here: > https://chroniclingamerica.loc.gov/lccn/sn97060056.json > <https://chroniclingamerica.loc.gov/lccn/sn97060056.json> > > I wasn't aware of the difference between the directory search json view > and the title/LCCN view. But I think it would be possible to grab > the data from the title/LCCN json url through an additional script > potentially. The json url is included in the view under the "url" field. > > Of course, there are unknowns with publishing dates, but better to know > where the question marks are, and what titles are considered to be current. > > I hope this clarifies the data a bit more - let me know if any of it > needs more clarification though. And let me know if you have follow-up > questions. > > Thank you, > > Kerry Huller > Newspaper & Current Periodical Reading Room > Serial & Government Publications Division > Library of Congress > > ------------------------------------------------------------------------ > > Newspapers and Current Periodicals Reference Librarian > > Jul 05 2022, 04:42pm via Email > > Hi, Kerry: > > > What would you suggest I do to get a count of the numbers of > newspapers and publishers operating by year from, say, 1790 to 2021? > > > I just determined that 20630 (13 percent) of the 157520 records in > the US Newspaper database I downloaded a week ago have end_year = 9999. > I don't think it's feasible to assume that all or even most of those > are still publishing. > > > Might there be some other database that might have this kind of > information? > > > I ask, because Robert McChesney (2004) The Problem of the Media > (Monthly Review Pr., esp. pp. 34-35) suggests that in the first half of > the nineteenth century, the US had more newspapers and newspaper > publishers per capita than any other place or time. 
He suggests that > that diversity of newspapers helped encourage literacy and limit > political corruption, both of which helped propel the young US to its > current dominance of the international political economy. I'm hoping to > get some data to evaluate this claim. Sadly, it looks like there is too > much missing and questionable data in this dataset for me to use this > without a fairly substantive data cleaning effort. > ------------------------------------------------------------------------ > > Newspapers and Current Periodicals Reference Librarian > > Jul 05 2022, 09:05am via System > > Hello Spencer, > > Thank you for reaching out about your additional questions. > > I was looking at the records you mention above, and yes, you are correct > - those 9 records with the date inconsistencies and the one record for > the The New Mexican mining news > <https://chroniclingamerica.loc.gov/lccn/sn93061507/> containing "Santa > Fe.\" have typos in them. Thanks for spotting these - it may be possible > to have the cataloger in our division correct those typos. I will look > into this further. > > The U.S. Newspaper Directory doesn't have a connection with Wikimedia or > Wikipedia. The Library of Congress periodically pulls the records for > the Directory from OCLC Worldcat > <https://www.oclc.org/en/worldcat.html>. And those newspaper records in > OCLC Worldcat have been created by catalogers at various institutions > around the U.S. over the span of several years. So, occasionally, you > will find a typo in the records. Corrections can be made by OCLC and > library staff at the various institutions. Every time we complete a new > pull on the OCLC records, any corrected records will then populate our > Directory. > > Regarding your question on the New-York weekly journal - yes, that is > also correct that it has two records. 
There is actually a record for > each format of the newspaper, so this record is for the microfilm format > <https://chroniclingamerica.loc.gov/lccn/2009252748/> and this one is > for the original print format > <https://chroniclingamerica.loc.gov/lccn/sn83030211/>. You can see in > the heading for the microfilm record where it says [microfilm reel] and > the print version shows [volume]. You are likely to see this for other > titles as well because each format has been cataloged with its own LCCN. > You are also likely to see additional records with [online resource] > identified as the format as more and more titles are available as > ePrints or online. > > I hope this helps answer your additional questions a bit more. Please > reach out if you have any other questions. > > Thank you, > > Kerry Huller > Newspaper & Current Periodical Reading Room > Serial & Government Publications Division > Library of Congress > > ------------------------------------------------------------------------ > > Newspapers and Current Periodicals Reference Librarian > > Jul 04 2022, 01:47pm via Email > > Hi, Kelly: > > > At the risk of bombing your inbox with more emails than you want, > what is your relationship with Wikipedia and other Wikimedia Foundation > projects like Wikidata? > > > I ask, because I've logged over 20,000 edits in Wikimedia Foundation > projects since 2010, and I would happily try to answer questions about > Wikidata and other Wikimedia Foundation projects. I have NOT organized > an edit-a-thon, but I've made presentations at conferences with people > who have, and I would happily try to help organize such if you could > find a group of people who want to work to improve this US Newspaper > database. I think it would be good to establish links between this US > Newspaper database and Wikidata, with appropriate procedures so changes > to one could be evaluated for acceptance into the other. 
> > > FYI, John Peter Zenger's famous "New-York weekly journal" (1733-1751) > appears TWICE in your database with lccn = 2009252748 and sn83030211 and > ONCE in Wikidata WITHOUT an lccn, even though many other Wikidata items > have an lccn. See: > > > https://www.wikidata.org/wiki/Q23091960 > > > There's a "WikiProject Newspapers" on Wikipedia and a companion > "WikiProject Periodicals" on Wikidata: > > > https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Newspapers/Wikidata > > > https://www.wikidata.org/wiki/Wikidata:WikiProject_Periodicals > > > I've tried to connect with others on those projects, so far with only > limited success. However, you may know that almost anyone can change > almost anything on Wikipedia and other Wikimedia Foundation projects. > What stays tends to be written from a neutral point of view citing > credible sources. They have problems with vandals, but the problems are > usually easily controlled. This makes Wikipedia and Wikidata very > useful platforms for cleaning up databases like your US Newspaper dataset. > > > Spencer Graves > > > ########## > > > Hello, Kelly: > > > In addition to the invalid JSON, discussed below [NOTE: The "below" > contains a slight addition to the report of the I sent last Friday.], I > found 9 (NINE!) cases where start_year was AFTER end_year. 
These have > lccn = "sn86071531" "sn95069213" "sn90059096" "sn86058451" "sn90060926" > "sn99065409" "sn89065002" "sn98069857" "sn91059179" > > > See: > > > https://chroniclingamerica.loc.gov/lccn/sn86071531/ > https://chroniclingamerica.loc.gov/lccn/sn95069213/ > https://chroniclingamerica.loc.gov/lccn/sn90059096/ > https://chroniclingamerica.loc.gov/lccn/sn86058451/ > https://chroniclingamerica.loc.gov/lccn/sn90060926/ > https://chroniclingamerica.loc.gov/lccn/sn99065409/ > https://chroniclingamerica.loc.gov/lccn/sn89065002/ > https://chroniclingamerica.loc.gov/lccn/sn98069857/ > https://chroniclingamerica.loc.gov/lccn/sn91059179/ > > > These all have obvious coding errors that can be easily fixed. The > data may not be completely accurate after the fix, but at least they are > not obviously wrong ;-) > > > ################## > > I got invalid JSON from: > > > > https://chroniclingamerica.loc.gov/search/titles/results/?rows=500&page=103&format=json > > > After some experimentation, I was able to replicate the problem with > a request for rows=10: > > > > https://chroniclingamerica.loc.gov/search/titles/results/?rows=10&page=5117&format=json > > > Duncan Temple Lang <dtemplelang at ucdavis.edu>, Professor of Statistics > and Associate Dean for Graduate Programs at the University of California > - Davis, confirmed that it was a JSON error using: > > > https://codebeautify.org/jsonvalidator > > > He is part of the core team developing the R free, open-source > programming language. He said, that starting at offsets 161070 and > 161502 in the character string you get from [the R code RCurl::getURL()] > we have: > > > Santa Fe.\" > > > and these are in an entry such as > > > "city": ["Santa Fe.\"] > > > So the final " is escaped and therefore there is no closing " for the > string. The parser continues to consume characters looking for the end > of that string. 
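The failure just described can be reproduced and checked for on a small string. A base-R sketch, assuming the problem always takes the form of an escaped quote sitting where a closing quote should be; the toy text and the pattern are illustrative only, and the pattern is a heuristic rather than a JSON validator:

```r
# Toy stand-in for the text returned by RCurl::getURL(); not real output.
txt <- '{"city": ["Santa Fe.\\"], "state": ["New Mexico"]}'

# A backslash-escaped quote directly followed by ], } or , suggests that a
# closing quote was escaped by mistake.
bad_close <- '\\\\"(\\s*[\\],}])'
has_problem <- grepl(bad_close, txt, perl = TRUE)

# Drop the stray backslash and keep whatever followed the quote.
fixed <- gsub(bad_close, '"\\1', txt, perl = TRUE)

print(has_problem)
cat(fixed, "\n")
```

The heuristic can misfire on valid JSON in which a string legitimately contains an escaped quote followed by a comma, so any text it changes is worth inspecting by eye.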
> > > If one "repairs" the text from getURL() with > > > ftxt= gsub('Santa Fe.\\\\"', 'Santa Fe."', txt) > > > then the rest of my code worked fine. > > > You may wish to do something to implement other checks for valid JSON > and repair this problem. I've scanned all the 157520 records that were > in that database a couple of days ago, and this is the only JSON error > identified by the code I used. > > > NOTE: I was NOT able to replicate this error when downloading records > one at a time. That suggests a problem NOT in the database itself but > in the download algorithm. ??? > > > Thank you for your help. I will almost certainly have other > questions ;-) > ------------------------------------------------------------------------ > > Newspapers and Current Periodicals Reference Librarian > > Jul 03 2022, 10:39pm via Email > > Hello, Kelly: > > > In addition to the invalid JSON, discussed below [NOTE: The "below" > contains a slight addition to the report of the I sent last Friday.], I > found 9 (NINE!) cases where start_year was AFTER end_year. These have > lccn = "sn86071531" "sn95069213" "sn90059096" "sn86058451" "sn90060926" > "sn99065409" "sn89065002" "sn98069857" "sn91059179" > > > See: > > > https://chroniclingamerica.loc.gov/lccn/sn86071531/ > https://chroniclingamerica.loc.gov/lccn/sn95069213/ > https://chroniclingamerica.loc.gov/lccn/sn90059096/ > https://chroniclingamerica.loc.gov/lccn/sn86058451/ > https://chroniclingamerica.loc.gov/lccn/sn90060926/ > https://chroniclingamerica.loc.gov/lccn/sn99065409/ > https://chroniclingamerica.loc.gov/lccn/sn89065002/ > https://chroniclingamerica.loc.gov/lccn/sn98069857/ > https://chroniclingamerica.loc.gov/lccn/sn91059179/ > > > These all have obvious coding errors that can be easily fixed. 
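A check of this kind takes only a few lines of base R. The sketch below assumes the records sit in a data frame with character columns start_year and end_year, as in the json view; the rows and lccn values are made up for illustration:

```r
# Hypothetical rows: only the column names follow the JSON fields named in
# this thread; the lccn values and years are invented.
titles <- data.frame(
  lccn = c("x0001", "x0002", "x0003", "x0004"),
  start_year = c("1890", "1952", "1994", "1901"),
  end_year = c("1809", "19??", "9999", "1901"),
  stringsAsFactors = FALSE
)

# Non-numeric years such as "19??" become NA and are left out of the check.
to_year <- function(x) suppressWarnings(as.integer(x))
start <- to_year(titles$start_year)
end <- to_year(titles$end_year)

inverted <- !is.na(start) & !is.na(end) & start > end
print(titles[inverted, ])
```

An end_year of "9999" passes the check by construction, so titles marked as ongoing are never flagged here.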
I will almost certainly have other > questions ;-) > ------------------------------------------------------------------------ > > Newspapers and Current Periodicals Reference Librarian > > Jul 01 2022, 11:46am via Email > > Hello, Kelly: > > > I got invalid JSON from: > > > > https://chroniclingamerica.loc.gov/search/titles/results/?rows=500&page=103&format=json > > > After some experimentation, I was able to replicate the problem with > a request for rows=10: > > > > https://chroniclingamerica.loc.gov/search/titles/results/?rows=10&page=5117&format=json > > > Duncan Temple Lang <dtemplelang at ucdavis.edu>, Professor of Statistics > and Associate Dean for Graduate Programs at the University of California > - Davis, confirmed that it was a JSON error using: > > > https://codebeautify.org/jsonvalidator > > > He is part of the core team developing the R free, open-source > programming language. He said, that starting at offsets 161070 and > 161502 in the character string you get from [the R code RCurl::getURL()] > we have: > > > Santa Fe.\" > > > and these are in an entry such as > > > "city": ["Santa Fe.\"] > > > So the final " is escaped and therefore there is no closing " for the > string. The parser continues to consume characters looking for the end > of that string. > > > If one "repairs" the text from getURL() with > > > ftxt= gsub('Santa Fe.\\\\"', 'Santa Fe."', txt) > > > then the rest of my code worked fine. > > > You may wish to do something to implement other checks for valid JSON > and repair this problem. I've scanned all the 157520 records that were > in that database a couple of days ago, and this is the only JSON error > identified by the code I used. > > > Thank you for your help. 
I will almost certainly have other
> questions ;-)
> ------------------------------------------------------------------------
>
> Newspapers and Current Periodicals Reference Librarian
>
> Jun 28 2022, 02:20pm via System
>
> Hello Spencer,
>
> Thank you for sending along your follow-up questions.
>
> I'm glad to hear the json view will work for you. It was recommended to
> me that you limit your requests to 500 rows at a time. And a developer
> here at LC suggests the following regarding rate limiting:
>
> "To avoid being blocked by the server, the current rate-limiting rules
> restrict un-cached requests to URLs starting with
> <https://chroniclingamerica.loc.gov/search/> to 120 requests every 10
> minutes from a single IP address."
>
> So, I think if you limited each of your requests to 500 rows at a time
> with the proper pauses, then you should be able to access what you need.
>
> As for the csv view, I checked on this as well, and was informed that
> the csv view was not implemented for all url formats. The csv view was
> only implemented for this view:
> <https://chroniclingamerica.loc.gov/newspapers/> and for urls resulting
> from US Directory search results, e.g., if you wanted to narrow down
> your search results by state, city, date range, etc., found at this link:
> <https://chroniclingamerica.loc.gov/search/titles/>.
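The two limits quoted above (500 rows per request, 120 un-cached requests per 10 minutes) translate into a simple paging plan. A sketch, assuming the record count of 157520 reported elsewhere in this thread and assuming page numbers start at 1; fetch_pages() is a hypothetical helper, and RCurl::getURL is only its default because it is the function already used in this thread:

```r
n_records <- 157520      # record count reported elsewhere in this thread
rows_per_request <- 500  # batch size recommended in the message above
max_requests <- 120      # allowed per window, per the quoted rule
window_seconds <- 10 * 60

n_requests <- ceiling(n_records / rows_per_request)
min_pause <- window_seconds / max_requests  # seconds between requests

base_url <- "https://chroniclingamerica.loc.gov/search/titles/results/"
urls <- sprintf("%s?rows=%d&page=%d&format=json",
                base_url, rows_per_request, seq_len(n_requests))

# Hypothetical helper: fetch each URL in turn, pausing between requests.
# 'fetch' is any function that takes a URL and returns its content.
fetch_pages <- function(urls, pause, fetch = RCurl::getURL) {
  out <- vector("list", length(urls))
  for (i in seq_along(urls)) {
    out[[i]] <- fetch(urls[i])
    if (i < length(urls)) Sys.sleep(pause)
  }
  out
}

cat(n_requests, "requests, at least", min_pause, "seconds apart\n")
# pages <- fetch_pages(urls, pause = min_pause)  # not run: hits the server
```

Pausing after every request, cached or not, is more conservative than the rule requires; at this pace the full download would take roughly 26 minutes plus transfer time.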
So, if you wanted a > csv and limited your search by state ( for example: > > https://chroniclingamerica.loc.gov/search/titles/results/?state=Alaska&county=&city=&year1=1690&year2=2022&terms=&frequency=&language=ðnicity=&labor=&material_type=&lccn=&rows=20&format=csv > < > https://chroniclingamerica.loc.gov/search/titles/results/?state=Alaska&county=&city=&year1=1690&year2=2022&terms=&frequency=&language=ðnicity=&labor=&material_type=&lccn=&rows=20&format=csv> > > ), you could append &format=csv to the search result url and get the csv > to automatically download. But, if your search results ended up being > over a couple thousand titles, then the system would probably time out. > > I hope this info helps! Let me know if you have any other questions. > > Best wishes, > > Kerry Huller > Newspaper & Current Periodical Reading Room > Serial & Government Publications Division > Library of Congress > > ------------------------------------------------------------------------ > > Newspapers and Current Periodicals Reference Librarian > > Jun 27 2022, 04:15pm via Email > > Hello, Kerry: > > > Thanks for the reply. Can you please give me some further guidance > on two thing "so that the system is not overwhelmed"? > > > 1. The max size in a small batch? > > > 2. Any limit on the number of small batches in a second or minute? > > > I've found that I can download small batches under program control > using "RCurl::getURL" in R (programming language) using, e.g.; > > > > https://chroniclingamerica.loc.gov/search/titles/results/?rows=20&page=2&format=json > > > With this, I can control the batch size with "row=20" vs. "row=50" > vs., e.g., "row=1000". A naive search says there are 157520 "results". > With "row=1000", this would require 158 calls. With "row=20", it > would require 7876 calls. Before I start, I need to decide which fields > I want; I don't need them all. > > > Thanks, > Spencer Graves > > > p.s. 
I tried appending "&format=csv" and got "Error 504 Ray ID:
> 7220896da85e86e7 • 2022-06-27 19:19:53 UTC Gateway time-out". I used:
>
> https://chroniclingamerica.loc.gov/search/titles/results/?state=&county=&city=&year1=1690&year2=2022&terms=&frequency=&language=&ethnicity=&labor=&material_type=&lccn=&rows=20&format=csv
>
> I can get what I want using json so do not need csv. However, I
> thought you might want to know that I was unable to get csv to work.
> ------------------------------------------------------------------------
>
> Newspapers and Current Periodicals Reference Librarian
>
> Jun 27 2022, 10:54am via System
>
> Hello Spencer,
>
> Thank you for contacting the Library of Congress about searching the US
> Newspaper Directory. I wanted to follow up with you regarding your
> request to output the data in a machine readable format.
>
> It looks like you were provided the link to the API documentation for
> the website: About the Site and API
> <https://chroniclingamerica.loc.gov/about/api/>. Scroll down to the
> section with the heading, Searching the directory and newspaper pages
> using OpenSearch. This section describes the search functionality and
> structure for the US Newspaper Directory in more detail. It is possible
> to return your directory searches in json format by appending
> &format=json to the end of the url. It is also possible to return search
> results in csv format by appending &format=csv to the end of the url,
> but I would strongly suggest that you do this in small batches by
> putting limits on your search so that the system is not overwhelmed.
>
> So, from the search page for the US Newspaper Directory
> <https://chroniclingamerica.loc.gov/search/titles/> you could
> potentially limit your search based on state and city, or date range,
> and/or even frequency. Then once you've completed the search, you can
> add &format=csv to the end of the url to automatically download a csv of
> those records.
The resulting csv will contain several fields/headers:
> lccn, title, place of publication, start year, end year, publisher,
> edition, frequency, subject, state, city, country, language, oclc
> number, and holding type. I think these fields include the information
> you were looking for. But, again, I would like to stress that you put
> limits on your search before creating the csv so as not to overwhelm the
> system.
>
> Please let me know if you have any additional questions.
>
> Best wishes,
>
> Kerry Huller
> Newspaper & Current Periodical Reading Room
> Serial & Government Publications Division
> Library of Congress
>
> ------------------------------------------------------------------------
>
> Newspapers and Current Periodicals Reference Librarian
>
> Jun 23 2022, 01:55pm via System
>
> Mr. Graves,
>
> I'm going to transfer your request to a member of our digital collections
> team who may be of more assistance to you than me.
>
> Mike
>
> ------------------------------------------------------------------------
>
> Newspapers and Current Periodicals Reference Librarian
>
> Jun 23 2022, 01:51pm via Email
>
> Dear Mr. Queen:
>
> Thanks for the reply. I'm still confused. I downloaded and
> installed Docker Desktop and "docker-compose.yml" and ran their "Getting
> Started" Tutorial, but I don't see what to do next.
>
> I repeat: I'd like to analyze "U.S. Newspaper Directory,
> 1690-Present" (https://chroniclingamerica.loc.gov/search/titles/), which
> ------------------------------------------------------------------------
>
> Newspapers and Current Periodicals Reference Librarian
>
> Jun 22 2022, 07:15pm via System
>
> Mr. Graves,
>
> Programmatic access to the data for Chronicling America
> <https://chroniclingamerica.loc.gov/> and possibly the U.S. Newspaper
> Directory <https://chroniclingamerica.loc.gov/search/titles/> can be
> found on the About the Site and API
> <https://chroniclingamerica.loc.gov/about/api/> page in various formats.
> Also, please note that Chronicling America contains newspapers published
> from 1777-1963, but does not include every U.S. newspaper published in
> that time period.
>
> Please let me know if I can be of further assistance.
>
> ------------------------------------------------------------------------
>
> Newspapers and Current Periodicals Reference Librarian
>
> Jun 22 2022, 06:14pm via Email
>
> Dear Mr. Queen:
>
> Can we simplify this to just giving me the data behind "U.S.
> Newspaper Directory, 1690-Present"
> (https://chroniclingamerica.loc.gov/search/titles/) in a machine
> readable format, e.g., csv or xlsx or a MySQL database?
>
> As I mentioned in my original email, a naive search of that without
> restrictions returned 157520 titles in 7876 pages with up to 20 titles
> per page giving date ranges in at least some cases. I could probably
> write software to scrape those 7876 pages from your web site and combine
> them into a data file.
>
> I have a PhD in statistics, and I have been using the R programming
> language and similar software for decades. This includes publishing
> tutorials on how to analyze data like this on Wikiversity.[1] I'd like
> to do something similar with this. I could help make your data more
> useful to others and discuss with you how we might prioritize
> improvements like accessing the other sources you mentioned.
>
> Thanks very much for your reply.
>
> Sincerely,
> Spencer Graves, PhD
> Founder, EffectiveDefense.org
> 4550 Warwick Blvd 508
> Kansas City, MO 64111
> m: 408-655-4567
>
> [1] e.g.:
>
> https://en.wikiversity.org/wiki/US_Gross_Domestic_Product_(GDP)_per_capita
> ------------------------------------------------------------------------
>
> Newspapers and Current Periodicals Reference Librarian
>
> Jun 22 2022, 05:27pm via System
>
> Mr. Graves
>
> Your request is a little more complex than it first appears and requires
> extensive research.
A variety of resources should be consulted to
> determine the circulation statistics of newspapers published prior to
> 1851. You will need to check newspaper union lists and newspaper
> histories. Union lists present lists of newspapers in geographic
> arrangement according to place of publication, and specify which
> libraries or other institutions hold collections of those newspapers and
> the dates of their holdings. These can also be useful for tracking title
> changes throughout a newspaper's history. Newspaper
> histories like American Journalism: A History: 1690-1960
> <https://lccn.loc.gov/62007157> (Mott), The Penny Press
> <https://lccn.loc.gov/2004043078> (Thompson), and The Press and America
> <https://lccn.loc.gov/99044295> (Emery et al.) may not include
> circulation statistics, but they do document the diversity and progress
> of newspaper publishing, including notable newspapers of the era.
> Newspaper histories also cover the history of the printers and printing
> of newspapers in a state, county, or region more generally, and provide
> more condensed histories of the editors, journalists, and evolution of
> the newspapers in a specific area. Newspaper histories and union lists
> should be available at most large public or university libraries. More
> information about union lists, newspaper histories, and researching
> newspapers in general can be found in the U.S. Newspaper Collections at
> the Library of Congress
> <https://guides.loc.gov/united-states-newspapers/introduction> research
> guide (see Reference Sources).
>
> Please let me know if I can be of further assistance.
>
> ------------------------------------------------------------------------
>
> Original Question
>
> Jun 20 2022, 02:34pm via System
>
> How can I get counts of the numbers of newspapers by year in the US, and
> preferably also elsewhere?
>
> A search of "U.S. Newspaper Directory, 1690-Present"
> (https://chroniclingamerica.loc.gov/search/titles/) returned 157520
> titles in 7876 pages with up to 20 titles per page giving date ranges to
> the extent that it's known. If I can get a data file (e.g., csv or xls),
> I can summarize. I could also use data on circulation and frequency and
> especially parent company for multiple newspapers published by the same
> company, to the extent that such is available.
>
> I'm interested in this, because McChesney quoted Tocqueville in
> suggesting that the US had more newspapers per person (or per million
> population) prior to 1851 than at any other time or place in history.
> I'd like to evaluate that claim with data to the extent that I can. See
> "https://en.wikiversity.org/wiki/Social_construction_of_crime_and_what_we_can_do_about_it#Newspapers_1790_-_present".
>
> Thanks, Spencer Graves, PhD
> m: 408-655-4567
>
> ------------------------------------------------------------------------
>
> Thank you for using Newspapers & Current Periodicals Ask a Librarian
> Service!
>
> This email is sent from Ask a Librarian in relationship to ticket #9625195.
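
For reference, the batch-size and rate-limit guidance quoted in this thread (500 rows per request, at most 120 un-cached requests every 10 minutes to URLs under /search/) can be turned into a download plan along the following lines. This is only a rough sketch under those stated limits: the helper names n_pages, directory_url and min_pause_sec are made up for illustration and are not part of any package or of the LC API, and the download loop is left commented out so that nothing hits the server.

```r
# Sketch only: n_pages(), directory_url() and min_pause_sec() are
# hypothetical helpers, not functions from any package or from the LC API.

# Number of requests needed to page through `total` results.
n_pages <- function(total, rows) ceiling(total / rows)

# URL for one page of directory search results in json format,
# following the rows/page/format pattern quoted in the thread.
directory_url <- function(page, rows = 500) {
  paste0("https://chroniclingamerica.loc.gov/search/titles/results/",
         "?rows=", rows, "&page=", page, "&format=json")
}

# Smallest pause (in seconds) between requests that stays within
# `max_req` requests per `window_sec` seconds.
min_pause_sec <- function(max_req = 120, window_sec = 600) {
  window_sec / max_req
}

total <- 157520                      # count reported by the naive search
pages <- n_pages(total, rows = 500)  # requests needed at 500 rows each
urls  <- directory_url(seq_len(pages))
pause <- min_pause_sec()             # seconds to wait between requests

# Not run, to avoid hitting the server:
# results <- vector("list", pages)
# for (i in seq_len(pages)) {
#   results[[i]] <- RCurl::getURL(urls[i])
#   Sys.sleep(pause)
# }
```

With 500 rows per request this comes to 316 requests; at one request every 5 seconds, that is roughly 26 minutes of waiting in total. A longer pause would leave more margin if other requests go out from the same IP address.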