Hello, All:
What would you suggest I do to parse the following XML file into a
list that I can understand:
XMLfile <-
"https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/ndnp_Alabama_all-yrs_e_0001_0050.xml"
This is the first of 6666 XML files containing the "U.S. Newspaper
Directory" maintained by the US Library of Congress, discussed in the
thread below. I've tried various things using the XML and xml2 packages:
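# Parse the file with xml2 (read_xml() accepts a URL).
XMLdata <- xml2::read_xml(XMLfile)
str(XMLdata)
# XML::xmlParse() expects a file name or XML text, not an xml2 document,
# so serialize the xml2 object to text first.
XMLdat <- XML::xmlParse(as.character(XMLdata), asText = TRUE)
str(XMLdat)
# xml_text() drops all tags and concatenates the text of every element.
XMLtxt <- xml2::xml_text(XMLdata)
nchar(XMLtxt)
#[1] 29415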
Somewhere there's a schema for this. I don't know if it's embedded in
this XML file or in a separate file. If it's in a separate file, how
could I describe it to my contacts with the Library of Congress so they
would understand what I need and could help me get it?
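One way to look for the schema, sketched under the assumption that the
file follows the response layout quoted later in this thread: the
namespace URIs declared in the document and the value of <recordSchema>
name the schemas in use. The inline sample is a trimmed stand-in for the
real file, and the names sample_xml, doc, ns and schema_id are only
illustrative.

```r
library(xml2)

# Trimmed stand-in for the real file (assumption: same layout as the
# excerpt quoted in this thread).
sample_xml <- '<?xml version="1.0" encoding="UTF-8"?>
<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
  <version>1.1</version>
  <numberOfRecords>2250</numberOfRecords>
  <records>
    <record>
      <recordSchema>info:srw/schema/1/marcxml</recordSchema>
      <recordPacking>xml</recordPacking>
      <recordData>
        <record xmlns="http://www.loc.gov/MARC21/slim">
          <leader>00000nas a22000007i 4500</leader>
        </record>
      </recordData>
    </record>
  </records>
</searchRetrieveResponse>'

doc <- read_xml(sample_xml)

# All namespace URIs used in the document; xml2 labels unprefixed
# (default) namespaces d1, d2, ...
ns <- xml_ns(doc)
print(ns)

# The schema identifier the first record claims to follow. local-name()
# sidesteps the need for a namespace prefix in the XPath.
schema_id <- xml_text(
  xml_find_first(doc, "//*[local-name()='recordSchema']")
)
print(schema_id)
```

The URIs and the schema identifier printed here are probably the most
precise way to describe "the schema" when asking the Library of Congress
for documentation.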
Thanks,
Spencer Graves
p.s. All 29415 characters in XMLtxt appear in the thread below.
-------- Forwarded Message --------
Subject: [Newspapers and Current Periodicals] How can I get counts of
the numbers of newspapers by year in the US, and preferably also
elsewhere? A search of "U.S. Newspaper Directory,
Date: Wed, 27 Jul 2022 14:59:03 +0000
From: Kerry Huller <serials at ask.loc.gov>
To: Spencer Graves <spencer.graves at effectivedefense.org>
CC: twes at loc.gov
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jul 27 2022, 10:59am via System
Hello Spencer,
So, when I view the xml, I'm actually looking at it in XML editor
software, so I can view the tags and it's structured neatly. I've copied
and pasted the text from the beginning of the file and the first
newspaper title below from my XML editor:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type='text/xsl' href='/webservices/catalog/xsl/searchRetrieveResponse.xsl'?>
<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/"
    xmlns:oclcterms="http://purl.org/oclc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:diag="http://www.loc.gov/zing/srw/diagnostic/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <version>1.1</version>
  <numberOfRecords>2250</numberOfRecords>
  <records>
    <record>
      <recordSchema>info:srw/schema/1/marcxml</recordSchema>
      <recordPacking>xml</recordPacking>
      <recordData>
        <record xmlns="http://www.loc.gov/MARC21/slim">
          <leader>00000nas a22000007i 4500</leader>
          <controlfield tag="001">1030438981</controlfield>
          <controlfield tag="008">180404c20159999aluwr n       0   a0eng  </controlfield>
          <datafield ind1=" " ind2=" " tag="010">
            <subfield code="a">  2018200464</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="040">
            <subfield code="a">DLC</subfield>
            <subfield code="e">rda</subfield>
            <subfield code="c">DLC</subfield>
            <subfield code="b">eng</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="012">
            <subfield code="m">1</subfield>
          </datafield>
          <datafield ind1="0" ind2=" " tag="022">
            <subfield code="a">2577-5316</subfield>
            <subfield code="2">1</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="032">
            <subfield code="a">021110</subfield>
            <subfield code="b">USPS</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="037">
            <subfield code="b">711 Alabama Avenue, Selma, AL 36701</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="042">
            <subfield code="a">nsdp</subfield>
            <subfield code="a">pcc</subfield>
          </datafield>
          <datafield ind1="1" ind2="0" tag="050">
            <subfield code="a">ISSN RECORD</subfield>
          </datafield>
          <datafield ind1="1" ind2="0" tag="082">
            <subfield code="a">071</subfield>
            <subfield code="2">15</subfield>
          </datafield>
          <datafield ind1=" " ind2="0" tag="222">
            <subfield code="a">Selma sun</subfield>
          </datafield>
          <datafield ind1="0" ind2="0" tag="245">
            <subfield code="a">Selma sun.</subfield>
          </datafield>
          <datafield ind1=" " ind2="1" tag="264">
            <subfield code="a">Selma, AL :</subfield>
            <subfield code="b">North Shore Press, LLC</subfield>
            <subfield code="c">2016-</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="310">
            <subfield code="a">Weekly</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="336">
            <subfield code="a">text</subfield>
            <subfield code="b">txt</subfield>
            <subfield code="2">rdacontent</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="337">
            <subfield code="a">unmediated</subfield>
            <subfield code="b">n</subfield>
            <subfield code="2">rdamedia</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="338">
            <subfield code="a">volume</subfield>
            <subfield code="b">nc</subfield>
            <subfield code="2">rdacarrier</subfield>
          </datafield>
          <datafield ind1="1" ind2=" " tag="362">
            <subfield code="a">Began in 2015.</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="588">
            <subfield code="a">Description based on: Volume 2, Issue 40 (October 5, 2017) (surrogate); title from caption.</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="588">
            <subfield code="a">Latest issue consulted: Volume 2, Issue 40 (October 5, 2017).</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="752">
            <subfield code="a">United States</subfield>
            <subfield code="b">Alabama</subfield>
            <subfield code="c">Dallas</subfield>
            <subfield code="d">Selma.</subfield>
          </datafield>
        </record>
      </recordData>
    </record>
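Since the original question was about counting newspapers by year, the
008 control field in the record above may be the most useful piece. A
minimal sketch in R, assuming the field is the standard 40-character
MARC 008 with its blanks restored, where positions 07-10 and 11-14
(counting from 0) hold the first and last year of publication. The
variable names are only illustrative.

```r
# Fixed-width MARC 008 field from the record above (40 characters,
# assuming the blanks are restored as plain spaces).
f008 <- "180404c20159999aluwr n       0   a0eng  "

# R strings are 1-indexed, so MARC positions 06, 07-10 and 11-14
# become characters 7, 8-11 and 12-15.
status <- substr(f008, 7, 7)    # "c" = currently published
date1  <- substr(f008, 8, 11)   # first year of publication
date2  <- substr(f008, 12, 15)  # last year; "9999" = still published

# Partly unknown years such as "20uu" become NA rather than an error.
years <- suppressWarnings(as.integer(c(date1, date2)))
print(data.frame(status = status, start = years[1], end = years[2]))
```

Treating "9999" as "still published" and unknown digits as NA is a
choice that would need checking against the other records before
counting anything.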
When I view the records in the XML editor, these 2 lines below do begin
each of the records for each individual title, but of course this is
including the xml tags:
<recordSchema>info:srw/schema/1/marcxml</recordSchema>
<recordPacking>xml</recordPacking>
Hopefully this helps you decide where to break or parse each record.
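Rather than breaking the text on those two lines, the records can
probably be selected directly with XPath. The sketch below uses the xml2
package on a small inline sample that is assumed to mimic the file's
layout; the prefixes srw and marc, the helper get_subfield() and the
choice of fields 245 and 362 are illustrative, not part of the file.

```r
library(xml2)

# Small stand-in for the real file, with two records.
sample_xml <- '<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
  <records>
    <record>
      <recordSchema>info:srw/schema/1/marcxml</recordSchema>
      <recordPacking>xml</recordPacking>
      <recordData>
        <record xmlns="http://www.loc.gov/MARC21/slim">
          <datafield ind1="0" ind2="0" tag="245">
            <subfield code="a">Selma sun.</subfield>
          </datafield>
          <datafield ind1="1" ind2=" " tag="362">
            <subfield code="a">Began in 2015.</subfield>
          </datafield>
        </record>
      </recordData>
    </record>
    <record>
      <recordSchema>info:srw/schema/1/marcxml</recordSchema>
      <recordPacking>xml</recordPacking>
      <recordData>
        <record xmlns="http://www.loc.gov/MARC21/slim">
          <datafield ind1="0" ind2="0" tag="245">
            <subfield code="a">St. Clair County news.</subfield>
          </datafield>
        </record>
      </recordData>
    </record>
  </records>
</searchRetrieveResponse>'

doc <- read_xml(sample_xml)

# Both default namespaces need a prefix before XPath can address them.
ns <- c(srw  = "http://www.loc.gov/zing/srw/",
        marc = "http://www.loc.gov/MARC21/slim")

# One node per newspaper title.
records <- xml_find_all(doc, "//srw:recordData/marc:record", ns)

# First matching subfield per record; NA where a record lacks the field.
get_subfield <- function(recs, tag, code) {
  xpath <- sprintf("marc:datafield[@tag='%s']/marc:subfield[@code='%s']",
                   tag, code)
  xml_text(xml_find_first(recs, xpath, ns))
}

titles <- data.frame(
  title = get_subfield(records, "245", "a"),
  began = get_subfield(records, "362", "a")
)
print(titles)
```

Working on the parsed document keeps the tag and code attributes, which
are lost once the file is flattened to text.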
On another note, I just noticed as well that at the top of this first
file it lists the total number of records for the Alabama grouping -
2250. This also appeared to be the case for the Alaska records when I
took a look at the first one for that state. I imagine that should be
consistent throughout each "grouping" of records.
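As a rough check, the record total can be turned into an expected number
of files per grouping. This is only a sketch in R: it assumes 50 records
per file, and the file-name pattern is a guess extrapolated from the
first Alabama file name, so the generated names are hypothetical until
checked against the real directory listing.

```r
total_records <- 2250L  # <numberOfRecords> reported in the Alabama file
chunk_size    <- 50L    # records per file, per "..._0001_0050.xml"

n_files <- ceiling(total_records / chunk_size)
starts  <- seq(1L, total_records, by = chunk_size)
ends    <- pmin(starts + chunk_size - 1L, total_records)

# Hypothetical naming pattern, extrapolated from the first file only.
file_names <- sprintf("ndnp_Alabama_all-yrs_e_%04d_%04d.xml", starts, ends)

print(n_files)
print(head(file_names, 3))
```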
Let me know if you have follow-up questions!
Best wishes,
Kerry Huller
Newspaper & Current Periodical Reading Room
Serial & Government Publications Division
Library of Congress
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jul 27 2022, 10:21am via Email
Hi, Kerry:
Thanks. I understand the chunking in files of at most 50. I've read
the first file "ndnp_Alabama_all-yrs_e_0001_0050.xml" into a string of
29415 characters, copied below. Might you have any suggestions on the
next step in parsing this? Staring at it now, it looks like splitting on
"info:srw/schema/1/marcxmlxml" might convert the 29415 characters into
shorter chunks, each of which could then be parsed further.
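The splitting idea can be sketched in base R as follows. The string flat
is a shortened, hand-made stand-in for the 29415-character text, and the
"..." inside it marks where material was left out for illustration.

```r
# Shortened stand-in for the flattened text ("..." marks omitted text).
flat <- paste0(
  "1.12250",
  "info:srw/schema/1/marcxmlxml",
  "00000nas a22000007i 45001030438981...Selma sunSelma sun.",
  "info:srw/schema/1/marcxmlxml",
  "00000cas a22000007a 4500502150053...St. Clair County news."
)

marker <- "info:srw/schema/1/marcxmlxml"

# fixed = TRUE treats the marker as a literal string, not a regex.
chunks  <- strsplit(flat, marker, fixed = TRUE)[[1]]
header  <- chunks[1]   # version and numberOfRecords run together
records <- chunks[-1]  # one chunk per record

print(length(records))
print(substr(records, 1, 24))  # each chunk starts with the 24-char leader
```

This does separate the records, but inside each chunk the field
boundaries are gone (for example "Selma sunSelma sun."), so parsing the
tagged XML is likely to be more reliable than parsing this text.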
This is not as bad as reading ancient Egyptian hieroglyphics without
the Rosetta Stone, but I wondered if you might have something that could
make this work easier and more reliable? I guess I could compare with
what I already read as JSON ;-)
Thanks,
Spencer Graves
"1.12250info:srw/schema/1/marcxmlxml00000nas a22000007i
45001030438981180404c20159999aluwr n 0 a0eng
2018200464DLCrdaDLCeng12577-53161021110USPS711 Alabama Avenue, Selma, AL
36701nsdppccISSN RECORD07115Selma sunSelma sun.Selma, AL :North Shore
Press,
LLC2016-WeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan in
2015.Description based on: Volume 2, Issue 40 (October 5, 2017)
(surrogate); title from caption.Latest issue consulted: Volume 2, Issue
40 (October 5, 2017).United
StatesAlabamaDallasSelma.info:srw/schema/1/marcxmlxml00000cas a22000007a
4500502150053100127c20109999aluwr n 0 a0eng
2010200019DLCengDLCDLCOCLCQ112153-18111750USPSB & C Publishing, LLC,
3514 Martin St. S. Ste 104, Cropwell, AL 35054pccnsdpISSN RECORDSt.
Clair County news (Cropwell, Ala.)St. Clair County news(Cropwell,
Ala.)St. Clair County news.Cropwell, AL :B & C Pub.WeeklyBegan in
2010.Description based on: Nov. 4, 2010 (surrogate); title from
caption.info:srw/schema/1/marcxmlxml00000cas a22000007a
4500426491872090720c20099999alumr n 0 a0eng
2009203372DLCengDLCOCLCQ12150-346X2150-346X1AU at 000044489617NZ116076352Devon
Applewhite/Applewhite Publishing Co., 1910 Honeysuckle Rd., #N183,
Dothan, AL 36305mscnsdpISSN RECORD30514Triangle tribune(Dothan,
Ala.)Triangle tribune.Dothan, AL :Applewhite Pub. CoMonthlyBegan with
vol. 1, issue 1 (May 2009).\"Connecting the Tri-State African -American
Community.\"Description based on: Vol. 1, issue 1 (May 2009); title from
masthead.Applewhite, Devon.United StatesAlabama.United
StatesGeorgia.United StatesFlorida.info:srw/schema/1/marcxmlxml00000cas
a22000007a 4500289017315081219c20089999aluwr n | a0eng c
2008213218NSDengNSDOCLCQDLCOCLCQ111945-93191945-93191005270USPSSpringhill
Publications,
LLC, P.O. Box 186, Greenville, AL 36037nsdppccISSN RECORD07014Greenville
standardThe Greenville standard.Greenville, AL :Springhill
PublicationsWeeklytexttxtrdacontentunmediatednrdamediaBegan with vol. 1,
issue 1 (Sept. 3, 2008)Description based on surrogate of: Vol. 1, no. 15
(Dec. 18, 2008); title from masthead (publisher's Web site, viewed Dec.
19, 2008).Latest issue consulted: Vol. 1, no. 99 (July 27, 2011)
(surrogate).info:srw/schema/1/marcxmlxml00000cas a22000007a
4500123539969070426c20079999aluwr ne 0 a0eng c
2007212138NSDengNSDNSDOCLCQ101936-95571936-95571The Western Tribune,
1530 Third Ave. N., Bessemer, AL 35020mscnsdpISSN RECORDWestern tribune
(Bessemer, Ala.)The Western tribune(Bessemer, Ala.)The Western
tribune.Bessemer, Ala. :D-Med, Inc.v.WeeklyBegan in 2007.Description
based on: May 23, 2007 (surrogate); title from
caption.AU at 000041575341info:srw/schema/1/marcxmlxml00000cas a22000007a
4500226300653080425c20079999aluwr ne | a0eng
2008212112NSDengNSDNSDOCLCQ11942-20751942-20751nsdppccISSN RECORDThe
corridor messengerThe corridor messenger.Carbon Hill, AL :Corridor
Messenger, Inc.WeeklyBegan with vol. 1, issue (10.03.2007).Description
based on: 1st issue.United StatesAlabamaWalkerCarbon
Hill.http://www.corridormessenger.cominfo:srw/schema/1/marcxmlxml00000cas
a22000007a
450077560432070109c20069999aluwr ne 0 a0eng c
2007213400NSDengNSDOCLCQAUBRNOCLCOOCLCFa01935-37901935-37901AU at
000041190283The
Auburn Villager, P.O. Box 1633, Auburn, AL 36831-1633pccnsdpISSN
RECORDThe Auburn villagerThe Auburn villager.Auburn, AL :Auburn
Villagerv.WeeklyBegan in 2006.Description based on: Vol. 1, no. 4 (July
20, 2006) (surrogate); title from caption.Auburn (Ala.)Newspapers.Lee
County (Ala.)Newspapers.AlabamaAuburn.fast(OCoLC)fst01209634AlabamaLee
County.fast(OCoLC)fst01211930Newspapers.fast(OCoLC)fst01423814United
StatesAlabamaLeeAuburn.info:srw/schema/1/marcxmlxml00000cas a2200000Ii
4500872286785m o d s cr mn|---a||||140311c20069999alucr n o b
s0 a0eng cABCengrdaABCABCOCLCFLD59.13University of Alabama at
Birmingham.The eReporter.[Birmingham, Alabama] :The University of
Alabama at Birmingham,[2006]-[Birmingham, Alabama] :Offices of Public
Relations & Marketing and Information Technology1 online resource2
issues weeklytexttxtrdacontentcomputercrdamediaonline
resourcecrrdacarrierSeptember 19, 2006-\"The eReporter is an official
communication of The University of Alabama at Birmingham, companion to
the UAB Reporter and recommended alternative to mass e-mails.\"Issues
for <March 11, 2014- published and distributed via e-mail subscription
on Tuesdays and Fridays.Description based on: September 19, 2006; title
from title screen (viewed March 12, 2014).University of Alabama at
BirminghamPeriodicals.Periodicals.fast(OCoLC)fst01411641University of
Alabama at Birmingham.fast(OCoLC)fst00645114University of Alabama at
Birmingham.Office of Public Relations and Marketing.University of
Alabama at Birmingham.Information Technology.2006-2012, companion
to:University of Alabama at Birmingham.UAB
reporter.(OCoLC)32435748Archived
issueshttp://hatteras.dpo.uab.edu/cgi-bin/ereporter.cgiinfo:srw/schema/1/marcxmlxml00000cas
a22000007a 4500166387050070829c20059999aluwr ne | a0eng c
2007215501NSDengNSDOCLCQ11939-68991939-68991The Wilkie Clark Memorial
Foundation, P.O. Box 514, Roanoke, AL 36274$30.00nsdpmscISSN
RECORD305.89614People's voice (Roanoke, Ala.)The people's voice(Roanoke,
Ala.)The people's voice.Roanoke, AL :Wilkie Clark Memorial
Foundationv.WeeklyBegan with vol. 1, no. 1 in 2005.Description based on:
Vol. 2, no. 20 (Apr. 20, 2007); title from caption.Wilkie Clark Memorial
Foundation.United
StatesAlabamaRandolphRoanoke.AU at
000042141390info:srw/schema/1/marcxmlxml00000nas
a22000007i 45001124677787191021c20uu9999aluwr ne | a0eng
2019202521DLCengrdaDLC12689-3258122730USPSNorth Jackson Press, 42950 Hwy
72, Suite 406, Stevenson, AL 35772nsdppccISSN RECORD071.323North Jackson
pressNorth Jackson press.Stevenson, AL :Caney Creek Publications
LLCWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierDescription
based on surrogate of: Volume 1, number 36 (October 11, 2019); title
from masthead.Latest issue consulted: Volume 1, number 36 (October 11,
2019) (Surrogate).United
StatesAlabamaJacksonStevensoninfo:srw/schema/1/marcxmlxml00000cas
a2200000 a 4500226315099080428d19981998aluwr ne | 0eng c
2008233691GUAengGUAOCLCQOCLCFOCLCO39911644pccn-us-gaThe Dekalb
news.Birmingham, Ala. :Community newspaper holdings Inc.v.WeeklyBegan
with 1st year, no. 1 (Apr. 1, 1998); ceased with 1st year, no. 31 (Oct.
28, 1998).Final issue consulted.Description based on first issue; title
from caption.Decatur (Ga.)Newspapers.DeKalb County
(Ga.)Newspapers.Newspapers.fast(OCoLC)fst01423814GeorgiaDecatur.fast(OCoLC)fst01226234GeorgiaDeKalb
County.fast(OCoLC)fst01215288United
StatesGeorgiaDeKalbDecatur.Decatur-DeKalb news/era(DLC)sn
89053661(OCoLC)19946163info:srw/schema/1/marcxmlxml00000cas a2200000 i
450050263311m o d cr cn|||||||||020730c19979999alu x neo
0 a0eng c
2015238492AMHengrdapnAMHOCLCQOCLCFOCLCOIULOCLHTMOCLCQCOODLC66460694810970435082687-93791AU
at 000050711528OCLCS45109pccnsdpn-us---AP2.B5707023Birmingham
weekly (Online)Birmingham weekly(Online)Birmingham weekly.Birmingham, AL
:Birmingham Weekly1 online resourceIrregular,Feb. 16-28,
2012-Weekly,Sept. 4-11, 1997-Feb. 9-16,
2012texttxtrdacontentcomputercrdamediaonline resourcecrrdacarrierBegan
with vol. 1, issue 1 (Sept. 4-11, 1997).\"City news, views &
entertainment\"--Cover.Numbering dropped in Mar. 2012.Also issued in
print.Description based on: Publication information from ProQuest; title
from web page (viewed June 18, 2015).Latest issue consulted: Aug. 15-20,
2012.Birmingham (Ala.)Newspapers.Internet resources.Electronic
journals.AlabamaBirmingham.fast(OCoLC)fst01204958Newspapers.fast(OCoLC)fst01423814United
StatesAlabamaBirmingham.Print version:Birmingham
Weekly(OCoLC)39271050http://apw.softlineweb.com/http://WC2VB5MT8E.search.serialssolutions.com/?sid=sersol&SS_jc=JC_000051895&title=Birmingham+Weeklyinfo:srw/schema/1/marcxmlxml00000cas
a22000007a 450031471314941116d19941995aluwr ne 0 a0eng csn
94003083
NSDengNSDANEOCLCQOCLCFOCLCOOCLCQ11079-65411079-65411nsdppccn-us-akSoutheast
shopperSoutheast shopper.Juneau, Alaska :Kemper
Communications,1994-volumesWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol.
1, no. 1 (Nov. 16, 1994)-Ceased in Feb. 1995.Juneau
(Alaska)Newspapers.AlaskaJuneau.fast(OCoLC)fst01213587Newspapers.fast(OCoLC)fst01423814United
StatesAlaskaJuneau.AU at 000011356572info:srw/schema/1/marcxmlxml00000cas
a22000008a 450027910515930413c19949999alumr n 0 a0eng dsn
93002581 NSDengNSDOCLCQ11069-06621Birmingham Tribune, 216 Ave. T. Pratt
City, Birmingham, AL 35214nsdpBirmingham tribuneBirmingham
tribune.Birmingham, Ala. :Kervin
Fondren9501volumesMonthlytexttxtrdacontentunmediatednrdamediavolumencrdacarrierPREPUB:
publication expected Jan.
1995AU at 000025863987info:srw/schema/1/marcxmlxml00000cas a22000007a
450026199931920716d19922013alumr ne 0 a0eng csn 92003357
NSDengNSDOCLOCLCQDLC011064-01341064-01341Black & White, POB 13215,
Birmingham, AL 35202-3215nsdppccBlack & white (Birmingham, Ala.)Black &
white(Birmingham, Ala.)Black & white.Black and whiteBirmingham, Ala.
:Black & White, Inc.v.Biweekly,Oct. 2, 1997-Monthly,May 1, 1992-Sept.
1997Began in May 1992; ceased with Jan. 10, 2013.\"Birmingham's New
City
paper.\"Description based on: June 1992.Latest issue consulted: No. 67
(Oct. 16, 1997) (surrogate).info:srw/schema/1/marcxmlxml00000cas
a2200000 a 450032145723950314d19901999alumr ne 0 a0eng csn
95068755
MGNengMGNNSDCLUOCLCQOCLCFOCLCOOCLCA971211082-34841082-34841AU at
000011579542nsdppccn-us-alF335.J5S68The
Southern shofarThe Southern shofar.Birmingham, AL :L. Brook,-[1999]v.
:ill. ;35 cm.MonthlyBegan in 1990.-v. 9, issue 9 (Aug./Sept. 1999).\"The
monthly newspaper of Alabama's Jewish community.\"Some issues also
available on the Internet via the World Wide Web.Description based on:
Vol. 3, issue 11 (Oct. 1993).Jewish newspapersAlabama.Jewish
newspapers.fast(OCoLC)fst00982872Alabama.fast(OCoLC)fst01204694United
StatesAlabamaJeffersonBirmingham.Deep South Jewish voice(DLC)sn
99018499(OCoLC)42431704CLUhttp://bibpurl.oclc.org/web/719http://www.bham.net/shofar/info:srw/schema/1/marcxmlxml00000cas
a22000007a 450021265141900326c19909999aluwr ne 0 a0eng csn
90099004 AARengAARCPNNSDOCLCQ11050-08981050-08981005022USPSE.O.N., Inc.,
Main St., Eclectic, AL 36024pccnsdpISSN RECORDThe Eclectic observerThe
Eclectic observer.Eclectic, Ala. :E.O.N., Inc.,1990-v.WeeklyVol. 1, no.
1 (Feb. 22, 1990)-Published by: Price Publications, Inc., <2006->Latest
issue consulted: Vol. 17, no. 1 (Jan. 5, 2006).United
StatesAlabamaElmoreEclectic.AU at
000040212446info:srw/schema/1/marcxmlxml00000cas
a22000007a 450021214781900314c19909999aluir ne 0 a0eng csn
90002457 AAAengAAANSDOCLCQ111050-20841050-20841931180USPSClanton
Newspapers, 1109 Seventh St., N., PO Box 1379, Clanton, AL
35045nsdppccn-us-alThe Clanton advertiserThe Clanton
advertiser.AdvertiserClanton, Ala. :Clanton Newspapersv. :ill. ;58
cm.Three no. a week,<May 13, 1992->Semiweekly,<Apr. 4, 1990->Began
in
Jan. 1990.Description based on: Vol. 19, no. 27 (Wed., Apr. 4,
1990).Latest issue consulted: Vol. 22, no. 58 (May 13, 1992).United
StatesAlabamaChiltonClanton.Independent advertiser (Clanton,
Ala.)(OCoLC)21214732AU at 000025908452info:srw/schema/1/marcxmlxml00000cas
a2200000 a 450021214814900314c19909999aluwr ne 0 a0eng dsn
90099009 AAAengAAACPNNSDOCLCQ11056-32881056-32881505740USPSThe Blount
Countian, 3rd St. at Washington Ave., PO Box 310, Oneonta, AL
35121mscnsdpn-us-alThe Blount countianThe Blount countian.Oneonta, Ala.
:Southern Democrat, Inc.,1990-v. :ill.WeeklyVol. 1, no. 1 (Jan. 3,
1990)-Editor: Molly Howard Ryan, 1990-Latest issue consulted: Vol. 1,
no. 36 (Sept. 5, 1990).Ryan, Molly Howard.United
StatesAlabamaBlountOneonta.Southern Democrat(DLC)sn
85044741(OCoLC)12038577AU at 000025884049info:srw/schema/1/marcxmlxml00000cas
a22000007a 450022413044900920c19909999aluwr ne 0 a0eng dsn
90099011
AARengAARCPNNSDNSTOCLCQ92081707011191053-91231053-91231314240USPSmscnsdpThe
Clay times-journalThe Clay times-journal.Lineville, Ala. :C.L.
Proctor,1990-v.WeeklyVol. 1, no. 1 (Sept. 6, 1990)-United
StatesAlabamaClayLineville.Ashland progress(DLC)sn 85044701Lineville
tribune(DLC)sn 85044702AUinfo:srw/schema/1/marcxmlxml00000cas a22000007a
450021265218900326c19909999aluwr ne 0 0eng dsn 90099005
AARengAARCPNOCLCQmscTrussville news-journal.Trussville, Ala. :Mike
Mitchell,1990-v.BimonthlyVol. 1, no. 1 (Feb. 20, 1990)-United
StatesAlabamaJeffersonTrussville.info:srw/schema/1/marcxmlxml00000cas
a22000007a 450022301035900831c19909999aluwr ne 0 0eng dsn
90099010 AARengAARCPNOCLCQmscWeaver tribune.Oxford, Ala. :Cheaha
Pub.,1990-v.WeeklyVol. 1, no. 1 (July 19, 1990)-United
StatesAlabamaCalhounWeaver.United
StatesAlabamaCalhounOxford.info:srw/schema/1/marcxmlxml00000cas
a22000007a 450015155895870205c19879999aludr ne 0 a0eng csn
87050045
AAAengAAACPNNSDDLCCPNNSDDLCCPNDLCOCLDLCOCLCQOCLCFOCLCQ19261126829944596670892-44570892-44571AU
at 000020456714360980USPSThe
Advertiser, P.O. Box 1000, Montgomery, AL
36192pccnsdpn-us-alNewspaperMontgomery advertiser (Montgomery, Ala. :
1987)The Montgomery advertiser(1987)The Montgomery advertiser.Montgomery
advertiser & the Alabama journalSunday Montgomery advertiserMontgomery,
Ala. :Advertiser Co.,1987-volumes
:illustrationsDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrier160th
year, no. 1 (Jan. 2, 1987)-On Saturdays, Sundays and holidays a combined
edition is published with the Alabama journal, and called: Montgomery
advertiser and the Alabama journal, Jan. 3, 1987, and: Alabama journal
and Montgomery advertiser, Jan. 4, 1987-Feb. 25, 1990.Issues for Sunday
called: Sunday Montgomery advertiser, Mar. 4, 1990-Issues for Saturday,
Sunday and holidays have their own numbering, Jan. 3, 1987-Feb. 25,
1990.Montgomery
(Ala.)Newspapers.AlabamaMontgomery.fast(OCoLC)fst01202689Newspapers.fast(OCoLC)fst01423814United
StatesAlabamaMontgomeryMontgomery.Advertiser (Montgomery,
Ala.)0745-3221(DLC)sn 82008412(OCoLC)9049482Alabama journal (Montgomery,
Ala. : 1940)0745-323X(DLC)sn
87062018(OCoLC)2666111info:srw/schema/1/marcxmlxml00000cas a2200000 a
450016942287871105c19879999aludn ne 0 a0eng dsn 88050149
AAAengAAACPNNSDOCLCQy1044-00701044-0070746--32780746-32781565580USPSTroy
Publications, Inc., 113 North Market St., Troy, AL 36081mscnsdpMessenger
(Troy, Ala.)The Messenger(Troy, Ala.)The Messenger.Troy, Ala. :Troy
Pub.,1987-v.Daily (Sunday, Tuesday, Thursday and Friday)Vol. 121, no.
166 (July 1, 1987)-Sunday, Apr. 2, 1989 misprinted as v. 113.Latest
issue consulted: Vol. 113 [sic 123], no. 96 (Sunday, Apr. 2,
1989).United StatesAlabamaPikeTroy.Troy messenger0746-3278(DLC)sn
83009935(OCoLC)9921908info:srw/schema/1/marcxmlxml00000cas a22000007a
450017799786880415c19879999aluir ne 0 a0eng dsn 88050086
AARengAARCPNNSDOCLCQ1p1044-03801044-03800745-75961441520USPSThe
Prattville Progress, 152 W. 3rd St., Prattville, AL
36067mscnsdpPrattville progress (Prattville, Ala. : 1987)The Prattville
progress(Prattville, Ala.)The Prattville progress.Prattville, Ala.
:James C. Seymour,1987-v.Three times a weekVol. 102, no. 8 (Jan. 20,
1987)-Latest issue consulted: Vol. 105, no. 153 (Wednesday, Dec. 26,
1990).United StatesAlabamaAutaugaPrattville.Progress (Prattville,
Ala.)0745-7596(DLC)sn
83007623(OCoLC)9428489info:srw/schema/1/marcxmlxml00000cas a22000007a
450015344667870319c19869999aluwr ne 0 a0eng dsn 87000284
NSDengNSDCPNOCLCQy0893-07670893-07671431800USPSPickens County Herald,
P.O. Drawer E, Carrollton, AL 35447nsdpPickens County heraldPickens
County herald.Pickens County herald and west AlabamianCarrollton, Ala.
:Pickens Newspapers, Inc.,1986-WeeklyVol. 138, no. 40 (Oct. 2,
1986)-United StatesAlabamaPickensCarrollton.Pickens County herald and
west Alabamian0746-0473(DLC)sn
83008141AU at 000040635809info:srw/schema/1/marcxmlxml00000cas a22000007a
450018917586881217c19869999aluwr ne 0 0eng dsn 88050225
CPNengCPNOCLCQmscThe Oxford sun/times.Oxford, Ala.
:[s.n.],1986-v.WeeklyVol. 1, no. 1 (Jan. 16, 1986)-Editor: Andy
Goggans.Numbering is irregular.United StatesAlabamaCalhounOxford.Oxford
sun (Oxford, Ala.)(DLC)sn
85045023AU at 000025803813info:srw/schema/1/marcxmlxml00000cas a22000007a
450013991168860731c19869999aluwr ne 0 0eng dsn 86050322
CPNengCPNOCLCQmscIndependent (Brewton, Ala.)The Independent.Brewton,
Ala. :Jim Thornton,1986-v. :ill. ;58 cm.WeeklyVol. 1, no. 1 (June 19,
1986)-United
StatesAlabamaEscambiaBrewton.info:srw/schema/1/marcxmlxml00000cas
a22000007a 450018957493881231c19859999aluwr ne 0 0eng dsn
88050247 CPNengCPNOCLCQmscPiedmont journal-independent (Piedmont,
Ala.)The Piedmont journal-independent.Journal independentPiedmont, Ala.
:Lane Weatherbee,1985-v.WeeklyVol. 4, no. 52 (Dec. 24, 1985)-Sometimes
published as: Journal independent.United
StatesAlabamaCalhounPiedmont.Journal-independent(DLC)sn
85045014info:srw/schema/1/marcxmlxml00000cas a22000007a
450012715821851024d19841985aluwr ne 0 a0eng dsn 85045014
CPNengCPNNSDCPNOCLCQmscThe Journal-independent.Piedmont, Ala.
:Journal-Independent, Inc.,1984-1985.volumes :illustrations ;58
cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 3,
no. 27 (July 3, 1984)- v. 4, no. 51 (Dec. 18, 1985).Carries the same
vol. numbering as the Piedmont journal-independent.United
StatesAlabamaCalhounPiedmont.Piedmont
journal-independent0890-6017(DLC)sn 85045013Piedmont journal-independent
(Piedmont, Ala.)(DLC)sn 88050247info:srw/schema/1/marcxmlxml00000cas
a22000007a 450012691448851018c19839999aludr ne 0 0eng dsn
85045007 CPNengCPNOCLCQmscTimesDaily.Times dailyFlorence, Ala. :T.S.P.
Newspapers, Inc.,1983-volumes :illustrations ;58
cmDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 114,
no. 226 (Aug. 14, 1983)-United StatesAlabamaLauderdaleFlorence.Florence
times + tri-cities daily(DLC)sn
85044995info:srw/schema/1/marcxmlxml00000cas a22000007a
45009428489830420d19831987aluir ne 0 a0eng dsn 83007623
NSDengNSDCPNNSDNSTOCLCQ89090d0745-75960745-75961The Progress, 152 W. 3rd
St., Prattville, AL 36067nsdpmscProgress (Prattville, Ala.)The
Progress(Prattville, Ala.)The Progress.Prattville, Ala. :The Prattville
Progress,1983-1987.volumes :illustrations ;58 cmThree times a
weektexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 98, no.
32 (Mar. 17, 1983)-v. 102, no. 7 (Jan. 17, 1987).United
StatesAlabamaAutaugaPrattville.Prattville progress(DLC)sn
85044740Prattville progress (Prattville, Ala.)1044-0380(DLC)sn
88050086(OCoLC)12254317AAPinfo:srw/schema/1/marcxmlxml00000cas a2200000
a 45009867255830831c19839999aludr ne 0 a0eng dsn 84008052
AAAengAAANSDOCLOCLCQX0743-15110743-15111617760USPST.S.P. Newspapers,
Inc., 219 W. Tennessee St., Florence, AL 35630nsdpTimesDaily (Shoals
edition)TimesDaily(Shoals ed.)TimesDaily.Times dailyShoals ed.Florence,
Ala. :T.S.P. Newspapersvolumes
:illustrationsDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan
with: Vol. 114, no. 226 (Aug. 14,
1983).\"Florence/Sheffield/Tuscumbia/Muscle Shoals.\"Shoals ed. and
Regional ed. combined on Sundays.Description based on: Vol. 114, no. 346
(Monday, Dec. 12, 1983).United
StatesAlabamaLauderdaleFlorence.TimesDaily (Regional
edition)0743-152XTimes Tri-cities dailyUnknownDec. 12,
1983info:srw/schema/1/marcxmlxml00000cas a22000007a
450010536023840319c19839999aludr ne 0 a0eng dsn 84008051
NSDengNSDOCLCQ1x0743-152X0743-152X1617760USPST.S.P. Newspapers, Inc.,
219 W. Tennessee St., Florence, AL 35630nsdpTimesDaily (Regional
edition)TimesDaily(Regional ed.)TimesDaily.Times dailyRegional
ed.Florence, Ala. :T.S.P.
NewspapersDailytexttxtrdacontentunmediatednrdamediaBegan with: Vol. 114,
no. 226 (Aug. 14, 1983).Shoals ed. and Regional ed. combined on
Sundays.Description based on: Vol. 114, no. 346 (Monday, Dec. 12,
1983).United StatesAlabamaLauderdaleFlorence.TimesDaily (Shoals
edition)0743-1511Times Tri-cities dailyDec. 12,
1983AU at 000025818125info:srw/schema/1/marcxmlxml00000cas a22000007a
45009049482821213d19821987aludn ne 0 a0eng csn 82008412
AAAengAAANSDNPWCPNDLCCPNNSDDLCNSDDLCCPNNVFDLCOCLCQCRLOCLCFOCLCQ1d0745-32210745-32211nsdppccn-us-alNewspaperAdvertiser
(Montgomery, Ala.)The Advertiser(Montgomery, Ala.)The advertiser.Alabama
journal and advertiserMontgomery, Ala. :Advertiser Co.,1982-1987.volumes
:illustrationsDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrier155th
year, no. 232 (Nov. 22, 1982)- ; -v. 14-3, Jan. 1, 1987.On Saturdays,
Sundays and holidays published as: The Alabama journal and advertiser,
Nov. 27, 1982-Jan. 1, 1987.Saturday, Sunday and holiday issues have
their own numbering.Montgomery
(Ala.)Newspapers.AlabamaMontgomery.fast(OCoLC)fst01202689Newspapers.fast(OCoLC)fst01423814United
StatesAlabamaMontgomeryMontgomery.Montgomery advertiser (Montgomery,
Ala. : Daily)(DLC)sn 84020645(OCoLC)2685433Montgomery advertiser
(Montgomery, Ala. : 1987)0892-4457(DLC)sn
87050045(OCoLC)15155895AU at 000020281746info:srw/schema/1/marcxmlxml00000cas
a2200000 a 45009237931830218c19829999aluwr ne 0 0eng dsn
86050139 AAAengAAACPNOCLOCLCQmscThe Randolph leader.Roanoke, Ala. :David
S. Stevenson,1982-volumes :illustrations ;58
cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 91,
no. 1 (Oct. 6, 1982)-United StatesAlabamaRandolphRoanoke.Roanoke
leader(DLC)sn 86050137Randolph press(DLC)sn
86050138info:srw/schema/1/marcxmlxml00000cas a22000007a
450012715815851024d19821984aluwr ne 0 a0eng dsn 85045013
CPNengCPNNSDCPNOCLCQ110890-60170890-60171432080USPSThe Piedmont
Journal-Independent, 115 N. Center Ave., Piedmont, AL 36272mscnsdpThe
Piedmont journal-independentThe Piedmont journal-independent.Piedmont,
Ala. :Piedmont Journal-Independent, Inc.,1982-1984.volumes
:illustrations ;58
cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 1,
no. 1 (Mar. 31, 1982)-v. 3, no. 26 (June 27, 1984).Latest issue
consulted: Vol. 5, no. 31 (August 20, 1986).United
StatesAlabamaCalhounPiedmont.Piedmont journal(DLC)sn
85045012Journal-independent(DLC)sn
85045014(OCoLC)12715821AU at 000045312916info:srw/schema/1/marcxmlxml00000cas
a22000007a 45009183905830202c19829999aluwr n 0 a0eng dsn
85044580 AAAengAAACPNNSDOCLOCLCQ11098-58671098-58671016409USPSNo. 4,
Rucker Plaza, Enterprise, AL 36331P.O. Box 1536, Enterprise, AL
36331mscnsdpSoutheast sun (Enterprise, Ala.)The southeast
sun(Enterprise, Ala.)The Southeast sun.Enterprise, Ala. :QST
Publicationsvolumes :illustrations ;58
cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan in
1982.Description based on: Vol. 1, no. 25 (Oct. 21, 1982).Latest issue
consulted: Vol. 16, no. 43 (Mar. 4, 1998).United
StatesAlabamaCoffeeEnterprise.AU at
000025827687info:srw/schema/1/marcxmlxml00000cas
a22000007a 450010487314840305c19819999aluwr ne 0 a0eng dsn
85044906
AAAengAAACPNNSDNSTCPNOCLOCLCQOCLCFOCLCOOCLCAOCLCQ900410885-16620885-16621749310USPSThe
New Times, 1618 1/2 St. Stephens Rd., Mobile, AL 36603mscnsdpn-us-alNew
times (Mobile, Ala.)The New times(Mobile, Ala.)The new times.Mobile,
Ala. :New Times Groupvolumes
:illustrationsWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan
in 1981.Vol. 3, no. 49 (Dec. 15-21, 1983) and vol. 3, no. 50 (Dec.
22-28, 1983) are both called vol. 3, no. 49 (Dec. 15-21,
1983).Description based on: Vol. 2, no. 3 (Jan. 28-Feb. 3, 1982).African
AmericansAlabamaNewspapers.African
Americans.fast(OCoLC)fst00799558Alabama.fast(OCoLC)fst01204694Newspapers.fast(OCoLC)fst01423814United
StatesAlabamaMobileMobile.AAPUnknownAug. 15,
1985AU at 000024686659info:srw/schema/1/marcxmlxml00000cas a22000007a
450018922463881219d19811983alucr ne 0 0eng dsn 88050233
AARengAARCPNNSDOCLCQmscThe Sylacauga daily advance.Advance/Sylacauga
dailySylacauga advanceSunday advanceAdvanceSylacauga, Ala. :Mrs. W.A.
Moody,1981-1893.v.Semiweekly,<Nov. 24, 1982-Feb. 13, 1983>Daily (except
Mon., Tues. & Sat.),<May 26, 1982-Nov. 21, 1982>Daily (except Sat.
&
Mon.),<Jan. 1, 1981-May 23, 1982>74th Year, no. 123 (Jan. 1, 1981)-76th
year, no. 83 (Feb. 13, 1983).Days of publication vary.Published as: The
Advance/Sylacauga daily, <Aug. 28, 1981-May 23, 1982>.Published as:
Sylacauga advance, <Nov. 24, 1982-Feb. 13, 1983>.On Sunday, published
as: Sunday advance.United StatesAlabamaTalladegaSylacauga.Childersburg
star(DLC)sn 88050232Coosa press(DLC)sn 86050293Daily
home1059-6461(DLC)sn 88050234info:srw/schema/1/marcxmlxml00000cas
a22000007a 450021026715cr un|||||||||900209c19809999aluwr ne 0
0eng dsn 90099002
AARengAARCPNCUSOCLOCLCQTJCOCLCQOCLCFOCLCOOCLCA926143844AU at
000020585756mscn-us-alSpeakin'
out news.Speaking out newsDecatur, Ala. :Minority Network,
Inc.v.WeeklyBegan in 1980.Published in Huntsville, Ala., <1987>-Also
issued by subscription via the World Wide Web.Description based on: Vol.
7, no. 8 (Jan. 7-13, 1987).African AmericansAlabamaNewspapers.African
American
newspapersAlabama.AlabamaNewspapers.Newspapers.fast(OCoLC)fst01423814African
American newspapers.fast(OCoLC)fst00799278African
Americans.fast(OCoLC)fst00799558Alabama.fast(OCoLC)fst01204694United
StatesAlabamaMorganDecatur.United
StatesAlabamaMadisonHuntsville.Speakin' out weekly news(DLC)sn
88050097http://www.softlineweb.com/softlineweb/ethnic.htminfo:srw/schema/1/marcxmlxml00000cas
a22000007a 450014996511861219c19809999aluwr ne 0 a0eng csn
86050472
AARengAARCPNNSDOCLCQ11080-15021080-15021328110USPSnsdppccWest-Alabama
gazetteWest-Alabama gazette.GazetteMillport, Ala. :Millport Pub.
Co.,1980-volumesWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrier4th
year, no. 32 (Jan. 3, 1980)-United StatesAlabamaLamarMillport.Gazette
(Millport, Ala.)(DLC)sn 86050471info:srw/schema/1/marcxmlxml00000cas
a2200000 a 450011828156850320c19809999aluwr ne 0 0eng dsn
86050314 AAAengAAACPNOCLOCLCQmscThe Hartford news-herald.Hartford, Ala.
:Geneva Publications,1980-volumes :illustrations ;57-59
cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 80,
no. 20 (Feb. 14, 1980)-United StatesAlabamaGenevaHartford.News-herald
(Hartford, Ala.)(DLC)sn 86050313info:srw/schema/1/marcxmlxml00000cas
a22000007a 450017857788880427d198u198ualusr ne 0 0eng dsn
88050097 AARengAARCPNOCLOCLCQOCLCFOCLCOOCLCAmscn-us-alSpeakin' out
weekly news.Decatur, Ala. :Smothers PublicationsPublished every first
and third Wed. of each monthDescription based on: Vol. 3, no. 13 (May
4-17, 1983).African
AmericansAlabamaNewspapers.Newspapers.fast(OCoLC)fst01423814African
Americans.fast(OCoLC)fst00799558Alabama.fast(OCoLC)fst01204694United
StatesAlabamaMorganDecatur.Weekly news (Huntsville, Ala.)(DLC)sn
87050012Speakin' out news(DLC)sn
90099002info:srw/schema/1/marcxmlxml00000cas a2200000 a
450017807936880418c198u9999aluwr ne 0 a0eng dsn 90099001
AAAengAAACPNOCLOCLCQThe Daleville Sun-Courier, 310 Daleville Ave.,
Daleville, AL 36322mscn-us-alDaleville sun-courier.Daleville, Ala. :QST
Publicationsv. :ill. ;58 cm.WeeklyDescription based on: Vol. 2, no. 28
(Wed., Feb. 17, 1988).United
StatesAlabamaDaleDaleville.AU at
000020585749info:srw/schema/1/marcxmlxml00000cas
a22000007a 450015580838870423c198u9999aluwr ne 0 0eng dsn
87050128 AARengAARCPNOCLCQmscGreene County independent.Eutaw, Ala.
:Greene County Independent, Inc.v.WeeklyDescription based on: Vol. 2,
no. 10 (Mar. 12, 1987).United
StatesAlabamaGreeneEutaw.info:srw/schema/1/marcxmlxml00000cas a22000007a
450010125135831114d198u198ualucr ne 0 a0eng dsn 83003221
NSDengNSDOCLCQ0d0746-55210746-55211Auburn Bulletin & Lee County Eagle,
PO Box 2111, Auburn, Ala. 36830nsdpThe Auburn bulletin & the Lee County
eagleThe Auburn bulletin & the Lee County eagle.Lee County eagleAuburn
bulletin and the Lee County eagleAuburn, Ala. :[publisher not
identified]Semiweekly,<Sept. 5,
1984->WeeklytexttxtrdacontentunmediatednrdamediaDescription based on:
Oct. 19, 1983.United StatesAlabamaLeeAuburn.Auburn bulletin(DLC)sn
89050006Eagle (Auburn, Ala.)(OCoLC)18435663Sept. 5,
1984info:srw/schema/1/marcxmlxml00000cas a22000007a
450018370324880818c198u9999aluwr ne 0 0eng dsn 88050147
CPNengCPNOCLCQmscTri-city times (Geraldine, Ala.)The Tri-City
times.Geraldine, Ala. :Wanda Nelsonv.WeeklyDescription based on: Vol. 2,
no. 24 (Jan. 6, 1982).United
StatesAlabamaDeKalbGeraldine.info:srw/schema/1/marcxmlxml00000cas
a22000007a 450010199338831208c198u9999aluwr ne 0 a0eng dsn
83005367 NSDengNSDCPNOCLCQ10746-62770746-62771707590USPSSpringville Pub.
Co., 539 Main St., Springville, AL 35146nsdpThe St. Clair clarionThe St.
Clair clarion.Saint Clair clarionSpringville, AL :Gary L.
ShultsWeeklytexttxtrdacontentunmediatednrdamediaDescription based on:
Vol. 2, no. 1 (Jan. 5, 1982).United StatesAlabamaSt.
ClairSpringville.AU at 000025783743info:srw/schema/1/marcxmlxml00000cas
a22000007a 450013787251860627c198u9999aluwr ne 0 a0eng dsn
86001923 NSDengNSDCPNOCLCQ10889-00800889-00801The Westerner Star, P.O.
Box 2060, Bessemer, AL 35021nsdpWestern star (Bessemer, Ala.)The Western
star(Bessemer, Ala.)The western star.Bessemer, Ala. :Hal
HodgensWeeklytexttxtrdacontentunmediatednrdamediaDescription based on:
Vol. 3, no. 15 (Wednesday, June 11, 1986).United
StatesAlabamaJeffersonBessemer.Bessemer advertiser(DLC)sn
87050117AU at 000025805174511.1srw.pc any \"y\" and srw.mt any
\"newspaper\" and srw.cp exact
\"Alabama\"50info:srw/schema/1/marcxmlxml1Date,,0mq1lME887FoIbjulKUV6bx9ImwWQNCv9GqZzGS92IKS31lEbcpRJBNHgcE1l29tFaHP9CHe0Yexk1uWQofffull"
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jul 27 2022, 09:22am via System
Hello Spencer,
Thank you for reaching out about the bulk xml files for the US Newspaper
Directory.
We don't have documentation specific to these bulk xml files, but upon
further inspection I can say that each of those files doesn't necessarily
contain info for 50 newspaper titles. The structure of the titles for
California and New York, for instance, is different from, say, Alabama.
If you look at California for example, the file naming structure
indicates the year the title started, and then the number of titles
included in that xml file. So for instance, the files below include info
for newspapers that started in 2000, 2001, and 2002 respectively. And
there is info for 30 titles in the xml file from 2000, and 14 in the
file for 2001, and so on.
* ndnp_California_2000_e_0001_0030.xml
* ndnp_California_2001_e_0001_0014.xml
* ndnp_California_2002_e_0001_0012.xml
If there are more than 50 titles for a given year, say for California
starting in 1880, then the next 50 titles will roll into the next xml
file, and so on. And the last xml file for that year may not include 50
titles.
Many of the states seem to group all the years together, so each xml
file contains 50 titles, until possibly the last one for a given state,
which may contain fewer.
I hope this information helps explain the total number of records and
structure a bit better. Let me know if you have any further questions.
Best wishes,
Kerry Huller
Newspaper & Current Periodical Reading Room
Serial & Government Publications Division
Library of Congress
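
The naming convention described in the message above can be turned into a
small lookup table. What follows is a minimal sketch in base R, not an
official parser: the function name parse_bulk_name() is made up for
illustration, and reading the last two numeric fields as the first and
last record numbers in a file is an assumption inferred from the example
names in this thread, not something the Library of Congress documents.

```r
# Split bulk file names of the assumed form
#   ndnp_<State>_<years>_e_<first>_<last>.xml
# into their parts. Names that do not fit the pattern are kept with NA fields.
parse_bulk_name <- function(fname) {
  pat <- "^ndnp_(.+)_([^_]+)_e_([0-9]+)_([0-9]+)\\.xml$"
  m <- regmatches(fname, regexec(pat, fname))
  ok <- lengths(m) == 5
  out <- data.frame(
    file  = fname,
    state = NA_character_,
    years = NA_character_,
    first = NA_integer_,
    last  = NA_integer_,
    stringsAsFactors = FALSE
  )
  if (any(ok)) {
    mm <- do.call(rbind, m[ok])      # one row per matching name
    out$state[ok] <- mm[, 2]
    out$years[ok] <- mm[, 3]
    out$first[ok] <- as.integer(mm[, 4])
    out$last[ok]  <- as.integer(mm[, 5])
  }
  # Assumed: the record count is the width of the first..last range.
  out$n_titles <- out$last - out$first + 1L
  out
}

bulk <- parse_bulk_name(c(
  "ndnp_California_2000_e_0001_0030.xml",
  "ndnp_Alabama_all-yrs_e_0001_0050.xml",
  "not-a-bulk-file.txt"
))
print(bulk)
```

Summing n_titles over all 6666 names would give a record count to compare
against the 157,521 titles listed on the site, without downloading the
files themselves.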
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jul 25 2022, 02:22pm via Email
Hi, Kerry:
Might there be documentation on the XML files you mentioned?
I've successfully read
'https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/',
extracted the names of 6666 XML files, and read the first one,
"ndnp_Alabama_all-yrs_e_0001_0050.xml". It contains 29415 characters,
beginning, "1.12250info:srw/schema/1/marcxmlxml00000nas a22000007i
45001030438981180404c20159999aluwr n 0 a0eng ". With a bit
more effort, I will likely be able to parse all 6666 of these. The
names suggest that each contains information on 50 newspapers, totaling
333,300. The main page
"https://chroniclingamerica.loc.gov/search/titles/" says there are only
157,521 "Titles currently listed". This suggests that these XML files
include placeholders for a little more than double the number of
entries currently in
"https://chroniclingamerica.loc.gov/search/titles/".
Thanks for this.
Progress.
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jul 07 2022, 08:55am via System
Hi Spencer,
I thought of one more option after I emailed you yesterday that I wanted
to make you aware of.
I had explained the other day how we pull the records from OCLC into our
U.S. Newspaper Directory. You can also access all of the raw MARC
records found in the directory in xml format from here if you choose:
https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/
These will provide you all of the data from the record fields in MARC
format, so you'd get all the data you see here for example:
https://chroniclingamerica.loc.gov/lccn/sn98059792/marc/
but in xml. I
with, but wanted to make sure you were aware of this option as well.
Best wishes,
Kerry Huller
Newspaper & Current Periodical Reading Room
Serial & Government Publications Division
Library of Congress
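
Since the bulk files carry MARC records in XML, one way to turn a file
into a table is to pull a few MARC fields out of each record with xml2.
This is a hedged sketch rather than a tested recipe for the real files:
the function name marc_records() is hypothetical, the sample document is
hand-built and heavily abbreviated from one record in this thread, and
its nesting (MARC records inside an SRU searchRetrieveResponse) is an
assumption. The MARC conventions used are standard: tag 010 holds the
LCCN, tag 245 the title, and characters 8-15 of control field 008 hold
the two publication dates.

```r
# Extract LCCN, title and the two 008 dates from every MARC record.
# XPath uses local-name() so that XML namespaces do not need to be declared.
marc_records <- function(doc) {
  lname <- function(name) sprintf("*[local-name()='%s']", name)
  # A MARC record is a <record> element that has a <leader> child; this
  # skips the outer SRU <record> wrappers.
  recs <- xml2::xml_find_all(
    doc, sprintf(".//%s[%s]", lname("record"), lname("leader")))
  get <- function(xpath) xml2::xml_text(xml2::xml_find_first(recs, xpath))
  sub_a <- function(tag) {
    sprintf("./%s[@tag='%s']/%s[@code='a']",
            lname("datafield"), tag, lname("subfield"))
  }
  f008 <- get(sprintf("./%s[@tag='008']", lname("controlfield")))
  data.frame(
    lccn  = trimws(get(sub_a("010"))),
    title = get(sub_a("245")),
    date1 = substr(f008, 8, 11),    # start year, may contain "u"
    date2 = substr(f008, 12, 15),   # end year, "9999" = still published
    stringsAsFactors = FALSE
  )
}

# Hand-built, abbreviated sample (an assumption about the real layout).
sample_xml <- '<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
 <records><record><recordData>
  <record xmlns="http://www.loc.gov/MARC21/slim">
   <leader>00000cas a22000007a 4500</leader>
   <controlfield tag="008">870423c198u9999aluwr ne 0 0eng d</controlfield>
   <datafield tag="010" ind1=" " ind2=" ">
    <subfield code="a">sn 87050128</subfield>
   </datafield>
   <datafield tag="245" ind1="0" ind2="0">
    <subfield code="a">Greene County independent.</subfield>
   </datafield>
  </record>
 </recordData></record></records>
</searchRetrieveResponse>'

if (requireNamespace("xml2", quietly = TRUE)) {
  titles <- marc_records(xml2::read_xml(sample_xml))
  print(titles)
}
```

For a real file, passing the result of xml2::read_xml() on one of the
bulk URLs to marc_records() should work the same way if the layout
assumption holds; any further fields would be added with more tag/code
pairs.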
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jul 06 2022, 10:55am via System
Hi Spencer,
Thanks for reaching out again. I have been looking at the json view a
bit closer this morning and your example of "9999."
After talking with a colleague this morning and looking at various
examples, I see there is some variation in how the titles with either an
unknown starting/ending date or currently published titles are being
handled - depending on the view.
As an example, I completed a search in the directory for Alaska and the
city of Anchorage. There are 80 results, and on the first page of
results you'll see # 4. Fort Richardson news, which was published from
1952-19??. The csv view of this state/city search result will show the
ending date of 19??. But if I append &format=json to this search result,
this specific title will show an ending date of 1999. After talking with
a colleague this morning, I discovered an integer had to be used in
these cases where dates were "?" so that the search based on year range
would work. Similarly, if you look at # 12 Alaska digest, which was
published 1994-current, the "current" becomes "9999" in the json view.
So, the records you are seeing with "9999" would most likely be titles
with an ending date of "current."
However, there is an issue with the unknown dates, like "1999" being
used for "19??" in the example above. The "9" does not get inserted in
place of "?" when you are looking at the title/LCCN view of a specific
newspaper. So for instance, if you view the #4 title: Fort Richardson
news at this url: https://chroniclingamerica.loc.gov/lccn/sn98059792/
but append .json to the end of the url, after the LCCN, like this:
https://chroniclingamerica.loc.gov/lccn/sn98059792.json
you'll see that the end_year is "19??." Viewing the title/LCCN json view
for titles that are currently published will also show the end_year as
"current." The Alaska digest example from above can be viewed here:
https://chroniclingamerica.loc.gov/lccn/sn97060056.json
I wasn't aware of the difference between the directory search json view
and the title/LCCN view. But I think it would be possible to grab
the data from the title/LCCN json url through an additional script
potentially. The json url is included in the view under the "url"
field.
Of course, there are unknowns with publishing dates, but better to know
where the question marks are, and what titles are considered to be current.
I hope this clarifies the data a bit more - let me know if any of it
needs more clarification though. And let me know if you have follow-up
questions.
Thank you,
Kerry Huller
Newspaper & Current Periodical Reading Room
Serial & Government Publications Division
Library of Congress
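
The end_year conventions described in the message above ("9999" or
"current" for titles still published, "?" for unknown digits) can be
made explicit before any counting by year. The following is a minimal
sketch in base R under stated assumptions: the function name
normalize_end_year() and the status labels are invented for
illustration, and treating each "?" as a digit anywhere from 0 to 9 is
one possible reading of an uncertain date, not a Library of Congress
rule.

```r
# Classify end_year strings and give the earliest/latest year they allow.
normalize_end_year <- function(end_year) {
  x <- as.character(end_year)
  status <- ifelse(
    x %in% c("9999", "current"), "current",
    ifelse(grepl("?", x, fixed = TRUE), "uncertain",
           ifelse(grepl("^[0-9]{4}$", x), "known", "unparsed")))
  # Replace each "?" by 0 (earliest) or 9 (latest) to bound the year.
  earliest <- suppressWarnings(as.integer(gsub("?", "0", x, fixed = TRUE)))
  latest   <- suppressWarnings(as.integer(gsub("?", "9", x, fixed = TRUE)))
  no_year <- status %in% c("current", "unparsed")
  earliest[no_year] <- NA_integer_
  latest[no_year]   <- NA_integer_
  data.frame(raw = x, status = status, earliest = earliest,
             latest = latest, stringsAsFactors = FALSE)
}

years <- normalize_end_year(c("19??", "9999", "current", "1983"))
print(years)
```

This only helps with values taken from the title/LCCN json view. In the
directory search json view a "19??" has already become "1999", so it can
no longer be told apart from a title that really ended in 1999.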
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jul 05 2022, 04:42pm via Email
Hi, Kerry:
What would you suggest I do to get a count of the numbers of
newspapers and publishers operating by year from, say, 1790 to 2021?
I just determined that 20630 (13 percent) of the 157520 records in
the US Newspaper database I downloaded a week ago have end_year = 9999.
I don't think it's feasible to assume that all or even most of those
are still publishing.
Might there be some other database that might have this kind of
information?
I ask, because Robert McChesney (2004) The Problem of the Media
(Monthly Review Pr., esp. pp. 34-35) suggests that in the first half of
the nineteenth century, the US had more newspapers and newspaper
publishers per capita than any other place or time. He suggests that
that diversity of newspapers helped encourage literacy and limit
political corruption, both of which helped propel the young US to its
current dominance of the international political economy. I'm hoping to
get some data to evaluate this claim. Sadly, it looks like there is too
much missing and questionable data in this dataset for me to use this
without a fairly substantive data cleaning effort.
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jul 05 2022, 09:05am via System
Hello Spencer,
Thank you for reaching out about your additional questions.
I was looking at the records you mention above, and yes, you are correct
- those 9 records with the date inconsistencies and the one record for
The New Mexican mining news
<https://chroniclingamerica.loc.gov/lccn/sn93061507/> containing
"Santa Fe.\" have typos in them. Thanks for spotting these - it may be
possible to have the cataloger in our division correct those typos. I
will look into this further.
The U.S. Newspaper Directory doesn't have a connection with Wikimedia or
Wikipedia. The Library of Congress periodically pulls the records for
the Directory from OCLC Worldcat
<https://www.oclc.org/en/worldcat.html>. And those newspaper records in
OCLC Worldcat have been created by catalogers at various institutions
around the U.S. over the span of several years. So, occasionally, you
will find a typo in the records. Corrections can be made by OCLC and
library staff at the various institutions. Every time we complete a new
pull on the OCLC records, any corrected records will then populate our
Directory.
Regarding your question on the New-York weekly journal - yes, that is
also correct that it has two records. There is actually a record for
each format of the newspaper, so this record is for the microfilm format
<https://chroniclingamerica.loc.gov/lccn/2009252748/> and this one is
for the original print format
<https://chroniclingamerica.loc.gov/lccn/sn83030211/>. You can see in
the heading for the microfilm record where it says [microfilm reel] and
the print version shows [volume]. You are likely to see this for other
titles as well because each format has been cataloged with its own LCCN.
You are also likely to see additional records with [online resource]
identified as the format as more and more titles are available as
ePrints or online.
I hope this helps answer your additional questions a bit more. Please
reach out if you have any other questions.
Thank you,
Kerry Huller
Newspaper & Current Periodical Reading Room
Serial & Government Publications Division
Library of Congress
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jul 04 2022, 01:47pm via Email
Hi, Kelly:
At the risk of bombing your inbox with more emails than you want,
what is your relationship with Wikipedia and other Wikimedia Foundation
projects like Wikidata?
I ask, because I've logged over 20,000 edits in Wikimedia Foundation
projects since 2010, and I would happily try to answer questions about
Wikidata and other Wikimedia Foundation projects. I have NOT organized
an edit-a-thon, but I've made presentations at conferences with people
who have, and I would happily try to help organize such if you could
find a group of people who want to work to improve this US Newspaper
database. I think it would be good to establish links between this US
Newspaper database and Wikidata, with appropriate procedures so changes
to one could be evaluated for acceptance into the other.
FYI, John Peter Zenger's famous "New-York weekly journal" (1733-1751)
appears TWICE in your database with lccn = 2009252748 and sn83030211 and
ONCE in Wikidata WITHOUT an lccn, even though many other Wikidata items
have an lccn. See:
https://www.wikidata.org/wiki/Q23091960
There's a "WikiProject Newspapers" on Wikipedia and a companion
"WikiProject Periodicals" on Wikidata:
https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Newspapers/Wikidata
https://www.wikidata.org/wiki/Wikidata:WikiProject_Periodicals
I've tried to connect with others on those projects, so far with only
limited success. However, you may know that almost anyone can change
almost anything on Wikipedia and other Wikimedia Foundation projects.
What stays tends to be written from a neutral point of view citing
credible sources. They have problems with vandals, but the problems are
usually easily controlled. This makes Wikipedia and Wikidata very
useful platforms for cleaning up databases like your US Newspaper dataset.
Spencer Graves
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jul 03 2022, 10:39pm via Email
Hello, Kelly:
In addition to the invalid JSON, discussed below [NOTE: The "below"
contains a slight addition to the report I sent last Friday.], I
found 9 (NINE!) cases where start_year was AFTER end_year. These have
lccn = "sn86071531" "sn95069213" "sn90059096" "sn86058451" "sn90060926"
"sn99065409" "sn89065002" "sn98069857" "sn91059179"
See:
https://chroniclingamerica.loc.gov/lccn/sn86071531/
https://chroniclingamerica.loc.gov/lccn/sn95069213/
https://chroniclingamerica.loc.gov/lccn/sn90059096/
https://chroniclingamerica.loc.gov/lccn/sn86058451/
https://chroniclingamerica.loc.gov/lccn/sn90060926/
https://chroniclingamerica.loc.gov/lccn/sn99065409/
https://chroniclingamerica.loc.gov/lccn/sn89065002/
https://chroniclingamerica.loc.gov/lccn/sn98069857/
https://chroniclingamerica.loc.gov/lccn/sn91059179/
These all have obvious coding errors that can be easily fixed. The
data may not be completely accurate after the fix, but at least they are
not obviously wrong ;-)
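
A check like the one reported above can be sketched in a few lines of
base R. The function name flag_reversed_years() is hypothetical, and the
toy data frame below is invented purely for illustration (its lccn
values are not real); only the column names lccn, start_year and
end_year follow the fields mentioned in this thread.

```r
# Return the rows whose start_year is later than end_year.
# Years that are not plain integers (e.g. "19??", "current") become NA
# and are never flagged.
flag_reversed_years <- function(titles) {
  start <- suppressWarnings(as.integer(titles$start_year))
  end   <- suppressWarnings(as.integer(titles$end_year))
  reversed <- !is.na(start) & !is.na(end) & start > end
  titles[reversed, , drop = FALSE]
}

# Invented example data, not real directory records.
toy <- data.frame(
  lccn       = c("toy-0001", "toy-0002", "toy-0003", "toy-0004"),
  start_year = c("1881", "1902", "1994", "1952"),
  end_year   = c("1890", "1892", "9999", "19??"),
  stringsAsFactors = FALSE
)
reversed <- flag_reversed_years(toy)
print(reversed)
```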
##################
I got invalid JSON from:
https://chroniclingamerica.loc.gov/search/titles/results/?rows=500&page=103&format=json
After some experimentation, I was able to replicate the problem with
a request for rows=10:
https://chroniclingamerica.loc.gov/search/titles/results/?rows=10&page=5117&format=json
Duncan Temple Lang <dtemplelang at ucdavis.edu>, Professor of Statistics
and Associate Dean for Graduate Programs at the University of California
- Davis, confirmed that it was a JSON error using:
https://codebeautify.org/jsonvalidator
He is part of the core team developing the R free, open-source
programming language. He said, that starting at offsets 161070 and
161502 in the character string you get from [the R code RCurl::getURL()]
we have:
Santa Fe.\"
and these are in an entry such as
"city": ["Santa Fe.\"]
So the final " is escaped and therefore there is no closing " for the
string. The parser continues to consume characters looking for the end
of that string.
If one "repairs" the text from getURL() with
ftxt= gsub('Santa Fe.\\\\"', 'Santa Fe."', txt)
then the rest of my code worked fine.
You may wish to do something to implement other checks for valid JSON
and repair this problem. I've scanned all the 157520 records that were
in that database a couple of days ago, and this is the only JSON error
identified by the code I used.
NOTE: I was NOT able to replicate this error when downloading records
one at a time. That suggests a problem NOT in the database itself but
in the download algorithm. ???
Thank you for your help. I will almost certainly have other
questions ;-)
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jul 01 2022, 11:46am via Email
Hello, Kelly:
I got invalid JSON from:
https://chroniclingamerica.loc.gov/search/titles/results/?rows=500&page=103&format=json
After some experimentation, I was able to replicate the problem with
a request for rows=10:
https://chroniclingamerica.loc.gov/search/titles/results/?rows=10&page=5117&format=json
Duncan Temple Lang <dtemplelang at ucdavis.edu>, Professor of Statistics
and Associate Dean for Graduate Programs at the University of California
- Davis, confirmed that it was a JSON error using:
https://codebeautify.org/jsonvalidator
He is part of the core team developing the R free, open-source
programming language. He said, that starting at offsets 161070 and
161502 in the character string you get from [the R code RCurl::getURL()]
we have:
Santa Fe.\"
and these are in an entry such as
"city": ["Santa Fe.\"]
So the final " is escaped and therefore there is no closing " for the
string. The parser continues to consume characters looking for the end
of that string.
If one "repairs" the text from getURL() with
ftxt= gsub('Santa Fe.\\\\"', 'Santa Fe."', txt)
then the rest of my code worked fine.
You may wish to do something to implement other checks for valid JSON
and repair this problem. I've scanned all the 157520 records that were
in that database a couple of days ago, and this is the only JSON error
identified by the code I used.
Thank you for your help. I will almost certainly have other
questions ;-)
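
The repair described above can be generalized slightly so that it does
not depend on the city name. The sketch below is an assumption-laden
variant, not the code actually used: repair_escaped_quote() is a made-up
name, and it assumes the only defect is a backslash-escaped quote sitting
directly before a closing "]". A legitimate string that happens to
contain that three-character sequence would be altered too, so the result
should still be validated, for example with jsonlite::validate().

```r
# Replace  \"]  (backslash, quote, bracket) by  "]  so the string closes.
# In the R pattern, '\\\\' is one literal backslash and '\\]' a literal ']'.
repair_escaped_quote <- function(txt) {
  gsub('\\\\"\\]', '"]', txt)
}

broken <- '{"city": ["Santa Fe.\\"]}'        # the defect reported above
intact <- '{"title": ["say \\"hi\\""]}'      # valid escaped quotes

fixed     <- repair_escaped_quote(broken)
untouched <- repair_escaped_quote(intact)
cat(fixed, untouched, sep = "\n")
```

Properly escaped quotes inside a string are followed by another quote
before the bracket, so they are left alone, as the second example shows.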
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jun 28 2022, 02:20pm via System
Hello Spencer,
Thank you for sending along your follow-up questions.
I'm glad to hear the json view will work for you. It was recommended to
me that you limit your requests to 500 rows at a time. And a developer
here at LC suggests the following regarding rate limiting:
"To avoid being blocked by the server, the current rate-limiting rules
restrict un-cached requests to URLs starting with
https://chroniclingamerica.loc.gov/search/ to 120 requests every 10
minutes from a single IP address."
So, I think if you limited each of your requests to 500 rows at a time
with the proper pauses, then you should be able to access what you need.
As for the csv view, I checked on this as well, and was informed that
the csv view was not implemented for all url formats. The csv view was
only implemented for this view:
https://chroniclingamerica.loc.gov/newspapers/
and urls resulting from US Directory search results - e.g., if you
wanted to narrow down your search results by state, city, date range,
etc. found at this link:
https://chroniclingamerica.loc.gov/search/titles/
So, if you wanted a csv and limited your search by state (for example:
https://chroniclingamerica.loc.gov/search/titles/results/?state=Alaska&county=&city=&year1=1690&year2=2022&terms=&frequency=&language=&ethnicity=&labor=&material_type=&lccn=&rows=20&format=csv
), you could append &format=csv to the search result url and get the csv
to automatically download. But, if your search results ended up being
over a couple thousand titles, then the system would probably time out.
I hope this info helps! Let me know if you have any other questions.
Best wishes,
Kerry Huller
Newspaper & Current Periodical Reading Room
Serial & Government Publications Division
Library of Congress
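
The limit quoted above, 120 requests every 10 minutes, works out to one
request every 5 seconds. A paced download loop might look like the
sketch below. It is an illustration under assumptions rather than a
tested client: fetch_pages() and its arguments are hypothetical names,
the default fetch function simply reads the URL with readLines(), and a
fixed pause between calls is just one simple way to stay under the
limit. The demonstration call uses a stub instead of the network.

```r
# Fetch the given result pages one by one, pausing between requests.
# `fetch` is injectable so the loop can be tried without network access.
fetch_pages <- function(pages, rows = 500,
                        fetch = function(u) {
                          paste(readLines(u, warn = FALSE), collapse = "\n")
                        },
                        max_requests = 120, window_sec = 600,
                        pause = window_sec / max_requests) {
  base <- "https://chroniclingamerica.loc.gov/search/titles/results/"
  out <- vector("list", length(pages))
  for (i in seq_along(pages)) {
    if (i > 1) Sys.sleep(pause)   # 600 / 120 = 5 seconds by default
    url <- sprintf("%s?rows=%d&page=%d&format=json", base, rows, pages[i])
    out[[i]] <- fetch(url)
  }
  names(out) <- pages
  out
}

# Dry run: the stub returns the URL it was given, and no pause is used.
demo <- fetch_pages(1:2, fetch = function(u) u, pause = 0)
print(demo)
```

Dropping the fetch and pause arguments would make real requests at the
default pace of one every 5 seconds.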
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jun 27 2022, 04:15pm via Email
Hello, Kerry:
Thanks for the reply. Can you please give me some further guidance
on two things "so that the system is not overwhelmed"?
1. The max size in a small batch?
2. Any limit on the number of small batches in a second or minute?
I've found that I can download small batches under program control
using "RCurl::getURL" in R (programming language) using, e.g.;
https://chroniclingamerica.loc.gov/search/titles/results/?rows=20&page=2&format=json
With this, I can control the batch size with "rows=20" vs. "rows=50"
vs., e.g., "rows=1000". A naive search says there are 157520 "results".
With "rows=1000", this would require 158 calls. With "rows=20", it
would require 7876 calls. Before I start, I need to decide which fields
I want; I don't need them all.
Thanks,
Spencer Graves
p.s. I tried appending "&format=csv" and got "Error 504 Ray ID:
7220896da85e86e7 • 2022-06-27 19:19:53 UTC Gateway time-out". I used:
https://chroniclingamerica.loc.gov/search/titles/results/?state=&county=&city=&year1=1690&year2=2022&terms=&frequency=&language=&ethnicity=&labor=&material_type=&lccn=&rows=20&format=csv
I can get what I want using json so do not need csv. However, I
thought you might want to know that I was unable to get csv to work.
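
The call counts in the message above follow from dividing the 157520
results by the batch size and rounding up. As a worked check, the short
base R sketch below also estimates the minimum run time for each batch
size; the 12 requests per minute used for that estimate comes from the
rate limit of 120 requests per 10 minutes quoted elsewhere in this
thread, and the variable names are illustrative only.

```r
n_results <- 157520
rows      <- c(20, 500, 1000)

plan <- data.frame(
  rows    = rows,
  calls   = ceiling(n_results / rows),   # requests needed
  stringsAsFactors = FALSE
)
# Lower bound on run time at 120 requests per 10 minutes (12 per minute).
plan$min_minutes <- plan$calls / 12
print(plan)
```

Note that 500 rows per request is the size recommended by the Library of
Congress in this thread, so the rows = 1000 line is shown only for
comparison.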
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jun 27 2022, 10:54am via System
Hello Spencer,
Thank you for contacting the Library of Congress about searching the US
Newspaper Directory. I wanted to follow up with you regarding your
request to output the data in a machine readable format.
It looks like you were provided the link to the API documentation for
the website: About the Site and API
<https://chroniclingamerica.loc.gov/about/api/>. Scroll down to the
section with the heading, Searching the directory and newspaper pages
using OpenSearch. This section describes the search functionality and
structure for the US Newspaper Directory in more detail. It is possible
to return your directory searches in json format by appending
&format=json to the end of the url. It is also possible to return search
results in csv format by appending &format=csv to the end of the url,
but I would strongly suggest that you do this in small batches by
putting limits on your search so that the system is not overwhelmed.
So, from the search page for the US Newspaper Directory
<https://chroniclingamerica.loc.gov/search/titles/> you could
potentially limit your search based on state and city, or date range,
and/or even frequency. Then once you've completed the search, you can
add &format=csv to the end of the url to automatically download a csv of
those records. The resulting csv will contain several fields/headers:
lccn, title, place of publication, start year, end year, publisher,
edition, frequency, subject, state, city, country, language, oclc
number, and holding type. I think these fields include the information
you were looking for. But, again, I would like to stress that you put
limits on your search before creating the csv so as not to overwhelm the
system.
Please let me know if you have any other additional questions.
Best wishes,
Kerry Huller
Newspaper & Current Periodical Reading Room
Serial & Government Publications Division
Library of Congress
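
A limited search of the kind recommended above can be built under
program control instead of through the search page. The sketch below is
a guess at a convenient helper, not part of any documented API:
search_url() is a hypothetical name, the parameter names (state, city,
year1, year2, rows, format) are copied from search URLs quoted elsewhere
in this thread, and leaving out the empty parameters is an assumption
that the server treats a missing parameter like an empty one.

```r
# Build a US Newspaper Directory search URL limited by state, city and
# year range, returning csv or json.
search_url <- function(state = "", city = "", year1 = 1690, year2 = 2022,
                       rows = 20, format = c("csv", "json")) {
  format <- match.arg(format)
  params <- c(state = state, city = city, year1 = year1, year2 = year2,
              rows = rows, format = format)
  params <- params[nzchar(params)]            # drop unused filters
  values <- vapply(params, utils::URLencode, character(1), reserved = TRUE)
  query  <- paste0(names(params), "=", values, collapse = "&")
  paste0("https://chroniclingamerica.loc.gov/search/titles/results/?", query)
}

alaska_csv <- search_url(state = "Alaska")
print(alaska_csv)
# To download (network access needed, so not run here):
# alaska <- read.csv(alaska_csv, stringsAsFactors = FALSE)
```

Looping over states, or over year ranges within a large state, keeps
each csv request small, which is the point of the advice above.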
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jun 23 2022, 01:55pm via System
Mr. Graves,
I'm going to transfer your request to a member of our digital collections
team who may be of more assistance to you than me.
Mike
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jun 23 2022, 01:51pm via Email
Dear Mr. Queen:
Thanks for the reply. I'm still confused. I downloaded and
installed Docker Desktop and "docker-compose.yml" and ran their
"Getting Started" Tutorial, but I don't see what to do next.
I repeat: I'd like to analyze "U.S. Newspaper Directory,
1690-Present" (https://chroniclingamerica.loc.gov/search/titles/), which
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jun 22 2022, 07:15pm via System
Mr. Graves,
Programmatic access to the data for Chronicling America
<https://chroniclingamerica.loc.gov/> and possibly the U.S. Newspaper
Directory <https://chroniclingamerica.loc.gov/search/titles/> can be
found on the About the Site and API
<https://chroniclingamerica.loc.gov/about/api/> page in various formats.
Also, please note that Chronicling America contains newspapers published
from 1777-1963, but does not include every U.S. newspaper published in
that time period.
Please let me know if I can be of further assistance.
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jun 22 2022, 06:14pm via Email
Dear Mr. Queen:
Can we simplify this to just giving me the data behind "U.S.
Newspaper Directory, 1690-Present"
(https://chroniclingamerica.loc.gov/search/titles/) in a machine
readable format, e.g., csv or xlsx or a MySQL database?
As I mentioned in my original email, a naive search of that without
restrictions returned 157520 titles in 7876 pages with up to 20 titles
per page giving date ranges in at least some cases. I could probably
write software to scrape those 7876 pages from your web site and combine
them into a data file.
I have a PhD in statistics, I have been using the R programming
language and similar software for decades. This includes publishing
tutorials on how to analyze data like this on Wikiversity.[1] I'd like
to do something similar with this. I could help make your data more
useful to others and discuss with you how we might prioritize
improvements like accessing the other sources you mentioned.
Thanks very much for your reply.
Sincerely,
Spencer Graves, PhD
Founder, EffectiveDefense.org
4550 Warwick Blvd 508
Kansas City, MO 64111
m: 408-655-4567
[1] e.g.:
https://en.wikiversity.org/wiki/US_Gross_Domestic_Product_(GDP)_per_capita
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jun 22 2022, 05:27pm via System
Mr. Graves
Your request is a little more complex than it first appears and requires
extensive research. A variety of resources should be consulted to
determine the circulation statistics of newspapers published prior to
1851. You will need to check newspaper union lists and newspaper
histories. Union lists present lists of newspapers in geographic
arrangement according to place of publication, and specify which
libraries or other institutions hold collections of those newspapers and
the dates of their holdings. These can also be useful for tracking title
changes throughout a newspaper's history. Newspaper
histories like American Journalism: A History: 1690-1960
<https://lccn.loc.gov/62007157> (Mott), The Penny Press
<https://lccn.loc.gov/2004043078> (Thompson), and The Press and America
<https://lccn.loc.gov/99044295> (Emery et al.) may not include
circulation statistics, but they do document the diversity and progress
of newspaper publishing, including notable newspapers of the era.
Newspaper histories also cover the history of the printers and printing
of newspapers in a state, county, or region more generally, and provide
more condensed histories of the editors, journalists, and evolution of
the newspapers in a specific area. Newspaper histories and union lists
should be available at most large public or university libraries. More
information about union lists, newspaper histories, and researching
newspapers in general can be found in the U.S. Newspaper Collections at
the Library of Congress
<https://guides.loc.gov/united-states-newspapers/introduction> research
guide (see Reference Sources).
Please let me know if I can be of further assistance.
------------------------------------------------------------------------
Original Question
Jun 20 2022, 02:34pm via System
How can I get counts of the numbers of newspapers by year in the US, and
preferably also elsewhere?
A search of "U.S. Newspaper Directory, 1690-Present"
(https://chroniclingamerica.loc.gov/search/titles/) returned 157520
titles in 7876 pages with up to 20 titles per page giving date ranges to
the extent that it's known. If I can get a data file (e.g., csv or xls),
I can summarize. I could also use data on circulation and frequency and
especially parent company for multiple newspapers published by the same
company, to the extent that such is available.
I'm interested in this, because McChesney quoted Tocqueville in
suggesting that the US had more newspapers per person (or per million
population) prior to 1851 than at any other time or place in history.
I'd like to evaluate that claim with data to the extent that I can. See
"https://en.wikiversity.org/wiki/Social_construction_of_crime_and_what_we_can_do_about_it#Newspapers_1790_-_present".
Thanks, Spencer Graves, PhD
m: 408-655-4567
------------------------------------------------------------------------
Thank you for using Newspapers & Current Periodicals Ask a Librarian
Service!
This email is sent from Ask a Librarian in relationship to ticket #9625195.
Read our privacy policy. <https://springshare.com/privacy.html>
General XML is not intended to be parsable as a list. But there are lots of
tools you can use to extract various patterns out of XML in forms like a list.
But your data example is huge and I am falling asleep waiting to see if it
loads. I looked sideways and it is not that big directly but my browser may be
trying to show it as a web page.
How about copying and pasting a sample of, say, the first few dozen lines so
we see what is in it for the purpose of ...
The schema would be named in an attribute, if you know what you are looking
for, and the schema itself may be an external file.
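As a rough illustration of where to look: the sample quoted further down in this thread declares two default namespaces and names its record format in a recordSchema element. The sketch below lists both with xml2. The inline XML is a cut-down stand-in (an assumption made so the example runs offline); only the element names and namespace URIs are taken from the sample.

```r
library(xml2)

# Cut-down stand-in for the LoC file; element names and namespace URIs
# are copied from the sample in this thread, everything else is trimmed.
xml_txt <- paste0(
  '<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">',
  '<records><record>',
  '<recordSchema>info:srw/schema/1/marcxml</recordSchema>',
  '<recordData><record xmlns="http://www.loc.gov/MARC21/slim"/></recordData>',
  '</record></records></searchRetrieveResponse>')
doc <- read_xml(xml_txt)

# All namespace URIs declared in the document
# (xml2 labels default namespaces d1, d2, ...).
ns_found <- xml_ns(doc)
print(ns_found)

# Look up the declared record format, using a prefix of our own choosing.
ns <- c(srw = "http://www.loc.gov/zing/srw/")
schema_id <- xml_text(xml_find_first(doc, "//srw:recordSchema", ns))
schema_id
```

Both values point to MARCXML (the MARC21 "slim" format), so that is probably the name to use when asking the Library of Congress about a schema.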
So decide what you want, like a list of all titles, and use an XPath query,
e.g. xml2::xml_find_all() or XML::xpathSApply().
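A minimal sketch of the "list of all titles" idea, assuming the titles sit in MARC field 245, subfield a (as in the sample record quoted later in this thread). The inline XML is a one-record stand-in so the example runs offline, and the prefixes srw and marc are names chosen here, not something the file defines.

```r
library(xml2)

# One-record stand-in for the LoC file, with the same two namespaces.
xml_txt <- paste0(
  '<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">',
  '<records><record><recordData>',
  '<record xmlns="http://www.loc.gov/MARC21/slim">',
  '<datafield ind1="0" ind2="0" tag="245">',
  '<subfield code="a">Selma sun.</subfield>',
  '</datafield>',
  '</record>',
  '</recordData></record></records>',
  '</searchRetrieveResponse>')
doc <- read_xml(xml_txt)

# The file uses default namespaces, so XPath needs explicit prefixes.
ns <- c(srw  = "http://www.loc.gov/zing/srw/",
        marc = "http://www.loc.gov/MARC21/slim")

# Every 245$a subfield in the document, as a character vector.
titles <- xml_text(xml_find_all(
  doc, "//marc:datafield[@tag='245']/marc:subfield[@code='a']", ns))
titles
```

For the real data, read_xml() can be pointed at the file URL instead of the inline string; the XPath stays the same.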
-----Original Message-----
From: R-help <r-help-bounces at r-project.org> On Behalf Of Spencer Graves
Sent: Wednesday, July 27, 2022 4:51 PM
To: 'R-help' <r-help at r-project.org>
Subject: [R] Parsing XML?
Hello, All:
What would you suggest I do to parse the following XML file into a
list that I can understand:
XMLfile <-
"https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/ndnp_Alabama_all-yrs_e_0001_0050.xml"
This is the first of 6666 XML files containing "U.S. Newspaper
Directory" maintained by the US Library of Congress discussed in the
thread below. I've tried various things using the XML and xml2.
XMLdata <- xml2::read_xml(XMLfile)
str(XMLdata)
XMLdat <- XML::xmlParse(XMLdata)
str(XMLdat)
XMLtxt <- xml2::xml_text(XMLdata)
nchar(XMLtxt)
#[1] 29415
Someplace there's a schema for this. I don't know if it's embedded
in this XML file or in a separate file. If it's in a separate file, how
could I describe it to my contacts with the Library of Congress so they
would understand what I needed and could help me get it?
Thanks,
Spencer Graves
p.s. All 29415 characters in XMLtxt appear in the thread below.
-------- Forwarded Message --------
Subject: [Newspapers and Current Periodicals] How can I get counts of
the numbers of newspapers by year in the US, and preferably also
elsewhere? A search of "U.S. Newspaper Directory,
Date: Wed, 27 Jul 2022 14:59:03 +0000
From: Kerry Huller <serials at ask.loc.gov>
To: Spencer Graves <spencer.graves at effectivedefense.org>
CC: twes at loc.gov
--# Type your reply above this line #--
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jul 27 2022, 10:59am via System
Hello Spencer,
So, when I view the xml, I'm actually looking at it in XML editor
software, so I can view the tags and it's structured neatly. I've copied
and pasted the text from the beginning of the file and the first
newspaper title below from my XML editor:
<?xml version="1.0" encoding="UTF-8"
standalone="no"?>
<?xml-stylesheet type='text/xsl'
href='/webservices/catalog/xsl/searchRetrieveResponse.xsl'?>
<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/"
xmlns:oclcterms="http://purl.org/oclc/terms/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:diag="http://www.loc.gov/zing/srw/diagnostic/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<version>1.1</version>
<numberOfRecords>2250</numberOfRecords>
<records>
<record>
<recordSchema>info:srw/schema/1/marcxml</recordSchema>
<recordPacking>xml</recordPacking>
<recordData>
<record xmlns="http://www.loc.gov/MARC21/slim">
<leader>00000nas a22000007i 4500</leader>
<controlfield tag="001">1030438981</controlfield>
<controlfield tag="008">180404c20159999aluwr n 0
a0eng
</controlfield>
<datafield ind1=" " ind2=" "
tag="010">
<subfield code="a"> 2018200464</subfield>
</datafield>
<datafield ind1=" " ind2=" "
tag="040">
<subfield code="a">DLC</subfield>
<subfield code="e">rda</subfield>
<subfield code="c">DLC</subfield>
<subfield code="b">eng</subfield>
</datafield>
<datafield ind1=" " ind2=" "
tag="012">
<subfield code="m">1</subfield>
</datafield>
<datafield ind1="0" ind2=" "
tag="022">
<subfield code="a">2577-5316</subfield>
<subfield code="2">1</subfield>
</datafield>
<datafield ind1=" " ind2=" "
tag="032">
<subfield code="a">021110</subfield>
<subfield code="b">USPS</subfield>
</datafield>
<datafield ind1=" " ind2=" "
tag="037">
<subfield code="b">711 Alabama Avenue, Selma, AL
36701</subfield>
</datafield>
<datafield ind1=" " ind2=" "
tag="042">
<subfield code="a">nsdp</subfield>
<subfield code="a">pcc</subfield>
</datafield>
<datafield ind1="1" ind2="0"
tag="050">
<subfield code="a">ISSN RECORD</subfield>
</datafield>
<datafield ind1="1" ind2="0"
tag="082">
<subfield code="a">071</subfield>
<subfield code="2">15</subfield>
</datafield>
<datafield ind1=" " ind2="0"
tag="222">
<subfield code="a">Selma sun</subfield>
</datafield>
<datafield ind1="0" ind2="0"
tag="245">
<subfield code="a">Selma sun.</subfield>
</datafield>
<datafield ind1=" " ind2="1"
tag="264">
<subfield code="a">Selma, AL :</subfield>
<subfield code="b">North Shore Press,
LLC</subfield>
<subfield code="c">2016-</subfield>
</datafield>
<datafield ind1=" " ind2=" "
tag="310">
<subfield code="a">Weekly</subfield>
</datafield>
<datafield ind1=" " ind2=" "
tag="336">
<subfield code="a">text</subfield>
<subfield code="b">txt</subfield>
<subfield code="2">rdacontent</subfield>
</datafield>
<datafield ind1=" " ind2=" "
tag="337">
<subfield code="a">unmediated</subfield>
<subfield code="b">n</subfield>
<subfield code="2">rdamedia</subfield>
</datafield>
<datafield ind1=" " ind2=" "
tag="338">
<subfield code="a">volume</subfield>
<subfield code="b">nc</subfield>
<subfield code="2">rdacarrier</subfield>
</datafield>
<datafield ind1="1" ind2=" "
tag="362">
<subfield code="a">Began in 2015.</subfield>
</datafield>
<datafield ind1=" " ind2=" "
tag="588">
<subfield code="a">Description based on: Volume 2, Issue
40
(October 5, 2017) (surrogate); title from caption.</subfield>
</datafield>
<datafield ind1=" " ind2=" "
tag="588">
<subfield code="a">Latest issue consulted: Volume 2,
Issue 40
(October 5, 2017).</subfield>
</datafield>
<datafield ind1=" " ind2=" "
tag="752">
<subfield code="a">United States</subfield>
<subfield code="b">Alabama</subfield>
<subfield code="c">Dallas</subfield>
<subfield code="d">Selma.</subfield>
</datafield>
</record>
</recordData>
</record>
When I view the records in the XML editor, these 2 lines below do begin
each of the records for each individual title, but of course this is
including the xml tags:
<recordSchema>info:srw/schema/1/marcxml</recordSchema>
<recordPacking>xml</recordPacking>
Hopefully this helps you decide where to break or parse each record.
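Rather than breaking the text at those two lines, one option is to let the parser find each record node and pull a few fields out of it. The sketch below assumes MARC tags 245 (title), 310 (frequency) and 362 (dates), as seen in the sample above. The second record is made up for illustration, and the helper field() is a hypothetical name.

```r
library(xml2)

# Two-record stand-in; the first mirrors the sample, the second is invented
# and deliberately lacks tags 310 and 362.
xml_txt <- paste0(
  '<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/"><records>',
  '<record><recordData><record xmlns="http://www.loc.gov/MARC21/slim">',
  '<datafield tag="245"><subfield code="a">Selma sun.</subfield></datafield>',
  '<datafield tag="310"><subfield code="a">Weekly</subfield></datafield>',
  '<datafield tag="362"><subfield code="a">Began in 2015.</subfield></datafield>',
  '</record></recordData></record>',
  '<record><recordData><record xmlns="http://www.loc.gov/MARC21/slim">',
  '<datafield tag="245"><subfield code="a">Hypothetical gazette.</subfield></datafield>',
  '</record></recordData></record>',
  '</records></searchRetrieveResponse>')
doc <- read_xml(xml_txt)

ns <- c(srw  = "http://www.loc.gov/zing/srw/",
        marc = "http://www.loc.gov/MARC21/slim")

# One node per newspaper title.
recs <- xml_find_all(doc, "//marc:record", ns)

# First matching subfield per record; records without the tag give NA.
field <- function(recs, tag, code = "a") {
  xp <- sprintf(".//marc:datafield[@tag='%s']/marc:subfield[@code='%s']",
                tag, code)
  xml_text(xml_find_first(recs, xp, ns))
}

papers <- data.frame(title     = field(recs, "245"),
                     frequency = field(recs, "310"),
                     dates     = field(recs, "362"),
                     stringsAsFactors = FALSE)
papers
```

xml_find_first() is used because it returns exactly one result per record, which keeps the columns aligned when a field is missing.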
On another note, I just noticed as well that at the top of this first
file it lists the total number of records for the Alabama grouping -
2250. This also appeared to be the case for the Alaska records when I
took a look at the first one for that state. I imagine that should be
consistent throughout each "grouping" of records.
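The record count mentioned here can be read directly from the numberOfRecords element, which gives a cheap sanity check when combining files. A small sketch on a trimmed stand-in for the file; the figure of 50 records per file is an assumption taken from the "_0001_0050" file name.

```r
library(xml2)

# Trimmed stand-in: the header fields from the sample plus one empty record.
xml_txt <- paste0(
  '<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">',
  '<version>1.1</version>',
  '<numberOfRecords>2250</numberOfRecords>',
  '<records><record><recordPacking>xml</recordPacking></record></records>',
  '</searchRetrieveResponse>')
doc <- read_xml(xml_txt)

ns <- c(srw = "http://www.loc.gov/zing/srw/")

# Total for the whole state grouping, as reported in the file header.
total <- as.integer(xml_text(xml_find_first(
  doc, "/srw:searchRetrieveResponse/srw:numberOfRecords", ns)))

# Records actually present in this one file.
in_file <- length(xml_find_all(
  doc, "/srw:searchRetrieveResponse/srw:records/srw:record", ns))

# Files to expect for the grouping, ASSUMING 50 records per file.
n_files <- ceiling(total / 50)

c(total = total, in_file = in_file, n_files = n_files)
```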
Let me know if you have follow-up questions!
Best wishes,
Kerry Huller
Newspaper & Current Periodical Reading Room
Serial & Government Publications Division
Library of Congress
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jul 27 2022, 10:21am via Email
Hi, Kerry:
Thanks. I understand the chunking in files of at most 50. I've read
the first file "ndnp_Alabama_all-yrs_e_0001_0050.xml" into a string of
29415 characters, copied below. Might you have any suggestions on the
next step in parsing this? Staring at it now, it looks like splitting on
"info:srw/schema/1/marcxmlxml" might convert the 29415 characters into
shorter chunks, each of which could then be parsed further.
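For comparison, here is a sketch of that string-splitting idea next to a node-based alternative, on a two-record stand-in (the second record is invented). Splitting does recover one chunk per record, but within a chunk the field boundaries are already gone, whereas the record nodes can still be queried field by field.

```r
library(xml2)

# Two-record stand-in with no whitespace between tags, like the real file.
rec <- function(title, freq) paste0(
  '<record><recordSchema>info:srw/schema/1/marcxml</recordSchema>',
  '<recordPacking>xml</recordPacking><recordData>',
  '<record xmlns="http://www.loc.gov/MARC21/slim">',
  '<datafield tag="245"><subfield code="a">', title, '</subfield></datafield>',
  '<datafield tag="310"><subfield code="a">', freq, '</subfield></datafield>',
  '</record></recordData></record>')
xml_txt <- paste0(
  '<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">',
  '<version>1.1</version><numberOfRecords>2250</numberOfRecords><records>',
  rec("Selma sun.", "Weekly"),
  rec("Hypothetical gazette.", "Monthly"),   # invented record
  '</records></searchRetrieveResponse>')
doc <- read_xml(xml_txt)

# Approach 1: flatten to text, then split on the marker string.
txt    <- xml_text(doc)
chunks <- strsplit(txt, "info:srw/schema/1/marcxmlxml", fixed = TRUE)[[1]]
chunks <- chunks[-1]          # drop the header piece ("1.12250")

# Approach 2: take the text of each recordData node instead.
ns <- c(srw = "http://www.loc.gov/zing/srw/")
per_record <- xml_text(xml_find_all(doc, "//srw:recordData", ns))

chunks
per_record
```

Either way each chunk is one run-together string such as "Selma sun.Weekly", which is why querying the nodes before calling xml_text() is likely to be more reliable.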
This is not as bad as reading ancient Egyptian hieroglyphics without
the Rosetta Stone, but I wondered if you might have something that could
make this work easier and more reliable? I guess I could compare with
what I already read as JSON ;-)
Thanks,
Spencer Graves
"1.12250info:srw/schema/1/marcxmlxml00000nas a22000007i
45001030438981180404c20159999aluwr n 0 a0eng
2018200464DLCrdaDLCeng12577-53161021110USPS711 Alabama Avenue, Selma, AL
36701nsdppccISSN RECORD07115Selma sunSelma sun.Selma, AL :North Shore
Press,
LLC2016-WeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan in
2015.Description based on: Volume 2, Issue 40 (October 5, 2017)
(surrogate); title from caption.Latest issue consulted: Volume 2, Issue
40 (October 5, 2017).United
StatesAlabamaDallasSelma.info:srw/schema/1/marcxmlxml00000cas a22000007a
4500502150053100127c20109999aluwr n 0 a0eng
2010200019DLCengDLCDLCOCLCQ112153-18111750USPSB & C Publishing, LLC,
3514 Martin St. S. Ste 104, Cropwell, AL 35054pccnsdpISSN RECORDSt.
Clair County news (Cropwell, Ala.)St. Clair County news(Cropwell,
Ala.)St. Clair County news.Cropwell, AL :B & C Pub.WeeklyBegan in
2010.Description based on: Nov. 4, 2010 (surrogate); title from
caption.info:srw/schema/1/marcxmlxml00000cas a22000007a
4500426491872090720c20099999alumr n 0 a0eng
2009203372DLCengDLCOCLCQ12150-346X2150-346X1AU at 000044489617NZ116076352Devon
Applewhite/Applewhite Publishing Co., 1910 Honeysuckle Rd., #N183,
Dothan, AL 36305mscnsdpISSN RECORD30514Triangle tribune(Dothan,
Ala.)Triangle tribune.Dothan, AL :Applewhite Pub. CoMonthlyBegan with
vol. 1, issue 1 (May 2009).\"Connecting the Tri-State African -American
Community.\"Description based on: Vol. 1, issue 1 (May 2009); title from
masthead.Applewhite, Devon.United StatesAlabama.United
StatesGeorgia.United StatesFlorida.info:srw/schema/1/marcxmlxml00000cas
a22000007a 4500289017315081219c20089999aluwr n | a0eng c
2008213218NSDengNSDOCLCQDLCOCLCQ111945-93191945-93191005270USPSSpringhill
Publications,
LLC, P.O. Box 186, Greenville, AL 36037nsdppccISSN RECORD07014Greenville
standardThe Greenville standard.Greenville, AL :Springhill
PublicationsWeeklytexttxtrdacontentunmediatednrdamediaBegan with vol. 1,
issue 1 (Sept. 3, 2008)Description based on surrogate of: Vol. 1, no. 15
(Dec. 18, 2008); title from masthead (publisher's Web site, viewed Dec.
19, 2008).Latest issue consulted: Vol. 1, no. 99 (July 27, 2011)
(surrogate).info:srw/schema/1/marcxmlxml00000cas a22000007a
4500123539969070426c20079999aluwr ne 0 a0eng c
2007212138NSDengNSDNSDOCLCQ101936-95571936-95571The Western Tribune,
1530 Third Ave. N., Bessemer, AL 35020mscnsdpISSN RECORDWestern tribune
(Bessemer, Ala.)The Western tribune(Bessemer, Ala.)The Western
tribune.Bessemer, Ala. :D-Med, Inc.v.WeeklyBegan in 2007.Description
based on: May 23, 2007 (surrogate); title from
caption.AU at 000041575341info:srw/schema/1/marcxmlxml00000cas a22000007a
4500226300653080425c20079999aluwr ne | a0eng
2008212112NSDengNSDNSDOCLCQ11942-20751942-20751nsdppccISSN RECORDThe
corridor messengerThe corridor messenger.Carbon Hill, AL :Corridor
Messenger, Inc.WeeklyBegan with vol. 1, issue (10.03.2007).Description
based on: 1st issue.United StatesAlabamaWalkerCarbon
Hill.http://www.corridormessenger.cominfo:srw/schema/1/marcxmlxml00000cas
a22000007a
450077560432070109c20069999aluwr ne 0 a0eng c
2007213400NSDengNSDOCLCQAUBRNOCLCOOCLCFa01935-37901935-37901AU at
000041190283The
Auburn Villager, P.O. Box 1633, Auburn, AL 36831-1633pccnsdpISSN
RECORDThe Auburn villagerThe Auburn villager.Auburn, AL :Auburn
Villagerv.WeeklyBegan in 2006.Description based on: Vol. 1, no. 4 (July
20, 2006) (surrogate); title from caption.Auburn (Ala.)Newspapers.Lee
County (Ala.)Newspapers.AlabamaAuburn.fast(OCoLC)fst01209634AlabamaLee
County.fast(OCoLC)fst01211930Newspapers.fast(OCoLC)fst01423814United
StatesAlabamaLeeAuburn.info:srw/schema/1/marcxmlxml00000cas a2200000Ii
4500872286785m o d s cr mn|---a||||140311c20069999alucr n o b
s0 a0eng cABCengrdaABCABCOCLCFLD59.13University of Alabama at
Birmingham.The eReporter.[Birmingham, Alabama] :The University of
Alabama at Birmingham,[2006]-[Birmingham, Alabama] :Offices of Public
Relations & Marketing and Information Technology1 online resource2
issues weeklytexttxtrdacontentcomputercrdamediaonline
resourcecrrdacarrierSeptember 19, 2006-\"The eReporter is an official
communication of The University of Alabama at Birmingham, companion to
the UAB Reporter and recommended alternative to mass e-mails.\"Issues
for <March 11, 2014- published and distributed via e-mail subscription
on Tuesdays and Fridays.Description based on: September 19, 2006; title
from title screen (viewed March 12, 2014).University of Alabama at
BirminghamPeriodicals.Periodicals.fast(OCoLC)fst01411641University of
Alabama at Birmingham.fast(OCoLC)fst00645114University of Alabama at
Birmingham.Office of Public Relations and Marketing.University of
Alabama at Birmingham.Information Technology.2006-2012, companion
to:University of Alabama at Birmingham.UAB
reporter.(OCoLC)32435748Archived
issueshttp://hatteras.dpo.uab.edu/cgi-bin/ereporter.cgiinfo:srw/schema/1/marcxmlxml00000cas
a22000007a 4500166387050070829c20059999aluwr ne | a0eng c
2007215501NSDengNSDOCLCQ11939-68991939-68991The Wilkie Clark Memorial
Foundation, P.O. Box 514, Roanoke, AL 36274$30.00nsdpmscISSN
RECORD305.89614People's voice (Roanoke, Ala.)The people's voice(Roanoke,
Ala.)The people's voice.Roanoke, AL :Wilkie Clark Memorial
Foundationv.WeeklyBegan with vol. 1, no. 1 in 2005.Description based on:
Vol. 2, no. 20 (Apr. 20, 2007); title from caption.Wilkie Clark Memorial
Foundation.United
StatesAlabamaRandolphRoanoke.AU at
000042141390info:srw/schema/1/marcxmlxml00000nas
a22000007i 45001124677787191021c20uu9999aluwr ne | a0eng
2019202521DLCengrdaDLC12689-3258122730USPSNorth Jackson Press, 42950 Hwy
72, Suite 406, Stevenson, AL 35772nsdppccISSN RECORD071.323North Jackson
pressNorth Jackson press.Stevenson, AL :Caney Creek Publications
LLCWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierDescription
based on surrogate of: Volume 1, number 36 (October 11, 2019); title
from masthead.Latest issue consulted: Volume 1, number 36 (October 11,
2019) (Surrogate).United
StatesAlabamaJacksonStevensoninfo:srw/schema/1/marcxmlxml00000cas
a2200000 a 4500226315099080428d19981998aluwr ne | 0eng c
2008233691GUAengGUAOCLCQOCLCFOCLCO39911644pccn-us-gaThe Dekalb
news.Birmingham, Ala. :Community newspaper holdings Inc.v.WeeklyBegan
with 1st year, no. 1 (Apr. 1, 1998); ceased with 1st year, no. 31 (Oct.
28, 1998).Final issue consulted.Description based on first issue; title
from caption.Decatur (Ga.)Newspapers.DeKalb County
(Ga.)Newspapers.Newspapers.fast(OCoLC)fst01423814GeorgiaDecatur.fast(OCoLC)fst01226234GeorgiaDeKalb
County.fast(OCoLC)fst01215288United
StatesGeorgiaDeKalbDecatur.Decatur-DeKalb news/era(DLC)sn
89053661(OCoLC)19946163info:srw/schema/1/marcxmlxml00000cas a2200000 i
450050263311m o d cr cn|||||||||020730c19979999alu x neo
0 a0eng c
2015238492AMHengrdapnAMHOCLCQOCLCFOCLCOIULOCLHTMOCLCQCOODLC66460694810970435082687-93791AU
at 000050711528OCLCS45109pccnsdpn-us---AP2.B5707023Birmingham
weekly (Online)Birmingham weekly(Online)Birmingham weekly.Birmingham, AL
:Birmingham Weekly1 online resourceIrregular,Feb. 16-28,
2012-Weekly,Sept. 4-11, 1997-Feb. 9-16,
2012texttxtrdacontentcomputercrdamediaonline resourcecrrdacarrierBegan
with vol. 1, issue 1 (Sept. 4-11, 1997).\"City news, views &
entertainment\"--Cover.Numbering dropped in Mar. 2012.Also issued in
print.Description based on: Publication information from ProQuest; title
from web page (viewed June 18, 2015).Latest issue consulted: Aug. 15-20,
2012.Birmingham (Ala.)Newspapers.Internet resources.Electronic
journals.AlabamaBirmingham.fast(OCoLC)fst01204958Newspapers.fast(OCoLC)fst01423814United
StatesAlabamaBirmingham.Print version:Birmingham
Weekly(OCoLC)39271050http://apw.softlineweb.com/http://WC2VB5MT8E.search.serialssolutions.com/?sid=sersol&SS_jc=JC_000051895&title=Birmingham+Weeklyinfo:srw/schema/1/marcxmlxml00000cas
a22000007a 450031471314941116d19941995aluwr ne 0 a0eng csn
94003083
NSDengNSDANEOCLCQOCLCFOCLCOOCLCQ11079-65411079-65411nsdppccn-us-akSoutheast
shopperSoutheast shopper.Juneau, Alaska :Kemper
Communications,1994-volumesWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol.
1, no. 1 (Nov. 16, 1994)-Ceased in Feb. 1995.Juneau
(Alaska)Newspapers.AlaskaJuneau.fast(OCoLC)fst01213587Newspapers.fast(OCoLC)fst01423814United
StatesAlaskaJuneau.AU at 000011356572info:srw/schema/1/marcxmlxml00000cas
a22000008a 450027910515930413c19949999alumr n 0 a0eng dsn
93002581 NSDengNSDOCLCQ11069-06621Birmingham Tribune, 216 Ave. T. Pratt
City, Birmingham, AL 35214nsdpBirmingham tribuneBirmingham
tribune.Birmingham, Ala. :Kervin
Fondren9501volumesMonthlytexttxtrdacontentunmediatednrdamediavolumencrdacarrierPREPUB:
publication expected Jan.
1995AU at 000025863987info:srw/schema/1/marcxmlxml00000cas a22000007a
450026199931920716d19922013alumr ne 0 a0eng csn 92003357
NSDengNSDOCLOCLCQDLC011064-01341064-01341Black & White, POB 13215,
Birmingham, AL 35202-3215nsdppccBlack & white (Birmingham, Ala.)Black &
white(Birmingham, Ala.)Black & white.Black and whiteBirmingham, Ala.
:Black & White, Inc.v.Biweekly,Oct. 2, 1997-Monthly,May 1, 1992-Sept.
1997Began in May 1992; ceased with Jan. 10, 2013.\"Birmingham's New
City
paper.\"Description based on: June 1992.Latest issue consulted: No. 67
(Oct. 16, 1997) (surrogate).info:srw/schema/1/marcxmlxml00000cas
a2200000 a 450032145723950314d19901999alumr ne 0 a0eng csn
95068755
MGNengMGNNSDCLUOCLCQOCLCFOCLCOOCLCA971211082-34841082-34841AU at
000011579542nsdppccn-us-alF335.J5S68The
Southern shofarThe Southern shofar.Birmingham, AL :L. Brook,-[1999]v.
:ill. ;35 cm.MonthlyBegan in 1990.-v. 9, issue 9 (Aug./Sept. 1999).\"The
monthly newspaper of Alabama's Jewish community.\"Some issues also
available on the Internet via the World Wide Web.Description based on:
Vol. 3, issue 11 (Oct. 1993).Jewish newspapersAlabama.Jewish
newspapers.fast(OCoLC)fst00982872Alabama.fast(OCoLC)fst01204694United
StatesAlabamaJeffersonBirmingham.Deep South Jewish voice(DLC)sn
99018499(OCoLC)42431704CLUhttp://bibpurl.oclc.org/web/719http://www.bham.net/shofar/info:srw/schema/1/marcxmlxml00000cas
a22000007a 450021265141900326c19909999aluwr ne 0 a0eng csn
90099004 AARengAARCPNNSDOCLCQ11050-08981050-08981005022USPSE.O.N., Inc.,
Main St., Eclectic, AL 36024pccnsdpISSN RECORDThe Eclectic observerThe
Eclectic observer.Eclectic, Ala. :E.O.N., Inc.,1990-v.WeeklyVol. 1, no.
1 (Feb. 22, 1990)-Published by: Price Publications, Inc., <2006->Latest
issue consulted: Vol. 17, no. 1 (Jan. 5, 2006).United
StatesAlabamaElmoreEclectic.AU at
000040212446info:srw/schema/1/marcxmlxml00000cas
a22000007a 450021214781900314c19909999aluir ne 0 a0eng csn
90002457 AAAengAAANSDOCLCQ111050-20841050-20841931180USPSClanton
Newspapers, 1109 Seventh St., N., PO Box 1379, Clanton, AL
35045nsdppccn-us-alThe Clanton advertiserThe Clanton
advertiser.AdvertiserClanton, Ala. :Clanton Newspapersv. :ill. ;58
cm.Three no. a week,<May 13, 1992->Semiweekly,<Apr. 4, 1990->Began
in
Jan. 1990.Description based on: Vol. 19, no. 27 (Wed., Apr. 4,
1990).Latest issue consulted: Vol. 22, no. 58 (May 13, 1992).United
StatesAlabamaChiltonClanton.Independent advertiser (Clanton,
Ala.)(OCoLC)21214732AU at 000025908452info:srw/schema/1/marcxmlxml00000cas
a2200000 a 450021214814900314c19909999aluwr ne 0 a0eng dsn
90099009 AAAengAAACPNNSDOCLCQ11056-32881056-32881505740USPSThe Blount
Countian, 3rd St. at Washington Ave., PO Box 310, Oneonta, AL
35121mscnsdpn-us-alThe Blount countianThe Blount countian.Oneonta, Ala.
:Southern Democrat, Inc.,1990-v. :ill.WeeklyVol. 1, no. 1 (Jan. 3,
1990)-Editor: Molly Howard Ryan, 1990-Latest issue consulted: Vol. 1,
no. 36 (Sept. 5, 1990).Ryan, Molly Howard.United
StatesAlabamaBlountOneonta.Southern Democrat(DLC)sn
85044741(OCoLC)12038577AU at 000025884049info:srw/schema/1/marcxmlxml00000cas
a22000007a 450022413044900920c19909999aluwr ne 0 a0eng dsn
90099011
AARengAARCPNNSDNSTOCLCQ92081707011191053-91231053-91231314240USPSmscnsdpThe
Clay times-journalThe Clay times-journal.Lineville, Ala. :C.L.
Proctor,1990-v.WeeklyVol. 1, no. 1 (Sept. 6, 1990)-United
StatesAlabamaClayLineville.Ashland progress(DLC)sn 85044701Lineville
tribune(DLC)sn 85044702AUinfo:srw/schema/1/marcxmlxml00000cas a22000007a
450021265218900326c19909999aluwr ne 0 0eng dsn 90099005
AARengAARCPNOCLCQmscTrussville news-journal.Trussville, Ala. :Mike
Mitchell,1990-v.BimonthlyVol. 1, no. 1 (Feb. 20, 1990)-United
StatesAlabamaJeffersonTrussville.info:srw/schema/1/marcxmlxml00000cas
a22000007a 450022301035900831c19909999aluwr ne 0 0eng dsn
90099010 AARengAARCPNOCLCQmscWeaver tribune.Oxford, Ala. :Cheaha
Pub.,1990-v.WeeklyVol. 1, no. 1 (July 19, 1990)-United
StatesAlabamaCalhounWeaver.United
StatesAlabamaCalhounOxford.info:srw/schema/1/marcxmlxml00000cas
a22000007a 450015155895870205c19879999aludr ne 0 a0eng csn
87050045
AAAengAAACPNNSDDLCCPNNSDDLCCPNDLCOCLDLCOCLCQOCLCFOCLCQ19261126829944596670892-44570892-44571AU
at 000020456714360980USPSThe
Advertiser, P.O. Box 1000, Montgomery, AL
36192pccnsdpn-us-alNewspaperMontgomery advertiser (Montgomery, Ala. :
1987)The Montgomery advertiser(1987)The Montgomery advertiser.Montgomery
advertiser & the Alabama journalSunday Montgomery advertiserMontgomery,
Ala. :Advertiser Co.,1987-volumes
:illustrationsDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrier160th
year, no. 1 (Jan. 2, 1987)-On Saturdays, Sundays and holidays a combined
edition is published with the Alabama journal, and called: Montgomery
advertiser and the Alabama journal, Jan. 3, 1987, and: Alabama journal
and Montgomery advertiser, Jan. 4, 1987-Feb. 25, 1990.Issues for Sunday
called: Sunday Montgomery advertiser, Mar. 4, 1990-Issues for Saturday,
Sunday and holidays have their own numbering, Jan. 3, 1987-Feb. 25,
1990.Montgomery
(Ala.)Newspapers.AlabamaMontgomery.fast(OCoLC)fst01202689Newspapers.fast(OCoLC)fst01423814United
StatesAlabamaMontgomeryMontgomery.Advertiser (Montgomery,
Ala.)0745-3221(DLC)sn 82008412(OCoLC)9049482Alabama journal (Montgomery,
Ala. : 1940)0745-323X(DLC)sn
87062018(OCoLC)2666111info:srw/schema/1/marcxmlxml00000cas a2200000 a
450016942287871105c19879999aludn ne 0 a0eng dsn 88050149
AAAengAAACPNNSDOCLCQy1044-00701044-0070746--32780746-32781565580USPSTroy
Publications, Inc., 113 North Market St., Troy, AL 36081mscnsdpMessenger
(Troy, Ala.)The Messenger(Troy, Ala.)The Messenger.Troy, Ala. :Troy
Pub.,1987-v.Daily (Sunday, Tuesday, Thursday and Friday)Vol. 121, no.
166 (July 1, 1987)-Sunday, Apr. 2, 1989 misprinted as v. 113.Latest
issue consulted: Vol. 113 [sic 123], no. 96 (Sunday, Apr. 2,
1989).United StatesAlabamaPikeTroy.Troy messenger0746-3278(DLC)sn
83009935(OCoLC)9921908info:srw/schema/1/marcxmlxml00000cas a22000007a
450017799786880415c19879999aluir ne 0 a0eng dsn 88050086
AARengAARCPNNSDOCLCQ1p1044-03801044-03800745-75961441520USPSThe
Prattville Progress, 152 W. 3rd St., Prattville, AL
36067mscnsdpPrattville progress (Prattville, Ala. : 1987)The Prattville
progress(Prattville, Ala.)The Prattville progress.Prattville, Ala.
:James C. Seymour,1987-v.Three times a weekVol. 102, no. 8 (Jan. 20,
1987)-Latest issue consulted: Vol. 105, no. 153 (Wednesday, Dec. 26,
1990).United StatesAlabamaAutaugaPrattville.Progress (Prattville,
Ala.)0745-7596(DLC)sn
83007623(OCoLC)9428489info:srw/schema/1/marcxmlxml00000cas a22000007a
450015344667870319c19869999aluwr ne 0 a0eng dsn 87000284
NSDengNSDCPNOCLCQy0893-07670893-07671431800USPSPickens County Herald,
P.O. Drawer E, Carrollton, AL 35447nsdpPickens County heraldPickens
County herald.Pickens County herald and west AlabamianCarrollton, Ala.
:Pickens Newspapers, Inc.,1986-WeeklyVol. 138, no. 40 (Oct. 2,
1986)-United StatesAlabamaPickensCarrollton.Pickens County herald and
west Alabamian0746-0473(DLC)sn
83008141AU at 000040635809info:srw/schema/1/marcxmlxml00000cas a22000007a
450018917586881217c19869999aluwr ne 0 0eng dsn 88050225
CPNengCPNOCLCQmscThe Oxford sun/times.Oxford, Ala.
:[s.n.],1986-v.WeeklyVol. 1, no. 1 (Jan. 16, 1986)-Editor: Andy
Goggans.Numbering is irregular.United StatesAlabamaCalhounOxford.Oxford
sun (Oxford, Ala.)(DLC)sn
85045023AU at 000025803813info:srw/schema/1/marcxmlxml00000cas a22000007a
450013991168860731c19869999aluwr ne 0 0eng dsn 86050322
CPNengCPNOCLCQmscIndependent (Brewton, Ala.)The Independent.Brewton,
Ala. :Jim Thornton,1986-v. :ill. ;58 cm.WeeklyVol. 1, no. 1 (June 19,
1986)-United
StatesAlabamaEscambiaBrewton.info:srw/schema/1/marcxmlxml00000cas
a22000007a 450018957493881231c19859999aluwr ne 0 0eng dsn
88050247 CPNengCPNOCLCQmscPiedmont journal-independent (Piedmont,
Ala.)The Piedmont journal-independent.Journal independentPiedmont, Ala.
:Lane Weatherbee,1985-v.WeeklyVol. 4, no. 52 (Dec. 24, 1985)-Sometimes
published as: Journal independent.United
StatesAlabamaCalhounPiedmont.Journal-independent(DLC)sn
85045014info:srw/schema/1/marcxmlxml00000cas a22000007a
450012715821851024d19841985aluwr ne 0 a0eng dsn 85045014
CPNengCPNNSDCPNOCLCQmscThe Journal-independent.Piedmont, Ala.
:Journal-Independent, Inc.,1984-1985.volumes :illustrations ;58
cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 3,
no. 27 (July 3, 1984)- v. 4, no. 51 (Dec. 18, 1985).Carries the same
vol. numbering as the Piedmont journal-independent.United
StatesAlabamaCalhounPiedmont.Piedmont
journal-independent0890-6017(DLC)sn 85045013Piedmont journal-independent
(Piedmont, Ala.)(DLC)sn 88050247info:srw/schema/1/marcxmlxml00000cas
a22000007a 450012691448851018c19839999aludr ne 0 0eng dsn
85045007 CPNengCPNOCLCQmscTimesDaily.Times dailyFlorence, Ala. :T.S.P.
Newspapers, Inc.,1983-volumes :illustrations ;58
cmDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 114,
no. 226 (Aug. 14, 1983)-United StatesAlabamaLauderdaleFlorence.Florence
times + tri-cities daily(DLC)sn
85044995info:srw/schema/1/marcxmlxml00000cas a22000007a
45009428489830420d19831987aluir ne 0 a0eng dsn 83007623
NSDengNSDCPNNSDNSTOCLCQ89090d0745-75960745-75961The Progress, 152 W. 3rd
St., Prattville, AL 36067nsdpmscProgress (Prattville, Ala.)The
Progress(Prattville, Ala.)The Progress.Prattville, Ala. :The Prattville
Progress,1983-1987.volumes :illustrations ;58 cmThree times a
weektexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 98, no.
32 (Mar. 17, 1983)-v. 102, no. 7 (Jan. 17, 1987).United
StatesAlabamaAutaugaPrattville.Prattville progress(DLC)sn
85044740Prattville progress (Prattville, Ala.)1044-0380(DLC)sn
88050086(OCoLC)12254317AAPinfo:srw/schema/1/marcxmlxml00000cas a2200000
a 45009867255830831c19839999aludr ne 0 a0eng dsn 84008052
AAAengAAANSDOCLOCLCQX0743-15110743-15111617760USPST.S.P. Newspapers,
Inc., 219 W. Tennessee St., Florence, AL 35630nsdpTimesDaily (Shoals
edition)TimesDaily(Shoals ed.)TimesDaily.Times dailyShoals ed.Florence,
Ala. :T.S.P. Newspapersvolumes
:illustrationsDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan
with: Vol. 114, no. 226 (Aug. 14,
1983).\"Florence/Sheffield/Tuscumbia/Muscle Shoals.\"Shoals ed. and
Regional ed. combined on Sundays.Description based on: Vol. 114, no. 346
(Monday, Dec. 12, 1983).United
StatesAlabamaLauderdaleFlorence.TimesDaily (Regional
edition)0743-152XTimes Tri-cities dailyUnknownDec. 12,
1983info:srw/schema/1/marcxmlxml00000cas a22000007a
450010536023840319c19839999aludr ne 0 a0eng dsn 84008051
NSDengNSDOCLCQ1x0743-152X0743-152X1617760USPST.S.P. Newspapers, Inc.,
219 W. Tennessee St., Florence, AL 35630nsdpTimesDaily (Regional
edition)TimesDaily(Regional ed.)TimesDaily.Times dailyRegional
ed.Florence, Ala. :T.S.P.
NewspapersDailytexttxtrdacontentunmediatednrdamediaBegan with: Vol. 114,
no. 226 (Aug. 14, 1983).Shoals ed. and Regional ed. combined on
Sundays.Description based on: Vol. 114, no. 346 (Monday, Dec. 12,
1983).United StatesAlabamaLauderdaleFlorence.TimesDaily (Shoals
edition)0743-1511Times Tri-cities dailyDec. 12,
1983AU at 000025818125info:srw/schema/1/marcxmlxml00000cas a22000007a
45009049482821213d19821987aludn ne 0 a0eng csn 82008412
AAAengAAANSDNPWCPNDLCCPNNSDDLCNSDDLCCPNNVFDLCOCLCQCRLOCLCFOCLCQ1d0745-32210745-32211nsdppccn-us-alNewspaperAdvertiser
(Montgomery, Ala.)The Advertiser(Montgomery, Ala.)The advertiser.Alabama
journal and advertiserMontgomery, Ala. :Advertiser Co.,1982-1987.volumes
:illustrationsDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrier155th
year, no. 232 (Nov. 22, 1982)- ; -v. 14-3, Jan. 1, 1987.On Saturdays,
Sundays and holidays published as: The Alabama journal and advertiser,
Nov. 27, 1982-Jan. 1, 1987.Saturday, Sunday and holiday issues have
their own numbering.Montgomery
(Ala.)Newspapers.AlabamaMontgomery.fast(OCoLC)fst01202689Newspapers.fast(OCoLC)fst01423814United
StatesAlabamaMontgomeryMontgomery.Montgomery advertiser (Montgomery,
Ala. : Daily)(DLC)sn 84020645(OCoLC)2685433Montgomery advertiser
(Montgomery, Ala. : 1987)0892-4457(DLC)sn
87050045(OCoLC)15155895AU at 000020281746info:srw/schema/1/marcxmlxml00000cas
a2200000 a 45009237931830218c19829999aluwr ne 0 0eng dsn
86050139 AAAengAAACPNOCLOCLCQmscThe Randolph leader.Roanoke, Ala. :David
S. Stevenson,1982-volumes :illustrations ;58
cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 91,
no. 1 (Oct. 6, 1982)-United StatesAlabamaRandolphRoanoke.Roanoke
leader(DLC)sn 86050137Randolph press(DLC)sn
86050138info:srw/schema/1/marcxmlxml00000cas a22000007a
450012715815851024d19821984aluwr ne 0 a0eng dsn 85045013
CPNengCPNNSDCPNOCLCQ110890-60170890-60171432080USPSThe Piedmont
Journal-Independent, 115 N. Center Ave., Piedmont, AL 36272mscnsdpThe
Piedmont journal-independentThe Piedmont journal-independent.Piedmont,
Ala. :Piedmont Journal-Independent, Inc.,1982-1984.volumes
:illustrations ;58
cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 1,
no. 1 (Mar. 31, 1982)-v. 3, no. 26 (June 27, 1984).Latest issue
consulted: Vol. 5, no. 31 (August 20, 1986).United
StatesAlabamaCalhounPiedmont.Piedmont journal(DLC)sn
85045012Journal-independent(DLC)sn
85045014(OCoLC)12715821AU at 000045312916info:srw/schema/1/marcxmlxml00000cas
a22000007a 45009183905830202c19829999aluwr n 0 a0eng dsn
85044580 AAAengAAACPNNSDOCLOCLCQ11098-58671098-58671016409USPSNo. 4,
Rucker Plaza, Enterprise, AL 36331P.O. Box 1536, Enterprise, AL
36331mscnsdpSoutheast sun (Enterprise, Ala.)The southeast
sun(Enterprise, Ala.)The Southeast sun.Enterprise, Ala. :QST
Publicationsvolumes :illustrations ;58
cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan in
1982.Description based on: Vol. 1, no. 25 (Oct. 21, 1982).Latest issue
consulted: Vol. 16, no. 43 (Mar. 4, 1998).United
StatesAlabamaCoffeeEnterprise.AU at
000025827687info:srw/schema/1/marcxmlxml00000cas
a22000007a 450010487314840305c19819999aluwr ne 0 a0eng dsn
85044906
AAAengAAACPNNSDNSTCPNOCLOCLCQOCLCFOCLCOOCLCAOCLCQ900410885-16620885-16621749310USPSThe
New Times, 1618 1/2 St. Stephens Rd., Mobile, AL 36603mscnsdpn-us-alNew
times (Mobile, Ala.)The New times(Mobile, Ala.)The new times.Mobile,
Ala. :New Times Groupvolumes
:illustrationsWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan
in 1981.Vol. 3, no. 49 (Dec. 15-21, 1983) and vol. 3, no. 50 (Dec.
22-28, 1983) are both called vol. 3, no. 49 (Dec. 15-21,
1983).Description based on: Vol. 2, no. 3 (Jan. 28-Feb. 3, 1982).African
AmericansAlabamaNewspapers.African
Americans.fast(OCoLC)fst00799558Alabama.fast(OCoLC)fst01204694Newspapers.fast(OCoLC)fst01423814United
StatesAlabamaMobileMobile.AAPUnknownAug. 15,
1985AU at 000024686659info:srw/schema/1/marcxmlxml00000cas a22000007a
450018922463881219d19811983alucr ne 0 0eng dsn 88050233
AARengAARCPNNSDOCLCQmscThe Sylacauga daily advance.Advance/Sylacauga
dailySylacauga advanceSunday advanceAdvanceSylacauga, Ala. :Mrs. W.A.
Moody,1981-1893.v.Semiweekly,<Nov. 24, 1982-Feb. 13, 1983>Daily (except
Mon., Tues. & Sat.),<May 26, 1982-Nov. 21, 1982>Daily (except Sat.
&
Mon.),<Jan. 1, 1981-May 23, 1982>74th Year, no. 123 (Jan. 1, 1981)-76th
year, no. 83 (Feb. 13, 1983).Days of publication vary.Published as: The
Advance/Sylacauga daily, <Aug. 28, 1981-May 23, 1982>.Published as:
Sylacauga advance, <Nov. 24, 1982-Feb. 13, 1983>.On Sunday, published
as: Sunday advance.United StatesAlabamaTalladegaSylacauga.Childersburg
star(DLC)sn 88050232Coosa press(DLC)sn 86050293Daily
home1059-6461(DLC)sn 88050234info:srw/schema/1/marcxmlxml00000cas
a22000007a 450021026715cr un|||||||||900209c19809999aluwr ne 0
0eng dsn 90099002
AARengAARCPNCUSOCLOCLCQTJCOCLCQOCLCFOCLCOOCLCA926143844AU at
000020585756mscn-us-alSpeakin'
out news.Speaking out newsDecatur, Ala. :Minority Network,
Inc.v.WeeklyBegan in 1980.Published in Huntsville, Ala., <1987>-Also
issued by subscription via the World Wide Web.Description based on: Vol.
7, no. 8 (Jan. 7-13, 1987).African AmericansAlabamaNewspapers.African
American
newspapersAlabama.AlabamaNewspapers.Newspapers.fast(OCoLC)fst01423814African
American newspapers.fast(OCoLC)fst00799278African
Americans.fast(OCoLC)fst00799558Alabama.fast(OCoLC)fst01204694United
StatesAlabamaMorganDecatur.United
StatesAlabamaMadisonHuntsville.Speakin' out weekly news(DLC)sn
88050097http://www.softlineweb.com/softlineweb/ethnic.htminfo:srw/schema/1/marcxmlxml00000cas
a22000007a 450014996511861219c19809999aluwr ne 0 a0eng csn
86050472
AARengAARCPNNSDOCLCQ11080-15021080-15021328110USPSnsdppccWest-Alabama
gazetteWest-Alabama gazette.GazetteMillport, Ala. :Millport Pub.
Co.,1980-volumesWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrier4th
year, no. 32 (Jan. 3, 1980)-United StatesAlabamaLamarMillport.Gazette
(Millport, Ala.)(DLC)sn 86050471info:srw/schema/1/marcxmlxml00000cas
a2200000 a 450011828156850320c19809999aluwr ne 0 0eng dsn
86050314 AAAengAAACPNOCLOCLCQmscThe Hartford news-herald.Hartford, Ala.
:Geneva Publications,1980-volumes :illustrations ;57-59
cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 80,
no. 20 (Feb. 14, 1980)-United StatesAlabamaGenevaHartford.News-herald
(Hartford, Ala.)(DLC)sn 86050313info:srw/schema/1/marcxmlxml00000cas
a22000007a 450017857788880427d198u198ualusr ne 0 0eng dsn
88050097 AARengAARCPNOCLOCLCQOCLCFOCLCOOCLCAmscn-us-alSpeakin' out
weekly news.Decatur, Ala. :Smothers PublicationsPublished every first
and third Wed. of each monthDescription based on: Vol. 3, no. 13 (May
4-17, 1983).African
AmericansAlabamaNewspapers.Newspapers.fast(OCoLC)fst01423814African
Americans.fast(OCoLC)fst00799558Alabama.fast(OCoLC)fst01204694United
StatesAlabamaMorganDecatur.Weekly news (Huntsville, Ala.)(DLC)sn
87050012Speakin' out news(DLC)sn
90099002info:srw/schema/1/marcxmlxml00000cas a2200000 a
450017807936880418c198u9999aluwr ne 0 a0eng dsn 90099001
AAAengAAACPNOCLOCLCQThe Daleville Sun-Courier, 310 Daleville Ave.,
Daleville, AL 36322mscn-us-alDaleville sun-courier.Daleville, Ala. :QST
Publicationsv. :ill. ;58 cm.WeeklyDescription based on: Vol. 2, no. 28
(Wed., Feb. 17, 1988).United
StatesAlabamaDaleDaleville.AU at
000020585749info:srw/schema/1/marcxmlxml00000cas
a22000007a 450015580838870423c198u9999aluwr ne 0 0eng dsn
87050128 AARengAARCPNOCLCQmscGreene County independent.Eutaw, Ala.
:Greene County Independent, Inc.v.WeeklyDescription based on: Vol. 2,
no. 10 (Mar. 12, 1987).United
StatesAlabamaGreeneEutaw.info:srw/schema/1/marcxmlxml00000cas a22000007a
450010125135831114d198u198ualucr ne 0 a0eng dsn 83003221
NSDengNSDOCLCQ0d0746-55210746-55211Auburn Bulletin & Lee County Eagle,
PO Box 2111, Auburn, Ala. 36830nsdpThe Auburn bulletin & the Lee County
eagleThe Auburn bulletin & the Lee County eagle.Lee County eagleAuburn
bulletin and the Lee County eagleAuburn, Ala. :[publisher not
identified]Semiweekly,<Sept. 5,
1984->WeeklytexttxtrdacontentunmediatednrdamediaDescription based on:
Oct. 19, 1983.United StatesAlabamaLeeAuburn.Auburn bulletin(DLC)sn
89050006Eagle (Auburn, Ala.)(OCoLC)18435663Sept. 5,
1984info:srw/schema/1/marcxmlxml00000cas a22000007a
450018370324880818c198u9999aluwr ne 0 0eng dsn 88050147
CPNengCPNOCLCQmscTri-city times (Geraldine, Ala.)The Tri-City
times.Geraldine, Ala. :Wanda Nelsonv.WeeklyDescription based on: Vol. 2,
no. 24 (Jan. 6, 1982).United
StatesAlabamaDeKalbGeraldine.info:srw/schema/1/marcxmlxml00000cas
a22000007a 450010199338831208c198u9999aluwr ne 0 a0eng dsn
83005367 NSDengNSDCPNOCLCQ10746-62770746-62771707590USPSSpringville Pub.
Co., 539 Main St., Springville, AL 35146nsdpThe St. Clair clarionThe St.
Clair clarion.Saint Clair clarionSpringville, AL :Gary L.
ShultsWeeklytexttxtrdacontentunmediatednrdamediaDescription based on:
Vol. 2, no. 1 (Jan. 5, 1982).United StatesAlabamaSt.
ClairSpringville.AU at 000025783743info:srw/schema/1/marcxmlxml00000cas
a22000007a 450013787251860627c198u9999aluwr ne 0 a0eng dsn
86001923 NSDengNSDCPNOCLCQ10889-00800889-00801The Westerner Star, P.O.
Box 2060, Bessemer, AL 35021nsdpWestern star (Bessemer, Ala.)The Western
star(Bessemer, Ala.)The western star.Bessemer, Ala. :Hal
HodgensWeeklytexttxtrdacontentunmediatednrdamediaDescription based on:
Vol. 3, no. 15 (Wednesday, June 11, 1986).United
StatesAlabamaJeffersonBessemer.Bessemer advertiser(DLC)sn
87050117AU at 000025805174511.1srw.pc any \"y\" and srw.mt any
\"newspaper\" and srw.cp exact
\"Alabama\"50info:srw/schema/1/marcxmlxml1Date,,0mq1lME887FoIbjulKUV6bx9ImwWQNCv9GqZzGS92IKS31lEbcpRJBNHgcE1l29tFaHP9CHe0Yexk1uWQofffull"
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jul 27 2022, 09:22am via System
Hello Spencer,
Thank you for reaching out about the bulk xml files for the US Newspaper
Directory.
We don't have documentation specific to these bulk xml files, but upon
further inspection I can say that those files don't all necessarily
contain info for 50 newspaper titles. The structure of the titles for
California and New York, for instance, is different from, say, Alabama.
If you look at California for example, the file naming structure
indicates the year the title started, and then the number of titles
included in that xml file. So for instance, the files below include info
for newspapers that started in 2000, 2001, and 2002 respectively. And
there is info for 30 titles in the xml file from 2000, and 14 in the
file for 2001, and so on.
* ndnp_California_2000_e_0001_0030.xml
* ndnp_California_2001_e_0001_0014.xml
* ndnp_California_2002_e_0001_0012.xml
If there are more than 50 titles for a given year, say for California
starting in 1880, then the next 50 titles will roll into the next xml
file, and so on. And the last xml file for that year may not include 50
titles.
Many of the states seem to group all the years together, so each xml
file contains 50 titles, until possibly the last one for a given state,
which may contain fewer.
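The naming scheme described above can be unpacked mechanically. The following is a minimal sketch in base R, not anything supplied by the Library of Congress: `parse_bulk_name` is a hypothetical helper, and it assumes that the two trailing numbers in each name are the first and last record index, so that the title count is their difference plus one.

```r
# Hypothetical helper: split a bulk file name into state, year label,
# first/last record index, and the implied number of titles.
# Assumption: names follow ndnp_<state>_<years>_e_<first>_<last>.xml
parse_bulk_name <- function(fname) {
  pattern <- "^ndnp_(.+)_([^_]+)_e_([0-9]+)_([0-9]+)\\.xml$"
  parts <- regmatches(fname, regexec(pattern, fname))
  parts <- parts[lengths(parts) == 5]        # drop names that do not match
  field <- function(i) vapply(parts, `[`, character(1), i)
  first <- as.integer(field(4))
  last  <- as.integer(field(5))
  data.frame(file = field(1), state = field(2), years = field(3),
             first = first, last = last, n_titles = last - first + 1L,
             stringsAsFactors = FALSE)
}

files <- c("ndnp_Alabama_all-yrs_e_0001_0050.xml",
           "ndnp_California_2000_e_0001_0030.xml",
           "ndnp_California_2001_e_0001_0014.xml")
info <- parse_bulk_name(files)
info[, c("state", "years", "n_titles")]
sum(info$n_titles)   # total titles implied by these three names
```

Summing `n_titles` over all file names would give an estimate of the total record count without assuming 50 titles per file.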
I hope this information helps explain the total number of records and
structure a bit better. Let me know if you have any further questions.
Best wishes,
Kerry Huller
Newspaper & Current Periodical Reading Room
Serial & Government Publications Division
Library of Congress
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jul 25 2022, 02:22pm via Email
Hi, Kerry:
Might there be documentation on the XML files you mentioned?
I've successfully read
'https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/',
extracted the names of 6666 XML files, and read the first one,
"ndnp_Alabama_all-yrs_e_0001_0050.xml". It contains 29415 characters,
beginning, "1.12250info:srw/schema/1/marcxmlxml00000nas a22000007i
45001030438981180404c20159999aluwr n 0 a0eng ". With a bit
more effort, I will likely be able to parse all 6666 of these. The
names suggest that each contains information on 50 newspapers, totaling
333,300. The main page
"https://chroniclingamerica.loc.gov/search/titles/" says there are only
157,521 "Titles currently listed". This suggests that these XML files
include placeholders for a little more than double the number of
entries currently in
"https://chroniclingamerica.loc.gov/search/titles/".
Thanks for this.
Progress.
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jul 07 2022, 08:55am via System
Hi Spencer,
I thought of one more option after I emailed you yesterday that I wanted
to make you aware of.
I had explained the other day how we pull the records from OCLC into our
U.S. Newspaper Directory. You can also access all of the raw MARC
records found in the directory in xml format from here if you choose:
https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/
These will provide you all of the data from the record fields in MARC
format, so you'd get all the data you see here, for example:
https://chroniclingamerica.loc.gov/lccn/sn98059792/marc/
but in xml. I don't know if this might be more data and info than you
want to work with, but wanted to make sure you were aware of this option
as well.
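For counting newspapers by year, the most useful part of each MARC record is probably the 008 fixed-length control field, which carries the publication status and the start and end years at fixed character positions. Below is a minimal base R sketch; `parse_008_dates` is a hypothetical helper, and the three sample strings are the leading characters of 008 values as they appear to occur in the Alabama text dump earlier in this thread (whitespace later in those fields may have been collapsed by email, so only the first 15 characters are used).

```r
# Hypothetical helper: extract date information from MARC 008 strings.
# MARC counts positions from 0; R's substr() counts from 1.
parse_008_dates <- function(f008) {
  data.frame(
    date_type = substr(f008, 7, 7),    # 008/06: "c" = currently published,
                                       #         "d" = ceased publication
    date1     = substr(f008, 8, 11),   # 008/07-10: first year of publication
    date2     = substr(f008, 12, 15),  # 008/11-14: last year ("9999" = ongoing)
    stringsAsFactors = FALSE)
}

ex <- parse_008_dates(c("830218c19829999aluwr",
                        "851024d19821984aluwr",
                        "880427d198u198ualusr"))
ex
```

Unknown digits appear as "u" in this field (for example "198u"), so the years are best kept as character strings until a rule for handling them has been chosen.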
Best wishes,
Kerry Huller
Newspaper & Current Periodical Reading Room
Serial & Government Publications Division
Library of Congress
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jul 06 2022, 10:55am via System
Hi Spencer,
Thanks for reaching out again. I have been looking at the json view a
bit closer this morning and your example of "9999."
After talking with a colleague this morning and looking at various
examples, I see there is some variation in how the titles with either an
unknown starting/ending date or currently published titles are being
handled - depending on the view.
As an example, I completed a search in the directory for Alaska and the
city of Anchorage. There are 80 results, and on the first page of
results you'll see # 4. Fort Richardson news, which was published from
1952-19??. The csv view of this state/city search result will show the
ending date of 19??. But if I append &format=json to this search result,
this specific title will show an ending date of 1999. After talking with
a colleague this morning, I discovered an integer had to be used in
these cases where dates were "?" so that the search based on year range
would work. Similarly, if you look at # 12 Alaska digest, which was
published 1994-current, the "current" becomes "9999" in the json view.
So, the records you are seeing with "9999" would most likely be titles
with an ending date of "current."
However, there is an issue with the unknown dates, like "1999" being
used for "19??" in the example above. The "9" does not get inserted in
place of "?" when you are looking at the title/LCCN view of a specific
newspaper. So for instance, if you view the #4 title, Fort Richardson
news, at this url: https://chroniclingamerica.loc.gov/lccn/sn98059792/
but append .json to the end of the url, after the LCCN, like this:
https://chroniclingamerica.loc.gov/lccn/sn98059792.json
you'll see that the end_year is "19??." Viewing the title/LCCN json view
for titles that are currently published will also show the end_year as
"current." The Alaska digest example from above can be viewed here:
https://chroniclingamerica.loc.gov/lccn/sn97060056.json
I wasn't aware of the difference between the directory search json view
and the title/LCCN view. But I think it would potentially be possible to
grab the data from the title/LCCN json url through an additional script.
The json url is included in the view under the "url" field.
Of course, there are unknowns with publishing dates, but better to know
where the question marks are, and what titles are considered to be current.
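The distinction between "current", "9999" and partly unknown years such as "19??" can be made explicit when cleaning the data. The sketch below is base R under stated assumptions: `classify_end_year` is a hypothetical helper, it treats both "9999" and "current" as "still publishing", and it turns unknown digits ("?" in the title/LCCN view, "u" in raw MARC) into an earliest and latest possible year.

```r
# Hypothetical helper: classify end_year strings and bound uncertain years.
classify_end_year <- function(end_year) {
  end_year  <- as.character(end_year)
  current   <- end_year %in% c("9999", "current")
  year_like <- grepl("^[0-9?u]{4}$", end_year)      # four digits or unknowns
  uncertain <- year_like & grepl("[?u]", end_year)
  earliest <- latest <- rep(NA_integer_, length(end_year))
  ok <- year_like & !current
  earliest[ok] <- as.integer(gsub("[?u]", "0", end_year[ok]))  # "19??" -> 1900
  latest[ok]   <- as.integer(gsub("[?u]", "9", end_year[ok]))  # "19??" -> 1999
  data.frame(end_year = end_year, current = current, uncertain = uncertain,
             earliest = earliest, latest = latest, stringsAsFactors = FALSE)
}

ey <- classify_end_year(c("19??", "9999", "current", "1984"))
ey
```

Note that this only helps with values taken from the title/LCCN view: in the directory search json view a "19??" has already become 1999 and cannot be told apart from a genuine 1999.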
I hope this clarifies the data a bit more - let me know if any of it
needs more clarification though. And let me know if you have follow-up
questions.
Thank you,
Kerry Huller
Newspaper & Current Periodical Reading Room
Serial & Government Publications Division
Library of Congress
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jul 05 2022, 04:42pm via Email
Hi, Kerry:
What would you suggest I do to get a count of the numbers of
newspapers and publishers operating by year from, say, 1790 to 2021?
I just determined that 20630 (13 percent) of the 157520 records in
the US Newspaper database I downloaded a week ago have end_year = 9999.
I don't think it's feasible to assume that all or even most of those
are still publishing.
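One way to see how much the end_year = 9999 records matter is to count active titles per year twice: once treating them as still publishing, and once leaving them out. The following base R sketch assumes integer `start_year` and `end_year` vectors like those in the JSON download; `count_active` and the four sample records are made up for illustration.

```r
# Hypothetical helper: number of titles active in each requested year.
# open_end is the year substituted for end_year == 9999; with open_end = NA
# those titles are never counted as active, which gives a lower bound.
count_active <- function(start_year, end_year, years, open_end = max(years)) {
  end_year <- ifelse(end_year == 9999L, open_end, end_year)
  vapply(years,
         function(y) sum(start_year <= y & end_year >= y, na.rm = TRUE),
         integer(1))
}

start <- c(1790L, 1800L, 1982L, 1980L)   # illustrative records, not real data
end   <- c(1795L, 1850L, 9999L, 1983L)
yrs   <- c(1792L, 1820L, 1983L, 2021L)

upper <- count_active(start, end, yrs)                  # 9999 = still publishing
lower <- count_active(start, end, yrs, open_end = NA)   # 9999 records excluded
rbind(year = yrs, lower = lower, upper = upper)
```

The gap between the two rows shows how sensitive any per-year count is to the 9999 records. Titles cataloged once per format would still be counted more than once unless they are merged first.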
Might there be some other database that might have this kind of
information?
I ask, because Robert McChesney (2004) The Problem of the Media
(Monthly Review Pr., esp. pp. 34-35) suggests that in the first half of
the nineteenth century, the US had more newspapers and newspaper
publishers per capita than any other place or time. He suggests that
that diversity of newspapers helped encourage literacy and limit
political corruption, both of which helped propel the young US to its
current dominance of the international political economy. I'm hoping to
get some data to evaluate this claim. Sadly, it looks like there is too
much missing and questionable data in this dataset for me to use this
without a fairly substantive data cleaning effort.
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jul 05 2022, 09:05am via System
Hello Spencer,
Thank you for reaching out about your additional questions.
I was looking at the records you mention above, and yes, you are correct
- those 9 records with the date inconsistencies and the one record for
The New Mexican mining news
<https://chroniclingamerica.loc.gov/lccn/sn93061507/> containing
"Santa Fe.\" have typos in them. Thanks for spotting these - it may be possible
to have the cataloger in our division correct those typos. I will look
into this further.
The U.S. Newspaper Directory doesn't have a connection with Wikimedia or
Wikipedia. The Library of Congress periodically pulls the records for
the Directory from OCLC Worldcat
<https://www.oclc.org/en/worldcat.html>. And those newspaper records in
OCLC Worldcat have been created by catalogers at various institutions
around the U.S. over the span of several years. So, occasionally, you
will find a typo in the records. Corrections can be made by OCLC and
library staff at the various institutions. Every time we complete a new
pull on the OCLC records, any corrected records will then populate our
Directory.
Regarding your question on the New-York weekly journal - yes, that is
also correct that it has two records. There is actually a record for
each format of the newspaper, so this record is for the microfilm format
<https://chroniclingamerica.loc.gov/lccn/2009252748/> and this one is
for the original print format
<https://chroniclingamerica.loc.gov/lccn/sn83030211/>. You can see in
the heading for the microfilm record where it says [microfilm reel] and
the print version shows [volume]. You are likely to see this for other
titles as well because each format has been cataloged with its own LCCN.
You are also likely to see additional records with [online resource]
identified as the format as more and more titles are available as
ePrints or online.
I hope this helps answer your additional questions a bit more. Please
reach out if you have any other questions.
Thank you,
Kerry Huller
Newspaper & Current Periodical Reading Room
Serial & Government Publications Division
Library of Congress
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jul 04 2022, 01:47pm via Email
Hi, Kerry:
At the risk of bombing your inbox with more emails than you want,
what is your relationship with Wikipedia and other Wikimedia Foundation
projects like Wikidata?
I ask, because I've logged over 20,000 edits in Wikimedia Foundation
projects since 2010, and I would happily try to answer questions about
Wikidata and other Wikimedia Foundation projects. I have NOT organized
an edit-a-thon, but I've made presentations at conferences with people
who have, and I would happily try to help organize such if you could
find a group of people who want to work to improve this US Newspaper
database. I think it would be good to establish links between this US
Newspaper database and Wikidata, with appropriate procedures so changes
to one could be evaluated for acceptance into the other.
FYI, John Peter Zenger's famous "New-York weekly journal" (1733-1751)
appears TWICE in your database with lccn = 2009252748 and sn83030211 and
ONCE in Wikidata WITHOUT an lccn, even though many other Wikidata items
have an lccn. See:
https://www.wikidata.org/wiki/Q23091960
There's a "WikiProject Newspapers" on Wikipedia and a companion
"WikiProject Periodicals" on Wikidata:
https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Newspapers/Wikidata
https://www.wikidata.org/wiki/Wikidata:WikiProject_Periodicals
I've tried to connect with others on those projects, so far with only
limited success. However, you may know that almost anyone can change
almost anything on Wikipedia and other Wikimedia Foundation projects.
What stays tends to be written from a neutral point of view citing
credible sources. They have problems with vandals, but the problems are
usually easily controlled. This makes Wikipedia and Wikidata very
useful platforms for cleaning up databases like your US Newspaper dataset.
Spencer Graves
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jul 03 2022, 10:39pm via Email
Hello, Kerry:
In addition to the invalid JSON, discussed below [NOTE: The "below"
contains a slight addition to the report I sent last Friday.], I
found 9 (NINE!) cases where start_year was AFTER end_year. These have
lccn = "sn86071531" "sn95069213" "sn90059096" "sn86058451" "sn90060926"
"sn99065409" "sn89065002" "sn98069857" "sn91059179"
See:
https://chroniclingamerica.loc.gov/lccn/sn86071531/
https://chroniclingamerica.loc.gov/lccn/sn95069213/
https://chroniclingamerica.loc.gov/lccn/sn90059096/
https://chroniclingamerica.loc.gov/lccn/sn86058451/
https://chroniclingamerica.loc.gov/lccn/sn90060926/
https://chroniclingamerica.loc.gov/lccn/sn99065409/
https://chroniclingamerica.loc.gov/lccn/sn89065002/
https://chroniclingamerica.loc.gov/lccn/sn98069857/
https://chroniclingamerica.loc.gov/lccn/sn91059179/
These all have obvious coding errors that can be easily fixed. The
data may not be completely accurate after the fix, but at least they are
not obviously wrong ;-)
##################
I got invalid JSON from:
https://chroniclingamerica.loc.gov/search/titles/results/?rows=500&page=103&format=json
After some experimentation, I was able to replicate the problem with
a request for rows=10:
https://chroniclingamerica.loc.gov/search/titles/results/?rows=10&page=5117&format=json
Duncan Temple Lang <dtemplelang at ucdavis.edu>, Professor of Statistics
and Associate Dean for Graduate Programs at the University of California
- Davis, confirmed that it was a JSON error using:
https://codebeautify.org/jsonvalidator
He is part of the core team developing R, the free, open-source
programming language. He said that, starting at offsets 161070 and
161502 in the character string you get from [the R code RCurl::getURL()],
we have:
Santa Fe.\"
and these are in an entry such as
"city": ["Santa Fe.\"]
So the final " is escaped and therefore there is no closing " for the
string. The parser continues to consume characters looking for the end
of that string.
If one "repairs" the text from getURL() with
ftxt= gsub('Santa Fe.\\\\"', 'Santa Fe."', txt)
then the rest of my code worked fine.
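The effect of that repair can be reproduced on a small made-up fragment. This is a base R sketch only: the string `txt` below is invented to mimic the fault, `count_unescaped_quotes` is a hypothetical helper, and counting quotes is a rough stand-in for a real JSON parser (a full check would pass the repaired text to a parser such as jsonlite::fromJSON()).

```r
# Invented fragment with the same fault: the closing quote of "Santa Fe."
# is preceded by a backslash, so a JSON parser treats it as escaped.
txt <- '{"city": ["Santa Fe.\\"], "state": ["New Mexico"]}'

# Hypothetical helper: count double quotes that are not backslash-escaped.
count_unescaped_quotes <- function(s) {
  s <- gsub('\\\\.', '', s)      # drop every backslash escape such as \"
  nchar(gsub('[^"]', '', s))     # what remains are the string delimiters
}

# Same repair as above, written with fixed = TRUE so that the pattern is
# matched literally and needs only one level of backslash escaping.
ftxt <- gsub('Santa Fe.\\"', 'Santa Fe."', txt, fixed = TRUE)

count_unescaped_quotes(txt)  %% 2   # odd: one string is never closed
count_unescaped_quotes(ftxt) %% 2   # even: every string is closed
```

The regular-expression form used in the message and the `fixed = TRUE` form give the same result on this fragment. Both are specific to the "Santa Fe." record and would not catch a stray backslash elsewhere.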
You may wish to do something to implement other checks for valid JSON
and repair this problem. I've scanned all the 157520 records that were
in that database a couple of days ago, and this is the only JSON error
identified by the code I used.
NOTE: I was NOT able to replicate this error when downloading records
one at a time. That suggests a problem NOT in the database itself but
in the download algorithm. ???
Thank you for your help. I will almost certainly have other
questions ;-)
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jul 01 2022, 11:46am via Email
Hello, Kerry:
I got invalid JSON from:
https://chroniclingamerica.loc.gov/search/titles/results/?rows=500&page=103&format=json
After some experimentation, I was able to replicate the problem with
a request for rows=10:
https://chroniclingamerica.loc.gov/search/titles/results/?rows=10&page=5117&format=json
Duncan Temple Lang <dtemplelang at ucdavis.edu>, Professor of Statistics
and Associate Dean for Graduate Programs at the University of California
- Davis, confirmed that it was a JSON error using:
https://codebeautify.org/jsonvalidator
He is part of the core team developing R, the free, open-source
programming language. He said that, starting at offsets 161070 and
161502 in the character string you get from [the R code RCurl::getURL()],
we have:
Santa Fe.\"
and these are in an entry such as
"city": ["Santa Fe.\"]
So the final " is escaped and therefore there is no closing " for the
string. The parser continues to consume characters looking for the end
of that string.
If one "repairs" the text from getURL() with
ftxt= gsub('Santa Fe.\\\\"', 'Santa Fe."', txt)
then the rest of my code worked fine.
You may wish to do something to implement other checks for valid JSON
and repair this problem. I've scanned all the 157520 records that were
in that database a couple of days ago, and this is the only JSON error
identified by the code I used.
Thank you for your help. I will almost certainly have other
questions ;-)
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jun 28 2022, 02:20pm via System
Hello Spencer,
Thank you for sending along your follow-up questions.
I'm glad to hear the json view will work for you. It was recommended to
me that you limit your requests to 500 rows at a time. And a developer
here at LC suggests the following regarding rate limiting:
"To avoid being blocked by the server, the current rate-limiting rules
restrict un-cached requests to URLs starting with
https://chroniclingamerica.loc.gov/search/ to 120 requests every 10
minutes from a single IP address."
So, I think if you limited each of your requests to 500 rows at a time
with the proper pauses, then you should be able to access what you need.
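The quoted limit of 120 requests per 10 minutes works out to one request every 5 seconds. A minimal base R sketch of a paced download loop follows; `fetch_pages` is a hypothetical helper that takes the download function as an argument, so that it can be tried out with a stand-in before pointing it at the real server.

```r
max_requests <- 120                        # from the quoted rate-limiting rule
window_secs  <- 10 * 60                    # 10 minutes
min_pause    <- window_secs / max_requests # seconds to wait between requests

# Hypothetical helper: call fetch() on each URL, pausing between calls.
fetch_pages <- function(urls, fetch, pause = min_pause) {
  out <- vector("list", length(urls))
  for (i in seq_along(urls)) {
    out[[i]] <- fetch(urls[i])
    if (i < length(urls)) Sys.sleep(pause)   # no need to wait after the last
  }
  out
}

# Dry run with a stand-in for the downloader and no pause:
demo <- fetch_pages(c("page-1", "page-2"), fetch = toupper, pause = 0)

# Real use might look like this (assumes the RCurl package is installed):
# pages <- fetch_pages(urls, fetch = RCurl::getURL)
```

Using a pause slightly longer than `min_pause` would leave some margin in case other requests come from the same IP address.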
As for the csv view, I checked on this as well, and was informed that
the csv view was not implemented for all url formats. The csv view was
only implemented for this view:
https://chroniclingamerica.loc.gov/newspapers/
and urls resulting from US Directory search results - e.g., if you
wanted to narrow down your search results by state, city, date range,
etc. found at this link:
https://chroniclingamerica.loc.gov/search/titles/
So, if you wanted a csv and limited your search by state (for example:
https://chroniclingamerica.loc.gov/search/titles/results/?state=Alaska&county=&city=&year1=1690&year2=2022&terms=&frequency=&language=&ethnicity=&labor=&material_type=&lccn=&rows=20&format=csv
), you could append &format=csv to the search result url and get the csv
to automatically download. But, if your search results ended up being
over a couple thousand titles, then the system would probably time out.
I hope this info helps! Let me know if you have any other questions.
Best wishes,
Kerry Huller
Newspaper & Current Periodical Reading Room
Serial & Government Publications Division
Library of Congress
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jun 27 2022, 04:15pm via Email
Hello, Kerry:
Thanks for the reply. Can you please give me some further guidance
on two things "so that the system is not overwhelmed"?
1. The max size in a small batch?
2. Any limit on the number of small batches in a second or minute?
I've found that I can download small batches under program control
using "RCurl::getURL" in R (programming language) using, e.g.;
https://chroniclingamerica.loc.gov/search/titles/results/?rows=20&page=2&format=json
With this, I can control the batch size with "rows=20" vs. "rows=50"
vs., e.g., "rows=1000". A naive search says there are 157520 "results".
With "rows=1000", this would require 158 calls. With "rows=20", it
would require 7876 calls. Before I start, I need to decide which fields
I want; I don't need them all.
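The call counts above follow from dividing the number of results by the batch size and rounding up. A short base R sketch of that arithmetic and of the matching page URLs is given here; `calls_needed` and `page_urls` are hypothetical helpers, and the URL pattern is the one shown in the example request above.

```r
n_results <- 157520                     # "results" reported by the search

# Hypothetical helper: number of requests needed for a given batch size.
calls_needed <- function(rows, n = n_results) ceiling(n / rows)
calls <- calls_needed(c(20, 500, 1000))

# Hypothetical helper: one URL per page for a given batch size.
page_urls <- function(rows, n = n_results) {
  pages <- seq_len(ceiling(n / rows))
  paste0("https://chroniclingamerica.loc.gov/search/titles/results/?rows=",
         rows, "&page=", pages, "&format=json")
}
urls <- page_urls(500)
length(urls)
urls[1]
```

The last page is usually shorter than the others, so the number of records actually returned should be checked against `n_results` after the download.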
Thanks,
Spencer Graves
p.s. I tried appending "&format=csv" and got "Error 504 Ray ID:
7220896da85e86e7 • 2022-06-27 19:19:53 UTC Gateway time-out". I used:
https://chroniclingamerica.loc.gov/search/titles/results/?state=&county=&city=&year1=1690&year2=2022&terms=&frequency=&language=&ethnicity=&labor=&material_type=&lccn=&rows=20&format=csv
I can get what I want using json so do not need csv. However, I
thought you might want to know that I was unable to get csv to work.
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jun 27 2022, 10:54am via System
Hello Spencer,
Thank you for contacting the Library of Congress about searching the US
Newspaper Directory. I wanted to follow up with you regarding your
request to output the data in a machine readable format.
It looks like you were provided the link to the API documentation for
the website: About the Site and API
<https://chroniclingamerica.loc.gov/about/api/>. Scroll down to the
section with the heading, Searching the directory and newspaper pages
using OpenSearch. This section describes the search functionality and
structure for the US Newspaper Directory in more detail. It is possible
to return your directory searches in json format by appending
&format=json to the end of the url. It is also possible to return search
results in csv format by appending &format=csv to the end of the url,
but I would strongly suggest that you do this in small batches by
putting limits on your search so that the system is not overwhelmed.
So, from the search page for the US Newspaper Directory
<https://chroniclingamerica.loc.gov/search/titles/> you could
potentially limit your search based on state and city, or date range,
and/or even frequency. Then once you've completed the search, you can
add &format=csv to the end of the url to automatically download a csv of
those records. The resulting csv will contain several fields/headers:
lccn, title, place of publication, start year, end year, publisher,
edition, frequency, subject, state, city, country, language, oclc
number, and holding type. I think these fields include the information
you were looking for. But, again, I would like to stress that you put
limits on your search before creating the csv so as not to overwhelm the
system.
Please let me know if you have any other additional questions.
Best wishes,
Kerry Huller
Newspaper & Current Periodical Reading Room
Serial & Government Publications Division
Library of Congress
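A restricted search of the kind described in this message might be assembled in R as sketched below. The helper search_url is hypothetical; the parameter names (state, year1, year2, rows, format) are copied from the search URL quoted elsewhere in this thread, while the accepted values (for example, that a state is given by its full name) are an assumption that should be checked against the search page.

```r
# Build a restricted directory search URL (hypothetical helper).
# Parameter names come from the search URL quoted in this thread;
# the value conventions are assumed, not confirmed.
search_url <- function(state = "", year1 = 1690, year2 = 2022,
                       rows = 20, format = "csv") {
  base  <- "https://chroniclingamerica.loc.gov/search/titles/results/"
  query <- c(state = state, year1 = year1, year2 = year2,
             rows = rows, format = format)
  # Percent-encode each value so that names such as "New York" are safe.
  values <- vapply(query, utils::URLencode, character(1), reserved = TRUE)
  paste0(base, "?", paste(paste0(names(query), "=", values), collapse = "&"))
}

# One URL per decade for a single state keeps each request small.
decades <- seq(1690, 2020, by = 10)
urls <- vapply(decades,
               function(y) search_url("Alabama", y, y + 9),
               character(1))
urls[12]  # the 1800-1809 batch
```

Splitting by state and decade is just one way to limit each request; splitting by city or frequency, as suggested in the message, would work the same way with extra parameters.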
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jun 23 2022, 01:55pm via System
Mr. Graves,
I'm going to transfer your request to a member of our digital collections
team who may be of more assistance to you than me.
Mike
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jun 23 2022, 01:51pm via Email
Dear Mr. Queen:
Thanks for the reply. I'm still confused. I downloaded and
installed Docker Desktop and "docker-compose.yml" and ran their
"Getting Started" Tutorial, but I don't see what to do next.
I repeat: I'd like to analyze "U.S. Newspaper Directory,
1690-Present" (https://chroniclingamerica.loc.gov/search/titles/), which
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jun 22 2022, 07:15pm via System
Mr. Graves,
Programmatic access to the data for Chronicling America
<https://chroniclingamerica.loc.gov/> and possibly the U.S. Newspaper
Directory <https://chroniclingamerica.loc.gov/search/titles/> can be
found on the About the Site and API
<https://chroniclingamerica.loc.gov/about/api/> page in various formats.
Also, please note that Chronicling America contains newspapers published
from 1777-1963, but does not include every U.S. newspaper published in
that time period.
Please let me know if I can be of further assistance.
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jun 22 2022, 06:14pm via Email
Dear Mr. Queen:
Can we simplify this to just giving me the data behind "U.S.
Newspaper Directory, 1690-Present"
(https://chroniclingamerica.loc.gov/search/titles/) in a machine
readable format, e.g., csv or xlsx or a MySQL database?
As I mentioned in my original email, a naive search of that without
restrictions returned 157520 titles in 7876 pages with up to 20 titles
per page giving date ranges in at least some cases. I could probably
write software to scrape those 7876 pages from your web site and combine
them into a data file.
I have a PhD in statistics, I have been using the R programming
language and similar software for decades. This includes publishing
tutorials on how to analyze data like this on Wikiversity.[1] I'd like
to do something similar with this. I could help make your data more
useful to others and discuss with you how we might prioritize
improvements like accessing the other sources you mentioned.
Thanks very much for your reply.
Sincerely,
Spencer Graves, PhD
Founder, EffectiveDefense.org
4550 Warwick Blvd 508
Kansas City, MO 64111
m: 408-655-4567
[1] e.g.:
https://en.wikiversity.org/wiki/US_Gross_Domestic_Product_(GDP)_per_capita
------------------------------------------------------------------------
Newspapers and Current Periodicals Reference Librarian
Jun 22 2022, 05:27pm via System
Mr. Graves
Your request is a little more complex than it first appears and requires
extensive research. A variety of resources should be consulted to
determine the circulation statistics of newspapers published prior to
1851. You will need to check newspaper union lists and newspaper
histories. Union lists present lists of newspapers in geographic
arrangement according to place of publication, and specify which
libraries or other institutions hold collections of those newspapers and
the dates of their holdings. These can also be useful for tracking title
changes throughout a newspaper's history. Newspaper
histories like American Journalism: A History: 1690-1960
<https://lccn.loc.gov/62007157> (Mott), The Penny Press
<https://lccn.loc.gov/2004043078> (Thompson), and The Press and America
<https://lccn.loc.gov/99044295> (Emery et al.) may not include
circulation statistics, but they do document the diversity and progress
of newspaper publishing, including notable newspapers of the era.
Newspaper histories also cover the history of the printers and printing
of newspapers in a state, county, or region more generally, and provide
more condensed histories of the editors, journalists, and evolution of
the newspapers in a specific area. Newspaper histories and union lists
should be available at most large public or university libraries. More
information about union lists, newspaper histories, and researching
newspapers in general can be found in the U.S. Newspaper Collections at
the Library of Congress
<https://guides.loc.gov/united-states-newspapers/introduction> research
guide (see Reference Sources).
Please let me know if I can be of further assistance.
------------------------------------------------------------------------
Original Question
Jun 20 2022, 02:34pm via System
How can I get counts of the numbers of newspapers by year in the US, and
preferably also elsewhere?
A search of "U.S. Newspaper Directory, 1690-Present"
(https://chroniclingamerica.loc.gov/search/titles/) returned 157520
titles in 7876 pages with up to 20 titles per page giving date ranges to
the extent that it's known. If I can get a data file (e.g., csv or xls),
I can summarize. I could also use data on circulation and frequency and
especially parent company for multiple newspapers published by the same
company, to the extent that such is available.
I'm interested in this, because McChesney quoted Tocqueville in
suggesting that the US had more newspapers per person (or per million
population) prior to 1851 than at any other time or place in history.
I'd like to evaluate that claim with data to the extent that I can. See
"https://en.wikiversity.org/wiki/Social_construction_of_crime_and_what_we_can_do_about_it#Newspapers_1790_-_present".
Thanks, Spencer Graves, PhD
m: 408-655-4567
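Once start and end years are available for each title, the per-year count asked for here could be computed along the following lines. This is a minimal sketch on made-up toy data: the data frame, its column names and the cut-off year are all hypothetical, and a missing end year is assumed to mean the title is still published.

```r
# Toy data: start/end years for four invented titles (NA end = still published).
titles <- data.frame(
  title = c("A", "B", "C", "D"),
  start = c(1800, 1820, 1845, 1990),
  end   = c(1830, 1850, NA, 2010)
)

# Treat an open-ended run as lasting until the last year of interest (assumed).
last_year  <- 2022
end_filled <- ifelse(is.na(titles$end), last_year, titles$end)

# Number of titles whose publication span covers a given year.
active_in <- function(year) {
  sum(titles$start <= year & end_filled >= year)
}

years  <- c(1810, 1825, 1850, 2000)
counts <- vapply(years, active_in, integer(1))
data.frame(year = years, newspapers = counts)
```

Dividing such counts by population in millions for the same years would give the "newspapers per million" figure mentioned in the question; the population series would have to come from another source.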
------------------------------------------------------------------------
Thank you for using Newspapers & Current Periodicals Ask a Librarian
Service!
This email is sent from Ask a Librarian in relationship to ticket #9625195.
Read our privacy policy. <https://springshare.com/privacy.html>
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
On Wed, 27 Jul 2022 15:50:55 -0500
Spencer Graves <spencer.graves at effectivedefense.org> wrote:

> What would you suggest I do to parse the following XML file into a
> list that I can understand:
>
> XMLfile <-
> "https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/ndnp_Alabama_all-yrs_e_0001_0050.xml"

> XMLdat <- XML::xmlParse(XMLdata)
> str(XMLdat)

Isn't XMLdat already a tree-like list? For example,
XMLdat[[1]][[1]][[3]][[1]] is the first <record> tag in the file, which
you can further pick apart.

What information do you need from this file and how would you like to
access it? Parsing XML files is typically achieved with XPath
expressions (e.g. 'under every <record> tag, extract the <datafield>
tags containing attribute tag="042"' would look like
'record/datafield[@tag="042"]') and/or handlers on specific tags, not by
extracting all text nodes and performing string operations on them.

--
Best regards,
Ivan
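As a rough illustration of the XPath approach, here is a sketch using xml2 on a cut-down copy of the first record quoted in this thread. Two points are assumptions of this sketch rather than anything stated above: the prefixes "srw" and "marc" are arbitrary labels chosen here for the two default namespaces in the file, and get_subfields is a hypothetical helper. In XPath an attribute test is written with "@", and because the MARC elements live in a namespace, each step needs a prefix.

```r
library(xml2)

# Cut-down copy of the first record quoted in the thread.
xml_txt <- '<?xml version="1.0" encoding="UTF-8"?>
<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
  <version>1.1</version>
  <numberOfRecords>2250</numberOfRecords>
  <records>
    <record>
      <recordSchema>info:srw/schema/1/marcxml</recordSchema>
      <recordPacking>xml</recordPacking>
      <recordData>
        <record xmlns="http://www.loc.gov/MARC21/slim">
          <controlfield tag="001">1030438981</controlfield>
          <datafield ind1=" " ind2=" " tag="042">
            <subfield code="a">nsdp</subfield>
            <subfield code="a">pcc</subfield>
          </datafield>
          <datafield ind1="0" ind2="0" tag="245">
            <subfield code="a">Selma sun.</subfield>
          </datafield>
          <datafield ind1=" " ind2=" " tag="310">
            <subfield code="a">Weekly</subfield>
          </datafield>
        </record>
      </recordData>
    </record>
  </records>
</searchRetrieveResponse>'

doc <- read_xml(xml_txt)

# Prefix labels are our own choice; the URIs are the ones declared in the file.
ns <- c(srw  = "http://www.loc.gov/zing/srw/",
        marc = "http://www.loc.gov/MARC21/slim")

# Only the inner MARC <record> elements match, not the outer SRW wrappers.
recs <- xml_find_all(doc, ".//marc:record", ns)

# Text of all subfields with a given code under a given datafield tag
# (hypothetical helper).
get_subfields <- function(rec, tag, code = "a") {
  xp <- sprintf("./marc:datafield[@tag='%s']/marc:subfield[@code='%s']",
                tag, code)
  xml_text(xml_find_all(rec, xp, ns))
}

# One row per record: title (245$a) and frequency (310$a).
rows <- lapply(seq_along(recs), function(i) {
  rec <- recs[[i]]
  data.frame(title     = paste(get_subfields(rec, "245"), collapse = " "),
             frequency = paste(get_subfields(rec, "310"), collapse = "; "))
})
result <- do.call(rbind, rows)
result
```

The same loop should carry over to a full file read with read_xml(XMLfile), with more calls to get_subfields for whichever MARC tags turn out to be needed.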
What do you mean by "a list that I can understand"?
A quick tally of the number of XML elements by identifier:

   1 echoedSearchRetrieveRequest
   1 frbrGrouping
   1 maximumRecords
   1 nextRecordPosition
   1 numberOfRecords
   1 query
   1 records
   1 resultSetIdleTime
   1 searchRetrieveResponse
   1 servicelevel
   1 sortKeys
   1 startRecord
   1 wskey
   2 version
  50 leader
  50 recordData
  51 recordPacking
  51 recordSchema
 100 record
 105 controlfield
 923 datafield
1900 subfield

What of this information do you actually want?
The elements of the list should be what?

On Thu, 28 Jul 2022 at 08:52, Spencer Graves <
spencer.graves at effectivedefense.org> wrote:

> Hello, All:
>
> What would you suggest I do to parse the following XML file into a
> list that I can understand:
>
> XMLfile <-
> "https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/ndnp_Alabama_all-yrs_e_0001_0050.xml"
>
> This is the first of 6666 XML files containing "U.S. Newspaper
> Directory" maintained by the US Library of Congress discussed in the
> thread below. I've tried various things using the XML and xml2.
>
> XMLdata <- xml2::read_xml(XMLfile)
> str(XMLdata)
> XMLdat <- XML::xmlParse(XMLdata)
> str(XMLdat)
> XMLtxt <- xml2::xml_text(XMLdata)
> nchar(XMLtxt)
> #[1] 29415
>
> Someplace there's a schema for this. I don't know if it's embedded
> in this XML file or in a separate file. If it's in a separate file, how
> could I describe it to my contacts with the Library of Congress so they
> would understand what I needed and could help me get it.
>
> Thanks,
> Spencer Graves
>
> p.s. All 29415 characters in XMLtext appear in the thread below.
>
> -------- Forwarded Message --------
> Subject: [Newspapers and Current Periodicals] How can I get counts of
> the numbers of newspapers by year in the US, and preferably also
> elsewhere? A search of "U.S.
Newspaper Directory,
> Date: Wed, 27 Jul 2022 14:59:03 +0000
> From: Kerry Huller <serials at ask.loc.gov>
> To: Spencer Graves <spencer.graves at effectivedefense.org>
> CC: twes at loc.gov
>
> --# Type your reply above this line #--
>
> ------------------------------------------------------------------------
>
> Newspapers and Current Periodicals Reference Librarian
>
> Jul 27 2022, 10:59am via System
>
> Hello Spencer,
>
> So, when I view the xml, I'm actually looking at it in XML editor
> software, so I can view the tags and it's structured neatly. I've copied
> and pasted the text from the beginning of the file and the first
> newspaper title below from my XML editor:
>
> <?xml version="1.0" encoding="UTF-8" standalone="no"?>
> <?xml-stylesheet type='text/xsl'
> href='/webservices/catalog/xsl/searchRetrieveResponse.xsl'?>
>
> <searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/"
> xmlns:oclcterms="http://purl.org/oclc/terms/"
> xmlns:dc="http://purl.org/dc/elements/1.1/"
> xmlns:diag="http://www.loc.gov/zing/srw/diagnostic/"
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
> <version>1.1</version>
> <numberOfRecords>2250</numberOfRecords>
> <records>
>   <record>
>     <recordSchema>info:srw/schema/1/marcxml</recordSchema>
>     <recordPacking>xml</recordPacking>
>     <recordData>
>       <record xmlns="http://www.loc.gov/MARC21/slim">
>         <leader>00000nas a22000007i 4500</leader>
>         <controlfield tag="001">1030438981</controlfield>
>         <controlfield tag="008">180404c20159999aluwr n 0 a0eng
>         </controlfield>
>         <datafield ind1=" " ind2=" " tag="010">
>           <subfield code="a"> 2018200464</subfield>
>         </datafield>
>         <datafield ind1=" " ind2=" " tag="040">
>           <subfield code="a">DLC</subfield>
>           <subfield code="e">rda</subfield>
>           <subfield code="c">DLC</subfield>
>           <subfield code="b">eng</subfield>
>         </datafield>
>         <datafield ind1=" " ind2=" " tag="012">
>           <subfield code="m">1</subfield>
>         </datafield>
>         <datafield ind1="0" ind2=" " tag="022">
>           <subfield code="a">2577-5316</subfield>
>           <subfield code="2">1</subfield>
>         </datafield>
>         <datafield ind1=" " ind2=" " tag="032">
>           <subfield code="a">021110</subfield>
>           <subfield code="b">USPS</subfield>
>         </datafield>
>         <datafield ind1=" " ind2=" " tag="037">
>           <subfield code="b">711 Alabama Avenue, Selma, AL 36701</subfield>
>         </datafield>
>         <datafield ind1=" " ind2=" " tag="042">
>           <subfield code="a">nsdp</subfield>
>           <subfield code="a">pcc</subfield>
>         </datafield>
>         <datafield ind1="1" ind2="0" tag="050">
>           <subfield code="a">ISSN RECORD</subfield>
>         </datafield>
>         <datafield ind1="1" ind2="0" tag="082">
>           <subfield code="a">071</subfield>
>           <subfield code="2">15</subfield>
>         </datafield>
>         <datafield ind1=" " ind2="0" tag="222">
>           <subfield code="a">Selma sun</subfield>
>         </datafield>
>         <datafield ind1="0" ind2="0" tag="245">
>           <subfield code="a">Selma sun.</subfield>
>         </datafield>
>         <datafield ind1=" " ind2="1" tag="264">
>           <subfield code="a">Selma, AL :</subfield>
>           <subfield code="b">North Shore Press, LLC</subfield>
>           <subfield code="c">2016-</subfield>
>         </datafield>
>         <datafield ind1=" " ind2=" " tag="310">
>           <subfield code="a">Weekly</subfield>
>         </datafield>
>         <datafield ind1=" " ind2=" " tag="336">
>           <subfield code="a">text</subfield>
>           <subfield code="b">txt</subfield>
>           <subfield code="2">rdacontent</subfield>
>         </datafield>
>         <datafield ind1=" " ind2=" " tag="337">
>           <subfield code="a">unmediated</subfield>
>           <subfield code="b">n</subfield>
>           <subfield code="2">rdamedia</subfield>
>         </datafield>
>         <datafield ind1=" " ind2=" " tag="338">
>           <subfield code="a">volume</subfield>
>           <subfield code="b">nc</subfield>
>           <subfield code="2">rdacarrier</subfield>
>         </datafield>
>         <datafield ind1="1" ind2=" " tag="362">
>           <subfield code="a">Began in 2015.</subfield>
>         </datafield>
>         <datafield ind1=" " ind2=" " tag="588">
>           <subfield code="a">Description based on: Volume 2, Issue 40
>           (October 5, 2017) (surrogate); title from caption.</subfield>
>         </datafield>
>         <datafield ind1=" " ind2=" " tag="588">
>           <subfield code="a">Latest issue consulted: Volume 2, Issue 40
>           (October 5, 2017).</subfield>
>         </datafield>
>         <datafield ind1=" " ind2=" " tag="752">
>           <subfield code="a">United States</subfield>
>           <subfield code="b">Alabama</subfield>
>           <subfield code="c">Dallas</subfield>
>           <subfield code="d">Selma.</subfield>
>         </datafield>
>       </record>
>     </recordData>
>   </record>
>
> When I view the records in the XML editor, these 2 lines below do begin
> each of the records for each individual title, but of course this is
> including the xml tags:
>
> <recordSchema>info:srw/schema/1/marcxml</recordSchema>
> <recordPacking>xml</recordPacking>
>
> Hopefully this helps you decide where to break or parse each record.
>
> On another note, I just noticed as well that at the top of this first
> file it lists the total number of records for the Alabama grouping -
> 2250. This also appeared to be the case for the Alaska records when I
> took a look at the first one for that state. I imagine that should be
> consistent throughout each "grouping" of records.
>
> Let me know if you have follow-up questions!
>
> Best wishes,
>
> Kerry Huller
> Newspaper & Current Periodical Reading Room
> Serial & Government Publications Division
> Library of Congress
>
> ------------------------------------------------------------------------
>
> Newspapers and Current Periodicals Reference Librarian
>
> Jul 27 2022, 10:21am via Email
>
> Hi, Kerry:
>
> Thanks. I understand the chunking in files of at most 50. I've read
> the first file "ndnp_Alabama_all-yrs_e_0001_0050.xml" into a string of
> 29415 characters, copied below. Might you have any suggestions on the
> next step in parsing this? Staring at it now, it looks splitting on
> "info:srw/schema/1/marcxmlxml" might convert the 29415 characters into
> shorter chunks, each of which could then be parsed further.
> > > This is not as bad as reading ancient Egyptian heiroglyphics without > the Rosetta Stone, but I wondered if you might have something that could > make this work easier and more reliable? I guess I could compare with > what I already read as JSON ;-) > > > Thanks, > Spencer Graves > > > "1.12250info:srw/schema/1/marcxmlxml00000nas a22000007i > 45001030438981180404c20159999aluwr n 0 a0eng > 2018200464DLCrdaDLCeng12577-53161021110USPS711 Alabama Avenue, Selma, AL > 36701nsdppccISSN RECORD07115Selma sunSelma sun.Selma, AL :North Shore > Press, > LLC2016-WeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan > in > 2015.Description based on: Volume 2, Issue 40 (October 5, 2017) > (surrogate); title from caption.Latest issue consulted: Volume 2, Issue > 40 (October 5, 2017).United > StatesAlabamaDallasSelma.info:srw/schema/1/marcxmlxml00000cas a22000007a > 4500502150053100127c20109999aluwr n 0 a0eng > 2010200019DLCengDLCDLCOCLCQ112153-18111750USPSB & C Publishing, LLC, > 3514 Martin St. S. Ste 104, Cropwell, AL 35054pccnsdpISSN RECORDSt. > Clair County news (Cropwell, Ala.)St. Clair County news(Cropwell, > Ala.)St. Clair County news.Cropwell, AL :B & C Pub.WeeklyBegan in > 2010.Description based on: Nov. 4, 2010 (surrogate); title from > caption.info:srw/schema/1/marcxmlxml00000cas a22000007a > 4500426491872090720c20099999alumr n 0 a0eng > 2009203372DLCengDLCOCLCQ12150-346X2150-346X1AU at 000044489617NZ116076352Devon > Applewhite/Applewhite Publishing Co., 1910 Honeysuckle Rd., #N183, > Dothan, AL 36305mscnsdpISSN RECORD30514Triangle tribune(Dothan, > Ala.)Triangle tribune.Dothan, AL :Applewhite Pub. CoMonthlyBegan with > vol. 1, issue 1 (May 2009).\"Connecting the Tri-State African -American > Community.\"Description based on: Vol. 
1, issue 1 (May 2009); title from > masthead.Applewhite, Devon.United StatesAlabama.United > StatesGeorgia.United StatesFlorida.info:srw/schema/1/marcxmlxml00000cas > a22000007a 4500289017315081219c20089999aluwr n | a0eng c > 2008213218NSDengNSDOCLCQDLCOCLCQ111945-93191945-93191005270USPSSpringhill > Publications, > LLC, P.O. Box 186, Greenville, AL 36037nsdppccISSN RECORD07014Greenville > standardThe Greenville standard.Greenville, AL :Springhill > PublicationsWeeklytexttxtrdacontentunmediatednrdamediaBegan with vol. 1, > issue 1 (Sept. 3, 2008)Description based on surrogate of: Vol. 1, no. 15 > (Dec. 18, 2008); title from masthead (publisher's Web site, viewed Dec. > 19, 2008).Latest issue consulted: Vol. 1, no. 99 (July 27, 2011) > (surrogate).info:srw/schema/1/marcxmlxml00000cas a22000007a > 4500123539969070426c20079999aluwr ne 0 a0eng c > 2007212138NSDengNSDNSDOCLCQ101936-95571936-95571The Western Tribune, > 1530 Third Ave. N., Bessemer, AL 35020mscnsdpISSN RECORDWestern tribune > (Bessemer, Ala.)The Western tribune(Bessemer, Ala.)The Western > tribune.Bessemer, Ala. :D-Med, Inc.v.WeeklyBegan in 2007.Description > based on: May 23, 2007 (surrogate); title from > caption.AU at 000041575341info:srw/schema/1/marcxmlxml00000cas a22000007a > 4500226300653080425c20079999aluwr ne | a0eng > 2008212112NSDengNSDNSDOCLCQ11942-20751942-20751nsdppccISSN RECORDThe > corridor messengerThe corridor messenger.Carbon Hill, AL :Corridor > Messenger, Inc.WeeklyBegan with vol. 1, issue (10.03.2007).Description > based on: 1st issue.United StatesAlabamaWalkerCarbon > Hill.http://www.corridormessenger.cominfo:srw/schema/1/marcxmlxml00000cas > a22000007a > 450077560432070109c20069999aluwr ne 0 a0eng c > > 2007213400NSDengNSDOCLCQAUBRNOCLCOOCLCFa01935-37901935-37901AU at 000041190283The > > Auburn Villager, P.O. Box 1633, Auburn, AL 36831-1633pccnsdpISSN > RECORDThe Auburn villagerThe Auburn villager.Auburn, AL :Auburn > Villagerv.WeeklyBegan in 2006.Description based on: Vol. 1, no. 
4 (July > 20, 2006) (surrogate); title from caption.Auburn (Ala.)Newspapers.Lee > County (Ala.)Newspapers.AlabamaAuburn.fast(OCoLC)fst01209634AlabamaLee > County.fast(OCoLC)fst01211930Newspapers.fast(OCoLC)fst01423814United > StatesAlabamaLeeAuburn.info:srw/schema/1/marcxmlxml00000cas a2200000Ii > 4500872286785m o d s cr mn|---a||||140311c20069999alucr n o b > s0 a0eng cABCengrdaABCABCOCLCFLD59.13University of Alabama at > Birmingham.The eReporter.[Birmingham, Alabama] :The University of > Alabama at Birmingham,[2006]-[Birmingham, Alabama] :Offices of Public > Relations & Marketing and Information Technology1 online resource2 > issues weeklytexttxtrdacontentcomputercrdamediaonline > resourcecrrdacarrierSeptember 19, 2006-\"The eReporter is an official > communication of The University of Alabama at Birmingham, companion to > the UAB Reporter and recommended alternative to mass e-mails.\"Issues > for <March 11, 2014- published and distributed via e-mail subscription > on Tuesdays and Fridays.Description based on: September 19, 2006; title > from title screen (viewed March 12, 2014).University of Alabama at > BirminghamPeriodicals.Periodicals.fast(OCoLC)fst01411641University of > Alabama at Birmingham.fast(OCoLC)fst00645114University of Alabama at > Birmingham.Office of Public Relations and Marketing.University of > Alabama at Birmingham.Information Technology.2006-2012, companion > to:University of Alabama at Birmingham.UAB > reporter.(OCoLC)32435748Archived > issueshttp:// > hatteras.dpo.uab.edu/cgi-bin/ereporter.cgiinfo:srw/schema/1/marcxmlxml00000cas > > a22000007a 4500166387050070829c20059999aluwr ne | a0eng c > 2007215501NSDengNSDOCLCQ11939-68991939-68991The Wilkie Clark Memorial > Foundation, P.O. Box 514, Roanoke, AL 36274$30.00nsdpmscISSN > RECORD305.89614People's voice (Roanoke, Ala.)The people's voice(Roanoke, > Ala.)The people's voice.Roanoke, AL :Wilkie Clark Memorial > Foundationv.WeeklyBegan with vol. 1, no. 1 in 2005.Description based on: > Vol. 
2, no. 20 (Apr. 20, 2007); title from caption.Wilkie Clark Memorial > Foundation.United > StatesAlabamaRandolphRoanoke.AU at 000042141390info:srw/schema/1/marcxmlxml00000nas > > > a22000007i 45001124677787191021c20uu9999aluwr ne | a0eng > 2019202521DLCengrdaDLC12689-3258122730USPSNorth Jackson Press, 42950 Hwy > 72, Suite 406, Stevenson, AL 35772nsdppccISSN RECORD071.323North Jackson > pressNorth Jackson press.Stevenson, AL :Caney Creek Publications > LLCWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierDescription > based on surrogate of: Volume 1, number 36 (October 11, 2019); title > from masthead.Latest issue consulted: Volume 1, number 36 (October 11, > 2019) (Surrogate).United > StatesAlabamaJacksonStevensoninfo:srw/schema/1/marcxmlxml00000cas > a2200000 a 4500226315099080428d19981998aluwr ne | 0eng c > 2008233691GUAengGUAOCLCQOCLCFOCLCO39911644pccn-us-gaThe Dekalb > news.Birmingham, Ala. :Community newspaper holdings Inc.v.WeeklyBegan > with 1st year, no. 1 (Apr. 1, 1998); ceased with 1st year, no. 31 (Oct. > 28, 1998).Final issue consulted.Description based on first issue; title > from caption.Decatur (Ga.)Newspapers.DeKalb County > (Ga.)Newspapers.Newspapers.fast(OCoLC)fst01423814GeorgiaDecatur.fast(OCoLC)fst01226234GeorgiaDeKalb > > > County.fast(OCoLC)fst01215288United > StatesGeorgiaDeKalbDecatur.Decatur-DeKalb news/era(DLC)sn > 89053661(OCoLC)19946163info:srw/schema/1/marcxmlxml00000cas a2200000 i > 450050263311m o d cr cn|||||||||020730c19979999alu x neo > 0 a0eng c > > 2015238492AMHengrdapnAMHOCLCQOCLCFOCLCOIULOCLHTMOCLCQCOODLC66460694810970435082687-93791AU at 000050711528OCLCS45109pccnsdpn-us---AP2.B5707023Birmingham > > weekly (Online)Birmingham weekly(Online)Birmingham weekly.Birmingham, AL > :Birmingham Weekly1 online resourceIrregular,Feb. 16-28, > 2012-Weekly,Sept. 4-11, 1997-Feb. 9-16, > 2012texttxtrdacontentcomputercrdamediaonline resourcecrrdacarrierBegan > with vol. 1, issue 1 (Sept. 
4-11, 1997).\"City news, views & > entertainment\"--Cover.Numbering dropped in Mar. 2012.Also issued in > print.Description based on: Publication information from ProQuest; title > from web page (viewed June 18, 2015).Latest issue consulted: Aug. 15-20, > 2012.Birmingham (Ala.)Newspapers.Internet resources.Electronic > journals.AlabamaBirmingham.fast(OCoLC)fst01204958Newspapers.fast(OCoLC)fst01423814United > > > StatesAlabamaBirmingham.Print version:Birmingham > Weekly(OCoLC)39271050 > http://apw.softlineweb.com/http://WC2VB5MT8E.search.serialssolutions.com/?sid=sersol&SS_jc=JC_000051895&title=Birmingham+Weeklyinfo:srw/schema/1/marcxmlxml00000cas > > a22000007a 450031471314941116d19941995aluwr ne 0 a0eng csn > 94003083 > NSDengNSDANEOCLCQOCLCFOCLCOOCLCQ11079-65411079-65411nsdppccn-us-akSoutheast > shopperSoutheast shopper.Juneau, Alaska :Kemper > Communications,1994-volumesWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. > > > 1, no. 1 (Nov. 16, 1994)-Ceased in Feb. 1995.Juneau > (Alaska)Newspapers.AlaskaJuneau.fast(OCoLC)fst01213587Newspapers.fast(OCoLC)fst01423814United > > > StatesAlaskaJuneau.AU at 000011356572info:srw/schema/1/marcxmlxml00000cas > a22000008a 450027910515930413c19949999alumr n 0 a0eng dsn > 93002581 NSDengNSDOCLCQ11069-06621Birmingham Tribune, 216 Ave. T. Pratt > City, Birmingham, AL 35214nsdpBirmingham tribuneBirmingham > tribune.Birmingham, Ala. :Kervin > Fondren9501volumesMonthlytexttxtrdacontentunmediatednrdamediavolumencrdacarrierPREPUB: > > > publication expected Jan. > 1995AU at 000025863987info:srw/schema/1/marcxmlxml00000cas a22000007a > 450026199931920716d19922013alumr ne 0 a0eng csn 92003357 > NSDengNSDOCLOCLCQDLC011064-01341064-01341Black & White, POB 13215, > Birmingham, AL 35202-3215nsdppccBlack & white (Birmingham, Ala.)Black & > white(Birmingham, Ala.)Black & white.Black and whiteBirmingham, Ala. > :Black & White, Inc.v.Biweekly,Oct. 2, 1997-Monthly,May 1, 1992-Sept. > 1997Began in May 1992; ceased with Jan. 
10, 2013.\"Birmingham's New City > paper.\"Description based on: June 1992.Latest issue consulted: No. 67 > (Oct. 16, 1997) (surrogate).info:srw/schema/1/marcxmlxml00000cas > a2200000 a 450032145723950314d19901999alumr ne 0 a0eng csn > 95068755 > > MGNengMGNNSDCLUOCLCQOCLCFOCLCOOCLCA971211082-34841082-34841AU at 000011579542nsdppccn-us-alF335.J5S68The > > Southern shofarThe Southern shofar.Birmingham, AL :L. Brook,-[1999]v. > :ill. ;35 cm.MonthlyBegan in 1990.-v. 9, issue 9 (Aug./Sept. 1999).\"The > monthly newspaper of Alabama's Jewish community.\"Some issues also > available on the Internet via the World Wide Web.Description based on: > Vol. 3, issue 11 (Oct. 1993).Jewish newspapersAlabama.Jewish > newspapers.fast(OCoLC)fst00982872Alabama.fast(OCoLC)fst01204694United > StatesAlabamaJeffersonBirmingham.Deep South Jewish voice(DLC)sn > 99018499(OCoLC)42431704CLUhttp:// > bibpurl.oclc.org/web/719http://www.bham.net/shofar/info:srw/schema/1/marcxmlxml00000cas > > a22000007a 450021265141900326c19909999aluwr ne 0 a0eng csn > 90099004 AARengAARCPNNSDOCLCQ11050-08981050-08981005022USPSE.O.N., Inc., > Main St., Eclectic, AL 36024pccnsdpISSN RECORDThe Eclectic observerThe > Eclectic observer.Eclectic, Ala. :E.O.N., Inc.,1990-v.WeeklyVol. 1, no. > 1 (Feb. 22, 1990)-Published by: Price Publications, Inc., <2006->Latest > issue consulted: Vol. 17, no. 1 (Jan. 5, 2006).United > StatesAlabamaElmoreEclectic.AU at 000040212446info:srw/schema/1/marcxmlxml00000cas > > > a22000007a 450021214781900314c19909999aluir ne 0 a0eng csn > 90002457 AAAengAAANSDOCLCQ111050-20841050-20841931180USPSClanton > Newspapers, 1109 Seventh St., N., PO Box 1379, Clanton, AL > 35045nsdppccn-us-alThe Clanton advertiserThe Clanton > advertiser.AdvertiserClanton, Ala. :Clanton Newspapersv. :ill. ;58 > cm.Three no. a week,<May 13, 1992->Semiweekly,<Apr. 4, 1990->Began in > Jan. 1990.Description based on: Vol. 19, no. 27 (Wed., Apr. 4, > 1990).Latest issue consulted: Vol. 22, no. 
58 (May 13, 1992).United > StatesAlabamaChiltonClanton.Independent advertiser (Clanton, > Ala.)(OCoLC)21214732AU at 000025908452info:srw/schema/1/marcxmlxml00000cas > a2200000 a 450021214814900314c19909999aluwr ne 0 a0eng dsn > 90099009 AAAengAAACPNNSDOCLCQ11056-32881056-32881505740USPSThe Blount > Countian, 3rd St. at Washington Ave., PO Box 310, Oneonta, AL > 35121mscnsdpn-us-alThe Blount countianThe Blount countian.Oneonta, Ala. > :Southern Democrat, Inc.,1990-v. :ill.WeeklyVol. 1, no. 1 (Jan. 3, > 1990)-Editor: Molly Howard Ryan, 1990-Latest issue consulted: Vol. 1, > no. 36 (Sept. 5, 1990).Ryan, Molly Howard.United > StatesAlabamaBlountOneonta.Southern Democrat(DLC)sn > 85044741(OCoLC)12038577AU at 000025884049info:srw/schema/1/marcxmlxml00000cas > a22000007a 450022413044900920c19909999aluwr ne 0 a0eng dsn > 90099011 > AARengAARCPNNSDNSTOCLCQ92081707011191053-91231053-91231314240USPSmscnsdpThe > Clay times-journalThe Clay times-journal.Lineville, Ala. :C.L. > Proctor,1990-v.WeeklyVol. 1, no. 1 (Sept. 6, 1990)-United > StatesAlabamaClayLineville.Ashland progress(DLC)sn 85044701Lineville > tribune(DLC)sn 85044702AUinfo:srw/schema/1/marcxmlxml00000cas a22000007a > 450021265218900326c19909999aluwr ne 0 0eng dsn 90099005 > AARengAARCPNOCLCQmscTrussville news-journal.Trussville, Ala. :Mike > Mitchell,1990-v.BimonthlyVol. 1, no. 1 (Feb. 20, 1990)-United > StatesAlabamaJeffersonTrussville.info:srw/schema/1/marcxmlxml00000cas > a22000007a 450022301035900831c19909999aluwr ne 0 0eng dsn > 90099010 AARengAARCPNOCLCQmscWeaver tribune.Oxford, Ala. :Cheaha > Pub.,1990-v.WeeklyVol. 1, no. 1 (July 19, 1990)-United > StatesAlabamaCalhounWeaver.United > StatesAlabamaCalhounOxford.info:srw/schema/1/marcxmlxml00000cas > a22000007a 450015155895870205c19879999aludr ne 0 a0eng csn > 87050045 > > AAAengAAACPNNSDDLCCPNNSDDLCCPNDLCOCLDLCOCLCQOCLCFOCLCQ19261126829944596670892-44570892-44571AU at 000020456714360980USPSThe > > Advertiser, P.O. 
Box 1000, Montgomery, AL > 36192pccnsdpn-us-alNewspaperMontgomery advertiser (Montgomery, Ala. : > 1987)The Montgomery advertiser(1987)The Montgomery advertiser.Montgomery > advertiser & the Alabama journalSunday Montgomery advertiserMontgomery, > Ala. :Advertiser Co.,1987-volumes > :illustrationsDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrier160th > > > year, no. 1 (Jan. 2, 1987)-On Saturdays, Sundays and holidays a combined > edition is published with the Alabama journal, and called: Montgomery > advertiser and the Alabama journal, Jan. 3, 1987, and: Alabama journal > and Montgomery advertiser, Jan. 4, 1987-Feb. 25, 1990.Issues for Sunday > called: Sunday Montgomery advertiser, Mar. 4, 1990-Issues for Saturday, > Sunday and holidays have their own numbering, Jan. 3, 1987-Feb. 25, > 1990.Montgomery > (Ala.)Newspapers.AlabamaMontgomery.fast(OCoLC)fst01202689Newspapers.fast(OCoLC)fst01423814United > > > StatesAlabamaMontgomeryMontgomery.Advertiser (Montgomery, > Ala.)0745-3221(DLC)sn 82008412(OCoLC)9049482Alabama journal (Montgomery, > Ala. : 1940)0745-323X(DLC)sn > 87062018(OCoLC)2666111info:srw/schema/1/marcxmlxml00000cas a2200000 a > 450016942287871105c19879999aludn ne 0 a0eng dsn 88050149 > AAAengAAACPNNSDOCLCQy1044-00701044-0070746--32780746-32781565580USPSTroy > Publications, Inc., 113 North Market St., Troy, AL 36081mscnsdpMessenger > (Troy, Ala.)The Messenger(Troy, Ala.)The Messenger.Troy, Ala. :Troy > Pub.,1987-v.Daily (Sunday, Tuesday, Thursday and Friday)Vol. 121, no. > 166 (July 1, 1987)-Sunday, Apr. 2, 1989 misprinted as v. 113.Latest > issue consulted: Vol. 113 [sic 123], no. 96 (Sunday, Apr. 2, > 1989).United StatesAlabamaPikeTroy.Troy messenger0746-3278(DLC)sn > 83009935(OCoLC)9921908info:srw/schema/1/marcxmlxml00000cas a22000007a > 450017799786880415c19879999aluir ne 0 a0eng dsn 88050086 > AARengAARCPNNSDOCLCQ1p1044-03801044-03800745-75961441520USPSThe > Prattville Progress, 152 W. 
3rd St., Prattville, AL > 36067mscnsdpPrattville progress (Prattville, Ala. : 1987)The Prattville > progress(Prattville, Ala.)The Prattville progress.Prattville, Ala. > :James C. Seymour,1987-v.Three times a weekVol. 102, no. 8 (Jan. 20, > 1987)-Latest issue consulted: Vol. 105, no. 153 (Wednesday, Dec. 26, > 1990).United StatesAlabamaAutaugaPrattville.Progress (Prattville, > Ala.)0745-7596(DLC)sn > 83007623(OCoLC)9428489info:srw/schema/1/marcxmlxml00000cas a22000007a > 450015344667870319c19869999aluwr ne 0 a0eng dsn 87000284 > NSDengNSDCPNOCLCQy0893-07670893-07671431800USPSPickens County Herald, > P.O. Drawer E, Carrollton, AL 35447nsdpPickens County heraldPickens > County herald.Pickens County herald and west AlabamianCarrollton, Ala. > :Pickens Newspapers, Inc.,1986-WeeklyVol. 138, no. 40 (Oct. 2, > 1986)-United StatesAlabamaPickensCarrollton.Pickens County herald and > west Alabamian0746-0473(DLC)sn > 83008141AU at 000040635809info:srw/schema/1/marcxmlxml00000cas a22000007a > 450018917586881217c19869999aluwr ne 0 0eng dsn 88050225 > CPNengCPNOCLCQmscThe Oxford sun/times.Oxford, Ala. > :[s.n.],1986-v.WeeklyVol. 1, no. 1 (Jan. 16, 1986)-Editor: Andy > Goggans.Numbering is irregular.United StatesAlabamaCalhounOxford.Oxford > sun (Oxford, Ala.)(DLC)sn > 85045023AU at 000025803813info:srw/schema/1/marcxmlxml00000cas a22000007a > 450013991168860731c19869999aluwr ne 0 0eng dsn 86050322 > CPNengCPNOCLCQmscIndependent (Brewton, Ala.)The Independent.Brewton, > Ala. :Jim Thornton,1986-v. :ill. ;58 cm.WeeklyVol. 1, no. 1 (June 19, > 1986)-United > StatesAlabamaEscambiaBrewton.info:srw/schema/1/marcxmlxml00000cas > a22000007a 450018957493881231c19859999aluwr ne 0 0eng dsn > 88050247 CPNengCPNOCLCQmscPiedmont journal-independent (Piedmont, > Ala.)The Piedmont journal-independent.Journal independentPiedmont, Ala. > :Lane Weatherbee,1985-v.WeeklyVol. 4, no. 52 (Dec. 
24, 1985)-Sometimes > published as: Journal independent.United > StatesAlabamaCalhounPiedmont.Journal-independent(DLC)sn > 85045014info:srw/schema/1/marcxmlxml00000cas a22000007a > 450012715821851024d19841985aluwr ne 0 a0eng dsn 85045014 > CPNengCPNNSDCPNOCLCQmscThe Journal-independent.Piedmont, Ala. > :Journal-Independent, Inc.,1984-1985.volumes :illustrations ;58 > cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 3, > no. 27 (July 3, 1984)- v. 4, no. 51 (Dec. 18, 1985).Carries the same > vol. numbering as the Piedmont journal-independent.United > StatesAlabamaCalhounPiedmont.Piedmont > journal-independent0890-6017(DLC)sn 85045013Piedmont journal-independent > (Piedmont, Ala.)(DLC)sn 88050247info:srw/schema/1/marcxmlxml00000cas > a22000007a 450012691448851018c19839999aludr ne 0 0eng dsn > 85045007 CPNengCPNOCLCQmscTimesDaily.Times dailyFlorence, Ala. :T.S.P. > Newspapers, Inc.,1983-volumes :illustrations ;58 > cmDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 114, > no. 226 (Aug. 14, 1983)-United StatesAlabamaLauderdaleFlorence.Florence > times + tri-cities daily(DLC)sn > 85044995info:srw/schema/1/marcxmlxml00000cas a22000007a > 45009428489830420d19831987aluir ne 0 a0eng dsn 83007623 > NSDengNSDCPNNSDNSTOCLCQ89090d0745-75960745-75961The Progress, 152 W. 3rd > St., Prattville, AL 36067nsdpmscProgress (Prattville, Ala.)The > Progress(Prattville, Ala.)The Progress.Prattville, Ala. :The Prattville > Progress,1983-1987.volumes :illustrations ;58 cmThree times a > weektexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 98, no. > 32 (Mar. 17, 1983)-v. 102, no. 7 (Jan. 17, 1987).United > StatesAlabamaAutaugaPrattville.Prattville progress(DLC)sn > 85044740Prattville progress (Prattville, Ala.)1044-0380(DLC)sn > 88050086(OCoLC)12254317AAPinfo:srw/schema/1/marcxmlxml00000cas a2200000 > a 45009867255830831c19839999aludr ne 0 a0eng dsn 84008052 > AAAengAAANSDOCLOCLCQX0743-15110743-15111617760USPST.S.P. Newspapers, > Inc., 219 W. 
Tennessee St., Florence, AL 35630nsdpTimesDaily (Shoals > edition)TimesDaily(Shoals ed.)TimesDaily.Times dailyShoals ed.Florence, > Ala. :T.S.P. Newspapersvolumes > :illustrationsDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan > > > with: Vol. 114, no. 226 (Aug. 14, > 1983).\"Florence/Sheffield/Tuscumbia/Muscle Shoals.\"Shoals ed. and > Regional ed. combined on Sundays.Description based on: Vol. 114, no. 346 > (Monday, Dec. 12, 1983).United > StatesAlabamaLauderdaleFlorence.TimesDaily (Regional > edition)0743-152XTimes Tri-cities dailyUnknownDec. 12, > 1983info:srw/schema/1/marcxmlxml00000cas a22000007a > 450010536023840319c19839999aludr ne 0 a0eng dsn 84008051 > NSDengNSDOCLCQ1x0743-152X0743-152X1617760USPST.S.P. Newspapers, Inc., > 219 W. Tennessee St., Florence, AL 35630nsdpTimesDaily (Regional > edition)TimesDaily(Regional ed.)TimesDaily.Times dailyRegional > ed.Florence, Ala. :T.S.P. > NewspapersDailytexttxtrdacontentunmediatednrdamediaBegan with: Vol. 114, > no. 226 (Aug. 14, 1983).Shoals ed. and Regional ed. combined on > Sundays.Description based on: Vol. 114, no. 346 (Monday, Dec. 12, > 1983).United StatesAlabamaLauderdaleFlorence.TimesDaily (Shoals > edition)0743-1511Times Tri-cities dailyDec. 12, > 1983AU at 000025818125info:srw/schema/1/marcxmlxml00000cas a22000007a > 45009049482821213d19821987aludn ne 0 a0eng csn 82008412 > AAAengAAANSDNPWCPNDLCCPNNSDDLCNSDDLCCPNNVFDLCOCLCQCRLOCLCFOCLCQ1d0745-32210745-32211nsdppccn-us-alNewspaperAdvertiser > > > (Montgomery, Ala.)The Advertiser(Montgomery, Ala.)The advertiser.Alabama > journal and advertiserMontgomery, Ala. :Advertiser Co.,1982-1987.volumes > :illustrationsDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrier155th > > > year, no. 232 (Nov. 22, 1982)- ; -v. 14-3, Jan. 1, 1987.On Saturdays, > Sundays and holidays published as: The Alabama journal and advertiser, > Nov. 27, 1982-Jan. 
1, 1987.Saturday, Sunday and holiday issues have > their own numbering.Montgomery > (Ala.)Newspapers.AlabamaMontgomery.fast(OCoLC)fst01202689Newspapers.fast(OCoLC)fst01423814United > > > StatesAlabamaMontgomeryMontgomery.Montgomery advertiser (Montgomery, > Ala. : Daily)(DLC)sn 84020645(OCoLC)2685433Montgomery advertiser > (Montgomery, Ala. : 1987)0892-4457(DLC)sn > 87050045(OCoLC)15155895AU at 000020281746info:srw/schema/1/marcxmlxml00000cas > a2200000 a 45009237931830218c19829999aluwr ne 0 0eng dsn > 86050139 AAAengAAACPNOCLOCLCQmscThe Randolph leader.Roanoke, Ala. :David > S. Stevenson,1982-volumes :illustrations ;58 > cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 91, > no. 1 (Oct. 6, 1982)-United StatesAlabamaRandolphRoanoke.Roanoke > leader(DLC)sn 86050137Randolph press(DLC)sn > 86050138info:srw/schema/1/marcxmlxml00000cas a22000007a > 450012715815851024d19821984aluwr ne 0 a0eng dsn 85045013 > CPNengCPNNSDCPNOCLCQ110890-60170890-60171432080USPSThe Piedmont > Journal-Independent, 115 N. Center Ave., Piedmont, AL 36272mscnsdpThe > Piedmont journal-independentThe Piedmont journal-independent.Piedmont, > Ala. :Piedmont Journal-Independent, Inc.,1982-1984.volumes > :illustrations ;58 > cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 1, > no. 1 (Mar. 31, 1982)-v. 3, no. 26 (June 27, 1984).Latest issue > consulted: Vol. 5, no. 31 (August 20, 1986).United > StatesAlabamaCalhounPiedmont.Piedmont journal(DLC)sn > 85045012Journal-independent(DLC)sn > 85045014(OCoLC)12715821AU at 000045312916info:srw/schema/1/marcxmlxml00000cas > a22000007a 45009183905830202c19829999aluwr n 0 a0eng dsn > 85044580 AAAengAAACPNNSDOCLOCLCQ11098-58671098-58671016409USPSNo. 4, > Rucker Plaza, Enterprise, AL 36331P.O. Box 1536, Enterprise, AL > 36331mscnsdpSoutheast sun (Enterprise, Ala.)The southeast > sun(Enterprise, Ala.)The Southeast sun.Enterprise, Ala. 
:QST > Publicationsvolumes :illustrations ;58 > cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan in > 1982.Description based on: Vol. 1, no. 25 (Oct. 21, 1982).Latest issue > consulted: Vol. 16, no. 43 (Mar. 4, 1998).United > StatesAlabamaCoffeeEnterprise.AU at 000025827687info:srw/schema/1/marcxmlxml00000cas > > > a22000007a 450010487314840305c19819999aluwr ne 0 a0eng dsn > 85044906 > AAAengAAACPNNSDNSTCPNOCLOCLCQOCLCFOCLCOOCLCAOCLCQ900410885-16620885-16621749310USPSThe > > > New Times, 1618 1/2 St. Stephens Rd., Mobile, AL 36603mscnsdpn-us-alNew > times (Mobile, Ala.)The New times(Mobile, Ala.)The new times.Mobile, > Ala. :New Times Groupvolumes > :illustrationsWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan > > > in 1981.Vol. 3, no. 49 (Dec. 15-21, 1983) and vol. 3, no. 50 (Dec. > 22-28, 1983) are both called vol. 3, no. 49 (Dec. 15-21, > 1983).Description based on: Vol. 2, no. 3 (Jan. 28-Feb. 3, 1982).African > AmericansAlabamaNewspapers.African > Americans.fast(OCoLC)fst00799558Alabama.fast(OCoLC)fst01204694Newspapers.fast(OCoLC)fst01423814United > > > StatesAlabamaMobileMobile.AAPUnknownAug. 15, > 1985AU at 000024686659info:srw/schema/1/marcxmlxml00000cas a22000007a > 450018922463881219d19811983alucr ne 0 0eng dsn 88050233 > AARengAARCPNNSDOCLCQmscThe Sylacauga daily advance.Advance/Sylacauga > dailySylacauga advanceSunday advanceAdvanceSylacauga, Ala. :Mrs. W.A. > Moody,1981-1893.v.Semiweekly,<Nov. 24, 1982-Feb. 13, 1983>Daily (except > Mon., Tues. & Sat.),<May 26, 1982-Nov. 21, 1982>Daily (except Sat. & > Mon.),<Jan. 1, 1981-May 23, 1982>74th Year, no. 123 (Jan. 1, 1981)-76th > year, no. 83 (Feb. 13, 1983).Days of publication vary.Published as: The > Advance/Sylacauga daily, <Aug. 28, 1981-May 23, 1982>.Published as: > Sylacauga advance, <Nov. 24, 1982-Feb. 
13, 1983>.On Sunday, published > as: Sunday advance.United StatesAlabamaTalladegaSylacauga.Childersburg > star(DLC)sn 88050232Coosa press(DLC)sn 86050293Daily > home1059-6461(DLC)sn 88050234info:srw/schema/1/marcxmlxml00000cas > a22000007a 450021026715cr un|||||||||900209c19809999aluwr ne 0 > 0eng dsn 90099002 > > AARengAARCPNCUSOCLOCLCQTJCOCLCQOCLCFOCLCOOCLCA926143844AU at 000020585756mscn-us-alSpeakin' > > > out news.Speaking out newsDecatur, Ala. :Minority Network, > Inc.v.WeeklyBegan in 1980.Published in Huntsville, Ala., <1987>-Also > issued by subscription via the World Wide Web.Description based on: Vol. > 7, no. 8 (Jan. 7-13, 1987).African AmericansAlabamaNewspapers.African > American > newspapersAlabama.AlabamaNewspapers.Newspapers.fast(OCoLC)fst01423814African > > > American newspapers.fast(OCoLC)fst00799278African > Americans.fast(OCoLC)fst00799558Alabama.fast(OCoLC)fst01204694United > StatesAlabamaMorganDecatur.United > StatesAlabamaMadisonHuntsville.Speakin' out weekly news(DLC)sn > 88050097 > http://www.softlineweb.com/softlineweb/ethnic.htminfo:srw/schema/1/marcxmlxml00000cas > > a22000007a 450014996511861219c19809999aluwr ne 0 a0eng csn > 86050472 > AARengAARCPNNSDOCLCQ11080-15021080-15021328110USPSnsdppccWest-Alabama > gazetteWest-Alabama gazette.GazetteMillport, Ala. :Millport Pub. > Co.,1980-volumesWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrier4th > > > year, no. 32 (Jan. 3, 1980)-United StatesAlabamaLamarMillport.Gazette > (Millport, Ala.)(DLC)sn 86050471info:srw/schema/1/marcxmlxml00000cas > a2200000 a 450011828156850320c19809999aluwr ne 0 0eng dsn > 86050314 AAAengAAACPNOCLOCLCQmscThe Hartford news-herald.Hartford, Ala. > :Geneva Publications,1980-volumes :illustrations ;57-59 > cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 80, > no. 20 (Feb. 
14, 1980)-United StatesAlabamaGenevaHartford.News-herald > (Hartford, Ala.)(DLC)sn 86050313info:srw/schema/1/marcxmlxml00000cas > a22000007a 450017857788880427d198u198ualusr ne 0 0eng dsn > 88050097 AARengAARCPNOCLOCLCQOCLCFOCLCOOCLCAmscn-us-alSpeakin' out > weekly news.Decatur, Ala. :Smothers PublicationsPublished every first > and third Wed. of each monthDescription based on: Vol. 3, no. 13 (May > 4-17, 1983).African > AmericansAlabamaNewspapers.Newspapers.fast(OCoLC)fst01423814African > Americans.fast(OCoLC)fst00799558Alabama.fast(OCoLC)fst01204694United > StatesAlabamaMorganDecatur.Weekly news (Huntsville, Ala.)(DLC)sn > 87050012Speakin' out news(DLC)sn > 90099002info:srw/schema/1/marcxmlxml00000cas a2200000 a > 450017807936880418c198u9999aluwr ne 0 a0eng dsn 90099001 > AAAengAAACPNOCLOCLCQThe Daleville Sun-Courier, 310 Daleville Ave., > Daleville, AL 36322mscn-us-alDaleville sun-courier.Daleville, Ala. :QST > Publicationsv. :ill. ;58 cm.WeeklyDescription based on: Vol. 2, no. 28 > (Wed., Feb. 17, 1988).United > StatesAlabamaDaleDaleville.AU at 000020585749info:srw/schema/1/marcxmlxml00000cas > > > a22000007a 450015580838870423c198u9999aluwr ne 0 0eng dsn > 87050128 AARengAARCPNOCLCQmscGreene County independent.Eutaw, Ala. > :Greene County Independent, Inc.v.WeeklyDescription based on: Vol. 2, > no. 10 (Mar. 12, 1987).United > StatesAlabamaGreeneEutaw.info:srw/schema/1/marcxmlxml00000cas a22000007a > 450010125135831114d198u198ualucr ne 0 a0eng dsn 83003221 > NSDengNSDOCLCQ0d0746-55210746-55211Auburn Bulletin & Lee County Eagle, > PO Box 2111, Auburn, Ala. 36830nsdpThe Auburn bulletin & the Lee County > eagleThe Auburn bulletin & the Lee County eagle.Lee County eagleAuburn > bulletin and the Lee County eagleAuburn, Ala. :[publisher not > identified]Semiweekly,<Sept. 5, > 1984->WeeklytexttxtrdacontentunmediatednrdamediaDescription based on: > Oct. 19, 1983.United StatesAlabamaLeeAuburn.Auburn bulletin(DLC)sn > 89050006Eagle (Auburn, Ala.)(OCoLC)18435663Sept. 
5, > 1984info:srw/schema/1/marcxmlxml00000cas a22000007a > 450018370324880818c198u9999aluwr ne 0 0eng dsn 88050147 > CPNengCPNOCLCQmscTri-city times (Geraldine, Ala.)The Tri-City > times.Geraldine, Ala. :Wanda Nelsonv.WeeklyDescription based on: Vol. 2, > no. 24 (Jan. 6, 1982).United > StatesAlabamaDeKalbGeraldine.info:srw/schema/1/marcxmlxml00000cas > a22000007a 450010199338831208c198u9999aluwr ne 0 a0eng dsn > 83005367 NSDengNSDCPNOCLCQ10746-62770746-62771707590USPSSpringville Pub. > Co., 539 Main St., Springville, AL 35146nsdpThe St. Clair clarionThe St. > Clair clarion.Saint Clair clarionSpringville, AL :Gary L. > ShultsWeeklytexttxtrdacontentunmediatednrdamediaDescription based on: > Vol. 2, no. 1 (Jan. 5, 1982).United StatesAlabamaSt. > ClairSpringville.AU at 000025783743info:srw/schema/1/marcxmlxml00000cas > a22000007a 450013787251860627c198u9999aluwr ne 0 a0eng dsn > 86001923 NSDengNSDCPNOCLCQ10889-00800889-00801The Westerner Star, P.O. > Box 2060, Bessemer, AL 35021nsdpWestern star (Bessemer, Ala.)The Western > star(Bessemer, Ala.)The western star.Bessemer, Ala. :Hal > HodgensWeeklytexttxtrdacontentunmediatednrdamediaDescription based on: > Vol. 3, no. 15 (Wednesday, June 11, 1986).United > StatesAlabamaJeffersonBessemer.Bessemer advertiser(DLC)sn > 87050117AU at 000025805174511.1srw.pc any \"y\" and srw.mt any > \"newspaper\" and srw.cp exact > \"Alabama\"50info:srw/schema/1/marcxmlxml1Date,,0mq1lME887FoIbjulKUV6bx9ImwWQNCv9GqZzGS92IKS31lEbcpRJBNHgcE1l29tFaHP9CHe0Yexk1uWQofffull" > > > ------------------------------------------------------------------------ > > Newspapers and Current Periodicals Reference Librarian > > Jul 27 2022, 09:22am via System > > Hello Spencer, > > Thank you for reaching out about the bulk xml files for the US Newspaper > Directory. > > We don't have documentation specific to these bulk xml files, but upon > further inspection I can say that each of those files don't necessarily > contain info for 50 newspaper titles. 
The structure of the titles for > California and New York for instance are different from say, Alabama. > > If you look at California for example, the file naming structure > indicates the year the title started, and then the number of titles > included in that xml file. So for instance, the files below include info > for newspapers that started in 2000, 2001, and 2002 respectively. And > there is info for 30 titles in the xml file from 2000, and 14 in the > file for 2001, and so on. > > * ndnp_California_2000_e_0001_0030.xml > * ndnp_California_2001_e_0001_0014.xml > * ndnp_California_2002_e_0001_0012.xml > > If there's more than 50 titles for a given year, say for California > starting in 1880, then the next 50 titles will roll into the next xml > file, and so on. And the last xml file for that year may not include 50 > titles. > > Many of the states seem to group all the years together, so each xml > file contains 50 titles, until possibly the last one for a given state, > which may contain less. > > I hope this information helps explain the total number of records and > structure a bit better. Let me know if you have any further questions. > > Best wishes, > > Kerry Huller > Newspaper & Current Periodical Reading Room > Serial & Government Publications Division > Library of Congress > > ------------------------------------------------------------------------ > > Newspapers and Current Periodicals Reference Librarian > > Jul 25 2022, 02:22pm via Email > > Hi, Kerry: > > > Might there be documentation on the XML files you mentioned? > > > I've successfully read > 'https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/', > extracted the names of 6666 XML files, and read the first one, > "ndnp_Alabama_all-yrs_e_0001_0050.xml". It contains 29415 characters, > beginning, "1.12250info:srw/schema/1/marcxmlxml00000nas a22000007i > 45001030438981180404c20159999aluwr n 0 a0eng ". With a bit > more effort, I will likely be able to parse all 6666 of these. 
The > names suggest that each contains information on 50 newspapers, totaling > 333,300. The main page > "https://chroniclingamerica.loc.gov/search/titles/" says there are only > 157,521 "Titles currently listed". This suggests that these XML files > include place holders for a little more than double the number of > entries currently in "https://chroniclingamerica.loc.gov/search/titles/". > > > Thanks for this. > > > Progress. > ------------------------------------------------------------------------ > > Newspapers and Current Periodicals Reference Librarian > > Jul 07 2022, 08:55am via System > > Hi Spencer, > > I thought of one more option after I emailed you yesterday that I wanted > to make you aware of. > > I had explained the other day how we pull the records from OCLC into our > U.S. Newspaper Directory. You can also access all of the raw MARC > records found in the directory in xml format from here if you choose: > https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/ > <https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/> These will > > provide you all of the data from the record fields in MARC format, so > you'd get all the data you see here for example: > https://chroniclingamerica.loc.gov/lccn/sn98059792/marc/ > <https://chroniclingamerica.loc.gov/lccn/sn98059792/marc/> but in xml. I > don't know if this might be more data and info than you want to work > with, but wanted to make sure you were aware of this option as well. > > Best wishes, > > Kerry Huller > Newspaper & Current Periodical Reading Room > Serial & Government Publications Division > Library of Congress > > ------------------------------------------------------------------------ > > Newspapers and Current Periodicals Reference Librarian > > Jul 06 2022, 10:55am via System > > Hi Spencer, > > Thanks for reaching out again. I have been looking at the json view a > bit closer this morning and your example of "9999." 
> > After talking with a colleague this morning and looking at various > examples, I see there is some variation in how the titles with either an > unknown starting/ending date or currently published titles are being > handled - depending on the view. > > As an example, I completed a search in the directory for Alaska and the > city of Anchorage. There are 80 results, and on the first page of > results you'll see # 4. Fort Richardson news, which was published from > 1952-19??. The csv view of this state/city search result will show the > ending date of 19??. But if I append &format=json to this search result, > this specific title will show an ending date of 1999. After talking with > a colleague this morning, I discovered an integer had to be used in > these cases where dates were "?" so that the search based on year range > would work. Similarly, if you look at # 12 Alaska digest, which was > published 1994-current, the "current" becomes "9999" in the json view. > So, the records you are seeing with "9999" would most likely be titles > with an ending date of "current." > > However, there is an issue with the unknown dates, like "1999" being > used for "19??" in the example above. The "9" does not get inserted in > place of "?" when you are looking at the title/LCCN view of a specific > newspaper. So for instance, if you view the #4 title: Fort Richardson > news at this url: https://chroniclingamerica.loc.gov/lccn/sn98059792/ > <https://chroniclingamerica.loc.gov/lccn/sn98059792/> but append .json > to the end of the url, after the LCCN, like this: > https://chroniclingamerica.loc.gov/lccn/sn98059792.json > <https://chroniclingamerica.loc.gov/lccn/sn98059792.json> you'll see > that the end_year is "19??." Viewing the title/LCCN json view for titles > that are currently published will also show the end_year as "current." 
> The Alaska digest example from above can be viewed here: > https://chroniclingamerica.loc.gov/lccn/sn97060056.json > <https://chroniclingamerica.loc.gov/lccn/sn97060056.json> > > I wasn't aware of the difference between the directory search json view > and the title/LCCN view. But I think it would be possible to grab > the data from the title/LCCN json url through an additional script > potentially. The json url is included in the view under the "url" field. > > Of course, there are unknowns with publishing dates, but better to know > where the question marks are, and what titles are considered to be current. > > I hope this clarifies the data a bit more - let me know if any of it > needs more clarification though. And let me know if you have follow-up > questions. > > Thank you, > > Kerry Huller > Newspaper & Current Periodical Reading Room > Serial & Government Publications Division > Library of Congress > > ------------------------------------------------------------------------ > > Newspapers and Current Periodicals Reference Librarian > > Jul 05 2022, 04:42pm via Email > > Hi, Kerry: > > > What would you suggest I do to get a count of the numbers of > newspapers and publishers operating by year from, say, 1790 to 2021? > > > I just determined that 20630 (13 percent) of the 157520 records in > the US Newspaper database I downloaded a week ago have end_year = 9999. > I don't think it's feasible to assume that all or even most of those > are still publishing. > > > Might there be some other database that might have this kind of > information? > > > I ask, because Robert McChesney (2004) The Problem of the Media > (Monthly Review Pr., esp. pp. 34-35) suggests that in the first half of > the nineteenth century, the US had more newspapers and newspaper > publishers per capita than any other place or time. 
He suggests that > that diversity of newspapers helped encourage literacy and limit > political corruption, both of which helped propel the young US to its > current dominance of the international political economy. I'm hoping to > get some data to evaluate this claim. Sadly, it looks like there is too > much missing and questionable data in this dataset for me to use this > without a fairly substantive data cleaning effort. > ------------------------------------------------------------------------ > > Newspapers and Current Periodicals Reference Librarian > > Jul 05 2022, 09:05am via System > > Hello Spencer, > > Thank you for reaching out about your additional questions. > > I was looking at the records you mention above, and yes, you are correct > - those 9 records with the date inconsistencies and the one record for > the The New Mexican mining news > <https://chroniclingamerica.loc.gov/lccn/sn93061507/> containing "Santa > Fe.\" have typos in them. Thanks for spotting these - it may be possible > to have the cataloger in our division correct those typos. I will look > into this further. > > The U.S. Newspaper Directory doesn't have a connection with Wikimedia or > Wikipedia. The Library of Congress periodically pulls the records for > the Directory from OCLC Worldcat > <https://www.oclc.org/en/worldcat.html>. And those newspaper records in > OCLC Worldcat have been created by catalogers at various institutions > around the U.S. over the span of several years. So, occasionally, you > will find a typo in the records. Corrections can be made by OCLC and > library staff at the various institutions. Every time we complete a new > pull on the OCLC records, any corrected records will then populate our > Directory. > > Regarding your question on the New-York weekly journal - yes, that is > also correct that it has two records. 
There is actually a record for > each format of the newspaper, so this record is for the microfilm format > <https://chroniclingamerica.loc.gov/lccn/2009252748/> and this one is > for the original print format > <https://chroniclingamerica.loc.gov/lccn/sn83030211/>. You can see in > the heading for the microfilm record where it says [microfilm reel] and > the print version shows [volume]. You are likely to see this for other > titles as well because each format has been cataloged with its own LCCN. > You are also likely to see additional records with [online resource] > identified as the format as more and more titles are available as > ePrints or online. > > I hope this helps answer your additional questions a bit more. Please > reach out if you have any other questions. > > Thank you, > > Kerry Huller > Newspaper & Current Periodical Reading Room > Serial & Government Publications Division > Library of Congress > > ------------------------------------------------------------------------ > > Newspapers and Current Periodicals Reference Librarian > > Jul 04 2022, 01:47pm via Email > > Hi, Kelly: > > > At the risk of bombing your inbox with more emails than you want, > what is your relationship with Wikipedia and other Wikimedia Foundation > projects like Wikidata? > > > I ask, because I've logged over 20,000 edits in Wikimedia Foundation > projects since 2010, and I would happily try to answer questions about > Wikidata and other Wikimedia Foundation projects. I have NOT organized > an edit-a-thon, but I've made presentations at conferences with people > who have, and I would happily try to help organize such if you could > find a group of people who want to work to improve this US Newspaper > database. I think it would be good to establish links between this US > Newspaper database and Wikidata, with appropriate procedures so changes > to one could be evaluated for acceptance into the other. 
> > > FYI, John Peter Zenger's famous "New-York weekly journal" (1733-1751) > appears TWICE in your database with lccn = 2009252748 and sn83030211 and > ONCE in Wikidata WITHOUT an lccn, even though many other Wikidata items > have an lccn. See: > > > https://www.wikidata.org/wiki/Q23091960 > > > There's a "WikiProject Newspapers" on Wikipedia and a companion > "WikiProject Periodicals" on Wikidata: > > > https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Newspapers/Wikidata > > > https://www.wikidata.org/wiki/Wikidata:WikiProject_Periodicals > > > I've tried to connect with others on those projects, so far with only > limited success. However, you may know that almost anyone can change > almost anything on Wikipedia and other Wikimedia Foundation projects. > What stays tends to be written from a neutral point of view citing > credible sources. They have problems with vandals, but the problems are > usually easily controlled. This makes Wikipedia and Wikidata very > useful platforms for cleaning up databases like your US Newspaper dataset. > > > Spencer Graves > ------------------------------------------------------------------------ > > Newspapers and Current Periodicals Reference Librarian > > Jul 03 2022, 10:39pm via Email > > Hello, Kelly: > > > In addition to the invalid JSON, discussed below [NOTE: The "below" > contains a slight addition to the report of the I sent last Friday.], I > found 9 (NINE!) cases where start_year was AFTER end_year. These have > lccn = "sn86071531" "sn95069213" "sn90059096" "sn86058451" "sn90060926" > "sn99065409" "sn89065002" "sn98069857" "sn91059179" > > > See: > > > https://chroniclingamerica.loc.gov/lccn/sn86071531/ > https://chroniclingamerica.loc.gov/lccn/sn95069213/ > https://chroniclingamerica.loc.gov/lccn/sn90059096/ > https://chroniclingamerica.loc.gov/lccn/sn86058451/ > https://chroniclingamerica.loc.gov/lccn/sn90060926/ > https://chroniclingamerica.loc.gov/lccn/sn99065409/ > https://chroniclingamerica.loc.gov/lccn/sn89065002/ > https://chroniclingamerica.loc.gov/lccn/sn98069857/ > https://chroniclingamerica.loc.gov/lccn/sn91059179/ > > > These all have obvious coding errors that can be easily fixed. 
The > data may not be completely accurate after the fix, but at least they are > not obviously wrong ;-) > > > ################## > > I got invalid JSON from: > > > > https://chroniclingamerica.loc.gov/search/titles/results/?rows=500&page=103&format=json > > > After some experimentation, I was able to replicate the problem with > a request for rows=10: > > > > https://chroniclingamerica.loc.gov/search/titles/results/?rows=10&page=5117&format=json > > > Duncan Temple Lang <dtemplelang at ucdavis.edu>, Professor of Statistics > and Associate Dean for Graduate Programs at the University of California > - Davis, confirmed that it was a JSON error using: > > > https://codebeautify.org/jsonvalidator > > > He is part of the core team developing the R free, open-source > programming language. He said, that starting at offsets 161070 and > 161502 in the character string you get from [the R code RCurl::getURL()] > we have: > > > Santa Fe.\" > > > and these are in an entry such as > > > "city": ["Santa Fe.\"] > > > So the final " is escaped and therefore there is no closing " for the > string. The parser continues to consume characters looking for the end > of that string. > > > If one "repairs" the text from getURL() with > > > ftxt= gsub('Santa Fe.\\\\"', 'Santa Fe."', txt) > > > then the rest of my code worked fine. > > > You may wish to do something to implement other checks for valid JSON > and repair this problem. I've scanned all the 157520 records that were > in that database a couple of days ago, and this is the only JSON error > identified by the code I used. > > > NOTE: I was NOT able to replicate this error when downloading records > one at a time. That suggests a problem NOT in the database itself but > in the download algorithm. ??? > > > Thank you for your help. 
> I will almost certainly have other questions ;-)
> ------------------------------------------------------------------------
>
> Newspapers and Current Periodicals Reference Librarian
>
> Jul 01 2022, 11:46am via Email
>
> Hello, Kelly:
>
> I got invalid JSON from:
>
> https://chroniclingamerica.loc.gov/search/titles/results/?rows=500&page=103&format=json
>
> After some experimentation, I was able to replicate the problem with
> a request for rows=10:
>
> https://chroniclingamerica.loc.gov/search/titles/results/?rows=10&page=5117&format=json
>
> Duncan Temple Lang <dtemplelang at ucdavis.edu>, Professor of Statistics
> and Associate Dean for Graduate Programs at the University of California
> - Davis, confirmed that it was a JSON error using:
>
> https://codebeautify.org/jsonvalidator
>
> He is part of the core team developing the free, open-source R
> programming language. He said that, starting at offsets 161070 and
> 161502 in the character string you get from [the R code RCurl::getURL()],
> we have:
>
> Santa Fe.\"
>
> and these are in an entry such as
>
> "city": ["Santa Fe.\"]
>
> So the final " is escaped and therefore there is no closing " for the
> string. The parser continues to consume characters looking for the end
> of that string.
>
> If one "repairs" the text from getURL() with
>
> ftxt = gsub('Santa Fe.\\\\"', 'Santa Fe."', txt)
>
> then the rest of my code worked fine.
>
> You may wish to do something to implement other checks for valid JSON
> and repair this problem. I've scanned all the 157520 records that were
> in that database a couple of days ago, and this is the only JSON error
> identified by the code I used.
>
> Thank you for your help.
> I will almost certainly have other questions ;-)
> ------------------------------------------------------------------------
>
> Newspapers and Current Periodicals Reference Librarian
>
> Jun 28 2022, 02:20pm via System
>
> Hello Spencer,
>
> Thank you for sending along your follow-up questions.
>
> I'm glad to hear the json view will work for you. It was recommended to
> me that you limit your requests to 500 rows at a time. And a developer
> here at LC suggests the following regarding rate limiting:
>
> "To avoid being blocked by the server, the current rate-limiting rules
> restrict un-cached requests to URLs starting with
> https://chroniclingamerica.loc.gov/search/ to 120 requests every 10
> minutes from a single IP address."
>
> So, I think if you limited each of your requests to 500 rows at a time
> with the proper pauses, then you should be able to access what you need.
>
> As for the csv view, I checked on this as well, and was informed that
> the csv view was not implemented for all url formats. The csv view was
> only implemented for this view:
> https://chroniclingamerica.loc.gov/newspapers/ and for urls resulting from
> US Directory search results, e.g., if you wanted to narrow down your
> search results by state, city, date range, etc., found at this link:
> https://chroniclingamerica.loc.gov/search/titles/.
> So, if you wanted a csv and limited your search by state (for example:
>
> https://chroniclingamerica.loc.gov/search/titles/results/?state=Alaska&county=&city=&year1=1690&year2=2022&terms=&frequency=&language=&ethnicity=&labor=&material_type=&lccn=&rows=20&format=csv
>
> ), you could append &format=csv to the search result url and get the csv
> to automatically download. But, if your search results ended up being
> over a couple thousand titles, then the system would probably time out.
>
> I hope this info helps! Let me know if you have any other questions.
>
> Best wishes,
>
> Kerry Huller
> Newspaper & Current Periodical Reading Room
> Serial & Government Publications Division
> Library of Congress
>
> ------------------------------------------------------------------------
>
> Newspapers and Current Periodicals Reference Librarian
>
> Jun 27 2022, 04:15pm via Email
>
> Hello, Kerry:
>
> Thanks for the reply. Can you please give me some further guidance
> on two things "so that the system is not overwhelmed"?
>
> 1. The max size in a small batch?
>
> 2. Any limit on the number of small batches in a second or minute?
>
> I've found that I can download small batches under program control
> using "RCurl::getURL" in R (programming language) using, e.g.:
>
> https://chroniclingamerica.loc.gov/search/titles/results/?rows=20&page=2&format=json
>
> With this, I can control the batch size with "rows=20" vs. "rows=50"
> vs., e.g., "rows=1000". A naive search says there are 157520 "results".
> With "rows=1000", this would require 158 calls. With "rows=20", it
> would require 7876 calls. Before I start, I need to decide which fields
> I want; I don't need them all.
>
> Thanks,
> Spencer Graves
>
> p.s.
> I tried appending "&format=csv" and got "Error 504 Ray ID:
> 7220896da85e86e7 • 2022-06-27 19:19:53 UTC Gateway time-out". I used:
>
> https://chroniclingamerica.loc.gov/search/titles/results/?state=&county=&city=&year1=1690&year2=2022&terms=&frequency=&language=&ethnicity=&labor=&material_type=&lccn=&rows=20&format=csv
>
> I can get what I want using json so do not need csv. However, I
> thought you might want to know that I was unable to get csv to work.
> ------------------------------------------------------------------------
>
> Newspapers and Current Periodicals Reference Librarian
>
> Jun 27 2022, 10:54am via System
>
> Hello Spencer,
>
> Thank you for contacting the Library of Congress about searching the US
> Newspaper Directory. I wanted to follow up with you regarding your
> request to output the data in a machine readable format.
>
> It looks like you were provided the link to the API documentation for
> the website: About the Site and API
> <https://chroniclingamerica.loc.gov/about/api/>. Scroll down to the
> section with the heading "Searching the directory and newspaper pages
> using OpenSearch". This section describes the search functionality and
> structure for the US Newspaper Directory in more detail. It is possible
> to return your directory searches in json format by appending
> &format=json to the end of the url. It is also possible to return search
> results in csv format by appending &format=csv to the end of the url,
> but I would strongly suggest that you do this in small batches by
> putting limits on your search so that the system is not overwhelmed.
>
> So, from the search page for the US Newspaper Directory
> <https://chroniclingamerica.loc.gov/search/titles/> you could
> potentially limit your search based on state and city, or date range,
> and/or even frequency. Then once you've completed the search, you can
> add &format=csv to the end of the url to automatically download a csv of
> those records.
> The resulting csv will contain several fields/headers:
> lccn, title, place of publication, start year, end year, publisher,
> edition, frequency, subject, state, city, country, language, oclc
> number, and holding type. I think these fields include the information
> you were looking for. But, again, I would like to stress that you put
> limits on your search before creating the csv so as not to overwhelm the
> system.
>
> Please let me know if you have any additional questions.
>
> Best wishes,
>
> Kerry Huller
> Newspaper & Current Periodical Reading Room
> Serial & Government Publications Division
> Library of Congress
>
> ------------------------------------------------------------------------
>
> Newspapers and Current Periodicals Reference Librarian
>
> Jun 23 2022, 01:55pm via System
>
> Mr. Graves,
>
> I'm going to transfer your request to a member of our digital collections
> team who may be of more assistance to you than me.
>
> Mike
>
> ------------------------------------------------------------------------
>
> Newspapers and Current Periodicals Reference Librarian
>
> Jun 23 2022, 01:51pm via Email
>
> Dear Mr. Queen:
>
> Thanks for the reply. I'm still confused. I downloaded and
> installed Docker Desktop and "docker-compose.yml" and ran their "Getting
> Started" Tutorial, but I don't see what to do next.
>
> I repeat: I'd like to analyze "U.S. Newspaper Directory,
> 1690-Present" (https://chroniclingamerica.loc.gov/search/titles/), which
> ------------------------------------------------------------------------
>
> Newspapers and Current Periodicals Reference Librarian
>
> Jun 22 2022, 07:15pm via System
>
> Mr. Graves,
>
> Programmatic access to the data for Chronicling America
> <https://chroniclingamerica.loc.gov/> and possibly the U.S. Newspaper
> Directory <https://chroniclingamerica.loc.gov/search/titles/> can be
> found on the About the Site and API
> <https://chroniclingamerica.loc.gov/about/api/> page in various formats.
> Also, please note that Chronicling America contains newspapers published
> from 1777-1963, but does not include every U.S. newspaper published in
> that time period.
>
> Please let me know if I can be of further assistance.
>
> ------------------------------------------------------------------------
>
> Newspapers and Current Periodicals Reference Librarian
>
> Jun 22 2022, 06:14pm via Email
>
> Dear Mr. Queen:
>
> Can we simplify this to just giving me the data behind "U.S.
> Newspaper Directory, 1690-Present"
> (https://chroniclingamerica.loc.gov/search/titles/) in a machine
> readable format, e.g., csv or xlsx or a MySQL database?
>
> As I mentioned in my original email, a naive search of that without
> restrictions returned 157520 titles in 7876 pages with up to 20 titles
> per page giving date ranges in at least some cases. I could probably
> write software to scrape those 7876 pages from your web site and combine
> them into a data file.
>
> I have a PhD in statistics, and I have been using the R programming
> language and similar software for decades. This includes publishing
> tutorials on how to analyze data like this on Wikiversity.[1] I'd like
> to do something similar with this. I could help make your data more
> useful to others and discuss with you how we might prioritize
> improvements like accessing the other sources you mentioned.
>
> Thanks very much for your reply.
>
> Sincerely,
> Spencer Graves, PhD
> Founder, EffectiveDefense.org
> 4550 Warwick Blvd 508
> Kansas City, MO 64111
> m: 408-655-4567
>
> [1] e.g.:
>
> https://en.wikiversity.org/wiki/US_Gross_Domestic_Product_(GDP)_per_capita
> ------------------------------------------------------------------------
>
> Newspapers and Current Periodicals Reference Librarian
>
> Jun 22 2022, 05:27pm via System
>
> Mr. Graves
>
> Your request is a little more complex than it first appears and requires
> extensive research.
> A variety of resources should be consulted to
> determine the circulation statistics of newspapers published prior to
> 1851. You will need to check newspaper union lists and newspaper
> histories. Union lists present lists of newspapers in geographic
> arrangement according to place of publication, and specify which
> libraries or other institutions hold collections of those newspapers and
> the dates of their holdings. These can also be useful for tracking title
> changes throughout a newspaper's history. Newspaper
> histories like American Journalism: A History: 1690-1960
> <https://lccn.loc.gov/62007157> (Mott), The Penny Press
> <https://lccn.loc.gov/2004043078> (Thompson), and The Press and America
> <https://lccn.loc.gov/99044295> (Emery et al.) may not include
> circulation statistics, but they do document the diversity and progress
> of newspaper publishing, including notable newspapers of the era.
> Newspaper histories also cover the history of the printers and printing
> of newspapers in a state, county, or region more generally, and provide
> more condensed histories of the editors, journalists, and evolution of
> the newspapers in a specific area. Newspaper histories and union lists
> should be available at most large public or university libraries. More
> information about union lists, newspaper histories, and researching
> newspapers in general can be found in the U.S. Newspaper Collections at
> the Library of Congress
> <https://guides.loc.gov/united-states-newspapers/introduction> research
> guide (see Reference Sources).
>
> Please let me know if I can be of further assistance.
>
> ------------------------------------------------------------------------
>
> Original Question
>
> Jun 20 2022, 02:34pm via System
>
> How can I get counts of the numbers of newspapers by year in the US, and
> preferably also elsewhere? A search of "U.S. Newspaper Directory,
> How can I get counts of the numbers of newspapers by year in the US, and
> preferably also elsewhere?
>
> A search of "U.S. Newspaper Directory, 1690-Present"
> (https://chroniclingamerica.loc.gov/search/titles/) returned 157520
> titles in 7876 pages with up to 20 titles per page giving date ranges to
> the extent that it's known. If I can get a data file (e.g., csv or xls),
> I can summarize. I could also use data on circulation and frequency and
> especially parent company for multiple newspapers published by the same
> company, to the extent that such is available.
>
> I'm interested in this, because McChesney quoted Tocqueville in
> suggesting that the US had more newspapers per person (or per million
> population) prior to 1851 than at any other time or place in history.
> I'd like to evaluate that claim with data to the extent that I can. See
> "https://en.wikiversity.org/wiki/Social_construction_of_crime_and_what_we_can_do_about_it#Newspapers_1790_-_present".
>
> Thanks, Spencer Graves, PhD
> m: 408-655-4567
>
> ------------------------------------------------------------------------
>
> Thank you for using Newspapers & Current Periodicals Ask a Librarian
> Service!
>
> This email is sent from Ask a Librarian in relationship to ticket #9625195.
>
> Read our privacy policy. <https://springshare.com/privacy.html>
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
[[alternative HTML version deleted]]
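
For anyone following the quoted thread: the batching advice (500 rows per request, at most 120 requests per 10 minutes) and the one-off JSON repair can be sketched in base R roughly as follows. This is only a sketch under assumptions, not the code actually used in the thread. The function names (page_url, n_pages, repair_json, fetch_all) are made up for illustration, page numbering is assumed to start at 1, readLines() stands in for RCurl::getURL(), and the 6-second default pause is simply a value a little above the 5 seconds per request that the quoted limit implies.

```r
# Build the search URL for one page of results (URL pattern taken from the thread).
page_url <- function(page, rows = 500) {
  paste0("https://chroniclingamerica.loc.gov/search/titles/results/",
         "?rows=", rows, "&page=", page, "&format=json")
}

# Number of requests needed to cover n_results at a given batch size.
n_pages <- function(n_results, rows = 500) ceiling(n_results / rows)

# Repair the single malformed string reported in the thread: an escaped
# closing quote after "Santa Fe.". fixed = TRUE matches the text literally.
# This is specific to that one record, not a general JSON fixer.
repair_json <- function(txt) {
  gsub('Santa Fe.\\"', 'Santa Fe."', txt, fixed = TRUE)
}

# Download every page, pausing between requests.
# `fetch` is any function(url) returning the response body as one string;
# it is an argument so the loop can be tried without network access.
fetch_all <- function(n_results, rows = 500, pause_sec = 6,
                      fetch = function(u) {
                        paste(readLines(u, warn = FALSE), collapse = "\n")
                      }) {
  pages <- seq_len(n_pages(n_results, rows))
  out <- vector("list", length(pages))
  for (i in pages) {
    out[[i]] <- repair_json(fetch(page_url(i, rows)))
    if (i < length(pages)) Sys.sleep(pause_sec)  # stay under the rate limit
  }
  out
}
```

With the 157520 results mentioned in the thread, n_pages(157520, 500) gives 316 requests, which at one request every 6 seconds would take a bit over half an hour. Each element of the returned list is the raw JSON text for one page, which could then be passed to a JSON parser of your choice.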