Jason K. Jackson
2006-Jun-26 03:07 UTC
[Rails] DBI question. How do I load a database with data parsed from REXML?
I wrote a rexml script that parses rather big xml files (DOM parsing) and builds an object out of them. This is what it looks like after i parse the xml file and display the object: [#<Article:0xb79e7428 @brand="This is a story from ", @id=5337, @fblurb="burb", @ntype="news type here", @copy="Copyright \302\251 2006 ", @source="source", @body="body", @archive=nil, @att="attribution here", @blurb="Signs and symptoms of food poisoning", @post=nil, @fimage=nil, @head="Health Tip: Eaten Bad Food?">] I have a database schema that looks like this: Code: create table Newsfeeds ( ArticleId varchar(100), PostingDate date, ArchiveDate date, NewsType varchar(100), Headline text, Blurb text, Byline varchar(100), Body text, FeatureBlurb text, FeatureImage text, Attribution varchar(100), Source text, Copyright text, Brand text ); What i need to do is put each parsing from the constructed data object into its place in the database. This is not inside of a rails application. This is a ruby script that will be run from a cron on a linux box. The code that i have is the following: Code: require ''rexml/document'' include REXML # classes to represent the objects and relationships in the xml file class Article attr_accessor :id, :post, :archive, :ntype, :head, :blurb, :body, :fblurb, :fimage, :att, :source, :copy, :brand end # the base parser class BaseXMLParser def initialize(filename) @raw_xml = File.open(filename).read @newsfeed = [] end end # DOM-based parsing class FeedXMLParser < BaseXMLParser def get_newsfeed doc = Document.new(@raw_xml) XPath.each(doc, ''NEWSFEED/ARTICLE'') do |article| p = Article.new p.id = article.attributes[''ID''] p.post = article.attributes[''POSTING_DATE''] p.archive = article.attributes[''ARCHIVE_DATE''] p.ntype = article.elements[''NEWS_TYPE''] p.head = article.elements[''HEADLINE''].text p.blurb = article.elements[''BLURB''].text p.body = article.elements[''BODY''].text p.fblurb = article.elements[''FEATURE_BLURB''].text p.fimage = article.elements[''FEATURE_IMAGE''].text p.att = article.elements[''ATTRIBUTION''] p.source = article.elements[''SOURCE''].text p.copy = article.elements[''COPYRIGHT''].text p.brand = article.elements[''BRAND''].text @newsfeed << p end @newsfeed end end # factory to instantiate parser instance and parse test file # klass: the class of parser to use def get_parser(klass) lambda do parser_inst = klass.new(''newsfeed.xml'') parser_inst.get_newsfeed end end docparser = get_parser(FeedXMLParser) #puts docparser.call.inspect {|x| x.each {|z| puts z}} class ArcticleLoad attr_reader :id, :post, :archive, :ntype, :head, :blurb, :body, :fblurb, :fimage, :att, :source, :copy, :brand def initialize(id, post, archive, ntype, head, blurb, body, fblurb, fimage, att, source, copy, brand) @id = id @post = post @archive = archive @ntype = ntype @head = head @blurb = blurb @body = body @fblurb = fblurb @fimage = fimage @att = att @source = source @copy = copy @brand = brand end end # Set up the DB require "dbi" begin # Connect to the MySQL server dbh = DBI.connect("dbi:Mysql:newsfeeds:localhost","eleven","foo") row = dbh.select_one("SELECT VERSION()") puts "Connecting to Server Version: " + row[0] #connection check # Start the SQL madness here: docparser.call.instance_variables {|x| x.each {|z| assort(z)}} rescue DBI::DatabaseError => e puts "An error occured" puts "Error code: #{e.err}" puts "Error messages: #{e.errstr}" ensure # Disconnect from server dbh.disconnect if dbh end def assort(ArticleLoad) rows = dbh.do("INSERT INTO articles (ArticleId,PostingDate,ArchiveDate,NewsType,Headline,Blurb,Byline,Body,FeatureBlurb,FeatureImage,Attribution,Source,Copyright,Brand) VALUES(''#{id}'',''#{post}'',''#{archive}'',''#{ntype}'',''#{head}'',''#{blurb}'',''#{body}'',''#{fblurb}'',''#{fimage}'',''#{att}'',''#{source}'',''#{copy}'',''#{brand}''") end To explain a bit.. docparser is an object, filled with arrays. Each array (article) has elements that are from REXML parsings. I need to find a way to load each of these elements into the corresponding field in the database. This is a ruby script that will be run from cron. The database connection is handled via DBI, and for the life of me i cannot figure out how to load mysql. If anyone can help me with this, that would be great. Much appreciate your time. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://wrath.rubyonrails.org/pipermail/rails/attachments/20060626/1d81a156/attachment-0001.html