Hi,

Having worked harder than ever on my ISBN importer, I've reached the step where I can produce a document containing both content and information from ISBN providers. Here is an example:

######################################
---
title: 2-07-042684-X
--- |
!>http://ecx.images-amazon.com/images/I/21V1XY3XJYL.jpg!:http://www.amazon.fr/gp/product/207042684X%3ftag=leblo-20%26link_code=xm2%26camp=2025%26dev-t=0KBJ59SK0VR923NMVKR2

h3. Parade nuptiale

h4. Book authors

# Donald Kingsbury
# Michel Lederer

edited by Gallimard
published on 06/03/2003
---
:name: Parade nuptiale
:editor: Gallimard
:src: "#<Module:0x33f0f18>::AmazonAsin"
:authors:
- Donald Kingsbury
- Michel Lederer
:image: http://ecx.images-amazon.com/images/I/21V1XY3XJYL.jpg
:url: http://www.amazon.fr/gp/product/207042684X%3ftag=leblo-20%26link_code=xm2%26camp=2025%26dev-t=0KBJ59SK0VR923NMVKR2
:release_date: 2003-03-06
:main_category: Book
######################################

You'll obviously tell me that there is a problem, since there is more than one content document. And I'll completely agree with you.

Unfortunately, I'm unable to create a compliant document, since the Ruby YAML library does not seem to provide a way to add the required "options" (or whatever would do the trick) to the document header, in order to tell webgen that the last document is not the content one, but rather a kind of AmazonAsin content block.

So, do you know of any solution to this issue?

For your information, I add sections using
http://www.ruby-doc.org/core/classes/YAML/Stream.src/M007467.html
YAML::Stream.add(doc)

--
Nicolas Delsaux
Only print this mail if you cannot read it on screen: electrons recycle well, paper much less so.
Nicolas Delsaux
2008-Feb-24 07:26 UTC
[webgen-users] Fwd: build a multi-document page file
On 2/23/08, Andrea Censi <andrea at cds.caltech.edu> wrote:
> Hi,
> I can't help you on your YAML problem, but I'd be interested in
> knowing more about your ISBN importer. Is it already available
> somewhere?

I'll make it available, maybe today, on my website. But remember that it can't be considered a working plugin, since its pages contain more than one document starting with "--- \n". So never consider it a stable plugin. For now it is only a development version, but I have high goals for this little project.

For now, you can take a look at
http://nicolas.delsaux.free.fr/webgen/informatique/web/webgen/pseudo-tags.en.html
http://nicolas.delsaux.free.fr/webgen/informatique/web/webgen/isbn.en.html

With these you should obtain a quasi-working system (but no displayable ISBN). Also, take care: I may change the megastore folder in the near future. And finally, note that this code should be LGPL-licensed.

--
Nicolas Delsaux
On Sat, 23 Feb 2008 16:00:53 +0100, "Nicolas Delsaux" <nicolas.delsaux at gmail.com> wrote:
> Hi,
>
> <SNIP/>
>
> published on 06/03/2003
> ---

Just escape the three dashes with a backslash, as in the following line:

\---

> :name: Parade nuptiale
> <SNIP/>

If I understand your problem correctly, this should solve it.

-- Thomas
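The escaping Thomas suggests can be automated before a block is written into a page file. The following is only a minimal sketch: the helper name `escape_block_separators` is hypothetical, and it assumes that every line beginning with three dashes has to be escaped, which may be broader than what webgen strictly requires.

```ruby
# Hypothetical helper: prefixes a backslash to every line that starts with
# three dashes, so that the line is not read as a block separator.
def escape_block_separators(text)
  # The block form of gsub inserts the replacement literally,
  # without interpreting backslash sequences.
  text.gsub(/^---/) { '\\---' }
end

raw = "published on 06/03/2003\n---\n:name: Parade nuptiale\n"
escaped = escape_block_separators(raw)
puts escaped
```

Note that this keeps the provider data inside the content block; it does not create a separate block for it.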
Nicolas Delsaux
2008-Feb-25 09:18 UTC
[webgen-users] Fwd: build a multi-document page file
On 2/24/08, Thomas Leitner <t_leitner at gmx.at> wrote:
> Just escape the three dashes with a backslash like in the following
> line:
> \---
>
> If I understand your problem correctly, this should solve your
> problem.

Mmh.

This could be a solution. However, what I want is to have the data grabbed from each provider in its own document. So your solution, although promising, does not really seem applicable.

What I would really like to have is a document name per document. Since you're able to read the document/parser name from YAML, I was thinking it would be nice to be able to write them too, like, say ...

--- content
blablabla
--- AmazonIsan
# AmazonIsan content

--
Nicolas Delsaux
On Mon, 25 Feb 2008 10:18:36 +0100, "Nicolas Delsaux" <nicolas.delsaux at gmail.com> wrote:
> > If I understand your problem correctly, this should solve your
> > problem.
>
> MMh
>
> This could be a solution. However, what I want is each data grabbed
> from a provider in its own document.
> So your solution, although promising, seems to be not so appliable.
> What I would really would like to have is a document name per
> document. I was thinking, since you're able to read from YAML the
> document/parser, would be to be able to write them, like, say ...
>
> --- content
> blablabla
> --- AmazonIsan
> # AmazonIsan content

Ah, okay, I misunderstood your problem. You want to generate a file in WebPage Format with a metadata section, a content section and an AmazonIsan section by using YAML. Since WebPage Format is *not* a series of YAML documents, it won't work this way.

Since I have not written a class for creating pages in WebPage Format, you will need to do that manually, but this isn't too hard. Just dump the meta information with hash.to_yaml, then add three dashes on a separate line, then the content, then three dashes followed by 'AmazonIsan', and then the AmazonIsan content.

Best regards,
Thomas
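The manual assembly described in this message could look roughly like the sketch below. It is not part of webgen: `build_page` is a hypothetical helper, the sample data comes from the example page earlier in the thread, and the exact layout webgen accepts should be checked against its WebPage Format documentation. The sketch also assumes that the leading `---` line which `to_yaml` emits has to be stripped from the block body so that it is not mistaken for another separator.

```ruby
require 'yaml'

# Hypothetical helper: joins the meta information, the content and any
# named extra blocks into one string, following the steps described above.
def build_page(meta, content, extra_blocks = {})
  parts = [meta.to_yaml, "---\n", content.chomp + "\n"]
  extra_blocks.each do |name, body|
    parts.push("--- #{name}\n", body.chomp + "\n")
  end
  parts.join
end

# Provider data as a YAML mapping, without the document marker line
# that to_yaml puts first.
amazon = { 'name' => 'Parade nuptiale', 'editor' => 'Gallimard' }
amazon_body = amazon.to_yaml.sub(/\A--- ?\n/, '')

page = build_page({ 'title' => '2-07-042684-X' },
                  'h3. Parade nuptiale',
                  'AmazonIsan' => amazon_body)
puts page
```

With this input, the only lines starting with three dashes are the one opening the meta information, the one before the content, and the named `--- AmazonIsan` line.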
Nicolas Delsaux
2008-Feb-25 10:58 UTC
[webgen-users] Fwd: build a multi-document page file
On 2/25/08, Thomas Leitner <t_leitner at gmx.at> wrote:
> Ah, okay, I misunderstood your problem. You want to generate a file in
> WebPage Format with a metadata section, a content section and an
> AmazonIsan section by using YAML.

Exactly, yes.

> Since WebPage Format is *not* a
> series of Yaml documents, this won't work this way.

Damn, that's plain weird :-O (note that I am not criticizing your choice, I am only expressing my astonishment)

> Since I have not
> written a class for creating pages in WebPage Format, you will need to
> do that manually, but this isn't too hard.

This sentence brings back some not-so-good memories from work ;-)

> Just dump the meta
> information with hash.to_yaml, then add three dashes on a separate
> line, then the content, then three dashes followed by 'AmazonIsan' and
> then the AmazonIsan content.

Well, ... here comes the drawback. Take a look at my isbn_processor.rb (http://nicolas.delsaux.free.fr/webgen/informatique/web/webgen/isbn.en.html). Here is, in short, what it does:

For each ISBN number in the ISBN cache (this is a temporary step)
    For each grabber
        Grab content from the web (for Amazon, do a webservice query and put the result in a big hash)
        Then put the request result in our kind of hash
        Create a YAML document containing this hash and push it into the YAML stream (line 177)
    Once that's done
        For each non-header and non-content YAML document, load the data and choose the best values to populate the content.

The goal is to maintain a cache of the web results in a file near the real page (and what is nearer than the file itself?).

Obviously, I can think of a workaround: create an auxiliary isbn.page.cache file that would contain my YAML documents, then create the isbn.page file with only the relevant data. Interesting, but a little more complicated, since I now have two files ...

However, I think I'll use this solution, to preserve the usability of isbn.page as a webgen page (since my ultimate goal is to include it in files using the <isbn value=""/> tag).

Thanks for your insights.
--
Nicolas Delsaux
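The two-file workaround from this message could be sketched as follows. This is not the code from isbn_processor.rb: the helper names, the "first provider wins" merge rule and the `OtherProvider` entry are all assumptions made for illustration, and the sketch uses `YAML.dump_stream` and `YAML.load_stream` from the current Ruby standard library rather than `YAML::Stream`.

```ruby
require 'yaml'
require 'tmpdir'

# Writes one YAML document per provider result into the sidecar cache file.
def write_cache(path, provider_results)
  File.write(path, YAML.dump_stream(*provider_results))
end

# Reads every YAML document back from the cache file as an array of hashes.
def read_cache(path)
  YAML.load_stream(File.read(path))
end

# Assumed merge rule: for each field, the first provider that supplies a
# non-nil value wins.
def merge_results(results)
  results.each_with_object({}) do |result, merged|
    result.each do |key, value|
      merged[key] = value unless merged.key?(key) || value.nil?
    end
  end
end

results = [
  { 'src' => 'AmazonAsin', 'name' => 'Parade nuptiale', 'editor' => 'Gallimard' },
  # Hypothetical second provider, only here to show the merge.
  { 'src' => 'OtherProvider', 'name' => 'Parade nuptiale', 'authors' => ['Donald Kingsbury'] }
]

merged = Dir.mktmpdir do |dir|
  cache = File.join(dir, 'isbn.page.cache')
  write_cache(cache, results)
  merge_results(read_cache(cache))
end
```

The merged hash would then supply the relevant data for isbn.page, while the raw provider results stay in the cache file.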