thr3ads.net - webgen users - [webgen-users] build a multi-document page file [Feb 2008]


Nicolas Delsaux

2008-Feb-23 15:00 UTC

[webgen-users] build a multi-document page file

Hi,
I have been working harder than ever on my ISBN importer, and I've
reached the point where I can produce a document containing both
content and information from ISBN providers. Here is an example:

######################################
---
title: 2-07-042684-X
--- |

!>http://ecx.images-amazon.com/images/I/21V1XY3XJYL.jpg!:http://www.amazon.fr/gp/product/207042684X%3ftag=leblo-20%26link_code=xm2%26camp=2025%26dev-t=0KBJ59SK0VR923NMVKR2

h3. Parade nuptiale
h4. Book
authors
# Donald Kingsbury
# Michel Lederer

edited by Gallimard

published on 06/03/2003

---
:name: Parade nuptiale
:editor: Gallimard
:src: "#<Module:0x33f0f18>::AmazonAsin"
:authors:
- Donald Kingsbury
- Michel Lederer
:image: http://ecx.images-amazon.com/images/I/21V1XY3XJYL.jpg
:url: http://www.amazon.fr/gp/product/207042684X%3ftag=leblo-20%26link_code=xm2%26camp=2025%26dev-t=0KBJ59SK0VR923NMVKR2
:release_date: 2003-03-06
:main_category: Book
######################################

You'll obviously tell me that there is a problem, since there is more
than one content document, and I completely agree with you.
Unfortunately, I'm unable to create a compliant document, since the
Ruby YAML library does not seem to provide a way to add the required
"options" (or whatever should do the trick) to the document header,
in order to tell webgen that the last document should not be the
content one, but rather a kind of AmazonAsin content block.
So, do you know of any solution to this issue?
For your information, I add sections using
http://www.ruby-doc.org/core/classes/YAML/Stream.src/M007467.html

YAML::Stream.add(doc)
-- 
Nicolas Delsaux
Print this mail only if you cannot read it on screen: electrons
recycle well, paper much less so.

Nicolas Delsaux

2008-Feb-24 07:26 UTC


[webgen-users] Fwd: build a multi-document page file

On 2/23/08, Andrea Censi <andrea at cds.caltech.edu> wrote:
 >
 > Hi,
 >  I can't help you on your YAML problem, but I'd be interested in
 >  knowing more about your ISBN importer. Is it already available
 >  somewhere?
 >

I'll make it available, maybe today, on my website.
 But remember that it can't be considered a working plugin, since its
 pages contain more than one document starting with "--- \n".
 So don't consider it a stable plugin. For now it's only a
 development version, but I have high goals for this little project.
 For now, you can take a look at
 http://nicolas.delsaux.free.fr/webgen/informatique/web/webgen/pseudo-tags.en.html
 http://nicolas.delsaux.free.fr/webgen/informatique/web/webgen/isbn.en.html

 and you should obtain a quasi-working system (but no displayable
 ISBN). Also, take care: I may change the megastore folder in the near
 future.
 Finally, note that this code should be LGPL-licensed.

 --
 Nicolas Delsaux

Thomas Leitner

2008-Feb-24 19:50 UTC


[webgen-users] build a multi-document page file

Am Sat, 23 Feb 2008 16:00:53 +0100
schrieb "Nicolas Delsaux" <nicolas.delsaux at gmail.com>:
> Hi,
>
> <SNIP/>
> 
> published on 06/03/2003
> 
> ---
Just escape the three dashes with a backslash like in the following
line:
\---
> :name: Parade nuptiale
> <SNIP/>
If I understand your problem correctly, this should solve your
problem.

-- Thomas
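
The escaping suggested above can be sketched in Ruby as follows. This is a minimal sketch, assuming the content is held in a string; the helper name `escape_block_separators` is hypothetical and not part of webgen.

```ruby
# Hypothetical helper (not part of webgen): prefix every line that starts
# with three dashes with a backslash, so that the line is not read as a
# block separator.
def escape_block_separators(text)
  # The block form of gsub inserts the replacement string literally.
  text.gsub(/^---/) { '\\---' }
end

sample = "published on 06/03/2003\n\n---\n:name: Parade nuptiale\n"
puts escape_block_separators(sample)
```

Only lines that begin with the three dashes are touched; dashes in the middle of a line are left alone.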

Nicolas Delsaux

2008-Feb-25 09:18 UTC


[webgen-users] Fwd: build a multi-document page file

On 2/24/08, Thomas Leitner <t_leitner at gmx.at> wrote:
 >
 >
 >
 > Just escape the three dashes with a backslash like in the following
 >  line:
 >  \---
 >
 >  If I understand your problem correctly, this should solve your
 >  problem.
 >
 >

Hmm.

 This could be a solution. However, what I want is for the data grabbed
 from each provider to live in its own document.
 So your solution, although promising, does not seem applicable.
 What I would really like to have is a name for each document.
 Since you're able to read the document/parser from YAML, I was
 thinking it would be good to be able to write them too, like, say ...

 --- content
 blablabla
 --- AmazonIsan
 # AmazonIsan content

 --
 Nicolas Delsaux

Thomas Leitner

2008-Feb-25 10:47 UTC


[webgen-users] Fwd: build a multi-document page file

Am Mon, 25 Feb 2008 10:18:36 +0100
schrieb "Nicolas Delsaux" <nicolas.delsaux at gmail.com>:
> >  If I understand your problem correctly, this should solve your
> >  problem.
> 
> MMh
> 
>  This could be a solution. However, what I want is each data grabbed
>  from a provider in its own document.
>  So your solution, although promising, seems to be not so appliable.
>  What I would really would like to have is a document name per
> document. I was thinking, since you're able to read from YAML the
>  document/parser, would be to be able to write them, like, say ...
> 
>  --- content
>  blablabla
>  --- AmazonIsan
>  # AmazonIsan content
Ah, okay, I misunderstood your problem. You want to generate a file in
WebPage Format with a metadata section, a content section and an
AmazonIsan section by using YAML. Since WebPage Format is *not* a
series of YAML documents, this won't work this way. Since I have not
written a class for creating pages in WebPage Format, you will need to
do that manually, but this isn't too hard. Just dump the meta
information with hash.to_yaml, then add three dashes on a separate
line, then the content, then three dashes followed by 'AmazonIsan' and
then the AmazonIsan content.

Best regards,
  Thomas
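
The manual assembly described above might look like the following in Ruby. This is only a sketch under stated assumptions: the helper name `build_page` and the sample data are hypothetical, and the exact separator syntax webgen accepts should be checked against its documentation.

```ruby
require 'yaml'

# Hypothetical helper (not part of webgen): assemble a page by hand,
# following the steps described above.
def build_page(meta, content, blocks = {})
  parts = [meta.to_yaml]                     # meta information; to_yaml output starts with "---"
  parts << "---\n" << content.chomp << "\n"  # three dashes on their own line, then the content
  blocks.each do |name, data|
    # Drop the leading "---" line that to_yaml emits; otherwise it would
    # be read as one more block separator.
    body = data.to_yaml.sub(/\A---[ \t]*\n/, '')
    parts << "--- #{name}\n" << body         # three dashes followed by the block name
  end
  parts.join
end

page = build_page(
  { 'title' => 'Parade nuptiale' },
  "h3. Parade nuptiale\nedited by Gallimard",
  { 'AmazonIsan' => { 'editor' => 'Gallimard', 'main_category' => 'Book' } }
)
puts page
```

Content lines that themselves begin with three dashes would still need the backslash escaping mentioned earlier in the thread.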

Nicolas Delsaux

2008-Feb-25 10:58 UTC


[webgen-users] Fwd: build a multi-document page file

On 2/25/08, Thomas Leitner <t_leitner at gmx.at> wrote:
>
> Ah, okay, I misunderstood your problem. You want to generate a file in
>  WebPage Format with a metadata section, a content section and an
>  AmazonIsan section by using YAML.
Exactly, yes.
> Since WebPage Format is *not* a
>  series of YAML documents, this won't work this way.
Damn, that's plain weird :-O (note that I'm not criticizing your
choice, only expressing my astonishment)
> Since I have not
>  written a class for creating pages in WebPage Format, you will need to
>  do that manually, but this isn't too hard.
That sentence brings back some not-so-good memories from work ;-)
> Just dump the meta
>  information with hash.to_yaml, then add three dashes on a separate
>  line, then the content, then three dashes followed by 'AmazonIsan' and
>  then the AmazonIsan content.
Well ... here comes the drawback.
Take a look at my isbn_processor.rb
(http://nicolas.delsaux.free.fr/webgen/informatique/web/webgen/isbn.en.html).
Here is, in short, what it does.
For each ISBN number in the ISBN cache (this is a temporary step)
  For each grabber
    Grab content from the web (for Amazon, do a web service query and
    put the result in a big hash)
    Then put the request result in our own kind of hash
    Create a YAML document containing this hash and push it into the
    YAML stream (line 177)

Once that's done:

For each non-header and non-content YAML document, load the data and
choose the best values to populate the content.

The goal is to maintain a cache of web results in a file close to the
real page (and what is closer than the file itself?).
Obviously, I can think of a workaround: create an isbn.page.cache
auxiliary file that would contain my YAML documents, then create the
isbn.page file with only the relevant data.
Interesting, but a little more complicated, since I now have two files ...
However, I think I'll use this solution, to preserve the usability of
isbn.page as a webgen page (since my ultimate goal is to include it in
files using the <isbn value=""/> tag).
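
A minimal sketch of the cache half of that two-file workaround, assuming each grabber result is a plain hash. The helper names `write_cache` and `read_cache` are hypothetical and not taken from isbn_processor.rb; `YAML.dump_stream` and `YAML.load_stream` are the standard library calls for writing and reading several YAML documents in one file.

```ruby
require 'yaml'

# Hypothetical helpers: keep the raw provider results in a side file
# (for example isbn.page.cache) as a stream of YAML documents, one
# document per grabber result.
def write_cache(path, results)
  File.write(path, YAML.dump_stream(*results))
end

# Returns an array with one entry per YAML document in the cache file.
def read_cache(path)
  YAML.load_stream(File.read(path))
end
```

The isbn.page file itself can then hold only the metadata and content blocks, so it stays a regular webgen page.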

Thanks for your insights.

-- 
Nicolas Delsaux
