On Thu, August 11, 2016 5:02 pm, John R Pierce wrote:
> On 8/11/2016 1:46 PM, Valeri Galtsev wrote:
>> Could someone recommend a script or utility one can run from the
>> command line on a Linux or UNIX machine to make a snapshot of a
>> webpage?
>>
>> We have digital signage (Xibo), and whoever creates/changes content
>> likes to add URLs of some webpages there. All works well if these are
>> webpages on our servers (which are pretty fast), but some external
>> servers often take time to respond and take time to assemble the page.
>> In addition, these servers sometimes get really busy, and when the
>> response takes longer than the time devoted to that content in the
>> signage window, the window hangs forever with a blank white field
>> until you restart the client. A trivial workaround: just take a
>> snapshot (as, say, a daily cron job) and point the signage client at
>> that snapshot. That will definitely solve it, and at the same time we
>> will stop bugging other people's servers without much need for it.
>>
>> But when I tried to search for some utility or script that makes a
>> webpage snapshot, I discovered that my ability to search has degraded
>> somehow...
>
> many/most webpages these days are heavily dynamic content, so a static
> snapshot would likely break. plus, any site-relative links on that
> snapshot would be pointing to your server, not the original, and any
> ajax code on that webpage would try to interact with your server, which
> won't be running the right back-end stuff, etc., etc.

I usually am not good at explaining what I need. I really only need an
image of what one would see in a web browser if one points to that URL. I
do not care about it being interactive. I also don't want to mirror the
content that URL points to at various "depths"; I don't want to use wget
or curl for this reason. That is what I tried first, and it breaks with at
least one of the web sites: they do seem to protect themselves from
"robots" or similar. And we don't need it. We just need to show what the
page shows today, that's all.

Valeri

>
> --
> john r pierce, recycling bits in santa cruz
>

++++++++++++++++++++++++++++++++++++++++
Valeri Galtsev
Sr System Administrator
Department of Astronomy and Astrophysics
Kavli Institute for Cosmological Physics
University of Chicago
Phone: 773-702-4247
++++++++++++++++++++++++++++++++++++++++
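If the goal is just a picture of what a browser would show, a command-line
rendering tool is probably closer to this than wget or curl. A minimal
sketch, assuming wkhtmltoimage (from the wkhtmltopdf project) or CutyCapt
is installed; the URL and output path below are placeholders:

  # render one URL to a PNG; no crawling, no mirroring
  wkhtmltoimage --width 1920 https://example.org/ /var/www/signage/snapshot.png

  # roughly the same with CutyCapt (Qt WebKit based)
  cutycapt --url=https://example.org/ --out=/var/www/signage/snapshot.png

Both load the page in an embedded browser engine and write out a single
image, which the signage client can then display as static content.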
Quoting Valeri Galtsev <galtsev at kicp.uchicago.edu>:

> On Thu, August 11, 2016 5:02 pm, John R Pierce wrote:
>> On 8/11/2016 1:46 PM, Valeri Galtsev wrote:
>>> Could someone recommend a script or utility one can run from the
>>> command line on a Linux or UNIX machine to make a snapshot of a
>>> webpage?
>>>
>>> We have digital signage (Xibo), and whoever creates/changes content
>>> likes to add URLs of some webpages there. All works well if these are
>>> webpages on our servers (which are pretty fast), but some external
>>> servers often take time to respond and take time to assemble the
>>> page. In addition, these servers sometimes get really busy, and when
>>> the response takes longer than the time devoted to that content in
>>> the signage window, the window hangs forever with a blank white field
>>> until you restart the client. A trivial workaround: just take a
>>> snapshot (as, say, a daily cron job) and point the signage client at
>>> that snapshot. That will definitely solve it, and at the same time we
>>> will stop bugging other people's servers without much need for it.
>>>
>>> But when I tried to search for some utility or script that makes a
>>> webpage snapshot, I discovered that my ability to search has degraded
>>> somehow...
>>
>> many/most webpages these days are heavily dynamic content, so a static
>> snapshot would likely break. plus, any site-relative links on that
>> snapshot would be pointing to your server, not the original, and any
>> ajax code on that webpage would try to interact with your server,
>> which won't be running the right back-end stuff, etc., etc.
>
> I usually am not good at explaining what I need. I really only need an
> image of what one would see in a web browser if one points to that URL.
> I do not care about it being interactive. I also don't want to mirror
> the content that URL points to at various "depths"; I don't want to use
> wget or curl for this reason. That is what I tried first, and it breaks
> with at least one of the web sites: they do seem to protect themselves
> from "robots" or similar. And we don't need it. We just need to show
> what the page shows today, that's all.
>
> Valeri

why not File -> Print -> .pdf?

D

>> --
>> john r pierce, recycling bits in santa cruz
>
> ++++++++++++++++++++++++++++++++++++++++
> Valeri Galtsev
> Sr System Administrator
> Department of Astronomy and Astrophysics
> Kavli Institute for Cosmological Physics
> University of Chicago
> Phone: 773-702-4247
> ++++++++++++++++++++++++++++++++++++++++

--
"As long as politics is the shadow cast on society by big business, the
attenuation of the shadow will not change the substance." -- John Dewey
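The same print-to-PDF idea can be scripted without a GUI: wkhtmltopdf does
essentially that step from the command line, so it can run unattended from
cron. A minimal sketch; the URL and output file are placeholders, and Xibo
may be happier with an image than with a PDF:

  wkhtmltopdf https://example.org/ /var/www/signage/snapshot.pdf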
On 8/11/2016 3:10 PM, Valeri Galtsev wrote:
> I usually am not good at explaining what I need. I really only need an
> image of what one would see in a web browser if one points to that URL.
> I do not care about it being interactive. I also don't want to mirror
> the content that URL points to at various "depths"; I don't want to use
> wget or curl for this reason. That is what I tried first, and it breaks
> with at least one of the web sites: they do seem to protect themselves
> from "robots" or similar. And we don't need it. We just need to show
> what the page shows today, that's all.

then screen capture is about it.... on too many sites, ALL the content is
dynamic. for instance,
https://www.google.com/maps/@36.9460899,-122.0268105,664a,20y,41.31t/data=!3m1!1e3

that page is composed of tiles of image data superimposed on the fly,
with ajax code running in the browser to fetch the layers displayed.

you simply can't fetch the HTML and make any sense of it; the browser is
running a complex application to display that.

--
john r pierce, recycling bits in santa cruz
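This is also why the usual answer for pages like that is a headless
browser rather than a plain HTTP fetch: the JavaScript is executed before
the page is rasterized. A sketch using PhantomJS and the rasterize.js
script that ships with its examples; the URL, output file, and viewport
width are placeholders:

  # loads the page in headless WebKit, lets scripts run, then saves an image
  phantomjs rasterize.js https://example.org/ /var/www/signage/snapshot.png 1920px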
On Thu, August 11, 2016 5:27 pm, John R Pierce wrote:
> On 8/11/2016 3:10 PM, Valeri Galtsev wrote:
>> I usually am not good at explaining what I need. I really only need an
>> image of what one would see in a web browser if one points to that
>> URL. I do not care about it being interactive. I also don't want to
>> mirror the content that URL points to at various "depths"; I don't
>> want to use wget or curl for this reason. That is what I tried first,
>> and it breaks with at least one of the web sites: they do seem to
>> protect themselves from "robots" or similar. And we don't need it. We
>> just need to show what the page shows today, that's all.
>
> then screen capture is about it.... on too many sites, ALL the content
> is dynamic. for instance,
> https://www.google.com/maps/@36.9460899,-122.0268105,664a,20y,41.31t/data=!3m1!1e3
>
> that page is composed of tiles of image data superimposed on the fly,
> with ajax code running in the browser to fetch the layers displayed.
>
> you simply can't fetch the HTML and make any sense of it; the browser
> is running a complex application to display that.
>

Yes, I understand as much, thanks. I'm still sure it is not a hopeless
task.

Valeri

++++++++++++++++++++++++++++++++++++++++
Valeri Galtsev
Sr System Administrator
Department of Astronomy and Astrophysics
Kavli Institute for Cosmological Physics
University of Chicago
Phone: 773-702-4247
++++++++++++++++++++++++++++++++++++++++
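Put together with the daily cron job mentioned at the start of the thread,
the workaround could look roughly like this; the schedule, tool, and paths
are placeholders, and the Xibo layout would then point at the local image
instead of the external URL:

  # /etc/cron.d/signage-snapshot: refresh the snapshot once a day at 06:00
  0 6 * * *  root  wkhtmltoimage --width 1920 https://example.org/ /var/www/signage/snapshot.png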