search for: crawling

Displaying 20 results from an estimated 638 matches for "crawling".

2018 Jan 08
0
different names for bricks
I just noticed that gluster volume info foo and gluster volume heal foo statistics use different indices for brick numbers. Info uses 1-based but heal statistics uses 0-based.
gluster volume info clifford
Volume Name: clifford
Type: Distributed-Replicate
Volume ID: 0e33ff98-53e8-40cf-bdb0-3e18406a945a
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1:
2007 Sep 18
4
basic rdig setup
I'm developing locally on Windows and I have a remote dev box that runs Linux. I'm trying to use RDig just to index using urls, no files. Both use acts_as_ferret for an administrative search that works fine. On the Windows machine, I get no errors, but get no results. On the Linux machine, I get: File Not Found Error occured at <except.c>:93 in xraise Error occured in
2008 Mar 25
0
Questions about backgroundrb
...web crawler runs once, it is then scheduled to run periodically on a daily basis, saving information to the db but not generating any xml or json to send back to the view. The questions I have: Is the following the best/most scalable way to implement?...Each site I am crawling gets its own worker -- e.g., MyspaceWorker. Within each worker, I have a crawl method that uses concurrency to avoid latency when crawling one set of several web pages within one website. If 2 users decide to track two different sets of pages from a given website, then I declare t...
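The pattern that post describes (one worker per site, with concurrent page fetches inside the worker so network latency overlaps) is independent of backgroundrb. As a rough sketch only, the inner step might look like the following Python; the URLs, function names, and thread count are illustrative assumptions, not taken from the thread.

# Minimal sketch (Python 3) of "one crawl per site, concurrent page fetches inside it".
# The URLs and worker count below are placeholders, not from the original post.
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

def fetch(url):
    # Download one page; the work is I/O-bound, so threads overlap the latency.
    with urlopen(url, timeout=10) as resp:
        return url, resp.read()

def crawl_site(page_urls, max_workers=8):
    # Fetch every tracked page of a single site concurrently; return url -> body.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(fetch, page_urls))

if __name__ == "__main__":
    pages = ["http://example.com/a", "http://example.com/b"]  # hypothetical tracked pages
    bodies = crawl_site(pages)
    print({url: len(body) for url, body in bodies.items()})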
2017 Sep 29
1
Gluster geo replication volume is faulty
I am trying to set up geo-replication between two gluster volumes. I have set up two replica 2 arbiter 1 volumes with 9 bricks.
[root@gfs1 ~]# gluster volume info
Volume Name: gfsvol
Type: Distributed-Replicate
Volume ID: c2fb4365-480b-4d37-8c7d-c3046bca7306
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x (2 + 1) = 9
Transport-type: tcp
Bricks:
Brick1: gfs2:/gfs/brick1/gv0
Brick2:
2018 Nov 15
3
Mail slowed down to a crawl...
Been moving along just fine for a couple years now, then in the last two days, email has slowed to a crawl. Retrieving email via IMAP is very slow, progress bar in the mail client shows that it is downloading messages almost constantly, like it never closes the connection, or it's getting the messages very slowly. The server that dovecot/postfix is on has also bogged down to a crawl. The
2009 May 27
1
R package installation (PR#13726)
Full_Name: Lukasz Andrzej Bartnik
Version: 2.8.1
OS: RHELS 5.2
Submission from: (NULL) (194.181.94.250)
Compile R for 32 bit on a 64 bit machine:
unset LD_LIBRARY_PATH
unset R_LD_LIBRARY_PATH
export CC="gcc -m32"
export CXXFLAGS="-m32 -O2 -g"
export FFLAGS="-m32 -O2 -g"
export FCFLAGS="-m32 -O2 -g"
export OBJCFLAGS="-m32 -O2 -g"
export LIBnn=lib
2006 Oct 23
3
Design Dilemma - Please Help
Hi, I'm new. ;-) I'm creating a little rails app that will crawl the web on a regular basis and then show the results. The crawling will be scheduled, likely a cron job. I can't wrap my head around where to put my crawler. It doesn't seem to fit. An example:
Model - News Story
Controllers - Grabs a story from the DB, Sort the Stories, Search the Stories etc.
View - HTML News Story, RSS Story etc.
Then a I...
2006 Mar 29
1
htdig with omega for multiple URLs (websites)
...Htdig looks better than my original idea - wget, you were right. Using htdig, I can crawl and search a single website - but I need to integrate search of pages spread over 100+ sites. Learning, learning.... Htdig uses a separate document database for every website (one database per URL to initiate crawling). Htdig can also merge result databases to allow searching of the integrated results. If you still have the script around that you said you wrote to use htdig as a crawler front-end for omega, I would be really interested to see it. My htdig crawls a single site. I need to learn how to crawl multiple sites an...
2018 Feb 07
2
add geo-replication "passive" node after node replacement
Hi all, I had a replica 2 gluster 3.12 between S1 and S2 (1 brick per node) geo-replicated to S5, where both S1 and S2 were visible in the geo-replication status and S2 was "active" while S1 was "passive". I had to replace S1 with S3, so I did an "add-brick replica 3 S3" and then a "remove-brick replica 2 S1". Now I again have a replica 2 gluster between S3 and S2
2017 Oct 06
0
Gluster geo replication volume is faulty
On 09/29/2017 09:30 PM, rick sanchez wrote:
> I am trying to set up geo-replication between two gluster volumes
>
> I have set up two replica 2 arbiter 1 volumes with 9 bricks
>
> [root@gfs1 ~]# gluster volume info
> Volume Name: gfsvol
> Type: Distributed-Replicate
> Volume ID: c2fb4365-480b-4d37-8c7d-c3046bca7306
> Status: Started
> Snapshot Count: 0
> Number
2018 Feb 21
2
Geo replication snapshot error
Hi all, I use gluster 3.12 on CentOS 7. I am writing a snapshot program for my geo-replicated cluster. When I started running tests with my application, I found very strange behavior regarding geo-replication in gluster. I have set up my geo-replication according to the docs: http://docs.gluster.org/en/latest/Administrator%20Guide/Geo%20Replication/ Both master and slave clusters are
2007 Feb 08
1
Getting custom field data from the page through crawling
Now on to my next question. I've got the search and indexing working well for now. My next quest is to implement a system of creating custom fields in the index. Our site is fully dynamic. That is, every page is generated in PHP and there are enough different kinds of pages that I wouldn't want to get into the business of indexing the DB directly, so I think that using htdig to crawl
2019 Aug 04
2
browsers slowing Centos 7 installation to a crawl
elinks does not seem to be working for me. I typed in google.com as my first URL. There seems to be no way out of google, nor any way further in. No place to type a URL. What appears to be the search window is black and does not accept input. Oops. Now I seem to have clicked on google help or something. There seems to be no way to back up. Ok. Found the left arrow. Still no way to search or to get out
2018 Nov 15
0
Mail slowed down to a crawl...
On 15.11.2018 21:00, StarionTech (IMAP) wrote:
> Been moving along just fine for a couple years now, then in the last two days, email has slowed to a crawl.
>
> Retrieving email via IMAP is very slow, progress bar in the mail client shows that it is downloading messages almost constantly, like it never closes the connection, or it's getting the messages very slowly. The server that
2000 Jul 28
0
M$ Photodraw 2000 brings box to a crawl
LTho wrote:
> One of my users is using Microsoft Photodraw 2000 ver 2.0.0.0915. She can save to the samba server 3, maybe 4 times, after that the box slows to a crawl; everybody starts getting semaphore timeouts on their boxes, and linux activities slow to a crawl.
On Linux, I'd connect from a test client, find the right smbd process (using smbstatus and ps) and
2006 Jan 19
1
IMQ slows computer to a crawl
I am attempting to implement IMQ on a 2.4.31 version kernel with iptables 1.3.3. I am following the example at http://www.linuximq.net/usage.html. When I enter the line iptables -t mangle -A POSTROUTING -o eth1 -j IMQ --todev1 (eth1 is the external interface), the computer slows to a crawl. OK, the CPU is only an AMD K6 233 which is not the world's greatest CPU, but egress shaping is
2018 Nov 16
1
Mail slowed down to a crawl...
Yes, I thought all of those things too. Spent a good part of yesterday eliminating them as the cause. top and iotop do not show any issues. No spam filtering on this box (it's a separate server/software). Drive is 1/3 full. No SMART errors. The queue problem (and I don't know why this isn't handled better) turned out to be a couple of corrupted messages in the queue. Had about 400 messages
2012 Nov 17
1
fast parallel crawling of file systems
Hi, I use a disk space inventory tool called TreeSizePro to scan filesystems on Windows and Linux boxes. On Linux systems I export these shares via samba to scan them. TreeSizePro is multi-threaded (32 crawlers) and I run it on Windows 7. I am scanning file systems that are local to the Linux servers and also NFS mounts that are re-exported via samba. If I scan a Windows 2008 server I can
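The approach TreeSizePro takes (many crawler threads walking the tree in parallel) can be approximated directly on the Linux side. Below is a small illustrative Python sketch of a multi-threaded scanner that sums file sizes per top-level directory; the root path and worker count are assumptions, not taken from the post.

# Rough sketch of a parallel file-system crawler summing sizes per subtree.
# The root path ("/srv/export") and 32 workers are illustrative assumptions.
import os
from concurrent.futures import ThreadPoolExecutor

def subtree_size(path):
    # Walk one subtree and return (path, total bytes); unreadable entries are skipped.
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path, onerror=lambda e: None):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass
    return path, total

def scan(root, workers=32):
    # Crawl each top-level directory under root in its own thread, like 32 parallel crawlers.
    tops = [entry.path for entry in os.scandir(root) if entry.is_dir(follow_symlinks=False)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(subtree_size, tops))

if __name__ == "__main__":
    for path, size in sorted(scan("/srv/export").items()):  # hypothetical export path
        print(f"{size:>15}  {path}")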
2003 Apr 17
0
2.2.8 slows down to a crawl within a few days
We have been using samba for years now and are generally very happy. But since a recent upgrade to 2.2.8 we have seen samba slowing down periodically to a crawl every 2 or 3 days. BTW the start of these problems coincided with the upgrade to samba 2.2.8 and a partial cabling upgrade to 100 MBit, which also coincided with the introduction of the first Win XP Pro machines to our LAN. ;-( We have
2019 Aug 04
4
browsers slowing Centos 7 installation to a crawl
My video problems mentioned in a previous thread are gone, though I do not know why. Now my problem is that whenever I have a browser open and an internet connection, my CentOS 7 slows to a crawl. Chromium seems to be the least bad. Sometimes it slows to the point that I cannot even move the mouse. Even switching between virtual terminals takes a while sometimes. When I get there, top generally