search for: crawl

Displaying 20 results from an estimated 638 matches for "crawl".

2018 Jan 08
0
different names for bricks
...uster.self-heal-readdir-size: 4KB
disperse.background-heals: 16
transport.address-family: inet
performance.readdir-ahead: on
cluster.background-self-heal-count: 20
cluster.self-heal-window-size: 16
cluster.self-heal-daemon: enable
storage.build-pgfid: on

gluster volume heal clifford statistics
Gathering crawl statistics on volume clifford has been successful
------------------------------------------------
Crawl statistics for brick no 0
Hostname of brick bmidata1
Starting time of crawl: Mon Jan 8 16:24:55 2018
Ending time of crawl: Mon Jan 8 16:24:55 2018
Type of crawl: INDEX
No. of entries healed: 0
No...
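The heal-statistics report quoted above is key/value text, so it can be picked apart with a few lines of code. A minimal sketch follows; the sample text is abridged from the post, and real output may differ between Gluster versions.

```python
# Minimal sketch: parse "gluster volume heal <vol> statistics" output
# into a dict of the "key: value" fields. Sample abridged from the post.
sample = """\
Gathering crawl statistics on volume clifford has been successful
------------------------------------------------
Crawl statistics for brick no 0
Hostname of brick bmidata1
Starting time of crawl: Mon Jan 8 16:24:55 2018
Ending time of crawl: Mon Jan 8 16:24:55 2018
Type of crawl: INDEX
No. of entries healed: 0
"""

def parse_heal_stats(text):
    stats = {}
    for line in text.splitlines():
        if ":" in line:
            # partition splits only at the first ":", so timestamps
            # like "16:24:55" stay intact in the value
            key, _, value = line.partition(":")
            stats[key.strip()] = value.strip()
    return stats

stats = parse_heal_stats(sample)
print(stats["Type of crawl"])          # INDEX
print(stats["No. of entries healed"])  # 0
```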
2007 Sep 18
4
basic rdig setup
...ments file On both machines I have run the indexer with no errors using: rdig -c config/rdig_config.rb Both machines have an index dir at the rails root that has two files, segments and segments_0. Both files look like they have next to nothing in them. Both rdig_config.rb files look like: cfg.crawler.start_urls = [ 'http://domain.tpl/' ] cfg.crawler.include_hosts = [ 'domain.tpl/' ] cfg.index.path = './rdig_index' cfg.verbose = true cfg.content_extraction = OpenStruct.new( :hpricot => OpenStruct.new( :titl...
2008 Mar 25
0
Questions about backgroundrb
...d to incorporating > it into my site. > > I had several questions regarding implementing some features on my site > using backgroundrb. If you could help guide me in any way with any of > these, that would be great! > > Background: I'm trying to write a series of web crawler tasks. This is my > first time writing a robust web crawler. > > A new web crawler task is initiated whenever a user decides to track > information from a new site. Upon initialization by the user, that web > crawler is supposed to run using backgroundrb and then (1) save the...
2017 Sep 29
1
Gluster geo replication volume is faulty
...s/arbiter/gv0 (arbiter) Options Reconfigured: nfs.disable: on transport.address-family: inet I set up passwordless ssh login from all the master servers to all the slave servers then created and started the geo replicated volume I check the status and they switch between being active with history crawl and faulty with n/a every few seconds MASTER NODE MASTER VOL MASTER BRICK SLAVE USER SLAVE SLAVE NODE STATUS CRAWL STATUS LAST_SYNCED -------------------------------------------------------------------------------------------------------------------...
2018 Nov 15
3
Mail slowed down to a crawl...
Been moving along just fine for a couple years now, then in the last two days, email has slowed to a crawl. Retrieving email via IMAP is very slow, progress bar in the mail client shows that it is downloading messages almost constantly, like it never closes the connection, or it's getting the messages very slowly. The server that dovecot/postfix is on has also bogged down to a crawl. The postfix queu...
2009 May 27
1
R package installation (PR#13726)
...m32 -O2 -g" export FFLAGS="-m32 -O2 -g" export FCFLAGS="-m32 -O2 -g" export OBJCFLAGS="-m32 -O2 -g" export LIBnn=lib ./configure --with-x=no --enable-R-shlib --prefix=/prefix Now try to install a package which has Fortran files inside: /prefix/bin/R CMD INSTALL crawl_1.0-4.tar.gz * Installing to library '/prefix/lib/R/library' * Installing *source* package 'crawl' ... ** libs gfortran -fpic -m32 -O2 -g -c crwDriftN2ll.f90 -o crwDriftN2ll.o gfortran -fpic -m32 -O2 -g -c crwDriftPredict.f90 -o crwDriftPredict.o gfortran -fpic -m32 -O2 -g -c...
2006 Oct 23
3
Design Dilemma - Please Help
Hi, I'm new. ;-) I'm creating a little rails app, that will crawl the web on a regular basis and then show the results. The crawling will be scheduled, likely a cron job. I can't wrap my head around where to put my crawler. It doesn't seem to fit. An example: Model - News Story Controllers - Grabs a story from the DB, Sort the Stories, Searc...
2006 Mar 29
1
htdig with omega for multiple URLs (websites)
Olly, many thanks for suggesting htdig, you saved me a lot of time. Htdig looks better than my original idea - wget, you were right. Using htdig, I can crawl and search a single website - but I need to integrate search of pages spread over 100+ sites. Learning, learning.... Htdig uses a separate document database for every website (one database per URL to initiate crawling). Htdig also can merge result databases to allow search of integrated results. I...
2018 Feb 07
2
add geo-replication "passive" node after node replacement
...s S3 does not know about the geo-replica and it is not ready to geo-replicate in case S2 goes down. Here was the original geo-rep status # gluster volume geo-replication status MASTER NODE MASTER VOL MASTER BRICK SLAVE USER SLAVE SLAVE NODE STATUS CRAWL STATUS LAST_SYNCED ---------------------------------------------------------------------------------------------------------------------------------------------------------------- S2 sharedvol /home/sharedvol root ssh://S5::sharedvolslave S5 Passive N/A N/A S1...
2017 Oct 06
0
Gluster geo replication volume is faulty
...you can use a tool called "gluster-georep-setup", which doesn't require initial passwordless step. http://aravindavk.in/blog/gluster-georep-tools/ https://github.com/aravindavk/gluster-georep-tools > > I check the status and they switch between being active with history > crawl and faulty with n/a every few seconds > MASTER NODE    MASTER VOL    MASTER BRICK    SLAVE USER > SLAVE                SLAVE NODE    STATUS    CRAWL STATUS > LAST_SYNCED > --------------------------------------------------------------------------------------------...
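The flapping the poster describes can be spotted mechanically by scanning the status table for Faulty rows. Below is a minimal sketch; the node names (gfs1, gfs2, slave1) and the exact column layout are assumptions reconstructed from the posts above, and real output varies by Gluster version.

```python
# Minimal sketch: flag faulty sessions in the tabular output of
# "gluster volume geo-replication status". Hostnames and column
# widths below are hypothetical, modeled on the posts above.
sample = """\
MASTER NODE    MASTER VOL    MASTER BRICK    SLAVE USER    SLAVE                    SLAVE NODE    STATUS    CRAWL STATUS     LAST_SYNCED
---------------------------------------------------------------------------------------------------------------------------------------
gfs1           gv0           /data/gv0       root          ssh://slave1::gv0slave   slave1        Active    History Crawl    N/A
gfs2           gv0           /data/gv0       root          ssh://slave1::gv0slave   N/A           Faulty    N/A              N/A
"""

def faulty_sessions(text):
    faulty = []
    for line in text.splitlines()[2:]:      # skip header and separator
        cols = line.split()
        # column 6 (0-based) is STATUS in this assumed layout
        if len(cols) >= 7 and cols[6] == "Faulty":
            faulty.append(cols[0])          # master node name
    return faulty

print(faulty_sessions(sample))  # ['gfs2']
```

Polling this periodically and alerting on a non-empty result is one way to catch sessions that oscillate between Active and Faulty.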
2018 Feb 21
2
Geo replication snapshot error
...s with: snapshot create: failed: geo-replication session is running for the volume vol. Session needs to be stopped before taking a snapshot. gluster volume geo-replication status MASTER NODE MASTER VOL MASTER BRICK SLAVE USER SLAVE SLAVE NODE STATUS CRAWL STATUS LAST_SYNCED ------------------------------------------------------------------------------------------------------------------------------------------------- ggluster1 vol /gluster geouser ssh://geouser at ggluster1-geo::vol N/A Paused...
2007 Feb 08
1
Getting custom field data from the page through crawling
...ext quest is to implement a system of creating custom fields in the index. Our site is fully dynamic. That is, every page is generated in PHP and there are enough different kinds of pages that I wouldn't want to get into the business of indexing the DB directly, so I think that using htdig to crawl the site is the best way to go.. But, I would like to be able to search for things by field such as 'type', 'category', 'name', 'city', etc. I thought about it a lot and also did a lot of reading and research in the list archives but couldn't come up with any w...
2019 Aug 04
2
browsers slowing Centos 7 installation to a crawl
elinks does not seem to be working for me. I typed in google.com as my first url. There seems to be no way out of google, nor any way further in. No place to type a url. What appears to be the search window is black and does not accept input. Oops. Now I seem to have clicked on google help or something. There seems no way to back up. Ok. Found the left arrow. Still no way to search or to get out
2018 Nov 15
0
Mail slowed down to a crawl...
Den 15.11.2018 21:00, skrev StarionTech (IMAP): > Been moving along just fine for a couple years now, then in the last two days, email has slowed to a crawl. > > Retrieving email via IMAP is very slow, progress bar in the mail client shows that it is downloading messages almost constantly, like it never closes the connection, or it's getting the messages very slowly. The server that dovecot/postfix is on has also bogged down to a crawl. > ...
2000 Jul 28
0
M$ Photodraw 2000 brings box to a crawl
LTho wrote: > > One of my users is using Microsoft Photodraw 2000 ver 2.0.0.0915. She > can save to the samba server 3, maybe 4 times, after that the box > slows to a crawl; everybody starts getting semaphore timeouts on their > boxes, and linux activities slow to a crawl. On Linux, I'd connect from a test client, find the right smbd process (using smbstatus and ps) and run strace on it as you access the file. Watching the trace should give you a hint as...
2006 Jan 19
1
IMQ slows computer to a crawl
I am attempting to implement IMQ on a 2.4.31 version kernel with iptables 1.3.3. I am following the example at http://www.linuximq.net/usage.html. When I enter the line iptables -t mangle -A POSTROUTING -o eth1 -j IMQ --todev 1 (eth1 is the external interface), the computer slows to a crawl. OK, the CPU is only an AMD K6 233 which is not the world's greatest CPU, but egress shaping is done at acceptable speed. Neither top nor free is any help. top says the system is using 35% and user about 1%, with load averages in the range of 0.2x, 0.2x and 0.1x and top itself is at the...
2018 Nov 16
1
Mail slowed down to a crawl...
...vecot transactions. Jeff J. > On Nov 15, 2018, at 4:35 PM, Håkon Alstadheim <hakon at alstadheim.priv.no> wrote: > > > Den 15.11.2018 21:00, skrev StarionTech (IMAP): >> Been moving along just fine for a couple years now, then in the last two days, email has slowed to a crawl. >> >> Retrieving email via IMAP is very slow, progress bar in the mail client shows that it is downloading messages almost constantly, like it never closes the connection, or it's getting the messages very slowly. The server that dovecot/postfix is on has also bogged down to a crawl....
2012 Nov 17
1
fast parallel crawling of file systems
Hi, I use a disk space inventory tool called TreeSizePro to scan filesystems on windows and linux boxes. On Linux systems I export these shares via samba to scan them. TreeSizePro is multi-threaded (32 crawlers) and I run it on windows 7. I am scanning file systems that are local to the linux servers and also nfs mounts that are re-exported via samba. If I scan a windows 2008 server I can scan many million files in about 1 hour, If I do the same thing with samba it takes more than 1 day. It takes longe...
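The kind of multi-threaded scan described here can be approximated in a few lines. Below is a generic breadth-first sketch (worker count and paths are arbitrary), not TreeSizePro's actual implementation: each round, the workers scan the current batch of directories with os.scandir and queue any subdirectories for the next round.

```python
# Minimal sketch of a multi-threaded file-system crawler, in the
# spirit of the 32-crawler scan described above. This is a generic
# breadth-first walk, not TreeSizePro's implementation.
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

def count_files(root, workers=8):
    total = 0
    pending = [root]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while pending:
            batch, pending = pending, []
            # scan one directory per worker, in parallel
            for entries in pool.map(lambda d: list(os.scandir(d)), batch):
                for e in entries:
                    if e.is_dir(follow_symlinks=False):
                        pending.append(e.path)   # descend next round
                    else:
                        total += 1
    return total

# Tiny demo on a throwaway tree: one subdirectory, 3 files in total.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "sub"))
for name in ("a", "b", os.path.join("sub", "c")):
    open(os.path.join(root, name), "w").close()
print(count_files(root))  # 3
```

Over samba or NFS the per-directory round trips dominate, which is why parallelism helps far more there than on a local disk.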
2003 Apr 17
0
2.2.8 slows down to a crawl within a few days
We have been using samba for years now and are generally very happy. But since a recent upgrade to 2.2.8 we have seen samba slowing down periodically to a crawl every 2 or 3 days. BTW the start of these problems coincided with the upgrade to samba 2.2.8 and a partial cabling upgrade to 100 MBit, which also coincided with the introduction of the first Win XP Pro machines to our LAN. ;-( We have about twenty users logged in at any one time and when it come...
2019 Aug 04
4
browsers slowing Centos 7 installation to a crawl
My video problems mentioned in a previous thread are gone, though I do not know why. Now my problem is that whenever I have a browser open and an internet connection, my Centos 7 slows to a crawl. Chromium seems to be the least bad. Sometimes it slows to the point that I cannot even move the mouse. Even switching between virtual terminals takes a while sometimes. When I get there, top generally shows me between two and five D states. I've changed service providers lately. The problem su...