Displaying 20 results from an estimated 400 matches similar to: "Questions about backgroundrb"
2007 Jan 23
3
Someone getting RDig work for Linux?
I got this
root at linux:~# rdig -c configfile
RDig version 0.3.4
using Ferret 0.10.14
added url file:///home/myaccount/documents/
waiting for threads to finish...
root at linux:~# rdig -c configfile -q "Ruby"
RDig version 0.3.4
using Ferret 0.10.14
executing query >Ruby<
Query:
total results: 0
root at linux:~#
my configfile
I changed from config to cfg, because of maybe
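For comparison, here is a minimal RDig configfile for indexing local files, in the style of the run above. The paths are illustrative, not the poster's actual configuration:

```ruby
RDig.configuration do |cfg|
  # Index local documents via a file:// URL, as in the run shown above
  cfg.crawler.start_urls = ['file:///home/myaccount/documents/']
  # Where Ferret writes the index; rdig -c configfile -q searches this same path
  cfg.index.path = '/home/myaccount/index'
  cfg.verbose = true
end
```

If indexing appears to succeed but queries return zero results, the first thing to check is that the query run points at the same index path the crawl wrote to.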
2006 Oct 23
3
Design Dilemma - Please Help
Hi, I'm new. ;-)
I'm creating a little Rails app that will crawl the web on a regular
basis and then show the results.
The crawling will be scheduled, likely as a cron job.
I can't wrap my head around where to put my crawler. It doesn't seem
to fit.
An example:
Model - News Story
Controllers - Grab a story from the DB, sort the stories, search the
stories, etc.
View -
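One common way out of this dilemma is to keep the crawler outside MVC entirely: put it in lib/ and drive it from a rake task that cron invokes. A sketch, assuming a hypothetical Crawler class and the NewsStory model mentioned above:

```ruby
# lib/tasks/crawl.rake -- task and class names are illustrative
namespace :crawler do
  desc 'Fetch new stories and store them as NewsStory records'
  task :run => :environment do
    # Crawler is a plain Ruby class living in lib/crawler.rb;
    # depending on :environment loads Rails so the models are available
    Crawler.new.fetch_stories.each do |attrs|
      NewsStory.create(attrs)
    end
  end
end
```

cron then runs something like `cd /path/to/app && rake crawler:run` on whatever schedule you need, and the controllers only ever read from the DB.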
2006 Mar 29
1
htdig with omega for multiple URLs (websites)
Olly,
many thanks for suggesting htdig, you saved me a lot of time.
Htdig looks better than my original idea - wget, you were right.
Using htdig, I can crawl and search a single website - but I need to
integrate search of pages spread over 100+ sites. Learning, learning....
Htdig uses a separate document database for every website (one database
per URL to initiate crawling). Htdig also can merge
2010 Oct 14
1
[LLVMdev] llvm.org robots.txt prevents crawling by Google code search?
On Wed, Oct 13, 2010 at 11:10 PM, Anton Korobeynikov
<anton at korobeynikov.info> wrote:
> > indexing the llvm.org svn archive. This means that when you search for an
> > LLVM-related symbol in code search, you get one of the many (possibly
> > out-of-date) mirrors, rather than the up-to-date llvm.org version. This is
> > sad.
> This is intentional. The
2008 Oct 17
0
Wine release 1.0.1
The Wine maintenance release 1.0.1 is now available.
This is a maintenance release from the 1.0 stable branch. It contains
only translation updates and small bug fixes.
The source is available from the following locations:
http://ibiblio.org/pub/linux/system/emulators/wine/wine-1.0.1.tar.bz2
http://prdownloads.sourceforge.net/wine/wine-1.0.1.tar.bz2
Binary packages for various
2007 Sep 27
2
Problem getting "extract" from RDig
Hi All,
I need a site-wide search for my current application. By search
I mean I have to search both the static content and the dynamic content
from the database. I have been searching the net for a while and RDig
seems to be an apt solution. While using it I have encountered a few
problems. I know these might be very basic issues, but I have not been
able to figure out what is wrong with
2009 Sep 13
0
regexp_crawler -- a crawler which uses regular expressions to catch data from websites
RegexpCrawler is a crawler which uses regular expressions to extract data
from websites. It is easy to use and requires less code if you are
familiar with regular expressions.
The project site is: http://github.com/flyerhzm/regexp_crawler/tree
As an example, here is a script to synchronize your github projects,
excluding forked projects; please check example/github_projects.rb
require 'rubygems'
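The gem's exact API isn't shown in the snippet, but the core idea of regexp-based extraction can be sketched in plain Ruby. The HTML and the pattern below are made up for illustration:

```ruby
# Extract project names from a GitHub-style HTML listing with a regex.
# The markup and the /flyerhzm/ paths are illustrative, not real output.
html = '<li><a href="/flyerhzm/regexp_crawler">regexp_crawler</a></li>' \
       '<li><a href="/flyerhzm/other_project">other_project</a></li>'

# scan returns one capture group per match; flatten gives a plain list
projects = html.scan(%r{<a href="/flyerhzm/([\w-]+)">}).flatten
# projects => ["regexp_crawler", "other_project"]
```

The trade-off of this approach is the usual one: a few lines of regex beat a full parser for simple, stable markup, but break silently when the page layout changes.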
2006 Jul 25
1
RDig document processing error
Hi all,
Am having problems using RDig:
With this rdig config...
cfg.crawler.start_urls = ['http://www.defensetech.org']
cfg.crawler.include_hosts = ['www.defensetech.org']
cfg.index.path = '/my/path/to/index'
cfg.verbose = true
...I get this output:
$ rdig -c config/rdig_config.rb
/usr/local/lib/site_ruby/1.8/ferret/index/term.rb:45:
2000 Jun 27
0
FemFind - search engine for SMB/FTP shares
What is FemFind?
FemFind is a crawler/search engine for SMB shares. FemFind also
crawls FTP servers and provides a web interface and a Windows client as
frontends for searching.
What do I need to run it?
The FemFind crawler runs on a Unix platform (currently only Linux has
been tested). It utilizes a MySQL database. The web interface requires
a webserver. In addition some Perl modules
2007 Apr 19
0
scRUBYt! 0.2.8
This is long overdue (0.2.8 has been out for about a week already), but
anyway, here we go:
============
What's this?
============
scRUBYt! is a very easy to learn and use, yet powerful web scraping
framework based on Hpricot and mechanize. Its purpose is to free you
from the drudgery of web page crawling, looking up HTML tags,
attributes, XPaths, form names and other typical
2011 Mar 19
1
Weighting Schemes
Hi!
I am Praveen Kumar, an Applied Mathematics student, and I am interested in
developing other weighting schemes for Xapian through GSoC.
I have not had any formal course in Information Retrieval at my institute.
The theory that I presently know is
from the Xapian documentation and the other references and resources
mentioned on the website, which I read
to design our own Probabilistic Information
2010 Sep 24
0
[LLVMdev] Patch to allow llvm.gcroot to work with non-pointer allocas.
On Sep 22, 2010, at 8:52 AM, Talin wrote:
> I'm moving this thread to llvm-dev in the hopes of reaching a wider audience.
>
> This patch relaxes the restriction on llvm.gcroot so that it can work with non-pointer allocas. The only changes are to Verifier.cpp - it appears from my testing that llvm.gcroot always worked fine with non-pointer allocas, except that the verifier
2013 Jun 13
4
Rails 3 application capable of generating an offline version of itself for download as zip archive
I'm still kind of a newbie in RoR and I'm having a hard time trying to
figure out how I should implement this. I'm writing an application to
store and display information about insects and their distribution.
Currently I have almost all the functionality implemented, except for a
**very** important one: the application must be capable of "crawling"
itself and generate a
2007 Sep 18
4
basic rdig setup
I'm developing locally on Windows and I have a remote dev box that runs
Linux. I'm trying to use RDig just to index using URLs, no files.
Both use acts_as_ferret for an administrative search that works fine.
On the Windows machine, I get no errors, but get no results.
On the Linux machine, I get:
File Not Found Error occured at <except.c>:93 in xraise
Error occured in
2010 Oct 13
3
[LLVMdev] llvm.org robots.txt prevents crawling by Google code search?
One of the tools I use most frequently when coding is Google codesearch.
Unfortunately, llvm.org's robots.txt appears to block all crawlers from
indexing the llvm.org svn archive. This means that when you search for an
LLVM-related symbol in code search, you get one of the many (possibly
out-of-date) mirrors, rather than the up-to-date llvm.org version. This is
sad.
For more info, see the
2012 Nov 17
1
fast parallel crawling of file systems
Hi, I use a disk space inventory tool called TreeSizePro to scan
filesystems on Windows and Linux boxes. On Linux systems I export
these shares via Samba to scan them. TreeSizePro is multi-threaded (32
crawlers) and I run it on Windows 7. I am scanning filesystems that
are local to the Linux servers and also NFS mounts that are
re-exported via Samba.
If I scan a windows 2008 server I can
2010 Oct 02
0
[LLVMdev] Function inlining creates uninitialized stack roots
Hi Talin,
You are not doing anything wrong; it is just that the LLVM optimizers
treat llvm.gcroot like a regular function call. The alloca is most
probably moved into the first block because the inliner anticipates
another optimization pass (mem2reg).
Cheers,
Nicolas
On Sat, Oct 2, 2010 at 8:28 PM, Talin <viridia at gmail.com> wrote:
> I'm still putting the final touches on
2010 Sep 22
3
[LLVMdev] Patch to allow llvm.gcroot to work with non-pointer allocas.
I'm moving this thread to llvm-dev in the hopes of reaching a wider
audience.
This patch relaxes the restriction on llvm.gcroot so that it can work with
non-pointer allocas. The only changes are to Verifier.cpp - it appears from
my testing that llvm.gcroot always worked fine with non-pointer allocas,
except that the verifier wouldn't allow it. I've used this patch to build an
2011 May 17
2
HTML snapshots for crawlable ajax
Hi,
There doesn't seem to be any reference for taking HTML snapshots from
within a Rails server. I wonder how one could implement Google's
crawlable AJAX spec
(http://code.google.com/web/ajaxcrawling/docs/learn-more.html) on a
Rails application?
To summarize: I have a Rails application with a Javascript front-end
with lots of AJAX. I need Google to index the AJAX content, hence
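Under that spec, the crawler rewrites a #! URL into a query parameter the server can see and act on. A minimal sketch of the mapping (the helper name is made up; the real work is then serving a prerendered snapshot for such requests):

```ruby
require 'cgi'

# Google's scheme maps http://host/page#!state to
# http://host/page?_escaped_fragment_=state, which reaches the server
# (fragments after # never do).
def escaped_fragment_url(url)
  base, fragment = url.split('#!', 2)
  return url unless fragment  # no hash-bang: nothing to rewrite
  sep = base.include?('?') ? '&' : '?'
  "#{base}#{sep}_escaped_fragment_=#{CGI.escape(fragment)}"
end

escaped_fragment_url('http://example.com/stories#!id=7')
# => "http://example.com/stories?_escaped_fragment_=id%3D7"
```

On the Rails side, a controller would check params['_escaped_fragment_'] and render a static snapshot (e.g. captured with a headless browser) instead of the JavaScript-driven page.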
2010 Oct 02
2
[LLVMdev] Function inlining creates uninitialized stack roots
I'm still putting the final touches on my stack crawler, and I've run into a
problem having to do with function inlining and local stack roots.
As you know, all local roots must be initialized before you can make any
call to a function which might crawl the stack. My compiler ensures that all
local variables of a function are allocated, declared as root, and
initialized in the first