Displaying 20 results from an estimated 2000 matches similar to: "Getting custom field data from the page through crawling"
2006 Mar 29
1
htdig with omega for multiple URLs (websites)
Olly,
many thanks for suggesting htdig, you saved me a lot of time.
Htdig looks better than my original idea - wget, you were right.
Using htdig, I can crawl and search a single website - but I need to
integrate search of pages spread over 100+ sites. Learning, learning....
Htdig uses a separate document database for every website (one database
per URL to initiate crawling). Htdig also can merge
2006 May 26
1
Unicode troubles
Hi,
I've tried to follow all helpful tips I've found in the mailing-list
and I've applied these two utf-8 patches;
http://article.gmane.org/gmane.comp.search.xapian.general/2324
http://article.gmane.org/gmane.comp.search.xapian.general/1927
Now the QueryParser works as I want it to, and creates the terms
correctly. But sadly I can't find any documents. If I do this;
$ quest
2006 Mar 17
1
omega crawler: ht://dig or wget?
At wiki page: http://wiki.xapian.org/Omega
I added a comment that ht://Dig looks dead.
Does anybody really use it?
From a brief glance at the docs I had a feeling it is not easy to configure.
Maybe a better crawler is GNU wget? Mature, stable, maintained?
--
Peter Masiar
2001 Nov 08
0
[RHSA-2001:139-04] Updated htdig packages are available
---------------------------------------------------------------------
Red Hat, Inc. Red Hat Security Advisory
Synopsis: Updated htdig packages are available
Advisory ID: RHSA-2001:139-04
Issue date: 2001-10-24
Updated on: 2001-10-30
Product: Red Hat Linux
Keywords: htdig CGI htsearch DOS configuration file -c switch security
Cross
2001 Oct 09
0
Security Update: [CSSA-2001-035.0] Linux - Remote File View Problem in htdig
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
______________________________________________________________________________
Caldera International, Inc. Security Advisory
Subject: Linux - Remote File View Problem in htdig
Advisory number: CSSA-2001-035.0
Issue date: 2001, October 09
Cross reference:
______________________________________________________________________________
1.
2011 Apr 17
3
Report for http://trac.xapian.org/wiki/SupportedPlatforms
Hello :-)
There was probably no good reason to do this build but the Debian 6.0
Squeeze repo version was 1.2.3, we needed 1.2.4 and I didn't think of using
the package from unstable.
Arch: x86_64
Platform: Linux 2.6 Debian 6.0 (Squeeze)
Compiler: gcc version 4.4.5 (Debian 4.4.5-8)
Version: 1.2.4
Status: no known problems
Source: http://oligarchy.co.uk/xapian/1.2.4/xapian-core-1.2.4.tar.gz
2004 Nov 20
0
NT_LOGON_FAILURE setting up a Linux BDC
Hi,
We're trying to set up a Red Hat 9 box as a BDC for a domain, the PDC for that
domain is another RH9 machine. To do it we're using samba 2.2.7a and openLDAP
2.0.7 in both machines. We've followed the instructions from the Samba-PDC-Howto
and Samba-BDC-Howto from the samba.org. The PDC works fine but when I try to
list the shares of the BDC using my user I get a NT_LOGON_FAILURE
2007 Dec 05
0
CentOS-announce Digest, Vol 34, Issue 5
Send CentOS-announce mailing list submissions to
centos-announce at centos.org
To subscribe or unsubscribe via the World Wide Web, visit
http://lists.centos.org/mailman/listinfo/centos-announce
or, via email, send a message with subject or body 'help' to
centos-announce-request at centos.org
You can reach the person managing the list at
centos-announce-owner at centos.org
When
2016 Aug 22
1
RPC server is unavailable when using ADUC
Hello.
We're running Samba 4.3.9 AD on two Ubuntu 16.04 LTS machines. I'm managing
AD users and DNS from Windows 10 joined to the domain, by using ADUC.
Last week I noticed the following error when starting ADUC as Administrator
of the AD domain:
----
Naming information cannot be located because:
The RPC server is unavailable.
Contact your system administrator to verify that your domain
2007 Dec 04
0
CentOS-announce Digest, Vol 34, Issue 4
Send CentOS-announce mailing list submissions to
centos-announce at centos.org
To subscribe or unsubscribe via the World Wide Web, visit
http://lists.centos.org/mailman/listinfo/centos-announce
or, via email, send a message with subject or body 'help' to
centos-announce-request at centos.org
You can reach the person managing the list at
centos-announce-owner at centos.org
When
2006 Oct 17
2
RODBC and NULL values
Dear All,
Writing sooner than I thought I'd need to.
I'm using R 2.4 on Mac OS X, with RODBC, PostgreSQL 8.1 and Actual's
ODBC driver. I have all my data in Filemaker 8.5, but it is
automatically exported into PostgreSQL for analysis as Filemaker's ODBC
and JDBC access is awful, slow and has a tendency to crash.
I have disability data where for each patient there is a survival
2007 Dec 03
0
CESA-2007:1095 Moderate CentOS 4 s390(x) htdig - security update
CentOS Errata and Security Advisory 2007:1095
https://rhn.redhat.com/errata/RHSA-2007-1095.html
The following updated files have been uploaded and are currently
syncing to the mirrors:
s390:
updates/s390/RPMS/htdig-3.2.0b6-4.c4.s390.rpm
updates/s390/RPMS/htdig-web-3.2.0b6-4.c4.s390.rpm
s390x:
updates/s390x/RPMS/htdig-3.2.0b6-4.c4.s390x.rpm
updates/s390x/RPMS/htdig-web-3.2.0b6-4.c4.s390x.rpm
2003 Jan 06
1
replacing a w2k machine with samba 2.2.7a
Hi.
First, I would like to thank the samba developers for producing such a good product. Second, I have a few questions/remarks:
I have recently replaced a w2k file server running in a w2k domain (native mode) with samba 2.2.7a on RH 7.3 with the latest kernel, no acl, configured winbind, and ran into the problem described here:
2007 Dec 05
0
CESA-2007:1095 Moderate CentOS 5 x86_64 htdig Update
CentOS Errata and Security Advisory 2007:1095 Moderate
Upstream details at : https://rhn.redhat.com/errata/RHSA-2007-1095.html
The following updated files have been uploaded and are currently
syncing to the mirrors: ( md5sum Filename )
x86_64:
9cb4b14b7e1a32596705f2ed6882f7ef htdig-3.2.0b6-9.0.1.el5_1.x86_64.rpm
b96548484dfaf007eb3d4c362ed577f8 htdig-web-3.2.0b6-9.0.1.el5_1.x86_64.rpm
2007 Dec 05
0
CESA-2007:1095 Moderate CentOS 5 i386 htdig Update
CentOS Errata and Security Advisory 2007:1095 Moderate
Upstream details at : https://rhn.redhat.com/errata/RHSA-2007-1095.html
The following updated files have been uploaded and are currently
syncing to the mirrors: ( md5sum Filename )
i386:
b4b53fd6444cd16ca1ba49ff3326f2ca htdig-3.2.0b6-9.0.1.el5_1.i386.rpm
70f178075fab7be728b9bcdfff7f25ca htdig-web-3.2.0b6-9.0.1.el5_1.i386.rpm
Source:
2009 Mar 01
8
puppet and LDAP users
I am trying to get puppet to manage my LDAP users but I don't appear
to be having much success. What I have in puppet.conf is this
[puppetmasterd]
ldapserver=ldap.myorg.company.com
ldapbase=dc=myorg,dc=org
ldapuser=cn=admin,dc=myorg,dc=org
ldappassword=mysecret
ldapparentattr=dc=myorg,dc=org
I added the ldapparentattr in desperation and doubt if
2013 Apr 12
0
AD groups mapped to wrong GIDs
Hi list,
I need some help getting group mapping to work:
We've got a fileserver serving Linux clients via NFS.
NSS source for users and groups is LDAP (sssd).
nsswitch.conf:
[...]
passwd: compat sss
group: compat sss
shadow: compat sss
[...]
So far, this has worked quite well for years.
Now I tried to have our content served via Samba for our Windows
clients.
2002 Aug 01
1
Outlook/Express Crawling with Domains
I'm gonna try to give as complete a description as I can. Maybe someone
can point me in the right direction, as I haven't seen anything exactly
like this.
I'm attempting to switch from Workgroup to Domain at work. On a couple
machines I did fresh installs of XP, used the registry patch, and
successfully got logons, roaming profiles working. However, when a user
attempts to open
2012 Nov 17
1
fast parallel crawling of file systems
Hi, I use a disk space inventory tool called TreeSizePro to scan
filesystems on windows and linux boxes. On Linux systems I export
these shares via samba to scan them. TreeSizePro is multi-threaded (32
crawlers) and I run it on windows 7. I am scanning file systems that
are local to the linux servers and also nfs mounts that are
re-exported via samba.
If I scan a windows 2008 server I can
2010 Oct 14
0
[LLVMdev] llvm.org robots.txt prevents crawling by Google code search?
> indexing the llvm.org svn archive. This means that when you search for an
> LLVM-related symbol in code search, you get one of the many (possibly
> out-of-date) mirrors, rather than the up-to-date llvm.org version. This is
> sad.
This is intentional. The workload of the server was pretty huge w/o this.
--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics,