<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/">
  <channel rdf:about="http://permalink.gmane.org/gmane.comp.search.xapian.general">
    <title>gmane.comp.search.xapian.general</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xapian.general</link>
    <description/>
    <syn:updatePeriod>hourly</syn:updatePeriod>
    <syn:updateFrequency>1</syn:updateFrequency>
    <syn:updateBase>1901-01-01T00:00+00:00</syn:updateBase>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xapian.general/9601"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xapian.general/9600"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xapian.general/9599"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xapian.general/9598"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xapian.general/9597"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xapian.general/9596"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xapian.general/9595"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xapian.general/9594"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xapian.general/9593"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xapian.general/9592"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xapian.general/9591"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xapian.general/9590"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xapian.general/9589"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xapian.general/9588"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xapian.general/9587"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xapian.general/9586"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xapian.general/9585"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xapian.general/9584"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xapian.general/9583"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xapian.general/9582"/>
      </rdf:Seq>
    </items>
    <image rdf:resource="http://gmane.org/img/gmane-25t.png"/>
    <textinput rdf:resource="http://search.gmane.org/?group=$group=gmane.comp.search.xapian.general"/>
  </channel>
  <image rdf:about="http://gmane.org/img/gmane-25t.png">
    <title>Gmane</title>
    <url>http://gmane.org/img/gmane-25t.png</url>
    <link>http://gmane.org</link>
  </image>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xapian.general/9601">
    <title>Re: How to omindex some sub-directories?</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xapian.general/9601</link>
    <description>&lt;pre&gt;


As the person responsible for introducing subsites, I think it's worth pointing out that it was a fairly quick hack to do something specific that I needed…in about 1999. I'm pretty sure that I *did* use --no-delete, but I could be wrong (if I did, it would have been to reindex only parts of the site; I knew that documents almost never got deleted).

Obviously we should find ways of continuing to support anyone who's using it (or any of the features of omega / omindex), but I'd be in favour of ripping the whole thing out and starting again; probably either with something using symlinks or (more powerful, I think) a file that describes which directories &amp;amp; files are "mounted" in the URL space where. (We could then have something to autogen this from standard apache virtualhost / documentroot / alias directives; also I suspect fairly easy to write something that converts subsite invocations to the new style.)

With some care we should be able to ensure that there are suitable terms generated for each "mount&lt;/pre&gt;</description>
    <dc:creator>James Aylett</dc:creator>
    <dc:date>2013-05-17T14:15:22</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xapian.general/9600">
    <title>Re: How to omindex some sub-directories?</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xapian.general/9600</link>
    <description>&lt;pre&gt;
I guess you mean "A and B" (or "C" not "B" below)...


I think it is better to use --url (though I find the subsite stuff
confusing, so I may be misunderstanding the plan behind it):

omindex --db /my_db --no-delete --url /foo/A /foo/A
omindex --db /my_db --no-delete --url /foo/B /foo/B


Another approach is to "fake up" the tree you want to index with
symlinks and then index that - e.g.:

mkdir foo-to-index
ln -s /foo/A /foo/B foo-to-index
omindex --db /my_db --follow --url /foo foo-to-index

If you have symlinks in the tree you don't want to follow, then bind
mounts are another option (at least on Linux if you have root access).

You can also just search the databases together.  If you pass multiple
DB parameters to omega, it'll search them together.  You can also pass
DB parameters with a '/' in, which are split at the '/' into multiple
DB names to search.


Not as things are.  I think the "subsite" feature really needs
overhauling - this '--no-delete' restriction means it's pretty much
useless if you ev&lt;/pre&gt;</description>
    <dc:creator>Olly Betts</dc:creator>
    <dc:date>2013-05-16T21:59:19</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xapian.general/9599">
    <title>How to omindex some sub-directories?</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xapian.general/9599</link>
    <description>&lt;pre&gt;Given a directory tree like ...

/foo
|
+-- A
|
+-- B
|
+-- C

... what is the best way to index A and C into a single Xapian database?

AFAIK the alternatives are:

omindex --db /my_db --no-delete /foo /foo/A
omindex --db /my_db --no-delete /foo /foo/B

or

omindex --db /my_A_db /foo /foo/A
omindex --db /my_B_db /foo /foo/B
xapian-compact /my_A_db /my_B_db /my_db

The first alternative does not delete files deleted from the file system
from the database.  Is there any way around this except by emptying the
database and starting over?

The second alternative increases storage and processing requirements. 
OK if needs must but would prefer to avoid.

Are there any better alternatives?

Best, Charles
&lt;/pre&gt;</description>
    <dc:creator>Charles</dc:creator>
    <dc:date>2013-05-15T06:35:01</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xapian.general/9598">
    <title>Match positions of a queryresult</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xapian.general/9598</link>
    <description>&lt;pre&gt;Hello,

I've just started learning Xapian and I'm facing the following problem.

I've indexed many text files (using a TermGenerator from std::string), each
document in my database is a single file on the disk.
The search works pretty well and finds the files that match the query
string, but I can't figure out how I can determine the location of the
actual matched terms. I want to show the user the row and column number of
the match (to somehow highlight the match).

So far I haven't found a solution. The closest I've got is the
Enquire::get_matching_terms_* functions but this does not really work for
phrases and I'm still far from character positions.

I hope someone can give me some hints where to look to begin with.

Thanks,
- Tamás Cséri -
&lt;/pre&gt;</description>
    <dc:creator>Cséri Tamás</dc:creator>
    <dc:date>2013-05-15T02:11:07</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xapian.general/9597">
    <title>Re: What is the significance of 22 in the Debian libxapian22 packages</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xapian.general/9597</link>
    <description>&lt;pre&gt;
I've now uploaded squeeze backports of the 1.2.12 packages which are in
wheezy - xapian-core is visible, the others should appear shortly.

See http://backports.debian.org/ for details of how to enable backports.

Cheers,
    Olly
&lt;/pre&gt;</description>
    <dc:creator>Olly Betts</dc:creator>
    <dc:date>2013-05-08T02:02:26</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xapian.general/9596">
    <title>Re: remote backend</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xapian.general/9596</link>
    <description>&lt;pre&gt;Thanks Olly for the confirmation. And yes, my email program put an extra line break for some reason.

Michael

&lt;/pre&gt;</description>
    <dc:creator>Michael Lewis</dc:creator>
    <dc:date>2013-05-07T11:20:25</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xapian.general/9595">
    <title>Re: remote backend</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xapian.general/9595</link>
    <description>&lt;pre&gt;
I think your mail client has mangled the line breaks, so I'm not
exactly sure what you're trying there, but with the remote backend
over TCP, you instantiate one server for each database, and run them
on different ports, so this searches three remote databases:

remote 192.168.1.10:30000
remote 192.168.1.10:30001
remote 192.168.1.10:30002

And this searches one remote and one local database:

remote 192.168.1.10:30000
chert /var/lib/xapian_database/segment1

Though you are probably better off to use "auto" instead of "chert"
as that will work with other disk based backends.


Yes.  Or for omega, you can pass FMT=document_database.txt as one of the
CGI parameters.


Yes.

Cheers,
    Olly
&lt;/pre&gt;</description>
    <dc:creator>Olly Betts</dc:creator>
    <dc:date>2013-05-07T10:31:14</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xapian.general/9594">
    <title>Xapian 1.3.1 development snapshot released</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xapian.general/9594</link>
    <description>&lt;pre&gt;After rather longer than expected, Xapian 1.3.1 is now available.

Please note that 1.3.x releases are development releases - they are made
to encourage earlier and wider use and testing of new and changed code
so that 1.4.0 can both happen sooner and be better than otherwise.

Our record with 1.1.x was very good - all the bugs I am aware of were
either in new features, or were also present in the corresponding 1.0.x
release.  But if your main concern is minimising risk of accidental
breakage, sticking with 1.2.x would be prudent, at least for deployment.

The 1.3.x development series will lead to a stable 1.4.x release series.
We don't have a date set, but towards the end of this year seems plausible.

If you make packages of this release, please make sure that they are very
clearly labelled as not being a stable version, and ensure that they can be
installed in parallel with the stable version (the default paths and program
suffix are set to help make this easier).  If they are binary packages you
should al&lt;/pre&gt;</description>
    <dc:creator>Olly Betts</dc:creator>
    <dc:date>2013-05-04T04:40:38</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xapian.general/9593">
    <title>Re: Tutorial on OS X</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xapian.general/9593</link>
    <description>&lt;pre&gt;
I don't think that's a path much trodden before, but Jarrod Roberson
was working on something back in 2006:

http://thread.gmane.org/gmane.comp.search.xapian.general/3099/focus=3103

Cheers,
    Olly
&lt;/pre&gt;</description>
    <dc:creator>Olly Betts</dc:creator>
    <dc:date>2013-05-03T03:43:22</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xapian.general/9592">
    <title>Re: Compiling Xapian within a Cocoa project</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xapian.general/9592</link>
    <description>&lt;pre&gt;
For 1.3.1 I've added a workaround for this issue (I added parentheses
around the method name to stop the check() macro getting expanded).
I'll backport that for 1.2.16 too.


But I'd recommend doing this (or defining
__ASSERT_MACROS_DEFINE_VERSIONS_WITHOUT_UNDERSCORES to 0) anyway, as
it'll allow your code to build with older and newer Xapian versions.

Cheers,
    Olly
&lt;/pre&gt;</description>
    <dc:creator>Olly Betts</dc:creator>
    <dc:date>2013-05-01T22:33:09</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xapian.general/9591">
    <title>Tutorial on OS X</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xapian.general/9591</link>
    <description>&lt;pre&gt;Hi, guys.

Is there any example Cocoa project which uses Xapian?

Best,
Tae
&lt;/pre&gt;</description>
    <dc:creator>Tae</dc:creator>
    <dc:date>2013-05-01T18:42:27</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xapian.general/9590">
    <title>Re: replace_document issue</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xapian.general/9590</link>
    <description>&lt;pre&gt;Thanks guys, I'll try that.

Michael

&lt;/pre&gt;</description>
    <dc:creator>Michael Lewis</dc:creator>
    <dc:date>2013-04-30T12:42:13</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xapian.general/9589">
    <title>Re: replace_document issue</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xapian.general/9589</link>
    <description>&lt;pre&gt;[...]

In C++, there are two versions of replace_document() - one takes a
docid, while the other takes a term (so you can easily replace a
document by a unique key).  From PHP, which one you get depends on whether
you pass a string or an integer, so you probably want to use:

       $docid=$database-&amp;gt;replace_document(intval($rid),$doc);

Cheers,
    Olly
&lt;/pre&gt;</description>
    <dc:creator>Olly Betts</dc:creator>
    <dc:date>2013-04-30T12:33:30</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xapian.general/9588">
    <title>Re: replace_document issue</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xapian.general/9588</link>
    <description>&lt;pre&gt;replace_document() is overloaded in C++; there's:

http://xapian.org/docs/apidoc/html/classXapian_1_1WritableDatabase.html#23344c9000ea98b15d491fa875bd5d1e

which takes an integer, and uses that as the docid, and

http://xapian.org/docs/apidoc/html/classXapian_1_1WritableDatabase.html#43c4630ec482508667e9ca539f19cbf0

which takes a term, and uses that as a unique term

I suspect you're supplying a string in PHP, and using the latter form.  If
you cast it to an int, you may have more success.

In retrospect, this overloading is probably an API mistake, and it would
have been better to use a different method name for the version which takes
a string; it's clear in C++, but often a source of confusion in dynamically
typed languages.

You might also want to read through
http://trac.xapian.org/wiki/FAQ/UniqueIds for more on this topic.

HTH,
&lt;/pre&gt;</description>
    <dc:creator>Richard Boulton</dc:creator>
    <dc:date>2013-04-30T12:31:23</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xapian.general/9587">
    <title>replace_document issue</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xapian.general/9587</link>
    <description>&lt;pre&gt;I am converting an MySQL db to use xapian for full-text searches in PHP. I fetch the record ID and the text field to be indexed for each record and then index the document. I am putting the documents into three separate xapian dbs. I need to preserve the original record ID and use it for the xapian document ID. The code I use is:

      $r=$dh-&amp;gt;FetchArray();
        $rid=$r['id'];
        $sc=$r['sc'];
        $doc=new XapianDocument();
      $doc-&amp;gt;set_data($sc);
        $indexer-&amp;gt;index_text($sc);
      $docid=$database-&amp;gt;replace_document($rid,$doc);

However, when I use delve on the three xapian DBs I get the following:

delve  -V /var/lib/xapian/segment_1 | more
UUID = 9dd08d44-68a7-4e6b-987e-287dde7bf9c2
number of documents = 448741
average document length = 2284.29
document length lower bound = 1
document length upper bound = 498430
highest document id ever used = 449577
has positional information = true
[root&amp;lt; at &amp;gt;localhost sw]# delve  -V /var/lib/xapian/segment_2 | more
UUID = 8bf087e7-a9e6-4539-a08e-aeab382&lt;/pre&gt;</description>
    <dc:creator>Michael Lewis</dc:creator>
    <dc:date>2013-04-30T12:15:19</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xapian.general/9586">
    <title>Re: What is the significance of 22 in the Debian libxapian22 packages</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xapian.general/9586</link>
    <description>&lt;pre&gt;
It's the "soversion" - essentially it's an indicator of ABI
compatibility, and has no direct connection with the version
number:

http://www.debian.org/doc/debian-policy/ch-sharedlibs.html


You should just be able to rebuild the newer packages for squeeze -
I backported 1.2.7 and I don't think it needed any special changes:

http://packages.debian.org/source/stable-backports/xapian-omega

Backporting a newer version is on my todo list, but I'm rather busy
so I'm not sure when I'll get to it.  You might find this script which
mostly automates the process handy (change MIRROR near the top, unless
you also live in NZ):

http://trac.xapian.org/browser/trunk/xapian-maintainer-tools/debian/backport-source-packages

Cheers,
    Olly
&lt;/pre&gt;</description>
    <dc:creator>Olly Betts</dc:creator>
    <dc:date>2013-04-30T04:34:15</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xapian.general/9585">
    <title>Re: What is the significance of 22 in the Debian libxapian22 packages</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xapian.general/9585</link>
    <description>&lt;pre&gt;Sorry -- should have mentioned that the packages are for Debian squeeze,
so presumably it is not safe to use the wheezy or sid packages.
&lt;/pre&gt;</description>
    <dc:creator>Charles</dc:creator>
    <dc:date>2013-04-30T04:22:26</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xapian.general/9584">
    <title>What is the significance of 22 in the Debian libxapian22 packages</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xapian.general/9584</link>
    <description>&lt;pre&gt;What is the significance of 22 in the Debian libxapian22 and
libxapian22-dbg packages?  According to
http://packages.debian.org/search?keywords=libxapian22 they are built
from upstream versions 1.2.3 to 1.2.12 (of xapian-core?)

The reason for asking is that we need xapian-omega (and hence
xapian-core) at 1.2.8 or later.  Previously we have built and installed
directly from source but a package would be more convenient.  So I am
studying the existing Debian packages for guidance while creating 1.2.15
packages.

Charles
&lt;/pre&gt;</description>
    <dc:creator>Charles</dc:creator>
    <dc:date>2013-04-30T03:54:06</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xapian.general/9583">
    <title>Re: Compiling Xapian within a Cocoa project</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xapian.general/9583</link>
    <description>&lt;pre&gt;
It turns out that this check() macro is deprecated in favour of
__Check() - the old name is only supported for compatibility and is
due to be removed in a future OS X SDK release.  Meanwhile you can
define __ASSERT_MACROS_DEFINE_VERSIONS_WITHOUT_UNDERSCORES to 0 to avoid
these compatibility macros, e.g. by adding this before you include any
OS X headers:

#define __ASSERT_MACROS_DEFINE_VERSIONS_WITHOUT_UNDERSCORES 0

That also has the benefit of ensuring you don't use these deprecated
macros in your code.

There's a full explanation in the comments here:

http://www.opensource.apple.com/source/CarbonHeaders/CarbonHeaders-18.1/AssertMacros.h

Cheers,
    Olly
&lt;/pre&gt;</description>
    <dc:creator>Olly Betts</dc:creator>
    <dc:date>2013-04-30T02:04:17</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xapian.general/9582">
    <title>remote backend</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xapian.general/9582</link>
    <description>&lt;pre&gt;So, given what I've read in the documentation I would create a text file named document_database.txt that might have the following:



remote 192.168.1.10:30000

chert /var/lib/xapian_database/segment1

remote 192.168.1.10:30000 chert /var/lib/xapian_database/segment2

remote 192.168.1.10:30000 chert /var/lib/xapian_database/segment3

etc.



I would then in my PHP program open document_database.txt as the database and then perform normal Xapian calls. The same with Omega, I would, in my template, change the database name to document_database.txt.



Given the above, it appears that I do not have to refer to each segment (segment1, segment2, etc.) for my calls, as it appears they are aggregated and treated as a single database and Xapian takes care of which actual database segment to search. IE, searching all the segments for my search terms.



Michael



&lt;/pre&gt;</description>
    <dc:creator>Michael Lewis</dc:creator>
    <dc:date>2013-04-26T14:42:33</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xapian.general/9581">
    <title>Re: Converting MySQL database to Xapian</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xapian.general/9581</link>
    <description>&lt;pre&gt;Hi Michael,

On Thu, Apr 25, 2013 at 04:56:26PM -0500, Michael Lewis wrote:

we have a similar fragmented setup for our library catalogs, 193 at this time. See:

http://kug.ub.uni-koeln.de/

The data comes from different sources, e.g. not only from local
library systems, but also from OpenLibrary, WikiSource, Project
Gutenberg and is exported in a unified JSON format.

For each source catalog we create

a) a separate PostgreSQL database with all relevant information
(formerly we used MySQL, but switched to PostgreSQL recently)
b) a separate Xapian index. The data for each document is a
predefined set of catalog fields in JSON format for display in search
result lists. We use pure Xapian, no Omega etc.

With Xapian you can

a) join several or all indexes at search time. For lots of indexes we
experienced problems with wildcard searches, though...

b) physically join each index to a single index with xapian-compact.
This compacted index is optimized for search.

c) export its index and access it as a remote ba&lt;/pre&gt;</description>
    <dc:creator>Oliver Flimm</dc:creator>
    <dc:date>2013-04-26T14:27:17</dc:date>
  </item>
  <textinput rdf:about="http://search.gmane.org/?group=$group=gmane.comp.search.xapian.general">
    <title>Search Engine</title>
    <description>Search the mailing list at Gmane</description>
    <name>query</name>
    <link>http://search.gmane.org/?group=$group=gmane.comp.search.xapian.general</link>
  </textinput>
</rdf:RDF>
