<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/">
  <channel rdf:about="http://permalink.gmane.org/gmane.comp.search.xappy.general">
    <title>gmane.comp.search.xappy.general</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xappy.general</link>
    <description/>
    <syn:updatePeriod>hourly</syn:updatePeriod>
    <syn:updateFrequency>1</syn:updateFrequency>
    <syn:updateBase>1901-01-01T00:00+00:00</syn:updateBase>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xappy.general/103"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xappy.general/102"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xappy.general/101"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xappy.general/100"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xappy.general/99"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xappy.general/98"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xappy.general/97"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xappy.general/96"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xappy.general/95"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xappy.general/94"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xappy.general/93"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xappy.general/92"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xappy.general/91"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xappy.general/90"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xappy.general/89"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xappy.general/88"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xappy.general/87"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xappy.general/86"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xappy.general/85"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.search.xappy.general/84"/>
      </rdf:Seq>
    </items>
    <image rdf:resource="http://gmane.org/img/gmane-25t.png"/>
    <textinput rdf:resource="http://search.gmane.org/?group=$group=gmane.comp.search.xappy.general"/>
  </channel>
  <image rdf:about="http://gmane.org/img/gmane-25t.png">
    <title>Gmane</title>
    <url>http://gmane.org/img/gmane-25t.png</url>
    <link>http://gmane.org</link>
  </image>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xappy.general/103">
    <title>Re: is EXACT_MATCH working?</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xappy.general/103</link>
    <description>&lt;pre&gt;
Hmm, ok so I got this sort of working.  If I do a search for
exact_match: dbus it finds it but that isn't exactly what I am trying
to achieve here.  Say I have a list of package names with descriptions
which mention dbus to a varying degree:

dbus-python - 2 hits for dbus in description
dbus - 1 hit for dbus in description
Perl-DBus - 5 hits for dbus in description

I want the search for "dbus" to weight exact matches higher than non-
exact matches so it would return:

dbus
Perl-DBus
dbus-python

right now my searches for dbus do not even show dbus in the top 10.
I've tried querying for exact_name:dbus dbus with various default ops
and it either returns nothing at all or the same results as dbus.  I
noticed that INDEX_EXACT doesn't do weighting so it is most likely not
what I am looking for.  I could possibly add my own prefix and weight
that and add it to the search terms but the PREFIXdbus-python may
still be ranked above PREFIXdbus (but it would fix the PREFIXPerl-DBus
showing up at the top).
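
A minimal sketch of the ordering described above, in plain Python rather
than xapian. The hit counts come from the example list; EXACT_BOOST and the
whole-name comparison are assumptions made for illustration, not xappy or
xapian behaviour.

```python
# Example data from the list above: (package name, hits for "dbus").
packages = [("dbus-python", 2), ("dbus", 1), ("Perl-DBus", 5)]

# Assumed boost, chosen to outrank any realistic hit count.
EXACT_BOOST = 100.0

def score(name, hits, query):
    """Hit count plus a fixed boost when the whole name equals the query."""
    boost = EXACT_BOOST if name.lower() == query.lower() else 0.0
    return hits + boost

def rank(query):
    """Return package names ordered by descending score."""
    ordered = sorted(packages, key=lambda p: score(p[0], p[1], query), reverse=True)
    return [name for name, _hits in ordered]

print(rank("dbus"))
```

Because the boost applies only to a whole-name match, dbus-python and
Perl-DBus fall back to their plain hit counts.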

I'm slowly&lt;/pre&gt;</description>
    <dc:creator>J5</dc:creator>
    <dc:date>2011-09-30T18:11:27</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xappy.general/102">
    <title>Re: is EXACT_MATCH working?</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xappy.general/102</link>
    <description>&lt;pre&gt;
Thanks for your reply.  I hope I didn't come off as offensive but as it
is the stable version of xappy doesn't quite have everything we need
so I am using the version in svn and even then our requirements are
somewhat different which is always the issue between high level
interfaces and low level capabilities.  Due to the sparse
documentation of xapian itself I do want to know the internals and how
it is doing its matching so that we can tweak it if need be.  Xappy
has given us a great jumping off point in that respect but for
instance I don't need the stored document to be pickled - json would
work better for us as we are simply storing strings, lists and hashes.
 This seems easy to switch out by not marking any fields as
STORE_CONTENT and setting the data in the xapian document before it is
saved to the db.



Ah, that makes sense.  I was looking at the debian xapian package
search and they did something like this.  It didn't dawn on me how
this worked but now I understand the matching a bit better.  Thank&lt;/pre&gt;</description>
    <dc:creator>john palmieri</dc:creator>
    <dc:date>2011-09-30T16:10:48</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xappy.general/101">
    <title>Re: is EXACT_MATCH working?</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xappy.general/101</link>
    <description>&lt;pre&gt;
I'm not sure what stability you mean, but ok; it's possible to do
this, but you'll need to understand a bit more about xapian internals.
 I think you'll end up replicating chunks of xappy, so I wouldn't take
this approach, personally.

I think the problem in this case is that the INDEX_EXACT action
doesn't store an unprefixed version of the term.  For an
INDEX_FREETEXT action, the text "dbus" will get indexed both as "dbus"
(for non field-specific searches) and also as something like "XAdbus"
for field specific searches.  (dbus may also be stemmed, depending on
settings).  For an index exact field, you'll just get something like
the "XAdbus" field.

To search this using pure xapian, you'll have to look up what the
prefix to insert is by reading and unpacking the metadata key stored
by xappy which holds this configuration, and give it to the query
parser by calling qp.add_boolean_prefix().
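
A toy model of the term generation described above may help; it is plain
Python, not xappy's real code. The "XA" prefix is only the illustrative
value used in this message, and the function names are hypothetical.

```python
# Illustrative prefix only; the real one is stored in xappy's metadata.
PREFIX = "XA"

def index_freetext(text):
    """Free-text fields yield both unprefixed and prefixed terms."""
    words = text.lower().split()
    return set(words) | set(PREFIX + w for w in words)

def index_exact(text):
    """Exact fields yield only the prefixed term."""
    return {PREFIX + text}

def matches(terms, query_term):
    """A query term matches only if it was indexed verbatim."""
    return query_term in terms

exact_terms = index_exact("dbus")
print(matches(exact_terms, "dbus"))           # unprefixed query term
print(matches(exact_terms, PREFIX + "dbus"))  # prefixed query term
```

In this model an unprefixed query finds nothing in an exact-only field,
which is why the query parser has to be told about the prefix.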

Really, I recommend you use xappy for searches too.

&lt;/pre&gt;</description>
    <dc:creator>Richard Boulton</dc:creator>
    <dc:date>2011-09-30T09:22:21</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xappy.general/100">
    <title>is EXACT_MATCH working?</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xappy.general/100</link>
    <description>&lt;pre&gt;Hello,

We are using Xappy to create indexes for package searching in Fedora.
Right now the results are a bit skewed due to freetext searches simply
matching the number of times a term shows up.  I want to fix this
using exact matching on the package name so that if an exact match is
found we return that as the top result.  This does not seem to work.
If I do this and remove all of the other matching fields we always get
an empty result

iconn.add_field_action('exact_name', xappy.FieldActions.INDEX_EXACT)
iconn.add_field_action('exact_name', xappy.FieldActions.STORE_CONTENT)
doc.fields.append(xappy.Field('exact_name', 'dbus',  weight=100.0))
.
.
.

then searching for 'dbus' using xapian should return that match but we
get an empty set:

query = qp.parse_query('dbus')
enquire.set_query(query)
matches = enquire.get_mset(0, 10)
count = matches.get_matches_estimated()
print count


How do we get count working?  BTW we are using xappy for indexing
because it presents a nice interface but xapian is simple enough o&lt;/pre&gt;</description>
    <dc:creator>J5</dc:creator>
    <dc:date>2011-09-29T22:44:10</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xappy.general/99">
    <title>Re: Fix for fieldmapping</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xappy.general/99</link>
    <description>&lt;pre&gt;
Thanks!

&lt;/pre&gt;</description>
    <dc:creator>Bruno Rezende</dc:creator>
    <dc:date>2011-05-06T19:33:01</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xappy.general/98">
    <title>Re: Fix for fieldmapping</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xappy.general/98</link>
    <description>&lt;pre&gt;I'll take a look at this when I'm next at my desk (Monday, probably).

On 6 May 2011 19:02, "Bruno Rezende" &amp;lt;brunovianarezende-Re5JQEeQqe8AvxtiuMwx3w&amp;lt; at &amp;gt;public.gmane.org&amp;gt; wrote:

Hi,

I've created a patch for a problem in fieldmapping:

http://code.google.com/p/xappy/issues/detail?id=37 (FieldMappings is
generating wrong prefixes )

can someone review it? what is the right procedure for asking for
patch reviews?

--
You received this message because you are subscribed to the Google Groups
"xappy-discuss" group.
To post to this group, send email to xappy-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0&amp;lt; at &amp;gt;public.gmane.org
To unsubscribe from this group, send email to
xappy-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0&amp;lt; at &amp;gt;public.gmane.org
For more options, visit this group at
http://groups.google.com/group/xappy-discuss?hl=en.

&lt;/pre&gt;</description>
    <dc:creator>Richard Boulton</dc:creator>
    <dc:date>2011-05-06T19:29:31</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xappy.general/97">
    <title>Fix for fieldmapping</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xappy.general/97</link>
    <description>&lt;pre&gt;Hi,

I've created a patch for a problem in fieldmapping:

http://code.google.com/p/xappy/issues/detail?id=37 (FieldMappings is
generating wrong prefixes )

can someone review it? what is the right procedure for asking for
patch reviews?

&lt;/pre&gt;</description>
    <dc:creator>Bruno Rezende</dc:creator>
    <dc:date>2011-05-06T18:02:17</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xappy.general/96">
    <title>Re: Applying Multiple Caches support</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xappy.general/96</link>
    <description>&lt;pre&gt;On Mon, Mar 14, 2011 at 10:27 AM, Bruno Rezende
&amp;lt;brunovianarezende-Re5JQEeQqe8AvxtiuMwx3w&amp;lt; at &amp;gt;public.gmane.org&amp;gt; wrote:

I changed the way we deal with deletions when we have multiple applied caches.

The new patch is here:
http://code.google.com/p/xappy/issues/attachmentText?id=36&amp;amp;aid=3390807064686651435&amp;amp;name=multicache.diff&amp;amp;token=bfccdbc7cb0e6ae2d4394e88519658e8

The deletion of documents when we use applied caches must be done by
the user. I know this is sub-optimal, but it is simple enough for now.

&lt;/pre&gt;</description>
    <dc:creator>Bruno Rezende</dc:creator>
    <dc:date>2011-03-16T12:52:44</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xappy.general/95">
    <title>Re: Applying Multiple Caches support</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xappy.general/95</link>
    <description>&lt;pre&gt;There is a new version of the patch that handles 'replace' and 'delete':

http://code.google.com/p/xappy/issues/attachmentText?id=36&amp;amp;aid=-3979246154754598562&amp;amp;name=multicache.diff&amp;amp;token=5f7a1f2276356b317e392e15ff5350d2

I'm not sure how we should handle deletions when an index has more
than one cache applied.

On Fri, Mar 11, 2011 at 12:06 PM, Richard Boulton &amp;lt;richard-n4XGBM20PuNg9hUCZPvPmw&amp;lt; at &amp;gt;public.gmane.org&amp;gt; wrote:



&lt;/pre&gt;</description>
    <dc:creator>Bruno Rezende</dc:creator>
    <dc:date>2011-03-14T13:27:01</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xappy.general/94">
    <title>Re: Applying Multiple Caches support</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xappy.general/94</link>
    <description>&lt;pre&gt;I'll take a look at this over the weekend.  Thanks for producing patches.

On 11 March 2011 15:05, Bruno Rezende &amp;lt;brunovianarezende-Re5JQEeQqe8AvxtiuMwx3w&amp;lt; at &amp;gt;public.gmane.org&amp;gt; wrote:



&lt;/pre&gt;</description>
    <dc:creator>Richard Boulton</dc:creator>
    <dc:date>2011-03-11T15:06:59</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xappy.general/93">
    <title>Applying Multiple Caches support</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xappy.general/93</link>
    <description>&lt;pre&gt;Hi,

I've added support for applying multiple caches to a xappy index here:

http://code.google.com/p/xappy/issues/detail?id=36

it still misses some tests related to document removal and update, but
I think it is in shape for a review. If someone can take a look, I'd
be grateful.

regards,
Bruno

&lt;/pre&gt;</description>
    <dc:creator>Bruno Rezende</dc:creator>
    <dc:date>2011-03-11T15:05:00</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xappy.general/92">
    <title>Re: Multiple Caches and replace document</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xappy.general/92</link>
    <description>&lt;pre&gt;
Yes, it should work.  If you don't apply the cache, it will be stored
in a separate index (so there are issues about keeping the cache in
sync with the main index when updating or replicating to worry about),
but the


If the cache isn't "applied", it won't be notified of document
removals, so you'll have to handle those yourself.  Other than that,
nothing special needs to be done.

Correct.


This is only needed when the cache has been applied.  When a cache is
applied to an index, all the documents which are mentioned in a cache
have values added to them.  Each cached query id corresponds to a
slot, and the value stored is the position at which that document
should be returned for the query.  These values are used to return the
appropriate cached documents in the result of a search.
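
As a rough sketch of that slot scheme, with plain dicts standing in for
xapian documents: SLOT_START is an assumed value (the real offset lives in
xappy), and both function names are hypothetical.

```python
# Assumed offset; stands in for xappy's first cache value slot.
SLOT_START = 10000

def apply_cache(docs, query_id, ranked_ids):
    """Store each document's position for query_id in that query's slot."""
    slot = SLOT_START + query_id
    for position, doc_id in enumerate(ranked_ids):
        docs[doc_id][slot] = position

def cached_results(docs, query_id):
    """Return the documents holding a value in the slot, in cached order."""
    slot = SLOT_START + query_id
    hits = [d for d in docs if slot in docs[d]]
    return sorted(hits, key=lambda d: docs[d][slot])

docs = {"a": {}, "b": {}, "c": {}}
apply_cache(docs, 3, ["b", "a"])
print(cached_results(docs, 3))
```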

If a document is reindexed, the incoming document won't have those
values stored in it, so the code you quote copies the old cache values
into it.  If it didn't do this, the document would no longer appear in
cached search re&lt;/pre&gt;</description>
    <dc:creator>Richard Boulton</dc:creator>
    <dc:date>2011-02-17T16:11:03</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xappy.general/91">
    <title>Re: Multiple Caches and replace document</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xappy.general/91</link>
    <description>&lt;pre&gt;Hi,

On Thu, Feb 17, 2011 at 11:35 AM, Richard Boulton &amp;lt;richard-n4XGBM20PuNg9hUCZPvPmw&amp;lt; at &amp;gt;public.gmane.org&amp;gt; wrote:

Hmm... I was thinking of doing something like:

sconn = xappy.SearchConnection(path)
cache_id = _get_cache_id(request)
cache_path = _get_cache_path(cache_id)
cachemanager = xappy.cachemanager.XapianCacheManager(cache_path)
sconn.set_cache_manager(cachemanager)
... do the search and get results from cache ...

I think this should work, right? If I have 10 different cache ids,
would it still work? Or would I need to apply the 10 caches to the index if
I want to use them for search?


the policy wouldn't be different. My question is if I need to do
anything special on each cache.


nope, probably I'm missing how cache works :-). I thought that
applying a cache to an index would mean that the cache would be copied
to the index and applying would be just a convenience, since caches
wouldn't be required to be applied to be used. But, then I read that
code in indexerconnection.replace and I don't know what it&lt;/pre&gt;</description>
    <dc:creator>Bruno Rezende</dc:creator>
    <dc:date>2011-02-17T15:24:21</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xappy.general/90">
    <title>Re: Multiple Caches and replace document</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xappy.general/90</link>
    <description>&lt;pre&gt;
Currently, you can't apply multiple caches to an index; so I assume
you're thinking of patching xappy to do this.  I think it should be
quite possible to do; you'll need to cause xappy to allocate a
separate set of value slots for each cache (currently, it's just
hardcoded to allocate slots for the cache based on the query id +
IndexerConnection._cache_manager_slot_start)

The current xappy policy is not to update caches when a document is
modified, since there's no way to know where the newly modified
document should be placed in the cached order.  Xappy only updates the
cache when a document is deleted (or changed to be marked as
store_only).  I don't see why this should be different with multiple
caches - you'll just need to remove the document from each cache.

Perhaps I'm missing what the question is here?

&lt;/pre&gt;</description>
    <dc:creator>Richard Boulton</dc:creator>
    <dc:date>2011-02-17T13:35:51</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xappy.general/89">
    <title>Multiple Caches and replace document</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xappy.general/89</link>
    <description>&lt;pre&gt;Hi,

Suppose I have an index and multiple caches that can be applied to
it. The cache that will be used will be chosen at search time. In this
scenario, how would incremental indexing be affected? I'm looking at the
IndexerConnection.replace method and have seen this:


       if self._index.get_metadata('_xappy_hascache'):
           if store_only:
               # Remove any cached items from the cache - the document is no
               # longer wanted in search results.
               self._remove_cached_items(id, xapid)
           else:
               # Copy any cached query items over to the new document.
               olddoc, olddocid = self._get_xapdoc(id, xapid)
               if olddoc is not None:
                   for value in olddoc.values():
                       if value.num &amp;lt; self._cache_manager_slot_start:
                           continue
                       xapdoc.add_value(value.num, value.value)
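
The copy step in the else branch can be modelled in plain Python as
follows. This is only a sketch: dicts stand in for xapian document values,
and SLOT_START is an assumed stand-in for self._cache_manager_slot_start.

```python
# Assumed offset; stands in for self._cache_manager_slot_start.
SLOT_START = 10000

def copy_cache_values(old_values, new_values):
    """Carry cache slots from the old document over to its replacement."""
    merged = dict(new_values)
    for slot, value in old_values.items():
        # Slots at or above the offset hold cache data; the rest do not.
        if slot >= SLOT_START:
            merged[slot] = value
    return merged

old = {0: "old title", 10003: "1"}
new = {0: "new title"}
print(copy_cache_values(old, new))
```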


we can remove documents from our multiple caches, but I don't know
what should I do wh&lt;/pre&gt;</description>
    <dc:creator>Bruno Rezende</dc:creator>
    <dc:date>2011-02-17T13:16:44</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xappy.general/88">
    <title>Re: replace document slow?</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xappy.general/88</link>
    <description>&lt;pre&gt;Oops, replace 'bulk_update' with 'incremental indexing'...

On Feb 17, 10:29 am, Bruno Rezende &amp;lt;brunovianareze...-Re5JQEeQqe8AvxtiuMwx3w&amp;lt; at &amp;gt;public.gmane.org&amp;gt;
wrote:

&lt;/pre&gt;</description>
    <dc:creator>Bruno Rezende</dc:creator>
    <dc:date>2011-02-17T13:13:38</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xappy.general/87">
    <title>Re: Re: replace document slow?</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xappy.general/87</link>
    <description>&lt;pre&gt;Just a follow up: bulk_update performance is ok, my memory problem was
caused by iterating a search result, updating the index and re-opening
the connection that generated the search result. I'm avoiding doing
this and the memory usage is ok. So, I will wait before doing any timing
measurements. Thanks for the help, Richard!

&lt;/pre&gt;</description>
    <dc:creator>Bruno Rezende</dc:creator>
    <dc:date>2011-02-17T12:29:52</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xappy.general/86">
    <title>Re: replace document slow?</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xappy.general/86</link>
    <description>&lt;pre&gt;Hi,

On Fri, Feb 4, 2011 at 8:13 AM, boulton.rj-gM/Ye1E23mwN+BqQ9rBEUg&amp;lt; at &amp;gt;public.gmane.org
&amp;lt;boulton.rj-Re5JQEeQqe8AvxtiuMwx3w&amp;lt; at &amp;gt;public.gmane.org&amp;gt; wrote:
 ...
http://code.google.com/p/xappy/source/browse/trunk/libs/get_xapian.py,

 Yes, I'm using chert.


 I don't have this info now. I'll do a test and report back.


 I flush every 10K items. I'm using this value to try to keep memory
usage low, we had a case where the memory usage went up to 15GB. But,
I think it didn't work very well, a few days ago we had a 4GB memory
usage case.


 By initial update, do you mean when I add it for the first time? I'll need
to check on this machine.


 yes, I have. I'll try to disable the cache and test this too.


 ok. I'll do some more testing and see if I can get some profiling info.

--
 Bruno

&lt;/pre&gt;</description>
    <dc:creator>Bruno Rezende</dc:creator>
    <dc:date>2011-02-04T11:03:46</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xappy.general/85">
    <title>Re: Re: replace document slow?</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xappy.general/85</link>
    <description>&lt;pre&gt;Hi,

On Fri, Feb 4, 2011 at 8:13 AM, boulton.rj-gM/Ye1E23mwN+BqQ9rBEUg&amp;lt; at &amp;gt;public.gmane.org
&amp;lt;boulton.rj-Re5JQEeQqe8AvxtiuMwx3w&amp;lt; at &amp;gt;public.gmane.org&amp;gt; wrote:
...

Yes, I'm using chert.


I don't have this info now. I'll do a test and report back.


I flush every 10K items. I'm using this value to try to keep memory
usage low, we had a case where the memory usage went up to 15GB. But,
I think it didn't work very well, a few days ago we had a 4GB memory
usage case.


By initial update, do you mean when I add it for the first time? I'll need
to check on this machine.


yes, I have. I'll try to disable the cache and test this too.


ok. I'll do some more testing and see if I can get some profiling info.




&lt;/pre&gt;</description>
    <dc:creator>Bruno Rezende</dc:creator>
    <dc:date>2011-02-04T10:46:38</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xappy.general/84">
    <title>Re: replace document slow?</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xappy.general/84</link>
    <description>&lt;pre&gt;
Indeed, you need to call replace() to put the changes back into the
database.


Yes, they're definitely included in that version.  I assume you're
using chert databases, too (the improvements didn't work so well with
flint, due to the way document lengths were stored).

What sort of speed do you get if you change your code to delete the
old document and then add it back, rather than replacing it?  I'd
expect that to be much slower, since that's what the old code path did
(ie, before xapian ticket 250 was fixed).

Are you flushing frequently when doing this update, or not at all
during the update?

What sort of speed do you get when doing the initial update?

One thought occurs; do you have a query cache enabled on this index?
I think that may be being updated when you call replace(), and could
account for some of the time.

It's possible that there's a lot of unnecessary parsing going on in
python here; I think some profiling output will be needed to dig into
this (at the least, finding out whether the time&lt;/pre&gt;</description>
    <dc:creator>boulton.rj-gM/Ye1E23mwN+BqQ9rBEUg&lt; at &gt;public.gmane.org</dc:creator>
    <dc:date>2011-02-04T10:13:10</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.search.xappy.general/83">
    <title>replace document slow?</title>
    <link>http://permalink.gmane.org/gmane.comp.search.xappy.general/83</link>
    <description>&lt;pre&gt;(it seems I'm having problems emailing xappy-discuss, so sorry if this
message is sent twice)

Hi,

I'm doing some incremental updates in a xapian database using xappy
api. The changes to the documents are minimal, just adding/removing
some terms. The way I'm doing it is something like:

1. get the documents from a search connection
2. change the terms with ProcessedDocument.add_term /
ProcessedDocument.remove_term
3. call IndexerConnection.replace(changed_doc)

I'm getting an average of ~200 items/sec. If instead of using the
document returned by search connection I get the document from the
indexer connection and continue using replace(doc), I see no real
gain.

I tried this too:

1. get the documents from a search connection
2. get each document from indexer connection
3. change the terms with ProcessedDocument.add_term /
ProcessedDocument.remove_term
4. see if the changes would be applied to the index, without calling
IndexerConnection.replace(changed_doc)

with this approach the number of items per second &lt;/pre&gt;</description>
    <dc:creator>Bruno Rezende</dc:creator>
    <dc:date>2011-02-03T12:40:07</dc:date>
  </item>
  <textinput rdf:about="http://search.gmane.org/?group=$group=gmane.comp.search.xappy.general">
    <title>Search Engine</title>
    <description>Search the mailing list at Gmane</description>
    <name>query</name>
    <link>http://search.gmane.org/?group=$group=gmane.comp.search.xappy.general</link>
  </textinput>
</rdf:RDF>

