<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/">
  <channel rdf:about="http://blog.gmane.org/gmane.comp.search.xapian.devel">
    <title>gmane.comp.search.xapian.devel</title>
    <link>http://blog.gmane.org/gmane.comp.search.xapian.devel</link>
    <description/>
    <syn:updatePeriod>hourly</syn:updatePeriod>
    <syn:updateFrequency>1</syn:updateFrequency>
    <syn:updateBase>1901-01-01T00:00+00:00</syn:updateBase>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/2079"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/2071"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/2055"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/2051"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/2047"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/2038"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/2035"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/2034"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/2022"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/2012"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/2009"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/2007"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/1999"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/1996"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/1995"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/1992"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/1991"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/1989"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/1987"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/1986"/>
      </rdf:Seq>
    </items>
    <image rdf:resource="http://gmane.org/img/gmane-25t.png"/>
    <textinput rdf:resource="http://search.gmane.org/?group=$group=gmane.comp.search.xapian.devel"/>
  </channel>
  <image rdf:about="http://gmane.org/img/gmane-25t.png">
    <title>Gmane</title>
    <url>http://gmane.org/img/gmane-25t.png</url>
    <link>http://gmane.org</link>
  </image>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/2079">
    <title>GSoC - Follow up Question about doc</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/2079</link>
    <description>&lt;pre&gt;Dan Colish,

I am facing major issues with the documentation. I have invested quite some
time but am not able to figure out what really needs to be done in it.

Can you please once again clearly state the objective of a "user
documentation"?
Also, can you please guide me on how to give an overall view? I am not able
to figure out what really needs to be added.

Also, it seems that I have wasted much of my time and energy explaining
things which were not meant to be explained. I have tried to remove them
from the doc.
This could have been avoided if I had come up with smaller chunks rather
than a single large chunk.

Have also added some replies regarding the comments you made on my github
commit. Reply to them when you can.
_______________________________________________
Xapian-devel mailing list
Xapian-devel&amp;lt; at &amp;gt;lists.xapian.org
http://lists.xapian.org/mailman/listinfo/xapian-devel
&lt;/pre&gt;</description>
    <dc:creator>Sehaj Singh Kalra</dc:creator>
    <dc:date>2012-05-22T20:36:20</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/2071">
    <title>GSoC - QueryParser Information</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/2071</link>
    <description>&lt;pre&gt;Have added information regarding queryparser at
http://trac.xapian.org/wiki/GSoC2012/QueryParser/Notes.
It is a bit informal at present but have a look, and let me know your
views. Will be improving it and adding some more things.
The aim, as I discussed with dcolish, is to explain the working of
QueryParser along with appropriate examples.
&lt;/pre&gt;</description>
    <dc:creator>Sehaj Singh Kalra</dc:creator>
    <dc:date>2012-05-19T04:30:53</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/2055">
    <title>Xapian license change</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/2055</link>
    <description>&lt;pre&gt;Like Amish Shah, who posted to this list earlier this month, I find myself
interested in Xapian but currently unable to use it due to its
GPL license. I am aware of the CommercialLicence wiki page, but have not
been able to find any current status or anyone willing to work on it,
so I am wondering if perhaps I could help. As far as I can see, several
things would need to happen:

  * a new license (or set of licenses) will need to be chosen
  * current copyright holders will need to sign off on the relicensing
  * dependencies on GPLed libraries (think getopt) will need to be replaced


Looking at older posts and the xapian website it seems that LGPL has 
been suggested as an alternative license, but I can not find a clear 
decision.

Judging by the copyright statements in the sources (for xapian-core) 
there are 18 copyright holders, which is a manageable list:

  * Action Without Borders
  * Adam Sjøgren
  * Ananova Ltd
  * Brandon Schaefer
  * BrightStation PLC
  * Dan Colish
  * Dr Martin Port&lt;/pre&gt;</description>
    <dc:creator>Wichert Akkerman</dc:creator>
    <dc:date>2012-05-16T15:37:52</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/2051">
    <title>Iterators</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/2051</link>
    <description>&lt;pre&gt;Hello,

Can we use an iterator after deletion of a parent object?

For example, like this:


&lt;/pre&gt;</description>
    <dc:creator>Michael Uvarov</dc:creator>
    <dc:date>2012-05-16T10:44:18</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/2047">
    <title>xapian gpl replacement</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/2047</link>
    <description>&lt;pre&gt;Hi,
Xapian seems to work for us, but the GPL license is going to prevent us
from using it. Is there an update on the efforts to remove the GPL code
from xapian? Also, my company might be willing to fund such efforts if the
scope of the work is known.

thanks
Amish
&lt;/pre&gt;</description>
    <dc:creator>Amish Shah</dc:creator>
    <dc:date>2012-05-15T17:12:07</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/2038">
    <title>Patch for compiling xapian with gcc4.7</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/2038</link>
    <description>&lt;pre&gt;Hi,

I updated my system and moved from gcc4.6 to gcc4.7. Building the xapian
code with gcc4.7 fails with an error caused by some changes in GCC. A
pastebin of the error is at http://pastebin.com/7ee5WGiu . I searched and
found that some other packages seem to face the same issue, for example
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=667403 .

The patch is attached.

&lt;/pre&gt;</description>
    <dc:creator>Gaurav Arora</dc:creator>
    <dc:date>2012-05-08T10:49:26</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/2035">
    <title>GSoC 2012 admin</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/2035</link>
    <description>&lt;pre&gt;If you're one of our six selected GSoC students, please read through this
email and attend to anything which needs doing.  If anything is unclear,
please ask.

I've created a page on the wiki for each of the selected projects, which
are all linked from this overview page:

http://trac.xapian.org/wiki/GSoC2012

There's a table of useful information on each summary page.  Please check
that this is correct for you, and fill in the missing information.  For
most of you this is:

 * "work hours" - we realise you probably won't be working exactly the
   same hours every day, but it's useful for us to have some idea when we
   can expect you to be around.  We also encourage you to be on IRC in
   #xapian at least while you are working, as that will make interaction
   easier.

 * repository location - I've filled this in where I already know it, but
   for the rest of you please clone/fork our repository, push it to
   a public location, and put a link to that on your project's wiki page.
   There are already xapia&lt;/pre&gt;</description>
    <dc:creator>Olly Betts</dc:creator>
    <dc:date>2012-05-08T02:43:22</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/2034">
    <title>Erlang examples</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/2034</link>
    <description>&lt;pre&gt;Hello,

Here are a few examples of Erlang binding usage:

https://github.com/freeakk/xapian-examples

&lt;/pre&gt;</description>
    <dc:creator>Michael Uvarov</dc:creator>
    <dc:date>2012-05-07T17:19:46</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/2022">
    <title>Index Size comparison</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/2022</link>
    <description>&lt;pre&gt;Hi,
I did a comparison based on similar steps as in the blog
(zooie.wordpress.com/2009/07/06/a-comparison-of-open-source-search-engines-and-indexing-twitter),
against lucene-3.4 and xapian-1.3.0. The overall index sizes are:
lucene 89M, xapian 189M (chert backend and compacted).
Since I'm more interested in index size, I dug a little further and dumped
the full term list. There are about 360000 terms in the lucene index, and
about 285000 terms in the xapian index. But surprisingly, the termlist.DB
of the xapian index alone is already 122M.
Is there any idea/plan for reducing the index size? I'd be glad to help
if I can.

Thanks!
Jaguar
&lt;/pre&gt;</description>
    <dc:creator>Jaguar Xiong</dc:creator>
    <dc:date>2012-04-23T14:16:51</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/2012">
    <title>Handling Negative value due to logarithm of probabilities.</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/2012</link>
    <description>&lt;pre&gt;Hi,

Continuing the discussion from the Melange comments about the negative
values returned in the matcher due to the logarithm of probabilities.

*If we make K suitably large, we could clamp each log(K.Pi) to be &amp;gt;= 0,
and this change will only affect really low probability terms (those with
Pi &amp;lt; 1/K, so you can adjust K to suit):*

*W' = sum(i=1,...,n, max(log(K.Pi), 0))*
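
A minimal Python sketch of the clamped sum W' quoted above, for
illustration only. The function name and the sample probabilities are
hypothetical; this is not Xapian's matcher code, and it assumes every
probability is strictly positive.

```python
import math


def clamped_log_weight(probs, k):
    """Sum of max(log(k * p), 0) over the per-term probabilities.

    Assumes every p is strictly positive (log(0) is undefined).
    """
    total = 0.0
    for p in probs:
        # log(k * p) is negative exactly when p is below 1/k; clamp it to 0.
        total += max(math.log(k * p), 0.0)
    return total


# Hypothetical values: with k = 1000, the 0.0001 term is clamped to 0.
weight = clamped_log_weight([0.02, 0.0001, 0.5], 1000.0)
```

Raising k lowers the 1/k threshold, so fewer terms are clamped.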

Did you mean that for low probabilities the value returned by log(K.Pi)
would be negative, so we replace those values, which are still negative,
by 0?

Assigning 0 is equivalent to rejecting the term from the query completely,
which hurts retrieval performance in the Language Model, as terms missing
from the document are smoothed with the collection frequency.

I think we must try smoothing from collection statistics if the document
term probability doesn't work (i.e. generates a negative value).

*sum(i=1,...,n,
    if (max(log(K.Pi), 0) == 0)
        max(log(K.Pcollec.i), 0)
    else
        log(K.Pi)
)*

In case both doesnt work return 0 wou&lt;/pre&gt;</description>
    <dc:creator>Gaurav Arora</dc:creator>
    <dc:date>2012-04-27T04:58:30</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/2009">
    <title>Letor re-factored code</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/2009</link>
    <description>&lt;pre&gt;Hello Rishabh,

The attached diff file should help you to refactor the existing letor code
and plug in your code easily. I have defined the header files with the
necessary flow. I have also specified the ranker class, which should be
implemented by each new LTR model, ListNet and ListMLE in your case. The
evaluation file should be used for validating the performance of the
algorithm during the training which you plan to implement. You should take
the corresponding definitions from letor_internal.cc and create the
corresponding .cc files. Feel free to make any necessary changes, and in
case of any doubt, feel free to ask.

Regards,
Parth.
&lt;/pre&gt;</description>
    <dc:creator>Parth Gupta</dc:creator>
    <dc:date>2012-04-24T09:11:34</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/2007">
    <title>filter by facets and objective c version</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/2007</link>
    <description>&lt;pre&gt;1. How do I filter by facets using the C++ API?
2. Has anyone ported Xapian to Objective-C?


thanks
Amish
&lt;/pre&gt;</description>
    <dc:creator>Amish Shah</dc:creator>
    <dc:date>2012-04-23T19:21:44</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/1999">
    <title>MeCab as a Japanese tokenizer</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/1999</link>
    <description>&lt;pre&gt;The first version of a Japanese Tokenizer implementation using MeCab.
It hasn't been tested yet.

In order to compile properly, both MeCab and libiconv should be installed.

On Ubuntu the following packages are needed: libmecab-dev mecab-ipadict-utf8
It's also recommended to install libiconv from source.
&lt;/pre&gt;</description>
    <dc:creator>Harrynson Hidalgo</dc:creator>
    <dc:date>2012-04-20T07:56:51</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/1996">
    <title>Japanese tokenization</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/1996</link>
    <description>&lt;pre&gt;This is the first version I had, and it's full of bugs... I had already
debugged it but forgot to run git add... before doing git commit.
I'm re-debugging it right now.
&lt;/pre&gt;</description>
    <dc:creator>Harrynson Hidalgo</dc:creator>
    <dc:date>2012-04-20T05:04:27</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/1995">
    <title>Japanese tokenization</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/1995</link>
    <description>&lt;pre&gt;I am sending the first prototype of the patch.
&lt;/pre&gt;</description>
    <dc:creator>Harrynson Hidalgo</dc:creator>
    <dc:date>2012-04-20T04:46:13</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/1992">
    <title>Implementing the tf-idf weighting scheme</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/1992</link>
    <description>&lt;pre&gt;And the patch file is in the attachment.
&lt;/pre&gt;</description>
    <dc:creator>Jiuding Duan</dc:creator>
    <dc:date>2012-04-20T03:47:22</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/1991">
    <title>Implementing the tf-idf weighting scheme</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/1991</link>
    <description>&lt;pre&gt;Hi, all:

This is a basic implementation of the tf-idf scheme (the basic scheme used
in SMART) that can be used in Xapian. It might still need some further
revision, but I believe it works anyway. :)

I modified weight.h to define a subclass Tf_idfWeight and added a new
file tf_idf.cc in ../weight in the repo to implement Tf_idfWeight.
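
For readers unfamiliar with the scheme, here is a minimal Python sketch of
one common tf-idf variant (raw term frequency times log(N / df)). It is
not the code from the patch; the exact variant used there is not stated
in this message, and all names below are hypothetical.

```python
import math


def tf_idf(term_freq, doc_freq, num_docs):
    """Raw term frequency times the log inverse document frequency."""
    if term_freq == 0 or doc_freq == 0:
        return 0.0
    return term_freq * math.log(num_docs / doc_freq)


def score(query_terms, doc_term_freqs, doc_freqs, num_docs):
    """Sum the tf-idf contributions of the query terms for one document."""
    return sum(
        tf_idf(doc_term_freqs.get(t, 0), doc_freqs.get(t, 0), num_docs)
        for t in query_terms
    )
```

A term that occurs in every document gets log(1) = 0 and so contributes
nothing to the score under this variant.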

Here is the git diff patch:

https://gist.github.com/2422049

I think the next thing to do is to register this scheme with Xapian and
write some tests to see whether or not it works.

I grepped for the current BM25Weight, TradWeight and BoolWeight, and found
clues about Enquire::set_weighting_scheme(). But something more needs to be
done to understand it.


Best,

Jiuding
&lt;/pre&gt;</description>
    <dc:creator>Jiuding Duan</dc:creator>
    <dc:date>2012-04-20T03:33:36</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/1989">
    <title>Posting list encoding improvements - pfd encoding &amp; var len encoding comparison program</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/1989</link>
    <description>&lt;pre&gt;Hi all,
I wrote a program that implements variable length encoding and fixed
length encoding, and compares their index size and the speed of searching
for a doc length.
You can see the comparison results in the attached snapshot.
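
To make the size comparison concrete, here is a minimal Python sketch of
one common variable length scheme: 7 data bits per byte, with the high
bit set on every byte except the last. That this is the scheme used in
the program is an assumption, and the gap values are hypothetical.

```python
def encode_varint(value):
    """Encode a non-negative int, 7 bits per byte, low-order group first."""
    out = bytearray()
    while value >= 128:
        out.append(value % 128 + 128)  # high bit set: more bytes follow
        value //= 128
    out.append(value)  # high bit clear: this is the last byte
    return bytes(out)


def decode_varint(data, pos=0):
    """Decode one integer starting at pos; return (value, next_pos)."""
    value = 0
    scale = 1
    while True:
        byte = data[pos]
        pos += 1
        value += (byte % 128) * scale
        if 128 > byte:
            return value, pos
        scale *= 128


gaps = [1, 5, 300, 70000]  # hypothetical doc id gaps
varint_size = sum(len(encode_varint(g)) for g in gaps)
fixed_size = 4 * len(gaps)  # 4 bytes per value
```

Small gaps take a single byte here, which is where the size advantage
over a fixed 4 byte layout comes from.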

1. The posting list is held entirely in memory.
2. The search strategy for fixed length encoding is skipping with an
exponential step (1, 2, 4, 8, ...). Once it exceeds the desired doc id, it
goes back to the previous step and skips with step 1.
3. The implemented fixed length encoding uses 4 bytes as the fixed length.
This is not optimal and can be further optimized in PFD.
4. The program generates uniformly random doc id gaps and doc lengths to
build the posting list.
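
The search strategy in point 2 can be sketched in Python roughly as
follows. This is an illustration over a plain sorted list of doc ids, not
the code from the repository; the function name is hypothetical.

```python
def skip_to(doc_ids, start, target):
    """Return the first index at or after start whose doc id is >= target.

    Returns len(doc_ids) if there is no such entry.
    """
    n = len(doc_ids)
    pos = start
    step = 1
    # Gallop: jump 1, 2, 4, 8, ... while the landing entry is still too small.
    while n > pos + step and target > doc_ids[pos + step]:
        pos += step
        step *= 2
    # The next jump would overshoot (or run off the end), so stay at the
    # previous position and walk forward one entry at a time.
    while n > pos and target > doc_ids[pos]:
        pos += 1
    return pos
```

This relies on fixed width entries so that any position can be read
directly, which is why it is paired with the fixed length encoding.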

*You can access the code via my github:
https://github.com/zwxxx/pfd_simple_test*
&lt;/pre&gt;</description>
    <dc:creator>Weixian Zhou</dc:creator>
    <dc:date>2012-04-20T02:51:45</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/1987">
    <title>Xapian::Database-&gt;close() for perl missing</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/1987</link>
    <description>&lt;pre&gt;I have a xapian daemon which can be queried via HTTP. Every hour a
background process generates a new index, then removes the old symlink and
creates a new one pointing to the current database.

/path/to/index/20120419010000
/path/to/index/20120419020000
/path/to/index/20120419030000
/path/to/index/default =&amp;gt; /path/to/index/20120419030000

So the daemon only checks the mtime of /path/to/index/default/iamchert
before every request and, if it has changed, closes and reopens the database.
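
The check-and-reopen pattern described above can be sketched in Python as
follows. This only illustrates the mtime check; the class name and the
open_fn / close_fn callbacks are hypothetical and stand in for whatever
the binding provides for opening and closing a database.

```python
import os


class Reopener:
    """Reopen a resource whenever a marker file's mtime changes."""

    def __init__(self, marker_path, open_fn, close_fn):
        self.marker_path = marker_path
        self.open_fn = open_fn    # hypothetical: returns a new handle
        self.close_fn = close_fn  # hypothetical: releases an old handle
        self.mtime = None
        self.handle = None

    def get(self):
        # os.stat follows symlinks, so this sees the current target.
        mtime = os.stat(self.marker_path).st_mtime_ns
        if self.handle is None or mtime != self.mtime:
            if self.handle is not None:
                self.close_fn(self.handle)  # release the old handle first
            self.handle = self.open_fn()
            self.mtime = mtime
        return self.handle
```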

The problem is: there is no -&amp;gt;close in Perl for a database! So currently I
overwrite the object. After some days, the process has many open
file pointers to already removed databases.

I also tried -&amp;gt;reopen, but xapian doesn't reopen every file, so after
reopen I have some files open from the old database and the record.DB from
the new one. Also, the file pointer to the old record.DB still exists (I
tested with lsof -p $PID).

So what is the right way to cleanly shut down an open xapian
database?

I tested with xapian-core 1.2.9 / Search::Xa&lt;/pre&gt;</description>
    <dc:creator>Websuche :: Felix Ostmann</dc:creator>
    <dc:date>2012-04-19T13:45:36</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/1986">
    <title>Patch: New features for Learning to Rank</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/1986</link>
    <description>&lt;pre&gt;Hello,
Please find attached the patch containing new features for the LETOR
framework, which increases the total number of features from 19 to 37. In
my proposal I mentioned a final count of 44 IR-specific features, apart
from those learnt by the unsupervised feature learning part via Deep
Learning. So I am left with 7 out of those 44, of which I will implement 6
within the next 2 days (before the 22nd).
This patch also contains a 1-line change to
xapian-core/examples/questletor.cc correcting the usage of the questletor
file. I will send another patch as soon as I am done with those 6
features.

Best regards,
Rishabh.
&lt;/pre&gt;</description>
    <dc:creator>Rishabh Mehrotra</dc:creator>
    <dc:date>2012-04-19T03:18:41</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/1982">
    <title>Patch for Initial Prototype implementation of Unigram Language Modelling in xapian-core.</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/1982</link>
    <description>&lt;pre&gt;Hi,

I have implemented an initial prototype of the Xapian::Weight subclass for
Unigram Language Modelling to support UnigramLM weighting in Xapian. Other
changes include adding collection_frequency to the TermFreqs struct to store
the collection frequency of terms, some changes to support it in the Xapian
framework, and changing simplesearch.cc to search using the UnigramLMWeight
class.

The following issues have not been addressed in this patch (I am working on
them):

1. The log trick for handling multiplication in the LM needs to be made more
robust than just adding some arbitrary number to avoid rejecting a document
due to the negative value returned by log.

     Each term's contribution is a probability (between 0 and 1), so
taking its log gives a negative value and eventually leads to rejection of
the document. Hence a random linear weight has been added. This needs to be
addressed by using different log bases and some other techniques.

The discussion about the log trick to be used is here for reference:
http://comments.gmane.org/gmane.comp.search.&lt;/pre&gt;</description>
    <dc:creator>Gaurav Arora</dc:creator>
    <dc:date>2012-04-15T01:09:33</dc:date>
  </item>
  <textinput rdf:about="http://search.gmane.org/?group=$group=gmane.comp.search.xapian.devel">
    <title>Search Engine</title>
    <description>Search the mailing list at Gmane</description>
    <name>query</name>
    <link>http://search.gmane.org/?group=$group=gmane.comp.search.xapian.devel</link>
  </textinput>
</rdf:RDF>

