<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/">
  <channel rdf:about="http://blog.gmane.org/gmane.comp.search.xapian.devel">
    <title>gmane.comp.search.xapian.devel</title>
    <link>http://blog.gmane.org/gmane.comp.search.xapian.devel</link>
    <description/>
    <syn:updatePeriod>hourly</syn:updatePeriod>
    <syn:updateFrequency>1</syn:updateFrequency>
    <syn:updateBase>1901-01-01T00:00+00:00</syn:updateBase>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/2079"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/2071"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/2055"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/2051"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/2047"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/2038"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/2035"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/2034"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/2022"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/2012"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/2009"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/2007"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/1999"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/1996"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/1995"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/1992"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/1991"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/1989"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/1987"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.search.xapian.devel/1986"/>
      </rdf:Seq>
    </items>
    <image rdf:resource="http://gmane.org/img/gmane-25t.png"/>
    <textinput rdf:resource=""/>
  </channel>
  <image rdf:about="http://gmane.org/img/gmane-25t.png">
    <title>Gmane</title>
    <url>http://gmane.org/img/gmane-25t.png</url>
    <link>http://gmane.org</link>
  </image>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/2079">
    <title>GSoC - Follow up Question about doc</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/2079</link>
    <description>&lt;pre&gt;Dan Colish,

I am facing major issues with the documentation. I have invested quite some
time but have not been able to figure out what really needs to be done in
the documentation.

Can you please once again clearly state the objective of "user
documentation"?
Also, can you please guide me on how to give an overall view? I am not able
to figure out what really needs to be added.

Also, it seems that I have wasted much of my time and energy explaining
things which were not meant to be explained. I have tried to remove them
from the doc.
This could have been avoided if I had come up with smaller chunks rather
than a single large chunk.

I have also added some replies to the comments you made on my github
commit. Please reply to them when you can.
_______________________________________________
Xapian-devel mailing list
Xapian-devel&amp;lt; at &amp;gt;lists.xapian.org
http://lists.xapian.org/mailman/listinfo/xapian-devel
&lt;/pre&gt;</description>
    <dc:creator>Sehaj Singh Kalra</dc:creator>
    <dc:date>2012-05-22T20:36:20</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/2071">
    <title>GSoC - QueryParser Information</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/2071</link>
    <description>&lt;pre&gt;Have added information regarding queryparser at
http://trac.xapian.org/wiki/GSoC2012/QueryParser/Notes.
It is a bit informal at present but have a look, and let me know your
views. Will be improving it and adding some more things.
The aim, as I discussed with dcolish, is to explain the working of QueryParser
along with appropriate examples.
&lt;/pre&gt;</description>
    <dc:creator>Sehaj Singh Kalra</dc:creator>
    <dc:date>2012-05-19T04:30:53</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/2055">
    <title>Xapian license change</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/2055</link>
    <description>&lt;pre&gt;Like Amish Shah who posted earlier this month to this list I find myself 
interested in Xapian, but currently unable to use it due to its
GPL license. I am aware of the CommercialLicence wiki page, but have not
been able to find any current status or anyone willing to work on it,
so I am wondering if perhaps I could help. As far as I can see several
things would need to happen:

  * a new license (or set of licenses) will need to be chosen
  * current copyright holders will need to sign off on the relicensing
  * dependencies on GPLed libraries (think getopt) will need to be replaced


Looking at older posts and the xapian website it seems that LGPL has 
been suggested as an alternative license, but I can not find a clear 
decision.

Judging by the copyright statements in the sources (for xapian-core) 
there are 18 copyright holders, which is a manageable list:

  * Action Without Borders
  * Adam Sjøgren
  * Ananova Ltd
  * Brandon Schaefer
  * BrightStation PLC
  * Dan Colish
  * Dr Martin Porter
  * Evgeny Sizikov
  * Hein Ragas
  * Kan-Ru Chen
  * Kevlin Henney
  * Lemur Consulting Ltd
  * Olly Betts
  * Orange PCS Ltd
  * Richard Boulton
  * Sam Liddicott
  * Scriptics Corporation.
  * Yung-chung Lin


Does anyone have any idea how willing these copyright holders might be 
to approve relicensing?

Regards,
Wichert.

&lt;/pre&gt;</description>
    <dc:creator>Wichert Akkerman</dc:creator>
    <dc:date>2012-05-16T15:37:52</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/2051">
    <title>Iterators</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/2051</link>
    <description>&lt;pre&gt;Hello,

Can we use an iterator after deletion of a parent object?

For example, like this;


&lt;/pre&gt;</description>
    <dc:creator>Michael Uvarov</dc:creator>
    <dc:date>2012-05-16T10:44:18</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/2047">
    <title>xapian gpl replacement</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/2047</link>
    <description>&lt;pre&gt;Hi,
Xapian seems to work for us, but the GPL license is going to prevent us
from using it. Is there an update on the efforts to remove the GPL code
from xapian? Also, my company might be willing to fund such efforts if the
scope of the work is known.

thanks
Amish
&lt;/pre&gt;</description>
    <dc:creator>Amish Shah</dc:creator>
    <dc:date>2012-05-15T17:12:07</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/2038">
    <title>Patch for compiling xapian with gcc4.7</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/2038</link>
    <description>&lt;pre&gt;Hi,

I updated my system and moved from gcc4.6 to gcc4.7. While building the
Xapian code with gcc4.7, an error is returned due to some changes in
GCC. A pastebin of the error is at http://pastebin.com/7ee5WGiu . I searched and found
that some other packages also seem to face this issue, for example
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=667403 .

Attached herewith is the patch.

&lt;/pre&gt;</description>
    <dc:creator>Gaurav Arora</dc:creator>
    <dc:date>2012-05-08T10:49:26</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/2035">
    <title>GSoC 2012 admin</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/2035</link>
    <description>&lt;pre&gt;If you're one of our six selected GSoC students, please read through this
email and attend to anything which needs doing.  If anything is unclear,
please ask.

I've created a page on the wiki for each of the selected projects, which
are all linked from this overview page:

http://trac.xapian.org/wiki/GSoC2012

There's a table of useful information on each summary page.  Please check
that this is correct for you, and fill in the missing information.  For
most of you this is:

 * "work hours" - we realise you probably won't be working exactly the
   same hours every day, but it's useful for us to have some idea when we
   can expect you to be around.  We also encourage you to be on IRC in
   #xapian at least while you are working, as that will make interaction
   easier.

 * repository location - I've filled this in where I already know it, but
   for the rest of you please clone/fork our repository, push it to
   a public location, and put a link to that on your project's wiki page.
   There are already xapian git mirrors on gitorious and github, but any
   public hosting site is OK.  If you need help getting to grips with git,
   just ask - that's what we're here for.  Don't forget to actually push
   changes regularly (at least once a day is good) so there's a remote
   backup of your work, and so we can easily review your work.

 * Journal - this is for you to document your progress, which will help
   the mentors (and other interested people) to follow how you are doing,
   and is also likely to help you.  If you already have a blog, or would
   like to start one, then you can just post to that with a tag, or in a
   category, so that it's possible to just see the posts about the project.
   Then link to that from the wiki page.  The trac wiki format for external
   links is:

   [http://blog.example.org/ Journal]

   If you'd prefer to just maintain the journal on the wiki, let me know
   and I'll create an outline for you, like the one we used last year.

   During coding, you should aim to make at least a couple of entries each
   week.  It doesn't need to be anything formal - just what you're working
   on, what you have recently completed, what obstacles you're frustrated
   by, etc.  Communication is an important part of your project.

 * Project plan - this is for the project plan part of your application
   - i.e. the bit which talks about the project and the timeline.  We want
   a publicly visible copy of this so that anyone in the wider Xapian
   community who is interested in the project can see it.  You can either:

    + Copy the proposal to the wiki.  We did this last year, and the
      main annoyance is that you need to reformat it to use trac's
      wiki syntax.

    + Mark your proposal as "public" in melange and link to it there.
      Note that this means that personal information in the proposal (like
      phone numbers, etc) will be public.  If you'd like to remove such
      things, ask Dan or me and we can record them elsewhere and allow you
      to edit the proposal.

    + Attach the proposal to the wiki in some other format.

 * Notes - this is intended as a place to note down any information which
   is useful to you and/or other people interested in the project, such
   as links to background information and other useful resources, TODO
   items, other notes, etc.  You may prefer to keep such information in
   a README in your code repository, in which case you can just link to
   that.

Feel free to create more sub-pages or links if you'd find them useful.

Each project has a "primary mentor".  One reason is because someone needs
to be in Google's system as your mentor, but they are also responsible for
making sure you do actually get a response to questions, and things like
that.  But we're intending to use a similar "group mentoring" approach as
we did last year, so you'll be expected to discuss issues and ask questions
in public via the #xapian IRC channel or on the xapian-devel mailing list.
This is better for you as it means you don't have to wait for one person to
be available to respond, and better for us as it makes it easier to keep
track of how everyone's doing.  Public communication is also the way open
source projects usually work, and GSoC is meant to give you experience of
working in an open source project.

If there's an issue you'd rather not disclose in public, then you can
either talk to your primary mentor, or to one of the admins (that's Dan
Colish and me).

"To do" summary:

* Set up your git repo (if you haven't already).

* Fill in missing info (and correct any errors) in your page under:
  http://trac.xapian.org/wiki/GSoC2012

Good luck with your projects, and we look forward to getting to know you
better over the course of the next few months.

Cheers,
    Olly (on behalf of the Xapian GSoC mentors)
&lt;/pre&gt;</description>
    <dc:creator>Olly Betts</dc:creator>
    <dc:date>2012-05-08T02:43:22</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/2034">
    <title>Erlang examples</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/2034</link>
    <description>&lt;pre&gt;Hello,

Here are a few examples of Erlang binding usage:

https://github.com/freeakk/xapian-examples

&lt;/pre&gt;</description>
    <dc:creator>Michael Uvarov</dc:creator>
    <dc:date>2012-05-07T17:19:46</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/2022">
    <title>Index Size comparison</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/2022</link>
    <description>&lt;pre&gt;Hi,
I did a comparison based on similar steps to those in the blog
(zooie.wordpress.com/2009/07/06/a-comparison-of-open-source-search-engines-and-indexing-twitter),
against lucene-3.4 and xapian-1.3.0. The overall index sizes are:
lucene 89M, xapian 189M (chert backend and compacted).
Since I'm more interested in index size, I dug a little further and dumped
the full term list. There are about 360000 terms in the lucene index, and
about 285000 terms in the xapian index. But surprisingly, the termlist.DB
of the xapian index alone is already 122M.
Is there some idea/plan for reducing the index size? I'd be glad if I could
help.

Thanks!
Jaguar
&lt;/pre&gt;</description>
    <dc:creator>Jaguar Xiong</dc:creator>
    <dc:date>2012-04-23T14:16:51</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/2012">
    <title>Handling Negative value due to logarithm of probabilities.</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/2012</link>
    <description>&lt;pre&gt;Hi,

This continues the discussion from the melange comments about the negative
value returned in the matcher due to the logarithm of probabilities.

"If we make K suitably large, we could clamp each log(K.Pi) to be &amp;gt;= 0,
and this change will only affect really low probability terms (those with
Pi &amp;lt; 1/K, so you can adjust K to suit):

W' = sum(i=1,...,n, max(log(K.Pi), 0))"

Did you mean that for low probabilities the value returned by log(K.Pi) would
be negative, so we replace any low probability that still gives a negative
value with 0?

Assigning 0 is equivalent to rejecting the term from the query
completely, which hurts retrieval performance in the Language Model, as
terms missing from the document are smoothed with the collection frequency.

I think we must try smoothing from collection statistics if the
document term probability doesn't work (i.e. generates a negative value).

sum(i=1,...,n,
    if max(log(K.Pi), 0) == 0
        max(log(K.Pcollec.i), 0)
    else
        log(K.Pi)
)

If neither works, returning 0 would be the only option.
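
A minimal Python sketch of the fallback rule above, for illustration only.
K, Pi and Pcollec.i follow the notation of this message; the function name
and the sample numbers are hypothetical, and all probabilities are assumed
to be strictly positive (log(0) would raise an error).

```python
import math

def clamped_log_weight(K, doc_probs, coll_probs):
    """Sum of per-term weights, each clamped at zero.

    doc_probs holds the Pi values and coll_probs the Pcollec.i values.
    """
    total = 0.0
    for p_doc, p_coll in zip(doc_probs, coll_probs):
        weight = max(math.log(K * p_doc), 0.0)
        if weight == 0.0:
            # Document probability is at or below 1/K: fall back to the
            # collection probability, again clamped at zero.
            weight = max(math.log(K * p_coll), 0.0)
        total += weight
    return total

# Hypothetical two-term query: the second term clamps to zero on the
# document probability and is rescued by the collection probability.
print(clamped_log_weight(1000.0, [0.01, 0.0001], [0.05, 0.002]))
```

With these sample values the first term contributes log(10) and the second
contributes log(2) through the collection fallback.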


Moreover, selecting a large enough K would be tricky, as no K would
be large enough, since log(x) -&amp;gt; -inf as x -&amp;gt; 0.

Should we select the value of K statistically? I mean running the unigram
weighting scheme on a large collection, observing the lowest probability
that can be found, and approximating the value of K from that, or using
some other method.


I asked the same question on Stack Overflow:

http://goo.gl/ykwN4


They suggested:

"Could you simply take the negative of the logarithm? Since you are
dealing with probabilities (i.e. values &amp;lt;= 1), the logarithm is always
negative,
so negating it will always make it positive."

But this approach won't be a good idea, as large values would indicate low
probabilities and small values would indicate high probabilities. Hence the
matcher would tend to skip some good documents from the ranked list due to
their lower weight.


Thanks,
*-- *
with regards
Gaurav A.
&lt;/pre&gt;</description>
    <dc:creator>Gaurav Arora</dc:creator>
    <dc:date>2012-04-27T04:58:30</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/2009">
    <title>Letor re-factored code</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/2009</link>
    <description>&lt;pre&gt;Hello Rishabh,

The attached diff file should help you to refactor the existing letor code
and plug in your code easily. I have defined the header files with the
necessary flow. I have also specified the ranker class, which should be
implemented by any new LTR model (ListNet and ListMLE in your case). The
evaluation file should be used for validating the performance of the
algorithm in the training, which you plan to implement. You should take
the corresponding definitions from letor_internal.cc and create the
corresponding .cc files. Feel free to make any necessary changes and, in
case of any doubt, feel free to ask.

Regards,
Parth.
&lt;/pre&gt;</description>
    <dc:creator>Parth Gupta</dc:creator>
    <dc:date>2012-04-24T09:11:34</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/2007">
    <title>filter by facets and objective c version</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/2007</link>
    <description>&lt;pre&gt;1. How do I perform filter by facets using the C++ api?
2. Has anyone ported Xapian to Objective-C?


thanks
Amish
&lt;/pre&gt;</description>
    <dc:creator>Amish Shah</dc:creator>
    <dc:date>2012-04-23T19:21:44</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/1999">
    <title>MeCab as a Japanese tokenizer</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/1999</link>
    <description>&lt;pre&gt;The first version of a Japanese Tokenizer implementation using MeCab.
It hasn't been tested yet.

In order to compile properly, both MeCab and libiconv should be installed.

On Ubuntu the following packages are needed: libmecab-dev mecab-ipadict-utf8
It's also recommended to install libiconv from source.
&lt;/pre&gt;</description>
    <dc:creator>Harrynson Hidalgo</dc:creator>
    <dc:date>2012-04-20T07:56:51</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/1996">
    <title>Japanese tokenization</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/1996</link>
    <description>&lt;pre&gt;This is the first version I had, it's full of bugs... I had already
debugged it but forgot to run git add... before doing git commit.
I'm re-debugging it right now.
&lt;/pre&gt;</description>
    <dc:creator>Harrynson Hidalgo</dc:creator>
    <dc:date>2012-04-20T05:04:27</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/1995">
    <title>Japanese tokenization</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/1995</link>
    <description>&lt;pre&gt;I send the first prototype of the patch.
&lt;/pre&gt;</description>
    <dc:creator>Harrynson Hidalgo</dc:creator>
    <dc:date>2012-04-20T04:46:13</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/1992">
    <title>Implementing the tf-idf weighting scheme</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/1992</link>
    <description>&lt;pre&gt;And the patch file is in the attachment.
&lt;/pre&gt;</description>
    <dc:creator>Jiuding Duan</dc:creator>
    <dc:date>2012-04-20T03:47:22</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/1991">
    <title>Implementing the tf-idf weighting scheme</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/1991</link>
    <description>&lt;pre&gt;Hi, all:

This is a basic implementation of the tf-idf scheme (the basic scheme used in
SMART) that can be used in Xapian. It might still need some further
revision, but I believe it works anyway. :)

I modified weight.h to define a subclass Tf_idfWeight and added a new
file tf_idf.cc in ../weight in the repo to implement Tf_idfWeight.
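
For readers unfamiliar with the scheme, one common tf-idf variant can be
sketched in Python as below. This is an illustration only and not the code
in the patch: the raw term frequency, the idf = log(N / df) form, and all
names and numbers are assumptions.

```python
import math

def tf_idf(term_freq, doc_freq, num_docs):
    """Raw term frequency times log(N / df); zero if the term is absent."""
    if term_freq == 0 or doc_freq == 0:
        return 0.0
    return term_freq * math.log(num_docs / doc_freq)

def score(query_terms, doc_term_freqs, doc_freqs, num_docs):
    """Sum the tf-idf weights of the query terms found in one document."""
    return sum(
        tf_idf(doc_term_freqs.get(term, 0), doc_freqs.get(term, 0), num_docs)
        for term in query_terms
    )

# Hypothetical statistics for a 1000-document collection.
doc_freqs = {"xapian": 10, "search": 500}   # documents containing each term
doc = {"xapian": 3, "search": 1}            # term counts in one document
print(score(["xapian", "search"], doc, doc_freqs, 1000))
```

The rare term dominates the score here, which is the behaviour the idf
factor is meant to produce.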

Here is the git diff patch:

https://gist.github.com/2422049

I think the next thing to do is to register this scheme with Xapian and write
some tests to see whether or not it works.

I grepped for the current BM25Weight, TradWeight and BoolWeight, and found
clues about Enquire::set_weighting_scheme(). But more needs to be
done to understand it.


Best,

Jiuding
&lt;/pre&gt;</description>
    <dc:creator>Jiuding Duan</dc:creator>
    <dc:date>2012-04-20T03:33:36</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/1989">
    <title>Posting list encoding improvements - pfd encoding &amp; var len encoding comparison program</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/1989</link>
    <description>&lt;pre&gt;Hi all,
I wrote a program that implements variable length encoding and fixed
length encoding, and compares their index size and the speed of looking up
a doc length.
You can see the comparison results in the attached snapshot.

1. The posting list is held entirely in memory.
2. The search strategy for fixed length encoding is skipping with an
exponential step (1, 2, 4, 8, ...). Once it exceeds the desired doc id, it
goes back to the previous step and skips with step 1.
3. The implemented fixed length encoding uses 4 bytes as the fixed length.
This is not optimal and can be further optimized in PFD.
4. The program generates uniformly random doc id gaps and doc lengths to
make the posting list.
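
The search strategy in point 2 can be sketched roughly as follows. This is
not taken from the linked program: a plain Python list stands in for the
fixed length (random access) posting list, and the function name and sample
doc ids are hypothetical.

```python
from operator import lt  # lt(a, b) is the "a is less than b" test

def skip_to(doc_ids, target, start=0):
    """Return the index of the first doc id at or above target.

    Returns len(doc_ids) if every doc id is below target.
    """
    size = len(doc_ids)
    pos = start
    step = 1
    # Gallop: advance by 1, 2, 4, 8, ... while the landing entry is
    # still below the target.
    while lt(pos + step, size) and lt(doc_ids[pos + step], target):
        pos += step
        step *= 2
    # The next jump would overshoot (or run off the end): walk forward
    # one entry at a time from the last safe position.
    while lt(pos, size) and lt(doc_ids[pos], target):
        pos += 1
    return pos

postings = [3, 7, 8, 15, 21, 34, 55, 89, 144]
print(skip_to(postings, 34))
```

The galloping phase needs random access to entries, which is why it pairs
naturally with a fixed length encoding rather than a variable length one.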

You can access the code via my github:
https://github.com/zwxxx/pfd_simple_test
&lt;/pre&gt;</description>
    <dc:creator>Weixian Zhou</dc:creator>
    <dc:date>2012-04-20T02:51:45</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/1987">
    <title>Xapian::Database-&gt;close() for perl missing</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/1987</link>
    <description>&lt;pre&gt;I have a xapian-daemon, which can be queried via http. A background-process
generates a new index every hour, then removes the old symlink and creates
a new one pointing to the current database.

/path/to/index/20120419010000
/path/to/index/20120419020000
/path/to/index/20120419030000
/path/to/index/default =&amp;gt; /path/to/index/20120419030000

So the daemon only checks the mtime of /path/to/index/default/iamchert before
every request and, if it has changed, closes and reopens the database.
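
The check described above can be sketched in Python as follows. This is an
illustration of the daemon logic only, not Search::Xapian code: open_db and
close_db are hypothetical caller-supplied callables, and the demo uses a
temporary file in place of the real iamchert marker.

```python
import os
import tempfile

class ReopenOnChange:
    """Reopen a handle whenever the marker file's mtime changes."""

    def __init__(self, marker_path, open_db, close_db):
        self.marker_path = marker_path
        self.open_db = open_db      # hypothetical: returns a new database handle
        self.close_db = close_db    # hypothetical: releases an old handle
        self.seen_mtime = None
        self.db = None

    def get(self):
        # os.stat follows symlinks, so a re-pointed "default" link is noticed.
        mtime = os.stat(self.marker_path).st_mtime_ns
        if mtime != self.seen_mtime:
            if self.db is not None:
                self.close_db(self.db)
            self.db = self.open_db()
            self.seen_mtime = mtime
        return self.db

# Demo with a temporary marker file and stand-in open/close callables.
events = []

def fake_open():
    events.append("open")
    return object()

def fake_close(db):
    events.append("close")

with tempfile.TemporaryDirectory() as tmp:
    marker = os.path.join(tmp, "iamchert")
    with open(marker, "w"):
        pass
    handle = ReopenOnChange(marker, fake_open, fake_close)
    first = handle.get()
    same = handle.get()            # marker unchanged: handle is reused
    os.utime(marker, ns=(1, 1))    # stand-in for the symlink being re-pointed
    second = handle.get()          # marker changed: old handle closed, new one opened

print(events)
```

The sketch only works if closing the old handle really releases its files.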

The problem is: there is no -&amp;gt;close in Perl for a database! So currently I
overwrite the object. After some days, the process has many open
file pointers to already removed databases.

I also tried -&amp;gt;reopen, but Xapian doesn't reopen every file, so after the
reopen I have some files open from the old database and the record.DB from
the new one. Also, the file pointer to the old record.DB still exists (I
tested with lsof -p $PID).

So what is the right way to cleanly shut down an open Xapian
database?

I tested with xapian-core 1.2.9 / Search::Xapian 1.2.9.0
&lt;/pre&gt;</description>
    <dc:creator>Websuche :: Felix Ostmann</dc:creator>
    <dc:date>2012-04-19T13:45:36</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/1986">
    <title>Patch: New features for Learning to Rank</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/1986</link>
    <description>&lt;pre&gt;Hello,
Please find attached the patch, which contains new features for the LETOR
framework and increases the total number of features from 19 to 37. In my
proposal I mentioned having a final count of 44 IR-specific features, apart
from those learnt by the unsupervised feature learning part via Deep Learning.
So I am left with 7 out of those 44, of which I will be implementing 6
within the next 2 days (before the 22nd).
This patch also contains a one-line change to
xapian-core/examples/questletor.cc correcting the usage of the questletor
file. I will send in another patch as soon as I am done with those 6
features.

Best regards,
Rishabh.
&lt;/pre&gt;</description>
    <dc:creator>Rishabh Mehrotra</dc:creator>
    <dc:date>2012-04-19T03:18:41</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.search.xapian.devel/1982">
    <title>Patch for Initial Prototype implementation of Unigram Language Modelling in xapian-core.</title>
    <link>http://comments.gmane.org/gmane.comp.search.xapian.devel/1982</link>
    <description>&lt;pre&gt;Hi,

  I have implemented an initial prototype of the Xapian::Weight subclass for
Unigram Language Modelling, to support UnigramLM weighting in Xapian. Other
changes include adding collection_frequency to the TermFreqs struct to store
the collection frequency of terms, some changes to support it in the Xapian
framework, and changing simplesearch.cc to search using the UnigramLMWeight
class.

The following issues have not been addressed in this patch (I am working on
them):

1. The log trick for handling multiplication for LM needs to be made more
robust than just adding some random number to avoid rejecting a document
due to the negative value returned by log.

     Each term's contribution is a probability (between 0 and 1), so
taking the log gives a negative value and eventually leads to rejection of
the document. Hence a random linear weight has been added. This needs to be
addressed by using logs of different bases and some other techniques.

The discussion about which log trick to use is here for reference:
http://comments.gmane.org/gmane.comp.search.xapian.devel/1857

2. Setting a tighter bound for get_maxpart() to make the matching process
more efficient.

3. Adding other smoothing factors to the UnigramLMWeight implementation.


PFA 5 patches for the initial prototype implementation of Unigram Language
Model in Xapian.

Thanks,

&lt;/pre&gt;</description>
    <dc:creator>Gaurav Arora</dc:creator>
    <dc:date>2012-04-15T01:09:33</dc:date>
  </item>
  <textinput rdf:about="http://search.gmane.org/?group=$group=gmane.comp.search.xapian.devel">
    <title>Search Engine</title>
    <description>Search the mailing list at Gmane</description>
    <name>query</name>
    <link>http://search.gmane.org/?group=$group=gmane.comp.search.xapian.devel</link>
  </textinput>
</rdf:RDF>

