<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/">
  <channel rdf:about="http://blog.gmane.org/gmane.mail.spam.crm114">
    <title>gmane.mail.spam.crm114</title>
    <link>http://blog.gmane.org/gmane.mail.spam.crm114</link>
    <description/>
    <syn:updatePeriod>hourly</syn:updatePeriod>
    <syn:updateFrequency>1</syn:updateFrequency>
    <syn:updateBase>1901-01-01T00:00+00:00</syn:updateBase>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.mail.spam.crm114/9591"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.mail.spam.crm114/9590"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.mail.spam.crm114/9589"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.mail.spam.crm114/9588"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.mail.spam.crm114/9587"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.mail.spam.crm114/9586"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.mail.spam.crm114/9585"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.mail.spam.crm114/9584"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.mail.spam.crm114/9583"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.mail.spam.crm114/9582"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.mail.spam.crm114/9581"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.mail.spam.crm114/9580"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.mail.spam.crm114/9579"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.mail.spam.crm114/9578"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.mail.spam.crm114/9577"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.mail.spam.crm114/9576"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.mail.spam.crm114/9575"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.mail.spam.crm114/9574"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.mail.spam.crm114/9573"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.mail.spam.crm114/9572"/>
      </rdf:Seq>
    </items>
    <image rdf:resource="http://gmane.org/img/gmane-25t.png"/>
    <textinput rdf:resource=""/>
  </channel>
  <image rdf:about="http://gmane.org/img/gmane-25t.png">
    <title>Gmane</title>
    <url>http://gmane.org/img/gmane-25t.png</url>
    <link>http://gmane.org</link>
  </image>
  <item rdf:about="http://permalink.gmane.org/gmane.mail.spam.crm114/9591">
    <title>Re: priolist.mfp regex problem</title>
    <link>http://permalink.gmane.org/gmane.mail.spam.crm114/9591</link>
    <description>&lt;pre&gt;

More likely to be a change/update in TRE, but we're both running the
same version.

   - Bill

------------------------------------------------------------------------------
For Developers, A Lot Can Happen In A Second.
Boundary is the first to Know...and Tell You.
Monitor Your Applications in Ultra-Fine Resolution. Try it FREE!
http://p.sf.net/sfu/Boundary-d2dvs2
&lt;/pre&gt;</description>
    <dc:creator>wsy-MXLR4ItNKm8&lt; at &gt;public.gmane.org</dc:creator>
    <dc:date>2012-04-13T13:58:06</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.mail.spam.crm114/9590">
    <title>Re: priolist.mfp regex problem</title>
    <link>http://permalink.gmane.org/gmane.mail.spam.crm114/9590</link>
    <description>&lt;pre&gt;On Thu, Apr 12, 2012 at 08:36:51AM -0400, wsy-MXLR4ItNKm8&amp;lt; at &amp;gt;public.gmane.org spake thusly:

You mean my fix works, but it is really intended that I be able to put in raw
spaces, right?


Weird.


Oops...I forgot to mention my crm version in the initial posting:

# crm -v
 This is CRM114, version 20090423-BlameSteveJobs (TRE 0.8.0 (BSD))
 Copyright 2001-2006 William S. Yerazunis
 This software is licensed under the GPL with ABSOLUTELY NO WARRANTY

Yours is somewhat newer than mine. Any chance this is a bug which has been fixed?

&lt;/pre&gt;</description>
    <dc:creator>Tracy Reed</dc:creator>
    <dc:date>2012-04-12T18:21:31</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.mail.spam.crm114/9589">
    <title>Re: priolist.mfp regex problem</title>
    <link>http://permalink.gmane.org/gmane.mail.spam.crm114/9589</link>
    <description>&lt;pre&gt;

Only marginally unintentional parsing.  But yeah, your solution
is correct.  (the "." is matching an arbitrary next character).

Lemme look at the source code.

Mailreaver:502 is:

    match &amp;lt;fromend nomultiline&amp;gt; (:w: :pm: :pat:) [:priolist:]  /(.)(.+)/

So it *should* have picked up everything after the first character
as the priority pattern.
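
As a rough illustration of that split (a sketch only: Python's re module
stands in for TRE here, so it cannot reproduce a TRE-specific problem, and
split_priolist_line is a made-up helper name, not part of mailreaver):

```python
import re

def split_priolist_line(line):
    """Split a line the way /(.)(.+)/ would: first character, then the rest."""
    m = re.match(r"(.)(.+)", line)
    if m is None:
        return None  # fewer than two characters on the line
    return m.group(1), m.group(2)

# Raw spaces simply end up inside the second capture group.
print(split_priolist_line("-yahoo messenger online now"))
```

Under these assumptions the second group keeps the spaces untouched, which
is the behaviour the match statement above is expected to have.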

Worse: it works for me correctly:


bash-4.2$ crm -v
 This is CRM114, version 20100106-BlameMichelson (TRE 0.8.0 (BSD))
 Copyright 2001-2009 William S. Yerazunis
 This software is licensed under the GPL with ABSOLUTELY NO WARRANTY
bash-4.2$ 
bash-4.2$ crm '-{ match &amp;lt;fromend nomultiline&amp;gt; (:a: :b: :c:) /(.)(.+)/;
 output /1st char: :*:b:\n/; output /Rest of chars: :*:c:\n/; liaf}'
foo bar
baz wugga
happy go lucky
1st char: f
Rest of chars: oo bar
1st char: b
Rest of chars: az wugga
1st char: h
Rest of chars: appy go lucky
bash-4.2$ 


No problem.  But sadly, no clue either.

   - Bill

&lt;/pre&gt;</description>
    <dc:creator>wsy-MXLR4ItNKm8&lt; at &gt;public.gmane.org</dc:creator>
    <dc:date>2012-04-12T12:36:51</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.mail.spam.crm114/9588">
    <title>priolist.mfp regex problem</title>
    <link>http://permalink.gmane.org/gmane.mail.spam.crm114/9588</link>
    <description>&lt;pre&gt;I have been successfully using priolist.mfp to match email addresses that I
want to black/whitelist for ages. Now I want to block on a particular phrase
such as:

yahoo messenger online now

which is from a particularly egregious sort of spam/scam my organization is
receiving. I have tried putting the following combinations, none of which have
worked:

-yahoo messenger online now
-yahoo\ messenger\ online\ now
-yahoo\smessenger\sonline\snow
-"yahoo messenger online now"

and probably various others which I cannot now reproduce. I really expected the
first one to work. I have finally stumbled upon a combination which does work:

-yahoo.messenger.online.now

Why would . work (I'm plenty familiar with regex and know it will match
any character) when neither \s nor a raw space does?

Is there some unintentional parsing being done on the whitespace I put in which
is breaking things?

Thanks!

&lt;/pre&gt;</description>
    <dc:creator>Tracy Reed</dc:creator>
    <dc:date>2012-04-12T00:14:20</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.mail.spam.crm114/9587">
    <title>libcrm and svm</title>
    <link>http://permalink.gmane.org/gmane.mail.spam.crm114/9587</link>
    <description>&lt;pre&gt;Hi Bill,

It's been a long time since I last wrote to the list. I'm interested in playing around with libcrm and the SVM classifier, but am a bit confused.

1. Where might I find the "best" version of the library? The link on the wiki seems pretty old. For me, "best" means reliable enough for doing personal spam filtering.
2. The HowTo mentions CRM114_SVM, but I see that there is also CRM114_LIBSVM and that the two choices follow different code paths in crm114_base. Which should I use, or why might I choose one vs. the other?

Best Regards,
   Steve


------------------------------------------------------------------------------
This SF email is sponsosred by:
Try Windows Azure free for 90 days Click Here 
http://p.sf.net/sfu/sfd2d-msazure
&lt;/pre&gt;</description>
    <dc:creator>Steve Pellegrin</dc:creator>
    <dc:date>2012-03-29T01:55:14</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.mail.spam.crm114/9586">
    <title>Re: Help needed with a "wedged" CRM114 installation</title>
    <link>http://permalink.gmane.org/gmane.mail.spam.crm114/9586</link>
    <description>&lt;pre&gt;

I *think* there's supposed to be a space there, but it's been
so long since I touched the code that I can't remember!

I *think* that a column 1 '#' can be used as a comment.



I am pretty sure that will NOT work.


I don't think so.  It's reading a line at a time.  But it's always 
possible that there's a bug.

   - Bill

&lt;/pre&gt;</description>
    <dc:creator>wsy-MXLR4ItNKm8&lt; at &gt;public.gmane.org</dc:creator>
    <dc:date>2012-03-26T17:52:58</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.mail.spam.crm114/9585">
    <title>Re: Help needed with a "wedged" CRM114 installation</title>
    <link>http://permalink.gmane.org/gmane.mail.spam.crm114/9585</link>
    <description>&lt;pre&gt;

Well, in newer versions, the whitelist, blacklist, and priolist files
are consolidated into JUST one: priolist.

A priolist file looks like this: a + or a -, then a space, then a
pattern.  A leading "+" means "whitelist", a leading "-" means
"blacklist".  As a regex, it looks like this:

  (\+|-) pattern

For example, to whitelist my lawyer and my doctor, and blacklist
my ex-girlfriend:

+ my_lawyer-gexy9bi/sWtBDgjK7y7TUQ&amp;lt; at &amp;gt;public.gmane.org
+ my_doctor-rimE70MFspc&amp;lt; at &amp;gt;public.gmane.org
- my_ex_girlfriend-XJeLq6sSUTW+dUnouKm4lg&amp;lt; at &amp;gt;public.gmane.org
+ my_new_girlfriend-Tpms0nXwIR954TAoqtyWWQ&amp;lt; at &amp;gt;public.gmane.org

So, to whitelist or blacklist anyone, just add them (or their
domain) to the priolist.

Note that the pattern just needs to *match*.  It can match
anywhere in the entire message, so if my ex-girlfriend mentions
my doctor's email address, it will still come through.

Note also that the priolist is executed in strict order.
Thus, if my ex-girlfriend mentions my doctor, it comes through (the
doctor outranks my ex-girlfriend) but if she mentions my 
new girlfriend, that does NOT come through (ex-girlfriend
occurs before new girlfriend).
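
The strict-order rule above can be sketched in Python roughly like this.
Everything named here is made up for illustration (priolist_verdict, the
example.org addresses), Python's re stands in for TRE, and this is not
how mailreaver itself is written:

```python
import re

def priolist_verdict(rules, message):
    """Return the sign of the first rule whose pattern matches anywhere."""
    for sign, pattern in rules:
        # re.search matches anywhere in the message, not just the headers.
        if re.search(pattern, message):
            return sign
    return None  # no rule matched; the classifier would decide instead

# Hypothetical rules, in the same order as the example list above.
rules = [
    ("+", r"my_lawyer@example\.org"),
    ("+", r"my_doctor@example\.org"),
    ("-", r"my_ex_girlfriend@example\.org"),
    ("+", r"my_new_girlfriend@example\.org"),
]

msg = "From: my_ex_girlfriend@example.org\nAsk my_doctor@example.org"
print(priolist_verdict(rules, msg))
```

With that rule order, a message from the ex that mentions the doctor
gets "+", while one that mentions only the new girlfriend gets "-".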

   - Bill (married five years now, still thinks this way.  Some
           things *never* change.)


&lt;/pre&gt;</description>
    <dc:creator>wsy-MXLR4ItNKm8&lt; at &gt;public.gmane.org</dc:creator>
    <dc:date>2012-03-26T12:16:31</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.mail.spam.crm114/9584">
    <title>Re: Help needed with a "wedged" CRM114 installation</title>
    <link>http://permalink.gmane.org/gmane.mail.spam.crm114/9584</link>
    <description>&lt;pre&gt;I see whitelisting mentioned in this thread, so I thought to offer
what I find a very useful and trouble-free whitelist system based on
shell scripts. (My current understanding is that one crm-114 setup
includes whitelisting, and one does not.)

I have this system integrated with my maildrop .mailfilter, and call
it when/as needed from the keyboard in mutt to add senders to my
.whitelist.

http://impressive.net/people/gerald/2000/12/spam-filtering.html


&lt;/pre&gt;</description>
    <dc:creator>Eric d'Halibut</dc:creator>
    <dc:date>2012-03-23T03:25:03</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.mail.spam.crm114/9583">
    <title>Re: Help needed with a "wedged" CRM114 installation</title>
    <link>http://permalink.gmane.org/gmane.mail.spam.crm114/9583</link>
    <description>&lt;pre&gt;martin-Gf0sdYSFR0SsTnJN9+BGXg&amp;lt; at &amp;gt;public.gmane.org said:

Ok, /me is feeling rather embarrassed now. I just checked my entire setup,
and it looks like something hasn't been working since some time in May last
year, at least based on the date on my spam.css and nonspam.css files.

I think the problem is in the dovecot antispam plugin, am investigating.

Martin
&lt;/pre&gt;</description>
    <dc:creator>Martin Lucina</dc:creator>
    <dc:date>2012-03-22T15:12:29</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.mail.spam.crm114/9582">
    <title>Re: Help needed with a "wedged" CRM114 installation</title>
    <link>http://permalink.gmane.org/gmane.mail.spam.crm114/9582</link>
    <description>&lt;pre&gt;wsy-MXLR4ItNKm8&amp;lt; at &amp;gt;public.gmane.org said:

I'm currently using the default from mailfilter.cf: clf: /osb unique microgroom/

Do I have anything to gain from SVM? (Speaking as a user, and slightly
distracted at the moment, so I haven't read up on it)


It's definitely 0x20.


Same here. I'll see what happens when I retrain and if the problem persists
will just add the Koreans to the whitelist.

Martin
&lt;/pre&gt;</description>
    <dc:creator>Martin Lucina</dc:creator>
    <dc:date>2012-03-22T14:50:21</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.mail.spam.crm114/9581">
    <title>Re: Help needed with a "wedged" CRM114 installation</title>
    <link>http://permalink.gmane.org/gmane.mail.spam.crm114/9581</link>
    <description>&lt;pre&gt;

Only if you want the SVM classifier.  The base classifier (Markovian)
is unchanged.



Go into EMACS or another byte-accurate editor (or a hex editor) and
check!  It can be quite important.  (i.e. is there really a hex 0x20
between each word?  Or does the Hangul representation you get have a
"space" in the glyphset that is NOT 0x20, and it gets used because that
way you don't have to change glyphsets twice for every word?)
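
If a hex dump is too tedious to eyeball, a Python sketch along these
lines lists space-like characters that are not 0x20.  Assumptions: you
know the encoding of the message (utf-8 below is only a placeholder
default), offsets are counted in characters rather than bytes, and
find_odd_spaces is a made-up name:

```python
import unicodedata

def find_odd_spaces(raw, encoding="utf-8"):
    """List (offset, name) for every space separator that is not U+0020."""
    text = raw.decode(encoding)
    odd = []
    for offset, ch in enumerate(text):
        # Unicode category "Zs" covers space separators such as U+3000.
        if ch != " " and unicodedata.category(ch) == "Zs":
            odd.append((offset, unicodedata.name(ch)))
    return odd

sample = "ab\u3000cd ef".encode("utf-8")
print(find_odd_spaces(sample))
```

An empty list means every space separator in the text really is 0x20.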



Yes, there is.  That's what the whitelist is for.  Although one might regard 
it as "cheating", I am a firm believer in "whatever gets you through
the night."

   - Bill Yerazunis

&lt;/pre&gt;</description>
    <dc:creator>wsy-MXLR4ItNKm8&lt; at &gt;public.gmane.org</dc:creator>
    <dc:date>2012-03-22T14:32:52</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.mail.spam.crm114/9580">
    <title>Re: Help needed with a "wedged" CRM114 installation</title>
    <link>http://permalink.gmane.org/gmane.mail.spam.crm114/9580</link>
    <description>&lt;pre&gt;Hi Bill,

wsy-MXLR4ItNKm8&amp;lt; at &amp;gt;public.gmane.org said:

Added the Cc: to the list back in, have now received email from the system
confirming that I'm subscribed.


That never happened for me. I kept getting a trickle of both SPAM and
non-SPAM mail going into the "UNSURE" folder.


Should I also upgrade to the latest codebase? If so, which one? I'm
currently using 20090807-BlameThorstenAndJenny (TRE 0.7.5 (LGPL)).


Assuming the various different Korean character sets all have what renders
as "space" on my screen in the same place as "ascii space", which is a
pretty safe assumption, then yes. Hangul is alphabet-based rather than
glyph-based; the only reason it looks superficially similar to Japanese is
that they use the trick of combining multiple characters to form a single
glyph.

Alternatively, given that the volume of mail I get in Korean is quite low,
is there a way to tell the system to just pass through mail from certain
senders, completely ignoring it for training purposes?


Replaying email will be a bit hard.  I'll try with just moving the files
aside and re-training from scratch, anything is better than what it's doing
now :-) 

Martin

&lt;/pre&gt;</description>
    <dc:creator>Martin Lucina</dc:creator>
    <dc:date>2012-03-22T14:24:53</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.mail.spam.crm114/9579">
    <title>Re: Confidential information scanning backup files</title>
    <link>http://permalink.gmane.org/gmane.mail.spam.crm114/9579</link>
    <description>&lt;pre&gt;

Yes, such a toolkit (actually, a complete package) exists.  It's
actually a commercial product; the core bits of it are libcrm114
(which is LGPLed) but you pay for the rest.

Oh, and it's all in Japanese.  I don't think there's an English-language
port of the GUI nor of the documentation.

Your choice:

  1) build a toolkit yourself.
or
  2) learn Japanese

     - Bill Yerazunis



------------------------------------------------------------------------------
Keep Your Developer Skills Current with LearnDevNow!
The most comprehensive online learning library for Microsoft developers
is just $99.99! Visual Studio, SharePoint, SQL - plus HTML5, CSS3, MVC3,
Metro Style Apps, more. Free future releases when you subscribe now!
http://p.sf.net/sfu/learndevnow-d2d
&lt;/pre&gt;</description>
    <dc:creator>wsy-MXLR4ItNKm8&lt; at &gt;public.gmane.org</dc:creator>
    <dc:date>2012-02-28T13:24:04</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.mail.spam.crm114/9578">
    <title>Confidential information scanning backup files</title>
    <link>http://permalink.gmane.org/gmane.mail.spam.crm114/9578</link>
    <description>&lt;pre&gt;I was discussing a security issue with my new employer today, about
scanning backups of servers that should not have confidential data on them
for precisely such data. In the short term, scanning the email would do,
especially if attachments can be scanned. And I thought of CRM114 for the
task, instead of the very slow and painful tools that are often used now.

Is there a toolkit for such scanning? I'd much prefer to avoid taking on the
full integration project, but if anyone's already got such a toolkit
assembled, even if it's a commercial toolkit, I'd love to review it for use
at my new workplace.
_______________________________________________
Crm114-general mailing list
Crm114-general-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f&amp;lt; at &amp;gt;public.gmane.org
https://lists.sourceforge.net/lists/listinfo/crm114-general
&lt;/pre&gt;</description>
    <dc:creator>Nico Kadel-Garcia</dc:creator>
    <dc:date>2012-02-28T03:22:22</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.mail.spam.crm114/9577">
    <title>modest update to CLASSIFY_DETAILS.txt</title>
    <link>http://permalink.gmane.org/gmane.mail.spam.crm114/9577</link>
    <description>&lt;pre&gt;Hi, All.

I'm back to looking at CRM114, and I decided to update some of the
documentation, working from "the (possibly) slightly unstable latest
mainline version" -- thanks for the continued good work, and I hope the 
updates prove useful.  This is the first.

/Jskud

--- crm114-20120205-ORIG/CLASSIFY_DETAILS.txt	2009-09-11 11:25:57.000000000 -0700
+++ crm114-20120205/CLASSIFY_DETAILS.txt	2012-02-06 07:49:00.000000000 -0800
@@ -18,7 +18,8 @@
 The current distribution builds in this set of classifiers.  The
 classifiers are:
 
-1) SBPH Markovian (the default) This is an extension of Bayesian
+1) SBPH Markovian (the default) - This classifier uses
+   Sparse Binary Polynomial Hashing (SBPH), an extension of Bayesian
    classification, mapping features in the input text into a Markov
    Random Field.  This turns each token in the input into 2^(N-1)
    features, which gives high accuracy but at high computation
@@ -52,7 +53,7 @@
    other classifiers.  It _will_ work against binary files, though,
    which none of the other classifiers will.
 
-5) Hyperspatial classification - this experimental classifier
+5) Hyperspatial classification - This experimental classifier
    tokenizes, but does not use Bayes law at all, nor statistical
    "clumping".  During learning, each example document generates a
    single point in a 4 billion dimensional hyperspace.  The
@@ -719,7 +720,7 @@
 
 
 
-The format of a SBPH or OSB Markovian .css file (and, for winnow a
+The format of a SBPH or OSB Markovian .css file (and, for Winnow a
 .cow file) is a 64-bit hash of a feature (whether the feature is a
 single word, a bigram, or a full SBPH does not matter) and a 32-bit
 representation of the value.  In .css files, the 32 bits is an
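
To make the 2^(N-1) figure in the patched text concrete, here is a small
Python sketch.  Assumptions, stated loudly: a window of N=5 tokens, a "*"
placeholder for skipped positions, and plain tuples instead of the hashes
CRM114 actually stores.  sbph_features is a made-up name; this is not the
code in the kit.

```python
from itertools import product

SKIP = "*"  # placeholder for a window position left out of a feature

def sbph_features(tokens, window=5):
    """For each token, emit one feature per subset of the window-1
    preceding positions, always keeping the token itself."""
    padded = [SKIP] * (window - 1) + list(tokens)
    features = []
    for i in range(len(tokens)):
        *history, current = padded[i:i + window]
        # Each of the window-1 earlier positions is either kept or skipped.
        for keep in product((False, True), repeat=window - 1):
            picked = [h if k else SKIP for h, k in zip(history, keep)]
            features.append(tuple(picked) + (current,))
    return features

feats = sbph_features("the quick brown fox jumps".split())
print(len(feats))
```

Each token contributes 2^(5-1) = 16 features here, which is where the
accuracy and the computational cost both come from.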

&lt;/pre&gt;</description>
    <dc:creator>Jskud.CRM114-hFAK3oOPH3QAvxtiuMwx3w&lt; at &gt;public.gmane.org</dc:creator>
    <dc:date>2012-02-07T06:27:59</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.mail.spam.crm114/9576">
    <title>updates to CRM114_Mailfilter_HOWTO.txt</title>
    <link>http://permalink.gmane.org/gmane.mail.spam.crm114/9576</link>
    <description>&lt;pre&gt;Hi, All.

I'm back to looking at CRM114, and I decided to update some of the
documentation, working from "the (possibly) slightly unstable latest
mainline version" -- thanks for the continued good work, and I hope the 
updates prove useful.  This is the second of two.

I mostly reworked the formatting to make it consistent, and made the
step titles consistent as well, fixing a few obvious typos in the
process.

/Jskud

--- crm114-20120205-ORIG/CRM114_Mailfilter_HOWTO.txt	2009-09-11 11:25:57.000000000 -0700
+++ crm114-20120205/CRM114_Mailfilter_HOWTO.txt	2012-02-06 19:52:36.000000000 -0800
@@ -7,7 +7,7 @@
 The CRM114 &amp;amp; Mailfilter HOWTO
 
     -Bill Yerazunis, 2003-09-18
-(last update 2009-03-02)
+(last update 2012-02-06)
 
 
 This is the CRM114 Mailfilter HOWTO.  It describes how to set up CRM114
@@ -31,7 +31,7 @@
 
    ----------------------------------------------------------
 
-That said, I hope CRM114, Mailreaver, and Mailreaver is useful to you;
+That said, I hope CRM114, Mailfilter, and Mailreaver is useful to you;
 it's been very useful to me.  It's been keeping my mailbox clear of
 clutter for since 2002; I'm convinced it has better performance than
 I-the-human at killing spam without accidentally deleting important
@@ -64,12 +64,15 @@
 
      - Bill Yerazunis (wsy-MXLR4ItNKm8&amp;lt; at &amp;gt;public.gmane.org)
 
--------------------------------------------------------------------
 
-Step 0:  Scientes Inamicae  (Know Thy Enemy)
+------------------------------------------------------------------------
+------------------------------------------------------------------------
+
+
+      Step 0: Scientes Inamicae  (Know Thy Enemy)
 
-These are the major steps in using CRM114 Mailfilter.  The steps are
-pretty simple:
+These are the other major steps in using CRM114 Mailfilter.  The steps
+are pretty simple:
 
       1) Downloading what you need
 
@@ -85,7 +88,7 @@
          (editing one file, most likely change is ONE line, and we tell
  you which one)
 
-      3) Setting up the needed auxilliary files
+      4) Setting up other needed files
 
  (not more than 2 files to edit of no more than 5 lines each,
  plus typing one or two commands)
@@ -115,10 +118,11 @@
  don't need to know this, but you may find it useful.
 
 
+------------------------------------------------------------------------
+------------------------------------------------------------------------
 
--------------------------------------------------------------------------
 
-                  Step 1: Downloading.
+Step 1: Downloading What You Need
 
 Get yourself a copy of a CRM114 kit.  The kits can always be found by
 visiting the CRM114 homepage at:
@@ -168,16 +172,14 @@
 
 Download the kits you will need (at least one of .src.tar.gz or
 .i386.tar.gz or .i386.rpm) and then proceed to "Step 2: Setting Up the
-Executables"
-
-
-
---------------------------------------------------------------------------
+Executables".
 
 
+------------------------------------------------------------------------
+------------------------------------------------------------------------
 
 
-                       Step 2: Setting Up the Executables
+Step 2: Setting Up the Executables
 
 In this step, you will install four binaries into your system.
 The four binaries are:
@@ -262,8 +264,9 @@
 
 
 Congratulations!  You've now completed the installation of CRM114 and
-utilities from prebuilt binaries.  Proceed to "Step 3: Setting Up Needed
-Files.
+utilities from prebuilt binaries.  Proceed to "Step 3: Configuring
+Mailfilter or Mailreaver.
+
 
   -----
 
@@ -382,14 +385,14 @@
 
 
 Congratulations!  You've now completed the installation of CRM114 and
-utilities from source.  Move on to the next step - "Step 3: Setting Up
-Your .CSS Files" .
-
+utilities from source.  Move on to the next step - "Step 3: Configuring
+Mailfilter or Mailreaver".
 
 
 ------------------------------------------------------------------------
 ------------------------------------------------------------------------
 
+
 Step 3: Configuring Mailfilter or Mailreaver
 
 In this step you will tell Mailfilter or MailReaver what you want it
@@ -500,11 +503,11 @@
 Now, proceed to "Step 4: Setting Up Other Needed Files" .
 
 
---------------------------------------------------------------------
---------------------------------------------------------------------
+------------------------------------------------------------------------
+------------------------------------------------------------------------
 
 
-Step 4: Setting Up Other Needed Files
+Step 4: Setting Up Other Needed Files
 
 Now that the crm114 language is working, you need to set up your
 .css files,  your rewrites.mfp file, and your priolist.mfp file.
@@ -512,8 +515,8 @@
 All of these files need to exist (either by being there, or by
 being symlinked to) the directory where CRM114 will "run in"
 when an actual mail comes in.  Usually this is your per-user
-directory on the mail server (if your mail server is also your
-home directory, then it's there.).    If this is inconvenient,
+directory on the mail server (if your mail server also provides your
+home directory, then it's there.).  If this is inconvenient,
 you can use the --fileprefix option on the command line to
 tell CRM114 to "change over" to a different directory.  The files
 that need to be in the home (or --fileprefix) directory are:
@@ -582,9 +585,9 @@
 whitelists", you can now say "yes, and they're even _prioritized_
 blacklists and whitelists!".
 
+  -----
 
-
-       Step 4 Part 1 - Setting up the Rewrites file.
+  Step 4 Part 1 - Setting up the Rewrites file.
 
 To set up the rewrites.mfp file, edit the file "rewrites.mfp" and
 replace the placeholders (in this case, "wsy", "merl.com", and
@@ -628,23 +631,23 @@
 router, etc, add lines in rewrites.mfp for each email name, email
 address, server, router, and so forth.  This is something you really
 _should_ do, if you have more than one email path leading to the
-account that leads to an account that is being filtered by CRM114 (if
+account that leads to an account that is being filtered by CRM114.  (If
 you don't, a lot of learning will have to be repeated for each path,
 which will cost you accuracy and use up valuable feature slots in the
 .css files that you could use in more valuable ways otherwise.  On the
 other hand, if you have multiple email addresses that all channel
-through one CRM114 fileset, and the addresses recieve very different
+through one CRM114 fileset, and the addresses receive very different
 ratios of spam and nonspam (or, very differnt *types* of spam), then
 it _might_ be to your advantage to not use rewrites.mfp, (just replace
 it with an empty file), so that the extra statistical information of
-the incoming email address is not lost)
+the incoming email address is not lost.)
 
 If all this confuses you to no end, just make rewrites.mfp be an
-empty file and everything should decently well.
+empty file and everything should work decently well.
 
-       -----
+  -----
 
-       Step 4 Part 2 - Setting up the .CSS files
+  Step 4 Part 2 - Setting up the .CSS files
 
 
 You have a choice here.  You can either build your own files from your
@@ -662,7 +665,7 @@
 
 If your mail service runs on your local machine (say, you have just
 one machine - and I do hope you have a firewall in that case), then
-mailfilter will almost certainly "run" in your home directory- the
+mailfilter will almost certainly "run" in your home directory - the
 directory you're in when you log in.
 
 If your mail service runs on a mail server (not your local machine),
@@ -695,7 +698,7 @@
 Once you have these empty files you will have a high (50% or so)
 error rate for the first few hours, till you have 'taught' CRM114
 what your particular mix of spam and nonspam looks like.  Proceed
-below to "Step 4: Configuring Mailfilter".
+below to "Step 5: Engaging Mailfilter".
 
 Many people want to "preload" their spam collection into CRM114.  This
 used to be a bad idea.  CRM114 is optimized for TOE learning - "Train
@@ -741,8 +744,8 @@
 
   -----
 
-    Step 4 Part 2 Method C - BETA TEST - Using mailtrainer.crm to
-    Build .CSS Files
+  Step 4 Part 2 Method C - BETA TEST -
+       Using mailtrainer.crm to Build .CSS Files
 
 New in 20060101 is the "mailtrainer.crm" program.  This program
 accepts two directories of "archetype" good and spam email, and runs
@@ -767,7 +770,7 @@
 
   -----
 
-      Step 4 Part 2 Method D - ALPHA TEST -- MAKEFILE Build And
+   Step 4 Part 2 Method D - ALPHA TEST -- MAKEFILE Build And
         Preload .CSS Files From Fresh Spam and Nonspam
 
  CAUTION - this applies ONLY to kits 20060606 and later!!!  DO NOT DO
@@ -816,7 +819,7 @@
 installs post 20060606 .  Versions prior to that will hose you if
 you do this.
 
- --------
+  -----
 
   Step 4 Part 3 - Checking your installation
 
@@ -893,6 +896,7 @@
 Note: this works fine for the default classifiers like Markov, OSB,
 and OSB Unique, but _not_ for Winnow, Hyperspace, or Corellative
 classifiers; for OSBF classifiers use osbf-util instead of cssutil.
+See ./CLASSIFY_DETAILS.txt for a description of the classifiers.
 
 Type in:
 
@@ -959,10 +963,12 @@
 there are similarities.  That's pretty much typical- and it's a good sign
 that your filtering should be quite accurate.
 
-Now, move on to "Step 4: Configuring Mailfilter".
+Now, move on to "Step 5: Engaging Mailfilter".
+
+
+------------------------------------------------------------------------
+------------------------------------------------------------------------
 
-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
 
 Step 5: Engaging Mailfilter
 
@@ -985,7 +991,7 @@
 
   -----
 
-      Step 5 Method A: For Procmail and Maildrop Users
+  Step 5 Method A: For Procmail and Maildrop Users
 
 For Procmail users just add a procmail recipe to .procmailrc to run
 CRM114 and mailfilter whenever your other procmail rules fail to
@@ -1011,7 +1017,8 @@
 To use mailreaver instead of mailfilter, just put "mailreaver.crm"
 in instead of "mailfilter.crm" .
 
-If you get the test message, proceed to "Step 6: Training CRM114".
+If you get the test message, proceed to "Step 6: Training CRM114 and
+Mailfilter".
 
 -----
 
@@ -1059,7 +1066,6 @@
 ----------------------------------------------------------------------------
 ----------------------------------------------------------------------------
 
-
 Advanced Topic: Huge Emails and Denial Of Service Avoidance
 
 CRM114 has a number of built-in anti-Denial-of-Service (anti-DoS)
@@ -1089,10 +1095,9 @@
    mail/crm-spam
 
 
-
   -----
 
-    Step 5 Method B: The .forward hook file
+  Step 5 Method B: The .forward hook file
 
 For .forward hook users you should be aware that you should NOT put a
 direct link to crm in /etc/smrsh; since crm can do arbitrary things,
@@ -1118,7 +1123,8 @@
   ----
 
 Once you have engaged CRM114 mailfilter, you now get to train it to
-recognize spam and nonspam.  Proceed to "Step 6: Training CRM114".
+recognize spam and nonspam.  Proceed to "Step 6: Training CRM114 and
+Mailfilter".
 
 Note: CRM114 contains a design decision that you may have to play
 with.  Instead of doing memory management games, which both consume
@@ -1138,7 +1144,9 @@
 a buffer-shuffling dance to minimize time spent reclaiming and
 compactifying memory.
 
----------------------------------------------------------------------------
+
+------------------------------------------------------------------------
+------------------------------------------------------------------------
 
 
 Step 6: Training CRM114 and Mailfilter
@@ -1162,16 +1170,15 @@
 interchangeably here; the instructions say "mailfilter.crm" but
 mailreaver.crm works exactly the same way from the user point of view.
 
-   * Mail-to-Myself with In-Line Commands to retrain  (Method A)
-   * shell commands to retrain  (Method B)
-   * Mutt direct interface    (Method C)
-   * Some Other Interface    (Method D)
-
+   * Method A: mail-to-myself with in-line commands to retrain
+   * Method B: shell commands to retrain
+   * Method C: Mutt direct interface
+   * Method D: some other interface
 
 
-Whatever Way You Train : try to train _approximately_ equal amounts of spam and
-nonspam.  If you are within 50% one way or the other, performance will
-be very good.
+Whatever Way You Train: try to train _approximately_ equal amounts of
+spam and nonspam.  If you are within 50% one way or the other,
+performance will be very good.
 
 If you are running mailfilter.crm:
 
@@ -1198,8 +1205,9 @@
 error per thousand).
 
 
+  -----
 
-     Step 6 Method A: Mail-to-Myself
+  Step 6 Method A: Mail-to-Myself
 
 The first way is to use the in-line command feature.  Just forward
 the mistake back to yourself, with full headers (except edit out any
@@ -1247,25 +1255,24 @@
 If you are a mailreaver user, you also have a priority system you can
 access, either by editing your priolist.mfp file directly or by
 sending youself email in the following forms (where mypwd is the
-command passworda_regex_pattern is what will be used for priority
+command password, and a_regex_pattern is what will be used for priority
 matching.  Priority matches can occur in both the headers and body of
 the text.)
 
     command mypwd maxprio +a_regex_pattern      - sets a maximum priority GOOD
     command mypwd maxprio -a_regex_pattern      - sets a maximum priority SPAM
-    command mypwd minprio +a_regex_pattern      - sets a maximum priority GOOD
-    command mypwd minprio -a_regex_pattern      - sets a maximum priority SPAM
+    command mypwd minprio +a_regex_pattern      - sets a minimum priority GOOD
+    command mypwd minprio -a_regex_pattern      - sets a minimum priority SPAM
     command mypwd delprio a_regex_pattern       - deletes the first priority
-                                                 list entry that fully matches
-                                                 the regex pattern
-
-
+                                                  list entry that fully matches
+                                                  the regex pattern
 
 
+  -----
 
-    Step 6 Method B: Shell commands to retrain
+  Step 6 Method B: Shell commands to retrain
 
-   &amp;gt;&amp;gt; For mailfilter users (mailreaver is different - skip to below! &amp;lt;&amp;lt;
+   &amp;gt;&amp;gt; For mailfilter users (mailreaver is different - skip to below! &amp;lt;&amp;lt;)
 
 The second way to train in spam and nonspam is to use mailfilter.crm's
 shell command line options.  When you find a spam that was mistakenly
@@ -1286,7 +1293,7 @@
  [[ If you are using mailreaver.crm instead of mailfilter.crm, and
  cacheing is enabled, you don't even need to pipe in the full text in,
  all that's needed is either the intact X-CRM114-CacheID: line or the
- Message-ID line containing an intact sfid.  That's another reason to
+ Message-ID line containing an intact SFID.  That's another reason to
  switch to mailreaver! :) ]]
 
               &amp;gt;&amp;gt; For mailreaver.crm users &amp;lt;&amp;lt;
@@ -1294,7 +1301,7 @@
 You're in luck, assuming you have taken the default and left cacheing
 turned on.  All you need to pipe into mailreaver for training is any
 text or text fragment containing an intact X-CRM114-CacheID: line or
-the Message-ID line containing an intact sfid; mailreaver will go get
+the Message-ID line containing an intact SFID; mailreaver will go get
 the exact incoming text of the message and train it, so you don't need
 to worry about munged headers.
 
@@ -1352,9 +1359,9 @@
                          file; instead use the file so noted.
 
 
+  -----
 
-
-     Part 6 Method C: For Mutt Users
+  Step 6 Method C: For Mutt Users
 
 (Contributed by Mathieu Doidy and Joost van Baal:)
 
@@ -1375,8 +1382,9 @@
    * esc-h will tag a message, falsely classified as spam, as ham.
 
 
+  -----
 
-    Part 6 Method D: Some Other Method
+  Step 6 Method D: Some Other Method
 
 
 There are at least five other ways to retrain CRM114.  Some interface
@@ -1458,12 +1466,11 @@
 of daily use and about a gigabyte of email).
 
 
-
------------------------------------------------------------------------
-
+------------------------------------------------------------------------
+------------------------------------------------------------------------
 
 
-     Step 7: Adding Priority Lists, Whitelists, and Blacklists
+Step 7: Adding Priority Lists, Whitelists, and Blacklists
 
 If you really want, you can add white, black, and priority lists
 to CRM114.  Most people don't need them, but there are always
@@ -1497,10 +1504,11 @@
 
 Lastly (well, actually firstly, because prio-listing happens before
 whitelisting or blacklisting) any mail that matches any regex in
-priolist.mfp .  The format of priolist.mfp is that the first character
-on the line is a + or a -, which indicates "whitelist" or "blacklist",
-and the rest of the line is a regex.  These regexes are tested
-in the order given in the file.  An empty file is perfectly acceptable.
+priolist.mfp is handled.  The format of priolist.mfp is that the first
+character on the line is a + or a -, which indicates "whitelist" or
+"blacklist", and the rest of the line is a regex.  These regexes are
+tested in the order given in the file.  An empty file is perfectly
+acceptable.
 
 For examples of how to set up the whitelist, blacklist, and priolist
 files, see the included "whitelist.mfp.example", "blacklist.mfp.example",
@@ -1513,10 +1521,11 @@
 add, otherwise you may get a rude surprise some day.
 
 
-----------------------------------------------------------------
+------------------------------------------------------------------------
+------------------------------------------------------------------------
 
 
-        Step 8: Useful Utilities
+Step 8: Useful Utilities
 
 You don't _need_ to know the stuff in this section to set up and use
 CRM114 and mailfilter or mailreaver, but it might be useful to you- or
@@ -1539,7 +1548,7 @@
 
 
                    The cssutil utility:
-
+                    -------------------
 
 Usage is
 
@@ -1560,9 +1569,7 @@
                 -s css-size  - if no cssfile found, create new
                                cssfile with this many buckets.
                 -S css-size  - same as -s, but round up to next
-                               2^n + 1 boundary.
-
-
+                               2^k + 1 boundary.
 
 
        The cssdiff utility
@@ -1572,8 +1579,7 @@
 
     ./cssdiff somefile.css anotherfile.css
 
-which writes out a summary of how two different .css files are.
-
+which writes out a summary of how different two .css files are.
 
 
                     The cssmerge utility
@@ -1600,19 +1606,16 @@
      -s NNNN      -new file length, if needed
 
 
-
-
-
-Enlarging a .css file
-                ---------------------
+    Enlarging a .css file
+                    ---------------------
 
 One of the advantages of CRM114 is that the .css files are relatively
 small and of fixed size; they don't grow out of control and never need
 trimming if you use &amp;lt;microgroom&amp;gt;, which is the default.
 
 The disadvantage of this is that if your spam/nonspam discrimination
-is too convoluted, it won't be able to sort them out ( in trek-speak
-this is a high-order nonlinearity in the discrimination function ).
+is too convoluted, it won't be able to sort them out (in trek-speak
+this is a high-order nonlinearity in the discrimination function).
 The fix in this situation is to increase the dimensionality of the
 feature space.  The number of dimensions is about 1/12 the number of
 bytes in the .css files; this works well at about a million dimensions
@@ -1640,7 +1643,7 @@
 
 You can even combine steps 1 and 2, because newer versions of cssmerge
 will create a new file if needed (the -s N flag sets the number of slots
-in the new file; -S N does the same thing but rounds up to a 2^N+1
+in the new file; -S N does the same thing but rounds up to a 2^k+1
 boundary, which is recommended ).
 
 For example, here's how to increase the size of the spam.css file
@@ -1657,6 +1660,7 @@
 --------------------------------------------------------------------
 
     APPENDIX 1
+
                Using mailtrainer.crm
 
 
@@ -1902,12 +1906,11 @@
 improve your accuracy still more.
 
 
+------------------------------------------------------------------------
 
----------------------------------------------------------------------
-
-That's all!  If you have errors or updates (or find bugs!) please
-let me know; the best way is to join the CRM114-general mailing list; it's
-on the webpage:
+That's all!  If you have errors or updates (or find bugs!) please let me
+know; the best way is to join the CRM114-general mailing list; it's on
+the webpage:
 
    http://crm114.sourceforge.net
 
@@ -1920,3 +1923,5 @@
 Enjoy, and good luck.
 
        -Bill Yerazunis
+
+[]

&lt;/pre&gt;</description>
    <dc:creator>Jskud.CRM114-hFAK3oOPH3QAvxtiuMwx3w&lt; at &gt;public.gmane.org</dc:creator>
    <dc:date>2012-02-07T06:31:46</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.mail.spam.crm114/9575">
    <title>Re: CRM114 php extension</title>
    <link>http://permalink.gmane.org/gmane.mail.spam.crm114/9575</link>
    <description>&lt;pre&gt;

Hi Khalid,

This would explain a lot.


Certainly I can package a shared library, preferably in addition to the
static library. The Makefile would have to produce libcrm114.so which I 
would package as libcrm114.so.1.0.0 or similar (according to the actual 
version) and add a symbolic link libcrm114.so in the package. This means 
we would need proper versioning so that several versions could co-exist in 
the future. In an ideal world the upstream source should be named to show 
the version; libcrm114-1.0.0.tar.gz as an example. The major version 
number is to be counted up when a new version is no longer backward 
compatible to previous versions. The middle number counts up for every 
release introducing a backward compatible change in the API and the minor 
number is for bugfix releases.
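
The numbering rule above can be sketched in a few lines of Python. This is 
only an illustration under assumptions: the helper names bump and 
library_file_name and the three change labels are invented here, and 
resetting the lower numbers to zero on a bump is an assumption that the 
text above does not state either way.

```python
def bump(version, change):
    """Return the next version string for a given kind of change.

    Hypothetical helper; the change labels are illustrative and not part
    of any CRM114 tooling.
    """
    major, middle, minor = (int(part) for part in version.split("."))
    if change == "incompatible":   # no longer backward compatible
        return f"{major + 1}.0.0"  # assumption: lower numbers reset to 0
    if change == "compatible":     # backward compatible API change
        return f"{major}.{middle + 1}.0"
    if change == "bugfix":         # bugfix-only release
        return f"{major}.{middle}.{minor + 1}"
    raise ValueError(f"unknown change type: {change}")


def library_file_name(version):
    """File name the packaged shared library would carry."""
    return f"libcrm114.so.{version}"


print(library_file_name(bump("1.0.0", "bugfix")))
```

With names like these, several versions of the library file can sit side 
by side while the plain libcrm114.so symbolic link points at one of them.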

I really would like to see this done in the Makefile upstream rather than 
adding a patch to my spec file.

Bill: any chance that you consider to change your Makefile to produce an 
additional dynamic library?


Dynamic linking would be preferred as per openSUSE policy. We should get 
it right, or the packages will never find a way out of my home: repo into 
the main distribution.

Best regards,
Thomas

------------------------------------------------------------------------------
Ridiculously easy VDI. With Citrix VDI-in-a-Box, you don't need a complex
infrastructure or vast IT resources to deliver seamless, secure access to
virtual desktops. With this all-in-one solution, easily deploy virtual 
desktops for less than the cost of PCs and save 60% on VDI infrastructure 
costs. Try it free! http://p.sf.net/sfu/Citrix-VDIinabox
&lt;/pre&gt;</description>
    <dc:creator>Thomas Spahni</dc:creator>
    <dc:date>2012-01-11T16:33:50</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.mail.spam.crm114/9574">
    <title>CRM114 php extension</title>
    <link>http://permalink.gmane.org/gmane.mail.spam.crm114/9574</link>
    <description>&lt;pre&gt;

Hello Khalid

Something is broken, either with my package for libcrm114.a or something 
else. When I set up pecl-crm114 I get this during the configure step:

checking for grep that handles long lines and -e... /usr/bin/grep
checking for egrep... /usr/bin/grep -E
checking for a sed that does not truncate output... /usr/bin/sed
checking for cc... cc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether cc accepts -g... yes
checking for cc option to accept ISO C89... none needed
checking how to run the C preprocessor... cc -E
checking for icc... no
checking for suncc... no
checking whether cc understands -c and -o together... yes
checking for system library directory... lib
checking if compiler supports -R... no
checking if compiler supports -Wl,-rpath,... yes
checking build system type... i686-pc-linux-gnu
checking host system type... i686-pc-linux-gnu
checking target system type... i686-pc-linux-gnu
checking for PHP prefix... /usr
checking for PHP includes... -I/usr/include/php5 -I/usr/include/php5/main 
-I/usr/include/php5/TSRM -I/usr/include/php5/Zend -I/usr/include/php5/ext 
-I/usr/include/php5/ext/date/lib
checking for PHP extension directory... /usr/lib/php5/extensions
checking for PHP installed headers prefix... /usr/include/php5
checking if debug is enabled... no
checking if zts is enabled... no
checking for re2c... re2c
checking for re2c version... 0.13.5 (ok)
checking for gawk... gawk
checking for crm114 support... yes, shared
checking for libcrm114.a directory... yes, shared
checking whether to enable binary/text data block... yes
checking for tre/regex.h... found in /usr
checking for regfree in -ltre... yes
checking for libcrm114.a... found in /usr/lib
checking for crm114_learn_text in -lcrm114... no
configure: error: wrong libcrm114 version or lib not found

This is a snippet from config.log

configure:4468: checking for libcrm114.a
configure:4473: result: found in /usr/lib
configure:4599: checking for crm114_learn_text in -lcrm114
configure:4624: cc -o conftest -g -O2   -lcrm114 conftest.c -lcrm114   &amp;gt;&amp;amp;5
/usr/lib/gcc/i586-suse-linux/4.5/../../../libcrm114.a(crm114_bit_entropy.o): 
In function `stats_2_entropy':
/home/tsp/programs/libcrm/compile/libcrm114-20100726/crm114_bit_entropy.c:853: 
undefined reference to `logl'
/usr/lib/gcc/i586-suse-linux/4.5/../../../libcrm114.a(crm114_bit_entropy.o): 
In function `crm114__init_block_bit_entropy':
/home/tsp/programs/libcrm/compile/libcrm114-20100726/crm114_bit_entropy.c:1637:
undefined reference to `sqrt'
/usr/lib/gcc/i586-suse-linux/4.5/../../../libcrm114.a(crm114_bit_entropy.o): 
In function `crm114_classify_text_bit_entropy':
/home/tsp/programs/libcrm/compile/libcrm114-20100726/crm114_bit_entropy.c:2613: 
undefined reference to `pow'
/usr/lib/gcc/i586-suse-linux/4.5/../../../libcrm114.a(crm114_svm.o): In 
function `crm114_classify_features_svm':
/home/tsp/programs/libcrm/compile/libcrm114-20100726/crm114_svm.c:2593: 
undefined reference to `tanh'
/home/tsp/programs/libcrm/compile/libcrm114-20100726/crm114_svm.c:2594: 
undefined reference to `pow'

    ... &amp;lt;many more, all similar&amp;gt;

collect2: ld returned 1 exit status
configure:4624: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME ""
| #define PACKAGE_TARNAME ""
| #define PACKAGE_VERSION ""
| #define PACKAGE_STRING ""
| #define PACKAGE_BUGREPORT ""
| #define PACKAGE_URL ""
| #define HAVE_LIBTRE 1
| /* end confdefs.h.  */
|
| /* Override any GCC internal prototype to avoid an error.
|    Use char because int might match the return type of a GCC
|    builtin and then its argument prototype would still apply.  */
| #ifdef __cplusplus
| extern "C"
| #endif
| char crm114_learn_text ();
| int
| main ()
| {
| return crm114_learn_text ();
|   ;
|   return 0;
| }
configure:4633: result: no
configure:4745: error: wrong libcrm114 version or lib not found

When I remove /usr/lib/libcrm114.a and use the bundled code all works 
well. As a side note I can add that Bill's simple_demo.c compiles and 
links well with the packaged library. Any idea what's wrong here?
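
For reading a config.log excerpt like the one above, a small Python sketch 
can pull out the unresolved symbols. Assumptions: the names 
undefined_symbols and LIBM_SYMBOLS are invented for illustration, and the 
set lists only the four names visible in the excerpt. All four are 
standard C math functions, and a static archive does not record which 
other libraries it depends on, so one thing worth checking is whether the 
configure test needs -lm after -lcrm114.

```python
import re

# Hypothetical set for illustration: only the names seen in the excerpt.
LIBM_SYMBOLS = {"logl", "sqrt", "pow", "tanh"}

# GNU ld of this vintage prints the symbol between a backtick and an
# apostrophe.
UNDEFINED_RE = re.compile(r"undefined reference to `(\w+)'")


def undefined_symbols(log_text):
    """Return the sorted, de-duplicated symbols the linker could not resolve."""
    return sorted(set(UNDEFINED_RE.findall(log_text)))


sample = """\
crm114_bit_entropy.c:853: undefined reference to `logl'
crm114_bit_entropy.c:1637: undefined reference to `sqrt'
crm114_svm.c:2593: undefined reference to `tanh'
crm114_svm.c:2594: undefined reference to `pow'
"""

missing = undefined_symbols(sample)
print(missing)
if set(missing).issubset(LIBM_SYMBOLS):
    print("all unresolved symbols are math functions; try adding -lm")
```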

Regards,
Thomas

&lt;/pre&gt;</description>
    <dc:creator>Thomas Spahni</dc:creator>
    <dc:date>2012-01-08T18:36:37</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.mail.spam.crm114/9573">
    <title>Re: CRM114 php extension</title>
    <link>http://permalink.gmane.org/gmane.mail.spam.crm114/9573</link>
    <description>&lt;pre&gt;

Hi all

I packaged libcrm114 as crm114-devel-static for the openSUSE distribution. 
Package details are here:

https://build.opensuse.org/package/show?package=crm114-devel-static&amp;amp;project=home%3Avodoo

and the rpm's may be downloaded from the archive here:

http://download.opensuse.org/repositories/home:/vodoo/openSUSE_11.3/
http://download.opensuse.org/repositories/home:/vodoo/openSUSE_11.4/
http://download.opensuse.org/repositories/home:/vodoo/openSUSE_12.1/

I added a man page with contents borrowed from HOWTO.txt. The file layout 
of the package is using these locations:

/usr/lib/libcrm114.a
/usr/include/crm114_*.h
/usr/share/doc/packages/crm114-devel-static

Replace /usr/lib by /usr/lib64 for the x86_64 packages. Anyone may feel 
free to use my spec file as a starting point for other distributions.
Comments are welcome.

Khalid Ahsein: Can you please adapt your config.m4 to make 
pecl-crm114-0.9.1.tgz link with the installed libcrm114 library if 
available? Then I could package it as well.

Thank you all for the good work.

Thomas

&lt;/pre&gt;</description>
    <dc:creator>Thomas Spahni</dc:creator>
    <dc:date>2012-01-06T09:43:10</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.mail.spam.crm114/9572">
    <title>Re: rewrites.mfp and IPv6 addresses ... seems to not blow up</title>
    <link>http://permalink.gmane.org/gmane.mail.spam.crm114/9572</link>
    <description>&lt;pre&gt;wsy-MXLR4ItNKm8&amp;lt; at &amp;gt;public.gmane.org replied to my note:

&amp;lt;&amp;gt; &amp;gt; So, setting up crm114 for someone else led me to a seriously overdue
&amp;lt;&amp;gt; &amp;gt; cleanup of my own rewrites.mfp file.  I decided that it's probably
&amp;lt;&amp;gt; &amp;gt; just doing string replacements to normalize the data, so it should
&amp;lt;&amp;gt; &amp;gt; deal just fine with an IPv6 colon separated address.
&amp;lt;&amp;gt; &amp;gt;
&amp;lt;&amp;gt; &amp;gt; So far, it seems to do so just fine, so, Bill, I think you can call
&amp;lt;&amp;gt; &amp;gt; crm114 "IPv6 compliant"
&amp;lt;&amp;gt; 
&amp;lt;&amp;gt; So - all those colons didn't get the knickers in a twist?  :-)

Apparently not &amp;lt;grin&amp;gt;

&amp;lt;&amp;gt; I guess :*: and :+: don't appear much in IP6 addresses - or, if they
&amp;lt;&amp;gt; do, then whatever follows it until the next colon doesn't map
&amp;lt;&amp;gt; to a preexisting variable.

Nope, those two strings will never appear in an address.  A v6 address
is comprised of eight units of four hex digits separated by colons:

  2001:0470:1f07:61e:52e5:49ff:fe55:6731

Leading zeros in a unit can be dropped, and one run of all-zero units
can be collapsed to "::", ONCE, in an address:

  2001:470:1f07:61e::3         (global address in the same block as above;
                                same host, actually)
  fe80::52e5:49ff:fe55:6731    (link-local address - not routable)
  ::1                          (loopback address)

and that's about all the variation you can expect to see.  For full
gory details, the wikipedia entry for "IPv6 Address" is quite good.
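
The shortening rules can be checked with Python's standard ipaddress
module.  This is just a sketch using the addresses quoted above; the
variable names are illustrative.

```python
import ipaddress

examples = [
    "2001:0470:1f07:61e:52e5:49ff:fe55:6731",
    "2001:470:1f07:61e::3",
    "fe80::52e5:49ff:fe55:6731",
    "::1",
]

for text in examples:
    addr = ipaddress.IPv6Address(text)
    # exploded:   all eight units, four hex digits each
    # compressed: leading zeros dropped, one run of zero units as "::"
    print(addr.exploded, addr.compressed,
          addr.is_link_local, addr.is_loopback)
    # Neither CRM114 expansion marker shows up in a valid address.
    assert ":*:" not in addr.compressed
    assert ":+:" not in addr.compressed
```

Only hex digits and colons survive parsing, which is why the :*: and
:+: strings cannot turn up in a well-formed address.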

&amp;lt;&amp;gt; That's one bug in the syntax; I wrote CRM114's expansion syntax
&amp;lt;&amp;gt; fifteen years ago and had no idea about IPV6.  Then again, RFC
&amp;lt;&amp;gt; 2460 only came out in 1998 which was about the same time.

Yeah, well, I'm reasonably sure I babbled at you about v6 at some point
prior to 2000 seeing I was running early code in Waltham &amp;lt;laugh&amp;gt;

Reto
&lt;/pre&gt;</description>
    <dc:creator>R A Lichtensteiger</dc:creator>
    <dc:date>2012-01-05T20:25:18</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.mail.spam.crm114/9571">
    <title>Re: rewrites.mfp and IPv6 addresses ... seems to not blow up</title>
    <link>http://permalink.gmane.org/gmane.mail.spam.crm114/9571</link>
    <description>&lt;pre&gt;

So - all those colons didn't get the knickers in a twist?  :-)

I guess :*: and :+: don't appear much in IP6 addresses - or, if they
do, then whatever follows it until the next colon doesn't map
to a preexisting variable.

That's one bug in the syntax; I wrote CRM114's expansion syntax
fifteen years ago and had no idea about IPV6.  Then again, RFC
2460 only came out in 1998 which was about the same time.

That's good to know.  :)

   - Crash

&lt;/pre&gt;</description>
    <dc:creator>wsy-MXLR4ItNKm8&lt; at &gt;public.gmane.org</dc:creator>
    <dc:date>2012-01-05T19:57:43</dc:date>
  </item>
  <textinput rdf:about="http://search.gmane.org/?group=$group=gmane.mail.spam.crm114">
    <title>Search Engine</title>
    <description>Search the mailing list at Gmane</description>
    <name>query</name>
    <link>http://search.gmane.org/?group=$group=gmane.mail.spam.crm114</link>
  </textinput>
</rdf:RDF>

