<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/">
  <channel rdf:about="http://blog.gmane.org/gmane.comp.version-control.mercurial.devel">
    <title>gmane.comp.version-control.mercurial.devel</title>
    <link>http://blog.gmane.org/gmane.comp.version-control.mercurial.devel</link>
    <description/>
    <syn:updatePeriod>hourly</syn:updatePeriod>
    <syn:updateFrequency>1</syn:updateFrequency>
    <syn:updateBase>1901-01-01T00:00+00:00</syn:updateBase>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50537"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50536"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50535"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50534"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50533"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50532"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50531"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50530"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50529"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50528"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50527"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50526"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50525"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50524"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50523"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50522"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50521"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50520"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50519"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50518"/>
      </rdf:Seq>
    </items>
    <image rdf:resource="http://gmane.org/img/gmane-25t.png"/>
    <textinput rdf:resource=""/>
  </channel>
  <image rdf:about="http://gmane.org/img/gmane-25t.png">
    <title>Gmane</title>
    <url>http://gmane.org/img/gmane-25t.png</url>
    <link>http://gmane.org</link>
  </image>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50537">
    <title>Re: [PATCH v4] strip: incrementally update the branchheads cache after a strip</title>
    <link>http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50537</link>
    <description>&lt;pre&gt;
I should probably add a note about keyword args to CodingStyle,
something like:

"Avoid using keyword and default arguments wherever reasonable. Use
positional args for arguments that are almost always used."


But this is neither a keyword arg nor a default. I'm actually a little
surprised this works.


That comment's a bit excessive for a standard idiom.


recursively


\ isn't needed here


Here's some evidence that it might be better to do this whole thing in
rev-space and convert back to node-space when we're done.

Notably, ctxisnew is pretty trivial to determine from looking at revs:
if the new nodes are all greater than max(bheadrevs), then it's true.
And that hides this detail from the rest of the code.

(The rule here, of course, is that if x is a descendant of y, x.rev() &amp;gt;
y.rev(), which is the intrinsic rule of a topological ordering. Lots of
algorithms can be simplified by taking advantage of the fact that we
have a topological ordering handy. But you have to be careful: you can't
infer any other relationships!)
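
As a minimal sketch of the rev-space check described above (not the
patch's actual code; ctx_is_new and both argument names are hypothetical),
assuming the inputs are plain lists of integer revision numbers:

```python
def ctx_is_new(new_revs, bhead_revs):
    """Return True if every new rev is numbered above every cached head rev."""
    # Both arguments are hypothetical lists of integer revision numbers.
    if not bhead_revs:
        # No cached heads at all: nothing can precede the new revs.
        return True
    top = max(bhead_revs)
    # Descendants always have higher rev numbers than their ancestors,
    # so revs above the highest cached head cannot be ancestors of it.
    return all(rev > top for rev in new_revs)
```

Note that the check only says the new revs are not ancestors of the
cached heads; it says nothing about whether they descend from them.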


iternodes = bheads
if ctxisnew:
    iternodes = newnodes
iternodes = sorted(iternodes,...)


The reachable method is pretty unfortunate here as it's not a generator,
and goes from node-space to rev-space and back, and it only works on one
rev at a time. You could do this all in one pass with revlog.ancestors.
It doesn't take a stop point, but since it's a generator, you can stop
whenever you want.
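
To illustrate stopping a generator early, here is a self-contained
sketch. It does not use Mercurial's revlog API: the ancestors() function
and the parents/branch dicts below are hypothetical stand-ins, and this
version also yields the starting revs themselves.

```python
import heapq


def ancestors(parents, start_revs):
    """Yield start_revs and their ancestors, highest rev number first.

    parents is a hypothetical dict mapping a rev to a tuple of parent revs.
    """
    seen = set()
    # heapq is a min-heap, so store negated revs to pop the highest first.
    heap = [-rev for rev in start_revs]
    heapq.heapify(heap)
    while heap:
        rev = -heapq.heappop(heap)
        if rev in seen:
            continue
        seen.add(rev)
        yield rev
        for parent in parents.get(rev, ()):
            if parent not in seen:
                heapq.heappush(heap, -parent)


# Toy linear history: rev 0 is on branch "a", revs 1..3 are on branch "b".
parents = {0: (), 1: (0,), 2: (1,), 3: (2,)}
branch = {0: "a", 1: "b", 2: "b", 3: "b"}

found = None
for rev in ancestors(parents, [3]):
    if branch[rev] == "a":
        found = rev
        break  # the generator is simply abandoned; no stop point needed
```

Because the walk is lazy, breaking out of the loop means the remaining
history is never visited.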


This is mysterious to me. Let's suppose we have a history that looks
like:

a-b-b-b-b-b-b-b-a
              ^ ^  &amp;lt;- branch heads

..and we strip the rightmost a. How do we discover that there was an
earlier a without visiting each changeset? In particular, note that we
have no local information to distinguish from this case:

b-b-b-b-b-b-b-b-a
              ^ ^  &amp;lt;- branch heads

I really don't believe we can handle all cases efficiently (ie by just
looking at heads and the DAG).

So I think the strategy here should be to incrementally fix one case at
a time, while leaving the visit-all-history fallback for correctness.
So:

- introduce ALL the tests (which should work correctly but slowly today)
- convert to rev-based logic
- introduce any external-to-case
- fix case A
- fix case B
- fix case C
- ...

&lt;/pre&gt;</description>
    <dc:creator>Matt Mackall</dc:creator>
    <dc:date>2012-05-24T20:10:55</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50536">
    <title>Re: [PATCH v3] parsers: add a C function to pack the dirstate</title>
    <link>http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50536</link>
    <description>&lt;pre&gt;
Inserting a print here

diff --git a/mercurial/parsers.c b/mercurial/parsers.c
--- a/mercurial/parsers.c
+++ b/mercurial/parsers.c
@@ -219,6 +219,7 @@
 PyObject *o = PyTuple_GET_ITEM(tuple, off);
 if (!PyInt_Check(o)) {
 PyErr_SetString(PyExc_TypeError, "expected an int");
+PyObject_Print(o, stdout, 0);
 return -1;
 }
 *v = (uint32_t)PyInt_AS_LONG(o);

gives

  $ python run-tests.py --local test-1102.t

  --- c:\users\adi\hgrepos\hg-main\tests\test-1102.t
  +++ c:\users\adi\hgrepos\hg-main\tests\test-1102.t.err
  @@ -4,13 +4,98 @@
     $ echo a &amp;gt; a
     $ hg ci -Am0
     adding a
  +  2L** unknown exception encountered, please report by visiting
  +  ** http://mercurial.selenic.com/wiki/BugTracker
  +  ** Python 2.6.6 (r266:84297, Aug 24 2010, 18:13:38) [MSC v.1500 64 bit (AMD64)]
  +  ** Mercurial Distributed SCM (version 2.2.1+139-85316b3c6a3b+20120524)
  +  ** Extensions loaded:
  +  Traceback (most recent call last):
  +    File "c:/users/adi/hgrepos/hg-main/hg", line 38, in &amp;lt;module&amp;gt;
  +      mercurial.dispatch.run()

Note the "2L". So it looks like we have a Python long there...
&lt;/pre&gt;</description>
    <dc:creator>Adrian Buehlmann</dc:creator>
    <dc:date>2012-05-24T19:53:22</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50535">
    <title>Re: RFC: safe pattern matching for problematic encoding</title>
    <link>http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50535</link>
    <description>&lt;pre&gt;
At Thu, 24 May 2012 12:12:23 -0500,
Matt Mackall wrote:

Oops, sorry for my misunderstanding.

You can reproduce this problem with:

    unicode: \u30bd
    utf-8:   \xe3\x82\xbd
    cp932:   \x83\x5c

# This is the first character of the translation of 'software' :-)
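
A quick way to check these byte sequences, assuming a Python 3
interpreter with the standard cp932 codec (the variable names are
illustrative only):

```python
text = u"\u30bd"

# Encode the same character in both encodings.
utf8_bytes = text.encode("utf-8")
cp932_bytes = text.encode("cp932")

# The second cp932 byte is 0x5c, the ASCII backslash.
ends_with_backslash = cp932_bytes.endswith(b"\\")
```

The trailing 0x5c byte is why byte-oriented pattern code can mistake
half of this character for an escape character.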



----------------------------------------------------------------------
[FUJIWARA Katsunori]                             foozy&amp;lt; at &amp;gt;lares.dti.ne.jp
&lt;/pre&gt;</description>
    <dc:creator>FUJIWARA Katsunori</dc:creator>
    <dc:date>2012-05-24T17:56:22</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50534">
    <title>Re: RFC: safe pattern matching for problematic encoding</title>
    <link>http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50534</link>
    <description>&lt;pre&gt;
At Thu, 24 May 2012 11:47:38 -0500,
Matt Mackall wrote:

I think I understand the points about the "tool chain problem caused by
transcoding" and "Windows and Linux have no normalization-based
filesystems".

So, the basic concepts of my patches are:

  (1) unification of filename normalization is switchable, like the EOL
      extension:

      - a tracked configuration file (like ".hgeol") chooses the type of
        normalization (NFC/NFD/none) used for stored data

      - filenames are checked at commit/update/merge and so on, in the
        same way as case-folding collision detection, according to the
        configured normalization type

        this can prevent users on Linux/Windows from adding files that:

          - are normalized in a type other than the configured one, or
          - collide with files normalized in another type


  (2) filenames in manifest files, bundle files and (exported) diff
      files, and paths to filelogs, are normalized in the chosen type


  (3) filenames in the working directory are always in NFD on MacOS,
      and in NFC on Linux/Windows, if the feature is enabled

      conversion is done at the border between the above representation
      and in-memory objects (manifest, context and so on)

      HFS+ (and tools on MacOS, too) can treat both the NFC and NFD
      forms of a name as the same file, so users can write filenames in
      NFC in tool chain configuration files if they want to use the same
      files on MacOS, Linux and Windows


But this probably does not explain it well, so I'll post my patch series
soon as a base for discussion.
   
# now, writing patch descriptions :-)

----------------------------------------------------------------------
[FUJIWARA Katsunori]                             foozy&amp;lt; at &amp;gt;lares.dti.ne.jp
&lt;/pre&gt;</description>
    <dc:creator>FUJIWARA Katsunori</dc:creator>
    <dc:date>2012-05-24T17:44:54</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50533">
    <title>Re: RFC: safe pattern matching for problematic encoding</title>
    <link>http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50533</link>
    <description>&lt;pre&gt;
This is not an example yet. With bytes, please.

Beyond Ruby's 's' switch, there seems to be very little precedent for
how to deal with ShiftJIS (where there's even confusion about whether a
'\' character even exists).

When we implement UTF-8 mode, this will all be irrelevant: we'll only be
able to use UTF-8 encoded ignore files and we'll only accept UTF-8
command arguments.

&lt;/pre&gt;</description>
    <dc:creator>Matt Mackall</dc:creator>
    <dc:date>2012-05-24T17:12:23</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50532">
    <title>Re: RFC: safe pattern matching for problematic encoding</title>
    <link>http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50532</link>
    <description>&lt;pre&gt;
At Wed, 23 May 2012 13:56:43 -0500,
Matt Mackall wrote:

I have also posted my patch series for this problem to devel-ml as an
RFC, as a base for discussion.

----------------------------------------------------------------------
[FUJIWARA Katsunori]                             foozy&amp;lt; at &amp;gt;lares.dti.ne.jp
&lt;/pre&gt;</description>
    <dc:creator>FUJIWARA Katsunori</dc:creator>
    <dc:date>2012-05-24T17:06:13</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50531">
    <title>[PATCH 0 of 4 RFC] RFC: safe pattern matching for problematic encoding</title>
    <link>http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50531</link>
    <description>&lt;pre&gt;this patch series achieves safe pattern matching for problematic encoding.

this series is posted just as a base for discussion: it is not well
tested yet.
&lt;/pre&gt;</description>
    <dc:creator>FUJIWARA Katsunori</dc:creator>
    <dc:date>2012-05-24T17:04:24</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50530">
    <title>[PATCH 2 of 4 RFC] i18n: add hook point to convert MBCS strings before backslash sensitive process</title>
    <link>http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50530</link>
    <description>&lt;pre&gt;# HG changeset patch
# User FUJIWARA Katsunori &amp;lt;foozy&amp;lt; at &amp;gt;lares.dti.ne.jp&amp;gt;
# Date 1337873233 -32400
# Branch stable
# Node ID a85e6240d0ab23191d158390095dd48852dcdc39
# Parent  5b34156d5fec045557bef52ccd661d680090daf1
i18n: add hook point to convert MBCS strings before backslash sensitive process

added hook point is "filter()" in "mercurial/encoding.py".

when win32mbcs is enabled, "filter()" is replaced with a specific
implementation that does the following:

  - for byte sequences:
    1. convert to unicode
    2. apply filter function on unicode object
    3. convert from unicode to byte sequence in local encoding
       (replacing each MBCS part of the string with its own bytes in
       '\xXX' form, for regexp safety)
    4. return above byte sequence

  - for unicode objects:
    1. apply filter function on unicode object
    2. convert from unicode to byte sequence in local encoding
       (replacing each MBCS part of the string with its own bytes in
       '\xXX' form, for regexp safety)
    3. convert byte sequence to unicode
    4. return above unicode

diff -r 5b34156d5fec -r a85e6240d0ab hgext/win32mbcs.py
--- a/hgext/win32mbcs.py	Fri May 25 00:20:45 2012 +0900
+++ b/hgext/win32mbcs.py	Fri May 25 00:27:13 2012 +0900
@@ -148,6 +148,22 @@
         raise util.Abort(_("[win32mbcs] conversion in escaping failed with"
                          " %s encoding\n") % (_encoding))
 
+def safefilter(s, filter, escape=True):
+    try:
+        if isinstance(s, unicode):
+            if escape:
+                return decode(escapeencode(filter(s), None))
+            else:
+                return filter(s)
+        else:
+            if escape:
+                return escapeencode(filter(decode(s)), None)
+            else:
+                return encode(filter(decode(s)))
+    except UnicodeError:
+        raise util.Abort(_("[win32mbcs] conversion in filtering failed with"
+                         " %s encoding\n") % (_encoding))
+
 def replacename(name, replacement):
     module, name = name.rsplit('.', 1)
     module = sys.modules[module]
@@ -194,6 +210,7 @@
                 wrapname(f, wrapper)
         wrapname("mercurial.osutil.listdir", wrapperforlistdir)
         replacename("mercurial.encoding.escape", safeescape)
+        replacename("mercurial.encoding.filter", safefilter)
         # Check sys.args manually instead of using ui.debug() because
         # command line options is not yet applied when
         # extensions.loadall() is called.
diff -r 5b34156d5fec -r a85e6240d0ab mercurial/encoding.py
--- a/mercurial/encoding.py	Fri May 25 00:20:45 2012 +0900
+++ b/mercurial/encoding.py	Fri May 25 00:27:13 2012 +0900
@@ -214,6 +214,15 @@
     else:
         return s
 
+def filter(s, filter, escape=True):
+    """Hook point to apply FILTER func on string S safely,
+    even if current encoding is problematic one.
+
+    MBCS parts of result should be escaped as regexp safe, if ESCAPE is True
+    in problematic encoding.
+    """
+    return filter(s)
+
 def toutf8b(s):
     '''convert a local, possibly-binary string into UTF-8b
 
diff -r 5b34156d5fec -r a85e6240d0ab mercurial/match.py
--- a/mercurial/match.py	Fri May 25 00:20:45 2012 +0900
+++ b/mercurial/match.py	Fri May 25 00:27:13 2012 +0900
@@ -249,14 +249,14 @@
     elif kind == 'path':
         return '^' + encoding.escape(name, re.escape) + '(?:/|$)'
     elif kind == 'relglob':
-        return '(?:|.*/)' + _globre(name) + tail
+        return '(?:|.*/)' + encoding.filter(name, _globre) + tail
     elif kind == 'relpath':
         return encoding.escape(name, re.escape) + '(?:/|$)'
     elif kind == 'relre':
         if name.startswith('^'):
             return encoding.escape(name)
         return '.*' + encoding.escape(name)
-    return _globre(name) + tail
+    return encoding.filter(name, _globre) + tail
 
 def _buildmatch(ctx, pats, tail):
     fset, pats = _expandsets(pats, ctx)
&lt;/pre&gt;</description>
    <dc:creator>FUJIWARA Katsunori</dc:creator>
    <dc:date>2012-05-24T17:04:26</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50529">
    <title>[PATCH 3 of 4 RFC] i18n: add hook point to make tokenizing process as encoding safe</title>
    <link>http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50529</link>
    <description>&lt;pre&gt;# HG changeset patch
# User FUJIWARA Katsunori &amp;lt;foozy&amp;lt; at &amp;gt;lares.dti.ne.jp&amp;gt;
# Date 1337873761 -32400
# Branch stable
# Node ID 1d5a60c7f44f106af3c0d56139a1b6d0fd3d0b3c
# Parent  a85e6240d0ab23191d158390095dd48852dcdc39
i18n: add hook point to make tokenizing process as encoding safe

added hook point is "tokenize()" in "mercurial/encoding.py".

when win32mbcs is enabled, "tokenize()" is replaced with specific
implementation to do below:

    1. convert from specified string to unicode object
    2. invoke the tokenizer with unicode to get generator from it
    3. "pos" of returned value is one in unicode, so recalculate one
       for in byte sequence
    4. convert "token" and "value" from unicode to byte sequence, and
       return them

step (3) of above is required, because the last "pos" value should be
equal to the length of the specified byte sequence, and otherwise it
causes exception raising.
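
The position recalculation in step (3) can be sketched as follows. This
is a standalone Python 3 illustration, not the patch's code: byte_pos is
a hypothetical helper, and cp932 is assumed as the local encoding.

```python
def byte_pos(text, pos, encoding="cp932"):
    """Map a character offset in text to a byte offset in its encoded form."""
    # Encode only the prefix before pos and count the resulting bytes.
    return len(text[:pos].encode(encoding))


# One two-byte character followed by three single-byte characters.
sample = u"\u30bdabc"
encoded = sample.encode("cp932")

after_first_char = byte_pos(sample, 1)
at_end = byte_pos(sample, len(sample))
```

The offset at the end of the string equals the length of the encoded
byte sequence, which is the property the parser relies on.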

this affects invocations of the following:

    - mercurial.fileset.tokenize()
    - mercurial.revset.tokenize()
    - mercurial.templater.tokenize()

diff -r a85e6240d0ab -r 1d5a60c7f44f hgext/win32mbcs.py
--- a/hgext/win32mbcs.py	Fri May 25 00:27:13 2012 +0900
+++ b/hgext/win32mbcs.py	Fri May 25 00:36:01 2012 +0900
@@ -164,6 +164,23 @@
         raise util.Abort(_("[win32mbcs] conversion in filtering failed with"
                          " %s encoding\n") % (_encoding))
 
+def _tokenize(tokenizer, s):
+    try:
+        us = decode(s)
+        for token, value, pos in tokenizer(us):
+            # re-calculate position in MBCS string
+            pos = len(encode(us[:pos]))
+            yield encode((token, value, pos))
+    except UnicodeError:
+        raise util.Abort(_("[win32mbcs] conversion in tokenizing failed with"
+                         " %s encoding\n") % (_encoding))
+
+def safetokenize(tokenizer, s):
+    if isinstance(s, unicode):
+        return tokenizer(s)
+    else:
+        return _tokenize(tokenizer, s)
+
 def replacename(name, replacement):
     module, name = name.rsplit('.', 1)
     module = sys.modules[module]
@@ -211,6 +228,7 @@
         wrapname("mercurial.osutil.listdir", wrapperforlistdir)
         replacename("mercurial.encoding.escape", safeescape)
         replacename("mercurial.encoding.filter", safefilter)
+        replacename("mercurial.encoding.tokenize", safetokenize)
         # Check sys.args manually instead of using ui.debug() because
         # command line options is not yet applied when
         # extensions.loadall() is called.
diff -r a85e6240d0ab -r 1d5a60c7f44f mercurial/encoding.py
--- a/mercurial/encoding.py	Fri May 25 00:27:13 2012 +0900
+++ b/mercurial/encoding.py	Fri May 25 00:36:01 2012 +0900
@@ -223,6 +223,12 @@
     """
     return filter(s)
 
+def tokenize(tokenizer, s):
+    """Hook point to ensure that the tokenizer parses specified string safely
+    in current encoding.
+    """
+    return tokenizer(s)
+
 def toutf8b(s):
     '''convert a local, possibly-binary string into UTF-8b
 
diff -r a85e6240d0ab -r 1d5a60c7f44f mercurial/parser.py
--- a/mercurial/parser.py	Fri May 25 00:27:13 2012 +0900
+++ b/mercurial/parser.py	Fri May 25 00:36:01 2012 +0900
@@ -15,7 +15,7 @@
 # an action is a tree node name, a tree label, and an optional match
 # __call__(program) parses program into a labelled tree
 
-import error
+import error, encoding
 from i18n import _
 
 class parser(object):
@@ -77,7 +77,7 @@
         return expr
     def parse(self, message):
         'generate a parse tree from a message'
-        self._iter = self._tokenizer(message)
+        self._iter = encoding.tokenize(self._tokenizer, message)
         self._advance()
         res = self._parse()
         token, value, pos = self.current
&lt;/pre&gt;</description>
    <dc:creator>FUJIWARA Katsunori</dc:creator>
    <dc:date>2012-05-24T17:04:27</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50528">
    <title>[PATCH 1 of 4 RFC] i18n: add hook point to make MBCS parts of specified string as regexp safe</title>
    <link>http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50528</link>
    <description>&lt;pre&gt;# HG changeset patch
# User FUJIWARA Katsunori &amp;lt;foozy&amp;lt; at &amp;gt;lares.dti.ne.jp&amp;gt;
# Date 1337872845 -32400
# Branch stable
# Node ID 5b34156d5fec045557bef52ccd661d680090daf1
# Parent  0a730d3c5aaefae00239b8472703c884192f31b7
i18n: add hook point to make MBCS parts of specified string as regexp safe

added hook point is "escape()" in "mercurial/encoding.py".

when win32mbcs is enabled, "escape()" is replaced with specific
implementation, which substitutes MBCS parts in specified string
by oneself in '\xXX' form for regexp safeness.
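
A rough standalone sketch of this substitution, written for Python 3
rather than taken from the patch: escape_mbcs is a hypothetical name,
cp932 is assumed as the local encoding, and re.escape is always applied
to single-byte characters here (the patch makes that step optional).

```python
import re


def escape_mbcs(text, encoding="cp932"):
    """Return a regexp-safe pattern string for text."""
    parts = []
    for ch in text:
        raw = ch.encode(encoding)
        if len(raw) == 1:
            # Single-byte character: ordinary regexp escaping is enough.
            parts.append(re.escape(ch))
        else:
            # Multi-byte character: spell out every byte as \xXX so that
            # a trailing 0x5c byte is never read as a regexp backslash.
            parts.append("".join("\\x%02x" % b for b in raw))
    return "".join(parts)


name = u"\u30bd.txt"
pattern = escape_mbcs(name)
matched = re.match(pattern.encode("ascii"), name.encode("cp932"))
```

The resulting pattern is pure ASCII, so it can be compiled as a byte
pattern and matched against filenames in the local encoding.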

if "escape()" is invoked with unicode object, at first it is encoded
by local encoding, MBCS parts of it are changed as regexp safe as
above, and then decoded into unicode object again.

sometimes, it is needed to apply another escaping on specified string,
so "escape()" can take the function to do so: e.g. applying
re.escape() in "match.py" and applying encode('string-escape') in
"subrepo.py".

there is no need to "wrap" for "escape()", because default behavior of
it is NOP, so "escape()" is just replaced to simplify implementation.

almost all invocations of "escape()" happen once per pattern/regexp
parse, so they add little overhead.

"_globre()" in "match.py" invokes "escape()" almost once per
character: original "_globre()" invokes "re.escape()" in same ratio,
so increase of overhead may not be so serious, when win32mbcs is
disabled.

diff -r 0a730d3c5aae -r 5b34156d5fec hgext/win32mbcs.py
--- a/hgext/win32mbcs.py	Wed May 16 17:02:30 2012 +0900
+++ b/hgext/win32mbcs.py	Fri May 25 00:20:45 2012 +0900
@@ -123,6 +123,41 @@
         pass
     setattr(module, name, f)
 
+
+def escapeencode(us, escfunc):
+    bs = ''
+    for u in us:
+        e = u.encode(_encoding)
+        if len(e) == 1:
+            if escfunc:
+                bs += escfunc(e)
+            else:
+                bs += e
+        else:
+            # escape whole MBCS bytes to avoid ambiguousness in regexp
+            bs += ''.join([r'\x%02x' % ord(b) for b in e])
+    return bs
+
+def safeescape(s, escfunc=None):
+    try:
+        if isinstance(s, unicode):
+            return decode(escapeencode(s, escfunc))
+        else:
+            return escapeencode(decode(s), escfunc)
+    except UnicodeError:
+        raise util.Abort(_("[win32mbcs] conversion in escaping failed with"
+                         " %s encoding\n") % (_encoding))
+
+def replacename(name, replacement):
+    module, name = name.rsplit('.', 1)
+    module = sys.modules[module]
+    func = getattr(module, name)
+    try:
+        replacement.__name__ = func.__name__ # fail with python23
+    except Exception:
+        pass
+    setattr(module, name, replacement)
+
 # List of functions to be wrapped.
 # NOTE: os.path.dirname() and os.path.basename() are safe because
 #       they use result of os.path.split()
@@ -158,6 +193,7 @@
             for f in winfuncs.split():
                 wrapname(f, wrapper)
         wrapname("mercurial.osutil.listdir", wrapperforlistdir)
+        replacename("mercurial.encoding.escape", safeescape)
         # Check sys.args manually instead of using ui.debug() because
         # command line options is not yet applied when
         # extensions.loadall() is called.
diff -r 0a730d3c5aae -r 5b34156d5fec mercurial/commands.py
--- a/mercurial/commands.py	Wed May 16 17:02:30 2012 +0900
+++ b/mercurial/commands.py	Fri May 25 00:20:45 2012 +0900
@@ -2773,7 +2773,7 @@
     if opts.get('ignore_case'):
         reflags |= re.I
     try:
-        regexp = re.compile(pattern, reflags)
+        regexp = re.compile(encoding.escape(pattern), reflags)
     except re.error, inst:
         ui.warn(_("grep: invalid match pattern: %s\n") % inst)
         return 1
diff -r 0a730d3c5aae -r 5b34156d5fec mercurial/encoding.py
--- a/mercurial/encoding.py	Wed May 16 17:02:30 2012 +0900
+++ b/mercurial/encoding.py	Fri May 25 00:20:45 2012 +0900
@@ -203,6 +203,17 @@
     except LookupError, k:
         raise error.Abort(k, hint="please check your locale settings")
 
+def escape(s, escfunc=None):
+    """Hook point to escape specified string as regexp safely,
+    if current encoding is problematic one.
+
+    escfunc is used to escape non ambiguous parts of specified string.
+    """
+    if escfunc:
+        return escfunc(s)
+    else:
+        return s
+
 def toutf8b(s):
     '''convert a local, possibly-binary string into UTF-8b
 
diff -r 0a730d3c5aae -r 5b34156d5fec mercurial/fileset.py
--- a/mercurial/fileset.py	Wed May 16 17:02:30 2012 +0900
+++ b/mercurial/fileset.py	Fri May 25 00:20:45 2012 +0900
@@ -5,7 +5,7 @@
 # This software may be used and distributed according to the terms of the
 # GNU General Public License version 2 or any later version.
 
-import parser, error, util, merge, re
+import parser, error, util, merge, re, encoding
 from i18n import _
 
 elements = {
@@ -252,7 +252,7 @@
     File contains the given regular expression.
     """
     pat = getstring(x, _("grep requires a pattern"))
-    r = re.compile(pat)
+    r = re.compile(encoding.escape(pat))
     return [f for f in mctx.existing() if r.search(mctx.ctx[f].data())]
 
 _units = dict(k=2**10, K=2**10, kB=2**10, KB=2**10,
diff -r 0a730d3c5aae -r 5b34156d5fec mercurial/match.py
--- a/mercurial/match.py	Wed May 16 17:02:30 2012 +0900
+++ b/mercurial/match.py	Fri May 25 00:20:45 2012 +0900
@@ -6,7 +6,7 @@
 # GNU General Public License version 2 or any later version.
 
 import re
-import scmutil, util, fileset
+import scmutil, util, fileset, encoding
 from i18n import _
 
 def _expandsets(pats, ctx):
@@ -188,14 +188,15 @@
     i, n = 0, len(pat)
     res = ''
     group = 0
-    escape = re.escape
+    escfunc = re.escape
+    escape = encoding.escape
     def peek():
         return i &amp;lt; n and pat[i]
     while i &amp;lt; n:
         c = pat[i]
         i += 1
         if c not in '*?[{},\\':
-            res += escape(c)
+            res += escape(c, escfunc)
         elif c == '*':
             if peek() == '*':
                 i += 1
@@ -232,11 +233,11 @@
             p = peek()
             if p:
                 i += 1
-                res += escape(p)
+                res += escape(p, escfunc)
             else:
-                res += escape(c)
+                res += escape(c, escfunc)
         else:
-            res += escape(c)
+            res += escape(c, escfunc)
     return res
 
 def _regex(kind, name, tail):
@@ -244,17 +245,17 @@
     if not name:
         return ''
     if kind == 're':
-        return name
+        return encoding.escape(name)
     elif kind == 'path':
-        return '^' + re.escape(name) + '(?:/|$)'
+        return '^' + encoding.escape(name, re.escape) + '(?:/|$)'
     elif kind == 'relglob':
         return '(?:|.*/)' + _globre(name) + tail
     elif kind == 'relpath':
-        return re.escape(name) + '(?:/|$)'
+        return encoding.escape(name, re.escape) + '(?:/|$)'
     elif kind == 'relre':
         if name.startswith('^'):
-            return name
-        return '.*' + name
+            return encoding.escape(name)
+        return '.*' + encoding.escape(name)
     return _globre(name) + tail
 
 def _buildmatch(ctx, pats, tail):
diff -r 0a730d3c5aae -r 5b34156d5fec mercurial/revset.py
--- a/mercurial/revset.py	Wed May 16 17:02:30 2012 +0900
+++ b/mercurial/revset.py	Fri May 25 00:20:45 2012 +0900
@@ -541,7 +541,8 @@
     """
     try:
         # i18n: "grep" is a keyword
-        gr = re.compile(getstring(x, _("grep requires a string")))
+        msg = _("grep requires a string")
+        gr = re.compile(encoding.escape(getstring(x, msg)))
     except re.error, e:
         raise error.ParseError(_('invalid match pattern: %s') % e)
     l = []
diff -r 0a730d3c5aae -r 5b34156d5fec mercurial/subrepo.py
--- a/mercurial/subrepo.py	Wed May 16 17:02:30 2012 +0900
+++ b/mercurial/subrepo.py	Fri May 25 00:20:45 2012 +0900
@@ -8,7 +8,7 @@
 import errno, os, re, xml.dom.minidom, shutil, posixpath
 import stat, subprocess, tarfile
 from i18n import _
-import config, scmutil, util, node, error, cmdutil, bookmarks
+import config, scmutil, util, node, error, cmdutil, bookmarks, encoding
 hg = None
 propertycache = util.propertycache
 
@@ -62,13 +62,13 @@
         for pattern, repl in p.items('subpaths'):
             # Turn r'C:\foo\bar' into r'C:\\foo\\bar' since re.sub
             # does a string decode.
-            repl = repl.encode('string-escape')
+            repl = encoding.escape(repl, lambda x: x.encode('string-escape'))
             # However, we still want to allow back references to go
             # through unharmed, so we turn r'\\1' into r'\1'. Again,
             # extra escapes are needed because re.sub string decodes.
             repl = re.sub(r'\\\\([0-9]+)', r'\\\1', repl)
             try:
-                src = re.sub(pattern, repl, src, 1)
+                src = re.sub(encoding.escape(pattern), repl, src, 1)
             except re.error, e:
                 raise util.Abort(_("bad subrepository pattern in %s: %s")
                                  % (p.source('subpaths', pattern), e))
&lt;/pre&gt;</description>
    <dc:creator>FUJIWARA Katsunori</dc:creator>
    <dc:date>2012-05-24T17:04:25</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50527">
    <title>[PATCH 4 of 4 RFC] i18n: add hook point to execute decode('string_escape') safely</title>
    <link>http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50527</link>
    <description>&lt;pre&gt;# HG changeset patch
# User FUJIWARA Katsunori &amp;lt;foozy&amp;lt; at &amp;gt;lares.dti.ne.jp&amp;gt;
# Date 1337875094 -32400
# Branch stable
# Node ID 78d4c3b4a98139529376d4bfcd5ceb15745fbb1f
# Parent  1d5a60c7f44f106af3c0d56139a1b6d0fd3d0b3c
i18n: add hook point to execute decode('string_escape') safely

when win32mbcs is enabled, some other functions are invoked with
unicode objects instead of byte sequences.

but "decode('string_escape')" can not be invoked on unicode objects:
doing so raises an exception.

this patch adds a hook point to execute it safely.

the added hook point is "decodestringescape()" in "mercurial/encoding.py".

"decode('string_escape')" is applied for:

  - r'....' form program in revsets/filesets
  - decoding '\xxx' form in styles/templates

when win32mbcs is enabled, "decodestringescape()" is replaced with a
specific implementation that does the following:

  - for byte sequences:
    1. convert to unicode
    2. convert from unicode to a byte sequence in the local encoding
       (substituting each MBCS part of the string with its own
       '\xXX' form, for regexp safety)
    3. apply 'string_escape' to the byte sequence
       (this restores the original MBCS parts substituted by the
       '\xXX' form in (2))
    4. return the resulting byte sequence

  - for unicode objects:
    1. convert from unicode to a byte sequence in the local encoding
       (substituting each MBCS part of the string with its own
       '\xXX' form, for regexp safety)
    2. apply 'string_escape' to the byte sequence
       (this restores the original MBCS parts substituted by the
       '\xXX' form in (1))
    3. convert the byte sequence to unicode
    4. return the resulting unicode object
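
The byte sequence steps can be sketched roughly as follows. This is only
an illustration, not the win32mbcs code: the helper names are
hypothetical, cp932 is assumed as the local encoding, and CPython's
undocumented codecs.escape_decode() stands in for the Python 2
'string_escape' codec.

```python
import codecs

ENCODING = "cp932"  # assumed local encoding (hypothetical choice)


def protect_mbcs(text):
    """Encode text, spelling each multi-byte character as literal \\xXX escapes."""
    out = []
    for ch in text:
        raw = ch.encode(ENCODING)
        if len(raw) == 1:
            out.append(raw)  # single-byte characters (including backslash) stay as-is
        else:
            out.append(b"".join(b"\\x%02x" % b for b in bytearray(raw)))
    return b"".join(out)


def safe_decode_string_escape(raw):
    """Hypothetical stand-in for the hook, following steps 1-4 for byte sequences."""
    text = raw.decode(ENCODING)                    # 1. convert to unicode
    protected = protect_mbcs(text)                 # 2. bytes again, MBCS parts as \xXX
    restored, _ = codecs.escape_decode(protected)  # 3. undo escapes, restoring MBCS bytes
    return restored                                # 4. return the byte sequence


# U+8868 encodes to 0x95 0x5C in cp932, so its trail byte looks like a backslash.
source = u"\u8868\\n".encode(ENCODING)
naive, _ = codecs.escape_decode(source)
print(repr(naive), repr(safe_decode_string_escape(source)))
```

In this sketch the naive decoding consumes the trail byte as the start
of an escape and loses the newline, while the protected variant keeps
the character intact and still decodes the escape that follows it.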

diff -r 1d5a60c7f44f -r 78d4c3b4a981 hgext/win32mbcs.py
--- a/hgext/win32mbcs.py	Fri May 25 00:36:01 2012 +0900
+++ b/hgext/win32mbcs.py	Fri May 25 00:58:14 2012 +0900
@@ -181,6 +181,16 @@
     else:
         return _tokenize(tokenizer, s)
 
+def safedecodestringescape(s):
+    try:
+        if isinstance(s, unicode):
+            return decode(escapeencode(s, None).decode('string_escape'))
+        else:
+            return escapeencode(decode(s), None).decode('string_escape')
+    except UnicodeError:
+        raise util.Abort(_("[win32mbcs] conversion in string-escape decoding"
+                           " failed with %s encoding\n") % (_encoding))
+
 def replacename(name, replacement):
     module, name = name.rsplit('.', 1)
     module = sys.modules[module]
@@ -229,6 +239,8 @@
         replacename("mercurial.encoding.escape", safeescape)
         replacename("mercurial.encoding.filter", safefilter)
         replacename("mercurial.encoding.tokenize", safetokenize)
+        replacename("mercurial.encoding.decodestringescape",
+                    safedecodestringescape)
         # Check sys.args manually instead of using ui.debug() because
         # command line options is not yet applied when
         # extensions.loadall() is called.
diff -r 1d5a60c7f44f -r 78d4c3b4a981 mercurial/encoding.py
--- a/mercurial/encoding.py	Fri May 25 00:36:01 2012 +0900
+++ b/mercurial/encoding.py	Fri May 25 00:58:14 2012 +0900
@@ -229,6 +229,12 @@
     """
     return tokenizer(s)
 
+def decodestringescape(s):
+    """Hook point to decode specified string by 'string-escape' safely
+    in current encoding.
+    """
+    return s.decode('string-escape')
+
 def toutf8b(s):
     '''convert a local, possibly-binary string into UTF-8b
 
diff -r 1d5a60c7f44f -r 78d4c3b4a981 mercurial/fileset.py
--- a/mercurial/fileset.py	Fri May 25 00:36:01 2012 +0900
+++ b/mercurial/fileset.py	Fri May 25 00:58:14 2012 +0900
@@ -44,7 +44,7 @@
                 c = program[pos]
                 decode = lambda x: x
             else:
-                decode = lambda x: x.decode('string-escape')
+                decode = encoding.decodestringescape
             pos += 1
             s = pos
             while pos &amp;lt; l: # find closing quote
diff -r 1d5a60c7f44f -r 78d4c3b4a981 mercurial/revset.py
--- a/mercurial/revset.py	Fri May 25 00:36:01 2012 +0900
+++ b/mercurial/revset.py	Fri May 25 00:58:14 2012 +0900
@@ -94,7 +94,7 @@
                 c = program[pos]
                 decode = lambda x: x
             else:
-                decode = lambda x: x.decode('string-escape')
+                decode = encoding.decodestringescape
             pos += 1
             s = pos
             while pos &amp;lt; l: # find closing quote
diff -r 1d5a60c7f44f -r 78d4c3b4a981 mercurial/templater.py
--- a/mercurial/templater.py	Fri May 25 00:36:01 2012 +0900
+++ b/mercurial/templater.py	Fri May 25 00:58:14 2012 +0900
@@ -7,7 +7,7 @@
 
 from i18n import _
 import sys, os
-import util, config, templatefilters, parser, error
+import util, config, templatefilters, parser, error, encoding
 
 # template parsing
 
@@ -38,7 +38,7 @@
                 c = program[pos]
                 decode = lambda x: x
             else:
-                decode = lambda x: x.decode('string-escape')
+                decode = encoding.decodestringescape
             pos += 1
             s = pos
             while pos &amp;lt; end: # find closing quote
@@ -223,9 +223,9 @@
     if quoted:
         if len(s) &amp;lt; 2 or s[0] != s[-1]:
             raise SyntaxError(_('unmatched quotes'))
-        return s[1:-1].decode('string_escape')
+        return encoding.decodestringescape(s[1:-1])
 
-    return s.decode('string_escape')
+    return encoding.decodestringescape(s)
 
 class engine(object):
     '''template expansion engine.
&lt;/pre&gt;</description>
    <dc:creator>FUJIWARA Katsunori</dc:creator>
    <dc:date>2012-05-24T17:04:28</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50526">
    <title>Re: RFC: safe pattern matching for problematic encoding</title>
    <link>http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50526</link>
    <description>&lt;pre&gt;
That's very annoying, of course, and the various Unicode committees and
Apple engineers involved should be flogged in the streets, but the odds
of convincing me we should transcode filenames on the fly from one
encoding to another (even if it's just NFD to NFC) are just about nil.
See:

http://mercurial.selenic.com/wiki/EncodingStrategy#The_.22makefile.22_problem

Because neither Windows nor Linux does normalization-based filename lookup,
normalization on our end WILL break tools in those ecosystems.

Nor can we do normalization on check-in on the Mac side.

&lt;/pre&gt;</description>
    <dc:creator>Matt Mackall</dc:creator>
    <dc:date>2012-05-24T16:47:38</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50525">
    <title>Re: [PATCH v3] parsers: add a C function to pack the dirstate</title>
    <link>http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50525</link>
    <description>&lt;pre&gt;
I'm seeing this in MSYS with 64-bit Python on Windows 7. Haven't analyzed it yet:

$ python run-tests.py --local

--- c:\users\adi\hgrepos\hg-main\tests\test-1102.t
+++ c:\users\adi\hgrepos\hg-main\tests\test-1102.t.err
@@ -4,13 +4,98 @@
   $ echo a &amp;gt; a
   $ hg ci -Am0
   adding a
+  ** unknown exception encountered, please report by visiting
+  ** http://mercurial.selenic.com/wiki/BugTracker
+  ** Python 2.6.6 (r266:84297, Aug 24 2010, 18:13:38) [MSC v.1500 64 bit (AMD64)]
+  ** Mercurial Distributed SCM (version 2.2.1+139-85316b3c6a3b)
+  ** Extensions loaded:
+  Traceback (most recent call last):
+    File "c:/users/adi/hgrepos/hg-main/hg", line 38, in &amp;lt;module&amp;gt;
+      mercurial.dispatch.run()
+    File "c:\users\adi\hgrepos\hg-main\mercurial\dispatch.py", line 28, in run
+      sys.exit((dispatch(request(sys.argv[1:])) or 0) &amp;amp; 255)
+    File "c:\users\adi\hgrepos\hg-main\mercurial\dispatch.py", line 65, in dispatch
+      return _runcatch(req)
+    File "c:\users\adi\hgrepos\hg-main\mercurial\dispatch.py", line 88, in _runcatch
+      return _dispatch(req)
+    File "c:\users\adi\hgrepos\hg-main\mercurial\dispatch.py", line 737, in _dispatch
+      cmdpats, cmdoptions)
+    File "c:\users\adi\hgrepos\hg-main\mercurial\dispatch.py", line 511, in runcommand
+      ret = _runcommand(ui, options, cmd, d)
+    File "c:\users\adi\hgrepos\hg-main\mercurial\dispatch.py", line 827, in _runcommand
+      return checkargs()
+    File "c:\users\adi\hgrepos\hg-main\mercurial\dispatch.py", line 798, in checkargs
+      return cmdfunc()
+    File "c:\users\adi\hgrepos\hg-main\mercurial\dispatch.py", line 734, in &amp;lt;lambda&amp;gt;
+      d = lambda: util.checksignature(func)(ui, *args, **cmdoptions)
+    File "c:\users\adi\hgrepos\hg-main\mercurial\util.py", line 463, in check
+      return func(*args, **kwargs)
+    File "c:\users\adi\hgrepos\hg-main\mercurial\commands.py", line 1313, in commit
+      node = cmdutil.commit(ui, repo, commitfunc, pats, opts)
+    File "c:\users\adi\hgrepos\hg-main\mercurial\cmdutil.py", line 1294, in commit
+      scmutil.match(repo[None], pats, opts), opts)
+    File "c:\users\adi\hgrepos\hg-main\mercurial\commands.py", line 1311, in commitfunc
+      match, editor=e, extra=extra)
+    File "c:\users\adi\hgrepos\hg-main\mercurial\localrepo.py", line 1216, in commit
+      wlock.release()
+    File "c:\users\adi\hgrepos\hg-main\mercurial\lock.py", line 132, in release
+      self.releasefn()
+    File "c:\users\adi\hgrepos\hg-main\mercurial\localrepo.py", line 963, in unlock
+      self.dirstate.write()
+    File "c:\users\adi\hgrepos\hg-main\mercurial\dirstate.py", line 512, in write
+      finish(parsers.pack_dirstate(self._map, copymap, self._pl, now))
+  TypeError: expected an int
+  [1]


&lt;/pre&gt;</description>
    <dc:creator>Adrian Buehlmann</dc:creator>
    <dc:date>2012-05-24T16:44:58</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50524">
    <title>Re: RFC: safe pattern matching for problematic encoding</title>
    <link>http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50524</link>
    <description>&lt;pre&gt;
At Wed, 23 May 2012 13:56:43 -0500,
Matt Mackall wrote:

We need such safety in the situations below:

  - for file/directory patterns of "hg status", "hg log" and so on:
    (path, globbing or regex)

      in this case, backslashes in patterns are skipped, because
      "_globre()" in "match.py" recognizes them as an escape
      character for the next character.

      this causes unexpected matching results: "_globre()" doesn't
      raise an exception, even though the specified pattern ends with
      the backslash byte of an MBCS character.


  - for regexp patterns of "hg grep":

      in this case, backslashes in patterns are skipped, because
      "re.compile()" recognizes them as an escape character for the
      next character.

      this causes an unexpected matching result (when the MBCS
      character is in the middle of the pattern), or a parse error
      (when it is at the tail of the pattern)


  - for arguments of revsets/filesets predicates:
    (paths, regexps, keywords and so on)

  - for strings of styles/templates:

      in these cases, backslashes in patterns are skipped, because
      they are recognized as an escape character for the next
      character by:

        - "tokenize()" in "fileset.py"
        - "tokenize()" in "revset.py"
        - "tokenize()" in "templater.py"

      this causes an unexpected matching result (when the MBCS
      character is in the middle of the argument), or a parse error
      (when it is at the tail of the argument)


However, safety in the "strings for styles/templates" situation is
not needed as urgently.
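
The two failure modes for regexp patterns can be reproduced with a
short sketch. It assumes cp932 as the local encoding and uses U+8868,
whose cp932 form is the two bytes 0x95 0x5C; it only exercises the
standard "re" module, not Mercurial itself.

```python
import re

ENCODING = "cp932"  # assumed MBCS local encoding

# U+8868 is the two bytes 0x95 0x5C in cp932; 0x5C is also an ASCII backslash.
char = u"\u8868".encode(ENCODING)

# MBCS character in the middle: the trail byte escapes the following ".",
# so the pattern no longer means "this character followed by anything".
middle = re.compile(char + b".")
subject = (u"\u8868" + u"a").encode(ENCODING)
middle_matches = middle.search(subject) is not None

# MBCS character at the tail: the trail byte becomes a dangling escape.
try:
    re.compile(char)
    tail_error = None
except re.error as exc:
    tail_error = exc

print(middle_matches, tail_error)
```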

----------------------------------------------------------------------
[FUJIWARA Katsunori]                             foozy&amp;lt; at &amp;gt;lares.dti.ne.jp
&lt;/pre&gt;</description>
    <dc:creator>FUJIWARA Katsunori</dc:creator>
    <dc:date>2012-05-24T16:23:02</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50523">
    <title>[Bug 3469] New: Fix for bug 2653 broke ability to map svn trunk to branch other than 'default'</title>
    <link>http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50523</link>
    <description>&lt;pre&gt;http://bz.selenic.com/show_bug.cgi?id=3469

          Priority: normal
            Bug ID: 3469
                CC: mercurial-devel&amp;lt; at &amp;gt;selenic.com
          Assignee: bugzilla&amp;lt; at &amp;gt;selenic.com
           Summary: Fix for bug 2653 broke ability to map svn trunk to
                    branch other than 'default'
          Severity: feature
    Classification: Unclassified
                OS: Other
          Reporter: lstewart&amp;lt; at &amp;gt;room52.net
          Hardware: PC
            Status: UNCONFIRMED
           Version: 2.2.1
         Component: convert
           Product: Mercurial

I use Mercurial to mirror the FreeBSD Subversion repository for local FreeBSD
development purposes. FreeBSD's trunk branch is named "head", and the convert
extension post bugfix 2653 now forces commits on the "head" branch to be mapped
to the "default" branch in the Mercurial repository. For my use case, the new
behaviour is a regression in functionality.

I realise it is more "Mercurial-y" to use "default" as the equivalent branch
name for trunk, but I would prefer to keep the branch names the same between
svn and hg in this instance.

I propose to leave the default behaviour of the convert extension as is post
bugfix 2653, but allow the branchmap file to be used to override the default
behaviour and map the svn trunk name to a hg branch name other than "default"
e.g. by placing the line "head head" in the branchmap file, the convert
extension should preserve the trunk branch name of "head" in the svn repo as
"head" in the converted hg repo.

&lt;/pre&gt;</description>
    <dc:creator>bugzilla-daemon&lt; at &gt;bz.selenic.com</dc:creator>
    <dc:date>2012-05-24T07:45:01</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50522">
    <title>[PATCH] localrepo: Add locking to _branchtags around _writebranchcache</title>
    <link>http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50522</link>
    <description>&lt;pre&gt;# HG changeset patch
# User Joshua Redstone &amp;lt;joshua.redstone&amp;lt; at &amp;gt;fb.com&amp;gt;
# Date 1336812429 25200
# Node ID b3bb1d7360c481b847431fc0d264111b1d93f993
# Parent  2ac08d8b21aa7b6e0a062afed5a3f357ccef67f9
localrepo: Add locking to _branchtags around _writebranchcache

Read code paths such as via localrepo.branchmap() may end up calling
_writebranchcache without having acquired a repo lock, potentially leading to
races with other repo mutations.  This fix acquires the repo lock if not yet
held before calling _writebranchcache.
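
The pattern is roughly "try to take the lock without waiting, and skip
the optional cache write if that fails". A minimal standalone sketch,
where threading.Lock stands in for the repository lock and
write_cache_if_unlocked() is a hypothetical helper, not Mercurial code:

```python
import threading


def write_cache_if_unlocked(lock, write_cache):
    """Run write_cache() under lock; skip the write if the lock is unavailable."""
    if not lock.acquire(False):  # non-blocking, like wlock(wait=False) in the patch
        return False  # the cache write is optional, so a reader just moves on
    try:
        write_cache()
    finally:
        lock.release()
    return True


written = []
lock = threading.Lock()
first = write_cache_if_unlocked(lock, lambda: written.append("branchheads"))

lock.acquire()  # simulate another holder of the lock
second = write_cache_if_unlocked(lock, lambda: written.append("branchheads"))
lock.release()
print(first, second, written)
```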

diff -r 2ac08d8b21aa -r b3bb1d7360c4 mercurial/localrepo.py
--- a/mercurial/localrepo.py	Tue May 22 14:37:20 2012 -0500
+++ b/mercurial/localrepo.py	Sat May 12 01:47:09 2012 -0700
@@ -481,8 +481,18 @@
         if lrev != tiprev:
             ctxgen = (self[r] for r in xrange(lrev + 1, tiprev + 1))
             self._updatebranchcache(partial, ctxgen)
-            self._writebranchcache(partial, self.changelog.tip(), tiprev)
-
+            # Read code paths (e.g., from branchmap()) do not have the lock
+            # yet.  Lock acquisition may fail if we do not have write-access
+            # to the directory.
+            try:
+                wlock = self.wlock(wait=False)
+                try:
+                    self._writebranchcache(partial, self.changelog.tip(),
+                                           tiprev)
+                finally:
+                    wlock.release()
+            except error.LockError:
+                pass
         return partial
 
     def updatebranchcache(self):
diff -r 2ac08d8b21aa -r b3bb1d7360c4 mercurial/statichttprepo.py
--- a/mercurial/statichttprepo.py	Tue May 22 14:37:20 2012 -0500
+++ b/mercurial/statichttprepo.py	Sat May 12 01:47:09 2012 -0700
@@ -133,6 +133,11 @@
     def lock(self, wait=True):
         raise util.Abort(_('cannot lock static-http repository'))
 
+    def wlock(self, wait=True):
+        raise error.LockError(0, "statichttprepository",
+                              "statichttprepository does not support locking",
+                              "statichttprepository")
+
 def instance(ui, path, create):
     if create:
         raise util.Abort(_('cannot create new static-http repository'))
diff -r 2ac08d8b21aa -r b3bb1d7360c4 tests/test-bheads.t
--- a/tests/test-bheads.t	Tue May 22 14:37:20 2012 -0500
+++ b/tests/test-bheads.t	Sat May 12 01:47:09 2012 -0700
@@ -31,6 +31,14 @@
 
 =======
 
+Not being able to update the branchheads cache should not make read ops fail
+  $ echo "junk" &amp;gt; $TESTTMP/a/.hg/cache/branchheads
+  $ chmod -R u-w $TESTTMP/a/.hg
+  $ heads
+  1: Adding a branch (a)
+  0: Adding root node ()
+  $ chmod -R u+w $TESTTMP/a/.hg
+
   $ hg update -C 0
   0 files updated, 0 files merged, 1 files removed, 0 files unresolved
   $ echo 'b' &amp;gt;b
&lt;/pre&gt;</description>
    <dc:creator>Joshua Redstone</dc:creator>
    <dc:date>2012-05-24T14:50:53</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50521">
    <title>Re: RFC: safe pattern matching for problematic encoding</title>
    <link>http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50521</link>
    <description>&lt;pre&gt;
At Thu, 24 May 2012 15:10:33 +0200,
Mads wrote:

I imagine converting at the boundary between in-memory objects
(manifest, context and so on) and storage (revlog/changelog/path of
filelog) or external representations (bundle/diff).

I'll post a patch series soon as a basis for discussion, because it
is difficult for me to explain in English without code :-)


Sorry for the insufficient explanation.

I wanted to show an example of the ratio of NFD-ed characters in
Japanese text.


----------------------------------------------------------------------
[FUJIWARA Katsunori]                             foozy&amp;lt; at &amp;gt;lares.dti.ne.jp
&lt;/pre&gt;</description>
    <dc:creator>FUJIWARA Katsunori</dc:creator>
    <dc:date>2012-05-24T14:09:15</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50520">
    <title>Re: RFC: safe pattern matching for problematic encoding</title>
    <link>http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50520</link>
    <description>&lt;pre&gt;
So the file has the 'wrong' normalization from the beginning, even
before Mercurial sees it for the first time?

How do you imagine Mercurial possibly could 'fix' that? Do you have 
general rules for an alternative normalization that can clean up the 
MacOS normalization?

(Files that have non-NFD encoded names in the repo and on real unix will 
obviously have the NFD names on MacOS file systems, but Mercurial will 
preserve the original filename encoding when committing.)


(That non-standard markup probably does not help communicate your intent.)


How are file content and the encoding of translations related to MacOS
filename normalization?

As you probably know: Mercurial never touches or changes (or cares about)
file content encoding - it just reproduces it 100% reliably.

On 24/05/12 14:38, Noel Grandin wrote:

We already have http://selenic.com/repo/hg/file/tip/contrib/casesmash.py .

Further work in that direction could perhaps be useful in some cases, 
but it is in no way a prerequisite or sufficient for implementing 
support for all the real quirks of the real platforms.

/Mads

&lt;/pre&gt;</description>
    <dc:creator>Mads</dc:creator>
    <dc:date>2012-05-24T13:10:33</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50519">
    <title>Re: RFC: safe pattern matching for problematic encoding</title>
    <link>http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50519</link>
    <description>&lt;pre&gt;

On 2012-05-24 14:16, FUJIWARA Katsunori wrote:

Perhaps the first step towards solving these problems is to build 3 
mock-filesystem libraries
- a MacOS-type-NFD system
- a Windows-type case-insensitive system
- a straightforward Linux-type system

And then create some unit tests using these mock-filesystem libraries 
which expose the problems.

Then everybody can test and fix without needing 3 different boxes :-)

Links:
https://launchpad.net/mockfs

Disclaimer: http://www.peralex.com/disclaimer.html


&lt;/pre&gt;</description>
    <dc:creator>Noel Grandin</dc:creator>
    <dc:date>2012-05-24T12:38:07</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50518">
    <title>Re: RFC: safe pattern matching for problematic encoding</title>
    <link>http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50518</link>
    <description>&lt;pre&gt;
At Wed, 23 May 2012 13:57:27 -0500,
Matt Mackall wrote:

Sorry for the insufficient explanation.

With the current implementation:

    1. a file whose name is changed by NFD is added on MacOS
    2. such a changeset is received on other platforms, and then,
    3. the NFD-ed file is stored into the working dir by "hg update"

Of course, for some people, this is co-operative enough.

But at least in Japan, users on platforms other than MacOS strongly
want to get non-NFD-ed files in their working directory.

# both via GUI file browsers and CUI terminals

For example, the "i18n/ja.po" message translation file contains 122181
non-ASCII Japanese characters and 10% of them can be changed by NFD:
this ratio may become 15% or more in some situations.

This is not a small amount for users of Japanese.
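
Such a ratio can be measured with a few lines of Python. This is only a
sketch of one way to count: nfd_ratio() is a hypothetical helper, and
the sample string is illustrative and does not reproduce the figures
above.

```python
import unicodedata


def nfd_ratio(text):
    """Return the fraction of non-ASCII characters that NFD would rewrite."""
    non_ascii = [ch for ch in text if ord(ch) not in range(128)]
    if not non_ascii:
        return 0.0
    changed = [ch for ch in non_ascii if unicodedata.normalize("NFD", ch) != ch]
    return float(len(changed)) / len(non_ascii)


# U+304C (hiragana GA) decomposes to U+304B plus U+3099; U+304B (KA) is unchanged.
sample = u"\u304c\u304b abc"
print(nfd_ratio(sample))
```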

Yes, this requirement is not logical but emotional. And because the
reason is emotional, it is very important for spreading Mercurial in Japan :-)


BTW, in Japan, non-programmers (e.g. graphic designers and project
managers) seem to prefer MacOS, and they are the main users of
filenames in Japanese.

# of course, there are many programmers who love MacOS, too :-)

But on the other hand, many of them also use Windows.

So, people leading the introduction of Mercurial into their teams
often ask me how to co-operate with them via Mercurial.

----------------------------------------------------------------------
[FUJIWARA Katsunori]                             foozy&amp;lt; at &amp;gt;lares.dti.ne.jp
&lt;/pre&gt;</description>
    <dc:creator>FUJIWARA Katsunori</dc:creator>
    <dc:date>2012-05-24T12:16:37</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50517">
    <title>[PATCH] revset: cache alias expansions</title>
    <link>http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/50517</link>
    <description>&lt;pre&gt;# HG changeset patch
# User Patrick Mezard &amp;lt;patrick&amp;lt; at &amp;gt;mezard.eu&amp;gt;
# Date 1337857506 -7200
# Node ID f5171e8248a20b23f0404a6f41c1c88390060e60
# Parent  2ac08d8b21aa7b6e0a062afed5a3f357ccef67f9
revset: cache alias expansions

Caching has no performance effect on the revset aliases which triggered
the recent recursive evaluation bug. I wrote it so as not to feel bad
about expanding the same complicated expression several times.
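
The idea can be sketched on a simplified tree of nested tuples and
strings. This is an illustration only: expand_aliases() and the sample
aliases are hypothetical, and alias arguments are left out.

```python
def expand_aliases(aliases, tree, expanding, cache):
    """Expand alias names in a tree of nested tuples and strings."""
    if isinstance(tree, tuple):
        return tuple(expand_aliases(aliases, t, expanding, cache) for t in tree)
    if tree not in aliases:
        return tree  # ordinary symbol, nothing to expand
    if tree in expanding:
        raise ValueError("infinite expansion of alias %r detected" % tree)
    if tree not in cache:
        # first use of this alias: expand its replacement once and remember it
        expanding.append(tree)
        try:
            cache[tree] = expand_aliases(aliases, aliases[tree], expanding, cache)
        finally:
            expanding.pop()
    return cache[tree]


aliases = {"recent": ("func", "last", "stable"), "stable": ("func", "branch", "x")}
cache = {}
tree = ("or", "recent", "recent")
result = expand_aliases(aliases, tree, [], cache)
print(result)
print(sorted(cache))
```

The second occurrence of "recent" is served from the cache instead of
being expanded again, and the cycle check still runs before the cache
lookup.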

diff --git a/mercurial/revset.py b/mercurial/revset.py
--- a/mercurial/revset.py
+++ b/mercurial/revset.py
@@ -1387,7 +1387,7 @@
         return args[arg]
     return tuple(_expandargs(t, args) for t in tree)
 
-def _expandaliases(aliases, tree, expanding):
+def _expandaliases(aliases, tree, expanding, cache):
     """Expand aliases in tree, recursively.
 
     'aliases' is a dictionary mapping user defined aliases to
@@ -1402,17 +1402,20 @@
             raise error.ParseError(_('infinite expansion of revset alias "%s" '
                                      'detected') % alias.name)
         expanding.append(alias)
-        result = _expandaliases(aliases, alias.replacement, expanding)
+        if alias.name not in cache:
+            cache[alias.name] = _expandaliases(aliases, alias.replacement,
+                                               expanding, cache)
+        result = cache[alias.name]
         expanding.pop()
         if alias.args is not None:
             l = getlist(tree[2])
             if len(l) != len(alias.args):
                 raise error.ParseError(
                     _('invalid number of arguments: %s') % len(l))
-            l = [_expandaliases(aliases, a, []) for a in l]
+            l = [_expandaliases(aliases, a, [], cache) for a in l]
             result = _expandargs(result, dict(zip(alias.args, l)))
     else:
-        result = tuple(_expandaliases(aliases, t, expanding)
+        result = tuple(_expandaliases(aliases, t, expanding, cache)
                        for t in tree)
     return result
 
@@ -1422,7 +1425,7 @@
     for k, v in ui.configitems('revsetalias'):
         alias = revsetalias(k, v)
         aliases[alias.name] = alias
-    return _expandaliases(aliases, tree, [])
+    return _expandaliases(aliases, tree, [], {})
 
 parse = parser.parser(tokenize, elements).parse
 
&lt;/pre&gt;</description>
    <dc:creator>Patrick Mezard</dc:creator>
    <dc:date>2012-05-24T11:49:49</dc:date>
  </item>
  <textinput rdf:about="http://search.gmane.org/?group=$group=gmane.comp.version-control.mercurial.devel">
    <title>Search Engine</title>
    <description>Search the mailing list at Gmane</description>
    <name>query</name>
    <link>http://search.gmane.org/?group=$group=gmane.comp.version-control.mercurial.devel</link>
  </textinput>
</rdf:RDF>

