<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/">
  <channel about="http://blog.gmane.org/gmane.science.linguistics.wikipedia.technical">
    <title>gmane.science.linguistics.wikipedia.technical</title>
    <link>http://blog.gmane.org/gmane.science.linguistics.wikipedia.technical</link>
    <description/>
    <syn:updatePeriod>hourly</syn:updatePeriod>
    <syn:updateFrequency>1</syn:updateFrequency>
    <syn:updateBase>1901-01-01T00:00+00:00</syn:updateBase>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40834"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40833"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40832"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40831"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40830"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40829"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40828"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40827"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40826"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40825"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40824"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40823"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40822"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40821"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40820"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40819"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40818"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40817"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40816"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40815"/>
      </rdf:Seq>
    </items>
    <image rdf:resource="http://gmane.org/img/gmane-25t.png"/>
    <textinput rdf:resource=""/>
  </channel>
  <image rdf:about="http://gmane.org/img/gmane-25t.png">
    <title>Gmane</title>
    <url>http://gmane.org/img/gmane-25t.png</url>
    <link>http://gmane.org</link>
  </image>
  <item rdf:about="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40834">
    <title>Re: The never-dying topic: category intersection (beenthere done that .. to the power of three)</title>
    <link>http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40834</link>
    <description>2008/12/4 Gregory Maxwell &lt;gmaxwell-Re5JQEeQqe8AvxtiuMwx3w&lt; at &gt;public.gmane.org&gt;:




Hmm, I musta missed this. I woulda thought the commons-l habitues
would have swooped upon it with great glee.


- d.
</description>
    <dc:creator>David Gerard</dc:creator>
    <dc:date>2008-12-04T01:22:44</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40833">
    <title>Re: The never-dying topic: category intersection (beenthere done that .. to the power of three)</title>
    <link>http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40833</link>
    <description>
With a JS hack I had my tool integrated into the site. The AJAX calls
went to the toolserver, but as far as the users could see it was
running on the site. No one cared: it didn't produce useful results
because of how categories are used, and when I suggested changing that,
people just waved their arms at me: "just make it walk the tree".
</description>
    <dc:creator>Gregory Maxwell</dc:creator>
    <dc:date>2008-12-04T01:16:55</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40832">
    <title>Re: The never-dying topic: category intersection (beenthere done that .. to the power of three)</title>
    <link>http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40832</link>
    <description>2008/12/4 Daniel Schwen &lt;lists-SYvMc73jUx+zQB+pC5nmwQ&lt; at &gt;public.gmane.org&gt;:




It's vaporware until it's usable as a tagging system in practice.




This being precisely what Commons has been begging for for a while!




The last time will be when there's a feature end-users can use without
going off to the toolserver.


- d.
</description>
    <dc:creator>David Gerard</dc:creator>
    <dc:date>2008-12-04T01:12:12</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40831">
    <title>Re: All wikipedia text less than 500 MB compressed?</title>
    <link>http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40831</link>
    <description>[snip]

Yes.
</description>
    <dc:creator>Gregory Maxwell</dc:creator>
    <dc:date>2008-12-04T01:10:25</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40830">
    <title>Re: Non-latin characters broken in donation comments</title>
    <link>http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40830</link>
    <description>Platonides wrote:

Because the order is dependent on the listed name, not on the viewer.

-- brion
</description>
    <dc:creator>Brion Vibber</dc:creator>
    <dc:date>2008-12-04T00:53:37</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40829">
    <title>Re: All wikipedia text less than 500 MB compressed?</title>
    <link>http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40829</link>
    <description>Platonides wrote:

Off the top of my head, referring to compressed size of text of current
article pages only. Looks like enwiki has expanded a bit since I last
looked (4.1 GB). :)

-- brion
</description>
    <dc:creator>Brion Vibber</dc:creator>
    <dc:date>2008-12-04T00:52:53</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40828">
    <title>Re: Non-latin characters broken in donation comments</title>
    <link>http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40828</link>
    <description>
Why?
Western names would be shown in the 'wrong' order when viewed with the
Eastern setting, and vice versa. But it'd be a client setting, so anyone can
view the list in the order that suits them best.
</description>
    <dc:creator>Platonides</dc:creator>
    <dc:date>2008-12-04T00:45:36</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40827">
    <title>All wikipedia text less than 500 MB compressed?</title>
    <link>http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40827</link>
    <description>From the CNET interview with Brion:
http://news.cnet.com/8301-17939_109-10103177-2.html


That statement struck me, as I wouldn't have thought that the big wikis
could fit in that, much less all wikis.

So I went and spent some CPU on calculations:

I first looked at dewiki:
$ 7z e -so dewiki-20081011-pages-meta-history.xml.7z | sed -n 's/\s*&lt;text xml:space="preserve"&gt;\([^&lt;]*\)\(&lt;\/text&gt;\)\?/\1/gp' | bzip2 -9 | wc -c
325915907 bytes = 310.8 MB

Not bad for a 5.1 GB 7z file. :)


Then I went to enwiki, beginning with the current versions:
$ bzcat enwiki-20081008-pages-meta-current.xml.bz2 | sed -n 's/\s*&lt;text xml:space="preserve"&gt;\([^&lt;]*\)\(&lt;\/text&gt;\)\?/\1/gp' | bzip2 -9 | wc -c
253648578

253648578 bytes = 241.898 MB

Again, a gigantic file (7.8 GB bz2) was reduced to less than 500 MB.
Maybe it *can* be done after all. There are many more revisions, but
the compression ratio is greater.


So I had to turn to the beast: the enwiki history files. As there
hasn't been a successful enwiki history dump in the last few months, I
used an old dump I had, which is nearly a year old and takes up 18 GB.

$ 7z e -so enwiki-20080103-pages-meta-history.xml.7z | sed -n 's/\s*&lt;text xml:space="preserve"&gt;\([^&lt;]*\)\(&lt;\/text&gt;\)\?/\1/gp' | bzip2 -9 | wc -c

1092104465 bytes = 1041.5 MB = 1.01 GB


So, where did those 'less than 500MB' numbers come from? Also note that
I used bzip2 instead of gzip, so external storage will be using much
more space (plus indexes, ids...).

Nonetheless, it is impressive how much the size of *already
compressed files* is reduced just by removing the metadata.

As a comparison, dewiki-20081011-stub-meta-history.xml.gz containing the
remaining metadata is 1.7GB. 1.7 GB + 310.8 MB is still much less than
the 51.4 GB of dewiki-20081011-pages-meta-history.xml.bz2!


Maybe we should investigate new ways of storing the dumps compressed.
Could we achieve similar gains increasing the bzip window size to
counteract the noise of revision metadata?
Or perhaps I used the wrong regex and thus large chunks of data were not
taken into account?
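
A rough way to cross-check the regex question is sketched below in Python. It is only an illustration, not what produced the numbers above: the function names are made up, and it parses the dump as XML instead of using sed. One thing worth noting is that `sed -n` with the `p` flag prints only lines on which the substitution matched, so lines of a revision after the one carrying the opening text tag would be skipped; the sketch counts whole text elements instead.

```python
import bz2
import io
import xml.etree.ElementTree as ET


def compressed_text_size(xml_stream):
    """Return the bzip2 -9 size, in bytes, of all revision text in a dump stream."""
    compressor = bz2.BZ2Compressor(9)
    total = 0
    for _event, elem in ET.iterparse(xml_stream):
        # Dump elements carry an XML namespace, so compare the local name only.
        if elem.tag.rsplit("}", 1)[-1] == "text":
            total += len(compressor.compress((elem.text or "").encode("utf-8")))
        elem.clear()  # free the parsed subtree to keep memory flat
    return total + len(compressor.flush())


def to_mib(n_bytes):
    """Convert bytes to the 'MB' figures used above (1 MB = 2**20 bytes)."""
    return n_bytes / 2**20


# Tiny made-up stand-in for a real dump, built in code for illustration.
root = ET.Element("mediawiki")
for body in ("first revision", "second revision\nwith a second line"):
    ET.SubElement(ET.SubElement(root, "revision"), "text").text = body
sample = io.BytesIO(ET.tostring(root))

print(compressed_text_size(sample))
print(round(to_mib(325915907), 1))  # 310.8, the dewiki figure
```

On a real dump the stream would come from something like `bz2.open(path, "rb")`; running it over a full history file would of course take a long time.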
</description>
    <dc:creator>Platonides</dc:creator>
    <dc:date>2008-12-04T00:43:19</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40826">
    <title>Re: The never-dying topic: category intersection (beenthere done that .. to the power of three)</title>
    <link>http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40826</link>
    <description>
Uhm, yeah.. except that intersections of atomic categories are not vaporware.
We had proofs of concept for that and the interest was marginal.

In any case, if someone would really just shove it into mw core and enable
it on all the wmf sites, I'd be happy. I concur that it would make the job
of convincing users of a less retarded categorization scheme a bit easier.

As far as Aeriks soapboxing from a few emails back goes: Let's not kid
ourselves, tag based categorization is standard on commercial sites such as
stock photography libraries. We are not exactly inventing this...

I'll shut up now, and I really hope that this is the last time we're having
this discussion... (but boy, you will get an earful if it isn't ;-) )
</description>
    <dc:creator>Daniel Schwen</dc:creator>
    <dc:date>2008-12-04T00:12:50</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40825">
    <title>Re: The never-dying topic: category intersection (beenthere done that .. to the power of three)</title>
    <link>http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40825</link>
    <description>
Of course, humans would have to manually specify which new categories
each old one corresponds to, but that's a perfectly doable job for a
small group of volunteers working over the course of months.  The bots
would do the much more tedious work of actually replacing them, so
each category could take substantially less than a minute of human
review.  The category intersection feature would then get
incrementally more useful as the work progressed.


There's a world of difference between showing that something is
feasible in theory, and making it a core part of the software that's
visible on every category page on every Wikimedia wiki without asking
for community consensus in advance.  As soon as people actually start
using the feature, and they will if there's a box on every category
page, they'll realize that it would be way more useful if they changed
how things are categorized.  As long as category intersections remain
vaporware, there's no incentive to change.  A technical fait accompli
will bring about change.

Even if Commons hypothetically didn't go along with the scheme, it
would be valuable to have it in the software anyway.  Plenty of wikis
could still use it, like dewiki.  We need an interface and we need a
backend and we need someone to hook them together and commit them to
Subversion.  People have spent too much time inventing and reinventing
and re-reinventing new and different but basically interchangeable
backends, and too little time on the other parts of the problem.  If
the feature were committed to the software with a completely brainless
backend unusable on Wikimedia wikis, I predict it would be live on all
sites in less than six months.
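
To make "completely brainless backend" concrete, here is a minimal sketch in Python. It is purely illustrative: the function name and the in-memory dictionary are assumptions, not MediaWiki code, and a real backend would query the category tables instead.

```python
def intersect_categories(members, wanted):
    """Return the pages that belong to every category in `wanted`.

    `members` maps a category name to the set of page titles in it.
    """
    sets = [members.get(name, set()) for name in wanted]
    if not sets:
        return set()
    return set.intersection(*sets)


# Made-up example data.
members = {
    "Mammals": {"Cat", "Dog", "Bat"},
    "Flying animals": {"Bat", "Eagle"},
}
print(intersect_categories(members, ["Mammals", "Flying animals"]))  # {'Bat'}
```

Something this naive would not scale to Wikimedia wikis, which is the point: the interface could be built and committed against it, and the backend swapped out later.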
</description>
    <dc:creator>Aryeh Gregor</dc:creator>
    <dc:date>2008-12-03T23:35:49</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40824">
    <title>Re: Non-latin characters broken in donation comments</title>
    <link>http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40824</link>
    <description>Platonides wrote:

Because it would show everything wrong? :)

-- brion
</description>
    <dc:creator>Brion Vibber</dc:creator>
    <dc:date>2008-12-03T23:33:40</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40823">
    <title>Re: Non-latin characters broken in donation comments</title>
    <link>http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40823</link>
    <description>
There is something to be said for annoying everyone equally. Being an
international organisation is very important for the foundation, it
may well be worth annoying (non-Hungarian) westerners unnecessarily in
order to show that we're not favouring any nationalities over others.
(This is all assuming people that use the Surname-Given name order
will actually care - they may all be so used to having their names
mangled that they barely notice anymore. A little market research may
be called for.)
</description>
    <dc:creator>Thomas Dalton</dc:creator>
    <dc:date>2008-12-03T23:30:45</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40822">
    <title>Re: Non-latin characters broken in donation comments</title>
    <link>http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40822</link>
    <description>(long, complex solutions to guess the right display)

Why not have a "Show Name, Surname / Show Surname, Name" option on the
donation display?
Easy, consistent, and everybody should be happy with it.
</description>
    <dc:creator>Platonides</dc:creator>
    <dc:date>2008-12-03T23:30:09</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40821">
    <title>Re: Non-latin characters broken in donation comments</title>
    <link>http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40821</link>
    <description>

As no such data are released (you can't filter donations by currency,
or even better by currency+location), I'm just guessing that those donating
in forints are mostly (~100%) Hungarians, while there is no easy way to find
the Hungarians among those not donating in forints.
I didn't want to elaborate on this in my previous mail, but as long as the
surname - first name order is not considered wrong, strange or out of place
in the context of English, and possibly other languages, then using this
order would be a win-win (it would still be acceptable on the
English/other interfaces, and on the Hungarian interface it would be
correct).
However, most Hungarians themselves use the Western order to name themselves
in English (and I guess in most foreign languages and contexts), so the
Western order would be correct on every interface language (except possibly
in those countries that use the non-Western order) except Hungarian (but I
dare say that people don't/wouldn't mind it, as they understand that the
context is mostly English [website of an American foundation, even the
currencies look 'foreign']). In conclusion, I would let the Hungarians'
names rest for this year :).

Unfortunately we get the name already divided up from PayPal and are

You have a box for comments, that is independent from the PayPal people.
Maybe a solution would be to have 3 options instead of two at the privacy
checkbox: Display my name [default], Anonymous donation, Display a custom
name [this could possibly work for donating in someone else's name, if
that's not a privacy concern].
--
Bence Damokos (Damokos Bence in Hungary)

</description>
    <dc:creator>Bence Damokos</dc:creator>
    <dc:date>2008-12-03T22:04:31</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40820">
    <title>Re: Non-latin characters broken in donation comments</title>
    <link>http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40820</link>
    <description>Roan Kattouw wrote:

Basically, names are hard. :)

The only way to do it right reliably is to let the person type in their
name the way they want it and then *not change it*.

Unfortunately we get the name already divided up from PayPal and are
stuck either guessing or making an unattractive 'Surname, Given' display
which looks bad for everyone. :(

-- brion
</description>
    <dc:creator>Brion Vibber</dc:creator>
    <dc:date>2008-12-03T21:56:03</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40819">
    <title>Re: Non-latin characters broken in donation comments</title>
    <link>http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40819</link>
    <description>Bence Damokos wrote:
Note that not all people who live in Hungary have Hungarian names, and 
not all Hungarians live in Hungary.

Roan Kattouw (Catrope)
</description>
    <dc:creator>Roan Kattouw</dc:creator>
    <dc:date>2008-12-03T21:01:40</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40818">
    <title>Re: Non-latin characters broken in donation comments</title>
    <link>http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40818</link>
    <description>

Thank you for considering Hungarian. You could detect Hungarians by simply
looking for donations in Hungarian Forints (HUF).

Best regards,
Bence Damokos

</description>
    <dc:creator>Bence Damokos</dc:creator>
    <dc:date>2008-12-03T19:27:28</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40817">
    <title>Re: Non-latin characters broken in donation comments</title>
    <link>http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40817</link>
    <description>
Brion Vibber wrote:

Ok, quick summary:

1) PayPal sends us a payment record with 'first_name' and 'last_name'
fields.

2) We insert that record into our CiviCRM database.

3) CiviCRM combines the first name and last name into a "display
name"... per standard Western ordering assumptions.

4) The display name is copied into our public reporting database and
shown on the web.

It looks like we can't do much about the name split in 1); that's just
what we get out of the payment processor. We may be able to fudge things
at step 3) by detecting Han characters and producing a properly-sorted
display name, at least for that case.

Of course this will still be wrong for Hungarians, and Romanized
Japanese names may often get written either way...
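
A minimal sketch of the step 3) fudge, in Python for illustration. The function names are hypothetical and this is not CiviCRM code; it assumes only the `first_name` and `last_name` values from the payment record. It covers Han characters only, so kana-only or Romanized names still fall through to Western order.

```python
import unicodedata


def has_han(text):
    """True if any character is a CJK ideograph, judged by its Unicode name."""
    return any(unicodedata.name(ch, "").startswith("CJK") for ch in text)


def display_name(first_name, last_name):
    """Build a display name from the two fields the payment record provides."""
    if has_han(first_name + last_name):
        # Family name first, with no separating space.
        return last_name + first_name
    # Standard Western ordering otherwise (still wrong for e.g. Hungarian).
    return first_name + " " + last_name


print(display_name("Brion", "Vibber"))  # Brion Vibber
print(display_name("太郎", "山田"))      # 山田太郎
```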

-- brion
</description>
    <dc:creator>Brion Vibber</dc:creator>
    <dc:date>2008-12-03T18:56:42</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40816">
    <title>Stanton Foundation $890K Usability Grant</title>
    <link>http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40816</link>
    <description>As per Michael's earlier e-mail:

http://wikimediafoundation.org/wiki/Press_releases/Wikipedia_to_become_more_user-friendly_for_new_volunteer_writers

We're very grateful to the Stanton Foundation for this important
investment in Wikipedia's user-friendliness. We're aware of the UNICEF
research as well and we'll survey the existing improvements as part of
this project. A few points beyond the press release:

'''When will this project begin, and when will it finish?'''

The project will begin in January 2009.  It will wrap up in April 2010.

'''What is its overall scope?'''

The project scope will include the following:

* user testing designed to identify the most common barriers to entry
for first-time writers, and
* a series of improvements to the MediaWiki interface, including
improvements to issues identified through user testing and a focus on
hiding complex elements of the user interface from people who don't
use them. (Specifically, we'll focus on complex syntax like templates,
references, tables, etc.)

'''What does the Wikimedia Foundation consider to be wrong with the
editing interface right now?'''

When it was first developed, MediaWiki was considered reasonably
user-friendly.  At that time, software wasn't as flexible and
user-focused as it is today.  It's logical that by today's standards,
MediaWiki may not seem to be as streamlined or user-friendly as other
software.

We have never systematically examined the editing interface to determine
what kinds of challenges new contributors face, but we do know of
certain common problems.  For example, many people have difficulty
creating new articles, uploading images, and editing templates,
footnotes, and tables.  We hope to make improvements in those areas.

'''Who are the new contributors you are hoping to attract?'''

We are hoping to attract new contributors who are just as smart and
knowledgeable as the people who have always written for Wikipedia and
its sister projects, but who, to date, have been unable or reluctant
to participate because of the barriers posed by the interface.  There
are countless individuals who read Wikipedia and would be great
writers/editors, but are daunted by complex wiki syntax.  They may not
even realize that they can edit Wikipedia. They are the people we are
targeting with this project.

'''What is the nature of the interface improvements that will be made
in this project?'''

In phase 1 (until late summer 2009), we will focus on reducing or
eliminating common, simple barriers to entry.  A possible example
would be, "making the edit button more visible."  These will be
identified through systematic user testing, but also by surveying
existing research.  In phase 2 (until early 2010), we will shift our
attention to identifying complex pieces of "wiki code" (the formatting
language used to write Wikipedia articles) and making them less
visible to first-time contributors and/or helping them achieve the
respective functionality (such as adding tables) more easily.

'''When can we expect to see the first changes to the Wikipedia interface?'''

We hope to demonstrate a first series of improvements by mid-2009,
with production deployment following shortly thereafter.

'''How can the Wikimedia volunteer community be involved in this project?'''

The project will be open and participatory throughout.  Every major
report will be publicly shared, and all code will be developed through
our existing, public version control system.  Volunteer developers and
testers will be encouraged to contribute throughout the process.

'''Are the positions created for this project just temporary?'''

We will allocate at least two existing, budgeted developer positions
to this project, and additional hires will be employed for the
duration of the grant.

'''Why don't these funds count towards your overall fundraising goals?'''

The majority of the funding for this project will go towards costs not
included in our 2008-09 budget.  While we anticipate that the project
will offset some of our operating costs, we also want to retain
flexibility to reallocate funding inside the project budget as
required.

'''Are you going to localize these changes in all the languages of
Wikipedia and the other projects?'''

All code will be ready for internationalization.

'''Are you going to be looking at the entire editing/contribution
process or just the software?'''

This project focuses on technical solutions, but the user testing will
aim to capture problems experienced throughout the editing process.

</description>
    <dc:creator>Erik Moeller</dc:creator>
    <dc:date>2008-12-03T18:42:16</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40815">
    <title>Re: Non-latin characters broken in donation comments</title>
    <link>http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40815</link>
    <description>mizusumashi wrote:

Yay!


Hmmmmmm... we'll see if we get a display ordering or if we can arrange
something else nice...

-- brion
</description>
    <dc:creator>Brion Vibber</dc:creator>
    <dc:date>2008-12-03T18:40:30</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40814">
    <title>Re: The never-dying topic: category intersection</title>
    <link>http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/40814</link>
    <description>[snip]
[snip]

So an interface I had that was really pleasing was that I asked the
database to find a random subset of the results, which it could do
quickly, (or I used the whole results if the initial query contained
them) and I found the set of categories which maximally bisected the
result and presented the list with a set of +/- buttons.

I.e. you search for Animal and you'd get:
Mammal[+/-] Reptile[+/-] Kittens[+/-] Taken with Canon Camera[+/-] Human[+/-]

based on how close to 50% of the results have the suggested category.

It's not exactly a 'related category', but I thought it was very useful.
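
Roughly, the ranking could look like the Python sketch below. This is a reconstruction for illustration only, with made-up names and data, not the code the tool actually ran (which did this in the database).

```python
def bisecting_categories(results, limit=5):
    """Rank categories by how close they come to splitting the results 50/50.

    `results` maps each page title to the set of categories it carries.
    """
    total = len(results)
    counts = {}
    for categories in results.values():
        for category in categories:
            counts[category] = counts.get(category, 0) + 1
    # Distance from an even split; smaller is better. Ties break alphabetically.
    ranked = sorted(counts, key=lambda c: (abs(counts[c] / total - 0.5), c))
    return ranked[:limit]


# Made-up result set for a search on "Animal".
results = {
    "Cat.jpg": {"Animal", "Mammal", "Kittens"},
    "Dog.jpg": {"Animal", "Mammal"},
    "Gecko.jpg": {"Animal", "Reptile"},
    "Snake.jpg": {"Animal", "Reptile"},
}
print(bisecting_categories(results, 3))  # ['Mammal', 'Reptile', 'Kittens']
```

A category carried by every result, like the one searched for, lands at the bottom of the ranking because it cannot narrow anything down.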

I also did a fuzzy text matching search on the category names using a
trigram index, so it was always sure to suggest Category:Cats when you
searched for Cat, or whatever.  (I did this with an ajaxy
search-while-you-type; it was handy.)
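
For anyone unfamiliar with trigram matching, here is a small in-memory sketch in Python. The names are made up and it scores by Jaccard overlap, which is an assumption; the real thing used a trigram index in the database and may have scored differently.

```python
def trigrams(text):
    """Set of 3-character windows over the padded, lower-cased text."""
    padded = "  " + text.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a, b):
    """Jaccard overlap of the two trigram sets, from 0.0 to 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta.intersection(tb)) / len(ta.union(tb))


def suggest(query, category_names, limit=3):
    """Category names ordered from most to least similar to the query."""
    return sorted(category_names, key=lambda n: -similarity(query, n))[:limit]


print(suggest("Cat", ["Dogs", "Canon cameras", "Cats"]))
```

With these three names, "Cats" comes out first because it shares three of its trigrams with "Cat", while "Dogs" shares none.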
</description>
    <dc:creator>Gregory Maxwell</dc:creator>
    <dc:date>2008-12-03T18:12:08</dc:date>
  </item>
  <textinput about="http://search.gmane.org/?group=$group=gmane.science.linguistics.wikipedia.technical">
    <title>Search Engine</title>
    <description>Search the mailing list at Gmane</description>
    <name>query</name>
    <link>http://search.gmane.org/?group=$group=gmane.science.linguistics.wikipedia.technical</link>
  </textinput>
</rdf:RDF>
