<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/">
  <channel rdf:about="http://blog.gmane.org/gmane.linux.kernel.aio.general">
    <title>gmane.linux.kernel.aio.general</title>
    <link>http://blog.gmane.org/gmane.linux.kernel.aio.general</link>
    <description/>
    <syn:updatePeriod>hourly</syn:updatePeriod>
    <syn:updateFrequency>1</syn:updateFrequency>
    <syn:updateBase>1901-01-01T00:00+00:00</syn:updateBase>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3066"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3065"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3064"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3063"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3062"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3061"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3060"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3059"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3058"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3057"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3056"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3055"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3054"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3053"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3052"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3051"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3050"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3049"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3048"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3047"/>
      </rdf:Seq>
    </items>
    <image rdf:resource="http://gmane.org/img/gmane-25t.png"/>
    <textinput rdf:resource=""/>
  </channel>
  <image rdf:about="http://gmane.org/img/gmane-25t.png">
    <title>Gmane</title>
    <url>http://gmane.org/img/gmane-25t.png</url>
    <link>http://gmane.org</link>
  </image>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3066">
    <title>Re: libaio status</title>
    <link>http://permalink.gmane.org/gmane.linux.kernel.aio.general/3066</link>
    <description>&lt;pre&gt;
We've got an application that uses aio with O_DIRECT to LVM volumes. Originally 
we had io_submit() and io_getevents() in separate threads, but as part of a
recent rewrite we moved to using eventfd() with poll() to determine when
completions are ready. There are other optimizations in the code, and generally 
the new, non-threaded code is 5-10% faster.

When I profile the code under load, it shows 70-80% of the time waiting in
io_submit() - but I'm suspicious of these numbers, because if I try modifying 
the new code to put io_submit() in a separate thread, it does not make any 
perceptible difference to throughput.

I would like to know, under what circumstances, when using O_DIRECT to a block 
device, can io_submit() block?

Chris Farey

On 16/05/2012 21:07, Raz wrote:

--
To unsubscribe, send a message with 'unsubscribe linux-aio' in
the body to majordomo&amp;lt; at &amp;gt;kvack.org.  For more info on Linux AIO,
see: http://www.kvack.org/aio/
Don't email: &amp;lt;a href=mailto:"aart&amp;lt; at &amp;gt;kvack.org"&amp;gt;aart&amp;lt; at &amp;gt;kvack.org&amp;lt;/a&amp;gt;

&lt;/pre&gt;</description>
    <dc:creator>Chris Farey</dc:creator>
    <dc:date>2012-05-17T09:16:05</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3065">
    <title>Re: libaio status</title>
    <link>http://permalink.gmane.org/gmane.linux.kernel.aio.general/3065</link>
    <description>&lt;pre&gt;
As far as I know any allocating write (extending or writing into the
middle of a sparse, non-fallocated region) will block on all file
systems.  The current AIO framework calls the file-system specific
bmap function, and the only thing which is asynchronous is the actual
Data I/O block itself.

I looked into fixing it, and it would have required creating an entirely
new AIO framework (which would initially have been used only for
ext4).  It would also have had a number of interesting problems in a
heavily containerized world (i.e., where you have multiple cgroups
strictly allocating how much memory, CPU, and prop I/O you allow to be
used by each group of processes comprising a job), since if you run
the metadata I/O requests out of a workgroup, you'll be constrained by
whatever cgroup was used to mount the file system, which might not
normally be given huge amounts of CPU and memory to play with.

And you *do* have to run out of some kind of foreground process
context, since the Block I/O layer doesn't support making requests out
of the completion callback handler.

This is all fixable, but it's a huge amount of work.  So what I did
instead was create an ioctl which would cache all of the
logical-&amp;gt;physical block mappings in an in-memory btree.  That means
that application programs would need to use synchronous calls to open,
pre-cache the extent tree, and to use fallocate, but then all data
reads and writes would be asynchronous.  The in-memory extent-cache,
along with the pre-cache ioctl, still needs to be forward ported to
mainline.  I know it's not a complete solution (and truth to tell, a
bit of a hack), but it was a hell of a lot easier than trying to add
full support for generic AIO, especially when dealing with cgroup
interactions.

- Ted

&lt;/pre&gt;</description>
    <dc:creator>Ted Ts'o</dc:creator>
    <dc:date>2012-05-16T22:15:11</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3064">
    <title>Re: libaio status</title>
    <link>http://permalink.gmane.org/gmane.linux.kernel.aio.general/3064</link>
    <description>&lt;pre&gt;

Libaio doesn't really have a website.  That page is a part of the old
linux scalability effort website.


Looks pretty good, but I didn't read too closely.


Well, how do you define "good?"  Libaio will work just fine on regular
files when opened with O_DIRECT.  As mentioned elsewhere, buffered I/O
is not supported in an asynchronous manner, so io_submit will block
until the I/O is complete.

Some caveats:
- if a metadata read is required, that can block io_submit
- if there is memory pressure, memory allocations can block io_submit
- if the file is not preallocated, some file systems will fall back to
  buffered I/O for hole filling (so io_submit will block until the I/O is
  complete).  Since you explicitly mentioned ext4, you don't have this
  problem, so long as you use a sufficiently recent kernel.

If it is at all possible, I would preallocate your files before doing
AIO to them.
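
A minimal sketch of such preallocation, assuming Python on Linux and a
hypothetical helper name preallocate. os.posix_fallocate reserves the
blocks up front, so later writes inside that range should not have to
allocate:

```python
import os


def preallocate(path, size_bytes):
    """Create the file if needed and reserve size_bytes of space for it."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Allocate blocks for the whole range now, so that writes issued
        # later do not fill holes or extend the file.
        os.posix_fallocate(fd, 0, size_bytes)
    finally:
        os.close(fd)
    return os.stat(path).st_size
```

The helper returns the resulting file size, which is at least size_bytes
once the call succeeds.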


If you can't afford *any* blocking, then you'll have to farm out the
io_submit call to another thread, as another person mentioned in this
mail thread.


I don't think that file-extending writes will block on ext4 these days,
but I'd double check with an ext4 expert on that one.

Cheers,
Jeff

&lt;/pre&gt;</description>
    <dc:creator>Jeff Moyer</dc:creator>
    <dc:date>2012-05-16T20:53:41</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3063">
    <title>Re: libaio status</title>
    <link>http://permalink.gmane.org/gmane.linux.kernel.aio.general/3063</link>
    <description>&lt;pre&gt;libio is not truely asynchronous -- last time i checked. problem is
that io_submit may take far too long time.
I suggest you profile io_submit yourself. I was forced to move
io_submit to a different thread.


On Tue, May 15, 2012 at 4:46 PM, Leandro Lucarella &amp;lt;luca&amp;lt; at &amp;gt;llucax.com.ar&amp;gt; wrote:

&lt;/pre&gt;</description>
    <dc:creator>Raz</dc:creator>
    <dc:date>2012-05-16T20:07:00</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3062">
    <title>libaio status</title>
    <link>http://permalink.gmane.org/gmane.linux.kernel.aio.general/3062</link>
    <description>&lt;pre&gt;Hello. I'm looking for alternatives to do async I/O in Linux and it
looks like libaio is the only "native" way to do it, but the information
about it is really scarce and incomplete.

The libaio website (if [1] is the website) looks outdated and has a lot of
broken links. The manpage footers say Linux 2.4 and I can't find any
mention of the restrictions listed on the website, so I don't know
if those restrictions are up to date and accurate.

I found this document [2], which so far looks like the most
comprehensive and up to date resource available, but since it is not an
"official" document, I'm not sure how accurate it is.

My feeling is libaio is only good for O_DIRECT I/O on raw block
devices, but I need it for regular files on an ext4 filesystem (ideally
I shouldn't impose a limitation on the filesystem to use, but I guess
I could do that if necessary). I need to do I/O in a server that needs
to have extremely low latency, so I can't afford any type of blocking.
Using threads is not a viable option for other reasons.

Would you say libaio is good for what I need to do? If the limits
when used with an ext4 filesystem mentioned in [2] are correct, is there
any way to overcome them?

Thanks in advance!

[1] http://lse.sourceforge.net/io/aio.html
[2] http://code.google.com/p/kernel/wiki/AIOUserGuide

&lt;/pre&gt;</description>
    <dc:creator>Leandro Lucarella</dc:creator>
    <dc:date>2012-05-15T13:46:22</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3061">
    <title>Re: What's the usage of io_setup?</title>
    <link>http://permalink.gmane.org/gmane.linux.kernel.aio.general/3061</link>
    <description>&lt;pre&gt;Thanks, Jeff.

On April 19, 2012 at 10:51 PM, Jeff Moyer &amp;lt;jmoyer&amp;lt; at &amp;gt;redhat.com&amp;gt; wrote:

&lt;/pre&gt;</description>
    <dc:creator>Ryan Wang</dc:creator>
    <dc:date>2012-04-19T15:44:18</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3060">
    <title>Re: What's the usage of io_setup?</title>
    <link>http://permalink.gmane.org/gmane.linux.kernel.aio.general/3060</link>
    <description>&lt;pre&gt;

No, you have misread the code.


To submit I/O.  You can try looking through the man pages for a better
understanding.  Or, you could have a look at other code which uses the
aio interface (such as fio, or aio-stress, or any number of other
projects).

Cheers,
Jeff

&lt;/pre&gt;</description>
    <dc:creator>Jeff Moyer</dc:creator>
    <dc:date>2012-04-19T14:51:27</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3059">
    <title>What's the usage of io_setup?</title>
    <link>http://permalink.gmane.org/gmane.linux.kernel.aio.general/3059</link>
    <description>&lt;pre&gt;Hi,

I'm new to libaio, and have a question about io_setup.

I walked through the code of io_setup, and found that it frees the ioctx
after obtaining ioctx-&amp;gt;user_id. So the caller just gets a handle for a freed
ioctx, right?

So what's the usage of io_submit in real-world applications?

thanks,
Ryan
&lt;/pre&gt;</description>
    <dc:creator>Ryan Wang</dc:creator>
    <dc:date>2012-04-19T01:58:58</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3058">
    <title>Re: Where can I find the git tree for libaio?</title>
    <link>http://permalink.gmane.org/gmane.linux.kernel.aio.general/3058</link>
    <description>&lt;pre&gt;在 2012年4月17日 上午4:53，Jeff Moyer &amp;lt;jmoyer&amp;lt; at &amp;gt;redhat.com&amp;gt;写道：



&lt;/pre&gt;</description>
    <dc:creator>Ryan Wang</dc:creator>
    <dc:date>2012-04-17T00:08:34</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3057">
    <title>Re: Where can I find the git tree for libaio?</title>
    <link>http://permalink.gmane.org/gmane.linux.kernel.aio.general/3057</link>
    <description>&lt;pre&gt;

The new location is here:
  http://git.fedorahosted.org/git/?p=libaio.git

Sorry for the delay.

Cheers,
Jeff

&lt;/pre&gt;</description>
    <dc:creator>Jeff Moyer</dc:creator>
    <dc:date>2012-04-16T20:53:58</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3056">
    <title>Where can I find the git tree for libaio?</title>
    <link>http://permalink.gmane.org/gmane.linux.kernel.aio.general/3056</link>
    <description>&lt;pre&gt;Hi,

I've been studying libaio recently, and now I want to get the upstream source.
Where can I find the git tree for libaio, please?

thanks,
Ryan
&lt;/pre&gt;</description>
    <dc:creator>Ryan Wang</dc:creator>
    <dc:date>2012-04-16T06:18:04</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3055">
    <title>Re: [patch] aio: change a stray spin_unlock_bh() to spin_unlock()</title>
    <link>http://permalink.gmane.org/gmane.linux.kernel.aio.general/3055</link>
    <description>&lt;pre&gt;
Nice catch; folded into the offending commit

&lt;/pre&gt;</description>
    <dc:creator>Al Viro</dc:creator>
    <dc:date>2012-03-20T19:13:23</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3054">
    <title>[patch] aio: change a stray spin_unlock_bh() to spin_unlock()</title>
    <link>http://permalink.gmane.org/gmane.linux.kernel.aio.general/3054</link>
    <description>&lt;pre&gt;We missed this spin_unlock_bh() when we removed the _bh from the other
locks in cb22bbe9f7 "aio: aio_nr_lock is taken only synchronously now"

Signed-off-by: Dan Carpenter &amp;lt;dan.carpenter&amp;lt; at &amp;gt;oracle.com&amp;gt;

diff --git a/fs/aio.c b/fs/aio.c
index 7b6b9d5..4f71627 100644
--- a/fs/aio.c
+++ b/fs/aio.c
&amp;lt; at &amp;gt;&amp;lt; at &amp;gt; -280,7 +280,7 &amp;lt; at &amp;gt;&amp;lt; at &amp;gt; static struct kioctx *ioctx_alloc(unsigned nr_events)
 spin_lock(&amp;amp;aio_nr_lock);
 if (aio_nr + nr_events &amp;gt; aio_max_nr ||
     aio_nr + nr_events &amp;lt; aio_nr) {
-spin_unlock_bh(&amp;amp;aio_nr_lock);
+spin_unlock(&amp;amp;aio_nr_lock);
 goto out_cleanup;
 }
 aio_nr += ctx-&amp;gt;max_reqs;

&lt;/pre&gt;</description>
    <dc:creator>Dan Carpenter</dc:creator>
    <dc:date>2012-03-20T13:09:19</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3053">
    <title>Re: [PATCH] aio: fix the "too late munmap()" race</title>
    <link>http://permalink.gmane.org/gmane.linux.kernel.aio.general/3053</link>
    <description>&lt;pre&gt;

Looks good to me.

Reviewed-by: Jeff Moyer &amp;lt;jmoyer&amp;lt; at &amp;gt;redhat.com&amp;gt;


&lt;/pre&gt;</description>
    <dc:creator>Jeff Moyer</dc:creator>
    <dc:date>2012-03-08T18:15:20</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3052">
    <title>Re: [PATCH] aio: fix io_setup/io_destroy race</title>
    <link>http://permalink.gmane.org/gmane.linux.kernel.aio.general/3052</link>
    <description>&lt;pre&gt;

Al, you certainly are creative.  ;-) I agree with the problem and the
fix.  It would be nice, though, if you had added comments.

I ran xfstests ./check -g aio, and there were no problems.

Reviewed-by: Jeff Moyer &amp;lt;jmoyer&amp;lt; at &amp;gt;redhat.com&amp;gt;

&lt;/pre&gt;</description>
    <dc:creator>Jeff Moyer</dc:creator>
    <dc:date>2012-03-07T18:16:18</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3051">
    <title>[patch] aio: wake up waiters when freeing unused kiocbs</title>
    <link>http://permalink.gmane.org/gmane.linux.kernel.aio.general/3051</link>
    <description>&lt;pre&gt;Hi,

Bart Van Assche reported a hung fio process when either hot-removing
storage or when interrupting the fio process itself.  The (pruned) call
trace for the latter looks like so:

fio             D 0000000000000001     0  6849   6848 0x00000004
 ffff880092541b88 0000000000000046 ffff880000000000 ffff88012fa11dc0
 ffff88012404be70 ffff880092541fd8 ffff880092541fd8 ffff880092541fd8
 ffff880128b894d0 ffff88012404be70 ffff880092541b88 000000018106f24d
Call Trace:
 [&amp;lt;ffffffff813b683f&amp;gt;] schedule+0x3f/0x60
 [&amp;lt;ffffffff813b68ef&amp;gt;] io_schedule+0x8f/0xd0
 [&amp;lt;ffffffff81174410&amp;gt;] wait_for_all_aios+0xc0/0x100
 [&amp;lt;ffffffff81175385&amp;gt;] exit_aio+0x55/0xc0
 [&amp;lt;ffffffff810413cd&amp;gt;] mmput+0x2d/0x110
 [&amp;lt;ffffffff81047c1d&amp;gt;] exit_mm+0x10d/0x130
 [&amp;lt;ffffffff810482b1&amp;gt;] do_exit+0x671/0x860
 [&amp;lt;ffffffff81048804&amp;gt;] do_group_exit+0x44/0xb0
 [&amp;lt;ffffffff81058018&amp;gt;] get_signal_to_deliver+0x218/0x5a0
 [&amp;lt;ffffffff81002065&amp;gt;] do_signal+0x65/0x700
 [&amp;lt;ffffffff81002785&amp;gt;] do_notify_resume+0x65/0x80
 [&amp;lt;ffffffff813c0333&amp;gt;] int_signal+0x12/0x17

The problem lies with the allocation batching code.  It will
opportunistically allocate kiocbs, and then trim back the list of iocbs
when there is not enough room in the completion ring to hold all of the
events.  In the case above, what happens is that the pruning back of
events ends up freeing up the last active request and the context is
marked as dead, so it is thus responsible for waking up waiters.
Unfortunately, the code does not check for this condition, so we end up
with a hung task.
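
The missing wake-up can be illustrated with a toy model. This is a sketch
in Python, not the kernel code: the Context class and its method names are
hypothetical, and a threading.Condition stands in for the context's wait
queue and lock.

```python
import threading


class Context:
    """Toy stand-in for a kioctx: a request count, a dead flag, and waiters."""

    def __init__(self, reqs_active):
        self.reqs_active = reqs_active
        self.dead = False
        self.cond = threading.Condition()

    def wait_for_all(self, timeout=None):
        """Sleep until no requests remain active; return True on success."""
        with self.cond:
            return self.cond.wait_for(lambda: self.reqs_active == 0, timeout)

    def free_batch(self, count):
        """Release count unused requests from a trimmed batch."""
        with self.cond:
            self.reqs_active -= count
            # The step the fix adds: whoever drops the last request on a
            # dead context has to wake the waiters, or they sleep forever.
            if self.reqs_active == 0 and self.dead:
                self.cond.notify_all()
```

Without the notify_all() call, a thread already sleeping in wait_for_all()
with no timeout would never be woken, which mirrors the hang in the trace.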

Bart reports that the below patch has fixed the problem in his testing.

Cheers,
Jeff

Signed-off-by: Jeff Moyer &amp;lt;jmoyer&amp;lt; at &amp;gt;redhat.com&amp;gt;
Reported-and-Tested-by: Bart Van Assche &amp;lt;bvanassche&amp;lt; at &amp;gt;acm.org&amp;gt;

---
Note for stable: this should be applied to 3.2.

diff --git a/fs/aio.c b/fs/aio.c
index 969beb0..67e4b90 100644
--- a/fs/aio.c
+++ b/fs/aio.c
&amp;lt; at &amp;gt;&amp;lt; at &amp;gt; -490,6 +490,8 &amp;lt; at &amp;gt;&amp;lt; at &amp;gt; static void kiocb_batch_free(struct kioctx *ctx, struct kiocb_batch *batch)
 kmem_cache_free(kiocb_cachep, req);
 ctx-&amp;gt;reqs_active--;
 }
+if (unlikely(!ctx-&amp;gt;reqs_active &amp;amp;&amp;amp; ctx-&amp;gt;dead))
+wake_up_all(&amp;amp;ctx-&amp;gt;wait);
 spin_unlock_irq(&amp;amp;ctx-&amp;gt;ctx_lock);
 }
 

&lt;/pre&gt;</description>
    <dc:creator>Jeff Moyer</dc:creator>
    <dc:date>2012-02-16T19:56:15</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3050">
    <title>[PATCH 27/60] fs: remove the second argument of k[un]map_atomic()</title>
    <link>http://permalink.gmane.org/gmane.linux.kernel.aio.general/3050</link>
    <description>&lt;pre&gt;Acked-by: Benjamin LaHaise &amp;lt;bcrl&amp;lt; at &amp;gt;kvack.org&amp;gt;
Signed-off-by: Cong Wang &amp;lt;amwang&amp;lt; at &amp;gt;redhat.com&amp;gt;
---
 fs/aio.c            |   30 +++++++++++++++---------------
 fs/bio-integrity.c  |   10 +++++-----
 fs/exec.c           |    4 ++--
 fs/namei.c          |    4 ++--
 fs/pipe.c           |    8 ++++----
 fs/splice.c         |    7 ++-----
 include/linux/bio.h |    8 ++++----
 7 files changed, 34 insertions(+), 37 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 969beb0..04ae7e2 100644
--- a/fs/aio.c
+++ b/fs/aio.c
&amp;lt; at &amp;gt;&amp;lt; at &amp;gt; -160,7 +160,7 &amp;lt; at &amp;gt;&amp;lt; at &amp;gt; static int aio_setup_ring(struct kioctx *ctx)
 
 info-&amp;gt;nr = nr_events;/* trusted copy */
 
-ring = kmap_atomic(info-&amp;gt;ring_pages[0], KM_USER0);
+ring = kmap_atomic(info-&amp;gt;ring_pages[0]);
 ring-&amp;gt;nr = nr_events;/* user copy */
 ring-&amp;gt;id = ctx-&amp;gt;user_id;
 ring-&amp;gt;head = ring-&amp;gt;tail = 0;
&amp;lt; at &amp;gt;&amp;lt; at &amp;gt; -168,32 +168,32 &amp;lt; at &amp;gt;&amp;lt; at &amp;gt; static int aio_setup_ring(struct kioctx *ctx)
 ring-&amp;gt;compat_features = AIO_RING_COMPAT_FEATURES;
 ring-&amp;gt;incompat_features = AIO_RING_INCOMPAT_FEATURES;
 ring-&amp;gt;header_length = sizeof(struct aio_ring);
-kunmap_atomic(ring, KM_USER0);
+kunmap_atomic(ring);
 
 return 0;
 }
 
 
 /* aio_ring_event: returns a pointer to the event at the given index from
- * kmap_atomic(, km).  Release the pointer with put_aio_ring_event();
+ * kmap_atomic().  Release the pointer with put_aio_ring_event();
  */
 #define AIO_EVENTS_PER_PAGE(PAGE_SIZE / sizeof(struct io_event))
 #define AIO_EVENTS_FIRST_PAGE((PAGE_SIZE - sizeof(struct aio_ring)) / sizeof(struct io_event))
 #define AIO_EVENTS_OFFSET(AIO_EVENTS_PER_PAGE - AIO_EVENTS_FIRST_PAGE)
 
-#define aio_ring_event(info, nr, km) ({\
+#define aio_ring_event(info, nr) ({\
 unsigned pos = (nr) + AIO_EVENTS_OFFSET;\
 struct io_event *__event;\
 __event = kmap_atomic(\
-(info)-&amp;gt;ring_pages[pos / AIO_EVENTS_PER_PAGE], km); \
+(info)-&amp;gt;ring_pages[pos / AIO_EVENTS_PER_PAGE]); \
 __event += pos % AIO_EVENTS_PER_PAGE;\
 __event;\
 })
 
-#define put_aio_ring_event(event, km) do {\
+#define put_aio_ring_event(event) do {\
 struct io_event *__event = (event);\
 (void)__event;\
-kunmap_atomic((void *)((unsigned long)__event &amp;amp; PAGE_MASK), km); \
+kunmap_atomic((void *)((unsigned long)__event &amp;amp; PAGE_MASK)); \
 } while(0)
 
 static void ctx_rcu_free(struct rcu_head *head)
&amp;lt; at &amp;gt;&amp;lt; at &amp;gt; -1019,10 +1019,10 &amp;lt; at &amp;gt;&amp;lt; at &amp;gt; int aio_complete(struct kiocb *iocb, long res, long res2)
 if (kiocbIsCancelled(iocb))
 goto put_rq;
 
-ring = kmap_atomic(info-&amp;gt;ring_pages[0], KM_IRQ1);
+ring = kmap_atomic(info-&amp;gt;ring_pages[0]);
 
 tail = info-&amp;gt;tail;
-event = aio_ring_event(info, tail, KM_IRQ0);
+event = aio_ring_event(info, tail);
 if (++tail &amp;gt;= info-&amp;gt;nr)
 tail = 0;
 
&amp;lt; at &amp;gt;&amp;lt; at &amp;gt; -1043,8 +1043,8 &amp;lt; at &amp;gt;&amp;lt; at &amp;gt; int aio_complete(struct kiocb *iocb, long res, long res2)
 info-&amp;gt;tail = tail;
 ring-&amp;gt;tail = tail;
 
-put_aio_ring_event(event, KM_IRQ0);
-kunmap_atomic(ring, KM_IRQ1);
+put_aio_ring_event(event);
+kunmap_atomic(ring);
 
 pr_debug("added to ring %p at [%lu]\n", iocb, tail);
 
&amp;lt; at &amp;gt;&amp;lt; at &amp;gt; -1089,7 +1089,7 &amp;lt; at &amp;gt;&amp;lt; at &amp;gt; static int aio_read_evt(struct kioctx *ioctx, struct io_event *ent)
 unsigned long head;
 int ret = 0;
 
-ring = kmap_atomic(info-&amp;gt;ring_pages[0], KM_USER0);
+ring = kmap_atomic(info-&amp;gt;ring_pages[0]);
 dprintk("in aio_read_evt h%lu t%lu m%lu\n",
  (unsigned long)ring-&amp;gt;head, (unsigned long)ring-&amp;gt;tail,
  (unsigned long)ring-&amp;gt;nr);
&amp;lt; at &amp;gt;&amp;lt; at &amp;gt; -1101,18 +1101,18 &amp;lt; at &amp;gt;&amp;lt; at &amp;gt; static int aio_read_evt(struct kioctx *ioctx, struct io_event *ent)
 
 head = ring-&amp;gt;head % info-&amp;gt;nr;
 if (head != ring-&amp;gt;tail) {
-struct io_event *evp = aio_ring_event(info, head, KM_USER1);
+struct io_event *evp = aio_ring_event(info, head);
 *ent = *evp;
 head = (head + 1) % info-&amp;gt;nr;
 smp_mb(); /* finish reading the event before updatng the head */
 ring-&amp;gt;head = head;
 ret = 1;
-put_aio_ring_event(evp, KM_USER1);
+put_aio_ring_event(evp);
 }
 spin_unlock(&amp;amp;info-&amp;gt;ring_lock);
 
 out:
-kunmap_atomic(ring, KM_USER0);
+kunmap_atomic(ring);
 dprintk("leaving aio_read_evt: %d  h%lu t%lu\n", ret,
  (unsigned long)ring-&amp;gt;head, (unsigned long)ring-&amp;gt;tail);
 return ret;
diff --git a/fs/bio-integrity.c b/fs/bio-integrity.c
index c2183f3..e85c04b 100644
--- a/fs/bio-integrity.c
+++ b/fs/bio-integrity.c
&amp;lt; at &amp;gt;&amp;lt; at &amp;gt; -357,7 +357,7 &amp;lt; at &amp;gt;&amp;lt; at &amp;gt; static void bio_integrity_generate(struct bio *bio)
 bix.sector_size = bi-&amp;gt;sector_size;
 
 bio_for_each_segment(bv, bio, i) {
-void *kaddr = kmap_atomic(bv-&amp;gt;bv_page, KM_USER0);
+void *kaddr = kmap_atomic(bv-&amp;gt;bv_page);
 bix.data_buf = kaddr + bv-&amp;gt;bv_offset;
 bix.data_size = bv-&amp;gt;bv_len;
 bix.prot_buf = prot_buf;
&amp;lt; at &amp;gt;&amp;lt; at &amp;gt; -371,7 +371,7 &amp;lt; at &amp;gt;&amp;lt; at &amp;gt; static void bio_integrity_generate(struct bio *bio)
 total += sectors * bi-&amp;gt;tuple_size;
 BUG_ON(total &amp;gt; bio-&amp;gt;bi_integrity-&amp;gt;bip_size);
 
-kunmap_atomic(kaddr, KM_USER0);
+kunmap_atomic(kaddr);
 }
 }
 
&amp;lt; at &amp;gt;&amp;lt; at &amp;gt; -498,7 +498,7 &amp;lt; at &amp;gt;&amp;lt; at &amp;gt; static int bio_integrity_verify(struct bio *bio)
 bix.sector_size = bi-&amp;gt;sector_size;
 
 bio_for_each_segment(bv, bio, i) {
-void *kaddr = kmap_atomic(bv-&amp;gt;bv_page, KM_USER0);
+void *kaddr = kmap_atomic(bv-&amp;gt;bv_page);
 bix.data_buf = kaddr + bv-&amp;gt;bv_offset;
 bix.data_size = bv-&amp;gt;bv_len;
 bix.prot_buf = prot_buf;
&amp;lt; at &amp;gt;&amp;lt; at &amp;gt; -507,7 +507,7 &amp;lt; at &amp;gt;&amp;lt; at &amp;gt; static int bio_integrity_verify(struct bio *bio)
 ret = bi-&amp;gt;verify_fn(&amp;amp;bix);
 
 if (ret) {
-kunmap_atomic(kaddr, KM_USER0);
+kunmap_atomic(kaddr);
 return ret;
 }
 
&amp;lt; at &amp;gt;&amp;lt; at &amp;gt; -517,7 +517,7 &amp;lt; at &amp;gt;&amp;lt; at &amp;gt; static int bio_integrity_verify(struct bio *bio)
 total += sectors * bi-&amp;gt;tuple_size;
 BUG_ON(total &amp;gt; bio-&amp;gt;bi_integrity-&amp;gt;bip_size);
 
-kunmap_atomic(kaddr, KM_USER0);
+kunmap_atomic(kaddr);
 }
 
 return ret;
diff --git a/fs/exec.c b/fs/exec.c
index 92ce83a..7043408e 100644
--- a/fs/exec.c
+++ b/fs/exec.c
&amp;lt; at &amp;gt;&amp;lt; at &amp;gt; -1339,13 +1339,13 &amp;lt; at &amp;gt;&amp;lt; at &amp;gt; int remove_arg_zero(struct linux_binprm *bprm)
 ret = -EFAULT;
 goto out;
 }
-kaddr = kmap_atomic(page, KM_USER0);
+kaddr = kmap_atomic(page);
 
 for (; offset &amp;lt; PAGE_SIZE &amp;amp;&amp;amp; kaddr[offset];
 offset++, bprm-&amp;gt;p++)
 ;
 
-kunmap_atomic(kaddr, KM_USER0);
+kunmap_atomic(kaddr);
 put_arg_page(page);
 
 if (offset == PAGE_SIZE)
diff --git a/fs/namei.c b/fs/namei.c
index 208c6aa..dcc4f42 100644
--- a/fs/namei.c
+++ b/fs/namei.c
&amp;lt; at &amp;gt;&amp;lt; at &amp;gt; -3347,9 +3347,9 &amp;lt; at &amp;gt;&amp;lt; at &amp;gt; retry:
 if (err)
 goto fail;
 
-kaddr = kmap_atomic(page, KM_USER0);
+kaddr = kmap_atomic(page);
 memcpy(kaddr, symname, len-1);
-kunmap_atomic(kaddr, KM_USER0);
+kunmap_atomic(kaddr);
 
 err = pagecache_write_end(NULL, mapping, 0, len-1, len-1,
 page, fsdata);
diff --git a/fs/pipe.c b/fs/pipe.c
index a932ced..fe0502f 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
&amp;lt; at &amp;gt;&amp;lt; at &amp;gt; -230,7 +230,7 &amp;lt; at &amp;gt;&amp;lt; at &amp;gt; void *generic_pipe_buf_map(struct pipe_inode_info *pipe,
 {
 if (atomic) {
 buf-&amp;gt;flags |= PIPE_BUF_FLAG_ATOMIC;
-return kmap_atomic(buf-&amp;gt;page, KM_USER0);
+return kmap_atomic(buf-&amp;gt;page);
 }
 
 return kmap(buf-&amp;gt;page);
&amp;lt; at &amp;gt;&amp;lt; at &amp;gt; -251,7 +251,7 &amp;lt; at &amp;gt;&amp;lt; at &amp;gt; void generic_pipe_buf_unmap(struct pipe_inode_info *pipe,
 {
 if (buf-&amp;gt;flags &amp;amp; PIPE_BUF_FLAG_ATOMIC) {
 buf-&amp;gt;flags &amp;amp;= ~PIPE_BUF_FLAG_ATOMIC;
-kunmap_atomic(map_data, KM_USER0);
+kunmap_atomic(map_data);
 } else
 kunmap(buf-&amp;gt;page);
 }
&amp;lt; at &amp;gt;&amp;lt; at &amp;gt; -565,14 +565,14 &amp;lt; at &amp;gt;&amp;lt; at &amp;gt; redo1:
 iov_fault_in_pages_read(iov, chars);
 redo2:
 if (atomic)
-src = kmap_atomic(page, KM_USER0);
+src = kmap_atomic(page);
 else
 src = kmap(page);
 
 error = pipe_iov_copy_from_user(src, iov, chars,
 atomic);
 if (atomic)
-kunmap_atomic(src, KM_USER0);
+kunmap_atomic(src);
 else
 kunmap(page);
 
diff --git a/fs/splice.c b/fs/splice.c
index 1ec0493..f16402e 100644
--- a/fs/splice.c
+++ b/fs/splice.c
&amp;lt; at &amp;gt;&amp;lt; at &amp;gt; -737,15 +737,12 &amp;lt; at &amp;gt;&amp;lt; at &amp;gt; int pipe_to_file(struct pipe_inode_info *pipe, struct pipe_buffer *buf,
 goto out;
 
 if (buf-&amp;gt;page != page) {
-/*
- * Careful, -&amp;gt;map() uses KM_USER0!
- */
 char *src = buf-&amp;gt;ops-&amp;gt;map(pipe, buf, 1);
-char *dst = kmap_atomic(page, KM_USER1);
+char *dst = kmap_atomic(page);
 
 memcpy(dst + offset, src + buf-&amp;gt;offset, this_len);
 flush_dcache_page(page);
-kunmap_atomic(dst, KM_USER1);
+kunmap_atomic(dst);
 buf-&amp;gt;ops-&amp;gt;unmap(pipe, buf, src);
 }
 ret = pagecache_write_end(file, mapping, sd-&amp;gt;pos, this_len, this_len,
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 129a9c0..de5422a 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -101,10 +101,10 @@ static inline int bio_has_allocated_vec(struct bio *bio)
  * I/O completely on that queue (see ide-dma for example)
  */
 #define __bio_kmap_atomic(bio, idx, kmtype)\
-(kmap_atomic(bio_iovec_idx((bio), (idx))-&amp;gt;bv_page, kmtype) +\
+(kmap_atomic(bio_iovec_idx((bio), (idx))-&amp;gt;bv_page) +\
 bio_iovec_idx((bio), (idx))-&amp;gt;bv_offset)
 
-#define __bio_kunmap_atomic(addr, kmtype) kunmap_atomic(addr, kmtype)
+#define __bio_kunmap_atomic(addr, kmtype) kunmap_atomic(addr)
 
 /*
  * merge helpers etc
@@ -317,7 +317,7 @@ static inline char *bvec_kmap_irq(struct bio_vec *bvec, unsigned long *flags)
  * balancing is a lot nicer this way
  */
 local_irq_save(*flags);
-addr = (unsigned long) kmap_atomic(bvec-&amp;gt;bv_page, KM_BIO_SRC_IRQ);
+addr = (unsigned long) kmap_atomic(bvec-&amp;gt;bv_page);
 
 BUG_ON(addr &amp;amp; ~PAGE_MASK);
 
@@ -328,7 +328,7 @@ static inline void bvec_kunmap_irq(char *buffer, unsigned long *flags)
 {
 unsigned long ptr = (unsigned long) buffer &amp;amp; PAGE_MASK;
 
-kunmap_atomic((void *) ptr, KM_BIO_SRC_IRQ);
+kunmap_atomic((void *) ptr);
 local_irq_restore(*flags);
 }
 
&lt;/pre&gt;</description>
    <dc:creator>Cong Wang</dc:creator>
    <dc:date>2012-02-10T05:39:48</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3049">
    <title>Re: [PATCH] AIO: Don't plug the I/O queue in do_io_submit()</title>
    <link>http://permalink.gmane.org/gmane.linux.kernel.aio.general/3049</link>
    <description>&lt;pre&gt;

I believe the original plugging here was done on a per-fd basis.  So, I
concede that the behaviour may have changed a bit since the initial
patch for this was merged.


I have a patch slated for 3.2 that should help with that.  It batches the
allocation of the aio requests, which showed a good improvement in
microbenchmarks.

commit 080d676de095a14ecba14c0b9a91acb5bbb634df
Author: Jeff Moyer &amp;lt;jmoyer&amp;lt; at &amp;gt;redhat.com&amp;gt;
Date:   Wed Nov 2 13:40:10 2011 -0700

    aio: allocate kiocbs in batches

Cheers,
Jeff

--
To unsubscribe, send a message with 'unsubscribe linux-aio' in
the body to majordomo&amp;lt; at &amp;gt;kvack.org.  For more info on Linux AIO,
see: http://www.kvack.org/aio/
Don't email: &amp;lt;a href=mailto:"aart&amp;lt; at &amp;gt;kvack.org"&amp;gt;aart&amp;lt; at &amp;gt;kvack.org&amp;lt;/a&amp;gt;

&lt;/pre&gt;</description>
    <dc:creator>Jeff Moyer</dc:creator>
    <dc:date>2011-12-16T14:45:07</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3048">
    <title>Re: [PATCH] AIO: Don't plug the I/O queue in do_io_submit()</title>
    <link>http://permalink.gmane.org/gmane.linux.kernel.aio.general/3048</link>
    <description>&lt;pre&gt;
Each io_submit call is sending down about 34K of IO to two different devices.
The latencies were measured just on the process writing the redo
logs, so it is a very specific subset of the overall benchmark.

The patched kernel only does 4x more iops for the redo logs than the
unpatched kernel, so we're talking ~8K ios here.

-chris

&lt;/pre&gt;</description>
    <dc:creator>Chris Mason</dc:creator>
    <dc:date>2011-12-15T16:40:49</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3047">
    <title>Re: [PATCH] AIO: Don't plug the I/O queue in do_io_submit()</title>
    <link>http://permalink.gmane.org/gmane.linux.kernel.aio.general/3047</link>
    <description>&lt;pre&gt;
I think that would indeed be an interesting addition to test on top of
the 3.0 kernel being used.

This is a bit of a sticky situation. We want the plugging and merging on
rotational storage, and on SSDs we want the batch addition to the queue
to avoid hammering on the queue lock. At this level, we have no idea.
But we don't want to introduce longer latencies. So the question is: are
these latencies due to long queues (and hence would be helped by the
auto-replug on 3.1 and newer), or are they due to the submissions
running for too long? If the latter, then we can either look into
reducing the time spent between submitting the individual pieces, or at
least avoid holding things up for too long.
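
The trade-off above can be sketched with a toy model. This is a minimal
Python illustration, not kernel code: PlugList, submit, and max_pending are
hypothetical names, the shared queue is a plain list, and each flush stands
in for one acquisition of the queue lock. Requests are held on a private
list and handed over in batches; the list is also flushed once it reaches
max_pending entries, loosely mirroring the auto-flush in 3.1 and newer.

```python
# Toy model of per-submitter plugging; all names here are hypothetical.
class PlugList:
    """Holds requests privately and hands them to the queue in batches."""

    def __init__(self, queue, max_pending):
        self.queue = queue              # shared device queue (a plain list here)
        self.max_pending = max_pending  # assumed auto-flush threshold
        self.pending = []               # requests not yet visible to the device
        self.flushes = 0                # one flush models one queue-lock acquisition

    def add(self, request):
        self.pending.append(request)
        # Auto-flush: do not let requests wait behind the plug indefinitely.
        if len(self.pending) == self.max_pending:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        self.queue.extend(self.pending)
        self.pending.clear()
        self.flushes += 1


def submit(requests, max_pending):
    """Submit all requests through a plug and return (queue, flush count)."""
    queue = []
    plug = PlugList(queue, max_pending)
    for request in requests:
        plug.add(request)
    plug.flush()  # unplug at the end of the submission
    return queue, plug.flushes
```

In this model, 40 requests with a threshold of 16 are delivered in three
flushes rather than 40 separate queue insertions; a smaller threshold means
more flushes but less time spent waiting behind the plug.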

&lt;/pre&gt;</description>
    <dc:creator>Jens Axboe</dc:creator>
    <dc:date>2011-12-15T16:15:26</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.kernel.aio.general/3046">
    <title>Re: [PATCH] AIO: Don't plug the I/O queue in do_io_submit()</title>
    <link>http://permalink.gmane.org/gmane.linux.kernel.aio.general/3046</link>
    <description>&lt;pre&gt;2011/12/14 Dave Kleikamp &amp;lt;dave.kleikamp&amp;lt; at &amp;gt;oracle.com&amp;gt;:
Can you explain why this can help? Note that in the 3.1 kernel we now force
a flush of the plug list if the list is too long, which will remove a lot
of latency.

&lt;/pre&gt;</description>
    <dc:creator>Shaohua Li</dc:creator>
    <dc:date>2011-12-15T01:09:04</dc:date>
  </item>
  <textinput rdf:about="http://search.gmane.org/?group=$group=gmane.linux.kernel.aio.general">
    <title>Search Engine</title>
    <description>Search the mailing list at Gmane</description>
    <name>query</name>
    <link>http://search.gmane.org/?group=$group=gmane.linux.kernel.aio.general</link>
  </textinput>
</rdf:RDF>

