<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/">
  <channel rdf:about="http://blog.gmane.org/gmane.linux.file-systems">
    <title>gmane.linux.file-systems</title>
    <link>http://blog.gmane.org/gmane.linux.file-systems</link>
    <description/>
    <syn:updatePeriod>hourly</syn:updatePeriod>
    <syn:updateFrequency>1</syn:updateFrequency>
    <syn:updateBase>1901-01-01T00:00+00:00</syn:updateBase>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/64644"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/64643"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/64637"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/64636"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/64628"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/64625"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/64618"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/64617"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/64613"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/64612"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/64611"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/64610"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/64608"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/64607"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/64606"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/64598"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/64588"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/64585"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/64584"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/64581"/>
      </rdf:Seq>
    </items>
    <image rdf:resource="http://gmane.org/img/gmane-25t.png"/>
    <textinput rdf:resource=""/>
  </channel>
  <image rdf:about="http://gmane.org/img/gmane-25t.png">
    <title>Gmane</title>
    <url>http://gmane.org/img/gmane-25t.png</url>
    <link>http://gmane.org</link>
  </image>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/64644">
    <title>The $1,549 per day ZERO traffic system (UPDATE) Recommends Advanced Sports to you</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/64644</link>
    <description>&lt;pre&gt;Email        :sniperxsystem&amp;lt; at &amp;gt;support.com
Friend Name  :Friend
Friend Email :linux-fsdevel&amp;lt; at &amp;gt;vger.kernel.org
comment      :

Listen to this... pretty crazy...

So many people rushed to download this 
$530k/year system yesterday...

That they crashed the ENTIRE server!

=&amp;gt;&amp;gt;http://www.sniperxsystem.com/?code=4fbea9cb964f6&amp;lt;&amp;lt;=

(The site was down ALL day) Pretty crazy. 

... The "ghetto" video alone has sent shockwaves 
through the Clickbank community.

Can you believe THIS guy's one of Clickbanks
biggest super affiliates?

=&amp;gt;&amp;gt;http://www.sniperxsystem.com/?code=4fbea9cb964f6&amp;lt;&amp;lt;=

Talk soon

P.S. This is **BRAND NEW**...

It works and it's made $1,549.87 a DAY for 
the past 739 days in a ROW!

No PPC, no PPV, no CPA, no so-called 'push
button softwares' scams, no 'loopholes'...

Something TOTALLY different.

Check it out (fast, while it's still open):

=&amp;gt;&amp;gt;http://www.sniperxsystem.com/?code=4fbea9cb964f6&amp;lt;&amp;lt;=




--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo&amp;lt; at &amp;gt;vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

&lt;/pre&gt;</description>
    <dc:creator>sniperxsystem&lt; at &gt;support.com</dc:creator>
    <dc:date>2012-05-25T01:26:47</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/64643">
    <title>Re: [RFC PATCH 2/5] block: Do not stop draining if waitqueue is not empty.</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/64643</link>
    <description>&lt;pre&gt;Hi, Tejun and Jens

On 05/23/2012 10:54 PM, Asias He wrote:

Ping.

&lt;/pre&gt;</description>
    <dc:creator>Asias He</dc:creator>
    <dc:date>2012-05-25T01:16:47</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/64637">
    <title>Re: [PATCH v2 11/14] block: Rework bio splitting</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/64637</link>
    <description>&lt;pre&gt;
Ahh, I saw that comment but I missed what it was that you wanted me to
reorder. Will do.


Yeah, I'll do that.


Yes. 


Yes. Not masking out __GFP_WAIT would only be safe if you could
guarantee that you never tried to split a bio more than once (from the
same bio set), and IMO that'd be a terrible thing to rely on.


When what's not the same?


They are.


I agree that it is hacky, but shifting the responsibility onto the
caller would IMO be much more likely to lead to buggy code in the
future (and those deadlocks are not going to be easy to track down).

It might be better to mask out __GFP_WAIT in bio_alloc_bioset(), I'm not
sure.


Doesn't matter. If you allocate a split, it won't free itself until the
IO is submitted and completes; when current-&amp;gt;bio_list != NULL, the bio
cannot complete until you return.


Yeah, caller responsibility. Will do.


I thought about that, but this works for now - my preferred solution is
to make bi_io_vec immutable (that'll require a bi_bvec_offset field in
struct bio, and the end result is we'd be able to split mid bvec and
share the bvec with both bios).


Don't follow... If you're processing a bio starting from the smallest
sector, though, you want the split to be the front of the bio; otherwise,
if you split multiple times, you'll try to split a split and deadlock.
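
To illustrate splitting from the front, here is a toy Python model. It is
only a sketch under assumptions: a request is reduced to a pair
(start_sector, nr_sectors), max_sectors stands in for whatever limit the
device imposes, and split_front is a hypothetical name, not a kernel
function.

```python
def split_front(start_sector, nr_sectors, max_sectors):
    """Carve (start_sector, nr_sectors) into pieces of at most max_sectors."""
    # min(x, 1) == 1 holds exactly when x is at least 1.
    if min(max_sectors, 1) != 1:
        raise ValueError("max_sectors must be positive")
    if min(nr_sectors, 0) != 0:
        raise ValueError("nr_sectors must not be negative")
    pieces = []
    while nr_sectors != 0:
        n = min(nr_sectors, max_sectors)
        # The piece handed out is always the front of what is left ...
        pieces.append((start_sector, n))
        # ... so the remainder is still the original request, never a split.
        start_sector += n
        nr_sectors -= n
    return pieces


print(split_front(0, 20, 8))
```

Because every piece is taken from the front, the loop only ever shrinks the
original request and never has to split something that is itself a split.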

&lt;/pre&gt;</description>
    <dc:creator>Kent Overstreet</dc:creator>
    <dc:date>2012-05-24T21:27:23</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/64636">
    <title>Re: [rfc v2 0/7] procfs fdinfo extension v2</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/64636</link>
    <description>&lt;pre&gt;
OK, no problem -- if anything I'm glad to have verification that this
change is not needed.

Cheers,
-Matt


&lt;/pre&gt;</description>
    <dc:creator>Matt Helsley</dc:creator>
    <dc:date>2012-05-24T20:32:04</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/64628">
    <title>Re: [PATCH v2 04/14] block: Add bio_clone_kmalloc()</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/64628</link>
    <description>&lt;pre&gt;
[..]

Is this code correct? The original code might clone the bio after the split,
whereas the new code will clone the original bio itself and not the split one.

Thanks
Vivek

&lt;/pre&gt;</description>
    <dc:creator>Vivek Goyal</dc:creator>
    <dc:date>2012-05-24T18:59:19</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/64625">
    <title>Description of HFS+ compression</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/64625</link>
    <description>&lt;pre&gt;Hello, all. I've looked into how Mac OS X compresses files using "transparent compression".
Since I don't plan to use this data now, I've thought it may be a good idea to document my
findings, perhaps someone may implement compressed reader. It's conceptually and in
implementation very similar to zisofs.
I suppose that reader is familiar with
http://developer.apple.com/legacy/mac/library/#technotes/tn/tn1150.html
Compression used: zlib
Block size: 64K
Bits missing from TN1150:
Attributes key (big-endian):
uint16_t   | unknown | always zero
uint32_t   | cnid    | file id of parent, most likely, not checked
uint32_t   | unknown | always zero
uint16_t   | namelen | length of name
uint16_t[] | name    | name in UTF-16BE

Attributes header (start of the value in attributes key), (big-endian):
uint8[3]   | unknown | always zero
uint8_t    | type    | only 0x10 = inline is used for com.apple.decmpfs, attribute itself follows
uint32_t   | unknown | always zero
uint64_t   | size    | size of attribute

Compressed attribute header (little-endian):
uint32_t   | magic             | "fpmc"
uint32_t   | unknown           | always 3
uint32_t   | uncompressed_size | uncompressed size if inline, 8 otherwise
uint32_t   | unknown           | always 0

If there is only one block and it's small enough, it is stored directly after the header.
Otherwise "# dummy\n" is stored instead and the compressed data is stored in the resource fork of the file in question.

Mac OS X uses the following headers to make the data masquerade as some kind of resource:
Resource fork header (big-endian):
uint32_t  | header_size | always 0x100
uint32_t  | size        | total_compressed_size + seek_block_size + 4 + 0x100 
uint32_t  | size        | total_compressed_size + seek_block_size + 4
uint32_t  | unknown     | always 0x32
uint8_t[0xf0] | unknown | zero-filled
uint8_t   | size        | total_compressed_size + seek_block_size
It is followed by the seek block, which starts with (little-endian):
uint32_t  | nentries    | number of entries that follow
The entries are (little-endian):
uint32_t  | compressed_offset (offset 0 corresponds to the nentries field)
uint32_t  | compressed_size

The zlib-compressed blocks follow.

The trailer is 50 bytes and always has the same contents:
0000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000010: 0000 0000 0000 0000 001c 0032 0000 636d  ...........2..cm
0000020: 7066 0000 000a 0001 ffff 0000 0000 0000  pf..............
0000030: 0000 
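
As an illustration of the compressed attribute header described above, here
is a minimal Python sketch that parses the 16-byte header and inflates an
inline block. It is a sketch under assumptions, not a tested reader: the
function names and the unknown1/unknown2 field names are hypothetical, the
magic is assumed to appear on disk as the bytes "fpmc", and the sample value
is synthetic rather than taken from a real volume.

```python
import zlib

HEADER_SIZE = 16  # four little-endian uint32 fields, per the table above


def parse_decmpfs_header(blob):
    """Split an attribute value into its header fields and the payload."""
    # len(blob) in range(16) is true exactly for lengths 0..15.
    if len(blob) in range(HEADER_SIZE):
        raise ValueError("attribute value shorter than the 16-byte header")
    fields = [int.from_bytes(blob[i:i + 4], "little")
              for i in range(0, HEADER_SIZE, 4)]
    magic = blob[0:4]
    if magic != b"fpmc":
        raise ValueError("bad magic: %r" % magic)
    return {
        "magic": magic,
        "unknown1": fields[1],            # described above as always 3
        "uncompressed_size": fields[2],
        "unknown2": fields[3],            # described above as always 0
        "payload": blob[HEADER_SIZE:],
    }


def read_inline(blob):
    """Return the file contents when the single block is stored inline."""
    header = parse_decmpfs_header(blob)
    if header["payload"] == b"# dummy\n":
        raise ValueError("data lives in the resource fork, not inline")
    data = zlib.decompress(header["payload"])
    if len(data) != header["uncompressed_size"]:
        raise ValueError("uncompressed size does not match the header")
    return data


# Synthetic example, built to match the layout rather than copied from a disk.
body = b"hello, hfs+\n"
sample = (b"fpmc"
          + (3).to_bytes(4, "little")
          + len(body).to_bytes(4, "little")
          + (0).to_bytes(4, "little")
          + zlib.compress(body))
print(read_inline(sample))
```

The resource fork case is deliberately left out; it would need the seek
block described above to locate each 64K block.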


&lt;/pre&gt;</description>
    <dc:creator>Vladimir 'φ-coder/phcoder' Serbinenko</dc:creator>
    <dc:date>2012-05-24T18:38:35</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/64618">
    <title>Re: [PATCH v2 11/14] block: Rework bio splitting</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/64618</link>
    <description>&lt;pre&gt;




Again, could you please take this bio_split() and put it below the
implementation of bio_pair_split()? That way we can better review
what the changes actually are. Right now the diff is a complete
mess, with the deleted lines of bio_pair_release() interleaved
with the new lines of bio_split().
Please?





This is a freaking important and capable exported function.
Could you please add a comment on what it does and what
its limitations are?

For example, the returned bio is the beginning of the chain
and the original is the remainder, right?





Is this true also when @bio is not from @bs?

Is it supported at all when they are not the same?

Are kmalloc bios not splittable?

Please put the answers to these in the comment above.

In the split you have a single bio, with or without a bvec allocation;
should you not let the caller make sure not to set __GFP_WAIT?

For me, inspecting current-&amp;gt;bio_list is out of context and a complete
hack. The caller, which has more context, should take care of it.

For example I might want to use split from OSD code where I do
not use an elevator at all, and current-&amp;gt;bio_list could belong
to a completely different device. (Maybe)





I don't see below any references taken by "ret" on @bio.
What protects @bio from being freed before "ret"?

If it's the caller's responsibility, please say so in the
comment above.





We could in this case economize and allocate only one more bio with a single
bio_vec and chain it to the end.

And if you change it around, so that the remainder is "ret" and the
beginning of the chain is left in @bio, you wouldn't even need the
extra bio. (Trim the last and a single-bvec bio holds the remainder + remainder-chain)

Thanks
Boaz






&lt;/pre&gt;</description>
    <dc:creator>Boaz Harrosh</dc:creator>
    <dc:date>2012-05-24T16:56:03</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/64617">
    <title>Re: [PATCH v2 03/14] block: Add bio_clone_bioset()</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/64617</link>
    <description>&lt;pre&gt;
Which consolidations are happening and which drivers are being
affected how?


Why is this safe?  If it's only bioset related API changes, why are
there other changes at all?  If the new clone interface can handle
bioset fine, do we still need to expose __bio_clone()?


Why is idx != bi_idx test dropped?

I'm gonna stop here on this series.  It doesn't seem like the issues
pointed out before have been addressed.  I recommend spending more
effort on patch descriptions.  Writing descriptions is not only
important for reviewing and history but it's a good step in ensuring
the patches are sane and properly split.  If you can't explain each
change in the patch, it generally means either the changes themselves
are wrong or wrongly split.

Thanks.

&lt;/pre&gt;</description>
    <dc:creator>Tejun Heo</dc:creator>
    <dc:date>2012-05-24T16:38:31</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/64613">
    <title>Re: [PATCH 00/16] vfs: atomic open v4 (part 1)</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/64613</link>
    <description>&lt;pre&gt;
I'd also recommend changing the "ok" and "common" labels in do_last() to
something a bit more meaningful, perhaps:

common -&amp;gt; finish_open
ok -&amp;gt; finish_open_may_want_write

Also, does it make sense to combine:

	if (!S_ISREG(nd-&amp;gt;inode-&amp;gt;i_mode))
		will_truncate = 0;

with:

	int will_truncate = open_flag &amp;amp; O_TRUNC;

up at the top of the function.

As the code stands, if -&amp;gt;atomic_open() opens the file but does not create it,
handle_truncate() will be called on it even if it is not a regular file,
whereas by the normal path, it won't.

I would also be tempted to move the body of:

	if (filp == ERR_PTR(-EOPENSTALE) &amp;amp;&amp;amp; save_parent.dentry &amp;amp;&amp;amp; !retried) {
		BUG_ON(save_parent.dentry != dir);
		path_put(&amp;amp;nd-&amp;gt;path);
		nd-&amp;gt;path = save_parent;
		nd-&amp;gt;inode = dir-&amp;gt;d_inode;
		save_parent.mnt = NULL;
		save_parent.dentry = NULL;
		if (want_write) {
			mnt_drop_write(nd-&amp;gt;path.mnt);
			want_write = 0;
		}
		retried = true;
		goto retry_lookup;
	}

before the retry_lookup label and then goto around it from the preceding
if-else statement or place it at the bottom to make the "common:" block simpler
to read.  Also, you could nest the if (filp == ERR_PTR(-EOPENSTALE)...) inside
if (IS_ERR(filp)).

Can I also suggest being consistent about the use of int v bool?  "created"
and "retried" are bool, but "will_truncate", "want_write" and "symlink_ok" are
not.  Granted some of this is likely inherited from the previous incarnation.

David

&lt;/pre&gt;</description>
    <dc:creator>David Howells</dc:creator>
    <dc:date>2012-05-24T15:52:02</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/64612">
    <title>Re: [PATCH 00/16] vfs: atomic open v4 (part 1)</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/64612</link>
    <description>&lt;pre&gt;
I've been looking at your patches when they're all applied, and I suspect
you're missing some security calls.

For instance, in lookup_open(), you call security_path_mknod() prior to
calling vfs_create(), but you don't call it prior to calling atomic_open() or
in, say, nfs_atomic_open().  You do need to, though I can see it's
difficult to work out where.  Is it possible to call it if O_CREAT is
specified and d_inode is NULL right before calling atomic_open()?

I'm also wondering if you're missing an audit_inode() call in the if (created)
path after the retry_lookup label.

David

&lt;/pre&gt;</description>
    <dc:creator>David Howells</dc:creator>
    <dc:date>2012-05-24T15:07:26</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/64611">
    <title>Re: exofs/ore: allocation of _ore_get_io_state()</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/64611</link>
    <description>&lt;pre&gt;


Personally, I hit it with scsi-lib's sg_table allocations bigger than
PAGE_SIZE (because of a bug). It is currently capped at PAGE_SIZE.
Other people reported the same failures and great performance degradation
when allocating BIOs and BIO_VECs larger than PAGE_SIZE.

It's simply the old and known page-fragmentation problem. It's
why virtual memory was invented in the first place.
kmalloc is not a virtual allocator.



Welcome to Linux Kernel 101. vmalloc is tenfold slower than
kmalloc. And in principle the same will happen: multiple discrete
pages will be allocated and collected together, but now you will need
to set up TLB entries and make sure they are mapped in when needed.
(Every interrupt, every context switch)

This single fact of "Linux Kernel code does not use VM" is a
tenfold speed gain over the Windows Kernel, measured.



Boaz

&lt;/pre&gt;</description>
    <dc:creator>Boaz Harrosh</dc:creator>
    <dc:date>2012-05-24T14:05:42</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/64610">
    <title>Re: Hole punching and mmap races</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/64610</link>
    <description>&lt;pre&gt;  Yes, this is a nice summary of the most interesting cases. For completeness,
here are the remaining cases:
  8. mmap vs writeback (page lock)
  9. writeback vs direct IO (as direct IO vs buffered IO)
 10. writeback vs buffered IO (page lock)
 11. direct IO vs truncate (dio_wait)
 12. direct IO vs hole punch (dio_wait)
 13. buffered IO vs truncate (i_mutex for writes, i_size/page lock for reads)
 14. buffered IO vs hole punch (fs dependent, broken for ext4)
 15. truncate vs hole punch (fs dependent)
 16. mmap vs mmap (page lock)
 17. writeback vs writeback (page lock)
 18. direct IO vs direct IO (i_mutex or fs dependent)
 19. buffered IO vs buffered IO (i_mutex for writes, page lock for reads)
 20. truncate vs truncate (i_mutex)
 21. punch hole vs punch hole (fs dependent)

  Yes, looking at the above table, the number of different synchronization
mechanisms is really striking. So we should probably look at the
possibility of unifying at least some of the cases.

Honza
&lt;/pre&gt;</description>
    <dc:creator>Jan Kara</dc:creator>
    <dc:date>2012-05-24T12:35:38</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/64608">
    <title>Activate your free commission shop now Recommends Advanced Sports to you</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/64608</link>
    <description>&lt;pre&gt;Email        :kimtan&amp;lt; at &amp;gt;gmail.com
Friend Name  :Friend
Friend Email :linux-fsdevel&amp;lt; at &amp;gt;vger.kernel.org
comment      :

Hey,

Your commission shop is ready.
Activate your free commission shop now:

==&amp;gt; http://bit.ly/KUXNMS

In a few minutes, your commission shop will be ready, 
attracting buyers from all around the world.

And each shop is designed to suck commissions and sales 
for you automatically. 

This is the first time this revolutionary technique is applied.

So don't delay. Activate your commission shop now:

==&amp;gt; http://bit.ly/KUXNMS

Sincerely,
Kim Tan




&lt;/pre&gt;</description>
    <dc:creator>kimtan&lt; at &gt;gmail.com</dc:creator>
    <dc:date>2012-05-24T11:52:43</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/64607">
    <title>Re: exofs/ore: allocation of _ore_get_io_state()</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/64607</link>
    <description>&lt;pre&gt;
What allocation sizes (of struct __alloc_all_io_state) are we talking
about? how many devices per I/O did you encounter?


Why not use virtual memory? Is this limitation imposed by the OSD
initiator or by some other layer in the OSD stack?


&lt;/pre&gt;</description>
    <dc:creator>Idan Kedar</dc:creator>
    <dc:date>2012-05-24T11:23:48</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/64606">
    <title>Activate your free commission shop now Recommends Advanced Sports to you</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/64606</link>
    <description>&lt;pre&gt;Email        :kimtan&amp;lt; at &amp;gt;gmail.com
Friend Name  :Friend
Friend Email :linux-fsdevel&amp;lt; at &amp;gt;vger.kernel.org
comment      :

Hey,

Your commission shop is ready.
Activate your free commission shop now:

==&amp;gt; http://bit.ly/KUXNMS

In a few minutes, your commission shop will be ready, 
attracting buyers from all around the world.

And each shop is designed to suck commissions and sales 
for you automatically. 

This is the first time this revolutionary technique is applied.

So don't delay. Activate your commission shop now:

==&amp;gt; http://bit.ly/KUXNMS

Sincerely,
Kim Tan




&lt;/pre&gt;</description>
    <dc:creator>kimtan&lt; at &gt;gmail.com</dc:creator>
    <dc:date>2012-05-24T05:59:21</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/64598">
    <title>Re: [dm-devel] [PATCH v2 02/14] dm: kill dm_rq_bio_destructor</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/64598</link>
    <description>&lt;pre&gt;Hi,

On 05/24/12 09:02, Kent Overstreet wrote:

The destructor may also be called from blk_rq_unprep_clone(),
which just puts the bio.
So this patch will introduce a memory leak.

Please check this comment as well:
https://www.redhat.com/archives/dm-devel/2012-May/msg00216.html

Thanks,
&lt;/pre&gt;</description>
    <dc:creator>Jun'ichi Nomura</dc:creator>
    <dc:date>2012-05-24T00:19:12</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/64588">
    <title>[PATCH v2 13/14] Make generic_make_request handle arbitrarily large bios</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/64588</link>
    <description>&lt;pre&gt;The way the block layer is currently written, it goes to great lengths
to avoid having to split bios; upper layer code (such as bio_add_page())
checks what the underlying device can handle and tries to always create
bios that don't need to be split.

But this approach becomes unwieldy and eventually breaks down with
stacked devices and devices with dynamic limits, and it adds a lot of
complexity. If the block layer could split bios as needed, we could
eliminate a lot of complexity elsewhere - particularly in stacked
drivers. Code that creates bios can then create whatever size bios are
convenient, and more importantly stacked drivers don't have to deal with
both their own bio size limitations and the limitations of the
(potentially multiple) devices underneath them.

Signed-off-by: Kent Overstreet &amp;lt;koverstreet&amp;lt; at &amp;gt;google.com&amp;gt;
Change-Id: I53ed182e8c0a5fe192b040b1fdedba205fe953d5
---
 block/blk-core.c       |  118 ++++++++++++++++++++++++++++++++++++++++++++++++
 fs/bio.c               |   41 +++++++++++++++++
 include/linux/bio.h    |    7 +++
 include/linux/blkdev.h |    3 ++
 4 files changed, 169 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index 91617eb..19145ab 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -29,6 +29,7 @@
 #include &amp;lt;linux/fault-inject.h&amp;gt;
 #include &amp;lt;linux/list_sort.h&amp;gt;
 #include &amp;lt;linux/delay.h&amp;gt;
+#include &amp;lt;linux/closure.h&amp;gt;
 
 #define CREATE_TRACE_POINTS
 #include &amp;lt;trace/events/block.h&amp;gt;
@@ -52,6 +53,12 @@ static struct kmem_cache *request_cachep;
 struct kmem_cache *blk_requestq_cachep;
 
 /*
+ * For bio_split_hook
+ */
+static struct kmem_cache *bio_split_cache;
+static struct workqueue_struct *bio_split_wq;
+
+/*
  * Controlling structure to kblockd
  */
 static struct workqueue_struct *kblockd_workqueue;
@@ -487,6 +494,14 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
 	if (q-&amp;gt;id &amp;lt; 0)
 		goto fail_q;
 
+	q-&amp;gt;bio_split_hook = mempool_create_slab_pool(4, bio_split_cache);
+	if (!q-&amp;gt;bio_split_hook)
+		goto fail_split_hook;
+
+	q-&amp;gt;bio_split = bioset_create(4, 0);
+	if (!q-&amp;gt;bio_split)
+		goto fail_split;
+
 	q-&amp;gt;backing_dev_info.ra_pages =
 			(VM_MAX_READAHEAD * 1024) / PAGE_CACHE_SIZE;
 	q-&amp;gt;backing_dev_info.state = 0;
@@ -526,6 +541,10 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
 
 fail_id:
 	ida_simple_remove(&amp;amp;blk_queue_ida, q-&amp;gt;id);
+fail_split:
+	bioset_free(q-&amp;gt;bio_split);
+fail_split_hook:
+	mempool_destroy(q-&amp;gt;bio_split_hook);
 fail_q:
 	kmem_cache_free(blk_requestq_cachep, q);
 	return NULL;
@@ -1493,6 +1512,90 @@ static inline bool should_fail_request(struct hd_struct *part,
 
 #endif /* CONFIG_FAIL_MAKE_REQUEST */
 
+struct bio_split_hook {
+	struct closure cl;
+	struct request_queue *q;
+	struct bio *bio;
+	bio_end_io_t *bi_end_io;
+	void *bi_private;
+};
+
+static void bio_submit_split_done(struct closure *cl)
+{
+	struct bio_split_hook *s = container_of(cl, struct bio_split_hook, cl);
+
+	s-&amp;gt;bio-&amp;gt;bi_end_io = s-&amp;gt;bi_end_io;
+	s-&amp;gt;bio-&amp;gt;bi_private = s-&amp;gt;bi_private;
+	bio_endio(s-&amp;gt;bio, 0);
+
+	closure_debug_destroy(&amp;amp;s-&amp;gt;cl);
+	mempool_free(s, s-&amp;gt;q-&amp;gt;bio_split_hook);
+}
+
+static void bio_submit_split_endio(struct bio *bio, int error)
+{
+	struct closure *cl = bio-&amp;gt;bi_private;
+	struct bio_split_hook *s = container_of(cl, struct bio_split_hook, cl);
+
+	if (error)
+		clear_bit(BIO_UPTODATE, &amp;amp;s-&amp;gt;bio-&amp;gt;bi_flags);
+
+	bio_put(bio);
+	closure_put(cl);
+}
+
+static void __bio_submit_split(struct closure *cl)
+{
+	struct bio_split_hook *s = container_of(cl, struct bio_split_hook, cl);
+	struct bio *bio = s-&amp;gt;bio, *n;
+
+	do {
+		/*
+		 * If we're running underneath generic_make_request(), we risk
+		 * deadlock if we allocate multiple bios from the mempool.
+		 *
+		 * To avoid this, bio_split() masks out __GFP_WAIT when
+		 * current-&amp;gt;bio_list != NULL; if it fails, the continue_at()
+		 * just punts us to a workqueue, where we can safely retry the
+		 * allocation using the mempool.
+		 */
+		n = bio_split(bio, bio_max_sectors(bio),
+			      GFP_NOIO, s-&amp;gt;q-&amp;gt;bio_split);
+		if (!n)
+			continue_at(cl, __bio_submit_split, bio_split_wq);
+
+		closure_get(cl);
+		generic_make_request(n);
+	} while (n != bio);
+
+	continue_at(cl, bio_submit_split_done, NULL);
+}
+
+static bool bio_submit_split(struct bio *bio)
+{
+	struct bio_split_hook *s;
+	struct request_queue *q = bdev_get_queue(bio-&amp;gt;bi_bdev);
+
+	if (!bio_has_data(bio) || !q || !q-&amp;gt;bio_split_hook ||
+	    bio_sectors(bio) &amp;lt;= bio_max_sectors(bio))
+		return false;
+
+	s = mempool_alloc(q-&amp;gt;bio_split_hook, GFP_NOIO);
+
+	closure_init(&amp;amp;s-&amp;gt;cl, NULL);
+	s-&amp;gt;bio = bio;
+	s-&amp;gt;q = q;
+	s-&amp;gt;bi_end_io = bio-&amp;gt;bi_end_io;
+	s-&amp;gt;bi_private = bio-&amp;gt;bi_private;
+
+	bio_get(bio);
+	bio-&amp;gt;bi_end_io = bio_submit_split_endio;
+	bio-&amp;gt;bi_private = &amp;amp;s-&amp;gt;cl;
+
+	__bio_submit_split(&amp;amp;s-&amp;gt;cl);
+	return true;
+}
+
 /*
  * Check whether this bio extends beyond the end of the device.
  */
@@ -1646,6 +1749,14 @@ void generic_make_request(struct bio *bio)
 	 * it is non-NULL, then a make_request is active, and new requests
 	 * should be added at the tail
 	 */
+
+	/*
+	 * If the device can't accept arbitrary sized bios, check if we
+	 * need to split:
+	 */
+	if (bio_submit_split(bio))
+		return;
+
 	if (current-&amp;gt;bio_list) {
 		bio_list_add(current-&amp;gt;bio_list, bio);
 		return;
@@ -2892,11 +3003,18 @@ int __init blk_dev_init(void)
 	if (!kblockd_workqueue)
 		panic("Failed to create kblockd\n");
 
+	bio_split_wq = alloc_workqueue("bio_split", WQ_MEM_RECLAIM, 0);
+	if (!bio_split_wq)
+		panic("Failed to create bio_split wq\n");
+
 	request_cachep = kmem_cache_create("blkdev_requests",
 			sizeof(struct request), 0, SLAB_PANIC, NULL);
 
 	blk_requestq_cachep = kmem_cache_create("blkdev_queue",
 			sizeof(struct request_queue), 0, SLAB_PANIC, NULL);
 
+	bio_split_cache = kmem_cache_create("bio_split_hook",
+			sizeof(struct bio_split_hook), 0, SLAB_PANIC, NULL);
+
 	return 0;
 }
diff --git a/fs/bio.c b/fs/bio.c
index b73c570..9077a07 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -426,6 +426,47 @@ inline int bio_phys_segments(struct request_queue *q, struct bio *bio)
 }
 EXPORT_SYMBOL(bio_phys_segments);
 
+unsigned __bio_max_sectors(struct bio *bio, struct block_device *bdev,
+			   sector_t sector)
+{
+	unsigned ret = bio_sectors(bio);
+	struct request_queue *q = bdev_get_queue(bdev);
+	struct bio_vec *bv, *end = bio_iovec(bio) +
+		min_t(int, bio_segments(bio), queue_max_segments(q));
+
+	struct bvec_merge_data bvm = {
+		.bi_bdev = bdev,
+		.bi_sector = sector,
+		.bi_size = 0,
+		.bi_rw = bio-&amp;gt;bi_rw,
+	};
+
+	if (bio_segments(bio) &amp;gt; queue_max_segments(q) ||
+	    q-&amp;gt;merge_bvec_fn) {
+		ret = 0;
+
+		for (bv = bio_iovec(bio); bv &amp;lt; end; bv++) {
+			if (q-&amp;gt;merge_bvec_fn &amp;amp;&amp;amp;
+			    q-&amp;gt;merge_bvec_fn(q, &amp;amp;bvm, bv) &amp;lt; (int) bv-&amp;gt;bv_len)
+				break;
+
+			ret += bv-&amp;gt;bv_len &amp;gt;&amp;gt; 9;
+			bvm.bi_size += bv-&amp;gt;bv_len;
+		}
+
+		if (ret &amp;gt;= (BIO_MAX_PAGES * PAGE_SIZE) &amp;gt;&amp;gt; 9)
+			return (BIO_MAX_PAGES * PAGE_SIZE) &amp;gt;&amp;gt; 9;
+	}
+
+	ret = min(ret, queue_max_sectors(q));
+
+	WARN_ON(!ret);
+	ret = max_t(int, ret, bio_iovec(bio)-&amp;gt;bv_len &amp;gt;&amp;gt; 9);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(__bio_max_sectors);
+
 /**
  * __bio_clone - clone a bio
  * @bio: destination bio
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 64fdcc8..3669942 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -219,6 +219,13 @@ extern void bio_endio(struct bio *, int);
 struct request_queue;
 extern int bio_phys_segments(struct request_queue *, struct bio *);
 
+unsigned __bio_max_sectors(struct bio *, struct block_device *, sector_t);
+
+static inline unsigned bio_max_sectors(struct bio *bio)
+{
+return __bio_max_sectors(bio, bio-&amp;gt;bi_bdev, bio-&amp;gt;bi_sector);
+}
+
 extern void __bio_clone(struct bio *, struct bio *);
 extern struct bio *bio_clone_bioset(struct bio *, gfp_t, struct bio_set *bs);
 extern struct bio *bio_clone(struct bio *, gfp_t);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 2aa2466..464adb7 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -399,6 +399,9 @@ struct request_queue {
 /* Throttle data */
 struct throtl_data *td;
 #endif
+
+mempool_t *bio_split_hook;
+struct bio_set *bio_split;
 };
 
 #define QUEUE_FLAG_QUEUED 1 /* uses generic tag queueing */
&lt;/pre&gt;</description>
    <dc:creator>Kent Overstreet</dc:creator>
    <dc:date>2012-05-24T00:02:50</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/64585">
    <title>[PATCH v2 06/14] block: Add bio_reset()</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/64585</link>
    <description>&lt;pre&gt;Reusing bios is something that's been highly frowned upon in the past,
but driver code keeps doing it anyway. If it's going to happen anyway,
we should provide a generic method.

This'll help with getting rid of bi_destructor - drivers/block/pktcdvd.c
was open coding it, by doing a bio_init() and resetting bi_destructor.

Signed-off-by: Kent Overstreet &amp;lt;koverstreet&amp;lt; at &amp;gt;google.com&amp;gt;
Change-Id: Ib0a43dfcb3f6c22a54da513d4a86be544b5ffd95
---
 fs/bio.c                  |    8 ++++++++
 include/linux/bio.h       |    1 +
 include/linux/blk_types.h |    6 ++++++
 3 files changed, 15 insertions(+)
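
Illustration only, not part of the patch: a minimal userspace sketch of
the partial-reset idea described above, assuming a made-up struct
fake_bio. The names fake_bio, fake_bio_reset, FAKE_BIO_RESET_BYTES and
FAKE_BIO_UPTODATE_MASK are hypothetical stand-ins, and the real struct
bio has many more fields. Everything laid out before the marker field is
zeroed, and everything from the marker field onward survives the reset.

```c
#include "assert.h"
#include "stddef.h"
#include "string.h"

/* Hypothetical stand-in for struct bio, reduced to a few fields. */
struct fake_bio {
	unsigned long sector;	/* cleared by the reset */
	unsigned long flags;	/* cleared, then reinitialised */
	unsigned int size;	/* cleared by the reset */
	unsigned int max_vecs;	/* marker: first preserved field */
	int cnt;		/* preserved */
};

/* Assumed flag value for the sketch; the patch sets bit BIO_UPTODATE. */
#define FAKE_BIO_UPTODATE_MASK	1UL

/* Number of leading bytes to wipe: everything before the marker field. */
#define FAKE_BIO_RESET_BYTES	offsetof(struct fake_bio, max_vecs)

static void fake_bio_reset(struct fake_bio *bio)
{
	assert(bio != NULL);

	/* Zero only the leading part of the structure. */
	memset(bio, 0, FAKE_BIO_RESET_BYTES);

	/* Put the flags back into their freshly initialised state. */
	bio->flags = FAKE_BIO_UPTODATE_MASK;
}
```

With this layout, a caller that caches such a structure can reuse it
without losing the allocation-time state kept after the marker field.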

diff --git a/fs/bio.c b/fs/bio.c
index de0733e..7d8c29d 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -259,6 +259,14 @@ void bio_init(struct bio *bio)
 }
 EXPORT_SYMBOL(bio_init);
 
+void bio_reset(struct bio *bio)
+{
+memset(bio, 0, BIO_RESET_BYTES);
+bio-&amp;gt;bi_flags = 1 &amp;lt;&amp;lt; BIO_UPTODATE;
+
+}
+EXPORT_SYMBOL(bio_reset);
+
 /**
  * bio_alloc_bioset - allocate a bio for I/O
  * @gfp_mask:   the GFP_ mask given to the slab allocator
diff --git a/include/linux/bio.h b/include/linux/bio.h
index b27f16b..35f7c4d 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -228,6 +228,7 @@ extern struct bio *bio_clone(struct bio *, gfp_t);
 struct bio *bio_clone_kmalloc(struct bio *, gfp_t);
 
 extern void bio_init(struct bio *);
+extern void bio_reset(struct bio *);
 
 extern int bio_add_page(struct bio *, struct page *, unsigned int,unsigned int);
 extern int bio_add_pc_page(struct request_queue *, struct bio *, struct page *,
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index dc0e399..6b7daf3 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -57,6 +57,10 @@ struct bio {
 unsigned int bi_seg_front_size;
 unsigned int bi_seg_back_size;
 
+/*
+ * Everything starting with bi_max_vecs will be preserved by bio_reset()
+ */
+
 unsigned int bi_max_vecs; /* max bvl_vecs we can hold */
 
 atomic_t bi_cnt; /* pin count */
@@ -83,6 +87,8 @@ struct bio {
 struct bio_vec bi_inline_vecs[0];
 };
 
+#define BIO_RESET_BYTES offsetof(struct bio, bi_max_vecs)
+
 /*
  * bio flags
  */
&lt;/pre&gt;</description>
    <dc:creator>Kent Overstreet</dc:creator>
    <dc:date>2012-05-24T00:02:43</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/64584">
    <title>[PATCH v2 02/14] dm: kill dm_rq_bio_destructor</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/64584</link>
    <description>&lt;pre&gt;Signed-off-by: Kent Overstreet &amp;lt;koverstreet&amp;lt; at &amp;gt;google.com&amp;gt;
---
 drivers/md/dm.c |   11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 40b7735..e6e7b19 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -696,6 +696,7 @@ static void end_clone_bio(struct bio *clone, int error)
 struct bio *bio = info-&amp;gt;orig;
 unsigned int nr_bytes = info-&amp;gt;orig-&amp;gt;bi_size;
 
+free_bio_info(info);
 bio_put(clone);
 
 if (tio-&amp;gt;error)
@@ -1438,15 +1439,6 @@ void dm_dispatch_request(struct request *rq)
 }
 EXPORT_SYMBOL_GPL(dm_dispatch_request);
 
-static void dm_rq_bio_destructor(struct bio *bio)
-{
-struct dm_rq_clone_bio_info *info = bio-&amp;gt;bi_private;
-struct mapped_device *md = info-&amp;gt;tio-&amp;gt;md;
-
-free_bio_info(info);
-bio_free(bio, md-&amp;gt;bs);
-}
-
 static int dm_rq_bio_constructor(struct bio *bio, struct bio *bio_orig,
  void *data)
 {
@@ -1461,7 +1453,6 @@ static int dm_rq_bio_constructor(struct bio *bio, struct bio *bio_orig,
 info-&amp;gt;tio = tio;
 bio-&amp;gt;bi_end_io = end_clone_bio;
 bio-&amp;gt;bi_private = info;
-bio-&amp;gt;bi_destructor = dm_rq_bio_destructor;
 
 return 0;
 }
&lt;/pre&gt;</description>
    <dc:creator>Kent Overstreet</dc:creator>
    <dc:date>2012-05-24T00:02:39</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/64581">
    <title>[PATCH 2/2] vfs: remove unused __d_splice_alias argument</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/64581</link>
    <description>&lt;pre&gt;From: "J. Bruce Fields" &amp;lt;bfields&amp;lt; at &amp;gt;redhat.com&amp;gt;

Nobody sets want_discon any more.

Reported-by: Peng Tao &amp;lt;bergwolf&amp;lt; at &amp;gt;gmail.com&amp;gt;
Signed-off-by: J. Bruce Fields &amp;lt;bfields&amp;lt; at &amp;gt;redhat.com&amp;gt;
---
 fs/dcache.c |   13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)
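
Illustration only, not part of the patch: a simplified userspace sketch
of the alias preference that the __d_find_alias() comment in the diff
describes once want_discon is gone. The names fake_alias and
fake_find_alias are hypothetical, the alias list is modelled as a plain
array, and locking, reference counting and the final recheck of the
disconnected alias are all left out.

```c
#include "assert.h"
#include "stdbool.h"

/* Hypothetical stand-in for a dentry; only what the sketch needs. */
struct fake_alias {
	bool hashed;		/* models !d_unhashed(alias) */
	bool root_disconnected;	/* models IS_ROOT plus DCACHE_DISCONNECTED */
};

/*
 * Return the index of the preferred alias, or -1 if none is eligible.
 * Directories accept any alias; other inodes accept only hashed ones.
 */
static int fake_find_alias(const struct fake_alias *aliases, int n,
			   bool is_dir)
{
	int i, discon = -1;

	assert(n >= 0);

	for (i = 0; i != n; i++) {
		if (!is_dir) {
			if (!aliases[i].hashed)
				continue;
		}

		if (aliases[i].root_disconnected)
			discon = i;	/* remember it, keep looking */
		else
			return i;	/* any other eligible alias wins */
	}

	/* Fall back to the disconnected alias, if one was seen. */
	return discon;
}
```

The disconnected alias is therefore only a fallback, which is the
behaviour the removed want_discon flag used to be able to invert.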

diff --git a/fs/dcache.c b/fs/dcache.c
index 2434c1e..8e34e91 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -649,8 +649,6 @@ EXPORT_SYMBOL(dget_parent);
 /**
  * d_find_alias - grab a hashed alias of inode
  * @inode: inode in question
- * @want_discon:  flag, used by d_splice_alias, to request
- *          that only a DISCONNECTED alias be returned.
  *
  * If inode has a hashed alias, or is a directory and has any alias,
  * acquire the reference to alias and return it. Otherwise return NULL.
@@ -659,10 +657,9 @@ EXPORT_SYMBOL(dget_parent);
  * of a filesystem.
  *
  * If the inode has an IS_ROOT, DCACHE_DISCONNECTED alias, then prefer
- * any other hashed alias over that one unless @want_discon is set,
- * in which case only return an IS_ROOT, DCACHE_DISCONNECTED alias.
+ * any other hashed alias over that.
  */
-static struct dentry *__d_find_alias(struct inode *inode, int want_discon)
+static struct dentry *__d_find_alias(struct inode *inode)
 {
 struct dentry *alias, *discon_alias;
 
@@ -674,7 +671,7 @@ again:
 if (IS_ROOT(alias) &amp;amp;&amp;amp;
     (alias-&amp;gt;d_flags &amp;amp; DCACHE_DISCONNECTED)) {
 discon_alias = alias;
-} else if (!want_discon) {
+} else {
 __dget_dlock(alias);
 spin_unlock(&amp;amp;alias-&amp;gt;d_lock);
 return alias;
@@ -705,7 +702,7 @@ struct dentry *d_find_alias(struct inode *inode)
 
 if (!list_empty(&amp;amp;inode-&amp;gt;i_dentry)) {
 spin_lock(&amp;amp;inode-&amp;gt;i_lock);
-de = __d_find_alias(inode, 0);
+de = __d_find_alias(inode);
 spin_unlock(&amp;amp;inode-&amp;gt;i_lock);
 }
 return de;
@@ -2395,7 +2392,7 @@ struct dentry *d_materialise_unique(struct dentry *dentry, struct inode *inode)
 struct dentry *alias;
 
 /* Does an aliased dentry already exist? */
-alias = __d_find_alias(inode, 0);
+alias = __d_find_alias(inode);
 if (alias) {
 actual = alias;
 write_seqlock(&amp;amp;rename_lock);
&lt;/pre&gt;</description>
    <dc:creator>J. Bruce Fields</dc:creator>
    <dc:date>2012-05-23T22:05:46</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/64580">
    <title>[PATCH 1/2] vfs: stop d_splice_alias creating directory aliases</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/64580</link>
    <description>&lt;pre&gt;From: "J. Bruce Fields" &amp;lt;bfields&amp;lt; at &amp;gt;redhat.com&amp;gt;

A directory should never have more than one dentry pointing to it.

But d_splice_alias() will add one if it finds a directory with an
already-existing non-DISCONNECTED dentry.

I can't find an obvious reproducer, but I also can't see what prevents
d_splice_alias() from encountering such a case.

It therefore seems safest to allow d_splice_alias to use any dentry it
finds.

(Prior to the removal of dentry_unhash() from vfs_rmdir(), around v3.0,
this could cause an nfsd deadlock like this:

- Somebody attempts to remove a non-empty directory.
- The dentry_unhash() in vfs_rmdir() unhashes the dentry
  pointing to the non-empty directory.
- -&amp;gt;rmdir() then fails with -ENOTEMPTY
- Before the vfs_rmdir() caller reaches dput(), an nfsd process
  in rename looks up the directory by filehandle; at the end of
  that lookup, this dentry is found by d_alloc_anon(), and a
  reference is taken on it, preventing dput() from removing it.
- A regular lookup of the directory calls d_splice_alias(),
  finds only an unhashed (not a DISCONNECTED) dentry, and
  instead adds a new one, so the directory now has two
  dentries.
- The nfsd process in rename, which was previously looking up
  the source directory of the rename, now looks up the target
  directory (which is the same), and gets the dentry newly
  created by the previous lookup.
- The rename, seeing two different dentries, assumes this is a
  cross-directory rename and attempts to take the i_mutex on the
  directory twice.

That reproducer no longer exists, but I don't think there was anything
fundamentally incorrect about the vfs_rmdir() behavior there, so I think
the real fault was here in d_splice_alias().)

Signed-off-by: J. Bruce Fields &amp;lt;bfields&amp;lt; at &amp;gt;redhat.com&amp;gt;
---
 fs/dcache.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index b60ddc4..2434c1e 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -1606,9 +1606,8 @@ struct dentry *d_splice_alias(struct inode *inode, struct dentry *dentry)
 
 if (inode &amp;amp;&amp;amp; S_ISDIR(inode-&amp;gt;i_mode)) {
 spin_lock(&amp;amp;inode-&amp;gt;i_lock);
-new = __d_find_alias(inode, 1);
+new = __d_find_any_alias(inode);
 if (new) {
-BUG_ON(!(new-&amp;gt;d_flags &amp;amp; DCACHE_DISCONNECTED));
 spin_unlock(&amp;amp;inode-&amp;gt;i_lock);
 security_d_instantiate(new, inode);
 d_move(new, dentry);
&lt;/pre&gt;</description>
    <dc:creator>J. Bruce Fields</dc:creator>
    <dc:date>2012-05-23T22:05:45</dc:date>
  </item>
  <textinput rdf:about="http://search.gmane.org/?group=$group=gmane.linux.file-systems">
    <title>Search Engine</title>
    <description>Search the mailing list at Gmane</description>
    <name>query</name>
    <link>http://search.gmane.org/?group=$group=gmane.linux.file-systems</link>
  </textinput>
</rdf:RDF>

