<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/">
  <channel rdf:about="http://blog.gmane.org/gmane.linux.file-systems">
    <title>gmane.linux.file-systems</title>
    <link>http://blog.gmane.org/gmane.linux.file-systems</link>
    <description/>
    <syn:updatePeriod>hourly</syn:updatePeriod>
    <syn:updateFrequency>1</syn:updateFrequency>
    <syn:updateBase>1901-01-01T00:00+00:00</syn:updateBase>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/74757"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/74755"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/74753"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/74752"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/74729"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/74716"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/74710"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/74708"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/74703"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/74701"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/74699"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/74697"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/74696"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/74695"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/74690"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/74689"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/74682"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/74676"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/74675"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.linux.file-systems/74674"/>
      </rdf:Seq>
    </items>
    <image rdf:resource="http://gmane.org/img/gmane-25t.png"/>
    <textinput rdf:resource=""/>
  </channel>
  <image rdf:about="http://gmane.org/img/gmane-25t.png">
    <title>Gmane</title>
    <url>http://gmane.org/img/gmane-25t.png</url>
    <link>http://gmane.org</link>
  </image>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/74757">
    <title>Do take a look at our site. You will like what you read there.</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/74757</link>
    <description>&lt;pre&gt;The finest offer available right now: http://goo.gl/ZXVUR?/JXCbq Without the right approach, you get nowhere!
&lt;/pre&gt;</description>
    <dc:creator>angfealmian</dc:creator>
    <dc:date>2013-05-21T21:47:41</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/74755">
    <title>(unknown)</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/74755</link>
    <description>&lt;pre&gt;

Are you financially down and in need of financial assistance to settle your bills or debt, with nowhere to go? If yes, contact us for assistance via email: cmothertheressa&amp;lt; at &amp;gt;yahoo.com
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo&amp;lt; at &amp;gt;vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

&lt;/pre&gt;</description>
    <dc:creator>Mrs. Theressa</dc:creator>
    <dc:date>2013-05-21T21:51:21</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/74753">
    <title>(unknown)</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/74753</link>
    <description>&lt;pre&gt;

Are you financially down and in need of financial assistance to settle your bills or debt, with nowhere to go? If yes, contact us for assistance via email: cmothertheressa&amp;lt; at &amp;gt;yahoo.com
&lt;/pre&gt;</description>
    <dc:creator>Mrs. Theressa</dc:creator>
    <dc:date>2013-05-21T21:32:45</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/74752">
    <title>(unknown)</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/74752</link>
    <description>&lt;pre&gt;

Are you financially down and in need of financial assistance to settle your bills or debt, with nowhere to go? If yes, contact us for assistance via email: cmothertheressa&amp;lt; at &amp;gt;yahoo.com
&lt;/pre&gt;</description>
    <dc:creator>Mrs. Theressa</dc:creator>
    <dc:date>2013-05-21T21:31:47</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/74729">
    <title>Re: proc_subdir_lock related deadlock</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/74729</link>
    <description>&lt;pre&gt;
I believe the real fix is not to call remove or create proc from softirq
context.

What kernel are you using, as I can't find uid_stat_tcp_rcv() anywhere
in the latest kernel.

&lt;/pre&gt;</description>
    <dc:creator>Steven Rostedt</dc:creator>
    <dc:date>2013-05-21T17:39:21</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/74716">
    <title>Re: Limit dentry cache entries</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/74716</link>
    <description>&lt;pre&gt;
This request does come up every so often. There are valid reasons
for being able to control the exact size of the dentry and page
caches - I've seen a few implementations in storage appliance
vendor kernels where total control of memory usage yields a few
percent better performance on industry-specific benchmarks. Indeed,
years ago I thought that capping the size of the dentry cache was a
good idea, too.

However, the problem that I've seen with every single one of these
implementations is that the limit is carefully tuned for best all
round performance in a given set of canned workloads. When the limit
is wrong, performance tanks, and it is just about impossible to set
a limit correctly for a machine that has a changing workload.

If your problem is negative dentries building up, where do you set
the limit? Set it low enough to keep only a small number of total
dentries to keep the negative dentries down, and you'll end up
with a dentry cache that isn't big enough to hold all the dentries
needed for efficient performance with workloads that do directory
traversals. It's a two-edged sword, and most people do not have
enough knowledge to tune a knob correctly.

IOWs, the automatic sizing of the dentry cache based on memory
pressure is the correct thing to do. Capping it, or allowing it to
be capped will simply generate bug reports for strange performance
problems....

That said, keeping lots of negative dentries around until memory
pressure kicks them out is probably the wrong thing to do. Negative
dentries are an optimisation for some workloads, but those workloads
tend to reference negative dentries with a temporal locality that
matches the unlink time.

Perhaps we need to separately reclaim negative dentries i.e. not
wait for memory pressure to reclaim them but use some other kind of
trigger for reclamation. That doesn't cap the size of the dentry
cache, but would address the problem of negative dentry buildup....

Cheers,

Dave.
&lt;/pre&gt;</description>
    <dc:creator>Dave Chinner</dc:creator>
    <dc:date>2013-05-20T22:53:42</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/74710">
    <title>Re: [PATCH 13/13] Kconfig: Add Kconfig entry for Labeled NFS V4 client</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/74710</link>
    <description>&lt;pre&gt;
Sorry. I mean, just replace it with

config NFS_V4_SECURITY_LABEL
	bool
	depends on NFS_V4_2 &amp;amp;&amp;amp; SECURITY
	default y


&lt;/pre&gt;</description>
    <dc:creator>Myklebust, Trond</dc:creator>
    <dc:date>2013-05-20T21:14:51</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/74708">
    <title>Re: [PATCH 13/13] Kconfig: Add Kconfig entry for Labeled NFS V4 client</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/74708</link>
    <description>&lt;pre&gt;
Eric is right. In any case, we already agreed that we don't need _both_
an NFSv4.2 and an NFSv4 security label switch.

Please just get rid of NFS_V4_SECURITY_LABEL.

&lt;/pre&gt;</description>
    <dc:creator>Myklebust, Trond</dc:creator>
    <dc:date>2013-05-20T21:12:56</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/74703">
    <title>Re: [PATCH 07/13] NFSv4: Introduce new label structure</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/74703</link>
    <description>&lt;pre&gt;
I thought we were getting rid of all these unnecessary dir_labels etc.?
We agreed that we don't need to read labels on link, remove, readlink
etc.


Why does this belong in the uapi?



&lt;/pre&gt;</description>
    <dc:creator>Myklebust, Trond</dc:creator>
    <dc:date>2013-05-20T19:12:38</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/74701">
    <title>The most amusing gift for children</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/74701</link>
    <description>&lt;pre&gt;A toy built to astonish: http://goo.gl/6PBxZ?/wgLIXQb
&lt;/pre&gt;</description>
    <dc:creator>инуля</dc:creator>
    <dc:date>2013-05-20T18:15:28</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/74699">
    <title>Re: [PATCH v7 18/34] fs: convert fs shrinkers to new scan/count API</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/74699</link>
    <description>&lt;pre&gt;Dave,

I am auditing the other conversions now for patterns like this.
In the ashmem driver, you wrote:

- * 'nr_to_scan' is the number of objects (pages) to prune, or 0 to query how
- * many objects (pages) we have in total.
+ * 'nr_to_scan' is the number of objects to scan for freeing.

Can you please clarify what your intention is here? For me, nr_to_scan
is still the number of objects we should try to free - and it has been
this way for a while - even if we have to *scan* more objects than this,
because some of them cannot be freed.

In the shrinkers you have been converting, I actually found both kinds
of behaviors: In some of them you test for freed &amp;gt;= nr_to_scan, and in
others, --nr_to_scan &amp;gt; 0

My assumption is that scanning the objects is cheap compared to the
other cache operations we do, filling or unfilling, so whoever can
should incur the extra couple of scans to make sure that we free as many
objects as requested *if we can*.

I am changing all shrinkers now to behave consistently like this, i.e.
bailing out on freed &amp;gt; nr_to_scan instead of nr_to_scan == 0 (for most
of them it is thankfully obvious, those two things being the same).

If you for any reason wanted nr_to_scan to mean # of objects *scanned*,
not freed, IOW, if this is not a mistake, please say so and justify.

Thanks.



&lt;/pre&gt;</description>
    <dc:creator>Glauber Costa</dc:creator>
    <dc:date>2013-05-20T15:25:47</dc:date>
  </item>
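  <!--
  A minimal userspace sketch of the two readings of nr_to_scan contrasted in the
  message above: a cap on the number of objects examined versus a target for the
  number of objects freed. This is a Python model, not kernel code; the function
  names, the dict-based "cache" and the sample pattern are illustrative
  assumptions, not anything taken from the patch series.

```python
def shrink_scan_budget(objects, nr_to_scan):
    """Read nr_to_scan as a cap on how many objects are examined."""
    freed = 0
    scanned = 0
    for obj in objects:
        if scanned == nr_to_scan:
            break
        scanned += 1
        if obj["freeable"]:
            freed += 1
    return freed, scanned


def shrink_free_target(objects, nr_to_scan):
    """Read nr_to_scan as a target for freed objects; skip past busy ones."""
    freed = 0
    scanned = 0
    for obj in objects:
        if freed >= nr_to_scan:
            break
        scanned += 1
        if obj["freeable"]:
            freed += 1
    return freed, scanned


def make_cache(pattern):
    """Build a toy cache: one dict per object, flagged freeable or busy."""
    return [{"freeable": flag} for flag in pattern]


# Hypothetical cache contents: busy and freeable objects interleaved.
PATTERN = [False, True, False, True, True, False, True]

budget_result = shrink_scan_budget(make_cache(PATTERN), 3)
target_result = shrink_free_target(make_cache(PATTERN), 3)
```

  With the same toy cache and nr_to_scan = 3, the scan-budget reading frees one
  object after three scans, while the free-target reading scans five objects to
  free three. That difference is what the question to Dave is about.
  -->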
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/74697">
    <title>Re: [PATCH v7 18/34] fs: convert fs shrinkers to new scan/count API</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/74697</link>
    <description>&lt;pre&gt;No, this should be the max number to be demoted, no change.
This test above should then be freed &amp;lt; nr.

I will update, thanks for spotting.
&lt;/pre&gt;</description>
    <dc:creator>Glauber Costa</dc:creator>
    <dc:date>2013-05-20T13:46:57</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/74696">
    <title>You can achieve noticeably more</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/74696</link>
    <description>&lt;pre&gt;We present for your consideration the newest program for people who are working toward their goals! It does not matter what task you set yourself: changing companies, launching a company, or losing 10 kilos. We know how to help you succeed! It will help you sort everything out and focus your inner reserves on what matters most: your goal! Details: http://goo.gl/1eym8?/asYIZBAX Have you ever found yourself in a situation where you started working but kept stalling, began to hesitate, or lost your enthusiasm? Everything seemed clear and there was a plan, but you could not get moving. You decided to act but did not know where to start. You did not know how to get out of a difficult situation or how to overcome an obstacle.
&lt;/pre&gt;</description>
    <dc:creator>венедиктушка</dc:creator>
    <dc:date>2013-05-20T12:51:28</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/74695">
    <title>Re: Limit dentry cache entries</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/74695</link>
    <description>&lt;pre&gt;----- Original Message -----
| Hello,
| 
| We have a bunch of servers that create a lot of temp files, or check
| for the existence of non-existent files. Every such operation creates
| a dentry object and soon most of the free memory is consumed for
| 'negative' dentry entries. This behavior was observed on both CentOS
| kernel v.2.6.32-358 and Amazon Linux kernel v.3.4.43-4.
| 
| There are also some processes running that occasionally allocate large
| chunks of memory, and when this happens the kernel clears out a bunch
| of stale dentry caches. This clearing takes some time. kswapd kicks
| in, and allocations and bzero() of 4GB that normally take &amp;lt;1s take
| 20s or more.
| 
| Because the memory needs are non-continuous but negative dentry
| generation is fairly continuous, vfs_cache_pressure doesn't help much.
| 
| The thought I had was to have a sysctl that limits the number of
| dentries per super-block (sb-max-dentry). Every time a new dentry is
| allocated in d_alloc(), check if dentry_stat.nr_dentry exceeds (number
| of super blocks * sb-max-dentry). If yes, queue up an asynchronous
| workqueue call to prune_dcache(). Also have a separate sysctl to
| indicate by what percentage to reduce the dentry entries when this
| happens.
| 
| Thanks for your input. If this sounds like a reasonable idea, I'll
| send out a patch.
| 
| Cheers,
| Keyur.

Hi Keyur,

I like the idea. I've had people bring up the same issue, relating
to GFS2. This is especially true for doing du and similar ops on a
very large file system. This wasn't on GFS2, was it?

Regards,

Bob Peterson
Red Hat File Systems
&lt;/pre&gt;</description>
    <dc:creator>Bob Peterson</dc:creator>
    <dc:date>2013-05-20T12:20:49</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/74690">
    <title>Fast-track learning</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/74690</link>
    <description>&lt;pre&gt;  Web address: http://goo.gl/yhPy0 . A highly focused system for presenting the study material, centered on success. Build a circle of contacts. A course for both beginners and advanced learners.
 
&lt;/pre&gt;</description>
    <dc:creator>дануся</dc:creator>
    <dc:date>2013-05-20T04:53:48</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/74689">
    <title>Limit dentry cache entries</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/74689</link>
    <description>&lt;pre&gt;Hello,

We have a bunch of servers that create a lot of temp files, or check
for the existence of non-existent files. Every such operation creates
a dentry object and soon most of the free memory is consumed for
'negative' dentry entries. This behavior was observed on both CentOS
kernel v.2.6.32-358 and Amazon Linux kernel v.3.4.43-4.

There are also some processes running that occasionally allocate large
chunks of memory, and when this happens the kernel clears out a bunch
of stale dentry caches. This clearing takes some time. kswapd kicks
in, and allocations and bzero() of 4GB that normally take &amp;lt;1s take
20s or more.

Because the memory needs are non-continuous but negative dentry
generation is fairly continuous, vfs_cache_pressure doesn't help much.

The thought I had was to have a sysctl that limits the number of
dentries per super-block (sb-max-dentry). Every time a new dentry is
allocated in d_alloc(), check if dentry_stat.nr_dentry exceeds (number
of super blocks * sb-max-dentry). If yes, queue up an asynchronous
workqueue call to prune_dcache(). Also have a separate sysctl to
indicate by what percentage to reduce the dentry entries when this
happens.

Thanks for your input. If this sounds like a reasonable idea, I'll
send out a patch.

Cheers,
Keyur.
&lt;/pre&gt;</description>
    <dc:creator>Keyur Govande</dc:creator>
    <dc:date>2013-05-20T03:50:55</dc:date>
  </item>
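  <!--
  A minimal sketch of the limit check proposed in the message above, written as
  a userspace Python model rather than kernel code. The class and method names
  are hypothetical; sb_max_dentry and the prune percentage stand in for the two
  proposed sysctls, and setting a flag stands in for queueing the asynchronous
  workqueue call to prune_dcache(). Pruning a percentage of the current dentry
  count is one possible reading of the proposal.

```python
class DentryCacheModel:
    """Toy model: count dentries and queue a prune when over the limit."""

    def __init__(self, nr_super_blocks, sb_max_dentry, prune_percent):
        # Global limit as described: number of super blocks * sb-max-dentry.
        self.limit = nr_super_blocks * sb_max_dentry
        self.prune_percent = prune_percent
        self.nr_dentry = 0
        self.prune_queued = False

    def d_alloc(self):
        """Account one new dentry and check the limit, as in the proposal."""
        self.nr_dentry += 1
        if self.nr_dentry > self.limit and not self.prune_queued:
            # Stands in for queueing asynchronous prune work.
            self.prune_queued = True

    def run_queued_prune(self):
        """Run the deferred prune, dropping prune_percent of all dentries."""
        if not self.prune_queued:
            return 0
        pruned = self.nr_dentry * self.prune_percent // 100
        self.nr_dentry -= pruned
        self.prune_queued = False
        return pruned


cache = DentryCacheModel(nr_super_blocks=2, sb_max_dentry=500, prune_percent=20)
for _ in range(1000):
    cache.d_alloc()
queued_at_limit = cache.prune_queued      # exactly at the limit: nothing queued
cache.d_alloc()                           # one past the limit: prune is queued
queued_over_limit = cache.prune_queued
pruned = cache.run_queued_prune()
```

  The model only shows the bookkeeping. It says nothing about which dentries
  get pruned, which is the part the replies in this thread are concerned with.
  -->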
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/74682">
    <title>[PATCH 04/15] f2fs: fix BUG_ON during f2fs_evict_inode(dir)</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/74682</link>
    <description>&lt;pre&gt;During the dentry recovery routine, recover_inode() triggers __f2fs_add_link
with its directory inode.

The following scenario triggers a BUG_ON:
 1. dir = f2fs_iget(pino)
 2. __f2fs_add_link(dir, name)
 3. iput(dir)
  -&amp;gt; f2fs_evict_inode() hits BUG_ON(atomic_read(fi-&amp;gt;dirty_dents))

Kernel BUG at ffffffffa01c0676 [verbose debug info unavailable]
[&amp;lt;ffffffffa01c0676&amp;gt;] f2fs_evict_inode+0x276/0x300 [f2fs]
Call Trace:
 [&amp;lt;ffffffff8118ea00&amp;gt;] evict+0xb0/0x1b0
 [&amp;lt;ffffffff8118f1c5&amp;gt;] iput+0x105/0x190
 [&amp;lt;ffffffffa01d2dac&amp;gt;] recover_fsync_data+0x3bc/0x1070 [f2fs]
 [&amp;lt;ffffffff81692e8a&amp;gt;] ? io_schedule+0xaa/0xd0
 [&amp;lt;ffffffff81690acb&amp;gt;] ? __wait_on_bit_lock+0x7b/0xc0
 [&amp;lt;ffffffff8111a0e7&amp;gt;] ? __lock_page+0x67/0x70
 [&amp;lt;ffffffff81165e21&amp;gt;] ? kmem_cache_alloc+0x31/0x140
 [&amp;lt;ffffffff8118a502&amp;gt;] ? __d_instantiate+0x92/0xf0
 [&amp;lt;ffffffff812a949b&amp;gt;] ? security_d_instantiate+0x1b/0x30
 [&amp;lt;ffffffff8118a5b4&amp;gt;] ? d_instantiate+0x54/0x70

This means that we should flush all the dentry pages between iget() and iput().
But during the recovery routine this is not allowed, for consistency reasons,
so we have to wait until the whole recovery process is done.
After that, write_checkpoint() flushes all the dirty dentry blocks, and we can
safely put the stale dir inodes from the dirty_dir_inode_list.

Signed-off-by: Jaegeuk Kim &amp;lt;jaegeuk.kim&amp;lt; at &amp;gt;samsung.com&amp;gt;
---
 fs/f2fs/checkpoint.c | 23 +++++++++++++++++++++++
 fs/f2fs/f2fs.h       |  2 ++
 fs/f2fs/recovery.c   | 14 +++++++++-----
 3 files changed, 34 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index b1de01d..3d11449 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -514,6 +514,29 @@ void remove_dirty_dir_inode(struct inode *inode)
 	}
 out:
 	spin_unlock(&amp;amp;sbi-&amp;gt;dir_inode_lock);
+
+	/* Only from the recovery routine */
+	if (is_inode_flag_set(F2FS_I(inode), FI_DELAY_IPUT))
+		iput(inode);
+}
+
+struct inode *check_dirty_dir_inode(struct f2fs_sb_info *sbi, nid_t ino)
+{
+	struct list_head *head = &amp;amp;sbi-&amp;gt;dir_inode_list;
+	struct list_head *this;
+	struct inode *inode = NULL;
+
+	spin_lock(&amp;amp;sbi-&amp;gt;dir_inode_lock);
+	list_for_each(this, head) {
+		struct dir_inode_entry *entry;
+		entry = list_entry(this, struct dir_inode_entry, list);
+		if (entry-&amp;gt;inode-&amp;gt;i_ino == ino) {
+			inode = entry-&amp;gt;inode;
+			break;
+		}
+	}
+	spin_unlock(&amp;amp;sbi-&amp;gt;dir_inode_lock);
+	return inode;
 }
 
 void sync_dirty_dir_inodes(struct f2fs_sb_info *sbi)
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 20aab02..ef6cac8 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -846,6 +846,7 @@ enum {
 	FI_INC_LINK,		/* need to increment i_nlink */
 	FI_ACL_MODE,		/* indicate acl mode */
 	FI_NO_ALLOC,		/* should not allocate any blocks */
+	FI_DELAY_IPUT,		/* used for the recovery */
 };
 
 static inline void set_inode_flag(struct f2fs_inode_info *fi, int flag)
@@ -1012,6 +1013,7 @@ int recover_orphan_inodes(struct f2fs_sb_info *);
 int get_valid_checkpoint(struct f2fs_sb_info *);
 void set_dirty_dir_page(struct inode *, struct page *);
 void remove_dirty_dir_inode(struct inode *);
+struct inode *check_dirty_dir_inode(struct f2fs_sb_info *, nid_t);
 void sync_dirty_dir_inodes(struct f2fs_sb_info *);
 void write_checkpoint(struct f2fs_sb_info *, bool);
 void init_orphan_info(struct f2fs_sb_info *);
diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
index f77aedd..c573944 100644
--- a/fs/f2fs/recovery.c
+++ b/fs/f2fs/recovery.c
@@ -42,6 +42,7 @@ static int recover_dentry(struct page *ipage, struct inode *inode)
 {
 	struct f2fs_node *raw_node = (struct f2fs_node *)kmap(ipage);
 	struct f2fs_inode *raw_inode = &amp;amp;(raw_node-&amp;gt;i);
+	nid_t pino = le32_to_cpu(raw_inode-&amp;gt;i_pino);
 	struct qstr name;
 	struct f2fs_dir_entry *de;
 	struct page *page;
@@ -51,10 +52,14 @@ static int recover_dentry(struct page *ipage, struct inode *inode)
 	if (!is_dent_dnode(ipage))
 		goto out;
 
-	dir = f2fs_iget(inode-&amp;gt;i_sb, le32_to_cpu(raw_inode-&amp;gt;i_pino));
-	if (IS_ERR(dir)) {
-		err = PTR_ERR(dir);
-		goto out;
+	dir = check_dirty_dir_inode(F2FS_SB(inode-&amp;gt;i_sb), pino);
+	if (!dir) {
+		dir = f2fs_iget(inode-&amp;gt;i_sb, pino);
+		if (IS_ERR(dir)) {
+			err = PTR_ERR(dir);
+			goto out;
+		}
+		set_inode_flag(F2FS_I(dir), FI_DELAY_IPUT);
 	}
 
 	name.len = le32_to_cpu(raw_inode-&amp;gt;i_namelen);
@@ -67,7 +72,6 @@ static int recover_dentry(struct page *ipage, struct inode *inode)
 	} else {
 		err = __f2fs_add_link(dir, &amp;amp;name, inode);
 	}
-	iput(dir);
 out:
 	kunmap(ipage);
 	return err;
&lt;/pre&gt;</description>
    <dc:creator>Jaegeuk Kim</dc:creator>
    <dc:date>2013-05-20T03:32:18</dc:date>
  </item>
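  <!--
  A minimal userspace sketch of the idea behind the patch above: during recovery
  the directory inode is looked up in the dirty-directory list first, the
  reference taken by the first lookup is kept, and it is dropped only once the
  checkpoint has flushed the dirty dentries. This is a Python model under
  simplifying assumptions, not the f2fs implementation: the classes, the
  refcount handling and the in-memory list are all hypothetical stand-ins.

```python
class Inode:
    def __init__(self, ino):
        self.ino = ino
        self.refcount = 0
        self.dirty_dents = 0
        self.delay_iput = False   # models the FI_DELAY_IPUT flag


class RecoveryModel:
    def __init__(self):
        self.inodes = {}       # ino -> Inode, stands in for the inode cache
        self.dirty_dirs = []   # stands in for the dirty directory inode list

    def iget(self, ino):
        inode = self.inodes.get(ino)
        if inode is None:
            inode = Inode(ino)
            self.inodes[ino] = inode
        inode.refcount += 1
        return inode

    def iput(self, inode):
        inode.refcount -= 1
        if inode.refcount == 0:
            # Evicting a directory that still has dirty dentries is the
            # condition the BUG_ON in the report guards against.
            assert inode.dirty_dents == 0, "evicted with dirty dentries"
            del self.inodes[inode.ino]

    def check_dirty_dir_inode(self, ino):
        for inode in self.dirty_dirs:
            if inode.ino == ino:
                return inode
        return None

    def recover_dentry(self, pino):
        dir_inode = self.check_dirty_dir_inode(pino)
        if dir_inode is None:
            dir_inode = self.iget(pino)
            dir_inode.delay_iput = True
        # Adding the link dirties a dentry page of the directory.
        dir_inode.dirty_dents += 1
        if dir_inode not in self.dirty_dirs:
            self.dirty_dirs.append(dir_inode)
        # No iput() here: the reference is held until the checkpoint.

    def write_checkpoint(self):
        for inode in list(self.dirty_dirs):
            inode.dirty_dents = 0            # flush the dirty dentry pages
            self.dirty_dirs.remove(inode)
            if inode.delay_iput:
                inode.delay_iput = False
                self.iput(inode)             # the delayed iput happens here


model = RecoveryModel()
model.recover_dentry(5)
model.recover_dentry(5)        # second entry reuses the held directory inode
refs_during_recovery = model.inodes[5].refcount
model.write_checkpoint()
```

  In the model, two recovered entries under the same parent share one held
  reference, and the eviction check passes because the put happens only after
  the flush.
  -->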
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/74676">
    <title>[PATCH v7 34/34] memcg: reap dead memcgs upon global memory pressure.</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/74676</link>
    <description>&lt;pre&gt;When we delete kmem-enabled memcgs, they can still linger around as
zombies for a while. The reason is that their objects may still be alive,
and we won't be able to delete them at destruction time.

The only entry point for reclaiming them, though, is the shrinkers. The
shrinker interface, however, is not exactly tailored to our needs. It
could be a little bit better by using the API Dave Chinner proposed, but
it is still not ideal since we aren't really a count-and-scan event, but
more a one-off flush-all-you-can event that would have to abuse that
somehow.

Signed-off-by: Glauber Costa &amp;lt;glommer&amp;lt; at &amp;gt;openvz.org&amp;gt;
Cc: Dave Chinner &amp;lt;dchinner&amp;lt; at &amp;gt;redhat.com&amp;gt;
Cc: Mel Gorman &amp;lt;mgorman&amp;lt; at &amp;gt;suse.de&amp;gt;
Cc: Rik van Riel &amp;lt;riel&amp;lt; at &amp;gt;redhat.com&amp;gt;
Cc: Johannes Weiner &amp;lt;hannes&amp;lt; at &amp;gt;cmpxchg.org&amp;gt;
Cc: Michal Hocko &amp;lt;mhocko&amp;lt; at &amp;gt;suse.cz&amp;gt;
Cc: Hugh Dickins &amp;lt;hughd&amp;lt; at &amp;gt;google.com&amp;gt;
Cc: Kamezawa Hiroyuki &amp;lt;kamezawa.hiroyu&amp;lt; at &amp;gt;jp.fujitsu.com&amp;gt;
Cc: Andrew Morton &amp;lt;akpm&amp;lt; at &amp;gt;linux-foundation.org&amp;gt;
---
 mm/memcontrol.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 46 insertions(+), 6 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 6f6a330..7006f03 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -400,7 +400,6 @@ static size_t memcg_size(void)
 		nr_node_ids * sizeof(struct mem_cgroup_per_node);
 }
 
-#ifdef CONFIG_MEMCG_DEBUG_ASYNC_DESTROY
 static LIST_HEAD(dangling_memcgs);
 static DEFINE_MUTEX(dangling_memcgs_mutex);
 
@@ -409,11 +408,14 @@ static inline void memcg_dangling_free(struct mem_cgroup *memcg)
 	mutex_lock(&amp;amp;dangling_memcgs_mutex);
 	list_del(&amp;amp;memcg-&amp;gt;dead);
 	mutex_unlock(&amp;amp;dangling_memcgs_mutex);
+#ifdef CONFIG_MEMCG_DEBUG_ASYNC_DESTROY
 	free_pages((unsigned long)memcg-&amp;gt;memcg_name, 0);
+#endif
 }
 
 static inline void memcg_dangling_add(struct mem_cgroup *memcg)
 {
+#ifdef CONFIG_MEMCG_DEBUG_ASYNC_DESTROY
 	/*
 	 * cgroup.c will do page-sized allocations most of the time,
 	 * so we'll just follow the pattern. Also, __get_free_pages
@@ -439,15 +441,12 @@ static inline void memcg_dangling_add(struct mem_cgroup *memcg)
 	}
 
 add_list:
+#endif
 	INIT_LIST_HEAD(&amp;amp;memcg-&amp;gt;dead);
 	mutex_lock(&amp;amp;dangling_memcgs_mutex);
 	list_add(&amp;amp;memcg-&amp;gt;dead, &amp;amp;dangling_memcgs);
 	mutex_unlock(&amp;amp;dangling_memcgs_mutex);
 }
-#else
-static inline void memcg_dangling_free(struct mem_cgroup *memcg) {}
-static inline void memcg_dangling_add(struct mem_cgroup *memcg) {}
-#endif
 
 static DEFINE_MUTEX(set_limit_mutex);
 
@@ -6312,6 +6311,41 @@ static int mem_cgroup_oom_control_write(struct cgroup *cgrp,
 }
 
 #ifdef CONFIG_MEMCG_KMEM
+static void memcg_vmpressure_shrink_dead(void)
+{
+	struct memcg_cache_params *params, *tmp;
+	struct kmem_cache *cachep;
+	struct mem_cgroup *memcg;
+
+	mutex_lock(&amp;amp;dangling_memcgs_mutex);
+	list_for_each_entry(memcg, &amp;amp;dangling_memcgs, dead) {
+		mutex_lock(&amp;amp;memcg-&amp;gt;slab_caches_mutex);
+		/* The element may go away as an indirect result of shrink */
+		list_for_each_entry_safe(params, tmp,
+					 &amp;amp;memcg-&amp;gt;memcg_slab_caches, list) {
+			cachep = memcg_params_to_cache(params);
+			/*
+			 * the cpu_hotplug lock is taken in kmem_cache_create
+			 * outside the slab_caches_mutex manipulation. It will
+			 * be taken by kmem_cache_shrink to flush the cache.
+			 * So we need to drop the lock. It is all right because
+			 * the lock only protects elements moving in and out the
+			 * list.
+			 */
+			mutex_unlock(&amp;amp;memcg-&amp;gt;slab_caches_mutex);
+			kmem_cache_shrink(cachep);
+			mutex_lock(&amp;amp;memcg-&amp;gt;slab_caches_mutex);
+		}
+		mutex_unlock(&amp;amp;memcg-&amp;gt;slab_caches_mutex);
+	}
+	mutex_unlock(&amp;amp;dangling_memcgs_mutex);
+}
+
+static void memcg_register_kmem_events(struct cgroup *cont)
+{
+	vmpressure_register_kernel_event(cont, memcg_vmpressure_shrink_dead);
+}
+
 static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
 {
 	int ret;
@@ -6347,6 +6381,10 @@ static void kmem_cgroup_destroy(struct mem_cgroup *memcg)
 	}
 }
 #else
+static inline void memcg_register_kmem_events(struct cgroup *cont)
+{
+}
+
 static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
 {
 	return 0;
@@ -6732,8 +6770,10 @@ mem_cgroup_css_online(struct cgroup *cont)
 	struct mem_cgroup *memcg, *parent;
 	int error = 0;
 
-	if (!cont-&amp;gt;parent)
+	if (!cont-&amp;gt;parent) {
+		memcg_register_kmem_events(cont);
 		return 0;
+	}
 
 	mutex_lock(&amp;amp;memcg_create_mutex);
 	memcg = mem_cgroup_from_cont(cont);
&lt;/pre&gt;</description>
    <dc:creator>Glauber Costa</dc:creator>
    <dc:date>2013-05-19T20:07:27</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/74675">
    <title>[PATCH v7 33/34] vmpressure: in-kernel notifications</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/74675</link>
    <description>&lt;pre&gt;From: Glauber Costa &amp;lt;glommer&amp;lt; at &amp;gt;parallels.com&amp;gt;

During the past weeks, it became clear to us that the shrinker interface
we have right now works very well for some particular types of users,
but not that well for others. The latter are usually people interested in
one-shot notifications, who were forced to adapt themselves to the
count+scan behavior of shrinkers. To do so, they had no choice but to
greatly abuse the shrinker interface, producing little monsters all over.

During LSF/MM, one of the proposals that popped up during our session
was to reuse Anton Vorontsov's vmpressure for this. vmpressure events are
designed for userspace consumption, but also provide a well-established,
cgroup-aware entry point for notifications.

This patch extends that to also support in-kernel users. Events that
should be generated for in-kernel consumption will be marked as such,
and for those, we will call a registered function instead of triggering
an eventfd notification.

Please note that due to my lack of understanding of each shrinker user,
I will stay away from converting the actual users; you are all welcome
to do so.

Signed-off-by: Glauber Costa &amp;lt;glommer&amp;lt; at &amp;gt;openvz.org&amp;gt;
Acked-by: Anton Vorontsov &amp;lt;anton&amp;lt; at &amp;gt;enomsg.org&amp;gt;
Acked-by: Pekka Enberg &amp;lt;penberg&amp;lt; at &amp;gt;kernel.org&amp;gt;
Reviewed-by: Greg Thelen &amp;lt;gthelen&amp;lt; at &amp;gt;google.com&amp;gt;
Cc: Dave Chinner &amp;lt;david&amp;lt; at &amp;gt;fromorbit.com&amp;gt;
Cc: John Stultz &amp;lt;john.stultz&amp;lt; at &amp;gt;linaro.org&amp;gt;
Cc: Andrew Morton &amp;lt;akpm&amp;lt; at &amp;gt;linux-foundation.org&amp;gt;
Cc: Joonsoo Kim &amp;lt;js1304&amp;lt; at &amp;gt;gmail.com&amp;gt;
Cc: Michal Hocko &amp;lt;mhocko&amp;lt; at &amp;gt;suse.cz&amp;gt;
Cc: Kamezawa Hiroyuki &amp;lt;kamezawa.hiroyu&amp;lt; at &amp;gt;jp.fujitsu.com&amp;gt;
Cc: Johannes Weiner &amp;lt;hannes&amp;lt; at &amp;gt;cmpxchg.org&amp;gt;
---
 include/linux/vmpressure.h |  6 ++++++
 mm/vmpressure.c            | 52 +++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/include/linux/vmpressure.h b/include/linux/vmpressure.h
index 76be077..3131e72 100644
--- a/include/linux/vmpressure.h
+++ b/include/linux/vmpressure.h
@@ -19,6 +19,9 @@ struct vmpressure {
 /* Have to grab the lock on events traversal or modifications. */
 struct mutex events_lock;
 
+/* False if only kernel users want to be notified, true otherwise. */
+bool notify_userspace;
+
 struct work_struct work;
 };
 
@@ -36,6 +39,9 @@ extern struct vmpressure *css_to_vmpressure(struct cgroup_subsys_state *css);
 extern int vmpressure_register_event(struct cgroup *cg, struct cftype *cft,
      struct eventfd_ctx *eventfd,
      const char *args);
+
+extern int vmpressure_register_kernel_event(struct cgroup *cg,
+    void (*fn)(void));
 extern void vmpressure_unregister_event(struct cgroup *cg, struct cftype *cft,
 struct eventfd_ctx *eventfd);
 #else
diff --git a/mm/vmpressure.c b/mm/vmpressure.c
index 736a601..e16256e 100644
--- a/mm/vmpressure.c
+++ b/mm/vmpressure.c
@@ -135,8 +135,12 @@ static enum vmpressure_levels vmpressure_calc_level(unsigned long scanned,
 }
 
 struct vmpressure_event {
-struct eventfd_ctx *efd;
+union {
+struct eventfd_ctx *efd;
+void (*fn)(void);
+};
 enum vmpressure_levels level;
+bool kernel_event;
 struct list_head node;
 };
 
@@ -152,12 +156,15 @@ static bool vmpressure_event(struct vmpressure *vmpr,
 mutex_lock(&amp;amp;vmpr-&amp;gt;events_lock);
 
 list_for_each_entry(ev, &amp;amp;vmpr-&amp;gt;events, node) {
-if (level &amp;gt;= ev-&amp;gt;level) {
+if (ev-&amp;gt;kernel_event) {
+ev-&amp;gt;fn();
+} else if (vmpr-&amp;gt;notify_userspace &amp;amp;&amp;amp; level &amp;gt;= ev-&amp;gt;level) {
 eventfd_signal(ev-&amp;gt;efd, 1);
 signalled = true;
 }
 }
 
+vmpr-&amp;gt;notify_userspace = false;
 mutex_unlock(&amp;amp;vmpr-&amp;gt;events_lock);
 
 return signalled;
@@ -227,7 +234,7 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg,
  * we account it too.
  */
 if (!(gfp &amp;amp; (__GFP_HIGHMEM | __GFP_MOVABLE | __GFP_IO | __GFP_FS)))
-return;
+goto schedule;
 
 /*
  * If we got here with no pages scanned, then that is an indicator
@@ -244,8 +251,15 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg,
 vmpr-&amp;gt;scanned += scanned;
 vmpr-&amp;gt;reclaimed += reclaimed;
 scanned = vmpr-&amp;gt;scanned;
+/*
+ * If we didn't reach this point, only kernel events will be triggered.
+ * It is the job of the worker thread to clean this up once the
+ * notifications are all delivered.
+ */
+vmpr-&amp;gt;notify_userspace = true;
 mutex_unlock(&amp;amp;vmpr-&amp;gt;sr_lock);
 
+schedule:
 if (scanned &amp;lt; vmpressure_win || work_pending(&amp;amp;vmpr-&amp;gt;work))
 return;
 schedule_work(&amp;amp;vmpr-&amp;gt;work);
@@ -328,6 +342,38 @@ int vmpressure_register_event(struct cgroup *cg, struct cftype *cft,
 }
 
 /**
+ * vmpressure_register_kernel_event() - Register kernel-side notification
+ * @cg: cgroup that is interested in vmpressure notifications
+ * @fn: function to be called when pressure happens
+ *
+ * This function registers in-kernel users interested in receiving notifications
+ * about pressure conditions. Pressure notifications will be triggered at the
+ * same time as userspace notifications (with no particular ordering relative
+ * to it).
+ *
+ * Pressure notifications are an alternative method to shrinkers and will serve
+ * well users that are interested in a one-shot notification, with a
+ * well-defined cgroup aware interface.
+ */
+int vmpressure_register_kernel_event(struct cgroup *cg, void (*fn)(void))
+{
+struct vmpressure *vmpr = cg_to_vmpressure(cg);
+struct vmpressure_event *ev;
+
+ev = kzalloc(sizeof(*ev), GFP_KERNEL);
+if (!ev)
+return -ENOMEM;
+
+ev-&amp;gt;kernel_event = true;
+ev-&amp;gt;fn = fn;
+
+mutex_lock(&amp;amp;vmpr-&amp;gt;events_lock);
+list_add(&amp;amp;ev-&amp;gt;node, &amp;amp;vmpr-&amp;gt;events);
+mutex_unlock(&amp;amp;vmpr-&amp;gt;events_lock);
+return 0;
+}
+
+/**
  * vmpressure_unregister_event() - Unbind eventfd from vmpressure
  * @cg: cgroup handle
  * @cft: cgroup control files handle
&lt;/pre&gt;</description>
    <dc:creator>Glauber Costa</dc:creator>
    <dc:date>2013-05-19T20:07:26</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/74674">
    <title>[PATCH v7 32/34] memcg: move initialization to memcg creation</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/74674</link>
    <description>&lt;pre&gt;Those structures are only used for memcgs that are effectively using
kmemcg. However, in a later patch I intend to scan that list
unconditionally (list empty meaning no kmem caches present), which
simplifies the code a lot.

So move the initialization to early kmem creation.

Signed-off-by: Glauber Costa &amp;lt;glommer&amp;lt; at &amp;gt;openvz.org&amp;gt;
Cc: Dave Chinner &amp;lt;dchinner&amp;lt; at &amp;gt;redhat.com&amp;gt;
Cc: Mel Gorman &amp;lt;mgorman&amp;lt; at &amp;gt;suse.de&amp;gt;
Cc: Rik van Riel &amp;lt;riel&amp;lt; at &amp;gt;redhat.com&amp;gt;
Cc: Johannes Weiner &amp;lt;hannes&amp;lt; at &amp;gt;cmpxchg.org&amp;gt;
Cc: Michal Hocko &amp;lt;mhocko&amp;lt; at &amp;gt;suse.cz&amp;gt;
Cc: Hugh Dickins &amp;lt;hughd&amp;lt; at &amp;gt;google.com&amp;gt;
Cc: Kamezawa Hiroyuki &amp;lt;kamezawa.hiroyu&amp;lt; at &amp;gt;jp.fujitsu.com&amp;gt;
Cc: Andrew Morton &amp;lt;akpm&amp;lt; at &amp;gt;linux-foundation.org&amp;gt;
---
 mm/memcontrol.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b8980d1..6f6a330 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3323,9 +3323,7 @@ int memcg_update_cache_sizes(struct mem_cgroup *memcg)
 
 memcg_update_array_size(num + 1);
 
-INIT_LIST_HEAD(&amp;amp;memcg-&amp;gt;memcg_slab_caches);
 INIT_WORK(&amp;amp;memcg-&amp;gt;kmemcg_shrink_work, kmemcg_shrink_work_fn);
-mutex_init(&amp;amp;memcg-&amp;gt;slab_caches_mutex);
 
 return 0;
 out:
@@ -6318,6 +6316,8 @@ static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
 {
 int ret;
 
+INIT_LIST_HEAD(&amp;amp;memcg-&amp;gt;memcg_slab_caches);
+mutex_init(&amp;amp;memcg-&amp;gt;slab_caches_mutex);
 memcg-&amp;gt;kmemcg_id = -1;
 ret = memcg_propagate_kmem(memcg);
 if (ret)
&lt;/pre&gt;</description>
    <dc:creator>Glauber Costa</dc:creator>
    <dc:date>2013-05-19T20:07:25</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.linux.file-systems/74673">
    <title>[PATCH v7 31/34] super: targeted memcg reclaim</title>
    <link>http://permalink.gmane.org/gmane.linux.file-systems/74673</link>
    <description>&lt;pre&gt;We now have all our dentries and inodes placed in memcg-specific LRU
lists. All we have to do is restrict the reclaim to the said lists in
case of memcg pressure.

That can't be done so easily for the fs_objects part of the equation,
since this is heavily fs-specific. What we do is pass on the context,
and let the filesystems decide whether they ever want to use it. At this
time, we just don't shrink them under memcg pressure (none is supported),
leaving that for global pressure only.

Marking the superblock shrinker and its LRUs as memcg-aware will
guarantee that the shrinkers will get invoked during targeted reclaim.

Signed-off-by: Glauber Costa &amp;lt;glommer&amp;lt; at &amp;gt;openvz.org&amp;gt;
Cc: Dave Chinner &amp;lt;dchinner&amp;lt; at &amp;gt;redhat.com&amp;gt;
Cc: Mel Gorman &amp;lt;mgorman&amp;lt; at &amp;gt;suse.de&amp;gt;
Cc: Rik van Riel &amp;lt;riel&amp;lt; at &amp;gt;redhat.com&amp;gt;
Cc: Johannes Weiner &amp;lt;hannes&amp;lt; at &amp;gt;cmpxchg.org&amp;gt;
Cc: Michal Hocko &amp;lt;mhocko&amp;lt; at &amp;gt;suse.cz&amp;gt;
Cc: Hugh Dickins &amp;lt;hughd&amp;lt; at &amp;gt;google.com&amp;gt;
Cc: Kamezawa Hiroyuki &amp;lt;kamezawa.hiroyu&amp;lt; at &amp;gt;jp.fujitsu.com&amp;gt;
Cc: Andrew Morton &amp;lt;akpm&amp;lt; at &amp;gt;linux-foundation.org&amp;gt;
---
 fs/dcache.c   |  7 ++++---
 fs/inode.c    |  7 ++++---
 fs/internal.h |  5 +++--
 fs/super.c    | 39 ++++++++++++++++++++++++++-------------
 4 files changed, 37 insertions(+), 21 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index e07aa73..cace5cd 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -889,13 +889,14 @@ dentry_lru_isolate(struct list_head *item, spinlock_t *lru_lock, void *arg)
  * use.
  */
 long prune_dcache_sb(struct super_block *sb, unsigned long nr_to_scan,
-     int nid)
+     int nid, struct mem_cgroup *memcg)
 {
 LIST_HEAD(dispose);
 long freed;
 
-freed = list_lru_walk_node(&amp;amp;sb-&amp;gt;s_dentry_lru, nid, dentry_lru_isolate,
-       &amp;amp;dispose, &amp;amp;nr_to_scan);
+freed = list_lru_walk_node_memcg(&amp;amp;sb-&amp;gt;s_dentry_lru, nid,
+dentry_lru_isolate, &amp;amp;dispose,
+&amp;amp;nr_to_scan, memcg);
 shrink_dentry_list(&amp;amp;dispose);
 return freed;
 }
diff --git a/fs/inode.c b/fs/inode.c
index 00b804e..b9a8125 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -747,13 +747,14 @@ inode_lru_isolate(struct list_head *item, spinlock_t *lru_lock, void *arg)
  * then are freed outside inode_lock by dispose_list().
  */
 long prune_icache_sb(struct super_block *sb, unsigned long nr_to_scan,
-     int nid)
+     int nid, struct mem_cgroup *memcg)
 {
 LIST_HEAD(freeable);
 long freed;
 
-freed = list_lru_walk_node(&amp;amp;sb-&amp;gt;s_inode_lru, nid, inode_lru_isolate,
-       &amp;amp;freeable, &amp;amp;nr_to_scan);
+freed = list_lru_walk_node_memcg(&amp;amp;sb-&amp;gt;s_inode_lru, nid,
+inode_lru_isolate, &amp;amp;freeable,
+&amp;amp;nr_to_scan, memcg);
 dispose_list(&amp;amp;freeable);
 return freed;
 }
diff --git a/fs/internal.h b/fs/internal.h
index 8902d56..601bd15 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -16,6 +16,7 @@ struct file_system_type;
 struct linux_binprm;
 struct path;
 struct mount;
+struct mem_cgroup;
 
 /*
  * block_dev.c
@@ -111,7 +112,7 @@ extern int open_check_o_direct(struct file *f);
  */
 extern spinlock_t inode_sb_list_lock;
 extern long prune_icache_sb(struct super_block *sb, unsigned long nr_to_scan,
-    int nid);
+    int nid, struct mem_cgroup *memcg);
 extern void inode_add_lru(struct inode *inode);
 
 /*
@@ -128,7 +129,7 @@ extern int invalidate_inodes(struct super_block *, bool);
  */
 extern struct dentry *__d_alloc(struct super_block *, const struct qstr *);
 extern long prune_dcache_sb(struct super_block *sb, unsigned long nr_to_scan,
-    int nid);
+    int nid, struct mem_cgroup *memcg);
 
 /*
  * read_write.c
diff --git a/fs/super.c b/fs/super.c
index caf7639..b5c2a4d 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -34,6 +34,7 @@
 #include &amp;lt;linux/cleancache.h&amp;gt;
 #include &amp;lt;linux/fsnotify.h&amp;gt;
 #include &amp;lt;linux/lockdep.h&amp;gt;
+#include &amp;lt;linux/memcontrol.h&amp;gt;
 #include "internal.h"
 
 
@@ -56,6 +57,7 @@ static char *sb_writers_name[SB_FREEZE_LEVELS] = {
 static long super_cache_scan(struct shrinker *shrink, struct shrink_control *sc)
 {
 struct super_block *sb;
+struct mem_cgroup *memcg = sc-&amp;gt;target_mem_cgroup;
 long fs_objects = 0;
 long total_objects;
 long freed = 0;
@@ -74,11 +76,12 @@ static long super_cache_scan(struct shrinker *shrink, struct shrink_control *sc)
 if (!grab_super_passive(sb))
 return -1;
 
-if (sb-&amp;gt;s_op &amp;amp;&amp;amp; sb-&amp;gt;s_op-&amp;gt;nr_cached_objects)
+if (sb-&amp;gt;s_op &amp;amp;&amp;amp; sb-&amp;gt;s_op-&amp;gt;nr_cached_objects &amp;amp;&amp;amp; !memcg)
 fs_objects = sb-&amp;gt;s_op-&amp;gt;nr_cached_objects(sb, sc-&amp;gt;nid);
 
-inodes = list_lru_count_node(&amp;amp;sb-&amp;gt;s_inode_lru, sc-&amp;gt;nid);
-dentries = list_lru_count_node(&amp;amp;sb-&amp;gt;s_dentry_lru, sc-&amp;gt;nid);
+inodes = list_lru_count_node_memcg(&amp;amp;sb-&amp;gt;s_inode_lru, sc-&amp;gt;nid, memcg);
+dentries = list_lru_count_node_memcg(&amp;amp;sb-&amp;gt;s_dentry_lru, sc-&amp;gt;nid, memcg);
+
 total_objects = dentries + inodes + fs_objects + 1;
 
 /* proportion the scan between the caches */
@@ -89,8 +92,8 @@ static long super_cache_scan(struct shrinker *shrink, struct shrink_control *sc)
  * prune the dcache first as the icache is pinned by it, then
  * prune the icache, followed by the filesystem specific caches
  */
-freed = prune_dcache_sb(sb, dentries, sc-&amp;gt;nid);
-freed += prune_icache_sb(sb, inodes, sc-&amp;gt;nid);
+freed = prune_dcache_sb(sb, dentries, sc-&amp;gt;nid, memcg);
+freed += prune_icache_sb(sb, inodes, sc-&amp;gt;nid, memcg);
 
 if (fs_objects) {
 fs_objects = mult_frac(sc-&amp;gt;nr_to_scan, fs_objects,
@@ -107,20 +110,26 @@ static long super_cache_count(struct shrinker *shrink, struct shrink_control *sc
 {
 struct super_block *sb;
 long total_objects = 0;
+struct mem_cgroup *memcg = sc-&amp;gt;target_mem_cgroup;
 
 sb = container_of(shrink, struct super_block, s_shrink);
 
 if (!grab_super_passive(sb))
 return -1;
 
-if (sb-&amp;gt;s_op &amp;amp;&amp;amp; sb-&amp;gt;s_op-&amp;gt;nr_cached_objects)
+/*
+ * Ideally we would pass memcg to nr_cached_objects, and
+ * let the underlying filesystem decide. Most likely the
+ * path will be if (!memcg) return;, but even then.
+ */
+if (sb-&amp;gt;s_op &amp;amp;&amp;amp; sb-&amp;gt;s_op-&amp;gt;nr_cached_objects &amp;amp;&amp;amp; !memcg)
 total_objects = sb-&amp;gt;s_op-&amp;gt;nr_cached_objects(sb,
  sc-&amp;gt;nid);
 
-total_objects += list_lru_count_node(&amp;amp;sb-&amp;gt;s_dentry_lru,
- sc-&amp;gt;nid);
-total_objects += list_lru_count_node(&amp;amp;sb-&amp;gt;s_inode_lru,
- sc-&amp;gt;nid);
+total_objects += list_lru_count_node_memcg(&amp;amp;sb-&amp;gt;s_dentry_lru,
+ sc-&amp;gt;nid, memcg);
+total_objects += list_lru_count_node_memcg(&amp;amp;sb-&amp;gt;s_inode_lru,
+ sc-&amp;gt;nid, memcg);
 
 total_objects = vfs_pressure_ratio(total_objects);
 drop_super(sb);
@@ -199,8 +208,10 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags)
 INIT_HLIST_NODE(&amp;amp;s-&amp;gt;s_instances);
 INIT_HLIST_BL_HEAD(&amp;amp;s-&amp;gt;s_anon);
 INIT_LIST_HEAD(&amp;amp;s-&amp;gt;s_inodes);
-list_lru_init(&amp;amp;s-&amp;gt;s_dentry_lru);
-list_lru_init(&amp;amp;s-&amp;gt;s_inode_lru);
+
+list_lru_init_memcg(&amp;amp;s-&amp;gt;s_dentry_lru);
+list_lru_init_memcg(&amp;amp;s-&amp;gt;s_inode_lru);
+
 INIT_LIST_HEAD(&amp;amp;s-&amp;gt;s_mounts);
 init_rwsem(&amp;amp;s-&amp;gt;s_umount);
 lockdep_set_class(&amp;amp;s-&amp;gt;s_umount, &amp;amp;type-&amp;gt;s_umount_key);
@@ -236,7 +247,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags)
 s-&amp;gt;s_shrink.scan_objects = super_cache_scan;
 s-&amp;gt;s_shrink.count_objects = super_cache_count;
 s-&amp;gt;s_shrink.batch = 1024;
-s-&amp;gt;s_shrink.flags = SHRINKER_NUMA_AWARE;
+s-&amp;gt;s_shrink.flags = SHRINKER_NUMA_AWARE | SHRINKER_MEMCG_AWARE;
 }
 out:
 return s;
@@ -319,6 +330,8 @@ void deactivate_locked_super(struct super_block *s)
 
 /* caches are now gone, we can safely kill the shrinker now */
 unregister_shrinker(&amp;amp;s-&amp;gt;s_shrink);
+list_lru_destroy(&amp;amp;s-&amp;gt;s_dentry_lru);
+list_lru_destroy(&amp;amp;s-&amp;gt;s_inode_lru);
 put_filesystem(fs);
 put_super(s);
 } else {
&lt;/pre&gt;</description>
    <dc:creator>Glauber Costa</dc:creator>
    <dc:date>2013-05-19T20:07:24</dc:date>
  </item>
  <textinput rdf:about="http://search.gmane.org/?group=$group=gmane.linux.file-systems">
    <title>Search Engine</title>
    <description>Search the mailing list at Gmane</description>
    <name>query</name>
    <link>http://search.gmane.org/?group=$group=gmane.linux.file-systems</link>
  </textinput>
</rdf:RDF>
