
[PATCH v2 0/5] mm: modest useability enhancements for node..

 
Alex Chiang

External


Since: Aug 22, 2007
Posts: 87



(Msg. 1) Posted: Wed Oct 21, 2009 11:20 pm
Post subject: [PATCH v2 0/5] mm: modest useability enhancements for node sysfs attrs
Archived from groups: linux>kernel

This is v2 of the series.

The last patch in this series is dependent upon the documentation patch
series that I just sent out a few moments ago:

http://thread.gmane.org/gmane.linux.kernel/905018

Thanks,
/ac


v1 -> v2: http://thread.gmane.org/gmane.linux.kernel.mm/40084/
  Address David Rientjes's comments
  - check return value of sysfs_create_link in register_cpu_under_node
  - do /not/ convert [un]register_cpu_under_node to return void, since
    sparse starts whinging if you ignore sysfs_create_link()'s return
    value and working around sparse makes the code ugly
  - adjust documentation

Added S390 maintainers to cc: for patch [1/5] as per Kame-san's
suggestion. S390 may map a memory section to more than one node,
causing this series to break.

---

Alex Chiang (5):
      mm: add numa node symlink for memory section in sysfs
      mm: refactor register_cpu_under_node()
      mm: refactor unregister_cpu_under_node()
      mm: add numa node symlink for cpu devices in sysfs
      Documentation: ABI: /sys/devices/system/cpu/cpu#/node


 Documentation/ABI/testing/sysfs-devices-memory     |   14 ++++-
 Documentation/ABI/testing/sysfs-devices-system-cpu |   15 +++++
 Documentation/memory-hotplug.txt                   |   11 ++--
 drivers/base/node.c                                |   58 ++++++++++++++------
 4 files changed, 77 insertions(+), 21 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Alex Chiang

External


Since: Aug 22, 2007
Posts: 87



(Msg. 2) Posted: Wed Oct 21, 2009 11:20 pm
Post subject: [PATCH v2 5/5] Documentation: ABI: /sys/devices/system/cpu/cpu#/node
Archived from groups: per prev. post

Describe NUMA node symlink created for CPUs when CONFIG_NUMA is set.

Cc: Greg KH <greg@kroah.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
---

Documentation/ABI/testing/sysfs-devices-system-cpu | 15 +++++++++++++++
1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index b400c34..67813ae 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -79,6 +79,21 @@ Description:	Discover and change the online state of a CPU.
 
 		For more information, please read Documentation/cpu-hotplug.txt
 
+
+What:		/sys/devices/system/cpu/cpu#/node
+Date:		October 2009
+Contact:	Linux memory management mailing list <linux-mm@kvack.org>
+Description:	Discover NUMA node a CPU belongs to
+
+		When CONFIG_NUMA is enabled, a symbolic link that points
+		to the corresponding NUMA node directory.
+
+		For example, the following symlink is created for cpu42
+		in NUMA node 2:
+
+		/sys/devices/system/cpu/cpu42/node2 -> ../../node/node2
+
+
 What:		/sys/devices/system/cpu/cpu#/topology/core_id
 		/sys/devices/system/cpu/cpu#/topology/core_siblings
 		/sys/devices/system/cpu/cpu#/topology/core_siblings_list
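
As an illustration of how a userspace tool might consume the symlink documented above, here is a minimal C sketch. The function names parse_node_name() and cpu_to_node_sysfs() are hypothetical and not part of this patch; the sketch only assumes the "node<N>" entry name shown in the example, and it returns -1 when no such entry exists (for instance on a kernel without CONFIG_NUMA or without this series).

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a directory entry name of the form "node<N>".
 * Returns N, or -1 if the name does not have that shape.
 * (Hypothetical helper, not a kernel or libc API.)
 */
int parse_node_name(const char *name)
{
	char *end;
	long nid;

	assert(name != NULL);
	if (strncmp(name, "node", 4) != 0 || name[4] < '0' || name[4] > '9')
		return -1;
	errno = 0;
	nid = strtol(name + 4, &end, 10);
	if (errno != 0 || *end != '\0' || nid > INT_MAX)
		return -1;
	return (int)nid;
}

/*
 * Return the NUMA node of a CPU by scanning its sysfs directory for a
 * "node<N>" entry, or -1 if the directory or the entry is missing.
 */
int cpu_to_node_sysfs(int cpu)
{
	char path[64];
	struct dirent *de;
	DIR *dir;
	int nid = -1;

	snprintf(path, sizeof(path), "/sys/devices/system/cpu/cpu%d", cpu);
	dir = opendir(path);
	if (!dir)
		return -1;
	/* Stop at the first entry whose name parses as "node<N>". */
	while (nid < 0 && (de = readdir(dir)) != NULL)
		nid = parse_node_name(de->d_name);
	closedir(dir);
	return nid;
}
```

With the cpu42 example from the ABI text, cpu_to_node_sysfs(42) would be expected to find the "node2" entry; only the entry name is inspected, so the link target is never dereferenced.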

Alex Chiang

External


Since: Aug 22, 2007
Posts: 87



(Msg. 3) Posted: Wed Oct 21, 2009 11:20 pm
Post subject: [PATCH v2 2/5] mm: refactor register_cpu_under_node()
Archived from groups: per prev. post

By returning early if the node is not online, we can unindent the
interesting code by one level.

No functional change.

Signed-off-by: Alex Chiang <achiang@hp.com>
---

drivers/base/node.c | 20 +++++++++++---------
1 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index 3108b21..ef7dd22 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -227,16 +227,18 @@ struct node node_devices[MAX_NUMNODES];
  */
 int register_cpu_under_node(unsigned int cpu, unsigned int nid)
 {
-	if (node_online(nid)) {
-		struct sys_device *obj = get_cpu_sysdev(cpu);
-		if (!obj)
-			return 0;
-		return sysfs_create_link(&node_devices[nid].sysdev.kobj,
-					 &obj->kobj,
-					 kobject_name(&obj->kobj));
-	}
+	struct sys_device *obj;
 
-	return 0;
+	if (!node_online(nid))
+		return 0;
+
+	obj = get_cpu_sysdev(cpu);
+	if (!obj)
+		return 0;
+
+	return sysfs_create_link(&node_devices[nid].sysdev.kobj,
+				 &obj->kobj,
+				 kobject_name(&obj->kobj));
 }
 
 int unregister_cpu_under_node(unsigned int cpu, unsigned int nid)

David Rientjes

External


Since: Jan 29, 2007
Posts: 184



(Msg. 4) Posted: Thu Oct 22, 2009 3:20 pm
Post subject: Re: [PATCH v2 1/5] mm: add numa node symlink for memory section in sysfs
Archived from groups: per prev. post

On Wed, 21 Oct 2009, Alex Chiang wrote:

> Commit c04fc586c (mm: show node to memory section relationship with
> symlinks in sysfs) created symlinks from nodes to memory sections, e.g.
>
> /sys/devices/system/node/node1/memory135 -> ../../memory/memory135
>
> If you're examining the memory section though and are wondering what
> node it might belong to, you can find it by grovelling around in
> sysfs, but it's a little cumbersome.
>
> Add a reverse symlink for each memory section that points back to the
> node to which it belongs.
>
> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
> Cc: Gary Hade <garyhade@us.ibm.com>
> Cc: Badari Pulavarty <pbadari@us.ibm.com>
> Cc: Ingo Molnar <mingo@elte.hu>
> Signed-off-by: Alex Chiang <achiang@hp.com>

Acked-by: David Rientjes <rientjes@google.com>

Very helpful backlinks to memory section nodes even though I have lots of
memory directories on some of my test machines :)
David Rientjes

External


Since: Jan 29, 2007
Posts: 184



(Msg. 5) Posted: Thu Oct 22, 2009 3:20 pm
Post subject: Re: [PATCH v2 4/5] mm: add numa node symlink for cpu devices in sysfs
Archived from groups: per prev. post

On Wed, 21 Oct 2009, Alex Chiang wrote:

> You can discover which CPUs belong to a NUMA node by examining
> /sys/devices/system/node/node#/
>
> However, it's not convenient to go in the other direction, when looking at
> /sys/devices/system/cpu/cpu#/
>
> Yes, you can muck about in sysfs, but adding these symlinks makes
> life a lot more convenient.
>
> Signed-off-by: Alex Chiang <achiang@hp.com>

Acked-by: David Rientjes <rientjes@google.com>
Alex Chiang

External


Since: Aug 22, 2007
Posts: 87



(Msg. 6) Posted: Tue Oct 27, 2009 3:20 pm
Post subject: Re: [PATCH v2 1/5] mm: add numa node symlink for memory section in sysfs
Archived from groups: per prev. post

Thank you for ACKing, David.

S390 guys, I cc'ed you on this patch because I heard a rumour
that your memory sections may belong to more than one NUMA node?
Is that true? If so, how would you like me to handle that
situation?

Any comments on this patch series would be appreciated.

Thanks.
/ac

* David Rientjes <rientjes@google.com>:
> On Wed, 21 Oct 2009, Alex Chiang wrote:
>
> > Commit c04fc586c (mm: show node to memory section relationship with
> > symlinks in sysfs) created symlinks from nodes to memory sections, e.g.
> >
> > /sys/devices/system/node/node1/memory135 -> ../../memory/memory135
> >
> > If you're examining the memory section though and are wondering what
> > node it might belong to, you can find it by grovelling around in
> > sysfs, but it's a little cumbersome.
> >
> > Add a reverse symlink for each memory section that points back to the
> > node to which it belongs.
> >
> > Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
> > Cc: Gary Hade <garyhade@us.ibm.com>
> > Cc: Badari Pulavarty <pbadari@us.ibm.com>
> > Cc: Ingo Molnar <mingo@elte.hu>
> > Signed-off-by: Alex Chiang <achiang@hp.com>
>
> Acked-by: David Rientjes <rientjes@google.com>
>
> Very helpful backlinks to memory section nodes even though I have lots of
> memory directories on some of my test machines :)
David Rientjes

External


Since: Jan 29, 2007
Posts: 184



(Msg. 7) Posted: Tue Oct 27, 2009 5:20 pm
Post subject: Re: [PATCH v2 1/5] mm: add numa node symlink for memory section in sysfs
Archived from groups: per prev. post

On Tue, 27 Oct 2009, Alex Chiang wrote:

> Thank you for ACKing, David.
>
> S390 guys, I cc'ed you on this patch because I heard a rumour
> that your memory sections may belong to more than one NUMA node?
> Is that true? If so, how would you like me to handle that
> situation?
>

You're referring to how unregister_mem_sect_under_nodes() should be
handled, right? register_mem_sect_under_node() already looks supported by
your patch.

Since the unregister function includes a plural "nodes," I assume that
it's possible for hotplug to register a memory section to more than one
node. That's probably lacking on x86 currently, however, because we lack
node hotplug.

I'd suggest a similar iteration through pfn's that the register function
does checking for multiple nodes and then removing the link from all
applicable node_devices kobj when unregistering.

Maybe one of the s390 maintainers will test that?
Heiko Carstens

External


Since: Nov 14, 2006
Posts: 125



(Msg. 8) Posted: Wed Oct 28, 2009 5:20 am
Post subject: Re: [PATCH v2 1/5] mm: add numa node symlink for memory section in sysfs
Archived from groups: per prev. post

On Tue, Oct 27, 2009 at 02:27:56PM -0700, David Rientjes wrote:
> On Tue, 27 Oct 2009, Alex Chiang wrote:
>
> > Thank you for ACKing, David.
> >
> > S390 guys, I cc'ed you on this patch because I heard a rumour
> > that your memory sections may belong to more than one NUMA node?
> > Is that true? If so, how would you like me to handle that
> > situation?
> >
>
> You're referring to how unregister_mem_sect_under_nodes() should be
> handled, right? register_mem_sect_under_node() already looks supported by
> your patch.
>
> Since the unregister function includes a plural "nodes," I assume that
> it's possible for hotplug to register a memory section to more than one
> node. That's probably lacking on x86 currently, however, because we lack
> node hotplug.
>
> I'd suggest a similar iteration through pfn's that the register function
> does checking for multiple nodes and then removing the link from all
> applicable node_devices kobj when unregistering.
>
> Maybe one of the s390 maintainers will test that?

The short answer is: s390 doesn't support NUMA, because the hardware doesn't
tell us to which node (book in s390 terms) a memory range belongs to.

Memory layout for a logical partition is striped: first x mbyte belong to
node 0, next x mbyte belong to node 1, etc...

Also, since there is always a hypervisor running below Linux I don't think
it would make too much sense if we would know to which node a piece of
memory belongs to: if the hypervisor decides to schedule a virtual cpu of
a logical partition to a different node then what?
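
To make the striped layout described above concrete, here is a small C sketch of the arithmetic. It is only an illustration: the stripe size ("x mbyte") and the number of books are not given in this thread, so the values used are hypothetical, and the sketch assumes stripes wrap around to node 0 after the last node. The function names are made up for this example.

```c
#include <assert.h>

/*
 * Map a memory offset (in MB) to the node that owns it under a striped
 * layout: stripe 0 -> node 0, stripe 1 -> node 1, and so on.
 * ASSUMPTION: stripes wrap around to node 0 after the last node.
 */
unsigned int stripe_to_node(unsigned long offset_mb, unsigned long stripe_mb,
			    unsigned int nr_nodes)
{
	assert(stripe_mb > 0 && nr_nodes > 0);
	return (unsigned int)((offset_mb / stripe_mb) % nr_nodes);
}

/*
 * Count how many distinct nodes a block of block_mb megabytes touches
 * when it starts at start_mb. Consecutive stripes land on consecutive
 * nodes, so the answer is the stripe count capped at nr_nodes.
 */
unsigned int nodes_spanned(unsigned long start_mb, unsigned long block_mb,
			   unsigned long stripe_mb, unsigned int nr_nodes)
{
	unsigned long first, last, stripes;

	assert(block_mb > 0 && stripe_mb > 0 && nr_nodes > 0);
	first = start_mb / stripe_mb;
	last = (start_mb + block_mb - 1) / stripe_mb;
	stripes = last - first + 1;
	return stripes < nr_nodes ? (unsigned int)stripes : nr_nodes;
}
```

Under these assumptions, any block that is larger than one stripe, or that straddles a stripe boundary, touches more than one node, which is the situation the original question to the s390 maintainers was about.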
David Rientjes

External


Since: Jan 29, 2007
Posts: 184



(Msg. 9) Posted: Wed Oct 28, 2009 5:20 am
Post subject: Re: [PATCH v2 1/5] mm: add numa node symlink for memory section in sysfs
Archived from groups: per prev. post

On Wed, 28 Oct 2009, Heiko Carstens wrote:

> The short answer is: s390 doesn't support NUMA, because the hardware doesn't
> tell us to which node (book in s390 terms) a memory range belongs to.
>
> Memory layout for a logical partition is striped: first x mbyte belong to
> node 0, next x mbyte belong to node 1, etc...
>
> Also, since there is always a hypervisor running below Linux I don't think
> it would make too much sense if we would know to which node a piece of
> memory belongs to: if the hypervisor decides to schedule a virtual cpu of
> a logical partition to a different node then what?
>

Ok, so the patchset is a no-op for s390 since it only utilizes the
!CONFIG_NUMA code.

Alex, I think the safest thing to do in unregister_mem_sect_under_nodes()
is to iterate through the section pfns and remove links to the node_device
kobjs for all the distinct pfn_to_nid()'s that it encounters.
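
The walk suggested here can be modelled in userspace as follows. This is a sketch under stated assumptions, not kernel code: the callback type nid_lookup_fn stands in for pfn_to_nid(), MODEL_MAX_NODES is an arbitrary limit chosen so the "seen" set fits in one 64-bit word, and all names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define MODEL_MAX_NODES 64	/* hypothetical limit for this model */

/* Callback standing in for pfn_to_nid(); a negative value means "no node". */
typedef int (*nid_lookup_fn)(unsigned long pfn);

/*
 * Walk [start_pfn, end_pfn] and write each node id to out[] the first
 * time it is seen. Returns the number of ids written to out[].
 */
size_t collect_distinct_nids(unsigned long start_pfn, unsigned long end_pfn,
			     nid_lookup_fn lookup, int *out, size_t out_len)
{
	unsigned long long seen = 0;	/* one bit per node id */
	unsigned long pfn;
	size_t count = 0;

	assert(lookup != NULL && out != NULL);
	assert(end_pfn < ULONG_MAX);	/* keeps the inclusive loop finite */
	for (pfn = start_pfn; pfn <= end_pfn; pfn++) {
		int nid = lookup(pfn);

		if (nid < 0 || nid >= MODEL_MAX_NODES)
			continue;		/* hole or out of range */
		if (seen & (1ULL << nid))
			continue;		/* node already handled */
		seen |= 1ULL << nid;
		if (count < out_len)
			out[count++] = nid;
	}
	return count;
}

/* Example lookup: 4 pfns per node, with pfn 5 acting as a hole. */
int demo_lookup(unsigned long pfn)
{
	return pfn == 5 ? -1 : (int)(pfn / 4);
}
```

In the kernel, the step that records a node in out[] would instead remove the sysfs links for that node; the deduplication is what keeps each link from being removed more than once per section.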
Alex Chiang

External


Since: Aug 22, 2007
Posts: 87



(Msg. 10) Posted: Wed Oct 28, 2009 1:20 pm
Post subject: Re: [PATCH v2 1/5] mm: add numa node symlink for memory section in sysfs
Archived from groups: per prev. post

* David Rientjes <rientjes@google.com>:
> On Wed, 28 Oct 2009, Heiko Carstens wrote:
>
> > The short answer is: s390 doesn't support NUMA, because the hardware doesn't
> > tell us to which node (book in s390 terms) a memory range belongs to.
> >
> > Memory layout for a logical partition is striped: first x mbyte belong to
> > node 0, next x mbyte belong to node 1, etc...
> >
> > Also, since there is always a hypervisor running below Linux I don't think
> > it would make too much sense if we would know to which node a piece of
> > memory belongs to: if the hypervisor decides to schedule a virtual cpu of
> > a logical partition to a different node then what?
> >
>
> Ok, so the patchset is a no-op for s390 since it only utilizes the
> !CONFIG_NUMA code.

Sounds good.

> Alex, I think the safest thing to do in unregister_mem_sect_under_nodes()
> is to iterate through the section pfns and remove links to the node_device
> kobjs for all the distinct pfn_to_nid()'s that it encounters.

Ok, I will respin.

Thanks!
/ac
Alex Chiang

External


Since: Aug 22, 2007
Posts: 87



(Msg. 11) Posted: Wed Oct 28, 2009 3:20 pm
Post subject: Re: [PATCH v2 1/5] mm: add numa node symlink for memory section in sysfs
Archived from groups: per prev. post

* David Rientjes <rientjes@google.com>:
>
> Alex, I think the safest thing to do in unregister_mem_sect_under_nodes()
> is to iterate through the section pfns and remove links to the node_device
> kobjs for all the distinct pfn_to_nid()'s that it encounters.

Am I not understanding the code? It looks like we do this
already...

/* unregister memory section under all nodes that it spans */
int unregister_mem_sect_under_nodes(struct memory_block *mem_blk)
{
	nodemask_t unlinked_nodes;
	unsigned long pfn, sect_start_pfn, sect_end_pfn;

	if (!mem_blk)
		return -EFAULT;
	nodes_clear(unlinked_nodes);
	sect_start_pfn = section_nr_to_pfn(mem_blk->phys_index);
	sect_end_pfn = sect_start_pfn + PAGES_PER_SECTION - 1;
	for (pfn = sect_start_pfn; pfn <= sect_end_pfn; pfn++) {
		int nid;

		nid = get_nid_for_pfn(pfn);
		if (nid < 0)
			continue;
		if (!node_online(nid))
			continue;
		if (node_test_and_set(nid, unlinked_nodes))
			continue;
		sysfs_remove_link(&node_devices[nid].sysdev.kobj,
			 kobject_name(&mem_blk->sysdev.kobj));
		sysfs_remove_link(&mem_blk->sysdev.kobj,
			 kobject_name(&node_devices[nid].sysdev.kobj));
	}
	return 0;
}

Thanks,
/ac

David Rientjes

External


Since: Jan 29, 2007
Posts: 184



(Msg. 12) Posted: Wed Oct 28, 2009 5:20 pm
Post subject: [patch -mm] mm: slab allocate memory section nodemask for large systems
Archived from groups: per prev. post

On Wed, 28 Oct 2009, Alex Chiang wrote:

> Am I not understanding the code? It looks like we do this
> already...
>
> /* unregister memory section under all nodes that it spans */
> int unregister_mem_sect_under_nodes(struct memory_block *mem_blk)
> {
> nodemask_t unlinked_nodes;
> unsigned long pfn, sect_start_pfn, sect_end_pfn;
>
> if (!mem_blk)
> return -EFAULT;
> nodes_clear(unlinked_nodes);
> sect_start_pfn = section_nr_to_pfn(mem_blk->phys_index);
> sect_end_pfn = sect_start_pfn + PAGES_PER_SECTION - 1;
> for (pfn = sect_start_pfn; pfn <= sect_end_pfn; pfn++) {
> int nid;
>
> nid = get_nid_for_pfn(pfn);
> if (nid < 0)
> continue;
> if (!node_online(nid))
> continue;
> if (node_test_and_set(nid, unlinked_nodes))
> continue;
> sysfs_remove_link(&node_devices[nid].sysdev.kobj,
> kobject_name(&mem_blk->sysdev.kobj));
> sysfs_remove_link(&mem_blk->sysdev.kobj,
> kobject_name(&node_devices[nid].sysdev.kobj));
> }
> return 0;
> }
>

That should be sufficient with the exception that allocating nodemask_t
on the stack is usually dangerous because it can be extremely large; we
typically use NODEMASK_ALLOC() for such code. It's had some changes in
-mm, but since this patchset will likely be going through that tree anyway
we can fix it now with the patch below.

Otherwise, it looks like the iteration is already there and will remove
links for memory sections bound to multiple nodes if they exist through
hotplug.
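
For readers unfamiliar with the stack-size concern raised above, the following userspace C sketch shows the general pattern of keeping a bitmap on the stack when it is small and moving it to the heap when it is large. It is a model only, not the kernel's NODEMASK_ALLOC(): the names nodemask_bytes(), count_distinct() and STACK_MASK_LIMIT are hypothetical, and the 256-byte threshold is simply the figure quoted in the patch description below.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define STACK_MASK_LIMIT 256	/* bytes; figure quoted in the patch description */

/* Bytes needed for a bitmap with one bit per node, rounded up to whole longs. */
size_t nodemask_bytes(size_t nr_nodes)
{
	size_t bits_per_long = sizeof(unsigned long) * CHAR_BIT;

	return (nr_nodes + bits_per_long - 1) / bits_per_long *
	       sizeof(unsigned long);
}

/*
 * Count the distinct node ids in ids[] using a scratch bitmap that lives
 * on the stack when small and on the heap when large.
 * Returns the count, or -1 if the heap allocation fails.
 */
long count_distinct(const unsigned int *ids, size_t n, size_t nr_nodes)
{
	unsigned long small[STACK_MASK_LIMIT / sizeof(unsigned long)];
	size_t bits_per_long = sizeof(unsigned long) * CHAR_BIT;
	size_t bytes = nodemask_bytes(nr_nodes);
	unsigned long *mask = small;
	long count = 0;
	size_t i;

	assert(ids != NULL || n == 0);
	if (bytes > sizeof(small)) {
		mask = malloc(bytes);
		if (!mask)
			return -1;	/* the heap path can fail */
	}
	memset(mask, 0, bytes);
	for (i = 0; i < n; i++) {
		unsigned long bit;

		if (ids[i] >= nr_nodes)
			continue;	/* id outside the configured range */
		bit = 1UL << (ids[i] % bits_per_long);
		if (!(mask[ids[i] / bits_per_long] & bit)) {
			mask[ids[i] / bits_per_long] |= bit;
			count++;
		}
	}
	if (mask != small)
		free(mask);	/* release the heap copy on the way out */
	return count;
}
```

One difference worth noting: this sketch checks the allocation result before using the mask, whereas the diff below dereferences unlinked_nodes without a check. Whether the -mm version of NODEMASK_ALLOC() can yield NULL is not shown in this thread, so treat that check as an assumption of the sketch.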



mm: slab allocate memory section nodemask for large systems

Nodemasks should not be allocated on the stack for large systems (when it
is larger than 256 bytes) since there is a threat of overflow.

This patch causes the unregister_mem_sect_under_nodes() nodemask to be
allocated on the stack for smaller systems and be allocated by slab for
larger systems.

GFP_KERNEL is used since remove_memory_block() can block.

Cc: Gary Hade <garyhade@us.ibm.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: David Rientjes <rientjes@google.com>
---
Depends on NODEMASK_ALLOC() changes currently present only in -mm.

drivers/base/node.c | 11 +++++++----
1 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/base/node.c b/drivers/base/node.c
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -363,12 +363,14 @@ int register_mem_sect_under_node(struct memory_block *mem_blk, int nid)
 /* unregister memory section under all nodes that it spans */
 int unregister_mem_sect_under_nodes(struct memory_block *mem_blk)
 {
-	nodemask_t unlinked_nodes;
+	NODEMASK_ALLOC(nodemask_t, unlinked_nodes, GFP_KERNEL);
 	unsigned long pfn, sect_start_pfn, sect_end_pfn;
 
-	if (!mem_blk)
+	if (!mem_blk) {
+		NODEMASK_FREE(unlinked_nodes);
 		return -EFAULT;
-	nodes_clear(unlinked_nodes);
+	}
+	nodes_clear(*unlinked_nodes);
 	sect_start_pfn = section_nr_to_pfn(mem_blk->phys_index);
 	sect_end_pfn = sect_start_pfn + PAGES_PER_SECTION - 1;
 	for (pfn = sect_start_pfn; pfn <= sect_end_pfn; pfn++) {
@@ -379,13 +381,14 @@ int unregister_mem_sect_under_nodes(struct memory_block *mem_blk)
 			continue;
 		if (!node_online(nid))
 			continue;
-		if (node_test_and_set(nid, unlinked_nodes))
+		if (node_test_and_set(nid, *unlinked_nodes))
 			continue;
 		sysfs_remove_link(&node_devices[nid].sysdev.kobj,
 			 kobject_name(&mem_blk->sysdev.kobj));
 		sysfs_remove_link(&mem_blk->sysdev.kobj,
 			 kobject_name(&node_devices[nid].sysdev.kobj));
 	}
+	NODEMASK_FREE(unlinked_nodes);
 	return 0;
 }

Alex Chiang

External


Since: Aug 22, 2007
Posts: 87



(Msg. 13) Posted: Mon Nov 02, 2009 5:20 pm
Post subject: Re: [patch -mm] mm: slab allocate memory section nodemask for large systems
Archived from groups: per prev. post

Hi Andrew,

* David Rientjes <rientjes@google.com>:
> On Wed, 28 Oct 2009, Alex Chiang wrote:
>
> > Am I not understanding the code? It looks like we do this
> > already...
> >
> > /* unregister memory section under all nodes that it spans */
> > int unregister_mem_sect_under_nodes(struct memory_block *mem_blk)
> > {
> > nodemask_t unlinked_nodes;
> > unsigned long pfn, sect_start_pfn, sect_end_pfn;
> >
> > if (!mem_blk)
> > return -EFAULT;
> > nodes_clear(unlinked_nodes);
> > sect_start_pfn = section_nr_to_pfn(mem_blk->phys_index);
> > sect_end_pfn = sect_start_pfn + PAGES_PER_SECTION - 1;
> > for (pfn = sect_start_pfn; pfn <= sect_end_pfn; pfn++) {
> > int nid;
> >
> > nid = get_nid_for_pfn(pfn);
> > if (nid < 0)
> > continue;
> > if (!node_online(nid))
> > continue;
> > if (node_test_and_set(nid, unlinked_nodes))
> > continue;
> > sysfs_remove_link(&node_devices[nid].sysdev.kobj,
> > kobject_name(&mem_blk->sysdev.kobj));
> > sysfs_remove_link(&mem_blk->sysdev.kobj,
> > kobject_name(&node_devices[nid].sysdev.kobj));
> > }
> > return 0;
> > }
> >
>
> That should be sufficient with the exception that allocating nodemask_t
> on the stack is usually dangerous because it can be extremely large; we
> typically use NODEMASK_ALLOC() for such code. It's had some changes in
> -mm, but since this patchset will likely be going through that tree anyway
> we can fix it now with the patch below.
>
> Otherwise, it looks like the iteration is already there and will remove
> links for memory sections bound to multiple nodes if they exist through
> hotplug.

Any comments on this patch series?

Turns out that Kame-san's fear about a memory section spanning
several nodes on certain architectures (S390) isn't really
applicable and even if it were, we have code to handle the situation
anyway.

Kame-san was generally supportive of these convenience symlinks
although he did not give a formal ACK.

David has given an ACK on the two patches that do real work, as
well as supplied the below patch.

I can respin this series once more, including David's Acked-by:
and adding his patch if that makes life easier for you.

Thanks,
/ac


> mm: slab allocate memory section nodemask for large systems
>
> Nodemasks should not be allocated on the stack for large systems (when it
> is larger than 256 bytes) since there is a threat of overflow.
>
> This patch causes the unregister_mem_sect_under_nodes() nodemask to be
> allocated on the stack for smaller systems and be allocated by slab for
> larger systems.
>
> GFP_KERNEL is used since remove_memory_block() can block.
>
> Cc: Gary Hade <garyhade@us.ibm.com>
> Cc: Badari Pulavarty <pbadari@us.ibm.com>
> Signed-off-by: David Rientjes <rientjes@google.com>
> ---
> Depends on NODEMASK_ALLOC() changes currently present only in -mm.
>
> drivers/base/node.c | 11 +++++++----
> 1 files changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/base/node.c b/drivers/base/node.c
> --- a/drivers/base/node.c
> +++ b/drivers/base/node.c
> @@ -363,12 +363,14 @@ int register_mem_sect_under_node(struct memory_block *mem_blk, int nid)
> /* unregister memory section under all nodes that it spans */
> int unregister_mem_sect_under_nodes(struct memory_block *mem_blk)
> {
> - nodemask_t unlinked_nodes;
> + NODEMASK_ALLOC(nodemask_t, unlinked_nodes, GFP_KERNEL);
> unsigned long pfn, sect_start_pfn, sect_end_pfn;
>
> - if (!mem_blk)
> + if (!mem_blk) {
> + NODEMASK_FREE(unlinked_nodes);
> return -EFAULT;
> - nodes_clear(unlinked_nodes);
> + }
> + nodes_clear(*unlinked_nodes);
> sect_start_pfn = section_nr_to_pfn(mem_blk->phys_index);
> sect_end_pfn = sect_start_pfn + PAGES_PER_SECTION - 1;
> for (pfn = sect_start_pfn; pfn <= sect_end_pfn; pfn++) {
> @@ -379,13 +381,14 @@ int unregister_mem_sect_under_nodes(struct memory_block *mem_blk)
> continue;
> if (!node_online(nid))
> continue;
> - if (node_test_and_set(nid, unlinked_nodes))
> + if (node_test_and_set(nid, *unlinked_nodes))
> continue;
> sysfs_remove_link(&node_devices[nid].sysdev.kobj,
> kobject_name(&mem_blk->sysdev.kobj));
> sysfs_remove_link(&mem_blk->sysdev.kobj,
> kobject_name(&node_devices[nid].sysdev.kobj));
> }
> + NODEMASK_FREE(unlinked_nodes);
> return 0;
> }
>
>
David Rientjes

External


Since: Jan 29, 2007
Posts: 184



(Msg. 14) Posted: Tue Nov 03, 2009 9:20 pm
Post subject: Re: [patch -mm] mm: slab allocate memory section nodemask for large systems
Archived from groups: per prev. post

On Mon, 2 Nov 2009, Alex Chiang wrote:

> Any comments on this patch series?
>
> Turns out that Kame-san's fear about a memory section spanning
> several nodes on certain architectures (S390) isn't really
> applicable and even if it were, we have code to handle the situation
> anyway.
>
> Kame-san was generally supportive of these convenience symlinks
> although he did not give a formal ACK.
>
> David has given an ACK on the two patches that do real work, as
> well as supplied the below patch.
>
> I can respin this series once more, including David's Acked-by:
> and adding his patch if that makes life easier for you.
>

It's probably in Andrew's queue after getting back from the kernel summit;
it would be best to wait a week or so.
David Rientjes

External


Since: Jan 29, 2007
Posts: 184



(Msg. 15) Posted: Tue Nov 10, 2009 3:20 pm
Post subject: Re: [patch -mm] mm: slab allocate memory section nodemask for large systems
Archived from groups: per prev. post

On Tue, 10 Nov 2009, Andrew Morton wrote:

> The prerequisite Documentation/ patches are a bit of a mess - some have
> been cherrypicked into Greg's tree I believe and some haven't. So
> please also send out whatever is needed to bring linux-next up to date.
>

I'm not aware of any prerequisites for this patchset, Alex's documentation
changes have already been merged by Linus.