author | Jeff Darcy <jdarcy@redhat.com> | 2016-10-14 10:04:07 -0400 |
---|---|---|
committer | Shyamsundar Ranganathan <srangana@redhat.com> | 2017-02-02 13:30:19 -0500 |
commit | ae47befebeda2de5fd2d706090cbacf4ef60c785 (patch) | |
tree | 4d0605aa4fd21c504341894d287e555dac455768 /tests | |
parent | 984c470159a68114be2a260412cfefe2c158ab99 (diff) | |
libglusterfs: make memory pools more thread-friendly
Early multiplexing tests revealed *massive* contention on certain
pools' global locks - especially for dictionaries and secondarily for
call stubs. For the thread counts that multiplexing can create, a
more lock-free solution is clearly needed. Also, the current mem-pool
implementation does a poor job releasing memory back to the system,
artificially inflating memory usage to match whatever the worst case
was since the process started. This is bad in general, but especially
so for multiplexing where there are more pools and a major point of
the whole exercise is to reduce memory consumption.
The basic ideas for the new design are these (a simplified sketch in code follows the list):
* There is one pool, globally, for each power-of-two size range. Every attempt to create a new pool within this range will instead add a reference to the existing pool.
* Instead of adding pools for each translator within each multiplexed brick (potentially infinite and quite possibly thousands), we allocate one set of size-based pools per *thread* (hundreds at worst).
* Each per-thread pool is divided into hot and cold lists. Every allocation first attempts to use the hot list, then the cold list. When objects are freed, they always go on the hot list.
* There is one global "pool sweeper" thread, which periodically reclaims everything in each pool's cold list and then "demotes" the current hot list to be the new cold list.
* For normal allocation activity, only a per-thread lock need be taken, and even that only to guard against very rare contention from the pool sweeper. When threads start and stop, a global lock must be taken to add them to the pool sweeper's list. Lock contention is therefore extremely low, and the hot/cold lists also provide good locality.
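To make the design concrete, here is a minimal C sketch of the scheme the list describes: one size class per power of two, per-thread hot and cold free lists, and a single sweeper that demotes and reclaims them. Every name in it (thread_pools_t, pool_alloc, pool_sweeper, SWEEP_SECS, registry) is invented for illustration; this is not the libglusterfs mem-pool API or the code added by this patch, and it leaves out the pool reference counting and the thread start/stop registration mentioned above.

```c
/* Illustrative sketch only, not the mem-pool code from this patch: every name
 * (thread_pools_t, pool_alloc, pool_sweeper, SWEEP_SECS, registry) is invented,
 * and pool reference counting plus thread start/stop registration are omitted. */
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

#define NUM_CLASSES  16   /* one global size class per power of two */
#define SMALLEST_POW 7    /* smallest class: 2^7 = 128 bytes        */
#define SWEEP_SECS   30   /* how often the sweeper wakes up         */

typedef struct free_obj {
    struct free_obj *next;                    /* intrusive free-list link   */
} free_obj_t;

/* One per thread; linked on a global registry so the sweeper can walk them. */
typedef struct thread_pools {
    pthread_spinlock_t   lock;                /* owner thread vs. sweeper   */
    free_obj_t          *hot[NUM_CLASSES];    /* freed since the last sweep */
    free_obj_t          *cold[NUM_CLASSES];   /* untouched for a full cycle */
    struct thread_pools *next;                /* global registry list       */
} thread_pools_t;

static pthread_mutex_t registry_lock = PTHREAD_MUTEX_INITIALIZER;
static thread_pools_t *registry;              /* all per-thread pools       */

/* Map a request size to its power-of-two class, or -1 if it is too large. */
static int size_to_class(size_t size)
{
    for (int i = 0; i < NUM_CLASSES; i++)
        if (size <= ((size_t)1 << (SMALLEST_POW + i)))
            return i;
    return -1;
}

/* Allocate: try the hot list, then the cold list, then fall back to malloc. */
void *pool_alloc(thread_pools_t *tp, size_t size)
{
    int idx = size_to_class(size);
    if (idx < 0)
        return malloc(size);

    pthread_spin_lock(&tp->lock);             /* almost never contended     */
    free_obj_t *obj = tp->hot[idx];
    if (obj)
        tp->hot[idx] = obj->next;
    else if ((obj = tp->cold[idx]) != NULL)
        tp->cold[idx] = obj->next;
    pthread_spin_unlock(&tp->lock);

    return obj ? (void *)obj : malloc((size_t)1 << (SMALLEST_POW + idx));
}

/* Free: objects always go back on the owning thread's hot list. */
void pool_free(thread_pools_t *tp, void *ptr, size_t size)
{
    int idx = size_to_class(size);
    if (idx < 0) {
        free(ptr);
        return;
    }
    free_obj_t *obj = ptr;
    pthread_spin_lock(&tp->lock);
    obj->next = tp->hot[idx];
    tp->hot[idx] = obj;
    pthread_spin_unlock(&tp->lock);
}

/* The single global sweeper: release each cold list back to the system, then
 * demote the current hot list to become the new cold list. */
void *pool_sweeper(void *arg)
{
    (void)arg;
    for (;;) {
        sleep(SWEEP_SECS);
        pthread_mutex_lock(&registry_lock);
        for (thread_pools_t *tp = registry; tp; tp = tp->next) {
            pthread_spin_lock(&tp->lock);
            for (int i = 0; i < NUM_CLASSES; i++) {
                free_obj_t *dead = tp->cold[i];
                tp->cold[i] = tp->hot[i];     /* demote hot -> cold         */
                tp->hot[i] = NULL;
                while (dead) {                /* reclaim what went cold     */
                    free_obj_t *next = dead->next;
                    free(dead);
                    dead = next;
                }
            }
            pthread_spin_unlock(&tp->lock);
        }
        pthread_mutex_unlock(&registry_lock);
    }
    return NULL;
}
```

The locking mirrors the claim above: the allocation and free paths take only the owning thread's lock, and the global registry lock is touched only when threads come and go and once per sweep, so contention stays low while the hot/cold split keeps recently used objects local.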
A more complete explanation (of a similar earlier design) can be found
here:
http://www.gluster.org/pipermail/gluster-devel/2016-October/051160.html
Change-Id: I5bc8a1ba57cfb553998f979a498886e0d006e665
BUG: 1385758
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: https://review.gluster.org/15645
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Diffstat (limited to 'tests')
-rwxr-xr-x | tests/basic/quota-anon-fd-nfs.t | 9 |
1 file changed, 9 insertions, 0 deletions
    diff --git a/tests/basic/quota-anon-fd-nfs.t b/tests/basic/quota-anon-fd-nfs.t
    index c6b01553b02..ea07b529c5a 100755
    --- a/tests/basic/quota-anon-fd-nfs.t
    +++ b/tests/basic/quota-anon-fd-nfs.t
    @@ -97,6 +97,15 @@ $CLI volume statedump $V0 all
     EXPECT_WITHIN $UMOUNT_TIMEOUT "Y" force_umount $N0
    +# This is ugly, but there seems to be a latent race between other actions and
    +# stopping the volume. The visible symptom is that "umount -l" (run from
    +# gf_umount_lazy in glusterd) hangs. This happens pretty consistently with the
    +# new mem-pool code, though it's not really anything to do with memory pools -
    +# just with changed timing. Adding the sleep here makes it work consistently.
    +#
    +# If anyone else wants to debug the race condition, feel free.
    +sleep 3
    +
     TEST $CLI volume stop $V0
     EXPECT "1" get_aux