author: Poornima G <pgurusid@redhat.com> 2018-11-21 12:09:39 +0530
committer: Amar Tumballi <amarts@redhat.com> 2018-12-18 09:35:24 +0000
commit: b87c397091bac6a4a6dec4e45a7671fad4a11770
tree: 6f7eeff5be2ae69af0eba03add10103091639a6c /doc
parent: d50f22e6ae410fdcde573b6015b97dc1573bbb7e
iobuf: Get rid of pre allocated iobuf_pool and use per thread mem pool
The current implementation of iobuf_pool has two problems:
- prealloc of 12.5MB memory, this limits the scale factor of the gluster processes due to RAM requirements
- lock contention, as the current implementation has one global iobuf_pool lock. Credits for debugging and addressing the same go to Krutika Dhananjay <kdhananj@redhat.com>. Issue: #410

Hence changing the iobuf implementation to use per thread mem pool. This may theoretically appear to cause a perf dip as there is no preallocation. But per thread mem pool will not have significant perf impact as the last allocated memory is kept alive for subsequent allocs, for some time. The worst case would be if iobufs requested are of random sizes each time. The best case is if we get iobuf requests of the same size. From the perf tests, this patch did not seem to cause any perf decrease.

Note that, with this patch, the rdma performance is going to degrade drastically. In one of the previous patchsets we had fixes to not degrade rdma perf, but rdma is not supported and also not tested [1]. Hence the decision was to not have code in rdma that is not tested and not supported.

[1] https://lists.gluster.org/pipermail/gluster-users.old/2018-July/034400.html

Updates: #325
Change-Id: Ic2ef3bd498f9250dea25f25ba0c01fde19584b27
Signed-off-by: Poornima G <pgurusid@redhat.com>
Diffstat (limited to 'doc')
-rw-r--r-- doc/debugging/statedump.md | 50
-rw-r--r-- doc/developer-guide/datastructure-iobuf.md | 152
2 files changed, 18 insertions, 184 deletions
diff --git a/doc/debugging/statedump.md b/doc/debugging/statedump.md
index 9939576e270..797d51f8062 100644
--- a/doc/debugging/statedump.md
+++ b/doc/debugging/statedump.md
@@ -95,52 +95,10 @@ max-stdalloc=0 #Maximum number of allocations from heap that are in active use a
```
###Iobufs
-```
-[iobuf.global]
-iobuf_pool=0x1f0d970 #The memory pool for iobufs
-iobuf_pool.default_page_size=131072 #The default size of iobuf (if no iobuf size is specified the default size is allocated)
-#iobuf_arena: One arena represents a group of iobufs of a particular size
-iobuf_pool.arena_size=12976128 # The initial size of the iobuf pool (doesn't include the stdalloc'd memory or the newly added arenas)
-iobuf_pool.arena_cnt=8 #Total number of arenas in the pool
-iobuf_pool.request_misses=0 #The number of iobufs that were stdalloc'd (as they exceeded the default max page size provided by iobuf_pool).
-```
-
-There are 3 lists of arenas
-
-1. Arena list: arenas allocated during iobuf pool creation and the arenas that are in use(active_cnt != 0) will be part of this list.
-2. Purge list: arenas that can be purged(no active iobufs, active_cnt == 0).
-3. Filled list: arenas without free iobufs.
-
-```
-[purge.1] #purge.<S.No.>
-purge.1.mem_base=0x7fc47b35f000 #The address of the arena structure
-purge.1.active_cnt=0 #The number of iobufs active in that arena
-purge.1.passive_cnt=1024 #The number of unused iobufs in the arena
-purge.1.alloc_cnt=22853 #Total allocs in this pool(number of times the iobuf was allocated from this arena)
-purge.1.max_active=7 #Max active iobufs from this arena, at any point in the life of this process.
-purge.1.page_size=128 #Size of all the iobufs in this arena.
-
-[arena.5] #arena.<S.No.>
-arena.5.mem_base=0x7fc47af1f000
-arena.5.active_cnt=0
-arena.5.passive_cnt=64
-arena.5.alloc_cnt=0
-arena.5.max_active=0
-arena.5.page_size=32768
-```
-
-If the active_cnt of any arena is non zero, then the statedump will also have the iobuf list.
-```
-[arena.6.active_iobuf.1] #arena.<S.No>.active_iobuf.<iobuf.S.No.>
-arena.6.active_iobuf.1.ref=1 #refcount of the iobuf
-arena.6.active_iobuf.1.ptr=0x7fdb921a9000 #address of the iobuf
-
-[arena.6.active_iobuf.2]
-arena.6.active_iobuf.2.ref=1
-arena.6.active_iobuf.2.ptr=0x7fdb92189000
-```
-
-At any given point in time if there are lots of filled arenas then that could be a sign of iobuf leaks.
+The iobuf stats are printed in this section. They include:
+- active_cnt : number of iobufs that are currently allocated and in use. This number should not be too high; it is generally only as large as the number of in-flight IO fops. A large number indicates an iobuf leak. There is no easy way to debug this; since the iobufs also come from mem pools, looking at the mem pool section in the statedump will help.
+- misses : number of iobuf allocations that were not served from the mem_pool (includes stdalloc and mem_pool alloc misses).
+- hits : number of iobuf allocations that were served from the mem_pool memory.
###Call stack
All the fops received by gluster are handled using call-stacks. Call stack contains the information about uid/gid/pid etc of the process that is executing the fop. Each call-stack contains different call-frames per xlator which handles that fop.
diff --git a/doc/developer-guide/datastructure-iobuf.md b/doc/developer-guide/datastructure-iobuf.md
index 5f521f1485f..fdbbad7b499 100644
--- a/doc/developer-guide/datastructure-iobuf.md
+++ b/doc/developer-guide/datastructure-iobuf.md
@@ -2,31 +2,13 @@
##Datastructures
###iobuf
Short for IO Buffer. It is one allocatable unit for the consumers of the IOBUF
-API, each unit hosts @page_size(defined in arena structure) bytes of memory. As
-initial step of processing a fop, the IO buffer passed onto GlusterFS by the
-other applications (FUSE VFS/ Applications using gfapi) is copied into GlusterFS
-space i.e. iobufs. Hence Iobufs are mostly allocated/deallocated in Fuse, gfapi,
-protocol xlators, and also in performance xlators to cache the IO buffers etc.
-```
-struct iobuf {
- union {
- struct list_head list;
- struct {
- struct iobuf *next;
- struct iobuf *prev;
- };
- };
- struct iobuf_arena *iobuf_arena;
-
- gf_lock_t lock; /* for ->ptr and ->ref */
- int ref; /* 0 == passive, >0 == active */
-
- void *ptr; /* usable memory region by the consumer */
+API, each unit hosts @page_size bytes of memory. As an initial step of processing
+a fop, the IO buffer passed on to GlusterFS by other applications (FUSE VFS/
+applications using gfapi) is copied into GlusterFS space, i.e. iobufs. Hence iobufs
+are mostly allocated/deallocated in Fuse, gfapi, protocol xlators, and also in
+performance xlators to cache the IO buffers etc.
- void *free_ptr; /* in case of stdalloc, this is the
- one to be freed not the *ptr */
-};
-```
+Iobufs are allocated from the per thread mem pool.
###iobref
There may be need of multiple iobufs for a single fop, like in vectored read/write.
@@ -40,104 +22,9 @@ struct iobref {
int used; /* number of iobufs added to this iobref */
};
```
-###iobuf_arenas
-One region of memory MMAPed from the operating system. Each region MMAPs
-@arena_size bytes of memory, and hosts @arena_size / @page_size IOBUFs.
-The same sized iobufs are grouped into one arena, for sanity of access.
-
-```
-struct iobuf_arena {
- union {
- struct list_head list;
- struct {
- struct iobuf_arena *next;
- struct iobuf_arena *prev;
- };
- };
-
- size_t page_size; /* size of all iobufs in this arena */
- size_t arena_size; /* this is equal to
- (iobuf_pool->arena_size / page_size)
- * page_size */
- size_t page_count;
-
- struct iobuf_pool *iobuf_pool;
-
- void *mem_base;
- struct iobuf *iobufs; /* allocated iobufs list */
-
- int active_cnt;
- struct iobuf active; /* head node iobuf
- (unused by itself) */
- int passive_cnt;
- struct iobuf passive; /* head node iobuf
- (unused by itself) */
- uint64_t alloc_cnt; /* total allocs in this pool */
- int max_active; /* max active buffers at a given time */
-};
-
-```
###iobuf_pool
-Pool of Iobufs. As there may be many Io buffers required by the filesystem,
-a pool of iobufs are preallocated and kept, if these preallocated ones are
-exhausted only then the standard malloc/free is called, thus improving the
-performance. Iobuf pool is generally one per process, allocated during
-glusterfs_ctx_t init (glusterfs_ctx_defaults_init), currently the preallocated
-iobuf pool memory is freed on process exit. Iobuf pool is globally accessible
-across GlusterFs, hence iobufs allocated by any xlator can be accessed by any
-other xlators(unless iobuf is not passed).
-```
-struct iobuf_pool {
- pthread_mutex_t mutex;
- size_t arena_size; /* size of memory region in
- arena */
- size_t default_page_size; /* default size of iobuf */
-
- int arena_cnt;
- struct list_head arenas[GF_VARIABLE_IOBUF_COUNT];
- /* array of arenas. Each element of the array is a list of arenas
- holding iobufs of particular page_size */
-
- struct list_head filled[GF_VARIABLE_IOBUF_COUNT];
- /* array of arenas without free iobufs */
-
- struct list_head purge[GF_VARIABLE_IOBUF_COUNT];
- /* array of of arenas which can be purged */
-
- uint64_t request_misses; /* mostly the requests for higher
- value of iobufs */
-};
-```
-~~~
-The default size of the iobuf_pool(as of yet):
-1024 iobufs of 128Bytes = 128KB
-512 iobufs of 512Bytes = 256KB
-512 iobufs of 2KB = 1MB
-128 iobufs of 8KB = 1MB
-64 iobufs of 32KB = 2MB
-32 iobufs of 128KB = 4MB
-8 iobufs of 256KB = 2MB
-2 iobufs of 1MB = 2MB
-Total ~13MB
-~~~
-As seen in the datastructure iobuf_pool has 3 arena lists.
-
-- arenas:
-The arenas allocated during iobuf_pool create, are part of this list. This list
-also contains arenas that are partially filled i.e. contain few active and few
-passive iobufs (passive_cnt !=0, active_cnt!=0 except for initially allocated
-arenas). There will be by default 8 arenas of the sizes mentioned above.
-- filled:
-If all the iobufs in the arena are filled(passive_cnt = 0), the arena is moved
-to the filled list. If any of the iobufs from the filled arena is iobuf_put,
-then the arena moves back to the 'arenas' list.
-- purge:
-If there are no active iobufs in the arena(active_cnt = 0), the arena is moved
-to purge list. iobuf_put() triggers destruction of the arenas in this list. The
-arenas in the purge list are destroyed only if there is atleast one arena in
-'arenas' list, that way there won't be spurious mmap/unmap of buffers.
-(e.g: If there is an arena (page_size=128KB, count=32) in purge list, this arena
-is destroyed(munmap) only if there is an arena in 'arenas' list with page_size=128KB).
+This is just a wrapper structure to keep count of active iobufs, iobuf mem pool
+alloc misses and hits.
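As a rough illustration of what such a counter-only wrapper might look like, here is a minimal sketch. The struct name `iobuf_pool_sketch`, its field names, and the helper functions are hypothetical and chosen for illustration; they are not taken from the GlusterFS sources. C11 atomics are assumed so that the counters can be updated without a global lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a counter-only iobuf_pool wrapper. */
struct iobuf_pool_sketch {
    atomic_int_fast64_t active_cnt; /* iobufs currently handed out */
    atomic_int_fast64_t hits;       /* allocations served from the mem pool */
    atomic_int_fast64_t misses;     /* allocations not served from the mem pool */
};

/* Reset all counters to zero. */
void pool_sketch_init(struct iobuf_pool_sketch *p)
{
    atomic_init(&p->active_cnt, 0);
    atomic_init(&p->hits, 0);
    atomic_init(&p->misses, 0);
}

/* Record one allocation; from_mem_pool is non-zero for a hit. */
void pool_sketch_record_alloc(struct iobuf_pool_sketch *p, int from_mem_pool)
{
    if (from_mem_pool)
        atomic_fetch_add(&p->hits, 1);
    else
        atomic_fetch_add(&p->misses, 1);
    atomic_fetch_add(&p->active_cnt, 1);
}

/* Record that one iobuf was released. */
void pool_sketch_record_free(struct iobuf_pool_sketch *p)
{
    int_fast64_t prev = atomic_fetch_sub(&p->active_cnt, 1);
    assert(prev > 0); /* a free without a matching alloc is a bug */
}
```

In this sketch, an `active_cnt` that keeps growing while the workload is steady would correspond to the leak symptom the statedump documentation describes.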
##APIs
###iobuf_get
@@ -157,23 +44,15 @@ struct iobuf * iobuf_get2 (struct iobuf_pool *iobuf_pool, size_t page_size);
Creates a new iobuf of a specified page size, if page_size=0 default page size
is considered.
```
-if (requested iobuf size > Max iobuf size in the pool(1MB as of yet))
+if (requested iobuf size > Max size in the mem pool(1MB as of yet))
{
- Perform standard allocation(CALLOC) of the requested size and
- add it to the list iobuf_pool->arenas[IOBUF_ARENA_MAX_INDEX].
+ Perform standard allocation(CALLOC) of the requested size
}
else
{
- -Round the page size to match the stndard sizes in iobuf pool.
- (eg: if 3KB is requested, it is rounded to 8KB).
- -Select the arena list corresponding to the rounded size
- (eg: select 8KB arena)
- If the selected arena has passive count > 0, then return the
- iobuf from this arena, set the counters(passive/active/etc.)
- appropriately.
- else the arena is full, allocate new arena with rounded size
- and standard page numbers and add to the arena list
- (eg: 128 iobufs of 8KB is allocated).
+ -request for memory from the per thread mem pool. This can be a miss
+ or hit, based on the availability in the mem pool. Record the hit/miss
+ in the iobuf_pool.
}
```
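The branch logic above can be approximated by the following sketch. It is a toy model, not the GlusterFS implementation: the one-slot thread-local cache stands in for the per thread mem pool, and the names `sketch_get`, `sketch_put`, `alloc_stats` and `SKETCH_MAX_POOLED` are hypothetical. The 1MB limit is the value quoted in the pseudocode; requests above it are counted as misses, in line with the statedump description of `misses`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_MAX_POOLED (1024 * 1024) /* 1MB limit quoted in the text */

/* Hypothetical hit/miss counters kept by the caller. */
struct alloc_stats {
    unsigned long hits;
    unsigned long misses;
};

/* Toy one-slot cache per thread, standing in for the per thread mem pool. */
static _Thread_local void *cached_buf;
static _Thread_local size_t cached_size;

/* Return a buffer of `size` bytes and record whether the cache served it. */
void *sketch_get(struct alloc_stats *st, size_t size)
{
    if (size > SKETCH_MAX_POOLED) {
        st->misses++;            /* too large to pool: standard allocation */
        return calloc(1, size);
    }
    if (cached_buf != NULL && cached_size == size) {
        void *buf = cached_buf;  /* hit: reuse the last freed buffer */
        cached_buf = NULL;       /* note: reused memory is not re-zeroed */
        st->hits++;
        return buf;
    }
    st->misses++;                /* miss: nothing suitable is cached */
    return calloc(1, size);
}

/* Give a buffer back; keep it cached if it is poolable and the slot is free. */
void sketch_put(void *buf, size_t size)
{
    if (buf == NULL)
        return;
    if (size <= SKETCH_MAX_POOLED && cached_buf == NULL) {
        cached_buf = buf;
        cached_size = size;
        return;
    }
    free(buf);
}
```

Repeated requests of the same size hit the cache in this model, while requests of varying sizes always miss, which mirrors the best and worst cases described in the commit message.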
Also takes a reference(increments ref count), hence no need of doing it
@@ -197,8 +76,6 @@ Unreference the iobuf, if the ref count is zero iobuf is considered free.
```
-Delete the iobuf, if allocated from standard alloc and return.
-set the active/passive count appropriately.
- -if passive count > 0 then add the arena to 'arena' list.
- -if active count = 0 then add the arena to 'purge' list.
```
Every iobuf_ref should have a corresponding iobuf_unref, and also every
iobuf_get/2 should have a corresponding iobuf_unref.
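The pairing rule can be illustrated with a small reference-counted buffer. This is a simplified, single-threaded sketch with hypothetical names (`buf_sketch`, `buf_sketch_get`, `buf_sketch_ref`, `buf_sketch_unref`); the real iobuf API takes an iobuf_pool and is expected to be safe across threads, which this sketch does not attempt.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted buffer, loosely modelled on an iobuf. */
struct buf_sketch {
    int ref;   /* number of outstanding references */
    void *ptr; /* memory region usable by the consumer */
};

/* Allocate a buffer; like iobuf_get/2, it is returned with one reference. */
struct buf_sketch *buf_sketch_get(size_t size)
{
    struct buf_sketch *b = calloc(1, sizeof(*b));
    if (b == NULL)
        return NULL;
    b->ptr = calloc(1, size);
    if (b->ptr == NULL) {
        free(b);
        return NULL;
    }
    b->ref = 1;
    return b;
}

/* Take one more reference; must be balanced by one unref. */
struct buf_sketch *buf_sketch_ref(struct buf_sketch *b)
{
    assert(b->ref > 0);
    b->ref++;
    return b;
}

/* Drop one reference; returns 1 if the buffer was freed, 0 otherwise. */
int buf_sketch_unref(struct buf_sketch *b)
{
    assert(b->ref > 0);
    if (--b->ref > 0)
        return 0;
    free(b->ptr);
    free(b);
    return 1;
}
```

A code path that calls the get or ref function without the matching unref leaves `ref` above zero forever, so the buffer is never freed.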
@@ -249,8 +126,7 @@ Unreference all the iobufs in the iobref, and also unref the iobref.
If some iobuf_ref/iobuf_new calls do not have a corresponding iobuf_unref, then the
iobufs are not freed and recurring execution of such code path may lead to huge
memory leaks. The easiest way to identify if a memory leak is caused by iobufs
-is to take a statedump. If the statedump shows a lot of filled arenas then it is
-a sure sign of leak. Refer doc/debugging/statedump.md for more details.
+is to take a statedump.
If iobufs are leaking, the next step is to find where the iobuf_unref went
missing. There is no standard/easy way of debugging this, code reading and logs