author    kshithijiyer <kshithij.ki@gmail.com>      2019-06-05 19:40:29 +0530
committer Amar Tumballi <amarts@redhat.com>         2019-06-08 05:45:35 +0000
commit    2eaf8e846afd71c30f2a6ff6f863a39b1145b8b6 (patch)
tree      333b3494bf767ab195717fe4e8cdf48398681a7f /doc/debugging
parent    2ff76fa45c53b8e291cf70c98b28800f3ed5f6fc (diff)
Fixing formatting errors in markdown files
There are a lot of formatting errors in markdown files present under the /doc directory of the project. Fixing formatting errors and sending a patch.

Fixes: bz#1718273
Change-Id: I08f938088bbaaafddf634f73616ea0dbfe7aedf3
Signed-off-by: kshithijiyer <kshithij.ki@gmail.com>
Diffstat (limited to 'doc/debugging')
-rw-r--r--  doc/debugging/analyzing-regression-cores.md  43
-rw-r--r--  doc/debugging/gfid-to-path.md                43
-rw-r--r--  doc/debugging/split-brain.md                 75
-rw-r--r--  doc/debugging/statedump.md                   77
4 files changed, 129 insertions, 109 deletions
diff --git a/doc/debugging/analyzing-regression-cores.md b/doc/debugging/analyzing-regression-cores.md
index cbbb387794d..5e10f41c6eb 100644
--- a/doc/debugging/analyzing-regression-cores.md
+++ b/doc/debugging/analyzing-regression-cores.md
@@ -1,36 +1,35 @@
-This document explains how to analyze core-dumps obtained from regression
-machines, with examples.
-1) Download the core-tarball and extract it.
-2) 'cd' into directory where the tarball is extracted.
-~~~
+# Analyzing Regression Cores
+This document explains how to analyze core-dumps obtained from regression machines, with examples.
+1. Download the core-tarball and extract it.
+2. `cd` into the directory where the tarball is extracted.
+```
[sh]# pwd
/home/user/Downloads
[sh]# ls
build build-install-20150625_05_42_39.tar.bz2 lib64 usr
-~~~
-3) Determine the core file you need to examine. There can be more than one core file.
-You can list them from './build/install/cores' directory.
-~~~
+```
+3. Determine the core file you need to examine. There can be more than one core file. You can list them from the `./build/install/cores` directory.
+```
[sh]# ls build/install/cores/
core.9341 liblist.txt liblist.txt.tmp
-~~~
+```
In case you are unsure which binary generated the core-file, executing the `file` command on it will help.
-~~~
+```
[sh]# file ./build/install/cores/core.9341
./build/install/cores/core.9341: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from '/build/install/sbin/glusterfsd -s slave26.cloud.gluster.org --volfile-id patchy'
-~~~
-As seen, the core file was generated by glusterfsd binary, and path to it is provided (/build/install/sbin/glusterfsd).
-4) Now, run the following command on the core:
-~~~
+```
+As seen, the core file was generated by the glusterfsd binary, and the path to it is provided (/build/install/sbin/glusterfsd).
+
+4. Now, run the following command on the core:
+```
gdb -ex 'set sysroot ./' -ex 'core-file ./build/install/cores/core.xxx' <target, say ./build/install/sbin/glusterd>
In this case,
gdb -ex 'set sysroot ./' -ex 'core-file ./build/install/cores/core.9341' ./build/install/sbin/glusterfsd
-~~~
-5) You can cross check if all shared libraries are available and loaded by using 'info sharedlibrary' command from
-inside gdb.
-6) Once verified, usual gdb commands based on requirement can be used to debug the core.
-'bt' or 'backtrace' from gdb of core used in examples:
-~~~
+```
+5. You can cross-check whether all shared libraries are available and loaded by using the `info sharedlibrary` command from inside gdb.
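An illustrative session (the addresses, libraries, and statuses below are placeholders, not output from the core above):
```
(gdb) info sharedlibrary
From                To                  Syms Read   Shared Object Library
0x00007f512a538000  0x00007f512a6c2000  Yes         ./lib64/libc.so.6
0x00007f512ad02000  0x00007f512ad17000  Yes         ./lib64/libpthread.so.0
```
A `No` in the `Syms Read` column often means the library from the extracted tarball was not picked up, usually because `set sysroot ./` was not run from the extraction directory.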
+6. Once verified, the usual gdb commands can be used as required to debug the core.
+ `bt` or `backtrace` from gdb of core used in examples:
+```
Core was generated by `/build/install/sbin/glusterfsd -s slave26.cloud.gluster.org --volfile-id patchy'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007f512a54e625 in raise () from ./lib64/libc.so.6
@@ -52,4 +51,4 @@ Program terminated with signal SIGABRT, Aborted.
#12 0x00007f512a55f8f0 in ?? () from ./lib64/libc.so.6
#13 0x0000000000000000 in ?? ()
(gdb)
-~~~
+```
diff --git a/doc/debugging/gfid-to-path.md b/doc/debugging/gfid-to-path.md
index 09c459e52c8..49e9aa09a3f 100644
--- a/doc/debugging/gfid-to-path.md
+++ b/doc/debugging/gfid-to-path.md
@@ -1,37 +1,37 @@
-#Convert GFID to Path
+# Convert GFID to Path
The GlusterFS internal file identifier (GFID) is a UUID that is unique to each
file across the entire cluster. This is analogous to the inode number in a
normal filesystem. The GFID of a file is stored in its xattr named
`trusted.gfid`.
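For example, the raw GFID can be read directly from a file on a brick like this (a sketch; the brick path and value match the example used later in this document):
```
getfattr -n trusted.gfid -e hex /mnt/brick-test/b/dir/file3
getfattr: Removing leading '/' from absolute path names
# file: mnt/brick-test/b/dir/file3
trusted.gfid=0x111184431894427393404b212fa1c0e4
```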
-####Special mount using [gfid-access translator][1]:
-~~~
+#### Special mount using [gfid-access translator][1]:
+```
mount -t glusterfs -o aux-gfid-mount vm1:test /mnt/testvol
-~~~
+```
Assuming you have the `GFID` of a file from the changelog (or somewhere else).
To try this out, you can get the `GFID` of a file from the mountpoint:
-~~~
+```
getfattr -n glusterfs.gfid.string /mnt/testvol/dir/file
-~~~
+```
---
-###Get file path from GFID (Method 1):
+### Get file path from GFID (Method 1):
**(Lists hardlinks delimited by `:`, returns path as seen from mountpoint)**
-####Turn on build-pgfid option
-~~~
+#### Turn on build-pgfid option
+```
gluster volume set test build-pgfid on
-~~~
+```
Read the virtual xattr `glusterfs.ancestry.path`, which contains the file path:
-~~~
+```
getfattr -n glusterfs.ancestry.path -e text /mnt/testvol/.gfid/<GFID>
-~~~
+```
**Example:**
-~~~
+```
[root@vm1 glusterfs]# ls -il /mnt/testvol/dir/
total 1
10610563327990022372 -rw-r--r--. 2 root root 3 Jul 17 18:05 file
@@ -46,28 +46,27 @@ glusterfs.gfid.string="11118443-1894-4273-9340-4b212fa1c0e4"
getfattr: Removing leading '/' from absolute path names
# file: mnt/testvol/.gfid/11118443-1894-4273-9340-4b212fa1c0e4
glusterfs.ancestry.path="/dir/file:/dir/file3"
-~~~
+```
---
-###Get file path from GFID (Method 2):
+### Get file path from GFID (Method 2):
**(Does not list all hardlinks, returns backend brick path)**
-~~~
+```
getfattr -n trusted.glusterfs.pathinfo -e text /mnt/testvol/.gfid/<GFID>
-~~~
+```
**Example:**
-~~~
+```
[root@vm1 glusterfs]# getfattr -n trusted.glusterfs.pathinfo -e text /mnt/testvol/.gfid/11118443-1894-4273-9340-4b212fa1c0e4
getfattr: Removing leading '/' from absolute path names
# file: mnt/testvol/.gfid/11118443-1894-4273-9340-4b212fa1c0e4
trusted.glusterfs.pathinfo="(<DISTRIBUTE:test-dht> <POSIX(/mnt/brick-test/b):vm1:/mnt/brick-test/b/dir//file3>)"
-~~~
+```
---
-###Get file path from GFID (Method 3):
+### Get file path from GFID (Method 3):
https://gist.github.com/semiosis/4392640
---
-####References and links:
+#### References and links:
[posix: placeholders for GFID to path conversion](http://review.gluster.org/5951)
-[1]: https://github.com/gluster/glusterfs/blob/master/doc/features/gfid-access.md
diff --git a/doc/debugging/split-brain.md b/doc/debugging/split-brain.md
index b0d938e26bc..6b122c40551 100644
--- a/doc/debugging/split-brain.md
+++ b/doc/debugging/split-brain.md
@@ -1,33 +1,36 @@
-Steps to recover from File split-brain.
-======================================
-
-Quick Start:
-============
-1. Get the path of the file that is in split-brain:
-> It can be obtained either by
-> a) The command `gluster volume heal info split-brain`.
-> b) Identify the files for which file operations performed
- from the client keep failing with Input/Output error.
-
-2. Close the applications that opened this file from the mount point.
+# Steps to recover from File split-brain
+This document contains steps to recover from a file split-brain.
+## Quick Start:
+### Step 1. Get the path of the file that is in split-brain:
+It can be obtained either by:
+1. Running the command `gluster volume heal info split-brain`, or
+2. Identifying the files for which file operations performed from the client keep failing with Input/Output error.
+
+### Step 2. Close the applications that opened this file from the mount point.
In the case of VMs, they need to be powered off.
-3. Decide on the correct copy:
-> This is done by observing the afr changelog extended attributes of the file on
+### Step 3. Decide on the correct copy:
+This is done by observing the afr changelog extended attributes of the file on
the bricks using the getfattr command; then identifying the type of split-brain
(data split-brain, metadata split-brain, entry split-brain or split-brain due to
gfid-mismatch); and finally determining which of the bricks contains the 'good copy'
of the file.
-> `getfattr -d -m . -e hex <file-path-on-brick>`.
+```
+getfattr -d -m . -e hex <file-path-on-brick>
+```
+
It is also possible that one brick might contain the correct data while the
other might contain the correct metadata.
-4. Reset the relevant extended attribute on the brick(s) that contains the
-'bad copy' of the file data/metadata using the setfattr command.
-> `setfattr -n <attribute-name> -v <attribute-value> <file-path-on-brick>`
+### Step 4. Reset the relevant extended attribute on the brick(s) that contains the 'bad copy' of the file data/metadata using the setfattr command.
+```
+setfattr -n <attribute-name> -v <attribute-value> <file-path-on-brick>
+```
-5. Trigger self-heal on the file by performing lookup from the client:
-> `ls -l <file-path-on-gluster-mount>`
+### Step 5. Trigger self-heal on the file by performing lookup from the client:
+```
+ls -l <file-path-on-gluster-mount>
+```
Detailed Instructions for steps 3 through 5:
===========================================
@@ -36,13 +39,15 @@ afr changelog extended attributes.
Execute `getfattr -d -m . -e hex <file-path-on-brick>`
-* Example:
+Example:
+```
[root@store3 ~]# getfattr -d -e hex -m. brick-a/file.txt
# file: brick-a/file.txt
security.selinux=0x726f6f743a6f626a6563745f723a66696c655f743a733000
trusted.afr.vol-client-2=0x000000000000000000000000
trusted.afr.vol-client-3=0x000000000200000000000000
trusted.gfid=0x307a5c9efddd4e7c96e94fd4bcdcbd1b
+```
The extended attributes with `trusted.afr.<volname>-client-<subvolume-index>`
are used by afr to maintain the changelog of the file. The values of the
@@ -51,10 +56,11 @@ client (fuse or nfs-server) processes. When the glusterfs client modifies a file
or directory, the client contacts each brick and updates the changelog extended
attribute according to the response of the brick.
-'subvolume-index' is nothing but (brick number - 1) in
+`subvolume-index` is nothing but (brick number - 1) in
`gluster volume info <volname>` output.
-* Example:
+Example:
+```
[root@pranithk-laptop ~]# gluster volume info vol
Volume Name: vol
Type: Distributed-Replicate
@@ -71,6 +77,7 @@ attribute according to the response of the brick.
brick-f: pranithk-laptop:/gfs/brick-f
brick-g: pranithk-laptop:/gfs/brick-g
brick-h: pranithk-laptop:/gfs/brick-h
+```
In the example above:
```
@@ -91,12 +98,15 @@ present in all the other bricks in it's replica set as seen by that brick.
In the example volume given above, all files in brick-a will have 2 entries,
one for itself and the other for the file present in its replica pair, i.e. brick-b:
+```
trusted.afr.vol-client-0=0x000000000000000000000000 -->changelog for itself (brick-a)
trusted.afr.vol-client-1=0x000000000000000000000000 -->changelog for brick-b as seen by brick-a
-
+```
Likewise, all files in brick-b will have:
+```
trusted.afr.vol-client-0=0x000000000000000000000000 -->changelog for brick-a as seen by brick-b
trusted.afr.vol-client-1=0x000000000000000000000000 -->changelog for itself (brick-b)
+```
The same can be extended for other replica pairs.
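For instance, following the same pattern, files in brick-c (which forms the next replica pair with brick-d) would have:
```
trusted.afr.vol-client-2=0x000000000000000000000000 -->changelog for itself (brick-c)
trusted.afr.vol-client-3=0x000000000000000000000000 -->changelog for brick-d as seen by brick-c
```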
@@ -122,7 +132,8 @@ When a file split-brain happens it could be either data split-brain or
meta-data split-brain or both. When a split-brain happens the changelog of the
file would be something like this:
-* Example:(Lets consider both data, metadata split-brain on same file).
+Example (let's consider both data and metadata split-brain on the same file):
+```
[root@pranithk-laptop vol]# getfattr -d -m . -e hex /gfs/brick-?/a
getfattr: Removing leading '/' from absolute path names
# file: gfs/brick-a/a
@@ -133,10 +144,11 @@ trusted.gfid=0x80acdbd886524f6fbefa21fc356fed57
trusted.afr.vol-client-0=0x000003b00000000100000000
trusted.afr.vol-client-1=0x000000000000000000000000
trusted.gfid=0x80acdbd886524f6fbefa21fc356fed57
+```
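To read these values, it helps to split each 24-hex-digit value into three 8-digit counters, as in the sketch below (using brick-b's value from the output above); the observations that follow look at the first two:
```
trusted.afr.vol-client-0=0x 000003b0 00000001 00000000
                            |        |        |
                            |        |        \_ changelog of directory entries
                            |        \_ changelog of metadata
                            \_ changelog of data
```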
-###Observations:
+### Observations:
-####According to changelog extended attributes on file /gfs/brick-a/a:
+#### According to changelog extended attributes on file /gfs/brick-a/a:
The first 8 digits of trusted.afr.vol-client-0 are all
zeros (0x00000000................), and the first 8 digits of
trusted.afr.vol-client-1 are not all zeros (0x000003d7................).
@@ -149,7 +161,7 @@ trusted.afr.vol-client-1 are not all zeros (0x........00000001........).
So the changelog on /gfs/brick-a/a implies that some metadata operations succeeded
on itself but failed on /gfs/brick-b/a.
-####According to Changelog extended attributes on file /gfs/brick-b/a:
+#### According to Changelog extended attributes on file /gfs/brick-b/a:
The first 8 digits of trusted.afr.vol-client-0 are not all
zeros (0x000003b0................), and the first 8 digits of
trusted.afr.vol-client-1 are all zeros (0x00000000................).
@@ -205,6 +217,7 @@ Hence execute
`setfattr -n trusted.afr.vol-client-1 -v 0x000003d70000000000000000 /gfs/brick-a/a`
Thus after the above operations are done, the changelogs look like this:
+```
[root@pranithk-laptop vol]# getfattr -d -m . -e hex /gfs/brick-?/a
getfattr: Removing leading '/' from absolute path names
# file: gfs/brick-a/a
@@ -216,7 +229,7 @@ trusted.gfid=0x80acdbd886524f6fbefa21fc356fed57
trusted.afr.vol-client-0=0x000000000000000100000000
trusted.afr.vol-client-1=0x000000000000000000000000
trusted.gfid=0x80acdbd886524f6fbefa21fc356fed57
-
+```
Triggering Self-heal:
---------------------
@@ -243,9 +256,9 @@ needs to be removed.The gfid-link files are present in the .glusterfs folder
in the top-level directory of the brick. If the gfid of the file is
0x307a5c9efddd4e7c96e94fd4bcdcbd1b (the trusted.gfid extended attribute got
from the getfattr command earlier), the gfid-link file can be found at
-> /gfs/brick-a/.glusterfs/30/7a/307a5c9efddd4e7c96e94fd4bcdcbd1b
+`/gfs/brick-a/.glusterfs/30/7a/307a5c9efddd4e7c96e94fd4bcdcbd1b`
-####Word of caution:
+#### Word of caution:
Before deleting the gfid-link, we have to ensure that there are no hard links
to the file present on that brick. If hard-links exist, they must be deleted as
well.
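One way to list every path on the brick that shares the file's inode is shown below (a sketch using GNU find and the gfid-link path from the example above); any extra path it prints is a hard link that needs the same treatment:
```
find /gfs/brick-a -samefile /gfs/brick-a/.glusterfs/30/7a/307a5c9efddd4e7c96e94fd4bcdcbd1b
```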
diff --git a/doc/debugging/statedump.md b/doc/debugging/statedump.md
index 9939576e270..9d594320ddc 100644
--- a/doc/debugging/statedump.md
+++ b/doc/debugging/statedump.md
@@ -1,21 +1,30 @@
-#Statedump
+# Statedump
A statedump is a file generated by a glusterfs process that captures the state of different data structures; it may contain the active inodes, fds, mempools, iobufs, memory allocation stats of different types of data structures per xlator, etc.
-##How to generate statedump
-We can find the directory where statedump files are created using 'gluster --print-statedumpdir' command.
+## How to generate statedump
+We can find the directory where statedump files are created using the `gluster --print-statedumpdir` command.
Create that directory if it is not already present, based on the type of installation.
Let's call this directory `statedump-directory`.
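For example (the path shown is only a typical default; it depends on the installation):
```
gluster --print-statedumpdir
/var/run/gluster
```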
-We can generate statedump using 'kill -USR1 <pid-of-gluster-process>'.
+We can generate statedump using `kill -USR1 <pid-of-gluster-process>`.
gluster-process is nothing but a glusterd/glusterfs/glusterfsd process.
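A minimal example, assuming glusterd is the process of interest (substitute the appropriate pid for other processes):
```
kill -USR1 $(pidof glusterd)
```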
There are also commands to generate statedumps for brick processes/nfs server/quotad
-For bricks: `gluster volume statedump <volname>`
+For bricks:
+```
+gluster volume statedump <volname>
+```
-For nfs server: `gluster volume statedump <volname> nfs`
+For nfs server:
+```
+gluster volume statedump <volname> nfs
+```
-For quotad: `gluster volume statedump <volname> quotad`
+For quotad:
+```
+gluster volume statedump <volname> quotad
+```
For brick processes, files will be created in `statedump-directory` with the name of the file as `hyphenated-brick-path.<pid>.dump.timestamp`. For all other processes it will be `glusterdump.<pid>.dump.timestamp`.
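For instance, a listing of the statedump-directory might look like this (hypothetical pids and timestamps, reusing the brick path that appears in the examples further below):
```
ls <statedump-directory>
data-brick01a-homegfs.2145.dump.1405493251
glusterdump.5225.dump.1405493251
```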
@@ -24,21 +33,21 @@ processes could have used the `SIGUSR1` signal already for other purposes.
To generate a statedump for processes using libgfapi, the below command can be
executed from one of the nodes in the gluster cluster to which the libgfapi
application is connected.
-
- gluster volume statedump <volname> client <hostname>:<process id>
-
+```
+gluster volume statedump <volname> client <hostname>:<process id>
+```
The statedumps can be found in the `statedump-directory`, the name of the
statedumps being `glusterdump.<pid>.dump.timestamp`. For a process there can be
multiple such files created depending on the number of times the volume is
accessed by the process (related to the number of `glfs_init()` calls).
-##How to read statedump
+## How to read statedump
We shall see snippets of each type of statedump.
The first and last lines of the file have the starting and ending times of writing the statedump file. Times will be in the UTC timezone.
The mallinfo return status is printed in the following format. Please read `man mallinfo` for more information about what each field means.
-###Mallinfo
+### Mallinfo
```
[mallinfo]
mallinfo_arena=100020224 /* Non-mmapped space allocated (bytes) */
@@ -53,7 +62,7 @@ mallinfo_fordblks=3310112 /* Total free space (bytes) */
mallinfo_keepcost=133712 /* Top-most, releasable space (bytes) */
```
-###Data structure allocation stats
+### Data structure allocation stats
For every translator loaded in the call-graph, the memory used per data structure is displayed in the following format:
For xlator with name: glusterfs
@@ -74,7 +83,7 @@ max_num_allocs=3 #Maximum number of active allocations at any point in the life
total_allocs=7 #Number of times this data is allocated in the life of the process.
```
-###Mempools
+### Mempools
Mempools are an optimization to reduce the number of allocations of a data type. If we create a mem-pool of, let's say, 1024 elements for a data type, new elements will be allocated from the heap using calls like calloc only if all the 1024 elements in the pool are in active use.
@@ -94,7 +103,7 @@ cur-stdalloc=0 #Denotes the number of allocations made from heap once cold-count
max-stdalloc=0 #Maximum number of allocations from heap that are in active use at any point in the life of the process.
```
-###Iobufs
+### Iobufs
```
[iobuf.global]
iobuf_pool=0x1f0d970 #The memory pool for iobufs
@@ -105,7 +114,7 @@ iobuf_pool.arena_cnt=8 #Total number of arenas in the pool
iobuf_pool.request_misses=0 #The number of iobufs that were stdalloc'd (as they exceeded the default max page size provided by iobuf_pool).
```
-There are 3 lists of arenas
+There are 3 lists of arenas:
1. Arena list: arenas allocated during iobuf pool creation and the arenas that are in use (active_cnt != 0) will be part of this list.
2. Purge list: arenas that can be purged (no active iobufs, active_cnt == 0).
@@ -142,7 +151,7 @@ arena.6.active_iobuf.2.ptr=0x7fdb92189000
At any given point in time if there are lots of filled arenas then that could be a sign of iobuf leaks.
-###Call stack
+### Call stack
All the fops received by gluster are handled using call-stacks. A call-stack contains information about the uid/gid/pid, etc. of the process that is executing the fop. Each call-stack contains different call-frames, one per xlator that handles that fop.
```
@@ -157,7 +166,7 @@ op=LOOKUP #Fop
type=1 #Type of the op i.e. FOP/MGMT-OP
cnt=9 #Number of frames in this stack.
```
-###Call-frame
+### Call-frame
Each frame will have information about which xlator the frame belongs to, which function it wound to/from, and which function it will be unwound to. It also mentions whether the unwind happened or not. If we observe hangs in the system and want to find out which xlator is causing them, take a statedump and see which is the final xlator that is yet to be unwound.
```
@@ -172,7 +181,7 @@ wind_to=priv->children[i]->fops->lookup
unwind_to=afr_lookup_cbk #Parent xlator function to which unwind happened
```
-###History of operations in Fuse
+### History of operations in Fuse
Fuse maintains a history of the operations that happened in fuse.
@@ -188,7 +197,7 @@ TIME=2014-07-09 16:44:57.523394
message=[0] fuse_getattr_resume: 4591, STAT, path: (/iozone.tmp), gfid: (3afb4968-5100-478d-91e9-76264e634c9f)
```
-###Xlator configuration
+### Xlator configuration
```
[cluster/replicate.r2-replicate-0] #Xlator type, name information
child_count=2 #Number of children to the xlator
@@ -208,7 +217,7 @@ favorite_child=-1
wait_count=1
```
-###Graph/inode table
+### Graph/inode table
```
[active graph - 1]
@@ -220,7 +229,7 @@ conn.1.bound_xl./data/brick01a/homegfs.lru_size=183 #Number of inodes present in
conn.1.bound_xl./data/brick01a/homegfs.purge_size=0 #Number of inodes present in purge list
```
-###Inode
+### Inode
```
[conn.1.bound_xl./data/brick01a/homegfs.active.324] #324th inode in active inode list
gfid=e6d337cf-97eb-44b3-9492-379ba3f6ad42 #Gfid of the inode
@@ -239,7 +248,7 @@ ia_type=2
Ref by xl:.fuse=1
Ref by xl:.patchy-client-0=-1
```
-###Inode context
+### Inode context
For each inode, some context could be stored per xlator. This context can also be printed in the statedump. Here is the inode ctx of the locks xlator:
```
[xlator.features.locks.homegfs-locks.inode]
@@ -256,12 +265,12 @@ lock-dump.domain.domain=homegfs-replicate-0 #Domain name where entry/data operat
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=11141120, len=131072, pid = 18446744073709551615, owner=080b1ada117f0000, client=0xb7fc30, connection-id=compute-30-029.com-3505-2014/06/29-14:46:12:477358-homegfs-client-0-0-1, granted at Sun Jun 29 11:10:36 2014 #Active lock information
```
-##FAQ
-###How to debug Memory leaks using statedump?
+## FAQ
+### How to debug Memory leaks using statedump?
-####Using memory accounting feature:
+#### Using memory accounting feature:
-`https://bugzilla.redhat.com/show_bug.cgi?id=1120151` is one of the bugs which was debugged using statedump to see which data-structure is leaking. Here is the process used to find what the leak is using statedump. According to the bug the observation is that the process memory usage is increasing whenever one of the bricks is wiped in a replicate volume and a `full` self-heal is invoked to heal the contents. Statedump of the process is taken using kill -USR1 `<pid-of-gluster-self-heal-daemon>`.
+[Bug 1120151](https://bugzilla.redhat.com/show_bug.cgi?id=1120151) is one of the bugs which was debugged using statedump to see which data-structure is leaking. Here is the process used to find what the leak is using statedump. According to the bug the observation is that the process memory usage is increasing whenever one of the bricks is wiped in a replicate volume and a `full` self-heal is invoked to heal the contents. Statedump of the process is taken using `kill -USR1 <pid-of-gluster-self-heal-daemon>`.
```
grep -w num_allocs glusterdump.5225.dump.1405493251
num_allocs=77078
@@ -284,10 +293,10 @@ grep of the statedump revealed too many allocations for the following data-types
3. gf_common_mt_mem_pool.
After checking the afr code for allocations with tag `gf_common_mt_char`, it was found that the `data-self-heal` code path does not free one such allocated memory. `gf_common_mt_mem_pool` suggests that there is a leak in pool memory. The `replicate-0:dict_t`, `glusterfs:data_t` and `glusterfs:data_pair_t` pools are using a lot of memory, i.e. cold_count is `0` and there are too many allocations. Checking the source code of dict.c revealed that `key` in `dict` is allocated with `gf_common_mt_char`, i.e. tag `2.`, and the value is created using gf_asprintf, which in turn uses `gf_common_mt_asprintf`, i.e. tag `1.`. Browsing the code for leaks in self-heal code paths led to a line which over-writes a variable with a new dictionary even when it was already holding a reference to another dictionary. After fixing these leaks, the same test was run to verify that none of the `num_allocs` values increase even after healing a 10,000-file directory hierarchy, as seen in the statedump of the self-heal daemon.
-Please check http://review.gluster.org/8316 for more info about patch/code.
+Please check this [patch](http://review.gluster.org/8316) for more info about the fix.
-####Debugging leaks in memory pools:
-Statedump output of memory pools was used to test and verify the fixes to https://bugzilla.redhat.com/show_bug.cgi?id=1134221. On code analysis, dict_t objects were found to be leaking (in terms of not being unref'd enough number of times, during name self-heal. The test involved creating 100 files on plain replicate volume, removing them from one of the bricks's backend, and then triggering lookup on them from the mount point. Statedump of the mount process was taken before executing the test case and after it, after compiling glusterfs with -DDEBUG flags (to have cold count set to 0 by default).
+#### Debugging leaks in memory pools:
+Statedump output of memory pools was used to test and verify the fixes to [Bug 1134221](https://bugzilla.redhat.com/show_bug.cgi?id=1134221). On code analysis, dict_t objects were found to be leaking (in terms of not being unref'd enough number of times) during name self-heal. The test involved creating 100 files on a plain replicate volume, removing them from the backend of one of the bricks, and then triggering lookup on them from the mount point. Statedumps of the mount process were taken before and after executing the test case, after compiling glusterfs with -DDEBUG flags (to have cold count set to 0 by default).
Statedump output of the fuse mount process before the test case was executed:
@@ -319,7 +328,7 @@ cur-stdalloc=214
max-stdalloc=220
```
-Here, with cold count being 0 by default, cur-stdalloc indicated the number of dict_t objects that were allocated in heap using mem_get(), and yet to be freed using mem_put() (refer to https://github.com/gluster/glusterfs/blob/master/doc/data-structures/mem-pool.md for more details on how mempool works). After the test case (name selfheal of 100 files), there was a rise in the cur-stdalloc value (from 14 to 214) for dict_t.
+Here, with cold count being 0 by default, `cur-stdalloc` indicated the number of `dict_t` objects that were allocated in heap using `mem_get()`, and yet to be freed using `mem_put()` (refer to this [page](https://github.com/gluster/glusterfs/blob/master/doc/data-structures/mem-pool.md) for more details on how mempool works). After the test case (name selfheal of 100 files), there was a rise in the cur-stdalloc value (from 14 to 214) for `dict_t`.
After these leaks were fixed, glusterfs was again compiled with -DDEBUG flags, the same steps were performed, and statedumps of the mount were taken before and after executing the test case. This was done to ascertain the validity of the fix. The following are the results:
@@ -353,8 +362,8 @@ max-stdalloc=119
```
The value of cur-stdalloc remained 14 before and after the test, indicating that the fix indeed does what it's supposed to do.
-###How to debug hangs because of frame-loss?
-`https://bugzilla.redhat.com/show_bug.cgi?id=994959` is one of the bugs where statedump was helpful in finding where the frame was lost. Here is the process used to find where the hang is using statedump.
+### How to debug hangs because of frame-loss?
+[Bug 994959](https://bugzilla.redhat.com/show_bug.cgi?id=994959) is one of the bugs where statedump was helpful in finding where the frame was lost. Here is the process used to find where the hang is using statedump.
When the hang was observed, statedumps were taken for all the processes. In the mount's statedump the following stack is shown:
```
[global.callpool.stack.1.frame.1]
@@ -402,4 +411,4 @@ unwind_to=qr_readdirp_cbk
```
`unwind_to` shows that the call was unwound to `afr_readdirp_cbk` from the client xlator.
Inspecting that function revealed that afr was not unwinding the stack when the fop failed.
-Check http://review.gluster.org/5531 for more info about patch/code changes.
+Check this [patch](http://review.gluster.org/5531) for more info about the fix.