glusterfs.git/libglusterfs/src/mem-types.h, branch v3.10.7

dht/rebalance: allocate migrator thread pool dynamically

2016-07-28T12:46:30+00:00

Problem: The maximum number of migrator threads was statically set
to 40, while the number of threads rebalance actually creates depends
on the number of cores the user has. If the number of cores exceeds 40,
a crash or memory corruption can be seen.

Fix: Make the migrator thread pool dynamic.
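
A minimal sketch of the idea, not the actual dht rebalance code: the
thread array is sized at runtime instead of being a fixed 40-entry
array. The names migrator_pool_t, migrator_pool_start,
migrator_pool_stop and migrator_worker are hypothetical and exist
only for this illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical pool type; not the real dht rebalance structure. */
typedef struct {
    pthread_t *threads; /* heap array sized at runtime */
    int        count;   /* number of threads actually started */
} migrator_pool_t;

static void *migrator_worker(void *arg)
{
    (void)arg; /* a real worker would dequeue and migrate files here */
    return NULL;
}

/* Start `wanted` worker threads. Returns 0 on success, -1 on failure. */
int migrator_pool_start(migrator_pool_t *pool, int wanted)
{
    assert(pool != NULL && wanted > 0);

    /* No compile-time upper bound: allocate exactly what was asked for. */
    pool->threads = calloc((size_t)wanted, sizeof(*pool->threads));
    pool->count = 0;
    if (pool->threads == NULL)
        return -1;

    for (int i = 0; i < wanted; i++) {
        if (pthread_create(&pool->threads[i], NULL,
                           migrator_worker, NULL) != 0)
            break;
        pool->count++;
    }
    return pool->count == wanted ? 0 : -1;
}

/* Join every started thread and release the array. */
void migrator_pool_stop(migrator_pool_t *pool)
{
    for (int i = 0; i < pool->count; i++)
        pthread_join(pool->threads[i], NULL);
    free(pool->threads);
    pool->threads = NULL;
    pool->count = 0;
}
```

With this shape, a caller could pass a core count above 40 (for
example 64) without writing past the end of a fixed array.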

Change-Id: Ifbdac8a1a396363dd75e2f6bcb454070cfdbf839
BUG: 1359711
Signed-off-by: Susant Palai 
Reviewed-on: http://review.gluster.org/15000
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Raghavendra G

features/bitrot: Move throttling code to libglusterfs

2016-07-18T12:03:17+00:00

Since throttling is a separate feature by itself,
move throttling code to libglusterfs.

Change-Id: If9b99885ceb46e5b1865a4af18b2a2caecf59972
BUG: 1352019
Signed-off-by: Kotresh HR 
Reviewed-on: http://review.gluster.org/14846
Smoke: Gluster Build System 
CentOS-regression: Gluster Build System 
NetBSD-regression: NetBSD Build System 
Reviewed-by: Ravishankar N 
Reviewed-by: Jeff Darcy

protocol: add getactivelk () fop

2016-05-02T01:04:31+00:00

Change-Id: Ie38198db990f133fe163ba160cdf647e34f83f4f
BUG: 1326085
Signed-off-by: Susant Palai 
Reviewed-on: http://review.gluster.org/13994
Reviewed-by: Niels de Vos 
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System

Leases: Add a server side xlator to handle lease requests

2016-04-30T05:39:06+00:00

Before this patch, there was an effort to implement leases
in the upcall xlator. Those patches, by Soumya and me, can be
found at http://review.gluster.org/#/c/10084/

Change-Id: I926728c7ec690727a8971039b240655882d02059
BUG: 1319992
Signed-off-by: Poornima G 
Reviewed-on: http://review.gluster.org/11643
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Raghavendra Talur 
Reviewed-by: Rajesh Joseph 
Reviewed-by: Pranith Kumar Karampuri

performance/decompounder: Introducing decompounder xlator

2016-04-26T06:47:28+00:00

This xlator decompounds the compound fops received,
and executes them serially.

Change-Id: Ieddcec3c2983dd9ca7919ba9d7ecaa5192a5f489
BUG: 1303829
Signed-off-by: Anuradha Talur 
Reviewed-on: http://review.gluster.org/13577
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Pranith Kumar Karampuri

cli/quota: Sort the list output alphabetically by path

2016-04-25T09:32:58+00:00

Change-Id: I0b124e119d167817be2ae3eb52ac6c80fc7db5d1
BUG: 1320716
Signed-off-by: vmallika 
Reviewed-on: http://review.gluster.org/14000
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Kaushal M

syncop: Add parallel dir scan functionality

2016-04-05T00:36:33+00:00

Most of the ideas behind this functionality were contributed
by Richard Wareing, in his patch:
https://bugzilla.redhat.com/show_bug.cgi?id=1221737#c1

VERY BIG thanks to him :-).

After I started porting and testing the patch above, I found a few things we
could improve in it, based on the results we got in testing.
1) We were reading all the indices before launching self-heals. In some customer
cases I worked on there were almost 5 million files/directories that needed
heal. With such a big number the self-heal daemon will be OOM killed if we go
this route. So I modified this to launch heals based on a queue length
limit.

2) We found that for directory hierarchies, multi-threaded self-heal
patch was not giving better results compared to single-threaded
self-heal because of the order problems. We improved index xlator to
give gfid type to make sure that all directories in the indices are
healed before the files that follow in that iteration of readdir
output (http://review.gluster.org/13553). In our testing this led to
zero errors of self-heals as we were only doing self-heals in parallel
for files and not directories. I think we can further improve self-heal
speed for directories by doing name heals in parallel based on similar
techniques Richard's patch showed. I think the best thing there would be to
introduce synccond_t infra (pthread_cond_t kind of infra for syncops)
which I am planning to implement for future releases.

3) Based on 1) and 2), and the fact that afr already retries the
indices in a loop, I removed the retries in the threads.

4) After the refactor, the changes required to bring in multi-threaded
self-heal for ec would just be ~10 lines, most of it will be about
options initialization.

Our tests found that we are able to easily saturate network :-).

High level description of the final feature:
Traditionally self-heal daemon reads the indices (gfids) that need to be healed
from the brick and initiates heal one gfid at a time. Goal of this feature is
to add parallelization to the way we do self-heals in a way we do not regress
in any case but increase parallelization wherever we can. As part of this, the
following knobs are introduced to improve parallelization:
1) We can launch 'max-jobs' number of heals in parallel.
2) We can keep reading indices as long as the wait-q for heals doesn't go over
   'max-qlen' passed as arguments to multi-threaded dir_scan.

As a first cut, we always do healing of directories in serial order one at a time
but for files we launch heals in parallel. In future we can do name-heals of dir
in parallel, but this is not implemented as of now. Reason for this is mentioned
already in '2)' above.

AFR/EC can introduce options like max-shd-threads/wait-qlength which can be set
by users to increase the rate of heals when they want. Please note that the
options will take effect only for the next crawl.
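
The two knobs above can be sketched as plain bookkeeping. This is an
illustration under assumptions, not the syncop dir-scan code: the
names scan_limits_t, scan_submit and scan_job_done are hypothetical,
and the sketch only models the counters, without threads or actual
heals.

```c
#include <assert.h>

/* Hypothetical counters for one multi-threaded directory scan. */
typedef struct {
    int max_jobs; /* heals allowed to run at the same time */
    int max_qlen; /* heals allowed to wait for a free slot */
    int running;  /* heals currently running */
    int queued;   /* heals currently waiting */
} scan_limits_t;

typedef enum {
    SCAN_RUN_NOW,   /* a job slot was free: start the heal */
    SCAN_QUEUED,    /* all slots busy: heal waits in the queue */
    SCAN_PAUSE_READ /* queue full: stop reading indices for now */
} scan_action_t;

/* Decide what happens to the next index entry read from the brick. */
scan_action_t scan_submit(scan_limits_t *s)
{
    if (s->running < s->max_jobs) {
        s->running++;
        return SCAN_RUN_NOW;
    }
    if (s->queued < s->max_qlen) {
        s->queued++;
        return SCAN_QUEUED;
    }
    return SCAN_PAUSE_READ;
}

/* Called when one running heal completes. */
void scan_job_done(scan_limits_t *s)
{
    assert(s->running > 0);
    if (s->queued > 0)
        s->queued--;  /* a waiting heal takes over the freed slot */
    else
        s->running--; /* nothing waiting: the slot becomes idle */
}
```

Because the reader pauses once the queue is full, memory use stays
bounded by max-jobs plus max-qlen entries regardless of how many
indices need heal.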

BUG: 1221737
Change-Id: I8fc0afc334def87797f6d41e309cefc722a317d2
Signed-off-by: Pranith Kumar K 
Reviewed-on: http://review.gluster.org/13569
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Jeff Darcy 
Smoke: Gluster Build System

debug/io-stats: Add FOP sampling feature

2015-11-01T17:14:34+00:00

Summary:
- Using the sampling feature you can record details about every Nth FOP.
  The fields in each sample are: FOP type, hostname, uid, gid, FOP priority,
  port and time taken (latency) to fulfill the request.
- Implemented using a ring buffer which is not malloc'd or calloc'd in the
  IO path; this should make the sampling process pretty cheap.
- DNS resolution is done at dump time, not at sample time, for performance,
  and uses a cache
- Metrics can be used for diagnostics and traffic/IO profiling as well
  as P95/P99 calculations
- To control this feature there are two new volume options:
  diagnostics.fop-sample-interval - The sampling interval, e.g. 1 means
  sample every FOP, 100 means sample every 100th FOP
  diagnostics.fop-sample-buf-size - The size (in bytes) of the ring
  buffer used to store the samples.  In the event that more samples
  are collected in the stats dump interval than can be held in this buffer,
  the oldest samples shall be discarded.  Samples are stored in the log
  directory under /var/log/glusterfs/samples.
- Uses DNS cache written by sshreyas@fb.com (Thank-you!), the DNS cache
  TTL is controlled by the diagnostics.stats-dnscache-ttl-sec option
  and defaults to 24hrs.
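
The ring buffer and sampling interval described above could look
roughly like the following sketch. It is not the io-stats code: the
types fop_sample_t and sample_ring_t and the ring_* functions are
hypothetical, the sample carries only two of the listed fields, and
the capacity is counted in samples rather than bytes to keep the
example short.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, reduced sample record. */
typedef struct {
    int    fop_type;
    double latency_usec;
} fop_sample_t;

typedef struct {
    fop_sample_t *slots;    /* allocated once, outside the IO path */
    size_t        capacity; /* number of slots */
    size_t        head;     /* next slot to write */
    size_t        count;    /* valid samples currently held */
    unsigned long seen;     /* FOPs observed so far */
    unsigned long interval; /* record every interval-th FOP */
} sample_ring_t;

int ring_init(sample_ring_t *r, size_t capacity, unsigned long interval)
{
    assert(capacity > 0 && interval > 0);
    r->slots = calloc(capacity, sizeof(*r->slots));
    if (r->slots == NULL)
        return -1;
    r->capacity = capacity;
    r->head = r->count = 0;
    r->seen = 0;
    r->interval = interval;
    return 0;
}

/* IO path: no allocation here. Returns 1 if the FOP was sampled. */
int ring_record(sample_ring_t *r, const fop_sample_t *s)
{
    r->seen++;
    if (r->seen % r->interval != 0)
        return 0;
    r->slots[r->head] = *s; /* overwrites the oldest when full */
    r->head = (r->head + 1) % r->capacity;
    if (r->count < r->capacity)
        r->count++;
    return 1;
}

/* Oldest sample still held, or NULL if the ring is empty. */
const fop_sample_t *ring_oldest(const sample_ring_t *r)
{
    if (r->count == 0)
        return NULL;
    return &r->slots[(r->head + r->capacity - r->count) % r->capacity];
}

void ring_free(sample_ring_t *r)
{
    free(r->slots);
    r->slots = NULL;
}
```

With an interval of 1 every FOP is recorded; with 100, every 100th.
Once the ring is full, each new sample replaces the oldest one, which
matches the discard behaviour described for the buffer size option.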

Test Plan:
- Valgrind'd to ensure it's leak free
- Run prove test(s)
- Shadow testing on 100+ brick cluster

Change-Id: I9ee14c2fa18486b7efb38e59f70687249d3f96d8
BUG: 1271310
Signed-off-by: Jeff Darcy 
Reviewed-on: http://review.gluster.org/12210
Tested-by: Gluster Build System 
Reviewed-by: Vijay Bellur

libglusterfs: Use GF_CALLOC/GF_FREE instead of CALLOC/FREE

2015-07-02T07:36:27+00:00

- Also removed numbers for the types as the string form of type is printed in
  statedump now, so the numbers are not needed anymore.

Change-Id: I6e8c15a1dc8cb6187842f96f1d46ec0f26a602b4
BUG: 1237381
Signed-off-by: Pranith Kumar K 
Reviewed-on: http://review.gluster.org/11495
Tested-by: Gluster Build System 
Tested-by: NetBSD Build System 
Reviewed-by: Vijay Bellur

rebalance: Introducing local crawl and parallel migration

2015-04-29T13:48:00+00:00

The current patch addresses two parts of the proposed design.
1. Rebalance multiple files in parallel
2. Crawl only bricks that belong to the current node

Brief design explanation for the above two points.

1. Rebalance multiple files in parallel:
   -------------------------------------
The existing rebalance engine is single threaded. Hence, this patch
introduces multiple threads which run in parallel with the crawler.
The current rebalance migration is converted to a "Producer-Consumer"
framework.

Where Producer is : Crawler
      Consumer is : Migrating Threads

Crawler: Crawler is the main thread. The job of the crawler is now
limited to fix-layout of each directory and add the files which are
eligible for the migration to a global queue in a round robin manner
so that we will use all the disk resources efficiently. Hence, the
crawler will not be "blocked" by migration process.

Consumer: Each consumer will monitor the global queue. If any file is
added to this queue, it will dequeue that entry and migrate the file.
Currently 20 migration threads are spawned at the beginning of the
rebalance process. Hence, multiple file migration happens in parallel.
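
The producer-consumer split described above can be sketched with a
mutex-protected queue and a condition variable. This is only an
illustration under assumptions: migrate_queue_t, file_entry_t and the
mq_* functions are hypothetical names, the queue is a single FIFO
without the round-robin placement mentioned above, and no migration
is performed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

typedef struct file_entry {
    const char        *path; /* file eligible for migration */
    struct file_entry *next;
} file_entry_t;

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    file_entry_t   *head, *tail;
    int             done; /* set once the crawler has finished */
} migrate_queue_t;

void mq_init(migrate_queue_t *q)
{
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
    q->head = q->tail = NULL;
    q->done = 0;
}

/* Producer (crawler): append a file and wake one waiting consumer. */
int mq_push(migrate_queue_t *q, const char *path)
{
    file_entry_t *e = malloc(sizeof(*e));
    if (e == NULL)
        return -1;
    e->path = path;
    e->next = NULL;

    pthread_mutex_lock(&q->lock);
    if (q->tail != NULL)
        q->tail->next = e;
    else
        q->head = e;
    q->tail = e;
    pthread_cond_signal(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
    return 0;
}

/* Producer (crawler): no more files will be added. */
void mq_finish(migrate_queue_t *q)
{
    pthread_mutex_lock(&q->lock);
    q->done = 1;
    pthread_cond_broadcast(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
}

/* Consumer (migrator thread): blocks until a file is available.
 * Returns NULL once the queue is drained and the crawler is done. */
const char *mq_pop(migrate_queue_t *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->head == NULL && !q->done)
        pthread_cond_wait(&q->nonempty, &q->lock);
    file_entry_t *e = q->head;
    if (e != NULL) {
        q->head = e->next;
        if (q->head == NULL)
            q->tail = NULL;
    }
    pthread_mutex_unlock(&q->lock);

    if (e == NULL)
        return NULL;
    const char *path = e->path;
    free(e);
    return path;
}
```

Each migrator thread would loop on mq_pop() until it returns NULL,
so the crawler never waits for a migration to finish before moving on
to the next directory.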

2. Crawl only bricks that belong to the current node:
   --------------------------------------------------
As the rebalance process is spawned per node, it migrates only the
files that belong to its own node for the sake of load balancing. But
it also reads entries from the whole cluster, which is not necessary
as readdir hits other nodes.

New Design:
        As part of the new design the rebalancer decides the subvols
that are local to the rebalancer node by checking the node-uuid of the
root directory before the crawler starts. Hence, readdir won't hit
the whole cluster as it already has the context of local subvols, and
the node-uuid request for each file can also be avoided. This makes the
rebalance process "more scalable".

Change-Id: I73ed6ff807adea15086eabbb8d9883e88571ebc1
BUG: 1171954
Signed-off-by: Susant Palai 
Reviewed-on: http://review.gluster.org/9657
Tested-by: Gluster Build System 
Reviewed-by: N Balachandran 
Reviewed-by: Shyamsundar Ranganathan