glusterfs.git/xlators/cluster/afr, branch v5.0

afr: fix incorrect reporting of directory split-brain

2018-10-11T11:00:02+00:00

Backport of https://review.gluster.org/#/c/glusterfs/+/21135/

Problem:
When a directory has dirty xattrs due to failed post-ops or when
replace/reset brick is performed, AFR does a conservative merge as
expected, but heal-info reports it as split-brain because there are no
clear sources.

Fix:
Modify pending flag to contain information about pending heals and
split-brains. For directories, if split-brain flag is not set, just show
them as needing heal and not being in split-brain.
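
A minimal sketch of the directory reporting rule described above, in C.
The flag names, the status enum and the helper
sketch_dir_heal_info_status() are hypothetical and are not the
identifiers used in the AFR sources. They only illustrate a single flag
word that carries separate "needs heal" and "split-brain" bits.

```c
#include <assert.h>

/* Hypothetical flag bits; the real AFR names and values may differ. */
enum {
    SKETCH_PENDING_HEAL = 1 << 0,        /* entry has pending heals */
    SKETCH_PENDING_SPLIT_BRAIN = 1 << 1, /* entry is in split-brain */
};

typedef enum {
    SKETCH_STATUS_CLEAN,
    SKETCH_STATUS_NEEDS_HEAL,
    SKETCH_STATUS_SPLIT_BRAIN,
} sketch_status_t;

/* Decide what to report for a directory from its pending flag word. */
static sketch_status_t
sketch_dir_heal_info_status(unsigned flags)
{
    /* Only the two hypothetical bits above are understood here. */
    assert((flags & ~(unsigned)(SKETCH_PENDING_HEAL |
                                SKETCH_PENDING_SPLIT_BRAIN)) == 0);

    if (flags == 0)
        return SKETCH_STATUS_CLEAN;
    if (flags & SKETCH_PENDING_SPLIT_BRAIN)
        return SKETCH_STATUS_SPLIT_BRAIN;
    /* Split-brain bit not set: report as needing heal only. */
    return SKETCH_STATUS_NEEDS_HEAL;
}
```

With this shape, a directory with dirty xattrs but no split-brain bit is
reported as needing heal even when there is no clear source.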

Change-Id: I09ef821f6887c87d315ae99e6b1de05103cd9383
fixes: bz#1638163
Signed-off-by: Ravishankar N

afr: prevent winding inodelks twice for arbiter volumes

2018-10-11T10:56:41+00:00

Backport of https://review.gluster.org/#/c/glusterfs/+/21380/

Problem:
In an arbiter volume, if there is a pending data heal of a file only on
arbiter brick, self-heal takes inodelks twice due to a code-bug but unlocks
it only once, leaving behind a stale lock on the brick. This causes
the next write to the file to hang.

Fix:
Fix the code-bug to take lock only once. This bug was introduced in master
with commit eb472d82a083883335bc494b87ea175ac43471ff
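
A small C sketch of why two lock calls paired with a single unlock
leave a stale lock, and of one generic way to guard against it. This is
not the actual patch: the structs and the sketch_heal_lock() and
sketch_heal_unlock() helpers are hypothetical, and the "locked" guard is
only an illustration of taking the lock once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the lock state kept on a brick. */
struct sketch_brick {
    int lock_refs; /* number of inodelks currently held on the brick */
};

/* Hypothetical per-heal context remembering whether we hold the lock. */
struct sketch_heal {
    bool locked;
};

/* Take the lock at most once per heal, however often this is called. */
static void
sketch_heal_lock(struct sketch_heal *h, struct sketch_brick *b)
{
    assert(h != NULL && b != NULL);
    if (h->locked)
        return; /* already held: do not wind a second inodelk */
    b->lock_refs++;
    h->locked = true;
}

/* Release the lock if this heal holds it. */
static void
sketch_heal_unlock(struct sketch_heal *h, struct sketch_brick *b)
{
    assert(h != NULL && b != NULL);
    if (!h->locked)
        return;
    assert(b->lock_refs > 0);
    b->lock_refs--;
    h->locked = false;
}
```

Without the guard, two lock calls would raise lock_refs to 2 and one
unlock would leave it at 1, which models the stale lock that blocks the
next write.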

Thanks to Pranith Kumar K for finding the RCA.

fixes: bz#1638159
Change-Id: I15ad969e10a6a3c4bd255e2948b6be6dcddc61e1
Signed-off-by: Ravishankar N

cluster/afr: Make data eager-lock decision based on number of locks

2018-10-05T14:37:01+00:00

For both Virt and block workloads the file is opened multiple times
leading to dynamically setting eager-lock to off for the workload.
Instead of depending on the number-of-open-fds, if we change the
logic to depend on number of inodelks, then it will give better
performance than the earlier logic. When there is an eager-lock
and number of inodelks is more than 1 we know that there is a
conflicting lock, so depend on that information to decide whether
to let the current transaction go through delayed-post-op or not.
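
The decision described above can be sketched in C as follows. The
struct, its fields and sketch_use_delayed_post_op() are hypothetical
names chosen for illustration; the real AFR code keeps this state in its
own structures and may apply further conditions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the lock state seen by a data transaction. */
struct sketch_lock_state {
    bool eager_lock_held; /* an eager-lock is currently held */
    int inodelk_count;    /* number of inodelks reported on the inode */
};

/*
 * Return true if the transaction may go through delayed post-op.
 * More than one inodelk while holding an eager-lock is treated as a
 * conflicting lock, regardless of how many fds are open on the file.
 */
static bool
sketch_use_delayed_post_op(const struct sketch_lock_state *st)
{
    assert(st != NULL && st->inodelk_count >= 0);
    if (!st->eager_lock_held)
        return false;
    return st->inodelk_count <= 1;
}
```

The point of the sketch is that the number of open fds does not appear
in the decision at all; only the lock count does.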

The locks xlator has no implementation to query the number of locks in
fxattrop in releases older than 3.10, so to keep things backward
compatible in 3.12, data transactions will use the new logic whereas
fxattrop transactions will use the old logic. I am planning to send one
more patch which makes metadata domain locks also depend on
inodelk-count.

Profile info for a dd of 500MB to a file with another fd opened
on the file using exec 250>filename

Without this patch:
%-latency   Avg-latency  Min-latency    Max-latency  Calls  Fop
     0.14      67.41 us     16.72 us     3870.82 us    892  FINODELK
     0.59     279.87 us     95.71 us     2085.89 us    898  FXATTROP
     3.46     366.43 us     81.75 us     6952.79 us   4000  WRITE
    95.79  148733.99 us  50568.12 us   919127.86 us    273  FSYNC

With this patch:
%-latency   Avg-latency  Min-latency    Max-latency  Calls  Fop
     0.00      51.01 us     38.07 us       80.16 us      4  FINODELK
     0.00     235.43 us    235.43 us      235.43 us      1  TRUNCATE
     0.00     125.07 us     56.80 us      193.33 us      2  GETXATTR
     0.00     135.86 us     62.13 us      209.59 us      2  INODELK
     0.00     197.88 us    155.39 us      253.90 us      4  FXATTROP
     0.00     450.59 us    394.28 us      506.89 us      2  XATTROP
     0.00      56.96 us     19.06 us      406.59 us     23  FLUSH
    37.81  273648.93 us     48.43 us  6017657.05 us     44  LOOKUP
    62.18    4951.86 us     93.80 us  1143154.75 us   3999  WRITE

PostgreSQL benchmark performance changed from ~1130 TPS to ~2300 TPS
randio fio job inside oVirt based VM went from ~600 IOPS to ~2000 IOPS

fixes bz#1635972
Change-Id: If7f7388d2f08cf7f17ca517a4ea222560661dc36
Signed-off-by: Pranith Kumar K

cluster/afr: Batch writes in same lock even when multiple fds are open

2018-10-05T14:37:01+00:00

Problem:
When eager-lock is disabled because of multiple-fds opened and app
writes come on conflicting regions, the number of locks grows very
fast leading to all the CPU being spent just in locking and unlocking
by traversing huge queues in locks xlator for granting locks.

Fix:
Reduce the number of locks in transit by bundling the writes in the
same lock and disable delayed piggy-back when we learn that multiple
fds are open on the file. This will reduce the size of queues in the
locks xlator. This also reduces the number of network calls like
inodelk/fxattrop.

Please note that this problem can still happen if eager-lock is
disabled as the writes will not be bundled in the same lock.

fixes bz#1635975
Change-Id: I8fd1cf229aed54ce5abd4e6226351a039924dd91
Signed-off-by: Pranith Kumar K

gfapi: revert several patches that introduced pre/post attrs

2018-09-17T14:26:06+00:00

Reverted the following:
  - 248152767b0599986bbb6bb35fc27197f6be6964
  - 09943beb499617212f2985ca8ea9ecd1ed1b470e
  - d01f7244e9d9f7e3ef84e0ba7b48ef1b1b09d809

The reverts are redone by hand, due to clang format changes
that made using git to revert the changes more tedious.

Change-Id: I96489638a2b641fb2206a110298543225783f7be
Updates: bz#1628620
Signed-off-by: ShyamsundarR

Land part 2 of clang-format changes

2018-09-12T12:22:45+00:00

Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4
Signed-off-by: Nigel Babu

Land clang-format changes

2018-09-12T11:52:48+00:00

Change-Id: I6f5d8140a06f3c1b2d196849299f8d483028d33b

multiple xlators: strncpy()->sprintf(), reduce strlen()'s

2018-09-07T03:39:50+00:00

xlators/cluster/afr/src/afr-common.c
xlators/cluster/dht/src/dht-common.c
xlators/cluster/dht/src/dht-rebalance.c
xlators/cluster/stripe/src/stripe-helpers.c

strncpy may not be very efficient for short strings copied into
a large buffer: If the length of src is less than n,
strncpy() writes additional null bytes to dest to ensure
that a total of n bytes are written.

Instead, use snprintf().
Also:
- save the result of strlen() and re-use it when possible.
- move from strlen to SLEN (sizeof() ) for const strings.
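
A short C sketch of the two points above. SKETCH_SLEN, sketch_copy() and
sketch_has_afr_prefix() are hypothetical helpers written for this
illustration (the commit itself refers to a macro named SLEN), and the
"trusted.afr." literal is only an example string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Length of a string literal without the NUL, computed at compile time. */
#define SKETCH_SLEN(lit) (sizeof(lit) - 1)

/*
 * Copy src into dst (capacity cap bytes) and return the number of bytes
 * stored, excluding the NUL. Unlike strncpy(), snprintf() writes a
 * single terminator instead of zero-filling the rest of the buffer.
 */
static size_t
sketch_copy(char *dst, size_t cap, const char *src)
{
    assert(dst != NULL && src != NULL && cap > 0);
    int n = snprintf(dst, cap, "%s", src);
    if (n < 0) {
        dst[0] = '\0';
        return 0;
    }
    /* snprintf returns the untruncated length; clamp it to what fit. */
    return ((size_t)n < cap) ? (size_t)n : cap - 1;
}

/* Prefix test that avoids a runtime strlen() on a constant string. */
static int
sketch_has_afr_prefix(const char *key)
{
    assert(key != NULL);
    return strncmp(key, "trusted.afr.", SKETCH_SLEN("trusted.afr.")) == 0;
}
```

The value returned by sketch_copy() can be kept and reused in place of a
later strlen(dst) call.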

Compile-tested only!

Change-Id: Icdf79dd3d9f9ff120e4720ff2b8bd016df575c38
updates: bz#1193929
Signed-off-by: Yaniv Kaul

afr: thin-arbiter read txn changes

2018-09-05T08:28:23+00:00

If both data bricks are up, read subvol will be based on read_subvols.

If only one data brick is up:
- First query the data-brick that is up. If it blames the other brick,
allow the reads.

- If it doesn't, query the TA to obtain the source of truth.
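
The read path above can be sketched in C as a small decision function.
The enum, its values and sketch_ta_read_step() are hypothetical and do
not come from the AFR sources. The case where no data brick is up is not
described in this message and is included only as a labelled assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical next step for a read txn on a thin-arbiter volume. */
typedef enum {
    SKETCH_READ_FROM_READ_SUBVOLS, /* both data bricks up */
    SKETCH_READ_FROM_UP_BRICK,     /* up brick blames the other one */
    SKETCH_QUERY_THIN_ARBITER,     /* ask the TA for the source of truth */
    SKETCH_NO_DATA_BRICK_UP,       /* assumption: not covered above */
} sketch_read_step_t;

/*
 * up[i] tells whether data brick i is up. up_brick_blames_peer is the
 * result of querying the brick that is up; it is ignored unless exactly
 * one brick is up.
 */
static sketch_read_step_t
sketch_ta_read_step(const bool up[2], bool up_brick_blames_peer)
{
    assert(up != NULL);
    if (up[0] && up[1])
        return SKETCH_READ_FROM_READ_SUBVOLS;
    if (!up[0] && !up[1])
        return SKETCH_NO_DATA_BRICK_UP;
    if (up_brick_blames_peer)
        return SKETCH_READ_FROM_UP_BRICK;
    return SKETCH_QUERY_THIN_ARBITER;
}
```

The sketch stops at "query the TA" and does not model how the TA's
answer is interpreted.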

TODO: See if in-memory state can be maintained for read txns (BZ 1624358).

updates: bz#1579788
Change-Id: I61eec35592af3a1aaf9f90846d9a358b2e4b2fcc
Signed-off-by: Ravishankar N

glusterd: Fix Buffer size issues

2018-09-04T14:01:59+00:00

This patch fixes buffer size issue 1138522.

Change-Id: Ia12fc8f34f75704f8ed3efae2022c4fd67a8c76c
updates: bz#789278
Signed-off-by: Sanju Rakonde