glusterfs.git, branch release-3.13

afr: capture the correct errno in post-op quorum check

2018-02-06T14:28:04+00:00

If the post-op phase of the txn did not meet quorum checks, use that errno to
unwind the FOP rather than blindly setting ENOTCONN.
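
The errno choice described above can be sketched as follows. This is a
minimal illustration, not the afr code: pick_unwind_errno(), its parameters,
and the fallback to ENOTCONN when no errno was recorded are assumptions made
for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper: choose the errno used to unwind a FOP.
 * quorum_met    - whether the post-op phase met quorum
 * post_op_errno - errno recorded by the post-op quorum check (0 if none) */
static int
pick_unwind_errno(bool quorum_met, int post_op_errno)
{
    assert(post_op_errno >= 0);

    if (quorum_met)
        return 0; /* nothing to report */

    /* Prefer the errno captured during the post-op quorum check; fall back
     * to ENOTCONN only when nothing was recorded (assumption). */
    return post_op_errno != 0 ? post_op_errno : ENOTCONN;
}
```

With this shape, a caller that recorded EIO during the post-op quorum check
would unwind with EIO instead of ENOTCONN.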

Change-Id: I0cb0c8771ec75a45f9a25ad4cd8601103deddf0c
BUG: 1536346
Signed-off-by: Ravishankar N 
(cherry picked from commit 440a048f24b006c80af3d7bcd0a1f13fe3459d87)

afr: don't treat all cases of all bricks being blamed as split-brain

2018-02-06T14:27:55+00:00

Problem:
We currently have no roll-back/undo of post-ops if quorum is not
met. Though the FOP is still unwound with failure, the xattrs remain on
the disk. Due to these partial post-ops and partial heals (healing only when
2 bricks are up), we can end up in split-brain purely from the afr
xattrs point of view, i.e. each brick is blamed by at least one of the
others. These scenarios are hit when there are frequent
connects/disconnects of the client/shd to the bricks while I/O or heal
is in progress.

Fix:
Instead of undoing the post-op, pick a source based on the xattr values.
If 2 bricks blame one, the blamed one must be treated as sink.
If there is no majority, all are sources. Once we pick a source,
self-heal will then do the heal instead of erroring out due to
split-brain.
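
The source-selection rule above can be sketched for a replica 3 volume as
follows. This is an illustration under assumptions, not the afr
implementation: the blame matrix layout, REPLICA_COUNT, and pick_sources()
are hypothetical names introduced for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define REPLICA_COUNT 3

/* blame[i][j] is true when the xattrs stored on brick i record pending
 * operations against brick j (hypothetical layout, for illustration).
 * Fills sources[] and returns the number of bricks chosen as sources. */
static size_t
pick_sources(bool blame[REPLICA_COUNT][REPLICA_COUNT],
             bool sources[REPLICA_COUNT])
{
    size_t source_count = 0;

    assert(blame != NULL && sources != NULL);

    for (size_t j = 0; j < REPLICA_COUNT; j++) {
        size_t accusers = 0;

        for (size_t i = 0; i < REPLICA_COUNT; i++) {
            if (i != j && blame[i][j])
                accusers++;
        }

        /* Blamed by both other bricks: sink. Otherwise: source. */
        sources[j] = (accusers < 2);
        if (sources[j])
            source_count++;
    }

    return source_count;
}
```

When each brick is blamed by only one other brick there is no majority, so
the sketch marks all three as sources. When two bricks blame the third, only
the third is marked as a sink.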

Change-Id: I3d0224b883eb0945785ade0e9697a1c828aec0ae
BUG: 1541458
Signed-off-by: Ravishankar N 
(cherry picked from commit 0e6e8216823c2d9dafb81aae0f6ee3497c23d140)

cluster/afr: remove unnecessary child_up initialization

2018-02-05T09:15:06+00:00

The child_up array was initialized with all elements being -1 to
allow afr_notify() to differentiate down bricks from bricks that
haven't reported yet. With the current implementation this is no longer
needed, and it caused unexpected results where other parts of
the code assumed that child_up[i] != 0 meant the brick was up.
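
The pitfall can be illustrated with a small sketch. The int element type and
count_up_nonzero() are assumptions made for the example; they are not taken
from the afr sources.

```c
#include <assert.h>
#include <stddef.h>

/* Counts bricks the way a caller that only tests for non-zero would. */
static size_t
count_up_nonzero(const int *child_up, size_t count)
{
    size_t up = 0;

    assert(child_up != NULL);

    for (size_t i = 0; i < count; i++) {
        if (child_up[i]) /* -1 ("not reported yet") also passes this test */
            up++;
    }

    return up;
}
```

With every element initialized to -1, such a caller counts every brick as up
even though none has reported. With zero initialization it counts none.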

Backport of:
> BUG: 1541038

Change-Id: I2a9d712ee64c512f24bd5cd3a48dcb37e3139472
BUG: 1541929
Signed-off-by: Xavier Hernandez

cluster/ec: Do lock conflict check correctly for wait-list

2018-02-02T15:05:01+00:00

Problem:
ec_link_has_lock_conflict() traverses only owner_list,
but the function is also called with wait_list.

Fix:
Modify ec_link_has_lock_conflict() to traverse lists correctly.
Updated the callers to reflect the changes.

BUG: 1540896
Change-Id: Ibd7ea10f4498e7c2761f9a6faac6d5cb7d750c91
Signed-off-by: Pranith Kumar K

afr: add quorum checks in post-op

2018-02-01T13:57:42+00:00

afr relies on pending changelog xattrs to identify sources and sinks, and
these xattrs are set in post-op. So if post-op fails, we need to
unwind the write txn with a failure.
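
A minimal sketch of such a check, assuming per-brick success flags and a
caller-supplied quorum count (post_op_meets_quorum() and its parameters are
hypothetical and do not appear in the afr sources):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical check: post_op_ok[i] is true when the post-op (xattr update)
 * succeeded on brick i; quorum_count is the number of successes required. */
static bool
post_op_meets_quorum(const bool *post_op_ok, size_t brick_count,
                     size_t quorum_count)
{
    size_t success = 0;

    assert(post_op_ok != NULL);

    for (size_t i = 0; i < brick_count; i++) {
        if (post_op_ok[i])
            success++;
    }

    return success >= quorum_count;
}
```

If this returns false, the write txn would be unwound with a failure rather
than reported as successful.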

Change-Id: I0f019ac03890108324ee7672883d774918b20be1
BUG: 1536346
Signed-off-by: Ravishankar N 
(cherry picked from commit a40a87ec3b226ae86a6ed8f4af25b45965a20cad)

build: glibc has removed rpc headers and rpcgen in Fedora28, use libtirpc

2018-01-25T15:56:37+00:00

Other Linux distributions are doing the same; some have already done
so.

Switch to libtirpc(-devel) and unbundled rpcgen packages. For now
rpcgen is still provided by the glibc-rpcgen RPM, but rpcsvc-proto's
rpcgen subpackage is available now; it will not be used until
glibc-rpcgen is retired. (note, rpcsvc-proto's rpcgen is just named
rpcgen-...rpm. I.e. not rpcsvc-proto-rpcgen-...rpm.) Either one
will satisfy the BuildRequires: rpcgen.

Also, when a .spec file has
  BuildRequires: foo-devel
it is not necessary to also have:
  BuildRequires: foo
or even:
  BuildRequires: foo foo-devel

The foo-devel package has a dependency on foo, which will install foo
automatically. It's usually also not necessary to have a corresponding
  Requires: foo
as the rpmbuild process will also automatically determine the
install-time dependencies.

See also Change-Id: I4a8292de2eddad16137df5998334133fc1e11261
and/or https://review.gluster.org/19311
and Change-Id: I97dc39c7844f44c36fe210aa813480c219e1e415
and/or https://review.gluster.org/#/c/19330/

Change-Id: I86f847dfda0fef83e22c6e8b761342d652a2d9ba
BUG: 1536187
Signed-off-by: Kaleb S. KEITHLEY

doc: Added release notes for 3.13.2

2018-01-20T00:15:42+00:00

Change-Id: I80f411f3820f82cb27fd5f8cf1cf99d5565d8b9d
BUG: 1530334
Signed-off-by: ShyamsundarR

selinux-xlator : validate dict before calling dict_rename_key()

2018-01-19T14:57:52+00:00

Upstream reference :
>Change-Id: I71da3b64e5e8c82e8842e119b2b05da3e2ace550
>BUG: 1535772
>Signed-off-by: Jiffin Tony Thottan 
>(cherry picked from commit bee06ccd7b80e3f5804f0c7c7c56936fed6d2b4e)

Change-Id: I71da3b64e5e8c82e8842e119b2b05da3e2ace550
BUG: 1536294

cluster/afr: Adding option to take full file lock

2018-01-19T14:24:52+00:00

Problem:
In replica 3 volumes there is a possibility of ending up in a
split-brain scenario when multiple clients write data to the same file
at non-overlapping regions in parallel.

Scenario:
- Initially all the copies are good and all the clients get the value
  of data-readables as all good.
- Client C0 performs write W1 which fails on brick B0 and succeeds on
  other two bricks.
- C1 performs write W2 which fails on B1 and succeeds on other two bricks.
- C2 performs write W3 which fails on B2 and succeeds on other two bricks.
- All the 3 writes above happen in parallel and fall on different ranges
  so afr takes granular locks and all the writes are performed in parallel.
  Since each client had data-readables as good, it does not see
  file going into split-brain in the in_flight_split_brain check, hence
  performs the post-op marking the pending xattrs. Now all the bricks
  are being blamed by each other, ending up in split-brain.
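
The scenario can be replayed with a small sketch. The blame matrix, BRICKS,
mark_failed_write(), and every_brick_blamed() are hypothetical names used
only for this illustration; they model the pending xattrs, not the afr data
structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BRICKS 3

/* blame[i][j]: brick i holds a pending xattr against brick j (hypothetical). */
static void
mark_failed_write(bool blame[BRICKS][BRICKS], size_t failed_brick)
{
    assert(failed_brick < BRICKS);

    /* Post-op: every brick where the write succeeded blames the brick
     * where it failed. */
    for (size_t i = 0; i < BRICKS; i++) {
        if (i != failed_brick)
            blame[i][failed_brick] = true;
    }
}

/* Returns true when each brick is blamed by at least one other brick. */
static bool
every_brick_blamed(bool blame[BRICKS][BRICKS])
{
    for (size_t j = 0; j < BRICKS; j++) {
        bool blamed = false;

        for (size_t i = 0; i < BRICKS; i++) {
            if (i != j && blame[i][j])
                blamed = true;
        }

        if (!blamed)
            return false;
    }

    return true;
}
```

Calling mark_failed_write() for B0, B1 and B2 in turn (W1, W2, W3) leaves
every brick blamed by the other two, which is the split-brain state described
in the scenario.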

Fix:
Add an option to take either a full lock or a range lock on files while
doing data transactions, to prevent the possibility of ending up in
split-brain. With this change, files are fully locked by default
while doing I/O. To use the old range-lock behaviour,
change the value of "cluster.full-lock" to "no".

Change-Id: I7893fa33005328ed63daa2f7c35eeed7c5218962
BUG: 1535438
Signed-off-by: karthik-us

tests: Use /dev/urandom instead of /dev/random for dd

2018-01-19T06:50:16+00:00

If the system does not have enough entropy, reading /dev/random takes a
significant amount of time, since the /dev/random buffers take a long time
to fill to the size this dd run needs.
Milind found that, because of this, this test file takes almost 1000 seconds
or more to pass instead of about a minute.

Backport of:
>BUG: 1431955

BUG: 1533023
Signed-off-by: Pranith Kumar K 
Change-Id: I9145b17f77f09d0ab71816ae249c69b8fe14c1a5