glusterfs.git/tests/basic/afr, branch v6.2

cluster/thin-arbiter: Consider thin-arbiter before marking new entry changelog

2019-02-01T05:44:36+00:00

If a fop to create an entry fails on one of the data brick,
we mark the pending changelog on the entry on brick for which
it was successful. This is done as part of post op phase to
make sure that entry gets healed even if it gets renamed to
some other path where its parent was not marked as bad.

As it happens as part of post op, we should consider thin-arbiter
to check if the brick, which was successful, is the good brick or not.
This will avoide split brain and other issues.

Change-Id: I12686675be98f02f70a5186b3ed748c541514d53
updates: bz#1662264
Signed-off-by: Ashish Pandey

tests: run nfs tests only if --enable-gnfs is provided

2019-01-24T15:18:00+00:00

Fixes: bz#1665358
Change-Id: Idbf88ec3ac683733b32c313377eeb72f2819bf0d
Signed-off-by: Amar Tumballi

cluster/afr: Disable client side heals in AFR by default.

2019-01-10T11:10:27+00:00

With this changeset, default value for the AFR client side
heal volume option is set to "off"

fixes: bz#1663102
Change-Id: Ie4016932339c4896487e3e7cb5caca68739b7ba2
Signed-off-by: Sunil Kumar Acharya

cluster/ta: Check number/type of locks held on ta file

2018-12-27T05:46:36+00:00

Change-Id: Iec47856ce2819e7d7d38a60279602e53ba45858d
updates: bz#1624332
Signed-off-by: Ashish Pandey

cluster/afr: Add test for thin-arbiter feature

2018-11-26T07:46:10+00:00

Test : Check success/failure of write fop while
different bricks/ta process are down.

Change-Id: I3c376935df93ebf1f794c964bd19bc1280d91c59
updates: bz#1624332
Signed-off-by: Ashish Pandey

tiering: remove the translator from build and glusterd

2018-11-02T02:39:35+00:00

Based on the proposal to remove few features as they are not
actively maintained [1], removing tier translator from the
build. Also make sure there are no regression tests involving
tiering feature are present.

[1] https://lists.gluster.org/pipermail/gluster-users/2018-July/034400.html

Change-Id: I2c177f711f9b54b7b24e1a13525ff3132bd9a9c5
updates: bz#1642807
Signed-off-by: Amar Tumballi

afr: thin-arbiter 2 domain locking and in-memory state

2018-10-25T12:26:22+00:00

2 domain locking + xattrop for write-txn failures:
--------------------------------------------------
- A post-op wound on TA takes AFR_TA_DOM_NOTIFY range lock and
AFR_TA_DOM_MODIFY full lock, does xattrop on TA and releases
AFR_TA_DOM_MODIFY lock and stores in-memory which brick is bad.

- All further write txn failures are handled based on this in-memory
value without querying the TA.

- When shd heals the files, it does so by requesting full lock on
AFR_TA_DOM_NOTIFY domain. Client uses this as a cue (via upcall),
releases AFR_TA_DOM_NOTIFY range lock and invalidates its in-memory
notion of which brick is bad. The next write txn failure is wound on TA
to again update the in-memory state.

- Any incomplete write txns before the AFR_TA_DOM_NOTIFY upcall release
request is got is completed before the lock is released.

- Any write txns got after the release request are maintained in a ta_waitq.

- After the release is complete, the ta_waitq elements are spliced to a
separate queue which is then processed one by one.

- For fops that come in parallel when the in-memory bad brick is still
unknown, only one is wound to TA on wire. The other ones are maintained
in a ta_onwireq which is then processed after we get the response from
TA.

Change-Id: I32c7b61a61776663601ab0040e2f0767eca1fd64
updates: bz#1579788
Signed-off-by: Ravishankar N 
Signed-off-by: Ashish Pandey

cluster/afr: Use 2 domain locking in SHD for thin-arbiter

2018-09-20T09:18:20+00:00

With this change when SHD starts the index crawl it requests
all the clients to release the AFR_TA_DOM_NOTIFY lock so that
clients will know the in memory state is no more valid and
any new operations needs to query the thin-arbiter if required.

When SHD completes healing all the files without any failure, it
will again take the AFR_TA_DOM_NOTIFY lock and gets the xattrs on
TA to see whether there are any new failures happened by that time.
If there are new failures marked on TA, SHD will start the crawl
immediately to heal those failures as well. If there are no new
failures, then SHD will take the AFR_TA_DOM_MODIFY lock and unsets
the xattrs on TA, so that both the data bricks will be considered
as good there after.

Change-Id: I037b89a0823648f314580ba0716d877bd5ddb1f1
fixes: bz#1579788
Signed-off-by: karthik-us

afr: thin-arbiter read txn changes

2018-09-05T08:28:23+00:00

If both data bricks are up, read subvol will be based on read_subvols.

If only one data brick is up:
- First qeury the data-brick that is up. If it blames the other brick,
allow the reads.

- If if doesn't, query the TA to obtain the source of truth.

TODO: See if in-memory state can be maintained for read txns (BZ 1624358).

updates: bz#1579788
Change-Id: I61eec35592af3a1aaf9f90846d9a358b2e4b2fcc
Signed-off-by: Ravishankar N

cluster/afr: Delegate name-heal when possible

2018-09-04T01:52:02+00:00

Problem:
When name-self-heal is triggered on the mount, it blocks
lookup until name-self-heal completes. But that can lead
to hangs when lot of clients are accessing a directory which
needs name heal and all of them trigger heals waiting
for other clients to complete heal.

Fix:
When a name-heal is needed but quorum number of names have the
file and pending xattrs exist on the parent, then better to
delegate the heal to SHD which will be completed as part of
entry-heal of the parent directory. We could also do the same
for quorum-number of names not present but we don't have
any known use-case where this is a frequent occurrence so
not changing that part at the moment. When there is a gfid
mismatch or missing gfid it is important to complete the heal
so that next rename doesn't assume everything is fine and
perform a rename etc

fixes bz#1622821
Change-Id: I8b002c85dffc6eb6f2833e742684a233daefeb2c
Signed-off-by: Pranith Kumar K