| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The general idea of the changes is to prevent resetting event generation
to zero in the inode ctx, since event gen is something that should
follow 'causal order'.
Change #1:
For a read txn, in inode refresh cbk, if event_generation is
found zero, we are failing the read fop. This is not needed
because change in event gen is only a marker for the next inode refresh to
happen and should not be taken into account by the current read txn.
Change #2:
The event gen being zero above can happen if there is a racing lookup,
which resets even get (in afr_lookup_done) if there are non zero afr
xattrs. The resetting is done only to trigger an inode refresh and a
possible client side heal on the next lookup. That can be acheived by
setting the need_refresh flag in the inode ctx. So replaced all
occurences of resetting even gen to zero with a call to
afr_inode_need_refresh_set().
Change #3:
In both lookup and discover path, we are doing an inode refresh which is
not required since all 3 essentially do the same thing- update the inode
ctx with the good/bad copies from the brick replies. Inode refresh also
triggers background heals, but I think it is okay to do it when we call
refresh during the read and write txns and not in the lookup path.
The .ts which relied on inode refresh in lookup path to trigger heals are
now changed to do read txn so that inode refresh and the heal happens.
Change-Id: Iebf39a9be6ffd7ffd6e4046c96b0fa78ade6c5ec
Fixes: #1179
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reported-by: Erik Jacobson <erik.jacobson at hpe.com>
(cherry picked from commit f0fcd909ad4535b60c9208d4804ebe6afe421a09)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
...if pending xattrs are zero for all children.
Problem:
If there are no pending xattrs and a metadata heal needs to be
performed, it can be possible that we end up with xattrs inadvertendly
deleted from all bricks, as explained in the BZ.
Fix:
After picking one among the sources as the good copy, mark pending xattrs on
all sources to blame the sinks. Now even if this metadata heal fails midway,
a subsequent heal will still choose one of the valid sources that it
picked previously.
Updates: #1067
Change-Id: If1b050b70b0ad911e162c04db4d89b263e2b8d7b
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
(cherry picked from commit 2d5ba449e9200b16184b1e7fc84cabd015f1f779)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The was a problem when self-heal was sending lookups at the same time
that one of the bricks was coming up. In this case there was a chance
that the number of 'up' bricks changes in the middle of sending the
requests to subvolumes which caused a discrepancy in the expected
number of replies and the actual number of sent requests.
This discrepancy caused that AFR continued executing requests before
all requests were complete. Eventually, the frame of the pending
request was destroyed when the operation terminated, causing a use-
after-free issue when the answer was finally received.
In theory the same thing could happen in the reverse way, i.e. AFR
tries to wait for more replies than sent requests, causing a hang.
Backport of:
> Change-Id: I7ed6108554ca379d532efb1a29b2de8085410b70
> Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
> Fixes: bz#1808875
Change-Id: I7ed6108554ca379d532efb1a29b2de8085410b70
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
Fixes: bz#1809438
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In a hyperconverged setup with granular-entry-heal enabled, if a file is
recreated while one of the bricks is down, and an index heal is triggered
(with the brick still down), entry-self heal was doing a spurious heal
with just the 2 good bricks. It was doing a post-op leading to removal
of the filename from .glusterfs/indices/entry-changes as well as
erroneous setting of afr xattrs on the parent. When the brick came up,
the xattrs were cleared, resulting in the renamed file not getting
healed and leading to gfid split-brain and EIO on the mount.
Fix:
Proceed with entry heal only when shd can connect to all bricks of the replica,
just like in data and metadata heal.
fixes: bz#1804591
Change-Id: I916ae26ad1fabf259bc6362da52d433b7223b17e
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
(cherry picked from commit 06453d77d056fbaa393a137ca277a20e38d2f67e)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
When we mount a ta volume, as soon as 2 data bricks are connected
we consider that the mount is done and then send a lookup/create on
ta file on ta node. However, this connection with ta node might not
have been completed.
Due to this delay, ta replica id file will not be created and we
will see ENOTCONN error in log file if we do lookup.
Solution:
As we know that this ta node could have a higher latency, we should
wait for reasonable time for connection to happen before sending
lookup/create on replica id file.
fixes: bz#1804058
Change-Id: I36f90865afe617e4e84cee57fec832a16f5dd6cc
(cherry picked from commit a7fa54ddea3fe429f143b37e4de06a93b49d776a)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
heal-info code assumes that all indices in xattrop directory
definitely need heal. There is one corner case.
The very first xattrop on the file will lead to adding the
gfid to 'xattrop' index in fop path and in _cbk path it is
removed because the fop is zero-xattr xattrop in success case.
These gfids could be read by heal-info and shown as needing heal.
Fix:
Check the pending flag to see if the file definitely needs or
not instead of which index is being crawled at the moment.
fixes: bz#1802449
Change-Id: I79f00dc7366fedbbb25ec4bec838dba3b34c7ad5
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
(cherry picked from commit d27df94016b5526c18ee964d4a47508326329dda)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of https://review.gluster.org/#/c/glusterfs/+/23960/
This volume option was not made avaialble to `gluster volume set` CLI.
Reported-by: epolakis(https://github.com/kinsu) in
https://github.com/gluster/glusterfs/issues/781
fixes: bz#1788785
Change-Id: I7141bdd4e53ee99e22b354edde8d023bfc0b2cd7
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Changes in locks xlator:
Added support for per-domain inodelk count requests.
Caller needs to set GLUSTERFS_MULTIPLE_DOM_LK_CNT_REQUESTS key in the
dict and then set each key with name
'GLUSTERFS_INODELK_DOM_PREFIX:<domain name>'.
In the response dict, the xlator will send the per domain count as
values for each of these keys.
Changes in AFR:
Replaced afr_selfheal_locked_inspect() with afr_lockless_inspect(). Logic has
been added to make the latter behave same as the former, thus not
breaking the current heal info output behaviour.
fixes: bz#1783858
Change-Id: Ie9e83c162aa77f44a39c2ba7115de558120ada4d
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
(cherry picked from commit d7e049160a9dea988ded5816491c2234d40ab6b3)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In a situation where B1 blames B2, B2 blames B1 and B3 doesn't blame
anything for entry heal, heal will not complete even though we have
clear source and sinks. This will happen because while doing
afr_selfheal_find_direction() only the bricks which are blamed by
non-accused bricks are considered as sinks. Later in
__afr_selfheal_entry_finalize_source() when it tries to mark all the
non-sources as sinks it fails to do so because there won't be any
healed_sinks marked, no witness present and there will be a source.
Fix:
If there is a source and no healed_sinks, then reset all the locked
sources to 0 and healed sinks to 1 to do conservative merge.
Change-Id: If40d8bc95d52a52b2730f55bdcf135109b421548
Fixes: bz#1760699
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ever since we added quorum checks for lookups in afr via commit
bd44d59741bb8c0f5d7a62c5b1094179dd0ce8a4, the split-brain resolution
commands would not work for replica 3 because there would be no
readables for the lookup fop.
The argument was that split-brains do not occur in replica 3 but we do
see (data/metadata) split-brain cases once in a while which indicate that there are
a few bugs/corner cases yet to be discovered and fixed.
Fortunately, commit 8016d51a3bbd410b0b927ed66be50a09574b7982 added
GF_CLIENT_PID_GLFS_HEALD as the pid for all fops made by glfsheal. If we
leverage this and allow lookups in afr when pid is GF_CLIENT_PID_GLFS_HEALD,
split-brain resolution commands will work for replica 3 volumes too.
Likewise, the check is added in shard_lookup as well to permit resolving
split-brains by specifying "/.shard/shard-file.xx" as the file name
(which previously used to fail with EPERM).
Change-Id: I3c543dea79caf7cfbc1633e9089cb1cdd2538ba9
Fixes: bz#1760791
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
(cherry picked from commit 47dbd753187f69b3835d2e42fdbe7485874c4b3e)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After add-brick and rebalance, the ctime xattr is not present
on rebalanced directories on new brick. This patch fixes the
same.
Note that ctime still doesn't support consistent time across
distribute sub-volume.
This patch also fixes the in-memory inconsistency of time attributes
when metadata is self healed.
Backport of:
> Patch: https://review.gluster.org/23127/
> Change-Id: Ia20506f1839021bf61d4753191e7dc34b31bb2df
> BUG: 1734026
> Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit 304640e55c0f3c6d15f4e230dc6376e4f5020fea)
Change-Id: Ia20506f1839021bf61d4753191e7dc34b31bb2df
Signed-off-by: Kotresh HR <khiremat@redhat.com>
fixes: bz#1752429
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were not passing xattr_req when doing a name self heal
as well as a meta data heal. Because of this, some xdata
was missing which causes i/o errors
Backport of > https://review.gluster.org/#/c/glusterfs/+/23024/
>Change-Id: Ibfb1205a7eb0195632dc3820116ffbbb8043545f
>Fixes: bz#1728770
>Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Fixes: bz#1749305
Change-Id: Ibfb1205a7eb0195632dc3820116ffbbb8043545f
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
(cherry picked from commit d026f0bcfd301712e4f0671ccf238f43f2e6dd30)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
...whenever shd is re-enabled after disabling or there is a change in
`cluster.heal-timeout`, without needing to restart shd or waiting for the
current `cluster.heal-timeout` seconds to expire.
See BZ 1743988 for more details.
Change-Id: Ia5ebd7c8e9f5b54cba3199c141fdd1af2f9b9bfe
fixes: bz#1747301
Reported-by: Glen Kiessling <glenk1973@hotmail.com>
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
(cherry picked from commit 600ba94183333c4af9b4a09616690994fd528478)
|
|
|
|
|
|
|
| |
Fixes: bz#1741041
Change-Id: I29e338bac62104233a6f80212df8d0fb016affda
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
(cherry picked from commit 8e9c53ebf16705b9a1db2fc486dc24a5cb244ddd)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In case of thin arbiter, before index healer starts crawling the
indices at every heal-timeout interval, even if there is nothing to
be healed it will send an upcall notification to all the clients to
release any AFR_TA_DOM_NOTIFY locks that they hold. SHD will wait
for the upcall to return before proceeding with the heal even though
there is nothing to be healed. This will also invalidates the cached
information about the bricks states on the clients which leads to
extra calls on TA from clients for the next reads & writes if needed.
This will impact the IO performance.
Fix:
- Before sending the upcall to the clients, check for any pending heals
on TA without taking any locks.
- If there is nothing marked bad on TA, then continue with the index
crawl to heal any dirty markings present on the files due to any post-op
failure.
- If there is a brick marked as bad on TA, then take the
AFR_TA_DOM_NOTIFY lock on TA from SHD, get the state on TA and
continue with the current healing process.
Change-Id: Ieb477bc6cb18bbdfd4e7a0453c5ed79b574ec9d6
fixes: bz#1729483
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problems:
1. When checking for type and gfid mismatch, if the type or gfid
is unknown because of missing gfid handle and the gfid xattr
it will be reported as type or gfid mismatch and the heal will
not complete.
2. If the source selected during entry heal has null gfid the same
will be sent to afr_lookup_and_heal_gfid(). In this function when
we try to assign the gfid on the bricks where it does not exist,
we are considering the same gfid and try to assign that on those
bricks. This will fail in posix_gfid_set() since the gfid sent
is null.
Fix:
If the gfid sent to afr_lookup_and_heal_gfid() is null choose a
valid gfid before proceeding to assign the gfid on the bricks
where it is missing.
In afr_selfheal_detect_gfid_and_type_mismatch(), do not report
type/gfid mismatch if the type/gfid is unknown or not set.
Change-Id: Ia06552e4dc4a9f89cb7f5302833604bd21bbf7da
fixes: bz#1729481
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Network latency is an important factor selecting a read subvolume.
So this patch is adding two new policy.
1) We measure the latency of a child during a GF_DUMP rpc call.
Then use this latency to pick a read subvol having the least
latency.
2) Second one is an hybrid mode where it calculates the effective
latency by multiplying outstanding pending read request and
latency, and choose the least one.
Change-Id: Ia49c8a08ab61f7dcdad8b8950aa4d338e7accf97
fixes: #520
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
| |
We should free the mem_pool local_pool during an afr_fini.
Otherwise this will lead to mem leak for shd
Change-Id: I805a34a88077bf7b886c28b403798bf9eeeb1c0b
Updates: bz#1716695
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are many include statements that are not needed.
A previous more ambitious attempt failed because of *BSD plafrom
(see https://review.gluster.org/#/c/glusterfs/+/21929/ )
Now trying a more conservative reduction.
It does not solve all circular deps that we have, but it
does reduce some of them. There is just too much to handle
reasonably (dht-common.h includes dht-lock.h which includes
dht-common.h ...), but it does reduce the overall number of lines
of include we need to look at in the future to understand and fix
the mess later one.
Change-Id: I550cd001bdefb8be0fe67632f783c0ef6bee3f9f
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
We currently don't have a roll-back/undoing of post-ops if quorum is not met.
Though the FOP is still unwound with failure, the xattrs remain on the disk.
Due to these partial post-ops and partial heals (healing only when 2 bricks
are up), we can end up in metadata split-brain purely from the afr xattrs
point of view i.e each brick is blamed by atleast one of the others for
metadata. These scenarios are hit when there is frequent connect/disconnect
of the client/shd to the bricks.
Fix:
Pick a source based on the xattr values. If 2 bricks blame one, the blamed
one must be treated as sink. If there is no majority, all are sources. Once
we pick a source, self-heal will then do the heal instead of erroring out
due to split-brain.
This patch also adds restriction of all the bricks to be up to perform
metadata heal to avoid any metadata loss.
Removed the test case tests/bugs/replicate/bug-1468279-source-not-blaming-sinks.t
as it was doing metadata heal even when only 2 of 3 bricks were up.
Change-Id: I07a9d62f84ceda329dcab1f02a33aeed258dcb09
fixes: bz#1717819
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
In function "afr_selfheal_entry_granular", after completing the
heal we are not destroying the frame. This will lead to crash.
when we execute statedump operation, where it tried to access
xlator object. If this xlator object is freed as part of the
graph destroy this will lead to an invalid memory access
Change-Id: I0a5e78e704ef257c3ac0087eab2c310e78fbe36d
fixes: bz#1708926
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
We were not properly cleaning self-heal daemon resources
during ec fini. With shd multiplexing, it is absolutely
necessary to cleanup all the resources during ec fini.
Change-Id: Iae4f1bce7d8c2e1da51ac568700a51088f3cc7f2
fixes: bz#1703948
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- pass fop state instead of afr local to
afr_ta_dom_lock_check_and_release()
- avoid afr_lock_release_synctask() being called simultaneosuly from
notify code path and transaction (post-op) code path due to races.
- Check if the post-op on TA is valid based on event_gen checks.
- Invalidate in-memory information when we get TA child down.
Note: Thi patch addresses some pending review comments of commit
053b1309dc8fbc05fcde5223e734da9f694cf5cc
(https://review.gluster.org/#/c/glusterfs/+/20095/)
fixes: bz#1698449
Change-Id: I2ccd7e1b53362f9f3fed8680aecb23b5011eb18c
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I was working on a blog about troubleshooting AFR issues and I wanted to copy
the messages logged by self-heal for my blog. I then realized that AFR-v2 is not
logging *before* attempting data heal while it logs it for metadata and entry
heals.
I [MSGID: 108026] [afr-self-heal-entry.c:883:afr_selfheal_entry_do]
0-testvol-replicate-0: performing entry selfheal on
d120c0cf-6e87-454b-965b-0d83a4c752bb
I [MSGID: 108026] [afr-self-heal-common.c:1741:afr_log_selfheal]
0-testvol-replicate-0: Completed entry selfheal on
d120c0cf-6e87-454b-965b-0d83a4c752bb. sources=[0] 2 sinks=1
I [MSGID: 108026] [afr-self-heal-common.c:1741:afr_log_selfheal]
0-testvol-replicate-0: Completed data selfheal on
a9b5f183-21eb-4fb3-a342-287d3a7dddc5. sources=[0] 2 sinks=1
I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do]
0-testvol-replicate-0: performing metadata selfheal on
a9b5f183-21eb-4fb3-a342-287d3a7dddc5
I [MSGID: 108026] [afr-self-heal-common.c:1741:afr_log_selfheal]
0-testvol-replicate-0: Completed metadata selfheal on
a9b5f183-21eb-4fb3-a342-287d3a7dddc5. sources=[0] 2 sinks=1
Adding it in this patch. Now there is a 'performing' and a corresponding
'Completed' message for every type of heal.
fixes: bz#1707746
Change-Id: I0b954cf1e17b48280aefa76640b5119b92133d61
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Fixed coverity error, "Unchecked return value (CHECKED_RETURN)".
Checking return value & logging error message if afr_set_pending_dict
fails.
updates: bz#789278
Change-Id: Iab7da6b4f3cd0622b95b8e1c412b007a330467e5
Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
|
|
|
|
|
|
| |
Updates: bz#1624701
Change-Id: I7152c28ad85925abccdcc4cd6de8cb2a2b847a51
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When eager-lock lock acquisition fails because of say network failures, the
local is not being removed from owners_list, this leads to accumulation of
waiting frames and the application will hang because the waiting frames are
under the assumption that another transaction is in the process of acquiring
lock because owner-list is not empty. Handled this case as well in this patch.
Added asserts to make it easier to find these problems in future.
fixes bz#1696599
Change-Id: I3101393265e9827755725b1f2d94a93d8709e923
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
| |
memdup() and gf_memdup() have the same implementation. Removed one API
as the presence of both can be confusing.
Change-Id: I562130c668457e13e4288e592792872d2e49887e
updates: bz#1193929
Signed-off-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
| |
This patch address post-merge review comments for commit
5784a00f997212d34bd52b2303e20c097240d91c
Change-Id: I7ed954664a2ae8e1091d23ee3ceb9c66e83bfeac
fixes: bz#1697930
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
When split-brain choice is changed from one brick to another
brick, inode-invalidate is not called so readv call is served
from cache leading to failures in split-brain-resolution.t.
Fixed it by calling inode_invaldate() when this happens.
updates bz#1193929
Change-Id: I2624614eec38c0303f3e1dc55dfae3d4b864218b
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Found missing assignment of lk-owner for an inodelk/entrylk before winding
the fops. locks xlator at the moment allows this operation. This leads to
multiple threads in the same client being able to get locks on the inode
because lk-owner is same and transport is same. So isolation with locks can't
be achieved. To fix it, we need locks xlator change which will disallow
null-lk-owner based inodelk/entrylk/lk. To achieve that we need to first
fix all the places which do this mistake.
updates bz#1624701
Change-Id: Ic3431da3f451a1414f1f4fdcfc4cf41e555f69dd
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Fixes afr_ta_read_txn() to handle inode refresh failures.
code-path.
- Fixes a double free issue of dict.
Note: This patch address post-merge review comments for commit
69532c141be160b3fea03c1579ae4ac13018dcdf
fixes: bz#1686398
Change-Id: Id5299b45b68569d47df6b73755918237a1592cb4
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
| |
client-pid for glustershd is GF_CLIENT_PID_SELF_HEALD
client-pid for glfsheal is GF_CLIENT_PID_GLFS_HEALD
updates: bz#1689250
Change-Id: Ib3a863af160ff48c822a5e6b0c27c575c9887470
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
| |
updates bz#1193929
Change-Id: I01b60d644f517c00a1bcc127bf9a8ed90b6eb7a0
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
| |
In afr_ta_post_op_do, we were sending EIO for every failure.
However, the original error code should be sent.
Change-Id: I9fdc15dac00d758baf8e6f14db244f526481a63a
updates: bz#1686711
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In an arbiter volume configuration SHD will not send any writes onto the arbiter
brick even if there is data pending marker for the arbiter brick. If we have a
arbiter setup on the geo-rep master and there are data pending markers for the files
on arbiter brick, SHD will not mark any data changelog during healing. While syncing
the data from master to slave, if the arbiter-brick is considered as ACTIVE, then
there is a chance that slave will miss out some data. If the arbiter brick is being
newly added or replaced there is a chance of slave missing all the data during sync.
Fix:
If there is data pending marker for the arbiter brick, send truncate on the arbiter
brick during heal, so that it will record truncate as the data transaction in changelog.
Change-Id: I3242ba6cea6da495c418ef860d9c3359c5459dec
fixes: bz#1686568
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Currently even if open & opendir fails on quorum number of bricks,
but succeeds on atleast one brick, it will result in success. This leads
to inconsistency in the behaviour with other operations following the
open, which has quorum checks.
Fix:
Add quorum checks to open & opendir fops to avoid inconsistency.
Change-Id: If8fcb82072a6dc45ea6d4a6754b79763215eba2a
fixes: bz#1634664
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
As afr_changelog_fsync is used for internal operations, use
GLUSTERFS_INTERNAL_FOP_KEY so that lease xlator can avoid treating
it as conflicting fop and recall lease.
Change-Id: I52cdc161002e840199d24439231a8bfa4f98b1b6
updates: bz#1648768
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
|
|
|
|
|
|
|
|
|
| |
We were not properly cleaning self-heal daemon resources
during afr fini. This patch will clean the same.
Change-Id: I597860be6f781b195449e695d871b8667a418d5a
updates: bz#1659708
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a fop to create an entry fails on one of the data brick,
we mark the pending changelog on the entry on brick for which
it was successful. This is done as part of post op phase to
make sure that entry gets healed even if it gets renamed to
some other path where its parent was not marked as bad.
As it happens as part of post op, we should consider thin-arbiter
to check if the brick, which was successful, is the good brick or not.
This will avoide split brain and other issues.
Change-Id: I12686675be98f02f70a5186b3ed748c541514d53
updates: bz#1662264
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Mostly, unlock before logging.
In some cases, moved different code that was not needed
to be under lock (for example, taking time, or malloc'ing)
to be executed before taking the lock.
Note: logging might be slightly less accurate in order, since it may
not be done now under the lock, so order of logs is racy. I think
it's a reasonable compromise.
Compile-tested only!
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: I2438710016afc9f4f62a176ef1a0d3ed793b4f89
|
|
|
|
|
|
|
|
|
|
|
|
| |
Automatic Splitbrain with size as policy must
not resolve splitbrains when the copies are of same size.
Determining if the sizes of copies are same and
returning -1 in that case.
updates: bz#1655052
Change-Id: I3d8e8b4d7962b070ed16c3ee02a1e5a926fd5eab
Signed-off-by: Iraj Jamali <ijamali@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In automatic Splitbrain resolution when favorite child policy
is set as size, split brain resolution must not work for
directories.
Currently, if a directory is in split brain with both copies
having same size, the source is selected arbitrarily
and healed.
fixes: bz#1655050
Change-Id: I5739498639c17c89874cc577362e543adab55f5d
Signed-off-by: Sheetal Pamecha <sheetal.pamecha08@gmail.com>
|
|
|
|
|
|
|
|
| |
This maybe one mistake when coding.
Fixes: bz#1665332
Change-Id: Ia8f8dadf4a71579240ff9950b141ca528bd342b3
Signed-off-by: Xiubo Li <xiubli@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes memory leak reported by ASan.
The fix was first merged by
https://review.gluster.org/#/c/glusterfs/+/21805.
But later change was reverted due to this patch
https://review.gluster.org/#/c/glusterfs/+/21178/.
updates: bz#1633930
Change-Id: I1febe121e0be33a637397a0b54d6b78391692b0d
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|
|
|
|
|
|
|
|
|
| |
With this changeset, default value for the AFR client side
heal volume option is set to "off"
fixes: bz#1663102
Change-Id: Ie4016932339c4896487e3e7cb5caca68739b7ba2
Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For implementing copy_file_range fop, AFR needs to perform two inodelks in the
same transaction. This patch brings in the necessary structure to make it
easier to do so.
Entry-locks in AFR were already taking multiple entry-locks on different inodes
with the respective basenames. This patch extends the logic in inodelks to use
the same lockee_t structure. This lead to removal of quite a lot of duplicate
code present in afr-lk-common.c as both the locks are doing same things except
'winding' part.
updates: #536
Change-Id: Ibfce7e3f260bb27b18645152ec680c33866fe0ae
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: When trying to convert a plain distribute volume to replica-3
or arbiter type it is failing with ENOTCONN error as the lookup on
the root will fail as there is no quorum.
Fix: Allow lookup on root if it is coming from the ADD_REPLICA_MOUNT
which is used while adding bricks to a volume. It will try to set the
pending xattrs for the newly added bricks to allow the heal to happen
in the right direction and avoid data loss scenarios.
Note: This fix will solve the problem of type conversion only in the
case where the volume was mounted at least once. The conversion of
non mounted volumes will still fail since the dht selfheal tries to
set the directory layout will fail as they do that with the PID
GF_CLIENT_PID_NO_ROOT_SQUASH set in the frame->root.
Change-Id: Ic511939981dad118cc946754341318b164954b3b
fixes: bz#1655854
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added '-Wvla' and saw this - gcc doesn't like variable arrays.
There are plenty of others in the EC code, but this seems OK to remove:
there is no use for the array members (I hope - that was from reading
the code).
Compile-tested only!
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: I350f4520e52b86c8bbcd60eea1b27ef99cd119aa
|
|
|
|
|
|
| |
updates bz#1650403
Change-Id: Ib5a11e691599ce4bd93c1ed5aca6060592893961
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|