| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Change-Id: Ib927a770a486c95e4b157e76ba96e9904d1a9716
Fixes: #1499
Signed-off-by: perrynzhou <perrynzhou@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem1:
When a directory is renamed while a brick
is down entry-heal always did an rm -rf on that directory on
the sink on old location and did mkdir and created the directory
hierarchy again in the new location. This is inefficient.
Problem2:
Renamedir heal order may lead to a scenario where directory in
the new location could be created before deleting it from old
location leading to 2 directories with same gfid in posix.
Fix:
As part of heal, if oldlocation is healed first and is not present in
source-brick always rename it into a hidden directory inside the
sink-brick so that when heal is triggered in new-location shd can
rename it from this hidden directory to the new-location.
If new-location heal is triggered first and it detects that the
directory already exists in the brick, then it should skip healing the
directory until it appears in the hidden directory.
Credits: Ravi for rename-data-loss.t script
Fixes: #1211
Change-Id: I0cba2006f35cd03d314d18211ce0bd530e254843
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Prefer timespec_now_realtime() and gf_time() over clock_gettime()
and time(), use gf_tvdiff() and gf_tsdiff() where appropriate,
drop unused time_elapsed() and leftovers in 'struct posix_private'.
Change-Id: Ie1f0229df5b03d0862193ce2b7fb91d27b0981b6
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Updates: #1002
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Lookup/creation of thin-arbiter ID file happens in background during
mounting. On new volumes, if the ID file creation is in progress, and a
FOP fails on data brick, a post-op (xattrop) is attemtped on TA. Since
the TA file's gfid is null at this point, the ASSERT checks in protocol/
client causes a crash.
Fix:
Given that we decided to do Lookup/creation of thin-arbiter in
background, fail the other AFR FOPS on TA if the ID file's gfid is null
instead of winding it down to protocol/client.
Also remove afr_changelog_thin_arbiter_post_op() which seems to be dead
code.
Updates: #763
Change-Id: I70dc666faf55cc5c8f7cf8e7d36085e4fa399c4d
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Add thin convenient library wrapper gf_time(),
adjust related users and comments as well.
Change-Id: If8969af2f45ee69c30c3406bce5baa8305fb7f80
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Updates: #1002
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
If we set favourite child policy, then automatic split-brain resolution
should work in all cases. This was failing when quorum count was set to
a non-zero value. The initial lookup before the read txn was failing
with ENOTCONN. Since we don't have a readable subvol, we were failing it.
We were only looking to the split brain resolution choice set through the
cli command.
Fix:
We will now consider the favourite child policy if split-brain choice
has not been set via cli command.
Change-Id: Id2016c3a90d0763ac6f1a0131571053f595576f0
Fixes: #1404
Signed-off-by: Mohammed Rafi KC <rafi.kavungal@iternity.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ 144s] afr-dir-read.h:15:1: warning: type of 'afr_opendir' does not match original declaration [-Wlto-type-mismatch]
[ 144s] 15 | afr_opendir(call_frame_t *frame, xlator_t *this, loc_t *loc, fd_t *fd, dict_t *xdata)
[ 144s] | ^
[ 144s] afr-dir-read.c:71:1: note: type mismatch in parameter 5
[ 144s] 71 | afr_opendir(call_frame_t *frame, xlator_t *this, loc_t *loc, fd_t *fd)
[ 144s] | ^
[ 144s] afr-dir-read.c:71:1: note: 'afr_opendir' was previously declared here
only a warning, more of a truth-and-beauty thing
Change-Id: I2d6ff3fa0a8c5e6ef36e090a6545eaf638752192
Updates: #1002
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Fixing the unchecked return value issues reported by coverity scan
CID: 1400734
CID: 1400750
Change-Id: I3c953df9ade4a1548e41e18018edb1b041f7e15e
Signed-off-by: karthik-us <ksubrahm@redhat.com>
Updates: #1060
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added a check for NULL before dereferencing
the object as it may be NULL in few cases
inside the funtion. Also, added a check for
the negative value of gfid_idx.
CID: 1430140
CID: 1430145
Change-Id: Ib7d23459b48bbc471dbcccab6d20572261882d11
Updates: #1060
Signed-off-by: nik-redhat <nladha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: See github issue for details.
Fix:
-In lookup if the entry exists in 2 out of 3 bricks, don't fail the
lookup with ENOENT just because there is an entrylk on the parent.
Consider quorum before deciding.
-If entry FOP does not succeed on quorum no. of bricks, do not perform
new entry mark.
Fixes: #1303
Change-Id: I56df8c89ad53b29fa450c7930a7b7ccec9f4a6c5
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
AFR doesn't delay post-op for fsync fop. For fsync heavy workloads
this leads to un-necessary fxattrop/finodelk for every fsync leading
to bad performance.
Fix:
Have delayed post-op for fsync. Add special flag in xdata to indicate
that afr shouldn't delay post-op in cases where either the
process will terminate or graph-switch would happen. Otherwise it leads
to un-necessary heals when the graph-switch/process-termination
happens before delayed-post-op completes.
Fixes: #1253
Change-Id: I531940d13269a111c49e0510d49514dc169f4577
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In a replicate/arbiter volume if file creations or writes fails on
quorum number of bricks and on one brick it is due to ENOSPC and
on other brick it fails for a different reason, it may fail with
errors other than ENOSPC in some cases.
Fix:
Prioritize ENOSPC over other lesser priority errors and do not set
op_errno in posix_gfid_set if op_ret is 0 to avoid receiving any
error_no which can be misinterpreted by __afr_dir_write_finalize().
Also removing the function afr_has_arbiter_fop_cbk_quorum() which
might consider a successful reply form a single brick as quorum
success in some cases, whereas we always need fop to be successful
on quorum number of bricks in arbiter configuration.
Change-Id: I106e267f8b9451f681022f1cccb410d9bc824c08
Fixes: #1254
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Found with GCC ASan:
Direct leak of 202 byte(s) in 2 object(s) allocated from:
#0 0x7fc6c6ef0667 in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb0667)
#1 0x7fc6c6bd145b in __gf_malloc /path/to/glusterfs/libglusterfs/src/mem-pool.c:175
#2 0x7fc6c6bd17a3 in gf_vasprintf /path/to/glusterfs/libglusterfs/src/mem-pool.c:223
#3 0x7fc6c6bd1993 in gf_asprintf /path/to/glusterfs/libglusterfs/src/mem-pool.c:243
#4 0x7fc6b0dc92f6 in init /path/to/glusterfs/xlators/cluster/afr/src/afr.c:590
...
Change-Id: I29feb1d30a045fb70472758e6ed4e195888090b2
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Fixes: #1278
|
|
|
|
|
|
|
|
|
| |
This patch includes the following CID from Coverity Scan:
*1419116
*1420206
Change-Id: Id92fd6a78c8a00726a61aa4697b5c126ced8ed4d
Updates: #1202
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The general idea of the changes is to prevent resetting event generation
to zero in the inode ctx, since event gen is something that should
follow 'causal order'.
Change #1:
For a read txn, in inode refresh cbk, if event_generation is
found zero, we are failing the read fop. This is not needed
because change in event gen is only a marker for the next inode refresh to
happen and should not be taken into account by the current read txn.
Change #2:
The event gen being zero above can happen if there is a racing lookup,
which resets even get (in afr_lookup_done) if there are non zero afr
xattrs. The resetting is done only to trigger an inode refresh and a
possible client side heal on the next lookup. That can be acheived by
setting the need_refresh flag in the inode ctx. So replaced all
occurences of resetting even gen to zero with a call to
afr_inode_need_refresh_set().
Change #3:
In both lookup and discover path, we are doing an inode refresh which is
not required since all 3 essentially do the same thing- update the inode
ctx with the good/bad copies from the brick replies. Inode refresh also
triggers background heals, but I think it is okay to do it when we call
refresh during the read and write txns and not in the lookup path.
The .ts which relied on inode refresh in lookup path to trigger heals are
now changed to do read txn so that inode refresh and the heal happens.
Change-Id: Iebf39a9be6ffd7ffd6e4046c96b0fa78ade6c5ec
Fixes: #1179
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reported-by: Erik Jacobson <erik.jacobson at hpe.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Support for gluster volume heal <volname> info healed/heal-failed
was removed by commit bb02cfb56ae08f56df4452c2b948fa962ae1212b in
release-3.6. cli parser will display the usage message in all the
supported versions whenever these clis are run, leading to some
dead code in the latest branches. Since support for these clis
were removed long back, this should not give any backward
compatibility issues as well. Hence removing the dead code from
the code base which will lead to better code coverage by the
regression runs as well.
Updates: #1052
Change-Id: I0c2b061469caf233c06d9699b0d159ce48e240b9
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
...if pending xattrs are zero for all children.
Problem:
If there are no pending xattrs and a metadata heal needs to be
performed, it can be possible that we end up with xattrs inadvertendly
deleted from all bricks, as explained in the BZ.
Fix:
After picking one among the sources as the good copy, mark pending xattrs on
all sources to blame the sinks. Now even if this metadata heal fails midway,
a subsequent heal will still choose one of the valid sources that it
picked previously.
Fixes: #1067
Change-Id: If1b050b70b0ad911e162c04db4d89b263e2b8d7b
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Current implementation assumes that ping-event will come after connect event
but that may not be the case in the cases where after socket connection fds
need to be re-opened which would consume more time. So handle any order of the
ping/child-up events.
fixes: bz#1800583
Change-Id: I6bcdc0caa503bdc039ef2b4739fbf4afae121f05
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The was a problem when self-heal was sending lookups at the same time
that one of the bricks was coming up. In this case there was a chance
that the number of 'up' bricks changes in the middle of sending the
requests to subvolumes which caused a discrepancy in the expected
number of replies and the actual number of sent requests.
This discrepancy caused that AFR continued executing requests before
all requests were complete. Eventually, the frame of the pending
request was destroyed when the operation terminated, causing a use-
after-free issue when the answer was finally received.
In theory the same thing could happen in the reverse way, i.e. AFR
tries to wait for more replies than sent requests, causing a hang.
Change-Id: I7ed6108554ca379d532efb1a29b2de8085410b70
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
Fixes: bz#1808875
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In a hyperconverged setup with granular-entry-heal enabled, if a file is
recreated while one of the bricks is down, and an index heal is triggered
(with the brick still down), entry-self heal was doing a spurious heal
with just the 2 good bricks. It was doing a post-op leading to removal
of the filename from .glusterfs/indices/entry-changes as well as
erroneous setting of afr xattrs on the parent. When the brick came up,
the xattrs were cleared, resulting in the renamed file not getting
healed and leading to gfid split-brain and EIO on the mount.
Fix:
Proceed with entry heal only when shd can connect to all bricks of the replica,
just like in data and metadata heal.
fixes: bz#1801624
Change-Id: I916ae26ad1fabf259bc6362da52d433b7223b17e
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
When we mount a ta volume, as soon as 2 data bricks are connected
we consider that the mount is done and then send a lookup/create on
ta file on ta node. However, this connection with ta node might not
have been completed.
Due to this delay, ta replica id file will not be created and we
will see ENOTCONN error in log file if we do lookup.
Solution:
As we know that this ta node could have a higher latency, we should
wait for reasonable time for connection to happen before sending
lookup/create on replica id file.
fixes: bz#1720463
Change-Id: I36f90865afe617e4e84cee57fec832a16f5dd6cc
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In function afr_selfheal_data_block(), we only check for the lock count
to be equal to or greater than the number of sinks. There can be a case
where we have 2 source bricks and one sink and the locking is successful
on only the source brick(s). In this case we continue with the healing
on sink without having a lock, which is not correct.
Fix:
Check for lock on atleast source & one sink before starting the data heal.
Change-Id: Iebcb57dcaa4b31831fedfee63d6ca16e9d6c8df8
fixes: bz#1688115
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
heal-info code assumes that all indices in xattrop directory
definitely need heal. There is one corner case.
The very first xattrop on the file will lead to adding the
gfid to 'xattrop' index in fop path and in _cbk path it is
removed because the fop is zero-xattr xattrop in success case.
These gfids could be read by heal-info and shown as needing heal.
Fix:
Check the pending flag to see if the file definitely needs or
not instead of which index is being crawled at the moment.
fixes: bz#1801623
Change-Id: I79f00dc7366fedbbb25ec4bec838dba3b34c7ad5
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
For files: During metadata heal, we restore timestamps
only for non-regular (char, block etc.) files.
Extenting it for regular files as timestamp is updated
via touch command also
fixes: bz#1787274
Change-Id: I26fe4fb6dff679422ba4698a7f828bf62ca7ca18
Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In many cases, we were freely allocating long keys with no need.
Smaller char arrays are just fine almost anywhere, so just went ahead
and looked where they we can use smaller ones.
In some cases, annotated the functions as static and the prefixes
passed as const as it was easier to read and understand.
Where relevant, converted the dict functions to use known key length.
Change-Id: I882ab33ea20d90b63278336cd1370c09ffdab7f2
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
This volume option was not made avaialble to `gluster volume set` CLI.
Reported-by: epolakis(https://github.com/kinsu) in
https://github.com/gluster/glusterfs/issues/781
fixes: bz#1787554
Change-Id: I7141bdd4e53ee99e22b354edde8d023bfc0b2cd7
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. Perform AFR_COUNT() once, in afr_has_quorum() and pass the result
to afr_lookup_has_quorum()
2. Simplify afr_lookup_has_quorum() - pass less parameters to it.
(Via the change in item 1 above).
3. Make afr_is_add_replica_mount_lookup_on_root() static function.
4. Remove dead code - afr_decide_heal_info() which was not used.
Change-Id: If9168cd01e22788a0e60b91e315787d2aa60e97b
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Code like:
f(..., uuid_utoa(x), uuid_utoa(y));
is not valid (causes undefined behaviour) because uuid_utoa()
uses the only static thread-local buffer which will be overwritten
by the subsequent call. All such cases should be converted to use
uuid_utoa_r() with explicitly specified buffer.
Change-Id: I5e72bab806d96a9dd1707c28ed69ca033b9c8d6c
Updates: bz#1193929
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
|
|
|
|
|
|
|
|
| |
convert gf_msg() into gf_smsg()
Change-Id: I8f5b7bbb9caa78902b06f67257502b67adab7405
Updates: #657
Signed-off-by: yatipadia <ypadia@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Changes in locks xlator:
Added support for per-domain inodelk count requests.
Caller needs to set GLUSTERFS_MULTIPLE_DOM_LK_CNT_REQUESTS key in the
dict and then set each key with name
'GLUSTERFS_INODELK_DOM_PREFIX:<domain name>'.
In the response dict, the xlator will send the per domain count as
values for each of these keys.
Changes in AFR:
Replaced afr_selfheal_locked_inspect() with afr_lockless_inspect(). Logic has
been added to make the latter behave same as the former, thus not
breaking the current heal info output behaviour.
fixes: bz#1774011
Change-Id: Ie9e83c162aa77f44a39c2ba7115de558120ada4d
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
As a follow up on https://review.gluster.org/#/c/glusterfs/+/23749/,
adding error logging for the entire method.
In addition, converted logging to structured logging in the method.
Fixes: bz#1778457
Change-Id: I1f412159e6849d6f6ddbde53ec4a85ad709bbdf4
Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Added a log for a failure in order to avoid "unused variable" coverity
issue.
fixes: CID#1274209
Change-Id: Ibc6b0ab4bdff482096e42e88fd4c8c7eadfeeadb
Updates: bz#789278
Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
|
|
|
|
|
|
|
|
| |
This reverts commit fce5f68bc72d448490a0d41be494ac54a9181b3c.
I merged the wrong patch by mistake! Hence reverting it.
updates: bz#1774011
Change-Id: Id7d6ed1d727efc02467c8a9aea3374331261ebd5
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Changes in locks xlator:
Added support for per-domain inodelk count requests.
Caller needs to set GLUSTERFS_MULTIPLE_DOM_LK_CNT_REQUESTS key in the
dict and then set each key with name
'GLUSTERFS_INODELK_DOM_PREFIX:<domain name>'.
In the response dict, the xlator will send the per domain count as
values for each of these keys.
Changes in AFR:
Replaced afr_selfheal_locked_inspect() with afr_lockless_inspect(). Logic has
been added to make the latter behave same as the former, thus not
breaking the current heal info output behaviour.
fixes: bz#1774011
Change-Id: I9ae08ce768b39aeb6ee230207b5b7fa744176952
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit "ccf33e789 - dict.c: remove redundant checks"
removed some NULL checks in certain dict functions. This caused
flooding of fuse mount logs when I/O was done on the mount on a replica
volume:
Message:
W [dict.c:1478:dict_get_with_refn]
(-->/usr/local/lib/libglusterfs.so.0(dict_get_uint32+0x4d)
[0x7ff9121ec963] -->/usr/local/lib/libglusterfs.so.0(dict_get_with_ref+0x90)
[0x7ff9121eb93f] -->/usr/local/lib/libglusterfs.so.0(+0x229be)
[0x7ff9121eb9be] ) 0-dict: dict OR key (glusterfs.lk.lkmode) is NULL [Invalid argument]
Fix:
In the relevant AFR functions, check that dict is not NULL before trying
to perform operations on it.
See bug description for more details.
fixes: bz#1772006
Change-Id: I30c89c0b5d6c80cc86a6047aae70127769412120
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implements lock healing for gluster-block fencing use case.
If mandatory lock is enabled:
- Add domain lock/unlock to afr_lk fop.
- Maintain a list of locks to be healed in afr_private_t.
- Add lock to the list if afr_lk(F_SETLK or F_SETLKW) was sucessful.
- Remove it from the list during afr_lk(F_UNLCK).
- On child_down, mark lock as needing heal on that child. If lock is
lost on quorum no. of bricks, remove it from the list and mark fd bad.
- For fds marked as bad, fail the subsequent fd based fops.
- On parent up, traverse the list and heal the locks IFF the client is
the lk owner and has quorum. (shd does not heal any locks).
updates: #613
Change-Id: I03c46ceaea30f5e6236d5ec13f71d843d827f1bc
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Afr adds its own xattrs to the req, so it should take a copy of the
dictionary to prevent parent xlator re-using the modified xattr-req
to another subvolume
fixes: bz#1765155
Change-Id: I268e2dbd1b12323135d369e90a22a8bdde2cf7c2
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
| |
fixes: bz#1760189
Change-Id: Iffbf8d6f4c50b8e2de8364658697bdbe96549f5d
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
squash >50 warnings on padding of structs in afr structures.
The warnings were found by manually added '-Wpadded' to the GCC
command line.
Change-Id: I961fbdeb33715cedf3dd10db8e4f8ef40cd3e867
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In a situation where B1 blames B2, B2 blames B1 and B3 doesn't blame
anything for entry heal, heal will not complete even though we have
clear source and sinks. This will happen because while doing
afr_selfheal_find_direction() only the bricks which are blamed by
non-accused bricks are considered as sinks. Later in
__afr_selfheal_entry_finalize_source() when it tries to mark all the
non-sources as sinks it fails to do so because there won't be any
healed_sinks marked, no witness present and there will be a source.
Fix:
If there is a source and no healed_sinks, then reset all the locked
sources to 0 and healed sinks to 1 to do conservative merge.
Change-Id: If40d8bc95d52a52b2730f55bdcf135109b421548
Fixes: bz#1749322
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ever since we added quorum checks for lookups in afr via commit
bd44d59741bb8c0f5d7a62c5b1094179dd0ce8a4, the split-brain resolution
commands would not work for replica 3 because there would be no
readables for the lookup fop.
The argument was that split-brains do not occur in replica 3 but we do
see (data/metadata) split-brain cases once in a while which indicate that there are
a few bugs/corner cases yet to be discovered and fixed.
Fortunately, commit 8016d51a3bbd410b0b927ed66be50a09574b7982 added
GF_CLIENT_PID_GLFS_HEALD as the pid for all fops made by glfsheal. If we
leverage this and allow lookups in afr when pid is GF_CLIENT_PID_GLFS_HEALD,
split-brain resolution commands will work for replica 3 volumes too.
Likewise, the check is added in shard_lookup as well to permit resolving
split-brains by specifying "/.shard/shard-file.xx" as the file name
(which previously used to fail with EPERM).
Change-Id: I3c543dea79caf7cfbc1633e9089cb1cdd2538ba9
Fixes: bz#1756938
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
If you are already under lock, just decrement the call count
directly instead of removing the lock, re-taking the lock
and decrementing.
Implements https://github.com/gluster/glusterfs/issues/728
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: I3fa20b4651fbdb826655c5a03baeed46e99b5487
|
|
|
|
|
|
|
|
|
|
| |
In 3 cases, there was a memory allocation and zeroing, followed
directly by populating it with content. Replaced with memory
allocation that did not zero the memory.
Change-Id: I4fbb5c924fb3a144e415d2368126b784dde760ea
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After add-brick and rebalance, the ctime xattr is not present
on rebalanced directories on new brick. This patch fixes the
same.
Note that ctime still doesn't support consistent time across
distribute sub-volume.
This patch also fixes the in-memory inconsistency of time attributes
when metadata is self healed.
Change-Id: Ia20506f1839021bf61d4753191e7dc34b31bb2df
fixes: bz#1734026
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
We were not passing xattr_req when doing a name self heal
as well as a meta data heal. Because of this, some xdata
was missing which causes i/o errors
Change-Id: Ibfb1205a7eb0195632dc3820116ffbbb8043545f
Fixes: bz#1728770
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
...whenever shd is re-enabled after disabling or there is a change in
`cluster.heal-timeout`, without needing to restart shd or waiting for the
current `cluster.heal-timeout` seconds to expire.
See BZ 1743988 for more details.
Change-Id: Ia5ebd7c8e9f5b54cba3199c141fdd1af2f9b9bfe
fixes: bz#1744548
Reported-by: Glen Kiessling <glenk1973@hotmail.com>
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
-Minor change to if-else structure to avoid code duplication.
-Added logging in case method calls fails
CID: 1394654
Updates: bz#789278
Change-Id: Ibef4450dc89ddd3bf951303d5b87f503924fd250
Signed-off-by: Barak Sason <bsasonro@redhat.com>
|
|
|
|
|
|
| |
Fixes: bz#1734370
Change-Id: I29e338bac62104233a6f80212df8d0fb016affda
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Initialize a dictionary for example seems to be prefectly fine to be done
before taking a lock.
Change-Id: Ib29516c4efa8f0e2b526d512beab488fcd16d2e7
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
| |
This function does length, allocation and serialization for you.
Change-Id: I142a259952a2fe83dd719442afaefe4a43a8e55e
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|