| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we have cascading locks with same lk-owner there is a possibility for
a deadlock to happen. One example is as follows:
self-heal takes a lock in data-domain for big name with 256 chars of "aaaa...a"
and starts heal in a 3-way replication when brick-0 is offline and healing from
brick-1 to brick-2 is in progress. So this lock is active on brick-1 and
brick-2. Now brick-0 comes online and an operation wants to take full lock and
the lock is granted at brick-0 and it is waiting for lock on brick-1. As part
of entry healing it takes full locks on all the available bricks and then
proceeds with healing the entry. Now this lock will start waiting on brick-0
because some other operation already has a granted lock on it. This leads to a
deadlock. Operation is waiting for unlock on "aaaa..." by heal where as heal is
waiting for the operation to unlock on brick-0. Initially I thought this is
happening because healing is trying to take a lock on all the available bricks
instead of just the bricks that are participating in heal. But later realized
that same kind of deadlock can happen if a brick goes down after the heal
starts but comes back before it completes. So the essential problem is the
cascading locks with same lk-owner which were added for backward compatibility
with afr-v1 which can be safely removed now that versions with afr-v1 are
already EOL. This patch removes the compatibility with v1 which requires
cascading locks with same lk-owner.
In the next version we can make locking-scheme option a dummy and switch
completely to v2.
BUG: 1401404
Change-Id: Ic9afab8260f5ff4dff5329eb0429811bcb879079
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/16024
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
1) When a blocking lock is issued and the parallel lock phase fails
on all subvolumes with EAGAIN, it is not switching to serialized
locking phase.
2) When quorum is enabled and locks fail partially it is better
to give errno returned by brick rather than the default
quorum errno.
Fix:
Handled this error case and changed op_errno to reflect the actual
errno in case of quorum error.
BUG: 1369077
Change-Id: Ifac2e4a13686e9fde601873012700966d56a7f31
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/15984
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I7b70de317a5f15a3bf483ffe40b971143deddc11
BUG: 1401218
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/16029
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bugs found and fixed:
1. Use correct subvolume index in pre-op-writev compound cbk
2. Prevent use-after-free of local->compound_args members in
compound fops cbk in protocol/client
3. Fix xdata and xattr leaks in client_process_response
4. Fix possible leak of xdata in client_pre_writev() in
test mode.
5. Free req->compound_req_array.compound_req_array_val as well
after freeing its members
6. Free tmp_rsp->flock.lk_owner.lk_owner_val in LK fop.
Change-Id: I15b646d7d4e0e5cd4ea3d2d6452c815cf2eaf68f
BUG: 1401218
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/16020
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: ec_writev_start calls ec_make_internal_fop_xdata
to set "yes" in xdata before ec_readv (an internal fop)
is called for head and tail. Second call to this function
is overwriting the previous allocated dict_t to "xdata",
which results in memory leak.
Solution: In ec_make_internal_fop_xdata, check if *xdata
is NULL or not to avoid overwriting *xdata.
Change-Id: I49b83923e11aff9b92d002e86424c0c2e1f5f74f
BUG: 1400818
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/16007
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Upcall as a part of setattr, sends an invalidation and the
invalidation carries the resulting stat value. When a file
is converted to linkto files, even then an invalidation
is set and as a result the mountpoint shows the sticky
bit in the stat of the file.
eg: ---------T. 945 root root 0 Nov 8 10:14 hardlink.999
Fix:
When dht recieves a notification of sticky bit change, it updates
the flag, to indicate md-cache to send the subsequent lookup.
Change-Id: Ic2fd7a5b196db0754f9b97072e644e6bf69da606
BUG: 1392713
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/15789
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Susant Palai <spalai@redhat.com>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I2beaba829710565a3246f7449a5cd21755cf5f7d
BUG: 1399592
Signed-off-by: Mateusz Slupny <mateusz.slupny@appeartv.com>
Reviewed-on: http://review.gluster.org/15968
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Generally linkto file is created using root user. Consider following
case, a user is trying to rename a file which he is not permitted.
So the rename fails with EACESS and when rename tries to cleanup the
linkto file, it fails.
The above issue happens when rename/00.t test executed on nfs-ganesha
clients :
Steps executed in script
* create a file "abc" using root
* rename the file "abc" to "xyz" using a non root user, it fails with EACESS
* delete "abc"
* create directory "abc" using root
* again try ot rename "abc" to "xyz" using non root user, test hungs here
which slowly leds to OOM kill of ganesha process
RCA put forwarded by Du for OOM kill of ganesha
Note that when we hit this bug, we've a scenario of a dentry being
present as:
* a linkto file on one subvol
* a directory on rest of subvols
When a lookup happens on the dentry in such a scenario, the control flow
goes into an infinite loop of:
dht_lookup_everywhere
dht_lookup_everywhere_cbk
dht_lookup_unlink_cbk
dht_lookup_everywhere_done
dht_lookup_directory (as local->dir_count > 0)
dht_lookup_dir_cbk (sets to local->need_selfheal = 1 as the entry is a linkto file on one of the subvol)
dht_lookup_everywhere (as need_selfheal = 1).
This infinite loop can cause increased consumption of memory due to:
1) dht_lookup_directory assigns a new layout to local->layout unconditionally
2) Most of the functions in this loop do a stack_wind of various fops.
This results in growing of call stack (note that call-stack is destroyed only after lookup response is
received by fuse - which never happens in this case)
Thanks Du for root causing the oom kill and Sushant for suggesting the fix
Change-Id: I1e16bc14aa685542afbd21188426ecb61fd2689d
BUG: 1397052
Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
Reviewed-on: http://review.gluster.org/15894
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The following have been completely removed from the source tree,
makefiles, configure script, and RPM specfile.
cluster/afr/pump
cluster/ha
cluster/map
features/filter
features/mac-compat
features/path-convertor
features/protect
Change-Id: I2f966999ac3c180296ff90c1799548fba504f88f
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: http://review.gluster.org/15906
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: A hard link is lost during rebalance + lookup.Rebalance skip files
if file has hardlink.In dht_migrate_file __is_file_migratable ()
function checks if a file has hardlink, if yes file is not migrated
but if link is created after call this function then link will lost.
Solution: Call __check_file_has_hardlink to check hardlink existence after (S+T) bits
in migration process ,if file has hardlink then skip the file for
migrate rebalance process.
BUG: 1396048
Change-Id: Ia53c07ef42f1128c2eedf959a757e8df517b9d12
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
Reviewed-on: http://review.gluster.org/15866
Reviewed-by: Susant Palai <spalai@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(1) afr_have_quorum is dead code. It was copied to afr_has_quorum,
and everything else uses that, but the original was never deleted
(until now).
(2) Auto-quorum should be default for any N>2. Leaving quorum
disabled is BAD, but apparently deemed acceptable for N=2 because
there's no real quorum in that case. For any larger number (including
arbiter configurations) there is such a thing as real quorum and we
should use it by default. Note that for N=3 the answers we get from
"N % 2" (the old check) and "N > 2" (the new one) are the same.
(3) The special case for even N in afr_has_quorum has been simplified and
explained more thoroughly in a comment.
Change-Id: I48b33c15093512fecf516b26dcf09afecb7ae33b
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: http://review.gluster.org/15873
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
of cache invalidation(upcall).
Issue:
------
When a cache invalidation is recieved as a result of changing
pending xattr, the read_subvol is reset. Consider the below chain
of execution:
CHILD_DOWN
...
afr_readv
...
afr_inode_refresh
...
afr_inode_read_subvol_reset <- as a result of pending xattr set by
some other client GF_EVENT_UPCALL will
be sent
afr_refresh_done -> this results in an EIO, as the read subvol was
reset by the end of the afr_inode_refresh
Solution:
---------
When GF_EVENT_UPCALL is recieved, instead of resetting read_subvol,
set a variable need_refresh in inode_ctx, the next time some one
starts a txn, along with event gen, need_rrefresh also needs to
be checked.
Change-Id: Ifda21a7a8039b8874215e1afa4bdf20f7d991b58
BUG: 1396952
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/15892
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In a disperse volume with "K+R" configuration, where
"K" is the number of data bricks and "R" is the number of redundancy
bricks (Total number of bricks, N = K+R), if only K bricks are UP,
we should NOT start heal process. This is because the bricks, which
are supposed to be healed, are not UP. This will unnecessary
eat up the resources.
Solution: Check for the number of xl_up_count and only
if it is greater than ec->fragments (number of data bricks),
start heal process.
Change-Id: I8579f39cfb47b65ff0f76e623b048bd67b15473b
BUG: 1399072
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/15937
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Currently, I/O on a split-brained file fails even when the
favorite-child-policy is set until the self-heal is complete.
Fix:
If a valid 'source' is found using the set favorite-child-policy, inspect
and reset the afr pending xattrs on the 'sinks' (inside appropriate locks),
refresh the inode and then proceed with the read or write transaction.
The resetting itself happens in the self-heal code and hence can also
happen in the client side background-heal or by the shd's index-heal in
addition to the txn code path explained above. When it happens in via
heal, we also add checks in undo-pending to not reset the sink xattrs
again.
Change-Id: Ic8c1317720cb26bd114b6fe6af4e58c73b864626
BUG: 1386188
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reported-by: Simon Turcotte-Langevin <simon.turcotte-langevin@ubisoft.com>
Reviewed-on: http://review.gluster.org/15673
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problems:
1) Inodelk is not taking quorum into account
2) finodelk, [f]entrylk are not implemented correctly
3) By default afr doesn't go for non-blocking parallel locks.
Fix:
Implemented a common framework which can be used by
[f]inodelk/[f]entrylk. Used quorum for the same.
Change-Id: I239f13875a065298630d266941df10cfa3addc85
BUG: 1369077
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/15802
Tested-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When an afr data transaction is eligible for using
eager-lock, this information is represented in
local->transaction.eager_lock_on. However, if non-blocking
inodelk attempt (which is a full lock) fails, AFR falls back
to blocking locks which are range locks. At this point,
local->transaction.eager_lock[] per brick is reset but
local->transaction.eager_lock_on is still true.
When AFR decides to compound post-op and unlock, it is after
confirming that the transaction did not use eager lock (well,
except for a small bug where local->transaction.locks_acquired[]
is not considered).
But within afr_post_op_unlock_do(), afr again incorrectly sets
the lock range to full-lock based on local->transaction.eager_lock_on
value. This is a bug and can lead to deadlock since the locks acquired
were range locks and a full unlock is being sent leading to unlock failure
and thereby every other lock request (be it from SHD or other clients or
glfsheal) getting blocked forever and the user perceives a hang.
FIX:
Unconditionally rely on the range locks in inodelk object for unlocking
when using compounded post-op + unlock.
Big thanks to Pranith for helping with the debugging.
Change-Id: Idb4938f90397fb4bd90921f9ae6ea582042e5c67
BUG: 1398566
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/15929
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Id8ba76ba116d056bc7299dc5ce0980680a5a23f8
BUG: 1398226
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/15924
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DHT does not set the layout for newly created
directories as root. This causes EPERM failures
when a non-root user with insufficient permissions
creates directories.
credit: srangana@redhat.com for RCA
Change-Id: Ia646e41665ce172c43c5f01d2707455e8eb374ed
BUG: 1392772
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/15794
Reviewed-by: Susant Palai <spalai@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently these are few events related to child_up/down:
GF_EVENT_CHILD_UP : Issued when any of the protocol client
connects.
GF_EVENT_CHILD_MODIFIED : Issued by afr/dht/ec
GF_EVENT_CHILD_DOWN : Issued when any of the protocol client
disconnects.
These events get modified at the dht/afr/ec layers. Here is a
brief on the same.
DHT:
- All the subvolumes reported once, and atleast one child came
up, then GF_EVENT_CHILD_UP is issued
- connect GF_EVENT_CHILD_UP is issued
- disconnect GF_EVENT_CHILD_MODIFIED is issued
- All the subvolumes disconnected, GF_EVENT_CHILD_DOWN is issued
AFR:
- First subvolume came up, then GF_EVENT_CHILD_UP is issued
- Subsequent subvolumes coming up, results in GF_EVENT_CHILD_MODIFIED
- Any of the subvolumes go down, then GF_EVENT_SOME_CHILD_DOWN is issued
- Last up subvolume goes down, then GF_EVENT_CHILD_DOWN is issued
Until the patch [1] introduced GF_EVENT_SOME_CHILD_UP,
GF_EVENT_CHILD_MODIFIED was issued by afr/dht when any of the subvolumes
go up or down.
Now with md-cache changes, there is a necessity to differentiate between
child up and down. Hence, introducing GF_EVENT_SOME_DESCENDENT_DOWN/UP and
getting rid of GF_EVENT_CHILD_MODIFIED.
[1] http://review.gluster.org/12573
Change-Id: I704140b6598f7ec705493251d2dbc4191c965a58
BUG: 1396038
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/15764
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Raghavendra G found that posix is trying to print %s
but passing an int when HEALTH_CHECK fails in posix.
These are the kind of bugs that should be caught
at compilation itself.
Also fixed the problematic gf_event() callers.
BUG: 1386097
Change-Id: Id7bd6d9a9690237cec3ca1aefa2aac085e8a1270
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/15671
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Check for NULL inode before attempting to
set dht inode ctx.
Change-Id: I7693c18445f138221d8417df5e95b118cedb818a
BUG: 1395261
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/15847
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I39de2bcfc660f23a4229a885a0e0420ca949ffc9
BUG: 1392865
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/15800
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On a sharded volume when a brick is replaced while IO is going on, named
lookup on individual shards as part of read/write was failing with
ENOENT on the replaced brick, and as a result AFR initiated name heal in
lookup callback. But since pargfid was empty (which is what this patch
attempts to fix), the resolution of the shards by protocol/server used
to fail and the following pattern of logs was seen:
Brick-logs:
[2016-11-08 07:41:49.387127] W [MSGID: 115009]
[server-resolve.c:566:server_resolve] 0-rep-server: no resolution type
for (null) (LOOKUP)
[2016-11-08 07:41:49.387157] E [MSGID: 115050]
[server-rpc-fops.c:156:server_lookup_cbk] 0-rep-server: 91833: LOOKUP(null)
(00000000-0000-0000-0000-000000000000/16d47463-ece5-4b33-9c93-470be918c0f6.82)
==> (Invalid argument) [Invalid argument]
Client-logs:
[2016-11-08 07:41:27.497687] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-0: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.497755] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-1: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.498500] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-2: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.499680] E [MSGID: 133010]
Also, this patch makes AFR by itself choose a non-NULL pargfid even if
its ancestors fail to initialize all pargfid placeholders.
Change-Id: I5f85b303ede135baaf92e87ec8e09941f5ded6c1
BUG: 1392445
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/15788
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Break tier_migrate_using_query_file() into a more manageable
tier_migrate_link() and helpers.
Change-Id: I5eb2d2cff9e7a2a2da567c3c4c2d53aab195f477
BUG: 1358296
Signed-off-by: Milind Changire <mchangir@redhat.com>
Reviewed-on: http://review.gluster.org/14957
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Thanks a lot to xiaoping.wu@nokia.com from Nokia for the bug and the
fix.
BUG: 1384297
Change-Id: Ie443237e85d34633b5dd30f85eaa2ac34e45754c
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/15728
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The rebalance event code was using strtok to parse the
volume name which is incorrect.
Reworked the code to get the correct volume name using
strstr.
Change-Id: Ib5f3305a34e6bf1ecfef677d87c5aff96bdeb0e6
BUG: 1388010
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/15712
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Consider a replica setup, where one mount writes data to a
file and the other mount reads the file. In afr, read operations
are not transaction based, a brick(read subvolume) is chosen as
a part of lookup or other operations, read is always wound only
to the read subvolume, even if there was write from a different client
that failed on this brick. This stale read continues until there is
a lookup or any write operation from the mount point. Currently, this
is not a major issue, as a lookup is issued before every read and it will
switch the read subvolume to a correct one. But with the plan of
increasing md-cache timeout to 600s, the stale read problem will be
more pronounced, i.e. stale read can continue for 600s(or more if cascaded
with readdirp), as there will be no lookups.
Solution:
Afr doesn't have any built-in solution for stale read(without affecting
the performance). The solution that came up, was to use upcall. When a file
on any brick is marked bad for the first time, upcall sends a notification
to all the clients that had recently accessed the file. The solution has
2 parts:
- Identifying when a file is marked bad, on any of the bricks,
for the first time
- Client side actions on recieving the notifications
Identifying when a file is marked bad on any of the bricks for the first time:
-----------------------------------------------------------------------------
The idea is to track xattrop in upcall. xattrop currently comes with 2 afr
xattrs - afr dirty bit and afr pending xattrs.
Dirty xattr is set to 1 before every write, and is unset if write succeeds.
In certain scenarios, dirty xattr can be 0 and still the file could be bad
copy. Hence do not track dirty xattr.
Pending xattr is set on the good copy, indicating the other bricks that have
bad copy. It is still not as simple as, notifying when any of the pending xattrs
change. It could lead to flood of notifcations, in case the other brick is
completely down or consistantly failing. Hence it is important to notify only
once, the first time a good copy is marked bad.
Client side actions on recieving pending xattr change, notification:
--------------------------------------------------------------------
md-cache will invalidate the cache of that file, so that further lookup is
passed down to afr and hence update the read subvolume. Invalidating only in
md-cache is not enough, consider the folling oder of opertaions:
- pending xattr invalidation - invalidate md-cache
- readdirp on the bad read subvolume - fill md-cache
- lookup (served from md-cache)
- read - wound to the old read subvol.
Hence, along with invalidating md-cache, it is very important to reset the
read subvolume for that file, in afr.
Design Credit: Anuradha Talur, Ravishankar N
1. xattrop doesn't carry info saying post op/pre op.
2. Pre xattrop will have 0 value for all pending xattrs,
the cbk of pre xattrop carries the on-disk xattr value.
Non zero indicated healing is required.
3. Post xattrop will have non zero value for any of the
pending xattrs, if the fop failed on any of the bricks.
Change-Id: I469cbc111714c433984fe1c922be2ef113c25804
BUG: 1211863
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/15398
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Demote files on priority if hi-watermark has been breached and continue
to demote until the watermark drops below hi-watermark.
Monitor watermark more frequently.
Trigger demotion as soon as hi-watermark is breached.
Add cluster.tier-emergency-demote-query-limit option to limit number
of files returned from the database query for every iteration of
tier_migrate_using_query_file(). If watermark hasn't dropped below
hi-watermark during the first iteration, the next iteration will be
triggered approximately 1 second after tier_demote() returns to the
main tiering loop.
Update changetimerecorder xlator to handle query for emergency demote
mode.
Add tier-ctr-interface.h:
Move tier and ctr interface specific macros and struct definition from
libglusterfs/src/gfdb/gfdb_data_store.h to new header
libglusterfs/src/tier-ctr-interface.h
Change-Id: If56af78c6c81d37529b9b6e65ae606ba5c99a811
BUG: 1366648
Signed-off-by: Milind Changire <mchangir@redhat.com>
Reviewed-on: http://review.gluster.org/15158
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
BUG: 1385593
Change-Id: Icfae9e557a284182c6c22e9606fdd641528906f0
Reported-by: Patrick Matthäi <pmatthaei@debian.org>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/15656
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In afr lookup when NULL dict is received in lookup, afr
is supposed to set all the xattrs it requires in a new dict
it creates, but for 'link-count' it is trying to set to the
dict that is passed in lookup which can be NULL sometimes.
This is leading to error logs. Fixed the same in this patch.
BUG: 1385104
Change-Id: I679af89cfc410cbc35557ae0691763a05eb5ed0e
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/15646
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Sharding exposed a bug in arbiter config. where `dd` throughput was
extremely slow. Shard xlator was sending a fxattrop to update the file
size immediately after a writev. Arbiter was incorrectly over-riding the
LLONGMAX-1 start offset (for metadata domain locks) for this fxattrop,
causing the inodelk to be taken on the data domain. And since the
preceeding writev hadn't released the lock (afr does a 'lazy'
unlock if write succeeds on all bricks), this degraded to a blocking
lock causing extra lock/unlock calls and delays.
Fix:
Modify flock.l_len and flock.l_start to take full locks only for data
transactions.
Change-Id: I906895da2f2d16813607e6c906cb4defb21d7c3b
BUG: 1384906
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reported-by: Max Raba <max.raba@comsysto.com>
Reviewed-on: http://review.gluster.org/15641
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Currently heal info command prints all
the files/directories if the index for the
file/directory is present in .glusterfs/indices folder.
After implementing patch http://review.gluster.org/#/c/13733/
indices of the file which is going through update fop
will also be present in .glusterfs/indices even
if the fop is successful on all the brick. At this time
if heal info command is being used, it will also display this
file which is actually healthy and does not require any heal.
Solution: Take lock on a file corresponding to the indices
and inspect xattrs to decide if the file needs heal or not.
Change-Id: I6361e2813ece369be12d02e74816df4eddb81cfa
BUG: 1366815
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/15543
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. While looking at glustershd logs in DEBUG log-level, it was found that
all bricks of the replica were printed as local bricks even though they
were not. Fixed it.
2. Print the name of the subvol from which the entry was got during
index crawl.
Change-Id: I08b32e38694c755715e9fe0c0e1dd9212abcfb16
BUG: 1381421
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/15610
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently ipc() is not implemented in afr. md-cache and upcall
uses ipc to register the list of xattrs, [1] for more details.
For the ipc op GF_IPC_TARGET_UPCALL, it has to be wound to all
the replica subvolumes. ipc() is failed when any of the
subvolumes fails with other than ENOTCONN or all of the subvolumes
are down.
[1] http://review.gluster.org/#/c/15002/
Change-Id: I0f651330eafda64e4d922043fe53bd0014536247
BUG: 1211863
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/15378
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In arbiter configuration, posix-xlator in the arbiter brick always sets
the GF_CONTENT_KEY in the response dict with a value 0. If the file size on
the data bricks is more than quick-read's max-file-size (64kb default),
those bricks don't set the key. Because of this difference in the no. of dict
elements, afr triggers metadata heal in lookup code path, in turn
leading to extra lookups+inodelks.
Fix:
Changed afr dict comparison logic to ignore all virtual xattrs and the
on-disk ones that we should not be healing.
Also removed is_virtual_xattr() function. The original callers to this
function (upcall) don't seem to need it anymore.
Change-Id: I05730bdd39d8fb0b9a49a5fc9c0bb01f0d3bb308
BUG: 1378684
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/15548
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ipc will be wound to all the bricks, but for it to be
successfull, the fop should succeed on minimum number of bricks.
Change-Id: I3f8cb6a349e87bafd0773583def9d4e3765aa140
BUG: 1211863
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/15387
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Ashish Pandey <aspandey@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Modified afr event message to add a 'type' key as detailed in the BZ.
Also added events for data and metadata split-brain.
Change-Id: I8156674b4b6a501499fc10fd68e05115fdaef3e4
BUG: 1378072
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/15550
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Removed a structure member that was not used
from dht_local_t. Skipped some unnecessary checks
in dht_filter_loc_subvol_key.
Change-Id: I81740b6528e063fb9cf5817e05865ff4d77aa748
BUG: 1378305
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/15542
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
And minor cleanup of a few of the Makefile.am files while we're
at it.
Rewrite the make rules to do what xdrgen does. Now we can get rid
of xdrgen.
Note 1. netbsd6's sed doesn't do -i. Why are we still running
smoke tests on netbsd6 and not netbsd7? We barely support netbsd7
as it is.
Note 2. Why is/was libgfxdr.so (.../rpc/xdr/src/...) linked with
libglusterfs? A cut-and-paste mistake? It has no references to
symbols in libglusterfs.
Note3. "/#ifndef\|#define\|#endif/" (note the '\'s) is a _basic_
regex that matches the same lines as the _extended_ regex
"/#(ifndef|define|endif)/". To match the extended regex sed needs to
be run with -r on Linux; with -E on *BSD. However NetBSD's and
FreeBSD's sed helpfully also provide -r for compatibility. Using a
basic regex avoids having to use a kludge in order to run sed with
the correct option on OS X.
Note 4. Not copying the bit of xdrgen that inserts copyright/license
boilerplate. AFAIK it's silly to pretend that machine generated
files like these can be copyrighted or need license boilerplate.
The XDR source files have their own copyright and license; and
their copyrights are bound to be more up to date than old
boilerplate inserted by a script. From what I've seen of other
Open Source projects -- e.g. gcc and its C parser files generated
by yacc and lex -- IIRC they don't bother to add copyright/license
boilerplate to their generated files.
It appears that it's a long-standing feature of make (SysV, BSD,
gnu) for out-of-tree builds to helpfully pretend that the source
files it can find in the VPATH "exist" as if they are in the $cwd.
rpcgen doesn't work well in this situation and generates files
with "bad" #include directives.
E.g. if you `rpcgen ../../../../$srcdir/rpc/xdr/src/glusterfs3-xdr.x`,
you get an #include directive in the generated .c file like this:
...
#include "../../../../$srcdir/rpc/xdr/src/glusterfs3-xdr.h"
...
which (obviously) results in compile errors on out-of-tree build
because the (generated) header file doesn't exist at that location.
Compared to `rpcgen ./glusterfs3-xdr.x` where you get:
...
#include "glusterfs3-xdr.h"
...
Which is what we need. We have to resort to some Stupid Make Tricks
like the addition of various .PHONY targets to work around the VPATH
"help".
Warning: When doing an in-tree build, -I$(top_builddir)/rpc/xdr/...
looks exactly like -I$(top_srcdir)/rpc/xdr/... Don't be fooled though.
And don't delete the -I$(top_builddir)/rpc/xdr/... bits
Change-Id: Iba6ab96b2d0a17c5a7e9f92233993b318858b62e
BUG: 1330604
Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/14085
Tested-by: Niels de Vos <ndevos@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The rebalance process will now send an event when it is
complete.
Also fixed a problem where the run-time was not always
set causing spurious rebalance failure events to be sent.
Change-Id: Ib445171c78c9560940022bca20c887d31a9bb1ca
BUG: 1371874
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/15501
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
http://review.gluster.org/14085 fixes a "pragma leak" where the
generated rpc/xdr headers have a pair of pragmas that disable these
warnings. With the warnings disabled, many unused variables have
crept into the code base.
And 14085 won't pass its own smoke test until all these warnings are
fixed.
BUG: 1369124
Change-Id: I24607fc2082c3424f876f740a88fb7d0173d322d
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/15518
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, for all the update operations, metadata or data,
we set the dirty flag at the end of the operation only if
a brick is down. This leads to delay in healing and in some
cases not at all.
In this patch we set (+1) the dirty flag
at the start of the metadata or data update operations and
after successfull completion of the fop, we unset (-1) it again.
Change-Id: Ide5668bdec7b937a61c5c840cdc79a967598e1e9
BUG: 1316873
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/13733
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
http://review.gluster.org/14085 fixes a "pragma leak" where the
generated rpc/xdr headers have a pair of pragmas that disable these
warnings. With the warnings disabled, many unused variables have
crept into the code base.
And 14085 won't pass its own smoke test until all these warnings are
fixed.
BUG: 1369124
Change-Id: I4946d460f66d3430bac8910260c0cb1b3e9ff87b
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/15513
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
http://review.gluster.org/14085 fixes a "pragma leak" where the
generated rpc/xdr headers have a pair of pragmas that disable these
warnings. With the warnings disabled, many unused variables have
crept into the code base.
And 14085 won't pass its own smoke test until all these warnings are
fixed.
BUG: 1369124
Change-Id: I6feee863927254ae9f59013112bcc297ad09e89b
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/15477
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
http://review.gluster.org/14085 fixes a "pragma leak" where the
generated rpc/xdr headers have a pair of pragmas that disable these
warnings. With the warnings disabled, many unused variables have
crept into the code base.
And 14085 won't pass its own smoke test until all these warnings are
fixed.
BUG: 1369124
Change-Id: I55d4106ac828380f3315f5a21593390905d3ab3a
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/15476
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In a distributed-replicate volume attribute
"replica.split-brain-status" value does not display split-brain
condition though directory is in split-brain.
If directory is in split brain on mutiple replica-pairs
it does not show full list of replica pairs.
Solution: Update the dht_aggregate code to aggregate the xattr
value in this specific condition.
Fix: 1) function getChoices returns the choices from split-brain
status string.
2) function add_opt adding the choices to local buffer to
store in dictionary
3) For the key "replica.split-brain-status" function dht_aggregate
call dht_aggregate_split_brain_xattr to prepare the list.
Test: To verify the patch followed below steps
1) Create a distributed replica volume and create mount point
2) Stop heal daemon
3) Touch file and directories on mount point
mkdir test{1..5};touch tmp{1..5}
4) Down brick process on one of the replica set
pkill -9 glusterfsd
5) Change permission of dir on mount point
chmod 755 test{1..5}
6) Restart brick process on node with force option
7) kill brick process on other node in same replica set
8) Change permission of dir again on mount point
chmod 766 test{1..5}
9) Reexecute same step from 4-9 on other replica set also
10) After check heal status on server it will show dir's are
in split brain on all replica sets
11) After check the replica.split-brain-status attr on mount
point it will show wrong status of split brain.
12) After apply the patch the attribute shows correct value.
BUG: 1368312
Change-Id: Icdfd72005a4aa82337c342762775a3d1761bbe4a
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
Reviewed-on: http://review.gluster.org/15201
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Post add-brick event the new brick will have permission of 755
by default. If the root directory permission was other than 755,
that does not get healed to the new brick leading to permission
errors/inconsistencies.
For choosing source of attr heal we can trust the subvols which
have layouts with latest ctime(as part of missing directory heal,
we heal the proper attr). In case none of the subvols have layout,
return ESTALE to retrigger a fresh lookup.
Note: This patch heals the permission of the root directories only.
Since, permission healing of directory is not straight forward and
required intrusive fix, those are not addressed here.
Change-Id: If894e3895d070d46b62d2452e52c1eaafcf56c29
BUG: 1368012
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/15195
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Id0af15a2aec96bdbe675b4c959b56f0fc8e72504
BUG: 1341948
Signed-off-by: ankit <anraj@redhat.com>
Reviewed-on: http://review.gluster.org/15345
Tested-by: ankitraj
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: For healing of uid/gid we check if local->stbuf.ia_ctime is
lesser than stbuf->ia_ctime (received from brick). If yes then uid/gid
is updated to local->prebuf(source of healing).
But we merge local->stbuf also form the newly added brick. So if we
receive response from the newly added brick first and update the
local->stbuf, then local->prebuf will remain empty since the newly added
brick will have the latest ctime among all servers. And this can result
in healing wrong uid/gids to the rest of servers.
Hence, we should update local->stbuf from servers with a layout which
will ignore merging stbufs from newly added bricks.
Change-Id: If4b64f75a0ea669abdbe9f5a3d1d18ff19374c2f
BUG: 1365740
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/15126
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
dht_selfheal_layout_maximize_overlap () does not consider
chunk sizes while calculating overlaps. Temporarily
enabling this operation if only if weighted rebalance is disabled
or all bricks are the same size.
Change-Id: I5ed16cdff2551b826a1759ca8338921640bfc7b3
BUG: 1366494
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/15403
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
|