| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
1) snprintf into linkname_expected should happen with PATH_MAX
2) comparison should happen with linkname_actual with complete
string linkname_expected
fixes bz#1595190
Change-Id: Ic3b3c362dc6c69c046b9a13e031989be47ecff14
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
see https://review.gluster.org/#/c/19788/,
https://review.gluster.org/#/c/19871/,
https://review.gluster.org/#/c/19952/,
https://review.gluster.org/#/c/20104/,
https://review.gluster.org/#/c/20162/,
https://review.gluster.org/#/c/20185/,
https://review.gluster.org/#/c/20207/,
https://review.gluster.org/#/c/20227/, and
https://review.gluster.org/#/c/20307/
This patch fixes more selected comma white space (ws_comma) as suggested
by the 2to3 utility.
Note: Fedora packaging guidelines and SUSE rpmlint require explicit
shebangs, so popular practices like #!/usr/bin/env python and
or #!/usr/bin/python3
Note: Selected small fixes from 2to3 utility. Specifically apply,
basestring, funcattrs, has_key, idioms, map, numliterals, raise,
set_literal, types, urllib, and zip have already been applied. Also
version agnostic imports for urllib, cpickle, socketserver, _thread,
queue, etc., suggested by Aravinda in https://review.gluster.org/#/c/19767/1
Note: these 2to3 fixes report no changes are necessary: asserts, buffer,
exec, execfile, exitfunc, filter, getcwdu, imports2, input, intern,
itertools, metaclass, methodattrs, ne, next, nonzero, operator, paren,
raw_input, reduce, reload, renames, repr, standarderror, sys_exc, throw,
tuple_params, xreadlines.
Change-Id: I60932030813484803f73733a9b2b7b23c7a843fd
updates: #411
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A synctask is created that would scan the indices from
.shard/.remove_me, to delete the shards associated with the
gfid corresponding to the index bname and the rate of deletion
is controlled by the option features.shard-deletion-rate whose
default value is 100.
The task is launched on two accounts:
1. when shard receives its first-ever lookup on the volume
2. when a rename or unlink deleted an inode
Change-Id: Ia83117230c9dd7d0d9cae05235644f8475e97bc3
updates: bz#1568521
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
commit 20fa80057eb430fd72b4fa31b9b65598b8ec1265 introduced a regression
wherein if a file is present in only 1 brick of replica *and* doesn't
have a gfid associated with it, it doesn't get healed upon the next
lookup from the client. Fix it.
Change-Id: I7d1111dcb45b1b8b8340a7d02558f05df70aa599
fixes: bz#1591193
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(part 1)
PROBLEM:
Shards are deleted synchronously when a sharded file is unlinked or
when a sharded file participating as the dst in a rename() is going to
be replaced. The problem with this approach is it makes the operation
really slow, sometimes causing the application to time out, especially
with large files.
SOLUTION:
To make this operation atomic, we introduce a ".remove_me" directory.
Now renames and unlinks will simply involve two steps:
1. creating an empty file under .remove_me named after the gfid of the file
participating in unlink/rename
2. carrying out the actual rename/unlink
A synctask is created (more on that in part 2) to scan this directory
after every unlink/rename operation (or upon a volume mount) and clean
up all shards associated with it. All of this happens in the background.
The task takes care to delete the shards associated with the gfid in
.remove_me only if this gfid doesn't exist in backend, ensuring that the
file was successfully renamed/unlinked and its shards can be discarded now
safely.
Change-Id: Ia1d238b721a3e99f951a73abbe199e4245f51a3a
updates: bz#1568521
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch addresses following:
1. On volume stop, for the last brick, pmap_registry_remove () is
invoked by glusterd.
2. If a brick process is sigkilled, remove all the associated brick
instances from the portmap.
3. Bump up PROCESS_UP_TIMEOUT to 45.
4. gf_attach to kill a brick takes more time in mux (which is an
issue that needs a fix), but in the interim, give br-state-check.t
more time to complete (there are 2 kill_bricks, each taking 120
seconds, and the test usually passes in 30 odd seconds, hence bumping
this up to 350 seconds)
5. The test bug-1559004-EMLINK-handling.t is taking ~950 seconds at
times on master without mux, in mux cases, when it fails, it is almost
at the last iteration, hence bumping the timeout for this test case
to reduce regression error rates
Updates: bz#1577672
Change-Id: I1922675e112baca4c125c4c094eaa42a11e34e67
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
There seems to be races which are not fixed by commit
9704d203f0. Though the test itself is not bad, it is failing very
frequently. So, till the issue is fixed, marking this test as bad.
updates: bz#1543279
Change-Id: I4a5869da1586178914ffb9563414879e6beab183
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
If certain builds have readdir-ahead disabled by default, the
test case fails, as it asumes readdir-ahead is enabled by
default. Hence explicitly enabling readdir-ahead.
Change-Id: Ib5bef266707c2c557aeb2cf2ffbf4d0c92025c46
fixes: bz#1582051
Signed-off-by: Poornima G <pgurusid@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Details in the bug
Updates: bz#1581735
Change-Id: Id984e10b60daf274d5510e3ccbf7abf0cb19f368
Signed-off-by: ShyamsundarR <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In the .t, when the only good brick was brought down, writes on the fd were
still succeeding on the bad bricks. The inflight split-brain check was
marking the write as failure but since the write succeeded on all the
bad bricks, afr_txn_nothing_failed() was set to true and we were
unwinding writev with success to DHT and then catching the failure in
post-op in the background.
Fix:
Don't wind the FOP phase if the write_subvol (which is populated with readable
subvols obtained in pre-op cbk) does not have at least 1 good brick which was up
when the transaction started.
Note: This fix is not related to brick muliplexing. I ran the .t
10 times with this fix and brick-mux enabled without any failures.
Change-Id: I915c9c366aa32cd342b1565827ca2d83cb02ae85
updates: bz#1577672
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
| |
Change-Id: I3a8d452d00560dac5e0b7ff0b1835d1f20a59f91
updates: bz#1570962
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: There's a race between the glusterfs_handle_terminate()
response sent to glusterd from last brick of the process and the
socket disconnect event that encounters after the brick process
got killed.
Solution: When it is a last brick for the brick process, instead of
sending GLUSTERD_BRICK_TERMINATE to brick process, glusterd will
kill the process (same as we do it in case of non brick multiplecing).
The test case is added for https://bugzilla.redhat.com/show_bug.cgi?id=1549996
Change-Id: If94958cd7649ea48d09d6af7803a0f9437a85503
fixes: bz#1545048
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Test case:
# while true; do uuid="`uuidgen`"; echo "some data" > "test$uuid"; mv
"test$uuid" "test" -f || break; echo "done:$uuid"; done
This script was run in parallel from multiple mountpoints
Along the course of getting the above usecase working, many issues
were found:
Issue 1:
=======
consider a case of rename (src, dst). We can encounter a situation
where,
* dst is a file present at the time of lookup
* dst is removed by the time rename fop reaches glusterfs
In this scenario, acquring inodelk on dst fails with ESTALE resulting
in failure of rename. However, as per POSIX irrespective of whether
dst is present or not, rename should be successful. Acquiring entrylk
provides synchronization even in races like this.
Algorithm:
1. Take inodelks on src and dst (if dst is present) on respective
cached subvols. These inodelks are done to preserve backward
compatibility with older clients, so that synchronization is
preserved when a volume is mounted by clients of different
versions. Once relevant older versions (3.10, 3.12, 3.13) reach
EOL, this code can be removed.
2. Ignore ENOENT/ESTALE errors of inodelk on dst.
3. protect namespace of src and dst. To protect namespace of a file,
take inodelk on parent on hashed subvol, then take entrylk on the
same subvol on parent with basename of file. inodelk on parent is
done to guard against changes to parent layout so that hashed
subvol won't change during rename.
4. <rest of rename continues>
5. unlock all locks
Issue 2:
========
linkfile creation in lookup codepath can race with a rename. Imagine
the following scenario:
* lookup finds a data-file with gfid - gfid-dst - without a
corresponding linkto file on hashed-subvol. It decides to create
linkto file with gfid - gfid-dst.
- Note that some codepaths of dht-rename deletes linkto file of
dst as first step. So, a lookup racing with an in-progress
rename can easily run into this situation.
* a rename (src-path:gfid-src, dst-path:gfid-dst) renames data-file
and hence gfid of data-file changes to gfid-src with path dst-path.
* lookup proceeds and creates linkto file - dst-path - with gfid -
dst-gfid - on hashed-subvol.
* rename tries to create a linkto file dst-path with src-gfid on
hashed-subvol, but it fails with EEXIST. But EEXIST is ignored
during linkto file creation.
Now we've ended with dst-path having different gfids - dst-gfid on
linkto file and src-gfid on data file. Future lookups on dst-path will
always fail with ESTALE, due to differing gfids.
The fix is to synchronize linkfile creation in lookup path with rename
using the same mechanism of protecting namespace explained in solution
of Issue 1. Once locks are acquired, before proceeding with linkfile
creation, we check whether conditions for linkto file creation are
still valid. If not, we skip linkto file creation.
Issue 3:
========
gfid of dst-path can change by the time locks are acquired. This
means, either another rename overwrote dst-path or dst-path was
deleted and recreated by a different client. When this happens,
cached-subvol for dst can change. If rename proceeds with old-gfid and
old-cached subvol, we'll end up in inconsistent state(s) like dst-path
with different gfids on different subvols, more than one data-file
being present etc.
Fix is to do the lookup with a new inode after protecting namespace of
dst. Post lookup, we've to compare gfids and correct local state
appropriately to be in sync with backend.
Issue 4:
========
During revalidate lookup, if following a linkto file doesn't lead to a
valid data-file, local->cached-subvol was not reset to NULL. This
means we would be operating on a stale state which can lead to
inconsistency. As a fix, reset it to NULL before proceeding with
lookup everywhere.
Issue 5:
========
Stale dentries left out in inode table on brick resulted in failures
of link fop even though the file/dentry didn't exist on backend fs. A
patch is submitted to fix this issue. Please check the dependency tree
of current patch on gerrit for details
In short, we fix the problem by not blindly trusting the
inode-table. Instead we validate whether dentry is present by doing
lookup on backend fs.
Change-Id: I832e5c47d232f90c4edb1fafc512bf19bebde165
updates: bz#1543279
BUG: 1543279
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
cut -d$'\n' is not separating the xattrs shown as part of getfattr output.
Hence use awk to get the nth line of getfattr output for nth iteration
in the for loop.
Change-Id: I1a96cd3f72f4f407f9a783375f78d9a69d5d3885
fixes: bz#1574606
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
force-migration config for remove-brick operation.
The cli will take input from the user before starting "remove-brick"
start operation. The message/confirmation looks like the following:
<Running remove-brick with cluster.force-migration enabled can result
in data corruption. It is safer to disable this option so that files
that receive writes during migration are not migrated. Files that are
not migrated can then be manually copied after the remove-brick commit
operation. Do you want to continue with your current
cluster.force-migration settings? (y/n)>
And also question for COMMIT_FORCE is changed.
Fixes: bz#1572586
Change-Id: Ifdb6b108a646f50339dd196d6e65962864635139
Signed-off-by: Susant Palai <spalai@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
see https://review.gluster.org/#/c/19788/
use print fn from __future__
Change-Id: If5075d8d9ca9641058fbc71df8a52aa35804cda4
updates: #411
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Sometimes brick process is getting crashed at the time
of stop brick while brick mux is enabled.
Solution: Brick process was getting crashed because of rpc connection
was not cleaning properly while brick mux is enabled.In this patch
after sending GF_EVENT_CLEANUP notification to xlator(server)
waits for all rpc client connection destroy for specific xlator.Once rpc
connections are destroyed in server_rpc_notify for all associated client
for that brick then call xlator_mem_cleanup for for brick xlator as well as
all child xlators.To avoid races at the time of cleanup introduce
two new flags at each xlator cleanup_starting, call_cleanup.
BUG: 1544090
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
Note: Run all test-cases in separate build (https://review.gluster.org/#/c/19700/)
with same patch after enable brick mux forcefully, all test cases are
passed.
Change-Id: Ic4ab9c128df282d146cf1135640281fcb31997bf
updates: bz#1544090
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. If pre-op fails on all bricks,set lock->release to true in
afr_handle_lock_acquire_failure so that the GF_ASSERT in afr_unlock() does not
crash.
2. Added a missing 'return' after handling pre-op failure in
afr_transaction_perform_fop(), fixing a use-after-free issue.
Change-Id: If0627a9124cb5d6405037cab3f17f8325eed2d83
fixes: bz#1561129
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Note 1) we're not supposed to be using #!/usr/bin/env python, see
https://fedoraproject.org/wiki/Packaging:Guidelines?rd=Packaging/Guidelines#Shebang_lines
Note 2) we're also not supposed to be using "!/usr/bin/python,
see https://fedoraproject.org/wiki/Changes/Avoid_usr_bin_python_in_RPM_Build#Quick_Opt-Out
The previous patch (https://review.gluster.org/19767) tried to do too
much in one patch, so it was abandoned.
This patch does two things:
1) minor cleanup of configure(.ac) to explicitly use python2
2) change all the shebang lines to #!/usr/bin/python2 and add them
where they were missing based on warnings emitted during rpmbuild.
In a follow-up patch python2 will eventually be changed to python3.
Before that python2-isms (e.g. print, string.join(), etc.) need to be
converted to python3. Some of those can be rewritten in version agnostic
python. E.g. print statements become print() with "from __future_ import
print_function". The python 2to3 utility will be used for some of those.
Also Aravinda has given guidance in the comments to the first patch for
changes.
updates: #411
Change-Id: I471730962b2526022115a1fc33629fb078b74338
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
| |
Change-Id: I4648816af908539efdc2528608aa2ebf7f0d0e2f
fixes: bz#1559004
BUG: 1559004
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
glusterd maintains a boolean flag 'port_registered' which is used to determine
if a brick has completed its portmap sign in process. This flag is (re)set in
pmap_sigin and pmap_signout events. In case of brick multiplexing this flag is
the identifier to determine if the very first brick with which the process is
spawned up has completed its sign in process. However in case of glusterd
restart when a brick is already identified as running, glusterd does a
pmap_registry_bind to ensure its portmap table is updated but this flag isn't
which is fine in case of non brick multiplex case but causes an issue if
the very first brick which came as part of process is replaced and then
the subsequent brick attach will fail. One of the way to validate this
is to create and start a volume, remove the first brick and then
add-brick a new one. Add-brick operation will take a very long time and
post that the volume status will show all other brick status apart from
the new brick as down.
Solution is to set brickinfo->port_registered to true for all the
running bricks when brick multiplexing is enabled.
Change-Id: Ib0662d99d0fa66b1538947fd96b43f1cbc04e4ff
Fixes: bz#1560957
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit a60fc2ddc03134fb23c5ed5c0bcb195e1649416b.
This commit was causing multiple tests to time out when brick
multiplexing is enabled. With further debugging, it's found that even
though the volume stop transaction is converted into mgmt_v3 to allow
the remote nodes to follow the synctask framework to process the command,
there are other callers of glusterd_brick_stop () which are not synctask
based.
Change-Id: I7aee687abc6bfeaa70c7447031f55ed4ccd64693
updates: bz#1545048
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: There's a race between the last glusterfs_handle_terminate()
response sent to glusterd and the kill that happens immediately if the
terminated brick is the last brick.
Solution: When it is a last brick for the brick process, instead of glusterfsd
killing itself, glusterd will kill the process in case of brick multiplexing.
And also changing gf_attach utility accordingly.
Change-Id: I386c19ca592536daa71294a13d9fc89a26d7e8c0
fixes: bz#1545048
BUG: 1545048
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
1) Afr's eager-lock only works for data transactions.
2) When there are conflicting writes, write with conflicting region initiates
unlock of eager-lock leading to extra pre-ops and post-ops on the file. When
eager-lock goes off, it leads to extra fsyncs for random-write workload in afr.
Solution (that is modeled after EC):
In EC, when there is a conflicting write, it waits for the current write to
complete before it winds the conflicted write. This leads to better utilization
of network and disk, because we will not be doing extra xattrops and FSYNCs and
inodelk/unlock. Moved fd based counters to inode based counters.
I tried to model the solution based on EC's locking, but it is not similar to
AFR because we had to keep backward compatibility.
Lifecycle of lock:
==================
First transaction is added to inode->owners list and an inodelk will be sent on
the wire. All the next transactions will be put in inode->waiters list until
the first transaction completes inodelk and [f]xattrop completely. Once
[f]xattrop also completes, all the requests in the inode->waiters list are
checked if it conflict with any of the existing locks which are in
inode->owners list and if not are added to inode->owners list and resumed with
doing transaction. When these transactions complete fop phase they will be
moved to inode->post_op list and resume the transactions that were paused
because of conflicts. Post-op and unlock will not be issued on the wire until
that is the last transaction on that inode. Last transaction when it has to
perform post-op can choose to sleep for deyed-post-op-secs value. During that
time if any other transaction comes, it will wake up the sleeping transaction
and takes over the ownership of the lock and the cycle continues. If the
dealyed-post-op-secs expire, then the timer thread will wakeup the sleeping
transaction and it will set lock->release to true and starts doing post-op and
then unlock. During this time if any other transactions come, they will be put
in inode->frozen list. Once the previous unlock comes it will move the frozen
list to waiters list and moves the first element from this waiters-list to
owners-list and attempts the lock and the cycle continues. This is the general
idea. There is logic at the time of dealying and at the time of new
transaction or in flush fop to wakeup existing sleeping transactions or
choosing whether to delay a transaction etc, which is subjected to change based
on future enhancements etc.
Fixes: #418
BUG: 1549606
Change-Id: I88b570bbcf332a27c82d2767dfa82472f60055dc
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Self-heal creates a thread per brick to sweep the index looking for
files that need to be healed. These threads are started before the
volume comes online, so nothing is done but waiting for the next
sweep. This happens once per minute.
When a replace brick command is executed, the new graph is loaded and
all index sweeper threads started. When all bricks have reported, a
getxattr request is sent to the root directory of the volume. This
causes a heal on it (because the new brick doesn't have good data),
and marks its contents as pending to be healed. This is done by the
index sweeper thread on the next round, one minute later.
This patch solves this problem by waking all index sweeper threads
after a successful check on the root directory.
Additionally, the index sweep thread scans the index directory
sequentially, but it might happen that after healing a directory entry
more index entries are created but skipped by the current directory
scan. This causes the remaining entries to be processed on the next
round, one minute later. The same can happen in the next round, so
the heal is running in bursts and taking a lot to finish, specially
on volumes with many directory levels.
This patch solves this problem by immediately restarting the index
sweep if a directory has been healed.
Change-Id: I58d9ab6ef17b30f704dc322e1d3d53b904e5f30e
BUG: 1547662
Signed-off-by: Xavi Hernandez <jahernan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This test does:
1. mount a volume
2. kill a brick in the volume
3. mkdir (/somedir)
In my local tests and in [1], I see that mkdir in step 3 fails because
there is no dht-layout on root directory.
The reason I think is by the time first lookup on "/" hit dht, a brick
was killed as per step 2. This means layout was not healed for "/" and
since this is a new volume, no layout is present on it. Note that the
first lookup done on "/" by fuse-bridge is not synchronized with
parent process of daemonized glusterfs mount completing. IOW, by the
time glusterfs cmd executed there is no guarantee that lookup on "/"
is complete. So, if step 2 races ahead of fuse_first_lookup on "/", we
end up with an invalid dht-layout on "/" resulting in failures.
Doint an operation like ls makes sure that lookup on "/" is completed
before we kill a brick
Change-Id: Ie0c4e442c4c629fad6f7ae850437e3d63fe4bea9
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
BUG: 1543279
|
|
|
|
|
|
|
|
|
|
|
| |
Because bug-924726.t depends on netstat, tests failed before. This got resolved
by adding respective check to run-tests.sh.
Enabled respective test again.
Change-Id: I70c9bff03379ed9ee8cd95842c3501dfb50b8e86
BUG: 1312830
Signed-off-by: Sven Fischer <sven@fischer-abc.de>
|
|
|
|
|
|
| |
Change-Id: Ib74354f57a18569762ad45a51f182822a2537421
BUG: 1468483
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
For as long as a shard's inode is in priv->lru_list, it should have a non-zero
ref-count. This patch achieves it by taking a ref on the inode when it
is added to lru list. When it's time for the inode to be evicted
from the lru list, a corresponding unref is done.
Change-Id: I289ffb41e7be5df7489c989bc1bbf53377433c86
BUG: 1468483
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This uses 'timeout' command with 300 seconds default. Right now,
there is just 1 test which takes more than that in a properly
setup machine.
Ideally best case is set the default to something like 30 seconds,
and if a test is supposed to take more than that, owner should add
a timeout line to test knowingly. That way, it makes test writers
think about a time limit too.
Change-Id: I747005ce1f208aeb2ecbf899e8feea487ecd21a0
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
| |
In bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t
check for peer count after starting glusterd instance on node 2
Change-Id: I3f92013719d94b6d92fb5db25efef1fb4b41d510
BUG: 1540607
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As nfs-ganesha, a wcc data contains pre/post attributes is return
in read/write rpc reply. nfs-ganesha get those attributes by
two getattr between the real read/write right now.
But, gluster has return pre/post attributes from glusterfsd,
those attributes are skipped in syncop/gfapi, if gfapi return them,
the upper user (nfs-ganesha) can use them directly without any
duplicate getattr.
Updates: #389
Change-Id: I7b643ae4241cfe2aeb17063de00192d81674024a
Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To reduce the overall time taken by the every regression job for all glusterd test cases,
avoiding some duplicate tests by clubbing similar test cases into one.
real time taken for all regression jobs of glusterd without this patch is 1959 seconds,
with this patch it is 1059 seconds.
Look at the below document for your reference.
https://docs.google.com/document/d/1u8o4-wocrsuPDI8BwuBU6yi_x4xA_pf2qSrFY6WEQpo/edit?usp=sharing
Change-Id: Ib14c61ace97e62c3abce47230dd40598640fe9cb
BUG: 1530905
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
In one of the case in commit cb0339f there's one particular case where
after removing the old snap it wasn't writing the new snap version and
this resulted into one of the test to fail spuriously.
Change-Id: I3e83435fb62d6bba3bbe227e40decc6ce37ea77b
BUG: 1540607
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
We currently don't have a roll-back/undoing of post-ops if quorum is not
met. Though the FOP is still unwound with failure, the xattrs remain on
the disk. Due to these partial post-ops and partial heals (healing only when
2 bricks are up), we can end up in split-brain purely from the afr
xattrs point of view i.e each brick is blamed by atleast one of the
others. These scenarios are hit when there is frequent
connect/disconnect of the client/shd to the bricks while I/O or heal
are in progress.
Fix:
Instead of undoing the post-op, pick a source based on the xattr values.
If 2 bricks blame one, the blamed one must be treated as sink.
If there is no majority, all are sources. Once we pick a source,
self-heal will then do the heal instead of erroring out due to
split-brain.
Change-Id: I3d0224b883eb0945785ade0e9697a1c828aec0ae
BUG: 1539358
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
| |
Change-Id: Ifbf5e628ccb9a0ecb285f5884a41e70d935316bd
Signed-off-by: Csaba Henk <csaba@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, the list of xattrs that md-cache can cache is hard coded
in the md-cache.c file, this necessiates code change and rebuild
everytime a new xattr needs to be added to md-cache xattr cache
list.
With this patch, the user will be able to configure a comma
seperated list of xattrs to be cached by md-cache
Updates #297
Change-Id: Ie35ed607d17182d53f6bb6e6c6563ac52bc3132e
Signed-off-by: Poornima G <pgurusid@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
afr relies on pending changelog xattrs to identify source and sinks and the
setting of these xattrs happen in post-op. So if post-op fails, we need to
unwind the write txn with a failure.
Change-Id: I0f019ac03890108324ee7672883d774918b20be1
BUG: 1506140
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, md-cache sends a list of xattrs, it is inttrested in recieving
invalidations for. But, it cannot specify any wildcard in the xattr names
Eg: user.* - invalidate on updating any xattr with user. prefix.
This patch, enable upcall to honor wildcard in the xattr key names
Updates: #297
Change-Id: I98caf0ed72f11ef10770bf2067d4428880e0a03a
Signed-off-by: Poornima G <pgurusid@redhat.com>
|
|
|
|
|
|
|
|
|
| |
..in order for self-heal of symlinks to work properly (see BZ for
details).
Change-Id: I9a011d00b07a690446f7fd3589e96f840e8b7501
BUG: 1529488
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
so that glusterd is also aware that shd is up and running.
While not reproducible locally, on the jenkins slaves, 'gluster vol heal patchy'
fails with "Self-heal daemon is not running. Check self-heal daemon log file.",
while infact the afr_child_up_status_in_shd() checks before that passed. In the
shd log also, I see the shd being up and connected to at least one brick before
the heal is launched.
Change-Id: Id3801fa4ab56a70b1f0bd6a7e240f69bea74a5fc
BUG: 1515163
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
| |
This reverts commit 56e5fdae74845dfec0ff7ad0c8fee77695d36ad5.
Change-Id: Ia62cee5440bbe8e23f5da9cff692d792091d544a
Signed-off-by: Milind Changire <mchangir@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Existing EC code doesn't try to heal the OpenFD to
avoid unnecessary healing of the data later.
Fix implements the healing of open FDs before
carrying out file operations on them by making an
attempt to open the FDs on required up nodes.
BUG: 1431955
Change-Id: Ib696f59c41ffd8d5678a484b23a00bb02764ed15
Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With heterogenous bricks now being supported in DHT
we could run into issues where files are not migrated
even though there is sufficient space in newly added bricks
which just happen to be considerably smaller than older
bricks. Using percentages instead of absolute available
space for space checks can mitigate that to some extent.
Marking bug-1247563.t as that used to depend on the easier
code to prevent a file from migrating. This will be removed
once we find a way to force a file migration failure.
Change-Id: I3452520511f304dbf5af86f0632f654a92fcb647
BUG: 1529440
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
| |
Reported by: Sam McLeod
Change-Id: Ic8f9b46b173796afd70aff1042834b03ac3e80b2
BUG: 1512437
Signed-off-by: Poornima G <pgurusid@redhat.com>
|
|
|
|
|
|
|
|
|
| |
The fallocate, zerofill and discard modify file data on the server thus
rendering stale any cache held by the xlator on the client.
BUG: 1524252
Change-Id: I432146c6390a0cd5869420c373f598da43915f3f
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Patch attempts to use the epoll infra for handling SSL connections
as well instead of the socket_poller() thread func.
This essentially makes priv->own_thread flag redundant.
SSL_connect()/SSL_accept() is now non-blocking which has done away
with the localised poll() in ssl_do(). So, ssl_do() has been updated
appropriately.
own_thread and coincidently socket_poller() thread for SSL processing
is now deprecated.
Added a timeout to test whether seal-heal daemon is up and running
as per Ravi's suggestion.
Change-Id: If2b5d7b4fd19e321cb289e08d49a718d2161aafe
Signed-off-by: Milind Changire <mchangir@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
* Introduce xlator methods to allow dumping of metrics
* Separate options to get the metrics dumped in a path
Updates #168
Change-Id: I7df80df33b71d6f449f03c2332665b4a45f6ddf2
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
dd is doing a statfs and failing with ENOSPC instead of writign and
getting EDQUOTA. Make either error a success in this test.
> Signed-off-by: Kevin Vigor <kvigor@fb.com>
> Reviewed-on: http://review.gluster.org/16352
BUG: 1521116
Signed-off-by: ShyamsundarR <srangana@redhat.com>
Change-Id: I9f580d9e4a4dd293df55a1d954f86a9862fcae7b
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Options "create-mask" and "create-directory-mask" are added to
remove the mode bits set on a file or directory when its created.
Default value of these options is 0777.
Options "force-create-mode" and "force-create-directory" sets
the default permission for a file or directory irrespective of
the clients umask.
Default value of these options is 0000.
Command to set option:
volume set <volume name> storage.<option-name> <value>
The valid value range from 0000 to 0777.
Updates #301
Change-Id: Ia33d13f2117202ca55a056c747ccc3674eb8bae1
Signed-off-by: Subha sree Mohankumar <smohanku@redhat.com>
|