Change-Id: I4648816af908539efdc2528608aa2ebf7f0d0e2f
fixes: bz#1559004
BUG: 1559004
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
glusterd maintains a boolean flag 'port_registered' which is used to determine
if a brick has completed its portmap sign-in process. This flag is (re)set in
pmap_signin and pmap_signout events. In the brick multiplexing case, this flag
identifies whether the very first brick with which the process was spawned has
completed its sign-in process. However, on a glusterd restart, when a brick is
already identified as running, glusterd does a pmap_registry_bind to ensure its
portmap table is updated, but this flag isn't, which is fine in the
non-multiplexed case but causes an issue if the very first brick which came up
as part of the process is replaced: the subsequent brick attach will fail. One
way to validate this is to create and start a volume, remove the first brick,
and then add-brick a new one. The add-brick operation will take a very long
time, and afterwards the volume status will show every brick except the new
one as down.
Solution is to set brickinfo->port_registered to true for all the
running bricks when brick multiplexing is enabled.
Change-Id: Ib0662d99d0fa66b1538947fd96b43f1cbc04e4ff
Fixes: bz#1560957
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
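A hedged .t-style reproduction sketch of the scenario above (TEST, pkill, and
the harness variables $CLI, $V0, $H0, $B0 follow GlusterFS test conventions;
the brick layout is illustrative):

  TEST $CLI volume set all cluster.brick-multiplex on
  TEST $CLI volume create $V0 $H0:$B0/${V0}{0..2}
  TEST $CLI volume start $V0
  # Restart glusterd: the running bricks are re-registered through
  # pmap_registry_bind, the path that left port_registered unset.
  TEST pkill glusterd
  TEST glusterd
  # Replace the very first brick of the multiplexed process, then attach
  # a new one; without the fix the add-brick hangs and the volume status
  # later shows every brick except the new one as down.
  TEST $CLI volume remove-brick $V0 $H0:$B0/${V0}0 force
  TEST $CLI volume add-brick $V0 $H0:$B0/${V0}3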
|
This reverts commit a60fc2ddc03134fb23c5ed5c0bcb195e1649416b.
This commit was causing multiple tests to time out when brick
multiplexing is enabled. Further debugging found that even
though the volume stop transaction is converted into mgmt_v3 to allow
the remote nodes to follow the synctask framework to process the command,
there are other callers of glusterd_brick_stop () which are not synctask
based.
Change-Id: I7aee687abc6bfeaa70c7447031f55ed4ccd64693
updates: bz#1545048
|
Problem: There's a race between the last glusterfs_handle_terminate()
response sent to glusterd and the kill that happens immediately if the
terminated brick is the last brick.
Solution: When it is the last brick of the brick process, instead of glusterfsd
killing itself, glusterd will kill the process in case of brick multiplexing.
The gf_attach utility is also changed accordingly.
Change-Id: I386c19ca592536daa71294a13d9fc89a26d7e8c0
fixes: bz#1545048
BUG: 1545048
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
Problem:
1) Afr's eager-lock only works for data transactions.
2) When there are conflicting writes, a write with a conflicting region
initiates an unlock of the eager-lock, leading to extra pre-ops and post-ops
on the file. When eager-lock goes off, it leads to extra fsyncs for a
random-write workload in afr.
Solution (modeled after EC):
In EC, when there is a conflicting write, it waits for the current write to
complete before it winds the conflicting write. This leads to better
utilization of network and disk, because we will not be doing extra xattrops,
FSYNCs and inodelk/unlock. Moved fd-based counters to inode-based counters.
I tried to model the solution on EC's locking, but it is not identical to
AFR because we had to keep backward compatibility.
Lifecycle of a lock:
====================
The first transaction is added to the inode->owners list and an inodelk is
sent on the wire. All subsequent transactions are put in the inode->waiters
list until the first transaction completes the inodelk and [f]xattrop
completely. Once the [f]xattrop also completes, all the requests in the
inode->waiters list are checked for conflicts with any of the existing locks
in the inode->owners list; if there is none, they are added to inode->owners
and resumed to perform the transaction. When these transactions complete the
fop phase, they are moved to the inode->post_op list and resume the
transactions that were paused because of conflicts. Post-op and unlock will
not be issued on the wire until the last transaction on that inode. When the
last transaction has to perform the post-op, it can choose to sleep for the
delayed-post-op-secs value. During that time, if any other transaction comes,
it wakes up the sleeping transaction, takes over ownership of the lock, and
the cycle continues. If delayed-post-op-secs expires, the timer thread wakes
up the sleeping transaction, which sets lock->release to true and starts the
post-op and then the unlock. During this time, if any other transactions come,
they are put in the inode->frozen list. Once the previous unlock completes,
the frozen list is moved to the waiters list, the first element of the waiters
list is moved to the owners list, and the lock is attempted again, continuing
the cycle. This is the general idea. There is logic at the time of delaying
and at the time of a new transaction, or in the flush fop, to wake up existing
sleeping transactions or choose whether to delay a transaction, etc., which is
subject to change based on future enhancements.
Fixes: #418
BUG: 1549606
Change-Id: I88b570bbcf332a27c82d2767dfa82472f60055dc
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
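The delayed post-op described above is tunable. A hedged example, assuming the
sleep maps to AFR's cluster.post-op-delay-secs volume option (the volume name
"patchy" is illustrative):

  # Give piggy-backing transactions a 1-second window before the post-op:
  gluster volume set patchy cluster.post-op-delay-secs 1
  # 0 makes the last transaction perform the post-op immediately:
  gluster volume set patchy cluster.post-op-delay-secs 0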
|
Self-heal creates a thread per brick to sweep the index looking for
files that need to be healed. These threads are started before the
volume comes online, so nothing is done but waiting for the next
sweep. This happens once per minute.
When a replace-brick command is executed, the new graph is loaded and
all index sweeper threads are started. When all bricks have reported, a
getxattr request is sent to the root directory of the volume. This
causes a heal on it (because the new brick doesn't have good data)
and marks its contents as pending to be healed. This is done by the
index sweeper thread on the next round, one minute later.
This patch solves this problem by waking all index sweeper threads
after a successful check on the root directory.
Additionally, the index sweep thread scans the index directory
sequentially, but it might happen that after healing a directory entry,
more index entries are created but skipped by the current directory
scan. This causes the remaining entries to be processed on the next
round, one minute later. The same can happen in the next round, so
the heal runs in bursts and takes a long time to finish, especially
on volumes with many directory levels.
This patch solves this problem by immediately restarting the index
sweep if a directory has been healed.
Change-Id: I58d9ab6ef17b30f704dc322e1d3d53b904e5f30e
BUG: 1547662
Signed-off-by: Xavi Hernandez <jahernan@redhat.com>
|
This test does:
1. mount a volume
2. kill a brick in the volume
3. mkdir (/somedir)
In my local tests and in [1], I see that mkdir in step 3 fails because
there is no dht-layout on the root directory.
The reason, I think, is that by the time the first lookup on "/" hit dht, a
brick had already been killed as per step 2. This means the layout was not
healed for "/", and since this is a new volume, no layout is present on it.
Note that the first lookup done on "/" by fuse-bridge is not synchronized
with the parent process of the daemonized glusterfs mount completing. IOW, by
the time the glusterfs command has executed, there is no guarantee that the
lookup on "/" is complete. So, if step 2 races ahead of fuse_first_lookup on
"/", we end up with an invalid dht-layout on "/", resulting in failures.
Doing an operation like ls makes sure that the lookup on "/" is completed
before we kill a brick.
Change-Id: Ie0c4e442c4c629fad6f7ae850437e3d63fe4bea9
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
BUG: 1543279
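A hedged .t-style sketch of the fix (TEST, kill_brick and the $V0/$M0/$H0/$B0
variables follow GlusterFS test-harness conventions):

  TEST glusterfs --volfile-server=$H0 --volfile-id=$V0 $M0
  # Force the first lookup on "/" to complete before any brick goes down:
  TEST ls $M0
  TEST kill_brick $V0 $H0 $B0/${V0}1
  TEST mkdir $M0/somedir    # "/" now has a valid layout, so this succeeds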
|
Because bug-924726.t depends on netstat, the test used to fail when netstat
was unavailable. This got resolved by adding a corresponding check to
run-tests.sh.
Re-enabled the respective test.
Change-Id: I70c9bff03379ed9ee8cd95842c3501dfb50b8e86
BUG: 1312830
Signed-off-by: Sven Fischer <sven@fischer-abc.de>
|
Change-Id: Ib74354f57a18569762ad45a51f182822a2537421
BUG: 1468483
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
|
For as long as a shard's inode is in priv->lru_list, it should have a non-zero
ref-count. This patch achieves this by taking a ref on the inode when it
is added to the lru list. When it's time for the inode to be evicted
from the lru list, a corresponding unref is done.
Change-Id: I289ffb41e7be5df7489c989bc1bbf53377433c86
BUG: 1468483
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
|
This uses the 'timeout' command with a 300-second default. Right now,
there is just one test which takes more than that on a properly
set-up machine.
Ideally, the default would be something like 30 seconds, and if a test
is supposed to take more than that, the owner should knowingly add
a timeout line to the test. That way, it makes test writers
think about a time limit too.
Change-Id: I747005ce1f208aeb2ecbf899e8feea487ecd21a0
Signed-off-by: Amar Tumballi <amarts@redhat.com>
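A hedged sketch of the resulting behavior; the prove invocation mirrors what
run-tests.sh effectively does, and the per-test SCRIPT_TIMEOUT line is my
assumption for the "timeout line" mentioned above:

  # What the harness effectively runs per test, with the 300-second cap:
  timeout 300 prove -vf tests/bugs/glusterd/some-test.t
  # Hypothetical override placed near the top of a known-slow .t file,
  # assuming run-tests.sh scans the script for a SCRIPT_TIMEOUT= line:
  SCRIPT_TIMEOUT=900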
|
In bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t,
check for the peer count after starting the glusterd instance on node 2.
Change-Id: I3f92013719d94b6d92fb5db25efef1fb4b41d510
BUG: 1540607
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
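A hedged sketch of the added check (start_glusterd, EXPECT_WITHIN, peer_count
and $PROBE_TIMEOUT are assumed from the cluster.rc test harness):

  TEST start_glusterd 2
  EXPECT_WITHIN $PROBE_TIMEOUT 1 peer_count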
|
For nfs-ganesha, wcc data containing pre/post attributes is returned
in the read/write RPC reply. nfs-ganesha currently gets those attributes
with two getattrs around the real read/write.
But gluster already returns pre/post attributes from glusterfsd;
those attributes are just skipped in syncop/gfapi. If gfapi returned them,
the upper user (nfs-ganesha) could use them directly without any
duplicate getattr.
Updates: #389
Change-Id: I7b643ae4241cfe2aeb17063de00192d81674024a
Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
|
To reduce the overall time taken by every regression job for all glusterd
test cases, avoid some duplicate tests by clubbing similar test cases into
one.
The real time taken for all glusterd regression jobs without this patch is
1959 seconds; with this patch it is 1059 seconds.
See the document below for reference.
https://docs.google.com/document/d/1u8o4-wocrsuPDI8BwuBU6yi_x4xA_pf2qSrFY6WEQpo/edit?usp=sharing
Change-Id: Ib14c61ace97e62c3abce47230dd40598640fe9cb
BUG: 1530905
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
In commit cb0339f there's one particular case where, after removing the
old snap, it wasn't writing the new snap version; this resulted in one
of the tests failing spuriously.
Change-Id: I3e83435fb62d6bba3bbe227e40decc6ce37ea77b
BUG: 1540607
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
|
Problem:
We currently don't have a roll-back/undoing of post-ops if quorum is not
met. Though the FOP is still unwound with failure, the xattrs remain on
the disk. Due to these partial post-ops and partial heals (healing only when
2 bricks are up), we can end up in split-brain purely from the afr
xattrs' point of view, i.e. each brick is blamed by at least one of the
others. These scenarios are hit when there is frequent
connect/disconnect of the client/shd to the bricks while I/O or heal
is in progress.
Fix:
Instead of undoing the post-op, pick a source based on the xattr values.
If 2 bricks blame one, the blamed one must be treated as a sink.
If there is no majority, all are sources. Once we pick a source,
self-heal will then do the heal instead of erroring out due to
split-brain.
Change-Id: I3d0224b883eb0945785ade0e9697a1c828aec0ae
BUG: 1539358
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
Change-Id: Ifbf5e628ccb9a0ecb285f5884a41e70d935316bd
Signed-off-by: Csaba Henk <csaba@redhat.com>
|
Currently, the list of xattrs that md-cache can cache is hard-coded
in the md-cache.c file; this necessitates a code change and rebuild
every time a new xattr needs to be added to the md-cache xattr cache
list.
With this patch, the user will be able to configure a comma-separated
list of xattrs to be cached by md-cache.
Updates #297
Change-Id: Ie35ed607d17182d53f6bb6e6c6563ac52bc3132e
Signed-off-by: Poornima G <pgurusid@redhat.com>
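A hedged usage example; the option key performance.xattr-cache-list is my
assumption for the comma-separated list this patch introduces:

  # Ask md-cache to additionally cache two application xattrs:
  gluster volume set <volname> performance.xattr-cache-list "user.foo,user.bar"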
|
afr relies on pending changelog xattrs to identify sources and sinks, and the
setting of these xattrs happens in the post-op. So if the post-op fails, we
need to unwind the write txn with a failure.
Change-Id: I0f019ac03890108324ee7672883d774918b20be1
BUG: 1506140
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
Currently, md-cache sends a list of xattrs it is interested in receiving
invalidations for, but it cannot specify any wildcard in the xattr names,
e.g. user.* - invalidate on updating any xattr with the user. prefix.
This patch enables upcall to honor wildcards in the xattr key names.
Updates: #297
Change-Id: I98caf0ed72f11ef10770bf2067d4428880e0a03a
Signed-off-by: Poornima G <pgurusid@redhat.com>
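Combined with the configurable md-cache list above, a hedged example (same
assumed option key as before):

  # Register for invalidations on every xattr under the user. namespace:
  gluster volume set <volname> performance.xattr-cache-list "user.*"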
|
..in order for self-heal of symlinks to work properly (see BZ for
details).
Change-Id: I9a011d00b07a690446f7fd3589e96f840e8b7501
BUG: 1529488
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
so that glusterd is also aware that shd is up and running.
While not reproducible locally, on the Jenkins slaves 'gluster vol heal patchy'
fails with "Self-heal daemon is not running. Check self-heal daemon log file.",
while in fact the afr_child_up_status_in_shd() checks before that had passed.
In the shd log too, I see the shd being up and connected to at least one brick
before the heal is launched.
Change-Id: Id3801fa4ab56a70b1f0bd6a7e240f69bea74a5fc
BUG: 1515163
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
This reverts commit 56e5fdae74845dfec0ff7ad0c8fee77695d36ad5.
Change-Id: Ia62cee5440bbe8e23f5da9cff692d792091d544a
Signed-off-by: Milind Changire <mchangir@redhat.com>
|
Existing EC code doesn't try to heal an open FD, to
avoid unnecessary healing of the data later.
The fix implements the healing of open FDs before
carrying out file operations on them, by making an
attempt to open the FDs on the required up nodes.
BUG: 1431955
Change-Id: Ib696f59c41ffd8d5678a484b23a00bb02764ed15
Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
|
With heterogeneous bricks now being supported in DHT,
we could run into issues where files are not migrated
even though there is sufficient space in newly added bricks
which just happen to be considerably smaller than the older
bricks. Using percentages instead of absolute available
space for the space checks can mitigate that to some extent.
Marking bug-1247563.t as bad, as it used to depend on the easier
code to prevent a file from migrating. This will be removed
once we find a way to force a file migration failure.
Change-Id: I3452520511f304dbf5af86f0632f654a92fcb647
BUG: 1529440
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
Reported by: Sam McLeod
Change-Id: Ic8f9b46b173796afd70aff1042834b03ac3e80b2
BUG: 1512437
Signed-off-by: Poornima G <pgurusid@redhat.com>
|
The fallocate, zerofill and discard operations modify file data on the server,
thus rendering stale any cache held by the xlator on the client.
BUG: 1524252
Change-Id: I432146c6390a0cd5869420c373f598da43915f3f
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
|
The patch attempts to use the epoll infra for handling SSL connections
as well, instead of the socket_poller() thread func.
This essentially makes the priv->own_thread flag redundant.
SSL_connect()/SSL_accept() is now non-blocking, which has done away
with the localised poll() in ssl_do(). So, ssl_do() has been updated
appropriately.
own_thread, and coincidentally the socket_poller() thread for SSL
processing, is now deprecated.
Added a timeout to test whether the self-heal daemon is up and running,
as per Ravi's suggestion.
Change-Id: If2b5d7b4fd19e321cb289e08d49a718d2161aafe
Signed-off-by: Milind Changire <mchangir@redhat.com>
|
* Introduce xlator methods to allow dumping of metrics
* Separate options to get the metrics dumped in a path
Updates #168
Change-Id: I7df80df33b71d6f449f03c2332665b4a45f6ddf2
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
dd is doing a statfs and failing with ENOSPC instead of writing and
getting EDQUOT. Make either error a success in this test.
> Signed-off-by: Kevin Vigor <kvigor@fb.com>
> Reviewed-on: http://review.gluster.org/16352
BUG: 1521116
Signed-off-by: ShyamsundarR <srangana@redhat.com>
Change-Id: I9f580d9e4a4dd293df55a1d954f86a9862fcae7b
|
Options "create-mask" and "create-directory-mask" are added to
remove mode bits set on a file or directory when it is created.
The default value of these options is 0777.
Options "force-create-mode" and "force-directory-mode" set
the default permissions for a file or directory irrespective of
the client's umask.
The default value of these options is 0000.
Command to set an option:
volume set <volume name> storage.<option-name> <value>
Valid values range from 0000 to 0777.
Updates #301
Change-Id: Ia33d13f2117202ca55a056c747ccc3674eb8bae1
Signed-off-by: Subha sree Mohankumar <smohanku@redhat.com>
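Hedged concrete instances of the command form given above (the volume name
"patchy" and the mode values are illustrative):

  gluster volume set patchy storage.create-mask 0755
  gluster volume set patchy storage.create-directory-mask 0755
  gluster volume set patchy storage.force-create-mode 0640
  gluster volume set patchy storage.force-directory-mode 0750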
|
Issue: When using the statedump command to take a statedump of the
gfapi process, we specify the following things:
$gluster volume statedump <volname> client <host>:<pid>
pid: Pid of the gfapi application
host: This should be the IP/hostname as seen by the glusterd
the gfapi application is connected to.
In this test case, if the gfapi application is running locally
and is connected to $H1's glusterd, the <host> need not be $H1.
<host> could be localhost, 127.0.0.1, 127.1.1.1 etc., based on
the configuration of the system. Hence use netstat to find the
right <host> value.
Change-Id: I6efb9d1ccaf9c6841a9ab7c9ebfecafc03c0bc5e
BUG: 1517961
Signed-off-by: Poornima G <pgurusid@redhat.com>
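A hedged sketch of deriving <host> with netstat (24007 is glusterd's default
management port; the awk parsing and the $gfapi_pid variable are illustrative):

  # Local address of the gfapi process's connection to glusterd:
  host=$(netstat -ntp 2>/dev/null \
         | awk -v pid="$gfapi_pid" '$0 ~ pid"/" && $5 ~ /:24007$/ \
               {split($4, a, ":"); print a[1]; exit}')
  gluster volume statedump <volname> client $host:$gfapi_pid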
|
problem: detach commit was issued before detach start had completed.
fix: wait for detach start to finish, and then detach commit.
Change-Id: I639962be6de6dbd1512f0a5617050d1e6872eac8
BUG: 1517961
Signed-off-by: hari gowtham <hgowtham@redhat.com>
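A hedged .t-style sketch of the fix; detach_status_field is a hypothetical
helper that parses 'detach status' output (EXPECT_WITHIN, $CLI and
$REBALANCE_TIMEOUT follow harness conventions):

  TEST $CLI volume tier $V0 detach start
  EXPECT_WITHIN $REBALANCE_TIMEOUT "completed" detach_status_field $V0
  TEST $CLI volume tier $V0 detach commit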
|
Change-Id: If6c36dc6c395730dfb17b5b4df6f24629d904926
BUG: 1517961
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
Change-Id: Ied1215bfec0ccf2ec8ee55a0aaf618517b67bd55
BUG: 1517961
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
|
Problem : snapshot creation was failing after a brick reset/replace.
Fix : changed the code to set the mount_dir value in rsp_dict during the
prerequisites phase, i.e. the glusterd_brick_op_prerequisites call, and
removed it from the prevalidate phase.
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
Change-Id: Ief5d0fafe882a7eb1a7da8535b7c7ce6f011604c
BUG: 1512451
|
A timing issue caused the remove-brick commit to fail.
Replaced 'remove-brick commit' with 'remove-brick force'.
Change-Id: I69144b2f7be34095dbd3a7d182e0bf01b27fb0a4
BUG: 1517904
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
Problem: Sometimes test case ./tests/bugs/bug-1371806_1.t fails on
CentOS due to a race between a fresh lookup and a setxattr fop.
Solution: In the self-heal code path we save the mds on the inode_ctx, but
this was not serialized with the lookup unwind. Due to this behavior, if
the mds was not yet saved on the inode_ctx after the lookup unwind, any
subsequent setxattr fop failed with ENOENT, because no mds was found on
the inode ctx. To resolve this, saving the mds on the inode ctx has been
serialized with the lookup unwind.
BUG: 1498966
Change-Id: I8d4bb40a6cbf0cec35d181ec0095cc7142b02e29
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
| |
Problem:
In an arbiter volume, lookup was being served from one of the sink
bricks (source brick was down). shard uses the iatt values from lookup cbk
to calculate the size and block count, which in this case were incorrect
values. shard_local_t->last_block was thus initialised to -1, resulting
in an infinite while loop in shard_common_resolve_shards().
Fix:
Use client quorum logic to allow or fail the lookups from afr if there
are no readable subvolumes. So in replica-3 or arbiter vols, if there is
no good copy or if quorum is not met, fail lookup with ENOTCONN.
With this fix, we are also removing support for quorum-reads xlator
option. So if quorum is not met, neither read nor write txns are allowed
and we fail the fop with ENOTCONN.
Change-Id: Ic65c00c24f77ece007328b421494eee62a505fa0
BUG: 1467250
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
Change-Id: I04c35305bfb663eabbf715eee78695adfd4a2d20
BUG: 1511310
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
|
The restriction of using fds opened by the same Pid means fds cannot
be shared across threads of a multithreaded application. Note that fops
from the kernel have a different Pid for different threads. Imagine the
following sequence of operations:
* Turn off performance.open-behind
* Thread t1 opens an fd - fd1 - on file "file". Let's assume the nodeid of
"file" is "nodeid-file".
* Thread t2 does RENAME ("newfile", "file"). Let's assume the nodeid of
"newfile" is "nodeid-newfile".
* t2 proceeds to do fstat (fd1)
The above set of operations can sometimes result in ESTALE/ENOENT
errors. RENAME overwrites "file" with "newfile", changing its nodeid
from "nodeid-file" to "nodeid-newfile", and post RENAME, "nodeid-file" is
removed from the backend. If fstat carries nodeid-file as its argument,
which can happen if lookup has not refreshed the nodeid of "file", and
since t2 doesn't have an fd opened, fuse_getattr_resume uses STAT,
which will fail as "nodeid-file" no longer exists.
Since the above set of operations and the sharing of fds across
multiple threads are valid, this is a bug.
The fix is to use any fd opened on the inode. In this specific example
fuse_getattr_resume will find fd1 and wind down the call as fstat
(fd1), which won't fail.
Cross-checked with "Miklos Szeredi" <mszeredi.at.redhat.dot.com> for
any security issues with this solution, and he approves the solution.
Thanks to "Miklos Szeredi" <mszeredi.at.redhat.dot.com> for all the
pointers and discussions.
Change-Id: I88dd29b3607cd2594eee9d72a1637b5346c8d49c
BUG: 1510401
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
|
|
| |
Add peer_count check before checking for brick status
Change-Id: I0179ec29729ab6bbc3571eb6ffd631b7b0d15f7c
BUG: 1510415
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
A new option is added to allow independent configuration of eager
locking for regular files and non-regular files.
Change-Id: I8f80e46d36d8551011132b15c0fac549b7fb1c60
BUG: 1502610
Signed-off-by: Xavier Hernandez <jahernan@redhat.com>
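A hedged example; "other-eager-lock" is my assumption for the name of the
new independent option next to the existing disperse eager-lock:

  # Keep eager locking for regular files, disable it for everything else:
  gluster volume set <volname> disperse.eager-lock on
  gluster volume set <volname> disperse.other-eager-lock off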
|
While stopping the brick which is to be reset and replaced, the delete_brick
flag was passed as true, which resulted in glusterd freeing up the source
brick before the actual operation. This caused commit force to fail,
being unable to find the source brickinfo.
Change-Id: I1aa7508eff7cc9c9b5d6f5163f3bb92736d6df44
BUG: 1507466
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
|
Update .t tier tests to use the new tier CLI.
Change-Id: I0e7f1769071108d8266fc86378c4466bcaf96e7d
BUG: 1505253
Signed-off-by: N Balachandran <nbalacha@redhat.com>
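A hedged sketch of the old-to-new CLI mapping the tests were moved to
(syntax is illustrative, not authoritative):

  # Old style:
  gluster volume attach-tier $V0 replica 2 $H0:$B0/hot{1,2}
  gluster volume detach-tier $V0 start
  # New style:
  gluster volume tier $V0 attach replica 2 $H0:$B0/hot{1,2}
  gluster volume tier $V0 detach start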
|
Problem:
Append on a file with split-brain succeeds. Open is intercepted by
open-behind; when a write comes on the file, open-behind does open+write.
Open succeeds because afr doesn't fail it. Then the write succeeds because
write-behind intercepts it. Flush is also intercepted by write-behind, so
the application never gets to know that the write failed.
Fix:
Fail open on split-brain, so that when open-behind does open+write, the open
fails, which leads to write failure. The application will know about this
failure.
Change-Id: I4bff1c747c97bb2925d6987f4ced5f1ce75dbc15
BUG: 1294051
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
Added code to unmount an activated snapshot brick during the snapshot
deactivation process, which makes sense as the mount point for deactivated
bricks should not exist.
Removed the code for mounting a newly created snapshot, as newly created
snapshots should not be mounted until they are activated.
Added code for mount point creation and snapshot mounting during snapshot
activation.
Added validation during glusterd init for mounting only those snapshots
whose status is either STARTED or RESTORED.
During snapshot restore, the mount point for a stopped snap should exist,
as it is required to set an extended attribute.
During handshake, after getting updates from a friend, the mount point for
an activated snapshot should exist, and should not for a deactivated
snapshot.
While getting snap status we should show relevant information for
deactivated snapshots; after this patch, the 'gluster snap status' command
will show output like:
Snap Name : snap1
Snap UUID : snap-uuid
Brick Path : server1:/run/gluster/snaps/snap-vol-name/brick
Volume Group : N/A (Deactivated Snapshot)
Brick Running : No
Brick PID : N/A
Data Percentage : N/A
LV Size : N/A
Fixes: #276
Change-Id: I65783488e35fac43632615ce1b8ff7b8e84834dc
BUG: 1482023
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|
In a brick multiplexing environment, if a brick process goes down,
i.e. if we kill it with SIGKILL, only the status of the brick for which
the process originally came up changes to stopped;
all other brick statuses remain started. This happens because
the process was killed abruptly using the SIGKILL signal, so the signal
handler wasn't invoked and further cleanup wasn't triggered.
When we try to start a volume using force, it shows an error saying
"Request timed out", since all the brickinfo->status values are still in
the started state: we're waiting for one of the brick processes to come up,
which is never going to happen since the brick process was killed.
To resolve this, in the disconnect event, we check all the
processes to see whether the brick which got disconnected belongs to the
process. Once we get the process, we call a function named
glusterd_mark_bricks_stopped_by_proc(), passing the brick_proc_t object as
an argument.
From the glusterd_brick_proc_t we can get all the bricks attached
to that process, but these are duplicated copies. To get the original
brickinfo we read the volinfo from the brick; the volinfo holds the
original brickinfo copies. We change brickinfo->status to
stopped for all the bricks.
Change-Id: Ifb9054b3ee081ef56b39b2903ae686984fe827e7
BUG: 1499509
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
Problem: Test case ./tests/bugs/bug-1371806_1.t is failing.
Solution: Mark test case ./tests/bugs/bug-1371806_1.t as a bad test case.
BUG: 1499663
Change-Id: Icb3f41d23dcc74cce6fde05ca343c158d5f58cdd
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
Problem:
Commit ff075a3d6f9b142911d25c27fd209838782bfff0 disabled loading
client-io-threads for replicate volumes (it was set to on by default in
commit e068c1997314046658dd502e9118dab32decf879) due to performance
issues, but in doing so inadvertently failed to load the xlator even if
the user explicitly enabled the option using the volume set command.
This was despite returning success for the volume set.
Fix:
Modify the check in perfxl_option_handler() and add checks in the volume
create/add-brick/remove-brick code paths, tying it all to
GD_OP_VERSION_3_12_2.
Change-Id: Ib612973a999a7da818cc926f5c2601b1f0794fcf
BUG: 1498570
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
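The option whose explicit setting was being silently ignored; a hedged
example of what now takes effect:

  # With the fix, this actually loads client-io-threads on a replicate
  # volume instead of merely reporting success:
  gluster volume set <volname> performance.client-io-threads on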