| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ibf639695ebd99c11c6960c9be82c0cee71b50744
BUG: 905864
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/4458
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I58d237a3d2f4caa7f3865c2e4899c472f7457450
BUG: 906887
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/4457
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In dht_mkdir_cbk, an EEXIST error is treated like a true error. Because
of this, the following sequence of events can happen, eventually
resulting in GFID mismatch (and possibly leaked locks and hangs
in the presence of replicate).
The issue exists when many clients concurrently attempt creation of a
directory and subdirectory (e.g., mkdir -p /mnt/gluster/dir1/subdir):
0. First mkdir happens by one client on the hashed subvolume. Only
one client wins the race. Others racing mkdirs get EEXIST. Yet
other "laggers" in the race encounter the just-created directory
in lookup() on the hash dir.
1. At least one "lagger" lookup() notices that there are missing
directories on other subvolumes (which the "winner" mkdir is yet
to create), and starts off self-heal of the directory.
2. At least on some subvolumes, self-heal's mkdir wins the race
against the "winner" mkdir and creates the directory first. This
causes the "winner" mkdir to experience EEXIST error on those
subvolumes.
3. On other subvolumes where "winner" mkdir won the race, self-heal
experiences EEXIST error, but self-heal is properly translating
that into a success (but mkdir code path is not -- which is the
bug.)
4. Both mkdir and self-heal assign hash layouts to the just created
directory. But self-heal distributes hash range across N (total)
subvolumes, whereas mkdir distributes hash range across N - M
(where M is the number of subvolumes where mkdir lost the race).
Both clients then "cache" their respective layouts and use them
for all future creates inside the directory (evidence in logs).
5. During the creation of the subdirectory, two clients race again.
Ideally winner performs mkdir() on the hashed subvolume and proceeds
to create other dirs, loser experiences EEXIST error on the hashed
subvolume and backs off. But in this case, because the two clients
have different layout views of the parent directory (because of
different hash splits and assignments), the hashed subvolumes for
the new directory can end up being different. Therefore, both clients
now win the race (they were never fighting against each other on a
common server), assigning different GFIDs to the directory on their
respective (different) subvolumes. Some of the remaining subvolumes
get GFID1, others GFID2.
Conclusion/Fix:
Making mkdir translate EEXIST error as success (just the way self-heal
is already rightly doing) will bring back truth to the design claim
that concurrent mkdir/self-heals perform deterministic + idempotent
operations. This will prevent the differing "hash views" by different
clients and thereby also avoid GFID mismatch by forcing all clients
to have a "fair race", because the hashed subvolume for all will be
the same (and thereby avoiding leaked locks and hangs.)
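The idea of the fix can be sketched as follows. This is a minimal model in Python, not the translator's C code; the function name `fold_mkdir_results` and the dictionary representation of per-subvolume results are hypothetical and only illustrate "EEXIST counts as success".

```python
import errno


def fold_mkdir_results(results):
    """Fold per-subvolume mkdir results into (op_ret, op_errno, layout_subvols).

    `results` maps a subvolume name to 0 on success or to an errno value.
    EEXIST is treated as success, so the layout spans every subvolume on
    which the directory now exists, regardless of who created it.
    """
    layout_subvols = []
    first_error = 0
    for subvol, err in sorted(results.items()):
        if err == 0 or err == errno.EEXIST:
            layout_subvols.append(subvol)
        elif first_error == 0:
            first_error = err
    if first_error:
        return -1, first_error, layout_subvols
    return 0, 0, layout_subvols
```

With this folding, a mkdir that lost the race on some subvolumes still spreads the hash range over all N subvolumes, which matches the layout self-heal would assign.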
Change-Id: I84592fb9b8a3f739a07e2afb23b33758a0a9a157
BUG: 907072
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/4459
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using bind(3) to identify a local address fails when net.ipv4.ip_nonlocal_bind
(i.e., /proc/sys/net/ipv4/ip_nonlocal_bind) is set to 1, because with that
setting bind also succeeds for addresses that are not local.
Change-Id: I7047b6fb94ef0df10b78673fab34dbd169344fec
BUG: 890587
Original-author: JulesWang <w.jq0722@gmail.com>
Signed-off-by: JulesWang <w.jq0722@gmail.com>
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: http://review.gluster.org/4437
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: Id4062799104e5831467ced65a43bfe377b6163f4
BUG: 852147
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/4297
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Made rsp dict available to all glusterd's STAGE/BRICK/COMMIT OP.
Change-Id: I5d825d0670d0f1aa8a0603f2307b3600ff6ccfe4
BUG: 852147
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/4296
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ib4c4794563a5a694fab16f17c642f788399462f6
BUG: 852147
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/4295
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I87e02c95d0b650dab7f9ee86c96b2e09ada50109
BUG: 862834
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/4118
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Iaf119f839cb2113b8f8efb7bf7636d471b6541bf
BUG: 866440
Signed-off-by: Venkatesh Somyajula <vsomyaju@redhat.com>
Reviewed-on: http://review.gluster.org/4385
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the brick is taken down and the hard disk is replaced
and the brick is brought back up, the re-opens of the open-fds
will fail because the file is not present on the brick.
Re-opens are not attempted even if the files are re-created by
self-heal until the brick is brought down after the files are
re-created and brought back up. This is a problem with a VM-store
in a replica-setup. Until the fd is re-opened the writes will
never happen on the brick where the hard-disk is replaced.
To handle this situation gracefully, client xlator is enhanced
to perform finodelk, fxattrop, writev, readv using anonymous fds
if the file is yet to be re-opened. If the fop succeeds then client
xlator attempts re-open.
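The fallback described above can be sketched roughly like this. It is an illustrative Python model, not the client xlator's code; `do_fop`, the `fd_state` dictionary and the "anonymous" marker are hypothetical names.

```python
def do_fop(fd_state, fop, reopen):
    """Run `fop` on the re-opened remote fd if there is one, else anonymously.

    fd_state: dict with key "remote_fd" (None until the fd is re-opened).
    fop:      callable taking a remote fd or the string "anonymous";
              returns True on success.
    reopen:   callable returning a new remote fd; it is attempted only
              after a fop on the anonymous fd has succeeded.
    """
    if fd_state["remote_fd"] is not None:
        return fop(fd_state["remote_fd"])
    ok = fop("anonymous")
    if ok:
        fd_state["remote_fd"] = reopen()
    return ok
```

The point of the ordering is that a successful anonymous fop proves the file exists on the brick again, so the re-open has a chance to succeed.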
Change-Id: I1cc6d1bbf8227cd996868ab2ed0a57fb05e00017
BUG: 821056
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/4358
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I01caa1b51570359e6e3ffe1ffb7279cbdb0b0c64
BUG: 821056
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/4357
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The default fops use STACK_WIND_TAIL, which winds without creating a new
frame, so the cookie seen in the callback refers to the wrong subvol. To avoid
this problem, we get the subvol by passing the cookie explicitly.
Change-Id: I51ee79b22c89e4fb0b89e9a0bc3ac96c5b469f8f
BUG: 893338
Signed-off-by: Varun Shastry <vshastry@redhat.com>
Reviewed-on: http://review.gluster.org/4388
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
Tested-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- do not attempt lock migration if no locks were ever acquired on
an fd.
- fix fd_lk_ctx_t ref leak during fd migration
- remove spurious fd_unref() (probably added to compensate for
the fd_ref leak in syncop_open_cbk)
- remove @newfdptr out-param which makes fd ref management really
tricky (and currently refs were unmanaged for the out-param).
Instead acquire ref and unref within lock migration function.
Change-Id: I4cc9c451f0df4c051612bd1fa7bef11e801570e4
BUG: 808400
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/4453
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Do not do fd_ref in cbks of the fops which return a fd (such as
open, opendir, create).
Change-Id: Ic2f5b234c5c09c258494f4fb5d600a64813823ad
BUG: 885008
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/4282
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ibdede396c4d6859225937316b7a59a661bcaf9f5
BUG: 764890
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/4422
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When one of the subvolumes is down, the lock request is not attempted
on that subvolume and the code moves on to the next subvolume:
/* skip over children that are down */
while ((child_index < priv->child_count)
&& !local->child_up[child_index])
child_index++;
In the above case, if there are 2 subvolumes and the 2nd subvolume is down
(subvolume 1 from afr's view), then after attempting the lock on the 1st child
(i.e., subvolume 0), child_index is calculated to be 1. But since the 2nd child
is down, child_index is incremented to 2 as per the above logic and the lock
request is STACK_WINDed to the child with child_index 2. Since afr has only 2
children, the child (i.e., the xlator_t pointer) for that child_index will be
NULL. The process crashes when it dereferences the NULL xlator object.
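The overrun can be reproduced with a small model of the skip loop. This is a Python sketch of one plausible guard, not the actual patch; `next_up_child` is a hypothetical helper and the commit message does not spell out the exact form of the fix.

```python
def next_up_child(child_up, start):
    """Return the index of the first child at or after `start` that is up.

    Returns None when the skip loop runs off the end of the child list,
    which is the case that must not be wound to.
    """
    child_index = start
    # skip over children that are down
    while child_index < len(child_up) and not child_up[child_index]:
        child_index += 1
    if child_index == len(child_up):
        return None  # no usable child left; do not index past the array
    return child_index
```

For the scenario in the message, `next_up_child([True, False], 1)` runs past the last child, so the caller has to treat that result as "locking attempt finished" instead of winding.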
Change-Id: Icd9b5ad28bac1b805e6e80d53c12d296526bedf5
BUG: 765564
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/4438
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I5d84ef72615f9d71b4af210976e2449de6e02326
BUG: 888174
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/4446
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Generally inode-write fops do transaction.unwind then
transaction.resume, but writev needs to make sure that
delayed post-op frame is placed in fdctx before unwind
happens. This prevents the race of flush doing the
changelog wakeup first in fuse thread and then this
writev placing its delayed post-op frame in fdctx.
This helps flush make sure all the delayed post-ops are
completed.
Change-Id: Ia78ca556f69cab3073c21172bb15f34ff8c3f4be
BUG: 888174
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/4428
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- initialize xdata in qr_lookup even if it was NULL from top. This
allows qr to do its job even if lookup originated from fuse-resolve.c
- extend test cases to include 1 second delay and retry
- fix bug while checking condition for cached unwind
qr_readv_cached() unwinds if op_ret > 0. Therefore qr_readv()
must wind to subvol only if !(op_ret > 0) (i.e, op_ret <= 0).
- qr_readv_cached() is using uninitialized @conf pointer. Thanks
to Raghavendra Bhat for catching this!
Change-Id: Ifaf2ea2685e452210ef9ba3c2d1f2ab51900650c
BUG: 846240
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/4452
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
| |
read path.
Change-Id: Ieb5d592a987e8681d5ec019da309f75e3b207580
BUG: 858242
Signed-off-by: Raghavendra G <raghavendra@gluster.com>
Reviewed-on: http://review.gluster.org/4204
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I13e3699bd58d53896ae54e1bfafb3cd1c9580c7c
BUG: 905307
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/4443
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
md-cache currently transforms all readdir fops into readdirp fops.
This patch creates the 'force-readdirp' configuration flag to
provide control over this behavior. force-readdirp is enabled by
default to maintain current default behavior.
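As a rough illustration of the flag's effect (a Python model only; `effective_fop` is a hypothetical name and the real option is handled inside md-cache's C code):

```python
def effective_fop(requested_fop, force_readdirp=True):
    """Return the fop that is actually wound for a directory read.

    With force_readdirp enabled (the default), readdir is upgraded to
    readdirp; with it disabled, the requested fop is passed through.
    """
    if requested_fop == "readdir" and force_readdirp:
        return "readdirp"
    return requested_fop
```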
BUG: 903175
Change-Id: Idd70926dec7c271204bdfb11fb052e56d0a39420
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.org/4440
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- peel out 'open behind' functionality into a separate translator
- issue where, if file size had grown by revalidate, data was not flushed
- removed unnecessary acquisition of table->lock (e.g. in qr_lookup())
- keep inode ctx persistent, prune only data (effectively changing the
order of lock acquisition from INODE -> TABLE)
- validation with readdirplus
- use variable size iobufs to simplify cached reads
Change-Id: If1586d0298fd1697ddff9fd7008efb3d286d436a
BUG: 846240
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/4403
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
entrylk
When the expected lock count is equal to the attempted lock count, make
sure the lock type is checked properly before deciding that the lock has
failed on all the nodes.
Change-Id: I1f362d54320cb6ec5654c5c69915c0f61c91d8c7
BUG: 765564
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/4436
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: Id3bd0bfc4802c166f7a32b0cc6a726aeb5617b5d
BUG: 890618
Signed-off-by: JulesWang <w.jq0722@gmail.com>
Reviewed-on: http://review.gluster.org/4427
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a minor latency optimization to the readdirp path in
storage/posix. During a recursive list, we hit this codepath with
an empty list once per high-level directory to read when end of
directory is reached. Skip constructing hpath, since we don't do
anything with it in this case.
BUG: 903175
Change-Id: I98d7c65505205d55575f064b1e982700f1320cc0
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.org/4432
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As of http://review.gluster.org/2828, the blocking lock code
path's condition for checking completion of locking attempt is
broken. The condition -
if ((child_index == priv->child_count) || ...)
and
if ((child_index == priv->child_count) && ...)
which is retained to check completion of blocking lock attempts
for DATA/METADATA transaction will _always_ fail because a few
lines above we have -
child_index = cookie % priv->child_count;
So child_index will never equal priv->child_count. This leaves
the correctness at the mercy of the next part of the
conditional -
.. (int_lock->lock_count == int_lock->lk_expected_count) ..
This "works" as long as no server went down during the transaction.
If a server goes down in the middle of the transaction, then this
condition also fails, and the code wraps around and starts a
blocking lock attempt loop all the way again from the first server.
This results in double locks getting acquired on those servers, and
eventually the second condition gets hit (first condition is _never_
hit) and we come out of locking phase.
During unlock phase we perform only one unlock per server leaving the
other lock "leaked" forever.
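The arithmetic behind "will _always_ fail" can be checked with a short Python sketch. The helper name `reachable_child_indexes` is hypothetical; only the expression `cookie % child_count` comes from the message above.

```python
def reachable_child_indexes(child_count, cookies):
    """Return the set of child_index values produced by cookie % child_count."""
    return {cookie % child_count for cookie in cookies}


child_count = 2
indexes = reachable_child_indexes(child_count, range(0, 3 * child_count))
print(sorted(indexes))         # [0, 1]
print(child_count in indexes)  # False
```

Since the remainder is always strictly less than the divisor, a test of the form `child_index == child_count` can never be true after that assignment.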
Change-Id: I7189cdf3f70901b04647516fe1d1e189f36cc8dd
BUG: 765564
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/4433
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The earlier logic used to check if (layout-spread-count <= subvol_cnt -
decommissioned bricks). With this, if a subvol was down and layout-spread was
greater than the number of up subvols, a mkdir ended up creating holes in the
layout.
The fix is to consider only the combination of subvols which are usable
(neither down nor decommissioned).
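A simplified model of the corrected check, in Python. The helper names are hypothetical, and taking the first `count` usable subvolumes is an assumption made for brevity; the real dht code chooses layout members differently.

```python
def usable_subvols(subvols, down, decommissioned):
    """Subvolumes that are neither down nor decommissioned, in order."""
    return [s for s in subvols if s not in down and s not in decommissioned]


def spread_targets(layout_spread, subvols, down=(), decommissioned=()):
    """Pick at most `layout_spread` subvolumes, counting only usable ones."""
    usable = usable_subvols(subvols, down, decommissioned)
    count = min(layout_spread, len(usable))
    return usable[:count]
```

Clamping against the usable set, instead of against "all minus decommissioned", is what keeps a down subvolume from being assigned a hash range it cannot serve.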
Change-Id: I61ad3bcaf4589f5a75f7887cfa595c98311ae3bb
BUG: 902610
Signed-off-by: shishir gowda <sgowda@redhat.com>
Reviewed-on: http://review.gluster.org/4412
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I85b22db5cc456b3e8c9f26c8254f08a796fc2b28
BUG: 903336
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/4418
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
for passing the build with -pedantic flag
Change-Id: I80fd9528321e4c6ea5bec32bf5cdc54cc4e4f65e
BUG: 875913
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Reviewed-on: http://review.gluster.org/4186
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* There are up to 3 entry lockees that may be needed to perform
entrylk'ing in posix dir-write operations.
* For example, rmdir ("/a/b") needs to acquire locks on two entities,
- entrylk ("/a", "b")
- entrylk ("/a/b", null)
* Changed existing entrylk/rename/selfheal (entrylk) transactions
to use the new book-keeping structures
* Fixed a few issues in afr_trace_entry_lk{in,out} functions. Tracing is now
aware of the new entry lockee structure.
Implementation notes:
* Changed 'cookie' sent in stack_wind to encode lockee_entity_no
and subvol_no.
cookie is a non-negative integer such that 0 <= cookie < replica_count,
When more than one lock is being acquired across the subvolumes,
cookie % replica_count gives the subvol_no
cookie / replica_count gives the lockee_entity_no.
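The two decoding rules above imply an encoding of the following shape. This is a Python sketch; the function names are hypothetical and the encoding formula is inferred from the modulo/division rules, not quoted from the patch.

```python
def encode_cookie(lockee_entity_no, subvol_no, replica_count):
    """Pack the lockee number and subvolume number into one integer."""
    return lockee_entity_no * replica_count + subvol_no


def decode_cookie(cookie, replica_count):
    """Inverse of encode_cookie: returns (lockee_entity_no, subvol_no)."""
    lockee_entity_no, subvol_no = divmod(cookie, replica_count)
    return lockee_entity_no, subvol_no
```

With a single lockee (lockee_entity_no 0) the cookie is just the subvolume number and stays below replica_count; additional lockees shift it up by whole multiples of replica_count.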
Change-Id: Idbf41803387a7d59a0f7fcb1453d91cea74da153
BUG: 765564
Signed-off-by: Krishnan Parthasarathi <kp@gluster.com>
Reviewed-on: http://review.gluster.org/2828
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Leaving option frame-work un-changed for backward compatibility.
Change-Id: I40bce1ec360801307e67f09e53b0721f64efab37
BUG: 886998
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/4309
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ic1c9559aec59c1fb9dfede4aba8895f3b86f32f1
BUG: 861015
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/4098
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I0db00b7334bb9707ab48bd661ac03a3ad818d6e4
BUG: 893458
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/4393
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
glusterd should not hang if gsyncd ends up in some weird state
Change-Id: Ic141daa0cd05d515848c8b6c25702418e15b7599
BUG: 826512
Signed-off-by: Csaba Henk <csaba@redhat.com>
Reviewed-on: http://review.gluster.org/3919
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* warnings on 'void *' arguments
* warnings on empty initializations
* warnings on empty array (array[0])
Change-Id: Iae440f54cbd59580eb69f3ecaed5a9926c0edf95
BUG: 875913
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/4219
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
When "gluster volume heal <volname> info" is executed, crawl's
process_entry does not populate the iatt structure, so the
iatt's gfid will be empty and the inode_links fail.
Fix:
inode_link should be done only after lookup i.e. when heal is
performed. So moved the inode_link related code to just after
the lookup which is triggered when self-heal is done.
Tests:
The testcase that gives this issue does not give the inode-link
failures anymore. glustershd heal, info commands are working as
expected.
Wrote basic automation tests for proactive-self-heal-daemon
https://github.com/pranithk/gluster-tests/blob/master/afr/proactive-self-heal.sh
Change-Id: Ic112bf104a4d553a64d3d8559f681a25ae1a5362
BUG: 861015
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/4090
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we follow a linkfile and the lookup returns an ENOTCONN error, return
the error: the cached subvol is down, so lookup_everywhere won't succeed and
instead ends up clearing the linkfile and the namespace.
Change-Id: I772bf71531bc646e8fb62d3e8549a5fe0f3896da
BUG: 893378
Signed-off-by: shishir gowda <sgowda@redhat.com>
Reviewed-on: http://review.gluster.org/4383
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I04c0dd23bc5bc34fd9d7bddb11beeecb8e7e2a49
BUG: 853842
Signed-off-by: Shireesh Anjal <sanjal@redhat.com>
Reviewed-on: http://review.gluster.org/4398
Reviewed-by: Anand Avati <avati@redhat.com>
Tested-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
When eager-lock is disabled, inodelks for write-fops on same
fd conflict with each other. If eager-lock is disabled but
delayed post-op is enabled then each write fop's inodelk unlock
waits for post-op-delay-secs. So the conflicting write fop
acquires inodelk after post-op-delay-secs. This results in
post-op-delay-secs delay for every write fop on the fd for
sequential writes (Ex: dd).
Fix:
Disable delayed-post-op when eager-lock is off.
Change-Id: I87ea4c8d1c7bb269b9b174388ae50f37e82629b7
BUG: 895235
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/4391
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes log changes mostly in the op state machine and also
in the volume stop codepath of glusterd.
Changes made:
* Moved log level from INFO to DEBUG, of log messages on the various
state transitions within a transaction.
For example, messages of the following kind:
a. "Sent op req to <n> peers"
b. "Received LOCK from uuid: <peer-uuid>", etc.
* Changed some of the log messages to give as much information as
available in case of failure.
* Added logs to identify on which machine lock/stage/commit failed.
* Quite a few s/THIS/this changes.
Also, with this change, log changes in all other volume ops
should (hopefully) boil down to modifying the respective logs in
handler, stage and commit (and brick ops in some cases).
Change-Id: I2b8443042b07fb41a1d12033741f7e156aa6b3da
BUG: 812356
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/4382
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Afr prevents opens on a file in split-brain but the
fd that is already open still has the capability to perform
both reads and writes to the file.
Fix:
Fail readvs on a file with EIO.
Change-Id: I8e07f24c36fab800499b36ab374f984b743332cd
BUG: 873962
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/4199
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
See https://bugzilla.redhat.com/show_bug.cgi?id=895656
https://bugzilla.redhat.com/show_bug.cgi?id=764679 (GLUSTER-2947)
https://bugzilla.redhat.com/show_bug.cgi?id=764623 (GLUSTER-2891)
The comments in the bzs are a bit obtuse and/or vague. As near as I
can make out we had, for a while, a "convenience symlink" to or from
/usr/local/libexec/gsyncd, which no longer exists.
And, lacking any comments in the code, I gather this is some sort of
fallback or failsafe logic: if the first, normal attempt to invoke gsyncd
fails then an attempt is made to ssh to the box and invoke it.
In any event, there's nothing in /usr/local/... so it's unquestionably
wrong to try to invoke anything there.
BUG: 895656
Change-Id: I3b7ac7a049b91ce101b930599294830147cc60ad
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/4392
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Joe Julian <joe.julian.prime@gmail.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The most important errno logic historically only prioritized ESTALE
over ENOENT. Commit c8c0942d added EIO prioritization over ENOENT
to ensure that split-brain was reported when it occurs in
conjunction with bricks missing the file entry. The unintended side
effect of this change is that (non split-brain) EIO errors reported
from the bricks themselves are now reported to the client when the
expectation is that afr should squash said errors in favor of
marking the file inconsistent.
The high-level problem is that EIO is overloaded with different
meanings from different contexts. This commit adds an eio parameter
to the errno priority logic to conditionally flag when EIO is of
higher priority and should be propagated to the client.
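A sketch of the priority logic with the extra flag, in Python. The function name and argument order here are illustrative assumptions; the description above only says that an eio parameter decides whether EIO outranks ENOENT.

```python
import errno


def most_important_errno(old_errno, new_errno, eio):
    """Return the higher-priority errno of the two.

    ESTALE always wins. EIO wins over ENOENT only when `eio` is true,
    i.e. when the caller knows EIO signals split-brain and not a brick
    error. Otherwise the newest error is kept.
    """
    if errno.ESTALE in (old_errno, new_errno):
        return errno.ESTALE
    if eio and errno.EIO in (old_errno, new_errno):
        return errno.EIO
    if errno.ENOENT in (old_errno, new_errno):
        return errno.ENOENT
    return new_errno
```

Callers that see EIO from a brick would pass a false flag so that the brick-level error does not mask ENOENT and is not propagated as if it were split-brain.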
BUG: 892730
Change-Id: Ib692a8a1f1737ef190d57894f392ec53ffb33aab
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.org/4376
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In cases where the servers use virtual IPs, this commit
makes sure we use them and not the physical IP.
This change also refactors code around nlm4_establish_callback
by sending granted msg only after a connection establishment,
and removing the separate thread creation.
Change-Id: I087362c547a25aa52ef7fc6653845a3863466ee6
BUG: 888283
Signed-off-by: Rajesh Amaravathi <rajesh@redhat.com>
Reviewed-on: http://review.gluster.org/4326
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Make use of event-history in debug/trace xlator to dump the recent fops,
when statedump is given. trace xlator saves the fop it received along
with the time in the event-history and upon statedump signal, dumps its
history. The size of the event-history can be given as a xlator option.
* Make changes in trace to take logging into log-file or logging to
history as an option. By default both are off.
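The event-history behaviour described above resembles a fixed-size ring buffer. A minimal Python sketch follows; the class and method names are hypothetical and unrelated to the actual event-history API in libglusterfs.

```python
from collections import deque
import time


class EventHistory:
    """Keep only the most recent `size` fops, each with a timestamp."""

    def __init__(self, size):
        # deque with maxlen silently drops the oldest entry when full
        self._events = deque(maxlen=size)

    def record(self, fop, when=None):
        """Store a fop name with the given time, or the current time."""
        self._events.append((time.time() if when is None else when, fop))

    def dump(self):
        """Return the retained events, oldest first (the 'statedump')."""
        return list(self._events)
```

The bounded size is what makes it cheap to leave history collection enabled: memory use does not grow with the number of fops seen.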
Change-Id: I12baee5805c6efb55735cead4e2093fb94d7a6a0
BUG: 797171
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/4088
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
afr_more_important_error() is written to return whether a new errno
should override an existing errno for high-level operations that
could span multiple sub-operations. It specifically prioritizes
ESTALE over EIO over ENOENT, and otherwise defaults to the latest
error passed having priority.
This change preserves current behavior, but rewrites the logic to
return the higher priority error of the existing and new errno. The
purpose of the change is to make the logic a bit more clear and set
the stage for future changes to make the logic flexible based on
context.
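The rewritten logic, returning the higher-priority error of the two, could look roughly like this in Python. The name `higher_priority_errno` and the tuple constant are illustrative, not the function's real signature.

```python
import errno

# Highest priority first, as described: ESTALE over EIO over ENOENT.
ERRNO_PRIORITY = (errno.ESTALE, errno.EIO, errno.ENOENT)


def higher_priority_errno(old_errno, new_errno):
    """Return whichever errno ranks higher; default to the newest one."""
    for err in ERRNO_PRIORITY:
        if old_errno == err or new_errno == err:
            return err
    return new_errno
```

Returning the winning errno, instead of a yes/no "should override" answer, lets later changes adjust the ranking in one place.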
BUG: 892730
Change-Id: Id1aa48855dfb0507abc9d1ef22f2259b30472576
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.org/4375
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. Used local->postparent (contains merged iatt of all successful calls) instead
of postparent for dht ctx time update.
2. dht_inode_ctx_time_update avoided in case of opret -1.
Change-Id: Ie04a7842a41c241f911b6a3f76267b996d27fb43
BUG: 881013
Signed-off-by: Varun Shastry <vshastry@redhat.com>
Reviewed-on: http://review.gluster.org/4338
Reviewed-by: Shishir Gowda <sgowda@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
When fop fails post-op is always performed
over the network irrespective of whether pre-op is piggybacked
or not. Decrementing Pre-op-done count even for the piggybacked
ones is wrong.
I have added an assert for pre_op_done to be non-zero and when
dd of=a if=/dev/urandom bs=5M count=1000 is executed and a brick
is taken down, the mount is crashing.
Fix:
Decrement pre-op-done count only when the post-op is not
piggybacked.
Change-Id: Ie837251a43bfb437f0fada191302eeee60be1601
BUG: 863939
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/4310
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Iebfa6770a688e89c051666b46977862188061738
BUG: 802417
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: http://review.gluster.org/4034
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|