| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Just after replace-brick the mount logs are filled with ENOENT/ESTALE
warning logs because the file is yet to be self-healed now that the
brick is new.
Fix:
Do conditional logging for the logs. ENOENT/ESTALE will be logged at
lower log level. Only when debug logs are enabled, these logs will
be written to the logfile.
Change-Id: If203d09e2479e8c2415ebc14fb79d4fbb81dfc95
BUG: 1151303
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/8918
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Thanks a lot to Niels for helping me to get build stuff right.
Change-Id: I634f24d90cd856ceab3cc0c6e9a91003f443403e
BUG: 1147462
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/6529
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
All the special cases v1 handles and also
self-accusing pending changelog from v1 pre-op also is handled
in this patch.
Change-Id: Ie10f71633fb20276f01ecafbd728f20483e7029c
BUG: 1128721
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/8536
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to POSIX, seekdir() should only be given offset obtained from
telldir() on the same DIR *
http://pubs.opengroup.org/onlinepubs/9699919799/functions/seekdir.html
Code from afr-self-heald.c and index.c is operating outside of the
specification, by doing using seekdir() with offset from a previously
open/close/re-open directory. This seems to work on Linux (although with
no guarantee it will always in the future). On NetBSD the seekdir()
with a in invalid offset is a nilpotent operation, and causes an infinite
loop, since index_fill_readdir() always restart from the beginning of the
directory.
The situation is fixed by using a non anonymous fd in afr-self-heald.c:
we explicitely open the directory so that it remains open on the brick
side during the timeframe where we want to reuse offsets in seekdir().
This requires adding an opendir fop in index xlator.
If the brick was not updated, the opendir will fail and we fallback
to the standard violating approach for backward compatibility on Linux.
On other systems we fail since it never worked.
While there, add tests to check seekdir() success in index and posix
xlators, so that incorrect usage from calling code produce an explicit
error instead of an infinite loop. We can only do it on non Linux systems,
for the sake of backward compatibility when the brick was updated but
not the client.
BUG: 1129939
Change-Id: I88ca90acfcfee280988124bd6addc1a1893ca7ab
Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-on: http://review.gluster.org/8760
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
File goes into split-brain because of wrong erasing of xattrs.
RCA:
The issue happens because index self-heal is triggered even before all the
bricks are up. So what ends up happening while erasing the xattrs is, xattrs
are erased only on the sink brick for the brick that it thinks is up leading to
split-brain
Example:
lets say the xattrs before heal started are:
brick 2:
trusted.afr.vol1-client-2=0x000000020000000000000000
trusted.afr.vol1-client-3=0x000000020000000000000000
brick 3:
trusted.afr.vol1-client-2=0x000010040000000000000000
trusted.afr.vol1-client-3=0x000000000000000000000000
if only brick-2 came up at the time of triggering the self-heal only
'trusted.afr.vol1-client-2' is erased leading to the following xattrs:
brick 2:
trusted.afr.vol1-client-2=0x000000000000000000000000
trusted.afr.vol1-client-3=0x000000020000000000000000
brick 3:
trusted.afr.vol1-client-2=0x000010040000000000000000
trusted.afr.vol1-client-3=0x000000000000000000000000
So the file goes into split-brain.
Change-Id: I1185713c688e0f41fd32bf2a5953c505d17a3173
BUG: 1142601
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/8755
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Sending appropriate return value from afr_selfheal()
fixes the issue.
Credits : Krutika Dhananjay and Pranith Kumar.
Change-Id: I01dbd49476f6bfbd02028fdde1f60cc0324a1e39
BUG: 1146812
Signed-off-by: Anuradha <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/8868
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I723b65d6fcae5bd39c0daece3b2f48baf0a72166
BUG: 1145471
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/8875
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I6618b7b2df0f88a3e6252f9136135cf402147a37
BUG: 1134221
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/8852
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Original reporter of the bug & designer of the solution:
Pranith Kumar K <pkarampu@redhat.com>
Change-Id: I9ed89aa92e4cd0f8049f5f6c7a3701e52989ae5e
BUG: 1128721
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/8837
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Added logging for metadata and data self-heals which helped
in debugging this issue.
- Added checks to skip self-heals when no sinks are available to heal
Change-Id: I0d50dceb84cd9ad4fe00e0b749ddf7d4ff42348a
BUG: 1128721
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/8709
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
AFR_STACK_RESET previously didn't cleanup afr_local_t,
leading to memory leaks. With this patch, cleanup is
done.
All credit goes to Pranith Kumar Karampuri.
Change-Id: I3c727ff4bb323dccb81da4b3168ac69bb340d17d
BUG: 1145471
Signed-off-by: Anuradha <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/8821
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ie27219a376382e2455a0fcc094f8b7eb243738ae
BUG: 1140613
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/8816
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
When self-heal code doesn't see at least 2 successes on looking up children,
then self-heal can't be done. What is happening now is if all the lookups fail
then the pending changelog is all zeros in xattrs so all the children are
becoming sources and leading to crashes when the code paths further assume that
some data structures are populated properly
Fix:
Don't proceed with self-heals when < 2 children succeed lookups.
BUG: 1128721
Change-Id: Iffdf0feebb6f98812d9d01cdd0cf97f3e19ba76f
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/8698
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Index xlator removes the index file from indices
xattrop directory in case the value for keys sent
are zero.
If all the required keys are not set by afr
then index file might be removed in an invalid
way.
With this change all the keys required by index
xlator are set by afr such that invalid removal of
files does not occur.
Change-Id: Idbed0764a95157fd5cab8d6685057a43788fc7df
BUG: 1139230
Signed-off-by: Anuradha <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/8652
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
When one of the brick is taken down and brough back up in a replica pair, locks
on that brick will be allowed. Afr returns inodelk success even when one of the
bricks already has the lock taken.
Fix:
If any brick returns EAGAIN return failure to parent xlator.
Change-Id: I5b842d0fc094359cc4231494053d2bfeb606bbbe
BUG: 1141539
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/8710
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Detect and heal mismatching user extended attributes during lookup.
'Forward' port of http://review.gluster.org/#/c/7444/
Change-Id: Id03c9746f083ffd3014711d0b3a2e5a71a45eed4
BUG: 1134691
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/8558
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Based on type of file, set appropriate pending changelogs
for new entries.
Change-Id: Ifd124bf9bc54b996ce83ab9f39d03b3ccca7eb3c
BUG: 1130892
Signed-off-by: Anuradha <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/8555
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Fixed a bug in afr_selfheal_name_type_mismatch_check()
* Fixed indentation
* Removed redundant 'continue' statements
Change-Id: Ie58b5dec9085ce9fe46ae9f244ebae1b1cef7ac5
BUG: 1132469
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/8586
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Original author of the test script:
Pranith Kumar K <pkarampu@redhat.com>
Change-Id: If515ecefd3c17f85f175b6a8cb4b78ce8c916de2
BUG: 1132469
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/8574
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
dict_t objects that are ref'd in alloca'd "replies" in
afr_replies_copy() are not unref'd after "replies" go out of scope.
Change-Id: Id5a6ca3c17a8de72b94b3e0f92165609da5a36ea
BUG: 1134221
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/8553
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Counts may give wrong results when the number of bricks is > 2. If the
locks are acquired on one source and sink, but the source accuses even the
down sink then there will be 2 sinks and lock is acquired on 2 bricks so
even when there is a clear source and sink **_finalize_source functions think
the file/directory is in split-brain.
Fix:
Check that all the bricks which are locked are sinks.
Change-Id: Ia43790e8e1bfb5e72a3d0b56bcad94abd0dc58ab
BUG: 1128721
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/8456
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ie55b3d37cacbdad74a3f63c3b0f025b0ffd0104a
BUG: 1132461
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/8514
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Allowing lookup with 'gfid-req' will lead to assigning gfid at posix layer.
When two mounts perform lookup in parallel that can lead to both bricks getting
different gfids leading to gfid-mismatch/EIO for the lookup.
Fix:
Perform gfid heal inside lock.
BUG: 1129529
Change-Id: I20c6c5e25ee27eeb906bff2f4c8ad0da18d00090
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/8512
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ib6eea2dfe43aacf1f3446cc023adecbcf8645d48
BUG: 1132102
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/8506
Reviewed-by: Anuradha Talur <atalur@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
local->xattr_req is already reffed with xattr_req that comes
in lookup fop. But when afr_lookup_xattr_req_prepare is called
local->xattr_req is over-written with dict_new() which leads
to ref leak on the dict which came in lookup fop
Fix:
Create local->xattr_req only when it is NULL
Change-Id: Ib1548f2df97688859f2cace44b93b3b733297c36
BUG: 1128801
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/8457
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ie792875546b4f8e11ebf870a6e63aaf5cb221976
BUG: 1121920
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/8344
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes changelog capturing internal FOPs in a cascaded
setup, where the intermediate master would record internal FOPs
(generated by DHT on link()/rename()). This is due to I/O happening
on the intermediate slave on geo-replication's auxillary mount with
client-pid -1. Currently, the internal FOP capturing logic depends
on client pid being non-negative and the presence of a special key
in dictionary. Due to this, internal FOPs on an inter-mediate master
would be recorded in the changelog. Checking client-pid being
non-negative was introduced to capture AFR self-heal traffic in
changelog, thereby breaking cascading setups. By coincidence,
AFR self-heal daemon uses -1 as frame->root->pid thereby making
is hard to differentiate b/w geo-rep's auxillary mount and self-heal
daemon.
Change-Id: Ib7bd71e80dd1856770391edb621ba9819cab7056
BUG: 1122037
Original-Author: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Kotresh H R <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/8347
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
afr does itransform by taking the list of entries given by client xlator
to separate list but doesn't free that list.
Change-Id: Ibb4d38b4934b2bb924385c88f9d7942fad933cb9
BUG: 1117243
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/8261
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Changed the description of AFR_MSG_SUBVOL_UP to make it more meaningful.
Change-Id: I30fa13c2e9a280a22d48e777d259d04a3b71deef
BUG: 1075611
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/8149
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problems with fuse/server:
Fuse loc touch up sets loc->name even when pargfid
is not known. Server lookup does (pargfid, name) based
lookup when name is set ignoring the gfid. Because of this server
resolver finds that the lookup came on (null-pargfid, name) and
fails the lookup with EINVAL.
Fix:
Don't set loc->name in loc_touchup if the pargfid is not known.
Did the same even for server-resolver
Problem with afr:
Lets say there is a directory hierarchy a/b/c/d on the mount and the
user is cd'ed into the directory. Bring down one of the bricks of replica and
remove all directories/files to simulate disk replacement on that brick. Now
this brick is brought back up. Creates on the cd'ed directory fail with ESTALE.
Basically before sending a create of 'f' inside 'd', fuse sends a lookup to
make sure the file is not present. On one of the bricks 'd' is present and
'f' is not so it sends ENOENT as response. On the new brick 'd' itself is not
present. So it sends ESTALE. In afr ESTALE is considered to be special errno on
witnessing which lookup has to fail. And ESTALE is given more priority than
ENOENT. Due to these reasons lookup fails with ESTALE rather than ENOENT. Since
lookup didn't fail with ENOENT, 'create' can't be issued so the command is
failed with ESTALE.
Solution:
Afr needs to consider ESTALE errno normally and ENOENT needs to
be given more priority so that operations like create can proceed even when
only one of the brick is up and running. Whenever client xlator identifies
that gfid-changed, it sets that information in lookup xdata. Afr uses this
information to fail the lookup with ESTALE so that top xlator can send
fresh lookup.
Change-Id: Ica6ce01baef08620154050a635e6f97d51029ef6
BUG: 1106408
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/8015
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I7ec2821c96d4cbff78b4959d9b07019896cc9a2c
BUG: 1075611
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/7840
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Iecfd3150e4f4e795e3403bcb1ac56340759a37d0
BUG: 1098027
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/7766
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change important (from a diagnostics point of view) log messages to use
the gf_msg() framework.
Change-Id: I0a58184bbb78989db149e67f07c140a21c781bc2
BUG: 1075611
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/7784
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Have common place to perform quorum fop wind check
- Check if fop succeeded in a way that matches quorum
to avoid marking changelog in split-brain.
BUG: 1066996
Change-Id: Ibc5b80e01dc206b2abbea2d29e26f3c60ff4f204
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/7600
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I635fc0fa955b33590f1c5b4dfec22d591ea8575c
BUG: 1032894
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/6592
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
If two self-heals are triggered on same inode in
parallel then one inode will be linked and the other
inode will not be linked as an inode with that gfid
is already linked in inode table. Calling inode-forget
on that inode leads to assert failure.
Fix:
Always use linked inode for performing self-heal.
Added inode-forgets in other places as well even though
its not really a memory leak.
Change-Id: Ib84bf080c8cb6a4243f66541ece587db28f9a052
BUG: 1091597
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/7567
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
git@forge.gluster.org:~schafdog/glusterfs-core/osx-glusterfs
Working functionality on MacOSX
- GlusterD (management daemon)
- GlusterCLI (management cli)
- GlusterFS FUSE (using OSXFUSE)
- GlusterNFS (without NLM - issues with rpc.statd)
Change-Id: I20193d3f8904388e47344e523b3787dbeab044ac
BUG: 1089172
Signed-off-by: Harshavardhana <harsha@harshavardhana.net>
Signed-off-by: Dennis Schafroth <dennis@schafroth.com>
Tested-by: Harshavardhana <harsha@harshavardhana.net>
Tested-by: Dennis Schafroth <dennis@schafroth.com>
Reviewed-on: http://review.gluster.org/7503
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ia26e17a7147ed825319c7c29880b9cf4ae80a48c
BUG: 1085259
Signed-off-by: Vijay Bellur <vbellur@redhat.com>
Reviewed-on: http://review.gluster.org/7416
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I869d191dc3470b2208c17343bbf772f01ef744cb
BUG: 1085511
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/7424
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I516f4fb0237dd0b3e512117bf987cea69f8678b8
BUG: 1084485
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/7407
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
| |
BUG: 1084485
Change-Id: I89ddf10add041638ef70baebbce0ec2807ef4b6d
Signed-off-by: Justin Clift <justin@gluster.org>
Reviewed-on: http://review.gluster.org/7402
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
For write fops afr's transaction eager-lock init adds transactions
that can share eager-lock to fdctx list. But if eager-lock finodelk
fop fails the stub remains in the list. This could later lead to
corruption of the list and lead to infinite loop on the list
leading to a mount hang.
Fix:
Remove the stub when finodelk fails.
Change-Id: I0ed4bc6b62f26c5e891c1181a6871ee6e4f4f5fd
BUG: 1063190
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/6944
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Fix boundary condition for offset
- Honour data-self-heal-algorithm option
- Added tests for sparse file self-healing
Change-Id: I14bb1c9d04118a3df4072f962fc8f2f197391d95
BUG: 1080707
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/7339
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Remove client side self-healing completely (opendir, openfd, lookup)
- Re-work readdir-failover to work reliably in case of NFS
- Remove unused/dead lock recovery code
- Consistently use xdata in both calls and callbacks in all FOPs
- Per-inode event generation, used to force inode ctx refresh
- Implement dirty flag support (in place of pending counts)
- Eliminate inode ctx structure, use read subvol bits + event_generation
- Implement inode ctx refreshing based on event generation
- Provide backward compatibility in transactions
- remove unused variables and functions
- make code more consistent in style and pattern
- regularize and clean up inode-write transaction code
- regularize and clean up dir-write transaction code
- regularize and clean up common FOPs
- reorganize transaction framework code
- skip setting xattrs in pending dict if nothing is pending
- re-write self-healing code using syncops
- re-write simpler self-heal-daemon
Change-Id: I1e4080c9796c8a2815c2dab4be3073f389d614a8
BUG: 1021686
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/6010
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
To be used in afr metadata self-heal
Change-Id: I8dac4b19d61e331702427eeb5b606aab3d20b328
BUG: 1021686
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/6941
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Memory is allocated for pump_priv and for pump_priv->resume_path, but if
an error is detected the references to that memory go out of scope and
the memory is never freed.
This patch assures that the memory is freed on error.
Patchset 2: These are Kaleb's recommended changes which, compared to my
original fix, are more comprehensive and provide a more
complete resolution to the memory leakage bugs in this
function. The bug reported by Coverity was limited to a
single memory allocation.
BUG: 789278
CID: 1124737
Change-Id: Ie239e3b5d28d97308bf948efec6a92f107bc648b
Signed-off-by: Christopher R. Hertel <crh@redhat.com>
Reviewed-on: http://review.gluster.org/6929
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I811d104684905a5a9a794cde8e925bd1a97f6546
BUG: 789278
Signed-off-by: Poornima <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/6906
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ic3c8e179a63e82a4e416aea620796f8bb3236c7c
BUG: 1052759
Signed-off-by: Vijay Bellur <vbellur@redhat.com>
Reviewed-on: http://review.gluster.org/6706
Reviewed-by: Kaushal M <kaushal@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
We found a day-1 bug when syncop_xxx() infra is used inside a synctask with
compilation optimization (CFLAGS -O2).
Detailed explanation of the Root cause:
We found the bug in 'gf_defrag_migrate_data' in rebalance operation:
Lets look at interesting parts of the function:
int
gf_defrag_migrate_data (xlator_t *this, gf_defrag_info_t *defrag, loc_t *loc,
dict_t *migrate_data)
{
.....
code section - [ Loop ]
while ((ret = syncop_readdirp (this, fd, 131072, offset, NULL,
&entries)) != 0) {
.....
code section - [ ERRNO-1 ] (errno of readdirp is stored in readdir_operrno by a
thread)
/* Need to keep track of ENOENT errno, that means, there is no
need to send more readdirp() */
readdir_operrno = errno;
.....
code section - [ SYNCOP-1 ] (syncop_getxattr is called by a thread)
ret = syncop_getxattr (this, &entry_loc, &dict,
GF_XATTR_LINKINFO_KEY);
code section - [ ERRNO-2] (checking for failures of syncop_getxattr(). This
may not always be executed in same thread which executed [SYNCOP-1])
if (ret < 0) {
if (errno != ENODATA) {
loglevel = GF_LOG_ERROR;
defrag->total_failures += 1;
.....
}
the function above could be executed by thread(t1) till [SYNCOP-1] and code
from [ERRNO-2] can be executed by a different thread(t2) because of the way
syncop-infra schedules the tasks.
when the code is compiled with -O2 optimization this is the assembly code that
is generated:
[ERRNO-1]
1165 readdir_operrno = errno; <<---- errno gets expanded
as *(__errno_location())
0x00007fd149d48b60 <+496>: callq 0x7fd149d410c0 <address@hidden>
0x00007fd149d48b72 <+514>: mov %rax,0x50(%rsp) <<------ Address
returned by __errno_location() is stored in a special location in stack for
later use.
0x00007fd149d48b77 <+519>: mov (%rax),%eax
0x00007fd149d48b79 <+521>: mov %eax,0x78(%rsp)
....
[ERRNO-2]
1281 if (errno != ENODATA) {
0x00007fd149d492ae <+2366>: mov 0x50(%rsp),%rax <<----- Because
it already stored the address returned by __errno_location(), it just
dereferences the address to get the errno value. BUT THIS CODE NEED NOT BE
EXECUTED BY SAME THREAD!!!
0x00007fd149d492b3 <+2371>: mov $0x9,%ebp
0x00007fd149d492b8 <+2376>: mov (%rax),%edi
0x00007fd149d492ba <+2378>: cmp $0x3d,%edi
The problem is that __errno_location() value of t1 and t2 are different. So
[ERRNO-2] ends up reading errno of t1 instead of errno of t2 even though t2 is
executing [ERRNO-2] code section.
When code is compiled without any optimization for [ERRNO-2]:
1281 if (errno != ENODATA) {
0x00007fd58e7a326f <+2237>: callq 0x7fd58e797300
<address@hidden><<--- As it is calling __errno_location() again it gets the
location from t2 so it works as intended.
0x00007fd58e7a3274 <+2242>: mov (%rax),%eax
0x00007fd58e7a3276 <+2244>: cmp $0x3d,%eax
0x00007fd58e7a3279 <+2247>: je 0x7fd58e7a32a1
<gf_defrag_migrate_data+2287>
Fix:
Make syncop_xxx() return (-errno) value as the return value in
case of errors and all the functions which make syncop_xxx() will need to use
(-ret) to figure out the reason for failure in case of syncop_xxx() failures.
Change-Id: I314d20dabe55d3e62ff66f3b4adb1cac2eaebb57
BUG: 1040356
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/6475
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Under the entry self heal, readlink is done at the
source and sink. When readlink is done at the sink,
because link is not present at the sink, afr expects
ENOENT. AFR translator takes decisions for new link
creation based on ENOENT but server translator is modified
to return ESTALE because of which afr xlator is not able
to heal.
Fix:
The check for inode absence at server includes ESTALE as
well.
Change-Id: I319e4cb4156a243afee79365b7b7a5a7823e9a24
BUG: 1046624
Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
Reviewed-on: http://review.gluster.org/6599
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|