summaryrefslogtreecommitdiffstats
path: root/xlators/cluster
Commit message (Collapse)AuthorAgeFilesLines
* dht: Add missing braces in dht_opendirPoornima G2017-05-141-1/+2
| | | | | | | | | | | | | | | | | | | | | >Change-Id: I6adce98f52e17953f501bc590ff7189cceac3c31 >BUG: 1431908 >Signed-off-by: Poornima G <pgurusid@redhat.com> >Reviewed-on: https://review.gluster.org/17057 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Vijay Bellur <vbellur@redhat.com> (cherry picked from commit af218797fa98f2f75594fc9ae595f184682f1a0d) Change-Id: I6adce98f52e17953f501bc590ff7189cceac3c31 BUG: 1435942 Reviewed-on: https://review.gluster.org/17285 Tested-by: Raghavendra Talur <rtalur@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
* afr: send the correct iatt values in fsync cbkRavishankar N2017-05-141-25/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: afr unwinds the fsync fop with an iatt buffer from one of its children on whom fsync was successful. But that child might not be a valid read subvolume for that inode because of pending heals or because it happens to be the arbiter brick etc. Thus we end up sending the wrong iatt to mdcache which will in turn serve it to the application on a subsequent stat call as reported in the BZ. Fix: Pick a child on whom the fsync was successful *and* that is readable as indicated in the inode context. > Reviewed-on: https://review.gluster.org/17227 > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit 1a8fa910ccba7aa941f673302c1ddbd7bd818e39) Change-Id: Ie8647289219cebe02dde4727e19a729b3353ebcf BUG: 1444892 RCA'ed-by: Miklós Fokin <miklos.fokin@appeartv.com> Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: https://review.gluster.org/17247 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/ec: fix incorrect answer check in seek fopXavier Hernandez2017-05-111-15/+8
| | | | | | | | | | | | | | | | | | | | | | | | A bad check in the answer of a seek request caused a segmentation fault when seek reported an error. > Change-Id: Ifb25ae8bf7cc4019d46171c431f7b09b376960e8 > BUG: 1439068 > Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> > Reviewed-on: https://review.gluster.org/16998 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Amar Tumballi <amarts@redhat.com> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Change-Id: Ifb25ae8bf7cc4019d46171c431f7b09b376960e8 BUG: 1438813 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: https://review.gluster.org/17232 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>
* cluster/dht: Fix ret checkN Balachandran2017-05-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Fixed an incorrect return code check in the rebalance code. > BUG: 1448640 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: https://review.gluster.org/17197 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit 67598f538efb24a9e5ac561b294a05e707e15761) Change-Id: I60804ff121cec7a2f0419e2ee70dd22ea7533c0c BUG: 1448864 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: https://review.gluster.org/17204 Reviewed-by: MOHIT AGRAWAL <moagrawa@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/dht: Pass the correct xdata in fremovexattr fopKrutika Dhananjay2017-05-011-8/+4
| | | | | | | | | | | | | | | | | | Backport of: > Change-Id: Id84bc87e48f435573eba3b24d3fb3c411fd2445d > BUG: 1440051 > Reviewed-on: https://review.gluster.org/17126 > (cherry-picked from ab88f655e6423f51e2f2fac9265ff4d4f5c3e579) Change-Id: Id84bc87e48f435573eba3b24d3fb3c411fd2445d BUG: 1426508 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: https://review.gluster.org/17134 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
* cluster/dht Remove redundant logs in dht rmdirN Balachandran2017-04-271-8/+7
| | | | | | | | | | | | | | | | | | | | | | | | Removing redundant logs were introduced in https://review.gluster.org/#/c/17065/ > BUG: 1445590 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: https://review.gluster.org/17118 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Susant Palai <spalai@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit 25f0a7b153b30b2c0e8278b0ce11d1199c3fb006) Change-Id: I0d6055488b51a13c91d2121e87f653cdb94888b0 BUG: 1446227 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: https://review.gluster.org/17130 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/dht: Pass the req dict instead of NULL in dht_attr2()Krutika Dhananjay2017-04-273-57/+74
| | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of: > Change-Id: Id7823fd932b4e5a9b8779ebb2b612a399c0ef5f0 > BUG: 1440051 > Reviewed on: https://review.gluster.org/17085 > (cherry-picked from commit d60ca8e96bbc16b13f8f3456f30ebeb16d0d1e47) This bug was causing VMs to pause during rebalance. When qemu winds down a STAT, shard fills the trusted.glusterfs.shard.file-size attribute in the req dict which DHT doesn't wind its STAT fop with upon detecting the file has undergone migration. As a result shard doesn't find the value to this key in the unwind path, causing it to fail the STAT with EINVAL. Also, the same bug exists in other fops too, which is also fixed in this patch. Change-Id: Id7823fd932b4e5a9b8779ebb2b612a399c0ef5f0 BUG: 1426508 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: https://review.gluster.org/17119 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/dht: rm -rf fails if dir has stale linkto filesN Balachandran2017-04-271-45/+203
| | | | | | | | | | | | | | | | | | | | | | | | | | | | rm -rf <dir> fails with ENOENT if dir contains a lot of stale linkto files. This is because a single readdirp is sent as part of the rmdir which would return and delete only as many linkto files on the bricks as would fit in one readdirp buffer. Running rm -rf <dir> multiple times will eventually delete all the files. The fix sends readdirp on each subvol until no more entries are returned. > BUG: 1442724 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: https://review.gluster.org/17065 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit e5f9ba138571bd18226462c49ff6a55f5c3ed3a4) Change-Id: I447f2d193de4bd8ac16e4541c6b919d22250e39e BUG: 1444540 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: https://review.gluster.org/17102 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* dht: The xattrs sent in readdirp should be sent in opendir aswellPoornima G2017-04-271-28/+44
| | | | | | | | | | | | | | | | | | | | | | As readdir-ahead can be loaded as a child of dht, dht has to specify the xattrs it is intrested in, as part of opendir call itself. >Reviewed-on: https://review.gluster.org/16902 >Smoke: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >(cherry picked from commit 0f71338e1d7c0b70f4fe3b19c68612fe730d9de2) Change-Id: I012ef96cc143b0cef942df78aa7150d85ec38606 BUG: 1435942 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: https://review.gluster.org/16947 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* afr: don't do a post-op on a brick if op failedRavishankar N2017-04-271-6/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: In afr-v2, self-blaming xattrs are not there by design. But if the FOP failed on a brick due to an error other than ENOTCONN (or even due to ENOTCONN, but we regained connection before postop was wound), we wind the post-op also on the failed brick, leading to setting self-blaming xattrs on that brick. This can lead to undesired results like healing of files in split-brain etc. Fix: If a fop failed on a brick on which pre-op was successful, do not perform post-op on it. This also produces the desired effect of not resetting the dirty xattr on the brick, which is how it should be because if the fop failed on a brick, there is no reason to clear the dirty bit which actually serves as an indication of the failure. > Reviewed-on: https://review.gluster.org/16976 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> (cherry picked from commit 10dad995c989e9d77c341135d7c48817baba966c) Change-Id: I5f1caf4d1b39f36cf8093ccef940118638caa9c4 BUG: 1443501 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: https://review.gluster.org/17083 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/dht: Modify local->loc.gfid in thread safe mannerPranith Kumar K2017-04-131-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of https://review.gluster.org/16986 Problem: local->loc.gfid in dht_lookup_directory() will be null-gfid for a fresh lookup. dht_lookup_dir_cbk() updates local->loc.gfid while in other thread dht_lookup_directory() is still winding lookup calls to subvolumes so there is a chance of partial gfid being seen by EC. We saw in 12x(4+2) volume, ec is receiving an loc where the gfid has last 10 bytes matching with the gfid of the directory and the first 4 bytes are all-zeros. This is leading to EC erroring out the lookup with EINVAL which leads to NFS failing lookup with EIO. snip from gdb: $37 = (dht_local_t *) 0x7fde5de5b3cc (gdb) p /x $37->loc.gfid $39 = {0x3b, 0x82, 0x10, 0x5e, 0x40, 0x65, 0x43, 0x14, 0xa0, 0xc6, 0x8, 0xf5, 0x6c, 0x2c, 0xb8, 0x56} (gdb) fr 7 state=<optimized out>) at ec-generic.c:837 837 ec_lookup_rebuild(fop->xl->private, fop, cbk); (gdb) p /x fop->loc[0].gfid $40 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x43, 0x14, 0xa0, 0xc6, 0x8, 0xf5, 0x6c, 0x2c, 0xb8, 0x56} snip from log: [2017-01-29 03:22:30.132328] W [MSGID: 122019] [ec-helpers.c:354:ec_loc_gfid_check] 0-butcher-disperse-4: Mismatching GFID's in loc [2017-01-29 03:22:30.132709] W [MSGID: 112199] [nfs3-helpers.c:3515:nfs3_log_newfh_res] 0-nfs-nfsv3: /linux-4.9.5/Documentation => (XID: b27b9474, MKDIR: NFS: 5(I/O error), POSIX: 5(Input/output error)), FH: exportid 00000000-0000-0000-0000-000000000000, gfid 00000000-0000-0000-0000-000000000000, mountid 00000000-0000-0000-0000-000000000000 [Invalid argument] Fix: update local->loc.gfid in last-call to make sure there are no races. >BUG: 1438411 >Change-Id: Ifcb7e911568c1f1f83123da6ff0cf742b91800a0 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> BUG: 1438423 Change-Id: I804822a1d50215301881ac18318282c1a6951cfb Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: https://review.gluster.org/16987 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* build: miscellaneous spelling fixesPatrick Matthäi2017-04-021-1/+1
| | | | | | | | | | | | | | | | | | Debian builds detected spelling issues with GlusterFS 3.10.1. Instead of carrying the patch in the Debian sources, let's include the fixes here too. Change-Id: I38db6adf142f7ec247bffd47aa1e6ff1a0c49e00 Reviewed-on-master: https://review.gluster.org/16973 Reported-by: Patrick Matthäi <pmatthaei@debian.org> BUG: 1437854 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: https://review.gluster.org/16974 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Zhou Zhengping <johnzzpcrystal@gmail.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* cluster/afr: Undo pending xattrs only on the up brickskarthik-us2017-03-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: While doing conservative merge, even if a brick is down, it will reset the pending xattr on that. When that brick comes up, as part of the heal, it will consider this brick as the source and removes the entries on the other bricks, which leads to data loss. Fix: Undo pending only for the bricks which are up. > Change-Id: I18436fa0bb1faa5f60531b357dea3f6b20446303 > BUG: 1433571 > Signed-off-by: karthik-us <ksubrahm@redhat.com> > Reviewed-on: https://review.gluster.org/16913 > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Ravishankar N <ravishankar@redhat.com> (cherry picked from commit f91596e6566c605e70a31a60523d11f78a097c3c) Change-Id: I51dbdc53e84051ec73308df9d4cf27726fc29dc7 BUG: 1436203 Signed-off-by: karthik-us <ksubrahm@redhat.com> Reviewed-on: https://review.gluster.org/16955 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com>
* afr: do not mention split-brain in log message in read_txnRavishankar N2017-03-271-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I am seeing a lot of messages in qe/customer logs where read_txn complains that file is possibly in split-brain because of no readable subvol being found, does inode refresh and then there is no split-brain message post the inode refresh. This means that a lookup was not issued on the indoe to populate 'readable' or it can mean one brick is source for data and the other for metadata, making readable to be zero (because readable=intersection of (data,metadata readable) since commit 7a1c1e290470149696. Since we anyway log actual split-brains post inode-refresh, move this message to DEBUG log level. > Signed-off-by: Ravishankar N <ravishankar@redhat.com> > Reviewed-on: https://review.gluster.org/16879 > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit 71e023fcaab0058f32fedc7b6b702040fdd85f46) Change-Id: Idb88b8ea362515279dc9b246f06b6b646c6d8013 BUG: 1434303 Reviewed-on: https://review.gluster.org/16934 Tested-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/ec: Metadata healing fails to update the versionSunil Kumar Acharya2017-03-241-7/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | During meatadata heal, we were not updating the version though all the inode attributes were in sync. Updated the code to adjust version when all the inode attributes are in sync. >BUG: 1425703 >Change-Id: I6723be3c5f748b286d4efdaf3c71e9d2087c7235 >Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com> >Reviewed-on: https://review.gluster.org/16772 >Smoke: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> BUG: 1434296 Change-Id: I07fe126ebb7dac0b0de0f41cae139790e7eb4b37 Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com> Reviewed-on: https://review.gluster.org/16932 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* afr: restore atime/mtime for non-regular filesRavishankar N2017-03-064-51/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | AFR restores atime/mtime only as a part of data heal. For non-regular files (dirs, symlinks, char/block/socket files etc) which do not undergo data-heal, atime/mtime is not restored. This patch restores atime/mtime as a part of metadata heal for such files. > Reviewed-on: https://review.gluster.org/16844 > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit 804a65f07ea8e2093f781807651d0d07513b2627) Change-Id: Id8da885fc93fdf65c2f4bae2af3605b146ac1f16 BUG: 1429402 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: https://review.gluster.org/16851 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/dht: Fix crash in "nuke-dir" featureKrutika Dhananjay2017-03-041-1/+10
| | | | | | | | | | | | | | | | | | | | | Backport of: https://review.gluster.org/16829 My patch at https://review.gluster.org/16419 is resulting in core dumps everytime I run tests/features/nuke.t. Turns out dht, upon successfully "nuking" a directory, which was initiated through a setxattr, unwinds the operation with rmdir fop signature, resulting in readdir-ahead casting a struct iatt (preparent) to dict_t, leading to a crash. Change-Id: I480f80170c31a6da3ff7809852449e6f44f7d047 BUG: 1428739 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: https://review.gluster.org/16836 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/ec: Don't trigger data/metadata heal on LookupsPranith Kumar K2017-02-275-100/+309
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem-1 If Lookup which doesn't take any locks observes version mismatch it can't be trusted. If we launch a heal based on this information it will lead to self-heals which will affect I/O performance in the cases where Lookup is wrong. Considering self-heal-daemon and operations on the inode from client which take locks can still trigger heal we can choose to not attempt a heal on Lookup. Problem-2: Fixed spurious failure of tests/bitrot/bug-1373520.t For the issues above, what was happening was that ec_heal_inspect() is preventing 'name' heal to happen Problem-3: tests/basic/ec/ec-background-heals.t To be honest I don't know what the problem was, while fixing the 2 problems above, I made some changes to ec_heal_inspect() and ec_need_heal() after which when I tried to recreate the spurious failure it just didn't happen even after a long time. >BUG: 1414287 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Change-Id: Ife2535e1d0b267712973673f6d474e288f3c6834 >Reviewed-on: https://review.gluster.org/16468 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Ashish Pandey <aspandey@redhat.com> BUG: 1419824 Change-Id: I340b48cd416b07890bf3a5427562f5e3f88a481f Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: https://review.gluster.org/16765 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Smoke: Gluster Build System <jenkins@build.gluster.org>
* cluster/ec: Fallback to precompiled codeXavier Hernandez2017-02-213-70/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | When dynamic code generation fails for some reason, instead of causing a failure in encode/decode, fallback to the precompiled version. >Change-Id: I4f8a97d3033aa5885779722b19c6e611caa4ffea >BUG: 1421955 >Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> >Reviewed-on: https://review.gluster.org/16614 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Niels de Vos <ndevos@redhat.com> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> >(cherry picked from commit 9f9d1482868e8e1044790c8358893f4421d89692) Change-Id: I5e6221fce7a77782fd7d961e752a61dd59821a98 BUG: 1421956 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: https://review.gluster.org/16697 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/dht Fix error assignment in dht_*xattr2 functionsN Balachandran2017-02-201-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Corrected the op_errno assignments and NULL checks in the dht_sexattr2 and dht_removexattr2 functions. Earlier, they unwound with the default EINVAL op_errno if the file had been deleted. > Change-Id: Iaf837a473d769cea40132487a966c7f452990071 > BUG: 1421653 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: https://review.gluster.org/16610 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: MOHIT AGRAWAL <moagrawa@redhat.com> > Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> (cherry picked from commit 028626a86ea409f908783b9007c02877f20be43e) Signed-off-by: N Balachandran <nbalacha@redhat.com> Change-Id: Icc53ca52281a533fc0c8eeafb4c889c818c313e1 BUG: 1424921 Reviewed-on: https://review.gluster.org/16679 Tested-by: N Balachandran <nbalacha@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* posix: Fix creation of files with S_ISVTX on FreeBSDXavier Hernandez2017-02-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | On FreeBSD the S_ISVTX flag is completely ignored when creating a regular file. Since gluster needs to create files with this flag set, specialy for DHT link files, it's necessary to force the flag. This fix does this by calling fchmod() after creating a file that must have this flag set. > Change-Id: I51eecfe4642974df6106b9084a0b144835a4997a > BUG: 1411228 > Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> > Reviewed-on: https://review.gluster.org/16417 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Change-Id: I2087516383bd132c59bbab98eda8f2243a2163fe BUG: 1424973 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: https://review.gluster.org/16686 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/afr: Perform new entry mark before creating new entryPranith Kumar K2017-02-164-54/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a chance for the source brick to go down just after the new entry is created and before source brick is marked with necessary pending markers. If after this any I/O happens then new entry will become source and reverse heal will happen. To prevent this mark the pending xattrs before creating the new entry. >BUG: 1417466 >Change-Id: I233b87e694d32e5d734df5a83b4d2ca711c17503 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: https://review.gluster.org/16474 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Ravishankar N <ravishankar@redhat.com> >Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> BUG: 1422942 Change-Id: I460caaa2c6571c4256e50421e48857e3c2769b4d Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: https://review.gluster.org/16646 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/ec: Change log level of messages to DEBUGSunil Kumar Acharya2017-02-161-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | Heal failed or passed should not be logged as info. These can be observed from heal info if the heal is happening or not. If we require to debug a case where heal is not happening, we can set the level to DEBUG. >Change-Id: I062668eadd145ef809b25e818e6bca1094f54cd6 >BUG: 1420619 >Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com> >Reviewed-on: https://review.gluster.org/16580 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Ashish Pandey <aspandey@redhat.com> Change-Id: I96517f9cf39340ae6883115e464b62339577fc5d BUG: 1422766 Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com> Reviewed-on: https://review.gluster.org/16635 Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>
* afr: all children of AFR must be up to resolve s-brainRavishankar N2017-02-153-15/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: The various split-brain resolution policies (favorite-child-policy based, CLI based and mount (get/setfattr) based) attempt to resolve split-brain even when not all bricks of replica are up. This can be a problem when say in a replica 3, the only good copy is down and the other 2 bricks are up and blame each other (i.e. split-brain). We end up healing the file in such a case and allow I/O on it. Fix: A decision on whether the file is in split-brain or not must be taken only if we are able to examine the afr xattrs of *all* bricks of a given replica. Signed-off-by: Ravishankar N <ravishankar@redhat.com> > Reviewed-on: https://review.gluster.org/16476 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> (cherry picked from commit 0e03336a9362e5717e561f76b0c543e5a197b31b) Change-Id: Icddb1268b380005799990f5379ef957d84639ef9 BUG: 1420982 Reviewed-on: https://review.gluster.org/16587 Tested-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/dht: Don't update layout in rebalance_task_completionN Balachandran2017-02-071-24/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Updating the layout in the dht inode_ctx in rebalance_task_completion after the file is migrated is erroneous in case of files with hardlinks. This step can be skipped as the layout will be set in the syncop_lookup call post the migration in dht_migrate_file. > Change-Id: I24ac798a919585d91a117d6a207e6a31b88486c6 > BUG: 1415761 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: https://review.gluster.org/16457 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Reviewed-by: Susant Palai <spalai@redhat.com> (cherry picked from commit ddf05f3d1e39cc920251c809e9ba42fe42b2c5f2) Signed-off-by: N Balachandran <nbalacha@redhat.com> Change-Id: I46176f8605cfb782aa17c2bc9ffd7a5f0d07dd88 BUG: 1419855 Reviewed-on: https://review.gluster.org/16554 Tested-by: N Balachandran <nbalacha@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/disperse: Do not log fop failed for lockless fopsAshish Pandey2017-02-071-12/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Operation failed messages are getting logged based on the callbacks of lockless fop's. If a fop does not take a lock, it is possible that it will get some out of sync xattr, iatts. We can not depend on these callback to psay that the fop has failed. Solution: Print failed messages only for locked fops. However, heal would still be triggered. > Change-Id: I4427402c8c944c23f16073613caa03ea788bead3 > BUG: 1414287 > Signed-off-by: Ashish Pandey <aspandey@redhat.com> > Reviewed-on: http://review.gluster.org/16435 > Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Change-Id: I8728109d5cd93c315a5ada0a50b1f0f158493309 BUG: 1419824 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: https://review.gluster.org/16550 Tested-by: Xavier Hernandez <xhernandez@datalab.es> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>
* cluster/ec: Do not start heal on good file while IO is going onAshish Pandey2017-02-072-4/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Write on a file has been slowed down significantly after http://review.gluster.org/#/c/13733/ RC : When update fop starts on a file, it sets dirty flag at the start and remove it at the end which make an index entry in indices/xattrop. During IO, SHD scans this and finds out an index and starts heal even if all the fragments are healthy and up tp date. This heal takes inodelk for different types of heal. If the IO is for long time this will happen in every 60 seconds. Due to this extra, unneccessary locking, IO gets slowed down. Solution: Before starting any type of heal check if file needs heal or not. > Change-Id: Ib9519a43e7e4b2565d3f3153f9ca0fb92174fe51 > BUG: 1409191 > Signed-off-by: Ashish Pandey <aspandey@redhat.com> > Reviewed-on: http://review.gluster.org/16377 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Change-Id: I1a66aca626156164555a7a99a4f715c54c87e9a9 BUG: 1419825 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: https://review.gluster.org/16552 Tested-by: Xavier Hernandez <xhernandez@datalab.es> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/ec: fix selinux issues with mmap()Xavier Hernandez2017-02-0610-94/+272
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | EC uses mmap() to create a memory area for the dynamic code. Since the code is created on the fly and executed when needed, this region of memory needs to have write and execution privileges. This combination is not allowed by default by selinux. To solve the problem a file is used as a backend storage for the dynamic code and it's mapped into two distinct memory regions, one with write access and the other one with execution access. This approach is the recommended way to create dynamic code by a program in a more secure way, and selinux allows it. Additionally selinux requires that the backend file be stored in a directory marked with type bin_t to be able to map it in an executable area. To satisfy this condition, GLUSTERFS_LIBEXECDIR has been used. This fix also changes the error check for mmap(), that was done incorrectly (it checked against NULL instead of MAP_FAILED), and it also correctly propagates the error codes and makes sure they aren't silently ignored. > Change-Id: I71c2f88be4e4d795b6cfff96ab3799c362c54291 > BUG: 1402661 > Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> > Reviewed-on: https://review.gluster.org/16405 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Change-Id: I5c2dd51b1161505316c8f78b73e9a585d0c115d0 BUG: 1418650 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: https://review.gluster.org/16522 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* core: run many bricks within one glusterfsd processJeff Darcy2017-02-012-10/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for multiple brick translator stacks running in a single brick server process. This reduces our per-brick memory usage by approximately 3x, and our appetite for TCP ports even more. It also creates potential to avoid process/thread thrashing, and to improve QoS by scheduling more carefully across the bricks, but realizing that potential will require further work. Multiplexing is controlled by the "cluster.brick-multiplex" global option. By default it's off, and bricks are started in separate processes as before. If multiplexing is enabled, then *compatible* bricks (mostly those with the same transport options) will be started in the same process. Backport of: > Change-Id: I45059454e51d6f4cbb29a4953359c09a408695cb > BUG: 1385758 > Reviewed-on: https://review.gluster.org/14763 Change-Id: I4bce9080f6c93d50171823298fdf920258317ee8 BUG: 1418091 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/16496 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/ec: Change level of messages to DEBUGAshish Pandey2017-01-301-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Heal failed or passed should not be logged as warning. These can be observed from heal info if the heal is happening or not. If we require to debug a case where heal is not happening, we can set the level to DEBUG. >Change-Id: I347665c8c8b6223bb08a9f3dd5643a10ddc3b93e >BUG: 1417050 >Signed-off-by: Ashish Pandey <aspandey@redhat.com> >Reviewed-on: https://review.gluster.org/16473 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Signed-off-by: Ashish Pandey <aspandey@redhat.com> Change-Id: I347665c8c8b6223bb08a9f3dd5643a10ddc3b93e BUG: 1417135 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: https://review.gluster.org/16478 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* Readdir-ahead : Honour readdir-optimise option of dhtPoornima G2017-01-271-0/+13
| | | | | | | | | | | | | | | | | | | | | | | >Change-Id: I9c5e65b32e316e6a2fc7e1f5c79fce79386b78e2 >BUG: 1401812 >Signed-off-by: Poornima G <pgurusid@redhat.com> >Reviewed-on: https://review.gluster.org/16071 >Smoke: Gluster Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: I9c5e65b32e316e6a2fc7e1f5c79fce79386b78e2 BUG: 1417027 Signed-off-by: Poornima G <pgurusid@redhat.com> (cherry picked from commit 7c6538f6c8f9a015663b4fc57c640a7c451c87f7) Reviewed-on: https://review.gluster.org/16461 Tested-by: Raghavendra G <rgowdapp@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* dht/rebalance Estimate time to complete rebalanceN Balachandran2017-01-241-1/+101
| | | | | | | | | | | | | | | | | | | | | | | | | The estimates will be logged to the rebalance log on running gluster v rebalance <vol> status > Change-Id: I9d51b139cd4c8dfde1ff2c2050720ae606c13fc6 > BUG: 1396004 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/15893 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit 2edd75ec8de17da89004859375844f60890a4df0) Signed-off-by: N Balachandran <nbalacha@redhat.com> Change-Id: I58b0550210149443966798d9a26e73cb598eeb6a BUG: 1415915 Reviewed-on: https://review.gluster.org/16458 Tested-by: N Balachandran <nbalacha@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* tier : Tier as a servicehari gowtham2017-01-164-6/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tierd is implemented by separating from rebalance process. The commands affected: 1) Attach tier will trigger this process instead of old one 2) tier start and tier start force will also trigger this process. 3) volume status [tier] will show tier daemon as a process instead of task and normal tier status and tier detach status works. 4) tier stop implemented. 5) detach tier implemented separately along with new detach tier status 6) volume tier volname status will work using the changes. 7) volume set works This patch has separated the tier translator from the legacy DHT rebalance code. It now sends the RPCs from the CLI to glusterd separate to the DHT rebalance code. The daemon is now a service, similar to the snapshot daemon, and can be viewed using the volume status command. The code for the validation and commit phase are the same as the earlier tier validation code in DHT rebalance. The “brickop” phase has been changed so that the status command can use this framework. The service management framework is now used. DHT rebalance does not use this framework. This service framework takes care of : *) spawning the daemon, killing it and other such processes. *) volume set options , which are written on the volfile. *) restart and reconfigure functions. Restart is to restart the daemon at two points 1)after gluster goes down and comes up. 2) to stop detach tier. *) reconfigure is used to make immediate volfile changes. By doing this, we don’t restart the daemon. it has the code to rewrite the volfile for topological changes too (which comes into place during add and remove brick). With this patch the log, pid, and volfile are separated and put into respective directories. Change-Id: I3681d0d66894714b55aa02ca2a30ac000362a399 BUG: 1313838 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: http://review.gluster.org/13365 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: hari gowtham <hari.gowtham005@gmail.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* afr: Avoid resetting event_gen when brick is always downRavishankar N2017-01-123-19/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: __afr_set_in_flight_sb_status(), which resets event_gen to zero, is called if failed_subvols[i] is non-zero for any brick. But failed_subvols[i] is true even if the brick was down *before* the transaction started. Hence say if 1 brick is down in a replica-3, every writev that comes will trigger an inode refresh because of this resetting, as seen from the no. of FSTATs in the profile info in the BZ. Fix: Reset event gen only if the brick was previously a valid read child and the FOP failed on it the first time. Also `s/afr_inode_read_subvol_reset/afr_inode_event_gen_reset` because the function only resets event gen and not the data/metadata readable. Change-Id: I603ae646cbde96995c35db77916e2ed80b602a91 BUG: 1409206 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/16309 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/afr: Do not log of split-brain when there isn't oneKrutika Dhananjay2017-01-113-24/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | * Even on errors like ENOENT, AFR logs split-brain after read-txn refresh, introduced by commit a07ddd8f. This can be a cause of much panic and confusion and needs to be fixed. * Also fixed this issue in write-txns. * Fixed afr read txns to log about split-brain only after knowing that there is no split-brain choice configured. * Removed code duplication * Fixed incorrect passing of error code in afr_write_txn_refresh_done() (the function was passing -0 as errno to gf_msg(). Change-Id: I354f454ce5bf0e5f00bc27916eb597367cb7d927 BUG: 1411625 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/16362 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* feature/dht: undo partially successful dir renameCsaba Henk2017-01-114-7/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As with dht, dirs are present on all subvolumes, renaming them is a compound operation and thus a partial success + partial failure scenario is possible, resulting in an inconsistent state. For purposes of reproduction, such a scenario can easily be produced by stopping the volume, edit the volfile of a certain subvolume to get at an "option read-only on" setting, and then restart the volume. Thus those operations that are to make change on the affected subvolume will fail with EROFS. To handle such scenarios, we introduce an in-memory cache where we record the return values obtained from the subvolumes. At the final stage of the dir rename operation we check if it's a partial success/fail situation. If yes, then we perform a reverse rename op on those subvolumes where the operation succeeded. Change-Id: I3d05f74f53932cb984a918d252a7309c1009a51d BUG: 1412069 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/15739 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com>
* dht: At places needed use STACK_WIND_COOKIEPoornima G2017-01-0910-654/+664
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Issue: frame has a void * cookie pointer. In case of STACK_WIND_COOKIE frame->cookie is assigned to what is sent by the caller. In case of STACK_WIND frame->cookie is assigned to point point to the frame itself. For ease of coding, at many places, the cookie in the cbk is used to get the pointer to the next xl. This is inconsistent when STACK_WIND_TAIL comes into picture. Eg: dht_setxattr () { for (i = 0 ; i < conf->subvolume_cnt ; i++) { STACK_WIND (..dht_checking_pathinfo_cbk, conf->subvolumes[i] ..); } dht_checking_pathinfo_cbk (...void *cookie...) { prev = cookie; ... for (i = 0; i < conf->subvolume_cnt; i++) { if (conf->subvolumes[i] == prev->this) { ... } } } Consider the below graph: dht (parent) readdir-ahead => Doesn't define setxattr and uses STACK_WIND_TAIL protocol-client With this graph, when dht_checking_pathinfo_cbk is called, cookie will have frame pointing to protocol-client. i.e. prev->this will be protocol-client. But dht was expecting it to be readdir-ahead as it has stored in conf->subvolumes[i] Solution: Hence, as a thumb rule, if cbk is using cookie, then we explicitly call STACK_WIND_COOKIE. Change-Id: I83aea1e24c809c5a91a0db7283e908e125471bd4 BUG: 1401812 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/16039 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/ec: Fixing log messageSunil Kumar H G2017-01-081-5/+10
| | | | | | | | | | | | | | | Updating the warning message with details to improve user understanding. BUG: 1409202 Change-Id: I001f8d5c01c97fff1e4e1a3a84b62e17c025c520 Signed-off-by: Sunil Kumar H G <sheggodu@redhat.com> Reviewed-on: http://review.gluster.org/16315 Tested-by: Sunil Kumar Acharya Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* cluster/dht: Incorrect migration checks in fsyncN Balachandran2017-01-081-5/+6
| | | | | | | | | | | | | | | | | Fixed the order of the migration phase checks in dht_fsync_cbk. Phase1 should never be hit if op_ret is non zero. Signed-off-by: N Balachandran <nbalacha@redhat.com> Change-Id: I9222692e04868bffa93498059440f0aa553c83ec BUG: 1410777 Reviewed-on: http://review.gluster.org/16350 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* dht/rebalance: remove errno check for failure detectionSusant Palai2017-01-061-16/+13
| | | | | | | | | | | BUG: 1410355 Change-Id: I867419ca36a81ef7209e6911a46c1c2c898b8eab Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/16328 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/ec: Do lookup on an existing file in linkPranith Kumar K2017-01-053-17/+18
| | | | | | | | | | | | | | | | | | | Problem: In link fop lookup is happening on the new fop which doesn't exist so the iatt ec serves parent xlators has size as zero which leads to 'cat' giving empty output Fix: Change code so that lookup happens on the existing link instead. BUG: 1409730 Change-Id: I70eb02fe0633e61d1d110575589cc2dbe5235d76 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/16320 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Tested-by: Xavier Hernandez <xhernandez@datalab.es> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* ec: Invalidations in disperse volume should not update the statPoornima G2017-01-052-2/+14
| | | | | | | | | | | | | | | | | | | | | | Issue: In disperse volume, the file is present across bricks, hence the stat from one brick doesn't carry the valid size of the file. Therefore the upcall from one brick updating the md-cache results in wrong size being updated. Fix: If the notification is cache invalidation then, indicate md-cache that the attributes is invalid. BUG: 1410375 Change-Id: Id89d2283478e70b62b435a8891fffc86d2be8cb2 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/16329 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/dht: Do rename cleanup as rootPranith Kumar K2017-01-051-0/+1
| | | | | | | | | | | | | | | | | | | | | | Problem: Rename linkfile cleanup is done as non-root which may not have priviliges to do the rename so it fails with EACCESS. MKDIR on that name in future will start to hole on this subvolume. It is not easy to hit on fuse mounts because vfs takes care of the permission checks even before rename fop is wound. But with nfs-ganesha mounts it happens. Fix: Do rename cleanup as root BUG: 1409727 Change-Id: I414c1eb6dce76b4516a6c940557b249e6c3f22f4 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/16317 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* cluster/dht: Fix dict_leak in migration check tasksN Balachandran2017-01-021-0/+7
| | | | | | | | | | | | | | | | Fixed a memleak where dict was not being unrefed in the dht_migration_complete_check_task and dht_rebalance_inprogress_task functions. Change-Id: I3d42e9a2e5c8596c985bf6431a68fd3905227383 BUG: 1409186 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/16308 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: MOHIT AGRAWAL <moagrawa@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/afr: Fix missing name indices due to EEXIST errorKrutika Dhananjay2016-12-271-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PROBLEM: Consider a volume with granular-entry-heal and sharding enabled. When a replica is down and a shard is created as part of a write, the name index is correctly created under indices/entry-changes/<dot-shard-gfid>. Now when a read on the same region triggers another MKNOD, the fop fails on the online bricks with EEXIST. By virtue of this being a symmetric error, the failed_subvols[] array is reset to all zeroes. Because of this, before post-op, the GF_XATTROP_ENTRY_OUT_KEY will be set, causing the name index, which was created in the previous MKNOD operation, to be wrongly deleted in THIS MKNOD operation. FIX: The ideal fix would have been for a transaction to delete the name index ONLY if it knows it is the one that created the index in the first place. This would involve gathering information as to whether THIS xattrop created the index from individual bricks, aggregating their responses and based on the various posisble combinations of responses, decide whether to delete the index or not. This is rather complex. Simpler fix would be for post-op to examine local->op_ret in the event of no failed_subvols to figure out whether to delete the name index or not. This can occasionally lead to creation of stale name indices but they won't be affecting the IO path or mess with pending changelogs in any way and self-heal in its crawl of "entry-changes" directory would take care to delete such indices. Change-Id: Ic1b5257f4dc9c20cb740a866b9598cf785a1affa BUG: 1408712 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/16286 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* afr: use accused matrix instead of readable matrix for deciding healsRavishankar N2016-12-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | Problem: afr_replies_interpret() used the 'readable' matrix to trigger client side heals after inode refresh. But for arbiter, readable is always zero. So when `dd` is run with a data brick down, spurious data heals are are triggered. These heals open an fd, causing eager lock to be disabled (open fd count >1) in afr transactions, leading to extra FXATTROPS Fix: Use the accused matrix (derived from interpreting the afr pending xattrs) to decide whether we can start heal or not. Change-Id: Ibbd56c9aed6026de6ec42422e60293702aaf55f9 BUG: 1408395 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/16277 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* dht/rebalance: reverify lookup failuresSusant Palai2016-12-222-58/+130
| | | | | | | | | | | | | | | | | | race: readdirp has read one entry, and doing a lookup on that entry, but user might have renamed/removed that entry just after readdirp but before lookup. Since remove-brick is a costly opertaion,will ingore any ENOENT/ESTALE failures and move on. Change-Id: I62c7fa93c0b9b7e764065ad1574b97acf51b5996 BUG: 1408115 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/15846 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* afr: Ignore event_generation checks post inode refresh for write txnsRavishankar N2016-12-223-1/+4
| | | | | | | | | | | | | | | | | | | Before http://review.gluster.org/#/c/15673/, after inode refresh, we failed read txns in case of EIO or event_generation being zero. For write transactions, the check was only for EIO. 15673 re-factored the code to fail both read and write when event_generation=0. This seems to have caused a regression as explained in the BZ. This patch restores that behaviour in afr_txn_refresh_done(). Change-Id: Ib8e116506badce6f58b55827dbe403d95069d744 BUG: 1406224 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/16205 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* cluster/ec: Fix lk-owner set race in ec_unlockPranith Kumar K2016-12-136-34/+42
| | | | | | | | | | | | | | | | | | | | | | Problem: Rename does two locks. There is a case where when it tries to unlock it sends xattrop of the directory with new version, callback of these two xattrops can be picked up by two separate epoll threads. Both of them will try to set the lk-owner for unlock in parallel on the same frame so one of these unlocks will fail because the lk-owner doesn't match. Fix: Specify the lk-owner which will be set on inodelk frame which will not be over written by any other thread/operation. BUG: 1402710 Change-Id: I666ffc931440dc5253d72df666efe0ef1d73f99a Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/16074 Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/afr: Fix per-txn optimistic changelog initialisationKrutika Dhananjay2016-12-122-29/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Incorrect initialisation of local->optimistic_change_log was leading to skipped pre-op and post-op even when a brick didn't participate in the txn because it was down. The result - missing granular name index resulting in some entries never getting healed. FIX: Initialise local->optimistic_change_log just before pre-op. Also fixed granular entry heal to create the granular name index in pre-op as opposed to post-op. This is to prevent loss of granular information when during an entry txn, the good (src) brick goes offline before the post-op is done. This would cause self-heal to do conservative merge (since dirty xattr is the only information available), which when granular-entry-heal is enabled, expects granular indices, the lack of which can lead to loss of data in the worst case. Change-Id: Ia3ad716d6fb1821555f02180e86e8711a79f958d BUG: 1402730 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/16075 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>