summaryrefslogtreecommitdiffstats
path: root/tests
Commit message (Collapse)AuthorAgeFilesLines
* posix: delete stale gfid handles in nameless lookupRavishankar N2018-01-181-0/+65
| | | | | | | | | | | ..in order for self-heal of symlinks to work properly (see BZ for details). Backport of https://review.gluster.org/#/c/19070/ Signed-off-by: Ravishankar N <ravishankar@redhat.com> Change-Id: I9a011d00b07a690446f7fd3589e96f840e8b7501 BUG: 1534848
* mount/fuse: use fstat in getattr implementation if any opened fd is availableRaghavendra G2018-01-031-11/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The restriction of using fds opened by the same Pid means fds cannot be shared across threads of multithreaded application. Note that fops from kernel have different Pid for different threads. Imagine following sequence of operations: * Turn off performance.open-behind * Thread t1 opens an fd - fd1 - on file "file". Let's assume nodeid of "file" is "nodeid-file". * Thread t2 does RENAME ("newfile", "file"). Let's assume nodeid of "newfile" as "nodeid-newfile". * t2 proceeds to do fstat (fd1) The above set of operations can sometimes result in ESTALE/ENOENT errors. RENAME overwrites "file" with "newfile" changing its nodeid from "nodeid-file" to "nodeid-newfile" and post RENAME, "nodeid-file" is removed from the backend. If fstat carries nodeid-file as argument, which can happen if lookup has not refreshed the nodeid of "file" and since t2 doesn't have an fd opened, fuse_getattr_resume uses STAT which will fail as "nodeid-file" no longer exists. Since the above set of operations and sharing of fds across multiple threads are valid, this is a bug. The fix is to use any fd opened on the inode. In this specific example fuse_getattr_resume will find fd1 and winds down the call as fstat (fd1) which won't fail. Cross-checked with "Miklos Szeredi" <mszeredi.at.redhat.dot.com> for any security issues with this solution and he approves the solution. Thanks to "Miklos Szeredi" <mszeredi.at.redhat.dot.com> for all the pointers and discussions. >Change-Id: I88dd29b3607cd2594eee9d72a1637b5346c8d49c >BUG: 1510401 >Signed-off-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit 8b57378e5596f287a7b9d106dd6fb56a624b42ee) Change-Id: I88dd29b3607cd2594eee9d72a1637b5346c8d49c BUG: 1529086 Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
* glusterd: delete source brick only once in reset-brick commit forceAtin Mukherjee2017-10-311-0/+24
| | | | | | | | | | | | | | While stopping the brick which is to be reset and replaced delete_brick flag was passed as true which resulted glusterd to free up to source brick before the actual operation. This results commit force to fail failing to find the source brickinfo. > mainline patch : https://review.gluster.org/#/c/18581/ Change-Id: I1aa7508eff7cc9c9b5d6f5163f3bb92736d6df44 BUG: 1507880 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> (cherry picked from commit 0fb8acaa6ff80c43e46deac0ce66b29ae0df0ca4)
* glusterd: Gluster should keep PID file in correct locationGaurav Kumar Garg2017-10-2511-18/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently Gluster keeps process pid information of all the daemons and brick processes in Gluster configuration file directory (ie., /var/lib/glusterd/*). These pid files should be seperate from configuration files. Deletion of the configuration file directory might result into serious problems. Also, /var/run/gluster is the default placeholder directory for pid files. So, with this fix Gluster will keep all process pid information of all processes in /var/run/gluster/* directory. > Change-Id: Idb09e3fccb6a7355fbac1df31082637c8d7ab5b4 > BUG: 1258561 > Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com> > Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com> > Reviewed-on: https://review.gluster.org/13580 > Tested-by: MOHIT AGRAWAL <moagrawa@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> > (Cherry pick from commit 220d406ad13d840e950eef001a2b36f87570058d) BUG: 1491059 Change-Id: Idb09e3fccb6a7355fbac1df31082637c8d7ab5b4 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* features/worm: Adding check to newloc when doing renameluneo72017-10-251-0/+9
| | | | | | | | | | | | | | | | | | | | | | | Problem: Since rename didn't check if newloc exists and it's retention state it was possible to rename a new file that wasn't in retention over a existing file that was in read-only state. Cherry picked from commit 00a4dc0: > Change-Id: I63c6bbabb7bb456ebedf201cc77b878ffda62229 > BUG: 1484490 > Signed-off-by: luneo7 <luneo7@gmail.com> > Reviewed-on: https://review.gluster.org/18104 > Tested-by: jiffin tony Thottan <jthottan@redhat.com> > Tested-by: Prashanth Pai <ppai@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Prashanth Pai <ppai@redhat.com> > Reviewed-by: Karthik U S <ksubrahm@redhat.com> > Reviewed-by: Amar Tumballi <amarts@redhat.com> Change-Id: I63c6bbabb7bb456ebedf201cc77b878ffda62229 BUG: 1480788 Signed-off-by: luneo7 <luneo7@gmail.com>
* md-cache: avoid checking the xattr value buffer with string functions.Günther Deschner2017-10-251-0/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | xattrs may very well contain binary, non-text data with leading 0 values. Using strcmp for checking empty values is not the appropriate thing to do: In the best case, it might treat a binary xattr value starting with 0 from being cached (and hence also from being reported back with xattr). In the worst case, we might read beyond the end of a data blob that does contain any zero byte. We fix this by checking the length of the data blob and checking the first byte against 0 if the length is one. > Signed-off-by: Guenther Deschner <gd@samba.org> > Pair-Programmed-With: Michael Adam <obnox@samba.org> > Change-Id: If723c465a630b8a37b6be58782a2724df7ac6b11 > BUG: 1476324 > Reviewed-on: https://review.gluster.org/17910 > Reviewed-by: Michael Adam <obnox@samba.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Poornima G <pgurusid@redhat.com> > Tested-by: Poornima G <pgurusid@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > (cherry picked from commit ab4ffdac9dec1867f2d9b33242179cf2b347319d) Change-Id: If723c465a630b8a37b6be58782a2724df7ac6b11 BUG: 1499893 Signed-off-by: Günther Deschner <gd@samba.org>
* gfapi: set lkowner in glfdSoumya Koduri2017-10-172-0/+239
| | | | | | | | | | | | | | | | | | We need a provision to be able to set lkowner (which is used to distinguish locks maintained by server) in gfapi. Since the same lk_owner need to be used to be able to flush the lock while closing the fd, store the lkowner in the glfd structure itself. A new API has been added to be able to set lkowner in glfd. This is backport of below mainline fix - https://review.gluster.org/#/c/18429 https://review.gluster.org/#/c/18522/ Change-Id: I67591d6b9a89c20b9617d52616513ff9e6c06b47 BUG: 1501955 Signed-off-by: Soumya Koduri <skoduri@redhat.com>
* mount/fuse: Make event-history feature configurableKrutika Dhananjay2017-10-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | ... and disable it by default. Backport of: > Change-Id: Ia533788d309c78688a315dc8cd04d30fad9e9485 > Reviewed-on: https://review.gluster.org/18242 > BUG: 1467614 > cherry-picked from commit 956d43d6e89d40ee683547003b876f1f456f03b6 This is because having it disabled seems to improve performance. This could be due to the lock contention by the different epoll threads on the circular buff lock in the fop cbks just before writing their response to /dev/fuse. Just to provide some data - wrt ovirt-gluster hyperconverged environment, I saw an increase in IOPs by 12K with event-history disabled for randrom read workload. Usage: mount -t glusterfs -o event-history=on $HOSTNAME:$VOLNAME $MOUNTPOINT OR glusterfs --event-history=on --volfile-server=$HOSTNAME --volfile-id=$VOLNAME $MOUNTPOINT Change-Id: Ia533788d309c78688a315dc8cd04d30fad9e9485 BUG: 1495430 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* afr: auto-resolve split-brains for zero-byte filesRavishankar N2017-10-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Backport of https://review.gluster.org/#/c/18283/ Problems: As described in BZ 1491670, renaming hardlinks can result in data/mdata split-brain of the DHT link-to files (T files) without any mismatch of data and metadata. As described in BZ 1486063, for a zero-byte file with only dirty bits set, arbiter brick will likely be chosen as the source brick. Fix: For zero byte files in split-brain, pick first brick as a) data source if file size is zero on all bricks. b) metadata source if metadata is the same on all bricks In arbiter case, if file size is zero on all bricks and there are no pending afr xattrs, pick 1st brick as data source. Change-Id: I0270a9a2f97c3b21087e280bb890159b43975e04 BUG: 1496321 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reported-by: Rahul Hinduja <rhinduja@redhat.com> Reported-by: Mabi <mabi@protonmail.ch>
* posix: add sanity checks for removing the gfid symlink for directoriesRavishankar N2017-09-231-0/+47
| | | | | | | | | | | | | Backport of https://review.gluster.org/17945 ...during mkdir and rmdir. Otherwise, during entry self-heal, the directory could be left out without a .glusterfs symlink causing fops like opendir, readdir to fail. The only chance the missing symlink will be created is when a fresh lookup comes on it. Change-Id: I2e1cf1bce8962ea80187edd8f6d73e0a09cf9f8e BUG: 1491966 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* features/shard: Return aggregated size in stbuf of LINK fopKrutika Dhananjay2017-09-171-0/+25
| | | | | | | | | | | | | | | | Backport of: > Change-Id: I42df7679d63fec9b4c03b8dbc66c5625f097fac0 > Reviewed-on: https://review.gluster.org/18209 > BUG: 1488546 > cherry-picked from 91430817ce5bcbeabf057e9c978485728a85fb2b Change-Id: I42df7679d63fec9b4c03b8dbc66c5625f097fac0 BUG: 1488719 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: https://review.gluster.org/18212 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/dht: Make rebalance throttle option tuned by numberSusant Palai2017-08-111-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current rebalance throttle options: lazy/normal/aggressive may not always be sufficient for the purpose of throttling. In our recent test, we observed for certain setups, normal and aggressive modes behaved similarly consuming full disk bandwidth. So in cases like this admin should be able to tune it down(or vice versa) depending on the need. Along with old throttle configurations, thread counts are tuned based on number. e.g. gluster v set vol-name cluster-rebal.throttle 5. Admin can tune up/down between 0 and the number of cores available. Note: For heterogenous servers, validation will fail on the old server if "number" is given for throttle configuration. The message looks something like this: "volume set: failed: Staging failed on vm2. Error: cluster.rebal-throttle should be {lazy|normal|aggressive}" Test: Manual test by logging active thread number after reconfiguring throttle option. testcase: tests/basic/distribute/throttle-rebal.t > Change-Id: I46e3cde546900307831028b344ecf601fd9b02c3 > BUG: 1438370 > Signed-off-by: Susant Palai <spalai@redhat.com> > Reviewed-on: https://review.gluster.org/16980 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Signed-off-by: Susant Palai <spalai@redhat.com> Change-Id: I46e3cde546900307831028b344ecf601fd9b02c3 BUG: 1473136 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: https://review.gluster.org/17835 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/dht: Make rebalance honor min-free-diskSusant Palai2017-08-111-2/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | test: Manual created files of size 1K on 2 brick(of size 1GB) setup . added a brick of size 16GB. set min-free-disk to 12GB(so that first two bricks won't receive any files). removed one of the 1st brick of size 1GB. Logs from test: [2017-04-12 08:52:08.196484] W [MSGID: 0] [dht-rebalance.c:895:__dht_check_free_space] 0-test1-dht: Write will cross min-free-disk for file - /tile32 on subvol - test1-client-1. Looking for new subvol. [2017-04-12 08:52:08.196904] I [MSGID: 0] [dht-rebalance.c:925:__dht_check_free_space] 0-test1-dht: new target found - test1-client-2 for file - /tile32 - Post migration we have two files. The new destination (/brick/1) has the data file [root@vm1 ~]# ll /brick/1/tile32 -rw-r--r--. 2 root root 0 Apr 12 14:22 /brick/1/tile32 - On the old target the linkto file is there with linkto xattr pointing to /brick/1 [root@vm1 ~]# ll /tmp/2/tile32 ---------T. 2 root root 1000 Apr 12 14:22 /tmp/2/tile32 [root@vm1 ~]# getfattr -m . -de text /tmp/2/tile32 getfattr: Removing leading '/' from absolute path names security.selinux="unconfined_u:object_r:user_tmp_t:s0" trusted.gfid="����:Aс�#�/'b2" trusted.glusterfs.dht.linkto="test1-client-2" Marking ./tests/features/worm_sh.t as bad test. Reason being, this patch failed on master branch as well and it has nothing to do with rebalance/remove-brick. > BUG: 1441508 > Change-Id: I90bae251cda3d957a49cdceda90cd08311a392fb > Signed-off-by: Susant Palai <spalai@redhat.com> > Reviewed-on: https://review.gluster.org/17034 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Amar Tumballi <amarts@redhat.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Signed-off-by: Susant Palai <spalai@redhat.com> Change-Id: I90bae251cda3d957a49cdceda90cd08311a392fb BUG: 1473132 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: https://review.gluster.org/17831 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* afr: mark non sources as sinks in metadata healRavishankar N2017-07-191-0/+64
| | | | | | | | | | | | | | | | | | | | | | | | | Problem: In a 3 way replica, when the source brick does not have pending xattrs for the sinks, but the 2 sinks blame each other, metadata heal was not happpening because we were not setting all non-sources as sinks. Fix: Mark all non-sources as sinks, like it is done in data and entry heal. > Reviewed-on: https://review.gluster.org/17717 > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit 77c1ed5fd299914e91ff034d78ef6e3600b9151c) Change-Id: I534978940f5087302e307fcc810a48ffe898ce08 BUG: 1471612 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: https://review.gluster.org/17782 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/ec: correctly handle end of file for seekXavier Hernandez2017-07-072-0/+242
| | | | | | | | | | | | | | | | | | | | | | | | | | | | When a SEEK_HOLE was issued near to the end of file, sometimes an offset beyond the end of file was returned. Another problem was that using some offsets greater than the end of file returned successfully instead of failing with ENXIO. >Change-Id: I238d2884ba02fd19a78116b0f8f8e8d6338fb3f5 >BUG: 1449348 >Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> >Reviewed-on: https://review.gluster.org/17228 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Amar Tumballi <amarts@redhat.com> >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> >(cherry picked from commit eb96dd45f8e583c6bad84bf32ca17e2bb01dd38f) Change-Id: I238d2884ba02fd19a78116b0f8f8e8d6338fb3f5 BUG: 1468126 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: https://review.gluster.org/17711 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* libglusterfs : Fix crash in glusterd while peer probingGaurav Yadav2017-06-201-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | glusterd crashes when port is being set explcitly to a range which is outside greater than short data type range. Eg. sysctl net.ipv4.ip_local_reserved_ports="49152-49156" In above case glusterd crashes while parsing the port. With this fix glusterd will be able to handle port range between INT_MIN to INT_MAX > Reviewed-on: https://review.gluster.org/17359 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Samikshan Bairagya <samikshan@gmail.com> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> > Reviewed-by: Niels de Vos <ndevos@redhat.com> > Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> Change-Id: I7c75ee67937b0e3384502973d96b1c36c89e0fe1 BUG: 1459760 Signed-off-by: Gaurav Yadav <gyadav@redhat.com> Reviewed-on: https://review.gluster.org/17494 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Samikshan Bairagya <samikshan@gmail.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
* Tier/cli: detach status xml outputhari gowtham2017-05-301-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: detach status xml output was broken because of the wrong argument. The status_op sent to verify whether it is a tier status command was as false. Fix: the argument being passed was changed from false to true. >Change-Id: I8cdd4dd972d6bfbb61c1182cbf4097767f83c7c5 >BUG: 1446362 >Signed-off-by: hari gowtham <hgowtham@redhat.com> >Reviewed-on: https://review.gluster.org/17131 >Smoke: Gluster Build System <jenkins@build.gluster.org> >Tested-by: hari gowtham <hari.gowtham005@gmail.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Atin Mukherjee <amukherj@redhat.com> >Reviewed-by: N Balachandran <nbalacha@redhat.com> Change-Id: I8cdd4dd972d6bfbb61c1182cbf4097767f83c7c5 BUG: 1451587 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: https://review.gluster.org/17313 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: hari gowtham <hari.gowtham005@gmail.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: Don't spawn new glusterfsds on node reboot with brick-muxSamikshan Bairagya2017-05-301-0/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With brick multiplexing enabled, upon a node reboot new bricks were not being attached to the first spawned brick process even though there wasn't any compatibility issues. The reason for this is that upon glusterd restart after a node reboot, since brick services aren't running, glusterd starts the bricks in a "no-wait" mode. So after a brick process is spawned for the first brick, there isn't enough time for the corresponding pid file to get populated with a value before the compatibilty check is made for the next brick. This commit solves this by iteratively waiting for the pidfile to be populated in the brick compatibility comparison stage before checking if the brick process is alive. > Reviewed-on: https://review.gluster.org/17307 > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit 13e7b3b354a252ad4065f7b2f0f805c40a3c5d18) Change-Id: Ibd1f8e54c63e4bb04162143c9d70f09918a44aa4 BUG: 1453087 Signed-off-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-on: https://review.gluster.org/17352 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
* cluster/dht: Rebalance on all nodes should migrate filesN Balachandran2017-05-182-2/+165
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Rebalance compares the node-uuid of a file against its own to and migrates a file only if they match. However, the current behaviour in both AFR and EC is to return the node-uuid of the first brick in a replica set for all files. This means a single node ends up migrating all the files if the first brick of every replica set is on the same node. Fix: AFR and EC will return all node-uuids for the replica set. The rebalance process will divide the files to be migrated among all the nodes by hashing the gfid of the file and using that value to select a node to perform the migration. This patch makes the required DHT and tiering changes. Some tests in rebal-all-nodes-migrate.t will need to be uncommented once the AFR and EC changes are merged. > BUG: 1366817 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: https://review.gluster.org/17239 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Amar Tumballi <amarts@redhat.com> > Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> > Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> (cherry picked from commit b23bd3dbc2c153171d0bb1205e6804afe022a55f) Change-Id: I5ce41600f5ba0e244ddfd986e2ba8fa23329ff0c BUG: 1451561 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: https://review.gluster.org/17311 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* Fixes quota aux mount failureSanoj Unnikrishnan2017-05-1315-19/+71
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The aux mount is created on the first limit/remove_limit/list command and it remains until volume is stopped / deleted / (quota is disabled) , where we do a lazy unmount. If the process is uncleanly terminated, then the mount entry remains and we get (Transport disconnected) error on subsequent attempts to run quota list/limit-usage/remove commands. Second issue, There is also a risk of inadvertent rm -rf on the /var/run/gluster causing data loss for the user. Ideally, /var/run is a temp path for application use and should not cause any data loss to persistent storage. Solution: 1) unmount the aux mount after each use. 2) clean stale mount before mounting, if any. One caveat with doing mount/unmount on each command is that we cannot use same mount point for both list and limit commands. The reason for this is that list command needs mount to be accessible in cli after response from glusterd, So it could be unmounted by a limit command if executed in parallel (had we used same mount point) Hence we use separate mount points for list and limit commands. > Reviewed-on: https://review.gluster.org/16938 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Manikandan Selvaganesh <manikandancs333@gmail.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> > (cherry picked from commit 2ae4b4058691b324535d802f4e6d24cce89a10e5) Change-Id: I4f9e39da2ac2b65941399bffb6440db8a6ba59d0 BUG: 1449779 Signed-off-by: Sanoj Unnikrishnan <sunnikri@redhat.com> Reviewed-on: https://review.gluster.org/17241 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
* glusterd: Make reset-brick work correctly if brick-mux is onSamikshan Bairagya2017-05-122-1/+85
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Reset brick currently kills of the corresponding brick process. However, with brick multiplexing enabled, stopping the brick process would render all bricks attached to it unavailable. To handle this correctly, we need to make sure that the brick process is terminated only if brick-multiplexing is disabled. Otherwise, we should send the GLUSTERD_BRICK_TERMINATE rpc to the respective brick process to detach the brick that is to be reset. > Signed-off-by: Samikshan Bairagya <samikshan@gmail.com> > Reviewed-on: https://review.gluster.org/17128 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> (cherry picked from commit 74383e3ec6f8244b3de9bf14016452498c1ddcf0) Change-Id: I69002d66ffe6ec36ef48af09b66c522c6d35ac58 BUG: 1449934 Signed-off-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-on: https://review.gluster.org/17253 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* Tier: Watermark check for hi and low value being equalhari gowtham2017-05-121-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | back-port of : https://review.gluster.org/17175 Problem: Both low and hi watermark can be set to same value as the check missed the case for being equal. Fix: Add the check to both the hi and low values being equal along with the low value being higher than hi value. >Change-Id: Ia235163aeefdcb2a059e2e58a5cfd8fb7f1a4c64 >BUG: 1447960 >Signed-off-by: hari gowtham <hgowtham@redhat.com> >Reviewed-on: https://review.gluster.org/17175 >Smoke: Gluster Build System <jenkins@build.gluster.org> >Tested-by: hari gowtham <hari.gowtham005@gmail.com> >Reviewed-by: Atin Mukherjee <amukherj@redhat.com> >Reviewed-by: Milind Changire <mchangir@redhat.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Change-Id: Ia235163aeefdcb2a059e2e58a5cfd8fb7f1a4c64 BUG: 1448790 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: https://review.gluster.org/17202 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: hari gowtham <hari.gowtham005@gmail.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: socketfile & pidfile related fixes for brick multiplexing featureMohit Agrawal2017-05-105-11/+145
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: While brick-muliplexing is on after restarting glusterd, CLI is not showing pid of all brick processes in all volumes. Solution: While brick-mux is on all local brick process communicated through one UNIX socket but as per current code (glusterd_brick_start) it is trying to communicate with separate UNIX socket for each volume which is populated based on brick-name and vol-name.Because of multiplexing design only one UNIX socket is opened so it is throwing poller error and not able to fetch correct status of brick process through cli process. To resolve the problem write a new function glusterd_set_socket_filepath_for_mux that will call by glusterd_brick_start to validate about the existence of socketpath. To avoid the continuous EPOLLERR erros in logs update socket_connect code. Test: To reproduce the issue followed below steps 1) Create two distributed volumes(dist1 and dist2) 2) Set cluster.brick-multiplex is on 3) kill glusterd 4) run command gluster v status After apply the patch it shows correct pid for all volumes > BUG: 1444596 > Change-Id: I5d10af69dea0d0ca19511f43870f34295a54a4d2 > Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> > Reviewed-on: https://review.gluster.org/17101 > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Prashanth Pai <ppai@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> > (cherry picked from commit 21c7f7baccfaf644805e63682e5a7d2a9864a1e6) Change-Id: I1892c80b9ffa93974f20c92d421660bcf93c4cda BUG: 1449002 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Reviewed-on: https://review.gluster.org/17210 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Prashanth Pai <ppai@redhat.com>
* TIER/TESTS: improving regression test for tierhari gowtham2017-05-107-151/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The test files that were marked as bad test were checked and updated for centos. The tests that had issue were fixed. Tests that aren't needed anymore are removed. REASON: tests/basic/tier/tier-file-create.t This test checks one line after creating a tiered volume (which is done in every tier test). So this line is moved along with other test in tier and the file is deleted. tests/bugs/tier/bug-1286974.t This bug checks for the tier as a task and tier has been moved from a task to service as a part of the tier as a service patch https://review.gluster.org/#/c/13365/ So it is removed from bad tests. tests/basic/tier/record-metadata-heat.t This test had a bug and has been fixed. tests/basic/tier/bug-1214222-directories_missing_after_attach_tier.t tests/basic/tier/fops-during-migration.t tests/basic/tier/tier-snapshot.t tests/basic/tier/tier_lookup_heal.t These test seem to work fine on centos now. >Change-Id: I05537f4bbb91584410177ce43543897eff8761a1 >BUG: 1421600 >Signed-off-by: hari gowtham <hgowtham@redhat.com> >Reviewed-on: https://review.gluster.org/16605 >Smoke: Gluster Build System <jenkins@build.gluster.org> >Tested-by: hari gowtham <hari.gowtham005@gmail.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Change-Id: I05537f4bbb91584410177ce43543897eff8761a1 BUG: 1440742 Signed-off-by: hari gowtham <hgowtham@redhat.com> Change-Id: I9402312608de1ede28009ec52f7385e45678ed75 Reviewed-on: https://review.gluster.org/17027 Tested-by: hari gowtham <hari.gowtham005@gmail.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
* geo-rep: filter out xtime attribute during getxattrSaravanakumar Arumugam2017-05-033-13/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | georep gsyncd's xtime needs to filtered irrespective of any process access. This way, we can avoid (unnecessarily)syncing xtime attribute to slave, which may raise permission denied errors. test case modified to check for xtime xattr only in backend. Back port of> >BUG: 1353952 >Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com> >Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> >Reviewed-on: https://review.gluster.org/14880 >Smoke: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Kotresh HR <khiremat@redhat.com> >Tested-by: Kotresh HR <khiremat@redhat.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Change-Id: I2390b703048d5cc747d91fa2ae884dc55de58669 BUG: 1441576 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: https://review.gluster.org/17046 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
* tests: remove tests/bugs/core/bug-1421590-brick-mux-reuse-ports.tAtin Mukherjee2017-04-271-60/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | bug-1421590-brick-mux-reuse-ports.t seems to be a bad test to me and here is my reasoning: This test tries to check if the ports are reused or not. When a volume is restarted, by the time glusterd tries to allocate a new port to the one of the brick processes of the volume there is no guarantee that the older port will be allocated given the kernel might take some extra time to free up the port between this time frame. From https://build.gluster.org/job/regression-test-burn-in/2932/console we can clearly see that post restart of the volume, glusterd allocated port 49153 & 49155 for brick1 & brick2 respectively but the test was expecting the ports to be matched with 49155 & 49156 which were allocated before the volume was restarted. >Reviewed-on: https://review.gluster.org/17033 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Amar Tumballi <amarts@redhat.com> >(cherry picked from commit 1612355327fa5f86078b9dbcf7a38e4e0c63e205) Change-Id: Id887bf28445261d4de04fc7502e58057659c9512 BUG: 1445407 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: https://review.gluster.org/17116 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
* afr: don't do a post-op on a brick if op failedRavishankar N2017-04-271-0/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: In afr-v2, self-blaming xattrs are not there by design. But if the FOP failed on a brick due to an error other than ENOTCONN (or even due to ENOTCONN, but we regained connection before postop was wound), we wind the post-op also on the failed brick, leading to setting self-blaming xattrs on that brick. This can lead to undesired results like healing of files in split-brain etc. Fix: If a fop failed on a brick on which pre-op was successful, do not perform post-op on it. This also produces the desired effect of not resetting the dirty xattr on the brick, which is how it should be because if the fop failed on a brick, there is no reason to clear the dirty bit which actually serves as an indication of the failure. > Reviewed-on: https://review.gluster.org/16976 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> (cherry picked from commit 10dad995c989e9d77c341135d7c48817baba966c) Change-Id: I5f1caf4d1b39f36cf8093ccef940118638caa9c4 BUG: 1443501 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: https://review.gluster.org/17083 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* glusterd: hold off volume deletes while still restarting bricksJeff Darcy2017-04-242-0/+96
| | | | | | | | | | | | | | | | | | | | | We need to do this because modifying the volume/brick tree while glusterd_restart_bricks is still walking it can lead to segfaults. Without waiting we could accidentally "slip in" while attach_brick has released big_lock between retries and make such a modification. Backport of: > Commit a7ce0548b7969050644891cd90c0bf134fa1594c > BUG: 1432542 > Reviewed-on: https://review.gluster.org/16927 Change-Id: I30ccc4efa8d286aae847250f5d4fb28956a74b03 BUG: 1441476 Signed-off-by: Jeff Darcy <jeff@pl.atyp.us> Reviewed-on: https://review.gluster.org/17044 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* protocol : fix auth-allow regressionAtin Mukherjee2017-03-301-0/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | One of the brick multiplexing patches (commit 1a95fc3) had some changes in gf_auth () & server_setvolume () functions which caused auth-allow feature to be broken. mount doesn't succeed even if it's part of the auth-allow list. This fix does the following: 1. Reintroduce the peer-info data back in gf_auth () so that fnmatch has valid input and it can decide on the result. 2. config-params dict should capture key values pairs for all the bricks in case brick multiplexing is on. In case brick multiplexing isn't enabled, then config-params should carry attributes from protocol/server such that all rpc auth related attributes stay in tact in the dictionary. >Reviewed-on: https://review.gluster.org/16920 >Tested-by: Jeff Darcy <jeff@pl.atyp.us> >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> >Reviewed-by: MOHIT AGRAWAL <moagrawa@redhat.com> >(cherry picked from commit 0bd58241143e91b683a3e5c4335aabf9eed537fe) Change-Id: I007c4c6d78620a896b8858a29459a77de8b52412 BUG: 1429117 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: https://review.gluster.org/16967 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/afr: Undo pending xattrs only on the up brickskarthik-us2017-03-281-0/+89
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: While doing conservative merge, even if a brick is down, it will reset the pending xattr on that. When that brick comes up, as part of the heal, it will consider this brick as the source and removes the entries on the other bricks, which leads to data loss. Fix: Undo pending only for the bricks which are up. > Change-Id: I18436fa0bb1faa5f60531b357dea3f6b20446303 > BUG: 1433571 > Signed-off-by: karthik-us <ksubrahm@redhat.com> > Reviewed-on: https://review.gluster.org/16913 > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Ravishankar N <ravishankar@redhat.com> (cherry picked from commit f91596e6566c605e70a31a60523d11f78a097c3c) Change-Id: I51dbdc53e84051ec73308df9d4cf27726fc29dc7 BUG: 1436203 Signed-off-by: karthik-us <ksubrahm@redhat.com> Reviewed-on: https://review.gluster.org/16955 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com>
* rpc: bump up conn->cleanup_gen in rpc_clnt_reconnect_cleanupAtin Mukherjee2017-03-271-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 086436a introduced generation number (cleanup_gen) to ensure that rpc layer doesn't end up cleaning up the connection object if application layer has already destroyed it. Bumping up cleanup_gen was done only in rpc_clnt_connection_cleanup (). However the same is needed in rpc_clnt_reconnect_cleanup () too as with out it if the object gets destroyed through the reconnect event in the application layer, rpc layer will still end up in trying to delete the object resulting into double free and crash. Peer probing an invalid host/IP was the basic test to catch this issue. >Reviewed-on: https://review.gluster.org/16914 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Milind Changire <mchangir@redhat.com> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> >(cherry picked from commit 39e09ad1e0e93f08153688c31433c38529f93716) Change-Id: Id5332f3239cb324cead34eb51cf73d426733bd46 BUG: 1434399 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: https://review.gluster.org/16936 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* glusterd: take conn->lock around operations on conn->reconnectJeff Darcy2017-03-101-0/+25
| | | | | | | | | | | | | | | | | | | | | Failure to do this could lead to a race in which a timer would be removed twice concurrently, corrupting the timer list (because gf_timer_call_cancel has no internal protection against this) and possibly causing a crash. Backport of: > 4e0d4b15717da1f6466133158a26927fb91384b8 > BUG: 1421721 > Reviewed-on: https://review.gluster.org/16662 Change-Id: Ic1a8b612d436daec88fd6cee935db0ae81a47d5c BUG: 1431175 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/16885 Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* glusterd: disallow increasing replica count for arbiter volumesRavishankar N2017-03-071-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: add-brick command to increase replica count in an arbiter volume succeeds, causing undesirable effects like the 4th brick being loaded with the arbiter xlator, the 3rd one losing the arbiter xlator (when the brick process is restarted), arbitration logic in afr going for a toss etc. Fix: Arbiter configuration should always be a replica 3 volume (of which 3rd brick is arbiter). Hence disallow increasing replica count for arbiter volume configurations. > Reviewed-on: https://review.gluster.org/16845 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> (cherry picked from commit b7ba77ab3ffb641d06223f7af5145d3d670b032a) Change-Id: I9fe4edac880d0f711e6d44324ad5562974e53e51 BUG: 1429773 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: https://review.gluster.org/16863 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* core: Clean up pmap registry up correctly on volume/brick stopSamikshan Bairagya2017-02-281-0/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit changes the following: 1. In glusterfs_handle_terminate, send out individual pmap signout requests to glusterd for every brick. 2. Add another parameter to glusterfs_mgmt_pmap_signout function to pass the brickname that needs to be removed from the pmap registry. 3. Make sure pmap_registry_search doesn't break out from the loop iterating over the list of bricks per port if the first brick entry corresponding to a port is whitespaced out. 4. Make sure the pmap registry entries are removed for other daemons like snapd. > Reviewed-on: https://review.gluster.org/16689 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Gaurav Yadav <gyadav@redhat.com> > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> (cherry picked from commit 1e3538baab7abc29ac329c78182b62558da56d98) Change-Id: I69949874435b02699e5708dab811777ccb297174 BUG: 1427461 Signed-off-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-on: https://review.gluster.org/16786 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* cluster/ec: Don't trigger data/metadata heal on LookupsPranith Kumar K2017-02-275-41/+95
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem-1 If Lookup which doesn't take any locks observes version mismatch it can't be trusted. If we launch a heal based on this information it will lead to self-heals which will affect I/O performance in the cases where Lookup is wrong. Considering self-heal-daemon and operations on the inode from client which take locks can still trigger heal we can choose to not attempt a heal on Lookup. Problem-2: Fixed spurious failure of tests/bitrot/bug-1373520.t For the issues above, what was happening was that ec_heal_inspect() is preventing 'name' heal to happen Problem-3: tests/basic/ec/ec-background-heals.t To be honest I don't know what the problem was, while fixing the 2 problems above, I made some changes to ec_heal_inspect() and ec_need_heal() after which when I tried to recreate the spurious failure it just didn't happen even after a long time. >BUG: 1414287 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Change-Id: Ife2535e1d0b267712973673f6d474e288f3c6834 >Reviewed-on: https://review.gluster.org/16468 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Ashish Pandey <aspandey@redhat.com> BUG: 1419824 Change-Id: I340b48cd416b07890bf3a5427562f5e3f88a481f Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: https://review.gluster.org/16765 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Smoke: Gluster Build System <jenkins@build.gluster.org>
* marker: Fix inode value in loc, in setxattr fopPoornima G2017-02-201-0/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On recieving a rename fop, marker_rename() stores the, oldloc and newloc in its 'local' struct, once the rename is done, the xtime marker(last updated time) is set on the file, but sending a setxattr fop. When upcall receives the setxattr fop, the loc->inode is NULL and it crashes. The loc->inode can be NULL only in one valid case, i.e. in rename case where the inode of new loc can be NULL. Hence, marker should have filled the inode of the new_loc before issuing a setxattr. marker_rename_cbk was already fixed in a previous commit. Fixing marker_rename_done to send valid inode in this commit. Also in upcall check for NULL inode so that there is no crash. >Reviewed-on: https://review.gluster.org/16633 >Reviewed-by: Kotresh HR <khiremat@redhat.com> >Smoke: Gluster Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: soumya k <skoduri@redhat.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> >(cherry picked from commit 73defab8be16b73241225bb1c2588a61e3e425d5) Change-Id: I3ed2a05118fed3367dfe3251ce4477310cb480d0 BUG: 1424937 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: https://review.gluster.org/16684 Tested-by: Atin Mukherjee <amukherj@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* afr: all children of AFR must be up to resolve s-brainRavishankar N2017-02-151-0/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: The various split-brain resolution policies (favorite-child-policy based, CLI based and mount (get/setfattr) based) attempt to resolve split-brain even when not all bricks of replica are up. This can be a problem when say in a replica 3, the only good copy is down and the other 2 bricks are up and blame each other (i.e. split-brain). We end up healing the file in such a case and allow I/O on it. Fix: A decision on whether the file is in split-brain or not must be taken only if we are able to examine the afr xattrs of *all* bricks of a given replica. Signed-off-by: Ravishankar N <ravishankar@redhat.com> > Reviewed-on: https://review.gluster.org/16476 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> (cherry picked from commit 0e03336a9362e5717e561f76b0c543e5a197b31b) Change-Id: Icddb1268b380005799990f5379ef957d84639ef9 BUG: 1420982 Reviewed-on: https://review.gluster.org/16587 Tested-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* gfapi: create statedump when glusterd requests itNiels de Vos2017-02-132-7/+101
| | | | | | | | | | | | | | | | | | | | | | | | | | When GlusterD sends the STATEDUMP procedure to the libgfapi client, the client checks if it matches the PID that should take the statedump. If so, it will do a statedump for the glfs_t that is connected to this mgmt connection. > BUG: 1169302 > See-also: http://review.gluster.org/9228 > Signed-off-by: Poornima G <pgurusid@redhat.com> > [ndevos: separated patch from 9228] > Reviewed-on: https://review.gluster.org/16415 > Reviewed-by: Niels de Vos <ndevos@redhat.com> > Tested-by: Niels de Vos <ndevos@redhat.com> > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> > Reviewed-by: Prashanth Pai <ppai@redhat.com> BUG: 1418981 Change-Id: I70d6a1f4f19d525377aebc8fa57f51e513b92d84 Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on: https://review.gluster.org/16602 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* gfapi: add API to trigger events for debugging and troubleshootingNiels de Vos2017-02-132-0/+100
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce glfs_sysrq() as a generic API for triggering debug and troubleshoot events. This interface will be used by the feature to get statedumps for applications using libgfapi. The current events that can be requested through this API are: - 'h'elp: log a mesage with all supported events - 's'tatedump: trigger a statedump for the passed glfs_t In future, this API can be used by a CLI to trigger statedumps from storage servers. At the moment it is limited to take statedumps, but it is extensible to set the log-level, clear caches, force reconnects and much more. > BUG: 1169302 > Change-Id: I18858359a3957870cea5139c79efe1365a15a992 > Original-author: Poornima G <pgurusid@redhat.com> > Signed-off-by: Niels de Vos <ndevos@redhat.com> > Reviewed-on-master: http://review.gluster.org/16414 > Reviewed-by: Prashanth Pai <ppai@redhat.com> > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> BUG: 1418981 Change-Id: I18858359a3957870cea5139c79efe1365a15a992 Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on: https://review.gluster.org/16600 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* glusterd: add a cli command to trigger a statedump on a clientPoornima G2017-02-131-0/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With this, we will be able to trigger statedumps on remote Gluster clients, mainly targetted for applications using libgfapi. Design: SIGUSR signal is the most comman way of taking a statedump in Gluster. But it cannot be used for libgfapi based processes, as the process loading the library might have already consumed SIGUSR signal. Hence going by the command way. One has to issue a Gluster command to initiate a statedump on the libgfapi based client. The command takes hostname and PID as an argument. All the glusterds in the cluster, check if they are connected to the specified hostname, and send an RPC request to all the connected clients from that hostname (via the mgmt connection). > URL: http://review.gluster.org/16357 > BUG: 1169302 > Signed-off-by: Poornima G <pgurusid@redhat.com> > [ndevos: minor fixes and split patch in smaller pieces] > Reviewed-on-master: https://review.gluster.org/9228 > Reviewed-by: Niels de Vos <ndevos@redhat.com> > Tested-by: Niels de Vos <ndevos@redhat.com> > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> > Reviewed-by: Samikshan Bairagya <samikshan@gmail.com> BUG: 1418981 Change-Id: Icbe4d2f026b32a2c7d5535e1bfb2cdaaff042e91 Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on: https://review.gluster.org/16601 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* tests: reenable trash.tJeff Darcy2017-02-131-3/+0
| | | | | | | | | | | | | | | | | | | Now that the underlying bug has been fixed (by d97e63d0) we can allow the test to run again. Backport of: > Change-Id: If9736d142f414bf9af5481659c2b2673ec797a4b > BUG: 1420434 > Reviewed-on: https://review.gluster.org/16584 Change-Id: I44edf2acfd5f9ab33bdc95cc9fc981e04c3eead1 BUG: 1418091 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/16598 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* glusterd: keep snapshot bricks separate from regular onesJeff Darcy2017-02-102-13/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | The problem here is that a volume's transport options can change, but any snapshots' bricks don't follow along even though they're now incompatible (with respect to multiplexing). This was causing the USS+SSL test to fail. By keeping the snapshot bricks separate (though still potentially multiplexed with other snapshot bricks including those for other volumes) we can ensure that they remain unaffected by changes to their parent volumes. Also fixed various issues with how the test waits (or more precisely didn't) for various events to complete before it continues. Backport of: > Change-Id: Iab4a8a44fac5760373fac36956a3bcc27cf969da > BUG: 1385758 > Reviewed-on: https://review.gluster.org/16544 ` Change-Id: I91c73e3fdf20d23bff15fbfcc03a8a1922acec27 BUG: 1418091 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/16597 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* trash: fix problem with trash feature under multiplexingJeff Darcy2017-02-101-1/+3
| | | | | | | | | | | | | | | | | | | | | | With multiplexing, the trash translator gets a reconfigure call before a notify(CHILD_UP). In this case, priv->trash_itable was not yet initialized, so the reconfigure would get a SEGV. Moving the itable allocation to init seems to fix it, so trash can be reenabled. Backport of: > Change-Id: I21ac2d7fc66bac1bc4ec70fbc8bae306d73ac565 > BUG: 1420434 > Reviewed-on: https://review.gluster.org/16567 Change-Id: I43a6de6ac5070848619c5f905f075e4a4099c1bd BUG: 1420808 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/16582 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Anoop C S <anoopcs@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* glusterd: ignore return code of glusterd_restart_bricksAtin Mukherjee2017-02-101-0/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | When GlusterD is restarted on a multi node cluster, while syncing the global options from other GlusterD, it checks for quorum and based on which it decides whether to stop/start a brick. However we handle the return code of this function in which case if we don't want to start any bricks the ret will be non zero and we will end up failing the import which is incorrect. Fix is just to ignore the ret code of glusterd_restart_bricks () >Reviewed-on: https://review.gluster.org/16574 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Samikshan Bairagya <samikshan@gmail.com> >Reviewed-by: Jeff Darcy <jdarcy@redhat.com> >(cherry picked from commit 55625293093d485623f3f3d98687cd1e2c594460) Change-Id: I37766b0bba138d2e61d3c6034bd00e93ba43e553 BUG: 1420991 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: https://review.gluster.org/16593 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Samikshan Bairagya <samikshan@gmail.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* tests: fix online_brick_count for multiplexingJeff Darcy2017-02-083-8/+24
| | | | | | | | | | | | | | | | | | | | | | | The number of brick processes no longer matches the number of bricks, therefore counting processes doesn't work. Counting *pidfiles* does. Ironically, the fix broke multiplex.t which used this function, so it now uses a different function with the old process-counting behavior. Also had to fix online_brick_count and kill_node in cluster.rc to be consistent with the new reality. Backport of: > Change-Id: I4e81a6633b93227e10604f53e18a0b802c75cbcc > BUG: 1385758 > Reviewed-on: https://review.gluster.org/16527 Change-Id: I70b5cd169eafe3ad5b523bc0a30d21d864b3036a BUG: 1418091 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/16565 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* CLI/TIER: removing old tier commands under rebalancehari gowtham2017-02-071-1/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | back-port of : https://review.gluster.org/#/c/16463/ PROBLEM: gluster v rebalance <volname> tier start works even after the switch of tier to service framework. This lets the user have two tierd for the same volume. FIX: checking for each process will make the new code hard to maintain. So we are removing the support for old commands. >Change-Id: I5b0974b2dbb74f0bee8344b61c7f924300ad73f2 >BUG: 1415590 >Signed-off-by: hari gowtham <hgowtham@redhat.com> >Reviewed-on: https://review.gluster.org/16463 >Smoke: Gluster Build System <jenkins@build.gluster.org> >Tested-by: hari gowtham <hari.gowtham005@gmail.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: N Balachandran <nbalacha@redhat.com> >Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Change-Id: Ib996d89b1bd250176a3f5eeb369b71b0a4f95968 BUG: 1419868 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: https://review.gluster.org/16555 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: hari gowtham <hari.gowtham005@gmail.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* tests: Marked 2 tests in master that bad, the same in release-3.10Shyam2017-02-072-0/+4
| | | | | | | | | | | | | | | | | | | | | The 2 tests that were marked bad in master are, tests/basic/ec/ec-background-heals.t tests/bitrot/bug-1373520.t The same are marked bad in release-3.10 to prevent any spurious failures during backporting other commits. When the failures are analyzed and fixed in master the same would need to be backported, as applicable, to release-3.10 branch. Change-Id: I7d686bc90e1ee443475d3a3512fa7dfb7cbeede7 BUG: 1419696 Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on: https://review.gluster.org/16549 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* extras: Provide group set for md-cache and invalidation optionsPoornima G2017-02-061-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | To enable the integration of md-cache and invalidation features we need to perform 3 volume set options in a specific order. In order to ease this for user provide a group volume set option. Usage: gluster vol set <VOLNAME> group metadata-cache >Reviewed-on: https://review.gluster.org/16503 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> >Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Change-Id: I9bf0fd4217aa2a1c7ffbdc93e879b10f87addeac BUG: 1419306 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: https://review.gluster.org/16546 Tested-by: Atin Mukherjee <amukherj@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* glusterd: double-check brick liveness for remove-brick validationJeff Darcy2017-02-031-2/+4
| | | | | | | | | | | | | | | | | | | | | Same problem as https://review.gluster.org/#/c/16509/ in a different place. Tests detach bricks without glusterd's knowledge, so glusterd's internal brick state is out of date and we have to re-check (via the brick's pidfile) as well. Backport of: > BUG: 1385758 > Change-Id: I169538c1c62d72a685a49d57ef65fb6c3db6eab2 > Reviewed-on: https://review.gluster.org/16529 BUG: 1418091 Change-Id: Id0b597bc60807ed090f6ecdba549c5cf3d758f98 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/16537 Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* tests: use kill_brick instead of kill -9Jeff Darcy2017-02-037-9/+14
| | | | | | | | | | | | | | | | | | The system actually handles this OK, but with multiplexing the result of killing the whole process is not what some tests assumed. Backport of: > Change-Id: I89ebf0039ab1369f25b0bfec3710ec4c13725915 > BUG: 1385758 > Reviewed-on: https://review.gluster.org/16528 BUG: 1418091 Change-Id: I39943e27b4b1a5e56142f48dc9ef2cdebe0a9d5b Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/16534 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>