summaryrefslogtreecommitdiffstats
path: root/xlators/cluster
Commit message (Collapse)AuthorAgeFilesLines
* cluster/dht: Add migration checks to dht_(f)xattropN Balachandran2018-01-305-53/+334
| | | | | | | | | | | | | | | | The dht_(f)xattrop implementation did not implement migration phase1/phase2 checks which could cause issues with rebalance on sharded volumes. This does not solve the issue where fops may reach the target out of order. > Change-Id: I2416fc35115e60659e35b4b717fd51f20746586c > BUG: 1471031 > Signed-off-by: N Balachandran <nbalacha@redhat.com> Change-Id: I2416fc35115e60659e35b4b717fd51f20746586c BUG: 1498081 Signed-off-by: N Balachandran <nbalacha@redhat.com>
* cluster/afr: Honor default timeout of 5min for analyzing split-brain fileskarthik-us2017-11-271-1/+5
| | | | | | | | | | | | | | | | | | | | Problem: After setting split-brain-choice option to analyze the file to resolve the split brain using the command "setfattr -n replica.split-brain-choice -v "choiceX" <path-to-file>" should allow to access the file from mount for default timeout of 5mins. But the timeout was not honored and was able to access the file even after the timeout. Fix: Call the inode_invalidate() in afr_set_split_brain_choice_cbk() so that it will triger the cache invalidate after resetting the timer and the split brain choice. So the next calls to access the file will fail with EIO. Change-Id: I698cb833676b22ff3e4c6daf8b883a0958f51a64 BUG: 1514388 Signed-off-by: karthik-us <ksubrahm@redhat.com> (cherry picked from commit 933ec57ccda2c1ba5ce6f207313c3b6802e67ca3)
* afr: don't check for file size in afr_mark_source_sinks_if_file_emptyRavishankar N2017-10-051-6/+7
| | | | | | | | | | ... for AFR_METADATA_TRANSACTION and just mark source and sinks if metadata is the same. (cherry picked from commit 24637d54dcbc06de8a7de17c75b9291fcfcfbc84) Change-Id: I69e55d3c842c7636e3538d1b57bc4deca67bed05 BUG: 1496321 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* cluster/dht: EBADF handling for fremovexattr and fsetxattrN Balachandran2017-10-033-3/+44
| | | | | | | | | | | | | | | | | Add EBADF handling for dht_fremovexattr and dht_fsetxattr. > BUG: 1476665 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: https://review.gluster.org/17999 > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit 747a08d34e2a1e94d7fce68a3577370288bb1955) Change-Id: Ide0d5812dae79655d2565157e5baabcd753b4309 BUG: 1467010 Signed-off-by: N Balachandran <nbalacha@redhat.com>
* afr: auto-resolve split-brains for zero-byte filesRavishankar N2017-10-023-0/+79
| | | | | | | | | | | | | | | | | | | | | | | | | | Backport of https://review.gluster.org/#/c/18283/ Problems: As described in BZ 1491670, renaming hardlinks can result in data/mdata split-brain of the DHT link-to files (T files) without any mismatch of data and metadata. As described in BZ 1486063, for a zero-byte file with only dirty bits set, arbiter brick will likely be chosen as the source brick. Fix: For zero byte files in split-brain, pick first brick as a) data source if file size is zero on all bricks. b) metadata source if metadata is the same on all bricks In arbiter case, if file size is zero on all bricks and there are no pending afr xattrs, pick 1st brick as data source. Change-Id: I0270a9a2f97c3b21087e280bb890159b43975e04 BUG: 1496321 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reported-by: Rahul Hinduja <rhinduja@redhat.com> Reported-by: Mabi <mabi@protonmail.ch>
* dht: add FOP check to dht_file_setattr_cbkRavishankar N2017-09-291-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | Problem: bug-797171.7 loaded error-gen xlator on the brick which sent EBADF for a non fd-based fop, namely setattr. This caused dht_check_and_open_fd_on_subvol_task() to crash as local->fd was NULL. Fix: Call dht_check_and_open_fd_on_subvol_task() from dht_file_setattr_cbk only for dht_fsetattr and not dht_setattr or dht_setattr2 > Reviewed-on: https://review.gluster.org/18208 > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Susant Palai <spalai@redhat.com> > Reviewed-by: Amar Tumballi <amarts@redhat.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Reviewed-by: N Balachandran <nbalacha@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit 47188e9eac59de416a5c86c7ec7540ed6aaa1c98) Signed-off-by: Ravishankar N <ravishankar@redhat.com> Change-Id: Iab4999e213bf2065804f3f8237e470ad454e3c99 BUG: 1497122
* afr: Prevent null gfids in self-heal entry re-creationRavishankar N2017-09-171-3/+11
| | | | | | | | | | | | | | | | | | > Reviewed-on: https://review.gluster.org/17981 > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Amar Tumballi <amarts@redhat.com> > Reviewed-by: Karthik U S <ksubrahm@redhat.com> (cherry picked from commit bead74a6e085001225bc0704bad1a5db36dd75a1) Change-Id: I5acb8bd0a19fc4e764d61e349bb690b5236ee610 BUG: 1491985 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: https://review.gluster.org/18300 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* afr: heal metadata in discover code pathRavishankar N2017-09-172-17/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | ****************************************************** Backport of: https://review.gluster.org/18202 Also added loc_is_nameless() to libglusterfs since the patch that introduced it in master was not backported to release-3.10. Note: 18202 is a squash of 17850 and 18187 in master. ****************************************************** During graph switch, if fuse sends nameless (gfid) lookups, afr takes the discover code path to serve it. If there are pending metadata heals, they do not happen unless an inode refresh happens as a part of discover (which is not guaranteed to happen always). This patch fixes it by attempting metadata heal as a part of discover, just like how it is done in lookup code path. Change-Id: I87c493045b9225741cad173bf3f645848697032e BUG: 1492010 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: https://review.gluster.org/18304 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* afr: check validity of afr_replyRavishankar N2017-09-175-47/+71
| | | | | | | | | | | | | | | | | | | | | | ...in various self-heal code paths. Originally found by Pranith in __afr_selfheal_name_impunge () Also change __afr_selfheal_assign_gfid() to send lookup only on those bricks that don't have a gfid matching that of the source. > Reviewed-on: https://review.gluster.org/18065 > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit d594900dbca92c356152be65fce16f77c402117c) Change-Id: I70a2ccd750a2af92c5fc36e0eefb2b6125404b4a BUG: 1491995 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: https://review.gluster.org/18303 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/dht: Check for open fd only on EBADFN Balachandran2017-09-174-160/+160
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | DHT fd based fops used to check if the fd was open on the cached subvol before winding the call. However, this introduced a performance regression of about 30% for reads. This check was introduced to handle cases where files were migrated while IOs were happening. As this is not the common case, dht will now check if the fd is open on the cached subvol only if the call fails with EBADF. This will prevent a performance hit where a rebalance is not running. > BUG: 1476665 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: https://review.gluster.org/17976 > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Amar Tumballi <amarts@redhat.com> > Reviewed-by: Susant Palai <spalai@redhat.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: I2035a858d63c3fcd22bb634055bbb0ad01686808 BUG: 1467010 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: https://review.gluster.org/18057 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/ec: Node uuid xattr support update for ECSunil Kumar Acharya2017-09-132-6/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: The change in EC to return list of node uuids for GF_XATTR_NODE_UUID_KEY was causing problems with geo-rep. Fix: This patch will allow to get the single node uuid as it was doing before with the key "GF_XATTR_NODE_UUID_KEY", and will also allow to get the list of node uuids by using a new key "GF_XATTR_LIST_NODE_UUIDS_KEY". This will solve the problem with geo-rep and any other features which were depending on this. >BUG: 1462790 >Change-Id: I2d9214a9658d4a41a3d6de08600884d2bda5f3eb >Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com> >Reviewed-on: https://review.gluster.org/17594 >Smoke: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com> BUG: 1487647 Change-Id: I2d9214a9658d4a41a3d6de08600884d2bda5f3eb Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com> Reviewed-on: https://review.gluster.org/17667 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/ec: return all node uuids from all subvolumesXavier Hernandez2017-09-012-105/+141
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | EC was retuning the UUID of the brick with smaller value. This had the side effect of not evenly balancing the load between bricks on rebalance operations. This patch modifies the common functions that combine multiple subvolume values into a single result to take into account the subvolume order and, optionally, other subvolumes that could be damaged. This makes easier to add future features where brick order is important. It also makes possible to easily identify the originating brick of each answer, in case some brick will have an special meaning in the future. >Change-Id: Iee0a4da710b41224a6dc8e13fa8dcddb36c73a2f >BUG: 1366817 >Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> >Reviewed-on: https://review.gluster.org/17297 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Ashish Pandey <aspandey@redhat.com> >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> BUG: 1487042 Change-Id: Iee0a4da710b41224a6dc8e13fa8dcddb36c73a2f Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com> Reviewed-on: https://review.gluster.org/18148 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* cluster/rebalance: Fix hardlink migration failuresSusant Palai2017-08-112-15/+82
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A brief about how hardlink migration works: - Different hardlinks (to the same file) may hash to different bricks, but their cached subvol will be same. Rebalance picks up the first hardlink, calculates it's hash(call it TARGET) and set the hashed subvolume as an xattr on the data file. - Now all the hardlinks those come after this will fetch that xattr and will create linkto files on TARGET (all linkto files for the hardlinks will be hardlink to each other on TARGET). - When number of hardlinks on source is equal to the number of hardlinks on TARGET, the data migration will happen. RACE:1 Since rebalance is multi-threaded, the first lookup (which decides where the TARGET subvol should be), can be called by two hardlink migration parallely and they may end up creating linkto files on two different TARGET subvols. Hence, hardlinks won't be migrated. Fix: Rely on the xattr response of lookup inside gf_defrag_handle_hardlink since it is executed under synclock. RACE:2 The linkto files on TARGET can be created by other clients also if they are doing lookup on the hardlinks. Consider a scenario where you have 100 hardlinks. When rebalance is migrating 99th hardlink, as a result of continuous lookups from other client, linkcount on TARGET is equal to source linkcount. Rebalance will migrate data on the 99th hardlink itself. On 100th hardlink migration, hardlink will have TARGET as cached subvolume. If it's hash is also the same, then a migration will be triggered from TARGET to TARGET leading to data loss. Fix: Make sure before the final data migration, source is not same as destination. RACE:3 Since a hardlink can be migrating to a non-hashed subvolume, a lookup from other client or even the rebalance it self, might delete the linkto file on TARGET leading to hardlinks never getting migrated. This will be addressed in a different patch in future. > Change-Id: If0f6852f0e662384ee3875a2ac9d19ac4a6cea98 > BUG: 1469964 > Signed-off-by: Susant Palai <spalai@redhat.com> > Reviewed-on: https://review.gluster.org/17755 > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: N Balachandran <nbalacha@redhat.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Signed-off-by: Susant Palai <spalai@redhat.com> Change-Id: If0f6852f0e662384ee3875a2ac9d19ac4a6cea98 BUG: 1473141 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: https://review.gluster.org/17838 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/dht: fix on demand migration files from clientSusant Palai2017-08-114-20/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On demand migration of files i.e. migration done by clients triggered by a setfattr was broken. Dependency on defrag led to crash when migration was triggered from client. Note: This functionality is not available for tiered volumes. Migration from tier served client will fail with ENOTSUP. usage (But refer to the steps mentioned below to avoid any issues) : setfattr -n "trusted.distribute.migrate-data" -v "1" <filename> The purpose of fixing the on-demand client migration was to give a workaround where the user has lots of empty directories compared to files and want to do a remove-brick process. Here are the steps to trigger file migration for remove-brick process from client. (This is highly recommended to follow below steps as is) Let's say it is a replica volume and user want to remove a replica pair named brick1 and brick2. (Make sure healing is completed before you run these steps) Step-1: Start remove-brick process - gluster v remove-brick <volname> brick1 brick2 start Step-2: Kill the rebalance daemon - ps aux | grep glusterfs | grep rebalance\/ | awk '{print $2}' | xargs kill Step-3: Do a fresh mount as mentioned here - glusterfs -s ${localhostname} --volfile-id rebalance/$volume-name /tmp/mount/point Step-4: Go to one of the bricks (among brick1 and brick2) - cd <brick1 path> Step-5: Run the following command. - find . -not \( -path ./.glusterfs -prune \) -type f -not -perm 01000 -exec bash -c 'setfattr -n "distribute.fix.layout" -v "1" ${mountpoint}/$(dirname '{}')' \; -exec setfattr -n "trusted.distribute.migrate-data" -v "1" ${mountpoint}/'{}' \; This command will ignore the linkto files and empty directories. Do a fix-layout of the parent directory. And trigger a migration operation on the files. Step-6: Once this process is completed do "remove-brick force" - gluster v remove-brick <volname> brick1 brick2 force Note: Use the above script only when there are large number of empty directories. Since the script does a crawl on the brick side directly and avoids directories those are empty, the time spent on fixing layout on those directories are eliminated(even if the script does not do fix-layout on empty directories, post remove-brick a fresh layout will be built for the directory, hence not affecting application continuity). Detailing the expectation for hardlink migartion with this patch: Hardlink is migrated only for remove-brick process. It is highly essential to have a new mount(step-3) for the hardlink migration to happen. Why?: setfattr operation is an inode based operation. Since, we are doing setfattr from fuse mount here, inode_path will try to build path from the linked dentries to the inode. For a file without hardlinks the path construction will be correct. But for hardlinks, the inode will have multiple dentries linked. Without fresh mount, inode_path will always get the most recently linked dentry. e.g. if there are three hardlinks named dir1/link1, dir2/link2, dir3/link3, on a client where these hardlinks are looked up, inode_path will always return the path dir3/link3 if dir3/link3 was looked up most recently. Hence, we won't be able to create linkto files for all other hardlinks on destination (read gf_defrag_handle_hardlink for more details on hardlink migration). With a fresh mount, the lookup and setfattr become serialized. e.g. link2 won't be looked up until link1 is looked up and migrated. Hence, inode_path will always have the correct path, in this case link1 dentry is picked up(as this is the most recently looked up inode) and the path is built right. Note: If you run the above script on an existing mount(all entries looked up), hard links may not be migrated, but there should not be any other issue. Please raise a bug, if you find any issue. Tests: Manual > Change-Id: I9854cdd4955d9e24494f348fb29ba856ea7ac50a > BUG: 1450975 > Signed-off-by: Susant Palai <spalai@redhat.com> > Reviewed-on: https://review.gluster.org/17115 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Signed-off-by: Susant Palai <spalai@redhat.com> Change-Id: I9854cdd4955d9e24494f348fb29ba856ea7ac50a BUG: 1473140 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: https://review.gluster.org/17837 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/dht: initialize throttle option "normal" to same in init and reconfigureSusant Palai2017-08-111-77/+52
| | | | | | | | | | | | | | | | | | | | | | | | | Normal value were different in dht_init and dht_reconfigure. Initialization/reconfigure of throttle option are carved out to a separate function (dht_configure_throttle) now. Normal value will be "2". > Change-Id: Ie323eae019af41d6bef0a136e3d284dc82bab9a1 > BUG: 1451162 > Signed-off-by: Susant Palai <spalai@redhat.com> > Reviewed-on: https://review.gluster.org/17303 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Zhou Zhengping <johnzzpcrystal@gmail.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Signed-off-by: Susant Palai <spalai@redhat.com> Change-Id: Ie323eae019af41d6bef0a136e3d284dc82bab9a1 BUG: 1473137 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: https://review.gluster.org/17836 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/dht: Make rebalance throttle option tuned by numberSusant Palai2017-08-113-26/+110
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current rebalance throttle options: lazy/normal/aggressive may not always be sufficient for the purpose of throttling. In our recent test, we observed for certain setups, normal and aggressive modes behaved similarly consuming full disk bandwidth. So in cases like this admin should be able to tune it down(or vice versa) depending on the need. Along with old throttle configurations, thread counts are tuned based on number. e.g. gluster v set vol-name cluster-rebal.throttle 5. Admin can tune up/down between 0 and the number of cores available. Note: For heterogenous servers, validation will fail on the old server if "number" is given for throttle configuration. The message looks something like this: "volume set: failed: Staging failed on vm2. Error: cluster.rebal-throttle should be {lazy|normal|aggressive}" Test: Manual test by logging active thread number after reconfiguring throttle option. testcase: tests/basic/distribute/throttle-rebal.t > Change-Id: I46e3cde546900307831028b344ecf601fd9b02c3 > BUG: 1438370 > Signed-off-by: Susant Palai <spalai@redhat.com> > Reviewed-on: https://review.gluster.org/16980 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Signed-off-by: Susant Palai <spalai@redhat.com> Change-Id: I46e3cde546900307831028b344ecf601fd9b02c3 BUG: 1473136 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: https://review.gluster.org/17835 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/dht: rebalance perf enhancementSusant Palai2017-08-112-108/+246
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Throttle settings "normal" and "aggressive" for rebalance did not have performance difference. normal mode spawns $(no. of cores - 4)/2 threads and aggressive spawns $(no. of cores - 4) threads. Though aggressive mode has twice the number of threads compared to that of normal mode, there was no performance gain when switched to aggressive mode from normal mode. RCA: During the course of debugging the above problem, we tried assigning migration job to migration threads spawned by rebalance, rather than synctasks(as there is more overhead associated to manage the task queue and threads). This gave us a significant improvement over rebalance under synctasks. This patch does not really gurantee that there will be a clear performance difference between normal and aggressive mode, but this patch certainly maximized the disk utilization for 1GBfiles run. Results: Test enviroment: Gluster Config: Number of Bricks: 2 (one brick per disk(RAID-6 12 disk)) Bricks: Brick1: server1:/brick/test1/1 Brick2: server2:/brick/test1/1 Options Reconfigured: performance.readdir-ahead: on server.event-threads: 4 client.event-threads: 4 1000 files with 1GB each were created/renamed such that all files will have server1 as cached and server2 as hashed, so that all files will be migrated. Test machines had 24 cores each. Results with/without synctask based migration: ----------------------------------------------- mode normal(10threads) aggressive(20threads) timetaken 0:55:30 (h:m:s) 0:56:3 (h:m:s) withsynctask timetaken with migrator 0:38:3 (h:m:s) 0:23:41 (h:m:s) threads From above table it can be seen that, there is a clear 2x perf gain between rebalance with synctask vs rebalance with migrator threads. Additionally this patch modifies the code so that caller will have the exact error number returned by dht_migrate_file(earlier the errno meaning was overloaded). This will help avoiding scenarios where migration failure due to ENOENT, can result in rebalance abort/failure. > Change-Id: I8904e2fb147419d4a51c1267be11a08ffd52168e > BUG: 1420166 > Signed-off-by: Susant Palai <spalai@redhat.com> > Reviewed-on: https://review.gluster.org/16427 > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: N Balachandran <nbalacha@redhat.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Signed-off-by: Susant Palai <spalai@redhat.com> Change-Id: I8904e2fb147419d4a51c1267be11a08ffd52168e BUG: 1473134 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: https://review.gluster.org/17834 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/dht: correct space check for rebalanceSusant Palai2017-08-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | With rebalance doing fallocate on destination, we don't need to add file size to the "destination available space" to decide whether to migrate the file or not. Notes: Fallocate would have already occupied the file size space on destination > Change-Id: If7f6a6654e6257726680cf20d618482a6e9095a6 > BUG: 1441508 > Signed-off-by: Susant Palai <spalai@redhat.com> > Reviewed-on: https://review.gluster.org/17104 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Amar Tumballi <amarts@redhat.com> > Reviewed-by: N Balachandran <nbalacha@redhat.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Signed-off-by: Susant Palai <spalai@redhat.com> Change-Id: If7f6a6654e6257726680cf20d618482a6e9095a6 BUG: 1473133 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: https://review.gluster.org/17833 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/dht: Skip file migration if the subvol that meets min-free-diskSusant Palai2017-08-113-25/+92
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ... criteria happens to be the same subvol containing data-file Rebalance need to figure out a new subvol in case the hashed subvol does not have enough space. In the process of figuring out the new subvol, we need to ignore the source subvol, otherwise it will lead to data loss. Test: Manual Ran the following sizeof /tmp/1: 1.5GB sizeof /brick/1: 16GB sizeof /tmp/2: 1.5GB <start> glusterd; gluster v create test1 vm1:/brick/1 vm1:/tmp/1; gluster v start test1; mount -t glusterfs vm1:test1 /mnt; for i in {1..2000} do dd if=/dev/zero of=/mnt/file$i bs=1KB count=1 &> /dev/null; done gluster v add-brick test1 vm1:/tmp/2 gluster v set test1 min-free-disk 12GB gluster v remove-brick test1 vm1:/tmp/1 star <end> file count and data were intact. > Change-Id: Ib8fc8467a3d48a7c12958824c4f0b88e160b86c1 > BUG: 1441508 > Signed-off-by: Susant Palai <spalai@redhat.com> > Reviewed-on: https://review.gluster.org/17064 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Signed-off-by: Susant Palai <spalai@redhat.com> Change-Id: Ib8fc8467a3d48a7c12958824c4f0b88e160b86c1 BUG: 1473133 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: https://review.gluster.org/17832 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: Shyamsundar Ranganathan <srangana@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/dht: Make rebalance honor min-free-diskSusant Palai2017-08-113-16/+234
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | test: Manual created files of size 1K on 2 brick(of size 1GB) setup . added a brick of size 16GB. set min-free-disk to 12GB(so that first two bricks won't receive any files). removed one of the 1st brick of size 1GB. Logs from test: [2017-04-12 08:52:08.196484] W [MSGID: 0] [dht-rebalance.c:895:__dht_check_free_space] 0-test1-dht: Write will cross min-free-disk for file - /tile32 on subvol - test1-client-1. Looking for new subvol. [2017-04-12 08:52:08.196904] I [MSGID: 0] [dht-rebalance.c:925:__dht_check_free_space] 0-test1-dht: new target found - test1-client-2 for file - /tile32 - Post migration we have two files. The new destination (/brick/1) has the data file [root@vm1 ~]# ll /brick/1/tile32 -rw-r--r--. 2 root root 0 Apr 12 14:22 /brick/1/tile32 - On the old target the linkto file is there with linkto xattr pointing to /brick/1 [root@vm1 ~]# ll /tmp/2/tile32 ---------T. 2 root root 1000 Apr 12 14:22 /tmp/2/tile32 [root@vm1 ~]# getfattr -m . -de text /tmp/2/tile32 getfattr: Removing leading '/' from absolute path names security.selinux="unconfined_u:object_r:user_tmp_t:s0" trusted.gfid="����:Aс�#�/'b2" trusted.glusterfs.dht.linkto="test1-client-2" Marking ./tests/features/worm_sh.t as bad test. Reason being, this patch failed on master branch as well and it has nothing to do with rebalance/remove-brick. > BUG: 1441508 > Change-Id: I90bae251cda3d957a49cdceda90cd08311a392fb > Signed-off-by: Susant Palai <spalai@redhat.com> > Reviewed-on: https://review.gluster.org/17034 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Amar Tumballi <amarts@redhat.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Signed-off-by: Susant Palai <spalai@redhat.com> Change-Id: I90bae251cda3d957a49cdceda90cd08311a392fb BUG: 1473132 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: https://review.gluster.org/17831 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* dht/rebalance: Crawler performance improvementSusant Palai2017-08-111-127/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The job of the crawler in rebalance is to fetch files from each local subvolume and push them to migration queue if it is eligible for migration. And we do a lookup on the entries received to figure out the eligibilty. Since, the lookup done is on a local subvolume we receive linkto files and regular files as well. This requires us to do two lookups. first: do a lookup on the file to figure out whether it is a linkto file second: do a lookup on the file to figure out if it should be migrated Note: The migrator thread also does one lookup for the file before migration. Optimization: Remove the lookup done by the crawler. Offload these task to the migrator threads. For linkto file verification get the stat and xattr information from readdirp. So in total we have one lookup instead of three for each entry. Performance numbers: Create two node, two brick setup. Created 100000 files. And started rebalance. Since, there is no add-brick, no files will be migrated and we will get the crawler performance. Without patch: [root@gprfs039 ~]# grs Node Rebalanced-files size scanned failures skipped status run time in h:m:s --------- ----------- ----------- ----------- ----------- ----------- ------------ -------------- localhost 0 0Bytes 50070 0 0 completed 0:0:48 server2 0 0Bytes 49930 0 0 completed 0:0:44 volume rebalance: test1: success Total: 48 seconds WiththecurrentPatch: [root@gprfs039 mnt]# gluster v rebalance test1 status Node Rebalanced-files size scanned failures skipped status run time in h:m:s --------- ----------- ----------- ----------- ----------- ----------- ------------ -------------- localhost 0 0Bytes 50070 0 0 completed 0:0:12 server2 0 0Bytes 49930 0 0 completed 0:0:12 volume rebalance: test1: success Total: 12 seconds That's 4X speed gain. :) > Updates glusterfs#155 > Change-Id: Idc8e5b366e76c54aa40d698876ae62fe1630b6cc > BUG: 1439571 > Signed-off-by: Susant Palai <spalai@redhat.com> > Reviewed-on: https://review.gluster.org/15781 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Updates glusterfs#155 Change-Id: Idc8e5b366e76c54aa40d698876ae62fe1630b6cc BUG: 1473129 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: https://review.gluster.org/17830 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* refcount: typecast function for calling on freeNiels de Vos2017-08-111-7/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All of the functions called to free the refcounted structure are doing a typecast from (void*) to their own type taht is being free'd. This really is not needed and the refcount interface is made a little simpler without the requirement of typecasting. With this small improvement in the API, all callers are updated too. Cherry picked from commit f2ca301bd741e3e3f076cd3f72fcd377bcef2a1a: > Change-Id: I32473b6d1799f62861d4b2d78ea30c09e6c80ab1 > BUG: 1416889 > Signed-off-by: Niels de Vos <ndevos@redhat.com> > Reviewed-on: https://review.gluster.org/16471 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Backport note: This patch makes it easier to backport changes that use gf_refcount_t. There is no functional change. Change-Id: I32473b6d1799f62861d4b2d78ea30c09e6c80ab1 BUG: 1471870 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: https://review.gluster.org/17913 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/dht: Fix fd check raceN Balachandran2017-07-202-1/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a another race between the cached subvol being updated in the inode_ctx and the fd being opened on the target. 1. fop1 -> fd1 -> subvol0 2. file migrated from subvol0 to subvol1 and cached_subvol changed to subvol1 in inode_ctx 3. fop2 -> fd1 -> subvol1 [takes new cached subvol] 4. fop2 -> checks fd ctx (fd not open on subvol1) -> opens fd1 on subvol1 5. fop1 -> checks fd ctx (fd not open on subvol0) -> tries to open fd1 on subvol0 -> fails with "No such file on directory". Fix: If dht_fd_open_on_dst fails with ENOENT or ESTALE, wind to old subvol and let the phase1/phase2 checks handle it. Change-Id: I34f8011574a8b72e3bcfe03b0cc4f024b352f225 BUG: 1467010 > BUG: 1465075 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: https://review.gluster.org/17731 > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Reviewed-by: Amar Tumballi <amarts@redhat.com> (cherry picked from commit f7a450c17fee7e43c544473366220887f0534ed7) Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: https://review.gluster.org/17829 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
* cluster/dht: Check if fd is opened on dst subvolN Balachandran2017-07-196-30/+543
| | | | | | | | | | | | | | | | | | | | | | | | | | | If an fd is opened on a file, the file is migrated and the cached subvol is updated in the inode_ctx before an fd based fop is sent, the fop is sent to the dst subvol on which the fd is not opened. This causes the FOP to fail with EBADF. Now, every fd based fop will check to see that the fd has been opened on the dst subvol before winding it down. > BUG: 1465075 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: https://review.gluster.org/17630 > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Reviewed-by: Susant Palai <spalai@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit 91db0d47ca267aecfc6124a3f337a4e2f2c9f1e2) Change-Id: Id92ef5eb7a5b5226688e2d2868b15e383f5f240e BUG: 1467010 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: https://review.gluster.org/17752 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* afr: mark non sources as sinks in metadata healRavishankar N2017-07-192-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | Problem: In a 3 way replica, when the source brick does not have pending xattrs for the sinks, but the 2 sinks blame each other, metadata heal was not happpening because we were not setting all non-sources as sinks. Fix: Mark all non-sources as sinks, like it is done in data and entry heal. > Reviewed-on: https://review.gluster.org/17717 > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit 77c1ed5fd299914e91ff034d78ef6e3600b9151c) Change-Id: I534978940f5087302e307fcc810a48ffe898ce08 BUG: 1471612 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: https://review.gluster.org/17782 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/ec: correctly handle end of file for seekXavier Hernandez2017-07-071-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | When a SEEK_HOLE was issued near to the end of file, sometimes an offset beyond the end of file was returned. Another problem was that using some offsets greater than the end of file returned successfully instead of failing with ENXIO. >Change-Id: I238d2884ba02fd19a78116b0f8f8e8d6338fb3f5 >BUG: 1449348 >Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> >Reviewed-on: https://review.gluster.org/17228 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Amar Tumballi <amarts@redhat.com> >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> >(cherry picked from commit eb96dd45f8e583c6bad84bf32ca17e2bb01dd38f) Change-Id: I238d2884ba02fd19a78116b0f8f8e8d6338fb3f5 BUG: 1468126 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: https://review.gluster.org/17711 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* cluster/dht: Additional checks for rebalance estimatesN Balachandran2017-07-041-5/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The rebalance estimates calculation was not handling calculations correctly when no files had been processed, i.e., when rate_lookedup was 0. Now, the estimated time is set to 0 in such scenarios as there is no way for rebalance to figure out how long the process will take to complete without knowing the rate at which the files are being processed. > BUG: 1457985 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: https://review.gluster.org/17564 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Amar Tumballi <amarts@redhat.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: I7b6378e297e1ba139852bcb2239adf2477336b5b BUG: 1460914 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: https://review.gluster.org/17599 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/afr: Returning single and list of node uuids from AFRkarthik-us2017-07-011-9/+79
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: The change in afr to return list of node uuids was causing problems with geo-rep. Fix: This patch will allow to get the single node uuid as it was doing before with the key "GF_XATTR_NODE_UUID_KEY", and will also allow to get the list of node uuids by using a new key "GF_XATTR_LIST_NODE_UUIDS_KEY". This will solve the problem with geo-rep and any other feature which were depending on this. > Change-Id: I09885dac6dfca127be94b708470c8c2941356f9a > BUG: 1462790 > Signed-off-by: karthik-us <ksubrahm@redhat.com> > Reviewed-on: https://review.gluster.org/17576 > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Ravishankar N <ravishankar@redhat.com> > Reviewed-by: Kotresh HR <khiremat@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> (cherry picked from commit 475ec9928ef96b63a0bfa859a9ae68709275033c) Change-Id: I5e741a48a426ee9a3cc69612051e0e9bcf33b500 BUG: 1464078 Signed-off-by: karthik-us <ksubrahm@redhat.com> Reviewed-on: https://review.gluster.org/17603 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster:dht Fix crash in dht_rename_lock_cbkN Balachandran2017-07-011-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | Use a local variable to store the call count in the STACK_WIND for loop. Using frame->local is dangerous as it could be freed while the loop is still being processed > BUG: 1466110 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: https://review.gluster.org/17645 > Smoke: Gluster Build System <jenkins@build.gluster.org> > Tested-by: Nigel Babu <nigelb@redhat.com> > Reviewed-by: Amar Tumballi <amarts@redhat.com> > Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> (cherry picked from commit 56da27cf5dc6ef54c7fa5282dedd6700d35a0ab0) Change-Id: Ie65cdcfb7868509b4a83bc2a5b5d6304eabfbc8e BUG: 1466863 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: https://review.gluster.org/17665 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/dht: Include dirs in rebalance estimatesN Balachandran2017-06-203-31/+83
| | | | | | | | | | | | | | | | | | | | | | | | | Empty directories were not being considered while calculating rebalance estimates leading to negative time-left values being displayed as part of the rebalance status. > BUG: 1457985 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: https://review.gluster.org/17448 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Amar Tumballi <amarts@redhat.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: I48d41d702e72db30af10e6b87b628baa605afa98 BUG: 1460914 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: https://review.gluster.org/17530 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
* afr: add errno to afr_inode_refresh_done()Ravishankar N2017-06-161-7/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of https://review.gluster.org/17413 and https://review.gluster.org/17436 Problem: When parellel `rm -rf`s were being done from cifs clients, opendir might fail on some replicas with ENOENT. DHT ignores partial opendir failures in dht_fd_cbk() and winds readdirs on those replicas. Afr inode refresh (as a part of readdirp read_txn) sees in its fd context that the state of the fds is *not* AFR_FD_OPENED and bails out to afr_inode_refresh_done() without doing a refresh. When this happens, the errno is set as EIO due to lack of readable subvols, logging split-brain messages in the logs. Fix: Introduce an errno argument to afr_inode_refresh_do() to bail out with the right error value when inode refresh is not performed. Change-Id: Id88e604278abb8df47750d45258d9e6dde710600 BUG: 1457732 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: https://review.gluster.org/17516 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/afr: Return the list of node_uuids for the subvolumekarthik-us2017-05-303-50/+145
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: AFR was returning the node uuid of the first node for every file if the replica set was healthy, which was resulting in only one node migrating all the files. Fix: With this patch AFR returns the list of node_uuids to the upper layer, so that they can decide on which node to migrate which files, resulting in improved performance. Ordering of node uuids will be maintained based on the ordering of the bricks. If a brick is down, then the node uuid for that will be set to all zeros. > Reviewed-on: https://review.gluster.org/17084 > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit 0a50167c0a8f950f5a1c76442b6c9abea466200d) Change-Id: I73ee0f9898ae473584fdf487a2980d7a6db22f31 BUG: 1451561 Signed-off-by: karthik-us <ksubrahm@redhat.com> Reviewed-on: https://review.gluster.org/17337 Tested-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
* cluster/dht: Fix crash in dht_selfheal_dir_setattrN Balachandran2017-05-291-2/+6
| | | | | | | | | | | | | | | | | | | | | | | Use a local variable to store the call cnt used in the for loop for the STACK_WIND so as not to access local which may be freed by STACK_UNWIND after all fops return. > BUG: 1452102 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: https://review.gluster.org/17343 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit 17784aaa311494e4538c616f02bf95477ae781bc) Change-Id: I24f49b6dbd29a2b706e388e2f6d5196c0f80afc5 BUG: 1453056 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: https://review.gluster.org/17349 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* afr: propagate correct errno for fop failures in arbiterRavishankar N2017-05-184-15/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: If quorum is not met in fop cbk, arbiter sends an ENOTCONN error to the upper xlators. In a VM workload with sharding enabled, this was leading to the VM pausing when replace-brick was performed as described in the BZ. Fix: Move the fop cbk arbitration logic to afr_handle_quorum() because in normal replica volumes, that is the function that has the quorum and errno checks in the fop cbk path before doing a post-op. Thanks to Pranith for suggesting this approach. > Reviewed-on: https://review.gluster.org/17235 > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit 93c850dd2a513fab75408df9634ad3c970a0e859) Change-Id: Ie6315db30c5e36326b71b90a01da824109e86796 BUG: 1450934 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: https://review.gluster.org/17295 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
* cluster/dht: Rebalance on all nodes should migrate filesN Balachandran2017-05-186-20/+215
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Rebalance compares the node-uuid of a file against its own to and migrates a file only if they match. However, the current behaviour in both AFR and EC is to return the node-uuid of the first brick in a replica set for all files. This means a single node ends up migrating all the files if the first brick of every replica set is on the same node. Fix: AFR and EC will return all node-uuids for the replica set. The rebalance process will divide the files to be migrated among all the nodes by hashing the gfid of the file and using that value to select a node to perform the migration. This patch makes the required DHT and tiering changes. Some tests in rebal-all-nodes-migrate.t will need to be uncommented once the AFR and EC changes are merged. > BUG: 1366817 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: https://review.gluster.org/17239 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Amar Tumballi <amarts@redhat.com> > Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> > Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> (cherry picked from commit b23bd3dbc2c153171d0bb1205e6804afe022a55f) Change-Id: I5ce41600f5ba0e244ddfd986e2ba8fa23329ff0c BUG: 1451561 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: https://review.gluster.org/17311 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/dht: Fix crash in dht rmdirN Balachandran2017-05-181-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Using local->call_cnt to check STACK_WINDs can cause dht_rmdir_do to be called erroneously if dht_rmdir_readdirp_cbk unwinds before we check if local->call_cnt is zero in dht_rmdir_opendir_cbk. This can cause frame corruptions and crashes. Thanks to Shyam (srangana@redhat.com) for the analysis. > BUG: 1451083 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: https://review.gluster.org/17305 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> (cherry picked from commit 6f7d55c9d58797beaf8d5393c03a5a545bed8bec) Change-Id: I5362cf78f97f21b3fade0b9e94d492002a8d4a11 BUG: 1451371 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: https://review.gluster.org/17309 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* dht: Add missing braces in dht_opendirPoornima G2017-05-141-1/+2
| | | | | | | | | | | | | | | | | | | | | >Change-Id: I6adce98f52e17953f501bc590ff7189cceac3c31 >BUG: 1431908 >Signed-off-by: Poornima G <pgurusid@redhat.com> >Reviewed-on: https://review.gluster.org/17057 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Vijay Bellur <vbellur@redhat.com> (cherry picked from commit af218797fa98f2f75594fc9ae595f184682f1a0d) Change-Id: I6adce98f52e17953f501bc590ff7189cceac3c31 BUG: 1435942 Reviewed-on: https://review.gluster.org/17285 Tested-by: Raghavendra Talur <rtalur@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
* afr: send the correct iatt values in fsync cbkRavishankar N2017-05-141-25/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: afr unwinds the fsync fop with an iatt buffer from one of its children on whom fsync was successful. But that child might not be a valid read subvolume for that inode because of pending heals or because it happens to be the arbiter brick etc. Thus we end up sending the wrong iatt to mdcache which will in turn serve it to the application on a subsequent stat call as reported in the BZ. Fix: Pick a child on whom the fsync was successful *and* that is readable as indicated in the inode context. > Reviewed-on: https://review.gluster.org/17227 > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit 1a8fa910ccba7aa941f673302c1ddbd7bd818e39) Change-Id: Ie8647289219cebe02dde4727e19a729b3353ebcf BUG: 1444892 RCA'ed-by: Miklós Fokin <miklos.fokin@appeartv.com> Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: https://review.gluster.org/17247 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/ec: fix incorrect answer check in seek fopXavier Hernandez2017-05-111-15/+8
| | | | | | | | | | | | | | | | | | | | | | | | A bad check in the answer of a seek request caused a segmentation fault when seek reported an error. > Change-Id: Ifb25ae8bf7cc4019d46171c431f7b09b376960e8 > BUG: 1439068 > Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> > Reviewed-on: https://review.gluster.org/16998 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Amar Tumballi <amarts@redhat.com> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Change-Id: Ifb25ae8bf7cc4019d46171c431f7b09b376960e8 BUG: 1438813 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: https://review.gluster.org/17232 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>
* cluster/dht: Fix ret checkN Balachandran2017-05-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Fixed an incorrect return code check in the rebalance code. > BUG: 1448640 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: https://review.gluster.org/17197 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit 67598f538efb24a9e5ac561b294a05e707e15761) Change-Id: I60804ff121cec7a2f0419e2ee70dd22ea7533c0c BUG: 1448864 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: https://review.gluster.org/17204 Reviewed-by: MOHIT AGRAWAL <moagrawa@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/dht: Pass the correct xdata in fremovexattr fopKrutika Dhananjay2017-05-011-8/+4
| | | | | | | | | | | | | | | | | | Backport of: > Change-Id: Id84bc87e48f435573eba3b24d3fb3c411fd2445d > BUG: 1440051 > Reviewed-on: https://review.gluster.org/17126 > (cherry-picked from ab88f655e6423f51e2f2fac9265ff4d4f5c3e579) Change-Id: Id84bc87e48f435573eba3b24d3fb3c411fd2445d BUG: 1426508 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: https://review.gluster.org/17134 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
* cluster/dht Remove redundant logs in dht rmdirN Balachandran2017-04-271-8/+7
| | | | | | | | | | | | | | | | | | | | | | | | Removing redundant logs were introduced in https://review.gluster.org/#/c/17065/ > BUG: 1445590 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: https://review.gluster.org/17118 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Susant Palai <spalai@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit 25f0a7b153b30b2c0e8278b0ce11d1199c3fb006) Change-Id: I0d6055488b51a13c91d2121e87f653cdb94888b0 BUG: 1446227 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: https://review.gluster.org/17130 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/dht: Pass the req dict instead of NULL in dht_attr2()Krutika Dhananjay2017-04-273-57/+74
| | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of: > Change-Id: Id7823fd932b4e5a9b8779ebb2b612a399c0ef5f0 > BUG: 1440051 > Reviewed on: https://review.gluster.org/17085 > (cherry-picked from commit d60ca8e96bbc16b13f8f3456f30ebeb16d0d1e47) This bug was causing VMs to pause during rebalance. When qemu winds down a STAT, shard fills the trusted.glusterfs.shard.file-size attribute in the req dict which DHT doesn't wind its STAT fop with upon detecting the file has undergone migration. As a result shard doesn't find the value to this key in the unwind path, causing it to fail the STAT with EINVAL. Also, the same bug exists in other fops too, which is also fixed in this patch. Change-Id: Id7823fd932b4e5a9b8779ebb2b612a399c0ef5f0 BUG: 1426508 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: https://review.gluster.org/17119 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/dht: rm -rf fails if dir has stale linkto filesN Balachandran2017-04-271-45/+203
| | | | | | | | | | | | | | | | | | | | | | | | | | | | rm -rf <dir> fails with ENOENT if dir contains a lot of stale linkto files. This is because a single readdirp is sent as part of the rmdir which would return and delete only as many linkto files on the bricks as would fit in one readdirp buffer. Running rm -rf <dir> multiple times will eventually delete all the files. The fix sends readdirp on each subvol until no more entries are returned. > BUG: 1442724 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: https://review.gluster.org/17065 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit e5f9ba138571bd18226462c49ff6a55f5c3ed3a4) Change-Id: I447f2d193de4bd8ac16e4541c6b919d22250e39e BUG: 1444540 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: https://review.gluster.org/17102 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* dht: The xattrs sent in readdirp should be sent in opendir aswellPoornima G2017-04-271-28/+44
| | | | | | | | | | | | | | | | | | | | | | As readdir-ahead can be loaded as a child of dht, dht has to specify the xattrs it is intrested in, as part of opendir call itself. >Reviewed-on: https://review.gluster.org/16902 >Smoke: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >(cherry picked from commit 0f71338e1d7c0b70f4fe3b19c68612fe730d9de2) Change-Id: I012ef96cc143b0cef942df78aa7150d85ec38606 BUG: 1435942 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: https://review.gluster.org/16947 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* afr: don't do a post-op on a brick if op failedRavishankar N2017-04-271-6/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: In afr-v2, self-blaming xattrs are not there by design. But if the FOP failed on a brick due to an error other than ENOTCONN (or even due to ENOTCONN, but we regained connection before postop was wound), we wind the post-op also on the failed brick, leading to setting self-blaming xattrs on that brick. This can lead to undesired results like healing of files in split-brain etc. Fix: If a fop failed on a brick on which pre-op was successful, do not perform post-op on it. This also produces the desired effect of not resetting the dirty xattr on the brick, which is how it should be because if the fop failed on a brick, there is no reason to clear the dirty bit which actually serves as an indication of the failure. > Reviewed-on: https://review.gluster.org/16976 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> (cherry picked from commit 10dad995c989e9d77c341135d7c48817baba966c) Change-Id: I5f1caf4d1b39f36cf8093ccef940118638caa9c4 BUG: 1443501 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: https://review.gluster.org/17083 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/dht: Modify local->loc.gfid in thread safe mannerPranith Kumar K2017-04-131-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of https://review.gluster.org/16986 Problem: local->loc.gfid in dht_lookup_directory() will be null-gfid for a fresh lookup. dht_lookup_dir_cbk() updates local->loc.gfid while in other thread dht_lookup_directory() is still winding lookup calls to subvolumes so there is a chance of partial gfid being seen by EC. We saw in 12x(4+2) volume, ec is receiving an loc where the gfid has last 10 bytes matching with the gfid of the directory and the first 4 bytes are all-zeros. This is leading to EC erroring out the lookup with EINVAL which leads to NFS failing lookup with EIO. snip from gdb: $37 = (dht_local_t *) 0x7fde5de5b3cc (gdb) p /x $37->loc.gfid $39 = {0x3b, 0x82, 0x10, 0x5e, 0x40, 0x65, 0x43, 0x14, 0xa0, 0xc6, 0x8, 0xf5, 0x6c, 0x2c, 0xb8, 0x56} (gdb) fr 7 state=<optimized out>) at ec-generic.c:837 837 ec_lookup_rebuild(fop->xl->private, fop, cbk); (gdb) p /x fop->loc[0].gfid $40 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x43, 0x14, 0xa0, 0xc6, 0x8, 0xf5, 0x6c, 0x2c, 0xb8, 0x56} snip from log: [2017-01-29 03:22:30.132328] W [MSGID: 122019] [ec-helpers.c:354:ec_loc_gfid_check] 0-butcher-disperse-4: Mismatching GFID's in loc [2017-01-29 03:22:30.132709] W [MSGID: 112199] [nfs3-helpers.c:3515:nfs3_log_newfh_res] 0-nfs-nfsv3: /linux-4.9.5/Documentation => (XID: b27b9474, MKDIR: NFS: 5(I/O error), POSIX: 5(Input/output error)), FH: exportid 00000000-0000-0000-0000-000000000000, gfid 00000000-0000-0000-0000-000000000000, mountid 00000000-0000-0000-0000-000000000000 [Invalid argument] Fix: update local->loc.gfid in last-call to make sure there are no races. >BUG: 1438411 >Change-Id: Ifcb7e911568c1f1f83123da6ff0cf742b91800a0 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> BUG: 1438423 Change-Id: I804822a1d50215301881ac18318282c1a6951cfb Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: https://review.gluster.org/16987 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* build: miscellaneous spelling fixesPatrick Matthäi2017-04-021-1/+1
| | | | | | | | | | | | | | | | | | Debian builds detected spelling issues with GlusterFS 3.10.1. Instead of carrying the patch in the Debian sources, let's include the fixes here too. Change-Id: I38db6adf142f7ec247bffd47aa1e6ff1a0c49e00 Reviewed-on-master: https://review.gluster.org/16973 Reported-by: Patrick Matthäi <pmatthaei@debian.org> BUG: 1437854 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: https://review.gluster.org/16974 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Zhou Zhengping <johnzzpcrystal@gmail.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* cluster/afr: Undo pending xattrs only on the up brickskarthik-us2017-03-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: While doing conservative merge, even if a brick is down, it will reset the pending xattr on that. When that brick comes up, as part of the heal, it will consider this brick as the source and removes the entries on the other bricks, which leads to data loss. Fix: Undo pending only for the bricks which are up. > Change-Id: I18436fa0bb1faa5f60531b357dea3f6b20446303 > BUG: 1433571 > Signed-off-by: karthik-us <ksubrahm@redhat.com> > Reviewed-on: https://review.gluster.org/16913 > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Ravishankar N <ravishankar@redhat.com> (cherry picked from commit f91596e6566c605e70a31a60523d11f78a097c3c) Change-Id: I51dbdc53e84051ec73308df9d4cf27726fc29dc7 BUG: 1436203 Signed-off-by: karthik-us <ksubrahm@redhat.com> Reviewed-on: https://review.gluster.org/16955 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com>
* afr: do not mention split-brain in log message in read_txnRavishankar N2017-03-271-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I am seeing a lot of messages in qe/customer logs where read_txn complains that file is possibly in split-brain because of no readable subvol being found, does inode refresh and then there is no split-brain message post the inode refresh. This means that a lookup was not issued on the indoe to populate 'readable' or it can mean one brick is source for data and the other for metadata, making readable to be zero (because readable=intersection of (data,metadata readable) since commit 7a1c1e290470149696. Since we anyway log actual split-brains post inode-refresh, move this message to DEBUG log level. > Signed-off-by: Ravishankar N <ravishankar@redhat.com> > Reviewed-on: https://review.gluster.org/16879 > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit 71e023fcaab0058f32fedc7b6b702040fdd85f46) Change-Id: Idb88b8ea362515279dc9b246f06b6b646c6d8013 BUG: 1434303 Reviewed-on: https://review.gluster.org/16934 Tested-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>