summaryrefslogtreecommitdiffstats
path: root/xlators/cluster
Commit message (Collapse)AuthorAgeFilesLines
* ec: Invalidations in disperse volume should not update the statPoornima G2017-01-172-2/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/16329 Issue: In disperse volume, the file is present across bricks, hence the stat from one brick doesn't carry the valid size of the file. Therefore the upcall from one brick updating the md-cache results in wrong size being updated. Fix: If the notification is cache invalidation then, indicate md-cache that the attributes is invalid. >Reviewed-on: http://review.gluster.org/16329 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> (cherry picked from commit 95d07a3d2d68805d93d36a447436e27c48777939) BUG: 1410688 Change-Id: Id89d2283478e70b62b435a8891fffc86d2be8cb2 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/16341 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/afr: Remove backward compatibility for locks with v1Pranith Kumar K2017-01-172-69/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we have cascading locks with same lk-owner there is a possibility for a deadlock to happen. One example is as follows: self-heal takes a lock in data-domain for big name with 256 chars of "aaaa...a" and starts heal in a 3-way replication when brick-0 is offline and healing from brick-1 to brick-2 is in progress. So this lock is active on brick-1 and brick-2. Now brick-0 comes online and an operation wants to take full lock and the lock is granted at brick-0 and it is waiting for lock on brick-1. As part of entry healing it takes full locks on all the available bricks and then proceeds with healing the entry. Now this lock will start waiting on brick-0 because some other operation already has a granted lock on it. This leads to a deadlock. Operation is waiting for unlock on "aaaa..." by heal where as heal is waiting for the operation to unlock on brick-0. Initially I thought this is happening because healing is trying to take a lock on all the available bricks instead of just the bricks that are participating in heal. But later realized that same kind of deadlock can happen if a brick goes down after the heal starts but comes back before it completes. So the essential problem is the cascading locks with same lk-owner which were added for backward compatibility with afr-v1 which can be safely removed now that versions with afr-v1 are already EOL. This patch removes the compatibility with v1 which requires cascading locks with same lk-owner. In the next version we can make locking-scheme option a dummy and switch completely to v2. >BUG: 1401404 >Change-Id: Ic9afab8260f5ff4dff5329eb0429811bcb879079 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/16024 >Smoke: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Ravishankar N <ravishankar@redhat.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> BUG: 1413062 Change-Id: I4f5d485d9e0646ad3dc384e5ec36682b0933c9d3 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/16413 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* cluster/afr: Do not log of split-brain when there isn't oneKrutika Dhananjay2017-01-133-24/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/16362 * Even on errors like ENOENT, AFR logs split-brain after read-txn refresh, introduced by commit a07ddd8f. This can be a cause of much panic and confusion and needs to be fixed. * Also fixed this issue in write-txns. * Fixed afr read txns to log about split-brain only after knowing that there is no split-brain choice configured. * Removed code duplication * Fixed incorrect passing of error code in afr_write_txn_refresh_done() (the function was passing -0 as errno to gf_msg(). Change-Id: Ie40d2c498674a1fe8dc2c521b05e30c0bce85c02 BUG: 1412914 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/16388 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* afr: Avoid resetting event_gen when brick is always downRavishankar N2017-01-133-19/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: __afr_set_in_flight_sb_status(), which resets event_gen to zero, is called if failed_subvols[i] is non-zero for any brick. But failed_subvols[i] is true even if the brick was down *before* the transaction started. Hence say if 1 brick is down in a replica-3, every writev that comes will trigger an inode refresh because of this resetting, as seen from the no. of FSTATs in the profile info in the BZ. Fix: Reset event gen only if the brick was previously a valid read child and the FOP failed on it the first time. Also `s/afr_inode_read_subvol_reset/afr_inode_event_gen_reset` because the function only resets event gen and not the data/metadata readable. > Signed-off-by: Ravishankar N <ravishankar@redhat.com> > Reviewed-on: http://review.gluster.org/16309 > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit 522640be476a3f97dac932f7046f0643ec0ec2f2) Change-Id: I603ae646cbde96995c35db77916e2ed80b602a91 BUG: 1412886 Reviewed-on: http://review.gluster.org/16385 Tested-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/ec: Check xdata to avoid memory leakAshish Pandey2017-01-111-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: ec_writev_start calls ec_make_internal_fop_xdata to set "yes" in xdata before ec_readv (an internal fop) is called for head and tail. Second call to this function is overwriting the previous allocated dict_t to "xdata", which results in memory leak. Solution: In ec_make_internal_fop_xdata, check if *xdata is NULL or not to avoid overwriting *xdata. >Change-Id: I49b83923e11aff9b92d002e86424c0c2e1f5f74f >BUG: 1400818 >Signed-off-by: Ashish Pandey <aspandey@redhat.com> >Reviewed-on: http://review.gluster.org/16007 >Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> >Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Change-Id: I49b83923e11aff9b92d002e86424c0c2e1f5f74f BUG: 1400833 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/16006 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* dht/md-cache: Filter invalidate if the file is made a linkto filePoornima G2017-01-041-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/15789 Upcall as a part of setattr, sends an invalidation and the invalidation carries the resulting stat value. When a file is converted to linkto files, even then an invalidation is set and as a result the mountpoint shows the sticky bit in the stat of the file. eg: ---------T. 945 root root 0 Nov 8 10:14 hardlink.999 Fix: When dht recieves a notification of sticky bit change, it updates the flag, to indicate md-cache to send the subsequent lookup. >Reviewed-on: http://review.gluster.org/15789 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Niels de Vos <ndevos@redhat.com> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Susant Palai <spalai@redhat.com> >Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> >(cherry picked from commit 4536f7bdf16f8286d67598eda9a46c029f0c0bf4) Change-Id: Ic2fd7a5b196db0754f9b97072e644e6bf69da606 BUG: 1401376 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/16022 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/dht: Fix memory corruption while accessing regex stored inRaghavendra G2017-01-033-45/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | private If reconfigure is executed parallely (or concurrently with dht_init), there are races that can corrupt memory. One such race is modification of regexes stored in conf (conf->rsync_regex_valid and conf->extra_regex_valid) through dht_init_regex. With change [1], reconfigure codepath can get executed parallely (with itself or with dht_init) and this fix is needed. Also, a reconfigure can race with any thread doing dht_layout_search, resulting in dht_layout_search accessing regex freed up by reconfigure (like in bz 1399134). [1] http://review.gluster.org/15046 >Change-Id: I039422a65374cf0ccbe0073441f0e8c442ebf830 >BUG: 1399134 >Signed-off-by: Raghavendra G <rgowdapp@redhat.com> >Reviewed-on: http://review.gluster.org/15945 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: N Balachandran <nbalacha@redhat.com> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Change-Id: I039422a65374cf0ccbe0073441f0e8c442ebf830 BUG: 1399422 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit 64451d0f25e7cc7aafc1b6589122648281e4310a) Reviewed-on: http://review.gluster.org/15949 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/afr: Fix missing name indices due to EEXIST errorKrutika Dhananjay2016-12-281-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/16286 PROBLEM: Consider a volume with granular-entry-heal and sharding enabled. When a replica is down and a shard is created as part of a write, the name index is correctly created under indices/entry-changes/<dot-shard-gfid>. Now when a read on the same region triggers another MKNOD, the fop fails on the online bricks with EEXIST. By virtue of this being a symmetric error, the failed_subvols[] array is reset to all zeroes. Because of this, before post-op, the GF_XATTROP_ENTRY_OUT_KEY will be set, causing the name index, which was created in the previous MKNOD operation, to be wrongly deleted in THIS MKNOD operation. FIX: The ideal fix would have been for a transaction to delete the name index ONLY if it knows it is the one that created the index in the first place. This would involve gathering information as to whether THIS xattrop created the index from individual bricks, aggregating their responses and based on the various posisble combinations of responses, decide whether to delete the index or not. This is rather complex. Simpler fix would be for post-op to examine local->op_ret in the event of no failed_subvols to figure out whether to delete the name index or not. This can occasionally lead to creation of stale name indices but they won't be affecting the IO path or mess with pending changelogs in any way and self-heal in its crawl of "entry-changes" directory would take care to delete such indices. Change-Id: I8c5c08b7a208e840b5970fe5699dabdaf751a150 BUG: 1408785 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/16294 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* afr: use accused matrix instead of readable matrix for deciding healsRavishankar N2016-12-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: afr_replies_interpret() used the 'readable' matrix to trigger client side heals after inode refresh. But for arbiter, readable is always zero. So when `dd` is run with a data brick down, spurious data heals are are triggered. These heals open an fd, causing eager lock to be disabled (open fd count >1) in afr transactions, leading to extra FXATTROPS Fix: Use the accused matrix (derived from interpreting the afr pending xattrs) to decide whether we can start heal or not. > Reviewed-on: http://review.gluster.org/16277 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> (cherry picked from commit 5a7c86e578f5bbd793126a035c30e6b052177a9f) Change-Id: Ibbd56c9aed6026de6ec42422e60293702aaf55f9 BUG: 1408770 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/16290 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* afr: Ignore event_generation checks post inode refresh for write txnsRavishankar N2016-12-263-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | Before http://review.gluster.org/#/c/15673/, after inode refresh, we failed read txns in case of EIO or event_generation being zero. For write transactions, the check was only for EIO. 15673 re-factored the code to fail both read and write when event_generation=0. This seems to have caused a regression as explained in the BZ. This patch restores that behaviour in afr_txn_refresh_done(). > Reviewed-on: http://review.gluster.org/16205 > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> (cherry picked from commit 7ee998b9041d594d93a4e2ef369892c185e80def) Change-Id: Ib8e116506badce6f58b55827dbe403d95069d744 BUG: 1408171 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/16271 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/ec: Healing should not start if only "data" bricks are UPAshish Pandey2016-12-211-11/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: In a disperse volume with "K+R" configuration, where "K" is the number of data bricks and "R" is the number of redundancy bricks (Total number of bricks, N = K+R), if only K bricks are UP, we should NOT start heal process. This is because the bricks, which are supposed to be healed, are not UP. This will unnecessary eat up the resources. Solution: Check for the number of xl_up_count and only if it is greater than ec->fragments (number of data bricks), start heal process. >Change-Id: I8579f39cfb47b65ff0f76e623b048bd67b15473b >BUG: 1399072 >Signed-off-by: Ashish Pandey <aspandey@redhat.com> >Reviewed-on: http://review.gluster.org/15937 >Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> >Smoke: Gluster Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Signed-off-by: Ashish Pandey <aspandey@redhat.com> Change-Id: I8579f39cfb47b65ff0f76e623b048bd67b15473b BUG: 1399989 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/15974 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* afr: fix bug in passing child index in afr_inode_write_fillRavishankar N2016-12-211-4/+3
| | | | | | | | | | | | | | Backport of: http://review.gluster.org/16029 Change-Id: I2589aab4b781cc7b50c93f33c8dc51036dbf82f8 BUG: 1402212 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/16100 Tested-by: Krutika Dhananjay <kdhananj@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* afr, client: More mem-leak fixes in COMPOUND fop cbkKrutika Dhananjay2016-12-203-20/+22
| | | | | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/16020 Bugs found and fixed: 1. Use correct subvolume index in pre-op-writev compound cbk 2. Prevent use-after-free of local->compound_args members in compound fops cbk in protocol/client 3. Fix xdata and xattr leaks in client_process_response 4. Fix possible leak of xdata in client_pre_writev() in test mode. 5. Free req->compound_req_array.compound_req_array_val as well after freeing its members 6. Free tmp_rsp->flock.lk_owner.lk_owner_val in LK fop. Change-Id: I40f576b48625d65fcbd7a11181eeff37f9e1e011 BUG: 1402212 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/16046 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/ec: Fix lk-owner set race in ec_unlockPranith Kumar K2016-12-146-34/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Rename does two locks. There is a case where when it tries to unlock it sends xattrop of the directory with new version, callback of these two xattrops can be picked up by two separate epoll threads. Both of them will try to set the lk-owner for unlock in parallel on the same frame so one of these unlocks will fail because the lk-owner doesn't match. Fix: Specify the lk-owner which will be set on inodelk frame which will not be over written by any other thread/operation. >BUG: 1402710 >Change-Id: I666ffc931440dc5253d72df666efe0ef1d73f99a >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/16074 >Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> BUG: 1404572 Change-Id: Iff4f0c1364e6533f3c07f192138bcd321789b4cd Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/16130 Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* afr,dht,ec: Replace GF_EVENT_CHILD_MODIFIED with event SOME_DESCENDENT_DOWN/UPPoornima G2016-12-133-27/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/15764/ Currently these are few events related to child_up/down: GF_EVENT_CHILD_UP : Issued when any of the protocol client connects. GF_EVENT_CHILD_MODIFIED : Issued by afr/dht/ec GF_EVENT_CHILD_DOWN : Issued when any of the protocol client disconnects. These events get modified at the dht/afr/ec layers. Here is a brief on the same. DHT: - All the subvolumes reported once, and atleast one child came up, then GF_EVENT_CHILD_UP is issued - connect GF_EVENT_CHILD_UP is issued - disconnect GF_EVENT_CHILD_MODIFIED is issued - All the subvolumes disconnected, GF_EVENT_CHILD_DOWN is issued AFR: - First subvolume came up, then GF_EVENT_CHILD_UP is issued - Subsequent subvolumes coming up, results in GF_EVENT_CHILD_MODIFIED - Any of the subvolumes go down, then GF_EVENT_SOME_CHILD_DOWN is issued - Last up subvolume goes down, then GF_EVENT_CHILD_DOWN is issued Until the patch [1] introduced GF_EVENT_SOME_CHILD_UP, GF_EVENT_CHILD_MODIFIED was issued by afr/dht when any of the subvolumes go up or down. Now with md-cache changes, there is a necessity to differentiate between child up and down. Hence, introducing GF_EVENT_SOME_DESCENDENT_DOWN/UP and getting rid of GF_EVENT_CHILD_MODIFIED. [1] http://review.gluster.org/12573 >Reviewed-on: http://review.gluster.org/15764 >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Smoke: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: N Balachandran <nbalacha@redhat.com> >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> >Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> (cherry picked from commit f7ab6c45963fa0da68acedfb14281cd2456abc68) Change-Id: I704140b6598f7ec705493251d2dbc4191c965a58 BUG: 1396880 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15890 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* afr: Fix the EIO that can occur in afr_inode_refresh as a resultPoornima G2016-12-134-18/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | of cache invalidation(upcall). Issue: ------ When a cache invalidation is recieved as a result of changing pending xattr, the read_subvol is reset. Consider the below chain of execution: CHILD_DOWN ... afr_readv ... afr_inode_refresh ... afr_inode_read_subvol_reset <- as a result of pending xattr set by some other client GF_EVENT_UPCALL will be sent afr_refresh_done -> this results in an EIO, as the read subvol was reset by the end of the afr_inode_refresh Solution: --------- When GF_EVENT_UPCALL is recieved, instead of resetting read_subvol, set a variable need_refresh in inode_ctx, the next time some one starts a txn, along with event gen, need_rrefresh also needs to be checked. >Reviewed-on: http://review.gluster.org/15892 >Reviewed-by: Ravishankar N <ravishankar@redhat.com> >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> >Signed-off-by: Poornima G <pgurusid@redhat.com> Change-Id: Ifda21a7a8039b8874215e1afa4bdf20f7d991b58 BUG: 1399450 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15959 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* afr: allow I/O when favorite-child-policy is enabledRavishankar N2016-12-128-61/+387
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Currently, I/O on a split-brained file fails even when the favorite-child-policy is set until the self-heal is complete. Fix: If a valid 'source' is found using the set favorite-child-policy,inspect and reset the afr pending xattrs on the 'sinks' (inside appropriate locks),refresh the inode and then proceed with the read or write transaction. The resetting itself happens in the self-heal code and hence can also happen in the client side background-heal or by the shd's index-heal in addition to the txn code path explained above. When it happens in via heal, we also add checks in undo-pending to not reset the sink xattrs again. > Reviewed-on: http://review.gluster.org/15673 > Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Change-Id: Ic8c1317720cb26bd114b6fe6af4e58c73b864626 BUG: 1403121 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reported-by: Simon Turcotte-Langevin <simon.turcotte-langevin@ubisoft.com> Reviewed-on: http://review.gluster.org/16088 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/afr: Fix per-txn optimistic changelog initialisationKrutika Dhananjay2016-12-122-29/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/16075 Incorrect initialisation of local->optimistic_change_log was leading to skipped pre-op and post-op even when a brick didn't participate in the txn because it was down. The result - missing granular name index resulting in some entries never getting healed. FIX: Initialise local->optimistic_change_log just before pre-op. Also fixed granular entry heal to create the granular name index in pre-op as opposed to post-op. This is to prevent loss of granular information when during an entry txn, the good (src) brick goes offline before the post-op is done. This would cause self-heal to do conservative merge (since dirty xattr is the only information available), which when granular-entry-heal is enabled, expects granular indices, the lack of which can lead to loss of data in the worst case. Change-Id: I213d98ca9b3c4604b095478bf427fa69c04a7d64 BUG: 1403743 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/16106 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/afr: Serialize conflicting locks on all subvolsPranith Kumar K2016-12-082-33/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: 1) When a blocking lock is issued and the parallel lock phase fails on all subvolumes with EAGAIN, it is not switching to serialized locking phase. 2) When quorum is enabled and locks fail partially it is better to give errno returned by brick rather than the default quorum errno. Fix: Handled this error case and changed op_errno to reflect the actual errno in case of quorum error. >BUG: 1369077 >Change-Id: Ifac2e4a13686e9fde601873012700966d56a7f31 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/15984 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Ravishankar N <ravishankar@redhat.com> BUG: 1402482 Change-Id: Ib1ca577bfa52ae537ab7186d10bfa2ae755813e3 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/16057 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/afr: Fix bugs in [f]inodelk/[f]entrylkPranith Kumar K2016-12-075-336/+381
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problems: 1) Inodelk is not taking quorum into account 2) finodelk, [f]entrylk are not implemented correctly 3) By default afr doesn't go for non-blocking parallel locks. Fix: Implemented a common framework which can be used by [f]inodelk/[f]entrylk. Used quorum for the same. >Change-Id: I239f13875a065298630d266941df10cfa3addc85 >BUG: 1369077 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/15802 >Tested-by: Krutika Dhananjay <kdhananj@redhat.com> >Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> >Smoke: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Ravishankar N <ravishankar@redhat.com> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> BUG: 1402482 Change-Id: I0c5fed6ca87c6432bb20d00f76cdf5c328a52a85 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/16056 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com>
* cluster/dht: Check for null inodeN Balachandran2016-12-071-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | Check for NULL inode before attempting to set dht inode ctx. > Change-Id: I7693c18445f138221d8417df5e95b118cedb818a > BUG: 1395261 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/15847 > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> (cherry picked from commit 8313d53accaa22feb14d284fb91245be0a32e16e) Change-Id: Id8c7bfe181bb40a02cd49b0f5fc3b45cabf5afa6 BUG: 1395517 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/15851 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* dht/rename : Incase of failure remove linkto file properlyJiffin Tony Thottan2016-12-062-2/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Generally linkto file is created using root user. Consider following case, a user is trying to rename a file which he is not permitted. So the rename fails with EACESS and when rename tries to cleanup the linkto file, it fails. The above issue happens when rename/00.t test executed on nfs-ganesha clients : Steps executed in script * create a file "abc" using root * rename the file "abc" to "xyz" using a non root user, it fails with EACESS * delete "abc" * create directory "abc" using root * again try ot rename "abc" to "xyz" using non root user, test hungs here which slowly leds to OOM kill of ganesha process RCA put forwarded by Du for OOM kill of ganesha Note that when we hit this bug, we've a scenario of a dentry being present as: * a linkto file on one subvol * a directory on rest of subvols When a lookup happens on the dentry in such a scenario, the control flow goes into an infinite loop of: dht_lookup_everywhere dht_lookup_everywhere_cbk dht_lookup_unlink_cbk dht_lookup_everywhere_done dht_lookup_directory (as local->dir_count > 0) dht_lookup_dir_cbk (sets to local->need_selfheal = 1 as the entry is a linkto file on one of the subvol) dht_lookup_everywhere (as need_selfheal = 1). This infinite loop can cause increased consumption of memory due to: 1) dht_lookup_directory assigns a new layout to local->layout unconditionally 2) Most of the functions in this loop do a stack_wind of various fops. This results in growing of call stack (note that call-stack is destroyed only after lookup response is received by fuse - which never happens in this case) Thanks Du for root causing the oom kill and Sushant for suggesting the fix Upstream reference : >Change-Id: I1e16bc14aa685542afbd21188426ecb61fd2689d >BUG: 1397052 >Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> >Reviewed-on: http://review.gluster.org/15894 >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Smoke: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> >(cherry picked from commit 57d59f4be205ae0c7888758366dc0049bdcfe449) Change-Id: I1e16bc14aa685542afbd21188426ecb61fd2689d BUG: 1401023 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/16014 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* selfheal: fix memory leak on client side healing queueMateusz Slupny2016-12-043-3/+8
| | | | | | | | | | | | | | | | | | | | | | > Reviewed-on: http://review.gluster.org/15968 > Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Reviewed-by: Ravishankar N <ravishankar@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit fb95eb4da6f4fc0b9c69e3b159a2214fe47e6d1d) Change-Id: I2beaba829710565a3246f7449a5cd21755cf5f7d BUG: 1400926 Signed-off-by: Mateusz Slupny <mateusz.slupny@appeartv.com> Reviewed-on: http://review.gluster.org/16011 Tested-by: Ravishankar N <ravishankar@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* md-cache, afr: Reduce the window of stale readPoornima G2016-12-012-2/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Consider a replica setup, where one mount writes data to a file and the other mount reads the file. In afr, read operations are not transaction based, a brick(read subvolume) is chosen as a part of lookup or other operations, read is always wound only to the read subvolume, even if there was write from a different client that failed on this brick. This stale read continues until there is a lookup or any write operation from the mount point. Currently, this is not a major issue, as a lookup is issued before every read and it will switch the read subvolume to a correct one. But with the plan of increasing md-cache timeout to 600s, the stale read problem will be more pronounced, i.e. stale read can continue for 600s(or more if cascaded with readdirp), as there will be no lookups. Solution: Afr doesn't have any built-in solution for stale read(without affecting the performance). The solution that came up, was to use upcall. When a file on any brick is marked bad for the first time, upcall sends a notification to all the clients that had recently accessed the file. The solution has 2 parts: - Identifying when a file is marked bad, on any of the bricks, for the first time - Client side actions on recieving the notifications Identifying when a file is marked bad on any of the bricks for the first time: ----------------------------------------------------------------------------- The idea is to track xattrop in upcall. xattrop currently comes with 2 afr xattrs - afr dirty bit and afr pending xattrs. Dirty xattr is set to 1 before every write, and is unset if write succeeds. In certain scenarios, dirty xattr can be 0 and still the file could be bad copy. Hence do not track dirty xattr. Pending xattr is set on the good copy, indicating the other bricks that have bad copy. It is still not as simple as, notifying when any of the pending xattrs change. It could lead to flood of notifcations, in case the other brick is completely down or consistantly failing. Hence it is important to notify only once, the first time a good copy is marked bad. Client side actions on recieving pending xattr change, notification: -------------------------------------------------------------------- md-cache will invalidate the cache of that file, so that further lookup is passed down to afr and hence update the read subvolume. Invalidating only in md-cache is not enough, consider the folling oder of opertaions: - pending xattr invalidation - invalidate md-cache - readdirp on the bad read subvolume - fill md-cache - lookup (served from md-cache) - read - wound to the old read subvol. Hence, along with invalidating md-cache, it is very important to reset the read subvolume for that file, in afr. Design Credit: Anuradha Talur, Ravishankar N 1. xattrop doesn't carry info saying post op/pre op. 2. Pre xattrop will have 0 value for all pending xattrs, the cbk of pre xattrop carries the on-disk xattr value. Non zero indicated healing is required. 3. Post xattrop will have non zero value for any of the pending xattrs, if the fop failed on any of the bricks. >Reviewed-on: http://review.gluster.org/15398 >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> >Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Signed-off-by: Poornima G <pgurusid@redhat.com> Change-Id: I469cbc111714c433984fe1c922be2ef113c25804 BUG: 1399450 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15958 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* afr: Implement IPC fopPoornima G2016-12-012-0/+121
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/15378 Currently ipc() is not implemented in afr. md-cache and upcall uses ipc to register the list of xattrs, [1] for more details. For the ipc op GF_IPC_TARGET_UPCALL, it has to be wound to all the replica subvolumes. ipc() is failed when any of the subvolumes fails with other than ENOTCONN or all of the subvolumes are down. [1] http://review.gluster.org/#/c/15002/ >Reviewed-on: http://review.gluster.org/15378 >Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> >Signed-off-by: Poornima G <pgurusid@redhat.com> Change-Id: I0f651330eafda64e4d922043fe53bd0014536247 BUG: 1399450 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15957 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* ec: Implement ipc fopPoornima G2016-11-304-1/+150
| | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/15387/ The ipc will be wound to all the bricks, but for it to be successfull, the fop should succeed on minimum number of bricks. >Reviewed-on: http://review.gluster.org/15387 >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Smoke: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Ashish Pandey <aspandey@redhat.com> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> >(cherry picked from commit 359b72a57b7c92fc2a11236ac05f5d740db2f540) Change-Id: I3f8cb6a349e87bafd0773583def9d4e3765aa140 BUG: 1399450 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15956 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/dht: A hard link is lost during rebalance + lookupMohit Agrawal2016-11-291-30/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | Problem: A hard link is lost during rebalance + lookup.Rebalance skip files if file has hardlink.In dht_migrate_file __is_file_migratable () function checks if a file has hardlink, if yes file is not migrated but if link is created after call this function then link will lost. Solution: Call __check_file_has_hardlink to check hardlink existence after (S+T) bits in migration process ,if file has hardlink then skip the file for migrate rebalance process. > BUG: 1396048 > Change-Id: Ia53c07ef42f1128c2eedf959a757e8df517b9d12 > Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> > (cherry picked from commit 4b8ccbed28837bd78894cb5ce3cf15bc8f364a93) BUG: 1399430 Change-Id: Idc869f2cf2355dacf54c36008840092b8e77acb9 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Reviewed-on: http://review.gluster.org/15955 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/tier: handle fast demotionsMilind Changire2016-11-284-16/+85
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Demote files on priority if hi-watermark has been breached and continue to demote until the watermark drops below hi-watermark. Monitor watermark more frequently. Trigger demotion as soon as hi-watermark is breached. Add cluster.tier-query-limit option to limit number of files returned from the database query for every iteration of tier_migrate_using_query_file(). If watermark hasn't dropped below hi-watermark during the first iteration, the next iteration will be triggered approximately 1 second after tier_demote() returns to the main tiering loop. Update changetimerecorder xlator to handle query for emergency demote mode. Add tier-ctr-interface.h: Move tier and ctr interface specific macros and struct definition from libglusterfs/src/gfdb/gfdb_data_store.h to new header libglusterfs/src/tier-ctr-interface.h > Reviewed-on: http://review.gluster.org/15158 > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> (cherry picked from commit 460016428cf27484c333227f534c2e2f73a37fb1) Change-Id: If56af78c6c81d37529b9b6e65ae606ba5c99a811 BUG: 1394482 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: http://review.gluster.org/15835 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* cluster/afr: Fix deadlock due to compound fopsKrutika Dhananjay2016-11-271-14/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/15929 When an afr data transaction is eligible for using eager-lock, this information is represented in local->transaction.eager_lock_on. However, if non-blocking inodelk attempt (which is a full lock) fails, AFR falls back to blocking locks which are range locks. At this point, local->transaction.eager_lock[] per brick is reset but local->transaction.eager_lock_on is still true. When AFR decides to compound post-op and unlock, it is after confirming that the transaction did not use eager lock (well, except for a small bug where local->transaction.locks_acquired[] is not considered). But within afr_post_op_unlock_do(), afr again incorrectly sets the lock range to full-lock based on local->transaction.eager_lock_on value. This is a bug and can lead to deadlock since the locks acquired were range locks and a full unlock is being sent leading to unlock failure and thereby every other lock request (be it from SHD or other clients or glfsheal) getting blocked forever and the user perceives a hang. FIX: Unconditionally rely on the range locks in inodelk object for unlocking when using compounded post-op + unlock. Big thanks to Pranith for helping with the debugging. Change-Id: I2edcc13ac00bc1ba2e3558891ba98d0cd410b47a BUG: 1398888 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15932 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/afr: Handle rpc errors, xdr failures etc with proper NULL checksKrutika Dhananjay2016-11-251-6/+17
| | | | | | | | | | | | | Backport of: http://review.gluster.org/15924 Change-Id: Ie2a181e113ba24abca8cd4fd6bb722d048f014a8 BUG: 1398499 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15925 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* events: Add FMT_WARN for gf_eventPranith Kumar K2016-11-232-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | Raghavendra G found that posix is trying to print %s but passing an int when HEALTH_CHECK fails in posix. These are the kind of bugs that should be caught at compilation itself. Also fixed the problematic gf_event() callers. >BUG: 1386097 >Change-Id: Id7bd6d9a9690237cec3ca1aefa2aac085e8a1270 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/15671 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Atin Mukherjee <amukherj@redhat.com> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> BUG: 1396778 Change-Id: Idf8e1f427578d02dccd2a8165884a5cf086eb07e Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15884 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* cluster/dht Set layout after mkdir as rootN Balachandran2016-11-231-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | DHT does not set the layout for newly created directories as root. This causes EPERM failures when a non-root user with insufficient permissions creates directories. credit: srangana@redhat.com for RCA > Change-Id: Ia646e41665ce172c43c5f01d2707455e8eb374ed > BUG: 1392772 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/15794 > Reviewed-by: Susant Palai <spalai@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> (cherry picked from commit 3e405b546e8b9fe15ae477613474e9cd2d2df4e7) Change-Id: Ib792d4018e528b5805ec7cff4988fada17fff0da BUG: 1397252 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/15898 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/afr: When failing fop due to lack of quorum, also log error stringKrutika Dhananjay2016-11-151-11/+12
| | | | | | | | | | | | | Backport of: http://review.gluster.org/#/c/15800/ Change-Id: I11a1321dfd4c49a326886bc92b0f9c08556b4a66 BUG: 1393629 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15812 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* features/shard: Fill loc.pargfid too for named lookups on individual shardsKrutika Dhananjay2016-11-081-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/#/c/15788/ On a sharded volume when a brick is replaced while IO is going on, named lookup on individual shards as part of read/write was failing with ENOENT on the replaced brick, and as a result AFR initiated name heal in lookup callback. But since pargfid was empty (which is what this patch attempts to fix), the resolution of the shards by protocol/server used to fail and the following pattern of logs was seen: Brick-logs: [2016-11-08 07:41:49.387127] W [MSGID: 115009] [server-resolve.c:566:server_resolve] 0-rep-server: no resolution type for (null) (LOOKUP) [2016-11-08 07:41:49.387157] E [MSGID: 115050] [server-rpc-fops.c:156:server_lookup_cbk] 0-rep-server: 91833: LOOKUP(null) (00000000-0000-0000-0000-000000000000/16d47463-ece5-4b33-9c93-470be918c0f6.82) ==> (Invalid argument) [Invalid argument] Client-logs: [2016-11-08 07:41:27.497687] W [MSGID: 114031] [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-0: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument] [2016-11-08 07:41:27.497755] W [MSGID: 114031] [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-1: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument] [2016-11-08 07:41:27.498500] W [MSGID: 114031] [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-2: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument] [2016-11-08 07:41:27.499680] E [MSGID: 133010] Also, this patch makes AFR by itself choose a non-NULL pargfid even if its ancestors fail to initialize all pargfid placeholders. Change-Id: I1ad027b7b2ccd6b5c0162726b63b06d261b108e8 BUG: 1392844 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15795 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* afr,ec: Heal device files with correct major, minor numbersPranith Kumar K2016-10-262-2/+4
| | | | | | | | | | | | | | | | | | | | | | | Thanks a lot to xiaoping.wu@nokia.com from Nokia for the bug and the fix. >BUG: 1384297 >Change-Id: Ie443237e85d34633b5dd30f85eaa2ac34e45754c >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/15728 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Change-Id: Ie7977caf7c98c91fca64752c56731c37ad27df4d BUG: 1388912 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15734 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* cluster/dht: Incorrect volname in rebalance eventsN Balachandran2016-10-251-6/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | The rebalance event code was using strtok to parse the volume name which is incorrect. Reworked the code to get the correct volume name using strstr. > Change-Id: Ib5f3305a34e6bf1ecfef677d87c5aff96bdeb0e6 > BUG: 1388010 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/15712 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit f4efbf0e3092bd2181f62be1e1f30f202678c866) Change-Id: Ia22d6a8ce23e9dcab8281b4995277cbb3a196fda BUG: 1388563 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/15725 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/afr: Prevent dict_set() on NULL dictPranith Kumar K2016-10-151-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | In afr lookup when NULL dict is received in lookup, afr is supposed to set all the xattrs it requires in a new dict it creates, but for 'link-count' it is trying to set to the dict that is passed in lookup which can be NULL sometimes. This is leading to error logs. Fixed the same in this patch. >BUG: 1385104 >Change-Id: I679af89cfc410cbc35557ae0691763a05eb5ed0e >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/15646 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Ravishankar N <ravishankar@redhat.com> BUG: 1385236 Change-Id: I802e74e7ad24e183b6653101ad7bf5ab0bf6e55b Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15650 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com>
* afr: Take full locks in arbiter only for data transactionsRavishankar N2016-10-151-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Sharding exposed a bug in arbiter config. where `dd` throughput was extremely slow. Shard xlator was sending a fxattrop to update the file size immediately after a writev. Arbiter was incorrectly over-riding the LLONGMAX-1 start offset (for metadata domain locks) for this fxattrop, causing the inodelk to be taken on the data domain. And since the preceeding writev hadn't released the lock (afr does a 'lazy' unlock if write succeeds on all bricks), this degraded to a blocking lock causing extra lock/unlock calls and delays. Fix: Modify flock.l_len and flock.l_start to take full locks only for data transactions. > Reviewed-on: http://review.gluster.org/15641 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> (cherry picked from commit 3a97486d7f9d0db51abcb13dcd3bc9db935e3a60) Change-Id: I906895da2f2d16813607e6c906cb4defb21d7c3b BUG: 1385224 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reported-by: Max Raba <max.raba@comsysto.com> Reviewed-on: http://review.gluster.org/15648 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>
* cluster/ec: set/unset dirty flag for data/metadata updateAshish Pandey2016-10-133-124/+166
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, for all the update operations, metadata or data, we set the dirty flag at the end of the operation only if a brick is down. This leads to delay in healing and in some cases not at all. In this patch we set (+1) the dirty flag at the start of the metadata or data update operations and after successfull completion of the fop, we unset (-1) it again. >Change-Id: Ide5668bdec7b937a61c5c840cdc79a967598e1e9 >BUG: 1316873 >Signed-off-by: Ashish Pandey <aspandey@redhat.com> >Reviewed-on: http://review.gluster.org/13733 >Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Change-Id: Ide5668bdec7b937a61c5c840cdc79a967598e1e9 BUG: 1377570 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/15534 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/ec: Implement heal info with lockAshish Pandey2016-10-135-79/+261
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Currently heal info command prints all the files/directories if the index for the file/directory is present in .glusterfs/indices folder. After implementing patch http://review.gluster.org/#/c/13733/ indices of the file which is going through update fop will also be present in .glusterfs/indices even if the fop is successful on all the brick. At this time if heal info command is being used, it will also display this file which is actually healthy and does not require any heal. Solution: Take lock on a file corresponding to the indices and inspect xattrs to decide if the file needs heal or not. >Reviewed-on: http://review.gluster.org/15543 >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> >Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Smoke: Gluster Build System <jenkins@build.gluster.org> >Signed-off-by: Ashish Pandey <aspandey@redhat.com> Change-Id: I6361e2813ece369be12d02e74816df4eddb81cfa BUG: 1383913 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/15627 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* afr: Ignore gluster internal (virtual) xattrs in metadata heal checkRavishankar N2016-09-291-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: In arbiter configuration, posix-xlator in the arbiter brick always sets the GF_CONTENT_KEY in the response dict with a value 0. If the file size on the data bricks is more than quick-read's max-file-size (64kb default), those bricks don't set the key. Because of this difference in the no. of dict elements, afr triggers metadata heal in lookup code path, in turn leading to extra lookups+inodelks. Fix: Changed afr dict comparison logic to ignore all virtual xattrs and the on-disk ones that we should not be healing. Also removed is_virtual_xattr() function. The original callers to this function (upcall) don't seem to need it anymore. > Reviewed-on: http://review.gluster.org/15548 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> (cherry picked from commit 5afc6aba906a21aee19c2f1baaa7d9eb762ae0ac) Change-Id: I05730bdd39d8fb0b9a49a5fc9c0bb01f0d3bb308 BUG: 1379528 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/15577 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* afr: Modifications to afr eventsRavishankar N2016-09-245-8/+14
| | | | | | | | | | | | | | | | | | | | | | Modified afr event message to add a 'type' key as detailed in the BZ. Also added events for data and metadata split-brain. > Reviewed-on: http://review.gluster.org/15550 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> (cherry picked from commit f6a3c541941df6fd19ef57185aca5c4bcec2dec3) Change-Id: I8156674b4b6a501499fc10fd68e05115fdaef3e4 BUG: 1379028 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/15565 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* dht/events: Added rebalance eventsN Balachandran2016-09-171-6/+50
| | | | | | | | | | | | | | | | | | | | | | | | The rebalance process will now send an event when it is complete. Also fixed a problem where the run-time was not always set causing spurious rebalance failure events to be sent. > Change-Id: Ib445171c78c9560940022bca20c887d31a9bb1ca > BUG: 1371874 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/15501 > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> (cherry picked from commit e872917dbf4563703b8f396d7d30f8168ed646cd) Change-Id: Ia24038c04548a98aac899df10414585b988caa76 BUG: 1376477 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/15520 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* cluster/ec: Add support for hardware accelerationXavier Hernandez2016-09-0938-12302/+21106
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements functionalities for fast encoding/decoding using hardware support. Currently optimized x86_64, SSE and AVX is added. Additionally this patch implements a caching mecanism for inverse matrices to reduce computation time, as well as a new method for computing the inverse that takes quadratic time instead of cubic. Finally some unnecessary memory copies have been eliminated to further increase performance. >Change-Id: I26c75f26fb4201bd22b51335448ea4357235065a >BUG: 1289922 >Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> >Reviewed-on: http://review.gluster.org/12837 >Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> BUG: 1374841 Change-Id: I83731663922ed11ca84536deab5737463416e1e0 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15455 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/tier: add tiering eventsMilind Changire2016-09-084-0/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add events for: * tier attach and detach * tier pause and resume * tier rising and dropping hi and lo watermarks Update eventskeygen.py with tiering events. Update cli help with: * attach: add optional force argument * detach: make force available as non-optional argument on its own > Reviewed-on: http://review.gluster.org/15232 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Tested-by: Dan Lambright <dlambrig@redhat.com> (cherry picked from commit cd23242b8953040b231f402c334f16520ac8029f) Change-Id: I43990d3a8742151a4a7889bafa19cb572fe661bd BUG: 1374324 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: http://review.gluster.org/15431 Tested-by: Atin Mukherjee <amukherj@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* cluster/ec: Add events for EC translatorAshish Pandey2016-09-081-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch will generates events in following cases which will be consumed by new event framework. Consider an EC volume with (K+M) configuration K = Data bricks M = Redundancy bricks 1- EVENT_EC_MIN_BRICKS_NOT_UP - When minimum "K" number of bricks, required for any ec fop, are not up. 2- EVENT_EC_MIN_BRICKS_UP When minimum "K" number of bricks, required for any ec fop, are up. >Change-Id: I0414b8968c39740a171e5aa14b087afd524d574f >BUG: 1371470 >Signed-off-by: Ashish Pandey <aspandey@redhat.com> >Reviewed-on: http://review.gluster.org/15348 >Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Smoke: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Change-Id: I0414b8968c39740a171e5aa14b087afd524d574f BUG: 1374167 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/15425 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* afr: add replication eventsRavishankar N2016-09-074-16/+82
| | | | | | | | | | | | | | | | | | | | | | | | | | | Added the following events for the eventing framework: "EVENT_AFR_QUORUM_MET", --> Sent when quorum is met. "EVENT_AFR_QUORUM_FAIL" -->Sent when quorum is lost. "EVENT_AFR_SUBVOL_UP" -->Sent when afr witnesses the first up subvolume. "EVENT_AFR_SUBVOLS_DOWN"-->Sent when all children of an afr subvol are down. "EVENT_AFR_SPLIT_BRAIN" -->Sent when self-heal detects split-brain in heal path (not read/write path). Change-Id: I937c61ca1ce78b5922ade73c7bfa3051df59c513 BUG: 1373743 Signed-off-by: Ravishankar N <ravishankar@redhat.com> > Reviewed-on: http://review.gluster.org/15349 > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit 86e312d872a957afff1e508f13f62c4102dba22d) Reviewed-on: http://review.gluster.org/15417 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* afr: Consume compound fops in afr transactionAnuradha Talur2016-09-067-64/+535
| | | | | | | | | | | | | | | Backport of: http://review.gluster.org/15014 cherry-picked from befae0b48729894282b5b8b174907e24779c3442 Change-Id: I62ff41aa80fe700df2dda8b10606e484ee1f382e BUG: 1372686 Signed-off-by: Anuradha Talur <atalur@redhat.com> Reviewed-on: http://review.gluster.org/15388 Tested-by: Krutika Dhananjay <kdhananj@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* dht, md-cache, upcall: Add invalidation of IATT when the layout changesPoornima G2016-08-301-1/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Issue: dht_layout is built as a part of lookup only. The layout can be modified by rebalance process. Since every IO fop is preceded by a lookup, there are very less issues of stale layout. But with enhancements of aggressive caching of stats in md-cache, the lookup will reduce and expose the stale layout issue often. Solution: Since stale layout is already an issue on dht, there is already a plan to fix this at the dht layer, but this fix is not currently planned for any release. Until this fix comes out, we can have a workaround where, the upcall will send a notification to md-cache when a layout xattr is changed. As a part of layout change notification the existing cache is invalidated and the next lookup will fetch the latest layout. This is not a foolproof solution as the window between the layout change and the next lookup(after invalidation of stat), where there will be stale layout. But until the final fix comes in, this reduces the stale layout window. Change-Id: Iacf871a38b35880c1fc0bc68fe7ce291265e71d4 BUG: 1369638 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15300 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* dht/tiering: fix unused variable warnings/errorsKaleb S. KEITHLEY2016-08-303-3/+0
| | | | | | | | | | | | | | | | | | | http://review.gluster.org/14085 fixes a/the "leak" - via the generated rpc/xdr headers - of pragmas that mask these warnings. However 14085 won't pass the smoke test until all the warnings are fixed. Change-Id: I367a737570dd7d2f6cc25f4bf4299d31bb6826aa BUG: 1369124 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/15242 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>