glusterfs.git -

	Commit message (Collapse)	Author	Age	Files	Lines
*	encryption: Move all xlators to experimentalexp	Vijay Bellur	2016-12-20	20	-9/+10
\| \| \| \| \| \| \| \| \|	rot-13 is never meant to be used in production. crypt requires a lot more attention before it can be ready for production. Hence moving both xlators to experimental. Change-Id: I6dce653c88e2ede109f3031ab9e8f318ce7d87cb Signed-off-by: Vijay Bellur <vbellur@redhat.com>
*	snapshot: Fix restore rollback to reassign snap volume ids to bricks	Avra Sengupta	2016-12-16	6	-1/+158
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added further checks to ensure we do not go beyond prevalidate when trying to restore a snapshot which has a nfs-gansha conf file, in a cluster when nfs-ganesha is not enabled The error message for the particular scenario is: "Snapshot(<snapname>) has a nfs-ganesha export conf file. cluster.enable-shared-storage and nfs-ganesha should be enabled before restoring this snapshot." Change-Id: I1b87e9907e0a5e162f26ef1ca89fe76e8da8610f BUG: 1404118 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/16116 Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	protocol/client: Fix potential mem-leaks	Krutika Dhananjay	2016-12-16	2	-4/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 93eaeb9c93be3232f24e840044d560f9f0e66f71 introduces leaks in INODELK callback where a dict is unserialized twice, leading to dict leaks. Change-Id: I219ccb2279f237ebc2e4fc366af4775a461929b8 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/16156 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	access_control : address O_TRUNC and O_APPEND flag properly in posix_acl_open	Jiffin Tony Thottan	2016-12-14	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In posix_acl_open, in switch value passed is (flag & O_ACCMODE). The value for O_ACCMODE is 0003, so the result will always be less than or equal to 3. But value for O_TRUNC is 01000 and O_APPEND is 02000, so it is not right to check it in switch case Change-Id: Ia17db80a6a5f681c35e08e062d384f33ef7e0354 BUG: 1387241 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/15688 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	cluster/ec: Fix lk-owner set race in ec_unlock	Pranith Kumar K	2016-12-13	6	-34/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Rename does two locks. There is a case where when it tries to unlock it sends xattrop of the directory with new version, callback of these two xattrops can be picked up by two separate epoll threads. Both of them will try to set the lk-owner for unlock in parallel on the same frame so one of these unlocks will fail because the lk-owner doesn't match. Fix: Specify the lk-owner which will be set on inodelk frame which will not be over written by any other thread/operation. BUG: 1402710 Change-Id: I666ffc931440dc5253d72df666efe0ef1d73f99a Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/16074 Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	upcall: Fix 'use after free' in a log message	Soumya Koduri	2016-12-13	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is chance of accessing freed pointer in a log message at TRACE level while cleaning up expired client entries. Change-Id: I06b4dad755df63978ab04ca52442bfd4600d139a BUG: 1404168 Reported-by: Ravishankar N <ravishankar@redhat.com> Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/16117 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
*	glusterd: Handle volinfo->refcnt properly during volume start command	Avra Sengupta	2016-12-12	1	-10/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While running the volume start command, the refcnt of the volume is incremented. At the end of the command, the refcnt should also be decremented. This is currently not the case. This patch, makes sure the refcnt is also decremented at the end of the volume start command. Change-Id: I017b5039be5948df41dde6bc89d2955d5d18971f BUG: 1403780 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/16108 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	cluster/afr: Fix per-txn optimistic changelog initialisation	Krutika Dhananjay	2016-12-12	2	-29/+69
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Incorrect initialisation of local->optimistic_change_log was leading to skipped pre-op and post-op even when a brick didn't participate in the txn because it was down. The result - missing granular name index resulting in some entries never getting healed. FIX: Initialise local->optimistic_change_log just before pre-op. Also fixed granular entry heal to create the granular name index in pre-op as opposed to post-op. This is to prevent loss of granular information when during an entry txn, the good (src) brick goes offline before the post-op is done. This would cause self-heal to do conservative merge (since dirty xattr is the only information available), which when granular-entry-heal is enabled, expects granular indices, the lack of which can lead to loss of data in the worst case. Change-Id: Ia3ad716d6fb1821555f02180e86e8711a79f958d BUG: 1402730 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/16075 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	performance/write-behind: Add more fops for checking dependency with	Raghavendra G	2016-12-12	1	-0/+317
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	cached writes Fops like readdirp, link, fallocate, discard, zerofill return iatt of files in their responses. This iatt can be cached by md-cache. Hence it is important that write-behind maintains relative ordering of these fops with cached writes. Failure to do so, can result in md-cache storing stale iatts and returning the same to applications. Change-Id: Icfe12ad807e42fe9e52a9f63e47ce63f511c6946 BUG: 1390050 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/15757 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	snapshot/ganesha: Copy export.conf, only if ganesha.enable is on.	Avra Sengupta	2016-12-12	1	-13/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Status of the volume being exported via nfs ganesha, should be checked by checking if ganesha.enable is set or not, rather than deciding based on the errno of the stat Change-Id: Iaff786d9f77a2de1322ce8ccb4b80954f84d3373 BUG: 1402828 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/16094 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
*	refcount: return pointer to the structure instead of a counter	Niels de Vos	2016-12-11	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are no real users of the counter. It was thought of a handy tool to track and debug refcounting, but it is not used at all. Some parts of the code would benefit from a pointer getting returned instead. BUG: 1399780 Change-Id: I97e52c48420fed61be942ea27ff4849b803eed12 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/15971 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	cluster/dht: Fix memory corruption while accessing regex stored in	Raghavendra G	2016-12-08	3	-45/+64
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	private If reconfigure is executed parallely (or concurrently with dht_init), there are races that can corrupt memory. One such race is modification of regexes stored in conf (conf->rsync_regex_valid and conf->extra_regex_valid) through dht_init_regex. With change [1], reconfigure codepath can get executed parallely (with itself or with dht_init) and this fix is needed. Also, a reconfigure can race with any thread doing dht_layout_search, resulting in dht_layout_search accessing regex freed up by reconfigure (like in bz 1399134). [1] http://review.gluster.org/15046 Change-Id: I039422a65374cf0ccbe0073441f0e8c442ebf830 BUG: 1399134 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/15945 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
*	glusterd/geo-rep: Fix glusterd crash	Kotresh HR	2016-12-08	2	-3/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: glusterd crashes when geo-rep mountbroker setup is created if the slave user length is more than 8 characters. Cause: _POSIX_LOGIN_NAME_MAX is used which is 9 including NULL byte. Analysis: While the man page says it sufficient for portability, but acutally it's not. Linux allows the creation of username upto 32 characters by default where the max length is 256. And NetBSD's max is 17. Linux: #getconf LOGIN_NAME_MAX 256 NetBSD: #getconf LOGIN_NAME_MAX 17 Fix: Use LOGIN_NAME_MAX instead of _POSIX_LOGIN_NAME_MAX Change-Id: I26b7230433ecbbed6e6914ed39221a478c0266a8 BUG: 1368138 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/16053 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Aravinda VK <avishwan@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	geo-rep: Do not restart workers when log-rsync-performance config change	Aravinda VK	2016-12-07	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Geo-rep restarts workers when any of the configurations changed. We don't need to restart workers if tunables like log-rsync-performance is modified. With this patch, Geo-rep workers will get new "log-rsync-performance" config automatically without restart. BUG: 1393678 Change-Id: I40ec253892ea7e70c727fa5d3c540a11e891897b Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/15816 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kotresh HR <khiremat@redhat.com>
*	cluster/afr: Remove backward compatibility for locks with v1	Pranith Kumar K	2016-12-07	2	-69/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When we have cascading locks with same lk-owner there is a possibility for a deadlock to happen. One example is as follows: self-heal takes a lock in data-domain for big name with 256 chars of "aaaa...a" and starts heal in a 3-way replication when brick-0 is offline and healing from brick-1 to brick-2 is in progress. So this lock is active on brick-1 and brick-2. Now brick-0 comes online and an operation wants to take full lock and the lock is granted at brick-0 and it is waiting for lock on brick-1. As part of entry healing it takes full locks on all the available bricks and then proceeds with healing the entry. Now this lock will start waiting on brick-0 because some other operation already has a granted lock on it. This leads to a deadlock. Operation is waiting for unlock on "aaaa..." by heal where as heal is waiting for the operation to unlock on brick-0. Initially I thought this is happening because healing is trying to take a lock on all the available bricks instead of just the bricks that are participating in heal. But later realized that same kind of deadlock can happen if a brick goes down after the heal starts but comes back before it completes. So the essential problem is the cascading locks with same lk-owner which were added for backward compatibility with afr-v1 which can be safely removed now that versions with afr-v1 are already EOL. This patch removes the compatibility with v1 which requires cascading locks with same lk-owner. In the next version we can make locking-scheme option a dummy and switch completely to v2. BUG: 1401404 Change-Id: Ic9afab8260f5ff4dff5329eb0429811bcb879079 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/16024 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	cluster/afr: Serialize conflicting locks on all subvols	Pranith Kumar K	2016-12-06	2	-33/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: 1) When a blocking lock is issued and the parallel lock phase fails on all subvolumes with EAGAIN, it is not switching to serialized locking phase. 2) When quorum is enabled and locks fail partially it is better to give errno returned by brick rather than the default quorum errno. Fix: Handled this error case and changed op_errno to reflect the actual errno in case of quorum error. BUG: 1369077 Change-Id: Ifac2e4a13686e9fde601873012700966d56a7f31 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15984 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com>
*	glusterd/ganesha : handle volume reset properly for ganesha options	Jiffin Tony Thottan	2016-12-06	3	-34/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The "gluster volume reset" should first unexport the volume and then delete export configuration file. Also reset option is not applicable for ganesha.enable if volume value is "all". This patch also changes the name of create_export_config into manange_export_config Change-Id: Ie81a49e7d3e39a88bca9fbae5002bfda5cab34af BUG: 1397795 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/15914 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	glusterd : coverity fix for string overflow	Muthu-vigneshwaran	2016-12-06	2	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	CID : 1357872, 1357873, 1351695 BUG: 789278 Change-Id: I2ee01a6054326f35de621ee7a1f2afd09c5738fe Signed-off-by: Muthu-vigneshwaran <mvignesh@redhat.com> Reviewed-on: http://review.gluster.org/15989 Tested-by: Muthu Vigneshwaran Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Manikandan Selvaganesh <manikandancs333@gmail.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	afr: fix bug in passing child index in afr_inode_write_fill	Ravishankar N	2016-12-06	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \|	Change-Id: I7b70de317a5f15a3bf483ffe40b971143deddc11 BUG: 1401218 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/16029 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	afr, client: More mem-leak fixes in COMPOUND fop cbk	Krutika Dhananjay	2016-12-04	7	-293/+190
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Bugs found and fixed: 1. Use correct subvolume index in pre-op-writev compound cbk 2. Prevent use-after-free of local->compound_args members in compound fops cbk in protocol/client 3. Fix xdata and xattr leaks in client_process_response 4. Fix possible leak of xdata in client_pre_writev() in test mode. 5. Free req->compound_req_array.compound_req_array_val as well after freeing its members 6. Free tmp_rsp->flock.lk_owner.lk_owner_val in LK fop. Change-Id: I15b646d7d4e0e5cd4ea3d2d6452c815cf2eaf68f BUG: 1401218 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/16020 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	cluster/ec: Check xdata to avoid memory leak	Ashish Pandey	2016-12-02	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: ec_writev_start calls ec_make_internal_fop_xdata to set "yes" in xdata before ec_readv (an internal fop) is called for head and tail. Second call to this function is overwriting the previous allocated dict_t to "xdata", which results in memory leak. Solution: In ec_make_internal_fop_xdata, check if xdata is NULL or not to avoid overwriting xdata. Change-Id: I49b83923e11aff9b92d002e86424c0c2e1f5f74f BUG: 1400818 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/16007 Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	dht/md-cache: Filter invalidate if the file is made a linkto file	Poornima G	2016-12-02	2	-1/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Upcall as a part of setattr, sends an invalidation and the invalidation carries the resulting stat value. When a file is converted to linkto files, even then an invalidation is set and as a result the mountpoint shows the sticky bit in the stat of the file. eg: ---------T. 945 root root 0 Nov 8 10:14 hardlink.999 Fix: When dht recieves a notification of sticky bit change, it updates the flag, to indicate md-cache to send the subsequent lookup. Change-Id: Ic2fd7a5b196db0754f9b97072e644e6bf69da606 BUG: 1392713 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15789 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Susant Palai <spalai@redhat.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
*	selfheal: fix memory leak on client side healing queue	Mateusz Slupny	2016-12-02	3	-3/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	Change-Id: I2beaba829710565a3246f7449a5cd21755cf5f7d BUG: 1399592 Signed-off-by: Mateusz Slupny <mateusz.slupny@appeartv.com> Reviewed-on: http://review.gluster.org/15968 Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
*	cluster/tier: fix op-version for tier-query-limit	Milind Changire	2016-12-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Correct the op-version for tier-query-limit option from 3.9.0 to 3.9.1 Change-Id: I3a52a94c2708a97c18377e945d559a51d8025c41 BUG: 1366648 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: http://review.gluster.org/15990 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	dht/rename : Incase of failure remove linkto file properly	Jiffin Tony Thottan	2016-12-01	2	-2/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Generally linkto file is created using root user. Consider following case, a user is trying to rename a file which he is not permitted. So the rename fails with EACESS and when rename tries to cleanup the linkto file, it fails. The above issue happens when rename/00.t test executed on nfs-ganesha clients : Steps executed in script * create a file "abc" using root * rename the file "abc" to "xyz" using a non root user, it fails with EACESS * delete "abc" * create directory "abc" using root * again try ot rename "abc" to "xyz" using non root user, test hungs here which slowly leds to OOM kill of ganesha process RCA put forwarded by Du for OOM kill of ganesha Note that when we hit this bug, we've a scenario of a dentry being present as: * a linkto file on one subvol * a directory on rest of subvols When a lookup happens on the dentry in such a scenario, the control flow goes into an infinite loop of: dht_lookup_everywhere dht_lookup_everywhere_cbk dht_lookup_unlink_cbk dht_lookup_everywhere_done dht_lookup_directory (as local->dir_count > 0) dht_lookup_dir_cbk (sets to local->need_selfheal = 1 as the entry is a linkto file on one of the subvol) dht_lookup_everywhere (as need_selfheal = 1). This infinite loop can cause increased consumption of memory due to: 1) dht_lookup_directory assigns a new layout to local->layout unconditionally 2) Most of the functions in this loop do a stack_wind of various fops. This results in growing of call stack (note that call-stack is destroyed only after lookup response is received by fuse - which never happens in this case) Thanks Du for root causing the oom kill and Sushant for suggesting the fix Change-Id: I1e16bc14aa685542afbd21188426ecb61fd2689d BUG: 1397052 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/15894 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	glusterd, cli: Fix volume options output format in get-state cli	Samikshan Bairagya	2016-12-01	1	-4/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently the get-state cli outputs the volume options in the following format: Volume1.rebalance.skipped: 0 Volume1.rebalance.lookedup: 0 Volume1.rebalance.files: 0 Volume1.rebalance.data: 0Bytes [Volume1.options] features.barrier: on transport.address-family: inet performance.readdir-ahead: on nfs.disable: on Volume2.name: tv2 Volume2.id: 35854708-bb72-45a5-bdbd-77c51e5ebfb9 Volume2.type: Distribute This above format is a valid ini file format syntactically, but is not very easily parseable. This patch changes the format to look like the following and should be more easily parseable: Volume1.rebalance.skipped: 0 Volume1.rebalance.lookedup: 0 Volume1.rebalance.files: 0 Volume1.rebalance.data: 0Bytes Volume1.options.features.barrier: on Volume1.options.transport.address-family: inet Volume1.options.performance.readdir-ahead: on Volume1.options.nfs.disable: on Volume2.name: tv2 Volume2.id: 35854708-bb72-45a5-bdbd-77c51e5ebfb9 Volume2.type: Distribute Change-Id: I9768b45de288d9817ec669d3a801874eb1914750 BUG: 1399995 Signed-off-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-on: http://review.gluster.org/15975 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shubhendu Tripathi <shtripat@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	uss: snapd should enable SSL if SSL is enabled on volume	Rajesh Joseph	2016-12-01	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	During snapd graph generation we should check if SSL is enabled on main volume or not. This is because clients will communicate with snapd as if it is communicating to a brick. Change-Id: I0d7fe86c567b297a8528a48faf06161d4c3cb415 Signed-off-by: Rajesh Joseph <rjoseph@redhat.com> BUG: 1400013 Reviewed-on: http://review.gluster.org/15979 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaushal M <kaushal@redhat.com>
*	glusterd: Remove duplicate values for glusterd message macros	Samikshan Bairagya	2016-11-30	1	-5/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Both GD_MSG_BRICK_CLEANUP_SUCCESS and GD_MSG_DAEMON_STATE_REQ_RCVD macros are assigned the same value (GLUSTERD_COMP_BASE + 584). Also the number of messages should be 588 instead of 587 Change-Id: I015d32435c05ded1b14cd8ba11911af826bc956b BUG: 1400026 Signed-off-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-on: http://review.gluster.org/15980 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	protocol/server: Fix mem-leaks in compound fops	Krutika Dhananjay	2016-11-30	1	-187/+116
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Remove spurious 'return' statement. * Free up 'compound_rsp_array_val' as well in the end. * Remove multiple refs on this_args->xdata. Change-Id: I212c6dbe4d81b0381c1323d05fdfcc853886b25b BUG: 1399578 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15965 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	all: remove dead translators	Jeff Darcy	2016-11-29	33	-13759/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The following have been completely removed from the source tree, makefiles, configure script, and RPM specfile. cluster/afr/pump cluster/ha cluster/map features/filter features/mac-compat features/path-convertor features/protect Change-Id: I2f966999ac3c180296ff90c1799548fba504f88f Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/15906 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/dht: A hard link is lost during rebalance + lookup	Mohit Agrawal	2016-11-29	1	-40/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: A hard link is lost during rebalance + lookup.Rebalance skip files if file has hardlink.In dht_migrate_file __is_file_migratable () function checks if a file has hardlink, if yes file is not migrated but if link is created after call this function then link will lost. Solution: Call __check_file_has_hardlink to check hardlink existence after (S+T) bits in migration process ,if file has hardlink then skip the file for migrate rebalance process. BUG: 1396048 Change-Id: Ia53c07ef42f1128c2eedf959a757e8df517b9d12 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Reviewed-on: http://review.gluster.org/15866 Reviewed-by: Susant Palai <spalai@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	afr: fix auto-quorum	Jeff Darcy	2016-11-28	3	-49/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(1) afr_have_quorum is dead code. It was copied to afr_has_quorum, and everything else uses that, but the original was never deleted (until now). (2) Auto-quorum should be default for any N>2. Leaving quorum disabled is BAD, but apparently deemed acceptable for N=2 because there's no real quorum in that case. For any larger number (including arbiter configurations) there is such a thing as real quorum and we should use it by default. Note that for N=3 the answers we get from "N % 2" (the old check) and "N > 2" (the new one) are the same. (3) The special case for even N in afr_has_quorum has been simplified and explained more thoroughly in a comment. Change-Id: I48b33c15093512fecf516b26dcf09afecb7ae33b Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/15873 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	afr: Fix the EIO that can occur in afr_inode_refresh as a result	Poornima G	2016-11-28	4	-18/+84
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	of cache invalidation(upcall). Issue: ------ When a cache invalidation is recieved as a result of changing pending xattr, the read_subvol is reset. Consider the below chain of execution: CHILD_DOWN ... afr_readv ... afr_inode_refresh ... afr_inode_read_subvol_reset <- as a result of pending xattr set by some other client GF_EVENT_UPCALL will be sent afr_refresh_done -> this results in an EIO, as the read subvol was reset by the end of the afr_inode_refresh Solution: --------- When GF_EVENT_UPCALL is recieved, instead of resetting read_subvol, set a variable need_refresh in inode_ctx, the next time some one starts a txn, along with event gen, need_rrefresh also needs to be checked. Change-Id: Ifda21a7a8039b8874215e1afa4bdf20f7d991b58 BUG: 1396952 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15892 Reviewed-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	cluster/ec: Healing should not start if only "data" bricks are UP	Ashish Pandey	2016-11-28	1	-11/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: In a disperse volume with "K+R" configuration, where "K" is the number of data bricks and "R" is the number of redundancy bricks (Total number of bricks, N = K+R), if only K bricks are UP, we should NOT start heal process. This is because the bricks, which are supposed to be healed, are not UP. This will unnecessary eat up the resources. Solution: Check for the number of xl_up_count and only if it is greater than ec->fragments (number of data bricks), start heal process. Change-Id: I8579f39cfb47b65ff0f76e623b048bd67b15473b BUG: 1399072 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/15937 Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
*	cluster/afr: CLI for granular entry heal enablement/disablement	Krutika Dhananjay	2016-11-28	2	-15/+65
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When there are already existing non-granular indices created that are yet to be healed, if granular-entry-heal option is toggled from 'off' to 'on', AFR self-heal whenever it kicks in, will try to look for granular indices in 'entry-changes'. Because of the absence of name indices, granular entry healing logic will fail to heal these directories, and worse yet unset pending extended attributes with the assumption that are no entries that need heal. To get around this, a new CLI is introduced which will invoke glfsheal program to figure whether at the time an attempt is made to enable granular entry heal, there are pending heals on the volume OR there are one or more bricks that are down. If either of them is true, the command will be failed with the appropriate error. New CLI: gluster volume heal <VOL> granular-entry-heal {enable,disable} Change-Id: I1f4fe8162813b9068e198965d94169fee4adc099 BUG: 1370410 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15747 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	protocol/server: capture offset in seek	Ravishankar N	2016-11-28	3	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: http://review.gluster.org/11482 implemented seek FOP but http://review.gluster.org/#/c/14137/ 'undid' the change where we pack the offset returned by seek in server xlator before sending it to the client. As a result, seek always returns zero to the client for SEEK_HOLE/ SEEK_DATA. Fix: I think 14137 removed it unintentionally, hence adding it back again. Signed-off-by: Ravishankar N <ravishankar@redhat.com> Change-Id: I67a1f7b53214b043c5291f5704be4a50b698f91c BUG: 1398076 Reviewed-on: http://review.gluster.org/15920 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	afr: allow I/O when favorite-child-policy is enabled	Ravishankar N	2016-11-27	8	-61/+387
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Currently, I/O on a split-brained file fails even when the favorite-child-policy is set until the self-heal is complete. Fix: If a valid 'source' is found using the set favorite-child-policy, inspect and reset the afr pending xattrs on the 'sinks' (inside appropriate locks), refresh the inode and then proceed with the read or write transaction. The resetting itself happens in the self-heal code and hence can also happen in the client side background-heal or by the shd's index-heal in addition to the txn code path explained above. When it happens in via heal, we also add checks in undo-pending to not reset the sink xattrs again. Change-Id: Ic8c1317720cb26bd114b6fe6af4e58c73b864626 BUG: 1386188 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reported-by: Simon Turcotte-Langevin <simon.turcotte-langevin@ubisoft.com> Reviewed-on: http://review.gluster.org/15673 Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	protocol/server: Print pargfid in logs for rename error	Pranith Kumar K	2016-11-27	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	BUG: 1394548 Change-Id: I42ee627c8cdf54158f083f9019a096ace449e3cc Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15872 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com>
*	cluster/afr: Fix bugs in [f]inodelk/[f]entrylk	Pranith Kumar K	2016-11-26	5	-336/+381
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problems: 1) Inodelk is not taking quorum into account 2) finodelk, [f]entrylk are not implemented correctly 3) By default afr doesn't go for non-blocking parallel locks. Fix: Implemented a common framework which can be used by [f]inodelk/[f]entrylk. Used quorum for the same. Change-Id: I239f13875a065298630d266941df10cfa3addc85 BUG: 1369077 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15802 Tested-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
*	cluster/afr: Fix deadlock due to compound fops	Krutika Dhananjay	2016-11-26	1	-14/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When an afr data transaction is eligible for using eager-lock, this information is represented in local->transaction.eager_lock_on. However, if non-blocking inodelk attempt (which is a full lock) fails, AFR falls back to blocking locks which are range locks. At this point, local->transaction.eager_lock[] per brick is reset but local->transaction.eager_lock_on is still true. When AFR decides to compound post-op and unlock, it is after confirming that the transaction did not use eager lock (well, except for a small bug where local->transaction.locks_acquired[] is not considered). But within afr_post_op_unlock_do(), afr again incorrectly sets the lock range to full-lock based on local->transaction.eager_lock_on value. This is a bug and can lead to deadlock since the locks acquired were range locks and a full unlock is being sent leading to unlock failure and thereby every other lock request (be it from SHD or other clients or glfsheal) getting blocked forever and the user perceives a hang. FIX: Unconditionally rely on the range locks in inodelk object for unlocking when using compounded post-op + unlock. Big thanks to Pranith for helping with the debugging. Change-Id: Idb4938f90397fb4bd90921f9ae6ea582042e5c67 BUG: 1398566 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15929 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	cluster/afr: Handle rpc errors, xdr failures etc with proper NULL checks	Krutika Dhananjay	2016-11-24	1	-6/+17
\| \| \| \| \| \| \| \| \| \| \|	Change-Id: Id8ba76ba116d056bc7299dc5ce0980680a5a23f8 BUG: 1398226 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15924 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	features/index: Delete granular entry indices of already healed directories ↵	Krutika Dhananjay	2016-11-24	1	-2/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	during crawl If granular name indices are already in existence for a volume, and before they are healed, granular entry heal be disabled, a crawl on indices/xattrop will clear the changelogs on these directories. When their corresponding entry-changes indices are crawled subsequently, if it is found that the directories don't need heal anymore, the granular indices are not cleaned up. This patch fixes that problem by ensuring that the zero-xattrop also deletes the stale indices at the level of index translator. Change-Id: Ifbaa6bec2a14e3041addfee4054131babbf4d35e BUG: 1370410 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15880 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	io-cache: Fix a read hang	Poornima G	2016-11-23	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Issue: ===== In certain cases, there was no unwind of read from read-ahead xlator, thus resulting in hang. RCA: ==== In certain cases, ioc_readv() issues STACK_WIND_TAIL() instead of STACK_WIND(). One such case is when inode_ctx for that file is not present (can happen if readdirp was called, and populates md-cache and serves all the lookups from cache). Consider the following graph: ... io-cache (parent) \| readdir-ahead \| read-ahead ... Below is the code snippet of ioc_readv calling STACK_WIND_TAIL: ioc_readv() { ... if (!inode_ctx) STACK_WIND_TAIL (frame, FIRST_CHILD (frame->this), FIRST_CHILD (frame->this)->fops->readv, fd, size, offset, flags, xdata); /* Ideally, this stack_wind should wind to readdir-ahead:readv() but it winds to read-ahead:readv(). See below for explaination. / ... } STACK_WIND_TAIL (frame, obj, fn, ...) { frame->this = obj; / for the above mentioned graph, frame->this will be readdir-ahead * frame->this = FIRST_CHILD (frame->this) i.e. readdir-ahead, which * is as expected / ... THIS = obj; / THIS will be read-ahead instead of readdir-ahead!, as obj expands * to "FIRST_CHILD (frame->this)" and frame->this was pointing * to readdir-ahead in the previous statement. / ... fn (frame, obj, params); / fn will call read-ahead:readv() instead of readdir-ahead:readv()! * as fn expands to "FIRST_CHILD (frame->this)->fops->readv" and * frame->this was pointing ro readdir-ahead in the first statement / ... } Thus, the readdir-ahead's readv() implementation will be skipped, and ra_readv() will be called with frame->this = "readdir-ahead" and this = "read-ahead". This can lead to corruption / hang / other problems. But in this perticular case, when 'frame->this' and 'this' passed to ra_readv() doesn't match, it causes ra_readv() to call ra_readv() again!. Thus the logic of read-ahead readv() falls apart and leads to hang. Solution: ========= Ideally, STACK_WIND_TAIL() should be modified as: STACK_WIND_TAIL (frame, obj, fn, ...) { next_xl = obj / resolve obj as the variables passed in obj macro can be overwritten in the further instrucions / next_xl_fn = fn / resolve fn and store in a tmp variable, before modifying any variables */ frame->this = next_xl; ... THIS = next_xl; ... next_xl_fn (frame, next_xl, params); ... } But for this solution, knowing the type of variable 'next_xl_fn' is a challenge and is not easy. Hence just modifying all the existing callers to pass "FIRST_CHILD (this)" as obj, instead of "FIRST_CHILD (frame->this)". Change-Id: I179ffe3d1f154bc5a1935fd2ee44e912eb0fbb61 BUG: 1388292 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15901 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	glusterd: fix few events generation	Atin Mukherjee	2016-11-23	5	-7/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch does the following: 1. Generate PEER_REJECT event if the peer add request is from an unknown peer during peer handshaking. 2. EVENT_COMPARE_FRIEND_VOLUME_FAILED should be generated based on status code, not ret. 3. Add EVENT_BRICKPATH_RESOLVE_FAILED event in case glusterd fails to resolve bricks, this is mainly at restore path. 4. Remove EVENT_BRICKS_START_FAILED event as we already have EVENT_BRICK_START_FAILED Change-Id: I90e5bc4a331166d0bb3554eb2ec9df2526837a1d BUG: 1397424 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/15903 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Samikshan Bairagya <samikshan@gmail.com>
*	protocol/server: Remove unused variable	Anoop C S	2016-11-22	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \|	Change-Id: I0d0a786b2d02d4db37c4da6194ee4b4feac31b63 BUG: 1198849 Signed-off-by: Anoop C S <anoopcs@redhat.com> Reviewed-on: http://review.gluster.org/15899 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	glusterd: dump volinfo->dict in gluster get-state	Atin Mukherjee	2016-11-22	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \|	Change-Id: I7e60629fb8003c620847fa63441f6b098db59721 BUG: 1396807 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/15889 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-by: Rohan Kanade <rkanade@redhat.com>
*	performance/io-threads: Exit threads in fini() as well	Pranith Kumar K	2016-11-21	2	-11/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: io-threads starts the thread in 'init()' but doesn't clean them up on 'fini()'. It relies on PARENT_DOWN to exit threads but there can be cases where event before PARENT_UP the graph init code can think of issuing fini(). This code path is hit when glfs_init() is called on a volume that is in 'stopped' state. It leads to a crash in ganesha process, because the io-thread tries to access freed memory. Fix: Ideal fix would be to wait for all fops in io-thread list to be completed on PARENT_DOWN, and have fini() do cleanup of threads. Because there is no proper documentation about how PARENT_DOWN/fini are supposed to be used, we are getting different kinds of sequences in different higher level protocols. So for now cleaning up in both PARENT_DOWN and fini(). Fuse doesn't call fini() gfapi is not calling PARENT_DOWN in some cases, so for now I don't see another way out. BUG: 1396793 Change-Id: I9c9154e7d57198dbaff0f30d3ffc25f6d8088aec Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15888 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	cluster/dht Set layout after mkdir as root	N Balachandran	2016-11-21	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	DHT does not set the layout for newly created directories as root. This causes EPERM failures when a non-root user with insufficient permissions creates directories. credit: srangana@redhat.com for RCA Change-Id: Ia646e41665ce172c43c5f01d2707455e8eb374ed BUG: 1392772 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/15794 Reviewed-by: Susant Palai <spalai@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	afr,dht,ec: Replace GF_EVENT_CHILD_MODIFIED with event SOME_DESCENDENT_DOWN/UP	Poornima G	2016-11-21	5	-34/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently these are few events related to child_up/down: GF_EVENT_CHILD_UP : Issued when any of the protocol client connects. GF_EVENT_CHILD_MODIFIED : Issued by afr/dht/ec GF_EVENT_CHILD_DOWN : Issued when any of the protocol client disconnects. These events get modified at the dht/afr/ec layers. Here is a brief on the same. DHT: - All the subvolumes reported once, and atleast one child came up, then GF_EVENT_CHILD_UP is issued - connect GF_EVENT_CHILD_UP is issued - disconnect GF_EVENT_CHILD_MODIFIED is issued - All the subvolumes disconnected, GF_EVENT_CHILD_DOWN is issued AFR: - First subvolume came up, then GF_EVENT_CHILD_UP is issued - Subsequent subvolumes coming up, results in GF_EVENT_CHILD_MODIFIED - Any of the subvolumes go down, then GF_EVENT_SOME_CHILD_DOWN is issued - Last up subvolume goes down, then GF_EVENT_CHILD_DOWN is issued Until the patch [1] introduced GF_EVENT_SOME_CHILD_UP, GF_EVENT_CHILD_MODIFIED was issued by afr/dht when any of the subvolumes go up or down. Now with md-cache changes, there is a necessity to differentiate between child up and down. Hence, introducing GF_EVENT_SOME_DESCENDENT_DOWN/UP and getting rid of GF_EVENT_CHILD_MODIFIED. [1] http://review.gluster.org/12573 Change-Id: I704140b6598f7ec705493251d2dbc4191c965a58 BUG: 1396038 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15764 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
*	events: Add FMT_WARN for gf_event	Pranith Kumar K	2016-11-18	5	-16/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Raghavendra G found that posix is trying to print %s but passing an int when HEALTH_CHECK fails in posix. These are the kind of bugs that should be caught at compilation itself. Also fixed the problematic gf_event() callers. BUG: 1386097 Change-Id: Id7bd6d9a9690237cec3ca1aefa2aac085e8a1270 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15671 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>