glusterfs.git -

	Commit message (Collapse)	Author	Age	Files	Lines
*	gfapi: async fops should unref in callbacks	Raghavendra Talur	2016-11-05	3	-18/+215
	If fd is unref'd at the end of an async call, the unref in the callbacks leads to a double unref and a possible crash. Removed the duplicate unrefs and added unref only in failure cases. A simple test case has been added to test the async write case. The same needs to be extended to the other async APIs too. Details: All glfd based calls in libgfapi, except for glfs_open and glfs_close, behave in the same way. At the start of the operation, they take a ref on glfd and fd. At the end of the operation, they unref it. Async calls are a little different as they unref in the cbk function. A successful open call does not unref either the glfd or fd, thereby functioning as a reference for an OPEN file object. glfs_close makes a syncop_flush call sandwiched between a fd ref and unref (this can be removed, more on this below), followed by a call to glfs_mark_glfd_for_deletion which unrefs glfd and also calls glfs_fd_destroy as a release function, thereby doing an unref on fd too. Functionally, there is no problem with how everything works as described above. However, it is a little non-intuitive that we need to perform an fd_unref as a consequence of an implicit fd_ref that happens within glfs_resolve_fd. As we perform a GF_REF_GET(glfd) at the start of every operation, it would be worthwhile to remove the fd_ref that glfs_resolve_fd takes and do away with explicit fd_unref()s at the end of every operation. This is the same reason why we don't need the fd_ref in glfs_close. This is however not in the scope of this patch.
Change-Id: I86b1d3b2ad846b16ea527d541dc82b5e90b0ba85 BUG: 1391086 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: http://review.gluster.org/15768 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: Prasanna Kumar Kalever <pkalever@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
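The ref/unref discipline described in this commit can be sketched with a toy model. This is only an illustration in Python, not gfapi code: `RefCounted`, `async_write`, `submit` and `callback` are hypothetical names, and the real implementation works on glfd and fd objects in C.

```python
class RefCounted:
    """Toy stand-in for a refcounted glfd/fd pair (not the gfapi types)."""

    def __init__(self):
        self.refs = 1  # the reference kept by a successful open

    def ref(self):
        self.refs += 1

    def unref(self):
        if self.refs <= 0:
            raise RuntimeError("double unref")
        self.refs -= 1


def async_write(handle, submit, callback):
    """Take a ref, submit the op, and unref here only if submission fails."""
    handle.ref()

    def on_complete(result):
        try:
            callback(result)
        finally:
            handle.unref()  # success path: the ref is dropped in the callback

    try:
        submit(on_complete)
    except OSError:
        handle.unref()  # failure path: the callback will never run
        raise
```

The point of the sketch is that exactly one unref balances the ref taken at the start: either the completion callback drops it, or the submitting function drops it when the operation could not be queued.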
*	posix-acl: check dictionary before using it	Rajesh Joseph	2016-11-04	1	-0/+3
	If extended attributes are not present in md-cache it returns NULL as xattr. posix acl xlator should check for NULL before using xattr. If normal and default ACLs are not set on file then md-cache will not contain system.posix_acl_access and system.posix_acl_default extended attributes in its cache. Therefore posix_acl_lookup_cbk should check xattr before using it, otherwise the logs will get filled with dictionary errors. Change-Id: Icebf73cf0b313bd3e82ca8cbda63786dd0fa47da BUG: 1391387 Signed-off-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-on: http://review.gluster.org/15769 Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
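A minimal sketch of the guard this commit describes, written in Python with a plain dict standing in for the xattr dictionary. The helper name `acls_from_lookup` is hypothetical; only the two xattr key names come from the commit message.

```python
ACL_KEYS = ("system.posix_acl_access", "system.posix_acl_default")


def acls_from_lookup(xattr):
    """Return the ACL xattrs found in a lookup reply, tolerating xattr=None."""
    if xattr is None:
        # Nothing was cached for this file, so there is nothing to read.
        return {}
    return {key: xattr[key] for key in ACL_KEYS if key in xattr}
```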
*	gfapi/upcall: Fix mismatch in few upcall API SYMVER	Soumya Koduri	2016-11-03	1	-3/+3
	There is a mismatch in a few of the upcall API routine definitions and their corresponding symbol version declarations. Fixed the same. Change-Id: I2edfd9546a4c6a9128757f3b68e3ae4edd2c7a79 BUG: 1344714 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/15760 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: Kaleb KEITHLEY <kkeithle@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	xlators/trash : Remove upper limit for trash max file size	Jiffin Tony Thottan	2016-11-03	2	-18/+2
	Currently, a file whose size exceeds 1GB is never moved to the trash directory. This is due to the hard coded check using GF_ALLOWED_MAX_FILE_SIZE. Change-Id: I2ed707bfe1c3114818896bb27a9856b9a164be92 BUG: 1386766 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/15689 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Anoop C S <anoopcs@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	snapshot: Fix the failure to recreate clones with same name	Avra Sengupta	2016-11-01	5	-17/+57
	The brick path of snapshot clones contained the clone name, so creating a newer clone with the same name failed after the original clone had been deleted. This fix creates the brick path with the clone's vol id instead of the clone's name. Hence future clones with the same name will not have the namespace clash. Change-Id: I262712adc576122f051b5d1ce171d020efaefd1a BUG: 1387160 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/15683 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
*	performance/write-behind: fix flush stuck by former failed writes	Ryan Ding	2016-11-01	1	-0/+4
	The issue happens in this case: assume a file is opened with fd1 and fd2. 1. Some WRITE ops to fd1 got errors; they were added back to the 'todo' queue because of those errors. 2. fd2 is closed, and a FLUSH op is sent to write-behind. 3. The FLUSH cannot be unwound because it is not a legal waiter for those failed writes (as func __wb_request_waiting_on() says), and those failed WRITEs also cannot be ended if fd1 is not closed. fd2 is stuck in the close syscall. To resolve this issue, we can change the way we determine whether 2 requests 'conflict': a flush/fsync does not conflict with writes that do not belong to it, so __wb_pick_winds() can wind the FLUSH op. 
below is some information when the stuck issue happen: glusterdump logs: [xlator.performance.write-behind.wb_inode] path=/ltp-F9eG0ZSOME/rw-buffered-16436 inode=0x7fdbe8039b9c window_conf=1048576 window_current=249856 transit-size=0 dontsync=0 [.WRITE] request-ptr=0x7fdbe8020200 refcount=1 wound=no generation-number=4 req->op_ret=-1 req->op_errno=116 sync-attempts=3 sync-in-progress=no size=131072 offset=1220608 lied=-1 append=0 fulfilled=0 go=0 [.WRITE] request-ptr=0x7fdbe8068c30 refcount=1 wound=no generation-number=5 req->op_ret=-1 req->op_errno=116 sync-attempts=2 sync-in-progress=no size=118784 offset=1351680 lied=-1 append=0 fulfilled=0 go=0 [.FLUSH] request-ptr=0x7fdbe8021cd0 refcount=1 wound=no generation-number=6 req->op_ret=0 req->op_errno=0 sync-attempts=0 gdb detail about above 3 requests: (gdb) print ((wb_request_t )0x7fdbe8021cd0) $2 = {all = {next = 0x7fdbe803a608, prev = 0x7fdbe8068c30}, todo = {next = 0x7fdbe803a618, prev = 0x7fdbe8068c40}, lie = {next = 0x7fdbe8021cf0, prev = 0x7fdbe8021cf0}, winds = {next = 0x7fdbe8021d00, prev = 0x7fdbe8021d00}, unwinds = {next = 0x7fdbe8021d10, prev = 0x7fdbe8021d10}, wip = { next = 0x7fdbe8021d20, prev = 0x7fdbe8021d20}, stub = 0x7fdbe80224dc, write_size = 0, orig_size = 0, total_size = 0, op_ret = 0, op_errno = 0, refcount = 1, wb_inode = 0x7fdbe803a5f0, fop = GF_FOP_FLUSH, lk_owner = {len = 8, data = "W\322T\f\271\367y$", '\000' <repeats 1015 times>}, iobref = 0x0, gen = 6, fd = 0x7fdbe800f0dc, wind_count = 0, ordering = {size = 0, off = 0, append = 0, tempted = 0, lied = 0, fulfilled = 0, go = 0}} (gdb) print ((wb_request_t )0x7fdbe8020200) $3 = {all = {next = 0x7fdbe8068c30, prev = 0x7fdbe803a608}, todo = {next = 0x7fdbe8068c40, prev = 0x7fdbe803a618}, lie = {next = 0x7fdbe8068c50, prev = 0x7fdbe803a628}, winds = {next = 0x7fdbe8020230, prev = 0x7fdbe8020230}, unwinds = {next = 0x7fdbe8020240, prev = 0x7fdbe8020240}, wip = { next = 0x7fdbe8020250, prev = 0x7fdbe8020250}, stub = 0x7fdbe8062c3c, write_size = 
131072, orig_size = 4096, total_size = 0, op_ret = -1, op_errno = 116, refcount = 1, wb_inode = 0x7fdbe803a5f0, fop = GF_FOP_WRITE, lk_owner = {len = 8, data = '\000' <repeats 1023 times>}, iobref = 0x7fdbe80311a0, gen = 4, fd = 0x7fdbe805c89c, wind_count = 3, ordering = {size = 131072, off = 1220608, append = 0, tempted = -1, lied = -1, fulfilled = 0, go = 0}} (gdb) print ((wb_request_t )0x7fdbe8068c30) $4 = {all = {next = 0x7fdbe8021cd0, prev = 0x7fdbe8020200}, todo = {next = 0x7fdbe8021ce0, prev = 0x7fdbe8020210}, lie = {next = 0x7fdbe803a628, prev = 0x7fdbe8020220}, winds = {next = 0x7fdbe8068c60, prev = 0x7fdbe8068c60}, unwinds = {next = 0x7fdbe8068c70, prev = 0x7fdbe8068c70}, wip = { next = 0x7fdbe8068c80, prev = 0x7fdbe8068c80}, stub = 0x7fdbe806746c, write_size = 118784, orig_size = 4096, total_size = 0, op_ret = -1, op_errno = 116, refcount = 1, wb_inode = 0x7fdbe803a5f0, fop = GF_FOP_WRITE, lk_owner = {len = 8, data = '\000' <repeats 1023 times>}, iobref = 0x7fdbe8052b10, gen = 5, fd = 0x7fdbe805c89c, wind_count = 2, ordering = {size = 118784, off = 1351680, append = 0, tempted = -1, lied = -1, fulfilled = 0, go = 0}} you can see they are all on 'todo' queue, and FLUSH op fd is not the same WRITE op fd. Change-Id: Id687f9cd3b9f281e1a97c83f1ce981ede272b8ab BUG: 1372211 Signed-off-by: Ryan Ding <ryan.ding@open-fs.com> Reviewed-on: http://review.gluster.org/15380 Tested-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>
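The conflict rule stated in this commit message can be sketched as a small predicate. This is a simplified Python model under stated assumptions: `Request` and `conflicts` are hypothetical names, and the real write-behind check also considers things this sketch ignores (offsets, generation numbers, and so on).

```python
from dataclasses import dataclass


@dataclass
class Request:
    fop: str  # "WRITE", "FLUSH" or "FSYNC"
    fd: int   # stand-in for the fd the request was issued on


def conflicts(waiter, pending):
    """Simplified rule: does `waiter` have to wait for `pending`?"""
    if waiter.fop in ("FLUSH", "FSYNC") and pending.fop == "WRITE":
        # A flush/fsync only waits for writes issued on its own fd.
        return waiter.fd == pending.fd
    return True  # everything else is treated conservatively in this sketch
```

With this rule, the FLUSH on fd2 from the dump above would not be held back by the failed WRITEs that belong to fd1.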
*	build: incorrect Requires for portblock resource agent	Kaleb S. KEITHLEY	2016-10-28	1	-2/+4
	was: Requires: /usr/lib/ocf/resource.d/portblock s/b: Requires: /usr/lib/ocf/resource.d/heartbeat/portblock or: Requires: resource-agents >= 3.9.6 Note: RHEL6.8 and RHEL7.2 have resource-agents-3.9.5 which does not contain the portblock resource agent. I'm not sure what the point is actually of: Requires: /usr/lib/ocf/resource.d/heartbeat/portblock as it will fail to install on RHEL whether you have the resource-agents package installed or not. Hence wrapping it in %if ( fedora ). Change-Id: Ia7d6a475464c7469018678c98fc710a3b3bfc553 BUG: 1389293 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/15743 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	tests: fix cleanup in bug-1110262.t	Jeff Darcy	2016-10-28	1	-7/+8
	The test wasn't cleaning up the user/group it had created if it terminated abnormally. We have a mechanism (push_trapfunc) to add cleanup actions in a way that ensures they'll be run when necessary, so I changed the test to use it. While I was there, I fixed it to use kill_brick instead of "ps|grep|kill" because that will be necessary for it to pass with brick multiplexing anyway. Change-Id: Ia515bd2420050f922970d28c5856c55df9b5247b Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/15744 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	glusterd/shared storage: Check for hook-script at staging	Avra Sengupta	2016-10-27	2	-6/+25
	Check if S32gluster_enable_shared_storage.sh is present at /var/lib/glusterd/hooks/1/set/post/ at staging before proceeding with the command. Fail the command with the appropriate error message in case it is not present. Change-Id: I84e3912f1cdffb927f8a40d74d52be43ee69388b BUG: 1388348 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/15718 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	md-cache: Invalidate cache entry for open() with O_TRUNC	Soumya Koduri	2016-10-26	1	-0/+48
	When a file is opened with O_TRUNC flag set, its size gets set to '0'. This case needs to be handled in md-cache to avoid sending incorrect cached stat. Change-Id: I95d1f8a6634734898883ede010c3e7b0b7eb97d9 BUG: 1382266 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/15618 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> Tested-by: jiffin tony Thottan <jthottan@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
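The invalidation idea can be sketched with a tiny cache model. This is an illustrative Python sketch, not md-cache code: `StatCache` and its methods are hypothetical, and only the O_TRUNC condition comes from the commit message.

```python
import os


class StatCache:
    """Toy per-path cache of file sizes, standing in for cached stat data."""

    def __init__(self):
        self._sizes = {}

    def store(self, path, size):
        self._sizes[path] = size

    def cached_size(self, path):
        # None means "not cached"; the caller has to fetch a fresh stat.
        return self._sizes.get(path)

    def on_open(self, path, flags):
        if flags & os.O_TRUNC:
            # The file is truncated to zero, so the cached size is stale.
            self._sizes.pop(path, None)
```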
*	cluster/tier: break the monolith processing function	Milind Changire	2016-10-26	2	-309/+431
	Break tier_migrate_using_query_file() into a more manageable tier_migrate_link() and helpers. Change-Id: I5eb2d2cff9e7a2a2da567c3c4c2d53aab195f477 BUG: 1358296 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: http://review.gluster.org/14957 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
*	afr,ec: Heal device files with correct major, minor numbers	Pranith Kumar K	2016-10-26	4	-13/+27
	Thanks a lot to xiaoping.wu@nokia.com from Nokia for the bug and the fix. BUG: 1384297 Change-Id: Ie443237e85d34633b5dd30f85eaa2ac34e45754c Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15728 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	tools/glusterfind: kill remote processes and separate run-time directories	Milind Changire	2016-10-25	4	-16/+60
	Problem #1: Hitting CTRL+C leaves stale processes on remote nodes if glusterfind pre has been initiated. Solution #1: Adding "-t -t" to ssh command-line forces pseudo-terminal to be assigned to remote process. When local process receives Keyboard Interrupt, SIGHUP is immediately conveyed to the remote terminal causing remote changelog.py process to terminate immediately. Problem #2: Concurrent glusterfind pre runs are not possible on the same glusterfind session in case of a runaway process. Solution #2: glusterfind pre runs now add random directory name to the working directory to store and manage temporary database and changelog processing. If KeyboardInterrupt is received, the function call run_cmd_nodes("cleanup", args, tmpfilename=gtmpfilename) cleans up the remote run specific directory. Patch: 7571380 cli/xml: Fix wrong XML format in volume get command broke "gluster volume get <vol> changelog.rollover-time --xml" Now fixed function utils.py::get_changelog_rollover_time() Fixed spurious trailing space getting written if second path is empty in main.py::write_output() Fixed repetitive changelog processing in changelog.py::get_changes() Change-Id: Ia8d96e2cd47bf2a64416bece312e67631a1dbf29 BUG: 1382236 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: http://review.gluster.org/15609 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Aravinda VK <avishwan@redhat.com>
*	CLI/TIER: throw warning regarding the removal of the older commands.	hari	2016-10-25	1	-0/+28
	The older tier commands for attach tier and detach tier have to be removed from the code. This patch emits a warning asking users to switch to the new commands, as the older ones are deprecated and will be removed. Change-Id: Ie1c62947bad6ff106f40331ff6134838a6c72a7a BUG: 1388062 Signed-off-by: hari <hgowtham@redhat.com> Reviewed-on: http://review.gluster.org/15713 Tested-by: hari gowtham <hari.gowtham005@gmail.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
*	glusterd: use GF_BRICK_STOPPING as intermediate brickinfo->status state	Atin Mukherjee	2016-10-25	2	-1/+3
	On a volume stop trigger, glusterd issues a brick-op to terminate the brick process during the brick-op phase. However, in the commit-op glusterd once again tries to kill the same process if it exists and then sets the brickinfo->status flag to GF_BRICK_STOPPED. In the former case, if the brick is successfully killed there is a possibility that GlusterD will receive RPC_CLNT_DISCONNECT from the said brick process before the commit-op phase is even executed, and hence by that time brickinfo->status will still be set to GF_BRICK_STARTED. The BRICK_DISCONNECT event should only be sent if a brick has been killed and not through a volume stop/remove brick trigger; however, due to this sequence, the event is also sent out on a volume stop. The fix is to introduce an intermediate state GF_BRICK_STOPPING which can be used to mark the brick status at the brick-op phase of volume stop/remove brick, to avoid sending spurious BRICK_DISCONNECT events on a volume stop trigger. Change-Id: Ieed4450e1c988715e0f9958be44faa6b14be81e1 BUG: 1387652 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/15699 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaushal M <kaushal@redhat.com>
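The role of the intermediate state can be sketched as a small state machine. This is a hedged Python model, not glusterd code: the `Brick` class and its method names are hypothetical, while the three states and the BRICK_DISCONNECT event name mirror the commit message.

```python
from enum import Enum


class BrickStatus(Enum):
    STARTED = 1
    STOPPING = 2  # intermediate state set during the brick-op phase
    STOPPED = 3


class Brick:
    def __init__(self):
        self.status = BrickStatus.STARTED
        self.events = []

    def brick_op_stop(self):
        # Volume stop, brick-op phase: mark the brick as going down on purpose.
        self.status = BrickStatus.STOPPING

    def commit_op_stop(self):
        self.status = BrickStatus.STOPPED

    def on_rpc_disconnect(self):
        # Only an unexpected disconnect of a running brick raises the event.
        if self.status is BrickStatus.STARTED:
            self.events.append("BRICK_DISCONNECT")
```

If the disconnect notification arrives between the brick-op and commit-op phases, the brick is already in STOPPING, so no spurious event is recorded.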
*	cluster/dht: Incorrect volname in rebalance events	N Balachandran	2016-10-25	1	-6/+50
	The rebalance event code was using strtok to parse the volume name which is incorrect. Reworked the code to get the correct volume name using strstr. Change-Id: Ib5f3305a34e6bf1ecfef677d87c5aff96bdeb0e6 BUG: 1388010 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/15712 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
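One common way strtok goes wrong for this kind of parsing is that its second argument is a set of delimiter characters, not a substring. The Python sketch below models that behaviour and contrasts it with a whole-substring match in the spirit of strstr. The "-dht" suffix and both helper names are assumptions made for illustration; the commit message does not say which delimiter or name format the original code used.

```python
def first_token(text, delims):
    """First token as C strtok() would return it: `delims` is a character set."""
    start = 0
    while start < len(text) and text[start] in delims:
        start += 1  # skip leading delimiter characters
    end = start
    while end < len(text) and text[end] not in delims:
        end += 1    # stop at the first character that is in the set
    return text[start:end]


def volname_by_suffix(xlator_name, suffix="-dht"):
    """Cut at the last occurrence of the whole suffix; return the name unchanged if absent."""
    idx = xlator_name.rfind(suffix)
    return xlator_name if idx < 0 else xlator_name[:idx]
```

For a hypothetical name such as `test-dht`, the character-set tokenizer skips the leading `t` and stops at the next `t`, while the substring match recovers `test`.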
*	snapshot: Fix for memory leaks in snapshot code path	Avra Sengupta	2016-10-25	3	-12/+47
	Change-Id: Idc2cb16574d166e3c0ee1f7c3a485f1acb19fc8c BUG: 1386088 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/15668 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
*	rpc: Fix the race between notification and reconnection	Pranith Kumar K	2016-10-24	1	-3/+4
	Problem: There was a hang because unlock on an entry failed with ENOTCONN. The client thinks the connection is down whereas the server thinks the connection is up. This is the race we are seeing: 1) Connection from client to the brick disconnects. 2) Saved frames unwind is called which unwinds all frames that were wound before disconnect. 3) Connection from client to the brick happens, followed by setvolume. 4) Disconnect notification for the connection in 1) comes now and calls client_rpc_notify() which marks the connection to be offline even when the connection is up. This is happening because I/O can retrigger connection before disconnect notification is sent to the higher layers in rpc. Fix: Notify the higher layers that a disconnect happened and then go ahead with reconnect logic. For the logs which point to the information above check: https://bugzilla.redhat.com/show_bug.cgi?id=1386626#c1 Thanks to Raghavendra G for suggesting the correct fix. BUG: 1386626 Change-Id: I3c84ba1f17010bd69049fa88ec5f0ae431f8cda9 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15681 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
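The ordering problem can be replayed with a very small model. This Python sketch is only an illustration: `ClientState` and `handle_disconnect` are hypothetical, and the two branches simply replay the event orders described in the commit message rather than modelling real threads.

```python
class ClientState:
    """Toy view of what the upper layer believes about the connection."""

    def __init__(self):
        self.connected = True

    def notify(self, event):
        self.connected = (event == "CONNECT")


def handle_disconnect(state, notify_first):
    """Replay a transport disconnect followed by an immediate reconnect."""
    if notify_first:
        state.notify("DISCONNECT")  # upper layers learn of the drop first
        state.notify("CONNECT")     # then the reconnect is reported
    else:
        state.notify("CONNECT")     # reconnect wins the race
        state.notify("DISCONNECT")  # late notification for the old connection
    return state.connected
```

With the notification delivered first, the final state matches reality (connected). With the racy order, the client ends up believing it is offline while the server still sees a live connection.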
*	protocol/client: reduce memory usage	N Balachandran	2016-10-24	1	-1/+1
	readdirp calls use a lot of memory in case of a large number of files. The dict->extra_free is not used here so free buf immediately. Change-Id: I097f5dde2df471f5834264152711110a3bdb7e9a BUG: 1380249 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/15593 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	geo-rep: Upgrade conf file only if it is session config	Aravinda VK	2016-10-24	1	-1/+4
	Ignore the config upgrade if it is the template config file present in /var/lib/glusterd/geo-replication/gsyncd_template.conf BUG: 1386123 Change-Id: I2cbba3103b6801c16ff57f778a90b9a0bb2467cf Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/15669 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kotresh HR <khiremat@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
*	compound fops: Fix file corruption issue	Krutika Dhananjay	2016-10-24	8	-22/+46
	1. Address of a local variable @args is copied into state->req in server3_3_compound (). But even after the function has gone out of scope, in server_compound_resume () this pointer is accessed and dereferenced. This patch fixes that. 2. Compound fops, by virtue of NOT having a vector sizer (like the one writev has), end up having both the header and the data (in case one of the member fops is WRITEV) in the same hdr_iobuf. This buffer was not being preserved through the lifetime of the compound fop, causing it to be overwritten by a parallel write fop, even when the writev associated with the currently executing compound fop is yet to hit the disk, thereby corrupting the file's data. This is fixed by associating the hdr_iobuf with the iobref so its memory remains valid through the lifetime of the fop. 3. Also fixed a use-after-free bug in protocol/client in compound fops cbk, missed by Linux but caught by NetBSD. Finally, big thanks to Pranith Kumar K and Raghavendra Gowdappa for their help in debugging this file corruption issue. Change-Id: I6d5c04f400ecb687c9403a17a12683a96c2bf122 BUG: 1378778 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15654 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	eventsapi: Fix sending event during volume set help	Aravinda VK	2016-10-24	1	-1/+1
	An event was sent when the `gluster volume set help` command was run, with the volume name as "help" and an empty options list. With this patch, the event is sent only when volume set is run on a real volume. BUG: 1387207 Change-Id: Ia8785d6108cb86f7d89ecf9ea552df0334b41398 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/15685 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	geo-rep: Logging improvements	Aravinda VK	2016-10-24	4	-29/+55
	- Redundant log messages removed. - Worker and connected slave node details added in "starting worker" log - Added log for Monitor state change - Added log for Worker status change (Initializing/Active/Passive/Faulty) - Added log for Crawl status change - Added log for config set and reset - Added log for checkpoint set, reset and completion BUG: 1359612 Change-Id: Icc7173ff3c93de4b862bdb1a61760db7eaf14271 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/15684 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kotresh HR <khiremat@redhat.com>
*	rpc/socket.c : Modify socket_poller code in case of ENODATA error code.	Mohit Agrawal	2016-10-23	1	-1/+1
	Problem: Continuous warning messages (ENODATA) are logged in socket_rwv while SSL is enabled. Solution: To avoid the warning messages, update one condition in the socket_poller loop code before breaking from the loop in case of an error returned by the poll functions. BUG: 1386450 Change-Id: I19b3a92d4c3ba380738379f5679c1c354f0ab9b1 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Reviewed-on: http://review.gluster.org/15677 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	rpc/socket: Close pipe on disconnection	Kaushal M	2016-10-23	1	-1/+8
	Encrypted connections create a pipe, which isn't closed when the connection disconnects. This leaks fds, and gluster eventually ends up in a situation with fd starvation which leads to operation failures. Change-Id: I144e1f767cec8c6fc1aa46b00cd234129d2a4adc BUG: 1336371 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/14356 Tested-by: MOHIT AGRAWAL <moagrawa@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
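The leak and its fix can be sketched with a toy connection object that owns a pipe. This Python sketch is not the socket transport code: `EncryptedConn` is a hypothetical class, and it only shows the idea of releasing both pipe ends when the connection goes away.

```python
import os


class EncryptedConn:
    """Toy connection that owns a pipe, as the encrypted transport is said to."""

    def __init__(self):
        self.pipe = os.pipe()  # (read_fd, write_fd)

    def disconnect(self):
        # Close both ends exactly once so repeated disconnects stay safe.
        if self.pipe is not None:
            for fd in self.pipe:
                os.close(fd)
            self.pipe = None
```

Without the close in `disconnect`, every connect/disconnect cycle would leave two descriptors open, which is the kind of slow fd exhaustion the commit describes.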
*	performance/io-threads: Exit all threads on PARENT_DOWN	Pranith Kumar K	2016-10-23	2	-17/+40
	Problem: When glfs_fini() is called on a volume where client.io-threads is enabled, fini() will free up iothread xl's private structure but there would be some threads that are sleeping which would wake up after the timedwait completes, leading to accessing already free'd memory. Fix: As part of parent-down, exit all sleeping threads. BUG: 1381830 Change-Id: I0bb8d90241112c355fb22ee3fbfd7307f475b339 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15620 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
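The "wake and exit all sleeping threads before tearing down shared state" pattern can be sketched with a condition variable. This is an illustrative Python model, not the io-threads xlator: `WorkerPool`, `parent_down` and the 60 second idle timeout are assumptions, and the workers here do no real work.

```python
import threading


class WorkerPool:
    """Toy pool whose idle workers sleep on a condition variable."""

    def __init__(self, count, idle_timeout=60.0):
        self._cond = threading.Condition()
        self._down = False
        self._idle_timeout = idle_timeout
        self.threads = [threading.Thread(target=self._worker) for _ in range(count)]
        for t in self.threads:
            t.start()

    def _worker(self):
        with self._cond:
            while not self._down:
                # Sleep until woken or until the idle timeout expires.
                self._cond.wait(timeout=self._idle_timeout)
        # Leaving the loop means the thread never touches pool state again.

    def parent_down(self):
        with self._cond:
            self._down = True
            self._cond.notify_all()  # wake every sleeper immediately
        for t in self.threads:
            t.join()  # only after this is it safe to free shared state
```

Joining the workers before releasing anything they might touch is what closes the window in which a late wake-up could access freed memory.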
*	bitrot/cli: Add ondemand scrub event	Kotresh HR	2016-10-22	2	-0/+4
	The following bitrot event is added: BITROT_SCRUB_ONDEMAND { "nodeid": NODEID, "ts": TIMESTAMP, "event": EVENT_TYPE, "message": { "name": VOLUME_NAME, } } Change-Id: I85e668e254e6f29c447ddb4ad2ce2fc04f98bf3c BUG: 1387864 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/15700 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	glusterd: conditionally pass uuid for EVENT_PEER_CONNECT	Atin Mukherjee	2016-10-21	1	-3/+10
	When a new node is probed, on the first RPC_CLNT_CONNECT peerctx->uuid is set to NULL as the same is yet to be populated. However, the subsequent (dis)connect events would be carrying the valid UUIDs. The solution is not to generate EVENT_PEER_CONNECT on a peer probe trigger, as the CLI is already going to take care of generating the same. Change-Id: I2f0de054ca09f12013a6afdd8ee158c0307796b9 BUG: 1386516 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/15678 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Samikshan Bairagya <samikshan@gmail.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
*	eventsapi/packaging: Fix wrong usage of %post	Aravinda VK	2016-10-20	2	-7/+20
	%postun was used for events package instead of %post. eventsd service should be restarted only after install/upgrade and not during uninstallation (%postun) BUG: 1386141 Change-Id: Iae3eab06d02c5f4134b3de09f040123bed053bb8 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/15670 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
*	events: add TIER_START and TIER_START_FORCE events	Milind Changire	2016-10-20	2	-11/+31
	Add TIER_START and TIER_START_FORCE events. Conditionally generate DETACH events as per user confirmation. Change-Id: I205dc14884d707087edce42e8cf4208bd89d31dc BUG: 1386247 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: http://review.gluster.org/15675 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
*	tests: gfapi/bug1291259.c should only call glfs_free() on success	Niels de Vos	2016-10-20	2	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In case glfs_h_poll_upcall() does not return success, the 'struct glfs_upcall' would not have been allocated. A retry will be done and glfs_free() is called on the unallocated structure. In case the pointer does not point to NULL, glfs_free() will try to free up some random area. Change-Id: I38788d3bf22bbac3924f25edf45cd4a2637fa777 BUG: 1371540 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/15603 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
*	cli, glusterd: Address issues in get-state cli output	Samikshan Bairagya	2016-10-20	4	-79/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes the following data points: 1. Volume type 2. Peer state 3. List of other hostnames for a peer 4. Data unit information for rebalance The following data points are removed: 1. Mount options and filesystem types for bricks 2. global-option-version from list of global options The following data points are added: 1. Replica Count 2. Tier type for bricks belonging to hot/cold tier Change-Id: I5011250e863fdc4929b203cdb345d79b2f16c6a5 BUG: 1385839 Signed-off-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-on: http://review.gluster.org/15662 Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	md-cache, afr: Reduce the window of stale read	Poornima G	2016-10-20	7	-116/+370
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Consider a replica setup where one mount writes data to a file and another mount reads the file. In afr, read operations are not transaction based: a brick (the read subvolume) is chosen as part of lookup or other operations, and reads are always wound only to that read subvolume, even if a write from a different client failed on this brick. This stale read continues until there is a lookup or any write operation from the mount point. Currently this is not a major issue, as a lookup is issued before every read and switches the read subvolume to a correct one. But with the plan of increasing the md-cache timeout to 600s, the stale read problem will be more pronounced, i.e. a stale read can continue for 600s (or more if cascaded with readdirp), as there will be no lookups. Solution: Afr doesn't have any built-in solution for stale reads (without affecting performance). The solution that came up was to use upcall. When a file on any brick is marked bad for the first time, upcall sends a notification to all the clients that had recently accessed the file. The solution has 2 parts: - Identifying when a file is marked bad, on any of the bricks, for the first time - Client side actions on receiving the notifications Identifying when a file is marked bad on any of the bricks for the first time: ----------------------------------------------------------------------------- The idea is to track xattrop in upcall. xattrop currently comes with 2 afr xattrs - afr dirty bit and afr pending xattrs. The dirty xattr is set to 1 before every write, and is unset if the write succeeds. In certain scenarios, the dirty xattr can be 0 and the file could still be a bad copy. Hence do not track the dirty xattr. 
Pending xattr is set on the good copy, indicating the other bricks that have a bad copy. It is still not as simple as notifying when any of the pending xattrs change. That could lead to a flood of notifications in case the other brick is completely down or consistently failing. Hence it is important to notify only once, the first time a good copy is marked bad. Client side actions on receiving the pending xattr change notification: -------------------------------------------------------------------- md-cache will invalidate the cache of that file, so that a further lookup is passed down to afr and hence updates the read subvolume. Invalidating only in md-cache is not enough; consider the following order of operations: - pending xattr invalidation - invalidate md-cache - readdirp on the bad read subvolume - fill md-cache - lookup (served from md-cache) - read - wound to the old read subvol. Hence, along with invalidating md-cache, it is very important to reset the read subvolume for that file in afr. Design Credit: Anuradha Talur, Ravishankar N 1. xattrop doesn't carry info saying post op/pre op. 2. Pre xattrop will have 0 value for all pending xattrs, the cbk of pre xattrop carries the on-disk xattr value. Non zero indicates healing is required. 3. Post xattrop will have a non zero value for any of the pending xattrs if the fop failed on any of the bricks. Change-Id: I469cbc111714c433984fe1c922be2ef113c25804 BUG: 1211863 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15398 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
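The notify-once rule described in this message can be sketched as follows. This is a toy model in Python, not the upcall implementation: the class `PendingXattrTracker`, its method names and the shape of the `pending` mapping are all hypothetical, and healing (which would presumably clear the bad mark) is not modelled.

```python
class PendingXattrTracker:
    """Notify clients only the first time a file is marked bad."""

    def __init__(self, notify):
        self.notify = notify      # callback taking a gfid (assumed interface)
        self.marked_bad = set()   # gfids already reported to clients

    def on_post_xattrop(self, gfid, pending):
        # 'pending' maps brick name -> pending count (hypothetical shape).
        # The dirty xattr is deliberately not looked at, per the message.
        if not any(count != 0 for count in pending.values()):
            return False          # fop succeeded on all bricks
        if gfid in self.marked_bad:
            return False          # already notified; avoid a flood
        self.marked_bad.add(gfid)
        self.notify(gfid)
        return True


notified = []
tracker = PendingXattrTracker(notified.append)
tracker.on_post_xattrop("gfid-1", {"brick-1": 0})   # write succeeded everywhere
tracker.on_post_xattrop("gfid-1", {"brick-1": 1})   # first failure: notify
tracker.on_post_xattrop("gfid-1", {"brick-1": 2})   # brick still failing: silent
print(notified)
```

Keeping the set of already-reported files is what bounds the notification count to one per file, however long the other brick stays down.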
*	eventsapi: Auto convert Boolean and Int attributes	Aravinda VK	2016-10-19	3	-1/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before publishing in JSON format, automatically convert the attribute to "bool" or "int" if configured. For example, instead of sending force="1", convert to bool and send as force=True { "event": "VOLUME_START", "name" : "gv1", "force": "1" } Convert to, { "event": "VOLUME_START", "name" : "gv1", "force": true } BUG: 1379328 Change-Id: Iabc51fd61abc267a7c8dcf0aeac6b3c722d89649 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/15574 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
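A minimal sketch of the conversion described above, in Python. The names `BOOL_ATTRS`, `INT_ATTRS` and `convert_attributes` are hypothetical and not taken from the patch, and the set of attributes configured for conversion (including `count`) is assumed for illustration.

```python
import json

# Hypothetical configuration: attribute names to convert before publishing.
BOOL_ATTRS = {"force"}
INT_ATTRS = {"count"}


def convert_attributes(message):
    """Return a copy of 'message' with configured attributes converted."""
    converted = dict(message)
    for key, value in message.items():
        if key in BOOL_ATTRS:
            converted[key] = str(value).strip().lower() in ("1", "true", "yes")
        elif key in INT_ATTRS:
            try:
                converted[key] = int(value)
            except (TypeError, ValueError):
                pass  # leave the value unchanged if it is not an integer
    return converted


event = {"event": "VOLUME_START", "name": "gv1", "force": "1"}
print(json.dumps(convert_attributes(event)))
```

Serialising with `json.dumps` is what turns the Python `True` into the JSON literal `true` shown in the commit message.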
*	cluster/tier: handle fast demotions	Milind Changire	2016-10-19	13	-69/+213
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Demote files on priority if hi-watermark has been breached and continue to demote until the watermark drops below hi-watermark. Monitor watermark more frequently. Trigger demotion as soon as hi-watermark is breached. Add cluster.tier-emergency-demote-query-limit option to limit number of files returned from the database query for every iteration of tier_migrate_using_query_file(). If watermark hasn't dropped below hi-watermark during the first iteration, the next iteration will be triggered approximately 1 second after tier_demote() returns to the main tiering loop. Update changetimerecorder xlator to handle query for emergency demote mode. Add tier-ctr-interface.h: Move tier and ctr interface specific macros and struct definition from libglusterfs/src/gfdb/gfdb_data_store.h to new header libglusterfs/src/tier-ctr-interface.h Change-Id: If56af78c6c81d37529b9b6e65ae606ba5c99a811 BUG: 1366648 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: http://review.gluster.org/15158 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
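The emergency demotion loop described above can be sketched roughly as below. This is a simplified Python model, not the tier xlator code: `emergency_demote` and its parameters are hypothetical, `query_limit` stands in for cluster.tier-emergency-demote-query-limit, and the roughly one second pause between iterations is omitted.

```python
def emergency_demote(hot_files, capacity, hi_watermark_pct, query_limit):
    """Demote batches of files until usage is no longer above hi-watermark.

    hot_files: list of (name, size) tuples on the hot tier (assumed shape).
    Returns the names demoted and the number of iterations needed.
    """
    used = sum(size for _, size in hot_files)
    remaining = list(hot_files)
    demoted = []
    iterations = 0
    while remaining and used * 100 / capacity > hi_watermark_pct:
        # Each iteration handles at most 'query_limit' files from the query.
        batch, remaining = remaining[:query_limit], remaining[query_limit:]
        for name, size in batch:
            demoted.append(name)
            used -= size
        iterations += 1
    return demoted, iterations


files = [("f%d" % i, 10) for i in range(10)]
print(emergency_demote(files, capacity=100, hi_watermark_pct=75, query_limit=2))
```

A small query limit keeps each iteration short, at the cost of more iterations before usage falls below the watermark.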
*	common-ha: Use UpdateExports dbus msg for refresh-config	Soumya Koduri	2016-10-19	2	-28/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In nfs-ganesha 2.4, support for a new dbus message type, "UpdateExports", has been added. With this support, the exports can be re-configured dynamically without the need to re-export the entries. Change-Id: Iee7330d33e91db1126974a2ff46becb3764f2e5e BUG: 1382258 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/15617 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	build: Avoid printing python version to stdout discovered during configure	Anoop C S	2016-10-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Change-Id: I1e0d0dd462cd8fa6d3ce40850099e8a019d754de BUG: 1198849 Signed-off-by: Anoop C S <anoopcs@redhat.com> Reviewed-on: http://review.gluster.org/15666 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Niels de Vos <ndevos@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>
*	gfapi: add glfs_free() to glfs.h	Niels de Vos	2016-10-18	3	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 4721188a154acd9a0a4c096d8d73e97f3bf1b2a9 introduces glfs_free() but missed adding the function to the header. The symbol is correctly available in the library though. Testcases do not seem to fail when a function is missing for the headers... The glusterfs-3.7.16 packages have been released with the missing declaration in the header and symbol-maps. Still, the function is available for applications: $ objdump -T usr/lib64/libgfapi.so.0 \| grep -w glfs_free 0000000000006aa0 g DF .text 0000000000000035 GFAPI_3.7.16 glfs_free Change-Id: Ia707ee957f090dbfca028192fcc81a83dfdf4ae0 BUG: 1344714 Reported-by: Jiffin Tony Thottan <jthottan@redhat.com> Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/15653 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
*	glusterd: set the brickinfo->port before spawning the bricks	Atin Mukherjee	2016-10-18	1	-4/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As of now, glusterd sets the brickinfo's port only after spawning the brick process. As a side effect, this opens up a window in which the brick can initiate the pmap_signin event to glusterd, and glusterd fails to update the signed_in flag because brickinfo->port is still 0 and the comparison of port and brickinfo->port fails. As a solution, set brickinfo->port right after pmap_registry_alloc and reset it to 0 if the brick spawn fails. The same logic applies to the rdma port too. Change-Id: I00a13d4c6d6809ebd19a972aa13e71ee5eac7e35 BUG: 1385575 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/15655 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	features/read-only: reten_mode is invalid to be free by mem_put()	Ryan Ding	2016-10-18	1	-11/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	priv->reten_mode is initialised from the option 'retention-mode' and references memory in this->options. Freeing priv->reten_mode with mem_put() in fini() therefore causes a problem. There is no need to call mem_put(), so it is simply removed. Change-Id: Iee6f9d1d54df38cba8c9b9100e2824f4f2b18ab4 BUG: 1369523 Signed-off-by: Ryan Ding <ryan.ding@open-fs.com> Reviewed-on: http://review.gluster.org/15296 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
*	crypt: changes needed for openssl-1.1 (coming in Fedora 26)	Kaleb S. KEITHLEY	2016-10-18	1	-4/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fedora is poised to update openssl-1.1.0b in/for Fedora 26 in the next day or so. But already Fedora koji scratch builds are built against openssl-1.1.0b because of the way scratch builds work. N.B. that the latest Fedora rawhide (11 October) still ships with openssl-1.0.2j. HMAC_CTX is now an opaque type and instances of it must be created and released by calling HMAC_CTX_new() and HMAC_CTX_free(). Change-Id: I3a09751d7b0d9fc25fe18aac6527e5431e9ab19a BUG: 1384142 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/15629 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	geo-rep/eventsapi: Additional Events	Aravinda VK	2016-10-18	6	-8/+81
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added following events EVENT_GEOREP_ACTIVE { "nodeid": NODEID, "ts": TIMESTAMP, "event": "GEOREP_ACTIVE", "message": { "master_volume": MASTER_VOLUME_NAME, "slave_host": SLAVE_HOST, "slave_volume": SLAVE_VOLUME, "brick_path": BRICK_PATH } } EVENT_GEOREP_PASSIVE { "nodeid": NODEID, "ts": TIMESTAMP, "event": "GEOREP_PASSIVE", "message": { "master_volume": MASTER_VOLUME_NAME, "slave_host": SLAVE_HOST, "slave_volume": SLAVE_VOLUME, "brick_path": BRICK_PATH } } EVENT_GEOREP_CHECKPOINT_COMPLETED { "nodeid": NODEID, "ts": TIMESTAMP, "event": "GEOREP_ACTIVE", "message": { "master_volume": MASTER_VOLUME_NAME, "slave_host": SLAVE_HOST, "slave_volume": SLAVE_VOLUME, "brick_path": BRICK_PATH, "checkpoint_time": CHECKPOINT_TIME, "checkpoint_completion_time": CHECKPOINT_COMPLETION_TIME } } BUG: 1379330 Change-Id: I90716175868c59dd65c8d202e73e0ede90347b6a Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/15630 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kotresh HR <khiremat@redhat.com> Tested-by: Kotresh HR <khiremat@redhat.com>
*	trivial: correct some spelling mistakes in comments and logs	Niels de Vos	2016-10-18	7	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	BUG: 1385593 Change-Id: Icfae9e557a284182c6c22e9606fdd641528906f0 Reported-by: Patrick Matthäi <pmatthaei@debian.org> Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/15656 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
*	storage/posix: Fix race in posix_pstat	Pranith Kumar K	2016-10-17	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: When one thread is in the process of creating a file/directory and the other thread is doing readdirp, there is a chance that posix_pstat and the creation fops race in the following manner, which leads to wrong stat values being read by parent xlators like posix-acl. Interleaving, in time order: [creation fops] 1) file is created with uid/gid 0/0; [posix_pstat() as part of readdirp] 1) does stat of the path that was just created; [creation fops] 2) does chown to set the correct uid/gid, 3) sets the acl/user/internal xattrs, 4) sets the gfid on the entry and completes the creation of the file/dir; [posix_pstat() as part of readdirp] 2) fills the gfid in the iatt. If the unwind of readdirp hits the server xlator before the creation fop, then posix-acl remembers the uid/gid of the file as root/root and fails fops like open etc. on it. Fix: Reverse the order of filling gfid and filling lstat() values in posix_pstat() so that if there is a gfid in the iatt buffer, uid/gid are valid. Change-Id: I46caa7f6da7abfa40a0b1d70e35b88de9c64959c Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15564 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
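Why the read order matters can be shown with a small deterministic model. This is an illustrative Python sketch, not the posix xlator code: the dictionary standing in for the on-disk entry, the uid value 1000 and the function names are all assumptions, and the creation fop is forced to finish exactly between the reader's two reads.

```python
def new_entry():
    # Step 1 of creation: the entry exists with uid 0 and no gfid yet.
    return {"uid": 0, "gfid": None}


def complete_creation(entry):
    # Steps 2-4 of creation: chown first, gfid is set last.
    entry["uid"] = 1000
    entry["gfid"] = "gfid-1"


def pstat(entry, gfid_first):
    """Read uid and gfid, with creation finishing between the two reads."""
    if gfid_first:                   # order after the fix
        gfid = entry["gfid"]
        complete_creation(entry)
        uid = entry["uid"]
    else:                            # order before the fix
        uid = entry["uid"]
        complete_creation(entry)
        gfid = entry["gfid"]
    return {"gfid": gfid, "uid": uid}


print(pstat(new_entry(), gfid_first=False))  # gfid present but uid is stale
print(pstat(new_entry(), gfid_first=True))   # no gfid, so nothing is trusted
```

Because the creation fop sets the gfid last, a reader that sees a gfid before it reads the stat values can rely on the ownership being final.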
*	performance/write-behind: remove the request from liability queue in	Raghavendra G	2016-10-16	1	-2/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	wb_fulfill_request Before this patch, a request is removed from liability queue only when ref count of request hits 0. Though, wb_fulfill_request does an unref, it need not be the last unref and hence the request may survive in liability queue till the last unref. Let, T1: the time at which wb_fulfill_request is invoked T2: the time at which last unref is done on request Let's consider a case of T2 > T1. In the time window between T1 and T2, any other request (waiter) conflicting with request in liability queue (blocker - basically a write which has been lied) is blocked from winding. If T2 happens to be when wb_do_unwinds is invoked, no further processing of request list happens and "waiter" would get blocked forever. An example imaginary sequence of events is given below: 1. A write request w1 is picked up for unwinding in __wb_pick_unwinds (but unwind is not done _yet_ and hence reference remains). However, w1 is moved to liability queue. Let's call this invocation of wb_process_queue by wb_writev as PQ1. 2. A flush (f1) request hits write behind. Since the liability queue of inode is not empty, f1 is not picked for unwinding. Let's call the invocation of wb_process_queue by wb_flush as PQ2. 3. PQ2 continues and picks w1 for fulfilling and invokes wb_fulfill. As part of successful wb_fulfill_cbk, wb_fulfill_request (w1) is invoked. But, w1 is not freed (and hence not removed from liability queue) as w1 is not unwound _yet_ and a ref remains (PQ1 has not invoked wb_do_unwinds _yet_). 4. wb_fulfill_cbk (triggered by PQ2) invokes a wb_process_queue (let's say PQ3). f1 is not resumed in PQ3 as w1 is still in liability queue. At this time, PQ2 and PQ3 are complete. 5. PQ1 continues, unwinds w1 and does last unref on w1 and w1 is freed (and removed from liability queue). 
Since PQ1 didn't invoke wb_fulfill on any other write requests, there won't be any future codepaths that would invoke wb_process_queue and f1 is stuck forever. With this fix, w1 is removed from liability queue in step 3 above and PQ3 resumes f1 in step 4 (as there are no requests conflicting with f1 in liability queue during execution of PQ3). Signed-off-by: Raghavendra G <rgowdapp@redhat.com> BUG: 1379655 Change-Id: Idacda1fcd520ac27f30224f8dfe8360dba6ac6cb Reviewed-on: http://review.gluster.org/15579 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>
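The five-step sequence above can be replayed with a toy model. This Python sketch is not the write-behind code: `ToyInode`, `Request` and `simulate` are hypothetical names, reference counting is reduced to a plain integer, and the only question asked is whether the flush f1 could be resumed during PQ3.

```python
class Request:
    def __init__(self, name, refs):
        self.name = name
        self.refs = refs


class ToyInode:
    def __init__(self, remove_on_fulfill):
        self.remove_on_fulfill = remove_on_fulfill  # True models the fix
        self.liability = []

    def unref(self, req):
        req.refs -= 1
        # Old behaviour: leave the liability queue only on the last unref.
        if req.refs == 0 and req in self.liability:
            self.liability.remove(req)

    def fulfill_request(self, req):
        if self.remove_on_fulfill and req in self.liability:
            self.liability.remove(req)
        self.unref(req)

    def flush_can_resume(self):
        return not self.liability


def simulate(remove_on_fulfill):
    inode = ToyInode(remove_on_fulfill)
    w1 = Request("w1", refs=2)           # one ref is held until PQ1 unwinds
    inode.liability.append(w1)           # step 1: w1 moved to liability queue
    inode.fulfill_request(w1)            # step 3: fulfilled via PQ2
    resumed = inode.flush_can_resume()   # step 4: PQ3 checks for conflicts
    inode.unref(w1)                      # step 5: PQ1 unwinds, last unref
    return resumed


print(simulate(False), simulate(True))
```

In the unfixed run the queue only empties in step 5, after the last opportunity to resume f1 has already passed.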
*	glusterd: enable default configurations post upgrade to >= 3.9.0 versions	Atin Mukherjee	2016-10-16	2	-10/+80
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With 3.8.0 onwards volume options like nfs.disable, transport.address-family have some default configuration value. If a volume was created pre upgrade to 3.8.0 or higher the default options are not set post upgrade. This patch takes care of putting the default values in the op-version bump up workflow. However these changes will only reflect from 3.9.0 onwards Change-Id: I9a8d848cd08d87ddcb80dbeac27eaae097d9cbeb BUG: 1379223 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/15568 Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: Kaushal M <kaushal@redhat.com>
*	cluster/afr: Prevent dict_set() on NULL dict	Pranith Kumar K	2016-10-15	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In afr lookup when NULL dict is received in lookup, afr is supposed to set all the xattrs it requires in a new dict it creates, but for 'link-count' it is trying to set to the dict that is passed in lookup which can be NULL sometimes. This is leading to error logs. Fixed the same in this patch. BUG: 1385104 Change-Id: I679af89cfc410cbc35557ae0691763a05eb5ed0e Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15646 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com>
*	afr: Take full locks in arbiter only for data transactions	Ravishankar N	2016-10-14	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Sharding exposed a bug in the arbiter configuration where `dd` throughput was extremely slow. The shard xlator was sending an fxattrop to update the file size immediately after a writev. Arbiter was incorrectly overriding the LLONGMAX-1 start offset (for metadata domain locks) for this fxattrop, causing the inodelk to be taken on the data domain. And since the preceding writev hadn't released the lock (afr does a 'lazy' unlock if the write succeeds on all bricks), this degraded to a blocking lock, causing extra lock/unlock calls and delays. Fix: Modify flock.l_len and flock.l_start to take full locks only for data transactions. Change-Id: I906895da2f2d16813607e6c906cb4defb21d7c3b BUG: 1384906 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reported-by: Max Raba <max.raba@comsysto.com> Reviewed-on: http://review.gluster.org/15641 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	gfapi: correct the gfapi.map for glfs_ipc@GFAPI_4_0_0	Niels de Vos	2016-10-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 85e959052148ec481823d55c8b91cdee36da2b43 introduced an inconsistency in gfapi.map. We need to figure out how to handle the glfs_ipc() function at one point... Change-Id: If53ad904318d5a60c14bd8b80685f7a852bf25e5 BUG: 1370931 Reported-by: Anoop C S <anoopcs@redhat.com> Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/15633 Reviewed-by: Anoop C S <anoopcs@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
*	rpc/socket.c : Modify gf_log message in socket_poller code in case of error	Mohit Agrawal	2016-10-12	1	-3/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: With SSL, if a client (mount point) is still trying to write data to the socket after the volume has been stopped, an EIO error is raised on that socket. Because this log message is emitted at every attempt, it floods the log file. Solution: To reduce the frequency of the logged message, use GF_LOG_OCCASIONALLY instead of gf_log. BUG: 1381115 Change-Id: I66151d153c2cbfb017b3ebc4c52162278c0f537c Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Reviewed-on: http://review.gluster.org/15605 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
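The general idea of logging only occasionally can be sketched as a counter that lets every Nth message through. This Python sketch illustrates the technique only; it is not the GF_LOG_OCCASIONALLY macro, and the class name, the `sink` callback and the interval of 10 are assumptions.

```python
class OccasionalLogger:
    """Emit the first message and then every 'interval'-th one."""

    def __init__(self, interval, sink):
        self.interval = interval
        self.sink = sink    # callable that receives emitted messages
        self.count = 0

    def log(self, message):
        if self.count % self.interval == 0:
            self.sink(message)
        self.count += 1


lines = []
logger = OccasionalLogger(interval=10, sink=lines.append)
for _ in range(25):
    logger.log("write on socket failed: EIO")   # hypothetical message text
print(len(lines))
```

The first occurrence is still logged immediately, so the error remains visible while repeated attempts no longer flood the file.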