glusterfs.git -

	Commit message (Collapse)	Author	Age	Files	Lines
*	marker: Fix inode value in loc, in setxattr fop	Poornima G	2016-11-17	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On recieving a rename fop, marker_rename() stores the, oldloc and newloc in its 'local' struct, once the rename is done, the xtime marker(last updated time) is set on the file, but sending a setxattr fop. When upcall receives the setxattr fop, the loc->inode is NULL and it crashes. The loc->inode can be NULL only in one valid case, i.e. in rename case where the inode of new loc can be NULL. Hence, marker should have filled the inode of the new_loc before issuing a setxattr. Change-Id: Id638f678c3daaf4a5c29b970b58929d377ae8977 BUG: 1394131 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15826 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kotresh HR <khiremat@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
*	cli/rebalance: remove brick status is incorrect	N Balachandran	2016-11-17	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a remove brick operation is preceded by a fix-layout, running remove-brick status on a node which does not contain any of the bricks that were removed displays fix-layout status. The defrag_cmd variable was not updated in glusterd for the nodes not hosting removed bricks causing the status parsing to go wrong. This is now updated. Also made minor modifications to the spacing in the fix-layout status output. Change-Id: Ib735ce26be7434cd71b76e4c33d9b0648d0530db BUG: 1389697 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/15749 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	system/posix-acl: Log reason for EACCES	Pranith Kumar K	2016-11-17	3	-15/+160
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It is becoming increasingly difficult to debug the reason why posix-acl decides to fail a fop with EACCES. This patch prints a big log everytime such a condition occurs giving out the details that may help in finding why the fop is errored out. Change-Id: I2505baaafb5d77ef6c187554ff027df9b20468db BUG: 1394548 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15837 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
*	protocol/client: Fix iobref and iobuf leaks in COMPOUND fop	Krutika Dhananjay	2016-11-16	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \|	Change-Id: I408879aa2bbd8ea176fbc0d0eba5567e5df1b2b3 BUG: 1395687 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15860 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
*	protocol/server: Print error-xlator name	Pranith Kumar K	2016-11-16	1	-182/+255
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: At the moment from which xlator the errors are stemming from is a mystery. Fix: With this patch we can find on the server side which xlator first gave the errno received by server xlator. I am not yet sure how to get this done for client side which has lot of copy_frame()s. May be another patch. Change-Id: Ie13307b965facf2a496123e81ce0bd6756f98ac9 BUG: 1394548 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15836 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	cluster/dht: Check for null inode	N Balachandran	2016-11-15	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Check for NULL inode before attempting to set dht inode ctx. Change-Id: I7693c18445f138221d8417df5e95b118cedb818a BUG: 1395261 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/15847 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	nfs: revalidate lookup converted to fresh lookup	Mohammed Rafi KC	2016-11-10	4	-10/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	when an inode ctx is missing for a linked inode the revalidate lookups are converted to fresh. This could result in sending ESTALE when the gfid are recreated We are not able to reproduce the issue with normal setup, most part of RCA was done with code reading. Possible scenario in which this bug can reproduce, Delete a file and recreate a new file with same name, at the same time from another client process try to list/or access the file. In this case the second client may throw an ESTALE error for such files Thanks to Soumya and Pranith for doing the complete RCA Change-Id: I73992a65844b09a169cefaaedc0dcfb129d66ea1 BUG: 1379720 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/15580 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	performance/open-behind: Avoid deadlock in statedump	Pranith Kumar K	2016-11-09	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: open-behind is taking fd->lock then inode->lock where as statedump is taking inode->lock then fd->lock, so it is leading to deadlock In open-behind, following code exists: void ob_fd_free (ob_fd_t ob_fd) { loc_wipe (&ob_fd->loc); <<--- this takes (inode->lock) ....... } int ob_wake_cbk (call_frame_t frame, void cookie, xlator_t this, int op_ret, int op_errno, fd_t fd_ret, dict_t xdata) { ....... LOCK (&fd->lock); <<---- fd->lock { ....... __fd_ctx_del (fd, this, NULL); ob_fd_free (ob_fd); <<<--------------- } UNLOCK (&fd->lock); ....... } ================================================================= In statedump this code exists: inode_dump (inode_t inode, char prefix) { ....... ret = TRY_LOCK(&inode->lock); <<---- inode->lock ....... fd_ctx_dump (fd, prefix); <<<----- ....... } fd_ctx_dump (fd_t fd, char prefix) { ....... LOCK (&fd->lock); <<<------------------ this takes fd-lock { ....... } Fix: Make sure open-behind doesn't call ob_fd_free() inside fd->lock BUG: 1393259 Change-Id: I4abdcfc5216270fa1e2b43f7b73445f49e6d6e6e Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15808 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Poornima G <pgurusid@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	cluster/afr: When failing fop due to lack of quorum, also log error string	Krutika Dhananjay	2016-11-09	1	-11/+12
\| \| \| \| \| \| \| \| \| \| \| \|	Change-Id: I39de2bcfc660f23a4229a885a0e0420ca949ffc9 BUG: 1392865 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15800 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	upcall: Fix a log level	Poornima G	2016-11-08	1	-2/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In upcall_cache_invalidation(), the gfid can be NULL in certain valid test cases(eg: entry for ".." in readdirp), hence change the log level from WARNING to DEBUG. Change-Id: Ic90167a0e2076694e9131913114460df7b939b30 BUG: 1392167 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15777 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	jbr: Sending rollback from failed fop to fdl	Avra Sengupta	2016-11-08	12	-34/+327
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In case of a failed fop, the failure is detected by the leader in the jbr-server in two places. First during a quorum check of +ve responses when it receives responses from all the followers. At this point if the fop hasn't been successfully journaled at a quorum of followers (as in there is no merit in trying the fop in the leader as the quorum will never be met), then we fail the fop. Also if this quorum is met, then the fop is tried on the leader, and after the leader completes the fop a quorum check similar to the previous one is done again, this time including the leaders outcome. If quorum is not met, then we fail the fop. In both these cases, when the fop fails we send a -ve ack to the client. With this patch, now we will also send a rollback through a GF_FOP_IPC to all the followers(and also to the leader in the second case of failure). This rollback will contain the index and term number of the fop which failed. This will be recorded in the respective journals of the bricks and will be used to rollback the fop on that brick later. A subsequent write, and it's respective rollback would look something like the following in the journal. The trusted.jbr.term and trusted.jbr.index present in the dict of both the logs, relate them, and the presence of "rollback-fop" in the dict of IPC indicates that it is a rollback fop, and the value 13(stands for GF_FOP_WRITE) indicates what kind of rollback operation it is. === GF_FOP_WRITE fd = <gfid 77f12ea2-ca56-40e3-a46e-ba2308baa035> vector = <158 bytes> offset = 0 (0x0) flags = 32769 (0x8001) xdata = dict { trusted.jbr.term = 0 <2 bytes> trusted.jbr.index = 4 <2 bytes> } === GF_FOP_IPC xdata = dict { trusted.jbr.term = 0 <2 bytes> trusted.jbr.index = 4 <2 bytes> rollback-fop = 13 <3 bytes> } Change-Id: I70b6a143d20697153d58e2f719e34ecd1ed160a5 BUG: 1349385 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/14783 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
*	features/shard: Fill loc.pargfid too for named lookups on individual shards	Krutika Dhananjay	2016-11-08	2	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On a sharded volume when a brick is replaced while IO is going on, named lookup on individual shards as part of read/write was failing with ENOENT on the replaced brick, and as a result AFR initiated name heal in lookup callback. But since pargfid was empty (which is what this patch attempts to fix), the resolution of the shards by protocol/server used to fail and the following pattern of logs was seen: Brick-logs: [2016-11-08 07:41:49.387127] W [MSGID: 115009] [server-resolve.c:566:server_resolve] 0-rep-server: no resolution type for (null) (LOOKUP) [2016-11-08 07:41:49.387157] E [MSGID: 115050] [server-rpc-fops.c:156:server_lookup_cbk] 0-rep-server: 91833: LOOKUP(null) (00000000-0000-0000-0000-000000000000/16d47463-ece5-4b33-9c93-470be918c0f6.82) ==> (Invalid argument) [Invalid argument] Client-logs: [2016-11-08 07:41:27.497687] W [MSGID: 114031] [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-0: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument] [2016-11-08 07:41:27.497755] W [MSGID: 114031] [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-1: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument] [2016-11-08 07:41:27.498500] W [MSGID: 114031] [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-2: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument] [2016-11-08 07:41:27.499680] E [MSGID: 133010] Also, this patch makes AFR by itself choose a non-NULL pargfid even if its ancestors fail to initialize all pargfid placeholders. Change-Id: I5f85b303ede135baaf92e87ec8e09941f5ded6c1 BUG: 1392445 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15788 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
*	glusterd/quota: upgrade quota.conf file during an upgrade	Manikandan Selvaganesh	2016-11-07	4	-16/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem ======= When quota is enabled on 3.6, it will have quota conf version in quota.conf as v1.1. This node gets upgraded to 3.7 but it will still have quota conf version as v1.1 until a quota enable/disable/set limit is initiated. When this is not initiated and when this node tries to peer probe a node which is a fresh install of 3.7 (which will have quota conf version as v1.2), then this will result in "Peer rejected" state. This patch fixes the issue. Solution ======== When an upgrade happens from 3.6 to 3.7, quota.conf file needs to be modified as well. With 3.6, in quota.conf the version will be v1.1 and it needs to be changed to v1.2 from 3.7. This is because in 3.7, inode quota feature is introduced. So when an op-version bumpup happens quota.conf needs to be upgraded with quota conf version v1.2 and all the 16 byte uuid needs to be changed to 17 bytes uuid as well. Previously, when the cluster version is upgraded to 3.7, the quota.conf got upgraded as well. But, the upgradation was done only when quota enable/disable/set limit is done. With this patch, the upgradation is done during a cluster op version bump up as well. Change-Id: Idb5ba29d3e1ea0e45c85d87c952c75da9e0f99f0 BUG: 1371539 Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com> Reviewed-on: http://review.gluster.org/15352 Tested-by: Atin Mukherjee <amukherj@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	posix-acl: check dictionary before using it	Rajesh Joseph	2016-11-04	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If extended attributes are not present in md-cache it returns NULL as xattr. posix acl xlator should check for NULL before using xattr. If normal and default ACLs are not set on file then md-cache will not contain system.posix_acl_access and system.posix_acl_default extended attributes in its cache. Therefore posix_acl_lookup_cbk should check xattr before using it, otherwise the logs will get filled with dictionary errors. Change-Id: Icebf73cf0b313bd3e82ca8cbda63786dd0fa47da BUG: 1391387 Signed-off-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-on: http://review.gluster.org/15769 Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	xlators/trash : Remove upper limit for trash max file size	Jiffin Tony Thottan	2016-11-03	2	-18/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently file which size exceeds more than 1GB never moved to trash directory. This is due to the hard coded check using GF_ALLOWED_MAX_FILE_SIZE. Change-Id: I2ed707bfe1c3114818896bb27a9856b9a164be92 BUG: 1386766 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/15689 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Anoop C S <anoopcs@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	snapshot: Fix the failure to recreate clones with same name	Avra Sengupta	2016-11-01	4	-17/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The brick path of snapshot clones contained the clonename, thereby failing to create newer clones with the same name after the original clone had been deleted. This fix creates the brick path with the clone's vol id instead of the clones name. Hence future clones with the same name will not have the namespace clash. Change-Id: I262712adc576122f051b5d1ce171d020efaefd1a BUG: 1387160 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/15683 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
*	performance/write-behind: fix flush stuck by former failed writes	Ryan Ding	2016-11-01	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	the issue is happened in this case: assume a file is opened with fd1 and fd2. 1. some WRITE opto fd1 got error, they were add back to 'todo' queue because of those error. 2. fd2 closed, a FLUSH op is send to write-behind. 3. FLUSH can not be unwind because it's not a legal waiter for those failed write(as func __wb_request_waiting_on() say). and those failed WRITE also can not be ended if fd1 is not closed. fd2 stuck in close syscall. to resolve this issue, we can change the way we determine 2 requests is 'conflict': flush/fsync is not conflict with those write that is not belonged to them. so __wb_pick_winds() can wind the FLUSH op. below is some information when the stuck issue happen: glusterdump logs: [xlator.performance.write-behind.wb_inode] path=/ltp-F9eG0ZSOME/rw-buffered-16436 inode=0x7fdbe8039b9c window_conf=1048576 window_current=249856 transit-size=0 dontsync=0 [.WRITE] request-ptr=0x7fdbe8020200 refcount=1 wound=no generation-number=4 req->op_ret=-1 req->op_errno=116 sync-attempts=3 sync-in-progress=no size=131072 offset=1220608 lied=-1 append=0 fulfilled=0 go=0 [.WRITE] request-ptr=0x7fdbe8068c30 refcount=1 wound=no generation-number=5 req->op_ret=-1 req->op_errno=116 sync-attempts=2 sync-in-progress=no size=118784 offset=1351680 lied=-1 append=0 fulfilled=0 go=0 [.FLUSH] request-ptr=0x7fdbe8021cd0 refcount=1 wound=no generation-number=6 req->op_ret=0 req->op_errno=0 sync-attempts=0 gdb detail about above 3 requests: (gdb) print ((wb_request_t )0x7fdbe8021cd0) $2 = {all = {next = 0x7fdbe803a608, prev = 0x7fdbe8068c30}, todo = {next = 0x7fdbe803a618, prev = 0x7fdbe8068c40}, lie = {next = 0x7fdbe8021cf0, prev = 0x7fdbe8021cf0}, winds = {next = 0x7fdbe8021d00, prev = 0x7fdbe8021d00}, unwinds = {next = 0x7fdbe8021d10, prev = 0x7fdbe8021d10}, wip = { next = 0x7fdbe8021d20, prev = 0x7fdbe8021d20}, stub = 0x7fdbe80224dc, write_size = 0, orig_size = 0, total_size = 0, op_ret = 0, op_errno = 0, refcount = 1, wb_inode = 0x7fdbe803a5f0, fop = GF_FOP_FLUSH, lk_owner = {len = 8, data = "W\322T\f\271\367y$", '\000' <repeats 1015 times>}, iobref = 0x0, gen = 6, fd = 0x7fdbe800f0dc, wind_count = 0, ordering = {size = 0, off = 0, append = 0, tempted = 0, lied = 0, fulfilled = 0, go = 0}} (gdb) print ((wb_request_t )0x7fdbe8020200) $3 = {all = {next = 0x7fdbe8068c30, prev = 0x7fdbe803a608}, todo = {next = 0x7fdbe8068c40, prev = 0x7fdbe803a618}, lie = {next = 0x7fdbe8068c50, prev = 0x7fdbe803a628}, winds = {next = 0x7fdbe8020230, prev = 0x7fdbe8020230}, unwinds = {next = 0x7fdbe8020240, prev = 0x7fdbe8020240}, wip = { next = 0x7fdbe8020250, prev = 0x7fdbe8020250}, stub = 0x7fdbe8062c3c, write_size = 131072, orig_size = 4096, total_size = 0, op_ret = -1, op_errno = 116, refcount = 1, wb_inode = 0x7fdbe803a5f0, fop = GF_FOP_WRITE, lk_owner = {len = 8, data = '\000' <repeats 1023 times>}, iobref = 0x7fdbe80311a0, gen = 4, fd = 0x7fdbe805c89c, wind_count = 3, ordering = {size = 131072, off = 1220608, append = 0, tempted = -1, lied = -1, fulfilled = 0, go = 0}} (gdb) print ((wb_request_t )0x7fdbe8068c30) $4 = {all = {next = 0x7fdbe8021cd0, prev = 0x7fdbe8020200}, todo = {next = 0x7fdbe8021ce0, prev = 0x7fdbe8020210}, lie = {next = 0x7fdbe803a628, prev = 0x7fdbe8020220}, winds = {next = 0x7fdbe8068c60, prev = 0x7fdbe8068c60}, unwinds = {next = 0x7fdbe8068c70, prev = 0x7fdbe8068c70}, wip = { next = 0x7fdbe8068c80, prev = 0x7fdbe8068c80}, stub = 0x7fdbe806746c, write_size = 118784, orig_size = 4096, total_size = 0, op_ret = -1, op_errno = 116, refcount = 1, wb_inode = 0x7fdbe803a5f0, fop = GF_FOP_WRITE, lk_owner = {len = 8, data = '\000' <repeats 1023 times>}, iobref = 0x7fdbe8052b10, gen = 5, fd = 0x7fdbe805c89c, wind_count = 2, ordering = {size = 118784, off = 1351680, append = 0, tempted = -1, lied = -1, fulfilled = 0, go = 0}} you can see they are all on 'todo' queue, and FLUSH op fd is not the same WRITE op fd. Change-Id: Id687f9cd3b9f281e1a97c83f1ce981ede272b8ab BUG: 1372211 Signed-off-by: Ryan Ding <ryan.ding@open-fs.com> Reviewed-on: http://review.gluster.org/15380 Tested-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>
*	glusterd/shared storage: Check for hook-script at staging	Avra Sengupta	2016-10-27	2	-6/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Check if S32gluster_enable_shared_storage.sh is present at /var/lib/glusterd/hooks/1/set/post/ at staging before proceeding with the command. Fail the command with the appropriate error message in case it is not present. Change-Id: I84e3912f1cdffb927f8a40d74d52be43ee69388b BUG: 1388348 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/15718 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	md-cache: Invalidate cache entry for open() with O_TRUNC	Soumya Koduri	2016-10-26	1	-0/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a file is opened with O_TRUNC flag set, its size gets set to '0'. This case needs to be handled in md-cache to avoid sending incorrect cached stat. Change-Id: I95d1f8a6634734898883ede010c3e7b0b7eb97d9 BUG: 1382266 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/15618 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> Tested-by: jiffin tony Thottan <jthottan@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	cluster/tier: break the monolith processing function	Milind Changire	2016-10-26	1	-309/+429
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Break tier_migrate_using_query_file() into a more manageable tier_migrate_link() and helpers. Change-Id: I5eb2d2cff9e7a2a2da567c3c4c2d53aab195f477 BUG: 1358296 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: http://review.gluster.org/14957 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
*	afr,ec: Heal device files with correct major, minor numbers	Pranith Kumar K	2016-10-26	2	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Thanks a lot to xiaoping.wu@nokia.com from Nokia for the bug and the fix. BUG: 1384297 Change-Id: Ie443237e85d34633b5dd30f85eaa2ac34e45754c Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15728 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	glusterd: use GF_BRICK_STOPPING as intermediate brickinfo->status state	Atin Mukherjee	2016-10-25	2	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On a volume stop trigger glusterd issues a brick-op to terminate the brick process during brick-op phase , however in the commit-op glusterd once again tries to kill the same process if it exists and then mark the brickinfo->status flag to GF_BRICK_STOPPED. In the former case, if brick is successfully killed there is a possibility that GlusterD will receive RPC_CLNT_DISCONNECT from the said brick process before even the commit op phase is executed and hence by that time brickinfo->status will still be set to GF_BRICK_STARTED. BRICK_DISCONNECT event should be only sent if a brick has been killed and not through a volume stop/remove brick trigger, however due to this trace, this event is also sent out on a volume stop. Fix is to introduce an intermediate state GF_BRICK_STOPPING which can be used to mark the brick status at brick op phase of volume stop/remove brick to avoid sending spurious BRICK_DISCONNECT events on a volume stop trigger. Change-Id: Ieed4450e1c988715e0f9958be44faa6b14be81e1 BUG: 1387652 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/15699 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaushal M <kaushal@redhat.com>
*	cluster/dht: Incorrect volname in rebalance events	N Balachandran	2016-10-25	1	-6/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The rebalance event code was using strtok to parse the volume name which is incorrect. Reworked the code to get the correct volume name using strstr. Change-Id: Ib5f3305a34e6bf1ecfef677d87c5aff96bdeb0e6 BUG: 1388010 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/15712 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	snapshot: Fix for memory leaks in snapshot code path	Avra Sengupta	2016-10-25	3	-12/+47
\| \| \| \| \| \| \| \| \| \| \|	Change-Id: Idc2cb16574d166e3c0ee1f7c3a485f1acb19fc8c BUG: 1386088 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/15668 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
*	protocol/client: reduce memory usage	N Balachandran	2016-10-24	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	readdirp calls use a lot of memory in case of a large number of files. The dict->extra_free is not used here so free buf immediately. Change-Id: I097f5dde2df471f5834264152711110a3bdb7e9a BUG: 1380249 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/15593 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	compound fops: Fix file corruption issue	Krutika Dhananjay	2016-10-24	3	-7/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	1. Address of a local variable @args is copied into state->req in server3_3_compound (). But even after the function has gone out of scope, in server_compound_resume () this pointer is accessed and dereferenced. This patch fixes that. 2. Compound fops, by virtue of NOT having a vector sizer (like the one writev has), ends up having both the header and the data (in case one of its member fops is WRITEV) in the same hdr_iobuf. This buffer was not being preserved through the lifetime of the compound fop, causing it to be overwritten by a parallel write fop, even when the writev associated with the currently executing compound fop is yet to hit the desk, thereby corrupting the file's data. This is fixed by associating the hdr_iobuf with the iobref so its memory remains valid through the lifetime of the fop. 3. Also fixed a use-after-free bug in protocol/client in compound fops cbk, missed by Linux but caught by NetBSD. Finally, big thanks to Pranith Kumar K and Raghavendra Gowdappa for their help in debugging this file corruption issue. Change-Id: I6d5c04f400ecb687c9403a17a12683a96c2bf122 BUG: 1378778 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15654 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	performance/io-threads: Exit all threads on PARENT_DOWN	Pranith Kumar K	2016-10-23	2	-17/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: When glfs_fini() is called on a volume where client.io-threads is enabled, fini() will free up iothread xl's private structure but there would be some threads that are sleeping which would wakeup after the timedwait completes leading to accessing already free'd memory. Fix: As part of parent-down, exit all sleeping threads. BUG: 1381830 Change-Id: I0bb8d90241112c355fb22ee3fbfd7307f475b339 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15620 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	glusterd: conditionally pass uuid for EVENT_PEER_CONNECT	Atin Mukherjee	2016-10-21	1	-3/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a new node is probed, on the first RPC_CLNT_CONNECT peerctx->uuid is set to NULL as the same is yet to be populated. However the subesquent (dis)connect events would be carrying the valid UUIDs. Solution is not to generate EVENT_PEER_CONNECT on a peer probe trigger as CLI is already going to take care of generating the same. Change-Id: I2f0de054ca09f12013a6afdd8ee158c0307796b9 BUG: 1386516 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/15678 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Samikshan Bairagya <samikshan@gmail.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
*	cli, glusterd: Address issues in get-state cli output	Samikshan Bairagya	2016-10-20	4	-79/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes the following data points: 1. Volume type 2. Peer state 3. List of other hostnames for a peer 4. Data unit information for rebalance The following data points are removed: 1. Mount options and filesystem types for bricks 2. global-option-version from list of global options The following data points are added: 1. Replica Count 2. Tier type for bricks belonging to hot/cold tier Change-Id: I5011250e863fdc4929b203cdb345d79b2f16c6a5 BUG: 1385839 Signed-off-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-on: http://review.gluster.org/15662 Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	md-cache, afr: Reduce the window of stale read	Poornima G	2016-10-20	5	-116/+323
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Consider a replica setup, where one mount writes data to a file and the other mount reads the file. In afr, read operations are not transaction based, a brick(read subvolume) is chosen as a part of lookup or other operations, read is always wound only to the read subvolume, even if there was write from a different client that failed on this brick. This stale read continues until there is a lookup or any write operation from the mount point. Currently, this is not a major issue, as a lookup is issued before every read and it will switch the read subvolume to a correct one. But with the plan of increasing md-cache timeout to 600s, the stale read problem will be more pronounced, i.e. stale read can continue for 600s(or more if cascaded with readdirp), as there will be no lookups. Solution: Afr doesn't have any built-in solution for stale read(without affecting the performance). The solution that came up, was to use upcall. When a file on any brick is marked bad for the first time, upcall sends a notification to all the clients that had recently accessed the file. The solution has 2 parts: - Identifying when a file is marked bad, on any of the bricks, for the first time - Client side actions on recieving the notifications Identifying when a file is marked bad on any of the bricks for the first time: ----------------------------------------------------------------------------- The idea is to track xattrop in upcall. xattrop currently comes with 2 afr xattrs - afr dirty bit and afr pending xattrs. Dirty xattr is set to 1 before every write, and is unset if write succeeds. In certain scenarios, dirty xattr can be 0 and still the file could be bad copy. Hence do not track dirty xattr. Pending xattr is set on the good copy, indicating the other bricks that have bad copy. It is still not as simple as, notifying when any of the pending xattrs change. It could lead to flood of notifcations, in case the other brick is completely down or consistantly failing. Hence it is important to notify only once, the first time a good copy is marked bad. Client side actions on recieving pending xattr change, notification: -------------------------------------------------------------------- md-cache will invalidate the cache of that file, so that further lookup is passed down to afr and hence update the read subvolume. Invalidating only in md-cache is not enough, consider the folling oder of opertaions: - pending xattr invalidation - invalidate md-cache - readdirp on the bad read subvolume - fill md-cache - lookup (served from md-cache) - read - wound to the old read subvol. Hence, along with invalidating md-cache, it is very important to reset the read subvolume for that file, in afr. Design Credit: Anuradha Talur, Ravishankar N 1. xattrop doesn't carry info saying post op/pre op. 2. Pre xattrop will have 0 value for all pending xattrs, the cbk of pre xattrop carries the on-disk xattr value. Non zero indicated healing is required. 3. Post xattrop will have non zero value for any of the pending xattrs, if the fop failed on any of the bricks. Change-Id: I469cbc111714c433984fe1c922be2ef113c25804 BUG: 1211863 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15398 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	cluster/tier: handle fast demotions	Milind Changire	2016-10-19	6	-20/+111
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Demote files on priority if hi-watermark has been breached and continue to demote until the watermark drops below hi-watermark. Monitor watermark more frequently. Trigger demotion as soon as hi-watermark is breached. Add cluster.tier-emergency-demote-query-limit option to limit number of files returned from the database query for every iteration of tier_migrate_using_query_file(). If watermark hasn't dropped below hi-watermark during the first iteration, the next iteration will be triggered approximately 1 second after tier_demote() returns to the main tiering loop. Update changetimerecorder xlator to handle query for emergency demote mode. Add tier-ctr-interface.h: Move tier and ctr interface specific macros and struct definition from libglusterfs/src/gfdb/gfdb_data_store.h to new header libglusterfs/src/tier-ctr-interface.h Change-Id: If56af78c6c81d37529b9b6e65ae606ba5c99a811 BUG: 1366648 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: http://review.gluster.org/15158 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
*	glusterd: set the brickinfo->port before spawning the bricks	Atin Mukherjee	2016-10-18	1	-4/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As of now, when glusterd spawns a brick process, post spawning, the brickinfo's port is set. The side effect of this is it opens up an window where the pmap_signin event can be initiated by the brick to glusterd and glusterd fails to update signed_in flag since the brickinfo port is still 0 and the comparison of port and brickinfo->port fails. As a solution, set the brickinfo->port post pmap_registry_alloc and if the brick spawn fails reset it to 0. This logic applies for rdma port too. Change-Id: I00a13d4c6d6809ebd19a972aa13e71ee5eac7e35 BUG: 1385575 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/15655 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	features/read-only: reten_mode is invalid to be free by mem_put()	Ryan Ding	2016-10-18	1	-11/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	priv->reten_mode is initialised by option 'retention-mode'. and it reference the memory in this->options. so fini() use mem_put to free priv->reten_mode will cause a problem. there is no need to call mem_put(), so just remove it will be fine. Change-Id: Iee6f9d1d54df38cba8c9b9100e2824f4f2b18ab4 BUG: 1369523 Signed-off-by: Ryan Ding <ryan.ding@open-fs.com> Reviewed-on: http://review.gluster.org/15296 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
*	crypt: changes needed for openssl-1.1 (coming in Fedora 26)	Kaleb S. KEITHLEY	2016-10-18	1	-4/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fedora is poised to update openssl-1.1.0b in/for Fedora 26 in the next day or so. But already Fedora koji scratch builds are built against openssl-1.1.0b because of the way scratch builds work. N.B. that the latest Fedora rawhide (11 October) still ships with openssl-1.0.2j. HMAC_CTX is now an opaque type and instances of it must be created and released by calling HMAC_CTX_new() and HMAC_CTX_free(). Change-Id: I3a09751d7b0d9fc25fe18aac6527e5431e9ab19a BUG: 1384142 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/15629 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	trivial: correct some spelling mistakes in comments and logs	Niels de Vos	2016-10-18	6	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	BUG: 1385593 Change-Id: Icfae9e557a284182c6c22e9606fdd641528906f0 Reported-by: Patrick Matthäi <pmatthaei@debian.org> Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/15656 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
*	storage/posix: Fix race in posix_pstat	Pranith Kumar K	2016-10-17	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: When one thread is in the process of creating a file/directory and the other thread is doing readdirp, there is a chance that posix_pstat, creation fops race in the following manner which will lead to wrong stat values to be read by parent xlators like posix-acl. Creation fops posix_pstat() as part of readdirp 1) file is created with uid/gid 0/0 1) does stat of the path that is created just now. 2) Does chown to set the correct uid/gid 3) Sets the acl/user/internal xattrs 4) Sets the gfid on the entry and completes the creation of the file/dir 2) fills the gfid in the iatt If unwind of readdirp hits server xlator before creation fop, then posix-acl remembers uid/gid of the file to be root/root and fails fops like open etc on it. Fix: Reverse the order of filling gfid and filling lstat() values in posix_pstat() so that if there is gfid in iatt buffer uid/gid are valid. Change-Id: I46caa7f6da7abfa40a0b1d70e35b88de9c64959c Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15564 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
*	performance/write-behind: remove the request from liability queue in	Raghavendra G	2016-10-16	1	-2/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	wb_fulfill_request Before this patch, a request is removed from liability queue only when ref count of request hits 0. Though, wb_fulfill_request does an unref, it need not be the last unref and hence the request may survive in liability queue till the last unref. Let, T1: the time at which wb_fulfill_request is invoked T2: the time at which last unref is done on request Let's consider a case of T2 > T1. In the time window between T1 and T2, any other request (waiter) conflicting with request in liability queue (blocker - basically a write which has been lied) is blocked from winding. If T2 happens to be when wb_do_unwinds is invoked, no further processing of request list happens and "waiter" would get blocked forever. An example imaginary sequence of events is given below: 1. A write request w1 is picked up for unwinding in __wb_pick_unwinds (but unwind is not done _yet_ and hence reference remains). However, w1 is moved to liability queue. Let's call this invocation of wb_process_queue by wb_writev as PQ1. 2. A flush (f1) request hits write behind. Since the liability queue of inode is not empty, f1 is not picked for unwinding. Let's call the invocation of wb_process_queue by wb_flush as PQ2. 3. PQ2 continues and picks w1 for fulfilling and invokes wb_fulfill. As part of successful wb_fulfill_cbk, wb_fulfill_request (w1) is invoked. But, w1 is not freed (and hence not removed from liability queue) as w1 is not unwound _yet_ and a ref remains (PQ1 has not invoked wb_do_unwinds _yet_). 4. wb_fulfill_cbk (triggered by PQ2) invokes a wb_process_queue (let's say PQ3). f1 is not resumed in PQ3 as w1 is still in liability queue. At this time, PQ2 and PQ3 are complete. 5. PQ1 continues, unwinds w1 and does last unref on w1 and w1 is freed (and removed from liability queue). Since PQ1 didn't invoke wb_fulfill on any other write requests, there won't be any future codepaths that would invoke wb_process_queue and f1 is stuck forever. With this fix, w1 is removed from liability queue in step 3 above and PQ3 resumes f1 in step 4 (as there are no requests conflicting with f1 in liability queue during execution of PQ3). Signed-off-by: Raghavendra G <rgowdapp@redhat.com> BUG: 1379655 Change-Id: Idacda1fcd520ac27f30224f8dfe8360dba6ac6cb Reviewed-on: http://review.gluster.org/15579 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>
*	glusterd: enable default configurations post upgrade to >= 3.9.0 versions	Atin Mukherjee	2016-10-16	2	-10/+80
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With 3.8.0 onwards volume options like nfs.disable, transport.address-family have some default configuration value. If a volume was created pre upgrade to 3.8.0 or higher the default options are not set post upgrade. This patch takes care of putting the default values in the op-version bump up workflow. However these changes will only reflect from 3.9.0 onwards Change-Id: I9a8d848cd08d87ddcb80dbeac27eaae097d9cbeb BUG: 1379223 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/15568 Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: Kaushal M <kaushal@redhat.com>
*	cluster/afr: Prevent dict_set() on NULL dict	Pranith Kumar K	2016-10-15	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In afr lookup when NULL dict is received in lookup, afr is supposed to set all the xattrs it requires in a new dict it creates, but for 'link-count' it is trying to set to the dict that is passed in lookup which can be NULL sometimes. This is leading to error logs. Fixed the same in this patch. BUG: 1385104 Change-Id: I679af89cfc410cbc35557ae0691763a05eb5ed0e Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15646 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com>
*	afr: Take full locks in arbiter only for data transactions	Ravishankar N	2016-10-14	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Sharding exposed a bug in arbiter config. where `dd` throughput was extremely slow. Shard xlator was sending a fxattrop to update the file size immediately after a writev. Arbiter was incorrectly over-riding the LLONGMAX-1 start offset (for metadata domain locks) for this fxattrop, causing the inodelk to be taken on the data domain. And since the preceeding writev hadn't released the lock (afr does a 'lazy' unlock if write succeeds on all bricks), this degraded to a blocking lock causing extra lock/unlock calls and delays. Fix: Modify flock.l_len and flock.l_start to take full locks only for data transactions. Change-Id: I906895da2f2d16813607e6c906cb4defb21d7c3b BUG: 1384906 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reported-by: Max Raba <max.raba@comsysto.com> Reviewed-on: http://review.gluster.org/15641 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	mgmt/glusterd: Cleanup memory leaks in handshake	Vijay Bellur	2016-10-11	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \|	Thanks to bingxuan.zhang at nokia dot com for the report and patch. Change-Id: I994f82493fec7827f31592340af5bda83322f878 BUG: 1377584 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/15612 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	cluster/ec: Implement heal info with lock	Ashish Pandey	2016-10-11	5	-79/+261
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Currently heal info command prints all the files/directories if the index for the file/directory is present in .glusterfs/indices folder. After implementing patch http://review.gluster.org/#/c/13733/ indices of the file which is going through update fop will also be present in .glusterfs/indices even if the fop is successful on all the brick. At this time if heal info command is being used, it will also display this file which is actually healthy and does not require any heal. Solution: Take lock on a file corresponding to the indices and inspect xattrs to decide if the file needs heal or not. Change-Id: I6361e2813ece369be12d02e74816df4eddb81cfa BUG: 1366815 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/15543 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>
*	afr: fix incorrect debug log in selfheal path	Ravishankar N	2016-10-04	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	1. While looking at glustershd logs in DEBUG log-level, it was found that all bricks of the replica were printed as local bricks even though they were not. Fixed it. 2. Print the name of the subvol from which the entry was got during index crawl. Change-Id: I08b32e38694c755715e9fe0c0e1dd9212abcfb16 BUG: 1381421 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/15610 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	afr: Implement IPC fop	Poornima G	2016-09-29	2	-0/+121
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently ipc() is not implemented in afr. md-cache and upcall uses ipc to register the list of xattrs, [1] for more details. For the ipc op GF_IPC_TARGET_UPCALL, it has to be wound to all the replica subvolumes. ipc() is failed when any of the subvolumes fails with other than ENOTCONN or all of the subvolumes are down. [1] http://review.gluster.org/#/c/15002/ Change-Id: I0f651330eafda64e4d922043fe53bd0014536247 BUG: 1211863 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15378 Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	glusterd: Revert 368e26f454fe35477e46dc698fa6b8c3c608ea8d	N Balachandran	2016-09-26	1	-39/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The earlier fix prevented rebalance from starting even if only a single brick of a replica set was down. Change-Id: I8c9a7d4c1fe33f7749568357462f29796e1b832a BUG: 1376671 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/15517 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	glusterd: "gluster v heal test statistics heal-count replica" output is not ↵	Mohit Agrawal	2016-09-26	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	correct Problem : "gluster v heal test statistcs heal-count replica" does not show correct output. Solution: After update condition (match brick name) in _select_hxlator_with_matching_brick, it shows correct output. BUG: 1325792 Change-Id: I60cc7c68ea70bce267a747570f91dcddbc1d9016 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Reviewed-on: http://review.gluster.org/15494 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	afr: Ignore gluster internal (virtual) xattrs in metadata heal check	Ravishankar N	2016-09-26	2	-23/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: In arbiter configuration, posix-xlator in the arbiter brick always sets the GF_CONTENT_KEY in the response dict with a value 0. If the file size on the data bricks is more than quick-read's max-file-size (64kb default), those bricks don't set the key. Because of this difference in the no. of dict elements, afr triggers metadata heal in lookup code path, in turn leading to extra lookups+inodelks. Fix: Changed afr dict comparison logic to ignore all virtual xattrs and the on-disk ones that we should not be healing. Also removed is_virtual_xattr() function. The original callers to this function (upcall) don't seem to need it anymore. Change-Id: I05730bdd39d8fb0b9a49a5fc9c0bb01f0d3bb308 BUG: 1378684 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/15548 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	glusterd: fix return val in glusterd_op_volume_dict_uuid_to_hostname ()	Atin Mukherjee	2016-09-26	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	glusterd_op_volume_dict_uuid_to_hostname () ignores few dict_get_str failures but doesn't reset the ret value to 0. Reproducer steps: 1. Configure a volume and start it (gNFS has to be disabled) 2. gluster volume status An warning log is observed with: [MSGID: 106217] [glusterd-op-sm.c:4706:glusterd_op_modify_op_ctx] 0-management: Failed uuid to hostname conversion [2016-09-22 09:21:38.537533] W [MSGID: 106387] [glusterd-op-sm.c:4812:glusterd_op_modify_op_ctx] 0-management: op_ctx modification failed Change-Id: I1d3aa79304d83a9d5db1b7198773d8c2780e24a9 BUG: 1378492 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/15547 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-by: Kaushal M <kaushal@redhat.com>
*	ec: Implement ipc fop	Poornima G	2016-09-25	4	-1/+150
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The ipc will be wound to all the bricks, but for it to be successfull, the fop should succeed on minimum number of bricks. Change-Id: I3f8cb6a349e87bafd0773583def9d4e3765aa140 BUG: 1211863 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15387 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ashish Pandey <aspandey@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	afr: Modifications to afr events	Ravishankar N	2016-09-23	5	-8/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Modified afr event message to add a 'type' key as detailed in the BZ. Also added events for data and metadata split-brain. Change-Id: I8156674b4b6a501499fc10fd68e05115fdaef3e4 BUG: 1378072 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/15550 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>