glusterfs.git -

	Commit message	Author	Age	Files	Lines
*	afr: heal gfids when file is not present on all bricks	Ravishankar N	2018-06-19	5	-12/+51
	commit 20fa80057eb430fd72b4fa31b9b65598b8ec1265 introduced a regression wherein if a file is present in only 1 brick of replica and doesn't have a gfid associated with it, it doesn't get healed upon the next lookup from the client. Fix it.
	Change-Id: I7d1111dcb45b1b8b8340a7d02558f05df70aa599
	fixes: bz#1591193
	Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	afr: don't update readables if inode refresh failed on all children	Ravishankar N	2018-06-18	4	-32/+50
	Problem: If inode refresh failed on all children of afr due to ENOENT (say file migrated by dht), it resets the readables to zero. Any inflight txn which then later comes on the inode fails with EIO because no readable children are present for the inode.
	Fix: Don't update readables when inode refresh fails on all children of afr. In that way any inflight txns will either proceed with their own inode refresh if needed and fail with the right errno, or use the old value of readables and continue with the txn.
	Also, add quorum checks to the beginning of afr_transaction(). Otherwise, we seem to be winding the lock and checking for quorum only in the pre-op phase.
	Note: This should ideally fix BZ 1329505 since the stop gap fix for it has been reverted at https://review.gluster.org/#/c/20028.
	Change-Id: Ia638c092d8d12dc27afb3cdad133394845061319
	updates: bz#1584483
	Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	performance/quick-read: provide an invalidation based on ctime	Raghavendra G	2018-06-18	3	-1/+55
	Quick-read by default uses mtime to identify changes to file data. However, there are applications like rsync which explicitly set mtime, making it unreliable for the purpose of identifying change in file content. Since ctime also changes when the content of a file changes and it cannot be set explicitly, it becomes suitable for identifying staleness of cached data.
	This option makes quick-read prefer ctime over mtime to validate its cache. However, using ctime can result in false positives, as ctime changes with just attribute changes like permission without changes to file data. So, use this option only when mtime is not reliable.
	Credits to Kotresh Hiremath Ravishankar <khiremat@redhat.com> for the suggestion on using ctime instead of mtime.
	Change-Id: Ib3ae39a3252b2876c8ffe81f471d02a87190e9b9
	Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
	Updates: bz#1591621
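The trade-off described in this message can be sketched in a few lines of Python. This is only an illustration, not quick-read's code: `cache_is_stale` and `demo` are hypothetical names, and the stat values are made-up nanosecond timestamps standing in for what `os.stat()` would report.

```python
from types import SimpleNamespace


def cache_is_stale(cached, fresh, prefer_ctime=False):
    """Compare the stat taken when data was cached with a fresh stat."""
    if prefer_ctime:
        return fresh.st_ctime_ns != cached.st_ctime_ns
    return fresh.st_mtime_ns != cached.st_mtime_ns


def demo():
    # Hypothetical timestamps; real values would come from os.stat().
    cached = SimpleNamespace(st_mtime_ns=1000, st_ctime_ns=1000)
    # rsync-style update: content rewritten, then mtime set back explicitly.
    after_rsync = SimpleNamespace(st_mtime_ns=1000, st_ctime_ns=2000)
    # chmod only: ctime moves although the file data is unchanged.
    after_chmod = SimpleNamespace(st_mtime_ns=1000, st_ctime_ns=3000)
    return {
        "rsync_seen_by_mtime": cache_is_stale(cached, after_rsync),
        "rsync_seen_by_ctime": cache_is_stale(cached, after_rsync, True),
        "chmod_seen_by_ctime": cache_is_stale(cached, after_chmod, True),
    }
```

With mtime the rsync-style change goes unnoticed, while with ctime it is caught; the chmod case shows the false positive the message warns about.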
*	protocol/client: Remove code duplication	Krutika Dhananjay	2018-06-15	3	-119/+22
	client_submit_vec_request(), which is used by WRITEV and PUT, and client_submit_request(), used by the rest of the fops, have almost similar code. However, there have been some more checks - such as whether setvolume was successful or not, and one more that is send-gid-specific - that have been missed out in the vectored version of the function. This patch fixes this code duplication.
	Change-Id: I363a28eeead6219cb1009dc831538153e8bd7d40
	fixes: bz#1591580
	Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
*	glusterd: removing the unnecessary glusterd message	Sanju Rakonde	2018-06-14	1	-2/+1
	Fixes: bz#1589253
	Change-Id: I5510250a3d094e19e471b3ee47bf13ea9ee8aff5
	Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	features/snapview-server: properly go through the list of snapshots	Raghavendra Bhat	2018-06-14	1	-3/+9
	The comparison code to check whether a glfs instance is valid (i.e. whether it corresponds to one in the list of current snapshots) was not correct and was not comparing all the snapshots.
	Change-Id: I87c58edb47bd9ebbb91d805e45df2c4baf2c8118
	fixes: bz#1589842
	Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
*	storage/posix: Handle ENOSPC correctly in zero_fill	Pranith Kumar K	2018-06-14	1	-1/+22
	Change-Id: Icc521d86cc510f88b67d334b346095713899087a
	fixes: bz#1590710
	Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	glusterd: Fix for shd not coming up	Sanju Rakonde	2018-06-13	2	-1/+6
	Problem: After creating and starting n (n is large) distribute-replicated volumes using a script, if we create and start the (n+1)th distribute-replicate volume manually, the self heal daemon is down.
	Solution: In glusterd_proc_stop, after sending SIGTERM, if the process is still running, we send a SIGKILL. As SIGKILL will not perform any cleanup, we need to remove the pidfile.
	Fixes: bz#1589253
	Change-Id: I7c114334eec74c8d0f21b3e45cf7db6b8ef28af1
	Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
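The stop sequence described here (SIGTERM, then SIGKILL plus manual pidfile removal) can be sketched as follows. This is a Python illustration for POSIX systems under stated assumptions, not glusterd_proc_stop itself: `stop_process`, `demo`, the grace period and the child script are all hypothetical.

```python
import os
import signal
import subprocess
import sys
import tempfile


def stop_process(proc, pidfile, grace=1.0):
    """Ask politely with SIGTERM; escalate to SIGKILL and clean up the pidfile."""
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=grace)
        return "terminated"  # the process had a chance to clean up after itself
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL: the process cannot run any cleanup
        proc.wait()
        if os.path.exists(pidfile):
            os.unlink(pidfile)  # so the stale pidfile must be removed here
        return "killed"


def demo():
    # Hypothetical child that ignores SIGTERM, to force the SIGKILL path.
    child_code = (
        "import signal, time\n"
        "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
        "print('ready', flush=True)\n"
        "time.sleep(60)\n"
    )
    proc = subprocess.Popen([sys.executable, "-c", child_code],
                            stdout=subprocess.PIPE)
    proc.stdout.readline()  # wait until the handler is installed
    proc.stdout.close()
    fd, pidfile = tempfile.mkstemp(suffix=".pid")
    with os.fdopen(fd, "w") as handle:
        handle.write(str(proc.pid))
    outcome = stop_process(proc, pidfile, grace=0.5)
    return outcome, os.path.exists(pidfile)
```

The point of the sketch is the last branch: once the escalation to SIGKILL happens, nobody but the caller is left to remove the pidfile.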
*	core/various: python3 compat, prepare for python2 -> python3	Kaleb S. KEITHLEY	2018-06-13	2	-5/+5
	see https://review.gluster.org/#/c/19788/, https://review.gluster.org/#/c/19871/, https://review.gluster.org/#/c/19952/, https://review.gluster.org/#/c/20104/, https://review.gluster.org/#/c/20162/, https://review.gluster.org/#/c/20185/, and https://review.gluster.org/#/c/20207/
	This patch changes uses of has_key() as suggested by the 2to3 utility.
	Note: Fedora packaging guidelines require explicit shebangs, so popular practices like #!/usr/bin/env python and #!/usr/bin/python are not allowed; they must be #!/usr/bin/python2 or #!/usr/bin/python3
	Note: Selected small fixes from 2to3 utility. Specifically apply, basestring, funcattrs, idioms, numliterals, set_literal, types, urllib, zip, map, and raise have already been applied. Also version agnostic imports for urllib, cpickle, socketserver, _thread, queue, etc., suggested by Aravinda in https://review.gluster.org/#/c/19767/1
	Note: these 2to3 fixes report no changes are necessary: asserts, buffer, exec, execfile, exitfunc, filter, getcwdu, imports2, input, intern, itertools, metaclass, methodattrs, ne, next, nonzero, operator, paren, raw_input, reduce, reload, renames, repr, standarderror, sys_exc, throw, tuple_params, xreadlines.
	Updates: #411
	Change-Id: I79bda20f1583a0a1bb0320667498f4c137de93b3
	Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
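For readers unfamiliar with the has_key() fixer: dict.has_key() exists only in Python 2, and the `in` operator works in both versions. A minimal sketch, where `count_bricks` and the sample dictionary are hypothetical and not taken from the patch:

```python
def count_bricks(volumes, name):
    """Return the number of bricks for a volume, or 0 if it is unknown."""
    # Python 2 only: if volumes.has_key(name):
    if name in volumes:  # works on both Python 2 and Python 3
        return len(volumes[name])
    return 0
```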
*	storage/posix: Fix excessive logging in WRITE fop path	Krutika Dhananjay	2018-06-13	1	-0/+3
	I was running some write-intensive tests on my volume, and in a matter of 2 hrs, the 50GB space in my root partition was exhausted. On inspecting further, figured that excessive logging in bricks was the cause - specifically in posix write when posix_check_internal_writes() does dict_get() without a NULL-check on xdata.
	Change-Id: I89de57a3a90ca5c375e5b9477801a9e5ff018bbf
	fixes: bz#1590655
	Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
*	features/shard: Introducing ".shard/.remove_me" for atomic shard deletion (part 1)	Krutika Dhananjay	2018-06-13	4	-421/+1040
	PROBLEM: Shards are deleted synchronously when a sharded file is unlinked or when a sharded file participating as the dst in a rename() is going to be replaced. The problem with this approach is it makes the operation really slow, sometimes causing the application to time out, especially with large files.
	SOLUTION: To make this operation atomic, we introduce a ".remove_me" directory. Now renames and unlinks will simply involve two steps:
	1. creating an empty file under .remove_me named after the gfid of the file participating in unlink/rename
	2. carrying out the actual rename/unlink
	A synctask is created (more on that in part 2) to scan this directory after every unlink/rename operation (or upon a volume mount) and clean up all shards associated with it. All of this happens in the background. The task takes care to delete the shards associated with the gfid in .remove_me only if this gfid doesn't exist in backend, ensuring that the file was successfully renamed/unlinked and its shards can be discarded now safely.
	Change-Id: Ia1d238b721a3e99f951a73abbe199e4245f51a3a
	updates: bz#1568521
	Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
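The two-step scheme and the background scan can be sketched on a plain directory tree. This is a simplified Python model, not the shard translator: the layout (base files named by gfid directly under the root, shards named `<gfid>.<n>` under `.shard`), the function names, and the choice to delete the marker after cleanup are all assumptions made for illustration.

```python
import os
import tempfile


def unlink_sharded(root, gfid):
    """Step 1: drop a marker under .remove_me; step 2: do the real unlink."""
    marker_dir = os.path.join(root, ".shard", ".remove_me")
    os.makedirs(marker_dir, exist_ok=True)
    open(os.path.join(marker_dir, gfid), "w").close()
    os.unlink(os.path.join(root, gfid))


def cleanup_shards(root):
    """Background scan: delete shards only for gfids that are really gone."""
    shard_dir = os.path.join(root, ".shard")
    marker_dir = os.path.join(shard_dir, ".remove_me")
    removed = []
    for gfid in sorted(os.listdir(marker_dir)):
        if os.path.exists(os.path.join(root, gfid)):
            continue  # base file still present: the unlink never completed
        for name in sorted(os.listdir(shard_dir)):
            if name.startswith(gfid + "."):
                os.unlink(os.path.join(shard_dir, name))
                removed.append(name)
        os.unlink(os.path.join(marker_dir, gfid))  # assumption: marker is dropped
    return removed


def demo():
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, ".shard", ".remove_me"))
        for name in ("g1", "g2", ".shard/g1.1", ".shard/g1.2", ".shard/g2.1"):
            open(os.path.join(root, name), "w").close()
        unlink_sharded(root, "g1")
        # Simulated crash: marker written for g2, but its unlink never ran.
        open(os.path.join(root, ".shard", ".remove_me", "g2"), "w").close()
        removed = cleanup_shards(root)
        survivors = sorted(n for n in os.listdir(os.path.join(root, ".shard"))
                           if n != ".remove_me")
        return removed, survivors
```

Because the scan checks that the base file is gone before touching any shard, a marker left behind by an interrupted operation does not cause data loss in this model.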
*	cluster/dht: Refactor rebalance code	N Balachandran	2018-06-13	1	-309/+253
	Created init and cleanup functions for certain functionality in order to improve readability. Removed unused code.
	Change-Id: Ia6a2f4ab64923b6ea8e10487227fb5621eec1488
	updates: bz#1586363
	Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	glusterd: Coverity Fixes	Sanju Rakonde	2018-06-11	2	-4/+7
	Fixes: #789278
	Change-Id: I633704fab49992cac6ee9e05bc368f7da360d09e
	Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
	Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
*	cluster/dht: Leverage MDS subvol for dht_removexattr also	Mohit Agrawal	2018-06-11	1	-60/+190
	Problem: In a distributed volume, a situation can arise where custom extended attributes are not removed from all bricks after a stop/start or after adding a new brick.
	Solution: To resolve this, use the MDS subvol for removexattr as well.
	BUG: 1575587
	Change-Id: I7701e0d3833e3064274cb269f26061bff9b71f50
	fixes: bz#1575587
	Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
*	protocol/server: Fix xdata leak in seek fop	Pranith Kumar K	2018-06-11	1	-2/+1
	Change-Id: I6125283ed22c04564f0b77bb7a50579a83e02eb0
	fixes: bz#1589691
	Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	core/various: python3 compat, prepare for python2 -> python3	Kaleb S. KEITHLEY	2018-06-07	2	-15/+17
	see https://review.gluster.org/#/c/19788/, https://review.gluster.org/#/c/19871/, https://review.gluster.org/#/c/19952/, and https://review.gluster.org/#/c/20104/ https://review.gluster.org/#/c/20162/
	This patch changes uses of map() and raise(), and a few cases of print() that were overlooked in the prior patch that fixed print.
	Note: Fedora packaging guidelines require explicit shebangs, so popular practices like #!/usr/bin/env python and #!/usr/bin/python are not allowed; they must be #!/usr/bin/python2 or #!/usr/bin/python3
	Note: Selected small fixes from 2to3 utility. Specifically apply, basestring, funcattrs, idioms, numliterals, set_literal, types, urllib, zip, map, and raise have already been applied. Also version agnostic imports for urllib, cpickle, socketserver, _thread, queue, etc., suggested by Aravinda in https://review.gluster.org/#/c/19767/1
	Note: these 2to3 fixes report no changes are necessary: asserts, buffer, exec, execfile, exitfunc, filter, getcwdu, intern, itertools, metaclass, methodattrs, ne, next, nonzero, operator, paren, raw_input, reduce, reload, renames, repr, standarderror, sys_exc, throw, tuple_params, xreadlines.
	Change-Id: Id62ea491e4ab5dd390075c5c6d9d889cf6f9da27
	updates: #411
	Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
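As a reminder of what the map and raise fixers change, here is a small sketch that behaves the same under Python 2 and Python 3. The functions `parse_ports` and `require_positive` are hypothetical and only illustrate the two idioms.

```python
def parse_ports(fields):
    """Convert a list of strings to a list of ints."""
    # Python 2 map() returned a list; Python 3 map() returns a lazy iterator,
    # so wrap it in list() when a real list is needed.
    return list(map(int, fields))


def require_positive(value):
    """Return value, or raise if it is not positive."""
    if value <= 0:
        # Python 2 only: raise ValueError, "message"
        raise ValueError("expected a positive number, got %d" % value)
    return value
```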
*	core/various: python3 compat, prepare for python2 -> python3	Kaleb S. KEITHLEY	2018-06-07	1	-1/+1
	see https://review.gluster.org/#/c/19788/, https://review.gluster.org/#/c/19871/, https://review.gluster.org/#/c/19952/, and https://review.gluster.org/#/c/20104/
	This patch changes uses of xrange() to range(), as suggested by the python 2to3 utility.
	https://www.geeksforgeeks.org/range-vs-xrange-python/
	In Python 3, there is no xrange, but the range function behaves like xrange in Python 2. (My concern is that range() in python2 may behave differently until we "throw the switch" to switch to python3.)
	Note: Fedora packaging guidelines require explicit shebangs, so popular practices like #!/usr/bin/env python and #!/usr/bin/python are not allowed; they must be #!/usr/bin/python2 or #!/usr/bin/python3
	Note: Selected small fixes from 2to3 utility. Specifically apply, basestring, funcattrs, idioms, numliterals, set_literal, types, urllib, and zip have already been applied. Also version agnostic imports for urllib, cpickle, socketserver, _thread, queue, etc., suggested by Aravinda in https://review.gluster.org/#/c/19767/1
	Note: these 2to3 fixes report no changes are necessary: asserts, buffer, exec, execfile, exitfunc, filter, getcwdu, intern, itertools, metaclass, methodattrs, ne, next, nonzero, operator, paren, raw_input, reduce, reload, renames, repr, standarderror, sys_exc, throw, tuple_params, xreadlines.
	Change-Id: I16ae9f4e3a4fd02a0623fb6f9fdb7aaf65f2a8a9
	updates: #411
	Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
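The concern in parentheses is about memory rather than results: under Python 2, range() builds a full list where xrange() was lazy, but iteration yields the same values. A small sketch, with `first_squares` as a hypothetical helper:

```python
def first_squares(count):
    """Return the squares of 0..count-1."""
    # range() is accepted by both Python 2 and Python 3. Under Python 2 it
    # builds a full list first (xrange was lazy); under Python 3 it is lazy.
    return [i * i for i in range(count)]
```

For the small loops typical of scripts, the extra list under Python 2 is harmless; only very large ranges would notice the difference.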
*	glusterd: gluster v status is showing wrong status for glustershd	Sanju Rakonde	2018-06-06	1	-3/+10
	When we restart the bricks, connect and disconnect events happen for glustershd. glusterd uses two threads to handle disconnect and connect events from glustershd. When we restart the bricks we'll get both disconnect and connect events, so both threads will compete for the big lock.
	We want the disconnect event to finish before the connect event. But if the connect thread gets the big lock first, it sets svc->online to true, and then the disconnect thread will set svc->online to false. So glustershd will be disconnected from glusterd and the wrong status is shown.
	After killing shd, glusterd sleeps for 1 second. To avoid the problem, if glusterd releases the lock before the sleep and acquires it after the sleep, the disconnect thread will get a chance to handle glusterd_svc_common_rpc_notify before the other thread completes the connect event.
	Change-Id: Ie82e823fdfc936feb7c0ae10599297b050ee9986
	fixes: bz#1585391
	Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	mgmt/glusterd: Cleanup dead code	Vijay Bellur	2018-06-06	1	-9/+0
	updates: bz#789278
	Change-Id: Id67ab681317eb0a69874400a40e3b249dfc7a7db
	Signed-off-by: Vijay Bellur <vbellur@redhat.com>
*	rpc/clnt: Don't let consumers manage "connected" state	Raghavendra G	2018-06-04	4	-8/+0
	The state management of "connected" in rpc is ad-hoc as far as the responsibility goes. Note that there is nothing wrong with functionality itself. rpc layer manages this state in disconnect codepath and has exposed an api to manage this one from consumers. Note that rpc layer never sets "connected" to true by itself, which forces the consumers to use this api to get a working rpc connection. The situation is best captured from a comment in code from Jeff Darcy in glusterfsd/src/gf-attach.c:
	-/*
	- * In a sane world, the generic RPC layer would be capable of tracking
	- * connection status by itself, with no help from us. It might invoke our
	- * callback if we had registered one, but only to provide information. Sadly,
	- * we don't live in that world. Instead, the callback must exist and must
	- * call rpc_clnt_{set,unset}_connected, because that's the only way those
	- * fields get set (with RPC both above and below us on the stack). If we don't
	- * do that, then rpc_clnt_submit doesn't think we're connected even when we
	- * are. It calls the socket code to reconnect, but the socket code tracks this
	- * stuff in a sane way so it knows we're connected and returns EINPROGRESS.
	- * Then we're stuck, connected but unable to use the connection. To make it
	- * work, we define and register this trivial callback.
	- */
	Also, consumers of rpc know about state of connection only through the notifications sent by rpc-clnt. So, consumers don't have any extra information to manage the state and hence letting them manage the state is counter intuitive. This patch cleans that up and instead moves the responsibility of state management of rpc layer into itself.
	Change-Id: I31e641a60795fc480ca753917f4b2579f1e05094
	Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
	Fixes: bz#1585585
*	posix/ctime: Fix fops racing in updating mtime/atime	Kotresh HR	2018-06-03	1	-11/+31
	In distributed systems, there could be races between fops updating mtime/atime, which could result in different mtime/atime for the same file. So updating them only if the time is greater than the existing one makes sure only the highest time is retained. If the mtime/atime update comes from the explicit utime syscall, it is allowed to set a previous time.
	Thanks Xavi for helping in root-causing the issue.
	fixes: bz#1584981
	Change-Id: If1230a75b96d7f9a828795189fcc699049e7826e
	Signed-off-by: Kotresh HR <khiremat@redhat.com>
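The update rule in this message fits in a few lines. The following Python sketch is an illustration only, not the posix xlator code: `merge_time` is a hypothetical name, and times are modelled as `(seconds, nanoseconds)` tuples.

```python
def merge_time(stored, incoming, explicit_utime=False):
    """Pick the timestamp to keep; times are (seconds, nanoseconds) tuples."""
    if explicit_utime:
        return incoming  # an explicit utime call may move the time backwards
    # Racing fops: keep only the highest time so every brick converges.
    return max(stored, incoming)
```

Because max() is commutative, two racing updates applied in either order leave the same value, which is what keeps the copies consistent.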
*	dht: Delete MDS internal xattr from dict in dht_getxattr_cbk	Mohit Agrawal	2018-06-03	2	-31/+4
	Problem: When afr fetches xattrs in order to heal them, it is not able to fetch the xattr because posix_getxattr has a check to ignore the xattr if its name is MDS.
	Solution: To ignore the same xattr, update a check in dht_getxattr_cbk instead of having a check in posix_getxattr.
	BUG: 1584098
	Change-Id: I86cd2b2ee08488cb6c12f407694219d57c5361dc
	fixes: bz#1584098
	Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
*	changed 'sometime' messages to 'some time'	Levi Baber	2018-06-01	4	-10/+10
	Change-Id: I0936229fc84c011db7791218bb566c971fdea174
	fixes: bz#1584864
	Signed-off-by: Levi Baber <baber@iastate.edu>
*	features/shard: Fix missing unlock in shard_fsync_shards_cbk()	Vijay Bellur	2018-06-01	1	-0/+1
	updates: bz#789278
	Change-Id: I745a98e957cf3c6ba69247fcf6b58dd05cf59c3c
	Signed-off-by: Vijay Bellur <vbellur@redhat.com>
*	libgfchangelog: Remove duplicate includedir definition for changelog.h	Anoop C S	2018-06-01	1	-1/+0
	includedir for changelog.h is already defined in Makefile.am under libglusterfs/src since it was moved from xlators/features/changelog/lib/src. Therefore removing the duplicate definition.
	Change-Id: Iaff2e02fca45715820caa35b41efc2f6b656203a
	updates: bz#1193929
	Signed-off-by: Anoop C S <anoopcs@redhat.com>
*	performance/io-cache: fix a missing unlock	Vijay Bellur	2018-05-31	1	-1/+1
	Fixes: bz789278
	Change-Id: If8ca1fef8a10f1e7270390b61121f8a20a76b1d0
	updates: bz#789278
	Signed-off-by: Vijay Bellur <vbellur@redhat.com>
*	glusterd: address test failures with brick mux enabled	Atin Mukherjee	2018-05-31	2	-0/+19
	This patch addresses following:
	1. On volume stop, for the last brick, pmap_registry_remove () is invoked by glusterd.
	2. If a brick process is sigkilled, remove all the associated brick instances from the portmap.
	3. Bump up PROCESS_UP_TIMEOUT to 45.
	4. gf_attach to kill a brick takes more time in mux (which is an issue that needs a fix), but in the interim, give br-state-check.t more time to complete (there are 2 kill_bricks, each taking 120 seconds, and the test usually passes in 30 odd seconds, hence bumping this up to 350 seconds)
	5. The test bug-1559004-EMLINK-handling.t is taking ~950 seconds at times on master without mux, in mux cases, when it fails, it is almost at the last iteration, hence bumping the timeout for this test case to reduce regression error rates
	Updates: bz#1577672
	Change-Id: I1922675e112baca4c125c4c094eaa42a11e34e67
	Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
*	protocol/client: Don't send fops till SETVOLUME is complete	Raghavendra G	2018-05-31	2	-3/+9
	An earlier commit set conf->connected just after rpc layer sends RPC_CLNT_CONNECT event. However, success of socket level connection doesn't indicate brick stack is ready to receive fops, as a handshake has to be done b/w client and server after RPC_CLNT_CONNECT event. Any fop sent to brick in the window between,
	* protocol/client receiving RPC_CLNT_CONNECT event
	* protocol/client receiving a successful setvolume response
	can end up accessing an uninitialized brick stack. So, set conf->connected only after a successful SETVOLUME.
	Change-Id: I139a03d2da6b0d95a0d68391fcf54b00e749decf
	fixes: bz#1583937
	Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
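The window described above can be modelled as a tiny state machine. This Python sketch is an illustration under assumptions, not protocol/client code: the class `ClientConn`, its method names, and the choice of `-ENOTCONN` as the rejection value are hypothetical.

```python
import errno


class ClientConn:
    """Toy model of a client connection gated on the SETVOLUME handshake."""

    def __init__(self):
        self.connected = False
        self.sent = []

    def on_rpc_connect(self):
        # Socket is up, but the handshake has not completed yet, so
        # 'connected' deliberately stays False here.
        pass

    def on_setvolume_reply(self, success):
        self.connected = bool(success)

    def on_rpc_disconnect(self):
        self.connected = False

    def submit(self, fop):
        if not self.connected:
            return -errno.ENOTCONN  # assumption: reject rather than queue
        self.sent.append(fop)
        return 0
```

A fop submitted between `on_rpc_connect()` and a successful `on_setvolume_reply()` is refused, so it can never reach a brick stack that is not yet initialized.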
*	cluster:dht: Corrected ret code check	N Balachandran	2018-05-30	1	-1/+1
	syncop functions return -op_errno.
	Change-Id: Ifdb1bd1d1d11972b4306a2336e6737d6236a2fb1
	fixes: bz#1580238
	Signed-off-by: N Balachandran <nbalacha@redhat.com>
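The convention named in this message (failure is reported as the negated errno in the return value) is easy to get wrong in a comparison. A Python sketch of the pattern, where `fake_syncop_lookup` is a hypothetical stand-in and not a real syncop function:

```python
import errno


def fake_syncop_lookup(exists):
    """Hypothetical stand-in: return 0 on success, -op_errno on failure."""
    return 0 if exists else -errno.ENOENT


def is_missing(ret):
    """Check for ENOENT under the negative-errno convention."""
    # Wrong under this convention: ret == errno.ENOENT, or ret == -1.
    return ret == -errno.ENOENT
```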
*	core/various: python3 compat, prepare for python2 -> python3	Kaleb S. KEITHLEY	2018-05-30	2	-3/+3
	see https://review.gluster.org/#/c/19788/ and https://review.gluster.org/#/c/19871/
	Selected small fixes from 2to3 utility. Specifically apply, basestring, funcattrs, idioms, numliterals, set_literal, types, urllib, zip
	Note: these 2to3 fixes report no changes are necessary: exec, execfile, exitfunc, filter, getcwdu, intern, itertools, metaclass, methodattrs, ne, next, nonzero, operator, paren, raw_input, reduce, reload, renames, repr, standarderror, sys_exc, throw, tuple_params, xreadlines.
	Any 2to3 fixes not in the above two lists have more extensive changes which will follow in separate patches.
	Most unicode changes suggested by 2to3 will need to be applied at the same time as changing the shebangs from python2 to python3. Prashanth notes that unicode strings in py2 need a 'u' prefix; in py3 3.0, 3.1, and 3.2 a 'u' prefix will throw an error, but in py3 3.3+ it is legal (or just ignored). All Linux dists we care about have 3.3 or later so we can leave 'u' prefixes on unicode strings.
	Change-Id: I49bba2f328b0ee24b9a8115a7183be979981563e
	updates: #411
	Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
*	cloudsync: Adding s3 plugin for cloudsync	Susant Palai	2018-05-30	12	-16/+702
	This is a plugin which provides an interface to retrieve files from amazon-s3 which are archived into s3. Users need to give the above information for cloudsync to retrieve the file from s3.
	TODO:
	1- A separate commit to the developer-guide will describe the usage of this plugin in more detail.
	2- Need to create target file in aws-bucket with "gfid" names. Helps avoiding name collisions.
	Change-Id: I2e4a586f4e3f86164de9178e37673a07f317e7d9
	Updates: #387
	Signed-off-by: Susant Palai <spalai@redhat.com>
*	dht: Excessive 'dict is null' logs in dht_revalidate_cbk	Mohit Agrawal	2018-05-29	1	-2/+4
	Problem: In case of error (ESTALE/ENOENT), dht_revalidate_cbk throws a "dict is null" error because xattr is not available.
	Solution: To avoid the logs, update the condition in dht_revalidate_cbk and dht_lookup_dir_cbk.
	BUG: 1583565
	Change-Id: Ife6b3eeb6d91bf24403ed3100e237bb5d15b4357
	fixes: bz#1583565
	Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
*	performance/open-behind: open pending fds before permission change	Raghavendra G	2018-05-29	1	-1/+60
	setattr, posix-acl and selinux changes on a file can revoke permission to open the file after permission changes. To prevent that, make sure the pending fd is opened before winding down setattr or setxattr (for posix-acl and selinux) calls.
	Change-Id: Ib0b91795d286072e445190f9a1b3b1e9cd363282
	Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
	fixes: bz#1405147
*	performance/read-ahead: throwaway read-ahead cache of all fds on writes on any fd	Raghavendra G	2018-05-29	1	-28/+32
	This is to make sure applications that read and write on different fds of the same file work.
	This patch also fixes two other issues:
	1. while iterating over the list of open fds on an inode, initialize tmp_file to 0 for each iteration before fd_ctx_get to make sure we don't carry over the history from previous iterations.
	2. remove flushing of cache in flush and fsync as by themselves, they don't modify the data
	Change-Id: Ib9959eb73702a3ebbf90badccaa16b2608050eff
	Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
	Updates: bz#1512691
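The invalidation rule in the subject line can be pictured with a per-file object holding one cache per open fd. This Python sketch is a toy model, not the read-ahead translator; `ReadAheadFile` and its methods are hypothetical names.

```python
class ReadAheadFile:
    """Toy per-inode state: one read-ahead cache per open fd."""

    def __init__(self):
        self.caches = {}

    def open(self, fd):
        self.caches[fd] = {}

    def cache_page(self, fd, offset, data):
        self.caches[fd][offset] = data

    def read_cached(self, fd, offset):
        return self.caches[fd].get(offset)

    def write(self, fd, offset, data):
        # A write through any fd may change data cached for every fd of the
        # same file, so throw away all of the caches, not just this fd's.
        for cache in self.caches.values():
            cache.clear()
```

In this model a reader holding fd 3 no longer sees a stale page after a writer on fd 4 modifies the file.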
*	Revert "performance/write-behind: fix flush stuck by former failed writes"	Raghavendra G	2018-05-29	1	-7/+2
	This reverts commit 9340b3c7a6c8556d6f1d4046de0dbd1946a64963.
	operations/writes across different fds of the same file cannot be considered as independent. For eg., man 2 fsync states,
	<man 2 fsync>
	fsync() transfers ("flushes") all modified in-core data of (i.e., modified buffer cache pages for) the file referred to by the file descriptor fd to the disk device
	</man>
	This means fsync is an operation on file and fd is just a way to reach file. So, it has to sync writes done on other fds too. Patch 9340b3c7a6c, prevents this.
	The problem fixed by patch 9340b3c7a6c - a flush on an fd is hung on a failed write (held in cache for retrying) on a different fd - is solved in this patch by making sure __wb_request_waiting_on considers failed writes on any fd as dependent on flush/fsync on any fd (not just the fd on which writes happened) opened on the same file. This means failed writes on any fd are either synced or thrown away on witnessing flush/fsync on any fd of the same file.
	Change-Id: Iee748cebb6d2a5b32f9328aff2b5b7cbf6c52c05
	Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
	Updates: bz#1512691
*	glusterd: glusterd is releasing the locks before timeout	Sanju Rakonde	2018-05-28	6	-0/+58
	Problem: We introduced a lock timer in mgmt v3, which releases the lock 3 minutes after command execution. Some commands related to heal/profile take more time to execute; for these commands the timeout is set to 10 minutes. As the lock timer is set to 3 minutes, glusterd releases the lock after 3 minutes. That means locks are released before the command has completed its execution.
	Solution: Pass a timeout parameter from cli to glusterd when the default timeout value changes (i.e., the default timeout value can be changed through the command line, or for the commands related to profile/heal we change the default timeout value to 10 minutes). glusterd will set the lock timer timeout according to the timeout value passed.
	Change-Id: I7b7a9a4f95ed44aca39ef9d9907f546bca99c69d
	fixes: bz#1577731
	Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	performance/quick-read: Use generation numbers to avoid updating the cache with stale data	Raghavendra G	2018-05-28	2	-27/+51
	Thanks to Pranith for the example. Following is the race we are trying to solve with this patch.
	1) We have a file with content 'abc'
	2) lookup and writev which replaces 'abc' with 'def' comes. Lookup fetches abc but yet to update the cache, and then immediately writev is wound which zeros out the cache. Now lookup_cbk updates the buffer with 'abc' even though on disk it is 'def'. Now writev completes and returns to application.
	3) application does a readv which will be fetched from quick-read as 'abc'.
	Change-Id: I9a9cab9c99652aa6d17230a4fe4dc034ec502b1b
	BUG: 1390050
	Updates: bz#1390050
	Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
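The race in the numbered steps above, and the way a generation number defuses it, can be sketched as follows. This is a simplified Python model and not quick-read's implementation: `QuickReadCache` and its method names are hypothetical, and a single counter per file is an assumption.

```python
class QuickReadCache:
    """Toy cache that refuses data fetched before the latest write."""

    def __init__(self):
        self.generation = 0
        self.data = None

    def begin_lookup(self):
        # Snapshot the generation before the lookup is wound.
        return self.generation

    def on_write(self):
        # A write invalidates the cache and bumps the generation.
        self.generation += 1
        self.data = None

    def end_lookup(self, snapshot, content):
        if snapshot != self.generation:
            return False  # a write raced with this lookup: content is stale
        self.data = content
        return True


def demo():
    cache = QuickReadCache()
    snapshot = cache.begin_lookup()  # lookup fetches 'abc' from disk
    cache.on_write()                 # writev replaces 'abc' with 'def'
    accepted = cache.end_lookup(snapshot, b"abc")
    return accepted, cache.data
```

In this model the late lookup reply is discarded, so the next readv has to go to disk instead of being served 'abc' from the cache.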
*	cluster/dht: Increase failure count for lookup failure in remove-brick op	Susant Palai	2018-05-28	1	-3/+31
	An entry from readdirp might get renamed just before migration, leading to lookup failures. For such a lookup failure, the remove-brick process does not see any increment in the failure count. Though there is a warning message after remove-brick commit for the user to check the decommissioned brick for any files that are not migrated, it's better to increase the failure count so that the user can check the decommissioned bricks for files before commit.
	Note: This can result in false negative cases for rm -rf interaction with remove-brick op, where remove-brick shows a non-zero failed count, but the entry was actually deleted by the user.
	Fixes :bz#1580269
	Change-Id: Icd1047ab9edc1d5bfc231a1f417a7801c424917c
	fixes: bz#1580269
	Signed-off-by: Susant Palai <spalai@redhat.com>
*	feature/locks: Unwind response based on client version	Ashish Pandey	2018-05-28	1	-54/+88
	Change-Id: I6fc7755cca0d6f61cb775363618036228925842c
	fixes: bz#1570538
	Signed-off-by: Ashish Pandey <aspandey@redhat.com>
*	glusterd: memory leak in geo-rep status	Sanju Rakonde	2018-05-28	1	-2/+6
	Fixes: bz#1580352
	Change-Id: I9648e73090f5a2edbac663a6fb49acdb702cdc49
	Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	changelog: fix br-state-check.t failure for brick_mux	Mohit Agrawal	2018-05-25	2	-1/+39
	Problem: Sometimes br-state-check.t crashes while running with brick multiplexing, and a command in the test case takes 2 minutes to detach a brick.
	Solution: Update code in the changelog xlator to wait on all connections before cleaning up rpc threads, and clean up the rpc object only in the non brick mux scenario.
	BUG: 1577672
	Change-Id: I16e257c1e127744a815000b87bd8b7b8d9c51e1b
	fixes: bz#1577672
	Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
*	posix/ctime: Fix updating mtime to older time	Kotresh HR	2018-05-25	1	-5/+11
	With ctime feature enabled, the mtime is not updated when it's set to time older than the existing one. Fixed the same. But the ctime is not allowed to change to older dates.
	fixes: bz#1581035
	Change-Id: If520922df42d6ce084c8df3046c138f8367164e5
	Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	posix/ctime: Fix gfid heal on first lookup	Kotresh HR	2018-05-24	3	-27/+62
	With ctime feature enabled, the gfid is not healing on first lookup. The fresh file logic depends on ctime and it was fetching from backend instead of xattr with ctime feature enabled. Fixed the same.
	Also fixed a possible hang with inode lock
	Change-Id: I020875c0462b284d6fa0e68304a422fa3d6a3e73
	fixes: bz#1580532
	Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	storage/posix: use proper FOP for unwinding readdir(p)	Raghavendra Bhat	2018-05-24	1	-3/+8
	As of now, even for readdirp, posix is unwinding with readdir signature.
	Change-Id: I6440c8a253c5d78bbcc97043e4e6e208e3d47cd1
	fixes: bz#1581345
	Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
*	posix/ctime: Fix atime update for hardlink	Kotresh HR	2018-05-24	1	-8/+19
	With ctime feature enabled, atime is not being updated for a hardlink when the file is accessed. e.g., touch -a <hardlink_file> fails to update atime. This patch fixes the same.
	fixes: bz#1580529
	Change-Id: I2201c88d502d0070300a1f5023af1b36951284ec
	Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	cluster/dht: Fix rebalance log msg	N Balachandran	2018-05-24	1	-2/+2
	Corrected the name of the xattr and fixed the code to log an error only if op_errno is not ENODATA or ENOATTR.
	Change-Id: I42c5b1d838eec586ac7bed2471eb1d27ff09a9ea
	fixes: bz#1580238
	Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	sdfs: enable by default	Amar Tumballi	2018-05-24	2	-2/+26
	also provide an option for pass-through to enable/disable xlator
	fixes: #421
	Change-Id: Ie30a91ad09620db62ab07b797e23123fd1200d1f
	Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	ctime: Fix updating ctime in rename and unlink	Kotresh HR	2018-05-24	3	-13/+31
	1. Successful rename was not updating ctime. Fixed the same.
	2. Successful unlink when link count is more than 1 was not updating ctime. Fixed the same.
	3. Copy ctime and flags during frame copy.
	fixes: bz#1580020
	Change-Id: Ied47275a36aea60254b2add7a59128a9c83b3645
	Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	features/sdfs: implement readdirp	Raghavendra G	2018-05-24	1	-3/+144
	Since readdirp acts as a batched lookup for all dentries it reads, it has to synchronize with any entry operation within the directory being read.
	Change-Id: I923a6ebd21856dbaa5fa5db4a26a29b7b29b3159
	Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
	fixes: #421
*	cluster/ec: Fix pre-op xattrop management	Xavi Hernandez	2018-05-23	3	-32/+66
	Multiple pre-op xattrop can be simultaneously being processed. On the cbk it was checked if the fop was waiting for some specific data (like size and version) and, if so, it was assumed that this answer should contain that data. This is not true, since a fop can be waiting for some data, but it may come from the xattrop of another fop.
	This patch differentiates between needing some information and providing it.
	This is related to parallel writes. Disabling them fixed the problem, but also prevented concurrent reads. A change has been made so that disabling parallel writes still allows parallel reads.
	Fixes: bz#1578325
	Change-Id: I74772ad6b80b7b37805da93d5ec3ae099e96b041
	Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>