glusterfs.git -

	Commit message (Collapse)	Author	Age	Files	Lines
*	afr: assign gfid during name heal when no 'source' is present.	Ravishankar N	2018-12-03	1	-0/+149
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: If parent dir is in split-brain or has dirty xattrs set, and the file has gfid missing on one of the bricks, then name heal won't assign the gfid. Fix: Use the brick we select the gfid from as the 'source'. Note: Problem was found while trying to debug a split-brain issue on Cynthia Zhou's setup. updates: bz#1637249 Change-Id: Id088d4f0fb017aa35122de426654194e581ed742 Reported-by: Cynthia Zhou <cynthia.zhou@nokia-sbell.com> Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	tests/geo-rep: Add Arbiter volume test case	Harpreet Kaur	2018-11-28	2	-0/+439
\| \| \| \| \| \| \| \| \|	Added geo-rep regression tests with Arbiter volume. Fixes: bz#1653565 Change-Id: Id99523c1f1d3d301fbe871aa0641d9ae4ed7b8d7 Signed-off-by: Harpreet Kaur <hlalwani@redhat.com>
*	cluster/afr: Add test for thin-arbiter feature	Ashish Pandey	2018-11-26	1	-0/+51
	Test: Check success/failure of write fop while different bricks/ta process are down. Change-Id: I3c376935df93ebf1f794c964bd19bc1280d91c59 updates: bz#1624332 Signed-off-by: Ashish Pandey <aspandey@redhat.com>
*	gfapi: Offload callback notifications to synctask	Soumya Koduri	2018-11-26	2	-0/+374
	Upcall notifications are received from the server via epoll, and the same thread is used to forward these notifications to the application. This may lead to a deadlock and hang in the following scenario: if, as part of handling these callbacks, the application has to perform operations that send I/Os to the gfapi stack, those I/Os in turn have to wait for epoll threads to receive the response. A deadlock therefore results if all the epoll threads are waiting to complete these callback notifications. To address it, instead of using the epoll thread itself, make use of synctask to send those notifications to the application. Change-Id: If614e0d09246e4279b9d1f40d883a32a39c8fd90 updates: bz#1648768 Signed-off-by: Soumya Koduri <skoduri@redhat.com>
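The hang described in this commit can be reproduced in miniature. The sketch below is not gfapi code: it models a single event thread that must both deliver an upcall and process the reply the callback is waiting for. All names (`run_demo`, `app_callback`, the message strings) are hypothetical, and a plain Python thread stands in for a synctask.

```python
import queue
import threading


def run_demo(offload: bool, timeout: float = 0.5) -> bool:
    """Return True if the application callback completed, False if it hung."""
    incoming = queue.Queue()          # messages arriving "from the server"
    reply_ready = threading.Event()   # set when the event thread handles a reply
    callback_done = threading.Event()

    def app_callback():
        # The application reacts to the upcall by issuing an I/O and then
        # waiting for its reply, which only the event thread can deliver.
        incoming.put("reply")
        if reply_ready.wait(timeout):
            callback_done.set()

    def event_thread():
        while True:
            msg = incoming.get()
            if msg == "stop":
                return
            if msg == "reply":
                reply_ready.set()
            elif msg == "upcall":
                if offload:
                    # Hand the callback to another thread (stand-in for a synctask).
                    threading.Thread(target=app_callback, daemon=True).start()
                else:
                    # Run the callback inline: the only event thread is now blocked.
                    app_callback()

    t = threading.Thread(target=event_thread, daemon=True)
    t.start()
    incoming.put("upcall")
    completed = callback_done.wait(timeout * 2)
    incoming.put("stop")
    t.join()
    return completed
```

With `offload=False` the callback times out because the thread that would process the reply is the one running the callback; with `offload=True` the event thread stays free and the callback completes.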
*	cluster/dht: sync brick root perms on add brick	N Balachandran	2018-11-19	1	-6/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a single brick is added to the volume and the newly added brick is the first to respond to a dht_revalidate call, its stbuf will not be merged into local->stbuf as the brick does not yet have a layout. The is_permission_different check therefore fails to detect that an attr heal is required as it only considers the stbuf values from existing bricks. To fix this, merge all stbuf values into local->stbuf and use local->prebuf to store the correct directory attributes. Change-Id: Ic9e8b04a1ab9ed1248b6b056e3450bbafe32e1bc fixes: bz#1648298 Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	lease: Treat unlk request as noop if lease not found	Soumya Koduri	2018-11-17	1	-2/+2
	When the glusterfs server recalls the lease, it expects the client to flush data and unlock the lease. If the client does not, the server revokes the lease once a timer (started when it sends the RECALL request) expires. This allows a race in which the client did send the UNLK lease request, but because of network delay it reached the server after the lease was revoked. To handle such situations, treat such requests as a noop and return success. Change-Id: I166402d10273f4f115ff04030ecbc14676a01663 updates: bz#1648768 Signed-off-by: Soumya Koduri <skoduri@redhat.com>
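A minimal sketch of the "late UNLK is a noop" rule, assuming a toy in-memory lease table. `LeaseTable` and its methods are illustrative names only and do not correspond to the leases translator's actual data structures.

```python
class LeaseTable:
    """Toy lease table: lease_id -> holder."""

    def __init__(self):
        self._leases = {}

    def grant(self, lease_id, holder):
        self._leases[lease_id] = holder

    def revoke(self, lease_id):
        # Server-side revocation after the recall timer expires.
        self._leases.pop(lease_id, None)

    def holds(self, lease_id):
        return lease_id in self._leases

    def unlock(self, lease_id):
        # An UNLK for a lease that is no longer present (for example, one that
        # was already revoked) is treated as a no-op and still reports success.
        self._leases.pop(lease_id, None)
        return 0  # 0 stands for success in this sketch
```

A client whose UNLK arrives after the revoke therefore sees the same success result as one whose UNLK arrived in time.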
*	ctime: Enable ctime feature by default	Kotresh HR	2018-11-11	3	-8/+0
	This patch does the following: 1. Enables the ctime feature by default. 2. Earlier, enabling the ctime feature required setting two options: a. gluster vol set <volname> utime on b. gluster vol set <volname> ctime on. This is inconvenient from a usability point of view, so it is changed to the following single option: a. gluster vol set <volname> ctime on fixes: bz#1624724 Change-Id: I04af0e5de1ea6126c58a06ba8a26e22f9f06344e Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	bd: remove from the build	Amar Tumballi	2018-11-08	1	-142/+0
	Based on the proposal to remove a few features that are not actively maintained [1], removed the BD (block device) translator from the build. [1] - https://lists.gluster.org/pipermail/gluster-users/2018-July/034400.html Updates: bz#1635688 Change-Id: Ia96db406c58a7aef355dde6bc33523bb2492b1a9 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	glupy: remove from the build	Amar Tumballi	2018-11-08	1	-31/+0
	Based on the proposal to remove a few features that are not actively maintained [1], removing the 'glupy' translator from the build. [1] https://lists.gluster.org/pipermail/gluster-users/2018-July/034400.html This patch aims at clearing the translator from the build and tests. A followup is needed to remove the code from the repository. Updates: bz#1642810 Change-Id: I41d0c1956330c3bbca62c540ccf9ab01bbf3a092 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	tests/interrupt.t: remove 'stripe' volume type	Amar Tumballi	2018-11-06	1	-1/+1
\| \| \| \| \| \| \| \| \|	Merged the patch which introduced this testcase after the 'remove stripe' patch got merged, and hence the confusion. Updates: bz#1193929 Change-Id: Ia08552debb111292caf14e51ea6a27334fe5c788 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	fuse: diagnostic FLUSH interrupt	Csaba Henk	2018-11-06	2	-0/+94
	We add dummy interrupt handling for the FLUSH fuse message. It can be enabled by the "--fuse-flush-handle-interrupt" hidden command line option, or the "-ofuse-flush-handle-interrupt=yes" mount option. It serves no purpose other than diagnostics and demonstration: to exercise the interrupt handling framework a bit and to give a usage example. Documentation is also provided that showcases interrupt handling via FLUSH. Change-Id: I522f1e798501d06b74ac3592a5f73c1ab0590c60 updates: #465 Signed-off-by: Csaba Henk <csaba@redhat.com>
*	glusterd: coverity fixes	Atin Mukherjee	2018-11-03	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \|	Addresses CIDs : 1124769, 1124852, 1124864, 1134024, 1229876, 1382382 Also addressed a spurious failure in tests/bugs/glusterd/df-results-post-replace-brick-operations.t to ensure post replace brick operation and before triggering 'df' from mount, client has connection to the newly replaced bricks. Change-Id: Ie5d7e02f89400a661491d7fc2a120d6f6a83a1cc Updates: bz#789278 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
*	tiering: remove the translator from build and glusterd	Amar Tumballi	2018-11-02	25	-2078/+1
	Based on the proposal to remove a few features that are not actively maintained [1], removing the tier translator from the build. Also make sure no regression tests involving the tiering feature are present. [1] https://lists.gluster.org/pipermail/gluster-users/2018-July/034400.html Change-Id: I2c177f711f9b54b7b24e1a13525ff3132bd9a9c5 updates: bz#1642807 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	glusterd: set fsid while performing replace brick	Sanju Rakonde	2018-11-02	1	-0/+58
\| \| \| \| \| \| \| \| \| \|	While performing the replace-brick operation, we should set fsid value to the new brick. fixes: bz#1637196 Change-Id: I9e9a4962fc0c2f5dff43e4ac11767814a0c0beaf Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	tests: brick-mux-fd-cleanup.t should be under core directory	Atin Mukherjee	2018-10-31	1	-0/+0
\| \| \| \| \| \|	Fixes: bz#1637934 Change-Id: I5f95beab62bd2bdde3bbee94c308b0ad03e94379 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
*	stripe: remove the translator from build and glusterd	Amar Tumballi	2018-10-31	25	-331/+30
	Based on the proposal to remove a few features that are not actively maintained [1], removing the stripe translator from the build. Also make sure there are no regression tests involving the stripe translator. [1] https://lists.gluster.org/pipermail/gluster-users/2018-July/034400.html Note that this patch aims at removing the translator from the build, and a followup patch is needed to remove the code from the repository. Updates: bz#1364707 Change-Id: I235b305338f138e29e9f30cba65bc0dadbebbbd5 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	afr: thin-arbiter 2 domain locking and in-memory state	Ravishankar N	2018-10-25	1	-5/+5
	2 domain locking + xattrop for write-txn failures:
	--------------------------------------------------
	- A post-op wound on TA takes the AFR_TA_DOM_NOTIFY range lock and the AFR_TA_DOM_MODIFY full lock, does xattrop on TA, releases the AFR_TA_DOM_MODIFY lock, and stores in memory which brick is bad.
	- All further write txn failures are handled based on this in-memory value without querying the TA.
	- When shd heals the files, it does so by requesting a full lock on the AFR_TA_DOM_NOTIFY domain. The client uses this as a cue (via upcall), releases the AFR_TA_DOM_NOTIFY range lock and invalidates its in-memory notion of which brick is bad. The next write txn failure is wound on TA to again update the in-memory state.
	- Any write txns still incomplete when the AFR_TA_DOM_NOTIFY upcall release request is received are completed before the lock is released.
	- Any write txns received after the release request are maintained in a ta_waitq.
	- After the release is complete, the ta_waitq elements are spliced to a separate queue which is then processed one by one.
	- For fops that come in parallel when the in-memory bad brick is still unknown, only one is wound to TA on wire. The other ones are maintained in a ta_onwireq which is then processed after we get the response from TA.
	Change-Id: I32c7b61a61776663601ab0040e2f0767eca1fd64 updates: bz#1579788 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Signed-off-by: Ashish Pandey <aspandey@redhat.com>
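The last point of this commit message (only one query goes to the TA while parallel fops wait in ta_onwireq) is a single-flight pattern. The following is a rough sketch of that idea only, not AFR code: `TAState`, `write_failed`, `ta_replied` and `invalidate` are hypothetical names, locking is omitted, and the TA reply is delivered by calling `ta_replied` by hand.

```python
class TAState:
    """Toy model of the in-memory bad-brick state kept by a client."""

    def __init__(self):
        self.bad_brick = None        # unknown until the TA answers
        self.query_in_flight = False
        self.wound_fop = None        # the single fop wound to the TA
        self.onwireq = []            # fops parked while that query is on wire
        self.ta_queries = 0          # how many times the TA was asked
        self.completed = []          # (fop, bad_brick) in completion order

    def write_failed(self, fop):
        if self.bad_brick is not None:
            # In-memory state is valid: no need to go to the TA.
            self.completed.append((fop, self.bad_brick))
        elif self.query_in_flight:
            # Someone is already asking the TA: wait for that answer.
            self.onwireq.append(fop)
        else:
            self.query_in_flight = True
            self.wound_fop = fop
            self.ta_queries += 1

    def ta_replied(self, bad_brick):
        self.bad_brick = bad_brick
        self.query_in_flight = False
        parked, self.onwireq = self.onwireq, []
        for fop in [self.wound_fop] + parked:
            self.completed.append((fop, bad_brick))
        self.wound_fop = None

    def invalidate(self):
        # Models the upcall-triggered invalidation when shd starts healing.
        self.bad_brick = None
```

Three parallel failures therefore cost one TA round trip, and a later failure costs none until the state is invalidated.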
*	glusterd: ensure volinfo->caps is set to correct value.	Sanju Rakonde	2018-10-25	1	-0/+9
	With commit febf5ed4848, during the volume create op, volinfo->caps is set to 0 only if any of the bricks belongs to the same node and brickinfo->vg[0] is null. Before that commit, volinfo->caps was set to 0 when either the brick doesn't belong to the same node or brickinfo->vg[0] is null. This patch restores that earlier behavior: volinfo->caps is set to 0 when either the brick doesn't belong to the same node or brickinfo->vg[0] is null. fixes: bz#1635820 Change-Id: I00a97415786b775fb088ac45566ad52b402f1a49 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	tests: correction in tests/bugs/glusterd/optimized-basic-testcases-in-cluster.t	Sanju Rakonde	2018-10-25	1	-9/+18
	Patch https://review.gluster.org/#/c/glusterfs/+/19135/ optimised the glusterd test cases by clubbing similar test cases into a single test case. The test case https://review.gluster.org/#/c/glusterfs/+/19135/15/tests/bugs/glusterd/bug-1293414-import-brickinfo-uuid.t was deleted and added as a part of tests/bugs/glusterd/optimized-basic-testcases-in-cluster.t. In the original test case, we create a volume with two bricks, each on a separate node (N1 & N2). From another node in the cluster (N3), we try to detach a node that is hosting bricks. It fails. In the new test, we created a volume with a single brick on N1, and from another node in the cluster we tried to detach N1. We expected peer detach to fail, but it succeeded because the node hosts all the bricks of the volume. This patch changes the new test case to cover the original test case scenario. Please refer to https://bugzilla.redhat.com/show_bug.cgi?id=1642597#c1 to understand why the new test case is not failing in centos-regression. fixes: bz#1642597 Change-Id: Ifda12b5677143095f263fbb97a6808573f513234 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	cluster/ec : Prevent volume create without redundant brick	Sunil Kumar Acharya	2018-10-24	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: EC volumes can be created without any redundant brick. Solution: Updated the conditional check to avoid volume create without redundant brick. fixes: bz#1642448 Change-Id: I0cb334b1b9378d67fcb8abf793dbe312c3179c0b Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
*	tests: check for shd up status in bug-1637802-arbiter-stale-data-heal-lock.t	Ravishankar N	2018-10-22	1	-0/+1
	Problem: https://review.gluster.org/#/c/glusterfs/+/21427/ seems to be failing this .t spuriously. On checking one of the failure logs, I see: 22:05:44 Launching heal operation to perform index self heal on volume patchy has been unsuccessful: 22:05:44 Self-heal daemon is not running. Check self-heal daemon log file. 22:05:44 not ok 20 , LINENUM:38 In glusterd log: [2018-10-18 22:05:44.298832] E [MSGID: 106301] [glusterd-syncop.c:1352:gd_stage_op_phase] 0-management: Staging of operation 'Volume Heal' failed on localhost : Self-heal daemon is not running. Check self-heal daemon log file But the tests which precede this check, via a statedump, whether the shd is connected to the bricks, and they have succeeded and healing has even started. From glustershd.log: [2018-10-18 22:05:40.975268] I [MSGID: 108026] [afr-self-heal-common.c:1732:afr_log_selfheal] 0-patchy-replicate-0: Completed data selfheal on 3b83d2dd-4cf2-4ea3-a33e-4275be40f440. sources=[0] 1 sinks=2 So the only reason I can see for launching heal via the CLI to fail is a race where shd has been spawned but glusterd has not yet updated its in-memory state to show that it is up, and hence fails the CLI. Fix: Check for shd up status before launching heal via CLI Change-Id: Ic88abf14ad3d51c89cb438db601fae4df179e8f4 fixes: bz#1641344 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	api: fill out attribute information if not valid	Raghavendra Gowdappa	2018-10-17	2	-0/+137
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	translators like readdir-ahead selectively retain entry information of iatt (gfid and type) when rest of the iatt is invalidated (for write invalidating ia_size, (m)(c)times etc). Fuse-bridge uses this information and sends only entry information in readdirplus response. However such option doesn't exist in gfapi. This patch modifies gfapi to populate the stat by forcing an extra lookup. Thanks to Shyamsundar Ranganathan <srangana@redhat.com> and Prashanth Pai <ppai@redhat.com> for tests. Change-Id: Ieb5f8fc76359c327627b7d8420aaf20810e53000 Fixes: bz#1630804 Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com> Signed-off-by: Soumya Koduri <skoduri@redhat.com>
*	gfapi: Bug fixes in leases processing code-path	Soumya Koduri	2018-10-16	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes below issues in gfapi lease code-path * 'glfs_setfsleasid' should allow NULL input to be able to reset leaseid * Applications should be allowed to (un)register for upcall notifications of type GLFS_EVENT_LEASE_RECALL * APIs added to read contents of GLFS_EVENT_LEASE_RECALL argument which is of type "struct glfs_upcall_lease" Change-Id: I3320ddf235cc82fad561e13b9457ebd64db6c76b updates: #350 Signed-off-by: Soumya Koduri <skoduri@redhat.com>
*	features/shard: Hold a ref on base inode when adding a shard to lru list	Krutika Dhananjay	2018-10-16	5	-7/+106
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In __shard_update_shards_inode_list(), previously shard translator was not holding a ref on the base inode whenever a shard was added to the lru list. But if the base shard is forgotten and destroyed either by fuse due to memory pressure or due to the file being deleted at some point by a different client with this client still containing stale shards in its lru list, the client would crash at the time of locking lru_base_inode->lock owing to illegal memory access. So now the base shard is ref'd into the inode ctx of every shard that is added to lru list until it gets lru'd out. The patch also handles the case where none of the shards associated with a file that is about to be deleted are part of the LRU list and where an unlink at the beginning of the operation destroys the base inode (because there are no refkeepers) and hence all of the shards that are about to be deleted will be resolved without the existence of a base shard in-memory. This, if not handled properly, could lead to a crash. Change-Id: Ic15ca41444dd04684a9458bd4a526b1d3e160499 updates: bz#1605056 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
*	core: glusterfsd keeping fd open in index xlator	Mohit Agrawal	2018-10-12	1	-0/+78
	Problem: When processing GF_EVENT_PARENT_DOWN, a brick xlator forwards the event to the next xlator only once it has ensured that no stub is in progress. The io-thread xlator, however, decreases stub_cnt before it has processed a stub, and so notifies the next xlator of the event too early. Solution: Introduce a new counter to save stub_cnt, and decrease the counter only after the stub has been processed completely at the io-thread xlator. To avoid a brick crash when calling xlator_mem_cleanup, move only the brick xlator if the detached brick name is found in the graph. Note: Thanks to pranith for sharing a simple reproducer. fixes: bz#1637934 Change-Id: I1a694a001f7a5417e8771e3adf92c518969b6baa Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	afr: prevent winding inodelks twice for arbiter volumes	Ravishankar N	2018-10-10	1	-0/+44
	Problem: In an arbiter volume, if there is a pending data heal of a file only on the arbiter brick, self-heal takes inodelks twice due to a code bug but unlocks only once, leaving behind a stale lock on the brick. This causes the next write to the file to hang. Fix: Fix the code bug to take the lock only once. This bug was introduced in master with commit eb472d82a083883335bc494b87ea175ac43471ff Thanks to Pranith Kumar K <pkarampu@redhat.com> for finding the RCA. fixes: bz#1637802 Change-Id: I15ad969e10a6a3c4bd255e2948b6be6dcddc61e1 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
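Why a double lock with a single unlock hangs the next writer can be seen with a counted lock. This is only an illustration of the failure mode, assuming a toy `InodeLock` class and a `data_heal` function invented for the example; it does not reflect the actual AFR self-heal code path.

```python
class InodeLock:
    """Toy counted lock standing in for an inodelk held on one brick."""

    def __init__(self):
        self.count = 0

    def lock(self):
        self.count += 1

    def unlock(self):
        if self.count == 0:
            raise RuntimeError("unlock without a matching lock")
        self.count -= 1

    @property
    def stale(self):
        # Anything still held after the heal finished is a stale lock.
        return self.count > 0


def data_heal(lock, arbiter_only_sink, take_lock_twice):
    """Run a pretend data heal and report whether a stale lock is left behind."""
    lock.lock()
    if arbiter_only_sink and take_lock_twice:
        lock.lock()  # models the duplicate acquisition described above
    # ... the heal itself would happen here ...
    lock.unlock()    # released only once
    return lock.stale
```

With the duplicate acquisition the function reports a stale lock, which is what blocks the next write; taking the lock once leaves nothing behind.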
*	cli: memory leak issues reported by asan	Amar Tumballi	2018-10-09	1	-1/+1
\| \| \| \| \| \| \| \| \|	With this fix, a run on 'rpc-coverage.t' passes properly. This should help to get started with other fixes soon! Change-Id: I257ae4e28b9974998a451d3b490cc18c02650ba2 updates: bz#1633930 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	tests: add get-state command to test	Sanju Rakonde	2018-10-07	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	When geo-replication session is running, run "gluster get-state" command to test. https://review.gluster.org/#/c/glusterfs/+/20461/ patch fixes glusterd crash, when we run get-state command with geo-rep session configured. Adding the test now. Fixes: bz#1598345 Change-Id: I56283fba2c782f83669923ddfa4af3400255fed6 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	Reduce execution time of bug-1559004-EMLINK-handling.t	Xavi Hernandez	2018-10-04	1	-12/+51
\| \| \| \| \| \| \| \| \| \| \|	This patch reduces the execution time of bug-1559004-EMLINK-handling.t from ~14 minutes to ~90 seconds. To do so, it creates some fake hard links directly on the brick instead of creating them through the volume. Change-Id: I9715ff1a4eba47574c733d4f28e68f42f56a7d3f updates: bz#1193929 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	rpc: make binding to port 0 as the default if no option is provided	Amar Tumballi	2018-10-02	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Right now, if no option is provided, the default port is assumed, which is 24007. Ideally, for 'glusterfsd' processes, it is better to not assume there are any ports given, so it can start listening on any port which is available. This helps us to cleanup the dependencies on glusterd from glusterfsd at the moment. No changes would be done to glusterd code, but making the right defaults helps to make glusterfsd more independent process later. NOTE: This patch is a reduced version of below set of patches: * https://review.gluster.org/14613/ & * https://review.gluster.org/14670/ & * https://review.gluster.org/14671/ Credits: Prasanna Kumar Kalever <pkalever@redhat.com> updates: bz#1343926 Change-Id: Ib874e10505e7366dc56ba754458252b67052e653 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	python3: assume python3 unless building _packages_ on sys without py3	Kaleb S. KEITHLEY	2018-09-27	10	-10/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The jenkins release-new job runs on a CentOS 7 box, which does not have python3. As a result it runs (autogen.sh and) configure before producing the dist tar file, converting all the python3 shebangs to python2 shebangs in the dist tar file. Then when that tar file is "carried" to, e.g. Fedora koji build system to build packages, the shebangs are incorrect, despite having originally been correct in the git repo. Change-Id: I5154baba3f6d29d3c4823bafc2b57abecbf90e5b updates: #411 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
*	ctime: Provide noatime option	Kotresh HR	2018-09-25	1	-0/+51
	Most applications are {c\|m}time dependent and very few are atime dependent. So provide a noatime option to not update atime when the ctime feature is enabled. This option also has to be enabled with the ctime feature to avoid unnecessary self heal: since AFR/EC reads data from a single subvolume, atime is only updated in one subvolume, triggering self heal. updates: bz#1593538 Change-Id: I085fb33c882296545345f5df194cde7b6cbc337e Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	cluster/ec: Fix failure of tests/basic/ec/ec-1468261.t	Ashish Pandey	2018-09-25	1	-0/+1
	Problem: In this test we are relying on the eager-lock time duration of 1 second to delay the post op + unlock phase of an entry fop, so that in this 1 second we can kill 2 bricks and dirty could be set on the directory. Solution: To fix this issue, we should set the others.eager-lock option to "ON" explicitly at the beginning of this test. Change-Id: I19bbb9c15d7bdf96a96b20587c618192d0b740ef fixes: bz#1632161 Signed-off-by: Ashish Pandey <aspandey@redhat.com>
*	afr: fix incorrect reporting of directory split-brain	Ravishankar N	2018-09-21	2	-1/+63
	Problem: When a directory has dirty xattrs due to failed post-ops or when replace/reset brick is performed, AFR does a conservative merge as expected, but heal-info reports it as split-brain because there are no clear sources. Fix: Modify the pending flag to contain information about pending heals and split-brains. For directories, if the split-brain flag is not set, just show them as needing heal and not being in split-brain. Fixes: bz#1626994 Change-Id: I09ef821f6887c87d315ae99e6b1de05103cd9383 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	tests: fix test case failure	Sanju Rakonde	2018-09-20	1	-1/+1
	tests/bugs/glusterd/bug-1595320.t is failing in downstream. In the downstream repo, enabling brick multiplexing was made interactive, so it throws a prompt for user input. As no input is provided during test case execution, the test fails. Using the CLI macro instead of the gluster command bypasses the interactive prompts, so replacing the gluster command with the CLI macro addresses the issue. Change-Id: I6b39052d8e415a8ed08de7c80a91dadce155146a updates: bz#1193929 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	cluster/afr: Use 2 domain locking in SHD for thin-arbiter	karthik-us	2018-09-20	2	-0/+230
	With this change, when SHD starts the index crawl it requests all the clients to release the AFR_TA_DOM_NOTIFY lock, so that clients know their in-memory state is no longer valid and any new operation needs to query the thin-arbiter if required. When SHD completes healing all the files without any failure, it again takes the AFR_TA_DOM_NOTIFY lock and gets the xattrs on TA to see whether any new failures have happened in the meantime. If new failures are marked on TA, SHD starts the crawl immediately to heal those failures as well. If there are no new failures, SHD takes the AFR_TA_DOM_MODIFY lock and unsets the xattrs on TA, so that both data bricks are considered good thereafter. Change-Id: I037b89a0823648f314580ba0716d877bd5ddb1f1 fixes: bz#1579788 Signed-off-by: karthik-us <ksubrahm@redhat.com>
*	dht: utilize the framework to pass-through xlator tasks	Amar Tumballi	2018-09-19	3	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Also fixes the issue caused due to not converting back the fn function to after getting its address. We wanted the value of the field, not the address of the pt_fop field. With this patch, DHT will always be started in pass-through mode if the number of subvols is just 1. Fixes some tests to make sure DHT is in full config (ie, subvols > 1). - increased timeout of brick-mux test as it was bordering on 300 seconds. - Also change the volume type to supported 'replica 3' from 'replica 2'. - also no DHT tests should assume presence of DHT when there is just 1 brick in volume Credits: Nithya B <nbalacha@redhat.com> fixes: #405 Change-Id: I8e55239ce58d6ac6ae1901e2e384be1ecbd33d6e Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	tests/dht: Uncomment cleanup steps	N Balachandran	2018-09-18	1	-5/+5
\| \| \| \| \| \| \| \| \|	I had forgotten to uncomment the cleanup steps for file-create.t. Fixed. Change-Id: Id702b99b8e09f56b7333491a477828b4a37b2687 updates: bz#1628194 Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	tests: fixes to bug-1015990-rep.t	Ravishankar N	2018-09-18	1	-14/+7
\| \| \| \| \| \| \| \| \| \| \|	- check that the shd is connected to brick before running statistics command - remove sleep statements - remove unneeded ($count-$value==0) test when it is known that both values will be same Fixes: bz#1625850 Change-Id: Ifcd4887f0238031e5bca803cd9bfdb75a6e6c01b Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	geo-rep: Fix issues related config set	Kotresh HR	2018-09-18	1	-0/+4
	1. The '--ignore-missing-args' option for rsync is not being used even though the rsync version is greater than 3.1.0. Fixed the same. 2. The '--existing' option for rsync is also not being used. Fixed the same. 3. geo-rep config fails to set rsync-options as the value contains '--'. Interestingly, python argparse treats a value starting with '--' (e.g., --ignore-missing-args) as an option. But when passed as something like --value=--ignore-missing-args, it succeeds. Fixed the same. Change-Id: Iaeb838acaff1c2920fee9c7f920c99edce13a0a1 Signed-off-by: Kotresh HR <khiremat@redhat.com> fixes: bz#1629561
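The argparse behavior mentioned in point 3 is easy to reproduce with the standard library alone. The parser below is a hypothetical stand-in (the `name` positional and `--value` option are assumptions for illustration, not gsyncd's real argument layout); only the `argparse` behavior itself is the point.

```python
import argparse


def build_parser():
    # Hypothetical parser shaped like "config-set <name> --value <value>".
    parser = argparse.ArgumentParser(prog="config-set-sketch")
    parser.add_argument("name")
    parser.add_argument("--value")
    return parser


def parse(argv):
    """Return the parsed namespace, or None if argparse rejects the input."""
    try:
        return build_parser().parse_args(argv)
    except SystemExit:
        # argparse reports usage errors by printing to stderr and exiting.
        return None


# Separate token: the value looks like another option, so --value gets no argument.
rejected = parse(["rsync-options", "--value", "--ignore-missing-args"])

# Joined with '=': the value is bound explicitly to --value.
accepted = parse(["rsync-options", "--value=--ignore-missing-args"])
```

Here `rejected` is `None` (argparse complains that `--value` expected one argument), while `accepted.value` holds the string `--ignore-missing-args`.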
*	tests/dht: Add tests for file create	N Balachandran	2018-09-17	2	-0/+159
\| \| \| \| \| \| \| \|	Test dht file creates Change-Id: I7aba710f4911432bd3b86834efecae8f01e4052f updates: bz#1628194 Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	Land part 2 of clang-format changes	Gluster Ant	2018-09-12	65	-7508/+7651
\| \| \| \| \|	Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4 Signed-off-by: Nigel Babu <nigelb@redhat.com>
*	Land clang-format changes	Gluster Ant	2018-09-12	2	-83/+108
\| \| \| \|	Change-Id: I6f5d8140a06f3c1b2d196849299f8d483028d33b
*	misc: fix misc. shebangs	Kaleb S. KEITHLEY	2018-09-11	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* One #!/usr/bin/env python and three #!/usr/bin/python were overlooked in all the other python fixups. Ugh. * Two new python files missed the memo about #!/usr/bin/python3. * One #!/usr/bin/env bash. Various distribution packaging policies have strong wording about the use of #!/usr/bin/env ... Note: this patch does not change the use of #!/usr/bin/env bash in the two files extras/{clang-checker.sh,check_goto.pl} as these are not included in any packages. (Although I'm not actually sure why anyone would ever use '/usr/bin/env {sh,bash}' as I'm not aware of any version-specific differences like there are with, e.g., python.) * One #!/usr/bin/bash. On Fedora and CentOS > 6, /bin is a symlink to /usr/bin, so it makes little difference. But Debian & Ubuntu still have separate /bin and /usr/bin; and sh and bash are in /bin, not /usr/bin. (Historically, in BSD and SYSV Unix it was /bin/sh.) Note: Fedora and CentOS package build runs a script that converts all /bin/sh and /bin/bash to /usr/bin/sh and /usr/bin/bash. Change-Id: I9171265829af78dd0cd7622c22b56d22179ff8a3 updates: bz#1193929 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
*	glusterd: avoid using glusterd's working directory as a brick	Sanju Rakonde	2018-09-08	1	-0/+9
\| \| \| \| \| \| \| \| \|	Adding checks for avoiding glusterd's working directory used as a brick for volume creation. fixes: bz#853601 Change-Id: I4b16a05f752e92216aa628f542a4fdbf59b3c669 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	cluster/dht: Rework the debug xattr to get hashed subvol	N Balachandran	2018-09-07	1	-0/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The earlier implementation required the file to already exist when trying to get the hashed subvol. The reworked implementation allows a user to get the hashed subvol for any filename, whether it exists or not. Usage: getfattr -n "dht.file.hashed-subvol.<filename>" <parent dir> Eg:To get the hashed subvol for file-1 inside dir-1 getfattr -n "dht.file.hashed-subvol.file-1" /mnt/gluster/dir1 credit: rgowdapp@redhat.com Change-Id: Iae20bd5f56d387ef48c1c0a4ffa9f692866bf739 fixes: bz#1624244 Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	io-stats: dump io-stats info in /var/run/gluster	Amar Tumballi	2018-09-05	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It wouldn't make sense to allow iostats file to be written in any directory. While the formating makes sure we try to append io-stats-name for the file, so overwriting existing file is slim, but in any case it makes sense to restrict dumping to one directory. Below are the sample commands, and files created for the corresponding values: $ setfattr -n trusted.io-stats-dump -v file-for-dump $M0 In this case, the file would be in /var/run/gluster/file-for-dump $ setfattr -n trusted.io-stats-dump -v /dir1/dir2/file-for-dump $M0 In this case, then the dump file is in /var/run/gluster/dir1-dir2-file-for-dump Note that the value passed for this virtual xattr would be treated as a file, and even if the value has '/' in it, it would be changed to '-' for sanity. Fixes: bz#1625106 Change-Id: Id9ae6a40a190b8937c51662e6e1c2a0f6c86a0e0 Signed-off-by: Amar Tumballi <amarts@redhat.com>
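The mapping from xattr value to dump file described in this commit can be sketched as a small pure function. This is a guess at the rule based only on the two examples in the message: dropping the leading '/' before substituting '-' is an assumption inferred from the second example, and `dump_path` and `DUMP_DIR` are illustrative names, not io-stats identifiers.

```python
import posixpath

DUMP_DIR = "/var/run/gluster"  # directory named in the commit message


def dump_path(value: str) -> str:
    """Map a trusted.io-stats-dump value to a file directly under DUMP_DIR."""
    # Assumption: leading slashes are dropped, remaining ones become '-'.
    flat_name = value.lstrip("/").replace("/", "-")
    return posixpath.join(DUMP_DIR, flat_name)
```

Under these assumptions `dump_path("file-for-dump")` gives `/var/run/gluster/file-for-dump` and `dump_path("/dir1/dir2/file-for-dump")` gives `/var/run/gluster/dir1-dir2-file-for-dump`, matching the two cases in the message. Empty or all-slash values are not handled in this sketch.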
*	afr: thin-arbiter read txn changes	Ravishankar N	2018-09-05	2	-0/+65
	If both data bricks are up, read subvol will be based on read_subvols. If only one data brick is up: - First query the data-brick that is up. If it blames the other brick, allow the reads. - If it doesn't, query the TA to obtain the source of truth. TODO: See if in-memory state can be maintained for read txns (BZ 1624358). updates: bz#1579788 Change-Id: I61eec35592af3a1aaf9f90846d9a358b2e4b2fcc Signed-off-by: Ravishankar N <ravishankar@redhat.com>
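The read-transaction decision above can be written as a small function. This is a sketch under stated assumptions rather than the AFR implementation: the function name and parameters are hypothetical, and the commit message does not say what happens after the TA is consulted, so failing the read when the TA marks the live brick as bad is an assumption made here.

```python
from typing import Callable, Optional, Tuple


def choose_read_brick(
    up: Tuple[bool, bool],
    read_subvol: int,
    up_brick_blames_other: bool,
    query_ta_bad_brick: Callable[[], Optional[int]],
) -> Optional[int]:
    """Return the data brick index (0 or 1) to read from, or None to fail."""
    if up[0] and up[1]:
        # Both data bricks are up: use the normal read_subvol choice.
        return read_subvol
    if not (up[0] or up[1]):
        return None
    live = 0 if up[0] else 1
    if up_brick_blames_other:
        # The live brick has pending markers against the other: it is good.
        return live
    # No blame recorded locally: ask the thin-arbiter for the source of truth.
    bad = query_ta_bad_brick()
    # Assumption: if the TA says the live brick is the bad one, fail the read.
    return None if bad == live else live
```

Note that the TA callable is only invoked on the last path, which mirrors the intent of keeping the thin-arbiter out of the common case.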
*	New flag to glusterfsd binary to print libexec dir	Aravinda VK	2018-09-05	2	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \|	New CLI option for `glusterfsd` binary to get the path of libexec directory. This helps glusterd2 to detect the installed path of `gsyncd` and other binaries. Usage: `glusterfsd --print-libexecdir` Updates: bz#1193929 Change-Id: I8c1a74afd9acec7ee7bd3deabed9d9f20fe3fb5f Signed-off-by: Aravinda VK <avishwan@redhat.com>
*	multiple files: calloc -> malloc	Yaniv Kaul	2018-09-04	3	-4/+4
	xlators/cluster/stripe/src/stripe-helpers.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	xlators/cluster/dht/src/tier.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	xlators/cluster/dht/src/dht-layout.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	xlators/cluster/dht/src/dht-helper.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	xlators/cluster/dht/src/dht-common.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	xlators/cluster/afr/src/afr.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	xlators/cluster/afr/src/afr-inode-read.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	tests/bugs/replicate/bug-1250170-fsync.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	tests/basic/gfapi/gfapi-async-calls-test.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	tests/basic/ec/ec-fast-fgetxattr.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	rpc/xdr/src/glusterfs3.h: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	rpc/rpc-transport/socket/src/socket.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	rpc/rpc-lib/src/rpc-clnt.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	extras/geo-rep/gsync-sync-gfid.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	cli/src/cli-xml-output.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	cli/src/cli-rpc-ops.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	cli/src/cli-cmd-volume.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	cli/src/cli-cmd-system.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	cli/src/cli-cmd-snapshot.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	cli/src/cli-cmd-peer.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	cli/src/cli-cmd-global.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	It doesn't make sense to calloc (allocate and clear) memory when the code right away fills that memory with data. It may be optimized by the compiler, or have a microscopic performance improvement. In some cases, also changed allocation size to be sizeof some struct or type instead of a pointer - easier to read. In some cases, removed redundant strlen() calls by saving the result into a variable. 1. Only done for the straightforward cases. There's room for improvement. 2. Please review carefully, especially for string allocation, with the terminating NULL string. Only compile-tested! updates: bz#1193929 Original-Author: Yaniv Kaul <ykaul@redhat.com> Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Signed-off-by: Amar Tumballi <amarts@redhat.com> Change-Id: I16274dca4078a1d06ae09a0daf027d734b631ac2