glusterfs.git -

	Commit message (Collapse)	Author	Age	Files	Lines
*	tests: do not truncate file offsets and sizes to 32-bit	Dmitry Antipov	2020-04-15	3	-6/+9
\| \| \| \| \| \| \| \| \| \|	Do not truncate file offsets and sizes to 32-bit to prevent tests from spurious failures on >2Gb files. Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Change-Id: I2a77ea5f9f415249b23035eecf07129f19194ac2 Fixes: #1161
*	test: Fix test "bug-1064148" to pass in mux	Barak Sason Rofman	2020-04-13	1	-10/+11
\| \| \| \| \| \| \| \|	Parts of the test weren't designed to run in mux mode, this is now fixed Change-Id: I428c2fcce6d047e324ca5dcaef677ee1794e3dfe updates: #1154 Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
*	tests: Fix spurious failure of ↵	Mohit Agrawal	2020-04-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	tests/bugs/glusterd/serialize-shd-manager-glusterd-restart.t Problem: Sometime volume status is failed after restart glusterd in one cluster node Solution: Wait to finish glusterd handshake on down cluster node Change-Id: Ib23ca41c943caf2903c61ebf42dc437c1b9d6054 Fixes: #1158 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	posix: Avoid dict_del logs in posix_is_layout_stale while key is NULL	Mohit Agrawal	2020-04-09	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Problem: The key "GF_PREOP_PARENT_KEY" has been populated by dht and for non-distribute volume like 1x3 key is not populated so posix_is_layout stale throw a message while a file is created Solution: To avoid a log put a condition before delete a key Change-Id: I813ee7960633e7f9f5e9ad2f42f288053d9eb71f Fixes: #1150 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	mount/fuse: Wait for 'mount' child to exit before dying	Pranith Kumar K	2020-04-09	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: tests/bugs/protocol/bug-1433815-auth-allow.t fails sometimes because of stale mount. This stale mount comes into picture when parent process dies without waiting for the child process which mounts fuse fs to die Fix: Wait for mounting child process to die before dying. Fixes: #1152 Change-Id: I8baee8720e88614fdb762ea822d5877973eef8dc Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	dht - fixing a permission update issue	Barak Sason Rofman	2020-04-08	1	-0/+71
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When bringing back a downed brick and performing lookup from the client side, the permission on said brick aren't updated on the first lookup, but only on the second. This patch modifies permission update logic so the first lookup will trigger a permission update on the downed brick. LIMITATIONS OF THE PATCH: As the choice of source depends on whether the directory has layout or not. Even the directories on the newly added brick will have layout xattr[zeroed], but the same is not true for a root directory. Hence, in case in the entire cluster only the newly added bricks are up [and others are down], then any change in permission during this time will be overwritten by the older permissions when the cluster is restarted. fixes: #999 Change-Id: Ieb70246d41e59f9cae9f70bc203627a433dfbd33 Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
*	tests: Fix spurious failure of tests/bugs/snapshot/bug-1111041.t	Pranith Kumar K	2020-04-07	1	-4/+6
\| \| \| \| \| \| \| \| \|	Test should wait for process down notification to be received by glusterd. Fixes: #1153 Change-Id: I9162b58a92c1a909ca98097f14c0714f9086bdd1 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	fuse: Add error-logs to debug bug-1433815-auth-allow.t failures	Pranith Kumar K	2020-04-06	1	-0/+1
\| \| \| \| \| \|	Fixes: #1149 Change-Id: I38483fc7d76d7fe0ac9fb649669a46bdf9c82234 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	write-behind: fix data corruption	Xavi Hernandez	2020-04-03	2	-0/+307
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There was a bug in write-behind that allowed a previous completed write to overwrite the overlapping region of data from a future write. Suppose we want to send three writes (W1, W2 and W3). W1 and W2 are sequential, and W3 writes at the same offset of W2: W2.offset = W3.offset = W1.offset + W1.size Both W1 and W2 are sent in parallel. W3 is only sent after W2 completes. So W3 should always overwrite the overlapping part of W2. Suppose write-behind processes the requests from 2 concurrent threads: Thread 1 Thread 2 <received W1> <received W2> wb_enqueue_tempted(W1) /* W1 is assigned gen X / wb_enqueue_tempted(W2) / W2 is assigned gen X / wb_process_queue() __wb_preprocess_winds() / W1 and W2 are sequential and all * other requisites are met to merge * both requests. / __wb_collapse_small_writes(W1, W2) __wb_fulfill_request(W2) __wb_pick_unwinds() -> W2 / In this case, since the request is * already fulfilled, wb_inode->gen * is not updated. / wb_do_unwinds() STACK_UNWIND(W2) / The application has received the * result of W2, so it can send W3. / <received W3> wb_enqueue_tempted(W3) / W3 is assigned gen X / wb_process_queue() / Here we have W1 (which contains * the conflicting W2) and W3 with * same gen, so they are interpreted * as concurrent writes that do not * conflict. / __wb_pick_winds() -> W3 wb_do_winds() STACK_WIND(W3) wb_process_queue() / Eventually W1 will be * ready to be sent / __wb_pick_winds() -> W1 __wb_pick_unwinds() -> W1 / Here wb_inode->gen is * incremented. */ wb_do_unwinds() STACK_UNWIND(W1) wb_do_winds() STACK_WIND(W1) So, as we can see, W3 is sent before W1, which shouldn't happen. The problem is that wb_inode->gen is only incremented for requests that have not been fulfilled but, after a merge, the request is marked as fulfilled even though it has not been sent to the brick. This allows that future requests are assigned to the same generation, which could be internally reordered. Solution: Increment wb_inode->gen before any unwind, even if it's for a fulfilled request. Special thanks to Stefan Ring for writing a reproducer that has been crucial to identify the issue. Change-Id: Id4ab0f294a09aca9a863ecaeef8856474662ab45 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com> Fixes: #884
*	features/utime: Don't access frame after stack-wind	Pranith Kumar K	2020-04-03	1	-0/+32
\| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: frame is accessed after stack-wind. This can lead to crash if the cbk frees the frame. Fix: Use new frame for the wind instead. Updates: #832 Change-Id: I64754609f1114b0bbd4d1336fa81a56f2cca6e03 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	afr: mark pending xattrs as a part of metadata heal	Ravishankar N	2020-04-02	1	-0/+59
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	...if pending xattrs are zero for all children. Problem: If there are no pending xattrs and a metadata heal needs to be performed, it can be possible that we end up with xattrs inadvertendly deleted from all bricks, as explained in the BZ. Fix: After picking one among the sources as the good copy, mark pending xattrs on all sources to blame the sinks. Now even if this metadata heal fails midway, a subsequent heal will still choose one of the valid sources that it picked previously. Fixes: #1067 Change-Id: If1b050b70b0ad911e162c04db4d89b263e2b8d7b Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	glusterd: Brick process fails to come up with brickmux on	Vishal Pandey	2020-02-20	1	-1/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Issue: 1- In a cluster of 3 Nodes N1, N2, N3. Create 3 volumes vol1, vol2, vol3 with 3 bricks (one from each node) 2- Set cluster.brick-multiplex on 3- Start all 3 volumes 4- Check if all bricks on a node are running on same port 5- Kill N1 6- Set performance.readdir-ahead for volumes vol1, vol2, vol3 7- Bring N1 up and check volume status 8- All bricks processes not running on N1. Root Cause - Since, There is a diff in volfile versions in N1 as compared to N2 and N3 therefore glusterd_import_friend_volume() is called. glusterd_import_friend_volume() copies the new_volinfo and deletes old_volinfo and then calls glusterd_start_bricks(). glusterd_start_bricks() looks for the volfiles and sends an rpc request to glusterfs_handle_attach(). Now, since the volinfo has been deleted by glusterd_delete_stale_volume() from priv->volumes list before glusterd_start_bricks() and glusterd_create_volfiles_and_notify_services() and glusterd_list_add_order is called after glusterd_start_bricks(), therefore the attach RPC req gets an empty volfile path and that causes the brick to crash. Fix- Call glusterd_list_add_order() and glusterd_create_volfiles_and_notify_services before glusterd_start_bricks() cal is made in glusterd_import_friend_volume Change-Id: Idfe0e8710f7eb77ca3ddfa1cabeb45b2987f41aa Fixes: bz#1773856 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
*	afr: prevent spurious entry heals leading to gfid split-brain	Ravishankar N	2020-02-18	2	-14/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: In a hyperconverged setup with granular-entry-heal enabled, if a file is recreated while one of the bricks is down, and an index heal is triggered (with the brick still down), entry-self heal was doing a spurious heal with just the 2 good bricks. It was doing a post-op leading to removal of the filename from .glusterfs/indices/entry-changes as well as erroneous setting of afr xattrs on the parent. When the brick came up, the xattrs were cleared, resulting in the renamed file not getting healed and leading to gfid split-brain and EIO on the mount. Fix: Proceed with entry heal only when shd can connect to all bricks of the replica, just like in data and metadata heal. fixes: bz#1801624 Change-Id: I916ae26ad1fabf259bc6362da52d433b7223b17e Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	feature/changelog: Avoid thread creation if xlator is not enabled	Mohit Agrawal	2020-02-09	2	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Changelog creates threads even if the changelog is not enabled Background: Changelog xlator broadly does two things 1. Journalling - Cosumers are geo-rep and glusterfind 2. Event Notification for registered events like (open, release etc) - Consumers are bitrot, geo-rep The existing option "changelog.changelog" controls journalling and there is no option to control event notification and is enabled by default. So when bitrot/geo-rep is not enabled on the volume, threads and resources(rpc and rbuf) related to event notifications consumes resources and cpu cycle which is unnecessary. Solution: The solution is to have two different options as below. 1. changelog-notification : Event notifications 2. changelog : Journalling This patch introduces the option "changelog-notification" which is not exposed to user. When either bitrot or changelog (journalling) is enabled, it internally enbales 'changelog-notification'. But once the 'changelog-notification' is enabled, it will not be disabled for the life time of the brick process even after bitrot and changelog is disabled. As of now, rpc resource cleanup has lot of races and is difficult to cleanup cleanly. If allowed, it leads to memory leaks and crashes on enable/disable of bitrot or changelog (journal) in a loop. Hence to be safer, the event notification is not disabled within lifetime of process once enabled. Change-Id: Ifd00286e0966049e8eb9f21567fe407cf11bb02a Updates: #475 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	dht: Fix stale-layout and create issue	Susant Palai	2020-02-09	1	-0/+69
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: With lookup-optimize set to on by default, a client with stale-layout can create a new file on a wrong subvol. This will lead to possible duplicate files if two different clients attempt to create the same file with two different layouts. Solution: Send in-memory layout to be cross checked at posix before commiting a "create". In case of a mismatch, sync the client layout with that of the server and attempt the create fop one more time. test: Manual, testcase(attached) fixes: bz#1786679 Change-Id: Ife0941f105113f1c572f4363cbcee65e0dd9bd6a Signed-off-by: Susant Palai <spalai@redhat.com>
*	tests: fix test failures for nfsnobody user and group	Sunil Kumar Acharya	2020-01-24	2	-4/+20
\| \| \| \| \| \| \| \| \| \| \|	'nfsnobody' user and group is merged with 'nobody' user and group in RHEL8. Tests are modified to use appropriate user and group. BUG: 1756900 Change-Id: I59863da2262283b00b1cb417d3652ebe29a36407 Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
*	afr: restore timestamp of files during metadata heal	Sheetal Pamecha	2020-01-24	1	-0/+74
\| \| \| \| \| \| \| \| \| \| \|	For files: During metadata heal, we restore timestamps only for non-regular (char, block etc.) files. Extenting it for regular files as timestamp is updated via touch command also fixes: bz#1787274 Change-Id: I26fe4fb6dff679422ba4698a7f828bf62ca7ca18 Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
*	server: Mount fails after reboot 1/3 gluster nodes	Mohit Agrawal	2020-01-22	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: At the time of coming up one server node(1x3) after reboot client is unmounted.The client is unmounted because a client is getting AUTH_FAILED event and client call fini for the graph.The client is getting AUTH_FAILED because brick is not attached with a graph at that moment Solution: To avoid the unmounting the client graph throw ENOENT error from server in case if brick is not attached with server at the time of authenticate clients. Credits: Xavi Hernandez <xhernandez@redhat.com> Change-Id: Ie6fbd73cbcf23a35d8db8841b3b6036e87682f5e Fixes: bz#1793852 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
*	glusterd: deafult options after volume reset	Sanju Rakonde	2020-01-01	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: default option itransport.address-family is disappered in volume info output after a volume reset. Cause: with 3.8.0 onwards volume option transport.address-family has default value, any volume which is created will have this option set. So, volume info will show this in its output. But, with reset volume, this option is not handled. Solution: In glusterd_enable_default_options(), we should add this option along with other default options. This function is called by glusterd_options_reset() with volume reset command. fixes: bz#1786478 Change-Id: I58f7aa24cf01f308c4efe6cae748cc3bc8b99b1d Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	posix: Avoid diskpace error in case of overwriting the data	Mohit Agrawal	2020-01-01	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Sometime fops like posix_writev, posix_fallocate, posix_zerofile failed and throw error ENOSPC if storage.reserve threshold limit has reached even fops is overwriting the data Solution: Retry the fops in case of overwrite if diskspace check is failed Credits: kinsu <vpolakis@gmail.com> Change-Id: I987d73bcf47ed1bb27878df40c39751296e95fe8 Updates: #745 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
*	tests/fuse/bug-965974.t: turn off md-cache, to fix failure of the test	Raghavendra G	2019-12-19	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Imagine the following set of operations: 1. touch $M0/file 2. ln $M0/file $M0/file.lnk 3. rm $M1/file 4. wait for md-cache-timeout 5. stat $M0/file.lnk $M0/file stat on $M0/file in step 5 succeeds when md-cache-timeout is non-zero even though it was removed from $M1. The reason is 1. fuse-bridge on $M0 would resolve both file and file.lnk to same inode. This inode i1 is cached in md-cache. 2. After md-cache-timeout lookup on $M0/file.lnk would be sent to backend. This lookup will be successful as file.lnk is present on backend and would refresh the md-cache. 3. The lookup on $M0/file sent within md-cache-timeout after lookup on $M0/file.lnk would be sent with i1. Since i1 was refreshed by lookup on $M0/file.lnk, the cache is deemed valid and md-cache responds lookup on $M0/file as success To fix this failure we can either: 1. remove lookup on $M0/file.lnk, so that lookup on $M0/file is actually sent to backend. 2. turn off md-cache This patch chooses option 2. credits: Csaba Henk <csaba@redhat.com> Change-Id: I352c2acd377fe10c4bdf3b6e53c1de86a4e544c7 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Updates: bz#1756900
*	[Cli] Removing old log rotate command.	kshithijiyer	2019-12-17	2	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The old command for log rotate is still present removing it completely. Also adding testcase to test the log rotate command with both the old as well as the new command and fixing testcase which use the old syntax to use the new one. Code to be removed: 1. In cli-cmd-volume.c from struct cli_cmd volume_cmds[]: {"volume log rotate <VOLNAME> [BRICK]", cli_cmd_log_rotate_cbk, "rotate the log file for corresponding volume/brick" " NOTE: This is an old syntax, will be deprecated from next release."}, 2. In cli-cmd-volume.c from cli_cmd_log_rotate_cbk(): \|\|(strcmp("rotate", words[2]) == 0))) 3. In cli-cmd-parser.c from cli_cmd_log_rotate_parse() if (strcmp("rotate", words[2]) == 0) volname = (char *)words[3]; else fixes: bz#1750387 Change-Id: I56e4d295044e8d5fd1fc0d848bc87e135e9e32b4 Signed-off-by: kshithijiyer <kshithij.ki@gmail.com>
*	glusterd: Client Handling of Elastic Clusters	Mohit Agrawal	2019-11-12	1	-0/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Configure the list of gluster servers in the key GLUSTERD_BRICK_SERVERS at the time of GETSPEC RPC CALL and access the value in client side to update volfile serve list so that client would be able to connect next volfile server in case of current volfile server is down Updates #741 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Change-Id: I23f36ddb92982bb02ffd83937a8bd8a2c97e8104
*	cli: display detailed rebalance info	Sanju Rakonde	2019-11-12	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: When one of the node is down in cluster, rebalance status is not displaying detailed information. Cause: In glusterd_volume_rebalance_use_rsp_dict() we are aggregating rsp from all the nodes into a dictionary and sending it to cli for printing. While assigning a index to keys we are considering all the peers instead of considering only the peers which are up. Because of which, index is not reaching till 1. while parsing the rsp cli unable to find status-1 key in dictionary and going out without printing any information. Solution: The simplest fix for this without much code change is to continue to look for other keys when status-1 key is not found. fixes: bz#1764119 Change-Id: I0062839933c9706119eb85416256eade97e976dc Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	ssl/test: Change the rsa key length to 2048	Mohammed Rafi KC	2019-10-29	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	On a rhel-8 machine, we need to have a key length greater than or eaual to 2048. So changing the values to 2048 to pass the test. Credits: Mohit Agrawal <moagrawal@redhat.com> Change-Id: I0f21db4d737203d0b2e44e7e61f50ae1279795ad Updates: bz#1756900 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
*	tests: Fix spurious failure	Pranith Kumar K	2019-10-16	1	-2/+2
\| \| \| \| \| \|	fixes: bz#1759002 Change-Id: I4d49e1c2ca9b3c1d74b9dd5a30f1c66983a76529 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	tests: Specify bs for dd	Pranith Kumar K	2019-10-16	1	-1/+1
\| \| \| \| \| \| \| \| \|	On some distros default bs is very slow and the test takes close to 2 minutes instead of 20 seconds. fixes: bz#1761769 Change-Id: If10d595a7ca05f053237f3c5ffbb09c5151eab35 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	tests: mark tests/bugs/glusterd/quorum-value-check.t as NFS test	Sanju Rakonde	2019-10-15	1	-0/+2
\| \| \| \| \| \| \|	Fixes: bz#1665358 Change-Id: Iea000dd839d4e4dbef45941f97ab3725a2aa1726 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	glusterd: rebalance start should fail when quorum is not met	Sanju Rakonde	2019-10-10	1	-0/+2
\| \| \| \| \| \| \| \| \| \|	rebalance start should not succeed if quorum is not met. this patch adds a condition to check whether quorum is met in pre-validation stage. fixes: bz#1760467 Change-Id: Ic7d0d08f69e4bc6d5e7abae713ec1881531c8ad4 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	Fix spurious failure in bug-1744548-heal-timeout.t	Pranith Kumar K	2019-10-09	1	-6/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Script was assuming that the heal would have triggered by the time test was executed, which may not be the case. It can lead to following failures when the race happens: ... 18:29:45 not ok 14 [ 85/ 1] < 26> '[ 331 == 333 ]' -> '' ... 18:29:45 not ok 16 [ 10097/ 1] < 33> '[ 668 == 666 ]' -> '' Heal on 3rd brick didn't start completely first time the command was executed. So the extra count got added to the next profile info. Fixed it by depending on cumulative stats and waiting until the count is satisfied using EXPECT_WITHIN fixes: bz#1759002 Change-Id: I3b410671c902d6b1458a757fa245613cb29d967d Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	cluster/afr: Heal entries when there is a source & no healed_sinks	karthik-us	2019-10-09	1	-0/+89
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: In a situation where B1 blames B2, B2 blames B1 and B3 doesn't blame anything for entry heal, heal will not complete even though we have clear source and sinks. This will happen because while doing afr_selfheal_find_direction() only the bricks which are blamed by non-accused bricks are considered as sinks. Later in __afr_selfheal_entry_finalize_source() when it tries to mark all the non-sources as sinks it fails to do so because there won't be any healed_sinks marked, no witness present and there will be a source. Fix: If there is a source and no healed_sinks, then reset all the locked sources to 0 and healed sinks to 1 to do conservative merge. Change-Id: If40d8bc95d52a52b2730f55bdcf135109b421548 Fixes: bz#1749322 Signed-off-by: karthik-us <ksubrahm@redhat.com>
*	tests: Fix spurious failure in bug-1134691-afr-lookup-metadata-heal.t	Ravishankar N	2019-10-09	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: The .t was examining the sink brick's iatt value before the launched client-side metadata heal got a chance to complete. Fix: Wait for heal completion. Fixes: bz#1759081 Change-Id: I4dd4e3a1cccf35fd18e8cdfea6aa76a726a4763b Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	afr: support split-brain CLI for replica 3	Ravishankar N	2019-10-09	1	-0/+111
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Ever since we added quorum checks for lookups in afr via commit bd44d59741bb8c0f5d7a62c5b1094179dd0ce8a4, the split-brain resolution commands would not work for replica 3 because there would be no readables for the lookup fop. The argument was that split-brains do not occur in replica 3 but we do see (data/metadata) split-brain cases once in a while which indicate that there are a few bugs/corner cases yet to be discovered and fixed. Fortunately, commit 8016d51a3bbd410b0b927ed66be50a09574b7982 added GF_CLIENT_PID_GLFS_HEALD as the pid for all fops made by glfsheal. If we leverage this and allow lookups in afr when pid is GF_CLIENT_PID_GLFS_HEALD, split-brain resolution commands will work for replica 3 volumes too. Likewise, the check is added in shard_lookup as well to permit resolving split-brains by specifying "/.shard/shard-file.xx" as the file name (which previously used to fail with EPERM). Change-Id: I3c543dea79caf7cfbc1633e9089cb1cdd2538ba9 Fixes: bz#1756938 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	tests: add a pending test case	Amar Tumballi	2019-10-03	1	-1/+17
\| \| \| \| \| \| \| \| \|	While merging the protocol handshake fixes (bz#1620580), there was a case which was left out. Adding it separately now. Change-Id: I52133d5fe160b4567400a65e60aac8f7bc20697f Updates: bz#1193929 Signed-off-by: Amar Tumballi <amarts@gmail.com>
*	ssl: fix RHEL8 regression failure	Sanju Rakonde	2019-10-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	This tests is failing with "SSL routines:SSL_CTX_use_certificate:ee key too small" in RHEL8. This change is made according to https://access.redhat.com/solutions/4157431 updates: bz#1756900 Change-Id: Ib436372c3bd94bcf7324976337add7da4088b3d5 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	protocol/handshake: pass volume-id for extra check	Amar Tumballi	2019-09-30	1	-0/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With added check of volume-id during handshake, we can be sure to not connect with a brick if this gets re-used in another volume. This prevents any accidental issues which can happen with a stale client process lurking along. Also added test case for testing same volume name which would fetch a different volfile (ie, different bricks, different type), and a different volume name, but same brick. For reference: Currently a client<->server handshake happens in glusterfs through protocol/client translator (setvolume) to protocol/server using a dictionary which containes many keys. Rejection happens in server side if some of the required keys are missing in handshake dictionary. Till now, there was no single unique identifier to validate for a client to tell server if it is actually talking to a corresponding server. All we look in protocol/client is a key called 'remote-subvolume', which should match with a subvolume name in server volume file, and for any volume with same brick name (can be present in same cluster due to recreate), it would be same. This could cause major issue, when a client was connected to a given brick, in one volume would be connected to another volume's brick if its re-created/re-used. To prevent this behavior, we are now passing along 'volume-id' in handshake, which would be preserved for the life of client process, which can prevent this accidental connections. NOTE: This behavior wouldn't be applicable for user-snapshot enabled volumes, as snapshotted volume's would have different volume-id. Fixes: bz#1620580 Change-Id: Ie98286e94ce95ae09c2135fd6ec7d7c2ca1e8095 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	tests/shard: Remove dependence on distributed cache	Pranith Kumar K	2019-09-27	1	-3/+3
\| \| \| \| \| \|	fixes: bz#1756211 Change-Id: Iee5b37af89ab624c16a45df364806003238280e5 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	ctime/rebalance: Heal ctime xattr on directory during rebalance	Kotresh HR	2019-09-16	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After add-brick and rebalance, the ctime xattr is not present on rebalanced directories on new brick. This patch fixes the same. Note that ctime still doesn't support consistent time across distribute sub-volume. This patch also fixes the in-memory inconsistency of time attributes when metadata is self healed. Change-Id: Ia20506f1839021bf61d4753191e7dc34b31bb2df fixes: bz#1734026 Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	protocol/client: don't reopen fds on which POSIX locks are held after a ↵	Raghavendra G	2019-09-12	1	-0/+63
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	reconnect Bricks cleanup any granted locks after a client disconnects and currently these locks are not healed after a reconnect. This means post reconnect a competing process could be granted a lock even though the first process which was granted locks has not unlocked. By not re-opening fds, subsequent operations on such fds will fail forcing the application to close the current fd and reopen a new one. This way we prevent any silent corruption. A new option "client.strict-locks" is introduced to control this behaviour. This option is set to "off" by default. Change-Id: Ieed545efea466cb5e8f5a36199aa26380c301b9e Signed-off-by: Raghavendra G <rgowdapp@redhat.com> updates: bz#1694920
*	api: Cleanup of executable not done	Sheetal	2019-09-12	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	In test tests/bugs/gfapi/bug-1447266/bug-1447266.t actual file created is - tests/bugs/gfapi/bug-1447266/bug-1447266 which is not cleaned up later fixes: bz#1750618 Change-Id: I93120418e54b95018a7213d106a1f1c990766281 Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
*	tests: Fix spurious failure	Pranith Kumar K	2019-09-11	1	-2/+20
\| \| \| \| \| \| \| \| \| \|	If heal from next brick starts after the first brick completes heal, then opendir on the brick can change atime leading to failure of the test. When ctime is disabled it is better to just check mtime to be same after heal. fixes: bz#1751134 Change-Id: Ia03e30fd547e6bbe85c1e299845ffa122f3a2692 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	afr/lookup: Pass xattr_req in while doing a selfheal in lookup	Mohammed Rafi KC	2019-09-05	1	-0/+52
\| \| \| \| \| \| \| \| \| \|	We were not passing xattr_req when doing a name self heal as well as a meta data heal. Because of this, some xdata was missing which causes i/o errors Change-Id: Ibfb1205a7eb0195632dc3820116ffbbb8043545f Fixes: bz#1728770 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
*	tests: fix spurious failure of bug-1402841.t-mt-dir-scan-race.t	Ravishankar N	2019-09-04	1	-4/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Since commit 600ba94183333c4af9b4a09616690994fd528478, shd starts healing as soon as it is toggled from disabled to enabled. This was causing the following line in the .t to fail on a 'fast' machine (always on my laptop and sometimes on the jenkins slaves). EXPECT_NOT "^0$" get_pending_heal_count $V0 because by the time shd was disabled, the heal was already completed. Fix: Increase the no. of files to be healed and make it a variable called FILE_COUNT, should we need to bump it up further because the machines become even faster. Also created pending metadata heals to increase the time taken to heal a file. fixes: bz#1748744 Change-Id: I5a26b08e45b8c19bce3c01ce67bdcc28ed48198d Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	afr: wake up index healer threads	Ravishankar N	2019-08-30	1	-0/+42
\| \| \| \| \| \| \| \| \| \| \| \| \|	...whenever shd is re-enabled after disabling or there is a change in `cluster.heal-timeout`, without needing to restart shd or waiting for the current `cluster.heal-timeout` seconds to expire. See BZ 1743988 for more details. Change-Id: Ia5ebd7c8e9f5b54cba3199c141fdd1af2f9b9bfe fixes: bz#1744548 Reported-by: Glen Kiessling <glenk1973@hotmail.com> Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	performance/md-cache: Do not skip caching of null character xattr values	Anoop C S	2019-08-20	1	-0/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Null character string is a valid xattr value in file system. But for those xattrs processed by md-cache, it does not update its entries if value is null('\0'). This results in ENODATA when those xattrs are queried afterwards via getxattr() causing failures in basic operations like create, copy etc in a specially configured Samba setup for Mac OS clients. On the other side snapview-server is internally setting empty string("") as value for xattrs received as part of listxattr() and are not intended to be cached. Therefore we try to maintain that behaviour using an additional dictionary key to prevent updation of entries in getxattr() and fgetxattr() callbacks in md-cache. Credits: Poornima G <pgurusid@redhat.com> Change-Id: I7859cbad0a06ca6d788420c2a495e658699c6ff7 Fixes: bz#1726205 Signed-off-by: Anoop C S <anoopcs@redhat.com>
*	glusterd: ./tests/bugs/glusterd/bug-1595320.t is failing	Mohit Agrawal	2019-08-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: sometime ./tests/bugs/glusterd/bug-1595320.t is failing is failing at the time of checking brick_process after sending a kill signal to brick process Solution: Wait sometime after just sending a kill signal to brick process to make sure brick process is stopped Change-Id: Iee9e91284618abfc62a550d47e4f9117785def58 Fixes: bz#1743200 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	tests: mark ↵	Atin Mukherjee	2019-08-19	1	-0/+1
\| \| \| \| \| \| \| \|	bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t as BRICK_MUX_BAD_TEST Updates: bz#1743069 Change-Id: I1eea0186ca0c1b1226f4b3d0d7c0e41fc7821cbd Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
*	afr: restore timestamp of parent dir during entry-heal	Ravishankar N	2019-08-14	1	-0/+78
\| \| \| \| \| \|	Fixes: bz#1734370 Change-Id: I29e338bac62104233a6f80212df8d0fb016affda Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	features/shard: Send correct size when reads are sent beyond file size	Krutika Dhananjay	2019-08-12	1	-0/+29
\| \| \| \| \| \|	Change-Id: I0cebaaf55c09eb1fb77a274268ff564e871b743b fixes bz#1738419 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
*	tests: fix bug-880898.t crash	Ravishankar N	2019-08-12	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: https://build.gluster.org/job/centos7-regression/7337/consoleFull indicates the shd crashing for this .t. On looking at the core, I see the crash is at the time of shd init and glusterfs context is null: (gdb) bt (gdb) p ctx $2 = (glusterfs_ctx_t *) 0xf00000000 The .t is killing all gluster processes immediately after volume start, so it looks like a race between shd coming up and it being killed. Fix: Kill gluster processes only after they are up and running. Fixes: bz#1740017 Change-Id: I7cf589201669bd9f535e968d147015dc99e9a4b6 Signed-off-by: Ravishankar N <ravishankar@redhat.com>