glusterfs.git -

	Commit message (Collapse)	Author	Age	Files	Lines
*	no-mtab (-n) mount option ignore next mount option	James Augustine	2016-01-10	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The -n option does not take any arguments. It seems like this shift is removing the next option. On my CentOS 7 system, automount calls mount.glusterfs with the parameters: host:/volume /mountpoint -n -o rw,acl,_netdev This causes the -o option to be siliently ignored. Change-Id: Ice3c877f6ab346b04292e3dfed968d04d15077a5 BUG: 1297195 Signed-off-by: James Augustine <jcaugust81@gmail.com> Reviewed-on: http://review.gluster.org/12988 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Niels de Vos <ndevos@redhat.com>
*	features/bitrot: add check for corrupted object in f{stat}	Venky Shankar	2016-01-10	2	-32/+79
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Check for corrupted objects is done bt bitrot stub component for data operations and such fops are denied processing by returning EIO. These checks were not done for operations such as get/set extended attribute, stat and the likes - IOW, stub only blocked pure data operations. However, its necessary to have these checks for certain other fops, most importantly stat (and fstat). This is due to the fact that clients could possibly get stale stat information (such as size, {a,c,m}time) resulting in incorrect operation of the application that rely on these fields. Note that, the data that replication would take care of fetching good (and correct) data, but the staleness of stat information could lead to data inconsistencies (e.g., rebalance, tier). Change-Id: I5a22780373b182a13f8d2c4ca6b7d9aa0ffbfca3 BUG: 1296399 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/13120 Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	hook-scripts: remove outdated comment from C29CTDB start scipt	Michael Adam	2016-01-10	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \|	This script does not change Samba's config any more. Change-Id: Ie6001f9a49006f95b291e24252dc362f2a7db14c BUG: 1295504 Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-on: http://review.gluster.org/13168 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anoop C S <anoopcs@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
*	extras: add script to analyze regression-test failures	Jeff Darcy	2016-01-08	1	-0/+64
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Often a test will fail quite frequently, but not so frequently that it will fail twice in a row for the same patch. This allows it to "fly beneath the radar" for quite a long time, slowing project-wide progress until somebody crawls through the logs looking for patterns. This patch adds a script to automate some of that process. Change-Id: Ic74fbf6b0bfa34bffd9cb109fd51db019053e2cc Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/12510 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	tests: use SIGKILL in cleanup, not SIGTERM	Raghavendra Talur	2016-01-07	1	-7/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Sending a SIGTERM to test processes and waiting a second for them to gracefully exit before sending a SIGKILL seems like a waste of time. Just send SIGKILL directly. Change-Id: Icc73b07eae47876ba41955793a8daf77a964a0e0 BUG: 1294826 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: http://review.gluster.org/13121 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	cluster/tier: allow db queries to be interruptable	Dan Lambright	2016-01-07	3	-30/+110
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A query to the database may take a long time if the database has many entries. The tier daemon also sends IPC calls to the bricks which can run slowly, espcially in RHEL6. While it is possible to track down each such instance, the snapshot feature should not be affected by database operations. It requires no migration be underway. Therefore it is okay to pause tiering at any time except when DHT is moving a file. This fix implements this strategy by monitoring when control passes to DHT to migrate a file using the GF_XATTR_FILE_MIGRATE_KEY trigger. If it is not, the pause operation is successful. Change-Id: I21f168b1bd424077ad5f38cf82f794060a1fabf6 BUG: 1287842 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/13104 Reviewed-by: Joseph Fernandes Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	features/bitrot: Fail node-uuid getxattr if file is marked bad	Kotresh HR	2016-01-07	3	-0/+97
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If xattr is node-uuid and the inode is marked bad, fail getxattr and fgetxattr with EIO. Returning EIO would result in AFR to choose correct node-uuid coresponding to the subvolume where the good copy of the file resides. Change-Id: I45a42ca38f8322d2b10f3c4c48dc504521162b42 BUG: 1294786 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/13116 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com>
*	quota: handle quota xattr removal when quota is enabled again	vmallika	2016-01-07	2	-4/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a quota is disable and enabled again before completing the cleanup operation, this can remove the new xattrs and quota accounting can become wrong Remove removing the xattr, check if quota enabled again and the xattr is new Change-Id: Idda216f1e7346a9b843dbc112ea3e6faa9c47483 BUG: 1293601 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/13065 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
*	performance/write-behind: maintain correct transit size in case of	Raghavendra G	2016-01-06	1	-15/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	short writes. 1. Imagine a write when cache is filled with failed syncs. 2. This write won't be unwound since cache size has exceeded configured limit. 3. With trickling-writes on by default, the last write request wont be considered for winding when there is non zero in-transit size. 4. There was a bug in accounting of in-transit size when winds resulted in short writes. Due to this bug, in-transit size used to be non-zero even when there are no syncs in progress. 5. Due to 3 and 4, current write request won't be wound till there is another write or fsync or flush from application. But application can't do any other fop till current write request is unwound. This resulted in deadlock and hence application would be hung in 'D' state. This patch fixes bug in accounting of in-transit size during short writes. Change-Id: I04ce8bb510efaaed7623cac38d69b32dbc3730ce Signed-off-by: Raghavendra G <rgowdapp@redhat.com> BUG: 1279730 Reviewed-on: http://review.gluster.org/13177 Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	vagrant-test: Exit on critical errors	Raghavendra Talur	2016-01-06	2	-18/+70
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There were quite a few places where exiting the script made more sense. More debug messages have been added. Move back to top directory after the script is complete. Change-Id: I2a66ee3a68c41a3acd0b7168c56b801fb5567e5f BUG: 1291537 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: http://review.gluster.org/13175 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Michael Adam <obnox@samba.org>
*	dht: missleading indendentation, gcc-6	Kaleb S KEITHLEY	2016-01-05	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	gcc-6 now has -Wmisleading-indentation as part of -Wall. compiling with gcc-6 gives this warning. ... dht-diskusage.c: In function ‘dht_subvol_has_err’: dht-diskusage.c:361:33: warning: statement is indented as if it were guarded by... [-Wmisleading-indentation] goto out; ^~~~ dht-diskusage.c:358:25: note: ...this ‘if’ clause, but it is not if (conf->decommissioned_bricks[i] && ^~ ... Inspection of the source shows that without braces the loop is terminated prematurely. Change-Id: Ica48a8c59ee5d0a206797827d7920259d33b47ec BUG: 1295784 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/13176 Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	api: Fix errno being set to EINVAL even on success	Prashanth Pai	2016-01-05	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \|	BUG: 1289068 Change-Id: I7905ac70a537f23e1844c097a24eaa6cb762fb82 Signed-off-by: Prashanth Pai <ppai@redhat.com> Reviewed-on: http://review.gluster.org/12909 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> Reviewed-by: Kaushal M <kaushal@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
*	tier/create: store TIER_LINKFILE_GFID in xattr dictionary	Mohammed Rafi KC	2016-01-05	1	-17/+68
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In tier_create, a new key TIER_LINKFILE_GFID was introduced to avoid a race in stale linkfile deletion. Storing this key in xattr dictionary instead of using local->params dictionary. Because local->params dictionary was also used to create the file before stale linkfile deletion, that leads posix_create to fail, trying to set the added key as extended attributes Change-Id: I24fecb62b47bee65a1e86103925a67d13304c5df BUG: 1290677 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/13130 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
*	cluster/tier: check watermark during migration	Dan Lambright	2016-01-05	1	-46/+57
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently we check the watermarks only before a cycle begins. We should also check the hot tier's fullness against the watermarks during the migration so the watermark is not exceeded as files are promoted. Change-Id: I2ff87a1c308d64fbdca14bbdf55f3ec3007290ae BUG: 1293932 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/13103 Reviewed-by: Joseph Fernandes Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com>
*	cluster/tier: Additional details in error messages	N Balachandran	2016-01-04	1	-41/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added file path/gfid when available to the tier log messages to make debugging easier. Change-Id: I22dda329367df2b846dcf254594312c997b66083 BUG: 1273043 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/13114 Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
*	tier/glusterd: tier daemon not updating the status	Mohammed Rafi KC	2016-01-03	2	-2/+33
\| \| \| \| \| \| \| \| \| \| \| \| \|	Tier process is not updating the status when the process killed mnually. Change-Id: Ia5ea903af78ff3582da2242e6058f11c71923fab BUG: 1294600 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/13107 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	wb: remove inline keyword	Raghavendra Talur	2015-12-31	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When compiled with -Werror flag gcc throws the following error: ‘iov_length’ is static but used in inline function ‘__wb_modify_write_request’ which is not static. Let gcc decide what functions to inline and remove the inline keyword. Change-Id: I6d832596eefcf08306634936e11d2c8d4b8f9ccd BUG: 1279730 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: http://review.gluster.org/13113
*	Tier : typo in tier help	hari gowtham	2015-12-29	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	Change-Id: I75bca55901849cf725e02c782f75ff1e6054fddd BUG: 1294448 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: http://review.gluster.org/13097 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
*	tier/create: Dynamically allocate gfid memory	Mohammed Rafi KC	2015-12-29	1	-2/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently we are storing the memory as a static pointer. There is a chance to go that variable in out of scope. So we should allocate in Dynamic way. Change-Id: I096876deb8055ac3a44681599591a0a032bc0c24 BUG: 1290677 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/13102 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
*	tier/unlink: symlink failed to unlink	Mohammed Rafi KC	2015-12-29	2	-2/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	during unlink of a file, we will get stat just after deleting the file, to see if the file is under migration or not. but this stat call will fail for symlink if the actual file was deleted. So it is better not to send stat request from client if it is a symlink as we are not migrating symlink. Change-Id: Idc033b24fa3522b5261e579889d2195b43419682 BUG: 1293963 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/13074 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
*	quota: limit xattr for subdir not healed on newly added bricks	vmallika	2015-12-29	2	-4/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	DHT after creating missing directory, tries to heal the xattrs. This xattrs operation fails as INTERNAL FOP key was not set Change-Id: I819d373cf7073da014143d9ada908228ddcd140c BUG: 1294479 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/13100 Reviewed-by: Susant Palai <spalai@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
*	afr: Fix excessive logging in afr_accuse_smallfiles()	Ravishankar N	2015-12-28	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 2b7226f9d3470d8fe4c98c1fddb06e7f641e364d did not check for the validity of a dict before doing a dict_get. Fix that. Change-Id: Ie21f4da19256b17196f242cd8fd5bb76b0a69c1e BUG: 1294053 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/13077 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	cli: should not dereference NULL pointer while printing bitrot scrub status	Gaurav Kumar Garg	2015-12-28	1	-3/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When user execute bitrot scrub status command and scrubber is pending to do scrubbing then value of last_scrub time will be NULL. Currently cli is dereferencing NULL pointer in this case, That might lead to crash. Fix is to use proper check condition while printing scrub status. Change-Id: I3c4be8e25d089451c6ab77b16737c01d0348ee70 BUG: 1293558 Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com> Reviewed-on: http://review.gluster.org/13060 Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
*	tests: handle bad objects during lookup/inode_refresh	Ravishankar N	2015-12-28	2	-0/+57
\| \| \| \| \| \| \| \| \| \| \|	Change-Id: I1848f0e9243c9376e0deba6738757350fe8b704a BUG: 1290965 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/13044 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	tests: Introduce a Vagrant VM based test environment	Raghavendra Talur	2015-12-28	10	-0/+282
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This introduces a mechanism using which a developer could easily test the Gluster code in a VM environment. Also, it will help bring uniformity in the environments used by various developers. How to use: 1. git checkout -b custom-branch-name 2. Make changes 3. Execute ./run-tests-in-vagrant.sh What happens in the background: 1. A new directory is created: tests/vagrant/vagrant-custom-branch-name It will serve as the Vagrant dir which has the Vagrantfile and related ansible playbooks. The VM is started using Vagrant and provisioned using ansible. 2. The source dir is recursively copied over to the VM under /home/vagrant/glusterfs. 3. Gluster is source installed in VM. What happens in the foreground: 1. run-tests.sh is executed in VM using ssh and output is displayed in the same terminal with option to use ctrl-c to interrupt the test midway. The VM would still persist and you could ssh into it. Also, you can checkout a different branch elsewhere and execute run-tests-in-vagrant.sh there to get another VM which would execute tests on that code. If you wish to make some changes in the code, you could: a. Change the code in host and run the script again to repeat the whole process. OR b. vagrant ssh into the VM and make the changes in the VM. Co-authored-by: Kaushal M <kaushal@redhat.com> Co-authored-by: Michael Adam <obnox@samba.org> Change-Id: Ic87801172c8b614cdecbdf2a765e1b3370a5faf7 BUG: 1291537 Signed-off-by: Michael Adam <obnox@samba.org> Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: http://review.gluster.org/12753 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
*	build: use 'make install' to install the hook scripts	Niels de Vos	2015-12-26	7	-41/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The installation should be the same on all distributions, and doing manual installation of files in the .spec is very ugly. This change adds the rules so that 'make install' places the hook scripts in the right location. Also, the hook script(s) for NFS-Ganesha should be part of the glusterfs-ganesha sub-package and got moved there. BUG: 1174765 Change-Id: Iba25a7a5112c7d40db4c10ff4a5ac7a5fb4f7c4e Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/13072 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	glusterfs.spec.in: use %global per Fedora pkg guidelines	Kaleb S KEITHLEY	2015-12-26	1	-29/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ref. recent emails in fedora-devel ml See: https://fedoraproject.org/wiki/Packaging:Guidelines?rd=Packaging/Guidelines#.25global_preferred_over_.25define and the thread beginning at https://www.redhat.com/archives/fedora-packaging/2009-May/msg00095.html Also fix a couple instances of %if ... %else ... %endif indentation to be consistent with the rest of the .spec Change-Id: Iaf7332fd8601d78bc0d8249033cff12a452654bf BUG: 1294209 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/13079 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
*	glusterd: correct ret code in glusterd_volume_status_copy_to_op_ctx_dict	Atin Mukherjee	2015-12-24	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	This patch is to supress the error log of Failed to aggregate rsp_dict where the above function returns a non zero ret which is not required Change-Id: If331980291bd369690257215333cea175e2042ec BUG: 1290734 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/12950 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com>
*	glusterd: reduce friend update flood	Kaushal M	2015-12-22	1	-2/+96
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When in a befriended state, glusterd would broadcast friend updates to all other peers whenver a ACC or LOCAL_ACC event occurred. When a downed glusterd came back up and established connections again, this lead to a flood of friend updates to happen on the order of N^2 (N is the number of peers in the cluster) In larger clusters this was problematic, and could lead to very long times for the cluster to settle down when a peer came back up. Multiple peers coming back up at the same time would compound the problem. Broadcasting of friend updates doesn't have much use in places other that during a peer probe. Instead of broadcasting friend updates on connection re-establishment, updates can just be exchanged between the peers involved in the connection. This patch changes the glusterd friend state-machine to send updates only to the required peer for ACC or LOCAL_ACC events when in befriended state. The number of updates sent now is in the order of N. For a 10 node cluster, the number of updates reduced by 5 times. When creating the 10 node cluster, the updates reduced from ~500 to ~150. When a glusterd restarted, the number of exchanges reduced from ~160 to ~35. BUG: 1292749 Change-Id: Ib6072090c7069b081d018cdaa3dc878819ab1d18 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/12999 Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	Tier: "tier start force" command implementation	hari gowtham	2015-12-22	10	-59/+185
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The start command doesnt restart the tier deamon if the deamon is running at one node. hence to bring up the tierd on the nodes where the deamon is down, the force command is implemented. It skips the check for tierd running. Change-Id: I0037d3e5ecfe56637d0da201a97903c435d26436 BUG: 1292112 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: http://review.gluster.org/12983 Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
*	cluster/dht : Ftruncate on migrating file fails with EINVAL	N Balachandran	2015-12-22	9	-80/+317
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	What: If dht_open is called on a migrating file after the inode_ctx is set, subsequent FOPs on that fd do not open the fd on the dst subvol. This is seen when the open-ftruncate-close sequence is repeatedly called on a migrating file. A second call to the sequence described above causes dht_truncate_cbk to call dht_truncate2 as the dht_inode_ctx was already set by the first call. As dht_rebalance_in_progress_check is not called, the fd is not opened on the dst subvol. On a distributed-replicate volume, this causes AFR to open the fd using afr_fix_open, but with the wrong flags, causing posix_ftruncate to fail with EINVAL. The fix: We require fd specific information to make a decision while handling migrating files. Set the fd_ctx to indicate the fd has been opened on the dst subvol and check if it has been set while processing Phase1/Phase2 checks in the FOP callback functions. Change-Id: I43cdcd8017b4a11e18afdd210469de7cd9a5ef14 BUG: 1284823 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12985 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
*	build: export minimum symbols from xlators for correct resolution	Kaleb S KEITHLEY	2015-12-22	54	-58/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Revisiting http://review.gluster.org/#/c/11814/, which unintentionally introduced warnings from libtool about the xlator .so names. According to [1], the -module option must appear in the Makefile.am file(s); if -module is defined in a macro, e.g. in configure(.ac), then libtool will not recognize that this is a module and will emit a warning. [1] http://www.gnu.org/software/automake/manual/automake.html#Libtool-Modules Change-Id: Ifa5f9327d18d139597791c305aa10cc4410fb078 BUG: 1248669 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/13003 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
*	core: add preadv, pwritev, pread, pwrite syscall wrappers	Kaleb S KEITHLEY	2015-12-22	7	-18/+66
\| \| \| \| \| \| \| \| \| \| \| \| \|	add additional system calls plus pick up a couple missed unwrapped system calls that seem to have slipped into the master branch. Change-Id: If268ccd5e9a139ac3ffd38293c67cd2f62ea5b58 BUG: 1289258 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/12895 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
*	ctr/sql: Providing for vol set for sqlcachesize and sqlWALsize and skip ↵	Joseph Fernandes	2015-12-22	10	-175/+353
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	recording path 1. Providing vol set option for cache size and wal autocheck point so that performance can be tuned. 2. Removed recording of file path in the db. Trimming database columns. Path need not be stored in the db, as PARGFID, GFID, Basename is suffice to derive the path during migration. Change-Id: I2cb590451a6d244bc91fe66c6dbffe2c2059dfb8 BUG: 1293034 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12972 Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
*	geo-rep: Fix getting subvol count	Kotresh HR	2015-12-22	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Tiering doesn't support disperse volume as hot tier, hence xml output doesn't give 'hotdisperseCount'. Remove the usage of 'hotdisperseCount' in geo-rep and return 0 instead. Change-Id: I736e29257de085a25e38eb02959caad3465ebcda BUG: 1292084 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/13062 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vivek Reviewed-by: Aravinda VK <avishwan@redhat.com>
*	tier:delete the linkfile if data file creation fails	Mohammed Rafi KC	2015-12-22	7	-83/+307
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we are creating data file in a hot subvolume then we will create a linkfile in cold subvolume. Linkfile creation happens first. If linkfile creation was successful and data file creation failed, then linkfile in cold subvolume will become stale. This patch will delete the linkfile as well, if data file creation fails. Also this code duplicates dht_create to make tier_create Change-Id: I377a90dad47f288e9576c7323b23cf694a91a7a3 BUG: 1290677 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12948 Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
*	performance/write-behind: retry "failed syncs to backend"	Raghavendra G	2015-12-22	7	-142/+562
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	1. When sync fails, the cached-write is still preserved unless there is a flush/fsync waiting on it. 2. When a sync fails and there is a flush/fsync waiting on the cached-write, the cache is thrown away and no further retries will be made. In other words flush/fsync act as barriers for all the previous writes. The behaviour of fsync acting as a barrier is controlled by an option (see below for details). All previous writes are either successfully synced to backend or forgotten in case of an error. Without such barrier fop (especially flush which is issued prior to a close), we end up retrying for ever even after fd is closed. 3. If a fop is waiting on cached-write and syncing to backend fails, the waiting fop is failed. 4. sync failures when no fop is waiting are ignored and are not propagated to application. For eg., a. first attempt of sync of a cached-write w1 fails b. second attempt of sync of w1 succeeds If there are no fops dependent on w1 are issued b/w a and b, application won't know about failure encountered in a. 5. The effect of repeated sync failures is that, there will be no cache for future writes and they cannot be written behind. fsync as a barrier and resync of cached writes post fsync failure: ================================================================== Whether to keep retrying failed syncs post fsync is controlled by an option "resync-failed-syncs-after-fsync". By default, this option is set to "off". If sync of "cached-writes issued before fsync" (to backend) fails, this option configures whether to retry syncing them after fsync or forget them. If set to on, cached-writes are retried till a "flush" fop (or a successful sync) on sync failures. fsync itself is failed irrespective of the value of this option, when there is a sync failure of any cached-writes issued before fsync. Change-Id: I6097c9257bfb9ee5b15616fbe6a0576ae9af369a Signed-off-by: Raghavendra G <rgowdapp@redhat.com> BUG: 1279730 Reviewed-on: http://review.gluster.org/12594
*	libgfchangelog: Allocate logbuf_pool in master xlator	Kotresh HR	2015-12-22	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	The master xlator needs to allocate 'logbuf_pool' else 'gf_msg' fails with EINVAL. Change-Id: I6b2d3450250de7e77126d12b75b0dbc4db414bfb BUG: 1292463 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/12997 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com>
*	quota: fix backward compatibility of new quota volinfo option	vmallika	2015-12-22	2	-5/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	quota-version features is implemented for 3.7.6 please see below patch for more details: http://review.gluster.org/#/c/12386 As part of this feature, new volume info option 'quota-version' was introduced, this can cause check-sum problem when a one of the node in a cluster is upgraded to 3.7.6 (heterogeneous cluster) So do a OP_VERSION check when storing this option in volume info Change-Id: Ic5b03a1e3f1236b645a065b1fadee7950307e191 BUG: 1283178 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/12642 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	cluster/afr: Fix data loss due to race between sh and ongoing write	Krutika Dhananjay	2015-12-22	4	-7/+106
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: When IO is happening on a file and a brick goes down comes back up during this time, protocol/client translator attempts reopening of the fd on the gfid handle of the file. But if another client renames this file while a brick was down && writes were in progress on it, once this brick is back up, there can be a race between reopening of the fd and entry self-heal replaying the effect of the rename() on the sink brick. If the reopening of the fd happens first, the application's writes continue to go into the data blocks associated with the gfid. Now entry-self-heal deletes 'src' and creates 'dst' file on the sink, marking dst as a 'newentry'. Data self-heal is also completed on 'dst' as a result and self-heal terminates. If at this point the application is still writing to this fd, all writes on the file after self-heal would go into the data blocks associated with this fd, which would be lost once the fd is closed. The result - the 'dst' file on the source and sink are not the same and there is no pending heal on the file, leading to silent corruption on the sink. Fix: Leverage http://review.gluster.org/#/c/12816/ to ensure the gfid handle path gets saved in .glusterfs/unlink until the fd is closed on the file. During this time, when self-heal sends mknod() with gfid of the file, do the following: link() the gfid handle under .glusterfs/unlink to the new path to be created in mknod() and rename() the gfid handle to go back under .glusterfs/ab/cd/. Change-Id: I86ef1f97a76ffe11f32653bb995f575f7648f798 BUG: 1292379 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/13001 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	tier/snap : Adding tier-snapshot.t to bad test	Joseph Fernandes	2015-12-21	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	./tests/basic/tier/tier-snapshot.t to bad test On RHEL6 the pause tier mechanism times out due to slow IPC calls between the tier daemon and brick process. A workaround may be to increase the timeout or improve the speed of RHEL6 Change-Id: I5af6123923416e611672ffe2aaea3f0dd7964dd9 BUG: 1293523 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/13056 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
*	afr: warn if pending xattrs missing during init()	Ravishankar N	2015-12-21	2	-1/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since commit 6e635284a4411b816d4d860a28262c9e6dc4bd6a (glusterfs-3.7.7), the afr pending xattrs are stored in the volfile and used by afr when it initializes. If a cluster is upgraded, prevent afr from loading until the op-version has been bumped up to 3.7.7 and the volfiles have been regenerated using a volume set command. Without this fix, AFR will crash when initialzing. Change-Id: I14249dedb3f2f77cd754d78d8a9a70fdc5fc8c10 BUG: 1293293 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/13038 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	tier/unlink: open fd for special file for fdstat	Mohammed Rafi KC	2015-12-21	1	-11/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	DUring unlink of a file, dht request stat to see whether the file is under migration or not. But in posix_unlink currently we are opening for regular files. so the fdstat for special files are failing with EBAD Change-Id: Ic0678e42e7701c3dffb91d98272e664b0fc646b5 BUG: 1293256 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/13034 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Susant Palai <spalai@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
*	cluster/tier: do not block in synctask created from pause tier	Dan Lambright	2015-12-21	3	-18/+126
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We had run sleep() in the pause tier callback. Blocking within a synctask is dangerous. The sleep() call does not inform the synctask scheduler that a thread is no longer running. It therefore believes it is running. If a second synctask already exists, it may not be able to run. This occurs if the thread limit in the pool has been reached. Note the pool size only grows when a synctask is created, not when it is moved from wait state to run state, as is the case when an FOP completes. When the tier is paused during migration, synctasks already exist waiting for responses to FOPs to the server with high probability. The fix is to yield() in the RPC callback, which will place the synctask into the wait queue and free up a thread for the FOP callback. A timer wakes the callback after sufficient time has elapsed for the pause to occur. Change-Id: I6a947ee04c6e5649946cb6d8207ba17263a67fc6 BUG: 1267950 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12987 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
*	tier: Demotion failed if the file was renamed when it was in cold	Mohammed Rafi KC	2015-12-21	1	-11/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	During migration if the file is present we just open the file in hashed subvol. Now if the linkfile present on hashed is just linkfile to another subvol, we actually open in hashed subvol. But subsequent operation will go to linkto subvol ie, to non-hashed subvol. This operation will get failed since we haven't opened d on non-hashed. Change-Id: I9753ad3a48f0384c25509612ba76e7e10645add3 BUG: 1292067 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12980 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Susant Palai <spalai@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
*	tier/dht : Properly free file descriptors during data migration	Joseph Fernandes	2015-12-21	1	-6/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While tier migration, free src and dst fd's when create of destination or open of source fails. Change-Id: I62978a669c6c9fbab5fed9df2716b9b2ba00ddf1 BUG: 1291566 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12969 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
*	cli/xml: display correct xml output of tier volume	Gaurav Kumar Garg	2015-12-21	1	-15/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently When hot tier type is distributed-replicate and cold tier type is disperse volume then #gluster volume info --xml command is not giving its correct output. In case of HOT tier case its displaying wrong volume type. With this fix it will show correct xml output for tier volume irrespective of all the type of the volume's. Change-Id: If1de8d52d1e0ef3d0523163abed37b2b571715e8 BUG: 1292084 Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com> Reviewed-on: http://review.gluster.org/12982 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	geo-rep: Fix getting subvol number	Kotresh HR	2015-12-21	2	-17/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix getting subvol number if the volume type is tier. If the volume type was tier, the subvol number was calculated incorrectly and hence few of workers didn't become ACTIVE resulting in files not being replicated from corresponding brick. This patch addresses the same. Change-Id: Ic10ad7f09a0fa91b4bf2aa361dea3bd48be74853 BUG: 1292084 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/12994 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Aravinda VK <avishwan@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	afr: handle bad objects during lookup/inode_refresh	Ravishankar N	2015-12-20	2	-1/+67
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If an object (file) is marked bad by bitrot, do not consider the brick on which the object is present as a potential read subvolume for AFR irrespective of the pending xattr values. Also do not consider the brick containing the bad object while performing afr_accuse_smallfiles(). Otherwise if the bad object's size is bigger, we may end up considering that as the source. Change-Id: I4abc68e51e5c43c5adfa56e1c00b46db22c88cf7 BUG: 1290965 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/12955 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	heal : Do not print heal count on ENOTCONN	Anuradha Talur	2015-12-20	1	-8/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a brick is not reachable due to ENOTCONN, there are no entries gathered and this information should not be printed. It is a bug intorduced due to commit 9c378026e9561595586a817fee0b439e2c863a22, fixing it. Change-Id: I45559a9560c297854ea6b4177f86e0be30dc6b78 BUG: 1250803 Signed-off-by: Anuradha Talur <atalur@redhat.com> Reviewed-on: http://review.gluster.org/12919 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>