summaryrefslogtreecommitdiffstats
path: root/tests/bugs
Commit message (Collapse)AuthorAgeFilesLines
* performance/write-behind: retry "failed syncs to backend"Raghavendra G2015-12-223-35/+162
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. When sync fails, the cached-write is still preserved unless there is a flush/fsync waiting on it. 2. When a sync fails and there is a flush/fsync waiting on the cached-write, the cache is thrown away and no further retries will be made. In other words flush/fsync act as barriers for all the previous writes. The behaviour of fsync acting as a barrier is controlled by an option (see below for details). All previous writes are either successfully synced to backend or forgotten in case of an error. Without such barrier fop (especially flush which is issued prior to a close), we end up retrying for ever even after fd is closed. 3. If a fop is waiting on cached-write and syncing to backend fails, the waiting fop is failed. 4. sync failures when no fop is waiting are ignored and are not propagated to application. For eg., a. first attempt of sync of a cached-write w1 fails b. second attempt of sync of w1 succeeds If there are no fops dependent on w1 are issued b/w a and b, application won't know about failure encountered in a. 5. The effect of repeated sync failures is that, there will be no cache for future writes and they cannot be written behind. fsync as a barrier and resync of cached writes post fsync failure: ================================================================== Whether to keep retrying failed syncs post fsync is controlled by an option "resync-failed-syncs-after-fsync". By default, this option is set to "off". If sync of "cached-writes issued before fsync" (to backend) fails, this option configures whether to retry syncing them after fsync or forget them. If set to on, cached-writes are retried till a "flush" fop (or a successful sync) on sync failures. fsync itself is failed irrespective of the value of this option, when there is a sync failure of any cached-writes issued before fsync. Change-Id: I6097c9257bfb9ee5b15616fbe6a0576ae9af369a Signed-off-by: Raghavendra G <rgowdapp@redhat.com> BUG: 1279730 Reviewed-on: http://review.gluster.org/12594
* cluster/afr: Fix data loss due to race between sh and ongoing writeKrutika Dhananjay2015-12-221-0/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: When IO is happening on a file and a brick goes down comes back up during this time, protocol/client translator attempts reopening of the fd on the gfid handle of the file. But if another client renames this file while a brick was down && writes were in progress on it, once this brick is back up, there can be a race between reopening of the fd and entry self-heal replaying the effect of the rename() on the sink brick. If the reopening of the fd happens first, the application's writes continue to go into the data blocks associated with the gfid. Now entry-self-heal deletes 'src' and creates 'dst' file on the sink, marking dst as a 'newentry'. Data self-heal is also completed on 'dst' as a result and self-heal terminates. If at this point the application is still writing to this fd, all writes on the file after self-heal would go into the data blocks associated with this fd, which would be lost once the fd is closed. The result - the 'dst' file on the source and sink are not the same and there is no pending heal on the file, leading to silent corruption on the sink. Fix: Leverage http://review.gluster.org/#/c/12816/ to ensure the gfid handle path gets saved in .glusterfs/unlink until the fd is closed on the file. During this time, when self-heal sends mknod() with gfid of the file, do the following: link() the gfid handle under .glusterfs/unlink to the new path to be created in mknod() and rename() the gfid handle to go back under .glusterfs/ab/cd/. Change-Id: I86ef1f97a76ffe11f32653bb995f575f7648f798 BUG: 1292379 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/13001 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd/afr: store afr pending xattrs as a volume optionRavishankar N2015-12-152-0/+2
| | | | | | | | | | | | | | | | | | | | | | | Problem: When AFR xlator initialises, it uses the name of the client xlators below it for storing the pending changelogs (xattrs). This can be problem when some other xlator is loaded in between AFR and the client. Though that is a trivial 'traverse-graph-till-the-client-and-use-the-name' fix in AFR's init(), there are other issues like when there's no client xlator at all when, say, AFR is moved to the server side. Fix: The client xlator names are currenly unique and stored as brickinfo->brick_ids. So persist these ids as comma separated values in AFR's volume_options and use them as xattr values during init(). Change-Id: Ie761ffeb3373a4c4d85ad05c84a768c4188aa90d BUG: 1285152 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/12738 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* tests:bugs:fuse: add setup.sh and teardown.sh to facilitate manual testingMichael Adam2015-12-112-0/+25
| | | | | | | | | | Change-Id: Ia8fe402663bbdabdc10c18ab42a2063466eb42b6 BUG: 1286735 Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-on: http://review.gluster.org/12830 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaushal M <kaushal@redhat.com>
* tests:bugs:fuse: add test for bug #1283103 - selinux mount vs security xattrsMichael Adam2015-12-111-0/+59
| | | | | | | | | | BUG: 1283103 Change-Id: Ic4485d650275f67eb6b0b8382a92eb829c06e27c Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-on: http://review.gluster.org/12827 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* tests, shard: Remove dependency on strict-write-orderingKrutika Dhananjay2015-12-105-10/+9
| | | | | | | | | | Change-Id: I00171a77bdefb1c2e7e4610cb0ade5679bdb761f BUG: 1289840 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/12915 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* tests/tier: spurious failure in rename testN Balachandran2015-12-101-4/+8
| | | | | | | | | | | | | | | bug-1279376-rename-demoted-file.t fails sometimes The fix is based on the assumption that the test failed because the demotion happened too quickly. Change-Id: Ieccc736f387fcf6afaa72fa9918adb6dd34f2c8a BUG: 1289845 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12926 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tests/bug-924726.t: fix grep pattern to get correct glusterfs pidRaghavendra G2015-12-091-1/+1
| | | | | | | | | Change-Id: Ia2444b1b3e45e3e224bcd59e780a0f38c492f133 BUG: 1289428 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/12906 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* tests: fix brick_up_statusKaushal M2015-12-093-3/+3
| | | | | | | | | | | | | | | The brick_up_status function wasn't correct after the introduction of the RDMA port into the `volume status` output. It has been fixed to use the XML brick status of a specific brick instead of normal CLI output. Change-Id: I5327e1a32b1c6f326bc3def735d0daa9ea320074 BUG: 1289584 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/12913 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* quota: copy quota_version value in func glusterd_volinfo_dupvmallika2015-12-081-0/+48
| | | | | | | | | | | | | | | | | | | | | quota_version is a new variable introduced for quota xattr versioning feature. quota_version was not copied when creating duplicate volinfo in function 'glusterd_volinfo_dup' so any feature like snapshot/tiering using glusterd_volinfo_dup will get the default value of quota_version instead of the correct number and can cause a problem Change-Id: I7b0f418002d49aa7210e2e741e65ee5b2593e6a6 BUG: 1288474 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/12881 Reviewed-by: Manikandan Selvaganesh <mselvaga@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd/quota: quota-version conflict in export/import volinfovmallika2015-12-081-0/+24
| | | | | | | | | | | | | | | When exporting/importing voinfo during handshake, quota conf and quota xattr version were using same key 'quota-version' and updated wrong values when importing quota version values. Change-Id: If939d6f5bc4851d4114963877be72dda21834f0f BUG: 1287996 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/12865 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* tier/glusterd: Check before starting tier daemon during volume startMohammed Rafi KC2015-12-081-0/+72
| | | | | | | | | | | | | | | | | We start tier daemon when volume is started without looking into the previous status. The problem with that if detach-tier is started and then volume force start is actually starting tier daemon. This is also fixes a problem where tier daemon is not starting after detach stop. Change-Id: I15b56a711e12f0e24f5ab123561258bd448621f7 BUG: 1286974 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12833 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* snapshot/clone : Fix tier pause failure for snapshot cloneAvra Sengupta2015-12-022-8/+30
| | | | | | | | | | | | | | | | | | | | On a tiered volume, snapshot clone fails while trying to pause tier, as we pass volname(snap) to the brick_op_phase module, which tries to look for the snap volume amongst regular volumes, and obviously doesn't find it and fail. Well as snapshot volumes are read only volume, and will not have tiering daemon acting upon them, there is really no need to pause tiereing while taking clone of snapshot volumes. Hence removing the code to pause and resume tiering during clone create. Change-Id: I2266aba589a830a13a806c0d8a56fd8855143ccd BUG: 1279327 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/12548 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
* cluster/tier: Do not delete linkto file on demotionN Balachandran2015-11-181-0/+88
| | | | | | | | | | | | | | | | | The current DHT migration code will always delete the src linkto file after migration as dht always moves files to the hashed subvol. This is not the case in tiering. The lack of linkto files causes rename to fail leaving 2 files with the same name but different gfids on the volume. Modified to leave the linkto file behind if the source volume is the hashed subvolume. Change-Id: I2b99f7d34b4b719aee6232dc40c6a8f8ba88225d BUG: 1279376 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12551 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* quota: fix spurious failurevmallika2015-11-171-2/+1
| | | | | | | | | | Change-Id: I5d18533d66df3175752a73430f680dcdfdb3c12a BUG: 1278689 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/12546 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* features/gfid-access: Fix entry creation via setxattr for geo-repKotresh HR2015-11-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GEO-REP INTEROP WITH SHARD FEATURE Problem: Geo-replication uses setxattr interface of gfid-access xlator to create entries and send explicit setattr after entry creation to set uid and gid. But between entry creation and setattr, the inode would not be linked. Hence operation which accesses inode structure during setattr by any the below xlator fails. Solution: Linking inode would seem the obvious solution but, gfid-access xlator cannot link inodes and maintain it as it would result in same inode pointing to two different paths one being virtual .gfid/<gfid> path and other being actual path. The solution is to set uid and gid in frame->root->uid and frame->root->gid respectively from which posix extracts and sets. Change-Id: Ic0749ee471432caeb8ded3152a07de6e64d8538d BUG: 1265148 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/12206 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Aravinda VK <avishwan@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com>
* quota: fix for spurious failurevmallika2015-11-061-0/+3
| | | | | | | | | | | | | | Filed a bug# 1278689. For now marking the testcase tests/bugs/quota/bug-1235182.t' bad once the bug# 1278689, remove the testcase from bad list Change-Id: I224f907153d3e5f35834007a40b0050246d8787a BUG: 1278689 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/12526 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* snapshot: Making bug-1275616.t more regression failure tolerantAvra Sengupta2015-11-061-0/+10
| | | | | | | | | | | | | | | | | | | snapshot clone creation fails 'spuriously' on the regression setup coz the brick rpc connect for snap3 in the testcase, happens way after the snap was created. So adding a EXPECT_WITHIN $PROCESS_UP_TIMEOUT check(read delay) to help the cause. But this isn't a 100% guaranteed fix, as on an even slower machine, even this check will fail followed by the subsequent failures that this patch is trying to fix in the first place Change-Id: I2f31558b717fd610111f14e451fe444c09f3f254 BUG: 1278418 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/12516 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* snapshot: Inherit snap-max-hard-limit from original volumeAvra Sengupta2015-11-021-0/+48
| | | | | | | | | | | | | | | | | | A snapshot should inherit snap-max-hard-limit from the original volume while being created and when being restored to, it should restore the same. Similarly a clone taken from a snapshot should inherit snap-max-hard-limit from the snapshot. Change-Id: If8e90e2ffc10e22086b803ac8e2638a16bcec968 BUG: 1275616 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/12437 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
* snapshot: Don't display snapshot's hard-limit and soft-limit in vol infoAvra Sengupta2015-11-021-5/+6
| | | | | | | | | | | | | | | | | | | | | | | The snap-max-hard-limit being displayed in the volume info currently is propagated from system's snap-max-hard-limit as that is a global option common for all volumes, and hence ends up showing the system's snap-max-hard-limit. We should not be displaying snap-max-hard-limit and snap-max-soft-limit in the volume info at all, as these are snap config options and should be set and displayed via snap config command. Modified bug-1113476.t to test the same behaviour. Change-Id: I90891f0cf7fb39fd686787297c7f7cd8c1e7daa1 BUG: 1276018 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/12443 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* quota: add version to quota xattrsvmallika2015-11-021-2/+2
| | | | | | | | | | | | | | | | | | | | | When a quota is disable and the clean-up process terminated without completely cleaning-up the quota xattrs. Now when quota is enabled again, this can mess-up the accounting A version number is suffixed for all quota xattrs and this version number is specific to marker xaltor, i.e when quota xattrs are requested by quotad/client marker will remove the version suffix in the key before sending the response Change-Id: I1ca2c11460645edba0f6b68db70d476d8d26e1eb BUG: 1272411 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/12386 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Manikandan Selvaganesh <mselvaga@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* features/shard: Force cache-refresh when lookup/readdirp/stat detect that ↵Krutika Dhananjay2015-10-281-0/+36
| | | | | | | | | | | xattr value has changed Change-Id: Ia3225a523287f6689b966ba4f893fc1b1fa54817 BUG: 1272986 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/12400 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* features/snap : cleanup the root loc in statfsAshish Pandey2015-10-201-0/+25
| | | | | | | | | | | | | | | | Problem : In svc_statfs function, wipe_loc is getting called on loc passed by nfs. This loc is being used by svc_stat which throws erro if loc->inode is NULL. Solution : wipe_loc should be called on local root_loc. Change-Id: I9cc5ee3b1bd9f352f2362a6d997b7b09051c0f68 BUG: 1260848 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/12123 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* glusterd: disabling enable-shared-storage option should not delete volumeGaurav Kumar Garg2015-10-131-0/+36
| | | | | | | | | | | | | | | | | | | Previously when you create volume with "glusterd_shared_storage" name and if user disable enable-shared-storage option then gluster will delete the "glusterd_shared_storage" volume. With this fix gluster will do appropriate validation of enable-shared-storage option and it will not delete volume with "glusterd_shared_storage" name if it is a user created volume. Change-Id: I2bd92f938fb3de6ef496a934933bdcea9f251491 BUG: 1266818 Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com> Reviewed-on: http://review.gluster.org/12232 Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Anand Nekkunti <anekkunt@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* quota/marker: dir_count accounting is not atomicvmallika2015-10-121-0/+50
| | | | | | | | | | | | | | | | | | | | | | | Consider below scenario: Quota enabled on pre-existing data Now quota-crawl process will start healing xattrs Now if write is performed where healing is not complete, there is a possibility that 'update txn' is started before 'create xattr txn', in this case dir count can be missed on a dir where quota size xattr is not yet created. Solution is to get size xattr and if xattr is missing, add 1 for dir_count, this requires one additional fop if done in marker during each update iteration Better solution is to us xattrop GF_XATTROP_ADD_ARRAY64_WITH_DEFAULT Change-Id: Idc8978860a3914e70c98f96effeff52e9a24e6ba BUG: 1243798 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/11694 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* xlators: add JSON FOP statistics dumps every N secondsRichard Wareing2015-10-083-3/+13
| | | | | | | | | | | | | | | | | | | | | | | Summary: - Adds a thread to the io-stats translator which dumps out statistics every N seconds where N is configurable by an option called "diagnostics.stats-dump-interval" - Thread cleanly starts/stops when translator is unloaded - Updates macros to use "Atomic Builtins" (e.g. intel CPU extentions) to use memory barries to update counters vs using locks. This should reduce overhead and prevent any deadlock bugs due to lock contention. Test Plan: - Test on development machine - Run prove -v tests/basic/stats-dump.t Change-Id: If071239d8fdc185e4e8fd527363cc042447a245d BUG: 1266476 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/12209 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Avra Sengupta <asengupt@redhat.com>
* quota: use copy_frame when creating new frame during quota_check_limitvmallika2015-10-061-1/+2
| | | | | | | | | | | | | | | | | DHT re-balance, sets frame root PID < 0 and quota_check_limit skips enforcement if this PID is less than 0. When creating new frame for quota_check_limit we need to use copy_frame instead of create_frame, so that all auth information are copied from original frame. Change-Id: Ib3b4a3744f8b0d72a8bc32826f6edae836d6faed BUG: 1267812 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/12265 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* glusterd: validate function for replica volume optionsSakshi2015-10-011-0/+67
| | | | | | | | | | Change-Id: I5b4a28db101e9f7e07f4b388c7a2594051c9e8dd BUG: 1265479 Signed-off-by: Sakshi <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/12215 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterfsd : newly added brick receives fops only after it is startedSakshi2015-09-221-0/+4
| | | | | | | | | | | | | | | | | | | When new bricks are added in the middle of an on-going fop like 'rm', the volfile changes without waiting for the newly added bricks to get port. Fops are sent to all bricks and may fail on some with ENOTCONN as these bricks may not have a port yet. This patch ensures that the volfile change happens only after all the bricks have a port. Change-Id: I7ed2413475f80d0cc8849fed33036ade8d75a191 BUG: 1233151 Signed-off-by: Sakshi <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/11342 Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Tested-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd : check if all bricks are started before performing remove-brickSakshi2015-09-223-0/+36
| | | | | | | | | | Change-Id: Ie9e24e037b7a39b239a7badb983504963d664324 BUG: 1225716 Signed-off-by: Sakshi <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/10954 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* Tier/cli: Change detach-tier commit force to detach-tier forceMohammed Rafi KC2015-09-221-1/+1
| | | | | | | | | | | | | | | | | | Current detach-tier cli command support commit force. Deprecating the same to force. So the new syntax would be: volume detach-tier <VOLNAME> <start|stop|status|commit|force> Change-Id: Ie86dfd72341078c0a1be94767f523730911312ef BUG: 1261862 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12151 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* marker: don't account destination linkto-file during internal migrationvmallika2015-09-221-0/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | During a DHT re-balance operation, quota accounts for the destination. Problem of accounting this destination file are: 1) Migration is an internal operation, 'quota list' shows more usage on the CLI and this will come to the normal numbers once the migration is complete 2) If the usage is close to the limit set, then we can get 'Disk Quota Exceeded' errors in the I/O path during file migration Solution is we should not account of the usage on the destination file during migration, at the end of the migration. We need to reduce size of the source directory and accounting for the migrated dest file We assume that there are sufficent disk space in the back-end. DHT migrator should make sure that there are sufficient disk space before it starts the migration process. Change-Id: Ie3cfe3e4ab5241c2a127ba0edc599a053d30c3a0 BUG: 1260545 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/12113 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* features/shard: Filter internal shard xattrs in {get,remove,set}xattrKrutika Dhananjay2015-09-081-0/+41
| | | | | | | | | Change-Id: I40e4a5dbd13d6c3d777e7e01f93dabc83e52b137 BUG: 1260637 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/12121 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* glusterd: Do not allow "detach-tier commit" unnecessarilyGaurav Kumar Garg2015-09-071-0/+41
| | | | | | | | | | | | | | | | | Currently when user execute gluster v detach-tier commit command without starting detach-tier or without giving force option then gluster will success this operation. Detach-tier commit should not allow without giving "force" optioin. Change-Id: Id161c288f6f3e0f6b298878a5c35a49fcbd9c6e3 BUG: 1260185 Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com> Reviewed-on: http://review.gluster.org/12107 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* features/shard: Fix incorrect op_ret in READVKrutika Dhananjay2015-09-031-0/+41
| | | | | | | | | Change-Id: I31ac99b290f82f4b74236c206193f7641c73d4dc BUG: 1259651 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/12099 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* nfs: Fixes "Remote I/O error" mount failuresRichard Wareing2015-09-011-0/+28
| | | | | | | | | | | | | | | | | - Fixes issue where NFS mount fail with "Remove I/O error" after the target directory has been deleted and re-created after the gNFSd has already cached the inode of the first generation of the target directory. - The solution is to follow the guidance of the AFR2 comments and refresh the inode by deleting it from cache and looking it up again. BUG: 1258196 Change-Id: I9c7d8bd460ee9e5ea0b5b47d23886b1afcdcd563 Reported-by: Richard Wareing <rwareing@fb.com> Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/12046 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd: Return better error messages for probe and detach failuresBrad Hubbard2015-09-011-1/+1
| | | | | | | | | | | | | | | We handle some specific errors and return good error messages for those, but for the default case where the error code is not recognised we just report "unknown errno". This patch attempts to at least return the output of strerror to provide more informative errors. BUG: 1257149 Change-Id: I0027e74e41adac4ab0c0a929c6fff56878bf39c8 Signed-off-by: Brad Hubbard <bhubbard@redhat.com> Reviewed-on: http://review.gluster.org/12021 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* features/shard: Fix unlink failure due to non-existent shard(s)Krutika Dhananjay2015-08-311-0/+41
| | | | | | | | | | | | | | | | | | | | Unlink of a sharded file with holes was leading to EINVAL errors because it was being wound on non-existent shards (those blocks that fall in the hole region). loc->inode was NULL in these cases and dht_unlink used to fail the FOP with EINVAL for failure to fetch cached subvol for the inode. The fix involves winding unlink on only those shards whose corresponding inodes exist in memory. Change-Id: I993ff70cab4b22580c772a9c74fc19ac893a03fc BUG: 1258334 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/12059 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: Don't set posix acls on linkto filesNithya Balachandran2015-08-311-0/+57
| | | | | | | | | | | | | | | | | | | | Posix acls on a linkto file change the file's permission bits and cause DHT to treat it as a non-linkto file.This happens on the migration failure of a file on which posix acls were set. The fix prevents posix acls from being set on a linkto file and copies them across only after a file has been successfully migrated. Change-Id: Iccf7ff6fba49fe05d691d9b83bf76a240848b212 BUG: 1247563 Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12025 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* features/shard: Fix permission issuesKrutika Dhananjay2015-08-301-11/+57
| | | | | | | | | | | | | | | | | | | | This patch does the following: * reverts commit b467af0e99b39ef708420d3f7f6696b0ca618512 * changes ownership on shards under /.shard to be root:root * makes readv, writev, [f]truncate, rename, and unlink fops to perform operations on files under /.shard with frame->root->{uid,gid} as 0. This would ensure that a [f]setattr on a sharded file does not need to be called on all the shards associated with it. Change-Id: Idcfb8c0dd354b0baab6b2356d2ab83ce51caa20e BUG: 1251824 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/11992 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/afr: Make [f]xattrop metadata transactionPranith Kumar K2015-08-301-0/+38
| | | | | | | | | | | | | | | | | | | Problem: When xlators above afr do [f]xattrop when one of the bricks is down, after the brick comes backup, the metadata is not healed because [f]xattrop is not considered a transaction. Fix: Treat [f]xattrop as transaction so that changes done by xlators above afr are marked for heal when some of the bricks were down at the time of [f]xattrop. Change-Id: Iea180f9a456509847c3cd8d5d59a0cdc2712d334 BUG: 1248887 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/11809 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* features/shard: Fix size update for writes at hole regionKrutika Dhananjay2015-08-291-0/+35
| | | | | | | | | | Change-Id: Iceccef8f3f466c7ffb9991f8eb248b81e7b80efb BUG: 1256580 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/12020 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* glusterd: Don't allow remove brick start/commit if glusterd is down of the ↵Atin Mukherjee2015-08-261-0/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | host of the brick remove brick stage blindly starts the remove brick operation even if the glusterd instance of the node hosting the brick is down. Operationally its incorrect and this could result into a inconsistent rebalance status across all the nodes as the originator of this command will always have the rebalance status to 'DEFRAG_NOT_STARTED', however when the glusterd instance on the other nodes comes up, will trigger rebalance and make the status to completed once the rebalance is finished. This patch fixes two things: 1. Add a validation in remove brick to check whether all the peers hosting the bricks to be removed are up. 2. Don't copy volinfo->rebal.dict from stale volinfo during restore as this might end up in a incosistent node_state.info file resulting into volume status command failure. Change-Id: Ia4a76865c05037d49eec5e3bbfaf68c1567f1f81 BUG: 1245045 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/11726 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
* cluster/afr : Examine data/metadata readable for read-subvolAnuradha Talur2015-08-251-0/+48
| | | | | | | | | | | | | | | | | | During lookup and discover, currently read_subvol is based only on data_readable. read_subvol should be decided based on both data_readable and metadata_readable. Credits to Ravishankar N for the logic of afr_first_up_child from http://review.gluster.org/10905/ . Change-Id: I98580b23c278172ee2902be08eeaafb6722e830c BUG: 1240244 Signed-off-by: Anuradha Talur <atalur@redhat.com> Reviewed-on: http://review.gluster.org/11551 Reviewed-by: Ravishankar N <ravishankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* afr: modify afr_txn_nothing_failed()Ravishankar N2015-08-252-0/+91
| | | | | | | | | | | | | | | | | | In an AFR transaction, we need to consider something as failed only if the failure (either in the pre-op or the FOP phase) occurs on the bricks on which a transaction lock was obtained. Without this, we would end up considering the transaction as failure even on the bricks on which the lock was not obtained, resulting in unnecessary fsyncs during the post-op phase of every write transaction for non-appending writes. Change-Id: Iee79e5d85dc7b4c41459d8bdd04a8454bdaf9a9d BUG: 1250170 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/11827 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* glusterd: stop all the daemons services on peer detachGaurav Kumar Garg2015-08-241-0/+41
| | | | | | | | | | | | | | | Currently glusterd is not stopping all the deamon service on peer detach With this fix it will do peer detach cleanup properlly and will stop all the daemon which was running before peer detach on the node. Change-Id: Ifed403ed09187e84f2a60bf63135156ad1f15775 BUG: 1255386 Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com> Reviewed-on: http://review.gluster.org/11509 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* snapshot: Fix snapshot info's xml outputAvra Sengupta2015-08-241-0/+26
| | | | | | | | | | | | | | Display description field with (null) if no description is present for the snapshot, instead of removing the field altogether. Change-Id: I965b08cd6e54eea56c32e2712fab7daa8a663f11 BUG: 1250387 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/11834 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
* quota/marker: fix inode quota with renamevmallika2015-08-192-4/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are three problems with marker-rename which is fixed in this patch Problem 1) 1) mq_reduce_parent_size is not handling inode-quota contribution 2) When dest files exists and IO is happening Now renaming will overwrite existing file mq_reduce_parent_size called on dest file with saved contribution, this can be a problem is IO is still happening contribution might have changed Problem 2) There is a small race between rename and in-progress write Consider below scenario 1) rename FOP invoked on file 'x' 2) write is still in progress for file 'x' 3) rename takes a lock on old-parent 4) write-update txn blocked on old-parent to acquire lock 5) in rename_cbk, contri xattrs are removed and contribution is deleted and lock is released 6) now write-update txn gets the lock and updates the wrong parent as it was holding lock on old parent so validate parent once the lock is acquired Problem 3) when a rename operation is performed, a lock is held on old parent. This lock is release before unwinding the rename operation. This can be a problem if there are in-progress writes happening during rename, where update txn can take a lock and update the old parent as inode table is not updated with new parent Change-Id: Ic3316097c001c33533f98592e8fcf234b1ee2aa2 BUG: 1240991 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/11578 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* features/shard: Ensure shards are owned by the same owner/group as the ↵Krutika Dhananjay2015-08-191-0/+61
| | | | | | | | | | | | original file Change-Id: Id759af8f3ff5fd8bfa9f8121bab25722709d42b7 BUG: 1251824 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/11874 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/ec: Fix write size in self-healXavier Hernandez2015-08-141-0/+50
| | | | | | | | | | | | | | | | | | Self-heal was always using a fixed block size to heal a file. This was incorrect for dispersed volumes with a number of data bricks not being a power of 2. This patch adjusts the block size to a multiple of the stripe size of the volume. It also propagates errors detected during the data heal to stop healing the file and not mark it as healed. Change-Id: I9ee3fde98a9e5d6116fd096ceef88686fd1d28e2 BUG: 1251446 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/11862 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>