summaryrefslogtreecommitdiffstats
path: root/xlators/cluster
Commit message (Collapse)AuthorAgeFilesLines
...
* tier: Demotion failed if the file was renamed when it was in coldMohammed Rafi KC2015-12-301-11/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During migration if the file is present we just open the file in hashed subvol. Now if the linkfile present on hashed is just linkfile to another subvol, we actually open in hashed subvol. But subsequent operation will go to linkto subvol ie, to non-hashed subvol. This operation will get failed since we haven't opened d on non-hashed. Backport of> >Change-Id: I9753ad3a48f0384c25509612ba76e7e10645add3 >BUG: 1292067 >Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> >Reviewed-on: http://review.gluster.org/12980 >Tested-by: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Susant Palai <spalai@redhat.com> >Tested-by: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Dan Lambright <dlambrig@redhat.com> >Tested-by: Dan Lambright <dlambrig@redhat.com> (cherry picked from commit d2f48214d436be633efb1136ee951b0736935143) Change-Id: I562ac4ba73e6b572bc1be91e55a029fa75262b33 BUG: 1293342 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/13045 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* afr: warn if pending xattrs missing during init()Ravishankar N2015-12-282-1/+17
| | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/13038/ Since commit 6e635284a4411b816d4d860a28262c9e6dc4bd6a (glusterfs-3.7.7), the afr pending xattrs are stored in the volfile and used by afr when it initializes. If a cluster is upgraded, prevent afr from loading until the op-version has been bumped up to 3.7.7 and the volfiles have been regenerated using a volume set command. Without this fix, AFR will crash when initialzing. Change-Id: I14249dedb3f2f77cd754d78d8a9a70fdc5fc8c10 BUG: 1293536 Signed-off-by: Ravishankar N <ravishankar@redhat.com> (cherry picked from commit 4bfbabfdd698e93a1dc1aad5590ed18f10936c55) Reviewed-on: http://review.gluster.org/13058 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* afr: refresh inode using fstatRavishankar N2015-12-281-26/+91
| | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/12894 For fd based operations (fgetxattr, readv etc.) if an inode refresh is required, do so using fstat instead of lookup. This is because the file might have been deleted by another client before refresh but posix mandates that FOPS using already open fds must still succeed. Change-Id: Id5f71c3af4892b648eb747f363dffe6208e7ac09 BUG: 1290363 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/13040 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/tier: do not block in synctask created from pause tierDan Lambright2015-12-253-18/+126
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We had run sleep() in the pause tier callback. Blocking within a synctask is dangerous. The sleep() call does not inform the synctask scheduler that a thread is no longer running. It therefore believes it is running. If a second synctask already exists, it may not be able to run. This occurs if the thread limit in the pool has been reached. Note the pool size only grows when a synctask is created, not when it is moved from wait state to run state, as is the case when an FOP completes. When the tier is paused during migration, synctasks already exist waiting for responses to FOPs to the server with high probability. The fix is to yield() in the RPC callback, which will place the synctask into the wait queue and free up a thread for the FOP callback. A timer wakes the callback after sufficient time has elapsed for the pause to occur. This is a backport of 12987. > Change-Id: I6a947ee04c6e5649946cb6d8207ba17263a67fc6 > BUG: 1267950 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12987 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: I9bb7564d57d59abb98e89c8d54c8b1ff4a5fc3f3 BUG: 1274100 Reviewed-on: http://review.gluster.org/13078 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/dht : Properly free file descriptors during data migrationJoseph Fernandes2015-12-221-6/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | While tier migration, free src and dst fd's when create of destination or open of source fails. Backport of http://review.gluster.org/12969 > Change-Id: I62978a669c6c9fbab5fed9df2716b9b2ba00ddf1 > BUG: 1291566 > Signed-off-by: Joseph Fernandes <josferna@redhat.com> > Reviewed-on: http://review.gluster.org/12969 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: N Balachandran <nbalacha@redhat.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Tested-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Change-Id: I4d745076b3258a64584778ba88dd665ddd907d27 BUG: 1293348 Reviewed-on: http://review.gluster.org/13046 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* Tier: "tier start force" command implementationhari gowtham2015-12-222-3/+3
| | | | | | | | | | | | | | | | | | | | | back port of : http://review.gluster.org/#/c/12983/ The start command doesnt restart the tier deamon if the deamon is running at one node. hence to bring up the tierd on the nodes where the deamon is down, the force command is implemented. It skips the check for tierd running. >Change-Id: I0037d3e5ecfe56637d0da201a97903c435d26436 >BUG: 1292112 >Signed-off-by: hari gowtham <hgowtham@redhat.com> Change-Id: Idaca442c1a41ded8bf555a6e34eed0ebb9ea4034 BUG: 1293698 Signed-off-by: hari <hgowtham@redhat.com> Reviewed-on: http://review.gluster.org/13069 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/afr : Remove stale indicesAshish Pandey2015-12-221-1/+22
| | | | | | | | | | | Change-Id: Iba23338a452b49dc9fe6ae7b4ca108ebc377fe42 BUG: 1283036 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/12336 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/12604 Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* heal : Changed heal info to process all indices directoriesAnuradha Talur2015-12-217-58/+102
| | | | | | | | | | | | Backport of http://review.gluster.org/#/c/12658/ Change-Id: Ida863844e14309b6526c1b8434273fbf05c410d2 BUG: 1287531 Signed-off-by: Anuradha Talur <atalur@redhat.com> Reviewed-on: http://review.gluster.org/12854 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/afr : Readdirp performance enhancementAnuradha Talur2015-12-215-82/+162
| | | | | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/#/c/12507/ Things done : 1) during lookup and inode_refresh as part of read_txn, request is sent to detect if heal is required or not. 2) If heal is required, be conservative in setting the readdirp entry inodes to NULL, otherwise don't be. 3) Self-heal-daemon now crawls both indices/xattrop and indices/dirty directory while healing. Change-Id: Ic4a4da63fb7e0726eab5f341a200859b29cf7eb7 BUG: 1287531 Signed-off-by: Anuradha Talur <atalur@redhat.com> Reviewed-on: http://review.gluster.org/12507 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Signed-off-by: Anuradha Talur <atalur@redhat.com> Reviewed-on: http://review.gluster.org/12853
* cluster/tier: fix tier-max-files bookeeping and helpDan Lambright2015-12-202-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The option tier-max-files represents the maximum number of files trasferred by a node in a gives cycle. Fix help message to reflect the "per node" aspect. The code transferred one more file per cycle than the given value. Also change the default values of max file and max bytes to very large values, effectively we will not throttle migration unless the administrator requests it via CLI. This is a backport of 12984. > Change-Id: Ic2949ed3d8c35afe7c9ae4db72195603cfb2e28f > BUG: 1292671 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12984 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: Ibb234976f912ff4b20e894f2984c63792acfe814 BUG: 1292945 Reviewed-on: http://review.gluster.org/13004 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/afr: During name heal, propagate EIO only on gfid or type mismatchKrutika Dhananjay2015-12-181-5/+13
| | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/#/c/12973/ When the disk associated with a brick returns EIO during lookup, chances are that name heal would return an EIO because one of the syncop_XXX() operations as part of it returned an EIO. This is inherently treated by afr_lookup_selfheal_wrap() as a split-brain and the lookup is aborted prematurely with EIO even if it succeeded on the other replica(s). Change-Id: I754fa59c585712b8037f98a8c3c1737a2167fa1b BUG: 1292046 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/12979 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* glusterd/afr: store afr pending xattrs as a volume optionRavishankar N2015-12-161-9/+23
| | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/12738/ Problem: When AFR xlator initialises, it uses the name of the client xlators below it for storing the pending changelogs (xattrs). This can be problem when some other xlator is loaded in between AFR and the client. Though that is a trivial 'traverse-graph-till-the-client-and-use-the-name' fix in AFR's init(), there are other issues like when there's no client xlator at all when, say, AFR is moved to the server side. Fix: The client xlator names are currenly unique and stored as brickinfo->brick_ids. So persist these ids as comma separated values in AFR's volume_options and use them as xattr values during init(). Change-Id: Ie761ffeb3373a4c4d85ad05c84a768c4188aa90d BUG: 1291985 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/12977 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* dht/rebalance: Use seperate return variable for destination cleanupJoseph Fernandes2015-12-151-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | Use seperate local return variable for destination cleanup, without messing with the function return variable. Backport http://review.gluster.org/12956 > Change-Id: Iaea9ed2927234fdb888aef7a31ec362090e98196 > BUG: 1290975 > Signed-off-by: Joseph Fernandes <josferna@redhat.com> > Reviewed-on: http://review.gluster.org/12956 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: N Balachandran <nbalacha@redhat.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Tested-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Change-Id: I2df0e29321647fd6eb1719eb3cb7d657fbed3793 BUG: 1291002 Reviewed-on: http://review.gluster.org/12959 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/dht: files are still going to decommissioned subvolMohammed Rafi KC2015-12-111-3/+27
| | | | | | | | | | | | | | | | | | | | | | | | After detach tier start, creates are still going to hot tier. Because when creating data files we are not checking for decommissioned bricks. Backpor of> >Change-Id: I8e28258d9b2367dcc8ad6e5e91d0e54d92fdf771 >BUG: 1289602 >Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> >Reviewed-on: http://review.gluster.org/12914 >Tested-by: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Dan Lambright <dlambrig@redhat.com> >Tested-by: Dan Lambright <dlambrig@redhat.com> (cherry picked from commit cb41de7c4a7d620dbab645f2b105286b68bde9a9) Change-Id: I56e4551657a234ac51fa2513dc46a48c19c4fac7 BUG: 1289602 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12937 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* tier : Spawn promotion or demotion thread depending on local bricksJoseph Fernandes2015-12-091-22/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | Spawn Promotion or Demotion depending if there are any Cold or Hot bricks present localy. IF the local HOT brick list is empty dont spawn demote thread. IF the local COLD brick list is empty dont spawn promote thread. Backport of http://review.gluster.org/12912 > Change-Id: I524730e59414dd156c78ec0bd7a3629212697e6e > BUG: 1289578 > Signed-off-by: Joseph Fernandes <josferna@redhat.com> > Reviewed-on: http://review.gluster.org/12912 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: N Balachandran <nbalacha@redhat.com> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Tested-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Change-Id: Ic3113934051c7a751ae56508e00d098d010f4c0e BUG: 1290048 Reviewed-on: http://review.gluster.org/12928 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/dht: Fix mem leak from lookup response dictJoseph Fernandes2015-12-091-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | Fixing memory leak from response dict during a parent lookup to get the path. Backport of http://review.gluster.org/12867 > Change-Id: I60c23d0b25e7f763f0e53c40e71ee053aba6d555 > BUG: 1288019 > Signed-off-by: Joseph Fernandes <josferna@redhat.com> > Reviewed-on: http://review.gluster.org/12867 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Joseph Fernandes > Tested-by: Joseph Fernandes Signed-off-by: Joseph Fernandes <josferna@redhat.com> Change-Id: I77dfbb097e0200ace4c1adcf48560caf283007d7 BUG: 1288992 Reviewed-on: http://review.gluster.org/12889 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier : Fix double free in tier processN Balachandran2015-12-081-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | The tier process tries to free ipc_ctr_params twice if the syncop_ipc call in tier_process_ctr_query fails. ipc_ctr_params is freed when ctr_ipc_in_dict is freed. But ctr_ipc_out_dict is NULL when syncop_ipc fails, causing GF_FREE to be called on a non-NULL ipc_ctr_params ptr again. > Change-Id: Ia15f36dfbcd97be5524588beb7caad5cb79efdb4 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/12890 > Reviewed-by: Joseph Fernandes > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > (cherry picked from commit 06818a0fd69bb0d6daabde73e5c3cc2661a70854) Change-Id: Ida2da0416272ff1b04cf51f76467e27b62121f41 BUG: 1289414 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12904 Reviewed-by: Joseph Fernandes Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/tier: Ignoring status of already migrated filesJoseph Fernandes2015-12-061-5/+16
| | | | | | | | | | | | | | | | | | | | | | | Ignore the status of already migrated files and in the process don't count. Backport http://review.gluster.org/12758 > Change-Id: Idba6402508d51a4285ac96742c6edf797ee51b6a > BUG: 1276141 > Signed-off-by: Joseph Fernandes <josferna@redhat.com> > Reviewed-on: http://review.gluster.org/12758 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Tested-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Change-Id: I44b7b965ecbc34159c2233af1a74762fd410dcaf BUG: 1262860 Reviewed-on: http://review.gluster.org/12884 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: fix loading tier.so into glusterdN Balachandran2015-12-047-41/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | The glusterd process loads the shared libraries of client translators. This failed for tiering due to a reference to dht_methods which is defined as a global variable which is not necessary. The global variable has been removed and this is now a member of dht_conf and is now initialised in the *_init calls. > Change-Id: Ifa0a21e3962b5cd8d9b927ef1d087d3b25312953 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/12863 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> >Tested-by: Dan Lambright <dlambrig@redhat.com> (cherry picked from commit 96fc7f64da2ef09e82845a7ab97574f511a9aae5) Change-Id: If3cc908ebfcd1f165504f15db2e3079d97f3132e BUG: 1288352 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12877 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/ec: Create copy of dict for setting internal xattrsPranith Kumar K2015-12-012-19/+19
| | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.com/12831 Problem: Ec takes a ref of the request xdata and sets trusted.ec.version/algo etc xattrs as part of it. But this request xdata could be using same dictionary to do the operation on multiple subvolumes, due to which other subvolumes will have internal xattrs of ec in it and will be created on subvols where they are not supposed to appear. Fix: Take a copy of the request xdata/dict to prevent this from happening. Most of the debugging work and test script is contributed by Nitya. BUG: 1286985 Change-Id: Ie9b7d9f063434789f6c5902c3a68ececdc3c7efa Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/12835 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* cluster/afr: Remember flags sent by create fopPranith Kumar K2015-11-261-0/+1
| | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/12739 Problem: In create fop, afr doesn't remember the flags. When afr has to perform fixing of the fd that needs to be opened on other bricks because the brick was down at the time of create, the flags with which it needs to send open are not present. Fix: Remember the flags in the fd_ctx. Thanks to Nitya for showing us the problem in re-open with the flags. BUG: 1285174 Change-Id: Id203a79896eaa093b035d645fe0fc24d7de71a3a Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/12740 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/ec: Allow read fops to be processed in parallelXavier Hernandez2015-11-247-192/+360
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently ec only sends a single read request at a time for a given inode. Since reads do not interfere between them, this patch allows multiple concurrent read requests to be sent in parallel. This is a backport of these patches: > Change-Id: If853430482a71767823f39ea70ff89797019d46b > BUG: 1245689 > Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> > Reviewed-on: http://review.gluster.org/11742 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > > Change-Id: I6042129f09082497b80782b5704a52c35c78f44d > BUG: 1276031 > Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Change-Id: I1b1146d1fd1828b12bfc566cd76e5ea110f8909b BUG: 1251467 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/12447 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/tier: readdirp to cold tier onlyDan Lambright2015-11-246-109/+507
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is possible a file would get migrated in the middle of a readdir operation. If there are four subvolumes A,B,C,D, and if readdir reads them in order and reaches subvol B, then, if a file is moved from D to A, it will not be included in the readdir output. This phenonema has pre-existed in DHT migration but is more apparent in tiering. When a file is moved off the hashed subvolume a T file is created. For tiering, we will make the cold subvolume the hashed subvolume. This will ensure the creation of a T file. Readdir will not skip T files in the tier translator. Making the cold subvolume the hashed subvolume ensures the T files created on promotions or creates will be less likely to fill the volume. Creates still put the data on the hot subvolume. This is a backport of 12530 > Change-Id: Ifde557d3d0e94a4570ca9f115adee3db2ee75407 > BUG: 1281598 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12530 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: N Balachandran <nbalacha@redhat.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Conflicts: xlators/cluster/dht/src/tier.c Change-Id: I5720a4cd04ae5088e5d7d23439b0f90d6bbc6265 BUG: 1283923 Reviewed-on: http://review.gluster.org/12722 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* afr: Drop compatibility lock for data self-healRavishankar N2015-11-231-18/+0
| | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/12602/ In glusterfs 3.4 and older, AFR did not take locks in self-heal domain during data self-heal. So this compat lock in data domain was added to prevent older clients from trying to heal a file while an existing self-heal was going on by a newer client. But the side effect was that all appending writes (which take full locks in data domain) from mounts would be stalled until self-heal was complete. Since glusterfs 3.4 is not supported anymore, remove the compat lock. Change-Id: I20bb2edf7527ce5d8291d13eb8b9149e66e9bcac BUG: 1283478 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/12653 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/ec: Mark internal fops appropriatelyXavier Hernandez2015-11-204-27/+58
| | | | | | | | | | | | | | | | | | | | 1) Mark read fops in read-modify-write by EC as internal. 2) Handle uid/gid set/reset correctly >BUG: 1282761 >Change-Id: I5c1ce0cd6213367eaead5fed33aa2397c4e46df7 >Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> >Reviewed-on: http://review.gluster.org/12599 >Tested-by: Gluster Build System <jenkins@build.gluster.com> >Tested-by: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> BUG: 1283757 Change-Id: I9f039cf3ec6351525fb65381bad44d986595844f Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/12669 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/tier: Do not delete linkto file on demotionN Balachandran2015-11-192-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | The current DHT migration code will always delete the src linkto file after migration as dht always moves files to the hashed subvol. This is not the case in tiering. The lack of linkto files causes rename to fail leaving 2 files with the same name but different gfids on the volume. Modified to leave the linkto file behind if the source volume is the hashed subvolume. > Change-Id: I2b99f7d34b4b719aee6232dc40c6a8f8ba88225d > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/12551 > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Tested-by: Dan Lambright <dlambrig@redhat.com> Change-Id: I210b94cdae0409c87af8ba198e3cd263a6c85190 BUG: 1283480 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12655 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: Disallow detach commit when detach in progressDan Lambright2015-11-162-4/+1
| | | | | | | | | | | | | 1. Check if detach is running, disallow detach commit if so. 2. Cleanup shutdown of tier daemon on detach: do not rerun fix-layout, do not send incorrect status back to glusterd. Change-Id: I97202f748773c1176396a4ffd32a4c7fa9b9c1bc BUG: 1264441 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12272 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Joseph Fernandes
* cluster/ec: fix bug in update_goodPranith Kumar K2015-11-111-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.com/12561 Problem: Bricks that didn't participate in the fops are considered to be good. This is happening two fold. Examples: Case-1: 1) 2+1 volume. 'd1' directory on Brick-0 is bad. 2) readdir takes locks and lock->good_mask is '7' 3) readdir does xattrop and fop->mask is '6'. 4) because fop->expected is '1' lock->good_mask remains '7' Case-2: 1) when all the bricks are up, it does lock + xattrop before op and figures out all the bricks are good. 2) By the time second operation starts brick-0 is down. Now lock->good_mask will always have the '0' bit set as long as the operations are happening on it. because: "lock->good_mask &= ~fop->mask | fop->remaining" fop->mask doesn't have '0' th bit. 3) When it comes time to perform the final xattrop in update_size_version brick-0 comes online because of which it gives the same version to brick-0 as well thinking it has participated in all the transactions till then, even when it didn't participate in the transactions. Fix: Case-1's fix: Update lock->good_mask in ec_prepare_update_cbk with latest good/bad bricks Case-2's fix: Consider non-participating brick as bad. BUG: 1278744 Change-Id: I5c2b07005107f3c067bac69da3b37ff39688bd69 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/12562 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* dht: update cached subvolume during readdirp cbkMohammed Rafi KC2015-11-091-32/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | back port of http://review.gluster.org/#/c/12449/ This reverts commit bb2370514598a99e6ab268af81df57dc16caa2c5. issue and impact: readdirp_cbk was not resetting the layout for files, this causes problem if the files is moved from one cached subvolume and if the layout was not proper, then there is chance to fail entry fops if the fops executed with out a lookup. Because the cached subvolume will not change and the application assumes the presence of file in cached subvol. so it fails with ENOENT. The patch preset the layout information in readdirp cbk for each files in the entry. That leaves the problem the commit bb2370514598a99e6ab268af81df57dc16caa2c5 try to fix. We will fix the problem in a separate patch. Change-Id: Ia09f2b5edbacaeb31cf1b730c27b76ca1c93f1a8 BUG: 1279095 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12538 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* dht: heal directory path if the directory is not presentMohammed Rafi KC2015-11-093-7/+246
| | | | | | | | | | | | | | | | | | | | | back port of http://review.gluster.org/#/c/12376/ After a successful nameless lookup if the directory is not present on any of the subvol, then we will get the path of the directory and will recursively send a named lookp on each parent directory. This will help particularly for the scenarios like add brick and attach-tier. Change-Id: I97f3178adb08b88910f709f0acf1d86c147be017 BUG: 1279095 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12539 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* tier/libgfdb: Replacing ASCII query file with binaryJoseph Fernandes2015-11-082-194/+152
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Earlier, when the database was queried we used to save all the queried records in an ASCII format in the query file. This caused issues like filename having ASCII delimiter and used to take a lot of space. The tier.c file also had a lot of parsing code. Here we changed the format of the query file to binary. All the logic of serialization and formating of query record is done by libgfdb. Libgfdb provides API, gfdb_write_query_record() and gfdb_read_query_record(), which the user i.e tier migrator and CTR xlator can use to write to and read from query file. With this binary format we save on disk space i.e reduce to 50% atleast as we are saving GFID's in binary format 16 bytes and not the string format which takes 36 bytes + We are not saving path of the file + we are also saving on ASCII delimiters. The on disk format of query record is as follows, +---------------------------------------------------------------------------+ | Length of serialized query record | Serialized Query Record | +---------------------------------------------------------------------------+ 4 bytes Length of serialized query record | | -------------------------------------------------| | | V Serialized Query Record Format: +---------------------------------------------------------------------------+ | GFID | Link count | <LINK INFO> |..... | FOOTER | +---------------------------------------------------------------------------+ 16 B 4 B Link Length 4 B | | | | -----------------------------| | | | | | V | Each <Link Info> will be serialized as | +-----------------------------------------------+ | | PGID | BASE_NAME_LENGTH | BASE_NAME | | +-----------------------------------------------+ | 16 B 4 B BASE_NAME_LENGTH | | | ------------------------------------------------------------------------| | | V FOOTER is a magic number 0xBAADF00D indicating the end of the record. This also serves as a serialized schema validator. Backport of http://review.gluster.org/#/c/12354/ > Change-Id: I9db7416fd421e118dd44eafab8b535caafe50d5a > BUG: 1272207 > Signed-off-by: Joseph Fernandes <josferna@redhat.com> > Reviewed-on: http://review.gluster.org/12354 > Reviewed-by: N Balachandran <nbalacha@redhat.com> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Tested-by: Dan Lambright <dlambrig@redhat.com> Change-Id: I170c579027f2594a58706f826e3ddf89e34022f4 BUG: 1263619 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12535 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier correct promotion cycle calculationDan Lambright2015-11-071-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of 12480 The tier translator should only choose candidate files for promotion from the most recent cycle, not a multiple of the most recent cycles. Otherwise user observed behavior can be inconsistent. Remove related test in tier.t that is subject to race condition. > Change-Id: I9ad1523cac00f904097ce468efa6ddd515857024 > BUG: 1275524 > Signed-off-by: root <root@rhs-cli-15.gdev.lab.eng.bos.redhat.com> > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12480 > Reviewed-by: Joseph Fernandes > Tested-by: Gluster Build System <jenkins@build.gluster.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Conflicts: tests/basic/tier/tier.t xlators/cluster/dht/src/tier.c Change-Id: Ic4587bf1b5d26ba377a12a4ce8e329362988a33b BUG: 1275483 Reviewed-on: http://review.gluster.org/12536 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: fix lookup-unhashed on tiered volumesDan Lambright2015-11-061-2/+1
| | | | | | | | | | | | | | | | | | | | | | During attach tier the commit hash must be copied to the hot tier. This is a backport of 12498. > Change-Id: I91b92fd8e98696993433856e1436409b657c439d > BUG: 1277716 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12498 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: I012c8ce9ef1dbb9550f109a1141afbecaa4fa54d BUG: 1278603 Reviewed-on: http://review.gluster.org/12525 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/dht: Ignoring replica for migration countingJoseph Fernandes2015-11-051-7/+24
| | | | | | | | | | | | | | | | | | | | | We used to count replica files for migration counting even though they were ignore for migration as the replica brick didnt have the ownership (as per the replication xlator either AFR/EC). As a result the number of files migrated would show a wrong count, i.e each replicated file would be counted 1 + number of replica. This patch ignores such cases. Backport of http://review.gluster.org/#/c/12453/ Change-Id: Ib005fedaee16f171e0499782b91f3eeedf8860ed BUG: 1262860 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12511 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier : Files skipped during tier query parsingN Balachandran2015-11-041-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | The tier query parsing code was using fscanf to read each record. As space is a delimiter for fscanf, filenames containing spaces caused the parsing to return unexpected values causing various issues in the tier process, including crashes due to buffer overflows. > Change-Id: Ife602cb7ecb158fccbc2c89e4d2959bd97098a87 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/12469 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Tested-by: Dan Lambright <dlambrig@redhat.com> (cherry picked from commit 499b43058049572e33b525ac669ef623d476fe41) Change-Id: Id60f9c484dfbb02de6ebb44032160ad4cc94cb7f BUG: 1277587 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12502 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* quota: add version to quota xattrsvmallika2015-11-023-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of http://review.gluster.org/#/c/12386/ When a quota is disable and the clean-up process terminated without completely cleaning-up the quota xattrs. Now when quota is enabled again, this can mess-up the accounting A version number is suffixed for all quota xattrs and this version number is specific to marker xaltor, i.e when quota xattrs are requested by quotad/client marker will remove the version suffix in the key before sending the response > Change-Id: I1ca2c11460645edba0f6b68db70d476d8d26e1eb > BUG: 1272411 > Signed-off-by: vmallika <vmallika@redhat.com> > Reviewed-on: http://review.gluster.org/12386 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Manikandan Selvaganesh <mselvaga@redhat.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: I67b1b930b28411d76b2d476a4e5250c52aa495a0 BUG: 1277080 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/12487 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/ec: update version and size on good bricksAshish Pandey2015-11-011-10/+2
| | | | | | | | | | | | | | | | | | | | | | Problem: readdir/readdirp fops calls [f]xattrop with fop->good which contain only one brick for these operations. That causes xattrop to be failed as it requires at least "minimum" number of brick. Solution: Use lock->good_mask to call xattrop. lock->good_mask contain all the good locked bricks on which the previous write opearion was successfull. Change-Id: If1b500391aa6fca6bd863702e030957b694ab499 BUG: 1272404 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/12419 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Tested-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/12440 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* stripe: set ENOENT when a READ hits EOFNiels de Vos2015-11-011-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The NFS-server sets EOF only in the READ reply when op_errno is set to ENOENT. Xlators are expected to set op_errno to ENOENT when EOF is reached, op_ret will contain the number of bytes returned by the READ. When an NFS-client (like VMware ESXi) do a READ that exceeds the size of the file, errno should be set to EOF and the return value contains the number of bytes that are read (from the requested offset, until the end of the file). Not setting EOF on a correct short READ, can result in errors on the NFS-client. This is not an issue with the Linux NFS-client (or VFS). Linux is smart enough to not try to read more bytes than the file contains. Cherry picked from commit 2bd2ccf0fdd5390c1c07cb228048f93e5e516512: > BUG: 1209298 > Change-Id: Ib15538744908a6001d729288d3e18a432d19050b > Signed-off-by: Niels de Vos <ndevos@redhat.com> > Reviewed-on: http://review.gluster.org/10142 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> BUG: 1219399 Change-Id: Ib15538744908a6001d729288d3e18a432d19050b Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/12470 Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/tier dont log error on lookup heal for files on hot tierDan Lambright2015-10-302-17/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of 12430 On fix-layout heal files are scanned. Files found are exist on the hot or cold subvolume. Those not found in the cold tier would exist on the hot. They should not be flagged as an error. Replace INFO with TRACE for common tier migration logs. Frequent migration was growing the log files too quickly. On migratation failures, do not acrue files towards cycle limit's budget. > Change-Id: Ie832ee07c43bce5477ae81c939d1fe8416a11615 > BUG: 1275383 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12430 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Joseph Fernandes Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: Ia1ce5c3ac9c8c43cf3f3f7e0bd6161aa13affe5f BUG: 1272398 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12465 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/ec: Implement gfid-hash read-policyPranith Kumar K2015-10-293-10/+73
| | | | | | | | | | | | | | | | | | | | | Add a policy in ec to performs reads from same bricks as long as they are good. Based on the gfid of the file/directory it determines the bricks to be considered for reading. >Change-Id: Ic97b5c54c086a28b5e07a330a4fd448551b49376 >BUG: 1261260 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/12133 >Tested-by: NetBSD Build System <jenkins@build.gluster.org> >Tested-by: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> BUG: 1270705 Change-Id: Ibf0d21d7210125fa7aaa12b3f98bcdf7cd89ef02 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/12456 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/afr: disable self-heal lock compatibility for arbiter volumesPranith Kumar K2015-10-291-8/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: afrv2 takes locks from infinity-2 to infinity-1 to be compatible with <=3.5.x clients. For arbiter volumes this leads to problems as the I/O takes full file locks. Solution: Don't be compatible with <=3.5.x clients on arbiter volumes as arbiter volumes are introduced in 3.7 >Change-Id: I48d6aab2000cab29c0c4acbf0ad356a3fa9e7bab >BUG: 1275247 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/12426 >Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> >Reviewed-by: Vijay Bellur <vbellur@redhat.com> >Reviewed-by: Ravishankar N <ravishankar@redhat.com> >Tested-by: NetBSD Build System <jenkins@build.gluster.org> >(cherry picked from commit 33b1e373ee40546e1aeed00d4f5f7bfd6d9fefb9) Change-Id: I22c00e94d7ab9bbcd1a6836fc6cfc300df26b765 BUG: 1276229 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/12455 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: jiademing.dd <iesool@126.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* afr: write zeros to sink for non-sparse filesRavishankar N2015-10-292-16/+43
| | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/12371/ Problem: If a file is created with zeroes ('dd', 'fallocate' etc.) when a brick is down, the self-heal does not write the zeroes to the sink after it comes up. Consequenty, there is a mismatch in disk-usage amongst the bricks of the replica. Fix: If we definitely know that the file is not sparse, then write the zeroes to the sink even if the checksums match. Change-Id: Ic739b3da5dbf47d99801c0e1743bb13aeb3af864 BUG: 1275921 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/12436 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* afr: fixes in transaction codeRavishankar N2015-10-263-15/+11
| | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/12368/ and http://review.gluster.org/#/c/12415/ 1. When winding the pre-op, transaction.pre_op[i] is set. If the pre-op fails, transaction.failed_subvols[i] is set. If if fails on all chidren, we can directly proceed to unlock (via afr_changelog_post_op_now) without trying to wind the write, fail and then going to unlock. 2. 'fop_subvols' seems to be an unused variable, hence removing it. 3. Call local->transaction.wind() only on subvols where pre-op succeeded. Change-Id: I9525628daf48082e979b0093fa0478934495e61f BUG: 1273334 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/12399 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/dht : op_ret not set correctly in dht_fsync_cbkN Balachandran2015-10-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | local->op_ret was not set correctly in dht_fsync_cbk in case the file was being migrated > Change-Id: If73ae04368ea0c7f6868c8704dfc2deb2faee753 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/12401 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> (cherry picked from commit 9710f58e5874bccb4b328abef80ea226ccf9c798) Change-Id: I2addb86083c1d8305cf91e0b0385deeb227216c8 BUG: 1272036 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12409 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: add pause tier for snapshotsDan Lambright2015-10-216-7/+227
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of 12304 Snaps of tiered volumes cannot handle files undergoing migration. We implement a helper mechanism to "pause" migration. Any files undergoing migration are aborted. Clean up is done to remove sticky bits and data at the destination. Migration is restarted after snap completes. For testing an internal switch is added. It is not exposed externally. gluster volume set vol1 tier-pause [true|false] > Change-Id: Ia85bbf89ac142e9b7e73fcbef98bb9da86097799 > BUG: 1267950 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12304 > Reviewed-by: N Balachandran <nbalacha@redhat.com> > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Conflicts: xlators/mgmt/glusterd/src/glusterd-messages.h Change-Id: I5f039d8d38a4c915bd873969f336b96755a0b8f1 BUG: 1274101 Reviewed-on: http://review.gluster.org/12411 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier do not abort migration if a single brick is downDan Lambright2015-10-211-12/+12
| | | | | | | | | | | | | | | | | | | | | | | | backport fix 12397 When a bricks are down, promotion/demotion should still be possible. For example, if an EC brick is down, the other bricks are able to recover the data and migrate it. > Change-Id: I8e650c640bce22a3ad23d75c363fbb9fd027d705 > BUG: 1273215 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12397 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Joseph Fernandes Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: I6688757eaf97426c8e1ea1038c598b34bf6b8ccc BUG: 1272334 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12405 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/tier remove suprious log messages on valid failed migrationDan Lambright2015-10-191-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | Backport fix 12391 > On a write to a replica volume, we record in all brick's databases an entry. > When the tier daemon runs, it will only move the file if it is the true > owner of the file as defined by the XATTR_NODE_UUID_KEY. > Change-Id: Ib82717f87a3f94f3d0d9f969773de9e88d6aaf22 > BUG: 1273043 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12391 > Reviewed-by: Joseph Fernandes > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: I12147f878cd1927f845867fb7c0b84c4db017ee1 BUG: 1272398 Reviewed-on: http://review.gluster.org/12394 Reviewed-by: Joseph Fernandes Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/ec : Remove index entries if file/dir does not existAshish Pandey2015-10-181-33/+45
| | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/12353 Problem: During write and rebalance if a brick is down, index entries will be created. If the same file gets migrated to other subvol by rebalance process, these index entries will remain in index directory. During heal, these indices should be removed when we get ENOENT or ESTALE for a index. Solution: Capture correct errno and take appropriate action to purge these indices. Change-Id: I1aad8b99e4df2e139648e3bf971e4cb1c4b38699 Bug: 1271967 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/12361 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/dht : Do not migrate files with POSIX locks heldN Balachandran2015-10-181-11/+95
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | dht_migrate_file does not migrate file locks to the dst file. Any locks held on the source file are lost once the migration is complete. This issue is magnified in the case of a tier volume as file migrations occur more frequently and repeatedly as compared to a DHT rebalance. The fix makes 2 changes: 1. Before starting the actual migration process, check if there are any locks held on the file. If yes, do not migrate the file. 2. The rebalance process tries to lock on the entire file just before moving into the Phase 2 of the file migration. If the lock acquisition fails, the file migration does not proceed. If the lock is granted, the file migration proceeds. This still leaves a small window where conflicting locks can be granted to different clients. If client1 requests a lock on the src file just after it is converted to a linkto file and client2 requests a lock on the dst data file, they will both be granted, but all FOPs will be redirected to the dst data file. This issue will be taken up in a subsequent patch. Change-Id: I8c895fc3cced50dd2894259d40a827c7b43d58ac BUG: 1272331 Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/12347 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Tested-by: Dan Lambright <dlambrig@redhat.com> > Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12369 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/ctr: CTR DB named lookup heal of cold tier during attach tierJoseph Fernandes2015-10-101-2/+128
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Heal hardlink in the db for already existing data in the cold tier during attach tier. i.e during fix layout do lookup to files in the cold tier. CTR xlator on the brick/server side does db update/insert of the hardlink on a namelookup. Currently the namedlookup is done synchronous to the fixlayout that is triggered by attach tier. This is not performant, adding more time to fixlayout. The performant approach is record the hardlinks on a compressed datastore and then do the namelookup asynchronously later, giving the ctr db eventual consistency master patch : http://review.gluster.org/#/c/11828/ >>Change-Id: I4ffc337fffe7d447804786851a9183a51b5044a9 >>BUG: 1252586 >>Signed-off-by: Joseph Fernandes <josferna@redhat.com> >>Reviewed-on: http://review.gluster.org/11828 >>Tested-by: Gluster Build System <jenkins@build.gluster.com> >>Reviewed-by: Dan Lambright <dlambrig@redhat.com> >>Tested-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Change-Id: I61b185a54ae4e8c1d82804b95a278bfbea870987 BUG: 1261146 Reviewed-on: http://review.gluster.org/12331 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>