path: root/xlators/cluster
Commit message | Author | Age | Files | Lines
...
* tier/dht: Ignoring replica for migration countingJoseph Fernandes2015-11-051-7/+24
We used to count replica files towards the migration total even though they were ignored for migration because the replica brick did not have ownership of them (as determined by the replication xlator, AFR or EC). As a result the number of files migrated would show a wrong count, i.e. each replicated file would be counted 1 + number of replicas. This patch ignores such cases. Backport of http://review.gluster.org/#/c/12453/ Change-Id: Ib005fedaee16f171e0499782b91f3eeedf8860ed BUG: 1262860 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12511 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier : Files skipped during tier query parsingN Balachandran2015-11-041-2/+2
The tier query parsing code was using fscanf to read each record. As space is a delimiter for fscanf, filenames containing spaces caused the parsing to return unexpected values, causing various issues in the tier process, including crashes due to buffer overflows. > Change-Id: Ife602cb7ecb158fccbc2c89e4d2959bd97098a87 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/12469 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Tested-by: Dan Lambright <dlambrig@redhat.com> (cherry picked from commit 499b43058049572e33b525ac669ef623d476fe41) Change-Id: Id60f9c484dfbb02de6ebb44032160ad4cc94cb7f BUG: 1277587 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12502 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
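A quick illustration of the failure mode described above (this is not the tier query-parsing code; the record layout and field names are made up): fscanf's %s conversion stops at the first whitespace character, so a filename containing a space throws the remaining fields out of alignment, whereas parsing the record by known field boundaries keeps embedded spaces intact.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    /* Hypothetical record: <id> <path> <hit-count> */
    const char *record = "deadbeef-0001 my file.txt 42";
    char id[64], path[256];
    int hits = 0;

    /* fscanf-style parsing: %s stops at the first space, so only "my" is
     * captured as the path and the %d conversion then fails on "file.txt". */
    sscanf(record, "%63s %255s %d", id, path, &hits);
    printf("broken: path=\"%s\" hits=%d\n", path, hits);   /* path="my", hits=0 */

    /* Boundary-based parsing: everything between the first and the last
     * space is the path, so embedded spaces survive intact. */
    const char *start = strchr(record, ' ') + 1;
    const char *end   = strrchr(record, ' ');
    snprintf(path, sizeof(path), "%.*s", (int)(end - start), start);
    hits = atoi(end + 1);
    printf("fixed:  path=\"%s\" hits=%d\n", path, hits);   /* path="my file.txt", hits=42 */

    return 0;
}
```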
* quota: add version to quota xattrsvmallika2015-11-023-5/+5
This is a backport of http://review.gluster.org/#/c/12386/ When quota is disabled and the clean-up process terminates without completely cleaning up the quota xattrs, re-enabling quota can mess up the accounting. A version number is now suffixed to all quota xattrs. This version number is specific to the marker xlator, i.e. when quota xattrs are requested by quotad or the client, marker will remove the version suffix from the key before sending the response. > Change-Id: I1ca2c11460645edba0f6b68db70d476d8d26e1eb > BUG: 1272411 > Signed-off-by: vmallika <vmallika@redhat.com> > Reviewed-on: http://review.gluster.org/12386 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Manikandan Selvaganesh <mselvaga@redhat.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: I67b1b930b28411d76b2d476a4e5250c52aa495a0 BUG: 1277080 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/12487 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/ec: update version and size on good bricksAshish Pandey2015-11-011-10/+2
Problem: readdir/readdirp fops call [f]xattrop with fop->good, which contains only one brick for these operations. That causes the xattrop to fail, as it requires at least the "minimum" number of bricks. Solution: Use lock->good_mask to call xattrop. lock->good_mask contains all the good locked bricks on which the previous write operation was successful. Change-Id: If1b500391aa6fca6bd863702e030957b694ab499 BUG: 1272404 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/12419 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Tested-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/12440 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* stripe: set ENOENT when a READ hits EOFNiels de Vos2015-11-011-0/+5
The NFS server sets EOF in the READ reply only when op_errno is set to ENOENT. Xlators are expected to set op_errno to ENOENT when EOF is reached; op_ret will contain the number of bytes returned by the READ. When an NFS client (like VMware ESXi) does a READ that exceeds the size of the file, EOF should be flagged and the return value should contain the number of bytes that were read (from the requested offset until the end of the file). Not setting EOF on a correct short READ can result in errors on the NFS client. This is not an issue with the Linux NFS client (or VFS); Linux is smart enough not to try to read more bytes than the file contains. Cherry picked from commit 2bd2ccf0fdd5390c1c07cb228048f93e5e516512: > BUG: 1209298 > Change-Id: Ib15538744908a6001d729288d3e18a432d19050b > Signed-off-by: Niels de Vos <ndevos@redhat.com> > Reviewed-on: http://review.gluster.org/10142 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> BUG: 1219399 Change-Id: Ib15538744908a6001d729288d3e18a432d19050b Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/12470 Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
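A minimal sketch of the convention the patch relies on, in plain C with made-up names (the real stripe xlator readv callback has a different signature): a short READ that runs past end-of-file should still report the bytes it returned, but carry ENOENT so the NFS layer can set the EOF flag.

```c
#include <errno.h>
#include <stdio.h>

struct read_reply {
    int op_ret;      /* number of bytes read, or -1 on error      */
    int op_errno;    /* ENOENT here doubles as the EOF indicator  */
};

/* Pretend read of `size` bytes at `offset` from a file of `file_size` bytes. */
static struct read_reply
read_at(long offset, long size, long file_size)
{
    struct read_reply r = { 0, 0 };

    if (offset >= file_size) {          /* nothing left to read at all      */
        r.op_errno = ENOENT;            /* EOF                              */
        return r;
    }

    long avail = file_size - offset;
    r.op_ret = (size <= avail) ? (int)size : (int)avail;
    if (avail < size)                   /* short read: request ran past EOF */
        r.op_errno = ENOENT;            /* flag EOF, this is not an error   */
    return r;
}

int main(void)
{
    struct read_reply r = read_at(900, 512, 1000);  /* 1000-byte file */
    printf("op_ret=%d op_errno=%d (ENOENT=%d means EOF)\n",
           r.op_ret, r.op_errno, ENOENT);
    return 0;
}
```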
* cluster/tier don't log error on lookup heal for files on hot tierDan Lambright2015-10-302-17/+25
This is a backport of 12430. On fix-layout heal, files are scanned. Files found exist on the hot or cold subvolume; those not found in the cold tier exist on the hot tier and should not be flagged as an error. Replace INFO with TRACE for common tier migration logs; frequent migration was growing the log files too quickly. On migration failures, do not accrue files towards the cycle limit's budget. > Change-Id: Ie832ee07c43bce5477ae81c939d1fe8416a11615 > BUG: 1275383 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12430 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Joseph Fernandes Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: Ia1ce5c3ac9c8c43cf3f3f7e0bd6161aa13affe5f BUG: 1272398 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12465 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/ec: Implement gfid-hash read-policyPranith Kumar K2015-10-293-10/+73
Add a policy in ec to perform reads from the same bricks as long as they are good. Based on the gfid of the file/directory it determines the bricks to be considered for reading. >Change-Id: Ic97b5c54c086a28b5e07a330a4fd448551b49376 >BUG: 1261260 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/12133 >Tested-by: NetBSD Build System <jenkins@build.gluster.org> >Tested-by: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> BUG: 1270705 Change-Id: Ibf0d21d7210125fa7aaa12b3f98bcdf7cd89ef02 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/12456 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/afr: disable self-heal lock compatibility for arbiter volumesPranith Kumar K2015-10-291-8/+12
Problem: afrv2 takes locks from infinity-2 to infinity-1 to be compatible with <=3.5.x clients. For arbiter volumes this leads to problems, as the I/O takes full file locks. Solution: Don't be compatible with <=3.5.x clients on arbiter volumes, as arbiter volumes were only introduced in 3.7. >Change-Id: I48d6aab2000cab29c0c4acbf0ad356a3fa9e7bab >BUG: 1275247 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/12426 >Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> >Reviewed-by: Vijay Bellur <vbellur@redhat.com> >Reviewed-by: Ravishankar N <ravishankar@redhat.com> >Tested-by: NetBSD Build System <jenkins@build.gluster.org> >(cherry picked from commit 33b1e373ee40546e1aeed00d4f5f7bfd6d9fefb9) Change-Id: I22c00e94d7ab9bbcd1a6836fc6cfc300df26b765 BUG: 1276229 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/12455 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: jiademing.dd <iesool@126.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* afr: write zeros to sink for non-sparse filesRavishankar N2015-10-292-16/+43
Backport of http://review.gluster.org/#/c/12371/ Problem: If a file is created with zeroes ('dd', 'fallocate' etc.) when a brick is down, the self-heal does not write the zeroes to the sink after it comes up. Consequently, there is a mismatch in disk usage amongst the bricks of the replica. Fix: If we definitely know that the file is not sparse, then write the zeroes to the sink even if the checksums match. Change-Id: Ic739b3da5dbf47d99801c0e1743bb13aeb3af864 BUG: 1275921 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/12436 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* afr: fixes in transaction codeRavishankar N2015-10-263-15/+11
Backport of http://review.gluster.org/#/c/12368/ and http://review.gluster.org/#/c/12415/ 1. When winding the pre-op, transaction.pre_op[i] is set. If the pre-op fails, transaction.failed_subvols[i] is set. If it fails on all children, we can proceed directly to unlock (via afr_changelog_post_op_now) without trying to wind the write, failing, and then going to unlock. 2. 'fop_subvols' seems to be an unused variable, hence removing it. 3. Call local->transaction.wind() only on subvols where the pre-op succeeded. Change-Id: I9525628daf48082e979b0093fa0478934495e61f BUG: 1273334 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/12399 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/dht : op_ret not set correctly in dht_fsync_cbkN Balachandran2015-10-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | local->op_ret was not set correctly in dht_fsync_cbk in case the file was being migrated > Change-Id: If73ae04368ea0c7f6868c8704dfc2deb2faee753 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/12401 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> (cherry picked from commit 9710f58e5874bccb4b328abef80ea226ccf9c798) Change-Id: I2addb86083c1d8305cf91e0b0385deeb227216c8 BUG: 1272036 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12409 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: add pause tier for snapshotsDan Lambright2015-10-216-7/+227
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of 12304 Snaps of tiered volumes cannot handle files undergoing migration. We implement a helper mechanism to "pause" migration. Any files undergoing migration are aborted. Clean up is done to remove sticky bits and data at the destination. Migration is restarted after snap completes. For testing an internal switch is added. It is not exposed externally. gluster volume set vol1 tier-pause [true|false] > Change-Id: Ia85bbf89ac142e9b7e73fcbef98bb9da86097799 > BUG: 1267950 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12304 > Reviewed-by: N Balachandran <nbalacha@redhat.com> > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Conflicts: xlators/mgmt/glusterd/src/glusterd-messages.h Change-Id: I5f039d8d38a4c915bd873969f336b96755a0b8f1 BUG: 1274101 Reviewed-on: http://review.gluster.org/12411 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier do not abort migration if a single brick is downDan Lambright2015-10-211-12/+12
Backport of fix 12397. When a brick is down, promotion/demotion should still be possible. For example, if an EC brick is down, the other bricks are able to recover the data and migrate it. > Change-Id: I8e650c640bce22a3ad23d75c363fbb9fd027d705 > BUG: 1273215 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12397 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Joseph Fernandes Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: I6688757eaf97426c8e1ea1038c598b34bf6b8ccc BUG: 1272334 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12405 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/tier remove spurious log messages on valid failed migrationDan Lambright2015-10-191-1/+8
Backport of fix 12391. > On a write to a replica volume, we record an entry in every brick's database. > When the tier daemon runs, it will only move the file if it is the true > owner of the file as defined by the XATTR_NODE_UUID_KEY. > Change-Id: Ib82717f87a3f94f3d0d9f969773de9e88d6aaf22 > BUG: 1273043 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12391 > Reviewed-by: Joseph Fernandes > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: I12147f878cd1927f845867fb7c0b84c4db017ee1 BUG: 1272398 Reviewed-on: http://review.gluster.org/12394 Reviewed-by: Joseph Fernandes Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/ec : Remove index entries if file/dir does not existAshish Pandey2015-10-181-33/+45
Backport of http://review.gluster.org/12353 Problem: During write and rebalance, if a brick is down, index entries will be created. If the same file gets migrated to another subvol by the rebalance process, these index entries will remain in the index directory. During heal, these indices should be removed when we get ENOENT or ESTALE for an index. Solution: Capture the correct errno and take appropriate action to purge these indices. Change-Id: I1aad8b99e4df2e139648e3bf971e4cb1c4b38699 Bug: 1271967 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/12361 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/dht : Do not migrate files with POSIX locks heldN Balachandran2015-10-181-11/+95
dht_migrate_file does not migrate file locks to the dst file, so any locks held on the source file are lost once the migration is complete. This issue is magnified in the case of a tier volume, as file migrations occur more frequently and repeatedly than in a DHT rebalance. The fix makes 2 changes: 1. Before starting the actual migration process, check if there are any locks held on the file. If yes, do not migrate the file. 2. The rebalance process tries to take a lock on the entire file just before moving into Phase 2 of the file migration. If the lock acquisition fails, the file migration does not proceed. If the lock is granted, the file migration proceeds. This still leaves a small window where conflicting locks can be granted to different clients: if client1 requests a lock on the src file just after it is converted to a linkto file and client2 requests a lock on the dst data file, both will be granted, but all FOPs will be redirected to the dst data file. This issue will be taken up in a subsequent patch. Change-Id: I8c895fc3cced50dd2894259d40a827c7b43d58ac BUG: 1272331 Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/12347 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Tested-by: Dan Lambright <dlambrig@redhat.com> > Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12369 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
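For illustration only: the first rule above ("check if there are any locks held on the file; if yes, do not migrate") can be sketched with a plain POSIX F_GETLK probe. DHT's actual implementation goes through its own locking machinery rather than fcntl, and F_GETLK cannot see locks held by the probing process itself, so treat the names and mechanism below purely as an analogy.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if another process holds a byte-range lock anywhere on the
 * file, 0 if the file looks lock-free, -1 if the probe itself failed. */
static int
file_is_locked(int fd)
{
    struct flock probe = {
        .l_type   = F_WRLCK,    /* "would an exclusive lock conflict?" */
        .l_whence = SEEK_SET,
        .l_start  = 0,
        .l_len    = 0,          /* 0 == until end of file              */
    };

    if (fcntl(fd, F_GETLK, &probe) < 0)
        return -1;

    return probe.l_type != F_UNLCK;   /* F_UNLCK means nothing conflicts */
}

/* Hypothetical caller: skip migration when the probe reports a lock. */
static int
maybe_migrate(int fd)
{
    if (file_is_locked(fd) != 0)
        return 0;               /* locked or unknown: leave the file alone */
    /* ... proceed with migration ... */
    return 1;
}
```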
* tier/ctr: CTR DB named lookup heal of cold tier during attach tierJoseph Fernandes2015-10-101-2/+128
Heal the hardlink in the db for data already existing on the cold tier during attach tier, i.e. during fix-layout do a lookup of files in the cold tier. The CTR xlator on the brick/server side does a db update/insert of the hardlink on a namelookup. Currently the namelookup is done synchronously with the fix-layout that is triggered by attach tier. This is not performant and adds more time to fix-layout. The performant approach is to record the hardlinks in a compressed datastore and then do the namelookup asynchronously later, giving the ctr db eventual consistency. master patch : http://review.gluster.org/#/c/11828/ >>Change-Id: I4ffc337fffe7d447804786851a9183a51b5044a9 >>BUG: 1252586 >>Signed-off-by: Joseph Fernandes <josferna@redhat.com> >>Reviewed-on: http://review.gluster.org/11828 >>Tested-by: Gluster Build System <jenkins@build.gluster.com> >>Reviewed-by: Dan Lambright <dlambrig@redhat.com> >>Tested-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Change-Id: I61b185a54ae4e8c1d82804b95a278bfbea870987 BUG: 1261146 Reviewed-on: http://review.gluster.org/12331 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: add watermarks and policy driverDan Lambright2015-10-105-97/+470
Backport of fix 12039. This fix introduces infrastructure to support different policies for promotion and demotion. Currently the tier feature automatically promotes and demotes files periodically based on access. This is good for testing but too stringent for most real workloads: it makes it difficult to fully utilize a hot tier, data will be demoted before it is touched, and it is unlikely a 100GB hot SSD will have all its data touched in a window of time. A new parameter "mode" allows the user to pick promotion/demotion policies. The "test mode" will be used for *.t and other general testing; this is the current mechanism. The "cache mode" introduces watermarks. The watermarks represent levels of data residing on the hot tier. "cache mode" policy: the % the hot tier is full is called P. Do not promote or demote more than D MB or F files. A random number [0-100] is called R. Rules for migration: if (P < watermark_low) don't demote, always promote. if (P >= watermark_low) && (P < watermark_hi) demote if R < P; promote if R > P. if (P > watermark_hi) always demote, don't promote. gluster volume set {vol} cluster.watermark-hi % gluster volume set {vol} cluster.watermark-low % gluster volume set {vol} cluster.tier-max-mb {D} gluster volume set {vol} cluster.tier-max-files {F} gluster volume set {vol} cluster.tier-mode {test|cache} > Change-Id: I157f19667ec95aa1d53406041c1e3b073be127c2 > BUG: 1257911 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12039 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Conflicts: xlators/cluster/dht/src/dht-rebalance.c xlators/cluster/dht/src/tier.c Change-Id: Ibfe6b89563ceab98708325cf5d5ab0997c64816c BUG: 1270527 Reviewed-on: http://review.gluster.org/12330 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
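The "cache mode" rules quoted above reduce to a small decision function. The sketch below only illustrates those rules (the tier daemon's real code is structured differently): P is how full the hot tier is in percent, R is a fresh random number in [0, 100], and the two watermarks come from the cluster.watermark-low / cluster.watermark-hi options.

```c
#include <stdlib.h>

enum tier_action { PROMOTE_ONLY, PROBABILISTIC, DEMOTE_ONLY };

/* Which regime are we in, given how full (percent) the hot tier is? */
static enum tier_action
watermark_regime(int p, int watermark_low, int watermark_hi)
{
    if (p < watermark_low)
        return PROMOTE_ONLY;     /* plenty of room: never demote       */
    if (p > watermark_hi)
        return DEMOTE_ONLY;      /* nearly full: never promote         */
    return PROBABILISTIC;        /* in between: decide per file below  */
}

/* In the middle band the fuller the hot tier, the more likely a file is
 * demoted rather than promoted: demote if R < P, promote if R > P. */
static int
should_demote(int p)
{
    int r = rand() % 101;        /* R in [0, 100] */
    return r < p;
}
```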
* tier/ctr: Solution for db locks for tier migrator and ctr using sqlite version less than 3.7, i.e. RHEL 6.7Joseph Fernandes2015-10-093-19/+298
Problem: On RHEL 6.7 we have sqlite version 3.6.2, which doesn't support the WAL journaling mode, as that mode is only available in sqlite 3.7 and above. As a result we cannot have two processes concurrently accessing sqlite without running into db locks. WAL is also needed for performance on the CTR side. Solution: Use the CTR db connection for doing queries when WAL mode is absent, i.e. the tier migrator will send sync_op ipc calls to CTR, which in turn will do the query and create/update the query file suggested by the tier migrator. Pending: This solution stops the db locks, but performance is still an issue for CTR. We are developing an in-Memory Transaction Log (iMeTaL) which will help boost CTR performance by doing in-memory updates on the IO path and later flushing the updates to the db in a batch/segment flush. Master patch: http://review.gluster.org/#/c/12191 >> Change-Id: Ie3149643ded159234b5cc6aa6cf93b9022c2f124 >> BUG: 1240577 >> Signed-off-by: Joseph Fernandes <josferna@redhat.com> >> Signed-off-by: Dan Lambright <dlambrig@redhat.com> >> Signed-off-by: Joseph Fernandes <josferna@redhat.com> >> Reviewed-on: http://review.gluster.org/12191 >> Tested-by: Gluster Build System <jenkins@build.gluster.com> >> Reviewed-by: Luis Pabon <lpabon@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Change-Id: Ie8c7a7e9566244c104531b579126bb57fbc6e32b BUG: 1270123 Reviewed-on: http://review.gluster.org/12325 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/dht: unlink fails after lookup in a directoryMohammed Rafi KC2015-10-081-14/+17
unlink fails with invalid argument for files that were already present on the cold tier before attaching. All fops are hashed to the hot tier after attach-tier (unless the "rule" option is explicitly set). Lookups sent to a directory will eventually search the directory using readdirp and will populate inode_ctx for the inodes based on the output, in the respective dht xlators. So readdirp populates inode_ctx for such files (already present in the volume before attaching) in the cold dht only, because it got the entries from the cold tier. When an unlink then comes in on such an inode, the lookup associated with the unlink is sent as a revalidate request to the cold tier only, since a lookup was already performed on the inode, and the new lookup succeeds. In dht's unlink the file hashes to the hot tier but the cached subvol is the cold one; because hashed and cached mismatch, it picks the hashed subvolume and sends the fop to the hot dht, and the fop fails with EINVAL from the hot dht since it has no inode_ctx stored for that inode (because no lookup was performed from the hot dht). Backport of: >Change-Id: Ib7c14a9297a22d615f7a890a060be4809b5a745a >BUG: 1236032 >Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> >Signed-off-by: Dan Lambright <dlambrig@redhat.com> >Reviewed-on: http://review.gluster.org/11675 >Tested-by: NetBSD Build System <jenkins@build.gluster.org> >Tested-by: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: Ie08858867f58df1a3363800aaa87902bdd8256a1 BUG: 1266880 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12318 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/afr: Handle stack reset failuresPranith Kumar K2015-10-072-0/+8
Backport of http://review.gluster.com/12309 When all the bricks go down in the middle of the self-heal, afr_local_init fails in AFR_STACK_RESET because all the bricks are down, so local remains NULL for the frame. This leads to crashes, as this failure is not handled in either the entry or the data self-heal. Change-Id: I71a02f161f2c4dbfdc8bb7f2a6f32807191ed253 BUG: 1269501 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/12310 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/ec : Mark new entry changelog in entry self-healv3.7.5Ashish Pandey2015-10-062-7/+79
Problem: When a new entry is created, dirty-mark xattrs are not created, so a full heal would need to be performed even when there are only partial failures. Solution: Mark the new entry changelog in self-heal. PS: Also fixed erasing of dirty markers when no data heal is required. BUG: 1258313 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Change-Id: I156e3d3201afa77efe118e1aaace1d91c90a9613 Reviewed-on: http://review.gluster.org/12306 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* dht/rebalance: fix mem-leak in rebalanceSusant Palai2015-10-062-5/+32
| | | | | | | | | | Change-Id: I37faf983fc02996541f3d96a17cb2a2c2cdb6781 BUG: 1261234 Reviewed-on: http://review.gluster.org/12235 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com> Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/12296
* fd: Do fd_bind on successful openPranith Kumar K2015-10-054-0/+7
| | | | | | | | | | | | | | | | | | | | | | | - fd_unref should decrement fd->inode->fd_count only if it is present in the inode's fd list. - successful open/opendir should perform fd_bind. >Change-Id: I81dd04f330e2fee86369a6dc7147af44f3d49169 >BUG: 1207735 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/11044 >Reviewed-by: Anoop C S <anoopcs@redhat.com> >Tested-by: NetBSD Build System <jenkins@build.gluster.org> >Tested-by: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> BUG: 1259697 Change-Id: I73b79dd3519aa085fb84dde74b321511cbccce1a Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/12100 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* glusterd, dht: volume set for use-readdirp in dhtPranith Kumar K2015-10-041-0/+3
| | | | | | | | | | | | | | | | | | | | | >Change-Id: Icab246b1d02808864d878d949fa56f9f889b538a >BUG: 1265677 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/12221 >Tested-by: NetBSD Build System <jenkins@build.gluster.org> >Tested-by: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> >Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> >Reviewed-by: Kaushal M <kaushal@redhat.com> >(cherry picked from commit 059db0254f5670a34f1a928155c0c7d1cd03b53a) Change-Id: Ifc46ed08fc10b32f5e814aa09c155e11e8c93138 BUG: 1267822 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/12269 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/dht : FOP handling during file migrationN Balachandran2015-09-281-2/+3
| | | | | | | | | | | | | An earlier patch introduced a bug in the FOP migration code. Fixed the issue. Change-Id: Ib7d8d3f54ddd455b7f53b0b2e3a82a9e942ba1f9 BUG: 1266872 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12238 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* cluster/tier: Handle FOPs on files being migratedN Balachandran2015-09-256-89/+475
Determine which DHT level is responsible for handling fops on a file undergoing migration, based on the name of the linkto xattr set on the file being migrated, and process accordingly. Change-Id: I82772e39314d4fe7f2ba0dcf22de0c6a374ee139 BUG: 1265892 Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/12090 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit 470869a954c17f32a3ba43ccda7442f82c0da6b2) Reviewed-on: http://review.gluster.org/12224 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/dht: Reset source file mode bits on migration failureNithya Balachandran2015-09-211-3/+94
| | | | | | | | | | | | | | | | | | | | | DHT rebalance uses the sgid and sticky bits to indicate that a file is being migrated. These were not removed if the file migration failed. The fix resets these bits to the original values. >Change-Id: I9801bfc0bd80c0800251ccd66c1c91a51cffd909 >Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> >Reviewed-on: http://review.gluster.org/11454 >Tested-by: NetBSD Build System <jenkins@build.gluster.org> >Tested-by: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: Ia701687819ee7130d6abebad84feb2ee879b7ab2 BUG: 1262700 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12167 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/dht : Propagate op_errno on failureNithya Balachandran2015-09-211-0/+1
| | | | | | | | | | | | | | | | | | | | | | | Fixed issue where dht_selfheal_layout_lock_cbk does not propagate the op_errno. >Change-Id: I0b968339db65d2969e36e64407eeb724cc6516bd >BUG: 1262438 >Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> >Reviewed-on: http://review.gluster.org/12165 >Tested-by: NetBSD Build System <jenkins@build.gluster.org> >Tested-by: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit 2ec8ea8769e943d3987dd80f8f6937359bcccf34) Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Change-Id: I6b744be71c87737f0f35fe70c3ffbf391bb1a153 BUG: 1263191 Reviewed-on: http://review.gluster.org/12178 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/tier: Fixed a crash in tieringNithya Balachandran2015-09-161-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of 12179 An incorrect check was causing the arguments to the promote thread to be cleared before the thread was done with them. This caused the process to crash when it tried to dereference a NULL pointer. > Change-Id: I8348309ef4dad33b7f648c7a2c2703487e401269 > BUG: 1263204 > Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/12179 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-by: Joseph Fernandes Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: I5cd4cb9978fc9d3a74f69ef75474fc3b593aadf0 BUG: 1263746 Reviewed-on: http://review.gluster.org/12187 Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier do not flag migration error on already migrated fileDan Lambright2015-09-161-15/+13
| | | | | | | | | | | | | | In some cases a brick will try to migrate a file that has already been migrated. This is a legal case, e.g. when both bricks are replica pairs. Change-Id: If2578b947014cbbdfb3c6591db9044d6b1d92774 BUG: 1262408 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12186 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Joseph Fernandes Tested-by: Gluster Build System <jenkins@build.gluster.com>
* afr: perform replace-brick in a synctaskRavishankar N2015-09-154-14/+73
Backport of http://review.gluster.org/#/c/12169/ Problem: replace-brick setxattr is not performed inside a synctask. This can lead to hangs if the setxattr is executed by the epoll thread, as the epoll thread would be waiting for replies to arrive while it is also the thread that needs to call epoll_ctl to read from the socket and listen. Fix: Move replace-brick to a synctask to prevent the epoll thread hang. This patch is in line with the fix performed in http://review.gluster.org/#/c/12163/ Change-Id: I7284930ead9b0adaa0257f21ec2d893fa5a7146f BUG: 1262547 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/12172 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* afr : get split-brain-status in a synctaskAnuradha Talur2015-09-156-22/+103
| | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/#/c/12163/ On executing `getfattr -n replica.split-brain-status <file>` on mount, there is a possibility that the mount hangs. To avoid this hang, fetch the split-brain-status of a file in synctask. >Change-Id: I87b781419ffc63248f915325b845e3233143d385 >BUG: 1262345 >Signed-off-by: Anuradha Talur <atalur@redhat.com> Change-Id: I9f4f4b54e108d3a0017264353b8272e072170c16 BUG: 1262547 Signed-off-by: Anuradha Talur <atalur@redhat.com> Reviewed-on: http://review.gluster.org/12166 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* dht/remove-brick: Avoid data loss for hard link migrationSusant Palai2015-09-131-8/+49
Problem: If the hashed subvol of a file has reached cluster.min-free-disk, then for a create operation a linkto file is created on the hashed subvol and the data file is created on some other brick. For creation of the linkto file we populate the dictionary with the linkto key whose value is the cached subvol. After successful linkto file creation, the linkto key-value pair is not deleted from the dictionary, and hence the data file also gets a linkto xattr, which points to itself. It looks like this: on client-0 a -------T linkto file with linkto.xattr=client-1, and on client-1 the rwx data file, also with linkto.xattr=client-1. Now to the data-loss part: hardlink migration depends entirely on this linkto xattr on the data file. Its value should be the new hashed subvol of the first hardlink encountered post fix-layout. But when it reads the linkto xattr it gets the same target as where the file already sits, so the source and destination of the migration are the same. At the end of migration the source file is truncated and deleted, which in this case is also the destination and the only data file, resulting in data loss. BUG: 1262197 Change-Id: I5338a5704ac60ca9afb278977e178319266a0cc0 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/12105 Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/12156 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* tier/ctr: Solving DB Lock issue due to write contention from db connectionsJoseph Fernandes2015-09-113-41/+103
This is a backport of 12031. > Problem: The DB on the brick is accessed by CTR, for writes, and by the tier migrator, for reads and writes. The write from the tier migrator resets the heat counters after a cycle. Since we are using sqlite, two connections trying to write would cause a db lock contention. As a result CTR used to fail to update the db. > Solution: Use the same db connection as CTR for resetting the heat counters. > 1) Introduced a new IPC FOP for CTR > 2) After the query, do an ipc syncop to the underlying client xlator associated with the brick. > 3) CTR on the brick will catch the IPC FOP and clear the heat counters. > Change-Id: I53306bfc08dcdba479deb4ccc154896521336150 > BUG: 1260730 > Signed-off-by: Joseph Fernandes <josferna@redhat.com> > Reviewed-on: http://review.gluster.org/12031 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Conflicts: xlators/cluster/dht/src/tier.c Change-Id: I88aa289cdf21e216b42c3d8ccfb4e7e828b43772 BUG: 1262341 Reviewed-on: http://review.gluster.org/12161 Reviewed-by: Joseph Fernandes Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* dht: NULL dereferencing causes crashMohammed Rafi KC2015-09-101-2/+2
If linkfile_create fails for some reason, then we end up dereferencing a NULL variable. Backport of http://review.gluster.org/#/c/12106/ >Change-Id: I3c6ff3715821b9b993d1bab7b90167de2861e190 >BUG: 1260147 >Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Change-Id: I7fd98dc298ffe5aab07df10c3b28d0736cb25653 BUG: 1260511 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12112 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* afr: Do not wind the full writev payload to arbiter brickRavishankar N2015-09-071-0/+30
| | | | | | | | | | | | | | | | ...because the arbiter xlator just unwinds it without passing it down till posix anyway. Instead, send a one-byte vector so that afr write transaction works as expected. Backport of http://review.gluster.org/#/c/12095/ Change-Id: I52913ca51dfee0c8472cbadb62c5d39b7badef77 BUG: 1255110 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/12104 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/tier: avoid filling /var/run with tiering filesDan Lambright2015-09-031-4/+28
| | | | | | | | | | | | | | | | | | | | This is a backport of 11931. > We failed to delete old promote/demote workfiles in /var/run. > This fix removes the <pid> postfix so there will be only a > single pair of files. > Change-Id: Ib9aafe7b4a9d4b0c05cf03a94cc1057a423a27d2 > BUG: 1253970 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/11931 Change-Id: Id9fb843a5ce553a79fc9f5809f84af9d317b1d3e BUG: 1259360 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12092 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com>
* cluster/tier: make attach/detach work with new rebalance logicDan Lambright2015-09-022-25/+31
| | | | | | | | | | | | | | | | | | | | | | | This is a backport of 10795. > The new rebalance performance improvements added new > datastructures which were not initialized in the > tier case. Function dht_find_local_subvol_cbk() needs > to accept a list built by lower level DHT translators > in order to build the local subvolumes list. > Change-Id: Iab03fc8e7fadc22debc08cd5bc781b9e3e270497 > BUG: 1222088 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/10795 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Change-Id: Icbd51c96ae4d367d1edf41cdd0edb35095195699 BUG: 1259079 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12085 Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/dht: maintain start state of rebalance daemon across graph switch.Dan Lambright2015-09-021-3/+12
This is a backport of fix 10977. > When we did a graph switch on a rebalance daemon, a second call > to gf_defrag_start() was made. This led to multiple threads > doing migration. When multiple threads try to move the same > file there can be deadlocks. > Change-Id: I931ca7fe600022f245e3dccaabb1ad004f732c56 > BUG: 1226005 Change-Id: I163d2d04692eba36c986ea9835f588962c92b93f BUG: 1259078 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12082 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com>
* cluster/tier: account for reordered layoutsDan Lambright2015-09-022-14/+32
This is a backport of 11092. > For a tiered volume the cold subvolume is always at a fixed position in the graph. DHT's layout array, on the other hand, may have the cold subvolume in either the first or second index, therefore code cannot make any assumptions. The fix searches the layout for the correct position dynamically rather than statically. > The bug manifested itself in NFS, in which a newly attached subvolume had not received an existing directory. This case is a "stale entry" and marked as such in the layout for that directory. The code did not see this, because it looked at the wrong index in the layout array. > The fix also adds the check for decommissioned bricks, and fixes a problem in detach tier related to starting the rebalance process: we never received the right defrag command and it did not get directed to the tier translator. > Change-Id: I77cdf9fbb0a777640c98003188565a79be9d0b56 > BUG: 1214289 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: Idb2eec9ba25812f41de7f960a0314c92341d6b5d BUG: 1259081 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12086 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com>
* afr: Unset dirty xattr after setting pending xattr during post-opRavishankar N2015-09-021-13/+13
| | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/12078 In AFR transaction, in the pre-op, the dirty xattr is set. In the post-op, if the transaction fails on one of the bricks, then on the healthy brick, the dirty xattr is unset and then the pending xattr (for the brick that went down) is set in that order. If the brick crashes after unsetting the dirty xattr, we have lost information about a pending heal. Hence we need to reverse the order, i.e. set pending xattr first followed by unsetting the dirty. Change-Id: I0b8a872cb4579a1bad602f70c76f09691bd582b2 BUG: 1258845 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/12079 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anuradha Talur <atalur@redhat.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
* cluster/afr: Make [f]xattrop metadata transactionPranith Kumar K2015-08-314-183/+234
| | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.com/11809 Problem: When xlators above afr do [f]xattrop when one of the bricks is down, after the brick comes backup, the metadata is not healed because [f]xattrop is not considered a transaction. Fix: Treat [f]xattrop as transaction so that changes done by xlators above afr are marked for heal when some of the bricks were down at the time of [f]xattrop. BUG: 1248890 Change-Id: Ibe69aa0ca6be9b4b4134dc2879b306e2e9c4cde8 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/11810 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com>
* cluster/dht: Don't set posix acls on linkto filesNithya Balachandran2015-08-311-0/+34
| | | | | | | | | | | | | | | | | | | | | | Posix acls on a linkto file change the file's permission bits and cause DHT to treat it as a non-linkto file.This happens on the migration failure of a file on which posix acls were set. The fix prevents posix acls from being set on a linkto file and copies them across only after a file has been successfully migrated. Change-Id: Iccf7ff6fba49fe05d691d9b83bf76a240848b212 BUG: 1258377 Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12025 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12062 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* afr: modify afr_txn_nothing_failed()Ravishankar N2015-08-311-12/+3
| | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/11827/ In an AFR transaction, we need to consider something as failed only if the failure (either in the pre-op or the FOP phase) occurs on the bricks on which a transaction lock was obtained. Without this, we would end up considering the transaction as failure even on the bricks on which the lock was not obtained, resulting in unnecessary fsyncs during the post-op phase of every write transaction for non-appending writes. Change-Id: Iee79e5d85dc7b4c41459d8bdd04a8454bdaf9a9d BUG: 1255698 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/11985 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* ec : trusted.ec.version xattr of all root directories of all bricks should be the same
Problem: After replacing the brick using the "replace-brick" command and running "heal full", the version of the root directory of the newly added brick is not getting healed. Heal starts running on the dentries of the root but does not run on the root directory itself. Solution: Run heal on the root directory. > Change-Id: Ifd42a3fb341b049c895817e892e5b484a5aa6f80 > BUG: 1243382 > Signed-off-by: Ashish Pandey <aspandey@redhat.com> > Reviewed-on: http://review.gluster.org/11676 > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Tested-by: NetBSD Build System <jenkins@build.gluster.org> Change-Id: Ifd42a3fb341b049c895817e892e5b484a5aa6f80 BUG: 1243384 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/11755 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/afr : Examine data/metadata readable for read-subvolAnuradha Talur2015-08-282-23/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | During lookup and discover, currently read_subvol is based only on data_readable. read_subvol should be decided based on both data_readable and metadata_readable. Credits to Ravishankar N for the logic of afr_first_up_child from http://review.gluster.org/10905/ . > Change-Id: I98580b23c278172ee2902be08eeaafb6722e830c > BUG: 1240244 > Signed-off-by: Anuradha Talur <atalur@redhat.com> > Reviewed-on: http://review.gluster.org/11551 > Reviewed-by: Ravishankar N <ravishankar@redhat.com> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > (cherry picked from commit 36349fa250ace6109002dfa41305d9dcd54ce0a9) Change-Id: Ia068ef9deb97f7bc48ea0c56d5ab6851f8860118 BUG: 1256909 Signed-off-by: Anuradha Talur <atalur@redhat.com> Reviewed-on: http://review.gluster.org/12011 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* cluster/dht: avoid mknod on decommissioned brickSusant Palai2015-08-272-35/+334
| | | | | | | | | | | | BUG: 1256702 Change-Id: I0795720cb77a9c77e608f34fbb69574fd2acb542 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/11998 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/12024
* dht: block/handle create op falling to decommissioned brickSusant Palai2015-08-265-56/+455
Problem: Between remove-brick start and the commit phase, the client layout may not be in sync with the on-disk layout because of a missing lookup. Hence a create call may land on the decommissioned brick. Solution: Acquire a lock on the hashed subvol so that a fix-layout or selfheal cannot step on the layout while it is being read. Even if we read a layout from before the remove-brick fix-layout and the file lands on the decommissioned brick, the file will be migrated to a new brick as per the fix-layout. BUG: 1256283 Change-Id: I3ef1adaf20dfb9524396a3648d1a664464eda8c1 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/11260 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/12001
* afr: launch index heal on local subvols up on a child-up eventRavishankar N2015-08-231-17/+11
| | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/11912/ Problem: When a replica's child goes down and comes up, the index heal is triggered only on the child that just came up. This does not serve the intended purpose as the list of files that need to be healed to this child is actually captured on the other child of the replica. Fix: Launch index-heal on all local children of the replica xlator which just received a child up. Note that afr_selfheal_childup() eventually calls afr_shd_index_healer() which will not run the heal on non-local children. Signed-off-by: Ravishankar N <ravishankar@redhat.com> Change-Id: Ia23e47d197f983c695ec0bcd283e74931119ee55 BUG: 1255690 Reviewed-on: http://review.gluster.org/11982 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>