summaryrefslogtreecommitdiffstats
path: root/xlators/cluster/dht
Commit message (Collapse)AuthorAgeFilesLines
...
* tier/libgfdb: Replacing ASCII query file with binaryJoseph Fernandes2015-11-082-194/+152
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Earlier, when the database was queried we used to save all the queried records in an ASCII format in the query file. This caused issues like filename having ASCII delimiter and used to take a lot of space. The tier.c file also had a lot of parsing code. Here we changed the format of the query file to binary. All the logic of serialization and formating of query record is done by libgfdb. Libgfdb provides API, gfdb_write_query_record() and gfdb_read_query_record(), which the user i.e tier migrator and CTR xlator can use to write to and read from query file. With this binary format we save on disk space i.e reduce to 50% atleast as we are saving GFID's in binary format 16 bytes and not the string format which takes 36 bytes + We are not saving path of the file + we are also saving on ASCII delimiters. The on disk format of query record is as follows, +---------------------------------------------------------------------------+ | Length of serialized query record | Serialized Query Record | +---------------------------------------------------------------------------+ 4 bytes Length of serialized query record | | -------------------------------------------------| | | V Serialized Query Record Format: +---------------------------------------------------------------------------+ | GFID | Link count | <LINK INFO> |..... | FOOTER | +---------------------------------------------------------------------------+ 16 B 4 B Link Length 4 B | | | | -----------------------------| | | | | | V | Each <Link Info> will be serialized as | +-----------------------------------------------+ | | PGID | BASE_NAME_LENGTH | BASE_NAME | | +-----------------------------------------------+ | 16 B 4 B BASE_NAME_LENGTH | | | ------------------------------------------------------------------------| | | V FOOTER is a magic number 0xBAADF00D indicating the end of the record. This also serves as a serialized schema validator. Backport of http://review.gluster.org/#/c/12354/ > Change-Id: I9db7416fd421e118dd44eafab8b535caafe50d5a > BUG: 1272207 > Signed-off-by: Joseph Fernandes <josferna@redhat.com> > Reviewed-on: http://review.gluster.org/12354 > Reviewed-by: N Balachandran <nbalacha@redhat.com> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Tested-by: Dan Lambright <dlambrig@redhat.com> Change-Id: I170c579027f2594a58706f826e3ddf89e34022f4 BUG: 1263619 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12535 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier correct promotion cycle calculationDan Lambright2015-11-071-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of 12480 The tier translator should only choose candidate files for promotion from the most recent cycle, not a multiple of the most recent cycles. Otherwise user observed behavior can be inconsistent. Remove related test in tier.t that is subject to race condition. > Change-Id: I9ad1523cac00f904097ce468efa6ddd515857024 > BUG: 1275524 > Signed-off-by: root <root@rhs-cli-15.gdev.lab.eng.bos.redhat.com> > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12480 > Reviewed-by: Joseph Fernandes > Tested-by: Gluster Build System <jenkins@build.gluster.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Conflicts: tests/basic/tier/tier.t xlators/cluster/dht/src/tier.c Change-Id: Ic4587bf1b5d26ba377a12a4ce8e329362988a33b BUG: 1275483 Reviewed-on: http://review.gluster.org/12536 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: fix lookup-unhashed on tiered volumesDan Lambright2015-11-061-2/+1
| | | | | | | | | | | | | | | | | | | | | | During attach tier the commit hash must be copied to the hot tier. This is a backport of 12498. > Change-Id: I91b92fd8e98696993433856e1436409b657c439d > BUG: 1277716 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12498 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: I012c8ce9ef1dbb9550f109a1141afbecaa4fa54d BUG: 1278603 Reviewed-on: http://review.gluster.org/12525 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/dht: Ignoring replica for migration countingJoseph Fernandes2015-11-051-7/+24
| | | | | | | | | | | | | | | | | | | | | We used to count replica files for migration counting even though they were ignore for migration as the replica brick didnt have the ownership (as per the replication xlator either AFR/EC). As a result the number of files migrated would show a wrong count, i.e each replicated file would be counted 1 + number of replica. This patch ignores such cases. Backport of http://review.gluster.org/#/c/12453/ Change-Id: Ib005fedaee16f171e0499782b91f3eeedf8860ed BUG: 1262860 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12511 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier : Files skipped during tier query parsingN Balachandran2015-11-041-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | The tier query parsing code was using fscanf to read each record. As space is a delimiter for fscanf, filenames containing spaces caused the parsing to return unexpected values causing various issues in the tier process, including crashes due to buffer overflows. > Change-Id: Ife602cb7ecb158fccbc2c89e4d2959bd97098a87 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/12469 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Tested-by: Dan Lambright <dlambrig@redhat.com> (cherry picked from commit 499b43058049572e33b525ac669ef623d476fe41) Change-Id: Id60f9c484dfbb02de6ebb44032160ad4cc94cb7f BUG: 1277587 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12502 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* quota: add version to quota xattrsvmallika2015-11-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of http://review.gluster.org/#/c/12386/ When a quota is disable and the clean-up process terminated without completely cleaning-up the quota xattrs. Now when quota is enabled again, this can mess-up the accounting A version number is suffixed for all quota xattrs and this version number is specific to marker xaltor, i.e when quota xattrs are requested by quotad/client marker will remove the version suffix in the key before sending the response > Change-Id: I1ca2c11460645edba0f6b68db70d476d8d26e1eb > BUG: 1272411 > Signed-off-by: vmallika <vmallika@redhat.com> > Reviewed-on: http://review.gluster.org/12386 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Manikandan Selvaganesh <mselvaga@redhat.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: I67b1b930b28411d76b2d476a4e5250c52aa495a0 BUG: 1277080 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/12487 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/tier dont log error on lookup heal for files on hot tierDan Lambright2015-10-302-17/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of 12430 On fix-layout heal files are scanned. Files found are exist on the hot or cold subvolume. Those not found in the cold tier would exist on the hot. They should not be flagged as an error. Replace INFO with TRACE for common tier migration logs. Frequent migration was growing the log files too quickly. On migratation failures, do not acrue files towards cycle limit's budget. > Change-Id: Ie832ee07c43bce5477ae81c939d1fe8416a11615 > BUG: 1275383 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12430 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Joseph Fernandes Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: Ia1ce5c3ac9c8c43cf3f3f7e0bd6161aa13affe5f BUG: 1272398 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12465 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/dht : op_ret not set correctly in dht_fsync_cbkN Balachandran2015-10-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | local->op_ret was not set correctly in dht_fsync_cbk in case the file was being migrated > Change-Id: If73ae04368ea0c7f6868c8704dfc2deb2faee753 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/12401 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> (cherry picked from commit 9710f58e5874bccb4b328abef80ea226ccf9c798) Change-Id: I2addb86083c1d8305cf91e0b0385deeb227216c8 BUG: 1272036 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12409 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: add pause tier for snapshotsDan Lambright2015-10-216-7/+227
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of 12304 Snaps of tiered volumes cannot handle files undergoing migration. We implement a helper mechanism to "pause" migration. Any files undergoing migration are aborted. Clean up is done to remove sticky bits and data at the destination. Migration is restarted after snap completes. For testing an internal switch is added. It is not exposed externally. gluster volume set vol1 tier-pause [true|false] > Change-Id: Ia85bbf89ac142e9b7e73fcbef98bb9da86097799 > BUG: 1267950 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12304 > Reviewed-by: N Balachandran <nbalacha@redhat.com> > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Conflicts: xlators/mgmt/glusterd/src/glusterd-messages.h Change-Id: I5f039d8d38a4c915bd873969f336b96755a0b8f1 BUG: 1274101 Reviewed-on: http://review.gluster.org/12411 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier do not abort migration if a single brick is downDan Lambright2015-10-211-12/+12
| | | | | | | | | | | | | | | | | | | | | | | | backport fix 12397 When a bricks are down, promotion/demotion should still be possible. For example, if an EC brick is down, the other bricks are able to recover the data and migrate it. > Change-Id: I8e650c640bce22a3ad23d75c363fbb9fd027d705 > BUG: 1273215 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12397 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Joseph Fernandes Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: I6688757eaf97426c8e1ea1038c598b34bf6b8ccc BUG: 1272334 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12405 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/tier remove suprious log messages on valid failed migrationDan Lambright2015-10-191-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | Backport fix 12391 > On a write to a replica volume, we record in all brick's databases an entry. > When the tier daemon runs, it will only move the file if it is the true > owner of the file as defined by the XATTR_NODE_UUID_KEY. > Change-Id: Ib82717f87a3f94f3d0d9f969773de9e88d6aaf22 > BUG: 1273043 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12391 > Reviewed-by: Joseph Fernandes > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: I12147f878cd1927f845867fb7c0b84c4db017ee1 BUG: 1272398 Reviewed-on: http://review.gluster.org/12394 Reviewed-by: Joseph Fernandes Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/dht : Do not migrate files with POSIX locks heldN Balachandran2015-10-181-11/+95
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | dht_migrate_file does not migrate file locks to the dst file. Any locks held on the source file are lost once the migration is complete. This issue is magnified in the case of a tier volume as file migrations occur more frequently and repeatedly as compared to a DHT rebalance. The fix makes 2 changes: 1. Before starting the actual migration process, check if there are any locks held on the file. If yes, do not migrate the file. 2. The rebalance process tries to lock on the entire file just before moving into the Phase 2 of the file migration. If the lock acquisition fails, the file migration does not proceed. If the lock is granted, the file migration proceeds. This still leaves a small window where conflicting locks can be granted to different clients. If client1 requests a lock on the src file just after it is converted to a linkto file and client2 requests a lock on the dst data file, they will both be granted, but all FOPs will be redirected to the dst data file. This issue will be taken up in a subsequent patch. Change-Id: I8c895fc3cced50dd2894259d40a827c7b43d58ac BUG: 1272331 Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/12347 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Tested-by: Dan Lambright <dlambrig@redhat.com> > Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12369 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/ctr: CTR DB named lookup heal of cold tier during attach tierJoseph Fernandes2015-10-101-2/+128
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Heal hardlink in the db for already existing data in the cold tier during attach tier. i.e during fix layout do lookup to files in the cold tier. CTR xlator on the brick/server side does db update/insert of the hardlink on a namelookup. Currently the namedlookup is done synchronous to the fixlayout that is triggered by attach tier. This is not performant, adding more time to fixlayout. The performant approach is record the hardlinks on a compressed datastore and then do the namelookup asynchronously later, giving the ctr db eventual consistency master patch : http://review.gluster.org/#/c/11828/ >>Change-Id: I4ffc337fffe7d447804786851a9183a51b5044a9 >>BUG: 1252586 >>Signed-off-by: Joseph Fernandes <josferna@redhat.com> >>Reviewed-on: http://review.gluster.org/11828 >>Tested-by: Gluster Build System <jenkins@build.gluster.com> >>Reviewed-by: Dan Lambright <dlambrig@redhat.com> >>Tested-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Change-Id: I61b185a54ae4e8c1d82804b95a278bfbea870987 BUG: 1261146 Reviewed-on: http://review.gluster.org/12331 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: add watermarks and policy driverDan Lambright2015-10-105-97/+470
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport fix 12039 This fix introduces infrastructure to support different policies for promotion and demotion. Currently the tier feature automatically promotes and demotes files periodically based on access. This is good for testing but too stringent for most real workloads. It makes it difficult to fully utilize a hot tier- data will be demoted before it is touched- its unlikely a 100GB hot SSD will have all its data touched in a window of time. A new parameter "mode" allows the user to pick promotion/demotion polcies. The "test mode" will be used for *.t and other general testing. This is the current mechanism. The "cache mode" introduces watermarks. The watermarks represent levels of data residing on the hot tier. "cache mode" policy: The % the hot tier is full is called P. Do not promote or demote more than D MB or F files. A random number [0-100] is called R. Rules for migration: if (P < watermark_low) don't demote, always promote. if (P >= watermark_low) && (P < watermark_hi) demote if R < P; promote if R > P. if (P > watermark_hi) always demote, don't promote. gluster volume set {vol} cluster.watermark-hi % gluster volume set {vol} cluster.watermark-low % gluster volume set {vol} cluster.tier-max-mb {D} gluster volume set {vol} cluster.tier-max-files {F} gluster volume set {vol} cluster.tier-mode {test|cache} > Change-Id: I157f19667ec95aa1d53406041c1e3b073be127c2 > BUG: 1257911 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12039 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Conflicts: xlators/cluster/dht/src/dht-rebalance.c xlators/cluster/dht/src/tier.c Change-Id: Ibfe6b89563ceab98708325cf5d5ab0997c64816c BUG: 1270527 Reviewed-on: http://review.gluster.org/12330 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/ctr: Solution for db locks for tier migrator and ctr using sqlite ↵Joseph Fernandes2015-10-093-19/+298
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | version less than 3.7 i.e rhel 6.7 Problem: On RHEL 6.7, we have sqlite version 3.6.2 which doesnt support WAL journaling mode, as this journaling mode is only available in sqlite 3.7 and above. As a result we cannot have to progreses concurrently accessing sqlite, without running into db locks! Well WAL is also need for performace on CTR side. Solution: This solution is to use CTR db connection for doing queries when WAL mode is absent. i,e tier migrator will send sync_op ipc calls to CTR, which in turn will do the query and create/update the query file suggested by tier migrator. Pending: Well this solution will stop the db locks but the performance is still an issue for CTR. We are developing an in-Memory Transaction Log (iMeTaL) which will help boost the CTR performance by doing in memory udpates on the IO path and later flush the updates to the db in a batch/segment flush. Master patch: http://review.gluster.org/#/c/12191 >> Change-Id: Ie3149643ded159234b5cc6aa6cf93b9022c2f124 >> BUG: 1240577 >> Signed-off-by: Joseph Fernandes <josferna@redhat.com> >> Signed-off-by: Dan Lambright <dlambrig@redhat.com> >> Signed-off-by: Joseph Fernandes <josferna@redhat.com> >> Reviewed-on: http://review.gluster.org/12191 >> Tested-by: Gluster Build System <jenkins@build.gluster.com> >> Reviewed-by: Luis Pabon <lpabon@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Change-Id: Ie8c7a7e9566244c104531b579126bb57fbc6e32b BUG: 1270123 Reviewed-on: http://review.gluster.org/12325 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/dht: unlink fails after lookup in a directoryMohammed Rafi KC2015-10-081-14/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | unlink fails with invalid argument for files that are being present on cold tier, before attaching. All of the fops will be hashed to hot_tier after attach-tier (unless explicitly set the "rule" option). Lookups sent to directory, will eventually search the directory using readdirp, and will populate inode_ctx for the inodes based on the output, in respective dht_xlators. So the readdirp will populate inodes_ctx for the files (that is already present in volume before attaching) in cold-dht only because it got the entries from the cold-tier. So when an unlink comes on such an inode, the lookup associated with the unlink will be send as a re validate request to cold-tier only, since already a lookup was performed on the inode, and the new lookup will succeed. So from the unlink of dht, it will hash to cold-tier but the cached_subvol will be cold, since there is a mismatch in hash and cach , it chose hashed subvolume and will sent the fop to hot dht, and the fops fail with EINVAL from the hot-dht since it does not have inode_ctx stored for that inode (because, no lookup was performed from hot-dht). Back port of> >Change-Id: Ib7c14a9297a22d615f7a890a060be4809b5a745a >BUG: 1236032 >Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> >Signed-off-by: Dan Lambright <dlambrig@redhat.com> >Reviewed-on: http://review.gluster.org/11675 >Tested-by: NetBSD Build System <jenkins@build.gluster.org> >Tested-by: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: Ie08858867f58df1a3363800aaa87902bdd8256a1 BUG: 1266880 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12318 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* dht/rebalance: fix mem-leak in rebalanceSusant Palai2015-10-062-5/+32
| | | | | | | | | | Change-Id: I37faf983fc02996541f3d96a17cb2a2c2cdb6781 BUG: 1261234 Reviewed-on: http://review.gluster.org/12235 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com> Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/12296
* fd: Do fd_bind on successful openPranith Kumar K2015-10-053-0/+6
| | | | | | | | | | | | | | | | | | | | | | | - fd_unref should decrement fd->inode->fd_count only if it is present in the inode's fd list. - successful open/opendir should perform fd_bind. >Change-Id: I81dd04f330e2fee86369a6dc7147af44f3d49169 >BUG: 1207735 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/11044 >Reviewed-by: Anoop C S <anoopcs@redhat.com> >Tested-by: NetBSD Build System <jenkins@build.gluster.org> >Tested-by: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> BUG: 1259697 Change-Id: I73b79dd3519aa085fb84dde74b321511cbccce1a Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/12100 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* glusterd, dht: volume set for use-readdirp in dhtPranith Kumar K2015-10-041-0/+3
| | | | | | | | | | | | | | | | | | | | | >Change-Id: Icab246b1d02808864d878d949fa56f9f889b538a >BUG: 1265677 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/12221 >Tested-by: NetBSD Build System <jenkins@build.gluster.org> >Tested-by: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> >Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> >Reviewed-by: Kaushal M <kaushal@redhat.com> >(cherry picked from commit 059db0254f5670a34f1a928155c0c7d1cd03b53a) Change-Id: Ifc46ed08fc10b32f5e814aa09c155e11e8c93138 BUG: 1267822 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/12269 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/dht : FOP handling during file migrationN Balachandran2015-09-281-2/+3
| | | | | | | | | | | | | An earlier patch introduced a bug in the FOP migration code. Fixed the issue. Change-Id: Ib7d8d3f54ddd455b7f53b0b2e3a82a9e942ba1f9 BUG: 1266872 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12238 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* cluster/tier: Handle FOPs on files being migratedN Balachandran2015-09-256-89/+475
| | | | | | | | | | | | | | | | | | | | | Determine which DHT level is responsible for handling fops on a file undergoing migration based on the name of the the linkto xattr set on the file being migrated and process accordingly. Change-Id: I82772e39314d4fe7f2ba0dcf22de0c6a374ee139 BUG: 1265892 Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/12090 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit 470869a954c17f32a3ba43ccda7442f82c0da6b2) Reviewed-on: http://review.gluster.org/12224 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/dht: Reset source file mode bits on migration failureNithya Balachandran2015-09-211-3/+94
| | | | | | | | | | | | | | | | | | | | | DHT rebalance uses the sgid and sticky bits to indicate that a file is being migrated. These were not removed if the file migration failed. The fix resets these bits to the original values. >Change-Id: I9801bfc0bd80c0800251ccd66c1c91a51cffd909 >Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> >Reviewed-on: http://review.gluster.org/11454 >Tested-by: NetBSD Build System <jenkins@build.gluster.org> >Tested-by: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: Ia701687819ee7130d6abebad84feb2ee879b7ab2 BUG: 1262700 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12167 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/dht : Propagate op_errno on failureNithya Balachandran2015-09-211-0/+1
| | | | | | | | | | | | | | | | | | | | | | | Fixed issue where dht_selfheal_layout_lock_cbk does not propagate the op_errno. >Change-Id: I0b968339db65d2969e36e64407eeb724cc6516bd >BUG: 1262438 >Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> >Reviewed-on: http://review.gluster.org/12165 >Tested-by: NetBSD Build System <jenkins@build.gluster.org> >Tested-by: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit 2ec8ea8769e943d3987dd80f8f6937359bcccf34) Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Change-Id: I6b744be71c87737f0f35fe70c3ffbf391bb1a153 BUG: 1263191 Reviewed-on: http://review.gluster.org/12178 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/tier: Fixed a crash in tieringNithya Balachandran2015-09-161-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of 12179 An incorrect check was causing the arguments to the promote thread to be cleared before the thread was done with them. This caused the process to crash when it tried to dereference a NULL pointer. > Change-Id: I8348309ef4dad33b7f648c7a2c2703487e401269 > BUG: 1263204 > Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/12179 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-by: Joseph Fernandes Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: I5cd4cb9978fc9d3a74f69ef75474fc3b593aadf0 BUG: 1263746 Reviewed-on: http://review.gluster.org/12187 Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier do not flag migration error on already migrated fileDan Lambright2015-09-161-15/+13
| | | | | | | | | | | | | | In some cases a brick will try to migrate a file that has already been migrated. This is a legal case, e.g. when both bricks are replica pairs. Change-Id: If2578b947014cbbdfb3c6591db9044d6b1d92774 BUG: 1262408 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12186 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Joseph Fernandes Tested-by: Gluster Build System <jenkins@build.gluster.com>
* dht/remove-brick: Avoid data loss for hard link migrationSusant Palai2015-09-131-8/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: If the hashed subvol of a file has reached cluster.min-free-disk, for a create opertaion a linkto file will be created on the hashed and the data file will be created on some other brick. For creation of the linkfile we populate the dictionary with linkto key and value as the cached subvol. After successful linkto file creation, the linkto-key-value pair is not deleted form the dictionary and hence, the data file will also have linkto xattr which points to itself.This looks something like this. client-0 client-1 -------T file rwx------file linkto.xattr=client-1 linkto.xattr=client-1 Now coming to the data loss part. Hardlink migration highly depend on this linkto xattr on the data file. This value should be the new hashed subvol of the first hardlink encountered post fix-layout. But when it tries to read the linkto xattr it gets the same target as where it is sitting. Now the source and destination are same for migration. At the end of migration the source file is truncated and deleted, which in this case is the destination and also the only data file it self resulting in data loss. BUG: 1262197 Change-Id: I5338a5704ac60ca9afb278977e178319266a0cc0 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/12105 Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/12156 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* tier/ctr: Solving DB Lock issue due to write contention from db connectionsJoseph Fernandes2015-09-113-41/+103
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of 12031. > Problem: The DB on the brick is been accessed by CTR, for write and > tier migrator, for read and write. The write from tier migrator is reseting > the heat counters after a cycle. Since we are using sqlite, two connections > trying to write would cause a db lock contention. As a result CTR used to fail > to update the db. > Solution: Using the same db connection of CTR for reseting the heat counters. > 1) Introducted a new IPC FOP for CTR > 2) After the query do a ipc syncop to the underlying client xlator associated > to the brick. > 3) CTR in brick will catch the IPC FOP and cleat the heat counters. > Change-Id: I53306bfc08dcdba479deb4ccc154896521336150 > BUG: 1260730 > Signed-off-by: Joseph Fernandes <josferna@redhat.com> > Reviewed-on: http://review.gluster.org/12031 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Conflicts: xlators/cluster/dht/src/tier.c Change-Id: I88aa289cdf21e216b42c3d8ccfb4e7e828b43772 BUG: 1262341 Reviewed-on: http://review.gluster.org/12161 Reviewed-by: Joseph Fernandes Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* dht: NULL dereferencing causes crashMohammed Rafi KC2015-09-101-2/+2
| | | | | | | | | | | | | | | | | | | | If linkfile_create is failed for some reason, then we are trying to dereference a null variable backport of http://review.gluster.org/#/c/12106/ >Change-Id: I3c6ff3715821b9b993d1bab7b90167de2861e190 >BUG: 1260147 >Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Change-Id: I7fd98dc298ffe5aab07df10c3b28d0736cb25653 BUG: 1260511 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12112 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/tier: avoid filling /var/run with tiering filesDan Lambright2015-09-031-4/+28
| | | | | | | | | | | | | | | | | | | | This is a backport of 11931. > We failed to delete old promote/demote workfiles in /var/run. > This fix removes the <pid> postfix so there will be only a > single pair of files. > Change-Id: Ib9aafe7b4a9d4b0c05cf03a94cc1057a423a27d2 > BUG: 1253970 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/11931 Change-Id: Id9fb843a5ce553a79fc9f5809f84af9d317b1d3e BUG: 1259360 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12092 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com>
* cluster/tier: make attach/detach work with new rebalance logicDan Lambright2015-09-022-25/+31
| | | | | | | | | | | | | | | | | | | | | | | This is a backport of 10795. > The new rebalance performance improvements added new > datastructures which were not initialized in the > tier case. Function dht_find_local_subvol_cbk() needs > to accept a list built by lower level DHT translators > in order to build the local subvolumes list. > Change-Id: Iab03fc8e7fadc22debc08cd5bc781b9e3e270497 > BUG: 1222088 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/10795 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Change-Id: Icbd51c96ae4d367d1edf41cdd0edb35095195699 BUG: 1259079 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12085 Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/dht: maintain start state of rebalance daemon across graph switch.Dan Lambright2015-09-021-3/+12
| | | | | | | | | | | | | | | | | | | This is a backport of fix 10977. > When we did a graph switch on a rebalance daemon, a second call > to gf_degrag_start() was done. This lead to multiple threads > doing migration. When multiple threads try to move the same > file there can be deadlocks. > Change-Id: I931ca7fe600022f245e3dccaabb1ad004f732c56 > BUG: 1226005 Change-Id: I163d2d04692eba36c986ea9835f588962c92b93f BUG: 1259078 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12082 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com>
* cluster/tier: account for reordered layoutsDan Lambright2015-09-022-14/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of 11092 > For a tiered volume the cold subvolume is always at a fixed > position in the graph. DHT's layout array, on the other hand, > may have the cold subvolume in either the first or second > index, therefore code cannot make any assumptions. The fix > searches the layout for the correct position dynamically > rather than statically. > The bug manifested itself in NFS, in which a newly attached > subvolume had not received an existing directory. This case > is a "stale entry" and marked as such in the layout for > that directory. The code did not see this, because it > looked at the wrong index in the layout array. > The fix also adds the check for decomissioned bricks, and > fixes a problem in detach tier related to starting the > rebalance process: we never received the right defrag > command and it did not get directed to the tier translator. > Change-Id: I77cdf9fbb0a777640c98003188565a79be9d0b56 > BUG: 1214289 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: Idb2eec9ba25812f41de7f960a0314c92341d6b5d BUG: 1259081 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12086 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com>
* cluster/dht: Don't set posix acls on linkto filesNithya Balachandran2015-08-311-0/+34
| | | | | | | | | | | | | | | | | | | | | | Posix acls on a linkto file change the file's permission bits and cause DHT to treat it as a non-linkto file.This happens on the migration failure of a file on which posix acls were set. The fix prevents posix acls from being set on a linkto file and copies them across only after a file has been successfully migrated. Change-Id: Iccf7ff6fba49fe05d691d9b83bf76a240848b212 BUG: 1258377 Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12025 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12062 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/dht: avoid mknod on decommissioned brickSusant Palai2015-08-272-35/+334
| | | | | | | | | | | | BUG: 1256702 Change-Id: I0795720cb77a9c77e608f34fbb69574fd2acb542 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/11998 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/12024
* dht: block/handle create op falling to decommissioned brickSusant Palai2015-08-265-56/+455
| | | | | | | | | | | | | | | | | | | | | | | | | Problem: Post remove-brick start till commit phase, the client layout may not be in sync with disk layout because of lack of lookup. Hence,a create call may fall on the decommissioned brick. Solution: Will acquire a lock on hashed subvol. So that a fix-layout or selfheal can not step on layout while reading the layout. Even if we read a layout before remove-brick fix-layout and the file falls on the decommissioned brick, the file should be migrated to a new brick as per the fix-layout. BUG: 1256283 Change-Id: I3ef1adaf20dfb9524396a3648d1a664464eda8c1 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/11260 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/12001
* dht/tiering : create new dictionary during migrationMohammed Rafi KC2015-08-191-2/+10
| | | | | | | | | | | | | | | | | | | | | | | To avoid setting wrong xattr during creating link file Back port of: >Change-Id: Iad8de3521eae17e510035ed42e3e01933d647096 >BUG: 1250828 >Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> >Reviewed-on: http://review.gluster.org/11838 >Reviewed-by: N Balachandran <nbalacha@redhat.com> >Tested-by: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Dan Lambright <dlambrig@redhat.com> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit a3faffb259d5288907fac33a2822a8f61c3e86fe) Change-Id: I76ef168cd881c8fd828283a1ae70ed251fc44aaa BUG: 1254438 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/11945 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* dht/tier :rename fails with EBUSYMohammed Rafi KC2015-08-191-8/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the files was in hot tier and the look up was done already, then hashed and cached subvolume will be hot-tier. Once the file is moved from hot-tier to cold-tier, then subsequent lookup will send a revalidate lookup to hot-tier and it will find out that the file was actually moved and there is only link in the cached subvolume. So dht will return an ESTALE to fuse. Upon receiving ESTALE for a lookup, fuse will create a new inode and sent a fresh lookup. This lookup will be successful, and it will locate the file properly. Then fuse try to link the inode, but the older inode was already there in inmemory inode cache with same gfid and that is also shared with fuse kernal. So inode_link will return the older ionode itself. So the subsequent rename fop will come to gluster with the older inode. From dht_rename, we will take a lock on the inode and after successful inodelk on inode dht will send lookup before creating a link. this lookup will again find out that the file is a link file, and then dht will think that file is migrating/migrated in the mean time, and will send EBUSY. Back port of : >Change-Id: Ib3a01e5b1d7f64514b04bb6234026d049f082679 >BUG: 1248306 >Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> >Reviewed-on: http://review.gluster.org/11768 >Tested-by: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> >Reviewed-by: Dan Lambright <dlambrig@redhat.com> >Tested-by: NetBSD Build System <jenkins@build.gluster.org> >Tested-by: Dan Lambright <dlambrig@redhat.com> (cherry picked from commit 0ad26041fbf65ab36856a0ad178c32e51bf87319) Change-Id: I1278a2c2ccc2cadcbe147db836f0526f079f6038 BUG: 1254437 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/11944 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier : Use dht_* versions for xlator_fopsN Balachandran2015-08-191-16/+28
| | | | | | | | | | | | | | | | The tier xlator was using the default_* versions for some xlator_fops. Changed to use the dht_* versions for all xlator_fops Change-Id: I8252fb3911b8a48a55e9eee42b89bd66bbacf799 BUG: 1254468 Signed-off-by: N Balachandran <nbalacha@redhat.com> (cherry picked from commit 0c20107a60726804030f98a7f79b94c677e6a7b6) Reviewed-on: http://review.gluster.org/11951 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: fix demotion when cold tier is ECDan Lambright2015-08-121-0/+2
| | | | | | | | | | | | | | | | | | | This is a backport of 11855. We did not set the gfid in the loc structure in tier demotion. EC has a sanity check which fails FOPs when the loc gfid mismatches with the file attribute. When the FOP failed demotion was aborted. > Change-Id: I69022c9ccb135b86e1feea93b01801b6a4100509 > BUG: 1251121 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: I266d554e3e0a2ff024a5ba3a7e9ca40866688eae BUG: 1252907 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/11901 Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* dht: Adding log messages to the new logging frameworkarao2015-07-2715-335/+961
| | | | | | | | | | | | | | | | | | | | | | | | Backported from: http://review.gluster.org/10021 > Change-Id: Ib3bb61c5223f409c23c68100f3fe884918d2dc3f > BUG: 1194640 > Reviewed-on: http://review.gluster.org/10021 > Reviewed-by: N Balachandran <nbalacha@redhat.com> > Reviewed-by: Joseph Fernandes <josferna@redhat.com> > Tested-by: Joseph Fernandes <josferna@redhat.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Tested-by: Raghavendra G <rgowdapp@redhat.com> > Signed-off-by: arao <arao@redhat.com> BUG: 1217722 Change-Id: Ide79c6c1e6a466fb52f955c90a2b22711bec794a Signed-off-by: arao <arao@redhat.com> Signed-off-by: Anusha Rao <arao@redhat.com> Reviewed-on: http://review.gluster.org/11350 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* dht: send lookup even for fd based operations during rebalanceRavishankar N2015-07-241-22/+30
| | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/11713 Problem: dht_rebalance_inprogress_task() was not sending lookups to the destination subvolume for a file undergoing writes during rebalance. Due to this, afr was not able to populate the read_subvol and failed the write with EIO. Fix: Send lookup for fd based operations as well. Thanks to Raghavendra G for helping with the RCA. Change-Id: Iaa427666328109bbdf228876e62c13b75b7df88e BUG: 1245934 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/11744 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* features/shard: Use xattrop (as opposed to setxattr) for updates to size xattrKrutika Dhananjay2015-07-211-2/+2
| | | | | | | | | | | | Backport of: http://review.gluster.org/11467 Change-Id: I9effecbb1296d11cf1629b5e5cc38192f84cfcb3 BUG: 1243655 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/11689 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* syncop: Include iatt to 'syncop_link' argsSoumya Koduri2015-07-151-1/+1
| | | | | | | | | | | | | | | | | Include iatt to 'syncop_link' args to fetch proper attributes of the newly linked inode. This is backport of the below fix - http://review.gluster.org/11611 Change-Id: If6b92961bd7a89add3791ed3a9b494087348b492 BUG: 1243408 Reviewed-on: http://review.gluster.org/11611 Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/11677 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/tier: fixes for migration over ec as cold tierDan Lambright2015-07-142-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of fix 11433. > An opendir is done in rebalance. The graph constructed when > EC is used in tiering may have no local volumes (if > all the hot volumes are on one node and all the others on > another node). Previously the opendir only sent fops down > the local subvolumes for migration. They must be sent down > both the hot and cold subvolumes for tiering. > When setxattr2() received a NULL subvolume; this dereferenced > an uninitialized variable. > When a lookup is done during creation of the destination > file, the xattr dict is "polluted" with virtual xattrs. > These cause subsequent xattrs in the new file to not be > written by posix. They are required by EC. > The inode gfid for "entry_loc" in gf_defrag_migrate_single_file() > was not initialized. This made underlying translators > think the gfid was 0, and failed migration. > Change-Id: I6ccda8ca8e43485b9b354341bbfcb302496f632c > BUG: 1236212 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: I9b26725e055eecfec235c4291ee90b0e53d0ea62 BUG: 1242274 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/11652 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/dht: Fix Null pointer dereference while loggingRaghavendra G2015-07-061-8/+8
| | | | | | | | | | | Change-Id: I1ea358b83267b0bcdf654ce18fe881fd4a6bf08d BUG: 1233158 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/11314 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: use refcount to manage memory used to store migrationRaghavendra G2015-07-012-21/+46
| | | | | | | | | | | | | | information. Without refcounting, we might free up memory while other fops are still accessing it. BUG: 1235928 Change-Id: Ia4fa4a651cd6fe2394a0c20cef83c8d2cbc8750f Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/11419 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* tier/ctr: Ignore creation of T file and Ctr Lookup heal improvememntsDan Lambright2015-06-271-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a back port of 11334 1) Ignore creation of T file in ctr_mknod 2) Ignore lookup for T file in ctr_lookup 3) Ctr_lookup: a. If the gfid and pgfid in empty dont record b. Decreased log level for multiple heal attempts c. Inode/File heal happens after an expiry period, which is configurable. d. Hardlink heal happens after an expiry period, which is configurable. > Change-Id: Id8eb5092e78beaec22d05f5283645081619e2452 > BUG: 1235269 > Signed-off-by: Joseph Fernandes <josferna@redhat.com> > Reviewed-on: http://review.gluster.org/11334 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Tested-by: Dan Lambright <dlambrig@redhat.com> Change-Id: Ia28a5cf975e41d318906f707deca447aaa35630f BUG: 1236288 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/11446 Reviewed-by: Joseph Fernandes Tested-by: Joseph Fernandes Tested-by: Gluster Build System <jenkins@build.gluster.com>
* tier/dht: Fixing non atomic promotion/demotion w.r.t to frequency periodJoseph Fernandes2015-06-271-40/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes the ping-pong issue i.e files getting demoted immediately after promition, caused by off-sync promotion/demotion processes. The solution is do promotion/demotion refering to the system time. To have the fix working all the file serving nodes should have thier system time synchronized with each other either manually or using a NTP Server. NOTE: The ping-pong issue can re-appear even with this fix, if the admin have different promotion freq period and demotion freq period, but this would be under the control of the admin. Backport of http://review.gluster.org/#/c/11110/ to 3.7.x: > Change-Id: I1b33a5881d0cac143662ddb48e5b7b653aeb1271 > BUG: 1218717 > Signed-off-by: Joseph Fernandes <josferna@redhat.com> > Reviewed-on: http://review.gluster.org/11110 > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Tested-by: Dan Lambright <dlambrig@redhat.com> > Tested-by: Gluster Build System <jenkins@build.gluster.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Change-Id: I81bd1d677487ebc0fc46df4980500102571de68e BUG: 1230857 Reviewed-on: http://review.gluster.org/11191 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tiering/rebalance: tier daemon stopped with out updating statusMohammed Rafi KC2015-06-261-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a subvol goes down, tier daemon stopped immediately, and the status shows as "Progressing". With this change, with respect to tier xlator, when a subvol goes offline it will update the status as failed. Back port of: >Change-Id: I9f722ed0d35cda8c7fc1a7e75af52222e2d0fdb7 >BUG: 1227803 >Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> >Reviewed-on: http://review.gluster.org/11068 >Tested-by: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Dan Lambright <dlambrig@redhat.com> >Tested-by: Dan Lambright <dlambrig@redhat.com> (cherry picked from commit d3714f252d91f4d1d5df05c4dcc8bc7c2ee75326) Change-Id: I268b7e2631779d15db5d9aec495dfd00ded94c5c BUG: 1235203 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/11414 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tiering:static function called from a non static inline functionMohammed Rafi KC2015-06-221-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/11032/ gcc v5.1.1 throws warning for calling a static function from a non-static inline function. <snippet from compiler warning> CC tier.lo tier.c:610:15: warning: 'tier_migrate_using_query_file' is static but used in inline function 'tier_migrate_files_using_qfile' which is not static ret = tier_migrate_using_query_file ((void *)query_cbk_args); ^ tier.c:585:47: warning: 'tier_process_brick_cbk' is static but used in inline function 'tier_build_migration_qfile' which is not static ret = dict_foreach (args->brick_list, tier_process_brick_cbk, ^ tier.c:565:176: warning: 'demotion_qfile' is static but used in inline function 'tier_build_migration_qfile' which is not static tier.c:565:158: warning: 'promotion_qfile' is static but used in inline function 'tier_build_migration_qfile' which is not static tier.c:563:58: warning: 'demotion_qfile' is static but used in inline function 'tier_build_migration_qfile' which is not static tier.c:563:40: warning: 'promotion_qfile' is static but used in inline function 'tier_build_migration_qfile' which is not static ret = remove (GET_QFILE_PATH (is_promotion)); ^ CCLD tier.la </snip> Change-Id: I46046feeb79ab4e2724b0ba6b02c9ec8b121ff4e BUG: 1231767 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/11232 Reviewed-by: Joseph Fernandes Tested-by: Joseph Fernandes Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>