path: root/xlators/cluster/dht
Commit message | Author | Age | Files | Lines
* cluster/dht: Pass the correct xdata in fremovexattr fop | Krutika Dhananjay | 2017-05-03 | 1 | -10/+5
  Backport of: https://review.gluster.org/17126

  Change-Id: Id84bc87e48f435573eba3b24d3fb3c411fd2445d
  BUG: 1440635
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: https://review.gluster.org/17148
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
* cluster/dht: Pass the req dict instead of NULL in dht_attr2() | Krutika Dhananjay | 2017-04-29 | 3 | -57/+74
  Backport of: https://review.gluster.org/17085

  This bug was causing VMs to pause during rebalance. When qemu winds
  down a STAT, shard adds the trusted.glusterfs.shard.file-size key to
  the req dict. On detecting that the file has undergone migration, DHT
  re-winds its STAT fop without that dict. As a result, shard does not
  find the value for this key in the unwind path and fails the STAT
  with EINVAL.

  The same bug exists in other fops and is fixed in this patch as well.

  Change-Id: I56273b1a65347dabd38bc6bdd12d618f68287a00
  BUG: 1440635
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: https://review.gluster.org/17121
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* cluster/dht: Modify local->loc.gfid in thread safe manner | Pranith Kumar K | 2017-04-07 | 1 | -3/+2
  Backport of https://review.gluster.org/16986

  Problem:
  local->loc.gfid in dht_lookup_directory() is the null gfid for a
  fresh lookup. dht_lookup_dir_cbk() updates local->loc.gfid while, in
  another thread, dht_lookup_directory() is still winding lookup calls
  to subvolumes, so EC can see a partially written gfid. On a 12x(4+2)
  volume we saw EC receive a loc whose gfid had its last 10 bytes
  matching the gfid of the directory and its first 6 bytes all zeros.
  This makes EC fail the lookup with EINVAL, which in turn makes NFS
  fail the lookup with EIO.

  snip from gdb:
  $37 = (dht_local_t *) 0x7fde5de5b3cc
  (gdb) p /x $37->loc.gfid
  $39 = {0x3b, 0x82, 0x10, 0x5e, 0x40, 0x65, 0x43, 0x14, 0xa0, 0xc6, 0x8, 0xf5, 0x6c, 0x2c, 0xb8, 0x56}
  (gdb) fr 7
  state=<optimized out>) at ec-generic.c:837
  837 ec_lookup_rebuild(fop->xl->private, fop, cbk);
  (gdb) p /x fop->loc[0].gfid
  $40 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x43, 0x14, 0xa0, 0xc6, 0x8, 0xf5, 0x6c, 0x2c, 0xb8, 0x56}

  snip from log:
  [2017-01-29 03:22:30.132328] W [MSGID: 122019] [ec-helpers.c:354:ec_loc_gfid_check] 0-butcher-disperse-4: Mismatching GFID's in loc
  [2017-01-29 03:22:30.132709] W [MSGID: 112199] [nfs3-helpers.c:3515:nfs3_log_newfh_res] 0-nfs-nfsv3: /linux-4.9.5/Documentation => (XID: b27b9474, MKDIR: NFS: 5(I/O error), POSIX: 5(Input/output error)), FH: exportid 00000000-0000-0000-0000-000000000000, gfid 00000000-0000-0000-0000-000000000000, mountid 00000000-0000-0000-0000-000000000000 [Invalid argument]

  Fix:
  Update local->loc.gfid in the last call to make sure there are no
  races.

  >BUG: 1438411
  >Change-Id: Ifcb7e911568c1f1f83123da6ff0cf742b91800a0
  >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>

  BUG: 1438424
  Change-Id: If039956205cfac5e798c2c90e92a9a47b404e804
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: https://review.gluster.org/16988
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
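The "update in the last call" idea from the gfid fix above can be sketched in isolation. This is a minimal illustration, not the DHT code: `struct lookup_state`, `lookup_state_init`, `lookup_reply` and `gfid_is_null` are hypothetical names, and a C11 atomic counter stands in for whatever locking the real translator uses. The assumption is that the shared gfid is only read while requests are still being sent, so writing it only when the final reply arrives avoids readers seeing a half-written value.

```c
#include <stdatomic.h>
#include <string.h>

#define GFID_LEN 16

/* Hypothetical per-request state; a stand-in, not the real dht_local_t. */
struct lookup_state {
    atomic_int call_cnt;          /* replies still outstanding */
    unsigned char gfid[GFID_LEN]; /* shared field read while winding */
};

static void lookup_state_init(struct lookup_state *st, int subvol_count)
{
    atomic_init(&st->call_cnt, subvol_count);
    memset(st->gfid, 0, GFID_LEN); /* null gfid for a fresh lookup */
}

/* Handle one reply. Only the last reply writes the shared gfid.
 * Returns 1 if this was the last reply, 0 otherwise. */
static int lookup_reply(struct lookup_state *st,
                        const unsigned char reply_gfid[GFID_LEN])
{
    /* atomic_fetch_sub returns the value held before the decrement. */
    int is_last = (atomic_fetch_sub(&st->call_cnt, 1) == 1);

    if (is_last)
        memcpy(st->gfid, reply_gfid, GFID_LEN);
    return is_last;
}

/* Returns 1 if every byte of the gfid is zero. */
static int gfid_is_null(const unsigned char gfid[GFID_LEN])
{
    for (int i = 0; i < GFID_LEN; i++)
        if (gfid[i] != 0)
            return 0;
    return 1;
}
```

With three subvolumes, the first two calls to `lookup_reply` leave `gfid` untouched and the third one publishes it in a single step.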
* cluster/dht: Fix crash in "nuke-dir" feature | Krutika Dhananjay | 2017-03-10 | 1 | -1/+10
  Backport of: https://review.gluster.org/16829

  My patch at https://review.gluster.org/16419 results in core dumps
  every time I run tests/features/nuke.t. It turns out that dht, upon
  successfully "nuking" a directory (an operation initiated through a
  setxattr), unwinds the operation with the rmdir fop signature. This
  makes readdir-ahead cast a struct iatt (preparent) to dict_t, which
  leads to a crash.

  Change-Id: Ib970b3198185a6c641092b00e115a672cb3f9111
  BUG: 1428743
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: https://review.gluster.org/16840
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
* cluster/dht Fix error assignment in dht_*xattr2 functions | N Balachandran | 2017-03-10 | 1 | -2/+6
  Corrected the op_errno assignments and NULL checks in the
  dht_setxattr2 and dht_removexattr2 functions. Earlier, they unwound
  with the default EINVAL op_errno if the file had been deleted.

  > Change-Id: Iaf837a473d769cea40132487a966c7f452990071
  > BUG: 1421653
  > Signed-off-by: N Balachandran <nbalacha@redhat.com>
  > Reviewed-on: https://review.gluster.org/16610
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: MOHIT AGRAWAL <moagrawa@redhat.com>
  > Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>

  (cherry picked from commit 028626a86ea409f908783b9007c02877f20be43e)
  Signed-off-by: N Balachandran <nbalacha@redhat.com>
  Change-Id: Id2e91df47bcd734dda18700fb075608c1627a608
  BUG: 1424915
  Reviewed-on: https://review.gluster.org/16678
  Tested-by: N Balachandran <nbalacha@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/dht: Don't update layout in rebalance_task_completion | N Balachandran | 2017-02-13 | 1 | -24/+0
  Updating the layout in the dht inode_ctx in rebalance_task_completion
  after the file is migrated is erroneous in case of files with
  hardlinks. This step can be skipped as the layout will be set in the
  syncop_lookup call post the migration in dht_migrate_file.

  > Change-Id: I24ac798a919585d91a117d6a207e6a31b88486c6
  > BUG: 1415761
  > Signed-off-by: N Balachandran <nbalacha@redhat.com>
  > Reviewed-on: https://review.gluster.org/16457
  > Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
  > Reviewed-by: Susant Palai <spalai@redhat.com>

  Signed-off-by: N Balachandran <nbalacha@redhat.com>
  Change-Id: Ifccffd67b8bc12208efb23101366a1ac7a8c60f5
  BUG: 1420184
  Reviewed-on: https://review.gluster.org/16561
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Tested-by: N Balachandran <nbalacha@redhat.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Susant Palai <spalai@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
* cluster/dht: Incorrect migration checks in fsync | N Balachandran | 2017-01-22 | 1 | -5/+6
  Fixed the order of the migration phase checks in dht_fsync_cbk.
  Phase1 should never be hit if op_ret is non zero.

  > Change-Id: I9222692e04868bffa93498059440f0aa553c83ec
  > BUG: 1410777
  > Reviewed-on: http://review.gluster.org/16350
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
  > Tested-by: Raghavendra G <rgowdapp@redhat.com>
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: Raghavendra G <rgowdapp@redhat.com>

  (cherry picked from commit 83117c71482c9f87a0b6812094cbb22497eb3faa)
  Change-Id: I0455871aef230371db5ecb1b2e6b1468e8a6d079
  BUG: 1412119
  Signed-off-by: N Balachandran <nbalacha@redhat.com>
  Reviewed-on: https://review.gluster.org/16374
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/dht: Do rename cleanup as root | Pranith Kumar K | 2017-01-17 | 1 | -0/+1
  Problem:
  Rename linkfile cleanup is done as a non-root user, which may not
  have the privileges to do the rename, so it fails with EACCES. A
  future MKDIR on that name will then result in a hole on this
  subvolume. This is not easy to hit on fuse mounts because the VFS
  does the permission checks even before the rename fop is wound, but
  it does happen with nfs-ganesha mounts.

  Fix:
  Do rename cleanup as root

  >BUG: 1409727
  >Change-Id: I414c1eb6dce76b4516a6c940557b249e6c3f22f4
  >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  >Reviewed-on: http://review.gluster.org/16317
  >Smoke: Gluster Build System <jenkins@build.gluster.org>
  >CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  >Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
  >Reviewed-by: N Balachandran <nbalacha@redhat.com>
  >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>

  BUG: 1412913
  Change-Id: I7f891034150d7a0e3210202fb0788040c91e1c09
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: http://review.gluster.org/16390
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: N Balachandran <nbalacha@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* dht/rebalance: remove errno check for failure detection | Susant Palai | 2017-01-08 | 1 | -16/+13
  > BUG: 1410355
  > Change-Id: I867419ca36a81ef7209e6911a46c1c2c898b8eab
  > Signed-off-by: Susant Palai <spalai@redhat.com>
  > Reviewed-on: http://review.gluster.org/16328
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  >Reviewed-by: Raghavendra G <rgowdapp@redhat.com>

  (cherry picked from commit 451ca272d12f2d49522c845d53585520f71525f8)
  Change-Id: If05094346fc1fc48f4982db6a195740171ae2fad
  BUG: 1410764
  Signed-off-by: Susant Palai <spalai@redhat.com>
  Reviewed-on: http://review.gluster.org/16348
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
* cluster/dht: Fix dict_leak in migration check tasks | N Balachandran | 2017-01-08 | 1 | -0/+7
  Fixed a memleak where dict was not being unrefed in the
  dht_migration_complete_check_task and dht_rebalance_inprogress_task
  functions.

  BUG: 1410369

  > Change-Id: I3d42e9a2e5c8596c985bf6431a68fd3905227383
  > BUG: 1409186
  > Signed-off-by: N Balachandran <nbalacha@redhat.com>
  > Reviewed-on: http://review.gluster.org/16308
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: MOHIT AGRAWAL <moagrawa@redhat.com>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  > Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
  > (cherry picked from commit 11b6a2c9fc5232b58774cab29873406c0fbfef19)

  Change-Id: I4e92e4c87b52f6f5ffb859005aa3d525225b4bda
  Signed-off-by: N Balachandran <nbalacha@redhat.com>
  Reviewed-on: http://review.gluster.org/16331
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
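The kind of leak described in the dict_leak commit above usually comes from an early return that skips the unref. A minimal sketch of the single-exit pattern that avoids it follows. It does not use GlusterFS's `dict_t`: `struct ref_dict`, `ref_dict_new`, `ref_dict_unref`, `migration_check_task` and the `live_dicts` counter are all hypothetical and exist only for this illustration.

```c
#include <stdlib.h>

/* Hypothetical minimal refcounted object; not GlusterFS's dict_t. */
struct ref_dict {
    int refcount;
};

/* Number of dicts allocated and not yet freed (demonstration only). */
static int live_dicts = 0;

static struct ref_dict *ref_dict_new(void)
{
    struct ref_dict *d = calloc(1, sizeof(*d));

    if (d) {
        d->refcount = 1;
        live_dicts++;
    }
    return d;
}

/* Drop one reference; free the object when the count reaches zero. */
static void ref_dict_unref(struct ref_dict *d)
{
    if (d && --d->refcount == 0) {
        free(d);
        live_dicts--;
    }
}

/* Hypothetical task: every path, including failures, leaves through
 * the "out" label, so the dict is unrefed exactly once. */
static int migration_check_task(int simulate_failure)
{
    int ret = -1;
    struct ref_dict *dict = ref_dict_new();

    if (!dict)
        goto out;
    if (simulate_failure)
        goto out; /* the early exit still reaches the unref below */
    ret = 0;
out:
    ref_dict_unref(dict); /* safe when dict is NULL */
    return ret;
}
```

After either `migration_check_task(0)` or `migration_check_task(1)` returns, `live_dicts` is back at zero, which is the property the fix restores.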
* cluster/dht: Fix memory corruption while accessing regex stored in private | Raghavendra G | 2017-01-03 | 3 | -45/+64
  If reconfigure is executed in parallel (or concurrently with
  dht_init), there are races that can corrupt memory. One such race is
  modification of the regexes stored in conf (conf->rsync_regex_valid
  and conf->extra_regex_valid) through dht_init_regex. With change [1],
  the reconfigure codepath can be executed in parallel (with itself or
  with dht_init), so this fix is needed.

  Also, a reconfigure can race with any thread doing
  dht_layout_search, resulting in dht_layout_search accessing a regex
  freed by reconfigure (as in bz 1399134).

  [1] http://review.gluster.org/15046

  >Change-Id: I039422a65374cf0ccbe0073441f0e8c442ebf830
  >BUG: 1399134
  >Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
  >Reviewed-on: http://review.gluster.org/15945
  >Smoke: Gluster Build System <jenkins@build.gluster.org>
  >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  >Reviewed-by: N Balachandran <nbalacha@redhat.com>
  >CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  >Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>

  Change-Id: I039422a65374cf0ccbe0073441f0e8c442ebf830
  BUG: 1399423
  Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
  (cherry picked from commit 64451d0f25e7cc7aafc1b6589122648281e4310a)
  Reviewed-on: http://review.gluster.org/15793
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* dht/rename : Incase of failure remove linkto file properly | Jiffin Tony Thottan | 2017-01-02 | 2 | -2/+14
  Generally a linkto file is created as the root user. Consider the
  following case: a user tries to rename a file without having
  permission to do so. The rename fails with EACCES, and when rename
  then tries to clean up the linkto file, the cleanup fails as well.

  The issue shows up when the rename/00.t test is executed on
  nfs-ganesha clients.

  Steps executed in the script:
  * create a file "abc" using root
  * rename the file "abc" to "xyz" using a non-root user; it fails
    with EACCES
  * delete "abc"
  * create directory "abc" using root
  * again try to rename "abc" to "xyz" using a non-root user; the test
    hangs here, which slowly leads to an OOM kill of the ganesha
    process

  RCA put forward by Du for the OOM kill of ganesha:

  Note that when we hit this bug, we have a dentry that is present as:
  * a linkto file on one subvol
  * a directory on the rest of the subvols

  When a lookup happens on the dentry in such a scenario, the control
  flow goes into an infinite loop of:

  dht_lookup_everywhere
  dht_lookup_everywhere_cbk
  dht_lookup_unlink_cbk
  dht_lookup_everywhere_done
  dht_lookup_directory (as local->dir_count > 0)
  dht_lookup_dir_cbk (sets local->need_selfheal = 1 as the entry is a
                      linkto file on one of the subvols)
  dht_lookup_everywhere (as need_selfheal = 1)

  This infinite loop can cause increased memory consumption because:
  1) dht_lookup_directory assigns a new layout to local->layout
     unconditionally
  2) Most of the functions in this loop do a stack_wind of various
     fops. This makes the call stack grow (note that the call stack is
     destroyed only after the lookup response is received by fuse,
     which never happens in this case)

  Thanks Du for root causing the oom kill and Sushant for suggesting
  the fix

  Upstream reference :
  >Change-Id: I1e16bc14aa685542afbd21188426ecb61fd2689d
  >BUG: 1397052
  >Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
  >Reviewed-on: http://review.gluster.org/15894
  >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  >CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  >Smoke: Gluster Build System <jenkins@build.gluster.org>
  >Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
  >(cherry picked from commit 57d59f4be205ae0c7888758366dc0049bdcfe449)

  Change-Id: I1e16bc14aa685542afbd21188426ecb61fd2689d
  BUG: 1401029
  Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
  Reviewed-on: http://review.gluster.org/16015
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
* cluster/dht: Check for null inode | N Balachandran | 2017-01-02 | 1 | -2/+5
  Check for NULL inode before attempting to set dht inode ctx.

  > Change-Id: I7693c18445f138221d8417df5e95b118cedb818a
  > BUG: 1395261
  > Signed-off-by: N Balachandran <nbalacha@redhat.com>
  > Reviewed-on: http://review.gluster.org/15847
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: Atin Mukherjee <amukherj@redhat.com>

  (cherry picked from commit 8313d53accaa22feb14d284fb91245be0a32e16e)
  Change-Id: I7607d32d38d707dd5d71b98efffd1a458ffe90d7
  BUG: 1395510
  Signed-off-by: N Balachandran <nbalacha@redhat.com>
  Reviewed-on: http://review.gluster.org/15850
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
* dht/rebalance: reverify lookup failures | Susant Palai | 2016-12-27 | 2 | -58/+130
  race: readdirp has read an entry and is doing a lookup on it, but
  the user might have renamed or removed that entry just after the
  readdirp and before the lookup. Since remove-brick is a costly
  operation, ignore any ENOENT/ESTALE failures and move on.

  > Change-Id: I62c7fa93c0b9b7e764065ad1574b97acf51b5996
  > BUG: 1408115
  > Signed-off-by: Susant Palai <spalai@redhat.com>
  > Reviewed-on: http://review.gluster.org/15846
  > Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>

  Change-Id: I62c7fa93c0b9b7e764065ad1574b97acf51b5996
  BUG: 1408414
  Reviewed-on: http://review.gluster.org/15846
  Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  (cherry picked from commit c20febcb1ffaef3fa29563987e7a3b554aea27b3)
  Signed-off-by: Susant Palai <spalai@redhat.com>
  Reviewed-on: http://review.gluster.org/16278
* cluster/dht: A hard link is lost during rebalance + lookup | Mohit Agrawal | 2016-12-14 | 1 | -35/+62
  Problem: A hard link is lost during rebalance + lookup. Rebalance
           skips files that have hardlinks. In dht_migrate_file, the
           __is_file_migratable () function checks whether a file has
           a hardlink; if it does, the file is not migrated. However,
           if a link is created after this function has been called,
           that link is lost.

  Solution: Call __check_file_has_hardlink to check for hardlinks
            after the (S+T) bits are set in the migration process. If
            the file has a hardlink, skip migrating it.

  > BUG: 1396048
  > Change-Id: Ia53c07ef42f1128c2eedf959a757e8df517b9d12
  > Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
  > Reviewed-on: http://review.gluster.org/15866
  > Reviewed-by: Susant Palai <spalai@redhat.com>
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: N Balachandran <nbalacha@redhat.com>
  > Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
  > (cherry picked from commit 71dd2e914d4a537bf74e1ec3a24512fc83bacb1d)

  BUG: 1399432
  Change-Id: I30e21efd5a054d8a3e640ab3ed8aa7955d083926
  Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
  Reviewed-on: http://review.gluster.org/15954
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: N Balachandran <nbalacha@redhat.com>
  Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* dht: Proper log message if data migration is skipped | ankitraj | 2016-11-02 | 1 | -8/+8
  The logs contained a misleading message about available disk space,
  emitted while calculating free space during rebalancing of bricks.

  Bug: 1390870
  Backport of http://review.gluster.org/#/c/15345/
  Change-Id: Ie9df0b2cbf00faaf13a0a3f0dbd657377a082755
  Signed-off-by: ankitraj <anraj@redhat.com>
  Reviewed-on: http://review.gluster.org/15765
  Tested-by: ankitraj
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* dht: udpate stbuf from servers those have layout | Susant Palai | 2016-09-29 | 1 | -3/+29
  Problem: To heal uid/gid we check whether local->stbuf.ia_ctime is
  less than stbuf->ia_ctime (received from the brick). If it is, the
  uid/gid is copied into local->prebuf (the source of healing).
  However, we also merge local->stbuf from the newly added brick. So
  if the response from the newly added brick arrives first and updates
  local->stbuf, local->prebuf stays empty, because the newly added
  brick has the latest ctime among all servers. This can result in
  healing wrong uid/gids to the rest of the servers.

  Hence, we should update local->stbuf only from servers that have a
  layout, which avoids merging stbufs from newly added bricks.

  > Reviewed-on: http://review.gluster.org/15126
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: Raghavendra G <rgowdapp@redhat.com>

  (cherry picked from commit 36af81ac7cb2d459f9bfc0c436f0038a68f85235)
  Change-Id: If4b64f75a0ea669abdbe9f5a3d1d18ff19374c2f
  BUG: 1375096
  Signed-off-by: Susant Palai <spalai@redhat.com>
  Reviewed-on: http://review.gluster.org/15464
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* dht: "replica.split-brain-status" attribute value is not correct | Mohit Agrawal | 2016-09-26 | 2 | -12/+210
  Problem: In a distributed-replicate volume, the value of the
           "replica.split-brain-status" attribute does not show the
           split-brain condition even though the directory is in
           split-brain. If the directory is in split-brain on multiple
           replica pairs, it does not show the full list of replica
           pairs.

  Solution: Update the dht_aggregate code to aggregate the xattr value
            in this specific condition.

  Fix:
  1) function getChoices returns the choices from the split-brain
     status string.
  2) function add_opt adds the choices to a local buffer to store in
     the dictionary.
  3) For the key "replica.split-brain-status", function dht_aggregate
     calls dht_aggregate_split_brain_xattr to prepare the list.

  Test:
  The patch was verified with the steps below.
  1) Create a distributed replica volume and create a mount point
  2) Stop the heal daemon
  3) Touch files and directories on the mount point
     mkdir test{1..5};touch tmp{1..5}
  4) Bring down the brick process on one of the replica sets
     pkill -9 glusterfsd
  5) Change the permission of the dirs on the mount point
     chmod 755 test{1..5}
  6) Restart the brick process on the node with the force option
  7) Kill the brick process on the other node in the same replica set
  8) Change the permission of the dirs again on the mount point
     chmod 766 test{1..5}
  9) Re-execute the same steps from 4-9 on the other replica set also
  10) Checking heal status on the server then shows the dirs in
      split-brain on all replica sets
  11) Checking the replica.split-brain-status attr on the mount point
      then shows the wrong split-brain status.
  12) After applying the patch, the attribute shows the correct value.

  > Change-Id: Icdfd72005a4aa82337c342762775a3d1761bbe4a
  > Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
  > Reviewed-on: http://review.gluster.org/15201
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
  > (cherry picked from commit c4e9ec653c946002ab6d4c71ee8e6df056438a04)

  Change-Id: I85a5ae60189066d9e80799f00f1352c2f33ef4f8
  Backport of commit c4e9ec653c946002ab6d4c71ee8e6df056438a04
  BUG: 1375098
  Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
  Reviewed-on: http://review.gluster.org/15467
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/dht: heal root permission post add-brick | Susant Palai | 2016-09-13 | 2 | -5/+43
  After an add-brick, the new brick has permission 755 by default. If
  the root directory permission was something other than 755, it does
  not get healed to the new brick, leading to permission errors and
  inconsistencies.

  To choose the source of the attr heal, we can trust the subvols
  which have layouts with the latest ctime (as part of missing
  directory heal, we heal the proper attr). If none of the subvols
  have a layout, return ESTALE to retrigger a fresh lookup.

  Note: This patch heals the permission of the root directory only.
  Permission healing of other directories is not straightforward and
  requires an intrusive fix, so it is not addressed here.

  > Reviewed-on: http://review.gluster.org/15195
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  >Reviewed-by: Raghavendra G <rgowdapp@redhat.com>

  (cherry picked from commit 801cd07a4c6ec65ff930b2ae6bb5e405ccd03334)
  Change-Id: If894e3895d070d46b62d2452e52c1eaafcf56c29
  BUG: 1374573
  Signed-off-by: Susant Palai <spalai@redhat.com>
  Reviewed-on: http://review.gluster.org/15465
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/dht: Skip layout overlap maximization on weighted rebalance | N Balachandran | 2016-09-13 | 1 | -4/+21
  During a fix-layout, dht_selfheal_layout_maximize_overlap () does
  not consider chunk sizes while calculating layout overlaps, causing
  smaller bricks to sometimes get larger ranges than larger bricks.

  As a temporary measure, enable this operation only if weighted
  rebalance is disabled or all bricks are the same size.

  > Change-Id: I5ed16cdff2551b826a1759ca8338921640bfc7b3
  > BUG: 1366494
  > Signed-off-by: N Balachandran <nbalacha@redhat.com>
  > Reviewed-on: http://review.gluster.org/15403
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>

  (cherry picked from commit b93692cce603006d9cb6750e08183bca742792ac)
  Change-Id: Icf0dd83f36912e721982bcf818a06c4b339dc974
  BUG: 1374135
  Signed-off-by: N Balachandran <nbalacha@redhat.com>
  Reviewed-on: http://review.gluster.org/15422
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
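The condition stated in the overlap-maximization commit above (run the step only when weighted rebalance is off or all bricks are equal in size) can be written as a small predicate. This is a sketch under assumptions, not the patch itself: `should_maximize_overlap` and `all_bricks_same_size` are hypothetical names, and brick sizes are modelled as a plain array of byte counts.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if every brick in the array has the same size.
 * An empty or single-element array counts as "all the same". */
static bool all_bricks_same_size(const uint64_t *sizes, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (sizes[i] != sizes[0])
            return false;
    return true;
}

/* Hypothetical gate for the overlap-maximization step: allowed when
 * weighted rebalance is disabled, or when the weights would be equal
 * anyway because all bricks are the same size. */
static bool should_maximize_overlap(bool weighted_rebalance,
                                    const uint64_t *sizes, size_t n)
{
    return !weighted_rebalance || all_bricks_same_size(sizes, n);
}
```

With weighted rebalance enabled and bricks of 100, 400 and 100 units, the predicate returns false, so the step that ignores chunk sizes would be skipped.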
* cluster/dht: move layout logs to DEBUG level | Susant Palai | 2016-09-07 | 2 | -5/+8
  > Reviewed-on: http://review.gluster.org/15343
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: N Balachandran <nbalacha@redhat.com>
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
  > Signed-off-by: Susant Palai <spalai@redhat.com>

  (cherry picked from commit 15c790b502ba92caa17f2d1870c3d75d547e6bad)
  Change-Id: Iad96256218be643b272762b5638a3f6837aff28d
  BUG: 1366496
  Signed-off-by: Susant Palai <spalai@redhat.com>
  Reviewed-on: http://review.gluster.org/15413
  Reviewed-by: N Balachandran <nbalacha@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* quotad: fix potential buffer overflows | Raghavendra G | 2016-08-27 | 2 | -4/+23
  This converts sprintf to gf_asprintf in following components:
  * quotad.c
  * dht
  * afr
  * protocol/client
  * rpc/rpc-lib
  * rpc/rpc-transport

  This is a backport of http://review.gluster.org/#/c/14102/

  > Change-Id: If8a267bab3d91003bdef3a92664077a0136745ee
  > BUG: 1332073
  > Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
  > Reviewed-on: http://review.gluster.org/14102
  > Tested-by: Manikandan Selvaganesh <mselvaga@redhat.com>
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: Manikandan Selvaganesh <mselvaga@redhat.com>

  Change-Id: If8a267bab3d91003bdef3a92664077a0136745ee
  BUG: 1366746
  Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
  Reviewed-on: http://review.gluster.org/15325
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Tested-by: Manikandan Selvaganesh <mselvaga@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
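Why an allocating formatter removes the overflow risk that sprintf carries can be shown with a small stand-alone sketch. The helper below is not gf_asprintf: `format_alloc` is a hypothetical name, built only on standard C `vsnprintf`, and it is an assumption that the Gluster wrapper behaves along similar lines. The point is that the buffer is sized from the formatted length rather than guessed in advance.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical allocating formatter (not GlusterFS's gf_asprintf).
 * On success stores a malloc'ed string in *out and returns its length;
 * on failure stores NULL and returns -1. The caller frees *out. */
static int format_alloc(char **out, const char *fmt, ...)
{
    va_list ap, ap2;
    char *buf;
    int len;

    *out = NULL;
    va_start(ap, fmt);
    va_copy(ap2, ap); /* the argument list is consumed twice */

    /* First pass: measure the formatted length without writing. */
    len = vsnprintf(NULL, 0, fmt, ap);
    va_end(ap);
    if (len < 0) {
        va_end(ap2);
        return -1;
    }

    buf = malloc((size_t)len + 1);
    if (!buf) {
        va_end(ap2);
        return -1;
    }

    /* Second pass: write into a buffer that is exactly large enough. */
    vsnprintf(buf, (size_t)len + 1, fmt, ap2);
    va_end(ap2);

    *out = buf;
    return len;
}
```

A call such as `format_alloc(&s, "%s-client-%d", "vol", 12)` yields a 13-character string regardless of how long the volume name is, where a fixed-size `sprintf` target could be overrun.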
* cluster/dht: initialize cbk before attempting inode-link | Raghavendra G | 2016-08-15 | 1 | -3/+3
  Otherwise inode-link failures in selfheal codepath will result in a
  crash.

  > Change-Id: I9061629ae9d1eb1ac945af5f448d0d8b397a5022
  > BUG: 1345748
  > Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
  > Reviewed-on: http://review.gluster.org/14707
  > Reviewed-by: N Balachandran <nbalacha@redhat.com>
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  > Reviewed-by: Poornima G <pgurusid@redhat.com>
  > Reviewed-by: Susant Palai <spalai@redhat.com>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: Jeff Darcy <jdarcy@redhat.com>

  (cherry picked from commit a4d35ccb8afeefae4d9cdd36ac19b0e97d0d04d0)
  Change-Id: I9061629ae9d1eb1ac945af5f448d00dba97a5022
  BUG: 1366482
  Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
  Reviewed-on: http://review.gluster.org/15157
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* dht/rebalance: allocate migrator thread pool dynamicallySusant Palai2016-08-101-3/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: The maximum number of migrator threads was statically set to 40, while the number of threads that rebalance creates depends on the number of cores the user has. If the number of cores exceeds 40, a crash or memory corruption can be seen. Fix: Make the migrator thread pool dynamic. > Change-Id: Ifbdac8a1a396363dd75e2f6bcb454070cfdbf839 > BUG: 1362069 > Reviewed-on: http://review.gluster.org/15000 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit b8e8bfc7e4d3eaf76bb637221bc6392ec10ca54b) Change-Id: Ifbdac8a1a396363dd75e2f6bcb454070cfdbf839 BUG: 1362069 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/15061 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
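As a rough illustration of the fix, the pool can be sized from the requested thread count rather than from a fixed array. The structure and function names below are hypothetical and the integer slots merely stand in for thread handles; the actual patch may be organized differently.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical pool descriptor; names are illustrative, not GlusterFS's. */
struct migrator_pool {
    size_t count;    /* number of worker slots */
    int   *slot_ids; /* stand-in for per-thread handles */
};

/* Size the pool from the requested thread count instead of a fixed
 * array of 40 entries, so a larger count cannot overrun the array. */
static int migrator_pool_init(struct migrator_pool *pool, size_t nthreads)
{
    assert(pool != NULL);
    pool->slot_ids = calloc(nthreads, sizeof(*pool->slot_ids));
    if (pool->slot_ids == NULL && nthreads > 0)
        return -1;
    pool->count = nthreads;
    for (size_t i = 0; i < nthreads; i++)
        pool->slot_ids[i] = (int)i;
    return 0;
}

/* Release the slots and reset the descriptor. */
static void migrator_pool_fini(struct migrator_pool *pool)
{
    assert(pool != NULL);
    free(pool->slot_ids);
    pool->slot_ids = NULL;
    pool->count = 0;
}
```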
* cluster/tier: dont promote if estimated block consumption > hi watermarkMilind Changire2016-08-052-50/+153
| | | | | | | | | | | | | | | | | | | | | | | | | | Add test to fail promotion if estimated block consumption grows beyond hi watermark. Skip file migrations until next cycle if tier_get_fs_stat() fails in tier_migrate_using_query_file() > Reviewed-on: http://review.gluster.org/14780 > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> (cherry picked from commit 1f4e41e8c2f5f4af4564caba0a08996853f089f4) Change-Id: Ice04572fa739c09109c4433e65965197482a7beb BUG: 1362198 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: http://review.gluster.org/15065 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com>
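The promotion check can be pictured roughly as below. This is only a sketch: the function name and parameters are hypothetical, and it assumes the high watermark is expressed as a percentage of the hot tier's total blocks, which the commit message does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: would promoting a file of file_blocks push hot-tier
 * usage above the high watermark (a percentage of total_blocks)? */
static bool promotion_exceeds_hi_watermark(uint64_t used_blocks,
                                           uint64_t file_blocks,
                                           uint64_t total_blocks,
                                           unsigned hi_watermark_pct)
{
    assert(total_blocks > 0 && hi_watermark_pct <= 100);
    uint64_t estimated = used_blocks + file_blocks;
    /* Compare estimated/total > pct/100 without floating point.
     * Overflow for extremely large block counts is ignored here. */
    return estimated * 100 > (uint64_t)hi_watermark_pct * total_blocks;
}
```

With 800 of 1000 blocks used and a 90% watermark, a 150-block file would be refused while a 50-block file would still be promoted.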
* cluster/distribute: heal layout in discover codepath tooRaghavendra G2016-06-281-33/+7
| | | | | | | | | | | | | | | | | | | | | Backport of commit a74f8cf4e7edc2ce9f045317a18dacddf25adb8a: > BUG: 1334164 > Change-Id: I4259d88f2b6e4f9d4ad689bc4e438f1db9cfd177 > Signed-off-by: Raghavendra G <rgowdapp@redhat.com> > Reviewed-on: http://review.gluster.org/14365 > Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Change-Id: Ic559c220a1f0051e531314d13940604e2dead08c BUG: 1348060 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/14351 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com>
* dht:remember locked subvol and send unlock to the sameMohammed Rafi KC2016-06-205-21/+218
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During locking we send the lock request to the cached subvol, and normally we send the unlock to the cached subvol as well. But with a parallel fresh lookup on a directory, there is a race window in which the cached subvol can change, so the unlock can go to a different subvol from the one on which we took the lock. This results in a stale lock held on one of the subvols. So we store the details of the subvol on which we took the lock and send the unlock to the same subvol. Backport of: >Change-Id: I47df99491671b10624eb37d1d17e40bacf0b15eb >BUG: 1311002 >Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> >Reviewed-on: http://review.gluster.org/13492 >Reviewed-by: N Balachandran <nbalacha@redhat.com> >Smoke: Gluster Build System <jenkins@build.gluster.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> >CentOS-regression: Gluster Build System <jenkins@build.gluster.com> (cherry picked from commit ef0db52bc55a51fe5e3856235aed0230b6a188fe) Change-Id: Ib821e7355b4937b86d2f9f11e2c8311b7301b6c7 BUG: 1347524 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/14750 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
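The idea of recording the lock target can be sketched as follows. The structure and function names are hypothetical and subvolumes are represented by plain strings; the real patch stores this state in DHT's own per-call structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-operation state; field names are illustrative. */
struct lock_ctx {
    const char *cached_subvol; /* may be changed by a parallel lookup */
    const char *locked_subvol; /* recorded when the lock is taken */
};

/* Take the lock on the currently cached subvol and remember which one. */
static const char *ctx_lock(struct lock_ctx *ctx)
{
    assert(ctx != NULL && ctx->cached_subvol != NULL);
    ctx->locked_subvol = ctx->cached_subvol;
    return ctx->locked_subvol;
}

/* Unlock the subvol recorded at lock time, not the current cached one. */
static const char *ctx_unlock(struct lock_ctx *ctx)
{
    assert(ctx != NULL && ctx->locked_subvol != NULL);
    const char *target = ctx->locked_subvol;
    ctx->locked_subvol = NULL;
    return target;
}
```

Even if `cached_subvol` is switched between the two calls, `ctx_unlock` still returns the subvol that `ctx_lock` used, so no stale lock is left behind.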
* cluster/dht: Fix unsafe iteration on inode->fd_listXavier Hernandez2016-06-181-16/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When DHT traverses the inode->fd_list, it does that in an unsafe way that can generate races with fd_unref() called from other threads. This patch fixes this problem taking the inode->lock and adding a reference to the fd while it's being used outside of the mutex protected region. A minor change in storage/posix has been done to also access the inode->fd_list in a safe way. Backport of: > Change-Id: I10d469ca6a8f76e950a8c9779ae9c8b70f88ef93 > BUG: 1344340 > Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> > Reviewed-on: http://review.gluster.org/14682 > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: I10d469ca6a8f76e950a8c9779ae9c8b70f88ef93 BUG: 1346750 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/14733 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
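The pattern described above (walk the list under the lock, take a reference, then use the fd outside the lock) might look roughly like this. `my_fd` and `my_inode` are simplified stand-ins invented for the sketch, not the GlusterFS inode_t and fd_t types.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Simplified stand-ins for inode_t and fd_t; not the GlusterFS types. */
struct my_fd {
    int           refcount;
    struct my_fd *next;
};

struct my_inode {
    pthread_mutex_t lock;
    struct my_fd   *fd_list;
};

/* Return the fd after prev (or the first fd when prev is NULL) with an
 * extra reference taken while the inode lock is held. */
static struct my_fd *fd_next_ref(struct my_inode *inode, struct my_fd *prev)
{
    assert(inode != NULL);
    pthread_mutex_lock(&inode->lock);
    struct my_fd *fd = prev ? prev->next : inode->fd_list;
    if (fd != NULL)
        fd->refcount++;
    pthread_mutex_unlock(&inode->lock);
    return fd;
}

/* Drop a reference taken by fd_next_ref(). */
static void fd_put(struct my_inode *inode, struct my_fd *fd)
{
    pthread_mutex_lock(&inode->lock);
    fd->refcount--;
    pthread_mutex_unlock(&inode->lock);
}

/* Visit every fd; each one stays referenced while used outside the lock. */
static int visit_all_fds(struct my_inode *inode)
{
    int visited = 0;
    struct my_fd *fd = fd_next_ref(inode, NULL);
    while (fd != NULL) {
        visited++; /* "use" the fd without holding the lock */
        struct my_fd *next = fd_next_ref(inode, fd);
        fd_put(inode, fd);
        fd = next;
    }
    return visited;
}
```

The reference on the current fd is dropped only after the next one has been pinned, so the walker never follows a pointer from an entry it no longer holds.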
* cluster/dht: Handle rmdir failure correctlyN Balachandran2016-06-182-13/+108
| | | | | | | | | | | | | | | | | | | | | | | | | | | DHT did not handle rmdir failures on non-hashed subvols correctly in a 2x2 dist-rep volume, causing the directory to be deleted from the hashed subvol. Also fixed an issue where the dht_selfheal_restore errcodes were overwriting the rmdir error codes. > Change-Id: If2c6f8dc8ee72e3e6a7e04a04c2108243faca468 > BUG: 1330032 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/14060 > Smoke: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit 78c1c6002f0b11afa997a14f8378c04f257ea1c5) Change-Id: Id3f7c8fd515586d09f1f29c2eceddfee2ef8ec55 BUG: 1347529 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/14751 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/distribute: use a linked inode in directory heal codepathRaghavendra G2016-06-062-11/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is needed for following reasons: * healing is done in lookup and mkdir codepath where inode is not linked _yet_ as normally linking is done in interface layers (fuse-bridge, gfapi, nfsv3 etc). * healing consists of non-lookup fops like inodelk, setattr, setxattr etc. All non-lookup fops expect a linked inode. Backport of commit 06f92634d9ad8aa5c56d786e5248016c283e5c5b: > Change-Id: I1bd8157abbae58431b7f6f6fffee0abfe5225342 > BUG: 1334164 > Signed-off-by: Raghavendra G <rgowdapp@redhat.com> > Reviewed-on: http://review.gluster.org/14295 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > Smoke: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Susant Palai <spalai@redhat.com> > Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> Change-Id: I1bd8157abbae58431b7f6f6fffee0abfe5225342 BUG: 1336285 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/14350 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* dht : add metalock/unlockSusant Palai2016-06-051-5/+98
| | | | | | | | | | | | Change-Id: I842a7ea1b286f1b893b200fe647597e7fd0f2105 BUG: 1338501 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/14631 Tested-by: Kotresh HR <khiremat@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* dht: selfheal should wind mkdir call to subvols with ESTALE errorSakshi Bansal2016-05-261-1/+2
| | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/14496/ > Change-Id: I7140e50263b5f28b900829592c664fa1d79f3f99 > BUG: 1338634 > Signed-off-by: Sakshi Bansal <sabansal@redhat.com> Change-Id: I7140e50263b5f28b900829592c664fa1d79f3f99 BUG: 1338669 Signed-off-by: Sakshi Bansal <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/14500 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* features/shard: Get hard-link-count in {unlink,rename}_cbk before deleting ↵Krutika Dhananjay2016-05-241-8/+13
| | | | | | | | | | | | | | | shards Backport of http://review.gluster.org/#/c/14334/ Change-Id: Iff0e90bee22e20c309eaea6c6a19e4fa6e101ed7 BUG: 1337839 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/14451 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* cluster/tier: downgrade max-cycle-time log message to INFODan Lambright2016-05-211-1/+1
| | | | | | | | | | | | | | | | | | | | The "max cycle time" log message was incorrectly logged as an error. Downgrade it to INFO. This is a backport of 14361 > Change-Id: Ia7d074423019fa79443bc6ea694148b7b8da455d > BUG: 1335973 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: I91cbd0165fa6c72d3fa5e373bc12578ef0bde9da BUG: 1336472 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/14362 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com>
* dht: rename takes lock on parent directory if destination existsSakshi Bansal2016-05-201-7/+32
| | | | | | | | | | | | | | | | | | | | | | For directory rename if destination exists the source directory is created as a child of the given destination directory. Since the new child directory does not exist take lock on parent of the child directory. Backport of http://review.gluster.org/#/c/14371/ > Change-Id: I24a34605a2cd65984910643ff5462f35e8fc7e71 > BUG: 1336698 > Signed-off-by: Sakshi Bansal <sabansal@redhat.com> Change-Id: I24a34605a2cd65984910643ff5462f35e8fc7e71 BUG: 1337394 Signed-off-by: Sakshi Bansal <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/14417 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* tier/detach: Clear tier-fix-layout-complete xattr after migration threads joinJoseph Fernandes2016-05-181-33/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | Previously we had wrongly placed the clearing tier-fix-layout-complete xattr before the joining of migration threads. This would lead to situations where failure of clearing the xattr would cause the premature death of migration threads. Now we clear the xattr only after the data movement threads join, ensuring that all migration is done. This is a backport of 14285 > Change-Id: I829b671efa165ae13dbff7b00707434970b37a09 > BUG: 1334839 > Signed-off-by: Joseph Fernandes <josferna@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: I08cc92db1fb8d48026ca51743be6cafe385d1b79 BUG: 1336152 Reviewed-on: http://review.gluster.org/14342 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Joseph Fernandes Tested-by: Dan Lambright <dlambrig@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* tier/detach : During detach check if background fixlayout is doneJoseph Fernandes2016-05-061-1/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | During detach check if background fixlayout is done, if not done ignore the case and continue detach. Backport of http://review.gluster.org/14147 > Change-Id: I5d5cfc0e73d0eb217fdeab54c432dc4af8bc598d > BUG: 1332136 > Signed-off-by: Joseph Fernandes <josferna@redhat.com> > Reviewed-on: http://review.gluster.org/14147 > Smoke: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: N Balachandran <nbalacha@redhat.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Signed-off-by: Joseph Fernandes <josferna@redhat.com> Change-Id: I16f0320a2c07ac1d137bd4f27697722049df803c BUG: 1333803 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/14241 Smoke: Gluster Build System <jenkins@build.gluster.com> Tested-by: Joseph Fernandes NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* cluster/dht: Perform NULL check on xdata before dict_get()Krutika Dhananjay2016-05-061-1/+1
| | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/14212 .. to prevent unnecessary logs from gf_msg_callingfn() Change-Id: Ic2f21532f09af3ab7d36ce5f20c561fff5208fbb BUG: 1333244 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/14218 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* dht/rebalance: add lock migration fop to dht_migrate_fileSusant Palai2016-05-012-63/+112
| | | | | | | | | | | Change-Id: Id0e7400c8ae950c90d42a3ddf8b558a14959a1f8 BUG: 1326085 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/14074 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* cluster/dht: handle EREMOTE in dht lk/flushSusant Palai2016-05-012-7/+98
| | | | | | | | | | | | | | | | With lock-migration, we need to send requests to destination brick post migration. Once, the source brick marks the lock structure to be already migrated, the requests will be redirected to destination brick by dht_lk2/flush2. Change-Id: I50b14011c5ab68c34826fb7ba7f8c8d42a68ad97 BUG: 1326085 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/13493 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* glusterd: volume set changes for lock migrationSusant Palai2016-05-012-3/+27
| | | | | | | | | | | Change-Id: I48c6f9cdda47503615ba65882acd5eedf0a70c89 BUG: 1326085 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/14024 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* tier/migrator: Fetch the next query file for the next cycleJoseph Fernandes2016-04-302-0/+25
| | | | | | | | | | | | | | | | | | | | | | | Problem: When we spawn the promote and demote threads, query files are built, and the query file with index 0 is always picked as the first query file for migration. This is unsuitable for scenarios where the files in that query file are too big to move within the first cycle; as a result, files in the other query files always get missed. We need to shuffle so that the other query files also get a chance. Fix: Remember the previous first query file and shift it by one index before the migration starts. Change-Id: I704947bcf4bab6b20b1179a6d9ae4a15a3d51bd9 BUG: 1330353 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/14068 Tested-by: Joseph Fernandes Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
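The "shift by one index" step can be sketched as a simple rotation. The helper names are hypothetical, and the wrap-around back to index 0 is an assumption; the commit message only says the first query file is shifted by one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: given the index that was first in the previous
 * cycle, return the index to start from in the next cycle. */
static size_t next_first_query_file(size_t prev_first, size_t nfiles)
{
    assert(nfiles > 0);
    return (prev_first + 1) % nfiles;
}

/* Fill order[] with the visiting order for one cycle, starting at first
 * and wrapping around so that every query file is reached. */
static void query_file_order(size_t first, size_t nfiles, size_t *order)
{
    assert(nfiles > 0 && order != NULL);
    for (size_t i = 0; i < nfiles; i++)
        order[i] = (first + i) % nfiles;
}
```

With three query files, successive cycles would start at index 1, then 2, then 0 again, so no query file is permanently starved by a slow first one.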
* dht/afr/client/posix: Fail mkdir without gfid-reqPranith Kumar K2016-04-291-0/+8
| | | | | | | | | | | | | | | Do not allow directory creations without gfids as after the directories are created, operations on them fail anyway. So it is better to fail mkdir. BUG: 1317361 Change-Id: I8f8e3b38bbded1960b7215bac0432500f7e78038 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/13690 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* libglusterfs: Add debug and trace logs for stack traceRaghavendra Talur2016-04-271-1/+2
| | | | | | | | | | | | | | | | | It has become very difficult to identify the xlator which returned negative op_ret. Being able to just change the log level and visualize the stack is helpful in such cases. Change-Id: I6545b4802c1ab4d0d230d5e9e036afb2384882e1 BUG: 1330052 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: http://review.gluster.org/13448 CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* tier/dht: check for rebalance completion for EIO errorMohammed Rafi KC2016-04-251-1/+3
| | | | | | | | | | | | | | | | | When a rebalance completion check task has been triggered by dht, there is a possible race between afr marking a subvol as non-readable and dht updating the cached subvol. In this window a write can fail with EIO. Change-Id: I42638e6d4104c0dbe893d1bc73e1366188458c5d BUG: 1329503 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/14049 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* dht/rebalance: Handle GF_DEFRAG_STOPSusant Palai2016-04-251-0/+17
| | | | | | | | | | | | | | | | Problem: On a rebalance stop, the migrator threads don't signal the crawler thread to wake up in case it is waiting on a signal from a migrator thread. Change-Id: I3cc4be41a4db25f48fee059ebb79a97ee99dcd00 BUG: 1327507 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/14004 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* dht: Add lease() fopPoornima G2016-04-253-0/+47
| | | | | | | | | | | | | Change-Id: I0bbc2c2ef115c78393f6570815a5b80316e7e4be BUG: 1319992 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/11720 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* cluster/distribute: detect stale layouts in entry fopsRaghavendra G2016-04-225-27/+646
| dht_mkdir ()
| {
|     first-hashed-subvol = hashed-subvol for "bname" in in-memory layout of "parent";
|
|     inodelk (SETLKW, parent, "LAYOUT_HEAL_DOMAIN", "can be any subvol, but we choose first-hashed-subvol randomly");
|     {
| begin:
|         hashed-subvol = hashed-subvol for "bname" in in-memory layout of "parent";
|         hash-range = extract hash-range from layout of "parent";
|
|         ret = mkdir (parent/bname, hashed-subvol, hash-range);
|         if (ret == "hash-value doesn't fall into layout stored on the brick (this error is returned by posix-mkdir)")
|         {
|             refresh_parent_layout ();
|             goto begin;
|         }
|     }
|     inodelk (UNLCK, parent, "LAYOUT_HEAL_DOMAIN", "first-hashed-subvol");
|
|     proceed with other parts of dht_mkdir;
| }
|
| posix_mkdir (parent/bname, client-hash-range)
| {
|     disk-hash-range = getxattr (parent, "dht-layout-key");
|     if (disk-hash-range != client-hash-range) {
|         fail-with-error ("hash-value doesn't fall into layout stored on the brick");
|         return 0;
|     }
|
|     continue-with-posix-mkdir;
| }
|
| Similar changes need to be done for dentry operations like create, symlink, link, unlink, rmdir, rename. These will be addressed in subsequent patches. This patch addresses only the mkdir codepath.
|
| This change breaks stripe tests, as on some striped subvols dht layout xattrs are not set for some reason. This results in failure of mkdir. Since striped volumes are always created with dht, some tests associated with stripe also fail.
|
| So, I am making the following test changes (since stripe is out of maintenance):
| * modify ./tests/basic/rpc-coverage.t not to use striped volumes
| * mark all (2) tests in tests/bugs/stripe/ as bad tests
|
| Change-Id: Idd1ae879f24a48303dc743c1bb4d91f89a629e25
| BUG: 1323040
| Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
| Reviewed-on: http://review.gluster.org/13885
| Smoke: Gluster Build System <jenkins@build.gluster.com>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| Reviewed-by: N Balachandran <nbalacha@redhat.com>
* quota: setting 'read-only' option in xdata to instruct DHT to not healSakshi Bansal2016-04-191-2/+10
| | | | | | | | | | | | | | | | | | | | | | When quota is enabled the quota enforcer tries to get the size of the source directory by sending nameless lookup to quotad. But if the rename is successful even on one subvol or the source layout has anomalies then this nameless lookup in quotad tries to heal the directory which requires a lock on as many subvols as it can. But src is already locked as part of rename. For rename to proceed in brick it needs to complete a cluster-wide lookup. But cluster-wide lookup in quotad is blocked on locks held by rename, hence a deadlock. To avoid this quota sends an option in xdata which instructs DHT not to heal. Change-Id: I792f9322331def0b1f4e16e88deef55d0c9f17f0 BUG: 1252244 Signed-off-by: Sakshi Bansal <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/13988 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* dht: add "nuke" functionality for efficient server-side deletionJeff Darcy2016-04-071-0/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This turns a special xattr into an rmdir with flags set. When that hits the posix translator on the server side, that causes the file/directory to be moved into the special "landfill" directory. From there, the posix janitor thread will take care of deleting it entirely on the server side - traversing it recursively if necessary. A couple of secondary issues were fixed to make this effective. * FUSE now ensures that setxattr values are NUL terminated. * The janitor thread now gets woken up immediately when something is placed in 'landfill' instead of only when file descriptors need to be closed. * The default landfill-emptying interval was reduced to 10s. To use the feature, issue a setxattr something like this: setfattr -n glusterfs.dht.nuke -v "" /mnt/glusterfs/vol/some_dir The value doesn't actually matter; the mere receipt of a request with this key is sufficient. Some day it might be useful to allow setting a required value as a sort of password, so that only those who know it can access the underlying special functionality. Change-Id: I8a343c2cdb40a76d5a06c707191fb67babb8514f Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/13878 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>