Commit message | Author | Age | Files | Lines
* cluster/afr: Make sure latency-arg is passed to afr | Pranith Kumar K | 2018-04-18 | 4 | -3/+6
  xlator_notify() does not forward the extra arguments it receives, so the XLATOR_NOTIFY macro should be used instead to pass them on to the function.

  BUG: 1567881
  fixes bz#1567881
  Change-Id: Ic15b6c446638cbacf3149693147a754219037c47
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* libglusterfs: fix comparison of a NULL dict with a non-NULL dict | Xavi Hernandez | 2018-04-18 | 1 | -8/+8
  Function are_dicts_equal() had a bug when the first argument was NULL and the second one wasn't NULL. In this case it incorrectly returned that the dicts were different when they could be equal.

  Fixes: bz#1566732
  Change-Id: I0fc245c2e7d1395865a76405dbd05e5d34db3273
  Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
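  Illustrative sketch only, not the patched C code: a symmetric comparison, assuming a NULL dict counts as equal to an empty one. The `dicts_equal` name and the `ignore` callback are hypothetical.

```python
def dicts_equal(a, b, ignore=lambda key: False):
    """Compare two dicts, treating None like an empty dict (assumption)."""
    a = a or {}
    b = b or {}
    # Keys rejected by the hypothetical `ignore` callback do not take part.
    keys_a = {k for k in a if not ignore(k)}
    keys_b = {k for k in b if not ignore(k)}
    if keys_a != keys_b:
        return False
    return all(a[k] == b[k] for k in keys_a)
```

  The point of the sketch is that the result does not depend on which argument is None.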
* Add CLI option to print XLATORDIR | Prashanth Pai | 2018-04-18 | 2 | -0/+7
  glusterfs gets the path to the xlator dir from a compile-time flag named XLATORDIR, which is passed to GCC through a -D flag. This path is used to find and load xlator shared objects.

  The XLATORDIR path isn't easily accessible to glusterd2. Glusterd2 currently uses the following command (hack) to get the value of XLATORDIR:

    $ strings -d `which glusterfsd` | awk '/glusterfs/*/xlator$/'

  This change introduces a "print-xlatordir" CLI option to expose XLATORDIR. The option is intentionally not documented.

  Updates: bz#1193929
  Change-Id: Ic7247457600f11cd8d68eb3d0ad2526fdfda0b02
  Signed-off-by: Prashanth Pai <ppai@redhat.com>
* afr: fixes to afr-eager locking | Ravishankar N | 2018-04-18 | 2 | -0/+26
  1. If pre-op fails on all bricks, set lock->release to true in afr_handle_lock_acquire_failure so that the GF_ASSERT in afr_unlock() does not crash.
  2. Added a missing 'return' after handling pre-op failure in afr_transaction_perform_fop(), fixing a use-after-free issue.

  Change-Id: If0627a9124cb5d6405037cab3f17f8325eed2d83
  fixes: bz#1561129
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* Revert "storage/posix: add pgfid in readdirp if needed" | Nigel Babu | 2018-04-18 | 1 | -38/+8
  This reverts commit d206fab73f6815c927a84171ee9361c9b31557b1.

  Change-Id: I5b43fdcf916bc844437c9d60f6957bc40936e3c2
  Updates: bz#1560319
  Signed-off-by: Nigel Babu <nigelb@redhat.com>
* build: exclude '--with-previous-options' to prevent infinite loop | Xie Changlong | 2018-04-16 | 1 | -1/+1
  Steps to reproduce:
  1. cd glusterfs/; rm -rf *; git reset --hard #clean repo
  2. cd extras/LinuxRPM/; ./make_glusterrpms #it's ok here
  3. ./make_glusterrpms #infinite loop
  4. cd ../../; make distclean #infinite loop

  Change-Id: I162953d4576cedea7c6f6c631a77163a5cca023e
  updates: #439
  Signed-off-by: Xie Changlong <xiechanglong@cmss.chinamobile.com>
* maintainers: promote Deepshikha to maintainer | Nigel Babu | 2018-04-16 | 1 | -1/+1
  Deepshikha has been doing excellent work across the CI system. She is now ready to co-maintain the Continuous Integration module and be responsible for the CI ecosystem in its entirety.

  Fixes: bz#1567880
  Change-Id: If204301d26731f93b2dccfe8b6571ee748a47b26
  Signed-off-by: Nigel Babu <nigelb@redhat.com>
* fuse: retire statvfs tweak | Csaba Henk | 2018-04-16 | 1 | -13/+0
  The fuse xlator used to override the filesystem block size of the storage backend to indicate its preferences. Now we retire this tweak and pass on what we get from the backend. This fixes the anomaly reported in the referred BUG.

  For more background, see the following email, which was sent out to the gluster-devel and gluster-users mailing lists to gauge if anyone sees any use of this tweak:

  http://lists.gluster.org/pipermail/gluster-devel/2018-March/054660.html
  http://lists.gluster.org/pipermail/gluster-users/2018-March/033775.html

  No one vetoed the removal, and it received an endorsement:

  http://lists.gluster.org/pipermail/gluster-devel/2018-March/054686.html

  BUG: 1523219
  Change-Id: I3b7111d3037a1b91a288c1589f407b2c48d81bfa
  Signed-off-by: Csaba Henk <csaba@redhat.com>
* geo-rep: Fix syncing of symlink | Kotresh HR | 2018-04-13 | 1 | -1/+1
  Problem: If a symlink is created on master pointing to the current directory (e.g. symlink -> ".") with a non-root uid or gid, the geo-rep worker crashes with ENOTSUP.

  Cause: Geo-rep creates the symlink on the slave and fixes the uid and gid using the chown cmd. os.chown dereferences the symlink, which is pointing to ".gfid", which is not supported. Note that geo-rep operates on the aux-gfid-mount (e.g. "/mnt/.gfid/<gfid-of-symlink-file>").

  Solution: The uid or gid change is actually on the symlink file. So use os.lchown, i.e. don't dereference.

  BUG: 1567209
  Change-Id: I63575fc589d71f987bef1d350c030987738c78ad
  updates: bz#1567209
  Signed-off-by: Kotresh HR <khiremat@redhat.com>
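  A small demonstration of the difference between os.chown and os.lchown, assuming a POSIX system. It uses a dangling symlink in a temporary directory rather than the aux-gfid-mount from the commit, so the failure seen here is "no such file" rather than ENOTSUP; the helper name is hypothetical.

```python
import os
import tempfile


def fix_symlink_owner(path, uid, gid):
    # os.lchown changes the link itself; os.chown would follow it to the target.
    os.lchown(path, uid, gid)


workdir = tempfile.mkdtemp()
link = os.path.join(workdir, "dangling")
os.symlink("no-such-target", link)

# -1 leaves the uid/gid unchanged, so this runs without root privileges.
fix_symlink_owner(link, -1, -1)

try:
    os.chown(link, -1, -1)  # dereferences the link; the target does not exist
    chown_followed_ok = True
except OSError:
    chown_followed_ok = False
```

  os.lchown succeeds because it never looks at the target, which is the property the fix relies on.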
* extras: Disable choose-local in groups virt and gluster-block | Krutika Dhananjay | 2018-04-13 | 2 | -0/+2
  Change-Id: Icba68406d86623195d59d6ee668e0850c037c63a
  fixes: bz#1566386
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* rpc: set listen-backlog to high value | Milind Changire | 2018-04-13 | 1 | -1/+1
  Problem: On node reboot, when glusterd starts volumes rapidly, there's a flood of connections from the bricks to glusterd and from the self-heal daemons to the bricks. This causes SYN Flooding and dropped connections when the listen-backlog is not enough to hold the pending connections to compensate for the rate at which connections are accepted by the RPC layer.

  Solution: Increase the listen-backlog value to 1024. This is a partial solution. Part of the solution is to rearm the listener socket early for quicker accept() of connections. See commit 6964640a977cb10c0c95a94e03c229918fa6eca8 (change 19833)

  Change-Id: I62283d1f4990dd43839f9a6932cf8a36effd632c
  fixes: bz#1564600
  Signed-off-by: Milind Changire <mchangir@redhat.com>
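  For illustration, a generic socket sketch of what a listen backlog is, not the glusterfs RPC code. The helper name and the loopback address are assumptions; only the value 1024 comes from the commit message.

```python
import socket

LISTEN_BACKLOG = 1024  # value named in the commit message


def make_listener(host="127.0.0.1", port=0, backlog=LISTEN_BACKLOG):
    """Create a TCP listener whose pending-connection queue holds `backlog` entries."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    # The backlog bounds how many not-yet-accepted connections can wait.
    sock.listen(backlog)
    return sock


listener = make_listener()
bound_port = listener.getsockname()[1]
listener.close()
```

  On Linux the kernel silently caps the requested backlog at net.core.somaxconn, so a larger value only helps if that sysctl allows it.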
* cluster/dht: Handle file migrations when brick down | N Balachandran | 2018-04-13 | 1 | -5/+51
  The decision as to which node would migrate a file was based on the gfid of the file. Files were divided among the nodes for the replica/disperse set. However, if a brick was down when rebalance started, the nodeuuids would be saved as NULL and a set of files would not be migrated.

  Now, if the nodeuuid is NULL, the first non-null entry in the set is the node responsible for migrating the file.

  Change-Id: I72554c107792c7d534e0f25640654b6f8417d373
  fixes: bz#1564198
  Signed-off-by: N Balachandran <nbalacha@redhat.com>
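  A rough sketch of the fallback rule described above. The modulo mapping from gfid to slot is a stand-in assumption, not the hash DHT actually uses, and `pick_migrator` is a hypothetical name.

```python
import uuid


def pick_migrator(gfid, node_uuids):
    """Return the node responsible for migrating the file, or None if all are down."""
    if not node_uuids:
        return None
    # Hypothetical mapping of a gfid onto one slot of the replica/disperse set.
    index = gfid.int % len(node_uuids)
    chosen = node_uuids[index]
    if chosen is not None:
        return chosen
    # The slot belongs to a brick that was down: fall back to the first
    # non-null entry so the file is still migrated by some node.
    for candidate in node_uuids:
        if candidate is not None:
            return candidate
    return None
```

  Without the fallback loop, every gfid that maps to a None slot would be skipped, which is the behaviour the commit fixes.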
* core/build/various: python3 compat, prepare for python2 -> python3 | Kaleb S. KEITHLEY | 2018-04-12 | 59 | -102/+108
  Note 1) we're not supposed to be using #!/usr/bin/env python, see
  https://fedoraproject.org/wiki/Packaging:Guidelines?rd=Packaging/Guidelines#Shebang_lines

  Note 2) we're also not supposed to be using #!/usr/bin/python, see
  https://fedoraproject.org/wiki/Changes/Avoid_usr_bin_python_in_RPM_Build#Quick_Opt-Out

  The previous patch (https://review.gluster.org/19767) tried to do too much in one patch, so it was abandoned.

  This patch does two things:
  1) minor cleanup of configure(.ac) to explicitly use python2
  2) change all the shebang lines to #!/usr/bin/python2 and add them where they were missing, based on warnings emitted during rpmbuild.

  In a follow-up patch python2 will eventually be changed to python3. Before that, python2-isms (e.g. print, string.join(), etc.) need to be converted to python3. Some of those can be rewritten in version-agnostic python, e.g. print statements become print() with "from __future__ import print_function". The python 2to3 utility will be used for some of those. Also, Aravinda has given guidance in the comments to the first patch for changes.

  updates: #411
  Change-Id: I471730962b2526022115a1fc33629fb078b74338
  Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
* cluster/dht: Wind open to all subvols | N Balachandran | 2018-04-11 | 1 | -10/+5
  dht_opendir should wind the open to all subvols whether or not local->subvols is set. This is because dht_readdirp winds the calls to all subvols.

  Change-Id: I67a96b06dad14a08967c3721301e88555aa01017
  updates: bz#1564198
  Signed-off-by: N Balachandran <nbalacha@redhat.com>
* xlators/performance: Add pass-through option | Varsha Rao | 2018-04-11 | 9 | -10/+139
  Add a pass-through option to the performance translators. Set the option in GF_OPTION_INIT() and GF_OPTION_RECONF().

  Updates: #304
  Change-Id: If1537450147d154905831e36f7162a32866d7ad6
  Signed-off-by: Varsha Rao <varao@redhat.com>
* posix: reserve option behavior is not correct while using fallocate | Mohit Agrawal | 2018-04-11 | 2 | -0/+11
  Problem: The storage.reserve option does not work correctly when disk space is allocated through fallocate.

  Solution: posix_disk_space_check_thread_proc calls posix_disk_space_check every 5 seconds to monitor disk space and sets a flag in the posix priv. Within that 5-second window a user can create a big file with fallocate that reaches the posix reserve limit, and no error is shown on the terminal even though the limit has been reached. To resolve this, call posix_disk_space for every fallocate fop instead of relying only on the thread that runs every 5 seconds.

  BUG: 1560411
  Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
  Change-Id: I39ba9390e2e6d084eedbf3bcf45cd6d708591577
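  To make the timing issue concrete, a minimal model of checking the reserve synchronously on each fallocate instead of trusting a flag refreshed by a periodic thread. `ReserveGuard`, its fields and the `statfs` callback are hypothetical, and the percentage comparison is an assumption about how the reserve is evaluated.

```python
import errno
import os


class ReserveGuard(object):
    def __init__(self, reserve_percent, statfs):
        self.reserve_percent = reserve_percent
        self.statfs = statfs      # callable returning (total_bytes, free_bytes)
        self.disk_full = False    # flag a periodic thread would normally refresh

    def refresh(self):
        total, free = self.statfs()
        self.disk_full = free * 100.0 / total < self.reserve_percent
        return self.disk_full

    def fallocate(self, size, allocate):
        # Re-check now rather than relying on a flag that may be seconds old.
        if self.refresh():
            raise OSError(errno.ENOSPC, os.strerror(errno.ENOSPC))
        allocate(size)
```

  In this model a second large allocation issued right after the first one is rejected immediately, rather than slipping through until the next periodic refresh.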
* storage/posix: add pgfid in readdirp if needed | Kinglong Mee | 2018-04-10 | 1 | -8/+38
  Change-Id: I6745428fd9d4e402bf2cad52cee8ab46b7fd822f
  fixes: bz#1560319
  Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
* posix: check file state before continuing with fops | Susant Palai | 2018-04-10 | 5 | -16/+756
  In the context of Cloudsync: a data modification fop, e.g. a write, that lands in POSIX on the assumption that the file is local while the file is actually remote can be dangerous. Of course we don't want to take an inodelk for every read/write operation to check the archival status or to coordinate with an upload or a download of a file. To avoid the inodelk, we check the status of the file in POSIX itself before we resume the fop. This helps us avoid the races mentioned above.

  Now, e.g., if a write reaches POSIX for a file which is actually remote, POSIX can check the status of the file and learn that the file is remote. It can error out with the status "remote", and the cloudsync xlator will retry the same operation once it has finished downloading the file.

  This patch includes the setxattr changes to do the post-processing of an upload, i.e. truncate and setting the remote xattr "trusted.glusterfs.cs.remote" to indicate the file is REMOTE.

  Each file will have no xattr if the file is LOCAL, one remote xattr if the file is REMOTE, and a combination of REMOTE and DOWNLOADING xattrs if the file is being downloaded. There is healing logic for these xattrs to recover from crash inconsistencies.

  Fixes: #387
  Change-Id: Ie93c2d41aa8d6a798a39bdbef9d1669f057e5fdb
  Signed-off-by: Susant Palai <spalai@redhat.com>
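  The xattr-to-state mapping above could be modelled roughly as follows. Only "trusted.glusterfs.cs.remote" is named in the commit message; the downloading xattr name and the "INCONSISTENT" label are hypothetical placeholders.

```python
REMOTE_XATTR = "trusted.glusterfs.cs.remote"            # named in the commit message
DOWNLOADING_XATTR = "trusted.glusterfs.cs.downloading"  # hypothetical name


def file_state(xattrs):
    """Classify a file from the set (or dict) of xattr names present on it."""
    has_remote = REMOTE_XATTR in xattrs
    has_downloading = DOWNLOADING_XATTR in xattrs
    if has_remote and has_downloading:
        return "DOWNLOADING"
    if has_remote:
        return "REMOTE"
    if not has_downloading:
        return "LOCAL"
    # Downloading marker without the remote marker: presumably a crash
    # leftover that the healing logic would need to repair (assumption).
    return "INCONSISTENT"
```

  A fop would be allowed to proceed only for "LOCAL"; the other states would be reported back so that the cloudsync xlator can download and retry.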
* cluster/dht: act as passthrough for renames on single child DHT | Raghavendra G | 2018-04-10 | 1 | -7/+15
  The synchronization present in dht_rename while handling directories and files is necessary only if we have more than one child.

  Change-Id: Ie21ad419125504ca2f391b1ae2e5c1d166fee247
  fixes: bz#1563511
  Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
* experimental/cloudsync: Download xlator for archival feature | Susant Palai | 2018-04-10 | 21 | -4/+2468
  spec-files: https://review.gluster.org/#/c/18854/

  Overview:
  * Cloudsync maintains three file states in its inode-ctx, i.e. 1 - LOCAL, 2 - REMOTE, 3 - DOWNLOADING.
  * A data modifying fop is allowed only if the state is LOCAL. If the state is REMOTE or DOWNLOADING, the client will download the file or wait for a download initiated by another client to finish.
  * Multiple downloads and uploads from different clients are synchronized by inodelk.
  * In POSIX a state check is done (part of a different commit) before allowing the fop to continue. If the state is remote/downloading, the fop is unwound with EREMOTE. The client will then download the file and continue with the fop again.
  * Basic algo for a fop (let's say a write fop):
    - If LOCAL -> resume fop
    - If REMOTE ->
      - INODELK
      - STAT (this gets the state and heals the state if needed)
      - DOWNLOAD
      - resume fop

  Note:
  * Developers will need to write plugins for download, based on the remote store they choose. In phase-1, support will be added for one remote store per volume. In future, more options for multiple remote stores will be explored.

  TODOs:
  - Implement stat/lookup/readdirp to return size info from xattr
  - Make plugins configurable
  - Implement unlink fop
  - Add metrics collection
  - Add sharding support

  Design Contributions:
  Aravinda V K <avishwan@redhat.com>
  Amar Tumballi <amarts@redhat.com>
  Ram Ankireddypalle <areddy@commvault.com>
  Susant Palai <spalai@redhat.com>

  updates: #387
  Change-Id: Iddf711ee7ab4e946ae3e472ff62791a7b85e6d4b
  Signed-off-by: Susant Palai <spalai@redhat.com>
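  The "basic algo" for a data-modifying fop can be sketched as below. This is a single-process toy, not the xlator: a threading.Lock stands in for INODELK, and the `stat`, `download` and `fop` callbacks are hypothetical.

```python
import threading

LOCAL, REMOTE, DOWNLOADING = 1, 2, 3  # state numbering from the commit message


class Inode(object):
    def __init__(self, state):
        self.state = state
        self.lock = threading.Lock()  # stand-in for INODELK (assumption)


def run_data_fop(inode, fop, stat, download):
    """Resume `fop` directly when LOCAL; otherwise lock, stat, download, resume."""
    if inode.state == LOCAL:
        return fop()
    with inode.lock:
        # STAT refreshes the state; another client may have finished the
        # download while this one was waiting for the lock.
        inode.state = stat()
        if inode.state != LOCAL:
            download()
            inode.state = LOCAL
    return fop()
```

  Because the state is re-read under the lock, a client that waited for another client's download does not download the file a second time.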
* quota: allow writes when with EINVAL on pgfid isnot exist | Kinglong Mee | 2018-04-09 | 1 | -0/+21
  The NFS client gets "Invalid argument" when writing a file through nfs-ganesha.

  1. With quota disabled, the NFS client mounts the nfs-ganesha share and runs 'll' in the testing directory.
  2. Enable quota;
     getfattr: Removing leading '/' from absolute path names
     trusted.gfid=0xe2edaac0eca8420ebbbcba7e56bbd240
     trusted.gfid2path.b3250af8fa558e66=0x39663134343566662d653530332d343831352d396635312d3236633565366332633137642f7465737466696c653932
     trusted.glusterfs.quota.9f1445ff-e503-4815-9f51-26c5e6c2c17d.contri.3=0x00000000000002000000000000000001
     Notice: testfile92 has no trusted.pgfid xattr.
  3. Restart the glusterfs volume with "gluster volume stop/start gvtest"
  4. echo somedata > testfile92
  5. ll testfile92
     -rw-r--r-- 1 root root 0 Mar 6 21:43 testfile92

  BUG: 1560319
  Change-Id: Iaa4dd1e891c99069fb85b7b11bb0482cbf2303b1
  fixes: bz#1560319
  Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
* rpc: rearm listener socket early | Milind Changire | 2018-04-07 | 1 | -3/+3
  Problem: On node reboot, when glusterd starts volumes, a setup with a large number of bricks might cause SYN Flooding and connections to be dropped if the connections are not accepted quickly enough.

  Solution: accept() the connection and rearm the listener socket early to receive more connection requests as soon as possible.

  Change-Id: Ibed421e50284c3f7a8fcdb4de7ac86cf53d4b74e
  fixes: bz#1564600
  Signed-off-by: Milind Changire <mchangir@redhat.com>
* features/index: Choose different base file on EMLINK error | Pranith Kumar K | 2018-04-06 | 2 | -18/+61
  Change-Id: I4648816af908539efdc2528608aa2ebf7f0d0e2f
  fixes: bz#1559004
  BUG: 1559004
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* doc: Update the admin guide link | Varsha Rao | 2018-04-06 | 1 | -1/+1
  Update the existing admin guide link as it is incorrect.

  Change-Id: I05669192623aeac287dfa9002caa0f390ea79499
  Updates: bz#1193929
  Signed-off-by: Varsha Rao <varao@redhat.com>
* cluster/ec: Turn ON the stripe-cache option by default | Ashish Pandey | 2018-04-06 | 1 | -1/+1
  Change-Id: I0a290396c30c635b13ee73004d20259efb76a954
  fixes: bz#1563945
  Signed-off-by: Ashish Pandey <aspandey@redhat.com>
* gfapi: fix a couple of minor issues | Kaleb S. KEITHLEY | 2018-04-05 | 3 | -6/+2
  Duplication of exported functions in gfapi.map. Only the newest one is needed. Both the legacy and current symbols are exported.

  The glfs_io_cbk34 typedef should not be in a public header file. The old application was compiled with the original glfs_io_cbk. Outside of libgfapi, nothing now uses or needs this old typedef, so move it into the C file that needs it.

  Similarly, the glfs_realpath34() decl should not be in glfs.h. Period. Old applications were compiled with the then glfs_realpath() decl and linked with glfs_realpath@@GFAPI_3_4.0. New applications should only call glfs_realpath() and will be linked to the new/current glfs_realpath().

  Change-Id: Icd5b0c9e9b68f0c133f14447b09ace35f33dbab2
  fixes: bz#1564235
  Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
* glusterd: show brick online after port registration | Atin Mukherjee | 2018-04-05 | 1 | -2/+3
  The gluster-block project needs a dependency check to see if all the bricks are online before bringing up the relevant gluster-block services. The patch https://review.gluster.org/#/c/19785/ attempts to write the script, but a brick should be marked as online only when pmap_signin has completed.

  This is perfectly fine without brick multiplexing, but with brick multiplexing this patch still doesn't eliminate the race completely, as the attach_req call is asynchronous and glusterd immediately marks the port as registered.

  Change-Id: I81db54b88f7315e1b24e0234beebe00de6429f9d
  Fixes: bz#1563273
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* afr: add quorum checks in pre-op | Ravishankar N | 2018-04-05 | 1 | -33/+31
  Problem: We seem to be winding the FOP if pre-op did not succeed on quorum bricks and then failing the FOP with EROFS since the fop did not meet quorum. This essentially masks the actual error due to which pre-op failed. (See BZ).

  Fix: Skip FOP phase if pre-op quorum is not met and go to post-op.

  Fixes: 1561129
  Change-Id: Ie58a41e8fa1ad79aa06093706e96db8eef61b6d9
  fixes: bz#1561129
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* glusterd: mark port_registered to true for all running bricks with brick mux | Atin Mukherjee | 2018-04-05 | 3 | -2/+16
  glusterd maintains a boolean flag 'port_registered' which is used to determine if a brick has completed its portmap sign-in process. This flag is (re)set in pmap_signin and pmap_signout events. In case of brick multiplexing this flag is the identifier to determine if the very first brick with which the process was spawned has completed its sign-in process. However, on a glusterd restart, when a brick is already identified as running, glusterd does a pmap_registry_bind to ensure its portmap table is updated, but this flag isn't updated. That is fine in the non brick multiplex case, but it causes an issue if the very first brick which came as part of the process is replaced: the subsequent brick attach will then fail. One way to validate this is to create and start a volume, remove the first brick and then add-brick a new one. The add-brick operation will take a very long time, and after that the volume status will show every brick other than the new brick as down.

  Solution is to set brickinfo->port_registered to true for all the running bricks when brick multiplexing is enabled.

  Change-Id: Ib0662d99d0fa66b1538947fd96b43f1cbc04e4ff
  Fixes: bz#1560957
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* features/changelog: Update option levels | Aravinda VK | 2018-04-05 | 1 | -0/+7
  Option levels for the Changelog xlator.

  Change-Id: Idd246717e38096c44258a990a0939f82e5fc9654
  Updates: #430
  Signed-off-by: Aravinda VK <avishwan@redhat.com>
* cluster/dht: enable lookup-optimize by default | N Balachandran | 2018-04-04 | 3 | -3/+5
  Lookup-optimize has been shown to improve create performance. The code has been in the project for several years and is considered stable.

  Enabling this by default in order to test this in the upstream regression runs.

  Change-Id: Iab792979ee34f0af4713931e0b5b399c23f65313
  updates: bz#1557435
  BUG: 1557435
  Signed-off-by: N Balachandran <nbalacha@redhat.com>
* glusterd: fix txn_opinfo memory leak | Atin Mukherjee | 2018-04-04 | 3 | -9/+25
  For transactions where there's no volname involved (e.g. gluster v status), the originator node initiates with the staging phase, which means that in op-sm no unlock event is triggered, resulting in a txn_opinfo dictionary leak.

  Credits : cynthia.zhou@nokia-sbell.com

  Change-Id: I92fffbc2e8e1b010f489060f461be78aa2b86615
  Fixes: bz#1550339
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: honour localtime-logging for all the daemons | Atin Mukherjee | 2018-04-03 | 5 | -0/+30
  Change-Id: I97a70d29365b0a454241ac5f5cae56d93eefd73a
  Fixes: bz#1563334
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* cluster/afr: Prevent ping-event handling on shd | Pranith Kumar K | 2018-04-03 | 1 | -0/+2
  On shd, we shouldn't treat any brick down based on latency, otherwise self-heal will never happen

  fixes: bz#1562717
  Change-Id: Ica07fcc4fae91a6bfd9c9a670e2be464704d94b7
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* glusterd: setting mgmt_v3_timer->timer to NULL after deleting mgmt_v3_timer | Sanju Rakonde | 2018-04-02 | 1 | -1/+0
  We are setting mgmt_v3_timer->timer to NULL after mgmt_v3_timer is deleted, which is unnecessary, so remove the statement.

  This issue was caught while running glusterd with ASAN.

  Change-Id: Ied1f91590a2c64ec1af36d4de9c3febd6cf94bb9
  Fixes: bz#1562907
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* mount/fuse: Set default fuse reader thread count to 1 | Krutika Dhananjay | 2018-04-02 | 1 | -1/+1
  Updates #412
  Change-Id: Ida53d8b630feabb856a3551fa888f92382ade768
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* ocf: use glusterd-workdir for finding volume status files | Niels de Vos | 2018-04-02 | 1 | -1/+1
  The volume status files are located in the glusterd-workdir, not under /etc (sysconfdir).

  BUG: 1234873
  Change-Id: Id7f7c83261bb4b5ac2fc104dcd6cb198d6a930aa
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
* build: revert configure --without-ipv6-default behaviour | Kaleb S. KEITHLEY | 2018-04-02 | 1 | -14/+31
  Patch https://review.gluster.org/19692 breaks gluster on systems that have IPv6 enabled but lack IPv6 reverse DNS. It also defaulted to enabling ipv6-default regardless of whether --with-ipv6-default or --without-ipv6-default was specified in the options to configure. (Also, the patch was merged without review.)

  Prefer libtirpc over glibc rpc:
  - On newer linux with tirpc and without glibc rpc, use tirpc (obviously).
  - On less new linux with both tirpc and glibc rpc, default to tirpc, unless --without-tirpc is specified, in which case use glibc rpc.
  - On less new linux without tirpc, fall back to glibc rpc (obviously).

  ipv6-default requires libtirpc. It is off by default. It must be explicitly enabled with --with-ipv6-default. If --with-ipv6-default is specified but tirpc is not available, disable it and issue a warning.

  Change-Id: Ib96a230fafb83ec83a71948fe55af1215a7a6ffa
  BUG: 1562052
  Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
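  The selection rules above, restated as a small decision function. This is a reading aid, not the configure.ac logic: the function and parameter names are hypothetical, and two corner cases the message does not spell out (--without-tirpc when only tirpc exists, and --with-ipv6-default combined with --without-tirpc) are resolved by assumption.

```python
def choose_rpc(have_tirpc, have_glibc_rpc, without_tirpc=False,
               with_ipv6_default=False):
    """Return (rpc_library, ipv6_default_enabled, warnings)."""
    warnings = []
    if have_tirpc and not (without_tirpc and have_glibc_rpc):
        # Assumption: --without-tirpc is ignored when glibc rpc is absent.
        rpc = "tirpc"
    elif have_glibc_rpc:
        rpc = "glibc"
    else:
        rpc = None

    # ipv6-default is off unless explicitly requested, and it needs tirpc.
    ipv6_default = with_ipv6_default
    if ipv6_default and rpc != "tirpc":
        ipv6_default = False
        warnings.append("ipv6-default requires libtirpc; disabling it")
    return rpc, ipv6_default, warnings
```

  Note that ipv6-default never turns itself on here, which is the behaviour the revert restores.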
* cluster/dht: Update dht option levels | N Balachandran | 2018-04-02 | 1 | -2/+16
  Set the levels for DHT options based on https://review.gluster.org/#/c/19466/

  Change-Id: I51b31a706a0b9517404e83224c89de145fd5d7e1
  updates: #430
  Signed-off-by: N Balachandran <nbalacha@redhat.com>
* mount/fuse: Add support for multi-threaded fuse readers | Krutika Dhananjay | 2018-04-02 | 8 | -83/+196
  Usage: Use 'reader-thread-count=<NUM>' as a command line option to set the thread count at the time of mounting the volume.

  Next task is to make these threads auto-scale based on the load, instead of having the user remount the volume every time to change the thread count.

  Updates #412
  Change-Id: I94aa1505e5ae6a133683d473e0e4e0edd139b76b
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* cluster/dht: Update layout in inode only on success | N Balachandran | 2018-04-02 | 2 | -4/+24
  With lookup-optimize enabled, gf_defrag_settle_hash in rebalance sometimes flips the on-disk layout on volume root post the migration of all files in the directory.

  This is sometimes seen when attempting to fix the layout of a directory multiple times before calling gf_defrag_settle_hash. dht_fix_layout_of_directory generates a new layout in memory but updates it in the inode ctx before it is set on disk. The layout may be different the second time around due to dht_selfheal_layout_maximize_overlap. If the layout is then not written to the disk, the inode now contains the wrong layout. gf_defrag_settle_hash does not check the correctness of the layout in the inode before updating the commit-hash and writing it to the disk thus changing the layout of the directory.

  Change-Id: Ie1407d92982518f2a0c40ec70ad370b34a87b4d4
  updates: bz#1557435
  Signed-off-by: N Balachandran <nbalacha@redhat.com>
* Revert "glusterd: handling brick termination in brick-mux" | Sanju Rakonde | 2018-03-29 | 7 | -147/+39
  This reverts commit a60fc2ddc03134fb23c5ed5c0bcb195e1649416b.

  This commit was causing multiple tests to time out when brick multiplexing is enabled. Further debugging found that even though the volume stop transaction is converted into mgmt_v3 to allow the remote nodes to follow the synctask framework to process the command, there are other callers of glusterd_brick_stop () which are not synctask based.

  Change-Id: I7aee687abc6bfeaa70c7447031f55ed4ccd64693
  updates: bz#1545048
* afr: add new value for read-hash-mode volume option | Ravishankar N | 2018-03-29 | 7 | -32/+175
  Updates: #363

  This new value (3) will try to wind read requests to the child of AFR having the least amount of pending requests in its queue.

  Change-Id: If6bda2aac9bf7aec3fc39622f78659313c4b6508
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
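  A least-pending selection of this kind can be sketched as below. The function name, the `readable` mask and the tie-breaking rule (lowest index wins) are assumptions for illustration, not taken from the patch.

```python
def pick_read_child(pending, readable):
    """Return the index of the readable child with the fewest pending requests.

    pending  -- list of queued request counts, one per child
    readable -- list of booleans marking children that may serve reads
    Returns -1 when no child is readable.
    """
    candidates = [i for i in range(len(pending)) if readable[i]]
    if not candidates:
        return -1
    # min() keeps the first of several equal counts, so ties go to the
    # lowest index (an arbitrary choice for this sketch).
    return min(candidates, key=lambda i: pending[i])
```

  Compared with hashing on the gfid, this spreads reads according to current load rather than by file identity.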
* cluster/ec: send list-node-uuids request to all subvolumes | Xavi Hernandez | 2018-03-28 | 2 | -1/+2
  The xattr trusted.glusterfs.list-node-uuids was only sent to a single subvolume. This was returning null uuids from the other subvolumes as if they were down.

  This fix forces that xattr to be requested from all subvolumes.

  Change-Id: If62eb39a6857258923ba625e153d4ad79018ea2f
  fixes: bz#1561406
  Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* Fix gluster(8) formatting | Michael Scherer | 2018-03-28 | 1 | -1/+0
  Looking at the man page shows that "Snapshot command" wasn't aligned with the other section titles.

  Change-Id: I24bdb2e3728e03862fee57710cfe34b0607fe09a
  BUG: 1507230
  Signed-off-by: Michael Scherer <misc@redhat.com>
* glusterd: changing the op-version of volume stop mgmt v3 | Kaleb S. KEITHLEY | 2018-03-28 | 1 | -3/+3
  log message describe the actual test

  Change-Id: I1ea7300a6b186032a65236492d6d2a6eef0ab983
  fixes: bz#1560441
  Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
* rpc: update tirpc registration to "force" unregister old mapping before re-registering | Shreyas Siravara | 2018-03-28 | 2 | -0/+8
  > Reviewed-on: https://review.gluster.org/16849
  > Reviewed-by: Shreyas Siravara <sshreyas@fb.com>

  Change-Id: I05ed6b7c715a71e5819fbe8116e7c3146010f836
  BUG: 1521030
  Signed-off-by: Kevin Vigor <kvigor@fb.com>
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
* rpc: simplify parameters when a saved frame is forced to unwind | Zhang Huan | 2018-03-28 | 1 | -4/+2
  When a saved frame is to be forcibly unwound, there is no need to pass an empty iovector that points to no data.

  Change-Id: I6e858fb38644326e22239b83272b15db656035e5
  BUG: 1523122
  Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
* rpc: fix incorrect return value when xdr decode fails | Zhang Huan | 2018-03-28 | 1 | -1/+0
  xdr_replymsg is called to decode the reply message, and it returns failure if the message is corrupted. However, the return value retrieved from the global errno is 0 even when xdr_replymsg fails.

  Fix this issue by simply returning a negative value if the call to xdr_replymsg fails.

  Change-Id: I2b9a1dc97652fbb6cf6568ea617f120713784a55
  BUG: 1523122
  Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
* glusterd: handling brick termination in brick-mux | Sanju Rakonde | 2018-03-28 | 7 | -39/+147
  Problem: There's a race between the last glusterfs_handle_terminate() response sent to glusterd and the kill that happens immediately if the terminated brick is the last brick.

  Solution: When it is the last brick for the brick process, instead of glusterfsd killing itself, glusterd will kill the process in case of brick multiplexing. The gf_attach utility is changed accordingly.

  Change-Id: I386c19ca592536daa71294a13d9fc89a26d7e8c0
  fixes: bz#1545048
  BUG: 1545048
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>