path: root/xlators
Commit message [Author, Age, Files, Lines]
* gNFSd: Auto re-register NFS/Mount programs with rpcbind periodically [Shreyas Siravara, 2017-08-30, 2 files, -2/+77]
| Summary: Every once in a while rpcbind crashes and the NFS endpoints go bye-bye. This diff makes it such that we should almost never encounter the case where we have NFS up and rpcbind down causing bad endpoints and hanging mounts for our customers.
| Test Plan: Added prove tests + tested on dev server
| Reviewers: dph, moox, rwareing
| Reviewed By: rwareing
| Differential Revision: https://phabricator.fb.com/D2571724
| Tasks: 8803558
| Change-Id: I35acb2d731185a7b20020cb57bdd4d879e978df4
| Signature: t1:2571724:1445555327:3276a4dcc4da71346b09d4aeb46c69dddcc7c5ba
| Reviewed-on: https://review.gluster.org/17961
| Smoke: Gluster Build System <jenkins@build.gluster.org>
| Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* Make a DHT subvolume go read-only when a subvolume crashes [Shreyas Siravara, 2017-08-30, 4 files, -6/+13]
| Summary: When subvolumes crash, users get messages like "No such file or directory" or "I/O Error" when doing operations that are cluster-wide, i.e., operations that touch the subvolume that has crashed. These include operations like mkdir() and rmdir() which are cluster-wide, as well as reads/writes/creates that hash to the dead subvolume.
| DHT does the right thing by disallowing operations to the subvolume -- it is effectively putting the subvolume in "read-only" mode to protect data, but it does not return the correct error. As a result, users of the filesystem think that the data is gone (in the case of "No such file or directory", or worse a blanket error that means nothing in the case of EIO).
| DHT sets the errno to ENOENT, which makes sense in the context of DHT (no subvolume entry, hence ENOENT), but the error it should bubble up to the user is EROFS, since it is putting the system in read-only mode.
| This diff changes the returned error to EROFS so users get a clearer message of what is going on.
| Test Plan: Tested by downing a subvolume and checking error codes. Also ran other prove tests to make sure they pass.
| Change-Id: I20ad6fe31dbd66536db2a69246771ffad0140db3
| Reviewers: rwareing, dph, moox
| Reviewed-on: https://review.gluster.org/17952
| Smoke: Gluster Build System <jenkins@build.gluster.org>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
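The errno remapping described in this commit can be illustrated with a small sketch. It is written in Python for brevity and is not the DHT code itself; `errno_for_user` and the `subvol_down` flag are hypothetical names introduced only for this example.

```python
import errno


def errno_for_user(op_errno: int, subvol_down: bool) -> int:
    """Map an internal errno to the one bubbled up to the caller.

    Assumption (hypothetical flag): `subvol_down` is True when the
    operation was refused because the hashed subvolume is unavailable.
    """
    if subvol_down and op_errno == errno.ENOENT:
        # The subvolume is effectively read-only, so say so.
        return errno.EROFS
    # Any other error is passed through unchanged.
    return op_errno
```

With this mapping a caller sees "Read-only file system" instead of "No such file or directory" when the only problem is a dead subvolume.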
* cluster/afr: SHD should not use did_discovery code paths [Richard Wareing, 2017-08-29, 3 files, -1/+7]
| Summary:
| - Exempt the SHD from the discover code path
| Test Plan:
| - prove -v tests/bugs/fb8149516.t
| - Make rc and canary on offending host (gfsdataswarm048.prn2)
| Reviewers: moox, dph, sshreyas
| Reviewed By: sshreyas
| Differential Revision: https://phabricator.fb.com/D2491694
| Change-Id: I5ec3997cf26375e834c3c7c4ea6c174eef957b8b
| Signed-off-by: Jeff Darcy <jdarcy@fb.com>
| Reviewed-on: https://review.gluster.org/18141
| Tested-by: Jeff Darcy <jeff@pl.atyp.us>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| Smoke: Gluster Build System <jenkins@build.gluster.org>
| Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
* Add negative caching to nfs auth cache [Shreyas Siravara, 2017-08-29, 5 files, -351/+289]
| Summary: This diff adds the ability to the nfs daemon to cache hosts it has deauthorized for mounts, not just the hosts it has authorized. A host that has been denied stays deauthorized for ttl seconds, or until the nfs daemon is restarted.
| Test Plan: Use the prove tests to maintain the integrity of the auth code. Test manually to see if the correct code path is being hit.
| Reviewers: dph, rwareing
| Reviewed By: rwareing
| Differential Revision: https://phabricator.fb.com/D1947728
| Change-Id: I9728e15913e0900ab34311b13b30eba0b91ce33f
| Reviewed-on: https://review.gluster.org/18134
| Smoke: Gluster Build System <jenkins@build.gluster.org>
| Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
| Tested-by: Jeff Darcy <jeff@pl.atyp.us>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
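A minimal sketch of negative caching with a TTL, assuming a simple host-keyed table. It is written in Python for illustration and does not reflect the actual nfs auth cache structures; `AuthCache`, `store`, `lookup` and the injectable `clock` are hypothetical names.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class AuthCache:
    """Caches both positive and negative mount-auth decisions per host."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl
        self.clock = clock  # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[bool, float]] = {}

    def store(self, host: str, authorized: bool) -> None:
        # Denials are stored exactly like approvals.
        self._entries[host] = (authorized, self.clock())

    def lookup(self, host: str) -> Optional[bool]:
        """Return the cached decision, or None if unknown or expired."""
        entry = self._entries.get(host)
        if entry is None:
            return None
        authorized, stored_at = entry
        if self.clock() - stored_at >= self.ttl:
            del self._entries[host]  # expired: force a fresh auth check
            return None
        return authorized
```

Because the table lives in process memory, restarting the daemon clears it, which matches the "or until the nfs daemon is restarted" behaviour described above.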
* features/locks: Fix crash bug in connection (lock) clean-up flow [Richard Wareing, 2017-08-28, 2 files, -12/+3]
| Summary:
| - Fixes a bug where bricks can crash when the "clear locks" command is run (by CLI or by revocation code) and sockets are later cleaned up. The crash is a use-after-free: references to the lock are left in the client list, and when this list is later traversed the pointers point to garbage.
| Test Plan:
| - Ran with monkey-unlock and tested connection clean-ups after lock revocation
| Reviewers: sshreyas, dph, moox
| Reviewed By: moox
| Differential Revision: https://phabricator.fb.com/D2695087
| Tasks: 6207062
| Change-Id: Iea26efe4bfbadc26431a3c50a0a8bda218bb5219
| Signed-off-by: Jeff Darcy <jdarcy@fb.com>
| Reviewed-on: https://review.gluster.org/18122
| Smoke: Gluster Build System <jenkins@build.gluster.org>
| Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
| Tested-by: Jeff Darcy <jeff@pl.atyp.us>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/afr: Remove "compatability" code from SHD flow [Richard Wareing, 2017-08-28, 1 file, -2/+0]
| Summary:
| - After examining some lock revocations, I recommend we remove this (problematic?) code which causes a lock to be held outside of the SHD locking domain.
| - The theory is that this lock should never conflict with anything outside of the SHD flow since the offsets are so huge. However in practice with lock revocation the lock lives a very long time, and I worry what the other implications of this might be.
| Test Plan:
| - Run tests/basic/afr/* (w/ D2706710)
| Reviewers: dph, moox, sshreyas
| Reviewed By: sshreyas
| Differential Revision: https://phabricator.fb.com/D2706717
| Change-Id: I1f358f66810d104f28def9d1ac2a4fde3d073c92
| Signed-off-by: Jeff Darcy <jdarcy@fb.com>
| Reviewed-on: https://review.gluster.org/18123
| Smoke: Gluster Build System <jenkins@build.gluster.org>
| Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
| Tested-by: Jeff Darcy <jeff@pl.atyp.us>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/afr: Delete "special domain" AFR heal flow [Richard Wareing, 2017-08-28, 1 file, -48/+15]
| Summary:
| - Reverts code from 50998ae08c5a767468ee85cb5c53bb5554ff734a, this was originally intended for backward compat w/3.5. This isn't relevant for us, and everywhere we see stuck heals we tend to see this "aaaaaaaaa" guy show up, so let's nuke it.
| Test Plan:
| - Run heal prove tests
| Reviewers: moox, dph, sshreyas
| Reviewed By: sshreyas
| Differential Revision: https://phabricator.fb.com/D2689678
| Tasks: 6207062
| Change-Id: Ie9db3eb6c6d44f6137ebcf964e06965047763ed9
| Signed-off-by: Jeff Darcy <jdarcy@fb.com>
| Reviewed-on: https://review.gluster.org/18121
| Tested-by: Jeff Darcy <jeff@pl.atyp.us>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| Smoke: Gluster Build System <jenkins@build.gluster.org>
| Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* cluster/dht: Fix rebalance bug + better logging [Richard Wareing, 2017-08-15, 1 file, -4/+18]
| Summary:
| - Fixes edge case where lookup by gfid fails because it's not copied into the inode struct from the loc_t struct during the readdir loop
| - Improved logging for error conditions
| Test Plan:
| - Tested on dev server
| - Canaried build on <redacted>
| Reviewers: dph, moox, sshreyas
| Reviewed By: sshreyas
| Differential Revision: https://phabricator.fb.com/D2676693
| Tasks: 9034954
| Change-Id: I7f0160b391c43fc38e679fdb660cee59d2267932
| Signed-off-by: Jeff Darcy <jdarcy@fb.com>
| Reviewed-on: https://review.gluster.org/18040
| Smoke: Gluster Build System <jenkins@build.gluster.org>
| Tested-by: Jeff Darcy <jeff@pl.atyp.us>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* cluster/afr/shd: Fix leak in PGFID healing [Richard Wareing, 2017-08-01, 1 file, -0/+4]
| Summary:
| - Fixes leak in PGFID healing flow
| Test Plan:
| - Valgrind on dev server
| Differential Revision: https://phabricator.fb.com/D3090661
| Change-Id: Icde6c3ed868034dff77c92f01182dd1e3a4f8a57
| Signed-off-by: Jeff Darcy <jdarcy@fb.com>
| Reviewed-on: https://review.gluster.org/17948
| Tested-by: Jeff Darcy <jeff@pl.atyp.us>
| Smoke: Gluster Build System <jenkins@build.gluster.org>
| Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/afr: Fix case in PGFID healing where NOOP was not being honored [Richard Wareing, 2017-08-01, 4 files, -3/+23]
| Summary:
| - PGFID healing should not be triggered in the case where there is nothing to do (ret = 2). Instead this return code should be returned to the heal daemon to trigger the reap of the entry.
| - Reworked shd-pgfid-heal.t to queue up heal naturally instead of synthetically
| Test Plan:
| - Run tests/basic/afr/shd-pgfid-heal.t
| Differential Revision: https://phabricator.fb.com/D2748578
| Change-Id: I74300de2b4dce23867f4111548de35f58bf77453
| Signed-off-by: Jeff Darcy <jdarcy@fb.com>
| Reviewed-on: https://review.gluster.org/17936
| Smoke: Gluster Build System <jenkins@build.gluster.org>
| Tested-by: Jeff Darcy <jeff@pl.atyp.us>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* features/quota: Fix brick crash in quota unlink callback [Richard Wareing, 2017-08-01, 1 file, -1/+1]
| Summary:
| - Get log message to use loc.gfid not loc.inode->gfid
| Test Plan:
| - Run prove -v tests/basic/quota*.t
| Reviewers: dph, moox, sshreyas
| Reviewed By: sshreyas
| Signature: t1:2559107:1445311668:61ca5809fa977326d0fb503e874363a29cd31dfe
| Change-Id: Iad16d7b2102376380eb0f6918111249af370aaeb
| Reviewed-on: https://review.gluster.org/17938
| Smoke: Gluster Build System <jenkins@build.gluster.org>
| Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
| Tested-by: Jeff Darcy <jeff@pl.atyp.us>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* afr/cluster: PGFID heal support [Richard Wareing, 2017-07-31, 6 files, -17/+127]
| Summary: PGFID healing enables heals which might otherwise fail due to the lack of an entry heal to succeed by performing the entry healing within the same heal flow. It does this by leveraging the PGFID tracking feature of the POSIX xlator, and examining lookup replies for the PGFID attribute. If detected, the pgfid will be decoded and stored for later use in case the heal fails for whatever reason. Cascading heal failures are handled through recursion.
| This feature is critical for a few reasons:
| 1. General healing predictability - When the SHD attempts to heal a given GFID, it should be able to do so without having to wait for some other dependent heal to take place.
| 2. Reliability - In some cases the parent directory may require healing, but the req'd entry in the indices/xattrop directory may not exist (e.g. bugs/crashes etc). Prior to PGFID heal support some sort of external script would be required to queue up these heals by using FS specific utilities to lookup the parent directory by hardlink or worse...do a costly full heal to clean them up.
| 3. Performance - In combination with multi-threaded SHD this feature will make SHD healing _much_ faster as directories with a large number of files to be healed will no longer have to wait for an entry heal to come along, the first file in that directory queued for healing will trigger an entry heal for the directory and this will allow the other files in that directory to be (immediately) healed in parallel.
| Test Plan:
| - run prove tests/basic/afr/shd_pgfid_heal.t
| - run prove tests/basic/afr/shd*.t
| - run prove tests/basic/afr/gfid*.t
| Differential Revision: https://phabricator.fb.com/D2546133
| Change-Id: I25f586047f8bcafa900c0cc9ee8f0e2128688c73
| Signed-off-by: Jeff Darcy <jdarcy@fb.com>
| Reviewed-on: https://review.gluster.org/17929
| Smoke: Gluster Build System <jenkins@build.gluster.org>
| Tested-by: Jeff Darcy <jeff@pl.atyp.us>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
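The "cascading heal failures are handled through recursion" idea can be sketched as below. This is an illustrative Python model under stated assumptions, not the AFR implementation: `heal_with_parents`, the `pgfid_of` mapping (child GFID to stored parent GFID) and the `try_heal` callback are all hypothetical.

```python
from typing import Callable, Dict, Optional, Set


def heal_with_parents(
    gfid: str,
    pgfid_of: Dict[str, str],
    try_heal: Callable[[str], bool],
    _seen: Optional[Set[str]] = None,
) -> bool:
    """Heal `gfid`; if that fails, heal its parent first and retry once."""
    seen = set() if _seen is None else _seen
    if gfid in seen:
        return False  # guard against cycles in the stored parent links
    seen.add(gfid)

    if try_heal(gfid):
        return True

    parent = pgfid_of.get(gfid)
    if parent is None:
        return False  # no stored PGFID, nothing more to try

    # Recurse: the parent may itself need its own parent healed first.
    if not heal_with_parents(parent, pgfid_of, try_heal, seen):
        return False
    return try_heal(gfid)
```

The point of the recursion is that a file heal no longer has to wait for a separately queued entry heal of its directory; the first failing file triggers it.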
* server: fix core dumps on upstream test machines [Jeff Darcy, 2017-07-18, 1 file, -1/+5]
| Change-Id: I48f5340507a5fcebe874f498eba737585c1c32a7
| Signed-off-by: Jeff Darcy <jdarcy@fb.com>
| Reviewed-on: https://review.gluster.org/17818
| Smoke: Gluster Build System <jenkins@build.gluster.org>
| Tested-by: Jeff Darcy <jeff@pl.atyp.us>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* [io-cache] cache statfs [David Wolinsky, 2017-07-14, 3 files, -16/+139]
| Summary: cache calls to statfs
| - io-cache must be enabled
| - then enable statfs caching
| - also can configure an independent cache time
| Test Plan: unit test basic/cache.t
| Reviewers: rwareing, sshreyas
| Subscribers: rappleye
| Differential Revision: https://phabricator.fb.com/D2524471
| Change-Id: I55e0a773f9e24c2358d6fbbabbaf58bd5bd89ffc
| Tasks: 8618383
| Reviewed-on: https://review.gluster.org/17771
| Smoke: Gluster Build System <jenkins@build.gluster.org>
| Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
| Tested-by: Jeff Darcy <jeff@pl.atyp.us>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* [nfs] exports_auth per (sub) volume [David Wolinsky, 2017-07-13, 7 files, -27/+128]
| Summary:
| - exports_auth changed to a per-volume option
| - parse exports_auth in nfs3.c
| - set nfs3_export state for exports_auth
| - all calls into mnt3_authenticate_request must pass in volname
| - volname is checked to determine if auth is enabled for that volume
| Test Plan: manual testing, will look into unit testing
| Reviewers: rwareing, sshreyas
| Reviewed By: sshreyas
| Subscribers: rappleye
| Differential Revision: https://phabricator.fb.com/D2519423
| Tasks: 6863942
| Change-Id: Ia9fd92ca5a5bd4cbb57e9ce61075f024ab7dbc27
| Signature: t1:2519423:1444775772:24dc39e22684784b75899e97e9d1e294b059a077
| Signed-off-by: Jeff Darcy <jdarcy@fb.com>
| Reviewed-on: https://review.gluster.org/17762
| Tested-by: Jeff Darcy <jeff@pl.atyp.us>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| Smoke: Gluster Build System <jenkins@build.gluster.org>
| Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* cluster/afr: Handle gfid-less directories in heal flow [Richard Wareing, 2017-07-12, 4 files, -18/+77]
| Summary:
| - Updates heal flow to handle the case where a directory does not have a gfid assigned. In this case we will remove _only_ empty directories, such that the parent can re-gain consistency and files within can be correctly healed.
| - Also adds a test for the case where a file does not have a gfid. This is already handled by the metadata heal flow, but tests were lacking for this code path.
| Test Plan:
| - prove -v tests/basic/shd_autofix_nogfid.t
| - prove -v tests/basic/gfid_unsplit_shd.t
| Reviewers: dph, moox, sshreyas
| Reviewed By: sshreyas
| Differential Revision: https://phabricator.fb.com/D2502067
| Tasks: 8549168
| Change-Id: I8dd3e6a6d62807cb38aafe597eced3d4b402351b
| Signed-off-by: Jeff Darcy <jdarcy@fb.com>
| Reviewed-on: https://review.gluster.org/17750
| Tested-by: Jeff Darcy <jeff@pl.atyp.us>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| Smoke: Gluster Build System <jenkins@build.gluster.org>
| Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* cluster/afr: SHD should always inspect directory heals [Richard Wareing, 2017-07-11, 1 file, -2/+18]
| Summary:
| - This change ensures SHD always inspects directories which are queued for healing; i.e. it will not exclusively trust the wise-fool algorithm as there are cases where the change log simply isn't correct (bugs, crashes, etc). Failing to perform the entry heal in these cases will result in data heals failing to take place.
| - We made a similar change in 3.4.x for similar reasons
| Test Plan:
| - Run prove -v tests/basic/shd_force_inspect.t
| Reviewers: moox, dph, sshreyas
| Reviewed By: sshreyas
| Differential Revision: https://phabricator.fb.com/D2492993
| Tasks: 8549168
| Signature: t1:2492993:1443740894:7cf07168ca09946df9d8f96a3085fe2d3c201543
| Change-Id: I2d8e1cbecbbca720cc3ee988d7aae08bea0a5453
* cluster/afr: GFID unsplit improvements [Richard Wareing, 2017-07-11, 5 files, -105/+94]
| Summary:
| - Few improvements: handle type mis-matches (e.g. dir/file mis-matches), added in an option to control whether gfid unsplits will happen, ensured entry healing will happen in the gfid mis-match case when the option is enabled.
| - Added prove test to cover entry healing & type mis-match cases
| - Enable metadata split-brain resolution by default
| - Enable gfid split-brain resolution by default
| - Fix gfid unsplit logging bugs where it was showing null GFIDs instead of the actual chosen GFIDs
| Test Plan:
| - run prove -v test/basic/gfid_unsplit*
| - Ran valgrind to verify leak-free state
| Reviewers: moox, sshreyas
| Reviewed By: sshreyas
| Change-Id: Id67ddc728745ebbbaf7bdd3f9a5549e5a4cc4a20
| Signed-off-by: Jeff Darcy <jdarcy@fb.com>
| Change-Id: I4181233f9ba7f61ccd2ba91f0874eb2ac7cd40b5
| Manually-merged-by: Jeff Darcy <jdarcy@fb.com>
| Signed-off-by: Jeff Darcy <jdarcy@fb.com>
| Reviewed-on: https://review.gluster.org/17739
| Tested-by: Jeff Darcy <jeff@pl.atyp.us>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| Smoke: Gluster Build System <jenkins@build.gluster.org>
| Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* cluster/afr: Adjust gfid unsplit flow for proper correctness w/ AFR2 [Richard Wareing, 2017-07-07, 1 file, -7/+18]
| Summary:
| - Prior patch did not re-run the gfid-mismatch flow after doing the unsplit. I think this is prudent to re-validate the unsplit worked as well as allow the code to continue from where it effectively left off.
| Test Plan:
| - Run prove -v tests/basic/gfid_unsplit.t
| Reviewers: dph, moox, sshreyas
| Reviewed By: sshreyas
| Change-Id: Ib3ed40f3db38c89090a876d7af3a1b2a303539d5
| Signed-off-by: Jeff Darcy <jdarcy@fb.com>
| Reviewed-on: https://review.gluster.org/17729
| Smoke: Gluster Build System <jenkins@build.gluster.org>
| Tested-by: Jeff Darcy <jeff@pl.atyp.us>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* cluster/afr: Non-destructive GFID unsplit brain support for v3.6.x [Richard Wareing, 2017-07-06, 3 files, -12/+323]
| Summary:
| - v3.6.3 port of non-destructive GFID unsplit-brain code, almost a re-write for AFR2, but the original behavior lives on.
| - This feature allows the GlusterFS filesystem to automagically resolve GFID splitbrain situations by choosing the authoritative file based on the last modification time. Other policies such as majority or size are also possible but not implemented just yet.
| - Core feature to Halo Geo-Replication, as this (gfid) form of split-brain is an everyday possibility with async mounts, so there needs to be an automated & scalable method to resolve them via the SHD or optionally in-line by FUSE clients or NFS daemons.
| - Operational notes:
|   1. Files or directory entries are supported, you can even write files into a directory and they will not be lost.
|   2. Streamed writes to a file are fully supported while a split-brain resolution happens, i.e. the writes will not be interrupted while the unsplit takes place.
|   3. Un-split (ones which are determined not to be "authoritative") files are renamed like so: ".<filename>_<random uuid>"
| Test Plan:
| - Run prove -v tests/basic/gfid_unsplit.t
| - Test output: https://phabricator.fb.com/P20041740
| Reviewers: moox, dph, sshreyas
| Reviewed By: sshreyas
| Differential Revision: https://phabricator.fb.com/D2479409
| Signature: t1:2479409:1443208319:373218aa9758a1b48db23ea5e211ec303fa92e64
| Blame Revision:
| Change-Id: I5b3d2e79fad74b4372c02b86219e8ee98f5e29dc
| Change-Id: I8ef719bcccb19ab6674647e02b72e1b36155fed9
| Signed-off-by: Jeff Darcy <jdarcy@fb.com>
| Reviewed-on: https://review.gluster.org/17720
| Smoke: Gluster Build System <jenkins@build.gluster.org>
| Tested-by: Jeff Darcy <jeff@pl.atyp.us>
| Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
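The mtime-based policy and the rename pattern from the operational notes can be sketched as follows. This is a hedged Python illustration, not the AFR2 code: `pick_authoritative`, `unsplit_name` and the brick labels are hypothetical, and tie-breaking between equal mtimes is an assumption of the sketch (the first replica seen wins).

```python
import uuid
from typing import Dict


def pick_authoritative(mtimes: Dict[str, float]) -> str:
    """Return the replica whose copy has the latest modification time."""
    # max() keeps the first key on ties; the real policy may differ.
    return max(mtimes, key=lambda replica: mtimes[replica])


def unsplit_name(filename: str) -> str:
    """Name for a non-authoritative copy: ".<filename>_<random uuid>"."""
    return f".{filename}_{uuid.uuid4()}"
```

Renaming instead of deleting is what makes the resolution non-destructive: the losing copies remain on disk under a hidden name.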
* Build/test fixes - build_env, tirpc, mem-pool, cleanup [Jeff Darcy, 2017-07-06, 1 file, -2/+3]
| Differential Revision: https://phabricator.intern.facebook.com/D5376801
| Change-Id: I5bf733a395ef2b85065200fa5810ced27ee0d682
| Reviewed-on: https://review.gluster.org/17719
| Smoke: Gluster Build System <jenkins@build.gluster.org>
| Tested-by: Jeff Darcy <jeff@pl.atyp.us>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* cluster/afr: Fix metadata split-brain flow (HOTFIX) [Richard Wareing, 2017-07-05, 1 file, -0/+11]
| Summary:
| - The metadata heal flow for some reason likes to tinker with the sink states prior to having the source finalized, this broke the policy based unsplit flow. This patch fixes it by simply setting those children who aren't the favorite as sinks.
| Test Plan:
| - Tested against some reported instances
| Reviewers: moox, sshreyas, dph
| Reviewed By: sshreyas
| Differential Revision: https://phabricator.fb.com/D2481527
| Signature: t1:2481527:1443215555:1165d8eb5f3dec216ec3ff0795d9837712906b1d
| Blame Revision:
| Change-Id: I56f96fdcef32dd4fc5d35958148d0e56d142d5e4
| Change-Id: I16aa445a22c3bcd7b589954e2da513ed53822d5b
| Signed-off-by: Jeff Darcy <jdarcy@fb.com>
| Reviewed-on: https://review.gluster.org/17682
| Smoke: Gluster Build System <jenkins@build.gluster.org>
| Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
| Tested-by: Jeff Darcy <jeff@pl.atyp.us>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* Revert "cluster/afr: Fix metadata split-brain flow (HOTFIX)" [Jeff Darcy, 2017-07-03, 1 file, -17/+0]
| This reverts commit 992a9f8494a358f828eeef34b46e9f5ccfca1d3b.
* Revert "cluster/afr: Adjust gfid unsplit flow for proper correctness w/ AFR2" [Jeff Darcy, 2017-07-03, 1 file, -17/+3]
| This reverts commit 1d5d6ec423a21d698196cca39c7ba0e2563d9ba8.
* cluster/afr: Adjust gfid unsplit flow for proper correctness w/ AFR2 [Richard Wareing, 2017-07-03, 1 file, -3/+17]
| Summary:
| - Prior patch did not re-run the gfid-mismatch flow after doing the unsplit. I think this is prudent to re-validate the unsplit worked as well as allow the code to continue from where it effectively left off.
| Test Plan:
| - Run prove -v tests/basic/gfid_unsplit.t
| Reviewers: dph, moox, sshreyas
| Reviewed By: sshreyas
| Differential Revision: https://phabricator.fb.com/D2483295
| Signature: t1:2483295:1443237512:84baa105ddabaabdac52628121a44a98fb55ffa5
| Blame Revision:
| Change-Id: I7b8a327be7272dd561fbf2995d781923b316ec45
| Change-Id: Ib3ed40f3db38c89090a876d7af3a1b2a303539d5
* cluster/afr: Fix metadata split-brain flow (HOTFIX) [Richard Wareing, 2017-07-03, 1 file, -0/+17]
| Summary:
| - The metadata heal flow for some reason likes to tinker with the sink states prior to having the source finalized, this broke the policy based unsplit flow. This patch fixes it by simply setting those children who aren't the favorite as sinks.
| Test Plan:
| - Tested against some reported instances
| Reviewers: moox, sshreyas, dph
| Reviewed By: sshreyas
| Differential Revision: https://phabricator.fb.com/D2481527
| Signature: t1:2481527:1443215555:1165d8eb5f3dec216ec3ff0795d9837712906b1d
| Blame Revision:
| Change-Id: I56f96fdcef32dd4fc5d35958148d0e56d142d5e4
| Change-Id: I16aa445a22c3bcd7b589954e2da513ed53822d5b
* Change default rsize/wsize from 2 MB to 512 KB [Shreyas Siravara, 2017-06-12, 1 file, -1/+1]
| Summary: Per title
| Test Plan: Tested on devserver
| Reviewers: dph, moox, rwareing
| Reviewed By: rwareing
| FB-commit-id: 65e8b70
| Change-Id: Ie51dab3eba0e989c81c58007f18186c8a48a2f91
| Signed-off-by: Jeff Darcy <jdarcy@fb.com>
| Reviewed-on: https://review.gluster.org/17310
| Smoke: Gluster Build System <jenkins@build.gluster.org>
| Tested-by: Jeff Darcy <jeff@pl.atyp.us>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* glusterd: fix tiering/GFDB breakage [Jeff Darcy, 2017-05-13, 1 file, -0/+2]
| Some code that's only used for tiering was being compiled even when tiering was supposed to be disabled, and this was causing compilers to barf. Compilation is supposed to be controlled internally by USE_GFDB, tied to the "--disable-tiering" configure option.
| Change-Id: I41afa6ad1c50c02a9ec0f3ea420a0101d97e2960
| Signed-off-by: Jeff Darcy <jdarcy@fb.com>
| Reviewed-on: https://review.gluster.org/17283
| Tested-by: Jeff Darcy <jeff@pl.atyp.us>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| Smoke: Gluster Build System <jenkins@build.gluster.org>
| Reviewed-by: Kevin Vigor <kvigor@fb.com>
* Build/test fixes [Jeff Darcy, 2017-05-11, 1 file, -2/+2]
| (1) The version of gcc on my devserver doesn't seem to like plain "inline" (in AHA) so I replaced it with "extern inline" which is a better idea anyway.
| (2) Fixed up some stuff to do with finding env.rc
| (3) Added "nfsvers=3,proto=tcp" so that NFS tests run (a little bit) better on my machine. Never hurts to be explicit, I guess.
| Change-Id: I3357b61a950c0d1ef3dfd2c12c96d157c4d163e2
| Signed-off-by: Jeff Darcy <jdarcy@fb.com>
* nfs/auth: Fix sensitivity to rw,ro ordering in the exports file [Shreyas Siravara, 2017-03-17, 1 file, -4/+9]
| Summary: When a netgroup is marked as rw in the exports file, and another netgroup is marked as ro for the same share, the ro option is not honored. This diff fixes that bug.
| Test Plan: Added a test and verified that it passes with this patch and does not pass without this patch.
| Reviewers: rwareing, dph, moox
| Reviewed By: moox
| FB-commit-id: 2d36d2d
| Change-Id: Ia394f36472f094a62ddfedc0c8fd5d95e247b4b0
| Signed-off-by: Kevin Vigor <kvigor@fb.com>
| Reviewed-on: https://review.gluster.org/16908
| Smoke: Gluster Build System <jenkins@build.gluster.org>
| Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* Merge remote-tracking branch 'origin/release-3.8' into merge-3.8 [Kevin Vigor, 2017-03-16, 15 files, -211/+445]
|\
| | Change-Id: Ib336c2ada491c2d2fcbbbe6865f9eb975a405b36
| * cluster/afr: Perform new entry mark before creating new entry [Pranith Kumar K, 2017-03-11, 4 files, -49/+51]
| | There is a chance for the source brick to go down just after the new entry is created and before source brick is marked with necessary pending markers. If after this any I/O happens then new entry will become source and reverse heal will happen. To prevent this mark the pending xattrs before creating the new entry.
| | >BUG: 1417466
| | >Change-Id: I233b87e694d32e5d734df5a83b4d2ca711c17503
| | >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
| | >Reviewed-on: https://review.gluster.org/16474
| | >Smoke: Gluster Build System <jenkins@build.gluster.org>
| | >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | >CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | >Reviewed-by: Ravishankar N <ravishankar@redhat.com>
| | >Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
| | BUG: 1429312
| | Change-Id: Ia1bdaf9511acaeff72a336c8185a56a64ea0e2ba
| | Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
| | Reviewed-on: https://review.gluster.org/16850
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | Reviewed-by: Ravishankar N <ravishankar@redhat.com>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Niels de Vos <ndevos@redhat.com>
| * storage/posix: Execute syscalls in xattrop under different locks [Krutika Dhananjay, 2017-03-11, 4 files, -36/+113]
| | Backport of: https://review.gluster.org/16462 and https://review.gluster.org/16792
| | ... and not inode->lock. This is to prevent the epoll thread from *potentially* being blocked on this lock in the worst case for extended period elsewhere in the brick stack, while the syscalls in xattrop are being performed under the same lock by a different thread. This could potentially lead to ping-timeout, if the only available epoll thread is busy waiting on the inode->lock, thereby preventing it from picking up the ping request from the client(s).
| | Also removed some unused functions.
| | >Change-Id: I2054a06701ecab11aed1c04e80ee57bbe2e52564
| | >BUG: 1421938
| | >Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
| | >Reviewed-on: https://review.gluster.org/16462
| | >Smoke: Gluster Build System <jenkins@build.gluster.org>
| | >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | >Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
| | >CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | (cherry picked from commit b5c26a462caf97bfc5380c81092f5c331ccaf1ae)
| | Change-Id: I2054a06701ecab11aed1c04e80ee57bbe2e52564
| | BUG: 1427390
| | Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
| | Reviewed-on: https://review.gluster.org/16777
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
| | Reviewed-by: Niels de Vos <ndevos@redhat.com>
| * afr: restore atime/mtime for non-regular files [Ravishankar N, 2017-03-10, 4 files, -51/+64]
| | AFR restores atime/mtime only as a part of data heal. For non-regular files (dirs, symlinks, char/block/socket files etc) which do not undergo data-heal, atime/mtime is not restored. This patch restores atime/mtime as a part of metadata heal for such files.
| | > Reviewed-on: https://review.gluster.org/16844
| | > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| | > Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| | > Smoke: Gluster Build System <jenkins@build.gluster.org>
| | > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | (cherry picked from commit 804a65f07ea8e2093f781807651d0d07513b2627)
| | Change-Id: Id8da885fc93fdf65c2f4bae2af3605b146ac1f16
| | BUG: 1429405
| | Signed-off-by: Ravishankar N <ravishankar@redhat.com>
| | Reviewed-on: https://review.gluster.org/16852
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | Reviewed-by: Niels de Vos <ndevos@redhat.com>
| * cluster/dht: Fix crash in "nuke-dir" feature [Krutika Dhananjay, 2017-03-10, 1 file, -1/+10]
| | Backport of: https://review.gluster.org/16829
| | My patch at https://review.gluster.org/16419 is resulting in core dumps every time I run tests/features/nuke.t. Turns out dht, upon successfully "nuking" a directory, which was initiated through a setxattr, unwinds the operation with rmdir fop signature, resulting in readdir-ahead casting a struct iatt (preparent) to dict_t, leading to a crash.
| | Change-Id: Ib970b3198185a6c641092b00e115a672cb3f9111
| | BUG: 1428743
| | Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
| | Reviewed-on: https://review.gluster.org/16840
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Niels de Vos <ndevos@redhat.com>
| * features/shard: Fix EIO error on add-brick (Krutika Dhananjay, 2017-03-10, 2 files, -19/+154)
| | Backport of: https://review.gluster.org/14419
| |
| | DHT seems to link inode during lookup even before initializing inode ctx with layout information, which comes after directory healing.
| |
| | Consider two parallel writes. As part of the first write, shard sends lookup on .shard which in its return path would cause DHT to link .shard inode. Now at this point, when a second write is wound, inode_find() of .shard succeeds and as a result of this, shard goes to create the participant shards by issuing MKNODs under .shard. Since the layout is yet to be initialized, mknod fails in dht call path with EIO, leading to VM pauses.
| |
| | The fix involves shard maintaining a flag to denote whether a fresh lookup on .shard completed one network trip. If it didn't, all inode_find()s in fop path will be followed by a lookup before proceeding with the next stage of the fop.
| |
| | Big thanks to Raghavendra G and Pranith Kumar K for the RCA and subsequent inputs and feedback on the patch.
| |
| | Change-Id: I66a7adf177e338a7691f441f199dde7c2b90c292
| | BUG: 1387878
| | Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
| | Reviewed-on: https://review.gluster.org/16750
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Niels de Vos <ndevos@redhat.com>
| * features/shard: Put onus of choosing the inode to resolve on individual fops (Krutika Dhananjay, 2017-03-10, 2 files, -26/+21)
| | Backport of: https://review.gluster.org/16709
| |
| | ... as opposed to adding checks in "common" functions to choose the inode to resolve based on local->fop, which is rather ugly and prone to errors.
| |
| | Change-Id: Ib26d3dd5a7ae43cd27839752bdae2cce56d73e8a
| | BUG: 1387878
| | Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
| | Reviewed-on: https://review.gluster.org/16749
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Niels de Vos <ndevos@redhat.com>
| * cluster/dht Fix error assignment in dht_*xattr2 functions (N Balachandran, 2017-03-10, 1 file, -2/+6)
| | Corrected the op_errno assignments and NULL checks in the dht_setxattr2 and dht_removexattr2 functions. Earlier, they unwound with the default EINVAL op_errno if the file had been deleted.
| |
| | > Change-Id: Iaf837a473d769cea40132487a966c7f452990071
| | > BUG: 1421653
| | > Signed-off-by: N Balachandran <nbalacha@redhat.com>
| | > Reviewed-on: https://review.gluster.org/16610
| | > Smoke: Gluster Build System <jenkins@build.gluster.org>
| | > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | > Reviewed-by: MOHIT AGRAWAL <moagrawa@redhat.com>
| | > Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
| | (cherry picked from commit 028626a86ea409f908783b9007c02877f20be43e)
| |
| | Signed-off-by: N Balachandran <nbalacha@redhat.com>
| | Change-Id: Id2e91df47bcd734dda18700fb075608c1627a608
| | BUG: 1424915
| | Reviewed-on: https://review.gluster.org/16678
| | Tested-by: N Balachandran <nbalacha@redhat.com>
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
| * fuse: fix memory leak in setxattr (Xavier Hernandez, 2017-03-10, 1 file, -27/+21)
| | If there's some failed check in setxattr of mount/fuse before actually starting the operation, a fuse_state_t structure is leaked.
| |
| | This fix correctly releases allocated resources in case of error.
| |
| | > Change-Id: I8b1cda67a613c13b6bc38947352e2ccfccf96a1d
| | > BUG: 1412174
| | > Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
| | > Reviewed-on: http://review.gluster.org/16380
| | > Smoke: Gluster Build System <jenkins@build.gluster.org>
| | > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | > Reviewed-by: Niels de Vos <ndevos@redhat.com>
| | > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | > Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
| |
| | Change-Id: I7e838f8284aa2aca2e43067a4b002e8530ad928d
| | BUG: 1412994
| | Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
| | Reviewed-on: https://review.gluster.org/16403
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
| | Reviewed-by: Niels de Vos <ndevos@redhat.com>
| * cluster/ec: Fixing log message (Sunil Kumar H G, 2017-03-10, 1 file, -5/+10)
| | Updating the warning message with details to improve user understanding.
| |
| | >BUG: 1409202
| | >Change-Id: I001f8d5c01c97fff1e4e1a3a84b62e17c025c520
| | >Signed-off-by: Sunil Kumar H G <sheggodu@redhat.com>
| | >Reviewed-on: http://review.gluster.org/16315
| | >Tested-by: Sunil Kumar Acharya
| | >Smoke: Gluster Build System <jenkins@build.gluster.org>
| | >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | >CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | >Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
| |
| | BUG: 1427419
| | Change-Id: I34a869d7cd7630881c897e0e4ecac367cd2820f9
| | Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
| | Reviewed-on: https://review.gluster.org/16781
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Ashish Pandey <aspandey@redhat.com>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
* | cluster/afr: AFR2 discovery should always do entry heal flow (Richard Wareing, 2017-03-06, 5 files, -14/+43)
| | Summary:
| | - Fixes the case where, when a brick is completely wiped, the AFR2 discovery mechanism would potentially (1/R chance where R is your replication factor) pin a NFSd or client to the wiped brick. This would in turn prevent the client from seeing the contents of the (degraded) subvolume.
| | - The fix proposed in this patch is to force the entry-self heal code path when the discovery process happens, and furthermore to force a conservative merge in the case where no brick is found to be degraded.
| | - This also restores the property of our 3.4.x builds whereby bricks automagically rebuild via the SHDs without having to run any sort of "full heal". SHDs are given enough signal via this patch to figure out what they need to heal.
| |
| | Test Plan: Run "prove -v tests/bugs/fb8149516.t"
| | Output: https://phabricator.fb.com/P19989638
| | Prove test showing failed run on v3.6.3-fb_10 without the patch -> https://phabricator.fb.com/P19989643
| |
| | Reviewers: dph, moox, sshreyas
| | Reviewed By: sshreyas
| | FB-commit-id: 3d6f171
| | Change-Id: I7e0dec82c160a2981837d3f07e3aa6f6a701703f
| | Signed-off-by: Kevin Vigor <kvigor@fb.com>
| | Reviewed-on: https://review.gluster.org/16862
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
* | HotFix: Disables rmtab functionality (v3.6.3) (Richard Wareing, 2017-03-06, 1 file, -2/+13)
| | Summary:
| | - This is a new feature for 3.6 to provide "show mount" functionality to clients. It unfortunately is not scalable for large numbers of mounts due to the various activities done in the epoll loop causing IO stalls and very slow mount performance for NFS clients, especially nfs CLI clients.
| |
| | Test Plan:
| | - Built RCs, deployed to gfsinstabu.frc3c08
| |
| | Reviewers: dph, sshreyas
| | Reviewed By: sshreyas
| | Subscribers: storage@
| | FB-commit-id: bf30931
| | Change-Id: I7d0e110be95d82e3d8be7d2ac7576386471c1a47
| | Signed-off-by: Kevin Vigor <kvigor@fb.com>
| | Reviewed-on: https://review.gluster.org/16859
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
* | Fix management client deadlock (Richard Wareing, 2017-03-06, 1 file, -1/+6)
| | Summary:
| | - ping notify is a NOOP for management daemons
| |
| | Test Plan:
| | - Built and ran on gfsadsbu (132 nodes) verified nothing hangs and upgrade is smooth
| |
| | Reviewers: sshreyas
| | Reviewed By: sshreyas
| | FB-commit-id: ec30b68
| | Change-Id: I8e121aaaa3ad268e5df057e03aa4b37a403c9ea0
| | Signed-off-by: Kevin Vigor <kvigor@fb.com>
| | Reviewed-on: https://review.gluster.org/16858
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
* | Merge remote-tracking branch 'origin/release-3.8' into merge-3.8 (Kevin Vigor, 2017-03-05, 6 files, -47/+40)
|\|
| * posix: Fix creation of files with S_ISVTX on FreeBSD (Xavier Hernandez, 2017-02-26, 2 files, -4/+4)
| | On FreeBSD the S_ISVTX flag is completely ignored when creating a regular file. Since gluster needs to create files with this flag set, especially for DHT link files, it's necessary to force the flag.
| |
| | This fix does this by calling fchmod() after creating a file that must have this flag set.
| |
| | > Change-Id: I51eecfe4642974df6106b9084a0b144835a4997a
| | > BUG: 1411228
| | > Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
| | > Reviewed-on: https://review.gluster.org/16417
| | > Smoke: Gluster Build System <jenkins@build.gluster.org>
| | > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | > Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
| | > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
| |
| | Change-Id: Icaf46ebb440a3a722fd2fd771dd9d2f765b35ef4
| | BUG: 1424974
| | Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
| | Reviewed-on: https://review.gluster.org/16687
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
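The create-then-fchmod() approach described in the commit above can be sketched roughly as follows. This is only an illustration of the idea, not the posix xlator code: `create_with_forced_mode` is a hypothetical helper name, and the sketch assumes the caller is allowed to set the sticky bit on a regular file (some systems restrict this to privileged processes).

```c
#define _XOPEN_SOURCE 700 /* expose S_ISVTX and fchmod() under strict modes */
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path` exclusively and, when the sticky bit
 * was requested, re-apply the mode with fchmod() so the bit is set even if
 * open() silently dropped it. Returns an open fd, or -1 on failure. */
static int create_with_forced_mode(const char *path, mode_t mode)
{
    assert(path != NULL);

    int fd = open(path, O_CREAT | O_EXCL | O_RDWR, mode);
    if (fd < 0)
        return -1;

    /* fchmod() is not filtered by the umask, so the file ends up with
     * exactly `mode`, including S_ISVTX. */
    if ((mode & S_ISVTX) && fchmod(fd, mode) != 0) {
        close(fd);
        unlink(path); /* do not leave a half-initialised file behind */
        return -1;
    }
    return fd;
}
```

Because fchmod() bypasses the umask, a caller that wants the umask honoured would have to mask `mode` itself before passing it in.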
| * glusterd: ignore return code of glusterd_restart_bricks (Atin Mukherjee, 2017-02-20, 1 file, -9/+3)
| | When GlusterD is restarted on a multi-node cluster, while syncing the global options from other GlusterD instances, it checks for quorum and, based on that, decides whether to stop/start a brick. However, we handle the return code of this function, so if we don't want to start any bricks the ret will be non-zero and we will end up failing the import, which is incorrect.
| |
| | Fix is just to ignore the ret code of glusterd_restart_bricks ()
| |
| | >Reviewed-on: https://review.gluster.org/16574
| | >Smoke: Gluster Build System <jenkins@build.gluster.org>
| | >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | >CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | >Reviewed-by: Samikshan Bairagya <samikshan@gmail.com>
| | >Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
| | >(cherry picked from commit 55625293093d485623f3f3d98687cd1e2c594460)
| |
| | Change-Id: I37766b0bba138d2e61d3c6034bd00e93ba43e553
| | BUG: 1420993
| | Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
| | Reviewed-on: https://review.gluster.org/16594
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Samikshan Bairagya <samikshan@gmail.com>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| * protocol/client: Fix double free of client fdctx destroy (Ravishankar N, 2017-02-20, 3 files, -34/+33)
| | This patch fixes the race between fd re-open code and fd release code, both of which free the fd context due to a race in certain variable checks as explained below:
| |
| | 1. client process (shd in the case of this BZ) sends an opendir to its children (client xlators) which send the fop to the bricks to get a valid fd.
| | 2. Client xlator loses connection to the brick. fdctx->remotefd is -1
| | 3. Client re-establishes connection. After handshake, it reopens the dir and sets fdctx->remotefd to a valid fd in client3_3_reopendir_cbk().
| | 4. Meanwhile, shd sends a fd unref after it is done with the opendir. This triggers a releasedir (since fd->refcount becomes 0).
| | 5. client3_3_releasedir() sees that fdctx->remotefd is a valid number (i.e. not -1), sets fdctx->released=1 and calls client_fdctx_destroy()
| | 6. As a continuation of step3, client_reopen_done() is called by client3_3_reopendir_cbk(), which sees that fdctx->released==1 and again calls client_fdctx_destroy().
| |
| | Depending on when step-5 does GF_FREE(fdctx), we may crash at any place in step-6 in client3_3_reopendir_cbk() when it tries to access fdctx->{whatever}.
| |
| | > Reviewed-on: https://review.gluster.org/16521
| | > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | > Smoke: Gluster Build System <jenkins@build.gluster.org>
| | > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| | (cherry picked from commit 25fc74f9d1f2b1e7bab76485a99f27abadd10b7b)
| |
| | Change-Id: Ia50873d11763e084e41d2a1f4d53715438e5e947
| | BUG: 1422352
| | Signed-off-by: Ravishankar N <ravishankar@redhat.com>
| | Reviewed-on: https://review.gluster.org/16621
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
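The six-step race in the commit above comes down to two code paths each deciding, from flags read at different moments, that it is the one that must free the context. One common way to avoid this is to make both decisions under the same lock, so that exactly one path ever frees. The sketch below illustrates that general pattern only; it is not the actual patch, and every name in it (`fdctx_demo`, `reopen_in_progress`, and so on) is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the client fd context; not the real structure. */
struct fdctx_demo {
    pthread_mutex_t lock;
    bool reopen_in_progress; /* a reopen callback is still going to run */
    bool released;           /* the last fd reference has been dropped  */
    int remote_fd;           /* -1 while there is no valid remote fd    */
};

static int destroy_count; /* counts frees so a double free would show up */

static struct fdctx_demo *fdctx_demo_new(void)
{
    struct fdctx_demo *ctx = calloc(1, sizeof(*ctx));
    if (ctx == NULL)
        return NULL;
    pthread_mutex_init(&ctx->lock, NULL);
    ctx->remote_fd = -1;
    return ctx;
}

static void fdctx_demo_destroy(struct fdctx_demo *ctx)
{
    destroy_count++;
    pthread_mutex_destroy(&ctx->lock);
    free(ctx);
}

/* Called when the connection comes back and a reopen request is sent. */
static void fdctx_demo_reopen_start(struct fdctx_demo *ctx)
{
    assert(ctx != NULL);
    pthread_mutex_lock(&ctx->lock);
    ctx->reopen_in_progress = true;
    pthread_mutex_unlock(&ctx->lock);
}

/* Release path: frees only if no reopen callback is pending. */
static void fdctx_demo_release(struct fdctx_demo *ctx)
{
    bool destroy_now;

    assert(ctx != NULL);
    pthread_mutex_lock(&ctx->lock);
    ctx->released = true;
    destroy_now = !ctx->reopen_in_progress;
    pthread_mutex_unlock(&ctx->lock);

    if (destroy_now)
        fdctx_demo_destroy(ctx);
}

/* Reopen completion path: frees only if release already happened. */
static void fdctx_demo_reopen_done(struct fdctx_demo *ctx, int new_remote_fd)
{
    bool destroy_now;

    assert(ctx != NULL);
    pthread_mutex_lock(&ctx->lock);
    ctx->remote_fd = new_remote_fd;
    ctx->reopen_in_progress = false;
    destroy_now = ctx->released;
    pthread_mutex_unlock(&ctx->lock);

    if (destroy_now)
        fdctx_demo_destroy(ctx);
}
```

Whichever path runs second sees the flag set by the first and takes responsibility for the free; the path that runs first leaves the context alone.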
* | Merge remote-tracking branch 'origin/release-3.8' into merge-3.8 (Kevin Vigor, 2017-02-16, 8 files, -45/+115)
|\|
| * gNFS: Keep the mountdict as long as the service is active (Niels de Vos, 2017-02-16, 1 file, -3/+7)
| | We initialize and take ref once on mountdict during NFS/MNT3 server initialization but seem to be unref'ing it for every UMNTALL request. This can lead to a crash when there are multiple UMNTALL requests with >=1 active mount entry(/ies) in the mountlist.
| |
| | Since we take the ref only once, we should keep the mountdict throughout the life of the process and dereference it only during uninitialization of mnt3 service.
| |
| | Cherry picked from commit a88ae92de190af0956013780939ba6bdfd509ff8:
| | > Change-Id: I3238a8df09b8972e56dd93fee426d866d40d9959
| | > BUG: 1421759
| | > Signed-off-by: Soumya Koduri <skoduri@redhat.com>
| | > Reviewed-on: https://review.gluster.org/16611
| | > Smoke: Gluster Build System <jenkins@build.gluster.org>
| | > Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
| | > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
| | > Reviewed-by: Niels de Vos <ndevos@redhat.com>
| |
| | Change-Id: I3238a8df09b8972e56dd93fee426d866d40d9959
| | BUG: 1422394
| | Signed-off-by: Niels de Vos <ndevos@redhat.com>
| | Reviewed-on: https://review.gluster.org/16627
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: soumya k <skoduri@redhat.com>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
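The reference-counting rule stated in the commit above (one ref at init, one unref at teardown, none per UMNTALL) can be illustrated with a toy model. This is a sketch under assumed names, not the gNFS code: `dict_demo`, `mnt_demo` and their functions are hypothetical stand-ins for the real dict and mount state.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted dictionary; `entries` stands in for its contents. */
struct dict_demo {
    int refcount;
    int entries;
};

static int dicts_freed; /* lets a caller observe when the dict is freed */

static struct dict_demo *dict_demo_new(void)
{
    struct dict_demo *d = calloc(1, sizeof(*d));
    if (d != NULL)
        d->refcount = 1; /* the single reference taken at init time */
    return d;
}

static void dict_demo_unref(struct dict_demo *d)
{
    assert(d != NULL && d->refcount > 0);
    if (--d->refcount == 0) {
        dicts_freed++;
        free(d);
    }
}

/* Hypothetical mount-service state owning the dictionary. */
struct mnt_demo {
    struct dict_demo *mountdict;
};

static int mnt_demo_init(struct mnt_demo *ms)
{
    ms->mountdict = dict_demo_new();
    return ms->mountdict != NULL ? 0 : -1;
}

/* UMNTALL: drop every mount entry but keep the dict and its reference. */
static void mnt_demo_umntall(struct mnt_demo *ms)
{
    ms->mountdict->entries = 0;
}

/* Service teardown: the only place where the init-time ref is dropped. */
static void mnt_demo_fini(struct mnt_demo *ms)
{
    dict_demo_unref(ms->mountdict);
    ms->mountdict = NULL;
}
```

With this pairing, any number of UMNTALL requests leaves the refcount at 1, so a second request can never touch freed memory.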
| * afr: all children of AFR must be up to resolve s-brain (Ravishankar N, 2017-02-15, 3 files, -15/+61)
| | Problem:
| | The various split-brain resolution policies (favorite-child-policy based, CLI based and mount (get/setfattr) based) attempt to resolve split-brain even when not all bricks of replica are up. This can be a problem when, say, in a replica 3, the only good copy is down and the other 2 bricks are up and blame each other (i.e. split-brain). We end up healing the file in such a case and allow I/O on it.
| |
| | Fix:
| | A decision on whether the file is in split-brain or not must be taken only if we are able to examine the afr xattrs of *all* bricks of a given replica.
| |
| | > Reviewed-on: https://review.gluster.org/16476
| | > Smoke: Gluster Build System <jenkins@build.gluster.org>
| | > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| | (cherry picked from commit 0e03336a9362e5717e561f76b0c543e5a197b31b)
| |
| | Change-Id: Icddb1268b380005799990f5379ef957d84639ef9
| | BUG: 1420984
| | Signed-off-by: Ravishankar N <ravishankar@redhat.com>
| | Reviewed-on: https://review.gluster.org/16589
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
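The precondition named in the fix above, that a split-brain verdict is only meaningful when every brick of the replica can be examined, amounts to a simple guard. The sketch below is an assumption-laden illustration: `all_children_up` and the `child_up` array are hypothetical names, and the real AFR code tracks child state differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical guard: returns true only when every child (brick) of the
 * replica is reachable, i.e. when all of their xattrs could be examined. */
static bool all_children_up(const bool *child_up, size_t child_count)
{
    assert(child_up != NULL);

    if (child_count == 0)
        return false; /* nothing to examine, so no verdict is possible */

    for (size_t i = 0; i < child_count; i++) {
        if (!child_up[i])
            return false;
    }
    return true;
}
```

In the replica 3 scenario from the commit message, with the only good copy down, this guard returns false and a resolution policy built on it would decline to pick a winner between the two bricks that blame each other.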