| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Every once in a while rpcbind crashes and the NFS endpoints go bye-bye.
This diff makes it such that we should almost never encounter the case
where we have NFS up and rpcbind down causing bad endpoints and hanging
mounts for our customers.
Test Plan: Added prove tests + tested on dev server
Reviewers: dph, moox, rwareing
Reviewed By: rwareing
Differential Revision: https://phabricator.fb.com/D2571724
Tasks: 8803558
Change-Id: I35acb2d731185a7b20020cb57bdd4d879e978df4
Signature: t1:2571724:1445555327:3276a4dcc4da71346b09d4aeb46c69dddcc7c5ba
Reviewed-on: https://review.gluster.org/17961
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- PGFID healing should not be triggered in the case where there is
nothing to do (ret = 2). Instead this return code should be returned
to the heal daemon to trigger the reap of the entry.
- Reworked shd-pgfid-heal.t to queue up heal naturally instead of
synthetically
Test Plan: - Run tests/basic/afr/shd-pgfid-heal.t
Differential Revision: https://phabricator.fb.com/D2748578
Change-Id: I74300de2b4dce23867f4111548de35f58bf77453
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/17936
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
PGFID healing enables heals which might otherwise fail due
due to the lack of a entry heal to succeed by performing
the entry healing within the same heal flow.
It does this by leveraging the PGFID tracking feature of
the POSIX xlator, and examining lookup replies for the
PGFID attribute. If detected, the pgfid will be decoded
and stored for later use in case the heal fails for whatever
reason. Cascading heal failures are handled through
recursion.
This feature is critical for a couple reasons:
1. General healing predictability - When the SHD
attempts to heal a given GFID, it should be able
to do so without having to wait for some other
dependent heal to take place.
2. Reliability - In some cases the parent directory
may require healing, but the req'd entry in the
indices/xattrop directory may not exist
(e.g. bugs/crashes etc). Prior to PGFID heal support
some sort of external script would be required to
queue up these heals by using FS specific utilities
to lookup the parent directory by hardlink or
worse...do a costly full heal to clean them up.
3. Performance - In combination with multi-threaded SHD
this feature will make SHD healing _much_ faster as
directories with large amount of files to be healed
will no longer have to wait for an entry heal to
come along, the first file in that directory queued
for healing will trigger an entry heal for the directory
and this will allow the other files in that directory
to be (immediatelly) healed in parallel.
Test Plan:
- run prove tests/basic/afr/shd_pgfid_heal.t
- run prove tests/basic/afr/shd*.t
- run prove tests/basic/afr/gfid*.t
Differential Revision: https://phabricator.fb.com/D2546133
Change-Id: I25f586047f8bcafa900c0cc9ee8f0e2128688c73
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/17929
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
cache calls to statfs
- io-cache must be enabled
- then enable statfs caching
- also can configure an independent cache time
Test Plan: unit test basic/cache.t
Reviewers: rwareing, sshreyas
Subscribers: rappleye
Differential Revision: https://phabricator.fb.com/D2524471
Change-Id: I55e0a773f9e24c2358d6fbbabbaf58bd5bd89ffc
Tasks: 8618383
Reviewed-on: https://review.gluster.org/17771
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- exports_auth changed to a per-volume option
- parse exports_auth in nfs3.c
- set nfs3_export state for exports_auth
- all calls into mnt3_authenticate_request must pass in volname
- volname is checked to determine if auth is enabled for that volume
Test Plan: manual testing, will look into unit testing
Reviewers: rwareing, sshreyas
Reviewed By: sshreyas
Subscribers: rappleye
Differential Revision: https://phabricator.fb.com/D2519423
Tasks: 6863942
Change-Id: Ia9fd92ca5a5bd4cbb57e9ce61075f024ab7dbc27
Signature: t1:2519423:1444775772:24dc39e22684784b75899e97e9d1e294b059a077
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/17762
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Updates heal flow to handle case where a directory does not have a
gfid assigned. In this case we will remove _only_ empty directories
in these cases such that the parent can re-gain consistency and files
within can be correctly healed.
- Also adds a test for the case where a file does not have a gfid, this
is already handles by the metadata heal flow, but tests were lacking
for this code path.
Test Plan:
- prove -v tests/basic/shd_autofix_nogfid.t
- prove -v tests/basic/gfid_unsplit_shd.t
Reviewers: dph, moox, sshreyas
Reviewed By: sshreyas
Differential Revision: https://phabricator.fb.com/D2502067
Tasks: 8549168
Change-Id: I8dd3e6a6d62807cb38aafe597eced3d4b402351b
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/17750
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Accidentally pushed this directly through the branch instead of through
Gerrit.
Change-Id: Ieedd2f71887cca91a6f1d31bc3cddfc489fc9fa6
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/17749
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- This change ensures SHD always inspect directories which are queued
for healing; i.e. it will not exclusively trust the wise-fool
algorithm as there are cases where the change log simply isn't
correct (bugs, crashes, etc). Failing to perform the entry heal
in these cases will result in data heals failing to take place.
- We made a similar change in 3.4.x for similar reasons
Test Plan: - Run prove -v tests/basic/shd_force_inspect.t
Reviewers: moox, dph, sshreyas
Reviewed By: sshreyas
Differential Revision: https://phabricator.fb.com/D2492993
Tasks: 8549168
Signature: t1:2492993:1443740894:7cf07168ca09946df9d8f96a3085fe2d3c201543
Change-Id: I2d8e1cbecbbca720cc3ee988d7aae08bea0a5453
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Few improvements: handle type mis-matches (e.g. dir/file mis-matches), added in an option to control whether gfid unsplits will happen, ensured entry healing will happen in the gfid mis-match case when the option is enabled.
- Added prove test to cover entry healing & type mis-match cases
- Enable metadata split-brain resolution by default
- Enable gfid split-brain resolution by default
- Fix gfid unsplit logging bugs where it was showing null GFIDs instead of the actual chosen GFIDs
Test Plan:
- run prove -v test/basic/gfid_unsplit*
- Ran valgrind to verify leak-free state
Reviewers: moox, sshreyas
Reviewed By: sshreyas
Change-Id: Id67ddc728745ebbbaf7bdd3f9a5549e5a4cc4a20
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Change-Id: I4181233f9ba7f61ccd2ba91f0874eb2ac7cd40b5
Manually-merged-by: Jeff Darcy <jdarcy@fb.com>
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/17739
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Prior patch did not re-run the gfid-mismatch flow after doing the
unsplit. I think this is prudent to re-validate the unsplit worked as
well as allow the code to continue from where it effectively left off.
Test Plan: - Run prove -v tests/basic/gfid_unsplit.t
Reviewers: dph, moox, sshreyas
Reviewed By: sshreyas
Change-Id: Ib3ed40f3db38c89090a876d7af3a1b2a303539d5
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/17729
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- v3.6.3 port of non-destructive GFID unsplit-brain code, almost a re-write for AFR2, but the original behavior lives on.
- This feature allows the GlusterFS filesystem to automagically resolve GFID splitbrain situations by choosing the authorative file based on the last modification time. Other policies such as majority or size are also possible but not implemented just yet.
- Core feature to Halo Geo-Replication, as this (gfid) form of split-brain is an everyday possibility with async mounts, so there needs to be an automated & scalable method to resolve them via the SHD or optionally in-line by FUSE clients or NFS daemons.
- Operational notes:
1. Files or directory entries are supported, you can even write files into a directory and they will not be lost.
2. Streamed writes to a files are fully supported while a split-brain resolution happens, i.e. the writes will not be interrupted while the unsplit takes place.
3. Un-split (ones which are determined not to be "authoritative") files are renamed like so: ".<filename>_<random uuid>"
Test Plan:
- Run prove -v tests/basic/gfid_unsplit.t
- Test output: https://phabricator.fb.com/P20041740
Reviewers: moox, dph, sshreyas
Reviewed By: sshreyas
Differential Revision: https://phabricator.fb.com/D2479409
Signature: t1:2479409:1443208319:373218aa9758a1b48db23ea5e211ec303fa92e64
Blame Revision: Change-Id: I5b3d2e79fad74b4372c02b86219e8ee98f5e29dc
Change-Id: I8ef719bcccb19ab6674647e02b72e1b36155fed9
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/17720
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
| |
|
|
|
|
|
|
|
|
|
| |
Differential Revision: https://phabricator.intern.facebook.com/D5376801
Change-Id: I5bf733a395ef2b85065200fa5810ced27ee0d682
Reviewed-on: https://review.gluster.org/17719
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
|
| |
|
|
|
|
|
|
|
|
| |
Change-Id: I4074e7cce8f6782860f849780ab6d0458e92a2ce
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/17708
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
|
| |
|
|
|
|
|
|
|
|
|
| |
Change-Id: I977f94ebc1630bdf46fd28d310f433c1c7d327f5
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/17286
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kevin Vigor <kvigor@fb.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These were causing hung tests when the bricks were restarted, in some
cases even accompanied by kernel stack traces involving fuse. They're
unnecessary to the purpose of the test.
Change-Id: I3c6c485324e2ee9418eb54929015b5bae436e9ea
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/17284
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kevin Vigor <kvigor@fb.com>
|
| |
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ide0045f89134ef0eda1f985818c5cb93b5be969b
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/17258
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Kevin Vigor <kvigor@fb.com>
|
| |
|
|
|
|
|
|
|
|
|
| |
Change-Id: I163656985185c710e20ad80824e14645020522cd
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/17279
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kevin Vigor <kvigor@fb.com>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
(1) The version of gcc on my devserver doesn't seem to like plain
"inline" (in AHA) so I replaced it with "extern inline" which is a
better idea anyway.
(2) Fixed up some stuff to do with finding env.rc
(3) Added "nfsvers=3,proto=tcp" so that NFS tests run (a little bit)
better on my machine. Never hurts to be explicit, I guess.
Change-Id: I3357b61a950c0d1ef3dfd2c12c96d157c4d163e2
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When a netgroup is marked as rw in the exports file, and another netgroup is marked as ro for the same share,
the ro option is not honored. This diff fixes that bug
Test Plan: Added a test and verifies that it passes with this patch and does not pass without this patch.
Reviewers: rwareing, dph, moox
Reviewed By: moox
FB-commit-id: 2d36d2d
Change-Id: Ia394f36472f094a62ddfedc0c8fd5d95e247b4b0
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: https://review.gluster.org/16908
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
rmtab support is brutally hacked out in FB branch, these tests
are doomed to be sad.
Test Plan:
Reviewers: sshreyas
Subscribers:
Tasks:
Blame Revision:
Change-Id: I7bc3c061108682a12f051edcc4379721d5a216df
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: https://review.gluster.org/16884
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Fixes case where when a brick is completely wiped, the AFR2 discovery
mechanism would potentially (1/R chance where R is your replication
factor) pin a NFSd or client to the wiped brick. This would in turn
prevent the client from seeing the contents of the (degraded)
subvolume.
- The fix proposed in this patch is to force the entry-self
heal code path when the discovery process happens. And furthermore,
forcing a conservative merge in the case where no brick is found to be
degraded.
- This also restores the property of our 3.4.x builds where-by bricks
automagically rebuild via the SHDs without having to run any sort of
"full heal". SHDs are given enough signal via this patch to figure
out what they need to heal.
Test Plan:
Run "prove -v tests/bugs/fb8149516.t"
Output: https://phabricator.fb.com/P19989638
Prove test showing failed run on v3.6.3-fb_10 without the patch -> https://phabricator.fb.com/P19989643
Reviewers: dph, moox, sshreyas
Reviewed By: sshreyas
FB-commit-id: 3d6f171
Change-Id: I7e0dec82c160a2981837d3f07e3aa6f6a701703f
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: https://review.gluster.org/16862
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Try to get a passing smoke test by crudely hacking out failing tests.
Test Plan:
Commit & hope for happy smoke.
Reviewers: sshreyas
Subscribers:
Tasks:
Blame Revision:
Change-Id: I564ac50557276a839d8de3a89a5c154c751b7503
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: https://review.gluster.org/16856
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
It was observed while testing the SHD threading code, that under high loads SHD/AFR related
SyncOps & SyncTasks can actually hang/deadlock as the transport
disconnected event (for frame timeouts) never gets bubbled up correctly. Various
tests indicated the ping timeouts worked fine, while "frame timeouts"
did not. The only difference? Ping timeouts actually disconnect
the transport while frame timeouts did not. So from a high-level we
know this prevents deadlock as subsequent tests showed the deadlocks
no longer ocurred (after this change). That said, there may be some
more elegant solution. For now though, forcing a reconnect is
preferential vs hanging clients or deadlocking the SHD.
Test Plan:
It's fairly difficult to write a good prove test for this since it requires human eyes to observe if the SHD is deadlocked (I'm open to ideas). Here's the repro though:
1. Create a 3x replicated cluster on a host.
2. Set the frame-timeout low (say 2 sec)
3. Down a brick, and write a pile of files (maybe 2000)
4. Bring up the downed brick and let the SHD begin healing files
5. During the heal process, kill -STOP <pid of brick> (hang) one of the bricks
Without this patch the SHD will be deadlocked, even though the frame timed out after 2 seconds. With the patch, the plug is pulled on the transport, a disconnect is bubbled up
to the syncop and the SHD resumes.
Reviewers: dph, meyering, cjh
Reviewed By: cjh
Subscribers: ethanr
Conflicts:
rpc/rpc-lib/src/rpc-clnt.c
FB-commit-id: c99357c
Change-Id: I344079161492b195267c2d64b6eab0b441f12ded
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: https://review.gluster.org/16846
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
|
| |\ |
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When GlusterD is restarted on a multi node cluster, while syncing the
global options from other GlusterD, it checks for quorum and based on
which it decides whether to stop/start a brick. However we handle the
return code of this function in which case if we don't want to start any
bricks the ret will be non zero and we will end up failing the import
which is incorrect.
Fix is just to ignore the ret code of glusterd_restart_bricks ()
>Reviewed-on: https://review.gluster.org/16574
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>Reviewed-by: Samikshan Bairagya <samikshan@gmail.com>
>Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
>(cherry picked from commit 55625293093d485623f3f3d98687cd1e2c594460)
Change-Id: I37766b0bba138d2e61d3c6034bd00e93ba43e553
BUG: 1420993
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-on: https://review.gluster.org/16594
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Samikshan Bairagya <samikshan@gmail.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
This test is failing consistently on both the upstream 3.98 branch and
3.8-fb, just nuke.
Test Plan:
Check in, hope smoke test passes.
Reviewers:
Subscribers:
Tasks:
Blame Revision:
Change-Id: I888a9f424340dff128a4a5273f96e6d3fbc323a9
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: https://review.gluster.org/16647
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
|
| |\| |
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Problem:
The various split-brain resolution policies (favorite-child-policy based,
CLI based and mount (get/setfattr) based) attempt to resolve split-brain
even when not all bricks of replica are up. This can be a problem when
say in a replica 3, the only good copy is down and the other 2 bricks
are up and blame each other (i.e. split-brain). We end up healing the
file in such a case and allow I/O on it.
Fix:
A decision on whether the file is in split-brain or not must be taken
only if we are able to examine the afr xattrs of *all* bricks of a given
replica.
> Reviewed-on: https://review.gluster.org/16476
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
(cherry picked from commit 0e03336a9362e5717e561f76b0c543e5a197b31b)
Change-Id: Icddb1268b380005799990f5379ef957d84639ef9
BUG: 1420984
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: https://review.gluster.org/16589
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
| |\| |
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When overwriting an existing file with O_TRUNC, the 'atime' was set to
0, meaning the Epoch (01-Jan-1970 UTC). However, the 'mtime' gets
updated correcty.
In case 'atime' or 'mtime' is not passed in the 'struct iatt', the time
values passed to the systemcall are taken from the current values are
returned by lstat().
Cherry picked from commit 9bed81ada6f91f998e9abd915b18e3f06557cdcb:
> Change-Id: I7021b7161dcd6c9a3e515d98f6d4847533c434b3
> BUG: 1401777
> Reported-by: Eivind Sarto <eivindsarto@gmail.com>
> Signed-off-by: Niels de Vos <ndevos@redhat.com>
> Reviewed-on: http://review.gluster.org/16034
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Change-Id: I7021b7161dcd6c9a3e515d98f6d4847533c434b3
BUG: 1411011
Reported-by: Eivind Sarto <eivindsarto@gmail.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/16356
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Problem:
Currently, I/O on a split-brained file fails even when the
favorite-child-policy is set until the self-heal is complete.
Fix:
If a valid 'source' is found using the set favorite-child-policy, inspect
and reset the afr pending xattrs on the 'sinks' (inside appropriate locks),
refresh the inode and then proceed with the read or write transaction.
The resetting itself happens in the self-heal code and hence can also
happen in the client side background-heal or by the shd's index-heal in
addition to the txn code path explained above. When it happens in via
heal, we also add checks in undo-pending to not reset the sink xattrs
again.
> Reviewed-on: http://review.gluster.org/15673
> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Change-Id: Ic8c1317720cb26bd114b6fe6af4e58c73b864626
BUG: 1378547
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reported-by: Simon Turcotte-Langevin <simon.turcotte-langevin@ubisoft.com>
Reviewed-on: http://review.gluster.org/16091
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
dd is doing a statfs and failing with ENOSPC instead of writign and
getting EDQUOTA. Make either error a success in this test.
Test Plan:
prove bug-1292020.t
Reviewers:
Subscribers:
Tasks:
Blame Revision:
Change-Id: I9f580d9e4a4dd293df55a1d954f86a9862fcae7b
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: http://review.gluster.org/16352
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
Test relied on rmtab setting to find NFS config dir, but in FB branch
rmtab is disabled by default. Just hardcoded the location (test already
contained hardcoded path, as it turns out).
Test Plan:
Reviewers:
Subscribers:
Tasks:
Blame Revision:
Change-Id: I4a50a00ed550832ca8d91981e6c5af4d8c81b466
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: http://review.gluster.org/16336
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
| |\|
| |
| |
| |
| | |
Change-Id: I844adf2aef161a44d446f8cd9b7ebcb224ee618a
Signed-off-by: Kevin Vigor <kvigor@fb.com>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Problem:
In CentOS-7, the file was receving an extra removexattr(security.ima)
FOP which changed its ctime, breaking the assumption that a particular brick
had the latest ctime based on the writevs done in the .t
Fix:
1. Compare the ctime of both files in the backend and pick the one with
the latest ctime for the fav-child policy. Also unmount the volume
before comparing, to avoid any further FOPS on the file that
can possibly modify the timestamps.
2. Added floating point handling in stat function. Thanks to Pranith for
the helping debugging the regex.
> Reviewed-on: http://review.gluster.org/16288
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
(cherry picked from commit 76fff8cb2a164b596ca67e65c99623f5b68361fd)
Change-Id: I06041a0f39a29d2593b867af8685d65c7cd99150
BUG: 1410073
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/16324
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Backport of: http://review.gluster.org/16286
PROBLEM:
Consider a volume with granular-entry-heal and sharding enabled. When
a replica is down and a shard is created as part of a write, the name
index is correctly created under indices/entry-changes/<dot-shard-gfid>.
Now when a read on the same region triggers another MKNOD, the fop
fails on the online bricks with EEXIST. By virtue of this being a
symmetric error, the failed_subvols[] array is reset to all zeroes.
Because of this, before post-op, the GF_XATTROP_ENTRY_OUT_KEY will be
set, causing the name index, which was created in the previous MKNOD
operation, to be wrongly deleted in THIS MKNOD operation.
FIX:
The ideal fix would have been for a transaction to delete the name
index ONLY if it knows it is the one that created the index in the first
place. This would involve gathering information as to whether THIS xattrop
created the index from individual bricks, aggregating their responses and
based on the various posisble combinations of responses, decide whether to
delete the index or not. This is rather complex. Simpler fix would be
for post-op to examine local->op_ret in the event of no failed_subvols
to figure out whether to delete the name index or not. This can occasionally
lead to creation of stale name indices but they won't be affecting the IO path
or mess with pending changelogs in any way and self-heal in its crawl of
"entry-changes" directory would take care to delete such indices.
Change-Id: Icc642a987d1b6a5097562315aecf1263ed35ceb6
BUG: 1408786
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/16293
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Backport of: http://review.gluster.org/16193
Replace the EXPECT '00000001' with EXPECT_NOT '00000000'. This is
because occasionally a name-heal is performing new-entry marking on
'c' causing the pending entry changelog on it to become '00000002'.
Change-Id: Ib7b0d64c8de2498c2ffb3b8e06228694f2c55755
BUG: 1406740
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/16224
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Backport of: http://review.gluster.org/16169
Check that shd is up before executing 'volume heal' command
Change-Id: I43634a979791fcb92bfebf93ec48eff42af2bb97
BUG: 1405890
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/16190
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
After sending SIGTERM to gluster process we immediately
check if process exited. We should wait for some time
before checking process state.
> Reviewed-on: http://review.gluster.org/16162
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Avra Sengupta <asengupt@redhat.com>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> Reviewed-by: N Balachandran <nbalacha@redhat.com>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
(cherry picked from commit e9d8525a0d34130ba2a582109937b8e79eecf6ab)
BUG: 1405450
Change-Id: Iaba0067f6e880a7fe38e11b9fa0fe9bd103b19e2
Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-on: http://review.gluster.org/16164
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Avra Sengupta <asengupt@redhat.com>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Enable data, metadata and entry self-heal as xlator-options so that glfs-heal.c
can heal split-brain files even if they are disabled on the volume via volume
set commands.
> Reviewed-on: http://review.gluster.org/11333
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
(cherry picked from commit 209c2d447be874047cb98d86492b03fa807d1832)
Change-Id: Ic191a1017131db1ded94d97c932079d7bfd79457
BUG: 1405130
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/16144
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Backport of: http://review.gluster.org/16075
Incorrect initialisation of local->optimistic_change_log was leading
to skipped pre-op and post-op even when a brick didn't participate in
the txn because it was down.
The result - missing granular name index resulting in some entries
never getting healed.
FIX:
Initialise local->optimistic_change_log just before pre-op.
Also fixed granular entry heal to create the granular name index in
pre-op as opposed to post-op. This is to prevent loss of granular
information when during an entry txn, the good (src) brick goes
offline before the post-op is done. This would cause self-heal to
do conservative merge (since dirty xattr is the only information
available), which when granular-entry-heal is enabled, expects
granular indices, the lack of which can lead to loss of data in
the worst case.
Change-Id: Ibc0fbfb3fa21c578e28868d9e30b274e33c12064
BUG: 1403646
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/16105
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Problem:
The issue as seen by the user is detailed in the BZ but what is
happening is if the no. of items in the wait queue == max-qlen,
syncop_mt_dir_scan() does a pthread_cond_wait until the launched
synctask workers dequeue the queue. But if for some reason the worker
fails, the queue is never emptied due to which further invocations of
syncop_mt_dir_scan() are blocked forever.
Fix: Made some changes to _dir_scan_job_fn
- If a worker encounters error while processing an entry, notify the
readdir loop in syncop_mt_dir_scan() of the error but continue to process
other entries in the queue, decrementing the qlen as and when we dequeue
elements, and ending only when the queue is empty.
- If the readdir loop in syncop_mt_dir_scan() gets an error form the
worker, stop the readdir+queueing of further entries.
> Reviewed-on: http://review.gluster.org/16073
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
(cherry picked from commit 2d012c4558046afd6adb3992ff88f937c5f835e4)
Change-Id: I39ce073e01a68c7ff18a0e9227389245a6f75b88
BUG: 1403192
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/16096
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
During snapd graph generation we should check if SSL is
enabled on main volume or not. This is because clients
will communicate with snapd as if it is communicating to
a brick.
> Reviewed-on: http://review.gluster.org/15979
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Kaushal M <kaushal@redhat.com>
(cherry picked from commit 182f0d12040dab5081ca645a3f370f65cd68b528)
Change-Id: I0d7fe86c567b297a8528a48faf06161d4c3cb415
Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
BUG: 1400459
Reviewed-on: http://review.gluster.org/15986
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Backport of: http://review.gluster.org/15747
When there are already existing non-granular indices created that are
yet to be healed, if granular-entry-heal option is toggled from 'off' to
'on', AFR self-heal whenever it kicks in, will try to look for granular
indices in 'entry-changes'. Because of the absence of name indices,
granular entry healing logic will fail to heal these directories, and
worse yet unset pending extended attributes with the assumption that
are no entries that need heal.
To get around this, a new CLI is introduced which will invoke glfsheal
program to figure whether at the time an attempt is made to enable
granular entry heal, there are pending heals on the volume OR there
are one or more bricks that are down. If either of them is true, the
command will be failed with the appropriate error.
New CLI: gluster volume heal <VOL> granular-entry-heal {enable,disable}
Change-Id: I342e0390f847fcb015a50ef58aedfcbcb58f4ed3
BUG: 1398501
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/15942
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Issue:
=====
In certain cases, there was no unwind of read
from read-ahead xlator, thus resulting in hang.
RCA:
====
In certain cases, ioc_readv() issues STACK_WIND_TAIL() instead
of STACK_WIND(). One such case is when inode_ctx for that file
is not present (can happen if readdirp was called, and populates
md-cache and serves all the lookups from cache).
Consider the following graph:
...
io-cache (parent)
|
readdir-ahead
|
read-ahead
...
Below is the code snippet of ioc_readv calling STACK_WIND_TAIL:
ioc_readv()
{
...
if (!inode_ctx)
STACK_WIND_TAIL (frame, FIRST_CHILD (frame->this),
FIRST_CHILD (frame->this)->fops->readv, fd,
size, offset, flags, xdata);
/* Ideally, this stack_wind should wind to readdir-ahead:readv()
but it winds to read-ahead:readv(). See below for
explaination.
*/
...
}
STACK_WIND_TAIL (frame, obj, fn, ...)
{
frame->this = obj;
/* for the above mentioned graph, frame->this will be readdir-ahead
* frame->this = FIRST_CHILD (frame->this) i.e. readdir-ahead, which
* is as expected
*/
...
THIS = obj;
/* THIS will be read-ahead instead of readdir-ahead!, as obj expands
* to "FIRST_CHILD (frame->this)" and frame->this was pointing
* to readdir-ahead in the previous statement.
*/
...
fn (frame, obj, params);
/* fn will call read-ahead:readv() instead of readdir-ahead:readv()!
* as fn expands to "FIRST_CHILD (frame->this)->fops->readv" and
* frame->this was pointing ro readdir-ahead in the first statement
*/
...
}
Thus, the readdir-ahead's readv() implementation will be skipped, and
ra_readv() will be called with frame->this = "readdir-ahead" and
this = "read-ahead". This can lead to corruption / hang / other problems.
But in this perticular case, when 'frame->this' and 'this' passed
to ra_readv() doesn't match, it causes ra_readv() to call ra_readv()
again!. Thus the logic of read-ahead readv() falls apart and leads to
hang.
Solution:
=========
Modify STACK_WIND_TAIL() as:
STACK_WIND_TAIL (frame, obj, fn, ...)
{
next_xl = obj /* resolve obj as the variables passed in obj macro
can be overwritten in the further instrucions */
next_xl_fn = fn /* resolve fn and store in a tmp variable, before
modifying any variables */
frame->this = next_xl;
...
THIS = next_xl;
...
next_xl_fn (frame, next_xl, params);
...
}
BUG: 1399018
Change-Id: Ie662ac8f18fa16909376f1e59387bc5b886bd0f9
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/15934
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
during crawl
Backport of: http://review.gluster.org/15880
If granular name indices are already in existence for a volume, and
before they are healed, granular entry heal be disabled, a crawl on
indices/xattrop will clear the changelogs on these directories. When
their corresponding entry-changes indices are crawled subsequently,
if it is found that the directories don't need heal anymore, the
granular indices are not cleaned up.
This patch fixes that problem by ensuring that the zero-xattrop
also deletes the stale indices at the level of index translator.
Change-Id: Iae0a560c1c9d37b083cad89f16d3dcf83c4f7dc7
BUG: 1398501
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/15927
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Backport of http://review.gluster.org/#/c/15005/9.
GlusterD as of now was blindly assuming that the brick port which was
already allocated would be available to be reused and that assumption
is absolutely wrong.
Solution : On first attempt, we thought GlusterD should check if the
already allocated brick ports are free, if not allocate new port and
pass it to the daemon. But with that approach there is a possibility
that if PMAP_SIGNOUT is missed out, the stale port will be given back
to the clients where connection will keep on failing. Now given the
port allocation always start from base_port, if everytime a new port
has to be allocated for the daemons, the port range will still be
under control. So this fix tries to clean up old port using
pmap_registry_remove () if any and then goes for pmap_registry_alloc ()
This patch is being ported to 3.8 branch because, the brick process
blindly re-using old port, without registering with the pmap server,
causes snapd daemon to not start properly, even though snapd registers
with the pmap server. With this patch, all the brick processes and
snapd will register with the pmap server to either get the same port,
or a new port, and avoid port collision.
> Reviewed-on: http://review.gluster.org/15005
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Avra Sengupta <asengupt@redhat.com>
(cherry picked from commit c3dee6d35326c6495591eb5bbf7f52f64031e2c4)
Change-Id: If54a055d01ab0cbc06589dc1191d8fc52eb2c84f
BUG: 1369766
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-on: http://review.gluster.org/15308
Tested-by: Avra Sengupta <asengupt@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Backport of http://review.gluster.org/15826
On recieving a rename fop, marker_rename() stores the,
oldloc and newloc in its 'local' struct, once the rename
is done, the xtime marker(last updated time) is set on
the file, but sending a setxattr fop. When upcall
receives the setxattr fop, the loc->inode is NULL and
it crashes. The loc->inode can be NULL only in one valid
case, i.e. in rename case where the inode of new loc
can be NULL. Hence, marker should have filled the inode
of the new_loc before issuing a setxattr.
> Reviewed-on: http://review.gluster.org/15826
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Kotresh HR <khiremat@redhat.com>
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
(cherry picked from commit 46e5466850311ee69e6ae9a11c2bba2aabadd5de)
Change-Id: Id638f678c3daaf4a5c29b970b58929d377ae8977
BUG: 1396418
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/15878
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
Halo prove tests were racy in a couple of ways. First, they raced
against the self-heal daemon (e.g. write to volume with two bricks
up and then assert that only two bricks have data file; but shd will
properly copy file to third brick sooner or later). Fix by disabling
shd in such tests.
Second, tests rely on pings to complete and set halo state as
expected, but do not check for this. If writing begins before initial
pings complete, all bricks may be up and receive the data. Fix by adding
explicit check for halo child states.
Test Plan:
prove tests/basic/halo*.t
(prior to this changeset, would fail within ~10 iterations on my
devserver and almost always on centos regression. Now runs overnight
without failure on my devserver).
Reviewers:
Subscribers:
Tasks:
Blame Revision:
Change-Id: If6823540dd4e23a19cc495d5d0e8b0c6fde9a3bd
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: http://review.gluster.org/16325
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- Implements "Hybrid" halo mounts which perform write FOPs synchronously
to all regions and read FOPs asynchronously to those nodes within the
region.
- Leverages the fact that the inode cache stores "hints" as to the
clean/dirty state of upto 16 replicas in the cluster; upon refresh of
an inode we can do a fresh lookup to get a "wise" node if none exist
in our region
- Activated via the cluster.halo-hybrid-mode option
Test Plan:
- Run AFR prove tests
- Run halo prove tests
Reviewers: kvigor, sshreyas
Reviewed By: kvigor, sshreyas
FB-commit-id: aca760757afd45e4de2e28dc31b87a73ee52f12d
Change-Id: Ic6728ce93b7b96e3151dccdebccd30e007f4750c
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: http://review.gluster.org/16306
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
|