| Commit message (Collapse) | Author | Age | Files | Lines |
|
Change-Id: I2ca0298ee9d166f58b8730256ea76a04e547ce5d
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
|
Differential Revision: https://phabricator.intern.facebook.com/D5927193
Change-Id: Ife04c8738b9ee721e7be9bc843b2f6d54bbb468e
|
This rolls up multiple patches related to namespace identification and
throttling/QoS. This primarily includes the following, all by Michael
Goulet <mgoulet@fb.com>.
io-threads: Add weighted round robin queueing by namespace
https://phabricator.facebook.com/D5615269
io-threads: Add per-namespaces queue sizes to IO_THREADS_QUEUE_SIZE_KEY
https://phabricator.facebook.com/D5683162
io-threads: Implement better slot allocation algorithm
https://phabricator.facebook.com/D5683186
io-threads: Only enable weighted queueing on bricks
https://phabricator.facebook.com/D5700062
io-threads: Update queue sizes on drain
https://phabricator.facebook.com/D5704832
Fix parsing (-1) as default NS weight
https://phabricator.facebook.com/D5723383
Parts of the following patches have also been applied to satisfy
dependencies.
io-throttling: Calculate moving averages and throttle offending hosts
https://phabricator.fb.com/D2516161
Shreyas Siravara <sshreyas@fb.com>
Hook up ODS logging for FUSE clients.
https://phabricator.facebook.com/D3963376
Kevin Vigor <kvigor@fb.com>
Add the flag --skip-nfsd-start to skip starting the NFS daemon, even if
it is enabled
https://phabricator.facebook.com/D4575368
Alex Lorca <alexlorca@fb.com>
There are also some "standard" changes: dealing with code that moved,
reindenting to comply with Gluster coding standards, gf_uuid_xxx, etc.
This patch *does* revert some changes which have occurred upstream since
3.6; these will be re-applied as appropriate on top of this new base.
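As an illustration of the weighted round-robin queueing by namespace listed above, here is a minimal Python sketch. It is not the io-threads code (which is C); the class name, the namespace names, the weights, and the naive slot table are all assumptions made for the example, and the "better slot allocation algorithm" from the rolled-up patches is not reproduced.

```python
from collections import deque


class WeightedRoundRobin:
    """Hypothetical sketch: one FIFO queue per namespace, served by weight."""

    def __init__(self, weights):
        # weights maps namespace -> positive integer weight (made-up values).
        self.queues = {ns: deque() for ns in weights}
        # Naive slot table: each namespace appears `weight` times.
        self.slots = [ns for ns, w in weights.items() for _ in range(w)]
        self.pos = 0

    def enqueue(self, ns, request):
        self.queues[ns].append(request)

    def dequeue(self):
        # Walk the slot table at most once, skipping namespaces with no work.
        for _ in range(len(self.slots)):
            ns = self.slots[self.pos]
            self.pos = (self.pos + 1) % len(self.slots)
            if self.queues[ns]:
                return ns, self.queues[ns].popleft()
        return None  # every queue is empty


wrr = WeightedRoundRobin({"ns_a": 2, "ns_b": 1})
for i in range(4):
    wrr.enqueue("ns_a", f"a{i}")
    wrr.enqueue("ns_b", f"b{i}")
order = [wrr.dequeue()[1] for _ in range(6)]
```

With weights 2:1, ns_a is served twice for every request served from ns_b while both have work queued; once ns_a drains, its slots are skipped.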
Change-Id: I69024115da7a60811e5b86beae781d602bdb558d
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
|
Summary:
Currently the bricks can open any mount directory from the given volume. This patch adds a provision to prevent
bricks from opening brick directories that weren't created for them. This will help with operating gluster at large
scale.
We add a new xattr GF_XATTR_BRICK_NAME to the brick directory. When we start a brick daemon, we make sure the path on
disk matches the config provided. For backward compatibility, if GF_XATTR_BRICK_NAME has no value we skip the check
and set it to the current brick daemon's path.
We ignore GF_XATTR_BRICK_NAME during healing and reset GF_XATTR_BRICK_NAME on brick replace.
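The startup check described above can be sketched roughly as follows. This is an illustrative Python sketch, not the actual brick code: the dict stands in for the extended attributes on the brick directory, and the key string "brick-name" is a placeholder, since the real value behind GF_XATTR_BRICK_NAME is not given here.

```python
def check_brick_path(xattrs, configured_path, key="brick-name"):
    """Hypothetical sketch of the brick startup check.

    xattrs: dict standing in for the xattrs on the brick directory.
    key: placeholder for GF_XATTR_BRICK_NAME (real name not shown in the source).
    """
    recorded = xattrs.get(key)
    if recorded is None:
        # Backward compatibility: an unlabeled directory is adopted by
        # this daemon and labeled with its configured path.
        xattrs[key] = configured_path
        return True
    # Refuse a directory that was created for a different brick.
    return recorded == configured_path
```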
Test Plan: Run fb-smoke
Reviewers: jdarcy, sshreyas
Reviewed By: sshreyas
Differential Revision: https://phabricator.intern.facebook.com/D5448921
Porting note: disabled some checks to deal with the snapshot case
Change-Id: I98e62033dfd07f30ad3b99ac003ce94c8d935e5f
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18275
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- Enables multi-core epoll support in the nfs daemon.
- Option can be turned on using:
gluster volume set <volname> nfs.event-threads <numthreads>
Test Plan: Prove test!
Reviewers: kvigor, rwareing
Reviewed By: rwareing
Subscribers: dld, moox, dph
Differential Revision: https://phabricator.fb.com/D3117966
Change-Id: Ie8a7b1ba04b0e83f5ec7a09f9d181fe59be479ca
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18266
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- This diff adds support for detecting and tracking idle client connections.
- It allows *service translators* (server, nfs) to opt-in to detect and close idle client connections.
- Right now it explicitly restricts the service to NFS as a safety.
Here are the debug logs when a client connection gets closed:
[2016-03-29 17:27:06.154232] W [socket.c:2426:socket_timeout_handler] 0-socket: Shutting down idle client connection (idle=20s,fd=20,conn=[2401:db00:11:d0af:face:0:3:0:957]->[2401:db00:11:d0af:face:0:3:0:2049])!
[2016-03-29 17:27:06.154292] D [event-epoll.c:655:__event_epoll_timeout_slot] 0-epoll: Connection on slot->fd=9 was idle for 20 seconds!
[2016-03-29 17:27:06.163282] D [socket.c:629:__socket_rwv] 0-socket.nfs-server: EOF on socket
[2016-03-29 17:27:06.163298] D [socket.c:2474:socket_event_handler] 0-transport: disconnecting now
[2016-03-29 17:27:06.163316] D [event-epoll.c:614:event_dispatch_epoll_handler] 0-epoll: generation bumped on idx=9 from gen=4 to slot->gen=5, fd=20, slot->fd=20
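A rough Python sketch of the idle-tracking idea follows. It is not the socket/epoll code from this diff; the class, its method names, and the injectable clock are assumptions chosen to keep the example testable.

```python
import time


class IdleTracker:
    """Hypothetical sketch: remember the last activity time per connection."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock
        self.last_seen = {}  # fd -> timestamp of last activity

    def touch(self, fd):
        # Call on every read/write event for the connection.
        self.last_seen[fd] = self.clock()

    def reap(self):
        # Return (and forget) the fds that have been idle for the timeout.
        now = self.clock()
        idle = [fd for fd, t in self.last_seen.items()
                if now - t >= self.timeout_s]
        for fd in idle:
            del self.last_seen[fd]
        return idle
```

A caller would shut down each fd returned by `reap()`, which corresponds to the "Shutting down idle client connection" message in the logs above.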
Test Plan: - Used stuck NFS mounts to create idle clients and unstuck them.
Reviewers: kvigor, rwareing
Reviewed By: rwareing
Subscribers: dld, moox, dph
Differential Revision: https://phabricator.fb.com/D3112099
Change-Id: Ic06c89e03f87daabab7f07f892390edd1a1fcc20
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18265
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
Add outstanding-req field to track requests that have been sent
down the stack and haven't come back.
This is a port of D4908836 to 3.8
Reviewers: sshreyas
Change-Id: I5870f63008d553416109c1808a434f526f5a633d
Reviewed-on: https://review.gluster.org/18236
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
Two new volume options that control reads.
performance.io-cache.read-size
- Tells gluster how much it should try to read on each posix_readv call
performance.io-cache.min-cached-read-size
- Tells gluster the smallest files it should start caching, anything smaller is not cached
This is a port of D4844662 to 3.8
Change-Id: I5ba891906f97e514e7365cc34374619379434766
Reviewed-on: https://review.gluster.org/18235
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
Too many hard links blow up btrfs by exceeding the max xattr size (recording a
pgfid for each hardlink). Add a limit to prevent this explosion.
This is a port of D4682329 to 3.8
Reviewed By: sshreyas
Change-Id: I614a247834fb8f2b2743c0c67d11cefafff0dbaa
Reviewed-on: https://review.gluster.org/18232
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
AFR currently waits for all children to respond before sending an UP
message. This means that one dead host can cause us to wait a TCP
timeout (2 mins!) before declaring the volume up.
Now we send an UP as soon as quorum is obtained.
This is a port of D4701919 to 3.8.
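To make the change in behaviour concrete, here is a small Python sketch. It assumes quorum means a strict majority of children, which may not match the configured AFR quorum policy; the function names are hypothetical.

```python
def has_quorum(children_up, total_children):
    # Assumption for this sketch: quorum is a strict majority of children.
    return children_up > total_children // 2


def events_until_up(events, total_children):
    """Return how many child responses arrive before UP can be sent.

    events: booleans in arrival order (True = child reported up).
    Returns None if quorum is never reached.
    """
    up = 0
    for i, child_is_up in enumerate(events, start=1):
        if child_is_up:
            up += 1
        if has_quorum(up, total_children):
            return i
    return None
```

With three children, UP can be sent after the second positive response, without waiting for a dead third host to time out.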
Reviewed By: sshreyas
Change-Id: I642d4eb7dc7e0b289e89b7a16abf99a3f98aa8b3
Reviewed-on: https://review.gluster.org/18231
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- This diff adds error counts and rates to the regular io-stats dump.
- It outputs keys that look like this:
"storage.gluster.nfsd.groot.aggr.errors.<error_name>.count": "6",
"storage.gluster.nfsd.groot.inter.errors.<error_name>.per_sec": "0.00"
- <error_name> is the lowercase representation of errno values (e.g., ENOENT -> enoent, etc.)
- This is a port of D4691581 to 3.8
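The key format above can be reproduced with a short Python sketch. The function and its parameters are assumptions for illustration; only the key layout and the errno lowercasing come from the description.

```python
import errno


def error_keys(prefix, aggr_counts, interval_counts, interval_s):
    """Hypothetical sketch: build io-stats style error keys.

    aggr_counts / interval_counts map errno values to occurrence counts.
    """
    out = {}
    for code, n in aggr_counts.items():
        name = errno.errorcode[code].lower()  # e.g. ENOENT -> enoent
        out[f"{prefix}.aggr.errors.{name}.count"] = str(n)
    for code, n in interval_counts.items():
        name = errno.errorcode[code].lower()
        out[f"{prefix}.inter.errors.{name}.per_sec"] = f"{n / interval_s:.2f}"
    return out


keys = error_keys("storage.gluster.nfsd.groot",
                  {errno.ENOENT: 6}, {errno.ENOENT: 0}, 5.0)
```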
Reviewers: dph, kvigor
Reviewed By: kvigor
Change-Id: I96857d4283c47f9d330ae1978f113013e7c78a87
Reviewed-on: https://review.gluster.org/18230
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
This translator tags namespaces with a unique hash that corresponds to the
top-level directory (right under the gluster root) of the file the fop acts
on. The hash information is injected into the call frame by this translator,
so this namespace information can later be used to do throttling, QoS and
other namespace-specific stats collection and actions in later xlators
further down the stack.
When the translator can't find a path directly for the fd_t or loc_t, it winds
a GET_ANCESTRY_PATH_KEY down to the posix xlator to get the path manually.
Caching this namespace information in the inode makes sure that most requests
don't need to recalculate the hash, so that typically fops are just doing an
inode_ctx_get instead of the more expensive code paths that this xlator can take.
Right now the xlator is hard-coded to only hash the top-level directory, but
this could be easily extended to more sophisticated matching by modification
of the parse_path function.
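The tagging and caching described above might look like this in outline. This is a Python sketch under stated assumptions: crc32 is a stand-in hash, the dict stands in for the inode context, and this `parse_path` is a hypothetical reimplementation, not the xlator's function.

```python
import zlib


def parse_path(path):
    """Return the top-level directory of an absolute path, or None for the root."""
    parts = [p for p in path.split("/") if p]
    return parts[0] if parts else None


def namespace_hash(path):
    ns = parse_path(path)
    if ns is None:
        return None
    # crc32 is a stand-in; the hash used by the xlator is not specified here.
    return zlib.crc32(ns.encode("utf-8"))


_inode_ctx = {}  # stand-in for the per-inode context cache


def tag_fop(inode_id, path):
    # Fast path: reuse the cached namespace hash for this inode.
    if inode_id not in _inode_ctx:
        _inode_ctx[inode_id] = namespace_hash(path)
    return _inode_ctx[inode_id]
```

Extending the matching beyond the top-level directory would only require changing `parse_path` in this sketch, which mirrors the extension point named above.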
Test Plan:
Run `prove -v tests/basic/namespace.t` to see that tagging works.
Change-Id: I960ddadba114120ac449d27a769d409cc3759ebc
Reviewed-on: https://review.gluster.org/18041
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- Fixes the unnecessary log spew in other daemons
- This is a port of D3646627 to 3.8
Reviewers: rwareing, kvigor
Reviewed By: kvigor
Change-Id: Id54ab41cdfdd2006d3af2d8774c38025c566c523
Reviewed-on: https://review.gluster.org/18199
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- Adds the ability for gluster to log every single CREATE and UNLINK that happens on the bricks (right before invoking sys_unlink() or open(...| O_CREAT))
- Makes it so that CREATEs and UNLINKs are not downsampled in io-stats
- This is a port of D3268156, D3778968, D3903894 & D3301527 to 3.8
Reviewed By: kvigor
Change-Id: I1bce28068c02b7d202f094094237646b4d39794b
Reviewed-on: https://review.gluster.org/18198
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- Adds iamshd (iamnfsd already there due to fop throttling)
options to io-stats xlator.
- Leverages these options to correctly write multi-volume NFSd stats
- This is a port of D2714648 to 3.8
Test Plan:
- Tested on local dev server, verified multiple files are generated for
multiple vols
Change-Id: Id2014a135fe52045da462eaaa91f336f45cdf167
Reviewed-on: https://review.gluster.org/18195
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- We noticed some folks name their files all the way up to NAME_MAX (usually 255) and when split-brain is encountered, we fail to heal the file.
- This diff puts an upper bound on the number of bytes we will snprintf into the buffer so that we do not fail the rename.
- This is a port of D3646254 to 3.8
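One way to bound the generated name is sketched below in Python. The ".<filename>_<suffix>" layout and the fixed NAME_MAX of 255 are assumptions for the example; the actual fix limits the bytes passed to snprintf in the C code.

```python
NAME_MAX = 255  # typical limit; the real value depends on the filesystem


def bounded_rename_target(filename, suffix):
    """Hypothetical sketch: build '.<filename>_<suffix>' within NAME_MAX bytes."""
    raw = filename.encode("utf-8")
    # Reserve room for the suffix plus the leading '.' and the '_'.
    budget = NAME_MAX - len(suffix.encode("utf-8")) - 2
    # Truncate by bytes, dropping any partial multi-byte character.
    kept = raw[:budget].decode("utf-8", errors="ignore")
    return f".{kept}_{suffix}"
```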
Test Plan: Prove test -- can show it fails without patch as well.
Reviewers: #posix_storage, rwareing
Reviewed By: rwareing
Change-Id: I51c6b28374d4a3f21e29044cb727b4b1da7b69e1
Reviewed-on: https://review.gluster.org/18194
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- Our current approach to measuring "average fop latency" is badly
flawed in that it doesn't weight the FOPs correctly according to how
many occurred in the time interval. This makes Statisticians very
sad. This patch adds an internally computed weighted average
latency which will be far more efficient to display via ODS, as well
as having the benefit of not being complete nonsense.
- This is a port of D3148415 & D3405772 to 3.8
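A small worked example shows why the weighting matters. This is an illustrative Python sketch; the function name and the sample numbers are made up.

```python
def weighted_average_latency(samples):
    """samples: iterable of (fop_count, average_latency) per FOP type."""
    samples = list(samples)
    total = sum(count for count, _ in samples)
    if total == 0:
        return 0.0
    return sum(count * avg for count, avg in samples) / total


# 990 fast FOPs at 1.0 and 10 slow FOPs at 100.0 (made-up figures).
samples = [(990, 1.0), (10, 100.0)]
naive = sum(avg for _, avg in samples) / len(samples)
weighted = weighted_average_latency(samples)
```

Here the unweighted mean of the two per-FOP averages is 50.5, while the count-weighted average is 1.99, which reflects what most requests actually experienced.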
Reviewers: kvigor, dph, sshreyas
Reviewed By: sshreyas
Change-Id: Ie3618f279b545610b7ed1a8482243fcc8dc53217
Reviewed-on: https://review.gluster.org/18192
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
Change-Id: Ic77287c1b96ae426b927b4bf6f2826d6f3a3b17d
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18175
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|\
| |
| |
| | |
Change-Id: Ie35cd1c8c7808949ddf79b3189f1f8bf0ff70ed8
|
The aux mount is created on the first limit/remove_limit/list command
and it remains until the volume is stopped or deleted, or quota is disabled,
at which point we do a lazy unmount. If the process is uncleanly terminated,
then the mount entry remains and we get a (Transport disconnected) error
on subsequent attempts to run quota list/limit-usage/remove commands.
Second issue: there is also a risk of an inadvertent rm -rf on
/var/run/gluster causing data loss for the user. Ideally, /var/run is
a temp path for application use and should not cause any data loss to
persistent storage.
Solution:
1) unmount the aux mount after each use.
2) clean stale mount before mounting, if any.
One caveat with doing mount/unmount on each command is that we cannot
use the same mount point for both list and limit commands.
The reason for this is that the list command needs the mount to be accessible
in the cli after the response from glusterd, so it could be unmounted by a
limit command executed in parallel (had we used the same mount point).
Hence we use separate mount points for list and limit commands.
> Reviewed-on: https://review.gluster.org/16938
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Manikandan Selvaganesh <manikandancs333@gmail.com>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
> (cherry picked from commit 2ae4b4058691b324535d802f4e6d24cce89a10e5)
Change-Id: I4f9e39da2ac2b65941399bffb6440db8a6ba59d0
BUG: 1449782
Signed-off-by: Sanoj Unnikrishnan <sunnikri@redhat.com>
Reviewed-on: https://review.gluster.org/17242
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
georep gsyncd's xtime needs to be filtered irrespective
of any process access.
This way, we can avoid (unnecessarily) syncing the xtime attribute
to the slave, which may raise permission denied errors.
The test case is modified to check for the xtime xattr only in the backend.
Backport of:
>Change-Id: I2390b703048d5cc747d91fa2ae884dc55de58669
>BUG: 1353952
>Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com>
>Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
>Reviewed-on: https://review.gluster.org/14880
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>Reviewed-by: Kotresh HR <khiremat@redhat.com>
>Tested-by: Kotresh HR <khiremat@redhat.com>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Change-Id: Ibdee6f3093648a7e0fb1e2b6be8172e604ab657f
BUG: 1441574
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: https://review.gluster.org/17045
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
Summary:
- This diff exposes the io-thread queue depths by sending a specialized getxattr() call down to the io-threads translator.
- Port of D3086477, D3094145, D3095505 to 3.8
Test Plan: Tested on devserver, will run prove tests. Valgrind + ASAN pass as well.
Reviewers: rwareing, kvigor
Subscribers: dld, moox, dph
Differential Revision: https://phabricator.fb.com/D3086477
Change-Id: Ia452a4fcdb9173a751c4cb48d739b25c235f6855
Reviewed-on: https://review.gluster.org/18143
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
|
Summary:
Every once in a while rpcbind crashes and the NFS endpoints go bye-bye.
This diff makes it such that we should almost never encounter the case
where we have NFS up and rpcbind down causing bad endpoints and hanging
mounts for our customers.
Test Plan: Added prove tests + tested on dev server
Reviewers: dph, moox, rwareing
Reviewed By: rwareing
Differential Revision: https://phabricator.fb.com/D2571724
Tasks: 8803558
Change-Id: I35acb2d731185a7b20020cb57bdd4d879e978df4
Signature: t1:2571724:1445555327:3276a4dcc4da71346b09d4aeb46c69dddcc7c5ba
Reviewed-on: https://review.gluster.org/17961
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- PGFID healing should not be triggered in the case where there is
nothing to do (ret = 2). Instead this return code should be returned
to the heal daemon to trigger the reap of the entry.
- Reworked shd-pgfid-heal.t to queue up heal naturally instead of
synthetically
Test Plan: - Run tests/basic/afr/shd-pgfid-heal.t
Differential Revision: https://phabricator.fb.com/D2748578
Change-Id: I74300de2b4dce23867f4111548de35f58bf77453
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/17936
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
|
Summary:
PGFID healing enables heals which might otherwise fail due
to the lack of an entry heal to succeed, by performing
the entry healing within the same heal flow.
It does this by leveraging the PGFID tracking feature of
the POSIX xlator, and examining lookup replies for the
PGFID attribute. If detected, the pgfid will be decoded
and stored for later use in case the heal fails for whatever
reason. Cascading heal failures are handled through
recursion.
This feature is critical for a few reasons:
1. General healing predictability - When the SHD
attempts to heal a given GFID, it should be able
to do so without having to wait for some other
dependent heal to take place.
2. Reliability - In some cases the parent directory
may require healing, but the req'd entry in the
indices/xattrop directory may not exist
(e.g. bugs/crashes etc). Prior to PGFID heal support
some sort of external script would be required to
queue up these heals by using FS specific utilities
to lookup the parent directory by hardlink or
worse...do a costly full heal to clean them up.
3. Performance - In combination with multi-threaded SHD
this feature will make SHD healing _much_ faster as
directories with a large number of files to be healed
will no longer have to wait for an entry heal to
come along; the first file in that directory queued
for healing will trigger an entry heal for the directory
and this will allow the other files in that directory
to be (immediately) healed in parallel.
Test Plan:
- run prove tests/basic/afr/shd_pgfid_heal.t
- run prove tests/basic/afr/shd*.t
- run prove tests/basic/afr/gfid*.t
Differential Revision: https://phabricator.fb.com/D2546133
Change-Id: I25f586047f8bcafa900c0cc9ee8f0e2128688c73
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/17929
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
|
Summary:
cache calls to statfs
- io-cache must be enabled
- then enable statfs caching
- also can configure an independent cache time
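The behaviour of a statfs cache with its own timeout can be sketched as follows. This is a Python illustration, not the io-cache code; the class name, the `fetch` callable, and the injectable clock are assumptions.

```python
import time


class StatfsCache:
    """Hypothetical sketch: cache one statfs result for `timeout_s` seconds."""

    def __init__(self, fetch, timeout_s, clock=time.monotonic):
        self.fetch = fetch          # callable that performs the real statfs
        self.timeout_s = timeout_s  # the independent cache time
        self.clock = clock
        self.valid = False
        self.value = None
        self.stamp = 0.0

    def statfs(self):
        now = self.clock()
        # Refresh on first use or once the cached entry has expired.
        if not self.valid or now - self.stamp >= self.timeout_s:
            self.value = self.fetch()
            self.stamp = now
            self.valid = True
        return self.value
```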
Test Plan: unit test basic/cache.t
Reviewers: rwareing, sshreyas
Subscribers: rappleye
Differential Revision: https://phabricator.fb.com/D2524471
Change-Id: I55e0a773f9e24c2358d6fbbabbaf58bd5bd89ffc
Tasks: 8618383
Reviewed-on: https://review.gluster.org/17771
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- exports_auth changed to a per-volume option
- parse exports_auth in nfs3.c
- set nfs3_export state for exports_auth
- all calls into mnt3_authenticate_request must pass in volname
- volname is checked to determine if auth is enabled for that volume
Test Plan: manual testing, will look into unit testing
Reviewers: rwareing, sshreyas
Reviewed By: sshreyas
Subscribers: rappleye
Differential Revision: https://phabricator.fb.com/D2519423
Tasks: 6863942
Change-Id: Ia9fd92ca5a5bd4cbb57e9ce61075f024ab7dbc27
Signature: t1:2519423:1444775772:24dc39e22684784b75899e97e9d1e294b059a077
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/17762
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
|
Summary:
- Updates heal flow to handle the case where a directory does not have a
gfid assigned. In this case we will remove _only_ empty directories
so that the parent can regain consistency and files
within can be correctly healed.
- Also adds a test for the case where a file does not have a gfid; this
is already handled by the metadata heal flow, but tests were lacking
for this code path.
Test Plan:
- prove -v tests/basic/shd_autofix_nogfid.t
- prove -v tests/basic/gfid_unsplit_shd.t
Reviewers: dph, moox, sshreyas
Reviewed By: sshreyas
Differential Revision: https://phabricator.fb.com/D2502067
Tasks: 8549168
Change-Id: I8dd3e6a6d62807cb38aafe597eced3d4b402351b
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/17750
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
|
Accidentally pushed this directly through the branch instead of through
Gerrit.
Change-Id: Ieedd2f71887cca91a6f1d31bc3cddfc489fc9fa6
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/17749
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
|
Summary:
- This change ensures SHD always inspects directories which are queued
for healing; i.e. it will not exclusively trust the wise-fool
algorithm as there are cases where the change log simply isn't
correct (bugs, crashes, etc). Failing to perform the entry heal
in these cases will result in data heals failing to take place.
- We made a similar change in 3.4.x for similar reasons
Test Plan: - Run prove -v tests/basic/shd_force_inspect.t
Reviewers: moox, dph, sshreyas
Reviewed By: sshreyas
Differential Revision: https://phabricator.fb.com/D2492993
Tasks: 8549168
Signature: t1:2492993:1443740894:7cf07168ca09946df9d8f96a3085fe2d3c201543
Change-Id: I2d8e1cbecbbca720cc3ee988d7aae08bea0a5453
|
Summary:
- A few improvements: handle type mis-matches (e.g. dir/file mis-matches), added an option to control whether gfid unsplits will happen, ensured entry healing will happen in the gfid mis-match case when the option is enabled.
- Added prove test to cover entry healing & type mis-match cases
- Enable metadata split-brain resolution by default
- Enable gfid split-brain resolution by default
- Fix gfid unsplit logging bugs where it was showing null GFIDs instead of the actual chosen GFIDs
Test Plan:
- run prove -v test/basic/gfid_unsplit*
- Ran valgrind to verify leak-free state
Reviewers: moox, sshreyas
Reviewed By: sshreyas
Change-Id: Id67ddc728745ebbbaf7bdd3f9a5549e5a4cc4a20
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Change-Id: I4181233f9ba7f61ccd2ba91f0874eb2ac7cd40b5
Manually-merged-by: Jeff Darcy <jdarcy@fb.com>
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/17739
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
|
Summary:
- The prior patch did not re-run the gfid-mismatch flow after doing the
unsplit. I think it is prudent to re-validate that the unsplit worked, as
well as to allow the code to continue from where it effectively left off.
Test Plan: - Run prove -v tests/basic/gfid_unsplit.t
Reviewers: dph, moox, sshreyas
Reviewed By: sshreyas
Change-Id: Ib3ed40f3db38c89090a876d7af3a1b2a303539d5
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/17729
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
|
Summary:
- v3.6.3 port of non-destructive GFID unsplit-brain code, almost a re-write for AFR2, but the original behavior lives on.
- This feature allows the GlusterFS filesystem to automagically resolve GFID split-brain situations by choosing the authoritative file based on the last modification time. Other policies such as majority or size are also possible but not implemented just yet.
- Core feature to Halo Geo-Replication, as this (gfid) form of split-brain is an everyday possibility with async mounts, so there needs to be an automated & scalable method to resolve them via the SHD or optionally in-line by FUSE clients or NFS daemons.
- Operational notes:
1. Files or directory entries are supported; you can even write files into a directory and they will not be lost.
2. Streamed writes to a file are fully supported while a split-brain resolution happens, i.e. the writes will not be interrupted while the unsplit takes place.
3. Un-split (ones which are determined not to be "authoritative") files are renamed like so: ".<filename>_<random uuid>"
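The selection policy and the rename pattern can be sketched in Python as below. The record layout (dicts with 'brick', 'gfid', 'mtime') and the function names are assumptions for illustration; the AFR code works on lookup replies, not on dicts like these.

```python
import uuid


def pick_authoritative(replicas):
    """Hypothetical sketch: the copy with the latest mtime wins.

    replicas: list of dicts with 'brick', 'gfid' and 'mtime' keys.
    Returns (winner, losers), where losers carry a different gfid.
    """
    winner = max(replicas, key=lambda r: r["mtime"])
    losers = [r for r in replicas if r["gfid"] != winner["gfid"]]
    return winner, losers


def unsplit_name(filename):
    # Matches the ".<filename>_<random uuid>" pattern from the notes above.
    return f".{filename}_{uuid.uuid4()}"


replicas = [
    {"brick": "b0", "gfid": "g1", "mtime": 100},
    {"brick": "b1", "gfid": "g2", "mtime": 250},
    {"brick": "b2", "gfid": "g2", "mtime": 240},
]
winner, losers = pick_authoritative(replicas)
```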
Test Plan:
- Run prove -v tests/basic/gfid_unsplit.t
- Test output: https://phabricator.fb.com/P20041740
Reviewers: moox, dph, sshreyas
Reviewed By: sshreyas
Differential Revision: https://phabricator.fb.com/D2479409
Signature: t1:2479409:1443208319:373218aa9758a1b48db23ea5e211ec303fa92e64
Blame Revision: Change-Id: I5b3d2e79fad74b4372c02b86219e8ee98f5e29dc
Change-Id: I8ef719bcccb19ab6674647e02b72e1b36155fed9
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/17720
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
Change-Id: I4074e7cce8f6782860f849780ab6d0458e92a2ce
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/17708
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
|
Change-Id: I977f94ebc1630bdf46fd28d310f433c1c7d327f5
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/17286
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kevin Vigor <kvigor@fb.com>
|
These were causing hung tests when the bricks were restarted, in some
cases even accompanied by kernel stack traces involving fuse. They're
unnecessary to the purpose of the test.
Change-Id: I3c6c485324e2ee9418eb54929015b5bae436e9ea
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/17284
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kevin Vigor <kvigor@fb.com>
|
Change-Id: I163656985185c710e20ad80824e14645020522cd
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/17279
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kevin Vigor <kvigor@fb.com>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
When a netgroup is marked as rw in the exports file, and another netgroup is marked as ro for the same share,
the ro option is not honored. This diff fixes that bug.
Test Plan: Added a test and verified that it passes with this patch and does not pass without this patch.
Reviewers: rwareing, dph, moox
Reviewed By: moox
FB-commit-id: 2d36d2d
Change-Id: Ia394f36472f094a62ddfedc0c8fd5d95e247b4b0
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: https://review.gluster.org/16908
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
This test is failing consistently on both the upstream 3.98 branch and
3.8-fb, just nuke.
Test Plan:
Check in, hope smoke test passes.
Reviewers:
Subscribers:
Tasks:
Blame Revision:
Change-Id: I888a9f424340dff128a4a5273f96e6d3fbc323a9
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: https://review.gluster.org/16647
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
Test relied on rmtab setting to find NFS config dir, but in FB branch
rmtab is disabled by default. Just hardcoded the location (test already
contained hardcoded path, as it turns out).
Test Plan:
Reviewers:
Subscribers:
Tasks:
Blame Revision:
Change-Id: I4a50a00ed550832ca8d91981e6c5af4d8c81b466
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: http://review.gluster.org/16336
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|\|
| |
| |
| |
| | |
Change-Id: I844adf2aef161a44d446f8cd9b7ebcb224ee618a
Signed-off-by: Kevin Vigor <kvigor@fb.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Problem:
In CentOS-7, the file was receiving an extra removexattr(security.ima)
FOP which changed its ctime, breaking the assumption that a particular brick
had the latest ctime based on the writevs done in the .t
Fix:
1. Compare the ctime of both files in the backend and pick the one with
the latest ctime for the fav-child policy. Also unmount the volume
before comparing, to avoid any further FOPS on the file that
can possibly modify the timestamps.
2. Added floating point handling in stat function. Thanks to Pranith for
helping debug the regex.
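
The ctime comparison in step 1 can be sketched as below. This is an illustrative Python sketch, not the test's actual shell code: the function name `latest_ctime_brick` and the brick labels are hypothetical, and the input is assumed to be one ctime string per brick, possibly with a fractional part. Parsing with `Decimal` keeps the sub-second digits exact, which is the point of the floating point handling mentioned in step 2.

```python
from decimal import Decimal


def latest_ctime_brick(ctimes):
    """Return the brick whose copy of the file has the latest ctime.

    `ctimes` maps a brick label (hypothetical) to the ctime of the file on
    that brick, given as a string such as "1483012345.250000000".
    Decimal parsing accepts both integer and fractional timestamps and
    does not lose nanosecond digits the way a float could.
    """
    if not ctimes:
        raise ValueError("need at least one brick")
    return max(ctimes, key=lambda brick: Decimal(ctimes[brick]))
```

For example, two copies that differ only in the last nanosecond digit still compare correctly, so the later one would be picked for the fav-child policy.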
> Reviewed-on: http://review.gluster.org/16288
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
(cherry picked from commit 76fff8cb2a164b596ca67e65c99623f5b68361fd)
Change-Id: I06041a0f39a29d2593b867af8685d65c7cd99150
BUG: 1410073
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/16324
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Enable data, metadata and entry self-heal as xlator-options so that glfs-heal.c
can heal split-brain files even if they are disabled on the volume via volume
set commands.
> Reviewed-on: http://review.gluster.org/11333
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
(cherry picked from commit 209c2d447be874047cb98d86492b03fa807d1832)
Change-Id: Ic191a1017131db1ded94d97c932079d7bfd79457
BUG: 1405130
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/16144
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Backport of: http://review.gluster.org/15747
When there are already existing non-granular indices created that are
yet to be healed, if granular-entry-heal option is toggled from 'off' to
'on', AFR self-heal whenever it kicks in, will try to look for granular
indices in 'entry-changes'. Because of the absence of name indices,
granular entry healing logic will fail to heal these directories, and
worse yet unset pending extended attributes with the assumption that
there are no entries that need heal.
To get around this, a new CLI is introduced which will invoke the glfsheal
program to figure out whether, at the time an attempt is made to enable
granular entry heal, there are pending heals on the volume OR there
are one or more bricks that are down. If either of them is true, the
command will fail with the appropriate error.
New CLI: gluster volume heal <VOL> granular-entry-heal {enable,disable}
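
The precondition described above can be sketched as follows. This is a minimal Python illustration of the decision only, not the CLI or glfsheal code: `can_enable_granular_entry_heal`, its parameters, and the error strings are hypothetical names chosen for the example.

```python
def can_enable_granular_entry_heal(pending_heals, bricks_up):
    """Decide whether enabling granular entry heal should be allowed.

    `pending_heals` is the number of entries still awaiting heal and
    `bricks_up` is a list of booleans, one per brick (both hypothetical
    inputs). Returns (allowed, reason); reason is empty when allowed.
    """
    if pending_heals > 0:
        return False, "pending heals exist on the volume"
    if not all(bricks_up):
        return False, "one or more bricks are down"
    return True, ""
```

Refusing the toggle in both cases avoids the situation where old non-granular indices are looked up as granular ones and their pending xattrs are cleared without a heal.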
Change-Id: I342e0390f847fcb015a50ef58aedfcbcb58f4ed3
BUG: 1398501
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/15942
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
during crawl
Backport of: http://review.gluster.org/15880
If granular name indices are already in existence for a volume, and
before they are healed, granular entry heal is disabled, a crawl on
indices/xattrop will clear the changelogs on these directories. When
their corresponding entry-changes indices are crawled subsequently,
if it is found that the directories don't need heal anymore, the
granular indices are not cleaned up.
This patch fixes that problem by ensuring that the zero-xattrop
also deletes the stale indices at the level of index translator.
Change-Id: Iae0a560c1c9d37b083cad89f16d3dcf83c4f7dc7
BUG: 1398501
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/15927
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
Halo prove tests were racy in a couple of ways. First, they raced
against the self-heal daemon (e.g. write to volume with two bricks
up and then assert that only two bricks have data file; but shd will
properly copy file to third brick sooner or later). Fix by disabling
shd in such tests.
Second, tests rely on pings to complete and set halo state as
expected, but do not check for this. If writing begins before initial
pings complete, all bricks may be up and receive the data. Fix by adding
explicit check for halo child states.
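
The second fix amounts to polling for the expected state before starting to write. A generic form of that pattern is sketched below in Python; the real tests are shell scripts, and `wait_until` with its parameters is a hypothetical helper written for this illustration.

```python
import time


def wait_until(predicate, timeout=5.0, interval=0.1):
    """Poll `predicate` until it returns True or `timeout` seconds pass.

    Returns True as soon as the predicate holds, False on timeout.
    The predicate is always evaluated at least once.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A test would call something like `wait_until(lambda: halo_children_up() == 2)` before writing, where `halo_children_up` stands in for whatever query the test uses to read the halo child states.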
Test Plan:
prove tests/basic/halo*.t
(prior to this changeset, would fail within ~10 iterations on my
devserver and almost always on centos regression. Now runs overnight
without failure on my devserver).
Reviewers:
Subscribers:
Tasks:
Blame Revision:
Change-Id: If6823540dd4e23a19cc495d5d0e8b0c6fde9a3bd
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: http://review.gluster.org/16325
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- Implements "Hybrid" halo mounts which perform write FOPs synchronously
to all regions and read FOPs asynchronously to those nodes within the
region.
- Leverages the fact that the inode cache stores "hints" as to the
clean/dirty state of up to 16 replicas in the cluster; upon refresh of
an inode we can do a fresh lookup to get a "wise" node if none exist
in our region
- Activated via the cluster.halo-hybrid-mode option
Test Plan:
- Run AFR prove tests
- Run halo prove tests
Reviewers: kvigor, sshreyas
Reviewed By: kvigor, sshreyas
FB-commit-id: aca760757afd45e4de2e28dc31b87a73ee52f12d
Change-Id: Ic6728ce93b7b96e3151dccdebccd30e007f4750c
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: http://review.gluster.org/16306
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- SHD is now excluded from the max-replicas policy. We'd need
to make an SHD specific tunable for this to make tests reliably
pass, and frankly it probably makes things more intuitive having
SHD excluded (i.e. SHD can always see everything).
- Updated the halo-failover-enabled test, I think it's a bit more clear
now, and works reliably. halo.t fixed after fixing the SHD
max-replicas bug.
Test Plan: - Run prove tests -> https://phabricator.fb.com/P19872728
Reviewers: dph, sshreyas
Reviewed By: sshreyas
FB-commit-id: e425e6651cd02691d36427831b6b8ca206d0f78f
Change-Id: I57855ef99628146c32de59af475b096bd91d6012
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: http://review.gluster.org/16305
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- Changes halo-decision to be based on the lowest halo value observed
- Adds halo-min-sample option to wait until N latency samples have been
gathered prior to activating halos.
- Fixed 3 edge cases where halos weren't being correctly
configured, or not configured as quickly as is possible. Namely:
1. Don't mark a child down if there's no better alternative (and you'd
no longer satisfy min/max replicas); fixes unnecessary flapping.
2. If a child goes down and this causes us to fall below max_replicas,
swap in a warm child immediately if it is within our halo latency
(don't wait around for the next "ping"); swaps in a new child
immediately helping with resiliency.
3. If the child latency is within the halo, and it's currently marked
up, mark it down if it's the highest latency child and the number of
children is > max_replicas; this will allow us to support the
SHD use-case where we can "beam" a single copy to a geo and have it
replicate within the geo after that.
- More commenting
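
The three edge cases can be read as one selection rule, sketched below in Python. This is a simplified model under stated assumptions, not the AFR code: `choose_up_children` and its parameters are hypothetical, latencies are plain numbers, and case 2 is modeled by simply re-running the selection as soon as a child disappears instead of waiting for the next ping.

```python
def choose_up_children(latencies, halo_latency, min_replicas, max_replicas):
    """Return the set of children that should be marked up.

    `latencies` maps child name -> observed latency (hypothetical input).
    """
    # Rank all children from lowest to highest latency.
    ranked = sorted(latencies, key=lambda child: latencies[child])
    inside = [c for c in ranked if latencies[c] <= halo_latency]
    # Case 3: even inside the halo, keep only the max_replicas closest
    # children; the highest-latency ones are marked down.
    up = inside[:max_replicas]
    # Case 1: if that leaves fewer than min_replicas, keep the closest
    # remaining children up rather than marking them down.
    if len(up) < min_replicas:
        outside = [c for c in ranked if latencies[c] > halo_latency]
        up += outside[: min_replicas - len(up)]
    return set(up)
```

With latencies `{"a": 1, "b": 2, "c": 3}`, a halo of 10 and max_replicas of 2, the result is `a` and `b`; calling the function again without `a` immediately swaps in `c`, which is the behavior case 2 asks for.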
Test Plan:
- Run halo prove tests
- Pointed compiled code at gfsglobal.prn2, tested out an NFS daemon and
FUSE mounts to ensure they worked as expected on a large scale
cluster.
Reviewers: dph, jackl, cjh, mmckeen
Reviewed By: mmckeen
FB-commit-id: 7e2e8ae6b8ec62a5e0b31c9fd6100c81795b3424
Change-Id: Iba2b2f1bc848b4546cb96117ff1895f83953a4f8
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: http://review.gluster.org/16304
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- Adds "halo-failover-enabled" option to enable/disable failing over to a brick outside of the defined halo to satisfy min-replicas
- There are some use-cases where failing over to a brick which is out of region will be undesirable. In such cases we will more than likely opt to have more replicas within the region to tolerate the loss of a single replica in that region without losing quorum.
- Fixed quorum accounting problem as well, now correctly goes RO in case where we lose a brick and aren't able to swap one in for some reason (fail-over not enabled or otherwise)
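
The effect of the option on brick selection and on the read-only fallback can be sketched as follows. This Python sketch is an assumption-laden illustration, not the AFR implementation: `select_replicas`, `is_read_only`, and the `quorum_count` parameter are hypothetical, and quorum is reduced to a simple count supplied by the caller.

```python
def select_replicas(latencies, halo_latency, min_replicas, failover_enabled):
    """Pick bricks inside the halo, optionally failing over outside it.

    `latencies` maps brick name -> observed latency (hypothetical input).
    """
    ranked = sorted(latencies, key=lambda brick: latencies[brick])
    chosen = [b for b in ranked if latencies[b] <= halo_latency]
    # Only reach outside the halo when failover is enabled and the halo
    # alone cannot satisfy min_replicas.
    if failover_enabled and len(chosen) < min_replicas:
        outside = [b for b in ranked if latencies[b] > halo_latency]
        chosen += outside[: min_replicas - len(chosen)]
    return chosen


def is_read_only(chosen, quorum_count):
    """Treat the mount as read-only when too few bricks are available."""
    return len(chosen) < quorum_count
```

With failover disabled, losing one of two in-region bricks leaves a single brick and the sketch reports read-only; with failover enabled, the closest out-of-region brick is swapped in and writes continue.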
Test Plan:
- run prove -v tests/basic/halo.t
- run prove -v tests/basic/halo-disable.t
- run prove -v tests/basic/halo-failover-enabled.t
- run prove -v tests/basic/halo-failover-disabled.t
Reviewers: dph, cjh, jackl, mmckeen
Reviewed By: mmckeen
Conflicts:
xlators/cluster/afr/src/afr.h
xlators/mount/fuse/utils/mount.glusterfs.in
Change-Id: Ia3ebf83f34b53118ca4491a3c4b66a178cc9795e
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: http://review.gluster.org/16275
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|