| Commit message (Collapse) | Author | Age | Files | Lines |
| |
Summary:
Highlights include:
* Fixed GF_CONF_OPTS (dev builds) and RPM_BUILD_FLAGS (rpm builds)
* Fixed version in configure.ac
* Fixed handling of files only present when BUILD_FB_EXTRAS is set
* Fixed disable-georeplication (upstream bug)
* Fixed disable-tiering (upstream bug)
* Removed .service files which should be generated from .in versions
* Fixed tirpc (previously fbtirpc) references
* Fixed init_enable problems
* Removed delay-gen references
Test Plan: Use build.sh to build an RPM, and install it.
Differential Revision: https://phabricator.intern.facebook.com/D6611299
Change-Id: If61a4964a149f782038ea47362a82b813e6b7738
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
| |
Summary:
They have a common ancestor at 3.6, but there were hundreds of
lines of changes for each file on each side of the fork. In both
cases the easiest method was to take the upstream 3.8 version and
re-apply our own changes since we branched. Some changes were
dropped (e.g. runit) and a few other files needed new changes
(e.g. pkg-version) to keep up. Then there was more hacking to
fix stealth geo-rep dependencies, enable tirpc/IPv6, and so on.
Also added buildrpm38 and makerelease38. These should probably
not go upstream, but not sure what else to do with them.
Test Plan: Build RPMs. Install, create volumes, mount, do I/O.
Reviewers: sshreyas, #posix_storage
Reviewed By: sshreyas
Subscribers: jbacik, aquevedo, scientist, sshreyas, calvinowens, jweiner
Differential Revision: https://phabricator.intern.facebook.com/D6259797
Tasks: T20348589
Tags: posix-2017h2, gluster, posix_storage
Change-Id: I2d43fc6f7f5603293e406c21e4ec85bf19610b77
Signature: 6259797:1510694123:fc5d2975fec134a51d4b70f7f983cd71971e175a
| |
Change-Id: I2ca0298ee9d166f58b8730256ea76a04e547ce5d
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
| |
Differential Revision: https://phabricator.intern.facebook.com/D5927193
Change-Id: Ife04c8738b9ee721e7be9bc843b2f6d54bbb468e
| |
Includes io-threads parts of the following patches:
9e3fea1 performance/io-threads: Exit all threads on PARENT_DOWN
2cfb7bc performance/io-threads: Exit threads in fini() as well
Change-Id: Id7cc7720e75414fb8a3ac2db68a5fe63c459ffe2
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
| |
Includes io-stats parts of the following patches:
1e421a5 logging: Avoid re-initing log level in io-stats
0facb11 io-stats: Fix overwriting of client profile by the bricks
91004b0 debug/io-stats: Disable fop stats dump by default
62f9659 all: fix various cppcheck warnings
e62c0fe build: export minimum symbols from xlators for correct resolution
1d0a0d1 core: use syscall wrappers instead of direct syscalls - tail
0773ca6 all: reduce "inline" usage
8a9328e build: do not #include "config.h" in each file
320455b io-stats: Fixing dereference after null check.
28397ca Avoid conflict between contrib/uuid and system uuid
49d6894 io-stats : null dereference coverity fix.
Change-Id: If1bdad6244e5749c6d8c456e6c64b5c5b483e273
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
| |
This rolls up multiple patches related to namespace identification and
throttling/QoS. This primarily includes the following, all by Michael
Goulet <mgoulet@fb.com>.
io-threads: Add weighted round robin queueing by namespace
https://phabricator.facebook.com/D5615269
io-threads: Add per-namespaces queue sizes to IO_THREADS_QUEUE_SIZE_KEY
https://phabricator.facebook.com/D5683162
io-threads: Implement better slot allocation algorithm
https://phabricator.facebook.com/D5683186
io-threads: Only enable weighted queueing on bricks
https://phabricator.facebook.com/D5700062
io-threads: Update queue sizes on drain
https://phabricator.facebook.com/D5704832
Fix parsing (-1) as default NS weight
https://phabricator.facebook.com/D5723383
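For readers unfamiliar with weighted round robin queueing, the general idea can be sketched in Python as below. This is a generic illustration, not the io-threads implementation: the function name, the shape of the queues, and the treatment of missing or non-positive weights are all assumptions made for the example.

```python
from collections import deque


def weighted_round_robin(queues, weights, default_weight=1):
    """Yield requests from per-namespace queues in weighted round-robin order.

    queues:  {namespace: deque of requests}   (hypothetical structure)
    weights: {namespace: requests served per round}
    Namespaces without a positive weight fall back to one request per round
    (an assumption for this sketch).
    """
    while any(queues.values()):
        for ns, queue in queues.items():
            share = max(1, weights.get(ns, default_weight))
            for _ in range(share):
                if not queue:
                    break
                yield queue.popleft()
```

With a weight of 2 for one namespace and the default of 1 for another, the first namespace is served two requests for every one served from the second, until either queue drains.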
Parts of the following patches have also been applied to satisfy
dependencies.
io-throttling: Calculate moving averages and throttle offending hosts
https://phabricator.fb.com/D2516161
Shreyas Siravara <sshreyas@fb.com>
Hook up ODS logging for FUSE clients.
https://phabricator.facebook.com/D3963376
Kevin Vigor <kvigor@fb.com>
Add the flag --skip-nfsd-start to skip the NFS daemon starting, even if
it is enabled
https://phabricator.facebook.com/D4575368
Alex Lorca <alexlorca@fb.com>
There are also some "standard" changes: dealing with code that moved,
reindenting to comply with Gluster coding standards, gf_uuid_xxx, etc.
This patch *does* revert some changes which have occurred upstream since
3.6; these will be re-applied as appropriate on top of this new base.
Change-Id: I69024115da7a60811e5b86beae781d602bdb558d
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
| |
Summary:
A mistake was made in D2519423 where `ret` wasn't being set to `0` at the end of `nfs3_init_subvolume_options` since code was inserted between the final `ret = 0` and the return, causing the function to return phony positive ret values.
This causes the code to interpret the reconfigure function as a failure, meaning that changes can't be persisted.
This only affects the `reconfigure` path and not the `init` path, since the `reconfigure` path fails when `ret != 0` and the init path only fails when `ret == -1`...
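The failure mode described above can be modeled in a few lines of Python. This is only a sketch of the pattern, not the C code in `nfs3_init_subvolume_options`; the function names and the helper that returns a positive count are hypothetical.

```python
def init_options_buggy(options):
    """Model of the bug: a positive value leaks out as the return code."""
    ret = 0
    # Code inserted after the final `ret = 0` reuses `ret` for a helper result.
    ret = len(options)  # hypothetical helper returning a positive count
    return ret


def init_options_fixed(options):
    """Same flow, but `ret` is reset to 0 once every step has succeeded."""
    ret = len(options)  # hypothetical helper returning a positive count
    ret = 0
    return ret


def init_succeeded(ret):
    # The init path only treats -1 as a failure.
    return ret != -1


def reconfigure_succeeded(ret):
    # The reconfigure path treats any non-zero value as a failure.
    return ret == 0
```

The buggy variant passes the init check but fails the reconfigure check, which matches the asymmetry described in the summary.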
Test Plan: See that volume options are actually being set when the `nfs` xlator is alive, instead of simply on init.
Reviewers: jdarcy, kvigor, dph, sshreyas
Reviewed By: sshreyas
Subscribers: #posix_storage
Differential Revision: https://phabricator.intern.facebook.com/D5699888
Change-Id: I89006ce3970f22a4206e58ca5630c21df536031c
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18293
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
Summary: @sshreyas thought it best to roll out these new features in the default-off state. This diff adds a few options and modifies tests to make sure that this is done.
Test Plan: The brick restart test works fine, but now it's default disabled on all bricks.
Reviewers: sshreyas, jdarcy
Reviewed By: jdarcy
Subscribers: sshreyas, #posix_storage
Differential Revision: https://phabricator.intern.facebook.com/D5653138
Porting note: includes disconnected-reqs option; restart-bricks inapplicable
Change-Id: I332339894d3cbfafdabeb8592e95c37f30f9751a
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18291
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
Summary:
[done] separate p99 dumping into general funcs
[done] add p95, p90, and p50 stats
- add p95, p90, p50 within p99, and generalize
- rename config to dump-percentile-latencies
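As a rough illustration of what generalized percentile dumping involves, the Python sketch below computes p50, p90, p95 and p99 globally and per fop type from sampled latencies. It uses the nearest-rank method, which is an assumption; the commit does not say which method io-stats uses, and the function names and sample format here are hypothetical.

```python
from collections import defaultdict


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    # Integer ceiling of pct * n / 100 avoids floating-point rounding.
    rank = (pct * len(sorted_values) + 99) // 100
    return sorted_values[max(rank, 1) - 1]


def latency_percentiles(samples, pcts=(50, 90, 95, 99)):
    """samples: iterable of (fop_name, latency) pairs.

    Returns {"global": {pct: value}, fop_name: {pct: value}, ...}.
    """
    by_fop = defaultdict(list)
    for fop, latency in samples:
        by_fop["global"].append(latency)
        by_fop[fop].append(latency)
    result = {}
    for name, values in by_fop.items():
        values.sort()  # sort once, then read every percentile from it
        result[name] = {p: percentile(values, p) for p in pcts}
    return result
```

Sorting each group once and reading all percentiles from the same sorted list is what makes adding p95, p90 and p50 alongside p99 cheap.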
Test Plan:
make install glusterfs on dev machine.
gluster volume create $name ...
mount volume on /mnt/$name <brick1, brick2, ...>
dd if=/dev/zero of=/mnt/$name/test
check each brick for pn printing
/var/lib/glusterd/stats/glusterfsd__$brick.dump
Reviewers: sshreyas, kvigor, jdarcy
Reviewed By: jdarcy
Differential Revision: https://phabricator.intern.facebook.com/D5645951
Change-Id: Ic8ada48d9772bf2d5b3a2ba3c845d91d4e03c9d3
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18279
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
Summary:
[done] separate p99 dumping into general funcs
[done] add p95, p90, and p50 stats
- add p95, p90, p50 within p99, and generalize
- rename config to dump-percentile-latencies
Test Plan:
make install glusterfs on dev machine.
gluster volume create $name ...
mount volume on /mnt/$name <brick1, brick2, ...>
dd if=/dev/zero of=/mnt/$name/test
check each brick for pn printing
/var/lib/glusterd/stats/glusterfsd__$brick.dump
Reviewers: sshreyas, kvigor, jdarcy
Reviewed By: jdarcy
Differential Revision: https://phabricator.intern.facebook.com/D5645951
Change-Id: I7bcd7201fc3753316db0ece809491a1cbdbefd32
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18278
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
Summary:
added global and by-fop-type calculation of p99 latency
to the sampled fop data
Test Plan:
Built a local glusterfs mount and looked at
the stats while running dd if=/dev/zero of=/mnt/fuse/groot/share1/test1 bs=5
Reviewers: sshreyas, mgoulet, jdarcy
Reviewed By: jdarcy
Subscribers: jdarcy
Differential Revision: https://phabricator.intern.facebook.com/D5597662
Change-Id: I3f5cd9c0ea59ae4357827fcbd19bbf009e661c05
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18277
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
Summary:
Currently the bricks can open any mount directory from the given volume. This patch adds a provision to prevent
bricks from opening brick directories that aren't created for them. This will help with operating gluster on large
scale.
We add a new xattr GF_XATTR_BRICK_NAME to the brick directory. When we start a brick daemon, we make sure the path on
disk matches with the config provided. For backward compatibility, we ignore if there is no value for
GF_XATTR_BRICK_NAME and set the current brick daemon's path as value.
We ignore GF_XATTR_BRICK_NAME during healing and reset GF_XATTR_BRICK_NAME on brick replace.
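The startup check described above can be sketched in Python as follows. A plain dict stands in for the brick directory's extended attributes, and the function name is hypothetical; only the key name GF_XATTR_BRICK_NAME comes from the commit text.

```python
BRICK_NAME_XATTR = "GF_XATTR_BRICK_NAME"  # key name taken from the commit text


def check_brick_path(xattrs, configured_path):
    """Decide whether a brick daemon may use this directory.

    xattrs: dict standing in for the directory's extended attributes.
    A missing value is tolerated for backward compatibility and is filled
    in with the configured path.
    """
    stored = xattrs.get(BRICK_NAME_XATTR)
    if stored is None:
        xattrs[BRICK_NAME_XATTR] = configured_path
        return True
    return stored == configured_path
```

A directory stamped for a different brick is rejected, while an unstamped directory is adopted on first start.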
Test Plan: Run fb-smoke
Reviewers: jdarcy, sshreyas
Reviewed By: sshreyas
Differential Revision: https://phabricator.intern.facebook.com/D5448921
Porting note: disabled some checks to deal with the snapshot case
Change-Id: I98e62033dfd07f30ad3b99ac003ce94c8d935e5f
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18275
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
Summary: This will now skip files in the peer directory that don't have names or contents that match what we expect for a valid peerfile, instead of blowing up the entire glusterd initialization as soon as the first unexpected thing happens.
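The skip-instead-of-abort behavior can be illustrated with the Python sketch below. The peer file format assumed here (a UUID as the file name and `key=value` lines including a matching `uuid=` entry) is an assumption made for the example, as are the function names and the `hostname1`/`state` keys used in testing.

```python
import uuid


def parse_peer_file(name, text):
    """Return a dict of key/value pairs, or None if the file should be skipped."""
    try:
        file_uuid = uuid.UUID(name)
    except ValueError:
        return None  # name is not a UUID, e.g. an editor backup file
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep or not key:
            return None  # malformed line
        fields[key] = value
    if fields.get("uuid", "").lower() != str(file_uuid):
        return None  # contents do not match the file name
    return fields


def load_peers(files):
    """files: {name: text}. Invalid entries are skipped, not fatal."""
    peers = {}
    for name, text in files.items():
        parsed = parse_peer_file(name, text)
        if parsed is not None:
            peers[name] = parsed
    return peers
```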
Test Plan: Test (peer-parsing.t) included.
Reviewers: #posix_storage, kvigor
Reviewed By: kvigor
Subscribers: kvigor
Differential Revision: https://phabricator.intern.facebook.com/D5498639
Tags: gluster, posix_storage
Change-Id: Ifad9b047a828c2f76f97d0c39f305b7ec5a8ca4c
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18276
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
Summary: By far the most common reason why a brick's directory might not exist is that the local filesystem on which it lives hasn't finished mounting yet. This is unlike other checks we do, such as for a volume ID and GFID. Some of these are normal conditions when a brick is first created; others are often the result of operator/script error. In the singular case of the directory being absent, wait a little while to see if it comes up.
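A minimal Python sketch of the "wait a little while" behavior is shown below. The function name, the timeout and the polling interval are assumptions chosen for illustration; the commit does not state the actual values.

```python
import os
import time


def wait_for_directory(path, timeout_secs=5.0, poll_secs=0.1):
    """Poll until `path` is a directory or the timeout expires.

    Returns True as soon as the directory appears, False on timeout.
    """
    deadline = time.monotonic() + timeout_secs
    while True:
        if os.path.isdir(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_secs)
```

Because the check runs before each sleep, moving the directory back into place ends the pause within one polling interval.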
Test Plan:
Create a volume.
Start/stop a volume once so everything gets initialized.
Move a brick directory out of place.
Try to start the volume. This should pause.
Immediately move the brick directory back into place. This should break the pause.
Reviewers: #posix_storage, sshreyas
Reviewed By: sshreyas
Subscribers: shreyas, sshreyas, ventullo, moox
Differential Revision: https://phabricator.intern.facebook.com/D5063515
Tags: gluster
Change-Id: Ied7b07b1a60f54856a67d4cdbad35bfce9e196e4
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18274
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
Summary:
We want to track the number of locks held by the locks xlator. One of the ways to do it would be to track the
total number of pl_lock objects in the system.
This patch tracks the total number of pl_lock objects and exposes the stat via io-stats JSON dump.
Test Plan: WIP; no passing run yet. Posting the diff to check whether this approach would yield what you are looking for.
Reviewers: kvigor, sshreyas, jdarcy
Reviewed By: jdarcy
Differential Revision: https://phabricator.intern.facebook.com/D5303071
Change-Id: I946debcbff61699ec28b4d6f243042440107a224
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18273
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
This adds a new volume option, shd-validate-data. When set, the self-heal code
will fetch checksums for regular files along with all the usual xattrs. If the
file seems OK but the checksums show a data mismatch, and if there is only one
replica that's out of step with the others, then we modify the source/sink
calculations to force a heal from one of the agreeing replicas to the odd one
out. Combined with a tool to put files into the self-heal index (being developed
separately), this provides a very rudimentary kind of scrubbing functionality.
Validation is now conditional on the "trusted.glusterfs.validate-status" xattr
having the specific value of "suspect" to avoid redoing validation (which is
expensive) as we find the same file in multiple bricks' indices. When we decide
to take action, we update this xattr to "clean" for copies that were in the
majority and "repaired" for the odd one out that gets clobbered. We also copy
the about-to-be-clobbered copy into an "orphans" directory to facilitate
analysis of corruption patterns. The data goes into ${GFID}.data there, while
${GFID}.link is a symlink to the file's old location.
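The "odd one out" decision can be sketched in Python as below. This models only the comparison of checksums and the resulting status values ("clean" and "repaired", as named above); the function names and the replica naming are hypothetical, and the real source/sink calculation in the self-heal code is more involved.

```python
from collections import Counter


def find_odd_replica(checksums):
    """checksums: {replica_name: checksum}.

    Returns the single replica that disagrees with all the others, or
    None when there is no mismatch or no clear majority.
    """
    counts = Counter(checksums.values())
    if len(counts) != 2:
        return None  # all agree, or more than two distinct values
    (_, major_n), (minor, minor_n) = counts.most_common(2)
    if minor_n != 1 or major_n < 2:
        return None  # not exactly one outlier against an agreeing group
    return next(name for name, c in checksums.items() if c == minor)


def new_validate_status(checksums):
    """Status to record per replica once a heal is decided, else {}."""
    odd = find_odd_replica(checksums)
    if odd is None:
        return {}
    return {name: ("repaired" if name == odd else "clean")
            for name in checksums}
```

With only two replicas that disagree there is no majority, so this sketch takes no action.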
Porting note: this is several internal diffs squashed together ("See Also")
Differential Revision: https://phabricator.intern.facebook.com/D5092983
See Also: https://phabricator.intern.facebook.com/D5126974
See Also: https://phabricator.intern.facebook.com/D5127427
See Also: https://phabricator.intern.facebook.com/D5132804
See Also: https://phabricator.intern.facebook.com/D5209185
See Also: https://phabricator.intern.facebook.com/D5370353
Change-Id: Ie0ae18b368c408a5e47d0bf03ebac80b87b70aa9
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18269
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
Summary:
- Enables multi-core epoll support in the nfs daemon.
- Option can be turned on using:
gluster volume set <volname> nfs.event-threads <numthreads>
Test Plan: Prove test!
Reviewers: kvigor, rwareing
Reviewed By: rwareing
Subscribers: dld, moox, dph
Differential Revision: https://phabricator.fb.com/D3117966
Change-Id: Ie8a7b1ba04b0e83f5ec7a09f9d181fe59be479ca
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18266
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
Summary:
- This diff adds support for detecting and tracking idle client connections.
- It allows *service translators* (server, nfs) to opt-in to detect and close idle client connections.
- Right now it explicitly restricts the service to NFS as a safety.
Here are the debug logs when a client connection gets closed:
[2016-03-29 17:27:06.154232] W [socket.c:2426:socket_timeout_handler] 0-socket: Shutting down idle client connection (idle=20s,fd=20,conn=[2401:db00:11:d0af:face:0:3:0:957]->[2401:db00:11:d0af:face:0:3:0:2049])!
[2016-03-29 17:27:06.154292] D [event-epoll.c:655:__event_epoll_timeout_slot] 0-epoll: Connection on slot->fd=9 was idle for 20 seconds!
[2016-03-29 17:27:06.163282] D [socket.c:629:__socket_rwv] 0-socket.nfs-server: EOF on socket
[2016-03-29 17:27:06.163298] D [socket.c:2474:socket_event_handler] 0-transport: disconnecting now
[2016-03-29 17:27:06.163316] D [event-epoll.c:614:event_dispatch_epoll_handler] 0-epoll: generation bumped on idx=9 from gen=4 to slot->gen=5, fd=20, slot->fd=20
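The idle-connection tracking summarized above can be modeled in Python as follows. The class and method names are hypothetical, and timestamps are passed in explicitly to keep the example deterministic; the real logic lives in the socket and epoll code.

```python
class IdleTracker:
    """Track the last activity per connection and report idle ones."""

    def __init__(self, idle_timeout_secs):
        self.idle_timeout_secs = idle_timeout_secs
        self.last_activity = {}  # fd -> timestamp of last traffic

    def touch(self, fd, now):
        """Record activity on a connection."""
        self.last_activity[fd] = now

    def reap(self, now):
        """Return fds idle for at least the timeout and stop tracking them."""
        idle = [fd for fd, seen in self.last_activity.items()
                if now - seen >= self.idle_timeout_secs]
        for fd in idle:
            del self.last_activity[fd]
        return idle
```

A service that opts in would call `reap` periodically and shut down each returned connection.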
Test Plan: - Used stuck NFS mounts to create idle clients and unstuck them.
Reviewers: kvigor, rwareing
Reviewed By: rwareing
Subscribers: dld, moox, dph
Differential Revision: https://phabricator.fb.com/D3112099
Change-Id: Ic06c89e03f87daabab7f07f892390edd1a1fcc20
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18265
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
Summary:
- Improves upon D2387001 by moving the "forced" root gfid heal to the SHDs
- Removed code which forced NFSd/FUSE clients through the entry heal for
the root GFID, this will make them spin up just as fast as prior to D2387001 (i.e. instantly)
Porting note: mostly inapplicable in 3.8, only one non-test change survived
Test Plan: - Must pass tests/bugs/fb8149516.t
Reviewers: dph, moox, sshreyas
Reviewed By: sshreyas
Differential Revision: https://phabricator.fb.com/D2722239
Change-Id: I35f5827df6ead1bb0ff886ca0adabb2add2e7163
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18259
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
Summary: These requests haven't been issued, let alone acknowledged. They would disappear if we crashed, which to the client is indistinguishable from any other kind of disconnection - if indeed the client itself isn't the one that died. So we're completely within our rights to discard these. There are strong hints that such "orphan" requests are part of how we get into the lock-revocation hangs we've been seeing for a while. Even if that theory doesn't pan out, there's no good reason to keep them around clogging up queues and so forth.
This is a port of D5430057 & D5662545 to 3.8
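As a simplified model of discarding queued requests on disconnect, consider the Python sketch below. The queue layout (pairs of client id and request) and the function name are assumptions for illustration only.

```python
from collections import deque


def drop_requests_for_client(queue, client_id):
    """Remove not-yet-issued requests belonging to a disconnected client.

    queue: deque of (client_id, request) pairs, modified in place.
    Returns the number of requests dropped.
    """
    kept = [item for item in queue if item[0] != client_id]
    dropped = len(queue) - len(kept)
    queue.clear()
    queue.extend(kept)  # requests from other clients keep their order
    return dropped
```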
Change-Id: Ie4c88f7791aac85540631f60f5c639497468ad76
Reviewed-on: https://review.gluster.org/18254
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
directory
Summary:
- We may have found an issue where certain directories were being moved into .landfill and then being quickly purged via nftw().
- We would like to have an emergency option to disable these purges.
Test Plan: Build, vol-set, read logs
Reviewers: rwareing, dph
Reviewed By: dph
Subscribers: #posix_storage
Differential Revision: https://phabricator.intern.facebook.com/D4862021
Change-Id: I90b54c535930c1ca2925a928728199b6b80eadd9
Signature: t1:4862021:1491855616:51b9b5b8957b0bb97afe27766f2e5aa78ff9edd4
Reviewed-on: https://review.gluster.org/18253
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| |
Summary: - Exempt the SHD from the discover code path
Test Plan:
- prove -v tests/bugs/fb8149516.t
- Make rc and canary on offending host (gfsdataswarm048.prn2)
Reviewers: moox, dph, sshreyas
Reviewed By: sshreyas
Differential Revision: https://phabricator.fb.com/D2491694
Change-Id: I691a990950e13be6e376c64fddb110cd6ceefe47
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18251
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
Summary:
Add assume-permissive option for EACCES debugging / rug-sweeping.
Re-fetch permissions when needed if they're absent.
This is a port of D5104707 & D5131597 to 3.8
Change-Id: I900fc66876ec8e73b04049f844c428b3d225d4ad
Reviewed-on: https://review.gluster.org/18249
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
Summary: - Prevents entry self-heal flow from happening on non-root GFIDs
Test Plan: - Run prove -v tests/bugs/fb8149516.t
Reviewers: dph, moox, sshreyas
Reviewed By: sshreyas
Differential Revision: https://phabricator.fb.com/D2470622
Change-Id: Id8559f2cfeb6e1e5c26dc1571854c0fbc0b59e08
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18250
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
correctly used
Summary:
This diff fixes a bug in the NFS daemon where the auth cache would use an export item after it was freed by the
auth params refresh thread. This usually manifests as a crash in production, when exports files are updated by chef.
Since each auth cache entry holds a pointer to an export_item_t it makes sense that it should first get a reference to it.
Freeing the export_item_t struct happens only in `exp_item_unref()`, once the reference count has dropped to 0.
This diff also fixes a use-after-free bug in the auth-cache, in the insertion path.
In _cache_item(), if we find an entry in the dict, we update that entry with a timestamp & ref the export item associated with it.
However, if the item already existed and we called old_cache_insert() with the same key, we gave the dict permission to free the old entry.
We then end up using that entry.
The fix is to use dict_set_static_bin() instead of dict_set_bin() which informs the dict that the pointer we are giving it belongs to us.
This is a port of D5780476, D5785038 to 3.8
Change-Id: I5cdcdc1cc6abad26c7077d66a14f263da07678ac
Reviewed-on: https://review.gluster.org/18248
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
Summary:
A lot of the diff "volume" is just refactoring, which should have no functional effect.
It's preparation for adding a new implementation.
The main functional change is locking around the external calls into this module, to prevent some of the races that we've seen.
Additional fixes:
- entry_data->data can be NULL, so we should check lookup_res before dereferencing it below.
- It renames functions that need to be locked to have double underscores in front of them.
This is a port of D5658875, D5658809 & D5762136 to 3.8
Change-Id: If1b71b5c3268271f3a41c07394c215290a12c0ec
Reviewed-on: https://review.gluster.org/18247
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
Summary:
- This diff looks for a custom xattr on a directory or file called 'trusted.glusterfs.md-cache-timeout' and uses that timeout if it finds it instead of the default timeout value for the cache.
- For example, if we know that a customer has a fixed set of directories that never change, we can set that attribute on all their directories and cache directory metadata for the lifetime of the client (NFS or FUSE) process.
- Port of D5430395 to 3.8
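The override described above amounts to a lookup with a fallback, sketched here in Python. A dict stands in for the inode's xattrs; the assumption that the value is a decimal number of seconds, the fallback on unparsable values, and the function names are all illustrative rather than taken from md-cache.

```python
TIMEOUT_XATTR = "trusted.glusterfs.md-cache-timeout"  # name from the summary


def effective_timeout(xattrs, default_timeout):
    """Return the per-inode timeout if the xattr is set, else the default."""
    raw = xattrs.get(TIMEOUT_XATTR)
    if raw is None:
        return default_timeout
    try:
        return int(raw)
    except ValueError:
        return default_timeout  # assumption: ignore unparsable values


def is_fresh(cached_at, now, timeout):
    """True while a cached entry is still within its timeout."""
    return now - cached_at <= timeout
```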
Reviewed By: jdarcy
Change-Id: Ieb232bc1365c59dd7c396c7a617f12973cc8ea01
Reviewed-on: https://review.gluster.org/18241
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
Summary:
Null peer UUIDs are assumed to be invalid.
Glusterd should complain and bail if we try to load any on startup.
This is a port of D5160925 to 3.8
Reviewed By: sshreyas
Change-Id: Ib8679c7501a4fc1fbf9b34fdbf47037f38ec7cb8
Reviewed-on: https://review.gluster.org/18238
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
Summary:
There appears to be a thread leak somewhere, which causes io-threads to
run out of threads to process a particular (priority-based) queue.
The leak should obviously be fixed, but that might take a while
and the consequences until then are severe - a brick essentially going
offline without the courtesy of actually dying. This patch adds a
watchdog that checks for stuck queues, and adds threads to break the deadlock.
The same thing done manually on one afflicted cluster caused brick CPU usage
to drop from 2600% to 400%, with latency quickly returning to normal.
The controlling option is performance.iot-watchdog-secs,
which is the number of seconds we'll watch for a non-empty
queue with no items being dequeued. That's our signal to
add a thread. A value of zero (the default) disables
this watchdog feature.
This is a port of D5177414 to 3.8.
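The watchdog rule (a non-empty queue with no dequeues for the configured number of seconds triggers a new thread) can be modeled in Python as below. The class, its fields and the idea of passing a running dequeue counter are assumptions for this sketch; only the zero-disables behavior and the stall condition come from the description above.

```python
class QueueWatchdog:
    """Detect queues that stay non-empty without any dequeues."""

    def __init__(self, watchdog_secs):
        self.watchdog_secs = watchdog_secs  # 0 disables the watchdog
        self.last_dequeued = {}             # priority -> dequeue counter seen
        self.stalled_since = {}             # priority -> time the stall began

    def check(self, pri, queue_size, dequeued_total, now):
        """Return True if a thread should be added for queue `pri`."""
        if self.watchdog_secs == 0:
            return False
        progressed = dequeued_total != self.last_dequeued.get(pri)
        self.last_dequeued[pri] = dequeued_total
        if queue_size == 0 or progressed:
            self.stalled_since.pop(pri, None)
            return False
        start = self.stalled_since.setdefault(pri, now)
        if now - start >= self.watchdog_secs:
            self.stalled_since[pri] = now  # restart the clock after acting
            return True
        return False
```

The first observation of a queue counts as progress, so a thread is only added after the queue has been seen stuck for the full interval.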
Test Plan: All the usual tests to determine safety.
Use gdb to hack priv->queue_sizes to a non-zero value. This will make it look like the queue is non-empty, but since it does in fact have zero items there will be no dequeues. After watchdog-secs seconds, this should add a thread, with a corresponding entry in the brick log.
Change-Id: Ic051e411d3e9351e1cf5e233bad8bbb5078cb259
Reviewed-on: https://review.gluster.org/18239
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
Change-Id: I520894244063ef854b4416cb5418065bd9de7277
Reviewed-on: https://review.gluster.org/18237
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
Summary:
Add outstanding-req field to track requests that have been sent
down the stack and haven't come back.
This is a port of D4908836 to 3.8
Reviewers: sshreyas
Change-Id: I5870f63008d553416109c1808a434f526f5a633d
Reviewed-on: https://review.gluster.org/18236
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
Summary:
Two new volume options that control reads.
performance.io-cache.read-size
- Tells gluster how much it should try to read on each posix_readv call
performance.io-cache.min-cached-read-size
- Tells gluster the smallest files it should start caching, anything smaller is not cached
This is a port of D4844662 to 3.8
Change-Id: I5ba891906f97e514e7365cc34374619379434766
Reviewed-on: https://review.gluster.org/18235
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
Summary:
- Sometimes the process that glusterd is trying to kill is already dead.
- In that case, if it can't find the pid, it should just continue on and not fail the entire operation.
- This is a port of D4837916 to 3.8
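The tolerant kill described above can be sketched in Python as follows. The function name and the injectable `kill` parameter are hypothetical conveniences for the example; glusterd's own code is C and is not shown here.

```python
import os
import signal


def stop_process(pid, kill=os.kill):
    """Send SIGTERM to `pid`, treating "no such process" as success.

    Returns True if the signal was delivered, False if the process was
    already gone. Other errors (such as permission problems) still raise.
    """
    try:
        kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        return False  # already dead: carry on with the operation
    return True
```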
Change-Id: Ic96952a8d31927446f648830ede6ccd82512663f
Reviewed-on: https://review.gluster.org/18234
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| |
Summary:
Too many hard links blow up btrfs by exceeding max xattr size (recording
pgfid for each hardlink). Add a limit to prevent this explosion.
This is a port of D4682329 to 3.8
Reviewed By: sshreyas
Change-Id: I614a247834fb8f2b2743c0c67d11cefafff0dbaa
Reviewed-on: https://review.gluster.org/18232
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
Summary:
AFR currently waits for all children to respond before sending an UP
message. This means that one dead host can cause us to wait a TCP
timeout (2 mins!) before declaring the volume up.
Now we send an UP as soon as quorum is obtained.
This is a port of D4701919 to 3.8.
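The change in notification timing can be modeled in Python as below. Quorum is taken to be a simple majority of children, which is an assumption for this sketch (AFR quorum is configurable), and the class and method names are hypothetical.

```python
class ReplicaSet:
    """Decide when to propagate UP based on child notifications."""

    def __init__(self, child_count):
        self.child_count = child_count
        self.up = set()
        self.announced = False

    def quorum(self):
        # Assumption: quorum means a simple majority of children.
        return self.child_count // 2 + 1

    def child_up(self, child):
        """Record a child coming up; return True once, when UP should be sent."""
        self.up.add(child)
        if not self.announced and len(self.up) >= self.quorum():
            self.announced = True
            return True
        return False
```

With three children, UP is sent when the second child responds, so a dead third host no longer delays it.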
Reviewed By: sshreyas
Change-Id: I642d4eb7dc7e0b289e89b7a16abf99a3f98aa8b3
Reviewed-on: https://review.gluster.org/18231
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| |
Summary:
- When you write a file and then stat it immediately, md-cache returns stale stat information.
- This diff implements flush() in md-cache so that we can correctly invalidate inodes after
a write.
- This is a port of D4762171 to 3.8
Reviewers: kvigor, dph
Reviewed By: kvigor
Change-Id: I368b7870d61b14a7e390917d195cbccc67029eb7
Reviewed-on: https://review.gluster.org/18233
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
Summary:
- This diff adds error counts and rates to the regular io-stats dump.
- It outputs keys that look like this:
"storage.gluster.nfsd.groot.aggr.errors.<error_name>.count": "6",
"storage.gluster.nfsd.groot.inter.errors.<error_name>.per_sec": "0.00"
- <error_name> is the lowercase representation of errno values (e.g., ENOENT -> enoent, etc.)
- This is a port of D4691581 to 3.8
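The key layout shown above can be reproduced with a short Python sketch. It assumes that "aggr" holds a cumulative count and "inter" holds a rate over the most recent interval, which the summary does not spell out; the function name and arguments are hypothetical.

```python
import errno


def error_stat_keys(prefix, aggr_counts, interval_counts, interval_secs):
    """Build io-stats style keys for error counts and rates.

    aggr_counts / interval_counts: {errno value: count}.
    Error names are the lowercase errno names, e.g. ENOENT -> enoent.
    """
    keys = {}
    for err, count in aggr_counts.items():
        name = errno.errorcode[err].lower()
        rate = interval_counts.get(err, 0) / interval_secs
        keys[f"{prefix}.aggr.errors.{name}.count"] = str(count)
        keys[f"{prefix}.inter.errors.{name}.per_sec"] = f"{rate:.2f}"
    return keys
```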
Reviewers: dph, kvigor
Reviewed By: kvigor
Change-Id: I96857d4283c47f9d330ae1978f113013e7c78a87
Reviewed-on: https://review.gluster.org/18230
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- There is a known kernel bug that causes reads to disk to be limited by the RA setting in /sys/block/sd[a-z]/queue/read_ahead_kb.
- The workaround is to fadvise POSIX_FADV_RANDOM on file descriptors before reading.
- This is a port of D4585521 to 3.8
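The workaround amounts to one advisory call per file descriptor before reading. The sketch below shows the idea in Python on Linux; the posix xlator itself is C and would call `posix_fadvise()` directly. The helper name `open_for_random_reads` is an assumption for illustration.

```python
import os


def open_for_random_reads(path):
    """Open path read-only and hint the kernel that access will be random."""
    fd = os.open(path, os.O_RDONLY)
    if hasattr(os, "posix_fadvise"):
        try:
            # offset=0, len=0 applies the advice to the whole file.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_RANDOM)
        except OSError:
            # The call is advisory only, so a failure should not block reads.
            pass
    return fd
```

The caller still owns the descriptor and must close it with `os.close(fd)`.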
Test Plan: Still need to figure out a good test for this, other than simple inspection.
Reviewers: rwareing, kvigor
Reviewed By: kvigor
Change-Id: I4a307573da620d9a1955fb5f4e8cd67154e11ace
Reviewed-on: https://review.gluster.org/18229
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This translator tags namespaces with a unique hash that corresponds to the
top-level directory (right under the gluster root) of the file the fop acts
on. The hash information is injected into the call frame by this translator,
so this namespace information can later be used to do throttling, QoS and
other namespace-specific stats collection and actions in later xlators
further down the stack.
When the translator can't find a path directly for the fd_t or loc_t, it winds
a GET_ANCESTRY_PATH_KEY down to the posix xlator to get the path manually.
Caching this namespace information in the inode makes sure that most requests
don't need to recalculate the hash, so that typically fops are just doing an
inode_ctx_get instead of the more expensive code paths that this xlator can take.
Right now the xlator is hard-coded to only hash the top-level directory, but
this could be easily extended to more sophisticated matching by modification
of the parse_path function.
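A simplified sketch of the tagging idea: take the first path component under the volume root and hash it. This is Python for illustration only; the hash function used by the xlator is not stated here, so `zlib.crc32` is a stand-in, and `top_level_dir` / `namespace_hash` are hypothetical names (the real parsing lives in parse_path). It also assumes the first component is used even when the path names a file directly under the root.

```python
import zlib


def top_level_dir(path):
    """Return the first component of path under the root, or None for '/'."""
    parts = [p for p in path.split("/") if p]
    return parts[0] if parts else None


def namespace_hash(path):
    """Hash the top-level directory so all files below it share one tag."""
    top = top_level_dir(path)
    if top is None:
        return None
    # Stand-in hash; the translator's actual hash function may differ.
    return zlib.crc32(top.encode("utf-8"))
```

Every path below the same top-level directory maps to the same value, which is why caching the result in the inode lets most fops skip the computation.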
Test Plan:
Run `prove -v tests/basic/namespace.t` to see that tagging works.
Change-Id: I960ddadba114120ac449d27a769d409cc3759ebc
Reviewed-on: https://review.gluster.org/18041
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- This gives md-cache the ability to cache statfs calls
- You can turn it on or off via 'gluster vol set groot performance.md-cache-statfs <on|off>'
- This is a port of D4652632
Test Plan: Tested functionality on devserver
Reviewers: kvigor
Reviewed By: kvigor
Subscribers: #posix_storage
Differential Revision: https://phabricator.intern.facebook.com/D4652632
Change-Id: I664579e3c19fb9a6cd9d7b3a0eae061f70f4def4
Signature: t1:4652632:1488581841:111cc01efe83c71f1e98d075abb10589c4574705
Reviewed-on: https://review.gluster.org/18228
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Fixes the unnecessary log spew in other daemons
- This is a port of D3646627 to 3.8
Reviewers: rwareing, kvigor
Reviewed By: kvigor
Change-Id: Id54ab41cdfdd2006d3af2d8774c38025c566c523
Reviewed-on: https://review.gluster.org/18199
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Adds the ability for gluster to log every single CREATE and UNLINK that happens on the bricks (right before invoking sys_unlink() or open(...| O_CREAT))
- Makes it so that CREATEs and UNLINKs are not downsampled in io-stats
- This is a port of D3268156, D3778968, D3903894 & D3301527 to 3.8
Reviewed By: kvigor
Change-Id: I1bce28068c02b7d202f094094237646b4d39794b
Reviewed-on: https://review.gluster.org/18198
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Log an OOPS and bail when *parent is null just before going into
posix_resolve code path (to avoid crash)
Test Plan: - Prove test/canary on cluster
Differential Revision: https://phabricator.fb.com/D2640497
Change-Id: I6140ef6fdb711748dad1c66d929aca36328bc574
Reviewed-on: https://review.gluster.org/17969
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Prior to this diff, Gluster would simply log "One or more clients cannot ..."
- With this diff, we now show up to 20 clients that are mismatched.
- This is a port of D3313082 to 3.8
Reviewers: rwareing, kvigor
Reviewed By: kvigor
Change-Id: Ia8830f18c922bda1aee787a2e3d6033164bb64d4
Reviewed-on: https://review.gluster.org/18196
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Adds iamshd (iamnfsd already there due to fop throttling)
options to io-stats xlator.
- Leverages these options to correctly write multi-volume NFSd stats
- This is a port of D2714648 to 3.8
Test Plan:
- Tested on local dev server, verified multiple files are generated for
multiple vols
Change-Id: Id2014a135fe52045da462eaaa91f336f45cdf167
Reviewed-on: https://review.gluster.org/18195
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- We noticed some folks name their files all the way up to NAME_MAX (usually 255) and when split-brain is encountered, we fail to heal the file.
- This diff puts an upper bound on the number of bytes we will snprintf into the buffer so that we do not fail the rename.
- This is a port of D3646254 to 3.8
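The bounding idea can be sketched as follows: when a suffix is appended to a name that is already at NAME_MAX, trim the original part so the result still fits. This Python sketch is hypothetical; the suffix, the helper name `bounded_name`, and the choice to keep the suffix intact are assumptions, and the actual patch bounds an snprintf in C.

```python
NAME_MAX = 255  # usual limit on Linux filesystems, in bytes


def bounded_name(original, suffix, limit=NAME_MAX):
    """Append suffix to original, trimming original so the result fits limit."""
    raw = original.encode("utf-8")
    tail = suffix.encode("utf-8")
    # Reserve room for the suffix, then cut the original name to what is left.
    keep = max(limit - len(tail), 0)
    return (raw[:keep] + tail)[:limit]
```

The result is returned as bytes because trimming at a byte boundary may split a multi-byte character.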
Test Plan: Prove test -- can show it fails without patch as well.
Reviewers: #posix_storage, rwareing
Reviewed By: rwareing
Change-Id: I51c6b28374d4a3f21e29044cb727b4b1da7b69e1
Reviewed-on: https://review.gluster.org/18194
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- We have a thread that checks if connected clients are "still" authorized for a mount.
- This thread is currently only checking the IP (regression from the 3.4 -> 3.6 rebase, perhaps).
- This diff adds code to check the IP *and* the FQDN before unmounting the client.
Test Plan: Tested on devserver, auth prove tests.
Reviewers: rwareing, kvigor
Reviewed By: kvigor
Change-Id: I441a4436d8df064d2f09a2539acb780ab53943f6
Reviewed-on: https://review.gluster.org/18193
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Our current approach to measuring "average fop latency" is badly
flawed in that it doesn't weight the FOPs correctly according to how
many occurred in the time interval. This makes Statisticians very
sad. This patch adds an internally computed weighted average
latency which will be far more efficient to display via ODS, as well
as having the benefit of not being complete nonsense.
- This is a port of D3148415 & D3405772 to 3.8
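The difference between the two averages can be shown with a small sketch. It is Python for illustration, not the io-stats code; the function name and the `(count, average latency)` input shape are assumptions.

```python
def weighted_average_latency(samples):
    """Average latency weighted by how many fops each sample covers.

    samples is an iterable of (fop_count, avg_latency_usec) pairs, one per
    fop type observed in the interval.
    """
    samples = list(samples)
    total = sum(count for count, _ in samples)
    if total == 0:
        return 0.0
    return sum(count * avg for count, avg in samples) / total
```

With 90 fops averaging 10 usec and 10 fops averaging 110 usec, the weighted result is 20 usec, whereas a plain mean of the two per-fop averages would report 60 usec.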
Reviewers: kvigor, dph, sshreyas
Reviewed By: sshreyas
Change-Id: Ie3618f279b545610b7ed1a8482243fcc8dc53217
Reviewed-on: https://review.gluster.org/18192
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Per title
- This is a port of D2875451 to 3.8
Test Plan: Live?
Reviewers: dph, moox, dld, rwareing
Reviewed By: rwareing
Change-Id: Ie2862bcbb49d1159cf2465d48cc506f629c527e0
Reviewed-on: https://review.gluster.org/18191
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|