| Commit message (Collapse) | Author | Age | Files | Lines |
|
Summary:
We want to track the number of locks held by the locks xlator. One of the ways to do it would be to track the
total number of pl_lock objects in the system.
This patch tracks the total number of pl_lock objects and exposes the stat via the io-stats JSON dump.
Test Plan: WIP; the tests have not passed yet. Posting the diff to check whether this approach would yield what you are looking for.
Reviewers: kvigor, sshreyas, jdarcy
Reviewed By: jdarcy
Differential Revision: https://phabricator.intern.facebook.com/D5303071
Change-Id: I946debcbff61699ec28b4d6f243042440107a224
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18273
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
This adds a new volume option, shd-validate-data. When set, the self-heal code
will fetch checksums for regular files along with all the usual xattrs. If the
file seems OK but the checksums show a data mismatch, and if there is only one
replica that's out of step with the others, then we modify the source/sink
calculations to force a heal from one of the agreeing replicas to the odd one
out. Combined with a tool to put files into the self-heal index (being developed
separately), this provides a very rudimentary kind of scrubbing functionality.
Validation is now conditional on the "trusted.glusterfs.validate-status" xattr
having the specific value of "suspect" to avoid redoing validation (which is
expensive) as we find the same file in multiple bricks' indices. When we decide
to take action, we update this xattr to "clean" for copies that were in the
majority and "repaired" for the odd one out that gets clobbered. We also copy
the about-to-be-clobbered copy into an "orphans" directory to facilitate
analysis of corruption patterns. The data goes into ${GFID}.data there, while
${GFID}.link is a symlink to the file's old location.
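
The source/sink decision described above can be modelled in a few lines. This is only an illustrative Python sketch, not the self-heal code itself: the function name `pick_odd_one_out` and the list-of-checksums input are assumptions made for the example; only the "clean" and "repaired" status values come from the description.

```python
from collections import Counter
from typing import List, Optional, Sequence, Tuple


def pick_odd_one_out(checksums: Sequence[str]) -> Optional[Tuple[int, int]]:
    """Return (source_index, sink_index) if exactly one replica disagrees.

    Returns None when all replicas agree, when there is no majority,
    or when more than one replica is out of step.
    """
    counts = Counter(checksums)
    if len(counts) != 2:
        return None  # all agree, or three or more distinct checksums
    (major_sum, major_count), (minor_sum, minor_count) = counts.most_common(2)
    if minor_count != 1 or major_count < 2:
        return None  # no single odd replica, or no real majority
    return checksums.index(major_sum), checksums.index(minor_sum)


def validate_status(checksums: Sequence[str]) -> Optional[List[str]]:
    """Status each copy would be marked with once a heal is decided."""
    decision = pick_odd_one_out(checksums)
    if decision is None:
        return None
    _, sink = decision
    return ["repaired" if i == sink else "clean" for i in range(len(checksums))]
```

With three replicas whose checksums are `["aa", "aa", "bb"]`, the sketch heals from replica 0 to replica 2; with only two disagreeing replicas it declines to pick a side.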
Porting note: this is several internal diffs squashed together (see the "See Also" links below).
Differential Revision: https://phabricator.intern.facebook.com/D5092983
See Also: https://phabricator.intern.facebook.com/D5126974
See Also: https://phabricator.intern.facebook.com/D5127427
See Also: https://phabricator.intern.facebook.com/D5132804
See Also: https://phabricator.intern.facebook.com/D5209185
See Also: https://phabricator.intern.facebook.com/D5370353
Change-Id: Ie0ae18b368c408a5e47d0bf03ebac80b87b70aa9
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18269
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- Enables multi-core epoll support in the nfs daemon.
- Option can be turned on using:
gluster volume set <volname> nfs.event-threads <numthreads>
Test Plan: Prove test!
Reviewers: kvigor, rwareing
Reviewed By: rwareing
Subscribers: dld, moox, dph
Differential Revision: https://phabricator.fb.com/D3117966
Change-Id: Ie8a7b1ba04b0e83f5ec7a09f9d181fe59be479ca
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18266
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- This diff adds support for detecting and tracking idle client connections.
- It allows *service translators* (server, nfs) to opt-in to detect and close idle client connections.
- Right now it explicitly restricts the service to NFS as a safety.
Here are the debug logs when a client connection gets closed:
[2016-03-29 17:27:06.154232] W [socket.c:2426:socket_timeout_handler] 0-socket: Shutting down idle client connection (idle=20s,fd=20,conn=[2401:db00:11:d0af:face:0:3:0:957]->[2401:db00:11:d0af:face:0:3:0:2049])!
[2016-03-29 17:27:06.154292] D [event-epoll.c:655:__event_epoll_timeout_slot] 0-epoll: Connection on slot->fd=9 was idle for 20 seconds!
[2016-03-29 17:27:06.163282] D [socket.c:629:__socket_rwv] 0-socket.nfs-server: EOF on socket
[2016-03-29 17:27:06.163298] D [socket.c:2474:socket_event_handler] 0-transport: disconnecting now
[2016-03-29 17:27:06.163316] D [event-epoll.c:614:event_dispatch_epoll_handler] 0-epoll: generation bumped on idx=9 from gen=4 to slot->gen=5, fd=20, slot->fd=20
Test Plan: - Used stuck NFS mounts to create idle clients and unstuck them.
Reviewers: kvigor, rwareing
Reviewed By: rwareing
Subscribers: dld, moox, dph
Differential Revision: https://phabricator.fb.com/D3112099
Change-Id: Ic06c89e03f87daabab7f07f892390edd1a1fcc20
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18265
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary: - Adds test coverage for unsplitting via SHD
Test Plan: - Run prove -v tests/bugs/fb2506544* (https://phabricator.fb.com/P56056659)
Reviewers: moox, dld, dph, sshreyas
Reviewed By: sshreyas
Differential Revision: https://phabricator.fb.com/D2770524
Porting note: also added fb*.t tests to test_env.
Change-Id: Iac28b595194925a45e62b6438611c9bade58b30b
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18261
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- Improves upon D2387001 by moving the "forced" root gfid heal to the SHDs
- Removed code which forced NFSd/FUSE clients through the entry heal for
the root GFID, this will make them spin up just as fast as prior to D2387001 (i.e. instantly)
Porting note: mostly inapplicable in 3.8, only one non-test change survived
Test Plan: - Must pass tests/bugs/fb8149516.t
Reviewers: dph, moox, sshreyas
Reviewed By: sshreyas
Differential Revision: https://phabricator.fb.com/D2722239
Change-Id: I35f5827df6ead1bb0ff886ca0adabb2add2e7163
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18259
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary: These requests haven't been issued, let alone acknowledged. They would disappear if we crashed, which to the client is indistinguishable from any other kind of disconnection - if indeed the client itself isn't the one that died. So we're completely within our rights to discard these. There are strong hints that such "orphan" requests are part of how we get into the lock-revocation hangs we've been seeing for a while. Even if that theory doesn't pan out, there's no good reason to keep them around clogging up queues and so forth.
This is a port of D5430057 & D5662545 to 3.8
Change-Id: Ie4c88f7791aac85540631f60f5c639497468ad76
Reviewed-on: https://review.gluster.org/18254
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
directory
Summary:
- We may have found an issue where certain directories were being moved into .landfill and then being quickly purged via nftw().
- We would like to have an emergency option to disable these purges.
Test Plan: Build, vol-set, read logs
Reviewers: rwareing, dph
Reviewed By: dph
Subscribers: #posix_storage
Differential Revision: https://phabricator.intern.facebook.com/D4862021
Change-Id: I90b54c535930c1ca2925a928728199b6b80eadd9
Signature: t1:4862021:1491855616:51b9b5b8957b0bb97afe27766f2e5aa78ff9edd4
Reviewed-on: https://review.gluster.org/18253
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
Change-Id: Ie5f2e085169000ed385f9911ea6222aac7ac46ad
Reviewed-on: https://review.gluster.org/18252
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary: - Exempt the SHD from the discover code path
Test Plan:
- prove -v tests/bugs/fb8149516.t
- Make rc and canary on offending host (gfsdataswarm048.prn2)
Reviewers: moox, dph, sshreyas
Reviewed By: sshreyas
Differential Revision: https://phabricator.fb.com/D2491694
Change-Id: I691a990950e13be6e376c64fddb110cd6ceefe47
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18251
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
Add assume-permissive option for EACCES debugging / rug-sweeping.
Re-fetch permissions when needed if they're absent.
This is a port of D5104707 & D5131597 to 3.8
Change-Id: I900fc66876ec8e73b04049f844c428b3d225d4ad
Reviewed-on: https://review.gluster.org/18249
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary: - Prevents entry self-heal flow from happening on non-root GFIDs
Test Plan: - Run prove -v tests/bugs/fb8149516.t
Reviewers: dph, moox, sshreyas
Reviewed By: sshreyas
Differential Revision: https://phabricator.fb.com/D2470622
Change-Id: Id8559f2cfeb6e1e5c26dc1571854c0fbc0b59e08
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18250
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
correctly used
Summary:
This diff fixes a bug in the NFS daemon where the auth cache would use an export item after it was free'd by the
auth params refresh thread. This usually manifests as a crash in production, when exports files are updated by chef.
Since each auth cache entry holds a pointer to an export_item_t it makes sense that it should first get a reference to it.
Freeing the export_item_t struct happens only in `exp_item_unref()`, once the reference count has dropped to 0.
This diff also fixes a use-after-free bug in the auth-cache, in the insertion path.
In _cache_item(), if we find an entry in the dict, we update that entry with a timestamp & ref the export item associated with it.
However, if the item already existed and we called old_cache_insert() with the same key, we gave the dict permission to free the old entry.
We then end up using that entry.
The fix is to use dict_set_static_bin() instead of dict_set_bin() which informs the dict that the pointer we are giving it belongs to us.
This is a port of D5780476, D5785038 to 3.8
Change-Id: I5cdcdc1cc6abad26c7077d66a14f263da07678ac
Reviewed-on: https://review.gluster.org/18248
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
A lot of the diff "volume" is just refactoring, which should have no functional effect.
It's preparation for adding a new implementation.
The main functional change is locking around the external calls into this module, to prevent some of the races that we've seen.
Additional fixes:
- entry_data->data can be NULL, so we should check lookup_res before dereferencing it below.
- It renames functions that need to be locked to have double underscores in front of them.
This is a port of D5658875, D5658809 & D5762136 to 3.8
Change-Id: If1b71b5c3268271f3a41c07394c215290a12c0ec
Reviewed-on: https://review.gluster.org/18247
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- This diff looks for a custom xattr on a directory or file called 'trusted.glusterfs.md-cache-timeout' and uses that timeout if it finds it instead of the default timeout value for the cache.
- For example, if we know that a customer has a fixed set of directories that never change, we can set that attribute on all their directories and cache directory metadata for the lifetime of the client (NFS or FUSE) process.
- Port of D5430395 to 3.8
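
A rough model of the per-entry timeout lookup, written in Python purely for illustration (the real md-cache translator is C). The xattr name is taken from the summary above; the default value, the function names and the dict-of-xattrs input are assumptions of this sketch.

```python
from typing import Mapping, Union

XATTR_KEY = "trusted.glusterfs.md-cache-timeout"
DEFAULT_TIMEOUT_SECS = 1  # hypothetical default, not taken from the source


def effective_timeout(xattrs: Mapping[str, Union[str, bytes]],
                      default: int = DEFAULT_TIMEOUT_SECS) -> int:
    """Use the custom xattr when present and sane, else the default."""
    raw = xattrs.get(XATTR_KEY)
    if raw is None:
        return default
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return default  # unparsable xattr: fall back to the default
    return value if value >= 0 else default


def is_fresh(cached_at: float, now: float,
             xattrs: Mapping[str, Union[str, bytes]]) -> bool:
    """True while the cached metadata is still within its timeout."""
    return (now - cached_at) <= effective_timeout(xattrs)
```

A directory tagged with a large value stays cached long after an untagged one would have expired, which is the behaviour the summary describes for directories that never change.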
Reviewed By: jdarcy
Change-Id: Ieb232bc1365c59dd7c396c7a617f12973cc8ea01
Reviewed-on: https://review.gluster.org/18241
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
Null peer UUIDs are assumed to be invalid.
Glusterd should complain and bail if we try to load any on startup.
This is a port of D5160925 to 3.8
Reviewed By: sshreyas
Change-Id: Ib8679c7501a4fc1fbf9b34fdbf47037f38ec7cb8
Reviewed-on: https://review.gluster.org/18238
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
Previously, glfsxmp would fail way down in XDR code. The reasons are
still a bit unclear, but exactly duplicating the build flags etc. we use
for other programs seems to fix the issue. With this change, we have
one example of one set of flags that can be used to build other GFAPI
programs.
This is a port of D5370316 to 3.8
Reviewed By: sshreyas
Change-Id: I74535a791545189f829f10f04caf34a8a07295f7
Reviewed-on: https://review.gluster.org/18240
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
There appears to be a thread leak somewhere, which causes io-threads to
run out of threads to process a particular (priority-based) queue.
The leak should obviously be fixed, but that might take a while
and the consequences until then are severe - a brick essentially going
offline without the courtesy of actually dying. This patch adds a
watchdog that checks for stuck queues, and adds threads to break the deadlock.
The same thing done manually on one afflicted cluster caused brick CPU usage
to drop from 2600% to 400%, with latency quickly returning to normal.
The controlling option is performance.iot-watchdog-secs,
which is the number of seconds we'll watch for a non-empty
queue with no items being dequeued. That's our signal to
add a thread. A value of zero (the default) disables
this watchdog feature.
This is a port of D5177414 to 3.8.
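
The stuck-queue check can be sketched as a once-per-second tick. This is a simplified Python model rather than the io-threads code: the class name `QueueWatch`, the per-second tick and the counters passed in are assumptions; only the rule (non-empty queue, no dequeues for N seconds, zero disables) comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class QueueWatch:
    watchdog_secs: int      # 0 disables the watchdog, as with the option
    stalled_secs: int = 0   # consecutive seconds without progress
    last_dequeued: int = 0  # dequeue counter seen on the previous tick

    def tick(self, queue_size: int, total_dequeued: int) -> bool:
        """Call once per second; True means 'add a thread now'."""
        if self.watchdog_secs <= 0:
            return False
        made_progress = total_dequeued != self.last_dequeued
        self.last_dequeued = total_dequeued
        if queue_size == 0 or made_progress:
            self.stalled_secs = 0
            return False
        self.stalled_secs += 1
        if self.stalled_secs >= self.watchdog_secs:
            self.stalled_secs = 0  # start a fresh observation window
            return True
        return False
```

This mirrors the gdb experiment in the test plan: a queue that looks non-empty but never dequeues trips the watchdog after `watchdog_secs` ticks.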
Test Plan: All the usual tests to determine safety.
Use gdb to hack priv->queue_sizes to a non-zero value. This will make it look like the queue is non-empty, but since it does in fact have zero items there will be no dequeues. After watchdog-secs seconds, this should add a thread, with a corresponding entry in the brick log.
Change-Id: Ic051e411d3e9351e1cf5e233bad8bbb5078cb259
Reviewed-on: https://review.gluster.org/18239
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Change-Id: I520894244063ef854b4416cb5418065bd9de7277
Reviewed-on: https://review.gluster.org/18237
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
Add outstanding-req field to track requests that have been sent
down the stack and haven't come back.
This is a port of D4908836 to 3.8
Reviewers: sshreyas
Change-Id: I5870f63008d553416109c1808a434f526f5a633d
Reviewed-on: https://review.gluster.org/18236
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
Two new volume options that control reads.
performance.io-cache.read-size
- Tells gluster how much it should try to read on each posix_readv call
performance.io-cache.min-cached-read-size
- Tells gluster the smallest files it should start caching, anything smaller is not cached
This is a port of D4844662 to 3.8
Change-Id: I5ba891906f97e514e7365cc34374619379434766
Reviewed-on: https://review.gluster.org/18235
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- Sometimes the process that glusterd is trying to kill is already dead.
- In that case, if it can't find the pid, it should just continue on and not fail the entire operation.
- This is a port of D4837916 to 3.8
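
The "already dead is not a failure" rule looks roughly like this. It is a Python illustration of the idea only; glusterd is C, and the helper name `kill_if_running` is hypothetical.

```python
import os
import signal


def kill_if_running(pid: int, sig: int = signal.SIGTERM) -> bool:
    """Send sig to pid; return False instead of failing if it is gone."""
    try:
        os.kill(pid, sig)
    except ProcessLookupError:
        # No such process: treat it as already stopped and carry on.
        return False
    return True
```

A caller stopping several daemons can ignore the return value and continue with the rest of the operation.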
Change-Id: Ic96952a8d31927446f648830ede6ccd82512663f
Reviewed-on: https://review.gluster.org/18234
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
Too many hard links blow up btrfs by exceeding the max xattr size (recording
a pgfid for each hardlink). Add a limit to prevent this explosion.
This is a port of D4682329 to 3.8
Reviewed By: sshreyas
Change-Id: I614a247834fb8f2b2743c0c67d11cefafff0dbaa
Reviewed-on: https://review.gluster.org/18232
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
AFR currently waits for all children to respond before sending an UP
message. This means that one dead host can cause us to wait a TCP
timeout (2 mins!) before declaring the volume up.
Now we send an UP as soon as quorum is obtained.
This is a port of D4701919 to 3.8.
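
A toy model of the new notification rule, in Python for illustration only. The class name `UpNotifier` and the explicit `quorum_count` parameter are assumptions; AFR derives its quorum from volume options that this sketch does not model.

```python
class UpNotifier:
    """Emit a single UP event as soon as enough children have responded."""

    def __init__(self, child_count: int, quorum_count: int) -> None:
        self.child_count = child_count
        self.quorum_count = quorum_count  # assumed to be supplied by config
        self.up_count = 0
        self.sent_up = False

    def child_up(self) -> bool:
        """Record one child coming up; True means 'send UP now'."""
        self.up_count = min(self.up_count + 1, self.child_count)
        if not self.sent_up and self.up_count >= self.quorum_count:
            self.sent_up = True
            return True
        return False
```

With three children and a quorum of two, UP fires on the second response, so a third, dead child no longer delays it by a TCP timeout.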
Reviewed By: sshreyas
Change-Id: I642d4eb7dc7e0b289e89b7a16abf99a3f98aa8b3
Reviewed-on: https://review.gluster.org/18231
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- When you write a file and then stat it immediately, md-cache returns stale stat information.
- This diff implements flush() in md-cache so that we can correctly invalidate inodes after
a write.
- This is a port of D4762171 to 3.8
Reviewers: kvigor, dph
Reviewed By: kvigor
Change-Id: I368b7870d61b14a7e390917d195cbccc67029eb7
Reviewed-on: https://review.gluster.org/18233
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- This diff adds error counts and rates to the regular io-stats dump.
- It outputs keys that look like this:
"storage.gluster.nfsd.groot.aggr.errors.<error_name>.count": "6",
"storage.gluster.nfsd.groot.inter.errors.<error_name>.per_sec": "0.00"
- <error_name> is the lowercase representation of errno values (e.g., ENOENT -> enoent, etc.)
- This is a port of D4691581 to 3.8
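
The key naming can be reproduced with Python's `errno` table. This is a sketch of the format shown above, not the io-stats code; the helper name `error_keys` and its arguments are made up for the example.

```python
import errno
from typing import Dict


def error_keys(prefix: str, err: int, count: int, per_sec: float) -> Dict[str, str]:
    """Build the aggregate-count and interval-rate keys for one errno."""
    name = errno.errorcode[err].lower()  # e.g. ENOENT -> "enoent"
    return {
        f"{prefix}.aggr.errors.{name}.count": str(count),
        f"{prefix}.inter.errors.{name}.per_sec": f"{per_sec:.2f}",
    }
```

Calling `error_keys("storage.gluster.nfsd.groot", errno.ENOENT, 6, 0.0)` yields the two keys from the example with `enoent` in place of `<error_name>`.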
Reviewers: dph, kvigor
Reviewed By: kvigor
Change-Id: I96857d4283c47f9d330ae1978f113013e7c78a87
Reviewed-on: https://review.gluster.org/18230
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- There is a known kernel bug that causes reads to disk to be limited by the RA setting in /sys/block/sd[a-z]/queue/read_ahead_kb.
- The workaround is to fadvise POSIX_FADV_RANDOM on file descriptors before reading.
- This is a port of D4585521 to 3.8
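
For reference, the same advice can be issued from Python through `os.posix_fadvise`. This is a minimal sketch of the workaround, assuming a Unix platform; the helper name `read_with_random_advice` is hypothetical and the actual change lives in the posix translator's C read path.

```python
import os


def read_with_random_advice(path: str, size: int, offset: int = 0) -> bytes:
    """Open a file, advise random access, then read size bytes at offset."""
    fd = os.open(path, os.O_RDONLY)
    try:
        if hasattr(os, "posix_fadvise"):
            # offset=0, len=0 applies the advice to the whole file.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_RANDOM)
        return os.pread(fd, size, offset)
    finally:
        os.close(fd)
```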
Test Plan: Still need to figure out a good test for this, other than simple inspection.
Reviewers: rwareing, kvigor
Reviewed By: kvigor
Change-Id: I4a307573da620d9a1955fb5f4e8cd67154e11ace
Reviewed-on: https://review.gluster.org/18229
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
This translator tags namespaces with a unique hash that corresponds to the
top-level directory (right under the gluster root) of the file the fop acts
on. The hash information is injected into the call frame by this translator,
so this namespace information can later be used to do throttling, QoS and
other namespace-specific stats collection and actions in later xlators
further down the stack.
When the translator can't find a path directly for the fd_t or loc_t, it winds
a GET_ANCESTRY_PATH_KEY down to the posix xlator to get the path manually.
Caching this namespace information in the inode makes sure that most requests
don't need to recalculate the hash, so that typically fops are just doing an
inode_ctx_get instead of the more expensive code paths that this xlator can take.
Right now the xlator is hard-coded to only hash the top-level directory, but
this could be easily extended to more sophisticated matching by modification
of the parse_path function.
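
The tagging scheme can be sketched as follows. This is a Python model for illustration: CRC32 stands in for whatever hash the translator actually uses, a plain dict stands in for the inode context, and the `tag` helper is hypothetical. Only the "hash the top-level directory and cache it" idea is taken from the description.

```python
import zlib
from typing import Dict, Optional


def parse_path(path: str) -> Optional[str]:
    """Return the top-level component under the root, or None for '/'."""
    parts = [p for p in path.split("/") if p]
    return parts[0] if parts else None


def namespace_hash(path: str) -> Optional[int]:
    """Hash of the top-level directory (CRC32 is an assumed stand-in)."""
    top = parse_path(path)
    return None if top is None else zlib.crc32(top.encode("utf-8"))


def tag(inode_ctx: Dict[str, Optional[int]], path: str) -> Optional[int]:
    """Return the cached namespace tag, computing it only on first use."""
    if "namespace" not in inode_ctx:
        inode_ctx["namespace"] = namespace_hash(path)
    return inode_ctx["namespace"]
```

Every path under `/a` gets the same tag, and once an inode carries the tag, later fops skip the path parsing entirely.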
Test Plan:
Run `prove -v tests/basic/namespace.t` to see that tagging works.
Change-Id: I960ddadba114120ac449d27a769d409cc3759ebc
Reviewed-on: https://review.gluster.org/18041
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- This gives md-cache the ability to cache statfs calls
- You can turn it on or off via 'gluster vol set groot performance.md-cache-statfs <on|off>'
- This is a port of D4652632
Test Plan: Tested functionality on devserver
Reviewers: kvigor
Reviewed By: kvigor
Subscribers: #posix_storage
Differential Revision: https://phabricator.intern.facebook.com/D4652632
Change-Id: I664579e3c19fb9a6cd9d7b3a0eae061f70f4def4
Signature: t1:4652632:1488581841:111cc01efe83c71f1e98d075abb10589c4574705
Reviewed-on: https://review.gluster.org/18228
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- Fixes the unnecessary log spew in other daemons
- This is a port of D3646627 to 3.8
Reviewers: rwareing, kvigor
Reviewed By: kvigor
Change-Id: Id54ab41cdfdd2006d3af2d8774c38025c566c523
Reviewed-on: https://review.gluster.org/18199
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- Adds the ability for gluster to log every single CREATE and UNLINK that happens on the bricks (right before invoking sys_unlink() or open(...| O_CREAT))
- Makes it so that CREATEs and UNLINKs are not downsampled in io-stats
- This is a port of D3268156, D3778968, D3903894 & D3301527 to 3.8
Reviewed By: kvigor
Change-Id: I1bce28068c02b7d202f094094237646b4d39794b
Reviewed-on: https://review.gluster.org/18198
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- Log an OOPS and bail when *parent is null just before going into
posix_resolve code path (to avoid crash)
Test Plan: - Prove test/canary on cluster
Differential Revision: https://phabricator.fb.com/D2640497
Change-Id: I6140ef6fdb711748dad1c66d929aca36328bc574
Reviewed-on: https://review.gluster.org/17969
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
|
Summary:
- This queue will be used to hold the set of directory crawl / file
migrate operations in the multi-threaded rebalance.
- This is a port of D3712047 to 3.8
Test Plan: Unit test included.
Reviewed By: sshreyas
Change-Id: I25497a64beba744430807b3512eaee5d90f089c4
Reviewed-on: https://review.gluster.org/18197
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- Prior to this diff, Gluster would simply log "One or more clients cannot ..."
- With this diff, we now show up to 20 clients that are mismatched.
- This is a port of D3313082 to 3.8
Reviewers: rwareing, kvigor
Reviewed By: kvigor
Change-Id: Ia8830f18c922bda1aee787a2e3d6033164bb64d4
Reviewed-on: https://review.gluster.org/18196
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- Adds iamshd (iamnfsd already there due to fop throttling)
options to io-stats xlator.
- Leverages these options to correctly write multi-volume NFSd stats
- This is a port of D2714648 to 3.8
Test Plan:
- Tested on local dev server, verified multiple files are generated for
multiple vols
Change-Id: Id2014a135fe52045da462eaaa91f336f45cdf167
Reviewed-on: https://review.gluster.org/18195
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- We noticed some folks name their files all the way up to NAME_MAX (usually 255) and when split-brain is encountered, we fail to heal the file.
- This diff puts an upper bound on the number of bytes we will snprintf into the buffer so that we do not fail the rename.
- This is a port of D3646254 to 3.8
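
The bounding idea, sketched in Python. The commit does not say what the renamed file looks like, so the suffix argument and the helper name `bounded_name` are assumptions; the point is only that the original name is truncated so the result never exceeds NAME_MAX bytes.

```python
NAME_MAX = 255  # usual Linux limit, in bytes


def bounded_name(name: str, suffix: str, limit: int = NAME_MAX) -> str:
    """Append suffix to name, truncating name so the result fits limit."""
    suffix_bytes = suffix.encode("utf-8")
    room = limit - len(suffix_bytes)
    if room < 0:
        raise ValueError("suffix alone exceeds the name limit")
    # Truncate on bytes, then drop any partial multi-byte character.
    head = name.encode("utf-8")[:room].decode("utf-8", errors="ignore")
    return head + suffix
```

A 255-byte name plus a suffix still produces a 255-byte result, so the subsequent rename cannot fail with ENAMETOOLONG for this reason.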
Test Plan: Prove test -- can show it fails without patch as well.
Reviewers: #posix_storage, rwareing
Reviewed By: rwareing
Change-Id: I51c6b28374d4a3f21e29044cb727b4b1da7b69e1
Reviewed-on: https://review.gluster.org/18194
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- We have a thread that checks if connected clients are "still" authorized for a mount.
- This thread is currently only checking the IP (regression from the 3.4 -> 3.6 rebase, perhaps).
- This diff adds code to check the IP *and* the FQDN before unmounting the client.
Test Plan: Tested on devserver, auth prove tests.
Reviewers: rwareing, kvigor
Reviewed By: kvigor
Change-Id: I441a4436d8df064d2f09a2539acb780ab53943f6
Reviewed-on: https://review.gluster.org/18193
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- Our current approach to measuring "average fop latency" is badly
flawed in that it doesn't weight the FOPs correctly according to how
many occurred in the time interval. This makes Statisticians very
sad. This patch adds an internally computed weighted average
latency which will be far more efficient to display via ODS, as well
as having the benefit of not being complete nonsense.
- This is a port of D3148415 & D3405772 to 3.8
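
The difference between the two averages is easy to see with a small worked example. The Python below is illustrative only; the function name and the `(count, average latency)` input shape are assumptions, not the io-stats data structures.

```python
from typing import Sequence, Tuple


def weighted_avg_latency(samples: Sequence[Tuple[int, float]]) -> float:
    """Average latency weighted by how many FOPs each sample covers.

    samples holds (fop_count, avg_latency_usec) pairs, one per FOP type.
    """
    total = sum(count for count, _ in samples)
    if total == 0:
        return 0.0
    return sum(count * latency for count, latency in samples) / total
```

For 1000 FOPs averaging 10 us and 10 FOPs averaging 1000 us, the plain mean of the two averages is 505 us, while the weighted figure is 20000 / 1010, roughly 19.8 us, which is much closer to what a typical request experienced.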
Reviewers: kvigor, dph, sshreyas
Reviewed By: sshreyas
Change-Id: Ie3618f279b545610b7ed1a8482243fcc8dc53217
Reviewed-on: https://review.gluster.org/18192
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- Per title
- This is a port of D2875451 to 3.8
Test Plan: Live?
Reviewers: dph, moox, dld, rwareing
Reviewed By: rwareing
Change-Id: Ie2862bcbb49d1159cf2465d48cc506f629c527e0
Reviewed-on: https://review.gluster.org/18191
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- This diff enables gfproxyd to output a stats file that looks like 'glusterfs_gfproxyd_{volname}.dump'
- This is a port of D3753684 to 3.8
Test Plan: Tested on devserver, verified output.
Reviewers: kvigor
Reviewed By: kvigor
Change-Id: I8559974e9d24976fd1c8b6145fbc81be40fd4134
Reviewed-on: https://review.gluster.org/18189
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
- PGFID healing is pointless when a child is down, since the heal will
fail for that reason (and we have no signal for this). Instead
restrict PGFID healing to the case where all children are up.
- This is a port of D3100450 to 3.8
Test Plan: Run prove -v tests/basic/afr/shd-pgfid-heal.t
Reviewers: kvigor, sshreyas
Reviewed By: sshreyas
Change-Id: I88e542449e3b40415cd201ff39694e86eef65a6e
Reviewed-on: https://review.gluster.org/18190
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
Summary:
Add AFR quorum state to io-stats translator.
Sample output:
{
"storage.gluster.nfsd.test-replicate-0.has-quorum": "1",
"storage.gluster.nfsd.test-replicate-0.quorum-threshold": "1",
"storage.gluster.nfsd.test-replicate-1.has-quorum": "1",
"storage.gluster.nfsd.test-replicate-1.quorum-threshold": "1"
}
The quorum-threshold field shows the number of bricks that can be
lost while still maintaining quorum. Negative numbers indicate
that quorum has been lost and show the number of bricks that must
be brought online to restore quorum.
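
One way to read the quorum-threshold definition is as "bricks up minus bricks required". The Python sketch below encodes that reading. It is an assumption, not the AFR code: `quorum_count` is passed in rather than derived from volume options, and the "0" value for a lost quorum is inferred since the sample only shows "1".

```python
from typing import Dict


def quorum_stats(prefix: str, up_count: int, quorum_count: int) -> Dict[str, str]:
    """Build the two io-stats keys for one replicate subvolume."""
    threshold = up_count - quorum_count  # negative: bricks still needed
    return {
        f"{prefix}.has-quorum": "1" if threshold >= 0 else "0",
        f"{prefix}.quorum-threshold": str(threshold),
    }
```

Under this reading, three bricks up with a quorum of two gives a threshold of 1, matching the sample output; one brick up gives -1, meaning one more brick must come online.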
Additionally, I found that the code contained both
afr_have_quorum() and afr_has_quorum(), which were mostly
cut-n-pasted copies of each other, but with subtle differences.
Mercifully, afr_have_quorum() was totally unused, so I nuked it in
passing.
This is a port of D4089969 to 3.8.
Test Plan: Run, observe stats output. Kill brick, observe proper change. fb-smoke.
Reviewers: #posix_storage, sshreyas
Reviewed By: sshreyas
Subscribers: sshreyas
Differential Revision: https://phabricator.intern.facebook.com/D4089969
Change-Id: Ifddb351aebfe63998846bb52be8942415ce4c1a9
Reviewed-on: https://review.gluster.org/18188
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Reduces the version of the change log option to 2 (3.4.x) so we can
disable this *server*-side feature when older clients are attached
Test Plan:
- Was required as a hotfix for a broken cluster. After installing an RC
with this patch, we were able to kill the feature and stabilize the
cluster.
Reviewers: sshreyas, moox, dph, dld, kvigor
Reviewed By: kvigor
Differential Revision: https://phabricator.fb.com/D2981552
Change-Id: I515e2bb520585e5efaa305b1acbab21ebc7218a9
Reviewed-on: https://review.gluster.org/18183
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ic77287c1b96ae426b927b4bf6f2826d6f3a3b17d
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/18175
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|\
| |
| |
| | |
Change-Id: Ie35cd1c8c7808949ddf79b3189f1f8bf0ff70ed8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
BUG: 1469558
Change-Id: Ia9a4e69e5d7dfd33933b20b7c4ea41e439d3c838
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: https://review.gluster.org/18039
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
master review https://review.gluster.org/17092 circa April 2017
Fix already exists in release-3.12 and release-3.11 branches
Hat tip to Shyam (srangana[at]redhat.com) who found the existing
fix after sitting and debugging it with me for several hours.
Reported-by: Kinglong Mee <mijinlong@open-fs.com>
Change-Id: Ic7169fd05aff7bf46108e8ac7b1f29688a7f2358
BUG: 1481398
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: https://review.gluster.org/18037
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kinglong Mee <kinglongmee@gmail.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Prashanth Pai <ppai@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The xmlTextWriter* APIs return either the number of bytes written (which
may be 0 because of buffering) or -1 on error. When the XML payload is
small, the data is usually buffered and the calls return 0. As the
payload grows, these APIs sometimes return the total number of bytes
written, so every such call must be followed by XML_RET_CHECK_AND_GOTO.
Otherwise we may end up returning a non-zero ret code, which causes the
overall XML generation to fail.
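The failure mode can be reproduced in a small, self-contained sketch. Nothing here calls libxml2: `fake_write_string` is a hypothetical stand-in that mimics the "bytes written or -1" convention, and `RET_CHECK_AND_GOTO` is an assumed approximation of what XML_RET_CHECK_AND_GOTO is meant to do (normalize any non-negative return to 0).

```c
#include <string.h>

/* Assumed shape of the check: negative means error, anything else
 * is normalized to 0 so a byte count never leaks out as ret. */
#define RET_CHECK_AND_GOTO(ret, label) \
    do {                               \
        if ((ret) < 0) {               \
            (ret) = -1;                \
            goto label;                \
        }                              \
        (ret) = 0;                     \
    } while (0)

/* Hypothetical writer: returns the byte count, or -1 when fail is set. */
static int fake_write_string(const char *text, int fail)
{
    if (fail)
        return -1;
    return (int)strlen(text);
}

/* Every write is followed by the check, so ret is 0 or -1. */
int emit_checked(int fail_second)
{
    int ret = -1;

    ret = fake_write_string("<volName>test</volName>", 0);
    RET_CHECK_AND_GOTO(ret, out);

    ret = fake_write_string("<status>1</status>", fail_second);
    RET_CHECK_AND_GOTO(ret, out);
out:
    return ret;
}

/* The last write is not checked, so a positive byte count escapes
 * and a caller testing "ret != 0" treats success as failure. */
int emit_unchecked(void)
{
    int ret = -1;

    ret = fake_write_string("<volName>test</volName>", 0);
    RET_CHECK_AND_GOTO(ret, out);

    ret = fake_write_string("<status>1</status>", 0);
out:
    return ret;
}
```

In this sketch `emit_checked(0)` yields 0, while `emit_unchecked()` yields a positive value even though nothing went wrong.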
>Reviewed-on: https://review.gluster.org/17702
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>Reviewed-by: Amar Tumballi <amarts@redhat.com>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>Reviewed-by: Gaurav Yadav <gyadav@redhat.com>
Change-Id: I02ee7076e1d8c26cf654d3dc3e77b1eb17cbdab0
BUG: 1470495
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-on: https://review.gluster.org/17766
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Samikshan Bairagya <samikshan@gmail.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
... to make the change in commit acf8cfdf truly useful.
Without this, a race between entry creation fops and lookup
at posix layer can cause lookups to fail with ENODATA, as
opposed to ENOENT.
Backport of:
> Change-Id: I44a226872283a25f1f4812f03f68921c5eb335bb
> Reviewed-on: https://review.gluster.org/17821
> BUG: 1472758
> cherry-picked from 669868d23eaeba42809fca7be134137c607d64ed
Change-Id: I44a226872283a25f1f4812f03f68921c5eb335bb
BUG: 1480193
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: https://review.gluster.org/18015
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Backport of https://review.gluster.org/#/c/17717/
Problem:
In a 3-way replica, when the source brick does not have pending xattrs
for the sinks, but the 2 sinks blame each other, metadata heal was not
happening because we were not setting all non-sources as sinks.
Fix: Mark all non-sources as sinks, like it is done in data and entry
heal.
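As a rough illustration of the fix, the sketch below marks every child that is not a source as a sink. The array layout and the function name are hypothetical and do not reproduce the AFR implementation.

```c
#include <stddef.h>

/* Hypothetical helper: any child that is not a source becomes a sink,
 * regardless of which children the pending xattrs blame. */
void mark_non_sources_as_sinks(const unsigned char *sources,
                               unsigned char *sinks,
                               size_t child_count)
{
    for (size_t i = 0; i < child_count; i++) {
        if (!sources[i])
            sinks[i] = 1;
    }
}
```

For the 3-way case described above, sources `{1, 0, 0}` produce sinks `{0, 1, 1}`, so both mutually blaming bricks are healed from the source.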
Change-Id: I534978940f5087302e307fcc810a48ffe898ce08
BUG: 1471613
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: https://review.gluster.org/17784
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|