| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
FALLOCATE file operations is not implemented in the
existing EC code. This change set implements it
for EC.
BUG: 1448293
Change-Id: Id9ed914db984c327c16878a5b2304a0ea461b623
Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
Reviewed-on: https://review.gluster.org/15200
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Issue:
In nameless lookup/other fops, parent inode will be NULL, when we try
to add the cache to the NULL inode, it causes a crash.
Hence handle the scenario of nameless fops, and do not cache/serve
the nameless fops.
Change-Id: I3b90f882ac89e6aaf3419db89e6f890797f37700
BUG: 1451588
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: https://review.gluster.org/17316
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Issue:
The max value of rda-cache-limit is 1GB before this patch.
When parallel-readdir is enabled, there will be many instances of
readdir-ahead, hence the rda-cache-limit depends on the number of
instances. Eg: On a volume with distribute count 4, rda-cache-limit
when parallel-readdir is enabled, will be 4GB instead of 1GB.
Consider a followinf sequence of operations:
- Enable parallel readdir
- Set rda-cache-limit to lets say 3GB
- Disable parallel-readdir, this results in one instance of readdir-ahead
and the rda-cache-limit will be back to 1GB, but the current value is 3GB
and hence the mount will stop working as 3GB > max 1GB.
Solution:
To fix this, we can limit the cache to 1GB even when parallel-readdir
is enabled. But there is no necessity to limit the cache to 1GB, it
can be increased if the system has enough resources. Hence getting rid
of the rda-cache-limit max value is more apt. If we just change the
rda-cache-limit max to INFINITY, we will render older(<3.11) clients
broken, when the rda-cache-limit is set to > 1GB (as the older clients
still expect a value < 1GB). To safely change the max value of
rda-cache-limit to INFINITY, add a check in glusted to verify all
the clients are > 3.11 if the value exceeds 1GB.
Change-Id: Id0cdda3b053287b659c7bf511b13db2e45b92032
BUG: 1446516
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: https://review.gluster.org/17338
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With brick multiplexing enabled, upon a node reboot new bricks were
not being attached to the first spawned brick process even though
there wasn't any compatibility issues.
The reason for this is that upon glusterd restart after a node
reboot, since brick services aren't running, glusterd starts the
bricks in a "no-wait" mode. So after a brick process is spawned for
the first brick, there isn't enough time for the corresponding pid
file to get populated with a value before the compatibilty check is
made for the next brick.
This commit solves this by iteratively waiting for the pidfile to be
populated in the brick compatibility comparison stage before checking
if the brick process is alive.
Change-Id: Ibd1f8e54c63e4bb04162143c9d70f09918a44aa4
BUG: 1451248
Signed-off-by: Samikshan Bairagya <samikshan@gmail.com>
Reviewed-on: https://review.gluster.org/17307
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These options will cause brick's log complains:
_log_if_unknown_option] 0-patchy-quota: option 'timeout' is not recognized
_log_if_unknown_option] 0-patchy-server: option 'ping-timeout' is not recognized
Change-Id: Ida2add13f792736a4e52bfaf38d1169309283a3f
BUG: 1449008
Signed-off-by: Zhou Zhengping <johnzzpcrystal@gmail.com>
Reviewed-on: https://review.gluster.org/17213
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Prashanth Pai <ppai@redhat.com>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: detach status xml output was broken because
of the wrong argument. The status_op sent to verify
whether it is a tier status command was as false.
Fix: the argument being passed was changed from false
to true.
Change-Id: I8cdd4dd972d6bfbb61c1182cbf4097767f83c7c5
BUG: 1446362
Signed-off-by: hari gowtham <hgowtham@redhat.com>
Reviewed-on: https://review.gluster.org/17131
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: hari gowtham <hari.gowtham005@gmail.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Rebalance compares the node-uuid of a file against its own
to and migrates a file only if they match. However, the
current behaviour in both AFR and EC is to return
the node-uuid of the first brick in a replica set for all
files. This means a single node ends up migrating all
the files if the first brick of every replica set is on the
same node.
Fix:
AFR and EC will return all node-uuids for the replica set.
The rebalance process will divide the files to be migrated
among all the nodes by hashing the gfid of the file and
using that value to select a node to perform the migration.
This patch makes the required DHT and tiering changes.
Some tests in rebal-all-nodes-migrate.t will need to be
uncommented once the AFR and EC changes are merged.
Change-Id: I5ce41600f5ba0e244ddfd986e2ba8fa23329ff0c
BUG: 1366817
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: https://review.gluster.org/17239
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
gfid-mismatch-resolution-with-fav-child-policy.t does a `TEST ls
$M0/f3` (line #170) to trigger healing of a file in gfid split-brain in
a rep-3 volume. But the code to trigger name heal of gfid split-brain
file is not yet there. The test is passing due a lookup/ stat on $M0
which triggers a background entry self heal (which has the code to heal
gfid split-brain files) which may or may not complete the heal before
line 170. If it doesn't, lookup on f3 is failing with EIO.
Add the .t to bad tests until Karthik's patch for CLI based gfid
split-brain resolution fixes name heal also.
Change-Id: Iba6e9d81db386bc406aff1ecb6a18851f09bf7c0
BUG: 1450730
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: https://review.gluster.org/17290
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adding a testcase to check the proper handling of "." and ".."
in gfapi path.
The patch which fix the issue is https://review.gluster.org/#/c/17177
Change-Id: I5c9cceade30f7d8a3b451b5f34f1cf9815729c4a
BUG: 1447266
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: https://review.gluster.org/17216
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reset brick currently kills of the corresponding brick process.
However, with brick multiplexing enabled, stopping the brick
process would render all bricks attached to it unavailable. To
handle this correctly, we need to make sure that the brick process
is terminated only if brick-multiplexing is disabled. Otherwise,
we should send the GLUSTERD_BRICK_TERMINATE rpc to the respective
brick process to detach the brick that is to be reset.
Change-Id: I69002d66ffe6ec36ef48af09b66c522c6d35ac58
BUG: 1446172
Signed-off-by: Samikshan Bairagya <samikshan@gmail.com>
Reviewed-on: https://review.gluster.org/17128
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The test is failing in master. see gluster-devel for more details.
Change-Id: I7a589ad2c54bd55d62f4e66fdf8037c19fc123ea
BUG: 1448364
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: https://review.gluster.org/17234
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Include the 'none' option as well in the output. This fixes the bug in
commit 335555d256d444f4952ce239168f72b393370f01.
Also added a test-case.
This is a
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Change-Id: I479a14ae69ecae5a03e85e73ed50c19b483df603
BUG: 1448804
Reviewed-on: https://review.gluster.org/17215
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: While brick-muliplexing is on after restarting glusterd, CLI is
not showing pid of all brick processes in all volumes.
Solution: While brick-mux is on all local brick process communicated through one
UNIX socket but as per current code (glusterd_brick_start) it is trying
to communicate with separate UNIX socket for each volume which is populated
based on brick-name and vol-name.Because of multiplexing design only one
UNIX socket is opened so it is throwing poller error and not able to
fetch correct status of brick process through cli process.
To resolve the problem write a new function glusterd_set_socket_filepath_for_mux
that will call by glusterd_brick_start to validate about the existence of socketpath.
To avoid the continuous EPOLLERR erros in logs update socket_connect code.
Test: To reproduce the issue followed below steps
1) Create two distributed volumes(dist1 and dist2)
2) Set cluster.brick-multiplex is on
3) kill glusterd
4) run command gluster v status
After apply the patch it shows correct pid for all volumes
BUG: 1444596
Change-Id: I5d10af69dea0d0ca19511f43870f34295a54a4d2
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
Reviewed-on: https://review.gluster.org/17101
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Prashanth Pai <ppai@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The aux mount is created on the first limit/remove_limit/list command
and it remains until volume is stopped / deleted / (quota is disabled)
, where we do a lazy unmount. If the process is uncleanly terminated,
then the mount entry remains and we get (Transport disconnected) error
on subsequent attempts to run quota list/limit-usage/remove commands.
Second issue, There is also a risk of inadvertent rm -rf on the
/var/run/gluster causing data loss for the user. Ideally, /var/run is
a temp path for application use and should not cause any data loss to
persistent storage.
Solution:
1) unmount the aux mount after each use.
2) clean stale mount before mounting, if any.
One caveat with doing mount/unmount on each command is that we cannot
use same mount point for both list and limit commands.
The reason for this is that list command needs mount to be accessible
in cli after response from glusterd, So it could be unmounted by a
limit command if executed in parallel (had we used same mount point)
Hence we use separate mount points for list and limit commands.
Change-Id: I4f9e39da2ac2b65941399bffb6440db8a6ba59d0
BUG: 1433906
Signed-off-by: Sanoj Unnikrishnan <sunnikri@redhat.com>
Reviewed-on: https://review.gluster.org/16938
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Manikandan Selvaganesh <manikandancs333@gmail.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Both low and hi watermark can be set to same value
as the check missed the case for being equal.
Fix: Add the check to both the hi and low values being equal
along with the low value being higher than hi value.
Change-Id: Ia235163aeefdcb2a059e2e58a5cfd8fb7f1a4c64
BUG: 1447960
Signed-off-by: hari gowtham <hgowtham@redhat.com>
Reviewed-on: https://review.gluster.org/17175
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: hari gowtham <hari.gowtham005@gmail.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: Milind Changire <mchangir@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Its known that readdirplus operation fetches stat as well for each of the
dirents. But often applications may need extra information, like for eg.,
NFS-Ganesha which operates on handles needs handles for each of those
dirents returned. So this would require extra calls to the backend, in this
case LOOKUP (which is very expensive operation) resulting in very low
readdir performance.
To address that introducing this new API using which applications can
make request for any extra information to be returned as part of
readdirplus response.
Currently this new api returns stat and handles as demanded by application.
The synopsis of the API is noted in glfs.h.
@todo:
* Enhance test script using this new API
Below were the perf results on single brick volume with and without
these changes -
Dataset used -
10*100 directories and each directory containing 100 empty files.
I used NFS-Ganesha application to test these changes -
>for i in {1..5}; do systemctl restart nfs-ganesha; sleep 10; mount -t nfs -o vers=4 localhost:/brick_vol /mnt; cd /mnt; echo "ITERATION$i"; date; find . > tmp-nfs.log; date; cd /; umount /mnt; sleep 2; done;
Without these changes -
ITERATION1
Mon Mar 20 17:22:26 IST 2017
Mon Mar 20 17:23:18 IST 2017
ITERATION2
Mon Mar 20 17:23:39 IST 2017
Mon Mar 20 17:24:28 IST 2017
ITERATION3
Mon Mar 20 17:24:49 IST 2017
Mon Mar 20 17:25:36 IST 2017
ITERATION4
Mon Mar 20 17:30:57 IST 2017
Mon Mar 20 17:31:37 IST 2017
ITERATION5
Mon Mar 20 17:31:57 IST 2017
Mon Mar 20 17:32:40 IST 2017
[root@dhcp35-197 /]#
On an average ~46.2 sec
With these changes applied -
ITERATION1
Mon Mar 20 17:35:03 IST 2017
Mon Mar 20 17:35:15 IST 2017
ITERATION2
Mon Mar 20 17:35:36 IST 2017
Mon Mar 20 17:35:46 IST 2017
ITERATION3
Mon Mar 20 17:36:06 IST 2017
Mon Mar 20 17:36:17 IST 2017
ITERATION4
Mon Mar 20 17:41:38 IST 2017
Mon Mar 20 17:41:49 IST 2017
ITERATION5
Mon Mar 20 17:42:10 IST 2017
Mon Mar 20 17:42:20 IST 2017
On an average ~10.8 sec
Updates #174
BUG: 1442950
Change-Id: I0f74f74dc62085ca4c4a23c38e3edc84bd850876
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Reviewed-on: https://review.gluster.org/15663
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Halo Geo-replication is a feature which allows Gluster or NFS clients to write
locally to their region (as defined by a latency "halo" or threshold if you
like), and have their writes asynchronously propagate from their origin to the
rest of the cluster. Clients can also write synchronously to the cluster
simply by specifying a halo-latency which is very large (e.g. 10seconds) which
will include all bricks.
In other words, it allows clients to decide at mount time if they desire
synchronous or asynchronous IO into a cluster and the cluster can support both
of these modes to any number of clients simultaneously.
There are a few new volume options due to this feature:
halo-shd-latency: The threshold below which self-heal daemons will
consider children (bricks) connected.
halo-nfsd-latency: The threshold below which NFS daemons will consider
children (bricks) connected.
halo-latency: The threshold below which all other clients will
consider children (bricks) connected.
halo-min-replicas: The minimum number of replicas which are to
be enforced regardless of latency specified in the above 3 options.
If the number of children falls below this threshold the next
best (chosen by latency) shall be swapped in.
New FUSE mount options:
halo-latency & halo-min-replicas: As descripted above.
This feature combined with multi-threaded SHD support (D1271745) results in
some pretty cool geo-replication possibilities.
Operational Notes:
- Global consistency is gaurenteed for synchronous clients, this is provided by
the existing entry-locking mechanism.
- Asynchronous clients on the other hand and merely consistent to their region.
Writes & deletes will be protected via entry-locks as usual preventing
concurrent writes into files which are undergoing replication. Read operations
on the other hand should never block.
- Writes are allowed from _any_ region and propagated from the origin to all
other regions. The take away from this is care should be taken to ensure
multiple writers do not write the same files resulting in a gfid split-brain
which will require resolution via split-brain policies (majority, mtime &
size). Recommended method for preventing this is using the nfs-auth feature to
define which region for each share has RW permissions, tiers not in the origin
region should have RO perms.
TODO:
- Synchronous clients (including the SHD) should choose clients from their own
region as preferred sources for reads. Most of the plumbing is in place for
this via the child_latency array.
- Better GFID split brain handling & better dent type split brain handling
(i.e. create a trash can and move the offending files into it).
- Tagging in addition to latency as a means of defining which children you wish
to synchronously write to
Test Plan:
- The usual suspects, clang, gcc w/ address sanitizer & valgrind
- Prove tests
Reviewers: jackl, dph, cjh, meyering
Reviewed By: meyering
Subscribers: ethanr
Differential Revision: https://phabricator.fb.com/D1272053
Tasks: 4117827
Change-Id: I694a9ab429722da538da171ec528406e77b5e6d1
BUG: 1428061
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: http://review.gluster.org/16099
Reviewed-on: https://review.gluster.org/16177
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Current rebalance throttle options: lazy/normal/aggressive may not always be
sufficient for the purpose of throttling. In our recent test, we observed for
certain setups, normal and aggressive modes behaved similarly consuming full
disk bandwidth. So in cases like this admin should be able to tune it
down(or vice versa) depending on the need.
Along with old throttle configurations, thread counts are tuned based on number.
e.g. gluster v set vol-name cluster-rebal.throttle 5.
Admin can tune up/down between 0 and the number of cores available.
Note: For heterogenous servers, validation will fail on the old server if "number"
is given for throttle configuration.
The message looks something like this:
"volume set: failed: Staging failed on vm2. Error: cluster.rebal-throttle should be {lazy|normal|aggressive}"
Test: Manual test by logging active thread number after reconfiguring throttle option.
testcase: tests/basic/distribute/throttle-rebal.t
Change-Id: I46e3cde546900307831028b344ecf601fd9b02c3
BUG: 1438370
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: https://review.gluster.org/16980
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the call to glfs_new("volname") passes a name for the volume and it
does not match the name of the subvolume in the graph, glfs_init() will
fail. This is easily reproducible by a gfapi program that loads the
volume from a .vol file, and not from a GlusterD server.
Change-Id: I33e77fbee7d12eaefe7c384fad6aecfa3582ea5a
BUG: 1425623
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: https://review.gluster.org/16796
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Prashanth Pai <ppai@redhat.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This new xlator does not allocate any resources on init(). This makes it
a good option to use for debugging xlator releated resources leaks on
fini().
By putting the sink xlator as single xlator in a .vol file, and loading
it through gfapi, we can investigate the resource leaks that are
happening through gfapi (and the Gluster core). By extending the .vol
file with additional xlators, it is possible to analyze resource leaks
of single xlators.
Change-Id: Idb5faa861b623dd5b2a988b181e669b0d52c2a0e
BUG: 1425623
Fixes: #176
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: https://review.gluster.org/16806
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Currently the automatic split brain resolution with favorite child policy
is not resolving the GFID split brains.
Fix:
When there is a GFID split brain and the favorite child policy is set to
size/mtime/ctime/majority, based on the policy decide on the source and
sinks. Delete the entry from the sinks and recreate it from the source.
Mark the appropriate pending attributes and resolve the GFID split brain.
When the heal takes place it will complete the pending heals and reset
the attributes.
Change-Id: Ie30e5373f94ca6f276745d9c3ad662b8acca6946
BUG: 1430719
Signed-off-by: karthik-us <ksubrahm@redhat.com>
Reviewed-on: https://review.gluster.org/16878
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before creating any file negative lookups(1 in Fuse, 4 in SMB etc.)
are sent to verify if the file already exists. By serving these
lookups from the cache when possible, increases the create
performance by multiple folds in SMB access and some percentage
in Fuse/NFS access.
Feature page: https://review.gluster.org/#/c/16436
Updates #82
Change-Id: Ib1c0e7ac7a386f943d84f6398c27f9a03665b2a4
BUG: 1442569
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: https://review.gluster.org/16952
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* As of now bit-rot-stub does versioning always. This leads
lots of getxattr calls being made in lookups. So make
object versioning optional.
Change-Id: I83713e45ae59fb28004bb3cfa008f2d69edebbfa
BUG: 1359599
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: https://review.gluster.org/14442
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In afr-v2, self-blaming xattrs are not there by design. But if the FOP
failed on a brick due to an error other than ENOTCONN (or even due to
ENOTCONN, but we regained connection before postop was wound), we wind
the post-op also on the failed brick, leading to setting self-blaming
xattrs on that brick. This can lead to undesired results like healing of
files in split-brain etc.
Fix:
If a fop failed on a brick on which pre-op was successful, do not
perform post-op on it. This also produces the desired effect of not
resetting the dirty xattr on the brick, which is how it should be
because if the fop failed on a brick, there is no reason to clear the
dirty bit which actually serves as an indication of the failure.
Change-Id: I5f1caf4d1b39f36cf8093ccef940118638caa9c4
BUG: 1438255
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: https://review.gluster.org/16976
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The patch fixes the recently seen issues with worm_sh.t test.
RCA:
-
$ git log --oneline xlators/features/read-only/src/worm.c
1b01bdc worm: allow Self-heal-Daemon to perform some operations
c5a4a77 features/worm: Adding implementation for ftruncate
-
These two patches were merged in reverse order of their submission,
and hence the check added for internal processes got missed in
new fop 'ftruncate()'. The worm_sh.t passed the tests as while
that patch got submitted there was no ftruncate() in worm xlator.
Change-Id: I81a8a45fa2679917a2c859c4f5224a2c3edbc784
BUG: 1423413
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Reviewed-on: https://review.gluster.org/17048
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Zhou Zhengping <johnzzpcrystal@gmail.com>
Reviewed-by: David Spisla <david.spisla@iternity.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Issue:
The value of linkto xattr is generally the name of the dht's
next subvol, this requires that the next subvol of dht is not
changed for the life time of the volume. But with parallel
readdir enabled, the readdir-ahead loaded below dht, is optional.
The linkto xattr for first subvol, when:
- parallel readdir is enabled : "<volname>-readdir-head-0"
- plain distribute volume : "<volname>-client-0"
- distribute replicate volume : "<volname>-afr-0"
The value of linkto xattr is "<volname>-readdir-head-0" when
parallel readdir is enabled, and is "<volname>-client-0" if
its disabled. But the dht_lookup takes care of healing if it
cannot identify which linkto subvol, the xattr points to.
In dht_lookup_cbk, if linkto xattr is found to be "<volname>-client-0"
and parallel readdir is enabled, then it cannot understand the
value "<volname>-client-0" as it expects "<volname>-readdir-head-0".
In that case, dht_lookup_everywhere is issued and then the linkto file
is unlinked and recreated with the right linkto xattr. The issue is
when parallel readdir is enabled, mount point accesses the file
that is currently being migrated. Since rebalance process doesn't
have parallel-readdir feature, it expects "<volname>-client-0"
where as mount expects "<volname>-readdir-head-0". Thus at some point
either the mount or rebalance will fail.
Solution:
Enable parallel-readdir for rebalance as well and then do not
allow enabling/disabling parallel-readdir if rebalance is in
progress.
Change-Id: I241ab966bdd850e667f7768840540546f5289483
BUG: 1436090
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: https://review.gluster.org/17056
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
test: Manual
created files of size 1K on 2 brick(of size 1GB) setup .
added a brick of size 16GB.
set min-free-disk to 12GB(so that first two bricks won't receive any files).
removed one of the 1st brick of size 1GB.
Logs from test:
[2017-04-12 08:52:08.196484] W [MSGID: 0] [dht-rebalance.c:895:__dht_check_free_space]
0-test1-dht: Write will cross min-free-disk for file - /tile32 on subvol - test1-client-1.
Looking for new subvol.
[2017-04-12 08:52:08.196904] I [MSGID: 0] [dht-rebalance.c:925:__dht_check_free_space]
0-test1-dht: new target found - test1-client-2 for file - /tile32
- Post migration we have two files. The new destination (/brick/1) has the data file
[root@vm1 ~]# ll /brick/1/tile32
-rw-r--r--. 2 root root 0 Apr 12 14:22 /brick/1/tile32
- On the old target the linkto file is there with linkto xattr pointing to /brick/1
[root@vm1 ~]# ll /tmp/2/tile32
---------T. 2 root root 1000 Apr 12 14:22 /tmp/2/tile32
[root@vm1 ~]# getfattr -m . -de text /tmp/2/tile32
getfattr: Removing leading '/' from absolute path names
security.selinux="unconfined_u:object_r:user_tmp_t:s0"
trusted.gfid="����:Aс�#�/'b2"
trusted.glusterfs.dht.linkto="test1-client-2"
Marking ./tests/features/worm_sh.t as bad test.
Reason being, this patch failed on master branch as well and it has nothing
to do with rebalance/remove-brick.
BUG: 1441508
Change-Id: I90bae251cda3d957a49cdceda90cd08311a392fb
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: https://review.gluster.org/17034
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit optionally adds client details corresponding to the
locally running bricks to the get-state output. Since getting
the client details involves sending RPC requests to the respective
local bricks, this is a relatively more costly operation. These
client details would be added to the get-state output only if the
get-state command is invoked with the 'detail' option.
This commit therefore also changes the get-state CLI usage. The
modified usage is as follows:
# gluster get-state [<daemon>] [[odir </path/to/output/dir/>] \
[file <filename>]] [detail]
Change-Id: I42cd4ef160f9e96d55a08a10d32c8ba44e4cd3d8
BUG: 1431183
Signed-off-by: Samikshan Bairagya <samikshan@gmail.com>
Reviewed-on: https://review.gluster.org/17003
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Self-Heal-Daemon should be allowed to trigger unlink, link,
trauncate, rename and write operation. The value of frame->root->pid
can be used to detect internal (by SHD) operations.
Change-Id: I7526148100bef1e2837d69df5c119dc97d91fffd
BUG: 1423413
Signed-off-by: David Spisla <david.spisla@iternity.com>
Reviewed-on: https://review.gluster.org/16661
Tested-by: jiffin tony Thottan <jthottan@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
bug-1421590-brick-mux-reuse-ports.t seems to be a bad test to me and here is my
reasoning:
This test tries to check if the ports are reused or not. When a volume is
restarted, by the time glusterd tries to allocate a new port to the one of the
brick processes of the volume there is no guarantee that the older port will be
allocated given the kernel might take some extra time to free up the port between
this time frame. From
https://build.gluster.org/job/regression-test-burn-in/2932/console we can
clearly see that post restart of the volume, glusterd allocated port 49153 &
49155 for brick1 & brick2 respectively but the test was expecting the ports to
be matched with 49155 & 49156 which were allocated before the volume was
restarted.
Change-Id: Id887bf28445261d4de04fc7502e58057659c9512
BUG: 1441035
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-on: https://review.gluster.org/17033
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
georep gsyncd's xtime needs to filtered irrespective
of any process access.
This way, we can avoid (unnecessarily)syncing xtime attribute
to slave, which may raise permission denied errors.
test case modified to check for xtime xattr only in backend.
Change-Id: I2390b703048d5cc747d91fa2ae884dc55de58669
BUG: 1353952
Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: https://review.gluster.org/14880
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Tested-by: Kotresh HR <khiremat@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently when prarallel readdir is enabled, setting any junk value
to rda-cache-limit and rda-request-size succeeds. This is because of
bug in the special handling of these options.
Fixing the same in this patch
Change-Id: I902cd9ac9134c158ab6f8aea4b001254a03547bd
BUG: 1439640
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: https://review.gluster.org/17008
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The test files that were marked as bad test were checked and
updated for centos. The tests that had issue were fixed.
Tests that aren't needed anymore are removed.
REASON:
tests/basic/tier/tier-file-create.t
This test checks one line after creating a tiered volume (which
is done in every tier test). So this line is moved along with
other test in tier and the file is deleted.
tests/bugs/tier/bug-1286974.t
This bug checks for the tier as a task and tier has been moved
from a task to service as a part of the tier as a service patch
https://review.gluster.org/#/c/13365/
So it is removed from bad tests.
tests/basic/tier/record-metadata-heat.t
This test had a bug and has been fixed.
tests/basic/tier/bug-1214222-directories_missing_after_attach_tier.t
tests/basic/tier/fops-during-migration.t
tests/basic/tier/tier-snapshot.t
tests/basic/tier/tier_lookup_heal.t
These test seem to work fine on centos now.
Change-Id: I05537f4bbb91584410177ce43543897eff8761a1
BUG: 1421600
Signed-off-by: hari gowtham <hgowtham@redhat.com>
Reviewed-on: https://review.gluster.org/16605
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: hari gowtham <hari.gowtham005@gmail.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem : At the time of set FQDN name to "auth.allow/auth.reject" through
gluster cli,it does not accept FQDN/host name.
Solution: Condition needs to be update in verify_host_name and gf_auth
to accept FQDN/host name.
Fix : Change the condition to accept FQDN/host Name.
To verify the patch followed below procedure
1) Try to set FQDN name for auth.allow or auth.reject parameter
gluster v set myvol auth.reject <fqdn name>
It gives error "fqdn-name" is not a valid internet-address-list
2) After apply the patch it does not give any error.
3) To verify auth.allow/reject try to mount volume on some client.
Change-Id: Ieb76cbb93d43323fd29c7ca04efe3790edb4281b
BUG: 1321578
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
Reviewed-on: https://review.gluster.org/15086
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It can often be useful while debugging to know how many times
EXPECT_WITHIN had to retry a command before it got the result we were
looking for. This patch just adds a variable EW_RETRIES that can be
inspected to find this info for the last EXPECT_WITHIN.
Change-Id: I1bcb09bb7eb118c3d76c60317ef99e02df6b6ee6
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: https://review.gluster.org/16451
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem :
- Deploy gluster on 2 nodes, one brick each, one volume replicated
- Create a snapshot
- Lose one server
- Add a replacement peer and new brick with a new IP address
- replace-brick the missing brick onto the new server
(wait for replication to finish)
- peer detach the old server
- after doing above steps, glusterd fails to restart.
Solution:
With the fix detach peer will populate an error : "N2 is part of
existing snapshots. Remove those snapshots before proceeding".
While doing so we force user to stay with that peer or to delete
all snapshots.
Change-Id: I3699afb9b2a5f915768b77f885e783bd9b51818c
BUG: 1322145
Signed-off-by: Gaurav Yadav <gyadav@redhat.com>
Reviewed-on: https://review.gluster.org/16907
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We need to do this because modifying the volume/brick tree while
glusterd_restart_bricks is still walking it can lead to segfaults.
Without waiting we could accidentally "slip in" while attach_brick has
released big_lock between retries and make such a modification.
Change-Id: I30ccc4efa8d286aae847250f5d4fb28956a74b03
BUG: 1432542
Signed-off-by: Jeff Darcy <jeff@pl.atyp.us>
Reviewed-on: https://review.gluster.org/16927
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
One of the brick multiplexing patches (commit 1a95fc3) had some changes
in gf_auth () & server_setvolume () functions which caused auth-allow
feature to be broken. mount doesn't succeed even if it's part of the
auth-allow list. This fix does the following:
1. Reintroduce the peer-info data back in gf_auth () so that fnmatch has
valid input and it can decide on the result.
2. config-params dict should capture key values pairs for all the bricks
in case brick multiplexing is on. In case brick multiplexing isn't
enabled, then config-params should carry attributes from protocol/server
such that all rpc auth related attributes stay in tact in the
dictionary.
Change-Id: I007c4c6d78620a896b8858a29459a77de8b52412
BUG: 1433815
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-on: https://review.gluster.org/16920
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Reviewed-by: MOHIT AGRAWAL <moagrawa@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
While doing conservative merge, even if a brick is down, it will reset
the pending xattr on that. When that brick comes up, as part of the
heal, it will consider this brick as the source and removes the entries
on the other bricks, which leads to data loss.
Fix:
Undo pending only for the bricks which are up.
Change-Id: I18436fa0bb1faa5f60531b357dea3f6b20446303
BUG: 1433571
Signed-off-by: karthik-us <ksubrahm@redhat.com>
Reviewed-on: https://review.gluster.org/16913
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 086436a introduced generation number (cleanup_gen) to ensure that
rpc layer doesn't end up cleaning up the connection object if
application layer has already destroyed it. Bumping up cleanup_gen was
done only in rpc_clnt_connection_cleanup (). However the same is needed
in rpc_clnt_reconnect_cleanup () too as with out it if the object gets destroyed
through the reconnect event in the application layer, rpc layer will
still end up in trying to delete the object resulting into double free
and crash.
Peer probing an invalid host/IP was the basic test to catch this issue.
Change-Id: Id5332f3239cb324cead34eb51cf73d426733bd46
BUG: 1433578
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-on: https://review.gluster.org/16914
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Milind Changire <mchangir@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PROBLEM: during a low watermark reset, checking of whether
the low watermark is lower than hi watermark is not done.
FIX: This patch checks if the hi watermark value is higher
the default low watermark. Else throws an failure of the reset
command
Change-Id: I8b49090c6bccce6d45c2e8076ab766047a2a6162
BUG: 1328342
Signed-off-by: hari gowtham <hgowtham@redhat.com>
Reviewed-on: https://review.gluster.org/14028
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: hari gowtham <hari.gowtham005@gmail.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Bigger buffer size made rebalance faster and broke the test.
- We made the file bigger so rebalance takes longer.
BUG: 1428058
Change-Id: I7acfec5611b28bdae0a9d9fc03eb104659a5563a
Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
Author: Shreyas Siravara <sshreyas@fb.com>
Reviewed-on: https://review.gluster.org/16802
Tested-by: Vijay Bellur <vbellur@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: add-brick command to increase replica count in an arbiter
volume succeeds, causing undesirable effects like the 4th brick being
loaded with the arbiter xlator, the 3rd one losing the arbiter xlator
(when the brick process is restarted), arbitration logic in afr going
for a toss etc.
Fix: Arbiter configuration should always be a replica 3 volume (of
which 3rd brick is arbiter). Hence disallow increasing replica count for
arbiter volume configurations.
Change-Id: I9fe4edac880d0f711e6d44324ad5562974e53e51
BUG: 1429200
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: https://review.gluster.org/16845
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Fix to https://bugzilla.redhat.com/show_bug.cgi?id=1316873 has made
changes to set dirty flag before every update fop, data or metadata, and unset
it after successful operation. That makes some of the fops very slow such as
entry operations or metadata operations.
Solution: File data operations are the only operation which take some time and
setting dirty flag before a fop and unsetting it after serves the purpose as
probability of failure of a fop is high when the time duration is more. For all
the other operations, set dirty flag at the end of the fop, if any brick is
down and need heal.
Providing following option to choose between high performance or better heal
marking for metadata and entry fops.
Set/Unset dirty flag for every update fop at the start of the fop. If ON, this
option impacts performance of entry operations or metadata operations as it
will set dirty flag at the start and unset it at the end of ALL update fop. If
OFF and all the bricks are good, dirty flag will be set at the start only for
file fops For metadata and entry fops dirty flag will not be set at the start,
if all the bricks are good. This does not impact performance for metadata
operations and entry operation but has a very small window to miss marking
entry as dirty in case it is required to be healed.
Thanks to Xavi and Ashish for the design
Picked the .t file from Ashish' patch https://review.gluster.org/16298
BUG: 1408809
Change-Id: I3ce860063f0e2901e50754dcfc3e4ed22daf819f
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: https://review.gluster.org/16821
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Tested-by: Xavier Hernandez <xhernandez@datalab.es>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I07c3a21c1c0625a517964693351356eead962571
BUG: 1427404
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: https://review.gluster.org/16792
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This test was commented out with the belief that it depended
on utimensat() support, but in fact it was not necessary because
`stat -c %Y` only outputs second resolution.
Simply commenting in the test made it fail because it checked
the values *before* the heal, while intended was to check them
*after* the heal. This commit fixes that.
Change-Id: I4194ac645b365a1f906a3ac9bcbbdb1f05000e27
BUG: 1422074
Signed-off-by: Niklas Hambüchen <mail@nh2.me>
Reviewed-on: https://review.gluster.org/16789
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Tested-by: Ravishankar N <ravishankar@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Niklas Hambüchen
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ic0b5c93c6365e26a5742184dd9445354c0a57295
BUG: 1427404
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: https://review.gluster.org/16780
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit changes the following:
1. In glusterfs_handle_terminate, send out individual pmap signout
requests to glusterd for every brick.
2. Add another parameter to glusterfs_mgmt_pmap_signout function to
pass the brickname that needs to be removed from the pmap registry.
3. Make sure pmap_registry_search doesn't break out from the loop
iterating over the list of bricks per port if the first brick entry
corresponding to a port is whitespaced out.
4. Make sure the pmap registry entries are removed for other
daemons like snapd.
Change-Id: I69949874435b02699e5708dab811777ccb297174
BUG: 1421590
Signed-off-by: Samikshan Bairagya <samikshan@gmail.com>
Reviewed-on: https://review.gluster.org/16689
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Gaurav Yadav <gyadav@redhat.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem-1
If Lookup which doesn't take any locks observes version mismatch it can't be
trusted. If we launch a heal based on this information it will lead to
self-heals which will affect I/O performance in the cases where Lookup is
wrong. Considering self-heal-daemon and operations on the inode from client
which take locks can still trigger heal we can choose to not attempt a heal on
Lookup.
Problem-2:
Fixed spurious failure of
tests/bitrot/bug-1373520.t
For the issues above, what was happening was that ec_heal_inspect()
is preventing 'name' heal to happen
Problem-3:
tests/basic/ec/ec-background-heals.t
To be honest I don't know what the problem was, while fixing
the 2 problems above, I made some changes to ec_heal_inspect() and
ec_need_heal() after which when I tried to recreate the spurious
failure it just didn't happen even after a long time.
BUG: 1414287
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Change-Id: Ife2535e1d0b267712973673f6d474e288f3c6834
Reviewed-on: https://review.gluster.org/16468
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Ashish Pandey <aspandey@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PROBLEM: spurious failure of the test.
CAUSE: the function "rebalance_run_time" calculates the total time
the tier has been running for. this being a test case, the run time
of tier can be 0 and when the function adds up zero it results in
zero. and thus it starts to fail.
FIX: Give it some time for the function to add up the values.
Signed-off-by: hari gowtham <hgowtham@redhat.com>
Change-Id: Ie270f3f3c8942081cca85dc49ef8fec76f3a261a
BUG: 1425743
Reviewed-on: https://review.gluster.org/16711
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: hari gowtham <hari.gowtham005@gmail.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|