| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
| |
The test is giving frequent failures in regression.
Error seen is normally like below:
`09:09:24 not ok 58 [ 14/ 80343] < 104> '^3$ number_healer_threads_shd patchy_distribute1 __afr_shd_healer_wait' -> 'Got "1" instead of "^3$"'`
updates: bz#1708929
Change-Id: I240bdcfb76b1f953d75937a53c5dfabba134f282
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch add more test cases for shd mux test cases
The test case includes
1) Createing multiple volumes to check the attach and detach
of self heal daemon requests.
2) Make sure the healing happens in all sceanarios
3) After a volume detach make sure the threads of the detached
volume is all cleaned.
4) Repeat all the above tests for ec volume
5) Node Reboot case
6) glusterd restart cases
7) Add-brick/remove brick
8) Convert a distributed volume to disperse volume
9) Convert a replicated volume to distributed volume
Change-Id: I7c317ef9d23a45ffd831157e4890d7c83a8fce7b
fixes: bz#1708929
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
| |
Change-Id: Icf93b90bcac022a355d4718220698987dbc91ecf
Signed-off-by: Kotresh HR <khiremat@redhat.com>
updates: bz#1693692
|
|
|
|
|
|
|
|
|
| |
Fixed a bug in the revalidate code path that wiped out
directory permissions if no mds subvol was found.
Change-Id: I8b4239ffee7001493c59d4032a2d3062586ea115
fixes: bz#1716830
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
| |
All these checks are done after analyzing clang-scan report produced
by the CI job @ https://build.gluster.org/job/clang-scan
updates: bz#1622665
Change-Id: I590305af4ceb779be952974b2a36066ffc4865ca
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Translators covered:
* playground/template
* debug/delay-gen
* debug/error-gen
* features/namespace
* features/quiesce
* meta
updates: bz#1693692
Change-Id: Ic8fde8efcb309ea492d8e819241f786f7ff467a1
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The way delta_blocks is computed in shard is incorrect, when a file
is truncated to a lower size. The accounting only considers change
in size of the last of the truncated shards.
FIX:
Get the block-count of each shard just before an unlink at posix in
xdata. Their summation plus the change in size of last shard
(from an actual truncate) is used to compute delta_blocks which is
used in the xattrop for size update.
Change-Id: I9128a192e9bf8c3c3a959e96b7400879d03d7c53
fixes: bz#1705884
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
1. Add geo-rep fanout test case
2. Add glusterd geo-rep negative test cases
3. Add glusterd geo-rep config test cases
Change-Id: I856c087eb3216d8f0ffd1f266deac88e9a4effec
Signed-off-by: Kotresh HR <khiremat@redhat.com>
updates: bz#1693692
|
|
|
|
|
|
|
|
|
|
| |
Rename with existing name testcase is occasionaly
failing on EC volume. Hence commenting the same
until it's analysed
Change-Id: Icb2ad189b9e4d12101e8f5abcb8a033181360386
Signed-off-by: Kotresh HR <khiremat@redhat.com>
updates: bz#1193929
|
|
|
|
|
|
|
|
|
|
| |
1401716: Resource leak
1401714: Dereference before null check
updates: bz#789278
Change-Id: I8fb0b143a1d4b37ee6be7d880d9b5b84ba00bf36
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: While tier code was removed, the is_tier_enabled
related to tier wasn't handled for upgrade.
As this option was missing in the info file, the checksum
mismatch issue happens during upgrade.
This results in the peer rejections happening.
Fix: use the op_version check and note down the is_tier_enabled
always. This way it will be dummy key, but the future upgrades
will work fine.
NOTE: Just having the key from 3.10 to 7 will cause issues when
upgraded from 5 to 8 or any such upgrade which skips the version
where we handle it.
Change-Id: I9951e2b74f16e58e884e746c34dcf53e559c7143
fixes: bz#1714973
Signed-off-by: hari gowtham <hgowtham@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
upcall: remove extra variable assignment and use just one
initialization.
open-behind: reduce the overall number of lines, in functions
not frequently called
selinux: reduce some lines in init failure cases
updates: bz#1693692
Change-Id: I7c1de94f2ec76a5bfe1f48a9632879b18e5fbb95
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* locks/posix.c: key was not freed in one of the cases.
* locks/common.c: lock was being free'd out of context.
* nfs/exports: handle case of missing free.
* protocol/client: handle case of entry not freed.
* storage/posix: handle possible case of double free
CID: 1398628, 1400731, 1400732, 1400756, 1124796, 1325526
updates: bz#789278
Change-Id: Ieeaca890288bc4686355f6565f853dc8911344e8
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
storage.reserve-size option will take size as input
instead of percentage. If set, priority will be given to
storage.reserve-size over storage.reserve. Default value
of this option is 0.
fixes: bz#1651445
Change-Id: I7a7342c68e436e8bf65bd39c567512ee04abbcea
Signed-off-by: Sheetal Pamecha <sheetal.pamecha08@gmail.com>
|
|
|
|
|
|
|
| |
updates: bz#1193929
Change-Id: Ieb5e35d454498bc389972f9f15fe46b640f1b97d
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: While high no. of volumes are configured around 2000
glusterd has bottleneck during handshake at the time
of copying dictionary
Solution: To avoid the bottleneck serialize a dictionary instead
of copying key-value pair one by one
Change-Id: I9fb332f432e4f915bc3af8dcab38bed26bda2b9a
fixes: bz#1711297
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
| |
Change-Id: Ide59a3fde11b23f654b1ec03d72b4ec53b36a03b
Signed-off-by: Kotresh HR <khiremat@redhat.com>
updates: bz#1693692
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Traditionally all svc manager will execute process stop and then
followed by start each time when they called. But that is not
required by shd, because the attach request implemented in the shd
multiplex has the intelligence to check whether a detach is required
prior to attaching the graph. So there is no need to send an explicit
detach request if we are sure that the next call is an attach request
Change-Id: I9157c8dcaffdac038f73286bcf5646a3f1d3d8ec
fixes: bz#1710054
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
While processing a cleanup_and_exit function, we are
accessing a graph object. But this has not been protected
under a lock. Because a parallel cleanup of a graph is quite
possible which might lead to an invalid memory access
Change-Id: Id05ca70d5b57e172b0401d07b6a1f5386c044e79
fixes: bz#1708926
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
| |
Added geo-rep regression tests with EC volume.
fixes: bz#1650095
Change-Id: Ifb6e68e0a6103a98fced7f84d3088b8edf33d52f
Signed-off-by: Shwetha K Acharya <sacharya@redhat.com>
|
|
|
|
|
|
| |
updates: bz#1693692
Change-Id: If4c30572d4501d169bb4b0871c677d974515867c
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
While restarting a glusterd process, when we have a stale pid
we were doing a simple kill. Instead we can use glusterd_proc_stop
Because it has more logging plus force kill in case if there is
any problem with kill signal handling.
Change-Id: I4a2dadc210a7a65762dd714e809899510622b7ec
updates: bz#1710054
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
glusterd_svcs_stop should call individual wrapper function to stop a
daemon rather than calling glusterd_svc_stop. For example for shd,
it should call glusterd_shdsvc_stop instead of calling basic API
function to stop. Because the individual functions for each daemon
could be doing some specific operation in their wrapper function.
Change-Id: Ie6d40590251ad470ef3901d1141ab7b22c3498f5
fixes: bz#1712741
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: "gluster v status" is hung in heterogenous cluster
when issued from a non-upgraded node.
Cause: commit 34e010d64 fixes the txn-opinfo mem leak
in op-sm framework by not setting the txn-opinfo if some
conditions are true. When vol status is issued from a
non-upgraded node, command is hanging in its upgraded peer
as the upgraded node setting the txn-opinfo based on new
conditions where as non-upgraded nodes are following diff
conditions.
Fix: Add an op-version check, so that all the nodes follow
same set of conditions to set txn-opinfo.
fixes: bz#1710159
Change-Id: Ie1f353212c5931ddd1b728d2e6949dfe6225c4ab
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Given a directory with statedumps captured at different times if
there are any stacks that appear in multiple statedumps, it prints
them.
Sample output:
glusterdump.25425.dump repeats=5 stack=0x7f53642cb968 pid=0 unique=0 lk-owner=
glusterdump.25427.dump repeats=5 stack=0x7f85002cb968 pid=0 unique=0 lk-owner=
glusterdump.25428.dump repeats=5 stack=0x7f962c2cb968 pid=0 unique=0 lk-owner=
glusterdump.25428.dump repeats=2 stack=0x7f962c329f18 pid=60830 unique=0 lk-owner=88f50620967f0000
glusterdump.25429.dump repeats=5 stack=0x7f20782cb968 pid=0 unique=0 lk-owner=
glusterdump.25472.dump repeats=5 stack=0x7f27ac2cb968 pid=0 unique=0 lk-owner=
glusterdump.25473.dump repeats=5 stack=0x7f4fbc2cb9d8 pid=0 unique=0 lk-owner=
NOTE: stacks with lk-owner=""/lk-owner=0000000000000000/unique=0 may not be hung frames and need further inspection
fixes bz#1714415
Change-Id: Ib64a3fca63f49df2fafedcd4baa57e9b25411b08
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
At the moment new stack doesn't populate frame->root->unique in all cases. This
makes it difficult to debug hung frames by examining successive state dumps.
Fuse and server xlators populate it whenever they can, but other xlators won't
be able to assign 'unique' when they need to create a new frame/stack because
they don't know what 'unique' fuse/server xlators already used. What we need is
for unique to be correct. If a stack with same unique is present in successive
statedumps, that means the same operation is still in progress. This makes
'finding hung frames' part of debugging hung frames easier.
fixes bz#1714098
Change-Id: I3e9a8f6b4111e260106c48a2ac3a41ef29361b9e
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
| |
1401590: Deadcode
updates: bz#789278
Change-Id: I3aa1d3aa9769e6990f74b6a53e288e788173c5e0
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
|
| |
After basic analysis, found that these methods were not being
used at all.
updates: bz#1693692
Change-Id: If9cfa1ab189e6e7b56230c4e1d8e11f9694a9a65
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Also some cleanup:
* old-protocol.t was actually added to make sure we have line-coverage
* first-test.t should have been removed as per the comment. It doesn't do anything.
* add statvfs to rpc-coverage so we can cover statvfs in few xlators.
updates: bz#1693692
Change-Id: Ie8651ce007de484c4abced16b4de765aa5e517be
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
| |
Change-Id: Ibd37b6ea82b781a1a266b95f7596874134f30079
fixes: bz#1713730
Signed-off-by: Amgad Saleh <amgad.saleh@nokia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In commit ac70f66c5805e10b3a1072bd467918730c0aeeb4 I
missed one condition to populate volume dictionary in
multiple threads while brick_multiplex is enabled.Due
to that glusterd is not sending volume dictionary for
all volumes to peer.
Solution: Update the condition in code as well as update test case
also to avoid the issue
Change-Id: I06522dbdfee4f7e995d9cc7b7098fdf35340dc52
fixes: bz#1711250
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
| |
updates: bz#1193929
Change-Id: Iee9aab8140882069165621189741f189fb2cc884
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The handler functions are pointed to dummy functions.
The switch case handling for tier also have been moved to
point default case to avoid issues, if reintroduced.
The tier changes in DHT still remain as such.
updates: bz#1693692
Change-Id: I80d80c9a3eb862b4440a36b31ae82b2e9d92e4dc
Signed-off-by: Hari Gowtham <hgowtham@redhat.com>
|
|
|
|
|
|
| |
updates: bz#1193929
Change-Id: Ic26ab5277f720c734f083150c1c541763dfa64aa
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
add test for async Read/Write combinations
glfs_read_async/write_async
glfs_pread_async/pwrite_async
glfs_readv_async/writev_async
glfs_preadv_async/pwritev_async
ftruncate/ftruncate_async
fsync/fsync_async
fdatasync/fdatasync_async
Updates: #655
Change-Id: I12beb97029fd60bce79650a376d8fcd8d383ef16
Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
* add more fops: f{get,set,list,remove}xattr(), access(), fstat(), fsetattr(),
getxattr(), lgetxattr(), llistxattr(), lsetxattr(), fgetxattr()
* handle some error cases (like volume not found)
Updates: #655
Change-Id: I3334bdf3090eafd83a54e1be12036ea01b181089
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes the following CID's:
* 1124829
* 1274075
* 1274083
* 1274128
* 1274135
* 1274141
* 1274143
* 1274197
* 1274205
* 1274210
* 1274211
* 1288801
* 1398629
Change-Id: Ia7c86cfab3245b20777ffa296e1a59748040f558
Updates: bz#789278
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
EC was ignoring lock contention notifications received while a lock was
being acquired. When a lock is partially acquired (some bricks have
granted the lock but some others not yet) we can receive notifications
from acquired bricks, which should be honored, since we may not receive
more notifications after that.
Since EC was ignoring them, once the lock was acquired, it was not
released until the eager-lock timeout, causing unnecessary delays on
other clients.
This fix takes into consideration the notifications received before
having completed the full lock acquisition. After that, the lock will
be releaed as soon as possible.
Fixes: bz#1708156
Change-Id: I2a306dbdb29fb557dcab7788a258bd75d826cc12
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
afr_child_up_status_meta works only when LOOKUP on $M0 is successful.
There are cases where quorum is not met and LOOKUP fails on $M0 which
leads to failures similar to:
grep: /mnt/glusterfs/0/.meta/graphs/active/patchy-replicate-0/private: Transport endpoint is not connected
This was happening once in a while based on attribute-timeout and
md-cache not serving the lookup.
Fix:
Find child-up status based on statedump instead. Also changed mount
options to include --entry-timeout=0 and --attribute-timeout=0
updates bz#1193929
Change-Id: Ic0de72c3006d7399a5feb3e4d10d4748949b2ab3
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
| |
fixes bz#1706603
Change-Id: I0bfd30f787f157b7a54f71088f767ccfd7621208
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
| |
CID: 1401345 - Unused value
updates: bz#789278
Change-Id: I6b8f2611151ce0174042384b7632019c312ebae3
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Modified Geo-rep help text for better sanity.
fixes: bz#1652887
Change-Id: I40ef7ef709eaecf0125ab4b4a7517e2c5d1ef4a0
Signed-off-by: Shwetha K Acharya <sacharya@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We only need to calculate and write the checksum in case of
!is_quota_conf .
Align the code in accordance.
Also, use a smaller buffer (to write few chars).
Change-Id: I40c83ce10447df77ff9975d314d768ec2c0087c2
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
| |
Change-Id: I14957c5161f31d5dfc6cf56f8d7ccf4d39372f39
fixes: bz#1711820
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
| |
Avoid serious memory leak
fixes: bz#1711240
Change-Id: Ic61a8fdd0e941e136c98376a87b5a77fa8c22316
Signed-off-by: Xie Changlong <xiechanglong@cmss.chinamobile.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A rebalance process currently only looks up files
that it is supposed to migrate. This could cause issues
when lookup-optimize is enabled as the dir layout can be
updated with the commit hash before all files are looked up.
This is expecially problematic of one of the rebalance processes
fails to complete as clients will try to access files whose
linkto files might not have been created.
Each process will now lookup every file in the directory it is
processing.
Pros: Less likely that files will be inaccessible.
Cons: More lookup requests sent to the bricks and a potential
performance hit.
Note: this does not handle races such as when a layout is updated on disk
just as the create fop is sent by the client.
Change-Id: I22b55846effc08d3b827c3af9335229335f67fb8
fixes: bz#1711764
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
| |
updates: bz#1712322
Change-Id: I120a1d23506f9ebcf88c7ea2f2eff4978a61cf4a
Signed-off-by: Susant Palai <spalai@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During a graph cleanup, we first sent a PARENT_DOWN and wait for
a child down to ultimately free the xlator and the graph.
In the ec xlator, we cleanup the threads when we get a PARENT_DOWN event.
But a racing event like CHILD_UP or event xl_op may trigger healing threads
after threads cleanup.
So there is a chance that the threads might access a freed private variabe
Change-Id: I252d10181bb67b95900c903d479de707a8489532
fixes: bz#1703948
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Running with 2 second sleep at this place caused failures like:
`not ok 14 [ 2014/ 7] < 41> 'test-message1 cat /mnt/glusterfs/1/test.txt' -> 'Got "test-message0" instead of "test-message1"'`
in few runs in 100 iterations. But when increased to higher than sleep 3,
have not seen any failures in 100 runs.
While I don't know the exact reasons for the behavior yet, looks like this
increase in wait helps to pass the regression without failures.
updates: bz#1693692
Change-Id: I0610b79bea53e36de3eea6c11234b7fc9dfd6232
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
In function "afr_selfheal_entry_granular", after completing the
heal we are not destroying the frame. This will lead to crash.
when we execute statedump operation, where it tried to access
xlator object. If this xlator object is freed as part of the
graph destroy this will lead to an invalid memory access
Change-Id: I0a5e78e704ef257c3ac0087eab2c310e78fbe36d
fixes: bz#1708926
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|