| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently for an application using glfsapi to use glusterfs, when a
statedump is taken, it uses /var/run/gluster dir to dump info.
There can be concerns as this directory may be owned by some other
user, and hence it may fail taking statedump. Such applications
should have an option to use different path.
This patch provides an API to do so.
Updates: bz#1689097
Change-Id: I8918e002bc823d83614c972b6c738baa04681b23
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
| |
So, we will get more debug info.
fixes: #679
Change-Id: I3588e204ad25c20b69271c1a4ee17d0d158bd794
Signed-off-by: Xie Changlong <xiechanglong@cmss.chinamobile.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For nameless LOOKUPs, server creates a new inode which shall
remain invalid until the fop is successfully processed post
which it is linked to the inode table.
But incase if there is an already linked inode for that entry,
it discards that newly created inode which results in upcall
notification. This may result in client being bombarded with
unnecessary upcalls affecting performance if the data set is huge.
This issue can be avoided by looking up and storing the upcall
context in the original linked inode (if exists), thus saving up on
those extra callbacks.
Change-Id: I044a1737819bb40d1a049d2f53c0566e746d2a17
fixes: bz#1718338
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
While sending upcall notifications via synctasks, the argument used to
carry relevant data for these tasks is not initialized properly. This patch
is to fix the same.
Change-Id: I9fa8f841e71d3c37d3819fbd430382928c07176c
fixes: bz#1718316
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
As force-migration option is disabled by default,
the warning seems unnessary.
Rephrased the warning to make best sense out of it.
fixes: bz#1712668
Change-Id: Ia18c3c5e7b3fec808fce2194ca0504a837708822
Signed-off-by: Shwetha K Acharya <sacharya@redhat.com>
|
|
|
|
|
|
| |
Change-Id: Ifd8ae5eeb91b968cc1a9a9b5d15844c5233d56db
fixes: bz#1717953
Signed-off-by: Anoop C S <anoopcs@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Unable to setup mountbroker root directory while creating geo-replication
session for non-root user.
Casue:
With patch[1] which defines the max-port for glusterd one extra sapce
got added in field of 'option max-port'.
[1]. https://review.gluster.org/#/c/glusterfs/+/21872/
In geo-rep spliting of key-value pair form vol file was done on the
basis of space so this additional space caused "ValueError: too many values
to unpack".
Solution:
Use split so that it can treat consecutive whitespace as a single separator.
Fixes: bz#1709248
Change-Id: Ia22070a43f95d66d84cb35487f23f9ee58b68c73
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch cleans some iovec code and creates two additional helper
functions to simplify management of iovec structures.
iov_range_copy(struct iovec *dst, uint32_t dst_count, uint32_t dst_offset,
struct iovec *src, uint32_t src_count, uint32_t src_offset,
uint32_t size);
This function copies up to 'size' bytes from 'src' at offset
'src_offset' to 'dst' at 'dst_offset'. It returns the number of
bytes copied.
iov_skip(struct iovec *iovec, uint32_t count, uint32_t size);
This function removes the initial 'size' bytes from 'iovec' and
returns the updated number of iovec vectors remaining.
The signature of iov_subset() has also been modified to make it safer
and easier to use. The new signature is:
iov_subset(struct iovec *src, int src_count, uint32_t start, uint32_t size,
struct iovec **dst, int32_t dst_count);
This function creates a new iovec array containing the subset of the
'src' vector starting at 'start' with size 'size'. The resulting
array is allocated if '*dst' is NULL, or copied to '*dst' if it fits
(based on 'dst_count'). It returns the number of iovec vectors used.
A new set of functions to iterate through an iovec array have been
created. They can be used to simplify the implementation of other
iovec-based helper functions.
Change-Id: Ia5fe57e388e23392a8d6cdab17670e337cadd587
Updates: bz#1193929
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
this is critical so all the tests will be contained in the same
directory, and one can just 'cp -a tests/ <any-location>/' and
run glusterfs tests.
only 'glfsxmp.c' was an exception as it was just copying the
file from api example directory. Now moved it to tests.
updates: bz#1193929
Change-Id: I00359d64be580bffc5b3c3a090968d86c2c6952a
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
The test case is failing to heal the volume within $HEAL_TIMEOUT @195.
This is happening because as part of split-brain resolution the file
gets expunged from the sink and the new entry mark for that file will
be done on the source bricks as part of impunging. Since the source
bricks shd-threads failed to get the heal-domain lock, they will wait
for the heal-timeout of 10 minutes, which is greater than $HEAL_TIMEOUT.
Fix:
Set the cluster.heal-timeout to 5 seconds to trigger the heal so that
one of the source brick heals the file within the $HEAL_TIMEOUT.
Change-Id: Ie73c578cc5361c0d617a48ccc86026734d20ba8c
fixes: bz#1718998
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
We currently don't have a roll-back/undoing of post-ops if quorum is not met.
Though the FOP is still unwound with failure, the xattrs remain on the disk.
Due to these partial post-ops and partial heals (healing only when 2 bricks
are up), we can end up in metadata split-brain purely from the afr xattrs
point of view i.e each brick is blamed by atleast one of the others for
metadata. These scenarios are hit when there is frequent connect/disconnect
of the client/shd to the bricks.
Fix:
Pick a source based on the xattr values. If 2 bricks blame one, the blamed
one must be treated as sink. If there is no majority, all are sources. Once
we pick a source, self-heal will then do the heal instead of erroring out
due to split-brain.
This patch also adds restriction of all the bricks to be up to perform
metadata heal to avoid any metadata loss.
Removed the test case tests/bugs/replicate/bug-1468279-source-not-blaming-sinks.t
as it was doing metadata heal even when only 2 of 3 bricks were up.
Change-Id: I07a9d62f84ceda329dcab1f02a33aeed258dcb09
fixes: bz#1717819
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: useradd fails with: Cannot allocate memory
useradd: cannot lock /etc/passwd; try again later.
Solution:
Lock files should get automatically removed once "usradd" or "groupadd"
command finishes. But sometimes we encounter situations (bugs) where
some of these files may not get properly unlocked after the execution of
the command. In that case, when we execute useradd next time, it may show
the error “cannot lock /etc/password” or “unable to lock group file”.
So, to avoid any such errors, check for any lock files under /etc and
remove those.
updates: bz#1193929
Change-Id: If6456a271c2bc0717f768d7101a40ce44a9af3d7
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Long tale of double unref! But do read...
In cases where a shard base inode is evicted from lru list while still
being part of fsync list but added back soon before its unlink, there
could be an extra inode_unref() leading to premature inode destruction
leading to crash.
One such specific case is the following -
Consider features.shard-deletion-rate = features.shard-lru-limit = 2.
This is an oversimplified example but explains the problem clearly.
First, a file is FALLOCATE'd to a size so that number of shards under
/.shard = 3 > lru-limit.
Shards 1, 2 and 3 need to be resolved. 1 and 2 are resolved first.
Resultant lru list:
1 -----> 2
refs on base inode - (1) + (1) = 2
3 needs to be resolved. So 1 is lru'd out. Resultant lru list -
2 -----> 3
refs on base inode - (1) + (1) = 2
Note that 1 is inode_unlink()d but not destroyed because there are
non-zero refs on it since it is still participating in this ongoing
FALLOCATE operation.
FALLOCATE is sent on all participant shards. In the cbk, all of them are
added to fync_list.
Resulting fsync list -
1 -----> 2 -----> 3 (order doesn't matter)
refs on base inode - (1) + (1) + (1) = 3
Total refs = 3 + 2 = 5
Now an attempt is made to unlink this file. Background deletion is triggered.
The first $shard-deletion-rate shards need to be unlinked in the first batch.
So shards 1 and 2 need to be resolved. inode_resolve fails on 1 but succeeds
on 2 and so it's moved to tail of list.
lru list now -
3 -----> 2
No change in refs.
shard 1 is looked up. In lookup_cbk, it's linked and added back to lru list
at the cost of evicting shard 3.
lru list now -
2 -----> 1
refs on base inode: (1) + (1) = 2
fsync list now -
1 -----> 2 (again order doesn't matter)
refs on base inode - (1) + (1) = 2
Total refs = 2 + 2 = 4
After eviction, it is found 3 needs fsync. So fsync is wound, yet to be ack'd.
So it is still inode_link()d.
Now deletion of shards 1 and 2 completes. lru list is empty. Base inode unref'd and
destroyed.
In the next batched deletion, 3 needs to be deleted. It is inode_resolve()able.
It is added back to lru list but base inode passed to __shard_update_shards_inode_list()
is NULL since the inode is destroyed. But its ctx->inode still contains base inode ptr
from first addition to lru list for no additional ref on it.
lru list now -
3
refs on base inode - (0)
Total refs on base inode = 0
Unlink is sent on 3. It completes. Now since the ctx contains ptr to base_inode and the
shard is part of lru list, base shard is unref'd leading to a crash.
FIX:
When shard is readded back to lru list, copy the base inode pointer as is into its inode ctx,
even if it is NULL. This is needed to prevent double unrefs at the time of deleting it.
Change-Id: I99a44039da2e10a1aad183e84f644d63ca552462
Updates: bz#1696136
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
While we process a cleanup, there is a chance for a race between
async operations, for example ec_launch_replace_heal. So this can
lead to invalid mem access.
Solution:
Just like we track on going heal fops, we can also track fops like
ec_launch_replace_heal, so that we can decide when to send a
PARENT_DOWN request.
Change-Id: I055391c5c6c34d58aef7336847f3b570cb831298
fixes: bz#1703948
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
| |
This patch will add two extra logs for invalid argument
Change-Id: I3950b4f4b9d88b1f1e788ef93d8f09d4bd8d4d8b
updates: bz#1703948
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
There are a lot of fromatting error is markdown
files peresent under /doc directiory of the project.
Fixing formatting errors and sending a patch.
Fixes: bz#1718273
Change-Id: I08f938088bbaaafddf634f73616ea0dbfe7aedf3
Signed-off-by: kshithijiyer <kshithij.ki@gmail.com>
|
|
|
|
|
|
|
|
| |
* Also some logging enhancements in snapview-server
Change-Id: I6a7646771cedf4bd1c62806eea69d720bbaf0c83
fixes: bz#1715921
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In unit file of TA process we have been using ta-vol as
volume id and also ta-vol-server.transport.socket.listen-port=24007
In volume file for TA process we only consider volname
as "ta" and not as "ta-vol". That's why it was not able to assign
this port number to ta process as in volume file it will try to
find server xlator as ta-vol
volume ta-server <<<<<<<<< not ta-vol-server
46 type protocol/server
47 option transport.listen-backlog 10
48 option transport.socket.keepalive-count 9
49 option transport.socket.keepalive-interval 2
50 option transport.socket.keepalive-time 20
51 option transport.tcp-user-timeout 0
52 option transport.socket.keepalive 1
53 option auth.addr./mnt/thin-arbiter.allow *
54 option auth-path /mnt/thin-arbiter
55 option transport.address-family inet
56 option transport-type tcp
57 subvolumes ta-io-stats
58 end-volume
Solution:
Provide "ta" as vol id for the command which Unit file
is going to execute.
Also, made changes in setup-thin-arbiter.sh to correctly
identify the directory of Unit file irrespective of the location from
where we are executing this script.
Change-Id: Ia7bbccdc0304e7dfaaa732bebb726fba731d1d33
fixes: bz#1716766
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
commit 146e4b45d0ce906ae50fd6941a1efafd133897ea enabled
storage.fips-mode-rchecksum option for all new volumes with op-version
>=GD_OP_VERSION_7_0 but `gluster vol get $volname
storage.fips-mode-rchecksum` was displaying it as 'off'. This patch fixes it.
fixes: bz#1717782
Change-Id: Ie09f89838893c5776a3f60569dfe8d409d1494dd
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
| |
Change-Id: I8cf0a153f84ef2d162e6dd03261441d211c07d40
updates: bz#1193929
Signed-off-by: Anoop C S <anoopcs@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Following files are fixed.
tests/bugs/distribute/overlap.py
tests/utils/changelogparser.py
tests/utils/create-files.py
tests/utils/gfid-access.py
tests/utils/libcxattr.py
Change-Id: I3db857cc19e19163d368d913eaec1269fbc37140
updates: bz#1193929
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Frequent intermittent failures observed.
```
08:59:24 ok 11 [ 10/ 3] < 36> 'write_to /mnt/glusterfs/0/test.txt test-message1'
08:59:24 ok 12 [ 10/ 6] < 37> 'test-message1 cat /mnt/glusterfs/0/test.txt'
08:59:24 ok 13 [ 10/ 4] < 38> 'test-message0 cat /mnt/glusterfs/1/test.txt'
08:59:24 not ok 14 [ 3715/ 6] < 45> 'test-message1 cat /mnt/glusterfs/1/test.txt' -> 'Got "test-message0" instead of "test-message1"'
08:59:24 ok 15 [ 10/ 162] < 47> 'gluster --mode=script --wignore volume set patchy features.cache-invalidation on'
08:59:24 ok 16 [ 10/ 148] < 48> 'gluster --mode=script --wignore volume set patchy performance.qr-cache-timeout 15'
```
updates: bz#1718191
Change-Id: Ieb9e5a9a428995ff178f77bc4a5155b8298d3fa0
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The test is giving frequent failures in regression.
Error seen is normally like below:
`09:09:24 not ok 58 [ 14/ 80343] < 104> '^3$ number_healer_threads_shd patchy_distribute1 __afr_shd_healer_wait' -> 'Got "1" instead of "^3$"'`
updates: bz#1708929
Change-Id: I240bdcfb76b1f953d75937a53c5dfabba134f282
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch add more test cases for shd mux test cases
The test case includes
1) Createing multiple volumes to check the attach and detach
of self heal daemon requests.
2) Make sure the healing happens in all sceanarios
3) After a volume detach make sure the threads of the detached
volume is all cleaned.
4) Repeat all the above tests for ec volume
5) Node Reboot case
6) glusterd restart cases
7) Add-brick/remove brick
8) Convert a distributed volume to disperse volume
9) Convert a replicated volume to distributed volume
Change-Id: I7c317ef9d23a45ffd831157e4890d7c83a8fce7b
fixes: bz#1708929
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
| |
Change-Id: Icf93b90bcac022a355d4718220698987dbc91ecf
Signed-off-by: Kotresh HR <khiremat@redhat.com>
updates: bz#1693692
|
|
|
|
|
|
|
|
|
| |
Fixed a bug in the revalidate code path that wiped out
directory permissions if no mds subvol was found.
Change-Id: I8b4239ffee7001493c59d4032a2d3062586ea115
fixes: bz#1716830
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
| |
All these checks are done after analyzing clang-scan report produced
by the CI job @ https://build.gluster.org/job/clang-scan
updates: bz#1622665
Change-Id: I590305af4ceb779be952974b2a36066ffc4865ca
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Translators covered:
* playground/template
* debug/delay-gen
* debug/error-gen
* features/namespace
* features/quiesce
* meta
updates: bz#1693692
Change-Id: Ic8fde8efcb309ea492d8e819241f786f7ff467a1
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The way delta_blocks is computed in shard is incorrect, when a file
is truncated to a lower size. The accounting only considers change
in size of the last of the truncated shards.
FIX:
Get the block-count of each shard just before an unlink at posix in
xdata. Their summation plus the change in size of last shard
(from an actual truncate) is used to compute delta_blocks which is
used in the xattrop for size update.
Change-Id: I9128a192e9bf8c3c3a959e96b7400879d03d7c53
fixes: bz#1705884
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
1. Add geo-rep fanout test case
2. Add glusterd geo-rep negative test cases
3. Add glusterd geo-rep config test cases
Change-Id: I856c087eb3216d8f0ffd1f266deac88e9a4effec
Signed-off-by: Kotresh HR <khiremat@redhat.com>
updates: bz#1693692
|
|
|
|
|
|
|
|
|
|
| |
Rename with existing name testcase is occasionaly
failing on EC volume. Hence commenting the same
until it's analysed
Change-Id: Icb2ad189b9e4d12101e8f5abcb8a033181360386
Signed-off-by: Kotresh HR <khiremat@redhat.com>
updates: bz#1193929
|
|
|
|
|
|
|
|
|
|
| |
1401716: Resource leak
1401714: Dereference before null check
updates: bz#789278
Change-Id: I8fb0b143a1d4b37ee6be7d880d9b5b84ba00bf36
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: While tier code was removed, the is_tier_enabled
related to tier wasn't handled for upgrade.
As this option was missing in the info file, the checksum
mismatch issue happens during upgrade.
This results in the peer rejections happening.
Fix: use the op_version check and note down the is_tier_enabled
always. This way it will be dummy key, but the future upgrades
will work fine.
NOTE: Just having the key from 3.10 to 7 will cause issues when
upgraded from 5 to 8 or any such upgrade which skips the version
where we handle it.
Change-Id: I9951e2b74f16e58e884e746c34dcf53e559c7143
fixes: bz#1714973
Signed-off-by: hari gowtham <hgowtham@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
upcall: remove extra variable assignment and use just one
initialization.
open-behind: reduce the overall number of lines, in functions
not frequently called
selinux: reduce some lines in init failure cases
updates: bz#1693692
Change-Id: I7c1de94f2ec76a5bfe1f48a9632879b18e5fbb95
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* locks/posix.c: key was not freed in one of the cases.
* locks/common.c: lock was being free'd out of context.
* nfs/exports: handle case of missing free.
* protocol/client: handle case of entry not freed.
* storage/posix: handle possible case of double free
CID: 1398628, 1400731, 1400732, 1400756, 1124796, 1325526
updates: bz#789278
Change-Id: Ieeaca890288bc4686355f6565f853dc8911344e8
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
storage.reserve-size option will take size as input
instead of percentage. If set, priority will be given to
storage.reserve-size over storage.reserve. Default value
of this option is 0.
fixes: bz#1651445
Change-Id: I7a7342c68e436e8bf65bd39c567512ee04abbcea
Signed-off-by: Sheetal Pamecha <sheetal.pamecha08@gmail.com>
|
|
|
|
|
|
|
| |
updates: bz#1193929
Change-Id: Ieb5e35d454498bc389972f9f15fe46b640f1b97d
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: While high no. of volumes are configured around 2000
glusterd has bottleneck during handshake at the time
of copying dictionary
Solution: To avoid the bottleneck serialize a dictionary instead
of copying key-value pair one by one
Change-Id: I9fb332f432e4f915bc3af8dcab38bed26bda2b9a
fixes: bz#1711297
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
| |
Change-Id: Ide59a3fde11b23f654b1ec03d72b4ec53b36a03b
Signed-off-by: Kotresh HR <khiremat@redhat.com>
updates: bz#1693692
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Traditionally all svc manager will execute process stop and then
followed by start each time when they called. But that is not
required by shd, because the attach request implemented in the shd
multiplex has the intelligence to check whether a detach is required
prior to attaching the graph. So there is no need to send an explicit
detach request if we are sure that the next call is an attach request
Change-Id: I9157c8dcaffdac038f73286bcf5646a3f1d3d8ec
fixes: bz#1710054
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
While processing a cleanup_and_exit function, we are
accessing a graph object. But this has not been protected
under a lock. Because a parallel cleanup of a graph is quite
possible which might lead to an invalid memory access
Change-Id: Id05ca70d5b57e172b0401d07b6a1f5386c044e79
fixes: bz#1708926
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
| |
Added geo-rep regression tests with EC volume.
fixes: bz#1650095
Change-Id: Ifb6e68e0a6103a98fced7f84d3088b8edf33d52f
Signed-off-by: Shwetha K Acharya <sacharya@redhat.com>
|
|
|
|
|
|
| |
updates: bz#1693692
Change-Id: If4c30572d4501d169bb4b0871c677d974515867c
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
While restarting a glusterd process, when we have a stale pid
we were doing a simple kill. Instead we can use glusterd_proc_stop
Because it has more logging plus force kill in case if there is
any problem with kill signal handling.
Change-Id: I4a2dadc210a7a65762dd714e809899510622b7ec
updates: bz#1710054
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
glusterd_svcs_stop should call individual wrapper function to stop a
daemon rather than calling glusterd_svc_stop. For example for shd,
it should call glusterd_shdsvc_stop instead of calling basic API
function to stop. Because the individual functions for each daemon
could be doing some specific operation in their wrapper function.
Change-Id: Ie6d40590251ad470ef3901d1141ab7b22c3498f5
fixes: bz#1712741
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: "gluster v status" is hung in heterogenous cluster
when issued from a non-upgraded node.
Cause: commit 34e010d64 fixes the txn-opinfo mem leak
in op-sm framework by not setting the txn-opinfo if some
conditions are true. When vol status is issued from a
non-upgraded node, command is hanging in its upgraded peer
as the upgraded node setting the txn-opinfo based on new
conditions where as non-upgraded nodes are following diff
conditions.
Fix: Add an op-version check, so that all the nodes follow
same set of conditions to set txn-opinfo.
fixes: bz#1710159
Change-Id: Ie1f353212c5931ddd1b728d2e6949dfe6225c4ab
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Given a directory with statedumps captured at different times if
there are any stacks that appear in multiple statedumps, it prints
them.
Sample output:
glusterdump.25425.dump repeats=5 stack=0x7f53642cb968 pid=0 unique=0 lk-owner=
glusterdump.25427.dump repeats=5 stack=0x7f85002cb968 pid=0 unique=0 lk-owner=
glusterdump.25428.dump repeats=5 stack=0x7f962c2cb968 pid=0 unique=0 lk-owner=
glusterdump.25428.dump repeats=2 stack=0x7f962c329f18 pid=60830 unique=0 lk-owner=88f50620967f0000
glusterdump.25429.dump repeats=5 stack=0x7f20782cb968 pid=0 unique=0 lk-owner=
glusterdump.25472.dump repeats=5 stack=0x7f27ac2cb968 pid=0 unique=0 lk-owner=
glusterdump.25473.dump repeats=5 stack=0x7f4fbc2cb9d8 pid=0 unique=0 lk-owner=
NOTE: stacks with lk-owner=""/lk-owner=0000000000000000/unique=0 may not be hung frames and need further inspection
fixes bz#1714415
Change-Id: Ib64a3fca63f49df2fafedcd4baa57e9b25411b08
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
At the moment new stack doesn't populate frame->root->unique in all cases. This
makes it difficult to debug hung frames by examining successive state dumps.
Fuse and server xlators populate it whenever they can, but other xlators won't
be able to assign 'unique' when they need to create a new frame/stack because
they don't know what 'unique' fuse/server xlators already used. What we need is
for unique to be correct. If a stack with same unique is present in successive
statedumps, that means the same operation is still in progress. This makes
'finding hung frames' part of debugging hung frames easier.
fixes bz#1714098
Change-Id: I3e9a8f6b4111e260106c48a2ac3a41ef29361b9e
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
| |
1401590: Deadcode
updates: bz#789278
Change-Id: I3aa1d3aa9769e6990f74b6a53e288e788173c5e0
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
|
| |
After basic analysis, found that these methods were not being
used at all.
updates: bz#1693692
Change-Id: If9cfa1ab189e6e7b56230c4e1d8e11f9694a9a65
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|