| Commit message (Collapse) | Author | Age | Files | Lines |
|
EC was ignoring lock contention notifications received while a lock was
being acquired. When a lock is partially acquired (some bricks have
granted the lock but some others not yet) we can receive notifications
from acquired bricks, which should be honored, since we may not receive
more notifications after that.
Since EC was ignoring them, once the lock was acquired, it was not
released until the eager-lock timeout, causing unnecessary delays on
other clients.
This fix takes into consideration the notifications received before
having completed the full lock acquisition. After that, the lock will
be released as soon as possible.
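As a rough model of the behaviour described above (a sketch only, not the EC translator's code; the class and method names are hypothetical), a contention notification from a brick that has already granted the lock is remembered even while acquisition is still partial, so the lock can be released at once instead of after the eager-lock timeout:

```python
class EagerLockModel:
    """Toy model of an eager lock spread over several bricks (hypothetical names)."""

    def __init__(self, brick_count, eager_timeout):
        self.brick_count = brick_count
        self.eager_timeout = eager_timeout
        self.granted = set()          # bricks that have granted the lock so far
        self.contention_seen = False  # set by any honoured contention notification

    def on_grant(self, brick):
        self.granted.add(brick)

    def on_contention(self, brick):
        # Honour the notification even if acquisition is still partial:
        # the brick may not send another one later.
        if brick in self.granted:
            self.contention_seen = True

    @property
    def fully_acquired(self):
        return len(self.granted) == self.brick_count

    def release_delay(self):
        # Release at once if another client is waiting, otherwise keep the
        # lock for the eager-lock timeout.
        return 0 if self.contention_seen else self.eager_timeout
```

In this model, a notification received after the first grant but before the last one still brings the release delay down to zero.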
Backport of:
> BUG: bz#1708156
> Change-Id: I2a306dbdb29fb557dcab7788a258bd75d826cc12
> Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
Fixes: bz#1717282
Change-Id: I2a306dbdb29fb557dcab7788a258bd75d826cc12
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
Signed-off-by: Hari Gowtham <hgowtham@redhat.com>
|
The following files are fixed:
tests/bugs/distribute/overlap.py
tests/utils/changelogparser.py
tests/utils/create-files.py
tests/utils/gfid-access.py
tests/utils/libcxattr.py
Marked glupy as a bad test.
Backport of:
> Change-Id: I3db857cc19e19163d368d913eaec1269fbc37140
> BUG: 1193929
> Signed-off-by: Kotresh HR <khiremat@redhat.com>
Change-Id: I3db857cc19e19163d368d913eaec1269fbc37140
Updates: bz#1629877
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
When eager-lock lock acquisition fails, for example because of network
failures, the local is not removed from owners_list. Waiting frames then
accumulate and the application hangs, because those frames assume that
another transaction is still acquiring the lock as long as owners_list is
not empty. This patch handles that case as well.
Added asserts to make it easier to find these problems in future.
fixes bz#1699736
Change-Id: I3101393265e9827755725b1f2d94a93d8709e923
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
Problem:
In an arbiter volume configuration, SHD will not send any writes to the arbiter
brick even if there is a data pending marker for the arbiter brick. If we have an
arbiter setup on the geo-rep master and there are data pending markers for the files
on the arbiter brick, SHD will not mark any data changelog during healing. While syncing
the data from master to slave, if the arbiter brick is considered ACTIVE, then
there is a chance that the slave will miss some data. If the arbiter brick is being
newly added or replaced, there is a chance of the slave missing all the data during sync.
Fix:
If there is data pending marker for the arbiter brick, send truncate on the arbiter
brick during heal, so that it will record truncate as the data transaction in changelog.
Change-Id: I3242ba6cea6da495c418ef860d9c3359c5459dec
fixes: bz#1687687
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
Fixes: bz#1673268
Change-Id: I2b9be45f199f6436b858536c6f49be85902217f0
Signed-off-by: Nigel Babu <nigelb@redhat.com>
|
Problem: When the quorum count option is updated, the change is not reflected in
the nfs-server.vol file. This is because in get_checksum_for_file(), when the
last chunk read from the file is smaller than the buffer size, the read buffer
still holds stale data from the previous read after the newly read bytes.
Solution: Pass the number of bytes read, instead of the fixed buffer size, when
calculating the checksum.
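The effect can be reproduced with a small sketch. This is not glusterd's code: the buffer size, the additive checksum and the function names are stand-ins chosen for illustration. A buffer reused across reads keeps bytes from the previous chunk, so summing the whole buffer after a short final read gives a wrong result:

```python
import io

BUF_SIZE = 8  # hypothetical and tiny, so that the last read is a short one


def simple_sum(data):
    # Stand-in checksum for this sketch, not the one glusterd uses.
    return sum(data) & 0xFFFFFFFF


def checksum_stream(stream, use_bytes_read):
    buf = bytearray(BUF_SIZE)  # reused across reads, like a fixed C buffer
    total = 0
    while True:
        n = stream.readinto(buf)
        if not n:
            break
        # Correct: only the first n bytes belong to this read.
        # Buggy: the tail of buf still holds data from the previous read.
        chunk = buf[:n] if use_bytes_read else buf
        total = (total + simple_sum(chunk)) & 0xFFFFFFFF
    return total
```

Calling `checksum_stream(io.BytesIO(data), True)` on input whose length is not a multiple of `BUF_SIZE` matches a checksum over the data itself, while passing `False` does not.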
Change-Id: I4b641607c8a262961b3f3da0028a54e08c3f8589
fixes: bz#1672248
Signed-off-by: Varsha Rao <varao@redhat.com>
|
PROBLEM:
Many of the earlier changes in the management of shards in the lru and fsync
lists assumed that if a given shard exists in the fsync list, it must be
part of the lru list as well. This was found not to be true.
Consider this - a file is FALLOCATE'd to a size which would make the
number of participant shards to be greater than the lru list size.
In this case, some of the resolved shards that are to participate in
this fop will be evicted from lru list to give way to the rest of the
shards. And once FALLOCATE completes, these shards are added to fsync
list but without a ref. After the fop completes, these shard inodes
are unref'd and destroyed while their inode ctxs are still part of
fsync list. Now when an FSYNC is called on the base file and the
fsync-list traversed, the client crashes due to illegal memory access.
FIX:
Hold a ref on the shard inode when adding to fsync list as well.
And unref under following conditions:
1. when the shard is evicted from lru list
2. when the base file is fsync'd
3. when the shards are deleted.
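The reference handling can be modelled roughly as follows. This is a sketch with hypothetical names, not the shard translator's code, and it covers only the fsync case out of the three unref conditions listed above:

```python
class ShardInode:
    """Toy inode with a reference count (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.refs = 0
        self.in_fsync_list = False

    def ref(self):
        self.refs += 1

    def unref(self):
        assert self.refs > 0, "unref without a matching ref"
        self.refs -= 1


def add_to_fsync_list(fsync_list, inode):
    # Take a ref when the inode joins the fsync list, so it cannot be
    # destroyed while the list still points at it.
    if not inode.in_fsync_list:
        inode.ref()
        inode.in_fsync_list = True
        fsync_list.append(inode)


def fsync_base_file(fsync_list):
    # On fsync of the base file, drain the list and drop the refs it held.
    while fsync_list:
        inode = fsync_list.pop()
        inode.in_fsync_list = False
        inode.unref()
```

With the extra ref in place, the unref done when the fop completes leaves the count at one until the base file is fsync'd.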
Change-Id: Iab460667d091b8388322f59b6cb27ce69299b1b2
fixes: bz#1669382
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
(cherry picked from commit 72922c1fd69191b220f79905a23395c3a87f86ce)
|
...when ctime is zero. ia_type and ia_gfid always need to be non-zero
for things to work correctly.
Problem:
Commit c9bde3021202f1d5c5a2d19ac05a510fc1f788ac zeroed out the iatt
buffer in the cbks of modification fops before unwinding if the ctime in
the buffer was zero. This was causing the fops to fail: noticeable when
AFR's 'consistent-metadata' option was enabled. (AFR zeros out the ctime
when the option is set. See commit
4c4624c9bad2edf27128cb122c64f15d7d63bbc8).
Fixes:
-Do not zero out the ia_type and ia_gfid of the iatt buff under any
circumstance.
-Also, fixed _rda_inode_ctx_update_iatts() to always update these values from
the incoming buf when ctime is zero. Otherwise we end up with zero
ia_type and ia_gfid the first time the function is called *and* the
incoming buf has ctime set to zero.
fixes: bz#1665145
Reported-By:Michael Hanselmann <public@hansmi.ch>
Change-Id: Ib72228892d42c3513c19fc6dfb543f2aa3489eca
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
(cherry picked from commit 09db11b0c020bc79d493c6d7e7ea4f3beb000c68)
|
rm -rf <dir> fails on dirs which contain linkto files
that point to themselves because dht incorrectly thought
that they were cached files after looking them up.
The fix now treats them as invalid linkto files
and deletes them.
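A minimal sketch of the classification, assuming the lookup knows on which subvolume the file was found and what its linkto xattr points to. The function name, the return labels and the subvolume names are hypothetical, not dht's:

```python
def classify_lookup(found_on, linkto_target):
    """Classify a file found on subvolume `found_on`.

    `linkto_target` is the subvolume named by the file's linkto xattr,
    or None if the file has no linkto xattr.
    """
    if linkto_target is None:
        return "data-file"
    if linkto_target == found_on:
        # A linkto file that points to its own subvolume is invalid and
        # should be deleted, not treated as the cached file.
        return "invalid-linkto"
    return "linkto"
```

For example, `classify_lookup("patchy-client-0", "patchy-client-0")` yields `"invalid-linkto"`.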
Change-Id: I376c72a5309714ee339c74485e02cfb4e29be643
fixes: bz#1671611
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
If a single brick is added to the volume and the
newly added brick is the first to respond to a
dht_revalidate call, its stbuf will not be merged
into local->stbuf as the brick does not yet have
a layout. The is_permission_different check therefore
fails to detect that an attr heal is required as it
only considers the stbuf values from existing bricks.
To fix this, merge all stbuf values into local->stbuf
and use local->prebuf to store the correct directory
attributes.
Change-Id: Ic9e8b04a1ab9ed1248b6b056e3450bbafe32e1bc
fixes: bz#1660736
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
The default value of shard-block-size was changed from 4MB
to 64MB sometime back. The script "fallocate"s a 6MB file
and expects it to have 1 shard under .shard. This worked when
the shard-block-size was 4MB. With the default value now at 64MB,
file "file1" won't have any shards under .shard and the stat on the
1st shard's path fails with ENOENT.
Changed the script to explicitly set shard-block-size to 4MB.
Change-Id: I7f1785922287d16d74c95fa57cbbe12e6e66e4f7
fixes: bz#1660932
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
(cherry picked from commit e69fc87593334b24432978dbf592fa73fe5fc38b)
|
Problem:
If parent dir is in split-brain or has dirty xattrs set, and the file
has gfid missing on one of the bricks, then name heal won't assign the
gfid.
Fix:
Use the brick we select the gfid from as the 'source'.
Note: Problem was found while trying to debug a split-brain issue on
Cynthia Zhou's setup.
fixes: bz#1655545
Change-Id: Id088d4f0fb017aa35122de426654194e581ed742
Reported-by: Cynthia Zhou <cynthia.zhou@nokia-sbell.com>
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
(cherry picked from commit 4d58730c0cd6ab5db39aec8a15276f7bd3371b04)
|
With commit febf5ed4848, during the volume create op,
we set volinfo->caps to 0 only if any of the bricks
belong to the same node and brickinfo->vg[0] is null.
Previously, we set volinfo->caps to 0 when
either the brick doesn't belong to the same node or brickinfo->vg[0]
is null.
With this patch, we again set volinfo->caps to 0 when either the brick
doesn't belong to the same node or brickinfo->vg[0] is null
(as we did before commit febf5ed4848).
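The two conditions differ only in their boolean structure, which a small sketch makes explicit. The function names are hypothetical, and "belongs to the same node" is modelled as a plain boolean:

```python
def clears_caps_with_febf5ed(brick_on_same_node, vg):
    # Condition as described for commit febf5ed4848:
    # brick on the same node AND no volume group.
    return brick_on_same_node and not vg


def clears_caps_with_this_patch(brick_on_same_node, vg):
    # Restored condition: brick not on the same node OR no volume group.
    return (not brick_on_same_node) or (not vg)
```

The two agree for a brick on the same node, and disagree for any brick that is not on the same node, whether or not it has a volume group.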
> BUG: bz#1635820
> Change-Id: I00a97415786b775fb088ac45566ad52b402f1a49
> Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
(cherry picked from commit aae1c402b74fd02ed2f6473b896f108d82aef8e3)
fixes: bz#1647968
Change-Id: I00a97415786b775fb088ac45566ad52b402f1a49
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
Patch https://review.gluster.org/#/c/glusterfs/+/19135/ optimised
the glusterd test cases by clubbing similar test
cases into a single test case.
The test case
https://review.gluster.org/#/c/glusterfs/+/19135/15/tests/bugs/glusterd/bug-1293414-import-brickinfo-uuid.t
was deleted and added as a part of
tests/bugs/glusterd/optimized-basic-testcases-in-cluster.t
In the original test case, we create a volume with two bricks,
each on a separate node (N1 & N2). From another node in the cluster (N3),
we try to detach a node which is hosting bricks. It fails.
In the new test, we created a volume with a single brick on N1,
and from another node in the cluster we tried to detach N1. We
expected peer detach to fail, but peer detach succeeded because
the node is hosting all the bricks of the volume.
This patch changes the new test case to cover the original test case scenario.
Please refer to https://bugzilla.redhat.com/show_bug.cgi?id=1642597#c1 to
understand why the new test case is not failing in centos-regression.
> BUG: bz#1642597
> Change-Id: Ifda12b5677143095f263fbb97a6808573f513234
> Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
(cherry picked from commit 0ca6773eaf5aeb507ebc72d2c2f61902eeff414c)
fixes: bz#1643078
Change-Id: Ifda12b5677143095f263fbb97a6808573f513234
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
Problem:
https://review.gluster.org/#/c/glusterfs/+/21427/ seems to be failing
this .t spuriously. On checking one of the failure logs, I see:
22:05:44 Launching heal operation to perform index self heal on volume patchy has been unsuccessful:
22:05:44 Self-heal daemon is not running. Check self-heal daemon log file.
22:05:44 not ok 20 , LINENUM:38
In glusterd log:
[2018-10-18 22:05:44.298832] E [MSGID: 106301] [glusterd-syncop.c:1352:gd_stage_op_phase] 0-management: Staging of operation 'Volume Heal' failed on localhost : Self-heal daemon is not running. Check self-heal daemon log file
But the tests which precede this check, via a statedump, whether the shd is
connected to the bricks; they succeeded, and the shd had even started
healing. From glustershd.log:
[2018-10-18 22:05:40.975268] I [MSGID: 108026] [afr-self-heal-common.c:1732:afr_log_selfheal] 0-patchy-replicate-0: Completed data selfheal on 3b83d2dd-4cf2-4ea3-a33e-4275be40f440. sources=[0] 1 sinks=2
So the only reason I can see launching heal via cli failing is a race where
shd has been spawned but glusterd has not yet updated in-memory that it is up,
and hence failing the CLI.
Fix:
Check for shd up status before launching heal via CLI
Change-Id: Ic88abf14ad3d51c89cb438db601fae4df179e8f4
fixes: bz#1641872
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
(cherry picked from commit 3dea105556130abd4da0fd3f8f2c523ac52398d1)
|
Backport of:
> Change-Id: Ic15ca41444dd04684a9458bd4a526b1d3e160499
> Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
> (cherry picked from commit e627977)
> BUG: 1605056
In __shard_update_shards_inode_list(), previously shard translator
was not holding a ref on the base inode whenever a shard was added to
the lru list. But if the base shard is forgotten and destroyed either
by fuse due to memory pressure or due to the file being deleted at some
point by a different client with this client still containing stale
shards in its lru list, the client would crash at the time of locking
lru_base_inode->lock owing to illegal memory access.
So now the base shard is ref'd into the inode ctx of every shard that
is added to lru list until it gets lru'd out.
The patch also handles the case where none of the shards associated
with a file that is about to be deleted are part of the LRU list and
where an unlink at the beginning of the operation destroys the base
inode (because there are no refkeepers) and hence all of the shards
that are about to be deleted will be resolved without the existence
of a base shard in-memory. This, if not handled properly, could lead
to a crash.
Change-Id: Ic15ca41444dd04684a9458bd4a526b1d3e160499
updates: bz#1641440
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
|
translators like readdir-ahead selectively retain entry information of
iatt (gfid and type) when rest of the iatt is invalidated (for write
invalidating ia_size, (m)(c)times etc). Fuse-bridge uses this
information and sends only entry information in readdirplus
response. However such option doesn't exist in gfapi. This patch
modifies gfapi to populate the stat by forcing an extra lookup.
Thanks to Shyamsundar Ranganathan <srangana@redhat.com> and Prashanth
Pai <ppai@redhat.com> for tests.
Change-Id: Ieb5f8fc76359c327627b7d8420aaf20810e53000
Fixes: bz#1630804
Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
(cherry picked from commit 6257276d9de3f15643f159b2ec627a67c84fc23d)
|
Backport of https://review.gluster.org/#/c/glusterfs/+/21135/
Problem:
When a directory has dirty xattrs due to failed post-ops or when
replace/reset brick is performed, AFR does a conservative merge as
expected, but heal-info reports it as split-brain because there are no
clear sources.
Fix:
Modify the pending flag to contain information about pending heals and
split-brains. For directories, if the split-brain flag is not set, just show
them as needing heal and not being in split-brain.
Change-Id: I09ef821f6887c87d315ae99e6b1de05103cd9383
fixes: bz#1638163
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
Backport of https://review.gluster.org/#/c/glusterfs/+/21380/
Problem:
In an arbiter volume, if there is a pending data heal of a file only on
arbiter brick, self-heal takes inodelks twice due to a code-bug but unlocks
it only once, leaving behind a stale lock on the brick. This causes
the next write to the file to hang.
Fix:
Fix the code-bug to take the lock only once. This bug was introduced in master
with commit eb472d82a083883335bc494b87ea175ac43471ff
Thanks to Pranith Kumar K <pkarampu@redhat.com> for finding the RCA.
fixes: bz#1638159
Change-Id: I15ad969e10a6a3c4bd255e2948b6be6dcddc61e1
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
The jenkins release-new job runs on a CentOS 7 box, which does not
have python3. As a result it runs (autogen.sh and) configure before
producing the dist tar file, converting all the python3 shebangs to
python2 shebangs in the dist tar file.
Then when that tar file is "carried" to, e.g. Fedora koji build
system to build packages, the shebangs are incorrect, despite having
originally been correct in the git repo.
Change-Id: I5154baba3f6d29d3c4823bafc2b57abecbf90e5b
updates: #411
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
|
Reverted the following:
- 248152767b0599986bbb6bb35fc27197f6be6964
- 09943beb499617212f2985ca8ea9ecd1ed1b470e
- d01f7244e9d9f7e3ef84e0ba7b48ef1b1b09d809
The reverts are redone by hand, due to clang format changes
that made using git to revert the changes more tedious.
Change-Id: I96489638a2b641fb2206a110298543225783f7be
Updates: bz#1628620
Signed-off-by: ShyamsundarR <srangana@redhat.com>
|
Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4
Signed-off-by: Nigel Babu <nigelb@redhat.com>
|
Adding checks for avoiding glusterd's working directory used as
a brick for volume creation.
fixes: bz#853601
Change-Id: I4b16a05f752e92216aa628f542a4fdbf59b3c669
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
It wouldn't make sense to allow the iostats file to be written in
*any* directory. The formatting makes sure we try to append
io-stats-name to the file name, so the chance of overwriting an existing
file is slim, but in any case it makes sense to restrict dumping to one directory.
Below are sample commands, and the files created for the corresponding
values:
$ setfattr -n trusted.io-stats-dump -v file-for-dump $M0
In this case, the dump file is /var/run/gluster/file-for-dump
$ setfattr -n trusted.io-stats-dump -v /dir1/dir2/file-for-dump $M0
In this case, the dump file is /var/run/gluster/dir1-dir2-file-for-dump
Note that the value passed for this virtual xattr is treated as a
file name, and if the value has '/' in it, it is changed to '-'
for sanity.
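The mapping from xattr value to dump file can be sketched as below. This is an illustration inferred from the two examples above, not the io-stats code: dropping the leading '/' is an assumption taken from the second example, any io-stats-name suffix is ignored, and the function name is hypothetical:

```python
DUMP_DIR = "/var/run/gluster"


def io_stats_dump_path(value):
    # Drop surrounding slashes, then flatten the remaining path separators
    # so the result is always a single file name inside DUMP_DIR.
    name = value.strip("/").replace("/", "-")
    return DUMP_DIR + "/" + name
```

Because every '/' is rewritten, a value can never name a file outside the dump directory.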
Fixes: bz#1625106
Change-Id: Id9ae6a40a190b8937c51662e6e1c2a0f6c86a0e0
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
xlators/cluster/stripe/src/stripe-helpers.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
xlators/cluster/dht/src/tier.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
xlators/cluster/dht/src/dht-layout.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
xlators/cluster/dht/src/dht-helper.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
xlators/cluster/dht/src/dht-common.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
xlators/cluster/afr/src/afr.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
xlators/cluster/afr/src/afr-inode-read.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
tests/bugs/replicate/bug-1250170-fsync.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
tests/basic/gfapi/gfapi-async-calls-test.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
tests/basic/ec/ec-fast-fgetxattr.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
rpc/xdr/src/glusterfs3.h: Move to GF_MALLOC() instead of GF_CALLOC() when possible
rpc/rpc-transport/socket/src/socket.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
rpc/rpc-lib/src/rpc-clnt.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
extras/geo-rep/gsync-sync-gfid.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-xml-output.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-rpc-ops.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-cmd-volume.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-cmd-system.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-cmd-snapshot.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-cmd-peer.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-cmd-global.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
It doesn't make sense to calloc (allocate and clear) memory
when the code right away fills that memory with data.
It may be optimized by the compiler, or have a microscopic
performance improvement.
In some cases, also changed allocation size to be sizeof some
struct or type instead of a pointer - easier to read.
In some cases, removed redundant strlen() calls by saving the result
into a variable.
1. Only done for the straightforward cases. There's room for improvement.
2. Please review carefully, especially for string allocation, with the
terminating NUL character.
Only compile-tested!
updates: bz#1193929
Original-Author: Yaniv Kaul <ykaul@redhat.com>
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Change-Id: I16274dca4078a1d06ae09a0daf027d734b631ac2
|
see https://review.gluster.org/#/c/19788/,
https://review.gluster.org/#/c/19871/,
https://review.gluster.org/#/c/19952/,
https://review.gluster.org/#/c/20104/,
https://review.gluster.org/#/c/20162/,
https://review.gluster.org/#/c/20185/,
https://review.gluster.org/#/c/20207/,
https://review.gluster.org/#/c/20227/,
https://review.gluster.org/#/c/20307/,
https://review.gluster.org/#/c/20320/,
https://review.gluster.org/#/c/20332/,
https://review.gluster.org/#/c/20364/,
https://review.gluster.org/#/c/20441/, and
https://review.gluster.org/#/c/20484
shebangs changed from /usr/bin/python2 to /usr/bin/python3.
(Reminder, various distribution packaging guidelines require use
of explicit python version and don't allow '#!/usr/bin/env python',
regardless of how handy that idiom may be.)
glusterfs.spec(.in) package python{2,3}-gluster and python2 or
python3 dependencies as appropriate.
configure(.ac):
+ test for and use python2 or python3 as appropriate. If build
machine has python2 and python3, use python3. Override by
setting PYTHON=/usr/bin/python2 when running configure.
+ PYTHONDEV_CPPFLAGS from python[23]-config --includes is a
better match to the original python sysconfig.get_python_inc().
All those other extraneous flags break the build.
+ Only change the shebangs once. Changing them over and over
again, e.g., during a `make glusterrpms` in extras/LinuxRPM
just sends make (is it really make that's looping?) into an
infinite loop. If you figure out why, let me know.
+ Oldest python2 is python2.6 on CentOS 6 and Debian 8 (Jessie).
Everything else has 2.7 or 3.x
+ logic from https://review.gluster.org/c/glusterfs/+/21050, which
needs to be removed/merged after that patch is merged.
Builds on CentOS 6, CentOS 7, Fedora 28, Fedora rawhide, and the
mysterious RHEL > 7.
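The interpreter selection rule described for configure (prefer python3 when both are present, let PYTHON override) can be sketched as a pure function. This is an illustration of the rule only, not the configure.ac logic, and the function name is hypothetical:

```python
def pick_python(available, env):
    """Return the interpreter path to use.

    `available` is the set of interpreter paths present on the build
    machine; `env` is the environment as a dict.
    """
    override = env.get("PYTHON")
    if override:
        # e.g. PYTHON=/usr/bin/python2 when running configure
        return override
    for candidate in ("/usr/bin/python3", "/usr/bin/python2"):
        if candidate in available:
            return candidate
    return None
```

With both interpreters present and no override, the function returns `/usr/bin/python3`.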
Change-Id: Idae21d3b6f58b32372e1daa0d234e491e563198f
updates: #411
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
|
The linkto file creation for the dst was done in parallel with
the unlink of the old src linkto. If these operations reached
the brick out of order, we end up with a dst linkto file without
a .glusterfs handle.
Fixed by unlinking the old src linkto only after the linkto file
creation has completed.
Change-Id: I4246f7655f5bc180f5ded7fd34d263b7828a8110
fixes: bz#1621981
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
Problem:
When metadata-self-heal is triggered on the mount, it blocks
lookup until metadata-self-heal completes. But that can lead
to hangs when a lot of clients are accessing a directory which
needs metadata heal and all of them trigger heals, each waiting
for other clients to complete the heal.
Fix:
Only when the heal is needed but the pending xattrs are not set,
trigger metadata heal that could block lookup. This is the only
case where different clients may give different metadata to the
clients without heals, which should be avoided.
Updates bz#1622821
Change-Id: I6089e9fda0770a83fb287941b229c882711f4e66
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
The value of trusted.pgfid.xx was always set to 1
in posix_mknod. This is incorrect if posix_mknod
calls posix_create_link_if_gfid_exists.
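A rough model of the difference, assuming that the trusted.pgfid.<parent-gfid> xattr counts the file's links under that parent and that an existing gfid leads to a hard link instead of a new file. The data structures and the function name are hypothetical:

```python
def mknod_model(pgfid_xattrs, known_gfids, gfid, parent_gfid):
    """Update the modelled pgfid xattr of `gfid` for a mknod under `parent_gfid`."""
    key = "trusted.pgfid." + parent_gfid
    xattrs = pgfid_xattrs.setdefault(gfid, {})
    if gfid in known_gfids:
        # The gfid already exists, so a hard link is created:
        # count one more link instead of resetting the value to 1.
        xattrs[key] = xattrs.get(key, 0) + 1
    else:
        # A genuinely new file starts with a count of 1.
        known_gfids.add(gfid)
        xattrs[key] = 1
    return xattrs[key]
```

Two calls with the same gfid and parent leave the modelled count at 2, where an unconditional "set to 1" would have left it at 1.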
Change-Id: Ibe87ca6f155846b9a7c7abbfb1eb8b6a99a5eb68
fixes: bz#1619720
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
In function cli_cmd_volume_statedump_options_parse if
the wordcount of arguments is exactly 3, then option_str
would remain NULL, and hence the function will generate
a segmentation fault on the strstr check in its body.
This can be triggered when we run the command,
`gluster volume statedump <volname>`
The fix is to check if option_str is non-NULL before use
and also to pass in a duplicated empty string to the dict
key "options" when this is NULL.
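The shape of the fix can be sketched as follows. This is a model of the logic only, not the CLI's C code: the function name is hypothetical and the substring checked here is a placeholder, since the message does not say what the strstr check looks for:

```python
def parse_statedump_options(words):
    """words: the CLI words, e.g. ["volume", "statedump", "<volname>"]."""
    # With exactly three words there are no options, so option_str stays None.
    option_str = " ".join(words[3:]) if len(words) > 3 else None

    # Guard before the substring check (the C code crashed in strstr on NULL).
    # "quotad" is only a placeholder needle for this sketch.
    has_needle = option_str is not None and "quotad" in option_str

    # Store an empty string, never None, under the "options" key.
    options = {"options": option_str if option_str is not None else ""}
    return options, has_needle
```

A call with only the volume name now returns an empty options string instead of failing.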
Fixes: bz#1619423
Change-Id: Ic029ab60b64890d92c7a0876a638929495d3aa59
Signed-off-by: ShyamsundarR <srangana@redhat.com>
|
Problem:
In line #13 of the test case, it checks whether the file is present
on the first 2 bricks or not. If it is not present on even one of the bricks,
it breaks the loop, checks for the dirty marking on the parent
on the 3rd brick, and checks that the file is not present on the 1st and 2nd
bricks. The below scenario can happen in this case:
- File gets created on 1st and 3rd bricks
- In line #13 it sees file is not present on both 1st & 2nd bricks and
breaks the loop
- In line #51 test fails because the file will be present on the 1st brick
- In line #53 test will fail because the file creation did not fail on
quorum bricks and dirty marking will not be there on the parent on 3rd
brick
Fix:
Don't break from the loop if file is present on either brick 1 or brick 2.
Change-Id: I918068165e4b9124c1de86cfb373801b5b432bd9
fixes: bz#1612054
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
Problem:
If the gfid link file inside .glusterfs is not present for a file,
the operations which are dependent on the gfid will fail,
complaining that the link file does not exist inside .glusterfs.
Fix:
If the link file creation fails, fail the entry creation operation
and delete the original file.
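The "fail the operation and roll back" pattern can be sketched on a local filesystem. This is an illustration with hypothetical names, not the posix translator's code, and the gfid link is modelled as a plain hard link:

```python
import os


def create_entry(path, gfid_link_path):
    # Create the entry itself (fails if it already exists).
    with open(path, "x"):
        pass
    try:
        # Create the gfid link for the new entry.
        os.link(path, gfid_link_path)
    except OSError:
        # Link creation failed: undo the entry creation and report failure,
        # so no file is left behind without its link.
        os.unlink(path)
        raise
```

If the link target's directory is missing, the call raises and the original file is gone again.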
Change-Id: Id767511de2da46b1f45aea45cb68b98d965ac96d
fixes: bz#1612037
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
Problem:
During a handshake, when we import friend data, the
snap description variable just referenced the
dictionary value.
Solution:
The snap description should have separate memory allocated
through gf_strdup.
Change-Id: I94da0c57919e1228919231d1563a001362b100b8
fixes: bz#1618004
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
modifications
PROBLEM:
Stats of dentries that are readdirp'd ahead can become stale due to
fops like writes, truncate etc that modify the file pointed by
dentries. When a readdir is finally wound at offset corresponding to
these entries, the iatts that are returned to the application come
from readdir-ahead's cache, which are stale by now. This problem gets
further aggravated when caching translators/modules cache and continue
to serve this stale information.
FIX:
* Store the iatt in context of the inode pointed by dentry.
* Whenever the inode pointed by dentry undergoes modification, in cbk
of modification fop, update the iatt stored in inode-ctx to reflect
the modification.
* When serving a readdirp response from application, update iatts of
dentries with the iatts stored in the context of inodes pointed by
these dentries.
* Some fops don't have valid iatts in their responses. For eg., write
response whose data is still cached in write-behind will have zeroed
out stat. In this case keep only ia_type and ia_gfid and reset rest
of the iatt members to zero.
- fuse-bridge in this case just sends "entry" information back to
kernel and attr is not sent.
- gfapi sets entry->inode to NULL and zeroes out the entire stat
* There is one tiny race between the entry creation and a readdirp on
its parent dir, which could cause the inode-ctx setting and inode
ctx reading to happen on two different inode objects. To prevent
this, when entry->inode doesn't equal linked_inode,
- fuse-bridge is made to send only "entry" information without
attributes
- gfapi sets entry->inode to NULL and zeroes out the entire stat.
Change-Id: Ia27ff49a61922e88c73a1547ad8aacc9968a69df
BUG: 1390050
Updates: bz#1390050
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
|
This script on a normal setup takes 15 minutes. With lcov it needs
to be increased. Considering we did 1.5X of the default $run_timeout
in run-tests.sh, I am doing the same for this script.
fixes bz#1614718
Change-Id: Ia571b33ff13deb8cbd8e48561769e876aa0b1aff
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
Setting the refresh flag in inode ctx in shard_rename_src_cbk()
is applicable only when the dst file exists and is sharded and
has a hard link > 1 at the time of rename.
But this piece of code is exercised even when dst doesn't exist.
In this case, the mount crashes because local->int_inodelk.loc.inode
is NULL.
Change-Id: Iaf85a5ee3dff8b01a76e11972f10f2bb9dcbd407
Updates: bz#1611692
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
|
fix brick checks for validating-server-quorum.t & quorum-validation.t
...and make brick_up_status_1 function more generic.
Also fix a timing issue in
bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t
Change-Id: I797ef4cec5b160aafa979bae7151b1e99fcb48ac
Updates: bz#1603063
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
|
Earlier this test did following things on M0 and M1 mounted on same
volume:
1 create file M0/testfile
2 open an fd on M0/testfile
3 remove the file from M1, M1/testfile
4 echo "data" >> M0/testfile
The test expects appending data to M0/testfile to fail. However,
redirector ">>" creates a file if it doesn't exist. So, the only
reason test succeeded was due to lookup succeeding due to stale stat
in md-cache. This hypothesis is verified by two experiments:
* Add a sleep of 10 seconds before append operation. md-cache cache
expires and lookup fails followed by creation of file and hence append
succeeds to new file.
* set md-cache timeout to 600 seconds and test never fails even with
sleep 10 before append operation. Reason is stale stat in md-cache
survives sleep 10.
So, the spurious nature of failure was dependent on whether lookup is
done when stat is present in md-cache or not.
The actual test should've been to write to the fd opened in step 2
above. I've changed the test accordingly. Note that this patch also
remounts M0 after initial file creation as open-behind disables
opening-behind on witnessing a setattr on the inode and touch involves
a setattr. On remount, create operation is not done and hence file is
opened-behind.
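The point about ">>" can be reproduced on a local POSIX filesystem. This sketch shows only that appending by path recreates a removed file, while an fd opened earlier still refers to the old file; it does not model md-cache or open-behind, and the helper names are hypothetical:

```python
import os
import shutil
import tempfile


def append_by_path(path, data):
    # Same effect as `echo data >> path`: the file is created if missing.
    with open(path, "a") as f:
        f.write(data)


def demo():
    d = tempfile.mkdtemp()
    try:
        path = os.path.join(d, "testfile")
        with open(path, "w") as f:
            f.write("x")
        fd = os.open(path, os.O_WRONLY | os.O_APPEND)  # step 2: open an fd
        os.unlink(path)                                # step 3: remove the file
        append_by_path(path, "data\n")                 # step 4: append by path
        recreated = os.path.exists(path)
        # The old fd still refers to the removed file, not the new one.
        same_inode = os.fstat(fd).st_ino == os.stat(path).st_ino
        os.close(fd)
        return recreated, same_inode
    finally:
        shutil.rmtree(d)
```

This is why the append could not be relied on to fail, and why writing through the fd from step 2 is the operation that actually exercises the removed file.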
Change-Id: I739f255e0a62ff0024f0824dad3539974955df99
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Fixes: bz#1615096
|
Problem:
The test case was checking for the entry pending marker reset
on the root after performing client side lookup at line #60-63.
But sometimes the entry heal was not getting completed immediately.
Fix:
Wait for the entry heal to complete before checking the changelog.
Change-Id: I42fde21b04a126ab044ce58373a996d72f125d96
fixes: bz#1614730
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
See BZ for details.
Change-Id: I2cc2064f14d80271ebcc21747103ce4cee848cbf
fixes: bz#1615078
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
Check for the bricks to be up before attempting to mount.
Change-Id: I1224908137016df3007f4467aa9760967ce0694d
Fixes: bz#1615092
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
|
Some of the mux tests set a trap to catch test exit and
call cleanup. This causes cleanup to be invoked twice
when the test times out, or even otherwise, as include.rc
also sets a trap to clean up on exit (TERM and others).
As a result, the tarballs generated on failure for these
tests are empty and do not aid debugging.
This patch corrects this pattern across the tests to the
more standard cleanup at the end.
Fixes: bz#1615037
Change-Id: Ib83aeb09fac2aa591b390b9fb9e1f605bfef9a8b
Signed-off-by: ShyamsundarR <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a command passed to EXPECT_WITHIN is wrapped in backticks (``),
it is evaluated once and its output is passed by value. If that first
execution gives a result that is different from the expected value,
the EXPECT_WITHIN test fails because the command is never
re-evaluated. Changed the backtick expression to a function. Added
sleep(3) in afr.c for reconfigure, both to root-cause (RC) the failure
and to re-test after the change.
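The difference between passing a value and passing something that can
be re-evaluated can be sketched in Python. This is only an analogue of
the shell behaviour, not the test framework itself: expect_within() and
Counter are hypothetical names, and the polling interval is arbitrary.

```python
import time


def expect_within(timeout, expected, probe, interval=0.01):
    """Call probe() repeatedly until it returns `expected` or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        if probe() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


class Counter:
    """Reports "ready" only from the third status check onwards."""

    def __init__(self):
        self.calls = 0

    def status(self):
        self.calls += 1
        return "ready" if self.calls >= 3 else "pending"


# Analogue of the backtick form: the command runs once, up front, and
# every retry compares against that same captured output.
early = Counter()
snapshot = early.status()
frozen_result = expect_within(0.1, "ready", lambda: snapshot)

# Analogue of passing a function: the command is re-run on every retry.
late = Counter()
live_result = expect_within(1.0, "ready", late.status)
```

The first call can never succeed, however long the timeout, because the
captured value does not change; the second succeeds on its third poll.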
fixes bz#1614662
Change-Id: I3bc8a75b996729261aa48067f6ed8da9c6273b13
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: After a node reboots, a brick does not come up because
the fsid comparison fails before the brick is started.
Solution: Compare volume_id instead of fsid to resolve this,
because fsid changes after a node reboot, whereas
volume_id persists as an xattr that is set on the
brick_root path when the volume is created.
Change-Id: Ic289aab1b4ebfd83bbcae8438fee26ae61a0fff4
fixes: bz#1612418
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In lcov based regression testing environments, all tests take
more time than they do in centos7 regressions, possibly
due to code instrumentation for lcov purposes.
Because of this, the test bug-1432542-mpx-restart-crash.t constantly
times out. This patch increases its timeout to enable
lcov tests to pass on a more regular basis.
It was also noted by Nithya that the test at times generated an
OOM kill on the regression machines. In order to reduce the runtime
memory footprint of the tests, FUSE mounts are unmounted as
soon as the required test is complete.
Fixes: bz#1608568
Change-Id: I37f8d4b45807a69c52c7c7df4923c0fc33fab4e4
Signed-off-by: ShyamsundarR <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
gluster_shared_storage bricks
Problem: In a brick multiplexing environment, bricks of a normal volume
created by the user are getting attached to the bricks of the volume
"gluster_shared_storage", which is created by enabling the
enable-shared-storage option. Mounting gluster_shared_storage
has strict authentication checks. When we attach bricks of a normal
volume to bricks of gluster_shared_storage, mounting the normal
volume created by the user fails due to these strict authentication checks.
Solution: We should not attach bricks of a normal volume to the brick
process of the gluster_shared_storage volume, and vice versa.
fixes: bz#1610726
Change-Id: If1b5a2a02675789a2915ba480fb48c145449163d
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Anonymous fds interfere with the working of read-ahead, as read-ahead won't
be able to store its cache in the fd. Also, as seen in bz 1455872,
anonymous fds affect the performance of large-file sequential reads,
as the cost of opening an fd for each read on the brick stack is
significant. So, have a proper fd, which enables read-ahead to store
its cache and the brick stack to reuse the fd during reads.
With this change the test
tests/bugs/snapshot/bug-1167580-set-proper-uid-and-gid-during-nfs-access.t
fails consistently. The failure can also be seen with open-behind
off. bz 1611532 has been filed to track the issue with the test. Thanks to
Rafi <rkavunga@redhat.com> for assistance provided in debugging the test
failure.
Change-Id: Ifa52d8ff017f115e83247f3396b9d27f0295ce3f
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Fixes: bz#1455872
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
invalidation
Invalidations are triggered mainly by two codepaths: upcall, and
write-behind unwinding a cached write with a zeroed out stat. For the
case of upcall, the following race can happen:
* stat s1 is fetched from brick
* invalidation is detected on brick
* invalidation is propagated to md-cache and cache is invalidated
* s1 updates md-cache with a stale state
For the case of write-behind, imagine the following sequence of operations:
* A stat s1 was issued from application thread t1 when the size of the
file was s1
* stat s1 completes on the brick stack, but is yet to reach md-cache
* A write w1 from application thread t2, which extends the file to size
s2, is cached in write-behind and the response is unwound with a zeroed
out stat
* md-cache, while handling write-cbk, invalidates the cache
* md-cache receives the response for s1 and updates the cache with a
stale stat with size s1, overwriting the invalidation state
The fix is to remember when s1 was incident on md-cache and update the
cache with the results of s1 only if it was incident after the
invalidation of the cache.
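The ordering check described in the fix can be sketched in Python as
follows. This is an illustration under assumptions, not the md-cache
implementation: StatCache, begin_request(), invalidate(), and
complete_request() are hypothetical names, and a logical counter stands
in for however the real code records when a request was incident.

```python
import itertools


class StatCache:
    """Toy cache that drops replies to requests older than the last invalidation."""

    def __init__(self):
        self._clock = itertools.count(1)   # logical time, strictly increasing
        self._invalidated_at = 0
        self.stat = None

    def begin_request(self):
        """Record when a stat request is incident on the cache layer."""
        return next(self._clock)

    def invalidate(self):
        """Drop the cached stat and remember when that happened."""
        self._invalidated_at = next(self._clock)
        self.stat = None

    def complete_request(self, incident_at, stat):
        """Store the reply only if the request arrived after the invalidation."""
        if incident_at > self._invalidated_at:
            self.stat = stat
            return True
        return False                        # stale reply, ignored


cache = StatCache()
s1 = cache.begin_request()                  # stat s1 issued, size is 100
cache.invalidate()                          # write w1 unwound with zeroed stat
stale_applied = cache.complete_request(s1, {"size": 100})

s2 = cache.begin_request()                  # a stat issued after the invalidation
fresh_applied = cache.complete_request(s2, {"size": 200})
```

The late reply for s1 is discarded because it was incident before the
invalidation, so it cannot overwrite the invalidated state.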
This patch identified some bugs in regression tests, which are tracked
in https://bugzilla.redhat.com/show_bug.cgi?id=1608158. As a stopgap
measure I am marking the following tests as bad:
basic/afr/split-brain-resolution.t
bugs/bug-1368312.t
bugs/replicate/bug-1238398-split-brain-resolution.t
bugs/replicate/bug-1417522-block-split-brain-resolution.t
bugs/replicate/bug-1438255-do-not-mark-self-accusing-xattrs.t
Change-Id: Ia4bb9dd36494944e2d91e9e71a79b5a3974a8c77
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Updates: bz#1512691
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In a brick mux scenario, glusterd is sometimes not able
to start/attach a brick, yet gluster v status shows
the brick as already running
Solution:
1) To make sure a brick is running, check for brick_path in
/proc/<pid>/fd. If the brick is consumed by the brick
process, the brick stack has come up; otherwise it has not
2) Before starting/attaching a brick, check whether the brick is
mounted
3) At the time of printing volume status, check whether the brick is
consumed by any brick process
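The check in step 1 can be sketched in Python on Linux. This is a
minimal illustration, not the glusterd code: process_has_path_open() is
a hypothetical helper, and matching by path prefix is an assumption
about what "consumed" means here.

```python
import os
import tempfile


def process_has_path_open(pid, path):
    """Return True if any fd of `pid` points at `path` or something below it."""
    target = os.path.realpath(path)
    fd_dir = f"/proc/{pid}/fd"
    try:
        fds = os.listdir(fd_dir)
    except OSError:
        return False                  # no such process, or no /proc available
    for fd in fds:
        try:
            link = os.readlink(os.path.join(fd_dir, fd))
        except OSError:
            continue                  # fd was closed while we were scanning
        if link == target or link.startswith(target + os.sep):
            return True
    return False


# Demonstration on the current process, with a temp dir standing in for a brick.
brick_dir = tempfile.mkdtemp()
fd = os.open(brick_dir, os.O_RDONLY)
held_while_open = process_has_path_open(os.getpid(), brick_dir)
os.close(fd)
held_after_close = process_has_path_open(os.getpid(), brick_dir)
os.rmdir(brick_dir)
```

A pid that exists but holds no descriptor under the brick path is
reported as not having the brick up, which is the case the problem
statement describes.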
Test: The following procedure was used to test this
1) Set up a brick mux environment on a VM
2) Set a breakpoint in gdb in the function posix_health_check_thread_proc
at the point where it notifies the GF_EVENT_CHILD_DOWN event
3) Forcefully unmount any one brick path
4) Check gluster v status; it shows N/A for the brick
5) Try to start the volume with the force option; glusterd throws the
message "No device available for mount brick"
6) Mount the brick_root path
7) Try to start the volume with the force option
8) The down brick is started successfully
fixes: bz#1595320
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently this lru limit is hard-coded to 16384. This patch makes it
configurable to make it easier to hit the lru limit and enable testing
of different cases that arise when the limit is reached.
The option is features.shard-lru-limit. It is by design allowed to
be configured only in init() but not in reconfigure(). This is to avoid
all the complexity associated with eviction of least recently used shards
when the list is shrunk.
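The design choice can be sketched in Python as an LRU list whose limit
is fixed at construction. This is an illustration only, not the shard
translator: ShardLRU and touch() are hypothetical names, and only the
default of 16384 comes from the text above.

```python
from collections import OrderedDict


class ShardLRU:
    """LRU list whose limit is set once at construction and never changed."""

    def __init__(self, lru_limit=16384):
        if lru_limit < 1:
            raise ValueError("lru_limit must be positive")
        self._limit = lru_limit        # no setter: mirrors init()-only configuration
        self._entries = OrderedDict()  # oldest entry first

    @property
    def limit(self):
        return self._limit

    def touch(self, shard_id, inode):
        """Insert or refresh a shard; return the evicted shard id, if any."""
        if shard_id in self._entries:
            self._entries.move_to_end(shard_id)
            self._entries[shard_id] = inode
            return None
        self._entries[shard_id] = inode
        if len(self._entries) > self._limit:
            evicted, _ = self._entries.popitem(last=False)
            return evicted
        return None


# A small limit makes the eviction path easy to reach in a test.
lru = ShardLRU(lru_limit=2)
first = lru.touch("a", "inode-a")
second = lru.touch("b", "inode-b")
refreshed = lru.touch("a", "inode-a")   # "a" becomes most recently used
evicted = lru.touch("c", "inode-c")     # list is full, the oldest entry goes
```

Because the limit cannot change after construction, the sketch never
has to evict several entries at once to shrink the list.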
Change-Id: Ifdcc2099f634314fafe8444e2d676e192e89e295
updates: bz#1605056
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
|