| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
Also corrected a typo in 4.1.4 release notes.
Change-Id: I1ee0f4e4409a0a6af6c2940acb2ff70ea2db824e
Signed-off-by: ShyamsundarR <srangana@redhat.com>
Fixes: bz#1638055
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
lease_id is a 16 bits opaque data, copying it by gf_strdup is wrong.
Invalid read of size 2
at 0x483FA2F: memmove (vg_replace_strmem.c:1270)
by 0xE2EF6FB: ??? (in /usr/lib64/libtirpc.so.3.0.0)
by 0xE2EE047: xdr_opaque (in /usr/lib64/libtirpc.so.3.0.0)
by 0x107A97DC: xdr_gfx_value (glusterfs4-xdr.c:207)
by 0x107A98C0: xdr_gfx_dict_pair (glusterfs4-xdr.c:321)
by 0xE2EF35E: xdr_array (in /usr/lib64/libtirpc.so.3.0.0)
by 0x107A9A89: xdr_gfx_dict (glusterfs4-xdr.c:335)
by 0x107AA97B: xdr_gfx_write_req (glusterfs4-xdr.c:897)
by 0x107A181E: xdr_serialize_generic (xdr-generic.c:25)
by 0x231044A2: client_submit_request (client.c:205)
by 0x2314D3C1: client4_0_writev (client-rpc-fops_v2.c:3863)
by 0x230FD5FA: client_writev (client.c:956)
Address 0xad659e18 is 72 bytes inside a block of size 73 alloc'd
at 0x483880B: malloc (vg_replace_malloc.c:299)
by 0x106BA7EC: __gf_malloc (mem-pool.c:136)
by 0x1064521E: gf_strndup (mem-pool.h:166)
by 0x1064521E: gf_strdup (mem-pool.h:183)
by 0x1064521E: get_fop_attr_thrd_key (glfs.c:627)
by 0x1064D8E9: glfs_pwritev@@GFAPI_3.4.0 (glfs-fops.c:1154)
by 0x10610C0C: glusterfs_write2 (handle.c:2092)
by 0x54D30C: mdcache_write2 (mdcache_file.c:647)
by 0x48A3FC: nfs4_write (nfs4_op_write.c:459)
by 0x48A44D: nfs4_op_write (nfs4_op_write.c:487)
by 0x4634F5: nfs4_Compound (nfs4_Compound.c:947)
by 0x460155: nfs_rpc_process_request (nfs_worker_thread.c:1329)
by 0x4608A3: nfs_rpc_valid_NFS (nfs_worker_thread.c:1539)
by 0x488F12F: svc_vc_decode (svc_vc.c:825)
Backport of:
> Patch: https://review.gluster.org/21586/
> BUG: bz#1647651
> Change-Id: Ib9fff55c897bc43c15036a869888e763df133757
> Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
(cherry picked from commit 6d4cd8ce6c0d88d331ffed97c51d3061a3900561)
Updates bz#1648938
Change-Id: I881d1e9aeb343d456cbf80d16bc46fd4a81a8e43
Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
|
|
|
|
|
|
|
|
|
|
| |
long standing bug in .spec file.
I guess nobody has ever built rpms with '--without bd'
Change-Id: I71e26c3d06af5d329ae89cc249a4ad88664ddf53
fixes: bz#1648982
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
this fix is just a review of possible SIGSEGV issues in line 714
as per crash report at the bug:
```
08:35:25 Program terminated with signal 11, Segmentation fault.
08:35:25 #0 0x00007f4ebb491c5c in gf_time_fmt (dst=0x7f4eb1ff9a90 "", sz_dst=256, utime=1531470915, fmt=0)
at /home/jenkins/root/workspace/centos7-regression/libglusterfs/src/common-utils.h:714
```
fixes: bz#1648367
Change-Id: I160c391f8ac1a3456e59103d293b24e0e3fae718
Signed-off-by: Amar Tumballi <amarts@redhat.com>
(cherry picked from commit 6c2deb080aa2df73d3cb2a5f330208d30e9c6759)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While syncing metadata, 'os.chmod', 'os.chown',
'os.utime' should be used without de-reference.
But python supports only 'os.chown' without
de-reference. That's mostly because Linux
doesn't support 'chmod' on symlink file itself
but it does support 'chown'.
So while syncing metadata ops, if it's symlink
we should only sync 'chown' and not do 'chmod'
and 'utime'. It will lead to tracebacks with
errors like EROFS, EPERM, ACCESS, ENOENT.
All the three errors (EPERM, ACCESS, ENOENT)
were handled except EROFS. But the way it was
handled was not fool proof. The operation is
tried and failure was handled based on the errors.
All the errors with symlink file for 'chown',
'utime' had to be passed to safe errors list of
'errno_wrap'. This patch handles it better by
avoiding 'chmod' and 'utime' if it's symlink
file.
Backport of:
> Patch: https://review.gluster.org/21546/
> BUG: 1646104
> Change-Id: Ic354206455cdc7ab2a87d741d81f4efe1f19d77d
> Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit 3c6cf9a4a1b46cab2dc53c1ee0afca0fe993102e)
fixes: bz#1646806
Change-Id: Ic354206455cdc7ab2a87d741d81f4efe1f19d77d
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
By allowing clients taking dump in a file on brick process, we are
allowing compromised clients to create io-stats dumps on server,
which can exhaust all the available inodes.
Fixes: CVE-2018-14659
Fixes: bz#1647669
Change-Id: I32bfde9d4fe646d819a45e627805b928cae2e1ca
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
'getspec' operation is not used between 'client' and 'server' ever since
we have off-loaded volfile management to glusterd, ie, at least 7 years.
No reason to keep the dead code! The removed option had no meaning,
as glusterd didn't provide a way to set (or unset) this option. So,
no regression should be observed from any of the existing glusterfs
deployment, supported or unsupported.
Updates: CVE-2018-14653
Updates: bz#1647670
Change-Id: I4a2e0f673c5bcd4644976a61dbd2d37003a428eb
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
as key size in xdr can be anything, it can be bigger than the
'NAME_MAX' allowed in the structure, which can allow for service denial
attacks.
Fixes: CVE-2018-14653
Fixes: bz#1647670
Change-Id: I2dc5e99af27ddf44c12c94b07e51adb8674cce80
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Server stack needs to have all the sort of validation, assuming
clients can be compromized. It is possible for a compromized
client to send basenames with paths with '/', and with that
create files without permission on server. By sanitizing the basename,
and not allowing anything other than actual directory as the parent
for any entry creation, we can mitigate the effects of clients
not able to exploit the server.
Fixes: CVE-2018-14651
Fixes: bz#1647667
Change-Id: I5dc0da0da2713452ff2b65ac2ddbccf1a267dc20
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, there are possibilities in few places, where a user-controlled
(like filename, program parameter etc) string can be passed as 'fmt' for
printf(), which can lead to segfault, if the user's string contains '%s',
'%d' in it.
Fixes: CVE-2018-14661
NOTE: this change is a focused fix for the CVE, but is just subset of
changes in master. This is done so that we keep the changes in the
codebase to minimum, and also as clang coding standard is implemented,
the changes wouldn't apply cleanly from master, so there is scope for
mistakes. By keeping it to minimum, we solve CVE, and also prevent
errors.
Fixes: bz#1647668
Change-Id: Ib547293f2d9eb618594cbff0df3b9c800e88bde4
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Fixes CID 1396581
Change-Id: Ic04091b5783a75d8e1e605a9c1c28b77fea048d3
updates: bz#1647972
Signed-off-by: Vijay Bellur <vbellur@redhat.com>
Signed-off-by: Susant Palai <spalai@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the current scheme of glusterfs where lock migration is
experimental, (ideally) the rebalance process which is migrating
the file should request for a metalock. Hence, the metalock count
should not be more than one for an inode. In future, if there is a
need for meta-lock from other clients, this patch can be reverted.
Since pl_metalk is called as part of setxattr operation, any client
process(non-rebalance) residing outside trusted network can exhaust
memory of the server node by issuing setxattr repetitively on the
metalock key. The current patch makes sure that more than
one metalock cannot be granted on an inode.
Fixes CVE-2018-14660
updates: bz#1647972
Change-Id: Ie1e697766388718804a9551bc58351808fe71069
Signed-off-by: Susant Palai <spalai@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Patch in master: https://review.gluster.org/#/c/glusterfs/+/21534/
A compromised client can set arbitrary values for the GF_XATTROP_ENTRY_IN_KEY
and GF_XATTROP_ENTRY_OUT_KEY during xattrop fop. These values are
consumed by index as a filename to be created/deleted according to the key.
Thus it is possible to create/delete random files even outside the gluster
volume boundary.
Fix:
Index expects the filename to be a basename, i.e. it must not contain any
pathname components like "/" or "../". Enforce this.
Fixes: CVE-2018-14654
Fixes: bz#1646200
Change-Id: I35f2a39257b5917d17283d0a4f575b92f783f143
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the commit febf5ed4848, during the volume create op,
we are setting volinfo->caps to 0, only if any of the bricks
belong to the same node and brickinfo->vg[0] is null.
Previously, we used to set volinfo->caps to 0, when
either brick doesn't belong to the same node or brickinfo->vg[0]
is null.
With this patch, we set volinfo->caps to 0, when either brick
doesn't belong to the same node or brickinfo->vg[0] is null.
(as we do earlier without commit febf5ed4848).
> BUG: bz#1635820
> Change-Id: I00a97415786b775fb088ac45566ad52b402f1a49
> Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
fixes: bz#1643052
Change-Id: I00a97415786b775fb088ac45566ad52b402f1a49
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When 'gluster-mountbroker status' was issued, it
crashes in a corner case with 'str object has not
attribute get'. Fixed the same.
Backport of:
> Patch: https://review.gluster.org/21507
> fixes: bz#1643929
> Signed-off-by: Kotresh HR <khiremat@redhat.com>
> Change-Id: Iaf1a937ed0136b3b2058230c75fa89a215d8a5eb
(cherry picked from commit 5987b3388126a3c5e77481913cbaa4142117d19a)
fixes: bz#1644516
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Change-Id: Iaf1a937ed0136b3b2058230c75fa89a215d8a5eb
|
|
|
|
|
|
|
|
|
|
|
|
| |
posix_update_utime_in_mdata() unconditionally logs an error
if consistent time attributes features is not enabled. This
log does not add any value, prints an incorrect errno &
floods the log file. Hence nuking this log message in this
patch.
fixes: bz#1644524
Change-Id: I01736d2ed48d14f12ccd8a808521f59145e42ccb
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Geo-rep's automatic error handling does gfid conflict
resolution. But if there are ENOENT errors because the
parent is not synced to slave, it doesn' handle them.
This patch adds the intelligence to create missing
parent directories on slave. It can create the missing
directories upto the depth of 10.
Backport of:
> Patch: https://review.gluster.org/21498
> BUG: 1643402
> Change-Id: Ic97ed1fa5899c087e404d559e04f7963ed7bb54c
> Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit 19775e0445411cca9ddd9d294fd54d0b6fbe6a03)
fixes: bz#1644518
Change-Id: Ic97ed1fa5899c087e404d559e04f7963ed7bb54c
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
A compromised client can send a variable length buffer value for the
GF_XATTR_CLRLK_CMD virtual xattr. If the length is greater than the
size of the "key" used to send the response back, locks xlator can
segfault when it tries to do a dict_set because of the buffer overflow
in strncpy of pl_getxattr().
Fix:
Perform size checks while forming the 'key'.
Note:
This fix is already there in the master branch upstream as a part of the
commit 052849983e51a061d7fb2c3ffd74fa78bb257084 (https://review.gluster.org/#/c/glusterfs/+/20933/)
This patch just picks the code change needed to fix the vulnerability.
Fixes: CVE-2018-14652
fixes: bz#1645363
Change-Id: I101693e91f9ea2bd26cef6c0b7d82527fefcb3e2
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
For lease operation, we allocate and store child nodes
data in lease structure. Use the same in afr_lease_cbk()
while checking for the quorum.
Change-Id: If1fdd5a0798888afd39ad3df57d96487baf9d1e6
fixes: bz#1644474
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
During gfid-conflict-resolution, geo-rep crashes
with 'ValueError: list.remove(x): x not in list'
Cause and Analysis:
During gfid-conflict-resolution, the entry blob is
passed back to master along with additional
information to verify it's integrity. If everything
looks fine, the entry creation is ignored and is
deleted from the original list. But it is crashing
during removal of entry from the list saying entry
not in list. The reason is that the stat information
in the entry blob was modified and sent back to
master if present.
Fix:
Send back the correct stat information for
gfid-conflict-resolution.
Backport of:
> BUG: 1642865
> Change-Id: I47a6aa60b2a495465aa9314eebcb4085f0b1c4fd
> Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit ff18121945bff394f3234e9f1a9d61ac97d4d493)
fixes: bz#1644163
Change-Id: I47a6aa60b2a495465aa9314eebcb4085f0b1c4fd
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Patch https://review.gluster.org/#/c/glusterfs/+/19135/ has
optimised glusterd test cases by clubbing the similar test
cases into a single test case.
https://review.gluster.org/#/c/glusterfs/+/19135/15/tests/bugs/glusterd/bug-1293414-import-brickinfo-uuid.t
test case has been deleted and added as a part of
tests/bugs/glusterd/optimized-basic-testcases-in-cluster.t
In the original test case, we create a volume with two bricks,
each on a separate node(N1 & N2). From another node in cluster(N3),
we try to detach a node which is hosting bricks. It fails.
In the new test, we created volume with single brick on N1.
and from another node in cluster, we tried to detach N1. we
expect peer detach to fail, but peer detach was success as
the node is hosting all the bricks of volume.
Now, changing the new test case to cover the original test case scenario.
Please refer https://bugzilla.redhat.com/show_bug.cgi?id=1642597#c1 to
understand why the new test case is not failing in centos-regression.
> BUG: bz#1642597
> Change-Id: Ifda12b5677143095f263fbb97a6808573f513234
> Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
(cherry picked from commit 0ca6773eaf5aeb507ebc72d2c2f61902eeff414c)
fixes: bz#1643075
Change-Id: Ifda12b5677143095f263fbb97a6808573f513234
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Glusterfs leases expects lease_id to be set and sent
for each fop to determine conflict resolution with the
existing lease.
Incase if not set (most likely if there is an older
client in a mixed cluster), it makes sense to consider
it as conflicitng fop and recall the lease.
Also fixed the return status check for __remove_lease(),
wherein non-negative value is considered as success case.
This is backport of below mainline patch -
https://review.gluster.org/21458
Change-Id: I5bcfba4f7c71a5af7cdedeb03436d0b818e85783
updates: #350
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes below issues in gfapi lease code-path
* 'glfs_setfsleasid' should allow NULL input to be
able to reset leaseid
* Applications should be allowed to (un)register for
upcall notifications of type GLFS_EVENT_LEASE_RECALL
* APIs added to read contents of GLFS_EVENT_LEASE_RECALL
argument which is of type "struct glfs_upcall_lease"
This is backport of below mainline path -
https://review.gluster.org/#/c/glusterfs/+/21391
Change-Id: I3320ddf235cc82fad561e13b9457ebd64db6c76b
updates: #350
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
https://review.gluster.org/#/c/glusterfs/+/21427/ seems to be failing
this .t spuriously. On checking one of the failure logs, I see:
22:05:44 Launching heal operation to perform index self heal on volume patchy has been unsuccessful:
22:05:44 Self-heal daemon is not running. Check self-heal daemon log file.
22:05:44 not ok 20 , LINENUM:38
In glusterd log:
[2018-10-18 22:05:44.298832] E [MSGID: 106301] [glusterd-syncop.c:1352:gd_stage_op_phase] 0-management: Staging of operation 'Volume Heal' failed on localhost : Self-heal daemon is not running. Check self-heal daemon log file
But the tests which preceed this check whether via a statedump if the shd is
conected to the bricks, and they have succeeded and even started
healing. From glustershd.log:
[2018-10-18 22:05:40.975268] I [MSGID: 108026] [afr-self-heal-common.c:1732:afr_log_selfheal] 0-patchy-replicate-0: Completed data selfheal on 3b83d2dd-4cf2-4ea3-a33e-4275be40f440. sources=[0] 1 sinks=2
So the only reason I can see launching heal via cli failing is a race where
shd has been spawned but glusterd has not yet updated in-memory that it is up,
and hence failing the CLI.
Fix:
Check for shd up status before launching heal via CLI
Change-Id: Ic88abf14ad3d51c89cb438db601fae4df179e8f4
fixes: bz#1641761
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
(cherry picked from commit 3dea105556130abd4da0fd3f8f2c523ac52398d1)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of https://review.gluster.org/#/c/glusterfs/+/21380/
Problem:
In an arbiter volume, if there is a pending data heal of a file only on
arbiter brick, self-heal takes inodelks twice due to a code-bug but unlocks
it only once, leaving behind a stale lock on the brick. This causes
the next write to the file to hang.
Fix:
Fix the code-bug to take lock only once. This bug was introduced master
with commit eb472d82a083883335bc494b87ea175ac43471ff
Thanks to Pranith Kumar K <pkarampu@redhat.com> for finding the RCA.
fixes: bz#1637953
Change-Id: I15ad969e10a6a3c4bd255e2948b6be6dcddc61e1
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of https://review.gluster.org/#/c/glusterfs/+/21135/
Problem:
When a directory has dirty xattrs due to failed post-ops or when
replace/reset brick is performed, AFR does a conservative merge as
expected, but heal-info reports it as split-brain because there are no
clear sources.
Fix:
Modify pending flag to contain information about pending heals and
split-brains. For directories, if spit-brain flag is not set,just show
them as needing heal and not being in split-brain.
Change-Id: I09ef821f6887c87d315ae99e6b1de05103cd9383
fixes: bz#1633634
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: After an upgrade from the version where shared-brick-count
option is not present to a version which introduced this option
causes issue at the mount point i.e, size of the volume at mount
point will be reduced by shared-brick-count value times.
Cause: shared-brick-count is equal to the number of bricks that
are sharing the file system. gd_set_shared_brick_count() calculates
the shared-brick-count value based on uuid of the node and fsid of
the brick. https://review.gluster.org/#/c/glusterfs/+/19484 handles
setting of fsid properly during an upgrade path. This patch assumed
that when the code path is reached, brickinfo->uuid is non-null.
But brickinfo->uuid is null for all the bricks, as the uuid is null
https://review.gluster.org/#/c/glusterfs/+/19484 couldn't reached the
code path to set the fsid for bricks. So, we had fsid as 0 for all
bricks, which resulted in gd_set_shared_brick_count() to calculate
shared-brick-count in a wrong way. i.e, the logic written in
gd_set_shared_brick_count() didn't work as expected since fsid is 0.
Solution: Before control reaches the code path written by
https://review.gluster.org/#/c/glusterfs/+/19484,
adding a check for whether brickinfo->uuid is null and
if brickinfo->uuid is having null value, calling
glusterd_resolve_brick will set the brickinfo->uuid to a
proper value. When we have proper uuid, fsid for the bricks
will be set properly and shared-brick-count value will be
caluculated correctly.
Please take a look at the bug https://bugzilla.redhat.com/show_bug.cgi?id=1632889
for complete RCA
Steps followed to test the fix:
1. Created a 2 node cluster, the cluster is running with binary
which doesn't have shared-brick-count option
2. Created a 2x(2+1) volume and started it
3. Mouted the volume, checked size of volume using df
4. Upgrade to a version where shared-brick-count is introduced
(upgraded the nodes one by one i.e, stop the glusterd, upgrade the node
and start the glusterd).
5. after upgrading both the nodes, bumped up the cluster.op-version
6. At mount point, df shows the correct size for volume.
> BUG: 1632889
> Change-Id: Ib9f078aafb15e899a01086eae113270657ea916b
> Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
(cherry picked from commit f1e9b878ce2067db83a0baa5f384eda87287719d)
fixes: bz#1633479
Change-Id: Ib9f078aafb15e899a01086eae113270657ea916b
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For both Virt and block workloads the file is opened multiple times
leading to dynamically setting eager-lock to off for the workload.
Instead of depending on the number-of-open-fds, if we change the
logic to depend on number of inodelks, then it will give better
performance than the earlier logic. When there is an eager-lock
and number of inodelks is more than 1 we know that there is a
conflicting lock, so depend on that information to decide whether
to keep the current transaction go through delayed-post-op or not.
Locks xlator doesn't have implementation to query number of locks in
fxattrop in releases older than 3.10 so to keep things backward
compatible in 3.12, data transactions will use new logic where as
fxattrop transactions will use old logic. I am planning to send one
more patch which makes metadata domain locks also depend on
inodelk-count
Profile info for a dd of 500MB to a file with another fd opened
on the file using exec 250>filename
Without this patch:
0.14 67.41 us 16.72 us 3870.82 us 892 FINODELK
0.59 279.87 us 95.71 us 2085.89 us 898 FXATTROP
3.46 366.43 us 81.75 us 6952.79 us 4000 WRITE
95.79 148733.99 us 50568.12 us 919127.86 us 273 FSYNC
With this patch:
0.00 51.01 us 38.07 us 80.16 us 4 FINODELK
0.00 235.43 us 235.43 us 235.43 us 1 TRUNCATE
0.00 125.07 us 56.80 us 193.33 us 2 GETXATTR
0.00 135.86 us 62.13 us 209.59 us 2 INODELK
0.00 197.88 us 155.39 us 253.90 us 4 FXATTROP
0.00 450.59 us 394.28 us 506.89 us 2 XATTROP
0.00 56.96 us 19.06 us 406.59 us 23 FLUSH
37.81 273648.93 us 48.43 us 6017657.05 us 44 LOOKUP
62.18 4951.86 us 93.80 us 1143154.75 us 3999 WRITE
postgresql benchmark performance changed from ~1130 TPS to ~2300TPS
randio fio job inside Ovirt based VM went from ~600IOPs to ~2000IOPS
fixes bz#1635980
Change-Id: If7f7388d2f08cf7f17ca517a4ea222560661dc36
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
When eager-lock is disabled because of multiple-fds opened and app
writes come on conflicting regions, the number of locks grows very
fast leading to all the CPU being spent just in locking and unlocking
by traversing huge queues in locks xlator for granting locks.
Fix:
Reduce the number of locks in transit by bundling the writes in the
same lock and disable delayed piggy-pack when we learn that multiple
fds are open on the file. This will reduce the size of queues in the
locks xlator. This also reduces the number of network calls like
inodelk/fxattrop.
Please note that this problem can still happen if eager-lock is
disabled as the writes will not be bundled in the same lock.
fixes bz#1635979
Change-Id: I8fd1cf229aed54ce5abd4e6226351a039924dd91
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Till now, glusterd was generating the volfile path for the snapshot
volume's bricks like this.
/snaps/<snap name>/<brick volfile>
But in reality, the path to the brick volfile for a snapshot volume is
/snaps/<snap name>/<snap volume name>/<brick volfile>
The above workaround was used to distinguish between a mount command used
to mount the snapshot volume, and a brick of the snapshot volume, so that
based on what is actually happening, glusterd can return the proper volfile
(client volfile for the former and the brick volfile for the latter). But,
this was causing problems for snapshot restore when brick multiplexing is
enabled. Because, with brick multiplexing, it tries to find the volfile
and sends GETSPEC rpc call to glusterd using the 2nd style of path i.e.
/snaps/<snap name>/<snap volume name>/<brick volfile>
So, when the snapshot brick (which is multiplexed) sends a GETSPEC rpc
request to glusterd for obtaining the brick volume file, glusterd was
returning the client volume file of the snapshot volume instead of the
brick volume file.
Change-Id: I28b2dfa5d9b379fe943db92c2fdfea879a6a594e
fixes: bz#1636218
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is part of the reason why we use autoconf (i.e. configure).
For an ordinary clone+autogen.sh+configure SBIN_DIR is
/usr/local/sbin; for an rpm or dpkg build it will be /usr/sbin.
I wonder how many more are lurking in our sources? /usr/libexec is
one that frequently bites us on Debian and Ubuntu, which don't have
/usr/libexec. (But it's all Linux, right?)
See https://bugzilla.redhat.com/show_bug.cgi?id=1601532
Reported-by: lohmaier+rhbz@gmail.com
Change-Id: I6523894416cc06236ea1f99529efd36e957bd98e
updates: bz#1632013
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
| |
Fixes: bz#1630186
Change-Id: Ie5ea9b69fea22eab65d7e85215f8538b617da456
Signed-off-by: ShyamsundarR <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
When name-self-heal is triggered on the mount, it blocks
lookup until name-self-heal completes. But that can lead
to hangs when lot of clients are accessing a directory which
needs name heal and all of them trigger heals waiting
for other clients to complete heal.
Fix:
When a name-heal is needed but quorum number of names have the
file and pending xattrs exist on the parent, then better to
delegate the heal to SHD which will be completed as part of
entry-heal of the parent directory. We could also do the same
for quorum-number of names not present but we don't have
any known use-case where this is a frequent occurrence so
not changing that part at the moment. When there is a gfid
mismatch or missing gfid it is important to complete the heal
so that next rename doesn't assume everything is fine and
perform a rename etc
fixes bz#1625575
Change-Id: I8b002c85dffc6eb6f2833e742684a233daefeb2c
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
When metadata-self-heal is triggered on the mount, it blocks
lookup until metadata-self-heal completes. But that can lead
to hangs when lot of clients are accessing a directory which
needs metadata heal and all of them trigger heals waiting
for other clients to complete heal.
Fix:
Only when the heal is needed but the pending xattrs are not set,
trigger metadata heal that could block lookup. This is the only
case where different clients may give different metadata to the
clients without heals, which should be avoided.
Updates bz#1625575
Change-Id: I6089e9fda0770a83fb287941b229c882711f4e66
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
If requested start time and end time doesn't fall into
first HTIME file, then history API fails even though
continuous changelogs are avaiable for the requested range
in other HTIME files. This is induced by changelog disable
and enable which creates fresh HTIME index file.
Cause and Analysis:
Each HTIME index file represents the availability of
continuous changelogs. If changelog is disabled and enabled,
a new HTIME index file is created represents non availability
of continuous changelogs. So as long as the requested start
and end falls into single HTIME index file and not across,
history API should succeed.
But History API checks for the changelogs only in first
HTIME index file and errors out if not available.
Fix:
Check in all HTIME index files for availability of continuous
changelogs for requested change.
Backport of:
> Patch: https://review.gluster.org/21016/
> BUG: bz#1622549
> Change-Id: I80eeceb5afbd1b89f86a9dc4c320e161907d3559
> Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit 35aa67001c8fac99b040fbc61f36ef4f1b1590ac)
fixes: bz#1630141
Change-Id: I80eeceb5afbd1b89f86a9dc4c320e161907d3559
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. '--ignore-mising-args' option for rsync is not
being used even though the rsync version is
greater than 3.1.0. Fixed the same.
2. '--existing' option for rsync is also not being
used. Fixed the same.
3. geo-rep config fails to set rsync-options as the
value contains '--'. Interestingly, python argsparse
treats the value with '--' (e.g., --ignore-missing-args)
as option. But when passed with something like
--value=--ignore-missing-args, it succeeds. Fixed the
same.
Backport of:
> Patch: https://review.gluster.org/21191
> Change-Id: Iaeb838acaff1c2920fee9c7f920c99edce13a0a1
> Signed-off-by: Kotresh HR <khiremat@redhat.com>
> BUG: 1629561
Change-Id: Iaeb838acaff1c2920fee9c7f920c99edce13a0a1
Signed-off-by: Kotresh HR <khiremat@redhat.com>
fixes: bz#1630140
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Analysis:
Monitor process spawns monitor threads (one per brick).
Each monitor thread, forks worker and agent processes.
Each monitor thread, while intializing, updates the
monitor status file. It is synchronized using flock.
The race is that, some thread can fork worker while
other thread opened the status file resulting in
holding the reference of fd in worker process.
Cause:
flock gets unlocked either by specifically unlocking it
or by closing all duplicate fds referring to the file.
The code was relying on fd close, hence a reference
in worker/agent process by fork could cause the deadlock.
Fix:
1. flock is unlocked specifically.
2. Also made sure to update status file in approriate places so that
the reference is not leaked to worker/agent process.
With this fix, both the deadlock and possible fd
leaks is solved.
Backport of:
> Patch: https://review.gluster.org/20704
> BUG: bz#1614799
> Change-Id: I0d1ce93072dab07d0dbcc7e779287368cd9f093d
> Signed-off-by: Kotresh HR <khiremat@redhat.com>
fixes: bz#1630145
Change-Id: I0d1ce93072dab07d0dbcc7e779287368cd9f093d
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Always use ssh and scp with "-oPasswordAuthentication=no"
and "-oStrictHostKeyChecking=no" options. It might hang
the post script otherwise leading geo-rep setup failure
Also increased geo-rep timeout. Occasionally, it's taking
more time to reach Active/Passive status. Especially, the
first start after create.
Backport of:
> Patch: https://review.gluster.org/20601
> BUG: bz#1610405
> Change-Id: I9560d64dbe0edf5db73446a9fc97dda19b88d233
> Signed-off-by: Kotresh HR <khiremat@redhat.com>
fixes: bz#1630144
Change-Id: I9560d64dbe0edf5db73446a9fc97dda19b88d233
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
A return value of ENODATA was forcibly returned in the case where
SSL_get_error(r) returned SSL_ERROR_SYSCALL. Sometimes SSL_ERROR_SYSCALL
is a transient error which is identified by setting errno to EAGAIN.
EAGAIN is not a fatal error and indicates that the syscall needs to be
retried.
Solution:
Bubble up the errno in case SSL_get_error(r) returns SSL_ERROR_SYSCALL
and let the upper layers handle it appropriately.
fixes: bz#1601356
Change-Id: I76eff278378930ee79abbf9fa267a7e77356eed6
Signed-off-by: Milind Changire <mchangir@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
posix_set_parent_ctime() unconditionally logs an error if consistent
time attributes is not enabled. This log does not add any value, prints
an incorrect errno & floods the log file.
Hence nuking this log message in this patch.
Backport of :
> Patch: https://review.gluster.org/20547/
> Change-Id: I82a78f2f8ce5ab518f8cdf6d9086a97049712f75
> BUG: 1607049
> Signed-off-by: Vijay Bellur <vbellur@redhat.com>
(cherry picked from commit e0df887ba044ce92e9a2822be9261d0f712b02bd)
Change-Id: I82a78f2f8ce5ab518f8cdf6d9086a97049712f75
fixes: bz#1629548
Signed-off-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
| |
Change-Id: Idfce8b9ec79303b92045e68ab98765f7e2f98940
fixes: bz#1623161
Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the file system, the responsibility w.r.to the block and char device
files is related to only support for 'creating' them (using mknod(2)).
Once the device files are created, the read/write syscalls for the specific
devices are handled by the device driver registered for the specific major
number, and depending on the minor number, it knows where to read from.
Hence, we are at risk of reading contents from devices which are handled
by the host kernel on server nodes.
By disabling open/read/write on the device file, we would be safe with
the bypass one can achieve from client side (using gfapi)
Fixes: bz#1625096
Change-Id: I48c776b0af1cbd2a5240862826d3d8918601e47f
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
current implementation of alloca can cause issues when strings larger
than the allocated buffer is passed to the xdr. Hence it makes sense
to allow XDR decode functions to deal with memory allocations, which
we can free later.
Fixes: bz#1625097
Change-Id: I3a05553f5702de9575c244649ca0e5ac9abaac94
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It wouldn't make sense to allow iostats file to be written in
*any* directory. While the formating makes sure we try to append
io-stats-name for the file, so overwriting existing file is slim,
but in any case it makes sense to restrict dumping to one directory.
Below are the sample commands, and files created for the corresponding
values:
$ setfattr -n trusted.io-stats-dump -v file-for-dump $M0
In this case, the file would be in /var/run/gluster/file-for-dump
$ setfattr -n trusted.io-stats-dump -v /dir1/dir2/file-for-dump $M0
In this case, then the dump file is in /var/run/gluster/dir1-dir2-file-for-dump
Note that the value passed for this virtual xattr would be treated as a
file, and even if the value has '/' in it, it would be changed to '-'
for sanity.
Fixes: bz#1625106
Change-Id: Id9ae6a40a190b8937c51662e6e1c2a0f6c86a0e0
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This will prevent any arbitrary file creation through glusterfs
by modifying the client bits.
Also check for the similar flaw inside posix too, so we prevent any
changes in layers in-between.
Fixes: bz#1625095
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Change-Id: Id9fe0ef6e86459e8ed85ab947d977f058c5ae06e
|
|
|
|
|
|
| |
Fixes: bz#1625089
Change-Id: Ie56df0da46c242846a1ba51ccb9e011af118b119
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
getting and setting a file's content using extended
attribute worked great as a GET/PUT alternative when
an object storage is supported on top of Gluster. But
it needs application changes, and also, it skips some
caching layers.
It is not used over years, and not supported any more.
Removing the dead code.
Fixes: bz#1625102
Change-Id: Ide3b3f1f644f6ca58558bbe45561f346f96b95b7
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
The value of trusted.pgfid.xx was always set to 1
in posix_mknod. This is incorrect if posix_mknod
calls posix_create_link_if_gfid_exists.
Change-Id: Ibe87ca6f155846b9a7c7abbfb1eb8b6a99a5eb68
fixes: bz#1623317
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
| |
Change-Id: Ia944a2353f0a89b61f66bde7f270489dff3793b4
fixes: bz#1607945
Signed-off-by: ShyamsundarR <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently there is no check for path = "" in glfs_resolve_at.
So if application sends an empty path, then the function resolves
into the parent inode which is incorrect. Plus modified possible
of "path" with "origpath" in the same function.
Change-Id: Ie5ff9ce4b771607b7dbb3fe00704fe670421792a
fixes: bz#1618347
Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
(cherry picked from commit febee007bb1a99d65300630c2a98cbb642b1c8dc)
|