| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
The issue is being hit during hybrid mode while handling rename on slave.
In this special case the rename is recorded as mkdir and geo-rep process it
by resolving the path form backend.
While resolving the backend path during this special handling one corner case is not considered.
<snip>
Traceback (most recent call last):
File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 118, in worker
res = getattr(self.obj, rmeth)(*in_data[2:])
File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 588, in entry_ops
src_entry = get_slv_dir_path(slv_host, slv_volume, gfid)
File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 710, in get_slv_dir_path
dir_entry = os.path.join(pfx, pargfid, basename)
File "/usr/lib64/python2.7/posixpath.py", line 75, in join
if b.startswith('/'):
AttributeError: 'int' object has no attribute 'startswith'
In pyhthon3:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib64/python3.8/posixpath.py", line 90, in join
genericpath._check_arg_types('join', a, *p)
File "/usr/lib64/python3.8/genericpath.py", line 152, in _check_arg_types
raise TypeError(f'{funcname}() argument must be str, bytes, or '
TypeError: join() argument must be str, bytes, or os.PathLike object, not 'int'
</snip>
Backport of:
>Ptach link: https://review.gluster.org/#/c/glusterfs/+/24468/
>Change-Id: I8b926899c60ad8c4ffc886d57028ba70fd21e332
>Fixes: #1250
>Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
Change-Id: I8b926899c60ad8c4ffc886d57028ba70fd21e332
Fixes: #1250
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
(cherry picked from commit 27f5c8ba844e9da54fc1304df4ffe015a3bbb9bd)
Change-Id: I171eb9ad4e30f49cfe86cb258918682d3c0f5af9
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Geo-replication has a large number of repeated imports as shown below:
```
from syncdutils import set_term_handler, finalize, lf
from syncdutils import log_raise_exception, FreeObject, escape
```
There imports can be clubbed together as shown below:
``
from syncdutils import (set_term_handler, finalize, lf,
log_raise_exception, FreeObject, escape)
```
Fixes: #1105
Change-Id: I59a48dd57a70fc851d93150b85e736ce41e8b793
Signed-off-by: kshithijiyer <kshithij.ki@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
| |
With this patch now you can notice log if it is due to EIO:
[2020-03-16 16:24:48.293837] E [syncdutils(worker /bricks/brick1/mbr3):348:log_raise_exception] <top>: Getting "Input/Output error" is most likely due to a. Brick is down or b. Split brain issue.
[2020-03-16 16:24:48.293915] E [syncdutils(worker /bricks/brick1/mbr3):352:log_raise_exception] <top>: This is expected as per design to keep the consistency of the file system. Once the above issue is resolved geo-rep would automatically proceed further.
Change-Id: Ie33f2440bc96089731ce12afa8dab91d9550a7ca
Fixes: #1104
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
problem: Geo-rep gsyncd process mounts the master and slave volume
on master nodes and slave nodes respectively and starts
the sync. But it doesn't wait for the mount to be in ready
state to accept I/O. The gluster mount is considered to be
ready when all the distribute sub-volumes is up. If the all
the distribute subvolumes are not up, it can cause ENOTCONN
error, when lookup on file comes and file is on the subvol
that is down.
solution: Added a Virtual Xattr "dht.subvol.status" which returns "1"
if all subvols are up and "0" if all subvols are not up.
Geo-rep then uses this virtual xattr after a fresh mount, to
check whether all subvols are up or not and then starts the
I/O.
fixes: bz#1664335
Change-Id: If3ad01d728b1372da7c08ccbe75a45bdc1ab2a91
Signed-off-by: Harpreet Kaur <hlalwani@redhat.com>
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If non-standard ssh-port is used, Geo-rep can be configured to use ssh port
by using config option, the value should be in allowed port range and non negative.
At present it can accept negative value and outside allowed port range which is incorrect.
Many Linux kernels use the port range 32768 to 61000.
IANA suggests it should be in the range 1 to 2^16 - 1, so keeping the same.
$ gluster volume geo-replication master 127.0.0.1::slave config ssh-port -22
geo-replication config updated successfully
$ gluster volume geo-replication master 127.0.0.1::slave config ssh-port 22222222
geo-replication config updated successfully
This patch fixes the above issue and have added few validations around this
in test cases.
Change-Id: I9875ab3f00d7257370fbac6f5ed4356d2fed3f3c
Fixes: bz#1792276
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There exists a window of 15 sec, where the deletes are picked up
by history crawl when the ignore_deletes is set to true.
And it eventually deletes the file/s from slave which is/are not
supposed to be deleted. Though it is working as per design, a
note regarding this is needed.
Added a warning message indicating the same.
Also logged info when the worker restarts after ignore-deletes
option set.
fixes: bz#1708603
Change-Id: I103be882fac18b4cef935efa355f5037a396f7c1
Signed-off-by: Shwetha K Acharya <sacharya@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When gsyncd failed with tracebacks during start, it prints
only exception object but not the error string. This patch
adds the error string as well.
Earlier:
"failed with ImportError"
Now:
"failed with ImportError: No module named _io."
fixes: bz#1771895
Change-Id: I0d772a250d4c2010a0c35053aa7b165b71f8434e
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Geo-rep fails to start on python2 only machine like
centos6. It fails with "ImportError no module named _io".
This patch fixes the same.
fixes: bz#1771577
Change-Id: I8228458a853a230546f9faf29a0e9e0f23b3efec
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
- libgfchangelog is simplified by removing unnecessary API Class
- Merged Agent logic into Worker instead of running Worker and Agent as
two separate processes and maintaining RPC between Worker and Agent.
- Geo-rep command Pause and Resume will continue without any changes.
But Agent functionality also gets paused with that.
Updates: #755
Change-Id: Ie2c00fa7dddf21f180f0649e0aaf084d29023c98
Signed-off-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
While syncing rename of directory in hybrid crawl, geo-rep
crashes as below.
Traceback (most recent call last):
File "/usr/local/libexec/glusterfs/python/syncdaemon/repce.py", line 118, in worker
res = getattr(self.obj, rmeth)(*in_data[2:])
File "/usr/local/libexec/glusterfs/python/syncdaemon/resource.py", line 588, in entry_ops
src_entry = get_slv_dir_path(slv_host, slv_volume, gfid)
File "/usr/local/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 687, in get_slv_dir_path
[ENOENT], [ESTALE])
File "/usr/local/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 546, in errno_wrap
return call(*arg)
PermissionError: [Errno 13] Permission denied: '/bricks/brick1/b1/.glusterfs/8e/c0/8ec0fcd4-d50f-4a6e-b473-a7943ab66640'
Cause:
Conversion of gfid to path for a directory uses readlink on backend
.glusterfs gfid path. But this fails for non root user with
permission denied.
Fix:
Use gfid2path interface to get the path from gfid
Change-Id: I9d40c713a1b32cea95144cbc0f384ada82972222
fixes: bz#1763439
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After upgrade, if the config files are of old format, it
gets migrated to new format. Monitor process migrates it.
Since monitor doesn't run on nodes where bricks are not
hosted, it doesn't get migrated there. So this patch fixes
the config upgrade on nodes which doesn't host bricks.
This happens during config either on get/set/reset.
Change-Id: Ibade2f2310b0f3affea21a3baa1ae0eb71162cba
Signed-off-by: Kotresh HR <khiremat@redhat.com>
fixes: bz#1762220
|
|
|
|
|
|
| |
Fixes: bz#1741779
Change-Id: I708b6b7e6c520dee10445528e6f99ba38e141c25
Signed-off-by: Shwetha K Acharya <sacharya@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Geo-rep session for non-root going faulty.
Solution:
During worker start we do not construct slave url and use 'args.resource_remote'
which is basically just slave-hostname.
This works better for root session but fails in non-root session during
ssh command.
Using slave url solves this issue.
fixes: bz#1753928
Change-Id: Ib83552fde77f81c208896494b323514ab37ebf22
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
The bug[1] addresses issue of data inconsistency when handling RENAME with
existing destination. This fix requires some performance tuning considering
this issue occurs in heavy rename workload.
Solution:
If distribution count for master volume is one do not verify op's on
master and go ahead with rename.
The performance improvement with this patch can only be observed if
master volume has distribution count one.
[1]. https://bugzilla.redhat.com/show_bug.cgi?id=1694820
fixes: bz#1753857
Change-Id: I8e9bcd575e7e35f40f9f78b7961c92dee642f47b
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The arguments for gluster-mountbroker is missing help="< Some Text>"
parameter wherever add_argument() is used due to which gluster-mountbroker
help is missing a brief description of what the argument does in the help
of {add,setup,remove} as shown below:
usage: gluster-mountbroker remove [-h] [--volume VOLUME] [--user USER]
optional arguments:
-h, --help show this help message and exit
--volume VOLUME
--user USER
usage: gluster-mountbroker add [-h] volume user
positional arguments:
volume
user
optional arguments:
-h, --help show this help message and exit
usage: gluster-mountbroker setup [-h] mount_root group
positional arguments:
mount_root
group
optional arguments:
-h, --help show this help message and exit
fixes: bz#1728683
Change-Id: I7eabcd2c103f01e40160ba35500b0a4e5c9f5e7a
Signed-off-by: kshithijiyer <kshithij.ki@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
While running markdown-link-checker it was
observed that there were a large number of
404 links present in the documentation present
in the form of markdown files in the project.
This was casued due to the following reasons:
1. Repos being removed.
2. Typo in markdown links.
3. Restructring of directoires.
Solution:
Fixing all the 404 links present in the project.
fixes: bz#1746810
Change-Id: I30de745f848fca2e9c92eb7493f74738f0890ed9
Signed-off-by: kshithijiyer <kshithij.ki@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Background:
The processed changelogs are archived each month in a single tar file.
The default format is "archive_YYYYMM.tar" which is specified as "%%Y%%m"
in configuration file.
Problem:
The created changelog archive file didn't have corresponding year
and month. It created as "archive_%Y%m.tar" on python2 only systems.
Cause and Fix:
Geo-rep expects "%Y%m" after the ConfigParser reads it from config file.
Since it was "%%Y%%m" in config file, geo-rep used to get correct value
"%Y%m" in python3 and "%%Y%%m" in python2 which is incorrect.
The fix can be to use "%Y%m" in config file but that fails in python3.
So the fix is to use "RawConfigParser" in geo-rep and use "%Y%m". This
works both in python2 and python3.
Change-Id: Ie5b7d2bc04d0d53cd1769e064c2d67aaf95d557c
fixes: bz#1741890
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Structured logging format changed in the below patch for better
readability and parsing. This patch changes the Geo-replication logs
to that format.
https://review.gluster.org/#/c/glusterfs/+/22987/
Before:
```
[2019-08-21 04:23:01.13672] I [monitor(monitor):157:monitor] Monitor: \
starting gsyncd worker<TAB>brick=/bricks/b1<TAB>slave_node=f29.sonne
```
After:
```
[2019-08-21 04:23:01.13672] I [monitor(monitor):157:monitor] Monitor: \
starting gsyncd worker [{brick=/bricks/b1}, {slave_node=f29.sonne}]
```
Change-Id: I65ec69a2869e2febeedc558f88620399e4a1a6a4
Updates: #657
Signed-off-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
All the workers connects to primary slave node. It should
connect to available slave nodes in round robin fashion
and choose different slave node if the corresponding slave
node is down. This patch fixes the same.
Thanks Aravinda for the help in root causing this.
Change-Id: I9f8e7744f4adb8a24833cf173681d109710f98cb
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Updates: bz#1737484
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When two threads(sync jobs) in Geo-rep worker calls `gconf.get` and
`gconf.getr`(realtime) at the sametime, `getr` resets the conf object
and other one gets None. Thread Lock is introduced to fix the issue.
```
File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py",
line 368, in twrap
tf(*aargs)
File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1987,
in syncjob
po = self.sync_engine(pb, self.log_err)
File "/usr/libexec/glusterfs/python/syncdaemon/resource.py",
line 1444, in rsync
rconf.ssh_ctl_args + \
AttributeError: 'NoneType' object has no attribute 'split'
```
Change-Id: I9c245e5c36338265354e158f5baa32b119eb2da5
Updates: bz#1737484
Signed-off-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Even the use builtin 'type' command as in patch [1]
causes issues if argument in question is not part of PATH
environment variable for that user. This patch fixes the
same by doing source /etc/profile. This was already being
used in another part of script.
[1] https://review.gluster.org/23089
Change-Id: Iceb78835967ec6a4350983eec9af28398410c002
fixes: bz#1734738
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The patch [1] added validation in gverify.sh to check if the gluster
binary on slave by executing gluster directly on slave. But for
non-root users, even though gluster binary is present, path is not
found when executed via ssh. Hence validate the gluster binary using
bash builtin 'type' command.
[1] https://review.gluster.org/19224
Change-Id: I93ca62c0c5b1e16263e586ddbbca8407d60ca126
fixes: bz#1731920
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added test case for the patch
https://review.gluster.org/#/c/glusterfs/+/22894/4
Also updated if else structure in gsyncdconfig.py to avoid
repeated occurance of values in new configfile.
fixes: bz#1707731
Change-Id: If97e1d37ac52dbd17d47be6cb659fc5a3ccab6d7
Signed-off-by: Shwetha K Acharya <sacharya@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
geo-replication/setup.py is missing license details in
setup() and is using string 'syncdaemon' instead
of using the variable name which has the same value.
Present:
setup(
name=name,
version="",
description='GlusterFS Geo Replication',
license='',
author='Red Hat, Inc.',
author_email='gluster-devel@gluster.org',
url='http://www.gluster.org',
packages=['syncdaemon', ],
test_suite='nose.collector',
install_requires=[],
scripts=[],
entry_points={},
)
Changing to:
setup(
name=name,
version="",
description='GlusterFS Geo Replication',
license='GPLV2 and LGPLV3+',
author='Red Hat, Inc.',
author_email='gluster-devel@gluster.org',
url='http://www.gluster.org',
packages=[name, ],
test_suite='nose.collector',
install_requires=[],
scripts=[],
entry_points={},
)
fixes: bz#1727107
Change-Id: I742c4c510da55a22256c819da7e2fc9622119f63
Signed-off-by: kshithijiyer <kshithij.ki@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
gluster command not found.
Cause:
In Volinfo class we issue command 'gluster vol info' to get information
about volume like getting brick_root to perform various operation.
When geo-rep session is configured for non-root user Volinfo class
fails to issue gluster command due to unavailability of gluster
binary path for non-root user.
Solution:
Use config value 'slave-gluster-command-dir'/'gluster-command-dir' to get path
for gluster command based on caller.
fixes: bz#1722740
Change-Id: I4ec46373da01f5d00ecd160c4e8c6239da8b3859
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- configuration handling is enhanced with patch
https://review.gluster.org/#/c/glusterfs/+/18257/
- hence, the old configurations are not applied when
Geo-rep session is created in the old version and upgraded.
This patch solves the issue. It,
- checks if the config file is old.
- parses required values from old config file and stores in new
config file, which ensures that configerations are applied on
upgrade.
- stores old config file as backup.
- handles changes in options introduced in
https://review.gluster.org/#/c/glusterfs/+/18257/
fixes: bz#1707731
Change-Id: Iad8da6c1e1ae8ecf7c84dfdf8ea3ac6966d8a2a0
Signed-off-by: Shwetha K Acharya <sacharya@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
During mountbroker setup: 'gluster-mountbroker <mountbroker-root> <group>'
commad to set the permission and group for GEOREP_DIR directory
(/var/lib/glusterd/geo-replication) fails due to extra argument, which is
enssential for non-root geo-rep setup.
fixes: bz#1721441
Change-Id: Ia83442733bf0b29f630e8c9e398097316efca092
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|
|
|
|
|
|
|
| |
CID: 1400730
updates: bz#789278
Change-Id: I0f6924050a31d3d2cc0b555f859920e349728e0a
Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Unable to setup mountbroker root directory while creating geo-replication
session for non-root user.
Casue:
With patch[1] which defines the max-port for glusterd one extra sapce
got added in field of 'option max-port'.
[1]. https://review.gluster.org/#/c/glusterfs/+/21872/
In geo-rep spliting of key-value pair form vol file was done on the
basis of space so this additional space caused "ValueError: too many values
to unpack".
Solution:
Use split so that it can treat consecutive whitespace as a single separator.
Fixes: bz#1709248
Change-Id: Ia22070a43f95d66d84cb35487f23f9ee58b68c73
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The gfid conflict resolution code path is not supposed
to hit in generic code path. But few of the heavy rename
workload (BUG: 1694820) makes it a generic case. So
logging the entries to be fixed as INFO floods the log
in these particular workloads. Hence convert them to DEBUG.
fixes: bz#1709653
Change-Id: I4d5e102b87be5fe5b54f78f329e588882d72b9d9
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Geo-rep sync hangs when tarssh is used as sync
engine at heavy workload.
Analysis and Root cause:
It's found out that the tar process was hung.
When debugged further, it's found out that stderr
buffer of tar process on master was full i.e., 64k.
When the buffer was copied to a file from /proc/pid/fd/2,
the hang is resolved.
This can happen when files picked by tar process
to sync doesn't exist on master anymore. If this count
increases around 1k, the stderr buffer is filled up.
Fix:
The tar process is executed using Popen with stderr as PIPE.
The final execution is something like below.
tar | ssh <args> root@slave tar --overwrite -xf - -C <path>
It was waiting on ssh process first using communicate() and then tar.
Note that communicate() reads stdout and stderr. So when stderr of tar
process is filled up, there is no one to read until untar via ssh is
completed. This can't happen and leads to deadlock.
Hence we should be waiting on both process parallely, so that stderr is
read on both processes.
Change-Id: I609c7cc5c07e210c504771115b4d551a2e891adf
fixes: bz#1707728
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
When 'use_tarssh' is set to true, it exits with successful
message but the default 'rsync' was used as sync-engine.
The new config 'sync-method' is not allowed to set from cli.
Analysis and Fix:
The 'use_tarssh' config is deprecated with new
config framework and 'sync-method' is the new
config to choose sync-method i.e. tarssh or rsync.
This patch fixes the 'sync-method' config. The allowed
values are tarssh and rsync.
Change-Id: I0edb0319cad0455b29e49f2f08a64ce324735e84
fixes: bz#1707686
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are two ways for creating secret pem pub file during geo-rep
setup.
1. gluster-georep-sshkey generate
2. gluster system:: execute gsec_create
Below patch solves this problem for `gluster-georep-sshkey generate`
method.
Patch link: https://review.gluster.org/#/c/glusterfs/+/22246/
This patch is added to support old way of creating secret pem pub file
`gluster system:: execute gsec_create`.
Problem: While Geo-rep setup when creating an ssh authorized_keys
the geo-rep setup inserts an extra space before the "ssh-rsa" label.
This gets flagged by an enterprise customer's security scan as a
security violation.
Solution: Remove extra space while creating secret key.
fixes: bz#1679401
Change-Id: I92ba7e25aaa5123dae9ebe2f3c68d14315aa5f0e
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Geo-rep fails to sync the rename properly if destination exists.
It results in source to be remained on slave causing more number of
files on slave. Also heavy rename workload like logrotate caused
lot of ESTALE errors
Cause:
Geo-rep fails to sync rename if destination exists if creation
of source file also falls into single batch of changelogs being
processed. This is because, after fixing problematic gfids verifying
from master, while re-processing original entries, CREATE also was
re-processed causing more files on slave and rename to be failed.
Solution:
Entries need to be removed from retrial list after fixing
problematic gfids on slave so that it's not re-created again on slave.
Also treat ESTALE as EEXIST so that the error is properly handled
verifying the op on master volume.
Change-Id: I50cf289e06b997adddff0552bf2466d9201dd1f9
fixes: bz#1694820
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Entries counter was incremented twice and decremented only
once. And entries count was being used in place of metadata
entries. This patch fixes both of them.
fixes: bz#1512093
Change-Id: I5601a5fe8d25c9d65b72eb529171e7117ebbb67f
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Issue:
libgfchangelog.so: cannot open shared object file
Due to hardcoded shared library name runtime loader looks for particular version of
a shared library.
Solution:
Using find_library to locate shared library at runtime solves this issue.
Traceback (most recent call last):
File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 323, in main
func(args)
File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 82, in subcmd_worker
local.service_loop(remote)
File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1261, in service_loop
changelog_agent.init()
File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 233, in __call__
return self.ins(self.meth, *a)
File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 215, in __call__
raise res
OSError: libgfchangelog.so: cannot open shared object file: No such file or directory
Change-Id: I3dd013d701ed1cd99ba7ef20d1898f343e1db8f5
fixes: bz#1699394
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Geo-rep fails to sync rename of symlink if it's
renamed multiple times if creation and rename
happened successively
Worker crash at slave:
Traceback (most recent call last):
File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", in worker
res = getattr(self.obj, rmeth)(*in_data[2:])
File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", in entry_ops
[ESTALE, EINVAL, EBUSY])
File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", in errno_wrap
return call(*arg)
File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", in lsetxattr
cls.raise_oserr()
File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", in raise_oserr
raise OSError(errn, os.strerror(errn))
OSError: [Errno 12] Cannot allocate memory
Geo-rep Behaviour:
1. SYMLINK doesn't record target path in changelog.
So while syncing SYMLINK, readlink is done on
master to get target path.
2. Geo-rep will create destination if source is not
present while syncing RENAME. Hence while syncing
RENAME of SYMLINK, target path is collected from
destination.
Cause:
If symlink is created and renamed multiple times, creation of
symlink is ignored, as it's no longer present on master at
that path. While symlink is renamed multiple times at master,
when syncing first RENAME of SYMLINK, both source and destination
is not present, hence target path is not known. In this case,
while creating destination directly at slave, regular file
attributes were encoded into blob instead of symlink,
causing failure in gfid-access translator while decoding
blob.
Solution:
While syncing of RENAME of SYMLINK, when target is not known
and when src and destination is not present on the master,
don't create destination. Ignore the rename. It's ok to ignore.
If it's unliked, it's fine. If it's renamed to something else,
it will be synced then.
Change-Id: Ibdfa495513b7c05b5370ab0b89c69a6802338d87
fixes: bz#1693648
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
| |
ssh-port validation is mentioned as `validation=int` in template
`gsyncd.conf`, but not handled this during geo-rep config set.
Fixes: bz#1692666
Change-Id: I3f19d9b471b0a3327e4d094dfbefcc58ed2c34f6
Signed-off-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
`address_family=inet6` needs to be added while mounting master and
slave volumes in gverify script.
New option introduced to gluster cli(`--inet6`) which will be used
internally by geo-rep while calling `gluster volume info
--remote-host=<ipv6>`.
Fixes: bz#1688833
Change-Id: I1e0d42cae07158df043e64a2f991882d8c897837
Signed-off-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
- Fixed Relative import and non-package import related issues.
- socketserver import issues fix
- Renamed installed directory name to `gfevents` from `events`(To
avoid any issues with other global libs)
Fixes: bz#1679406
Change-Id: I3dc38bc92b23387a6dfbcc0ab8283178235bf756
Signed-off-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem : While Geo-rep setup when creating an ssh authorized_keys
the geo-rep setup inserts an extra space before the "ssh-rsa" label.
This gets flagged by an enterprise customer's security scan as a
security violation.
Solution: Remove extra space in GSYNCD_CMD & TAR_CMD.
Change-Id: I956f938faef0e0883703bbc337b1dc2770e4a921
fixes: bz#1679401
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
'configparser' is backported to python2 and can
be installed using pip (pip install configparser).
So trying to import 'configparser' first and later
'ConfigParser' can cause issues w.r.t unicode strings.
Always try importing 'ConfigParser' first and then
'configparser'. This solves python2/python3 compat
issues.
Change-Id: I2a87c3fc46476296b8cb547338f35723518751cc
fixes: bz#1671637
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: When geo-rep is configured as hybrid crawl
directory renames are not synced to the slave.
Solution: Rename sync of directory was failing due to incorrect
destination path calculation.
During check for existence on slave we miscalculated
realpath. <host:brickpath/dir>.
Change-Id: I23f1ea60e86a917598fe869d5d24f8da654d8a0a
fixes: bz#1665826
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added a command to set "features.read-only" option
to a default value "on" for slave volume.
Changes are made in:
$SRC//extras/hook-scripts/S56glusterd-geo-rep-create-post.sh
for root geo-rep and
$SRC/geo-replication/src/set_geo_rep_pem_keys.sh
for non-root geo-rep.
Fixes: bz#1654187
Change-Id: I15beeae3506f3f6b1dcba0a5c50b6344fd468c7c
Signed-off-by: Harpreet Kaur <hlalwani@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
libglusterfs devel package headers are referenced in code using
include semantics for a program, this while it works can be better
especially when dealing with out of tree xlator builds or in
general out of tree devel package usage.
Towards this, the following changes are done,
- moved all devel headers under a glusterfs directory
- Included these headers using system header notation <> in all
code outside of libglusterfs
- Included these headers using own program notation "" within
libglusterfs
This change although big, is just moving around the headers and
making it correct when including these headers from other sources.
This helps us correctly include libglusterfs includes without
namespace conflicts.
Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b
Updates: bz#1193929
Signed-off-by: ShyamsundarR <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Creation of files/directories with non-ascii names fails
to sync to the slave. It crashes with below traceback on
slave.
...
File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/repce.py", line 118, in worker
res = getattr(self.obj, rmeth)(*in_data[2:])
File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/resource.py", line 709, in entry_ops
[ESTALE, EINVAL, EBUSY])
File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/syncdutils.py", line 546, in errno_wrap
return call(*arg)
File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/libcxattr.py", line 83, in lsetxattr
cls.raise_oserr()
File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/libcxattr.py", line 38, in raise_oserr
raise OSError(errn, os.strerror(errn))
OSError: [Errno 12] Cannot allocate memory
Cause:
The length calculation arguments passed to blob creation was done before encoding. Hence
was failing in gfid-access layer.
Fix:
It appears that the calculating lenght properly fixes this issue. But it will cause
issues in other places in 'python2' and not in 'python3'. So encoding and decoding
each required string to make geo-rep compatible with both 'python2' and 'python3'
is a nightmare and is not fool proof. Hence kept 'python2' code as is with out
encode/decode and applied encode/decode only to 'python3'
Added non-ascii filename tests to regression
fixes: bz#1650893
Change-Id: I35cfaf848e07b1a0b5cb93c01b98b472f08271a6
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In non-root fail-over/fail-back(FO/FB), when slave is
promoted as master, the session goes to 'Faulty'
Cause:
The command 'gluster-mountbroker <mountbroker-root> <group>'
is run as a pre-requisite on slave in non-root setup.
It modifies the permission and group of following required
directories and files recursively
[1] /var/lib/glusterd/geo-replication
[2] /var/log/glusterfs/geo-replication-slaves
In a normal setup, this is executed on slave node and hence
doing it recursively is not an issue on [1]. But when original
master becomes slave in non-root during FO/FB, it contains
ssh public keys and modifying permissions on them causes
geo-rep to fail with incorrect permissions.
Fix:
Don't do permission change recursively. Fix permissions for
required files.
fixes: bz#1651498
Change-Id: I68a744644842e3b00abc26c95c06f123aa78361d
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While syncing metadata, 'os.chmod', 'os.chown',
'os.utime' should be used without de-reference.
But python supports only 'os.chown' without
de-reference. That's mostly because Linux
doesn't support 'chmod' on symlink file itself
but it does support 'chown'.
So while syncing metadata ops, if it's symlink
we should only sync 'chown' and not do 'chmod'
and 'utime'. It will lead to tracebacks with
errors like EROFS, EPERM, ACCESS, ENOENT.
All the three errors (EPERM, ACCESS, ENOENT)
were handled except EROFS. But the way it was
handled was not fool proof. The operation is
tried and failure was handled based on the errors.
All the errors with symlink file for 'chown',
'utime' had to be passed to safe errors list of
'errno_wrap'. This patch handles it better by
avoiding 'chmod' and 'utime' if it's symlink
file.
fixes: bz#1646104
Change-Id: Ic354206455cdc7ab2a87d741d81f4efe1f19d77d
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Geo-rep's automatic error handling does gfid conflict
resolution. But if there are ENOENT errors because the
parent is not synced to slave, it doesn' handle them.
This patch adds the intelligence to create missing
parent directories on slave. It can create the missing
directories upto the depth of 10.
fixes: bz#1643402
Change-Id: Ic97ed1fa5899c087e404d559e04f7963ed7bb54c
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
When 'gluster-mountbroker status' was issued, it
crashes in a corner case with 'str object has not
attribute get'. Fixed the same.
fixes: bz#1643929
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Change-Id: Iaf1a937ed0136b3b2058230c75fa89a215d8a5eb
|