| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
- Fixed Relative import and non-package import related issues.
- socketserver import issues fix
- Renamed the installed directory from `events` to `gfevents` (to
avoid clashes with other global libs)
Fixes: bz#1649054
Change-Id: I3dc38bc92b23387a6dfbcc0ab8283178235bf756
Signed-off-by: Aravinda VK <avishwan@redhat.com>
(cherry picked from commit cd68f7b88b9a2c9a4e4ff9fca61517384e54130a)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. scheduler - Popen
2. syncdutils - corner case on failure
Backport of:
> Patch: https://review.gluster.org/21505
> BUG: 1643932
> Change-Id: I65af97a244a8790e976acedc2728db6ebbf2ae10
> Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit 33e96100e17e9a293db6d63d9d5449d6c2d69376)
fixes: bz#1644514
Change-Id: I65af97a244a8790e976acedc2728db6ebbf2ae10
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. ctypes/syscalls
A) arguments are expected to be encoded
B) Raw conversion of return value from bytearray into string
2. struct pack/unpack - Raw conversion of string to bytearray
3. basestring -> str
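The conversions listed above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the helper names (`to_bytes`, `to_str`, `pack_name`, `unpack_name`) and the length-prefixed layout are assumptions chosen for the example. The same `to_bytes` step would apply to string arguments before they are handed to ctypes.

```python
import struct


def to_bytes(value, encoding="utf-8"):
    """Encode text to bytes; pass bytes through unchanged (hypothetical helper)."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


def to_str(value, encoding="utf-8"):
    """Decode bytes to text; pass text through unchanged (hypothetical helper)."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def pack_name(name):
    """Pack a name as a 4-byte big-endian length followed by the raw bytes."""
    raw = to_bytes(name)
    return struct.pack("!I%ds" % len(raw), len(raw), raw)


def unpack_name(blob):
    """Reverse pack_name() and return the name as text."""
    (length,) = struct.unpack_from("!I", blob)
    (raw,) = struct.unpack_from("%ds" % length, blob, 4)
    return to_str(raw)
```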
Updates: #411
Change-Id: I80f939adcdec0ed0022c87c0b76d057ad5559e5a
Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit fb6e8d0d0ca21b16d331fa69da9b9dadf6c5c35d)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On python3, the file objects that Popen returns are opened
in binary mode by default, whereas on python2 they are
opened as text.
The geo-rep code parses the output of Popen assuming
it is text, hence use the 'universal_newlines' flag,
which provides backward compatibility for the same.
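A small sketch of the flag in use, assuming a hypothetical wrapper named `run_text` (it is not a function from the geo-rep code):

```python
import subprocess
import sys


def run_text(argv):
    """Run argv and return (returncode, stdout, stderr) with text output."""
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # pipes are decoded to str on python3
    )
    out, err = proc.communicate()
    return proc.returncode, out, err


if __name__ == "__main__":
    rc, out, _ = run_text([sys.executable, "-c", "print('hello')"])
    print(rc, out.splitlines())
```

Without the flag, `out` would be a bytes object on python3 and any string parsing of it would need an explicit decode.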
Change-Id: I371a03b6348af9666164cb2e8b93d47475431ad9
Updates: #411
Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit 65aed1070cc2e44959cf3a0fbfde635de7e03103)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In python3, 'os.pipe' returns a pair of file descriptors
which are non-inheritable by child processes.
But geo-rep uses the inheritable nature of
pipe fds to communicate between parent and
child processes. Hence wrote a compatible
pipe routine which works well with both python2
and python3 and keeps the fds inheritable.
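One way such a routine could look is sketched below. The function name `inheritable_pipe` is an assumption for illustration; the patch itself may be structured differently.

```python
import os


def inheritable_pipe():
    """Return (read_fd, write_fd) that child processes can inherit."""
    r, w = os.pipe()
    # python3 creates non-inheritable fds; python2 has no set_inheritable
    # and its pipe fds are already inheritable.
    if hasattr(os, "set_inheritable"):
        os.set_inheritable(r, True)
        os.set_inheritable(w, True)
    return r, w
```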
Updates: #411
Change-Id: I869d7a52eeecdecf3851d44ed400e69b32a612d9
Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit 173e89a6506bc8c727ce6d8e5ac84b59ad2e21de)
|
|
|
|
|
|
|
|
|
| |
1. Fix fdopen used for pid file
2. Fix sha256 checksum calculation
Updates: #411
Change-Id: Ic173d104a73822c29aca260ba6de872cd8d23f86
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Please review, it's not always just the comments that were fixed.
I've had to revert of course all calls to creat() that were changed
to create() ...
Only compile-tested!
Change-Id: I7d02e82d9766e272a7fd9cc68e51901d69e5aab5
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. MKDIR/RMDIR is recorded on all bricks. So if
one brick succeeds creating it, other bricks
should ignore it. But this was not happening.
The fix for rename of directories in hybrid crawl
was trying to rename the directory to itself
and, in the process, crashing with ENOENT if the
directory is removed.
2. If file is created, deleted and a directory is
created with same name, it was failing to sync.
Again the issue is around the fix for rename
of directories in hybrid crawl. Fixed the same.
If the same case was done with hardlink present
for the file, it was failing. This patch fixes
that too.
fixes: bz#1598884
Change-Id: I6f3bca44e194e415a3d4de3b9d03cc8976439284
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Geo-rep mounts are private to worker. It uses
mount namespace using unshare command to achieve
the same. Well, the unshare command has to support
'--propagation' option. So geo-rep breaks on the
systems with older unshare version. The patch
makes it fall back to lazy umount behaviour if
the unshare does not support propagation option.
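The fallback decision can be illustrated with the sketch below. It assumes support is detected by looking for the option in the `unshare --help` output; that detection method and both function names are assumptions, not taken from the patch.

```python
import subprocess


def unshare_help_text():
    """Return the output of `unshare --help`, or "" if unshare is missing."""
    try:
        proc = subprocess.Popen(
            ["unshare", "--help"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        out, _ = proc.communicate()
        return out
    except OSError:
        return ""


def choose_mount_strategy(help_text):
    """Pick the private mount namespace when possible, else lazy umount."""
    if "--propagation" in help_text:
        return "private-namespace"
    return "lazy-umount"
```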
fixes: bz#1589782
Change-Id: Ia614f068aede288d63ac62fea4461b1865066054
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
see https://review.gluster.org/#/c/19788/,
https://review.gluster.org/#/c/19871/, and
https://review.gluster.org/#/c/19952/
This patch adds version agnostic imports for urllib, cpickle,
socketserver, _thread, queue, etc., suggested by Aravinda in
https://review.gluster.org/#/c/19767/1
Note: Fedora packaging guidelines require explicit shebangs, so
popular practices like #!/usr/bin/env python and #!/usr/bin/python
are not allowed; they must be #!/usr/bin/python2 or #!/usr/bin/python3
Note: Selected small fixes from 2to3 utility. Specifically apply,
basestring, funcattrs, idioms, numliterals, set_literal, types, urllib,
and zip have already been applied.
Note: these 2to3 fixes report no changes are necessary: exec, execfile,
exitfunc, filter, getcwdu, intern, itertools, metaclass, methodattrs, ne,
next, nonzero, operator, paren, raw_input, reduce, reload, renames, repr,
standarderror, sys_exc, throw, tuple_params, xreadlines.
Change-Id: I8d393064a1837874d8b4bc87c8ce05c679664642
updates: #411
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Lazy umounting the master volume by worker causes
issues with rsync's usage of getcwd. Hence removing
the lazy umount and using private mount namespace
for the same. On the slave, the lazy umount is
retained as we can't use private namespace in non
root geo-rep setup.
Change-Id: I403375c02cb3cc7d257a5f72bbdb5118b4c8779a
BUG: 1546129
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Once Geo-replication is started, it runs Gluster commands to get Volume
info from Master and Slave. With this patch, Georep can get Volume info
from Conf file if `--use-gconf-volinfo` argument is specified to monitor.
Create a config (or add to the config if it exists) with the following fields:
[vars]
master-bricks=NODEID:HOSTNAME:PATH,..
slave-bricks=NODEID:HOSTNAME,..
master-volume-id=
slave-volume-id=
master-replica-count=
master-disperse_count=
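As an illustration of how the `master-bricks` field could be consumed, here is a sketch for python3. The sample values, the function name and the dictionary keys are hypothetical; only the section name and the `NODEID:HOSTNAME:PATH` layout come from the fields above.

```python
from configparser import ConfigParser

# Hypothetical sample values, for illustration only.
SAMPLE = """\
[vars]
master-bricks=uuid-1:node1.example.com:/bricks/b1,uuid-2:node2.example.com:/bricks/b2
slave-bricks=uuid-3:snode1.example.com
master-replica-count=2
"""


def parse_master_bricks(conf_text):
    """Split master-bricks into a list of {uuid, host, dir} dictionaries."""
    parser = ConfigParser()
    parser.read_string(conf_text)
    bricks = []
    for item in parser.get("vars", "master-bricks").split(","):
        # Split on the first two colons only, so the path stays intact.
        node_id, host, path = item.strip().split(":", 2)
        bricks.append({"uuid": node_id, "host": host, "dir": path})
    return bricks
```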
Note: Existing Geo-replication is not affected since this is activated
only when `--use-gconf-volinfo` is passed while spawning `gsyncd
monitor`
Tiering support is not yet added since Tiering + Glusterd2 is still
under discussion.
Fixes: #396
Change-Id: I281baccbad03686c00f6488a8511dd6db0edc57a
Signed-off-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
| |
BUG: 1529480
Change-Id: If4775ed9886990c0e1bcf4e44c7dfef95cc4f0c3
Signed-off-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MD5 is not fips compliant. Hence replacing
with SHA256.
NOTE:
The hash is used to form the ctl_path for the ssh connection.
The length of ctl_path for ssh connection should not be > 108.
ssh fails with ctl_path too long if it is so. But when rsync
is piped to ssh, it is not taking > 90. rsync is failing with
error number 12. Hence using first 32 bytes of hash. Hash
collision doesn't matter as only one sock file is created
per directory.
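The truncation described in the note can be sketched as below, assuming the first 32 characters of the hex digest are what is kept. The function name and the sample key are hypothetical.

```python
import hashlib


def ctl_path_component(key, length=32):
    """Return a shortened SHA256 hex digest for use in the ssh control path."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    # A full SHA256 hex digest is 64 characters; keep only the first part
    # so the resulting socket path stays under the limits noted above.
    return digest[:length]
```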
Change-Id: I58aeb32a80b5422f6ac0188cf33fbecccbf08ae7
Updates: #230
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Fixed Python pep8 issues
- Removed dead code
- Rewritten configuration management
- Rewritten Arguments/subcommands handling
- Added Args upgrade to accommodate all these changes without changing
glusterd code
- use of md5 removed, which was used to hash the brick path for workdir
Both Master and Slave nodes will have a subdir for the session in the
format "<mastervol>_<primary_slave_host>_<slavevol>":
$GLUSTER_LOGDIR/geo-replication/<mastervol>_<primary_slave_host>_<slavevol>
$GLUSTER_LOGDIR/geo-replication-slaves/<mastervol>_<primary_slave_host>_<slavevol>
Log file paths renamed since session info is available with directory
name itself.
$LOG_DIR_MASTER/
- gsyncd.log - Gsyncd, Worker monitor logs
- mnt-<brick-path>.log - Aux mount logs, mounted by each worker
- changes-<brick-path>.log - Changelog related logs(One per brick)
$LOG_DIR_SLAVE/
- gsyncd.log - Slave Gsyncd logs
- mnt-<master-node>-<master-brick-path>.log - Aux mount logs,
mounted for each connection from master-node:master-brick
- mnt-mbr-<master-node>-<master-brick-path>.log - Same as above,
but mountbroker setup
Fixes: #73
Change-Id: I2ec2a21e4e2a92fd92899d026e8543725276f021
Signed-off-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In hybrid crawl, renames and unlink can't be
synced but directory renames can be detected.
While syncing the directory on slave, if the
gfid already exists, it should be renamed.
Hence if directory gfid already exists, rename
it.
Change-Id: Ibf9f99e76a3e02795a3c2befd8cac48a5c365bb6
BUG: 1499566
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
If there is a hardlink to a symlink on master
and if the symlink file is deleted on master,
geo-rep fails to sync the hardlink.
Typical Usecase:
It's easily hit with rsnapshot use case where
it uses hardlinks.
Example Reproducer:
Setup geo-replication between master and slave
volume and in master mount point, do the following.
1. mkdir /tmp/symlinkbug
2. ln -f -s /does/not/exist /tmp/symlinkbug/a_symlink
3. rsync -a /tmp/symlinkbug ./
4. cp -al symlinkbug symlinkbug.0
5. ln -f -s /does/not/exist2 /tmp/symlinkbug/a_symlink
6. rsync -a /tmp/symlinkbug ./
7. cp -al symlinkbug symlinkbug.1
Cause:
If the source was not present while syncing hardlink,
it was always packing the blob as regular file.
Fix:
If the source was not present while syncing hardlink,
pack the blob based on the mode.
Change-Id: Iaa12d6f99de47b18e0650e7c4eb455f23f8390f2
BUG: 1432046
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reported-by: Christian Lohmaier <lohmaier+rhbz@gmail.com>
Reviewed-on: https://review.gluster.org/18011
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
libgfchangelog was encoding path using spec rfc3986, but encoding is only
required for SPACE and NEWLINE chars since the NEWLINE char is used as
record separator and SPACE as field separator in the parsed changelogs
output.
Changed the encoding function to encode only SPACE and NEWLINE.
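A Python sketch of the narrowed encoding follows (libgfchangelog itself is C). It assumes the two characters are still written as percent escapes, which the message does not state explicitly, and the function name is hypothetical.

```python
def encode_path(path):
    """Escape only the field separator (SPACE) and record separator (NEWLINE)."""
    return path.replace(" ", "%20").replace("\n", "%0A")
```

All other characters, including `/` and non-ASCII text, pass through unchanged.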
BUG: 1451724
Change-Id: I1936efad31788a9e636f912c832ed7d7efea4fe2
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: https://review.gluster.org/17787
Reviewed-by: Prashanth Pai <ppai@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Changed all log messages to structured log format
Change-Id: Idae25f8b4ad0bbae38f4362cbda7bbf51ce7607b
Updates: #240
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: https://review.gluster.org/17551
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Flag: --ignore-missing-args
This Rsync flag reduces sync failures if the source file is
unlinked but present in --files-from list. This reduces
Rsync retries in Geo-rep and improves the performance
Flag: --existing
Rsync in Geo-rep never creates target files. Geo-rep creates the entry
in Slave using RPC, and rsync --inplace is used to avoid creating a
temporary file and renaming it (to avoid a different GFID in Slave). If
the entry is missing in Slave, then Geo-rep Rsync gets Permission denied
errors when it tries to create a file named after the GFID inside the
.gfid dir. (Geo-rep rsync syncs data using GFIDs with aux-gfid-mount.)
To disable these flags,
gluster volume geo-replication <session> config \
rsync-opt-ignore-missing-args false
gluster volume geo-replication <session> config \
rsync-opt-existing false
Thanks Kotresh for finding these awesome tunables.
BUG: 1400924
Change-Id: I6a84fb86a589bf6edc8dfd1086456a84b05a64fc
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: https://review.gluster.org/16010
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On corner cases, mount cleanup might cause
worker crash. Fixing the same.
Change-Id: I38c0af51d10673765cdb37bc5b17bb37efd043b8
BUG: 1433506
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: https://review.gluster.org/17015
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
EBUSY was added to retry list of errno_wrap
without importing. Fixing the same.
Change-Id: Ide81a9ccc9b948a96265b6890da078b722b45d51
BUG: 1434018
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: https://review.gluster.org/17011
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Do not crash on EBUSY error. Add EBUSY to the
retry errno list. Crash only if the error
persists even after max retries.
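The retry behaviour can be illustrated with the sketch below. `retry_on_errno` and its parameters are hypothetical and do not reproduce the signature of geo-rep's errno_wrap.

```python
import errno
import time


def retry_on_errno(call, args=(), retry=(errno.EBUSY,), max_retries=5, delay=0.0):
    """Call call(*args), retrying while it fails with an errno listed in retry.

    The error is re-raised once max_retries attempts have failed, or
    immediately if the errno is not in the retry list.
    """
    attempts = 0
    while True:
        try:
            return call(*args)
        except OSError as ex:
            if ex.errno not in retry:
                raise
            attempts += 1
            if attempts >= max_retries:
                raise
            time.sleep(delay)
```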
Change-Id: Ia067ccc6547731f28f2a315d400705e616cbf662
BUG: 1434018
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: https://review.gluster.org/16924
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to improve debuggability, it is important
to have access to geo-rep master and slave mounts.
With the default behaviour, geo-rep lazy unmounts
the mounts after changing the current working
directory into the mount point. It also cleans
up the mount points. So only geo-rep worker has
the access and it becomes impossible to take the
client profile info and do any other client stack
analysis. Hence the following new config is being
introduced to allow access to mounts.
gluster vol geo-rep <mastervol> <slavehost>::<slavevol> \
config access_mount true
The default value of 'access_mount' is false.
Change-Id: I53dce4ea86a6ffc979c82f9330e8954327180ca3
BUG: 1433506
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: https://review.gluster.org/16912
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To spawn workers for each local brick, Geo-rep was collecting all
the machine IPs based on hostname and finding local bricks based on connectivity.
With this patch, Geo-rep finds local brick if host UUID matches with
UUID of the brick from Volume info.
BUG: 1401801
Change-Id: Ic83c65df89e43cb86346e3ede227aa84d17ffd79
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/16035
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added following events
EVENT_GEOREP_ACTIVE
{
"nodeid": NODEID,
"ts": TIMESTAMP,
"event": "GEOREP_ACTIVE",
"message": {
"master_volume": MASTER_VOLUME_NAME,
"slave_host": SLAVE_HOST,
"slave_volume": SLAVE_VOLUME,
"brick_path": BRICK_PATH
}
}
EVENT_GEOREP_PASSIVE
{
"nodeid": NODEID,
"ts": TIMESTAMP,
"event": "GEOREP_PASSIVE",
"message": {
"master_volume": MASTER_VOLUME_NAME,
"slave_host": SLAVE_HOST,
"slave_volume": SLAVE_VOLUME,
"brick_path": BRICK_PATH
}
}
EVENT_GEOREP_CHECKPOINT_COMPLETED
{
"nodeid": NODEID,
"ts": TIMESTAMP,
"event": "GEOREP_CHECKPOINT_COMPLETED",
"message": {
"master_volume": MASTER_VOLUME_NAME,
"slave_host": SLAVE_HOST,
"slave_volume": SLAVE_VOLUME,
"brick_path": BRICK_PATH,
"checkpoint_time": CHECKPOINT_TIME,
"checkpoint_completion_time": CHECKPOINT_COMPLETION_TIME
}
}
BUG: 1379330
Change-Id: I90716175868c59dd65c8d202e73e0ede90347b6a
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/15630
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Tested-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If glusterfs-events rpm is not installed, Geo-replication will
fail since it imports eventtypes.
Any call to gsyncd will fail with Import error. Glusterd start
fails since it runs `gsyncd.py --version`
Traceback (most recent call last):
File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py",
line 29, in <module>
from syncdutils import FreeObject, norm, grabpidfile, finalize
File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py",
line 28, in <module>
from events import eventtypes
ImportError: No module named events
BUG: 1378057
Change-Id: I1a9bc086c3d52449ec7296cb2f9ceb16cd41a8a4
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/15539
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
libgfchangelog was not respecting the log_level configured
in Geo-replication. With this patch Libgfchangelog log level
can be configured using `config changelog_log_level TRACE`.
Default Changelog log level is INFO
BUG: 1363965
Change-Id: Ida714931129f6a1331b9d0815da77efcb2b898e3
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/15078
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Event Type defined in #15351 to avoid merge conflicts
Add geo-rep events applicable to changes in
geo-rep session in the server side.
Change-Id: Ia66574d2abccad7fce6a96667efbc7c6c8903fc6
BUG: 1370445
Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Reviewed-on: http://review.gluster.org/15328
Tested-by: Aravinda VK <avishwan@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With this patch, Data and Meta GFIDs are post processed. If Changelog has
UNLINK entry then remove from Data and Meta GFIDs list(If stat on GFID is
ENOENT in Master).
While processing Changelogs,
- Collect all the data and meta operations in a temporary database
- Delete all Data and Meta GFIDs which are already unlinked as per Changelogs
(unlink only if stat on GFID is ENOENT)
- Process all Entry operations as usual
- Process data and meta operations in batch(Fetch from Db in batch)
- Data sync is again batched based on number of changelogs(Default 1day
changelogs). Once the sync is complete, Update last Changelog's time as last_synced
time as usual.
Additionally maintain entry_stime on Brick root, ignore Entry ops if changelog
suffix time is less than entry_stime. If data stime is more than entry_stime,
this can happen only when passive worker updates stime by itself by getting
mount point stime. Use entry_stime = data_stime in this case.
New configurations:
max-rsync-retries - Default Value is 10
max-data-changelogs-in-batch - Max number of changelogs to be considered in a
batch for syncing. Default value is 5760(4 changelogs per min * 60 min *
24 hours)
max-history-changelogs-in-batch - Max number of history changelogs to be
processed at once. Default value 86400(4 changelogs per min * 60 min * 24
hours * 15 days)
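The default values above follow directly from the stated rate of 4 changelogs per minute. The sketch below spells out that arithmetic and shows a simple way to cut a changelog list into batches; the constant and function names are hypothetical.

```python
CHANGELOGS_PER_MIN = 4

# 4 changelogs per min * 60 min * 24 hours = 5760 (one day)
MAX_DATA_CHANGELOGS_IN_BATCH = CHANGELOGS_PER_MIN * 60 * 24

# one day of changelogs * 15 days = 86400
MAX_HISTORY_CHANGELOGS_IN_BATCH = MAX_DATA_CHANGELOGS_IN_BATCH * 15


def batches(changelogs, size):
    """Yield consecutive slices of at most size changelogs."""
    for start in range(0, len(changelogs), size):
        yield changelogs[start:start + size]
```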
BUG: 1364420
Change-Id: I7b665895bf4806035c2a8573d361257cbadbea17
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/15110
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add error logs if gf_history_changelog fails. If requested
changelog range is not available, log the error and exit
instead of continuing the loop and exiting in readdir
without logging. Also fixed the duplicate MSGID number in
'changelog-lib-messages.h'
Change-Id: Icd71b89ae23b48a71380657ba5649029c32fabfd
BUG: 1362151
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/15064
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If ssh returns 127 that means the remote gsyncd path is wrong
or push-pem failed during create. Existing error message was
pointing to old documentation.
Change-Id: Ifbbb4a604fc0ae0fd5cb2746df6363bf28cde1e9
BUG: 1343943
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/14673
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Milind Changire <mchangir@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Handle ESTALE returned by lstat gracefully
by retrying it. Do not crash the worker.
Change-Id: I2527cd8bd1f7d2428cb4fa3f20782bebaf2df12a
BUG: 1247529
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/11772
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While doing RMDIR worker gets ENOTEMPTY because same directory will
have files from other bricks which are not deleted since that worker
is slow processing. So geo-rep does recursive_delete.
Recursive delete was done using shutil.rmtree. Once started, it will
not check disk_gfid in between. So it ends up deleting the new files
created by other workers. Also if another worker creates files after one
worker gets the list of files to be deleted, then the first worker will
get ENOTEMPTY again.
To fix these races, retry is added when it gets ENOTEMPTY/ESTALE/ENODATA.
And disk_gfid check added for original path for which recursive_delete is
called. This disk gfid check executed before every Unlink/Rmdir. If disk
gfid is not matching with GFID from Changelog, that means other worker
deleted the directory. Even if the subdir/file present, it belongs to
different parent. Exit without performing further deletes.
ENOENT during create is not retried, since a CREATE/MKNOD/MKDIR that
failed with ENOENT will not succeed unless the parent directory is created
again.
Rsync errors handling was handling unlinked_gfids_list only for one
Changelog, but when processed in batch it fails to detect unlinked_gfids
and retries again. Finally skips the entire Changelogs in that batch.
Fixed this issue by moving self.unlinked_gfids reset logic before batch
start and after batch end.
Most of the Geo-rep races with rm -rf is eliminated with this patch,
but in some cases stale directories left in some bricks and in mount
point we get ENOTEMPTY.(DHT issue, Error will be logged in Slave log)
BUG: 1211037
Change-Id: I8716b88e4c741545f526095bf789f7c1e28008cb
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/10204
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ENTRY operations failures on slave left no trace for debugging purposes.
This patch captures such failures on slave cluster and forwards them to
the master and logs them. Failures of specific interest are the ones
which return code EEXIST on the failing operations.
Change-Id: Iecab876f16593c746d53f4b7ec2e0783367856bb
BUG: 1207115
Signed-off-by: Milind Changire <mchangir@redhat.com>
Reviewed-on: http://review.gluster.org/10048
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Making geo-rep use the common storage shared by nfs,
snapshot and geo-rep. The meta volume should be named
as gluster_shared_storage, and it should be mounted
at "/var/run/gluster/shared_storage/".
geo-rep will create a directory called 'geo-rep'
in the meta-volume and all the lock files are created
inside it.
Change-Id: I82d0bff9be191f75f643606a9a21d53559047ac4
BUG: 1210344
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/10196
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CURRENT DESIGN AND ITS LIMITATIONS:
-----------------------------------
Geo-replication syncs changes across geography using changelogs captured
by changelog translator. Changelog translator sits on server side just
above posix translator. Hence, in distributed replicated setup, both
replica pairs collect changelogs w.r.t their bricks. Geo-replication
syncs the changes using only one brick among the replica pair at a time,
calling it as "ACTIVE" and other non syncing brick as "PASSIVE".
Let's consider below example of distributed replicated setup where
NODE-1 has b1 and its replicated brick b1r is in NODE-2
NODE-1 NODE-2
b1 b1r
At the beginning, geo-replication chooses to sync changes from NODE-1:b1
and NODE-2:b1r will be "PASSIVE". The logic depends on virtual getxattr
'trusted.glusterfs.node-uuid' which always returns first up subvolume
i.e., NODE-1. When NODE-1 goes down, the above xattr returns NODE-2 and
that is made 'ACTIVE'. But when NODE-1 comes back again, the above xattr
returns NODE-1 and it is made 'ACTIVE' again. So for a brief interval of
time, if NODE-2 had not finished processing the changelog, both NODE-2
and NODE-1 will be ACTIVE causing rename race as mentioned in the bug.
SOLUTION:
---------
1. Have a shared replicated storage, a glusterfs management volume specific
to geo-replication.
2. Geo-rep creates a file per replica set on management volume.
3. fcntl lock on the above said file is used for synchronization
between geo-rep workers belonging to same replica set.
4. If management volume is not configured, geo-replication will fall back
to previous logic of using first up sub volume.
Each worker tries to lock the file on shared storage; whoever wins will
be ACTIVE. With this, we are able to solve the problem but there is an
issue when the shared replicated storage goes down (when all replicas
go down). In that case, the lock state is lost. So AFR needs to rebuild the
lock state after brick comes up.
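The locking step can be sketched as follows. The function name and the lock file location are hypothetical; the sketch only shows a non-blocking fcntl lock attempt, where the worker that gets the lock would act as ACTIVE and the others as PASSIVE.

```python
import errno
import fcntl
import os


def try_become_active(lock_path):
    """Try to take an exclusive fcntl lock on lock_path without blocking.

    Returns the open fd when the lock is acquired (the caller keeps it open
    to hold the lock), or None when another process already holds it.
    """
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except (IOError, OSError) as ex:
        os.close(fd)
        if ex.errno in (errno.EACCES, errno.EAGAIN):
            return None
        raise
    return fd
```

Note that fcntl locks are held per process, so contention only shows up between different worker processes, not between two calls in the same process.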
NOTE:
-----
This patch brings in the prerequisite step of setting up a management volume
for geo-replication during creation.
1. Create mgmt-vol for geo-replication and start it. Management volume should
be part of master cluster and recommended to be three way replicated
volume having each brick in different nodes for availability.
2. Create geo-rep session.
3. Configure mgmt-vol created with geo-replication session as follows.
gluster vol geo-rep <mastervol> slavenode::<slavevol> config meta_volume \
<meta-vol-name>
4. Start geo-rep session.
Backward Compatibility:
-----------------------
If management volume is not configured, it falls back to previous logic of
using node-uuid virtual xattr. But it is not recommended.
Change-Id: I7319d2289516f534b69edd00c9d0db5a3725661a
BUG: 1196632
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/9759
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
shutil.rmtree was failing to remove a file if the file did
not exist. Added an error handling function to ignore ENOENT if
a file/dir is not present.
BUG: 1198101
Change-Id: I1796db2642f81d9e2b5e52c6be34b4ad6f1c9786
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/9792
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Prashanth Pai <ppai@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: Iacc67e4ba9ac45e0858f3befe84ffb8fccf7e1c3
BUG: 1075417
Signed-off-by: arao <arao@redhat.com>
Reviewed-on: http://review.gluster.org/9502
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Changelog consumption/processing now happens in a separate process
group from the monitor. When the monitor process group gets SIGSTOP, all
worker processes, ssh and rsync are paused, but not the changelog
processing. When it gets SIGCONT, it resumes its operation.
Changelog agent runs as RepceServer, geo-rep worker communicates
with changelog agent using RepceClient.
Change-Id: I35c333e4d8b13d03a7808aed601960eef23cfa04
BUG: 1093602
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/7322
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In source install, libgfchangelog is installed in /usr/local/lib
When glusterd runs /usr/local/libexec/glusterfs/python/gsyncd --version
it fails to find library without LD_LIBRARY_PATH.
This patch avoids loading library when it is run from glusterd
during start.
BUG: 1096026
Change-Id: I59912227ac27ff4877d947a7c8f1fe2e8c5be06e
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/7713
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Every time geo-rep restarts, it first does an FS crawl using
XCrawl and then switches to Changelog mode. This is because changelog
only had a live API, that is, changes are available only after registering.
The patch at http://review.gluster.org/#/c/6930/ now introduces a History
API for changelogs. If history is available, geo-rep will use it
instead of FS Crawl.
The History API returns TS, the time up to which history is available for
the given start and end time. If TS < endtime, then switch to FS Crawl
(History => FS Crawl => Live Changelog).
If TS >= endtime, then switch directly to Changelog mode
(History => Live Changelog).
Change-Id: I4922f62b9f899c40643bd35720e0c81c36b2f255
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/6938
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
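
The mode selection described in the message above can be sketched as a small decision function. `plan_crawl` and the stage names are hypothetical and only mirror the rules stated in the message; they are not the geo-rep API.

```python
def plan_crawl(history_ts, end_time, history_available=True):
    """Return the ordered crawl stages for a restart.

    history_ts is the TS returned by the History API, i.e. the time up
    to which history is available; end_time is the requested end time.
    All names here are illustrative assumptions.
    """
    if not history_available:
        # Behaviour before the History API: full FS crawl, then live mode.
        return ["fs_crawl", "live_changelog"]
    if history_ts < end_time:
        # History does not cover the whole range: fill the gap by crawling.
        return ["history", "fs_crawl", "live_changelog"]
    # History covers the whole range: go straight to live changelogs.
    return ["history", "live_changelog"]
```

For example, `plan_crawl(history_ts=90, end_time=100)` includes the FS crawl stage, while `plan_crawl(history_ts=100, end_time=100)` skips it.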
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
pep8 is a style guide for Python:
http://legacy.python.org/dev/peps/pep-0008/
The pep8 tool can be installed using `pip install pep8`.
Usage: `pep8 <python file>`. For example, `pep8 master.py`
displays all the coding standard errors.
flake8 is used to identify unused imports and other issues
in code:
pip install flake8
cd $GLUSTER_REPO/geo-replication/
flake8 syncdaemon
Updated license headers in each source file.
Change-Id: I01c7d0a6091d21bfa48720e9fb5624b77fa3db4a
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/7311
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Prashanth Pai <ppai@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
-> "threaded" hybrid crawl.
-> Enabling metadata synchronization.
-> Handling EINVAL/ESTALE gracefully while syncing metadata.
-> Improvements to changelog crawl code.
-> Initial crawl changelog generation format.
-> No gsyncd restart when checkpoint updated.
-> Fix symlink handling in hybrid crawl.
-> Slave's xtime key is 'stime'.
-> tar+ssh as data synchronization.
-> Instead of 'raise', just log in warning level for xtime missing cases.
-> Fix for JSON object load failure
-> Get new config value after config value reset.
-> Skip already processed changelogs.
-> Saving status of each individual worker thread.
-> GFID fetch on slave for purges.
-> Add tar ssh keys and config options.
-> Fix nlink count when using backend.
-> Include "data" operation for hardlink.
-> Use changelog time prefix as slave's time.
-> Process changelogs in parallel.
Change-Id: I09fcbb2e2e418149a6d8435abd2ac6b2f015bb06
BUG: 1036539
Signed-off-by: Ajeet Jha <ajha@redhat.com>
Reviewed-on: http://review.gluster.org/6404
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I8961633a7371c941a3feee44c949d5c934eca998
Original-Author: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Amar Tumballi <amarts@redhat.com>
BUG: 847839
Reviewed-on: http://review.gluster.org/5933
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch extends the persistent instrumentation work done by
Aravinda (@avishwa) by introducing a handful of instrumentation
variables for crawl. These variables are "pulled up" by glusterd
in the event of a geo-replication status CLI command and look
something like the following:
"Uptime=00:21:10;FilesSynced=2982;FilesPending=0;BytesPending=0;DeletesPending=0;"
"FilesPending", "BytesPending" and "DeletesPending" are short-lived
variables that are non-zero while a changelog is being processed (i.e.
while an active sync is ongoing). After a changelog is processed successfully,
"FilesPending" is added to "FilesSynced". The three short-lived
variables are then reset to zero and the data is persisted.
Additionally, this patch reverts some of the changes made for
BZ #986929 (those were not needed).
Change-Id: I948f1a0884ca71bc5e5bcfdc017d16c8c54fc30b
BUG: 990420
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/5441
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
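
The counter roll-up described in the message above can be sketched as follows. The helpers `parse_status`, `finish_changelog` and `format_status` are hypothetical; only the field names and the `key=value;` layout come from the message, and the real gsyncd code may be organised differently.

```python
SHORT_LIVED = ("FilesPending", "BytesPending", "DeletesPending")


def parse_status(line):
    """Split a 'key=value;key=value;' status line into a dict."""
    fields = {}
    for item in line.strip(";").split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def finish_changelog(fields):
    """Fold FilesPending into FilesSynced and reset short-lived counters."""
    done = dict(fields)
    done["FilesSynced"] = (int(done.get("FilesSynced", 0)) +
                           int(done.get("FilesPending", 0)))
    for key in SHORT_LIVED:
        done[key] = 0
    return done


def format_status(fields):
    """Serialise the dict back into the 'key=value;' form."""
    return "".join("%s=%s;" % (key, value) for key, value in fields.items())
```

Parsing a line, calling `finish_changelog` on the result and formatting it again yields a status line in the same field order with the pending counters at zero.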
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A hostname FQDN can be up to 255 characters long according to RFC 1123
------------------------->
/usr/include/bits/posix1_lim.h:#define _POSIX_HOST_NAME_MAX 255
<-------------------------
On Linux this length is 64
------------------------->
/usr/include/bits/local_lim.h:#define HOST_NAME_MAX 64
<-------------------------
When a given hostname is longer than 45 characters, SSH fails with
-------------------------->
"ControlPath too long for Unix domain socket".
<--------------------------
This indicates that the total length of ControlPath exceeds the limit for
a Unix domain socket path, which on Linux is 108
------------------------->
/usr/include/linux/un.h:#define UNIX_PATH_MAX 108
<-------------------------
This leads to a "faulty" geo-replication status.
This patch brings in a new file called manifest, which carries
some unique information for a given geo-rep session. From it
a unique `md5` digest of length 32 is generated. This ensures
that we don't exceed the UNIX_PATH_MAX limit: we take
a conservative approach and are still able to provide a unique
socket path.
Change-Id: I3a6a27d605d751a86e7c82eace4561d9b0134fe1
BUG: 990330
Signed-off-by: Harshavardhana <harsha@harshavardhana.net>
Reviewed-on: http://review.gluster.org/5681
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Csaba Henk <csaba@redhat.com>
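
The fixed-length naming scheme described in the message above can be sketched with `hashlib`. This is an illustration, not the patch: `control_path`, the `.sock` suffix, the `/tmp` directory and the contents of the session string are assumptions; only the use of a 32-character `md5` digest and the 108 limit come from the message.

```python
import hashlib
import os

UNIX_PATH_MAX = 108  # from /usr/include/linux/un.h, as quoted above


def control_path(session_info, directory="/tmp"):
    """Build a socket path whose length does not depend on the hostname.

    session_info stands in for the unique per-session data kept in the
    manifest file; its exact contents are an assumption here.
    """
    # An md5 hex digest is always 32 characters, whatever the input size.
    digest = hashlib.md5(session_info.encode("utf-8")).hexdigest()
    path = os.path.join(directory, digest + ".sock")
    if len(path) >= UNIX_PATH_MAX:
        raise ValueError("socket path too long: %d characters" % len(path))
    return path
```

Since the file name is a constant 37 characters (digest plus suffix), only the directory component can push the path over the limit, and that component is under the caller's control.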
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* also consume changelog for change detection.
* Status fixes
* Use new libgfchangelog done API
* process (and sync) one changelog at a time
Change-Id: I24891615bb762e0741b1819ddfdef8802326cb16
BUG: 847839
Original Author: Csaba Henk <csaba@redhat.com>
Original Author: Aravinda VK <avishwan@redhat.com>
Original Author: Venky Shankar <vshankar@redhat.com>
Original Author: Amar Tumballi <amarts@redhat.com>
Original Author: Avra Sengupta <asengupt@redhat.com>
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/5131
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
Change-Id: Ibd0faefecc15b6713eda28bc96794ae58aff45aa
BUG: 847839
Original Author: Amar Tumballi <amarts@redhat.com>
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/5133
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|