| Commit message | Author | Age | Files | Lines |
| |
Fix the subvol number calculation when the volume
type is tier. If the volume type was tier,
the subvol number was calculated incorrectly,
so a few workers never became ACTIVE and
files were not replicated from the
corresponding bricks. This patch corrects
the calculation.
Change-Id: Ic10ad7f09a0fa91b4bf2aa361dea3bd48be74853
BUG: 1292084
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/12994
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
| |
If ENTRY creation failed for a symlink in Slave and the symlink was
later renamed in Master, the source of the Rename does not exist in
Slave, so Geo-rep interprets the Rename as a Create of the target
file. Geo-rep then sent the blob of a regular file to create the
symlink instead of sending the blob of a symlink.
With this patch, Geo-rep identifies the symlink and sends the
respective blob.
BUG: 1289859
Change-Id: If9351974d1945141a1d3abb838b7d0de7591e48e
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/12917
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Milind Changire <mchangir@redhat.com>
Tested-by: Milind Changire <mchangir@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
| |
Without the '--sparse' option, tar does not archive sparse files properly,
and geo-replication produces non-sparse files on the remote end.
More on how I arrived at this:
http://markelov.org/wiki/index.php/GlusterFS_3.6.1_on_CentOS_6.5:_geo-replication_and_sparse_files_problem
Change-Id: I8d671964a1b48bbb916e4a064571221bf3631494
BUG: 1276839
Signed-off-by: Alex Markelov <alex@markelov.org>
Reviewed-on: http://review.gluster.org/12476
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
| |
Problem:
Symlinks are not synced to slave in a Tiering based volume.
Solution:
Symlinks are now created directly in cold tier bricks (in the backend).
Earlier, the cold tier was avoided for namespace operations and only
the hot tier was used while processing changelogs.
Now, the cold tier is the HASH subvolume in a Tiering volume.
So, carry out namespace operations only in the cold tier subvolume and
skip the hot tier subvolume to avoid any races.
Earlier, XSYNC was used (and history changelog avoided) during initial sync
in order to avoid races while processing the history changelog in the Hot tier.
This is no longer required as there is no race from the Hot tier.
Also, skip both live and history changelog ENTRY operations from the Hot tier
to avoid any race with the cold tier.
Change-Id: Ia8fbb7ae037f5b6cb683f36c0df5c3fc2894636e
BUG: 1287519
Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Reviewed-on: http://review.gluster.org/12844
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
| |
Problem:
geo-replication session creation fails when the hostname
contains uppercase letters.
The regex pattern handles only lowercase
hostnames.
Fix:
Fix the regex pattern to handle hostnames with uppercase letters as well.
Change-Id: I5c99c102e9706acc2b1fab1e6bf158e68beed373
BUG: 1265522
Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Reviewed-on: http://review.gluster.org/12216
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Milind Changire <mchangir@redhat.com>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
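
A minimal sketch of the kind of fix the commit above describes. The two patterns below are hypothetical and only illustrate the idea; the pattern actually changed by the patch is not shown in this log. Adding the uppercase range to the character class lets a name such as NODE1.Example.COM match.

```python
import re

# Hypothetical patterns for illustration; not the ones used by the patch.
LOWERCASE_ONLY = re.compile(r"^[a-z0-9]([a-z0-9.-]*[a-z0-9])?$")
ANY_CASE = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9.-]*[A-Za-z0-9])?$")


def is_valid_hostname(name, pattern=ANY_CASE):
    """Return True if name matches the given hostname pattern."""
    return pattern.match(name) is not None
```

With LOWERCASE_ONLY, a mixed-case hostname is rejected; with ANY_CASE it is accepted.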
| |
When any open fd of a file holding an fcntl lock
is closed, all fcntl locks on that file are released,
even if another fd of the same file is still open
and holds a lock. This sometimes causes both
replica workers to be ACTIVE. This patch
fixes that issue.
Change-Id: I1e203ab0e29442275338276deb56d09e5679329c
BUG: 1285488
Original-Author: Aravinda VK <avishwan@redhat.com>
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/12752
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Milind Changire <mchangir@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
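
The POSIX lock behaviour described in the commit above can be reproduced outside geo-rep. The sketch below is not taken from the patch: it opens a temporary file twice, locks it through the first fd, and forks a child to probe whether the lock is still held after the second fd is closed. The function names are made up for the example, and it assumes a POSIX system with os.fork.

```python
import fcntl
import os
import tempfile


def other_process_can_lock(path):
    """Fork a child and report whether it can take an exclusive lock on path."""
    pid = os.fork()
    if pid == 0:
        # Child: a separate process, so it contends with the parent's lock.
        status = 1
        try:
            fd = os.open(path, os.O_RDWR)
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            status = 0
        except OSError:
            status = 1
        finally:
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status) == 0


def demo():
    """Return (held_before_close, held_after_close) for the parent's lock."""
    tmp_fd, path = tempfile.mkstemp()
    os.close(tmp_fd)
    fd1 = os.open(path, os.O_RDWR)
    fd2 = os.open(path, os.O_RDWR)
    fcntl.lockf(fd1, fcntl.LOCK_EX | fcntl.LOCK_NB)
    held_before = not other_process_can_lock(path)
    # Closing ANY fd of the file drops all of this process's POSIX locks on it.
    os.close(fd2)
    held_after = not other_process_can_lock(path)
    os.close(fd1)
    os.unlink(path)
    return held_before, held_after
```

The lock taken through fd1 is lost as soon as fd2 is closed, which is how a second worker can acquire the same lock and also become ACTIVE.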
| |
%s was not replaced by actual values in GsyncdError
BUG: 1279921
Change-Id: I3c0a10f07383ca72844a46f930b4aa3d3c29f568
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/12566
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
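
One common way a literal %s ends up in an error message is passing the template and the value as separate exception arguments. The sketch below illustrates that pattern only; the GsyncdError class here is a stand-in defined for the example, and the log does not say whether this is the exact mistake the patch fixed.

```python
class GsyncdError(Exception):
    """Stand-in for geo-rep's error type, defined here only for the example."""


def unformatted(path):
    # Bug pattern: template and value are separate args, so %s is never filled.
    return GsyncdError("cannot access %s", path)


def formatted(path):
    # Fixed pattern: interpolate before constructing the exception.
    return GsyncdError("cannot access %s" % path)
```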
| |
Geo-rep logs in local time, while all other Gluster logs are in
GMT/UTC. This makes it very difficult to correlate Geo-rep logs with
other Gluster logs.
BUG: 1282331
Change-Id: Ieae8bda7e4788e587cf4595e21e0e772c210cfbb
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/12583
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
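
With Python's standard logging module, timestamps can be switched to UTC by replacing the formatter's time converter. This is a sketch of that general technique, not the code from the patch; the format string is an assumption chosen for the example.

```python
import logging
import time


def make_utc_formatter():
    """Build a log formatter whose timestamps are rendered in UTC."""
    formatter = logging.Formatter(
        fmt="[%(asctime)s] %(levelname)s %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
    )
    # logging uses time.localtime by default; gmtime yields UTC timestamps.
    formatter.converter = time.gmtime
    return formatter
```

Attaching the formatter to a handler with handler.setFormatter(make_utc_formatter()) makes every record from that handler use UTC.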
| |
The Active worker tries to acquire the lock in each iteration. On every successful
lock acquisition it was not closing the previously opened lock fd.
To see the leak, get the PID of the worker and watch its open fds:
ps -ax | grep feedback-fd
watch ls /proc/$pid/fd
BUG: 1225566
Change-Id: Ic476c24c306e7ab372c5560fbb80ef39f4fb31af
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/12332
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Milind Changire <mchangir@redhat.com>
Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
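
One way to avoid leaking an fd on every iteration is to open the lock file once and re-lock through the same descriptor. The class below is a hypothetical sketch of that approach, not the change made by the patch; the class and file names are invented for the example.

```python
import fcntl
import os
import tempfile


class LockFile:
    """Hold at most one open fd for a lock file across repeated acquires."""

    def __init__(self, path):
        self.path = path
        self.fd = None

    def acquire(self):
        # Open only on first use; later calls reuse the same descriptor.
        if self.fd is None:
            self.fd = os.open(self.path, os.O_CREAT | os.O_RDWR, 0o600)
        try:
            fcntl.lockf(self.fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return False
        return True

    def close(self):
        if self.fd is not None:
            os.close(self.fd)
            self.fd = None


def demo(iterations=5):
    """Acquire repeatedly and return how many distinct fds were used."""
    path = os.path.join(tempfile.mkdtemp(), "subvol.lock")
    lock = LockFile(path)
    fds = set()
    for _ in range(iterations):
        if lock.acquire():
            fds.add(lock.fd)
    lock.close()
    return len(fds)
```

Reusing one descriptor also avoids closing a second fd of the same file, which would release the POSIX lock.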
| |
If a port other than 22 was used for SSH, Geo-replication
failed to establish the SSH connection.
The port could be set through config:ssh_command and
config:ssh_command_tar, but the user had to remember the complete
ssh command, with its parameters, to add or modify the port.
This patch adds a new config option, ssh_port:
gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> \
config ssh_port 52022
Change-Id: I7753a09485f0b1f49d2b2a80b962c720817c96f4
Signed-off-by: Aravinda VK <avishwan@redhat.com>
BUG: 1276028
Reviewed-on: http://review.gluster.org/12444
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
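
A rough sketch of how a separate port setting can be folded into an ssh command line, so the user does not have to restate the whole command. The helper name and the sample base command are assumptions for illustration; only ssh's -p flag is standard.

```python
import shlex


def build_ssh_command(base_command, ssh_port=None):
    """Return the ssh argv, appending -p PORT when a non-default port is set."""
    argv = shlex.split(base_command)
    if ssh_port is not None and int(ssh_port) != 22:
        argv += ["-p", str(int(ssh_port))]
    return argv
```

For example, build_ssh_command("ssh -oPasswordAuthentication=no", 52022) appends "-p 52022" and leaves the other options untouched.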
| |
When the Changelog agent process dies, Geo-replication fails to detect it
and the worker keeps running without its Changelog agent. Status shows
Active/Passive without any progress.
With this patch, the Worker process is killed whenever its Changelog
agent dies.
Change-Id: I30b4cc77f924f7e8174b8bfe415ac17f0b3851b4
Signed-off-by: Aravinda VK <avishwan@redhat.com>
BUG: 1277076
Reviewed-on: http://review.gluster.org/12485
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
| |
GEO-REP INTEROP WITH SHARD FEATURE
Problem:
The sequence of entry creation and chown in master
is recorded in the xsync changelog as creation of the
entry with the resulting user:group. During sync, entry
creation is always split into two ops, MKNOD and
SETATTR, which is why the issue was not hit earlier; otherwise
creation would have failed with EPERM if the parent is owned
by a different user. But with the shard translator
enabled on slave, doing entry creation with MKNOD and
SETATTR is not allowed: SETATTR fails as it accesses
an inode structure which is not linked.
Solution:
The sequence of entry creation and chown in master
should always be recorded as separate MKNOD and SETATTR,
and entry creation should be done with a single op in the
gfid-access xlator. The gfid-access patch will be sent separately.
Change-Id: I93e554bf9342397a7660503f5128e9709f8a0cd8
BUG: 1265148
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/12205
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
| |
During XSync crawl, last_synced time in status file was not updated.
This patch fixes the issue by updating status file when stime xattr
is updated after Xsync or Changelog Crawl.
Change-Id: I4dc3a2d4c3d8378a939da0868caf1aef4f789599
Signed-off-by: Aravinda VK <avishwan@redhat.com>
BUG: 1247536
Reviewed-on: http://review.gluster.org/11771
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
| |
GEO-REP INTEROP WITH SHARD FEATURE
If it is FXATTROP or XATTROP in changelog,
add the gfid to rsync queue.
Change-Id: If68d38d7ed00b70a4618cfcc8e75df3fbadbf724
BUG: 1265148
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/12226
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
| |
This is one of a series of patches which aim to fix geo-replication
in a Tiering Volume.
Problem:
Consider a file that is placed in the volume initially, after which a hot tier is
attached. During any operation on the file, a lookup creates a linkto
file in the hot tier.
Now, any namespace operation carried out on the file is recorded in
both the cold and the hot tier.
There is room for races when both changelogs are replayed.
Solution:
So, we are going to replay (namespace related) operations
only in the hot tier.
Why?
a. If the file is directly placed in the Hot tier, all fops will be
recorded in the HOT tier.
b. If the file is already present in the Cold tier and any fop is
carried out, it creates a linkto file in the Hot tier.
Now, operations like UNLINK and RENAME are captured in the Hot
tier (by means of the linkto file).
This way, we can get both tiers' operations in the HOT tier itself.
Now, once the file is demoted to the COLD tier, any namespace operation
carried out on the cold tier can be skipped as we directly RECORD
the same in the HOT tier.
How?
1. Check whether the brick is a cold tier brick and skip ENTRY operations.
2. Also, if it is a cold tier brick, use Xsync (which is used during the initial run).
This gets all cold tier brick changes using a File System crawl
and avoids races with the hot tier brick (which can happen
if the history changelog is used in the cold tier brick).
Dependent patches:
1. http://review.gluster.org/12239
2. http://review.gluster.org/12326
Change-Id: I7692b1dbb8813a7e253451bca02f8f09a5782dde
BUG: 1266875
Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Reviewed-on: http://review.gluster.org/12355
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
| |
This is one of a series of patches which aim to fix geo-replication
in a Tiering Volume.
Problem:
Consider a file that is placed in the volume initially, after which a hot tier is
attached. During any operation on the file, a lookup creates a linkto
file in the hot tier.
Now, any namespace operation carried out on the file is recorded in
both the cold and the hot tier.
There is room for races when both changelogs are replayed.
Solution:
So, we are going to replay (namespace related) operations
only in the hot tier.
Why?
a. If the file is directly placed in the Hot tier, all fops will be
recorded in the HOT tier.
b. If the file is already present in the Cold tier and any fop is
carried out, it creates a linkto file in the Hot tier.
Now, operations like UNLINK and RENAME are captured in the Hot tier (by means of the linkto file).
This way, we can get both tiers' operations in the HOT tier itself.
But we may miss the initial Data sync immediately after creating the
file, as only MKNOD is recorded. So, if a MKNOD is encountered
with the sticky bit set, queue a DATA operation for the corresponding gfid.
(This is addressed in this patch.)
So, if tier-gfid linkto is set, we need to record the corresponding
MKNOD. Earlier this was skipped as it was marked as an INTERNAL fop.
(The changelog related changes are addressed in the patch:
- http://review.gluster.org/12417)
Change-Id: I2fa84cfa2b0f86506c3d15d484138ab9651e4f83
BUG: 1266875
Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Reviewed-on: http://review.gluster.org/12326
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
| |
Fixes portability issues in gverify.sh and
libcxattr.py with NetBSD.
Change-Id: Idfaa6cf3815136e6a2343aab98d979b6ab451bbd
BUG: 1257847
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/12088
Reviewed-by: Emmanuel Dreyfus <manu@netbsd.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
| |
Problem:
When the monitor process itself gets killed, the geo-rep session
still shows as active.
The status command just picks up the content of the status file
to show the output. The monitor process is the one which updates the status file.
When the monitor process itself gets killed, there is no way to update
the status file. So, the geo-rep session status command ends up showing
the last updated status present in the status file.
Solution:
While getting the status output, check whether the monitor process is running.
If it is NOT running, update the status as STOPPED.
Change-Id: I86a7ac1746dd8f27eef93658e992ef16f6068d9d
BUG: 1251980
Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Reviewed-on: http://review.gluster.org/11873
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Milind Changire <mchangir@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
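
The liveness check described in the commit above can be sketched with signal 0, which tests for a process without delivering a signal. This is an illustration under assumptions, not the patch itself: the function names are hypothetical, and how the monitor PID is obtained is left out.

```python
import errno
import os
import subprocess
import sys


def is_process_running(pid):
    """Return True if a process with this pid exists (POSIX signal-0 probe)."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except OSError as e:
        # EPERM means the process exists but belongs to another user.
        return e.errno == errno.EPERM
    return True


def effective_status(recorded_status, monitor_pid):
    """Report the status file's value only while the monitor is alive."""
    if not is_process_running(monitor_pid):
        return "Stopped"
    return recorded_status


def demo():
    """Compare a live pid (this process) with the pid of a reaped child."""
    child = subprocess.Popen([sys.executable, "-c", "pass"])
    child.wait()
    return (effective_status("Active", os.getpid()),
            effective_status("Active", child.pid))
```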
| |
Handle ESTALE returned by lstat gracefully
by retrying it. Do not crash the worker.
Change-Id: I2527cd8bd1f7d2428cb4fa3f20782bebaf2df12a
BUG: 1247529
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/11772
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
| |
Both ACTIVE and PASSIVE workers register to changelog
at almost the same time. When a PASSIVE worker becomes ACTIVE,
the start and end times for the history API would be the current stime
and register_time respectively. Hence register_time would be less
than stime, for which history obviously fails. But it would
succeed after the next restart as the new register_time > stime.
The fix is to pass the current time as the end time to the history call
instead of register_time.
Also improved the logging for ACTIVE/PASSIVE switching.
Change-Id: Idc08b4b55c7a4c575ba44918a98389164ccbee8f
BUG: 1239044
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/11524
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
| |
Fix fd reference before assignment in mgmt_lock
function.
Change-Id: Ie939d4262a59cae0817ae388658a000576ab69b8
BUG: 1233411
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/11318
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
| |
When DHT can't resolve a file it raises ESTALE. Ignore ESTALE errors
the same way as ENOENT, after retrying.
Affected places:
Xattr.lgetxattr
os.listdir
os.link
Xattr.lsetxattr
os.chmod
os.chown
os.utime
os.readlink
BUG: 1232912
Change-Id: I02015f508d901e4a74dd48e1c52423e78eaf1dcd
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/11296
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
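
The retry-then-ignore handling described in the commit above can be sketched as a small wrapper around any of the listed calls. The helper below is hypothetical (its name, defaults, and the choice to return None for ignored errors are assumptions), and it is not geo-rep's own wrapper.

```python
import errno
import os
import time


def call_ignoring(call, args=(), ignore=(errno.ENOENT, errno.ESTALE),
                  retry=(errno.ESTALE,), attempts=3, delay=0.0):
    """Run call(*args); retry on `retry` errnos, then swallow `ignore` errnos.

    Returns the call's result, or None if the final error was ignorable.
    Any other OSError is re-raised.
    """
    for attempt in range(attempts):
        try:
            return call(*args)
        except OSError as e:
            if e.errno in retry and attempt < attempts - 1:
                time.sleep(delay)
                continue
            if e.errno in ignore:
                return None
            raise
```

A caller would wrap, for example, os.readlink or os.chmod; errors outside the ignore list still propagate so real failures are not hidden.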
| |
If meta-volume is deleted and use_meta_volume
is set to false, geo-rep still fails complaining
meta volume is not mounted. The patch fixes that
issue.
Change-Id: Iecf732197926bf9ce69112287fccbb1c34e58e6d
BUG: 1234694
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/11358
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
| |
The lock filename was formed from 'master volume id'
and 'subvol number'. Hence workers of multiple slaves tried
to acquire the lock on the same file and became PASSIVE,
ending up not syncing data. Using 'slave volume id'
in the lock filename fixes the issue by making the lock
file unique across different slaves.
BUG: 1234882
Change-Id: Ie3590b36ed03e80d74c0cfc1290dd72122a3b4b1
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/11367
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
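
To illustrate why including the slave volume id matters, here is a sketch with a made-up filename layout (the real format used by geo-rep is not given in this log). Two sessions from the same master volume to different slaves get distinct lock files only when the slave id is part of the name.

```python
def lock_filename(master_volume_id, slave_volume_id, subvol_num):
    """Build a per-session lock filename (hypothetical layout)."""
    return "%s_%s_subvol_%d.lock" % (
        master_volume_id, slave_volume_id, subvol_num)
```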
| |
During unlink/rmdir of Parent_GFID/Basename, if the parent
directory does not exist, the Parent GFID will not get resolved
and DHT raises ESTALE instead of ENOENT.
ESTALE errors are now ignored during unlink/rmdir.
BUG: 1223280
Change-Id: If275c89fb9fc7d16004550805a4cd65be818540d
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/10837
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
| |
ENTRY and META operations are executed sequentially, while DATA operations
are handled asynchronously. The increment happens when a changelog is parsed;
the decrement happens after all files are synced.
'files_in_batch' was reset multiple times in a batch instead of once.
BUG: 1224098
Change-Id: I87617f2fd5f4d3221a1c9f9d5a8efb0686c42bbe
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/10911
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
| |
Geo-rep can't sync xattrs and acls with tar over ssh
for the following reasons.
Issue 1: xattrs don't sync with tar over ssh.
Reason: untar doesn't respect the '--overwrite' option when used along
with '--xattrs'. So it sends unlink if the file exists on the
destination and re-creates it afresh. But all entry operations
are banned in aux-gfid-mount as they may lead to gfid-mismatch.
Hence it fails with EPERM. This happens only when some xattr is
set on a file in the master volume.
Issue 2: acls on directories do not sync with tar over ssh.
Reason: tar tries to opendir ".gfid/<gfid1>", which is not supported
by gfid-access-translator as readdirp can't be handled on
virtual inodes, and hence fails with ENOTSUP, whereas it syncs
for files.
Since the issue is with the tar command itself and nothing can be
done from the gluster side, xattr and acl support is disabled with the tar
over ssh option.
Geo-rep can sync xattrs and acls with 'rsync' as the sync engine.
Change-Id: I6821d327e7fe15545adef644869aa2389f79c701
BUG: 1223642
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/10873
Tested-by: NetBSD Build System
Reviewed-by: Venky Shankar <vshankar@redhat.com>
| |
When gsyncd fails with Python traceback, glusterd fails
parsing gsyncd output and shows error.
BUG: 1219937
Change-Id: Ic32fd897c49a5325294a6588351b539c6e124338
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/10694
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
| |
Default values for last_synced, checkpoint_time and
checkpoint_completion_time were zero instead of 'N/A'.
BUG: 1212410
Change-Id: Ie775508f8dcb9ba6f311946a2039739e4336d9a6
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/10580
Reviewed-by: darshan n <dnarayan@redhat.com>
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
| |
When rsync is executed using Python subprocess, stdout of the
subprocess is None by default. With the log rsync performance
patch, stdout is assigned to PIPE. Rsync writes to that PIPE
whenever it syncs files. If log_rsync_performance is disabled,
nobody consumes stdout and the PIPE fills up. Rsync hangs
once the PIPE is full.
The log_rsync_performance option was introduced with patch 10070.
With this patch, stdout=PIPE is set only if log_rsync_performance is
enabled. Also removed the -v option from Rsync.
Thanks Venky and Kotresh for RCA.
BUG: 1218552
Change-Id: I4ebcfb6999358c8e2c147f7964255bd836ed7499
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/10556
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
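
The pipe handling described in the commit above can be sketched with Python's subprocess module: attach a pipe to stdout only when something will read it. The function and parameter names are invented for the example, and any command can stand in for rsync.

```python
import subprocess
import sys


def run_sync(cmd, log_performance=False):
    """Run cmd, capturing stdout only when it is going to be consumed."""
    # With stdout=None the child inherits the parent's stdout, so it can
    # never block on a full pipe that nobody reads.
    stdout = subprocess.PIPE if log_performance else None
    proc = subprocess.Popen(cmd, stdout=stdout, stderr=subprocess.PIPE)
    out, _err = proc.communicate()  # drains every attached pipe
    return proc.returncode, out
```

When log_performance is False, the returned output is None because no pipe was attached.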
| |
While doing RMDIR, a worker gets ENOTEMPTY because the same directory
still has files from other bricks, which are not yet deleted because their
worker is slow in processing. So geo-rep does a recursive_delete.
Recursive delete was done using shutil.rmtree. Once started, it does
not check disk_gfid in between, so it ends up deleting new files
created by other workers. Also, if another worker creates files after one
worker gets the list of files to be deleted, the first worker will
get ENOTEMPTY again.
To fix these races, a retry is added when it gets ENOTEMPTY/ESTALE/ENODATA,
and a disk_gfid check is added for the original path on which recursive_delete is
called. This disk gfid check is executed before every Unlink/Rmdir. If the disk
gfid does not match the GFID from the Changelog, another worker has
deleted the directory. Even if the subdir/file is present, it belongs to a
different parent. Exit without performing further deletes.
Retry on ENOENT during create is skipped, since a CREATE/MKNOD/MKDIR that
failed with ENOENT will not succeed unless the parent directory is created
again.
Rsync error handling was handling unlinked_gfids_list only for one
Changelog, but when Changelogs are processed in a batch it fails to detect
unlinked_gfids and retries again, and finally skips all the Changelogs in that batch.
Fixed this issue by moving the self.unlinked_gfids reset logic to before batch
start and after batch end.
Most of the Geo-rep races with rm -rf are eliminated with this patch,
but in some cases stale directories are left in some bricks and we get
ENOTEMPTY in the mount point. (DHT issue; the error will be logged in the Slave log.)
BUG: 1211037
Change-Id: I8716b88e4c741545f526095bf789f7c1e28008cb
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/10204
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
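
The guarded recursive delete described in the commit above can be sketched as a bottom-up walk that re-checks the top directory's identity before each removal. This is an illustration, not the patch: get_gfid is a caller-supplied, hypothetical function standing in for however the on-disk GFID is read, and the retry on ENOTEMPTY/ESTALE/ENODATA is left out.

```python
import os
import tempfile


def recursive_delete(path, expected_gfid, get_gfid):
    """Delete path bottom-up, stopping if path no longer has expected_gfid.

    Returns True when the tree was removed, False when the identity check
    failed and the delete was abandoned.
    """
    def still_ours():
        try:
            return get_gfid(path) == expected_gfid
        except OSError:
            return False

    for root, dirs, files in os.walk(path, topdown=False):
        for name in files:
            if not still_ours():
                return False
            os.unlink(os.path.join(root, name))
        for name in dirs:
            if not still_ours():
                return False
            full = os.path.join(root, name)
            # os.walk lists symlinks to directories under dirs.
            if os.path.islink(full):
                os.unlink(full)
            else:
                os.rmdir(full)
    if not still_ours():
        return False
    os.rmdir(path)
    return True


def demo():
    """Run once with a mismatching gfid, then once with a matching one."""
    base = tempfile.mkdtemp()
    top = os.path.join(base, "dir")
    os.makedirs(os.path.join(top, "sub"))
    open(os.path.join(top, "sub", "file"), "w").close()
    kept = recursive_delete(top, "gfid-1", lambda p: "gfid-2")
    still_there = os.path.exists(os.path.join(top, "sub", "file"))
    removed = recursive_delete(top, "gfid-1", lambda p: "gfid-1")
    gone = not os.path.exists(top)
    os.rmdir(base)
    return kept, still_there, removed, gone
```

Unlike shutil.rmtree, the walk can stop as soon as the directory turns out to belong to someone else, leaving newer files untouched.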
| |
Discussion in gluster-devel
http://www.gluster.org/pipermail/gluster-devel/2015-April/044301.html
MASTER NODE - Master Volume Node
MASTER VOL - Master Volume name
MASTER BRICK - Master Volume Brick
SLAVE USER - Slave User to which Geo-rep session is established
SLAVE - <SLAVE_NODE>::<SLAVE_VOL> used in Geo-rep Create command
SLAVE NODE - Slave Node to which Master worker is connected
STATUS - Worker Status(Created, Initializing, Active, Passive, Faulty,
Paused, Stopped)
CRAWL STATUS - Crawl type(Hybrid Crawl, History Crawl, Changelog Crawl)
LAST_SYNCED - Last Synced Time(Local Time in CLI output and UTC in XML output)
ENTRY - Number of entry Operations pending.(Resets on worker restart)
DATA - Number of Data operations pending(Resets on worker restart)
META - Number of Meta operations pending(Resets on worker restart)
FAILURES - Number of Failures
CHECKPOINT TIME - Checkpoint set Time(Local Time in CLI output and UTC
in XML output)
CHECKPOINT COMPLETED - Yes/No or N/A
CHECKPOINT COMPLETION TIME - Checkpoint Completed Time(Local Time in CLI
output and UTC in XML output)
XML output:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<cliOutput>
<geoRep>
<volume>
<name>
<sessions>
<session>
<session_slave>
<pair>
<master_node>
<master_brick>
<slave_user>
<slave/>
<slave_node>
<status>
<crawl_status>
<entry>
<data>
<meta>
<failures>
<checkpoint_completed>
<master_node_uuid>
<last_synced>
<checkpoint_time>
<checkpoint_completion_time>
BUG: 1212410
Change-Id: I944a6c3c67f1e6d6baf9670b474233bec8f61ea3
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/10121
Tested-by: NetBSD Build System
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
| |
1. Fix access of an unassigned fd:
In the meta volume configuration for geo-rep, if the
'geo-rep' directory is not created yet, open fails
and returns no fd, but fd is still accessed in close(fd). So
open the lock file to get the fd only after creating the
'geo-rep' directory in the meta-volume.
2. Fix volume_id in forming the lock file name:
The very first time, gconf.volume_id would
be null, as the config is not reloaded yet. Hence, use the
'uuid' function to get the volume id.
Change-Id: I8381ab7a44bc800df25d596218466641c10937a4
BUG: 1210344
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/10458
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Tested-by: NetBSD Build System
| |
Changelog processing is done in batches: for example, if 10 changelogs are
available for processing, all are processed at once. Entry, Meta
and Data operations are collected separately. All the entry operations like CREATE,
MKDIR, MKNOD, LINK, UNLINK are executed first, then rsync is
triggered for the whole batch. Stime gets updated once the whole
batch is complete.
With a large number of Changelogs in a batch, if geo-rep fails after
the Entry operations but before rsync, then on restart it starts again from the
beginning since stime is not updated. It has to process all the changelogs
again. While processing the same changelogs again, every CREATE gets EEXIST
since all the files were created in the previous run. This is a big hit for performance.
With this patch, Geo-rep limits the number of changelogs per batch based on
Changelog file size, so that when geo-rep fails it has to retry only the last batch of
changelogs, since stime gets updated after each batch.
BUG: 1210965
Change-Id: I844448c4cdcce38a3a2e2cca7c9a50db8f5a9062
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/10202
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
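
Size-based batching as described in the commit above can be sketched as a greedy grouping over changelog sizes. The function name, the (name, size) input shape, and the byte limit are assumptions for illustration; the limit actually used by geo-rep is not stated in this log.

```python
def batch_by_size(changelogs, max_batch_bytes):
    """Group (name, size) pairs, in order, into batches under a size cap.

    A single changelog larger than the cap still forms its own batch.
    """
    batches, current, current_size = [], [], 0
    for name, size in changelogs:
        # Start a new batch when adding this changelog would exceed the cap.
        if current and current_size + size > max_batch_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

Because stime is updated after each batch, a failure only forces a retry of the batch that was in flight.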
| |
ENTRY operations failures on slave left no trace for debugging purposes.
This patch captures such failures on slave cluster and forwards them to
the master and logs them. Failures of specific interest are the ones
which return code EEXIST on the failing operations.
Change-Id: Iecab876f16593c746d53f4b7ec2e0783367856bb
BUG: 1207115
Signed-off-by: Milind Changire <mchangir@redhat.com>
Reviewed-on: http://review.gluster.org/10048
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
| |
Make geo-rep use the common storage shared by nfs,
snapshot and geo-rep. The meta volume should be named
gluster_shared_storage, and it should be mounted
at "/var/run/gluster/shared_storage/".
geo-rep will create a directory called 'geo-rep'
in the meta-volume, and all the lock files are created
inside it.
Change-Id: I82d0bff9be191f75f643606a9a21d53559047ac4
BUG: 1210344
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/10196
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
| |
If this option is set, deletes will not be propagated to Slave.
This option is applicable to UNLINK and RMDIR.
gluster volume geo-replication <MASTER> <SLAVEHOST>::<SLAVEVOL> \
config ignore_deletes true
Default value is false.
PS: Use this option with caution. If you re-create a file in master
with the same path, it fails to sync to slave: the old file in Slave
will have a different GFID from the new one.
BUG: 1189363
Change-Id: I1f7816d1ea36460a654873739d3fb1b6c13e0f8d
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/9583
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
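
The effect of ignore_deletes can be sketched as a filter over entry operations before they are sent to the slave. The dictionary shape of an entry and the function name are assumptions made for this example, not geo-rep's internal representation.

```python
DELETE_OPS = ("UNLINK", "RMDIR")


def filter_entry_ops(entries, ignore_deletes=False):
    """Drop UNLINK and RMDIR entries when ignore_deletes is enabled."""
    if not ignore_deletes:
        return list(entries)
    return [e for e in entries if e["op"] not in DELETE_OPS]
```

With the default of False, every operation is passed through unchanged.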
| |
Unless change_detector is set to xsync, do not fall back to
xsync, except during Initial Sync or Partial History.
When a brick goes down, a Changelog exception is raised, due
to which geo-rep falls back to xsync. Even after the brick comes
back, geo-rep will not consume Changelog.
BUG: 1202649
Change-Id: I1f8ea26ac7735f6ee09b3b143ee3eb66bfc9fc37
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/9758
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
| |
The index value for UNLINK and RMDIR in changelog
is no longer the last index. It varies based on whether
'changelog.capture-del-path' is enabled or not.
Hence, a fixed index is used.
The option to capture deleted path in changelog comes
with the patch: http://review.gluster.org/#/c/10288/
And the parser changes with http://review.gluster.org/#/c/10166/
Change-Id: Idc1a2e2bf90c888be4524d3ce74865aea09485de
BUG: 1214561
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/10344
Tested-by: NetBSD Build System
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introducing configurable option to log the rsync performance.
gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> \
config log-rsync-performance true
Default value is False.
Example log:
[2015-03-31 16:48:34.572022] I [resource(/bricks/b1):857:rsync] SSH: rsync
performance: Number of files: 2 (reg: 1, dir: 1), Number of regular files
transferred: 1, Total file size: 178 bytes, Total transferred file
size: 178 bytes, Literal data: 178 bytes, Matched data: 0 bytes,
Total bytes sent: 294, Total bytes received: 32, sent 294 bytes
received 32 bytes 652.00 bytes/sec
Change-Id: If11467e29e6ac502fa114bd5742a8434b7084f98
Signed-off-by: Aravinda VK <avishwan@redhat.com>
BUG: 764827
Reviewed-on: http://review.gluster.org/10070
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for ACLS. When it sees SETXATTR
in Changelog, it adds the file to data queue. rsync/tar+ssh
will take care of syncing ACLS. User set ACLS will be
synced to Slave.
This requires "system.posix_acl_access" to go through when
client-pid is equal to GF_CLIENT_PID_GSYNCD in fuse layer.
A new config option, sync-acls, is introduced. It can be set
using geo-rep config (default is True):
gluster volume geo-replication <VOLUME> <SLAVEHOST>::<SLAVEVOL> \
config sync-acls false
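One way to picture the effect of sync-acls is as a switch on the sync command line. The sketch below builds a simplified rsync argument list and adds rsync's --acls flag only when the option is on. The base arguments and the function name are assumptions; they are not the exact command geo-rep runs.

```python
def rsync_command(sync_acls=True):
    """Build a simplified rsync command line (illustrative only)."""
    args = ["rsync", "--archive"]
    if sync_acls:
        # --acls asks rsync to preserve POSIX ACLs on transferred files.
        args.append("--acls")
    return args


default_cmd = rsync_command()
no_acl_cmd = rsync_command(sync_acls=False)
```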
Change-Id: I7eb3523fa72b8fed830efc98138891244e830d65
BUG: 1187021
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/10001
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the RPC based changes to {libgf}changelog, changelog_init
is required before changelog_register.
Change-Id: Id125b2bd2e51aaaffa22ecab463dfb739c50d83c
Signed-off-by: Venky Shankar <vshankar@redhat.com>
BUG: 1170075
Reviewed-on: http://review.gluster.org/9993
Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Tested-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With trash feature, .trashcan directory gets created
at each export directory. Xsync picks .trashcan to sync
and fails with EPERM. Xsync should ignore .trashcan
directory.
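A minimal sketch of a crawl that skips .trashcan at the top of the export directory, using os.walk pruning. This illustrates the idea only; Xsync's real crawl is not based on os.walk, and the names below are assumptions.

```python
import os

IGNORED_TOP_LEVEL_DIRS = {".trashcan"}


def crawl(root):
    """Return relative paths of files under root, skipping ignored dirs."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath == root:
            # Prune in place so os.walk never descends into .trashcan.
            dirnames[:] = [d for d in dirnames
                           if d not in IGNORED_TOP_LEVEL_DIRS]
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.append(os.path.relpath(full, root))
    return sorted(found)
```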
Change-Id: I45bd226c96011ace2c40dd2de878d886c7d34ce5
BUG: 1203293
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/9934
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the RPC based changes to {libgf}changelog, loading shared
objects dynamically would need symbols to be available from
other shared libraries. As an example, creating an RPC listener
loads the RPC transport shared object which requires symbols
to be available from already loaded shared objects.
Using RTLD_GLOBAL makes the symbols available for symbol
resolution of subsequently loaded libraries.
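From Python, a shared library can be loaded with RTLD_GLOBAL through ctypes, roughly as sketched below. The helper name is an assumption, and which library geo-rep loads this way is not shown here.

```python
import ctypes


def load_with_global_symbols(path):
    """Load a shared object so that its symbols are visible to
    libraries loaded afterwards (dlopen with RTLD_GLOBAL)."""
    return ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)
```

Without RTLD_GLOBAL, symbols from the loaded object are not made available for resolving references in objects that are loaded later, which is the failure mode described above.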
Change-Id: I3d3ef790eded82911f05836c707509157680645c
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9814
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use register time(xsync_upper_limit) only for stime update, do not
use for change detection.
Problem 1:
If a file was created before geo-rep, the xtime xattr does not exist.
Geo-rep sets the xtime of the file to the current time if it does not exist.
xtime > upper_limit so geo-rep will not pick those files. Changelog
either will have SETXATTR, and fails to sync the file.
Problem 2:
If a file is created before geo-rep create and updated after
geo-rep start. xtime of the file is greater than upper limit(geo-rep
start time/changelog register time). Geo-rep(XSync) will not pick this
file for syncing. Changelog will have only DATA recorded for that file.
Geo-rep tries DATA without any ENTRY ops and fails with rsync error.
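The separation described above can be sketched as two small functions: change detection compares xtime with the last synced time only, while the upper limit only caps the stime that gets recorded. Times are modelled as (seconds, nanoseconds) tuples, and the function names and the use of min() are assumptions for illustration.

```python
def needs_sync(xtime, stime):
    """Change detection: pick the file if it changed after the last sync.
    The upper limit plays no part here."""
    return stime is None or xtime > stime


def stime_to_record(xtime, upper_limit=None):
    """stime update: never record a time beyond the register time."""
    if upper_limit is None:
        return xtime
    return min(xtime, upper_limit)


register_time = (1000, 0)   # hypothetical changelog register time
file_xtime = (1500, 0)      # file touched after the register time
last_stime = (900, 0)

picked = needs_sync(file_xtime, last_stime)
recorded = stime_to_record(file_xtime, register_time)
```

In this example the file is still picked for syncing even though its xtime is above the upper limit, which is the case both problems above describe.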
BUG: 1200733
Change-Id: Ie4e8f284db689d2c755ef8e7ecbb658db1c0785f
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/9855
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Tested-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CURRENT DESIGN AND ITS LIMITATIONS:
-----------------------------------
Geo-replication syncs changes across geography using changelogs captured
by changelog translator. Changelog translator sits on server side just
above posix translator. Hence, in distributed replicated setup, both
replica pairs collect changelogs w.r.t their bricks. Geo-replication
syncs the changes using only one brick among the replica pair at a time,
calling it "ACTIVE" and the other, non-syncing brick "PASSIVE".
Let's consider the below example of a distributed replicated setup where
NODE-1 has b1 and its replicated brick b1r is in NODE-2
NODE-1 NODE-2
b1 b1r
At the beginning, geo-replication chooses to sync changes from NODE-1:b1
and NODE-2:b1r will be "PASSIVE". The logic depends on virtual getxattr
'trusted.glusterfs.node-uuid' which always returns first up subvolume
i.e., NODE-1. When NODE-1 goes down, the above xattr returns NODE-2 and
that is made 'ACTIVE'. But when NODE-1 comes back again, the above xattr
returns NODE-1 and it is made 'ACTIVE' again. So for a brief interval of
time, if NODE-2 had not finished processing the changelog, both NODE-2
and NODE-1 will be ACTIVE causing rename race as mentioned in the bug.
SOLUTION:
---------
1. Have a shared replicated storage, a glusterfs management volume specific
to geo-replication.
2. Geo-rep creates a file per replica set on management volume.
3. fcntl lock on the above said file is used for synchronization
between geo-rep workers belonging to same replica set.
4. If management volume is not configured, geo-replication will fall back
to the previous logic of using the first up subvolume.
Each worker tries to lock the file on shared storage; whoever wins will
be ACTIVE. With this, we are able to solve the problem, but there is an
issue when the shared replicated storage goes down (when all replicas
go down). In that case, the lock state is lost. So AFR needs to rebuild the
lock state after the brick comes up.
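The locking scheme above can be sketched with a non-blocking fcntl lock. This is a minimal illustration, assuming a lock file path on the mounted management volume; the function name and path are hypothetical, not taken from the patch.

```python
import errno
import fcntl
import os


def try_become_active(lock_path):
    """Try to take an exclusive fcntl lock on the per-replica-set file.

    Returns the open file descriptor if this worker won the lock (ACTIVE),
    or None if another process already holds it (PASSIVE).
    """
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as e:
        os.close(fd)
        if e.errno in (errno.EACCES, errno.EAGAIN):
            return None
        raise
    return fd
```

The lock is held only while the descriptor stays open, so an ACTIVE worker would keep the returned descriptor for its lifetime. When that process exits, the lock is released and another worker of the same replica set can win it on its next attempt.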
NOTE:
-----
This patch brings in the prerequisite step of setting up a management volume
for geo-replication during creation.
1. Create mgmt-vol for geo-replication and start it. Management volume should
be part of master cluster and recommended to be three way replicated
volume having each brick in different nodes for availability.
2. Create geo-rep session.
3. Configure mgmt-vol created with geo-replication session as follows.
gluster vol geo-rep <mastervol> slavenode::<slavevol> config meta_volume \
<meta-vol-name>
4. Start geo-rep session.
Backward Compatibility:
-----------------------
If management volume is not configured, it falls back to previous logic of
using node-uuid virtual xattr. But it is not recommended.
Change-Id: I7319d2289516f534b69edd00c9d0db5a3725661a
BUG: 1196632
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/9759
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
shutil.rmtree was failing when a file did not exist.
Added an error handling function to ignore ENOENT if
a file/dir is not present.
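A minimal sketch of such an error handler, using the onerror callback of shutil.rmtree. The function names are assumptions, and newer Python versions (3.12 and later) prefer the onexc parameter over onerror.

```python
import errno
import shutil


def ignore_enoent(func, path, exc_info):
    """rmtree error callback: swallow ENOENT, re-raise anything else."""
    exc = exc_info[1]
    if isinstance(exc, OSError) and exc.errno == errno.ENOENT:
        return
    raise exc


def remove_tree(path):
    """Remove a directory tree, tolerating entries that are already gone."""
    shutil.rmtree(path, onerror=ignore_enoent)
```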
BUG: 1198101
Change-Id: I1796db2642f81d9e2b5e52c6be34b4ad6f1c9786
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/9792
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Prashanth Pai <ppai@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for xattrs. When it sees SETXATTR
in Changelog, it adds the file to data queue. rsync/tar+ssh
will take care of syncing xattrs. User set xattrs will be
synced to Slave.
A new config option, sync-xattrs, is introduced. It can be set
using geo-rep config (default is True):
gluster volume geo-replication <VOLUME> <SLAVEHOST>::<SLAVEVOL> \
config sync-xattrs false
Change-Id: I70626d854a0d616469dd54d61e5ef155ed8b67d8
BUG: 1196690
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/9499
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With this patch,
- Hybrid Crawl will not generate empty Changelogs
- Archives Changelogs when processed (Hybrid (XSync), History,
and Changelog Crawl)
- Passive worker cleans up its processing directory
BUG: 1169331
Change-Id: I1383ffaed261cdf50da91b14260b4d43177657d1
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/9453
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: Iacc67e4ba9ac45e0858f3befe84ffb8fccf7e1c3
BUG: 1075417
Signed-off-by: arao <arao@redhat.com>
Reviewed-on: http://review.gluster.org/9502
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com>
|