<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/tests/geo-rep.rc, branch v7.5</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>tests/geo-rep: Add geo-rep cli testcases</title>
<updated>2019-06-06T05:17:07+00:00</updated>
<author>
<name>Kotresh HR</name>
<email>khiremat@redhat.com</email>
</author>
<published>2019-06-04T09:40:39+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=76673e983b7eb849011b3fb4417cf1d5f7147352'/>
<id>76673e983b7eb849011b3fb4417cf1d5f7147352</id>
<content type='text'>
Change-Id: Icf93b90bcac022a355d4718220698987dbc91ecf
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
updates: bz#1693692
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change-Id: Icf93b90bcac022a355d4718220698987dbc91ecf
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
updates: bz#1693692
</pre>
</div>
</content>
</entry>
<entry>
<title>tests/geo-rep: Add geo-rep glusterd test cases</title>
<updated>2019-06-04T06:24:38+00:00</updated>
<author>
<name>Kotresh HR</name>
<email>khiremat@redhat.com</email>
</author>
<published>2019-06-03T07:55:16+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=d8bb432eb776f3a8708ed6cacf1c19fca9524d51'/>
<id>d8bb432eb776f3a8708ed6cacf1c19fca9524d51</id>
<content type='text'>
1. Add geo-rep fanout test case
2. Add glusterd geo-rep negative test cases
3. Add glusterd geo-rep config test cases

Change-Id: I856c087eb3216d8f0ffd1f266deac88e9a4effec
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
updates: bz#1693692
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
1. Add geo-rep fanout test case
2. Add glusterd geo-rep negative test cases
3. Add glusterd geo-rep config test cases

Change-Id: I856c087eb3216d8f0ffd1f266deac88e9a4effec
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
updates: bz#1693692
</pre>
</div>
</content>
</entry>
<entry>
<title>geo-rep: Fix sync hang with tarssh</title>
<updated>2019-05-13T11:06:41+00:00</updated>
<author>
<name>Kotresh HR</name>
<email>khiremat@redhat.com</email>
</author>
<published>2019-05-08T05:56:06+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=f0d3690e8916cfb5e10a0df2e9721a0fd079bfce'/>
<id>f0d3690e8916cfb5e10a0df2e9721a0fd079bfce</id>
<content type='text'>
Problem:
Geo-rep sync hangs when tarssh is used as sync
engine at heavy workload.

Analysis and Root cause:
It's found out that the tar process was hung.
When debugged further, it's found out that stderr
buffer of tar process on master was full i.e., 64k.
When the buffer was copied to a file from /proc/pid/fd/2,
the hang is resolved.

This can happen when files picked by tar process
to sync doesn't exist on master anymore. If this count
increases around 1k, the stderr buffer is filled up.

Fix:
The tar process is executed using Popen with stderr as PIPE.
The final execution is something like below.

tar | ssh &lt;args&gt; root@slave tar --overwrite -xf - -C &lt;path&gt;

It was waiting on ssh process first using communicate() and then tar.
Note that communicate() reads stdout and stderr. So when stderr of tar
process is filled up, there is no one to read until untar via ssh is
completed. This can't happen and leads to deadlock.
Hence we should be waiting on both process parallely, so that stderr is
read on both processes.

Change-Id: I609c7cc5c07e210c504771115b4d551a2e891adf
fixes: bz#1707728
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
Geo-rep sync hangs when tarssh is used as sync
engine at heavy workload.

Analysis and Root cause:
It's found out that the tar process was hung.
When debugged further, it's found out that stderr
buffer of tar process on master was full i.e., 64k.
When the buffer was copied to a file from /proc/pid/fd/2,
the hang is resolved.

This can happen when files picked by tar process
to sync doesn't exist on master anymore. If this count
increases around 1k, the stderr buffer is filled up.

Fix:
The tar process is executed using Popen with stderr as PIPE.
The final execution is something like below.

tar | ssh &lt;args&gt; root@slave tar --overwrite -xf - -C &lt;path&gt;

It was waiting on ssh process first using communicate() and then tar.
Note that communicate() reads stdout and stderr. So when stderr of tar
process is filled up, there is no one to read until untar via ssh is
completed. This can't happen and leads to deadlock.
Hence we should be waiting on both process parallely, so that stderr is
read on both processes.

Change-Id: I609c7cc5c07e210c504771115b4d551a2e891adf
fixes: bz#1707728
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tests/geo-rep: Fix arequal checksum comparison</title>
<updated>2019-05-09T05:16:51+00:00</updated>
<author>
<name>Kotresh HR</name>
<email>khiremat@redhat.com</email>
</author>
<published>2019-05-08T08:40:05+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=288cffd1ab7180cccfcdea36d0c469b9fa52108f'/>
<id>288cffd1ab7180cccfcdea36d0c469b9fa52108f</id>
<content type='text'>
The arequal checkusm comparison was always returning
as successful, eventhough, if it was not. Fixed the same.

Change-Id: I5083da25c0954126e452d06311d2d376f8540555
fixes: bz#1707742
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The arequal checkusm comparison was always returning
as successful, eventhough, if it was not. Fixed the same.

Change-Id: I5083da25c0954126e452d06311d2d376f8540555
fixes: bz#1707742
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>geo-rep: Fix rename with existing destination with same gfid</title>
<updated>2019-04-26T07:15:00+00:00</updated>
<author>
<name>Sunny Kumar</name>
<email>sunkumar@redhat.com</email>
</author>
<published>2019-04-02T07:08:09+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=e7e89c9ec8b56ad5a442ad105c0b05e674a591cd'/>
<id>e7e89c9ec8b56ad5a442ad105c0b05e674a591cd</id>
<content type='text'>
Problem:
   Geo-rep fails to sync the rename properly if destination exists.
It results in source to be remained on slave causing more number of
files on slave. Also heavy rename workload like logrotate caused
lot of ESTALE errors

Cause:
   Geo-rep fails to sync rename if destination exists if creation
of source file also falls into single batch of changelogs being
processed. This is because, after fixing problematic gfids verifying
from master, while re-processing original entries, CREATE also was
re-processed causing more files on slave and rename to be failed.

Solution:
   Entries need to be removed from retrial list after fixing
problematic gfids on slave so that it's not re-created again on slave.
   Also treat ESTALE as EEXIST so that the error is properly handled
verifying the op on master volume.

Change-Id: I50cf289e06b997adddff0552bf2466d9201dd1f9
fixes: bz#1694820
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
Signed-off-by: Sunny Kumar &lt;sunkumar@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
   Geo-rep fails to sync the rename properly if destination exists.
It results in source to be remained on slave causing more number of
files on slave. Also heavy rename workload like logrotate caused
lot of ESTALE errors

Cause:
   Geo-rep fails to sync rename if destination exists if creation
of source file also falls into single batch of changelogs being
processed. This is because, after fixing problematic gfids verifying
from master, while re-processing original entries, CREATE also was
re-processed causing more files on slave and rename to be failed.

Solution:
   Entries need to be removed from retrial list after fixing
problematic gfids on slave so that it's not re-created again on slave.
   Also treat ESTALE as EEXIST so that the error is properly handled
verifying the op on master volume.

Change-Id: I50cf289e06b997adddff0552bf2466d9201dd1f9
fixes: bz#1694820
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
Signed-off-by: Sunny Kumar &lt;sunkumar@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>geo-rep: Fix syncing multiple rename of symlink</title>
<updated>2019-03-29T07:23:41+00:00</updated>
<author>
<name>Kotresh HR</name>
<email>khiremat@redhat.com</email>
</author>
<published>2019-03-28T11:17:16+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=877af725b3e35b548d6d7aeec5adb21721d8bf8b'/>
<id>877af725b3e35b548d6d7aeec5adb21721d8bf8b</id>
<content type='text'>
Problem:
Geo-rep fails to sync rename of symlink if it's
renamed multiple times if creation and rename
happened successively

Worker crash at slave:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py",  in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", in entry_ops
    [ESTALE, EINVAL, EBUSY])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", in errno_wrap
    return call(*arg)
  File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", in lsetxattr
    cls.raise_oserr()
  File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", in raise_oserr
    raise OSError(errn, os.strerror(errn))
OSError: [Errno 12] Cannot allocate memory

Geo-rep Behaviour:
1. SYMLINK doesn't record target path in changelog.
   So while syncing SYMLINK, readlink is done on
   master to get target path.

2. Geo-rep will create destination if source is not
   present while syncing RENAME. Hence while syncing
   RENAME of SYMLINK, target path is collected from
   destination.

Cause:
If symlink is created and renamed multiple times, creation of
symlink is ignored, as it's no longer present on master at
that path. While symlink is renamed multiple times at master,
when syncing first RENAME of SYMLINK, both source and destination
is not present, hence target path is not known.  In this case,
while creating destination directly at slave,  regular file
attributes were encoded into blob instead of symlink,
causing failure in gfid-access translator while decoding
blob.

Solution:
While syncing of RENAME of SYMLINK, when target is not known
and when src and destination is not present on the master,
don't create destination. Ignore the rename. It's ok to ignore.
If it's unliked, it's fine.  If it's renamed to something else,
it will be synced then.

Change-Id: Ibdfa495513b7c05b5370ab0b89c69a6802338d87
fixes: bz#1693648
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
Geo-rep fails to sync rename of symlink if it's
renamed multiple times if creation and rename
happened successively

Worker crash at slave:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py",  in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", in entry_ops
    [ESTALE, EINVAL, EBUSY])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", in errno_wrap
    return call(*arg)
  File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", in lsetxattr
    cls.raise_oserr()
  File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", in raise_oserr
    raise OSError(errn, os.strerror(errn))
OSError: [Errno 12] Cannot allocate memory

Geo-rep Behaviour:
1. SYMLINK doesn't record target path in changelog.
   So while syncing SYMLINK, readlink is done on
   master to get target path.

2. Geo-rep will create destination if source is not
   present while syncing RENAME. Hence while syncing
   RENAME of SYMLINK, target path is collected from
   destination.

Cause:
If symlink is created and renamed multiple times, creation of
symlink is ignored, as it's no longer present on master at
that path. While symlink is renamed multiple times at master,
when syncing first RENAME of SYMLINK, both source and destination
is not present, hence target path is not known.  In this case,
while creating destination directly at slave,  regular file
attributes were encoded into blob instead of symlink,
causing failure in gfid-access translator while decoding
blob.

Solution:
While syncing of RENAME of SYMLINK, when target is not known
and when src and destination is not present on the master,
don't create destination. Ignore the rename. It's ok to ignore.
If it's unliked, it's fine.  If it's renamed to something else,
it will be synced then.

Change-Id: Ibdfa495513b7c05b5370ab0b89c69a6802338d87
fixes: bz#1693648
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>geo-rep: Make slave volume read-only (by default)</title>
<updated>2018-12-07T09:37:54+00:00</updated>
<author>
<name>Harpreet Kaur</name>
<email>hlalwani@redhat.com</email>
</author>
<published>2018-11-28T08:36:36+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=6e92171117c73d7a8901800299446a131e54b597'/>
<id>6e92171117c73d7a8901800299446a131e54b597</id>
<content type='text'>
Added a command to set "features.read-only" option
to a default value "on" for slave volume.
Changes are made in:
$SRC//extras/hook-scripts/S56glusterd-geo-rep-create-post.sh
for root geo-rep and
$SRC/geo-replication/src/set_geo_rep_pem_keys.sh
for non-root geo-rep.

Fixes: bz#1654187

Change-Id: I15beeae3506f3f6b1dcba0a5c50b6344fd468c7c
Signed-off-by: Harpreet Kaur &lt;hlalwani@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Added a command to set "features.read-only" option
to a default value "on" for slave volume.
Changes are made in:
$SRC//extras/hook-scripts/S56glusterd-geo-rep-create-post.sh
for root geo-rep and
$SRC/geo-replication/src/set_geo_rep_pem_keys.sh
for non-root geo-rep.

Fixes: bz#1654187

Change-Id: I15beeae3506f3f6b1dcba0a5c50b6344fd468c7c
Signed-off-by: Harpreet Kaur &lt;hlalwani@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tests/geo-rep: Mask failure of geo-rep arbiter test</title>
<updated>2018-12-05T15:34:52+00:00</updated>
<author>
<name>Kotresh HR</name>
<email>khiremat@redhat.com</email>
</author>
<published>2018-12-04T10:01:54+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=382ad46bbcb79344285d5cb1fe3ce85f83317c44'/>
<id>382ad46bbcb79344285d5cb1fe3ce85f83317c44</id>
<content type='text'>
Comment out the particular test which is failing
arbitrarily. Also changed the code to differentiate
error cases. There could be some race because of
which it's failing arbitrarily. This will be debugged
and fixed in separate patch.

Change-Id: I925df6421737d7a9abd9446a9d85029b4285ad2c
updates: bz#1193929
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Comment out the particular test which is failing
arbitrarily. Also changed the code to differentiate
error cases. There could be some race because of
which it's failing arbitrarily. This will be debugged
and fixed in separate patch.

Change-Id: I925df6421737d7a9abd9446a9d85029b4285ad2c
updates: bz#1193929
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>geo-rep: Fix syncing of files with non-ascii filenames</title>
<updated>2018-12-04T09:15:44+00:00</updated>
<author>
<name>Kotresh HR</name>
<email>khiremat@redhat.com</email>
</author>
<published>2018-11-17T07:44:24+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=54ebd47e4154c8b37680b2bac43775fc92ced153'/>
<id>54ebd47e4154c8b37680b2bac43775fc92ced153</id>
<content type='text'>
Problem:
  Creation of files/directories with non-ascii names fails
  to sync to the slave. It crashes with below traceback on
  slave.
  ...
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/repce.py", line 118, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/resource.py", line 709, in entry_ops
    [ESTALE, EINVAL, EBUSY])
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/syncdutils.py", line 546, in errno_wrap
    return call(*arg)
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/libcxattr.py", line 83, in lsetxattr
    cls.raise_oserr()
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/libcxattr.py", line 38, in raise_oserr
    raise OSError(errn, os.strerror(errn))
  OSError: [Errno 12] Cannot allocate memory

Cause:
  The length calculation arguments passed to blob creation was done before encoding. Hence
  was failing in gfid-access layer.

Fix:
  It appears that the calculating lenght properly fixes this issue. But it will cause
  issues in other places in 'python2' and not in 'python3'. So encoding and decoding
  each required string to make geo-rep compatible with both 'python2' and 'python3'
  is a nightmare and is not fool proof. Hence kept 'python2' code as is with out
  encode/decode and applied encode/decode only to 'python3'

Added non-ascii filename tests to regression

fixes: bz#1650893
Change-Id: I35cfaf848e07b1a0b5cb93c01b98b472f08271a6
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
  Creation of files/directories with non-ascii names fails
  to sync to the slave. It crashes with below traceback on
  slave.
  ...
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/repce.py", line 118, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/resource.py", line 709, in entry_ops
    [ESTALE, EINVAL, EBUSY])
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/syncdutils.py", line 546, in errno_wrap
    return call(*arg)
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/libcxattr.py", line 83, in lsetxattr
    cls.raise_oserr()
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/libcxattr.py", line 38, in raise_oserr
    raise OSError(errn, os.strerror(errn))
  OSError: [Errno 12] Cannot allocate memory

Cause:
  The length calculation arguments passed to blob creation was done before encoding. Hence
  was failing in gfid-access layer.

Fix:
  It appears that the calculating lenght properly fixes this issue. But it will cause
  issues in other places in 'python2' and not in 'python3'. So encoding and decoding
  each required string to make geo-rep compatible with both 'python2' and 'python3'
  is a nightmare and is not fool proof. Hence kept 'python2' code as is with out
  encode/decode and applied encode/decode only to 'python3'

Added non-ascii filename tests to regression

fixes: bz#1650893
Change-Id: I35cfaf848e07b1a0b5cb93c01b98b472f08271a6
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>geo-rep/hook-script: Fix ssh/scp options</title>
<updated>2018-08-03T03:18:58+00:00</updated>
<author>
<name>Kotresh HR</name>
<email>khiremat@redhat.com</email>
</author>
<published>2018-07-31T14:27:03+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=9e96d646537fd060ca932291aaa0c4a3ce942b67'/>
<id>9e96d646537fd060ca932291aaa0c4a3ce942b67</id>
<content type='text'>
Always use ssh and scp with "-oPasswordAuthentication=no"
and "-oStrictHostKeyChecking=no" options. It might hang
the post script otherwise leading geo-rep setup failure

Also increased geo-rep timeout. Occasionally, it's taking
more time to reach Active/Passive status. Especially, the
first start after create.

fixes: bz#1610405
Change-Id: I9560d64dbe0edf5db73446a9fc97dda19b88d233
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Always use ssh and scp with "-oPasswordAuthentication=no"
and "-oStrictHostKeyChecking=no" options. It might hang
the post script otherwise leading geo-rep setup failure

Also increased geo-rep timeout. Occasionally, it's taking
more time to reach Active/Passive status. Especially, the
first start after create.

fixes: bz#1610405
Change-Id: I9560d64dbe0edf5db73446a9fc97dda19b88d233
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
