BUG: 1028673
Change-Id: I9ba8e3e6cf2f888640b4d2a2eb934a27ff903c42
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Reviewed-on: http://review.gluster.org/6290
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
Fix the Coverity issue introduced by the "RFE: NFS volume set/reset"
commit, i.e. http://review.gluster.org/6236
Change-Id: I817b9da03a3ce7f5511303faea0c50dfdad60ff4
BUG: 1027409
Signed-off-by: Santosh Kumar Pradhan <spradhan@redhat.com>
Reviewed-on: http://review.gluster.org/6307
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
Gluster was starting rebalance processes on peers where they weren't
required, in two cases:
- For a normal rebalance command on a volume, rebalance processes were
started on all peers instead of just the peers which contain bricks of
the volume.
- For rebalance processes being restarted by a volume sync, caused by a
new peer being probed or a peer restarting, rebalance processes were
started on all peers, for both a normal rebalance and for a remove-brick
needing rebalance.
This patch adds a new check before starting a rebalance process in the
above two cases (a sketch of the check follows the list):
- For a rebalance process required by a rebalance command, each peer will
check if it contains at least one brick of the volume.
- For a rebalance process required by a remove-brick command, each peer
will check if it contains at least one of the bricks being removed.
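A minimal sketch of such a peer-side check, with illustrative names
rather than the exact glusterd symbols:

static gf_boolean_t
peer_hosts_brick_of_volume (glusterd_volinfo_t *volinfo, uuid_t my_uuid)
{
        glusterd_brickinfo_t *brickinfo = NULL;

        /* Walk the volume's brick list; a rebalance process is needed
         * on this peer only if at least one brick lives here. */
        list_for_each_entry (brickinfo, &volinfo->bricks, brick_list) {
                if (!uuid_compare (brickinfo->uuid, my_uuid))
                        return _gf_true;
        }
        return _gf_false;
}

For the remove-brick case, the same walk would run over only the bricks
being removed.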
Change-Id: I512da16994f0d5482889c3a009c46dc20a8a15bb
BUG: 1031887
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/6301
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
Change-Id: I2210f1ac7de04c6025c0ec02d998b626d41466ae
BUG: 1028672
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/6303
Reviewed-by: M. Mohan Kumar <mohan@in.ibm.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
This patch adds a <peerid> tag for bricks and services like nfs/shd to
the volume status xml output.
BUG: 955548
Change-Id: I9aaa9266e4d56f632235eaeef565e92d757c0694
Signed-off-by: Bala.FA <barumuga@redhat.com>
Reviewed-on: http://review.gluster.org/6162
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
Implement reconfigure() for the NFS xlator so that volume set/reset won't
restart the NFS server process. A few options, e.g. nfs.mem-factor and
nfs.port, cannot be reconfigured dynamically and still need NFS to be
restarted (a sketch of the pattern follows).
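A minimal sketch of the reconfigure() pattern, assuming the glusterfs
xlator headers; the option and struct field shown are illustrative, not
the patch's exact set:

int
reconfigure (xlator_t *this, dict_t *options)
{
        struct nfs_state *nfs = this->private;
        int               ret = -1;

        /* Re-read runtime-tunable options in place; options not
         * handled here (e.g. nfs.port) still require a restart. */
        GF_OPTION_RECONF ("nfs.enable-ino32", nfs->enable_ino32,
                          options, bool, out);

        ret = 0;
out:
        return ret;
}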
Change-Id: Ic586fd55b7933c0a3175708d8c41ed0475d74a1c
BUG: 1027409
Signed-off-by: Santosh Kumar Pradhan <spradhan@redhat.com>
Reviewed-on: http://review.gluster.org/6236
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
.. in systems with a non-trusted server.
This new functionality can be useful in various cloud technologies.
It is implemented via a special encryption/crypt translator, which
works on the client side and performs encryption and authentication.
1. Class of supported algorithms
The crypt translator can support any atomic symmetric block cipher
algorithm, i.e. one which requires plain/cipher text to be padded
before performing the encryption/decryption transform (see the
glossary in atom.c for definitions). In particular, it can support
algorithms with the EOF issue, which require the end of file to be
padded with extra data.
The crypt translator performs the translations
user -> (offset, size) -> (aligned-offset, padded-size) -> server
(and backward), and resolves individual FOPs (write(), truncate(),
etc.) to read-modify-write sequences (see the sketch below).
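A minimal sketch of that offset/size translation, assuming a
power-of-two atom size (illustrative code, not the translator's actual
implementation):

#include <stdint.h>

static void
align_request (uint64_t offset, uint64_t size, uint64_t atom_size,
               uint64_t *aligned_offset, uint64_t *padded_size)
{
        uint64_t end = offset + size;

        /* Round the start down and the end up to atom boundaries,
         * so the server-side request covers whole atoms only. */
        *aligned_offset = offset & ~(atom_size - 1);
        *padded_size = ((end + atom_size - 1) & ~(atom_size - 1)) -
                       *aligned_offset;
}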
A volume can contain files encrypted by different algorithms of the
mentioned class. To change some option value, just reconfigure the
volume.
Currently only one algorithm is supported: AES_XTS.
Examples of algorithms which cannot be supported by the crypt
translator:
1. Asymmetric block cipher algorithms, which inflate data, e.g. RSA;
2. Symmetric block cipher algorithms with inline MACs for data
authentication.
2. Implementation notes.
a) Atomic algorithms
Since any process in a stackable file system manipulates local
data (which can be obsoleted by the local data of another process), any
atomic cipher algorithm without proper support can lead to non-POSIX
behavior. To resolve the "collisions" we introduce locks: before
performing FOP->read(), FOP->write(), etc., the process should first
lock the file.
b) Algorithms with EOF issue
Such algorithms require the end of file to be padded with some extra data.
Without proper support this will result in losing information about the
real file size. Keeping track of the real file size is the responsibility
of the crypt translator. A special extended attribute with the name
"trusted.glusterfs.crypt.att.size" is used for this purpose. All files
contained in bricks of an encrypted volume have "padded" sizes.
3. Non-trusted servers and metadata authentication
We assume that the server where the user's data is stored is non-trusted.
It means that the server can be subjected to various attacks directed
at revealing the user's encrypted personal data. We provide protection
against such attacks.
Every encrypted file has specific private attributes (cipher algorithm
id, atom size, etc.), which are packed into a string (the so-called
"format string") and stored as a special extended attribute with the name
"trusted.glusterfs.crypt.att.cfmt". We protect the string from
tampering. This protection is mandatory, hardcoded and is always on.
Without such protection various attacks (based on extending the scope
of per-file secret keys) are possible.
Our authentication method has been developed in tight collaboration
with the Red Hat security team and is implemented as "metadata loader of
version 1" (see file metadata.c). This method is NIST-compliant and is
based on checking 8-byte per-hardlink MACs created (updated) by
FOP->create(), FOP->link(), FOP->unlink(), FOP->rename() over the
following unique entities:
. file (hardlink) name;
. verified file's object id (gfid).
Every time, before manipulating a file, we check its MACs at
FOP->open() time. Some FOPs don't require a file to be opened (e.g.
FOP->truncate()). In such cases the crypt translator opens the file
itself.
4. Generating keys
Unique per-file keys are derived by NIST-compliant methods from:
a) the parent key;
b) the unique verified object-id of the file (gfid).
The per-volume master key, provided by the user at mount time, is at the
root of this "tree of keys" (see the sketch after this list).
Those keys are used to:
1) encrypt/decrypt file data;
2) encrypt/decrypt file metadata;
3) create per-file and per-link MACs for metadata authentication.
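An illustrative sketch of such a derivation, here using HMAC-SHA256
(via OpenSSL) purely as a stand-in for the NIST-compliant KDF the
translator actually uses:

#include <openssl/evp.h>
#include <openssl/hmac.h>

static void
derive_file_key (const unsigned char *parent_key, int parent_len,
                 const unsigned char gfid[16], unsigned char out[32])
{
        unsigned int out_len = 32;

        /* Per-file key = PRF(parent key, gfid): unique per object,
         * and reproducible from the master key at the tree's root. */
        HMAC (EVP_sha256 (), parent_key, parent_len,
              gfid, 16, out, &out_len);
}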
5. Instructions
Getting started with the crypt translator
Example:
1) Create a volume "myvol" and enable encryption:
# gluster volume create myvol pepelac:/vols/xvol
# gluster volume set myvol encryption on
2) Set location (absolute pathname) of your master key:
# gluster volume set myvol encryption.master-key /home/me/mykey
3) Set other options to override default options, if needed.
Start the volume.
4) On the client side make sure that the file /home/me/mykey exists
and contains the proper per-volume master key (that is, a 256-bit AES
key). This key has to be in hex form, i.e. should be represented
by 64 symbols from the set {'0', ..., '9', 'a', ..., 'f'}.
The key should start at the beginning of the file. All symbols at
offsets >= 64 are ignored.
5) Mount the volume "myvol" on the client side:
# glusterfs --volfile-server=pepelac --volfile-id=myvol /mnt
After a successful mount, the file which contains the master key may be
removed. NOTE: Keeping the master key between mount sessions is the
user's responsibility.
**********************************************************************
WARNING! Losing the master key will make the content of all regular files
inaccessible. Mounting with an improper master key allows access to the
content of directories: file names are not encrypted.
**********************************************************************
6. Options of the crypt translator
1) "master-key": specifies the location (absolute pathname) of the file
which contains the per-volume master key. There is no default location
for the master key.
2) "data-key-size": specifies the size of the per-file key for data
encryption. Possible values:
. "256" (default)
. "512"
3) "block-size": specifies the atom size. Possible values:
. "512"
. "1024"
. "2048"
. "4096" (default)
7. Test cases
Any workload which involves the following file operations:
->create();
->open();
->readv();
->writev();
->truncate();
->ftruncate();
->link();
->unlink();
->rename();
->readdirp().
8. TODOs:
1) Currently the size of IOs issued by the crypt translator is restricted
by block_size (4K by default). We can use larger IOs to improve
performance.
Change-Id: I2601fe95c5c4dc5b22308a53d0cbdc071d5e5cee
BUG: 1030058
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/4667
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Special xattr names "clone" & "snapshot" can be used to create full and
linked clones of the LV images. The GFID of the destination posix file (to
be mapped) is passed as the value of the xattr. The destination posix file
must exist before running this operation.
These operations form a basis for offloading storage-related operations
from QEMU to GlusterFS.
Syntax for full clone: xattr name: "clone" value: "gfid-of-dest-file"
Syntax for linked clone: xattr name: "snapshot" value: "gfid-of-dest-file"
Syntax for merging: xattr name: "merge" value: "path-to-snapshot-file"
Example:
setfattr -n clone -v <gfid-of-dest-file> /media/source
setfattr -n snapshot -v <gfid-of-dest-file> /media/source
setfattr -n merge -v "/media/sn" /media/sn
Change-Id: Id9f984a709d4c2e52a64ae75bb12a8ecb01f8776
BUG: 1028672
Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Reviewed-on: http://review.gluster.org/5626
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
The volume option bd-aio controls the AIO feature for the BD xlator.
Code taken from posix-aio.c.
Change-Id: Ib049bd59c9d3f9101d33939838322cfa808de053
BUG: 1028672
Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Reviewed-on: http://review.gluster.org/5748
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
The current BD xlator (block backend) has a few limitations, such as:
* Creation of directories is not supported
* Only a single brick is supported
* Extended attributes (and the client gfid) are not used, unlike the
posix xlator
* Creation of special files (symbolic links, device nodes etc.) is not
supported
The basic limitation of not allowing directory creation blocks oVirt/VDSM
from consuming the BD xlator as part of a Gluster domain, since VDSM
creates multi-level directories when GlusterFS is used as the storage
backend for storing VM images.
To overcome these limitations, a new BD xlator with the following
improvements is suggested:
* A new hybrid BD xlator that handles both regular files and block device
files
* The volume will have both POSIX and BD bricks. Regular files are
created on POSIX bricks; block devices are created on the BD brick (VG)
* The BD xlator leverages the existing POSIX xlator for most POSIX calls
and hence sits above the POSIX xlator
* A block device file is differentiated from a regular file by an extended
attribute
* The xattr 'user.glusterfs.bd' (BD_XATTR) plays a role in mapping a
posix file to a Logical Volume (LV).
* When a client sends a request to set BD_XATTR on a posix file, a new
LV is created and mapped to the posix file. So every block device will
have a representative file in the POSIX brick with 'user.glusterfs.bd'
(BD_XATTR) set.
* Hereafter, all operations on this file result in LV-related
operations.
For example, opening a file that has BD_XATTR set results in opening
the LV block device, and reading results in reading the corresponding LV
block device.
When the BD xlator gets a request to set BD_XATTR via a setxattr call, it
creates an LV, and information about this LV is placed in the xattr of the
posix file. The xattr "user.glusterfs.bd" is used to identify that a posix
file is mapped to a BD.
Usage:
Server side:
[root@host1 ~]# gluster volume create bdvol host1:/storage/vg1_info?vg1 host2:/storage/vg2_info?vg2
It creates a distributed gluster volume 'bdvol' with Volume Group vg1
using posix brick /storage/vg1_info in host1 and Volume Group vg2 using
/storage/vg2_info in host2.
[root@host1 ~]# gluster volume start bdvol
Client side:
[root@node ~]# mount -t glusterfs host1:/bdvol /media
[root@node ~]# touch /media/posix
It creates regular posix file 'posix' in either host1:/vg1 or host2:/vg2 brick
[root@node ~]# mkdir /media/image
[root@node ~]# touch /media/image/lv1
It also creates regular posix file 'lv1' in either host1:/vg1 or
host2:/vg2 brick
[root@node ~]# setfattr -n "user.glusterfs.bd" -v "lv" /media/image/lv1
[root@node ~]#
The above setxattr results in creating a new LV in the corresponding
brick's VG, and it sets 'user.glusterfs.bd' with the value
'lv:<default-extent-size>'
[root@node ~]# truncate -s5G /media/image/lv1
This results in resizing LV 'lv1' to 5G
The new BD xlator code is placed in the xlators/storage/bd directory.
A volume-uuid is also added to the VG so that the same VG can't be used
for other bricks/volumes. After deleting a gluster volume, one has to
manually remove the associated tag using vgchange <vg-name> --deltag
<trusted.glusterfs.volume-id:<volume-id>>
Changes from previous version V5:
* Removed support for delayed deleting of LVs
Changes from previous version V4:
* Consolidated the patches
* Removed usage of BD_XATTR_SIZE and consolidated it in BD_XATTR.
Changes from previous version V3:
* Added support in FUSE to support full/linked clone
* Added support to merge snapshots and provide information about origin
* bd_map xlator removed
* iatt structure used in inode_ctx. iatt is cached and updated during
fsync/flush
* aio support
* Type and capabilities of volume are exported through getxattr
Changes from version 2:
* Used inode_context for caching BD size and to check if loc/fd is BD or
not.
* Added GlusterFS server offloaded copy and snapshot through setfattr
FOP. As part of this libgfapi is modified.
* BD xlator supports stripe
* During unlinking, if an LV file is already opened, it's added to a
delete list and bd_del_thread tries to delete from this list when the
last reference to that file is closed.
Changes from previous version:
* gfid is used as name of LV
* ? is used to specify VG name for creating BD volume in volume
create, add-brick. gluster volume create volname host:/path?vg
* open-behind issue is fixed
* A replicate brick can be added dynamically and LVs from source brick
are replicated to destination brick
* A distribute brick can be added dynamically and rebalance operation
distributes existing LVs/files to the new brick
* Thin provisioning support added.
* bd_map xlator support retained
* setfattr -n user.glusterfs.bd -v "lv" creates a regular LV and
setfattr -n user.glusterfs.bd -v "thin" creates a thin LV
* Capability and backend information added to gluster volume info
(and --xml) so that management tools can exploit the BD xlator
* Tracing support for the BD xlator added
TODO:
* Add support to display snapshots for a given LV
* Display posix filename for list-origin instead of gfid
Change-Id: I00d32dfbab3b7c806e0841515c86c3aa519332f2
BUG: 1028672
Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Reviewed-on: http://review.gluster.org/4809
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
Remove bd_map xlator and CLI related changes.
Change-Id: If7086205df1907127c1a1fa4ba603f1c48421d09
BUG: 1028672
Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Reviewed-on: http://review.gluster.org/5747
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
* When a writev call occurs, the client compresses the data before
sending it to the server. On the server, the compressed data is
decompressed. Similarly, when a readv call occurs, the server compresses
the data before sending it to the client. On the client, the compressed
data is decompressed. Thus the amount of data sent over the wire is
minimized.
* Compression/decompression is done using the zlib library.
* During normal operation, this is the format of data sent over the wire:
<compressed-data> + trailer(8)
The trailer contains the CRC32 checksum and the length of the original
uncompressed data; these are used for validation (see the sketch below).
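An illustrative sketch of building such a trailer with zlib (the exact
wire layout and byte order are the xlator's own, not shown here):

#include <stdint.h>
#include <string.h>
#include <zlib.h>

/* Append the 8-byte trailer after the deflated payload: 4 bytes of
 * CRC32 over the original data, then 4 bytes of its length. */
static void
append_trailer (unsigned char *buf, size_t compressed_len,
                const unsigned char *orig, uint32_t orig_len)
{
        uint32_t crc = (uint32_t) crc32 (0L, orig, orig_len);

        memcpy (buf + compressed_len, &crc, sizeof (crc));
        memcpy (buf + compressed_len + sizeof (crc), &orig_len,
                sizeof (orig_len));
}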
HOW TO USE
----------
Turning on compression xlator:
gluster volume set <vol_name> compress on
Configurable options:
gluster volume set <vol_name> compress.compression-level 8
gluster volume set <vol_name> compress.min-size 50
Change-Id: Ib7a66b6f1f70fe002b7c513588cdf75c69370805
BUG: 923540
Original-author : Venky Shankar <vshankar@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Prashanth Pai <nullpai@gmail.com>
Signed-off-by: Prashanth Pai <ppai@redhat.com>
Reviewed-on: http://review.gluster.org/3251
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
In the current context "replica_cnt" is used just to know whether the
specific key exists or not by calling "dict_get_int32", which we can
replace with "dict_get ()" (see the sketch below). The log message is
also changed, as it is more appropriate to say "migration of data"
rather than "rebalance".
This patch refactors commit 51c6fa7a354826744de98 against BZ 961669,
reviewed on http://review.gluster.org/5566
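An illustrative before/after of that simplification (hypothetical key
and variable names; dict_get() returns a data_t pointer, so a NULL check
alone expresses key presence):

/* Before: the value is fetched only to test for presence. */
int32_t replica_cnt = 0;
int     ret = dict_get_int32 (dict, "replica-count", &replica_cnt);
if (ret == 0) {
        /* key present */
}

/* After: a presence check alone is enough. */
if (dict_get (dict, "replica-count")) {
        /* key present */
}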
Change-Id: I48eae206a28d4083975e64407ed8fe4539f9c24b
BUG: 1027270
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Original patch: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/6001
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: susant palai <spalai@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
This is (arguably) a hack to work around a bug in libvirt, which is not
well behaved with regard to using TCP ports in the unreserved space
between 49152-65535 (see RFC 6335).
Normally glusterd starts and binds to the first available port in range,
usually 49152. libvirt's live migration also tries to use ports in this
range, but has no fallback to use (an)other port(s) when the one it wants
is already in use.
Change-Id: Id8fe35c08b6ce4f268d46804bbb6dddab7a6b7bb
BUG: 1018178
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/6210
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
... so that subsequent volume commands don't block, waiting forever
for the lock to be released.
Change-Id: I24b5ec47f6982900ab74ff1b492d523f31ecfb7f
BUG: 1022055
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/6122
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
The quota volume reset command without the "force"
option is fixed and doesn't fail anymore. It resets
unprotected fields and not the protected ones.
Also, an appropriate message is provided to the user
for the following cases:
1. Only unprotected fields are reset; the "force" option
should be used to reset protected fields.
2. Both protected and unprotected fields are reset.
3. No field was reset; the "force" option is required.
A test case for the same is also added.
Change-Id: I24e8f1be87b79ccd81bf6f933e00608b861c7a16
BUG: 1022905
Signed-off-by: Anuradha <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/6135
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Implement a limit on the total number of outstanding RPC requests
from a given client. Once the limit is reached, the client socket
is removed from POLL-IN event polling (see the sketch below).
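A minimal sketch of the idea, with illustrative names rather than the
patch's exact symbols:

/* On receiving a request: stop polling for POLLIN at the limit. */
conn->outstanding++;
if (conn->outstanding >= conn->limit)
        socket_pollin_disable (conn);   /* hypothetical helper */

/* On submitting a reply: resume polling once below the limit. */
conn->outstanding--;
if (conn->outstanding < conn->limit)
        socket_pollin_enable (conn);    /* hypothetical helper (idempotent) */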
Change-Id: I8071b8c89b78d02e830e6af5a540308199d6bdcd
BUG: 1008301
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/6114
Reviewed-by: Santosh Pradhan <spradhan@redhat.com>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-by: Harshavardhana <harsha@harshavardhana.net>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
brick force.
The expectation with force is that the user is aware of the consequences
of sanity checks not being triggered.
Change-Id: I79dfeed16a23829a7217cef33ab83f9f0ffae336
Signed-off-by: Vijay Bellur <vbellur@redhat.com>
BUG: 1007509
Reviewed-on: http://review.gluster.org/5746
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
Glusterd changes:
With this patch, glusterd creates a socket file at
DATADIR/run/glusterd.socket and listens on it for cli requests
(a minimal listener sketch follows this section). It
serves 2 rpc programs on the socket file:
- The glusterd cli rpc program, for all cli commands
- A reduced glusterd handshake program, just for the 'system:: getspec'
command
The location of the socket file can be changed with the glusterd option
'glusterd-sockfile'.
To retain compatibility with the '--remote-host' cli option, glusterd
also listens for the cli requests on port 24007. But, for the sake of
security, it listens using a reduced cli rpc program on the port. The
reduced rpc program only contains read-only procs used for 'volume
(info|list|status)', 'peer status' and 'system:: getwd' cli commands.
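A minimal illustration (not glusterd's actual code) of listening on
such a unix-domain socket path:

#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

int
listen_on_sockfile (const char *path)
{
        struct sockaddr_un addr;
        int                fd = socket (AF_UNIX, SOCK_STREAM, 0);

        if (fd < 0)
                return -1;
        memset (&addr, 0, sizeof (addr));
        addr.sun_family = AF_UNIX;
        strncpy (addr.sun_path, path, sizeof (addr.sun_path) - 1);
        unlink (path);  /* remove a stale socket file, if any */
        if (bind (fd, (struct sockaddr *) &addr, sizeof (addr)) < 0 ||
            listen (fd, 16) < 0) {
                close (fd);
                return -1;
        }
        return fd;
}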
CLI changes:
The gluster cli now uses the glusterd socket file for communicating with
glusterd by default. A new option '--gluster-sock' has been added to
allow specifying the sockfile used to connect. Using the '--remote-host'
option will make cli connect to the given host & port.
Tests changes:
cluster.rc has been modified to make use of socket files and use
different log files for each glusterd.
Some of the tests using cluster.rc have been fixed.
Change-Id: Iaf24bc22f42f8014a5fa300ce37c7fc9b1b92b53
BUG: 980754
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/5280
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
gettimeofday() returns the current wall clock time and timezone.
Using these functions in order to measure the passage of time
(how long an operation took) therefore seems like a no-brainer.
This approach suffers from some limitations:
a. Low resolution: "high-performance" timing by
definition requires clock resolutions into the microseconds
or better.
b. The clock can jump forwards and backwards in time: computer
clocks all tick at slightly different rates, which causes
the time to drift. Most systems have NTP enabled, which
periodically adjusts the system clock to keep it in sync
with "actual" time. The adjustment can cause the clock to
suddenly jump forward (artificially inflating your timing
numbers) or jump backwards (causing your timing calculations
to go negative or hugely positive). In such cases the timer
thread could go into an infinite loop.
From 'man gettimeofday':
----------
..
..
The time returned by gettimeofday() is affected by discontinuous
jumps in the system time (e.g., if the system administrator manually
changes the system time). If you need a monotonically increasing
clock, see clock_gettime(2).
..
..
----------
Rationale:
For calculating interval timing for the timer thread, all that's
needed is a clock that acts as a simple counter incrementing
at a stable rate.
To avoid the jumps caused by using "wall time", this counter must be
monotonic: it can never "tick" backwards, ever (see the sketch below).
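A minimal sketch of interval timing with a monotonic clock, as
recommended above:

#include <time.h>

static double
elapsed_seconds (const struct timespec *start, const struct timespec *end)
{
        /* CLOCK_MONOTONIC never jumps backwards, unlike wall time. */
        return (end->tv_sec - start->tv_sec) +
               (end->tv_nsec - start->tv_nsec) / 1e9;
}

Usage: call clock_gettime (CLOCK_MONOTONIC, &t0) before the operation
and clock_gettime (CLOCK_MONOTONIC, &t1) after it, then compute
elapsed_seconds (&t0, &t1).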
Change-Id: I701d31e71a85a73d21a6c5cd15583e7a5a645eeb
BUG: 1017993
Signed-off-by: Harshavardhana <harsha@harshavardhana.net>
Reviewed-on: http://review.gluster.org/6070
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
Currently, to know the number of files to be healed, the user has to go
to the backend and check the number of entries present in the
indices/xattrop directory. But if a volume consists of a large
number of bricks, going to each backend and counting the number
of entries is a time-consuming task. Alternatively, the user can run the
gluster volume heal vol-name info command, but with this
approach, if the number of entries in the indices/xattrop
directory is very huge, it will consume time.
So as a feature, a new command is implemented.
Command 1: gluster volume heal vn statistics heal-count
This command will get the number of entries present in
every brick of a volume. The output displays only the entries
count.
Command 2: gluster volume heal vn statistics heal-count
replica 192.168.122.1:/home/user/brickname
Use this if we are concerned with just one replica:
providing any one brick of a replica will get
the number of entries to be healed for that replica only.
Example:
Replicate volume with replica count 2.
Backend status:
--------------
[root@dhcp-0-17 xattrop]# ls -lia | wc -l
1918
NOTE: Out of 1918, 2 entries are <xattrop-gfid> dummy
entries, so the actual number of entries to be healed is
1916.
[root@dhcp-0-17 xattrop]# pwd
/home/user/2ty/.glusterfs/indices/xattrop
Command output:
--------------
Gathering count of entries to be healed on volume volume3 has been successful
Brick 192.168.122.1:/home/user/22iu
Status: Brick is Not connected
Entries count is not available
Brick 192.168.122.1:/home/user/2ty
Number of entries: 1916
Change-Id: I72452f3de50502dc898076ec74d434d9e77fd290
BUG: 1015990
Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
Reviewed-on: http://review.gluster.org/6044
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
"gluster volume heal volumename statistics" command gives the summary
of the afr crawl done based on the entries present in the xattrop
directory. Whenever afr crawls are attempted, the beginning time of
crawl, end time of crawl, no of files healed, heal-failed count and
number of files in split brain are shown along with the type of the
crawl. If crawl is already in progress then it will give the number
of files healed, heal failed count and number of files in split-brain
from the beginning of the crawl and instead of telling the end time of
the crawl, "CRAWL IN PROGRESS" message will be shown.
Output format:
command: "gluster volume heal volume-name statistics"
Output:
Gathering afr crawl statistics crawl statistics on volume volume-name
has been successful
------------------------------------------------
Crawl statistics for brick no 0
Hostname of brick 192.168.122.248
Starting time of crawl: Wed Jul 10 15:52:38 2013
Ending time of crawl: Wed Jul 10 15:52:38 2013
Type of crawl: INDEX
No. of entries healed: 0
No. of entries in split-brain: 0
No. of heal failed entries: 0
Starting time of crawl: Wed Jul 10 15:52:38 2013
Ending time of crawl: Wed Jul 10 15:52:38 2013
Type of crawl: INDEX
No. of entries healed: 0
No. of entries in split-brain: 0
No. of heal failed entries: 0
------------------------------------------------
Crawl statistics for brick no 1
Hostname of brick 192.168.122.1
Starting time of crawl: Wed Jul 10 15:52:42 2013
Ending time of crawl: Wed Jul 10 15:52:42 2013
Type of crawl: INDEX
No. of entries healed: 0
No. of entries in split-brain: 0
No. of heal failed entries: 0
Starting time of crawl: Wed Jul 10 15:52:42 2013
Ending time of crawl: Wed Jul 10 15:52:42 2013
Type of crawl: INDEX
No. of entries healed: 0
No. of entries in split-brain: 0
No. of heal failed entries: 0
--------------------------------------------------
Change-Id: I10bf9d10b005741db9973fb1352e0dd59ed99aa9
BUG: 949400
Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
Reviewed-on: http://review.gluster.org/4790
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
oVirt's Gluster Integration needs an inexpensive command that can be
executed every 10 seconds to monitor async tasks and their parameters,
for all volumes.
The solution involves adding a 'tasks' sub-command to 'volume status'
to fetch only the async task IDs, types and other relevant parameters.
Only the originator glusterd participates in this command as all the
information needed is available on all the nodes. This is to make the
command suitable for being executed every 10 seconds.
Change-Id: I1edc607baf29b001a5585079dec681d7c641b3d1
BUG: 1012346
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/6006
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
This patch adds the node uuid to the rebalance/remove-brick status xml
output. The output XML will look like:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<cliOutput>
<opRet>0</opRet>
<opErrno>0</opErrno>
<opErrstr/>
<volRebalance>
<op>3</op>
<nodeCount>1</nodeCount>
<node>
<nodeName>localhost</nodeName>
==>> <id>883626f8-4d29-4d02-8c5d-c9f48c5b2445</id>
<files>0</files>
<size>0</size>
<lookups>0</lookups>
<failures>0</failures>
<status>3</status>
<statusStr>completed</statusStr>
</node>
<aggregate>
<files>0</files>
<size>0</size>
<lookups>0</lookups>
<failures>0</failures>
<status>3</status>
<statusStr>completed</statusStr>
</aggregate>
</volRebalance>
</cliOutput>
Change-Id: I5a1d4f9043b33b9e88150647a243ddb16154e843
BUG: 1012296
Signed-off-by: Bala.FA <barumuga@redhat.com>
Reviewed-on: http://review.gluster.org/6005
Reviewed-by: Kaushal M <kaushal@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Change-Id: I8ff6b48ef41fd6e9ea68c42dfb9878f8a08ed627
BUG: 1010874
Signed-off-by: Ajeet Jha <ajha@redhat.com>
Reviewed-on: http://review.gluster.org/5989
Reviewed-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
The volume op-versions are calculated during a volume set/reset, when
reading a volume from disk, and when importing a volume during probe or
volume sync. The calculation of the volume op-version depends on the
cluster's op-version, as some features are enabled automatically depending
on the cluster's op-version. We also don't store the volume op-versions
persistently and don't export the volume op-versions during sync. Due to
this, cases can occur which lead to inconsistencies in volumes on
different peers. One such case is below.
Consider a cluster made up of 3 peers P1, P2 and P3, operating at
op-version N. The cluster has two volumes V1 and V2, which have volume
op-versions N (since a volume op-version cannot be greater than the
cluster op-version). We have,
Cluster-op-version = N
V1 op-version = N
V2 op-version = N
A set operation on V1 causes the cluster's op-version to be bumped up to N+1.
Assume that there exist some features that are automatically enabled on
op-version N+1. The op-version of V2 remains at N as no operation has been
performed on it. So,
Cluster op-version = N+1
V1 op-version = N+1
V2 op-version = N
Now, we probe a new peer P4. On the new peer we will have the following
op-versions,
Cluster op-version = N+1
V1 op-version = N+1
V2 op-version = N+1
This happens because we don't send volume op-versions during the sync after
probe. P4 will freshly calculate the op-version of V2 (assuming features have
been auto enabled due to the cluster op-version being N+1) as N+1.
Another case is when glusterd on a peer restarts. Assume P3 was restarted;
glusterd will recalculate the volume op-versions during the restore state.
Again, the op-version of V2 will be calculated as N+1, assuming
auto-enabled features. This will lead to inconsistency between the volume
representation in memory and on disk, as glusterd will assume the volume
contains auto-enabled features, but the volfiles don't contain them as
they were not regenerated.
These kinds of issues can be solved by calculating the volume op-version
only when features are enabled or disabled (i.e. during volume set/reset),
persisting the volume op-versions, and exporting/importing them.
Change-Id: I52de0668c92628622e85f4588fb28829a7231132
BUG: 1005043
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/5568
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
Glusterd would not store all the volumes when a global option was set.
When setting a global option, like the 'nfs.*' options, glusterd used to
modify the volinfo for all the volumes, but would store only the volinfo
for the named volume. This led to a mismatch between the persisted and the
in-memory representations of those volumes, which led to problems like
peers being rejected because of volume mismatches.
Change-Id: I8bca10585e34b7135cb32af0055dbd462b3fb9b5
BUG: 1012400
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/6007
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
Change-Id: I27f5f7cd54115d7b236b42f6beaaa05a8b379dd7
BUG: 1010153
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/5978
Reviewed-by: Harshavardhana <harsha@harshavardhana.net>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Problem:
Incorrect NFS ACL encoding causes "system.posix_acl_default"
setxattr failure on bricks on the XFS file system. XFS (potentially
others?) doesn't understand when the 0x10 prefix is added to the
ACL type field for default ACLs (which the Linux NFS client adds),
which causes setfacl()->setxattr() to fail silently. The NFS client
adds NFS_ACL_DEFAULT (0x1000) for default ACLs.
FIX:
Mask the prefix (added by the NFS client) OFF, so the setfacl is not
rejected when it hits the FS (see the sketch below).
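A minimal sketch of that masking, using the constant named above:

#define NFS_ACL_DEFAULT 0x1000  /* marker added by the Linux NFS client */

static int
nfs_acl_type_fixup (int type)
{
        /* Strip the default-ACL marker so the brick file system
         * (e.g. XFS) sees a plain ACL type value. */
        return type & ~NFS_ACL_DEFAULT;
}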
Original patch by: "Richard Wareing"
Change-Id: I17ad27d84f030cdea8396eb667ee031f0d41b396
BUG: 1009210
Signed-off-by: Santosh Kumar Pradhan <spradhan@redhat.com>
Reviewed-on: http://review.gluster.org/5980
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
Change-Id: I8d670a228d3c1282aa7d70b151f166d04abc40e5
BUG: 764890
Signed-off-by: Vijay Bellur <vbellur@redhat.com>
Reviewed-on: http://review.gluster.org/5909
Reviewed-by: Anand Avati <avati@redhat.com>
Tested-by: Anand Avati <avati@redhat.com>
Change-Id: I1841864273fc4242de15fbfcf76fd5de40269f28
BUG: 1006249
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/5889
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
While a gluster command holding lock is in execution,
any other gluster command which tries to run will fail to
acquire the lock. As a result command#2 will follow the
cleanup code flow, which also includes unlocking the held
locks. As both the commands are run from the same node,
command#2 will end up releasing the locks held by command#1
even before command#1 reaches completion.
Now we call the unlock routine in the code path only if the cluster
has been locked during the same transaction.
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Change-Id: I7b7aa4d4c7e565e982b75b8ed1e550fca528c834
BUG: 1008172
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/5937
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
Problem:
The Gluster NFS server hard-codes the max rsize/wsize to 64KB,
which is far too small for NFS running over a 10GbE NIC. The existing
options nfs.read-size, nfs.write-size are not working as
expected.
FIX:
Make the options nfs.read-size (for rsize) and nfs.write-size
(for wsize) work to tune the NFS I/O size. The value range is
4KB (min) - 64KB (default) - 1MB (max).
NB: Credit to "Richard Wareing" for catching it.
Change-Id: I2754ecb0975692304308be8bcf496c713355f1c8
BUG: 1009223
Signed-off-by: Santosh Kumar Pradhan <spradhan@redhat.com>
Reviewed-on: http://review.gluster.org/5964
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
Problem:
Currently the CLI rebalance status command output does not indicate the
'type' of rebalance, i.e. whether a full rebalance or only a fix-layout
was carried out.
Fix: After the rebalance status of all peers is received by the
originator glusterd, alter it to reflect the type of rebalance
before passing it on to the CLI process.
Change-Id: I1940ffda0d36e25e5b33c84a0ea210394cc9e1d3
BUG: 1004744
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/5826
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
The rebalance status was being reset to 'Not started' when add-brick was
performed. This would lead to odd cases where a 'rebalance status' on a
volume would show status as 'not started' but would also include the
rebalance statistics. This also affected the display of asynchronous task
status in the 'volume status' command.
By not resetting the status, we prevent the above issues from happening.
Since we use the running/not-running state of the rebalance process as the
check when performing other operations, we can safely leave the rebalance
stats collected on an add-brick.
Change-Id: I4c69d9c789d081c6de7e7a81dd0d4eba2e83ec17
BUG: 1006247
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/5895
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Change-Id: Ia9ee44763a9c2798b26d3225bf03a974d7ece21f
BUG: 998962
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/5666
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
This patch introduces task parameters for the asynchronous tasks shown in
volume status. The parameters are only given for xml output. The
parameters shown currently are:
- source and destination bricks for replace-brick tasks
......
<tasks>
<task>
<type>Replace brick</type>
<id>3d1a1005-9d2e-4ae0-bd62-577bc1d333a3</id>
<status>1</status>
<params>
<srcBrick>archm:/export/test4</srcBrick>
<dstBrick>archm:/export/test-replace1</dstBrick>
</params>
</task>
</tasks>
......
- list of bricks being removed for remove-brick tasks
......
<tasks>
<task>
<type>Remove brick</type>
<id>901c20ca-0da2-41de-8669-5f0caca6b846</id>
<status>1</status>
<params>
<brick>archm:/export/test2</brick>
<brick>archm:/export/test3</brick>
</params>
</task>
</tasks>
......
The changes for non-xml output will be done in a subsequent patch.
Change-Id: I322afe2f83ed8adeddb99f7962c25911204dc204
BUG: 916577
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/5771
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
Change-Id: I7c17de39da03c6b2764790581e097936da406695
BUG: 1002556
Signed-off-by: Vijay Bellur <vbellur@redhat.com>
Reviewed-on: http://review.gluster.org/5893
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
Earlier, a peer running a higher op-version couldn't be probed into a
cluster running at a lower op-version. This created issues when trying
to expand an upgraded cluster. This patch changes this behaviour.
The cluster no longer rejects a peer being probed if its op-version is
higher than the cluster op-version. The peer will reduce its op-version
if it doesn't have any volumes. If the peer contains volumes and needs
to reduce its op-version, it fails the handshake and the probe fails.
Change-Id: I12c6c873922799e1557b7184e956baea643d0dea
BUG: 1005038
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/5715
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Change-Id: Ia4bf607e044d50636fb0a599a2ff91059f5b17aa
BUG: 1006177
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/5887
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
The slave's xtime is now stored on the master itself (and only on
the root), which implies it cannot be propagated to the cascaded slave.
Thus the intermediate master now makes use of its own volume information
to propagate volume-mark and xtime.
On starting Geo-Replication, the "geo-replication.ignore-pid-check" marker
option is enabled, which is an override for the client-pid check in
marker. This option triggers marker updates only for the geo-replication
auxiliary mount (client-pid == -1). Since gsyncd does not do setxattr()
directly on the bricks, this option won't trigger a chain of spurious
metadata updates that would need to be processed by gsyncd.
Change-Id: If50c5ef275dfb6b4ff4fd35be2565587e2fdf3e1
BUG: 996371
Original Author: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/5592
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Avra Sengupta <asengupt@redhat.com>
Tested-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
This is required by Geo-Replication, which does an auxiliary mount
with client-pid as -1 (which has special treatment at specific
places in GlusterFS), to trigger xtime updates on the intermediate
master in a cascading setup.
Marker too had a check to "not" mark updates for geo-replication's
auxiliary mounts. With the new geo-replication design, xtimes are
not set by the master on the slave for all entities. Due to this,
cascading setups were broken.
This patch introduces the "geo-replication.ignore-pid-check" option
as an "override" for the client-pid check for gsyncd's client-pid.
When this option is enabled, marker starts "marking" even if the
updates are from the special client.
On detecting itself to be an intermediate master, Geo-Replication
enables this option.
Change-Id: I9f7140edd12fef5480595ee0f93f35b94cdb8345
BUG: 996371
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/5591
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Avra Sengupta <asengupt@redhat.com>
Tested-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
Added op-version checks to all geo-rep commands. Min
op-version should be 2.
Change-Id: I942d897404e11e4d53123409731ba5cd252668fe
BUG: 847839
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/5732
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
Change-Id: I8c2d398114ad4534bcc052f9a5be8bbb2e7e2582
BUG: 999531
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/5677
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
non-root@hostname::slave-vol geo-rep sessions are not supported.
Only hostname and root@hostname sessions are supported, and they are
treated as the same.
Change-Id: I87551e1bd4ff4e0e6520c34eb3d944587cc65476
BUG: 998933
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/5659
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
create force will fail with a proper message if the IP is not
reachable, or if it is unable to fetch slave details.
Change-Id: I44a3ba777b37702ffd0e48e9cb46c51e293327d4
BUG: 988314
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/5516
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
The session details are now saved in the
/var/lib/glusterd/geo-replication/<mastervol>_<slaveip>_<slavevol>
repo to distinguish between two master-slave sessions where the
slave name is the same across two different clusters.
Change-Id: I57c93f55cc9bd4fe2bffe579028aaf5e4335b223
BUG: 991501
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/5488
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
This is a translator to improve the performance of typical,
sequential directory reads (i.e., ls). readdir-ahead begins
preloading the contents of a directory on open and serves readdir
requests from the preloaded content. readdir-ahead is currently
implemented to only handle the single threaded directory read
case.
readdir-ahead is currently disabled by default. It can be enabled
with the following command:
gluster volume set <volname> readdir-ahead on
The following are results of a getdents test on a single brick
volume.
Test info:
- Single VM, gluster client/server.
- Volume mounted with native client using --gid-timeout=2.
- getdents on single directory with 100k 0-byte files.
Test results:
- !readdir-ahead
read 3120080 bytes from offset 0
3 MiB, 4348 ops, 0:00:07.00 (416.590 KiB/sec and 594.4737 ops/sec)
- readdir-ahead
read 3120080 bytes from offset 0
3 MiB, 4348 ops, 0:00:03.00 (820.116 KiB/sec and 1170.3043 ops/sec)
BUG: 980517
Change-Id: Ieceb9e1eb47d1d5b5af8da2bf03839537364653f
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.org/4519
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
Change-Id: I1442bc1d115a9c6ecf139a0ca9da74d07e0fe928
BUG: 1003855
Signed-off-by: shishir gowda <sgowda@redhat.com>
Reviewed-on: http://review.gluster.org/5764
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
This patch adds support for internal snapshots using QCOW2 and a
general framework for external snapshots (next patch) with
QCOW2 and QED.
For internal snapshots, the file must be "initialized" or
"formatted" into QCOW2 format, with a file size specified.
Snapshots can be created, deleted, and applied ("goto").
e.g.:
// Format and Initialize
sh# setfattr -n trusted.glusterfs.block-format -v qcow2:10GB /mnt/imgfile
sh# ls -l /mnt/imgfile
-rw-r--r-- 1 root root 10G Jul 18 21:20 imgfile
// Create a snapshot
sh# setfattr -n trusted.glusterfs.block-snapshot-create -v name1 imgfile
// Apply a snapshot
sh# setfattr -n trusted.glusterfs.block-snapshot-goto -v name1 imgfile
Change-Id: If993e057a9455967ba3fa9dcabb7f74b8b2cf4c3
BUG: 986775
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/5367
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>