| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
snap volumes
Change-Id: I97f59fcf1e78ded35fd15996d587ecd043c7dc17
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
|
|
|
|
|
|
|
| |
Also fixes peer rejects on glusterd restart
Change-Id: I1671416c1f3fd2afea450cc3b4c632de187351ca
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
|
|
|
|
|
| |
Change-Id: Icdb66c89acdd043d0d6368c48ce2e01b1a40966f
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
| |
Change-Id: I277a70f732666d047ba5dff7a7e6925e0679741b
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
in volume status
Added a parent_volname member in glusterd_volinfo_ structure
to help point the snap vol to the parent volname. Using this
to fetch the pidfile location during volume status.
Change-Id: I30a16646561394d0f7d16f66abff14c425f31f06
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
|
|
|
|
|
| |
Change-Id: I9f53bd4d83794c69c54e4a03f59425a1ca6a4ac3
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
|
|
|
|
|
| |
Change-Id: I98fcd25e80b1b39e0292eae059e2d624ca9094fb
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
| |
Change-Id: I94b5f6e00be7d1ff0c454e291c779dae7b423748
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
| |
Change-Id: I8fafb19c2c13caac2a509c36f99d2dd782865b13
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
|
|
| |
Also fixes snapshot config output
Change-Id: Ia50d94492009cf73dbb99ba20117b9fa4c41048a
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
|
|
|
|
|
|
|
|
| |
Also refactored code in glusterd for create command
Additionally, removed brick-op func from mgmt_iniate_all_phases
Change-Id: Iddcc332009c5716adee7f2b04c93b352fb983446
Signed-off-by: shishir gowda <sgowda@redhat.com>
|
|
|
|
|
| |
Change-Id: I8b88fe94d0f9ee1089cafdda037abcf2f7a180ca
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
| |
Change-Id: I54db2fa67ebb6b57629f9536c296fbae07a1d159
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is still a work in progress.
As of now, these things are done:
* Take the snapshot of the backend brick
* Create the new volume for the snapshot
* Create the brick and the client volfiles
* Store the snapshot related info in /var/lib/glusterd
* Create the snap object representing the snapshot
TODO:
Start the brick processes for the snapshot
Change-Id: I26fbb0f8e5cf004d4c1dbca51819bab1cd1bac15
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduced a new store for storing snapshot list for a given volume.
$GLUSTERD_INSTALL_PATH/vols/<volname>/snap_list.info
$GLUSTERD_INSTALL_PATH/vols/<volname>/snaps/
$GLUSTERD_INSTALL_PATH/vols/<volname>/snaps/<snap-name>/info <-snapshot volume info
$GLUSTERD_INSTALL_PATH/vols/<volname>/snaps/<snap-name>/bricks <-snapshot volume brick dir
$GLUSTERD_INSTALL_PATH/vols/<volname>/snaps/<snap-name>/bricks/<infos>
<-snapshot volume brick info files
store delete options
TODO -
$GLUSTERD_INSTALL_PATH/CG/ <-place holder for all cg's
.../CG/<cg-name>/info <- per cg information placeholder
Change-Id: I1f9fd8ff7cc0682d05b33965736a43dca6adb3e9
Signed-off-by: shishir gowda <sgowda@redhat.com>
|
|
|
|
|
| |
Change-Id: I03ff9e8094e7e36b28b521380949c7e9044c2e4e
Signed-off-by: shishir gowda <sgowda@redhat.com>
|
|
|
|
|
|
|
|
| |
API's for creating, adding, finding, removing snapshots
and consistency groups are provided.
Change-Id: Ic28da69a075b062aefdf14754c68259ca58bd427
Signed-off-by: shishir gowda <sgowda@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement reconfigure() for NFS xlator so that volume set/reset wont
restart the NFS server process. But few options can not be reconfigured
dynamically e.g. nfs.mem-factor, nfs.port etc which needs NFS to be
restarted.
Change-Id: Ic586fd55b7933c0a3175708d8c41ed0475d74a1c
BUG: 1027409
Signed-off-by: Santosh Kumar Pradhan <spradhan@redhat.com>
Reviewed-on: http://review.gluster.org/6236
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Current BD xlator (block backend) has a few limitations such as
* Creation of directories not supported
* Supports only single brick
* Does not use extended attributes (and client gfid) like posix xlator
* Creation of special files (symbolic links, device nodes etc) not
supported
Basic limitation of not allowing directory creation is blocking
oVirt/VDSM to consume BD xlator as part of Gluster domain since VDSM
creates multi-level directories when GlusterFS is used as storage
backend for storing VM images.
To overcome these limitations a new BD xlator with following
improvements is suggested.
* New hybrid BD xlator that handles both regular files and block device
files
* The volume will have both POSIX and BD bricks. Regular files are
created on POSIX bricks, block devices are created on the BD brick (VG)
* BD xlator leverages exiting POSIX xlator for most POSIX calls and
hence sits above the POSIX xlator
* Block device file is differentiated from regular file by an extended
attribute
* The xattr 'user.glusterfs.bd' (BD_XATTR) plays a role in mapping a
posix file to Logical Volume (LV).
* When a client sends a request to set BD_XATTR on a posix file, a new
LV is created and mapped to posix file. So every block device will
have a representative file in POSIX brick with 'user.glusterfs.bd'
(BD_XATTR) set.
* Here after all operations on this file results in LV related
operations.
For example opening a file that has BD_XATTR set results in opening
the LV block device, reading results in reading the corresponding LV
block device.
When BD xlator gets request to set BD_XATTR via setxattr call, it
creates a LV and information about this LV is placed in the xattr of the
posix file. xattr "user.glusterfs.bd" used to identify that posix file
is mapped to BD.
Usage:
Server side:
[root@host1 ~]# gluster volume create bdvol host1:/storage/vg1_info?vg1 host2:/storage/vg2_info?vg2
It creates a distributed gluster volume 'bdvol' with Volume Group vg1
using posix brick /storage/vg1_info in host1 and Volume Group vg2 using
/storage/vg2_info in host2.
[root@host1 ~]# gluster volume start bdvol
Client side:
[root@node ~]# mount -t glusterfs host1:/bdvol /media
[root@node ~]# touch /media/posix
It creates regular posix file 'posix' in either host1:/vg1 or host2:/vg2 brick
[root@node ~]# mkdir /media/image
[root@node ~]# touch /media/image/lv1
It also creates regular posix file 'lv1' in either host1:/vg1 or
host2:/vg2 brick
[root@node ~]# setfattr -n "user.glusterfs.bd" -v "lv" /media/image/lv1
[root@node ~]#
Above setxattr results in creating a new LV in corresponding brick's VG
and it sets 'user.glusterfs.bd' with value 'lv:<default-extent-size'
[root@node ~]# truncate -s5G /media/image/lv1
It results in resizig LV 'lv1'to 5G
New BD xlator code is placed in xlators/storage/bd directory.
Also add volume-uuid to the VG so that same VG can't be used for other
bricks/volumes. After deleting a gluster volume, one has to manually
remove the associated tag using vgchange <vg-name> --deltag
<trusted.glusterfs.volume-id:<volume-id>>
Changes from previous version V5:
* Removed support for delayed deleting of LVs
Changes from previous version V4:
* Consolidated the patches
* Removed usage of BD_XATTR_SIZE and consolidated it in BD_XATTR.
Changes from previous version V3:
* Added support in FUSE to support full/linked clone
* Added support to merge snapshots and provide information about origin
* bd_map xlator removed
* iatt structure used in inode_ctx. iatt is cached and updated during
fsync/flush
* aio support
* Type and capabilities of volume are exported through getxattr
Changes from version 2:
* Used inode_context for caching BD size and to check if loc/fd is BD or
not.
* Added GlusterFS server offloaded copy and snapshot through setfattr
FOP. As part of this libgfapi is modified.
* BD xlator supports stripe
* During unlinking if a LV file is already opened, its added to delete
list and bd_del_thread tries to delete from this list when a last
reference to that file is closed.
Changes from previous version:
* gfid is used as name of LV
* ? is used to specify VG name for creating BD volume in volume
create, add-brick. gluster volume create volname host:/path?vg
* open-behind issue is fixed
* A replicate brick can be added dynamically and LVs from source brick
are replicated to destination brick
* A distribute brick can be added dynamically and rebalance operation
distributes existing LVs/files to the new brick
* Thin provisioning support added.
* bd_map xlator support retained
* setfattr -n user.glusterfs.bd -v "lv" creates a regular LV and
setfattr -n user.glusterfs.bd -v "thin" creates thin LV
* Capability and backend information added to gluster volume info (and
--xml) so
that management tools can exploit BD xlator.
* tracing support for bd xlator added
TODO:
* Add support to display snapshots for a given LV
* Display posix filename for list-origin instead of gfid
Change-Id: I00d32dfbab3b7c806e0841515c86c3aa519332f2
BUG: 1028672
Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Reviewed-on: http://review.gluster.org/4809
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
brick force.
Expectation with force is that user is aware of the consequences of
sanity checks not being triggered.
Change-Id: I79dfeed16a23829a7217cef33ab83f9f0ffae336
Signed-off-by: Vijay Bellur <vbellur@redhat.com>
BUG: 1007509
Reviewed-on: http://review.gluster.org/5746
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently to know the number of files to be healed, either user
has to go to backend and check the number of entries present in
indices/xattrop directory. But if a volume consists of large
number of bricks, going to each backend and counting the number
of entries is a time-taking task. Otherwise user can give
gluster volume heal vol-name info command but with this
approach if no. of entries are very hugh in the indices/
xattrop directory, it will comsume time.
So as a feature, new command is implemented.
Command 1: gluster volume heal vn statistics heal-count
This command will get the number of entries present in
every brick of a volume. The output displays only entries
count.
Command 2: gluster volume heal vn statistics heal-count
replica 192.168.122.1:/home/user/brickname
Here if we are concerned with just one replica.
So providing any one of the brick of a replica will get
the number of entries to be healed for that replica only.
Example:
Replicate volume with replica count 2.
Backend status:
--------------
[root@dhcp-0-17 xattrop]# ls -lia | wc -l
1918
NOTE: Out of 1918, 2 entries are <xattrop-gfid> dummy
entries so actual no. of entries to be healed are
1916.
[root@dhcp-0-17 xattrop]# pwd
/home/user/2ty/.glusterfs/indices/xattrop
Command output:
--------------
Gathering count of entries to be healed on volume volume3 has been successful
Brick 192.168.122.1:/home/user/22iu
Status: Brick is Not connected
Entries count is not available
Brick 192.168.122.1:/home/user/2ty
Number of entries: 1916
Change-Id: I72452f3de50502dc898076ec74d434d9e77fd290
BUG: 1015990
Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
Reviewed-on: http://review.gluster.org/6044
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"gluster volume heal volumename statistics" command gives the summary
of the afr crawl done based on the entries present in the xattrop
directory. Whenever afr crawls are attempted, the beginning time of
crawl, end time of crawl, no of files healed, heal-failed count and
number of files in split brain are shown along with the type of the
crawl. If crawl is already in progress then it will give the number
of files healed, heal failed count and number of files in split-brain
from the beginning of the crawl and instead of telling the end time of
the crawl, "CRAWL IN PROGRESS" message will be shown.
Output format:
command: "gluster volume heal volume-name statistics"
Output:
Gathering afr crawl statistics crawl statistics on volume volume-name
has been successful
------------------------------------------------
Crawl statistics for brick no 0
Hostname of brick 192.168.122.248
Starting time of crawl: Wed Jul 10 15:52:38 2013
Ending time of crawl: Wed Jul 10 15:52:38 2013
Type of crawl: INDEX
No. of entries healed: 0
No. of entries in split-brain: 0
No. of heal failed entries: 0
Starting time of crawl: Wed Jul 10 15:52:38 2013
Ending time of crawl: Wed Jul 10 15:52:38 2013
Type of crawl: INDEX
No. of entries healed: 0
No. of entries in split-brain: 0
No. of heal failed entries: 0
------------------------------------------------
Crawl statistics for brick no 1
Hostname of brick 192.168.122.1
Starting time of crawl: Wed Jul 10 15:52:42 2013
Ending time of crawl: Wed Jul 10 15:52:42 2013
Type of crawl: INDEX
No. of entries healed: 0
No. of entries in split-brain: 0
No. of heal failed entries: 0
Starting time of crawl: Wed Jul 10 15:52:42 2013
Ending time of crawl: Wed Jul 10 15:52:42 2013
Type of crawl: INDEX
No. of entries healed: 0
No. of entries in split-brain: 0
No. of heal failed entries: 0
--------------------------------------------------
Change-Id: I10bf9d10b005741db9973fb1352e0dd59ed99aa9
BUG: 949400
Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
Reviewed-on: http://review.gluster.org/4790
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
oVirt's Gluster Integration needs an inexpensive command that can be
executed every 10 seconds to monitor async tasks and their parameters,
for all volumes.
The solution involves adding a 'tasks' sub-command to 'volume status'
to fetch only the async task IDs, type and other relevant parameters.
Only the originator glusterd participates in this command as all the
information needed is available on all the nodes. This is to make the
command suitable for being executed every 10 seconds.
Change-Id: I1edc607baf29b001a5585079dec681d7c641b3d1
BUG: 1012346
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/6006
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The volume op-versions are calculated during a volume set/reset, reading a
volume from disk and importing a volume during probe or volume sync. The
calculation of the volume op-version depends on the clusters op-version as some
features are enabled automatically depending on the clusters op-version. We
also don't store the volume op-versions persistently and don't export the
volume op-versions during sync. Due to this, there can occur cases which will
lead to inconsistencies in volumes in different peers. One such case is below,
Consider, a cluster made up 3 peers P1, P2 and P3, operating at op-version N.
The cluster has two volumes V1 and V2, which have volume op-versions N (since
volume op-version cannot be greater than cluster op-version). We have,
Cluster-op-version = N
V1 op-version = N
V2 op-version = N
A set operation on V1 causes the clusters op-version to be bumped up to N+1.
Assume that there exist some features that are automatically enabled on
op-version N+1. The op-version of V2 remains at N as no operation has been
performed on it. So,
Cluster op-version = N+1
V1 op-version = N+1
V2 op-version = N
Now, we probe a new peer P4. On the new peer we will have the following
op-versions,
Cluster op-version = N+1
V1 op-version = N+1
V2 op-version = N+1
This happens because we don't send volume op-versions during the sync after
probe. P4 will freshly calculate the op-version of V2 (assuming features have
been auto enabled due to the cluster op-version being N+1) as N+1.
Another case is when glusterd on a peer restarts. Assume P3 was restarted,
glusterd will recalculate the volume op-versions during the restore state.
Again, op-version of V2 will be calculated as N+1 assuming auto enabled
features. This will lead to inconsistency in the volume representation in
memory and on disk, as glusterd will assume the volume contains auto enabled
features, but the volfiles don't contain them as they were not regenrated.
These kind of issues can be solved by calculating the volume op-version only
when features are enabled and disabled (ie. during volume set/reset),
persisting the volume-op-versions and exporting/importing them.
Change-Id: I52de0668c92628622e85f4588fb28829a7231132
BUG: 1005043
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/5568
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Incorrect NFS ACL encoding causes "system.posix_acl_default"
setxattr failure on bricks on XFS file system. XFS (potentially
others?) doesn't understand when the 0x10 prefix is added to the
ACL type field for default ACLs (which the Linux NFS client adds)
which causes setfacl()->setxattr() to fail silently. NFS client
adds NFS_ACL_DEFAULT(0x1000) for default ACL.
FIX:
Mask the prefix (added by NFS client) OFF, so the setfacl is not
rejected when it hits the FS.
Original patch by: "Richard Wareing"
Change-Id: I17ad27d84f030cdea8396eb667ee031f0d41b396
BUG: 1009210
Signed-off-by: Santosh Kumar Pradhan <spradhan@redhat.com>
Reviewed-on: http://review.gluster.org/5980
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now saving the session details in
/var/lib/glusterd/geo-replication/<mastervol>_<slaveip>_<slavevol>
repo to distinguish between two master-slave sessions where the
slavename is same across two different clusters.
Change-Id: I57c93f55cc9bd4fe2bffe579028aaf5e4335b223
BUG: 991501
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/5488
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I1442bc1d115a9c6ecf139a0ca9da74d07e0fe928
BUG: 1003855
Signed-off-by: shishir gowda <sgowda@redhat.com>
Reviewed-on: http://review.gluster.org/5764
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of using the cluster op-version, volume op-version is used to
enable open-behind during volgen. For doing this, the volume op-versions
are updated before regenerating the volfiles.
Change-Id: I675bb549bf7c7c0279030dca698fb530781addc6
BUG: 990830
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/5385
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I42d08b3a6a7a3839f5e9953e1f83959222c080f8
Signed-off-by: shishir gowda <sgowda@redhat.com>
BUG: 989846
Reviewed-on: http://review.gluster.org/5446
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently rebalance/remove-brick op's display migration failed count even
for files which failed due to space issues (not enough space for file, or
migration leading to cluster imbalance)
These will now be counted as skipped, and rebalance/remove-brick status
will display the additional counter
Change-Id: I674904d380b5f8300e9ca9e6af557c3d30d6cff4
BUG: 989846
Signed-off-by: shishir gowda <sgowda@redhat.com>
Reviewed-on: http://review.gluster.org/5399
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Till now all the brick processes were writing the valgrind information
to the same log file.
Change-Id: I0251c943935e2901b729c71f21d0677edb9f6867
BUG: 922877
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/5394
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commands:
gluster system:: execute gsec_create
gluster volume geo-rep <master> <slave-url> create [push-pem] [force]
gluster volume geo-rep <master> <slave-url> start [force]
gluster volume geo-rep <master> <slave-url> stop [force]
gluster volume geo-rep <master> <slave-url> delete
gluster volume geo-rep <master> <slave-url> config
gluster volume geo-rep <master> <slave-url> status
The geo-replication is distributed. The session will be created, and
gsyncd will be spawned on all relevant nodes, instead of only one
node.
geo-rep: Collecting status detail related data
Added persistent store for saving information about
TotalFilesSynced, TotalSyncTime, TotalBytesSynced
Changes in the status information in socket:
Existing(Ex):
FilesSynced=2;BytesSynced=2507;Uptime=00:26:01;
New(Ex):
FilesSynced=2;BytesSynced=2507;Uptime=00:26:01;SyncTime=0.69978;
TotalSyncTime=2.890044;TotalFilesSynced=6;TotalBytesSynced=143640;
Persistent details stored in
/var/lib/glusterd/geo-replication/${mastervol}/${eSlave}-detail.status
Change-Id: I1db7fc13ffca2e415c05200b0109b1254067f111
BUG: 847839
Original Author: Avra Sengupta <asengupt@redhat.com>
Original Author: Venky Shankar <vshankar@redhat.com>
Original Author: Aravinda VK <avishwan@redhat.com>
Original Author: Amar Tumballi <amarts@redhat.com>
Original Author: Csaba Henk <csaba@redhat.com>
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/5132
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the initial version of the Changelog Translator.
What is it
-----------
Goal is to capture changes performed on a GlusterFS volume.
The translator needs to be loaded on the server (bricks) and
captures changes in a plain text file inside a configured
directory path (controlled by "changelog-dir", should be
somewhere in <export>/.glusterfs/changelog by default).
Changes are classified into 3 types:
- Data: : TYPE-I
- Metadata : TYPE-II
- Entry : TYPE-III
Changelog file is rolled over after a certain time interval
(defauls to 60 seconds) after which a changelog is started.
The thing to be noted here is that for a time interval
(time slice) multiple changes for an inode are recorded only
once (ie. say for 100+ writes on an inode that happens within
the time slice has only a single corresponding entry in the
changelog file). That way we do not bloat up the changelog
and also save lots of writes.
Changelog Format
-----------------
TYPE-I and TYPE-II changes have the gfid on the entity on
which the operation happened. TYPE-III being a entry op
requires the parent gfid and the basename. Changelog format
has been kept to a minimal and it's upto the consumers to
do the heavy loading of figuring out deletes, renames etc..
A single changelog file records all three types of changes,
with each change starting with an identifier ("D": DATA,
"M": METADATA and "E": ENTRY). Option is provided for the
encoding type (See TUNABLES).
Consumers
----------
The only consumer as of today would be geo-replication, although
backup utilities, self-heal, bit-rot detection could be possible
consumers in the future.
CLI
----
By default, change-logging is disabled (the translator is present
in the server graph but does nothing). When enabled (via cli) each
brick starts to log the changes. There are a set of tunable that
can be used to change the translators behaviour:
- enable/disable changelog (disabled by default)
gluster volume set <volume> changelog {on|off}
- set the logging directory (<brick>/.glusterfs/changelogs is the
default)
gluster volume set <volume> changelog-dir /path/to/dir
- select encoding type (binary (default) or ascii)
gluster volume set <volume> encoding {binary|ascii}
- change the rollover time for the logs (60 secs by default)
gluster volume set <volume> rollover-time <secs>
- when secs > 0, changelog file is not open()'d with O_SYNC flag
- and fsync is trigerred periodically every <secs> seconds.
gluster volume set <volume> fsync-interval <secs>
features/changelog: changelog consumer library (libgfchangelog)
A shared library is provided for the consumer of the changelogs
for easy acess via APIs. Application can link against this library
and request for changelog updates. Conversion of binary logs to
human-readable ascii format is also taken care by the library which
keeps a copy of the changelog in application provided working
directory.
Change-Id: I75575fb7f1c53d2bec3dba1a329ea7bb3c628497
BUG: 847839
Original Author: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/5127
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is to prevent the possibility of a deadlock when
rpc_connection_cleanup being called in the same thread as rpc_clnt_unref
Change-Id: Ia4dcc0a8a6e6158d4ddec68b780fccbc4cd64adb
BUG: 962619
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/5321
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: If47e209cb61ea0eb74ee2d6ef9e9342b2d6ee13a
BUG: 980838
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/5261
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Check if a previous remove-brick operation has been committed before
starting a new rebalance/remove-brick task.
Change-Id: I553e5ba64a6a352ca91032ab1a17997051a4494e
BUG: 963541
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/5019
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Making the glusterd_store_* functions re-usable will help with future
changes that need to read/write lists of items.
BUG: 904065
Change-Id: I99fb8eced76d12d5a254567eccff9790b43d8da3
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/4676
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
rpc_transport object, which is part of rpc_clnt, is destroyed
prematurely. This is because, rpc_transport object is ref'd by socket
layer and rpc layer. These ref's, until the synctask'izing of
operations, were unref'd sequentially in the epoll thread.
With more threads at play, the sequential unref guarantee is off.
Fix:
Shutting down the transport before proceeding with cleaning up of
rpc_clnt object would serialize the unref's on the rpc_transport object
and thus eliminating the race.
Also, we don't store the address of brickinfo in brick's rpc notify
function, to avoid the possibility of referring a freed brickinfo.
Instead we use a string based id to 'reach' the corresponding brickinfo.
Change-Id: If2739e2eeaee1e8b071ab2b6754b7ea0f81cfceb
BUG: 962619
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/5000
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. Option to disable or enable acl with nfs.acl boolean
option.
2. Deregister the acl service with the portmapper service
when no longer required.
Change-Id: I6562b6b40138d040aa2bf1e5641f4c0e0e9f9d09
BUG: 970070
Signed-off-by: Rajesh Amaravathi <rajesh@redhat.com>
Reviewed-on: http://review.gluster.org/5136
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I9e2743ab61c8baee92a1dfd376ec4bb145776176
BUG: 963524
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/5016
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Volume op-versions calculations now take into account if an option,
a. enables/disables an xlator, or
b. is a boolean option.
This prevents op-versions from being updated when a feature is disabled.
Also, correctly close the dynamically loaded xlators in
xlator_volopt_dynload() and prevent leaks.
Change-Id: I895ddeeec6f6a33e509325f0ce6f01b7aad3cf5c
BUG: 954256
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/4952
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch enables the open-behind by default only when the op-version
allows it. Also the volume op-version calculations take account of this
enablement.
Change-Id: Idf7a3c274ec4828aafc815cdd1df829ecb853354
BUG: 954256
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/4866
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
glusterd could deadlock after a peer-detach command as follows,
1) glusterd_friend_cleanup function 'flushes' out messages in the rpc
layer's queue, that haven't received a response. At this point, glusterd
has already acquired the big lock.
2) The side-effect of flushing out the messages is that the
corresponding call backs are called.
Call backs themselves are executed after acquiring the big lock. This
results in the big lock being acquired in a nested manner (in the same
thread), which causes
a deadlock.
This can also happen during brick/NFS/SHD disconnect in volume-stop.
Change-Id: Iab3aad143cd8ebbab53ea0b69687f0e7627dc8a9
BUG: 965533
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/5061
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change removes the asymmetry in the 'layer' (read rpc, transport
etc) in which transport options were being filled for inet and unix
sockets.
Change-Id: Iaa080691fd5e4c3baedffa97e9c3f16642c1fc12
BUG: 955919
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/4850
Reviewed-by: Raghavendra G <raghavendra@gluster.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The restarting of bricks has been deffered until the cluster 'stabilizes'
itself volumes' view. Since glusterd_spawn_daemons is executed everytime
a peer 'joins' the cluster, it may inadvertently restart bricks that
were taken offline for say, maintenance purposes. This fix avoids that.
Change-Id: Ic2a0a9657eb95c82d03cf5eb893322cf55c44eba
BUG: 960190
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/4973
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I08065aaa3c140d4b02af4ca38f5f4d00d7f0c2bb
BUG: 958739
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/4937
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Each volume is now associated with two op-versions,
* op_version - the op-version of the highest op-versioned feature enabled
* client_op_version - the op-version of the highest op-versioned feature
enabled which affects the clients only.
These two op-versions are generated dynamically and kept updated during
runtime. Glusterd now uses the respective volumes' client-op-version during
getspec requests.
To achieve the above a new field in the vme table is introduced,
client_option, this boolean field tells if the option is a client side
option.
Change-Id: I12c83b1dd29ab506026efd50d448cebbcee53c27
BUG: 907311
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/4584
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are primarily three lists that are part of glusterd process,
that are concurrently accessed. Namely, priv->volumes, priv->peers
and volinfo->bricks_list.
Big-lock approach
-----------------
WHAT IS IT?
Big lock is a coarse-grained lock which protects all three
lists, mentioned above, from racy access.
HOW DOES IT WORK?
At any given point in time, glusterd's thread(s) are in execution
_iff_ there is a preceding, inbound network event. Of course, the
sigwaiter thread and timer thread are exceptions.
A network event is an external trigger to glusterd, via the epoll
thread, in the form of POLLIN and POLLERR.
As long as we take the big-lock at all such entry points and yield
it when we are done, we are guaranteed that all the network events,
accessing the global lists, are serialised.
This amounts to holding the big lock at
- all the handlers of all the actors in glusterd. (POLLIN)
- all the cbks in glusterd. (POLLIN)
- rpc_notify (DISCONNECT event), if we access/modify
one of the three lists. (POLLERR)
In the case of synctask'ized volume operations, we must remember that,
if we held the big lock for the entire duration of the handler,
we may block other non-synctask rpc actors from executing.
For eg, volume-start would block in PMAP SIGNIN, if done incorrectly.
To prevent this, we need to yield the big lock, when we yield the
synctask, and reacquire on waking up of the synctask.
Change-Id: Ib929f9905b55fb6c3fc27fefb497a26dba058e4f
BUG: 948686
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/4784
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch incorporates all the changes suggested on the behaviour of
'volume create' command in http://review.gluster.org/#change,4214
(comment #14, to be precise).
Change-Id: Iaac524a59738b177415595b18aa8a136090d3d25
BUG: 948729
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/4740
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Till now running glusterfs processes were allowed to run in valgrind
mode only when built with debug mode enabled.
Change-Id: I11e07ea2a4da4f82f70cdded6258a22d65d6db64
BUG: 922877
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/4688
Reviewed-by: Anand Avati <avati@redhat.com>
Tested-by: Anand Avati <avati@redhat.com>
|