summaryrefslogtreecommitdiffstats
path: root/xlators/mgmt/glusterd
Commit message (Collapse)AuthorAgeFilesLines
* features/qemu-block: support for QCOW2 and QED formatsAnand Avati2013-09-032-0/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for internals snapshots using QCOW2 and general framework for external snapshots (next patch) with QCOW2 and QED. For internal snapshots, the file must be "initialized" or "formatted" into QCOW2 format, and specify a file size. Snapshots can be created, deleted, and applied ("goto"). e.g: // Format and Initialize sh# setfattr -n trusted.glusterfs.block-format -v qcow2:10GB /mnt/imgfile sh# ls -l /mnt/imgfile -rw-r--r-- 1 root root 10G Jul 18 21:20 imgfile // Create a snapshot sh# setfattr -n trusted.glusterfs.block-snapshot-create -v name1 imgfile // Apply a snapshot sh# setfattr -n trusted.gluterfs.block-snapshot-goto -v name1 imgfile Change-Id: If993e057a9455967ba3fa9dcabb7f74b8b2cf4c3 BUG: 986775 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/5367 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
* nfs: persistent caching of connected NFS-clientsNiels de Vos2013-08-281-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce /var/lib/glusterfs/nfs/rmtab to contain a list of NFS-clients which have a volume mounted. The volume option 'nfs.mount-rmtab' can be set to an alternative filename. When the file is located on shared storage, multiple gNFS servers can use the same file to present a single NFS-server. This cache is read when a system administrator calls 'showmount -a' and updated when an NFS-client calls MNT or UMNT from the MOUNT protocol. Usage: - create a volume for storing the shared rmtab file - mount the volume on all storage servers, at the same location - make sure that the volume is mounted at boot (add to /etc/fstab) - place the rmtab file on the volume: # gluster volume set <VOLUME> nfs.mount-rmtab <MOUNTPOINT>/<FILENAME> - any subsequent mount requests will add an entry to this file - 'showmount -a' requests will return the NFS-clients using the cluster Note: The NFS-server does currently not support reconfigure(). When a configuration option is set/changed, the NFS-server glusterfs process gets restarted. This causes the active NFS-clients to be forgotten (the entries are saved in the old rmtab, but we do not have a reference to that file any more, so we can't re-add them). Therefor a re-mount done by the NFS-clients is needed before they get listed in the rmtab again. Change-Id: I58f47135d60ad112849d647bea4e1129683dd2b3 BUG: 904065 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/4430 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Harshavardhana <harsha@harshavardhana.net> Tested-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
* glusterd: add check in remove-brick start variantRavishankar N2013-08-211-11/+11
| | | | | | | | | | | | | | | | | | | | | | The 'start' variant of the remove-brick command only applies at the dht level wherein we can remove all the bricks of a sub-volume (and remove multiple such sub-volumes) but not select bricks of it. This patch disallows removing individual replica bricks of multiple sub-volumes (i.e. reducing the replcia count of the volume) using remove-brick 'start'. The preferred method for such an operation is to use commit force. This patch also reverts the check to prevent removal of bricks from a replicate volume (commit 0d415f7) BUG: 961669 Change-Id: I447ad27f73a0963b5e09fb317bf7267a7a5a6147 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/5566 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: release big locks while doing mountAnand Avati2013-08-181-0/+4
| | | | | | | | | | | | | Else things can deadlock in getspec v/s glusterd_do_mount() Change-Id: Ie70b43916e495c1c8f93e4ed0836c2fb7b0e1f1d BUG: 997576 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/5636 Tested-by: Joe Julian <joe@julianfamily.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
* glusterd: Try to start all bricks on 'start force'Kaushal M2013-08-181-2/+7
| | | | | | | | | | | | | | | | | | | A volume would fail to start if any one of the bricks fails staging or fails to start, even with the 'force' option. With this patch, when the 'force' option is given for a volume start, glusterd will continue and start other bricks even if one fails staging or starting. Also did a small fix in changelog, to prevent it crashing when it fails to init. Change-Id: I7efbd9ab13d12d69b0335ae54143fa17586f8f98 BUG: 994375 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/5510 Reviewed-by: Venky Shankar <vshankar@redhat.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cli: Add server uuid into volume brick info xmlTimothy Asir2013-08-181-0/+10
| | | | | | | | | | | | | | | | | Add server uuid as an attribute to the existing brick details in the volume info cli xml output. Currently, when a node has more than one ip, the oVirt-engine fails to map the corresponding server using the ip alone. If we get the host uuid along with brick details in volume info command it will be easy for ovirt-engine to find out the server and thereby we can avoid confusion in finding the server. Change-Id: I3c9c9acea80e10e0b2977477759d9af045e48959 BUG: 955588 Signed-off-by: Timothy Asir <tjeyasin@redhat.com> Reviewed-on: http://review.gluster.org/4875 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Move certain logs into 'DEBUG' levelHarshavardhana2013-08-181-3/+3
| | | | | | | | | | | | Confusing "Error" messages in logs can cause user panic and false positives - avoid them as necessary in future. Change-Id: I906c64eea879b19a8db099c89d1d7f874e5530db BUG: 995784 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/5555 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: remove-brick:Allow simultaneous removal of multiple subvolumes.Ravishankar N2013-08-132-36/+89
| | | | | | | | | | | | | | | | | Currently, remove-brick supports removal of only one distributed stripe/ replica pair at a time. Fix it to support removal of multiple pairs. This is consistent with add-brick behaviour which supports adding multiple stripe/replica pairs simultaneously. Removal is successful irrespective of the order of the bricks given at the CLI, as long as the bricks are from the same subvolume(s). Change-Id: I7c11c1235ce07b124155978b9d48d0ea65396103 BUG: 974007 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/5210 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
* Correcting a log message in glusterd-geo-rep.cM S Vishwanath Bhat2013-08-051-1/+1
| | | | | | | | | | Change-Id: I4352f513fc5616daa20e9a4ad51a63fb13a27dff BUG: 847839 Signed-off-by: M S Vishwanath Bhat <vbhat@redhat.com> Reviewed-on: http://review.gluster.org/5472 Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Add switch and nufa options to 'gluster cli'Harshavardhana2013-08-032-17/+53
| | | | | | | | | Change-Id: Ic3c43291e0e1ead0d89c0436e8d70aa5dee2f543 BUG: 924488 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/5391 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cli,glusterd: Fix when tasks are shown in 'volume status'Kaushal M2013-08-031-0/+4
| | | | | | | | | | | | | Asynchronous tasks are shown in 'volume status' only for a normal volume status request for either all volumes or a single volume. Change-Id: I9d47101511776a179d213598782ca0bbdf32b8c2 BUG: 888752 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/5308 Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Use volume op-versions during volgenKaushal M2013-08-024-19/+14
| | | | | | | | | | | | | | Instead of using the cluster op-version, volume op-version is used to enable open-behind during volgen. For doing this, the volume op-versions are updated before regenerating the volfiles. Change-Id: I675bb549bf7c7c0279030dca698fb530781addc6 BUG: 990830 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/5385 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: Re-initialize skipped file count in glusterdshishir gowda2013-07-311-0/+1
| | | | | | | | | Change-Id: I42d08b3a6a7a3839f5e9953e1f83959222c080f8 Signed-off-by: shishir gowda <sgowda@redhat.com> BUG: 989846 Reviewed-on: http://review.gluster.org/5446 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd : Checking session created or not in case of geo-rep stopAvra Sengupta2013-07-311-2/+6
| | | | | | | | | | | | | | | | | Performing statefile check in case of geo-rep stop, so as to provide proper error message in case session is not created. However in case of geo-rep stop force, we allow the command to succeed even in case that the session is not created, because the stop command is a failsafe command to stop running geo-rep sessions on any nodes. Change-Id: I2b6a0253de977633606c422cbbc9e37cede9a268 BUG: 989541 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/5417 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd : initiating gsyncd restart during add-brickAvra Sengupta2013-07-313-25/+150
| | | | | | | | | | | | | | | | | | | | | | | During add-brick, when a new brick is added in one of the nodes that was already a part of the existing volume, and gsyncd was already running on that node, then all gsyncd processes running on that node, for that particular master and any slave sessions will be restarted If a new brick is added in a new node, then after adding the brick, the user has to perform the following steps: 1. gluster system:: execute gsec_create 2. gluster volume geo-replication <master-vol> <slave-vol> create push-pem force 3. gluster volume geo-replication <master-vol> <slave-vol> start force Change-Id: I4b9633e176c80e4a7cf33f42ebfa47ab8fc283f1 BUG: 989532 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/5416 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* mgmt/glusterd: Fix a minor typo.Vijay Bellur2013-07-311-1/+1
| | | | | | | | | | | Thanks to Patrick Matthäi <pmatthaei@debian.org> for the patch. Signed-off-by: Vijay Bellur <vbellur@redhat.com> Change-Id: I59da74298894ccc2ab30967ffe44cc844aa73f82 BUG: 814534 Reviewed-on: http://review.gluster.org/5436 Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Anand Avati <avati@redhat.com>
* cluster/dht: Treat migration failures due to space constraints as skippedshishir gowda2013-07-302-0/+28
| | | | | | | | | | | | | | | | Currently rebalance/remove-brick op's display migration failed count even for files which failed due to space issues (not enough space for file, or migration leading to cluster imbalance) These will now be counted as skipped, and rebalance/remove-brick status will display the additional counter Change-Id: I674904d380b5f8300e9ca9e6af557c3d30d6cff4 BUG: 989846 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/5399 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Fixing create force issues while it returned true everytime.Avra Sengupta2013-07-291-42/+71
| | | | | | | | | | | | | | | | | | | | Now geo-rep create force will return true if a node is down, and log an appropriate message. It will also return true with an appropriate log message if the slave verification fails. However it will not return true if the config file is deleted, ot corrupted, so as not to get the state_file's path. It will also fail if the slave url is invalid. If the push-pem option is given and /var/lib/glusterd/geo-replication/common_secret.pem.pub is not present, then also the create force command will fail. Change-Id: Ie7532a0884ddf9c3008bd30832d171d5b53b540e BUG: 988314 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/5405 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cli, glusterd: Cleanup logging of bd op commands.Vijay Bellur2013-07-271-1/+0
| | | | | | | | | | | | | | This patch prevents messages of the form "bd op: %s : SUCCESS" from being logged in .cmd_log_history. Change-Id: Iebeb7e26d409bf99b9c8df0a5c1c5a5d30d78a61 BUG: 823081 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/4871 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: M. Mohan Kumar <mohan@in.ibm.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* mgmt/glusterd: let each brick write the valgrind o/p to different fileRaghavendra Bhat2013-07-261-1/+5
| | | | | | | | | | | | | Till now all the brick processes were writing the valgrind information to the same log file. Change-Id: I0251c943935e2901b729c71f21d0677edb9f6867 BUG: 922877 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/5394 Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd/cli changes for distributed geo-repAvra Sengupta2013-07-2612-574/+2677
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commands: gluster system:: execute gsec_create gluster volume geo-rep <master> <slave-url> create [push-pem] [force] gluster volume geo-rep <master> <slave-url> start [force] gluster volume geo-rep <master> <slave-url> stop [force] gluster volume geo-rep <master> <slave-url> delete gluster volume geo-rep <master> <slave-url> config gluster volume geo-rep <master> <slave-url> status The geo-replication is distributed. The session will be created, and gsyncd will be spawned on all relevant nodes, instead of only one node. geo-rep: Collecting status detail related data Added persistent store for saving information about TotalFilesSynced, TotalSyncTime, TotalBytesSynced Changes in the status information in socket: Existing(Ex): FilesSynced=2;BytesSynced=2507;Uptime=00:26:01; New(Ex): FilesSynced=2;BytesSynced=2507;Uptime=00:26:01;SyncTime=0.69978; TotalSyncTime=2.890044;TotalFilesSynced=6;TotalBytesSynced=143640; Persistent details stored in /var/lib/glusterd/geo-replication/${mastervol}/${eSlave}-detail.status Change-Id: I1db7fc13ffca2e415c05200b0109b1254067f111 BUG: 847839 Original Author: Avra Sengupta <asengupt@redhat.com> Original Author: Venky Shankar <vshankar@redhat.com> Original Author: Aravinda VK <avishwan@redhat.com> Original Author: Amar Tumballi <amarts@redhat.com> Original Author: Csaba Henk <csaba@redhat.com> Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/5132 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* gsyncd: distribute the crawling loadAvra Sengupta2013-07-262-1/+15
| | | | | | | | | | | | | | | | | | | * also consume changelog for change detection. * Status fixes * Use new libgfchangelog done API * process (and sync) one changelog at a time Change-Id: I24891615bb762e0741b1819ddfdef8802326cb16 BUG: 847839 Original Author: Csaba Henk <csaba@redhat.com> Original Author: Aravinda VK <avishwan@redhat.com> Original Author: Venky Shankar <vshankar@redhat.com> Original Author: Amar Tumballi <amarts@redhat.com> Original Author: Avra Sengupta <asengupt@redhat.com> Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/5131 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: implement batched fsync in a single threadAnand Avati2013-07-231-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Because of the extra fsync()s issued by AFR transaction, they could potentially "clog" all the io-threads denying unrelated operations from making progress. This patch assigns a dedicated thread to issues fsyncs, as an experimental feature to understand performance characteristics with the approach. As a basis, incoming individual fsync requests are grouped into batches, falling in the same @batch-fsync-delay-usec window of time. These windows can extend in practice, as processing of the previous batch can take longer than @batch-fsync-delay-usec while new requests are getting batched. The feature support three modes (similar to the -S modes of fs_mark) - syncfs: In this mode one syncfs() is issued per batch, instead of N fsync()s (one per file.) - syncfs-single-fsync: In this mode one syncfs() is issued per batch (which, on Linux, guarantees the completion of write-out of dirty pages in the filesystem up to that point) and one single fsync() to synchronize or flush the controller/drive cache. This corresponds to -S 2 of fsmark. - syncfs-reverse-fsync: In this mode, one syncfs() is issued per batch, and all the open files in that batch are fsync()'ed in the reverse order of the queue. This corresponds to -S 4 of fsmark. - reverse-fsync: In this mode, no syncfs() is issued and all the files in the batch are fsync()'ed in the reverse order. This corresponds to -S 3 of fsmark. Change-Id: Ia1e170a810c780c8d80e02cf910accc4170c4cd4 BUG: 927146 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/4746 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/changelog: changelog translatorAvra Sengupta2013-07-223-11/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the initial version of the Changelog Translator. What is it ----------- Goal is to capture changes performed on a GlusterFS volume. The translator needs to be loaded on the server (bricks) and captures changes in a plain text file inside a configured directory path (controlled by "changelog-dir", should be somewhere in <export>/.glusterfs/changelog by default). Changes are classified into 3 types: - Data: : TYPE-I - Metadata : TYPE-II - Entry : TYPE-III Changelog file is rolled over after a certain time interval (defauls to 60 seconds) after which a changelog is started. The thing to be noted here is that for a time interval (time slice) multiple changes for an inode are recorded only once (ie. say for 100+ writes on an inode that happens within the time slice has only a single corresponding entry in the changelog file). That way we do not bloat up the changelog and also save lots of writes. Changelog Format ----------------- TYPE-I and TYPE-II changes have the gfid on the entity on which the operation happened. TYPE-III being a entry op requires the parent gfid and the basename. Changelog format has been kept to a minimal and it's upto the consumers to do the heavy loading of figuring out deletes, renames etc.. A single changelog file records all three types of changes, with each change starting with an identifier ("D": DATA, "M": METADATA and "E": ENTRY). Option is provided for the encoding type (See TUNABLES). Consumers ---------- The only consumer as of today would be geo-replication, although backup utilities, self-heal, bit-rot detection could be possible consumers in the future. CLI ---- By default, change-logging is disabled (the translator is present in the server graph but does nothing). When enabled (via cli) each brick starts to log the changes. There are a set of tunable that can be used to change the translators behaviour: - enable/disable changelog (disabled by default) gluster volume set <volume> changelog {on|off} - set the logging directory (<brick>/.glusterfs/changelogs is the default) gluster volume set <volume> changelog-dir /path/to/dir - select encoding type (binary (default) or ascii) gluster volume set <volume> encoding {binary|ascii} - change the rollover time for the logs (60 secs by default) gluster volume set <volume> rollover-time <secs> - when secs > 0, changelog file is not open()'d with O_SYNC flag - and fsync is trigerred periodically every <secs> seconds. gluster volume set <volume> fsync-interval <secs> features/changelog: changelog consumer library (libgfchangelog) A shared library is provided for the consumer of the changelogs for easy acess via APIs. Application can link against this library and request for changelog updates. Conversion of binary logs to human-readable ascii format is also taken care by the library which keeps a copy of the changelog in application provided working directory. Change-Id: I75575fb7f1c53d2bec3dba1a329ea7bb3c628497 BUG: 847839 Original Author: Venky Shankar <vshankar@redhat.com> Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/5127 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Give up biglock before brick's rpc unrefKrishnan Parthasarathi2013-07-111-1/+5
| | | | | | | | | | | | | This is to prevent the possibility of a deadlock when rpc_connection_cleanup being called in the same thread as rpc_clnt_unref Change-Id: Ia4dcc0a8a6e6158d4ddec68b780fccbc4cd64adb BUG: 962619 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/5321 Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Correct op-version of some optionsKaushal M2013-07-111-23/+23
| | | | | | | | | | | | | | | New options being introduced in the master branch should now have op-version set to the GD_OP_VERSION_MAX (3). Some of the options have been backported to release-3.3 branch and hence should have their op-version reduced. Some other options had op-version incorrectly set as 1. Change-Id: If40325b7b2da7aa36f90261024117cd18cf51ef0 BUG: 981278 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/5318 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd/common-utils: move hostname helper functions to common-utilsKrishnan Parthasarathi2013-07-046-255/+20
| | | | | | | | | Change-Id: If47e209cb61ea0eb74ee2d6ef9e9342b2d6ee13a BUG: 980838 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/5261 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* posix: add a simple health-checkerNiels de Vos2013-07-031-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Goal of this health-checker is to detect fatal issues of the underlying storage that is used for exporting a brick. The current implementation requires the filesystem to detect the storage error, after which it will notify the parent xlators and exit the glusterfsd (brick) process to prevent further troubles. The interval the health-check runs can be configured per volume with the storage.health-check-interval option. The default interval is 30 seconds. It is not trivial to write an automated test-case with the current prove-framework. These are the manual steps that can be done to verify the functionality: - setup a Logical Volume (/dev/bz970960/xfs) and format is as XFS for brick usage - create a volume with the one brick # gluster volume create failing_xfs glufs1:/bricks/failing_xfs/data # gluster volume start failing_xfs - mount the volume and verify the functionality - make the storage fail (use device-mapper, or pull disks) # dmsetup table .. bz970960-xfs: 0 196608 linear 7:0 2048 # echo 0 196608 error > dmsetup-error-target # dmsetup load bz970960-xfs dmsetup-error-target # dmsetup resume bz970960-xfs # dmsetup table ... bz970960-xfs: 0 196608 error - notice the errors caught by syslog: Jun 24 11:31:49 vm130-32 kernel: XFS (dm-2): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 5 buf count 512 Jun 24 11:31:49 vm130-32 kernel: XFS (dm-2): I/O Error Detected. Shutting down filesystem Jun 24 11:31:49 vm130-32 kernel: XFS (dm-2): Please umount the filesystem and rectify the problem(s) Jun 24 11:31:49 vm130-32 kernel: VFS:Filesystem freeze failed Jun 24 11:31:50 vm130-32 GlusterFS[1969]: [2013-06-24 10:31:50.500674] M [posix-helpers.c:1114:posix_health_check_thread_proc] 0-failing_xfs-posix: health-check failed, going down Jun 24 11:32:09 vm130-32 kernel: XFS (dm-2): xfs_log_force: error 5 returned. Jun 24 11:32:20 vm130-32 GlusterFS[1969]: [2013-06-24 10:32:20.508690] M [posix-helpers.c:1119:posix_health_check_thread_proc] 0-failing_xfs-posix: still alive! -> SIGTERM - these errors are in the log of the brick as well: [2013-06-24 10:31:50.500607] W [posix-helpers.c:1102:posix_health_check_thread_proc] 0-failing_xfs-posix: stat() on /bricks/failing_xfs/data returned: Input/output error [2013-06-24 10:31:50.500674] M [posix-helpers.c:1114:posix_health_check_thread_proc] 0-failing_xfs-posix: health-check failed, going down [2013-06-24 10:32:20.508690] M [posix-helpers.c:1119:posix_health_check_thread_proc] 0-failing_xfs-posix: still alive! -> SIGTERM - the glusterfsd process has exited correctly: # gluster volume status Status of volume: failing_xfs Gluster process Port Online Pid ------------------------------------------------------------------------------ Brick glufs1:/bricks/failing_xfs/data N/A N N/A NFS Server on localhost 2049 Y 1897 Change-Id: Ic247fbefb97f7e861307a5998a9a7a3ecc80aa07 BUG: 971774 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/5176 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Provide an option to disable afr durabilityPranith Kumar K2013-07-031-0/+5
| | | | | | | | | Change-Id: I40eec20ca6b3f857245a2438883822e251077ee9 BUG: 979365 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5269 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: More checks before starting rebalance/remove-brickKaushal M2013-07-024-12/+60
| | | | | | | | | | | | Check if a previous remove-brick operation has been committed before starting a new rebalance/remove-brick task. Change-Id: I553e5ba64a6a352ca91032ab1a17997051a4494e BUG: 963541 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/5019 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* rpc: duplicate request cache for nfsRajesh Amaravathi2013-06-215-57/+66
| | | | | | | | | | | | | | | Duplicate request cache provides a mechanism for detecting duplicate rpc requests from clients. DRC caches replies and on duplicate requests, sends the cached reply instead of re-processing the request. Change-Id: I3d62a6c4aa86c92bf61f1038ca62a1a46bf1c303 BUG: 847624 Signed-off-by: Rajesh Amaravathi <rajesh@redhat.com> Reviewed-on: http://review.gluster.org/4049 Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* store: move glusterd_store functions from mgmt/glusterd to libglusterfsNiels de Vos2013-06-206-867/+179
| | | | | | | | | | | | | Making the glusterd_store_* functions re-usable will help with future changes that need to read/write lists of items. BUG: 904065 Change-Id: I99fb8eced76d12d5a254567eccff9790b43d8da3 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/4676 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Log peer op status at the appropriate timeKrutika Dhananjay2013-06-186-72/+284
| | | | | | | | | | Change-Id: Ia8e1af082078f2f791708ba4faa4992bf291dd6e BUG: 961339 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/5023 Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Disable transport before cleaning up rpc objectKrishnan Parthasarathi2013-06-183-19/+99
| | | | | | | | | | | | | | | | | | | | | | | | | Problem: rpc_transport object, which is part of rpc_clnt, is destroyed prematurely. This is because, rpc_transport object is ref'd by socket layer and rpc layer. These ref's, until the synctask'izing of operations, were unref'd sequentially in the epoll thread. With more threads at play, the sequential unref guarantee is off. Fix: Shutting down the transport before proceeding with cleaning up of rpc_clnt object would serialize the unref's on the rpc_transport object and thus eliminating the race. Also, we don't store the address of brickinfo in brick's rpc notify function, to avoid the possibility of referring a freed brickinfo. Instead we use a string based id to 'reach' the corresponding brickinfo. Change-Id: If2739e2eeaee1e8b071ab2b6754b7ea0f81cfceb BUG: 962619 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/5000 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* nfs: option to disable aclRajesh Amaravathi2013-06-152-0/+13
| | | | | | | | | | | | | | | | 1. Option to disable or enable acl with nfs.acl boolean option. 2. Deregister the acl service with the portmapper service when no longer required. Change-Id: I6562b6b40138d040aa2bf1e5641f4c0e0e9f9d09 BUG: 970070 Signed-off-by: Rajesh Amaravathi <rajesh@redhat.com> Reviewed-on: http://review.gluster.org/5136 Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Ignore directories matching *.tmp in storeKrishnan Parthasarathi2013-06-131-0/+1
| | | | | | | | | | | | store being glusterd's persistent store under /var/lib/glusterd/ Change-Id: I1c01a09a8ce4a73ea612f05e7f14d4ab39ad1628 BUG: 971796 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/5177 Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* mgmt/glusterd, socket: Change logging for brick disconnectsPranith Kumar K2013-06-111-2/+6
| | | | | | | | | | | | | | | | | | For unix path based sockets, the socket path is cryptic (md5sum of path) and may not be useful for the user in debugging so log it in DEBUG. Changed logging in brick_rpc_notify to log brickinfo for disconnects. Change-Id: I69174bbbbde8352d38837723e950ad8fc15232aa BUG: 963153 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5009 Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Add a cmd for getting uuid of local nodeKrishnan Parthasarathi2013-06-101-0/+99
| | | | | | | | | | | | | | | | | | | Usage: gluster system:: uuid get This is needed since we generate uuid of a node in a lazy manner. ie, we generate a uuid for the node only on the first volume or peer operation, when the node needs an external identity. With this command, we can force[1] the uuid generation, without a volume or peer operation performed. [1]: Querying for uuid (or uuid get), forces uuid to come into existence. Change-Id: I62c8b6754117756aa4d773dd48af4ddeb1a1d878 BUG: 971661 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/5175 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaushal M <kaushal@redhat.com>
* nfs: gluster volume set help shows null as default valueRajesh Joseph2013-06-061-11/+17
| | | | | | | | | | | | | | Bug(967445): The default value for all nfs options is displayed as "(null)" Fix: Changed nfs options to show default value. Change-Id: I3b1f27439c19a6655f7dcc7891df40706db9e474 BUG: 967445 Signed-off-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-on: http://review.gluster.org/5098 Reviewed-by: Santosh Pradhan <spradhan@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* mgmt/glusterd: Set task op at the time of task-id setPranith Kumar K2013-06-062-0/+2
| | | | | | | | | | | | | | | | | | Problem: If a remove-brick start is executed on m1 with brick from m2 on local subvolume no rebalance process is launched. Because of this volinfo->rebal.op is not set. This leads to volume status failures. Fix: Set rebal.op even when the reblance process is not started. Change-Id: I71c7e6f09353be14c1e8edca3c8685ebfdf226d6 BUG: 964059 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5030 Reviewed-by: Kaushal M <kaushal@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd-syncop: Fix unlocking and collating errorsKaushal M2013-06-043-50/+86
| | | | | | | | | | | | | * Only those peers which were locked need to be unlocked. * Fix location of collating errors in callbacks. The callback functions could miss collating errors if there was an rpc error. Change-Id: Ie27c2f1ec197da4f5077a4d6e032127954ce87cd BUG: 948686 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/5087 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* mgmt/glusterd: Make sure peerinfo->uuid_str is assignedPranith Kumar K2013-05-314-5/+24
| | | | | | | | | Change-Id: I9e2743ab61c8baee92a1dfd376ec4bb145776176 BUG: 963524 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5016 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd-volgen: Improve volume op-versions calculationKaushal M2013-05-314-488/+616
| | | | | | | | | | | | | | | | | | Volume op-versions calculations now take into account if an option, a. enables/disables an xlator, or b. is a boolean option. This prevents op-versions from being updated when a feature is disabled. Also, correctly close the dynamically loaded xlators in xlator_volopt_dynload() and prevent leaks. Change-Id: I895ddeeec6f6a33e509325f0ce6f01b7aad3cf5c BUG: 954256 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/4952 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd-volgen: Enable open-behind based on op-versionKaushal M2013-05-283-11/+50
| | | | | | | | | | | | | | | This patch enables the open-behind by default only when the op-version allows it. Also the volume op-version calculations take account of this enablement. Change-Id: Idf7a3c274ec4828aafc815cdd1df829ecb853354 BUG: 954256 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/4866 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Allow volume start force to succeed if brick directories are recreatedKrutika Dhananjay2013-05-251-1/+15
| | | | | | | | | | | Change-Id: I4fc3c5c829adca256bb131f4a2722abc95741158 BUG: 963665 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/5020 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Give up biglock during rpc conn cleanupKrishnan Parthasarathi2013-05-231-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | glusterd could deadlock after a peer-detach command as follows, 1) glusterd_friend_cleanup function 'flushes' out messages in the rpc layer's queue, that haven't received a response. At this point, glusterd has already acquired the big lock. 2) The side-effect of flushing out the messages is that the corresponding call backs are called. Call backs themselves are executed after acquiring the big lock. This results in the big lock being acquired in a nested manner (in the same thread), which causes a deadlock. This can also happen during brick/NFS/SHD disconnect in volume-stop. Change-Id: Iab3aad143cd8ebbab53ea0b69687f0e7627dc8a9 BUG: 965533 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/5061 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* glusterd: Set op_errstr to error string received from peerKrutika Dhananjay2013-05-161-0/+4
| | | | | | | | | | | | ... in case of volume op failure on remote host Change-Id: I7177dc02369dffa82f217496559532d18b7c7c7a BUG: 963628 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/5018 Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* rpc-transport: Moved unix socket options function to rpc-transportKrishnan Parthasarathi2013-05-162-6/+6
| | | | | | | | | | | | | This change removes the asymmetry in the 'layer' (read rpc, transport etc) in which transport options were being filled for inet and unix sockets. Change-Id: Iaa080691fd5e4c3baedffa97e9c3f16642c1fc12 BUG: 955919 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/4850 Reviewed-by: Raghavendra G <raghavendra@gluster.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd: Fix misleading log messages of the kind "Node <n> responded to <n>"Krutika Dhananjay2013-05-163-10/+13
| | | | | | | | | | | | | | | | | | | | | PROBLEM: glusterd logs coming from glusterd_xfer_friend_add_resp() (wrongly) indicate that a node responded to itself, although it actually responded to one of its peers. FIX: Make glusterd_xfer_friend_add_resp() distinguish between remote host and self and print the appropriate hostname. Change-Id: I2a504eeb058c08a0d378443888eb6f1dc7edc76f BUG: 963537 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/5017 Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd: Start bricks on glusterd startup, only onceKrishnan Parthasarathi2013-05-154-5/+7
| | | | | | | | | | | | | | | The restarting of bricks has been deffered until the cluster 'stabilizes' itself volumes' view. Since glusterd_spawn_daemons is executed everytime a peer 'joins' the cluster, it may inadvertently restart bricks that were taken offline for say, maintenance purposes. This fix avoids that. Change-Id: Ic2a0a9657eb95c82d03cf5eb893322cf55c44eba BUG: 960190 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/4973 Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>