summaryrefslogtreecommitdiffstats
path: root/xlators/mgmt/glusterd/src/glusterd-geo-rep.c
Commit message (Collapse)AuthorAgeFilesLines
* glusterd/geo-rep: Fix glusterd crash in non-originator slave node.Kotresh HR2014-11-121-0/+1
| | | | | | | | | | | | | | | | | | | | Problem: glusterd crashes in non-originator slave node during geo-rep create push-pem. Cause: In glusterd_op_copy_file, the value of the key "common_pem_contents" is freed explicitly even after dict_set is successful when it is taken cared by dict_free. Solution: Free only in failure cases before dict_set. BUG: 1159210 Change-Id: I726f923915fc24de6588469c27f2cc996c20c59d Reviewed-On: http://review.gluster.org/9018/ Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/9026 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* geo-rep/glusterd: Enable changelog and marker during geo-rep create.Kotresh HR2014-11-121-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PROBLEM: Geo-rep misses few a files to sync when I/O happenned during geo-rep start. ANALYSES: To use the available changelogs to handle deletes/renames, 'xsync upper limit' is introduced which limits the xsync crawl till the changelog register time. But there is a small time interval between the changelog register time and the time changelog actually enabled. If there is I/O between this interval, it will not be synced through xsync as it is beyond changelog register time and not through changelog also as changelog is not actually enabled. SOLUTION: Enable changelog and marker during geo-rep create instead of geo-rep start so that entries are captured in changelog and above said interval is nullified. BUG: 1159205 Change-Id: If5203eb1cfcbde3999f97a5f1a6a1af4875ac358 Reviewed-on: http://review.gluster.org/8650 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/9023 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* glusterd/geo-rep: Fix race in updating status fileKotresh HR2014-11-121-18/+26
| | | | | | | | | | | | | | | | | | | | | | | | When geo-rep is in paused state and a node in a cluster is rebooted, the geo-rep status goes to "faulty (Paused)" and no worker processes are started on that node yet. In this state, when geo-rep is resumed, there is a race in updating status file between glusterd and gsyncd itself as geo-rep is resumed first and then status is updated. glusterd tries to update to previous state and gsyncd tries to update it to "Initializing...(Paused)" on restart as it was paused previously. If gsyncd on restart wins, the state is always paused but the process is not acutally paused. So the solution is glusterd to update the status file and then resume. BUG: 1159195 Change-Id: I4c06f42226db98f5a3c49b90f31ecf6cf2b6d0cb Reviewed-on: http://review.gluster.org/8911 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/9021 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* geo-rep/glusterd: API to check active geo-rep session for the volumeKotresh H R2014-09-081-48/+106
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Requirement: Snapshot needs an API to fail the CLI if any geo-rep session is active for that volume. Solution: A function "gd_vol_is_geo_rep_active" is provided to check if any geo-rep session is active for that volume. An in memory dict called 'gsync_running_slaves' is maintained in 'volinfo' structure to keep track of active geo-rep session for the volume. The key 'slavenode::slavevol' with value 'running' is added whenever geo-rep is started/resumed into the dict and the same is removed if stopped/paused. So the 'count' in dict is used to decide whether the geo-rep is active or not for that volume. Also added "this->name" in gf_log in routines which this patch is touched. BUG: 1138952 Change-Id: Ib13aeb509a56edf510651b77e20bf3cc43a3e763 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/8459 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/8645 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cli: Xml output for geo-replication status commandndarshan2014-08-281-0/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds xml output for geo-replication status and status detail command. sample: -------------------------------------------------------------- <geoRep> <volume> <name>master</name> <sessions> <session> <session_slave>:2a301d66-b9d2-44b4-b827-d680d67123eb:ssh://XXXXXXXXXX::slave</session_slave> <pair> <master_node>localhost.localdomain</master_node> <master_node_uuid>2a301d66-b9d2-44b4-b827-d680d67123eb</master_node_uuid> <master_brick>/root/master_b1</master_brick> <slave>ssh://XXXXXXXXXXX::slave</slave> <status>faulty</status> <checkpoint_status>N/A</checkpoint_status> <crawl_status>N/A</crawl_status> </pair> </session> </sessions> </volume> </geoRep> ------------------------------------------------------------- Change-Id: Ia19dbe751c3ab1ec7cb8923cdd6c8b99c374072f BUG: 1133464 Signed-off-by: ndarshan <dnarayan@redhat.com> Reviewed-on: http://review.gluster.org/8089 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaushal M <kaushal@redhat.com> Signed-off-by: ndarshan <dnarayan@redhat.com> Reviewed-on: http://review.gluster.org/8532 Reviewed-by: Sachin Pandit <spandit@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd/geo-rep: Create the conf file path correctlyAvra Sengupta2014-06-121-51/+66
| | | | | | | | | | | | | | In case of mount brocker, the conf file path needs to be correctly created, and then fetch the status file Change-Id: Iaa1b04ee46f10961a7056e834170d68282c36efa BUG: 1104649 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/7977 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* glusterd/geo-rep: Fix the resource leaks.Poornima2014-06-121-15/+16
| | | | | | | | | Change-Id: Ic741250999880bdf9e1226cd3eefa791fb66a888 BUG: 789278 Signed-off-by: Poornima <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/6905 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep/glusterd: Fix glusterd crash during geo-rep config reset.Kotresh H R2014-06-121-1/+1
| | | | | | | | | | | | | | | | | | | | Problem: When "\!<config-option>" (i.e., reset) is used in geo-rep config, glusterd crashes with NULL pointer dereference of 'op_value' in glusterd_gsync_op_already_set. Solution: glusterd_gsync_op_already_set should be called only for geo-rep set option. So bypass glusterd_gsync_op_already_set for reset. Change-Id: I9d8ef242abf02011139d76a72564f66d68cc8585 BUG: 1107984 Signed-off-by: Kotresh H R <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/8032 Reviewed-by: Avra Sengupta <asengupt@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* gsyncd / geo-rep: FSH recommended log locationsVenky Shankar2014-06-101-1/+14
| | | | | | | | | | | | | | | | Upgrading "working_dir" on the fly is a bit unclean yet (though it works) as currently config upgrade does not support "old" values to be expanded by using configuration variables. Change-Id: I44ed65c281f2e0ce3b6b467addc5c1c88ac674e7 BUG: 1077516 Signed-off-by: Venky Shankar <vshankar@redhat.com> Signed-off-by: Kotresh H R <khiremat@redhat.com> Signed-off-by: Aravinda VK <avishwan@redhat.com> Signed-off-by: Ajeet Jha <ajha@redhat.com> Reviewed-on: http://review.gluster.org/7375 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* feature/geo-rep: Fix to retain pause state of gsyncd on restart.Kotresh H R2014-06-051-2/+3
| | | | | | | | | | | | | | | A new gsyncd options '--pause-on-start' is introduced. When node reboots, if the status is paused, gsyncd is started with this option. After gsyncd spawns worker and agent, worker will send SIGSTOP to negative pid of monitor to enter pause mode. Change-Id: I5aad82c9a9fc8c243f384940b77d25e26e520d6d BUG: 1101410 Signed-off-by: Kotresh H R <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/7885 Reviewed-by: Aravinda VK <avishwan@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* glusterd/geo-rep: Allow gverify.sh and S56glusterd-geo-rep-create-post.shAvra Sengupta2014-05-141-16/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | to operate for non-root privileged slave volume Mounting the slave-volume on local node, to perform disk checks in order to allow gverify.sh to operate for non-root privileged slave volume Allowing the hook script S56glusterd-geo-rep-create-post.sh to operate for non-root privileged slave volume Modified peer_add_secret_pub.in to accept username as argument and add the pem keys to the users's_home_dir/.ssh/authorized_keys Wrote set_geo_rep_pem_keys.sh which accepts username as argument and copies the pem keys from the user's home directory to $GLUSTERD_WORKING_DIR/geo-replication/ and then copies the keys to other nodes in the cluster and add them to the respective authorized keys. The script takes as argument the user name and assumes that the user will be present in all the nodes in the cluster. It is not needed for root. To summarize: For a privileged slave user, execute the following on master node as super user: gluster system:: execute gsec_create gluster volume geo-replication <master_vol> [root@]<slave_ip>::<slave_vol> create push_pem For a non-privileged slave user execute the following on master node as super user: gluster system:: execute gsec_create gluster volume geo-replication <master_vol> <slave_user>@<slave_ip>::<slave_vol> create push_pem then on the slave node execute the following as super user: /usr/local/libexec/glusterfs/set_geo_rep_pem_keys.sh <slave_user> BUG: 1077452 Change-Id: I88020968aa5b13a2c2ab86b1d6661b60071f6f5e Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/7744 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* gsyncd / geo-rep: Partial support for Non-root geo-replication.Venky Shankar2014-05-141-100/+133
| | | | | | | | | | | | | | | | | | | | | | | | This patch enables geo-replication to be run as an unprivileged user. As of now, this is just the partial support, but is very close to achieve full functionality. Current limitation * Geo-replication executed Gluster CLI commands on the slave via SSH. On a non-root setup, Gluster CLI would run as an unprivileged user, failing to execute the command. As a workaround (for testing), setuid(2) Gluster CLI executable or use the glusterd option to accept commands by unprivileged CLI process. The nature of cli commands are "system::" commands (for key management) and remote volume info fetching. Remote volume info fetching has been modified to use --remote-host gluster cli option rather than ssh and remote cli execution. Change-Id: Ica89e2ba9b7f48fd6e1c876c477d7822dc693617 BUG: 1077452 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/7658 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* geo-rep/glusterd: Pause and Resume feature for geo-replicationKotresh H R2014-05-131-14/+263
| | | | | | | | | | | | | | | | | This patch introduces pause and resume cli command for geo-replication. Signed-off-by: Kotresh H R <khiremat@redhat.com> Change-Id: I4f5e58e9175fe85077d56088473252391fb57de7 BUG: 1093602 Signed-off-by: Kotresh H R <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/7643 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Avra Sengupta <asengupt@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* glusterd/geo-rep: Looks for state_file and pid-file in gsyncd_template.confAvra Sengupta2014-05-011-161/+383
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If entries like state_file or pid-file are missing in the gsyncd.conf or if the gsyncd.conf is also missing, glusterd looks for the missing configs in the gsyncd_template.conf status will display "Config Corrupted" as long as the entry is missing in the config file. Missing state-file entry in both config and template will not allow starting a geo-rep session. However stop force will successfully stop an already running session, if the state-file entries are missing in both the config file and the template, as long as either of them have a pid-file entry. if the pid-file entry is missing in the gsyncd.conf file, starting a geo-rep session will not be allowed. if the pid-file entry is missing in an already started session, then stop force will fetch it from the config template and stop the session. if the pid-file entry is missing in both the config and the template, stop force will fail with appropriate error stating pid-file entry is missing. Change-Id: I81d7cbc4af085d82895bbef46ca732555aa5365d BUG: 1059092 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/6856 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep/glusterd: Fix geo-rep status on introduction of volume lockKotresh H R2014-03-211-17/+14
| | | | | | | | | | | | | | | | | | | | Getting op context in 'glusterd_op_gsync_set' is no longer valid as it is expected that 'rsp_dict' sent from caller is filled. It was fine till now as no one was setting the op context. The introduction of volume locks sets it, consequently breaking geo-rep status command. Hence the code that gets dict from op context if present is removed. Also corrected some indentation issues in 'glusterd_op_gsync_set' Signed-off-by: Kotresh H R <khiremat@redhat.com> Change-Id: Ieacd6e6c9be3c92159f849caca2acf5aabca1e32 BUG: 1077697 Signed-off-by: Kotresh H R <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/7289 Reviewed-by: Avra Sengupta <asengupt@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* geo-rep: Fix gsyncd restart issue.Kotresh H R2014-02-121-7/+111
| | | | | | | | | | | | | | | | | | | | | | | New function 'glusterd_gsync_op_already_set' is written which compares the geo-rep configuration value in gsync.conf with the one sent from cli. The generic function is written to compare op_value for any op_name sent from cli as this issue can happen with any configuration setting other than use-tarssh also. This routine is used to avoid restart of gsyncd if the configuration value is set to same as in gsyncd.conf. Also added error checking when glusterd_gsync_configure fails. Earlier, eventhough 'glusterd_gsync_configure' fails, error was not getting catched and success message was shown. Change-Id: If4dcd0ffc09e6e67c79ba86238f03eff1b7c7645 BUG: 1060797 Signed-off-by: Kotresh H R <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/6897 Reviewed-by: Avra Sengupta <asengupt@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Fix a memory leak.Raghavendra Talur2014-02-081-0/+2
| | | | | | | | | | | | errmsg may have malloc'ed memory, free it before leaving function. Change-Id: I4ab3b9db7a48a5e256eb8a08b8ab49818ce6ca1b BUG: 789278 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: http://review.gluster.org/6902 Reviewed-by: Poornima G <pgurusid@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* geo-rep: Fixing a memory leak issue reported by CoverityLalatendu Mohanty2014-02-021-0/+1
| | | | | | | | | | | | | Issue: "errmsg" is allocated memory through GF_CALLOC in function "gf_strdup" Fix: using GF_FREE to free the memory Change-Id: Iee8f8d806ea995591feee8e4ed0a0798ad07a8c4 BUG: 789278 Signed-off-by: Lalatendu Mohanty <lmohanty@redhat.com> Reviewed-on: http://review.gluster.org/6740 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com>
* geo-rep: Fixing null pointer dereference of "op_value"Lalatendu Mohanty2014-01-151-1/+1
| | | | | | | | | Change-Id: Id39743eaa5a52cc7fd4e2a1378a23384f5ef1fed BUG: 789278 Signed-off-by: Lalatendu Mohanty <lmohanty@redhat.com> Reviewed-on: http://review.gluster.org/6700 Reviewed-by: Avra Sengupta <asengupt@redhat.com> Tested-by: Avra Sengupta <asengupt@redhat.com>
* glusterd/geo-rep : Allow volume to stop if geo-rep is not active.Kotresh H R2014-01-141-0/+152
| | | | | | | | | | | | | | In staging phase of volume stop, code is added to read the state_file for each slave of the master to which the volume belongs. If any of the geo-rep session is active with at least one slave, volume is not allowed to stop else it is allowed. Change-Id: I4a01a357fc86b872e9635b3d19998cdbd9545114 BUG: 1049727 Signed-off-by: Kotresh H R <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/6663 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep: Fixing an incorrect memcpy operation.Lalatendu Mohanty2014-01-091-1/+1
| | | | | | | | | | | | Currently we are copying a higher size variable to lower size variable i.e. copying a NAME_MAX to PATH_MAX sized variable in "memcpy (sts_val->worker_status, monitor_status, strlen(monitor_status));" Change-Id: I81dca8e81a4aea5545d5982aed20e05a5e08641c Signed-off-by: Lalatendu Mohanty <lmohanty@redhat.com> Reviewed-on: http://review.gluster.org/6667 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd/geo-rep: more glusterd and cli fixes for geo-rep.Ajeet Jha2013-12-121-192/+371
| | | | | | | | | | | | | | | | | | | | | | | | -> handle option validation cases in reset case. -> Creating valid conf path when glusterd restarts. -> Reading the gsyncd worker thread status and displaying it. -> Displaying status-detail per worker. -> Fetch checkpoint info in geo-rep status. -> use-tarssh value validation added. misc: misc geo-rep fixes based on cluster, logrotate etc.. -> cluster/dht: fix 'stime' getxattr getting overwritten. -> cluster/afr: return max of 'stime' values in subvol. -> geo-rep-logrotate: Sending SIGHUP to geo-rep auxiliary. -> cluster/dht: fix convoluted logic while aggregating. -> cluster/*: fix 'stime' min/max fetch logic. Change-Id: I811acea0bbd6194797a3e55d89295d1ea021ac85 BUG: 1036552 Signed-off-by: Ajeet Jha <ajha@redhat.com> Reviewed-on: http://review.gluster.org/6405 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@gmail.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Validating invalid log-level under geo-rep config options.Ajeet Jha2013-10-031-2/+14
| | | | | | | | | | | Change-Id: I8ff6b48ef41fd6e9ea68c42dfb9878f8a08ed627 BUG: 1010874 Signed-off-by: Ajeet Jha <ajha@redhat.com> Reviewed-on: http://review.gluster.org/5989 Reviewed-by: Avra Sengupta <asengupt@redhat.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com>
* glusterd/cli: Status detail cli parse check and vol geo status crash fixAvra Sengupta2013-09-211-7/+12
| | | | | | | | | | Change-Id: I1841864273fc4242de15fbfcf76fd5de40269f28 BUG: 1006249 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/5889 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd : Blocking invalid values for geo-rep config optionsAvra Sengupta2013-09-171-9/+57
| | | | | | | | | | | Change-Id: Ia9ee44763a9c2798b26d3225bf03a974d7ece21f BUG: 998962 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/5666 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Added missing MY_UUID conversion in gsync stagingAvra Sengupta2013-09-101-0/+2
| | | | | | | | | Change-Id: Ia4bf607e044d50636fb0a599a2ff91059f5b17aa BUG: 1006177 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/5887 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* gsyncd / geo-rep: "disjoint" cascading geo-replication sessionsVenky Shankar2013-09-041-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | Slave's xtime is now stored on the master itself (and that too only on the root), which implies it cannot be propogated to the cascaded slave. Thus the intermediate master now makes use of it's own volume information to propogate volume-mark and xtime. On starting Geo-Replication "geo-replication.ignore-pid-check" marker option is enabled, which is an override for the client-pid check in marker. This options triggers marker update only for geo-replication auxillary mount (client-pid == -1). Since gsyncd not does setxattr() directly on the bricks, this option won't trigger a chain of spurious metadata updates that would need to be processed by gsyncd. Change-Id: If50c5ef275dfb6b4ff4fd35be2565587e2fdf3e1 BUG: 996371 Original Author: Venky Shankar <vshankar@redhat.com> Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/5592 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Avra Sengupta <asengupt@redhat.com> Tested-by: Avra Sengupta <asengupt@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Added op-version checks to geo-rep commands.Venky Shankar2013-09-041-0/+43
| | | | | | | | | | | | | | Added op-version checks to all geo-rep commands. Min op-version should be 2. Change-Id: I942d897404e11e4d53123409731ba5cd252668fe BUG: 847839 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/5732 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd/gverify: Check for passwordless ssh in gverify.Venky Shankar2013-09-041-79/+74
| | | | | | | | | | Change-Id: I8c2d398114ad4534bcc052f9a5be8bbb2e7e2582 BUG: 999531 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/5677 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Allowing root@hostname::slave georep sessions to be created.Venky Shankar2013-09-041-23/+102
| | | | | | | | | | | | | | non-root@hostname::slave-vol geo-rep sessions are not supported. only hostname and root@hostname sessions are supported, and are treated as the same. Change-Id: I87551e1bd4ff4e0e6520c34eb3d944587cc65476 BUG: 998933 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/5659 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd/gverify.sh: Stops session being created with invalid slave detailsVenky Shankar2013-09-041-8/+68
| | | | | | | | | | | | | create force will fail with proper message, if the ip is not reachable, or is unable to fetch slave details. Change-Id: I44a3ba777b37702ffd0e48e9cb46c51e293327d4 BUG: 988314 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/5516 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Saving geo-rep session details in a more specific pathVenky Shankar2013-09-041-113/+126
| | | | | | | | | | | | | | | Now saving the session details in /var/lib/glusterd/geo-replication/<mastervol>_<slaveip>_<slavevol> repo to distinguish between two master-slave sessions where the slavename is same across two different clusters. Change-Id: I57c93f55cc9bd4fe2bffe579028aaf5e4335b223 BUG: 991501 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/5488 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* Correcting a log message in glusterd-geo-rep.cM S Vishwanath Bhat2013-08-051-1/+1
| | | | | | | | | | Change-Id: I4352f513fc5616daa20e9a4ad51a63fb13a27dff BUG: 847839 Signed-off-by: M S Vishwanath Bhat <vbhat@redhat.com> Reviewed-on: http://review.gluster.org/5472 Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd : Checking session created or not in case of geo-rep stopAvra Sengupta2013-07-311-2/+6
| | | | | | | | | | | | | | | | | Performing statefile check in case of geo-rep stop, so as to provide proper error message in case session is not created. However in case of geo-rep stop force, we allow the command to succeed even in case that the session is not created, because the stop command is a failsafe command to stop running geo-rep sessions on any nodes. Change-Id: I2b6a0253de977633606c422cbbc9e37cede9a268 BUG: 989541 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/5417 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd : initiating gsyncd restart during add-brickAvra Sengupta2013-07-311-13/+3
| | | | | | | | | | | | | | | | | | | | | | | During add-brick, when a new brick is added in one of the nodes that was already a part of the existing volume, and gsyncd was already running on that node, then all gsyncd processes running on that node, for that particular master and any slave sessions will be restarted If a new brick is added in a new node, then after adding the brick, the user has to perform the following steps: 1. gluster system:: execute gsec_create 2. gluster volume geo-replication <master-vol> <slave-vol> create push-pem force 3. gluster volume geo-replication <master-vol> <slave-vol> start force Change-Id: I4b9633e176c80e4a7cf33f42ebfa47ab8fc283f1 BUG: 989532 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/5416 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Fixing create force issues while it returned true everytime.Avra Sengupta2013-07-291-42/+71
| | | | | | | | | | | | | | | | | | | | Now geo-rep create force will return true if a node is down, and log an appropriate message. It will also return true with an appropriate log message if the slave verification fails. However it will not return true if the config file is deleted, ot corrupted, so as not to get the state_file's path. It will also fail if the slave url is invalid. If the push-pem option is given and /var/lib/glusterd/geo-replication/common_secret.pem.pub is not present, then also the create force command will fail. Change-Id: Ie7532a0884ddf9c3008bd30832d171d5b53b540e BUG: 988314 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/5405 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd/cli changes for distributed geo-repAvra Sengupta2013-07-261-508/+2179
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commands: gluster system:: execute gsec_create gluster volume geo-rep <master> <slave-url> create [push-pem] [force] gluster volume geo-rep <master> <slave-url> start [force] gluster volume geo-rep <master> <slave-url> stop [force] gluster volume geo-rep <master> <slave-url> delete gluster volume geo-rep <master> <slave-url> config gluster volume geo-rep <master> <slave-url> status The geo-replication is distributed. The session will be created, and gsyncd will be spawned on all relevant nodes, instead of only one node. geo-rep: Collecting status detail related data Added persistent store for saving information about TotalFilesSynced, TotalSyncTime, TotalBytesSynced Changes in the status information in socket: Existing(Ex): FilesSynced=2;BytesSynced=2507;Uptime=00:26:01; New(Ex): FilesSynced=2;BytesSynced=2507;Uptime=00:26:01;SyncTime=0.69978; TotalSyncTime=2.890044;TotalFilesSynced=6;TotalBytesSynced=143640; Persistent details stored in /var/lib/glusterd/geo-replication/${mastervol}/${eSlave}-detail.status Change-Id: I1db7fc13ffca2e415c05200b0109b1254067f111 BUG: 847839 Original Author: Avra Sengupta <asengupt@redhat.com> Original Author: Venky Shankar <vshankar@redhat.com> Original Author: Aravinda VK <avishwan@redhat.com> Original Author: Amar Tumballi <amarts@redhat.com> Original Author: Csaba Henk <csaba@redhat.com> Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/5132 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* gsyncd: distribute the crawling loadAvra Sengupta2013-07-261-0/+4
| | | | | | | | | | | | | | | | | | | * also consume changelog for change detection. * Status fixes * Use new libgfchangelog done API * process (and sync) one changelog at a time Change-Id: I24891615bb762e0741b1819ddfdef8802326cb16 BUG: 847839 Original Author: Csaba Henk <csaba@redhat.com> Original Author: Aravinda VK <avishwan@redhat.com> Original Author: Venky Shankar <vshankar@redhat.com> Original Author: Amar Tumballi <amarts@redhat.com> Original Author: Avra Sengupta <asengupt@redhat.com> Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/5131 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: big lock - a coarse-grained locking to prevent racesKrishnan Parthasarathi2013-04-121-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are primarily three lists that are part of glusterd process, that are concurrently accessed. Namely, priv->volumes, priv->peers and volinfo->bricks_list. Big-lock approach ----------------- WHAT IS IT? Big lock is a coarse-grained lock which protects all three lists, mentioned above, from racy access. HOW DOES IT WORK? At any given point in time, glusterd's thread(s) are in execution _iff_ there is a preceding, inbound network event. Of course, the sigwaiter thread and timer thread are exceptions. A network event is an external trigger to glusterd, via the epoll thread, in the form of POLLIN and POLLERR. As long as we take the big-lock at all such entry points and yield it when we are done, we are guaranteed that all the network events, accessing the global lists, are serialised. This amounts to holding the big lock at - all the handlers of all the actors in glusterd. (POLLIN) - all the cbks in glusterd. (POLLIN) - rpc_notify (DISCONNECT event), if we access/modify one of the three lists. (POLLERR) In the case of synctask'ized volume operations, we must remember that, if we held the big lock for the entire duration of the handler, we may block other non-synctask rpc actors from executing. For eg, volume-start would block in PMAP SIGNIN, if done incorrectly. To prevent this, we need to yield the big lock, when we yield the synctask, and reacquire on waking up of the synctask. Change-Id: Ib929f9905b55fb6c3fc27fefb497a26dba058e4f BUG: 948686 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/4784 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* geo-rep: retire old style ssh setupCsaba Henk2013-03-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Users are still using geo-rep with the old, deprecated, insecure, unsupported ssh setup. Not their fault -- the implementation of the new method had the following charasteristics: - old method is possible, but with default settings it's not working - it can be made operational by fiddling with "remote-gsyncd" tunable - with default setting, an unhelpful, actually misleading error message is produced - the UI gave no hint to the changes in the ssh setup http://review.gluster.org/4392 tried to fix these; what it accomplished was unrestricted support to the bad practice (by making the default old setup operational). From this on: - we disable the old method by reserving the "remote-gsyncd" tunable - if the old method is attempted, give a hint what to do Change-Id: Icade94725d8d8d2d4c89cab992d4226351637b86 BUG: 895656 Signed-off-by: Csaba Henk <csaba@redhat.com> Reviewed-on: http://review.gluster.org/4602 Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Made gsync set use synctask frameworkAvra Sengupta2013-02-081-5/+1
| | | | | | | | | Change-Id: I409fa5a9f55434ece47a8a51d4812d3eca42d269 BUG: 852147 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/4473 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: do dict unref after sending reply to cliKrutika Dhananjay2013-02-031-3/+0
| | | | | | | | | | | | | | | | | This patch channelizes dict unrefs of dictionaries created from the cli req during volume ops to one common function - glusterd_to_cli() - which is guaranteed to be called irrespective of whether the command succeeds or fails. This patch also removes extra unrefs at a few places. Change-Id: Ic8ba7166387b5dfd1f5ae860539e1b7093a94662 BUG: 861044 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/4003 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* geo-rep / glusterd: do non-blocking connect to checkpoint serviceCsaba Henk2013-01-211-0/+5
| | | | | | | | | | | glusterd should not hang if gsyncd ends up in some weird state Change-Id: Ic141daa0cd05d515848c8b6c25702418e15b7599 BUG: 826512 Signed-off-by: Csaba Henk <csaba@redhat.com> Reviewed-on: http://review.gluster.org/3919 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Add GF_ASSERT check in glusterd volume op handlersKrutika Dhananjay2013-01-081-2/+2
| | | | | | | | | | Change-Id: Iea6ac1e612812ba8ffc4b60899a9e574a3b09ea6 BUG: 873549 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/4346 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* gsyncd / geo-rep: do not start geo-rep if replace brick is in progressVenky Shankar2012-12-111-0/+10
| | | | | | | | | Change-Id: I9db32544ceb6f90c8231aaf40d722f6869a72614 BUG: 861945 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/4289 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: log appropriate message when locking failsKrutika Dhananjay2012-12-111-11/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PROBLEM: When a transaction is already in progress, and the user tries to execute another glusterd operation, the second operation fails as glusterd fails to acquire lock. But to the user, a message like "Operation failed" does not give ample information about why the operation failed. FIX: Made glusterd_op_txn_begin use and initialise error string, which is needed to capture failure in the "lock" phase. Also made gd_sync_task_begin set error string appropriately when locking fails. In the process, I had to introduce error string in some glusterd_handle_* functions. And because I introduced error string in these handlers, I decided to also set them in places where these handlers could possibly fail. HOW I TESTED IT: For want of a better idea, I "commented out" the call to "glusterd_unlock", recompiled glusterd and ran two glusterd volume operations, one after the other. The second operation fails with the message "Another transaction is in progress. Please try again after sometime." as expected. The tests were performed on two volume ops : one of them synctask'ized (volume start) and the other NOT (volume create). Change-Id: Ia862972929872ae2f053707a544824d9cadc37be BUG: 873549 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/4197 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* Fix xdr_to_generic success checkKaushal M2012-12-091-2/+3
| | | | | | | | | | | | | | | | This patch fixes the success check for xdr_to_generic function across the codebase. Also, cleans up the brick_op actors table in glusterfsd-mgmt.c to make sure that the actors are called directly by rpcsvc. Change-Id: I3086585f30c44f69f1bc83665f89e30025f76d3a BUG: 884452 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/4278 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* geo-rep / gsyncd,glusterd: do not hardcode socket pathCsaba Henk2012-11-281-0/+2
| | | | | | | | | | | | ... in gsyncd python code. Indeed, use the configuration mechanism to set it suitably from glusterd. Change-Id: I9fe2088b14d28588d1e64fe892740cc5755b8365 BUG: 868877 Signed-off-by: Csaba Henk <csaba@redhat.com> Reviewed-on: http://review.gluster.org/4143 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* gsyncd / geo-rep: include hostname in status cmdVenky Shankar2012-11-211-11/+36
| | | | | | | | | | | | | | Including hostname of the node where geo-rep start was initiated from. This helps any consumers of the status command to identify and possibly issue commands on those node(s). Change-Id: I005083878a3a4794da3b7f3f7d2cc9d28f004e3f BUG: 858218 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/4218 Reviewed-by: Csaba Henk <csaba@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd: Fix to log command status at the appropriate timeKrutika Dhananjay2012-09-201-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PROBLEM: In the existing implementation, the success/failure of execution of a command is decided (and logged) in glusterd handler functions. Strictly speaking, the logging mechanism must take into account what course the command takes within the state machine before concluding whether it succeeded or failed. FIX: This patch attempts to fix the above issue for vol commands. The format of the log message is as follows: for failure: <command string> : FAILED : <cause of failure> for success: <command string> : SUCCESS APPROACH (in a nutshell): * The command string is packed into dict at cli and sent to glusterd. * glusterd logs the command status just before doing a "submit_reply", which is called (either directly or indirectly via a call to glusterd_op_cli_send_response) at 2 places for every vol command: i. in handler functions, and ii. in glusterd_op_txn_complete In short, the failure of a command in the handler implies the command has indeed failed. However, its success in the handler does NOT necessarily mean the command succeeded/will succeed. Change-Id: I5a8a2ddc318ef2dc2a9699f704a6bcd2f0ab0277 BUG: 823081 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/3948 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>