summaryrefslogtreecommitdiffstats
path: root/extras/ganesha
Commit message (Collapse)AuthorAgeFilesLines
* nfs-ganesha: gluster_shared_storage fails to automount on node reboot on rhel 8Shwetha K Acharya2020-09-102-2/+2
| | | | | | | | | | | | | | The patch https://review.gluster.org/#/c/glusterfs/+/24934/, changes mount point of gluster_shared_storage from /var/run to /run to address the issue of symlink at mount path in fstab. NOTE: mount point /var/run is symlink to /run The required changes with respect to gluster_shared_storage mount path are introduced with this patch in nfs-ganesha. Fixes: #1475 Change-Id: I9c7677a053e1291f71476d47ba6fa2e729f59625 Signed-off-by: Shwetha K Acharya <sacharya@redhat.com>
* common-ha: ganesha-ha.sh bad test for {rhel,centos} for pcs optionsKaleb S. KEITHLEY2020-05-281-1/+1
| | | | | | | | | | bash [[ ... =~ ... ]] built-in returns _0_ when the regex matches, not 1, thus the sense of the test is backwards and never correctly detects rhel or centos. Change-Id: Ic9e60aae4ea38aff8f13979080995e60621a68fe Fixes: #1269 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
* common-ha: cluster status shows "FAILOVER" when actually HEALTHYKaleb S. KEITHLEY2020-05-041-2/+3
| | | | | | | | | | | | | | | | | | | | | | pacemaker devs change the format of the ouput of `pcs status` Expected to find a line in the format: Online: .... but now it's * Online: ... And the `grep -E "^Online:" no longer finds the list of nodes that are online. Also other lines now have '*' in first few characters of the line throwing off `grep -x ...` Change-Id: Ia04a89e76914f2a455a755f0a93fa415f60aefd0 Fixes: #1169 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
* common-ha: cluster status shows "FAILOVER" when actually HEALTHYKaleb S. KEITHLEY2020-04-141-1/+1
| | | | | | | | | | | | | | | | | | | pacemaker devs change the format of the ouput of `pcs status` Expected to find a line in the format: Online: .... but now it's * Online: ... And the `grep -E "^Online:" no longer finds the list of nodes that are online. Change-Id: If2aa1e7b53c766c625d7b4cc222a83ea2c0bd72d Fixes: #1169 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
* Updating create-ganesha-config script:Arjun Sharma2020-03-111-0/+1
| | | | | | | | Adding the Security_Label parameter for labelled nfs Change-Id: I26d332bc30c767093cfa5d6e63a3b0268fc8a60b Fixes: bz#1812353 Signed-off-by: Arjun Sharma <arjsharm@redhat.com>
* ganesha-ha: updates for pcs-0.10.x (i.e. in Fedora-29 and RHEL-8)Kaleb S. KEITHLEY2020-02-191-28/+56
| | | | | | | | | | | | | | pcs-0.10 has introduced changes options to pcs commands pcs-0.10.x is in Fedora-29 and later and RHEL-8. Also some minor cleanup. Namely use bash built-in [[...]] in a few more places instead of test(1), i.e. [...], and use correct "==" for comparison. Change-Id: I3fb2fcd71406964c77fdc4f18580ca133f365fd6 Fixes: bz#1193929 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
* Be explicit on this being a python3 scriptMichael Scherer2020-01-161-1/+1
| | | | | | | | | | | | | While the script seems to work on both python3 and python2, this break the build of rawhide RPM who requires script to be either using python2 or python3. Since python2 is going to be deprecated, I guess we should aim for python3. Change-Id: Ic6322ad47772d708b60b96652a1122ee4a54141d Fixes: bz#1791682 Signed-off-by: Michael Scherer <misc@fedoraproject.org>
* Revert "glusterd: (storhaug) remove ganesha (843e1b0)"Jiffin Tony Thottan2019-08-242-16/+25
| | | | | | | | | please note as an additional change, macro GLUSTERD_GET_SNAP_DIR moved from glusterd-store.c to glusterd-snapshot-utils.h Change-Id: I811efefc148453fe32e4f0d322e80455447cec71 updates: #663 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
* Revert "packaging: (ganesha) remove glusterfs-ganesha subpackage and related ↵Jiffin Tony Thottan2019-08-2412-0/+2034
| | | | | | | | | | | | | | | | | | | files" Until 3.12, glusterd had an option to setup HA cluster for nfs-ganesha using pacemaker and corosync. But later infavour of "Storhaug" Project, this functionality was removed from the codebase. Since there is not much development happening towards storhaug, it better to keep back old working HA solution for nfs-ganesha with bit improvements. Planned improvements : * add an option in nfs-ganesha enable to set ganesha without HA. * Handle usage of export id's properly in the scripts. Change-Id: I1d60c8970bfc20035cf674d7b2705dfd4819647e updates: #663 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
* packaging: (ganesha) remove glusterfs-ganesha subpackage and related files)Kaleb S. KEITHLEY2017-03-217-229/+0
| | | | | | | | | | | | | | | Indiana Jones and the Temple of Ganesha HA, part two. remove glsuterfs-ganesha subpackage, superceded by storhaug Change-Id: I42a1fc59159add108d77080b9b130696216aa76d BUG: 1418417 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: https://review.gluster.org/16506 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
* storhaug HA: first step, remove resource agents and setup scriptKaleb S. KEITHLEY2017-01-177-1768/+3
| | | | | | | | | | | | | | | | resource agents and setup script(s) are now in storhaug This is a phased switch-over to storhaug. Ultimately all components here should be (re)moved to the storhaug project and its packages. But for now some will linger here. Change-Id: Ied3956972b14b14d8a76e22c583b1fe25869f8e7 BUG: 1410843 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/16349 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* common-ha: add node create new node dirs in shared storageKaleb S. KEITHLEY2016-12-221-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When adding a node to the ganesha HA cluster, create the directory tree in shared storage for the added node and create sets of symlinks to match what is/was created for the other nodes. I.e. in a four node cluster the new node needs a set of links to the four existing nodes: /run/gluster/shared/nfs-ganesha/$new/nfs/{ganesha,statd}/$e1 -> e1 /run/gluster/shared/nfs-ganesha/$new/nfs/{ganesha,statd}/$e2 -> e2 /run/gluster/shared/nfs-ganesha/$new/nfs/{ganesha,statd}/$e3 -> e3 /run/gluster/shared/nfs-ganesha/$new/nfs/{ganesha,statd}/$e4 -> e4 and all the existing nodes need links added for the new node: /run/gluster/shared/nfs-ganesha/$e1/nfs/{ganesha,statd}/$new -> new /run/gluster/shared/nfs-ganesha/$e2/nfs/{ganesha,statd}/$new -> new /run/gluster/shared/nfs-ganesha/$e3/nfs/{ganesha,statd}/$new -> new /run/gluster/shared/nfs-ganesha/$e5/nfs/{ganesha,statd}/$new -> new Likewise when deleting, remove the dir and symlinks. original change http://review.gluster.org/16036 BUG: 1400613 Change-Id: I52839046745728d06ab5a07f38081c032093bff6 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/16216 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: soumya k <skoduri@redhat.com>
* common-ha: Correct the VIP assigned to the new node addedSoumya Koduri2016-12-211-4/+4
| | | | | | | | | | | | | | | | There is a regression introduced with patch#16115. An incorrect VIP gets assigned to the new node being added to the cluster. This patch fixes the same. Change-Id: I468c7d16bf7e4efa04692db83b1c5ee58fbb7d5f BUG: 1406410 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/16213 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* ganesha/scripts : Prevent removal of entries in ganesha.conf during deletion ↵Jiffin Tony Thottan2016-12-211-1/+1
| | | | | | | | | | | | | of a node Change-Id: Ia6c653eeb9bef7ff4107757f845218c2316db2e4 BUG: 1406249 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/16209 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: soumya k <skoduri@redhat.com>
* common-ha: add node create new node dirs in shared storageKaleb S. KEITHLEY2016-12-161-1/+94
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | When adding a node to the ganesha HA cluster, create the directory tree in shared storage for the added node and create sets of symlinks to match what is/was created for the other nodes. I.e. in a four node cluster the new node needs a set of links to the four existing nodes: /run/gluster/shared/nfs-ganesha/$new/nfs/{ganesha,statd}/$e1 -> e1 /run/gluster/shared/nfs-ganesha/$new/nfs/{ganesha,statd}/$e2 -> e2 /run/gluster/shared/nfs-ganesha/$new/nfs/{ganesha,statd}/$e3 -> e3 /run/gluster/shared/nfs-ganesha/$new/nfs/{ganesha,statd}/$e4 -> e4 and all the existing nodes need links added for the new node: /run/gluster/shared/nfs-ganesha/$e1/nfs/{ganesha,statd}/$new -> new /run/gluster/shared/nfs-ganesha/$e2/nfs/{ganesha,statd}/$new -> new /run/gluster/shared/nfs-ganesha/$e3/nfs/{ganesha,statd}/$new -> new /run/gluster/shared/nfs-ganesha/$e5/nfs/{ganesha,statd}/$new -> new Likewise when deleting, remove the dir and symlinks. Change-Id: Id2f78f70946f29c3503e1e6db141b66cb431e0ea BUG: 1400613 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/16036 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: soumya k <skoduri@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
* common-ha: explicitly set udpu transport for corosyncKaleb S. KEITHLEY2016-12-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | On RHEL7 corosync uses udpu (udp unicast) by default. On RHEL6 the default is (now) udp multi-cast. In network environments that don't support udp multi-cast this causes the ever growing lists of [TOTEM ] Retruansmit errors. Always specifying --transport udpu is thus a no-op on RHEL7. Using the same transport on both RHEL6 and RHEL7 may (or may not give similar behavior and performance--it's hard to say. It remains a mystery why things have always worked on RHEL6 prior to now. Further investigation is required to uncover why this is the case. Change-Id: I4d0de97fe4425c47f249beaaf51aeca3e91731fa BUG: 1404410 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/16122 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: soumya k <skoduri@redhat.com>
* common-ha: Create portblock RA as part of add/delete-nodeSoumya Koduri2016-12-121-8/+61
| | | | | | | | | | | | | | | | | When a node is added to or deleted from existing nfs-ganesha cluster, we need to create or cleanup portblock RA as well. This patch is to address the same. Also we need to adjust the quorum-policy with increase/decrease in the number of nodes in the cluster. Change-Id: I31a896715b9b7fc931009723d1570bf7aa4da9b6 BUG: 1403130 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/16089 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* common-ha: IPaddr RA is not stopped when pacemaker quorum is listKaleb S. KEITHLEY2016-12-011-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ken Gaillot writes: The other is pacemaker's no-quorum-policy cluster property. The default (which has not changed) is "stop" (stop all resources). Other values are "ignore" (act as if quorum was not lost), "freeze" (continue running existing resources but don't recover resources from unseen nodes) or "suicide" (shut down). But on my four node cluster % pcs property show no-quorum-policy Cluster Properties: % i.e. shows nothing. But: % pcs property list --all Cluster Properties: ... no-quorum-policy: stop ... % Seems to think it knows about it. and then % pcs property set no-quorum-policy=stop % pcs property show no-quorum-policy Cluster Properties: no-quorum-policy: stop % Which looks rather inconsistent. So we will try explicitly setting it to "stop" when there are three or more nodes. Change-Id: I47fc7ee84fcd6ad52ccb776913511978a8d517b4 BUG: 1400237 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/15981 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: soumya k <skoduri@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* common-ha: add cluster HA status to --status output for gdeployKaleb S. KEITHLEY2016-12-011-28/+65
| | | | | | | | | | | | | | | | | | | | | | | | gdeploy desires a one-liner "health" assessment. If all the VIP and port block/unblock RAs are located on their prefered nodes and 'Started', then the cluster is deemed to be good (healthy). N.B. status originally only checked the "online" nodes obtained from `pcs status` but we really want to consider all the configured nodes, whether they are online or not. Also one `pcs status` is enough. Change-Id: Id0e0380b6982e23763edeb0488843b5363e370b8 BUG: 1395648 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/15882 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Arthy Loganathan <aloganat@redhat.com> Reviewed-by: soumya k <skoduri@redhat.com>
* common-HA: Increase timeout for portblock RA of action=unblockSoumya Koduri2016-12-011-1/+7
| | | | | | | | | | | | | | | | | | | | Portblock RA of action type unblock stores the information about the client/server IPs connection in tickle_dir folder created in the shared storage. In case of node shutdown/reboot there could be cases wherein shared_storage may become unavailable for sometime. Hence increase the timeout to avoid that resource agent going into FAILED state. Change-Id: I4f98f819895cb164c3a82ba8084c7c11610f35ff BUG: 1399154 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/15947 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
* ganesha/scripts : use export id for dbus signalsJiffin Tony Thottan2016-11-171-2/+6
| | | | | | | | | | | | | | | | | Currently for add export and update export parameter passed for executing those signal is "PATH". This is based on assumption that volume name and PATH will always be same. But it is wrong for subdir exports. The only reliable parameter in export configuration file is "Export_Id". Change-Id: Ic63ff44ac7736e14502034b74beaae27292eddf9 BUG: 1389746 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/15751 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* common-ha: nfs-grace-monitor timed out unknown error messagesKaleb S. KEITHLEY2016-11-161-3/+2
| | | | | | | | | | | | | | | grace_mon_monitor() occasionally returns OCF_ERR_GENERIC, but it ought to return OCF_NOT_RUNNING instead. Change-Id: I3d550e33cc3d0a8dce4333ced72db19b3b2f4f2e BUG: 1394224 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/15831 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
* common-ha: remove /etc/corosync/corosync.conf in teardown/cleanupKaleb S. KEITHLEY2016-11-151-2/+5
| | | | | | | | | | | | | | | | | | | | In newer versions of corosync we observe that after tearing down an existing HA cluster, when trying to set up a new cluster, `pcs cluster start --all` will fail if corosync believes the nodes are already in the cluster based on the presence of, and the contents of /etc/corosync/corosync.conf So we summarily delete it. (An alternative/work-around is to use `pcs cluster start --force --all`) Change-Id: I225f4e35e3b605e860ec4f9537c40ed94ac68625 BUG: 1394881 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/15843 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: soumya k <skoduri@redhat.com>
* common-ha: Use UpdateExports dbus msg for refresh-configSoumya Koduri2016-10-191-27/+2
| | | | | | | | | | | | | | | In nfs-ganesha 2.4, new dbs msg type "UpdateExports" support has been added. With this support, the exports can be re-configured dynamically without the need to re-export the entries. Change-Id: Iee7330d33e91db1126974a2ff46becb3764f2e5e BUG: 1382258 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/15617 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* common-ha: ganesha_mon: line 137: [: too many arguments ]" messagesKaleb S. KEITHLEY2016-09-051-5/+9
| | | | | | | | | | | | | | | | | ensure that there are always valid, non-null arguments to /bin/test Here there be dragons. Very racy, but if the races lose, they lose in a way that's consistent with what we're testing for anyway, namely that the ganesha.nfsd process is gone. Change-Id: I88b770dd874ffa8576711f8009f27122a4fb0130 BUG: 1363595 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/15390 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* glusterd/ganesha : create export configuration file in shared storageJiffin Tony Thottan2016-08-262-84/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the second patch which moves export related configuration for a volume into shared storage. The main change includes in scripts create-export-ganesha.sh, dbus-send.sh and the handling of volume set command "ganesha.enable". The manipulation of EXPORT_ID move from dbus-send.sh to create-export-ganesha.sh. In volume set handling following has performed stage | commit ---------------------------------------------------------- 1.) gluster v set <volname> ganesha.enable on None | create export file | in node where cli executed, | thne export volume via dbus 2.) gluster v set <volname> ganesha.enable off unexport volume via dbus | remove export file from the | shared storage ----------------------------------------------------------- More details can be found at http://review.gluster.org/#/c/15105/ Change-Id: Ia8b0e89bc8fff24b0bc5d20a538a89212894a8e4 BUG: 1355956 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/14908 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: soumya k <skoduri@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* ganesha/scripts : Modifying ganesha-ha.sh for share storage related changesJiffin Tony Thottan2016-08-261-64/+12
| | | | | | | | | | | | | | | | | | | Currently the ganesha related configurations are "scp"ied for operations like add, delete, refresh-config in ganesha-ha.sh. This is no more required since all the conf files are available in shared storage and every node can directly access them from shared storage. More details can be found at http://review.gluster.org/#/c/15105/ Change-Id: Ic025eb4dc246db61d6fbe969ca60214751fbf3ba BUG: 1355956 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/14909 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: soumya k <skoduri@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* Revert "glusterd-ganesha : copy ganesha export configuration files during ↵Jiffin Tony Thottan2016-08-262-99/+2
| | | | | | | | | | | | | | | | | | | | | reboot" This reverts commit f71e2fa49af185779b9f43e146effd122d4e9da0. Reason: As part of sync up node reboot this patch copies ganesha export conf file from a source node. This change is no more require if the export files are available in shared storage. Change-Id: Id9c1ae78377bbd7d5d80aa1c14f534e30feaae97 BUG: 1355956 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/14907 Reviewed-by: soumya k <skoduri@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* ganesha/scripts : remove 'HA_VOL_SERVER' from the codeJiffin Tony Thottan2016-08-252-31/+6
| | | | | | | | | | | | | | | | | | | | The parameter HA_VOL_SERVER introduced intially ganesha-ha.conf to specify gluster server from which to mount the shared data volume. But after introducing new cli for the same purpose, it become unnecessary. The existence of that parameter can lead confussion to the users. This patch will remove/replace all the instance of HA_VOL_SERVER from the code Change-Id: I638c61dcd2c21ebdb279bbb141d35bb806bd3ef0 BUG: 1350371 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/14812 Tested-by: Kaleb KEITHLEY <kkeithle@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: soumya k <skoduri@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* glusterd/ganesha : Move ganesha-ha.conf and ganesha.conf to shared storageJiffin Tony Thottan2016-08-251-5/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently all the ganesha related configuration files(ganesha.conf, ganesha-ha.conf, export files, etc) is stored locally at /etc/ganesha on a every node in ganesha cluster. Usually we end up in two issues by doing so : * difficult in modifiying ganesha related conf file * diffciult to maintain consistency of conf file across ganesha cluster To tackle this, we plan to move all the ganesha configuration to shared storage. As a first step in this patch ganesha.conf and ganesha-ha.conf move to shared storage. Here actual ganesha.conf will resides in shared stoarge and symlinks will be created in /etc/ganesha when the option "gluster nfs-ganesha enable" is executed and remove those during the "disable" part. Modified prerequisites to done before running globaloption: * enable shared storage * create nfs-ganesha folder in shared storage * create ganesha.conf and ganesha-ha.conf in it More details can be found at http://review.gluster.org/#/c/15105/ Change-Id: Ifabb6c5db50061f077a03932940190af74e2ca7f BUG: 1355956 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/14906 Reviewed-by: soumya k <skoduri@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* commn-HA: Add portblock RA to tickle packets post failover(/back)Soumya Koduri2016-08-031-10/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Portblock resource-agents are used to send tickle ACKs so as to reset the oustanding tcp connections. This can be used to reduce the time taken by the NFS clients to reconnect post IP failover/failback. Two new resource agents (nfs_block and nfs_unblock) of type ocf:portblock with action block & unblock are created for each Virtual-IP (cluster_ip-1). These resource agents along with cluster_ip-1 RA are grouped in the order of block->IP->unblock and also the entire group maintains same colocation rules so that they reside on the same node at any given point of time. The contents of tickle_dir are of the following format - * A file is created for each of the VIPs used in the ganesha cluster. * Each of those files contain entries about clients connected as below: SourceIP:port_num DestinationIP:port_num Hence when one server failsover, connections of the clients connected to other VIPs are not affected. Note: During testing I observed that tickle ACKs are sent during failback but not during failover, though I/O successfully resumed post failover. Also added a dependency on portblock RA for glusterfs-ganesha package as it may not be available (as part of resource-agents package) in all the distributions. Change-Id: Icad6169449535f210d9abe302c2a6971a0a96d6f BUG: 1354439 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/14878 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* ganesha/scripts : delete nfs-ganesha folder from shared storage during clean upJiffin Tony Thottan2016-06-231-0/+1
| | | | | | | | | | | | | | | A directory named "nfs-ganesha" will be created inside shared storage when 'gluster nfs-ganesha enable' is executed. Similarly this directory should be removed when 'gluster nfs-ganesha disable' is executed. Change-Id: Icc09b32010de07c9809e22aafbb2fd08a5c8252f BUG: 1349398 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/14782 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* common-ha: race/timing issue setting up clusterKaleb S KEITHLEY2016-06-033-25/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ganesha_grace resource agent can start before the ganesha_mon resource agent, with the result that the crm_attribute that ganesha_grace expects to find has not been created yet. This is never (never? Or just so rarely that it has never actually been seen during development) seen with four nodes, but with just two nodes it's very repeatable. Note that when long (FQDN) names are used it is not unexpected to see Failed Actions in the output of `pcs status`, e.g.: * nfs-grace_monitor_5000 on node1.fully.qualified.domain.name.com 'unknown error' (1): call=20, status=complete, exitreason='none', last-rc-change='Wed Jun 1 12:32:32 2016', queued=0ms, exec=0ms * nfs-grace_monitor_5000 on node2.fully.qualified.domain.name.com 'unknown error' (1): call=18, status=complete, exitreason='none', last-rc-change='Wed Jun 1 12:32:42 2016', queued=0ms, exec=0ms and as long as all the ganesha_grace_clone and cluster_ip-1 resource agents are in Started state then this is okay. Change-Id: I726c9946ceb1ca92872b321612eb0f4c3cc039d8 BUG: 1341768 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/14607 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* ganesha: fix the shebang for the copy-export scriptNiels de Vos2016-05-291-1/+1
| | | | | | | | | | | | BUG: 1340488 Change-Id: I22061a8b8bc0ea43da91e5b2904a27a674a004be Reported-by: Patrick Matthäi <pmatthaei@debian.org> Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/14548 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* common-ha: wait for cluster to elect DC before accessing CIBKaleb S KEITHLEY2016-05-241-6/+13
| | | | | | | | | | | | | | | | | | access attempts, e.g. `pcs property set stonith-enabled=false` will fail (or time out) if attempted "too early", i.e. before the cluster has elected its DC. Change-Id: Ifc0aa7ce652c1da339b9eb8fe17e40e8a09b1096 BUG: 1336945 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/14426 CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: Gluster Build System <jenkins@build.gluster.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: soumya k <skoduri@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
* common-ha: post fail-back, ganesha.nfsds are not put into NFS-GRACEKaleb S KEITHLEY2016-05-241-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A little known, rarely used feature of pacemaker called "notification" is used to follow the status of the ganesha.nfsds in the cluster. This is done with location constraints and other Black Magick. When a nfsd dies, the ganesha-active attribute is cleared, the associated floating IP (VIP) fails over to another node, and the ganesha_grace notify method is invoked with post-stop on all the nodes where the ganesha.nfsd is still running. The notify methods send dbus msgs to put their nfsds into NFS-GRACE, and the nfsds perform their grace processing, e.g. taking over locks from the failed nfsd. N.B. Fail-back was originally not planned to be a feature for glusterfs-3.7, but we sorta got it for free. For fail-back, the opposite occurs. The ganesha-active attribute is recreated, the floating IP fails back, and the notify method is invoked with pre-start on all the nodes where the surviving ganesha.nfsds continue to run. The notify methods send dbus msgs again to put their nsfds into NFS-GRACE again, and the nfsds clean up their locks. Change-Id: I3fc64afa20ae3a928143d69aa533a8df68dd680e BUG: 1338967 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/14506 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* common-ha: log flooded with Could not map name=xxxx to a UUIDKaleb S KEITHLEY2016-05-232-9/+30
| | | | | | | | | | | | | | | | When the cluster is configured with long (FQDN) cluster members the log is flooded with "Could not map name=$shortname to a UUID" notices, and setting/getting the attribute is failing Change-Id: I954d8cef7115659cc9c8b23dae75a5a247dc5db7 BUG: 1337650 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/14435 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> Reviewed-by: soumya k <skoduri@redhat.com>
* common-ha: stonith-enabled option set error in new pacemakerKaleb S KEITHLEY2016-05-191-7/+5
| | | | | | | | | | | | | | | | | Setting the option too early results in an error in newer versions of pacemaker. Postpone setting the option in order for it to succeed. N.B. We do not use a fencing agent. Yes, we know this is not supported. Change-Id: I86953fdd67e6736294dbd2d0795611837188bd9d BUG: 1336945 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/14404 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
* core: assorted typos and spelling mistakes reported by Debian lintianKaleb S KEITHLEY2016-05-182-2/+2
| | | | | | | | | | | | | | | Also missing bang (!) in #!/bin/bash in shell scripts. Change-Id: I567a4be8f0f31f6285550f243fe802895f6bc43b BUG: 1336793 Reported-by: Patrick Matthäi <pmatthaei@debian.org> Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/14398 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* common-ha: floating IP (VIP) doesn't fail over when ganesha.nfsd diesKaleb S KEITHLEY2016-05-151-2/+7
| | | | | | | | | | | | | | | | | | | | | | | restore mistaken removal of 'attrd_updater delete grace-active' to trigger fail-over original was: attrd_updater -D -n grace-active sleep attrd_updater -D -n ganesha-active mistake was: sleep attrd_updater -D -n grace-active Change-Id: Ifb20ebc673b8d262dfb720e7d5c060fbf0d1639b BUG: 1336197 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/14343 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: soumya k <skoduri@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* ganesha/scripts : Fixing refresh config in ganesha-ha.shJiffin Tony Thottan2016-05-131-1/+1
| | | | | | | | | | | | | | | | The change http://review.gluster.org/#/c/14225/ cause a regression for refresh config funtion in ganesha-ha.sh due to a invalid usage of awk arguement. Change-Id: Id5adfb12f99b29bdb3531773cd34bd67cfff8768 BUG: 1333319 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/14325 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* glusterd-ganesha : copy ganesha export configuration files during rebootJiffin Tony Thottan2016-05-052-2/+99
| | | | | | | | | | | | | | | | | | | | | | glusterd creates export conf file for ganesha using hook script during volume start and ganesha_manage_export() for volume set command. But this routine is not added in glusterd restart scenario. Consider the following case, in a three node cluster a volume got exported via ganesha while one of the node is offline(glusterd is not running). When the node comes back online, that volume is not exported on that node due to the above mentioned issue. Also I have removed unused variables from glusterd_handle_ganesha_op() For this patch to work pcs cluster should running on that be node. Change-Id: I5b2312c2f3cef962b1f795b9f16c8f0a27f08ee5 BUG: 1330097 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/14063 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* NFS-Ganesha : Parse the Export_Id correctly for unexporting volumeJiffin Tony Thottan2016-05-053-4/+4
| | | | | | | | | | | | | | | | Currently export id parsed using "cut -d ' ' -f8" which might endup in giving wrong value. In case of multiple space chracter, output may differ. In this all those instance will replaced by awk call Change-Id: I60dea8ce116900da3c1fc9badf898e51183a2ca1 BUG: 1333319 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/14225 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* common-ha: continuous grace_mon log messages in /var/log/messagesKaleb S KEITHLEY2016-04-284-33/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | messages are seen on RHEL6.x and RHEL7.1 and earlier versions of pacemaker. (And RHEL7.2 with RHEL7.1 pacemaker packages.) It's not possible to query attrd attributes in the older version, only set/update/clear them. The messages come from invalid attempts to query the attributes. However it is possible to query crm attributes. The fix here is to create a "shadow" crm attribute for the attrd attribute. Changes are made to both, queries are made on the crm attribute. (Resource Agents "follow" the attrd attribute using constraint locations, so we must keep the attrd attribute.) Change-Id: I84ac1a80673e528d98b67b7d5062e21dcf744d4a BUG: 1324509 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/13919 Reviewed-by: Niels de Vos <ndevos@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: soumya k <skoduri@redhat.com>
* ganesha: Include a script to generate epoch valueSoumya Koduri2016-03-212-2/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | In a NFS-Ganesha HA cluster setup, for NFS clients to recover state succesfully post failover, the NFS-servers should start with a unique epoch value. With NFS-Ganesha 2.3, the service accepts an option "EPOCH_EXEC" which takes path of the script, generating epoch value. This script is executed before starting nfs-ganesha service so that the generated epoch value is used while bringing up the service. This patch includes the script to be used by nfs-ganesha+gluster setup. The epoch value is computed as follows - - first 32-bit contains the now() time - rest 32-bit value contains the local glusterd node uuid Change-Id: I876ea5a3730d7c6b40503e0fec16a4a142c54a36 BUG: 1317902 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/13744 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* ganesha-ha: merge release-3.7 to master for release-3.8Kaleb S KEITHLEY2016-03-151-13/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cherry-pick changes from release-3.7 branch to master in preparation for 3.8 changes merged/cherry-picked from release-3.7 consist of: > common-ha: delete virt-IP entry of deleted node > commit aee983ff43655771a9102592a284d0b0a29ad89f > BUG: 1250601 > > common-ha: concise output for HA status > commit d804b17f2fe92b1516f85f03978072c42ddc6f19 > BUG: 1250628 > > common-ha : refresh-config should print sensible output > commit d7adcca24fb9638df2806c01d8ea7e73eec46928 > BUG: 1254494 > > CommonHA: Avoid scp of the config state to the same host > commit 07b31a008b59d6c0b06bd17994de85fc56560b38 > BUG: 1259225 > > CommonHA: Fix the path of 'systemctl' cmd > commit 8e7ca068720fa6fa50e4746e5b17c305f38171e8 > BUG: 1259225 > > common-ha: refresh-config output includes dbus "method return" msg > commit 045cb34238e341d68288893b8f040056e138c04e > BUG: 1262881 > > common-ha: distribution neutral location of config files > commit 6667478cdba920d2658bba2edc99c8f8cc33e271 > BUG: 1251821 > > common-ha: Corrected refresh-config output parsing > commit 2892bab69cac2d991509fca1279a659bce1ae793 > BUG: 1254494 > > common-ha: reliable grace using pacemaker notify actions > commit e8121c4afb3680f532b450872b5a3ffcb3766a97 > BUG: 1290865 > > ganesha: Read export_id on each node while performing refresh-config > commit e0e633cdce7586af92490730257ed7f0cffcff61 > BUG: 1309238 > > nfs-ganesha: pcs cluster setup needs '--name' option > commit 812e7321d3c56d329526628fe96a8f6fabea75ca > BUG: 1314204 Change-Id: I41c19bd369bd9da1092b0677590634a4e9f72813 BUG: 1317424 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/13728 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* nfs-ganesha: pcs cluster setup needs '--name' optionSoumya Koduri2016-03-151-8/+1
| | | | | | | | | | | | | | | | | Even on fedora machines (on f22, f23), pcs cluster setup CLI expects '--name' to be provided. This patch addresses the same. BUG: 1314204 Change-Id: If23851bfd7fe2f1c200731d66dcb3e744390ff83 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/13590 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-on: http://review.gluster.org/13727 Tested-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* ganesha: Read export_id on each node while performing refresh-configSoumya Koduri2016-03-152-32/+54
| | | | | | | | | | | | | | | | | | | | As mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1309238#c1, there could be cases which shall result in having different ExportIDs for the same volume on each node forming the ganesha cluster. Hence during refresh-config, it is necessary to read the ExportID on each of those nodes and re-export that volume with the same ID. BUG: 1309238 Change-Id: Id39b3a0ce2614ee611282ff2bee04cede1fc129d Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/13459 CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/13726 Tested-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* common-ha: reliable grace using pacemaker notify actionsKaleb S KEITHLEY2016-03-144-306/+241
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using *-dead_ip-1 resources to track on which nodes the ganesha.nfsd had died was found to be unreliable. Running `pcs status` in the ganesha_grace monitor action was seen to time out during failover; the HA devs opined that it was, generally, not a good idea to run `pcs status` in a monitor action in any event. They suggested using the notify feature, where the resources on all the nodes are notified when a clone resource agent dies. This change adds a notify action to the ganesha_grace RA. The ganesha_mon RA monitors its ganesha.nfsd daemon. While the daemon is running, it creates two attributes: ganesha-active and grace-active. When the daemon stops for any reason, the attributes are deleted. Deleting the ganesha-active attribute triggers the failover of the virtual IP (the IPaddr RA) to another node where ganesha.nfsd is still running. The ganesha_grace RA monitors the grace-active attribute. When the grace-active attibute is deleted, the ganesha_grace RA stops, and will not restart. This triggers pacemaker to trigger the notify action in the ganesha_grace RAs on the other nodes in the cluster; which send a DBUS message to their ganesha.nfsd. (N.B. grace-active is a bit of a misnomer. while the grace-active attribute exists, everything is normal and healthy. Deleting the attribute triggers putting the surviving ganesha.nfsds into GRACE.) To ensure that the remaining/surviving ganesha.nfsds are put into NFS-GRACE before the IPaddr (virtual IP) fails over there is a short delay (sleep) between deleting the grace-active attribute and the ganesha-active attribute. To summarize: 1. on node 2 ganesha_mon:monitor notices that ganesha.nfsd has died 2. on node 2 ganesha_mon:monitor deletes its grace-active attribute 3. on node 2 ganesha_grace:monitor notices that grace-active is gone and returns OCF_ERR_GENERIC, a.k.a. new error. When pacemaker tries to (re)start ganesha_grace, its start action will return OCF_NOT_RUNNING, a.k.a. known error, don't attempt further restarts. 4. on nodes 1, 3, etc., ganesha_grace:notify receives a post-stop notification indicating that node 2 is gone, and sends a DBUS message to its ganesha.nfsd putting it into NFS-GRACE. 5. on node 2 ganesha_mon:monitor waits a short period, then deletes its ganesha-active attribute. This triggers the IPaddr (virt IP) failover according to constraint location rules. ganesha_nfsd modified to run for the duration, start action is invoked to setup the /var/lib/nfs symlink, stop action is invoked to restore it. ganesha-ha.sh modified accordingly to create it as a clone resource. BUG: 1290865 Change-Id: I1ba24f38fa4338b3aeb17c65645e9f439387ff57 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/12964 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-on: http://review.gluster.org/13725
* common-ha: Corrected refresh-config output parsingSoumya Koduri2016-03-141-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | >>>> Sample program with the earlier changes - output=$(dbus-send --print-reply --system \ --dest=org.ganesha.nfsd /org/ganesha/nfsd/ExportMgr \ org.ganesha.nfsd.exportmgr.RemoveExport uint16:5 2>&1\ | grep -v "^method return") ret=$? echo "${output}" echo $ret sleep 1 output=$(dbus-send --system --dest=org.ganesha.nfsd \ /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.AddExport \ string:/usr/etc/ganesha/exports/export.vol3.conf \ string:"EXPORT(Path=/vol3)" 2>&1 | grep -v "^method return") ret=$? echo "${output}" echo $ret Output: 1 1 Even if the command was successfully executed, 'grep -v' has filtered out the output. >>>> Sample program with the current changes - output=$(dbus-send --print-reply --system --dest=org.ganesha.nfsd \ /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.RemoveExport\ uint16:5 2>&1) ret=$? echo "${output}" echo $ret sleep 1 output=$(dbus-send --print-reply --system --dest=org.ganesha.nfsd \ /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.AddExport \ string:/usr/etc/ganesha/exports/export.vol3.conf \ string:"EXPORT(Path=/vol3)" 2>&1) ret=$? echo "${output}" echo $ret Output: method return sender=:1.155 -> dest=:1.174 reply_serial=2 0 method return sender=:1.155 -> dest=:1.175 reply_serial=2 string "1 exports added" 0 BUG: 1254494 Change-Id: I44fbe32588ec11f087c8b99b2d55ed55ba73727c Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/12439 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/13724 Tested-by: Kaleb KEITHLEY <kkeithle@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>