summaryrefslogtreecommitdiffstats
path: root/xlators/features/marker/utils/syncdaemon/syncdutils.py
Commit message (Collapse)AuthorAgeFilesLines
* geo-rep: retire old style ssh setupCsaba Henk2013-03-141-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | Users are still using geo-rep with the old, deprecated, insecure, unsupported ssh setup. Not their fault -- the implementation of the new method had the following charasteristics: - old method is possible, but with default settings it's not working - it can be made operational by fiddling with "remote-gsyncd" tunable - with default setting, an unhelpful, actually misleading error message is produced - the UI gave no hint to the changes in the ssh setup http://review.gluster.org/4392 tried to fix these; what it accomplished was unrestricted support to the bad practice (by making the default old setup operational). From this on: - we disable the old method by reserving the "remote-gsyncd" tunable - if the old method is attempted, give a hint what to do Change-Id: Icade94725d8d8d2d4c89cab992d4226351637b86 BUG: 895656 Signed-off-by: Csaba Henk <csaba@redhat.com> Reviewed-on: http://review.gluster.org/4602 Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* geo-rep: do not access BaseException.message in syncdutilsNiels de Vos2012-12-181-2/+2
| | | | | | | | | | | | | | http://www.python.org/dev/peps/pep-0352/ explains that the .message property of BaseException is being removed. Most of the other exception handlers access <Exception>.args[] which should be suitable for this case too. Change-Id: I1810450b78d2b3d7f8bd07f2beb02cbe9e2adecb BUG: 888346 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/4328 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* geo-rep: checkpointingCsaba Henk2012-06-131-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - gluster vol geo-rep M S conf checkpoint <LABEL|now> sets a checkpoint with LABEL (the keyword "now" is special, it's rendered to the label "as of <timestamp of current time>") that's used to refer to the checkpoint in the sequel. (Technically, gsyncd makes a note of the xtime of master's root as of setting the checkpoint, called the "checkpoint target".) - gluster vol geo-rep M S conf \!checkpoint deletes the checkpoint. - gluster vol geo-rep M S stat if status is OK, and there is a checkpoint configured, the checkpoint info is appended to status (either "not yet reached", or "completed at <timestamp of completion>"). (Technically, the worker runs a thread that monitors / serializes / verifies checkpoint status, and answers checkpoint status requests through a UNIX socket; monitoring boils down to querying the xtime of slave's root and comparing with the target.) - gluster vol geo-rep M S conf log-file | xargs grep checkpoint displays the checkpoint history. Set, delete and completion events are logged properly. Change-Id: I4398e0819f1504e6e496b4209e91a0e156e1a0f8 BUG: 826512 Signed-off-by: Csaba Henk <csaba@redhat.com> Reviewed-on: http://review.gluster.com/3491 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com>
* geo-rep / gsyncd: further cleanup refinementsCsaba Henk2012-05-241-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Regarding issue of leftover ssh control dirs: If master side worker is stuck in connection establishment phase, have the monitor kill it softly (ie. first by SIGTERM, to let it cleanup). This is trickier than sounds on first hearing, because if worker is stuck in waiting for a RePCe answer (in threading.Condition().wait()), then SIGTERM is ignored (more precisely, Python holds it back for the wait and resends it to itself when wait is over). So instead of signalling the worker only, we send TERM to the whole process group -- that brings down the ssh connection, which wakes up the waiting worker, which then can cleanup. Only problem is that monitor is also in the process group and it should not coomit a suicide. That is taken care by setting up a one-time SIGTERM handler in the monitor. - Regarding slave gsyncd stuck in chdir: Slave gsyncd is usually well behaved: if master does not send keepalives, it takes care to exit. However, if a hang occurs in early phase, when slave is to change to the gluster mountpoint, no timeout is set up for that (and unlike on master side, neither is there an external actor like the monitor to do that). So, to manage this scenario, we do the chdir in a (supposedly) short lived thread, and in the main thread we wait for the termination of this thread. If that does not happen within the time limit, main thread calls for cleanup and exit. (This logic explicitely takes the appropriate action in the cases when chdir succeeds or when hangs; but what about the remaining case, when chdir fails? Well in that case the chdir thread's exception handler will put the process to cleanup and exit route.) Change-Id: I6ad6faa9c7b1c37084d171d1e1a756abaff9eba8 BUG: 786291 Signed-off-by: Csaba Henk <csaba@redhat.com> Reviewed-on: http://review.gluster.com/3376 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* geo-rep / gsyncd: fixes regarding the command invocation frameworkCsaba Henk2012-05-191-1/+1
| | | | | | | | | | | | | | | | Some of the bugs to fix were found by the following stress-test: make "glusterfs --client-pid=-1" exit immediately on slave side. Also fix eintr_wrap which should not "adopt" exceptions generated by the wrapped call, by re-raising them as GsyncdError. Change-Id: Ia0d39e0635975ebbbf98d86e1e26f3122e1ed6ff BUG: 764678 Signed-off-by: Csaba Henk <csaba@redhat.com> Reviewed-on: http://review.gluster.com/3258 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com>
* geo-rep / gsyncd: recognize ECONNABORTED as termination of aux glusterfsCsaba Henk2012-05-191-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Don't dump stack, rather log the "glusterfs session went down" message. If the aux glusterfs is already dead when we try to do some file operation, we get a failure with ENOTCONN, which is already handled as above. However, it's also possible that glusterfs dies while we are in a syscall into it -- in that case we get ECONNABORTED, and so far then we end up with an ugly stack strace. From now on we take ECONNABORTAD as well into consideration. Nb. wrt. testing: it's not easy to synthetically force the aux glusterfs to end this way; for that we have to provoke gsyncd into intensive synchronization. I succeeded in that with the following ruby oneliner: ruby -rcgi -e ' Dir.chdir($*[0]) a=[] Thread.new { loop { while a.size >= 100; File.delete a.shift; end; sleep 1 }} loop { a<<CGI.escape(STDIN.read 10); open(a[-1], "w") {}}' MTPT < /dev/urandom where the geo-rep master is mounted at MTPT. With this going on, deliver a SIGKILL to the geo-rep session's aux glusterfs. (It is giving ECONNABORTED non-deterministically, actually in the minority of cases.) Change-Id: I24fd8d0295cdba91d8b994057a1255ca8e2d1a67 BUG: 764510 Signed-off-by: Csaba Henk <csaba@redhat.com> Reviewed-on: http://review.gluster.com/3078 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* geo-rep / syncdaemon: determine suitable xattr namespace based on privilegeCsaba Henk2012-03-051-0/+3
| | | | | | | | | Change-Id: I91fe16d7e5e4c21f138eab4ee0b9334aec40e41b BUG: 765433 Signed-off-by: Csaba Henk <csaba@redhat.com> Reviewed-on: http://review.gluster.com/2838 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com>
* geo-rep: gsyncd: Python3 compat fixesVenky Shankar2012-01-261-3/+4
| | | | | | | | | | Change-Id: I2eef82faab3eed1189e3786a5dca296773e1caa0 BUG: 784498 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.com/2690 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Csaba Henk <csaba@redhat.com>
* cli: add geo-replication log-rotate commandVenky Shankar2011-10-201-1/+21
| | | | | | | | | | | | | | | | | | | | | | | | | Rotating geo-replication master/monitor log files from cli. On invocation, the log file for a given master-slave session is backed up with the current timestamp suffixed to the file name and signal is sent to gsyncd to start logging to a new log file. Sample commands: * Rotate log file for this <master>:<slave> session: gluster volume geo-replication <master> <slave> log-rotate * Rotate log files for all session for master volume <master> gluster volume geo-replication <master> log-rotate * Rotate log files for all sessions: gluster volume geo-replication log-rotate Change-Id: I75f641b4e082a04d5373c18583ca4a1d9651d27a BUG: 3519 Reviewed-on: http://review.gluster.com/529 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Csaba Henk <csaba@gluster.com>
* geo-rep: gsyncd: add --ignore-deletes optionVenky Shankar2011-09-201-0/+25
| | | | | | | | | | | | | | | When this option is set, a file deleted on master will not trigger a delete operation on the slave. Hence, the slave will remain as a superset of the master and can be used to recover the master in case of crash and/or accidental deletes. This options is not enabled by default. Change-Id: I9244d9dfa4f38f19436036f36bec0d9c3a1f7993 BUG: 3552 Reviewed-on: http://review.gluster.com/426 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Csaba Henk <csaba@gluster.com>
* geo-rep: partial support for unprivileged gsyncd via mountbrokerCsaba Henk2011-09-121-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | gsyncd: - mounting code is split to a direct and a mountbroker based backend - option gluster-command gone - new options: gluster-params, gluster-cli-options, mountbroker - mountbroker mount backend is used if either a mountbroker label is given through the mountbroker option, or if gsyncd is unprivileged; in this case the username is used as label - have gluster cli invocations log to stderr so that we don't hit a permission issue with the logfiles glusterd: - do gsyncd pre-config with new options - add option geo-replication-log-group, so if that specified geo-rep logfile directories are given to that group (and thus members of the given group can do logging there) This is just WIP as geo-rep relies on trusted extended attributes and those are not accessible for unprivileged users. Even if we solved this issue, glusterd security settings are too coarse, so that if we made it possible for an unprivileged gsyncd to operate, we would open up too far. Change-Id: Icd520b58cbadccea3fad7c0f437b99de1e22db14 BUG: 2825 Reviewed-on: http://review.gluster.com/399 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* gsyncd: python3 compat fixesCsaba Henk2011-09-121-3/+7
| | | | | | | | | | | Also add __codecheck script which can verify if source is OK at the syntactical level with a given Python interpreter. Change-Id: Ieff34bcd3efd1cdc0e8f9a510c05488f35897bbe BUG: 1570 Reviewed-on: http://review.gluster.com/320 Reviewed-by: Kaushik BV <kaushikbv@gluster.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* gsyncd: do the homework, document _everything_Csaba Henk2011-09-081-0/+23
| | | | | | | | Change-Id: I559e6a0709b8064cfd54c693e289c741f9c4c4ab BUG: 1570 Reviewed-on: http://review.gluster.com/319 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaushik BV <kaushikbv@gluster.com>
* gsyncd: refine command invocationCsaba Henk2011-08-251-0/+3
| | | | | | | | | | | | | | | | | | | | Use subprocess module instead of os.spawn* / ad-hoc fork/exec. With this, we do now: - close uneeded files in children - watch childrens' stderr: - have a thread which collects childrens' stderr into a ring buffer (so that stderr pipe doesn't get stuffed) - on command failure show stderr - distinguish between rsync exit values, tolerate only partial errors - if connection is broken to slave, show ssh/slave gsycd's stderr Change-Id: Ia92f57b5bdfa47f8c44375c50cf279006a0bf69b BUG: 2946 Reviewed-on: http://review.gluster.com/85 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: Kaushik BV <kaushikbv@gluster.com> Reviewed-by: Kaushik BV <kaushikbv@gluster.com>
* gsyncd: do some basic sanitization on logsCsaba Henk2011-07-291-3/+29
| | | | | | | | | | | | - exceptions raised by us will be logged as single-line error messages (full stack strace is shown only at DEBUG loglevel) - common/well understood exceptions are mapped to "user-parsable" error logs Change-Id: I75f1fb848483372364b2093878d9cfed576c9739 BUG: 2778 Reviewed-on: http://review.gluster.com/125 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* gsyncd: log exit properlyCsaba Henk2011-07-281-0/+2
| | | | | | | | Change-Id: Iedd8c0ce9dec2d8dcb01e0e5b409cb53185b1716 BUG: 2778 Reviewed-on: http://review.gluster.com/82 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaushik BV <kaushikbv@gluster.com>
* syncdaemon: minor cleanups on terminationCsaba Henk2011-04-171-4/+7
| | | | | | | | Signed-off-by: Csaba Henk <csaba@gluster.com> Signed-off-by: Anand Avati <avati@gluster.com> BUG: 2736 (gsyncd hangs if crash occurs in the non-main thread) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2736
* syncdaemon: yet another try to exit properlyCsaba Henk2011-04-161-3/+85
| | | | | | | | | | | | The final cleanup sequence + call to _exit, which was just done in the main thread, now is called for in each thread when the thread crashes. Seems we aren't left there hanging this way. Signed-off-by: Csaba Henk <csaba@gluster.com> Signed-off-by: Anand Avati <avati@gluster.com> BUG: 2736 (gsyncd hangs if crash occurs in the non-main thread) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2736
* syncdaemon: ensure -/_ invariance in tunables, in all componentsCsaba Henk2011-04-131-0/+4
| | | | | | | | Signed-off-by: Csaba Henk <csaba@gluster.com> Signed-off-by: Anand Avati <avati@gluster.com> BUG: 2659 (gsync config-del option is not working properly) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2659
* syncdaemon: force termination for unhandled exception in any threadCsaba Henk2011-04-131-0/+20
| | | | | | | | Signed-off-by: Csaba Henk <csaba@gluster.com> Signed-off-by: Anand Avati <avati@gluster.com> BUG: 2736 (gsyncd hangs if crash occurs in the non-main thread) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2736
* syncdaemon: fix transaction codeCsaba Henk2011-04-111-1/+2
| | | | | | | | Signed-off-by: Csaba Henk <csaba@gluster.com> Signed-off-by: Anand Avati <avati@gluster.com> BUG: 2659 (gsync config-del option is not working properly) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2659
* syncdaemon: add monitor mode to support autorestartCsaba Henk2011-04-041-0/+8
| | | | | | | | Signed-off-by: Csaba Henk <csaba@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 2537 (gsync autorestart) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2537
* syncdaemon: provide transactional semantics to config file writingCsaba Henk2011-04-041-0/+31
| | | | | | | | | | | | So updating the config file from multiple contexts won't mess it up. This prepares the next commit where we'll set options internaly (which lacks the serial nature of user actions). Signed-off-by: Csaba Henk <csaba@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 2537 (gsync autorestart) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2537
* syncdaemon: add support from dumping urls in canonical and escaped canonical ↵Csaba Henk2011-03-101-0/+11
form Signed-off-by: Kaushik BV <kaushikbv@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 1570 (geosync related changes) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1570