summaryrefslogtreecommitdiffstats
path: root/xlators/features/changelog/src/changelog-rpc.c
Commit message (Collapse)AuthorAgeFilesLines
* Land part 2 of clang-format changesGluster Ant2018-09-121-279/+269
| | | | | Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4 Signed-off-by: Nigel Babu <nigelb@redhat.com>
* core (named threads): flood of -Wformat-truncation warnings with gcc-7.1Kaleb S. KEITHLEY2018-07-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Starting in Fedora 26 which has gcc-7.1.x, -Wformat-trunction is enabled with -Wformat, resulting in a flood of new warnings. This many warnings is a concern because it makes it hard(er) to see other warnings that should be addressed. An example is at https://kojipkgs.fedoraproject.org//packages/glusterfs/3.12.0/1.fc28/data/logs/x86_64/build.log For more info see https://review.gluster.org/#/c/18267/ I can't find much (or good) documentation on the heuristics the compiler uses for this warning. In the case of printing integer types it appears it looks at the available space in the destination and the range of values for the variable and/or its type. To address the specific question about why 0x3ff versus 0xfff to mask the value, either would suffice to hint to the compiler that the printed value will fit in three characters. But the loop is from 0...1023 (or 0...0x3ff if you prefer) so I chose that as a more "accurate" mask to use as it exactly matches the range of values of the loop. Fixes: bz#1492847 Change-Id: I6e309ba42159841131d8241bfc0566ef09e00aa9
* changelog: fix br-state-check.t failure for brick_muxMohit Agrawal2018-05-251-0/+33
| | | | | | | | | | | | | | | Problem: Sometime br-state-check.t crash while runnning for brick multiplex and command in test case is taking 2 minutes for detach a brick Solution: Update code in changelog xlator specific to wait on all connection before cleanup rpc threads and cleanup rpc object only in non brick mux scenario BUG: 1577672 Change-Id: I16e257c1e127744a815000b87bd8b7b8d9c51e1b fixes: bz#1577672 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* gluster: Sometimes Brick process is crashed at the time of stopping brickMohit Agrawal2018-04-191-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | Problem: Sometimes brick process is getting crashed at the time of stop brick while brick mux is enabled. Solution: Brick process was getting crashed because of rpc connection was not cleaning properly while brick mux is enabled.In this patch after sending GF_EVENT_CLEANUP notification to xlator(server) waits for all rpc client connection destroy for specific xlator.Once rpc connections are destroyed in server_rpc_notify for all associated client for that brick then call xlator_mem_cleanup for for brick xlator as well as all child xlators.To avoid races at the time of cleanup introduce two new flags at each xlator cleanup_starting, call_cleanup. BUG: 1544090 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Note: Run all test-cases in separate build (https://review.gluster.org/#/c/19700/) with same patch after enable brick mux forcefully, all test cases are passed. Change-Id: Ic4ab9c128df282d146cf1135640281fcb31997bf updates: bz#1544090
* features/changelog: changelog threads are not cleaned properly after receive ↵Mohit Agrawal2017-12-131-1/+4
| | | | | | | | | | | | | | GF_EVENT_CLEANUP Problem: changelog threads are not cleaned properly after receive GF_EVENT_CLEANUP while brick mux is enabled Solution: change changelog notify code path to cleanup threads properly BUG: 1524816 Change-Id: I007f4e14e3d557707df738faf222b78c793d5ab5 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* glusterfs: Use gcc builtin ATOMIC operator to increase/decreate refcount.Mohit Agrawal2017-12-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: In glusterfs code base we call mutex_lock/unlock to take reference/dereference for a object.Sometime it could be reason for lock contention also. Solution: There is no need to use mutex to increase/decrease ref counter, instead of using mutex use gcc builtin ATOMIC operation. Test: I have not observed yet how much performance gain after apply this patch specific to glusterfs but i have tested same with below small program(mutex and atomic both) and get good difference. static int numOuterLoops; static void * threadFunc(void *arg) { int j; for (j = 0; j < numOuterLoops; j++) { __atomic_add_fetch (&glob, 1,__ATOMIC_ACQ_REL); } return NULL; } int main(int argc, char *argv[]) { int opt, s, j; int numThreads; pthread_t *thread; int verbose; int64_t n = 0; if (argc < 2 ) { printf(" Please provide 2 args Num of threads && Outer Loop\n"); exit (-1); } numThreads = atoi(argv[1]); numOuterLoops = atoi (argv[2]); if (1) { printf("\tthreads: %d; outer loops: %d;\n", numThreads, numOuterLoops); } thread = calloc(numThreads, sizeof(pthread_t)); if (thread == NULL) { printf ("calloc error so exit\n"); exit (-1); } __atomic_store (&glob, &n, __ATOMIC_RELEASE); for (j = 0; j < numThreads; j++) { s = pthread_create(&thread[j], NULL, threadFunc, NULL); if (s != 0) { printf ("pthread_create failed so exit\n"); exit (-1); } } for (j = 0; j < numThreads; j++) { s = pthread_join(thread[j], NULL); if (s != 0) { printf ("pthread_join failed so exit\n"); exit (-1); } } printf("glob value is %ld\n",__atomic_load_n (&glob,__ATOMIC_RELAXED)); exit(0); } time ./thr_count 800 800000 threads: 800; outer loops: 800000; glob value is 640000000 real 1m10.288s user 0m57.269s sys 3m31.565s time ./thr_count_atomic 800 800000 threads: 800; outer loops: 800000; glob value is 640000000 real 0m20.313s user 1m20.558s sys 0m0.028 Change-Id: Ie5030a52ea264875e002e108dd4b207b15ab7cc7 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* libglusterfs: Name threads on creationRaghavendra Talur2017-07-191-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Set names to threads on creation for easier debugging. Output of top -H -p <PID-OF-GLUSTERFSD> Before: 19773 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19774 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19775 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19776 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19777 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19778 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19779 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19780 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19781 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19782 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19783 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19784 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19785 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterfsd 19786 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterfsd 19787 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterfsd 19789 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19790 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 25178 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 5398 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 7881 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd After: 19773 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19774 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustertimer 19775 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19776 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustermemsweep 19777 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustersproc0 19778 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustersproc1 19779 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterepoll0 19780 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusteridxwrker 19781 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusteriotwr0 19782 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterbrssign 19783 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterbrswrker 19784 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterclogecon 19785 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterclogd0 19786 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterclogd1 19787 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterclogd2 19789 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterposixjan 19790 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterposixfsy 25178 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterepoll1 5398 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterepoll2 7881 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterposixhc Change-Id: Id5f333755c1ba168a2ffaa4fce6e71c375e10703 BUG: 1254002 Updates: #271 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: https://review.gluster.org/11926 Reviewed-by: Prashanth Pai <ppai@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* core: run many bricks within one glusterfsd processJeff Darcy2017-01-301-2/+6
| | | | | | | | | | | | | | | | | | | | | | | This patch adds support for multiple brick translator stacks running in a single brick server process. This reduces our per-brick memory usage by approximately 3x, and our appetite for TCP ports even more. It also creates potential to avoid process/thread thrashing, and to improve QoS by scheduling more carefully across the bricks, but realizing that potential will require further work. Multiplexing is controlled by the "cluster.brick-multiplex" global option. By default it's off, and bricks are started in separate processes as before. If multiplexing is enabled, then *compatible* bricks (mostly those with the same transport options) will be started in the same process. Change-Id: I45059454e51d6f4cbb29a4953359c09a408695cb BUG: 1385758 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/14763 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/changelog: fix a segment fault in changelog_cleanup_dispatchersRyan Ding2016-09-021-2/+2
| | | | | | | | | | | | | | | | | the arg 'count' in changelog_cleanup_dispatchers means the thread count to cleanup, so the cleanup loop should start with count-1, not count. change the invoke place also. Change-Id: Ia00853a2e0c5e01e145f60048b1f0ac05f2440ab BUG: 1369397 Signed-off-by: Ryan Ding <ryan.ding@open-fs.com> Reviewed-on: http://review.gluster.org/15293 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kotresh HR <khiremat@redhat.com> Tested-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Milind Changire <mchangir@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* changelog/rpc: Fix rpc_clnt_t mem leaksKotresh HR2016-07-221-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PROBLEM: 1. Freeing up rpc_clnt object might lead to crashes. Well, it was not a necessity to free rpc-clnt object till now because all the existing use cases needs to reconnect back on disconnects. Hence timer code was not taking ref on rpc-clnt object. Glusterd had some use-cases that led to crash due to ping-timer and they fixed only those code paths that involve ping-timer. Now, since changelog has an use-case where rpc-clnt need to be freed up, we need to fix timer code to take refs 2. In changelog, because of issue 1, only mydata was being freed which is incorrect. And there are races where rpc-clnt object would access the freed mydata which would lead to crashes. Since changelog xlator resides on brick side and is long living process, if multiple libgfchangelog consumers register to changelog and disconnect/reconnect mulitple times, it would result in leak of 'rpc-clnt' object for every connect/disconnect. SOLUTION: 1. Handle ref/unref of 'rpc_clnt' structure in timer functions properly. 2. In changelog, unref 'rpc_clnt' in RPC_CLNT_DISCONNECT after disabling timers and free mydata on RPC_CLNT_DESTROY. RPC SETUP IN CHANGELOG: 1. changelog xlator initiates rpc server say 'changelog_rpc_server' 2. libgfchangelog initiates one rpc server say 'libgfchangelog_rpc_server' 3. libgfchangelog initiates rpc client and connects to 'changelog_rpc_server' 4. In return changelog_rpc_server initiates a rpc client and connects back to 'libgfchangelog_rpc_server' REF/UNREF HANDLING IN TIMER FUNCTIONS: Let's say rpc clnt refcount = 1 1. Take the ref before reigstering callback to timer queue >>>> rpc_clnt_ref (say ref count becomes = 2) 2. Register a callback to timer say 'callback1' 3. If register fails: >>>> rpc_clnt_unref (ref count = 1) 4. On timer expiration, 'callback1' gets called. So unref rpc clnt at the end in 'callback1'. This is corresponding to ref taken in step 1 >>>> rpc_clnt_unref (ref count = 1) 5. The cycle from step-1 to step-4 continues....until timer cancel event happens 6. timer cancel of say 'callback1' If timer cancel fails: Do nothing, Step-4 would have unrefd If timer cancel succeeds: >>>> rpc_clnt_unref (ref count = 1) Change-Id: I91389bc511b8b1a17824941970ee8d2c29a74a09 BUG: 1316178 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/13658 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* changelog: fix typecasting of functionPrasanna Kumar Kalever2016-03-171-3/+3
| | | | | | | | | | | | | | | | | fix compiler error: "error: void value not ignored as it ought to be" while the code tries to typecast return type of 'LOCK_DESTROY(lock)' to (void) Change-Id: I38ed47f3a12719c7a4a59388c0d021858d4dfe6d BUG: 1312354 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Reviewed-on: http://review.gluster.org/13533 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* all: reduce "inline" usageJeff Darcy2015-09-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | There are three kinds of inline functions: plain inline, extern inline, and static inline. All three have been removed from .c files, except those in "contrib" which aren't our problem. Inlines in .h files, which are overwhelmingly "static inline" already, have generally been left alone. Over time we should be able to "lower" these into .c files, but that has to be done in a case-by-case fashion requiring more manual effort. This part was easy to do automatically without (as far as I can tell) any ill effect. In the process, several pieces of dead code were flagged by the compiler, and were removed. Change-Id: I56a5e614735c9e0a6ee420dab949eac22e25c155 BUG: 1245331 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/11769 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com>
* features/changelog: Porting log messages to new logging frameworkAnusha Rao2015-08-111-1/+3
| | | | | | | | | | | Change-Id: Ic7f842acca52908fd88e0796dc90b82650405b25 BUG: 1194640 Signed-off-by: Anusha Rao <anusha91rao@gmail.com> Reviewed-on: http://review.gluster.org/10532 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Reviewed-by: Kotresh HR <khiremat@redhat.com>
* features/changelog: Remove inline keyword to avoid warnings (gcc v5.1.1)Anoop C S2015-06-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | When compiled with gcc5, following warnings were displayed and volume start failed: changelog-helpers.h:499:1: warning: inline function 'changelog_dispatch_event' declared but never defined changelog_dispatch_event (xlator_t *, changelog_priv_t *, changelog_event_t *); gf-changelog-journal-handler.c:692:17: warning: 'list_add_tail' is static but used in inline function 'gf_changelog_queue_journal' which is not static list_add_tail (&entry->list, &jnl_proc->entries); Fix is to remove the keyword from function prototype and definitions. Change-Id: I188b35b7ca087a94d7a48a052b05a6d845e3b74b BUG: 1226307 Signed-off-by: Anoop C S <achiraya@redhat.com> Reviewed-on: http://review.gluster.org/11004 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Venky Shankar <vshankar@redhat.com>
* rpc: Maintain separate xlator pointer in 'rpcsvc_state'Kotresh HR2015-05-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The structure 'rpcsvc_state', which maintains rpc server state had no separate pointer to track the translator. It was using the mydata pointer itself. So callers were forced to send xlator pointer as mydata which is opaque (void pointer) by function prototype. 'rpcsvc_register_init' is setting svc->mydata with xlator pointer. 'rpcsvc_register_notify' is overwriting svc->mydata with mydata pointer. And rpc interprets svc->mydata as xlator pointer internally. If someone passes non xlator structure pointer to rpcsvc_register_notify as libgfchangelog currently does, it might corrupt mydata. So interpreting opaque mydata as xlator pointer is incorrect as it is caller's choice to send mydata as any type of data to 'rpcsvc_register_notify'. Maintaining two different pointers in 'rpcsvc_state' for xlator and mydata solves the issue. Change-Id: I7874933fefc68f3fe01d44f92016a8e4e9768378 BUG: 1215161 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/10366 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* features/changelog: RPC'fy {libgf}changelogVenky Shankar2015-03-181-0/+300
This patch introduces RPC based communication between the changelog translator and libgfchangelog. It replaces the old pathetic stream based interaction that existed earlier (due to time constraints :-/). Changelog, upon initialization starts a RPC server (rpcsvc) allowing clients to invoke a probe API as a bootup mechanism to request for event notifications. During probe, clients can choose an event filter specifying the type(s) of events they are interested in. As of now there is no way to change the event notification set once the probe RPC call is made, but that is easier to implement. The actual event notifications is done on a separate RPC session. The client (libgfchangelog) itself starts and RPC server which the changelog translator "connects back" during probe. Notifications are dispatched by a bunch of threads from the server (translator) and the client optionally orders them if ordered notifications are requried. FOPs fill in their respective event details in a buffer (rot-buffs to be particular) and a bunch of threads (consumers) swap the buffers out of roatation and dispatch them via RPC. To avoid writer starvation, then number of dispatcher threads is one less than the number of buffer list in rot-buffs.x libgfchangelog becomes purely callback based -- upon event notification from the server (and re-ordering them if required) invoke a callback routine specified by consumer(s). A major part of the patch is also aimed at providing backward compatibility for geo-replication, which was one of the main consumer of the stream based API. Also, this patch does not\ "turn on" event notifications for all fops, just a bunch which is currently in requirement. Another pain point is that the server does not filter events before dispatching it to the clients. That load is taken up by the client itself (although it's done at the library layer rather than making it hard on the callback implementor). This needs improvement and care needs to be taken to not load the server up with expensive filtering mechanisms. Change-Id: Ibf60a432b68f2dfa60c6f9add2bcfd37a9c41395 BUG: 1170075 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/9708 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>