path: root/glusterfsd
Commit message | Author | Age | Files | Lines
...
* core: Resolve memory leak for brick | Mohit Agrawal | 2019-01-16 | 1 | -1/+7
  Problem: Some functions do not free the memory allocated by xdr_to_generic, so it leaks.
  Solution: Call free to avoid the leak.
  Change-Id: I3524fe2831d1511d378a032f21467edae3850314
  fixes: bz#1656682
  Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* core: glusterd/add-brick-and-validate-replicated-volume-options.t is crash | Mohit Agrawal | 2019-01-14 | 1 | -6/+2
  Problem: Sometimes the brick crashes while handling a pmap signin request.
  Solution: glusterfs_mgmt_pmap_signin was using the same frame to send the pmap signin request, so to avoid the crash the signin request is now sent on a separate frame.
  Change-Id: I443f854171ec4372e8d5f84bdc576c468e92c493
  fixes: bz#1665656
  Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* core: Resolve dict_leak at the time of destroying graph | Mohit Agrawal | 2019-01-14 | 1 | -2/+1
  Problem: Some places in the gluster code call get_new_dict to create a dictionary without taking a reference, so the dictionary leaks at the time of dict_unref.
  Solution: Call dict_new instead of get_new_dict.
  updates bz#1650403
  Change-Id: I3ccbbf5af07079a4fa09aad2cd0458c8625b2f06
  Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
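  The leak described above can be illustrated with a minimal sketch. It does not use the real glusterfs dict API: `toy_dict_t`, `toy_get_new`, `toy_new`, `toy_ref`, `toy_unref` and the `toy_live` counter are hypothetical names, and the sketch assumes that the unref path frees only when the count reaches exactly zero.

```c
#include <stdlib.h>

/* Hypothetical toy type; this is not the real glusterfs dict_t. */
typedef struct {
    int refcount;
} toy_dict_t;

/* Number of live objects, kept only so the leak is observable. */
static int toy_live = 0;

/* Analogue of a constructor that returns an object with no reference taken. */
static toy_dict_t *toy_get_new(void)
{
    toy_dict_t *d = calloc(1, sizeof(*d));
    if (d)
        toy_live++;
    return d;
}

static toy_dict_t *toy_ref(toy_dict_t *d)
{
    d->refcount++;
    return d;
}

/* Analogue of a constructor that returns an object holding one reference. */
static toy_dict_t *toy_new(void)
{
    toy_dict_t *d = toy_get_new();
    return d ? toy_ref(d) : NULL;
}

/* Frees the object only when the count drops to exactly zero. */
static void toy_unref(toy_dict_t *d)
{
    if (--d->refcount == 0) {
        toy_live--;
        free(d);
    }
}
```

  With these assumptions, an object from `toy_new()` is released by a single `toy_unref()`, while an object from `toy_get_new()` has its count pushed from 0 to -1 by the same call and is never freed.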
* glusterd: kill the process without releasing the cleanup mutex lock | Sanju Rakonde | 2019-01-02 | 1 | -4/+3
  Problem: glusterd acquires a cleanup mutex lock before it starts the cleanup process, so that any other thread that tries to acquire a lock on any resource blocks on the cleanup mutex lock. We don't want any thread to try to acquire a resource once cleanup has started, because other threads might try to lock resources that were already freed by the thread going through the cleanup phase. Previously we released the cleanup mutex lock before process exit. Because the lock was released before the process could exit, another thread blocked on the cleanup mutex lock could acquire it and then try to acquire resources already freed as part of cleanup. This led glusterd to crash.
  Solution: Exit the process without releasing the cleanup mutex lock.
  Change-Id: Ibae1c62260f141019017f7a547519a5d38dc2bb6
  fixes: bz#1654270
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* posix: use synctask for janitor | Poornima G | 2018-12-19 | 1 | -5/+3
  With brick mux, the number of threads increases as the number of bricks increases. As an initiative to reduce the number of threads in the brick mux scenario, replace the janitor thread with the synctask infra. close() and closedir() are now handled by a separate janitor thread that is linked with glusterfs_ctx.
  Updates #475
  Change-Id: I0c4aaf728125ab7264442fde59f3d08542785f73
  Signed-off-by: Poornima G <pgurusid@redhat.com>
* fuse: add --lru-limit option | Amar Tumballi | 2018-12-14 | 2 | -0/+25
  The inode LRU mechanism is moot in the fuse xlator (i.e. there is no limit for the LRU list), as fuse inodes are referenced from kernel context, and thus they can only be dropped on request of the kernel. This might result in a high number of passive inodes which are useless for the glusterfs client, causing significant memory overhead. This change tries to remedy this by extending the LRU semantics and allowing a finite limit to be set on the fuse inode LRU.

  A brief history of the problem: When gluster's inode table was designed, fuse didn't have any 'invalidate' method, which means a userspace application could never ask the kernel to send a 'forget()' fop; instead it had to wait for the kernel to send it based on the kernel's parameters. The inode table remembers the number of times the kernel has cached the inode based on the 'nlookup' parameter, and the 'nlookup' field is not used by any other entry point (like server-protocol, gfapi etc). Hence the inode_table of the fuse module always had to have lru-limit as '0', which means no limit. GlusterFS always had to keep all inodes in memory as the kernel would have had a reference to them. Again, the reason for this is that the kernel's glusterfs inode reference was a pointer to the 'inode_t' structure in glusterfs. As it is a pointer, we could never free it (to prevent a segfault or memory corruption).

  Solution: In the inode table, handle the prune case of inodes with 'nlookup' differently, and call an 'invalidator' method, which in this case is fuse_invalidate(); it sends the request to the kernel to get the forget request. When the kernel sends the forget, it means it has dropped all references to the inode, and it sends the forget with the 'nlookup' parameter too. We just need to make sure to reduce the 'nlookup' value we have when we get the forget. That automatically causes the relevant prune to happen.

  Credits: Csaba Henk, Xavier Hernandez, Raghavendra Gowdappa, Nithya B
  fixes: bz#1560969
  Change-Id: Ifee0737b23b12b1426c224ec5b8f591f487d83a2
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
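  The prune-then-forget flow described in the solution can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual inode table code: `toy_inode_t`, `toy_prune`, `toy_forget`, `toy_count_invalidate` and the `invalidate_sent` flag are hypothetical, and the callback merely stands in for the invalidator that asks the kernel for a forget.

```c
#include <stdint.h>

/* Hypothetical toy inode; not the real glusterfs inode_t. */
typedef struct toy_inode {
    uint64_t nlookup;      /* lookups the kernel still remembers */
    int invalidate_sent;   /* an invalidate request is outstanding */
    int freed;             /* set once the inode has been pruned */
} toy_inode_t;

typedef void (*toy_invalidator_t)(toy_inode_t *in);

/* Counts invalidate requests; a real invalidator would notify the kernel. */
static int toy_invalidate_calls = 0;

static void toy_count_invalidate(toy_inode_t *in)
{
    (void)in;
    toy_invalidate_calls++;
}

/*
 * Try to prune one inode that fell off the LRU limit.
 * Returns 1 if it was pruned, 0 if we must wait for the kernel's forget.
 */
static int toy_prune(toy_inode_t *in, toy_invalidator_t invalidate)
{
    if (in->nlookup > 0) {
        if (!in->invalidate_sent) {
            invalidate(in);
            in->invalidate_sent = 1;
        }
        return 0;
    }
    in->freed = 1;
    return 1;
}

/* The kernel's forget carries an nlookup count; reduce ours by that amount. */
static void toy_forget(toy_inode_t *in, uint64_t nlookup)
{
    in->nlookup = (nlookup >= in->nlookup) ? 0 : in->nlookup - nlookup;
    in->invalidate_sent = 0;
}
```

  In this sketch an inode with a non-zero `nlookup` is never freed directly; it becomes prunable only after `toy_forget()` has brought the count down to zero.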
* glusterfsd: Fix coverity issue | Iraj Jamali | 2018-12-14 | 1 | -5/+0
  Problem reported: value assigned to a variable is never used
  Fixes CID : 1274230
  updates: bz#789278
  Change-Id: I7afcb411876dea81c6820c5b31ae0a2896f9ca15
  Signed-off-by: Iraj Jamali <ijamali@redhat.com>
* libglusterfs: Move devel headers under glusterfs directory | ShyamsundarR | 2018-12-05 | 5 | -31/+31
  libglusterfs devel package headers are referenced in code using the include semantics of a program's own headers. While this works, it can be better, especially when dealing with out-of-tree xlator builds or, in general, out-of-tree devel package usage.

  Towards this, the following changes are done:
  - moved all devel headers under a glusterfs directory
  - included these headers using system header notation <> in all code outside of libglusterfs
  - included these headers using own program notation "" within libglusterfs

  This change, although big, just moves the headers around and makes the includes correct when these headers are used from other sources. This helps us correctly include libglusterfs headers without namespace conflicts.
  Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b
  Updates: bz#1193929
  Signed-off-by: ShyamsundarR <srangana@redhat.com>
* server: Resolve memory leak path in server_init | Mohit Agrawal | 2018-12-03 | 1 | -0/+4
  Problem:
  1) server_init does not clean up allocated resources when it fails, before returning the error
  2) dict leak at the time of destroying the graph
  Solution:
  1) free resources when server_init fails
  2) take a dict_ref of the graph xlator before destroying the graph to avoid the leak
  Change-Id: I9e31e156b9ed6bebe622745a8be0e470774e3d15
  fixes: bz#1654917
  Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* glusterd: perform store operation in cleanup lock | Atin Mukherjee | 2018-11-27 | 1 | -35/+38
  All glusterd store operations and the cleanup thread should work under a critical section to avoid any partial store write.
  Change-Id: I4f12e738f597a1f925c87ea2f42565dcf9ecdb9d
  Fixes: bz#1652430
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* coverity: Fix coverity issues | Mohammed Rafi KC | 2018-11-26 | 1 | -2/+5
  This patch fixes coverity
  CID : 1356537 https://scan6.coverity.com/reports.htm#v42907/p10714/fileInstanceId=87389108&defectInstanceId=26791927&mergedDefectId=1356537
  CID : 1395666 https://scan6.coverity.com/reports.htm#v42907/p10714/fileInstanceId=87389187&defectInstanceId=26791932&mergedDefectId=1395666
  CID : 1351707 https://scan6.coverity.com/reports.htm#v42907/p10714/fileInstanceId=87389027&defectInstanceId=26791973&mergedDefectId=1351707
  CID : 1396910 https://scan6.coverity.com/reports.htm#v42907/p10714/fileInstanceId=87389027&defectInstanceId=26791973&mergedDefectId=13596910
  Change-Id: I8094981a741f4d61b083c05a98df23dcf5b022a2
  updates: bz#789278
  Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* core: Resolve memory leak at the time of graph init | Mohit Agrawal | 2018-11-22 | 1 | -3/+34
  Problem: Commit 751b14f2bfd40e08ad395ccd98c6eb0a41ac4e91 missed one code path, which still leaks at the time of calling graph init.
  Solution: Before destroying the graph, call xlator fini to avoid the leak for server-side xlators that call init during graph init.
  Credit: Pranith Kumar Karampuri
  fixes: bz#1651431
  Change-Id: I6e7cff0d792ab9d954524b28667e94f2d9ec19a2
  Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* core: Resolve memory leak at the time of graph init | Mohit Agrawal | 2018-11-20 | 1 | -4/+7
  Problem: Memory leaks when graph init fails during the volfile exchange between brick and glusterd.
  Solution: Fix the error code path in glusterfs_graph_init.
  Change-Id: If62bee61283fccb7fd60abc6ea217cfac12358fa
  fixes: bz#1651431
  Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* glusterfsd: NULL pointer dereferencing clang fix | Iraj Jamali | 2018-11-20 | 1 | -1/+1
  Added a check to avoid a clang warning.
  Updates: bz#1622665
  Change-Id: If9ae4e4f2ae13c85dad0e87d8dd6930dde74bbda
  Signed-off-by: Iraj Jamali <ijamali@redhat.com>
* glusterfsd: Make io-stats xlator search position independent | Pranith Kumar K | 2018-11-15 | 1 | -8/+10
  Problem: The glusterfsd notify trigger for the profile info command expects the decompounder xlator to have the name of the brick and its immediate child to be the io-stats xlator. In GD2 the decompounder xlator doesn't exist, so this prevents the io-stats xlator from receiving the profile info collection notification.
  Fix: Search for the io-stats xlator below the server xlator until the first instance is found, and send the notification to it.
  fixes bz#1649709
  Change-Id: I92a1d9019bbd5546050ab43d50d571c444e027ed
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
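  A position-independent search of this kind can be sketched as a depth-first walk that returns the first node of a given type. The sketch is not the glusterfsd implementation: `toy_xl_t`, `toy_find_first`, the fixed-size child array and the type strings used with it are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_CHILDREN 4

/* Hypothetical toy translator node; not the real xlator_t. */
typedef struct toy_xl {
    const char *type;                           /* e.g. "debug/io-stats" */
    struct toy_xl *children[TOY_MAX_CHILDREN];  /* unused slots are NULL */
} toy_xl_t;

/*
 * Depth-first search below (and including) xl for the first node whose
 * type matches. It does not assume the match is an immediate child.
 */
static toy_xl_t *toy_find_first(toy_xl_t *xl, const char *type)
{
    if (xl == NULL)
        return NULL;
    if (strcmp(xl->type, type) == 0)
        return xl;
    for (size_t i = 0; i < TOY_MAX_CHILDREN && xl->children[i] != NULL; i++) {
        toy_xl_t *hit = toy_find_first(xl->children[i], type);
        if (hit != NULL)
            return hit;
    }
    return NULL;
}
```

  Because the walk keys on the node type rather than on a fixed position, it finds the target whether or not an intermediate translator sits between the top node and it.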
* glusterfsd: Make each multiplexed brick sign in | Prashanth Pai | 2018-11-12 | 1 | -4/+22
  NOTE: This change will be consumed by the brick mux implementation of glusterd2 only. No corresponding change in glusterd1 has been made.

  When a multiplexed brick process is shutting down, it sends sign out requests to glusterd for all bricks that it contains. However, a sign in request is only sent for a single brick. Consequently, glusterd has to use some tricky means to repopulate the pmap registry with information about multiplexed bricks during glusterd restart.

  This change makes each multiplexed brick send a sign in request to glusterd2, which ensures that glusterd2 can easily repopulate the pmap registry with port information. As a bonus, the sign in request will now also contain the PID of the brick sending the request, so that glusterd2 can rely on this instead of having to read/manage brick pidfiles.
  Change-Id: I409501515bd9a28ee7a960faca080e97cabe5858
  updates: bz#1193929
  Signed-off-by: Prashanth Pai <ppai@redhat.com>
* glusterfsd: Do not process GLUSTERD_NODE_STATUS if graph is not ready | Hu Jianfei | 2018-11-07 | 1 | -0/+5
  Otherwise, gnfs will crash if we try to get nfs clients status in the following situation. Also see commit 2f9e555f.

  Reproducible steps:
  1. systemctl restart glusterd; gluster volume status rep
  2. systemctl restart glusterd; gluster volume status rep nfs clients

  Step 1 works ok, but step 2 will crash the localhost gnfs with a certain probability.

  /lib64/libglusterfs.so.0(+0x270f0)[0x7effb6c7b0f0]
  /lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7effb6c854a4]
  /lib64/libc.so.6(+0x35270)[0x7effb52e7270]
  /usr/sbin/glusterfs(glusterfs_handle_node_status+0x155)[0x7effb7196905]
  /lib64/libglusterfs.so.0(+0x63f40)[0x7effb6cb7f40]
  /lib64/libc.so.6(+0x46d40)[0x7effb52f8d40]

  Updates: bz#1646869
  Change-Id: Ia4cb009f821d32b2d18ba48d3467cc81a4b07747
  Signed-off-by: Xie Changlong <xiechanglong@cmss.chinamobile.com>
  Signed-off-by: Hu Jianfei <hujianfei@cmss.chinamobile.com>
* fuse: diagnostic FLUSH interrupt | Csaba Henk | 2018-11-06 | 2 | -0/+45
  We add dummy interrupt handling for the FLUSH fuse message. It can be enabled by the "--fuse-flush-handle-interrupt" hidden command line option, or the "-ofuse-flush-handle-interrupt=yes" mount option.

  It serves no purpose other than diagnostics and demonstration -- to exercise the interrupt handling framework a bit and to give a usage example.

  Documentation is also provided that showcases interrupt handling via FLUSH.
  Change-Id: I522f1e798501d06b74ac3592a5f73c1ab0590c60
  updates: #465
  Signed-off-by: Csaba Henk <csaba@redhat.com>
* glusterfsd: fix the asan leak message | Amar Tumballi | 2018-10-16 | 1 | -0/+1
  Fixes below trace of ASan:

  Direct leak of 130 byte(s) in 1 object(s) allocated from:
    #0 0x7fa794bb5850 in malloc (/lib64/libasan.so.4+0xde850)
    #1 0x7fa7944e5de9 in __gf_malloc ../../../libglusterfs/src/mem-pool.c:136
    #2 0x40b85c in gf_strndup ../../../libglusterfs/src/mem-pool.h:166
    #3 0x40b85c in gf_strdup ../../../libglusterfs/src/mem-pool.h:183
    #4 0x40b85c in parse_opts ../../../glusterfsd/src/glusterfsd.c:1049
    #5 0x7fa792a98720 in argp_parse (/lib64/libc.so.6+0x101720)
    #6 0x40d89f in parse_cmdline ../../../glusterfsd/src/glusterfsd.c:2041
    #7 0x406d07 in main ../../../glusterfsd/src/glusterfsd.c:2625

  updates: bz#1633930
  Change-Id: I394b3fc24b7a994c1b03635cb5e973e7290491d3
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
* glusterfsd: Fix coverity issues around unused values | ShyamsundarR | 2018-10-16 | 1 | -0/+2
  This patch fixes CID: 1274064, 1274232

  The fix is to not add the throughput and time values to the return dict when computing them fails. The pattern applied is the same as when an error occurs while adding either the throughput or the time value to the dict.
  Change-Id: I33e21e75efbc691f18b818934ef3bf70dd075097
  Updates: bz#789278
  Signed-off-by: ShyamsundarR <srangana@redhat.com>
* all: fix warnings on non 64-bits architectures | Xavi Hernandez | 2018-10-10 | 1 | -1/+1
  When compiling on other architectures, many warnings appear. Some of them are actual problems that prevent gluster from working correctly on those architectures.
  Change-Id: Icdc7107a2bc2da662903c51910beddb84bdf03c0
  fixes: bz#1632717
  Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* glusterfsd/mgmt : Check for NULL after creating frame | Ashish Pandey | 2018-09-21 | 1 | -0/+4
  CID: 1382390
  https://scan6.coverity.com/reports.htm#v42607/p10714fileInstanceId=85472585&defectInstanceId=26074725&mergedDefectId=1382390
  Change-Id: Iade073a5e72f29ad5e8f372955869bc287eb9793
  updates: bz#789278
  Signed-off-by: Ashish Pandey <aspandey@redhat.com>
* Land part 2 of clang-format changes | Gluster Ant | 2018-09-12 | 3 | -4706/+4658
  Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4
  Signed-off-by: Nigel Babu <nigelb@redhat.com>
* Land clang-format changes | Gluster Ant | 2018-09-12 | 3 | -148/+125
  Change-Id: I6f5d8140a06f3c1b2d196849299f8d483028d33b
* mgmt/glusterd : Fix coverity issue | Ashish Pandey | 2018-09-11 | 1 | -3/+6
  CID: 727146, 727066
  https://scan6.coverity.com/reports.htm#v42607/p10714/fileInstanceId=85393035&defectInstanceId=26034751&mergedDefectId=727146
  https://scan6.coverity.com/reports.htm#v42607/p10714/fileInstanceId=85392913&defectInstanceId=26034571&mergedDefectId=727066
  updates: bz#789278
  Change-Id: Ieaef33829ec88e68690dabce4ea21d2e61dad9f6
  Signed-off-by: Ashish Pandey <aspandey@redhat.com>
* glusterfsd/src/glusterfsd.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible | Yaniv Kaul | 2018-09-07 | 1 | -5/+5
  It doesn't make sense to calloc (allocate and clear) memory when the code right away fills that memory with data. It may be optimized by the compiler, or have a microscopic performance improvement.

  In some cases, also changed the allocation size to be sizeof some struct or type instead of a pointer - easier to read. In some cases, removed redundant strlen() calls by saving the result into a variable.

  1. Only done for the straightforward cases. There's room for improvement.
  2. Please review carefully, especially for string allocation, with the terminating NUL character.

  Only compile-tested!
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
  Change-Id: Iaed86fcc909022c5158c3e08a9106b1110b9df0a
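  As a rough illustration of the pattern, the sketch below uses plain `malloc()` from the C library rather than the GF_MALLOC()/GF_CALLOC() macros, whose exact signatures are not shown in this log. `toy_copy_string` is a hypothetical helper: it allocates without zero-filling because every byte, including the terminator, is written immediately, and it calls `strlen()` only once.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper for illustration; not a glusterfs function. */
static char *toy_copy_string(const char *src)
{
    size_t len = strlen(src);     /* computed once and reused below */
    char *dst = malloc(len + 1);  /* +1 for the terminator; no zero-fill */

    if (dst == NULL)
        return NULL;

    memcpy(dst, src, len);        /* overwrites bytes 0 .. len-1 */
    dst[len] = '\0';              /* the terminator must be set by hand */
    return dst;
}
```

  The explicit `dst[len] = '\0'` is the part the commit message asks reviewers to check: with a zero-filling allocator the terminator came for free, with a plain allocation it does not.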
* New flag to glusterfsd binary to print libexec dir | Aravinda VK | 2018-09-05 | 3 | -1/+14
  New CLI option for the `glusterfsd` binary to get the path of the libexec directory. This helps glusterd2 to detect the installed path of `gsyncd` and other binaries.

  Usage: `glusterfsd --print-libexecdir`
  Updates: bz#1193929
  Change-Id: I8c1a74afd9acec7ee7bd3deabed9d9f20fe3fb5f
  Signed-off-by: Aravinda VK <avishwan@redhat.com>
* clang-scan: fix multiple issues | Amar Tumballi | 2018-08-31 | 1 | -5/+12
  * Buffer overflow issue in glusterfsd
  * Null argument passed to function expecting non-null (event-epoll)
  * Make sure the op_ret value is set in macro (posix)
  Updates: bz#1622665
  Change-Id: I32b378fc40a5e3ee800c0dfbc13335d44c9db9ac
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
* coverity: multiple fixes | Amar Tumballi | 2018-08-31 | 1 | -1/+4
  CID: 1390477, 1124827
  updates: bz#789278
  Change-Id: I41060d131aec6e58e7267ac8531b29a70f8c4359
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
* build: add --enable-asan configure options | Niels de Vos | 2018-08-30 | 1 | -1/+1
  Introduce a `./configure --enable-asan` to build with `-fsanitize=address -fno-omit-frame-pointer` options. This uses the libasan.so shared library, so that needs to be available.

  While running builds with the ASAN options, several linker issues surfaced and these have been addressed with this change as well. Building with --enable-asan has been tested on Fedora 28.
  Change-Id: I428a9da70dd8f7d0056cfbe5c398619a571469b2
  Updates: #492
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
* multiple files: move from strlen() to sizeof() | Yaniv Kaul | 2018-08-29 | 2 | -3/+3
  {glusterfsd|glusterfsd-mgmt|quota-common-utils|xlator|tier|stripe}.c
  tools/setgfid2path/src/main.c
  xlators/cluster/afr/src/afr-inode-read.c
  {glusterfs-acl|glusterfs}.h

  For const strings, do the size calculation at compile time instead of at runtime.

  Compile-tested only!
  Change-Id: I303684b1ff29b05c10126fb1057f507e404ced07
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
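  A small sketch of the idea, with hypothetical names (`TOY_PREFIX`, `TOY_CONST_STRLEN`, `toy_has_prefix`) and a placeholder prefix string: for a string literal, `sizeof` yields the array size including the terminator, so `sizeof(lit) - 1` equals `strlen(lit)` but is evaluated by the compiler. The trick is only valid for literals and arrays, not for `char *` variables, where `sizeof` gives the pointer size.

```c
#include <stddef.h>
#include <string.h>

/* Placeholder prefix chosen for illustration only. */
#define TOY_PREFIX "trusted.glusterfs."

/* Length of a string literal without the terminator, known at compile time. */
#define TOY_CONST_STRLEN(lit) (sizeof(lit) - 1)

/* Returns non-zero when key starts with TOY_PREFIX. */
static int toy_has_prefix(const char *key)
{
    return strncmp(key, TOY_PREFIX, TOY_CONST_STRLEN(TOY_PREFIX)) == 0;
}
```

  Passing a pointer variable to `TOY_CONST_STRLEN` would compile but give the wrong length, which is one reason such changes deserve more than a compile test.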
* coverity: Multiple coverity fixes for issues with HIGH severity | ShyamsundarR | 2018-08-22 | 1 | -10/+8
  glfs-fops.c
    1391414 Uninitialized pointer read
    List head needed initialization

  glusterfsd-mgmt.c graph.c
    1382431 Buffer not null terminated
    1382417 Dereference before null check
    1382347 Buffer not null terminated
    Cleaned usage of volfile_checksum member of gf_volfile_t struct across the code base.

  glusterd-tier.c
    1382426 Resource leak
    1370955 Dereference before null check
    The function fixed needs more work, but with tier almost being deprecated, addressed some parts of the reported coverity issues as appropriate.

  Tested using the following test cases:
    ./tests/basic/tier/new-tier-cmds.t
    ./tests/basic/tier/tier.t
    ./tests/basic/tier/bug-1214222-directories_missing_after_attach_tier.t
    ./tests/basic/tier/tier_lookup_heal.t
    ./tests/basic/tier/tier-heald.t
    ./tests/basic/tier/tier-snapshot.t
    ./tests/features/glfs-lease.t

  Change-Id: I396f1c34bb112bb22d2745ed279e1a4850cac4af
  Updates: bz#789278
  Signed-off-by: ShyamsundarR <srangana@redhat.com>
* glusterfsd/src/glusterfsd.c : reduce size or re-scope message variable | Yaniv Kaul | 2018-08-21 | 1 | -2/+2
  The error and/or message variable was either:
  - Reduced in size - from 2048 bytes to 64 bytes, for example.
  or
  - Changed in scope - defined in a smaller scope.

  Compile-tested only!
  Change-Id: I20b9fb3407a74ba96fcbc7f05fcab534ff562c09
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* coverity: last of the secure temp fixes | ShyamsundarR | 2018-08-13 | 1 | -3/+1
  The Coverity ignore directive does not work if the comment is split across lines (or has an empty line at the end). This can be seen in this report:
  https://download.gluster.org/pub/gluster/glusterfs/static-analysis/master/glusterfs-coverity/2018-08-06-b982e09f/html/1/384glusterfsd-mgmt.c.html#error

  In other places the same pattern has kept coverity from flagging the same call, except here.
  Updates: bz#789278
  Change-Id: Ic35ff0fc91d0a42904630728ef7c18215aa277f3
  Signed-off-by: ShyamsundarR <srangana@redhat.com>
* coverity: Fix remaining SECURE_TEMP issues reported | ShyamsundarR | 2018-08-03 | 2 | -2/+33
  Two pending SECURE_TEMP issues still exist in the coverity reports; these are fixed by this patch.

  In both instances (where the functions actually seem to be duplicates of each other) the need was for a FILE * and not an fd. Applied the same pattern in both places as in other parts of the code where mkstemp was used and a FILE * was later created from the resulting fd.

  Coverity report: https://download.gluster.org/pub/gluster/glusterfs/static-analysis/master/glusterfs-coverity/2018-07-30-4d3c62e7/html/
  Issues numbered: 382, 383 (named SECURE_TEMP)

  Further, added tmpfile to the blacklist in symbol-check.sh, so that future code changes do not add it again. Also corrected shellcheck errors in symbol-check.sh as a result of updating it.
  Updates: bz#789278
  Change-Id: I1d572a16ca5b5df2f597aeaa5f454fad34c8296e
  Signed-off-by: ShyamsundarR <srangana@redhat.com>
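  The mkstemp-then-FILE * pattern mentioned above can be sketched with standard POSIX calls. This is an illustration and not the code from the patch: `toy_secure_tmpfile` is a hypothetical helper, and unlinking the name right after creation is an assumption made here to mimic the behaviour of `tmpfile()`.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a temporary file securely and return it as a
 * stdio stream. tmpl must be a writable buffer ending in "XXXXXX".
 */
static FILE *toy_secure_tmpfile(char *tmpl)
{
    FILE *fp = NULL;
    int fd = mkstemp(tmpl);   /* creates and opens the file atomically */

    if (fd < 0)
        return NULL;

    /* Assumption: drop the name so the file vanishes once it is closed. */
    unlink(tmpl);

    fp = fdopen(fd, "w+");    /* wrap the descriptor in a FILE * */
    if (fp == NULL)
        close(fd);            /* do not leak the descriptor on failure */
    return fp;
}
```

  A caller would pass something like `char tmpl[] = "/tmp/toy-XXXXXX";` and later call `fclose()` on the result, which also closes the underlying descriptor.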
* build: rename event.h to gf-event.h | Niels de Vos | 2018-07-27 | 2 | -2/+2
  Newer FreeBSD versions (noticed with 10.3-RELEASE) provide an event.h file that on occasion gets included instead of the libglusterfs file. When this happens, 'struct event_pool' will not be defined and building will fail with errors like:

    autoscale-threads.c:18:55: error: incomplete definition of type 'struct event_pool'
            int thread_count = pool->eventthreadcount;
                               ~~~~^
    autoscale-threads.c:17:16: note: forward declaration of 'struct event_pool'
            struct event_pool *pool = ctx->event_pool;
                   ^

  This problem is caused by 'pkg-config --cflags uuid' that adds /usr/local/include to the GF_CPPFLAGS. The use of libuuid is preferred so that the contrib/uuid/ directory can be removed.

  By renaming event.h to gf-event.h there is no conflict between the different event.h files anymore and compiling on FreeBSD works without issues.
  Change-Id: Ie69f6b8a4f8f8e9630d39a86693eb74674f0f763
  Updates: bz#1607319
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
* glusterd: Add multiple checks before attach/start a brick | Mohit Agrawal | 2018-07-27 | 1 | -0/+3
  Problem: In the brick mux scenario, glusterd is sometimes unable to start/attach a brick while gluster v status shows the brick as already running.

  Solution:
  1) To make sure the brick is running, check for brick_path in /proc/<pid>/fd; if the brick is consumed by the brick process, the brick stack has come up, otherwise not
  2) Before starting/attaching a brick, check whether the brick is mounted
  3) At the time of printing volume status, check that the brick is consumed by a brick process

  Test: The following procedure was used to test this:
  1) Set up a brick mux environment on a vm
  2) Put a breakpoint in gdb in function posix_health_check_thread_proc at the time of notifying the GF_EVENT_CHILD_DOWN event
  3) Forcefully unmount any one brick path
  4) Check gluster v status; it will show N/A for the brick
  5) Try to start the volume with the force option; glusterd throws the message "No device available for mount brick"
  6) Mount the brick_root path
  7) Try to start the volume with the force option
  8) The down brick is started successfully

  Change-Id: I91898dad21d082ebddd12aa0d1f7f0ed012bdf69
  fixes: bz#1595320
  Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* All: run codespell on the code and fix issues. | Yaniv Kaul | 2018-07-22 | 1 | -2/+2
  Please review, it's not always just the comments that were fixed. Of course, I had to revert all calls to creat() that were changed to create() ...

  Only compile-tested!
  Change-Id: I7d02e82d9766e272a7fd9cc68e51901d69e5aab5
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* core: dereference check on the variables in glusterfs_handle_brick_status | Hari Gowtham | 2018-07-17 | 1 | -1/+16
  Problem: In a race condition, active->first, which is supposed to be filled, is NULL, and trying to dereference it crashes.

  Back trace:
    Core was generated by `/usr/sbin/glusterfsd -s bxts470192.eu.rabonet.com --volfile-id prod_xvavol.bxts'.
    Program terminated with signal 11, Segmentation fault.
    1029 any = active->first;
    (gdb) bt

  Change-Id: Ia6291865319a9456b8b01a5251be2679c4985b7c
  fixes: bz#1600451
  Signed-off-by: Hari Gowtham <hgowtham@redhat.com>
* glusterfsd: Do not process GLUSTERD_BRICK_XLATOR_OP if graph is not ready | Ravishankar N | 2018-07-02 | 2 | -1/+9
  Problem: If glustershd gets restarted by glusterd due to node reboot, volume start force, or anything else that changes the shd graph (add/remove brick), and index heal is launched via CLI, there is a chance that shd receives this IPC before the graph is fully active. When it then accesses glusterfsd_ctx->active, it crashes.

  Fix: Since glusterd does not really wait for the daemons it spawned to be fully initialized and can send the request as soon as rpc initialization has succeeded, we just handle it at shd. If glusterfs_graph_activate() is not yet done in shd but glusterd sends GD_OP_HEAL_VOLUME to shd, we fail the request.
  Change-Id: If6cc07bc5455c4ba03458a36c28b63664496b17d
  fixes: bz#1596513
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
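  The "fail the request if the graph is not ready" guard can be sketched as below. The types and the handler are hypothetical stand-ins (`toy_ctx_t`, `toy_graph_t`, `toy_handle_op`), not the glusterfsd structures or RPC handlers; only the shape of the check is the point.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the process context and its active graph. */
typedef struct {
    void *first;          /* top translator of the graph, NULL until built */
} toy_graph_t;

typedef struct {
    toy_graph_t *active;  /* NULL until the graph has been activated */
} toy_ctx_t;

/*
 * Handle a management request. Returns 0 when the request can be processed
 * and -1 when it is failed because the graph is not ready yet.
 */
static int toy_handle_op(const toy_ctx_t *ctx)
{
    if (ctx == NULL || ctx->active == NULL || ctx->active->first == NULL)
        return -1;        /* fail early instead of dereferencing NULL */

    /* ... the real work on ctx->active would happen here ... */
    return 0;
}
```

  Failing early pushes the retry decision back to the sender, which fits the commit's observation that glusterd does not wait for the daemons it spawns to finish initializing.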
* rpc/clnt: Don't let consumers manage "connected" state | Raghavendra G | 2018-06-04 | 2 | -36/+1
  The state management of "connected" in rpc is ad-hoc as far as the responsibility goes. Note that there is nothing wrong with functionality itself. rpc layer manages this state in disconnect codepath and has exposed an api to manage this one from consumers. Note that rpc layer never sets "connected" to true by itself, which forces the consumers to use this api to get a working rpc connection.

  The situation is best captured from a comment in code from Jeff Darcy in glusterfsd/src/gf-attach.c:

  -/*
  - * In a sane world, the generic RPC layer would be capable of tracking
  - * connection status by itself, with no help from us. It might invoke our
  - * callback if we had registered one, but only to provide information. Sadly,
  - * we don't live in that world. Instead, the callback *must* exist and *must*
  - * call rpc_clnt_{set,unset}_connected, because that's the only way those
  - * fields get set (with RPC both above and below us on the stack). If we don't
  - * do that, then rpc_clnt_submit doesn't think we're connected even when we
  - * are. It calls the socket code to reconnect, but the socket code tracks this
  - * stuff in a sane way so it knows we're connected and returns EINPROGRESS.
  - * Then we're stuck, connected but unable to use the connection. To make it
  - * work, we define and register this trivial callback.
  - */

  Also, consumers of rpc know about state of connection only through the notifications sent by rpc-clnt. So, consumers don't have any extra information to manage the state and hence letting them manage the state is counter intuitive. This patch cleans that up and instead moves the responsibility of state management of rpc layer into itself.
  Change-Id: I31e641a60795fc480ca753917f4b2579f1e05094
  Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
  Fixes: bz#1585585
* protocol/server: don't assume there would be a volfile id | Amar Tumballi | 2018-05-08 | 1 | -1/+12
  Earlier, glusterfs never assumed that someone would start it with the right arguments, or that brick processes would be spawned by a management layer. It just assumed its role based on the volfile. Other than the volfile, no other arguments should technically be mandatory for glusterfs to work.

  With this patch, that assumption holds true.
  Updates: github issue # 352

  A note on why this particular issue is needed for this basic sanity: as per the design of thin-arbiter/tie-breaker, it can be started independently on any machine, without the need for glusterd. So, similar to 'glusterd', we should be able to spawn a process with any translator without options/volume id etc.
  fixes: bz#1569399
  Change-Id: I5c0650fe0bfde35ad94ccba60e63f6cdcd1ae5ff
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
* glusterd: handling brick termination in brick-mux | Sanju Rakonde | 2018-05-07 | 1 | -12/+12
  Problem: There's a race between the glusterfs_handle_terminate() response sent to glusterd from the last brick of the process and the socket disconnect event that occurs after the brick process gets killed.

  Solution: When it is the last brick of the brick process, instead of sending GLUSTERD_BRICK_TERMINATE to the brick process, glusterd will kill the process (the same as we do in the case of non brick multiplexing).

  The test case is added for https://bugzilla.redhat.com/show_bug.cgi?id=1549996
  Change-Id: If94958cd7649ea48d09d6af7803a0f9437a85503
  fixes: bz#1545048
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* fuse: add support for kernel writeback cache | Csaba Henk | 2018-05-04 | 2 | -0/+72
  - Added kernel-writeback-cache command line and xlator option for requesting utilisation of the writeback cache of the kernel in FUSE_INIT (see [1]).
  - Added attr-times-granularity command line and xlator option via which granularity of the {a,m,c}time in stat (attr) data that we support can be indicated to kernel. This is a means to avoid divergence of the attr times between kernel and userspace that could occur with writeback-cache, while still maintaining maximum time precision the FUSE server is capable of (see [2]).
  - Handling FATTR_CTIME flag in FUSE_SETATTR that indicates presence of ctime in setattr payload. Currently we cannot associate arbitrary ctimes to files on backend, so we just touch them to update their ctimes to current time. Having ctimes in setattr payload is also a side effect of writeback cache (see [3] and [4]).

  [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4d99ff8, "fuse: Turn writeback cache on"
  [2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e27c9d3, "fuse: fuse: add time_gran to INIT_OUT"
  [3]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1e18bda, "fuse: add .write_inode"
  [4]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ab9e13f, "fuse: allow ctime flushing to userspace"

  Updates: #435
  Change-Id: Id174c8e0c815c4456c35f8c53e41a6a507d91855
  Signed-off-by: Csaba Henk <csaba@redhat.com>
* glusterfsd: initiate pmap_signout for all detach brick requests | Atin Mukherjee | 2018-05-04 | 1 | -0/+1
  In glusterfs_handle_terminate all bricks getting detached need to initiate a pmap_signout.
  Change-Id: Iacbd6fcd49215fe6a5210df7dfed1260fde9179a
  Fixes: bz#1570011
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* server: fix unresolved symbols by moving them to libglusterfs | Mohit Agrawal | 2018-04-20 | 1 | -103/+0
  Problem: The glusterd2 build fails due to undefined symbols (xlator_mem_cleanup, glusterfsd_ctx) in server.so.

  Solution: To resolve this, two changes were made:
  1) Move the xlator_mem_cleanup code from glusterfsd-mgmt.c to xlator.c so that it is part of libglusterfs.so
  2) Replace glusterfsd_ctx with this->ctx because the symbol glusterfsd_ctx is not part of server.so

  BUG: 1544090
  Change-Id: Ie5e6fba9ed458931d08eb0948d450aa962424ae5
  fixes: bz#1544090
  Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* Make glusterfsd binary print statedump & xlator dir | Prashanth Pai | 2018-04-19 | 2 | -0/+46
  The glusterd2 needs the following options, some of which are provided by the gluster CLI today:
    --print-xlatordir
    --print-statedumpdir
    --print-logdir

  However, the CLI package need not be present on the machine running glusterd2. This change adds the above CLI options to the glusterfsd binary which glusterd2 depends on.

  Reverts 9a1ae47c8d60836ae0628a04a153f28c1085c0e8

  Related changes:
    https://review.gluster.org/#/c/19882/
    https://github.com/gluster/glusterd2/pull/663

  Updates: bz#1193929
  Change-Id: I18c123b0d3350d2bd4f2400783e3b94e402a4e29
  Signed-off-by: Prashanth Pai <ppai@redhat.com>
* gluster: Sometimes Brick process is crashed at the time of stopping brick | Mohit Agrawal | 2018-04-19 | 1 | -21/+75
  Problem: Sometimes the brick process crashes at the time of stopping a brick while brick mux is enabled.

  Solution: The brick process crashed because the rpc connection was not cleaned up properly while brick mux was enabled. With this patch, after sending the GF_EVENT_CLEANUP notification to the xlator (server), the process waits for all rpc client connections of that specific xlator to be destroyed. Once the rpc connections of all clients associated with that brick are destroyed in server_rpc_notify, xlator_mem_cleanup is called for the brick xlator as well as all child xlators. To avoid races at the time of cleanup, two new flags are introduced on each xlator: cleanup_starting and call_cleanup.

  BUG: 1544090
  Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>

  Note: All test cases were run in a separate build (https://review.gluster.org/#/c/19700/) with the same patch after forcefully enabling brick mux; all test cases passed.
  Change-Id: Ic4ab9c128df282d146cf1135640281fcb31997bf
  updates: bz#1544090
* glusterd: volume inode/fd status broken with brick mux | hari gowtham | 2018-04-19 | 1 | -19/+15
  Problem: The values for inode/fd were populated from the ctx received from the server xlator. Without brickmux, every brick from a volume belonged to a single brick from the volume. So searching the server and populating it worked.

  With brickmux, a number of bricks can be confined to a single process. These bricks can be from different volumes too (if we use the max-bricks-per-process option). If they are from different volumes, using the server xlator to populate the values causes problems.

  Fix: Use the brick to validate and populate the inode/fd status.
  Signed-off-by: hari gowtham <hgowtham@redhat.com>
  Change-Id: I2543fa5397ea095f8338b518460037bba3dfdbfd
  fixes: bz#1566067
* cluster/afr: Make sure latency-arg is passed to afr | Pranith Kumar K | 2018-04-18 | 1 | -1/+1
  xlator_notify doesn't pass the extra arguments that come in the input function, so the XLATOR_NOTIFY macro should be used instead to pass the extra arguments to the function.
  BUG: 1567881
  fixes bz#1567881
  Change-Id: Ic15b6c446638cbacf3149693147a754219037c47
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>