path: root/xlators
Commit message | Author | Age | Files | Lines
* glusterd/shd: Return null proc if process is not running. | Mohammed Rafi KC | 2019-08-05 | 4 | -18/+65
| | | | | | | | | | | | | We were returning the first proc entry even if it was not running. This was on the assumption that the process could have just started and not yet updated the pidfile. Now that we have introduced states for the process, we can decide based on the state. Change-Id: Ibfc11c966b0db599a8d6a08d8b975233b2bbfb8c Fixes: bz#1728766 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* multiple files: reduce minor work under RCU_READ_LOCK | Yaniv Kaul | 2019-08-05 | 12 | -240/+261
| | | | | | | | | 1. Try to unlock faster - in error paths. 2. Remove memory allocations - do them before the lock. Change-Id: I1e9ddd80b99de45ad0f557d62a5f28951dfd54c8 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
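The two points in this message can be sketched in C as follows. This is only an illustration: it uses a plain pthread mutex rather than the RCU read lock the patch touches, and `struct registry`, `registry_add` and `REGISTRY_CAP` are hypothetical names, not GlusterFS code.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define REGISTRY_CAP 4 /* hypothetical fixed capacity */

struct registry {
    pthread_mutex_t lock;
    char *names[REGISTRY_CAP];
    int count;
};

/* Adds a copy of name to the registry. Returns 0 on success, -1 on failure. */
static int
registry_add(struct registry *r, const char *name)
{
    /* Point 2: allocate and copy before taking the lock. */
    size_t len = strlen(name) + 1;
    char *copy = malloc(len);
    if (!copy)
        return -1;
    memcpy(copy, name, len);

    pthread_mutex_lock(&r->lock);
    if (r->count >= REGISTRY_CAP) {
        /* Point 1: on the error path, unlock first and clean up afterwards. */
        pthread_mutex_unlock(&r->lock);
        free(copy);
        return -1;
    }
    r->names[r->count++] = copy;
    pthread_mutex_unlock(&r->lock);
    return 0;
}
```

Only the capacity check and the pointer store run under the lock; the allocation and the cleanup both happen outside it.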
* storage/posix: set the op_errno to proper errno during gfid set | Raghavendra Bhat | 2019-08-04 | 1 | -0/+1
| | | | | | | | | In posix_gfid_set, the proper error is not captured in one of the failure cases. Change-Id: I1c13f0691a15d6893f1037b3a5fe385a99657e00 Fixes: bz#1736482 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
* locks/fencing: Address hang while lock preemption | Susant Palai | 2019-08-02 | 3 | -20/+29
| | | | | | | | | | | | The fop_wind_count can go negative when fencing is enabled on unwind path of the IO leading to hang. Also changed code so that fop_wind_count needs to be maintained only till fencing is enabled on the file. updates: bz#1717824 Change-Id: Icd04b42bc16cd3d50eaa581ee57233910194f480 Signed-off-by: Susant Palai <spalai@redhat.com>
* Multiple files: get trivial stuff done before lock | Yaniv Kaul | 2019-08-01 | 6 | -22/+26
| | | | | | | | | Initializing a dictionary, for example, seems perfectly fine to do before taking a lock. Change-Id: Ib29516c4efa8f0e2b526d512beab488fcd16d2e7 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* posix/ctime: Fix race during lookup ctime xattr heal | Kotresh HR | 2019-08-01 | 1 | -18/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Ctime heals the ctime xattr ("trusted.glusterfs.mdata") in lookup if it's not present. In a multi-client scenario, there is a race which results in updating the ctime xattr to an older value. e.g. Let c1 and c2 be two clients and file1 be the file which doesn't have the ctime xattr. Let the ctime of file1 be t1 (taken from the backend; ctime heals time attributes from the backend when the xattr is not present). Now the following operations are done on the mount: c1 -> ls -l /mnt/file1 | c2 -> ls -l /mnt/file1;echo "append" >> /mnt/file1; The race is that neither c1 nor c2 fetched the ctime xattr in lookup, so both of them try to heal ctime to time 't1'. If c2 wins the race and appends to the file before c1 heals it, it sets the time to 't1' and updates it to 't2' (because of the append). Now c1 proceeds to heal and sets it to 't1', which is incorrect. Solution: Compare the times during heal and only update to the larger time. This is the general approach used in the ctime feature but was missed when healing legacy files. fixes: bz#1734299 Change-Id: I930bda192c64c3d49d0aed431ce23d3bc57e51b7 Signed-off-by: Kotresh HR <khiremat@redhat.com>
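The "only update the larger time" rule can be sketched like this. It is a minimal illustration, not the posix xlator code: `struct time_attr`, `time_after` and `heal_ctime` are hypothetical names, and the real patch operates on the mdata xattr rather than an in-memory struct.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one stored time attribute. */
struct time_attr {
    int64_t sec;
    int64_t nsec;
};

/* True when a is strictly later than b. */
static bool
time_after(const struct time_attr *a, const struct time_attr *b)
{
    if (a->sec != b->sec)
        return a->sec > b->sec;
    return a->nsec > b->nsec;
}

/* Heal the stored ctime: move it forward only, never backwards.
 * Returns true if the stored value was changed. */
static bool
heal_ctime(struct time_attr *stored, const struct time_attr *candidate)
{
    if (!time_after(candidate, stored))
        return false; /* a late heal with an older time is ignored */
    *stored = *candidate;
    return true;
}
```

In the scenario from the message, c1's late heal with `t1` arrives after the stored value has already advanced to `t2`, so the comparison rejects it.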
* cluster/ec: Create heal task with heal process id | Ashish Pandey | 2019-07-30 | 1 | -1/+19
| | | | | | | | | | | | | | | | | | | Problem: ec_data_undo_pending calls syncop_fxattrop->SYNCOP without a frame. In this case SYNCOP gets the frame of the task. However, when we create a synctask for heal we provide frame as NULL. Now, if the read-only feature is ON, it will receive the process ID of the shd as 0 and will consider that it is not an internal process. This will prevent healing of a file, logging a "Read-only file system" error message. Solution: While launching heal, create a synctask using a frame and set the process id of the SHD, which is -6. Change-Id: I37195399c85de322cbcac75633888922c4e3db4a Fixes: bz#1734252
* cluster/ec: Fix reopen flags to avoid misbehavior | Pranith Kumar K | 2019-07-30 | 2 | -3/+8
| | | | | | | | | | | | | | | | | | | | | | | Problem: when a file needs to be re-opened O_APPEND and O_EXCL flags are not filtered in EC. - O_APPEND should be filtered because EC doesn't send O_APPEND below EC for open to make sure writes happen on the individual fragments instead of at the end of the file. - O_EXCL should be filtered because shd could have created the file so even when file exists open should succeed - O_CREAT should be filtered because open happens with gfid as parameter. So open fop will create just the gfid which will lead to problems. Fix: Filter out these two flags in reopen. Change-Id: Ia280470fcb5188a09caa07bf665a2a94bce23bc4 Fixes: bz#1733935 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
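Filtering flags before a reopen amounts to a bit mask. The sketch below masks the three flags the message lists (O_APPEND, O_EXCL, O_CREAT); which of them the actual patch newly filters is not shown here, and `reopen_flags` is a hypothetical helper name.

```c
#include <fcntl.h>

/* Returns the open flags to use for a reopen, with the flags that must not
 * reach the lower layers removed. */
static int
reopen_flags(int flags)
{
    return flags & ~(O_APPEND | O_EXCL | O_CREAT);
}
```

All other bits, such as the access mode or O_TRUNC, pass through unchanged.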
* event: rename event_XXX with gf_ prefixed | Xiubo Li | 2019-07-29 | 6 | -10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I hit a crash when using libgfapi. libgfapi calls glfs_poller() --> event_dispatch() in file api/src/glfs.c:721, and event_dispatch() is defined locally by libglusterfs. The problem is that the name event_dispatch() is exactly the same as the one from the OS's libevent package. For example, if an executable program Foo uses and links both libevent and libgfapi at the same time, I can hit the crash, like: kernel: glfs_glfspoll[68486]: segfault at 1c0 ip 00007fef006fd2b8 sp 00007feeeaffce30 error 4 in libevent-2.0.so.5.1.9[7fef006ed000+46000] The link for Foo is: lib_foo_LADD = -levent $(GFAPI_LIBS) It will crash. This is because glfs_poller() is calling the event_dispatch() from libevent, not the one from libglusterfs. The gfapi link info: GFAPI_LIBS = -lacl -lgfapi -lglusterfs -lgfrpc -lgfxdr -luuid If I link Foo like: lib_foo_LADD = $(GFAPI_LIBS) -levent It works well without any problem. And if Foo calls a private lib, such as handler_glfs.so, which links GFAPI_LIBS directly, while Foo does not and instead does dlopen(handler_glfs.so), then the crash is hit every time. The link info will be: foo_LADD = -levent libhandler_glfs_LIBADD = $(GFAPI_LIBS) I can avoid the crash temporarily by linking GFAPI_LIBS in Foo too, like: foo_LADD = $(GFAPI_LIBS) -levent libhandler_glfs_LIBADD = $(GFAPI_LIBS) But this is ugly since Foo won't use any APIs from GFAPI_LIBS. And in some cases, when the --as-needed link option is added (on many distros it is the default), the crash is back again and the above workaround won't work. Fixes: #699 Change-Id: I38f0200b941bd1cff4bf3066fca2fc1f9a5263aa Signed-off-by: Xiubo Li <xiubli@redhat.com>
* glusterd: write voldir once in glusterd-store and don't attempt again. | Yaniv Kaul | 2019-07-29 | 1 | -29/+16
| | | | | | | | | | | | | | | | | | glusterd_store_brickinfos() calls glusterd_store_brickinfo() for each brick. In it, we call: ret = glusterd_store_create_brick_dir(volinfo); However, volinfo is the same for all those bricks, so there is no need to call it again and again (each call tries to mkdir that dir). We can do it once above the loops in glusterd_store_brickinfos(). While at it, combine two similar functions that write additional dirs. Change-Id: I5858cf7783f088ea13a8fa20115118efa816f4cb updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* cluster/ec: Always read from good-mask | Pranith Kumar K | 2019-07-26 | 2 | -5/+25
| | | | | | | | | | There are cases where fop->mask may have fop->healing added and readv shouldn't be wound on fop->healing. To avoid this always wind readv to lock->good_mask fixes bz#1727081 Change-Id: I2226ef0229daf5ff315d51e868b980ee48060b87 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* fuse: add missing GF_FREE to fuse_interrupt | Csaba Henk | 2019-07-25 | 1 | -1/+4
| | | | | | Change-Id: Id7e003e4a53d0a0057c1c84e1cd704c80a6cb015 Fixes: bz#1728047 Signed-off-by: Csaba Henk <csaba@redhat.com>
* quiesce: add missing fops | Amar Tumballi | 2019-07-25 | 1 | -0/+30
| | | | | | Updates: bz#1693692 Change-Id: I4f005e7168c201709a85db443d643b81e6d3d282 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* [core] fix return of local in __nlc_inode_ctx_get | Rinku Kothiya | 2019-07-25 | 1 | -22/+14
| | | | | | | | | | | | __nlc_inode_ctx_get assigns a value to nlc_pe_p which is never used by its parent function or any of the predecessor hence remove the assignment and also that function argument as it is not being used anywhere. fixes: bz#1732496 Change-Id: I5b950e1e251bd50a646616da872a4efe9d2ff8c9 Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
* cluster/ec: fix EIO error for concurrent writes on sparse files | Xavi Hernandez | 2019-07-24 | 1 | -9/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | EC doesn't allow concurrent writes on overlapping areas; they are serialized. However, non-overlapping writes are serviced in parallel. When a write is not aligned, EC first needs to read the entire chunk from disk, apply the modified fragment and write it again. The problem appears on sparse files because a write to an offset implicitly creates data on offsets below it (so, in some way, they are overlapping). For example, if a file is empty and we read 10 bytes from offset 10, read() will return 0 bytes. Now, if we write one byte at offset 1M and retry the same read, the system call will return 10 bytes (all containing 0's). So if we have two writes, the first one at offset 10 and the second one at offset 1M, EC will send both in parallel because they do not overlap. However, the first one will try to read missing data from the first chunk (i.e. offsets 0 to 9) to recombine the entire chunk and do the final write. This read will happen in parallel with the write to 1M. What could happen is that half of the bricks process the write before the read, and the other half do the read before the write. Some bricks will return 10 bytes of data while the others will return 0 bytes (because the file on the brick has not been expanded yet). When EC tries to recombine the answers from the bricks, it can't, because it needs more than half consistent answers to recover the data. So this read fails with an EIO error. This error is propagated to the parent write, which is aborted, and EIO is returned to the application. The issue happened because EC assumed that a write to a given offset implies that offsets below it exist. This fix prevents the read of the chunk from bricks if the current size of the file is smaller than the read chunk offset. This size is correctly tracked, so this fixes the issue. Also modifying ec-stripe.t file for Test #13 within it.
In this patch, if the file size is less than the offset we are writing, we fill zeros in head and tail and do not consider it a stripe cache miss. That actually makes sense, as we know what data that part holds and there is no need to read it from the bricks. Change-Id: Ic342e8c35c555b8534109e9314c9a0710b6225d6 Fixes: bz#1730715 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
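The decision described above, zero-filling instead of reading when the tracked file size shows the chunk cannot hold data yet, might look roughly like this. It is a sketch under assumptions: `prepare_chunk` and its parameters are hypothetical, and treating a size equal to the chunk offset as "nothing to read" is an assumption of this sketch rather than something the message states.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decides how to obtain the existing contents of the chunk at chunk_off
 * before an unaligned write is merged into it.
 *
 * Returns true if the chunk must be read from the bricks. Returns false
 * after filling buf with zeros when the tracked file size shows that no
 * data exists at or beyond chunk_off. */
static bool
prepare_chunk(uint64_t file_size, uint64_t chunk_off, uint8_t *buf,
              size_t chunk_size)
{
    if (file_size <= chunk_off) {
        /* Nothing stored there yet: the contents are known to be zeros. */
        memset(buf, 0, chunk_size);
        return false;
    }
    return true;
}
```

Because the zero-filled case never issues a read, it cannot race with a concurrent write that extends the file on only some of the bricks.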
* core: use more restrictive mode while creating the directories | Sanju Rakonde | 2019-07-23 | 10 | -41/+41
| | | | | | | fixes: bz#1724024 Change-Id: I539fb7248b2cfc037ec29f1413ea648f9ec21ef2 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* features/utime: Fix mem_put crash | Pranith Kumar K | 2019-07-22 | 1 | -1/+3
| | | | | | | | | | | | | | Problem: When frame->local is not null FRAME_DESTROY calls mem_put on it. Since the stub is already destroyed in call_resume(), it leads to crash Fix: Set frame->local to NULL before calling call_resume() fixes: bz#1593542 Change-Id: I0f8adf406f4cefdb89d7624ba7a9d9c2eedfb1de Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* (multiple files) use dict_allocate_and_serialize() where applicable. | Yaniv Kaul | 2019-07-22 | 6 | -110/+28
| | | | | | | | This function does length, allocation and serialization for you. Change-Id: I142a259952a2fe83dd719442afaefe4a43a8e55e updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* ctime: Set mdata xattr on legacy files | Kotresh HR | 2019-07-22 | 6 | -56/+228
| | | | | | | | | | | | | | | | | | | | | | | | | | Problem: The files which were created before ctime was enabled would not have the "trusted.glusterfs.mdata" (stores time attributes) xattr. Upon fops which modify either ctime or mtime, the xattr gets created with the latest ctime, mtime and atime, which is incorrect. It should update only the corresponding time attribute and take the rest from the backend. Solution: Creating the xattr with values from the brick is not possible, as each brick of a replica set would have different times. So create the xattr upon successful lookup if the xattr is not created. Note To Reviewers: The time attributes used to set the xattr are obtained from a successful lookup. Instead of sending the whole iatt over the wire via setxattr, a structure called mdata_iatt is sent. The mdata_iatt contains only time attributes. Change-Id: I5e535631ddef04195361ae0364336410a2895dd4 fixes: bz#1593542 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* dht: log getxattr failure for node-uuid at "DEBUG" | Susant Palai | 2019-07-18 | 1 | -2/+5
| | | | | | | | | | | | | | | | | | | | There are two ways to fetch node-uuid information from dht. 1 - #define GF_XATTR_LIST_NODE_UUIDS_KEY "trusted.glusterfs.list-node-uuids" This key is used by AFR. 2 - #define GF_REBAL_FIND_LOCAL_SUBVOL "glusterfs.find-local-subvol" This key is used for non-afr volume type. We do two getxattr operations. First on the #1 key followed by on #2 if getxattr on #1 key fails. Since the parent function "dht_init_local_subvols_and_nodeuuids" logs failure, moving the log-level to DEBUG in dht_find_local_subvol_cbk. fixes: bz#1730175 Change-Id: I4d88244dc26587b111ca5b00d4c00118efdaac14 Signed-off-by: Susant Palai <spalai@redhat.com>
* cluster/ec: skip updating ctx->loc again when ec_fix_open/opendir | Kinglong Mee | 2019-07-17 | 2 | -10/+14
| | | | | | | | | | | | | The ec_manager_open/opendir memsets ctx->loc, which causes a memory/inode leak, and ec_fheal uses ctx->loc outside fd->lock, so loc_copy may copy bad data while it is being memset. This patch skips updating ctx->loc when it is already initialized. With it, ctx->loc is filled once and never updated. Change-Id: I3bf5ffce4caf4c1c667f7acaa14b451d37a3550a fixes: bz#1729772 Signed-off-by: Kinglong Mee <mijinlong@horiscale.com>
* cluster/ec: inherit healing from lock when it has info | Kinglong Mee | 2019-07-16 | 1 | -2/+3
| | | | | | | | | If lock has info, fop should inherit healing mask from it. Otherwise, fop cannot inherit right healing when changed_flags is zero. Change-Id: Ife80c9169d2c555024347a20300b0583f7e8a87f fixes: bz#1727081 Signed-off-by: Kinglong Mee <mijinlong@horiscale.com>
* system/posix-acl: update ctx only if iatt is non-NULL | Homma | 2019-07-16 | 1 | -0/+8
| | | | | | | | | We need to safe-guard against possible zero'ing out of iatt structure in acl ctx, which can cause many issues. fixes: bz#1668286 Change-Id: Ie81a57d7453a6624078de3be8c0845bf4d432773 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* dht-common.h: reorder variables to reduce padding. | Yaniv Kaul | 2019-07-15 | 1 | -73/+81
| | | | | | | | | Manually added '-Wpadded' to get warnings on padding, and reordered structs to reduce most of them. Change-Id: I0c505fcb3dfef76399ac9d5d33bfb235354532de updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
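The effect of reordering members can be seen on a small pair of structs. These are hypothetical examples, not structs from dht-common.h; compiling the first one with `-Wpadded` is what produces the kind of warning the message mentions.

```c
#include <stdint.h>

/* Small members on both sides of a wide one: the compiler has to insert
 * padding after 'a' (to align 'b') and after 'c' (to round up the size). */
struct padded {
    char a;
    uint64_t b;
    char c;
};

/* Same members, widest first: the two chars share one padded tail. */
struct reordered {
    uint64_t b;
    char a;
    char c;
};
```

On a typical x86-64 ABI `sizeof(struct padded)` is 24 and `sizeof(struct reordered)` is 16, although the exact numbers depend on the platform's alignment rules.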
* Detach iot_worker to release its resources | Liguang Li | 2019-07-15 | 1 | -0/+1
| | | | | | | | | | | | When iot_worker terminates, its resources are not reaped, which consumes lots of memory. Detach iot_worker to automatically release its resources back to the system. fixes: bz#1729107 Change-Id: I71fabb2940e76ad54dc56b4c41aeeead2644b8bb Signed-off-by: Liguang Li <liguang.lee6@gmail.com>
* glusterd: do not mark skip_locking as true for geo-rep operations | Sanju Rakonde | 2019-07-14 | 1 | -2/+7
| | | | | | | | | | | | | | | | | We need to send the commit req to peers in case of geo-rep operations even though it is a no volname operation. In commit phase peers try to set the txn_opinfo which will fail because it is a no volname operation where we don't require a commit phase. We mark skip_locking as true for no volname operations, but we have to give an exception to geo-rep operations, so that they can set txn_opinfo in commit phase. Please refer to detailed RCA at the bug: 1729463 fixes: bz#1729463 Change-Id: I9f2478b12a281f6e052035c0563c40543493a3fc Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* Fix spelling errors | Aravinda VK | 2019-07-14 | 4 | -4/+4
| | | | | | | Fixes: bz#1728554 Change-Id: I88357aed7c14988a12616035c3738c32c09a8f9a Signed-off-by: Patrick Matthäi <pmatthaei@debian.org> Signed-off-by: Aravinda VK <avishwan@redhat.com>
* ibverbs/rdma: remove from build | Amar Tumballi | 2019-07-13 | 1 | -10/+0
| | | | | | | | | | | | | | | We proposed this a year ago, and with recent smoke failures, it looks like the right time to make the call. ref: https://lists.gluster.org/pipermail/gluster-users/2018-July/034400.html With this, glusterfs-8.0 wouldn't have the rdma feature, which would make some modularity changes possible in the rpc layer (as we would have just 1 transport) Updates: bz#1635688 Change-Id: Ia277dca4d4b1f0cffae20819024a52b075b775e5 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* features/snapview-server: obtain the list of snapshots inside the lock | Raghavendra Bhat | 2019-07-12 | 1 | -1/+1
| | | | | | | | The current list of snapshots from priv->dirents is obtained outside the lock. Change-Id: I8876ec0a38308da5db058397382fbc82cc7ac177 Fixes: bz#1726783
* cluster/ta: Notify the clients only if there are pending heals | karthik-us | 2019-07-12 | 4 | -22/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: In case of thin arbiter, before the index healer starts crawling the indices at every heal-timeout interval, even if there is nothing to be healed, it sends an upcall notification to all the clients to release any AFR_TA_DOM_NOTIFY locks that they hold. SHD will wait for the upcall to return before proceeding with the heal even though there is nothing to be healed. This also invalidates the cached information about the brick states on the clients, which leads to extra calls on TA from clients for the next reads & writes if needed. This impacts the IO performance. Fix: - Before sending the upcall to the clients, check for any pending heals on TA without taking any locks. - If there is nothing marked bad on TA, then continue with the index crawl to heal any dirty markings present on the files due to any post-op failure. - If there is a brick marked as bad on TA, then take the AFR_TA_DOM_NOTIFY lock on TA from SHD, get the state on TA and continue with the current healing process. Change-Id: Ieb477bc6cb18bbdfd4e7a0453c5ed79b574ec9d6 fixes: bz#1724184 Signed-off-by: karthik-us <ksubrahm@redhat.com>
* cluster/afr: Fix incorrect reporting of gfid & type mismatch | karthik-us | 2019-07-12 | 2 | -2/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | Problems: 1. When checking for type and gfid mismatch, if the type or gfid is unknown because of missing gfid handle and the gfid xattr it will be reported as type or gfid mismatch and the heal will not complete. 2. If the source selected during entry heal has null gfid the same will be sent to afr_lookup_and_heal_gfid(). In this function when we try to assign the gfid on the bricks where it does not exist, we are considering the same gfid and try to assign that on those bricks. This will fail in posix_gfid_set() since the gfid sent is null. Fix: If the gfid sent to afr_lookup_and_heal_gfid() is null choose a valid gfid before proceeding to assign the gfid on the bricks where it is missing. In afr_selfheal_detect_gfid_and_type_mismatch(), do not report type/gfid mismatch if the type/gfid is unknown or not set. Change-Id: Ia06552e4dc4a9f89cb7f5302833604bd21bbf7da fixes: bz#1722507 Signed-off-by: karthik-us <ksubrahm@redhat.com>
* Replace usleep() with nanosleep() | Vijay Bellur | 2019-07-11 | 3 | -4/+4
| | | | | | | | | | | | | | | | | | | As usleep has been obsoleted, changed all invocations of usleep to nanosleep. From man 3 usleep: "4.3BSD, POSIX.1-2001. POSIX.1-2001 declares this function obsolete; use nanosleep(2) instead. POSIX.1-2008 removes the specification of usleep()." Added a helper function gf_nanosleep() to have a single place for handling edge cases that might arise from the conversion of usleep to nanosleep and allow the sleep to resume with right remaining value upon being interrupted. Fixes: bz#1721686 Change-Id: Ia39ab82c9e0f4669d2c00d4cdf25e38d94ef9f62 Signed-off-by: Vijay Bellur <vbellur@redhat.com>
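A helper of the kind described, one that resumes the sleep with the remaining time after an interruption, could be sketched as below. This is not the actual gf_nanosleep() implementation; `sleep_nsec` is a hypothetical name and the error handling is an assumption.

```c
#include <errno.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000ULL

/* Sleeps for nsec nanoseconds. If a signal interrupts the sleep, it is
 * resumed with the remaining time. Returns 0 on success, -1 on any other
 * error (errno is left as set by nanosleep). */
static int
sleep_nsec(uint64_t nsec)
{
    struct timespec req;
    struct timespec rem;

    /* tv_nsec must stay below one second, so split the value. */
    req.tv_sec = (time_t)(nsec / NSEC_PER_SEC);
    req.tv_nsec = (long)(nsec % NSEC_PER_SEC);

    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;
        req = rem; /* continue with what is left */
    }
    return 0;
}
```

A former `usleep(usec)` call would then become `sleep_nsec((uint64_t)usec * 1000)`.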
* Remove hadoop related code from the codebase | Vijay Bellur | 2019-07-09 | 5 | -84/+15
| | | | | | | | | As Hadoop is no longer supported, dropping code for handling Hadoop access. Fixes: bz#1728417 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Change-Id: I8fcf4faacb364f1c9a8abb0c48faec337087f845
* glusterd/svc: update pid of mux volumes from the shd process | Mohammed Rafi KC | 2019-07-09 | 9 | -35/+114
| | | | | | | | | | | | | | | | | | | | | | | | For a normal volume, we update the pid from the process itself while daemonizing, or at the end of init if it is in no-daemon mode. Along with updating the pid we also lock the file, to make sure that the process is running fine. With brick mux, we were updating the pidfile from glusterd after an attach/detach request. There are two problems with this approach. 1) We are not holding a pidlock for any file other than the parent process's. 2) There is a chance of race conditions with attach/detach. For example, shd start and a volume stop could race. Let's say we are starting an shd and it is attached to a volume. While we are trying to link the pid file to the running process, it could have been deleted by the thread doing the volume stop. Change-Id: I29a00352102877ce09ea3f376ca52affceb5cf1a Updates: bz#1722541 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* posix: fix Wformat-overflow warning | Sheetal Pamecha | 2019-07-09 | 1 | -2/+2
| | | | | | | | warning: ‘%s’ directive argument is null Change-Id: I2ce9560f98a8310886c31384e40c2e101ad2c719 updates: bz#1193929 Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
* glusterd: fix packed member address warning | Sheetal Pamecha | 2019-07-09 | 1 | -2/+3
| | | | | | | | | | warning: taking address of packed member of ‘struct _quota_limits’ may result in an unaligned pointer value. Change-Id: Ib889c99184560f0d24dbcc4fae10b563a2b9b7fd updates: bz#1193929 Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
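One common way to avoid this warning is to copy the packed member into an aligned local, pass the local's address, and copy the result back. The sketch below shows that pattern; whether the patch does exactly this is not stated in the message, and `struct limits`, `bump` and `bump_hard_limit` are hypothetical names. `__attribute__((packed))` is a GCC/Clang extension.

```c
#include <stdint.h>

/* Packed layout: hard_limit sits at offset 1 and is therefore unaligned. */
struct __attribute__((packed)) limits {
    char tag;
    int64_t hard_limit;
};

/* Some function that expects a normally aligned pointer. */
static void
bump(int64_t *value)
{
    *value += 1;
}

static void
bump_hard_limit(struct limits *l)
{
    /* Reading and writing the member directly is fine; only taking its
     * address yields a possibly unaligned pointer. */
    int64_t tmp = l->hard_limit;
    bump(&tmp);
    l->hard_limit = tmp;
}
```

Calling `bump(&l->hard_limit)` instead would be the form that GCC flags.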
* quick-read: rename cache-invalidation key to avoid redundant keys | Atin Mukherjee | 2019-07-08 | 2 | -6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | With group-metadata-cache group profile settings, the performance.cache-invalidation option, when turned on, enables both the md-cache and quick-read xlators' cache-invalidation feature. While the intent of group-metadata-cache is to set the md-cache xlator's cache-invalidation feature, the quick-read xlator is also affected by it. While the md-cache feature and its profile have existed since release-3.9, quick-read cache-invalidation was introduced in release-4, and due to this op-version mismatch, on any cluster which is >= glusterfs-4, applying this group profile breaks backward compatibility with the old clients. The proposed fix here is to rename the key in quick-read to 'quick-read-cache-invalidation' so that both these features have distinct identification. This by itself brings in a backward compatibility challenge: where this feature is enabled in an existing cluster and the cluster is upgraded to a version where this change exists, it will lead to an unidentified old key. But as a workaround we can always ask users upgrading to the release-7 version to turn off this option, upgrade the cluster and turn it back on with the new key. This needs to be documented once the patch is accepted. Fixes: bz#1698042 Change-Id: I30422ba6496208e21191a8d78ad29b2e21078664 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
* gnfs: use strcpy to prevent memory overflow | Xie Changlong | 2019-07-08 | 1 | -1/+1
| | | | | | fixes: bz#1727248 Change-Id: Iea289032a8feecf2945668d3fb44a6a53089fdea Signed-off-by: Xie Changlong <xiechanglong@cmss.chinamobile.com>
* glusterd.h: align structs | Yaniv Kaul | 2019-07-08 | 1 | -64/+62
| | | | | | | | Move some variables around to reduce compiler padding. Change-Id: I9cd53ce257fb169270c295ac60d58d4a6a950d4f updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* glusterd: Show the correct brick status in get-state | Mohit Agrawal | 2019-07-04 | 3 | -2/+37
| | | | | | | | | | | | | Problem: get-state does not show the correct brick status if the brick status is not Started; it always shows Started if any value is set in brickinfo->status. Solution: Check the value of brickinfo->status to show the correct status in get-state. Change-Id: I12a79619024c2cf59f338220d144f2f034059b3b fixes: bz#1726906 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* glusterd: don't log a warning message for tier-enabled key | Sanju Rakonde | 2019-07-04 | 1 | -2/+5
| | | | | | | | | | | | | We are logging a warning message saying unknown-key for the tier-enabled key. Although the tier xlator is deprecated, this key is left behind for handling the peer rejection issues in a heterogeneous cluster. We need not log if this key is not found/recognised. updates: bz#1193929 Change-Id: Ia68661898a618f99a240ca8d8a124ff6a65ebe9d Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* features/snapview-server: use the same volfile server for gfapi options | Raghavendra Bhat | 2019-07-03 | 2 | -4/+42
| | | | | | | | | | | snapview server xlator makes use of "localhost" as the volfile server while initing the new glfs instance to talk to a snapshot. While localhost is fine, better use the same volfile server that was used to start the snapshot daemon containing the snapview-server xlator. Change-Id: I4485d39b0e3d066f481adc6958ace53ea33237f7 fixes: bz#1725211 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
* cluster/dht: Fixed a memleak in dht_rename_cbk | N Balachandran | 2019-07-02 | 1 | -11/+33
| | | | | | | | | Fixed a memleak in dht_rename_cbk when creating a linkto file. Change-Id: I705adef3cb79e33806520fc2b15558e90e2c211c fixes: bz#1722698 Signed-off-by: N Balachandran <nbalacha@redhat.com>
* glusterfs-fops: fix the modularity | Amar Tumballi | 2019-07-02 | 1 | -1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | glusterfs-fops.h was moved to rpc/xdr to support compound fops. (ref: https://review.gluster.org/14032, 2f945b86d3) This was fine as long as all these header files were in single include directory after 'install'. With the move to separate out glusterfs specific header files into another directory inside /usr/include (ref: https://review.gluster.org/21746, 20ef211cfa), glusterfs-fops.h file was not in the proper path when an external .c file tried to include any of glusterfs specific .h file (like xlator.h). Now, we have removed compound-fops, with that, none of the enums declared in glusterfs-fops.h are actually getting used on wire anymore. Hence, it makes sense to get this to libglusterfs/src as a single point of definition. With this change, the external programs can use glusterfs header files. also remove some enum definitions which are not used in code anymore. Updates: bz#1636297 Change-Id: I423c44d3dbe2efc777299c544ece3cb172fc7e44 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* core: use multiple servers while mounting a volume using ipv6 | Sanju Rakonde | 2019-07-01 | 1 | -1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | According to the man page: mount -t glusterfs [-o <options>] <server1>,<server2>, <server3>,..<serverN>:/<volname>[/<subdir>] <mount_point> When we try mount -t glusterfs 52:54:00:23:5a:b6,52:54:00:a0:be:ed:/ta-vol1 /mnt/ta we see the below msg in the log: [2019-06-17 11:41:36.085809] I [MSGID: 100030] [glusterfsd.c:2867:main] 0-/usr/local/sbin/glusterfs: Started running /usr/local/sbin/glusterfs version 7dev (args: /usr/local/sbin/glusterfs --process-name fuse --volfile-server=52:54:00:23:5a --volfile-id=/ta-vol1 /mnt/ta) With this change, we'll be able to give multiple volfile servers while mounting the volume using ipv6. After the change, I see the following in the log: [2019-06-17 12:00:21.183658] I [MSGID: 100030] [glusterfsd.c:2867:main] 0-/usr/local/sbin/glusterfs: Started running /usr/local/sbin/glusterfs version 7dev (args: /usr/local/sbin/glusterfs --process-name fuse --volfile-server=52:54:00:23:5a:b6 --volfile-server=52:54:00:a0:be:ed --volfile-id=/ta-vol1 /mnt/ta) fixes: bz#1719290 credits: Aga <aga_1990@hotmail.com> Change-Id: Icf89bea3ba15d8374ef428aeb59f2ef55ad544ec Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* glusterd: fix clang scan defects | Atin Mukherjee | 2019-07-01 | 2 | -3/+4
| | | | | | | | | | Fixes following: https://build.gluster.org/job/clang-scan/744/clangScanBuildBugs/browse/report-dd8d31.html#EndPath https://build.gluster.org/job/clang-scan/744/clangScanBuildBugs/browse/report-89a1cd.html#EndPath Updates: bz#1622665 Change-Id: Ibd201fb2ca54ae7ae3fed8a8d87815358b614349 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd/thin-arbiter: Thin-arbiter integration with GD1 | Vishal Pandey | 2019-06-28 | 9 | -23/+730
| | | | | | | | | | | | | | | | | | | | | | | | gluster volume create <VOLNAME> replica 2 thin-arbiter 1 <host1>:<brick1> <host2>:<brick2> <thin-arbiter-host>:<path-to-store-replica-id-file> [force] The changes have been made in a way that the last brick in the bricks list will be treated as the thin-arbiter. GD1 will be manipulated to consider the replica count to be 2 and continue creating the volume like any other replica 2 volume, but since thin-arbiter volumes need ta-brick client xlator entries for each subvolume in the fuse volfile, volfile generation is modified to inject these entries separately in the volfile for every subvolume. A few more additions - 1- Save the volinfo with new fields ta_bricks list and thin_arbiter_count. 2- Introduce a new option client.ta-brick-port to add remote-port to the ta-brick xlator entry in fuse volfiles. The option can be set using the following CLI syntax - gluster volume set <VOLNAME> client.ta-brick-port <PORTNO.> 3- Volume Info will contain a Thin-Arbiter-path entry to distinguish it from other replicate volumes. Change-Id: Ib434e2313b29716f32476c6c211d282c4ef39406 Updates #687 Signed-off-by: Vishal Pandey <vpandey@redhat.com>
* graph/shd: Use top down approach while cleaning xlator | Mohammed Rafi KC | 2019-06-27 | 9 | -1/+12
| | | | | | | | | | | | | | We were cleaning xlators from bottom to top, which might lead to problems when upper xlators try to access the xlator object loaded below. One such scenario is when fd_unref happens as part of the fini call, which might lead to calling releasedir on the lower xlator. This will lead to an invalid memory access. Change-Id: I8a6cb619256fab0b0c01a2d564fc88287c4415a0 Updates: bz#1716695 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* protocol/client: propagate GF_EVENT_CHILD_PING only for connections to brick | Raghavendra G | 2019-06-27 | 1 | -4/+12
| | | | | | | | | | | | | | | | | | | | | | | Two reasons: * ping responses from glusterd may not be relevant for Halo replication. Instead, it might be interested in only knowing whether the brick itself is responsive. * When a brick is killed, propagating GF_EVENT_CHILD_PING of ping response from glusterd results in GF_EVENT_DISCONNECT spuriously propagated to parent xlators. These DISCONNECT events are from the connections client establishes with glusterd as part of its reconnect logic. Without GF_EVENT_CHILD_PING, the last event propagated to parent xlators would be the first DISCONNECT event from brick and hence subsequent DISCONNECTS to glusterd are not propagated as protocol/client prevents same event being propagated to parent xlators consecutively. propagating GF_EVENT_CHILD_PING for ping responses from glusterd would change the last_sent_event to GF_EVENT_CHILD_PING and hence protocol/client cannot prevent subsequent DISCONNECT events Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Fixes: bz#1716979 Change-Id: I50276680c52f05ca9e12149a3094923622d6eaef
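The interaction between the "do not repeat the last event" rule and an interleaved ping can be sketched as below. This is an illustration only: `enum event`, `struct conn_state` and `should_propagate` are hypothetical names standing in for the GF_EVENT_* values and the last_sent_event bookkeeping in protocol/client.

```c
/* Hypothetical stand-ins for the events mentioned in the message. */
enum event { EV_NONE, EV_CONNECT, EV_DISCONNECT, EV_CHILD_PING };

struct conn_state {
    enum event last_sent; /* last event propagated to parent xlators */
};

/* Returns 1 if ev should be propagated to the parents, 0 if it repeats the
 * previously propagated event and is therefore suppressed. */
static int
should_propagate(struct conn_state *s, enum event ev)
{
    if (s->last_sent == ev)
        return 0;
    s->last_sent = ev;
    return 1;
}
```

With the sequence DISCONNECT, DISCONNECT the second event is suppressed, but with DISCONNECT, CHILD_PING, DISCONNECT the ping resets `last_sent` and the second DISCONNECT gets through, which is the spurious propagation the message describes.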
* posix : add posix_set_ctime() in posix_ftruncate() | Jiffin Tony Thottan | 2019-06-27 | 1 | -0/+2
| | | | | | Change-Id: I0cb5320fea71306e0283509ae47024f23874b53b fixes: bz#1723761 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>