summaryrefslogtreecommitdiffstats
path: root/libglusterfs
Commit message (Collapse)AuthorAgeFilesLines
* epoll: Adding the ability to configure epoll threadsShyam2015-02-075-30/+235
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the ability to configure the number of event threads for various gluster services. Currently with the multi thread epoll patch, it is possible to have more than one thread waiting on socket activity and processing the same. This thread count is currently static, which this commit makes dynamic. The current services which use IO path, i.e brick processes, any client process (nfs, FUSE, gfapi, heal, rebalance, etc.a), gain 2 set parameters to control the number of threads that are processing events. These settings are, - client.event-threads <n> - server.event-threads <n> The client setting affects the client graph consumers, and the server setting affects the brick processes. These are processed and inited/reconfigured using the client/server protocol xlators. Other services (say glusterd) would need to extend similar configuration settings to take advantage of multi threaded event processing. At present glusterd is not enabled with this commit, as it does not stand to gain from this multi-threading (as I understand it). Change-Id: Id8422fc57a9f95a135158eb6477ccf9d3c9ea4d9 BUG: 1104462 Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on: http://review.gluster.org/9488 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* epoll: edge triggered and multi-threaded epollVijaikumar M2015-02-075-300/+529
| | | | | | | | | | | | | | | | | | | | | | | - edge triggered (oneshot) polling with epoll - pick one event to avoid multiple events getting picked up by same thread and so get better distribution of events against multiple threads - wire support for multiple poll threads to epoll_wait in parallel - evdata to store absolute index and not hint for epoll - store index and gen of slot instead of fd and index hint - perform fd close asynchronously inside event.c for multithread safety - poll is still single threaded Change-Id: I536851dda0ab224c5d5a1b130a571397c9cace8f BUG: 1104462 Signed-off-by: Anand Avati <avati@redhat.com> Signed-off-by: Vijaikumar M <vmallika@redhat.com> Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on: http://review.gluster.org/3842 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* syncop: Provide syncop_ftw and syncop_dir_scan utilsPranith Kumar K2015-02-064-12/+200
| | | | | | | | | | | | | | | | | ftw provides file tree walk. dir_scan does just a readdir not readdirp. Also changed Afr's self-heal-daemon's crawling functions to use this. These utils will be used by ec in future to do proactive/full healing. Change-Id: I05715ddb789592c1b79a71e98f1e8cc29aac5c26 BUG: 1177601 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9485 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* DHT: cluster.min-free-disk option should validate correctlyGauravKumarGarg2015-02-043-8/+46
| | | | | | | | | | | | | | | | | | | | | | PROBLEM: Previously gluster accepting input value as a percentage which is out of range [0-100] and accepting input value as a size (unit is byte) which is fractional for option cluster.min-free-disk. FIX: Now with this change it will refer to correct validation function and it will accept value that is in range [0-100] for input value as a percentage and unsigned integer value for input as a size (unit in byte) for option cluster.min-free-disk. Change-Id: Iee1962a100542e146276cfc8a4068abddee2bf2d BUG: 1163108 Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com> Reviewed-on: http://review.gluster.org/9104 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* libglusterfs : Corrected functions calls.Anand2015-02-042-4/+4
| | | | | | | | | | | | | | | | | Problem : There was mismatch between arguments and parameters in some functions (ex:glusterfs_uuid_buf_get,glusterfs_lkowner_buf_get). It could lead to stack overflow issues . Fix : Arguments are removed during calling these function. Change-Id: Icb41bd4119502d192d9cc7242d385ebe62cdb51a BUG: 1180424 Signed-off-by: Anand <anekkunt@redhat.com> Reviewed-on: http://review.gluster.org/9427 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* libglusterfs: Avoid initializing per process globals more than once.Vijay Bellur2015-01-281-4/+24
| | | | | | | | | | | | | | | | gfapi consumers can invoke glusters_globals_init() multiple times through glfs_new(). This will result in re-initialization of already inited variables and non deterministic behavior. To avoid this, a new function gf_globals_init_once() has been added. The invocation of this function is guarded through pthread_once(), thereby ensuring single initialization of per process globals. Change-Id: I0ecde02ee49e0c7379c2eb0f1c879d89774ec82f BUG: 1184366 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/9430 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* uss: disable memory accounting for the snapshot daemonRaghavendra Bhat2015-01-284-1/+20
| | | | | | | | | | | | | | | | | | * Bring in option to disable memory accounting for a glusterfs process This reverses the changes done by the commit 7fba3a88f1ced610eca0c23516a1e720d75160cd. * Change the key from "memory-accounting" to "no-memory-accounting", as by default all the glusterfs process enable memory accounting now. So to disable memory accounting for some process, "no-mem-accounting" argument has to be passed. Change-Id: I39c7cefb0fe764ea3e48f4e73e1305b084c5f497 BUG: 1184366 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/9469 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* mgmt/glusterd: Implement Volume heal enable/disablePranith Kumar K2015-01-202-0/+20
| | | | | | | | | | | | | | | | | | For volumes with replicate, disperse xlators, self-heal daemon should do healing. This patch provides enable/disable functionality for the xlators to be part of self-heal-daemon. Replicate already had this functionality with 'gluster volume set cluster.self-heal-daemon on/off'. But this patch makes it uniform for both types of volumes. Internally it still does 'volume set' based on the volume type. Change-Id: Ie0f3799b74c2afef9ac658ef3d50dce3e8072b29 BUG: 1177601 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9358 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* features/changelog: Cleanup .processing and .current directoryAravinda VK2015-01-182-0/+89
| | | | | | | | | | | | | | | | | On changelog_register cleanup .processing, .history/.processing, .current and .history/.current from the working directory. Moved glusterd_recursive_rmdir and glusterd_for_each_entry to common place(libglusterfs) and renamed as recursive_rmdir and GF_FOR_EACH_ENTRY_IN_DIR respectively BUG: 1162057 Change-Id: I1f98468a344cead039026762a805437b2f9e507b Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/9082 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* NetBSD portability fix: recover errno on runner errorEmmanuel Dreyfus2015-01-161-4/+0
| | | | | | | | | | | | | | | | | Some time ago we introduced F_CLOSEM to efficiently close unused file descriptors when using a runner. But since it also close the file descriptor used to send back errno to glusterd, it got unable to detect an error on execve(). Fix this by backing out F_CLOSEM usage. BUG: 1129939 Change-Id: I40d3255555145e04e8feafaa2ff4e5fb1570e9a2 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/9447 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: split-brain resolution CLIRavishankar N2015-01-151-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend the AFR heal command to include automated split-brain resolution. This patch [3/3] is the final patch for afr automated split-brain resolution implementation. "gluster volume heal <VOLNAME> [full | statistics [heal-count [replica <HOSTNAME:BRICKNAME>]] |info [healed | heal-failed | split-brain]| split-brain {bigger-file <FILE> |source-brick <HOSTNAME:BRICKNAME> [<FILE>]}]" The new additions being: 1.gluster volume heal <VOLNAME> split-brain bigger-file <FILE> Locates the replica containing the FILE, selects bigger-file as source and completes heal. 2.gluster volume heal <VOLNAME> split-brain source-brick <HOSTNAME:BRICKNAME> <FILE> Selects <FILE> present in <HOSTNAME:BRICKNAME> as source and completes heal. 3.gluster volume heal <VOLNAME> split-brain <HOSTNAME:BRICKNAME> Selects all split-brained files in <HOSTNAME:BRICKNAME> as source and completes heal. Note: <FILE> can be either the full file name as seen from the root of the volume (or) the gfid-string representation of the file, which sometimes gets displayed in the heal info command's output. Entry/gfid split-brain resolution is not supported. Example can be found in the test case. Change-Id: I4649733922d406f14f28ee9033a5cb627b9538b3 BUG: 1136769 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/9377 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* libglusterfs: change signature of syncop_(f)getxattrRavishankar N2015-01-052-6/+10
| | | | | | | | | | | | | | | | | Pass xdata dict to syncop_(f)getxattr calls. This patch [1/3] is required as a part of afr automated split-brain resolution implementation. Change-Id: I3970b3dd6daf64681a031e37f8e9afb14fb3d668 BUG: 1136769 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/9375 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* feature/changelog: Virtual xattr to trigger explicit sync in geo-rep.Kotresh HR2014-12-291-0/+1
| | | | | | | | | | | | | | | | | | | A virtual xattr "glusterfs.geo-rep.trigger-sync" is provided in glusterfs through changelog translator. Geo-rep triggers a explicit data sync on setting this xattr on a file. Changelog captures a DATA entry on file's gfid on setting this virtual xattr on a file. This is supported only for files. It doesn't support directories. Usage: setfattr -n glusterfs.geo-rep.trigger-sync <file-path> Change-Id: Ia689326ac2dcb31035ffbecad2c548eda4eb9245 BUG: 1176934 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/9337 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* cluster/afr : Change in volume heal info commandAnuradha2014-12-231-0/+2
| | | | | | | | | | | | | | | | gluster volume heal <volname> info command will now also display if the files listed (in the output of the command) are in split-brain or possibly being healed. This patch also fixes build warning that occurs. Change-Id: I1fc92e62137f23b2b9ddf6e05819cee6230741d1 BUG: 1163804 Signed-off-by: Anuradha <atalur@redhat.com> Reviewed-on: http://review.gluster.org/9119 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* telldir()/seekdir() portability fixesEmmanuel Dreyfus2014-12-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | POSIX says that an offset obtained from telldir() can only be used on the same DIR *. Linux is abls to reuse the offset accross closedir()/opendir() for a given directory, but this is not portable and such a behavior should be fixed. An incomplete fix for the posix xlator was merged in http://review.gluster.com/8926 This change set completes it. - Perform the same fix index xlator. - Use appropriate casts and variable types so that 32 bit signed offsets obtained by telldir() do not get clobbered when copied into 64 bit signed types. - modify glfs-heal.c and afr-self-heald.c so that they do not use anonymous fd, since this will cause closedir()/opendir() between each syncop_readdir(). On failure we fallback to anonymous fs only for Linux so that we can cope with updated client vs not updated brick. - Avoid sending an EINVAL when the client request for the EOF offset. Here we fix an error in previous fix for posix xlator: since we fill each directory entry with the offset of the next entry, we must consider as EOF the offset of the last entry, and not the value of telldir() after we read it. - Add checks in regression tests that we do not hit cases where offsets fed to seekdir() are wrong. Introduce log_newer() shell function to check for messages produced by the current script. This fix gather changes from http://review.gluster.org/9047 and http://review.gluster.org/8936 making them obsolete. BUG: 1129939 Change-Id: I59fb7f06a872c4f98987105792d648141c258c6a Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/9071 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Tested-by: Raghavendra Bhat <raghavendra@redhat.com>
* client_t: fix for potential NULL pointer dereferenceNiels de Vos2014-12-121-3/+4
| | | | | | | | | | | | | In case an error occurs, 'client' is free'd. The log message just before exiting the function should therefore not use the structure anymore. BUG: 789278 Change-Id: I0848328b29585057cd037a5972c4e5f06a7f978b CID: 1226165 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/9262 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* cluster/afr: Associate the inode returned by inode_link() with corresponding ↵Krutika Dhananjay2014-12-091-1/+4
| | | | | | | | | | | | | entry Change-Id: Ic4436a64075a2615a2293cdfdf2ba6622827cafa BUG: 1129939 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/9254 Reviewed-by: Emmanuel Dreyfus <manu@netbsd.org> Tested-by: Emmanuel Dreyfus <manu@netbsd.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* libgfapi: Wait for GF_EVENT_CHILD_DOWN in glfs_fini()Anoop C S2014-12-081-5/+5
| | | | | | | | | | | | | | | | | | | | | | Whenever glfs_fini() is being called, currently no check is made inside the function to determine whether the child is already down or not. This patch will wait for GF_EVENT_CHILD_DOWN for the active subvol and then exits. TBD: Apart from the active subvol, wait for other CHILD_DOWN events generated through operations like volume set in future. Change-Id: I81c64ac07b463bfed48bf306f9e8f46ba0f0a76f BUG: 1153610 Signed-off-by: Anoop C S <achiraya@redhat.com> Reviewed-on: http://review.gluster.org/9060 Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* dict: Remove the redundant hash calculation when the hash size is 1Poornima Gurusiddaiah2014-11-261-5/+23
| | | | | | | | | | | | | Currently the dict is created with hash size 1, i.e. there is only one hash bucket and the calculation of hash decomes redundant. Change-Id: Id70aea0d798902494ebb6d82955d97d591bc73d2 BUG: 789278 Signed-off-by: Poornima Gurusiddaiah <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/8211 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* dict: Write dict_foreach_* in form of dict_foreach_matchPranith Kumar K2014-11-251-53/+18
| | | | | | | | | | Change-Id: Iaa3454f7f3b6516660b1976bea63e39ea7795f8f BUG: 1164051 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9121 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* Replace copied (from rsync) checksum code by adler32() from zlibNiels de Vos2014-11-212-35/+6
| | | | | | | | | | | | | | | | | | | | The weak checksum code that is included in libglusterfs has initialy been copied from the rsync sources. Instead of maintaining a copy of a function, we should use a function from a shared library. The algorithm seems to be Adler-32, zlib provides an implementation. The strong checksum function has already been replaced by MD5 from OpenSSL. It is time to also remove the comments about the origin of the implementation, because it is not correct anymore. Change-Id: I70c16ae1d1c36b458a035e4adb3e51a20afcf652 BUG: 1149943 Reported-by: Wade Mealing <wmealing@redhat.com> Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/9035 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* core: use gf_time_fmt() instead of localtime()+strftime()Kaleb S. KEITHLEY2014-11-202-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | gf_time_fmt() has existed since 3.3; it provides consistent timestamps (i.e. UTC times) throughout the implementation. (BTW, the other name for UTC is GMT.) N.B. many (all?) commercial storage solutions use UTC time for logging. This makes for easier debugging across geographically distributed systems. Also adding a "%s" fmt for portably printing time as simple numeric value on systems regardless of whether 32-bit or 64-bit time_t. Plus a minor tweak to return a ptr to the dest-string to allow gf_time_fmt() to be passed as a param in a *printf(). Someday we should pick the "one true" timestamp format and revise all calls to gf_time_fmt() to use it instead of the five or six different formats. Change-Id: I78202ae14b7246fa424efeea56bf2463e14abfb0 BUG: 1109917 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/8085 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* gNFS : make it possible to mount a subdir that actually is a symlinkjiffin2014-11-142-0/+131
| | | | | | | | | | | | | | | | | We are using the function to export all sub-directories in a gluster volume via nfs. For real directories it works fine but if we have a symbolic link which points to the directory, it is not possible to mount that directory via nfs and the nameof the link. Kernel nfs resolves symlink handle to directoryhandle , similar gluster nfs should resolve the symbolic link handle into directory handle. Change-Id: I8bd07534ba9474f0b863f2335b2fd222ab625dba BUG: 1157223 Signed-off-by: jiffin tony thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/9052 Reviewed-by: soumya k <skoduri@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* features/marker: Filter internal xattrs in lookupPranith Kumar K2014-11-112-0/+58
| | | | | | | | | | | | | Afr should ignore quota-size-key as part of self-heal but should heal quota-limit key. Change-Id: Ic0b06bd20a563a00d6bfdc2dc5a76c661e533ecb BUG: 1161106 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9061 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* store: fix compile warning in gf_store_unlock()Niels de Vos2014-11-111-1/+6
| | | | | | | | | | | | | | | | | | | | The following warning is logged when building on Fedora 20: store.c:712:15: warning: ignoring return value of 'lockf', declared with attribute warn_unused_result [-Wunused-result] It does not really matter if unlocking fails, close() will release the lock in any case. But, log the error in case close() can not finish and hangs indefinitely (is that possible?). Change-Id: If6c832f9aec10da6c1adb761b13b58e22d38a065 BUG: 1009076 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/9078 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* inode: Handle '/' in basename in inode_link/unlinkPranith Kumar K2014-11-071-1/+25
| | | | | | | | | | | | | | | | | | | Problem: inode_link is sometimes called with a trailing '/'. Lookup, dentry operations like link/unlink/mkdir/rmdir/rename etc come without trailing '/' so the stale dentry with '/' remains in the dentry list of the inode. Fix: Add assert checks and return NULL for '/' in bname. Fix ancestry building code to call without '/' at the end. Change-Id: I9c71292a3ac27754538a4e75e53290e182968fad BUG: 1158751 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9004 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* rebalance: ``check_free_space`` should ignore quota_statfsHarshavardhana2014-10-312-3/+12
| | | | | | | | | | | | | | | | | | | | | quota_statfs() returns aggregated details of space usage of bricks this causes distribute to be confused during ``rebalance``, where ``statfs()`` values are used to schedule file migration. We can make sure the values of ``statfs`` are from individual bricks by selectively instructing ``quota_statfs()`` to return non aggregated values. Change-Id: I1397faeee66a1b9c26709cfda693286d227a4170 BUG: 1158262 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/8996 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Use F_CLOSEM if availableEmmanuel Dreyfus2014-10-301-2/+6
| | | | | | | | | | | | | Use F_CLOSEM to close all file descriptors if available. BUG: 764655 Change-Id: Ib3c682825b89c163ebb152848f2533b3cb62cdce Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8379 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* inode: include dentry cycle checks for all existing inodesAnand Avati2014-10-291-1/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | inode_link() has the responsibility of maintaining the DAGness of the dentry tree, and prevent cyclic loops from forming. To do this the technique used is to perform cyclic check only while linking inodes which already were linked in the inode table (i.e linking a new_inode() can never form a loop). While this was how it was supposed to be all along, the @old_inode variable was missed out to get updated if the given inode to inode_link itself was already linked (i.e, the code was only handling a complex case when the given new inode had a gfid which already existed) Without this patch, it is possible to call inode_link in a specific way which results in dentry loops. Change-Id: I4c87fa2d63f11e31c73d8b847e56962f6c983880 BUG: 1158226 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/8995 Reviewed-by: Harshavardhana <harsha@harshavardhana.net> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* libglusterfs: Do not redefine AT_SYMLINK_NOFOLLOW on DarwinHarshavardhana2014-10-231-0/+3
| | | | | | | | | Change-Id: I6c31b0a01da4b2ad05d4df67418e917c2774faa9 BUG: 1089172 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/8943 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* libglusterfs: include compat-errno.h in common-utils to avoid smoke failuresAtin Mukherjee2014-10-211-0/+1
| | | | | | | | | | Change-Id: I14ae91bf20a0eb7a79b3d6028844f55642eb5426 BUG: 1151303 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/8955 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* logs: Do selective logging for errnosPranith Kumar K2014-10-202-0/+40
| | | | | | | | | | | | | | | | | | | | | Problem: Just after replace-brick the mount logs are filled with ENOENT/ESTALE warning logs because the file is yet to be self-healed now that the brick is new. Fix: Do conditional logging for the logs. ENOENT/ESTALE will be logged at lower log level. Only when debug logs are enabled, these logs will be written to the logfile. Change-Id: If203d09e2479e8c2415ebc14fb79d4fbb81dfc95 BUG: 1151303 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8918 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: statedump supportAtin Mukherjee2014-10-153-25/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Although glusterd currently has statedump support but it doesn't dump its context information. Implementing glusterd_dump_priv function to export per-node glusterd information would be useful for debugging bugs. Once implemented, we could enhance sos-report to fetch this information. This would potentially reduce our time to root cause and data needed for debugability can be dumped gradually. Following is the main items of the dump list targeted in this patch : * Supported max/min op-version and current op-version * Information about peer list * Information about peer list involved while a transaction is going on (xaction_peers) * option dictionary in glusterd_conf_t * mgmt_v3_lock in glusterd_conf_t * List of connected clients * uuid of glusterd * A section of rpc related information like live connections and their statistics There are couple of issues which were found during implementation and testing phase: - xaction_peers of glusterd_conf_t was not initialized in init because of which traversing through this list head was crashing when there was no active transaction - gf_free was not setting the typestr to NULL if the the alloc count becomes 0 for a mem-type earlier allocated. Change-Id: Ic9bce2d57682fc1771cd2bc6af0b7316ecbc761f BUG: 1139682 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/8665 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaushal M <kaushal@redhat.com>
* Do not hardcode umount(8) path, emulate lazy umountEmmanuel Dreyfus2014-10-033-1/+56
| | | | | | | | | | | | | | | | | | 1) Use a system-dependent macro for umount(8) location instead of relying on $PATH to find it, for security and portability sake. 2) Introduce gf_umount_lazy() to replace umount -l (-l for lazy) invocations, which is only supported on Linux; On Linux behavior in unchanged. On other systems, we fork an external process (umountd) that will take care of periodically attempt to unmount, and optionally rmdir. BUG: 1129939 Change-Id: Ia91167c0652f8ddab85136324b08f87c5ac1e51d Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8649 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Csaba Henk <csaba@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterfs: allow setxattr of keys with null values.Ravishankar N2014-09-291-6/+3
| | | | | | | | | | | | | | | Disk based file systems allow to get/set extended attribute key-value pairs where value can be null. Fuse/libgfapi clients must be able to do the same on a gluster volume. Change-Id: Ifc11134cc07f1a3ede43f9d027554dcd10b5c930 BUG: 1135514 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/8567 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* synctask: add backtrace per waiting taskKrishnan Parthasarathi2014-09-222-1/+4
| | | | | | | | | | | | | | | The backtrace is 'saved' in a per-task buffer. This would come handy while debugging code using synctasks. Change-Id: I732b275f6d15b31f31361f5ecf2ba47cacde9b54 BUG: 1138503 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/8622 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Add last successful glusterd lock backtraceKrishnan Parthasarathi2014-09-225-23/+132
| | | | | | | | | | | | | | Also, moved the backtrace fetching logic to a separate function. Modified the backtrace fetching logic able to work under memory pressure conditions. Change-Id: Ie38bea425a085770f41831314aeda95595177ece BUG: 1138503 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/8584 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* syncop: Invoke dict_unref() in inodelk only if dictionary is not NULLVijay Bellur2014-09-171-2/+4
| | | | | | | | | | | | | | | | | | In the absence of this check, logs can get flooded with messages like this when rebalance is run: [2014-09-04 17:48:07.148262] W [dict.c:480:dict_unref] (-->/lib64/libc.so.6() [0x30daa47a00] (-->/usr/local/lib/libglusterfs.so.0(synctask_wrap+0x12) [0x7fa20b7c6ec2] (-->/usr/local/lib/glusterfs/3.7dev/xlator/cluster/distribute.so(dht_migrate_file+0x23f) [0x7fa200fdb58f]))) 0-dict: dict is NULL Change-Id: I4c93e4485293b35d86ba07df4d583d2758ec3f49 BUG: 1130888 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8601 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* Always check for ENODATA with ENOATTREmmanuel Dreyfus2014-09-081-0/+6
| | | | | | | | | | | | | | | | | | Linux defines ENODATA and ENOATTR with the same value, which means that code can miss on on the two without breaking. FreeBSD does not have ENODATA and GlusterFS defines it as ENOATTR just like Linux does. On NetBSD, ENODATA != ENOATTR, hence we need to check for both values to get portable behavior. BUG: 764655 Change-Id: I003a3af055fdad285d235f2a0c192c9cce56fab8 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8447 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* statedump: Print curr_stdalloc in mempool statedump ...Krutika Dhananjay2014-08-291-0/+1
| | | | | | | | | | | | | | | ...for, it is curr_stdalloc that is incremented for every mem_get() and decremented on every call to mem_put() and can be used to detect leaks, when cold_count is 0. Change-Id: I418a132c3ea4c0b99ea5c6840ff3024d8d19ddf4 BUG: 1134221 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/8557 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* libglusterfs/syncop: implement inodelkRaghavendra G2014-08-262-0/+44
| | | | | | | | | | Change-Id: Iea489157490b70cb2bb03576b0d4943c6d8f052d BUG: 1130888 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/8522 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Parsing key/value pair with value containing '='Vijaikumar M2014-08-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | Parsing key/value pair in the brickinfo file, where value contains '=' would truncate everything after '=' in value. For example: A key/value pair mnt-opts=rw,noatime,allocsize=1MiB,noattr2 is parsed as: mnt-opts=rw,noatime,allocsize BUG: 1132451 Change-Id: I756b2fc4b06875267212b1a5c3e91c5a40c6accb Signed-off-by: Vijaikumar M <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/8524 Reviewed-by: Sachin Pandit <spandit@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* build: make GLUSTERD_WORKDIR rely on localstatedirHarshavardhana2014-08-073-4/+8
| | | | | | | | | | | | | | | | | | | | | | - Break-way from '/var/lib/glusterd' hard-coded previously, instead rely on 'configure' value from 'localstatedir' - Provide 's/lib/db' as default working directory for gluster management daemon for BSD and Darwin based installations - loff_t is really off_t on Darwin - fix-off the warnings generated by clang on FreeBSD/Darwin - Now 'tests/*' use GLUSTERD_WORKDIR a common variable for all platforms. - Define proper environment for running tests, define correct PATH and LD_LIBRARY_PATH when running tests, so that the desired version of glusterfs is used, regardless where it is installed. (Thanks to manu@netbsd.org for this additional work) Change-Id: I2339a0d9275de5939ccad3e52b535598064a35e7 BUG: 1111774 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8246 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* libglusterfs: wrong function definition of synclock_destory(). Humble Chirammal2014-08-069-10/+10
| | | | | | | | | | | | | | | | | synclock_destory() has been prototyped in syncop.h, how-ever synclock_destroy() is the actual function used in syncop.c. Correcting this function definition along with few typos. Change-Id: I35a818190c1d37c303279ca7a820f01895751bd9 BUG: 1075417 Signed-off-by: Humble Chirammal <hchiramm@redhat.com> Reviewed-on: http://review.gluster.org/8266 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: Poornima G <pgurusid@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* feature/geo-rep: Keep marker.tstamp's mtime unchangeable during snapshot.Kotresh H R2014-08-042-0/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Geo-replicatoin does a full xsync crawl after snapshot restoration of slave and master. It does not do history crawl. Analysis: Marker creates 'marker.tstamp' file when geo-rep is started for the first time. The virtual extended attribute 'trusted.glusterfs.volume-mark' is maintained and whenever it is queried on gluster mount point, marker fills it on the fly and returns the combination of uuid, ctime of marker.tstamp and others. So ctime of marker.tstamp, in other sense 'volume-mark' marks the geo-rep start time when the session is freshly created. From the above, after the first filesystem crawl(xsync) is done during first geo-rep start, stime should always be less than 'volume-mark'. So whenever stime is less than volume-mark, it does full filesystem crawl (xsync). Root Cause: When snapshot is restored, marker.tstamp file is freshly created losing the timestamps, it was originally created with. Solution: 1. Change is made to depend on mtime instead of ctime. 2. mtime and atime of marker.tstamp is restored back when snapshot is created and restored. Change-Id: I4891b112f4aedc50cfae402832c50c5145807d7a BUG: 1125918 Signed-off-by: Kotresh H R <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/8401 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Bump max op versionAtin Mukherjee2014-08-031-1/+1
| | | | | | | | | | Change-Id: Ic66a0a751233ebbcb65d13f7e3265a046fae9a0b BUG: 1125431 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/8397 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com> Reviewed-by: Kaushal M <kaushal@redhat.com>
* cluster/dht: Modified logic of linkto file deletion on non-hashedVenkatesh Somyajulu2014-07-311-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently whenever dht_lookup_everywhere gets called, if in dht_lookup_everywhere_cbk, a linkto file is found on non-hashed subvolume, file is unlinked. But there are cases when this file is under migration. Under such condition, we should avoid deletion of file. When some other rebalance process changes the layout of parent such that dst_file (w.r.t. migration) falls on non-hashed node, then may be lookup could have found it as linkto file but just before unlink, file is under migration or already migrated In such cased unlink can be avoided. Race: ------- If we have two bricks (brick-1 and brick-2) with initial file "a" under BaseDir which is hashed as well as cached on (brick-1). Assume "a" hashing gives 44. Brick-1 Brick-2 Initial Setup: BaseDir/a BaseDir [1-50] [51-100] Now add new-brick Brick-3. 1. Rebalance-1 on node Node-1 (Brick-1 node) will reset the BaseDir Layout. 2. After that it will perform a) Create linkto file on new-hashed (brick-2) b) Perform file migration. 1.Rebalance-1 Fixes the base-layout: Brick-1 Brick-2 Brick-3 --------- ---------- ------------ BaseDir/a BaseDir BaseDir [1-33] [34-66] [67-100] 2. Only a) is BaseDir/a BaseDir/a(linkto) BaseDir performed Create linktofile Now rebalance 2 on node-2 jumped in and it will perform step 1 and 2-a. After (rebal-2, step-1), it changes the layout of the BaseDir. BaseDir/a BaseDir/a(link) BaseDir [67-100] [1-33] [34-66] For (rebale-2, step-2), It will perform lookup at Brick-3 as w.r.t new layout 44 falls for brick-3. But lookup will fail. So dht_lookup_everywhere gets called. NOTE: On brick-2 by rebalance-1, a linkto file was created. Currently that linkto files gets deleted by rebalance-2 lookup as it is considered as stale linkto file. But with patch if rebalance is already in progress or rebalance is over, linkto file will not be unlinked. If rebalance is in progress fd will be open and if rebalance is over then linkto file wont be set. Change-Id: I3fee0d28de3c76197325536a9e30099d2413f079 BUG: 1116150 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/8345 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Attempt to fix cmockery2 buildEmmanuel Dreyfus2014-07-271-1/+1
| | | | | | | | | | | | | | | | | | | The current code assumes cmockery2 is installed in default paths. Use PKG_MODULES_CHECK to find it using pkg-config if it is not. If not found by pkg-config, try AC_CHECK_LIB. There are also some build flag adjustement so that local overrides do not loose the required -I flags. This includes and enhance http://review.gluster.org/8340/ BUG: 764655 Change-Id: Ide9f77d1e70afe3c1c5c57ae2b93127af6a425f9 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8365 Reviewed-by: Harshavardhana <harsha@harshavardhana.net> Tested-by: Harshavardhana <harsha@harshavardhana.net> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* features/changelog: Capture "correct" internal FOPsVenky Shankar2014-07-231-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes changelog capturing internal FOPs in a cascaded setup, where the intermediate master would record internal FOPs (generated by DHT on link()/rename()). This is due to I/O happening on the intermediate slave on geo-replication's auxillary mount with client-pid -1. Currently, the internal FOP capturing logic depends on client pid being non-negative and the presence of a special key in dictionary. Due to this, internal FOPs on an inter-mediate master would be recorded in the changelog. Checking client-pid being non-negative was introduced to capture AFR self-heal traffic in changelog, thereby breaking cascading setups. By coincidence, AFR self-heal daemon uses -1 as frame->root->pid thereby making is hard to differentiate b/w geo-rep's auxillary mount and self-heal daemon. Change-Id: Ib7bd71e80dd1856770391edb621ba9819cab7056 BUG: 1122037 Original-Author: Venky Shankar <vshankar@redhat.com> Signed-off-by: Kotresh H R <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/8347 Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Bump op-version for 3.7.0Kaushal M2014-07-231-2/+3
| | | | | | | | | Change-Id: I4542edeca140bc2252d765b5cfc2e24d1d90cdb1 BUG: 1122398 Reviewed-on: http://review.gluster.org/8354 Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaushal M <kaushal@redhat.com>