Commit message | Author | Age | Files | Lines
* cluster/afr: Add the "diff" self-heal algorithm. | Vikas Gorur | 2009-09-22 | 3 | -2/+362
|
| The "diff" self-heal algorithm works as follows:
|
| For each block:
|     Compute MD5 checksum on source and all sinks
|     If checksum on a sink differs from source:
|         Read block from source and write to sinks
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
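The per-block loop described above can be sketched as follows. This is a minimal illustration over in-memory buffers with a single sink, not the AFR implementation: `diff_heal` and `block_checksum` are hypothetical names, and a 64-bit FNV-1a hash stands in for the MD5 strong checksum that the commit uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the strong checksum: 64-bit FNV-1a, NOT MD5 (assumption). */
uint64_t block_checksum(const unsigned char *buf, size_t len)
{
    uint64_t h = 14695981039346656037ULL;
    for (size_t i = 0; i < len; i++) {
        h ^= buf[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* Heal `sink` from `source` block by block.
 * Returns the number of blocks that had to be rewritten. */
size_t diff_heal(const unsigned char *source, unsigned char *sink,
                 size_t size, size_t block_size)
{
    assert(block_size > 0);
    size_t healed = 0;
    for (size_t off = 0; off < size; off += block_size) {
        size_t len = size - off < block_size ? size - off : block_size;
        /* Copy only when the checksums of the two blocks differ. */
        if (block_checksum(source + off, len) !=
            block_checksum(sink + off, len)) {
            memcpy(sink + off, source + off, len);
            healed++;
        }
    }
    return healed;
}
```

Compared with the "full" algorithm, only mismatching blocks are transferred, at the cost of checksumming every block on every copy.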
* cluster/afr: Make the self-heal algorithm pluggable. | Vikas Gorur | 2009-09-22 | 5 | -168/+287
|
| Abstract the read/write loop part of data self-heal.
|
| This patch has support for the "full" (i.e., read and write
| entire file) algorithm.
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
* cluster/afr: Open source and sinks in read/write mode during self-heal. | Vikas Gorur | 2009-09-22 | 1 | -2/+2
|
| Since a self-heal algorithm (e.g., rsync) might want to both read
| and write from both the source and sink files, open them as O_RDWR.
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
* protocol/server: Implement rchecksum. | Vikas Gorur | 2009-09-22 | 1 | -1/+80
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
* protocol/client: Implement rchecksum. | Vikas Gorur | 2009-09-22 | 1 | -0/+71
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
* storage/posix: Implement rchecksum. | Vikas Gorur | 2009-09-22 | 1 | -0/+66
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
* libglusterfs: Add RCHECKSUM fop. | Vikas Gorur | 2009-09-22 | 9 | -2/+204
|
| rchecksum (fd, offset, len): Calculates both the weak and strong
| checksums for a block of {len} bytes at {offset} in {fd}.
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
* libglusterfs: Add checksum functions. | Vikas Gorur | 2009-09-22 | 5 | -2/+495
|
| gf_rsync_weak_checksum: Calculates a simple 32-bit checksum.
| gf_rsync_strong_checksum: Calculates the MD5 checksum.
|
| The strong checksum function makes use of Christophe Devine's MD5
| implementation (adapted from the rsync source code, version 3.0.6.
| <http://www.samba.org/ftp/rsync/>).
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
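To illustrate what a "simple 32-bit checksum" in the rsync style looks like, here is a minimal sketch. It is not a copy of gf_rsync_weak_checksum, whose exact constants and offsets may differ; `weak_checksum_sketch` is a hypothetical name. It keeps a plain byte sum in the low 16 bits and a position-weighted sum in the high 16 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* rsync-style weak checksum sketch: s1 is the running byte sum,
 * s2 is the sum of all intermediate s1 values (so it depends on
 * byte order, unlike s1 alone). */
uint32_t weak_checksum_sketch(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    uint32_t s1 = 0, s2 = 0;
    for (size_t i = 0; i < len; i++) {
        s1 += buf[i];
        s2 += s1;
    }
    /* Pack: low 16 bits from s1, high 16 bits from s2. */
    return (s1 & 0xffff) | (s2 << 16);
}
```

A checksum of this shape is cheap to compute but collides easily, which is why it is paired with a strong checksum such as MD5.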
* booster: implement F_DUPFD command in fcntl. | Raghavendra G | 2009-09-22 | 1 | -0/+2
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 277 (running dd on booster returns EINVAL)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=277
* performance/write-behind: add option "enable-trickling-writes". | Raghavendra G | 2009-09-22 | 1 | -22/+41
|
| - With this option enabled, writes are stack-wound even though not
|   enough data is aggregated, provided there are no write-requests
|   which have been stack-wound but whose replies are yet to come. The
|   reason behind this option is to make use of the network, which is
|   relatively free (with no writes or replies in transit). However,
|   with non-standard block-sizes of writes the performance can
|   actually degrade. Hence making this configurable.
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 276 (write behind needs to be optimized.)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=276
* performance/write-behind: reduce traversal of request list during wb_mark_winds. | Raghavendra G | 2009-09-22 | 1 | -35/+28
|
| - move all the decision making code to __wb_can_wind.
| - don't continue traversing the request list, once we know any of
|   the following conditions are true:
|   * requests other than write are present in queue.
|   * writes are happening at non-contiguous offsets.
|   * there are no write requests, which are wound to server but not
|     yet received the reply.
|   * enough data is aggregated for writing.
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 276 (write behind needs to be optimized.)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=276
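The early-exit traversal listed above might look roughly like the sketch below. It is not the body of __wb_can_wind: the `request` struct, `can_wind_sketch`, and its parameters are hypothetical, and it assumes that each listed condition means "stop scanning and wind what is queued". The `visited` counter exists only to make the shortened traversal observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum req_kind { REQ_WRITE, REQ_OTHER };

/* Hypothetical queue entry; not the write-behind request type. */
struct request {
    enum req_kind kind;
    long long offset;
    size_t size;
    struct request *next;
};

/* Returns true when the queue should be wound now. */
bool can_wind_sketch(const struct request *head, size_t aggregate_limit,
                     size_t writes_in_flight, bool trickling,
                     size_t *visited)
{
    assert(visited != NULL);
    *visited = 0;

    /* Trickling writes: the network is idle, so wind without scanning. */
    if (trickling && writes_in_flight == 0)
        return head != NULL;

    size_t aggregated = 0;
    long long expected = head ? head->offset : 0;
    for (const struct request *r = head; r != NULL; r = r->next) {
        (*visited)++;
        if (r->kind != REQ_WRITE)
            return true;            /* non-write request in the queue */
        if (r->offset != expected)
            return true;            /* non-contiguous write */
        aggregated += r->size;
        if (aggregated >= aggregate_limit)
            return true;            /* enough data aggregated */
        expected = r->offset + (long long)r->size;
    }
    return false;                   /* keep aggregating */
}
```

The point of the change is visible in `visited`: once any condition settles the decision, the remaining requests are never touched.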
* performance/write-behind: reduce list-traversal during wb_mark_unwinds | Raghavendra G | 2009-09-22 | 1 | -13/+19
|
| - don't traverse entire request list to get the window-size,
|   instead break when current window size becomes greater than
|   configured limit.
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 276 (write behind needs to be optimized.)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=276
* performance/write-behind: remove redundant traversal of write-requests in the wind list in wb_sync. | Raghavendra G | 2009-09-22 | 1 | -3/+5
|
| - no need of getting the total_count of number of requests in the
|   list. Even if there is a single request, we need to sync it.
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 276 (write behind needs to be optimized.)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=276
* performance/write-behind: Aggregate adjacent contiguous write-buffers into single iobuf. | Raghavendra G | 2009-09-22 | 1 | -1/+77
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 276 (write behind needs to be optimized.)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=276
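As a rough picture of what aggregating adjacent contiguous write buffers means, the sketch below copies the leading run of contiguous writes into one flat buffer. It does not use the iobuf API; `wbuf` and `aggregate_contiguous` are hypothetical names, and the stopping rules (first gap in offsets, or destination full) are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pending write: file offset plus payload. */
struct wbuf {
    long long offset;
    const unsigned char *data;
    size_t len;
};

/* Copies the leading run of offset-contiguous buffers into `out`
 * (capacity `cap`). Returns how many buffers were merged and stores
 * the merged byte count in *out_len. */
size_t aggregate_contiguous(const struct wbuf *bufs, size_t n,
                            unsigned char *out, size_t cap,
                            size_t *out_len)
{
    assert(out != NULL && out_len != NULL);
    size_t merged = 0, total = 0;
    for (size_t i = 0; i < n; i++) {
        /* Stop at the first write that does not start where the
         * previous one ended. */
        if (i > 0 && bufs[i].offset !=
                     bufs[i - 1].offset + (long long)bufs[i - 1].len)
            break;
        if (bufs[i].len > cap - total)
            break;                  /* would overflow the destination */
        memcpy(out + total, bufs[i].data, bufs[i].len);
        total += bufs[i].len;
        merged++;
    }
    *out_len = total;
    return merged;
}
```

One larger write replaces several small ones, which trades a memory copy for fewer requests on the wire.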
* performance/write-behind: fine-tune logic of wb_mark_winds | Raghavendra G | 2009-09-22 | 1 | -65/+14
|
| - remove wb_mark_wind_aggregate_size_aware, since wb_mark_wind_all
|   does the same work (with a check for whether the current
|   aggregated data size is greater than the configured limit before
|   calling it). Moreover, wb_mark_wind_aggregate_size_aware called
|   __wb_get_aggregate_size redundantly, thereby reducing performance,
|   since for a large number of small writes, traversing the list of
|   requests takes a significant amount of time.
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 276 (write behind needs to be optimized.)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=276
* libglusterfsclient: Fix build warnings | Shehjar Tikoo | 2009-09-22 | 1 | -18/+20
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 275 (libglusterfsclient: Generic build failure bug for libglusterfsclient and booster)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=275
* booster: Fix build warnings | Shehjar Tikoo | 2009-09-22 | 1 | -14/+15
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 275 (libglusterfsclient: Generic build failure bug for libglusterfsclient and booster)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=275
* storage/posix: Update nr_files after system call succeeds. | Vikas Gorur | 2009-09-22 | 1 | -12/+12
|
| In posix_open(), posix_create(), and posix_close(), update
| stats->nr_files only after the FOP has succeeded.
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 248 (Updating stats in posix is incorrect)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=248
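The ordering this fix describes can be shown with a small sketch: the counter is changed only once the system call has returned success. The struct and function names here are hypothetical and only mirror the stats->nr_files idea; they are not the storage/posix code.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for the translator's statistics. */
struct posix_stats_sketch {
    long nr_files;
};

/* Open a file and count it only if open() succeeded. */
int counted_open(struct posix_stats_sketch *stats, const char *path,
                 int flags)
{
    assert(stats != NULL && path != NULL);
    int fd = open(path, flags);
    if (fd == -1)
        return -1;              /* failure: counter left untouched */
    stats->nr_files++;
    return fd;
}

/* Close a file and uncount it only if close() succeeded. */
int counted_close(struct posix_stats_sketch *stats, int fd)
{
    assert(stats != NULL);
    int ret = close(fd);
    if (ret == 0)
        stats->nr_files--;
    return ret;
}
```

Updating the counter before the call would leave it permanently off by one whenever the call fails.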
* performance/quick-read: refine logic of qr_readv. | Raghavendra G | 2009-09-22 | 1 | -3/+14
|
| - An extra vector was being allocated when the number of bytes being
|   read from cache was equal to the iobuf size.
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 274 (Memory corruption in Apache running on booster)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=274
* performance/quick-read: optimizations to lookup | Raghavendra G | 2009-09-22 | 1 | -24/+49
|
| - qr_lookup not to send request for file-content if the cache is
|   already present during revalidates.
| - flush the cache in qr_lookup_cbk if the cache is not in sync with
|   the file.
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 273 (Code review and optimize quick-read)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=273
* performance/quick-read: make a comment more explicit. | Raghavendra G | 2009-09-22 | 1 | -2/+2
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 273 (Code review and optimize quick-read)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=273
* performance/quick-read: checking for qr_file in inode-context and creating if not present should be atomic. | Raghavendra G | 2009-09-22 | 1 | -30/+45
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 273 (Code review and optimize quick-read)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=273
* performance/quick-read: refine the logic in qr_lookup. | Raghavendra G | 2009-09-22 | 1 | -40/+54
|
| - a new size has to be set in xattr_req only if
|   (quick-read is configured with a maximum file size limit
|    && ((xattr_req does not have a request key for getting content)
|        || (the size requested in xattr_req is not equal to
|            configured size in quick-read)))
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 273 (Code review and optimize quick-read)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=273
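The nested condition above is easier to check when written as a predicate. The sketch below is a direct transcription of that condition; the function and parameter names are hypothetical and do not come from quick-read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when a new content size has to be set in xattr_req. */
bool must_set_content_size(bool has_max_file_size,
                           uint64_t configured_size,
                           bool req_has_content_key,
                           uint64_t requested_size)
{
    return has_max_file_size &&
           (!req_has_content_key || requested_size != configured_size);
}
```

In particular, a request that already carries the configured size is left alone, and nothing is set when no size limit is configured.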
* protocol/client: access glusterfs context from the ctx member of xlator object | Raghavendra G | 2009-09-22 | 2 | -1/+8
|
| - A global context pointer cannot be used with libglusterfsclient,
|   since there can be many contexts in a single process.
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 271 (applications using booster protocol/client crash in client_setvolume_cbk.)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=271
* performance/write-behind: check for the presence of context only in fds not opened on directories. | Raghavendra G | 2009-09-22 | 1 | -6/+12
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 260 (ls on booster VMP results in error: "File descriptor in bad state")
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=260
* client-protocol: fix race-condition encountered while accessing fdctx | Raghavendra G | 2009-09-22 | 1 | -47/+218
|
| - In protocol/client, fdctx is accessed by two sets of procedures:
|   protocol_client_mark_fd_bad falls in one set, whereas the other
|   set consists of all fops which receive fd as an argument. The way
|   these fdctxs are obtained differs between the two sets. In the
|   former set, fdctx is accessed through conf->saved_fds, which is a
|   list of fdctxs of fds representing opened/created files. In the
|   latter set, fdctxs are obtained directly from fd through
|   fd_ctx_get().
|
|   Now there can be race conditions between two threads, each
|   executing a procedure from a different set. As an example,
|   consider the following scenario: a flush operation has timed out
|   and the polling thread is executing protocol_client_mark_fd_bad
|   while the fuse thread is executing client_release. This can
|   happen because, immediately after a reply for flush is written to
|   fuse, a release on the same fd can be sent to glusterfs while the
|   polling thread might still be doing cleanup. Consider the
|   following sequence of events:
|
|   1. fuse thread does fd_ctx_get (fd).
|   2. polling thread gets the same fdctx, but through
|      conf->saved_fds.
|   3. Now both threads go ahead and do list_del (fdctx) and
|      eventually free fdctx.
|
|   In other situations the same sequence of events might occur, and
|   threads executing fops other than flush in the second set might
|   access an fdctx already freed in protocol_client_mark_fd_bad.
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 127 (race-condition in accessing fdctx in protocol/client)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=127
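The commit message describes the race but not the shape of the fix. One common remedy for a double list_del/free of this kind, shown here purely as an assumption, is to make "look up and detach" a single step under one lock, so that only one thread can ever end up owning the context. All names below are hypothetical, and a single pointer stands in for both conf->saved_fds and the per-fd context slot.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-fd context. */
struct fdctx_sketch {
    int remote_fd;
};

/* Hypothetical client state guarding the context with one lock. */
struct conf_sketch {
    pthread_mutex_t lock;
    struct fdctx_sketch *saved;
};

/* Atomically detach the context. Exactly one caller gets the pointer
 * and becomes responsible for freeing it; every later caller gets NULL
 * and must not touch the context. */
struct fdctx_sketch *fdctx_claim(struct conf_sketch *conf)
{
    assert(conf != NULL);
    struct fdctx_sketch *ctx;

    pthread_mutex_lock(&conf->lock);
    ctx = conf->saved;
    conf->saved = NULL;
    pthread_mutex_unlock(&conf->lock);
    return ctx;
}
```

With this shape, the release path and the mark-bad path can both call `fdctx_claim`, and whichever loses the race sees NULL instead of a pointer that is about to be freed.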
* performance/quick-read: access glusterfs_ctx from xlator instead of using glusterfs_get_ctx | Raghavendra G | 2009-09-17 | 1 | -1/+1
|
| - since glusterfs_get_ctx gets the global context pointer, there can
|   be problems in a multithreaded application running on
|   libglusterfsclient doing multiple glusterfs_inits. Hence use
|   context specific to the current xlator tree stored in each xlator
|   object.
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 240 (segmentation fault in qr_readv)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=240
* Add iot_fxattrop to io-threads | Pavan Sondur | 2009-09-17 | 1 | -0/+1
|
| It was already implemented but not set to .fxattrop
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 267 (Add fxattrop to iothreads)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=267
* Dumpop inodectx added | Vijay Bellur | 2009-09-16 | 5 | -13/+88
|
| Added dumpop inodectx.
| Support for dumpop inodectx added in dht, locks and client-protocol.
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 213 (Support for process state dump)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=213
* Changed prototype for inode_table_dump() and inode_dump(). | Vijay Bellur | 2009-09-16 | 5 | -20/+76
|
| Changed prototype for inode_table_dump() and inode_dump()
| Added support for dumpop inode in mount/fuse and protocol/server
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 213 (Support for process state dump)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=213
* protocol/client: Support for dumpop priv. | Vijay Bellur | 2009-09-16 | 1 | -0/+56
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 213 (Support for process state dump)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=213
* mount/fuse: Support for dumpop priv. | Vijay Bellur | 2009-09-16 | 1 | -0/+47
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 213 (Support for process state dump)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=213
* cluster/dht: Support for dumpop priv. | Vijay Bellur | 2009-09-16 | 1 | -0/+134
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 213 (Support for process state dump)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=213
* libglusterfs: Acquire lock before accessing fdtable contents during statedump. | Vijay Bellur | 2009-09-16 | 1 | -8/+16
|
| Hold lock while dumping fdtable.
| Dump only inode ino instead of the complete inode.
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 213 (Support for process state dump)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=213
* glusterfsd: Removing conditional compilation for SIGUSR1 handler. | Vijay Bellur | 2009-09-16 | 1 | -2/+0
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 213 (Support for process state dump)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=213
* io-cache: fix table->max_pri to 1 as the lowest priority | Anand Avati | 2009-09-16 | 1 | -1/+2
|
| Patch http://patches.gluster.com/patch/1319/ breaks when no
| priority is mentioned in the config. The patch makes
| ioc_get_priority() return 1 as the value when no priority is given,
| but ioc_get_priority_list() was still returning 0 as the max_pri
| (maximum priority), which would result in lru list heads not
| getting initialized.
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 261 (support for disabling caching of certain files)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=261
* cluster/stripe: when dbench is run, client crashes because in stripe.c priv is dereferenced without initialising. | Vinayak Hegde | 2009-09-16 | 1 | -1/+4
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 266 (In stripe client crashes after some time when disk space is full)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=266
* booster: use __REDIRECT macro to prevent creat being renamed to creat64. | Raghavendra G | 2009-09-15 | 1 | -1/+4
|
| - nm on libglusterfs-booster.so shows only creat64 defined but not
|   creat. This behavior is observed due to the following reasons:
|   1. Booster is compiled with _FILE_OFFSET_BITS=64.
|   2. fcntl.h, when included with _FILE_OFFSET_BITS=64 defined,
|      renames all occurrences of creat to creat64 in the source code
|      from the point of #include <fcntl.h>.
|
|   fcntl.h should be included since booster.c uses many of the
|   macros defined in that header, and glusterfs (booster in turn) has
|   to be compiled with _FILE_OFFSET_BITS=64 since glusterfs uses
|   datatypes (off_t, stat etc) whose sizes vary depending on whether
|   this macro is defined or not. Basically, this macro should be
|   defined to provide portability across 32 and 64 bit
|   architectures.
|
|   The correct fix is to make glusterfs use datatypes big enough to
|   hold 64 bit variants as well as 32 bit variants (like int64_t
|   replacing off_t) and not to define _FILE_OFFSET_BITS=64 at all.
|
|   As a temporary workaround:
|   1. we can implement creat functionality in a function with a
|      different name, say booster_false_creat.
|   2. rename this function to creat using the __REDIRECT macro. Since
|      this renaming happens after the renaming of creat to creat64
|      (from the first __REDIRECT macro in fcntl.h), we will end up
|      with the creat symbol being defined in libglusterfs-booster.so.
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 264 (creat is not resolved properly to the symbol defined in booster)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=264
* libglusterfsclient: NULL terminate the vmp entry during vmp_entry_init. | Raghavendra G | 2009-09-15 | 1 | -2/+5
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 263 (files are not resolved to glusterfs when vmp is not terminated with a '/'.)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=263
* booster: use appropriate conversion specifier during logging in close. | Raghavendra G | 2009-09-15 | 1 | -1/+1
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 262 (crash in booster close due to invalid conversion specifier during logging.)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=262
* 2.0.6 patch for io-cache pattern-matched non-caching | Stephan von Krawczynski | 2009-09-15 | 1 | -5/+23
|
| Hello all,
|
| here is a small feature patch. Its intention is to give the user
| more control over the files performance/io-cache really caches. If
| the user knows exactly which files should be cached and which
| shouldn't, there is currently no way to tell glusterfs _not_ to
| cache certain patterns. This patch allows you to disable caching by
| setting the priority of a pattern to "0". If you do not give any
| priority option it works just like before and caches everything.
| Honestly I am not totally sure that disabling caching works the way
| we did it, please comment.
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 261 (support for disabling caching of certain files)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=261
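The behaviour described in this message, a pattern priority of 0 meaning "do not cache", can be sketched with POSIX fnmatch(). This is not the io-cache code: `priority_rule`, `path_priority`, and `should_cache` are hypothetical names, "first matching pattern wins" is an assumption, and the default of 1 for unmatched paths follows the follow-up fix in this log that makes ioc_get_priority() return 1 when no priority is given.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical parsed form of one "pattern:priority" entry. */
struct priority_rule {
    const char *pattern;
    unsigned priority;
};

/* Priority of `path`: the first matching rule wins (assumption);
 * paths that match no rule get the default priority 1. */
unsigned path_priority(const struct priority_rule *rules, size_t n,
                       const char *path)
{
    assert(path != NULL);
    for (size_t i = 0; i < n; i++) {
        if (fnmatch(rules[i].pattern, path, 0) == 0)
            return rules[i].priority;
    }
    return 1;
}

/* Priority 0 disables caching for the matching files. */
bool should_cache(const struct priority_rule *rules, size_t n,
                  const char *path)
{
    return path_priority(rules, n, path) != 0;
}
```

With an empty rule list every path gets priority 1, which matches the "works just like before and caches everything" behaviour described above.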
* storage/posix: transform inode number in stat structure | Raghavendra G | 2009-09-15 | 1 | -133/+446
|
| - when export directory is configured to span across multiple
|   mountpoints, the inode number has to be transformed in order to
|   make it unique.
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 254 (storage/posix has to do inode number transformation wherever it unwinds with a stat structure)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=254
* extras: Add LD_PRELOAD test tool | Shehjar Tikoo | 2009-09-15 | 5 | -0/+1089
|
| This tool allows us to check the sanity of the LD_PRELOAD mechanism
| so that we can be sure that an application's syscalls will be
| redirected into booster when that library is LD_PRELOADed.
|
| In case of failed syscalls, this tool should be run first to see if
| the calls are redirected as required.
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 251 (Improve booster debugging through ld-preload testing tool)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=251
* transport/ib-verbs: initialize fini member of new-transports created during accepting client connections. | Raghavendra G | 2009-09-14 | 1 | -0/+2
|
| - This bug used to cause a memory leak of
|   2 * sizeof(ib_verbs_private_t) for each new client connection.
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 259 (Memory leak on server side when there are large number of disconnections from clients)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=259
* libglusterfsclient: Wait for time ample enough for all the children of distribute to initialize before sending lookup on '/'. | Raghavendra G | 2009-09-13 | 1 | -0/+11
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 255 (libglusterfsclient should wait till all the children of distribute are initialized before sending first lookup on '/')
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=255
* protocol/server: server_stub_resume should check for failure of lookup when oldloc.parent is NULL. | Raghavendra G | 2009-09-13 | 1 | -1/+2
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 215 (crash on ib-verbs in 2.0.6-rc4)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=215
* booster: Fix fd_t leak in pread64 | Shehjar Tikoo | 2009-09-09 | 1 | -0/+1
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 174 (booster: fd_ts, they are a leakin)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=174
* libglusterfsclient: Fix incorrect NULL check for fd | Shehjar Tikoo | 2009-09-09 | 1 | -1/+1
|
| We should check fdctx instead.
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 253 (Global bug for libglusterfsclient NULL checks and CALLOC handling fixes)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=253
* libglusterfsclient: Handle CALLOC failure in libgf_client_lookup | Shehjar Tikoo | 2009-09-09 | 1 | -0/+7
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 253 (Global bug for libglusterfsclient NULL checks and CALLOC handling fixes)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=253
* libglusterfsclient: Handle CALLOC failure in libgf_init_vmpentry | Shehjar Tikoo | 2009-09-09 | 1 | -0/+18
|
| Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
| BUG: 253 (Global bug for libglusterfsclient NULL checks and CALLOC handling fixes)
| URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=253