summaryrefslogtreecommitdiffstats
path: root/xlators/cluster/afr/src/afr.h
Commit message (Collapse)AuthorAgeFilesLines
* Revert "cluster/afr: eager locking of FD writes"v3.2.4qa3Vijay Bellur2011-09-221-11/+1
| | | | | | | | | This reverts commit 81456ec2dfb312ae60c5c4e6f960a3cbf8aaaa4c. Change-Id: Id03335117f5137f5d09781850bf4fba6eca0f73d Reviewed-on: http://review.gluster.com/492 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: eager locking of FD writesAnand Avati2011-09-081-1/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is a change in the way write transactions hold a lock which optimizes the case of sequential writes from a single writer. Lock phase of a transaction has two sub-phases. First is an attempt to acquire locks in parallel by broadcasting non-blocking lock requests. If lock aquistion fails on any server, then the held locks are unlocked and revert to a blocking locked mode sequentially on one server after another. The change in this patch is to make the initial broadcasting lock request attempt to acquire lock on the entire file. If this fails, we revert back to the sequential "regional" blocking lock as before. In the case where such an "eager" lock is granted in the non-blocking phase, it gives rise to an opportunity for optimization. i.e, if the next write transaction on the same FD arrives before the unlock phase of the first transaction, it "takes over" the full file lock. Similarly if yet another transaction arrives before the unlock phase of the "optimized" transaction, that in turn "takes over" the lock as well. The actual unlock now happens at the end of the last "optimzed" transaction. Any operation which arrives before the unlock phase of the previous transaction is a potential candidate to become an "optimized" transaction. In cases where the previous transaction had aquired lock as a "regional" blocking lock, and the next transaction comes in before its unlock phase, then it would not be an "optimized" transaction. Implied assumption ------------------ Since two or more transactions can now operate within the same large lock, there is a possibility that overlapping transactions can arrive at oppoosite orders on the servers. However in the larger picture this is not possible as write-behind already ensures that no two overlapping writes on an inode are in transit at the same time. Overlapping writes across clients are not a problem as they compete at locks anyways. Theoretical benefits and potential harms ---------------------------------------- In case of a single writer: The benefits are large for sequential writes. In the best case the entire file write can happen with just one lock and unlock per server, provided writes are coming in fast enough and getting pipelined by write-behind soon enough (which is usually the case). If the writes are not coming in fast enough, then the optimization "kicks in" for only those subsets of writes which are close enough to get "piggybacked". For random writes the benefits are the same as well. In any case the overall performance is better than or equal to the performance without this optimization for a single writer. In case of multiple writers: When multiple writers are not writing concurrently, there is no negative performance impact. When multiple writers are writing concurrently to the same region, there is no negative impact either, as they were previously getting arbitrated at the locks translator too. In the case of multiple writers writing to different regions concurrently, there will be an increased number of "failovers" from failed parallel non-blocking to sequential blocking regional locks. This above "worst case" has a simple workaround that as soon as we detect > 1 open-fd-count in lookup xattr, we can disable this optimization on those fds. Beneficial side-effects ----------------------- There is another similar optimization in AFR for changelogs which goes by the name of "changelog-piggybacking". That works in a similar way where pending flags get 'taken over' or 'piggybacked' by the next transaction if its 'pre-op' phase kicks in before the 'post-op' phase of the previous transaction. It has been observed that this changelog-piggybacking optimization gives a saving of about ~55% savings of xattr calls hitting the wire, measured across various types of network interfaces. The side effect of this eager-lock optimization is that it gives an almost 100% saving of xattr calls by making the optimistic-changelog work much more efficiently as it gives a wider overlap of the xattr phases of two consecutive transactions. Change-Id: I41c02eb3b64c14c68ef66a344610ec3f024cd59d BUG: 3409 Reviewed-on: http://review.gluster.com/243 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* afr/stripe: collect pathinfo xattr from all childsVenky Shankar2011-08-241-0/+2
| | | | | | | | Change-Id: Iec8b609e66ef21f4fdd6ee2ff3060f0b71d47ca0 BUG: 3046 Reviewed-on: http://review.gluster.com/237 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com>
* Change Copyright current yearPranith Kumar K2011-08-101-1/+1
| | | | | | | | Change-Id: Id1f1a91cf15d933d5621a0073ddaebe02df0f159 BUG: 3348 Reviewed-on: http://review.gluster.com/198 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* LICENSE: s/GNU Affero General Public/GNU General Public/Pranith Kumar K2011-08-061-3/+3
| | | | | | | | Change-Id: Ibf5f45431d7a55b70d7304649af652d6f25bb688 BUG: 3348 Reviewed-on: http://review.gluster.com/183 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* loc_t: add 'gfid' and 'pargfid' fieldsAmar Tumballi2011-06-141-0/+3
| | | | | | | | | | | | | | | | | these fields are used mainly in case of selfheal path, where 'inode->gfid'||'parent->gfid' is not yet set. These fields in 'loc' will have lower precedence than 'inode->gfid' in client protocol. also contains 'Pranith <pranithk@gluster.com>'s patch to set proper loc->gfid during afr selfheal Signed-off-by: Amar Tumballi <amar@gluster.com> Signed-off-by: Anand Avati <avati@gluster.com> BUG: 2346 (Log message enhancements in GlusterFS - phase 1) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2346
* cluster/afr: Send the first child up/down after all its children notifyPranith Kumar K2011-05-311-0/+1
| | | | | | | | Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Signed-off-by: Anand Avati <avati@gluster.com> BUG: 2870 (Inconsistent xattr values when creating bricks) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2870
* declare favorite child as int instead of unsigned intRaghavendra Bhat2011-04-131-1/+1
| | | | | | | | | | | | | | In afr_private_t structure favorite child is declared as unsigned int. In init function of afr we set favorite child to -1, if that option is not found in volfile. But favorite child value will be set to a huge value instead of -1 since it is an unsigned int and in statedump file favorite child value is displayed as a huge value instead of -1. Signed-off-by: Raghavendra Bhat <raghavendrabhat@gluster.com> Signed-off-by: Anand Avati <avati@gluster.com> BUG: 2668 ([glusterfs-3.2.9qa7]: createbench error) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2668
* cluster/afr: log enhancements - part 1Amar Tumballi2011-04-061-1/+3
| | | | | | | | Signed-off-by: Amar Tumballi <amar@gluster.com> Signed-off-by: Vijay Bellur <vijay@shell.gluster.com> BUG: 2346 () URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2346
* cluster/afr: whitespace cleanupAmar Tumballi2011-03-291-346/+346
| | | | | | | | Signed-off-by: Amar Tumballi <amar@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 2346 (Log message enhancements in GlusterFS - phase 1) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2346
* afr-entry-self-heal: fixes to detected renames (gfid based)Anand Avati2011-03-141-0/+1
| | | | | | | | | | | | | - perform expunge first (before impunge) to be able to delete renamed away files - perform readdirp instead of readdir to get gfid along with entry names - if gfid mismatch is found, expunge the entry Signed-off-by: Anand Avati <avati@gluster.com> Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 2500 (Self Healing not working) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2500
* cluster/afr: Perform self-heal as rootPranith K2011-02-081-0/+2
| | | | | | | | Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 2370 (cluster/afr: Perform self-heal as root) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2370
* adding libxlator, to ensure proper client side aggregation of marks by ↵Kaushik BV2011-01-271-0/+10
| | | | | | | | | | | clustering translators Signed-off-by: Kaushik BV <kaushikbv@gluster.com> Signed-off-by: Csaba Henk <csaba@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 2310 (georeplication) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2310
* replicate: optimistic changelogAnand Avati2010-11-091-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The standard way of maintaining changelog in replicate has been to write out pending flags and to unset the pending flag post the actual operation. This new optimization kicks in only when all subvolumes are up. The optimization is that, during pre-op, no changelog is written for METADATA and ENTRY/RENAME operations. If during the operation nothing failed, no changelog is updated in post-op either. If however, something does fail during an operation, then, pending flags get written during post op pointing only towards the failed nodes. DATA transactions continue to work the way they are. If one subvolume is down, pending flags are written in pre-op changelog itself as before. The impact of this optimization is only in the case when both servers die or the client dies while the 'FOP' stage of the transaction is in progress. By nature of METADATA and ENTRY operations, detecting a mismatch later is not dependent on the presence of changelog. Changelog only determines the direction in which self-heal happens for these types of transactions. For the direction too this optimization does not have a major impact because in the cases of failure (both servers dieing or client dieing) the final state (direction of self-heal) would be arbitrary anyways as the syscall wouldn't have completed. Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 2068 (performance enhancements) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2068
* Copyright changesVijay Bellur2010-10-111-1/+1
| | | | | | | | Signed-off-by: Vijay Bellur <vijay@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 971 (dynamic volume management) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=971
* Change GNU GPL to GNU AGPLPranith K2010-10-041-3/+3
| | | | | | | | Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 1388 () URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1388
* rmdir: introduce extra flags parameter in FOP prototypeAnand V. Avati2010-10-021-0/+1
| | | | | | | | Signed-off-by: Anand V. Avati <avati@amp.gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 971 (dynamic volume management) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=971
* Changes to replace flock with gf_flock across GlusterFS.Pavan Sondur2010-10-011-3/+3
| | | | | | | | Signed-off-by: Pavan Vilas Sondur <pavan@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 865 (Add locks recovery support in GlusterFS) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=865
* cluster/afr: Recover locks on child_up from source to sink.Pavan Sondur2010-10-011-0/+7
| | | | | | | | Signed-off-by: Pavan Vilas Sondur <pavan@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 865 (Add locks recovery support in GlusterFS) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=865
* protocol/client: cluster/afr: Support lock recovery and self heal.Pavan Sondur2010-09-301-0/+17
| | | | | | | | Signed-off-by: Pavan Vilas Sondur <pavan@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 865 (Add locks recovery support in GlusterFS) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=865
* replicate: remove checks which prevented self-heal when open fds were presentAnand Avati2010-09-291-1/+0
| | | | | | | | | | | | | this check is not needed anymore since the introduction of changelog piggybacking as the optimization technique instead of first-write-to-flush technique some of the self-healing issues with NFS mounts should be resolved Signed-off-by: Anand V. Avati <avati@amp.gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 1235 (Bug for all pump/migrate commits) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1235
* replicate: keep read_child in inode ctx as up-to-date as possibleAnand Avati2010-09-291-0/+1
| | | | | | | | | | | | | In every transaction check if the currently set read child in the inode context failed in the fop and set it to another subvol on which the latest fop has passed. This will prevent read fops landing on subvols which have witnessed a failure. Signed-off-by: Anand V. Avati <avati@amp.gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 1172 (ls -lh on NFS mount of 2-mirror replicate gives incorrect file size) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1172
* replicate: replace first-write-to-flush optimizationAnand V. Avati2010-09-291-15/+9
| | | | | | | | | | | | use a changelog piggybacking optimization instead of first-write-to-flush optimization and do other cleanups (removal of post-post-op hook etc.) Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com> Signed-off-by: Anand V. Avati <avati@amp.gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 1235 (Bug for all pump/migrate commits) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1235
* whitespace cleanupAnand V. Avati2010-09-291-15/+15
| | | | | | | | | Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com> Signed-off-by: Anand V. Avati <avati@amp.gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 1235 (Bug for all pump/migrate commits) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1235
* cluster/pump: introduce another flag to enable pump functionalityAmar Tumballi2010-09-161-0/+1
| | | | | | | | | | | | * by default pump will act as a pass through xlator, only when replace-brick start command is issued, it will set the flag, and then pump features (ie, afr) will come in to picture. Signed-off-by: Amar Tumballi <amar@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 1590 (Stack overflow during self-heal) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1590
* cluster/afr: Various self heal fixes wrt gfid.Pavan Sondur2010-09-071-0/+3
| | | | | | | | Signed-off-by: Pavan Vilas Sondur <pavan@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 971 (dynamic volume management) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=971
* gfid: changes in symlink() prototype to have params dictionary with uuid in itAnand Avati2010-09-041-0/+1
| | | | | | | | | Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com> Signed-off-by: Anand V. Avati <avati@amp.gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 971 (dynamic volume management) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=971
* gfid: changes in mkdir() prototype to have params dictionary with uuid in itAnand Avati2010-09-041-0/+1
| | | | | | | | | Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com> Signed-off-by: Anand V. Avati <avati@amp.gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 971 (dynamic volume management) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=971
* gfid: changes in mknod() prototype to have params dictionary with uuid in itAnand Avati2010-09-041-0/+1
| | | | | | | | | Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com> Signed-off-by: Anand V. Avati <avati@amp.gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 971 (dynamic volume management) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=971
* gfid: change in create() prototype to have params dictionary with uuid in itAnand Avati2010-09-041-0/+1
| | | | | | | | | Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com> Signed-off-by: Anand V. Avati <avati@amp.gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 971 (dynamic volume management) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=971
* cluster/afr: Use 2 phase locking for transactions and self heal.Pavan Sondur2010-08-221-9/+113
| | | | | | | | Signed-off-by: Pavan Vilas Sondur <pavan@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 960 () URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=960
* cluster/afr: Return correct flock structures correctly in lk fopsPranith Kumar K2010-08-171-1/+2
| | | | | | | | Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 1042 (Use correct flock structures in lk fops) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1042
* cluster/pump: Save path (/) when abort is received in pump.Pavan Sondur2010-08-161-1/+0
| | | | | | | | Signed-off-by: Pavan Vilas Sondur <pavan@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 1235 (Bug for all pump/migrate commits) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1235
* cluster/pump: Dynamically control sink connect and disconnect.Pavan Sondur2010-08-121-0/+2
| | | | | | | | Signed-off-by: Pavan Vilas Sondur <pavan@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 1303 (Cleanup replace-brick state info) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1303
* add pump xlator and changes for replace-brickPavan Sondur2010-08-061-0/+22
| | | | | | | | Signed-off-by: Pavan Vilas Sondur <pavan@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 1235 (Bug for all pump/migrate commits) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1235
* removed 'fop->checksum' from codebase as its not required anymoreAmar Tumballi2010-07-061-5/+0
| | | | | | | | Signed-off-by: Amar Tumballi <amar@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 734 (keep only the working/usable code in build tree to focus more on development) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=734
* NULL dereference fixes in code base after running with 'clang'Amar Tumballi2010-07-021-15/+17
| | | | | | | | | | | | * 212 logical (NULL deref/divide by zero) errors reduced to 28 (27 of them in contrib/ and lex part of codebase, 1 is invalid) * 11 API errors reduced to 0 Signed-off-by: Amar Tumballi <amar@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 966 (NULL check for avoiding NULL dereferencing of pointers..) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=966
* Unset split-brain flags in afr_self_heal_completion_cbk if self heal ↵Simone Gotti2010-05-091-1/+1
| | | | | | | | | | completes successfully. Signed-off-by: Simone Gotti <simone.gotti@gmail.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 884 (I/O errors after fixed split brain and successfully completed self heal) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=884
* frame's 'op', 'type' restructuredAmar Tumballi2010-05-031-1/+0
| | | | | | | | Signed-off-by: Amar Tumballi <amar@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 875 (Implement a new protocol to provide proper backward/forward compatibility) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=875
* Memory accounting changesVijay Bellur2010-04-231-22/+30
| | | | | | | | | | | Memory accounting Changes. Thanks to Vinayak Hegde and Csaba Henk for their contributions. Signed-off-by: Vijay Bellur <vijay@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 329 (Replacing memory allocation functions with mem-type functions) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=329
* cluster/afr: Cleanup fd ctx in releasedir cbkVijay Bellur2010-04-081-0/+3
| | | | | | | | Signed-off-by: Vijay Bellur <vijay@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 805 (memory leak in afr) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=805
* iatt: changes across the codebaseAnand V. Avati2010-03-161-50/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - libglusterfs -- call-stub -- inode -- protocol - libglusterfsclient - cluster/replicate - cluster/{dht,nufa,switch} - cluster/unify - cluster/HA - cluster/map - cluster/stripe - debug/error-gen - debug/trace - debug/io-stats - encryption/rot-13 - features/filter - features/locks - features/path-converter - features/quota - features/trash - mount/fuse - performance/io-threads - performance/io-cache - performance/quick-read - performance/read-ahead - performance/stat-prefetch - performance/symlink-cache - performance/write-behind - protocol/client - protocol/server - storage-posix Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 361 (GlusterFS 3.0 should work on Mac OS/X) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=361
* cluster/afr: Failover readdir calls.Vikas Gorur2010-03-041-0/+7
| | | | | | | | | | | | | | | | This patch makes the replicate readdir call fail over to the next subvolume if the first call fails. It takes care to ensure that entries are not duplicated. The failover behavior of readdir only comes into effect if the option 'strict-readdir' is on. Signed-off-by: Vikas Gorur <vikas@dev.gluster.com> Signed-off-by: root <root@client02.(none)> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 453 (afr_readdir does not fail over) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=453
* afr: fix memory leaksAnand Avati2009-12-041-0/+3
| | | | | | | | Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170
* storage/posix: Added janitor thread.Vikas Gorur2009-12-021-1/+0
| | | | | | | | | | | | | | | | | | The janitor thread deletes all files and directories in the "/" GF_REPLICATE_TRASH_DIR directory. This directory is used by replicate self-heal to dump files and directories it deletes. This is needed because letting replicate walk the directory tree and delete a directory and all its children is too racy. Instead, replicate self-heal only does an atomic rename(), and the janitor thread takes care of actually deleting them. Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 227 (replicate selfheal does not remove directory with contents in it) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=227
* cluster/afr: Set the self-heal "source" as read subvolume even when not ↵Vikas Gorur2009-12-011-0/+1
| | | | | | | | | | | | | | doing self-heal. This patch sets the read-subvolume equal to the self-heal "source" even if we're not doing self-heal (because some one else is already doing it). Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 320 (Improve self-heal performance) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=320
* cluster/afr: Preserve generation number along with inode in lookup and ↵Vikas Gorur2009-11-301-0/+6
| | | | | | | | | | | | | creation fops. This fixes fuse_create_cbk conflict warnings and random errors while running dbench (typically open handle failure with ENOENT). Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 315 (generation number support) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=315
* cluster/afr: Do self-heal on unopened fds.Vikas Gorur2009-11-251-0/+4
| | | | | | | | | | | | | | This patch completes the previous patch for self-heal of open fds in replicate. If an fd was never opened on a subvolume, we remember that and do the open after we've done self-heal on that fd. Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170
* cluster/afr: Refactored the self-heal interface.Vikas Gorur2009-11-241-17/+31
| | | | | | | | | | Cleaned up the self-heal interface to callers. Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170
* cluster/afr: Do self-heal on reopened fds.Vikas Gorur2009-11-241-1/+29
| | | | | | | | | | | | | | | | | | | | | | | This patch brings in partial support for self-heal of open fds. The precondition is that the fd should have been opened successfully during the initial open() (or create()), and we assume that protocol/client has successfully reopened the fd when the subvolume comes back up. It works by doing an "up/down flush" (a dummy flush transaction to do post-op wherever necessary) and then triggering data self-heal on the file in the post-post-op hook of the dummy flush transaction. This ensures that any writes that come in during self-heal will wait until self-heal completes. The up/down flush is also done when a subvolume goes down, so that post-op is done on all subvolumes where pre-op was done. Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170