summaryrefslogtreecommitdiffstats
path: root/xlators/cluster/afr/src/afr-self-heal-common.c
Commit message (Collapse)AuthorAgeFilesLines
* cluster/afr: Set lk-owner to pid when fuse does not supply it.v3.0.5rc7Pavan Sondur2010-06-291-0/+2
| | | | | | | | | | | | | | | | Use the frame->root address as lk-owner when FUSE does not supply lk_owner. Raghu, I unit tested this patch with dbench and self heal tests. Did not observe lk-owner=0 in any server logs. Can you verify this patch with the other tests you had run today? Signed-off-by: Pavan Vilas Sondur <pavan@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 1032 (Set lock-owner with pid when fuse does not supply value) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1032
* Unset split-brain flags in afr_self_heal_completion_cbk if self heal ↵Simone Gotti2010-06-011-1/+3
| | | | | | | | | | | | | | | completes successfully. Signed-off-by: Simone Gotti <simone.gotti@gmail.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 884 (I/O errors after fixed split brain and successfully completed self heal) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=884 Signed-off-by: Pavan Vilas Sondur <pavan@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 884 (I/O errors after fixed split brain and successfully completed self heal) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=884
* cluster/afr: Pick a source for metadata self-heal even if all nodes are ↵Vikas Gorur2010-01-141-0/+28
| | | | | | | | | | | | | | | innocent. If metadata changelog has been disabled, all subvolumes will be innocent. In that case, simply pick the subvolume on which the file has the lowest uid as the source and sync other subvolumes to it. Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 451 (metadata self-heal does not a pick a source if mode/times have been changed at the backend) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=451
* cluster/afr: Use dict_ref instead of dict_copy_with_ref.Vikas Gorur2010-01-081-2/+2
| | | | | | | | Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 509 (Crash in afr_local_cleanup ()) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=509
* cluster/afr: Sync the parent directory's mtime during missing entries self-heal.Vikas Gorur2009-12-071-6/+26
| | | | | | | | Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 137 (Parent directory mtime not reset after a create in self-heal) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=137
* afr: fix fd ref leak in self-healAnand Avati2009-12-061-1/+3
| | | | | | | | | | sh->healing_fd should be ref'ed only when healing_fd_opened is not set Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170
* cluster/afr: Add log messages when setattr fails in self-heal.Vikas Gorur2009-12-021-0/+6
| | | | | | | | Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 146 (Add setattr FOP) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=146
* cluster/afr: Fix conditional typo.Vikas Gorur2009-12-021-1/+2
| | | | | | | | Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170
* afr: remove memcpy of @local contents in afr_local_copyAnand Avati2009-12-011-8/+23
| | | | | | | | | | | | copy out members which are needed. memcpy of full local causes a copy of pointers without references and results in various corruption errors Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170
* cluster/afr: Don't do memcpy of afr_local_t in afr_local_copy.Vikas Gorur2009-12-011-11/+5
| | | | | | | | | | | | For the background self-heal frame's local_t, copy only required members --- not a wholesale memcpy. The memcpy lead to pointers being copied and then double free'd. Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 320 (Improve self-heal performance) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=320
* cluster/afr: Set the self-heal "source" as read subvolume even when not ↵Vikas Gorur2009-12-011-3/+12
| | | | | | | | | | | | | | doing self-heal. This patch sets the read-subvolume equal to the self-heal "source" even if we're not doing self-heal (because some one else is already doing it). Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 320 (Improve self-heal performance) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=320
* cluster/afr: Refactored lookup_cbk and introduce precedence of errors.Vikas Gorur2009-11-301-0/+4
| | | | | | | | | | | | | Error handling in afr_lookup_cbk was faulty because it did not give priority to errors such as ESTALE over ENOENT, and ENOENT over other errors. This patch fixes that, and also breaks up afr_lookup_cbk into multiple logical functions. Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 205 ([ glusterfs 2.0.6rc4 ] - Hard disk failure not handled correctly) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=205
* cluster/afr: Refactored the self-heal interface.Vikas Gorur2009-11-241-84/+48
| | | | | | | | | | Cleaned up the self-heal interface to callers. Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170
* cluster/afr: Do self-heal on reopened fds.Vikas Gorur2009-11-241-17/+32
| | | | | | | | | | | | | | | | | | | | | | | This patch brings in partial support for self-heal of open fds. The precondition is that the fd should have been opened successfully during the initial open() (or create()), and we assume that protocol/client has successfully reopened the fd when the subvolume comes back up. It works by doing an "up/down flush" (a dummy flush transaction to do post-op wherever necessary) and then triggering data self-heal on the file in the post-post-op hook of the dummy flush transaction. This ensures that any writes that come in during self-heal will wait until self-heal completes. The up/down flush is also done when a subvolume goes down, so that post-op is done on all subvolumes where pre-op was done. Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170
* cluster/afr: Fix inode context bitmasks.Vikas Gorur2009-11-241-6/+2
| | | | | | | | | | | Set opendir_done and split_brain flags correctly in the inode context. Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 249 (Self heal of a file that does not exist on the first subvolume) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=249
* cluster/afr: Ensure directory contents are in sync during opendir.Vikas Gorur2009-11-131-3/+10
| | | | | | | | | | | | | | | | | | | | | | The problem: If some files on the first subvolume disappeared without leaving a trace in the entry changelog (this can happen, for example, when an fsck has deleted files or when a hard drive is replaced), those files would never be self-healed even though they would be present on the second subvolume. This is because readdir is sent only to the first subvolume, and since the files don't appear in the directory listing, no lookup would ever be sent on them. This patch fixes this problem by doing a readdir on all the subvolumes during the first opendir on a directory inode. If a discrepancy in the contents is detected, entry self-heal in a special "force merge" mode is triggered on that directory. Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 249 (Self heal of a file that does not exist on the first subvolume) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=249
* cluster/afr: NFS-friendly logic changesVikas Gorur2009-10-271-2/+2
| | | | | | | Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 145 (NFSv3 related additions to 2.1 task list) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=145
* cluster/afr: Do self-heal in the background.Vikas Gorur2009-10-261-11/+126
| | | | | | | | | | | | | | This patch introduces a new option "background-self-heal-count", with a default value of 16. This means that upto {background-self-heal-count} number of files/directories will be healed in the background at any given time. If such number of self-heals are already in progress, further self-heals take place in the foreground. Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 320 (Improve self-heal performance) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=320
* cluster/afr: Set mtime of parent directory in self-heal properly.Vikas Gorur2009-10-131-2/+3
| | | | | | | | | | | While creating/deleting an entry as part of entry self-heal, set the parent directory's mtime to match that on the source subvolume. Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 137 (Parent directory mtime not reset after a create in self-heal) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=137
* prevent spurious unlocks from afr selfhealAnand Avati2009-10-131-9/+25
| | | | | | | | | | afr selfheal now remembers all the nodes on which locks were successfully held and sends unlocks only to those nodes Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 112 (parallel deletion of files mounted by different clients on the same back-end hangs and/or does not completely delete) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=112
* Changed occurrences of Z Research to Gluster.Vijay Bellur2009-10-071-1/+1
| | | | Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
* Global: NFS-friendly prototype changesShehjar Tikoo2009-10-011-3/+6
| | | | | | | Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 145 (NFSv3 related additions to 2.1 task list) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=145
* Global: Introduce setattr and fsetattr fopsShehjar Tikoo2009-10-011-34/+27
| | | | | | | Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 146 (Add setattr FOP) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=146
* cluster/afr: Do not try to self-heal "/"Vikas Gorur2009-09-081-8/+16
| | | | | | | | | | If the root directory does not exist on a subvolume, don't try to create it. Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 28 (Deleting a backend export directory in an AFR setup can cause a segfault while trying to self heal) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=28
* Set timestamps properly when creating missing entries.Vikas Gorur2009-07-061-3/+37
| | | | | | | In AFR self-heal set timestamp of a freshly created missing entry to that of the source entry. Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
* afr logging cleanupAnand V. Avati2009-04-281-3/+2
|
* Cleaned up log messages in replicate.Vikas Gorur2009-04-241-19/+20
| | | | Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
* Check return value of afr_sh_select_source.2.0.0rc8Vikas Gorur2009-04-201-1/+0
| | | | | | If select_source returns -1, abort self-heal. Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
* Minor fix in afr_sh_build_pending_matrix.Vikas Gorur2009-04-201-3/+0
| | | | | | Remove incorrect check for xattr[i] being NULL. Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
* Fix two memory leaks in afr self heal code.Vikas Gorur2009-04-171-0/+4
| | | | Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
* Changed xattr format of afr changelog to support adding and removing of ↵Vikas Gorur2009-04-161-135/+104
| | | | | | subvolumes while keeping existing data. Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
* Compulsorily do self heal if file sizes differ.Vikas Gorur2009-04-091-14/+85
| | | | | | | | | If file sizes differ, then compulsorily do self-heal. If no 'wise' sources are found, then pick a 'fool' with the biggest file size. If even 'fools' aren't found, pick the 'innocent' source with the biggest file size. Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
* Handle files which have no pending xattrs at all.Vikas Gorur2009-04-091-1/+27
| | | | | | | | If a pending xattr key is non-existent on a file (call such files 'ignorant'), make all other non-ignorant subvolumes point towards the ignorant one. Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
* Erase xattr during self-heal based on original dict.Vikas Gorur2009-04-091-3/+17
| | | | | | | | Decrement xattr during self-heal based on the original dict instead of pending_matrix, as the pending_matrix might have been altered later. Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
* Fix in return value of afr_sh_mark_sourcesVikas Gorur2009-04-061-0/+2
| | | | | | | | | | | | afr_sh_mark_sources now returns: -1 if two wise subvols conflict (split-brain) 0 if all subvols are innocent (no self-heal needed) >0 if sources found Also, changes to callers of afr_sh_mark_sources to handle return value properly. Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
* Made self heal logic more precise.Vikas Gorur2009-03-241-22/+233
| | | | | | | | | Discard earlier patch sent for the same error. This patch fixes it more comprehensively. This solves the spurious split-brain seen by many users. Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
* Add extra 'volume' parameter to inodelk/entrylk callsVikas Gorur2009-03-121-0/+2
| | | | Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
* updated copyright header to extend copyright upto 2009Basavanagowda Kanur2009-02-261-1/+1
| | | | | | updated copyright header to include 2009. Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
* Added all filesVikas Gorur2009-02-181-0/+1073