| Commit message (Collapse) | Author | Age | Files | Lines |
| |
These changes are required to make GlusterFS compile on Mac OS X (10.5).
Currently only the glusterfs server component works on Mac, and it has
to be built with the following options to ./configure:
"bash$ ./configure --disable-fuse-client --disable-fusermount"
Signed-off-by: Amar Tumballi <amar@gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 361 (GlusterFS 3.0 should work on Mac OS/X)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=361
| |
Copy out only the members that are needed. A memcpy of the full local
copies pointers without taking references, which results in various
corruption errors.
Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170
| |
Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170
| |
For the background self-heal frame's local_t, copy only the
required members instead of doing a wholesale memcpy. The memcpy
led to pointers being copied and then double free'd.
Signed-off-by: Vikas Gorur <vikas@gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 320 (Improve self-heal performance)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=320
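The difference between a wholesale copy and a member-wise copy can be illustrated outside of C. The sketch below is a Python analogue, not the GlusterFS code: the `Local` class and its fields are hypothetical stand-ins for afr_local_t, and a shared list plays the role of a pointer that two frames would both try to free.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Local:
    # Hypothetical stand-in for afr_local_t; field names are illustrative.
    size: int = 0
    read_child: int = 0
    pending: list = field(default_factory=list)  # acts like an owned pointer


def wholesale_copy(src):
    # Analogue of memcpy: every field is copied, so 'pending' ends up shared.
    return copy.copy(src)


def selective_copy(src):
    # Copy only the plain values the background frame needs.
    # The new object gets its own, separate 'pending' list.
    return Local(size=src.size, read_child=src.read_child)


orig = Local(size=4096, read_child=1, pending=[1, 0])
bad = wholesale_copy(orig)
good = selective_copy(orig)
```

Here `bad.pending` and `orig.pending` are the same object, which is the situation that leads to a double free in C, while `good` shares nothing with `orig`.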
| |
Set the buf.st_size of the original frame's afr_local_t, and
not the copy_frame'd one.
Signed-off-by: Vikas Gorur <vikas@gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 320 (Improve self-heal performance)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=320
| |
doing self-heal.
This patch sets the read-subvolume equal to the self-heal "source"
even if we're not doing self-heal (because someone else is already
doing it).
Signed-off-by: Vikas Gorur <vikas@gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 320 (Improve self-heal performance)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=320
| |
creation fops.
This fixes fuse_create_cbk conflict warnings and random errors while
running dbench (typically open handle failure with ENOENT).
Signed-off-by: Vikas Gorur <vikas@gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 315 (generation number support)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=315
| |
Error handling in afr_lookup_cbk was faulty because it
did not give priority to errors such as ESTALE over ENOENT,
and ENOENT over other errors. This patch fixes that, and
also breaks up afr_lookup_cbk into multiple logical functions.
Signed-off-by: Vikas Gorur <vikas@gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 205 ([ glusterfs 2.0.6rc4 ] - Hard disk failure not handled correctly)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=205
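The priority rule described above (ESTALE over ENOENT, ENOENT over everything else) can be sketched as a small reduction over the per-subvolume errors. This is a Python illustration under assumptions: the function names are hypothetical, and how afr_lookup_cbk actually stores and merges errors is not shown in this log.

```python
import errno

# Higher number wins. Errors not listed here share the lowest priority.
_PRIORITY = {errno.ESTALE: 2, errno.ENOENT: 1}


def merge_errno(current, new):
    """Return the errno to report after seeing `new`, given `current` so far."""
    if current is None:
        return new
    if _PRIORITY.get(new, 0) > _PRIORITY.get(current, 0):
        return new
    return current


def lookup_errno(replies):
    """Fold the errno values returned by the subvolumes into one result."""
    result = None
    for err in replies:
        result = merge_errno(result, err)
    return result
```

With this ordering, a single ESTALE reply decides the result regardless of the order in which the subvolumes answer.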
| |
When the active_sink count is 0, the code proceeded into a dangerous loop,
resulting in a crash while issuing the call or in the callbacks
afr_sh_data_setattr_cbk or afr_sh_data_flush_cbk.
Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170
| |
Reset pre_op_done[i] to 0 after issuing a post-op in flush. This was
missed during the introduction of the pre_op_done[] array and was resulting
in a lot of spurious self-heals when spurious flushes were received.
Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170
| |
When the checksum fop returns an error, mark the loop for termination at
the end of the iteration (when all checksum calls of that iteration have
returned) and not immediately.
Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 320 (Improve self-heal performance)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=320
| |
alloca.h should be included on a platform-specific basis.
Let common-utils.h handle that.
Signed-off-by: Vikas Gorur <vikas@gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 349 (FreeBSD compilation error (alloca.h).)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=349
| |
This patch completes the previous patch for self-heal of
open fds in replicate.
If an fd was never opened on a subvolume, we remember that
and do the open after we've done self-heal on that fd.
Signed-off-by: Vikas Gorur <vikas@gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170
| |
Cleaned up the self-heal interface to callers.
Signed-off-by: Vikas Gorur <vikas@gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170
| |
This patch brings in partial support for self-heal of open
fds. The precondition is that the fd should have been opened
successfully during the initial open() (or create()), and we
assume that protocol/client has successfully reopened the fd
when the subvolume comes back up.
It works by doing an "up/down flush" (a dummy flush transaction
to do post-op wherever necessary) and then triggering
data self-heal on the file in the post-post-op hook of the
dummy flush transaction. This ensures that any writes
that come in during self-heal will wait until self-heal completes.
The up/down flush is also done when a subvolume goes down,
so that post-op is done on all subvolumes where pre-op was done.
Signed-off-by: Vikas Gorur <vikas@gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170
| |
Signed-off-by: Vikas Gorur <vikas@gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170
| |
Refactored the operation of the data self-heal algorithm
as:
* open all fd's (if fd not supplied by caller)
* lock 0-0 (if lock not supplied by caller)
* fxattrop, fstat (instead of lookup)
... self heal ...
* unlock (if lock not supplied by caller)
* close (if fd not supplied by caller).
Signed-off-by: Vikas Gorur <vikas@gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170
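The refactored sequence above follows a simple ownership rule: the algorithm releases only what it acquired itself. A minimal Python sketch of that rule follows; the function name, the string placeholders, and the log list are all hypothetical and stand in for the real fops.

```python
def data_self_heal(log, fd=None, lock_held=False):
    """Record the steps taken; acquire and release only what the caller did not supply."""
    opened_here = fd is None
    locked_here = not lock_held
    if opened_here:
        fd = "fd"  # placeholder for opening fds on all subvolumes
        log.append("open")
    if locked_here:
        log.append("lock 0-0")
    try:
        log.append("fxattrop")
        log.append("fstat")
        log.append("self-heal")
    finally:
        # Release in reverse order, and only resources acquired above.
        if locked_here:
            log.append("unlock")
        if opened_here:
            log.append("close")
    return log
```

A caller that already holds the fd and the lock sees only the middle steps, while a caller that supplies neither gets the full open/lock ... unlock/close bracket.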
| |
Data self-heal now holds blocking locks, and instead of locking
on all subvolumes, it only locks on {data-lock-server-count} subvolumes.
Signed-off-by: Vikas Gorur <vikas@gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170
| |
self-heal
Signed-off-by: Vikas Gorur <vikas@gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170
| |
Set opendir_done and split_brain flags correctly
in the inode context.
Signed-off-by: Vikas Gorur <vikas@gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 249 (Self heal of a file that does not exist on the first subvolume)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=249
| |
local->cont.opendir.checksum was being free'd both in the
self-heal completion function and self-heal unwind.
Signed-off-by: Vikas Gorur <vikas@gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 249 (Self heal of a file that does not exist on the first subvolume)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=249
| |
For ENTRY_RENAME_TRANSACTIONs, keep track separately whether the
lower_path and the higher_path have been locked, and unlock only
those which have been.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 112 (parallel deletion of files mounted by different clients on the same back-end hangs and/or does not completely delete)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=112
| |
This patch does two things related to revalidate:
1) If a revalidate fails on any subvolume, the entire lookup
call is failed.
2) Self-heal is not triggered on a revalidate if revalidate
has failed on any subvolume.
Signed-off-by: Vikas Gorur <vikas@gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 389 (auto-heal fails randomly and causes "Stale NFS file handle" errors)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=389
| |
Change the success condition to op_ret >= 0 instead
of op_ret == 0.
Signed-off-by: Vikas Gorur <vikas@gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 249 (Self heal of a file that does not exist on the first subvolume)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=249
| |
Signed-off-by: Vikas Gorur <vikas@gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 320 (Improve self-heal performance)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=320
| |
The problem: If some files on the first subvolume disappeared
without leaving a trace in the entry changelog (this can happen,
for example, when an fsck has deleted files or when a hard drive
is replaced), those files would never be self-healed even though
they would be present on the second subvolume. This is because
readdir is sent only to the first subvolume, and since the files
don't appear in the directory listing, no lookup would ever be
sent on them.
This patch fixes this problem by doing a readdir on all the subvolumes
during the first opendir on a directory inode. If a discrepancy in the
contents is detected, entry self-heal in a special "force merge" mode
is triggered on that directory.
Signed-off-by: Vikas Gorur <vikas@gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 249 (Self heal of a file that does not exist on the first subvolume)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=249
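The discrepancy check described above amounts to comparing the directory listings of all subvolumes. The following Python sketch models each listing as a set of names. The function names are hypothetical, and treating "force merge" as a plain union of entries is an assumption made only for illustration.

```python
def needs_force_merge(listings):
    """listings: one set of entry names per subvolume, first subvolume first."""
    first = listings[0]
    return any(other != first for other in listings[1:])


def merged_entries(listings):
    # Assumed meaning of "force merge": every entry seen on any subvolume.
    return set().union(*listings)


# The first subvolume lost "b" without any changelog entry (for example
# after an fsck), so a readdir sent only to it would never reveal "b".
listings = [{"a"}, {"a", "b"}]
```

Comparing the listings exposes the missing entry, which a readdir on the first subvolume alone could not do.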
| |
Defined symbolic constants for the bit masks and
made 'split-brain' a single bit field in the ctx.
Signed-off-by: Vikas Gorur <vikas@gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 249 (Self heal of a file that does not exist on the first subvolume)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=249
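As a rough illustration of named bit masks over a single context word, the Python sketch below keeps two flags in one integer. The constant names and bit positions are assumptions. The log only says that symbolic constants were defined and that split-brain became a single bit.

```python
# Hypothetical mask names and bit positions, for illustration only.
OPENDIR_DONE_MASK = 0x1
SPLIT_BRAIN_MASK = 0x2


def set_flag(ctx, mask):
    return ctx | mask


def clear_flag(ctx, mask):
    return ctx & ~mask


def has_flag(ctx, mask):
    return (ctx & mask) != 0


ctx = 0
ctx = set_flag(ctx, OPENDIR_DONE_MASK)
ctx = set_flag(ctx, SPLIT_BRAIN_MASK)
ctx = clear_flag(ctx, SPLIT_BRAIN_MASK)
```

Named masks keep each flag independent, so clearing the split-brain bit leaves the opendir-done bit untouched.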
| |
Signed-off-by: Vinayak Hegde <vinayak@gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 360 (All fop fails when stat-prefetch is loaded on afr.)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=360
| |
Don't wait for the next recursive call to sh_{full,diff}_loop_driver
to decide that we've reached the end of file, as the frame could
have been destroyed by that time (if subvolumes are posix).
Signed-off-by: Vikas Gorur <vikas@gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 320 (Improve self-heal performance)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=320
| |
If the inodelk_count or entrylk_count is positive on a
file/directory, don't try to do self-heal on it.
Signed-off-by: Vikas Gorur <vikas@gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 326 ([2.0.8rc9] Spurious self-heal)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=326
| |
If entry self-heal determines that a file/directory should
be deleted from a subvolume, move that entry to a directory
called "/.trash" on that subvolume. This is for two reasons:
1) It limits the damage that can be done by a "wrong" entry
self-heal.
2) It solves the problem of a to-be-deleted directory not
being empty.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 227 (replicate selfheal does not remove directory with contents in it)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=227
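The move-instead-of-delete idea can be sketched with ordinary filesystem calls. This Python example is an illustration only: the helper names are hypothetical, it runs against a temporary directory rather than a real subvolume, and it does not handle name collisions inside the trash directory.

```python
import os
import tempfile

TRASH_DIR = ".trash"


def move_to_trash(subvol_root, rel_path):
    """Move an entry into <subvol_root>/.trash instead of deleting it."""
    trash = os.path.join(subvol_root, TRASH_DIR)
    os.makedirs(trash, exist_ok=True)
    src = os.path.join(subvol_root, rel_path)
    dst = os.path.join(trash, os.path.basename(rel_path))
    # A rename moves a directory together with its contents, so the
    # "directory not empty" problem never arises.
    os.rename(src, dst)
    return dst


def demo_trash():
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "stale_dir"))
        with open(os.path.join(root, "stale_dir", "file"), "w") as f:
            f.write("data")
        dst = move_to_trash(root, "stale_dir")
        gone = not os.path.exists(os.path.join(root, "stale_dir"))
        kept = os.path.exists(os.path.join(dst, "file"))
        return gone, kept
```

The entry disappears from its original location but its contents survive under the trash directory, so a mistaken self-heal decision remains recoverable.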
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 145 (NFSv3 related additions to 2.1 task list)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=145
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 146 (Add setattr FOP)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=146
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 145 (NFSv3 related additions to 2.1 task list)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=145
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 320 (Improve self-heal performance)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=320
| |
During entry self-heal, make sure not only that a symlink
exists on all subvolumes, but also that their targets match.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 193 (symlink contents not self-healed by replicate)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=193
| |
Just before the lookup is unwound during background data self-heal,
the read subvolume is set to the self-heal source subvol so that
read operations on the file work correctly, and don't have to
wait for the self-heal to complete.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 320 (Improve self-heal performance)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=320
| |
This patch introduces a new option "background-self-heal-count", with a
default value of 16.
This means that up to {background-self-heal-count} files/directories
will be healed in the background at any given time. If that many self-heals
are already in progress, further self-heals take place in the foreground.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 320 (Improve self-heal performance)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=320
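The background/foreground decision described above behaves like a bounded counter. A minimal Python sketch, assuming a hypothetical `SelfHealScheduler` class (the real translator's bookkeeping and locking are not shown in this log):

```python
class SelfHealScheduler:
    """Decide whether a new self-heal runs in the background or foreground."""

    def __init__(self, background_self_heal_count=16):
        self.limit = background_self_heal_count
        self.in_background = 0

    def start(self):
        # Below the limit the heal goes to the background; otherwise the
        # caller has to wait for it in the foreground.
        if self.in_background < self.limit:
            self.in_background += 1
            return "background"
        return "foreground"

    def finish_background(self):
        self.in_background -= 1


sched = SelfHealScheduler(background_self_heal_count=2)
modes = [sched.start() for _ in range(3)]
```

With a limit of 2, the third heal is pushed to the foreground. Once a background heal finishes, the next one can run in the background again.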
| |
Start up to "data-self-heal-window-size" instances of the read-write loop
of the "diff" data self-heal algorithm simultaneously.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 320 (Improve self-heal performance)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=320
| |
Start up to "data-self-heal-window-size" instances of the read-write loop
of the "full" data self-heal algorithm simultaneously.
Add a new option "data-self-heal-window-size" with range [1-1024],
and a default value of 16.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 320 (Improve self-heal performance)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=320
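The windowing idea can be simulated without any I/O. The Python sketch below keeps at most `window_size` block copies outstanding and starts a new one whenever the oldest completes. It is a single-threaded model under assumptions: `run_windowed` and `copy_block` are hypothetical names, and completion order in the real asynchronous code may differ.

```python
from collections import deque


def run_windowed(offsets, copy_block, window_size=16):
    """Copy every block while keeping at most window_size copies outstanding.

    Returns the largest number of copies that were outstanding at once.
    """
    if not 1 <= window_size <= 1024:
        raise ValueError("window size must be in [1, 1024]")
    pending = deque(offsets)
    in_flight = deque()
    peak = 0
    while pending or in_flight:
        # Fill the window with as many new copies as it allows.
        while pending and len(in_flight) < window_size:
            in_flight.append(pending.popleft())
        peak = max(peak, len(in_flight))
        # Simulate the oldest outstanding copy completing.
        copy_block(in_flight.popleft())
    return peak


done = []
peak = run_windowed(range(0, 10 * 4096, 4096), done.append, window_size=4)
```

For ten blocks and a window of 4, the number of outstanding copies never exceeds 4, and every offset is eventually copied.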
| |
If the initial lookup shows that 'pending' is positive, then
self-heal will hold a lock and do a lookup again. This lookup
might show that 'pending' is zero everywhere. However, entry
self-heal used to consider this as a case of 'no sources' and
try to merge the directories. This patch checks for that case
and does not do the merge.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 326 ([2.0.8rc9] Spurious self-heal)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=326
| |
impunge_parent_setattr.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 137 (Parent directory mtime not reset after a create in self-heal)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=137
| |
frame.
There was a race condition in assuming that afr_sh_entry_impunge_parent_setattr_cbk will
always return before impunge_xattrop_cbk and impunge_setattr_cbk.
This patch fixes two additional problems:
1) parent_loc was being built from impunge_local->loc after the STACK_WIND
to impunge_xattrop_cbk had happened. In a simple afr-posix configuration
the stack will have been destroyed by the time building of parent_loc is
attempted.
2) The parent_loc built in impunge_newfile_cbk was not being loc_wipe'd.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 137 (Parent directory mtime not reset after a create in self-heal)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=137
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 325 (crash in afr_fd_ctx_set)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=325
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 112 (parallel deletion of files mounted by different clients on the same back-end hangs and/or does not completely delete)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=112
| |
While creating/deleting an entry as part of entry self-heal,
set the parent directory's mtime to match that on the source
subvolume.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 137 (Parent directory mtime not reset after a create in self-heal)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=137
| |
afr self-heal now remembers all the nodes on which locks were successfully
held and sends unlocks only to those nodes.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 112 (parallel deletion of files mounted by different clients on the same back-end hangs and/or does not completely delete)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=112
| |
Mark a subvol as holding a lock only if op_ret == 0.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 112 (parallel deletion of files mounted by different clients on the same back-end hangs and/or does not completely delete)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=112
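Remembering which nodes granted a lock, and unlocking only those, can be sketched as a per-node boolean array. This Python illustration uses hypothetical helper names; `try_lock` and `unlock` stand in for the lock fops, and a return value of 0 is taken as success.

```python
def lock_all(nodes, try_lock):
    """Attempt a lock on every node and remember where it was granted."""
    locked = [False] * len(nodes)
    for i, node in enumerate(nodes):
        op_ret = try_lock(node)
        # Mark the node as locked only when the call succeeded.
        locked[i] = (op_ret == 0)
    return locked


def unlock_held(nodes, locked, unlock):
    """Send unlocks only to nodes on which a lock is actually held."""
    for node, held in zip(nodes, locked):
        if held:
            unlock(node)


nodes = ["a", "b", "c"]
results = {"a": 0, "b": -1, "c": 0}  # node "b" refuses the lock
locked = lock_all(nodes, results.get)
unlocked = []
unlock_held(nodes, locked, unlocked.append)
```

Node "b" never granted the lock, so it never receives an unlock.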
| |
transactions.
Hold the lock on the {higher_path} only after the lock on the
{lower_path} has been granted successfully.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 112 (parallel deletion of files mounted by different clients on the same back-end hangs and/or does not completely delete)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=112
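Taking the two locks of a rename in a fixed order, and attempting the second only after the first is granted, is a standard way to avoid deadlock between clients. A Python sketch follows, under the assumption that plain string comparison decides which path is "lower". The helper names and the fake lock server are hypothetical.

```python
def lock_rename(path_a, path_b, try_lock, unlock):
    """Lock both paths of a rename in a fixed order; return True on success."""
    # Assumption: string ordering decides which path is the lower one.
    lower, higher = sorted([path_a, path_b])
    lower_locked = try_lock(lower) == 0
    # The higher path is attempted only once the lower lock is granted.
    higher_locked = lower_locked and try_lock(higher) == 0
    if lower_locked and higher_locked:
        return True
    # On failure, unlock only what was actually locked.
    if lower_locked:
        unlock(lower)
    return False


def demo_rename(fail_on=None):
    """Run lock_rename against a fake lock server and return the call trace."""
    trace = []

    def try_lock(path):
        trace.append(("lock", path))
        return -1 if path == fail_on else 0

    def unlock(path):
        trace.append(("unlock", path))

    ok = lock_rename("/b/new", "/a/old", try_lock, unlock)
    return ok, trace
```

Because every client locks the lower path first, two clients renaming between the same pair of paths cannot each hold one lock while waiting for the other.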
| |
When files are missing on all backend nodes, the logic in afr_sh_entry_erase_pending
is broken and results in a missing lookup frame. This causes processes to enter
uninterruptible sleep state.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 311 (missing frame (lookup) when entry-selfheal finds missing files in all backend nodes)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=311
|