path: root/xlators/cluster/afr
Commit message | Author | Age | Files | Lines
* cluster/afr: Prevent double big lock when data self-heal loops are not spawned | Pranith Kumar K | 2011-09-06 | 2 | -7/+30
    The steps in normal data self-heal:
    1) The self-heal frame takes the big lock and gets the xattrs/stat to
       decide source and sink information.
    2) Loop frames are spawned, which perform the self-heal by taking small
       locks on the file. Each time, a new lock is taken and the old lock is
       released.
    3) Before the final small lock is released, the self-heal frame takes a
       big lock and then unlocks the small lock. The pending xattrs are
       erased, then the big unlock happens, and that is the end of the data
       self-heal.

    The problem appears when a data self-heal is needed for a file and the
    fop that triggers the self-heal is an open with O_TRUNC. Fuse sends an
    open followed by an explicit truncate for this. The open triggers the
    self-heal, but by the time it tries to spawn the loops the file has been
    truncated to size 0, so no loops are formed. These are the steps:
    1) The self-heal frame takes the big lock and gets the xattrs/stat to
       decide source and sink information.
    2) Loop frames are not spawned. The big lock is not released.
    3) One more big lock is taken by the same self-heal frame and the pending
       xattrs etc. are erased. It then does two big unlocks, but after the
       first unlock the information on which the locks were performed is
       forgotten, so the next unlock becomes a no-op.

    So there is a stale big lock on that file, preventing further writes.

    As a fix, if the loops are not spawned, use the previous big lock to
    perform the rest of the operations needed to complete the data self-heal.
    There is no need for one more big lock.

    Change-Id: Id03171269594e447b2b6d1331e362d83bd1e3430
    BUG: 3506
    Reviewed-on: http://review.gluster.com/339
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@gluster.com>
* cluster/afr: Bring down the self-heal window size to 1 | Pranith Kumar K | 2011-09-06 | 1 | -1/+1
    This is done in an effort to be nice to system resources while self-heal
    is in progress.

    Change-Id: I123f1eb4d8000613a35c0117f0aa27f926f3a921
    BUG: 3503
    Reviewed-on: http://review.gluster.com/333
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
    Reviewed-by: Anand Avati <avati@gluster.com>
* cluster/afr: Perform flush on all the children involved in self-heal | Pranith Kumar K | 2011-08-22 | 1 | -19/+6
    Change-Id: I66362a3087a635fb7b759d7836a1f6564a6a7fc9
    BUG: 3456
    Reviewed-on: http://review.gluster.com/294
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: Change definition of stale child | Pranith Kumar K | 2011-08-22 | 1 | -1/+1
    The code is checking for priv->child_up[i], which can change while the
    fop is in progress. Since pending[child][id-of-transaction] alone is
    enough to tell if the child became stale or not, use just that.

    Change-Id: I494bf02cca66f4fd41526195fafce86a202c6bd1
    BUG: 3455
    Reviewed-on: http://review.gluster.com/293
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: Paused fop should not continue with fop | Pranith Kumar K | 2011-08-22 | 3 | -3/+11
    Change-Id: Idce22a6266c354e327d5d717715d2e62533eec58
    BUG: 3448
    Reviewed-on: http://review.gluster.com/292
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: fop should not continue if it is paused, until resumes | Pranith Kumar K | 2011-08-21 | 2 | -0/+8
    Change-Id: Ie026ebed98cf5ff75ae1a13437d29f67d0e0254a
    BUG: 3448
    Reviewed-on: http://review.gluster.com/286
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Raghavendra Bhat <raghavendrabhat@gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: set frame local | Pranith Kumar K | 2011-08-21 | 2 | -1/+2
    Change-Id: I861b3c4494735b0ba6e038cdc39c50b9866747a8
    BUG: 3448
    Reviewed-on: http://review.gluster.com/283
    Reviewed-by: Raghavendra Bhat <raghavendrabhat@gluster.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: Perform self-heal without locking the whole file | Pranith Kumar K | 2011-08-20 | 19 | -1726/+1580
    Change-Id: I206571c77f2d7b3c9f9d7bb82a936366fd99ce5c
    BUG: 3182
    Reviewed-on: http://review.gluster.com/141
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: Update fresh_children in lookup if no other ops in progress | Pranith Kumar K | 2011-08-19 | 9 | -115/+236
    If write/truncate fails, we should remove the child that failed the fop
    from the fresh children. The previous code assumed that the children on
    which the fop succeeded are fresh children, which is wrong. This patch
    fixes that.

    Change-Id: I1e6e21e20faea00516a0fdd2e95f2d7e9cf9076d
    BUG: 3411
    Reviewed-on: http://review.gluster.com/263
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* option validation: further fixes | Anand Avati | 2011-08-19 | 1 | -0/+15
    fixes in option handling changes

    Change-Id: I0a44cdb088e3f08cd43d583a580736d0903fa88c
    BUG: 3415
    Reviewed-on: http://review.gluster.com/261
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@gluster.com>
* xlator options: revamp xlator option validation/reconfigure code | Anand Avati | 2011-08-18 | 1 | -538/+111
    - move option handling to options.c (new file)
    - remove duplication of option validation code
    - remove duplication of gf_log / sprintf
    - get rid of xlator_t->validate_options
    - get rid of option validation in rpc-transport
    - get rid of validate_options() in every xlator
    - use xlator_volume_option_get to clean up many functions
    - introduce primitives to init/reconfigure option types

    Change-Id: I51798af72c8dc0a2b9e017424036eb3667dfc7ff
    BUG: 3415
    Reviewed-on: http://review.gluster.com/235
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@gluster.com>
* cluster/afr: read_child should be >= 0 (tag: v3.3.0qa2) | Pranith Kumar K | 2011-08-13 | 1 | -1/+3
    Change-Id: I447fb6a93cdd77de322cd5ded30673411c4cf79e
    BUG: 3251
    Reviewed-on: http://review.gluster.com/233
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
* Change Copyright current year | Pranith Kumar K | 2011-08-10 | 26 | -26/+26
    Change-Id: I2d10f2be44f518f496427f257988f1858e888084
    BUG: 3348
    Reviewed-on: http://review.gluster.com/200
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@gluster.com>
* LICENSE: s/GNU Affero General Public/GNU General Public/ | Pranith Kumar K | 2011-08-06 | 26 | -78/+78
    Change-Id: I3914467611e573cccee0d22df93920cf1b2eb79f
    BUG: 3348
    Reviewed-on: http://review.gluster.com/182
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@gluster.com>
* PUMP: set pump lk_owner,pid to frame->root | Pranith K | 2011-07-17 | 2 | -9/+8
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 3182 (Afr self-heal should happen with out big lock)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=3182
* cluster/afr: Don't depend on fuse lk_owner for inodelks | Pranith K | 2011-07-17 | 2 | -8/+6
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 3182 (Afr self-heal should happen with out big lock)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=3182
* cluster/afr: Fix conflict files and gfid self-heal | Pranith K | 2011-07-17 | 7 | -600/+811
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 2745 (failure to detect split brain)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2745
* cluster/afr: Detect conflict/gfid self-heals | Pranith K | 2011-07-17 | 6 | -161/+532
    Added some helper functions that can be reused

    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 2745 (failure to detect split brain)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2745
* cluster/afr: make expunge/impunge re-usable | Pranith K | 2011-07-17 | 2 | -64/+176
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 2745 (failure to detect split brain)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2745
* cluster/afr: Choose next call child from fresh-children for inode-read-fops | Pranith K | 2011-07-17 | 7 | -442/+391
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 2840 (files not getting self-healed when the first child goes down)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2840
* cluster/afr: Add fresh children along with read-child to inode context | Pranith K | 2011-07-17 | 16 | -353/+817
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 2840 (files not getting self-healed when the first child goes down)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2840
* cluster/afr: Move afr local alloc functions from header files to sources | Pranith K | 2011-07-17 | 2 | -105/+124
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 2840 (files not getting self-healed when the first child goes down)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2840
* afr: changes in volume_options to assist volume set help/help-xml | Kaushik BV | 2011-07-12 | 1 | -6/+40
    Signed-off-by: Kaushik BV <kaushikbv@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 2041 (volume set help option)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2041
* afr/stripe: collect pathinfo xattrs from all childs | Venky Shankar | 2011-07-12 | 2 | -1/+125
    Signed-off-by: Venky Shankar <venky@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 3046 (getxattr for afr should returns realpath from all childs)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=3046
* cluster/afr: Handle lookups when self-heal is off | Pranith K | 2011-07-12 | 9 | -513/+996
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 2586 (read child is set without checking the xattr)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2586
* pump, afr: dict related memory fixes. | Krishnan P | 2011-07-12 | 3 | -42/+106
    Signed-off-by: Krishnan Parthasarathi <kp@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 2489 (GlusterFS crashing with replace-brick)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2489
* cluster/afr: fix the range of the lock taken in [f]truncate | Pranith K | 2011-07-01 | 1 | -4/+4
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 3077 (afr [f]truncate locks wrong region in transaction)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=3077
* pump: mark pending before notify to children to avoid race in single CPU. | Krishnan P | 2011-06-20 | 1 | -2/+2
    Signed-off-by: Krishnan Parthasarathi <kp@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 3050 ('replace-brick' hangs on vm's)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=3050
* afr: bg self-heal must be off if self-heal-count=0. | Krishnan P | 2011-06-20 | 1 | -3/+3
    Signed-off-by: Krishnan Parthasarathi <kp@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 3036 (self-heal problem in replace-brick)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=3036
* core: fill 'ia_ino' from 'ia_gfid' in 'storage/posix' to preserve same ino number | Amar Tumballi | 2011-06-16 | 7 | -205/+0
    Take the least significant 64 bits of the gfid and assign them to
    'ia_ino'. Hence, for a given file (or directory), the 'ia_ino' number is
    always the same, and we need not worry about the 'itransform' in
    'cluster/*' translators.

    Signed-off-by: Amar Tumballi <amar@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 3042 (inode number should be constant on storage)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=3042
* cluster/afr: Give proper device id for mknod | Pranith Kumar K | 2011-06-16 | 1 | -1/+2
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 2840 (files not getting self-healed when the first child goes down)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2840
* PUMP: perform opendir on pump xlator instead of its child | Pranith K | 2011-06-16 | 1 | -4/+2
    When replace-brick is performed, there is a high probability that the
    directories on the source brick do not have any pending entry xattrs. So
    for proper self-heal we need the force merge to kick in. If the opendir
    is performed on the pump xlator, the directory is examined, and if the
    entries on its children do not match, force merge is triggered, so
    missing entries will be created on the sink. In this process, pending
    xattrs are set from source to sink on the files present on the source, so
    when the lookup happens the self-heal is triggered.

    Before this fix, the code worked because the self-heal source was decided
    based on which file is biggest in size, which is wrong and has been
    removed.

    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 2586 (read child is set without checking the xattr)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2586
* pump: cleanup xattrs on both commit and abort path. | Krishnan P | 2011-06-16 | 2 | -4/+149
    This change makes glusterd send a setxattr command for the replace-brick
    commit operation, similar to abort. Earlier we could commit even before
    the 'migration' of data was complete; with this change we fail that
    operation.

    Signed-off-by: Krishnan Parthasarathi <kp@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 3033 (Changes to replace-brick and syntask interface.)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=3033
* afr: holding stack var via dict_set_static_bin corrupts. | Krishnan P | 2011-06-16 | 1 | -2/+15
    Signed-off-by: Krishnan Parthasarathi <kp@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 3033 (Changes to replace-brick and syntask interface.)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=3033
* syncop: Modified to accept one 'non-frame' arg. | Krishnan P | 2011-06-16 | 1 | -5/+3
    Earlier, syncops accepted one argument, a call frame used to carry out
    the fops synchronously. Now two args are passed to the synctask function:
    a call frame and a void pointer.

    Signed-off-by: Krishnan Parthasarathi <kp@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 3033 (Changes to replace-brick and syntask interface.)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=3033
* cluster/afr: propagate proper errno returned by lock fops | Anand Avati | 2011-06-10 | 1 | -4/+0
    If locks could not be held on any of the servers, then propagate the
    errno returned by the lock FOPs instead of hardcoding EAGAIN/EINVAL.

    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 2993 ([glusterfs-3.2.0qa2]: hang while doing the selfheal)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2993
* cluster/afr: Log errors in afr self-heal with GF_LOG_ERROR | Pranith Kumar K | 2011-06-08 | 7 | -44/+49
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 2986 (Failed operations should should be logged `E' or `W')
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2986
* pump: init last_event array to be used in afr_notify | Pranith Kumar K | 2011-05-31 | 1 | -0/+7
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 2870 (Inconsistent xattr values when creating bricks)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2870
* cluster/afr: Send Non-blocking lock in non-blocking entrylk | Pranith Kumar K | 2011-05-30 | 1 | -1/+1
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 2949 (self-heal hangs)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2949
* cluster/afr: Send the first child up/down after all its children notify | Pranith Kumar K | 2011-05-30 | 3 | -54/+115
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 2870 (Inconsistent xattr values when creating bricks)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2870
* pump: Detect 'empty' brick and finish migration. | Krishnan Parthasarathi | 2011-05-30 | 1 | -0/+8
    Signed-off-by: Krishnan Parthasarathi <kp@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 2909 (replace brick of empty brick never says migration completed)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2909
* replicate: print favorite child as an int instead of unsigned int | Raghavendra Bhat | 2011-05-20 | 1 | -1/+1
    Signed-off-by: Raghavendra Bhat <raghavendrabhat@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 2668 ([glusterfs-3.2.9qa7]: createbench error)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2668
* Move `self-heal completed' message to log level INFO. | Sachidananda | 2011-05-11 | 1 | -1/+1
    Signed-off-by: Sachidananda Urs <sac@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 2867 (Move self-heal completed message to INFO level)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2867
* cluster/afr: set loc gfids for fresh lookup | Pranith Kumar K | 2011-05-04 | 3 | -0/+15
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 2346 (Log message enhancements in GlusterFS - phase 1)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2346
* Move the log `self-heal pending' message to INFO level. | Sachidananda | 2011-05-03 | 1 | -1/+1
    Signed-off-by: Sachidananda Urs <sac@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 2867 (Move self-heal completed message to INFO level)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2867
* Move self-heal completed log message to INFO level. | Sachidananda | 2011-05-03 | 1 | -1/+1
    Signed-off-by: Sachidananda Urs <sac@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 2867 (Move self-heal completed message to INFO level)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2867
* loc_t: add 'gfid' and 'pargfid' fields | Amar Tumballi | 2011-05-03 | 2 | -0/+11
    These fields are used mainly in the self-heal path, where 'inode->gfid'
    or 'parent->gfid' is not yet set. These fields in 'loc' will have lower
    precedence than 'inode->gfid' in the client protocol.

    Signed-off-by: Amar Tumballi <amar@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 2346 (Log message enhancements in GlusterFS - phase 1)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2346
* cluster/afr: Avoid null dereference | Pranith Kumar K | 2011-04-14 | 1 | -1/+3
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 2750 ([glusterfs-3.2.0qa11]: nfs server crashed in afr_sh_entry_expunge_cbk)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2750
* PUMP: initialize loc at declaration | Pranith Kumar K | 2011-04-13 | 1 | -5/+5
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 2726 ([glusterfs-3.2.0qa11]: glusterfs server crashed due to stack overflow)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2726
* declare favorite child as int instead of unsigned int | Raghavendra Bhat | 2011-04-13 | 1 | -1/+1
    In the afr_private_t structure, favorite child is declared as unsigned
    int. In the init function of afr we set favorite child to -1 if that
    option is not found in the volfile. Because the field is an unsigned int,
    it ends up holding a huge value instead of -1, and the statedump file
    displays that huge value for favorite child.

    Signed-off-by: Raghavendra Bhat <raghavendrabhat@gluster.com>
    Signed-off-by: Anand Avati <avati@gluster.com>
    BUG: 2668 ([glusterfs-3.2.9qa7]: createbench error)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2668