summaryrefslogtreecommitdiffstats
path: root/xlators/cluster/afr/src/afr-self-heal-data.c
Commit message (Collapse)AuthorAgeFilesLines
* cluster/afr: Sparse file self-heal cangesPranith Kumar K2014-03-261-9/+28
| | | | | | | | | | | | | - Fix boundary condition for offset - Honour data-self-heal-algorithm option - Added tests for sparse file self-healing Change-Id: I14bb1c9d04118a3df4072f962fc8f2f197391d95 BUG: 1080707 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/7339 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: refactorAnand Avati2014-03-221-1616/+478
| | | | | | | | | | | | | | | | | | | | | | | | | | | | - Remove client side self-healing completely (opendir, openfd, lookup) - Re-work readdir-failover to work reliably in case of NFS - Remove unused/dead lock recovery code - Consistently use xdata in both calls and callbacks in all FOPs - Per-inode event generation, used to force inode ctx refresh - Implement dirty flag support (in place of pending counts) - Eliminate inode ctx structure, use read subvol bits + event_generation - Implement inode ctx refreshing based on event generation - Provide backward compatibility in transactions - remove unused variables and functions - make code more consistent in style and pattern - regularize and clean up inode-write transaction code - regularize and clean up dir-write transaction code - regularize and clean up common FOPs - reorganize transaction framework code - skip setting xattrs in pending dict if nothing is pending - re-write self-healing code using syncops - re-write simpler self-heal-daemon Change-Id: I1e4080c9796c8a2815c2dab4be3073f389d614a8 BUG: 1021686 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/6010 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Improvement in logging of self heal completion statusVenkatesh Somyajulu2013-08-291-13/+149
| | | | | | | | | | | Additional information for source and sinks are added. Change-Id: I1704956ff86ac3ae36744efe7499c1d1c43faeaf BUG: 968301 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/5638 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: Print self-heal log when self-heal succeedsPranith Kumar K2013-07-311-0/+59
| | | | | | | | | Change-Id: I95e47e589419dc6a032cbd8ba01964b6c176c2d5 BUG: 927146 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5408 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Let two data-self-heals compete in new domainPranith Kumar K2013-07-031-22/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: At the moment data-self-heal acquires locks in following pattern. It takes full file lock then gets xattrs on files on both replicas. Decides sources/sinks based on the xattrs. Now it acquires lock from 0-128k then unlocks the full file lock. Syncs 0-128k range from source to sink now acquires lock 128k+1 till 256k then unlocks 0-128k, syncs 128k+1 till 256k block... so on finally it takes full file lock again then unlocks the final small range block. It decrements pending counts and then unlocks the full file lock. This pattern of locks is chosen to avoid more than 1 self-heal to be in progress. BUT if another self-heal tries to take a full file lock while a self-heal is already in progress it will be put in blocked queue, further inodelks from writes by the application will also be put in blocked queue because of the way locks xlator grants inodelks. So until the self-heal is complete writes are blocked. Here is the code: xlators/features/locks/src/inodelk.c - line 225 if (__blocked_lock_conflict (dom, lock) && !(__owner_has_lock (dom, lock))) { ret = -EAGAIN; if (can_block == 0) goto out; gettimeofday (&lock->blkd_time, NULL); list_add_tail (&lock->blocked_locks, &dom->blocked_inodelks); } This leads to hangs in applications. Fix: Since we want to prevent two parallel self-heals. We let them compete in a separate "domain". Lets call the domain on which the locks have been taken on in previous approach as "data-domain". In the new approach When a self-heal is triggered, it acquires a full lock in the new domain "self-heal-domain". After this it performs data-self-heal using the locks in "data-domain" as before. unlock the full file lock in "self-heal-domain" With this approach, application's writevs don't have to wait in pending queue when more than 1 self-heal is triggered. Change-Id: Id79aef3dfa888945977fb9758374ac41c320d0d5 BUG: 967717 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5100 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Refactor inodelk to handle multiple domainsPranith Kumar K2013-07-031-14/+30
| | | | | | | | | | | | | | | | | | - afr_local_copy should not be memduping locked nodes, that would mean that lock is taken in self-heal on those nodes even before it actually takes the lock. So removed memdup code. Even entry lock related copying (lockee info) is also not necessary for self-heal functionality, so removing that as well. Since it is not local_copy anymore changed its name. - My editor changed tabs to spaces. Change-Id: I8dfb92cb8338e9a967c06907a8e29a8404782d61 BUG: 967717 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5099 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Allow data/entry self heal for metadata split-brainVenkatesh Somyajulu2013-07-021-12/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Currently whenever there is metadata split-brain, a variable sh->op_failed is set to 1 to denote that self heal got failed. But if we proceed for data self heal, even code-path of data self heal also relies on the sh->op_failed variable. So if will check for sh->op_failed variable and will eventually fails to do data self heal. So needed a mechanism to allow data self heal even if metadata is in split brain. Fix: Some data structure revamp is done in http://review.gluster.com/#/c/5106/ fix and this patch is based on the above fix. Now we can store which particular self-heal got failed i.e GFID_OR_MISSING_ENTRY_SELF_HEAL, METADATA, DATA, ENTRY. And we can do two types of self heal failure check. 1. Individual type check: We can check which among all four (Metadata, Data, Gfid or missing entry, entry self heal) got failed. 2. In afr_self_heal_completion_cbk, we need to make check based on the fact that if any specific self heal got failed treat the complete self heal as failure so that it will populate corresponding circular buffer of event history accordingly. Change-Id: Icb91e513bcc752386fc8a78812405cfabe5cac2d BUG: 977797 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/5253 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Improvement in logging of self heal completion statusVenkatesh Somyajulu2013-06-131-15/+19
| | | | | | | | | | | | | | | | | | | | | | Problem: As the end of the self heal, message logged by "afr_self_heal_completion_cbk" is inadequate to determine what exactly failed during the course of afr self heal. It is worth to have knowledge of what all types of self heal got triggered for an entity and whether the status is success or failure. Fix: At the end of self heal, it will log information about out of 4 types of self heal (gfid or missing entry self heal, metadata, data and entry self heal), who all got triggered and who all got failed or successful at the end. Change-Id: I5360762fbd7d391ac4c6af6706b4835c5801835a BUG: 968301 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/5106 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: fsync before erase xattrs in data self-healPranith Kumar K2013-03-281-1/+75
| | | | | | | | | | | | Added extra fsync to data self-heal code to make sure the data reached disk before erasing the changelogs Change-Id: I9e7e6e55cdc49de2b991705d1638946464a9d4f9 BUG: 927146 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4744 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: Preserve mtime in self-healPranith Kumar K2013-03-121-14/+25
| | | | | | | | | | | | | | | | | | | | | | | Problem: Data self-heal may choose sink iatt to set mtimes. This happens because after syncing of data is done self-heal does one more xattrops/fstat to determine sources sinks to set the inode-ctx. Since this is done after data syncing and erase of xattrs, old source and old sink are now sources, but the mtimes of them differ. Old code just takes the first source from the list and update mtimes, which could be sink before the self-heal started. Fix: Set mtime from 'sources before syncing'. Change-Id: Id769e1b99aa4f041eaee775f64cbf2c57b799723 BUG: 918437 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4658 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/afr: added logging of changelog for split-brain in glustershd.log fileVenkatesh Somyajula2013-02-031-5/+2
| | | | | | | | | | Change-Id: Iaf119f839cb2113b8f8efb7bf7636d471b6541bf BUG: 866440 Signed-off-by: Venkatesh Somyajula <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/4385 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/afr: Remember type of split-brain in inode-ctxPranith Kumar K2012-12-111-9/+3
| | | | | | | | | | | | | Along with this change, fixed the race of setting the split-brain status in inode-ctx after unwinding the fop from self-heal in case of back-ground self-heal. Change-Id: Ifc829300df485f50f139443802e8b6dc7038b4ad BUG: 873962 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4198 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: Empty string should not be default option valPranith Kumar K2012-12-051-0/+4
| | | | | | | | | | | | Glusterd does not allow empty string as default value. Changed afr option values to disallow empty string as value. Change-Id: I92a2d658907dbc6101e1139dd91f548acb5506f5 BUG: 859927 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4271 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* afr: use data trylock mode in read/write self-heal trigger pathsBrian Foster2012-12-041-1/+8
| | | | | | | | | | | | | | | | | | | | | | Self-heal data lock contention between clients and glustershd instances can lead to long wait and user response times if the client ends up pending its lock on glustershd self-heal of a large file. We have reports of guest vm instances going completely unresponsive during self-heal of virtual disk images. Optimize the read/write self-heal trigger codepath (i.e., afr_open_fd_fix()) to trylock for self-heal and skip the self-heal otherwise to minimize the likelihood of a running/active guest of competing with glustershd on arrival of a brick. Note that lock contention is still possible from the client (e.g., via lookup). BUG: 874045 Change-Id: I406443c061ff6acd2a851179626b78352caa5c03 Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/4258 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* afr: support self-heal data trylock mechanismBrian Foster2012-12-041-6/+12
| | | | | | | | | | | | | | | Introduce a block flag to support an optional blocking or non-blocking mode in the self-heal data locking mechanism. All callers are modified to use blocking mode, which is the current default behavior (no change in behavior is introduced by this commit). BUG: 874045 Change-Id: Ib7ff9984578fa11de4e3b6981508100cdddd37cd Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/4257 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* afr: send unique dict_t instances to replicas in self-heal fxattropBrian Foster2012-11-291-28/+42
| | | | | | | | | | | | | | | | | | | | | afr_sh_data_fxattrop() currently allocates and sends a single xattr dict_t instance to each replica. The callback codepath references the returned object in the self-heal in-memory state for the particular replica. If storage/posix is in the same address-space (i.e., running a single glusterfs client with a fuse->afr->posix graph), the same object is modified and returned for each child, causing corrupted in-memory state and afr xattrs. Allocate and send independent xattr dict_t's for each replica. This allows self-heal to work correctly in a single address-space graph. BUG: 868478 Change-Id: I42832e85b5d1abb6098c28944c717e129300109e Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/4149 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Cluster/afr: Fix output for gluster volume heal vn info healedVenkatesh Somyajula2012-11-261-0/+2
| | | | | | | | | | | | | | | | | | | | | | Problem: Whenever gluster volume heal vol full command is executed, the entries stored in the circual buffer for sh->healed are added in the dictionary in the _crawl_post_sh_action function irrespective of whether actual self heal (due to non-zero values in chage log) takes place or not. Fix: Value of key (actual-sh-done) will be set to 1 whenever self heal takes place due to non-zero change log values and if for some FOP self heal daemon finds that no self heal required after examining the pending matrix, the value will be 0. Change-Id: I11fd0b9ee76759af17c5bca6bfafbaf66bcaacbc BUG: 863068 Signed-off-by: Venkatesh Somyajula <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/4181 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* afr: Avoid excessive logging in self-heal.Krishnan Parthasarathi2012-08-231-2/+2
| | | | | | | | | | | | | | - (Excessive) Logging has been very useful as 'bread-crumbs' in many a root-cause analyses. This patch aims at avoiding logging when the information could be reconstructed using the xattrs, statedump, and/or "volume heal" CLI commands. Change-Id: Iebc6b10ae18f0dd9704bdc6dd03bcfe0f2a09abd BUG: 844804 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/3805 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: Modified split-brain handlingPranith Kumar K2012-07-261-26/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | RCA The bug is observed because the decision to mark a file in split-brain is taken outside appropriate locks. Lookup gathers xattrs outside any lock. The xattrs being in split-brain in lookup should only be taken as a hint. Appropriate inodelks should be taken before confirming a split-brain. Self-heal confirms this at the moment. If data/metadata self-heal is turned off, inspecting of xattrs could not be performed so split-brain behavior does not work correctly if the self-heal options are turned off. Fix Self-heals are launched to inspect xattrs even when the data/metadata self-heal options are turned off. The decision to heal data/metadata after the xattrs are inspected is based on whether the options are turned on/off. So decision to set/reset split-brain flag is taken inside appropriate locks. Testcases: tests 33-36 in https://github.com/pranithk/gluster-tests/blob/master/afr/self-heal.sh Change-Id: Ia8aeab08208b50c06609ad35a9d72f3d553ee343 BUG: 833727 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/3626 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: Perform data self-heal for non regular filesPranith Kumar K2012-07-251-86/+60
| | | | | | | | | | | | | | | | | | | | | | | | | RCA: Data self-heal for non regular files open the files and then proceeds using that fd. This approach does not work for symlinks because open on symlink opens the file resolved by it. Fix: If the file is not a regular file then perform self-heal using loc. It needs to get 'big' lock and then perform lookup to get changelog then erase data part of chagelog, then unlock. Test cases: Automated at https://github.com/pranithk/gluster-tests/blob/master/afr/special-file-self-heal-test.sh Change-Id: I924a922f5135872efe2cccf2e712ada082c5689f BUG: 811317 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/3724 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* remove useless if-before-free (and free-like) functionsJim Meyering2012-07-131-4/+2
| | | | | | | | | | | | See comments in http://bugzilla.redhat.com/839925 for the code to perform this change. Signed-off-by: Jim Meyering <meyering@redhat.com> BUG: 839925 Change-Id: I10e4ecff16c3749fe17c2831c516737e08a3205a Reviewed-on: http://review.gluster.com/3661 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: Don't reset split-brain when data-self-heal is offPranith Kumar K2012-06-191-0/+4
| | | | | | | | | BUG: 804606 Change-Id: I8cefcb6efa687fac4ad412403c085b3767218f72 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/3586 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* replicate: add hashed read-child method.Jeff Darcy2012-05-311-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Both the first-to-respond method and the round-robin method are susceptible to clients repeatedly choosing the same servers across a series of opens, creating hot spots. Also, the code to handle a replica being down will ignore both methods and just choose the first remaining (which is not an issue for two-way but can be otherwise). The hashed method more reliably avoids such hot spots. There are three values/modes. 0: use the old (broken) methods. 1: select a read-child based on a hash of the file's GFID, so all clients will choose the same subvolume for a file (ensuring maximum consistency) but will distribute load for a set of files. 2: select a read-child based on a hash of the file's GFID plus the client's PID, so different children will distribute load even for one file. Mode 2 will probably be optimal for most cases. Using response time when we open the file is problematic, both because a single sample might not have been representative even then and because load might have shifted in the hours or days since (for long-lived files). Trying to use more current load information can lead to "herd following" behavior which is just as bad. Pseudo-random distribution is likely to be the best we can reasonably do, just as it is for DHT. Change-Id: I798c2760411eacf32e82a85f03bb7b08a4a49461 BUG: 802513 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.com/2926 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: Release inodelk on erase changelog failuresPranith Kumar K2012-05-231-0/+3
| | | | | | | | | Change-Id: I58271e1ac5a116b5bc717d7cad9f03eb7dc8a1a4 BUG: 811551 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/3417 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: Enforce order in pre/post opPranith Kumar K2012-05-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The xattrop order in pre/post op on all the subvols is client-0, client-1... client-n where n is (replica-count - 1). This order can lead to invalid split-brains if the brick dies in the middle of xattrops. Example: transaction completed pre-op, so on all the subvolumes xattrs have '1' changelog. Now post-op is sent to both the subvols. On subvol-0 change-log of client-0 is decremented to 0, before decrementing change-log of client-1 to 0 the brick dies. This change-log status on subvol-0 gives the meaning that a change is done on subvol-0 successfully but on subvol-1 it failed. Which is not what happened. Changes done when the subvol-0 was down will lead to pending change-log on subvol-1 for subvol-0. Which is correct. When the subvol-0 is brought back up, the change-log will be in split-brain state even when it is not a legitimate split-brain. If the brick dies in the middle of xattrops it should remain fool. Pre-op should perform xattrop of the local change-log first and post-op should perform xattrop of the local change-log last. In case of optimistic changelogs txn_changelog should be done last on local if it succeeds, first if it fails. Change-Id: Ib6eeb20cdc49b0b1fd2f454f25a9c8e08388c6e7 BUG: 765194 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/3226 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: Perform conservative merge on dir with xattr split-brainPranith Kumar K2012-05-181-3/+14
| | | | | | | | | | Change-Id: I96d59ad239c2c5efee14dd4b01a10a3f565d491e BUG: 765587 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/3091 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: Handle files w.o. xattrs and size mismatch.Pranith Kumar K2012-05-181-1/+9
| | | | | | | | | | Change-Id: Ia27ee996bed8f5915c154718bf6e859b6a2fc335 BUG: 765587 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/3090 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: Mark zero size file as sink in absense of xattrs.Pranith Kumar K2012-05-181-1/+1
| | | | | | | | | | Change-Id: I4500f39a49ee16e6e88451dcf147d9f49b1d749e BUG: 765587 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/3089 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: Determining sources should do both fxattrop, fstatPranith Kumar K2012-05-181-95/+64
| | | | | | | | | Change-Id: Ifab37db2af8d489cd516e992b7423c765dcabc4f BUG: 765587 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/3088 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* license: dual license under GPLV2 and LGPLV3+Kaleb KEITHLEY2012-05-101-14/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Note that the license was not changed in any of the following: .../argp-standalone/... .../booster/... .../cli/... .../contrib/... .../extras/... .../glusterfsd/... .../glusterfs-hadoop/... .../mod_clusterfs/... .../scheduler/... .../swift/... The license was not changed in any of the non-building xlators. The license was not changed in any of the xlators that seemed — to me — to be clearly server-side only, e.g. protocol/server Note too that copyright was changed along with the license; I did not change the copyright in files where the license did not change. If you find any errors or ommissions please don't hesitate to let me know. The complete list of files with the license change is: libglusterfs/src/byte-order.h libglusterfs/src/call-stub.c libglusterfs/src/call-stub.h libglusterfs/src/checksum.c libglusterfs/src/checksum.h libglusterfs/src/circ-buff.c libglusterfs/src/circ-buff.h libglusterfs/src/common-utils.c libglusterfs/src/common-utils.h libglusterfs/src/compat-errno.c libglusterfs/src/compat-errno.h libglusterfs/src/compat.c libglusterfs/src/compat.h libglusterfs/src/daemon.c libglusterfs/src/daemon.h libglusterfs/src/defaults.c libglusterfs/src/defaults.h libglusterfs/src/dict.c libglusterfs/src/dict.h libglusterfs/src/event-history.c libglusterfs/src/event-history.h libglusterfs/src/event.c libglusterfs/src/event.h libglusterfs/src/fd-lk.c libglusterfs/src/fd-lk.h libglusterfs/src/fd.c libglusterfs/src/fd.h libglusterfs/src/gf-dirent.c libglusterfs/src/gf-dirent.h libglusterfs/src/globals.c libglusterfs/src/globals.h libglusterfs/src/glusterfs.h libglusterfs/src/graph-print.c libglusterfs/src/graph-utils.h libglusterfs/src/graph.c libglusterfs/src/hashfn.c libglusterfs/src/hashfn.h libglusterfs/src/iatt.h libglusterfs/src/inode.c libglusterfs/src/inode.h libglusterfs/src/iobuf.c libglusterfs/src/iobuf.h libglusterfs/src/latency.c libglusterfs/src/latency.h libglusterfs/src/list.h libglusterfs/src/lkowner.h libglusterfs/src/locking.h libglusterfs/src/logging.c libglusterfs/src/logging.h libglusterfs/src/mem-pool.c libglusterfs/src/mem-pool.h libglusterfs/src/mem-types.h libglusterfs/src/options.c libglusterfs/src/options.h libglusterfs/src/rbthash.c libglusterfs/src/rbthash.h libglusterfs/src/run.c libglusterfs/src/run.h libglusterfs/src/scheduler.c libglusterfs/src/scheduler.h libglusterfs/src/stack.c libglusterfs/src/stack.h libglusterfs/src/statedump.c libglusterfs/src/statedump.h libglusterfs/src/syncop.c libglusterfs/src/syncop.h libglusterfs/src/syscall.c libglusterfs/src/syscall.h libglusterfs/src/timer.c libglusterfs/src/timer.h libglusterfs/src/trie.c libglusterfs/src/trie.h libglusterfs/src/xlator.c libglusterfs/src/xlator.h libglusterfsclient/src/libglusterfsclient-dentry.c libglusterfsclient/src/libglusterfsclient-internals.h libglusterfsclient/src/libglusterfsclient.c libglusterfsclient/src/libglusterfsclient.h rpc/rpc-lib/src/auth-glusterfs.c rpc/rpc-lib/src/auth-null.c rpc/rpc-lib/src/auth-unix.c rpc/rpc-lib/src/protocol-common.h rpc/rpc-lib/src/rpc-clnt.c rpc/rpc-lib/src/rpc-clnt.h rpc/rpc-lib/src/rpc-transport.c rpc/rpc-lib/src/rpc-transport.h rpc/rpc-lib/src/rpcsvc-auth.c rpc/rpc-lib/src/rpcsvc-common.h rpc/rpc-lib/src/rpcsvc.c rpc/rpc-lib/src/rpcsvc.h rpc/rpc-lib/src/xdr-common.h rpc/rpc-lib/src/xdr-rpc.c rpc/rpc-lib/src/xdr-rpc.h rpc/rpc-lib/src/xdr-rpcclnt.c rpc/rpc-lib/src/xdr-rpcclnt.h rpc/rpc-transport/rdma/src/name.c rpc/rpc-transport/rdma/src/name.h rpc/rpc-transport/rdma/src/rdma.c rpc/rpc-transport/rdma/src/rdma.h rpc/rpc-transport/socket/src/name.c rpc/rpc-transport/socket/src/name.h rpc/rpc-transport/socket/src/socket.c rpc/rpc-transport/socket/src/socket.h xlators/cluster/afr/src/afr-common.c xlators/cluster/afr/src/afr-dir-read.c xlators/cluster/afr/src/afr-dir-read.h xlators/cluster/afr/src/afr-dir-write.c xlators/cluster/afr/src/afr-dir-write.h xlators/cluster/afr/src/afr-inode-read.c xlators/cluster/afr/src/afr-inode-read.h xlators/cluster/afr/src/afr-inode-write.c xlators/cluster/afr/src/afr-inode-write.h xlators/cluster/afr/src/afr-lk-common.c xlators/cluster/afr/src/afr-mem-types.h xlators/cluster/afr/src/afr-open.c xlators/cluster/afr/src/afr-self-heal-algorithm.c xlators/cluster/afr/src/afr-self-heal-algorithm.h xlators/cluster/afr/src/afr-self-heal-common.c xlators/cluster/afr/src/afr-self-heal-common.h xlators/cluster/afr/src/afr-self-heal-data.c xlators/cluster/afr/src/afr-self-heal-entry.c xlators/cluster/afr/src/afr-self-heal-metadata.c xlators/cluster/afr/src/afr-self-heal.h xlators/cluster/afr/src/afr-self-heald.c xlators/cluster/afr/src/afr-self-heald.h xlators/cluster/afr/src/afr-transaction.c xlators/cluster/afr/src/afr-transaction.h xlators/cluster/afr/src/afr.c xlators/cluster/afr/src/afr.h xlators/cluster/afr/src/pump.c xlators/cluster/afr/src/pump.h xlators/cluster/dht/src/dht-common.c xlators/cluster/dht/src/dht-common.h xlators/cluster/dht/src/dht-diskusage.c xlators/cluster/dht/src/dht-hashfn.c xlators/cluster/dht/src/dht-helper.c xlators/cluster/dht/src/dht-inode-read.c xlators/cluster/dht/src/dht-inode-write.c xlators/cluster/dht/src/dht-layout.c xlators/cluster/dht/src/dht-linkfile.c xlators/cluster/dht/src/dht-mem-types.h xlators/cluster/dht/src/dht-rebalance.c xlators/cluster/dht/src/dht-rename.c xlators/cluster/dht/src/dht-selfheal.c xlators/cluster/dht/src/dht.c xlators/cluster/dht/src/nufa.c xlators/cluster/dht/src/switch.c xlators/cluster/stripe/src/stripe-helpers.c xlators/cluster/stripe/src/stripe-mem-types.h xlators/cluster/stripe/src/stripe.c xlators/cluster/stripe/src/stripe.h xlators/features/index/src/index-mem-types.h ¹ xlators/features/index/src/index.c ¹ xlators/features/index/src/index.h ¹ xlators/performance/io-cache/src/io-cache.c xlators/performance/io-cache/src/io-cache.h xlators/performance/io-cache/src/ioc-inode.c xlators/performance/io-cache/src/ioc-mem-types.h xlators/performance/io-cache/src/page.c xlators/performance/io-threads/src/io-threads.c xlators/performance/io-threads/src/io-threads.h xlators/performance/io-threads/src/iot-mem-types.h xlators/performance/md-cache/src/md-cache-mem-types.h xlators/performance/md-cache/src/md-cache.c xlators/performance/quick-read/src/quick-read-mem-types.h xlators/performance/quick-read/src/quick-read.c xlators/performance/quick-read/src/quick-read.h xlators/performance/read-ahead/src/page.c xlators/performance/read-ahead/src/read-ahead-mem-types.h xlators/performance/read-ahead/src/read-ahead.c xlators/performance/read-ahead/src/read-ahead.h xlators/performance/symlink-cache/src/symlink-cache.c xlators/performance/write-behind/src/write-behind-mem-types.h xlators/performance/write-behind/src/write-behind.c xlators/protocol/auth/addr/src/addr.c ¹ xlators/protocol/auth/login/src/login.c ¹ xlators/protocol/client/src/client-callback.c xlators/protocol/client/src/client-handshake.c xlators/protocol/client/src/client-helpers.c xlators/protocol/client/src/client-lk.c xlators/protocol/client/src/client-mem-types.h xlators/protocol/client/src/client.c xlators/protocol/client/src/client.h xlators/protocol/client/src/client3_1-fops.c ¹ Copyright only, license reverted to original Change-Id: If560e826c61b6b26f8b9af7bed6e4bcbaeba31a8 BUG: 820551 Signed-off-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.com/3304 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* self-heald: Find self-heal failures, split-brainPranith Kumar K2012-04-051-1/+2
| | | | | | | | | | Change-Id: Ib967f0fe0b537fe60e51d7d05462b58a7f16596e BUG: 806745 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/3077 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: Handle self-heal of files with holesPranith Kumar K2012-04-041-7/+9
| | | | | | | | | Change-Id: I6c04fe3022f234455d52620f42b9add80fc6abe4 BUG: 765424 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/3065 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: handle fstat failure in data-self-healPranith Kumar K2012-03-291-1/+1
| | | | | | | | | | | | | | | | The final fstat which makes the call_count 0 could be a failure. In that case the buf could either be NULL or buf is all zeros. If buf is NULL then it will crash, if it is all zeros buf->ia_type will be IA_INVAL and it proceeds to special file fix. sh->type is assigned with the ia_type of the file to be healed. I modified the code to depend on that instead. Change-Id: Icf7e19ff5908207128f2a1ee2963ad6b791c1ac5 BUG: 804645 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/3031 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: Handle afr data self-heal failures gracefullyPranith Kumar K2012-03-281-51/+137
| | | | | | | | | Change-Id: I5f91098111a8ca29982f3b4292e2109e4a36cce1 BUG: 765373 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/2662 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* core: adding extra data for fopsAmar Tumballi2012-03-221-18/+20
| | | | | | | | | | | | | with this change, the xlator APIs will have a dictionary as extra argument, which is passed between all the layers. This can be utilized for overloading in some of the operations. Change-Id: I58a8186b3ef647650280e63f3e5e9b9de7827b40 Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 782265 Reviewed-on: http://review.gluster.com/2960 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: Perform xattrop with all afr-keysPranith Kumar K2012-01-271-46/+3
| | | | | | | | | | | | | | | | | | | | | | | | Self-heal does not happen if the file has change log xattr only for one of the subvol keys. This patch makes sure that xattrop is done for all the afr subvol keys after a new entry is created in entry-self-heal. 1) Added matrix create/cleanup functions 2) Impunging a new file does multiple xattrops on the source subvol, one per sink. The code can do a single xattrop after the entry is created on all the sinks. 3) Missing entry self-heal uses one frame per sink to heal the file. This leads to multiple xattrops on the source subvol. That code is changed now to use one frame which will create the file on all subvols. Change-Id: I65a42f9779b03f7efae283479f8653fb2cb8046b BUG: 762680 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/2503 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Krishnan Parthasarathi <kp@gluster.com>
* core: change lk-owner as a 1k bufferAmar Tumballi2012-01-241-12/+16
| | | | | | | | | | | | | so, NLM can send the lk-owner field directly to the locks translators, while doing the same effort, also enabled sending maximum of 500 aux gid over protocol. Change-Id: I87c2514392748416f7ffe21d5154faad2e413969 Signed-off-by: Amar Tumballi <amar@gluster.com> BUG: 767229 Reviewed-on: http://review.gluster.com/779 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* cluster/afr: Handle split-brain/all-fool xattrs for directoryPranith Kumar K2011-12-271-3/+8
| | | | | | | | | | | | In case of split-brain/all-fool xattrs perform conservative merge. Don't treat ignorant subvol as fool. Change-Id: I6ddf89949cd5793c2abbead7c47f091e8461f1d4 BUG: 765528 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/2521 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: Remove treating sh_frame as special loop_framePranith Kumar K2011-11-231-7/+1
| | | | | | | | Change-Id: I0d87f06f989b2d4b971967c52d4898331693a801 BUG: 3675 Reviewed-on: http://review.gluster.com/735 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: Fix memory leaksPranith Kumar K2011-11-231-15/+5
| | | | | | | | Change-Id: I79a1c70c47649fbcf236191f174d766d5806545c BUG: 3805 Reviewed-on: http://review.gluster.com/719 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr : Fix self-heal of special filesKaushal M2011-10-311-1/+33
| | | | | | | | | | | | | | | | Fixes self-heal of special files like device files, fifo files, socket files etc. Does it by doing the following: * Prevent setting of pending data xattr on a special file during entry self-heal when a new fils is created. * Allow data self-heal to be started on all file types other than directories. During data self-heal, for special files just erase pending xattrs, if those xattrs were set by previous releases of glusterfs. Change-Id: I34d8121e23ad00e85371ae2a36ef30cf3bd5db7a BUG: 3525 Reviewed-on: http://review.gluster.com/618 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pranithk@gluster.com>
* cluster/afr: Make data selfheal trigger to be configurable.Pranith Kumar K2011-09-081-1/+2
| | | | | | | | | | | | | | | | | | | | By default, lookup triggers data self-heal but that is not the preferred way of operating replicated volumes. We would like the data self heals to be triggered in open instead. Number of back-ground self-heals allowed is 16 and lookups block until self-heal is completed. We want to prevent blocking in fops. We can not make lookups independent of self-heal frames because when there are gfid conflicts the decision of which file is correct is determined in self-heal phase. So in afr, lookup self-heal is going to guarantee name space consistency and open/fd fops will take responsibility for data consistency, these are non blocking. The user needs to set the option cluster.data-self-heal "open" for this behavior. Change-Id: If9463cdb9ebac114708558ec13bbca0270acd659 BUG: 3503 Reviewed-on: http://review.gluster.com/334 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* Eliminate many "var set but not used" warnings with newer gcc.Jeff Darcy2011-09-071-13/+0
| | | | | | | | | | | | | | | | This fixes ~200 such warnings, but leaves three categories untouched. (1) Rpcgen code. (2) Macros which set variables in the outer (calling function) scope. (3) Variables which are set via function calls which may have side effects. Change-Id: I6554555f78ed26134251504b038da7e94adacbcd BUG: 2550 Reviewed-on: http://review.gluster.com/371 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* cluster/afr: Prevent double big lock when data self-heal loops are not spawnedPranith Kumar K2011-09-061-1/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The steps in normal data self heal: 1) take big lock by self-heal frame. Get the xattrs/stat to decide source, sink information. 2) spawn loop frames which perform self-heal by taking small locks on the file. Every time a new lock is taken and the old lock is released. 3) Before releasing the final small lock a big lock is taken by the self-heal frame, and unlock on small-lock. Erasing of the pending xattrs happen then the big unlock happen and that is the end of the data self-heal. When a data self-heal is needed for a file and the fop that triggers the self-heal is open with O_TRUNC. Fuse sends open then an explicit truncate for this. Open triggers the self-heal but by the time it tries to spawn the loops the file size is truncated to 0, so no loops are formed. These are the steps: 1) Take big lock by self-heal frame. Get the xattrs/stat to decide source, sink information. 2) loop frames are not spawned. The big lock is not released. 3) One more big lock is taken by the same self-heal frame, Erasing of the pending xattrs etc happen, now it does two big unlocks, but after the first unlock, the information on which the locks were performed is forgotten, so the next unlock becomes a no-op. So there is a stale big lock on that file preventing further writes. As a fix, if the loops are not spawned, use the previous big lock to perform the rest of the operations needed in completing the data self-heal. No need to have one more big lock. Change-Id: Id03171269594e447b2b6d1331e362d83bd1e3430 BUG: 3506 Reviewed-on: http://review.gluster.com/339 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* cluster/afr: Perform flush on all the children involved in self-healPranith Kumar K2011-08-221-19/+6
| | | | | | | | Change-Id: I66362a3087a635fb7b759d7836a1f6564a6a7fc9 BUG: 3456 Reviewed-on: http://review.gluster.com/294 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: Perform self-heal without locking the whole filePranith Kumar K2011-08-201-276/+433
| | | | | | | | Change-Id: I206571c77f2d7b3c9f9d7bb82a936366fd99ce5c BUG: 3182 Reviewed-on: http://review.gluster.com/141 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: Update fresh_children in lookup if no other ops in progressPranith Kumar K2011-08-191-1/+1
| | | | | | | | | | | | | If write/truncate fails we should remove the child that failed the fop from the fresh children. The previous code assumes that the children that succeeded the fop are fresh children, which is wrong. Fixed that in this patch. Change-Id: I1e6e21e20faea00516a0fdd2e95f2d7e9cf9076d BUG: 3411 Reviewed-on: http://review.gluster.com/263 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* Change Copyright current yearPranith Kumar K2011-08-101-1/+1
| | | | | | | | Change-Id: I2d10f2be44f518f496427f257988f1858e888084 BUG: 3348 Reviewed-on: http://review.gluster.com/200 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* LICENSE: s/GNU Affero General Public/GNU General Public/Pranith Kumar K2011-08-061-3/+3
| | | | | | | | Change-Id: I3914467611e573cccee0d22df93920cf1b2eb79f BUG: 3348 Reviewed-on: http://review.gluster.com/182 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* cluster/afr: Fix conflict files and gfid self-healPranith K2011-07-171-23/+7
| | | | | | | | Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Signed-off-by: Anand Avati <avati@gluster.com> BUG: 2745 (failure to detect split brain) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2745