summaryrefslogtreecommitdiffstats
path: root/tests/bugs
Commit message (Collapse)AuthorAgeFilesLines
* nfs: AUTH support for exported sub-directoriesRajesh Joseph2013-07-093-37/+77
| | | | | | | | | | | | | | | | | | | | | Problem: NFS allows exporting subdirectories but there is not support for providing AUTH on per directory basis. Fix: Modified nfs.export-dir to include AUTH parameters e.g. nfs.export-dir "/dir1(10.1.1.2),/dir2(10.1.1.0/24|host1) During mount operation NFS will check if the IP from where the connection is made is configured in the AUTH parameter, else the mount operation will fail with EACCES error. Updated admin-guide and volume set help message. Change-Id: I5c6d22edb168b4f46376d1cd6878cd065fc081cc BUG: 968227 Signed-off-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-on: http://review.gluster.org/5124 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cli: Fix remove brick cli out for wrong volume nameVenkatesh Somyajulu2013-07-041-0/+32
| | | | | | | | | | | | | | | | | | Problem: gluster volume remove-brick command, was not printing the error in case of volume-name specified is wrong. Fix: Fix will print error message to indicate that provided volume name is invalid. Although patch for bug 961669 http://review.gluster.org/#/c/4975/ does print cli-output now, but still xml is unable to use the response values Change-Id: I2ee1df86c1e756fb8e93b4d6bbdd102b4f368f87 BUG: 961307 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/4972 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Provide an option to disable afr durabilityPranith Kumar K2013-07-031-0/+42
| | | | | | | | | Change-Id: I40eec20ca6b3f857245a2438883822e251077ee9 BUG: 979365 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5269 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: More checks before starting rebalance/remove-brickKaushal M2013-07-021-0/+33
| | | | | | | | | | | | Check if a previous remove-brick operation has been committed before starting a new rebalance/remove-brick task. Change-Id: I553e5ba64a6a352ca91032ab1a17997051a4494e BUG: 963541 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/5019 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/afr: Allow data/entry self heal for metadata split-brainVenkatesh Somyajulu2013-07-021-0/+114
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Currently whenever there is metadata split-brain, a variable sh->op_failed is set to 1 to denote that self heal got failed. But if we proceed for data self heal, even code-path of data self heal also relies on the sh->op_failed variable. So if will check for sh->op_failed variable and will eventually fails to do data self heal. So needed a mechanism to allow data self heal even if metadata is in split brain. Fix: Some data structure revamp is done in http://review.gluster.com/#/c/5106/ fix and this patch is based on the above fix. Now we can store which particular self-heal got failed i.e GFID_OR_MISSING_ENTRY_SELF_HEAL, METADATA, DATA, ENTRY. And we can do two types of self heal failure check. 1. Individual type check: We can check which among all four (Metadata, Data, Gfid or missing entry, entry self heal) got failed. 2. In afr_self_heal_completion_cbk, we need to make check based on the fact that if any specific self heal got failed treat the complete self heal as failure so that it will populate corresponding circular buffer of event history accordingly. Change-Id: Icb91e513bcc752386fc8a78812405cfabe5cac2d BUG: 977797 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/5253 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* libglusterfs: Fix valid_host_name for consecutive dotsKaushal M2013-07-021-0/+21
| | | | | | | | | | | | | The valid_host_name() function must reject addresses with consecutive dots in them. Change-Id: I1749de80c66e8fbad63b2e014f79e0203906030e BUG: 977246 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/5249 Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Handle NULL fdctx in fsyncPranith Kumar K2013-06-271-0/+29
| | | | | | | | | | | | | | | | | | | Problem: If fdctx is NULL in afr_fsync, process crashes because of NULL dereference. Fix: if fdctx is NULL, always say witnessed unstable write so that fsyncs are done properly. Handled fdctx being null in afr_delayed_changelog_post_op otherwise fsync stub is never resumed and the mount was hanging. Change-Id: Icacc900e9be63c29db3325cb0e19cc250adebaac BUG: 978794 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5258 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* nfs: Remove afr split-brain handling in nfsPranith Kumar K2013-06-251-0/+36
| | | | | | | | | | | | | | | | We added this code as an interim fix until afr can handle split-brains even when opens are not issued. Afr code has matured to reject fd based fops when there are split-brains so we can remove it. Change-Id: Ib337f78eccee86469a5eaabed1a547a2cea2bdcf BUG: 974972 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5227 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Fix fd/memory leak on fsyncPranith Kumar K2013-06-241-0/+27
| | | | | | | | | Change-Id: I764883811e30ca9d9c249ad00b6762101083a2fe BUG: 976800 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5248 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* cluster/dht: In reconfig handle removed decommissioned nodesshishir gowda2013-06-211-0/+48
| | | | | | | | | | | | | | If no decommissioned nodes options are set in the options, then clear the conf->decommissioned_bricks. Change-Id: I426d2bcc874aab21b2eba0b16a580b9a26672ea2 Signed-off-by: shishir gowda <sgowda@redhat.com> BUG: 973073 Reviewed-on: http://review.gluster.org/5199 Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* rpc: duplicate request cache for nfsRajesh Amaravathi2013-06-211-0/+23
| | | | | | | | | | | | | | | Duplicate request cache provides a mechanism for detecting duplicate rpc requests from clients. DRC caches replies and on duplicate requests, sends the cached reply instead of re-processing the request. Change-Id: I3d62a6c4aa86c92bf61f1038ca62a1a46bf1c303 BUG: 847624 Signed-off-by: Rajesh Amaravathi <rajesh@redhat.com> Reviewed-on: http://review.gluster.org/4049 Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: Perform delayed changelog wakeups for anon fdPranith Kumar K2013-06-201-0/+33
| | | | | | | | | | | | | | | | | | | | | | | | Problem: Nfs xlator never does open on a file for performing writes, afr does not perform changelog wakeup for this fd so operations which do metadata operations as soon as the data operations are completed perceive a delay of 'post-op-delay-secs'. Fix: Perform changelog wakeup on anon-fd if the fd with same pid is not present in inode-list. Note: This approach is a short-term fix. A proper fix needs a new domain for taking metadata locks so that data/metadata locks don't compete with each other. Change-Id: I253afb289eadf30c7951e56fb2c4840d7132f5e4 BUG: 966018 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5066 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* nfs: option to disable aclRajesh Amaravathi2013-06-151-0/+14
| | | | | | | | | | | | | | | | 1. Option to disable or enable acl with nfs.acl boolean option. 2. Deregister the acl service with the portmapper service when no longer required. Change-Id: I6562b6b40138d040aa2bf1e5641f4c0e0e9f9d09 BUG: 970070 Signed-off-by: Rajesh Amaravathi <rajesh@redhat.com> Reviewed-on: http://review.gluster.org/5136 Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterfs: discard (hole punch) supportBrian Foster2013-06-131-0/+56
| | | | | | | | | | | | | | | | Add support for the DISCARD file operation. Discard punches a hole in a file in the provided range. Block de-allocation is implemented via fallocate() (as requested via fuse and passed on to the brick fs) but a separate fop is created within gluster to emphasize the fact that discard changes file data (the discarded region is replaced with zeroes) and must invalidate caches where appropriate. BUG: 963678 Change-Id: I34633a0bfff2187afeab4292a15f3cc9adf261af Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/5090 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* gluster: add fallocate fop supportBrian Foster2013-06-131-0/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement support for the fallocate file operation. fallocate allocates blocks for a particular inode such that future writes to the associated region of the file are guaranteed not to fail with ENOSPC. This patch adds fallocate support to the following areas: - libglusterfs - mount/fuse - io-stats - performance/md-cache,open-behind - quota - cluster/afr,dht,stripe - rpc/xdr - protocol/client,server - io-threads - marker - storage/posix - libgfapi BUG: 949242 Change-Id: Ice8e61351f9d6115c5df68768bc844abbf0ce8bd Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/4969 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/dht: Make sure loc has gfidPranith Kumar K2013-06-101-0/+1
| | | | | | | | | | | | | | | | | | Problem: In some code paths neither loc->gfid nor loc->inode->gfid is populated which leads to EINVAL for linkfile setattr in dht_linkfile_attr_heal. Fix: Populate loc->gfid before dht_linkfile_attr_heal. Change-Id: I062770e6f6eaead304eff1dae81f8588a3b97eed BUG: 971805 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5178 Reviewed-by: Shishir Gowda <sgowda@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* mgmt/glusterd: Set task op at the time of task-id setPranith Kumar K2013-06-061-0/+30
| | | | | | | | | | | | | | | | | | Problem: If a remove-brick start is executed on m1 with brick from m2 on local subvolume no rebalance process is launched. Because of this volinfo->rebal.op is not set. This leads to volume status failures. Fix: Set rebal.op even when the reblance process is not started. Change-Id: I71c7e6f09353be14c1e8edca3c8685ebfdf226d6 BUG: 964059 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5030 Reviewed-by: Kaushal M <kaushal@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* system/posix-acl: check for the sticky bit of the parent directoryRaghavendra Bhat2013-06-031-0/+50
| | | | | | | | | | | | | * While creating links, check if there is sticky bit set for the parent directory and whether the sticky bit permits the user to create the link. Change-Id: Ic0d09d9ed579c4eb47462c71602a3a60cc7d3bc1 BUG: 958691 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/4934 Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/dht: Linkfiles creation with correct uid/gidshishir gowda2013-05-311-0/+42
| | | | | | | | | | | | | | | | | | | | If renames are done with different uid/gid (non-owners), then we would end up with incorrect uid/gid. The fix is to create linkfiles, and heal the uid/gid as root:root. This preserves our notion of creation as root:root and heal the uid/gid as root:root in all paths. Additionally, we need to consider uid/gid from only src_cached subvol, and not from linkfiles. rename is also done as root:root if done on linkfile, as setattr of ownership on linkfile is done after the rename Change-Id: Icb5d431dc42da9c02dfae81980e3fe769a47a274 BUG: 884597 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4682 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* tests/bugs: Increase timeout for rebalance completion checkshishir gowda2013-05-312-2/+2
| | | | | | | | | Change-Id: Ic299c0d7b3996f6e85f9627430efbdf3f9ea0a8f BUG: 884455 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4942 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cli: set min-op-version and max-op-version for getspecJeff Darcy2013-05-301-0/+13
| | | | | | | | | Change-Id: I2185df5d6b560d9367ae404c91812048e1655180 BUG: 969193 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/5119 Reviewed-by: Kaushal M <kaushal@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* tests: Change 'volume create' to 'volume create force'John Smith2013-05-231-4/+6
| | | | | | | | | | | Using 'force' when creating volumes prevents errors when creating bricks in the root partition. This fixes test bug-823081.t for bug-962226 Change-Id: I00996e1ab76713084076507d0aebdb65edc806c8 BUG: 962226 Signed-off-by: John Smith <lbalbalba@gmail.com> Reviewed-on: http://review.gluster.org/5036 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
* tests: Increase wait time for nfs mount.Vijay Bellur2013-05-201-2/+5
| | | | | | | | | Change-Id: I61815b502c90314ea6924e3046fb9b396ff56e8b BUG: 927616 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/5050 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* Tests: changing calls from 'fstat' to 'fstat64' fixes these tests on 32 bit ↵John Smith2013-05-193-6/+6
| | | | | | | | | | | platforms: bug-858242.t, bug-808400*.t Change-Id: Ifd85c711a8d16eb3ef17bd1c585acdc34121b12d BUG: 962226 Signed-off-by: John Smith <lbalbalba@gmail.com> Reviewed-on: http://review.gluster.org/5034 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/dht: Don't do extra unref in dht-migration checksPranith Kumar K2013-05-161-0/+34
| | | | | | | | | | | | | | | | | | | Problem: syncop_open used to perform a ref in syncop_open_cbk so the extra unref was needed but now syncop_open_cbk does not take a ref so no need to do extra unref. Fix: remove the extra fd_unref and let dht_local_wipe do the final unref. Change-Id: Ibe8f9a678d456a0c7bff175306068b5cd297ecc4 BUG: 961615 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4974 Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* tests: Ensure portmap registration before nfs mount.Vijay Bellur2013-05-151-1/+6
| | | | | | | | | Change-Id: I12a660a7dfbe4a2d0428910d762434043395fe02 BUG: 927616 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/5012 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: remove-brick: prevent removal from a replicate volume.Ravishankar N2013-05-131-0/+43
| | | | | | | | | | | | Prevent the removal of brick(s) from a plain replicate volume and display the error message at the CLI. Change-Id: I8e182404564147329d8cd364b7c7931d19f14570 BUG: 961669 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/4975 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/afr: Club missing entry, missing gfid self-healsPranith Kumar K2013-05-071-0/+42
| | | | | | | | | | | | | | | | | | | | | Problem: gfid-self-heal always assigns the gfid(GFID-1) it gets from lookup. Between the time of lookup to triggering the gfid-self-heal the entry could have changed. Now lets say there is a case where one of the files of the replica subolumes already has a gfid (GFID-2) and the other does not. In that case healing should happen with GFID-2 instead of GFID-1. Fix: Missing-entry-self-heal already handles all these cases. So removed separate handling of gfid-self-heal. Change-Id: Ie96261e9036c8f3cb4cad89347f9bf7b681cdc1a BUG: 767585 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/2670 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: delete "volume-name" from dict before processing the next optionKrutika Dhananjay2013-05-021-0/+21
| | | | | | | | | | Change-Id: Ib78963c1f43a66dab50b443742979c7c4e4cbc23 BUG: 958790 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/4940 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Avoid self-healing extended attribute used by SELinux.Vijay Bellur2013-04-302-16/+37
| | | | | | | | | | | | | | | | | | Since removexattr() fails to remove "security.selinux" in a system where SELinux is enforcing, xattr self-healing fails. As a consequence of this, user extended attributes are not being healed. Added a check in afr to prune SELinux xattr from the dictionary used for removing xattrs from the sink. Minor changes in tests and md-cache as well. Signed-off-by: Vijay Bellur <vbellur@redhat.com> Change-Id: I854bfc0098dde812ce2afe64b125ee40c04bdeb1 BUG: 957877 Reviewed-on: http://review.gluster.org/4905 Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* tests: Modified test to use remove-brick instead of 'start' variantKrishnan Parthasarathi2013-04-301-2/+1
| | | | | | | | | | | | | | | | | | | remove-brick start doesn't remove the brick from the volume immediately. It would wait until migration of data to other bricks are complete. Even when there is no data to be migrated, one can expect a finite delay from the time of remove-brick start command's exit and removal of brick(s). This may cause subsequent checks on brick count to fail in a non-deterministic manner. Also, renamed the test file name to reflect bug-id corresponding to community release. Change-Id: Ic43f011e251640decb68e46f4a10e0824ade0ac9 BUG: 878004 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/4885 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cli: add a command 'gluster pool list [--xml]'Niels de Vos2013-04-261-0/+13
| | | | | | | | | | | | | | | | | * unlike 'gluster peer status', which lists only info about peers, this command lists localhost also in the list, so the sorted output from all the nodes should match. * made the output script friendly by keeping it one output per line. Change-Id: I853656753b35c617debbcceecbb71c8d6dd3c334 BUG: 764638 Original-review: http://review.gluster.org/4221 Original-author: Amar Tumballi <amarts@redhat.com> Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/4862 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: validate performance.nfs.* option values during volume set stageKrutika Dhananjay2013-04-181-0/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PROBLEM: performance.nfs.* option values (which are of type boolean) are not validated during the stage phase of 'volume set'. The result - nfs graph generation fails during commit phase, AFTER the option and its (invalid) value have been placed in volinfo->dict. CAUSE: nfsperfxl_option_handler() - the function that validates the values of performance.nfs.* options - never receives the (key,value) pair that needs to be set, for validation during 'volume set' stage. FIX: In build_nfs_graph(), copy the (mod_)dict containing the (option,value) parameters into set_dict before attempting to build the client graph for the volume on which the operation is being performed. Of course, an easier way out would be to simply do a 'volume reset' and pretend nothing wrong happened! Change-Id: I56b17d0239d58a9e0b7798933a3c8451e2675b69 BUG: 949930 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/4814 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Avoided deadlock in single node cluster, glusterd restartKrishnan Parthasarathi2013-04-161-0/+10
| | | | | | | | | | | | | | | | | | | In a single node cluster, it is possible to deadlock on the "big lock", while restarting bricks. In glusterd_restart_bricks, we perform a glusterd_brick_connect, where we release the big lock in anticipation that glusterd_brick_rpc_notify could run in the same C stack (and deadlocking). So, in the restart code path, we could unlock before we have performed a lock on the big lock. To fix this, we need to take the big lock in the glusterd_launch_synctask 'thread' as well. Change-Id: I1abea1ca82b55c784b8a810a8194f254b32b1dcc BUG: 948686 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/4837 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* glusterd: big lock - a coarse-grained locking to prevent racesKrishnan Parthasarathi2013-04-121-0/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are primarily three lists that are part of glusterd process, that are concurrently accessed. Namely, priv->volumes, priv->peers and volinfo->bricks_list. Big-lock approach ----------------- WHAT IS IT? Big lock is a coarse-grained lock which protects all three lists, mentioned above, from racy access. HOW DOES IT WORK? At any given point in time, glusterd's thread(s) are in execution _iff_ there is a preceding, inbound network event. Of course, the sigwaiter thread and timer thread are exceptions. A network event is an external trigger to glusterd, via the epoll thread, in the form of POLLIN and POLLERR. As long as we take the big-lock at all such entry points and yield it when we are done, we are guaranteed that all the network events, accessing the global lists, are serialised. This amounts to holding the big lock at - all the handlers of all the actors in glusterd. (POLLIN) - all the cbks in glusterd. (POLLIN) - rpc_notify (DISCONNECT event), if we access/modify one of the three lists. (POLLERR) In the case of synctask'ized volume operations, we must remember that, if we held the big lock for the entire duration of the handler, we may block other non-synctask rpc actors from executing. For eg, volume-start would block in PMAP SIGNIN, if done incorrectly. To prevent this, we need to yield the big lock, when we yield the synctask, and reacquire on waking up of the synctask. Change-Id: Ib929f9905b55fb6c3fc27fefb497a26dba058e4f BUG: 948686 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/4784 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* tests: fix further issues with bug-874498.tAnand Avati2013-04-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The failure of bug-874498.t seems to be a "bug" in glustershd. The situation seems to be when both subvolumes of a replica are "local" to glustershd, and in such cases glustershd is sensitive to the order in which the subvols come up. The core of the issue itself is that, without the patch (#4784), self-heal daemon completes the processing of index and no entries are left inside the xattrop index after a few seconds of volume start force. However with the patch, the stale "backing file" (against which index performs link()) is left. The likely reason is that an "INDEX" based crawl is not happening against the subvol when this patch is applied. Before #4784 patch, the order in which subvols came up was : [2013-04-09 22:55:35.117679] I [client-handshake.c:1456:client_setvolume_cbk] 0-patchy-client-0: Connected to 10.3.129.13:49156, attached to remote volume '/d/backends/brick1'. ... [2013-04-09 22:55:35.118399] I [client-handshake.c:1456:client_setvolume_cbk] 0-patchy-client-1: Connected to 10.3.129.13:49157, attached to remote volume '/d/backends/brick2'. However, with the patch, the order is reversed: [2013-04-09 22:53:34.945370] I [client-handshake.c:1456:client_setvolume_cbk] 0-patchy-client-1: Connected to 10.3.129.13:49153, attached to remote volume '/d/backends/brick2'. ... [2013-04-09 22:53:34.950966] I [client-handshake.c:1456:client_setvolume_cbk] 0-patchy-client-0: Connected to 10.3.129.13:49152, attached to remote volume '/d/backends/brick1'. The index in brick2 has the list of files/gfid to heal. It appears to be the case that when brick1 is the first subvol to be detected as coming up, somehow an INDEX based crawl is clearing all the index entries in brick2, but if brick2 comes up as the first subvol, then the backing file is left stale. Also, doing a "gluster volume heal full" seems to leave out stale backing files too. As the crawl is performed on the namespace and the backing file is never encountered there to get cleared out. So the interim (possibly permanent) fix is to have the script issue a regular self-heal command (and not a "full" one). The failure of the script itself is non-critical. The data files are all healed, and it is just the backing file which is left behind. The stale backing file too gets cleared in the next index based healing, either triggered manually or after 10mins. Change-Id: I5deb79652ef449b7e88684311e804a8a2aa4725d BUG: 874498 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/4798 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* glusterd: changes in 'volume create' behaviourKrutika Dhananjay2013-04-093-0/+238
| | | | | | | | | | | | | This patch incorporates all the changes suggested on the behaviour of 'volume create' command in http://review.gluster.org/#change,4214 (comment #14, to be precise). Change-Id: Iaac524a59738b177415595b18aa8a136090d3d25 BUG: 948729 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/4740 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* tests: fix dependency on sleep in bug-874498.tAnand Avati2013-04-091-8/+14
| | | | | | | | | | | | | | | With the introduction of http://review.gluster.org/4784, there are delays which breaks bug-874498.t which wrongly depends on healing to finish within 2 seconds. Fix this by using 'EXPECT_WITHIN 60' instead of sleep 2. Change-Id: I2716d156c977614c719665a5e1f159dabf2878b5 BUG: 874498 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/4796 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/distribute: Ignore non-participating subvols for layout checksshishir gowda2013-04-091-0/+89
| | | | | | | | | | | | | | | | | | | | | | | When subvols-per-directory is < available subvols, then there are layouts which are not populated. This leads to incorrect identification of holes or overlaps. We need to ignore layouts, which have err == 0, and start == stop. In the current scenario (start == stop == 0). Additionally, in layout-merge, treat missing xattrs as err = 0. In case of missing layouts, anomalies will reset them. For any other valid subvoles, err != 0 in case of layouts being zeroed out. Also reverted back dht_selfheal_dir_xattr, which does layout calculation only on subvols which have errors. Change-Id: I9f57062722c9e8a26285e10675c31a78921115a1 BUG: 921408 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4668 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* tests: fix spurious regression test failuresJeff Darcy2013-04-081-0/+21
| | | | | | | | | Change-Id: I752aeb8e25f43281d2f5cf33d0ff5aeae49687e7 BUG: 764966 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/4794 Reviewed-by: Anand Avati <avati@redhat.com> Tested-by: Anand Avati <avati@redhat.com>
* cli: Address a double free with volume info.Vijay Bellur2013-04-081-0/+12
| | | | | | | | | | | | | Crash is observed when volume info is performed on a non-exisiting volume name and the output format is xml. Change-Id: I88aa5d9dc954b1352f5cc3b5b38742c832bc1bb8 BUG: 949298 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/4785 Reviewed-by: Kaushal M <kaushal@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterfsd: Cleanup temporary files from /tmpVijay Bellur2013-04-081-0/+23
| | | | | | | | | | | | | | | | For each gluster{d,fs,fsd} start, one or more temporary file(s) created in /tmp were not being unlinked. This patch cleans that up. Modified a typo in an unrelated log message as well. Change-Id: I3dec2a2ca40c7d6828eb238ec9cd08b6072cf0dd BUG: 949327 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/4786 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* storage/posix: introduce node-uuid-pathinfoVenky Shankar2013-04-051-0/+39
| | | | | | | | | | | | | | | enabling this option has an effect on pathinfo xattr request returning <node-uuid>:<path> instead of the default - which is <hostname>:<path>. Change-Id: Ice1b38abf8e5df1568bab6d79ec0d53dfa520332 BUG: 765380 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/4567 Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* tests: Remove grep process entries from pidgrepRaghavendra Talur2013-04-041-1/+1
| | | | | | | | | | | | | | | | | | Problem: We were picking process with lowest pid from ps|grep result. However, lowest pid need not be oldest process as recycling of PIDs can take place. Solution: Removed grep process entries from ps entries using grep -v grep. Change-Id: I2b9687a05a34cf6358f773183770d69a3fb9eb10 BUG: 858488 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: http://review.gluster.org/4765 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* Tests: Change rebalance status verificationPranith Kumar K2013-04-022-3/+3
| | | | | | | | | | | | | | According to the comment at the following URL https://bugzilla.redhat.com/show_bug.cgi?id=916226#c2 "success:" can come even before rebalance is completed. Changed it to check for "completed" instead. Change-Id: Ibe9d3b75493240f30261ac2a1280f32ef32886da BUG: 916226 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4614 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* rpc/nfs: cleanup legacy code of general optionsRajesh Amaravathi2013-04-021-0/+135
| | | | | | | | | | | | | | | | | | Removing the code which handles "general" options. Since it is no longer possible to set general options which apply for all volumes by default, this was redundant. This cleanup of general options code also solves a bug wherein with nfs.addr-namelookup on, nfs.rpc-auth-reject wouldn't work on ip addresses Change-Id: Iba066e32f9a0255287c322ef85ad1d04b325d739 BUG: 921072 Signed-off-by: Rajesh Amaravathi <rajesh@redhat.com> Reviewed-on: http://review.gluster.org/4691 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* cluster/afr: sync xattrs removed on source to sink(s)Venky Shankar2013-04-021-0/+93
| | | | | | | | | | | | | xattrs are first removed from sink followed by setting source xattrs. Change-Id: I181cb5b785b667bbfc6e40787a2183a8f45de06b BUG: 906646 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/4656 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* rpc: disable root-squash dynamically upon volume set commandRaghavendra Bhat2013-04-011-0/+61
| | | | | | | | | Change-Id: I2ba9ca339ffbe07cb74833165a46a941225b623d BUG: 927616 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/4722 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: piggyback and fsync resume changesPranith Kumar K2013-03-281-1/+3
| | | | | | | | | | | | | | 1) pre_op_piggyback should always be decremented. 2) Move fsync resume to just after post_op. 3) fsync stub should be created from afr's local not from the final response. Change-Id: I220bb532eb03bea584292f4dd2e816ad0c3e0cf7 BUG: 927146 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4741 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: ensure DATA operations are made durable before POST-OPAnand Avati2013-03-271-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The changelogging scheme of AFR stores information about the state of all replicas in all replicas (in the extended attribute of the respective files on each server) in the form of 'pending counts' of operations (effectively "dirty flags"). These xattrs are blindly trusted while performing self-heal, and therefore utmost care has to be taken while updating and maintaing them. The most critical updation is the clearing of the pending counts corresponding to the *other* server in the changelog of a given server. Before clearing the pending count, we need durability guarantee of the write which was performed on the other server. To obtain such a guarantee, it may be necessary to explicitly introduce an fsync() phase (if the file itself wasn't already opened with O_SYNC). This patch introduces the detection of unstable stable writes on a file and issues explicit fsync() on the servers before performing the POST-OP clearing of pending flags. Change-Id: I2171b86a74ec91e40e5877eef0a4e7379578ecf7 BUG: 927146 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/4721 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>