summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* USS: initialize a list before using it.Raghavendra G2014-09-181-1/+3
| | | | | | | | | | | | backport of the patch http://review.gluster.org/8569 by Raghavendra G <rgowdapp@redhat.com> Change-Id: I7b25fdf27c6d7ff66d24925bc73d9c6681259d37 BUG: 1143961 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/8764 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Cluster/DHT: Changing rename log severityNithya Balachandran2014-09-181-6/+5
| | | | | | | | | | | | | | Changing log level for a rename message from debug to info to improve debuggability Change-Id: I53031fcf97fffd62095692477330ecde0cf47dcd BUG: 1138395 Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Reviewed-on-master: http://review.gluster.org/8582 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on: http://review.gluster.org/8626
* cluster/dht: Changed log level to DEBUGVenkatesh Somyajulu2014-09-181-4/+4
| | | | | | | | | | | | Change-Id: I7a4ee0c5a6a94bd4f31aff510a2971750913ed45 BUG: 1142402 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on-master: http://review.gluster.org/8621 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-by: susant palai <spalai@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8749
* cluster/dht: Fixed double UNWIND in lookup everywhere codeShyam2014-09-161-4/+4
| | | | | | | | | | | | | | | | | | | | | | | In dht_lookup_everywhere_done: Line: 1194 we call DHT_STACK_UNWIND and in the same if condition we go ahead and call, goto unwind_hashed_and_cached; which at Line 1371 calls another UNWIND. As is obvious, higher frames could cleanup their locals and on receiving the next unwind could cause a coredump of the process. Fixed the same by calling the required return post the first unwind Change-Id: Ic5d57da98255b8616a65b4caaedabeba9144fd49 BUG: 1142409 Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on-master: http://review.gluster.org/8666 Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: susant palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/8751
* cluster/dht: fix memory corruption in locking api.Raghavendra G2014-09-161-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | <man 3 qsort> The contents of the array are sorted in ascending order according to a comparison function pointed to by compar, which is called with two arguments that "point to the objects being compared". </man 3 qsort> qsort passes "pointers to members of the array" to comparision function. Since the members of the array happen to be (dht_lock_t *), the arguments passed to dht_lock_request_cmp are of type (dht_lock_t **). Previously we assumed them to be of type (dht_lock_t *), which resulted in memory corruption. Change-Id: Iee0758704434beaff3c3a1ad48d549cbdc9e1c96 BUG: 1142406 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on-master: http://review.gluster.org/8659 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8750
* features/quota: fixes to dentry management code in rename.Raghavendra G2014-09-161-10/+6
| | | | | | | | | | | | | | | | | | | 1. After a successful rename (src, dst), the dentry <dst-parent, dst-basename> would be associated with src-inode. 2. Its src inode that survives if both of src and dst are present. The fixes are done based on the above two observation. Change-Id: I7492a512e3732b1455c243b02fae12d489532bfb BUG: 1142411 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on-master: http://review.gluster.org/8687 Reviewed-by: susant palai <spalai@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8752
* cluster/dht: Rename should not fail post hardlink creationShyam2014-09-162-41/+99
| | | | | | | | | | | | | | | | | | | | | | | | | | | In the rename path, we wind the creation of newname hardlink and linkto file in dst hashed a the same time. If the linkto creation fails, but the link creation succeeds, we enter the failure code and cleanup the created newname hardlink. In the interim if another client looks up newname and finds it as a hardlink from FUSE, it could send an unlink for oldname instead of a rename. This combined with the above cleanup code could end up losing all the files copies, and thereby losing data. This fix separates these steps into 2 parts, creating the linkto first and then the link file, so that post link file creation no failures would cleanup the newname file. If linkto fails then link is not attempted, thereby not polluting the name space with newname. Change-Id: I61da8e906060da16a31ea1076eec2f01fd617f44 BUG: 1138395 Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on-master: http://review.gluster.org/8570 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8615
* ec: Optimize read/write performanceXavier Hernandez2014-09-1613-268/+706
| | | | | | | | | | | | | | | | | | | | This patch significantly improves performance of read/write operations on a dispersed volume by reusing previous inodelk/ entrylk operations on the same inode/entry. This reduces the latency of each individual operation considerably. Inode version and size are also updated when needed instead of on each request. This gives an additional boost. This is a backport of http://review.gluster.org/8369/ Change-Id: I4b98d5508c86b53032e16e295f72a3f83fd8fcac BUG: 1140844 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/8746 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* ec: Only heal data/metadata when inode has enough informationXavier Hernandez2014-09-161-0/+8
| | | | | | | | | | | | | | | | Sometimes loc_t structure in a heal request doesn't contain enough information to do an inodelk call (basically the gfid is missing). In these cases, self heal only recovers entry information. This is a backport of http://review.gluster.org/8368/ Change-Id: I459990c7df728ff4baf164df046672ddcde3efa5 BUG: 1140626 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/8747 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* Do not call rpc_transport_unref() on NULL transEmmanuel Dreyfus2014-09-161-2/+4
| | | | | | | | | | | | | | | | | | rpc_clnt_disable() sets rpc->conn->trans to NULL, hence we should not call rpc_transport_unref() afterwards. I moved it before the rpc_clnt_disable() call, but I am not sure it should be called at all, perhaps it should just go away. This is a backport of I488d0207494e3a3fad52e64e67b2e740b236b864 BUG: 1138897 Change-Id: I001ab73b37f74ba98463bd9c32c01c670e9c7ad8 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8629 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* gNFS: Fix memory leak in setacl code pathNiels de Vos2014-09-161-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | If ACL is set on a file in Gluster NFS mount (setfacl command), and it succeed, then the NFS call state data is leaked. Though all the failure code path frees up the memory. Impact: There is a OOM kill i.e. vdsm invoked oom-killer during rebalance and Killed process 4305, UID 0, (glusterfs nfs process) FIX: Make sure to deallocate the memory for call state in acl3_setacl_cbk() using nfs3_call_state_wipe(); Cherry picked from commit 5c869aea79c0f304150eac014c7177e74ce0852e: > Change-Id: I9caa3f851e49daaba15be3eec626f1f2dd8e45b3 > BUG: 1139195 > Signed-off-by: Santosh Kumar Pradhan <spradhan@redhat.com> > Reviewed-on: http://review.gluster.org/8651 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Niels de Vos <ndevos@redhat.com> Change-Id: I9caa3f851e49daaba15be3eec626f1f2dd8e45b3 BUG: 1139244 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/8654 Reviewed-by: Santosh Pradhan <spradhan@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Propagate EIO on inode's type mismatchKrutika Dhananjay2014-09-168-121/+461
| | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/8574, and http://review.gluster.org/8586 Original author of the test script: Pranith Kumar K <pkarampu@redhat.com> Change-Id: I0c32bdd8e666f8175c0a8fbf940934e6ce469931 BUG: 1136830 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/8706 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* statedump: Print curr_stdalloc in mempool statedump ...Krutika Dhananjay2014-09-161-0/+1
| | | | | | | | | | | | | | | | ...for, it is curr_stdalloc that is incremented for every mem_get() and decremented on every call to mem_put() and can be used to detect leaks, when cold_count is 0. Backport of: http://review.gluster.org/8557 Change-Id: I1f4aac3af1234b9a76be7cb86ef26d728788950c BUG: 1136831 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/8705 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Fix dict_t leaksKrutika Dhananjay2014-09-168-35/+64
| | | | | | | | | | | | | | | Backport of: http://review.gluster.org/#/c/8557/ dict_t objects that are ref'd in alloca'd "replies" in afr_replies_copy() are not unref'd before "replies" go out of scope. Change-Id: I9bb45bc673ec13292ac96dda060aceb48739ebe8 BUG: 1136831 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/8704 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Typo fix THANKS messageJustin Clift2014-09-161-1/+1
| | | | | | | | | | | Backport of: http://review.gluster.org/8174 BUG: 1142046 Change-Id: Ibfb1b9ec6784141ff59ee39df76fac2d7ce73cc6 Reviewed-on: http://review.gluster.org/8745 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Handle EAGAIN properly in inodelkPranith Kumar K2014-09-153-15/+130
| | | | | | | | | | | | | | | | | Problem: When one of the brick is taken down and brough back up in a replica pair, locks on that brick will be allowed. Afr returns inodelk success even when one of the bricks already has the lock taken. Fix: If any brick returns EAGAIN return failure to parent xlator. BUG: 1142020 Change-Id: Iee3f5990be75e10f8accec9bc3856e3f76d1593c Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8744 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* osx: LaunchDaemon plist filename should start with org instead of comJustin Clift2014-09-152-2/+2
| | | | | | | | | | | | Backport of http://review.gluster.org/8734 BUG: 1141665 Change-Id: Ia41b00f3c49644f6a3c12e76c2395fce117c1bee Signed-off-by: Justin Clift <justin@gluster.org> Reviewed-on: http://review.gluster.org/8735 Reviewed-by: Harshavardhana <harsha@harshavardhana.net> Tested-by: Harshavardhana <harsha@harshavardhana.net> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* test: cleanup should clean all left over (stale) mountsAtin Mukherjee2014-09-131-4/+7
| | | | | | | | | | | | | | | | | | | This is a temporary work around to fix the spurious failures seen in ec testcases. As per the initial analysis it looks like quota (glusterd_quota_initiate_fs_crawl) is causing a mount point in /tmp to be stale. Once the root cause is identified this fix can be reverted as well. Backport of http://review.gluster.org/#/c/8703/ Change-Id: I8686f144ed298124074f749e75c13028ec00be01 BUG: 1141187 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/8703 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Justin Clift <justin@gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8708
* ec: Do not destroy inode context on inode invalidationXavier Hernandez2014-09-121-28/+13
| | | | | | | | | | | | | | Currently there is no need to handle inode invalidation requests, so this callback has been removed. This is a backport of http://review.gluster.org/8420/ Change-Id: I0ac2e47679bf62b1493e0403178305923bc036e8 BUG: 1140846 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/8702 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: Treat linkto file rename failure as non-critial errorShyam2014-09-121-8/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is a critical failure iff we fail to rename the cached file if the rename of the linkto failed, it is not a critical failure, and we do not want to lose the created hard link for the new name as that could have been read by other clients. NOTE: If another client is attempting the same oldname -> newname rename, and finds both file names as existing, and are hard links to each other, then FUSE would send in an unlink for oldname. In this time duration if we treat the linkto as a critical error and unlink the newname we created, we would have effectively lost the file to rename operations. Repercussions of treating this as a non-critical error is that we could leave behind a stale linkto file and/or not create the new linkto file, the second case would be rectified by a subsequent lookup, the first case by a rebalance, like for all stale linkto files Change-Id: Ia53ad8b43c3cf8f48ef5b43fd1fec4274e807556 BUG: 1138395 Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on-master: http://review.gluster.org/8563 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8614
* cluster/dht: synchronize rename and file-migrationRaghavendra G2014-09-123-34/+293
| | | | | | | | | | Change-Id: I4f243c946f76d440680b651235f925e3d0ebf0fd BUG: 1138395 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on-master: http://review.gluster.org/8523 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8613
* storage/posix: Don't unlink .glusterfs-hardlink before linkto checkVenkatesh Somyajulu2014-09-121-8/+37
| | | | | | | | | | | BUG: 1138385 Change-Id: I90a10ac54123fbd8c7383ddcbd04e8879ae51232 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on-master: http://review.gluster.org/8559 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8612
* cluster/dht: invoke callback when there are no locks to be unlocked.Raghavendra G2014-09-121-0/+4
| | | | | | | | | | | Change-Id: I375cb68f1075c2d58cf9d09ed6bd5e2746e1637d BUG: 1138395 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on-master: http://review.gluster.org/8549 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8611
* ec: Removed SSE2 dependencyXavier Hernandez2014-09-128-10272/+11687
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements the Galois Field multiplications using pure C code without any assembler support. This makes the ec xlator portable to other architectures. In the future it will be possible to use an optimized implementation of the multiplications using architecture dependent facilities (it will be automatically detected and configured). To allow bricks with different machine word sizes to be able to work seamlessly in the same volume, the minimum fragment length to be stored in any brick has been fixed to 512 bytes. Otherwise, different implementations will corrupt the data (SSE2 used 128 bytes, while new implementation would have used 64). This patch also removes the '-msse2' option added on patch http://review.gluster.org/8396/ This is a backport of http://review.gluster.org/8413/ Change-Id: Iaf6e4ef3dcfda6c68f48f16ca46fc4fb61a215f4 BUG: 1140845 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/8701 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: Prefer gfid links for inode-handlePranith Kumar K2014-09-123-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/8575 Problem: File path could change by other entry operations in-flight so if renames are in progress at the time of other operations like open, it may lead to failures. We observed that this issue can also happen while renames and readdirps/lookups are in progress because dentry-table is going stale sometimes. Fix: Prefer gfid-handles over paths for files. For directory handles prefering gfid-handles hits performance issues because it needs to resolve paths traversing up the symlinks. Tests which test if files are opened should check on gfid path after this change. So changed couple of tests to reflect the same. Note: This patch doesn't fix the issue for directories. I think a complete fix is to come up with an entry operation serialization xlator. Until then lets live with this. BUG: 1136821 Change-Id: If93e46d542a4e96a81a0639b5210330f7dbe8be0 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8594 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/afr: Fix all locked_on bricks are sinks check in self-healsPranith Kumar K2014-09-125-82/+66
| | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/8456 Problem: Counts may give wrong results when the number of bricks is > 2. If the locks are acquired on one source and sink, but the source accuses even the down sink then there will be 2 sinks and lock is acquired on 2 bricks so even when there is a clear source and sink **_finalize_source functions think the file/directory is in split-brain. Fix: Check that all the bricks which are locked are sinks. BUG: 1136829 Change-Id: I56a8f9ff261bdeec8c441237c485036141b6f00d Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8593 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* mount/fuse: Handle fd resolution failuresPranith Kumar K2014-09-126-187/+149
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/8402 Problem: Even when the fd resolution failed, the fop is continuing on the new graph which may not have valid inode. This lead to NULL layout subvols in dht which lead to crash in fsync after graph migration. Fix: - Remove resolution error handling in FUSE_FOP as it was only added to handle fd migration failures. - check in fuse_resolve_done for fd resolution failures and fail the fop right away. - loc resolution failures are already handled in the corresponding fops. - Return errno from state->resolve.op_errno in resume functions. - Send error to fuse on frame allocation failures. - Removed unused variable state->resolved - Removed unused macro FUSE_FOP_COOKIE BUG: 1136827 Change-Id: I4010b7fccd7d8caf0ce4e7629e81b605102d8fb4 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8592 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* tests: metadata self-heal testsPranith Kumar K2014-09-121-0/+133
| | | | | | | | | | | | Backport of http://review.gluster.org/8515 BUG: 1136826 Change-Id: Iebbf5ac58e7bf884221638ad10b19dee60b6ea56 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8591 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Perform metadata sync inside metadata locksPranith Kumar K2014-09-121-19/+15
| | | | | | | | | | | | | Backport of http://review.gluster.org/8514 BUG: 1136825 Change-Id: I480a0dc0dfff8bcff084c9b3f048c5b355683f73 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8590 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Perform gfid heal inside locks.Pranith Kumar K2014-09-126-27/+69
| | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/8512 Problem: Allowing lookup with 'gfid-req' will lead to assigning gfid at posix layer. When two mounts perform lookup in parallel that can lead to both bricks getting different gfids leading to gfid-mismatch/EIO for the lookup. Fix: Perform gfid heal inside lock. BUG: 1136823 Change-Id: I1059fcd38338348b7e96a0bd4c35b0f49bf77aec Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8589 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* libglusterfs/syncop: implement inodelkRaghavendra G2014-09-122-0/+44
| | | | | | | | | | | Change-Id: Iea489157490b70cb2bb03576b0d4943c6d8f052d BUG: 1138395 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on-master: http://review.gluster.org/8522 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8610
* io-stats: Adding private data dumping methodVipul Nayyar2014-09-111-3/+74
| | | | | | | | | | | | | | | | | | For the glusterfsiostat tool to be able to gather stats about mounted volumes from meta xlator, private information in the io-stats xlator needs to be dumped in the .meta folder. Added functionality for total data being read/written to be dumped along with latency related information for all fop functions present in io-stats. Change-Id: I75486f0ca361844a643861789f6c1406f439674d BUG: 1130023 Signed-off-by: Vipul Nayyar <nayyar_vipul@yahoo.com> Reviewed-on: http://review.gluster.org/8244 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8696
* storage/posix: removing deleting entries in case of creation failuresRaghavendra G2014-09-104-41/+91
| | | | | | | | | | | | | | | The code is not atomic enough to not to delete a dentry created by a prallel dentry creation operation. Change-Id: I9bd6d2aa9e7a1c0688c0a937b02a4b4f56d7aa2e BUG: 1138387 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on-master: http://review.gluster.org/8327 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8693 Tested-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: introduce locking api.Raghavendra G2014-09-095-1/+658
| | | | | | | | | | | | Change-Id: I41389ba91951d3e63e617aa32cd0bee848261c72 BUG: 1138395 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on-master: http://review.gluster.org/8521 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8609 Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* cluster/dht: Modified logic of linkto file deletion on non-hashedVenkatesh Somyajulu2014-09-104-23/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently whenever dht_lookup_everywhere gets called, if in dht_lookup_everywhere_cbk, a linkto file is found on non-hashed subvolume, file is unlinked. But there are cases when this file is under migration. Under such condition, we should avoid deletion of file. When some other rebalance process changes the layout of parent such that dst_file (w.r.t. migration) falls on non-hashed node, then may be lookup could have found it as linkto file but just before unlink, file is under migration or already migrated In such cased unlink can be avoided. Race: ------- If we have two bricks (brick-1 and brick-2) with initial file "a" under BaseDir which is hashed as well as cached on (brick-1). Assume "a" hashing gives 44. Brick-1 Brick-2 Initial Setup: BaseDir/a BaseDir [1-50] [51-100] Now add new-brick Brick-3. 1. Rebalance-1 on node Node-1 (Brick-1 node) will reset the BaseDir Layout. 2. After that it will perform a) Create linkto file on new-hashed (brick-2) b) Perform file migration. 1.Rebalance-1 Fixes the base-layout: Brick-1 Brick-2 Brick-3 --------- ---------- ------------ BaseDir/a BaseDir BaseDir [1-33] [34-66] [67-100] 2. Only a) is BaseDir/a BaseDir/a(linkto) BaseDir performed Create linktofile Now rebalance 2 on node-2 jumped in and it will perform step 1 and 2-a. After (rebal-2, step-1), it changes the layout of the BaseDir. BaseDir/a BaseDir/a(link) BaseDir [67-100] [1-33] [34-66] For (rebale-2, step-2), It will perform lookup at Brick-3 as w.r.t new layout 44 falls for brick-3. But lookup will fail. So dht_lookup_everywhere gets called. NOTE: On brick-2 by rebalance-1, a linkto file was created. Currently that linkto files gets deleted by rebalance-2 lookup as it is considered as stale linkto file. But with patch if rebalance is already in progress or rebalance is over, linkto file will not be unlinked. If rebalance is in progress fd will be open and if rebalance is over then linkto file wont be set. Change-Id: I3fee0d28de3c76197325536a9e30099d2413f07d BUG: 1138385 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on-master: http://review.gluster.org/8345 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: Added code to capture races in dht-lookup pathVenkatesh Somyajulu2014-09-092-10/+152
| | | | | | | | | | | | Change-Id: I9270d2d40ebd4b113ff961583dfda7754741f151 BUG: 1138385 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on-master: http://review.gluster.org/8430 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/8668 Tested-by: Vijay Bellur <vbellur@redhat.com>
* Revert "cluster/dht: Added code to capture races in dht-lookup path"Vijay Bellur2014-09-090-0/+0
| | | | | | | | | This reverts commit 226ea315d7ff63548b1163966e24f80a5e1641ab Change-Id: I463793bbe17d3a852e97223960e4647586b34237 Reviewed-on: http://review.gluster.org/8667 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: Fix dht_access treating directory like filesShyam2014-09-092-4/+105
| | | | | | | | | | | | | | | | | | | | | | | | | When the cluster topology changes due to add-brick, all sub volumes of DHT will not contain the directories till a rebalance is completed. Till the rebalance is run, if a caller bypasses lookup and calls access due to saved/cached inode information (like NFS server does) then, dht_access misreads the error (ESTALE/ENOENT) from the new subvolumes and incorrectly tries to handle the inode as a file. This results in the directories in memory state in DHT to be corrupted and not heal even post a rebalance. This commit fixes the problem in dht_access thereby preventing DHT from misrepresenting a directory as a file in the case presented above. Change-Id: Idcdaa3837db71c8fe0a40ec0084a6c3dbe27e772 BUG: 1138393 Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on-master: http://review.gluster.org/8462 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8608 Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* cluster/dht: Added keys in dht_lookup_everywhere_doneVenkatesh Somyajulu2014-09-091-4/+72
| | | | | | | | | | | | | | | | | | | | | Case where both cached (C1) and hashed file are found, but hash does not point to above cached node (C1), then dont unlink if either fd-is-open on hashed or linkto-xattr is not found. Change-Id: I7ef49b88d2c88bf9d25d3aa7893714e6c0766c67 BUG: 1138385 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Change-Id: I86d0a21d4c0501c45d837101ced4f96d6fedc5b9 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on-master: http://review.gluster.org/8429 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: susant palai <spalai@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8607 Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* dht: fix rename raceNithya Balachandran2014-09-091-2/+5
| | | | | | | | | | | | | | | Additional check to check if we created the linkto file before deleting it in the rename cleanup function Change-Id: I919cd7cb24f948ba4917eb9cf50d5169bb730a67 BUG: 1138387 Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Reviewed-on-master: http://review.gluster.org/8338 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8605 Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* cluster/dht: Fix races to avoid deletion of linkto fileVenkatesh Somyajulu2014-09-095-41/+401
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Explanation of Race between rebalance processes: https://bugzilla.redhat.com/show_bug.cgi?id=1110694#c4 STATE 1: BRICK-1 only one brick Cached File in the system STATE 2: Add brick-2 BRICK-1 BRICK-2 STATE 3: Lookup of File on brick-2 by this node's rebalance will fail because hashed file is not created yet. So dht_lookup_everywhere is about to get called. STATE 4: As part of lookup link file at brick-2 will be created. STATE 5: getxattr to check that cached file belongs to this node is done STATE 6: dht_lookup_everywhere_cbk detects the link created by rebalance-1. It will unlink it. STATE 7: getxattr at the link file with "pathinfo" key will be called will fail as the link file is deleted by rebalance on node-2 Fix: So in the STATE 6, we should avoid the deletion of link file. Every time dht_lookup_everywhere gets called, lookup will be performed on all the nodes. So to avoid STATE 6, if linkto file is found, it is not deleted until valid case is found in dht_lookup_everywhere_done. Case 1: if linkto file points to cached node, and cached file exists, uwind with success. Case 2: if linkto does not point to current cached node, and cached file exists: a) Unlink stale link file b) Create new link file Case 3: Only linkto file exists: Delete linkto file Case 4: Only cached file Create link file (Handled event without patch) Case 5: Neither cached nor hashed file is present Return with ENOENT (handled even without patch) Change-Id: Ibf53671410d8d613b8e2e7e5d0ec30fc7dcc0298 BUG: 1138385 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on-master: http://review.gluster.org/8231 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8603 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* features/changelog: Removal of redundant fop color count while draining.Ajeet Jha2014-09-091-5/+0
| | | | | | | | | | BUG: 1138952 Change-Id: I594be0d09c6af2e4a34da3e819d1ab6fd85e34c4 Signed-off-by: Ajeet Jha <ajha@redhat.com> Reviewed-on: http://review.gluster.org/8542 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-on: http://review.gluster.org/8647
* features/changelog: barrier all entry creation fopsVijay Bellur2014-09-091-27/+308
| | | | | | | | | | | | | | | | | | | | when a snapshot is taken, there are chances of entry creation fops not being recorded either in changelog or through the recursive ancestry xtime updation by marker. This causes consumers of changelog (primarily geo-replication as of today) to not be aware of these entries after a snapshot is restored. This can lead to inconsistencies. This patch is an interim workaround to barrier creates till changelog becomes completely crash consistent. BUG: 1138952 Change-Id: Idd5e690a05fe2c7c5d32d1541a0d9b5132881ea7 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8517 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: ajeet jha <ajha@redhat.com> Reviewed-by: Aravinda VK <avishwan@redhat.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/8646
* geo-rep/glusterd: API to check active geo-rep session for the volumeKotresh H R2014-09-084-67/+183
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Requirement: Snapshot needs an API to fail the CLI if any geo-rep session is active for that volume. Solution: A function "gd_vol_is_geo_rep_active" is provided to check if any geo-rep session is active for that volume. An in memory dict called 'gsync_running_slaves' is maintained in 'volinfo' structure to keep track of active geo-rep session for the volume. The key 'slavenode::slavevol' with value 'running' is added whenever geo-rep is started/resumed into the dict and the same is removed if stopped/paused. So the 'count' in dict is used to decide whether the geo-rep is active or not for that volume. Also added "this->name" in gf_log in routines which this patch is touched. BUG: 1138952 Change-Id: Ib13aeb509a56edf510651b77e20bf3cc43a3e763 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/8459 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/8645 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep: Fixing issue with xsync upper limitAravinda VK2014-09-082-59/+50
| | | | | | | | | | | | | | | | | | | | | While identifying the file/dir to sync, xtime of the file was compared with xsync_upper_limit as `xtime < xsync_upper_limit` After the sync, xtime of parent directory is updated as stime. With the upper limit condition, stime is updated as MIN(xtime_parent, xsync_upper_limit) With this files will get missed if `xtime_of_file == xsync_upper_limit` With this patch xtime_of_file is compared as xtime_of_file <= xsync_upper_limit BUG: 1138952 Change-Id: I469e8638ab6923e518022a539a19e2d040b60eb0 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/8439 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8644
* geo-rep: Handle RMDIR recursivelyAravinda VK2014-09-081-0/+12
| | | | | | | | | | | | | | | | | | If RMDIR is recorded in brick changelog which is due to self heal traffic then it will not have UNLINK entries for child files. Geo-rep hangs with ENOTEMPTY error on slave. Now geo-rep recursively deletes the dir if it gets ENOTEMPTY. BUG: 1138952 Change-Id: Ie79db90c52103b39fa795bb8a096b363d450b427 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/8477 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/8643 Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* feature/geo-rep: Keep marker.tstamp's mtime unchangeable during snapshot.Kotresh H R2014-09-085-13/+169
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Geo-replicatoin does a full xsync crawl after snapshot restoration of slave and master. It does not do history crawl. Analysis: Marker creates 'marker.tstamp' file when geo-rep is started for the first time. The virtual extended attribute 'trusted.glusterfs.volume-mark' is maintained and whenever it is queried on gluster mount point, marker fills it on the fly and returns the combination of uuid, ctime of marker.tstamp and others. So ctime of marker.tstamp, in other sense 'volume-mark' marks the geo-rep start time when the session is freshly created. From the above, after the first filesystem crawl(xsync) is done during first geo-rep start, stime should always be less than 'volume-mark'. So whenever stime is less than volume-mark, it does full filesystem crawl (xsync). Root Cause: When snapshot is restored, marker.tstamp file is freshly created losing the timestamps, it was originally created with. Solution: 1. Change is made to depend on mtime instead of ctime. 2. mtime and atime of marker.tstamp is restored back when snapshot is created and restored. BUG: 1138952 Change-Id: I0e19e1cb2593171b9a2b41d0d303330feb7fd2b3 Signed-off-by: Kotresh H R <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/8401 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8642
* geo-rep/libgfchangelog: Support of symlinks while creation of working dir.Kotresh H R2014-09-081-1/+1
| | | | | | | | | | | | | | | In gf_changelog_register, enable symlink support while creating working directory if its not already created. BUG: 1138952 Change-Id: I8fec52a5768fae46ce30a2331f30f1d8d5e2e173 Signed-off-by: Kotresh H R <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/8409 Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/8641 Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep/libgfchangelog: Create working dir during changelog_register if not ↵Kotresh H R2014-09-081-6/+15
| | | | | | | | | | | | | | | | | | | | present. Earlier, xysnc's register was being called first, which was creating working directory before calling changelog_register. Now it is history crawl first. Hence working directory would not have been created. Create it in gf_changelog_register itself if it is not already created. BUG: 1138952 Change-Id: Ie39b9fd8c1ef7385f76a9b67d0acc3c1c2fd2bb2 Signed-off-by: Kotresh H R <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/8399 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Aravinda VK <avishwan@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8640
* geo-rep: minimize xsync crawl usage and set upper limit to xsync crawlAravinda VK2014-09-082-38/+118
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For effective handling of deletes and renames use history crawl as much as possible. History crawl will run in loop till it syncs all data before live changelog time. When it uses xsync crawl(fallback when changelog not available, or very first crawl) it sets upper limit to crawl. After completing History crawl, it checks actual end time returned by history api to compare with register time, if actual end is less than register time then run history crawl one more time. If first turn history processing time is less than the CHANGELOG ROLLOVER TIME then sleep for the difference, After sleep if it is guaranteed that rollover will happen and switches to live changelog consumption without switching to xsync. This sleep is only when history processing completed < CHANGELOG_ROLLOVER_TIME and sleep only after the first turn, So will not affect the performance. BUG: 1138952 Change-Id: Ida024211d312f60f0e8190805e7469b2165f00e1 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/8151 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/8639 Reviewed-by: Vijay Bellur <vbellur@redhat.com>