path: root/xlators
Commit message | Author | Age | Files | Lines
* afr: serialize modification of {entrylk,inodelk}_lock_count | Anand Avati | 2013-02-07 | 1 | -53/+54
  Typically this lock was not needed in practice, but with http://review.gluster.org/3842 this code gets executed in multiple threads for different servers and we lose a count. This results in a leaked lock and a hang for a future transaction.
  Change-Id: I377ed20e44f2a45cff522289dfef181f0653eca2
  BUG: 765564
  Signed-off-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/4480
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
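The lost count described above is the classic unsynchronized read-modify-write. A minimal sketch of the serialization follows, assuming a pthread mutex guards the counter; `struct lock_state` and both function names are hypothetical illustrations, not the afr data structures.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the per-transaction lock bookkeeping. */
struct lock_state {
    pthread_mutex_t mutex;
    int             lock_count;   /* number of replies that granted the lock */
};

static void lock_state_init(struct lock_state *st)
{
    assert(st != NULL);
    pthread_mutex_init(&st->mutex, NULL);
    st->lock_count = 0;
}

/* Called from each reply callback; callbacks for different servers
 * may run concurrently on different threads. */
static int lock_state_granted(struct lock_state *st)
{
    int count;

    assert(st != NULL);
    pthread_mutex_lock(&st->mutex);
    count = ++st->lock_count;     /* the increment is now serialized */
    pthread_mutex_unlock(&st->mutex);
    return count;
}
```

Returning the updated value from inside the critical section lets a caller act on the count it produced, rather than re-reading a field that another thread may already have changed.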
* fuse-bridge: use READDIRPLUS support when available | Anand Avati | 2013-02-07 | 3 | -2/+141
  This patch makes use of the READDIRPLUS call when support is available in the kernel.
  Change-Id: Iac78881179567856b55af1f46594a2b2859309f0
  BUG: 908128
  Signed-off-by: Anand V. Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/3905
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Reviewed-by: Brian Foster <bfoster@redhat.com>
* dht: better layout-optimization algorithm | Jeff Darcy | 2013-02-07 | 2 | -22/+76
  This method deals with the case where swapping might gain a bigger overlap for the xlator currently under consideration, but sacrifices even more from the xlator we're swapping with. For example:

    A = 0x00000000 - 0x44444443 (new 0x00000000 - 0x55555554)
    B = 0x44444444 - 0x77777776 (new 0x55555555 - 0xaaaaaaa9)
    C = 0x77777777 - 0xffffffff (new 0xaaaaaaaa - 0xffffffff)

  Here, the new range for B has a bigger overlap with the old C than with the old B (0x33333333 vs. 0x22222222 to be precise) so looking only at that might lead us to swap. However, such a swap turns the new C's overlap from 0x55555556 (vs. old C) to *zero* (vs. old B). In other words, we've gained 0x11111111 for B but lost 0x55555556 for C, so it's a bad idea.

  The new algorithm accounts for all effects of the swap, so it not only avoids bad swaps but can make some good ones that would have been missed previously. For example, if swapping a range X with a later range Y would not increase the overlap for X we would previously have skipped it even if the swap would increase Y's overlap without affecting X's. This is the normal case when we're adding a new brick (which initially has zero overlap with any old range) so finding more good swaps is probably even more important than avoiding bad ones.

  Also, the logic in dht_overlap_calc was completely broken before, causing integer overflows instead of providing correct values, so no matter what higher-level algorithm was in place the GIGO effect would have resulted in bad decisions.
  Change-Id: If61ed513cfcb931916c6b51da293e3efbaaf385f
  BUG: 853258
  Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
  Reviewed-on: http://review.gluster.org/3908
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
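The overlap figures in the example above can be reproduced with a short sketch. It assumes inclusive 32-bit ranges and does the arithmetic in 64 bits so the `+ 1` cannot wrap. `struct hash_range`, `range_overlap` and `swap_gain` are hypothetical names for illustration; this is not the code of dht_overlap_calc.

```c
#include <assert.h>
#include <stdint.h>

/* Inclusive hash range [start, stop], as in the example ranges above. */
struct hash_range {
    uint32_t start;
    uint32_t stop;
};

/* Number of hash values shared by two inclusive ranges.
 * The result is 64-bit: a full-range overlap is 2^32, which does not
 * fit in 32 bits. */
static uint64_t range_overlap(struct hash_range a, struct hash_range b)
{
    uint32_t lo = a.start > b.start ? a.start : b.start;
    uint32_t hi = a.stop < b.stop ? a.stop : b.stop;

    assert(a.start <= a.stop && b.start <= b.stop);
    if (lo > hi)
        return 0;
    return (uint64_t)hi - lo + 1;
}

/* Net change in total overlap if the new ranges of x and y trade places.
 * Both sides of the swap are counted, so a gain for x cannot hide a
 * larger loss for y. A negative result means the swap is a bad idea. */
static int64_t swap_gain(struct hash_range new_x, struct hash_range old_x,
                         struct hash_range new_y, struct hash_range old_y)
{
    uint64_t before = range_overlap(new_x, old_x) + range_overlap(new_y, old_y);
    uint64_t after  = range_overlap(new_x, old_y) + range_overlap(new_y, old_x);

    return (int64_t)after - (int64_t)before;
}
```

For the B/C pair in the example, the overlap before the swap is 0x22222222 + 0x55555556 and after it is 0x33333333 + 0, so `swap_gain` is negative and the swap is rejected.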
* glusterd: synctaskize 'volume create' operation | Krutika Dhananjay | 2013-02-06 | 1 | -42/+6
  ... and also move brickpath validation to the volume create stage.
  Change-Id: Ia028677932ca5f6aa05dcf624f47033b62e7b212
  BUG: 862834
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/4213
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* rpc: get hostnames of client to allow FQDN based authentication | Rajesh Amaravathi | 2013-02-06 | 1 | -0/+4
  If FQDNs are used to authenticate clients, then from this commit onward the client IP (v4 or v6) is reverse-resolved with getnameinfo to get a hostname associated with it, if any, thereby making FQDN-based rpc authentication possible.
  Change-Id: I4c5241e7079a2560de79ca15f611e65c0b858f9b
  BUG: 903553
  Signed-off-by: Rajesh Amaravathi <rajesh@redhat.com>
  Reviewed-on: http://review.gluster.org/4439
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* nfs/nlm: use req's uid and gid for open_and_resume | Rajesh Amaravathi | 2013-02-06 | 1 | -7/+4
  Previously, NLM was setting frame->root->{uid,gid} to root by default. This causes permission problems with root squashing for lock calls. Now, we obtain the uid and gid from the rpc request. Duplicate #defines are also removed from rpcsvc.h.
  Change-Id: I5d6c87aed8d04aab2619bb913408048c0a02d1e7
  BUG: 906884
  Signed-off-by: Rajesh Amaravathi <rajesh@redhat.com>
  Reviewed-on: http://review.gluster.org/4466
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* open-behind: translator to perform open calls in background | Anand Avati | 2013-02-06 | 6 | -1/+959
  This is functionality peeled out of quick-read into a separate translator.
  Fops which modify the file (where it is required to perform the operation on the true fd) will trigger and wait for the backend open to succeed and use that fd.
  Fops like fstat() readv() etc. will use anonymous FD (configurable) when original fd is unopened at the backend.
  Change-Id: Id9847fdbfdc82c1c8e956339156b6572539c1876
  BUG: 846240
  Signed-off-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/4406
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* glusterd: Use client-op-versions during "volume set" | Kaushal M | 2013-02-06 | 2 | -2/+59
  The supported op-versions of the client and the name of the requested volume are saved during server_getspec(). These are used during the staging of volume set. If the option being set is not supported by any one of the clients which currently have the volume mounted, then the set will fail.
  Change-Id: I4e6b60b274d5200508762dc0204cfa848a6c0aa4
  BUG: 907311
  Signed-off-by: Kaushal M <kaushal@redhat.com>
  Reviewed-on: http://review.gluster.org/4424
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd,glusterfsd,libgfapi: Client op-version | Kaushal M | 2013-02-06 | 2 | -20/+69
  This patch introduces op-version support for glusterfs clients. Now, a client sends its supported op-versions during the volfile fetch request and glusterd will return the volfile only if the client can support the current op-version of the cluster.
  Change-Id: Iab1f1f1706802962bcf27058657c44e8a344d2f6
  BUG: 907311
  Signed-off-by: Kaushal M <kaushal@redhat.com>
  Reviewed-on: http://review.gluster.org/4247
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
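The volfile gate described above can be pictured as a simple range check. This sketch assumes the client advertises a minimum and maximum supported op-version; that representation, `struct client_op_versions` and `client_can_fetch_volfile` are assumptions for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: the span of op-versions a client says it supports. */
struct client_op_versions {
    int min;
    int max;
};

/* The volfile is handed out only if the cluster's current op-version
 * lies inside the span the client advertised. */
static bool client_can_fetch_volfile(struct client_op_versions client,
                                     int cluster_op_version)
{
    assert(client.min <= client.max);
    return cluster_op_version >= client.min &&
           cluster_op_version <= client.max;
}
```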
* storage/posix: Fix open-fd-count virtual xattr | Pranith Kumar K | 2013-02-06 | 1 | -8/+3
  Send open-fd-count maintained in inode.
  Change-Id: I23db5d052bdeb4f67978ff618ed5a0bed7d1592d
  BUG: 908146
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: http://review.gluster.org/4469
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: Avoid priv->eager_lock value update race | Pranith Kumar K | 2013-02-06 | 4 | -4/+6
  Change-Id: I7049c0c64e36a9dfa4cc0e0b34de7ec111d2f6c1
  BUG: 908302
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: http://review.gluster.org/4076
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: Perform wakeup just before fop | Pranith Kumar K | 2013-02-06 | 2 | -13/+21
  There is no need for the delayed-post-op to wait until the next fop phase on the fd completes. The changelog and locks are inherited by the time the next fop phase is attempted, so the wakeup can happen just before the fop phase is started.
  Change-Id: I0b8e591f591b0f7565eb55265ab51f476ed2b165
  BUG: 908302
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: http://review.gluster.org/4073
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* fuse(efence): zero sized memory was being allocated. | Anand Avati | 2013-02-05 | 1 | -0/+5
  For the last call of getdents(), gf_malloc is called with size 0, which is then caught by efence.
  BUG: 782760
  Change-Id: If289029117a62ecfcecc70480e5ac8f0e050487d
  Signed-off-by: Anand Avati <avati@redhat.com>
  Original-author: Varun Shastry <vshastry@redhat.com>
  Reviewed-on: http://review.gluster.org/3846
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
* protocol/client: Avoid double free of frame | Pranith Kumar K | 2013-02-04 | 1 | -2/+1
  When client_submit_request fails it calls cbk. The cleanups should happen only in cbk. The code committed as part of http://review.gluster.org/4357 violates this. Also found that clnt_release_reopen_fd violates this as well. This patch fixes these issues.
  Change-Id: Ic02ba278724b03c65c00b686c39fd7846122618a
  BUG: 821056
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: http://review.gluster.org/4464
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: remove extra call to glusterd_volume_compute_cksum() | Krutika Dhananjay | 2013-02-04 | 1 | -6/+0
  In the commit phase of volume create, checksum on volinfo is computed twice: once in the call to glusterd_store_volinfo() and once, further down, in the same function glusterd_op_create_volume().
  Change-Id: I36f9426943cd48937d4946b4b4ef09f19f31d888
  BUG: 812356
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/4463
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: log changes in volume create and delete codepath | Krutika Dhananjay | 2013-02-04 | 4 | -152/+215
  Making log changes involving two commands as they both share sections of code (like the part where the volume metadata is cleaned up in vol delete in case of success; and in vol create in case of failure).
  * Most of the changes are of the 's/THIS/this' kind.
  * Changed some of the log messages to give as much information as available in case of failure.
  * Changed log levels in some of the log messages.
  Change-Id: I10242511fe9400a07ab04717464d748d9172dd85
  BUG: 812356
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/4462
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: "volume heal info" doesn't report output properly | Venkatesh Somyajulu | 2013-02-04 | 2 | -15/+80
  Problem: "volume heal info" doesn't report files to be healed when the gluster* processes on one of the storage nodes are not running.
  Change-Id: Iff7d41407014624e4da9b70d710039ac14b48291
  BUG: 880898
  Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
  Reviewed-on: http://review.gluster.org/4371
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* geo-rep / gsyncd: Separate log file directory for Mountbroker sessions | Venky Shankar | 2013-02-04 | 2 | -1/+33
  ... so that a mountbroker session initiated between master and slave does not use the same log file if it is started after a normal geo-rep session between the same master and slave. Sharing the file results in EPERM, as the log file is owned by root and the geo-rep slave process (now running as a non-privileged user) does not have access to it.
  Also, having a separate log file directory for mountbroker sessions is cleaner.
  NOTE: geo-rep's client mount log file location remains unchanged.
  Change-Id: Ic7a732e250aee5393b9c3f6ebf6dfe2c310b7fe4
  BUG: 893960
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  Reviewed-on: http://review.gluster.org/4407
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Indexed node information in dict to retain consistency. | Avra Sengupta | 2013-02-04 | 1 | -39/+55
  Problem:
  --------
  Depending on the response time from different nodes, the response dict for rebalance status was populated in a FIFO manner, and hence the output for the CLI was never consistent.

  Fix:
  ----
  Irrespective of the response time of the nodes, we now index the entries in the response dict for rebalance status, in reference to the peerlist. So, the order of the entries and hence the CLI output is always consistent.
  Change-Id: Ica7e89e5d95aa9860a6f3c7eff58ca2052e05bd6
  BUG: 888390
  Signed-off-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-on: http://review.gluster.org/4416
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
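The fix above replaces arrival-order insertion with insertion keyed by position in the peer list. A minimal sketch of that idea follows, using a plain array instead of a glusterd dict; `struct node_status`, `peer_index` and `record_status` are hypothetical names, and the field set is invented for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-node entry in the rebalance status reply. */
struct node_status {
    int  filled;
    char host[64];
    long files_moved;
};

/* Position of a host in the peer list, or -1 if it is not a known peer. */
static int peer_index(const char *const peers[], int npeers, const char *host)
{
    for (int i = 0; i < npeers; i++)
        if (strcmp(peers[i], host) == 0)
            return i;
    return -1;
}

/* Store a reply in the slot that matches the peer list order,
 * regardless of the order in which replies arrive. */
static int record_status(struct node_status slots[], const char *const peers[],
                         int npeers, const char *host, long files_moved)
{
    int idx;

    assert(slots != NULL && peers != NULL && host != NULL);
    idx = peer_index(peers, npeers, host);
    if (idx < 0)
        return -1;
    slots[idx].filled = 1;
    snprintf(slots[idx].host, sizeof(slots[idx].host), "%s", host);
    slots[idx].files_moved = files_moved;
    return idx;
}
```

Because the slot is derived from the peer list rather than from arrival time, printing `slots[0..npeers-1]` yields the same order on every run.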
* cluster/dht: Correct min_free_disk behaviour | Raghavendra Talur | 2013-02-04 | 2 | -27/+89
  Problem:
  Files were being created in a subvol which had less than min_free_disk available, even in cases where other subvols with more space were available.

  Solution:
  Changed the logic to look for a subvol which has more space available. In cases where all the subvols have less than min_free_disk available, the one with the most space and at least one free inode is chosen.

  Known Issue:
  Cannot ensure that the first file created right after the min-free-disk value is crossed on a brick will get created on another brick, because the disk usage stat takes some time to update in the gluster process. Will fix that as part of another bug.
  Change-Id: If3ae0bf5a44f8739ce35b3ee3f191009ddd44455
  BUG: 858488
  Signed-off-by: Raghavendra Talur <rtalur@redhat.com>
  Reviewed-on: http://review.gluster.org/4420
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/stripe: Mount issues with Stripe xlator | Varun Shastry | 2013-02-03 | 3 | -20/+46
  Problem:
  * 'CONNECTING' is taken as CHILD_UP.
  * Sending notifications (default_notify()) for all the events individually while mounting.

  Solution:
  * Consider Child up only after the event CHILD_UP is received.
  * Send a single notification for all the children's events only while mounting.
  Change-Id: I1b7de127e12f5bfb8f80702dbdce02019e138bc8
  BUG: 885072
  Signed-off-by: Varun Shastry <vshastry@redhat.com>
  Reviewed-on: http://review.gluster.org/4356
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* protocol/server: upon server_connection_put, set xl_private of the transport to NULL | Raghavendra Bhat | 2013-02-03 | 1 | -2/+10
  If get_xlator_by_name returns NULL and the connection is put back, then update the xl_private of the transport by setting it to NULL. Otherwise server_connection_put would have freed the connection object while xl_private of the transport still points to the freed location, leading to a segfault when that location is accessed.
  Change-Id: Id47e0edde3073b09765338c730847ba3095df9e2
  BUG: 901457
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
  Reviewed-on: http://review.gluster.org/4411
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* performance/write-behind: do not try to take LOCK in forget | Raghavendra Bhat | 2013-02-03 | 1 | -7/+3
  LOCK attempt in wb_forget is unnecessary.
  Change-Id: Ibdedc23d0c829c34aedd6fc5bc0e0a584b832514
  BUG: 903566
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
  Reviewed-on: http://review.gluster.org/4423
  Reviewed-by: Anand Avati <avati@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd: do dict unref after sending reply to cli | Krutika Dhananjay | 2013-02-03 | 11 | -49/+5
  This patch channels the unrefs of dictionaries created from the cli req during volume ops into one common function, glusterd_to_cli(), which is guaranteed to be called irrespective of whether the command succeeds or fails.
  This patch also removes extra unrefs in a few places.
  Change-Id: Ic8ba7166387b5dfd1f5ae860539e1b7093a94662
  BUG: 861044
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/4003
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* locks: Protected racy (read) access of ext_list | Krishnan Parthasarathi | 2013-02-03 | 1 | -10/+18
  Change-Id: Ibf639695ebd99c11c6960c9be82c0cee71b50744
  BUG: 905864
  Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Reviewed-on: http://review.gluster.org/4458
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* mount.glusterfs: Fixed regexp matcher for existing mount points | Krishnan Parthasarathi | 2013-02-03 | 1 | -1/+1
  Change-Id: I58d237a3d2f4caa7f3865c2e4899c472f7457450
  BUG: 906887
  Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Reviewed-on: http://review.gluster.org/4457
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/dht: ignore EEXIST error in mkdir to avoid GFID mismatch | Anand Avati | 2013-02-03 | 1 | -0/+9
  In dht_mkdir_cbk, EEXIST error is treated like a true error. Because of this the following sequence of events can happen, eventually resulting in GFID mismatch (and possibly leaked locks and a hang, in the presence of replicate).

  The issue exists when many clients concurrently attempt creation of a directory and subdirectory (e.g. mkdir -p /mnt/gluster/dir1/subdir).

  0. First mkdir happens by one client on the hashed subvolume. Only one client wins the race. Other racing mkdirs get EEXIST. Yet other "laggers" in the race encounter the just-created directory in lookup() on the hash dir.

  1. At least one "lagger" lookup() notices that there are missing directories on other subvolumes (which the "winner" mkdir is yet to create), and starts off self-heal of the directory.

  2. At least on some subvolumes, self-heal's mkdir wins the race against the "winner" mkdir and creates the directory first. This causes the "winner" mkdir to experience EEXIST error on those subvolumes.

  3. On other subvolumes where the "winner" mkdir won the race, self-heal experiences EEXIST error, but self-heal properly translates that into a success (the mkdir code path does not, which is the bug).

  4. Both mkdir and self-heal assign hash layouts to the just-created directory. But self-heal distributes the hash range across N (total) subvolumes, whereas mkdir distributes the hash range across N - M (where M is the number of subvolumes where mkdir lost the race). Both the clients "cache" their respective layouts in the near future for all future creates inside them (evidence in logs).

  5. During the creation of the subdirectory, two clients race again. Ideally the winner performs mkdir() on the hashed subvolume and proceeds to create other dirs, while the loser experiences EEXIST error on the hashed subvolume and backs off.

  But in this case, because the two clients have different layout views of the parent directory (because of different hash splits and assignments), the hashed subvolumes for the new directory can end up being different. Therefore, both clients now win the race (they were never fighting against each other on a common server), assigning different GFIDs to the directory on their respective (different) subvolumes. Some of the remaining subvolumes get GFID1, others GFID2.

  Conclusion/Fix:
  Making mkdir translate EEXIST error as success (just the way self-heal is already rightly doing) will bring back truth to the design claim that concurrent mkdir/self-heals perform deterministic + idempotent operations. This will prevent the differing "hash views" by different clients and thereby also avoid GFID mismatch by forcing all clients to have a "fair race", because the hashed subvolume for all will be the same (thereby also avoiding leaked locks and hangs).
  Change-Id: I84592fb9b8a3f739a07e2afb23b33758a0a9a157
  BUG: 907072
  Signed-off-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/4459
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
* glusterd: identify local address by searching network interfaces. | Jeff Darcy | 2013-02-03 | 1 | -19/+78
  Using bind(3) to identify the local address fails when net.ipv4.ip_nonlocal_bind (i.e., /proc/sys/net/ipv4/ip_nonlocal_bind) is set to 1.
  Change-Id: I7047b6fb94ef0df10b78673fab34dbd169344fec
  BUG: 890587
  Original-author: JulesWang <w.jq0722@gmail.com>
  Signed-off-by: JulesWang <w.jq0722@gmail.com>
  Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
  Reviewed-on: http://review.gluster.org/4437
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
* glusterd: Made volume-status use synctask framework | Krishnan Parthasarathi | 2013-02-03 | 5 | -44/+65
  Change-Id: Id4062799104e5831467ced65a43bfe377b6163f4
  BUG: 852147
  Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Reviewed-on: http://review.gluster.org/4297
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Added syncop version of BRICK_OP | Krishnan Parthasarathi | 2013-02-03 | 1 | -28/+242
  - Made rsp dict available to all glusterd's STAGE/BRICK/COMMIT OP.
  Change-Id: I5d825d0670d0f1aa8a0603f2307b3600ff6ccfe4
  BUG: 852147
  Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Reviewed-on: http://review.gluster.org/4296
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Moved node rsp functions to glusterd-utils.c | Krishnan Parthasarathi | 2013-02-03 | 5 | -385/+410
  Change-Id: Ib4c4794563a5a694fab16f17c642f788399462f6
  BUG: 852147
  Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Reviewed-on: http://review.gluster.org/4295
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Kaushal M <kaushal@redhat.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Removed start-unlock event injection in 'synctask' codepath | Krishnan Parthasarathi | 2013-02-03 | 2 | -22/+46
  Change-Id: I87e02c95d0b650dab7f9ee86c96b2e09ada50109
  BUG: 862834
  Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Reviewed-on: http://review.gluster.org/4118
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: added logging of changelog for split-brain in glustershd.log file | Venkatesh Somyajula | 2013-02-03 | 4 | -10/+68
  Change-Id: Iaf119f839cb2113b8f8efb7bf7636d471b6541bf
  BUG: 866440
  Signed-off-by: Venkatesh Somyajula <vsomyaju@redhat.com>
  Reviewed-on: http://review.gluster.org/4385
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* protocol/client: Periodically attempt reopens | Pranith Kumar K | 2013-02-03 | 5 | -93/+263
  If the brick is taken down and the hard disk is replaced and the brick is brought back up, the re-opens of the open-fds will fail because the file is not present on the brick. Re-opens are not attempted even if the files are re-created by self-heal until the brick is brought down after the files are re-created and brought back up. This is a problem with a VM-store in a replica-setup. Until the fd is re-opened the writes will never happen on the brick where the hard-disk is replaced.

  To handle this situation gracefully, client xlator is enhanced to perform finodelk, fxattrop, writev, readv using anonymous fds if the file is yet to be re-opened. If the fop succeeds then client xlator attempts re-open.
  Change-Id: I1cc6d1bbf8227cd996868ab2ed0a57fb05e00017
  BUG: 821056
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: http://review.gluster.org/4358
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* protocol/client: Add fdctx back to saved-list after reopen | Pranith Kumar K | 2013-02-03 | 4 | -451/+76
  Change-Id: I01caa1b51570359e6e3ffe1ffb7279cbdb0b0c64
  BUG: 821056
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: http://review.gluster.org/4357
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* cluster/dht: stack wind with cookie | Varun Shastry | 2013-01-31 | 3 | -29/+45
  Default_fops uses stack_wind_tail, which winds without creating a frame, so the wrong subvol is returned in the cookie. To avoid the problems this causes, we now get the subvol by passing the cookie.
  Change-Id: I51ee79b22c89e4fb0b89e9a0bc3ac96c5b469f8f
  BUG: 893338
  Signed-off-by: Varun Shastry <vshastry@redhat.com>
  Reviewed-on: http://review.gluster.org/4388
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
  Tested-by: Anand Avati <avati@redhat.com>
* fuse-bridge: fix some breakages from lock migration patch | Anand Avati | 2013-01-30 | 1 | -12/+28
  - do not attempt lock migration if no locks were ever acquired on an fd.
  - fix fd_lk_ctx_t ref leak during fd migration
  - remove spurious fd_unref() (probably added to compensate for the fd_ref leak in syncop_open_cbk)
  - remove @newfdptr out-param which makes fd ref management really tricky (and currently refs were unmanaged for the out-param). Instead acquire ref and unref within lock migration function.
  Change-Id: I4cc9c451f0df4c051612bd1fa7bef11e801570e4
  BUG: 808400
  Signed-off-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/4453
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* libglusterfs/syncop: do not hold ref on the fd in cbk | Raghavendra Bhat | 2013-01-30 | 2 | -7/+6
  * Do not do fd_ref in cbks of the fops which return a fd (such as open, opendir, create).
  Change-Id: Ic2f5b234c5c09c258494f4fb5d600a64813823ad
  BUG: 885008
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
  Reviewed-on: http://review.gluster.org/4282
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* glusterfs : Moved option files, and statedumps from /tmp | Avra Sengupta | 2013-01-29 | 4 | -5/+7
  Change-Id: Ibdede396c4d6859225937316b7a59a661bcaf9f5
  BUG: 764890
  Signed-off-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-on: http://review.gluster.org/4422
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: if a subvolume is down wind the lock request to next | Raghavendra Bhat | 2013-01-29 | 1 | -15/+16
  When one of the subvolumes is down, the lock request is not attempted on that subvolume and we move on to the next subvolume.

      /* skip over children that are down */
      while ((child_index < priv->child_count)
             && !local->child_up[child_index])
              child_index++;

  In the above case, if there are 2 subvolumes and the 2nd subvolume is down (subvolume 1 from afr's view), then after attempting the lock on the 1st child (i.e. subvolume 0) child_index is calculated to be 1. But since the 2nd child is down, child_index is incremented to 2 as per the above logic and the lock request is STACK_WINDed to the child with child_index 2. Since there are only 2 children for afr, the child (i.e. the xlator_t pointer) for that child_index will be NULL. The process crashes when it dereferences the NULL xlator object.
  Change-Id: Icd9b5ad28bac1b805e6e80d53c12d296526bedf5
  BUG: 765564
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
  Reviewed-on: http://review.gluster.org/4438
  Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: wakeup delayed post op on fsync | Pranith Kumar K | 2013-01-29 | 1 | -5/+3
  Change-Id: I5d84ef72615f9d71b4af210976e2449de6e02326
  BUG: 888174
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: http://review.gluster.org/4446
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: Change order of unwind, resume for writev | Pranith Kumar K | 2013-01-29 | 1 | -31/+87
  Generally inode-write fops do transaction.unwind then transaction.resume, but writev needs to make sure that the delayed post-op frame is placed in fdctx before unwind happens. This prevents the race of flush doing the changelog wakeup first in the fuse thread and then this writev placing its delayed post-op frame in fdctx. This helps flush make sure all the delayed post-ops are completed.
  Change-Id: Ia78ca556f69cab3073c21172bb15f34ff8c3f4be
  BUG: 888174
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: http://review.gluster.org/4428
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* quick-read: various fixes | Anand Avati | 2013-01-29 | 1 | -1/+12
  - initialize xdata in qr_lookup even if it was NULL from top. This allows qr to do its job even if lookup originated from fuse-resolve.c
  - extend test cases to include 1 second delay and retry
  - fix bug while checking condition for cached unwind. qr_readv_cached() unwinds if op_ret > 0. Therefore qr_readv() must wind to subvol only if !(op_ret > 0) (i.e., op_ret <= 0).
  - qr_readv_cached() is using uninitialized @conf pointer. Thanks to Raghavendra Bhat for catching this!
  Change-Id: Ifaf2ea2685e452210ef9ba3c2d1f2ab51900650c
  BUG: 846240
  Signed-off-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/4452
  Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* performance/io-cache: propagate errors while unwinding frame in read path. | Raghavendra G | 2013-01-29 | 1 | -5/+11
  Change-Id: Ieb5d592a987e8681d5ec019da309f75e3b207580
  BUG: 858242
  Signed-off-by: Raghavendra G <raghavendra@gluster.com>
  Reviewed-on: http://review.gluster.org/4204
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* mgmt/glusterd: Expose post-op-delay through cli | Pranith Kumar K | 2013-01-28 | 1 | -0/+1
  Change-Id: I13e3699bd58d53896ae54e1bfafb3cd1c9580c7c
  BUG: 905307
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: http://review.gluster.org/4443
  Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* performance/md-cache: add force-readdirp flag to make readdirp configurable | Brian Foster | 2013-01-28 | 2 | -3/+30
  md-cache currently transforms all readdir fops into readdirp fops. This patch creates the 'force-readdirp' configuration flag to provide control over this behavior. force-readdirp is enabled by default to maintain current default behavior.
  BUG: 903175
  Change-Id: Idd70926dec7c271204bdfb11fb052e56d0a39420
  Signed-off-by: Brian Foster <bfoster@redhat.com>
  Reviewed-on: http://review.gluster.org/4440
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* quick-read: refactor | Anand Avati | 2013-01-28 | 3 | -3340/+413
  - peel out 'open behind' functionality into a separate translator
  - fix issue where, if file size had grown by revalidate, data was not flushed
  - removed unnecessary acquisition of table->lock (e.g. in qr_lookup())
  - keep inode ctx persistent, prune only data (effectively changing the order of lock acquisition from INODE -> TABLE)
  - validation with readdirplus
  - use variable size iobufs to simplify cached reads
  Change-Id: If1586d0298fd1697ddff9fd7008efb3d286d436a
  BUG: 846240
  Signed-off-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/4403
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* cluster/afr: before checking lock_count of internal lock make sure its not entrylk | Raghavendra Bhat | 2013-01-28 | 1 | -12/+13
  When the expected lock count is equal to the attempted lock count, then before deciding that the lock has failed on all the nodes, make sure the lock type is checked properly.
  Change-Id: I1f362d54320cb6ec5654c5c69915c0f61c91d8c7
  BUG: 765564
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
  Reviewed-on: http://review.gluster.org/4436
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: make 'glusterd_is_local_addr' return bool | JulesWang | 2013-01-26 | 6 | -46/+45
  Change-Id: Id3bd0bfc4802c166f7a32b0cc6a726aeb5617b5d
  BUG: 890618
  Signed-off-by: JulesWang <w.jq0722@gmail.com>
  Reviewed-on: http://review.gluster.org/4427
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* storage/posix: skip path construction when dentry list is empty | Brian Foster | 2013-01-26 | 1 | -0/+2
  This is a minor latency optimization to the readdirp path in storage/posix. During a recursive list, we hit this codepath with an empty list once per high-level directory to read when end of directory is reached. Skip constructing hpath, since we don't do anything with it in this case.
  BUG: 903175
  Change-Id: I98d7c65505205d55575f064b1e982700f1320cc0
  Signed-off-by: Brian Foster <bfoster@redhat.com>
  Reviewed-on: http://review.gluster.org/4432
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>