summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* glusterd: Fixed volume-sync in synctask codepath.Krishnan Parthasarathi2013-04-121-12/+18
| | | | | | | | | Change-Id: I2911d3ac80825310f84c5ba6bd7890e65e1ee219 BUG: 950048 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/4643 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* object-storage: rebase Swift to 1.8.0 (grizzly)Kaleb S. KEITHLEY2013-04-122-37/+136
| | | | | | | | | | | Merged (git cherry-pick) from master/HEAD to release-3.4 Change-Id: I24265c12a45eac4cec761748096118c9647440be BUG: 948041 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/4780 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/afr: Preserve mtime in self-healPranith Kumar K2013-04-122-14/+77
| | | | | | | | | | | | | | | | | | | | | | | | Problem: Data self-heal may choose sink iatt to set mtimes. This happens because after syncing of data is done self-heal does one more xattrops/fstat to determine sources sinks to set the inode-ctx. Since this is done after data syncing and erase of xattrs, old source and old sink are now sources, but the mtimes of them differ. Old code just takes the first source from the list and update mtimes, which could be sink before the self-heal started. Fix: Set mtime from 'sources before syncing'. Change-Id: Id769e1b99aa4f041eaee775f64cbf2c57b799723 BUG: 918437 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4658 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-on: http://review.gluster.org/4663
* socket: Make non-ssl sockets perform non-blocking connect()Krishnan Parthasarathi2013-04-121-0/+12
| | | | | | | | | Change-Id: Icb60cf7ad3ea7ca0eeb12fd19b95a6b340857bb2 BUG: 920916 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/4685 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* dict: Put "goto out" in dict_unserialize to avoid process crashVenkatesh Somyajulu2013-04-121-0/+1
| | | | | | | | | | | | | | | | Problem: In the dictionary serialization function, if the [(buf + vallen) > (orig_buf + size)], then memdup is getting failed. Fix: Put "goto out" whenever this condition is met. Change-Id: I8c07dd5187364ccd6ad7625e2e3907d8b56447a9 BUG: 947824 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/4771 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* build: add BuildRequires librdmacm-develKaleb S. KEITHLEY2013-04-121-1/+1
| | | | | | | | | | | | | | | See http://review.gluster.org/149 Installed librdmacm-devel RPM on the build server. cherry pick from http://review.gluster.org/#/c/4804/ BUG: 819130 Change-Id: I30e14ebf7646c19923940f86a72bf42497cac70c Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/4806 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* cluster/distribute: Ignore non-participating subvols for layout checksshishir gowda2013-04-114-27/+178
| | | | | | | | | | | | | | | | | | | | | | | | | Backporting fix http://review.gluster.org/#/c/4668/ When subvols-per-directory is < available subvols, then there are layouts which are not populated. This leads to incorrect identification of holes or overlaps. We need to ignore layouts, which have err == 0, and start == stop. In the current scenario (start == stop == 0). Additionally, in layout-merge, treat missing xattrs as err = 0. In case of missing layouts, anomalies will reset them. For any other valid subvoles, err != 0 in case of layouts being zeroed out. Also reverted back dht_selfheal_dir_xattr, which does layout calculation only on subvols which have errors. BUG: 921408 Change-Id: I75a8edcb92af5b53b3253c9addd7a812e9242836 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4800 Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* dht: improve transform/detransform of d_off (and be ext4 safe)shishir gowda2013-04-111-5/+82
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backporting Avati's fix http://review.gluster.org/4711 The scheme to encode brick d_off and brick id into global d_off has two approaches. Since both brick d_off and global d_off are both 64-bit wide, we need to be careful about how the brick id is encoded. Filesystems like XFS always give a d_off which fits within 32bits. So we have another 32bits (actually 31, in this scheme, as seen ahead) to encode the brick id - which is typically plenty. Filesystems like the recent EXT4 utilize the upto 63 low bits in d_off, as the d_off is calculated based on a hash function value. This leaves us no "unused" bits to encode the brick id. However both these filesystmes (EXT4 more importantly) are "tolerant" in terms of the accuracy of the value presented back in seekdir(). i.e, a seekdir(val) actually seeks to the entry which has the "closest" true offset. This "two-prong" scheme exploits this behavior - which seems to be the best middle ground amongst various approaches and has all the advantages of the old approach: - Works against XFS and EXT4, the two most common filesystems out there. (which wasn't an "advantage" of the old approach as it is borken against EXT4) - Probably works against most of the others as well. The ones which would NOT work are those which return HUGE d_offs _and_ NOT tolerant to seekdir() to "closest" true offset. - Nothing to "remember in memory" or evict "old entries". - Works fine across NFS server reboots and also NFS head failover. - Tolerant to seekdir() to arbitrary locations. Algorithm: Each d_off can be encoded in either of the two schemes. There is no requirement to encode all d_offs of a directory or a reply-set in the same scheme. The topmost bit of the 64 bits is used to specify the "type" of encoding of this particular d_off. If the topmost bit (bit-63) is 1, it indicates that the encoding scheme holds a HUGE d_off. If the topmost bit is is 0, it indicates that the "small" d_off encoding scheme is used. The goal of the "small" d_off encoding is to stay as dense as possible towards the lower bits even in the global d_off. The goal of the HUGE d_off encoding is to stay as accurate (close) as possible to the "true" d_off after a round of encoding and decoding. If DHT has N subvolumes, we need ROOF(Log2(N)) "bits" to encode the brick ID (call it "n"). SMALL d_off =========== Encoding -------- If the top n + 1 bits are free in a brick offset, then we leave the top bit as 0 and set the remaining bits based on the old formula: hi_mask = 0xffffffffffffffff hi_mask = ~(hi_mask >> (n + 1)) if ((hi_mask & d_off_brick) != 0) do_large_d_off_encoding () d_off_global = (d_off_brick * N) + brick_id Decoding -------- If the top bit in the global offset is 0, it indicates that this is the encoding formula used. So decoding such a global offset will be like the old formula: if ((d_off_global & 0x8000000000000000) != 0) do_large_d_off_decoding() d_off_brick = (d_off_global % N) brick_id = d_off_global / N HUGE d_off ========== Encoding -------- If the top n + 1 bits are NOT free in a given brick offset, then we set the top bit as 1 in the global offset. The low n bits are replaced by brick_id. low_mask = 0xffffffffffffffff << n // where n is ROOF(Log2(N)) d_off_global = (0x8000000000000000 | d_off_brick & low_mask) + brick_id if (d_off_global == 0xffffffffffffffff) discard_entry(); Decoding -------- If the top bit in the global offset is set 1, it indicates that the encoding formula used is above. So decoding would look like: hi_mask = (0xffffffffffffffff << n) low_mask = ~(hi_mask) d_off_brick = (global_d_off & hi_mask & 0x7fffffffffffffff) brick_id = global_d_off & low_mask If "losing" the low n bits in this decoding of d_off_brick looks "scary", we need to realize that till recently EXT4 used to only return what can now be expressed as (d_off_global >> 32). The extra 31 bits of hash added by EXT recently, only decreases the probability of a collision, and not eliminate it completely, anyways. In a way, the "lost" n bits are made up by decreasing the probability of collision by sharding the files into N bricks / EXT directories -- call it "hash hedging", if you will :-) Change-Id: I9551c581c3f3d4c9e719764881036d554f60c557 Thanks-to: Zach Brown <zab@redhat.com> BUG: 838784 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4799 Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterfs.spec.in: sync with fedora glusterfs.specKaleb S. KEITHLEY2013-03-282-0/+13
| | | | | | | | | | | | | | | add --without ufo cherry-pick from refs/changes/42/4742/1 Change-Id: If1b77003ded537f9664fa6ad677d48d118516c64 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> BUG: 819130 Reviewed-on: http://review.gluster.org/4743 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: Luis Pabon <lpabon@redhat.com> Reviewed-by: Luis Pabon <lpabon@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* libglusterfs/dict: fix infinite loop in dict_keys_join()Vijaykumar Koppad2013-03-271-2/+4
| | | | | | | | | | | | - missing "pairs = next" caused infinite loop Change-Id: I3edc4f50473f7498815c73e1066167392718fddf BUG: 905871 Signed-off-by: Vijaykumar Koppad <vkoppad@redhat.com> Reviewed-on: http://review.gluster.org/4728 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterfs.spec.in: resync with Fedora glusterfs.specKaleb S. KEITHLEY2013-03-267-181/+1001
| | | | | | | | | | | | | | | | | | | | | | | | | | cherry-pick from master, including commits: 5d3b478e76f1015b11bfd7d48465ab12a4f0737e fd407a4f5cdb869dc52efe8fc9e1d284f60f5992 6f6789884227b8260f140c39c063d77b0516af97 84f5e4b354526fbb7f0665345816e81c81245c8f 2398e1e0da61f4ec5f209c704e037b54b5c249e1 Resync with Fedora's glusterfs.spec To build a set of RPMs: % ./autogen.sh % ./configure --enable-fusermount % make dist % cd extras/LinuxRPM && make glusterrpms Updated rpm.t BUG: 819130 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Change-Id: Ib73be0fbb7ee16a5c41b4f7c7a3f66d0224bfe6c Reviewed-on: http://review.gluster.org/4725 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* build: Fix package version to match release versionKrishnan Parthasarathi2013-03-201-1/+1
| | | | | | | | | Change-Id: I00e0ebc4e36cedd771a46b6bd1f3267439ab9474 BUG: 922765 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/4673 Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* mgmt/glusterd: Start fs-crawl in separate thread so as not to block epollVarun Shastry2013-03-192-4/+53
| | | | | | | | | | | | | | | tests/basic/quota.t covers test case for this. Patch is only for 3.4 branch, http://review.gluster.org/4495 fixes the issue in master. Change-Id: I92674f5413441cc896245d5b3d0925f44ce8b2d3 BUG: 919998 Signed-off-by: Varun Shastry <vshastry@redhat.com> Reviewed-on: http://review.gluster.org/4680 Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* rpm: package /var/run/gluster so that statedumps can be createdKaleb S. KEITHLEY2013-03-121-0/+5
| | | | | | | | | | | | | | | Creating statedumps fail when /var/run/gluster does not exist. This directory should be part of the 'glusterfs' package that is installed on storage servers and native clients. Merged Niels's change from both $HEAD and release-3.3 BUG: 917554 Change-Id: I6ffc497c0bb6bc90c97a91a72bba9118853d4c8c Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/4659 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* cluster/distribute: Fix layout overlaps due to spread-count in selfheal pathshishir gowda2013-03-092-52/+20
| | | | | | | | | | | | | | | We needed to zero out the layout range, before we re-calculate the range. When spread-count is issued, we would end up with stale ranges in the layout. Replaced dht_selfheal_dir_xattr with dht_fix_dir_xattr, which correctly resets the un-used (after re-cal) layouts. Change-Id: I1a900d15df07335f59356bd23182ccec34381ab2 BUG: 884455 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4648 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* gfapi: dict_unref() xattr_req in fop finish instead of dict_destroy()Anand Avati2013-03-072-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | The current way of calling dict_destroy() at the end of an API fop assumes that xattr_req is not stored/ref'd by any translators in the stack. However when translators like DHT store xattr_req in dht_local_t with a dict_ref() and perform dict_unref() in the unwind path, things get subject to a race. The race is between the woken up thread (by syncop_wake) i.e, the gfapi invoking thread and the thread where the FOP was unwound. As the C stack of STACK_UNWIND unwinds back, dht_local_unref() gets invoked within the DHT_STACK_UNWIND macro. This thread attempts dict_unref, which would be "safe" if it wins the race against the gfapi invoking thread. However if the gfapi invoking thread wins the race, it will perform dict_destroy() first and therefore make dict_unref() within dht_local_unref perform a double free. This is the embarrassing on-screen bug which showed up in a roomful of people during the gluster dev summit demo of qemu/libgfapi integration. Change-Id: I284c93de87cdc128d5801f42c84aa87f753090d4 BUG: 839950 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/4645 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* performance/write-behind: guarantee non-overlapping concurrent writesv3.4.0alpha2Jeff Darcy2013-03-061-1/+65
| | | | | | | | | | | | | | | | | | Maintain a list of writes (either written behind or SYNC) which are currently "in progress" (i.e, STACK_WIND'ed towards server) and hold off any new STACK_WIND of write (either written behind or SYNC) which overlaps with any of the "in progress" writes. This is a guarantee which AFR's eager-lock depends upon (though not strictly a write-behind requirement) Change-Id: Icedd0b51b440366a906dc9223d62b7fd6ef2ca03 BUG: 857673 Original-author: Anand Avati <avati@redhat.com> Signed-off-by: Anand Avati <avati@redhat.com> Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/4642 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* performance/write-behind: Add test case for fd being marked bad after write ↵Raghavendra G2013-03-061-0/+33
| | | | | | | | | | | failures. BUG: 765473 Change-Id: Ia5d9fecc7f84ee4d51f8037e2dd1ed03f0394bd9 Signed-off-by: Raghavendra G <raghavendra@gluster.com> Reviewed-on: http://review.gluster.org/4632 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd: Increasing throughput of synctask based mgmt ops.Krishnan Parthasarathi2013-03-064-425/+564
| | | | | | | | | Change-Id: Ibd963f78707b157fc4c9729aa87206cfd5ecfe81 BUG: 913662 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/4638 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* volgen: Use bind-address option for bricks when option set on glusterdKrishnan Parthasarathi2013-03-061-8/+20
| | | | | | | | | | | | | | | | Brick processes listen on all the interfaces on a given port. When multiple glusterds run on one machine, glusterd assumes that it 'owns' the ports on that machine. This can lead to the different glusterd instances to step on each other's ports. This fix ensures that brick processes listen only on the its host IP when glusterd has bind-address option set. Change-Id: I4c1b05643c64d3098bf56e977e768e611ffce0f5 BUG: 913662 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/4637 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* synctask: support for (assymetric) counted barriersKrishnan Parthasarathi2013-03-062-76/+122
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [Backport of Avati's patch on master - http://review.gluster.org/4558] This patch introduces a new set of primitives: - synctask_barrier_init (stub) - synctask_barrier_waitfor (stub, count) - synctask_barrier_wake (stub) Unlike pthread_barrier_t, this barrier has an explicit notion of "waiter" and "waker". The "waiter" waits for @count number of "wakers" to call synctask_barrier_wake() before returning. The wait performed by the waiter via synctask_barrier_waitfor() is co-operative in nature and yields the thread for scheduling other synctasks in the mean time. Intended use case: Eliminate excessive serialization in glusterd and allow for concurrent RPC transactions. Code which are currently in this format: ---old--- list_for_each_entry (peerinfo, peers, op_peers_list) { ... GD_SYNCOP (peerinfo->rpc, stub, rpc_cbk, ...); } ... int rpc_cbk (rpc, stub, ...) { ... __wake (stub); } ---old--- Can be restructred into the format: ---new--- synctask_barrier_init (stub); { list_for_each_entry (peerinfo, peers, op_peers_list) { ... rpc_submit (peerinfo->rpc, stub, rpc_cbk, ...); count++; } } synctask_barrier_wait (stub, count); ... int rpc_cbk (rpc, stub, ...) { ... synctask_barrier_wake (stub); } ---new--- In the above structure, from the synctask's point of view, the region between synctask_barrier_init() and synctask_barrier_wait() are spawning off asynchronous "threads" (or RPC) and keep count of how many such threads have been spawned. Each of those threads are expected to make one call to synctask_barrier_wake(). The call to synctask_barrier_wait() makes the synctask thread co-operatively wait/sleep till @count such threads call their wake function. This way, the synctask thread retains the "synchronous" flow in the code, yet at the same time allows for asynchronous "threads" to acheive parallelism over RPC. Change-Id: Ie037f99b2d306b71e63e3a56353daec06fb0bf41 BUG: 913662 Signed-off-by: Anand Avati <avati@redhat.com> Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/4636 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* testcase for the open-behind xlator when open failsJeff Darcy2013-03-061-0/+56
| | | | | | | | | | | | | | Test if the fops which are put into a stub and are waiting for the open to complete should be unwound with the error if open call itself fails. Change-Id: I8c363d98303a7df1a0ca9ea6ef207c7123fdd388 BUG: 846240 Original-author: Raghavendra Bhat <raghavendra@redhat.com> Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/4634 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* performance/write-behind: mark fd bad if any written behind writes failRaghavendra G2013-03-061-57/+114
| | | | | | | | | BUG: 765473 Change-Id: I1ddd6ef9f5361aed96f97aa1344823836c6ddecb Signed-off-by: Raghavendra G <raghavendra@gluster.com> Reviewed-on: http://review.gluster.org/4630 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* tests: move common funtion definitions to include.rcRaghavendra G2013-03-068-156/+22
| | | | | | | | | BUG: 765473 Change-Id: Id0d194374d34cfec8ee601090f7fe38b1856ac22 Signed-off-by: Raghavendra G <raghavendra@gluster.com> Reviewed-on: http://review.gluster.org/4631 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* tests/fileio.rc: library for file descriptor based IO in testsRaghavendra G2013-03-061-0/+61
| | | | | | | | | | | | | | | | | | | | There are situations in test scripts where we want to keep open file descriptors while performing other commands. Bash has abilities to manage file descriptors by numbers, but the syntax is a little brain damaging. This library provides wrappers around it to abstract away bash's syntax and also provides a helper function to pick a free file descriptor on the fly. The APIs are pretty self explanatory. Change-Id: I82f1d1957646dd6c468d9e85c90ec30c978c7ad6 BUG: 764966 Signed-off-by: Raghavendra G <raghavendra@gluster.com> Reviewed-on: http://review.gluster.org/4635 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Jeff Darcy <jdarcy@redhat.com>
* Do not call xdr_string() with a NULL error messageJeff Darcy2013-03-051-0/+1
| | | | | | | | | | | | | It is illegal to call xdr_string() with a NULL string. Linux just retruns false, NetBSD gets a SIGSEGV when xdr_string() calls strlen(NULL) BUG: 916439 Change-Id: Ia958470ada6e8e55a86d439922ec942d038f5f13 Original-author: Emmanuel Dreyfus <manu@netbsd.org> Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/4629
* cluster/afr: do complete split-brain check in all the fd based fopsRaghavendra Bhat2013-03-054-19/+33
| | | | | | | | | | | | | | | | | fd based operations such as readv checked only for data split brain instead of complete split-brain (i.e both data + metadata) assuming that open would have done the complete split-brain check. However open-behind would have unwound open, without winding to afr thus preventing the complete split-brain check and some appliations will be able to read the contents of the file even though the file has metadata split-brain. So let all the fd based fops do a defensive check of complete split-brain. Change-Id: I0ea52f782b371ce73e8e1c61f9def438fce1bd28 BUG: 846240 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/4620 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Fix check for task-id existence in 'volume status'Kaushal M2013-03-053-3/+8
| | | | | | | | | | | | | | This fixes the issue of task-id tests failing randomly. The condition used to check rebalance/remove-brick was running was wrong, which could lead to the task-id for these tasks to not be displayed even when the actual commit hadn't occured. BUG: 857330 Change-Id: I0f86c6bbe7acec586ee0ea6e663369ea26171904 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/4617 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* tests: Add spaces around '=' in a string comparision in TEST primitiveKaushal M2013-03-041-1/+1
| | | | | | | | | BUG: 764966 Change-Id: I2da197bdddb4a4d098ebb044410e21ced4dbd806 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/4618 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* rpc: bring in root-squashing behavior in rpcRaghavendra Bhat2013-03-047-2/+139
| | | | | | | | | | | | | | | | | | * requests coming in as root are converted to nfsnobody * with open-behind some acl checks wont happen and nfsnobody can read the file "whose owner is root and other users do not have permission to read the file". This is becasue open-behind does not send the open to the brick and sends success to the application, thus the acl related tests on the file wont happen which would have prevented the file from being opened. Change-Id: I12a3e6b2a12884d00bb81f2779074fed09b1b2e4 BUG: 887145 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/4619 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* cluster/distribute: Reopen fds in migration internally as root:rootshishir gowda2013-03-041-1/+10
| | | | | | | | | | | | | | Though linkfile_create and rebalance dst file create sent a setattr with correct ownership, there is still a race window where the linkfile open (client open due to migration) will fail, as its ownership will be root:root. BUG: 884597 Change-Id: Iba73681eae4f280d39ee6c9a40009e195768bee7 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4612 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* synctask: implement setuid-like SYNCTASK_SETID()shishir gowda2013-03-042-0/+30
| | | | | | | | | | | | | | | | | synctasks can now call SYNCTASK_SETID(uid,gid) to set the effective uid/gid of the frame with which the FOP will be performed. Once called, the uid/gid is set either till the end of the synctask or till the next call of SYNCTASK_SETID() Back-porting Avati's patch http://review.gluster.org/#change,4269 BUG: 884597 Change-Id: Id0569da4bb8959636881457217fe004bf30c5b9d Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4611 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* cluster/distribute: Prevent spurious multiple defrag crawlsshishir gowda2013-03-041-9/+14
| | | | | | | | | | | | | | | | | In dht_notify, we used to create a thread to start defrag crawls after we had heard from all child subvols. This was in-correct, as a later event, could also trigger the crawl again(due to the fact that all subvols had responded). The fix is to make sure, the thread is started only once after all subvols have responded the first time BUG: 916449 Change-Id: I1619344fbb1cb51d5e1db38d8a29821fa870fa8b Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4610 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* cluster/distribute: Preserve file size during rebalance migrationshishir gowda2013-03-043-0/+94
| | | | | | | | | | | | | | | | | | If holes are encountered, then we do not write these to the dst, which sometimes causes file size to be lesser than src. Data is not corrupted, as when non-zero reads are received, we do write that data. Calling a truncrate to give file size to prevent it from being truncated to less than src in case the file end has holes. Thanks to Brian Foster for providing the test case BUG: 915554 Change-Id: I7e1e0c475118b073c3ebb87e93220c1ec22e8b7d Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4609 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/distribute: Remove suprious fd_unref callshishir gowda2013-03-041-2/+0
| | | | | | | | | | | | After fix http://review.gluster.org/4282 (libglusterfsterfs/syncop: do not hold ref on the fd in cbk) was pushed, syncop_open does not take a ref anymore. BUG: 910661 Change-Id: Idedff91270966e6e70e71ee83785c0228e238d31 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4608 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* cluster/dht: Create linkfile with file uid/gidshishir gowda2013-03-046-4/+282
| | | | | | | | | | | | | | | | Currently, linkfile creation happens as root. use uid/gid returned from _cbk (link/rename) to set the correct ownership of the link files. Also added test/dht.rc to implement common dht functions BUG: 884597 Change-Id: I6bc0e04f62d4716fc033681e5678e852a1be7a2f Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4607 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* glusterd: use gf_strdup() in place of strdup()Krutika Dhananjay2013-03-042-1/+15
| | | | | | | | | Change-Id: Idee71019dbc6eeaa0a808d671b29d6f3038a1a89 BUG: 913487 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/4563 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* nfs/server: Fix multiple crashes in acl handler.Vijay Bellur2013-03-031-10/+16
| | | | | | | | | Change-Id: I67c224c74c02f7058bcf546713501dd7ab810826 BUG: 915280 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/4606 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* libglusterfs: Fix memory leaks in fd_lk_insert_and_mergeVijay Bellur2013-03-034-4/+138
| | | | | | | | | | Change-Id: I666664895fdd7c7199797796819e652557a7ac99 BUG: 834465 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/4529 Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* mount.glusterfs: Introduce mem-accounting as an optionVijay Bellur2013-03-031-9/+17
| | | | | | | | | | | | | | | | option mem-accounting enables memory accounting for the client process. re-factored to keep options with values and options without values in different sections of mount.glusterfs. Change-Id: I54ebc31a1eae6d7a5ce7b0255cd7df74d37d46c1 BUG: 834465 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/4528 Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* Better mechanism to handle memory accountingVijay Bellur2013-03-034-13/+8
| | | | | | | | | | | | | | | Memory accounting will now be enabled if: 1) Any glusterfs process is spawned with argument --mem-accounting. 2) DEBUG is defined. Change-Id: I3345e114127a57ce61916be0e2c4e0049a4c3432 BUG: 834465 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/4527 Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* Branch updated to release-3.4Vijay Bellur2013-02-111-1/+1
| | | | | | | | | Change-Id: Ie61e727f28b61ddf9d0e7133533aeaba4e7e0b7c BUG: 820551 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/4500 Reviewed-by: Anand Avati <avati@redhat.com> Tested-by: Anand Avati <avati@redhat.com>
* Tests: Remove 'sleep' from the test casePranith Kumar K2013-02-081-34/+36
| | | | | | | | | Change-Id: Iaff2076b0ef7f69a6ba6efd4123271bde490977a BUG: 873962 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4498 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* Tests: Explicitly set eager-lock onPranith Kumar K2013-02-081-1/+2
| | | | | | | | | Change-Id: I834fd5adab6e328ed106e413fc06e4280d1d24b2 BUG: 888174 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4497 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: harden 'volume start' staging to check for brick dirs' presenceKrutika Dhananjay2013-02-084-13/+101
| | | | | | | | | | | | | | | | | | | | | PROBLEM: When the brick directory of a volume is absent on any of the servers, AND an attempt is made to start the volume, commit fails ONLY on the node where the brick dir is absent, leading to a split-brain like situation. FIX: Harden 'volume start' to check for the presence of brick directories at the time of staging, thereby preventing commit failure. Change-Id: I67faeb9afbd3aa76f08645924462db126bf7a977 BUG: 889996 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/4365 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/dht: pathinfo xattr changes for directoriesVenky Shankar2013-02-083-92/+264
| | | | | | | | | | | | | | | | | | | | Since directories have presence on all subvolumes there is no definite meaning of ->hashed_subvol or ->cached_subvol. getxattr() code path chooses ->cached_subvol for pathinfo extended attribute. While this makes sense of files, it makes less sense for directories. Further if a hashed or a cached subvolume is down, and there's a getxattr request for a directory, we return with an errno. This patch changes pathinfo extended attribute contents by aggregating information from all subvolumes that are up. Change-Id: I58adb741d63ccfd1d0239af75eb65f26f0fb384d Signed-off-by: Venky Shankar <vshankar@redhat.com> BUG: 856455 Reviewed-on: http://review.gluster.org/4047 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Made gsync set use synctask frameworkAvra Sengupta2013-02-082-6/+2
| | | | | | | | | Change-Id: I409fa5a9f55434ece47a8a51d4812d3eca42d269 BUG: 852147 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/4473 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Making volume-reset use synctask frameworkAvra Sengupta2013-02-082-5/+54
| | | | | | | | | Signed-off-by: Avra Sengupta <asengupt@redhat.com> Change-Id: Ib25c8fa69d84b8132505ae3f1e67cf88d3f6f9ec BUG: 852147 Reviewed-on: http://review.gluster.org/4474 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd : Made volume-set use synctask framework.Avra Sengupta2013-02-081-5/+1
| | | | | | | | | Change-Id: I1aa08ca843b87839180f9097bca370270a856e6d BUG: 852147 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/4488 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* tests/bugs/bug-762989.t: do not check the listening portsRaghavendra Bhat2013-02-081-1/+1
| | | | | | | | | Change-Id: Ifbcf28bb476ee95343beaf42fb84a1b834c9ffcb BUG: 762989 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/4486 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>