Commit message | Author | Age | Files | Lines
* cluster/stripe: implement the coalesce stripe file format | Brian Foster | 2012-06-07 | 6 | -102/+882
    The coalesce file format for cluster/stripe condenses the striped files to a contiguous layout. The elimination of holes in striped files eliminates space wasted via local filesystem preallocation heuristics and significantly improves read performance.

    Coalesce mode is implemented with a new 'coalesce' xlator option, which is user-configurable and disabled by default. The format of newly created files is marked with a new 'stripe-coalesce' xattr. Cluster/stripe handles/preserves the format of files regardless of the current mode of operation (i.e., a volume can simultaneously consist of coalesced and non-coalesced files). Files without the stripe-coalesce attribute are assumed to have the traditional format to provide backward compatibility.

    extras/stripe-merge: support traditional and coalesce stripe formats

    Update the stripe-merge recovery tool to handle the traditional and coalesced file formats. The format of the file is detected automatically (and verified) via the stripe-coalesce attributes.

    BUG: 801887
    Change-Id: I682f0b4e819f496ddb68c9a01c4de4688280fdf8
    Signed-off-by: Brian Foster <bfoster@redhat.com>
    Reviewed-on: http://review.gluster.com/3282
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
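The difference between the two stripe layouts can be illustrated with a minimal sketch. This is not code from the patch: it assumes round-robin striping with a fixed stripe size, and the function names and offset arithmetic are illustrative assumptions.

```python
def traditional_offset(logical_offset, stripe_size, stripe_count):
    """Return (brick_index, offset_in_brick_file) for the traditional layout.

    Assumption: each brick file keeps its data at the logical offset, so the
    regions owned by the other bricks show up as holes.
    """
    brick = (logical_offset // stripe_size) % stripe_count
    return brick, logical_offset


def coalesced_offset(logical_offset, stripe_size, stripe_count):
    """Return (brick_index, offset_in_brick_file) for the coalesced layout.

    Assumption: each brick file stores only its own stripes, packed back to
    back with no holes in between.
    """
    stripe_index = logical_offset // stripe_size
    brick = stripe_index % stripe_count
    local_stripe = stripe_index // stripe_count
    return brick, local_stripe * stripe_size + logical_offset % stripe_size
```

With a 128 KiB stripe size and four bricks, logical offset 655370 lands on brick 1 in both layouts, but at offset 655370 in the traditional file and at offset 131082 in the coalesced file.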
* glusterd: generate node UUID lazily | Anand Avati | 2012-06-07 | 13 | -62/+70
    A commonly faced problem among glusterfs users is: after a fresh installation of glusterfs in a virtual machine, the VM image is cloned to make multiple instances of the server. This breaks glusterd because right after glusterfs installation on the first boot glusterd would have created the node UUID and this gets inherited into the clone. The result is weird behavior at the time of peer probe where glusterd does not (yet) deal with UUID collisions in a user-friendly way.

    This patch is for the 'prevention' of the issue. The approach here is to avoid generating a UUID on the first start of glusterd, but instead generate a node UUID only when a node UUID is found to be necessary. This naturally avoids the creation of node UUID on first boot and prevents the issue to a large extent.

    This issue also needs a 'cure' patch, which gives more meaningful error messages to the user and provides CLI to recover from the situations (gluster peer reset?)

    Change-Id: Ieaaeeaf76ed35385844e98a8e23fc3dd8df5a208
    BUG: 811493
    Signed-off-by: Anand Avati <avati@redhat.com>
    Reviewed-on: http://review.gluster.com/3533
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
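The lazy-generation idea can be sketched as follows. The class and method names are hypothetical and persistence to disk is left out; this only shows the "generate on first need" pattern, not glusterd's actual code.

```python
import uuid


class NodeIdentity:
    """Holds a node UUID that is created only when it is first requested."""

    def __init__(self):
        # Nothing is generated at start-up, so a cloned image carries no UUID.
        self._uuid = None

    @property
    def generated(self):
        return self._uuid is not None

    def get(self):
        # Generate on first use, then keep returning the same value.
        if self._uuid is None:
            self._uuid = uuid.uuid4()
        return self._uuid
```

A freshly started node reports `generated == False` until some operation calls `get()`.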
* io-cache,quick-read: bring down log level | Anand Avati | 2012-06-07 | 2 | -4/+5
    Log messages were unnecessarily in INFO level.

    The two functions with the same name were non-static and actually the quick-read's call landed in the io-cache's version:

    [2012-06-07 17:02:29.848667] I [io-cache.c:1549:check_cache_size_ok] 0-single-master-io-cache: Max cache size is 33791991808
    [2012-06-07 17:02:29.848751] I [io-cache.c:1549:check_cache_size_ok] 0-single-master-quick-read: Max cache size is 33791991808

    Changed them to static declaration.

    Change-Id: Id9daf9593b2832e4c261f95eac6181efea8899a5
    BUG: 765227
    Signed-off-by: Anand Avati <avati@redhat.com>
    Reviewed-on: http://review.gluster.com/3536
    Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
* libglusterfs : Fix validation for integer volume options. | Kaushal M | 2012-06-07 | 3 | -10/+39
    Integer volume options which specified only the min value as 0, would not be validated during "volume set". The range check for an option happened only if both min and max were not 0. In the above case, even though a minimum was specified, the range check did not happen as both min and max were 0.

    To allow forced validation in such cases, a new member, "validate", has been added to volume_options_t. This member takes the values GF_OPT_VALIDATE_BOTH, GF_OPT_VALIDATE_MIN and GF_OPT_VALIDATE_MAX (GF_OPT_VALIDATE_BOTH is the default).

    Change-Id: I351de0eedb6028120e5c0b073ee5d9c141dee717
    BUG: 809847
    Signed-off-by: Kaushal M <kaushal@redhat.com>
    Reviewed-on: http://review.gluster.com/3084
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
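A minimal sketch of the validation rule described above, assuming the default mode keeps the old "min == max == 0 means no range" convention. The numeric values of the constants and the function name are assumptions for illustration only.

```python
# Constant names come from the commit message; the numeric values are assumed.
GF_OPT_VALIDATE_BOTH = 0
GF_OPT_VALIDATE_MIN = 1
GF_OPT_VALIDATE_MAX = 2


def int_option_in_range(value, opt_min, opt_max, validate=GF_OPT_VALIDATE_BOTH):
    """Return True if value passes the range check for the given mode."""
    if validate == GF_OPT_VALIDATE_MIN:
        # Forced lower-bound check, even when opt_min is 0.
        return value >= opt_min
    if validate == GF_OPT_VALIDATE_MAX:
        # Forced upper-bound check, even when opt_max is 0.
        return value <= opt_max
    # Default mode (assumed): min == max == 0 still means "no range given".
    if opt_min == 0 and opt_max == 0:
        return True
    return opt_min <= value <= opt_max
```

With `opt_min == opt_max == 0`, a value of -5 slips through in the default mode but is rejected once the option is declared with `GF_OPT_VALIDATE_MIN`.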
* mount.glusterfs: Add support for {attribute,entry}-timeout options | Kaushal M | 2012-06-06 | 1 | -0/+11
    Change-Id: Ib41a2537ac86513a008029fca818951706a144f7
    BUG: 829279
    Signed-off-by: Kaushal M <kaushal@redhat.com>
    Reviewed-on: http://review.gluster.com/3530
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: Check for null gfid_req | Pranith Kumar K | 2012-06-06 | 1 | -1/+1
    gfid_req is set only by the fuse xlator. Fresh lookups performed by the self-heal daemon and rebalance will not have a gfid at all.

    Change-Id: I6712e3063067ecc5f19956e75d28c86bfc19fc65
    BUG: 829203
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/3529
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* protocol/client: Remember the gfid of opened fd | Pranith Kumar K | 2012-06-06 | 3 | -112/+107
    This is needed when a fresh lookup triggers self-heal, because the gfid won't be present in the inode yet. A similar situation happens with rebalance, as it does not perform inode_link. Added a similar fix for re-opendir. Removed inode from fdctx and removed some duplicated code.

    Change-Id: Ic94e5738c8585ed86801d2eed9ddab1015246710
    BUG: 826080
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/3517
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* rpc-transport/socket: fix the state machine for XDATA reading | Anand Avati | 2012-06-06 | 2 | -11/+40
    The socket state machine was broken for reading XDATA on the server. This code was structured such that when there was a partial read in a particular state, some variables would remain uninitialized in the next 'run' of the state machine.

    Also did some re-org of the state machine with two more states to make the code more readable and similar in state-breakup pattern to the other states.

    Change-Id: Ia32c78d4b9567bb08c6df8dc9fd6f05749d312a4
    BUG: 829062
    Signed-off-by: Anand Avati <avati@redhat.com>
    Reviewed-on: http://review.gluster.com/3524
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* mount/fuse: use correct fdctx to inherit direct-io-values from. | Raghavendra G | 2012-06-06 | 1 | -1/+1
    Change-Id: Ifea178f4dbe57720c16dc3851b262952f3d81159
    BUG: 762533
    Signed-off-by: Raghavendra G <raghavendra@gluster.com>
    Reviewed-on: http://review.gluster.com/3531
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: nfs.disable fix for "volume set help" | Kaushal M | 2012-06-06 | 1 | -7/+12
    Fixes volgen to include "nfs.disable" in output of "volume set help". Also fixes some incorrect entries in glusterd_volopt_map.

    Change-Id: Ica5edf1ece31f9daa040fcdf559c1643ecdfd568
    BUG: 828027
    Signed-off-by: Kaushal M <kaushal@redhat.com>
    Reviewed-on: http://review.gluster.com/3509
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* rpc: avoid an invalid free of item on a list | Jeff Darcy | 2012-06-05 | 1 | -0/+1
    If we actually "consumed" vol_opt by putting it on THIS->volume_options, it's still in use and we shouldn't free it before returning.

    Change-Id: I8ef3e4ce8a8b9f2552faa3345f1686e173d1aa10
    BUG: 829104
    Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
    Reviewed-on: http://review.gluster.com/3528
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* glusterfsd: further fd leak fixes for graph change | Csaba Henk | 2012-06-05 | 1 | -0/+6
    Change-Id: I8e23d6bb95cddbb3862c524d79d1a956956b7a51
    BUG: 789278
    Signed-off-by: Csaba Henk <csaba@redhat.com>
    Reviewed-on: http://review.gluster.com/3527
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* cli: Fix error output for peer probe on address validation failure | Kaushal M | 2012-06-05 | 1 | -0/+2
    Displays an error message and sets proper return value on failure of address validation in peer probe.

    Change-Id: I5ced5524040e19a95dc832b6f676874983d0f2a7
    BUG: 817648
    Signed-off-by: Kaushal M <kaushal@redhat.com>
    Reviewed-on: http://review.gluster.com/3520
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* Use linkat(2) when linking on symlink | Emmanuel Dreyfus | 2012-06-05 | 1 | -0/+18
    link(2) behavior is not standardized when it comes to symlinks. BSD links to the symlink target (and fails if it does not exist), Linux links to the symlink itself. Use linkat(2) instead of link(2) in order to get portable behavior.

    BUG: 764655
    Change-Id: If7f6f17b48a4ccf8827c3795ec147306df6b5542
    Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
    Reviewed-on: http://review.gluster.com/3507
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
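The effect of choosing the symlink policy explicitly can be shown with a small sketch. It uses Python's `os.link` with `follow_symlinks=False`, which, to my understanding, is routed through linkat(2) with no follow flag on platforms that provide it. Which policy the patch itself selects is not stated in the commit message, so the choice below is only an illustration, and the helper name is hypothetical.

```python
import os
import tempfile


def link_symlink_itself(src, dst):
    """Hard-link src to dst without following src if it is a symlink."""
    os.link(src, dst, follow_symlinks=False)


def demo():
    """Return True if the new hard link is itself a symlink."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target")
        symlink = os.path.join(tmp, "symlink")
        linked = os.path.join(tmp, "linked")
        with open(target, "w") as handle:
            handle.write("data")
        os.symlink(target, symlink)
        link_symlink_itself(symlink, linked)
        return os.path.islink(linked)
```

Stating the policy in the call removes the dependence on whatever link(2) happens to do on the host platform.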
* core: coverity fixes (mostly resource leak fixes) | Amar Tumballi | 2012-06-05 | 21 | -49/+162
    Currently working on obvious resource leak reports in Coverity.

    Change-Id: I261f4c578987b16da399ab5a504ad0fda0b176b1
    Signed-off-by: Amar Tumballi <amarts@redhat.com>
    BUG: 789278
    Reviewed-on: http://review.gluster.com/3265
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Set errstr for duplicate add-brick | Kaushal M | 2012-06-05 | 1 | -0/+3
    Sets op_errstr when add-brick is given a duplicate brick.

    Change-Id: I7b8f8139f9f09834a71a5abc725692b145896830
    BUG: 803336
    Signed-off-by: Kaushal M <kaushal@redhat.com>
    Reviewed-on: http://review.gluster.com/3519
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* replicate: default read_child to a local brick if there is one. | Jeff Darcy | 2012-06-05 | 7 | -5/+165
    Controlled by the "choose-local" option (on by default).

    Change-Id: I560f27c81703f2c9c62fdb51532c8eb763826df7
    BUG: 806462
    Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
    Reviewed-on: http://review.gluster.com/3005
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* fuse: be good at suicide | Csaba Henk | 2012-06-05 | 1 | -1/+1
    We get hung on the exit path if we kill only the current thread on AUTH_FAILED. Kill the current process instead.

    Change-Id: I36042f245a22bd2a284df37fd6d3a3e0b76f81e9
    BUG: 826975
    Signed-off-by: Csaba Henk <csaba@redhat.com>
    Reviewed-on: http://review.gluster.com/3523
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* Self-heald: inode_link files while crawling | Pranith Kumar K | 2012-06-05 | 1 | -5/+10
    Change-Id: I559a3ff507b9487b1dfca7871c188a05d89ea6d6
    BUG: 826580
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/3515
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* posix: fix the 'ENOENT' logs for setxattr() | Amar Tumballi | 2012-06-04 | 1 | -6/+7
    From marker, setxattr() is attempted on the path even after the unlink() happens if the fd is still active. In such cases, we should not be logging the failures.

    Change-Id: Icdd9c951f0d331cdda0bec42ae343302b2dbafde
    BUG: 766611
    Signed-off-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-on: http://review.gluster.com/3514
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* mount.glusterfs: update the glusterd WORKDIR | Amar Tumballi | 2012-06-04 | 1 | -1/+1
    Change-Id: I70d091611d314598412b5315adcbe1b5147a8773
    Signed-off-by: Amar Tumballi <amarts@redhat.com>
    BUG: 824231
    Reviewed-on: http://review.gluster.com/3513
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* libglusterfs: valid_host_name() fix | Kaushal M | 2012-06-03 | 1 | -1/+1
    Fix valid_host_name() to allow single character hostnames.

    Change-Id: I72527ecedec52fa47336d95b0586eb18dac6273d
    BUG: 827403
    Signed-off-by: Kaushal M <kaushal@redhat.com>
    Reviewed-on: http://review.gluster.com/3508
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* protocol/client: do not ignore the xdata received for some fops | Raghavendra Bhat | 2012-06-03 | 1 | -0/+11
    opendir, fsetattr, fsync, lk were sending NULL xdata to the server even though it (xdata) had values within it.

    Change-Id: Ic274ab903c5c1e443409dd250ede80cd85d10b36
    BUG: 826923
    Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
    Reviewed-on: http://review.gluster.com/3502
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* Use inet as default listener | Emmanuel Dreyfus | 2012-06-01 | 5 | -27/+18
    This patch was proposed by Anand Babu Periasamy on gluster-devel@. It fixes the inet/inet6 mismatch between client/glusterfsd/glusterd on my setup.

    BUG: 764655
    Change-Id: I172570aa58ea08c4c74cfd28f121d3d4e02a55e0
    Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
    Reviewed-on: http://review.gluster.com/3319
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Tested-by: Anand Babu Periasamy <abperiasamy@gmail.com>
    Reviewed-by: Anand Babu Periasamy <abperiasamy@gmail.com>
* glusterd-volgen: by default include 'cluster/distribute' in volfile | Amar Tumballi | 2012-06-01 | 1 | -6/+9
    Include 'cluster/distribute' even if there is just one brick in the volume. That way, the directories will have some of the required extended attributes on them before an 'add-brick'.

    This fixes applications erroring out when an 'add-brick' is done on a volume that previously had only one brick.

    Change-Id: Ie9d559e6b26aafd3d67908ab20a006e4e5e70d73
    Signed-off-by: Amar Tumballi <amarts@redhat.com>
    BUG: 815227
    Reviewed-on: http://review.gluster.com/3213
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
    Reviewed-by: Shishir Gowda <shishirng@gluster.com>
    Reviewed-by: Raghavendra G <raghavendra@gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* cli: let commands specify the exit value in batch mode | Csaba Henk | 2012-06-01 | 2 | -4/+4
    Old behavior: when cli is run in batch mode (a sequence of commands is fed to it on stdin), if a command returns an error (i.e., -1), the cli exits upon it with 255 (-1 on 8 bit).

    New behavior: consider any non-zero return from cli commands as error and use the negative of that return value as exit value, thus giving control to cli commands over the exit value, while (as of the existing command set) adhering to the convention of exiting with 1 on error.

    Spotted upon stumbling upon mount/umount commands which did want to exit with 1 on error but that was not possible as of old behavior.

    Change-Id: I6f41191cdc718c3e676cfae1e404152f4cb715c5
    BUG: 765214
    Signed-off-by: Csaba Henk <csaba@redhat.com>
    Reviewed-on: http://review.gluster.com/3218
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
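The arithmetic behind the old and new exit values can be sketched as below. The function names are hypothetical; the sketch only models the fact that a process exit status keeps the low 8 bits of the value passed to exit.

```python
def old_exit_status(ret):
    """Old behavior: the raw return value is used, truncated to 8 bits."""
    return ret & 0xFF


def new_exit_status(ret):
    """New behavior: any non-zero return is an error, exit with its negative."""
    if ret == 0:
        return 0
    return (-ret) & 0xFF
```

A command returning -1 used to surface as exit status 255 and now surfaces as 1, while a command returning -2 can deliberately exit with 2.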
* mount.glusterfs: enhance option 'transport=' for 'rdma' | Amar Tumballi | 2012-06-01 | 1 | -0/+10
    Change-Id: I9e05cc8f4b73c6a83a4be956423f4e209237c215
    Signed-off-by: Amar Tumballi <amarts@redhat.com>
    BUG: 798163
    Reviewed-on: http://review.gluster.com/2855
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* features/locks: update user_flock structure before inserting | Pranith Kumar K | 2012-06-01 | 1 | -0/+3
    Change-Id: Idfa00e4f3263d50b327f5a2c6f13ec68ffc8fbee
    BUG: 805994
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/3048
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* protocol/client: provide a buffer for storing reply of readlink. | Raghavendra G | 2012-06-01 | 1 | -7/+46
    Since a readlink response can be bigger than the size of RDMA messages that can be inlined, we need to provide a buffer where the server can do an rdma-write of the response.

    Change-Id: I6ab06c3a94702f810ab0c57b409aaaf35cc93057
    BUG: 822337
    Signed-off-by: Raghavendra G <raghavendra@gluster.com>
    Reviewed-on: http://review.gluster.com/3464
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: fix issues with volume reset handling | Csaba Henk | 2012-06-01 | 2 | -22/+35
    - properly resolve shortened key names
    - make sure user gets decent feedback

    Change-Id: I94b75f34b29cb71fb1a2edf17c3f1bf841bb552a
    Signed-off-by: Csaba Henk <csaba@redhat.com>
    BUG: 826958
    Reviewed-on: http://review.gluster.com/3500
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: regenerate brick vol-files on upgrade | Pranith Kumar K | 2012-06-01 | 2 | -4/+8
    If the upgrade/downgrade option is set, glusterd terminates after the volfiles are regenerated. The 'sleep 10' hack is no longer needed.

    BUG: 825872
    Change-Id: I12e666eb871aad7e7efa954b9307993952745d92
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/3482
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: makefile typo fix. | Amar Tumballi | 2012-06-01 | 1 | -1/+1
    Instead of /var/lib/glusterd, the symlink was pointing to /var/log/glusterd.

    Change-Id: I485ad8d6cc8535378179621dea7539328d22454c
    Signed-off-by: Amar Tumballi <amarts@redhat.com>
    BUG: 824231
    Reviewed-on: http://review.gluster.com/3503
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* core: xlator option framework cleanups | Csaba Henk | 2012-06-01 | 2 | -8/+7
    - Upon init/reconf, if an option is not in the opt dict and no default value is specified either, null it out. With this, the xlator config data that comes out of init/reconf becomes deterministic in terms of the xlator option declarations and the incoming option value dictionary. (Needed for correct operation of volume reset.)
    - We can safely rely on the guarantee given by init/reconf that no NULL value is passed to the converter functions. Drop the spurious null check of not_null(), and rename it to pass().

    Change-Id: Ifa068bcc0275456c01ed00a3a315a985eb262e49
    BUG: 765147
    Signed-off-by: Csaba Henk <csaba@redhat.com>
    Reviewed-on: http://review.gluster.com/3505
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
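The "null it out" rule from the first point can be sketched as follows. The function and the option names in the usage note are hypothetical, and `None` stands in for a nulled-out C value; this is a model of the described behavior, not the framework's code.

```python
def resolve_options(declared, incoming):
    """Compute the xlator config from declarations and incoming values.

    declared maps option name -> default value (None if no default).
    incoming maps option name -> value supplied at init/reconf.
    """
    resolved = {}
    for name, default in declared.items():
        if name in incoming:
            resolved[name] = incoming[name]
        elif default is not None:
            resolved[name] = default
        else:
            # Neither supplied nor defaulted: null it out instead of
            # leaving whatever a previous init/reconf stored.
            resolved[name] = None
    return resolved
```

Because the result depends only on the declarations and the incoming dictionary, a reconf with an empty dictionary (as after a volume reset) returns every option to its default or to `None`.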
* nfs/nlm: when setting nlmclnt->rpc_clnt, do not overwrite old rpc_clnt | Krishna Srinivas | 2012-06-01 | 1 | -18/+5
    Change-Id: I01a1c0c0c8d3402b8fe061258001eea2c0029e83
    BUG: 819518
    Signed-off-by: Krishna Srinivas <ksriniva@redhat.com>
    Reviewed-on: http://review.gluster.com/3419
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* Add server-side aux-GID resolution. | Jeff Darcy | 2012-06-01 | 6 | -8/+217
    Change-Id: I09bcbfa41c7c31894ae35f24086bef2d90035ccc
    BUG: 827457
    Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
    Reviewed-on: http://review.gluster.com/3241
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* protocol/server: do not wind opendir call if fd creation fails | Raghavendra Bhat | 2012-06-01 | 3 | -3/+13
    If resolve fails in some fd based operation, then do not use fd to get gfid (fd might be NULL). Use the gfid present in resolve structure.

    Change-Id: I1058274a2f9b4e58a76e4e6019e7c5ce1906d365
    BUG: 827376
    Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
    Reviewed-on: http://review.gluster.com/3504
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Brian Foster <bfoster@redhat.com>
    Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* Optimize for small dicts, and avoid an overrun. | Jeff Darcy | 2012-06-01 | 2 | -33/+68
    As dicts get used more and more in the I/O path (especially for xattrs and the new xdata feature), removing some of their inherent inefficiency becomes more important. This patch addresses some of the issues around allocating data_pair_t structures separately.

    Along the way, I found that the way we're allocating the "members" hash table was subtly wrong, and could lead to a memory overrun. This is a latent bug because nobody uses dict_get_new_full that way, but I added an assert to guard against that possibility. One beneficial side effect is that we now save four pointers' worth of space per dict, offsetting the extra space used for the new members.

    Change-Id: Ie8c3e49f1e584daec4b0d2d8ce9dafbc76fb57b2
    BUG: 827448
    Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
    Reviewed-on: http://review.gluster.com/3040
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* protocol/client: do not access the local object after being freed | Raghavendra Bhat | 2012-05-31 | 1 | -2/+2
    Change-Id: I2d3aeb084168b9ed68a670b91e09126917f82968
    BUG: 826588
    Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
    Reviewed-on: http://review.gluster.com/3494
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* mgmt/glusterd: self-heals should be on in glustershd | Pranith Kumar K | 2012-05-31 | 1 | -0/+8
    BUG: 825740
    Change-Id: I44829fb985f9c394b1e240e8ee7f8d026593add9
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/3481
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/dht: set conf->defrag to NULL after freeing the defrag structure | Raghavendra Bhat | 2012-05-31 | 1 | -2/+3
    Also no need to free the xlator object after rebalance is over, as the process is about to be killed.

    Change-Id: I6973e43c0353b5de61c0b39e52a22c618be361f4
    BUG: 826584
    Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
    Reviewed-on: http://review.gluster.com/3495
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/distribute: set the inode layout in readdirp_cbk() for files | Amar Tumballi | 2012-05-31 | 1 | -0/+16
    With this, inode-linking it in readdirp_cbk will be neater.

    Change-Id: Ie2cd646438f851e1755e9b6a3fc9898059bee359
    Signed-off-by: Amar Tumballi <amar@gluster.com>
    BUG: 816140
    Reviewed-on: http://review.gluster.com/2717
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* replicate: add hashed read-child method. | Jeff Darcy | 2012-05-31 | 7 | -18/+94
    Both the first-to-respond method and the round-robin method are susceptible to clients repeatedly choosing the same servers across a series of opens, creating hot spots. Also, the code to handle a replica being down will ignore both methods and just choose the first remaining (which is not an issue for two-way but can be otherwise). The hashed method more reliably avoids such hot spots. There are three values/modes.

    0: use the old (broken) methods.
    1: select a read-child based on a hash of the file's GFID, so all clients will choose the same subvolume for a file (ensuring maximum consistency) but will distribute load for a set of files.
    2: select a read-child based on a hash of the file's GFID plus the client's PID, so different children will distribute load even for one file.

    Mode 2 will probably be optimal for most cases. Using response time when we open the file is problematic, both because a single sample might not have been representative even then and because load might have shifted in the hours or days since (for long-lived files). Trying to use more current load information can lead to "herd following" behavior which is just as bad. Pseudo-random distribution is likely to be the best we can reasonably do, just as it is for DHT.

    Change-Id: I798c2760411eacf32e82a85f03bb7b08a4a49461
    BUG: 802513
    Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
    Reviewed-on: http://review.gluster.com/2926
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
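Modes 1 and 2 can be sketched as below. CRC32 is used purely as a stand-in hash, the 4-byte PID encoding is an assumption, and the function name is hypothetical; the patch's actual hash function is not named in the commit message. Mode 0 and the handling of a replica being down are not modelled.

```python
import zlib


def pick_read_child(mode, gfid, client_pid, child_count):
    """Return the index of the read child for a file.

    gfid is the file's 16-byte GFID, client_pid the client's process ID.
    """
    if mode == 1:
        # Same choice for every client: hash the GFID only.
        key = gfid
    elif mode == 2:
        # Spread one file across children: mix in the client's PID.
        key = gfid + client_pid.to_bytes(4, "big")
    else:
        raise ValueError("only modes 1 and 2 are modelled in this sketch")
    return zlib.crc32(key) % child_count
```

In mode 1 the result is independent of `client_pid`, which is what makes all clients agree on one subvolume per file.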
* distribute: support user-specified layouts. | Jeff Darcy | 2012-05-31 | 4 | -3/+15
    The new type is DHT_HASH_TYPE_DM_USER=1 (on disk in network byte order) and we treat it the same as DHT_HASH_TYPE_DM except that we don't stomp on it during rebalance.

    Change-Id: I893571a9b89577acdea2fe868915b18d3663fd77
    BUG: 807312
    Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
    Reviewed-on: http://review.gluster.com/3004
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
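What "on disk in network byte order" means for the type field can be sketched as below. The 32-bit width of the field and the helper names are assumptions; only the value 1 for DHT_HASH_TYPE_DM_USER comes from the commit message.

```python
import struct

DHT_HASH_TYPE_DM_USER = 1  # value stated in the commit message


def encode_hash_type(hash_type):
    """Encode the type as an (assumed) 32-bit big-endian integer."""
    return struct.pack("!I", hash_type)


def decode_hash_type(raw):
    """Decode the type from its on-disk big-endian form."""
    return struct.unpack("!I", raw)[0]


def rebalance_may_overwrite(raw):
    """User-specified layouts are left alone during rebalance."""
    return decode_hash_type(raw) != DHT_HASH_TYPE_DM_USER
```

Under these assumptions the user type is stored as the bytes `00 00 00 01` regardless of the host's endianness.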
* mgmt/glusterd: Do shd validation for replicate volumes | Pranith Kumar K | 2012-05-31 | 1 | -1/+13
    Staging needs to build graphs for replicate volumes in stopped state as well.

    Change-Id: I6474cd0fc43c9fa1916826d4a452f301fe7fe811
    BUG: 823128
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/3489
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Csaba Henk <csaba@redhat.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* storage/posix: Prevent gfid handle leaks | Pranith Kumar K | 2012-05-31 | 1 | -4/+6
    The case which can lead to gfid handle leaks: Self-heal removes directory '/d' with 10 files in it, in brick b1. This dir is renamed to <landfill>/<hashval of '<brick-path>/d'> by posix. Before the janitor thread could remove the directory, self-heal could remove another directory with same path '/d'. Then again the rename to same path is done by posix as before. The gfid handles of the old '/d' and the 10 files in it are not unlinked.

    To prevent such problems, rename the directory to be removed to <landfill>/<gfid-str>.

    Change-Id: Iad13708e1ebcc5222b64c058aa9a2d372e1bfa5b
    BUG: 811970
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/3159
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
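The naming collision described above can be reproduced with a small sketch. SHA-256 is only a stand-in for whatever path hash posix used, and the function names, brick path and GFID strings are hypothetical.

```python
import hashlib


def landfill_name_by_path(brick_path, rel_path):
    """Old scheme: name derived from a hash of '<brick-path><rel-path>'."""
    full_path = brick_path.rstrip("/") + rel_path
    return hashlib.sha256(full_path.encode()).hexdigest()


def landfill_name_by_gfid(gfid_str):
    """New scheme: name is the directory's own GFID string."""
    return gfid_str


# Two successive directories that both lived at '/d' on the same brick.
first_gfid = "11111111-1111-1111-1111-111111111111"
second_gfid = "22222222-2222-2222-2222-222222222222"
path_names = {landfill_name_by_path("/bricks/b1", "/d") for _ in range(2)}
gfid_names = {landfill_name_by_gfid(g) for g in (first_gfid, second_gfid)}
```

The path-based scheme yields a single landfill name for both directories, so the second rename lands on the first; the GFID-based scheme yields two distinct names.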
* storage/posix: Move landfill inside .glusterfs | Pranith Kumar K | 2012-05-31 | 8 | -66/+116
    Change-Id: Ia2944f891dd62e72f3c79678c3a1fed389854a90
    BUG: 811970
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/3158
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd-hooks: added support for separate namespace for 'volume set' keys | Amar Tumballi | 2012-05-31 | 2 | -0/+32
    The keys in the above mentioned namespace could be used by hook scripts to perform tasks on 'special' keys as defined by the storage admin.

    The choice of the key and its semantics are outside the scope of glusterd. It is the responsibility of the storage admin to keep the meaning of the key(s) consistent.

    If a user gives a command like 'gluster volume set <VOLNAME> user.for-this-key do-this', scripts would get 'user.for-this-key=do-this' as argument.

    Change-Id: I5509e17d99e4ddd8bf5df968dcd51ff9a80dc3ab
    Signed-off-by: Amar Tumballi <amarts@redhat.com>
    BUG: 825902
    Reviewed-on: http://review.gluster.com/3443
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Krishnan Parthasarathi <kp@gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
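How a key/value pair turns into the argument a hook script receives can be sketched as below. The 'user.' prefix is inferred from the example in the message, and both function names are hypothetical; how glusterd actually invokes hook scripts is not shown here.

```python
USER_NAMESPACE = "user."  # prefix inferred from the example key


def is_user_key(key):
    """Return True if the key belongs to the admin-defined namespace."""
    return key.startswith(USER_NAMESPACE) and len(key) > len(USER_NAMESPACE)


def hook_argument(key, value):
    """Format a 'volume set' pair the way the hook script would see it."""
    if not is_user_key(key):
        raise ValueError("not a user-namespace key: " + key)
    return key + "=" + value
```

For the command in the message, `hook_argument("user.for-this-key", "do-this")` gives `user.for-this-key=do-this`.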
* glusterd: cut out a standalone socket path calculation routine | Csaba Henk | 2012-05-31 | 2 | -7/+16
    Change-Id: If5f196c9154ea59e37b83d3e4cad445fee6e9d45
    BUG: 826512
    Signed-off-by: Csaba Henk <csaba@redhat.com>
    Reviewed-on: http://review.gluster.com/3490
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Pranith Kumar Karampuri <pranithk@gluster.com>
* logging: change the 'logfile' value in a locked region | Amar Tumballi | 2012-05-30 | 1 | -88/+91
    'logfile' is a global variable, and it can change if the log-rotate command is issued. Currently 'fprintf(logfile)' happens in a locked region, whereas the 'fclose(logfile)' can happen outside the locked region, causing racy behavior.

    Change-Id: I40871e5c365303b7c602e2c302b085d64f6b945f
    Signed-off-by: Amar Tumballi <amarts@redhat.com>
    BUG: 826032
    Reviewed-on: http://review.gluster.com/3493
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
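The fix amounts to swapping and closing the log stream under the same lock that protects writes. A minimal sketch of that pattern, with a hypothetical `Logger` class standing in for the global 'logfile' and its lock:

```python
import io
import threading


class Logger:
    """Writes and rotation share one lock, so a write never sees a closed file."""

    def __init__(self, stream):
        self._lock = threading.Lock()
        self._logfile = stream

    def log(self, message):
        with self._lock:
            self._logfile.write(message + "\n")

    def rotate(self, new_stream):
        # Swap and close inside the locked region, never outside it.
        with self._lock:
            old_stream = self._logfile
            self._logfile = new_stream
            old_stream.close()
```

Any writer that acquires the lock after `rotate()` returns is guaranteed to see the new stream.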
* protocol: do not log getxattr/ENODATA as warning | Anand Avati | 2012-05-29 | 2 | -2/+2
    When SELinux is enabled, most of the files do not have labels and the result is a ton of unnecessary logs.

    Change-Id: I0e781e2fb6bcfb3fb12298175a41f7b981af9c39
    BUG: 811217
    Signed-off-by: Anand Avati <avati@redhat.com>
    Reviewed-on: http://review.gluster.com/3486
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>