summaryrefslogtreecommitdiffstats
path: root/tests/bugs
Commit message (Collapse)AuthorAgeFilesLines
* dht: fix rename raceJeff Darcy2014-07-171-0/+102
| | | | | | | | | | | | | | | | | | | | | | | If two clients try to rename the same file at the same time, we sometimes end up with *no file at all* in either the old or new location. That's kind of bad. The culprit seems to be some overly aggressive cleanup code. AFAICT, based on today's study of the code, the intent of the changed section is to remove any linkfile we might have created before the actual rename. However, what we're removing might not be our extra link. If we're racing with another client that's also doing a rename, it might be the only remaining link to the user's data. The solution, which is good enough to pass this test but almost certainly still not complete, is to be more selective about when we do this unlink. Now, we only do it if we know that, at some point, we did in fact create the link without error (notably ENOENT on the source or EEXIST on the destination) ourselves. Change-Id: I8d8cce150b6f8b372c9fb813c90be58d69f8eb7b BUG: 1117851 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/8269 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* mgmt/glusterd: do not check for snapd handle in restore if uss is disabledRaghavendra Bhat2014-07-151-0/+24
| | | | | | | | | Change-Id: I01afe64685a5794cce9265580c6c5de57a045201 BUG: 1119582 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/8310 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaushal M <kaushal@redhat.com>
* cli: Changed "rebalance start" outputVenkatesh Somyajulu2014-07-141-0/+29
| | | | | | | | | | | Change-Id: Ie87f1a2107b07a6e519ed894a74edf3b3e0a8340 BUG: 1063230 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/6946 Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Regression test portability: dd usageEmmanuel Dreyfus2014-07-1418-32/+32
| | | | | | | | | | | | | | NetBSD, FreeBSD, and MacOS X dd(1) bs argument uses m for megabyte, while Linux uses M. Use bs=1024k instead of bs=1M for better compatibility. BUG: 764655 Change-Id: I603f57adbc9b31f6d634b918726437fbfce42e03 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8278 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-by: Justin Clift <justin@gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/gfid-access: error handling for entry creationVenky Shankar2014-07-141-0/+32
| | | | | | | | | | | | | | | | | | | | | | | Proceed with setattr() only on a successfull entry creation. Winding a setattr() using a freshlyOC initiated inode would most likely fail in one translator or the other (e.g. DHT expecting the layout information to be set in the inode context), which is the case if the inode was not looked up. Therefore, gfid-access handles failure entry creations and passes the _correct_ errno back to the client instead of continuing with setattr() call and probably returning back incorrect errno. Also, filling up inode->gfid is required as the new inode is not looked up and ->gfid would be certainely required for inode operations. Change-Id: Ie92f5647a89bf558c07710ab0400bce69d59fc31 BUG: 1111490 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/8260 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* dht: support heterogeneous brick sizesJeff Darcy2014-07-121-19/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Calculation of layouts now considers the size of each brick, so that smaller bricks don't get an "unfair" share of allocations and start returning ENOSPC while the larger bricks still have plenty of space. The observation has been made that some clients might get ENOTCONN when trying to fetch disk-size information, and end up calculating layouts differently. The following meta-observations can be made. (1) This scenario is extremely unlikely in configurations with AFR. (2) The most likely consequence of this scenario is that some files will be placed sub-optimally by the client with the obsolete (non-weighted) layout. They'll still be found anyway, so this isn't a show stopper. (3) Without this patch it's *guaranteed* that some files will be placed sub-optimally, because any layout that fails to account for brick sizes is sub-optimal. (4) We shouldn't be doing fix-layout from two nodes simultaneously anyway. That's inefficient at best. Any instances of such behavior are separate bugs, which should be fixed separately. (5) In the most extreme edge case, two nodes doing weighted and non-weighted layout fixes could race and end up creating an internally inconsistent layout. This condition is still transient; it will be detected and repaired automatically the next time anyone fetches the layout. (If it's not that's also a preexisting bug that can show up in other contexts.) In conclusion, it's not the purpose of this patch to fix bugs elsewhere in DHT. Its purpose is to make life incrementally better for users who add new hardware with larger disks etc. than the older equipment. It's only one part of an ongoing process to improve layout management and repair, all the way up to support for multiple hash rings or tiering. Change-Id: I05eb6f9eface9cdaf8622e0260c8c7f29020447f BUG: 1114680 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/8093 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* tests: Fixed spurious failure in bug-887098-gmount-crash.tXavier Hernandez2014-07-111-8/+2
| | | | | | | | | | | | | | | | | | | | This script was trying to see if the mount process died by doing a 'ps ax' and a grep of the original pid in the results. After that the pid of the first line returned by grep was compared to the original pid. This method can lead to false negatives because it's possible that the original pid appears in some other part of the 'ps ax' list. This patch uses get_mount_process_pid() from volume.rc to check if the process is still alive. Change-Id: I0285366e601a146793c47e9c1156a4bb36d6fcb3 BUG: 1092850 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/8286 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/changelog: prevent deadlock on thread cancellationVenky Shankar2014-07-111-0/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | helper threads (fsync, rollover) wake up periodically and perform their respective operation under a lock (crt->lock). These threads are also subjected to cancellation under some circumstance such as disabling changelog. This is inherently dangerous when funtions which are cancellation points for pthread_cancel(3) are used in the locked region. Consider this pthread_mutex_lock(&mutex); { /* ... */ ret = fsync (fd); <-- cancellation point /* ... */ } pthread_mutex_unlock(&mutex); A pthread_cancel(3) by another thread just before fsync(3) but after pthread_mutex_lock(3) would result in the thread getting cancelled when fsync(3) is invoked, thereby never unlocking the mutex. Moreover, in case of changelog translator, the locked region (under crt->lock in changelog-rt.c) is also the code path for fop changelog updation. Therefore, unlocking the mutex in thread cleanup handler (pthread_cleanup_pop(3)) might prematurely release the mutex during fop updation path. This patch fixes such problems existing in fsync and rollover threads. Fix is to enter the locked region with cancellation disabled and enable it after mutex unlock. Also, test for a cancellation request early on in case none of the functions are cancellation points. Change-Id: I1795627a12827609c1da659d07fc1457ffa033de BUG: 1110917 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/8106 Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd/regression: Temp fix for spurious errJoseph Fernandes2014-07-101-2/+5
| | | | | | | | | | | | | As discussed in the mail, Disabling the checking of snap brick status until the investigation is done on the port bind issue. Change-Id: I8854cee050de1b7f843e3d40631b6cb61fd8583e BUG: 1112559 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/8259 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* socket/glusterd/client: enable SSL for managementJeff Darcy2014-07-101-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The feature is controlled by presence of the following file: /var/lib/glusterd/secure-access See the comment near the definition of SECURE_ACCESS_FILE in glusterfs.h for the rationale. With this enabled, the following rules apply to connections: UNIX-domain sockets never have SSL. Management-port sockets (both connecting and accepting, in daemons and CLI) have SSL based on presence of the file. Other IP sockets have SSL based on the existing client.ssl and server.ssl volume options. Transport multi-threading is explicitly turned off in glusterd (it would otherwise be turned on when SSL is) due to multi-threading issues. Tests have been elided to avoid risk of leaving a file which will cause all subsequent tests to run with management SSL still enabled. IMPLEMENTATION NOTE The implementation is a bit messy, and consists of two stages. First we decide whether to set the relevant fields in our context structure, based on presence of the sentinel file OR a command-line override. Later we decide whether a particular connection should actually use SSL, based on the context flags plus what kind of connection we're making[1] and what kind of daemon we're in[2]. [1] inbound, outbound to glusterd port, other outbound [2] glusterd, glusterfsd, other TESTING NOTE Instead of just running one special test for this feature, the ideal would be to run all tests with management SSL enabled. However, it would be inappropriate or premature to set up an optional feature in the patch itself. Therefore, the method of choice is to submit a separate patch on top, which modifies "cleanup" in include.rc to recreate the secure-access file and associated SSL certificate/key files before each test. Change-Id: I0e04d6d08163893e24ec8c031748c5c447d7f780 BUG: 1114604 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/8094 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* nfs: prevent assertion error with MOUNT over UDPNiels de Vos2014-07-071-0/+24
| | | | | | | | | | | | | | | | | | | | | | | The MOUNT service over UDP runs in a separate thread. This thread does not have the correct *THIS xlator set. *THIS points to the global (base) xlator structure, but GF_CALLOC() requires it to be the NFS-xlator so that assertions can get validated correctly. This is solved by passing the NFS-xlator to the pthread function, and setting the *THIS pointer explicitly in the new thread. It seems that on occasion (needs further investigation) MOUNT over UDP does not unregister itself. There can also be issues when the kernel NLM implementation has been registered at portmap/rpcbind, so adding some unregister procedures in the cleanup of the test-cases. Change-Id: I3be5a420fc800bbcc14198d0b6faf4cf2c7300b1 BUG: 1116503 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/8241 Reviewed-by: Santosh Pradhan <spradhan@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* statedump: Don't print mem-type numbersPranith Kumar K2014-07-061-3/+3
| | | | | | | | | | | | Change-Id: I381bfa9535fe60c37758761d34b98dbbc4e5f569 BUG: 1114188 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8239 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com> Tested-by: Justin Clift <justin@gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* socket: add certificate-depth and cipher-list options for SSLJeff Darcy2014-07-041-0/+2
| | | | | | | | | | Change-Id: I82757f8461807301a4a4f28c4f5bf7f0ee315113 BUG: 1114604 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/8040 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd/snapshot: fixing glusterd quorum during snap operationJoseph Fernandes2014-07-041-0/+58
| | | | | | | | | | | | | | | | During a snapshot operation, glusterd quorum will be checked only on transaction peers, which are selected in the begin of the operation, and not on the entire peer list which is susceptible for change for any peer attach operation. Change-Id: I089e3262cb45bc1ea4a3cef48408a9039d3fbdb9 BUG: 1114403 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/8200 Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Kaushal M <kaushal@redhat.com> Tested-by: Kaushal M <kaushal@redhat.com>
* rpc/auth: allow SSL identity to be used for authorizationJeff Darcy2014-07-021-0/+1
| | | | | | | | | | | | | | | | | | | Access to a volume is now controlled by the following options, based on whether SSL is enabled or not. * server.ssl-allow: get identity from certificate, no password needed * auth.allow: get identity and matching password from command line It is not possible to allow both simultaneously, since the connection itself is either using SSL or it isn't. Change-Id: I5a5be66520f56778563d62f4b3ab35c66cc41ac0 BUG: 1114604 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/3695 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* mgmt/glusterd: use the right rpc for snapd while getting pending node rpcRaghavendra Bhat2014-07-011-2/+2
| | | | | | | | | | | | | * Also changed the testcase bug-1111041.t to correctly get the snapshot daemon's pid Change-Id: I22c09a1e61f049f21f1886f8baa5ff421af3f8fa BUG: 1111041 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/8209 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Sachin Pandit <spandit@redhat.com> Reviewed-by: Kaushal M <kaushal@redhat.com>
* mgmt/glusterd: display snapd status as part of volume statusRaghavendra Bhat2014-06-301-0/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | * Made changes to save the port used by snapd in the info file for the volume i.e. <glusterd-working-directory>/vols/<volname>/info This is how the gluster volume status of a volume would look like for which the uss feature is enabled. [root@tatooine ~]# gluster volume status vol Status of volume: vol Gluster process Port Online Pid ------------------------------------------------------------------------------ Brick tatooine:/export1/vol 49155 Y 5041 Snapshot Daemon on localhost 49156 Y 5080 NFS Server on localhost 2049 Y 5087 Task Status of Volume vol ------------------------------------------------------------------------------ There are no active volume tasks Change-Id: I8f3e5d7d764a728497c2a5279a07486317bd7c6d BUG: 1111041 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/8114 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaushal M <kaushal@redhat.com>
* tests: Trigger dir heal by lookupPranith Kumar K2014-06-301-8/+3
| | | | | | | | | | | | | | Heal full in v2 needs some improvements which Ravi is working on. Fixed the script to heal based on lookup from mount until then. Change-Id: I7b5f8a294019d9f8cfc9c2346d7997f31b4c3d7c BUG: 1092850 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8178 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd/snapshot: Correct the mount path checkAvra Sengupta2014-06-301-0/+38
| | | | | | | | | | | | | | | | | | | Before removing a lvm, we check if the lvm is mounted on the brick path. If not, we remove the brick path only. Correcting this check to support restore cases, where the volname is not the non-hyphanated uuid, but the original volume's name. Change-Id: If158f4651d36efa2f94523458faf826230e9c76a BUG: 1113975 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/8192 Reviewed-by: Vijaikumar Mallikarjuna <vmallika@redhat.com> Tested-by: Justin Clift <justin@gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Sachin Pandit <spandit@redhat.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* cluster/stripe: Fix EINVAL errors on quota enabled volumesKrutika Dhananjay2014-06-261-0/+25
| | | | | | | | | | | | | | | | | | Write operations on directories with quota enabled used to fail with EINVAL on stripe volumes. This was due to assert failure in stripe_lookup(), meant to ensure loc->path is not NULL. However, in nameless lookup (in this particular case triggered by quotad, which has stripe xlator in its graph), loc->path can be legitimately NULL. The fix involves removing this check in stripe_lookup(). Change-Id: Ibbd4f68763fdd8a85f29da78b3937cef1ee4fd1e BUG: 1100050 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/8145 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* tests: Change umount with force_umount with 5 retriesPranith Kumar K2014-06-2358-86/+115
| | | | | | | | | | | | | Change-Id: I0e2dbdfd34080328dfa6b4eebef0366f2b0fcb04 BUG: 1092850 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8117 Tested-by: Justin Clift <justin@gluster.org> Reviewed-by: Harshavardhana <harsha@harshavardhana.net> Tested-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-by: Sachin Pandit <spandit@redhat.com> Reviewed-by: Prashanth Pai <ppai@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/stripe: don't treat ESTALE as failure in lookupRavishankar N2014-06-231-0/+18
| | | | | | | | | | | | | | | | | | Problem: In a stripe volume, symlinks are created only on the first brick via the default_symlink() call. During gfid lookup, server sends ESTALE from the other bricks, which is treated as error in stripe_lookup_cbk() Fix: Don't treat ESTALE as error in stripe_lookup_cbk() Change-Id: Ie4ac8f0dfd3e61260161620bdc53665882e7adbd BUG: 1111454 Signed-off-by: Ravishankar N <root@ravi3.(none)> Reviewed-on: http://review.gluster.org/8135 Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* libglusterfs: Don't allow '-0' as input value for numbersPranith Kumar K2014-06-221-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: io-stats tries to init log-buf-size as uint32. All gf_string2u*** functions which get the unsigned values from string don't want the string to contain '-'. So the GF_OPTION_INIT with '-0' as value fails init in io-stats, but by that time 'ret' is already reset to 0. Io-stats ends up returning 0 even when init failed. Because of this caller of init thinks initialization is successful when it is not. iostat_xlator->private is still NULL. Because of this when a fop tries to access members of io-stat-private structure, it crashes. Fix: I initially thought may be we should fix all gf_string2u*** functions to accept '-0'. But all these functions are used only for setting volume options. If we accept '-0', gluster volume info shows output as follows: diagnostics.brick-log-buf-size: -0 This seemed ugly, so I felt it is better to disallow '-0' as valid input for numbers. Also fixed return value in cases of failures in io-stats. Change-Id: I67ac92853b6d2be70516ad1d07505ffd9f058aa4 BUG: 1111557 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8129 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Harshavardhana <harsha@harshavardhana.net> Tested-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/index: Don't delete current xattrop index.Ravishankar N2014-06-203-2/+31
| | | | | | | | | | | | Delete the base entry in indices/xattrop only when it is stale. Change-Id: I675c1510dd8293d068e31b552b0de48f50aac658 BUG: 1101647 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/8119 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* tests: changes to some of the uss testcasesRaghavendra Bhat2014-06-201-2/+2
| | | | | | | | | | | | | | Made the below changes tests/basic/uss.t: removed the older way of getting the list of snapshots bugs/bug/bug-1109770.t: added uss disable test also to check snapd behavior Change-Id: I57b6bc8fa82bcaa544f483ad382e1bb4d11ef122 BUG: 1092850 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/8081 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* tests: Provide force_umount with 5 retriesPranith Kumar K2014-06-181-2/+2
| | | | | | | | | | | Change-Id: I2b5784c48eedcccb17690de438addd29075926bd BUG: 1092850 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8104 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: Do layout self healing of directory for nameless lookupVenkatesh Somyajulu2014-06-171-0/+161
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Currently in the nameless lookup code path, if at the end of the lookup, even if it detects that layout anamolies are there, layout healing will not be done as there is no code to heal it. So there can be race between mkdir and lookup. Assume mkdir is going on from some other mount point, Say, M1. Directories are created on some nodes but layout is not set yet. Now from M2, nameless lookup goes, lookup will be success full as the directory is present on some of the nodes, but it won't heal layout. Now if create goes after lookup fop, because layout is absent, file creation will fail. Fix: Included the code of layout self-heal in the nameless lookup path. At the end of lookup, layout will be computed as it would have been in the named lookup, but it will be set to those node only, where directory is present. So after that if create fop goes, the probabiliy to get the subvolume with proper hash-range is high now, so reduces the race window. Other: Whenever a directory is created, we have to choose a brick from which we start allocating layout in a circular fashion. To calculate this starting brick, I have changed the candidate from name of the directory to gfid of the directory But to compute where a given file belongs, we will still use the name of the file. Hash computed from the name of the file should belong to any one of the directory-hash-range Calculation of hash for a file is acting as a consumer and the setting of directory layout based on gfid is acting as a producer, which are independent from each other. Change-Id: I3808c55082cd1b5c72d2c77cbbc063f55aa38bee BUG: 1095888 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/7493 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* mgmt/glusterd: save the snapd port in volinfo after starting snapdRaghavendra Bhat2014-06-171-0/+72
| | | | | | | | | Change-Id: I9266bbf4f67a2135f9a81b32fe88620be11af6ea BUG: 1109889 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/8084 Reviewed-by: Kaushal M <kaushal@redhat.com> Tested-by: Kaushal M <kaushal@redhat.com>
* tests: Avoid sleepPranith Kumar K2014-06-1621-129/+65
| | | | | | | | | Change-Id: I7169be3532232754b9461c4e1b27bf6bc857f7a6 BUG: 1092850 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8083 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* mgmt/glusterd: volume stop should also stop its snapview-daemonRaghavendra Bhat2014-06-161-0/+69
| | | | | | | | | | Change-Id: I702372c6c8341b54710c531662e3fd738cfb5f9a BUG: 1109770 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/8076 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* tests: Validate self-heal daemon completionsPranith Kumar K2014-06-161-1/+4
| | | | | | | | | | | | | | | Removed sleep with EXPECT_WITHIN Heal full doesn't generate indices until the files/dirs are recreated. So wait until they are re-created and then wait for heal completion. Change-Id: I82399f6a17f94ecc101db45b83d8ef7bfa9c64dd BUG: 1092850 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8069 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/quota: No root squash for quota aux mount by defaultVarun Shastry2014-06-151-0/+31
| | | | | | | | | | | | | | | | | | With change 28209283a67f13802cc0c1d3df07c676926810a2, the root squash option is enabled by default even for the trusted clients. This disallowed quota auxiliary mount from setting the limit. This patch adds the quota aux mount process to list of 'special' clients which have the root squash disabled by default. Change-Id: Ie6583dd3deb170563daf001239c51bcff1ce078b BUG: 1104692 Signed-off-by: Varun Shastry <vshastry@redhat.com> Reviewed-on: http://review.gluster.org/7967 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* tests: Fix spurious failure for tests/bugs/bug-918437-sh-mtime.tPranith Kumar K2014-06-151-2/+12
| | | | | | | | | Change-Id: I355ae02bed54753480279ddb058cc4b19ace6792 BUG: 1092850 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8068 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd/snapshot : Provide enable/disable option for snapshot auto-delete ↵Sachin Pandit2014-06-131-22/+105
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | feature. This patch provides an interface to enable or disable the auto-delete feature. Syntax : gluster snapshot config auto-delete <enable/disable> DETAILS : 1) When auto-delete feature is disabled, If the the soft-limit is reached then user is given a warning about exceeding soft-limit along with successful snapshot creation message (oldest snapshot is not deleted). And upon reaching hard-limit further snapshot creation is not allowed. Example : ------------------------------------------------------------------ |Case - 1: Upon reaching soft-limit | |Snapshot create : snap successfully created. |Warning : soft-limit of volume (vol) is reached. Snapshot creation |is not possible once hard-limit is reached. | |----------------------------------------------------- |Case - 2: Upon reaching hard-limit | |Snapshot create : snap creation failed. |Error : hard-limit of volume (vol) is reached, Hence it is not |possible to take further snapshots. Please delete few snapshots |of the volume (vol) before taking another snapshot. ------------------------------------------------------------------ 2) When auto-delete feature is enabled, then as soon as the soft-limit is reached the oldest snapshot is deleted for every successful snapshot creation (same as existing method), With this it is made sure that number of snapshot created is not more than snap-max-hard-limit. Change-Id: Ie3ca64bbd2c763371f541cd2e378314e73b695b4 BUG: 1105415 Signed-off-by: Sachin Pandit <spandit@redhat.com> Reviewed-on: http://review.gluster.org/8017 Tested-by: Justin Clift <justin@gluster.org> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* tests: Fix spurious failure in tests/bugs/bug-830665.tPranith Kumar K2014-06-131-3/+16
| | | | | | | | | | | | | | | | | | | Problem with script: EXPECT_WITHIN fails the test if the command it executes fails. There is a possibility that the file script tries to 'cat' may not exist. In those cases it would fail leading to spurious failures. Fix: Add a function which returns empty string when the file doesn't exist and 'cat' file when it does exist. Change-Id: I0abfb343f2fce1034ee4b5b680e3783c4f6e8486 BUG: 1092850 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8057 Reviewed-by: Sachin Pandit <spandit@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* tests: Revert janitor link file removal testPranith Kumar K2014-06-121-83/+0
| | | | | | | | | | | | | I found that order of execution in afr-v2 self-heal is causing the links to disappear some times. I need to fix that issue and then submit this test again Change-Id: Ia886feb796b7854645813f486b7b7ac4e944ed17 BUG: 1101143 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8055 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* tests: fix for spurious failure:ggarg2014-06-121-0/+2
| | | | | | | | | | Change-Id: I39cc497f12c83aa055acb6e88e4c3e1e8774e577 BUG: 1089668 Signed-off-by: ggarg <ggarg@redhat.com> Reviewed-on: http://review.gluster.org/8050 Reviewed-by: Sachin Pandit <spandit@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* tests : Fix for spurious failure of bug-1104642.t.Sachin Pandit2014-06-121-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | Problem : This test case checks whether global option is updated when glusterd goes down and comes back up. There might be a scenario where the handshake is not completed yet and we issue a volume info in between. In that case when we try to grep a key from gluster volume info output it might not be there. Hence to get some time we can issue a peer status command and verify that the peer count is correct. Change-Id: I2933d55ce8c80517a555fa3c2e4cd768cde30abf BUG: 1104642 Signed-off-by: Sachin Pandit <spandit@redhat.com> Reviewed-on: http://review.gluster.org/8041 Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: Janitor should guard against dir renames.Pranith Kumar K2014-06-121-0/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Directory rename while a brick is down can cause gfid handle of that directory to be deleted until next lookup happens on that directory. *) Self-heal does not have intelligence to detect renames at the moment. So it has to delete the directory 'd' using special flags, because it has to perform 'rm -rf' of that directory as it is not empty. Posix xlator implements this by renaming the directory deleted to 'landfill' directory in '.glusterfs' where janitor thread will perform actual rm -rf by traversing the directory. Janitor thread wakes up every 10 minutes to check if there are any directories to be deleted and deletes them. As part of deleting it also deletes the gfid-handles. Steps to hit the problem: 1) On a replicate volume create a directory 'd', file in 'd' called 'f' so the directory 'd' is not empty. 2) bring one of the bricks down (lets call it brick-a, the other one is brick-b 3) Rename d to d1 4) When brick-a comes online again, self-heal deletes directory 'd' and creates directory 'd1' on brick-a for performing self-heal. So on brick-a, gfid-handle of 'd' pointing to 'da is deleted and recreated to point to 'd1'. 5) This directory 'b' with all its directory hierarchy (for now just the file 'f') will be under 'landfill' directory. 6) When janitor thread wakes up and deletes directory 'd' and gfid-handle of 'd' without realizing that it is now pointing to 'd1'. Thus 'd1' loses its gfid-handle Fix: Delete gfid-handle for a directory only when the gfid-handle is stale. Change-Id: I21265b3bd3852f0967d916aaa21108ae5c9e7373 BUG: 1101143 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/7879 Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* mgmt/glusterd : Import the global options irrespective of change inSachin Pandit2014-06-091-0/+46
| | | | | | | | | | | | | | | | | | | | volume information. Problem : global options maintained by glusterd was getting synced only when there was change in volume information. Solution : Import the global option irrespective of change in volume information. Change-Id: I9e59b3cb25bdc19601a09fcf8df2e31a8481ece0 BUG: 1104642 Signed-off-by: Sachin Pandit <spandit@redhat.com> Reviewed-on: http://review.gluster.org/7970 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Kaushal M <kaushal@redhat.com>
* cli/snapshot : Dont Do the validation of snapshot config limit in CLISachin Pandit2014-06-041-0/+2
| | | | | | | | | | | | | | | | | | | code path. Problem : If we try to set the volume snap limit to more that 256, it always shows value cannot exceed 256, irrespective of system max limit. Solution : Dont do validation in CLI side. Change-Id: I292c0bc91a1806cd4906fca0151dd98135e6e49a BUG: 1098122 Signed-off-by: Sachin Pandit <spandit@redhat.com> Reviewed-on: http://review.gluster.org/7777 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Kaushal M <kaushal@redhat.com>
* glusterd: Fetching the txn_id before performing glusterd_op_bricks_select in ↵Avra Sengupta2014-06-021-0/+20
| | | | | | | | | | | | | | | | | | | | glusterd_brick_op() In glusterd_brick_op(), the txn_id mut be fetched before failing the transaction for any other reason. Moving the fetching of txn_id to the beginning of the function. Also initializing txn_id to priv->global_txn_id where it wasn't initialized. Change-Id: I44d7daa444f00a626f24670c92324725f6c5fb35 BUG: 1102656 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/7926 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* cluster/dht: Fix min-free-disk calculations when quota-deem-statfs is onKrutika Dhananjay2014-06-021-0/+120
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PROBLEM: As part of file creation, DHT sends a statfs call to all of its sub-volumes and expects in return the local space consumption and availability on each one of them. This information is used by DHT to ensure that atleast min-free-disk amount of space is left on each sub-volume in the event that there ARE other sub-volumes with more space available. But when quota-deem-statfs is enabled, quota xlator on every brick unwinds the statfs call with volume-wide consumption of disk space. This leads to miscalculation in min-free-disk algo, thereby misleading DHT at some point, into thinking all sub-volumes have equal available space, in which case DHT keeps sending new file creates to subvol-0, causing it to become 100% full at some point although there ARE other subvols with ample space available. FIX: The fix is to make quota_statfs() behave as if quota xlator weren't enabled, thereby making every brick return only its local consumption and disk space availability. Change-Id: I211371a1eddb220037bd36a128973938ea8124c2 BUG: 1099890 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/7845 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* user servicable snapshotsRaghavendra Bhat2014-05-291-2/+10
| | | | | | | | | | Change-Id: Idbf27dbe088e646a8ab81cedc5818413795895ea Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Signed-off-by: Anand Subramanian <anands@redhat.com> Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/7700 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Glusterd/Rebalance: Update rebalance status properly inSusant Palai2014-05-261-0/+33
| | | | | | | | | | | | | | node_state.info credit: kaushal@redhat.com spalai@redhat.com Change-Id: I08d0771e2168a4a6ebd473e8a937b8b2eda1341a BUG: 1075087 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/7214 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaushal M <kaushal@redhat.com>
* tests: s/timeout/EXPECT_WITHIN/Pranith Kumar K2014-05-211-1/+1
| | | | | | | | | | | Also fixed nfs.rc so that regression build works on my fedora VM Change-Id: Ife36343bf1a590430e24065b9bcdf5bed3ae546d BUG: 1092850 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/7837 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* tests: Use uniform timeoutsPranith Kumar K2014-05-1966-133/+132
| | | | | | | | | Change-Id: I479ab941b3b2da3b16f624400fbd300f08326268 BUG: 1092850 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/7799 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* tests: Wait for nfs export to be availablePranith Kumar K2014-05-1619-61/+69
| | | | | | | | | Change-Id: I59a5e0cb78f2b670761a65272b8ab1d7bdb3668a BUG: 1092850 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/7773 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd : barrier enable/disable should fail if already enabled/disabledAtin Mukherjee2014-05-121-0/+24
| | | | | | | | | | | | | | | | | In barrier notify function, if we fail to set the barrier option execution goes to default_notify which returns 0 and command returns success. Fix : We need not call the default_notify function when handling GF_EVENT_TRANSLATOR_OP in barrier xlator's notify. Change-Id: Ia2c361b43cca7791c29829d69dcd6fc7923102f6 BUG: 1092841 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/7609 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Sachin Pandit <spandit@redhat.com> Reviewed-by: Kaushal M <kaushal@redhat.com>
* rpc: implement server.manage-gids for group resolving on the bricksNiels de Vos2014-05-091-13/+41
| | | | | | | | | | | | | | | | | | | | | | | | | The new volume option 'server.manage-gids' can be enabled in environments where a user belongs to more than the current absolute maximum of 93 groups. This option triggers the following behavior: 1. The AUTH_GLUSTERFS structure sent by GlusterFS clients (fuse, nfs or libgfapi) will contain only one (1) auxiliary group, instead of a full list. This reduces network usage and prevents problems in encoding the AUTH_GLUSTERFS structure which should fit in 400 bytes. 2. The single group in the RPC Calls received by the server is replaced by resolving the groups server-side. Permission checks and similar in lower xlators are applied against the full list of groups where the user belongs to, and not the single auxiliary group that the client sent. Change-Id: I9e540de13e3022f8b63ff893ecba511129a47b91 BUG: 1053579 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/7501 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Santosh Pradhan <spradhan@redhat.com> Reviewed-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-by: Anand Avati <avati@redhat.com>