summaryrefslogtreecommitdiffstats
path: root/tests/basic/ec
Commit message (Collapse)AuthorAgeFilesLines
* tests: prevent regular hangs in ec/nfs.tNiels de Vos2015-03-161-1/+4
| | | | | | | | | | | | | | | | | When the test systems gets into a memory pressure state (the Jenkins VMs do not have much RAM), the localhost NFS-mount can get hung. It is possible to prevent this by writing with O_DIRECT. Unfortnately, the 'dd' command on NetBSD does not seem to support such an option. The alternative is to reduce the I/O that can get cached on the NFS-client, like reducing the "count" option for "dd". Change-Id: I1da9cb41133bb934bcbae0a6bc091f798514ed3d BUG: 1163543 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/9883 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* epoll: Fix broken RPC throttling due to MT epollShyam2015-03-011-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | The RPC throttle which kicks in by setting the poll-in event on a socket to false, is broken with the MT epoll commit. This is due to the event handler of poll-in attempting to read as much out of the socket till it receives an EAGAIN. Which may never happen and hence we would be processing far more RPCs that we want to. This is being fixed by changing the epoll from ET to LT, and reading request by request, so that we honor the throttle. The downside is that we do not drain the socket, but go back to epoll_wait before reading the next request, but when kicking in throttle, we need to anyway and so a busy connection would degrade to LT anyway to maintain the throttle. As a result this change should not cause deviation in the performance much for busy connections. Change-Id: I522d284d2d0f40e1812ab4c1a453c8aec666464c BUG: 1192114 Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on: http://review.gluster.org/9726 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* Temporarily remove nfs.t to avoid regression failuresXavier Hernandez2015-02-121-23/+0
| | | | | | | | | | | | | | | | | Test basic/ec/nfs.t is causing many regression failures due to a problem related with NFS. While the NFS problem is solved, this patch removes the test to avoid more regression failures. Change-Id: I29884c5e06732e427130d1bc82f1b83553916f95 BUG: 1192114 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/9649 Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* cluster/ec: Handle CHILD UP/DOWN in all casesPranith Kumar K2015-01-281-0/+79
| | | | | | | | | | | | | | | | | | | | | Problem: When all the bricks are down at the time of mounting the volume, then mount command hangs. Fix: 1. Ignore all CHILD_CONNECTING events comming from subvolumes. 2. On timer expiration (without enough up or down childs) send CHILD_DOWN. 3. Once enough up or down subvolumes are detected, send the appropriate event. When rest of the subvols go up/down without changing the overall ec-up/ec-down send CHILD_MODIFIED to parent subvols. Change-Id: Ie0194dbadef2dce36ab5eb7beece84a6bf3c631c BUG: 1179180 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9396 Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/ec: Handle internal xattr get/setPranith Kumar K2015-01-081-0/+45
| | | | | | | | | | | | | | | | | | | | Problem: Internal xattrs of EC like trusted.ec.size/config/version can be modified by users and that can lead to misbehavior in EC. Fix: Don't let the user modify the xattrs. Hide these xattrs in getfattr outputs. Change-Id: I39cec96ae12826b506b496fda7da74201015fd75 BUG: 1178688 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9385 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Emmanuel Dreyfus <manu@netbsd.org> Tested-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* Regression test portability: ec.tEmmanuel Dreyfus2014-12-171-4/+4
| | | | | | | | | | | | | This test unmount/remount the filesystem to invalidate cache, but this leads to timing problems on NetBSD. We can work them around without sleeping by remounting on another mount point. BUG: 1129939 Change-Id: I10b3183e5e715053de162a6980af188710b607bb Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/9285 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* ec: Temporary fix for quota.tXavier Hernandez2014-12-041-0/+1
| | | | | | | | | | | | | | | This fix solves a problem with tests/basic/ec/quota.t that generates a segmentation fault in DHT. This is a temporary fix until bug #1167793 is solved. Change-Id: I8587e66a63375ba2b312e8c0bfa1dd0d94d4c19f BUG: 1129939 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/9222 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Regression tests portability: Do not 'cd' into volumeXavier Hernandez2014-11-242-67/+65
| | | | | | | | | | | | | | | | | Changing current directory to the root of the volume to execute tests from there keeps an open file descriptor to it that could interfere with some tests. I've removed all 'cd' and used abosulte paths on all tests. Change-Id: Ic54afb7d7974e9e80818201bcd99ee2b01d00442 BUG: 1129939 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/9151 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Emmanuel Dreyfus <manu@netbsd.org> Tested-by: Emmanuel Dreyfus <manu@netbsd.org>
* Regression tests portability: basic/ec/self-heal.tEmmanuel Dreyfus2014-11-191-1/+1
| | | | | | | | | | | | When checking for bricks availability, use ls -l produces a more reliable result than just ls. BUG: 1129939 Change-Id: Ided548a8f4154714d2c33ec538d0623d7c328952 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/9133 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* ec: Correctly handle quota xattrsXavier Hernandez2014-11-121-0/+58
| | | | | | | | | Change-Id: I35e11d83c318210d44b918e847cf13db35b01510 BUG: 1158008 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/8990 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* ec: Fix rebalance issuesXavier Hernandez2014-10-273-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some issues in ec xlator made that rebalance didn't complete successfully and generated some warnings and errors in the log. The most critical error was a race condition that caused false corruption detection when two specific operations were executed sequentially and they shared the same lock. This explains the problem: 1. A setxattr is issued. 2. setxattr: ec locks the inode before updating the xattr. 3. setxattr: The xattr is updated. 4. setxattr: Upper xlator is notified that the operation completed. 5. setxattr: A background task is initiated to update the version of the file. 6. A stat is issued on the same file. 7. stat: Since the lock is already acquired, it's reused. 8. stat: A lookup is issued to determine version and size information of the file. At this point, operations 5 and 8 can interfere. This can make that lookup sees different information on each brick, determining that some bricks are corrupted and incorrectly excluding them from the operation and initiating a self-heal. In some cases this false detection combined with self-heal could lead to invalid updates of the trusted.ec.size xattr, leaving the file smaller than it should be. This only happens if the first operation does not perform a lookup, because chained operations reuse the information returned by the previous one, avoiding this kind of problems. To solve this, now the background update is executed atomically with the posterior unlock. This avoids some reuses of the lock while updating. However this reduces performance because the window in which new requests can reuse the lock is much smaller now. This has been alleviated by using the same technique implemented in AFR (i.e. waiting some time before releasing the lock). Some minor changes also introduced in this patch: * Bug in management of 'trusted.glusterfs.pathinfo' that was writing beyond the allocated space. * Uninitialized variable. * trusted.ec.config was not created for regular files created with mknod. * An invalid state was used in access fop. Change-Id: Idfaf69578ed04dbac97a62710326729715b9b395 BUG: 1152902 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/8947 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* ec: Fix self-heal issuesXavier Hernandez2014-10-212-44/+166
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Doing an 'ls' of a directory that has been modified while one of the bricks was down, sometimes returns the old directory contents. Cause: Directories are not marked when they are modified as files are. The ec xlator balances requests amongst available and healthy bricks. Since there is no way to detect that a directory is out of date in one of the bricks, it is used from time to time to return the directory contents. Solution: Basically the solution consists in use versioning information also for directories, however some additional changes have been necessary. Changes: * Use directory versioning: This required to lock full directory instead of a single entry for all requests that add or remove entries from it. This is needed to allow atomic version update. This affects the following fops: create, mkdir, mknod, link, symlink, rename, unlink, rmdir Another side effect is that opendir requires to do a previous lookup to get versioning information and discard out of date bricks for subsequent readdir(p) calls. * Restrict directory self-heal: Till now, when one discrepancy was found in lookup, a self-heal was automatically started. This caused the versioning information of a bad directory to be healed instantly, making the original problem to reapear again. To solve this, when a missing directory is detected in one or more bricks on lookup or opendir fops, only a partial self-heal is performed on it. A partial self-heal basically creates the directory but does not restore any additional information. This avoids that an 'ls' could repair the directory and cause the problem to happen again. With this change, output of 'ls' is always consistent. However, since the directory has been created in the brick, this allows any other operation on it (create new files, for example) to succeed on all bricks and not add additional work to the self-heal process. To force a self-heal of a directory, any other operation must be done on it. For example a getxattr. With these changes, the correct healing procedure that would avoid inconsistent directory browsing consists on a post-order traversal of directoriesi being healed. This way, the directory contents will be healed before healing the directory itslef. * Additional changes to fix self-heal errors - Don't use fop->fd to decide between fd/loc. open, opendir and create have an fd, but the correct data is in loc. - Fix incorrect management of bad bricks per inode/fd. - Fix incorrect selection of fop's target bricks when there are bad bricks involved. - Improved ec_loc_parent() to always return a parent loc as complete as possible. Change-Id: Iaf3df174d7857da57d4a87b4a8740a7048b366ad BUG: 1149726 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/8916 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* ec: Fix incorrect management of healed bricksXavier Hernandez2014-10-201-19/+19
| | | | | | | | | | | | | | | | | | The final lookup made to restore final file attributes after a self-heal did clear the mask of bad bricks, causing that the final setattr won't modify any brick at all. This caused that some attriutes, specially the modification time of the file didn't get updated properly. Now the mask of healed bricks is saved before doing the last lookup. It's also used to correctly report the repaired bricks. Change-Id: Ib94083c9e1b562515dfb54f9574120f1f031dccc BUG: 1149723 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/8905 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* test/ec: Fix spurious failures caused by self-healXavier Hernandez2014-10-0312-53/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The sha1sum of a file may update the access time of that file. If this happens while a brick is down, as it is forced in the test, that brick doesn't get the update, getting out of sync. When the brick is restarted, self-heal repairs the file, but the test shouldn't access brick contents until self-heal finishes. If this is combined with a kill of another brick before self-heal has finished repairing the file, the volume could become inaccessible. Since the purpose of these tests is only to check ec functionality (there is another test that checks self-heal), the test that corrupts the file has been removed. Additional checks to validate the state of the volume have been added to avoid some timing issues. BUG: 1144108 Change-Id: Ibd9288de519914663998a1fbc4321ec92ed6082c Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/8892 Reviewed-by: Emmanuel Dreyfus <manu@netbsd.org> Tested-by: Emmanuel Dreyfus <manu@netbsd.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* ec: Add state dump supportXavier Hernandez2014-10-031-0/+26
| | | | | | | | | | | | Change-Id: I4504f3050674dde217e79af28cb4d2b5370fe2d5 BUG: 1148010 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/8891 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Emmanuel Dreyfus <manu@netbsd.org> Tested-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Regression test portability: truncateEmmanuel Dreyfus2014-10-011-2/+2
| | | | | | | | | | | | | | Use truncate -s 1M instead of truncate --size=1m for portability sake BUG: 1129939 Change-Id: I5bf6ca1f9bb4fa3c91796a659a06bf368776b3e5 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8894 Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Harshavardhana <harsha@harshavardhana.net> Tested-by: Harshavardhana <harsha@harshavardhana.net>
* ec: Removed SSE2 dependencyXavier Hernandez2014-09-111-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | This patch implements the Galois Field multiplications using pure C code without any assembler support. This makes the ec xlator portable to other architectures. In the future it will be possible to use an optimized implementation of the multiplications using architecture dependent facilities (it will be automatically detected and configured). To allow bricks with different machine word sizes to be able to work seamlessly in the same volume, the minimum fragment length to be stored in any brick has been fixed to 512 bytes. Otherwise, different implementations will corrupt the data (SSE2 used 128 bytes, while new implementation would have used 64). This patch also removes the '-msse2' option added on patch http://review.gluster.org/8395/ Change-Id: Iaf6e4ef3dcfda6c68f48f16ca46fc4fb61a215f4 BUG: 1125166 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/8413 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* tests/ec: Avoid spurious umount failuresXavier Hernandez2014-09-111-1/+1
| | | | | | | | | | | Change-Id: Id22d64a95adf3666a5e4208f87f9a6d91c40b267 BUG: 1092850 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/8694 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Emmanuel Dreyfus <manu@netbsd.org> Tested-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* porting: Provide setfattr/getfattr implementationHarshavardhana2014-09-052-3/+3
| | | | | | | | | | | | | | | | - Use 'getfattr' properly avoid redundant options during xattr query - Untabify certain parts of tests (remove tabs) - Avoid backtick evaluation for certain values to make code more portable. - Use awk on FreeBSD/Darwin, since 'wc' implementation is broken and adds spurious spaces in its output. Change-Id: I7dcc0b70874e43b4cda8c306ed18a31b7a3f990a BUG: 1131713 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/8520 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Emmanuel Dreyfus <manu@netbsd.org> Tested-by: Emmanuel Dreyfus <manu@netbsd.org>
* porting: various fixes regression tests OSX/FreeBSDHarshavardhana2014-08-294-11/+9
| | | | | | | | | | | | | | | | | | | | - `wc -l` on OSX/FreeBSD adds spurious spaces, this clobbers up TAP output parsers - fix it. - `umount -l` doesn't exist on OSX/FreeBSD use 'umount -f' if available. - Add check for 'file' version, to handle mime type variations across versions - Converge 'glusterfs --attribute-timeout=0 --entry-timeout=0' into '$GFS' - Modify remaining 'mount -t nfs' to use 'mount_nfs' - Update sha1sum for OSX to use 'openssl sha1'. Change-Id: Id1012faa5d67a921513d220e7fa9cebafe830d34 BUG: 1131713 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/8501 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* Regression test portability: mktempEmmanuel Dreyfus2014-08-203-3/+3
| | | | | | | | | | | | | | Linux mktemp accepts to run without a template, NetBSD mandates it. Since the template option has the same syntax, add it everywhere. While there, also do this in scripts outside of regression testing. BUG: 764655 Change-Id: I3ec140afbc9009257c81a56d77afcc21fef74cc4 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8432 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Harshavardhana <harsha@harshavardhana.net> Tested-by: Harshavardhana <harsha@harshavardhana.net>
* Regression test portability: mountEmmanuel Dreyfus2014-08-201-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Address various portability-related problems related to mount - In order to address the non-portability of NFS mount options, use the mount_nfs shell function everywhere, and use it to translate options. - Make sure NFS mounts are unmounted before shutting down the daemons in order to avoid deadlock. The change is done in every test that did not unmounted NFS mounts at the end of the script, and in global cleanup function as well. The force_umount shell function from volume.rc was duplicated as umount_nfs in nfs.rc so that we do not have to add an include on volume.rc for all NFS tests that do not need it. - The FUSE mount type on NetBSD is puffs|perfuse|fuse.glusterfs instead of just fuse.glusterfs, make the regexp configurable in include.rc - Finding wether the mount is RO or RW in mount output needs a system-dependent command configurable in include.rc - mount options in /proc/mounts may be limited to "rw", adjust the regexp for this case where there is no comma And while there change rm into rm -f in tests/basic/mount.t for removal opearation that should fail, since rm may ask for confirmation Change-Id: I1fb708486ec350b2885e2404879561c1020fa8fd BUG: 1129939 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8494 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Harshavardhana <harsha@harshavardhana.net> Tested-by: Harshavardhana <harsha@harshavardhana.net>
* cluster/ec: Fix incorrect management of NFS requestsXavier Hernandez2014-08-021-0/+18
| | | | | | | | | | | | | | | | | | | Some operations, specially those comming from NFS, do not use a regular fd and use an anonymous fd (i.e. a previous open call has not been sent). Any context information created during open or create will not be present on these fd's, so we simply return NULL for contexts of those fd. Also it seems that NFS can send write requests with a very big buffer (higher that the default value of 128 KB). Some changes have been made to correctly handle these large buffers. Change-Id: I281476bd0d2cbaad231822248d6a616fcf5d4003 BUG: 1122417 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/8367 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* ec: Test volume mount point in a better wayXavier Hernandez2014-08-011-1/+1
| | | | | | | | | | | | | | | An 'ls -a1' on an empty volume seems to return 3 entries instead of the expected 2 ('.' and '..') in the build servers. I changed the test to a simple 'stat', which is enough and more reliable. Change-Id: I12d0f47394ad378b40fc9b86507cdb3543f99970 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/8313 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cli/glusterd: Added support for dispersed volumesXavier Hernandez2014-07-1110-0/+597
Two new options have been added to the 'create' command of the cli interface: disperse [<count>] redundancy <count> Both are optional. A dispersed volume is created by specifying, at least, one of them. If 'disperse' is missing or it's present but '<count>' does not, the number of bricks enumerated in the command line is taken as the disperse count. If 'redundancy' is missing, the lowest optimal value is assumed. A configuration is considered optimal (for most workloads) when the disperse count - redundancy count is a power of 2. If the resulting redundancy is 1, the volume is created normally, but if it's greater than 1, a warning is shown to the user and he/she must answer yes/no to continue volume creation. If there isn't any optimal value for the given number of bricks, a warning is also shown and, if the user accepts, a redundancy of 1 is used. If 'redundancy' is specified and the resulting volume is not optimal, another warning is shown to the user. A distributed-disperse volume can be created using a number of bricks multiple of the disperse count. Change-Id: Iab93efbe78e905cdb91f54f3741599f7ea6645e4 BUG: 1118629 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/7782 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>