path: root/xlators/storage
Commit message | Author | Age | Files | Lines
* features/trash : Notify CTR translator if an unlink happens to a file | Jiffin Tony Thottan | 2015-04-24 | 2 | -3/+12

This implementation mirrors posix_unlink_cbk(): during an unlink, CTR requests the number of links to the inode, and posix obliges by sending it in the unwind xdata dict.

For the Trash xlator, an unlink is stat + mkdir (if the parent is not present) + rename, so the request is handled in trash_unlink_rename_cbk().

Change-Id: I402e83567b88e3c9fe171379693c82937af567f9
BUG: 1205545
Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Signed-off-by: Anoop C S <achiraya@redhat.com>
Reviewed-on: http://review.gluster.org/9989
Tested-by: NetBSD Build System
Tested-by: Joseph Fernandes
Reviewed-by: Joseph Fernandes
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* xlators/storage/posix: Fix Dereference before null check (CID 1293501). | Günther Deschner | 2015-04-18 | 1 | -1/+1

Coverity CID 1293501.

Everywhere else in this call, "name" is explicitly checked for NULL before it is dereferenced, just not in this path.

Guenther

Change-Id: Ie3e7b704702cb979a036052238ed65eda1531407
BUG: 789278
Signed-off-by: Günther Deschner <gd@samba.org>
Reviewed-on: http://review.gluster.org/10252
Tested-by: NetBSD Build System
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: Introduce xattr-fill on fds | Krutika Dhananjay | 2015-04-13 | 4 | -85/+148

... with some of the code borrowed from http://review.gluster.org/#/c/3904/

Change-Id: I4901ef14d6f843d8d69f102d43d21b60ba298092
BUG: 1207603
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/10180
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* bitrot/scrub: Scrubber fixes | Venky Shankar | 2015-04-08 | 3 | -6/+102

This patch fixes a handful of problems with the scrubber, detailed below.

The scrubber used to skip objects for verification due to a missing fd interface to fetch versioning extended attributes. Similar to the inode interface, an fd-based interface is now introduced in POSIX.

Moreover, this patch also fixes potential false reporting by the scrubber due to:

- An object gets dirtied and signed while the scrubber is busy calculating the object checksum. This is fixed by caching the signed version when an object is first inspected for staleness, i.e., during the pre-compute stage. This version is used to verify the checksum in the post-compute stage, when the signatures are compared for possible corruption.

- A side effect of _not_ sending the signature length during signing resulted in a "truncated" signature being set for an object. Now, at the time of signing, the signature length is sent and is used in place of invoking strlen() to get the signature length (the signature could contain 00s). The signature length itself is not persisted in the signature xattr, but is calculated on the fly by subtracting the "structure" header size from the xattr length.

Some of the log entries are made more meaningful (as an aid for debugging).

Change-Id: I938bee5aea6688d5d99eb2640053613af86d6269
BUG: 1207624
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10118
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
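The signature-length point in the scrubber commit can be illustrated with a small sketch. The struct below is hypothetical (the real bitrot on-disk layout is not shown in this log and may differ); it only demonstrates why the length is derived from the xattr size minus a header rather than from strlen().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical signature xattr layout, for illustration only. */
struct sig_xattr {
    uint64_t      signed_version;
    uint8_t       sig_type;
    unsigned char signature[];   /* raw digest, may contain 0x00 bytes */
};

/*
 * Derive the signature length from the total xattr length by
 * subtracting the fixed header size. Returns -1 if the xattr is
 * too short to even hold the header.
 */
static ssize_t sig_len_from_xattr(size_t xattr_len)
{
    size_t hdr = offsetof(struct sig_xattr, signature);

    if (xattr_len < hdr)
        return -1;
    return (ssize_t)(xattr_len - hdr);
}
```

A binary digest such as `ab 00 cd ef` would be reported as one byte long by strlen(), while the subtraction above yields the full four bytes.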
* features/bitrot-stub: Enhancement to versioning protocol | Venky Shankar | 2015-04-08 | 1 | -48/+39

.. and potential bug fixes / memleak.

While assigning the initial version to an object, both extended attributes (namely, the ongoing version and the default signing version) were persisted. This is optimized to persist just the ongoing version, along with safe handling of xattr request(s) in its absence. This is better than the earlier approach, as the two xattr sets were not atomic anyway (allowing a request to sneak in between the two set operations). This also allows sanity checks to be performed on objects during lookup()/getxattr(): objects with a missing ongoing version but with a signature present are possible candidates of tampering (and a way of catching implementation bugs).

There were a couple of instances in the code where versioning xattrs were incorrectly removed before in-memory versions were initialized; these have been fixed with this patch. A memory leak in the IPC code path is also fixed.

Change-Id: I01c690ccfe7156a883582275f40f79a7c10c0900
BUG: 1207054
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10117
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: Introduce xattr_fill capability in posix_stat | Krutika Dhananjay | 2015-04-08 | 1 | -1/+7

Change-Id: I2b6503ad9333f445ebdcd9fa660da20b861b985f
BUG: 1207603
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/10158
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* Avoid conflict between contrib/uuid and system uuid | Emmanuel Dreyfus | 2015-04-04 | 6 | -42/+42

glusterfs relies on the Linux uuid implementation, whose API is incompatible with most other systems' uuid. As a result, libglusterfs has to embed contrib/uuid, which is the Linux implementation, on non-Linux systems. This implementation is incompatible with the system's built-in one, but the symbols have the same names.

Usually this is not a problem, because when we link with -lglusterfs, libc's symbols are trumped. However, there is a problem when a program not linked with -lglusterfs dlopen()s a glusterfs component. In such a case, libc's uuid implementation is already loaded in the calling program, and it will be used instead of libglusterfs's implementation, causing crashes.

A possible workaround is to pre-load libglusterfs in the calling program (using LD_PRELOAD on NetBSD, for instance), but such a mechanism is neither portable nor flexible. A much better approach is to rename libglusterfs's uuid_* functions to gf_uuid_* to avoid any possible conflict. This is what this change attempts.

BUG: 1206587
Change-Id: I9ccd3e13afed1c7fc18508e92c7beb0f5d49f31a
Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-on: http://review.gluster.org/10017
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
* Xlators : Fixed typos | Manikandan Selvaganesh | 2015-04-02 | 1 | -1/+1

Change-Id: I948f85cb369206ee8ce8b8cd5e48cae9adb971c9
BUG: 1075417
Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com>
Reviewed-on: http://review.gluster.org/9529
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com>
* bd: coverity fixes, removing logically dead code and correcting checks | Nandaja Varma | 2015-03-30 | 2 | -3/+5

Coverity CIDs:
1128910
1128911
1128913
1128912
1134020

Change-Id: I2d871723fbfe43f9ff6b3beba7a99b0d81d4aff5
BUG: 789278
Signed-off-by: Nandaja Varma <nvarma@redhat.com>
Reviewed-on: http://review.gluster.org/9588
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* bd: Fixing dereference after null check(FORWARD_NULL) | arao | 2015-03-30 | 1 | -1/+2

CID: 1128907

Control reaches the goto label both when the pointer variable 'bdatt' is NULL and when it is not NULL, hence bdatt is checked for NULL again at the 'reverse xattr' label.

Change-Id: I2289cbf30fde9faf97e6eebd4902953a44049f9e
BUG: 789278
Signed-off-by: arao <arao@redhat.com>
Reviewed-on: http://review.gluster.org/9619
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* posix : unchecked return value coverity fix. | Manikandan Selvaganesh | 2015-03-29 | 1 | -3/+9

CID : 1124364

Change-Id: I1e16e3ff46b191ba2ea527e628c77a99a56f6c31
BUG: 789278
Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com>
Reviewed-on: http://review.gluster.org/9667
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* posix: handle failure from posix_resolve | vmallika | 2015-03-25 | 1 | -0/+4

When building ancestry, posix_resolve gets the inode from the gfid. We need to handle the failure case from this function.

Change-Id: I19f0f0c739686b1b0ef96309212aa1c7911b3589
BUG: 1203629
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/9941
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* Bitrot Stub | Venky Shankar | 2015-03-24 | 3 | -0/+96

Bitrot stub implements the object versioning required for identifying signature freshness. More details about versioning are explained as part of the "bitrot feature documentation" patch.

Change-Id: I2ad70d9eb109ba4a12148ab8d81336afda529ad9
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9709
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Adding ChangeTimeRecorder(CTR) Xlator to GlusterFS | Joseph Fernandes | 2015-03-19 | 1 | -1/+43

**********************************************************************
ChangeTimeRecorder(CTR) Xlator
**********************************************************************

ChangeTimeRecorder(CTR) is a server-side xlator (translator) that sits just above the posix xlator. The main role of this xlator is to record the access/write patterns on a file residing on the brick. It records the read (data only) and write (data and metadata) times and also counts how many times a file is read or written. This xlator also captures the hard links to a file (as required by data tiering to move files).

CTR Xlator is the consumer of libgfdb.

To Enable/Disable CTR Xlator:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
gluster volume set <volume-name> features.ctr-enabled {on/off}

To Enable/Disable Frequency Counter Recording in CTR Xlator:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
gluster volume set <volume-name> features.record-counters {on/off}

Change-Id: I5d3cf056af61ac8e3f8250321a27cb240a214ac2
BUG: 1194753
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/9935
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* every/where: add GF_FOP_IPC for inter-translator communication | Jeff Darcy | 2015-03-17 | 1 | -0/+15

Several features - e.g. encryption, erasure codes, or NSR - involve multiple cooperating translators which sometimes need a "private" means of communication amongst themselves. Historically we've used virtual or synthetic xattrs, but that's not very elegant and clutters up the getxattr/setxattr path, which must also handle real xattr requests. This new fop should address that.

The only argument is an int32_t "op" which should be recognized by the target translator. It is recommended that translators using this feature follow some convention regarding the ops that they define, to avoid conflicts. Using a hash of the target translator's type string as a base for a series of ops would probably be a good start. Any other information can be passed in both directions using xdata.

The default behavior for this fop, as with any other, is to pass through to FIRST_CHILD. That makes use of this fop "transparent" to other translators that were written before it existed, but it also means that it only really works with pass-through translators. If a routing translator (such as DHT) or a fan-out translator (such as AFR) is involved, the IPC might not reach its intended destination unless those translators are modified to forward IPC fops along all paths.

If an IPC gets all the way to storage/posix it is considered an error, much like an uncaught exception. We don't actually *do* anything in that case, but we do log it and send back an EOPNOTSUPP error. This makes the "unrecognized opcode" condition distinguishable from the "no IPC support" condition (which would yield an RPC error instead), so clients can probe for the presence of a handler for their own favorite opcode and either use that or use old-school xattrs, depending on the result.

BUG: 1158628
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Change-Id: I84af1b17babe5b30ec03ecf027ae37d09b873968
Reviewed-on: http://review.gluster.org/8812
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
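The GF_FOP_IPC commit message suggests deriving op codes from a hash of the target translator's type string. One way that convention might look is sketched below; the hash (32-bit FNV-1a), the 256-op range per translator, and the function names are all assumptions made for illustration and are not taken from the GlusterFS sources.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit FNV-1a over a NUL-terminated string (illustrative choice). */
static uint32_t fnv1a32(const char *s)
{
    uint32_t h = 2166136261u;

    for (; *s != '\0'; s++) {
        h ^= (uint8_t)*s;
        h *= 16777619u;
    }
    return h;
}

/*
 * Base op for a translator type string. The low 8 bits are cleared so
 * each translator owns 256 consecutive op values, and the top bit is
 * cleared so the result is a non-negative int32_t.
 */
static int32_t ipc_op_base(const char *xlator_type)
{
    return (int32_t)(fnv1a32(xlator_type) & 0x7fffff00u);
}

/* The index-th op defined by the given translator type. */
static int32_t ipc_op(const char *xlator_type, uint8_t index)
{
    return ipc_op_base(xlator_type) + (int32_t)index;
}
```

Two cooperating translators would then agree on, say, `ipc_op("features/bitrot-stub", 0)` for their first private operation. Hash collisions between type strings remain possible, so this reduces rather than eliminates the chance of conflicts.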
* xlators/storage/bd : Unused value is removed. | Manikandan Selvaganesh | 2015-03-15 | 1 | -1/+0

CID:1128926

Change-Id: I5ad1229e225a36f995245a847db1a19609a18cd8
BUG: 789278
Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com>
Reviewed-on: http://review.gluster.org/9556
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* posix: add ACL translation for the GF_POSIX_ACL_*_KEY xattr | Niels de Vos | 2015-03-09 | 4 | -1/+113

Adding support for two virtual extended attributes that are used for converting a binary POSIX ACL to a POSIX.1e long ACL text format. This makes it possible to transfer the ACL over the network to a different OS, which can convert the POSIX.1e text format to its native structures.

The following xattrs are sent over RPC in SETXATTR/GETXATTR procedures, and contain the POSIX.1e long ACL text format:
- glusterfs.posix.acl: maps to ACL_TYPE_ACCESS
- glusterfs.posix.default_acl: maps to ACL_TYPE_DEFAULT

acl_from_text() (from libacl) converts the text format into an acl_t structure. This structure is then used by acl_set_file() to set the ACL in the filesystem.

libacl-devel is needed for linking against libacl, so it has been added to the BuildRequires in the .spec.

NetBSD does not support POSIX ACLs. When trying to get/set POSIX ACLs on a storage server running NetBSD, an error is returned with errno set to ENOTSUP. Faking support, but not enforcing ACLs, seems wrong to me.

URL: http://www.gluster.org/community/documentation/index.php/Features/Improved_POSIX_ACLs
BUG: 1185654
Change-Id: Ic5eb73d69190d3492df2f711d0436775eeea7de3
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9627
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: soumya k <skoduri@redhat.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* Fix dictionary leaks in ancestry-building code. | Pranith Kumar K | 2015-03-09 | 2 | -6/+1

Change-Id: I7a4a24ed95f897d1c14d89f3869c20ba40f85b7f
BUG: 1188636
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/9839
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijaikumar Mallikarjuna <vmallika@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* libglusterfs: Moved common functions as utils in syncop/common-utils | Pranith Kumar K | 2015-02-27 | 1 | -3/+9

These will be used by both afr and ec. Moved syncop_dirfd, syncop_ftw, syncop_dir_scan functions also into syncop-utils.c

Change-Id: I467253c74a346e1e292d36a8c1a035775c3aa670
BUG: 1177601
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/9740
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-by: Anuradha Talur <atalur@redhat.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Storage/posix : Adding error checks in path formation | Nithya Balachandran | 2015-02-19 | 1 | -2/+3

Modified a few log messages added for this fix. Also set the op_errno in an error check.

Change-Id: I87caf2f89031aedad1aaee001aef54896dbecd3b
BUG: 1113960
Signed-off-by: Nithya Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/9702
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* posix: Fix unlink failing under specific condition | Prashanth Pai | 2015-02-19 | 1 | -1/+3

PROBLEM:
Files are undeletable when these three conditions are met:
1. File does not have trusted.pgfid.<gfid> xattr set. This won't be set when build-pgfid is off (default).
2. File has hardlink count > 1.
3. build-pgfid option is turned on.

FIX:
Allow unlink on files not having trusted.pgfid.<gfid> xattr.

Change-Id: I58a9d9a1b29a0cb07f4959daabbd6dd04fab2b34
BUG: 1122028
Signed-off-by: Prashanth Pai <ppai@redhat.com>
Reviewed-on: http://review.gluster.org/8352
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
* Storage/posix : Adding error checks in path formation | Nithya Balachandran | 2015-02-18 | 4 | -20/+184

Renaming directories can cause the size of the buffer required for posix_handle_path to increase between the first call, which calculates the size, and the second call, which forms the path in the buffer allocated based on the size calculated in the first call. The path created in the second call overflows the allocated buffer and overwrites the stack, causing the brick process to crash.

The fix adds a buffer size check to prevent the buffer overflow. It also checks for and returns an error if the posix_handle_path call is unable to form the path, instead of working on the incomplete path, which is likely to cause subsequent calls using the path to fail with ELOOP.

Preventing buffer overflow and handling errors

BUG: 1113960
Change-Id: If3d3c1952e297ad14f121f05f90a35baf42923aa
Signed-off-by: Nithya Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/9289
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
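The size-then-fill race described in this commit is a general hazard whenever a path is built in two calls. A minimal sketch of the defensive pattern follows; `build_path` is a hypothetical helper, not posix_handle_path itself, and it only shows the idea of re-checking the length on the call that actually writes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Join base and rel into buf. Returns the path length on success, or
 * -1 with errno set to ENAMETOOLONG if buf is too small. The check is
 * repeated on every call, so a path that grew after an earlier sizing
 * call is reported as an error instead of overflowing the buffer.
 */
static int build_path(char *buf, size_t buflen,
                      const char *base, const char *rel)
{
    int n = snprintf(buf, buflen, "%s/%s", base, rel);

    if (n < 0)
        return -1;
    if ((size_t)n >= buflen) {
        errno = ENAMETOOLONG;
        return -1;
    }
    return n;
}
```

Callers are expected to test the return value and bail out, rather than continue with a truncated path.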
* storage/posix: Don't try to set gfid in case of INTERNAL-mknod | Pranith Kumar K | 2015-01-16 | 1 | -7/+12

Change-Id: I96540ed07f08e54d2a24a3b22c2437bddd558c85
BUG: 1088649
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/9446
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* storage/posix: Set gfid after all xattrs, uid/gid are set | Pranith Kumar K | 2015-01-12 | 1 | -32/+32

Problem:
When a new entry is created, the gfid is set even before the uid/gid and xattrs are set on the entry. This can lead to dht/afr healing that file/dir with the uid/gid it sees just after the gfid is set, i.e. root/root. Sometimes setattr/setxattr fail on that file/dir.

Fix:
Set the gfid of the file/directory only after the uid/gid and xattrs are set up properly. Readdirp and lookup either wait for the gfid to be assigned to the entry or do not update the in-memory inode ctx in the posix-acl xlator, which was producing a lot of EACCES/EPERM errors to the application or dht/afr self-heals.

Change-Id: I0a6ced579daabe3452f5a85010a00ca6e8579451
BUG: 1088649
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/9434
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* storage/posix: Set correct fsgid before doing symlink | Pranith Kumar K | 2015-01-12 | 1 | -2/+2

Change-Id: Ic50dfa5e5084c7b148e42a5014cca2b47c8ab5ed
BUG: 1180986
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/9431
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* telldir()/seekdir() portability fixes | Emmanuel Dreyfus | 2014-12-17 | 1 | -9/+17

POSIX says that an offset obtained from telldir() can only be used on the same DIR *. Linux is able to reuse the offset across closedir()/opendir() for a given directory, but this is not portable and such behavior should be fixed.

An incomplete fix for the posix xlator was merged in http://review.gluster.com/8926. This change set completes it.

- Perform the same fix in the index xlator.

- Use appropriate casts and variable types so that 32 bit signed offsets obtained by telldir() do not get clobbered when copied into 64 bit signed types.

- Modify glfs-heal.c and afr-self-heald.c so that they do not use an anonymous fd, since this causes a closedir()/opendir() between each syncop_readdir(). On failure we fall back to an anonymous fd only for Linux, so that we can cope with an updated client vs. a not updated brick.

- Avoid sending an EINVAL when the client requests the EOF offset. Here we fix an error in the previous fix for the posix xlator: since we fill each directory entry with the offset of the next entry, we must consider as EOF the offset of the last entry, and not the value of telldir() after we read it.

- Add checks in regression tests that we do not hit cases where offsets fed to seekdir() are wrong. Introduce the log_newer() shell function to check for messages produced by the current script.

This fix gathers changes from http://review.gluster.org/9047 and http://review.gluster.org/8936, making them obsolete.

BUG: 1129939
Change-Id: I59fb7f06a872c4f98987105792d648141c258c6a
Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-on: http://review.gluster.org/9071
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Tested-by: Raghavendra Bhat <raghavendra@redhat.com>
* storage/posix: Set errno for xattrop failures | Pranith Kumar K | 2014-12-10 | 1 | -0/+3

Change-Id: I4d44068c8da5257227d62906ec18ae16f6ed6c02
BUG: 1172477
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/9261
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Tested-by: Raghavendra Bhat <raghavendra@redhat.com>
* Glusterfs/posix: Stack corruption in posix_handle_pump | Nithya Balachandran | 2014-12-03 | 1 | -2/+5

posix_handle_pump can corrupt the stack if the buffer passed to it is too small to hold the final path.

Fix:
Check if the buffer is sufficiently large to hold the new path component before modifying it. This will prevent the buffer overrun, but the path returned will most likely have too many symbolic links, causing subsequent file ops to fail with ELOOP.

The callers of this function do not currently check the return value. The code needs to be modified to have all callers check the return value and take appropriate action in case of an error.

Change-Id: I6d9589195a4b0d971a107514ded6e97381e5982e
BUG: 1113960
Signed-off-by: Nithya Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/8189
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* posix: remove duplicate dirfd calls in posix_opendir | Atin Mukherjee | 2014-11-30 | 1 | -1/+1

BUG: 1168910
Change-Id: I285d352d20374bb3edee2db42d062d4724198425
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-on: http://review.gluster.org/9186
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* posix: Fix buffer overrun in _handle_list_xattr() | Emmanuel Dreyfus | 2014-11-28 | 1 | -4/+3

In _handle_list_xattr() we test remaining_size > 0 to check that we do not overrun the buffer, but since that variable was unsigned (size_t), the condition would let us go beyond the end of the buffer if remaining_size became negative.

This could happen if the attribute list grew between the first sys_llistxattr() call, which gets the size, and the second sys_llistxattr() call, which gets the data. We fix the problem by making remaining_size signed (ssize_t). This also matches the sys_llistxattr() return type.

While there, we use the size returned by the second sys_llistxattr() call to parse the buffer, as it may also be smaller than the size obtained from the first call if the attribute list shrank.

This fixes a spurious crash in tests/basic/afr/resolve.t

BUG: 1129939
Change-Id: Ifc5884dd0f39a50bf88aa51fefca8e2fa22ea913
Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-on: http://review.gluster.org/9204
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
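The unsigned-counter pitfall in this commit is easy to reproduce outside GlusterFS. The sketch below walks a NUL-separated name list of the kind llistxattr() returns, using a signed remaining counter; the function name is hypothetical, and the bounded memchr() search is an extra precaution added here, not something the commit message describes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/*
 * Count the complete names in a NUL-separated list of `size` bytes.
 * `remaining` is signed, so the loop condition stays meaningful even
 * if a caller passes a zero or negative size (e.g. a failed listing).
 */
static int count_xattr_names(const char *list, ssize_t size)
{
    ssize_t remaining = size;
    int     count = 0;

    while (remaining > 0) {
        /* Search only inside the bytes we are allowed to read. */
        const char *end = memchr(list, '\0', (size_t)remaining);
        size_t      len;

        if (end == NULL)
            break;          /* unterminated tail: stop, do not overrun */

        len = (size_t)(end - list);
        count++;
        list += len + 1;
        remaining -= (ssize_t)(len + 1);
    }
    return count;
}
```

With a `size_t` counter, a subtraction such as `3 - 5` wraps around to a huge positive value and a `> 0` test keeps the loop running; with `ssize_t` the same subtraction yields -2 and the loop ends.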
* core: fix remaining *printf formation warnings on 32-bit | Kaleb S. KEITHLEY | 2014-11-26 | 3 | -11/+13

This fixes a few lingering size_t problems. Of particular note are some uses of off_t for size params in function calls. There is no correct, _portable_ way to correctly print an off_t. The best you can do is use a scratch int64_t/PRId64 or uint64_t/PRIu64.

Change-Id: I86f3cf4678c7dbe5cad156ae8d540a66545f000d
BUG: 1110916
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/8105
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Harshavardhana <harsha@harshavardhana.net>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
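The scratch-variable approach mentioned in this commit can be shown in a few lines. `format_offset` is a hypothetical helper name; the technique itself (copy the off_t into an int64_t and print with PRId64) is the one the message describes.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Format an off_t portably: off_t has no printf conversion of its own
 * and its width differs between platforms, so copy it into a
 * fixed-width scratch variable and print that with PRId64.
 */
static int format_offset(char *buf, size_t buflen, off_t off)
{
    int64_t scratch = (int64_t)off;

    return snprintf(buf, buflen, "%" PRId64, scratch);
}
```

The same pattern applies to log calls: cast once into the scratch variable, then use the matching PRI macro in the format string.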
* posix: Changed order of chown and chmod | Venkatesh Somyajulu | 2014-11-14 | 1 | -7/+7

Problem:
The rebalance process runs as root. If a normal user creates a file and it requires migration, then, because the migrated file is created by root, its owner must be changed to the source normal user and its permissions must be changed to the previous mode. If the suid bit is set on the source, it should also be set at the destination.

Two operations are performed in the given order:
1. chmod
2. chown

But chown resets the suid bit. So the order of these two operations is changed: chown is performed first and then chmod, so that the suid bit is preserved.

Change-Id: Ib63b5cf528f8336b69bf090ad43bb02eec1d1602
BUG: 1086228
Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
Reviewed-on: http://review.gluster.org/7435
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
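The ordering described in this commit can be sketched on a plain file descriptor. The helper name below is hypothetical and the code is not the posix xlator's; it only shows ownership being applied before the mode, so that any suid bit in the mode is applied last and is not cleared by the ownership change.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Apply ownership first and the mode second. Changing ownership may
 * clear the set-user-ID bit, so the mode (including S_ISUID, if
 * requested) has to be applied afterwards.
 */
static int set_owner_then_mode(int fd, uid_t uid, gid_t gid, mode_t mode)
{
    if (fchown(fd, uid, gid) != 0) {
        perror("fchown");
        return -1;
    }
    if (fchmod(fd, mode) != 0) {
        perror("fchmod");
        return -1;
    }
    return 0;
}
```

Reversing the two calls is what loses the bit: the mode would be set correctly and then modified again as a side effect of the ownership change.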
* inode: Handle '/' in basename in inode_link/unlink | Pranith Kumar K | 2014-11-07 | 1 | -1/+1

Problem:
inode_link is sometimes called with a trailing '/'. Lookup and dentry operations like link/unlink/mkdir/rmdir/rename etc. come without a trailing '/', so the stale dentry with '/' remains in the dentry list of the inode.

Fix:
Add assert checks and return NULL for '/' in bname. Fix ancestry building code to call without '/' at the end.

Change-Id: I9c71292a3ac27754538a4e75e53290e182968fad
BUG: 1158751
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/9004
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: Treat ENODATA/ENOATTR as success in bulk removexattr | Pranith Kumar K | 2014-11-05 | 1 | -0/+14

Bulk remove xattr is an internal fop in gluster. Some of the xattrs may have special behavior. Ex: removexattr("system.posix_acl_access") removes more than one xattr on the file, and those xattrs could also be present in the bulk-removal request. Removexattr of these already deleted xattrs will fail with either ENODATA or ENOATTR. Since all this fop cares about is the removal of the xattrs in the bulk-remove request, xattrs that are already deleted can be treated as success.

Change-Id: Id8f2a39b68ab763ec8b04cb71b47977647f22da4
BUG: 1160509
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/9049
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
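The error classification in this commit amounts to a small predicate over errno. A minimal sketch follows, with a hypothetical function name; the ENOATTR fallback define is an assumption for platforms (such as Linux) whose errno.h does not provide it.

```c
#include <assert.h>
#include <errno.h>

/* Assumption: where ENOATTR is not defined, treat it as ENODATA. */
#ifndef ENOATTR
#define ENOATTR ENODATA
#endif

/*
 * Map the result of one removexattr() call made on behalf of a bulk
 * removal. "Attribute already gone" counts as success, because the
 * goal is only that the attribute is absent afterwards. Other errors
 * are returned as a negative errno value.
 */
static int bulk_remove_status(int ret, int err)
{
    if (ret == 0)
        return 0;
    if (err == ENODATA || err == ENOATTR)
        return 0;
    return -err;
}
```

A bulk-removal loop would call this after each removexattr() and abort only on a non-zero status.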
* Avoid spurious EINVAL in posix_readdir() | Emmanuel Dreyfus | 2014-10-29 | 2 | -3/+19

On non-Linux systems, we check that seekdir() succeeds and we return EINVAL if it does not. We need this to avoid infinite loops if some other component in GlusterFS makes an invalid seekdir() usage. This was introduced in this change: http://review.gluster.org/#/c/8760/

But seekdir() also fails when using the offset returned for the last entry, and this is expected behavior. As a result, the seekdir() test produces a spurious EINVAL when reaching the end of the directory. That error is not propagated to the calling process, but it may harm internal GlusterFS processing. At the least, it produces a spurious error message in the brick's log.

We fix the problem by remembering the last entry offset in the fd private data. When a new posix_readdir() invocation requests that offset, we avoid returning EINVAL.

BUG: 1129939
Change-Id: I4e67a2ea46538aae63eea663dd4aa33b16ad24c7
Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-on: http://review.gluster.org/8926
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Posix: Brick failure detection fix for ext4 filesystem | Lalatendu Mohanty | 2014-10-28 | 1 | -6/+64

Issue: stat() on XFS has a check for the filesystem status but ext4 does not.

Fix: Replacing the stat() call with open, write and read to a new file under the "brick/.glusterfs" directory. This change will work for xfs, ext4 and other filesystems.

Change-Id: Id03c4bc07df4ee22916a293442bd74819b051839
BUG: 1130242
Signed-off-by: Lalatendu Mohanty <lmohanty@redhat.com>
Reviewed-on: http://review.gluster.org/8213
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
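An open/write/read probe of the kind this commit describes might look roughly like the sketch below. The function name, the probe file name `health_check` and the payload are assumptions for illustration; the actual file the posix xlator uses under `.glusterfs` is not named in this log.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Probe a directory by creating a file in it, writing a small payload,
 * reading it back and comparing. Returns 0 if every step succeeds and
 * -1 otherwise. Unlike a bare stat(), this exercises the write path of
 * the backing filesystem.
 */
static int brick_health_check(const char *dir)
{
    char       path[4096];
    const char out[] = "health-check";
    char       in[sizeof(out)];
    int        fd, ok;
    int        n = snprintf(path, sizeof(path), "%s/health_check", dir);

    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    ok = write(fd, out, sizeof(out)) == (ssize_t)sizeof(out) &&
         lseek(fd, 0, SEEK_SET) == 0 &&
         read(fd, in, sizeof(in)) == (ssize_t)sizeof(in) &&
         memcmp(in, out, sizeof(out)) == 0;

    close(fd);
    unlink(path);
    return ok ? 0 : -1;
}
```

A periodic health-check thread could call this on the brick's internal directory and mark the brick as failed when it returns -1.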
* POSIX filesystem compliance: PATH_MAXEmmanuel Dreyfus2014-10-033-4/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | POSIX mandates the filesystem to support paths of lengths up to _XOPEN_PATH_MAX (1024). This is the PATH_MAX limit here: http://pubs.opengroup.org/onlinepubs/009604499/basedefs/limits.h.html When using a path of 1023 bytes, the posix xlator attempts to create an absolute path by prefixing the 1023 bytes path by the brick base path. The result is an absolute path of more than _XOPEN_PATH_MAX bytes which may be rejected by the backend filesystem. Linux's ext3fs PATH_MAX seems to defaut to 4096, which means it will work (except if brick base path is longer than 2072 bytes but it is unlikely to happen. NetBSD's FFS PATH_MAX defaults to 1024, which means the bug can happen regardless of brick base path length. If this condition is detected for a brick, the proposed fix is to chdir() the brick glusterfsd daemon to its brick base directory. Then when encountering a path that will exceed _XOPEN_PATH_MAX once prefixed by the brick base path, a relative path is used instead of an absolute one. We do not always use relative path because some operations require an absolute path on the brick base path itself (e.g.: statvfs). At least on NetBSD, this chdir() uncovers a race condition which causes file lookup to fail with ENODATA for a few seconds. The volume quickly reaches a sane state, but regression tests are fast enough to choke on it. The reason is obscure (as often with race conditions), but sleeping one second after the chdir() seems to change scheduling enough that the problem disapear. Note that since the chdir() is done if brick backend filesystem does not support path long enough, it will not occur with Linux ext3fs (except if brick base path is over 2072 bytes long). 
BUG: 1129939 Change-Id: I7db3567948bc8fa8d99ca5f5ba6647fe425186a9 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8596 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Harshavardhana <harsha@harshavardhana.net> Tested-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
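The absolute-versus-relative choice described in this commit can be sketched as below. This is a Python illustration, not the posix xlator's C code; `backend_path` is a hypothetical helper, and it assumes the process has already chdir()'d to the brick base directory so that the relative form resolves correctly.

```python
import os

XOPEN_PATH_MAX = 1024  # value quoted in the commit message


def backend_path(brick_base, volume_path, path_max=XOPEN_PATH_MAX):
    """Pick the path handed to the backend filesystem.

    Hypothetical helper: assumes the current working directory is brick_base.
    """
    rel = volume_path.lstrip("/")
    if not rel:
        # Operations on the brick root itself keep the absolute path.
        return brick_base
    absolute = os.path.join(brick_base, rel)
    # +1 for the terminating NUL byte that PATH_MAX accounts for in C.
    if len(os.fsencode(absolute)) + 1 > path_max:
        return rel  # too long once prefixed: fall back to the relative path
    return absolute
```

Short paths keep their absolute form, so operations that need it (the commit mentions statvfs on the brick base path) are unaffected.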
* glusterd/quota: Heal pgfid xattr on existing data when the quota isvmallika2014-09-302-0/+37
| | | | | | | | | | | | | | | | | | | | | enable The pgfid extended attributes are used to construct the ancestry path (from the file to the volume root) for nameless lookups on files. As NFS relies on nameless lookups heavily, quota enforcement through NFS would be inconsistent if quota were to be enabled on a volume with existing data. The solution is to heal the pgfid extended attributes as a part of the lookup performed by the quota-crawl process. In a posix lookup, check for the pgfid xattr and, if it is missing, set the xattr. Change-Id: I5912ea96787625c496bde56d43ac9162596032e9 BUG: 1147378 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/8878 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Fix invalid seekdir() usageEmmanuel Dreyfus2014-09-301-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to POSIX, seekdir() should only be given an offset obtained from telldir() on the same DIR * http://pubs.opengroup.org/onlinepubs/9699919799/functions/seekdir.html Code from afr-self-heald.c and index.c is operating outside of the specification, by using seekdir() with an offset from a previously opened, closed and re-opened directory. This seems to work on Linux (although with no guarantee it always will in the future). On NetBSD, seekdir() with an invalid offset is a nilpotent operation and causes an infinite loop, since index_fill_readdir() always restarts from the beginning of the directory. The situation is fixed by using a non-anonymous fd in afr-self-heald.c: we explicitly open the directory so that it remains open on the brick side during the timeframe where we want to reuse offsets in seekdir(). This requires adding an opendir fop in the index xlator. If the brick was not updated, the opendir will fail and we fall back to the standard-violating approach for backward compatibility on Linux. On other systems we fail, since it never worked. While there, add tests to check seekdir() success in the index and posix xlators, so that incorrect usage from calling code produces an explicit error instead of an infinite loop. We can only do it on non-Linux systems, for the sake of backward compatibility when the brick was updated but not the client. BUG: 1129939 Change-Id: I88ca90acfcfee280988124bd6addc1a1893ca7ab Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8760 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterfs: allow setxattr of keys with null values.Ravishankar N2014-09-292-4/+4
| | | | | | | | | | | | | | | Disk based file systems allow to get/set extended attribute key-value pairs where value can be null. Fuse/libgfapi clients must be able to do the same on a gluster volume. Change-Id: Ifc11134cc07f1a3ede43f9d027554dcd10b5c930 BUG: 1135514 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/8567 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Do not forbid fallocate on non Linux systemsEmmanuel Dreyfus2014-09-261-7/+2
| | | | | | | | | | | | | | | | Linux fallocate() differs from posix_fallocate() by an extra flag that can have the FALLOC_FL_KEEP_SIZE value; Do not test FALLOC_FL_KEEP_SIZE existence to enable fallocate() in posix xlator, as sys_fallocate() in libglusterfs provides support for both implementations. BUG: 1129939 Change-Id: Idf41a0396028a15e81281791bf6912d7fd674e3f Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8856 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: Log when mkdir is on an existing gfid but non-existentRaghavendra G2014-09-181-1/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | path. Consider the following steps on a distribute volume: 1. rename (src, dst) on hashed subvolume 2. snapshot taken 3. restore snapshots and do stat on src and dst Now we end up with two directories, src and dst, having the same gfid, because distribute creates directories on subvolumes where they do not exist as part of directory healing. This can happen even with a race between rename and directory healing in dht-lookup. This can lead to undefined behaviour while accessing either of the two directories. Hence, we are logging the paths of both directories, so that a sysadmin can take some corrective action when (s)he sees this log. One corrective action can be to copy the contents of both directories from the backend into a new directory and delete both directories. Since the effort involved in fixing this issue is non-trivial, we are giving this workaround until we come up with a fix. Change-Id: I38f4520e6787ee33180a9cd1bf2f36f46daea1ea BUG: 1105082 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/8008 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* Always check for ENODATA with ENOATTREmmanuel Dreyfus2014-09-083-10/+14
| | | | | | | | | | | | | | | | | | Linux defines ENODATA and ENOATTR with the same value, which means that code can miss one of the two without breaking. FreeBSD does not have ENODATA and GlusterFS defines it as ENOATTR just like Linux does. On NetBSD, ENODATA != ENOATTR, hence we need to check for both values to get portable behavior. BUG: 764655 Change-Id: I003a3af055fdad285d235f2a0c192c9cce56fab8 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8447 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
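The portable check described in this commit amounts to testing an error code against both values. The Python sketch below is an illustration only (the actual change is in C); `is_missing_xattr` and `MISSING_XATTR_ERRNOS` are hypothetical names, and the `getattr` fallbacks are an assumption to cope with platforms whose `errno` module lacks one of the two constants.

```python
import errno

# Not every platform exposes both names; fall back to whichever one exists.
_ENODATA = getattr(errno, "ENODATA", None)
_ENOATTR = getattr(errno, "ENOATTR", None)

# On Linux this set has a single element; on a platform where the two values
# differ (the commit cites NetBSD) it has two.
MISSING_XATTR_ERRNOS = {e for e in (_ENODATA, _ENOATTR) if e is not None}


def is_missing_xattr(err):
    """Return True if err means 'no such extended attribute' on this platform."""
    return err in MISSING_XATTR_ERRNOS
```

Callers then compare against the set instead of a single constant, so the same code behaves identically whether or not the two values coincide.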
* cluster/dht: Added code to capture races in dht-lookup pathVenkatesh Somyajulu2014-09-031-0/+6
| | | | | | | | | Change-Id: I9270d2d40ebd4b113ff961583dfda7754741f15b BUG: 1116150 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/8430 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix : Missing space in log messageNithya Balachandran2014-09-031-1/+1
| | | | | | | | | | | Added a space in a log message Change-Id: Iabd50e6b5c9ff4673f59d6b52b785894b3dcdaf9 BUG: 1116150 Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/8585 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* storage/posix: Prefer gfid links for inode-handlePranith Kumar K2014-09-021-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: The file path could be changed by other entry operations in flight, so if renames are in progress at the time of other operations like open, it may lead to failures. We observed that this issue can also happen while renames and readdirps/lookups are in progress, because the dentry-table sometimes goes stale. Fix: Prefer gfid-handles over paths for files. For directory handles, preferring gfid-handles hits performance issues because it needs to resolve paths traversing up the symlinks. Tests which check whether files are opened should check on the gfid path after this change, so a couple of tests were changed to reflect the same. Note: This patch doesn't fix the issue for directories. I think a complete fix is to come up with an entry operation serialization xlator. Until then let's live with this. Change-Id: I10bda1083036d013f3a12588db7a71039d9da6c3 BUG: 1136159 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8575 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: Don't unlink .glusterfs-hardlink before linkto checkVenkatesh Somyajulu2014-08-281-8/+37
| | | | | | | | | | BUG: 1116150 Change-Id: I90a10ac54123fbd8c7383ddcbd04e8879ae51232 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/8559 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: fix issue in posix_fsyncZhang Huan2014-08-011-1/+2
| | | | | | | | | | | | | | Fix the issue that posix_fsync does not correctly return and save error code in op_errno when call to sys_fdatasync fails. Change-Id: Id0b62cfa009dbb52c8a0992abd5c46330fa0a8c0 BUG: 1125814 Signed-off-by: Zhang Huan <zhhuan@gmail.com> Reviewed-on: http://review.gluster.org/8398 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Santosh Pradhan <spradhan@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
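The pattern this fix restores (return a failure status and save the error code when the sync call fails) can be sketched in Python as follows. It is an illustration only, not posix_fsync itself; `sync_fd` is a hypothetical helper, and the `(op_ret, op_errno)` tuple merely mirrors the naming used in the commit message.

```python
import errno
import os


def sync_fd(fd, datasync=False):
    """Flush fd to stable storage and return (op_ret, op_errno).

    Hypothetical helper: (0, 0) on success, (-1, errno) on failure.
    """
    try:
        if datasync and hasattr(os, "fdatasync"):
            os.fdatasync(fd)  # not available on every platform
        else:
            os.fsync(fd)
    except OSError as exc:
        # Save the error code instead of dropping it, which is the bug fixed here.
        return -1, exc.errno
    return 0, 0
```

For instance, calling `sync_fd` on a descriptor that has already been closed returns `(-1, errno.EBADF)` rather than reporting success.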
* cluster/dht: Modified logic of linkto file deletion on non-hashedVenkatesh Somyajulu2014-07-311-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, whenever dht_lookup_everywhere gets called, if a linkto file is found on a non-hashed subvolume in dht_lookup_everywhere_cbk, the file is unlinked. But there are cases when this file is under migration. Under such a condition, we should avoid deleting the file. When some other rebalance process changes the layout of the parent such that dst_file (w.r.t. migration) falls on a non-hashed node, lookup may have found it as a linkto file, but just before the unlink the file is under migration or already migrated. In such cases the unlink can be avoided. Race: ------- Say we have two bricks (brick-1 and brick-2) with an initial file "a" under BaseDir which is hashed as well as cached on brick-1. Assume hashing "a" gives 44. Brick-1 Brick-2 Initial Setup: BaseDir/a BaseDir [1-50] [51-100] Now add new-brick Brick-3. 1. Rebalance-1 on node Node-1 (Brick-1 node) will reset the BaseDir layout. 2. After that it will a) create a linkto file on the new hashed subvolume (brick-2) and b) perform file migration. 1. Rebalance-1 fixes the base layout: Brick-1 Brick-2 Brick-3 --------- ---------- ------------ BaseDir/a BaseDir BaseDir [1-33] [34-66] [67-100] 2. Only a) (create linkto file) is performed: BaseDir/a BaseDir/a(linkto) BaseDir Now rebalance-2 on node-2 jumps in and performs steps 1 and 2-a. After (rebal-2, step-1), it changes the layout of the BaseDir. BaseDir/a BaseDir/a(link) BaseDir [67-100] [1-33] [34-66] For (rebal-2, step-2), it will perform a lookup at Brick-3, as 44 falls on brick-3 w.r.t. the new layout. But the lookup will fail, so dht_lookup_everywhere gets called. NOTE: On brick-2, a linkto file was created by rebalance-1. Currently that linkto file gets deleted by the rebalance-2 lookup, as it is considered a stale linkto file. But with this patch, if rebalance is already in progress or rebalance is over, the linkto file will not be unlinked.
If rebalance is in progress, the fd will be open, and if rebalance is over, then linkto file won't be set. Change-Id: I3fee0d28de3c76197325536a9e30099d2413f079 BUG: 1116150 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/8345 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: removing deleting entries in case of creation failuresRaghavendra G2014-07-304-41/+91
| | | | | | | | | | | | | The code is not atomic enough to avoid deleting a dentry created by a parallel dentry creation operation. Change-Id: I9bd6d2aa9e7a1c0688c0a937b02a4b4f56d7aa3d BUG: 1117851 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/8327 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>