| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When searching for an exact entry we need to compare the
component count in the candidate VMP against the component
count in the path being searched. This is the opposite of the
current situation, where we compare the component count in the
VMP with the component count in maxentry, which will always
be the same.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 209 (VMP parsing through fstab has issues)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=209
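A minimal sketch of the exact-match rule described above, in standard C with a
hypothetical helper name (this is not the actual libglusterfsclient code); it
assumes a prefix match has already been established, so equal component counts
make the match exact:

    #include <string.h>

    /* Hypothetical helper: count the components of a '/'-separated path. */
    static int
    libgf_count_components (const char *path)
    {
            int count = 0;
            for (; *path; path++)
                    if (*path == '/' && path[1] != '\0' && path[1] != '/')
                            count++;
            return count;
    }

    /* Exact-entry test: the candidate VMP is an exact match only when its
     * component count equals the searched path's component count (comparing
     * the VMP against maxentry would trivially always be equal). */
    static int
    libgf_vmp_is_exact (const char *vmp, const char *path)
    {
            return (libgf_count_components (vmp) ==
                    libgf_count_components (path));
    }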
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Another attempt to enhance searching for VMP entries.
There was a problem of returning the longest prefix match
from all the VMPs without checking whether the number of
matched components was the same as the number of components
in the candidate VMP.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 209 (VMP parsing through fstab has issues)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=209
|
|
|
|
|
|
|
|
|
| |
allocated memory.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 228 (Segmentation fault in glusterfs_getxattr)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=228
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some weeks back, I'd separated the big lock into vmplock and mountlock.
See commit 304e4274ca9b0339539581c5413e3339078c1182 in mainline.
At that time, we did not have a solution to the problem
of when to init the vmplist in a thread-safe manner, since
there was no lock to protect the vmplist specifically, and
libgf_vmp_map_ghandle was called inside glusterfs_mount
where the "lock" was already being held.
Now that we have separate mount and vmp locks, the
accesses can be synced correctly.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 211 (libglusterfsclient: Race condition against vmplist in libgf_vmp_map_ghandle)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=211
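A rough pthread sketch of the locking split being described; the lock and
function names here are assumptions for illustration, not the actual
libglusterfsclient symbols:

    #include <pthread.h>

    /* Two independent locks instead of one big lock (names assumed). */
    static pthread_mutex_t mountlock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t vmplock   = PTHREAD_MUTEX_INITIALIZER;

    void
    add_vmp_entry (void *entry)
    {
            /* glusterfs_mount may already hold mountlock; initialising or
             * extending the vmplist under its own vmplock is now race-free
             * and cannot self-deadlock on the single shared lock. */
            pthread_mutex_lock (&vmplock);
            /* ... init-once check and vmplist insertion go here ... */
            pthread_mutex_unlock (&vmplock);
    }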
|
|
|
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 210 (libglusterfsclient: Enhance logging)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=210
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Till now, we've been doing a character-by-character comparison
between a given path and the VMP to search for the glusterfs
handle for the given path.
This does not work in all cases and has been a known bug.
This commit changes the byte-by-byte comparison into a more
accurate component-based comparison to fix search
failures.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 209 (VMP parsing through fstab has issues)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=209
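A sketch of the component-by-component idea in standard C, with a hypothetical
helper name: every VMP component must match the corresponding path component,
so a raw byte prefix like "/mnt/vmp1" no longer matches "/mnt/vmp10":

    #include <string.h>

    /* Hypothetical component-wise prefix check: returns 1 if every component
     * of vmp matches the corresponding component of path. */
    static int
    libgf_vmp_component_match (const char *vmp, const char *path)
    {
            while (*vmp == '/')  vmp++;
            while (*path == '/') path++;

            while (*vmp) {
                    size_t vlen = strcspn (vmp, "/");
                    size_t plen = strcspn (path, "/");

                    if (vlen != plen || strncmp (vmp, path, vlen) != 0)
                            return 0;        /* component mismatch         */

                    vmp  += vlen;  while (*vmp == '/')  vmp++;
                    path += plen;  while (*path == '/') path++;

                    if (*vmp && !*path)
                            return 0;        /* path ran out of components */
            }
            return 1;
    }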
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When an fd_t is fd_create'd, we need to call fd_bind on it to
ensure that any fd_lookup on the inode gets us this fd. We were not
doing this, so translators like write-behind were not able to order
path-based requests at all; as a result, fops like stat, which
could be issued after a writev, were overtaking a previous writev
that was still being written behind.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 179 (fileop reports miscompares on read tests)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=179
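A minimal sketch of the intended sequence, assuming the fd.h API of that era
(fd_create, fd_bind, fd_lookup); the wrapper function name is illustrative:

    #include "fd.h"        /* fd_create, fd_bind, fd_lookup */
    #include "inode.h"

    /* Bind the freshly created fd to its inode so that a later fd_lookup()
     * on the same inode (e.g. by write-behind while ordering a path-based
     * stat against a pending writev) finds this fd. */
    static fd_t *
    libgf_open_and_bind (inode_t *inode, pid_t pid)
    {
            fd_t *fd = fd_create (inode, pid);
            if (!fd)
                    return NULL;

            fd_bind (fd);    /* attach fd to the inode's fd list        */

            return fd;       /* fd_lookup (inode, pid) now returns it   */
    }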
|
|
|
|
|
|
|
|
|
|
|
| |
Earlier, we invalidated the iattr cache on writes. Now
we need to do so for reads also, so that we do not update
the iattr cache with the 0-filled stat received from io-cache.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 180 (fileop fails at chmod with stale file handle error over unfs3)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=180
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Till now we've been creating an iovec, storing references to the
application data in it and simply passing it on to the translator
tree. This means that the buffer being passed to the translators is
not at all associated with the memory ref'd by the iobref argument
to the write fop. This is a problem when write-behind is a translator in
the tree, since it assumes that the memory in the iovecs passed to
write fops is already refcounted by the iobref and so it simply copies
the address of the application data. The problem is that the application
can continue using this buffer, free it, or overwrite it, destroying the
data that write-behind may write at a later time.
The solution involves copying the application's write buffer into
an iobuf which will be referred to by the iobref.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 178 (libglusterfsclient: Data corruption on using write-behind in translator tree)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=178
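A condensed sketch of the approach described above, assuming the iobuf/iobref
API of that release (iobuf_get, iobref_new, iobref_add) and that the write
size fits in a single iobuf page; error paths are abbreviated:

    #include <string.h>
    #include <sys/uio.h>
    #include "iobuf.h"      /* iobuf_get, iobref_new, iobref_add */

    /* Copy the application's write buffer into an iobuf owned by the iobref,
     * so write-behind can safely keep the data after we return. */
    static int
    libgf_copy_write_buf (struct iobuf_pool *pool, const char *app_buf,
                          size_t size, struct iovec *vector,
                          struct iobref **iobref_out)
    {
            struct iobuf  *iobuf  = iobuf_get (pool);
            struct iobref *iobref = iobref_new ();

            if (!iobuf || !iobref)
                    return -1;

            memcpy (iobuf->ptr, app_buf, size);   /* detach from app memory */
            iobref_add (iobref, iobuf);           /* iobref holds the ref   */

            vector->iov_base = iobuf->ptr;        /* iovec points at copy   */
            vector->iov_len  = size;
            *iobref_out      = iobref;            /* pass along with writev */
            return 0;
    }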
|
|
|
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 115 (./configure adds libglusterfsclient when it shouldn't)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=115
|
|
|
|
|
|
|
|
|
|
|
|
| |
There seems to be a reproducible corruption specifically of
the libglusterfs_client_local_t that is allocated for
the read call. Therefore, the subsequent access to the fd inside
local leads to a segfault. This is a temporary fix.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 164 (libglusterfsclient: Segfault due to memory corruption of frame local in libgf_client_read)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=164
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In libgf_client_loc_fill, there is a possibility that the
ino, par and name are all specified as non-NULL, non-zero args.
So if an inode is located in the itable using the ino, and the
subsequent search for the inode using the par ino and the file
name does not result in an inode being found, the current
code overwrites the inode that was found through the ino. The
correct behaviour is to stop further searches if an inode
was already found using ino.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 161 (unfs3 crashes on link system call by fileop)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=161
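A sketch of the intended priority, using the old-style inode_search() of that
era (signature assumed) and an illustrative wrapper name; the second search
only runs when the first one found nothing:

    #include "inode.h"     /* inode_search: by ino, or by parent ino + name */

    /* Prefer the hit by inode number; fall back to (parent ino, name) only
     * when nothing was found, instead of overwriting an existing hit. */
    static inode_t *
    libgf_resolve_inode (inode_table_t *itable, ino_t ino,
                         ino_t par, const char *name)
    {
            inode_t *inode = NULL;

            if (ino)
                    inode = inode_search (itable, ino, NULL);

            if (!inode && par && name)
                    inode = inode_search (itable, par, name);

            return inode;
    }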
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the loc_t of the link being created, we must fill in the inode
of the old/target loc since this is a link operation. The
inode_link to the new parent is called in libgf_client_link.
This fixes a crash while running fileop over a fully-loaded
dist-repl vol file.
Ref: Bugzilla 161
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 161 (unfs3 crashes on link system call by fileop)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=161
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is needed to work around the replicate behaviour of
possibly returning the device number for the same file from
different subvolumes.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 148 (replicate: Returns st_dev from different subvols resulting in ESTALE thru unfs3booster)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=148
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The workaround for the DHT requirement for a lookup on /
needs to be done only once, when the xlator graph is inited.
Doing it on every path's lookup results in a major performance
penalty when using upwards of 16 distribute subvolumes, as reported
by Avati.
Ref: bug 152
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 152 (libglusterfsclient: DHT workaround is a major performance bottleneck)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=152
|
|
|
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 130 (build warnings)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=130
|
|
|
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 149 (libglusterfsclient interacts incorrectly with write-behind on writev)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=149
|
|
|
|
|
|
|
|
|
|
|
| |
We weren't updating the attr, AKA stat, cache on reads and writes
on files, so every stat on the file before the timeout was returning
stale attrs from the cache. Yuck!
This fixes it. Turns out there is a good aspect of unfs3's notoriety
when it comes to doing stat()s for every operation.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
| |
Ref: http://www.gnu.org/s/libc/manual/html_node/Access-Modes.html
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is possible that the only translator in the libglusterfsclient
tree is the posix. In that case, inside gluster_init, the graph
init routines will need to call lstat on the posix subdirectory.
Since even the glusterfs stack is running over booster, those
calls will also first require vmp searching. BUT, the vmp lock
is the same as the mount lock that was already taken when we entered
glusterfs_mount, so a deadlock occurs.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This bug shows up while using unfs3 with replicate. The absence
of an inode_lookup on a looked-up/created inode results in it
getting pruned from the inode table. Consequently, a subsequent
lookup for the inode results in a different inode number being
returned by replicate. This breaks unfs3 because it tries to remember
the inode numbers returned by two different stat-family calls.
Resolves: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=11
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
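A minimal sketch of the fix's idea, assuming the libglusterfs inode API of
that time (inode_lookup marks the inode as looked up so table pruning skips
it); the wrapper name is illustrative:

    #include "inode.h"     /* inode_lookup */

    /* In the lookup/create callback path: once the inode is linked into the
     * table, mark it looked-up so it is not pruned from the lru list and
     * later lookups keep returning the same inode (and inode number). */
    static void
    libgf_pin_inode (inode_t *inode)
    {
            if (inode)
                    inode_lookup (inode);   /* pinned until inode_forget */
    }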
|
|
|
|
|
|
|
| |
- Generally glusterfs_reset is called after fork in the child to empty
out the vmplist.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
|
|
| |
stored in fd_ctx is used.
- this helps in implementing sendfile(2). The man page says that
"If offset is not NULL, then sendfile() does not modify the current
file offset of in_fd".
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
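For reference, a small user-space analogue of the sendfile(2) contract quoted
above, in plain POSIX calls (this is an illustration, not the
libglusterfsclient implementation): an explicit offset is used and advanced on
its own, while a NULL offset falls back to the descriptor's cached offset.

    #include <unistd.h>
    #include <sys/types.h>

    static ssize_t
    read_like_sendfile (int in_fd, void *buf, size_t count, off_t *offset)
    {
            if (offset) {
                    /* explicit offset: pread leaves in_fd's offset untouched */
                    ssize_t ret = pread (in_fd, buf, count, *offset);
                    if (ret > 0)
                            *offset += ret;
                    return ret;
            }
            /* NULL offset: use and advance the descriptor's current offset */
            return read (in_fd, buf, count);
    }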
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
| |
- unmounts all the entries in the vmplist.
- this api helps booster clean up all the mounts in a single call.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
| |
- this patch also checks for the presence of a vmp before adding
a vmpentry.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
|
| |
We can avoid memory allocation, de-allocation and
data copies by just using the entries passed to us from
a lower layer and by de-linking the entries from the original
list.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This improves the potential for pre-fetching a larger
number of dirents. Consider that, with 255 chars as the max
name length for each dirent, in the worst-case scenario, where
we actually have files with such large names, we're not getting
more than 4 entries with the current block size of 1024.
Generally also, increasing the size to 4k gives us
a higher chance that directories with a low to medium
number of dirents will be pre-fetched in a single readdir fop.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
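A back-of-the-envelope check of the numbers above; the per-entry size here is
only the name plus its terminator, and any real per-entry header bytes make
the worst case slightly worse:

    #include <stdio.h>

    int
    main (void)
    {
            /* 255-char name + NUL; header overhead ignored for simplicity. */
            const int per_entry = 255 + 1;

            printf ("1024-byte block: %d entries\n", 1024 / per_entry); /*  4 */
            printf ("4096-byte block: %d entries\n", 4096 / per_entry); /* 16 */
            return 0;
    }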
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The fop interface is such that we're able to extract more than
one dirent in a readdir fop. This commit now enables libglusterfsclient
to read multiple entries on a glusterfs_readdir call. Once these
have been pre-fetched, they're cached till either glusterfs_closedir,
glusterfs_rewinddir or glusterfs_seekdir is called.
The current implementation is beneficial for sequential directory
reading and probably indifferent to applications that do a lot of seekdir
and rewinddir after opening the directory. This is because
both these calls result in dirent cache invalidation.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
|
| |
There is a mechanism for caching the inode numbers got from a lookup
and a struct stat got from a stat or fstat, but I wasn't sure if it worked.
This commit simplifies cache updates and checks, and the accompanying
tests have made sure that the cache does work.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
|
|
|
|
|
|
|
| |
During a rename, if the new file exists, the old name needs to
overwrite the new name. We're returning EEXIST, which is wrong
behaviour.
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
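The POSIX behaviour being restored, in plain libc terms: rename() replaces an
existing target instead of failing.

    #include <stdio.h>

    int
    main (void)
    {
            /* If "b" already exists, rename() atomically replaces it with
             * "a"; returning EEXIST here would violate POSIX. */
            return rename ("a", "b") == 0 ? 0 : 1;
    }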
|
|
|
|
|
|
|
|
|
|
|
| |
In __do_path_resolve, we need to use the new_loc.path as the input
for resolution rather than the resolved variable, simply because we're
not interested in resolving the names that have been resolved, as
pointed out by the variable name 'resolved'. Instead, we need to resolve
new_loc, which stores the next component in the path to
be looked up.
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Not cleaning up the ino member of a loc_t results in SIGABRT
in __inode_link because, in some cases, the loc->ino is
different from loc->inode->ino. This happens especially in code
blocks which re-use a loc_t structure for pointing at different
inodes/files. For example, if a loc_t has been assigned an inode and
an ino, followed by a libgf_client_loc_wipe, then re-use of this
loc in, say, libgf_client_lookup results in SIGABRT because
libgf_client_lookup calls inode_link with the same loc_t. However,
this loc_t has just been assigned a new inode pointer while the ino
member still contains a previous inode's inode number. This difference
in inode numbers results in an assertion failure, hence the SIGABRT.
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
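A sketch of the hazard and the fix, assuming the loc_t fields of that era
(ino, inode, parent, path) and loc_wipe from xlator.h; the wrapper name is
illustrative, and the key point is that a reused loc must have loc->ino reset
together with loc->inode:

    #include "xlator.h"     /* loc_t, loc_wipe */

    /* Reusing one loc_t for several files: clear it completely between
     * uses, otherwise loc->ino still holds the previous file's inode
     * number while loc->inode already points at the new inode, and
     * inode_link()'s consistency check aborts (SIGABRT). */
    static void
    libgf_loc_reuse (loc_t *loc)
    {
            loc_wipe (loc);        /* unrefs inode/parent, frees path */
            loc->ino = 0;          /* make the stale ino impossible   */
            /* ... now fill loc for the next file and look it up ...  */
    }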
|
|
|
|
|
|
|
|
| |
Here I am only refining the entry-parsing code in order
to clarify the exit conditions of the loop. There were
a few workloads where this loop ran forever.
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit basically reverts the previous readdir conformance
patch I sent a few days back. That commit had a completely retarded
and broken way of maintaining the per-directory dirent.
It was broken for two reasons:
1. Creating a wrapper structure around the directory's fd_t
only for storing a struct dirent is not clean enough. This commit
takes a better approach by storing the dirent in the fd_t context.
This dirent is valid only if the fd_t refers to a directory.
2. That commit was made and tested under the assumption (..stupidity
is a better word..) that only the opendir call is used for opening a
directory. That is not correct. Directories are also opened using the
open syscall. The point is, glusterfs_open returns an fd_t and so did
glusterfs_opendir. The previous patch actually changed opendir to
return a new wrapper structure. That is fine if we go by the POSIX
definitions of open and opendir, because they're supposed to
return different types, an int and a DIR*. However, in
libglusterfsclient, all other code assumes that directory handles
corresponding to DIR* and file descriptors corresponding to int
are the same type, resulting in use of the same locking and fd-context
addition/extraction code. So a directory opened using opendir returned
a wrapper structure which went down into the libglusterfsclient stack,
where some function took a lock on the handle assuming it was an
fd_t; since it is not, dereferencing the supposed fd->inode->lock
resulted in a segfault.
Obviously, this didn't show up till unfs3 used open() to open a
directory and not opendir.
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
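A rough sketch of the fd-context approach described above, assuming the
fd_ctx_set/fd_ctx_get API (values stored as uint64_t) and an xlator handle
named this; the real libglusterfsclient code differs in detail:

    #include <stdint.h>
    #include <stdlib.h>
    #include <dirent.h>
    #include "xlator.h"
    #include "fd.h"        /* fd_ctx_set, fd_ctx_get */

    /* Keep the per-directory struct dirent in the fd_t's context so that
     * both open() and opendir() can hand out plain fd_t handles. */
    static struct dirent *
    libgf_dirent_for_fd (xlator_t *this, fd_t *fd)
    {
            uint64_t       ctx   = 0;
            struct dirent *entry = NULL;

            if (fd_ctx_get (fd, this, &ctx) == 0)
                    return (struct dirent *)(uintptr_t)ctx;

            entry = calloc (1, sizeof (*entry));
            if (entry)
                    fd_ctx_set (fd, this, (uint64_t)(uintptr_t)entry);

            return entry;
    }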
|
|
|
|
| |
build on solaris and other platforms
|
|
|
|
|
|
|
|
|
|
|
| |
readdir is supposed to be non-re-entrant only with respect to the
given dir stream, not the whole process.
What that means is the static struct dirent that we maintain in
libglusterfsclient should be per-directory handle and not
process-wide.
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As of now, we use 1024 bytes as the buffer for reading directory
entries. If a directory has many files, then it is possible that they
do not fit into this buffer, thereby requiring more than one call to
readdir. Now suppose the last bunch of entries fits more or less
exactly into the 1024-byte buffer. If this happens, the offset
extracted by the current logic (in libgf_client_readdir) never gets
updated beyond the first entry in this last block, because the last
block's first entry always remains the same. This explanation is
convoluted, I know, but I too found out the hard way.
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
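A sketch of the bookkeeping fix implied above, assuming the gf_dirent_t list
fields (d_off, list) from libglusterfs of that era and an illustrative helper:
the cursor must advance to the offset of the last entry consumed, not stay at
the first entry of the final block.

    #include <sys/types.h>
    #include "list.h"
    #include "gf-dirent.h"   /* gf_dirent_t */

    /* Walk the entries returned by one readdir fop and remember where to
     * resume; keying the resume point off the block's first entry stalls
     * forever once the final block exactly fills the 1024-byte buffer. */
    static off_t
    libgf_consume_entries (struct list_head *entries)
    {
            gf_dirent_t *entry    = NULL;
            off_t        last_off = 0;

            list_for_each_entry (entry, entries, list) {
                    /* ... copy entry into the caller-visible dirent ... */
                    last_off = entry->d_off;     /* advance, entry by entry */
            }
            return last_off;    /* next readdir fop starts after this */
    }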
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
| |
This commit does two things:
1. Changes glusterfs_readdir prototype to conform to the POSIX
readdir().
2. Uses a 1024-byte value instead of sizeof(struct dirent) for the
@size for libgf_client_readdir. This allows even larger names to fit
into a single readdir request to the server.
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
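For reference, the shape being conformed to, next to the libglusterfsclient
analogue; the handle typedef below is an assumption for illustration, not a
quote of the actual header:

    #include <dirent.h>

    /* POSIX:  struct dirent *readdir (DIR *dirp);
     * The library call now mirrors that shape instead of filling a
     * caller-supplied dirent (glusterfs_dir_t assumed opaque here). */
    typedef void *glusterfs_dir_t;
    struct dirent *glusterfs_readdir (glusterfs_dir_t dirfd);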
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|