| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
| |
- thanks to Ioannis Aslanidis <iaslanidis@flumotion.com> for reporting.
- breakup the server_connection_cleanup into smaller procedures.
- do following operations in a single atomic operation.
1. conn->active_transports--
2. collecting pointer to lock table and all fds if there are no active transports
this will avoid any race conditions.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
|
| |
We can avoid memory allocation, de-allocation and
data copies by just using the entries passed to us from
a lower layer and by de-linking the entries from the original
list.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
These checks are needed in case a higher layer intends to
delink the dirent list and passes a NULL pointer to
fop_readdir_cbk_stub for the entries parameter.
Consequently, the gf_dirent_free must guard against an empty list
because the stub that is passed to it mgiht have an empty
dirent list.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Earlier it was thought that only not having 'opensm' running will cause
handshake errors in ib-verbs.
Recently understood that even having a wrong 'ib-verbs.port' option can
also cause the same behavior, and it took more than 5-6 e-mail iterations
with the user and lot of brain cycle in support team to understand the
problem. Made the log message more descriptive, so user can be find the
cause, or can send us email without wasting time.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. A page will be put on the inode waitq if the 'freshness' has to be verified with an fstat()
2. while the fstat is in transit, other calls (like lookup) can update ioc_inode->tv, resetting the freshness (page still on inode waitq)
3. Another read request on the same page, after the updated freshness, will wake up the page frames neglecting the fact that the page is also waiting on the inode (waiting for the fstat completion)
4. once the page's frames are woken, the page becomes elegible for purging and can get destroyed for various reasons, leaving a destroyed page pointer in the inode's waitq
5. fstat returns and hits the destroyed page pointer causing a crash
The fix is to all together disable cache hits when any page of the same inode is under validation. The otherwise cache hit will now be subjected to the ongoing validation by getting queued to the inode waitq.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Thanks to Krishna <krishna (at) gluster.com> for pointing this out.
When a unify self-heal of large directory (directory with lot of entries)
is done, the getdents_cbk used to fail because of new limit of buffer size
(128KB). Noticed that earlier it used to streach upto 4MB, hence the value
1024 worked fine. By reducing it to 512, noticed, we can fit in well within
128KB limit, and hence unify self-heal goes through.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This improves the potential for pre-fetching a larger
number of dirents. Consider that, with 255 chars as the max
name length for each dirent, in the worst case scenario, where
we actually have files with such large names, we're not getting
more than 4 entries with the current block size of 1024.
Generally also, increasing the size to 4k provides us
with a higher chance that directories with low to medium
number of dirents will be pre-fetched in a single readdir fop.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The fop interface is such that we're able to extract more than
1 dirent in a readdir fop. This commit now enables libglusterfsclient
to read multiple entries on a glusterfs_readdir call. Once these
have been pre-fetched, they're cached till either glusterfs_closedir
,glusterfs_rewinddir or glusterfs_seekdir are called.
The current implementation is beneficial for sequential directory
reading and probably indifferent to applications that do a lot of seekdir
and rewinddir after opening the directory. This is because
both these calls result in dirent cache invalidation.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
|
| |
In order to expose the timeout values for stat and inode
caching, this commit introduces a new fstab option "attr_timeout"
that defines the number of seconds for which a looked up inode
or a stat()'ed structure is valid in the cache.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
|
| |
There is a mechanism for caching the inode numbers got from a lookup
and a struct stat got from a stat or fstat but I wasnt sure if it worked.
This commit simplifies cache updates and checks and the accompanying
tests have made sure that the cache does work.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Handle two cases when deciding log/fstab file:
1. It turns out that that strdup or strlen doesnt actually
check for NULL before trying to do its thing with the string
so it seg-faults on seeing a NULL char pointer.
2. getenv can return an empty string if the
env var was exported as:
$ export GLUSTEFS_BOOSTER_LOG=
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
When multiple threads try to create a glusterfs context using the
glusterfs_init function, those threads end up using the global
vairables in the vol file parser in an non-synchronized manner,
resulting in a seg-fault.
There is now a big lock around searches and additions from the mount
table in do_open. This lock granularity could be reduced.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
| |
This can happen when 'option export-statfs-size off' is given in
posix volume. Caused divide by 0 error.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
| |
..and hope for a chance to improve performance on
high speed links like 10GigE.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
| |
A mismatch in the size of the used buffer, between reading and then further writing caused an infinite loop and big files(1Mb, 10Mb etc) could not be downloaded through the lighttpd web service using mod_glusterfs. This is because the big file which is broken up into chunks, has a read and a subsequent write.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
| |
but since introduction of IObuf it is no more. Now iobuf has to be unref'ed instead.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
| |
THIS setting
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
| |
xlator CTX: macro to access glusterfs global context (glusterfs_ctx_t)
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
| |
This is another attempt at fixing build problems on Solaris.
I am told that booster build is disabled on Solaris and I know
that it is disabled on Mac OS X also. Getting it to work
on both these systems is now on my TODO list, mainly
because on both these systems, we can have a glusterfs client
running without requiring FUSE.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It seems that use of mutexes is resulting in pretty high thread
sleep and wake-up cost. What is worse, if a worker thread has
acquired a lock, there is a possibility of the main glusterfs thread
being put to sleep. We change the use of mutexes into spinlock.
At the same time, we cannot anymore use condvars for notification since
the condvar interface depends on mutexes itself. Semaphores come to
out rescue. Luckily, even the pthread semaphores have a timedwait
interface to allow our idle worker threads to make an exit decision.
Further, it is possible that spinlocks are not available on all systems
so all this is curtained behind #defines so we can fall back to
mutexes and condvars implementation.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
|
| |
We've had complaints from users who've used
autoscaling option with default settings for min and
max threads, about high memory consumption because of
the large default value for max-threads.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit brings in support for allocation of iot_request_t's
in io-threads through the use of the mem-pool. We're hoping
that the overheads of hundreds and thousands of small allocations
can be avoided through this.
The important point to note is that the memory pool is not
for the translator as a whole but there is one small memory
pool for each worker thread. Not only does that help us
avoid malloc overheads for small allocations like iot_request_t
but also avoid contention on the heap data structures when multiple
threads want an iot_request_t from the pool.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
| |
When mandatory locks are enabled and a read/write
would block due to a lock and if the fd is opened
with O_NONBLOCK, return EAGAIN (previously EWOULDBLOCK).
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit changes mem-pool behaviour to return a directly usable
address by performing the required adjustment on the address
being returned.
This is different from the previous behaviour where we're trying to fit
into the requested size, the list_head*2 also. This is not efficient
enough in terms of space but hopefully works better than not having any
mem-pool at all. Besides, I am not comfortable with mem-pool meta-data
and caller-useable memory area being the same because of the potential for
mem-pool's data structure corruption.
PS:
Please do read the comments in the code for more info during review.
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|