| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Current invalidation of stats in wb_readdirp_cbk is prone to races. As
the deleted comment explains,
<snip>
We cannot guarantee integrity of entry->d_stat as there are cached
writes. The stat is most likely stale as it doesn't account the cached
writes. However, checking for non-empty liability list here is not a
fool-proof solution as there can be races like,
1. readdirp is successful on posix
2. sync of cached write is successful on posix
3. write-behind received sync response and removed the request from
liability queue
4. readdirp response is processed at write-behind.
In the above scenario, stat for the file is sent back in readdirp
response but it is stale.
</snip>
The fix is to mark readdirp sessions (tracked in this patch by
non-zero value of "readdirps" on parent inode) and if fulfill
completes when one or more readdirp sessions are in progress, mark the
inode so that wb_readdirp_cbk doesn't send iatts for that in inode in
readdirp response. Note that wb_readdirp_cbk already checks for
presence of a non-empty liability queue and invalidates iatt. Since
the only way a liability queue can shrink is by fulfilling requests in
liability queue, wb_fulfill_cbk indicates wb_readdirp_cbk that a
potential race could've happened b/w readdirp and fulfill.
Change-Id: I12d167bf450648baa64be1cbe1ca0fddf5379521
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
updates: bz#1512691
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 4d3c62e71f3250f10aa0344085a5ec2d45458d5c.
Traversing all children of a directory in wb_readdirp caused
significant performance regression. Hence reverting this patch
Change-Id: I6c3b6cee2dd2aca41d49fe55ecdc6262e7cc5f34
updates: bz#1512691
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
| |
Fixes CID: 1124360 1291740 1370918
Change-Id: I008c7ade8f9809d040f42f6d3e9af70fff2f3dc6
updates: bz#789278
Signed-off-by: Bhumika Goyal <bgoyal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
rename response contains a postbuf stat of src-inode. Since md-cache
caches stat in rename codepath too, we've to make sure stat accounts
any cached writes in write-behind. So, we make sure rename is resumed
only after any cached writes are committed to backend.
Change-Id: Ic9f2adf8edd0b58ebaf661f3a8d0ca086bc63111
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Updates: bz#1512691
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Current invalidation of stats in wb_readdirp_cbk is prone to races. As
the deleted comment explains,
<snip>
We cannot guarantee integrity of entry->d_stat as there are cached
writes. The stat is most likely stale as it doesn't account the cached
writes. However, checking for non-empty liability list here is not a
fool-proof solution as there can be races like,
1. readdirp is successful on posix
2. sync of cached write is successful on posix
3. write-behind received sync response and removed the request from
liability queue
4. readdirp response is processed at write-behind.
In the above scenario, stat for the file is sent back in readdirp
response but it is stale.
</snip>
Change-Id: I6ce170985cc6ce3df2382ec038dd5415beefded5
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Updates: bz#1512691
|
|
|
|
|
|
|
|
|
|
|
|
| |
Please review, it's not always just the comments that were fixed.
I've had to revert of course all calls to creat() that were changed
to create() ...
Only compile-tested!
Change-Id: I7d02e82d9766e272a7fd9cc68e51901d69e5aab5
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 9340b3c7a6c8556d6f1d4046de0dbd1946a64963.
operations/writes across different fds of the same file cannot be
considered as independent. For eg., man 2 fsync states,
<man 2 fsync>
fsync() transfers ("flushes") all modified in-core data of
(i.e., modified buffer cache pages for) the file referred to by the
file descriptor fd to the disk device
</man>
This means fsync is an operation on file and fd is just a way to reach
file. So, it has to sync writes done on other fds too. Patch
9340b3c7a6c, prevents this.
The problem fixed by patch 9340b3c7a6c - a flush on an fd is hung on a
failed write (held in cache for retrying) on a different fd - is
solved in this patch by making sure __wb_request_waiting_on considers
failed writes on any fd as dependent on flush/fsync on any fd (not
just the fd on which writes happened) opened on the same file. This
means failed writes on any fd are either synced or thrown away on
witnessing flush/fsync on any fd of the same file.
Change-Id: Iee748cebb6d2a5b32f9328aff2b5b7cbf6c52c05
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Updates: bz#1512691
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently the aggregate size is by default 128K (page size).
From performance perspective small number of large writes is faster
than large number of small writes, especially in EC volumes. But identifying
the right aggregate size depends on multiple factors like the memcpy overhead,
network overhead etc. On local machine, combining 128k writes to 1M writes for
EC volumes yielded 30% improvement.
As a part of this patch, aggregate size is just made configurable and page_size
is modified accordingly.
Raghavendra Gowdappa had suggested that, while aggregating writes we should get
rid of memcpy of large write size, and instead add the pointer to existinf vector,
will be doing it as a part of another patch. Also, in EC volumes, the vectors are
merged into one vector, so even if we save memcopy in write_behind, EC would anyways
do memcopy for merging vectors into one vector.
Updates: #364
Change-Id: Ib67294b8577bea14dde1c84cd271012ecea99f09
Signed-off-by: Poornima G <pgurusid@redhat.com>
|
|
|
|
|
|
|
| |
Updates #302
Change-Id: I2ade3dc4aef4da619c5b64058872594f0f1e004f
Signed-off-by: Csaba Henk <csaba@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The variabled "fulfilled" in wb_fulfill_short_write is not reset to 0
while handling every member of the list.
This has some interesting consequences:
* If we break from the loop while processing last member of the list
head->winds, req is reset to head as the list is a circular
one. However, head is already fulfilled and can potentially be
freed. So, we end up adding a freed request to wb_inode->todo
list. This is the RCA for the crash tracked by the bug associated
with this patch (Note that we saw "holder" which is freed in todo
list).
* If we break from the loop while processing any of the last but one
member of the list head->winds, req is set to next member in the
list, skipping the current request, even though it is not entirely
synced. This can lead to data corruption.
The fix is very simple and we've to change the code to make sure
"fulfilled" reflects whether the current request is fulfilled or not
and it doesn't carry history of previous requests in the list.
Change-Id: Ia3d6988175a51c9e08efdb521a7b7938b01f93c8
BUG: 1528558
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
| |
This patch creates a new way of defining message id's that is easier
and less error prone because it doesn't require so many manual changes
each time a new component is defined or a new message created.
Change-Id: I71ba8af9ac068f5add7e74f316a2478bc991c67b
Signed-off-by: Xavier Hernandez <jahernan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reduces the space used from four bytes to one, and allows
new code to use familiar C99 types/values interoperably with our
old cruft. It does *not* change current declarations or code;
that will be left for a separate - much larger - patch.
Updates: #80
Change-Id: I5baedd17d3fb05b38f0d8b8bb9dd62824475842e
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
write-behind xlator does not honor the client pid being
set. It doesn't pass down the client pid saved in
'frame->root->pid'. This patch fixes the same.
Change-Id: I838dcf43f56d6d0aa1d2c88811a2b271d9e88d05
BUG: 1430608
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: https://review.gluster.org/16854
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
and statedump too. Also "const char *" (versus just "char *") for the
fmt param.
Change-Id: Ic63734a673208a2cd49aebccce7659816e6179e3
BUG: 1399196
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: https://review.gluster.org/15881
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
statedump
Change-Id: Ia5dd718458a5e32138012f81f014d13fc6b28be2
BUG: 1415115
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: https://review.gluster.org/16440
Reviewed-by: N Balachandran <nbalacha@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since __wb_request_unref can remove the request from various lists,
calling it without holding wb_inode->lock results in corruptions when
other threads simultaneously try to access the lists this request is
part of.
Thanks to "Nithya Balachandran" <nbalacha@redhat.com> for pointing out
the bug.
Change-Id: I78fb6433c2e212500d07780f7b45c5a0e2bf9209
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: https://review.gluster.org/16464
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I2ea1350fcbe4b6c06dcb8093b28316f734cd3b48
BUG: 1379655
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/16285
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cached writes
Fops like readdirp, link, fallocate, discard, zerofill return iatt of
files in their responses. This iatt can be cached by md-cache. Hence
it is important that write-behind maintains relative ordering of these
fops with cached writes. Failure to do so, can result in md-cache
storing stale iatts and returning the same to applications.
Change-Id: Icfe12ad807e42fe9e52a9f63e47ce63f511c6946
BUG: 1390050
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/15757
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the issue is happened in this case:
assume a file is opened with fd1 and fd2.
1. some WRITE opto fd1 got error, they were add back to 'todo' queue
because of those error.
2. fd2 closed, a FLUSH op is send to write-behind.
3. FLUSH can not be unwind because it's not a legal waiter for those
failed write(as func __wb_request_waiting_on() say). and those failed
WRITE also can not be ended if fd1 is not closed. fd2 stuck in close
syscall.
to resolve this issue, we can change the way we determine 2 requests is
'conflict': flush/fsync is not conflict with those write that is not
belonged to them. so __wb_pick_winds() can wind the FLUSH op.
below is some information when the stuck issue happen:
glusterdump logs:
[xlator.performance.write-behind.wb_inode]
path=/ltp-F9eG0ZSOME/rw-buffered-16436
inode=0x7fdbe8039b9c
window_conf=1048576
window_current=249856
transit-size=0
dontsync=0
[.WRITE]
request-ptr=0x7fdbe8020200
refcount=1
wound=no
generation-number=4
req->op_ret=-1
req->op_errno=116
sync-attempts=3
sync-in-progress=no
size=131072
offset=1220608
lied=-1
append=0
fulfilled=0
go=0
[.WRITE]
request-ptr=0x7fdbe8068c30
refcount=1
wound=no
generation-number=5
req->op_ret=-1
req->op_errno=116
sync-attempts=2
sync-in-progress=no
size=118784
offset=1351680
lied=-1
append=0
fulfilled=0
go=0
[.FLUSH]
request-ptr=0x7fdbe8021cd0
refcount=1
wound=no
generation-number=6
req->op_ret=0
req->op_errno=0
sync-attempts=0
gdb detail about above 3 requests:
(gdb) print *((wb_request_t *)0x7fdbe8021cd0)
$2 = {all = {next = 0x7fdbe803a608, prev = 0x7fdbe8068c30}, todo = {next
= 0x7fdbe803a618, prev = 0x7fdbe8068c40}, lie = {next = 0x7fdbe8021cf0,
prev = 0x7fdbe8021cf0}, winds = {next = 0x7fdbe8021d00, prev =
0x7fdbe8021d00}, unwinds = {next = 0x7fdbe8021d10, prev =
0x7fdbe8021d10}, wip = {
next = 0x7fdbe8021d20, prev = 0x7fdbe8021d20}, stub =
0x7fdbe80224dc, write_size = 0, orig_size = 0, total_size = 0, op_ret =
0, op_errno = 0,
refcount = 1, wb_inode = 0x7fdbe803a5f0, fop = GF_FOP_FLUSH, lk_owner
= {len = 8, data = "W\322T\f\271\367y$", '\000' <repeats 1015 times>},
iobref = 0x0, gen = 6, fd = 0x7fdbe800f0dc, wind_count = 0, ordering =
{size = 0, off = 0, append = 0, tempted = 0, lied = 0, fulfilled = 0,
go = 0}}
(gdb) print *((wb_request_t *)0x7fdbe8020200)
$3 = {all = {next = 0x7fdbe8068c30, prev = 0x7fdbe803a608}, todo = {next
= 0x7fdbe8068c40, prev = 0x7fdbe803a618}, lie = {next = 0x7fdbe8068c50,
prev = 0x7fdbe803a628}, winds = {next = 0x7fdbe8020230, prev =
0x7fdbe8020230}, unwinds = {next = 0x7fdbe8020240, prev =
0x7fdbe8020240}, wip = {
next = 0x7fdbe8020250, prev = 0x7fdbe8020250}, stub =
0x7fdbe8062c3c, write_size = 131072, orig_size = 4096, total_size = 0,
op_ret = -1,
op_errno = 116, refcount = 1, wb_inode = 0x7fdbe803a5f0, fop =
GF_FOP_WRITE, lk_owner = {len = 8, data = '\000' <repeats 1023 times>},
iobref = 0x7fdbe80311a0, gen = 4, fd = 0x7fdbe805c89c, wind_count = 3,
ordering = {size = 131072, off = 1220608, append = 0, tempted = -1,
lied = -1, fulfilled = 0, go = 0}}
(gdb) print *((wb_request_t *)0x7fdbe8068c30)
$4 = {all = {next = 0x7fdbe8021cd0, prev = 0x7fdbe8020200}, todo = {next
= 0x7fdbe8021ce0, prev = 0x7fdbe8020210}, lie = {next = 0x7fdbe803a628,
prev = 0x7fdbe8020220}, winds = {next = 0x7fdbe8068c60, prev =
0x7fdbe8068c60}, unwinds = {next = 0x7fdbe8068c70, prev =
0x7fdbe8068c70}, wip = {
next = 0x7fdbe8068c80, prev = 0x7fdbe8068c80}, stub =
0x7fdbe806746c, write_size = 118784, orig_size = 4096, total_size = 0,
op_ret = -1,
op_errno = 116, refcount = 1, wb_inode = 0x7fdbe803a5f0, fop =
GF_FOP_WRITE, lk_owner = {len = 8, data = '\000' <repeats 1023 times>},
iobref = 0x7fdbe8052b10, gen = 5, fd = 0x7fdbe805c89c, wind_count = 2,
ordering = {size = 118784, off = 1351680, append = 0, tempted = -1,
lied = -1, fulfilled = 0, go = 0}}
you can see they are all on 'todo' queue, and FLUSH op fd is not the
same WRITE op fd.
Change-Id: Id687f9cd3b9f281e1a97c83f1ce981ede272b8ab
BUG: 1372211
Signed-off-by: Ryan Ding <ryan.ding@open-fs.com>
Reviewed-on: http://review.gluster.org/15380
Tested-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
wb_fulfill_request
Before this patch, a request is removed from liability queue only when
ref count of request hits 0. Though, wb_fulfill_request does an unref,
it need not be the last unref and hence the request may survive in
liability queue till the last unref. Let,
T1: the time at which wb_fulfill_request is invoked
T2: the time at which last unref is done on request
Let's consider a case of T2 > T1. In the time window between T1 and
T2, any other request (waiter) conflicting with request in liability
queue (blocker - basically a write which has been lied) is blocked
from winding. If T2 happens to be when wb_do_unwinds is invoked, no
further processing of request list happens and "waiter" would get
blocked forever. An example imaginary sequence of events is given
below:
1. A write request w1 is picked up for unwinding in __wb_pick_unwinds
(but unwind is not done _yet_ and hence reference
remains). However, w1 is moved to liability queue. Let's call this
invocation of wb_process_queue by wb_writev as PQ1.
2. A flush (f1) request hits write behind. Since the liability queue
of inode is not empty, f1 is not picked for unwinding. Let's call
the invocation of wb_process_queue by wb_flush as PQ2.
3. PQ2 continues and picks w1 for fulfilling and invokes
wb_fulfill. As part of successful wb_fulfill_cbk,
wb_fulfill_request (w1) is invoked. But, w1 is not freed (and hence
not removed from liability queue) as w1 is not unwound _yet_ and a
ref remains (PQ1 has not invoked wb_do_unwinds _yet_).
4. wb_fulfill_cbk (triggered by PQ2) invokes a wb_process_queue (let's
say PQ3). f1 is not resumed in PQ3 as w1 is still in liability
queue. At this time, PQ2 and PQ3 are complete.
5. PQ1 continues, unwinds w1 and does last unref on w1 and w1 is freed
(and removed from liability queue). Since PQ1 didn't invoke
wb_fulfill on any other write requests, there won't be any future
codepaths that would invoke wb_process_queue and f1 is stuck
forever.
With this fix, w1 is removed from liability queue in step 3 above and
PQ3 resumes f1 in step 4 (as there are no requests conflicting with f1
in liability queue during execution of PQ3).
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
BUG: 1379655
Change-Id: Idacda1fcd520ac27f30224f8dfe8360dba6ac6cb
Reviewed-on: http://review.gluster.org/15579
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
And minor cleanup of a few of the Makefile.am files while we're
at it.
Rewrite the make rules to do what xdrgen does. Now we can get rid
of xdrgen.
Note 1. netbsd6's sed doesn't do -i. Why are we still running
smoke tests on netbsd6 and not netbsd7? We barely support netbsd7
as it is.
Note 2. Why is/was libgfxdr.so (.../rpc/xdr/src/...) linked with
libglusterfs? A cut-and-paste mistake? It has no references to
symbols in libglusterfs.
Note3. "/#ifndef\|#define\|#endif/" (note the '\'s) is a _basic_
regex that matches the same lines as the _extended_ regex
"/#(ifndef|define|endif)/". To match the extended regex sed needs to
be run with -r on Linux; with -E on *BSD. However NetBSD's and
FreeBSD's sed helpfully also provide -r for compatibility. Using a
basic regex avoids having to use a kludge in order to run sed with
the correct option on OS X.
Note 4. Not copying the bit of xdrgen that inserts copyright/license
boilerplate. AFAIK it's silly to pretend that machine generated
files like these can be copyrighted or need license boilerplate.
The XDR source files have their own copyright and license; and
their copyrights are bound to be more up to date than old
boilerplate inserted by a script. From what I've seen of other
Open Source projects -- e.g. gcc and its C parser files generated
by yacc and lex -- IIRC they don't bother to add copyright/license
boilerplate to their generated files.
It appears that it's a long-standing feature of make (SysV, BSD,
gnu) for out-of-tree builds to helpfully pretend that the source
files it can find in the VPATH "exist" as if they are in the $cwd.
rpcgen doesn't work well in this situation and generates files
with "bad" #include directives.
E.g. if you `rpcgen ../../../../$srcdir/rpc/xdr/src/glusterfs3-xdr.x`,
you get an #include directive in the generated .c file like this:
...
#include "../../../../$srcdir/rpc/xdr/src/glusterfs3-xdr.h"
...
which (obviously) results in compile errors on out-of-tree build
because the (generated) header file doesn't exist at that location.
Compared to `rpcgen ./glusterfs3-xdr.x` where you get:
...
#include "glusterfs3-xdr.h"
...
Which is what we need. We have to resort to some Stupid Make Tricks
like the addition of various .PHONY targets to work around the VPATH
"help".
Warning: When doing an in-tree build, -I$(top_builddir)/rpc/xdr/...
looks exactly like -I$(top_srcdir)/rpc/xdr/... Don't be fooled though.
And don't delete the -I$(top_builddir)/rpc/xdr/... bits
Change-Id: Iba6ab96b2d0a17c5a7e9f92233993b318858b62e
BUG: 1330604
Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/14085
Tested-by: Niels de Vos <ndevos@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
up on failure case __inode_ctx_put need to free the allocated memory
Indirect leak of 104 byte(s) in 1 object(s) allocated from:
#0 0x496669 in __interceptor_calloc (/usr/local/sbin/glusterfsd+0x496669)
#1 0x7f8a288522f9 in __gf_calloc libglusterfs/src/mem-pool.c:117
#2 0x7f8a17235962 in __posix_acl_ctx_get xlators/system/posix-acl/src/posix-acl.c:308
Change-Id: I0ce6da3967c55931a70f77d8551ccf52e4cdfda3
BUG: 1338733
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Reviewed-on: http://review.gluster.org/14505
Tested-by: Prasanna Kumar Kalever <pkalever@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Don't mark the request with a fake EIO after a short write.
* retry the remaining buffer at least once before unwinding reply to
application. This way we capture correct error from backend (ENOSPC,
EDQUOT etc).
Thanks to "Vijaikumar Mallikarjuna"<vmallika@redhat.com> for the test
script.
Change-Id: I73a18b39b661a7424db1a7855a980469a51da8f9
BUG: 1292020
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/13438
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. while handling short writes, __wb_fulfill_short_write would've freed
current request before moving on to next in the list if request is not
big enough to accomodate completed number of bytes. So, make sure to
save next member before invoking __wb_fulfill_short_write on current
request. Also handle the case where request is exactly the size of
remaining completed number of bytes.
2. When write request is unwound because there is a conflicting failed
liability, make sure its deleted from tempted list. Otherwise there
will be two unwinds (one as part handling a failed request in
wb_do_winds and another in wb_do_unwinds).
Change-Id: Id1b54430796b19b956d24b8d99ab0cdd5140e4f6
BUG: 1297373
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/13213
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
short writes.
1. Imagine a write when cache is filled with failed syncs.
2. This write won't be unwound since cache size has exceeded
configured limit.
3. With trickling-writes on by default, the last write request wont be
considered for winding when there is non zero in-transit size.
4. There was a bug in accounting of in-transit size when winds
resulted in short writes. Due to this bug, in-transit size used to be
non-zero even when there are no syncs in progress.
5. Due to 3 and 4, current write request won't be wound till there is
another write or fsync or flush from application. But application
can't do any other fop till current write request is unwound. This
resulted in deadlock and hence application would be hung in 'D'
state.
This patch fixes bug in accounting of in-transit size during short
writes.
Change-Id: I04ce8bb510efaaed7623cac38d69b32dbc3730ce
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
BUG: 1279730
Reviewed-on: http://review.gluster.org/13177
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When compiled with -Werror flag gcc throws the following
error:
‘iov_length’ is static but used in inline
function ‘__wb_modify_write_request’ which is not static.
Let gcc decide what functions to inline and remove the inline
keyword.
Change-Id: I6d832596eefcf08306634936e11d2c8d4b8f9ccd
BUG: 1279730
Signed-off-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-on: http://review.gluster.org/13113
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Revisiting http://review.gluster.org/#/c/11814/, which unintentionally
introduced warnings from libtool about the xlator .so names.
According to [1], the -module option must appear in the Makefile.am
file(s); if -module is defined in a macro, e.g. in configure(.ac),
then libtool will not recognize that this is a module and will emit a
warning.
[1]
http://www.gnu.org/software/automake/manual/automake.html#Libtool-Modules
Change-Id: Ifa5f9327d18d139597791c305aa10cc4410fb078
BUG: 1248669
Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/13003
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: soumya k <skoduri@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. When sync fails, the cached-write is still preserved unless there
is a flush/fsync waiting on it.
2. When a sync fails and there is a flush/fsync waiting on the
cached-write, the cache is thrown away and no further retries will
be made. In other words flush/fsync act as barriers for all the
previous writes. The behaviour of fsync acting as a barrier is
controlled by an option (see below for details). All previous
writes are either successfully synced to backend or forgotten in
case of an error. Without such barrier fop (especially flush which
is issued prior to a close), we end up retrying for ever even after
fd is closed.
3. If a fop is waiting on cached-write and syncing to backend fails,
the waiting fop is failed.
4. sync failures when no fop is waiting are ignored and are not
propagated to application. For eg.,
a. first attempt of sync of a cached-write w1 fails
b. second attempt of sync of w1 succeeds
If there are no fops dependent on w1 are issued b/w a and b,
application won't know about failure encountered in a.
5. The effect of repeated sync failures is that, there will be no
cache for future writes and they cannot be written behind.
fsync as a barrier and resync of cached writes post fsync failure:
==================================================================
Whether to keep retrying failed syncs post fsync is controlled by an
option "resync-failed-syncs-after-fsync". By default, this option is
set to "off".
If sync of "cached-writes issued before fsync" (to backend) fails,
this option configures whether to retry syncing them after fsync or
forget them. If set to on, cached-writes are retried till a "flush"
fop (or a successful sync) on sync failures. fsync itself is failed
irrespective of the value of this option, when there is a sync failure
of any cached-writes issued before fsync.
Change-Id: I6097c9257bfb9ee5b15616fbe6a0576ae9af369a
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
BUG: 1279730
Reviewed-on: http://review.gluster.org/12594
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We've been lucky that we haven't had any symbol collisions until now.
Now we have a collision between the snapview-client's svc_lookup() and
libntirpc's svc_lookup() with nfs-ganesha's FSAL_GLUSTER and libgfapi.
As a short term solution all the snapview-client's FOP methods were
changed to static scope. See http://review.gluster.org/11805. This
works in snapview-client because all the FOP methods are defined in
a single source file. This solution doesn't work for other xlators
with FOP methods defined in multiple source files.
To address this we link with libtool's '-export-symbols $symbol-file'
(a wrapper around `ld --version-script ...` --- on linux anyway) and
only export the minimum required symbols from the xlator sharedlib.
N.B. the libtool man page says that the symbol file should be named
foo.sym, thus the rename of *.exports to *.sym. While foo.exports
worked, we will follow the documentation.
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
BUG: 1248669
Change-Id: I1de68b3e3be58ae690d8bfb2168bfc019983627c
Reviewed-on: http://review.gluster.org/11814
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: soumya k <skoduri@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix the build broken because of patch
http://review.gluster.org/#/c/9822/
Change-Id: I0ee502c0fad5be87186c80ab4729036f52f85fa3
BUG: 1194640
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/11451
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
logs to new logging framework.
Change-Id: Ie6aaf8d30bd4457bb73c48e23e6b1dea27598644
BUG: 1194640
Signed-off-by: arao <arao@redhat.com>
Reviewed-on: http://review.gluster.org/9822
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of including config.h in each file, and have the additional
config.h included from the compiler commandline (-include option).
When a .c file tests for a certain #define, and config.h was not
included, incorrect assumtions were made. With this change, it can not
happen again.
BUG: 1222319
Change-Id: I4f9097b8740b81ecfe8b218d52ca50361f74cb64
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/10808
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
git@forge.gluster.org:~schafdog/glusterfs-core/osx-glusterfs
Working functionality on MacOSX
- GlusterD (management daemon)
- GlusterCLI (management cli)
- GlusterFS FUSE (using OSXFUSE)
- GlusterNFS (without NLM - issues with rpc.statd)
Change-Id: I20193d3f8904388e47344e523b3787dbeab044ac
BUG: 1089172
Signed-off-by: Harshavardhana <harsha@harshavardhana.net>
Signed-off-by: Dennis Schafroth <dennis@schafroth.com>
Tested-by: Harshavardhana <harsha@harshavardhana.net>
Tested-by: Dennis Schafroth <dennis@schafroth.com>
Reviewed-on: http://review.gluster.org/7503
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A program that calls mmap() on a newly created sparse file, may receive
a SIGBUS signal. If SIGBUS is not handled, a segmentation fault will
occur and the program will exit.
A bug in the write-behind translator can cause the creation of a sparse
file created with open(), seek(), write() to be cached. The last write()
may not be sent to the server, until write-behind deems this necessary.
* open(.., O_TRUNC, ...)/creat() the file, it is 0 bytes big
* seek() into the file, use offset 31
* write() 1 byte to the file
* the range from byte 0-30 are unwritten so called 'sparse'
The following illustration tries to capture this:
Legend:
[ = start of file
_ = unallocated/unwritten bytes
# = allocated bytes in the file
] = end of file
[_______________#]
| |
'- byte 0 '- byte 31
Without this change, reading from byte 0-30 will return an error, and
reading the same area through an mmap()'d pointer will trigger a SIGBUS.
Reading from this range did not trigger the outstanding write() to be
flushed. The brick that receives the read() (translated over the network
from mmap()) does not know that the file has been extended, and returns
-EINVAL. This error gets transported back from the brick to the
glusterfs-fuse client, and translated by the Linux kernel/VFS into
SIGBUS triggered by mmap().
In order to solve this, a new attribute to the wb_inode structure is
introduced; the current size of the file. All FOPs that can modify the
size, are expected to update wb_inode->size. This makes it possible for
extending writes with an offset bigger than EOF to mark the unwritten
area as modified/pending.
Change-Id: If5ba6646732e6be26568541ea9b12852a5d0b988
BUG: 1058663
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/6835
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I941fc89b2d696c7f227330321ed4bba3ed1deac4
BUG: 789278
Signed-off-by: Poornima <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/6868
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
.. by UNWINDing ENOMEM error, rather than crashing.
Change-Id: Ica2d6399eaf7e381e7ebc41155620559c139c4d3
BUG: 1034398
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/6349
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
we find fd marked bad while trying to fulfill lies.
* flush was queued behind some unfulfilled write.
* A previously wound write returned an error and hence fd was marked
bad with corresponding error.
* wb_fulfill_head (invocation probably rooted in wb_flush), before
winding checks for failures of previous writes and since there was a
failure, calls wb_head_done without even winding one request in head.
* wb_head_done unrefs all the requests in list "head".
* since flush was last operation on fd (and most likely last operation
on inode itself), no one invokes wb_process_queue and flush is stuck
in request queue for eternity.
Change-Id: I3b5b114a1c401d477dd7ff64fb6119b43fda2d18
BUG: 988642
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/5398
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ib766403774c1323e0bbddafedeaa47e7fa3a59fa
Signed-off-by: Amar Tumballi <amarts@redhat.com>
BUG: 987415
Reviewed-on: http://review.gluster.org/5296
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When open() with O_DIRECT happens, write-behind was being disabled for the
fd irrespective of strict_O_DIRECT option. This commit disables write-behind
only when strict_O_DIRECT is enabled.
Change-Id: Ieef180e52910c3bf64d46b26b0e5dc3b8542f6d2
BUG: 923556
Signed-off-by: Vijay Bellur <vbellur@redhat.com>
Reviewed-on: http://review.gluster.org/4697
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fuse has support for optimized async. direct I/O handling via the
FUSE_ASYNC_DIO init flag. Enable FUSE_ASYNC_DIO when advertised
by fuse.
performance/write-behind: fix dio hang
Also fix a hang observed during aio-stress testing due to conflicting
request handling in write-behind. Overlapping requests are skipped
in pick_winds and may never continue when the conflicting write in
progress returns. Add a wb_process_queue() call after a non-wb request
completes to keep the queue moving.
BUG: 963258
Change-Id: Ifba6e8aba7a7790b288a32067706b75f263105d4
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.org/5014
Reviewed-by: Anand Avati <avati@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Maintain a list of writes (either written behind or SYNC) which are
currently "in progress" (i.e, STACK_WIND'ed towards server) and hold
off any new STACK_WIND of write (either written behind or SYNC) which
overlaps with any of the "in progress" writes.
This is a guarantee which AFR's eager-lock depends upon (though not
strictly a write-behind requirement)
Change-Id: Icedd0b51b440366a906dc9223d62b7fd6ef2ca03
BUG: 857673
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/4551
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <raghavendra@gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- re-structure members of call_stub_t with new simpler layout
- easier to inspect call_stub_t contents in gdb now
- fix a bunch of double unrefs and double frees in cbk stub
- change all STACK_UNWIND to STACK_UNWIND_STRICT and thereby fixed
a lot of bad params
- implement new API call_unwind_error() which can even be called on
fop_XXX_stub(), and not necessarily fop_XXX_cbk_stub()
Change-Id: Idf979f14d46256af0afb9658915cc79de157b2d7
BUG: 846240
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/4520
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I515fc26c61e1ea929a3049b3001c58a64f3e6c87
BUG: 765473
Signed-off-by: Raghavendra G <raghavendra@gluster.com>
Reviewed-on: http://review.gluster.org/4515
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I1c9541058c7d07786539a3266ca125a6a15287d8
BUG: 859835
Signed-off-by: Anand Avati <avati@redhat.com>
Original-author: Kacper Kowalik (Xarthisius) <xarthisius.kk@gmail.com>
Signed-off-by: Kacper Kowalik (Xarthisius) <xarthisius.kk@gmail.com>
Reviewed-on: http://review.gluster.org/3967
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
| |
LOCK attempt in wb_forget is unnecessary
Change-Id: Ibdedc23d0c829c34aedd6fc5bc0e0a584b832514
BUG: 903566
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/4423
Reviewed-by: Anand Avati <avati@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I0db00b7334bb9707ab48bd661ac03a3ad818d6e4
BUG: 893458
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/4393
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use of an unsigned type in some calculations of size would lead to segmentation
faults, if several large adjacent writes came in concurrently.
Also, improves buffer allocation code to take the size required into account.
Credits for the patch go to Amar.
Change-Id: I8a09c52d49909e4ee8e7d4dcfa02ec33ea36a551
BUG: 880948
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/4307
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* write-behind: free the inode context in wb_forget
* distribute: in readdirp callback put the allocated context to the inode
* distribute: check if the layout is NULL before accessing it in layout_unref
Change-Id: I7698f81b85b99d06bf6b01fc1a6e51e1593b5e27
BUG: 790709
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/4250
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
| |
off_t is 'long int' (signed word) and therefore ULLONG_MAX - 1
Change-Id: I027de7a1b2ca24865d5d787f9986930e97911ca4
BUG: 857673
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/4079
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I690e8bf650d6e6e50899c2e17a79f42789e701eb
BUG: 843792
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/4036
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|