| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
For allowing parallel writes we shouldn't depend on ia_size to be same for
all the bricks in each write_cbk(). But we need to make sure backend size
is correct on all the bricks and no crashes/manual modifications happened.
Fix:
At the time of get_size_version() we do 1 check to make sure size of the file
is same across the bricks. From then on the FOPs will give the status of the
fop, so we rely on this information to keep which bricks are good/bad.
Updates #251
Change-Id: I1df645347e2e9f2e09cfa4411b6cc305d7f4e4e5
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: https://review.gluster.org/17741
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
4 + 2 EC volume configuration.
If untar of linux is going on and we kill a brick,
indices will be created for the files/dir which need
to be healed. ec_shd_index_sweep spawns threads to
scan these entries and start heal. If in the middle
of this we kill one more brick, we end up in a
situation where we can not heal an entry as there
are only "ec->fragment" number of bricks are UP.
However, the scan will be continued and it will
trigger the heal for those entries.
Solution:
When a heal is triggered for an entry, check if it
*CAN* be healed or not. If not come out with ENOTCONN.
Change-Id: I305be7701c289f36bd7bde22491b71074771424f
BUG: 1464359
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: https://review.gluster.org/17692
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Sunil Kumar Acharya <sheggodu@redhat.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a SEEK_HOLE was issued near to the end of file, sometimes an
offset beyond the end of file was returned. Another problem was that
using some offsets greater than the end of file returned successfully
instead of failing with ENXIO.
Change-Id: I238d2884ba02fd19a78116b0f8f8e8d6338fb3f5
BUG: 1449348
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: https://review.gluster.org/17228
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
"gluster v heal <volname> info" is taking
long time to respond when a brick is down.
RCA:
Heal info command does virtual mount.
EC wait for 10 seconds, before sending UP call to upper xlator,
to get notification (DOWN or UP) from all the bricks.
Currently, we are increasing ec->xl_notify_count based on
the current status of the brick. So, if a DOWN event notification
has come and brick is already down, we are not increasing
ec->xl_notify_count in ec_handle_down.
Solution:
Handle DOWN even as notification irrespective of what
is the current status of brick.
Change-Id: I0acac0db7ec7622d4c0584692e88ad52f45a910f
BUG: 1464091
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: https://review.gluster.org/17606
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
The change in EC to return list of node uuids for
GF_XATTR_NODE_UUID_KEY was causing problems with
geo-rep.
Fix:
This patch will allow to get the single node uuid
as it was doing before with the key
"GF_XATTR_NODE_UUID_KEY", and will also allow to get
the list of node uuids by using a new key
"GF_XATTR_LIST_NODE_UUIDS_KEY". This will solve
the problem with geo-rep and any other features which
were depending on this.
BUG: 1462790
Change-Id: I2d9214a9658d4a41a3d6de08600884d2bda5f3eb
Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
Reviewed-on: https://review.gluster.org/17594
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
When application sends a blocking lock, the lk fop actually waits under
inodelk. This can lead to a dead-lock.
1) Let's say app-1 takes exculsive-fcntl-lock on the file
2) app-2 attempts an exclusive-fcntl-lock on the file which goes to blocking
stage note: app-2 is blocked inside transaction which holds an inode-lock
3) app-1 tries to perform write which needs inode-lock so it gets blocked on
app-2 to unlock inodelk and app-2 is blocked on app-1 to unlock fcntl-lock
Fix:
Correct way to fix this issue and make fcntl locks perform well would be to
introduce
2-phase locking for fcntl lock:
1) Implement a try-lock phase where locks xlator will not merge lk call with
existing calls until a commit-lock phase.
2) If in try-lock phase we get quorum number of success without any EAGAIN
error, then send a commit-lock which will merge locks.
3) In case there are any errors, unlock should just delete the lock-object
which was tried earlier and shouldn't touch the committed locks.
Unfortunately this is a sizeable feature and need to be thought through for any
corner cases. Until then remove transaction from lk call.
BUG: 1455049
Change-Id: I18a782903ba0eb43f1e6526fb0cf8c626c460159
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: https://review.gluster.org/17542
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem-1 : Recursive healing of same file is happening
when IO is going on even after data heal completes.
Solution:
RCA: At the end of the write, when ec_update_size_version
gets called, we send it only on good bricks and not
on healing brick. Due to this, xattr on healing brick
will always remain out of sync and when the background
heal check source and sink, it finds this brick to be
healed and start healing from scratch. That involve
ftruncate and writing all of the data again.
To solve this, send xattrop on all the good bricks as
well as healing bricks.
Problem-2: The above fix exposes the data corruption
during heal. If the write on a file is going on and
heal finishes, we find that the file gets corrupted.
RCA:
The real problem happens in ec_rebuild_data(). Here we receive the
'size' argument which contains the real file size at the time of
starting self-heal and it's assigned to heal->total_size.
After that, a sequence of calls to ec_sync_heal_block() are done. Each
call ends up calling ec_manager_heal_block(), which does the actual work
of healing a block.
First a lock on the inode is taken in state EC_STATE_INIT using
ec_heal_inodelk(). When the lock is acquired, ec_heal_lock_cbk() is
called. This function calls ec_set_inode_size() to store the real size
of the inode (it uses heal->total_size).
The next step is to read the block to be healed. This is done using a
regular ec_readv(). One of the things this call does is to trim the
returned size if the file is smaller than the requested size.
In our case, when we read the last block of a file whose size was = 512
mod 1024 at the time of starting self-heal, ec_readv() will return only
the first 512 bytes, not the whole 1024 bytes.
This isn't a problem since the following ec_writev() sent from the heal
code only attempts to write the amount of data read, so it shouldn't
modify the remaining 512 bytes.
However ec_writev() also checks the file size. If we are writing the
last block of the file (determined by the size stored on the inode that
we have set to heal->total_size), any data beyond the (imposed) end of
file will be cleared with 0's. This causes the 512 bytes after the
heal->total_size to be cleared. Since the file was written after heal
started, the these bytes contained data, so the block written to the
damaged brick will be incorrect.
Solution:
Align heal->total_size to a multiple of the stripe size.
Thanks "Xavier Hernandez" <xhernandez@datalab.es>
to find out the root cause and to fix the issue.
Change-Id: I6c9f37b3ff9dd7f5dc1858ad6f9845c05b4e204e
BUG: 1428673
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: https://review.gluster.org/16985
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fixes for various minor spelling errors and typos
Reported-by: Patrick Matthäi <pmatthaei@debian.org>
Change-Id: Ic1be36f82e3d822bbdc9559878bd79520fc0fcd5
BUG: 1457808
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: https://review.gluster.org/17442
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
FALLOCATE file operations is not implemented in the
existing EC code. This change set implements it
for EC.
BUG: 1448293
Change-Id: Id9ed914db984c327c16878a5b2304a0ea461b623
Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
Reviewed-on: https://review.gluster.org/15200
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
EC was retuning the UUID of the brick with smaller value. This had
the side effect of not evenly balancing the load between bricks on
rebalance operations.
This patch modifies the common functions that combine multiple subvolume
values into a single result to take into account the subvolume order
and, optionally, other subvolumes that could be damaged.
This makes easier to add future features where brick order is important.
It also makes possible to easily identify the originating brick of each
answer, in case some brick will have an special meaning in the future.
Change-Id: Iee0a4da710b41224a6dc8e13fa8dcddb36c73a2f
BUG: 1366817
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: https://review.gluster.org/17297
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A bad check in the answer of a seek request caused a segmentation
fault when seek reported an error.
Change-Id: Ifb25ae8bf7cc4019d46171c431f7b09b376960e8
BUG: 1439068
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: https://review.gluster.org/16998
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix implements the heal window size option for
EC. This option control the maximum size of
read/write operation carried out in self-heal
process.
BUG: 1441491
Change-Id: I6c0ef65c9ca18b0828f91b319d4f52ac5b77d0d8
Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
Reviewed-on: https://review.gluster.org/17098
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In ec_child_select, we should send fop on healing bricks
unconditionaly but to check the number of healthy bricks
against fragments and minimum count, we should not count
these healing bricks.
Count bits of fop->mask before adding ealing brick to
fop->mask
Change-Id: I3fa80bdd5ca34ca070d610116b84154b917c5999
BUG: 1439527
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: https://review.gluster.org/17007
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Note: Even though gcc(1) will automatically treat ffs() and popcount()
as built-in, calling them explicitly as __builtin in the source helps
make it easier to find them. (And if no __builtin_ffs use the one in
libc.)
Change-Id: Ib74d9b221ff03a01df5ad05907024da1a83a7a88
BUG: 1438772
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: https://review.gluster.org/16993
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Debian builds detected spelling issues with GlusterFS 3.10.1. Instead of
carrying the patch in the Debian sources, let's include the fixes here
too.
Change-Id: I38db6adf142f7ec247bffd47aa1e6ff1a0c49e00
Reported-by: Patrick Matthäi <pmatthaei@debian.org>
BUG: 1437853
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: https://review.gluster.org/16973
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During meatadata heal, we were not updating the version
though all the inode attributes were in sync.
Updated the code to adjust version when all the inode
attributes are in sync.
BUG: 1425703
Change-Id: I6723be3c5f748b286d4efdaf3c71e9d2087c7235
Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
Reviewed-on: https://review.gluster.org/16772
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We wanted to mark dirty for metadata/entry operations
whenever query-info is set and info is not yet there because we
are anyway sending xattrop over the network. But this is causing
25% regression from 3.8.8 so removing this optimization
Also fixed two small issues that we didn't find in the previous
patch
1) reconfigure failure was sending return value 0 for optimistic-changelog
2) ec->optimistic_changelog was set to true even before OPTION_INIT
BUG: 1408809
Change-Id: Iabb0b64bd4d3623688790e4b67e5c20b4da977a1
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: https://review.gluster.org/16865
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Fix to https://bugzilla.redhat.com/show_bug.cgi?id=1316873 has made
changes to set dirty flag before every update fop, data or metadata, and unset
it after successful operation. That makes some of the fops very slow such as
entry operations or metadata operations.
Solution: File data operations are the only operation which take some time and
setting dirty flag before a fop and unsetting it after serves the purpose as
probability of failure of a fop is high when the time duration is more. For all
the other operations, set dirty flag at the end of the fop, if any brick is
down and need heal.
Providing following option to choose between high performance or better heal
marking for metadata and entry fops.
Set/Unset dirty flag for every update fop at the start of the fop. If ON, this
option impacts performance of entry operations or metadata operations as it
will set dirty flag at the start and unset it at the end of ALL update fop. If
OFF and all the bricks are good, dirty flag will be set at the start only for
file fops For metadata and entry fops dirty flag will not be set at the start,
if all the bricks are good. This does not impact performance for metadata
operations and entry operation but has a very small window to miss marking
entry as dirty in case it is required to be healed.
Thanks to Xavi and Ashish for the design
Picked the .t file from Ashish' patch https://review.gluster.org/16298
BUG: 1408809
Change-Id: I3ce860063f0e2901e50754dcfc3e4ed22daf819f
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: https://review.gluster.org/16821
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Tested-by: Xavier Hernandez <xhernandez@datalab.es>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem-1
If Lookup which doesn't take any locks observes version mismatch it can't be
trusted. If we launch a heal based on this information it will lead to
self-heals which will affect I/O performance in the cases where Lookup is
wrong. Considering self-heal-daemon and operations on the inode from client
which take locks can still trigger heal we can choose to not attempt a heal on
Lookup.
Problem-2:
Fixed spurious failure of
tests/bitrot/bug-1373520.t
For the issues above, what was happening was that ec_heal_inspect()
is preventing 'name' heal to happen
Problem-3:
tests/basic/ec/ec-background-heals.t
To be honest I don't know what the problem was, while fixing
the 2 problems above, I made some changes to ec_heal_inspect() and
ec_need_heal() after which when I tried to recreate the spurious
failure it just didn't happen even after a long time.
BUG: 1414287
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Change-Id: Ife2535e1d0b267712973673f6d474e288f3c6834
Reviewed-on: https://review.gluster.org/16468
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Ashish Pandey <aspandey@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When dynamic code generation fails for some reason, instead of causing
a failure in encode/decode, fallback to the precompiled version.
Change-Id: I4f8a97d3033aa5885779722b19c6e611caa4ffea
BUG: 1421955
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: https://review.gluster.org/16614
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On FreeBSD the S_ISVTX flag is completely ignored when creating a
regular file. Since gluster needs to create files with this flag set,
specialy for DHT link files, it's necessary to force the flag.
This fix does this by calling fchmod() after creating a file that
must have this flag set.
Change-Id: I51eecfe4642974df6106b9084a0b144835a4997a
BUG: 1411228
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: https://review.gluster.org/16417
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Heal failed or passed should not be logged as info. These can be
observed from heal info if the heal is happening or not. If we require
to debug a case where heal is not happening, we can set the level to
DEBUG.
Change-Id: I062668eadd145ef809b25e818e6bca1094f54cd6
BUG: 1420619
Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
Reviewed-on: https://review.gluster.org/16580
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Ashish Pandey <aspandey@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
EC uses mmap() to create a memory area for the dynamic code. Since
the code is created on the fly and executed when needed, this region
of memory needs to have write and execution privileges.
This combination is not allowed by default by selinux. To solve the
problem a file is used as a backend storage for the dynamic code and
it's mapped into two distinct memory regions, one with write access
and the other one with execution access. This approach is the
recommended way to create dynamic code by a program in a more secure
way, and selinux allows it.
Additionally selinux requires that the backend file be stored in a
directory marked with type bin_t to be able to map it in an executable
area. To satisfy this condition, GLUSTERFS_LIBEXECDIR has been used.
This fix also changes the error check for mmap(), that was done
incorrectly (it checked against NULL instead of MAP_FAILED), and it
also correctly propagates the error codes and makes sure they aren't
silently ignored.
Change-Id: I71c2f88be4e4d795b6cfff96ab3799c362c54291
BUG: 1402661
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: https://review.gluster.org/16405
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for multiple brick translator stacks running
in a single brick server process. This reduces our per-brick memory usage by
approximately 3x, and our appetite for TCP ports even more. It also creates
potential to avoid process/thread thrashing, and to improve QoS by scheduling
more carefully across the bricks, but realizing that potential will require
further work.
Multiplexing is controlled by the "cluster.brick-multiplex" global option. By
default it's off, and bricks are started in separate processes as before. If
multiplexing is enabled, then *compatible* bricks (mostly those with the same
transport options) will be started in the same process.
Change-Id: I45059454e51d6f4cbb29a4953359c09a408695cb
BUG: 1385758
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: https://review.gluster.org/14763
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since EC already winds one write after other there is no need to align
application fcntl locks with ec blocks. Also added this locking to be
done as a transaction to prevent partial upgrade/downgrade of locks
happening.
BUG: 1410425
Change-Id: I7ce8955c2174f62b11e5cb16140e30ff0f7c4c31
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: https://review.gluster.org/16445
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Heal failed or passed should not be logged as warning.
These can be observed from heal info if the heal is
happening or not. If we require to debug a case where
heal is not happening, we can set the level to DEBUG.
Change-Id: I347665c8c8b6223bb08a9f3dd5643a10ddc3b93e
BUG: 1417050
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: https://review.gluster.org/16473
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Write on a file has been slowed down significantly after
http://review.gluster.org/#/c/13733/
RC : When update fop starts on a file, it sets dirty flag at
the start and remove it at the end which make an index entry
in indices/xattrop. During IO, SHD scans this and finds out
an index and starts heal even if all the fragments are healthy
and up tp date. This heal takes inodelk for different types of
heal. If the IO is for long time this will happen in every 60 seconds.
Due to this extra, unneccessary locking, IO gets slowed down.
Solution:
Before starting any type of heal check if file needs heal or not.
Change-Id: Ib9519a43e7e4b2565d3f3153f9ca0fb92174fe51
BUG: 1409191
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/16377
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Operation failed messages are getting logged
based on the callbacks of lockless fop's. If a fop does
not take a lock, it is possible that it will get some
out of sync xattr, iatts. We can not depend on these
callback to psay that the fop has failed.
Solution: Print failed messages only for locked fops.
However, heal would still be triggered.
Change-Id: I4427402c8c944c23f16073613caa03ea788bead3
BUG: 1414287
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/16435
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Updating the warning message with details to improve
user understanding.
BUG: 1409202
Change-Id: I001f8d5c01c97fff1e4e1a3a84b62e17c025c520
Signed-off-by: Sunil Kumar H G <sheggodu@redhat.com>
Reviewed-on: http://review.gluster.org/16315
Tested-by: Sunil Kumar Acharya
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In link fop lookup is happening on the new fop which doesn't exist so the iatt
ec serves parent xlators has size as zero which leads to 'cat' giving empty output
Fix:
Change code so that lookup happens on the existing link instead.
BUG: 1409730
Change-Id: I70eb02fe0633e61d1d110575589cc2dbe5235d76
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/16320
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Tested-by: Xavier Hernandez <xhernandez@datalab.es>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Issue:
In disperse volume, the file is present across bricks, hence the stat
from one brick doesn't carry the valid size of the file. Therefore
the upcall from one brick updating the md-cache results in wrong size
being updated.
Fix:
If the notification is cache invalidation then, indicate md-cache that
the attributes is invalid.
BUG: 1410375
Change-Id: Id89d2283478e70b62b435a8891fffc86d2be8cb2
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/16329
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Rename does two locks. There is a case where when it tries to unlock it sends
xattrop of the directory with new version, callback of these two xattrops can
be picked up by two separate epoll threads. Both of them will try to set the
lk-owner for unlock in parallel on the same frame so one of these unlocks will
fail because the lk-owner doesn't match.
Fix:
Specify the lk-owner which will be set on inodelk frame which will not be over
written by any other thread/operation.
BUG: 1402710
Change-Id: I666ffc931440dc5253d72df666efe0ef1d73f99a
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/16074
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: ec_writev_start calls ec_make_internal_fop_xdata
to set "yes" in xdata before ec_readv (an internal fop)
is called for head and tail. Second call to this function
is overwriting the previous allocated dict_t to "xdata",
which results in memory leak.
Solution: In ec_make_internal_fop_xdata, check if *xdata
is NULL or not to avoid overwriting *xdata.
Change-Id: I49b83923e11aff9b92d002e86424c0c2e1f5f74f
BUG: 1400818
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/16007
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In a disperse volume with "K+R" configuration, where
"K" is the number of data bricks and "R" is the number of redundancy
bricks (Total number of bricks, N = K+R), if only K bricks are UP,
we should NOT start heal process. This is because the bricks, which
are supposed to be healed, are not UP. This will unnecessary
eat up the resources.
Solution: Check for the number of xl_up_count and only
if it is greater than ec->fragments (number of data bricks),
start heal process.
Change-Id: I8579f39cfb47b65ff0f76e623b048bd67b15473b
BUG: 1399072
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/15937
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently these are few events related to child_up/down:
GF_EVENT_CHILD_UP : Issued when any of the protocol client
connects.
GF_EVENT_CHILD_MODIFIED : Issued by afr/dht/ec
GF_EVENT_CHILD_DOWN : Issued when any of the protocol client
disconnects.
These events get modified at the dht/afr/ec layers. Here is a
brief on the same.
DHT:
- All the subvolumes reported once, and atleast one child came
up, then GF_EVENT_CHILD_UP is issued
- connect GF_EVENT_CHILD_UP is issued
- disconnect GF_EVENT_CHILD_MODIFIED is issued
- All the subvolumes disconnected, GF_EVENT_CHILD_DOWN is issued
AFR:
- First subvolume came up, then GF_EVENT_CHILD_UP is issued
- Subsequent subvolumes coming up, results in GF_EVENT_CHILD_MODIFIED
- Any of the subvolumes go down, then GF_EVENT_SOME_CHILD_DOWN is issued
- Last up subvolume goes down, then GF_EVENT_CHILD_DOWN is issued
Until the patch [1] introduced GF_EVENT_SOME_CHILD_UP,
GF_EVENT_CHILD_MODIFIED was issued by afr/dht when any of the subvolumes
go up or down.
Now with md-cache changes, there is a necessity to differentiate between
child up and down. Hence, introducing GF_EVENT_SOME_DESCENDENT_DOWN/UP and
getting rid of GF_EVENT_CHILD_MODIFIED.
[1] http://review.gluster.org/12573
Change-Id: I704140b6598f7ec705493251d2dbc4191c965a58
BUG: 1396038
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/15764
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Thanks a lot to xiaoping.wu@nokia.com from Nokia for the bug and the
fix.
BUG: 1384297
Change-Id: Ie443237e85d34633b5dd30f85eaa2ac34e45754c
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/15728
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Currently heal info command prints all
the files/directories if the index for the
file/directory is present in .glusterfs/indices folder.
After implementing patch http://review.gluster.org/#/c/13733/
indices of the file which is going through update fop
will also be present in .glusterfs/indices even
if the fop is successful on all the brick. At this time
if heal info command is being used, it will also display this
file which is actually healthy and does not require any heal.
Solution: Take lock on a file corresponding to the indices
and inspect xattrs to decide if the file needs heal or not.
Change-Id: I6361e2813ece369be12d02e74816df4eddb81cfa
BUG: 1366815
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/15543
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ipc will be wound to all the bricks, but for it to be
successfull, the fop should succeed on minimum number of bricks.
Change-Id: I3f8cb6a349e87bafd0773583def9d4e3765aa140
BUG: 1211863
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/15387
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Ashish Pandey <aspandey@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
And minor cleanup of a few of the Makefile.am files while we're
at it.
Rewrite the make rules to do what xdrgen does. Now we can get rid
of xdrgen.
Note 1. netbsd6's sed doesn't do -i. Why are we still running
smoke tests on netbsd6 and not netbsd7? We barely support netbsd7
as it is.
Note 2. Why is/was libgfxdr.so (.../rpc/xdr/src/...) linked with
libglusterfs? A cut-and-paste mistake? It has no references to
symbols in libglusterfs.
Note3. "/#ifndef\|#define\|#endif/" (note the '\'s) is a _basic_
regex that matches the same lines as the _extended_ regex
"/#(ifndef|define|endif)/". To match the extended regex sed needs to
be run with -r on Linux; with -E on *BSD. However NetBSD's and
FreeBSD's sed helpfully also provide -r for compatibility. Using a
basic regex avoids having to use a kludge in order to run sed with
the correct option on OS X.
Note 4. Not copying the bit of xdrgen that inserts copyright/license
boilerplate. AFAIK it's silly to pretend that machine generated
files like these can be copyrighted or need license boilerplate.
The XDR source files have their own copyright and license; and
their copyrights are bound to be more up to date than old
boilerplate inserted by a script. From what I've seen of other
Open Source projects -- e.g. gcc and its C parser files generated
by yacc and lex -- IIRC they don't bother to add copyright/license
boilerplate to their generated files.
It appears that it's a long-standing feature of make (SysV, BSD,
gnu) for out-of-tree builds to helpfully pretend that the source
files it can find in the VPATH "exist" as if they are in the $cwd.
rpcgen doesn't work well in this situation and generates files
with "bad" #include directives.
E.g. if you `rpcgen ../../../../$srcdir/rpc/xdr/src/glusterfs3-xdr.x`,
you get an #include directive in the generated .c file like this:
...
#include "../../../../$srcdir/rpc/xdr/src/glusterfs3-xdr.h"
...
which (obviously) results in compile errors on out-of-tree build
because the (generated) header file doesn't exist at that location.
Compared to `rpcgen ./glusterfs3-xdr.x` where you get:
...
#include "glusterfs3-xdr.h"
...
Which is what we need. We have to resort to some Stupid Make Tricks
like the addition of various .PHONY targets to work around the VPATH
"help".
Warning: When doing an in-tree build, -I$(top_builddir)/rpc/xdr/...
looks exactly like -I$(top_srcdir)/rpc/xdr/... Don't be fooled though.
And don't delete the -I$(top_builddir)/rpc/xdr/... bits
Change-Id: Iba6ab96b2d0a17c5a7e9f92233993b318858b62e
BUG: 1330604
Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/14085
Tested-by: Niels de Vos <ndevos@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
http://review.gluster.org/14085 fixes a "pragma leak" where the
generated rpc/xdr headers have a pair of pragmas that disable these
warnings. With the warnings disabled, many unused variables have
crept into the code base.
And 14085 won't pass its own smoke test until all these warnings are
fixed.
BUG: 1369124
Change-Id: I24607fc2082c3424f876f740a88fb7d0173d322d
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/15518
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, for all the update operations, metadata or data,
we set the dirty flag at the end of the operation only if
a brick is down. This leads to delay in healing and in some
cases not at all.
In this patch we set (+1) the dirty flag
at the start of the metadata or data update operations and
after successfull completion of the fop, we unset (-1) it again.
Change-Id: Ide5668bdec7b937a61c5c840cdc79a967598e1e9
BUG: 1316873
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/13733
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements functionalities for fast encoding/decoding
using hardware support. Currently optimized x86_64, SSE and AVX is
added.
Additionally this patch implements a caching mecanism for inverse
matrices to reduce computation time, as well as a new method for
computing the inverse that takes quadratic time instead of cubic.
Finally some unnecessary memory copies have been eliminated to
further increase performance.
Change-Id: I26c75f26fb4201bd22b51335448ea4357235065a
BUG: 1289922
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/12837
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch will generates events in following
cases which will be consumed by new event
framework.
Consider an EC volume with (K+M) configuration
K = Data bricks
M = Redundancy bricks
1- EVENT_EC_MIN_BRICKS_NOT_UP -
When minimum "K" number of bricks, required
for any ec fop, are not up.
2- EVENT_EC_MIN_BRICKS_UP
When minimum "K" number of bricks, required
for any ec fop, are up.
Change-Id: I0414b8968c39740a171e5aa14b087afd524d574f
BUG: 1371470
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/15348
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In some cases we see that readdir keeps winding to the brick that doesn't have
any blocked locks i.e. first brick. This is leading to the client assuming that
there are no blocking locks on the inode so it won't give away the lock. Other
clients end up blocked on the lock as if the command hung.
Fix:
Proper way to fix this issue is to use infra present in
http://review.gluster.org/14736 This is a stop gap fix where we start taking
inodelks in opendir which goes to all the bricks, this will detect if there is
any contention.
BUG: 1346719
Change-Id: I91109107a26f6535b945ac476338e9f21dc31eb9
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/15309
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Ashish Pandey <aspandey@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
BUG: 1368451
Change-Id: I5d6b91d714ad6906dc478a401e614115c89a8fbb
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/15083
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Ashish Pandey <aspandey@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cyclic order
When the bricks are brought offline and then online in cyclic
order while writes are in progress on a file, thanks to inode
refresh in write txns, AFR will mostly fail the write attempt
when the only good copy is offline. However, there is still a
remote possibility that the file will run into split-brain if
the brick that has the lone good copy goes offline *after* the
inode refresh but *before* the write txn completes (I call it
in-flight split-brain in the patch for ease of reference),
requiring intervention from admin to resolve the split-brain
before the IO can resume normally on the file. To get around this,
the patch does the following things:
i) retains the dirty xattrs on the file
ii) avoids marking the last of the good copies as bad (or accused)
in case it is the one to go down during the course of a write.
iii) fails that particular write with the appropriate errno.
This way, we still have one good copy left despite the split-brain situation
which when it is back online, will be chosen as source to do the heal.
Change-Id: I9ca634b026ac830b172bac076437cc3bf1ae7d8a
BUG: 1363721
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/15080
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Oleksandr Natalenko <oleksandr@natalenko.name>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
...and remove their definitons from EC and AFR.
Also `s/alloca+memset0/alloca0` wherever it is used.
Change-Id: I3b71e596d12a7d8900f5d761af6b98305c8874d5
BUG: 1366226
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/15147
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: This issue arises when we do a rolling update
from 3.7.5 to 3.7.9.
For 4+2 volume running 3.7.5, if we update 2 nodes
and after heal completion kill 2 older nodes, this
problem can be seen. After update and killing of
bricks, 2 nodes will return inodelk count key in dict
while other 2 nodes will not have inodelk count in dict.
This is also true for get-link-count.
During dictionary match , ec_dict_compare, this will
lead to mismatch of answers and the file operation
on mount point will fail with IO error.
Solution:
Don't match inode, entry and link count keys while
comparing two dictionaries. However, while combining the
data in ec_dict_combine, go through all the dictionaries
and select the maximum values received in different dicts
for these keys.
Change-Id: I33546e3619fe8f909286ee48fb0df2009cd3d22f
BUG: 1347686
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/14761
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Icebe1b865edb317685e93f3ef11d98fd9b2c2e9a
BUG: 1357226
Signed-off-by: Zhou Zhengping <johnzzpcrystal@gmail.com>
Reviewed-on: http://review.gluster.org/14936
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Thanks to Rafi for hinting a while back that this kind of
problem he saw once. I didn't think the theory was valid.
Could have caught it earlier if I had tested his theory.
Change-Id: Iac6ffcdba2950aa6f8cf94f8994adeed6e6a9c9b
BUG: 1344836
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/14703
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: mohammed rafi kc <rkavunga@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|