| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are three kinds of inline functions: plain inline, extern inline,
and static inline. All three have been removed from .c files, except
those in "contrib" which aren't our problem. Inlines in .h files, which
are overwhelmingly "static inline" already, have generally been left
alone. Over time we should be able to "lower" these into .c files, but
that has to be done in a case-by-case fashion requiring more manual
effort. This part was easy to do automatically without (as far as I can
tell) any ill effect.
In the process, several pieces of dead code were flagged by the
compiler, and were removed.
Change-Id: I56a5e614735c9e0a6ee420dab949eac22e25c155
BUG: 1245331
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: http://review.gluster.org/11769
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently ec only sends a single read request at a time for a given
inode. Since reads do not interfere between them, this patch allows
multiple concurrent read requests to be sent in parallel.
Change-Id: If853430482a71767823f39ea70ff89797019d46b
BUG: 1245689
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/11742
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
be same.
Problem:
After replacing the brick using "replace-brick" command and running "heal
full", the version of the root directory of the newly added brick is not
getting healed. heal starts running on the dentries of the root but does not
run on root directory.
Solution:
Run heal on root directory.
Change-Id: Ifd42a3fb341b049c895817e892e5b484a5aa6f80
BUG: 1243382
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/11676
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- fd_unref should decrement fd->inode->fd_count only if it is present in the
inode's fd list.
- successful open/opendir should perform fd_bind.
Change-Id: I81dd04f330e2fee86369a6dc7147af44f3d49169
BUG: 1207735
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11044
Reviewed-by: Anoop C S <anoopcs@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Self-heal was always using a fixed block size to heal a file. This
was incorrect for dispersed volumes with a number of data bricks not
being a power of 2.
This patch adjusts the block size to a multiple of the stripe size
of the volume. It also propagates errors detected during the data
heal to stop healing the file and not mark it as healed.
Change-Id: I9ee3fde98a9e5d6116fd096ceef88686fd1d28e2
BUG: 1251446
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/11862
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The bitmask of good and bad bricks was kept in the context of the
corresponding inode or fd. This was problematic when an external
process (another client or the self-heal process) did heal the
bricks but no one changed the bitmaks of other clients.
This patch removes the bitmask stored in the context and calculates
which bricks are healthy after locking them and doing the initial
xattrop. After that, it's updated using the result of each fop.
Change-Id: I225e31cd219a12af4ca58871d8a4bb6f742b223c
BUG: 1236065
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/11844
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I82e245615419c2006a2d1b5e94ff0908d2f5e891
BUG: 1245276
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/11741
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
dict_set_bin() is handling the pointer that it passed inconsistently.
Depending on the errors that can occur, the pointer passed to the dict
can be free'd, but there is no guarantee.
It is cleaner to have the caller free the pointer that allocated it and
dict_set_bin() returned an error. When dict_set_bin() returned success,
the given pointer will be free'd when dict_unref() calls data_destroy().
Many callers of dict_set_bin() already take care of free'ing the pointer
on error. The ones that did not, are corrected with this change too.
Change-Id: I39a4f7ebc0cae6d403baba99307d7ce408f25966
BUG: 1242280
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/11638
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
New lock could come at the time timer is on the way to unlock. This was leading
to crash in timer thread because thread executing new lock can free up the
timer_link->fop and then timer thread will try to access structures already
freed.
Fix:
If the timer event is fired, set lock->release to true and wait for unlock to
complete.
Thanks to Xavi and Bhaskar for helping in confirming that this race is the RC.
Thanks to Kritika for pointing out and explaining how Avati's patch can be used
to fix this bug.
Change-Id: I45fa5470bbc1f03b5f3d133e26d1e0ab24303378
BUG: 1243187
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11670
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Also remove internal-fop setting in create/mknod etc xattrs.
Rebalance was failing because ec was giving EIO when lock acquiring fails as
the file/dir doesn't exist. Posix_create/mknod are not setting config xattr
because internal-fop key is present in dict and setxattr for this fails leading
to failure in setting rest of xattrs.
Change-Id: Ifb429c8db9df7cd51e4f8ce53fdf1e1b975c9993
BUG: 1242254
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11639
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
| |
BUG: 1232172
Change-Id: I3a56e487840d86147dd85bf5fbe79b165eae289f
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11589
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- On lock reuse preserve 'healing' bits
- Don't set ctx->size outside locks in healing code
- Allow xattrop internal fops also on the fop->mask.
Change-Id: I6b76da5d7ebe367d8f3552cbf9fd18e556f2a171
BUG: 1232678
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11640
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
for disperse volume.
Problem : User can set wrong values for cluster.heal-timeout
using cli. Any value like string, negative numbers
or 0 could be set.
Solution : Check the value given by user. It should be numerical,
non negative and within the range.
Change-Id: I5184ef1a11bb2c225f42ac9ccb1ba680a86cfe09
BUG: 1239037
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/11573
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
|
| |
BUG: 1232678
Change-Id: I35503039e4723cf7f33d6797f0ba90dd0aca130b
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11580
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With readdir[p] taking locks to figure out which bricks are
good/bad, no need to take any locks on opendir.
BUG: 1232172
Change-Id: I4d924aeeaecab23af08c4598548a20d2a44cd849
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11506
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In ec_lock() there is a chance that ec_resume is called on fop even before
ec_sleep. This can result in refs == 0 for fop leading to use after free in
this function when it calls ec_sleep so do ec_sleep at start and ec_resume at
end of this function.
Change-Id: I879b2667bf71eaa56be1b53b5bdc91b7bb56c650
BUG: 1240284
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11558
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ic22813371faca4e8198c9b0b20518e68d275f3c1
BUG: 1232678
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11531
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ib0de34c86ee25de361ec821d4015296c514742e0
BUG: 1240210
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11546
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Provide options to control number of active background heal count and qlen.
Change-Id: Idc2419219d881f47e7d2e9bbc1dcdd999b372033
BUG: 1237381
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11473
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- 8 parallel heals can happen.
- 128 heals will wait for their turn
- Heals will be rejected if 128 heals are already waiting.
Change-Id: I2e99bf064db7bce71838ed9901a59ffd565ac390
BUG: 1237381
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11471
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I99d7a038f29cebe823e17a8dda40d335441185bc
BUG: 1237381
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11472
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When quota_deem_statfs is enabled, quota sends aggregated statfs values
In EC we should not multiply statfs values with fragment number
Change-Id: I7ef8ea1598d84b86ba5c5941a2bbe0a6ab43c101
BUG: 1233162
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/11315
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ia05ae750a245a37d48978e5f37b52f4fb0507a8c
BUG: 1194640
Signed-off-by: Nandaja Varma <nandaja.varma@gmail.com>
Reviewed-on: http://review.gluster.org/10465
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem : trusted.ec.config attr was missing for the healed file
Solution: Writing trusted.ec.config while healing a file.
Change-Id: I340dd45ff8ab5bc1cd6e9b0cd2b2ded236e5acf0
BUG: 1235246
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/11407
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I1e629a6adc803c4b7164a5a7a81ee5cb1d0e139c
BUG: 1232172
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11246
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GF_CONTENT_KEY aggregation requires that the fragments on the bricks belong to
same data i.e. no operations are modifying the content while lookup is
performed on it. The only way to know it is to get at least ec->fragments+1
number of responses and see that two different sets of ec->fragments number of
fragments give same data. But at the moment we feel that this slows down
ec-lookup. So removing handling of this for now.
Change-Id: I2da5087f1311d5cdde999062607b143b48c17713
BUG: 1226279
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11003
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem : While launching heal, it shows heal launch
was unsuccessful. However, internaly it was successfully
launched.
Solution : Don't reset op_ret to -1 in for loop for
every brick.
Change-Id: Iff89fdaf6082767ed67523a56430a9e83e6984d3
BUG: 1203089
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/11267
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In very rare circumstances it was possible that a subfop started
by another fop could finish fast enough to cause that two or more
instances of the same state machine be executing at the same time.
Change-Id: I319924a18bd3f88115e751a66f8f4560435e0e0e
BUG: 1233258
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/11317
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I3059f3b577f550c92fb77c6b6b44defd0584cd2e
BUG: 1230647
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11178
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
While files are being created if more than redundancy number of bricks
go down, then unlock for these fops do not go to the bricks. This will
lead to stale locks leading to hangs.
Fix:
Wind unlock fops at all costs.
Change-Id: I50a87e8b4d6d2dde5bf7405b82e3aeecd95ad00e
BUG: 1220348
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11152
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
1) ec_access/ec_readlink_/ec_readdir[p] _cbks are trying to recover only from
ENOTCONN.
2) When the fop succeeds it unwinds right away. But when its
ec_fop_manager resumes, if the number of bricks that are up is less than
ec->fragments, the the state machine will resume with -EC_STATE_REPORT which
unwinds again. This will lead to crashes.
Fix:
- If fop fails retry on other subvols, as ESTALE/ENOENT/EBADFD etc are also
recoverable.
- unwind success/failure in _cbks
Change-Id: I2cac3c2f9669a4e6160f1ff4abc39f0299303222
BUG: 1228952
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11111
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I264e47ca679d8b57cd8c80306c07514e826f92d8
BUG: 1193388
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/10784
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
ec_update_size_version expects all the keys it did xattrop with to come in
response so that it can set the values again in ec_update_size_version_done.
But EC_XATTR_DIRTY is not combined so the value won't be present in the
response. So ctx->post/pre_dirty are not updated in
ec_update_size_version_done. So these values are still non-zero. When
ec_unlock_now is called as part of flush's unlock phase it again tries to
perform same xattrop for EC_XATTR_DIRTY. But ec_update_size_version is not
expected to be called in unlock phase of flush because ec_flush_size_version
should have reset everything to zero and unlock is never invoked from
ec_update_size_version_done for flush/fsync/fsyncdir. This leads to stale lock
which leads to hang.
Fix:
EC_XATTR_DIRTY is removed in ex_xattrop_cbk and is never combined with other
answers. So remove handling of this in the response.
Change-Id: If0ea3efec3235a6e312465d8838585fbe752c7ea
BUG: 1227654
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11078
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A previous patch (http://review.gluster.org/10974) introduced a
bug that caused that some metadata differences could not be
detected in some circumstances. This could cause that self-heal
is not triggered and the file not repaired.
We also need to consider all differences for lookup requests, even
if there isn't any lock. Special handling of differences in lookup
is already done in lookup specific code.
Change-Id: I3766b0f412b3201ae8a04664349578713572edc6
BUG: 1225793
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/11018
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When ec combines iatt structures from multiple bricks, it checks
for equality in important fields. This is ok for iatt related to
inodes involved in the operation that have been locked before
starting execution. However some fops return iatt information
from other inodes. For example a rename locks source and destination
parent directories, but it also returns an iatt from the entry
itself.
In these cases we ignore differences in some fields to avoid false
detection of inconsistencies and trigger unnecessary self-heals.
Another issue is solved in this patch that caused that the real
size of the file stored into the inode context was lost during
self-heal.
Change-Id: I8b8eca30b2a6c39c7b9bbd3b3b6ba95228fcc041
BUG: 1225793
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/10974
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: NetBSD Build System
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a file is under migration, any xattrs created on it
are lost post migration of the file. This is because
the xattrs are set only on the cached subvol of the source
and as the source is under migration, it becomes a linkto file
post migration.
Change-Id: Ib8e233b519cf954e7723c6e26b38fa8f9b8c85c0
BUG: 1193636
Signed-off-by: Nithya Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/10212
Tested-by: NetBSD Build System
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
EC uses an eager lock mechanism to optimize multiple read/write
requests on the same entry or inode. This increases performance
but can have adverse results when other clients try to access the
same entry/inode.
To solve this, this patch adds a functionality to detect when this
happens and force an earlier release to not block other clients.
The method consists on requesting GF_GLUSTERFS_INODELK_COUNT and
GF_GLUSTERFS_ENTRYLK_COUNT for all fops that take a lock. When this
count is greater than one, the lock is marked to be released. All
fops already waiting for this lock will be executed normally before
releasing the lock, but new requests that also require it will be
blocked and restarted after the lock has been released and reacquired
again.
Another problem was that some operations did correctly lock the
parent of an entry when needed, but got the size and version xattrs
from the entry instead of the parent.
This patch solves this problem by binding all queries of size and
version to each lock and replacing all entrylk calls by inodelk ones
to remove concurrent updates on directory metadata. This also allows
rename to correctly update source and destination directories.
Change-Id: I2df0b22bc6f407d49f3cbf0733b0720015bacfbd
BUG: 1165041
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/10852
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ec_heal creates ec_fop_data but doesn't run ec_manager. ec_fop_data_allocate
adds this fop to ec->pending_fops, because ec_manager is not run on this heal
fop it is never removed from ec->pending_fops. When it is accessed after free
it leads to crash. It is better to not to add HEAL fops to ec->pending_fops
because we don't want graph switch to hang the mount because of a BIG
file/directory heal.
BUG: 1188145
Change-Id: I8abdc92f06e0563192300ca4abca3909efcca9c3
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/10868
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a delayed lock is pending, a graph switch doesn't correctly
terminate it. This means that the update of version and size xattrs
is lost, causing EIO errors.
This patch handles GF_EVENT_PARENT_DOWN event to correctly finish
pending udpdates before completing the graph switch.
Change-Id: I394f3b8d41df8d83cdd36636aeb62330f30a66d5
BUG: 1188145
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/10787
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ia1834ec23d5de615526d4d4e4d2e32aff155b7f7
BUG: 1211962
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/10806
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a blocking lock is requested, lock request is succeeded even when
ec->fragment number of locks are acquired successfully in non-blocking locking
phase. This will lead to fop succeeding only on the bricks where the locks are
acquired, leading to the necessity of self-heals. To prevent these un-necessary
self-heals, if the remaining locks fail with EAGAIN in non-blocking lock phase
try blocking locking phase instead.
Change-Id: I940969e39acc620ccde2a876546cea77f7e130b6
BUG: 1221145
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/10770
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a file does not exist on a brick but it does on others, there
could be problems trying to access it because there was some loc_t
structures with null 'pargfid' but 'name' was set. This forced
inode resolution based on <pargfid>/name instead of <gfid> which
would be the correct one. To solve this problem, 'name' is always
set to NULL when 'pargfid' is not present.
Another problem was caused by an incorrect management of errors
while doing incremental locking. The only allowed error during an
incremental locking was ENOTCONN, but missing files on a brick can
be returned as ESTALE. This caused an EIO on the operation.
This patch doesn't care of errors during an incremental locking. At
the end of the operation it will check if there are enough successfully
locked bricks to continue or not.
Change-Id: I9360ebf8d819d219cea2d173c09bd37679a6f15a
BUG: 1176062
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/9407
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- With this change, the xattr will represent if the file needs to be healed or
not. It will have different values for data/entry and metadata changes.
- inode ref leaks and dict_set_dynstr related leaks fixed
- Added support for trylock/lock based on heal-cmd execution or not
in data heal.
- Made fixes to pass regression runs
Change-Id: I9d8def4c2badde18a76b7898816fecfac113737a
BUG: 1215265
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/10385
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Data self-heal:
1) Take inode lock in domain 'this->name:self-heal' on 0-0 range (full file),
So that no other processes try to do self-heal at the same time.
2) Take inode lock in domain 'this->name' on 0-0 range (full file),
3) perform fxattrop+fstat and get the xattrs on all the bricks
3) Choose the brick with ec->fragment number of same version as source
4) Truncate sinks
5) Unlock lock taken in 2)
5) For each block take full file lock, Read from sources write to the sinks, Unlock
6) Take full file lock and see if the file is still sane copy i.e. File didn't become unusable while the bricks are offline.
Update mtime to before healing
7) xattrop with -ve values of 'dirty' and difference of highest and its own
version values for version xattr
8) unlock lock acquired in 6)
9) unlock lock acquired in 1)
Change-Id: I6f4d42cd5423c767262c9d7bb5ca7767adb3e5fd
BUG: 1215265
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/10384
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Metadata self-heal:
1) Take inode lock in domain 'this->name' on 0-0 range (full file)
2) perform lookup and get the xattrs on all the bricks
3) Choose the brick with highest version as source
4) Setattr uid/gid/permissions
5) removexattr stale xattrs
6) Setxattr existing/new xattrs
7) xattrop with -ve values of 'dirty' and difference of highest and its own
version values for version xattr
8) unlock lock acquired in 1)
Entry self-heal:
1) take directory lock in domain 'this->name:self-heal' on 'NULL' to prevent
more than one self-heal
2) we take directory lock in domain 'this->name' on 'NULL'
3) Perform lookup on version, dirty and remember the values
4) unlock lock acquired in 2)
5) readdir on all the bricks and trigger name heals
6) xattrop with -ve values of 'dirty' and difference of highest and its own
version values for version xattr
7) unlock lock acquired in 1)
Name heal:
1) Take 'name' lock in 'this->name' on 'NULL'
2) Perform lookup on 'name' and get stat and xattr structures
3) Build gfid_db where for each gfid we know what subvolumes/bricks have
a file with 'name'
4) Delete all the stale files i.e. the file does not exist on more than
ec->redundancy number of bricks
5) On all the subvolumes/bricks with missing entry create 'name' with same
type,gfid,permissions etc.
6) Unlock lock acquired in 1)
Known limitation: At the moment with present design, it conservatively
preserves the 'name' in case it can not decide whether to delete it. this can
happen in the following scenario:
1) we have 3=2+1 (bricks: A, B, C) ec volume and 1 brick is down (Lets say A)
2) rename d1/f1 -> d2/f2 is performed but the rename is successful only on one
of the bricks (Lets say B)
3) Now name self-heal on d1 and d2 would re-create the file on both d1 and d2
resulting in d1/f1 and d2/f2.
Because we wanted to prevent data loss in the case above, the following
scenario is not healable, i.e. it needs manual intervention:
1) we have 3=2+1 (bricks: A, B, C) ec volume and 1 brick is down (Lets say A)
2) We have two hard links: d1/a, d2/b and another file d3/c even before the
brick went down
3) rename d3/c -> d2/b is performed
4) Now name self-heal on d2/b doesn't heal because d2/b with older gfid will
not be deleted. One could think why not delete the link if there is
more than 1 hardlink, but that leads to similar data loss issue I described
earlier:
Scenario:
1) we have 3=2+1 (bricks: A, B, C) ec volume and 1 brick is down (Lets say A)
2) We have two hard links: d1/a, d2/b
3) rename d1/a -> d3/c, d2/b -> d4/d is performed and both the operations are
successful only on one of the bricks (Lets say B)
4) Now name self-heal on the 'names' above which can happen in parallel can
decide to delete the file thinking it has 2 links but after all the
self-heals do unlinks we are left with data loss.
Change-Id: I3a68218a47bb726bd684604efea63cf11cfd11be
BUG: 1213358
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/10298
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adding 64 bits in "version" key of extended attributes. First 64 bits (Left)
represents Data version. Last 64 bits (right) represents Meta Data version.
Note: 3.7 and 3.6 version ec can't co-exist with this change because xattrop in
3.6 will fail with ERANGE as the buffer passed to it will be '8' bytes where as
the value will be 16 bytes in 3.7. Where as 3.7 version clients can work with
old version files. For upgrades we need to tell users to complete heals and
then upgrade
BUG: 1215265
Change-Id: Ib85114680cb7e75b8371c984d9f7b6401c1ffb93
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/10312
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If both dicts are NULL then equal. If one of the dicts is NULL but the other
has only ignorable keys then also they are equal. If both dicts are non-null
then check if for each non-ignorable key, values are same or not. value_ignore
function is used to skip comparing values for the keys which must be present in
both the dictionaries but the value could be different.
geo-rep's stime xattr doesn't need to be present in list xattr but when
getxattr comes on stime xattr even if there aren't enough responses with the
xattr we should still give out an answer which is maximum of the stimes
available.
Change-Id: I8de2ceaa2db785b797f302f585d88e73b154167d
BUG: 1207712
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/10078
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ia7d43cb3b222db34ecb0e35424f1766715ed8e6a
BUG: 1188242
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/10176
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was leading to hangs when get_size_and_version fails
Change-Id: Iad9408c2dacc9a74594b8d2f94c95f402533b0f1
BUG: 1215265
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/10390
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the version numbers do not match, then writes are performed only on at least
N-R bricks which have same version. But if we want to do healing of files which
are constantly modified we need to allow writes on subvols that are undergoing
heal. Data healing will mark 62nd bit while the heal is going on. When the data
transaction sees that this bit is set it needs to perform the fop on that
subvol irrespective of whether the versions match or do not match. Fop is
considered successful only if N-R non-healing bricks succeed.
Change-Id: I69a17582df397aaf6e8ca4b5e746c7ca802cbbde
BUG: 1215265
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/10372
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|