| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
Change-Id: I17b7f6864dd8c1f89500a4bb89f1c249835e68da
BUG: 1188184
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/9974
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
libgfdb is only used by processes that run on Gluster servers. There is
no need to have this library (and its sqlite dependency) on any system
that installs a glusterfs package.
BUG: 1194753
Change-Id: I6b7fdc70210ef2a847efd596a0c995f0a4cf5d9b
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9983
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com>
Reviewed-by: Lalatendu Mohanty <lmohanty@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Coverity CID: 1256175
Change-Id: Ib29fc2eaa54a7ce8369918e68bf117d0f04ca94d
BUG: 789278
Signed-off-by: Nandaja Varma <nandaja.varma@gmail.com>
Reviewed-on: http://review.gluster.org/9679
Reviewed-by: Vijaikumar Mallikarjuna <vmallika@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ibad640d01975906b7642c76a1649e3e272f3a8bc
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9712
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Scrubber performs signature verification for objects that were
signed by signer. This is done by recalculating the signature
(using the hash algorithm the object was signed with) and
verifying it aginst the objects persisted signature. Since the
object could be undergoing IO opretaion at the time of hash
calculation, the signature may not match objects persisted
signature. Bitrot stub provides additional information about
the stalesness of an objects signature (determinted by it's
versioning mechanism). This additional bit of information is
used by scrubber to determine the staleness of the signature,
and in such cases the object is skipped verification (although
signature staleness is performed twice: once before initiation
of hash calculation and another after it (an object could be
modified after staleness checks).
The implmentation is a part of the bitrot xlator (signer) which
acts as a signer or scrubber based on a translator option. As
of now the scrub process is ever running (but has some form of
weak throttling mechanism during filesystem scan). Going forward,
there needs to be some form of scrub scheduling and IO throttling
(during hash calculation) tunables (via CLI).
Change-Id: I665ce90208f6074b98c5a1dd841ce776627cc6f9
BUG: 1170075
Original-Author: Raghavendra Bhat <rabhat@redhat.com>
Original-Author: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9914
Tested-by: Vijay Bellur <vbellur@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the "Signer" -- responsible for signing files with their
checksums upon last file descriptor close (last release()).
The event notification facility provided by the changelog xlator
is made use of.
Moreover, checksums are as of now SHA256 hash of the object data
and is the only available hash at this point of time. Therefore,
there is no special "what hash to use" type check, although it's
does not take much to add various hashing algorithms to sign
objects with. Signatures are stored in extended attributes of the
objects along with the the type of hashing used to calculate the
signature. This makes thing future proof when other hash types
are added. The signature infrastructure is provided by bitrot
stub: a little piece of code that sits over the POSIX xlator
providing interfaces to "get or set" objects signature and it's
staleness.
Since objects are signed upon receiving release() notification,
pre-existing data which are "never" modified would never be
signed. To counter this, an initial crawler thread is spawned
The crawler scans the entire brick for objects that are unsigned
or "missed" signing due to the server going offline (node reboots,
crashes, etc..) and triggers an explicit sign. This would also
sign objects when bit-rot is enabled for a volume and/or after
upgrade.
Change-Id: I1d9a98bee6cad1c39c35c53c8fb0fc4bad2bf67b
BUG: 1170075
Original-Author: Raghavendra Bhat <raghavendra@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9711
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* BitRot enable/disable CLI per volume
* Volfile generation for Scrubber
* Relevant glusterd infrastructure
Change-Id: I1212af63f93ecc52b22ee6da920e1664f66a1e39
BUG: 1170075
Original-Author: Raghavendra Bhat <raghavendra@redhat.com>
Original-Author: Venky Shankar <vshankar@redhat.com>
Original-Author: Gaurav Kumar Garg <ggarg@redhat.com>
Original-Author: Anand Nekkunti <anekkunt@redhat.com>
Reviewed-on: http://review.gluster.org/9986
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Implement the skeleton of bit-rot xlator.
Original-Author: Raghavendra Bhat <raghavendra@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Signed-off-by: Anand Nekkunti <anekkunt@redhat.com>
Change-Id: If33218bdc694f5f09cb7b8097c4fdb74d7a23b2d
BUG: 1170075
Reviewed-on: http://review.gluster.org/9710
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: If54f4be7db8b6f98e65570b09c07251e21ebae15
BUG: 1194640
Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com>
Reviewed-on: http://review.gluster.org/9837
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
- Added maintainers for BitRot detection
Change-Id: Ib5dae02176372e33c7b616f161d909815edbb53f
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9916
Tested-by: Vijay Bellur <vbellur@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
read-only/worm translator is not loaded by default in brick graph because of which
when read-only option is set through volume set volume still remains writable
untill the bricks are restarted as the translator does not have an inmemory flag
to decide whether the read-only/worm option is turned or not.
Solution:
read-only/worm should be loaded by default in brick graph and the read-only/worm
option can be toggled through volume set command. read-only/worm translator now'
has an in-memory flag to decide whether the volume is read-only or not and based
on that either reject the fop or proceed.
Change-Id: Ic79328698f6a72c50433cff15ecadb1a92acc643
BUG: 1134822
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-on: http://review.gluster.org/8571
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bitrot stub implements object versioning required for identifying
signature freshness. More details about versioning is explained
as a part of the "bitrot feature documentation" patch.
Change-Id: I2ad70d9eb109ba4a12148ab8d81336afda529ad9
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9709
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Certain translators may require to update the inode context
of an already linked inode before unwinding the call to the
client. Normally, such a case in encountered during parallel
operations when a fresh inode is chosen at call (wind) time.
In the callback path, one of inodes is successfully linked
in the inode table, thereby the other inodes being thrown
away (and the inode pointers for these calls being pointed
to the linked inode).
Translators which may have strict dependency on the correct
value in the inode context would get stale values in inode
context. This patch introduces a new callback which provides
gives translators an opportunity to "patch" their respective
inode contexts. Note that, as of now, this callback is only
invoked during create()s unwind path. Although this might
needed to be done for all dentry fops and lookup, but let
that be done as an when required (bitrot stub requires
this *only* for create()).
Change-Id: I6cd91c2af473c44d1511208060d3978e580c67a6
BUG: 1170075
Original-Author: Raghavendra Bhat <rabhat@redhat.com>
Original-Author: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9913
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Translators which wish to send event notifications can send
"down" an IPC FOP with op_type as GF_IPC_TARGET_CHANGELOG
and xdata carrying event structures (changelog_event_t).
Change-Id: I0e5f8c9170161c186f0e58d07105813e34e18786
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9775
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[WIP patch as of now, just needs a little tweak]
A pending TODO in the code caused regressions to fail as
bitrot daemons are spawned during volume start (equivalent
to enabling bitrot by default). The problematic part that
casued such failures is during brick disconnections with
unsafe handling of event data structured in the code.
With this patch, data structures are properly cleaned up
with care taken to cleanup all accessors first. This also
fixes potential memory leaks which was bluntly ignored
before.
Change-Id: I70ed82cb1a0fb56c85ef390007e321a97a35c5ce
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
original-author: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9959
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
libgfchangelog initializes global xlator on library load (via
constructor: _ctor) and mangles it's xlator context thereby
messing with certain important members of the command structure.
On receiving an RPC disconnection event, if the point-of-execution
was in libgfchangelogs context, accessing ->cmd_args during RPC
notify resulted in a segfault.
Fix:
Since the libarary needs to be able to work with processes that
have a notion of an xlator (THIS in particular) and without it,
care needs to be taken to allocate the global xlator when needed.
Moreover, the actual fix is to use the correct xlator context
in both cases. A new API is introduces when needs to be invoked
by the conusmer (although this could have been done during
register() call, keeping it a separate API makes thing flexible
and easy).
Test:
The issue is observed when a brick process goes offline. This is
triggered when test cases (.t's) are run in bulk, since each
test essestially spawns bricks processes (on volume start) and
terminates them (volume stop). Since bitrot daemon, as of now,
spawns upon volume start, the issue is much observed when the
volume is taken offline at the end of each test case. With this
fix, running the basic and core test cases along with building
the linux kernel has passed without daemon segfaults.
Thanks to Johnny (rabhat@) for helping in debugging the issue
(and with the fix :)).
Change-Id: I8d3022bf749590b2ee816504ed9b1dfccc65559a
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9953
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the current code of trash translator, file is moved to
trash directory without checking whether it is the last
hardlink.This may lead to inconsistency for a file in that
gluster volume.To avoid those scenarios,so a file is moved
to trash directory only if it is the last hardlink.
Change-Id: Id098e53a2236c6406ef91e6e2599ea2cff9bace3
BUG: 1132465
Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
Reviewed-on: http://review.gluster.org/9926
Reviewed-by: Anoop C S <achiraya@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Also fixed iatt_combine to go over all the valid iatts
Change-Id: I1d52d705ed0437f602357acde3e479cedb748681
BUG: 1199767
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/9827
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Global opinfo should not be referred by syncop framework as it uses local
txn_opinfo for every transaction. There is one place in the codebase where the
global opinfo is set with the local txn_opinfo which can lead to an incorrect
opinfo for an on-going op-sm transaction which refers to the same global opinfo.
Change-Id: Ida63a8871b8d03fe646146eddfd3f2473f1b1d7c
BUG: 1202745
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-on: http://review.gluster.org/9908
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Nekkunti <anekkunt@redhat.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The subject of a patch should not end with a dot (.). It is not common
to have subjects of emails end with a dot, and neither is it common for
patches.
Change-Id: Id090241393aee3ca99df4887bdb2d7a7a8913164
BUG: 1198849
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9940
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Ira Cooper <ira@redhat.com>
Reviewed-by: Lalatendu Mohanty <lmohanty@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
BUG: 1198849
Change-Id: I9597b4b7f37994865f88b99651ea9ec89787f5cf
Reported-by: Adam Borowski <kilobyte@angband.pl>
URL: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=778790
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9963
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I0327a48ba5a1a217f54557386b1ae1b986702340
BUG: 1178685
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9962
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Also, glfs_fini() is currently not stable yet, this test case causes
many regression failures. The .t file has been renamed to .sh so that
the test does not get run automatically, but can be run easily by hand.
BUG: 1093594
Change-Id: I63fa4ddf798a505bc94d13d32dd02f22a9b7ab73
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9961
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
FreeBSD does not have sys/xattr.h, including it in tier.h breaks
building on FreeBSD. There is nothing in tier.h that seems to require
definitions from the sys/xattr.h header, just remove it.
BUG: 1194753
Change-Id: If970272a0ce7728e0f18e5ae026880688ac31408
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9965
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Sachin Pandit <spandit@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Building on FreeBSD has been broken by http://review.gluster.org/9724
which introduces the cluster/tiering xlator. The CFLAGS passed to the
compiler do not include the path where sqlite3.h can be found.
In fact, an attempt was made to pass the flags on, but a later variable
overwrite these again.
BUG: 1194753
Change-Id: I1c890fa9a0d82492726306fe6b03bd50ca985e31
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9964
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Sachin Pandit <spandit@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Need to wait for a few seconds for rebalancing to complete
before stopping volume.
Change-Id: Ib81c02645240e7d74ebfb3e31ccbc612fc77b119
BUG: 1194753
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/9966
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The tier translator shares most of DHT's code. It differs in how
subvolumes are chosen for I/Os, and how file migration (cache promotion
and demotion) is managed. That different functionality is split to either
DHT or tier logic according to the "tier_methods" structure.
A cache promotion and demotion thread is created in a manner
similar to the rebalance daemon. The thread operates a timing
wheel which periodically checks for promotion and demotion candidates
(files). Candidates are queued and then migrated. Candidates must exist on
the same node as the daemon and meet other critera per caching policies.
This patch has two authors (Dan Lambright and Joseph Fernandes). Dan
did the DHT changes and Joe wrote the cache policies. The fix depends on
DHT readidr changes and the database library which have been submitted
separately. Header files in libglusterfs/src/gfdb should be reviewed in
patch 9683.
For more background and design see the feature page [1].
[1]
http://www.gluster.org/community/documentation/index.php/Features/data-classification
Change-Id: Icc26c517ccecf5c42aef039f5b9c6f7afe83e46c
BUG: 1194753
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/9724
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With trash translator, '.trashcan' is created in gluster volume
during init. So geo-rep create used to fail, with slave not
being empty. The gverify script should ignore '.trashcan' while
checking whether slave is empty.
Change-Id: Ib7ee265c6aa99b63c7ccf103747d47a6f756ad35
BUG: 1203086
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/9921
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Thanks to Niels for this fix
Change-Id: I9a13c3de3ed5d4eb06c6af61a2519bf27f1b6259
BUG: 1194753
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/9957
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I4552d98ff6b51e8171865cd0a569f6e793db9dd0
BUG: 1194640
Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com>
Reviewed-on: http://review.gluster.org/9876
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Lalatendu Mohanty <lmohanty@redhat.com>
Tested-by: Lalatendu Mohanty <lmohanty@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While building, the following warning is displayed:
xlators/features/arbiter/src/Makefile.am:7: variable `_la_LIBADD' is defined but no program or
xlators/features/arbiter/src/Makefile.am:7: library has `_la' as canonical name (possible typo)
The _la_LIBADD really seems like a typo, dropping it should not be harmful to
anything, except for the warning that will now be gone.
BUG: 1199985
Change-Id: I3f3ba911f59df2e51fdc6387295fff4bbcc5a12d
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9950
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: Id9a7d0f457d9759ab7d0a52a4000b5ae36d211f8
BUG: 1194753
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9946
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Based on the high-level design by Anand V. Avati which can be found @
https://gist.github.com/avati/af04f1030dcf52e16535#sharding-xlator-stripe-20
Still to-do:
* complete implementation of inode write fops - [f]truncate,
zerofill, fallocate, discard
* introduce transaction mechanism in inode write fops
* complete readv
* Handle open with O_TRUNC
* Handle unlinking of all shards during unlink/rename
* Compute total ia_size and ia_blocks in lookup, readdirp, etc
* wind fsync/flush on all shards
Note: Most of the items above are related. Once we come up
with a clean way to determine the last shard/shard count for
a file/file size and the mgmt of sparse regions of the file,
implementing them becomes trivial.
Change-Id: Id871379b53a4a916e4baa2e06f197dd8c0043b0f
BUG: 1200082
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/9841
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds the arbiter translator into the tree. This is a server
side xlator used for replica 3 volumes. It sits above posix and will be
loaded on the 3rd (last) brick of every afr subvolume in a replica 3
configuration. It intercepts inode read/write operations: reads are
unwound with ENOTCONN, inode writes are unwound with success without
actually passing them down to posix. Metadata operations are allowed to
pass through.
The CLI for creating a 3 way replica with arbiter is also added but kept
disabled (A 'normal' 3 way replica is created instead).
This patch is a part of the arbiter logic implementation for 3 way AFR,
details of which can be found at http://review.gluster.org/#/c/9656/
Change-Id: I395b81f49d5da52c466daf5c8518f1bbad9c16fa
BUG: 1199985
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/9840
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current way of installing the snap-scheduler tools breaks the
building of RPMs. The tools should be installed in /usr/sbin and there
is no need to change the attributes/permissions after the installation.
Change-Id: I1b8e206748617fb000d956b1b5da52d126971c29
BUG: 1203557
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9939
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Part 2/2 patch to enable users analyze and resolve
split-brain.
This patch enables :
1) Users to inspect the files in data and metadata split-brain.
2) Resolve the split-brain.
Both using a series of setfattr commands.
Consider a volume "test" with 2 bricks.
1) To inspect a file f1:
setfattr -n replica.split-brain-choice -v test-client-0 f1
After the execution of this command, if no read_subvol
is found, reads will be served from test-client-0 (corresponding
to brick-0).
2) To resolve split-brain :
setfattr -n replica.split-brain-heal-finalize -v test-client-0 f1
Execution of this command will lead to the resolution
of data and metadata split-brain with subvol mentioned in the
command (test-client-0 here) as the source and the rest as sink.
Change-Id: Ia20f3ee5abd3119e3d54fcc599f1e55ac65fd179
BUG: 1191396
Signed-off-by: Anuradha <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/9743
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A tiered volume is a normal volume with some number of new bricks
representing "hot" storage. The "hot" bricks can be attached or
detached dynamically to a normal volume. When this happens, a new graph
is constructed. The root of the new graph is an instance of the tier
translator. One subvolume of the tier translator leads to the old volume,
and another leads to the new hot bricks.
attach-tier <VOLNAME> [<replica> <COUNT>] <NEW-BRICK> ... [force]
volume detach-tier <VOLNAME> [replica <COUNT>] <BRICK>
... <start|stop|status|commit|force>
gluster volume rebalance <volume> tier start
gluster volume rebalance <volume> tier stop
gluster volume rebalance <volume> tier status
The "tier start" CLI command starts a server side daemon. The daemon
initiates file level migration based on caching policies. The daemon's
status can be monitored and stopped.
Note development on the "tier status" command is incomplete. It will be
added in a subsequent patch.
When the "hot" storage is detached, the tier translator is removed
from the graph and the tiered volume reverts to its original state as
described in the volume's info file.
For more background and design see the feature page [1].
[1]
http://www.gluster.org/community/documentation/index.php/Features/data-classification
Change-Id: Ic8042ce37327b850b9e199236e5be3dae95d2472
BUG: 1194753
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/9753
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Documentation for gluster volume heal info and split-brain
resolution.
Change-Id: Iacb30dff939c1acf06c24bd8844294d4a90ab38d
BUG: 1154599
Signed-off-by: Anuradha <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/9595
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The last two patches incorrectly add symbols indented by spaces. The
rest of the file uses tabs. This change corrects these occurences.
Change-Id: Ibfb057b78c1203a594bfeb73a2955e798e86c8e1
BUG: 1185654
Reported-by: Kaleb S. Keithley <kkeithle@redhat.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9937
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
**********************************************************************
ChangeTimeRecorder(CTR) Xlator |
**********************************************************************
ChangeTimeRecorder(CTR) is server side xlator(translator) which sits
just above posix xlator. The main role of this xlator is to record the
access/write patterns on a file residing the brick. It records the
read(only data) and write(data and metadata) times and also count on
how many times a file is read or written. This xlator also captures
the hard links to a file(as its required by data tiering to move
files).
CTR Xlator is the consumer of libgfdb.
To Enable/Disable CTR Xlator:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
gluster volume set <volume-name> features.ctr-enabled {on/off}
To Enable/Disable Frequency Counter Recording in CTR Xlator:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
gluster volume set <volume-name> features.record-counters {on/off}
Change-Id: I5d3cf056af61ac8e3f8250321a27cb240a214ac2
BUG: 1194753
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/9935
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
NFS now has the ability to use a separate file for "netgroups" and
"exports". An administrator should have the ability to check the
validity of the files before applying the configuration.
The "glusterfsd" command now has the following additional arguments that
can be used to check the configuration:
--print-netgroups: Validate the netgroups file and print it out
--print-exports: Validate the exports file and print it out
BUG: 1143880
Change-Id: I24c40d50110d49d8290f9fd916742f7e4d0df85f
URL: http://www.gluster.org/community/documentation/index.php/Features/Exports_Netgroups_Authentication
Original-author: Shreyas Siravara <shreyas.siravara@gmail.com>
CC: Richard Wareing <rwareing@fb.com>
CC: Jiffin Tony Thottan <jthottan@redhat.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9365
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch imports timer-wheel[1] algorithm from the linux
kernel (~/kernel/time/timer.c) with some modifications.
Timer-wheel is an efficent way to track millions of timers for
expiry. This is a variant of the simple but RAM heavy approach
of having a list (timer bucket) for every future second.
Timer-wheel categorizes every future second into a logarithmic
array of arrays. This is done by splitting the 32 bit "timeout"
value into fixed "sliced" bits, thereby each category has a
fixed size array to which buckets are assigned.
A classic split would be 8+6+6+6 (used in this patch) which
results in 256+64+64+64 == 512 buckets. Therefore, the entire
32 bit futuristic timeouts have been mapped into 512 buckets.
[
NOTE:
There are other possible splits, such as "8+8+8+8", but
this patch sticks to the widely used and tested default.
]
Therfore, the first category "holds" timers whose expiry range
is between 1..256, the next cateogry holds 257..16384, third
category 16385..1048576 and so on. When timers are added,
unless it's in the first category, timers with different
timeouts could end up in the same bucket. This means that the
timers are "partially sorted" -- sorted in their highest bits.
The expiry code walks the first array of buckets and exprires
any pending timers (1..256). Next, at time value 257, timers
in the first bucket of the second array is "cascaded" onto
the first category and timers are placed into respective
buckets according to the thier timeout values. Cascading
"brings down" the timers timeout to the coorect bucket
of their respective category. Therefore, timers are sorted
by their highest bits of the timeout value and then by the
lower bits too.
[1] https://lwn.net/Articles/152436/
Change-Id: I1219abf69290961ae9a3d483e11c107c5f49c4e3
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9707
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With trash feature, .trashcan directory gets created
at each export directory. Xsync picks .trashcan to sync
and fails with EPERM. Xsync should ignore .trashcan
directory.
Change-Id: I45bd226c96011ace2c40dd2de878d886c7d34ce5
BUG: 1203293
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/9934
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch introduces rotational buffers aiming at the classic
multiple producer and multiple consumer problem. A fixed set
of buffer list is allocated during initialization, where each
list consist of a list of buffers. Each buffer is an iovec
pointing to a memory region of fixed allocation size. Multiple
producers write data to these buffers. A buffer list starts with
a single buffer (iovec) and allocates more when required (although
this can be preallocatd in multiples of k).
rot-buffs allow multiple producers to write data parallely with
a bit of extra cost of taking locks. Therefore, it's much suited
for large writes. Multiple producers are allowed to write in the
buffer parallely by "reserving" write space for selected number
of bytes and returning pointer to the start of the reserved area.
The write size is selected by the producer before it starts the
write (which is often known). Therefore, the write itself need not
be serialized -- just the space reservation needs to be done safely.
The other part is when a consumer kicks in to consume what has
been produced. At this point, a buffer list switch is performed.
The "current" buffer list pointer is safely pointed to the next
available buffer list. New writes are now directed to the just
switched buffer list (the old buffer list is now considered out
of rotation). Note that the old buffer still may have producers
in progress (pending writes), so the consumer has to wait till
the writers are drained. Currently this is the slow path for
producers (write completion) and needs to be improved.
Currently, there is special handling for cases where the number
of consumers match (or exceed) the number of producers, which
could result in writer starvation. In this scenario, when a
consumers requests a buffer list for consumption, a check is
performed for writer starvation and consumption is denied
until at least another buffer list is ready of the producer
for writes, i.e., one (or more) consumer(s) completed, thereby
putting the buffer list back in rotation.
[
NOTE:
I've not performance tested this producer-consumer model
yet. It's being used in changelog for event notification.
The list of buffers (iovecs) are directly passed to RPC
layer.
]
Change-Id: I88d235522b05ab82509aba861374a2312bff57f2
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9706
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CLI command for bitrot features.
volume bitrot <volname> enable|disable
Above command will enable/disable bitrot feature for particular volume.
BUG: 1170075
Change-Id: Ie84002ef7f479a285688fdae99c7afa3e91b8b99
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Signed-off-by: Anand nekkunti <anekkunt@redhat.com>
Signed-off-by: Dominic P Geevarghese <dgeevarg@redhat.com>
Reviewed-on: http://review.gluster.org/9866
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GlusterFS volume snapshot provides point-in-time copy
of a GlusterFS volume. Currently, GlusterFS volume
snapshots can be easily scheduled by setting up
cron jobs on one of the nodes in the GlusterFS
trusted storage pool. This has a single point failure (SPOF),
as scheduled jobs can be missed if the node running the cron
jobs dies.
The solution to the above problems is addressed in this patch.
The snap_scheduler.py helper script expects the user to install
the argparse python module before using it.
Further details for the same are available at:
http://www.gluster.org/community/documentation/index.php/Features/Scheduling_of_Snapshot
Change-Id: I2c357af5b7d3e66f270d20eef50cdeecdcbe15c7
BUG: 1198027
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/9788
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
==========================================================================
Inode quota
==========================================================================
= Currently, the only way to retrieve the number of files/objects in a =
= directory or volume is to do a crawl of the entire directory/volume. =
= This is expensive and is not scalable. =
= =
= The proposed mechanism will provide an easier alternative to determine =
= the count of files/objects in a directory or volume. =
= =
= The new mechanism proposes to store count of objects/files as part of =
= an extended attribute of a directory. Each directory's extended =
= attribute value will indicate the number of files/objects present =
= in a tree with the directory being considered as the root of the tree. =
= =
= The count value can be accessed by performing a getxattr(). =
= Cluster translators like afr, dht and stripe will perform aggregation =
= of count values from various bricks when getxattr() happens on the key =
= associated with file/object count. =
A new interface is introduced:
------------------------------
limit-objects : limit the number of inodes at directory level
list-objects : list the directories where the limit is set
remove-objects : remove the limit from the directory
==========================================================================
CLI COMMAND:
gluster volume quota <volname> limit-objects <path> <number> [<percent>]
* <number> is a hard-limit for number of objects limitation for path "<path>"
If hard-limit is exceeded, creation of file/directory is no longer
permitted.
* <percent> is a soft-limit for number of objects creation for path "<path>"
If soft-limit is exceeded, a warning is issued for each creation.
CLI COMMAND:
gluster volume quota <volname> remove-objects [path]
==========================================================================
CLI COMMAND:
gluster volume quota <volname> list-objects [path] ...
Sample output:
------------------
Path Hard-limit Soft-limit Used Available
Soft-limit exceeded?
Hard-limit exceeded?
------------------------------------------------------------------------
--------------------------------------
/dir 10 80% 10 0
Yes
Yes
==========================================================================
[root@snapshot-28 dir]# ls
a b file11 file12 file13 file14 file15 file16 file17
[root@snapshot-28 dir]# touch a1
touch: cannot touch `a1': Disk quota exceeded
* Nine files are created in directory "dir" and directory is included in
* the
count too. Hence the limit "10" is reached and further file creation
fails
==========================================================================
Note: We have also done some re-factoring in cli for volume name
validation. New function cli_validate_volname is created
==========================================================================
Change-Id: I1823497de4f790a2a20ebb1770293472ea33ee2b
BUG: 1190108
Signed-off-by: Sachin Pandit <spandit@redhat.com>
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/9769
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the RPC based changes to {libgf}changelog, loading shared
objects dynamically would need symbols to be available from
other shared libraries. As an example, creating an RPC listner
loads the RPC transport shared object which requires symbols
to be available from already loaded shared objects.
Using RTLD_GLOBAL makes the symbols available for symbol
resolution of subsequently loaded libraries.
Change-Id: I3d3ef790eded82911f05836c707509157680645c
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9814
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch introduces RPC based communication between the changelog
translator and libgfchangelog. It replaces the old pathetic stream
based interaction that existed earlier (due to time constraints :-/).
Changelog, upon initialization starts a RPC server (rpcsvc) allowing
clients to invoke a probe API as a bootup mechanism to request for
event notifications. During probe, clients can choose an event
filter specifying the type(s) of events they are interested in. As
of now there is no way to change the event notification set once
the probe RPC call is made, but that is easier to implement.
The actual event notifications is done on a separate RPC session.
The client (libgfchangelog) itself starts and RPC server which the
changelog translator "connects back" during probe. Notifications
are dispatched by a bunch of threads from the server (translator)
and the client optionally orders them if ordered notifications
are requried. FOPs fill in their respective event details in a
buffer (rot-buffs to be particular) and a bunch of threads
(consumers) swap the buffers out of roatation and dispatch them
via RPC. To avoid writer starvation, then number of dispatcher
threads is one less than the number of buffer list in rot-buffs.x
libgfchangelog becomes purely callback based -- upon event
notification from the server (and re-ordering them if required)
invoke a callback routine specified by consumer(s).
A major part of the patch is also aimed at providing backward
compatibility for geo-replication, which was one of the main
consumer of the stream based API. Also, this patch does not\
"turn on" event notifications for all fops, just a bunch which
is currently in requirement. Another pain point is that the
server does not filter events before dispatching it to the
clients. That load is taken up by the client itself (although
it's done at the library layer rather than making it hard on
the callback implementor). This needs improvement and care
needs to be taken to not load the server up with expensive
filtering mechanisms.
Change-Id: Ibf60a432b68f2dfa60c6f9add2bcfd37a9c41395
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9708
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These two functions add support for POSIX ACLs through the GFAPI-handle
interface.
The initial infrastructure for POSIX ACLs based on libacl has been added
with the required changes to the POSIX xlator:
- http://review.gluster.org/9627
NetBSD does not support POSIX ACLs, so using any of the functions should
return ENOTSUP.
URL: http://www.gluster.org/community/documentation/index.php/Features/Improved_POSIX_ACLs
Change-Id: Ie74f3f963c3f9d576cb2f2a1e6d97e3cd4b01eda
BUG: 1185654
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9736
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|