| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I83c494f2bb60d29495cd643659774d430325af0a
BUG: 1194640
Signed-off-by: Mohamed Ashiq <ashiq333@gmail.com>
Reviewed-on: http://review.gluster.org/10297
Tested-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PROBLEM
--------
statedump requests that traverse call frames of all call stacks in
execution may race with a STACK_RESET on a stack. This could crash the
corresponding glusterfs process. For e.g, recently we observed this in a
regression test case tests/basic/afr/sparse-self-heal.t.
FIX
---
gf_proc_dump_pending_frames takes a (TRY_LOCK) call_pool->lock before
iterating through call frames of all call stacks in progress. With this
fix, STACK_RESET removes its call frames under the same lock.
Additional info
----------------
This fix makes call_stack_t to use struct list_head in place of custom
doubly-linked list implementation. This makes call_frame_t manipulation
easier to maintain in the context of STACK_WIND et al.
BUG: 1229658
Change-Id: I7e43bccd3994cd9184ab982dba3dbc10618f0d94
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/11095
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The only check that _gf_ref_get() needs is "== 0" for detecting a
failure. The actual return value is not guaranteed to be the number of
active refences (they can change in other threads anyway).
BUG: 1163543
Change-Id: I8801601eab37046f5a5ee0bce5a62606115ca151
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/11328
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Checks for compiler supported atomic operations comes from client_t.h.
An example usage of this change can be found in adding reference
counting to "struct auth_cache_entry" in http://review.gluster.org/11023
Basic usage looks like this:
#include "refcount.h"
struct my_struct {
GF_REF_DECL;
... /* more members */
}
void my_destructor (void *data)
{
struct my_struct *my_ptr = (struct my_struct *) data;
... /* do some more cleanups */
GF_FREE (my_ptr);
}
void init_ptr (struct parent *parent)
{
struct my_struct *my_ptr = malloc (sizeof (struct my_struct));
GF_REF_INIT (my_ptr, my_destructor); /* refcount is set to 1 */
... /* my_ptr probably gets added to some parent structure */
parent_add_ptr (parent, my_ptr);
}
void do_something (struct parent *parent)
{
struct my_struct *my_ptr = NULL;
/* likely need to lock parent, depends on its access pattern */
my_ptr = parent_remove_first_ptr (parent);
/* unlock parent */
... /* do something */
GF_REF_PUT (my_ptr); /* calls my_destructor on refcount = 0 */
}
URL: http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/11202
Change-Id: Idb98a5861a44c31676108ed8876db12c320912ef
BUG: 1228157
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/11022
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I4beba3b50456f802824374b6e3fa8079d72f2c00
BUG: 1194640
Signed-off-by: Mohamed Ashiq <ashiq333@gmail.com>
Reviewed-on: http://review.gluster.org/10825
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
common-utils.c:2871:17: warning: Function call argument is an uninitialized
value gf_log ("glusterfs", GF_LOG_DEBUG, "lower: %d, higher:%d",
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Change-Id: Iffde527a392afe783908aee12040d555c77c6983
BUG: 1222769
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Reviewed-on: http://review.gluster.org/10814
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: Iee1c083774c988375a7261cfd6d510ed4c574de2
BUG: 1194640
Signed-off-by: Mohamed Ashiq <ashiq333@gmail.com>
Reviewed-on: http://review.gluster.org/10824
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I29bdeefb755805858e3cb1817b679cb6f9a476a9
BUG: 1194640
Signed-off-by: Hari Gowtham <hgowtham@dhcp35-85.lab.eng.blr.redhat.com>
Reviewed-on: http://review.gluster.org/9893
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently bitrot using 120 second waiting time for object to be signed
after all fop's released. This signing waiting time value should be tunable.
Command for changing the signing waiting time will be
#gluster volume bitrot <VOLNAME> signing-time <waiting time value in second>
Change-Id: I89f3121564c1bbd0825f60aae6147413a2fbd798
BUG: 1228680
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/11105
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Coverity CID: 1124835
Change-Id: I7e87f2b3bad35cf8a9c64c8502de23662d9f677f
BUG: 789278
Signed-off-by: Nandaja Varma <nvarma@redhat.com>
Reviewed-on: http://review.gluster.org/9565
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Sakshi Bansal
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Prashanth Pai <ppai@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I66e7ccc5e62482c3ecf0aab302568e6c9ecdc05d
BUG: 1194640
Signed-off-by: Mohamed Ashiq <ashiq333@gmail.com>
Reviewed-on: http://review.gluster.org/10938
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Joseph Fernandes
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I61874561fdf2c175d2b3eec0c85c25f12dc60552
BUG: 1194640
Signed-off-by: Mohamed Ashiq <ashiq333@gmail.com>
Reviewed-on: http://review.gluster.org/10819
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Joseph Fernandes
|
|
|
|
|
|
|
|
|
| |
Change-Id: I137f1b7805895810b8e6f0a70a183782bf472bf5
BUG: 1194640
Signed-off-by: Mohamed Ashiq <ashiq333@gmail.com>
Reviewed-on: http://review.gluster.org/9898
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As 3.7.1 is released, and a DHT configuration option needs higher
op version, bumping the gluster op-version to 3.7.2 (or 30702).
Change-Id: Iaed9e49b86a195653ddca55994e2c2398b2ee3bc
BUG: 1227884
Signed-off-by: Shyam <srangana@redhat.com>
Reviewed-on: http://review.gluster.org/11070
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Also, added memory allocation failure checks in light of the
comments received @
http://review.gluster.org/#/c/10809/2/libglusterfs/src/gf-dirent.c, and
http://review.gluster.org/#/c/10809/1/xlators/features/shard/src/shard.c
Change-Id: Ie4092218545c8f4f8a0e6cc1fec6ba37bbbf2620
BUG: 1226551
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/11026
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of including config.h in each file, and have the additional
config.h included from the compiler commandline (-include option).
When a .c file tests for a certain #define, and config.h was not
included, incorrect assumtions were made. With this change, it can not
happen again.
BUG: 1222319
Change-Id: I4f9097b8740b81ecfe8b218d52ca50361f74cb64
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/10808
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In release-3.7 the format of quota.conf is changed.
There is a backward compatibility issues during upgrade
1) There can be an issue when peer sync between node-3.6 and node-3.7
2) If the user sets/removes limit, there is will different format of
file in node-3.6 and node-3.7
This patch fixes the issue:
1) restrict the user to execute command quota enable, limit-usage, remove
2) write quota.conf in older format if op-version is less than 3.6
Change-Id: Ib76f5a0a85394642159607a105cacda743e7d26b
BUG: 1223739
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/10889
Tested-by: NetBSD Build System
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As 3.7 is released, and a DHT configuration option needs higher
op version, bumping the gluster op-version to 3.7.1 (or 30701).
Change-Id: I9747cf93b41be72e43077ed8e977e21eed99ccc3
BUG: 1223432
Signed-off-by: Shyam <srangana@redhat.com>
Reviewed-on: http://review.gluster.org/10849
Reviewed-by: Kaushal M <kaushal@redhat.com>
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a file is under migration, any xattrs created on it
are lost post migration of the file. This is because
the xattrs are set only on the cached subvol of the source
and as the source is under migration, it becomes a linkto file
post migration.
Change-Id: Ib8e233b519cf954e7723c6e26b38fa8f9b8c85c0
BUG: 1193636
Signed-off-by: Nithya Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/10212
Tested-by: NetBSD Build System
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I0481258de8da36cbee7c046f53b20359badaf064
BUG: 1221889
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/10791
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1) If the database file exists
a. Dont try re-creating the db schema
b. Dont try re-configuring the db.
2) Dont assert in fini_db () when connection is NULL
Change-Id: I15dd103fe7542f70113c1d5e539a99f8cd062be4
BUG: 1163543
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/10870
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The key concept here is to determine whether a directory is "clean" by
comparing its last-known-good topology to the current one for the
volume. These are stored as "commit hashes" on the directory and the
volume root respectively. The volume's commit hash changes whenever a
brick is added or removed, and a fix-layout is done. A directory's
commit hash changes only when a full rebalance (not just fix-layout)
is done on it. If all bricks are present and have a directory
commit hash that matches the volume commit hash, then we can assume
that every file is in its "proper" place. Therefore, if we look for
a file in that proper place and don't find it, we can assume it's not
on any other subvolume and *safely* skip the global (broadcast to all)
lookup.
Change-Id: Id6ce4593ba1f7daffa74cfab591cb45960629ae3
BUG: 1219637
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Signed-off-by: Shyam <srangana@redhat.com>
Reviewed-on: http://review.gluster.org/7702
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Log typo fix for CTR Xlator and Libgfdb
Change-Id: I8d7208b9756199f8dc0a72771a3c953b4fa00f3f
BUG: 1215896
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/10668
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When freeing memory, our memory-accounting code expects to be able to
dereference from the (previously) allocated block to its owning
translator. However, as we have already found once in option
validation and twice in logging, that translator might itself have
been freed and the dereference attempt causes on of our daemons to
crash with SIGSEGV. This patch attempts to fix that as follows:
* We no longer embed a struct mem_acct directly in a struct xlator,
but instead allocate it separately.
* Allocated memory blocks now contain a pointer to the mem_acct
instead of the xlator.
* The mem_acct structure contains a reference count, manipulated in
both the normal and translator allocate/free code using atomic
increments and decrements.
* Because it's now a separate structure, we can defer freeing the
mem_acct until its reference count reaches zero (either way).
* Some unit tests were disabled, because they embedded their own
copies of the implementation for what they were supposedly testing.
Life's too short to spend time fixing tests that seem designed to
impede progress by requiring a certain implementation as well as
behavior.
Change-Id: Id929b11387927136f78626901729296b6c0d0fd7
BUG: 1211749
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: http://review.gluster.org/10417
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
of open
* This patch brings in the changes where object versioning is done in write and
truncate fops instead of tracking them in open and create fops. This model
works for both regular and anonymous fds. It also removes the race associated
with open calls, create and lookups.
This patch follows the below method for object versioning and notifications:
Before sending writev on the fd, increase the ongoing
version first. This makes anonymous fd write similar to the regular
fd write by having the ongoing version increased before doing the
write.
Do following steps to do versioning:
1) For anonymous fds set the fd context (so that release is invoked) and add
the fd context to the list maintained in the inode context.
For regular fds the above think would have been done in open itself.
2) Increase the on-disk ongoing version
3) Increase the in memory ongoing version and mark inode as non-dirty
3) Once versioning is successfully done send write operation. If
versioning fails, then fail the write fop.
5) In writev_cbk mark inode as modified.
Change-Id: I7104391bbe076d8fc49b68745d2ec29a6e92476c
BUG: 1207979
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/10233
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Data self-heal:
1) Take inode lock in domain 'this->name:self-heal' on 0-0 range (full file),
So that no other processes try to do self-heal at the same time.
2) Take inode lock in domain 'this->name' on 0-0 range (full file),
3) perform fxattrop+fstat and get the xattrs on all the bricks
3) Choose the brick with ec->fragment number of same version as source
4) Truncate sinks
5) Unlock lock taken in 2)
5) For each block take full file lock, Read from sources write to the sinks, Unlock
6) Take full file lock and see if the file is still sane copy i.e. File didn't become unusable while the bricks are offline.
Update mtime to before healing
7) xattrop with -ve values of 'dirty' and difference of highest and its own
version values for version xattr
8) unlock lock acquired in 6)
9) unlock lock acquired in 1)
Change-Id: I6f4d42cd5423c767262c9d7bb5ca7767adb3e5fd
BUG: 1215265
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/10384
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch introduces multithreaded filesystem scrubber based
on throttling option configured for a particular volume. The
implementation "logically" breaks scanning and scrubbing with
the number of scrubber threads auto-configured depending upon
the throttle configuration. Scanning (crawling) is left single
threaded (per brick) with entries scrubbed in bulk. On reaching
this "bulk" watermark, scanner waits until entries are scrubbed.
Bricks for a particular volume have a set of thread(s) assigned
for scrubbing, with entries for each brick scrubbed in a round
robin fashion to avoid scrub "stalls" when a brick (out of N
bricks) is under active scrubbing.
This mechanism helps us implement "pause/resume" with ease: all
one need to do is to cleanup scrubber threads and let the main
scanner thread "wait" untill scrubbing is resumed (where the
scrubber thread(s) are spawned again), therefore continuing
where we left off (unless we restart the deamons, where crawl
initiates from root directory again, but I guess that's OK).
[
NOTE:
Throttling is optional for the signer daemon, without which
it runs full throttle. However, passing "-DBR_RATE_LIMIT_SIGNER"
predefined in CFLAGS enables CPU throttling (during checksum
calculation) thereby avoiding high CPU usage.
]
Subsequent patches would introduce CPU throttling during hash
calculation for scrubber.
Change-Id: I5701dd6cd4dff27ca3144ac5e3798a2216b39d4f
BUG: 1207020
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10511
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1) Provided setfattr command to set timeout for split-brain
choice.
2) If split-brain inspection/resolution is being done
from the mount for a file, ref the inode when
split-brain-choice is set.
This inode will be unconditionally unref-ed after timeout
seconds set by the user/default otherwise.
3) Updated the doc and testcase to reflect the changes.
Change-Id: I15c9037dee28855f21e680e7e3632e1f48dba4e1
BUG: 1209104
Signed-off-by: Anuradha <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/10134
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Replace-brick operation with data migration support have been
deprecated from gluster.
With this fix replace brick command will support only one commad
gluster volume replace-brick <VOLNAME> <SOURCE-BRICK> <NEW-BRICK> {commit force}
Change-Id: Ib81d49e5d8e7eaa4ccb5830cfec2bc081191b43b
BUG: 1094119
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Reviewed-on: http://review.gluster.org/10101
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: The CTR xlator records file meta (heat/hardlinks)
into the data. This works fine for files which are created
after ctr xlator is switched ON. But for files which were
created before CTR xlator is ON, CTR xlator is not able to
record either of the meta i.e heat or hardlinks. Thus making
those files immune to promotions/demotions.
Solution: The solution that is implemented in this patch is
do ctr-db heal of all those pre-existent files, using named lookup.
For this purpose we use the inode-xlator context variable option
in gluster.
The inode-xlator context variable for ctr xlator will have the
following,
a. A Lock for the context variable
b. A hardlink list: This list represents the successful looked
up hardlinks.
These are the scenarios when the hardlink list is updated:
1) Named-Lookup: Whenever a named lookup happens on a file, in the
wind path we copy all required hardlink and inode information to
ctr_db_record structure, which resides in the frame->local variable.
We dont update the database in wind. During the unwind, we read the
information from the ctr_db_record and ,
Check if the inode context variable is created, if not we create it.
Check if the hard link is there in the hardlink list.
If its not there we add it to the list and send a update to the
database using libgfdb.
Please note: The database transaction can fail(and we ignore) as there
already might be a record in the db. This update to the db is to heal
if its not there.
If its there in the list we ignore it.
2) Inode Forget: Whenever an inode forget hits we clear the hardlink list in
the inode context variable and delete the inode context variable.
Please note: An inode forget may happen for two reason,
a. when the inode is delete.
b. the in-memory inode is evicted from the inode table due to cache limits.
3) create: whenever a create happens we create the inode context variable and
add the hardlink. The database updation is done as usual by ctr.
4) link: whenever a hardlink is created for the inode, we create the inode context
variable, if not present, and add the hardlink to the list.
5) unlink: whenever a unlink happens we delete the hardlink from the list.
6) mknod: same as create.
7) rename: whenever a rename happens we update the hardlink in list. if the hardlink
was not present for updation, we add the hardlink to the list.
What is pending:
1) This solution will only work for named lookups.
2) We dont track afr-self-heal/dht-rebalancer traffic for healing.
Change-Id: Ia4bbaf84128ad6ce8c3ddd70bcfa82894c79585f
BUG: 1212037
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/10370
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of "trusted.glusterfs.bit-rot.*" use "trusted.bit-rot.*"
NOTE:
With this patch, data on existing volumes would be resigned
(which should be OK as of now since we do not expect many
users as of now :-))
Change-Id: I926c7bca266a9c8f2cb35d57c4d0359aa5cecfa0
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10181
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Background:
Glusterfs changelogs are stored in each brick, which records the changes
happened in that brick. Georep will run in all the nodes of master and
processes changelogs "independently".
Processing changelogs is in brick level, but all the fops will be replayed
on "slave mount" point.
Problem:
With a DHT volume, in changelog "internal fops" are NOT recorded.
For Rename case, Rename is recorded in "hashed" brick changelog.
(DHT's internal fops like creating linkto file, unlink is NOT recorded).
This lead us to inconsistent rename operations.
For example,
Distribute volume created with Two bricks B1, B2.
//Consider master volume mounted @ /mnt/master
and following operations executed:
cd /mnt/master
touch f1 // f1 falls on B1 Hash
mv f1 f2 // f2 falls on B2 Hash
// Here, Changelogs are recorded as below:
@B1
CREATE f1
@B2
RENAME f1 f2
Here, race exists between Brick B1 and B2, say B2 will get executed first.
Source file f1 itself is "NOT PRESENT", so it will go ahead and create
f2 (Current implementation).
We have this problem When rename falls in another brick and
file is unlinked in Master.
Similar kind of issue exists in following case too(multiple rename):
CREATE f1
RENAME f1 f2
RENAME f2 f1
Solution:
Instead of carrying out "changelogging" at "HASHED volume",
carry out at the "CACHED volume".
This way we have rename operations carried out where actual files are present.
So,Changelog recorded as :
@B1
CREATE f1
RENAME f1 f2
credit: sarumuga@redhat.com
PS: Some of the races as the one below are _NOT_ fixed by this patch
* f1 and f2 exist. B1 and B2 are their respective cached subvols. For
both files hashed-subvol == cached-subvol
* mv f1 f2 on master.
* B1 has change-log entry of rename f1 f2
* rebalance migrates f2 from B1 and B2
* mv f2 f1 on master.
* B2 has change-log entry of rename f2 f1
Since changelog entries (rename f1 f2) and (rename f2 f1) are processed
independently by gsyncds, which of either f1 and f2 survives on slave
is subject to race. Note that on master its file f1 with name f1 which
survived. On slave it can be either file f1 with name f1 or file f2
with name f2 based on who wins the race of processing changelog.
Change-Id: Iebc222f582613924c3a7cba37fb6d3e2d8332eda
BUG: 1141379
Signed-off-by: Nithya Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/10410
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Have added support to send attributes of both entries and
its parent (include oldparent in case of RENAME fop) in the
same notification request to avoid multiple rpc requests.
Also, made changes in gfapi to send parent object and its
attributes changed in a single upcall event.
Change-Id: I92833da3bcec38d65216921c2ce4d10367c32ef1
BUG: 1200262
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Reviewed-on: http://review.gluster.org/10460
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I92ff46bae36d39a449d4bbaedc88a322992f65eb
BUG: 1215265
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/10391
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If both dicts are NULL then equal. If one of the dicts is NULL but the other
has only ignorable keys then also they are equal. If both dicts are non-null
then check if for each non-ignorable key, values are same or not. value_ignore
function is used to skip comparing values for the keys which must be present in
both the dictionaries but the value could be different.
geo-rep's stime xattr doesn't need to be present in list xattr but when
getxattr comes on stime xattr even if there aren't enough responses with the
xattr we should still give out an answer which is maximum of the stimes
available.
Change-Id: I8de2ceaa2db785b797f302f585d88e73b154167d
BUG: 1207712
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/10078
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently when quota limit is set, corresponding gfid
is set in quota.conf. This patch supports storing
inode-quota limits in quota.conf and also stores
additional byte for each gfid to differentiate
between usage quota limit and inode quota limit.
Change-Id: I444d7399407594edd280e640681679a784d4c46a
BUG: 1202244
Signed-off-by: vmallika <vmallika@redhat.com>
Signed-off-by: Sachin Pandit <spandit@redhat.com>
Reviewed-on: http://review.gluster.org/10069
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1) ctr_link_consistency option for ctr xaltor is provided so that
the user can choose to switch it on or off.
/* For link consistency we do a double update i.e mark the link
* during the wind and during the unwind we update/delete the link.
* This has a performance hit. We give a choice here whether we need
* link consistency to be spoton or not using link_consistency flag.
* This will have only one link update */
2) In delete the wind time recording is moved to unwind path.
/* Special performance case:
* Updating wind time in unwind for delete. This is done here
* as in the wind path we will not know whether its the last
* link or not. For a last link there is not use to update any
* wind or unwind time!*/
Change-Id: I209472fb816f939db4a868b97ba053b028f17ea6
BUG: 1217786
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/10170
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As suggested during the code-review of Bug1200262, have modified
GF_CBK_UPCALL to be exlusively GF_CBK_CACHE_INVALIDATION.
Thus, for any new upcall event, a new CBK procedure will be added.
Also made changes to store upcall data separately based on the
upcall event type received.
BUG: 1200262
Change-Id: I0f5e53d6f5ece16aecb514a0a426dca40fa1c755
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Reviewed-on: http://review.gluster.org/10049
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I71e940817ae0a9378e82332d5a8569114fc13482
BUG: 1194640
Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com>
Reviewed-on: http://review.gluster.org/9868
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current patch address two part of the design proposed.
1. Rebalance multiple files in parallel
2. Crawl only bricks that belong to the current node
Brief design explanation for the above two points.
1. Rebalance multiple files in parallel:
-------------------------------------
The existing rebalance engine is single threaded. Hence, introduced
multiple threads which will be running parallel to the crawler. The
current rebalance migration is converted to a "Producer-Consumer"
frame work.
Where Producer is : Crawler
Consumer is : Migrating Threads
Crawler: Crawler is the main thread. The job of the crawler is now
limited to fix-layout of each directory and add the files which are
eligible for the migration to a global queue in a round robin manner
so that we will use all the disk resources efficiently. Hence, the
crawler will not be "blocked" by migration process.
Producer: Producer will monitor the global queue. If any file is
added to this queue, it will dqueue that entry and migrate the file.
Currently 20 migration threads are spawned at the beginning of the
rebalance process. Hence, multiple file migration happens in parallel.
2. Crawl only bricks that belong to the current node:
--------------------------------------------------
As rebalance process is spawned per node, it migrates only the files
that belongs to it's own node for the sake of load balancing. But it
also reads entries from the whole cluster, which is not necessary as
readdir hits other nodes.
New Design:
As part of the new design the rebalancer decides the subvols
that are local to the rebalancer node by checking the node-uuid of
root directory prior to the crawler starts. Hence, readdir won't hit
the whole cluster as it has already the context of local subvols and
also node-uuid request for each file can be avoided. This makes the
rebalance process "more scalable".
Change-Id: I73ed6ff807adea15086eabbb8d9883e88571ebc1
BUG: 1171954
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/9657
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ia71488a20147ec3e99548aea15b4a57bb325fd06
BUG: 1194640
Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com>
Reviewed-on: http://review.gluster.org/10420
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
all the logging patches will be dependent on this segment allocation.
Sending this change as a seperate one to avoid the conflicts.
Change-Id: Iedd72ab6dc3526f1a6b01828807b5e6b9edcba90
BUG: 1194640
Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com>
Reviewed-on: http://review.gluster.org/10400
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem : In glusterd,we are using big lock which is implemented based on sync
task frame work for thread synchronization and rcu lock for data consistency.
sync task frame work swap the threads if there is no worker poll threads
available,due to this rcu lock and rcu unlock was happening in different threads
(urcu-bp will not allow this),resulting into glusterd crash.
fix : To avoid releasing the sync lock(big lock) in between rcu critical
section,implemented sync lock as recursive lock.
More details:
link : http://www.spinics.net/lists/gluster-devel/msg14632.html
Change-Id: I2b56c1caf3f0470f219b1adcaf62cce29cdc6b88
BUG: 1211640
Signed-off-by: anand <anekkunt@redhat.com>
Reviewed-on: http://review.gluster.org/10285
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Logic for adding the 'glusterd_brickinfo->group' member and using it to
find the brick positon has been taken from http://review.gluster.org/#/c/9919.
Thanks to Jeff Darcy for that.
This patch is a part of the arbiter logic implementation for 3 way AFR
details of which can be found at http://review.gluster.org/#/c/9656/
Change-Id: Idbfe4f29ee8e098e0102def8f38b32314316b188
BUG: 1199985
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/10257
Tested-by: NetBSD Build System
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently inode_ctx_get2 return success for value2 even if it is
not found. This patch fixes the same.
Change-Id: I6bf3e6cb280ab3b9b8818bf48dc6e42a349dfa5d
BUG: 12002268
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/10412
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
xdata should be passed even in error cases.
lookup() call was missed in previous patch set.
Change-Id: I1ad2c452d05a3b4433b640762aaea5d3a91f2ba5
BUG: 1209869
Signed-off-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-on: http://review.gluster.org/10193
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ifc7937ceb451f6e11e40a9513017226fd0f115b0
BUG: 1215265
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/10382
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instantiate a process wide global instance of the timer wheel
data structure. Spawning glusterfs* process with option arg
"--global-timer-wheel" instantiates a global instance of
timer-wheel under global context (->ctx).
Translators can make use of this process wide instance [via a
call to glusterfs_global_timer_wheel()] instead of maintaining
an instance of their own and possibly consuming more memory.
Linux kernel too has a single instance of timer wheel where
subsystems such as IO, networking, etc.. make use of.
Bitrot daemon would be early consumers of this: bitrot translator
instances for multiple volumes would track objects belonging to
their respective bricks in this global expiry tracking data
structure. This is also a first step to move GlusterFS timer
mechanism to use timer-wheel.
Change-Id: Ie882df607e07acaced846ea269ebf1ece306d6ae
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10380
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Failing to reset scanning counter causes "incorrect" delay of around
50 seconds per directory entry. This causes scrubber to run extremely
slowly.
[
NOTE: This is a temporary fix. With the introduction of token
bucket based throttling, inducing throttle via sleep()
call would be unneeded.
]
Also, fix logging messages in scrubber to log brick and full path
of the object which is identified/marked as corrupted.
Change-Id: Id501bd15dcdbd8a09613f80f9d84050304740027
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10375
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When there are many NFS-clients doing very often mount/unmount actions,
the updating of the 'rmtab' can become a bottleneck and cause delays. In
these situations, the output of 'showmount' may be less important than
the responsiveness of the (un)mounting.
By setting 'nfs.mount-rmtab' to the value "/-", the cache file is not
updated anymore, and the entries are only kept in memory.
BUG: 1169317
Change-Id: I40c4d8d754932f86fb2b1b2588843390464c773d
Reported-by: Cyril Peponnet <cyril@peponnet.fr>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9223
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: soumya k <skoduri@redhat.com>
Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|