| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
files deleted during promotion were not deleting as the
files are moving from hashed to non-hashed.
On deleting a file that is undergoing promotion,
the unlink call is not sent to the dst file as the
hashed subvol == cached subvol. This causes
the file to reappear once the migration is complete.
This patch also fixes a problem with stale linkfile
deleting.
Change-Id: I4b02a498218c9d8eeaa4556fa4219e91e7fa71e5
BUG: 1282390
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/12829
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CLI command for bitrot scrub status will be :
gluster volume bitrot <volname> scrub status
Above command will show the statistics of bitrot scrubber.
Upon execution of this command it will show some common
scrubber tunable value of volume <VOLNAME> followed by
statistics of scrubber statistics of individual nodes.
sample ouput for single node:
Volume name : <VOLNAME>
State of scrub: Active
Scrub frequency: biweekly
Bitrot error log location: /var/log/glusterfs/bitd.log
Scrubber error log location: /var/log/glusterfs/scrub.log
=========================================================
Node name:
Number of Scrubbed files:
Number of Unsigned files:
Last completed scrub time:
Duration of last scrub:
Error count:
=========================================================
This is just infrastructure. list of bad file, last scrub
time, error count value will be taken care by
http://review.gluster.org/#/c/12503/ and
http://review.gluster.org/#/c/12654/ patches.
Change-Id: I3ed3c7057c9d0c894233f4079a7f185d90c202d1
BUG: 1207627
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Reviewed-on: http://review.gluster.org/10231
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add xlator options to index xlator with xattrs that
it needs to keep track of.
Change-Id: If818673be5e626f77e65cc3a340f8cdd624179c2
BUG: 1250803
Signed-off-by: Anuradha Talur <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/12467
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As a part of CHILD_MODIFIED event DHT forgets the current layout and
performs fresh lookup. However this is not required when a replica pair
goes offline as the xattrs can be read from other replica pairs. Hence
setting different event to handle replica pair going down.
Change-Id: I5ede2a6398e63f34f89f9d3c9bc30598974402e3
BUG: 1281230
Signed-off-by: Sakshi Bansal <sabansal@redhat.com>
Reviewed-on: http://review.gluster.org/12573
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Susant Palai <spalai@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With unlink, rename, rmdir, contribution xattrs
are removed. If the file is a last link
then remove_xattr will fail with ENOENT.
So it better to perform remove_xattr
only if there are more links to the file
Change-Id: Ifc1e7fda4d310fd87f6f28a635c9ea78b8f3929d
BUG: 1257694
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/12033
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Manikandan Selvaganesh <mselvaga@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a quota is disable and the clean-up process terminated
without completely cleaning-up the quota xattrs.
Now when quota is enabled again, this can mess-up the accounting
A version number is suffixed for all quota xattrs and this version
number is specific to marker xaltor, i.e when quota xattrs are
requested by quotad/client marker will remove the version suffix in the
key before sending the response
Change-Id: I1ca2c11460645edba0f6b68db70d476d8d26e1eb
BUG: 1272411
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/12386
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Manikandan Selvaganesh <mselvaga@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Using sampling feature you can record details about every Nth FOP.
The fields in each sample are: FOP type, hostname, uid, gid, FOP priority,
port and time taken (latency) to fufill the request.
- Implemented using a ring buffer which is not (m/c) allocated in the IO path,
this should make the sampling process pretty cheap.
- DNS resolution done @ dump time not @ sample time for performance w/
cache
- Metrics can be used for both diagnostics, traffic/IO profiling as well
as P95/P99 calculations
- To control this feature there are two new volume options:
diagnostics.fop-sample-interval - The sampling interval, e.g. 1 means
sample every FOP, 100 means sample every 100th FOP
diagnostics.fop-sample-buf-size - The size (in bytes) of the ring
buffer used to store the samples. In the even more samples
are collected in the stats dump interval than can be held in this buffer,
the oldest samples shall be discarded. Samples are stored in the log
directory under /var/log/glusterfs/samples.
- Uses DNS cache written by sshreyas@fb.com (Thank-you!), the DNS cache
TTL is controlled by the diagnostics.stats-dnscache-ttl-sec option
and defaults to 24hrs.
Test Plan:
- Valgrind'd to ensure it's leak free
- Run prove test(s)
- Shadow testing on 100+ brick cluster
Change-Id: I9ee14c2fa18486b7efb38e59f70687249d3f96d8
BUG: 1271310
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: http://review.gluster.org/12210
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Heal hardlink in the db for already existing data in the cold
tier during attach tier. i.e during fix layout do lookup to files
in the cold tier.
CTR xlator on the brick/server side does db update/insert of the hardlink on a namelookup.
Currently the namedlookup is done synchronous to the fixlayout that is
triggered by attach tier. This is not performant, adding more time to
fixlayout. The performant approach is record the hardlinks on a compressed
datastore and then do the namelookup asynchronously later, giving the ctr db
eventual consistency
Change-Id: I4ffc337fffe7d447804786851a9183a51b5044a9
BUG: 1252586
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/11828
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implementation of xattrop type:
GF_XATTROP_ADD_ARRAY_WITH_DEFAULT
GF_XATTROP_ADD_ARRAY64_WITH_DEFAULT
These operations are similar to 'GF_XATTROP_ADD_ARRAY',
except that it adds a default value if xattr is missing
or its value is zero on disk.
One use-case of this operation is in inode-quota.
When a new directory is created, its default dir_count
should be set to 1. So when a xattrop performed setting
inode-xattrs, it should account initial dir_count
1 if the xattrs are not present
Here is the usage of this operation
value required in xdata for each key
struct array {
int32_t newvalue_1;
int32_t newvalue_2;
...
int32_t newvalue_n;
int32_t default_1;
int32_t default_2;
...
int32_t default_n;
};
or
struct array {
int32_t value_1;
int32_t value_2;
...
int32_t value_n;
} data[2];
fill data[0] with new value to add
fill data[1] with default value
xattrop GF_XATTROP_ADD_ARRAY_WITH_DEFAULT
for i from 1 to n
{
if (xattr (dest_i) is zero or not set in the disk)
dest_i = newvalue_i + default_i
else
dest_i = dest_i + newvalue_i
}
value in xdata after xattrop is successful
struct array {
int32_t dest_1;
int32_t dest_2;
...
int32_t dest_n;
};
Change-Id: Ic6a08473e99fd98299a839d4d8416081a7534efd
BUG: 1243946
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/11702
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GF_XATTROP_GET_AND_SET stores the existing xattr
value in xdata and sets the new value
xattrop was reusing input xattr dict to set the results
instead of creating new dict.
This can be problem for server side xlators as the inout dict
will have the value changed.
Change-Id: I43369082e1d0090d211381181e9f3b9075b8e771
BUG: 1251454
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/11995
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a --resolve-gids commandline option to the glusterfs binary. This
option gets set when executing "mount -t glusterfs -o resolve-gids ...".
This option is most useful in combination with the "acl" mount option.
POSIX ACL permission checking is done on the FUSE-client side to improve
performance (in addition to the checking on the bricks).
The fuse-bridge reads /proc/$PID/status by default, and this file
contains maximum 32 groups. Any local (client-side) permission checking
that requires more than the first 32 groups will fail.
By enabling the "resolve-gids" option, the fuse-bridge will call
getgrouplist() to retrieve all the groups from the user accessing the
mountpoint. This is comparable to how "nfs.server-aux-gids" works.
Note that when a user belongs to more than ~93 groups, the volume option
server.manage-gids needs to be enabled too. Without this option, the
RPC-layer will need to reduce the number of groups to make them fit in
the RPC-header.
Change-Id: I7ede90d0e41bcf55755cced5747fa0fb1699edb2
BUG: 1246275
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/11732
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Access to bad objects (especially operations such as open, readv, writev)
should be denied to prevent applications from getting wrong data.
* Do not allow anyone apart from scrubber to set bad object xattr.
* Do not allow bad object xattr to be removed.
Change-Id: Ia9185a067233a9f26e3d41d41d11d9a4eb0da827
BUG: 1210689
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/11126
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
and poststat atomically
Change-Id: I9b52ddaed4e306e9a49f39c86450c94bea843a7b
BUG: 1233617
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/11345
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
| |
Change-Id: Ie589c866a53c0cfc167e31d1d14a11bf58c8dabf
BUG: 1207829
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11413
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of including config.h in each file, and have the additional
config.h included from the compiler commandline (-include option).
When a .c file tests for a certain #define, and config.h was not
included, incorrect assumtions were made. With this change, it can not
happen again.
BUG: 1222319
Change-Id: I4f9097b8740b81ecfe8b218d52ca50361f74cb64
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/10808
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a file is under migration, any xattrs created on it
are lost post migration of the file. This is because
the xattrs are set only on the cached subvol of the source
and as the source is under migration, it becomes a linkto file
post migration.
Change-Id: Ib8e233b519cf954e7723c6e26b38fa8f9b8c85c0
BUG: 1193636
Signed-off-by: Nithya Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/10212
Tested-by: NetBSD Build System
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The key concept here is to determine whether a directory is "clean" by
comparing its last-known-good topology to the current one for the
volume. These are stored as "commit hashes" on the directory and the
volume root respectively. The volume's commit hash changes whenever a
brick is added or removed, and a fix-layout is done. A directory's
commit hash changes only when a full rebalance (not just fix-layout)
is done on it. If all bricks are present and have a directory
commit hash that matches the volume commit hash, then we can assume
that every file is in its "proper" place. Therefore, if we look for
a file in that proper place and don't find it, we can assume it's not
on any other subvolume and *safely* skip the global (broadcast to all)
lookup.
Change-Id: Id6ce4593ba1f7daffa74cfab591cb45960629ae3
BUG: 1219637
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Signed-off-by: Shyam <srangana@redhat.com>
Reviewed-on: http://review.gluster.org/7702
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1) Provided setfattr command to set timeout for split-brain
choice.
2) If split-brain inspection/resolution is being done
from the mount for a file, ref the inode when
split-brain-choice is set.
This inode will be unconditionally unref-ed after timeout
seconds set by the user/default otherwise.
3) Updated the doc and testcase to reflect the changes.
Change-Id: I15c9037dee28855f21e680e7e3632e1f48dba4e1
BUG: 1209104
Signed-off-by: Anuradha <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/10134
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Replace-brick operation with data migration support have been
deprecated from gluster.
With this fix replace brick command will support only one commad
gluster volume replace-brick <VOLNAME> <SOURCE-BRICK> <NEW-BRICK> {commit force}
Change-Id: Ib81d49e5d8e7eaa4ccb5830cfec2bc081191b43b
BUG: 1094119
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Reviewed-on: http://review.gluster.org/10101
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of "trusted.glusterfs.bit-rot.*" use "trusted.bit-rot.*"
NOTE:
With this patch, data on existing volumes would be resigned
(which should be OK as of now since we do not expect many
users as of now :-))
Change-Id: I926c7bca266a9c8f2cb35d57c4d0359aa5cecfa0
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10181
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Background:
Glusterfs changelogs are stored in each brick, which records the changes
happened in that brick. Georep will run in all the nodes of master and
processes changelogs "independently".
Processing changelogs is in brick level, but all the fops will be replayed
on "slave mount" point.
Problem:
With a DHT volume, in changelog "internal fops" are NOT recorded.
For Rename case, Rename is recorded in "hashed" brick changelog.
(DHT's internal fops like creating linkto file, unlink is NOT recorded).
This lead us to inconsistent rename operations.
For example,
Distribute volume created with Two bricks B1, B2.
//Consider master volume mounted @ /mnt/master
and following operations executed:
cd /mnt/master
touch f1 // f1 falls on B1 Hash
mv f1 f2 // f2 falls on B2 Hash
// Here, Changelogs are recorded as below:
@B1
CREATE f1
@B2
RENAME f1 f2
Here, race exists between Brick B1 and B2, say B2 will get executed first.
Source file f1 itself is "NOT PRESENT", so it will go ahead and create
f2 (Current implementation).
We have this problem When rename falls in another brick and
file is unlinked in Master.
Similar kind of issue exists in following case too(multiple rename):
CREATE f1
RENAME f1 f2
RENAME f2 f1
Solution:
Instead of carrying out "changelogging" at "HASHED volume",
carry out at the "CACHED volume".
This way we have rename operations carried out where actual files are present.
So,Changelog recorded as :
@B1
CREATE f1
RENAME f1 f2
credit: sarumuga@redhat.com
PS: Some of the races as the one below are _NOT_ fixed by this patch
* f1 and f2 exist. B1 and B2 are their respective cached subvols. For
both files hashed-subvol == cached-subvol
* mv f1 f2 on master.
* B1 has change-log entry of rename f1 f2
* rebalance migrates f2 from B1 and B2
* mv f2 f1 on master.
* B2 has change-log entry of rename f2 f1
Since changelog entries (rename f1 f2) and (rename f2 f1) are processed
independently by gsyncds, which of either f1 and f2 survives on slave
is subject to race. Note that on master its file f1 with name f1 which
survived. On slave it can be either file f1 with name f1 or file f2
with name f2 based on who wins the race of processing changelog.
Change-Id: Iebc222f582613924c3a7cba37fb6d3e2d8332eda
BUG: 1141379
Signed-off-by: Nithya Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/10410
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Have added support to send attributes of both entries and
its parent (include oldparent in case of RENAME fop) in the
same notification request to avoid multiple rpc requests.
Also, made changes in gfapi to send parent object and its
attributes changed in a single upcall event.
Change-Id: I92833da3bcec38d65216921c2ce4d10367c32ef1
BUG: 1200262
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Reviewed-on: http://review.gluster.org/10460
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently when quota limit is set, corresponding gfid
is set in quota.conf. This patch supports storing
inode-quota limits in quota.conf and also stores
additional byte for each gfid to differentiate
between usage quota limit and inode quota limit.
Change-Id: I444d7399407594edd280e640681679a784d4c46a
BUG: 1202244
Signed-off-by: vmallika <vmallika@redhat.com>
Signed-off-by: Sachin Pandit <spandit@redhat.com>
Reviewed-on: http://review.gluster.org/10069
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As suggested during the code-review of Bug1200262, have modified
GF_CBK_UPCALL to be exlusively GF_CBK_CACHE_INVALIDATION.
Thus, for any new upcall event, a new CBK procedure will be added.
Also made changes to store upcall data separately based on the
upcall event type received.
BUG: 1200262
Change-Id: I0f5e53d6f5ece16aecb514a0a426dca40fa1c755
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Reviewed-on: http://review.gluster.org/10049
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current patch address two part of the design proposed.
1. Rebalance multiple files in parallel
2. Crawl only bricks that belong to the current node
Brief design explanation for the above two points.
1. Rebalance multiple files in parallel:
-------------------------------------
The existing rebalance engine is single threaded. Hence, introduced
multiple threads which will be running parallel to the crawler. The
current rebalance migration is converted to a "Producer-Consumer"
frame work.
Where Producer is : Crawler
Consumer is : Migrating Threads
Crawler: Crawler is the main thread. The job of the crawler is now
limited to fix-layout of each directory and add the files which are
eligible for the migration to a global queue in a round robin manner
so that we will use all the disk resources efficiently. Hence, the
crawler will not be "blocked" by migration process.
Producer: Producer will monitor the global queue. If any file is
added to this queue, it will dqueue that entry and migrate the file.
Currently 20 migration threads are spawned at the beginning of the
rebalance process. Hence, multiple file migration happens in parallel.
2. Crawl only bricks that belong to the current node:
--------------------------------------------------
As rebalance process is spawned per node, it migrates only the files
that belongs to it's own node for the sake of load balancing. But it
also reads entries from the whole cluster, which is not necessary as
readdir hits other nodes.
New Design:
As part of the new design the rebalancer decides the subvols
that are local to the rebalancer node by checking the node-uuid of
root directory prior to the crawler starts. Hence, readdir won't hit
the whole cluster as it has already the context of local subvols and
also node-uuid request for each file can be avoided. This makes the
rebalance process "more scalable".
Change-Id: I73ed6ff807adea15086eabbb8d9883e88571ebc1
BUG: 1171954
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/9657
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instantiate a process wide global instance of the timer wheel
data structure. Spawning glusterfs* process with option arg
"--global-timer-wheel" instantiates a global instance of
timer-wheel under global context (->ctx).
Translators can make use of this process wide instance [via a
call to glusterfs_global_timer_wheel()] instead of maintaining
an instance of their own and possibly consuming more memory.
Linux kernel too has a single instance of timer wheel where
subsystems such as IO, networking, etc.. make use of.
Bitrot daemon would be early consumers of this: bitrot translator
instances for multiple volumes would track objects belonging to
their respective bricks in this global expiry tracking data
structure. This is also a first step to move GlusterFS timer
mechanism to use timer-wheel.
Change-Id: Ie882df607e07acaced846ea269ebf1ece306d6ae
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10380
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes a handful of problem with scrubber which
are detailed below.
Scrubber used to skip objects for verification due to missing
fd iterface to fetch versioning extended attributes. Similar
to the inode interface, an fd based interface in POSIX is now
introduced.
Moreover, this patch also fixes potential false reporting by
scrubber due to:
An object gets dirtied and signed when scrubber is busy
calculatingobject checksum. This is fixed by caching the
signed version when an object is first inspected for
stalenes, i.e., during pre-compute stage. This version is
used to verify checksum in the post-compute stage when the
signatures are compared for possible corruption.
Side effect of _not_ sending signature length during signing
resulted in "truncated" signature to be set for an object.
Now, at the time of signing, the signature length is sent
and is used in place of invoking strlen() to get signature
length (which could have possible 00s). The signature length
itself is not persisted in the signature xattr, but is
calculated on-the-fly by substracting the xattr length by
the "structure" header size.
Some of the log entries are made more meaningful (as and aid
for debugging).
Change-Id: I938bee5aea6688d5d99eb2640053613af86d6269
BUG: 1207624
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10118
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This also lists the files that are on-going I/O, which
will be fixed later.
Change-Id: Ib3f60a8b7e8798d068658cf38eaef2a904f9e327
BUG: 1203581
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/10020
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bitrot stub implements object versioning required for identifying
signature freshness. More details about versioning is explained
as a part of the "bitrot feature documentation" patch.
Change-Id: I2ad70d9eb109ba4a12148ab8d81336afda529ad9
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9709
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Part 2/2 patch to enable users analyze and resolve
split-brain.
This patch enables :
1) Users to inspect the files in data and metadata split-brain.
2) Resolve the split-brain.
Both using a series of setfattr commands.
Consider a volume "test" with 2 bricks.
1) To inspect a file f1:
setfattr -n replica.split-brain-choice -v test-client-0 f1
After the execution of this command, if no read_subvol
is found, reads will be served from test-client-0 (corresponding
to brick-0).
2) To resolve split-brain :
setfattr -n replica.split-brain-heal-finalize -v test-client-0 f1
Execution of this command will lead to the resolution
of data and metadata split-brain with subvol mentioned in the
command (test-client-0 here) as the source and the rest as sink.
Change-Id: Ia20f3ee5abd3119e3d54fcc599f1e55ac65fd179
BUG: 1191396
Signed-off-by: Anuradha <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/9743
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
**********************************************************************
ChangeTimeRecorder(CTR) Xlator |
**********************************************************************
ChangeTimeRecorder(CTR) is server side xlator(translator) which sits
just above posix xlator. The main role of this xlator is to record the
access/write patterns on a file residing the brick. It records the
read(only data) and write(data and metadata) times and also count on
how many times a file is read or written. This xlator also captures
the hard links to a file(as its required by data tiering to move
files).
CTR Xlator is the consumer of libgfdb.
To Enable/Disable CTR Xlator:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
gluster volume set <volume-name> features.ctr-enabled {on/off}
To Enable/Disable Frequency Counter Recording in CTR Xlator:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
gluster volume set <volume-name> features.record-counters {on/off}
Change-Id: I5d3cf056af61ac8e3f8250321a27cb240a214ac2
BUG: 1194753
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/9935
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
NFS now has the ability to use a separate file for "netgroups" and
"exports". An administrator should have the ability to check the
validity of the files before applying the configuration.
The "glusterfsd" command now has the following additional arguments that
can be used to check the configuration:
--print-netgroups: Validate the netgroups file and print it out
--print-exports: Validate the exports file and print it out
BUG: 1143880
Change-Id: I24c40d50110d49d8290f9fd916742f7e4d0df85f
URL: http://www.gluster.org/community/documentation/index.php/Features/Exports_Netgroups_Authentication
Original-author: Shreyas Siravara <shreyas.siravara@gmail.com>
CC: Richard Wareing <rwareing@fb.com>
CC: Jiffin Tony Thottan <jthottan@redhat.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9365
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
==========================================================================
Inode quota
==========================================================================
= Currently, the only way to retrieve the number of files/objects in a =
= directory or volume is to do a crawl of the entire directory/volume. =
= This is expensive and is not scalable. =
= =
= The proposed mechanism will provide an easier alternative to determine =
= the count of files/objects in a directory or volume. =
= =
= The new mechanism proposes to store count of objects/files as part of =
= an extended attribute of a directory. Each directory's extended =
= attribute value will indicate the number of files/objects present =
= in a tree with the directory being considered as the root of the tree. =
= =
= The count value can be accessed by performing a getxattr(). =
= Cluster translators like afr, dht and stripe will perform aggregation =
= of count values from various bricks when getxattr() happens on the key =
= associated with file/object count. =
A new interface is introduced:
------------------------------
limit-objects : limit the number of inodes at directory level
list-objects : list the directories where the limit is set
remove-objects : remove the limit from the directory
==========================================================================
CLI COMMAND:
gluster volume quota <volname> limit-objects <path> <number> [<percent>]
* <number> is a hard-limit for number of objects limitation for path "<path>"
If hard-limit is exceeded, creation of file/directory is no longer
permitted.
* <percent> is a soft-limit for number of objects creation for path "<path>"
If soft-limit is exceeded, a warning is issued for each creation.
CLI COMMAND:
gluster volume quota <volname> remove-objects [path]
==========================================================================
CLI COMMAND:
gluster volume quota <volname> list-objects [path] ...
Sample output:
------------------
Path Hard-limit Soft-limit Used Available
Soft-limit exceeded?
Hard-limit exceeded?
------------------------------------------------------------------------
--------------------------------------
/dir 10 80% 10 0
Yes
Yes
==========================================================================
[root@snapshot-28 dir]# ls
a b file11 file12 file13 file14 file15 file16 file17
[root@snapshot-28 dir]# touch a1
touch: cannot touch `a1': Disk quota exceeded
* Nine files are created in directory "dir" and directory is included in
* the
count too. Hence the limit "10" is reached and further file creation
fails
==========================================================================
Note: We have also done some re-factoring in cli for volume name
validation. New function cli_validate_volname is created
==========================================================================
Change-Id: I1823497de4f790a2a20ebb1770293472ea33ee2b
BUG: 1190108
Signed-off-by: Sachin Pandit <spandit@redhat.com>
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/9769
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
position in the graph rather than relative (local) to a particular
translator.
Encoding the volume in this way allows a single translator to manage
which brick is currently being scanned for directory entries. Using a
single translator minimizes allocated bits in the d_off. It also allows
multiple DHT translators in the same graph to have a common frame of
reference (the graph position) for which brick is being read. Multiple
DHT translators are needed for the Tiering feature.
The fix builds off a previous change (9332) which removed subvolume
encoding from AFR. The fix makes an equivalent change to the EC
translator.
More background can be found in fix 9332 and gluster-dev discussions [1].
DHT and AFR/EC are responsibile (as before) for choosing which brick to
enumerate directory entries in over the readdir lifecycle.
The client translator receiving the readdir fop encodes the dht_t. It
is referred to as the "leaf node" in the graph and corresponds to the
brick being scanned.
When DHT decodes the d_off, it translates the leaf node to a local
subvolume, which represents the next node in the graph leading to
the brick.
Tracking of leaf nodes is done in common utility functions. Leaf nodes
counts and positional information are updated on a graph switch.
[1] www.gluster.org/pipermail/gluster-devel/2015-January/043592.html
Change-Id: Iaf0ea86d7046b1ceadbad69d88707b243077ebc8
BUG: 1190734
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/9688
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, the only way to retrieve the number of files/objects in a
directory or volume is to do a crawl of the entire directory/volume.
This is expensive and is not scalable.
The new mechanism proposes to store count of objects/files as part of
an extended attribute of a directory. Each directory's extended
attribute value will indicate the number of files/objects present
in a tree with the directory being considered as the root of the tree.
Currently file usage is accounted in marker by doing multiple FOPs
like setting and getting xattrs. Doing this with STACK WIND and
UNWIND can be harder to debug as involves multiple callbacks.
In this code we are replacing current mechanism with syncop approach
as syncop code is much simpler to follow and help us implement inode
quota in an organized way.
Change-Id: Ibf366fbe07037284e89a241ddaff7750fc8771b4
BUG: 1188636
Signed-off-by: vmallika <vmallika@redhat.com>
Signed-off-by: Sachin Pandit <spandit@redhat.com>
Reviewed-on: http://review.gluster.org/9567
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Several features - e.g. encryption, erasure codes, or NSR - involve
multiple cooperating translators which sometimes need a "private" means
of communication amongst themselves. Historically we've used virtual or
synthetic xattrs, but that's not very elegant and clutters up the
getxattr/setxattr path which must also handle real xattr requests. This
new fop should address that.
The only argument is an int32_t "op" which should be recognized by the
target translator. It is recommended that translators using these
feature follow some convention regarding the ops that they define, to
avoid conflicts. Using a hash of the target translator's type string as
a base for a series of ops would probably be a good start. Any other
information can be passed in both directions using xdata.
The default behavior for this fop, as with any other, is to pass through
to FIRST_CHILD. That makes use of this fop "transparent" to other
translators that were written before it existed, but it also means that
it only really works with pass-through translators. If a routing
translator (such as DHT) or a fan-out translator (such as AFR) is
involved, the IPC might not reach its intended destination unless those
translators are modified to forward IPC fops along all paths.
If an IPC gets all the way to storage/posix it is considered an error,
much like an uncaught exception. We don't actually *do* anything in
that case, but we do log it send back an EOPNOTSUPP error. This makes
the "unrecognized opcode" condition distinguishable from the "no IPC
support" condition (which would yield an RPC error instead) so clients
can probe for the presence of a handler for their own favorite opcode
and either use that or use old-school xattrs depending on the result.
BUG: 1158628
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Change-Id: I84af1b17babe5b30ec03ecf027ae37d09b873968
Reviewed-on: http://review.gluster.org/8812
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Framework on the server-side, to handle certain state of the files
accessed and send notifications to the clients connected.
A generic and extensible framework, used to maintain states in
the glusterfsd process for each of the files accessed
(including the clients info doing the fops) and send
notifications to the respective glusterfs clients incase of
any change in that state.
This patch handles "Inode Update/Invalidation" upcall event.
Feature page:
URL: http://www.gluster.org/community/documentation/index.php/Features/Upcall-infrastructure
Below link has a writeup which explains the code changes done -
URL: https://soumyakoduri.wordpress.com/2015/02/25/glusterfs-understanding-upcall-infrastructure-and-cache-invalidation-support/
Change-Id: Ie3d724be9a3419fcf18901a753e8ec2df2ac802f
BUG: 1200262
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Reviewed-on: http://review.gluster.org/9535
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: If341e3c0a559aa5bbca9c1263a241c6592c59706
BUG: 1093594
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/9696
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch is one part to enable users analyze and resolve
split-brain.
Problem : To know if a file is in data/metadata split-brain
Solution : Performing "getfattr -n afr.split-brain-status
<path-to-file>" from the mount provides this information.
Also provides the list of afr children to analyse to
get more information.
Change-Id: I4d9b429794759a906371416cb84c84a212e2c7b9
BUG: 1191396
Signed-off-by: Anuradha <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/9633
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
... from all bricks in the volume
This patch is important in the context of MT epoll. With MT epoll,
notification events from client xlators could reach cluster xlators like
afr, dht, ec, stripe etc. in different orders.
For e.g, In a distributed replicate volume of 2 bricks, namely Brick1
and Brick2, the following network events are observed by a mount
process.
- connection to Brick1 is broken.
- connection to Brick1 has been restored.
- connection to Brick2 is broken.
- connection to Brick2 has been restored.
Without establishing a total ordering of events, we can't guarantee that
cluster xlators like afr, dht perceive them in the same order. While we
would expect afr (say) to perceive it as only one of Brick1 and Brick2
going down at any given time, it is possible for the notification of
Brick2 going offline to race with the notification of Brick1 coming back
online.
Change-Id: I78f5a52bfb05593335d0e9ad53ebfff98995593d
BUG: 1104462
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/9591
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add the ability to configure the number of event threads
for various gluster services.
Currently with the multi thread epoll patch, it is possible
to have more than one thread waiting on socket activity and
processing the same. This thread count is currently static,
which this commit makes dynamic.
The current services which use IO path, i.e brick processes,
any client process (nfs, FUSE, gfapi, heal,
rebalance, etc.a), gain 2 set parameters to control the number
of threads that are processing events. These settings are,
- client.event-threads <n>
- server.event-threads <n>
The client setting affects the client graph consumers, and the
server setting affects the brick processes. These are processed
and inited/reconfigured using the client/server protocol xlators.
Other services (say glusterd) would need to extend similar
configuration settings to take advantage of multi threaded event
processing.
At present glusterd is not enabled with this commit, as it does not
stand to gain from this multi-threading (as I understand it).
Change-Id: Id8422fc57a9f95a135158eb6477ccf9d3c9ea4d9
BUG: 1104462
Signed-off-by: Shyam <srangana@redhat.com>
Reviewed-on: http://review.gluster.org/9488
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Bring in option to disable memory accounting for a glusterfs process
This reverses the changes done by the commit
7fba3a88f1ced610eca0c23516a1e720d75160cd.
* Change the key from "memory-accounting" to "no-memory-accounting", as by
default all the glusterfs process enable memory accounting now. So to
disable memory accounting for some process, "no-mem-accounting" argument has
to be passed.
Change-Id: I39c7cefb0fe764ea3e48f4e73e1305b084c5f497
BUG: 1184366
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/9469
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Extend the AFR heal command to include automated split-brain resolution.
This patch [3/3] is the final patch for afr automated split-brain resolution
implementation.
"gluster volume heal <VOLNAME> [full | statistics [heal-count [replica
<HOSTNAME:BRICKNAME>]] |info [healed | heal-failed | split-brain]| split-brain
{bigger-file <FILE> |source-brick <HOSTNAME:BRICKNAME> [<FILE>]}]"
The new additions being:
1.gluster volume heal <VOLNAME> split-brain bigger-file <FILE>
Locates the replica containing the FILE, selects bigger-file as source and
completes heal.
2.gluster volume heal <VOLNAME> split-brain source-brick <HOSTNAME:BRICKNAME>
<FILE>
Selects <FILE> present in <HOSTNAME:BRICKNAME> as source and completes heal.
3.gluster volume heal <VOLNAME> split-brain <HOSTNAME:BRICKNAME>
Selects all split-brained files in <HOSTNAME:BRICKNAME> as source and completes
heal.
Note: <FILE> can be either the full file name as seen from the root of the
volume (or) the gfid-string representation of the file, which sometimes gets
displayed in the heal info command's output.
Entry/gfid split-brain resolution is not supported.
Example can be found in the test case.
Change-Id: I4649733922d406f14f28ee9033a5cb627b9538b3
BUG: 1136769
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/9377
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A virtual xattr "glusterfs.geo-rep.trigger-sync" is provided
in glusterfs through changelog translator. Geo-rep triggers
a explicit data sync on setting this xattr on a file.
Changelog captures a DATA entry on file's gfid on setting this
virtual xattr on a file. This is supported only for files. It
doesn't support directories.
Usage: setfattr -n glusterfs.geo-rep.trigger-sync <file-path>
Change-Id: Ia689326ac2dcb31035ffbecad2c548eda4eb9245
BUG: 1176934
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/9337
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
gluster volume heal <volname> info command
will now also display if the files listed (in the output
of the command) are in split-brain or possibly being
healed.
This patch also fixes build warning that occurs.
Change-Id: I1fc92e62137f23b2b9ddf6e05819cee6230741d1
BUG: 1163804
Signed-off-by: Anuradha <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/9119
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Also, moved the backtrace fetching logic to a separate function.
Modified the backtrace fetching logic able to work under memory pressure
conditions.
Change-Id: Ie38bea425a085770f41831314aeda95595177ece
BUG: 1138503
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/8584
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Break-way from '/var/lib/glusterd' hard-coded previously,
instead rely on 'configure' value from 'localstatedir'
- Provide 's/lib/db' as default working directory for gluster
management daemon for BSD and Darwin based installations
- loff_t is really off_t on Darwin
- fix-off the warnings generated by clang on FreeBSD/Darwin
- Now 'tests/*' use GLUSTERD_WORKDIR a common variable for all
platforms.
- Define proper environment for running tests, define correct PATH
and LD_LIBRARY_PATH when running tests, so that the desired version
of glusterfs is used, regardless where it is installed.
(Thanks to manu@netbsd.org for this additional work)
Change-Id: I2339a0d9275de5939ccad3e52b535598064a35e7
BUG: 1111774
Signed-off-by: Harshavardhana <harsha@harshavardhana.net>
Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-on: http://review.gluster.org/8246
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
synclock_destory() has been prototyped in syncop.h,
how-ever synclock_destroy() is the actual function used in syncop.c.
Correcting this function definition along with few typos.
Change-Id: I35a818190c1d37c303279ca7a820f01895751bd9
BUG: 1075417
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Reviewed-on: http://review.gluster.org/8266
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-by: soumya k <skoduri@redhat.com>
Reviewed-by: Poornima G <pgurusid@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently whenever dht_lookup_everywhere gets called, if in
dht_lookup_everywhere_cbk, a linkto file is found on non-hashed
subvolume, file is unlinked. But there are cases when this file
is under migration. Under such condition, we should avoid deletion
of file.
When some other rebalance process changes the layout of parent
such that dst_file (w.r.t. migration) falls on non-hashed node,
then may be lookup could have found it as linkto file but just
before unlink, file is under migration or already migrated
In such cased unlink can be avoided.
Race:
-------
If we have two bricks (brick-1 and brick-2) with initial file "a"
under BaseDir which is hashed as well as cached on (brick-1).
Assume "a" hashing gives 44.
Brick-1 Brick-2
Initial Setup: BaseDir/a BaseDir
[1-50] [51-100]
Now add new-brick Brick-3.
1. Rebalance-1 on node Node-1 (Brick-1 node) will reset
the BaseDir Layout.
2. After that it will perform
a) Create linkto file on new-hashed (brick-2)
b) Perform file migration.
1.Rebalance-1 Fixes the base-layout:
Brick-1 Brick-2 Brick-3
--------- ---------- ------------
BaseDir/a BaseDir BaseDir
[1-33] [34-66] [67-100]
2. Only a) is BaseDir/a BaseDir/a(linkto) BaseDir
performed Create linktofile
Now rebalance 2 on node-2 jumped in and it will perform
step 1 and 2-a.
After (rebal-2, step-1), it changes the layout of the BaseDir.
BaseDir/a BaseDir/a(link) BaseDir
[67-100] [1-33] [34-66]
For (rebale-2, step-2), It will perform lookup at Brick-3 as w.r.t new
layout 44 falls for brick-3. But lookup will fail.
So dht_lookup_everywhere gets called.
NOTE: On brick-2 by rebalance-1, a linkto file was created.
Currently that linkto files gets deleted by rebalance-2 lookup as it
is considered as stale linkto file. But with patch if rebalance is
already in progress or rebalance is over, linkto file will not be
unlinked. If rebalance is in progress fd will be open and if rebalance
is over then linkto file wont be set.
Change-Id: I3fee0d28de3c76197325536a9e30099d2413f079
BUG: 1116150
Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
Reviewed-on: http://review.gluster.org/8345
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The feature is controlled by presence of the following file:
/var/lib/glusterd/secure-access
See the comment near the definition of SECURE_ACCESS_FILE in glusterfs.h
for the rationale. With this enabled, the following rules apply to
connections:
UNIX-domain sockets never have SSL.
Management-port sockets (both connecting and accepting, in
daemons and CLI) have SSL based on presence of the file.
Other IP sockets have SSL based on the existing client.ssl and
server.ssl volume options.
Transport multi-threading is explicitly turned off in glusterd (it would
otherwise be turned on when SSL is) due to multi-threading issues.
Tests have been elided to avoid risk of leaving a file which will cause
all subsequent tests to run with management SSL still enabled.
IMPLEMENTATION NOTE
The implementation is a bit messy, and consists of two stages. First we
decide whether to set the relevant fields in our context structure, based
on presence of the sentinel file OR a command-line override. Later we
decide whether a particular connection should actually use SSL, based on the
context flags plus what kind of connection we're making[1] and what kind of
daemon we're in[2].
[1] inbound, outbound to glusterd port, other outbound
[2] glusterd, glusterfsd, other
TESTING NOTE
Instead of just running one special test for this feature, the ideal
would be to run all tests with management SSL enabled. However, it
would be inappropriate or premature to set up an optional feature in the
patch itself. Therefore, the method of choice is to submit a separate
patch on top, which modifies "cleanup" in include.rc to recreate the
secure-access file and associated SSL certificate/key files before each
test.
Change-Id: I0e04d6d08163893e24ec8c031748c5c447d7f780
BUG: 1114604
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: http://review.gluster.org/8094
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|