| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The NFS-server sets EOF only in the READ reply when op_errno is set to
ENOENT. Xlators are expected to set op_errno to ENOENT when EOF is
reached, op_ret will contain the number of bytes returned by the READ.
When an NFS-client (like VMware ESXi) do a READ that exceeds the size of
the file, errno should be set to EOF and the return value contains the
number of bytes that are read (from the requested offset, until the end
of the file). Not setting EOF on a correct short READ, can result in
errors on the NFS-client.
This is not an issue with the Linux NFS-client (or VFS). Linux is smart
enough to not try to read more bytes than the file contains.
BUG: 1209298
Change-Id: Ib15538744908a6001d729288d3e18a432d19050b
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/10142
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Throttle value will be "normal" by default. For throttling down,
a thread will be put in to sleep. And for throttling up,
gf_defrag_process_dir will wake up the sleeping threads.
Change-Id: I74d530e3effd6e60e6eec81ccc8ff65789fa9c13
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/10526
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1) Provided setfattr command to set timeout for split-brain
choice.
2) If split-brain inspection/resolution is being done
from the mount for a file, ref the inode when
split-brain-choice is set.
This inode will be unconditionally unref-ed after timeout
seconds set by the user/default otherwise.
3) Updated the doc and testcase to reflect the changes.
Change-Id: I15c9037dee28855f21e680e7e3632e1f48dba4e1
BUG: 1209104
Signed-off-by: Anuradha <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/10134
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
bitrot scrubber paused/resume command should give proper error messages if
scrubber already pause/resume and user again try to perform same
operation on a volume.
Change-Id: I01ad69c80f03b177535a4e5f1c95ab7709a804b0
BUG: 1210684
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Reviewed-on: http://review.gluster.org/10209
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The generation number for each peerinfo object is unique. It can be used
to find the exact peerinfo object, which is required for peer RPC
notifications.
Using hostname and uuid matching to find peerinfos can return incorrect
peerinfos to be returned in certain cases like multi network peer probe.
This could cause updates to happen to incorrect peerinfos.
Change-Id: Ia0aada8214fd6d43381e5afd282e08d53a277251
BUG: 1215018
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/10495
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Replace-brick operation with data migration support have been
deprecated from gluster.
With this fix replace brick command will support only one commad
gluster volume replace-brick <VOLNAME> <SOURCE-BRICK> <NEW-BRICK> {commit force}
Change-Id: Ib81d49e5d8e7eaa4ccb5830cfec2bc081191b43b
BUG: 1094119
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Reviewed-on: http://review.gluster.org/10101
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As optional feature, during unlink, full path will be recorded.
Changelog Version number to be bumped up to 1.2.
With this patch, parser checks the version number before parsing
and handles accordingly.
Change-Id: Ic1ad98259c39e417029a08e26a1d4b467817e65a
BUG: 1214561
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/10166
Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adding 64 bits in "version" key of extended attributes. First 64 bits (Left)
represents Data version. Last 64 bits (right) represents Meta Data version.
Note: 3.7 and 3.6 version ec can't co-exist with this change because xattrop in
3.6 will fail with ERANGE as the buffer passed to it will be '8' bytes where as
the value will be 16 bytes in 3.7. Where as 3.7 version clients can work with
old version files. For upgrades we need to tell users to complete heals and
then upgrade
BUG: 1215265
Change-Id: Ib85114680cb7e75b8371c984d9f7b6401c1ffb93
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/10312
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
inode quota is a new feature implemented in glusterfs-3.7
if quota is enabled in the older version and is upgraded
to a new version, we can hit setxattr spike during self-heal
of inode quotas. So, when a quota is enabled, turn off
inode-quotas with a xlator option.
With this patch, we still account for inode quotas but only
when a write operation is performed for a particular file.
User will be able to query inode quotas once the Inode-quota
xlator option is enabled.
Change-Id: I52fb28bf7024989ce7bb08ac63a303bf3ec1ec9a
BUG: 1209430
Signed-off-by: vmallika <vmallika@redhat.com>
Signed-off-by: Sachin Pandit <spandit@redhat.com>
Reviewed-on: http://review.gluster.org/10152
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Global option gluster features.ganesha enable
writes into the global 'option' file. The snapshot
feature also writes into the same file.
To handle concurrent multiple transactions correctly,
a new lock has to be introduced on this file.
Every operation using this file needs
to contest for the new lock type.
Change-Id: Ia8a324d2a466717b39f2700599edd9f345b939a9
BUG: 1200254
Signed-off-by: Meghana Madhusudhan <mmadhusu@redhat.com>
Reviewed-on: http://review.gluster.org/10130
Reviewed-by: Avra Sengupta <asengupt@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: soumya k <skoduri@redhat.com>
Tested-by: NetBSD Build System
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: The CTR xlator records file meta (heat/hardlinks)
into the data. This works fine for files which are created
after ctr xlator is switched ON. But for files which were
created before CTR xlator is ON, CTR xlator is not able to
record either of the meta i.e heat or hardlinks. Thus making
those files immune to promotions/demotions.
Solution: The solution that is implemented in this patch is
do ctr-db heal of all those pre-existent files, using named lookup.
For this purpose we use the inode-xlator context variable option
in gluster.
The inode-xlator context variable for ctr xlator will have the
following,
a. A Lock for the context variable
b. A hardlink list: This list represents the successful looked
up hardlinks.
These are the scenarios when the hardlink list is updated:
1) Named-Lookup: Whenever a named lookup happens on a file, in the
wind path we copy all required hardlink and inode information to
ctr_db_record structure, which resides in the frame->local variable.
We dont update the database in wind. During the unwind, we read the
information from the ctr_db_record and ,
Check if the inode context variable is created, if not we create it.
Check if the hard link is there in the hardlink list.
If its not there we add it to the list and send a update to the
database using libgfdb.
Please note: The database transaction can fail(and we ignore) as there
already might be a record in the db. This update to the db is to heal
if its not there.
If its there in the list we ignore it.
2) Inode Forget: Whenever an inode forget hits we clear the hardlink list in
the inode context variable and delete the inode context variable.
Please note: An inode forget may happen for two reason,
a. when the inode is delete.
b. the in-memory inode is evicted from the inode table due to cache limits.
3) create: whenever a create happens we create the inode context variable and
add the hardlink. The database updation is done as usual by ctr.
4) link: whenever a hardlink is created for the inode, we create the inode context
variable, if not present, and add the hardlink to the list.
5) unlink: whenever a unlink happens we delete the hardlink from the list.
6) mknod: same as create.
7) rename: whenever a rename happens we update the hardlink in list. if the hardlink
was not present for updation, we add the hardlink to the list.
What is pending:
1) This solution will only work for named lookups.
2) We dont track afr-self-heal/dht-rebalancer traffic for healing.
Change-Id: Ia4bbaf84128ad6ce8c3ddd70bcfa82894c79585f
BUG: 1212037
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/10370
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: Idaae234b9e81c40040393e748db1f61363a48ed0
BUG: 1211913
Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
Reviewed-on: http://review.gluster.org/10250
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of "trusted.glusterfs.bit-rot.*" use "trusted.bit-rot.*"
NOTE:
With this patch, data on existing volumes would be resigned
(which should be OK as of now since we do not expect many
users as of now :-))
Change-Id: I926c7bca266a9c8f2cb35d57c4d0359aa5cecfa0
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10181
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Problem: In case a defrag is null, going to out section will crash rebalance.
Change-Id: I8b3ee1ad85dc23ef0e2f2dd6f912d07216bd619f
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/10582
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I4cc060710482de8633141170dd35f669f01f639b
BUG: 1207615
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/10528
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Multi-Head NFS-Ganesha servers need upcall (cache-invalidation)
support to notify them in case of any changes to the files in the backend.
Hence, upcall xlator option "features.cache-invalidation" needs to be enabled
when ganesha.enable is set to 'on'. Similarly, this feature needs
to be disabled when ganesha.enable is set to 'off'
Signed-off-by: Meghana Madhusudhan <mmadhusu@redhat.com>
Change-Id: Ifdd1d50e48a2bd2a388f73c0b9e318c6092ac190
BUG: 1213752
Reviewed-on: http://review.gluster.org/10581
Reviewed-by: soumya k <skoduri@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a read IO occurs against a file that has reached rebalance
phase 2, we redirect the IO to the destination. For tiered
volumes, when we try to reopen the file (on the destination),
the lower level DHT receives the open call and fails; it does
not have a "cached subvol". Fix is to "teach" the lower level
DHT of the new location by sending a locate before the open.
Change-Id: Ia4acb0035ff1da15f6a8f9ed54f43c76e8b98f5f
BUG: 1214048
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Signed-off-by: root <root@gprfs018.sbu.lab.eng.bos.redhat.com>
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/10324
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Background:
Glusterfs changelogs are stored in each brick, which records the changes
happened in that brick. Georep will run in all the nodes of master and
processes changelogs "independently".
Processing changelogs is in brick level, but all the fops will be replayed
on "slave mount" point.
Problem:
With a DHT volume, in changelog "internal fops" are NOT recorded.
For Rename case, Rename is recorded in "hashed" brick changelog.
(DHT's internal fops like creating linkto file, unlink is NOT recorded).
This lead us to inconsistent rename operations.
For example,
Distribute volume created with Two bricks B1, B2.
//Consider master volume mounted @ /mnt/master
and following operations executed:
cd /mnt/master
touch f1 // f1 falls on B1 Hash
mv f1 f2 // f2 falls on B2 Hash
// Here, Changelogs are recorded as below:
@B1
CREATE f1
@B2
RENAME f1 f2
Here, race exists between Brick B1 and B2, say B2 will get executed first.
Source file f1 itself is "NOT PRESENT", so it will go ahead and create
f2 (Current implementation).
We have this problem When rename falls in another brick and
file is unlinked in Master.
Similar kind of issue exists in following case too(multiple rename):
CREATE f1
RENAME f1 f2
RENAME f2 f1
Solution:
Instead of carrying out "changelogging" at "HASHED volume",
carry out at the "CACHED volume".
This way we have rename operations carried out where actual files are present.
So,Changelog recorded as :
@B1
CREATE f1
RENAME f1 f2
credit: sarumuga@redhat.com
PS: Some of the races as the one below are _NOT_ fixed by this patch
* f1 and f2 exist. B1 and B2 are their respective cached subvols. For
both files hashed-subvol == cached-subvol
* mv f1 f2 on master.
* B1 has change-log entry of rename f1 f2
* rebalance migrates f2 from B1 and B2
* mv f2 f1 on master.
* B2 has change-log entry of rename f2 f1
Since changelog entries (rename f1 f2) and (rename f2 f1) are processed
independently by gsyncds, which of either f1 and f2 survives on slave
is subject to race. Note that on master its file f1 with name f1 which
survived. On slave it can be either file f1 with name f1 or file f2
with name f2 based on who wins the race of processing changelog.
Change-Id: Iebc222f582613924c3a7cba37fb6d3e2d8332eda
BUG: 1141379
Signed-off-by: Nithya Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/10410
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the patch http://review.gluster.org/#/c/9657
the client pid set by tiering migration was getting over-
written in dht_start_rebalance_task(). Just corrected it
in dht_setxattr() before calling dht_start_rebalance_task()
and removed it from dht_start_rebalance_task().
Change-Id: I37cfa111f83a4e5d498042575c93799f60b49870
BUG: 1217937
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/10502
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Susant Palai <spalai@redhat.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Do not allow attach or detach tier to work if any of the nodes
is running gluster code < 3.7.
Change-Id: Id9af8f4057f6fad9cb703ec7645bc01eccb11fc1
BUG: 1218287
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/10531
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we attach a tier, the hot tier becomes the hashed
subvolume. But directories may not yet have been replicated by
the fix layout process. Hence lookups to those directories
will fail on the hot subvolume. We should only go to the hashed
subvolume once the layout has been fixed. This is known if the
layout for the parent directory does not have an error. If
there is an error, the cold tier is considered the hashed
subvolume. The exception to this rules is ENOCON, in which
case we do not know where the file is and must abort.
Note we may revalidate a lookup for a directory even if the
inode has not yet been populated by FUSE. This case can
happen in tiering (where one tier has completed a lookup
but the other has not, in which case we revalidate one tier
when we call lookup the second time). Such inodes are
still invalid and should not be consulted for validation.
Change-Id: Ia2bc62e1d807bd70590bd2a8300496264d73c523
BUG: 1214289
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/10435
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Have added support to send attributes of both entries and
its parent (include oldparent in case of RENAME fop) in the
same notification request to avoid multiple rpc requests.
Also, made changes in gfapi to send parent object and its
attributes changed in a single upcall event.
Change-Id: I92833da3bcec38d65216921c2ce4d10367c32ef1
BUG: 1200262
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Reviewed-on: http://review.gluster.org/10460
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CID: 1293504 (Calling xlator_set_option without checking return value )
CID: 1293502 (Dereferencing a pointer that might be null xl when calling
xlator_set_option)
CID: 1293500 (Assigning value from dict_get_int32(dict, "type", &type)
to ret here, but that stored value is overwritten before
it can be used.)
Change-Id: I5314fb399480df70bd77bc374e3b573f2efd5710
BUG: 1093692
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Reviewed-on: http://review.gluster.org/10201
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Kaushal M <kaushal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Discussion in gluster-devel
http://www.gluster.org/pipermail/gluster-devel/2015-April/044301.html
MASTER NODE - Master Volume Node
MASTER VOL - Master Volume name
MASTER BRICK - Master Volume Brick
SLAVE USER - Slave User to which Geo-rep session is established
SLAVE - <SLAVE_NODE>::<SLAVE_VOL> used in Geo-rep Create command
SLAVE NODE - Slave Node to which Master worker is connected
STATUS - Worker Status(Created, Initializing, Active, Passive, Faulty,
Paused, Stopped)
CRAWL STATUS - Crawl type(Hybrid Crawl, History Crawl, Changelog Crawl)
LAST_SYNCED - Last Synced Time(Local Time in CLI output and UTC in XML output)
ENTRY - Number of entry Operations pending.(Resets on worker restart)
DATA - Number of Data operations pending(Resets on worker restart)
META - Number of Meta operations pending(Resets on worker restart)
FAILURES - Number of Failures
CHECKPOINT TIME - Checkpoint set Time(Local Time in CLI output and UTC
in XML output)
CHECKPOINT COMPLETED - Yes/No or N/A
CHECKPOINT COMPLETION TIME - Checkpoint Completed Time(Local Time in CLI
output and UTC in XML output)
XML output:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
cliOutput>
geoRep>
volume>
name>
sessions>
session>
session_slave>
pair>
master_node>
master_brick>
slave_user>
slave/>
slave_node>
status>
crawl_status>
entry>
data>
meta>
failures>
checkpoint_completed>
master_node_uuid>
last_synced>
checkpoint_time>
checkpoint_completion_time>
BUG: 1212410
Change-Id: I944a6c3c67f1e6d6baf9670b474233bec8f61ea3
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/10121
Tested-by: NetBSD Build System
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add logic in afr to work in conjunction with the arbiter xlator when a
replica 3 arbiter volume is created. More specifically, this patch:
* Enables full locks for afr data transaction for such volumes.
* Removes the upfront marking of pending xattrs at the time of pre-op
and defer it to post-op. (This is an arbiter independent change and is made for all afr transactions.)
* After pre-op stage, check if we can proceed with the fop stage without
ending up in split-brain by examining the changelog xattrs.
* Unwinds the fop with failure if only one source was available at the
time of pre-op and the fop happened to fail on particular source brick.
* Skips data self-heal if arbiter brick is the only source available.
* Adds the arbiter-count option to the shd graph.
This patch is a part of the arbiter logic implementation for 3 way AFR
details of which can be found at http://review.gluster.org/#/c/9656/
Change-Id: I9603db9d04de5626eb2f4d8d959ef5b46113561d
BUG: 1199985
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/10258
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I25f3536446798ea1cffd6b5dfbb3d2398766fcf3
BUG: 1194640
Signed-off-by: Nandaja Varma <nandaja.varma@gmail.com>
Reviewed-on: http://review.gluster.org/9808
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Avra Sengupta <asengupt@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If both dicts are NULL then equal. If one of the dicts is NULL but the other
has only ignorable keys then also they are equal. If both dicts are non-null
then check if for each non-ignorable key, values are same or not. value_ignore
function is used to skip comparing values for the keys which must be present in
both the dictionaries but the value could be different.
geo-rep's stime xattr doesn't need to be present in list xattr but when
getxattr comes on stime xattr even if there aren't enough responses with the
xattr we should still give out an answer which is maximum of the stimes
available.
Change-Id: I8de2ceaa2db785b797f302f585d88e73b154167d
BUG: 1207712
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/10078
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ia7d43cb3b222db34ecb0e35424f1766715ed8e6a
BUG: 1188242
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/10176
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To cleanup expired client entries (with access_time > 2*CACHE_INVALIDATION_TIMEOUT),
have
* defined a global list to contain all the upcall_inode_ctx allocated
* Every time a upcall_inode_ctx is allocated, it is added to the global list
* during inode_forget, that upcall_inode_ctx is marked for destroy
* created a reaper thread which scans through that list
* cleans up expired client entries
* frees the inode_ctx with destroy_mode set.
Note: This reaper thread is initialized only when features.cache_invalidation option
is enabled.
Change-Id: Iea2a63eb31b8e08d5709e7e090cf26fd13d01265
BUG: 1200267
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Reviewed-on: http://review.gluster.org/10342
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PROBLEM:
There is no way to get the path of deleted file if we
have gfid from changelog since the file is already deleted.
SOLUTION:
Do a recursive readlink on parent gfid in backend .glusterfs
path to get the complete path in I/O callpath in changelog
translator and capture it in callback.
The path captured is relative from the brick root. The field
separator used is '\0'.
e.g.,
......\0<pgfid>/bname\0<relative-path>\0<next-record>
ADDITIONAL REQUIRED CHANGES:
1. The changelog translator option called "changelog.capture-del-path"
is introduced to enable or disable the capturing of deleted entry
path.
e.g.,
gluster vol set <vol-name> changelog.capture-del-path on/off
If capture-del-path is disabled, '\0' is captured instead of
relative path.
e.g.,
......\0<pgfid>/bname\0\0\0<next-record>
2. The minor number in the version of changelog is bumped up from v1.1
to v1.2.
3. If recursive readlink is failed for some reason, it will capture
\0 in place of <relative path>.
e.g.,
......\0<pgfid>/bname\0\0\0<next-record>
(same as when caputre-del-path option is disabled)
4. If bname argument passed to "resolve_pargfid_to_path" function
is NULL and pargfid is ROOT, "." is returned. This is not the
case with changelog, where bname is always passed. This is
applicable to other consumers of "resolve_pargfid_to_path"
routine.
NOTE:
Changelog parser should consider the above new changes
and should parse accordingly.
Change-Id: I040ed429b5aa7d391033fc6a540edbf07fc37827
BUG: 1214561
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/10288
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: NetBSD Build System
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With gluster-3.7, 'performance.readdir-ahead' will be enabled by default on
new volumes when the cluster op-version supports it.
Change-Id: I44e76a69e7d1c11e6dfad72c941caf887bb810ee
BUG: 1216187
Signed-off-by: anand <anekkunt@redhat.com>
Reviewed-on: http://review.gluster.org/10433
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The structure 'rpcsvc_state', which maintains rpc server
state had no separate pointer to track the translator.
It was using the mydata pointer itself. So callers were
forced to send xlator pointer as mydata which is opaque
(void pointer) by function prototype.
'rpcsvc_register_init' is setting svc->mydata with xlator
pointer. 'rpcsvc_register_notify' is overwriting svc->mydata
with mydata pointer. And rpc interprets svc->mydata as
xlator pointer internally. If someone passes non xlator
structure pointer to rpcsvc_register_notify as libgfchangelog
currently does, it might corrupt mydata. So interpreting opaque
mydata as xlator pointer is incorrect as it is caller's choice
to send mydata as any type of data to 'rpcsvc_register_notify'.
Maintaining two different pointers in 'rpcsvc_state' for xlator
and mydata solves the issue.
Change-Id: I7874933fefc68f3fe01d44f92016a8e4e9768378
BUG: 1215161
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/10366
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently when quota limit is set, corresponding gfid
is set in quota.conf. This patch supports storing
inode-quota limits in quota.conf and also stores
additional byte for each gfid to differentiate
between usage quota limit and inode quota limit.
Change-Id: I444d7399407594edd280e640681679a784d4c46a
BUG: 1202244
Signed-off-by: vmallika <vmallika@redhat.com>
Signed-off-by: Sachin Pandit <spandit@redhat.com>
Reviewed-on: http://review.gluster.org/10069
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently glupy files resides in gluster namespace of python site packages.
The other projects like libgfapi-python ..etc are evolving and need to share
the gluster namespace. The current structure makes things difficult as all
subpackages have its own __init__ files and other files.
One subpackage can not any more own gluster namespace.
The attempt is to make below structure for gluster namespace so that
it is more portable and scalable for future use.
<sitepackages>/gluster/
|
-- __init__.py
|
|
-- glupy
|
-- __init__.py
-- glupy.py
-- ........
|
|
-- gfapi
|
-- __init__.py
-- gfapi.py
-- ........
By above structure clients can import:
>>> from gluster import glupy
>>> from gluster import gfapi
libgfapi-python project has been moved to this structure via
http://review.gluster.org/#/c/9668/
Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com>
Change-Id: I54886200ddb6a4153a74d9e187aeca7cad79ef9e
BUG: 1211900
Reviewed-on: http://review.gluster.org/10248
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix ignoring geo-rep safe errors in fuse layer
and also ignore logging in client translator
for mknod. Though it is rare, to happen with
mknod, it might happen with history crawl on
overlapping changelogs replay.
Change-Id: I7e145cd1dc53f04d444ad2e68e66e648be448e61
BUG: 1210562
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/10422
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Command gluster volume status <VOLNAME> should show the status of bitrot
and scrubber daemon and its pid information.
Along with displaying bitrot and scrubber daemon information in gluster
volume status command there should be command to show its individual status
separately.
Command to show individual status of bitrot and scrubber daemon will
following.
command to show only bitd daemon information will be
gluster volume status <VOLNAME> bitd
command to show only scrubber daemon information
gluster volume status <VOLNAME> scrub
Change-Id: Id86aae1156c8c599347c98e2a538f294d37376e4
BUG: 1209752
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Reviewed-on: http://review.gluster.org/10175
Reviewed-by: Kaushal M <kaushal@redhat.com>
Tested-by: Kaushal M <kaushal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ifbba6f340adfe2b4e3ad07260fbf4a25698ad8df
BUG: 1217949
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/10459
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When calling dlopen() for libgfdb, do not specify the library version
number "libgfdb.so.0.0.1", since libtool will not always create libraries
or link with that name with the full 3-digit version. For instance on
NetBSD only up to the 2-digit version is available and "libgfdb.so.0.0.1"
does not exist.
Instead, just specify "libgfdb.so" and rely on smymlinks installed by
libtool to find the relevant library.
BUG: 1129939
Change-Id: I074b1009d3622a122fdaeb4b99658bca3277e211
Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-on: http://review.gluster.org/10407
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently whatever bitrot/scrubber tunable value user set for one
volume that value is considering for all other volumes also.
Each volume should act on their respective bitrot/scrubber tunable
value.
For handling bitrot/scrubber tunable value independently with respect
to all the volume bitrot and scrubber translator should run seperatly
for each volume.
Change-Id: I1d9379508afe6cfd2f78e3ebf29c829c362d84a9
BUG: 1170075
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Reviewed-on: http://review.gluster.org/10352
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During mount, NFS directly calls stat on the root of the volume
without sending a lookup on it. This was causing inode_ctx_get_block_size()
to fail on /. A check is now added in [f]stat which would ensure no action
is taken by shard xlator when the operation is on a directory.
Change-Id: I81849eeddfdad9f271155442408d95b4a25d7647
BUG: 1207615
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/10427
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I968980dc4df458ec427e33503363bbd017e1163e
BUG: 1200271
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Reviewed-on: http://review.gluster.org/10194
Tested-by: NetBSD Build System
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Earlier, both chagelog on/off and brick restart were considered
to be changelog breakage and treated as changelog not being
continuous. As a result, new HTIME.TSTAMP file was created on
both the above cases. Now the change is made such that only
on changelog enable/disable, the changelog is considered to be
discontinuous. New HTIME.TSTAMP file is not created on brick
restart, the changelogs files are appended to last HTIME.TSTAMP
file.
Treating changelog as continuous in above scenario is important
as changelog history API will fail otherwise. It can successfully
get changes between start and end timestamps only when changelog
is continuous (Changelogs in single HTIME.TSTAMP file are treated
as continuous). Without this change, changelog history API would
fail, and it would become necessary to fallback to other mechanisms
like xsync FSCrawl in case geo-rep to detect changes in this time
window. But Xsync FSCrawl would not be applicable to other
consumers like glusterfind.
Rationale:
1. In plain distributed volume, if brick goes down, no I/O can
happen onto the brick. Hence changelog is intact with data
on disk.
2. In distributed replicate volume, if brick goes down, since
self-heal traffic is captured in changelog. Eventually,
I/O happened whend brick down is captured in changelog.
Change-Id: I2eb66efe6ee9a9228fb1fcb38d6e7696b9559d5b
BUG: 1211327
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/10222
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I1a90ad6669c1cb79aaae6b4bd9621c75d9985c8a
BUG: 1207615
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/10446
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
sssd uses 300 seconds by default too. There is no need to overload sssd
with requests that it would have cached.
BUG: 1215187
Change-Id: I3f04ea8cc90180d863253a9f46d62b71810a7b34
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/10371
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1) ctr_link_consistency option for ctr xaltor is provided so that
the user can choose to switch it on or off.
/* For link consistency we do a double update i.e mark the link
* during the wind and during the unwind we update/delete the link.
* This has a performance hit. We give a choice here whether we need
* link consistency to be spoton or not using link_consistency flag.
* This will have only one link update */
2) In delete the wind time recording is moved to unwind path.
/* Special performance case:
* Updating wind time in unwind for delete. This is done here
* as in the wind path we will not know whether its the last
* link or not. For a last link there is not use to update any
* wind or unwind time!*/
Change-Id: I209472fb816f939db4a868b97ba053b028f17ea6
BUG: 1217786
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/10170
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The quotad's graph generation was happening wrongly for
tiered volume. The check is been inserted.
Change-Id: I5554bc5280b0fbaec750e9008fdd930ad53a774f
BUG: 1214219
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/10474
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a follow up patch for http://review.gluster.org/#/c/10080
In the above, the suggested change in
http://review.gluster.org/#/c/10080/7/xlators/cluster/dht/src/dht-rebalance.c
doesnot work. The reason it doesnt work is promotion and demotion are done in
a multithread way. Whenever a promotion or demotion thread is called, the frame
of the old sync_op thread is not carried with it. As a result the frame->root->pid
is not set.
Solution:
When the file is getting migrated, we get a tiering.migration key_value in the
xattr dict, so that we pass this dic key-value when we do syncop_setxattr()
to do data migration and set the frame->root->pid GF_CLIENT_PID_TIER_DEFRAG
in dht_setxattr() just before calling dht_start_rebalance_task().
Change-Id: I86fef2d961b32fdd2c0c69d8512cbe846b393404
BUG: 1194753
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/10266
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Reviewed-by: Susant Palai <spalai@redhat.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Coverity IDs:
1214630
1214631
1214633
1234643
Change-Id: I172c4f49bf651b2324522f9e661023f73ca05339
BUG: 789278
Signed-off-by: Nandaja Varma <nvarma@redhat.com>
Reviewed-on: http://review.gluster.org/9557
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Sakshi Bansal
Reviewed-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For GF_CBK_CACHE_INVALIDATION, have changed the type of gfid
to be string (cannonical form) instead of opaque byte format
to ensure correctness across platforms supporting different
endianness.
BUG: 1200262
Change-Id: Iac4372714f4b4ebcd9c4393aaf46ceba3f37f587
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Reviewed-on: http://review.gluster.org/10224
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As suggested during the code-review of Bug1200262, have modified
GF_CBK_UPCALL to be exlusively GF_CBK_CACHE_INVALIDATION.
Thus, for any new upcall event, a new CBK procedure will be added.
Also made changes to store upcall data separately based on the
upcall event type received.
BUG: 1200262
Change-Id: I0f5e53d6f5ece16aecb514a0a426dca40fa1c755
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Reviewed-on: http://review.gluster.org/10049
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|