| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is to prevent the possibility of a deadlock when
rpc_connection_cleanup being called in the same thread as rpc_clnt_unref
Change-Id: Ia4dcc0a8a6e6158d4ddec68b780fccbc4cd64adb
BUG: 962619
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/5321
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
New options being introduced in the master branch should now have
op-version set to the GD_OP_VERSION_MAX (3). Some of the options have
been backported to release-3.3 branch and hence should have their
op-version reduced. Some other options had op-version incorrectly set as
1.
Change-Id: If40325b7b2da7aa36f90261024117cd18cf51ef0
BUG: 981278
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/5318
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: If47e209cb61ea0eb74ee2d6ef9e9342b2d6ee13a
BUG: 980838
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/5261
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Goal of this health-checker is to detect fatal issues of the underlying
storage that is used for exporting a brick. The current implementation
requires the filesystem to detect the storage error, after which it will
notify the parent xlators and exit the glusterfsd (brick) process to
prevent further troubles.
The interval the health-check runs can be configured per volume with the
storage.health-check-interval option. The default interval is 30
seconds.
It is not trivial to write an automated test-case with the current
prove-framework. These are the manual steps that can be done to verify
the functionality:
- setup a Logical Volume (/dev/bz970960/xfs) and format is as XFS for
brick usage
- create a volume with the one brick
# gluster volume create failing_xfs glufs1:/bricks/failing_xfs/data
# gluster volume start failing_xfs
- mount the volume and verify the functionality
- make the storage fail (use device-mapper, or pull disks)
# dmsetup table
..
bz970960-xfs: 0 196608 linear 7:0 2048
# echo 0 196608 error > dmsetup-error-target
# dmsetup load bz970960-xfs dmsetup-error-target
# dmsetup resume bz970960-xfs
# dmsetup table
...
bz970960-xfs: 0 196608 error
- notice the errors caught by syslog:
Jun 24 11:31:49 vm130-32 kernel: XFS (dm-2): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 5 buf count 512
Jun 24 11:31:49 vm130-32 kernel: XFS (dm-2): I/O Error Detected. Shutting down filesystem
Jun 24 11:31:49 vm130-32 kernel: XFS (dm-2): Please umount the filesystem and rectify the problem(s)
Jun 24 11:31:49 vm130-32 kernel: VFS:Filesystem freeze failed
Jun 24 11:31:50 vm130-32 GlusterFS[1969]: [2013-06-24 10:31:50.500674] M [posix-helpers.c:1114:posix_health_check_thread_proc] 0-failing_xfs-posix: health-check failed, going down
Jun 24 11:32:09 vm130-32 kernel: XFS (dm-2): xfs_log_force: error 5 returned.
Jun 24 11:32:20 vm130-32 GlusterFS[1969]: [2013-06-24 10:32:20.508690] M [posix-helpers.c:1119:posix_health_check_thread_proc] 0-failing_xfs-posix: still alive! -> SIGTERM
- these errors are in the log of the brick as well:
[2013-06-24 10:31:50.500607] W [posix-helpers.c:1102:posix_health_check_thread_proc] 0-failing_xfs-posix: stat() on /bricks/failing_xfs/data returned: Input/output error
[2013-06-24 10:31:50.500674] M [posix-helpers.c:1114:posix_health_check_thread_proc] 0-failing_xfs-posix: health-check failed, going down
[2013-06-24 10:32:20.508690] M [posix-helpers.c:1119:posix_health_check_thread_proc] 0-failing_xfs-posix: still alive! -> SIGTERM
- the glusterfsd process has exited correctly:
# gluster volume status
Status of volume: failing_xfs
Gluster process Port Online Pid
------------------------------------------------------------------------------
Brick glufs1:/bricks/failing_xfs/data N/A N N/A
NFS Server on localhost 2049 Y 1897
Change-Id: Ic247fbefb97f7e861307a5998a9a7a3ecc80aa07
BUG: 971774
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/5176
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I40eec20ca6b3f857245a2438883822e251077ee9
BUG: 979365
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/5269
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Check if a previous remove-brick operation has been committed before
starting a new rebalance/remove-brick task.
Change-Id: I553e5ba64a6a352ca91032ab1a17997051a4494e
BUG: 963541
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/5019
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Duplicate request cache provides a mechanism for detecting
duplicate rpc requests from clients. DRC caches replies
and on duplicate requests, sends the cached reply instead of
re-processing the request.
Change-Id: I3d62a6c4aa86c92bf61f1038ca62a1a46bf1c303
BUG: 847624
Signed-off-by: Rajesh Amaravathi <rajesh@redhat.com>
Reviewed-on: http://review.gluster.org/4049
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Making the glusterd_store_* functions re-usable will help with future
changes that need to read/write lists of items.
BUG: 904065
Change-Id: I99fb8eced76d12d5a254567eccff9790b43d8da3
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/4676
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ia8e1af082078f2f791708ba4faa4992bf291dd6e
BUG: 961339
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/5023
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
rpc_transport object, which is part of rpc_clnt, is destroyed
prematurely. This is because, rpc_transport object is ref'd by socket
layer and rpc layer. These ref's, until the synctask'izing of
operations, were unref'd sequentially in the epoll thread.
With more threads at play, the sequential unref guarantee is off.
Fix:
Shutting down the transport before proceeding with cleaning up of
rpc_clnt object would serialize the unref's on the rpc_transport object
and thus eliminating the race.
Also, we don't store the address of brickinfo in brick's rpc notify
function, to avoid the possibility of referring a freed brickinfo.
Instead we use a string based id to 'reach' the corresponding brickinfo.
Change-Id: If2739e2eeaee1e8b071ab2b6754b7ea0f81cfceb
BUG: 962619
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/5000
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. Option to disable or enable acl with nfs.acl boolean
option.
2. Deregister the acl service with the portmapper service
when no longer required.
Change-Id: I6562b6b40138d040aa2bf1e5641f4c0e0e9f9d09
BUG: 970070
Signed-off-by: Rajesh Amaravathi <rajesh@redhat.com>
Reviewed-on: http://review.gluster.org/5136
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
store being glusterd's persistent store under /var/lib/glusterd/
Change-Id: I1c01a09a8ce4a73ea612f05e7f14d4ab39ad1628
BUG: 971796
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/5177
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For unix path based sockets, the socket path is
cryptic (md5sum of path) and may not be useful for
the user in debugging so log it in DEBUG.
Changed logging in brick_rpc_notify to log brickinfo
for disconnects.
Change-Id: I69174bbbbde8352d38837723e950ad8fc15232aa
BUG: 963153
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/5009
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Usage: gluster system:: uuid get
This is needed since we generate uuid of a node in a lazy manner. ie, we
generate a uuid for the node only on the first volume or peer operation,
when the node needs an external identity. With this command, we can
force[1] the uuid generation, without a volume or peer operation performed.
[1]: Querying for uuid (or uuid get), forces uuid to come into
existence.
Change-Id: I62c8b6754117756aa4d773dd48af4ddeb1a1d878
BUG: 971661
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/5175
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug(967445): The default value for all nfs options is displayed as "(null)"
Fix: Changed nfs options to show default value.
Change-Id: I3b1f27439c19a6655f7dcc7891df40706db9e474
BUG: 967445
Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-on: http://review.gluster.org/5098
Reviewed-by: Santosh Pradhan <spradhan@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
If a remove-brick start is executed on m1 with brick from m2
on local subvolume no rebalance process is launched. Because of
this volinfo->rebal.op is not set. This leads to volume status
failures.
Fix:
Set rebal.op even when the reblance process is not started.
Change-Id: I71c7e6f09353be14c1e8edca3c8685ebfdf226d6
BUG: 964059
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/5030
Reviewed-by: Kaushal M <kaushal@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Only those peers which were locked need to be unlocked.
* Fix location of collating errors in callbacks. The callback functions
could miss collating errors if there was an rpc error.
Change-Id: Ie27c2f1ec197da4f5077a4d6e032127954ce87cd
BUG: 948686
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/5087
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I9e2743ab61c8baee92a1dfd376ec4bb145776176
BUG: 963524
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/5016
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Volume op-versions calculations now take into account if an option,
a. enables/disables an xlator, or
b. is a boolean option.
This prevents op-versions from being updated when a feature is disabled.
Also, correctly close the dynamically loaded xlators in
xlator_volopt_dynload() and prevent leaks.
Change-Id: I895ddeeec6f6a33e509325f0ce6f01b7aad3cf5c
BUG: 954256
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/4952
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch enables the open-behind by default only when the op-version
allows it. Also the volume op-version calculations take account of this
enablement.
Change-Id: Idf7a3c274ec4828aafc815cdd1df829ecb853354
BUG: 954256
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/4866
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I4fc3c5c829adca256bb131f4a2722abc95741158
BUG: 963665
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/5020
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
glusterd could deadlock after a peer-detach command as follows,
1) glusterd_friend_cleanup function 'flushes' out messages in the rpc
layer's queue, that haven't received a response. At this point, glusterd
has already acquired the big lock.
2) The side-effect of flushing out the messages is that the
corresponding call backs are called.
Call backs themselves are executed after acquiring the big lock. This
results in the big lock being acquired in a nested manner (in the same
thread), which causes
a deadlock.
This can also happen during brick/NFS/SHD disconnect in volume-stop.
Change-Id: Iab3aad143cd8ebbab53ea0b69687f0e7627dc8a9
BUG: 965533
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/5061
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
... in case of volume op failure on remote host
Change-Id: I7177dc02369dffa82f217496559532d18b7c7c7a
BUG: 963628
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/5018
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change removes the asymmetry in the 'layer' (read rpc, transport
etc) in which transport options were being filled for inet and unix
sockets.
Change-Id: Iaa080691fd5e4c3baedffa97e9c3f16642c1fc12
BUG: 955919
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/4850
Reviewed-by: Raghavendra G <raghavendra@gluster.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PROBLEM:
glusterd logs coming from glusterd_xfer_friend_add_resp() (wrongly)
indicate that a node responded to itself, although it actually
responded to one of its peers.
FIX:
Make glusterd_xfer_friend_add_resp() distinguish between remote host
and self and print the appropriate hostname.
Change-Id: I2a504eeb058c08a0d378443888eb6f1dc7edc76f
BUG: 963537
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/5017
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The restarting of bricks has been deffered until the cluster 'stabilizes'
itself volumes' view. Since glusterd_spawn_daemons is executed everytime
a peer 'joins' the cluster, it may inadvertently restart bricks that
were taken offline for say, maintenance purposes. This fix avoids that.
Change-Id: Ic2a0a9657eb95c82d03cf5eb893322cf55c44eba
BUG: 960190
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/4973
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Id18b215a91cf016964ea98d2f414293b82167d24
BUG: 962362
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/4992
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prevent the removal of brick(s) from a plain replicate volume and
display the error message at the CLI.
Change-Id: I8e182404564147329d8cd364b7c7931d19f14570
BUG: 961669
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/4975
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I46c41c29c2d11652f6d8ccd5637be0ac9774fc1d
BUG: 927648
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/4873
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the current installation of glusterfs doesn't have a stored
op-version and is,
a. a fresh install, then set op-version to maximum
b. an upgrade from release which didn't have op-version support, set it
to minimum.
The install status is detected using the peer-uuid.
If both peer-uuid and op-version are not present in the store, the
installation is fresh.
If peer-uuid is present, but op-version is absent in the store, the
installation has been upgraded from a version which didn't support
op-versions.
By setting the initial op-version as above, we can ensure that
a. features are not enabled accidentally during upgrades
b. a fresh install starts with all possible features enabled.
Change-Id: I52aed9ebeb5d3823c81e65f739a78a924ecf7e12
BUG: 954256
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/4867
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ib0a772dc1cb9afc8adccd8f7092f480d2b525ea0
BUG: 960580
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/4964
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit efa154bb0a4cac34d5a9610ec25d38eebe495f22.
-- Following is Avati's analysis (edited) from gerrit --
The claim of the patch (being reverted) is that it in some cases cbkfn
is missed. This is wrong analysis. cbk_fn is _always_ called. The patch
treats ret > 0 as a "missed cbk". ret > 0 only means socket submission
was not complete, and is queued to submit asynchronously when POLLOUT is
raised. This is sufficient to guarantee that cbkfn is going to be
called (either the socket errors or submission succeeds and reply
eventually arrives).
This commit also removes spurious barrier_wake(s).
call backs are guaranteed to be called even if the transport is
disconnected. This means, a 'wake' would be called if rpc_clnt_submit is
called. Also, we count both successful and failed operations in a
particular batch of operations for the synctask_barrier_wait. So,
calling synctask_barrier_wake on failure of rpc_clnt_submit (say, due to
network failure) would result in a spurious wake.
Change-Id: I7d508c2a54b74a65b82f097742206bc777afc53a
BUG: 948686
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/4922
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the current implementation, barriers are in the core of the
syncprocessors. Wake()s are treated as syncbarrier wake. This
is however delicate, as spurious wake()s of the synctask can
mess up the accounting of the barrier and waking it prematurely.
The fix is to keep yield() and wake() as the basic primitives,
and implement barriers as an object impelemented on top of these
primitives. This way, only an explicit barrier_wake() gets
counted towards the barrier accounting, and spurious wakes
will be truly safe.
Change-Id: I8087f0f446113e5b2d0853431c0354335ccda076
BUG: 948686
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/4921
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I5ae71ab98f9a336dc9bbf0e7b2ec50a6ed42b0f5
BUG: 948686
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/4938
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ib78963c1f43a66dab50b443742979c7c4e4cbc23
BUG: 958790
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/4940
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I08065aaa3c140d4b02af4ca38f5f4d00d7f0c2bb
BUG: 958739
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/4937
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Volume operations were failed 'proactively', on the first disconnect of
a peer that was participating in the transaction.
The reason behind having this kludgey code in the first place was to
'abort' an ongoing volume operation as soon as we perceive the first
disconnect. But the rpc call backs themselves are capable of injecting
appropriate state machine events, which would set things in motion for an
eventual abort of the transaction.
Change-Id: Iad7cb2bd076f22d89a793dfcd08c2d208b39c4be
BUG: 847214
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/4869
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* unlike 'gluster peer status', which lists only info about peers,
this command lists localhost also in the list, so the sorted
output from all the nodes should match.
* made the output script friendly by keeping it one output per line.
Change-Id: I853656753b35c617debbcceecbb71c8d6dd3c334
BUG: 764638
Original-review: http://review.gluster.org/4221
Original-author: Amar Tumballi <amarts@redhat.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/4862
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Each volume is now associated with two op-versions,
* op_version - the op-version of the highest op-versioned feature enabled
* client_op_version - the op-version of the highest op-versioned feature
enabled which affects the clients only.
These two op-versions are generated dynamically and kept updated during
runtime. Glusterd now uses the respective volumes' client-op-version during
getspec requests.
To achieve the above a new field in the vme table is introduced,
client_option, this boolean field tells if the option is a client side
option.
Change-Id: I12c83b1dd29ab506026efd50d448cebbcee53c27
BUG: 907311
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/4584
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I970a51d3f62bcf414eb9552a68d1068430b93216
BUG: 950048
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/4815
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PROBLEM:
performance.nfs.* option values (which are of type boolean) are
not validated during the stage phase of 'volume set'.
The result - nfs graph generation fails during commit phase,
AFTER the option and its (invalid) value have been placed in
volinfo->dict.
CAUSE:
nfsperfxl_option_handler() - the function that validates the values of
performance.nfs.* options - never receives the (key,value) pair that
needs to be set, for validation during 'volume set' stage.
FIX:
In build_nfs_graph(), copy the (mod_)dict containing the (option,value)
parameters into set_dict before attempting to build the client graph
for the volume on which the operation is being performed.
Of course, an easier way out would be to simply do a 'volume reset' and
pretend nothing wrong happened!
Change-Id: I56b17d0239d58a9e0b7798933a3c8451e2675b69
BUG: 949930
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/4814
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In a single node cluster, it is possible to deadlock on the "big
lock", while restarting bricks. In glusterd_restart_bricks, we perform a
glusterd_brick_connect, where we release the big lock in anticipation
that glusterd_brick_rpc_notify could run in the same C stack (and
deadlocking). So, in the restart code path, we could unlock before we
have performed a lock on the big lock.
To fix this, we need to take the big lock in the
glusterd_launch_synctask 'thread' as well.
Change-Id: I1abea1ca82b55c784b8a810a8194f254b32b1dcc
BUG: 948686
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/4837
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are primarily three lists that are part of glusterd process,
that are concurrently accessed. Namely, priv->volumes, priv->peers
and volinfo->bricks_list.
Big-lock approach
-----------------
WHAT IS IT?
Big lock is a coarse-grained lock which protects all three
lists, mentioned above, from racy access.
HOW DOES IT WORK?
At any given point in time, glusterd's thread(s) are in execution
_iff_ there is a preceding, inbound network event. Of course, the
sigwaiter thread and timer thread are exceptions.
A network event is an external trigger to glusterd, via the epoll
thread, in the form of POLLIN and POLLERR.
As long as we take the big-lock at all such entry points and yield
it when we are done, we are guaranteed that all the network events,
accessing the global lists, are serialised.
This amounts to holding the big lock at
- all the handlers of all the actors in glusterd. (POLLIN)
- all the cbks in glusterd. (POLLIN)
- rpc_notify (DISCONNECT event), if we access/modify
one of the three lists. (POLLERR)
In the case of synctask'ized volume operations, we must remember that,
if we held the big lock for the entire duration of the handler,
we may block other non-synctask rpc actors from executing.
For eg, volume-start would block in PMAP SIGNIN, if done incorrectly.
To prevent this, we need to yield the big lock, when we yield the
synctask, and reacquire on waking up of the synctask.
Change-Id: Ib929f9905b55fb6c3fc27fefb497a26dba058e4f
BUG: 948686
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/4784
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
glusterd syncops perform a barrier_wake whenever rpc_clnt_submit returned -1.
This is based on the wrong assumption that the cbkfn wasn't called.
This would result in one more wakeup than there ought to be.
Change-Id: I591e67c267f0e26d1145bf8fb5feeb2c13a751a1
BUG: 948686
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/4802
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch incorporates all the changes suggested on the behaviour of
'volume create' command in http://review.gluster.org/#change,4214
(comment #14, to be precise).
Change-Id: Iaac524a59738b177415595b18aa8a136090d3d25
BUG: 948729
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/4740
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Till now running glusterfs processes were allowed to run in valgrind
mode only when built with debug mode enabled.
Change-Id: I11e07ea2a4da4f82f70cdded6258a22d65d6db64
BUG: 922877
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/4688
Reviewed-by: Anand Avati <avati@redhat.com>
Tested-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Today there is a non-obvious dependence of eager-locking on
write-behind. The reason is that eager-locking works as long
as the inheriting transaction has no overlaps with any of the
transactions already in progress. While write-behind provides
non-overlapping writes as a side-effect most of times (and only
guarantees it when strict-write-ordering option is enabled,
which is not on by default) eager-lock needs the behavior
as a guarantee. This is leading to complex and unwanted checks
for the presence of write-behind in the graph, for the simple
task of checking for overlaps.
This patch removes the interdependence between eager-locking
and write-behind by making eager-locking do its own overlap checks
with in-progress writes.
Change-Id: Iccba1185aeb5f1e7f060089c895a62840787133f
BUG: 912581
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/4782
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
enabling this option has an effect on pathinfo xattr
request returning <node-uuid>:<path> instead of the
default - which is <hostname>:<path>.
Change-Id: Ice1b38abf8e5df1568bab6d79ec0d53dfa520332
BUG: 765380
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/4567
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With readdir-optimize set to on, we instruct the posix layer to ignore
directory entries from not first subvolume. DHT discards directories
returned from non first subvolume. By making posix itself ignore it,
we are making directory crawls faster
Change-Id: Ia1faf2dedec0c615c0632c3c063e846f5742ede6
Signed-off-by: shishir gowda <sgowda@redhat.com>
Reviewed-on: http://review.gluster.org/4613
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I57fbdd83f3098e64886c3dd690a1ae04fc37442d
BUG: 928648
Signed-off-by: Bala.FA <barumuga@redhat.com>
Reviewed-on: http://review.gluster.org/4739
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|