| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The auth_value was being reset to AUTH_GLUSTERFS_v2
during rpc disconnect. It should not be reset. A
disconnect during a portmap request can race with the
handshake: if the handshake happens first and the
disconnect later, auth_value would be set to the default
value and never set back to the actual auth_value.
fixes: bz#1579276
Change-Id: Ib46c9e01a97f6defb3fd1e0423fdb4b899b4a361
Signed-off-by: Kotresh HR <khiremat@redhat.com>
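A minimal sketch of the idea, with illustrative names and values (the
real structure and handler live in rpc-clnt.c and differ):

/* Illustrative constants, not the real values. */
enum { AUTH_GLUSTERFS_v2 = 2, AUTH_GLUSTERFS_v3 = 3 };

struct conn {
    int auth_value;     /* value negotiated by the handshake */
};

static void
on_disconnect (struct conn *conn)
{
    /* Before the fix: a disconnect that landed after the handshake
     * wiped out the negotiated value for good. */
    /* conn->auth_value = AUTH_GLUSTERFS_v2; */

    /* After the fix: leave auth_value untouched on disconnect. */
    (void) conn;
}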
|
|
|
|
|
|
|
|
|
|
|
|
| |
re-registering
> Reviewed-on: https://review.gluster.org/16849
> Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
Change-Id: I05ed6b7c715a71e5819fbe8116e7c3146010f836
BUG: 1521030
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
| |
When a saved frame is to be force-unwound, there is no need to pass an
empty iovector that does not point to any data.
Change-Id: I6e858fb38644326e22239b83272b15db656035e5
BUG: 1523122
Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
xdr_replymsg is called to decode the reply message, and it returns
failure if the message is corrupted. However, the return value was being
taken from the global errno, which is 0 even when xdr_replymsg fails.
Fix this issue by simply returning a negative value when the call to
xdr_replymsg fails.
Change-Id: I2b9a1dc97652fbb6cf6568ea617f120713784a55
BUG: 1523122
Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
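A hedged sketch of the pattern, assuming the classic Sun RPC XDR API
(shipped via libtirpc on modern distributions); the helper name is
hypothetical:

#include <rpc/rpc.h>

/* Report decode failure via the return value rather than errno,
 * which stays 0 when xdr_replymsg rejects a corrupted buffer. */
static int
decode_reply (char *buf, u_int len, struct rpc_msg *reply)
{
    XDR xdr;
    int ret = 0;

    xdrmem_create (&xdr, buf, len, XDR_DECODE);
    if (!xdr_replymsg (&xdr, reply))
        ret = -1;   /* explicit error, independent of errno */
    xdr_destroy (&xdr);
    return ret;
}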
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Removed an unused struct member and its one-time usage.
- Cleaned up wrong whitespace.
The member 'client_latency' was not otherwise used since it was added by
commit 07cc8679cdf3b29680f4f105d0222da168d8bfc1
Author: Kevin Vigor <kvigor@fb.com>
Date: Tue Mar 21 08:23:25 2017 -0700
Halo Replication feature for AFR translator
Change-Id: Ibb0ea828d4090bbe8897f6af326b317884162a00
BUG: 1495153
Signed-off-by: Sven Fischer <sven@fischer-abc.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have the following undefined symbol errors from protocol/server.so:
glusterfs_mgmt_pmap_signout
glusterfs_autoscale_threads
See https://review.gluster.org/19225 (bz#1532238)
and https://review.gluster.org/19657 (bz#1550895)
(why are there two different bzs for the same bug?)
IMO this is a cleaner solution. I.e. moving the above two functions
to libgfrpc (.../rpc/rpc-lib/...)
I would also, for (foolish) consistency's sake, like to see
glusterfs_mgmt_pmap_signin() moved from glusterfsd to libgfrpc as
well.
This works on f28/rawhide, with its new, more restrictive run-time
link semantics. The smoke and regression tests on earlier fedora and
centos will confirm that it works on those platforms too.
Change-Id: I9cfbd1cc15e7ebd9fc31b56ac791287fa2c584de
BUG: 1550895
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Scale rpcsvc_request_handler threads to match the scaling of event
handler threads.
Please refer to https://bugzilla.redhat.com/show_bug.cgi?id=1467614#c51
for a discussion about why we need multi-threaded rpcsvc request
handlers.
Change-Id: Ib6838fb8b928e15602a3d36fd66b7ba08999430b
Signed-off-by: Milind Changire <mchangir@redhat.com>
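A rough sketch of such a scaling loop, with hypothetical names (the
actual rpcsvc structures and queueing differ):

#include <pthread.h>

#define MAX_HANDLERS 64

static pthread_t handlers[MAX_HANDLERS];
static int nhandlers;

static void *
request_handler (void *arg)
{
    return NULL;    /* would drain the program's request queue */
}

/* Keep one request-handler thread per event (epoll) thread. */
static void
scale_request_handlers (int event_threads)
{
    while (nhandlers < event_threads && nhandlers < MAX_HANDLERS) {
        if (pthread_create (&handlers[nhandlers], NULL,
                            request_handler, NULL) != 0)
            break;
        nhandlers++;
    }
}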
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clients will request a list of volfile servers from glusterd2 by
setting an (optional) flag in the GETSPEC RPC call. glusterd2 will check
for the presence of this flag and accordingly return a list of glusterd2
servers in the GETSPEC RPC reply. Currently, the returned list of servers
only contains servers which have bricks belonging to the volume.
See:
https://github.com/gluster/glusterd2/issues/382
https://github.com/gluster/glusterfs/issues/351
Updates #351
Change-Id: I0eee3d0bf25a87627e562380ef73063926a16b81
Signed-off-by: Prashanth Pai <ppai@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Building with --with-default-ipv6 causes shared
components of gluster that call the rpcbind6 functions
to fail. Adding the symbols to the list is all that is
necessary. Building without ipv6 keeps the same behavior.
No test cases as this is a build-specific fix.
Change-Id: I248d3291bf17326b07d152d9b79cdcfaf9068f0d
BUG: 1544961
Signed-off-by: Sheena Artrip <sheenobu@fb.com>
|
|
|
|
|
|
|
| |
updates #384
Change-Id: Id80bf470988dbecc69779de9eb64088559cb1f6a
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
| |
Updates #353
Change-Id: I755b9208690be76935d763688fa414521eba3a40
Signed-off-by: Poornima G <pgurusid@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce another authentication header which can now send more data.
This is useful because this data can be common for all the fops, and
we don't need to change all the signatures.
As part of this, rpc-clnt.c was made a little more modular to support
multiple authentication structures.
The stack.h changes are a placeholder for ctime etc., and can be moved
later based on need.
updates #384
Change-Id: I6111c13cfd2ec92e2b4e9295896bf62a8a33b2c7
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The locks xlator is now able to send a contention notification to
the current owner of the lock.
This is only a notification that can be used to improve performance
of some client-side operations that might benefit from extended
duration of lock ownership. Nothing is done if the lock owner decides
to ignore the message and not release the lock. For forced
release of acquired resources, leases must be used.
Change-Id: I7f1ad32a0b4b445505b09908a050080ad848f8e0
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Without an export map (at link time) libgfrpc and libgfxdr export over
150 and 450 symbols, respectively. Many are not used by anything
else. (Unclear what the unused symbols are, some may be simple
sloppiness, e.g. not declaring functions static that should be. Others
may be intra-library calls that can't be static but aren't part of the
API, per se.)
By linking with an export map the number of exported symbols is
reduced to ~60 and ~250 respectively.
This parallels the similar change made to libglusterfs recently
and the older changes to the xlators to minimize the symbols that
are visible (exported) from the .so.
And I don't know, do we want to go all the way to symbol versions?
For these libs? And for libglusterfs?
fixes gluster/glusterfs#392
Change-Id: I9cdc3eee10e5f1408d7e7f2f29fad597c97e4003
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
If the transport object is freed in rpc_transport_unref,
an RPC_TRANSPORT_CLEANUP notification is pushed to rpc_clnt_notify,
where the rpc_clnt (which contains conn) is freed.
After that, any use of conn after rpc_transport_unref is a use-after-free.
Change-Id: I5cac8a8e7ced7c1079930080a12abf02d46667d5
Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
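The hazard in miniature (a generic illustration, not the actual rpc
code; names are made up):

#include <stdlib.h>

struct conn {
    int  refcount;
    long last_sent;
};

static void
transport_unref (struct conn *conn)
{
    if (--conn->refcount == 0)
        free (conn);    /* the CLEANUP notify frees the owner here */
}

int
main (void)
{
    struct conn *conn = calloc (1, sizeof (*conn));

    conn->refcount = 1;
    transport_unref (conn);
    /* conn->last_sent = 0;   use-after-free: conn may be gone */
    return 0;
}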
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
RTLD_LOCAL is the default symbol visibility flag for dlopen()
on Linux and NetBSD. Using it avoids conflicts during symbol resolution.
This also allows us to detect xlators that have not been explicitly
linked with libraries that they use. This used to go unnoticed
when RTLD_GLOBAL was being used.
BUG: 1193929
Change-Id: I50db6ea14ffdee96596060c4d6bf71cd3c432f7b
Signed-off-by: Prashanth Pai <ppai@redhat.com>
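For illustration, a loader using the now-explicit flag (the path is a
placeholder):

#include <dlfcn.h>
#include <stdio.h>

int
main (void)
{
    /* RTLD_LOCAL keeps the object's symbols out of the global
     * namespace; a missing library dependency now surfaces as a
     * dlopen error instead of resolving by accident. */
    void *handle = dlopen ("/path/to/xlator.so",
                           RTLD_NOW | RTLD_LOCAL);

    if (handle == NULL) {
        fprintf (stderr, "dlopen: %s\n", dlerror ());
        return 1;
    }
    dlclose (handle);
    return 0;
}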
|
|
|
|
|
|
|
|
|
| |
This patch creates a new way of defining message ids that is easier
and less error-prone because it doesn't require so many manual changes
each time a new component is defined or a new message created.
Change-Id: I71ba8af9ac068f5add7e74f316a2478bc991c67b
Signed-off-by: Xavier Hernandez <jahernan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In the glusterfs code base we call mutex_lock/unlock to take
a reference / drop a reference on an object. Sometimes this can
itself become a source of lock contention.
Solution: There is no need to use a mutex to increase/decrease a ref
counter; use the gcc builtin ATOMIC operations instead.
Test: I have not yet measured how much performance glusterfs gains
from this patch, but I tested the same change with the small
program below (mutex and atomic variants) and saw a good
difference.
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

static int64_t glob;
static int numOuterLoops;

static void *
threadFunc(void *arg)
{
    int j;

    for (j = 0; j < numOuterLoops; j++) {
        __atomic_add_fetch(&glob, 1, __ATOMIC_ACQ_REL);
    }
    return NULL;
}

int
main(int argc, char *argv[])
{
    int s, j;
    int numThreads;
    pthread_t *thread;
    int64_t n = 0;

    if (argc < 3) {
        printf("Please provide 2 args: num of threads && outer loops\n");
        exit(-1);
    }
    numThreads = atoi(argv[1]);
    numOuterLoops = atoi(argv[2]);
    printf("\tthreads: %d; outer loops: %d;\n", numThreads, numOuterLoops);
    thread = calloc(numThreads, sizeof(pthread_t));
    if (thread == NULL) {
        printf("calloc error so exit\n");
        exit(-1);
    }
    __atomic_store(&glob, &n, __ATOMIC_RELEASE);
    for (j = 0; j < numThreads; j++) {
        s = pthread_create(&thread[j], NULL, threadFunc, NULL);
        if (s != 0) {
            printf("pthread_create failed so exit\n");
            exit(-1);
        }
    }
    for (j = 0; j < numThreads; j++) {
        s = pthread_join(thread[j], NULL);
        if (s != 0) {
            printf("pthread_join failed so exit\n");
            exit(-1);
        }
    }
    printf("glob value is %ld\n", (long)__atomic_load_n(&glob, __ATOMIC_RELAXED));
    exit(0);
}
time ./thr_count 800 800000
threads: 800; outer loops: 800000;
glob value is 640000000
real 1m10.288s
user 0m57.269s
sys 3m31.565s
time ./thr_count_atomic 800 800000
threads: 800; outer loops: 800000;
glob value is 640000000
real 0m20.313s
user 1m20.558s
sys 0m0.028s
Change-Id: Ie5030a52ea264875e002e108dd4b207b15ab7cc7
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
|
| |
Change-Id: I22e622212f30defe6f2af1a67d7b48a88d37a097
BUG: 1520974
Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
|
|
|
|
|
|
|
|
|
|
| |
* Introduce xlator methods to allow dumping of metrics
* Separate options to get the metrics dumped in a path
Updates #168
Change-Id: I7df80df33b71d6f449f03c2332665b4a45f6ddf2
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
| |
Change-Id: Ic52045f5dd19e551612242450b8982f42ff327e9
Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
icreate creates the inode, while namelink links the basename to its
parent gfid.
For now mkdir is the primary user of these fops. Better distribution is
achieved by creating the inode on, say, mds1 and linking the basename to its
parent gfid on mds2. The inode serves readdirp, stat etc.
More details about the fops are present at:
https://review.gluster.org/#/c/13395/3/design/DHT2/DHT2_Icreate_Namelink_Notes.md
This is a backport of three patches from the experimental branch.
1- https://review.gluster.org/#/c/18085/
2- https://review.gluster.org/#/c/18086/
3- https://review.gluster.org/#/c/18094/
Updates gluster/glusterfs#243
Change-Id: I1bd3d5a441a3cfab1acfeb52f15c6c867d362592
Signed-off-by: Susant Palai <spalai@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
rpc_clnt_submit() acquires conn->lock before call to
rpc_transport_submit_request() and subsequent queuing of frame into
saved_frames list. However, as part of handling RPC_TRANSPORT_MSG_RECEIVED
and RPC_TRANSPORT_MSG_SENT notifications in rpc_clnt_notify(), conn->lock
is again used to atomically update conn->last_received and conn->last_sent
event timestamps.
So when conn->lock is acquired as part of submitting a request,
a parallel POLLIN notification gets blocked at rpc layer until the request
submission completes and the lock is released.
To get around this, this patch calls clock_gettime (instead of
gettimeofday) to update the event timestamps in conn->last_received and
conn->last_sent; clock_gettime does not need the mutex because it is a
thread-safe call.
Note: running fio on a VM after applying the patch shows improved iops.
Change-Id: I347b5031d61c426b276bc5e07136a7172645d763
BUG: 1467614
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
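A sketch of the replacement (structure and function names are
illustrative, not the rpc-clnt code):

#include <time.h>

struct conn_times {
    struct timespec last_sent;
    struct timespec last_received;
};

/* Recording the timestamp no longer takes conn->lock: clock_gettime
 * is a thread-safe call, so a parallel POLLIN notification is not
 * blocked behind an in-flight request submission. */
static void
mark_received (struct conn_times *times)
{
    clock_gettime (CLOCK_REALTIME, &times->last_received);
}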
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Scan URL:
https://download.gluster.org/pub/gluster/glusterfs/static-analysis/master/glusterfs-coverity/2017-11-10-0f524f07/html/
ID: 9 (BAD_SHIFT)
ID: 58 (CHECKED_RETURN)
ID: 98 (DEAD_CODE)
ID: 249, 250, 251, 252 (MIXED_ENUMS)
ID: 289, 297 (NULL_RETURNS)
ID: 609, 613, 622, 644, 653, 655 (UNUSED_VALUE)
ID: 432 (RESOURCE_LEAK)
Change-Id: I2349877214dd38b789e08b74be05539f09b751b9
BUG: 789278
Signed-off-by: Milind Changire <mchangir@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Today the main users of client uuid are protocol layers, locks, and
leases. Protocol layers require each client uuid to be unique, even
across connects and disconnects. Locks and leases on the server side
also use the same client uid, which changes across file migrations,
making graph switches and file migration tedious for locks and leases.
File migration across bricks becomes difficult because the client uuid
for the same client is different on the other brick.
The exact same set of issues exists for leases as well.
The solution is to introduce a constant in the client-uid string which
the locks and leases can use to identify the owner client across
bricks.
Client uuid currently:
%s(ctx uuid)-%s(protocol client name)-%d(graph id)%s(setvolume count/reconnect count)
Proposed client uuid:
"CTX_ID:%s-GRAPH_ID:%d-PID:%d-HOST:%s-PC_NAME:%s-RECON_NO:%s"
- CTX_ID: this will be constant per client.
- GRAPH_ID, PID, HOST, PC_NAME (protocol client name), RECON_NO (setvolume
count) remain the same.
Change-Id: Ia81d57a9693207cd325d7b26aee4593fcbd6482c
BUG: 1369028
Signed-off-by: Susant Palai <spalai@redhat.com>
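Assembling the proposed format, with placeholder values throughout:

#include <stdio.h>

int
main (void)
{
    char uid[256];

    snprintf (uid, sizeof (uid),
              "CTX_ID:%s-GRAPH_ID:%d-PID:%d-HOST:%s-PC_NAME:%s-RECON_NO:%s",
              "ctx-uuid", 0, 1234, "node1", "testvol-client-0", "0");
    puts (uid);
    return 0;
}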
|
|
|
|
|
|
|
|
|
|
| |
* xdr: add gfid to on-wire format for fsetattr/rchecksum
* as this is a change to the on-wire XDR format, backward-compatible
RPC programs were needed.
Signed-off-by: Amar Tumballi <amarts@redhat.com>
BUG: 827334
Change-Id: Id0a2da3632516dc1a5560dde2b151b2e5f0be8e5
|
|
|
|
|
|
|
|
|
|
| |
Ensure that the fop program is the first in the program list
so that the minimum amount of time is spent searching for the
program in the most frequently needed use case.
Change-Id: I45c3dcdbf39ec90ba39d914432d13a2ace00a5ee
BUG: 1509647
Signed-off-by: Milind Changire <mchangir@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Achieving the above also needed the changes below:
* more sanity in how 'frame->op' is assigned.
* infra to have 'stats' as a separate section in the 'xlator_t' structure
Updates #137
Change-Id: I36679bf9577f3ed00a695b4e7d92870dcb3db8e1
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
On a service request, the actor is searched using an exclusive mutex
lock which is not really necessary since most of the time the actor
list is going to be searched and not modified.
Solution:
Use a read-write lock instead of a mutex lock.
Only modify operations on a service need to be done under a write-lock
which grants exclusive access to the code.
Change-Id: Ia227351b3f794bd8eee70c7a76d833cc716ab113
BUG: 1509644
Signed-off-by: Milind Changire <mchangir@redhat.com>
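The locking pattern, reduced to its essentials (function names are
illustrative):

#include <pthread.h>

static pthread_rwlock_t actor_lock = PTHREAD_RWLOCK_INITIALIZER;

/* Lookups, the common case, may now run concurrently. */
static void
actor_search (void)
{
    pthread_rwlock_rdlock (&actor_lock);
    /* walk the actor list */
    pthread_rwlock_unlock (&actor_lock);
}

/* Registration still gets exclusive access. */
static void
actor_register (void)
{
    pthread_rwlock_wrlock (&actor_lock);
    /* modify the actor list */
    pthread_rwlock_unlock (&actor_lock);
}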
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- This diff changes all locations in the code to prefer the inet6 family
instead of inet. This allows GlusterFS to operate
via IPv6 instead of IPv4 for all internal operations while still
being able to serve (FUSE or NFS) clients via IPv4.
- The changes apply to NFS as well.
- This diff ports D1892990, D1897341 & D1896522 to the 3.8 branch.
Test Plan: Prove tests!
Reviewers: dph, rwareing
Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
Change-Id: I34fdaaeb33c194782255625e00616faf75d60c33
BUG: 1406898
Reviewed-on-3.8-fb: http://review.gluster.org/16059
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
Tested-by: Shreyas Siravara <sshreyas@fb.com>
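The family preference in isolation (a sketch: 24007 is glusterd's
well-known port, but the fallback logic here is illustrative, not the
patch itself):

#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <string.h>

int
resolve (const char *host, struct addrinfo **res)
{
    struct addrinfo hints;

    memset (&hints, 0, sizeof (hints));
    hints.ai_socktype = SOCK_STREAM;

    hints.ai_family = AF_INET6;             /* prefer inet6 ... */
    if (getaddrinfo (host, "24007", &hints, res) == 0)
        return 0;

    hints.ai_family = AF_INET;              /* ... fall back to inet */
    return getaddrinfo (host, "24007", &hints, res);
}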
|
|
|
|
|
|
| |
Change-Id: Idf99908aa48718a7faf7f0bbc647a679ec548282
BUG: 1443145
Signed-off-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
| |
Change-Id: I8c6f6b642f025d1faf74015b8f7aaecd7ebfd4d5
BUG: 1443145
Signed-off-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
entries to be healed
Command output:
Brick 192.168.2.8:/brick/1
Status: Connected
Total Number of entries: 363
Number of entries in heal pending: 362
Number of entries in split-brain: 0
Number of entries possibly healing: 1
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<cliOutput>
  <healInfo>
    <bricks>
      <brick hostUuid="9105dd4b-eca8-4fdb-85b2-b81cdf77eda3">
        <name>192.168.2.8:/brick/1</name>
        <status>Connected</status>
        <totalNumberOfEntries>363</totalNumberOfEntries>
        <numberOfEntriesInHealPending>362</numberOfEntriesInHealPending>
        <numberOfEntriesInSplitBrain>0</numberOfEntriesInSplitBrain>
        <numberOfEntriesPossiblyHealing>1</numberOfEntriesPossiblyHealing>
      </brick>
    </bricks>
  </healInfo>
  <opRet>0</opRet>
  <opErrno>0</opErrno>
  <opErrstr/>
</cliOutput>
Change-Id: I40cb6f77a14131c9e41b292f4901b41a228863d7
BUG: 1261463
Signed-off-by: Mohamed Ashiq Liyazudeen <mliyazud@redhat.com>
Reviewed-on: https://review.gluster.org/12154
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: Karthik U S <ksubrahm@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
1. The ref count on the client_t object is incremented in
rpcsvc_request_init(), which is incorrect.
2. No ref is taken when delegating to grace_time_handler()
Solution:
1. Only fop requests which require processing down the graph via
stack 'frames' now ref count the request in get_frame_from_request()
2. Take a ref on the client_t object in server_rpc_notify() but avoid
dropping it in RPCSVC_EVENT_TRANSPORT_DESTROY. Drop the ref
unconditionally when exiting out of grace_time_handler().
Also, avoid dropping the ref on client_t in
RPCSVC_EVENT_TRANSPORT_DESTROY when ref management has been
delegated to grace_time_handler()
Change-Id: Ic16246bebc7ea4490545b26564658f4b081675e4
BUG: 1481600
Reported-by: Raghavendra G <rgowdapp@redhat.com>
Signed-off-by: Milind Changire <mchangir@redhat.com>
Reviewed-on: https://review.gluster.org/17982
Tested-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 5b14c11d3cae38bc66006b02217ede485ae30dea.
BUG: 1484225
Change-Id: I3269d3fc64de3f3cc6f670ea564a87d7725e10fd
Signed-off-by: Milind Changire <mchangir@redhat.com>
Reviewed-on: https://review.gluster.org/18113
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Tested-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
glusterd rpc code path attempts to reconnect to the rebalance process
via the reconnect timer even after the rebalance process has disconnected.
Solution:
Set the clnt->disabled flag to 1 to avoid reconnection and cause
the clnt object to be unref'd
Change-Id: I4e38eaef45d2fdea86d25e9dff9f1af0cd29cf66
BUG: 1484225
Signed-off-by: Milind Changire <mchangir@redhat.com>
Reviewed-on: https://review.gluster.org/18093
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PROBLEM: Both attach tier and add brick have the same RPC
and set of code. This becomes a hurdle while trying to implement
add brick on a tiered volume.
FIX: This patch separates the add brick and attach tier
giving them separate RPCs.
Change-Id: Iec57e972be968a9ff00b15b507e56a4f6dc398a2
BUG: 1376326
Signed-off-by: hari gowtham <hgowtham@redhat.com>
Reviewed-on: https://review.gluster.org/15503
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: hari gowtham <hari.gowtham005@gmail.com>
Reviewed-by: Samikshan Bairagya <samikshan@gmail.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The following log entry is observed during volume operations such
as volume create:
[2017-07-20 05:13:43.213797] E [client_t.c:321:gf_client_ref]
(-->/usr/local/lib/libgfrpc.so.0(rpcsvc_request_create+0x1a4) [0x7f987f66cd20]
-->/usr/local/lib/libgfrpc.so.0(rpcsvc_request_init+0xd0) [0x7f987f66ca23]
-->/usr/local/lib/libglusterfs.so.0(gf_client_ref+0x56) [0x7f987f91cbd5] )
0-client_t: null client [Invalid argument]
Change-Id: I49ba753e8d1a828bb275b0ccb1a181706774f387
BUG: 1193929
Signed-off-by: Prashanth Pai <ppai@redhat.com>
Reviewed-on: https://review.gluster.org/17848
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Amar Tumballi <amarts@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
frames which time out at the current second are missed
Solution:
change the test to include frames timing out at the current second,
i.e. timeout <= current.tv_sec
instead of
timeout < current.tv_sec
Change-Id: I459d47856ade2b657a0289e49f7f63da29186d6e
BUG: 1468433
Signed-off-by: Milind Changire <mchangir@redhat.com>
Reviewed-on: https://review.gluster.org/17722
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Set thread names on creation for easier
debugging.
Output of top -H -p <PID-OF-GLUSTERFSD>
Before:
19773 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19774 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19775 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19776 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19777 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19778 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19779 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19780 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19781 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19782 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19783 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19784 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19785 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterfsd
19786 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterfsd
19787 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterfsd
19789 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19790 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
25178 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
5398 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
7881 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
After:
19773 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19774 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustertimer
19775 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19776 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustermemsweep
19777 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustersproc0
19778 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustersproc1
19779 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterepoll0
19780 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusteridxwrker
19781 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusteriotwr0
19782 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterbrssign
19783 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterbrswrker
19784 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterclogecon
19785 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterclogd0
19786 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterclogd1
19787 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterclogd2
19789 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterposixjan
19790 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterposixfsy
25178 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterepoll1
5398 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterepoll2
7881 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterposixhc
Change-Id: Id5f333755c1ba168a2ffaa4fce6e71c375e10703
BUG: 1254002
Updates: #271
Signed-off-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-on: https://review.gluster.org/11926
Reviewed-by: Prashanth Pai <ppai@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
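The mechanism behind these names, in miniature (pthread_setname_np is
the GNU/Linux spelling; other platforms differ):

#define _GNU_SOURCE
#include <pthread.h>

static void *
worker (void *arg)
{
    /* Linux truncates thread names at 15 characters plus NUL,
     * hence the tight abbreviations in the listing above. */
    pthread_setname_np (pthread_self (), "glusterepoll0");
    return NULL;
}

int
main (void)
{
    pthread_t t;

    if (pthread_create (&t, NULL, worker, NULL) != 0)
        return 1;
    pthread_join (t, NULL);
    return 0;
}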
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Program
Since the poller thread bears the brunt of execution till the request is
handed over to io-threads, the poller thread experiences lock
contention in the control flow till io-threads, which slows it
down. This delay invariably affects reading ping requests from the
network and responding to them, resulting in increased ping latencies,
which sometimes result in a ping-timer-expiry on the client, leading to
a disconnect of the transport. So, this patch aims to free the poller
thread from executing code of the Glusterfs Program. We do this by having:
* the Glusterfs Program register itself, asking rpcsvc to execute its
actors in its own threads.
* the GF-DUMP Program register itself, asking rpcsvc to _NOT_ execute
its actors in its own threads. Otherwise the program's own threads become
a bottleneck in processing ping traffic. This means that the poller
thread reads a ping packet, invokes its actor, and hands the response
msg to the transport queue.
Change-Id: I526268c10bdd5ef93f322a4f95385137550a6a49
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
BUG: 1421938
Reviewed-on: https://review.gluster.org/17105
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
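Roughly, the registration split looks like this (the field name here is
illustrative; the real rpcsvc program table carries much more):

struct rpcsvc_program_sketch {
    const char *progname;
    int         ownthread;  /* 1: actors run in the program's own threads */
};

static struct rpcsvc_program_sketch glusterfs_prog = {
    .progname  = "GlusterFS",
    .ownthread = 1,         /* heavy fops leave the poller quickly */
};

static struct rpcsvc_program_sketch dump_prog = {
    .progname  = "GF-DUMP",
    .ownthread = 0,         /* ping replies straight from the poller */
};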
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When external programs perform a dlopen("..so", RTLD_LAZY|RTLD_LOCAL)
on some shared objects like xlators, the call can fail with dlerror set
to the error string "undefined symbol <some-type>".
This was observed for the following shared objects: fuse.so, quota.so,
quotad.so, server.so, libgfrpc.so and socket.so
P.S: This was found while running a Go program which fetches the list
of xlator options (volume_option_t) from an xlator's shared object.
BUG: 1193929
Change-Id: I7b958409cf11fb67c2be32a3f85a96fb1260236b
Signed-off-by: Prashanth Pai <ppai@redhat.com>
Reviewed-on: https://review.gluster.org/17659
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The most common pattern, both in our code and elsewhere, is this:
struct _xyz {
    ...
};
typedef struct _xyz xyz_t;
These exceptions - especially call_frame/call_stack - have been slowing
down code navigation for years. By converging on a single pattern,
navigating from xyz_t in code to the actual definition of struct _xyz
(i.e. without having to visit the typedef first) might even be
automatable.
Change-Id: I0e5dd1f51f98e000173c62ef4ddc5b21d9ec44ed
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Reviewed-on: https://review.gluster.org/17650
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: Jeff Darcy <jeff@pl.atyp.us>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I57ad970411db1ccd3d2c56c504c7da9cc221051f
BUG: 1448692
Signed-off-by: Zhou Zhengping <johnzzpcrystal@gmail.com>
Reviewed-on: https://review.gluster.org/17198
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
They snuck in with the HALO patch (07cc8679c)
Change-Id: I8ced6cbb0b49554fc9d348c453d4d5da00f981f6
BUG: 1447953
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: https://review.gluster.org/17174
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Halo Geo-replication is a feature which allows Gluster or NFS clients to write
locally to their region (as defined by a latency "halo" or threshold if you
like), and have their writes asynchronously propagate from their origin to the
rest of the cluster. Clients can also write synchronously to the cluster
simply by specifying a halo-latency which is very large (e.g. 10 seconds) which
will include all bricks.
In other words, it allows clients to decide at mount time if they desire
synchronous or asynchronous IO into a cluster and the cluster can support both
of these modes to any number of clients simultaneously.
There are a few new volume options due to this feature:
halo-shd-latency: The threshold below which self-heal daemons will
consider children (bricks) connected.
halo-nfsd-latency: The threshold below which NFS daemons will consider
children (bricks) connected.
halo-latency: The threshold below which all other clients will
consider children (bricks) connected.
halo-min-replicas: The minimum number of replicas which are to
be enforced regardless of latency specified in the above 3 options.
If the number of children falls below this threshold the next
best (chosen by latency) shall be swapped in.
New FUSE mount options:
halo-latency & halo-min-replicas: As described above.
This feature combined with multi-threaded SHD support (D1271745) results in
some pretty cool geo-replication possibilities.
Operational Notes:
- Global consistency is guaranteed for synchronous clients; this is provided by
the existing entry-locking mechanism.
- Asynchronous clients, on the other hand, are merely consistent within their region.
Writes & deletes will be protected via entry-locks as usual preventing
concurrent writes into files which are undergoing replication. Read operations
on the other hand should never block.
- Writes are allowed from _any_ region and propagated from the origin to all
other regions. The takeaway from this is that care should be taken to ensure
multiple writers do not write the same files resulting in a gfid split-brain
which will require resolution via split-brain policies (majority, mtime &
size). Recommended method for preventing this is using the nfs-auth feature to
define which region for each share has RW permissions, tiers not in the origin
region should have RO perms.
TODO:
- Synchronous clients (including the SHD) should choose clients from their own
region as preferred sources for reads. Most of the plumbing is in place for
this via the child_latency array.
- Better GFID split brain handling & better dent type split brain handling
(i.e. create a trash can and move the offending files into it).
- Tagging in addition to latency as a means of defining which children you wish
to synchronously write to
Test Plan:
- The usual suspects, clang, gcc w/ address sanitizer & valgrind
- Prove tests
Reviewers: jackl, dph, cjh, meyering
Reviewed By: meyering
Subscribers: ethanr
Differential Revision: https://phabricator.fb.com/D1272053
Tasks: 4117827
Change-Id: I694a9ab429722da538da171ec528406e77b5e6d1
BUG: 1428061
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: http://review.gluster.org/16099
Reviewed-on: https://review.gluster.org/16177
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
The following line of code was intended to remove the pmap entry for
the connection during disconnects:
pmap_registry_remove (this, 0, NULL, GF_PMAP_PORT_NONE, xprt);
However, no pmap entry will have its type set to GF_PMAP_PORT_NONE
at any point in time. So a call to pmap_registry_search_by_xprt() in
pmap_registry_remove() will always fail to find a match.
Fix:
Optionally ignore pmap entry's type in pmap_registry_search_by_xprt().
BUG: 1193929
Change-Id: I705f101739ab1647ff52a92820d478354407264a
Signed-off-by: Prashanth Pai <ppai@redhat.com>
Reviewed-on: https://review.gluster.org/17129
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
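The essence of the fix, sketched with simplified types (the real
registry stores more per entry, and the wildcard's name is a guess):

enum pmap_port_type {
    GF_PMAP_PORT_ANY = -1,  /* wildcard: ignore the entry type */
    GF_PMAP_PORT_NONE = 0,
};

struct pmap_entry {
    void *xprt;
    int   type;
};

/* Match on the transport, and on the type only when asked to. */
static int
pmap_entry_match (struct pmap_entry *entry, void *xprt, int type)
{
    return entry->xprt == xprt &&
           (type == GF_PMAP_PORT_ANY || entry->type == type);
}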
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem :
- Deploy gluster on 2 nodes, one brick each, one volume replicated
- Create a snapshot
- Lose one server
- Add a replacement peer and new brick with a new IP address
- replace-brick the missing brick onto the new server
(wait for replication to finish)
- peer detach the old server
- after doing the above steps, glusterd fails to restart.
Solution:
With the fix, peer detach reports an error: "N2 is part of
existing snapshots. Remove those snapshots before proceeding".
This forces the user to either keep that peer or delete
all snapshots first.
Change-Id: I3699afb9b2a5f915768b77f885e783bd9b51818c
BUG: 1322145
Signed-off-by: Gaurav Yadav <gyadav@redhat.com>
Reviewed-on: https://review.gluster.org/16907
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 086436a introduced a generation number (cleanup_gen) to ensure that
the rpc layer doesn't end up cleaning up the connection object if the
application layer has already destroyed it. Bumping up cleanup_gen was
done only in rpc_clnt_connection_cleanup (). However the same is needed
in rpc_clnt_reconnect_cleanup () too, as without it, if the object gets
destroyed through the reconnect event in the application layer, the rpc
layer will still try to delete the object, resulting in a double free
and a crash.
Peer probing an invalid host/IP was the basic test to catch this issue.
Change-Id: Id5332f3239cb324cead34eb51cf73d426733bd46
BUG: 1433578
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-on: https://review.gluster.org/16914
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Milind Changire <mchangir@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Avoid logging Success in the event of failure, especially when errno
has no meaningful value w.r.t. the failure. In this case errno is set
to zero even though there's indeed a failure at the RPC level.
Change-Id: If2cc81aa1e590023ed22892dacbef7cac213e591
BUG: 1426032
Signed-off-by: Milind Changire <mchangir@redhat.com>
Reviewed-on: https://review.gluster.org/16730
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|