| Commit message (Collapse) | Author | Age | Files | Lines |
|
Problem: Some places in the gluster code call get_new_dict to
create a dictionary without taking a reference, so the
dictionary is leaked at the time of dict_unref.
Solution: Call dict_new instead of get_new_dict.
updates bz#1650403
Change-Id: I3ccbbf5af07079a4fa09aad2cd0458c8625b2f06
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
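The leak described in this message can be illustrated with a toy reference-counted object. This is a minimal sketch, not the real dict_t code: every toy_* name is hypothetical, and it assumes that unref frees the object only when the count drops exactly to zero.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a refcounted dictionary; not the real dict_t. */
typedef struct toy_dict {
    int refcount;
} toy_dict_t;

int toy_live_dicts = 0; /* dicts allocated and not yet freed */

/* Models get_new_dict as described above: allocate, take no reference. */
toy_dict_t *toy_get_new_dict(void)
{
    toy_dict_t *d = calloc(1, sizeof(*d));
    if (d != NULL)
        toy_live_dicts++;
    return d; /* refcount is still 0 */
}

toy_dict_t *toy_dict_ref(toy_dict_t *d)
{
    assert(d != NULL);
    d->refcount++;
    return d;
}

/* Models dict_new: allocate and take the first reference. */
toy_dict_t *toy_dict_new(void)
{
    toy_dict_t *d = toy_get_new_dict();
    return d != NULL ? toy_dict_ref(d) : NULL;
}

/* Assumption: the object is freed only when the count reaches exactly 0. */
void toy_dict_unref(toy_dict_t *d)
{
    assert(d != NULL);
    d->refcount--;
    if (d->refcount == 0) {
        free(d);
        toy_live_dicts--;
    }
}
```

With toy_dict_new() a single toy_dict_unref() frees the object. With toy_get_new_dict() the same unref moves the count from 0 to -1, so the free never happens.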
|
The inode LRU mechanism is moot in the fuse xlator (i.e. there is
no limit for the LRU list), as fuse inodes are referenced from
kernel context, and thus they can only be dropped on request of
the kernel. This might result in a high number of passive
inodes which are useless for the glusterfs client, causing a
significant memory overhead.
This change tries to remedy this by extending the LRU semantics
and allowing a finite limit to be set on the fuse inode LRU.
A brief history of the problem:
When gluster's inode table was designed, fuse didn't have any
'invalidate' method, which means a userspace application could
never ask the kernel to send a 'forget()' fop; instead it had to
wait for the kernel to send it based on the kernel's parameters.
The inode table remembers the number of times the kernel has
cached the inode in the 'nlookup' parameter. The 'nlookup' field
is not used by any other entry point (like server-protocol,
gfapi etc).
Hence the inode_table of the fuse module always had to have
lru-limit set to '0', which means no limit. GlusterFS always had
to keep all inodes in memory, as the kernel would have had a
reference to them. The reason for this is that the kernel's
glusterfs inode reference was a pointer to the 'inode_t'
structure in glusterfs. As it is a pointer, we could never free
it (to prevent a segfault or memory corruption).
Solution:
In the inode table, handle the prune case of inodes with 'nlookup'
differently, and call an 'invalidator' method, which in this case
is fuse_invalidate(); it sends a request to the kernel asking for
the forget request.
When the kernel sends the forget, it means it has dropped all
references to the inode, and it will send the forget with the
'nlookup' parameter too. We just need to make sure to reduce the
'nlookup' value we hold when we get the forget. That automatically
causes the relevant prune to happen.
Credits: Csaba Henk, Xavier Hernandez, Raghavendra Gowdappa, Nithya B
fixes: bz#1560969
Change-Id: Ifee0737b23b12b1426c224ec5b8f591f487d83a2
Signed-off-by: Amar Tumballi <amarts@redhat.com>
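The prune/invalidate/forget cycle described in this message can be sketched with a toy table. This is only an illustration under assumptions: all toy_* names are hypothetical, the LRU list is a small fixed array, and the callback merely counts calls where the real invalidator (fuse_invalidate()) would talk to the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_INODES 8

typedef struct toy_inode {
    uint64_t nlookup;    /* lookups the kernel still remembers */
    int in_table;        /* 1 while the table holds the inode */
    int invalidate_sent; /* 1 once the kernel was asked to forget it */
} toy_inode_t;

typedef void (*toy_invalidator_t)(toy_inode_t *inode);

typedef struct toy_table {
    toy_inode_t *lru[TOY_MAX_INODES]; /* index 0 is the oldest entry */
    size_t lru_size;
    size_t lru_limit; /* 0 means "no limit" */
    toy_invalidator_t invalidator;
} toy_table_t;

int toy_invalidate_calls = 0;

/* Sample invalidator: only counts how often it was asked. */
void toy_count_invalidate(toy_inode_t *inode)
{
    (void)inode;
    toy_invalidate_calls++;
}

static void toy_lru_remove(toy_table_t *t, size_t idx)
{
    for (size_t i = idx + 1; i < t->lru_size; i++)
        t->lru[i - 1] = t->lru[i];
    t->lru_size--;
}

void toy_table_add(toy_table_t *t, toy_inode_t *inode)
{
    assert(t->lru_size < TOY_MAX_INODES);
    inode->in_table = 1;
    inode->invalidate_sent = 0;
    t->lru[t->lru_size++] = inode;
}

/* Walk from the oldest entry while the list is over its limit. Inodes the
 * kernel does not remember are dropped; the others get one invalidate
 * request and stay in the list until the forget arrives. */
void toy_table_prune(toy_table_t *t)
{
    size_t idx = 0;
    size_t pending = 0; /* visited entries still waiting for a forget */

    if (t->lru_limit == 0)
        return;
    while (idx < t->lru_size && t->lru_size - pending > t->lru_limit) {
        toy_inode_t *inode = t->lru[idx];
        if (inode->nlookup == 0) {
            inode->in_table = 0;
            toy_lru_remove(t, idx); /* next entry slides into idx */
        } else {
            if (!inode->invalidate_sent) {
                inode->invalidate_sent = 1;
                t->invalidator(inode);
            }
            pending++;
            idx++;
        }
    }
}

/* The forget carries the nlookup count being dropped; once the count
 * reaches zero the next prune can release the inode. */
void toy_table_forget(toy_table_t *t, toy_inode_t *inode, uint64_t nlookup)
{
    assert(nlookup <= inode->nlookup);
    inode->nlookup -= nlookup;
    toy_table_prune(t);
}
```

The point of the split is that the table never frees an inode the kernel may still reference: it only asks, and the free happens as a side effect of the forget.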
|
* libglusterfs changes to add new fop
* Fuse changes:
- Changes in fuse bridge xlator to receive and send responses
* posix changes to perform the op on the backend filesystem
* protocol and rpc changes for sending and receiving the fop
* gfapi changes for performing the fop
* tools: glfs-copy-file-range tool for testing copy_file_range fop
- Although copy_file_range support has been added to the upstream
fuse kernel module, no kernel containing the support has been
released yet. It is expected to come in the upcoming release
of linux-4.20.
So, as of now, executing the copy_file_range fop on a fuse based
filesystem results in the fuse kernel module sending read on the
source fd and write on the destination fd.
Therefore a small gfapi based tool has been written to be able
to test the copy_file_range fop. This tool is similar (in
functionality) to the example program given in the
copy_file_range man page.
So, running regular copy_file_range on a fuse mount point and
running the gfapi based glfs-copy-file-range tool gives some idea
of how fast copy_file_range (or reflink) can be.
On the local machine this was the result obtained:
mount -t glusterfs workstation:new /mnt/glusterfs
[root@workstation ~]# cd /mnt/glusterfs/
[root@workstation glusterfs]# ls
file
[root@workstation glusterfs]# cd
[root@workstation ~]# time /tmp/a.out /mnt/glusterfs/file /mnt/glusterfs/new
real 0m6.495s
user 0m0.000s
sys 0m1.439s
[root@workstation ~]# time glfs-copy-file-range $(hostname) new /tmp/glfs.log /file /rrr
OPEN_SRC: opening /file is success
OPEN_DST: opening /rrr is success
FSTAT_SRC: fstat on /rrr is success
copy_file_range successful
real 0m0.309s
user 0m0.039s
sys 0m0.017s
This tool needs the following arguments:
1) hostname
2) volume name
3) log file path
4) source file path (relative to the gluster volume root)
5) destination file path (relative to the gluster volume root)
"glfs-copy-file-range <hostname> <volume> <log file path> <source> <destination>"
- Added a testcase as well to run glfs-copy-file-range tool
* io-stats changes to capture the fop for profiling
* NOTE:
- Added a conditional check to see whether the copy_file_range
syscall is available or not. If not, ENOSYS is returned.
- Added a conditional check for the kernel minor version in
fuse_kernel.h and fuse-bridge while referring to copy_file_range.
The kernel minor version is kept as it is, i.e. 24. Increment it
in future when there is a kernel release which contains the
support for the copy_file_range fop in the fuse kernel module.
* The document which contains a writeup on this enhancement can be found at
https://docs.google.com/document/d/1BSILbXr_knynNwxSyyu503JoTz5QFM_4suNIh2WwrSc/edit
Change-Id: I280069c814dd21ce6ec3be00a884fc24ab692367
updates: #536
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
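The ENOSYS check mentioned in the NOTE can be sketched in plain C. This is not the posix xlator code; the toy_* names are hypothetical, and the read/write fallback is an assumption added so the sketch stays usable where the syscall is missing or refuses the descriptors.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#define TOY_BUF_SIZE (64 * 1024)

/* One copy_file_range call at the current file offsets. Fails with ENOSYS
 * when the build has no syscall number for it. */
static ssize_t toy_copy_file_range(int fd_in, int fd_out, size_t len)
{
#ifdef SYS_copy_file_range
    return syscall(SYS_copy_file_range, fd_in, NULL, fd_out, NULL, len, 0U);
#else
    (void)fd_in;
    (void)fd_out;
    (void)len;
    errno = ENOSYS;
    return -1;
#endif
}

/* One read followed by as many writes as needed to flush what was read. */
static ssize_t toy_read_write(int fd_in, int fd_out, char *buf, size_t len)
{
    ssize_t n = read(fd_in, buf, len < TOY_BUF_SIZE ? len : TOY_BUF_SIZE);
    for (ssize_t done = 0; done < n;) {
        ssize_t w = write(fd_out, buf + done, (size_t)(n - done));
        if (w < 0)
            return -1;
        done += w;
    }
    return n;
}

/* Copies up to len bytes; returns the number copied or -1 with errno set. */
ssize_t toy_copy(int fd_in, int fd_out, size_t len)
{
    char *buf = NULL; /* allocated only once the fallback is needed */
    size_t total = 0;
    ssize_t n = 0;

    assert(fd_in >= 0 && fd_out >= 0);
    while (total < len) {
        if (buf == NULL) {
            n = toy_copy_file_range(fd_in, fd_out, len - total);
            /* EPERM is included because some sandboxes filter the syscall. */
            if (n < 0 && (errno == ENOSYS || errno == EXDEV ||
                          errno == EINVAL || errno == EOPNOTSUPP ||
                          errno == EPERM)) {
                buf = malloc(TOY_BUF_SIZE);
                if (buf == NULL)
                    return -1;
            }
        }
        if (buf != NULL)
            n = toy_read_write(fd_in, fd_out, buf, len - total);
        if (n <= 0)
            break;
        total += (size_t)n;
    }
    free(buf);
    return n < 0 ? -1 : (ssize_t)total;
}
```

Calling toy_copy(src_fd, dst_fd, size) on two open files copies at the descriptors' current offsets, whichever path ends up being taken.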
|
Problem: 1) server_init does not clean up allocated resources
when it fails before returning an error
2) dict leak at the time of destroying the graph
Solution: 1) Free resources if server_init fails
2) Take dict_ref of graph xlator before destroying
the graph to avoid leak
Change-Id: I9e31e156b9ed6bebe622745a8be0e470774e3d15
fixes: bz#1654917
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
- add sub-framework to send timed responses to kernel
- add interrupt handler queue
- implement INTERRUPT
fuse_interrupt looks up handlers for interrupted messages
in the queue. If one is found, it invokes the handler function.
Otherwise it responds with EAGAIN after a delay.
See spec at
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/filesystems/fuse.txt?h=v4.17#n148
and explanation in comments.
Change-Id: I1a79d3679b31f36e14b4ac8f60b7f2c1ea2badfb
updates: #465
Signed-off-by: Csaba Henk <csaba@redhat.com>
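The lookup step of the INTERRUPT handling described in this message can be sketched as a small queue keyed by the message's unique id. This is a toy model, not the fuse-bridge code: the toy_* names are hypothetical, unlinking the handler before invoking it is an assumption, and the delayed EAGAIN reply is reduced to a return value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

typedef struct toy_handler {
    uint64_t unique;                  /* id of the message being handled */
    void (*fn)(struct toy_handler *); /* what to do when it is interrupted */
    int fired;
    struct toy_handler *next;
} toy_handler_t;

typedef struct toy_queue {
    toy_handler_t *head;
} toy_queue_t;

/* Sample handler: only records that it ran. */
void toy_mark_fired(toy_handler_t *h)
{
    h->fired = 1;
}

void toy_queue_insert(toy_queue_t *q, toy_handler_t *h)
{
    assert(q != NULL && h != NULL && h->fn != NULL);
    h->next = q->head;
    q->head = h;
}

/* Returns 0 if a handler for `unique` was found and invoked, EAGAIN if none
 * is queued (where the real code would send its delayed reply). */
int toy_interrupt(toy_queue_t *q, uint64_t unique)
{
    for (toy_handler_t **pp = &q->head; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->unique == unique) {
            toy_handler_t *h = *pp;
            *pp = h->next; /* unlink first so it cannot fire twice */
            h->next = NULL;
            h->fn(h);
            return 0;
        }
    }
    return EAGAIN;
}
```

A second interrupt for the same id finds nothing in the queue and takes the EAGAIN path.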
|
This reduces the number of syscalls on Linux systems from two, accept(2)
and fcntl(2) for setting O_NONBLOCK, to a single accept4(2). On NetBSD,
we have paccept(2), which does the same if we leave signal masking aside.
Added sys_accept, which takes a flags argument in addition to those of
accept(2). It opportunistically uses accept4/paccept where available and
falls back to accept(2) and fcntl(2) otherwise.
While at it, the patch sets the FD_CLOEXEC flag on the accepted socket fd.
BUG: 1236272
Change-Id: I41e43fd3e36d6dabb07e578a1cea7f45b7b4e37f
fixes: bz#1236272
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
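The accept4-with-fallback idea described in this message can be sketched as follows. This is not the sys_accept from the patch: the toy_* names and the TOY_* flag values are hypothetical, the NetBSD paccept(2) branch is left out, and the demo function exists only to exercise the wrapper on a local AF_UNIX socket.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

#define TOY_NONBLOCK 0x1 /* hypothetical: ask for O_NONBLOCK on the new fd */
#define TOY_CLOEXEC 0x2  /* hypothetical: ask for FD_CLOEXEC on the new fd */

/* Reads the current flags with getcmd and writes them back with bit set. */
static int toy_or_flag(int fd, int getcmd, int setcmd, int bit)
{
    int cur = fcntl(fd, getcmd);
    return cur < 0 ? -1 : fcntl(fd, setcmd, cur | bit);
}

/* Fallback: plain accept(2), then fcntl(2) for each requested flag. */
static int toy_accept_fcntl(int sock, struct sockaddr *addr, socklen_t *len,
                            int flags)
{
    int fd = accept(sock, addr, len);
    if (fd < 0)
        return -1;
    if (((flags & TOY_NONBLOCK) &&
         toy_or_flag(fd, F_GETFL, F_SETFL, O_NONBLOCK) < 0) ||
        ((flags & TOY_CLOEXEC) &&
         toy_or_flag(fd, F_GETFD, F_SETFD, FD_CLOEXEC) < 0)) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

int toy_sys_accept(int sock, struct sockaddr *addr, socklen_t *len, int flags)
{
    assert((flags & ~(TOY_NONBLOCK | TOY_CLOEXEC)) == 0);
#if defined(__linux__) && defined(SOCK_NONBLOCK) && defined(SOCK_CLOEXEC)
    int native = 0;
    if (flags & TOY_NONBLOCK)
        native |= SOCK_NONBLOCK;
    if (flags & TOY_CLOEXEC)
        native |= SOCK_CLOEXEC;
    int fd = accept4(sock, addr, len, native); /* single syscall */
    if (fd >= 0 || errno != ENOSYS)
        return fd;
#endif
    return toy_accept_fcntl(sock, addr, len, flags);
}

/* Demo on a filesystem AF_UNIX socket: returns 1 if the accepted fd carries
 * both flags, 0 if it does not, -1 if the socket setup failed. */
int toy_accept_demo(void)
{
    char dir[] = "/tmp/toy-accept-XXXXXX";
    struct sockaddr_un sa = {.sun_family = AF_UNIX};
    int result = -1, lsn = -1, cli = -1, fd = -1;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(sa.sun_path, sizeof(sa.sun_path), "%s/sock", dir);
    lsn = socket(AF_UNIX, SOCK_STREAM, 0);
    cli = socket(AF_UNIX, SOCK_STREAM, 0);
    if (lsn >= 0 && cli >= 0 &&
        bind(lsn, (struct sockaddr *)&sa, sizeof(sa)) == 0 &&
        listen(lsn, 1) == 0 &&
        connect(cli, (struct sockaddr *)&sa, sizeof(sa)) == 0 &&
        (fd = toy_sys_accept(lsn, NULL, NULL, TOY_NONBLOCK | TOY_CLOEXEC)) >= 0)
        result = (fcntl(fd, F_GETFL) & O_NONBLOCK) &&
                 (fcntl(fd, F_GETFD) & FD_CLOEXEC);
    if (fd >= 0)
        close(fd);
    if (cli >= 0)
        close(cli);
    if (lsn >= 0)
        close(lsn);
    unlink(sa.sun_path);
    rmdir(dir);
    return result;
}
```

Keeping private flag values lets callers stay portable: only the wrapper needs to know whether SOCK_NONBLOCK and SOCK_CLOEXEC exist on the platform.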
|
For more information, see http://udrepper.livejournal.com/20407.html
BUG: 1236272
Change-Id: I25a645c10bdbe733a81d53cb714eb036251f8129
fixes: bz#1236272
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
|
When compiling on other architectures, many warnings appear. Some
of them are actual problems that prevent gluster from working
correctly on those architectures.
Change-Id: Icdc7107a2bc2da662903c51910beddb84bdf03c0
fixes: bz#1632717
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
In a previous patch (https://review.gluster.org/20769) we
added the key length as a parameter to the dict_* funcs, to remove
the need to strlen() the key. This patch moves some xlators to use it.
- It also adds dict_get_int32n which was missing.
- It also reduces the size of some key variables.
They were set to 1024 bytes or PATH_MAX, where sometimes 64 bytes
were really enough.
Please review carefully:
1. That I did not reduce the size of some of the key variables too much.
2. That I did not mix up some keys.
Compile-tested only!
Change-Id: Ic729baf179f40e8d02bc2350491d4bb9b6934266
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
* This is to ensure FIPS support
* Also changed the signature of svs_uuid_generate to
get xlator argument
* Added xxh64 wrapper functions in common-utils to
generate gfid using xxh64
- Those wrapper functions can be used by other xlators
as well to generate gfids using xxh64. But as of now
snapview-server is going to be the only consumer.
Change-Id: Ide66573125dd74122430cccc4c4dc2a376d642a2
Updates: #230
Signed-off-by: Raghavendra Manjunath <raghavendra@redhat.com>
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
|
others.
They all take the key length as a parameter, instead of calling strlen() on the key.
In most cases we know the key length; we just never bothered to save it and pass it along.
(We most likely sprintf'ed the key earlier, and the return value could have been used.)
A more interesting addition is dict_set_nstrn() [horrible name. Ideas are welcome].
It accepts both the string length and the key length and avoids strlen() on both.
Some of it can be calculated at compile time, btw.
For example:
dict_set_str (dict, "key", "all");
Should become:
dict_set_nstrn (dict, "key", sizeof ("key") - 1, "all", sizeof ("all") - 1);
Compile-tested only!
Change-Id: Ic2667f445f6c2e22e279505f5ad435788b4b668c
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
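The two ways of knowing a key length without strlen() that this message mentions can be sketched in a few lines. The macro and function names are hypothetical, and the "brick%d.path" key format is made up for illustration. Note that sizeof on a string literal counts the trailing NUL, so the strlen()-equivalent length is one less.

```c
#include <assert.h>
#include <stdio.h>

/* Length of a string literal without its trailing NUL, known at compile
 * time. Only valid for literals and arrays, not for pointers. */
#define TOY_LITERAL_LEN(s) (sizeof(s) - 1)

/* Formats a key and returns its length, so the caller can pass the length
 * along instead of calling strlen() on the result. Returns -1 if the key
 * did not fit into the buffer. */
int toy_build_key(char *buf, size_t size, int index)
{
    assert(buf != NULL && size > 0);
    int len = snprintf(buf, size, "brick%d.path", index);
    if (len < 0 || (size_t)len >= size)
        return -1;
    return len;
}
```

For a literal key the call site would pass TOY_LITERAL_LEN("key"); for a formatted key it would pass the value returned by toy_build_key().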
|
Since the setxattr and removexattr fop cbks do not carry poststat,
the stat cache was being invalidated in the setxattr/removexattr cbk.
Hence subsequent lookups wouldn't be served from cache.
To prevent this invalidation, md-cache is modified to get
the poststat in set/removexattr_cbk in dict.
Co-authored with Xavi Hernandez.
Change-Id: I6b946be2d20b807e2578825743c25ba5927a60b4
fixes: bz#1586018
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
Signed-off-by: Poornima G <pgurusid@redhat.com>
|
Currently plugins for cloudsync will be using it to write back data
downloaded from remote store/cloud.
Change-Id: I59f10bebed21b19568c94cbf29e3d536d5570749
Updates: #387
Signed-off-by: Susant Palai <spalai@redhat.com>
|
In the past, it was often[1] forgotten for xlators to be linked against
the symbols they refer to. This often caused glusterd2 to fail while
loading xlator's shared object (.so) file.
This change adds "--no-undefined" as a linker flag which causes the
linker to treat unresolved symbol references as an error and hence fail
linking.
[1]:
https://review.gluster.org/#/c/19912/
https://review.gluster.org/#/c/19664/
https://review.gluster.org/#/c/19056/
https://review.gluster.org/#/c/17659/
https://bugzilla.redhat.com/show_bug.cgi?id=1532238
Bonus:
Added cloudsync and utime xlator's generated source files to .gitignore
Updates: bz#1193929
Change-Id: I9604a4a87b7313a5fa43bda5fdb37dfa7ef8facd
Signed-off-by: Prashanth Pai <ppai@redhat.com>
|
Right now there are two types of upcalls
* poll method
* registering callback
But a callback can be registered per fs, and the same callback fn shall be
used for any lease recall, with the object handle as argument, as is done
for cache invalidation.
TODO: RECALL LEASE for each glfd (for future reference)
(may be needed for Samba as they do not deal with
object handles).
In case of RECALL_LEASE, we could associate a separate
cbk function with each glfd either by
- extending pub_glfs_lease to accept new args (recall_cbk_fn, cookie)
- or by defining a new API "glfs_register_recall_cbk_fn (glfd, recall_cbk_fn, cookie)".
In such cases, flag it and, instead of calling the upcall functions below, define
a new one to go through the glfd list and invoke each of their recall_cbk_fn.
Also added the following:
* passed lease id to dict in required arguments
* added flag check in pub_glfs_open
Updates: #350
Change-Id: I07a971f0f26ec6aae0b9f9a5613504317dee153b
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Signed-off-by: Poornima G <pgurusid@redhat.com>
Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
|
Currently "gluster volume bitrot <volume name> scrub status"
gives the list of the corrupted objects (files as of now).
But only the gfids of those corrupted objects are shown, and
one has to run getfattr, find, etc. to get the actual
path of those objects for removal etc.
This change makes an attempt to print the path of those files
whenever possible.
* Try to get the path using the on disk gfid2path xattr.
* If the above operation fails, then go for the in memory path
(provided that the object has its dentry
properly created and linked in the inode table of the brick where
the corrupted object is present). So the gfid to path resolution is
a soft resolution, i.e. based on the inode and dentry cache in the
brick's memory. If the path cannot be obtained via the inode table
either, then only the gfid is printed.
Change-Id: Ie9a30307f43a49a2a9225821803c7d40d231de68
fixes: bz#1570962
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
|
Problem: glusterd2 build fails due to undefined symbols
(xlator_mem_cleanup, glusterfsd_ctx) in server.so
Solution: Make the following two changes
1) Move xlator_mem_cleanup code from glusterfsd-mgmt.c
to xlator.c to be part of libglusterfs.so
2) Replace glusterfsd_ctx with this->ctx because the symbol
glusterfsd_ctx is not part of server.so
BUG: 1544090
Change-Id: Ie5e6fba9ed458931d08eb0948d450aa962424ae5
fixes: bz#1544090
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
Problem: Sometimes the brick process crashes when a brick is
stopped while brick mux is enabled.
Solution: The brick process was crashing because the rpc connection
was not cleaned up properly while brick mux is enabled. With this patch,
after sending the GF_EVENT_CLEANUP notification, the xlator (server)
waits for all rpc client connections of the specific xlator to be
destroyed. Once the rpc connections of all clients associated with that
brick are destroyed in server_rpc_notify, xlator_mem_cleanup is called
for the brick xlator as well as all child xlators. To avoid races at the
time of cleanup, two new flags are introduced in each xlator:
cleanup_starting and call_cleanup.
BUG: 1544090
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
Note: Ran all test cases in a separate build (https://review.gluster.org/#/c/19700/)
with the same patch after forcefully enabling brick mux; all test cases
passed.
Change-Id: Ic4ab9c128df282d146cf1135640281fcb31997bf
updates: bz#1544090
|
Problem: If a lookup is done on a newly added brick for a path on which the
limit has been reached, the lookup fails to heal the directory tree due to quota.
Solution: Tag the lookup as an internal fop and ignore it in quota.
Since marking a fop as internal does not usually give enough contextual
information, introduce new flags to pass the contextual info.
Add dict_check_flag and dict_set_flag to aid flag operations.
A flag is a single bit in a bit array (currently limited to 256 bits).
Change-Id: Ifb6a68bcaffedd425dd0f01f7db24edd5394c095
fixes: bz#1505355
BUG: 1505355
Signed-off-by: Sanoj Unnikrishnan <sunnikri@redhat.com>
|
Problem: TLS verification fails while using an intermediate CA
if mgmt SSL is enabled.
Solution: There are two main reasons for the TLS verification failure
1) ssl_api is not called to set cert_depth
2) The current code does not allow the certificate depth to be set
while MGMT SSL is enabled.
After applying this patch, to set the certificate depth the user
needs to set the option transport.socket.ssl-cert-depth <depth>
in /var/lib/glusterd/secure-access instead of setting it in
/etc/glusterfs/glusterd.vol. At the time of setting secure_mgmt in ctx
we check the value of cert-depth and save it in ctx. If the user does
not provide a value for cert-depth, the default value of 1 is used.
BUG: 1555154
Change-Id: I89e9a9e1026e37efb5c20f9ec62b1989ef644f35
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
In a mixed mode cluster involving 4.0 and older 3.x bricks, if
clients are newer, then the iatt encoded in the dictionary can be
of the older iatt format, which a newer client will map incorrectly
to the newer structure.
This causes failures in FOPs that depend on this iatt for some
functionality (seen in mkdir operations failing as EIO, when DHT
hits its internal setxattr call).
The fix provided is to convert the iatt in the dict, based on which
RPC version is used to communicate with the server.
IOW, this is the reverse of the change in commit "b966c7790e".
Tested using a mixed mode cluster (i.e. bricks in 3.12 and 4.0 versions)
and a mixed set of clients, 3.12 and 4.0 clients.
There is no regression test provided, as this needs a mixed mode cluster
to test and validate.
Change-Id: I454e54651ca836b9f7c28f45f51d5956106aefa9
BUG: 1554053
Signed-off-by: ShyamsundarR <srangana@redhat.com>
|
The following release-3.8-fb branch patch is upstreamed:
> features/namespace: Add namespace xlator and link into brick graph
> Commit ID: dbd30776f26e
> https://review.gluster.org/#/c/18041/
> By Michael Goulet <mgoulet@fb.com>
Changes in this patch:
Removes extra config.h and namespace.h file in namespace.c
Adds default_getspec_cbk to libglusterfs.sym
Rename dict_for_each to dict_foreach_inline
Remove fd.h header file stack.h
Add test case for truncate, open and symlink
This patch is required to forward port io-threads namespace patch.
Updates: #401
Change-Id: Ib88c95b89eecee9b8957df8a4c8712c899c761d1
Signed-off-by: Varsha Rao <varao@redhat.com>
|
Added a volume option 'fips-mode-rchecksum' tied to op version 4.
If not set, rchecksum fop will use MD5 instead of SHA256.
updates: #230
Change-Id: Id8ea1303777e6450852c0bc25503cda341a6aec2
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
Clients will request a list of volfile servers from glusterd2 by
setting an (optional) flag in the GETSPEC RPC call. glusterd2 will check
for the presence of this flag and accordingly return a list of glusterd2
servers in the GETSPEC RPC reply. Currently, the list of servers returned
only contains servers which have bricks belonging to the volume.
See:
https://github.com/gluster/glusterd2/issues/382
https://github.com/gluster/glusterfs/issues/351
Updates #351
Change-Id: I0eee3d0bf25a87627e562380ef73063926a16b81
Signed-off-by: Prashanth Pai <ppai@redhat.com>
|
Change-Id: Idd86b9f0fa144c2316ab6276e2def28b696ae18a
BUG: 1543279
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
|
Updates #353
Change-Id: I8a30b53a52618c6a6c740d2c67b19e5322ce4ddb
Signed-off-by: Poornima G <pgurusid@redhat.com>
|
Added the APIs glfs_h_lease() and glfs_lease() so that gfapi applications
can set and get leases, which enables more efficient client-side caching.
Updates: #350
Change-Id: Iede85be9af1d4df969b890d0937ed0afa4ca6596
Signed-off-by: Poornima G <pgurusid@redhat.com>
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
|
Currently, the list of xattrs that md-cache can cache is hard coded
in the md-cache.c file. This necessitates a code change and rebuild
every time a new xattr needs to be added to the md-cache xattr cache
list.
With this patch, the user will be able to configure a comma
separated list of xattrs to be cached by md-cache.
Updates #297
Change-Id: Ie35ed607d17182d53f6bb6e6c6563ac52bc3132e
Signed-off-by: Poornima G <pgurusid@redhat.com>
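How a comma separated option value could be turned into a lookup list is sketched below. This is not the md-cache implementation: the toy_* names, the fixed limit of 16 entries and the linear search are all assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX_XATTRS 16

typedef struct toy_xattr_list {
    char *storage;                     /* private copy of the option value */
    const char *names[TOY_MAX_XATTRS]; /* point into storage */
    size_t count;
} toy_xattr_list_t;

void toy_xattr_list_free(toy_xattr_list_t *l)
{
    free(l->storage);
    memset(l, 0, sizeof(*l));
}

/* Splits "a,b,c" into names; empty items are skipped. Returns 0 on success,
 * -1 on allocation failure or if there are too many names. */
int toy_xattr_list_parse(toy_xattr_list_t *l, const char *csv)
{
    assert(l != NULL && csv != NULL);
    memset(l, 0, sizeof(*l));
    size_t size = strlen(csv) + 1;
    l->storage = malloc(size);
    if (l->storage == NULL)
        return -1;
    memcpy(l->storage, csv, size);

    char *save = NULL;
    for (char *tok = strtok_r(l->storage, ",", &save); tok != NULL;
         tok = strtok_r(NULL, ",", &save)) {
        if (l->count == TOY_MAX_XATTRS) {
            toy_xattr_list_free(l);
            return -1;
        }
        l->names[l->count++] = tok;
    }
    return 0;
}

/* Returns 1 if the xattr name is in the configured list, 0 otherwise. */
int toy_xattr_is_cached(const toy_xattr_list_t *l, const char *name)
{
    for (size_t i = 0; i < l->count; i++)
        if (strcmp(l->names[i], name) == 0)
            return 1;
    return 0;
}
```

Parsing once when the option is set and searching the parsed list per request keeps string splitting out of the hot path.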
|
A new function glfs_setfsleaseid() is added in gfapi. Currently the lock
owner is saved in the thread context. Similarly, the leaseid attribute can
be saved using glfs_setfsleaseid().
Updates: #350
Change-Id: I55966cca01d0f2649c32b87bd255568c3ffd1262
Signed-off-by: Poornima G <pgurusid@redhat.com>
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
|
This new type helps to avoid excessive logs. It should be
set only in case of
* volume graph building (graph.y)
* dict unserialize
(happens once a dictionary is received on wire in old protocol)
All other dict set and get should have proper check and warning
logs if there is a mismatch.
updates #220
Change-Id: I1cccb304a877aa80c07aaac95f10f5005e35b9c5
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
Added 2 more types which are present in gluster codebase, mainly
IATT and UUID.
Updates #203
Change-Id: Ib6d6d6aefb88c3494fbf93dcbe08d9979484968f
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
minimize risk of symbol collisions in global namespace.
see https://review.gluster.org/#/c/5697/ which Amar has
resurrected.
This is a strawman proposal to use an export-list to
only export the necessary symbols from libglusterfs. I suppose
some of this could be fixed by smarter use of static in the
function definitions.
It's a bit scary to see some of the names we expose. And then
there are the names we use in the reserved namespace.
One step short of going all the way to symbol versions
fixes gluster/glusterfs#382
Change-Id: Ifb848dfc655ef735dd27c73b7729e1188eb817f1
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
|