| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
icreate creates inode, while namelink links the basename to it's
parent gfid.
For now mkdir is the primary user of these fops. Better distribution is
acheived by creating the inode on ,(say) mds1 and linking the basename to it's
parent gfid on mds2. The inode serves readdirp, stat etc.
More details about the fops are present at:
https://review.gluster.org/#/c/13395/3/design/DHT2/DHT2_Icreate_Namelink_Notes.md
This backport of three patches from experimental branch.
1- https://review.gluster.org/#/c/18085/
2- https://review.gluster.org/#/c/18086/
3- https://review.gluster.org/#/c/18094/
Updates gluster/glusterfs#243
Change-Id: I1bd3d5a441a3cfab1acfeb52f15c6c867d362592
Signed-off-by: Susant Palai <spalai@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: It had been a longtime request to implement put fop
in gluster. put fop in gluster may not have the exact sementics
of HTTP PUT, but can be easily extended to do so. The subsequent
patches, will contain more semantics on the put fop and its
guarentees.
Why compound fop framework is not used for put?
Compound fop framework currently doesn't allow compounding of
entry fop and inode fops, i.e. fops on multiple inodes cannot be
combined in compound fop.
Updates #353
Change-Id: Idb7891b3e056d46d570bb7e31bad1b6a28656ada
Signed-off-by: Poornima G <pgurusid@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
For achieving the above, needed below changes too.
* more sanity into how 'frame->op' is assigned.
* infra to have 'stats' as separate section in 'xlator_t' structure
Updates #137
Change-Id: I36679bf9577f3ed00a695b4e7d92870dcb3db8e1
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In EC and AFR, we launch synctasks during self-heal.
(i) These tasks usually stackwind a FOP to all its children and call
synctask_yield() which does a swapcontext to synctask_switchto() and puts the
task in syncenv's waitq by calling __wait(task). This happends as long as the
FOP ckbs from all children haven't been received.
(ii) For each FOP cbk, we call synctask_wake() which again does a swapcontext
to synctask_switchto() which now puts the task in syncenv's runq by calling
__run(task). When the task runs and the conext switches back to the FOP path,
it puts the task in waitq because we haven't heard from all children as
explained in (i).
Thus we are unnecessarily using the swapcontext syscalls to just toggle
the task back and forth between the waitq and runq.
Fix:
Store the stackwind count in new variable 'syncbarrier->waitfor' before
winding the fop. In each cbk when we call synctask_wake(), perform an actual
wake only if the cbk count == stackwind count.
Change-Id: Id62d3b6ffed5a8c50f8b79267fb34e9470ba5ed5
BUG: 1434274
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: https://review.gluster.org/16931
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Issue:
md-cache caches a specified list of xattrs, and when cache invalidation
is enabled, it makes sense to recieve invalidation only when those xattrs
are modified by other clients. But the current implementation of upcall
is that, it will send invalidation when any of the on-disk xattrs is modified.
Solution:
md-cache sends a list of xattrs that it is interested in, to upcall by
issuing an ipc(). The challenge here is to make sure everytime a brick
goes offline and comes back up, the ipc() needs to be issued to the
bricks. Hence ipc() is sent from md-cache every time there is a
CHILD_UP/CHILD_MODIFIED event.
TODO:
There will be patches following, in cluster xlators, to implement ipc fop.
Change-Id: I6efcf3df474f5ce6eabd3d6694c00c7bd89bc25d
BUG: 1211863
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/15002
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Prashanth Pai <ppai@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ic2ba77a1fdd27801a6e579e04e6c0dd93cd7127b
BUG: 1326085
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/14011
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ifd0ff278dcf43da064021f5c25e5dcd34347fcde
BUG: 1326085
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/13970
Smoke: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ia27d66b1061b0377857827515590eb89b18515c9
BUG: 1319992
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/11596
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add the new seek() FOP to the syncop framework. gfapi will use this in
the future.
Change-Id: I0c15153beb27de73d5844b6f692175750fc28f60
BUG: 1220173
Singed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/11481
Smoke: Gluster Build System <jenkins@build.gluster.com>
Tested-by: Niels de Vos <ndevos@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Include iatt to 'syncop_link' args to fetch proper attributes of
the newly linked inode.
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Change-Id: If6b92961bd7a89add3791ed3a9b494087348b492
BUG: 1241788
Reviewed-on: http://review.gluster.org/11611
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Default stacksize that synctask uses is 2M.
For marker we set it to 16k
Also move market xlator close to io-threads
to have smaller stack
Change-Id: I8730132a6365cc9e242a3564a1e615d94ef2c651
BUG: 1207735
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/11499
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of including config.h in each file, and have the additional
config.h included from the compiler commandline (-include option).
When a .c file tests for a certain #define, and config.h was not
included, incorrect assumtions were made. With this change, it can not
happen again.
BUG: 1222319
Change-Id: I4f9097b8740b81ecfe8b218d52ca50361f74cb64
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/10808
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem : In glusterd,we are using big lock which is implemented based on sync
task frame work for thread synchronization and rcu lock for data consistency.
sync task frame work swap the threads if there is no worker poll threads
available,due to this rcu lock and rcu unlock was happening in different threads
(urcu-bp will not allow this),resulting into glusterd crash.
fix : To avoid releasing the sync lock(big lock) in between rcu critical
section,implemented sync lock as recursive lock.
More details:
link : http://www.spinics.net/lists/gluster-devel/msg14632.html
Change-Id: I2b56c1caf3f0470f219b1adcaf62cce29cdc6b88
BUG: 1211640
Signed-off-by: anand <anekkunt@redhat.com>
Reviewed-on: http://review.gluster.org/10285
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ifc7937ceb451f6e11e40a9513017226fd0f115b0
BUG: 1215265
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/10382
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for xdata in both the
request and response path of syncops.
Few calls like lookup already had the support;
have renamed variables in few places to maintain
uniformity.
xdata passed downwards is known as xdata_in
and xdata passed upwards is known as xdata_out.
There is an old patch by Jeff Darcy at
http://review.gluster.org/#/c/8769/3 which does the
same for some selected calls. It also brings in
xdata support at gfapi level.
xdata support at gfapi level would be introduced
in subsequent patches.
Change-Id: I340e94ebaf2a38e160e65bc30732e8fe1c532dcc
BUG: 1158621
Signed-off-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-on: http://review.gluster.org/9859
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, the only way to retrieve the number of files/objects in a
directory or volume is to do a crawl of the entire directory/volume.
This is expensive and is not scalable.
The new mechanism proposes to store count of objects/files as part of
an extended attribute of a directory. Each directory's extended
attribute value will indicate the number of files/objects present
in a tree with the directory being considered as the root of the tree.
Currently file usage is accounted in marker by doing multiple FOPs
like setting and getting xattrs. Doing this with STACK WIND and
UNWIND can be harder to debug as involves multiple callbacks.
In this code we are replacing current mechanism with syncop approach
as syncop code is much simpler to follow and help us implement inode
quota in an organized way.
Change-Id: Ibf366fbe07037284e89a241ddaff7750fc8771b4
BUG: 1188636
Signed-off-by: vmallika <vmallika@redhat.com>
Signed-off-by: Sachin Pandit <spandit@redhat.com>
Reviewed-on: http://review.gluster.org/9567
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Several features - e.g. encryption, erasure codes, or NSR - involve
multiple cooperating translators which sometimes need a "private" means
of communication amongst themselves. Historically we've used virtual or
synthetic xattrs, but that's not very elegant and clutters up the
getxattr/setxattr path which must also handle real xattr requests. This
new fop should address that.
The only argument is an int32_t "op" which should be recognized by the
target translator. It is recommended that translators using these
feature follow some convention regarding the ops that they define, to
avoid conflicts. Using a hash of the target translator's type string as
a base for a series of ops would probably be a good start. Any other
information can be passed in both directions using xdata.
The default behavior for this fop, as with any other, is to pass through
to FIRST_CHILD. That makes use of this fop "transparent" to other
translators that were written before it existed, but it also means that
it only really works with pass-through translators. If a routing
translator (such as DHT) or a fan-out translator (such as AFR) is
involved, the IPC might not reach its intended destination unless those
translators are modified to forward IPC fops along all paths.
If an IPC gets all the way to storage/posix it is considered an error,
much like an uncaught exception. We don't actually *do* anything in
that case, but we do log it send back an EOPNOTSUPP error. This makes
the "unrecognized opcode" condition distinguishable from the "no IPC
support" condition (which would yield an RPC error instead) so clients
can probe for the presence of a handler for their own favorite opcode
and either use that or use old-school xattrs depending on the result.
BUG: 1158628
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Change-Id: I84af1b17babe5b30ec03ecf027ae37d09b873968
Reviewed-on: http://review.gluster.org/8812
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
syncenv structures
Change-Id: I28020eb2fc08d886cd7c05ff96daf7ebb4264ffe
BUG: 1093594
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/9693
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These will be used by both afr and ec. Moved syncop_dirfd, syncop_ftw,
syncop_dir_scan functions also into syncop-utils.c
Change-Id: I467253c74a346e1e292d36a8c1a035775c3aa670
BUG: 1177601
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/9740
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-by: Anuradha Talur <atalur@redhat.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This support can be used by the clients using SYNCOP framework,
to pass unique owners for various locks taken on a file, so that
the glusterfs-server can treat them as being locks from different owners.
Change-Id: Ie88014053af40fc7913ad6c1f7730d54cc44ddab
BUG: 1186713
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Reviewed-on: http://review.gluster.org/9482
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ftw provides file tree walk.
dir_scan does just a readdir not readdirp.
Also changed Afr's self-heal-daemon's crawling functions to use this.
These utils will be used by ec in future to do proactive/full healing.
Change-Id: I05715ddb789592c1b79a71e98f1e8cc29aac5c26
BUG: 1177601
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/9485
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Pass xdata dict to syncop_(f)getxattr calls.
This patch [1/3] is required as a part of afr automated split-brain resolution
implementation.
Change-Id: I3970b3dd6daf64681a031e37f8e9afb14fb3d668
BUG: 1136769
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/9375
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
quota_statfs() returns aggregated details of space usage
of bricks this causes distribute to be confused during
``rebalance``, where ``statfs()`` values are used to
schedule file migration.
We can make sure the values of ``statfs`` are from
individual bricks by selectively instructing
``quota_statfs()`` to return non aggregated values.
Change-Id: I1397faeee66a1b9c26709cfda693286d227a4170
BUG: 1158262
Signed-off-by: Harshavardhana <harsha@harshavardhana.net>
Reviewed-on: http://review.gluster.org/8996
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The backtrace is 'saved' in a per-task buffer.
This would come handy while debugging code using
synctasks.
Change-Id: I732b275f6d15b31f31361f5ecf2ba47cacde9b54
BUG: 1138503
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/8622
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Iea489157490b70cb2bb03576b0d4943c6d8f052d
BUG: 1130888
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/8522
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Break-way from '/var/lib/glusterd' hard-coded previously,
instead rely on 'configure' value from 'localstatedir'
- Provide 's/lib/db' as default working directory for gluster
management daemon for BSD and Darwin based installations
- loff_t is really off_t on Darwin
- fix-off the warnings generated by clang on FreeBSD/Darwin
- Now 'tests/*' use GLUSTERD_WORKDIR a common variable for all
platforms.
- Define proper environment for running tests, define correct PATH
and LD_LIBRARY_PATH when running tests, so that the desired version
of glusterfs is used, regardless where it is installed.
(Thanks to manu@netbsd.org for this additional work)
Change-Id: I2339a0d9275de5939ccad3e52b535598064a35e7
BUG: 1111774
Signed-off-by: Harshavardhana <harsha@harshavardhana.net>
Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-on: http://review.gluster.org/8246
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
synclock_destory() has been prototyped in syncop.h,
how-ever synclock_destroy() is the actual function used in syncop.c.
Correcting this function definition along with few typos.
Change-Id: I35a818190c1d37c303279ca7a820f01895751bd9
BUG: 1075417
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Reviewed-on: http://review.gluster.org/8266
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-by: soumya k <skoduri@redhat.com>
Reviewed-by: Poornima G <pgurusid@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
AFR self-heal needs to issue syncops with special PID. Extend
the custom UID/GID support to include custom PIDs
Change-Id: I736c0e177f862b029f203acc87f9eb46c8cb839b
BUG: 1021686
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/6888
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
To be used in afr metadata self-heal
Change-Id: I8dac4b19d61e331702427eeb5b606aab3d20b328
BUG: 1021686
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/6941
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I9b73c1db728e4cb3948fc118cceb292b21d48b96
BUG: 1021686
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/6112
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
glfs_zerofill() can be potentially called to zero-out entire file and
hence allow for bigger value of length parameter.
Change-Id: I75f1d11af298915049a3f3a7cb3890a2d72fca63
BUG: 1028673
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-on: http://review.gluster.org/6266
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: M. Mohan Kumar <mohan@in.ibm.com>
Tested-by: M. Mohan Kumar <mohan@in.ibm.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for a new ZEROFILL fop. Zerofill writes zeroes to a file in
the specified range. This fop will be useful when a whole file needs to
be initialized with zero (could be useful for zero filled VM disk image
provisioning or during scrubbing of VM disk images).
Client/application can issue this FOP for zeroing out. Gluster server
will zero out required range of bytes ie server offloaded zeroing. In
the absence of this fop, client/application has to repetitively issue
write (zero) fop to the server, which is very inefficient method because
of the overheads involved in RPC calls and acknowledgements.
WRITESAME is a SCSI T10 command that takes a block of data as input and
writes the same data to other blocks and this write is handled
completely within the storage and hence is known as offload . Linux ,now
has support for SCSI WRITESAME command which is exposed to the user in
the form of BLKZEROOUT ioctl. BD Xlator can exploit BLKZEROOUT ioctl to
implement this fop. Thus zeroing out operations can be completely
offloaded to the storage device , making it highly efficient.
The fop takes two arguments offset and size. It zeroes out 'size' number
of bytes in an opened file starting from 'offset' position.
This patch adds zerofill support to the following areas:
- libglusterfs
- io-stats
- performance/md-cache,open-behind
- quota
- cluster/afr,dht,stripe
- rpc/xdr
- protocol/client,server
- io-threads
- marker
- storage/posix
- libgfapi
Client applications can exloit this fop by using glfs_zerofill introduced in
libgfapi.FUSE support to this fop has not been added as there is no system call
for this fop.
Changes from previous version 3:
* Removed redundant memory failure log messages
Changes from previous version 2:
* Rebased and fixed build error
Changes from previous version 1:
* Rebased for latest master
TODO :
* Add zerofill support to trace xlator
* Expose zerofill capability as part of gluster volume info
Here is a performance comparison of server offloaded zeofill vs zeroing
out using repeated writes.
[root@llmvm02 remote]# time ./offloaded aakash-test log 20
real 3m34.155s
user 0m0.018s
sys 0m0.040s
[root@llmvm02 remote]# time ./manually aakash-test log 20
real 4m23.043s
user 0m2.197s
sys 0m14.457s
[root@llmvm02 remote]# time ./offloaded aakash-test log 25;
real 4m28.363s
user 0m0.021s
sys 0m0.025s
[root@llmvm02 remote]# time ./manually aakash-test log 25
real 5m34.278s
user 0m2.957s
sys 0m18.808s
The argument log is a file which we want to set for logging purpose and
the third argument is size in GB .
As we can see there is a performance improvement of around 20% with this
fop.
Change-Id: I081159f5f7edde0ddb78169fb4c21c776ec91a18
BUG: 1028673
Signed-off-by: Aakash Lal Das <aakash@linux.vnet.ibm.com>
Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Reviewed-on: http://review.gluster.org/5327
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is an ongoing effort to integrate NFS Ganesha (
https://github.com/nfs-ganesha/nfs-ganesha/wiki ) with GlusterFS as one of
the file system back ends.
Towards this we need extensions to gfapi that can handle object based
operations. Meaning, instead of using full paths or relative paths from
cwd, it is required that we can work with APIs, like the *at POSIX
variants, to be able to create, lookup, open etc. files and directories.
Hence the objects are the files or directories themselves and we give out
handles to these objects that can be used for further operations.
This code drop is an initial implementation of the proposed APIs.
The new APIs are implemented as glfs_h_XXX variants in the file
glfs-handleops.c to mirror glfs-fops.c style. The code leverages holding
onto inode references and doling these out as opaque/cookie type objects to
the callers, to enable them to be used as handles in other operations.
An fd based approach was considered, but due to the extra footprint that
the fd structure and its counterparts would incur, this was dropped to take
the approach of holding inode references themselves.
Tested by extending glfsxmp.c to invoke and exercise the added APIs, and
further tested with a reference integration of the same as an FSAL with NFS
Ganesha.
Change-Id: I23629c99e905b54070fa2e6565147812e5f3fa5d
BUG: 1016000
Signed-off-by: R.Shyamsundar <srangana@redhat.com>
Reviewed-on: http://review.gluster.org/5936
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Enhance syncenv_new() to accept scaling parameters of syncproc.
Previously the scaling parameters were hardcoded and decided at
compile time.
- New API synctask_create() which returns the created synctask. This
is similar to synctask_new which only returned the status of whether
a synctask could be created or not.
The meaning of NULL cbk in synctask_create() means the task is
"joinable". Until synctask_join() is called on such a synctask,
the task is not reaped and resources are not destroyed. The
task would be in a zombie state after synctask_fn returns and
before synctask_join() is called.
Change-Id: I368ec9037de9510d2ba951f0aad86aaf18d9a6b6
BUG: 986775
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/5365
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make the allocation of groups dynamic and increase the limit
to 65536.
Change-Id: I702364ff460e3a982e44ccbcb3e337cac9c2df51
BUG: 953694
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/5111
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for the DISCARD file operation. Discard punches a hole
in a file in the provided range. Block de-allocation is implemented
via fallocate() (as requested via fuse and passed on to the brick
fs) but a separate fop is created within gluster to emphasize the
fact that discard changes file data (the discarded region is
replaced with zeroes) and must invalidate caches where appropriate.
BUG: 963678
Change-Id: I34633a0bfff2187afeab4292a15f3cc9adf261af
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.org/5090
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement support for the fallocate file operation. fallocate
allocates blocks for a particular inode such that future writes
to the associated region of the file are guaranteed not to fail
with ENOSPC.
This patch adds fallocate support to the following areas:
- libglusterfs
- mount/fuse
- io-stats
- performance/md-cache,open-behind
- quota
- cluster/afr,dht,stripe
- rpc/xdr
- protocol/client,server
- io-threads
- marker
- storage/posix
- libgfapi
BUG: 949242
Change-Id: Ice8e61351f9d6115c5df68768bc844abbf0ce8bd
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.org/4969
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Do not let inode linking to happen only in lookup(). While
that works, it is inefficient.
Change-Id: I51bbfb6255ec4324ab17ff00566375f49d120c06
BUG: 953694
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/4931
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I7731fd33ca0c925cc52f8d105275b44fc625a1e2
BUG: 948686
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/5058
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Earlier, SYNCOP macro, the only consumer of synctask_yield, would set
the task->state to SYNCTASK_SUSPEND. Today, we have glusterd having its
own wrapper macros which don't set task's state. There is also the
syncbarrier and synclock framework, which also participate in a
synctask's scheduling (and need to keep a task's state up to date). It
only makes more sense to leave a synctask's state to the synctask
library, since its an internal affair.
* Need to 'yawn' before 'yield' to avoid re-running tasks to set
task->woken appropriately.
Change-Id: Ic7a59e6ebcc46f03e53223ca237668d45a3cba40
BUG: 948686
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/4985
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the current implementation, barriers are in the core of the
syncprocessors. Wake()s are treated as syncbarrier wake. This
is however delicate, as spurious wake()s of the synctask can
mess up the accounting of the barrier and waking it prematurely.
The fix is to keep yield() and wake() as the basic primitives,
and implement barriers as an object impelemented on top of these
primitives. This way, only an explicit barrier_wake() gets
counted towards the barrier accounting, and spurious wakes
will be truly safe.
Change-Id: I8087f0f446113e5b2d0853431c0354335ccda076
BUG: 948686
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/4921
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I37d9e1fb4a715094876be6af3856c1b4cf398021
BUG: 953694
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/4881
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Inherit the pid/euid/egid/groups of the running process in the
frame. Do this only in cases where a loaded frame was not
presented to the synctask.
This behavior is required for Samba VFS.
Change-Id: Ib181c90f47c6741197b9ce9f67a19e2914b647d2
BUG: 953694
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/4878
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch introduces a synclocks - co-operative locks for synctasks.
Synctasks yield themselves when a lock cannot be acquired at the time
of the lock call, and the unlocker will wake the yielded locker at
the time of unlock.
The implementation is safe in a multi-threaded syncenv framework.
It is also safe for sharing the lock between non-synctasks. i.e, the
same lock can be used for synchronization between a synctask and
a regular thread. In such a situation, waiting synctasks will yield
themselves while non-synctasks will sleep on a cond variable. The
unlocker (which could be either a synctask or a regular thread) will
wake up any type of lock waiter (synctask or regular).
Usage:
Declaration and Initialization
------------------------------
synclock_t lock;
ret = synclock_init (&lock);
if (ret) {
/* lock could not be allocated */
}
Locking and non-blocking lock attempt
-------------------------------------
ret = synclock_trylock (&lock);
if (ret && (errno == EBUSY)) {
/* lock is held by someone else */
return;
}
synclock_lock (&lock);
{
/* critical section */
}
synclock_unlock (&lock);
Change-Id: I081873edb536ddde69a20f4a7dc6558ebf19f5b2
BUG: 763820
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/4717
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <raghavendra@gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ibd963f78707b157fc4c9729aa87206cfd5ecfe81
BUG: 913662
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/4570
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch introduces a new set of primitives:
- synctask_barrier_init (stub)
- synctask_barrier_waitfor (stub, count)
- synctask_barrier_wake (stub)
Unlike pthread_barrier_t, this barrier has an explicit notion of
"waiter" and "waker". The "waiter" waits for @count number of
"wakers" to call synctask_barrier_wake() before returning. The
wait performed by the waiter via synctask_barrier_waitfor() is
co-operative in nature and yields the thread for scheduling other
synctasks in the mean time.
Intended use case:
Eliminate excessive serialization in glusterd and allow for
concurrent RPC transactions.
Code which are currently in this format:
---old---
list_for_each_entry (peerinfo, peers, op_peers_list) {
...
GD_SYNCOP (peerinfo->rpc, stub, rpc_cbk, ...);
}
...
int rpc_cbk (rpc, stub, ...)
{
...
__wake (stub);
}
---old---
Can be restructred into the format:
---new---
synctask_barrier_init (stub);
{
list_for_each_entry (peerinfo, peers, op_peers_list) {
...
rpc_submit (peerinfo->rpc, stub, rpc_cbk, ...);
count++;
}
}
synctask_barrier_wait (stub, count);
...
int rpc_cbk (rpc, stub, ...)
{
...
synctask_barrier_wake (stub);
}
---new---
In the above structure, from the synctask's point of view, the region
between synctask_barrier_init() and synctask_barrier_wait() are spawning
off asynchronous "threads" (or RPC) and keep count of how many such
threads have been spawned. Each of those threads are expected to make
one call to synctask_barrier_wake(). The call to synctask_barrier_wait()
makes the synctask thread co-operatively wait/sleep till @count such threads
call their wake function.
This way, the synctask thread retains the "synchronous" flow in the code,
yet at the same time allows for asynchronous "threads" to acheive parallelism
over RPC.
Change-Id: Ie037f99b2d306b71e63e3a56353daec06fb0bf41
BUG: 913662
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/4558
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I90e496b5d5027ac702ab3804ba52f26d537812a0
BUG: 764890
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/4554
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
synctasks can now call SYNCTASK_SETID(uid,gid) to set the effective
uid/gid of the frame with which the FOP will be performed.
Once called, the uid/gid is set either till the end of the synctask
or till the next call of SYNCTASK_SETID()
Change-Id: I7eb74f7c473099bcae39310d2ab353d58f8eb2ba
BUG: 884597
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/4269
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Shishir Gowda <sgowda@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
make syncop_symlink() accept 'const char *linkname' instead of
'char *linkname'
Change-Id: I7751d552e4a4cc6e8b8e587b9e520213f4e11b45
BUG: 839950
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/4020
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- syncop_mkdir()
- syncop_rmdir()
- syncop_rename()
Change-Id: I177db0f9af7c99fc6645d59521c8fb82f73812ca
BUG: 839950
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/4019
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
|