| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I208289aae2423e4bb015cf33bafd2a961e1c3fc6
BUG: 1197593
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/9779
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Sachin Pandit <spandit@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CID : 1124884
Change-Id: I3332e844a01c1432f1d80a6acda7a87e8b01801c
BUG: 789278
Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com>
Reviewed-on: http://review.gluster.org/9677
Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This also lists the files that are on-going I/O, which
will be fixed later.
Change-Id: Ib3f60a8b7e8798d068658cf38eaef2a904f9e327
BUG: 1203581
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/10020
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1) Query fix in find_changed_with_freq()
2) Volume option typo fix for write_freq_threshold
and read_freq_threshold
Change-Id: I38e154818178aab412b2d7b2914cd29acef66ffb
BUG: 1207343
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/10050
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
We've observed that glusterd was OOM killed after some minutes when volume set
command was run in a loop.
Analysis:
Initially the suspection was in glusterd code, but a deep dive into the codebase
revealed that while validating all the options as part of graph reconfiguration
at the time of freeing up the xlator object its one of the member mem_acct is
left over which causes memory leak.
Solution:
Free up xlator's mem_acct.rec in xlator_destroy ()
Change-Id: Ie9e7267e1ac4ab7b8af6e4d7c6660dfe99b4d641
BUG: 1201203
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-on: http://review.gluster.org/9862
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Debugging where memory gets free'd with help from overwriting the memory
before it is free'd with some structures (repeatedly). The struct
mem_invalid starts with a magic value (0xdeadc0de), followed by a
pointer to the xlator, the mem-type. the size of the GF_?ALLOC()
requested area and the baseaddr pointer to what GF_?ALLOC() returned.
With these details, and the 'struct mem_header' that is placed when
calling GF_?ALLOC(), it is possible to identify overruns and possible
use-after-free. A memory dump (core) or running with a debugger is
needed to read the surrounding memory of corrupt structures.
This additional memory invalidation/poisoning needs to be enabled by
passing --enable-debug to ./configure.
Change-Id: I9f5f37dc4b5b59142adefc90897d32e89be67b82
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/10019
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Current glfs_new() function is not flexible enough to error out
and destroy the struct members or objects initialized just before the
error path/condition. This make the structs or objects to continue or
left out with partially recorded data in fs and ctx structs and cause
crashes/issues later in the code path. This patch avoid the issue.
Change-Id: Ie4514b82b24723a46681cc7832a08870afc0cb28
BUG: 1202492
Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com>
Reviewed-on: http://review.gluster.org/9903
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Poornima G <pgurusid@redhat.com>
Reviewed-by: soumya k <skoduri@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
http://review.gluster.org/9269 addresses maintaining local xaction_peers in
syncop and mgmt_v3 framework. This patch is to maintain local xaction_peers list
for op-sm framework as well.
Change-Id: Idd8484463fed196b3b18c2df7f550a3302c6e138
BUG: 1204727
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-on: http://review.gluster.org/9972
Reviewed-by: Anand Nekkunti <anekkunt@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Scrubber performs signature verification for objects that were
signed by signer. This is done by recalculating the signature
(using the hash algorithm the object was signed with) and
verifying it aginst the objects persisted signature. Since the
object could be undergoing IO opretaion at the time of hash
calculation, the signature may not match objects persisted
signature. Bitrot stub provides additional information about
the stalesness of an objects signature (determinted by it's
versioning mechanism). This additional bit of information is
used by scrubber to determine the staleness of the signature,
and in such cases the object is skipped verification (although
signature staleness is performed twice: once before initiation
of hash calculation and another after it (an object could be
modified after staleness checks).
The implmentation is a part of the bitrot xlator (signer) which
acts as a signer or scrubber based on a translator option. As
of now the scrub process is ever running (but has some form of
weak throttling mechanism during filesystem scan). Going forward,
there needs to be some form of scrub scheduling and IO throttling
(during hash calculation) tunables (via CLI).
Change-Id: I665ce90208f6074b98c5a1dd841ce776627cc6f9
BUG: 1170075
Original-Author: Raghavendra Bhat <rabhat@redhat.com>
Original-Author: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9914
Tested-by: Vijay Bellur <vbellur@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the "Signer" -- responsible for signing files with their
checksums upon last file descriptor close (last release()).
The event notification facility provided by the changelog xlator
is made use of.
Moreover, checksums are as of now SHA256 hash of the object data
and is the only available hash at this point of time. Therefore,
there is no special "what hash to use" type check, although it's
does not take much to add various hashing algorithms to sign
objects with. Signatures are stored in extended attributes of the
objects along with the the type of hashing used to calculate the
signature. This makes thing future proof when other hash types
are added. The signature infrastructure is provided by bitrot
stub: a little piece of code that sits over the POSIX xlator
providing interfaces to "get or set" objects signature and it's
staleness.
Since objects are signed upon receiving release() notification,
pre-existing data which are "never" modified would never be
signed. To counter this, an initial crawler thread is spawned
The crawler scans the entire brick for objects that are unsigned
or "missed" signing due to the server going offline (node reboots,
crashes, etc..) and triggers an explicit sign. This would also
sign objects when bit-rot is enabled for a volume and/or after
upgrade.
Change-Id: I1d9a98bee6cad1c39c35c53c8fb0fc4bad2bf67b
BUG: 1170075
Original-Author: Raghavendra Bhat <raghavendra@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9711
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: If54f4be7db8b6f98e65570b09c07251e21ebae15
BUG: 1194640
Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com>
Reviewed-on: http://review.gluster.org/9837
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bitrot stub implements object versioning required for identifying
signature freshness. More details about versioning is explained
as a part of the "bitrot feature documentation" patch.
Change-Id: I2ad70d9eb109ba4a12148ab8d81336afda529ad9
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9709
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Certain translators may require to update the inode context
of an already linked inode before unwinding the call to the
client. Normally, such a case in encountered during parallel
operations when a fresh inode is chosen at call (wind) time.
In the callback path, one of inodes is successfully linked
in the inode table, thereby the other inodes being thrown
away (and the inode pointers for these calls being pointed
to the linked inode).
Translators which may have strict dependency on the correct
value in the inode context would get stale values in inode
context. This patch introduces a new callback which provides
gives translators an opportunity to "patch" their respective
inode contexts. Note that, as of now, this callback is only
invoked during create()s unwind path. Although this might
needed to be done for all dentry fops and lookup, but let
that be done as an when required (bitrot stub requires
this *only* for create()).
Change-Id: I6cd91c2af473c44d1511208060d3978e580c67a6
BUG: 1170075
Original-Author: Raghavendra Bhat <rabhat@redhat.com>
Original-Author: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9913
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Translators which wish to send event notifications can send
"down" an IPC FOP with op_type as GF_IPC_TARGET_CHANGELOG
and xdata carrying event structures (changelog_event_t).
Change-Id: I0e5f8c9170161c186f0e58d07105813e34e18786
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9775
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The tier translator shares most of DHT's code. It differs in how
subvolumes are chosen for I/Os, and how file migration (cache promotion
and demotion) is managed. That different functionality is split to either
DHT or tier logic according to the "tier_methods" structure.
A cache promotion and demotion thread is created in a manner
similar to the rebalance daemon. The thread operates a timing
wheel which periodically checks for promotion and demotion candidates
(files). Candidates are queued and then migrated. Candidates must exist on
the same node as the daemon and meet other critera per caching policies.
This patch has two authors (Dan Lambright and Joseph Fernandes). Dan
did the DHT changes and Joe wrote the cache policies. The fix depends on
DHT readidr changes and the database library which have been submitted
separately. Header files in libglusterfs/src/gfdb should be reviewed in
patch 9683.
For more background and design see the feature page [1].
[1]
http://www.gluster.org/community/documentation/index.php/Features/data-classification
Change-Id: Icc26c517ccecf5c42aef039f5b9c6f7afe83e46c
BUG: 1194753
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/9724
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: Id9a7d0f457d9759ab7d0a52a4000b5ae36d211f8
BUG: 1194753
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9946
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Part 2/2 patch to enable users analyze and resolve
split-brain.
This patch enables :
1) Users to inspect the files in data and metadata split-brain.
2) Resolve the split-brain.
Both using a series of setfattr commands.
Consider a volume "test" with 2 bricks.
1) To inspect a file f1:
setfattr -n replica.split-brain-choice -v test-client-0 f1
After the execution of this command, if no read_subvol
is found, reads will be served from test-client-0 (corresponding
to brick-0).
2) To resolve split-brain :
setfattr -n replica.split-brain-heal-finalize -v test-client-0 f1
Execution of this command will lead to the resolution
of data and metadata split-brain with subvol mentioned in the
command (test-client-0 here) as the source and the rest as sink.
Change-Id: Ia20f3ee5abd3119e3d54fcc599f1e55ac65fd179
BUG: 1191396
Signed-off-by: Anuradha <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/9743
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
**********************************************************************
ChangeTimeRecorder(CTR) Xlator |
**********************************************************************
ChangeTimeRecorder(CTR) is server side xlator(translator) which sits
just above posix xlator. The main role of this xlator is to record the
access/write patterns on a file residing the brick. It records the
read(only data) and write(data and metadata) times and also count on
how many times a file is read or written. This xlator also captures
the hard links to a file(as its required by data tiering to move
files).
CTR Xlator is the consumer of libgfdb.
To Enable/Disable CTR Xlator:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
gluster volume set <volume-name> features.ctr-enabled {on/off}
To Enable/Disable Frequency Counter Recording in CTR Xlator:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
gluster volume set <volume-name> features.record-counters {on/off}
Change-Id: I5d3cf056af61ac8e3f8250321a27cb240a214ac2
BUG: 1194753
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/9935
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
NFS now has the ability to use a separate file for "netgroups" and
"exports". An administrator should have the ability to check the
validity of the files before applying the configuration.
The "glusterfsd" command now has the following additional arguments that
can be used to check the configuration:
--print-netgroups: Validate the netgroups file and print it out
--print-exports: Validate the exports file and print it out
BUG: 1143880
Change-Id: I24c40d50110d49d8290f9fd916742f7e4d0df85f
URL: http://www.gluster.org/community/documentation/index.php/Features/Exports_Netgroups_Authentication
Original-author: Shreyas Siravara <shreyas.siravara@gmail.com>
CC: Richard Wareing <rwareing@fb.com>
CC: Jiffin Tony Thottan <jthottan@redhat.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9365
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch imports timer-wheel[1] algorithm from the linux
kernel (~/kernel/time/timer.c) with some modifications.
Timer-wheel is an efficent way to track millions of timers for
expiry. This is a variant of the simple but RAM heavy approach
of having a list (timer bucket) for every future second.
Timer-wheel categorizes every future second into a logarithmic
array of arrays. This is done by splitting the 32 bit "timeout"
value into fixed "sliced" bits, thereby each category has a
fixed size array to which buckets are assigned.
A classic split would be 8+6+6+6 (used in this patch) which
results in 256+64+64+64 == 512 buckets. Therefore, the entire
32 bit futuristic timeouts have been mapped into 512 buckets.
[
NOTE:
There are other possible splits, such as "8+8+8+8", but
this patch sticks to the widely used and tested default.
]
Therfore, the first category "holds" timers whose expiry range
is between 1..256, the next cateogry holds 257..16384, third
category 16385..1048576 and so on. When timers are added,
unless it's in the first category, timers with different
timeouts could end up in the same bucket. This means that the
timers are "partially sorted" -- sorted in their highest bits.
The expiry code walks the first array of buckets and exprires
any pending timers (1..256). Next, at time value 257, timers
in the first bucket of the second array is "cascaded" onto
the first category and timers are placed into respective
buckets according to the thier timeout values. Cascading
"brings down" the timers timeout to the coorect bucket
of their respective category. Therefore, timers are sorted
by their highest bits of the timeout value and then by the
lower bits too.
[1] https://lwn.net/Articles/152436/
Change-Id: I1219abf69290961ae9a3d483e11c107c5f49c4e3
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9707
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch introduces rotational buffers aiming at the classic
multiple producer and multiple consumer problem. A fixed set
of buffer list is allocated during initialization, where each
list consist of a list of buffers. Each buffer is an iovec
pointing to a memory region of fixed allocation size. Multiple
producers write data to these buffers. A buffer list starts with
a single buffer (iovec) and allocates more when required (although
this can be preallocatd in multiples of k).
rot-buffs allow multiple producers to write data parallely with
a bit of extra cost of taking locks. Therefore, it's much suited
for large writes. Multiple producers are allowed to write in the
buffer parallely by "reserving" write space for selected number
of bytes and returning pointer to the start of the reserved area.
The write size is selected by the producer before it starts the
write (which is often known). Therefore, the write itself need not
be serialized -- just the space reservation needs to be done safely.
The other part is when a consumer kicks in to consume what has
been produced. At this point, a buffer list switch is performed.
The "current" buffer list pointer is safely pointed to the next
available buffer list. New writes are now directed to the just
switched buffer list (the old buffer list is now considered out
of rotation). Note that the old buffer still may have producers
in progress (pending writes), so the consumer has to wait till
the writers are drained. Currently this is the slow path for
producers (write completion) and needs to be improved.
Currently, there is special handling for cases where the number
of consumers match (or exceed) the number of producers, which
could result in writer starvation. In this scenario, when a
consumers requests a buffer list for consumption, a check is
performed for writer starvation and consumption is denied
until at least another buffer list is ready of the producer
for writes, i.e., one (or more) consumer(s) completed, thereby
putting the buffer list back in rotation.
[
NOTE:
I've not performance tested this producer-consumer model
yet. It's being used in changelog for event notification.
The list of buffers (iovecs) are directly passed to RPC
layer.
]
Change-Id: I88d235522b05ab82509aba861374a2312bff57f2
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9706
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
==========================================================================
Inode quota
==========================================================================
= Currently, the only way to retrieve the number of files/objects in a =
= directory or volume is to do a crawl of the entire directory/volume. =
= This is expensive and is not scalable. =
= =
= The proposed mechanism will provide an easier alternative to determine =
= the count of files/objects in a directory or volume. =
= =
= The new mechanism proposes to store count of objects/files as part of =
= an extended attribute of a directory. Each directory's extended =
= attribute value will indicate the number of files/objects present =
= in a tree with the directory being considered as the root of the tree. =
= =
= The count value can be accessed by performing a getxattr(). =
= Cluster translators like afr, dht and stripe will perform aggregation =
= of count values from various bricks when getxattr() happens on the key =
= associated with file/object count. =
A new interface is introduced:
------------------------------
limit-objects : limit the number of inodes at directory level
list-objects : list the directories where the limit is set
remove-objects : remove the limit from the directory
==========================================================================
CLI COMMAND:
gluster volume quota <volname> limit-objects <path> <number> [<percent>]
* <number> is a hard-limit for number of objects limitation for path "<path>"
If hard-limit is exceeded, creation of file/directory is no longer
permitted.
* <percent> is a soft-limit for number of objects creation for path "<path>"
If soft-limit is exceeded, a warning is issued for each creation.
CLI COMMAND:
gluster volume quota <volname> remove-objects [path]
==========================================================================
CLI COMMAND:
gluster volume quota <volname> list-objects [path] ...
Sample output:
------------------
Path Hard-limit Soft-limit Used Available
Soft-limit exceeded?
Hard-limit exceeded?
------------------------------------------------------------------------
--------------------------------------
/dir 10 80% 10 0
Yes
Yes
==========================================================================
[root@snapshot-28 dir]# ls
a b file11 file12 file13 file14 file15 file16 file17
[root@snapshot-28 dir]# touch a1
touch: cannot touch `a1': Disk quota exceeded
* Nine files are created in directory "dir" and directory is included in
* the
count too. Hence the limit "10" is reached and further file creation
fails
==========================================================================
Note: We have also done some re-factoring in cli for volume name
validation. New function cli_validate_volname is created
==========================================================================
Change-Id: I1823497de4f790a2a20ebb1770293472ea33ee2b
BUG: 1190108
Signed-off-by: Sachin Pandit <spandit@redhat.com>
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/9769
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
*************************************************************************
Libgfdb |
*************************************************************************
Libgfdb provides abstract mechanism to record extra/rich metadata
required for data maintenance, such as data tiering/classification.
It provides consumer with API for recording and querying, keeping
the consumer abstracted from the data store used beneath for storing data.
It works in a plug-and-play model, where data stores can be plugged-in.
Presently we have plugin for Sqlite3. In the future will provide recording
and querying performance optimizer. In the current implementation the schema
of metadata is fixed.
Schema:
~~~~~~
GF_FILE_TB Table:
~~~~~~~~~~~~~~~~~
This table has one entry per file inode. It holds the metadata required to
make decisions in data maintenance.
GF_ID (Primary key) : File GFID (Universal Unique IDentifier in the namespace)
W_SEC, W_MSEC : Write wind time in sec & micro-sec
UW_SEC, UW_MSEC : Write un-wind time in sec & micro-sec
W_READ_SEC, W_READ_MSEC : Read wind time in sec & micro-sec
UW_READ_SEC, UW_READ_MSEC : Read un-wind time in sec & micro-sec
WRITE_FREQ_CNTR INTEGER : Write Frequency Counter
READ_FREQ_CNTR INTEGER : Read Frequency Counter
GF_FLINK_TABLE:
~~~~~~~~~~~~~~
This table has all the hardlinks to a file inode.
GF_ID : File GFID (Composite Primary Key)``|
GF_PID : Parent Directory GFID (Composite Primary Key) |-> Primary Key
FNAME : File Base Name (Composite Primary Key)__|
FPATH : File Full Path (Its redundant for now, this will go)
W_DEL_FLAG : This Flag is used for crash consistancy, when a link is unlinked.
i.e Set to 1 during unlink wind and during unwind this record
is deleted
LINK_UPDATE : This Flag is used when a link is changed i.e rename.
Set to 1 when rename wind and set to 0 in rename unwind
Libgfdb API:
~~~~~~~~~~~
Refer libglusterfs/src/gfdb/gfdb_data_store.h
Change-Id: I2e9fbab3878ce630a7f41221ef61017dc43db11f
BUG: 1194753
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/9683
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
position in the graph rather than relative (local) to a particular
translator.
Encoding the volume in this way allows a single translator to manage
which brick is currently being scanned for directory entries. Using a
single translator minimizes allocated bits in the d_off. It also allows
multiple DHT translators in the same graph to have a common frame of
reference (the graph position) for which brick is being read. Multiple
DHT translators are needed for the Tiering feature.
The fix builds off a previous change (9332) which removed subvolume
encoding from AFR. The fix makes an equivalent change to the EC
translator.
More background can be found in fix 9332 and gluster-dev discussions [1].
DHT and AFR/EC are responsibile (as before) for choosing which brick to
enumerate directory entries in over the readdir lifecycle.
The client translator receiving the readdir fop encodes the dht_t. It
is referred to as the "leaf node" in the graph and corresponds to the
brick being scanned.
When DHT decodes the d_off, it translates the leaf node to a local
subvolume, which represents the next node in the graph leading to
the brick.
Tracking of leaf nodes is done in common utility functions. Leaf nodes
counts and positional information are updated on a graph switch.
[1] www.gluster.org/pipermail/gluster-devel/2015-January/043592.html
Change-Id: Iaf0ea86d7046b1ceadbad69d88707b243077ebc8
BUG: 1190734
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/9688
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The crash is seen when, glfs_init failed for some reason
and glfs_fini was called for cleaning up the partial initialization.
The fix is in two folds:
1. In timer store and restore the THIS, previously
it was being overwritten.
2. In glfs_free_from_ctx() and glfs_fini() check for
NULL before destroying.
Change-Id: If40bf69936b873a1da8e348c9d92c66f2f07994b
BUG: 1202290
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/9895
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, the only way to retrieve the number of files/objects in a
directory or volume is to do a crawl of the entire directory/volume.
This is expensive and is not scalable.
The new mechanism proposes to store count of objects/files as part of
an extended attribute of a directory. Each directory's extended
attribute value will indicate the number of files/objects present
in a tree with the directory being considered as the root of the tree.
Currently file usage is accounted in marker by doing multiple FOPs
like setting and getting xattrs. Doing this with STACK WIND and
UNWIND can be harder to debug as involves multiple callbacks.
In this code we are replacing current mechanism with syncop approach
as syncop code is much simpler to follow and help us implement inode
quota in an organized way.
Change-Id: Ibf366fbe07037284e89a241ddaff7750fc8771b4
BUG: 1188636
Signed-off-by: vmallika <vmallika@redhat.com>
Signed-off-by: Sachin Pandit <spandit@redhat.com>
Reviewed-on: http://review.gluster.org/9567
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Several features - e.g. encryption, erasure codes, or NSR - involve
multiple cooperating translators which sometimes need a "private" means
of communication amongst themselves. Historically we've used virtual or
synthetic xattrs, but that's not very elegant and clutters up the
getxattr/setxattr path which must also handle real xattr requests. This
new fop should address that.
The only argument is an int32_t "op" which should be recognized by the
target translator. It is recommended that translators using these
feature follow some convention regarding the ops that they define, to
avoid conflicts. Using a hash of the target translator's type string as
a base for a series of ops would probably be a good start. Any other
information can be passed in both directions using xdata.
The default behavior for this fop, as with any other, is to pass through
to FIRST_CHILD. That makes use of this fop "transparent" to other
translators that were written before it existed, but it also means that
it only really works with pass-through translators. If a routing
translator (such as DHT) or a fan-out translator (such as AFR) is
involved, the IPC might not reach its intended destination unless those
translators are modified to forward IPC fops along all paths.
If an IPC gets all the way to storage/posix it is considered an error,
much like an uncaught exception. We don't actually *do* anything in
that case, but we do log it send back an EOPNOTSUPP error. This makes
the "unrecognized opcode" condition distinguishable from the "no IPC
support" condition (which would yield an RPC error instead) so clients
can probe for the presence of a handler for their own favorite opcode
and either use that or use old-school xattrs depending on the result.
BUG: 1158628
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Change-Id: I84af1b17babe5b30ec03ecf027ae37d09b873968
Reviewed-on: http://review.gluster.org/8812
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Framework on the server-side, to handle certain state of the files
accessed and send notifications to the clients connected.
A generic and extensible framework, used to maintain states in
the glusterfsd process for each of the files accessed
(including the clients info doing the fops) and send
notifications to the respective glusterfs clients incase of
any change in that state.
This patch handles "Inode Update/Invalidation" upcall event.
Feature page:
URL: http://www.gluster.org/community/documentation/index.php/Features/Upcall-infrastructure
Below link has a writeup which explains the code changes done -
URL: https://soumyakoduri.wordpress.com/2015/02/25/glusterfs-understanding-upcall-infrastructure-and-cache-invalidation-support/
Change-Id: Ie3d724be9a3419fcf18901a753e8ec2df2ac802f
BUG: 1200262
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Reviewed-on: http://review.gluster.org/9535
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
CC libglusterfs_la-inode.lo
inode.c: In function 'inode_table_destroy':
inode.c:1630:19: warning: variable 'this' set
but not used [-Wunused-but-set-variable]
xlator_t *this = NULL;
Change-Id: If4b37ab896ee0a309826d4be48c6599d6ec2710b
Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com>
Reviewed-on: http://review.gluster.org/9846
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anoop C S <achiraya@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Poornima G <pgurusid@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I41acd9970bef04bb16cd4d8532a84a95d5fb642a
BUG: 1199003
Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com>.
Reviewed-on: http://review.gluster.org/9810
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Parses linux style export file/netgroups file into a structure that
can be lookedup.
* This parser turns each line into a structure called an "export
directory". Each of these has a dictionary of hosts and netgroups
which can be looked up during the mount authentication process.
(See Change-Id Ic060aac and I7e6aa6bc)
* A string beginning withan '@' is treated as a netgroup and a string
beginning without an @ is a host.
(See Change-Id Ie04800d)
* This parser does not currently support all the options in the man page
('man exports'), but we can easily add them.
BUG: 1143880
URL: http://www.gluster.org/community/documentation/index.php/Features/Exports_Netgroups_Authentication
Change-Id: I181e8c1814d6ef3cae5b4d88353622734f0c0f0b
Original-author: Shreyas Siravara <shreyas.siravara@gmail.com>
CC: Richard Wareing <rwareing@fb.com>
CC: Jiffin Tony Thottan <jthottan@redhat.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/8758
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The /etc/exports format for NFS-exports (see Change-Id I7e6aa6b) allows
a more fine grained control over the authentication. This change adds
the functions and structures that will be used in by Change-Id I181e8c1.
BUG: 1143880
Change-Id: Ic060aac7c52d91e08519b222ba46383c94665ce7
Original-author: Shreyas Siravara <shreyas.siravara@gmail.com>
CC: Richard Wareing <rwareing@fb.com>
CC: Jiffin Tony Thottan <jthottan@redhat.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9362
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
syncop_inodelk doesn't work properly as lk_owner is not set
in the frame created by 'synctask_create'.
There is a possibility that more than one thread can acquire inode lock
with syncop_inodelk
Change-Id: I8193edb0d24b3a6e3a3f6a0c5d7ab5a1be8e7daf
BUG: 1188636
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/9858
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
pipe2() doesn't works on Linux kernel version < 2.6.27 and
glibc < version 2.9. Hence replacing it with pipe(),
so that the build will not fail on Centos5.
Change-Id: If17aed0d51466cd7528cf8dde0edfa28b68139e5
BUG: 1200255
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/9844
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adding support for two virtual extended attributes that are used for
converting a binary POSIX ACL to a POSIX.1e long ACL text format. This
makes it possible to transfer the ACL over the network to a different OS
which can convert the POSIX.1e text format to its native structures.
The following xattrs are sent over RPC in SETXATTR/GETXATTR procedures,
and contain the POSIX.1e long ACL text format:
- glusterfs.posix.acl: maps to ACL_TYPE_ACCESS
- glusterfs.posix.default_acl: maps to ACL_TYPE_DEFAULT
acl_from_text() (from libacl) converts the text format into an acl_t
structure. This structure is then used by acl_set_file() to set the ACL
in the filesystem.
libacl-devel is needed for linking against libacl, so it has been added
to the BuildRequires in the .spec.
NetBSD does not support POSIX ACLs. Trying to get/set POSIX ACLs on a
storage server running NetBSD, an error will be returned with errno set
to ENOTSUP. Faking support, but not enforcing ACLs seems wrong to me.
URL: http://www.gluster.org/community/documentation/index.php/Features/Improved_POSIX_ACLs
BUG: 1185654
Change-Id: Ic5eb73d69190d3492df2f711d0436775eeea7de3
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9627
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: soumya k <skoduri@redhat.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I58dc7e0dc8d4ac4e10795e0536fcd0e1722116ed
BUG: 1143880
Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
Reviewed-on: http://review.gluster.org/9830
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make sure we do not get stuck looping forever in event_dispatch_destroy()
by limiting the retries when waiting for other threads, and by giving up
when writing to other thread fails.
This fixes regression tests hanging forever on NetBSD.
BUG: 1129939
Change-Id: I4459cfb1ab7294e8c15a21b592e0154c22abae07
Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-on: http://review.gluster.org/9825
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Id41fb29480bb6d22c34469339163da05b98c1a98
BUG: 1115907
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/8226
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This uses https://cmocka.org/ as the unit testing framework.
With this change, unit testing is made optional as well. We assume there
is no cmocka available while building. cmocka will be enabled by default
later on. For now, to build with cmocka run:
$ ./configure --enable-cmocka
This change is based on the work of Andreas (replacing cmockery2 with
cmocka) and Kaleb (make cmockery2 an optional build dependency).
The only modifications I made, are additional #defines in unittest.h for
making sure the unit tests function as expected.
Change-Id: Iea4cbcdaf09996b49ffcf3680c76731459cb197e
BUG: 1067059
Merged-change: http://review.gluster.org/9762/
Signed-off-by: Andreas Schneider <asn@samba.org>
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Change-Id: Ia2e955481c102d5dce17695a9205395a6030e985
Reviewed-on: http://review.gluster.org/9738
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: If5e9d4ce98f845d3b52565ac62970959e663497f
BUG: 1093594
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/9699
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This generic parser will get used for parsing the netgroups and exports
files for the Gluster/NFS server. The parsing of netgroups shows how the
parser can be used (see Change-Id Ie04800d4).
BUG: 1143880
Change-Id: Id4cf2b0189ef5799c06868d211d3fcd9c8608c08
Original-author: Shreyas Siravara <shreyas.siravara@gmail.com>
CC: Richard Wareing <rwareing@fb.com>
CC: Jiffin Tony Thottan <jthottan@redhat.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9359
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
gracefully.
Change-Id: I49b6ceebb45773620c318fb5d20b81623db75ab6
BUG: 1093594
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/9691
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: If341e3c0a559aa5bbca9c1263a241c6592c59706
BUG: 1093594
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/9696
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Iafbbbfd9319751742b3c79419e1dd8e2958fee07
BUG: 1093594
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/9701
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
syncenv structures
Change-Id: I28020eb2fc08d886cd7c05ff96daf7ebb4264ffe
BUG: 1093594
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/9693
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Took the inode context free code from the patch
http://review.gluster.org/#/c/4775/18/libglusterfs/src/inode.c
Change-Id: I05fc025763fe4ce61dc61503de27ec1d3a203e50
BUG: 1093594
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/9700
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By this reservation, we are assigning some space for common errors
like dict_{get,set},memory accounting..etc.
Change-Id: Iee0f65b3dc4e00819f344bed01989352a4f8a87b
BUG: 1194640.
Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com>
Reviewed-on: http://review.gluster.org/9752
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the slots_used in a table becomes 0, the table will not
get reused, leading to a leak.
This patch fixes the leak.
Change-Id: Ib86826d287368174ea7ebe0d0d64b2dec574634e
BUG: 1093594
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/9725
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Prashanth Pai <ppai@redhat.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The RPC throttle which kicks in by setting the poll-in event on a
socket to false, is broken with the MT epoll commit. This is due
to the event handler of poll-in attempting to read as much out of
the socket till it receives an EAGAIN. Which may never happen and
hence we would be processing far more RPCs that we want to.
This is being fixed by changing the epoll from ET to LT, and
reading request by request, so that we honor the throttle.
The downside is that we do not drain the socket, but go back to
epoll_wait before reading the next request, but when kicking in
throttle, we need to anyway and so a busy connection would degrade
to LT anyway to maintain the throttle. As a result this change
should not cause deviation in the performance much for busy
connections.
Change-Id: I522d284d2d0f40e1812ab4c1a453c8aec666464c
BUG: 1192114
Signed-off-by: Shyam <srangana@redhat.com>
Reviewed-on: http://review.gluster.org/9726
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
__socket_read_reply function releases sock priv->lock briefly for
notifying higher layers of message's xid. This could result in other
epoll threads that are processing events on this socket to read further
fragments of the same message. This may lead to incorrect fragment
processing and result in a crash.
Change-Id: I915665b2e54ca16f2ad65970e51bf76c65d954a4
BUG: 1197118
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Signed-off-by: Shyam <srangana@redhat.com>
Reviewed-on: http://review.gluster.org/9742
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|