| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Locks can be taken just to inspect the data as well, so allow them.
Xattrops are internal fops so we can allow them as well as longs as
it doesn't change the xattr value, i.e. All-zero xattrop.
Change-Id: Idc06d2043eb472c064db40d811a80058f0bda378
BUG: 1211123
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/10727
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Tested-by: NetBSD Build System
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Brick connection was bloated (and not implemented efficiently) with
calls which were not required to be called under lock. This resulted
in starvation of lock by critical code paths. This eventally did not
scale when the number of bricks per volume increases (add-brick and
the likes).
Also, this patch cleans up some of the weird reconnection logic that
added more to the starvation of resources and cleans up uncontrolled
growing of log files.
Change-Id: I05e737f2a9742944a4a543327d167de2489236a4
BUG: 1207134
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/10763
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: NetBSD Build System
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch reimplments existing scrub-frequency mechanism used
to schedule scrubber runs. Existing mechanism uses periodic
sleeps (waking up periodically on minimum granularity) and
performing a number of tracking checks based on counters and
sleep times. This patch does away with all the nifty counters
and uses timer-wheel to schedule scrub runs.
Scheduling changes are peformed by merely calculating the new
expiry time and calling mod_timer() [mod_timer_pending() in
some cases] making the code more debuggable and easier to
follow. This also introduces "hourly" scrubbing tunable as an
aid for testing scrubbing during development/testing cycle.
One could also implement on-demand scrubbing with ease: by
invoking mod_timer() with an expiry of one (1) second, thereby
scheduling a scrub run the very next second.
Change-Id: I6c7c5f0c6c9f886bf574d88c04cde14b76e60a8b
BUG: 1224596
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10893
Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com>
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch refactors the signing trigger mechanism used by bitrot
daemon as a "catch up" meachanism to sign files which _missed_
signing on the last run either due to bitrot being disabled and
enabled again or if bitrot is enabled for a volume with existing
data.
Existing implementation relies on overloading writev() to trigger
signing which just by the looks sounded dangerous and I hated it
to the core. This change moves all that business to the setxattr
interface thereby keeping the writev path strictly for client
IO.
Why not use IPC fop to trigger signing?
There's a need to access the object's inode to perform various
maintainance operations. inode is not _directly_ accessible in
the IPC fop (although, it can be found via inode_grep() for the
object's GFID - the inode just needs to be pinned in memory,
which is the case if there's an active fd on the inode). This
patch relies on good old technique of overloading fsetxattr()
to do the job instead of using IPC fop.
There are some pretty nice cleanups along the lines of memory
deallocations, unncessary allocations and redundant ref()ing
of structures (such as fd's) provided by this patch. All in
all - much improved code navigation.
Change-Id: Id93fe90b1618802d1a95a5072517dac342b96cb8
BUG: 1224600
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10942
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Iaa7022c95a8d9c9c471db025ec644e0bcc4eeb29
BUG: 1221104
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/10772
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During ancestry build, loc path was set to invalid
path. path was set to one of its child instead
of itself. Because of this quota accounting was
going wrong
This patch fix the issue
Below mentioned tests removed from bad test list
as part of patch# 10930
./tests/basic/ec/quota.t
./tests/basic/quota-nfs.t
./tests/bugs/quota/bug-1035576.t
Change-Id: Iaa65b2d968c04c9abcd476d0e9f588cb7fd39294
BUG: 1223798
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/10918
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
With this patch getxattr of inodelk/entrylk counts can be requested in
readv/writev/create/unlink/opendir.
Change-Id: If7430317ad478a3c753eb33bdf89046cb001a904
BUG: 1165041
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/10880
Tested-by: NetBSD Build System
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
snapview-client and snapview-server doesnot have statfs
fop implemented
Change-Id: I2cdd4c5784414b0549a01af9a28dbc723b7cdc67
BUG: 1176837
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/10358
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Avra Sengupta <asengupt@redhat.com>
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently scrubber is crawling all the files continuously. It should
crawl files based on the scrubber frequency which user have set.
By default scrubber crawling frequency value will be biweekly.
Change-Id: I5762a92c1e700134cfe4283d1f631904adbfe31d
BUG: 1208131
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Reviewed-on: http://review.gluster.org/10602
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Log typo fix for CTR Xlator and Libgfdb
Change-Id: I8d7208b9756199f8dc0a72771a3c953b4fa00f3f
BUG: 1215896
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/10668
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Background:
Glusterfs changelogs are stored in each brick, which records the changes
happened in that brick. Georep will run in all the nodes of master and processes
changelogs "independently". Processing changelogs is in brick level, but
all the fops will be replayed on "slave mount" point.
Problem:
With a DHT volume, in changelog "internal fops" are NOT recorded.
For Rename case, Rename is recorded in "hashed" brick changelog.
(DHT's internal fops like creating linkto file, unlink is NOT recorded).
This lead us to inconsistent rename operations.
For example,
Distribute volume created with Two bricks B1, B2.
//Consider master volume mounted @ /mnt/master and following operations
executed:
cd /mnt/master
touch f1 // f1 falls on B1 Hash
mv f1 f2 // f2 falls on B2 Hash
// Here, Changelogs are recorded as below:
@B1
CREATE f1
@B2
RENAME f1 f2
Here, race exists between Brick B1 and B2, say B2 will get executed first.
Source file f1 itself is "NOT PRESENT", so it will go ahead and create
f2 (Current implementation).
We have this problem When rename falls in another brick and
file is unlinked in Master.
Similar kind of issue exists in following case too(multiple rename):
CREATE f1
RENAME f1 f2
RENAME f2 f1
Solution:
Instead of carrying out "changelogging" at "HASHED volume",
carry out at the "CACHED volume".
This way we have rename operations carried out where actual files are present.
So,Changelog recorded as :
@B1
CREATE f1
RENAME f1 f2
Note:
This patch is dependent on dht changes from this patch.
http://review.gluster.org/10410/
changelog related changes are separated out for review.
In changelog, xdata passed from DHT is considered as :
1. In case of unlink (internal operation as part of rename), xdata value
is set , it is considered as RENAME and recorded accordingly.
2. In case of rename (Hash and Cache different), xdata value is NOT
set, recording rename operation is SKIPPED.
BUG: 1141379
Change-Id: Ifca675e6d4ef8c4e3b7ef4a7ec85de8b3a38dc08
Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Reviewed-on: http://review.gluster.org/10220
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ie7e7c6028c7bffe47e60a2e93827e0e8767a3d66
BUG: 1219894
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/10687
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
of open
* This patch brings in the changes where object versioning is done in write and
truncate fops instead of tracking them in open and create fops. This model
works for both regular and anonymous fds. It also removes the race associated
with open calls, create and lookups.
This patch follows the below method for object versioning and notifications:
Before sending writev on the fd, increase the ongoing
version first. This makes anonymous fd write similar to the regular
fd write by having the ongoing version increased before doing the
write.
Do following steps to do versioning:
1) For anonymous fds set the fd context (so that release is invoked) and add
the fd context to the list maintained in the inode context.
For regular fds the above think would have been done in open itself.
2) Increase the on-disk ongoing version
3) Increase the in memory ongoing version and mark inode as non-dirty
3) Once versioning is successfully done send write operation. If
versioning fails, then fail the write fop.
5) In writev_cbk mark inode as modified.
Change-Id: I7104391bbe076d8fc49b68745d2ec29a6e92476c
BUG: 1207979
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/10233
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This issue introduced due to manual rebase.
Change-Id: I0589f4a0a1270190340f419b8022d6483bcf853d
Signed-off-by: Aravinda VK <avishwan@redhat.com>
BUG: 1219479
Reviewed-on: http://review.gluster.org/10685
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Tested-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
An empty changelog when rolled over gets unlinked and indexed with
a modified path-name in htime file. The modification is "changelog"
not "CHANGELOG" in basename of the empty changelog file.
Change-Id: I77fd0b48b5c33c245418f5ac7a9756f08ece24d9
BUG: 1208470
Signed-off-by: Ajeet Jha <ajha@redhat.com>
Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Reviewed-on: http://review.gluster.org/9572
Tested-by: NetBSD Build System
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With logical scan/scrub split, pausing filesystem scrubber is an
override to the thread throttling mechanism, which effectively
throttles "down" number of scrubber threads to zero. This causes
scanner to wait until threads are spawned again (when resumed)
thereby continuing where it left off (since the file tree walk
stack is effectively preserved when the main scanner thread
is waiting for scrubbers to consume scanned entries).
The only catch is when scrubber daemon restarts: file tree walk
stack is lost and scrubbing initiates from root. This is probably
OK for now (can be changed later to persist parent directory
information before entering pause state).
Change-Id: I5109a749b7fccd0f5367765078f46e6522dd32a1
BUG: 1208131
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10521
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch introduces multithreaded filesystem scrubber based
on throttling option configured for a particular volume. The
implementation "logically" breaks scanning and scrubbing with
the number of scrubber threads auto-configured depending upon
the throttle configuration. Scanning (crawling) is left single
threaded (per brick) with entries scrubbed in bulk. On reaching
this "bulk" watermark, scanner waits until entries are scrubbed.
Bricks for a particular volume have a set of thread(s) assigned
for scrubbing, with entries for each brick scrubbed in a round
robin fashion to avoid scrub "stalls" when a brick (out of N
bricks) is under active scrubbing.
This mechanism helps us implement "pause/resume" with ease: all
one need to do is to cleanup scrubber threads and let the main
scanner thread "wait" untill scrubbing is resumed (where the
scrubber thread(s) are spawned again), therefore continuing
where we left off (unless we restart the deamons, where crawl
initiates from root directory again, but I guess that's OK).
[
NOTE:
Throttling is optional for the signer daemon, without which
it runs full throttle. However, passing "-DBR_RATE_LIMIT_SIGNER"
predefined in CFLAGS enables CPU throttling (during checksum
calculation) thereby avoiding high CPU usage.
]
Subsequent patches would introduce CPU throttling during hash
calculation for scrubber.
Change-Id: I5701dd6cd4dff27ca3144ac5e3798a2216b39d4f
BUG: 1207020
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10511
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To-Do:
* Make ftruncate work even in the absence of path
* Aggregate and update ia_blocks appropriately when a file is
truncated to a lower size.
Change-Id: Ifd24c2f5e80d2c3bc921261f5481251df8948126
BUG: 1207615
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/10631
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1) The glupy.so xlator should embed the runtime search path for
the python libraries. Unfortunately, python-config does not
gives the appprioate flags, therefore we need to also use
pkg-config to obtain them
2) Fix the glupy python module directory layout so that python
can import the module without problem
That two fixes seems to let glupy.t pass on NetBSD again.
BUG: 1129939
Change-Id: I397aa726ab8bf7d91fa0d6d870a30910a5f4a5d9
Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-on: http://review.gluster.org/10616
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
BitRot daemons (signer & scrubber) are disk/cpu hoggers when left
running full throttle. Checksum calculations (especially SHA family
of hash routines) can be quite CPU intensive. Moreover periodic
disk scans performed by scrubber followed by reading data blocks
for hash calculation (which is also done by signer) generate lot
of heavy IO request(s). This causes interference with actual client
operations (be it a regular client or filesystems daemons such as
self-heal, etc..) and results in degraded system performance.
This patch introduces throttling based on Token Bucket Filtering[1].
It's a well known algorithm for checking (and ensuring) that data
transmission conform to defined limits and generally used in packet
switched networks. Linux control groups (Cgroups) uses a variant[2]
of this algorithm to provide block device IO throttling (cgroup
subsys "blkio": blk-iothrottle).
So, why not just live with Cgroups?
Cgroups is linux specific. We need to have a throttling mechanism
for other supported UNIXes. Moreover, having our own implementation
gives much more finer control in terms of tuning it for our needs
(plus the simplicity of the alogorithm itself).
Ideally, throttling should be a part of server stack (either as a
separate translator or integrated with io-threads) since that's
the point of entry for IO request(s) from *all* client(s). That
way one could selectively throttle IO request(s) based on client
PIDs (frame->root->pid), e.g., self-heal daemon, bitrot, etc..
(*actual* clients can run full throttle). This implementation
avoids that deliberately (there needs to be a much more smarter
queueing mechanism) and throttles CPU usage for hash calculations.
This patch is just the infrastructure part with no interfaces
exposed to set various throttling values. The tunable selected
here (basically hardcoded) avoids 100% CPU usage during hash
calculation (with some bursts cycles). We'd need much more
intensive test(s) to assign values for various throttling
options (lazy/normal/aggressive).
[1] https://en.wikipedia.org/wiki/Token_bucket
[2] http://en.wikipedia.org/wiki/Token_bucket#Hierarchical_token_bucket
Change-Id: Icc49af80eeab6adb60166d0810e69ef37cfe2fd8
BUG: 1207020
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10307
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As optional feature, during unlink, full path will be recorded.
Changelog Version number to be bumped up to 1.2.
With this patch, parser checks the version number before parsing
and handles accordingly.
Change-Id: Ic1ad98259c39e417029a08e26a1d4b467817e65a
BUG: 1214561
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/10166
Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
inode quota is a new feature implemented in glusterfs-3.7
if quota is enabled in the older version and is upgraded
to a new version, we can hit setxattr spike during self-heal
of inode quotas. So, when a quota is enabled, turn off
inode-quotas with a xlator option.
With this patch, we still account for inode quotas but only
when a write operation is performed for a particular file.
User will be able to query inode quotas once the Inode-quota
xlator option is enabled.
Change-Id: I52fb28bf7024989ce7bb08ac63a303bf3ec1ec9a
BUG: 1209430
Signed-off-by: vmallika <vmallika@redhat.com>
Signed-off-by: Sachin Pandit <spandit@redhat.com>
Reviewed-on: http://review.gluster.org/10152
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: The CTR xlator records file meta (heat/hardlinks)
into the data. This works fine for files which are created
after ctr xlator is switched ON. But for files which were
created before CTR xlator is ON, CTR xlator is not able to
record either of the meta i.e heat or hardlinks. Thus making
those files immune to promotions/demotions.
Solution: The solution that is implemented in this patch is
do ctr-db heal of all those pre-existent files, using named lookup.
For this purpose we use the inode-xlator context variable option
in gluster.
The inode-xlator context variable for ctr xlator will have the
following,
a. A Lock for the context variable
b. A hardlink list: This list represents the successful looked
up hardlinks.
These are the scenarios when the hardlink list is updated:
1) Named-Lookup: Whenever a named lookup happens on a file, in the
wind path we copy all required hardlink and inode information to
ctr_db_record structure, which resides in the frame->local variable.
We dont update the database in wind. During the unwind, we read the
information from the ctr_db_record and ,
Check if the inode context variable is created, if not we create it.
Check if the hard link is there in the hardlink list.
If its not there we add it to the list and send a update to the
database using libgfdb.
Please note: The database transaction can fail(and we ignore) as there
already might be a record in the db. This update to the db is to heal
if its not there.
If its there in the list we ignore it.
2) Inode Forget: Whenever an inode forget hits we clear the hardlink list in
the inode context variable and delete the inode context variable.
Please note: An inode forget may happen for two reason,
a. when the inode is delete.
b. the in-memory inode is evicted from the inode table due to cache limits.
3) create: whenever a create happens we create the inode context variable and
add the hardlink. The database updation is done as usual by ctr.
4) link: whenever a hardlink is created for the inode, we create the inode context
variable, if not present, and add the hardlink to the list.
5) unlink: whenever a unlink happens we delete the hardlink from the list.
6) mknod: same as create.
7) rename: whenever a rename happens we update the hardlink in list. if the hardlink
was not present for updation, we add the hardlink to the list.
What is pending:
1) This solution will only work for named lookups.
2) We dont track afr-self-heal/dht-rebalancer traffic for healing.
Change-Id: Ia4bbaf84128ad6ce8c3ddd70bcfa82894c79585f
BUG: 1212037
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/10370
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of "trusted.glusterfs.bit-rot.*" use "trusted.bit-rot.*"
NOTE:
With this patch, data on existing volumes would be resigned
(which should be OK as of now since we do not expect many
users as of now :-))
Change-Id: I926c7bca266a9c8f2cb35d57c4d0359aa5cecfa0
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10181
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I4cc060710482de8633141170dd35f669f01f639b
BUG: 1207615
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/10528
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Have added support to send attributes of both entries and
its parent (include oldparent in case of RENAME fop) in the
same notification request to avoid multiple rpc requests.
Also, made changes in gfapi to send parent object and its
attributes changed in a single upcall event.
Change-Id: I92833da3bcec38d65216921c2ce4d10367c32ef1
BUG: 1200262
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Reviewed-on: http://review.gluster.org/10460
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To cleanup expired client entries (with access_time > 2*CACHE_INVALIDATION_TIMEOUT),
have
* defined a global list to contain all the upcall_inode_ctx allocated
* Every time a upcall_inode_ctx is allocated, it is added to the global list
* during inode_forget, that upcall_inode_ctx is marked for destroy
* created a reaper thread which scans through that list
* cleans up expired client entries
* frees the inode_ctx with destroy_mode set.
Note: This reaper thread is initialized only when features.cache_invalidation option
is enabled.
Change-Id: Iea2a63eb31b8e08d5709e7e090cf26fd13d01265
BUG: 1200267
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Reviewed-on: http://review.gluster.org/10342
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PROBLEM:
There is no way to get the path of deleted file if we
have gfid from changelog since the file is already deleted.
SOLUTION:
Do a recursive readlink on parent gfid in backend .glusterfs
path to get the complete path in I/O callpath in changelog
translator and capture it in callback.
The path captured is relative from the brick root. The field
separator used is '\0'.
e.g.,
......\0<pgfid>/bname\0<relative-path>\0<next-record>
ADDITIONAL REQUIRED CHANGES:
1. The changelog translator option called "changelog.capture-del-path"
is introduced to enable or disable the capturing of deleted entry
path.
e.g.,
gluster vol set <vol-name> changelog.capture-del-path on/off
If capture-del-path is disabled, '\0' is captured instead of
relative path.
e.g.,
......\0<pgfid>/bname\0\0\0<next-record>
2. The minor number in the version of changelog is bumped up from v1.1
to v1.2.
3. If recursive readlink is failed for some reason, it will capture
\0 in place of <relative path>.
e.g.,
......\0<pgfid>/bname\0\0\0<next-record>
(same as when caputre-del-path option is disabled)
4. If bname argument passed to "resolve_pargfid_to_path" function
is NULL and pargfid is ROOT, "." is returned. This is not the
case with changelog, where bname is always passed. This is
applicable to other consumers of "resolve_pargfid_to_path"
routine.
NOTE:
Changelog parser should consider the above new changes
and should parse accordingly.
Change-Id: I040ed429b5aa7d391033fc6a540edbf07fc37827
BUG: 1214561
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/10288
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: NetBSD Build System
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The structure 'rpcsvc_state', which maintains rpc server
state had no separate pointer to track the translator.
It was using the mydata pointer itself. So callers were
forced to send xlator pointer as mydata which is opaque
(void pointer) by function prototype.
'rpcsvc_register_init' is setting svc->mydata with xlator
pointer. 'rpcsvc_register_notify' is overwriting svc->mydata
with mydata pointer. And rpc interprets svc->mydata as
xlator pointer internally. If someone passes non xlator
structure pointer to rpcsvc_register_notify as libgfchangelog
currently does, it might corrupt mydata. So interpreting opaque
mydata as xlator pointer is incorrect as it is caller's choice
to send mydata as any type of data to 'rpcsvc_register_notify'.
Maintaining two different pointers in 'rpcsvc_state' for xlator
and mydata solves the issue.
Change-Id: I7874933fefc68f3fe01d44f92016a8e4e9768378
BUG: 1215161
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/10366
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently glupy files resides in gluster namespace of python site packages.
The other projects like libgfapi-python ..etc are evolving and need to share
the gluster namespace. The current structure makes things difficult as all
subpackages have its own __init__ files and other files.
One subpackage can not any more own gluster namespace.
The attempt is to make below structure for gluster namespace so that
it is more portable and scalable for future use.
<sitepackages>/gluster/
|
-- __init__.py
|
|
-- glupy
|
-- __init__.py
-- glupy.py
-- ........
|
|
-- gfapi
|
-- __init__.py
-- gfapi.py
-- ........
By above structure clients can import:
>>> from gluster import glupy
>>> from gluster import gfapi
libgfapi-python project has been moved to this structure via
http://review.gluster.org/#/c/9668/
Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com>
Change-Id: I54886200ddb6a4153a74d9e187aeca7cad79ef9e
BUG: 1211900
Reviewed-on: http://review.gluster.org/10248
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently whatever bitrot/scrubber tunable value user set for one
volume that value is considering for all other volumes also.
Each volume should act on their respective bitrot/scrubber tunable
value.
For handling bitrot/scrubber tunable value independently with respect
to all the volume bitrot and scrubber translator should run seperatly
for each volume.
Change-Id: I1d9379508afe6cfd2f78e3ebf29c829c362d84a9
BUG: 1170075
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Reviewed-on: http://review.gluster.org/10352
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During mount, NFS directly calls stat on the root of the volume
without sending a lookup on it. This was causing inode_ctx_get_block_size()
to fail on /. A check is now added in [f]stat which would ensure no action
is taken by shard xlator when the operation is on a directory.
Change-Id: I81849eeddfdad9f271155442408d95b4a25d7647
BUG: 1207615
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/10427
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I968980dc4df458ec427e33503363bbd017e1163e
BUG: 1200271
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Reviewed-on: http://review.gluster.org/10194
Tested-by: NetBSD Build System
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Earlier, both chagelog on/off and brick restart were considered
to be changelog breakage and treated as changelog not being
continuous. As a result, new HTIME.TSTAMP file was created on
both the above cases. Now the change is made such that only
on changelog enable/disable, the changelog is considered to be
discontinuous. New HTIME.TSTAMP file is not created on brick
restart, the changelogs files are appended to last HTIME.TSTAMP
file.
Treating changelog as continuous in above scenario is important
as changelog history API will fail otherwise. It can successfully
get changes between start and end timestamps only when changelog
is continuous (Changelogs in single HTIME.TSTAMP file are treated
as continuous). Without this change, changelog history API would
fail, and it would become necessary to fallback to other mechanisms
like xsync FSCrawl in case geo-rep to detect changes in this time
window. But Xsync FSCrawl would not be applicable to other
consumers like glusterfind.
Rationale:
1. In plain distributed volume, if brick goes down, no I/O can
happen onto the brick. Hence changelog is intact with data
on disk.
2. In distributed replicate volume, if brick goes down, since
self-heal traffic is captured in changelog. Eventually,
I/O happened whend brick down is captured in changelog.
Change-Id: I2eb66efe6ee9a9228fb1fcb38d6e7696b9559d5b
BUG: 1211327
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/10222
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I1a90ad6669c1cb79aaae6b4bd9621c75d9985c8a
BUG: 1207615
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/10446
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1) ctr_link_consistency option for ctr xaltor is provided so that
the user can choose to switch it on or off.
/* For link consistency we do a double update i.e mark the link
* during the wind and during the unwind we update/delete the link.
* This has a performance hit. We give a choice here whether we need
* link consistency to be spoton or not using link_consistency flag.
* This will have only one link update */
2) In delete the wind time recording is moved to unwind path.
/* Special performance case:
* Updating wind time in unwind for delete. This is done here
* as in the wind path we will not know whether its the last
* link or not. For a last link there is not use to update any
* wind or unwind time!*/
Change-Id: I209472fb816f939db4a868b97ba053b028f17ea6
BUG: 1217786
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/10170
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Coverity IDs:
1214630
1214631
1214633
1234643
Change-Id: I172c4f49bf651b2324522f9e661023f73ca05339
BUG: 789278
Signed-off-by: Nandaja Varma <nvarma@redhat.com>
Reviewed-on: http://review.gluster.org/9557
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Sakshi Bansal
Reviewed-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As suggested during the code-review of Bug1200262, have modified
GF_CBK_UPCALL to be exlusively GF_CBK_CACHE_INVALIDATION.
Thus, for any new upcall event, a new CBK procedure will be added.
Also made changes to store upcall data separately based on the
upcall event type received.
BUG: 1200262
Change-Id: I0f5e53d6f5ece16aecb514a0a426dca40fa1c755
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Reviewed-on: http://review.gluster.org/10049
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When ganesha.enable is set to on and features.ganesha is
enabled, there are a few behaviour changes that should
be seen in other volume operations.
1. ganesha.enable can be set to 'on' only
when features.ganesha is set to 'enable'
2.When gluster vol is started, and if ganesha.enable
key was set to 'on', it should automatically export the volume
via NFS-Ganesha.
3.When ganesha.enable is set to 'on', and a volume
is stopped, that volume should be unexported via NFS-Ganesha.
4. gluster vol reset <volname>
If ganesha.enable was set to on, then unexport the
volume via NFS-Ganesha.
5. gluster vol reset all
If features.ganesha is set to enable, as part
of reset all, set it to disable. This translates
to teardown cluster.
All the above problems are fixed by checking the global key
and value, depending on the value, specific functions are called.
And also, functions related to global commands
are moved to cli-cmd-global.c
Commit phase of features.ganesha enable/disable
runs the ganesha-ha.sh setup/teardown respectively.
Before the script begins, it is important that the
NFS-Ganesha service starts on all the HA nodes.
Having the start service commands in the
commit phase could lead to problems.
Moving the pre-requisite service start
commands to the 'stage' phase.
Change-Id: I5a256f94f8e1310ddcd5369f329b7168b2a24c47
BUG: 1200265
Signed-off-by: Meghana Madhusudhan <mmadhusu@redhat.com>
Reviewed-on: http://review.gluster.org/10283
Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I761927ea263b4144b851881f25791fda5b794f59
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10381
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
If scrubber detect any bad object by mismatching of checksum of scrubber
and signer then log messages shold come as a Alert instead of warning.
Change-Id: I075d80700cbe6182e525a04419a80ab18419ff91
BUG: 1210687
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Reviewed-on: http://review.gluster.org/10226
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In quota readdirp_cbk, inode ctx filled for the all entries
received.
In marker readdirp_cbk, files/directories are inspected for
dirty
There is no guarantee that entry->inode is populated.
If entry->inode is NULL, this needs to be treated as readdir
Change-Id: Id2d17bb89e4770845ce1f13d73abc2b3c5826c06
BUG: 1215550
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/10416
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ia39fa0f29eac84c18d13a94f704b1703565b504e
BUG: 1207615
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/10373
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Failing to reset scanning counter causes "incorrect" delay of around
50 seconds per directory entry. This causes scrubber to run extremely
slowly.
[
NOTE: This is a temporary fix. With the introduction of token
bucket based throttling, inducing throttle via sleep()
call would be unneeded.
]
Also, fix logging messages in scrubber to log brick and full path
of the object which is identified/marked as corrupted.
Change-Id: Id501bd15dcdbd8a09613f80f9d84050304740027
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10375
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I5cdd805821a4f3657f490223b97f42c724ee588f
BUG: 1207615
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/10249
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This implementation is same as the posix_unlink_cbk() where CTR sends
a request during a unlink to send the number of links to the inode
and posix obliges sending it using the unwind xdata dict.
For Trash xlator a unlink is stat + mkdir(if parent is not present)
+ rename. And hence this is handled in trash_unlink_rename_cbk().
Change-Id: I402e83567b88e3c9fe171379693c82937af567f9
BUG: 1205545
Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Signed-off-by: Anoop C S <achiraya@redhat.com>
Reviewed-on: http://review.gluster.org/9989
Tested-by: NetBSD Build System
Tested-by: Joseph Fernandes
Reviewed-by: Joseph Fernandes
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CID 1288784
CID 1288785
CID 1288795
CID 1288796
CID 1288797
CID 1288802
Change-Id: I51dd7653a2dce3b7b6387e5d91c1c07eb157a04b
BUG: 789278
Signed-off-by: Anoop C S <achiraya@redhat.com>
Reviewed-on: http://review.gluster.org/10315
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Tested-by: jiffin tony Thottan <jthottan@redhat.com>
Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Metadata read fops like lookup, stat etc will now fetch the xattr that
holds the size and block count information, extract the size and block
count fields and set them in respective stbuf before unwinding the
resultant iatt to the parent xlator.
Change-Id: I881be8955092fa6b75f8b0e4f3deb01344cb638e
BUG: 1207603
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/10098
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Coverity IDs:
1288760 - Read from pointer after free
1288761 - Use after free.
Change-Id: Ide9405b9c30a3e27941054a4ae61f585ef09cd8c
BUG: 789278
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-on: http://review.gluster.org/10242
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: soumya k <skoduri@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is a problem during upgrade where, inode quotas are not healed in
the contri xattrs.
Healing happens if contri xattrs are missing.
But healing doesn't happen if contri xattrs are present and inode quota
values are missing in the contri xattrs.
This patch fixes the problem
Change-Id: I6c88b74b5bb333a97c5419e24cc4ada82839f474
BUG: 1211808
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/10239
Tested-by: NetBSD Build System
Reviewed-by: Sachin Pandit <spandit@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|