| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* When a writev call occurs, the client compresses the data before
sending it to server. On the server, compressed data is decompressed.
Similarly, when a readv call occurs, the server compresses the data
before sending it to client. On the client, the compressed data is
decompressed. Thus the amount of data sent over the wire is minimized.
* Compression/Decompression is done using Zlib library.
* During normal operation, this is the format of data sent over wire :
<compressed-data> + trailer(8)
The trailer contains the CRC32 checksum and length of original
uncompressed data. This is used for validation.
HOW TO USE
----------
Turning on compression xlator:
gluster volume set <vol_name> compress on
Configurable options:
gluster volume set <vol_name> compress.compression-level 8
gluster volume set <vol_name> compress.min-size 50
Change-Id: Ib7a66b6f1f70fe002b7c513588cdf75c69370805
BUG: 923540
Original-author : Venky Shankar <vshankar@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Prashanth Pai <nullpai@gmail.com>
Signed-off-by: Prashanth Pai <ppai@redhat.com>
Reviewed-on: http://review.gluster.org/3251
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current coroutine model, mapping synctasks 1-1 with qemu internal
Coroutines, has some unresolved raciness issues. This problem usually
manifests as lifecycle mismatches between top-level (gluster created)
synctasks and the subsequently created internal coroutines from that
context. Qemu's internal queueing (and locking) can cause situations
where the top-level synctask is destroyed before the internal scheduler
has released references to memory, leading to use after free crashes
and asserts.
Simplify the coroutine model to use a single synctask as a coroutine
processor and rely on the existing native ucontext coroutine
implementation. The syncenv thread is donated to qemu and ensures a
single top-level coroutine is processed at a time. Qemu now has
complete control over coroutine scheduling.
BUG: 986775
Change-Id: I38223479a608d80353128e390f243933fc946fd6
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.org/6110
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add basic backing image support to the block-format mechanism. This
is a functionality checkpoint that enables the raw mechanism
required to support client driven "snapshot" and "clone" requests.
This change enhances the block-format setxattr command to support
an additional and optional backing image reference. For example:
setxattr -n trusted.glusterfs.block-format -v "qcow2:10GB:<bimg>" ./newimage
... where <bimg> refers to the backing image for unallocated blocks
in newimage. <bimg> can be provided in one of two formats:
- a gfid string in the following format (assuming a valid gfid):
<gfid:00000000-0000-0000-0000-000000000000>
- or a filename that must be resident in the same directory as the
new clone file being formatted. E.g.,
setxattr -n trusted.glusterfs.block-format -v "qcow2:10GB:baseimg" ./newimage
This latter format is more restrictive, simply provided for
convenience or until something more refined is available.
This change makes no assumptions about the backing image file and
affords no additional protection. It is up to the user/client to
recognize the relationship between the files and manage them
appropriately (i.e., no writes to the backing image, etc.).
BUG: 986775
Change-Id: I7aff7bdc59b85a6459001a6bfeae4db6bf74f703
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.org/5967
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for a new ZEROFILL fop. Zerofill writes zeroes to a file in
the specified range. This fop will be useful when a whole file needs to
be initialized with zero (could be useful for zero filled VM disk image
provisioning or during scrubbing of VM disk images).
Client/application can issue this FOP for zeroing out. Gluster server
will zero out required range of bytes ie server offloaded zeroing. In
the absence of this fop, client/application has to repetitively issue
write (zero) fop to the server, which is very inefficient method because
of the overheads involved in RPC calls and acknowledgements.
WRITESAME is a SCSI T10 command that takes a block of data as input and
writes the same data to other blocks and this write is handled
completely within the storage and hence is known as offload . Linux ,now
has support for SCSI WRITESAME command which is exposed to the user in
the form of BLKZEROOUT ioctl. BD Xlator can exploit BLKZEROOUT ioctl to
implement this fop. Thus zeroing out operations can be completely
offloaded to the storage device , making it highly efficient.
The fop takes two arguments offset and size. It zeroes out 'size' number
of bytes in an opened file starting from 'offset' position.
This patch adds zerofill support to the following areas:
- libglusterfs
- io-stats
- performance/md-cache,open-behind
- quota
- cluster/afr,dht,stripe
- rpc/xdr
- protocol/client,server
- io-threads
- marker
- storage/posix
- libgfapi
Client applications can exloit this fop by using glfs_zerofill introduced in
libgfapi.FUSE support to this fop has not been added as there is no system call
for this fop.
Changes from previous version 3:
* Removed redundant memory failure log messages
Changes from previous version 2:
* Rebased and fixed build error
Changes from previous version 1:
* Rebased for latest master
TODO :
* Add zerofill support to trace xlator
* Expose zerofill capability as part of gluster volume info
Here is a performance comparison of server offloaded zeofill vs zeroing
out using repeated writes.
[root@llmvm02 remote]# time ./offloaded aakash-test log 20
real 3m34.155s
user 0m0.018s
sys 0m0.040s
[root@llmvm02 remote]# time ./manually aakash-test log 20
real 4m23.043s
user 0m2.197s
sys 0m14.457s
[root@llmvm02 remote]# time ./offloaded aakash-test log 25;
real 4m28.363s
user 0m0.021s
sys 0m0.025s
[root@llmvm02 remote]# time ./manually aakash-test log 25
real 5m34.278s
user 0m2.957s
sys 0m18.808s
The argument log is a file which we want to set for logging purpose and
the third argument is size in GB .
As we can see there is a performance improvement of around 20% with this
fop.
Change-Id: I081159f5f7edde0ddb78169fb4c21c776ec91a18
BUG: 1028673
Signed-off-by: Aakash Lal Das <aakash@linux.vnet.ibm.com>
Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Reviewed-on: http://review.gluster.org/5327
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
remove server_ctx and locks_ctx from client_ctx directly and store as
into discrete entities in the scratch_ctx
hooking up dump will be in phase 3
BUG: 849630
Change-Id: I94cea328326db236cdfdf306cb381e4d58f58d4c
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/5678
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ic3f9d30d75df0bbbdf8fe28446fabe62d90fa854
BUG: 1024666
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/6194
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
gettimeofday() returns the current wall clock time and timezone.
Using these functions in order to measure the passage of time
(how long an operation took) therefore seems like a no-brainer.
This time suffer's from some limitations:
a. They have a low resolution: “High-performance” timing by
definition, requires clock resolutions into the microseconds
or better.
b. They can jump forwards and backwards in time: Computer
clocks all tick at slightly different rates, which causes
the time to drift. Most systems have NTP enabled which
periodically adjusts the system clock to keep them in sync
with “actual” time. The adjustment can cause the clock to
suddenly jump forward (artificially inflating your timing
numbers) or jump backwards (causing your timing calculations
to go negative or hugely positive). In such cases timer
thread could go into an infinite loop.
From 'man gettimeofday':
----------
..
..
The time returned by gettimeofday() is affected by discontinuous
jumps in the system time (e.g., if the system administrator manually
changes the system time). If you need a monotonically increasing
clock, see clock_gettime(2).
..
..
----------
Rationale:
For calculating interval timing for Timer thread, all that’s
needed should be clock as a simple counter that increments
at a stable rate.
This is necessary to avoid the jumps which are caused by using
"wall time", this counter must be monotonic that can never
“tick” backwards, ever.
Change-Id: I701d31e71a85a73d21a6c5cd15583e7a5a645eeb
BUG: 1017993
Signed-off-by: Harshavardhana <harsha@harshavardhana.net>
Reviewed-on: http://review.gluster.org/6070
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently to know the number of files to be healed, either user
has to go to backend and check the number of entries present in
indices/xattrop directory. But if a volume consists of large
number of bricks, going to each backend and counting the number
of entries is a time-taking task. Otherwise user can give
gluster volume heal vol-name info command but with this
approach if no. of entries are very hugh in the indices/
xattrop directory, it will comsume time.
So as a feature, new command is implemented.
Command 1: gluster volume heal vn statistics heal-count
This command will get the number of entries present in
every brick of a volume. The output displays only entries
count.
Command 2: gluster volume heal vn statistics heal-count
replica 192.168.122.1:/home/user/brickname
Here if we are concerned with just one replica.
So providing any one of the brick of a replica will get
the number of entries to be healed for that replica only.
Example:
Replicate volume with replica count 2.
Backend status:
--------------
[root@dhcp-0-17 xattrop]# ls -lia | wc -l
1918
NOTE: Out of 1918, 2 entries are <xattrop-gfid> dummy
entries so actual no. of entries to be healed are
1916.
[root@dhcp-0-17 xattrop]# pwd
/home/user/2ty/.glusterfs/indices/xattrop
Command output:
--------------
Gathering count of entries to be healed on volume volume3 has been successful
Brick 192.168.122.1:/home/user/22iu
Status: Brick is Not connected
Entries count is not available
Brick 192.168.122.1:/home/user/2ty
Number of entries: 1916
Change-Id: I72452f3de50502dc898076ec74d434d9e77fd290
BUG: 1015990
Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
Reviewed-on: http://review.gluster.org/6044
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
bdrv_init() is intended to be invoked only once. If invoked again
after initialization (i.e., due to graph changes), the block driver
registration code can reinsert entries into bdrv_drivers,
effectively corrupting the NULL terminated linked list. This
manifests as an infinite loop for callers attempt to traverse the
list (to locate a driver).
BUG: 986775
Change-Id: I2f95bda3c753dece4648cdad009d0e2a2d16f644
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.org/6021
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
change in encoding method of changelog was critical section for
"fop dispatch thread", "roll-over thread" and "reconfigure dispatch thread".
In this patch the "encoding-method" is changed by the reconfigure dispatch thread
lazily during handle_change, which solves the concurrency among the racing
threads.
BUG: 1002940
Change-Id: I78c3e8887efa46d0fcc60755cdf4243031cfa3eb
Signed-off-by: Ajeet Jha <ajha@redhat.com>
Reviewed-on: http://review.gluster.org/5844
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Harshavardhana <harsha@harshavardhana.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Block all signal except those which are set for explicit handling
in glusterfs_signals_setup(). Since thread spawning code in
libglusterfs and xlators can get called from application threads
when used through libgfapi, it is necessary to do this blocking.
Change-Id: Ia320f80521a83d2edcda50b9ad414583a0175281
BUG: 1011662
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/5995
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Support the readdirp fop in qemu-block to ensure that image files
are handled correctly when readdirp is enabled. E.g., without
readdirp support, incorrect stat data for formatted files can be
reported back (and cached) via the client.
BUG: 986775
Change-Id: Ibc4bd0b42548953ebe30832f3d853bb68095f0ac
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.org/5946
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Building RPMs from a 'make dist' tarball fails when qemu-block is
enabled. Enabling is done automatically when the glib2 development files
are available (enabled by ./configure).
Manual building with:
$ ./autogen.sh && ./configure && make dist && rpmbuild -ta *.tar.gz
Building in mock works fine, glib2-devel is not installed by default so
the qemu-block xlator gets disabled. This change also adds glib2-devel
to the BuildRequires in the glusterfs.spec file, causing the qemu-block
xlator to be built by default, and included in the glusterfs RPM.
Change-Id: Ibb73628772586d9e07bbfde7a8ff2fc973489086
BUG: 986775
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/5896
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I7aab98a70793bee936272f0b795f4c22c3733dd2
BUG: 1001055
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/5705
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is required by Geo-Replication that does auxillary mount
with client-pid as -1 (which has special treatment at specific
places in GlusterFS), to trigger xtime updates on the intermediate
master in a cascading setup.
Marker too had a check to "not" mark updates for geo-replication's
auxillary mounts. With the new geo-replication design, xtimes are
not set by the master on the slave for all entities. Due to this
cascading setups were broken.
This patch introduces "geo-replication.ignore-pid-check" option
as a "override" for the client-pid check for gsyncd's client-pid.
When this options is enabled, marker start "marking" even if the
updates are from the special client.
Geo-Replication on the detection of itself being an intermediate
master, enables this option.
Change-Id: I9f7140edd12fef5480595ee0f93f35b94cdb8345
BUG: 996371
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/5591
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Avra Sengupta <asengupt@redhat.com>
Tested-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch extends the persistent instrumentation work done by
Aravinda (@avishwa), by introducing a handfull of instrumentation
variables for crawl. These variables are "pulled up" by glusterd
in the event of a geo-replication status cli command and looks
something like below:
"Uptime=00:21:10;FilesSyned=2982;FilesPending=0;BytesPending=0;DeletesPending=0;"
"FilesPending", "BytesPending" and "DeletesPending" are short-lived
variables that are non-zero when a changelog is being processes (ie.
when an active sync in ongoing). After a successfull changelog process
"FilesPending" is summed up into "FilesSynced". The three short-lived
variabled are then reset to zero and the data is persisted
Additionally this patch also reverts some of the changes made for
BZ #986929 (those were not needed).
Change-Id: I948f1a0884ca71bc5e5bcfdc017d16c8c54fc30b
BUG: 990420
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/5441
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for internals snapshots using QCOW2 and
general framework for external snapshots (next patch) with
QCOW2 and QED.
For internal snapshots, the file must be "initialized" or
"formatted" into QCOW2 format, and specify a file size.
Snapshots can be created, deleted, and applied ("goto").
e.g:
// Format and Initialize
sh# setfattr -n trusted.glusterfs.block-format -v qcow2:10GB /mnt/imgfile
sh# ls -l /mnt/imgfile
-rw-r--r-- 1 root root 10G Jul 18 21:20 imgfile
// Create a snapshot
sh# setfattr -n trusted.glusterfs.block-snapshot-create -v name1 imgfile
// Apply a snapshot
sh# setfattr -n trusted.gluterfs.block-snapshot-goto -v name1 imgfile
Change-Id: If993e057a9455967ba3fa9dcabb7f74b8b2cf4c3
BUG: 986775
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/5367
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prints, in the statedump, the information about the mount that
performed the inode/entry lk.
For the entrylks that are granted after a blocked state, the
blocked time is not printed. A patch for that will be sent
later.
Change-Id: Ib0c1ed21fa9328b435f96b590dd343f59814a08d
BUG: 915629
Signed-off-by: Anuradha Talur <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/5712
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I23b8cb7223b91a55af1cd4214f61bbe0e87351f6
BUG: 952029
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Reviewed-on: http://review.gluster.org/5683
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
| |
BUG: 952029
Change-Id: I7405d473d369a4a951836eceda4faccbad19ce0e
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Reviewed-on: http://review.gluster.org/5497
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A volume would fail to start if any one of the bricks fails staging or
fails to start, even with the 'force' option. With this patch, when the
'force' option is given for a volume start, glusterd will continue and
start other bricks even if one fails staging or starting.
Also did a small fix in changelog, to prevent it crashing when it fails
to init.
Change-Id: I7efbd9ab13d12d69b0335ae54143fa17586f8f98
BUG: 994375
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/5510
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
at syslog side, log message is identified by its properties like
programname, pid, etc. brick/mount processes need to be identified
uniquely as they are different process of gluterfsd/glusterfs. At
rsyslog side, log separated by programname/app-name with pid works but
bit hard to identify them in long run which process is for what
brick/mount.
This patch fixes by setting identity string at openlog() which sets
programname/app-name as similar to old style log file prefixed by
gluster, glusterd, glusterfs or glusterfsd
Change-Id: Ia05068943fa67ae1663aaded1444cf84ea648db8
BUG: 928648
Signed-off-by: Bala.FA <barumuga@redhat.com>
Reviewed-on: http://review.gluster.org/5541
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In 3.3, inode locks of both metadata and data are competing in same
domain called data domain (old style). This coupled with eager-lock,
delayed post-ops introduce delays for metadata operations like chmod,
chown etc. To avoid this problem, inode locks for metadata ops are
moved to different domain called metadata domain in 3.4 (new style).
But when both 3.3 clients and 3.4 clients are present, 3.4 clients
for metadata operations still need to take locks in "old style" so
that proper synchronization happens across 3.3 and 3.4 clients. Only
when all clients are >= 3.4 locks will be taken in "new style" for
metadata locks. Because of this behavior as long as at least one 3.3
client is present, delays will be perceived for doing metadata
operations on all 3.4 clients while data operations are in
progress (Ex: Untar will untar one file per sec).
Fix:
Make locks xlators translate old-style metadata locks to new-style
metadata locks. Since upgrade process suggests upgrading servers
first and then clients, this approach gives good results.
Tests:
1) Tested that old style metadata locks are converted to new style by
locks xlator using gdb
2) Tested that disconnects purge locks in meta-data domain as well
using gdb and statedumps.
3) Tested that untar performance is not hampered by meta-data and
data operations.
4) Had two mounts one with orthogonal-meta-data on and other with
orthogonal-meta-data off ran chmod 777 <file> on one mount and
chmod 555 <file> on the other mount in while loops when I took
statedumps I saw that both the transports are taking lock on
same domain with same range.
18:49:30 :) ⚡ sudo grep -B1 "ACTIVE" /usr/local/var/run/gluster/home-gfs-r2_0.324.dump.*
home-gfs-r2_0.324.dump.1375794971-lock-dump.domain.domain=r2-replicate-0:metadata
home-gfs-r2_0.324.dump.1375794971:inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=9223372036854775806, len=0, pid = 7525, owner=78f9e652497f0000, transport=0x15ac9e0, , granted at Tue Aug 6 18:46:11 2013
home-gfs-r2_0.324.dump.1375795051-lock-dump.domain.domain=r2-replicate-0:metadata
home-gfs-r2_0.324.dump.1375795051:inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=9223372036854775806, len=0, pid = 8879, owner=0019cc3cad7f0000, transport=0x158f580, , granted at Tue Aug 6 18:47:31 2013
Change-Id: I268df4efd93a377a0c73fbc59b739ef12a7a8bb6
BUG: 993981
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/5503
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implementation of client_t
The feature page for client_t is at
http://www.gluster.org/community/documentation/index.php/Planning34/client_t
In addition to adding libglusterfs/client_t.[ch] it also extracts/moves
the locktable functionality from xlators/protocol/server to libglusterfs,
where it is used; thus it may now be shared by other xlators too.
This patch is large as it is. Hooking up the state dump is left to do
in phase 2 of this patch set.
(N.B. this change/patch-set supercedes previous change 3689, which was
corrupted during a rebase. That change will be abandoned.)
BUG: 849630
Change-Id: I1433743190630a6d8119a72b81439c0c4c990340
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/3957
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Other enhancements being:
* ignore fops made by rebalance
* ignore internally triggered fops
BUG: 987734
Change-Id: I7dd164ae3c209fdb8ec43a27e67b8846f937c93b
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/5380
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ia310af96b25f29351f3adc4bbc97aea271df7673
BUG: 987747
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/5379
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ibd0faefecc15b6713eda28bc96794ae58aff45aa
BUG: 847839
Original Author: Amar Tumballi <amarts@redhat.com>
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/5133
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the initial version of the Changelog Translator.
What is it
-----------
Goal is to capture changes performed on a GlusterFS volume.
The translator needs to be loaded on the server (bricks) and
captures changes in a plain text file inside a configured
directory path (controlled by "changelog-dir", should be
somewhere in <export>/.glusterfs/changelog by default).
Changes are classified into 3 types:
- Data: : TYPE-I
- Metadata : TYPE-II
- Entry : TYPE-III
Changelog file is rolled over after a certain time interval
(defauls to 60 seconds) after which a changelog is started.
The thing to be noted here is that for a time interval
(time slice) multiple changes for an inode are recorded only
once (ie. say for 100+ writes on an inode that happens within
the time slice has only a single corresponding entry in the
changelog file). That way we do not bloat up the changelog
and also save lots of writes.
Changelog Format
-----------------
TYPE-I and TYPE-II changes have the gfid on the entity on
which the operation happened. TYPE-III being a entry op
requires the parent gfid and the basename. Changelog format
has been kept to a minimal and it's upto the consumers to
do the heavy loading of figuring out deletes, renames etc..
A single changelog file records all three types of changes,
with each change starting with an identifier ("D": DATA,
"M": METADATA and "E": ENTRY). Option is provided for the
encoding type (See TUNABLES).
Consumers
----------
The only consumer as of today would be geo-replication, although
backup utilities, self-heal, bit-rot detection could be possible
consumers in the future.
CLI
----
By default, change-logging is disabled (the translator is present
in the server graph but does nothing). When enabled (via cli) each
brick starts to log the changes. There are a set of tunable that
can be used to change the translators behaviour:
- enable/disable changelog (disabled by default)
gluster volume set <volume> changelog {on|off}
- set the logging directory (<brick>/.glusterfs/changelogs is the
default)
gluster volume set <volume> changelog-dir /path/to/dir
- select encoding type (binary (default) or ascii)
gluster volume set <volume> encoding {binary|ascii}
- change the rollover time for the logs (60 secs by default)
gluster volume set <volume> rollover-time <secs>
- when secs > 0, changelog file is not open()'d with O_SYNC flag
- and fsync is trigerred periodically every <secs> seconds.
gluster volume set <volume> fsync-interval <secs>
features/changelog: changelog consumer library (libgfchangelog)
A shared library is provided for the consumer of the changelogs
for easy acess via APIs. Application can link against this library
and request for changelog updates. Conversion of binary logs to
human-readable ascii format is also taken care by the library which
keeps a copy of the changelog in application provided working
directory.
Change-Id: I75575fb7f1c53d2bec3dba1a329ea7bb3c628497
BUG: 847839
Original Author: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/5127
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Following is the semantics of the 'cmd':
1) If @domain is NULL - returns no. of locks blocked/granted in all domains
2) If @domain is non-NULL- returns no. of locks blocked/granted in that
domain
3) If @domain is non-existent - returns '0'; This is important since
locks xlator creates a domain in a lazy manner.
where @domain - a string representing the domain.
Change-Id: I5e609772343acc157ca650300618c1161efbe72d
BUG: 951195
Original-author: Krishnan Parthasarathi <kparthas@redhat.com>
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Signed-off-by: shishir gowda <sgowda@redhat.com>
Reviewed-on: http://review.gluster.org/4889
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for the DISCARD file operation. Discard punches a hole
in a file in the provided range. Block de-allocation is implemented
via fallocate() (as requested via fuse and passed on to the brick
fs) but a separate fop is created within gluster to emphasize the
fact that discard changes file data (the discarded region is
replaced with zeroes) and must invalidate caches where appropriate.
BUG: 963678
Change-Id: I34633a0bfff2187afeab4292a15f3cc9adf261af
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.org/5090
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement support for the fallocate file operation. fallocate
allocates blocks for a particular inode such that future writes
to the associated region of the file are guaranteed not to fail
with ENOSPC.
This patch adds fallocate support to the following areas:
- libglusterfs
- mount/fuse
- io-stats
- performance/md-cache,open-behind
- quota
- cluster/afr,dht,stripe
- rpc/xdr
- protocol/client,server
- io-threads
- marker
- storage/posix
- libgfapi
BUG: 949242
Change-Id: Ice8e61351f9d6115c5df68768bc844abbf0ce8bd
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.org/4969
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
IOW with more than just python2.6. Python2.7 is certainly what's on
the vast majority of non-RHEL systems that are out there. Also our
rpm.t regression test will build on epel-5 under mock; RHEL5 has
Python2.4.
Change-Id: I09c95c1fb6b3498e910ad239c4f0af7f786c3700
BUG: 961856
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/5007
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When entrylk lock requests are blocked and granted aysnchronously,
the entrylk lock structure was getting leaked.
Change-Id: Ie3f29f550730189f27745d991b029e50c63e63da
BUG: 962350
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/4991
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Extend the following fops with Python:
* open
* readv
* writev
* opendir
* readdir
* readdirp
* stat
* fstat
* statfs
* setxattr
* getxattr
* fsetxattr
* fgetxattr
* removexattr
* fremovexattr
* link
* unlink
* readlink
* symlink
* mkdir
* rmdir
Add fd_t, inode_t and iatt_t structure types.
Modify loc_t structure type; Alter the data types of the following
attributes - inode, parent, gfid, pargfid.
Modify uuid2str function, which returns a string equivalent for a ctype
object representing a gfid, to make use of python's 'uuid' module for
accurate representation of uuids.
by Justin Clift:
Adjust debug-trace.py to work with Python 2.6
Work around 'zero length field name in format' bug in
negative.py's uuid2str function
Fix indentation errors in negative.py, glupy.h,
glupy.c, gluster.py
Change-Id: If0fcfb2866e21c0380a973f8ffab9ea7b6a4cd5d
BUG: 961856
Signed-off-by: Ram Raja <rraja@redhat.com>
Reviewed-on: http://review.gluster.org/4907
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Justin Clift <jclift@redhat.com>
Tested-by: Justin Clift <jclift@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I3891ef6eaf6ede7c8cbedc3298ce2501a69b2b05
BUG: 961856
Original-author: Jeff Darcy <jdarcy@redhat.com>
Signed-off-by: Ram Raja <rraja@redhat.com>
Reviewed-on: http://review.gluster.org/4906
Reviewed-by: Justin Clift <jclift@redhat.com>
Tested-by: Justin Clift <jclift@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added code to display extra information when status command
is executed.
Information shown now are
1 Number of files synced
2 crawl time
3 total sync time
4 bytes synced
bytes synced is taken from rsync output .
--stats option of rsync gives extra infor
mation about the sync.In stats output there
is a field called Total transferred file
size which states the ammount of bytes synced .
This information is parsed from stdout output
using regular expressions.Bytes synced information
can be used to calculate throughput.
Change-Id: Id9bba9fff45ee7049bb8257c6fd918e5237e05b1
BUG: 947774
Signed-off-by: sarvotham s pai <spai@redhat.com>
Reviewed-on: http://review.gluster.org/4749
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
... so it's easy to figure out errno caused it. As of now
it's only due to ENOSPC. Logging is done in the error handling
routine, so any further changes that require unlinking of the
timestamp file due to some error condition(s) are logged.
Change-Id: Ia59338e2e32b2adbbd1d56aa260018270f1abae9
BUG: 853911
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/4649
Reviewed-by: Csaba Henk <csaba@redhat.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Users are still using geo-rep with the old, deprecated, insecure, unsupported
ssh setup. Not their fault -- the implementation of the new method had the
following charasteristics:
- old method is possible, but with default settings it's not working
- it can be made operational by fiddling with "remote-gsyncd" tunable
- with default setting, an unhelpful, actually misleading error message is
produced
- the UI gave no hint to the changes in the ssh setup
http://review.gluster.org/4392 tried to fix these; what it accomplished was
unrestricted support to the bad practice (by making the default old setup
operational).
From this on:
- we disable the old method by reserving the "remote-gsyncd" tunable
- if the old method is attempted, give a hint what to do
Change-Id: Icade94725d8d8d2d4c89cab992d4226351637b86
BUG: 895656
Signed-off-by: Csaba Henk <csaba@redhat.com>
Reviewed-on: http://review.gluster.org/4602
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds an option, features.quota-deem-statfs (default off) to consider the
quota limits while calculating the volume stats.
Eg: Backend is of size 10GB and limit set on / is 5GB. If the option is off
df show actual size to be 10GB and when it is on df shows 5GB.
Change-Id: Ib30733bb69afecce1dea9d0491af89d551d214cc
BUG: 905425
Signed-off-by: Varun Shastry <vshastry@redhat.com>
Reviewed-on: http://review.gluster.org/4511
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is useful to find all calls that remove a file from the protected
directory, including renames and internal calls. Such calls will cause
a stack trace to be logged. There's a filter script to add the needed
translators, and then the new functionality can be invoked with one of
the following commands.
setfattr -n trusted.glusterfs.protect -v log $dir
setfattr -n trusted.glusterfs.protect -v reject $dir
setfattr -n trusted.glusterfs.protect -v anything_else $dir
The first logs calls, but still allows them. The second rejects them
with EPERM. The third turns off protection for that directory.
Change-Id: Iee4baaf8e837106be2b4099542cb7dcaae40428c
BUG: 888072
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: http://review.gluster.org/4496
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
.. using the environment variable $PYTHON
Change-Id: Ieaad8be98b826c803268216826e250d9944c8190
BUG: 882127
Signed-off-by: Joe Julian <me@joejulian.name>
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/4252
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I1c9541058c7d07786539a3266ca125a6a15287d8
BUG: 859835
Signed-off-by: Anand Avati <avati@redhat.com>
Original-author: Kacper Kowalik (Xarthisius) <xarthisius.kk@gmail.com>
Signed-off-by: Kacper Kowalik (Xarthisius) <xarthisius.kk@gmail.com>
Reviewed-on: http://review.gluster.org/3967
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
... so that a mountbroker session which is initiated b/w master
and slave does not use the same log file if it's started after
a normal geo-rep session b/w master and slave. This results in
EPERM as the log file is owned by root and the geo-rep slave
process (now running as a non privileged user) does not have
access to it.
Also, having separate log file directory for mountbroker sessions
looks clean.
NOTE: geo-rep's client mount log file location remains unchanged.
Change-Id: Ic7a732e250aee5393b9c3f6ebf6dfe2c310b7fe4
BUG: 893960
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/4407
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ibf639695ebd99c11c6960c9be82c0cee71b50744
BUG: 905864
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/4458
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* warnings on 'void *' arguments
* warnings on empty initializations
* warnings on empty array (array[0])
Change-Id: Iae440f54cbd59580eb69f3ecaed5a9926c0edf95
BUG: 875913
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/4219
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* move 'dict_keys_join()' from api/glfs_fops.c to libglusterfs/dict.c
- also added an argument which is treated as a filter function if
required, currently useful for fuse.
* now 'make CFLAGS="-std=gnu99 -pedantic" 2>&1 | grep nested' gives
no output.
Change-Id: I4e18496fbd93ae1d3942026ef4931889cba015e8
Signed-off-by: Amar Tumballi <amarts@redhat.com>
BUG: 875913
Reviewed-on: http://review.gluster.org/4187
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I664614677bc887ce087bfca067e6e57f0d6b659d
BUG: 824753
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/4272
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
http://www.python.org/dev/peps/pep-0352/ explains that the .message
property of BaseException is being removed. Most of the other exception
handlers access <Exception>.args[] which should be suitable for this
case too.
Change-Id: I1810450b78d2b3d7f8bd07f2beb02cbe9e2adecb
BUG: 888346
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/4328
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The gsyncd invocation that instruments the "geo-rep config" command is
multiplexed over peers to ensure the uniformity of configuration.
In general, that works well, but checkpoint setting is a special case,
because (unlike other instances of config-set) it is logged (as recording
of checkpoint events is part of the feature).
Problem is that the path components leading to the log file are
created only on the original node, where gsyncd was started.
Therefore the logging attempt will fail on the other nodes.
Fix: ignore if opening the logfile on behalf of checkpoint setting
fails with ENOENT.
Change-Id: I677f3f081bf4b9e3ba4d25d58979d86931e6beb4
BUG: 881997
Signed-off-by: Csaba Henk <csaba@redhat.com>
Reviewed-on: http://review.gluster.org/4248
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: Christos Triantafyllidis <ctrianta@redhat.com>
Reviewed-by: Christos Triantafyllidis <ctrianta@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
... in gsyncd python code. Indeed, use the configuration
mechanism to set it suitably from glusterd.
Change-Id: I9fe2088b14d28588d1e64fe892740cc5755b8365
BUG: 868877
Signed-off-by: Csaba Henk <csaba@redhat.com>
Reviewed-on: http://review.gluster.org/4143
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|