| Commit message (Collapse) | Author | Age | Files | Lines |
| |
CLI command for the bitrot feature:
volume bitrot <volname> enable|disable
The above command enables or disables the bitrot feature for the given
volume.
BUG: 1170075
Change-Id: Ie84002ef7f479a285688fdae99c7afa3e91b8b99
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Signed-off-by: Anand nekkunti <anekkunt@redhat.com>
Signed-off-by: Dominic P Geevarghese <dgeevarg@redhat.com>
Reviewed-on: http://review.gluster.org/9866
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
| |
GlusterFS volume snapshot provides point-in-time copy
of a GlusterFS volume. Currently, GlusterFS volume
snapshots can be easily scheduled by setting up
cron jobs on one of the nodes in the GlusterFS
trusted storage pool. This is a single point of failure (SPOF),
as scheduled jobs can be missed if the node running the cron
jobs dies.
This patch addresses the above problem.
The snap_scheduler.py helper script expects the user to install
the argparse python module before using it.
Further details for the same are available at:
http://www.gluster.org/community/documentation/index.php/Features/Scheduling_of_Snapshot
Change-Id: I2c357af5b7d3e66f270d20eef50cdeecdcbe15c7
BUG: 1198027
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/9788
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
| |
==========================================================================
Inode quota
==========================================================================
= Currently, the only way to retrieve the number of files/objects in a =
= directory or volume is to do a crawl of the entire directory/volume. =
= This is expensive and is not scalable. =
= =
= The proposed mechanism will provide an easier alternative to determine =
= the count of files/objects in a directory or volume. =
= =
= The new mechanism proposes to store count of objects/files as part of =
= an extended attribute of a directory. Each directory's extended =
= attribute value will indicate the number of files/objects present =
= in a tree with the directory being considered as the root of the tree. =
= =
= The count value can be accessed by performing a getxattr(). =
= Cluster translators like afr, dht and stripe will perform aggregation =
= of count values from various bricks when getxattr() happens on the key =
= associated with file/object count. =
A new interface is introduced:
------------------------------
limit-objects : limit the number of inodes at directory level
list-objects : list the directories where the limit is set
remove-objects : remove the limit from the directory
==========================================================================
CLI COMMAND:
gluster volume quota <volname> limit-objects <path> <number> [<percent>]
* <number> is the hard limit on the number of objects for path "<path>".
If the hard limit is exceeded, creation of files/directories is no longer
permitted.
* <percent> is the soft limit on the number of objects for path "<path>".
If the soft limit is exceeded, a warning is issued for each creation.
CLI COMMAND:
gluster volume quota <volname> remove-objects [path]
==========================================================================
CLI COMMAND:
gluster volume quota <volname> list-objects [path] ...
Sample output:
------------------
Path  Hard-limit  Soft-limit  Used  Available  Soft-limit exceeded?  Hard-limit exceeded?
------------------------------------------------------------------------------------------
/dir  10          80%         10    0          Yes                   Yes
==========================================================================
[root@snapshot-28 dir]# ls
a b file11 file12 file13 file14 file15 file16 file17
[root@snapshot-28 dir]# touch a1
touch: cannot touch `a1': Disk quota exceeded
* Nine files are created in directory "dir", and the directory itself is
  included in the count too. Hence the limit "10" is reached and further
  file creation fails.
==========================================================================
Note: We have also done some re-factoring in cli for volume name
validation. New function cli_validate_volname is created
==========================================================================
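The limit semantics above can be sketched in a few lines of Python. This
is only an illustration, not the quota translator's code: the class and
field names are made up, and treating "exceeded" as used >= limit is an
assumption read off the sample row (used 10, hard-limit 10, "Yes").

```python
from dataclasses import dataclass


@dataclass
class ObjectQuota:
    hard_limit: int    # <number> given to limit-objects
    soft_percent: int  # <percent> given to limit-objects, relative to hard_limit

    def status(self, used: int) -> dict:
        """Summarize usage in the shape of the list-objects sample output."""
        soft_limit = self.hard_limit * self.soft_percent / 100
        return {
            "available": max(self.hard_limit - used, 0),
            # Assumption: "exceeded" means used >= limit, which matches the
            # sample row (used 10, hard-limit 10 -> "Yes").
            "soft_exceeded": used >= soft_limit,
            "hard_exceeded": used >= self.hard_limit,
        }


# Nine entries plus the directory itself give a count of 10.
quota = ObjectQuota(hard_limit=10, soft_percent=80)
print(quota.status(used=10))
```

With these inputs the sketch reports nothing available and both limits
exceeded, in line with the /dir row of the sample output.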
Change-Id: I1823497de4f790a2a20ebb1770293472ea33ee2b
BUG: 1190108
Signed-off-by: Sachin Pandit <spandit@redhat.com>
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/9769
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
| |
With the RPC based changes to {libgf}changelog, loading shared
objects dynamically would need symbols to be available from
other shared libraries. As an example, creating an RPC listener
loads the RPC transport shared object which requires symbols
to be available from already loaded shared objects.
Using RTLD_GLOBAL makes the symbols available for symbol
resolution of subsequently loaded libraries.
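For readers unfamiliar with the flag, here is a small Python/ctypes
sketch of loading a shared object with RTLD_GLOBAL. It is not the code
touched by this patch (which is C); the helper name is made up, and the
demo passes None (the running program itself on POSIX systems) so that
no real library path has to be assumed.

```python
import ctypes
import os


def load_with_global_symbols(path):
    """dlopen() a shared object so its symbols are visible to later loads.

    ctypes normally loads with its default mode; passing
    mode=ctypes.RTLD_GLOBAL asks the dynamic loader to make the symbols
    available for resolving subsequently loaded libraries.
    """
    return ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)


# Demo only: None refers to the already running program on POSIX systems.
handle = load_with_global_symbols(None)
print(handle.getpid() == os.getpid())
```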
Change-Id: I3d3ef790eded82911f05836c707509157680645c
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9814
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
| |
This patch introduces RPC based communication between the changelog
translator and libgfchangelog. It replaces the old pathetic stream
based interaction that existed earlier (due to time constraints :-/).
Changelog, upon initialization, starts an RPC server (rpcsvc) allowing
clients to invoke a probe API as a bootup mechanism to request
event notifications. During probe, clients can choose an event
filter specifying the type(s) of events they are interested in. As
of now there is no way to change the event notification set once
the probe RPC call is made, but that should be easy to implement.
The actual event notifications are sent on a separate RPC session.
The client (libgfchangelog) itself starts an RPC server, to which the
changelog translator "connects back" during probe. Notifications
are dispatched by a bunch of threads from the server (translator)
and the client optionally orders them if ordered notifications
are required. FOPs fill in their respective event details in a
buffer (rot-buffs to be particular) and a bunch of threads
(consumers) swap the buffers out of rotation and dispatch them
via RPC. To avoid writer starvation, the number of dispatcher
threads is one less than the number of buffer lists in rot-buffs.
libgfchangelog becomes purely callback based -- upon event
notification from the server (and re-ordering them if required)
invoke a callback routine specified by consumer(s).
A major part of the patch is also aimed at providing backward
compatibility for geo-replication, which was one of the main
consumers of the stream based API. Also, this patch does not
"turn on" event notifications for all fops, just the few that
are currently required. Another pain point is that the
server does not filter events before dispatching them to the
clients. That load is taken up by the client itself (although
it's done at the library layer rather than making it hard on
the callback implementor). This needs improvement and care
needs to be taken to not load the server up with expensive
filtering mechanisms.
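The library-side filtering described above can be pictured with a short
Python sketch. Everything here is hypothetical (the event names, the
wrapper function and its signature); the real interface lives in the C
sources of libgfchangelog. It only shows the idea that the server sends
every event and the client library drops what the consumer did not ask
for before invoking the callback.

```python
from enum import IntFlag


class Event(IntFlag):
    """Hypothetical event types; the real names live in the changelog code."""
    CREATE = 1
    WRITE = 2
    UNLINK = 4


def make_filtering_callback(interest, callback):
    """Wrap a consumer callback so it only sees events it asked for."""
    def deliver(event_type, payload):
        if event_type & interest:
            callback(event_type, payload)
            return True
        return False
    return deliver


seen = []
deliver = make_filtering_callback(
    Event.CREATE | Event.UNLINK,
    lambda event_type, payload: seen.append((event_type, payload)),
)
for event in [(Event.CREATE, "a"), (Event.WRITE, "a"), (Event.UNLINK, "a")]:
    deliver(*event)
print(seen)
```

Moving the same mask check to the server would save bandwidth, at the
cost the message warns about: extra per-event work on the brick.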
Change-Id: Ibf60a432b68f2dfa60c6f9add2bcfd37a9c41395
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9708
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
| |
These two functions add support for POSIX ACLs through the GFAPI-handle
interface.
The initial infrastructure for POSIX ACLs based on libacl has been added
with the required changes to the POSIX xlator:
- http://review.gluster.org/9627
NetBSD does not support POSIX ACLs, so using any of the functions should
return ENOTSUP.
URL: http://www.gluster.org/community/documentation/index.php/Features/Improved_POSIX_ACLs
Change-Id: Ie74f3f963c3f9d576cb2f2a1e6d97e3cd4b01eda
BUG: 1185654
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9736
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
| |
*************************************************************************
Libgfdb
*************************************************************************
Libgfdb provides abstract mechanism to record extra/rich metadata
required for data maintenance, such as data tiering/classification.
It provides consumer with API for recording and querying, keeping
the consumer abstracted from the data store used beneath for storing data.
It works in a plug-and-play model, where data stores can be plugged-in.
Presently we have a plugin for Sqlite3. In the future it will provide a
recording and querying performance optimizer. In the current implementation
the schema of the metadata is fixed.
Schema:
~~~~~~
GF_FILE_TB Table:
~~~~~~~~~~~~~~~~~
This table has one entry per file inode. It holds the metadata required to
make decisions in data maintenance.
GF_ID (Primary key) : File GFID (Universal Unique IDentifier in the namespace)
W_SEC, W_MSEC : Write wind time in sec & micro-sec
UW_SEC, UW_MSEC : Write un-wind time in sec & micro-sec
W_READ_SEC, W_READ_MSEC : Read wind time in sec & micro-sec
UW_READ_SEC, UW_READ_MSEC : Read un-wind time in sec & micro-sec
WRITE_FREQ_CNTR INTEGER : Write Frequency Counter
READ_FREQ_CNTR INTEGER : Read Frequency Counter
GF_FLINK_TABLE:
~~~~~~~~~~~~~~
This table has all the hardlinks to a file inode.
GF_ID : File GFID (Composite Primary Key)
GF_PID : Parent Directory GFID (Composite Primary Key)
FNAME : File Base Name (Composite Primary Key)
(GF_ID, GF_PID and FNAME together form the primary key.)
FPATH : File Full Path (It's redundant for now, this will go)
W_DEL_FLAG : This flag is used for crash consistency when a link is unlinked,
i.e. set to 1 during unlink wind; during unwind this record
is deleted
LINK_UPDATE : This flag is used when a link is changed, i.e. on rename.
Set to 1 during rename wind and set to 0 during rename unwind
Libgfdb API:
~~~~~~~~~~~
Refer libglusterfs/src/gfdb/gfdb_data_store.h
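As a rough illustration of the schema above, the following Python sketch
creates both tables in an in-memory SQLite database. Table and column
names are taken from this message; the column types, the defaults and
the sample GFID strings are assumptions, and the real DDL in the plugin
may differ.

```python
import sqlite3

SCHEMA = """
CREATE TABLE GF_FILE_TB (
    GF_ID           TEXT PRIMARY KEY,
    W_SEC           INTEGER, W_MSEC       INTEGER,
    UW_SEC          INTEGER, UW_MSEC      INTEGER,
    W_READ_SEC      INTEGER, W_READ_MSEC  INTEGER,
    UW_READ_SEC     INTEGER, UW_READ_MSEC INTEGER,
    WRITE_FREQ_CNTR INTEGER,
    READ_FREQ_CNTR  INTEGER
);
CREATE TABLE GF_FLINK_TABLE (
    GF_ID       TEXT,
    GF_PID      TEXT,
    FNAME       TEXT,
    FPATH       TEXT,
    W_DEL_FLAG  INTEGER DEFAULT 0,
    LINK_UPDATE INTEGER DEFAULT 0,
    PRIMARY KEY (GF_ID, GF_PID, FNAME)
);
"""

db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)

# One inode with two hard links: one row in the file table, two link rows.
db.execute(
    "INSERT INTO GF_FILE_TB (GF_ID, WRITE_FREQ_CNTR, READ_FREQ_CNTR) "
    "VALUES (?, 0, 0)",
    ("gfid-1",),
)
db.executemany(
    "INSERT INTO GF_FLINK_TABLE (GF_ID, GF_PID, FNAME, FPATH) VALUES (?, ?, ?, ?)",
    [("gfid-1", "pgfid-a", "f", "/a/f"), ("gfid-1", "pgfid-b", "g", "/b/g")],
)
links = db.execute(
    "SELECT FPATH FROM GF_FLINK_TABLE WHERE GF_ID = ? ORDER BY FPATH",
    ("gfid-1",),
).fetchall()
print(links)
```

The composite key means the same (GFID, parent GFID, base name) triple
cannot be recorded twice, while one GFID may appear under several
parents or names.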
Change-Id: I2e9fbab3878ce630a7f41221ef61017dc43db11f
BUG: 1194753
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/9683
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
| |
Problem:
marker-quota.c: In function 'mq_inspect_directory_xattr_task':
marker-quota.c:3451:31: warning: variable 'buf' set but not
used [-Wunused-but-set-variable]
struct iatt buf = {0,};
Change-Id: I211378328bdb2509a5d2a186d173f7f30a670c8a
BUG: 1198849
Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com>
Reviewed-on: http://review.gluster.org/9928
Reviewed-by: Vijaikumar Mallikarjuna <vmallika@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
| |
Marker can fail or account incorrect numbers when it doesn't find an
ancestry for an inode.
Solution:
Currently build_ancestry is done only on demand in the write/create FOPs
in the quota enforcer.
It is good to do this in quota_lookup as well.
Change-Id: I8aaf5b3e05a3ca51e7ab1eaa1b636a90f659a872
BUG: 1184885
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/9478
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
| |
position in the graph rather than relative (local) to a particular
translator.
Encoding the volume in this way allows a single translator to manage
which brick is currently being scanned for directory entries. Using a
single translator minimizes allocated bits in the d_off. It also allows
multiple DHT translators in the same graph to have a common frame of
reference (the graph position) for which brick is being read. Multiple
DHT translators are needed for the Tiering feature.
The fix builds off a previous change (9332) which removed subvolume
encoding from AFR. The fix makes an equivalent change to the EC
translator.
More background can be found in fix 9332 and gluster-dev discussions [1].
DHT and AFR/EC are responsible (as before) for choosing which brick to
enumerate directory entries in over the readdir lifecycle.
The client translator receiving the readdir fop encodes the dht_t. It
is referred to as the "leaf node" in the graph and corresponds to the
brick being scanned.
When DHT decodes the d_off, it translates the leaf node to a local
subvolume, which represents the next node in the graph leading to
the brick.
Tracking of leaf nodes is done in common utility functions. Leaf node
counts and positional information are updated on a graph switch.
[1] www.gluster.org/pipermail/gluster-devel/2015-January/043592.html
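To make the encode/decode round trip concrete, here is a Python sketch
under loudly stated assumptions: the bit width, the choice of the low
bits for the leaf id, and the per-subvolume leaf ranges are all
hypothetical. This message does not spell out the actual bit layout, and
the sketch ignores the 64-bit limit of a real d_off.

```python
LEAF_BITS = 8                      # hypothetical width reserved for the leaf id
LEAF_MASK = (1 << LEAF_BITS) - 1


def encode_d_off(brick_offset, leaf_id):
    """Pack a brick-local directory offset and a global leaf id into one value."""
    if not 0 <= leaf_id <= LEAF_MASK:
        raise ValueError("leaf id does not fit in the reserved bits")
    return (brick_offset << LEAF_BITS) | leaf_id


def decode_d_off(d_off):
    """Return (brick_offset, leaf_id) from an encoded value."""
    return d_off >> LEAF_BITS, d_off & LEAF_MASK


def leaf_to_subvolume(leaf_id, leaf_ranges):
    """Map a global leaf id to the index of the local subvolume leading to it.

    leaf_ranges holds one (first_leaf, last_leaf) pair per local subvolume.
    """
    for index, (first, last) in enumerate(leaf_ranges):
        if first <= leaf_id <= last:
            return index
    raise LookupError("leaf id is not below this translator")


d_off = encode_d_off(brick_offset=1234, leaf_id=5)
offset, leaf = decode_d_off(d_off)
print(offset, leaf, leaf_to_subvolume(leaf, [(0, 3), (4, 7)]))
```

Because the leaf id is a position in the whole graph, two stacked
translators can decode the same value and each map it to its own local
subvolume.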
Change-Id: Iaf0ea86d7046b1ceadbad69d88707b243077ebc8
BUG: 1190734
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/9688
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
| |
A new global CLI option has been introduced for NFS-Ganesha.
gluster features.ganesha enable/disable.
This option is persistent and shall be inherited
by new volumes created after this option is set.
gluster features.ganesha enable
It carries out the following functions:
1. Disables gluster-nfs across the cluster
2. Starts NFS-Ganesha server on a subset of nodes and exports '/'.
3. Creates the HA cluster for NFS-Ganesha.
4. Writes the option into the global config file.
gluster features.ganesha disable
1. Stops NFS-Ganesha server.
2. Tears down the HA cluster for NFS-Ganesha
With this change the older volume set
options with keys "nfs-ganesha.host"
and "nfs-ganesha.enable" will no longer
be supported. This commit has only the
CLI related changes. Another patch will
be submitted to support this feature entirely.
Change-Id: Ie4b66a16c23b33b795738654b9a68f8e2c34efe3
BUG: 1188184
Signed-off-by: Meghana Madhusudhan <mmadhusu@redhat.com>
Reviewed-on: http://review.gluster.org/9538
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
| |
Change-Id: I628fbd99c2478fcb8bb6e5be55e43467f25227bf
BUG: 1165870
Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com>
Reviewed-on: http://review.gluster.org/9879
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Lalatendu Mohanty <lmohanty@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
| |
tests/basic/mount-nfs-auth.t hardcoded /var/lib/glusterd/nfs/
as the NFS state directory, causing failures if glusterfs was
configured with state in another location.
Fix this by obtaining the directory through a gluster volume get
command. The nfs.mount-rmtab key gives us a file inside the
directory we are looking for.
This fixes tests/basic/mount-nfs-auth.t regression on NetBSD.
BUG: 1129939
Change-Id: I19184859c03faf5b9aeb95d080cf90fa581be380
Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-on: http://review.gluster.org/9896
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
| |
Change-Id: I12cb0cdacace755d6c545c8697a4e18d6035954b
BUG: 1075417
Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com>
Reviewed-on: http://review.gluster.org/9869
Reviewed-by: Lalatendu Mohanty <lmohanty@redhat.com>
Tested-by: Lalatendu Mohanty <lmohanty@redhat.com>
| |
When run as root, BSD ls(1) lists dot-files, which includes
.glusterfs in split-brain-healing.t's usage. This leads to failure.
gfid-self-heal.t suffers the same problem.
Fix by filtering out dot-files in ls(1) output
NB: split-brain-healing.t also requires http://review.gluster.org/9831
to pass on NetBSD.
BUG: 1129939
Change-Id: Ic572d3abf685e9b43f32ddee8a13b5f5c4ae641f
Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-on: http://review.gluster.org/9885
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
| |
The following options for the Gluster/NFS server are added :
- nfs.exports-auth-enable
- nfs.auth-refresh-interval-sec
- nfs.auth-cache-ttl-sec
BUG: 1143880
Change-Id: I37a73966c4ed27cd0f8c77200ef68a0d12b385b8
Original-author: Shreyas Siravara <shreyas.siravara@gmail.com>
CC: Richard Wareing <rwareing@fb.com>
CC: Jiffin Tony Thottan <jthottan@redhat.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9364
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
| |
Change-Id: I759f05d63028d6a52c3e1ee2ab9892583c4832e7
BUG: 1193893
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/9800
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
| |
In glusterd_peerinfo_destroy, cast the passed 'struct rcu_head *'
pointer to 'gd_rcu_head *' before use in caa_container_of() to prevent
the incompatible-pointer compilation warning.
Also, refactor peerinfo->head to peerinfo->rcu_head to reduce confusion
when reading code.
This change was developed on the git branch at [1]. This commit is a
combination of the following commits on the development branch.
aa4a0bc Rename peerinfo->head to peerinfo->rcu_head
c79144b Cast struct rcu_head * to gd_rcu_head * to prevent warning
1d222c3 More head -> rcu_head renames
[1]: https://github.com/kshlm/glusterfs/tree/urcu
BUG: 1191030
Change-Id: I7ede02090413839563ce44fdf6289697b28777e7
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/9922
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
| |
geo-replication/src/peer_mountbroker
BUG: 1136312
Change-Id: Ib9b287b4e1183cb44acbf01184a240be7f09be7c
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/9923
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
| |
Documentation is available in patch:
http://review.gluster.org/#/c/9800/
A tool which helps to get list of modified files or list of all files in
GlusterFS Volume using Changelog or find command.
Usage
=====
glusterfind --help
Create:
-------
glusterfind create --help
The tool creates status file $GLUSTERD_WORKDIR/SESSION/VOLUME/status
and records current timestamp to initiate the session. This timestamp
will be used as start time for next runs.
As part of create, the tool also generates an ssh key, distributes it to
all peers, and enables build.pgfid and changelog using the volume set
command.
Pre:
----
glusterfind pre --help
This command is used to generate the list of files modified after session
creation time or after last run. To get list of all files/dirs in Volume,
run pre command with `--full` argument.
The tool gets the details of all nodes using gluster volume info and runs
a node agent for each brick on the respective nodes via ssh. Once these
node agents generate their output files, the tool copies them to the local
node using scp and merges them to generate the final output file.
Post:
-----
glusterfind post --help
After consuming the list, this sub command is called to update the session
time based on pre command status file.
List:
-----
glusterfind list --help
To view all the sessions
Delete:
-------
glusterfind delete --help
Delete session.
Known Issues
------------
1. Deleted files will not get listed, since we can't convert GFID to
Path if file/dir is deleted.
2. Only the new name will get listed if a file is renamed.
3. All hardlinks will get listed.
Change-Id: I82991feb0aea85cb6ec035fddbf80a2b276e86b0
BUG: 1193893
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/9682
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Prashanth Pai <ppai@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
| |
Having this particular check which was introduced by
commit c78998c39f0857ea7aacba360632c148afc54a55 causes a drop in
performance in readdirp. So the behavior is made configurable with this
patch.
Change-Id: I2858fc18b3539df7aa6d3f489e0d5cfaeb8a9b3c
BUG: 1202669
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/9917
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| |
A dummy translator has been introduced as a place
holder for functions related to managing NFS-Ganesha
exports. A volume set option is introduced to
manage volume level exports.
gluster vol set <volname> ganesha.enable ON/OFF
1. gluster volume set <volname> ganesha.enable ON
It creates the export config file with a unique export ID.
Sends a DBus signal to export this volume dynamically.
2. gluster vol set <volname> ganesha.enable OFF
Unexports the specific volume. Deletes the specific
config file related to the volume.
This change also removes the handling of the older
keys "nfs-ganesha.enable" and "nfs-ganesha.host"
Change-Id: I8d4a0b542326a6a0c8e4711600b106274d666587
BUG: 1188184
Signed-off-by: Meghana Madhusudhan <mmadhusu@redhat.com>
Reviewed-on: http://review.gluster.org/9585
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
| |
snapshot clone will allow us to take a snapshot of a snapshot.
Newly created clone volume will be a regular volume with read/write
permissions.
CLI command
snapshot clone <clonename> <snapname>
Change-Id: Icadb993fa42fff787a330f8f49452da54e9db7de
BUG: 1199894
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/9750
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
| |
The crash is seen when glfs_init fails for some reason
and glfs_fini is called to clean up the partial initialization.
The fix is twofold:
1. In timer, store and restore THIS; previously
it was being overwritten.
2. In glfs_free_from_ctx() and glfs_fini() check for
NULL before destroying.
Change-Id: If40bf69936b873a1da8e348c9d92c66f2f07994b
BUG: 1202290
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/9895
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
| |
Even when trash translator is disabled, the following error
is logged for each unlink/truncate/ftruncate call.
[...] E [trash.c:221:trash_local_wipe] (--> ...
... ) 0-trash: invalid argument: local
This change replaces GF_VALIDATE_OR_GOTO macro with simple
if condition.
Change-Id: I7e6754cd53ec7c2d84669b6d40d883a2d1eee41e
BUG: 1132465
Signed-off-by: Anoop C S <achiraya@redhat.com>
Reviewed-on: http://review.gluster.org/9909
Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
| |
Currently, the only way to retrieve the number of files/objects in a
directory or volume is to do a crawl of the entire directory/volume.
This is expensive and is not scalable.
The new mechanism proposes to store count of objects/files as part of
an extended attribute of a directory. Each directory's extended
attribute value will indicate the number of files/objects present
in a tree with the directory being considered as the root of the tree.
Currently file usage is accounted in marker by doing multiple FOPs
like setting and getting xattrs. Doing this with STACK WIND and
UNWIND can be harder to debug as it involves multiple callbacks.
In this code we are replacing the current mechanism with a syncop
approach, as syncop code is much simpler to follow and helps us implement
inode quota in an organized way.
Change-Id: Ibf366fbe07037284e89a241ddaff7750fc8771b4
BUG: 1188636
Signed-off-by: vmallika <vmallika@redhat.com>
Signed-off-by: Sachin Pandit <spandit@redhat.com>
Reviewed-on: http://review.gluster.org/9567
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
| |
10.TEST kill_brick $V0 $H0 $B0/${V0}1
11.-EXPECT '1' echo `pgrep glusterfsd | wc -l`
Problem:
On my Fedora 21 laptop, #11 always fails: "not ok 11 Got "2" instead of "1""
On debugging, I found that after killing, the kernel takes some time to
clean up the process until which it appears as defunct in the pgrep
output:
root 21795 2.0 0.0 0 0 ? Zsl 11:57 0:00 [glusterfsd] <defunct>
Fix:
As long as TEST kill_brick is successful, we really don't need to double
check with the pgrep output. Hence removing that line.
Change-Id: Ia10e0a04803e54a074f73da6523fa6a98c677d58
BUG: 1163543
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/9904
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
| |
In case of any upcall cbk events received by the protocol/client,
gfapi will be notified which queues them up in a list (<gfapi_cbk_upcall>).
Applications are responsible for providing APIs to process & notify them in
case of any such upcall events queued.
Added a new API which will be used by Ganesha to repeatedly poll for any
such upcall event notified (<glfs_h_poll_upcall>).
A new test-file has been added to test the cache_invalidation upcall events.
Below link has a writeup which explains the code changes done -
URL: https://soumyakoduri.wordpress.com/2015/02/25/glusterfs-understanding-upcall-infrastructure-and-cache-invalidation-support/
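The queue-and-poll pattern described above looks roughly like this in
Python. It is a sketch of the idea only: the class, its methods and the
event dictionary are hypothetical and do not mirror the signature of
glfs_h_poll_upcall or the layout of the gfapi list.

```python
from collections import deque
from threading import Lock


class UpcallQueue:
    """Hypothetical stand-in for the list of pending upcall events."""

    def __init__(self):
        self._events = deque()
        self._lock = Lock()

    def notify(self, event):
        # Called when the client translator reports an upcall.
        with self._lock:
            self._events.append(event)

    def poll(self):
        # Non-blocking: return the oldest event, or None if nothing is queued.
        with self._lock:
            return self._events.popleft() if self._events else None


queue = UpcallQueue()
queue.notify({"type": "cache_invalidation", "gfid": "gfid-1"})

# The application drains the queue by polling until it comes back empty.
handled = []
while (event := queue.poll()) is not None:
    handled.append(event["gfid"])
print(handled)
```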
Change-Id: Iafc6880000c865fd4da22d0cfc388ec135b5a1c5
BUG: 1200262
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Reviewed-on: http://review.gluster.org/9536
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
| |
Resubmitting after a gerrit bug bungled the merge of
http://review.gluster.org/9621 (was it really a gerrit bug?)
Scripts related to NFS-Ganesha are in extras/ganesha/scripts.
Config files are in extras/ganesha/config.
Resource Agent files are in extras/ganesha/ocf
Files are copied to appropriate locations.
Change-Id: I137169f4d653ee2b7d6df14d41e2babd0ae8d10c
BUG: 1188184
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/9912
Tested-by: Gluster Build System <jenkins@build.gluster.com>
| |
Problem:
During pre-op phase, the index xlator
1. Creates the entry inside .glusterfs/indices/xattrop
2. Winds the xattrop fop to posix to mark dirty/pending changelogs.
If the brick crashes after 1, the xattrop entry becomes stale and never
gets removed by shd during subsequent crawls because there is nothing to
heal (changelogs are zero).
Though the stale entry does not get displayed in the output of 'heal
info' command, it nevertheless stays there forever unless a new write
transaction is performed on the file.
Fix:
During index self-heal, if afr xattrs are found to be clean (indicated by
a ret value of 2 on a call to afr_shd_selfheal()), send a dummy
post-op with all 0s for the xattr values, which makes the index xlator
unlink the stale entry.
Change-Id: I02cb2bc937f2e3f3f3cb35d67b006664dc7ef919
BUG: 1190069
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/9714
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anuradha Talur <atalur@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| |
Several features - e.g. encryption, erasure codes, or NSR - involve
multiple cooperating translators which sometimes need a "private" means
of communication amongst themselves. Historically we've used virtual or
synthetic xattrs, but that's not very elegant and clutters up the
getxattr/setxattr path which must also handle real xattr requests. This
new fop should address that.
The only argument is an int32_t "op" which should be recognized by the
target translator. It is recommended that translators using this
feature follow some convention regarding the ops that they define, to
avoid conflicts. Using a hash of the target translator's type string as
a base for a series of ops would probably be a good start. Any other
information can be passed in both directions using xdata.
The default behavior for this fop, as with any other, is to pass through
to FIRST_CHILD. That makes use of this fop "transparent" to other
translators that were written before it existed, but it also means that
it only really works with pass-through translators. If a routing
translator (such as DHT) or a fan-out translator (such as AFR) is
involved, the IPC might not reach its intended destination unless those
translators are modified to forward IPC fops along all paths.
If an IPC gets all the way to storage/posix it is considered an error,
much like an uncaught exception. We don't actually *do* anything in
that case, but we do log it and send back an EOPNOTSUPP error. This makes
the "unrecognized opcode" condition distinguishable from the "no IPC
support" condition (which would yield an RPC error instead) so clients
can probe for the presence of a handler for their own favorite opcode
and either use that or use old-school xattrs depending on the result.
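The pass-through and probing behavior can be modeled in a few lines of
Python. This is a toy model, not the C fop: the classes, the translator
type string "features/example" and the CRC32-based base are all
illustrative assumptions (the message only suggests "a hash" as a
starting point).

```python
import errno
import zlib


def op_base(translator_type):
    """Derive a base opcode from the translator type string (one possible scheme).

    The upper bits come from a CRC32 of the name, masked to stay positive in
    an int32_t; the low 16 bits are left free for that translator's own ops.
    """
    return zlib.crc32(translator_type.encode()) & 0x7FFF0000


class Translator:
    def __init__(self, child=None):
        self.child = child

    def ipc(self, op, xdata):
        # Default behavior: pass through to the first (only) child.
        return self.child.ipc(op, xdata)


class Posix(Translator):
    def ipc(self, op, xdata):
        # Nobody above handled the op: report it as unsupported.
        return -errno.EOPNOTSUPP, {}


class Handler(Translator):
    OP_PING = op_base("features/example") + 1   # hypothetical translator type

    def ipc(self, op, xdata):
        if op == self.OP_PING:
            return 0, {"reply": "pong"}
        return super().ipc(op, xdata)


graph = Translator(Handler(Posix()))
print(graph.ipc(Handler.OP_PING, {}))
print(graph.ipc(Handler.OP_PING + 1, {})[0] == -errno.EOPNOTSUPP)
```

A caller can probe with its opcode first and fall back to xattrs when
the unsupported error comes back, as the last paragraph describes.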
BUG: 1158628
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Change-Id: I84af1b17babe5b30ec03ecf027ae37d09b873968
Reviewed-on: http://review.gluster.org/8812
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
Use the network.ping-timeout to set the TCP_USER_TIMEOUT socket option
(see 'man 7 tcp'). The option sets the transport.tcp-user-timeout option
that is handled in the rpc/socket layer on the protocol/server side.
This socket option makes detecting uncleanly disconnected clients more
reliable.
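For illustration, this is how the socket option is set from Python on Linux. It is a sketch only: the helper names are made up, and it assumes the volume option is expressed in seconds, while TCP_USER_TIMEOUT takes milliseconds (see 'man 7 tcp').

```python
import socket

# Linux-only option; fall back to its numeric value from <linux/tcp.h>
TCP_USER_TIMEOUT = getattr(socket, "TCP_USER_TIMEOUT", 18)


def ping_timeout_to_ms(ping_timeout_s: int) -> int:
    """Convert a timeout in seconds (assumed unit) to milliseconds."""
    return ping_timeout_s * 1000


def set_user_timeout(sock: socket.socket, ping_timeout_s: int) -> int:
    """Apply the timeout to a TCP socket and return the value that was set."""
    ms = ping_timeout_to_ms(ping_timeout_s)
    sock.setsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT, ms)
    return ms
```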
When the socket gets closed, any locks that the client held are
released. This makes it possible to reduce the fail-over time for
applications that run on systems that became unreachable due to
a network partition or general system error client-side (kernel panic,
hang, ...).
It is not trivial to create a test-case for this at the moment. We need
a client that disconnects uncleanly and another client that tries to take
over the lock from the disconnected client.
URL: http://supercolony.gluster.org/pipermail/gluster-devel/2014-May/040755.html
Change-Id: I5e5f540a49abfb5f398291f1818583a63a5f4bb4
BUG: 1129787
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/8065
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: soumya k <skoduri@redhat.com>
Reviewed-by: Santosh Pradhan <santosh.pradhan@gmail.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
Framework on the server-side, to handle certain state of the files
accessed and send notifications to the clients connected.
A generic and extensible framework, used to maintain states in
the glusterfsd process for each of the files accessed
(including the clients info doing the fops) and send
notifications to the respective glusterfs clients in case of
any change in that state.
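The idea can be pictured with a small Python model. This is not the upcall translator's data structure; the class and method names are hypothetical and the model only shows "remember who accessed a file, notify the others when it changes".

```python
from collections import defaultdict


class UpcallState:
    """Track which clients accessed which file and whom to notify."""

    def __init__(self):
        self._clients = defaultdict(set)  # gfid -> ids of clients that used it
        self.sent = []                    # (client id, gfid, event) notifications

    def record_access(self, gfid, client):
        """Remember that a client performed a fop on the file."""
        self._clients[gfid].add(client)

    def on_change(self, gfid, origin_client, event="invalidate"):
        """Notify every other client that accessed the file."""
        self.record_access(gfid, origin_client)
        for client in sorted(self._clients[gfid] - {origin_client}):
            self.sent.append((client, gfid, event))
```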
This patch handles "Inode Update/Invalidation" upcall event.
Feature page:
URL: http://www.gluster.org/community/documentation/index.php/Features/Upcall-infrastructure
Below link has a writeup which explains the code changes done -
URL: https://soumyakoduri.wordpress.com/2015/02/25/glusterfs-understanding-upcall-infrastructure-and-cache-invalidation-support/
Change-Id: Ie3d724be9a3419fcf18901a753e8ec2df2ac802f
BUG: 1200262
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Reviewed-on: http://review.gluster.org/9535
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
Non-root geo-replication setup is now simplified. This
patch provides a CLI for mountbroker user and option management.
To set Options,
gluster system:: execute mountbroker opt <KEY> <VALUE>
# for example,
gluster system:: execute mountbroker opt mountbroker-root /var/mountbroker-root
gluster system:: execute mountbroker opt geo-replication-log-group geogroup
gluster system:: execute mountbroker opt rpc-auth-allow-insecure on
To remove option,
gluster system:: execute mountbroker optdel <KEY>
# for example,
gluster system:: execute mountbroker optdel geo-replication-log-group
To add/edit user,
gluster system:: execute mountbroker user <USERNAME> <VOLUMES>
# for example
gluster system:: execute mountbroker user geoaccount slavevol1,slavevol2
To remove user,
gluster system:: execute mountbroker userdel <USERNAME>
# for example
gluster system:: execute mountbroker userdel geoaccount
For info,
gluster system:: execute mountbroker info
gluster system:: execute mountbroker -j info
For JSON output add -j after mountbroker, for example,
gluster system:: execute mountbroker -j user geoaccount slavevol1,slavevol2
PS: Each peer prints its own JSON output, aggregator required from consumer side
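A consumer-side aggregator might start from something like the Python sketch below. It assumes the per-peer outputs arrive as JSON documents written one after another on the same stream; the actual shape of the output is not described here, so treat the function name and the sample input as illustrative only.

```python
import json


def split_json_documents(text):
    """Parse several JSON documents written back to back in one stream."""
    decoder = json.JSONDecoder()
    docs, pos = [], 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        doc, pos = decoder.raw_decode(text, pos)
        docs.append(doc)
    return docs
```

The returned list can then be merged or checked per peer by the caller.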
BUG: 1136312
Change-Id: Ie52210c0bcc91ac2ffd3ba58988222ffca62b47f
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/9398
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: darshan n <dnarayan@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
Problem:
CC libglusterfs_la-inode.lo
inode.c: In function 'inode_table_destroy':
inode.c:1630:19: warning: variable 'this' set
but not used [-Wunused-but-set-variable]
xlator_t *this = NULL;
Change-Id: If4b37ab896ee0a309826d4be48c6599d6ec2710b
Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com>
Reviewed-on: http://review.gluster.org/9846
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anoop C S <achiraya@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Poornima G <pgurusid@redhat.com>
|
Consider the scenario below in the quota xlator:
T1 - write with delta1 bytes on fd1
check_limit sees that delta1 bytes is not exceeding soft limit
T2 - write with delta2 bytes on fd1
check_limit sees that delta2 bytes is not exceeding soft limit
T3 - delta1 and delta2 bytes are written to the disk.
Here delta1 and delta2 are checked separately and do not exceed
limit, but they together exceed the limit which is not checked.
We need to find a solution to this problem. Until then, for
other regressions to pass, we remove the test which checks for
the soft limit being crossed.
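The T1/T2/T3 interleaving can be replayed with a toy model in Python. QuotaDir and replay_scenario are hypothetical names and the numbers in the usage note are made up; the model only shows why two checks that pass individually can still overshoot the limit together.

```python
class QuotaDir:
    """Toy accounting for one directory with a soft limit."""

    def __init__(self, used, soft_limit):
        self.used = used
        self.soft_limit = soft_limit

    def check_limit(self, delta):
        """Check one write in isolation against the current usage."""
        return self.used + delta <= self.soft_limit

    def commit(self, delta):
        """Account the bytes once they reach the disk."""
        self.used += delta


def replay_scenario(used, soft_limit, delta1, delta2):
    """Return (check1 passed, check2 passed, limit exceeded afterwards)."""
    d = QuotaDir(used, soft_limit)
    ok1 = d.check_limit(delta1)  # T1
    ok2 = d.check_limit(delta2)  # T2: usage has not been updated yet
    d.commit(delta1)             # T3: both writes land
    d.commit(delta2)
    return ok1, ok2, d.used > d.soft_limit
```

With 80 bytes used, a soft limit of 100 and two 15-byte writes, both checks pass and the final usage of 110 is over the limit.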
Change-Id: I8f76754e975c3315557a4c570db8bb5d9e56de15
BUG: 1202292
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/9894
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
When the test systems get into a memory pressure state (the Jenkins VMs
do not have much RAM), the localhost NFS-mount can get hung. It is
possible to prevent this by writing with O_DIRECT. Unfortunately, the
'dd' command on NetBSD does not seem to support such an option.
The alternative is to reduce the I/O that can get cached on the
NFS-client, like reducing the "count" option for "dd".
Change-Id: I1da9cb41133bb934bcbae0a6bc091f798514ed3d
BUG: 1163543
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9883
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
This is the combined patch set for supporting trash feature.
http://www.gluster.org/community/documentation/index.php/Features/Trash
Current patch includes the following features:
* volume set options for enabling trash globally and
exclusively for internal operations like self-heal
and re-balance
* volume set options for setting the eliminate
path, trash directory path and maximum trashable
file size.
* test script for checking the functionality of the
feature
* brief documentation on different aspects of trash
feature.
Change-Id: Ic7486982dcd6e295d1eba0f4d5ee6d33bf1b4cb3
BUG: 1132465
Signed-off-by: Anoop C S <achiraya@redhat.com>
Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
Reviewed-on: http://review.gluster.org/8312
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
Change-Id: I0bfa44eb5b5f21e381af3e71c26ea863e4adc46f
BUG: 1202274
Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com>
Reviewed-on: http://review.gluster.org/9878
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
Currently glfs_new_from_ctx() does not initialize the child_down_count
condition variable, but glfs_free_from_ctx() destroys it.
mnt3udp_get_export_subdir_inode() from mount3 calls
glfs_free_from_ctx(), so this is bound to invite problems. This patch
avoids the issue.
Change-Id: I8c1ed83f0b39248edbb78db25c9434274b538e80
BUG: 1200879
Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com>
Reviewed-on: http://review.gluster.org/9857
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
The peer list and the peerinfo objects are now protected using RCU.
Design patterns described in Paul McKenney's RCU dissertation [1]
(sections 5 and 6) have been used to convert existing non-RCU protected
code to RCU protected code.
Currently, we are only targeting guaranteeing the existence of the
peerinfo objects, ie., we are only looking to protect deletes, not all
updaters. We chose this, as protecting all updates is a much more
complex task.
The steps used to accomplish this are,
1. Remove all long lived direct references to peerinfo objects (apart
from the peerinfo list). This includes references in glusterd_peerctx_t
(RPC), glusterd_friend_sm_event_t (friend state machine) and others.
This way no one has a reference to deleted peerinfo object.
2. Replace the direct references with indirect references, ie., use
peer uuid and peer hostname as indirect references to the peerinfo
object. Any reader or updater now uses the indirect references to get to
the actual peerinfo object, using glusterd_peerinfo_find. Cases where a
peerinfo cannot be found are handled gracefully.
3. The readers get and use the peerinfo object only within a RCU read
critical section. This prevents the object from being deleted/freed when
in actual use.
4. The deletion of a peerinfo object is done in an ordered manner
(glusterd_peerinfo_destroy). The object is first removed from the
peerinfo list using an atomic list remove, but the list head is not
reset to allow existing list readers to complete correctly. We wait for
readers to complete, before resetting the list head. This removes the
object from the list completely. After this no new readers can get a
reference to the object, and it can be freed.
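Steps 1 to 3 boil down to "store an identifier, look the object up when needed, and cope with it being gone". The Python sketch below illustrates only that indirection; a plain lock stands in for the RCU read-side critical section, and PeerTable, with_peer and handle_event are hypothetical names, not glusterd functions.

```python
import threading


class PeerTable:
    """Hypothetical peer list; a lock stands in for the RCU read section."""

    def __init__(self):
        self._lock = threading.Lock()
        self._peers = {}  # uuid -> peer record

    def add(self, uuid, hostname):
        with self._lock:
            self._peers[uuid] = {"uuid": uuid, "hostname": hostname}

    def remove(self, uuid):
        with self._lock:
            self._peers.pop(uuid, None)

    def with_peer(self, uuid, fn):
        """Look the peer up and use it only while the section is held."""
        with self._lock:
            peer = self._peers.get(uuid)
            if peer is None:
                return None  # peer was deleted: caller handles it gracefully
            return fn(peer)


def handle_event(table, event):
    """The event carries the peer uuid, never the peer object itself."""
    host = table.with_peer(event["peer_uuid"], lambda p: p["hostname"])
    return "skipped" if host is None else "handled:" + host
```

Unlike real RCU, the lock here also blocks readers against each other; the sketch is about the reference pattern, not about the concurrency properties.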
This change was developed on the git branch at [2]. This commit is a
combination of the following commits on the development branch.
d7999b9 Protect the glusterd_conf_t->peers_list with RCU.
0da85c4 Synchronize before INITing peerinfo list head after removing
from list.
32ec28a Add missing rcu_read_unlock
8fed0b8 Correctly exit read critical section once peer is found.
63db857 Free peerctx only on rpc destruction
56eff26 Cleanup style issues
e5f38b0 Indirection for events and friend_sm
3c84ac4 In __glusterd_probe_cbk goto unlock only if peer already
exists
141d855 Address review comments on 9695/1
aaeefed Protection during peer updates
6eda33d Revert "Synchronize before INITing peerinfo list head after
removing from list."
f69db96 Remove unneeded line
b43d2ec Address review comments on 9695/4
7781921 Address review comments on 9695/5
eb6467b Add some missing semi-colons
328a47f Remove synchronize_rcu from
glusterd_friend_sm_transition_state
186e429 Run part of glusterd_friend_remove in critical section
55c0a2e Fix gluster (peer status/ pool list) with no peers
93f8dcf Use call_rcu to free peerinfo
c36178c Introduce composite struct, gd_rcu_head
[1]: http://www.rdrop.com/~paulmck/RCU/RCUdissertation.2004.07.14e1.pdf
[2]: https://github.com/kshlm/glusterfs/tree/urcu
Change-Id: Ic1480e59c86d41d25a6a3d159aa3e11fbb3cbc7b
BUG: 1191030
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/9695
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: Anand Nekkunti <anekkunt@redhat.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
|
Use register time(xsync_upper_limit) only for stime update, do not
use for change detection.
Problem 1:
If a file is created before geo-rep, the xtime xattr does not exist.
Geo-rep updates the xtime of the file to the current time if it does not exist.
xtime > upper_limit so geo-rep will not pick those files. Changelog
either will have SETXATTR, and fails to sync the file.
Problem 2:
If a file is created before geo-rep create and updated after
geo-rep start. xtime of the file is greater than upper limit(geo-rep
start time/changelog register time). Geo-rep(XSync) will not pick this
file for syncing. Changelog will have only DATA recorded for that file.
Geo-rep tries DATA without any ENTRY ops and fails with rsync error.
BUG: 1200733
Change-Id: Ie4e8f284db689d2c755ef8e7ecbb658db1c0785f
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/9855
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Tested-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
CURRENT DESIGN AND ITS LIMITATIONS:
-----------------------------------
Geo-replication syncs changes across geography using changelogs captured
by changelog translator. Changelog translator sits on server side just
above posix translator. Hence, in distributed replicated setup, both
replica pairs collect changelogs w.r.t their bricks. Geo-replication
syncs the changes using only one brick among the replica pair at a time,
calling it "ACTIVE" and the other, non-syncing brick "PASSIVE".
Let's consider below example of distributed replicated setup where
NODE-1 has b1 and its replicated brick b1r is in NODE-2
NODE-1 NODE-2
b1 b1r
At the beginning, geo-replication chooses to sync changes from NODE-1:b1
and NODE-2:b1r will be "PASSIVE". The logic depends on virtual getxattr
'trusted.glusterfs.node-uuid' which always returns first up subvolume
i.e., NODE-1. When NODE-1 goes down, the above xattr returns NODE-2 and
that is made 'ACTIVE'. But when NODE-1 comes back again, the above xattr
returns NODE-1 and it is made 'ACTIVE' again. So for a brief interval of
time, if NODE-2 had not finished processing the changelog, both NODE-2
and NODE-1 will be ACTIVE causing rename race as mentioned in the bug.
SOLUTION:
---------
1. Have a shared replicated storage, a glusterfs management volume specific
to geo-replication.
2. Geo-rep creates a file per replica set on management volume.
3. fcntl lock on the above said file is used for synchronization
between geo-rep workers belonging to same replica set.
4. If management volume is not configured, geo-replication will fall back
to previous logic of using first up sub volume.
Each worker tries to lock the file on shared storage; whoever wins will
be ACTIVE. With this, we are able to solve the problem, but there is an
issue when the shared replicated storage goes down (when all replicas
go down). In that case, the lock state is lost. So AFR needs to rebuild the
lock state after the brick comes up.
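The election itself can be sketched in a few lines of Python using a non-blocking fcntl lock. The function name and the lock file path are hypothetical; how geo-rep names the per-replica-set file on the management volume is not shown in this message.

```python
import fcntl
import os


def try_become_active(lock_path):
    """Return (role, fd); keep fd open for as long as the role is held."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Non-blocking exclusive fcntl lock: only one process can hold it
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        return "PASSIVE", None
    return "ACTIVE", fd
```

The lock is dropped when the holder closes the descriptor or exits, which is what lets a PASSIVE worker take over on a later attempt.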
NOTE:
-----
This patch brings in the prerequisite step of setting up a management volume
for geo-replication during creation.
1. Create mgmt-vol for geo-replication and start it. Management volume should
be part of master cluster and recommended to be three way replicated
volume having each brick in different nodes for availability.
2. Create geo-rep session.
3. Configure mgmt-vol created with geo-replication session as follows.
gluster vol geo-rep <mastervol> slavenode::<slavevol> config meta_volume \
<meta-vol-name>
4. Start geo-rep session.
Backward Compatibility:
-----------------------
If management volume is not configured, it falls back to previous logic of
using node-uuid virtual xattr. But it is not recommended.
Change-Id: I7319d2289516f534b69edd00c9d0db5a3725661a
BUG: 1196632
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/9759
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
The tool finds the missing files in a geo-replication slave volume.
The tool crawls backend .glusterfs of the brickpath, which is passed
as a parameter and stats each entry on slave volume mount to check
the presence of file. The mount used is aux-gfid-mount, hence no path
conversion is required and is fast. The tool needs to be run on every
node in cluster for each brickpath of geo-rep master volume to find
missing files on slave volume. The tool is generic enough and can be
used in non geo-replication context as well.
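The crawl can be approximated in Python as below. This is not the tool's code: it assumes that gfid entries are the UUID-named files and symlinks under the brick's .glusterfs directory, and that an aux-gfid-mount exposes each file as <mount>/.gfid/<gfid>. Both function names are made up.

```python
import os
import uuid


def iter_backend_gfids(brick_path):
    """Yield gfid strings found under <brick>/.glusterfs."""
    root = os.path.join(brick_path, ".glusterfs")
    for _dirpath, dirs, files in os.walk(root):
        # symlinks to directories show up in dirs, regular entries in files
        for name in dirs + files:
            try:
                if str(uuid.UUID(name)) != name:
                    continue
            except ValueError:
                continue  # not a gfid entry
            yield name


def find_missing(brick_path, slave_mount):
    """Return the gfids present on the brick but absent on the slave mount."""
    missing = []
    for gfid in iter_backend_gfids(brick_path):
        # assumed aux-gfid-mount convention: <mount>/.gfid/<gfid>
        if not os.path.lexists(os.path.join(slave_mount, ".gfid", gfid)):
            missing.append(gfid)
    return sorted(missing)
```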
Most of the crawler code is leveraged from Avati's xfind and is modified
to crawl only .glusterfs (https://github.com/avati/xsync)
Thanks Aravinda for scripts to convert gfid to path.
Change-Id: I84deaaaf638f7c571ff1319b67a3440fe27da810
BUG: 1187140
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/9503
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
Change-Id: I41acd9970bef04bb16cd4d8532a84a95d5fb642a
BUG: 1199003
Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com>
Reviewed-on: http://review.gluster.org/9810
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
BUG: 1143880
Change-Id: I359470a1edb935e206eeeecd4de7022530fb397a
Reported-by: Vijay Bellur <vbellur@redhat.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9882
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
Change-Id: I2299378f02a5577a8bf2874664ba79e92c3811b5
BUG: 1201621
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/9872
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
CID: 1124496
The pointer was dereferenced without being checked against NULL;
a NULL check is now added.
Change-Id: Ib810546445596671b3656f01a14bbad02cdc221c
BUG: 789278
Signed-off-by: arao <arao@redhat.com>
Reviewed-on: http://review.gluster.org/9640
Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
CID: 1128926
Change-Id: I5ad1229e225a36f995245a847db1a19609a18cd8
BUG: 789278
Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com>
Reviewed-on: http://review.gluster.org/9556
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
'line' is allocated through getline() which uses malloc(). GF_FREE()
will fail to release the memory because it can not find the expected
mem-pool header. Instead of GF_FREE(), free() should be used for strings
that get allocated with getline().
Subsequent calls to getline() with a non-NULL pointer will get the size
of the allocation adjusted with realloc().
Change-Id: I612fbf17d7283174d541da6f34d26e4f44e83bfa
BUG: 1143880
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9860
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|