summaryrefslogtreecommitdiffstats
path: root/xlators
Commit message (Collapse)AuthorAgeFilesLines
* cluster/afr : enable inspection & resolution of files in split-brainAnuradha2015-03-195-33/+336
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Part 2/2 patch to enable users analyze and resolve split-brain. This patch enables : 1) Users to inspect the files in data and metadata split-brain. 2) Resolve the split-brain. Both using a series of setfattr commands. Consider a volume "test" with 2 bricks. 1) To inspect a file f1: setfattr -n replica.split-brain-choice -v test-client-0 f1 After the execution of this command, if no read_subvol is found, reads will be served from test-client-0 (corresponding to brick-0). 2) To resolve split-brain : setfattr -n replica.split-brain-heal-finalize -v test-client-0 f1 Execution of this command will lead to the resolution of data and metadata split-brain with subvol mentioned in the command (test-client-0 here) as the source and the rest as sink. Change-Id: Ia20f3ee5abd3119e3d54fcc599f1e55ac65fd179 BUG: 1191396 Signed-off-by: Anuradha <atalur@redhat.com> Reviewed-on: http://review.gluster.org/9743 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd: CLI commands to create and manage tiered volumes.Dan Lambright2015-03-1910-32/+438
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A tiered volume is a normal volume with some number of new bricks representing "hot" storage. The "hot" bricks can be attached or detached dynamically to a normal volume. When this happens, a new graph is constructed. The root of the new graph is an instance of the tier translator. One subvolume of the tier translator leads to the old volume, and another leads to the new hot bricks. attach-tier <VOLNAME> [<replica> <COUNT>] <NEW-BRICK> ... [force] volume detach-tier <VOLNAME> [replica <COUNT>] <BRICK> ... <start|stop|status|commit|force> gluster volume rebalance <volume> tier start gluster volume rebalance <volume> tier stop gluster volume rebalance <volume> tier status The "tier start" CLI command starts a server side daemon. The daemon initiates file level migration based on caching policies. The daemon's status can be monitored and stopped. Note development on the "tier status" command is incomplete. It will be added in a subsequent patch. When the "hot" storage is detached, the tier translator is removed from the graph and the tiered volume reverts to its original state as described in the volume's info file. For more background and design see the feature page [1]. [1] http://www.gluster.org/community/documentation/index.php/Features/data-classification Change-Id: Ic8042ce37327b850b9e199236e5be3dae95d2472 BUG: 1194753 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/9753 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* Adding ChangeTimeRecorder(CTR) Xlator to GlusterFSJoseph Fernandes2015-03-1912-8/+2102
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ********************************************************************** ChangeTimeRecorder(CTR) Xlator | ********************************************************************** ChangeTimeRecorder(CTR) is server side xlator(translator) which sits just above posix xlator. The main role of this xlator is to record the access/write patterns on a file residing the brick. It records the read(only data) and write(data and metadata) times and also count on how many times a file is read or written. This xlator also captures the hard links to a file(as its required by data tiering to move files). CTR Xlator is the consumer of libgfdb. To Enable/Disable CTR Xlator: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ gluster volume set <volume-name> features.ctr-enabled {on/off} To Enable/Disable Frequency Counter Recording in CTR Xlator: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ gluster volume set <volume-name> features.record-counters {on/off} Change-Id: I5d3cf056af61ac8e3f8250321a27cb240a214ac2 BUG: 1194753 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/9935 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterfsd: add "print-netgroups" and "print-exports" commandNiels de Vos2015-03-183-11/+11
| | | | | | | | | | | | | | | | | | | | | | | NFS now has the ability to use a separate file for "netgroups" and "exports". An administrator should have the ability to check the validity of the files before applying the configuration. The "glusterfsd" command now has the following additional arguments that can be used to check the configuration: --print-netgroups: Validate the netgroups file and print it out --print-exports: Validate the exports file and print it out BUG: 1143880 Change-Id: I24c40d50110d49d8290f9fd916742f7e4d0df85f URL: http://www.gluster.org/community/documentation/index.php/Features/Exports_Netgroups_Authentication Original-author: Shreyas Siravara <shreyas.siravara@gmail.com> CC: Richard Wareing <rwareing@fb.com> CC: Jiffin Tony Thottan <jthottan@redhat.com> Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/9365 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* cli/glusterd: cli command implementation for bitrot featuresGaurav Kumar Garg2015-03-1811-2/+431
| | | | | | | | | | | | | | | | | CLI command for bitrot features. volume bitrot <volname> enable|disable Above command will enable/disable bitrot feature for particular volume. BUG: 1170075 Change-Id: Ie84002ef7f479a285688fdae99c7afa3e91b8b99 Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com> Signed-off-by: Anand nekkunti <anekkunt@redhat.com> Signed-off-by: Dominic P Geevarghese <dgeevarg@redhat.com> Reviewed-on: http://review.gluster.org/9866 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/quota : Introducing inode quotavmallika2015-03-1812-350/+599
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ========================================================================== Inode quota ========================================================================== = Currently, the only way to retrieve the number of files/objects in a = = directory or volume is to do a crawl of the entire directory/volume. = = This is expensive and is not scalable. = = = = The proposed mechanism will provide an easier alternative to determine = = the count of files/objects in a directory or volume. = = = = The new mechanism proposes to store count of objects/files as part of = = an extended attribute of a directory. Each directory's extended = = attribute value will indicate the number of files/objects present = = in a tree with the directory being considered as the root of the tree. = = = = The count value can be accessed by performing a getxattr(). = = Cluster translators like afr, dht and stripe will perform aggregation = = of count values from various bricks when getxattr() happens on the key = = associated with file/object count. = A new interface is introduced: ------------------------------ limit-objects : limit the number of inodes at directory level list-objects : list the directories where the limit is set remove-objects : remove the limit from the directory ========================================================================== CLI COMMAND: gluster volume quota <volname> limit-objects <path> <number> [<percent>] * <number> is a hard-limit for number of objects limitation for path "<path>" If hard-limit is exceeded, creation of file/directory is no longer permitted. * <percent> is a soft-limit for number of objects creation for path "<path>" If soft-limit is exceeded, a warning is issued for each creation. CLI COMMAND: gluster volume quota <volname> remove-objects [path] ========================================================================== CLI COMMAND: gluster volume quota <volname> list-objects [path] ... Sample output: ------------------ Path Hard-limit Soft-limit Used Available Soft-limit exceeded? Hard-limit exceeded? ------------------------------------------------------------------------ -------------------------------------- /dir 10 80% 10 0 Yes Yes ========================================================================== [root@snapshot-28 dir]# ls a b file11 file12 file13 file14 file15 file16 file17 [root@snapshot-28 dir]# touch a1 touch: cannot touch `a1': Disk quota exceeded * Nine files are created in directory "dir" and directory is included in * the count too. Hence the limit "10" is reached and further file creation fails ========================================================================== Note: We have also done some re-factoring in cli for volume name validation. New function cli_validate_volname is created ========================================================================== Change-Id: I1823497de4f790a2a20ebb1770293472ea33ee2b BUG: 1190108 Signed-off-by: Sachin Pandit <spandit@redhat.com> Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/9769 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/changelog: RPC'fy {libgf}changelogVenky Shankar2015-03-1829-1301/+3936
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces RPC based communication between the changelog translator and libgfchangelog. It replaces the old pathetic stream based interaction that existed earlier (due to time constraints :-/). Changelog, upon initialization starts a RPC server (rpcsvc) allowing clients to invoke a probe API as a bootup mechanism to request for event notifications. During probe, clients can choose an event filter specifying the type(s) of events they are interested in. As of now there is no way to change the event notification set once the probe RPC call is made, but that is easier to implement. The actual event notifications is done on a separate RPC session. The client (libgfchangelog) itself starts and RPC server which the changelog translator "connects back" during probe. Notifications are dispatched by a bunch of threads from the server (translator) and the client optionally orders them if ordered notifications are requried. FOPs fill in their respective event details in a buffer (rot-buffs to be particular) and a bunch of threads (consumers) swap the buffers out of roatation and dispatch them via RPC. To avoid writer starvation, then number of dispatcher threads is one less than the number of buffer list in rot-buffs.x libgfchangelog becomes purely callback based -- upon event notification from the server (and re-ordering them if required) invoke a callback routine specified by consumer(s). A major part of the patch is also aimed at providing backward compatibility for geo-replication, which was one of the main consumer of the stream based API. Also, this patch does not\ "turn on" event notifications for all fops, just a bunch which is currently in requirement. Another pain point is that the server does not filter events before dispatching it to the clients. That load is taken up by the client itself (although it's done at the library layer rather than making it hard on the callback implementor). This needs improvement and care needs to be taken to not load the server up with expensive filtering mechanisms. Change-Id: Ibf60a432b68f2dfa60c6f9add2bcfd37a9c41395 BUG: 1170075 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/9708 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Adding Libgfdb to GlusterFSJoseph Fernandes2015-03-181-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ************************************************************************* Libgfdb | ************************************************************************* Libgfdb provides abstract mechanism to record extra/rich metadata required for data maintenance, such as data tiering/classification. It provides consumer with API for recording and querying, keeping the consumer abstracted from the data store used beneath for storing data. It works in a plug-and-play model, where data stores can be plugged-in. Presently we have plugin for Sqlite3. In the future will provide recording and querying performance optimizer. In the current implementation the schema of metadata is fixed. Schema: ~~~~~~ GF_FILE_TB Table: ~~~~~~~~~~~~~~~~~ This table has one entry per file inode. It holds the metadata required to make decisions in data maintenance. GF_ID (Primary key) : File GFID (Universal Unique IDentifier in the namespace) W_SEC, W_MSEC : Write wind time in sec & micro-sec UW_SEC, UW_MSEC : Write un-wind time in sec & micro-sec W_READ_SEC, W_READ_MSEC : Read wind time in sec & micro-sec UW_READ_SEC, UW_READ_MSEC : Read un-wind time in sec & micro-sec WRITE_FREQ_CNTR INTEGER : Write Frequency Counter READ_FREQ_CNTR INTEGER : Read Frequency Counter GF_FLINK_TABLE: ~~~~~~~~~~~~~~ This table has all the hardlinks to a file inode. GF_ID : File GFID (Composite Primary Key)``| GF_PID : Parent Directory GFID (Composite Primary Key) |-> Primary Key FNAME : File Base Name (Composite Primary Key)__| FPATH : File Full Path (Its redundant for now, this will go) W_DEL_FLAG : This Flag is used for crash consistancy, when a link is unlinked. i.e Set to 1 during unlink wind and during unwind this record is deleted LINK_UPDATE : This Flag is used when a link is changed i.e rename. Set to 1 when rename wind and set to 0 in rename unwind Libgfdb API: ~~~~~~~~~~~ Refer libglusterfs/src/gfdb/gfdb_data_store.h Change-Id: I2e9fbab3878ce630a7f41221ef61017dc43db11f BUG: 1194753 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/9683 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* marker: fix compile time warning on buf arg.Humble Devassy Chirammal2015-03-181-2/+0
| | | | | | | | | | | | | | | | | Problem: marker-quota.c: In function 'mq_inspect_directory_xattr_task': marker-quota.c:3451:31: warning: variable 'buf' set but not used [-Wunused-but-set-variable] struct iatt buf = {0,}; Change-Id: I211378328bdb2509a5d2a186d173f7f30a670c8a BUG: 1198849 Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com> Reviewed-on: http://review.gluster.org/9928 Reviewed-by: Vijaikumar Mallikarjuna <vmallika@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* Quota: Build ancestry in the lookupvmallika2015-03-183-10/+113
| | | | | | | | | | | | | | | | | | Marker can fail or can account incorrect numbers when it doesn't find a ancestry for a inode. Solution: Current build_ancestry is done only on demand in the write/create FOPs in quota enforcer. It is good to do this in the quota_lookup as well. Change-Id: I8aaf5b3e05a3ca51e7ab1eaa1b636a90f659a872 BUG: 1184885 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/9478 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: Change the subvolume encoding in d_off to be a "global"Dan Lambright2015-03-1814-182/+107
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | position in the graph rather than relative (local) to a particular translator. Encoding the volume in this way allows a single translator to manage which brick is currently being scanned for directory entries. Using a single translator minimizes allocated bits in the d_off. It also allows multiple DHT translators in the same graph to have a common frame of reference (the graph position) for which brick is being read. Multiple DHT translators are needed for the Tiering feature. The fix builds off a previous change (9332) which removed subvolume encoding from AFR. The fix makes an equivalent change to the EC translator. More background can be found in fix 9332 and gluster-dev discussions [1]. DHT and AFR/EC are responsibile (as before) for choosing which brick to enumerate directory entries in over the readdir lifecycle. The client translator receiving the readdir fop encodes the dht_t. It is referred to as the "leaf node" in the graph and corresponds to the brick being scanned. When DHT decodes the d_off, it translates the leaf node to a local subvolume, which represents the next node in the graph leading to the brick. Tracking of leaf nodes is done in common utility functions. Leaf nodes counts and positional information are updated on a graph switch. [1] www.gluster.org/pipermail/gluster-devel/2015-January/043592.html Change-Id: Iaf0ea86d7046b1ceadbad69d88707b243077ebc8 BUG: 1190734 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/9688 Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* CLI : GLobal option for NFS-GaneshaMeghana Madhusudhan2015-03-186-17/+272
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A new global CLI option has been introduced for NFS-Ganesha. gluster features.ganesha enable/disable. This option is persistent and shall be inherited by new volumes created after this option is set. gluster features.ganesha enable It carries out the following functions: 1. Disables gluster-nfs across the cluster 2. Starts NFS-Ganesha server on a subset of nodes and exports '/'. 3. Creates the HA cluster for NFS-Ganesha. 4. Writes the option into the global config file. gluster features.ganesha disable 1. Stops NFS-Ganesha server. 2. Tears down the HA cluster for NFS-Ganesha With this change the older volume set options with keys "nfs-ganesha.host" and "nfs-ganesha.enable" will no longer be supported. This commit has only has the CLI related changes. Another patch will be submitted to support this feature entirely. Change-Id: Ie4b66a16c23b33b795738654b9a68f8e2c34efe3 BUG: 1188184 Signed-off-by: Meghana Madhusudhan <mmadhusu@redhat.com> Reviewed-on: http://review.gluster.org/9538 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* dht: suggest to add more bricks when min-free-disk is exceeded.Humble Devassy Chirammal2015-03-182-3/+3
| | | | | | | | | | Change-Id: I628fbd99c2478fcb8bb6e5be55e43467f25227bf BUG: 1165870 Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com> Reviewed-on: http://review.gluster.org/9879 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Lalatendu Mohanty <lmohanty@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* glusterd: add new NFS options for exports/netgroups and related cachingNiels de Vos2015-03-182-0/+88
| | | | | | | | | | | | | | | | | The following options for the Gluster/NFS server are added : - nfs.exports-auth-enable - nfs.auth-refresh-interval-sec - nfs.auth-cache-ttl-sec BUG: 1143880 Change-Id: I37a73966c4ed27cd0f8c77200ef68a0d12b385b8 Original-author: Shreyas Siravara <shreyas.siravara@gmail.com> CC: Richard Wareing <rwareing@fb.com> CC: Jiffin Tony Thottan <jthottan@redhat.com> Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/9364 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Remove compilation warningKaushal M2015-03-182-8/+6
| | | | | | | | | | | | | | | | | | | | | | | | In glusterd_peerinfo_destroy, cast the passed 'strcut rcu_head *' pointer to 'gd_rcu_head *' before use in caa_container_of() to prevent the incompatible-pointer compilation warning. Also, refactor peerinfo->head to peerinfo->rcu_head to reduce confusion when reading code. This change was developed on the git branch at [1]. This commit is a combination of the following commits on the development branch. aa4a0bc Rename peerinfo->head to peerinfo->rcu_head c79144b Cast struct rcu_head * to gd_rcu_head * to prevent warning 1d222c3 More head -> rcu_head renames [1]: https://github.com/kshlm/glusterfs/tree/urcu BUG: 1191030 Change-Id: I7ede02090413839563ce44fdf6289697b28777e7 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/9922 Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/afr: Make read child match check in afr optionalKrutika Dhananjay2015-03-184-0/+25
| | | | | | | | | | | | | | | Having this particular check which was introduced by commit c78998c39f0857ea7aacba360632c148afc54a55 causes a drop in performance in readdirp. So the behavior is made configurable with this patch. Change-Id: I2858fc18b3539df7aa6d3f489e0d5cfaeb8a9b3c BUG: 1202669 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/9917 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* NFS-Ganesha: Volume set option for managing NFS-Ganesha exports.Meghana Madhusudhan2015-03-1811-42/+541
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | A dummy translator has been introduced as a place holder for functions related to managing NFS-Ganesha exports. A volume set option is introduced to manage volume level exports. gluster vol set <volname> ganesha.enable ON/OFF 1. gluster volume set <volname> ganesha.enable ON It creates the export config file with a unique export ID. Sends a DBus signal to export this volume dynamically. 2. gluster vol set <volname> ganesha.enable OFF Unexports the specific volume. Deletes the specfic config file related to the volume. This change also removes the handling of the older keys "nfs-ganesha.enable" and "nfs-ganesha.host" Change-Id: I8d4a0b542326a6a0c8e4711600b106274d666587 BUG: 1188184 Signed-off-by: Meghana Madhusudhan <mmadhusu@redhat.com> Reviewed-on: http://review.gluster.org/9585 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* Snapshot/clone: clone of a snapshot that will act as a regular volumeMohammed Rafi KC2015-03-182-183/+800
| | | | | | | | | | | | | | | | | snapshot clone will allow us to take a snpahot of a snapshot. Newly created clone volume will be a regular volume with read/write permissions. CLI command snapshot clone <clonename> <snapname> Change-Id: Icadb993fa42fff787a330f8f49452da54e9db7de BUG: 1199894 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/9750 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/trash: Avoid unnecessary logging from trash_local_wipeAnoop C S2015-03-171-1/+2
| | | | | | | | | | | | | | | | | | | Even when trash translator is disabled, the following error is being logged for each unlink/truncate/ftruncate calls. [...] E [trash.c:221:trash_local_wipe] (--> ... ... ) 0-trash: invalid argument: local This change replaces GF_VALIDATE_OR_GOTO macro with simple if condition. Change-Id: I7e6754cd53ec7c2d84669b6d40d883a2d1eee41e BUG: 1132465 Signed-off-by: Anoop C S <achiraya@redhat.com> Reviewed-on: http://review.gluster.org/9909 Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* Quota/marker : Support for inode quotavmallika2015-03-176-248/+1658
| | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the only way to retrieve the number of files/objects in a directory or volume is to do a crawl of the entire directory/volume. This is expensive and is not scalable. The new mechanism proposes to store count of objects/files as part of an extended attribute of a directory. Each directory's extended attribute value will indicate the number of files/objects present in a tree with the directory being considered as the root of the tree. Currently file usage is accounted in marker by doing multiple FOPs like setting and getting xattrs. Doing this with STACK WIND and UNWIND can be harder to debug as involves multiple callbacks. In this code we are replacing current mechanism with syncop approach as syncop code is much simpler to follow and help us implement inode quota in an organized way. Change-Id: Ibf366fbe07037284e89a241ddaff7750fc8771b4 BUG: 1188636 Signed-off-by: vmallika <vmallika@redhat.com> Signed-off-by: Sachin Pandit <spandit@redhat.com> Reviewed-on: http://review.gluster.org/9567 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* gfapi: APIs to store and process upcall notifications receivedSoumya Koduri2015-03-171-0/+34
| | | | | | | | | | | | | | | | | | | | | | | In case of any upcall cbk events received by the protocol/client, gfapi will be notified which queues them up in a list (<gfapi_cbk_upcall>). Applicatons are responsible to provide APIs to process & notify them in case of any such upcall events queued. Added a new API which will be used by Ganesha to repeatedly poll for any such upcall event notified (<glfs_h_poll_upcall>). A new test-file has been added to test the cache_invalidation upcall events. Below link has a writeup which explains the code changes done - URL: https://soumyakoduri.wordpress.com/2015/02/25/glusterfs-understanding-upcall-infrastructure-and-cache-invalidation-support/ Change-Id: Iafc6880000c865fd4da22d0cfc388ec135b5a1c5 BUG: 1200262 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/9536 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* afr: remove stale index entriesRavishankar N2015-03-173-4/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: During pre-op phase, the index xlator 1. Creates the entry inside .glusterfs/indices/xattrop 2. Winds the xattrop fop to posix to mark dirty/pending changelogs. If the brick crashes after 1, the xattrop entry becomes stale and never gets removed by shd during subsequent crawls because there is nothing to heal (changelogs are zero). Though the stale entry does not get displayed in the output of 'heal info' command, it nevertheless stays there forever unless a new write transaction is performed on the file. Fix: During index self-heal if afr xattrs are found to be clean (indicated by ret value of 2 on a call to afr_shd_selfheal(), send a dummy post-op with all 0s for the xattr values, which makes the index xlator to unlink the stale entry. Change-Id: I02cb2bc937f2e3f3f3cb35d67b006664dc7ef919 BUG: 1190069 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/9714 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anuradha Talur <atalur@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* every/where: add GF_FOP_IPC for inter-translator communicationJeff Darcy2015-03-175-47/+280
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several features - e.g. encryption, erasure codes, or NSR - involve multiple cooperating translators which sometimes need a "private" means of communication amongst themselves. Historically we've used virtual or synthetic xattrs, but that's not very elegant and clutters up the getxattr/setxattr path which must also handle real xattr requests. This new fop should address that. The only argument is an int32_t "op" which should be recognized by the target translator. It is recommended that translators using these feature follow some convention regarding the ops that they define, to avoid conflicts. Using a hash of the target translator's type string as a base for a series of ops would probably be a good start. Any other information can be passed in both directions using xdata. The default behavior for this fop, as with any other, is to pass through to FIRST_CHILD. That makes use of this fop "transparent" to other translators that were written before it existed, but it also means that it only really works with pass-through translators. If a routing translator (such as DHT) or a fan-out translator (such as AFR) is involved, the IPC might not reach its intended destination unless those translators are modified to forward IPC fops along all paths. If an IPC gets all the way to storage/posix it is considered an error, much like an uncaught exception. We don't actually *do* anything in that case, but we do log it send back an EOPNOTSUPP error. This makes the "unrecognized opcode" condition distinguishable from the "no IPC support" condition (which would yield an RPC error instead) so clients can probe for the presence of a handler for their own favorite opcode and either use that or use old-school xattrs depending on the result. BUG: 1158628 Signed-off-by: Venky Shankar <vshankar@redhat.com> Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Change-Id: I84af1b17babe5b30ec03ecf027ae37d09b873968 Reviewed-on: http://review.gluster.org/8812 Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* socket: use TCP_USER_TIMEOUT to detect client failures quickerNiels de Vos2015-03-173-3/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the network.ping-timeout to set the TCP_USER_TIMEOUT socket option (see 'man 7 tcp'). The option sets the transport.tcp-user-timeout option that is handled in the rpc/socket layer on the protocol/server side. This socket option makes detecting unclean disconnected clients more reliable. When the socket gets closed, any locks that the client held are been released. This makes it possible to reduce the fail-over time for applications that run on systems that became unreachable due to a network partition or general system error client-side (kernel panic, hang, ...). It is not trivial to create a test-case for this at the moment. We need a client that unclean disconnects and an other client that tries to take over the lock from the disconnected client. URL: http://supercolony.gluster.org/pipermail/gluster-devel/2014-May/040755.html Change-Id: I5e5f540a49abfb5f398291f1818583a63a5f4bb4 BUG: 1129787 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/8065 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: Santosh Pradhan <santosh.pradhan@gmail.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* Upcall: New xlator to store various states and send cbk eventsSoumya Koduri2015-03-1712-7/+1791
| | | | | | | | | | | | | | | | | | | | | | | | | | Framework on the server-side, to handle certain state of the files accessed and send notifications to the clients connected. A generic and extensible framework, used to maintain states in the glusterfsd process for each of the files accessed (including the clients info doing the fops) and send notifications to the respective glusterfs clients incase of any change in that state. This patch handles "Inode Update/Invalidation" upcall event. Feature page: URL: http://www.gluster.org/community/documentation/index.php/Features/Upcall-infrastructure Below link has a writeup which explains the code changes done - URL: https://soumyakoduri.wordpress.com/2015/02/25/glusterfs-understanding-upcall-infrastructure-and-cache-invalidation-support/ Change-Id: Ie3d724be9a3419fcf18901a753e8ec2df2ac802f BUG: 1200262 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/9535 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* Features/trash : Combined patches for trash translatorAnoop C S2015-03-169-727/+2086
| | | | | | | | | | | | | | | | | | | | | | | | | This is the combined patch set for supporting trash feature. http://www.gluster.org/community/documentation/index.php/Features/Trash Current patch includes the following features: * volume set options for enabling trash globally and exclusively for internal operations like self-heal and re-balance * volume set options for setting the eliminate path, trash directory path and maximum trashable file size. * test script for checking the functionality of the feature * brief documentation on different aspects of trash feature. Change-Id: Ic7486982dcd6e295d1eba0f4d5ee6d33bf1b4cb3 BUG: 1132465 Signed-off-by: Anoop C S <achiraya@redhat.com> Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/8312 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Protect the peer list and peerinfos with RCU.Kaushal M2015-03-1618-290/+801
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The peer list and the peerinfo objects are now protected using RCU. Design patterns described in the Paul McKenney's RCU dissertation [1] (sections 5 and 6) have been used to convert existing non-RCU protected code to RCU protected code. Currently, we are only targetting guaranteeing the existence of the peerinfo objects, ie., we are only looking to protect deletes, not all updaters. We chose this, as protecting all updates is a much more complex task. The steps used to accomplish this are, 1. Remove all long lived direct references to peerinfo objects (apart from the peerinfo list). This includes references in glusterd_peerctx_t (RPC), glusterd_friend_sm_event_t (friend state machine) and others. This way no one has a reference to deleted peerinfo object. 2. Replace the direct references with indirect references, ie., use peer uuid and peer hostname as indirect references to the peerinfo object. Any reader or updater now uses the indirect references to get to the actual peerinfo object, using glusterd_peerinfo_find. Cases where a peerinfo cannot be found are handled gracefully. 3. The readers get and use the peerinfo object only within a RCU read critical section. This prevents the object from being deleted/freed when in actual use. 4. The deletion of a peerinfo object is done in a ordered manner (glusterd_peerinfo_destroy). The object is first removed from the peerinfo list using an atomic list remove, but the list head is not reset to allow existing list readers to complete correctly. We wait for readers to complete, before resetting the list head. This removes the object from the list completely. After this no new readers can get a reference to the object, and it can be freed. This change was developed on the git branch at [2]. This commit is a combination of the following commits on the development branch. d7999b9 Protect the glusterd_conf_t->peers_list with RCU. 0da85c4 Synchronize before INITing peerinfo list head after removing from list. 32ec28a Add missing rcu_read_unlock 8fed0b8 Correctly exit read critical section once peer is found. 63db857 Free peerctx only on rpc destruction 56eff26 Cleanup style issues e5f38b0 Indirection for events and friend_sm 3c84ac4 In __glusterd_probe_cbk goto unlock only if peer already exists 141d855 Address review comments on 9695/1 aaeefed Protection during peer updates 6eda33d Revert "Synchronize before INITing peerinfo list head after removing from list." f69db96 Remove unneeded line b43d2ec Address review comments on 9695/4 7781921 Address review comments on 9695/5 eb6467b Add some missing semi-colons 328a47f Remove synchronize_rcu from glusterd_friend_sm_transition_state 186e429 Run part of glusterd_friend_remove in critical section 55c0a2e Fix gluster (peer status/ pool list) with no peers 93f8dcf Use call_rcu to free peerinfo c36178c Introduce composite struct, gd_rcu_head [1]: http://www.rdrop.com/~paulmck/RCU/RCUdissertation.2004.07.14e1.pdf [2]: https://github.com/kshlm/glusterfs/tree/urcu Change-Id: Ic1480e59c86d41d25a6a3d159aa3e11fbb3cbc7b BUG: 1191030 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/9695 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Anand Nekkunti <anekkunt@redhat.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* nfs: improve cleanup of 'struct exports_file' to prevent memory leakNiels de Vos2015-03-151-4/+16
| | | | | | | | | | BUG: 1143880 Change-Id: I359470a1edb935e206eeeecd4de7022530fb397a Reported-by: Vijay Bellur <vbellur@redhat.com> Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/9882 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* fuse: Fixing dereference after null checkarao2015-03-151-1/+1
| | | | | | | | | | | | | | | CID: 1124496 The pointer is not checked against null and is dereferenced anyway, which is now checked. Change-Id: Ib810546445596671b3656f01a14bbad02cdc221c BUG: 789278 Signed-off-by: arao <arao@redhat.com> Reviewed-on: http://review.gluster.org/9640 Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* xlators/storage/bd : Unused value is removed.Manikandan Selvaganesh2015-03-151-1/+0
| | | | | | | | | | | | | | CID:1128926 Change-Id: I5ad1229e225a36f995245a847db1a19609a18cd8 BUG: 789278 Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com> Reviewed-on: http://review.gluster.org/9556 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* nfs: use free() for getline() allocated string in netgroupsNiels de Vos2015-03-151-1/+2
| | | | | | | | | | | | | | | | | | 'line' is allocated through getline() which uses malloc(). GF_FREE() will fail to release the memory because it can not find the expected mem-pool header. Instead of GF_FREE(), free() should be used for strings that get allocated with getline(). Subsequent calls to getline() with a non-NULL pointer will get the size of the allocation adjusted with realloc(). Change-Id: I612fbf17d7283174d541da6f34d26e4f44e83bfa BUG: 1143880 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/9860 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* gNFS: Export / Netgroup authentication on Gluster NFS mountNiels de Vos2015-03-1510-53/+1018
| | | | | | | | | | | | | | | | | | | | | | | | | | | | * Parses linux style export file/netgroups file into a structure that can be lookedup. * This parser turns each line into a structure called an "export directory". Each of these has a dictionary of hosts and netgroups which can be looked up during the mount authentication process. (See Change-Id Ic060aac and I7e6aa6bc) * A string beginning withan '@' is treated as a netgroup and a string beginning without an @ is a host. (See Change-Id Ie04800d) * This parser does not currently support all the options in the man page ('man exports'), but we can easily add them. BUG: 1143880 URL: http://www.gluster.org/community/documentation/index.php/Features/Exports_Netgroups_Authentication Change-Id: I181e8c1814d6ef3cae5b4d88353622734f0c0f0b Original-author: Shreyas Siravara <shreyas.siravara@gmail.com> CC: Richard Wareing <rwareing@fb.com> CC: Jiffin Tony Thottan <jthottan@redhat.com> Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/8758 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* nfs: add auth-cache for the MOUNT protocolNiels de Vos2015-03-155-2/+371
| | | | | | | | | | | | | | | | | | | Authentication cache for the new fine grained contol for the MOUNT protocol. The extended authentication (see Change-Id Ic060aac) benefits from caching the access/permission checks that are done when an NFS-client mounts an export. This auth-cache will be used by Change-Id I181e8c1. BUG: 1143880 Change-Id: I1379116572c8a4d1bf0c7ca4f826e51a79d91444 Original-author: Shreyas Siravara <shreyas.siravara@gmail.com> CC: Richard Wareing <rwareing@fb.com> CC: Jiffin Tony Thottan <jthottan@redhat.com> Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/9363 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* nfs: more fine grained authentication for the MOUNT protocolNiels de Vos2015-03-156-3/+709
| | | | | | | | | | | | | | | | The /etc/exports format for NFS-exports (see Change-Id I7e6aa6b) allows a more fine grained control over the authentication. This change adds the functions and structures that will be used in by Change-Id I181e8c1. BUG: 1143880 Change-Id: Ic060aac7c52d91e08519b222ba46383c94665ce7 Original-author: Shreyas Siravara <shreyas.siravara@gmail.com> CC: Richard Wareing <rwareing@fb.com> CC: Jiffin Tony Thottan <jthottan@redhat.com> Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/9362 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* nfs: add support for separate 'exports' fileNiels de Vos2015-03-155-2/+1535
| | | | | | | | | | | | | | | | The Linux kernel NFS server uses /etc/exports to manage permissions for the NFS-clients. Extending the Gluster/NFS server to support a similar scheme is needed for many deployments in enterprise environments. BUG: 1143880 Change-Id: I7e6aa6bc6aa1cd5f52458e023387ed38de9823d7 Original-author: Shreyas Siravara <shreyas.siravara@gmail.com> CC: Richard Wareing <rwareing@fb.com> CC: Jiffin Tony Thottan <jthottan@redhat.com> Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/9361 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep-cli: added a new option "no-verify" to geo-rep create.darshan n2015-03-121-17/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a new option called as "no-verify" to geo-rep create command. With no-verify option, following checks does not take place before session creation: * if ssh port 22 is open in slave * has proper passwordless ssh login setup * slave volume is created and is empty * if slave has enough memory This option is needed by ovirt-engine as the tasks done by push-pem is taken care by ovirt-engine and also the above checks are done. Thus creation of password-less ssh can be avoided when geo-replication is managed through ovirt. Usage: volume geo-replication [<VOLNAME>] [<SLAVE-URL>] { create [[no-verify]|[push-pem]] [force]| start [force]|stop [force]|pause [force]| resume [force]|config|status [detail]| delete } [options...] Change-Id: I975265f27d6434be5409438257d09cd4190c9159 BUG: 1198615 Signed-off-by: darshan n <dnarayan@redhat.com> Reviewed-on: http://review.gluster.org/9799 Reviewed-by: Aravinda VK <avishwan@redhat.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* cluster/afr: Convert quota size from n/w to host order before useKrutika Dhananjay2015-03-121-0/+4
| | | | | | | | | | | | Change-Id: I3e4fe15716556441546fcd62b8ac2833869b21cf BUG: 1200670 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/9853 Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Anuradha Talur <atalur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* nfs: add structures and functions for parsing netgroupsNiels de Vos2015-03-114-2/+1198
| | | | | | | | | | | | | | | | | | | | Netgroups are often used by enterprises to group a set of systems. The NFS /etc/exports file support the @netgroup notation, and Gluster/NFS will get extended to support this notation as well. For this, it is needed that Gluster/NFS learns to parse the netgroup format. A change to glusterfsd (Change-Id I24c40d5) will add test cases where the parsing is used for regression testing. BUG: 1143880 Change-Id: Ie04800d4dc26f99df922c9fcc00845f53291cf4f Original-author: Shreyas Siravara <shreyas.siravara@gmail.com> CC: Richard Wareing <rwareing@fb.com> CC: Jiffin Tony Thottan <jthottan@redhat.com> Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/9360 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: create nfs volfile even when NFS is disabled on all volumesKrishnan Parthasarathi2015-03-111-1/+1
| | | | | | | | | | | | | | | | | | | | | This is required to determine if gluster-nfs daemon needs to be restarted. With http://review.gluster.org/9835 gluster-nfs volfile wouldn't be created if all volumes had nfs disabled before they were started even once. With the existing code, we wouldn't be able to determine if gluster-nfs needs to be restarted or reconfigured without the gluster-nfs volfile. This fix is ensure that we generate the gluster-nfs volfile even if it wouldn't be started, to honour the above requirement. Change-Id: I86c6707870d838b03dd4d14b91b984cb43c33006 BUG: 1199944 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/9851 Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Kaushal M <kaushal@redhat.com> Tested-by: Kaushal M <kaushal@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* Performance: Replace ASSERTS in xlator fini() with info messages, for the ↵Poornima G2015-03-102-1/+15
| | | | | | | | | | | | | | | | | known leaks. There are few known leaks in read-ahead and quick-read xlator fini(). Until they are fixed replacing the ASSERTS with info, else any call to glfs_fini() in debug mode will core dump. Change-Id: Id60c6f952574863fc77c7d101cb5d5e9113090d8 BUG: 1199436 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/9819 Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* snapshot: append timestamp with snapnameMohammed Rafi KC2015-03-101-19/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Appending GMT time stamp with snapname by default. If no-timestamp flag is given during snapshot creation, then time stamp will not append with snapname; Initial consumer of this feature is Samba's Shadow Copy feature. This feature allows Windows user to get previous revisions of a file. For this feature to work snapshot names under .snaps folder (USS) should have timestamp in following format appended: @GMT-YYYY.MM.DD-hh.mm.ss PS: https://www.samba.org/samba/docs/man/manpages/vfs_shadow_copy2.8.html This format is configurable by Samba conf file. Due to a limitation in Windows directory access the exact format cannot be used by USS. Therefore we have modified the file format to: _GMT-YYYY.MM.DD-hh.mm.ss Snapshot scheduling feature also required to append timestamp to the snapshot name therefore timestamp is appended in snapshot creation itself instead of doing the changes in snapview server. More info: https://www.mail-archive.com/gluster-users@gluster.org/msg18895.html Change-Id: Idac24670948cf4c0fbe916ea6690e49cbc832d07 BUG: 1189473 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/9597 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* mgmt/glusterd: Changes required for disperse volume heal commandsPranith Kumar K2015-03-105-153/+226
| | | | | | | | | | | | | - Include xattrop64-watchlist for index xlator for disperse volumes. - Change the functions that exist to consider disperse volumes also for sending commands to disperse xls in self-heal-daemon. Change-Id: Iae75a5d3dd5642454a2ebf5840feba35780d8adb BUG: 1177601 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9793 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaushal M <kaushal@redhat.com>
* cluster/ec: Add self-heal-daemon command handlersPranith Kumar K2015-03-0911-61/+741
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces the changes required in ec xlator to handle index/full heal. Index healer threads: Ec xlator start an index healer thread per local brick. This thread keeps waking up every minute to check if there are any files to be healed based on the indices kept in index directory. Whenever child_up event comes, then also this index healer thread wakes up and crawls the indices and triggers heal. When self-heal-daemon is disabled on this particular volume then the healer thread keeps waiting until it is enabled again to perform heals. Full healer threads: Ec xlator starts a full healer thread for the local subvolume provided by glusterd to perform full crawl on the directory hierarchy to perform heals. Once the crawl completes the thread exits if no more full heals are issued. Changed xl-op prefix GF_AFR_OP to GF_SHD_OP to make it more generic. Change-Id: Idf9b2735d779a6253717be064173dfde6f8f824b BUG: 1177601 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9787 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* posix: add ACL translation for the GF_POSIX_ACL_*_KEY xattrNiels de Vos2015-03-094-1/+113
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adding support for two virtual extended attributes that are used for converting a binary POSIX ACL to a POSIX.1e long ACL text format. This makes it possible to transfer the ACL over the network to a different OS which can convert the POSIX.1e text format to its native structures. The following xattrs are sent over RPC in SETXATTR/GETXATTR procedures, and contain the POSIX.1e long ACL text format: - glusterfs.posix.acl: maps to ACL_TYPE_ACCESS - glusterfs.posix.default_acl: maps to ACL_TYPE_DEFAULT acl_from_text() (from libacl) converts the text format into an acl_t structure. This structure is then used by acl_set_file() to set the ACL in the filesystem. libacl-devel is needed for linking against libacl, so it has been added to the BuildRequires in the .spec. NetBSD does not support POSIX ACLs. Trying to get/set POSIX ACLs on a storage server running NetBSD, an error will be returned with errno set to ENOTSUP. Faking support, but not enforcing ACLs seems wrong to me. URL: http://www.gluster.org/community/documentation/index.php/Features/Improved_POSIX_ACLs BUG: 1185654 Change-Id: Ic5eb73d69190d3492df2f711d0436775eeea7de3 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/9627 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* glusterd: don't start gluster-nfs when NFS is disabledKrishnan Parthasarathi2015-03-091-2/+28
| | | | | | | | | | Change-Id: Ic4da2a467a95af7108ed67954f44341131b41c7b BUG: 1199944 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/9835 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Fix dictionary leaks in ancestry-building code.Pranith Kumar K2015-03-092-6/+1
| | | | | | | | | | Change-Id: I7a4a24ed95f897d1c14d89f3869c20ba40f85b7f BUG: 1188636 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9839 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijaikumar Mallikarjuna <vmallika@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* libxlator: Make sure marker_xattr is validPranith Kumar K2015-03-091-13/+17
| | | | | | | | | | | | | | | | | | | Problem: marker_xattr is allocated only when the op_ret is 0. If the final response is a failure, then the marker time is dict_set with key as NULL. this will be changed to ref:<address-of-value> by dict_set, so the value won't appear on the marker-key when the getxattr_cbk is in dht. So dht unwinds with failure EINVAL. Fix: Always populate marker_xattr. Fixed dict mem-leak as well. Change-Id: I1752f277a8852c47b0a2ccce9fd72ee88456ac02 BUG: 1199406 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9817 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/afr: Handle getxattr of quota-size keyPranith Kumar K2015-03-093-61/+103
| | | | | | | | | | | | | Afr needs to query QUOTA_SIZE_KEY from all the subvolumes and return the value which is maximum of the readable bricks. Change-Id: Ibb9064c8652aea0d984796e7a06f8adca72aa971 BUG: 1199431 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9820 Reviewed-by: Anuradha Talur <atalur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
* mgmt/glusterd: Add bind-insecure optionPranith Kumar K2015-03-091-2/+8
| | | | | | | | | | | | Also deleted default values for disperse-self-heal-daemon and locks.trace Change-Id: Icc927d176aa10f06b40c114aa296b02dbad3a8ff BUG: 1187858 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9516 Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaushal M <kaushal@redhat.com>
* nfs: prevent logging missing 'system.posix_acl_access' xattrsNiels de Vos2015-03-081-0/+4
| | | | | | | | | | | | | Change http://review.gluster.org/9773 addresses the majority of the logging, but it seems it is still possible to trigger the excessive logging by requesting the ACL on files directly. Lets squash those too. BUG: 1197253 Change-Id: I9e90ddd45f1a39641478f34c69c64dfe1c11c727 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/9781 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Meghana M <mmadhusu@redhat.com>