path: root/xlators/mgmt/glusterd/src/glusterd-store.c
Commit message (Author, Date, Files, Lines changed)
* glusterd/snapshot: Addressed upstream review comments [HEAD, development] (Rajesh Joseph, 2014-04-07, 1 file, -3/+2)

  Code cleanup and review comments fixed.

  Change-Id: Ie51e3a2f995295632cb1db05bab8b38603ab9a0a
  Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
  Reviewed-on: http://review.gluster.org/7399
  Reviewed-by: Vijaikumar Mallikarjuna <vmallika@redhat.com>
  Reviewed-by: Avra Sengupta <asengupt@redhat.com>
* glusterd/snapshot: Recording the snapshots missed in each brick. (Avra Sengupta, 2014-04-02, 1 file, -7/+281)

  Persist missed-snapshot info on disk as well as in memory, in the
  following format:

  NODE-UUID:SNAP-UUID=BRICKNUM:BRICKPATH:OPERATION:STATUS

  927cb5fe-63da-48f5-82f6-e6a09ddc81c4:a17b4fe42c5a45f7a916438643edaa13=3:/brick/brick-dirs/brick3:1:1
  927cb5fe-63da-48f5-82f6-e6a09ddc81c4:a17b4fe42c5a45f7a916438643edaa13=3:/brick/brick-dirs/brick3:3:1
  927cb5fe-63da-48f5-82f6-e6a09ddc81c4:83a3cc05453b46b2a7eda4c9a9208638=3:/brick/brick-dirs/brick3:1:1

  On disk this data is stored at /var/lib/glusterd/snaps/missed_snaps_list.
  In memory it is maintained as a list of glusterd_missed_snap_info in conf,
  keyed by the first two fields, i.e. NODE-UUID:SNAP-UUID. For every
  NODE-UUID:SNAP-UUID there can be multiple operations missed on multiple
  bricks, so each glusterd_missed_snap_info node carries a list of
  glusterd_snap_op_t.

  The list is maintained and updated only during snapshot create, delete,
  and restore, which are the operations that are recorded here if missed.

  During snapshot create, if a node or a brick is down we do not receive
  its mount-point info; the snap_status of such bricks is marked as -1 and
  their brick details are added to this list.

  During snapshot delete, the originator node checks whether any other
  nodes holding bricks of the snap are down, and adds those to the list.
  If a node is up but the snapshot is still pending for one of its snap
  bricks (snap_status is -1), that brick is added as well. When a delete
  entry is processed for an already existing create entry, the create
  entry's status is simply marked as done (2) and no separate delete entry
  is added.

  During snapshot restore, the same checks as for delete are performed and
  the affected bricks are added to the list.

  Change-Id: I22578d14f81a54e13f6832966b70cd4cfdfd5b44
  Signed-off-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-on: http://review.gluster.org/7208
  Reviewed-by: Vijaikumar Mallikarjuna <vmallika@redhat.com>
  Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
  Tested-by: Rajesh Joseph <rjoseph@redhat.com>
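For illustration, a minimal C sketch of parsing one missed_snaps_list entry in the format above. This is not the glusterd implementation; it assumes the fields carry no surrounding whitespace and that the brick path contains no ':' characters.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    struct missed_snap_entry {
        char node_uuid[64];
        char snap_uuid[64];
        int  brick_num;
        char brick_path[256];
        int  op;      /* operation code, e.g. 1 or 3 as in the sample data */
        int  status;  /* e.g. 2 = done, as described above */
    };

    /* Parse "NODE-UUID:SNAP-UUID=BRICKNUM:BRICKPATH:OPERATION:STATUS". */
    static int
    parse_missed_snap_line (const char *line, struct missed_snap_entry *e)
    {
        char  buf[512];
        char *val, *tok, *save = NULL;

        if (snprintf (buf, sizeof (buf), "%s", line) >= (int) sizeof (buf))
            return -1;

        val = strchr (buf, '=');
        if (!val)
            return -1;
        *val++ = '\0';

        /* The key before '=' is NODE-UUID:SNAP-UUID. */
        tok = strchr (buf, ':');
        if (!tok)
            return -1;
        *tok++ = '\0';
        snprintf (e->node_uuid, sizeof (e->node_uuid), "%s", buf);
        snprintf (e->snap_uuid, sizeof (e->snap_uuid), "%s", tok);

        /* The value after '=' is BRICKNUM:BRICKPATH:OPERATION:STATUS. */
        if (!(tok = strtok_r (val, ":", &save))) return -1;
        e->brick_num = atoi (tok);
        if (!(tok = strtok_r (NULL, ":", &save))) return -1;
        snprintf (e->brick_path, sizeof (e->brick_path), "%s", tok);
        if (!(tok = strtok_r (NULL, ":", &save))) return -1;
        e->op = atoi (tok);
        if (!(tok = strtok_r (NULL, ":", &save))) return -1;
        e->status = atoi (tok);

        return 0;
    }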
* glusterd/snapshot: Making snap operations crash consistent (Avra Sengupta, 2014-04-02, 1 file, -0/+9)

  In the event of a volume's brick or a node being down, make snapshot
  operations (create, delete, restore, and status) crash consistent.

  Snap bricks that were not snapshotted because the volume brick was down
  have their snap status marked as -1, and those snap bricks are not
  started until the snapshot is actually taken. During delete, the lvm
  snapshot removal is bypassed for snap bricks whose snap status is -1.
  During restore, replacing the xattrs on such snapshot bricks is bypassed
  as well. The restored volume's version is also bumped so that nodes
  being down can be handled; on handshake of a restored volume the brick's
  snap_status is passed along. During snapshot status, the details of
  non-snapshotted bricks are displayed as "N/A"; if a node is down, its
  entry is not displayed at all.

  Change-Id: Id042efd7507829995270da0b2b2a6282a08a053d
  Signed-off-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-on: http://review.gluster.org/7241
  Reviewed-by: Vijaikumar Mallikarjuna <vmallika@redhat.com>
  Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
  Tested-by: Rajesh Joseph <rjoseph@redhat.com>
* glusterd/snapshot: Using original brickinfo for snap brick path (Avra Sengupta, 2014-04-02, 1 file, -0/+12)

  Use /var/run/gluster/snaps/<snap-uuid>/<original brick number>/<original brick's mount dir>
  as the snapshot brick path.

  Change-Id: Icdd838512e662ed687662188bd197ed1a889fbea
  Signed-off-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-on: http://review.gluster.org/7231
  Reviewed-by: Vijaikumar Mallikarjuna <vmallika@redhat.com>
  Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
  Tested-by: Rajesh Joseph <rjoseph@redhat.com>
* glusterd/snapshot: store related fix for the review comments in the upstream (Vijaikumar M, 2014-03-27, 1 file, -6/+8)

  Change-Id: I07da1f0a102b71c7c70dac83b1fe98d05607cb64
  Signed-off-by: Vijaikumar M <vmallika@redhat.com>
  Reviewed-on: http://review.gluster.org/7355
  Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
  Tested-by: Rajesh Joseph <rjoseph@redhat.com>
* glusterd/snapshot: remove snap_volume variable from struct glusterd_snap_ (Vijaikumar M, 2014-03-25, 1 file, -49/+29)

  Change-Id: I3c5eae3f199d97a2fe70599e2cdb4c357d20f20d
  Signed-off-by: Vijaikumar M <vmallika@redhat.com>
  Reviewed-on: http://review.gluster.org/7197
  Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
  Tested-by: Rajesh Joseph <rjoseph@redhat.com>
* snapshot: cleanup unwanted CG-code (Vijaikumar M, 2014-03-25, 1 file, -409/+0)

  Change-Id: Id3da2a7aac84027aeff050f6dd9a3484719362bc
  BUG: 1073780
  Signed-off-by: Vijaikumar M <vmallika@redhat.com>
  Reviewed-on: http://review.gluster.org/7204
  Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
  Tested-by: Rajesh Joseph <rjoseph@redhat.com>
* mgmt/glusterd: do cleanup of snapshots in post-validate phase if half baked objects are there (Raghavendra Bhat, 2014-03-24, 1 file, -0/+57)

  Change-Id: I372cac98ad054cdc1a6fbc7f6c77c25981063b2f
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
  Reviewed-on: http://review.gluster.org/7237
  Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
  Tested-by: Rajesh Joseph <rjoseph@redhat.com>
* glusterd/snapshot: Snapshot create and delete changes (Vijaikumar M, 2014-03-06, 1 file, -33/+0)

  With the snap-driven approach, a snapshot is created by specifying the
  snap name first and then the volumes to be associated with it, and a
  snapshot is deleted by specifying only the snap name. glusterd has been
  changed accordingly.

  CLI changes for the same can be found here:
  http://review.gluster.org/#/c/6947/

  Change-Id: I8bd8f471da5b728165da5f331faad3dde3486823
  Signed-off-by: Vijaikumar M <vmallika@redhat.com>
  Reviewed-on: http://review.gluster.org/7123
  Reviewed-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
  Tested-by: Rajesh Joseph <rjoseph@redhat.com>
* glusterd/snapshot: store location for snap driven changes (Vijaikumar M, 2014-03-03, 1 file, -885/+516)

  Currently snapshot volfiles are stored at:
      <workdir>/vols/<volname>/snaps/<snapvol>
  With the snap-driven approach we need to store the volfiles at:
      <workdir>/snaps/<snapname>/<snapvol>

  Change-Id: I8efdd5db29833b2b06b64a900cbb4c9b9a5d36b6
  Signed-off-by: Vijaikumar M <vmallika@redhat.com>
  Signed-off-by: Sachin Pandit <spandit@redhat.com>
  Reviewed-on: http://review.gluster.org/7006
  Reviewed-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
  Tested-by: Rajesh Joseph <rjoseph@redhat.com>
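For illustration, a small C sketch of building the new snap-driven store path; the helper name is hypothetical and this is not the glusterd API.

    #include <stdio.h>

    /* Hypothetical helper: build <workdir>/snaps/<snapname>/<snapvol>. */
    static int
    build_snap_store_path (char *buf, size_t len, const char *workdir,
                           const char *snapname, const char *snapvol)
    {
        int n = snprintf (buf, len, "%s/snaps/%s/%s",
                          workdir, snapname, snapvol);
        return (n > 0 && (size_t) n < len) ? 0 : -1;
    }

    /* e.g. build_snap_store_path (path, sizeof (path),
     *          "/var/lib/glusterd", "snap1", "<snapvol-name>"); */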
* glusterd/snapshot: Use snap uuid to create lvm snapshot (Vijaikumar M, 2014-02-13, 1 file, -4/+6)

  Using the snap UUID to create the lvm snapshot solves the problem of the
  snap name containing '-' or being too long.

  Change-Id: If204f02a8f5de599fb409d06c7893ef3542a6300
  BUG: 1045333
  Signed-off-by: Vijaikumar M <vmallika@redhat.com>
  Reviewed-on: http://review.gluster.org/6709
  Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
  Tested-by: Rajesh Joseph <rjoseph@redhat.com>
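For illustration, a minimal C sketch (assuming libuuid, compiled with -luuid; not the glusterd code) of deriving an LV name from a freshly generated snap UUID instead of the user-supplied snap name. The dash-stripped form matches the compact SNAP-UUID format seen in the missed_snaps_list sample earlier in this log; the vg/lv names and the lvcreate invocation are purely illustrative.

    #include <stdio.h>
    #include <uuid/uuid.h>

    int
    main (void)
    {
        uuid_t uuid;
        char   uuid_str[37];   /* 36 characters plus terminating NUL */
        char   compact[33];
        char   cmd[512];
        int    i, j = 0;

        uuid_generate (uuid);
        uuid_unparse (uuid, uuid_str);

        /* Strip the '-' separators so the LV name is a fixed-length,
         * predictable token regardless of the user-visible snap name. */
        for (i = 0; uuid_str[i] != '\0' && j < 32; i++)
            if (uuid_str[i] != '-')
                compact[j++] = uuid_str[i];
        compact[j] = '\0';

        snprintf (cmd, sizeof (cmd),
                  "lvcreate -s -n %s /dev/vg0/brick_lv", compact);
        printf ("%s\n", cmd);
        return 0;
    }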
* glusterd/snapshot: snapshot delete - check if process is killed before unmounting. (Sachin Pandit, 2014-02-04, 1 file, -3/+9)

  Change-Id: Idf0cf63429212142795e1aeb4fd4962b51620426
  BUG: 1049353
  Signed-off-by: Sachin Pandit <spandit@redhat.com>
  Reviewed-on: http://review.gluster.org/6772
  Reviewed-by: Vijaikumar Mallikarjuna <vmallika@redhat.com>
  Reviewed-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
  Tested-by: Rajesh Joseph <rjoseph@redhat.com>
* mgmt/glusterd: perform store on only newly created snapshot (Raghavendra Bhat, 2014-02-02, 1 file, -11/+99)

  * Instead of performing a store of all the snaps of the volume when a
    new snap is created, store only the new snap's info. Otherwise, as
    snapshots accumulate on a volume, creating a new snapshot takes longer
    and longer because glusterd has to store the info of every existing
    snapshot each time (and fsync it, too).

  Change-Id: I0e005d1d4c044b07a8abde8e6ba55e66a1bbd590
  BUG: 1059146
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
  Reviewed-on: http://review.gluster.org/6841
  Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
  Tested-by: Rajesh Joseph <rjoseph@redhat.com>
* glusterd/snapshot: Making snap time persistent. (Sachin Pandit, 2014-01-16, 1 file, -11/+61)

  Change-Id: I2d07717ee01751e481571ce420b0f84711ea9597
  Signed-off-by: Sachin Pandit <spandit@redhat.com>
* mgmt/glusterd: Having a separate list for snapshots. (Sachin Pandit, 2014-01-15, 1 file, -5/+4)

  Create a separate list for the snapshots taken, instead of cluttering
  the volume list with them.

  Change-Id: Ida4a183e95e8694b85ebb5a680d06b7d29a460a0
  BUG: 1040947
  Signed-off-by: Sachin Pandit <spandit@redhat.com>
* glusterd/snapshot: Update the parent volume name of snap during restore. (Sachin Pandit, 2014-01-15, 1 file, -0/+1)

  Change-Id: I9663342e49073a172667220d4839b2beb65ffc0e
  BUG: 1040947
  Signed-off-by: Sachin Pandit <spandit@redhat.com>
* Snapshot: Minor change to fix last merge conflict (Rajesh Joseph, 2014-01-15, 1 file, -1/+0)

  Some changes were missed while resolving the conflict in the restore
  patch.

  Change-Id: Ie0cf01237bf975027056c10bb3681e122334f83e
  Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
* Snapshot: Gluster snapshot restore feature (Rajesh Joseph, 2014-01-15, 1 file, -37/+113)

  Implemented the gluster snapshot restore feature. The restore is done by
  replacing the origin volume with the snap volume.

  TODO: After the restore the snapshot volume should be deleted; the
  deletion work is still pending.

  Change-Id: Ib137fb6bb84a74030607ffa47f89cd705dc7e1ff
  Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
* glusterd/snapshot: Making CG Name persistent and handle Snap Description Issue. (Sachin Pandit, 2014-01-13, 1 file, -4/+29)

  1) Write the CG name in the "snap_list.info" file, so that the CG name
     is not lost when glusterd is restarted.
  2) Fix the issue where the description given for a CG as part of the
     create command was stored as the snap description rather than the CG
     description.
  3) Fix the problem with glusterd restart when the snap description
     contains multiple words.

  Change-Id: I3129c53d1ec54dd170ca1300583f278f58c4e0e2
  BUG: 1044476
  Signed-off-by: Sachin Pandit <spandit@redhat.com>
* glusterd/snapshot: Defining snap-max-soft-limit as a percentage of snap-max-hard-limit. (Avra Sengupta, 2014-01-07, 1 file, -17/+6)

  This patch also prohibits configuration of snap-max-hard-limit and
  snap-max-soft-limit for snap volumes. Snapshot configs are now displayed
  by reading data from the local node only, as the config data is in sync
  across the cluster.

  Change-Id: I635b925c02ed5b108cd10c7193b154ad82d5afad
  BUG: 1043792
  Signed-off-by: Avra Sengupta <asengupt@redhat.com>
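For illustration, the arithmetic this implies, as a small C sketch with assumed example values (a hard limit of 256 and a soft limit of 90%; these numbers are illustrative, not taken from this patch).

    #include <stdint.h>
    #include <stdio.h>

    int
    main (void)
    {
        uint64_t hard_limit = 256;  /* snap-max-hard-limit (example value)   */
        uint64_t soft_pct   = 90;   /* snap-max-soft-limit in percent (example) */

        uint64_t effective_soft_limit = (hard_limit * soft_pct) / 100;

        /* prints "effective soft limit = 230 snapshots" */
        printf ("effective soft limit = %llu snapshots\n",
                (unsigned long long) effective_soft_limit);
        return 0;
    }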
* glusterd/snapshot: Introducing snap-max-hard-limit and snap-max-soft-limit (Avra Sengupta, 2014-01-06, 1 file, -18/+67)

  Note: Manually adding this patch again, as it was lost in a git reset
  done on the remote development branch.

  Change-Id: I9e81c5ec003c1e1722d0fcb27dd87c365ee43ff4
  Signed-off-by: Avra Sengupta <asengupt@redhat.com>
  Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
* glusterd/snapshot: Fix Displaying Port, Online Status and Pid for snap vols in volume status (Avra Sengupta, 2013-12-10, 1 file, -0/+12)

  Added a parent_volname member to the glusterd_volinfo_ structure so the
  snap volume can point back to its parent volume's name. This is used to
  fetch the pidfile location during volume status.

  Change-Id: I30a16646561394d0f7d16f66abff14c425f31f06
  Signed-off-by: Avra Sengupta <asengupt@redhat.com>
* mgmt/glusterd: set the snap count properly while restoring (Raghavendra Bhat, 2013-12-03, 1 file, -2/+3)

  Change-Id: I37eb7ab12767fdd11aa2e58441d26e6d6d9dd245
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
* mgmt/glusterd: handle issues present in snapshot create and snap management (Raghavendra Bhat, 2013-12-02, 1 file, -20/+28)

  Change-Id: I94b5f6e00be7d1ff0c454e291c779dae7b423748
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
* glusterd/snapshot: Added trace statements and handled snapshot create commit failure. (Avra Sengupta, 2013-11-20, 1 file, -2/+5)

  Also handles an empty string (not NULL) in gd_syncop_mgmt_brick_op() and
  adds "Snapshot" to the operation list used for printing the op during
  logging.

  Change-Id: Icac9dce6bf1c087ab2aace9953e2af3a0fb81be6
  Signed-off-by: Avra Sengupta <asengupt@redhat.com>
* snapshot: Merge conflict resolution (shishir gowda, 2013-11-15, 1 file, -5/+6)

  Signed-off-by: shishir gowda <sgowda@redhat.com>
* mgmt/glusterd: store for snapshot description (shishir gowda, 2013-11-15, 1 file, -2/+35)

  Change-Id: I0ba50ba2963edf8d890a2dc78d48d42db7f71ae2
  Signed-off-by: shishir gowda <sgowda@redhat.com>
* snapshot: Snapshot restore (Rajesh Joseph, 2013-11-15, 1 file, -1/+1)

  GL-31: Ability to restore snapshot.

  Implemented snapshot restore for thin logical volumes. Snapshot restore
  for CG is not yet tested. Snapshot restore of a volume was tested by
  changing the snapshot create process to create a thick snapshot, because
  the --merge option needed to restore a thin volume does not work in the
  latest kernel.

  Change-Id: Ia3ded7e6c4da5957a74e269a25ba3200e6fb2d8b
  Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
* mgmt/glusterd: save snapshot config values in store (shishir gowda, 2013-11-15, 1 file, -1/+71)

  Change-Id: Ia755e5c4af84827cc9b8876054cc48cfdc598876
  Signed-off-by: shishir gowda <sgowda@redhat.com>
* mgmt/glusterd: store option to update only snap_list.info (shishir gowda, 2013-11-15, 1 file, -0/+51)

  Change-Id: I16b17ca60b5f9b34b7d238d8a3424a3b7a1dc435
  Signed-off-by: shishir gowda <sgowda@redhat.com>
* mgmt/glusterd: handle snap volume store and actual volume store separately (Raghavendra Bhat, 2013-11-15, 1 file, -14/+35)

  Change-Id: I8b88fe94d0f9ee1089cafdda037abcf2f7a180ca
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
* mgmt/glusterd: Snapstore fixes to make info file persistent (shishir gowda, 2013-11-15, 1 file, -6/+11)

  Change-Id: I30cbbeb135c2d0a780e9e414ac0a96739e25647b
  Signed-off-by: shishir gowda <sgowda@redhat.com>
* mgmt/glusterd: store for CG (shishir gowda, 2013-11-15, 1 file, -1/+385)

  Change-Id: I6ec888a5553ad29ded032c02c80dd940b2aae007
  Signed-off-by: shishir gowda <sgowda@redhat.com>
* mgmt/glusterd: snapshot create command (Raghavendra Bhat, 2013-11-15, 1 file, -8/+24)

  This is still a work in progress. As of now, these things are done:
  * Take the snapshot of the backend brick
  * Create the new volume for the snapshot
  * Create the brick and the client volfiles
  * Store the snapshot-related info in /var/lib/glusterd
  * Create the snap object representing the snapshot

  TODO: Start the brick processes for the snapshot.

  Change-Id: I26fbb0f8e5cf004d4c1dbca51819bab1cd1bac15
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
* mgmt/glusterd: Store for snapshot (shishir gowda, 2013-11-15, 1 file, -9/+595)

  Introduced a new store for storing the snapshot list for a given volume:

  $GLUSTERD_INSTALL_PATH/vols/<volname>/snap_list.info
  $GLUSTERD_INSTALL_PATH/vols/<volname>/snaps/
  $GLUSTERD_INSTALL_PATH/vols/<volname>/snaps/<snap-name>/info    <- snapshot volume info
  $GLUSTERD_INSTALL_PATH/vols/<volname>/snaps/<snap-name>/bricks  <- snapshot volume brick dir
  $GLUSTERD_INSTALL_PATH/vols/<volname>/snaps/<snap-name>/bricks/<infos>  <- snapshot volume brick info files

  store delete options

  TODO -
  $GLUSTERD_INSTALL_PATH/CG/               <- placeholder for all CGs
  .../CG/<cg-name>/info                    <- per-CG information placeholder

  Change-Id: I1f9fd8ff7cc0682d05b33965736a43dca6adb3e9
  Signed-off-by: shishir gowda <sgowda@redhat.com>
* bd: posix/multi-brick support to BD xlator (M. Mohan Kumar, 2013-11-13, 1 file, -0/+20)

  The current BD xlator (block backend) has a few limitations:
  * Creation of directories not supported
  * Supports only a single brick
  * Does not use extended attributes (and client gfid) like the posix xlator
  * Creation of special files (symbolic links, device nodes etc.) not supported

  The basic limitation of not allowing directory creation blocks oVirt/VDSM
  from consuming the BD xlator as part of a Gluster domain, since VDSM
  creates multi-level directories when GlusterFS is used as the storage
  backend for storing VM images.

  To overcome these limitations a new BD xlator with the following
  improvements is suggested:
  * A new hybrid BD xlator that handles both regular files and block device
    files
  * The volume will have both POSIX and BD bricks. Regular files are created
    on POSIX bricks, block devices are created on the BD brick (VG)
  * The BD xlator leverages the existing POSIX xlator for most POSIX calls
    and hence sits above the POSIX xlator
  * A block device file is differentiated from a regular file by an extended
    attribute
  * The xattr 'user.glusterfs.bd' (BD_XATTR) maps a posix file to a Logical
    Volume (LV)
  * When a client sends a request to set BD_XATTR on a posix file, a new LV
    is created and mapped to the posix file. So every block device has a
    representative file in the POSIX brick with 'user.glusterfs.bd'
    (BD_XATTR) set
  * Thereafter, all operations on this file result in LV-related operations.
    For example, opening a file that has BD_XATTR set opens the LV block
    device, and reading it reads the corresponding LV block device

  When the BD xlator gets a request to set BD_XATTR via a setxattr call, it
  creates an LV and places information about this LV in the xattr of the
  posix file. The xattr "user.glusterfs.bd" identifies a posix file that is
  mapped to a BD.

  Usage:

  Server side:
  [root@host1 ~]# gluster volume create bdvol host1:/storage/vg1_info?vg1 host2:/storage/vg2_info?vg2
  This creates a distributed gluster volume 'bdvol' with Volume Group vg1
  using posix brick /storage/vg1_info on host1 and Volume Group vg2 using
  /storage/vg2_info on host2.
  [root@host1 ~]# gluster volume start bdvol

  Client side:
  [root@node ~]# mount -t glusterfs host1:/bdvol /media
  [root@node ~]# touch /media/posix
  This creates the regular posix file 'posix' in either the host1:/vg1 or
  host2:/vg2 brick.
  [root@node ~]# mkdir /media/image
  [root@node ~]# touch /media/image/lv1
  This also creates the regular posix file 'lv1' in either the host1:/vg1 or
  host2:/vg2 brick.
  [root@node ~]# setfattr -n "user.glusterfs.bd" -v "lv" /media/image/lv1
  The above setxattr creates a new LV in the corresponding brick's VG and
  sets 'user.glusterfs.bd' with the value 'lv:<default-extent-size'.
  [root@node ~]# truncate -s5G /media/image/lv1
  This resizes the LV 'lv1' to 5G.

  The new BD xlator code is placed in the xlators/storage/bd directory.

  A volume-uuid is also added to the VG so that the same VG can't be used
  for other bricks/volumes. After deleting a gluster volume, one has to
  manually remove the associated tag using
  vgchange <vg-name> --deltag <trusted.glusterfs.volume-id:<volume-id>>

  Changes from previous version V5:
  * Removed support for delayed deleting of LVs

  Changes from previous version V4:
  * Consolidated the patches
  * Removed usage of BD_XATTR_SIZE and consolidated it in BD_XATTR
  Changes from previous version V3:
  * Added support in FUSE to support full/linked clone
  * Added support to merge snapshots and provide information about origin
  * bd_map xlator removed
  * iatt structure used in inode_ctx; iatt is cached and updated during
    fsync/flush
  * aio support
  * Type and capabilities of the volume are exported through getxattr

  Changes from version 2:
  * Used inode_context for caching the BD size and for checking whether a
    loc/fd is a BD or not
  * Added GlusterFS server-offloaded copy and snapshot through the setfattr
    FOP; as part of this libgfapi is modified
  * BD xlator supports stripe
  * During unlinking, if an LV file is already opened it is added to a
    delete list and bd_del_thread tries to delete it from this list when
    the last reference to that file is closed

  Changes from previous version:
  * gfid is used as the name of the LV
  * ? is used to specify the VG name when creating a BD volume in volume
    create and add-brick: gluster volume create volname host:/path?vg
  * open-behind issue is fixed
  * A replicate brick can be added dynamically and LVs from the source
    brick are replicated to the destination brick
  * A distribute brick can be added dynamically and the rebalance operation
    distributes existing LVs/files to the new brick
  * Thin provisioning support added
  * bd_map xlator support retained
  * setfattr -n user.glusterfs.bd -v "lv" creates a regular LV and
    setfattr -n user.glusterfs.bd -v "thin" creates a thin LV
  * Capability and backend information added to gluster volume info
    (and --xml) so that management tools can exploit the BD xlator
  * Tracing support for the bd xlator added

  TODO:
  * Add support to display snapshots for a given LV
  * Display the posix filename for list-origin instead of the gfid

  Change-Id: I00d32dfbab3b7c806e0841515c86c3aa519332f2
  BUG: 1028672
  Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
  Reviewed-on: http://review.gluster.org/4809
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* bd_map: Remove bd_map xlator (M. Mohan Kumar, 2013-11-13, 1 file, -11/+0)

  Remove bd_map xlator and CLI related changes.

  Change-Id: If7086205df1907127c1a1fa4ba603f1c48421d09
  BUG: 1028672
  Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
  Reviewed-on: http://review.gluster.org/5747
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* mgmt/glusterd: add option to specify a different base-port (Kaleb S. KEITHLEY, 2013-10-31, 1 file, -4/+3)

  This is (arguably) a hack to work around a bug in libvirt, which is not
  well behaved with respect to using TCP ports in the unreserved space
  between 49152-65535 (see RFC 6335).

  Normally glusterd starts and binds to the first available port in that
  range, usually 49152. libvirt's live migration also tries to use ports
  in this range, but has no fallback to use (an)other port(s) when the one
  it wants is already in use.

  Change-Id: Id8fe35c08b6ce4f268d46804bbb6dddab7a6b7bb
  BUG: 1018178
  Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
  Reviewed-on: http://review.gluster.org/6210
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Calculate volume op-versions only on set/reset (Kaushal M, 2013-09-29, 1 file, -4/+20)

  The volume op-versions are calculated during a volume set/reset, when
  reading a volume from disk, and when importing a volume during probe or
  volume sync. The calculation of a volume's op-version depends on the
  cluster's op-version, as some features are enabled automatically
  depending on it. We also don't store the volume op-versions persistently
  and don't export them during sync. Because of this, inconsistencies can
  arise between volumes on different peers. One such case:

  Consider a cluster made up of 3 peers P1, P2 and P3, operating at
  op-version N. The cluster has two volumes V1 and V2 with volume
  op-version N (a volume op-version cannot be greater than the cluster
  op-version). We have,
   Cluster op-version = N
   V1 op-version = N
   V2 op-version = N

  A set operation on V1 causes the cluster op-version to be bumped up to
  N+1. Assume that some features are automatically enabled at op-version
  N+1. The op-version of V2 remains at N, as no operation has been
  performed on it. So,
   Cluster op-version = N+1
   V1 op-version = N+1
   V2 op-version = N

  Now we probe a new peer P4. On the new peer we will have the following
  op-versions,
   Cluster op-version = N+1
   V1 op-version = N+1
   V2 op-version = N+1

  This happens because we don't send volume op-versions during the sync
  after probe: P4 freshly calculates the op-version of V2 as N+1 (assuming
  features have been auto-enabled because the cluster op-version is N+1).

  Another case is when glusterd on a peer restarts. Assume P3 was
  restarted; glusterd recalculates the volume op-versions during the
  restore state. Again, the op-version of V2 is calculated as N+1 assuming
  auto-enabled features. This leads to an inconsistency between the
  in-memory and on-disk representations of the volume, as glusterd assumes
  the volume contains auto-enabled features but the volfiles don't contain
  them, since they were not regenerated.

  These kinds of issues are solved by calculating the volume op-version
  only when features are enabled or disabled (i.e. during volume
  set/reset), persisting the volume op-versions, and exporting/importing
  them.

  Change-Id: I52de0668c92628622e85f4588fb28829a7231132
  BUG: 1005043
  Signed-off-by: Kaushal M <kaushal@redhat.com>
  Reviewed-on: http://review.gluster.org/5568
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Move certain logs into 'DEBUG' level (Harshavardhana, 2013-08-18, 1 file, -3/+3)

  Confusing "Error" messages in logs can cause user panic and false
  positives; avoid them as necessary in the future.

  Change-Id: I906c64eea879b19a8db099c89d1d7f874e5530db
  BUG: 995784
  Signed-off-by: Harshavardhana <harsha@harshavardhana.net>
  Reviewed-on: http://review.gluster.org/5555
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* mgmt/glusterd: Fix a minor typo. (Vijay Bellur, 2013-07-31, 1 file, -1/+1)

  Thanks to Patrick Matthäi <pmatthaei@debian.org> for the patch.

  Signed-off-by: Vijay Bellur <vbellur@redhat.com>
  Change-Id: I59da74298894ccc2ab30967ffe44cc844aa73f82
  BUG: 814534
  Reviewed-on: http://review.gluster.org/5436
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Tested-by: Anand Avati <avati@redhat.com>
* store: move glusterd_store functions from mgmt/glusterd to libglusterfs (Niels de Vos, 2013-06-20, 1 file, -821/+169)

  Making the glusterd_store_* functions re-usable will help with future
  changes that need to read/write lists of items.

  BUG: 904065
  Change-Id: I99fb8eced76d12d5a254567eccff9790b43d8da3
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: http://review.gluster.org/4676
  Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Set op-version on startup based on install status (Kaushal M, 2013-05-12, 1 file, -5/+23)

  If the current installation of glusterfs doesn't have a stored op-version
  and is,
   a. a fresh install, then set the op-version to the maximum;
   b. an upgrade from a release which didn't have op-version support, set
      it to the minimum.

  The install status is detected using the peer-uuid. If both peer-uuid and
  op-version are absent from the store, the installation is fresh. If
  peer-uuid is present but op-version is absent, the installation has been
  upgraded from a version which didn't support op-versions.

  Setting the initial op-version this way ensures that,
   a. features are not enabled accidentally during upgrades;
   b. a fresh install starts with all possible features enabled.

  Change-Id: I52aed9ebeb5d3823c81e65f739a78a924ecf7e12
  BUG: 954256
  Signed-off-by: Kaushal M <kaushal@redhat.com>
  Reviewed-on: http://review.gluster.org/4867
  Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
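For illustration, a minimal C sketch of the startup decision described above; the helper and the version constants are illustrative, not glusterd's actual code or values.

    #include <stdint.h>

    #define OP_VERSION_MIN 1   /* illustrative values, not the real constants */
    #define OP_VERSION_MAX 4

    static uint32_t
    initial_op_version (int have_stored_op_version, uint32_t stored_op_version,
                        int have_peer_uuid)
    {
        if (have_stored_op_version)
            return stored_op_version;   /* normal restart: keep what is stored */

        if (!have_peer_uuid)
            return OP_VERSION_MAX;      /* fresh install: all features enabled */

        return OP_VERSION_MIN;          /* upgrade from a pre-op-version release */
    }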
* glusterd: Introduce volume op-versions (Kaushal M, 2013-04-26, 1 file, -0/+2)

  Each volume is now associated with two op-versions:
  * op_version - the op-version of the highest op-versioned feature enabled
  * client_op_version - the op-version of the highest op-versioned feature
    enabled which affects the clients only

  These two op-versions are generated dynamically and kept updated at
  runtime. Glusterd now uses the respective volume's client-op-version
  during getspec requests. To achieve this, a new boolean field,
  client_option, is introduced in the vme table; it tells whether an
  option is a client-side option.

  Change-Id: I12c83b1dd29ab506026efd50d448cebbcee53c27
  BUG: 907311
  Signed-off-by: Kaushal M <kaushal@redhat.com>
  Reviewed-on: http://review.gluster.org/4584
  Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
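For illustration, a minimal C sketch of how such a calculation could look, using a hypothetical option table in place of glusterd's vme table; the struct and function names are assumptions, not the actual glusterd API.

    #include <stddef.h>
    #include <stdint.h>

    struct vol_option {
        const char *key;
        uint32_t    op_version;     /* op-version that introduced the option */
        int         client_option;  /* non-zero if it affects clients only   */
        int         enabled;        /* non-zero if currently set on the volume */
    };

    static void
    calc_volume_op_versions (const struct vol_option *opts, size_t n,
                             uint32_t *op_version, uint32_t *client_op_version)
    {
        size_t i;

        *op_version = 1;          /* baseline op-version */
        *client_op_version = 1;

        for (i = 0; i < n; i++) {
            if (!opts[i].enabled)
                continue;
            if (opts[i].op_version > *op_version)
                *op_version = opts[i].op_version;
            if (opts[i].client_option &&
                opts[i].op_version > *client_op_version)
                *client_op_version = opts[i].op_version;
        }
    }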
* glusterd: Mark vol as deleted by renaming voldir before cleaning up the store (Krutika Dhananjay, 2013-03-11, 1 file, -41/+70)

  PROBLEM: During 'volume delete', when glusterd fails to erase all
  information about a volume from the backend store (for instance because
  rmdir() failed on non-empty directories), not only does volume delete
  fail on that node, but also subsequent attempts to restart glusterd fail
  because the volume store is left in an inconsistent state.

  FIX: Rename the volume directory path to a new location,
  <working-dir>/trash/<volume-id>.deleted, and then go on to clean up its
  contents. The volume is considered deleted once rename() succeeds,
  irrespective of whether the cleanup succeeds or not.

  Change-Id: Iaf18e1684f0b101808bd5e1cd53a5d55790541a8
  BUG: 889630
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/4639
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Reviewed-by: Kaushal M <kaushal@redhat.com>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
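For illustration, a minimal C sketch of the rename-then-clean approach; this is not the glusterd code, and the recursive cleanup via nftw() is simplified and best-effort.

    #define _XOPEN_SOURCE 500
    #include <ftw.h>
    #include <stdio.h>
    #include <sys/stat.h>

    static int
    rm_cb (const char *path, const struct stat *sb, int type, struct FTW *ftwbuf)
    {
        (void) sb; (void) type; (void) ftwbuf;
        return remove (path);
    }

    static int
    delete_volume_dir (const char *workdir, const char *voldir,
                       const char *volume_id)
    {
        char trash[4096];

        snprintf (trash, sizeof (trash), "%s/trash/%s.deleted",
                  workdir, volume_id);

        /* The volume counts as deleted as soon as the rename succeeds. */
        if (rename (voldir, trash) != 0)
            return -1;

        /* Cleaning up the renamed directory is best-effort; a failure here
         * no longer leaves the live volume store inconsistent. */
        (void) nftw (trash, rm_cb, 16, FTW_DEPTH | FTW_PHYS);
        return 0;
    }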
* glusterd: Added the validation function for subvols-per-directory (Avra Sengupta, 2013-02-28, 1 file, -2/+5)

  Change-Id: Ie2259023b9001311a2032792639c3093054f6750
  BUG: 896431
  Signed-off-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-on: http://review.gluster.org/4552
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd: log changes in volume create and delete codepath (Krutika Dhananjay, 2013-02-04, 1 file, -50/+81)

  Making log changes involving two commands, as they both share sections
  of code (like the part where the volume metadata is cleaned up: in vol
  delete in case of success, and in vol create in case of failure).

  * Most of the changes are of the 's/THIS/this' kind.
  * Changed some of the log messages to give as much information as is
    available in case of failure.
  * Changed log levels in some of the log messages.

  Change-Id: I10242511fe9400a07ab04717464d748d9172dd85
  BUG: 812356
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/4462
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: log changes in volume stop (and in op sm codepath) (Krutika Dhananjay, 2013-01-18, 1 file, -4/+7)

  This patch makes log changes mostly in the op state machine, as well as
  in the volume stop codepath of glusterd. Changes made:

  * Moved the log level from INFO to DEBUG for log messages on the various
    state transitions within a transaction, for example messages of the
    following kind:
     a. "Sent op req to <n> peers"
     b. "Received LOCK from uuid: <peer-uuid>", etc.
  * Changed some of the log messages to give as much information as is
    available in case of failure.
  * Added logs to identify on which machine lock/stage/commit failed.
  * Quite a few s/THIS/this changes.

  Also, with this change, log changes in all other volume ops should
  (hopefully) boil down to modifying the respective logs in handler, stage
  and commit (and brick ops in some cases).

  Change-Id: I2b8443042b07fb41a1d12033741f7e156aa6b3da
  BUG: 812356
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/4382
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Kaushal M <kaushal@redhat.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Made dst brick's port info available to all peers (Krishnan Parthasarathi, 2013-01-04, 1 file, -0/+32)

  Change-Id: I1f65743a31d95013fdf22cded91c314e9934a3a9
  BUG: 816915
  Signed-off-by: Krishnan Parthasarathi <kp@gluster.com>
  Reviewed-on: http://review.gluster.org/3275
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd, cli: Task ids for async tasks (Kaushal M, 2012-12-19, 1 file, -18/+55)

  This patch introduces task-ids for async tasks like rebalance,
  remove-brick and replace-brick. An id is generated for each task when it
  is started and is displayed to the user in the cli output. The status of
  running tasks is also included in the output of "volume status" along
  with its id, so that a user can easily track the progress of an async
  task.

  Also,
  * added tests for this feature into the regression test suite.
  * added a python script for creating files, 'create-files.py', courtesy
    of Vijaykumar Koppad (vkoppad@redhat.com), into the test suite.

  This patch reverts the revert commit
  698deb33d731df6de84da8ae8ee4045e1543a168.

  BUG: 857330
  Change-Id: Id43d7cb629a38f47f733fbc18cb4c5f2f0327c7a
  Signed-off-by: Kaushal M <kaushal@redhat.com>
  Reviewed-on: http://review.gluster.org/4294
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>