author raghavendra talur <raghavendra.talur@gmail.com> 2015-08-20 15:09:31 +0530
committer Humble Devassy Chirammal <humble.devassy@gmail.com> 2015-08-31 02:27:22 -0700
commit 9e9e3c5620882d2f769694996ff4d7e0cf36cc2b
tree 3a00cbd0cc24eb7df3de9b2eeeb8d42ee9175f88 /Feature Planning/GlusterFS 3.5/AFR CLI enhancements.md
parent f6055cdb4dedde576ed8ec55a13814a69dceefdc
Create basic directory structure
All new features specs go into in_progress directory. Once signed off, it should be moved to done directory. For now, This change moves all the Gluster 4.0 feature specs to in_progress. All other specs are under done/release-version. More cleanup required will be done incrementally.

Change-Id: Id272d301ba8c434cbf7a9a966ceba05fe63b230d
BUG: 1206539
Signed-off-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-on: http://review.gluster.org/11969
Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com>
Reviewed-by: Prashanth Pai <ppai@redhat.com>
Tested-by: Humble Devassy Chirammal <humble.devassy@gmail.com>
Diffstat (limited to 'Feature Planning/GlusterFS 3.5/AFR CLI enhancements.md')
-rw-r--r-- Feature Planning/GlusterFS 3.5/AFR CLI enhancements.md | 204
1 files changed, 0 insertions, 204 deletions
diff --git a/Feature Planning/GlusterFS 3.5/AFR CLI enhancements.md b/Feature Planning/GlusterFS 3.5/AFR CLI enhancements.md
deleted file mode 100644
index 88f4980..0000000
--- a/Feature Planning/GlusterFS 3.5/AFR CLI enhancements.md
+++ /dev/null
@@ -1,204 +0,0 @@
-Feature
--------
-
-AFR CLI enhancements
-
-SUMMARY
--------
-
-Presently, AFR reporting via the CLI has many problems in how it
-represents logs, because of which users may not be able to use the
-data effectively. This feature corrects these problems and provides
-a coherent mechanism to present heal status, information, and the
-associated logs.
-
-Owners
-------
-
-Venkatesh Somayajulu
-Raghavan
-
-Current status
---------------
-
-There are many bugs related to this which indicate the current status
-and why these requirements are needed.
-
-1) 924062 - gluster volume heal info shows only gfids in some cases and
-sometimes names. This is very confusing for the end user.
-
-2) 852294 - gluster volume heal info hangs/crashes when there is a
-large number of entries to be healed.
-
-3) 883698 - when self heal daemon is turned off, heal info does not
-show any output. But healing can happen because of lookups from IO path.
-Hence list of entries to be healed still needs to be shown.
-
-4) 921025 - directories are not reported when list of split brain
-entries needs to be displayed.
-
-5) 981185 - when self heal daemon process is offline, volume heal info
-gives error as "staging failure"
-
-6) 952084 - We need a command to resolve files in split brain state.
-
-7) 986309 - We need to report source information for files which got
-healed during a self heal session.
-
-8) 986317 - Sometimes list of files to get healed also includes files
-to which IO is being done since the entries for these files could be in
-the xattrop directory. This could be confusing for the user.
-
-There is a master bug 926044 that sums up most of the above problems. It
-does give the QA perspective of the current representation out of the
-present reporting infrastructure.
-
-Detailed Description
---------------------
-
-1) One common thread among all the above complaints is that the
-information presented to the user is <B>FUD</B> because of the following
-reasons:
-
-(a) Split brain itself is a scary scenario especially with VMs.
-(b) The data that we present to the users cannot be used in a stable
- manner for them to get to the list of these files. <I>For ex:</I> we
- need to give mechanisms by which they can automate the resolution
- of split brain.
-(c) The logs that are generated are all the scarier since we
- see repetition of some error lines running into hundreds of lines.
- Our mailing lists are filled with such emails from end users.
-
-Any data is useless unless it is associated with an event. For self
-heal, the event that leads to self heal is the loss of connectivity to a
-brick from a client. So all healing info and especially split brain
-should be associated with such events.
-
-The following is hence the proposed mechanism:
-
-(a) Every loss of a brick from client's perspective is logged and
- available via some ID. The information provides the time from when
- the brick went down to when it came up. It should also report
- the number of IO transactions (modifies) that happened during this
- event.
-(b) The list of these events is available via some CLI command. The
- actual command needs to be detailed as part of this feature.
-(c) All volume info commands regarding list of files to be healed,
- files healed and split brain files should be associated with this
- event(s).
-
-2) Provide a mechanism to show statistics at a volume and replica group
-level. It should show the number of files to be healed and number of
-split brain files at both the volume and replica group level.
-
-3) Provide a mechanism to show per volume list of files to be
-healed/files healed/split brain files.
-
-This should have the following information:
-
-(a) File name
-(b) Bricks location
-(c) Event association (brick going down)
-(d) Source
-(e) Sink
-
-4) Self heal crawl statistics - Introduce new CLI commands for showing
-more information on self heal crawl per volume.
-
-(a) Display why a self heal crawl ran (timeouts, brick coming up)
-(b) Start time and end time
-(c) Number of files it attempted to heal
-(d) Location of the self heal daemon
-
-5) Scale the logging infrastructure to handle the huge list of files
-that needs to be displayed as part of the logging.
-
-(a) Right now the system crashes or hangs in case of a high number
- of files.
-(b) It causes CLI timeouts arbitrarily. The latencies involved in
- the logging have to be studied (profiled) and mechanisms to
- circumvent them have to be introduced.
-(c) All files are displayed on the output. Have a better way of
- representing them.
-
-Options are:
-
-(a) Maybe write to a glusterd log file or have a separate directory
- for afr heal logs.
-(b) Have a status kind of command. This will display the current
- status of the log building and maybe have a batched way of
- representing when there is a huge list.
-
-6) We should provide a mechanism where the user can heal split brain by
-some pre-established policies:
-
-(a) Let the system figure out the latest files (assuming all nodes
- are in time sync) and choose the copies that have the latest time.
-(b) Choose one particular brick as the source for split brain and
- heal all split brains from this brick.
-(c) Just remove the split brain information from changelog. We leave
- the exercise to the user to repair split brain, wherein they would
- rewrite to the split brained files. (Right now the user is forced to
- remove xattrs manually for this step.)
-
-Benefits to GlusterFS
---------------------
-
-Makes the end user more aware of healing status and provides statistics.
-
-Scope
------
-
-6.1. Nature of proposed change
-
-Modification to AFR and CLI and glusterd code
-
-6.2. Implications on manageability
-
-New CLI commands to be added. Existing commands to be improved.
-
-6.3. Implications on presentation layer
-
-N/A
-
-6.4. Implications on persistence layer
-
-N/A
-
-6.5. Implications on 'GlusterFS' backend
-
-N/A
-
-6.6. Modification to GlusterFS metadata
-
-N/A
-
-6.7. Implications on 'glusterd'
-
-Changes for healing specific commands will be introduced.
-
-How To Test
------------
-
-See documentation section
-
-User Experience
----------------
-
-*Changes in CLI, effect on User experience...*
-
-Documentation
--------------
-
-<http://review.gluster.org/#/c/7792/1/doc/features/afr-statistics.md>
-
-Status
-------
-
-Patches :
-
-<http://review.gluster.org/6044> <http://review.gluster.org/4790>
-
-Status:
-
-Merged
\ No newline at end of file