Diffstat (limited to 'done/GlusterFS 3.6')
-rw-r--r--  done/GlusterFS 3.6/Better Logging.md                        348
-rw-r--r--  done/GlusterFS 3.6/Better Peer Identification.md            172
-rw-r--r--  done/GlusterFS 3.6/Gluster User Serviceable Snapshots.md     39
-rw-r--r--  done/GlusterFS 3.6/Gluster Volume Snapshot.md                354
-rw-r--r--  done/GlusterFS 3.6/New Style Replication.md                  230
-rw-r--r--  done/GlusterFS 3.6/Persistent AFR Changelog xattributes.md   178
-rw-r--r--  done/GlusterFS 3.6/RDMA Improvements.md                      101
-rw-r--r--  done/GlusterFS 3.6/Server-side Barrier feature.md            213
-rw-r--r--  done/GlusterFS 3.6/Thousand Node Gluster.md                  150
-rw-r--r--  done/GlusterFS 3.6/afrv2.md                                  244
-rw-r--r--  done/GlusterFS 3.6/better-ssl.md                             137
-rw-r--r--  done/GlusterFS 3.6/disperse.md                               142
-rw-r--r--  done/GlusterFS 3.6/glusterd volume locks.md                   48
-rw-r--r--  done/GlusterFS 3.6/heterogeneous-bricks.md                   136
-rw-r--r--  done/GlusterFS 3.6/index.md                                   96
15 files changed, 2588 insertions, 0 deletions
diff --git a/done/GlusterFS 3.6/Better Logging.md b/done/GlusterFS 3.6/Better Logging.md
new file mode 100644
index 0000000..6aad602
--- /dev/null
+++ b/done/GlusterFS 3.6/Better Logging.md
@@ -0,0 +1,348 @@
+Feature
+-------
+
+Gluster logging enhancements to support message IDs per message
+
+Summary
+-------
+
+Enhance gluster logging to provide the following features (each
+sub-feature is abbreviated below as SF):
+
+- SF1: Add message IDs to message
+
+- SF2: Standardize error num reporting across messages
+
+- SF3: Enable repetitive message suppression in logs
+
+- SF4: Log location and hierarchy standardization (in case anything is
+further required here, analysis pending)
+
+- SF5: Enable per sub-module logging level configuration
+
+- SF6: Enable logging to other frameworks, than just the current gluster
+logs
+
+- SF7: Generate a catalogue of these messages, with message ID, message,
+reason for occurrence, and recovery/troubleshooting steps.
+
+Owners
+------
+
+Balamurugan Arumugam <barumuga@redhat.com>
+Krishnan Parthasarathi <kparthas@redhat.com>
+Krutika Dhananjay <kdhananj@redhat.com>
+Shyamsundar Ranganathan <srangana@redhat.com>
+
+Current status
+--------------
+
+### Existing infrastructure:
+
+Currently gf\_logXXX exists as an infrastructure API for all logging
+related needs. This (typically) takes the form,
+
+gf\_log(dom, levl, fmt...)
+
+where,
+
+    dom: Open format string usually the xlator name, or "cli" or volume name etc.
+    levl: One of, GF_LOG_EMERG, GF_LOG_ALERT, GF_LOG_CRITICAL, GF_LOG_ERROR, GF_LOG_WARNING, GF_LOG_NOTICE, GF_LOG_INFO, GF_LOG_DEBUG, GF_LOG_TRACE
+    fmt: the actual message string, followed by the required arguments in the string
+
+The log initialization happens through,
+
+gf\_log\_init (void \*data, const char \*filename, const char \*ident)
+
+where,
+
+    data: glusterfs_ctx_t, largely unused in logging other than the required FILE and mutex fields
+    filename: file name to log to
+    ident: Like syslog ident parameter, largely unused
+
+The above infrastructure leads to logs of type, (sample extraction from
+nfs.log)
+
+     [2013-12-08 14:17:17.603879] I [socket.c:3485:socket_init] 0-socket.ACL: SSL support is NOT enabled
+     [2013-12-08 14:17:17.603937] I [socket.c:3500:socket_init] 0-socket.ACL: using system polling thread
+     [2013-12-08 14:17:17.612128] I [nfs.c:934:init] 0-nfs: NFS service started
+     [2013-12-08 14:17:17.612383] I [dht-shared.c:311:dht_init_regex] 0-testvol-dht: using regex rsync-hash-regex = ^\.(.+)\.[^.]+$
+
+### Limitations/Issues in the infrastructure
+
+​1) Automated analysis of logs is currently based on the final message
+string. Tools that map log messages to related troubleshooting steps must
+parse that string intelligently, and the string may change between
+releases. It would be desirable to have message IDs so that such tools
+and troubleshooting options can key off a stable identifier instead.
+
+​2) The log message itself currently does not use the \_ident\_, which
+can help as we move to more common logging frameworks like journald,
+rsyslog (or syslog as the case may be).
+
+​3) errno is the primary identifier of errors across gluster, i.e., we do
+not have error codes in gluster and use errno values everywhere. The log
+messages currently do not lend themselves to standardization, such as
+printing the string equivalent of errno rather than the raw errno value,
+which \_could\_ be cryptic to administrators.
+
+​4) Typical logging infrastructures provide suppression (on a
+configurable basis) for repetitive messages to prevent log flooding;
+this is missing in the current infrastructure.
+
+​5) The current infrastructure cannot be used to control log levels at a
+per-xlator or per-sub-module level, as the \_dom\_ passed is a string
+that changes based on volume name, translator name, etc. It would be
+desirable to have a better module identification mechanism that can help
+with this feature.
+
+​6) Currently the entire logging infrastructure resides within gluster.
+It would be desirable in scaled situations to have centralized logging
+and monitoring solutions in place, to be able to better analyse and
+monitor the cluster health and take actions.
+
+This requires some form of pluggable logging framework that can be used
+within gluster to enable this possibility. Since the existing framework
+is used throughout gluster, only the configuration and logging.c need to
+change to enable logging to other frameworks (for example, the existing
+syslog plug that was provided).
+
+It would be desirable to enhance this to provide a more robust framework
+for future extensions to other frameworks. This is not a limitation of
+the current framework, so much as a re-factor to be able to switch
+logging frameworks with more ease.
+
+​7) Centralized logging in the future would need better identification
+strings from the various gluster processes and hosts, which are currently
+missing or suppressed in the logging infrastructure.
+
+Due to the nature of the enhancements proposed, we need to improve the
+current infrastructure for the stated needs and do some future proofing
+for newer messages that will be added.
+
+Detailed Description
+--------------------
+
+NOTE: Covering details for SF1, SF2, and partially SF3, SF5, SF6. SF4/7
+will be covered in later revisions/phases.
+
+### Logging API changes:
+
+​1) Change the logging API as follows,
+
+From: gf\_log(dom, levl, fmt...)
+
+To: gf\_msg(dom, levl, errnum, msgid, fmt...)
+
+Where:
+
+    dom: Open string as used in the current logging infrastructure (helps in backward compat)
+    levl: As in current logging infrastructure (current levels seem sufficient enough to not add more levels for better debuggability etc.)
+    <new fields>
+    msgid: A message identifier, unique to this message FMT string and possibly this invocation. (SF1, lending to SF3)
+    errnum: The errno that this message is generated for (with an implicit 0 meaning no error number per se with this message) (SF2)
+
+NOTE: Internally the gf\_msg would still be a macro that would add the
+\_\_FILE\_\_ \_\_LINE\_\_ \_\_FUNCTION\_\_ arguments
+
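+The following is a minimal, self-contained sketch of how such a macro
+could be wired; the backend function name and its body are illustrative
+assumptions, not the actual GlusterFS implementation:
+
+    #include <stdarg.h>
+    #include <stdio.h>
+    #include <string.h>
+
+    /* Illustrative backend: a real implementation would route to the
+     * configured framework (gluster log file, syslog, journald, ...). */
+    static void
+    _msg_backend (const char *dom, int levl, int errnum, int msgid,
+                  const char *file, int line, const char *func,
+                  const char *fmt, ...)
+    {
+            va_list ap;
+
+            fprintf (stderr, "[MSGID: %d] [%s:%d:%s] 0-%s: ",
+                     msgid, file, line, func, dom);
+            va_start (ap, fmt);
+            vfprintf (stderr, fmt, ap);
+            va_end (ap);
+            if (errnum)
+                    fprintf (stderr, " [%s]", strerror (errnum));
+            fprintf (stderr, "\n");
+    }
+
+    /* The macro adds __FILE__/__LINE__/__FUNCTION__ transparently, so a
+     * caller only passes dom, levl, errnum, msgid and the format string.
+     * A msgid macro that expands to "id, fmt" (as shown later in this
+     * document) also fits this signature. */
+    #define gf_msg(dom, levl, errnum, msgid, fmt, ...) \
+            _msg_backend (dom, levl, errnum, msgid, __FILE__, __LINE__, \
+                          __FUNCTION__, fmt, ##__VA_ARGS__)
+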
+​2) Enforce \_ident\_ in the logging initialization API, gf\_log\_init
+(void \*data, const char \*filename, const char \*ident)
+
+Where:
+
+ ident would be the identifier string like, nfs, <mountpoint>, brick-<brick-name>, cli, glusterd, as is the case with the log file name that is generated today (lending to SF6)
+
+#### What this achieves:
+
+With the above changes, we now have a message ID per message
+(\_msgid\_), location of the message in terms of which component
+(\_dom\_) and which process (\_ident\_). The further identification of
+the message location in terms of host (ip/name) can be done in the
+framework, when centralized logging infrastructure is introduced.
+
+#### Log message changes:
+
+With the above changes to the API the log message can now appear in a
+compatibility mode to adhere to current logging format, or be presented
+as follows,
+
+log invoked as: gf\_msg(dom, levl, ENOTSUP, msgidX)
+
+Example: gf\_msg ("logchecks", GF\_LOG\_CRITICAL, 22, logchecks\_msg\_4,
+42, "Forty-Two", 42);
+
+Where: logchecks\_msg\_4 (GLFS\_COMP\_BASE + 4), "Critical: Format
+testing: %d:%s:%x"
+
+​1) Gluster logging framework (logged as)
+
+ [2014-02-17 08:52:28.038267] I [MSGID: 1002] [logchecks.c:44:go_log] 0-logchecks: Informational: Format testing: 42:Forty-Two:2a [Invalid argument]
+
+​2) syslog (passed as)
+
+ Feb 17 14:17:42 somari logchecks[26205]: [MSGID: 1002] [logchecks.c:44:go_log] 0-logchecks: Informational: Format testing: 42:Forty-Two:2a [Invalid argument]
+
+​3) journald (passed as)
+
+    sd_journal_send("MESSAGE=<vasprintf(dom, msgid(fmt))>",
+                        "MESSAGE_ID=msgid",
+                        "PRIORITY=levl",
+                        "CODE_FILE=<fname>", "CODE_LINE=<lnum>", "CODE_FUNC=<fnnam>",
+                        "ERRNO=errnum",
+                        "SYSLOG_IDENTIFIER=<ident>"
+                        NULL);
+
+​4) CEE (Common Event Expression) format string passed to any CEE
+consumer (say lumberjack)
+
+Based on generating an @CEE JSON string as per the specifications and
+passing it to the infrastructure in question.
+
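+As an illustration only, assuming the `@cee:` cookie convention used by
+Lumberjack-style consumers (the field names below are not a fixed
+GlusterFS schema), the message could be serialized roughly as follows:
+
+    #include <stdio.h>
+
+    /* Hypothetical helper: render one message as an @cee: JSON record. */
+    static int
+    format_cee (char *buf, size_t len, const char *ident, const char *dom,
+                int msgid, int errnum, const char *msg)
+    {
+            return snprintf (buf, len,
+                             "@cee: {\"msgid\": %d, \"ident\": \"%s\", "
+                             "\"dom\": \"%s\", \"errno\": %d, "
+                             "\"msg\": \"%s\"}",
+                             msgid, ident, dom, errnum, msg);
+    }
+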
+#### Message ID generation:
+
+​1) Some rules for message IDs
+
+- Every message, even if it uses the same message FMT, will have a
+unique message ID.
+- Changes to a specific message string will therefore not change its ID,
+and will not impact other locations in the code that use the same
+message FMT.
+
+​2) A glfs-message-id.h file would contain ranges per component, so that
+individual components can define their own messages without overlapping
+ranges.
+
+​3) <component>-message.h would contain something as follows,
+
+     #define GLFS_COMP_BASE         GLFS_MSGID_COMP_<component>
+     #define GLFS_NUM_MESSAGES       1
+     #define GLFS_MSGID_END          (GLFS_COMP_BASE + GLFS_NUM_MESSAGES + 1)
+     /* Messaged with message IDs */
+     #define glfs_msg_start_x GLFS_COMP_BASE, "Invalid: Start of messages"
+     /*------------*/
+     #define <component>_msg_1 (GLFS_COMP_BASE + 1), "Test message, replace with"\
+                        " original when using the template"
+     /*------------*/
+     #define glfs_msg_end_x GLFS_MSGID_END, "Invalid: End of messages"
+
+​5) Each call to gf\_msg hence would be,
+
+    gf_msg(dom, levl, errnum, glfs_msg_x, ...)
+
+#### Setting per xlator logging levels (SF5):
+
+short description to be elaborated later
+
+Leverage this-\>loglevel to override the global loglevel. This can be
+also configured from gluster CLI at runtime to change the log levels at
+a per xlator level for targeted debugging.
+
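+A rough sketch of the idea; the structure and field names here are
+illustrative stand-ins, not the actual xlator definitions:
+
+    /* Per-xlator override: if the xlator's loglevel is set, it wins over
+     * the process-wide level; otherwise fall back to the global level.
+     * Lower numeric values are more severe, as in the existing levels. */
+    #define EXAMPLE_LOG_UNSET (-1)
+
+    struct example_xlator {
+            int loglevel;   /* EXAMPLE_LOG_UNSET means "use global level" */
+    };
+
+    static int
+    should_log (struct example_xlator *this, int global_level, int msg_level)
+    {
+            int effective = (this && this->loglevel != EXAMPLE_LOG_UNSET)
+                            ? this->loglevel : global_level;
+
+            return msg_level <= effective;
+    }
+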
+#### Multiple log suppression (SF3):
+
+short description to be elaborated later
+
+​1) Save the message string as follows, Msg\_Object(msgid,
+msgstring(vasprintf(dom, fmt)), timestamp, repetitions)
+
+​2) On each message received by the logging infrastructure check the
+list of saved last few Msg\_Objects as follows,
+
+2.1) compare msgid and on success compare msgstring for a match, compare
+repetition tolerance time with current TS and saved TS in the
+Msg\_Object
+
+2.1.1) if tolerance is within limits, increment repetitions and do not
+print message
+
+2.1.2) if tolerance is outside limits, print repetition count for saved
+message (if any) and print the new message
+
+2.2) If none of the messages match the current message, knock off the
+oldest message in the list printing any repetition count message for the
+same, and stash new message into the list
+
+The key things to remember and act on here are to minimize string
+duplication for each message, and to keep the comparison quick (hence
+basing it on message IDs and errno to start with). A simplified sketch
+follows.
+
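+The sketch below is a simplified stand-in for the scheme above, assuming
+a fixed-size array of recent messages, a tolerance window in seconds, and
+plain C in place of the real list and timestamp types:
+
+    #include <string.h>
+    #include <time.h>
+
+    #define RECENT_MAX       4      /* "last few" saved messages       */
+    #define TOLERANCE_SECS   30     /* repetition tolerance window     */
+
+    struct msg_object {
+            int    msgid;
+            int    errnum;
+            char   msgstring[256];
+            time_t ts;
+            int    repetitions;
+    };
+
+    static struct msg_object recent[RECENT_MAX];
+    static int oldest;              /* slot to knock off when full */
+
+    /* Returns 1 if the message should be printed now, 0 if suppressed. */
+    static int
+    check_suppress (int msgid, int errnum, const char *msgstring, time_t now)
+    {
+            int i;
+
+            for (i = 0; i < RECENT_MAX; i++) {
+                    if (recent[i].msgid == msgid &&
+                        recent[i].errnum == errnum &&
+                        strcmp (recent[i].msgstring, msgstring) == 0) {
+                            if (now - recent[i].ts < TOLERANCE_SECS) {
+                                    recent[i].repetitions++;  /* suppress */
+                                    return 0;
+                            }
+                            /* outside tolerance: caller prints the saved
+                             * repetition count, then the message itself */
+                            recent[i].ts = now;
+                            recent[i].repetitions = 0;
+                            return 1;
+                    }
+            }
+            /* no match: knock off the oldest entry, stash the new message */
+            recent[oldest].msgid  = msgid;
+            recent[oldest].errnum = errnum;
+            strncpy (recent[oldest].msgstring, msgstring,
+                     sizeof (recent[oldest].msgstring) - 1);
+            recent[oldest].msgstring[sizeof (recent[oldest].msgstring) - 1] = '\0';
+            recent[oldest].ts = now;
+            recent[oldest].repetitions = 0;
+            oldest = (oldest + 1) % RECENT_MAX;
+            return 1;
+    }
+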
+#### Message catalogue (SF7):
+
+<short description to be elaborated later>
+
+The idea is to use Doxygen comments in each <component>-message.h to
+record, per message of consequence, information in various sections
+(message, reason for occurrence, recovery/troubleshooting steps), and
+later use Doxygen to publish this catalogue on a per-release basis.
+
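+For example, a message definition could carry a Doxygen comment along the
+following lines; the tag names and the sample message are only a
+suggested layout, not a fixed convention:
+
+    /*!
+     * @messageid  100004
+     * @diagnosis  A brick process could not be reached while the
+     *             operation was being performed.
+     * @recommendedaction  Check 'gluster volume status', restart the
+     *             brick if needed, and retry the operation.
+     */
+    #define example_msg_4 (GLFS_COMP_BASE + 4), "brick %s is unreachable"
+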
+Benefit to GlusterFS
+--------------------
+
+GlusterFS gains fixes for the limitations mentioned above, along with
+the benefits of easier automated log analysis.
+
+Scope
+-----
+
+### Nature of proposed change
+
+All gf\_logXXX function invocations would change to gf\_msgXXX
+invocations.
+
+### Implications on manageability
+
+None
+
+### Implications on presentation layer
+
+None
+
+### Implications on persistence layer
+
+None
+
+### Implications on 'GlusterFS' backend
+
+None
+
+### Modification to GlusterFS metadata
+
+None
+
+### Implications on 'glusterd'
+
+None
+
+How To Test
+-----------
+
+A separate test utility that exercises the various logs and formats will
+be provided, so that the functionality can be tested independently of
+GlusterFS.
+
+User Experience
+---------------
+
+Users will notice the changed logging format described above; the
+additional field of importance is the MSGID.
+
+Dependencies
+------------
+
+None
+
+Documentation
+-------------
+
+A logging.md will be added (or the existing one modified) to explain how
+a new component should use the new framework and generate messages with
+IDs.
+
+Status
+------
+
+In development (see, <http://review.gluster.org/#/c/6547/> )
+
+Comments and Discussion
+-----------------------
+
+<Follow here> \ No newline at end of file
diff --git a/done/GlusterFS 3.6/Better Peer Identification.md b/done/GlusterFS 3.6/Better Peer Identification.md
new file mode 100644
index 0000000..a8c6996
--- /dev/null
+++ b/done/GlusterFS 3.6/Better Peer Identification.md
@@ -0,0 +1,172 @@
+Feature
+-------
+
+**Better peer identification**
+
+Summary
+-------
+
+This proposal is regarding better identification of peers.
+
+Owners
+------
+
+Kaushal Madappa <kmadappa@redhat.com>
+
+Current status
+--------------
+
+Glusterd currently is inconsistent in the way it identifies peers. This
+causes problems when the same peer is referenced with different names in
+different gluster commands.
+
+Detailed Description
+--------------------
+
+Currently, the way we identify peers is not consistent all through the
+gluster code. We use uuids internally and hostnames externally.
+
+This setup works pretty well when all the peers are on a single network,
+have one address, and are referred to in all the gluster commands with
+the same address.
+
+But once we start mixing up addresses in the commands (IP, short names,
+FQDNs) and bring in multiple networks, we have problems.
+
+The problems were discussed in the following mailing list threads and
+some solutions were proposed.
+
+- How do we identify peers? [^1]
+- RFC - "Connection Groups" concept [^2]
+
+The solution to the multi-network problem is dependent on the solution
+to the peer identification problem. So it is good to target fixing the
+peer identification problem first, i.e., in 3.6, and take up the
+networks problem later.
+
+Benefit to GlusterFS
+--------------------
+
+Sanity. It will be great to have all internal identification of peers
+happen through a UUID, translated into a host/IP only at the most
+superficial layer.
+
+Scope
+-----
+
+### Nature of proposed change
+
+The following changes will be done in Glusterd to improve peer
+identification.
+
+1. Peerinfo struct will be extended to have a list of associated
+ hostnames/addresses, instead of a single hostname as it is
+ currently. The import/export and store/restore functions will be
+ changed to handle this. CLI will be updated to show this list of
+ addresses in peer status and pool list commands.
+2. Peer probe will be changed to append an address to the peerinfo
+ address list, when we observe that the given address belongs to an
+ existing peer.
+3. Have a new API for translation between hostname/addresses into
+ UUIDs. This new API will be used in all places where
+ hostnames/addresses were being validated, including peer probe, peer
+ detach, volume create, add-brick, remove-brick etc.
+4. A new command - 'gluster peer add-address <existing> <new-address>'
+ - which appends to the address list will be implemented if time
+ permits.
+5. A new command - 'gluster peer rename <existing> <new>' - which will
+ rename all occurrences of a peer with the newly given name will be
+ implemented if time permits.
+
+Changes 1-3 are the base for the other changes and will be the primary
+deliverables for this feature; a rough sketch of the extended peer
+structure follows.
+
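+A minimal sketch of what the extended structure and the translation API
+could look like; the type and field names are illustrative, not the
+actual glusterd code:
+
+    #include <string.h>
+
+    #define ADDR_MAX      8
+    #define UUID_STR_LEN  37
+
+    /* Peer with a list of known addresses instead of a single hostname. */
+    struct example_peerinfo {
+            char uuid[UUID_STR_LEN];        /* canonical identifier     */
+            char addresses[ADDR_MAX][256];  /* hostname, FQDN, IP, ...  */
+            int  addr_count;
+    };
+
+    /* Translate any known address of a peer into its UUID; this is the
+     * kind of API peer probe, peer detach, volume create, add-brick,
+     * remove-brick, etc. would all use for validation. */
+    static const char *
+    address_to_uuid (struct example_peerinfo *peers, int npeers,
+                     const char *address)
+    {
+            int i, j;
+
+            for (i = 0; i < npeers; i++)
+                    for (j = 0; j < peers[i].addr_count; j++)
+                            if (strcmp (peers[i].addresses[j], address) == 0)
+                                    return peers[i].uuid;
+            return NULL;    /* unknown address: may belong to a new peer */
+    }
+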
+### Implications on manageability
+
+The primary changes will bring about some changes to the CLI output of
+'peer status' and 'pool list' commands. The normal and XML outputs for
+these commands will contain a list of addresses for each peer, instead
+of a single hostname.
+
+Tools depending on the output of these commands will need to be updated.
+
+**TODO**: *Add sample outputs*
+
+The new commands 'peer add-address' and 'peer rename' will improve
+manageability of peers.
+
+### Implications on presentation layer
+
+None
+
+### Implications on persistence layer
+
+None
+
+### Implications on 'GlusterFS' backend
+
+None
+
+### Modification to GlusterFS metadata
+
+None
+
+### Implications on 'glusterd'
+
+<persistent store, configuration changes, brick-op...>
+
+How To Test
+-----------
+
+**TODO:** *Add test cases*
+
+User Experience
+---------------
+
+User experience will improve for commands which use peer identifiers
+(volume create/add-brick/remove-brick, peer probe, peer detach), as the
+user will no longer face errors caused by mixed usage of identifiers.
+
+Dependencies
+------------
+
+None.
+
+Documentation
+-------------
+
+The new behaviour of the peer probe command will need to be documented.
+The new commands will need to be documented as well.
+
+**TODO:** *Add more documentation*
+
+Status
+------
+
+The feature is under development on forge [^3] and github [^4]. This
+github merge request [^5] can be used for performing preliminary
+reviews. Once we are satisfied with the changes, it will be posted for
+review on gerrit.
+
+Comments and Discussion
+-----------------------
+
+There are open issues around node crash + re-install with same IP (but
+new UUID) which need to be addressed in this effort.
+
+Links
+-----
+
+<references>
+</references>
+
+[^1]: <http://lists.gnu.org/archive/html/gluster-devel/2013-06/msg00067.html>
+
+[^2]: <http://lists.gnu.org/archive/html/gluster-devel/2013-06/msg00069.html>
+
+[^3]: <https://forge.gluster.org/~kshlm/glusterfs-core/kshlms-glusterfs/commits/better-peer-identification>
+
+[^4]: <https://github.com/kshlm/glusterfs/tree/better-peer-identification>
+
+[^5]: <https://github.com/kshlm/glusterfs/pull/2>
diff --git a/done/GlusterFS 3.6/Gluster User Serviceable Snapshots.md b/done/GlusterFS 3.6/Gluster User Serviceable Snapshots.md
new file mode 100644
index 0000000..9af7062
--- /dev/null
+++ b/done/GlusterFS 3.6/Gluster User Serviceable Snapshots.md
@@ -0,0 +1,39 @@
+Feature
+-------
+
+Enable user-serviceable snapshots for GlusterFS Volumes based on
+GlusterFS-Snapshot feature
+
+Owners
+------
+
+Anand Avati
+Anand Subramanian <anands@redhat.com>
+Raghavendra Bhat
+Varun Shastry
+
+Summary
+-------
+
+Each snapshot-capable GlusterFS volume will contain a .snaps directory
+through which a user will be able to access previously taken
+point-in-time snapshot copies of their data. This will be enabled through
+a hidden .snaps folder in each directory or sub-directory within the
+volume. These user-serviceable snapshot copies will be read-only.
+
+Tests
+-----
+
+​1) Enable uss (gluster volume set <volume name> features.uss enable). A
+snap daemon should get started for the volume. It should be visible in
+the gluster volume status command.
+
+​2) Entering the snapshot world: ls on .snaps from any directory within
+the filesystem should be successful and should show the list of
+snapshots as directories.
+
+​3) Accessing the snapshots: one of the snapshots can be entered and it
+should show the contents of the directory from which .snaps was entered,
+as they were when the snapshot was taken. NOTE: If the directory was not
+present when a snapshot was taken (say snap1) and was created later, then
+entering the snap1 directory (or any access) will fail with stale file
+handle.
+
+​4) Reading from snapshots: any kind of read operation from the snapshots
+should be successful. But any modification to snapshot data is not
+allowed; snapshots are read-only. \ No newline at end of file
diff --git a/done/GlusterFS 3.6/Gluster Volume Snapshot.md b/done/GlusterFS 3.6/Gluster Volume Snapshot.md
new file mode 100644
index 0000000..468992a
--- /dev/null
+++ b/done/GlusterFS 3.6/Gluster Volume Snapshot.md
@@ -0,0 +1,354 @@
+Feature
+-------
+
+Snapshot of Gluster Volume
+
+Summary
+-------
+
+Gluster volume snapshot will provide a point-in-time copy of a GlusterFS
+volume. This is an online snapshot; therefore the file-system and its
+associated data continue to be available to clients while the snapshot
+is being taken.
+
+Snapshot of a GlusterFS volume will create another read-only volume
+which will be a point-in-time copy of the original volume. Users can use
+this read-only volume to recover any file(s) they want. Snapshot will
+also provide restore feature which will help the user to recover an
+entire volume. The restore operation will replace the original volume
+with the snapshot volume.
+
+Owner(s)
+--------
+
+Rajesh Joseph <rjoseph@redhat.com>
+
+Copyright
+---------
+
+Copyright (c) 2013-2014 Red Hat, Inc. <http://www.redhat.com>
+
+This feature is licensed under your choice of the GNU Lesser General
+Public License, version 3 or any later version (LGPLv3 or later), or the
+GNU General Public License, version 2 (GPLv2), in all cases as published
+by the Free Software Foundation.
+
+Current status
+--------------
+
+Gluster volume snapshot support is provided in GlusterFS 3.6
+
+Detailed Description
+--------------------
+
+The GlusterFS snapshot feature will provide a crash-consistent
+point-in-time copy of Gluster volume(s). This is an online snapshot;
+therefore the file-system and its associated data continue to be
+available to clients while the snapshot is being taken. As of now we are
+not planning to provide application-level crash consistency. That means
+if a snapshot is restored then applications need to rely on journals or
+other techniques to recover or clean up some of the operations performed
+on the GlusterFS volume.
+
+A GlusterFS volume is made up of multiple bricks spread across multiple
+nodes. Each brick translates to a directory path on a given file-system.
+The current snapshot design is based on the thinly provisioned LVM2
+snapshot feature. Therefore, as a prerequisite, the Gluster bricks should
+be on thinly provisioned LVM. For a single LVM volume, taking a snapshot
+is straightforward for the admin, but this is compounded in a GlusterFS
+volume which has bricks spread across multiple LVMs on multiple nodes.
+The Gluster volume snapshot feature aims to provide a set of interfaces
+from which the admin can snap and manage the snapshots for Gluster
+volumes.
+
+A Gluster volume snapshot is nothing but snapshots of all the bricks in
+the volume. So ideally all the bricks should be snapped at the same
+time. But with real-life latencies (processor and network) this may not
+hold true all the time. Therefore we need to make sure that during the
+snapshot the file-system is in a consistent state, so we barrier a few
+operations so that the file-system remains in a healthy state during the
+snapshot.
+
+For details about barriering, see [Server Side
+Barrier](http://www.gluster.org/community/documentation/index.php/Features/Server-side_Barrier_feature).
+
+Benefit to GlusterFS
+--------------------
+
+A snapshot of a glusterfs volume gives users:
+
+- A point-in-time checkpoint from which to recover/failback.
+- Read-only snaps that can be the source of backups.
+
+Scope
+-----
+
+### Nature of proposed change
+
+Gluster cli will be modified to provide new commands for snapshot
+management. The entire snapshot core implementation will be done in
+glusterd.
+
+Apart from this, snapshot will also make use of a quiescing xlator for
+doing the quiescing. This will be a server-side translator which will
+quiesce the fops which can modify on-disk state. The quiescing will be in
+effect until the snapshot operation is complete.
+
+### Implications on manageability
+
+Snapshot will provide a new set of cli commands to manage snapshots.
+REST APIs are not planned for this release.
+
+### Implications on persistence layer
+
+Snapshot will create a new volume per snapshot. These volumes are stored
+in the /var/lib/glusterd/snaps folder. Apart from this, each volume will
+have additional snapshot-related information stored in a snap\_list.info
+file in its respective vol folder.
+
+### Implications on 'glusterd'
+
+Snapshot information and snapshot volume details are stored in
+persistent stores.
+
+How To Test
+-----------
+
+For testing this feature one needs to have multiple thinly provisioned
+volumes, or else needs to create LVMs using loopback devices.
+
+Details of how to create thin volume can be found at the following link
+<https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Logical_Volume_Manager_Administration/thinly_provisioned_volume_creation.html>
+
+Each brick needs to be on an independent LVM, and these LVMs should be
+thinly provisioned. From these bricks create a Gluster volume, which can
+then be used for snapshot testing.
+
+See the User Experience section for various commands of snapshot.
+
+User Experience
+---------------
+
+##### Snapshot creation
+
+ snapshot create <snapname> <volname(s)> [description <description>] [force]
+
+This command will create a snapshot of the volume identified by volname.
+snapname is a mandatory field and the name should be unique in the
+entire cluster. Users can also provide an optional description to be
+saved along with the snap (max 1024 characters). The force keyword is
+used if some bricks of the original volume are down and you still want
+to take the snapshot.
+
+##### Listing of available snaps
+
+ gluster snapshot list [snap-name] [vol <volname>]
+
+This command is used to list all snapshots taken, or for a specified
+volume. If snap-name is provided then it will list the details of that
+snap.
+
+##### Configuring the snapshot behavior
+
+ gluster snapshot config [vol-name]
+
+This command will display the existing config values for a volume. If a
+volume name is not provided then the config values of all the volumes
+are displayed.
+
+ gluster snapshot config [vol-name] [<snap-max-limit> <count>] [<snap-max-soft-limit> <percentage>] [force]
+
+The above command can be used to change the existing config values. If
+vol-name is provided then config value of that volume is changed, else
+it will set/change the system limit.
+
+The system limit is the default value of the config for all volumes. A
+volume-specific limit cannot cross the system limit. If a
+volume-specific limit is not provided then the system limit is taken.
+
+If any of these limits is decreased and the current snap count of the
+system/volume is more than the limit then the command will fail. If the
+user still wants to decrease the limit then the force option should be
+used.
+
+**snap-max-limit**: Maximum snapshot limit for a volume. Snapshot
+creation will fail if the snap count reaches this limit.
+
+**snap-max-soft-limit**: Soft snapshot limit for a volume, expressed as a
+percentage of snap-max-limit. Snapshots can still be created once the
+snap count reaches this limit, but an auto-deletion is triggered: the
+oldest snaps are deleted when the snap count reaches this limit.
+
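+As a small illustration of how the two limits could interact (the helper
+names and the integer math below are assumptions, not the shipped
+implementation):
+
+    /* Effective soft limit is a percentage of the hard limit; reaching it
+     * makes the oldest snapshot a candidate for auto-deletion, while
+     * creation keeps succeeding until the hard limit itself is reached. */
+    static int
+    soft_limit_reached (int snap_count, int snap_max_limit,
+                        int snap_max_soft_limit_pct)
+    {
+            int soft = (snap_max_limit * snap_max_soft_limit_pct) / 100;
+
+            return snap_count >= soft;
+    }
+
+    static int
+    creation_allowed (int snap_count, int snap_max_limit)
+    {
+            return snap_count < snap_max_limit;
+    }
+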
+##### Status of snapshots
+
+ gluster snapshot status ([snap-name] | [volume <vol-name>])
+
+Shows the status of all the snapshots or the specified snapshot. The
+status will include the brick details, LVM details, process details,
+etc.
+
+##### Activating a snap volume
+
+By default the snapshot created will be in an inactive state. Use the
+following commands to activate snapshot.
+
+ gluster snapshot activate <snap-name>
+
+##### Deactivating a snap volume
+
+ gluster snapshot deactivate <snap-name>
+
+The above command will deactivate an active snapshot
+
+##### Deleting snaps
+
+ gluster snapshot delete <snap-name>
+
+This command will delete the specified snapshot.
+
+##### Restoring snaps
+
+ gluster snapshot restore <snap-name>
+
+This command restores an already taken snapshot of single or multiple
+volumes. Snapshot restore is an offline activity; therefore, if any
+volume which is part of the given snap is online then the restore
+operation will fail.
+
+Once the snapshot is restored it will be deleted from the list of
+snapshots.
+
+Dependencies
+------------
+
+To provide support for a crash-consistent snapshot feature the Gluster
+core components themselves should be crash-consistent. As of now Gluster
+as a whole is not crash-consistent. In this section we identify those
+Gluster components which are not crash-consistent.
+
+**Geo-Replication**: Geo-replication provides master-slave
+synchronization option to Gluster. Geo-replication maintains state
+information for completing the sync operation. Therefore ideally when a
+snapshot is taken then both the master and slave snapshot should be
+taken. And both master and slave snapshot should be in mutually
+consistent state.
+
+Geo-replication makes use of the change-log to do the sync. By default
+the change-log is stored in the .glusterfs folder in every brick, but the
+change-log path is configurable. If the change-log is part of the brick
+then the snapshot will contain the change-log changes as well; if it is
+not, then it needs to be saved separately during a snapshot.
+
+The following cases should be considered for making the change-log
+crash-consistent:
+
+- Change-log is part of the brick of the same volume.
+- Change-log is outside the brick. As of now there is no size limit on
+  the change-log files, so we need to answer the following questions
+  here:
+    - Time taken to make a copy of the entire change-log; this will
+      affect the overall time of the snapshot operation.
+    - The location where it can be copied; this will impact the disk
+      usage of the target disk or file-system.
+- Some part of the change-log is present in the brick and some is
+  outside the brick. This situation will arise when the change-log path
+  is changed in-between.
+- Change-log is saved in another volume and this volume forms a CG with
+  the volume about to be snapped.
+
+**Note**: Considering the above points we have decided not to support
+change-log stored outside the bricks.
+
+For this release automatic snapshots of both master and slave sessions
+are not supported. If required, the user needs to explicitly take
+snapshots of both master and slave. The following steps need to be
+followed while taking a snapshot of a master-slave setup:
+
+- Stop geo-replication manually.
+- Snapshot all the slaves first.
+- When the slave snapshots are done, initiate the master snapshot.
+- When both snapshots are complete, geo-synchronization can be started
+  again.
+
+**Gluster Quota**: Quota enables an admin to specify a per-directory
+quota. Quota makes use of the marker translator to enforce quota. As of
+now the marker framework is not completely crash-consistent. As part of
+the snapshot feature we need to address the following issues.
+
+- If a snapshot is taken while the contribution size of a file is
+  being updated then you might end up with a snapshot where there is a
+  mismatch between the actual size of the file and the contribution of
+  the file. These inconsistencies can only be rectified when a
+  look-up is issued on the snapshot volume for the same file. As a
+  workaround the admin needs to issue an explicit file-system crawl to
+  rectify the problem.
+- For NFS, quota makes use of pgfid to build a path from gfid and
+ enforce quota. As of now pgfid update is not crash-consistent.
+- Quota saves its configuration in file-system under /var/lib/glusterd
+ folder. As part of snapshot feature we need to save this file.
+
+**NFS**: NFS uses a single graph to represent all the volumes in the
+system, and to make all the snapshot volumes accessible these snapshot
+volumes should be added to this graph. This brings in another
+restriction: all the snapshot names should be unique and, additionally,
+a snap name should not clash with any other volume name.
+
+To handle this situation we have decided to use an internal uuid as the
+snap name, and to keep a mapping of this uuid and the user-given snap
+name in an internal structure, sketched below.
+
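+A minimal sketch of such a mapping (the sizes and the helper name are
+illustrative assumptions):
+
+    #include <string.h>
+
+    #define SNAP_MAX      256
+    #define UUID_STR_LEN  37
+
+    /* Internal uuid (used as the NFS-visible volume name) mapped to the
+     * user-given snapshot name. */
+    struct snap_name_map {
+            char internal_uuid[UUID_STR_LEN];
+            char user_name[256];
+    };
+
+    static struct snap_name_map snap_map[SNAP_MAX];
+    static int snap_map_count;
+
+    static const char *
+    uuid_for_snap (const char *user_name)
+    {
+            int i;
+
+            for (i = 0; i < snap_map_count; i++)
+                    if (strcmp (snap_map[i].user_name, user_name) == 0)
+                            return snap_map[i].internal_uuid;
+            return NULL;
+    }
+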
+Another restriction with NFS is that when a newly created volume
+(snapshot volume) is started it will restart the NFS server. Therefore we
+decided that when a snapshot is taken it will be in the stopped state.
+Later, when the snapshot volume is needed, it can be started explicitly.
+
+**DHT**: The DHT xlator decides which node to look on for a
+file/directory. Some of the DHT fops are not atomic in nature, e.g.
+rename (of both files and directories), and these operations are not
+transactional. That means if a crash happens the data on the server
+might be in an inconsistent state. Depending upon the time of the
+snapshot and which DHT operation is in what state, there can be an
+inconsistent snapshot.
+
+**AFR**: AFR is the high-availability module in Gluster. AFR keeps track
+of fresh and correct copy of data using extended attributes. Therefore
+it is important that before taking snapshot these extended attributes
+are written into the disk. To make sure these attributes are written to
+disk snapshot module will issue explicit sync after the
+barrier/quiescing.
+
+The other issue with the current AFR is that it writes the volume name
+to the extended attributes of all the files. AFR uses this for
+self-healing. When a snapshot is taken of such a volume the snapshotted
+volume will also have the same volume name. Therefore AFR needs to
+create a mapping of the real volume name and the extended entry name in
+the volfile, so that the correct name can be referred to during
+self-heal.
+
+Another dependency on AFR is that currently there is no direct API or
+call back function which will tell that AFR self-healing is completed on
+a volume. This feature is required to heal a snapshot volume before
+restore.
+
+Documentation
+-------------
+
+Status
+------
+
+In development
+
+Comments and Discussion
+-----------------------
+
+<Follow here>
diff --git a/done/GlusterFS 3.6/New Style Replication.md b/done/GlusterFS 3.6/New Style Replication.md
new file mode 100644
index 0000000..ffd8167
--- /dev/null
+++ b/done/GlusterFS 3.6/New Style Replication.md
@@ -0,0 +1,230 @@
+Goal
+----
+
+More partition-tolerant replication, with higher performance for most
+use cases.
+
+Summary
+-------
+
+NSR is a new synchronous replication translator, complementing or
+perhaps some day replacing AFR.
+
+Owners
+------
+
+Jeff Darcy <jdarcy@redhat.com>
+Venky Shankar <vshankar@redhat.com>
+
+Current status
+--------------
+
+Design and prototype (nearly) complete, implementation beginning.
+
+Related Feature Requests and Bugs
+---------------------------------
+
+[AFR bugs related to "split
+brain"](https://bugzilla.redhat.com/buglist.cgi?classification=Community&component=replicate&list_id=3040567&product=GlusterFS&query_format=advanced&short_desc=split&short_desc_type=allwordssubstr)
+
+[AFR bugs related to
+"perf"](https://bugzilla.redhat.com/buglist.cgi?classification=Community&component=replicate&list_id=3040572&product=GlusterFS&query_format=advanced&short_desc=perf&short_desc_type=allwordssubstr)
+
+(Both lists are undoubtedly partial because not all bugs in these areas
+use these specific words. In particular, "GFID mismatch" bugs are
+really a kind of split brain, but aren't represented.)
+
+Detailed Description
+--------------------
+
+NSR is designed to have the following features.
+
+- Server based - "chain" replication can use bandwidth of both client
+ and server instead of splitting client bandwidth N ways.
+
+- Journal based - for reduced network traffic in normal operation,
+ plus faster recovery and greater resistance to "split brain" errors.
+
+- Variable consistency model - based on
+  [Dynamo](http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf)
+  to provide options trading some consistency for greater availability
+  and/or performance (a small quorum sketch follows this list).
+
+- Newer, smaller codebase - reduces technical debt, enables higher
+ replica counts, more informative status reporting and logging, and
+ other future features (e.g. ordered asynchronous replication).
+
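+The tunable consistency mentioned above boils down to quorum arithmetic;
+the following toy illustration (N replicas, W write acknowledgements and
+R read replies, with names chosen only for this sketch) shows the
+trade-off being offered:
+
+    /* With N replicas, requiring W acks for writes and R replies for
+     * reads gives strong consistency when R + W > N; lowering W (or R)
+     * trades consistency for availability and performance. */
+    struct quorum_policy {
+            int n;  /* replica count                       */
+            int w;  /* acks needed before a write returns  */
+            int r;  /* replies needed to answer a read     */
+    };
+
+    static int
+    write_may_complete (const struct quorum_policy *p, int acks)
+    {
+            return acks >= p->w;
+    }
+
+    static int
+    is_strongly_consistent (const struct quorum_policy *p)
+    {
+            return (p->r + p->w) > p->n;
+    }
+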
+Benefit to GlusterFS
+====================
+
+Faster, more robust, more manageable/maintainable replication.
+
+Scope
+=====
+
+Nature of proposed change
+-------------------------
+
+At least two new translators will be necessary.
+
+- A simple client-side translator to route requests to the current
+ leader among the bricks in a replica set.
+
+- A server-side translator to handle the "heavy lifting" of
+ replication, recovery, etc.
+
+Implications on manageability
+-----------------------------
+
+At a high level, commands to enable, configure, and manage NSR will be
+very similar to those already used for AFR. At a lower level, the
+options affecting things like quorum, consistency, and placement of
+journals will all be completely different.
+
+Implications on presentation layer
+----------------------------------
+
+Minimal. Most changes will be to simplify or remove special handling for
+AFR's unique behavior (especially around lookup vs. self-heal).
+
+Implications on persistence layer
+---------------------------------
+
+N/A
+
+Implications on 'GlusterFS' backend
+-----------------------------------
+
+The journal for each brick in an NSR volume might (for performance
+reasons) be placed on one or more local volumes other than the one
+containing the brick's data. Special requirements around AIO, fsync,
+etc. will be less than with AFR.
+
+Modification to GlusterFS metadata
+----------------------------------
+
+NSR will not use the same xattrs as AFR, reducing the need for larger
+inodes.
+
+Implications on 'glusterd'
+--------------------------
+
+Volgen must be able to configure the client-side and server-side parts
+of NSR, instead of AFR on the client side and index (which will no
+longer be necessary) on the server side. Other interactions with
+glusterd should remain mostly the same.
+
+How To Test
+===========
+
+Most basic AFR tests - e.g. reading/writing data, killing nodes,
+starting/stopping self-heal - would apply to NSR as well. Tests that
+embed assumptions about AFR xattrs or other internal artifacts will need
+to be re-written.
+
+User Experience
+===============
+
+Minimal change, mostly related to new options.
+
+Dependencies
+============
+
+NSR depends on a cluster-management framework that can provide
+membership tracking, leader election, and robust consistent key/value
+data storage. This is expected to be developed in parallel as part of
+the glusterd-scalability feature, but can be implemented (in simplified
+form) within NSR itself if necessary.
+
+Documentation
+=============
+
+TBD.
+
+Status
+======
+
+Some parts of earlier implementation updated to current tree, others in
+the middle of replacement.
+
+- [New design](http://review.gluster.org/#/c/8915/)
+
+- [Basic translator code](http://review.gluster.org/#/c/8913/) (needs
+  update to the new code-generation infrastructure)
+
+- [GF\_FOP\_IPC](http://review.gluster.org/#/c/8812/)
+
+- [etcd support](http://review.gluster.org/#/c/8887/)
+
+- [New code-generation
+ infrastructure](http://review.gluster.org/#/c/9411/)
+
+- [New data-logging
+ translator](https://forge.gluster.org/~jdarcy/glusterfs-core/jdarcys-glusterfs-data-logging)
+
+Comments and Discussion
+=======================
+
+My biggest concern with journal-based replication comes from my previous
+use of DRBD. They do an "activity log"[^1] which sounds like the same
+basic concept. Once that log filled, I experienced cascading failure.
+When the journal can be filled faster than it's emptied this could cause
+the problem I experienced.
+
+So what I'm looking to be convinced is how journalled replication
+maintains full redundancy and how it will prevent the journal input from
+exceeding the capacity of the journal output or at least how it won't
+fail if this should happen.
+
+[jjulian](User:Jjulian "wikilink")
+([talk](User talk:Jjulian "wikilink")) 17:21, 13 August 2013 (UTC)
+
+<hr/>
+This is akin to a CAP Theorem[^2][^3] problem. If your nodes can't
+communicate, what do you do with writes? Our replication approach has
+traditionally been CP - enforce quorum, allow writes only among the
+majority - and for the sake of satisfying user expectations (or POSIX)
+pretty much has to remain CP at least by default. I personally think we
+need to allow an AP choice as well, which is why the quorum levels in
+NSR are tunable to get that result.
+
+So, what do we do if a node runs out of journal space? Well, it's unable
+to function normally, i.e. it's failed, so it can't count toward quorum.
+This would immediately lead to loss of write availability in a two-node
+replica set, and could happen easily enough in a three-node replica set
+if two similarly configured nodes ran out of journal space
+simultaneously. A significant part of the complexity in our design is
+around pruning no-longer-needed journal segments, precisely because this
+is an icky problem, but even with all the pruning in the world it could
+still happen eventually. Therefore the design also includes the notion
+of arbiters, which can be quorum-only or can also have their own
+journals (with no or partial data). Therefore, your quorum for
+admission/journaling purposes can be significantly higher than your
+actual replica count. So what options do we have to avoid or deal with
+journal exhaustion?
+
+- Add more journal space (it's just files, so this can be done
+ reactively during an extended outage).
+
+- Add arbiters.
+
+- Decrease the quorum levels.
+
+- Manually kick a node out of the replica set.
+
+- Add admission control, artificially delaying new requests as the
+ journal becomes full. (This one requires more code.)
+
+If you do \*none\* of these things then yeah, you're scrod. That said,
+do you think these options seem sufficient?
+
+[Jdarcy](User:Jdarcy "wikilink") ([talk](User talk:Jdarcy "wikilink"))
+15:27, 29 August 2013 (UTC)
+
+<references/>
+
+[^1]: <http://www.drbd.org/users-guide-emb/s-activity-log.html>
+
+[^2]: <http://www.julianbrowne.com/article/viewer/brewers-cap-theorem>
+
+[^3]: <http://henryr.github.io/cap-faq/>
diff --git a/done/GlusterFS 3.6/Persistent AFR Changelog xattributes.md b/done/GlusterFS 3.6/Persistent AFR Changelog xattributes.md
new file mode 100644
index 0000000..e21b788
--- /dev/null
+++ b/done/GlusterFS 3.6/Persistent AFR Changelog xattributes.md
@@ -0,0 +1,178 @@
+Feature
+-------
+
+Provide a unique and consistent name for AFR changelog extended
+attributes/ client translator names in the volume graph.
+
+Summary
+-------
+
+Make AFR changelog extended attribute names independent of brick
+position in the graph, which ensures that there will be no potential
+misdirected self-heals during remove-brick operation.
+
+Owners
+------
+
+Ravishankar N <ravishankar@redhat.com>
+Pranith Kumar K <pkarampu@redhat.com>
+
+Current status
+--------------
+
+Patches merged in master.
+
+<http://review.gluster.org/#/c/7122/>
+
+<http://review.gluster.org/#/c/7155/>
+
+Detailed Description
+--------------------
+
+BACKGROUND ON THE PROBLEM
+=========================
+
+AFR makes use of changelog extended attributes on a per-file basis which
+record pending operations on that file and are used to determine the
+sources and sinks when healing needs to be done. As of today, AFR uses
+the client translator names (from the volume graph) as the names of the
+changelog attributes. For example, for a replica 3 volume, each file on
+every brick has the following extended attributes:
+
+ trusted.afr.<volname>-client-0-->maps to Brick0
+ trusted.afr.<volname>-client-1-->maps to Brick1
+ trusted.afr.<volname>-client-2-->maps to Brick2
+
+​1) Now when any brick is removed (say Brick1), the graph is regenerated
+and AFR maps the xattrs to the bricks so:
+
+ trusted.afr.<volname>-client-0-->maps to Brick0
+ trusted.afr.<volname>-client-1-->maps to Brick2 
+
+Thus the xattr 'trusted.afr.testvol-client-1', which earlier referred to
+Brick1's attributes, now refers to Brick2's. If there are pending
+self-heals from before the remove-brick happened, healing could possibly
+happen in the wrong direction, thereby causing data loss.
+
+​2) The second problem is a dependency with the Snapshot feature.
+Snapshot volumes have new names (UUID based) and thus the (client) xlator
+names are different. Eg: \<<volname>-client-0\> will now be
+\<<snapvolname>-client-0\>. AFR then uses these new names to query for
+its changelog xattrs, but the files on the bricks have the old changelog
+xattrs. Hence the heal information is completely lost.
+
+WHAT IS THE EXACT ISSUE WE ARE SOLVING OR OBJECTIVE OF THE FEATURE/DESIGN?
+==========================================================================
+
+In a nutshell, the solution is to generate unique and persistent names
+for the client translators so that even if any of the bricks are
+removed, the translator names always map to the same bricks. In turn,
+AFR, which uses these names for the changelog xattr names also refer to
+the correct bricks.
+
+SOLUTION:
+
+The solution is explained as a sequence of steps:
+
+- The client translator names will still use the existing
+  nomenclature, except that now they are monotonically increasing
+  (<volname>-client-0,1,2...) and are not dependent on the brick
+  position. Let us call these names brick-IDs. These brick-IDs are
+  also written to the brickinfo files (in
+  /var/lib/glusterd/vols/<volname>/bricks/\*) by glusterd during
+  volume creation. When the volfile is generated, these brick-IDs form
+  the client xlator names.
+
+- Whenever a brick operation is performed, the names are retained for
+ existing bricks irrespective of their position in the graph. New
+ bricks get the monotonically increasing brick-ID while names for
+ existing bricks are obtained from the brickinfo file.
+
+- Note that this approach does not affect client versions (old/new) in
+ anyway because the clients just use the volume config provided by
+ the volfile server.
+
+- For retaining backward compatibility, we need to check two items:
+  (a) under what conditions remove-brick is allowed; (b) when the
+  brick-ID is written to the brickinfo file.
+
+For the above 2 items, the implementation rules will be thus:
+
+​i) This feature is implemented in 3.6. Let's say its op-version is 5.
+
+​ii) We need to implement a check to allow remove-brick only if the
+cluster op-version is \>= 5.
+
+​iii) The brick-ID is written to brickinfo when the nodes are upgraded
+(during glusterd restore) and when a peer is probed (i.e. during volfile
+import).
+
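+To make the naming rule concrete, here is a small sketch of how a
+persistent brick-ID could be read back or assigned; the file layout and
+helper names are assumptions for illustration, not the actual glusterd
+code:
+
+    #include <stdio.h>
+
+    /* Existing bricks always reuse the ID stored in their brickinfo file,
+     * regardless of their current position in the graph; new bricks get
+     * the next monotonically increasing index. */
+    static int
+    read_brick_id (const char *brickinfo_path, char *id, size_t len)
+    {
+            FILE *f = fopen (brickinfo_path, "r");
+
+            if (!f)
+                    return -1;              /* no ID stored yet */
+            if (!fgets (id, (int) len, f)) {
+                    fclose (f);
+                    return -1;
+            }
+            fclose (f);
+            return 0;
+    }
+
+    static void
+    assign_brick_id (const char *volname, int next_index, char *id,
+                     size_t len)
+    {
+            snprintf (id, len, "%s-client-%d", volname, next_index);
+    }
+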
+Benefit to GlusterFS
+--------------------
+
+Even if there are pending self-heals, remove-brick operations can be
+carried out safely without fear of incorrect heals which may cause data
+loss.
+
+Scope
+-----
+
+### Nature of proposed change
+
+Modifications will be made in restore, volfile import and volgen
+portions of glusterd.
+
+### Implications on manageability
+
+N/A
+
+### Implications on presentation layer
+
+N/A
+
+### Implications on persistence layer
+
+N/A
+
+### Implications on 'GlusterFS' backend
+
+N/A
+
+### Modification to GlusterFS metadata
+
+N/A
+
+### Implications on 'glusterd'
+
+As described earlier.
+
+How To Test
+-----------
+
+The remove-brick operation needs to be carried out on rep/dist-rep
+volumes having pending self-heals, and it must be verified that no data
+is lost. Snapshots of the volumes must also be able to access files
+without any issues.
+
+User Experience
+---------------
+
+N/A
+
+Dependencies
+------------
+
+None.
+
+Documentation
+-------------
+
+TBD
+
+Status
+------
+
+See 'Current status' section.
+
+Comments and Discussion
+-----------------------
+
+<Follow here> \ No newline at end of file
diff --git a/done/GlusterFS 3.6/RDMA Improvements.md b/done/GlusterFS 3.6/RDMA Improvements.md
new file mode 100644
index 0000000..1e71729
--- /dev/null
+++ b/done/GlusterFS 3.6/RDMA Improvements.md
@@ -0,0 +1,101 @@
+Feature
+-------
+
+**RDMA Improvements**
+
+Summary
+-------
+
+This proposal is regarding getting RDMA volumes out of tech preview.
+
+Owners
+------
+
+Raghavendra Gowdappa <rgowdapp@redhat.com>
+Vijay Bellur <vbellur@redhat.com>
+
+Current status
+--------------
+
+Work in progress
+
+Detailed Description
+--------------------
+
+Fix known and unknown issues in volumes with transport type rdma so that
+RDMA can be used as the interconnect between clients and servers and
+between servers.
+
+- Performance Issues - Had found that performance was bad when
+ compared with plain ib-verbs send/recv v/s RDMA reads and writes.
+- Co-existence with tcp - There seemed to be some memory corruptions
+ when we had both tcp and rdma transports.
+- librdmacm for connection management - with this there is a
+  requirement that the brick has to listen on an IPoIB address, and
+  this affects our current flexibility where a peer can connect to
+  either an ethernet or an infiniband address. Another related
+  feature, Better Peer Identification, will help us to resolve this
+  issue.
+- More testing required
+
+Benefit to GlusterFS
+--------------------
+
+Scope
+-----
+
+### Nature of proposed change
+
+Bug-fixes to transport/rdma
+
+### Implications on manageability
+
+Remove the warning about creation of rdma volumes in CLI.
+
+### Implications on presentation layer
+
+TBD
+
+### Implications on persistence layer
+
+No impact
+
+### Implications on 'GlusterFS' backend
+
+No impact
+
+### Modification to GlusterFS metadata
+
+No impact
+
+### Implications on 'glusterd'
+
+No impact
+
+How To Test
+-----------
+
+TBD
+
+User Experience
+---------------
+
+TBD
+
+Dependencies
+------------
+
+Better Peer identification
+
+Documentation
+-------------
+
+TBD
+
+Status
+------
+
+In development
+
+Comments and Discussion
+-----------------------
diff --git a/done/GlusterFS 3.6/Server-side Barrier feature.md b/done/GlusterFS 3.6/Server-side Barrier feature.md
new file mode 100644
index 0000000..c13e25a
--- /dev/null
+++ b/done/GlusterFS 3.6/Server-side Barrier feature.md
@@ -0,0 +1,213 @@
+Server-side barrier feature
+===========================
+
+- Author(s): Varun Shastry, Krishnan Parthasarathi
+- Date: Jan 28 2014
+- Bugzilla: <https://bugzilla.redhat.com/1060002>
+- Document ID: BZ1060002
+- Document Version: 1
+- Obsoletes: NA
+
+Abstract
+--------
+
+Snapshot feature needs a mechanism in GlusterFS, where acknowledgements
+to file operations (FOPs) are held back until the snapshot of all the
+bricks of the volume are taken.
+
+The barrier feature would stop holding back FOPs after a configurable
+'barrier-timeout' seconds. This is to prevent an accidental lockdown of
+the volume.
+
+This mechanism should have the following properties:
+
+- Should keep 'barriering' transparent to the applications.
+- Should not acknowledge FOPs that fall into the barrier class. A FOP
+  that, when acknowledged to the application, could lead to the
+  snapshot of the volume becoming inconsistent, is a barrier class FOP.
+
+The example of 'unlink' below explains how a FOP is classified as
+barrier class.
+
+Consider the following sequence of events, assuming the unlink FOP was
+not barriered, on a replicate volume with two bricks, namely b1 and b2.
+
+ b1 b2
+ time ----------------------------------
+ | t1 snapshot
+ | t2 unlink /a unlink /a
+ \/ t3 mkdir /a mkdir /a
+ t4 snapshot
+
+The result of this sequence of events is that /a is stored as a file in
+the snapshot of b1 while /a is stored as a directory in the snapshot of
+b2. This leads to an AFR split-brain problem, in other words an
+inconsistency of the volume.
+
+Copyright
+---------
+
+Copyright (c) 2014 Red Hat, Inc. <http://www.redhat.com>
+
+This feature is licensed under your choice of the GNU Lesser General
+Public License, version 3 or any later version (LGPLv3 or later), or the
+GNU General Public License, version 2 (GPLv2), in all cases as published
+by the Free Software Foundation.
+
+Introduction
+------------
+
+The volume snapshot feature snapshots a volume by snapshotting the
+individual bricks that are available, using the lvm-snapshot technology.
+As part of using lvm-snapshot, the design requires the bricks to be free
+from a certain set of modifications (FOPs in the barrier class) to avoid
+inconsistency. This is where server-side barriering of FOPs comes into
+the picture.
+
+Terminology
+-----------
+
+- barrier(ing) - To make barrier fops temporarily inactive or
+ disabled.
+- available - A brick is said to be available when the corresponding
+ glusterfsd process is running and serving file operations.
+- FOP - File Operation
+
+High Level Design
+-----------------
+
+### Architecture/Design Overview
+
+- Server-side barriering, for Snapshot, must be enabled/disabled on
+ the bricks of a volume in a synchronous manner. ie, any command
+ using this would be blocked until barriering is enabled/disabled.
+ The brick process would provide this mechanism via an RPC.
+- Barrier translator would be placed immediately above io-threads
+ translator in the server/brick stack.
+- Barrier translator would queue FOPs when enabled. On disable, the
+ translator dequeues all the FOPs, while serving new FOPs from
+ application. By default, barriering is disabled.
+- The barrier feature would stop blocking the acknowledgements of FOPs
+ after a configurable 'barrier-timeout' seconds. This is to prevent
+ an accidental lockdown of the volume.
+- Operations that fall into the barrier class are listed below. Any
+  other fop not listed below does not fall into this category and hence
+  is not barriered.
+ - rmdir
+ - unlink
+ - rename
+ - [f]truncate
+ - fsync
+ - write with O\_SYNC flag
+ - [f]removexattr
+
+### Design Feature
+
+Following timeline diagram depicts message exchanges between glusterd
+and brick during enable and disable of barriering. This diagram assumes
+that enable operation is synchronous and disable is asynchronous. See
+below for alternatives.
+
+ glusterd (snapshot) barrier @ brick
+ ------------------ ---------------
+ t1 | |
+ t2 | continue to pass through
+ | all the fops
+ t3 send 'enable' |
+ t4 | * starts barriering the fops
+ | * send back the ack
+ t5 receive the ack |
+ | |
+ t6 | <take snap> |
+ | . |
+ | . |
+ | . |
+ | </take snap> |
+ | |
+ t7 send disable |
+ (does not wait for the ack) |
+ t8 | release all the held fops
+ | and no more barriering
+ | |
+ t9 | continue in PASS_THROUGH mode
+
+Glusterd would send an RPC (described in API section), to enable
+barriering on a brick, by setting option feature.barrier to 'ON' in
+barrier translator. This would be performed on all the bricks present in
+that node, belonging to the set of volumes that are being snapshotted.
+
+Disable of barriering can happen in synchronous or asynchronous mode.
+The choice is left to the consumer of this feature.
+
+On disable, all FOPs queued up will be dequeued. Simultaneously the
+subsequent barrier request(s) will be served.
+
+Barrier option enable/disable is persisted into the volfile. This is to
+make the feature available for consumers in asynchronous mode, like any
+other (configurable) feature.
+
+Barrier feature also has timeout option based on which dequeuing would
+get triggered if the consumer fails to send the disable request.
+
+Low-level details of Barrier translator working
+-----------------------------------------------
+
+The translator operates in one of two states, namely QUEUEING and
+PASS\_THROUGH.
+
+When barriering is enabled, the translator moves to QUEUEING state. It
+queues outgoing FOPs thereafter in the call back path.
+
+When barriering is disabled, the translator moves to PASS\_THROUGH state
+and does not queue when it is in PASS\_THROUGH state. Additionally, the
+queued FOPs are 'released', when the translator moves from QUEUEING to
+PASS\_THROUGH state.
+
+It has a translator global queue (doubly linked lists, see
+libglusterfs/src/list.h) where the FOPs are queued in the form of a call
+stub (see libglusterfs/src/call-stub.[ch])
+
+When the FOP has succeeded but the barrier translator failed to queue it
+in the call back, the barrier translator would disable barriering and
+release any queued FOPs; barrier would then inform the consumer about
+this failure on the subsequent disable request.
+
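+A condensed sketch of the state handling described above, using a plain
+linked list and an integer state in place of the real call-stub and list
+machinery (all names here are illustrative):
+
+    #include <stdlib.h>
+
+    enum barrier_state { PASS_THROUGH, QUEUEING };
+
+    struct queued_fop {                     /* stand-in for a call stub */
+            void (*resume) (void *frame);
+            void *frame;
+            struct queued_fop *next;
+    };
+
+    static enum barrier_state state = PASS_THROUGH;
+    static struct queued_fop *queue_head;
+
+    /* Called in the callback path of a barrier-class FOP. Returns 1 if
+     * the acknowledgement was queued, 0 if it should go out immediately. */
+    static int
+    barrier_maybe_queue (void (*resume) (void *), void *frame)
+    {
+            struct queued_fop *q;
+
+            if (state != QUEUEING)
+                    return 0;
+            q = calloc (1, sizeof (*q));
+            if (!q)
+                    return 0;       /* failed to queue: let it through */
+            q->resume = resume;
+            q->frame  = frame;
+            q->next   = queue_head;
+            queue_head = q;
+            return 1;
+    }
+
+    /* Disable: move to PASS_THROUGH and release everything queued so far. */
+    static void
+    barrier_disable (void)
+    {
+            struct queued_fop *q = queue_head;
+
+            state = PASS_THROUGH;
+            queue_head = NULL;
+            while (q) {
+                    struct queued_fop *next = q->next;
+                    q->resume (q->frame);
+                    free (q);
+                    q = next;
+            }
+    }
+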
+Interfaces
+----------
+
+### Application Programming Interface
+
+- An RPC procedure is added at the brick side, which allows any client
+ [sic] to set the feature.barrier option of the barrier translator
+ with a given value.
+- Glusterd would be using this to set server-side-barriering on, on a
+ brick.
+
+Performance Considerations
+--------------------------
+
+- The barriering of FOPs may be perceived as performance degradation by
+ applications. Since barriering is a hard requirement for snapshots,
+ the onus is on the snapshot feature to reduce the window for which
+ barriering is enabled.
+
+### Scalability
+
+- In glusterd, each brick operation is executed in a serial manner.
+ So, the latency of enabling barriering is a function of the number
+ of bricks on the node that belong to the set of volumes being
+ snapshotted. This is not a scalability limitation of the mechanism
+ of enabling barriering, but a limitation of the brick-operations
+ mechanism in glusterd.
+
+Migration Considerations
+------------------------
+
+The barrier translator is introduced with op-version 4. It is a
+server-side translator and does not impact older clients even when this
+feature is enabled.
+
+Installation and deployment
+---------------------------
+
+- The barrier xlator is not currently packaged with the glusterfs-server
+ rpm. With this change, it has to be added to the rpm.
diff --git a/done/GlusterFS 3.6/Thousand Node Gluster.md b/done/GlusterFS 3.6/Thousand Node Gluster.md
new file mode 100644
index 0000000..54c3e13
--- /dev/null
+++ b/done/GlusterFS 3.6/Thousand Node Gluster.md
@@ -0,0 +1,150 @@
+Goal
+----
+
+Thousand-node scalability for glusterd
+
+Summary
+=======
+
+This "feature" is really a set of infrastructure changes that will
+enable glusterd to manage a thousand servers gracefully.
+
+Owners
+======
+
+Krishnan Parthasarathi <kparthas@redhat.com>
+Jeff Darcy <jdarcy@redhat.com>
+
+Current status
+==============
+
+Proposed, awaiting summit for approval.
+
+Related Feature Requests and Bugs
+=================================
+
+N/A
+
+Detailed Description
+====================
+
+There are three major areas of change included in this proposal.
+
+- Replace the current order-n-squared heartbeat/membership protocol
+ with a much smaller "monitor cluster" based on Paxos or
+ [Raft](https://ramcloud.stanford.edu/wiki/download/attachments/11370504/raft.pdf),
+ to which I/O servers check in.
+
+- Use the monitor cluster to designate specific functions or roles -
+ e.g. self-heal, rebalance, leadership in an NSR subvolume - to I/O
+ servers in a coordinated and globally optimal fashion.
+
+- Replace the current system of replicating configuration data on all
+ servers (providing practically no guarantee of consistency if one is
+ absent during a configuration change) with storage of configuration
+ data in the monitor cluster.
+
+Benefit to GlusterFS
+====================
+
+Scaling of our management plane to 1000+ nodes, enabling competition
+with other projects such as HDFS or Ceph which already have or claim
+such scalability.
+
+Scope
+=====
+
+Nature of proposed change
+-------------------------
+
+Functionality very similar to what we need in the monitor cluster
+already exists in some of the Raft implementations, notably
+[etcd](https://github.com/coreos/etcd). Such a component could provide
+the services described above to a modified glusterd running on each
+server. The changes to glusterd would mostly consist of removing the
+current heartbeat and config-storage code, replacing it with calls into
+(and callbacks from) the monitor cluster.
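+
+As a rough illustration of what a "call into the monitor cluster" could
+look like, the sketch below stores one piece of configuration in etcd
+through its v2 HTTP keys API using libcurl. This is exploratory only:
+the endpoint address, the key layout and the choice of going through
+HTTP at all are assumptions, not a committed design.
+
+    #include <stdio.h>
+    #include <curl/curl.h>
+
+    int main (void)
+    {
+            CURL     *curl = curl_easy_init ();
+            CURLcode  rc;
+
+            if (!curl)
+                    return 1;
+
+            /* Hypothetical key layout for a volume option. */
+            curl_easy_setopt (curl, CURLOPT_URL,
+                              "http://127.0.0.1:2379/v2/keys/gluster/volumes/testvol/replica-count");
+            curl_easy_setopt (curl, CURLOPT_CUSTOMREQUEST, "PUT");
+            curl_easy_setopt (curl, CURLOPT_POSTFIELDS, "value=3");
+
+            rc = curl_easy_perform (curl);
+            if (rc != CURLE_OK)
+                    fprintf (stderr, "etcd PUT failed: %s\n",
+                             curl_easy_strerror (rc));
+
+            curl_easy_cleanup (curl);
+            return (rc == CURLE_OK) ? 0 : 1;
+    }
+
+Membership could similarly be expressed as one key per server refreshed
+with a TTL, replacing the order-n-squared heartbeat exchange.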
+
+Implications on manageability
+-----------------------------
+
+Enabling/starting monitor daemons on those few nodes that have them must
+be done separately from starting glusterd. Since the changes mostly are
+to how each glusterd interacts with others and with its own local
+storage back end, interactions with the CLI or with glusterfsd need not
+change.
+
+Implications on presentation layer
+----------------------------------
+
+N/A
+
+Implications on persistence layer
+---------------------------------
+
+N/A
+
+Implications on 'GlusterFS' backend
+-----------------------------------
+
+N/A
+
+Modification to GlusterFS metadata
+----------------------------------
+
+The monitor daemons need space for their data, much like that
+currently maintained in /var/lib/glusterd.
+
+Implications on 'glusterd'
+--------------------------
+
+Drastic. See sections above.
+
+How To Test
+===========
+
+A new set of tests for the monitor-cluster functionality will need to be
+developed, perhaps derived from those for the external project if we
+adopt one. Most tests related to our multi-node testing facilities
+(cluster.rc) will also need to change. Tests which merely invoke the CLI
+should require little if any change.
+
+User Experience
+===============
+
+Minimal change.
+
+Dependencies
+============
+
+A mature/stable enough implementation of Raft or a similar protocol.
+Failing that, we'd need to develop our own service along similar lines.
+
+Documentation
+=============
+
+TBD.
+
+Status
+======
+
+In design.
+
+The choice of technology and approach is being discussed on the
+-devel ML.
+
+- "Proposal for Glusterd-2.0" -
+ [1](http://www.gluster.org/pipermail/gluster-users/2014-September/018639.html)
+
+: Though the discussion has become passive, the open question is whether
+ we implement a consensus algorithm inside our project or depend on
+ external projects that provide a similar service.
+
+- "Management volume proposal" -
+ [2](http://www.gluster.org/pipermail/gluster-devel/2014-November/042944.html)
+
+: This approach has a circular dependency that makes it infeasible.
+
+Comments and Discussion
+=======================
diff --git a/done/GlusterFS 3.6/afrv2.md b/done/GlusterFS 3.6/afrv2.md
new file mode 100644
index 0000000..a1767c7
--- /dev/null
+++ b/done/GlusterFS 3.6/afrv2.md
@@ -0,0 +1,244 @@
+Feature
+-------
+
+This feature is a major code refactor of the current AFR, along with a
+key design change in the way changelog extended attributes are stored
+by AFR.
+
+Summary
+-------
+
+This feature introduces a design change in AFR which separates the
+ongoing-transaction count from the pending-operation count for
+files/directories.
+
+Owners
+------
+
+Anand Avati
+Pranith Kumar Karampuri
+
+Current status
+--------------
+
+The feature is in final stages of review at
+<http://review.gluster.org/6010>
+
+Detailed Description
+--------------------
+
+How AFR works:
+
+In order to keep track of which copies of the file are modified and up
+to date, and which copies require healing, AFR keeps state information
+in extended attributes of the file called changelog extended
+attributes. These extended attributes store each copy's view of how up
+to date the other copies are. The extended attributes are modified in a
+transaction which consists of 5 phases - LOCK, PRE-OP, OP, POST-OP and
+UNLOCK. In the PRE-OP phase the extended attributes are updated to
+record the intent of modification (in the OP phase).
+
+In the POST-OP phase, depending on how many servers crashed mid way and
+on how many servers the OP was applied successfully, a corresponding
+change is made in the extended attributes (of the surviving copies) to
+represent the staleness of the copies which missed the OP phase.
+
+Further, when those lagging servers become available, healing decisions
+are taken based on these extended attribute values.
+
+Today, a PRE-OP increments the pending counters of all elements in the
+array (where each element represents a server, and therefore one of the
+members of the array represents that server itself.) The POST-OP
+decrements those counters which represent servers where the operation
+was successful. The update is performed on all the servers which have
+made it till the POST-OP phase. The decision of whether a server crashed
+in the middle of a transaction or whether the server lived through the
+transaction and witnessed the other server crash, is inferred by
+inspecting the extended attributes of all servers together. Because
+there is no distinction between these counters as to how many of those
+increments represent "in transit" operations and how many of those are
+retained without decrement to represent "pending counters", there is
+value in adding clarity to the system by separating the two.
+
+The change is to now have only one dirty flag on each server per file.
+We also make the PRE-OP increment only that dirty flag rather than all
+the elements of the pending array. The dirty flag must be set before
+performing the operation, and based on which of the servers the
+operation failed, we will set the pending counters representing these
+failed servers on the remaining ones in the POST-OP phase. The dirty
+counter is also cleared at the end of the POST-OP. This means, in
+successful operations only the dirty flag (one integer) is incremented
+and decremented per server per file. However if a pending counter is set
+because of an operation failure, then the flag is an unambiguous "finger
+pointing" at the other server. Meaning, if a file has a pending counter
+AND a dirty flag, it will not undermine the "strength" of the pending
+counter. This change completely removes today's ambiguity of whether a
+pending counter represents a still ongoing operation (or crashed in
+transit) vs a surely missed operation.
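+
+The following self-contained C sketch models the new accounting for a
+2-way replica: one dirty flag and one pending counter per server,
+standing in for the on-disk changelog extended attributes. The array
+layout and function names here are illustrative only; they are not the
+actual xattr keys or AFR code.
+
+    #include <stdio.h>
+
+    #define REPLICAS 2
+
+    /* One server's view, standing in for that brick's changelog xattrs. */
+    typedef struct {
+            int dirty;                 /* "a transaction is in flight here" */
+            int pending[REPLICAS];     /* "server j missed operations"      */
+    } changelog_t;
+
+    /* PRE-OP: set only the dirty flag on every participating server. */
+    static void pre_op (changelog_t log[REPLICAS])
+    {
+            for (int i = 0; i < REPLICAS; i++)
+                    log[i].dirty++;
+    }
+
+    /* POST-OP: on the surviving servers, point a finger at the servers
+     * where the OP failed, then clear the dirty flag. */
+    static void post_op (changelog_t log[REPLICAS],
+                         const int op_failed[REPLICAS])
+    {
+            for (int i = 0; i < REPLICAS; i++) {
+                    if (op_failed[i])
+                            continue;  /* this server records nothing */
+                    for (int j = 0; j < REPLICAS; j++)
+                            if (op_failed[j])
+                                    log[i].pending[j]++;
+                    log[i].dirty--;
+            }
+    }
+
+    int main (void)
+    {
+            changelog_t log[REPLICAS] = { { 0 } };
+            int op_failed[REPLICAS] = { 0, 1 };  /* server 1 missed the OP */
+
+            pre_op (log);
+            post_op (log, op_failed);
+
+            /* server 0 ends with dirty=0 and pending={0,1}; server 1 keeps
+             * dirty=1 because its POST-OP never ran. */
+            for (int i = 0; i < REPLICAS; i++)
+                    printf ("server %d: dirty=%d pending={%d,%d}\n", i,
+                            log[i].dirty, log[i].pending[0], log[i].pending[1]);
+            return 0;
+    }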
+
+Benefit to GlusterFS
+--------------------
+
+It increases the clarity of whether a file has any ongoing transactions
+and any pending self-heals. Code is more maintainable now.
+
+Scope
+-----
+
+### Nature of proposed change
+
+- Remove client side self-healing completely (opendir, openfd, lookup)
+- Re-work readdir-failover to work reliably in case of NFS
+- Remove unused/dead lock recovery code
+- Consistently use xdata in both calls and callbacks in all FOPs
+- Per-inode event generation, used to force inode ctx refresh
+- Implement dirty flag support (in place of pending counts)
+- Eliminate inode ctx structure, use read subvol bits + event\_generation
+- Implement inode ctx refreshing based on event generation
+- Provide backward compatibility in transactions
+- Remove unused variables and functions
+- Make code more consistent in style and pattern
+- Regularize and clean up inode-write transaction code
+- Regularize and clean up dir-write transaction code
+- Regularize and clean up common FOPs
+- Reorganize transaction framework code
+- Skip setting xattrs in pending dict if nothing is pending
+- Re-write self-healing code using syncops
+- Re-write simpler self-heal-daemon
+
+### Implications on manageability
+
+None
+
+### Implications on presentation layer
+
+None
+
+### Implications on persistence layer
+
+None
+
+### Implications on 'GlusterFS' backend
+
+None
+
+### Modification to GlusterFS metadata
+
+This changes the way pending counts vs. ongoing transactions are
+represented in the changelog extended attributes.
+
+### Implications on 'glusterd'
+
+None
+
+How To Test
+-----------
+
+The same test cases as for AFR v1 hold.
+
+User Experience
+---------------
+
+None
+
+Dependencies
+------------
+
+None
+
+Documentation
+-------------
+
+---
+
+Status
+------
+
+The feature is in final stages of review at
+<http://review.gluster.org/6010>
+
+Comments and Discussion
+-----------------------
+
+---
+
diff --git a/done/GlusterFS 3.6/better-ssl.md b/done/GlusterFS 3.6/better-ssl.md
new file mode 100644
index 0000000..44136d5
--- /dev/null
+++ b/done/GlusterFS 3.6/better-ssl.md
@@ -0,0 +1,137 @@
+Feature
+=======
+
+Better SSL Support
+
+1 Summary
+=========
+
+Our SSL support is currently incomplete in several areas. This "feature"
+covers several enhancements (see Detailed Description below) to close
+gaps and make it more user-friendly.
+
+2 Owners
+========
+
+Jeff Darcy <jdarcy@redhat.com>
+
+3 Current status
+================
+
+Some patches already submitted.
+
+4 Detailed Description
+======================
+
+These are the items necessary to make our SSL support more of a useful
+differentiating feature vs. other projects.
+
+- Enable SSL for the management plane (glusterd). There are currently
+ several bugs and UI issues precluding this.
+
+- Allow SSL identities to be used for authorization as well as
+ authentication (and encryption). At a minimum this would apply to
+ the I/O path, restricting specific volumes to specific
+ SSL-identified principals. It might also apply to the management
+ path, restricting certain actions (and/or actions on certain
+ volumes) to certain principals. Ultimately this could be the basis
+ for full role-based access control, but that's not in scope
+ currently.
+
+- Provide more options, e.g. for cipher suites or certificate signing.
+
+- Fix bugs related to increased concurrency levels from the
+ multi-threaded transport.
+
+5 Benefit to GlusterFS
+======================
+
+Sufficient security to support deployment in environments where security
+is a non-negotiable requirement (e.g. government). Sufficient usability
+to support deployment by anyone who merely desires additional security.
+Improved performance in some cases, due to the multi-threaded transport.
+
+6 Scope
+=======
+
+6.1. Nature of proposed change
+------------------------------
+
+Most of the proposed changes do not actually involve the SSL transport
+itself, but are in surrounding components instead. The exception is the
+addition of options, which should be pretty simple. However, bugs
+related to increased concurrency levels could show up anywhere, most
+likely in our more complex translators (e.g. DHT or AFR), and will need
+to be fixed on a case-by-case basis.
+
+6.2. Implications on manageability
+----------------------------------
+
+Additional configuration will be necessary to enable SSL for glusterd.
+Additional commands will also be needed to manage certificates and keys;
+the [HekaFS
+documentation](https://git.fedorahosted.org/cgit/CloudFS.git/tree/doc)
+can serve as an example of what's needed.
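+
+As an illustration of the kind of configuration involved, enabling SSL
+on the I/O path and tying a volume to specific SSL-identified
+principals might look roughly like the following; the option names
+(client.ssl, server.ssl, auth.ssl-allow) are given as assumptions here
+rather than settled interfaces, and the glusterd-side equivalents are
+part of what this feature still has to define.
+
+    # enable TLS on the I/O path of an existing volume (assumed option names)
+    gluster volume set myvol client.ssl on
+    gluster volume set myvol server.ssl on
+    # restrict the volume to specific SSL-identified principals (CNs)
+    gluster volume set myvol auth.ssl-allow 'alice,bob'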
+
+6.3. Implications on presentation layer
+---------------------------------------
+
+N/A
+
+6.4. Implications on persistence layer
+--------------------------------------
+
+N/A
+
+6.5. Implications on 'GlusterFS' backend
+----------------------------------------
+
+N/A
+
+6.6. Modification to GlusterFS metadata
+---------------------------------------
+
+N/A
+
+6.7. Implications on 'glusterd'
+-------------------------------
+
+Significant changes to how glusterd calls the transport layer (and
+expects to be called in return) will be necessary to fix bugs and to
+enable SSL on its connections.
+
+7 How To Test
+=============
+
+New tests will be needed for each major change in the detailed
+description. Also, to improve test coverage and smoke out all of the
+concurrency bugs, it might be desirable to change the test framework to
+allow running in a mode where SSL is enabled for all tests.
+
+8 User Experience
+=================
+
+Corresponds to the "implications on manageability" section above.
+
+9 Dependencies
+==============
+
+Currently we use OpenSSL, so its idiosyncrasies guide implementation
+choices and timelines. Sometimes it even affects the user experience,
+e.g. in terms of what options exist for cipher suites or certificate
+depth. It's possible that it will prove advantageous to switch to
+another SSL/TLS package with a better interface, probably PolarSSL
+(which often responds to new threats more quickly than OpenSSL).
+
+10 Documentation
+================
+
+TBD, likely extensive (see "User Experience" section).
+
+11 Status
+=========
+
+Awaiting approval.
+
+12 Comments and Discussion
+==========================
diff --git a/done/GlusterFS 3.6/disperse.md b/done/GlusterFS 3.6/disperse.md
new file mode 100644
index 0000000..e2bad37
--- /dev/null
+++ b/done/GlusterFS 3.6/disperse.md
@@ -0,0 +1,142 @@
+Feature
+=======
+
+Dispersed volume translator
+
+Summary
+=======
+
+The disperse translator is a new type of volume for GlusterFS that can
+be used to offer a configurable level of fault tolerance while
+optimizing the disk space waste. It can be seen as a RAID5-like volume.
+
+Owners
+======
+
+Xavier Hernandez <xhernandez@datalab.es>
+
+Current status
+==============
+
+A working version is included in GlusterFS 3.6
+
+Detailed Description
+====================
+
+The disperse translator is based on erasure codes to allow the recovery
+of the data stored on one or more bricks in case of failure. The number
+of bricks that can fail without losing data is configurable.
+
+Each brick stores only a portion of each block of data. Some of these
+portions are called parity or redundancy blocks. They are computed using
+a mathematical transformation so that they can be used to recover the
+content of the portion stored on another brick.
+
+Each volume is composed of a set of N bricks (as many as you want), and
+R of them are used to store the redundancy information. On this
+configuration, if each brick has capacity C, the total space available
+on the volume will be (N - R) \* C.
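+
+For example, a dispersed volume built from N = 6 bricks of 1 TiB each
+with R = 2 redundancy bricks tolerates the failure of any 2 bricks and
+offers (6 - 2) \* 1 TiB = 4 TiB of usable space.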
+
+A versioning system is used to detect inconsistencies and initiate a
+self-heal if appropriate.
+
+All these operations are made on the fly, transparently to the user.
+
+Benefit to GlusterFS
+====================
+
+It can be used to create volumes with a configurable level of redundancy
+(like replicate), but optimizing disk usage.
+
+Scope
+=====
+
+Nature of proposed change
+-------------------------
+
+The dispersed volume is implemented by a client-side translator that
+will be responsible for encoding/decoding the brick contents.
+
+Implications on manageability
+-----------------------------
+
+The new type of volume will be configured like any other one. However,
+the healing operations are quite different and it may be necessary to
+handle them separately.
+
+Implications on presentation layer
+----------------------------------
+
+N/A
+
+Implications on persistence layer
+---------------------------------
+
+N/A
+
+Implications on 'GlusterFS' backend
+-----------------------------------
+
+N/A
+
+Modification to GlusterFS metadata
+----------------------------------
+
+Three new extended attributes are created to manage a dispersed file:
+
+- trusted.ec.config: Contains information about the parameters used to
+ encode the file.
+- trusted.ec.version: Tracks the number of changes made to the file.
+- trusted.ec.size: Tracks the real size of the file.
+
+Implications on 'glusterd'
+--------------------------
+
+glusterd and cli have been modified to add the needed functionality to
+create and manage dispersed volumes.
+
+How To Test
+===========
+
+There is a new glusterd syntax to create dispersed volumes:
+
+    gluster volume create <volname> [disperse [<count>]] [redundancy <count>] <bricks>
+
+Both 'disperse' and 'redundancy' are optional, but at least one of them
+must be present to create a dispersed volume. The <count> of 'disperse'
+is also optional: if not specified, the number of bricks specified in
+the command line is taken as the <count> value. To create a
+distributed-disperse volume, it's necessary to specify 'disperse' with a
+<count> value smaller than the total number of bricks.
+
+When 'redundancy' is not specified, its default value is computed so
+that it generates an optimal configuration. A configuration is optimal
+if *number of bricks - redundancy* is a power of 2. If such a value
+exists and is greater than one, a warning is shown to validate the
+number. If it doesn't exist, 1 is taken and another warning is shown.
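+
+For instance, the following command (with placeholder brick paths)
+creates a dispersed volume of 6 bricks in which any 2 bricks may fail
+without losing data:
+
+    gluster volume create dispvol disperse 6 redundancy 2 \
+        server1:/bricks/b1 server2:/bricks/b1 server3:/bricks/b1 \
+        server4:/bricks/b1 server5:/bricks/b1 server6:/bricks/b1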
+
+Once created, the disperse volume can be started, mounted and used as
+any other volume.
+
+User Experience
+===============
+
+Almost the same. Only a new volume type added.
+
+Dependencies
+============
+
+None
+
+Documentation
+=============
+
+Not available yet.
+
+Status
+======
+
+First implementation ready.
+
+Comments and Discussion
+=======================
diff --git a/done/GlusterFS 3.6/glusterd volume locks.md b/done/GlusterFS 3.6/glusterd volume locks.md
new file mode 100644
index 0000000..a8f8ebd
--- /dev/null
+++ b/done/GlusterFS 3.6/glusterd volume locks.md
@@ -0,0 +1,48 @@
+As of today most gluster commands take a cluster wide lock, before
+performing their respective operations. As a result any two gluster
+commands, which have no interdependency with each other, can't be
+executed simultaneously. To remove this interdependency we propose to
+replace this cluster wide lock with a volume specific lock, so that two
+operations on two different volumes can be performed simultaneously.
+
+1. We classify all gluster operations into three different classes:
+Create volume, Delete volume, and volume-specific operations.
+
+2. At any given point of time, we should allow two simultaneous
+operations (create, delete or volume-specific), as long as both
+operations are not happening on the same volume.
+
+3. If two simultaneous operations are performed on the same volume, the
+operation which manages to acquire the volume lock will succeed, while
+the other will fail. Both might also fail to acquire the volume lock on
+the cluster, in which case both operations will fail.
+
+In order to achieve this, we propose a locking engine, which will
+receive lock requests from these three types of operations. Each such
+request for a particular volume will contest for the same volume lock
+(based on the volume name and the node-uuid). For example, a delete
+volume command for volume1 and a volume status command for volume1 will
+contest for the same lock (comprising the volume name and the uuid of
+the node winning the lock), in which case one of these commands will
+succeed and the other will fail to acquire the lock.
+
+Whereas, if two operations are performed simultaneously on different
+volumes, they should proceed smoothly, as the two operations would
+request two different locks from the locking engine and will succeed in
+acquiring them in parallel.
+
+We maintain a global list of volume-locks (using a dict for a list)
+where the key is the volume name, and which saves the uuid of the
+originator glusterd. These locks are held and released per volume
+transaction.
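+
+A minimal, self-contained C model of this locking engine is sketched
+below. It uses a fixed-size table instead of glusterd's dict\_t and
+plain strings instead of real node UUIDs, purely to illustrate the
+acquire/release semantics; it is not the actual glusterd code.
+
+    #include <stdio.h>
+    #include <string.h>
+
+    #define MAX_LOCKS 16
+
+    /* One entry per held volume lock: volume name -> originator uuid. */
+    struct vol_lock {
+            char volname[64];
+            char owner_uuid[64];
+            int  held;
+    };
+
+    static struct vol_lock locks[MAX_LOCKS];
+
+    /* Returns 0 on success, -1 if the volume lock is already held. */
+    static int volume_lock (const char *volname, const char *uuid)
+    {
+            int free_slot = -1;
+
+            for (int i = 0; i < MAX_LOCKS; i++) {
+                    if (locks[i].held &&
+                        strcmp (locks[i].volname, volname) == 0)
+                            return -1;   /* contention on this volume */
+                    if (!locks[i].held && free_slot < 0)
+                            free_slot = i;
+            }
+            if (free_slot < 0)
+                    return -1;
+
+            snprintf (locks[free_slot].volname,
+                      sizeof (locks[free_slot].volname), "%s", volname);
+            snprintf (locks[free_slot].owner_uuid,
+                      sizeof (locks[free_slot].owner_uuid), "%s", uuid);
+            locks[free_slot].held = 1;
+            return 0;
+    }
+
+    static void volume_unlock (const char *volname, const char *uuid)
+    {
+            for (int i = 0; i < MAX_LOCKS; i++)
+                    if (locks[i].held &&
+                        strcmp (locks[i].volname, volname) == 0 &&
+                        strcmp (locks[i].owner_uuid, uuid) == 0)
+                            locks[i].held = 0;
+    }
+
+    int main (void)
+    {
+            /* Two different volumes lock in parallel; a second lock on
+             * the same volume is rejected. */
+            printf ("volume1 by node-A: %d\n", volume_lock ("volume1", "node-A"));
+            printf ("volume2 by node-B: %d\n", volume_lock ("volume2", "node-B"));
+            printf ("volume1 by node-B: %d\n", volume_lock ("volume1", "node-B"));
+            volume_unlock ("volume1", "node-A");
+            return 0;
+    }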
+
+In order to achieve multiple gluster operations occurring at the same
+time, we also separate opinfos in the op-state-machine, as a part of
+this patch. To do so, we generate a unique transaction-id (uuid) per
+gluster transaction. An opinfo is then associated with this transaction
+id, which is used throughout the transaction. We maintain a run-time
+global list (using a dict) of transaction-ids and their respective
+opinfos to achieve this.
+
+Gluster devel Mailing Thread:
+<http://lists.gnu.org/archive/html/gluster-devel/2013-09/msg00042.html> \ No newline at end of file
diff --git a/done/GlusterFS 3.6/heterogeneous-bricks.md b/done/GlusterFS 3.6/heterogeneous-bricks.md
new file mode 100644
index 0000000..a769b56
--- /dev/null
+++ b/done/GlusterFS 3.6/heterogeneous-bricks.md
@@ -0,0 +1,136 @@
+Feature
+-------
+
+Support heterogeneous (different size) bricks.
+
+Summary
+-------
+
+DHT is currently very naive about brick sizes, assigning equal "weight"
+to each brick/subvolume for purposes of placing new files even though
+the bricks might actually have different sizes. It would be better if
+DHT assigned greater weight (i.e. would create more files) on bricks
+with more total or free space.
+
+This proposal came out of a [mailing-list
+discussion](http://www.gluster.org/pipermail/gluster-users/2014-January/038638.html)
+
+Owners
+------
+
+- Raghavendra G (rgowdapp@redhat.com)
+
+Current status
+--------------
+
+There is a
+[script](https://github.com/gluster/glusterfs/blob/master/extras/rebalance.py)
+representing much of the necessary logic, using DHT's "custom layout"
+feature and other tricks.
+
+The most basic kind of heterogeneous-brick-friendly rebalancing has been
+implemented. [patch](http://review.gluster.org/#/c/8020/)
+
+Detailed Description
+--------------------
+
+There should be (at least) three options:
+
+- Assign subvolume weights based on **total** space.
+
+- Assign subvolume weights based on **free** space.
+
+- Assign all (or nearly all) weight to specific subvolumes.
+
+The last option is useful for those who expand a volume by adding bricks
+and intend to let the system "rebalance automatically" by directing new
+files to the new bricks, without migrating any old data. Once the
+appropriate weights have been calculated, the rebalance command should
+apply the results recursively to all directories within the volume
+(except those with custom layouts) and DHT should assign layouts to new
+directories in accordance with the same weights.
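+
+As an illustration of the first two policies, the self-contained C
+sketch below splits DHT's 32-bit hash space across subvolumes in
+proportion to a per-brick size (total or free, depending on the chosen
+policy). It is only a model of the weighting idea, not the actual DHT
+layout code.
+
+    #include <stdio.h>
+    #include <stdint.h>
+
+    /* Assign each brick a slice of the 2^32 hash values proportional to
+     * its size (expressed in any common unit, e.g. GB). */
+    static void weighted_layout (const uint64_t sizes[], int nbricks)
+    {
+            uint64_t total = 0;
+            uint64_t space = (uint64_t) UINT32_MAX + 1;
+            uint64_t start = 0;
+
+            for (int i = 0; i < nbricks; i++)
+                    total += sizes[i];
+
+            for (int i = 0; i < nbricks; i++) {
+                    /* Give any rounding remainder to the last brick. */
+                    uint64_t share = (i == nbricks - 1)
+                            ? (space - start)
+                            : (uint64_t) ((double) space * (double) sizes[i] /
+                                          (double) total);
+
+                    printf ("brick %d: range 0x%08llx - 0x%08llx\n", i,
+                            (unsigned long long) start,
+                            (unsigned long long) (start + share - 1));
+                    start += share;
+            }
+    }
+
+    int main (void)
+    {
+            /* e.g. bricks of 1 TB, 2 TB and 4 TB */
+            uint64_t sizes[] = { 1, 2, 4 };
+
+            weighted_layout (sizes, 3);
+            return 0;
+    }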
+
+Benefit to GlusterFS
+--------------------
+
+Better support for adding new bricks that are a different size than the
+old, which is common as disk capacity tends to improve with each
+generation (as noted in the ML discussion).
+
+Better support for adding capacity without an expensive (data-migrating)
+rebalance operation.
+
+Scope
+-----
+
+This will involve changes to all current rebalance code - CLI, glusterd,
+DHT, and probably others.
+
+### Implications on manageability
+
+New CLI options.
+
+### Implications on presentation layer
+
+None.
+
+### Implications on persistence layer
+
+None.
+
+### Implications on 'GlusterFS' backend
+
+None, unless we want to add a "policy" xattr on the root inode to be
+consulted when new directories are created (could also be done via
+translator options).
+
+### Modification to GlusterFS metadata
+
+Same as previous.
+
+### Implications on 'glusterd'
+
+New fields in rebalance-related RPCs.
+
+How To Test
+-----------
+
+For each policy:
+
+1. Create a volume with small bricks (ramdisk-based if need be).
+
+1. Fill the bricks to varying degrees.
+
+1. (optional) Add more empty bricks.
+
+1. Rebalance using the target policy.
+
+1. Create some dozens/hundreds of new files.
+
+1. Verify that the distribution of new files matches what is expected
+ for the given policy.
+
+User Experience
+---------------
+
+New options for the "rebalance" command.
+
+Dependencies
+------------
+
+None.
+
+Documentation
+-------------
+
+TBD
+
+Status
+------
+
+Original author has abandoned this change. If anyone else wants to make
+a \*positive\* contribution to fix a long-standing user concern, feel
+free.
+
+Comments and Discussion
+-----------------------
diff --git a/done/GlusterFS 3.6/index.md b/done/GlusterFS 3.6/index.md
new file mode 100644
index 0000000..f4d83db
--- /dev/null
+++ b/done/GlusterFS 3.6/index.md
@@ -0,0 +1,96 @@
+GlusterFS 3.6 Release Planning
+------------------------------
+
+Tentative Dates:
+
+4th Mar, 2014 - Feature proposal freeze
+
+17th Jul, 2014 - Feature freeze & Branching
+
+12th Sep, 2014 - Community Test Weekend \#1
+
+21st Sep, 2014 - 3.6.0 Beta Release
+
+22nd Sep, 2014 - Community Test Week
+
+31st Oct, 2014 - 3.6.0 GA
+
+Feature proposal for GlusterFS 3.6
+----------------------------------
+
+### Features in 3.6.0
+
+- [Features/better-ssl](./better-ssl.md):
+ Various improvements to SSL support.
+
+- [Features/heterogeneous-bricks](./heterogeneous-bricks.md):
+ Support different-sized bricks.
+
+- [Features/disperse](./disperse.md):
+ Volumes based on erasure codes.
+
+- [Features/glusterd-volume-locks](./glusterd volume locks.md):
+ Volume wide locks for glusterd
+
+- [Features/persistent-AFR-changelog-xattributes](./Persistent AFR Changelog xattributes.md):
+ Persistent naming scheme for client xlator names and AFR changelog
+ attributes.
+
+- [Features/better-logging](./Better Logging.md):
+ Gluster logging enhancements to support message IDs per message
+
+- [Features/Better peer identification](./Better Peer Identification.md)
+
+- [Features/Gluster Volume Snapshot](./Gluster Volume Snapshot.md)
+
+- [Features/Gluster User Serviceable Snapshots](./Gluster User Serviceable Snapshots.md)
+
+- **[Features/afrv2](./afrv2.md)**: Afr refactor.
+
+- [Features/RDMA Improvements](./RDMA Improvements.md):
+ Improvements for RDMA
+
+- [Features/Server-side Barrier feature](./Server-side Barrier feature.md):
+ A supplementary feature for the
+ [Features/Gluster Volume Snapshot](./Gluster Volume Snapshot.md) which maintains the consistency across the snapshots.
+
+### Features beyond 3.6.0
+
+- [Features/Smallfile Perf](../GlusterFS 3.7/Small File Performance.md):
+ Small-file performance enhancement.
+
+- [Features/data-classification](../GlusterFS 3.7/Data Classification.md):
+ Tiering, rack-aware placement, and more.
+
+- [Features/new-style-replication](./New Style Replication.md):
+ Log-based, chain replication.
+
+- [Features/thousand-node-glusterd](./Thousand Node Gluster.md):
+ Glusterd changes for higher scale.
+
+- [Features/Trash](../GlusterFS 3.7/Trash.md):
+ Trash translator for GlusterFS
+
+- [Features/Object Count](../GlusterFS 3.7/Object Count.md)
+- [Features/SELinux Integration](../GlusterFS 3.7/SE Linux Integration.md)
+- [Features/Easy addition of custom translators](../GlusterFS 3.7/Easy addition of Custom Translators.md)
+- [Features/Exports Netgroups Authentication](../GlusterFS 3.7/Exports and Netgroups Authentication.md)
+- [Features/outcast](../GlusterFS 3.7/Outcast.md): Outcast
+
+- **[Features/Policy based Split-brain Resolution](../GlusterFS 3.7/Policy based Split-brain Resolution.md)**: Policy Based
+Split-brain resolution
+
+- [Features/rest-api](../GlusterFS 3.7/rest-api.md):
+ REST API for Gluster Volume and Peer Management
+
+- [Features/Archipelago Integration](../GlusterFS 3.7/Archipelago Integration.md):
+ Improvements for integration with Archipelago
+
+Release Criterion
+-----------------
+
+- All new features to be documented in admin guide
+
+- Feature tested as part of testing days.
+
+- More to follow \ No newline at end of file