summaryrefslogtreecommitdiffstats
path: root/doc/developer-guide/coding-standard.md
diff options
context:
space:
mode:
Diffstat (limited to 'doc/developer-guide/coding-standard.md')
-rw-r--r--doc/developer-guide/coding-standard.md697
1 files changed, 697 insertions, 0 deletions
diff --git a/doc/developer-guide/coding-standard.md b/doc/developer-guide/coding-standard.md
new file mode 100644
index 00000000000..031c6c0da99
--- /dev/null
+++ b/doc/developer-guide/coding-standard.md
@@ -0,0 +1,697 @@
+GlusterFS Coding Standards
+==========================
+
+Before you get started
+----------------------
+Before starting with other part of coding standard, install `clang-format`
+
+On Fedora:
+```
+$ dnf install clang
+```
+On debian/Ubuntu:
+```
+$ apt-get install clang
+```
+Once you are done with all the local changes, you need to run below set of commands,
+before submitting the patch for review.
+```
+$ git add $file # if any
+$ git commit -a -s -m "commit message"
+$ git show --pretty="format:" --name-only | grep -v "contrib/" | egrep "*\.[ch]$" | xargs clang-format -i
+$ git diff # see if there are any changes
+$ git commit -a --amend # get the format changes done
+$ ./submit-for-review.sh
+```
+
+
+Structure definitions should have a comment per member
+------------------------------------------------------
+
+Every member in a structure definition must have a comment about its purpose.
+The comment should be descriptive without being overly verbose. For pointer
+members, lifecycle concerns for the pointed-to object should be noted. For lock
+members, the relationship between the lock member and the other members it
+protects should be explicit.
+
+*Bad:*
+
+```
+gf_lock_t lock; /* lock */
+```
+
+*Good:*
+
+```
+DBTYPE access_mode; /* access mode for accessing
+ * the databases, can be
+ * DB_HASH, DB_BTREE
+ * (option access-mode <mode>)
+ */
+```
+
+Structure members should be aligned based on the padding requirements
+---------------------------------------------------------------------
+
+The compiler will make sure that structure members have optimum alignment,
+but at the expense of suboptimal padding. More important is to optimize the
+padding. The compiler won't do that for you.
+
+This also will help utilize the memory better
+
+*Bad:*
+```
+struct bad {
+ bool b; /* 0 */
+ /* 1..7 pad */
+ void *p; /* 8..15 */
+ char c; /* 16 */
+ char a[16]; /* 17..33 */
+ /* 34..39 pad */
+ int64_t ii; /* 40..47 */
+ int32_t i; /* 48..51 */
+ /* 52..55 pad */
+ int64_t iii; /* 56..63 */
+};
+```
+
+*Good:*
+```
+struct good {
+ int64_t ii; /* explicit 64-bit types */
+ void *p; /* may be 64- or 32-bit */
+ long l; /* may be 64- or 32-bit */
+ int i; /* 32-bit */
+ short s; /* 16-bit */
+ char c; /* 8-bit */
+ bool b; /* 8-bit */
+ char a[1024];
+);
+```
+Make sure the items with the most stringent alignment requirements will need
+to come earliest (ie, pointers and perhaps uint64_t etc), and those with less
+stringent alignment requirements at the end (uint16/uint8 and char). Also note
+that the long array (if any) should be at the end of the structure, regardless
+of the type.
+
+Also note, if your structure's overall size is crossing 1k-4k limit, it is
+recommended to mention the reason why the particular structure needs so much
+memory as a comment at the top.
+
+Use \_typename for struct tags and typename\_t for typedefs
+---------------------------------------------------------
+
+Being consistent here makes it possible to automate navigation from use of a
+type to its true definition (not just the typedef).
+
+*Bad:*
+
+```
+struct thing {...};
+struct thing_t {...};
+typedef struct _thing thing;
+```
+
+*Good:*
+
+```
+typedef struct _thing {...} thing_t;
+```
+
+No double underscores
+---------------------
+
+Identifiers beginning with double underscores are supposed to reserved for the
+compiler.
+
+http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf
+
+When you need to define inner/outer functions, use a different prefix/suffix.
+
+*Bad:*
+
+```
+void __do_something (void);
+
+void
+do_something (void)
+{
+ LOCK ();
+ __do_something ();
+ UNLOCK ();
+}
+```
+
+*Good:*
+
+```
+void do_something_locked (void);
+```
+
+Only use safe pointers in initializers
+----------------------------------------------------------
+
+Some pointers, such as `this` in a fop function, can be assumed to be non-NULL.
+However, other parameters and further-derived values might be NULL.
+
+*Good:*
+
+```
+pid_t pid = frame->root->pid;
+```
+
+
+*Bad:*
+
+```
+data_t *my_data = dict_get (xdata, "fubar");
+```
+
+No giant stack allocations
+--------------------------
+
+Synctasks have small finite stacks. To avoid overflowing these stacks, avoid
+allocating any large data structures on the stack. Use dynamic allocation
+instead.
+
+*Bad:*
+
+```
+gf_boolean_t port_inuse[65536]; /* 256KB, this actually happened */
+```
+
+NOTE: Ideal is to limit the stack array to less than 256 bytes.
+
+
+Character array initializing
+----------------------------
+
+It is recommended to keep the character array initializing to empty string.
+
+*Good:*
+```
+char msg[1024] = "";
+```
+
+Not so much recommended, even though it means the same.
+
+```
+char msg[1024] = {0,};
+```
+
+We recommend above to structure initialization.
+
+
+
+Validate all arguments to a function
+------------------------------------
+
+All pointer arguments to a function must be checked for `NULL`.
+A macro named `GF_VALIDATE_OR_GOTO` (in `common-utils.h`)
+takes two arguments; if the first is `NULL`, it writes a log message and
+jumps to a label specified by the second aergument after setting errno
+appropriately. There are several variants of this function for more
+specific purposes, and their use is recommended.
+
+*Bad:*
+
+```
+/* top of function */
+ret = dict_get (xdata, ...)
+```
+
+*Good:*
+
+```
+/* top of function */
+GF_VALIDATE_OR_GOTO(xdata,out);
+ret = dict_get (xdata, ...)
+```
+
+Never rely on precedence of operators
+-------------------------------------
+
+Never write code that relies on the precedence of operators to execute
+correctly. Such code can be hard to read and someone else might not
+know the precedence of operators as accurately as you do. This includes
+precedence of increment/decrement vs. field/subscript. The only exceptions are
+arithmetic operators (which have had defined precedence since before computers
+even existed) and boolean negation.
+
+*Bad:*
+
+```
+if (op_ret == -1 && errno != ENOENT)
+++foo->bar /* incrementing foo, or incrementing foo->bar? */
+a && b || !c
+```
+
+*Good:*
+
+```
+if ((op_ret == -1) && (errno != ENOENT))
+(++foo)->bar
+++(foo->bar)
+(a && b) || !c
+a && (b || !c)
+```
+
+Use exactly matching types
+--------------------------
+
+Use a variable of the exact type declared in the manual to hold the
+return value of a function. Do not use an 'equivalent' type.
+
+
+*Bad:*
+
+```
+int len = strlen (path);
+```
+
+*Good:*
+
+```
+size_t len = strlen (path);
+```
+
+Avoid code such as `foo->bar->baz`; check every pointer
+-------------------------------------------------------------
+
+Do not write code that blindly follows a chain of pointer references. Any
+pointer in the chain may be `NULL` and thus cause a crash. Verify that each
+pointer is non-null before following it. Even if `foo->bar` has been checked
+and is known safe, repeating it can make code more verbose and less clear.
+
+This rule includes `[]` as well as `->` because both dereference pointers.
+
+*Bad:*
+
+```
+foo->bar->field1 = value1;
+xyz = foo->bar->field2 + foo->bar->field3 * foo->bar->field4;
+foo->bar[5].baz
+```
+
+*Good:*
+
+```
+my_bar = foo->bar;
+if (!my_bar) ... return;
+my_bar->field1 = value1;
+xyz = my_bar->field2 + my_bar->field3 * my_bar->field4;
+```
+
+Document unchecked return values
+----------------------------------------------------
+
+In general, return values should be checked. If a function is being called
+for its side effects and the return value really doesn't matter, an explicit
+cast to void is required (to keep static analyzers happy) and a comment is
+recommended.
+
+*Bad:*
+
+```
+close (fd);
+do_important_thing ();
+```
+
+*Good (or at least OK):*
+
+```
+(void) sleep (1);
+```
+
+Gracefully handle failure of malloc (and other allocation functions)
+--------------------------------------------------------------------
+
+GlusterFS should never crash or exit due to lack of memory. If a
+memory allocation fails, the call should be unwound and an error
+returned to the user.
+
+*Use result args and reserve the return value to indicate success or failure:*
+
+The return value of every functions must indicate success or failure (unless
+it is impossible for the function to fail --- e.g., boolean functions). If
+the function needs to return additional data, it must be returned using a
+result (pointer) argument.
+
+*Bad:*
+
+```
+int32_t dict_get_int32 (dict_t *this, char *key);
+```
+
+*Good:*
+
+```
+int dict_get_int32 (dict_t *this, char *key, int32_t *val);
+```
+
+Always use the 'n' versions of string functions
+-----------------------------------------------
+
+Unless impossible, use the length-limited versions of the string functions.
+
+*Bad:*
+
+```
+strcpy (entry_path, real_path);
+```
+
+*Good:*
+
+```
+strncpy (entry_path, real_path, entry_path_len);
+```
+
+Do not use memset prior to sprintf/snprintf/vsnprintf etc...
+------------------------------------------------------------
+snprintf(and other similar string functions) terminates the buffer with a
+'\0'(null character). Hence, there is no need to do a memset before using
+snprintf. (Of course you need to account one extra byte for the null character
+in your allocation).
+
+Note: Similarly if you are doing pre-memory allocation for the buffer, use
+GF_MALLOC instead of GF_CALLOC, since the later is bit costlier.
+
+*Bad:*
+
+```
+char buffer[x];
+memset (buffer, 0, x);
+bytes_read = snprintf (buffer, sizeof buffer, "bad standard");
+```
+
+*Good:*
+```
+char buffer[x];
+bytes_read = snprintf (buffer, sizeof (buffer), "good standard");
+```
+
+And it is always to good initialize the char array if the string is static.
+
+E.g.
+```
+char buffer[] = "good standard";
+```
+
+No dead or commented code
+-------------------------
+
+There must be no dead code (code to which control can never be passed) or
+commented out code in the codebase.
+
+Function length or Keep functions small
+---------------------------------------
+
+We live in the UNIX-world where modules do one thing and do it well.
+This rule should apply to our functions also. If a function is very long, try splitting it
+into many little helper functions. The question is, in a coding
+spree, how do we know a function is long and unreadable. One rule of
+thumb given by Linus Torvalds is that, a function should be broken-up
+if you have 4 or more levels of indentation going on for more than 3-4
+lines.
+
+*Example for a helper function:*
+```
+static int
+same_owner (posix_lock_t *l1, posix_lock_t *l2)
+{
+ return ((l1->client_pid == l2->client_pid) &&
+ (l1->transport == l2->transport));
+}
+```
+
+Define functions as static
+--------------------------
+
+Declare functions as static unless they're exposed via a module-level API for
+use from other modules.
+
+No nested functions
+-------------------
+
+Nested functions have proven unreliable, e.g. as callbacks in code that uses
+ucontext (green) threads,
+
+Use inline functions instead of macros whenever possible
+--------------------------------------------------------
+
+Inline functions enforce type safety; macros do not. Use macros only for things
+that explicitly need to be type-agnostic (e.g. cases where one might use
+generics or templates in other languages), or that use other preprocessor
+features such as `#` for stringification or `##` for token pasting. In general,
+"static inline" is the preferred form.
+
+Avoid copypasta
+---------------
+
+Code that is copied and then pasted into multiple functions often creates
+maintenance problems later, e.g. updating all but one instance for a subsequent
+change. If you find yourself copying the same "boilerplate" many places,
+consider refactoring to use helper functions (including inline) or macros, or
+code generation.
+
+Ensure function calls wrap around after 80-columns
+--------------------------------------------------
+
+Place remaining arguments on the next line if needed.
+
+Functions arguments and function definition
+-------------------------------------------
+
+Place all the arguments of a function definition on the same line
+until the line goes beyond 80-cols. Arguments that extend beyind
+80-cols should be placed on the next line.
+
+Style issues
+------------
+
+### Brace placement
+
+Use K&R/Linux style of brace placement for blocks.
+
+*Good:*
+
+```
+int some_function (...)
+{
+ if (...) {
+ /* ... */
+ } else if (...) {
+ /* ... */
+ } else {
+ /* ... */
+ }
+
+ do {
+ /* ... */
+ } while (cond);
+}
+```
+
+### Indentation
+
+Use *eight* spaces for indenting blocks. Ensure that your
+file contains only spaces and not tab characters. You can do this
+in Emacs by selecting the entire file (`C-x h`) and
+running `M-x untabify`.
+
+To make Emacs indent lines automatically by eight spaces, add this
+line to your `.emacs`:
+
+```
+(add-hook 'c-mode-hook (lambda () (c-set-style "linux")))
+```
+
+### Comments
+
+Write a comment before every function describing its purpose (one-line),
+its arguments, and its return value. Mention whether it is an internal
+function or an exported function.
+
+Write a comment before every structure describing its purpose, and
+write comments about each of its members.
+
+Follow the style shown below for comments, since such comments
+can then be automatically extracted by doxygen to generate
+documentation.
+
+*Good:*
+
+```
+/**
+* hash_name -hash function for filenames
+* @par: parent inode number
+* @name: basename of inode
+* @mod: number of buckets in the hashtable
+*
+* @return: success: bucket number
+* failure: -1
+*
+* Not for external use.
+*/
+```
+
+### Indicating critical sections
+
+To clearly show regions of code which execute with locks held, use
+the following format:
+
+```
+pthread_mutex_lock (&mutex);
+{
+ /* code */
+}
+pthread_mutex_unlock (&mutex);
+```
+
+### Always use braces
+
+Even around single statements.
+
+*Bad:*
+
+```
+if (condition) action ();
+
+if (condition)
+ action ();
+```
+
+*Good:*
+
+```
+if (condition) {
+ action ();
+}
+```
+
+### Avoid multi-line conditionals
+
+These can be hard to read and even harder to modify later. Predicate functions
+and helper variables are always better for maintainability.
+
+*Bad:*
+
+```
+if ((thing1 && other_complex_condition (thing1, lots, of, args))
+ || (!thing2 || even_more_complex_condition (thing2))
+ || all_sorts_of_stuff_with_thing3) {
+ return;
+}
+
+```
+
+*Better:*
+
+```
+thing1_ok = predicate1 (thing1, lots, of, args
+thing2_ok = predicate2 (thing2);
+thing3_ok = predicate3 (thing3);
+
+if (!thing1_ok || !thing2_ok || !thing3_ok) {
+ return;
+}
+```
+
+*Best:*
+
+```
+if (thing1 && other_complex_condition (thing1, lots, of, args)) {
+ return;
+}
+if (!thing2 || even_more_complex_condition (thing2)) {
+ /* Note potential for a different message here. */
+ return;
+}
+if (all_sorts_of_stuff_with_thing3) {
+ /* And here too. */
+ return;
+}
+```
+
+### Use 'const' liberally
+
+If a value isn't supposed/expected to change, there's no cost to adding a
+'const' keyword and it will help prevent violation of expectations.
+
+### Avoid global variables (including 'static' auto variables)
+Almost all state in Gluster is contextual and should be contained in the
+appropriate structure reflecting its scope (e.g. `call\_frame\_t`, `call\_stack\_t`,
+`xlator\_t`, `glusterfs\_ctx\_t`). With dynamic loading and graph switches in play,
+each global requires careful consideration of when it should be initialized or
+reinitialized, when it might _accidentally_ be reinitialized, when its value
+might become stale, and so on. A few global variables are needed to serve as
+'anchor points' for these structures, and more exceptions to the rule might be
+approved in the future, but new globals should not be added to the codebase
+without explicit approval.
+
+## A skeleton fop function
+
+This is the recommended template for any fop. In the beginning come the
+initializations. After that, the 'success' control flow should be linear. Any
+error conditions should cause a `goto` to a label at the end. By convention
+this is 'out' if there is only one such label, but a cascade of such labels is
+allowable to support multi-stage cleanup. At that point, the code should detect
+the error that has occurred and do appropriate cleanup.
+
+```
+int32_t
+sample_fop (call_frame_t *frame, xlator_t *this, ...)
+{
+ char * var1 = NULL;
+ int32_t op_ret = -1;
+ int32_t op_errno = 0;
+ DIR * dir = NULL;
+ struct posix_fd * pfd = NULL;
+
+ VALIDATE_OR_GOTO (frame, out);
+ VALIDATE_OR_GOTO (this, out);
+
+ /* other validations */
+
+ dir = opendir (...);
+
+ if (dir == NULL) {
+ op_errno = errno;
+ gf_log (this->name, GF_LOG_ERROR,
+ "opendir failed on %s (%s)", loc->path,
+ strerror (op_errno));
+ goto out;
+ }
+
+ /* another system call */
+ if (...) {
+ op_errno = ENOMEM;
+ gf_log (this->name, GF_LOG_ERROR,
+ "out of memory :(");
+ goto out;
+ }
+
+ /* ... */
+
+ out:
+ if (op_ret == -1) {
+
+ /* check for all the cleanup that needs to be
+ done */
+
+ if (dir) {
+ closedir (dir);
+ dir = NULL;
+ }
+
+ if (pfd) {
+ FREE (pfd->path);
+ FREE (pfd);
+ pfd = NULL;
+ }
+ }
+
+ STACK_UNWIND (frame, op_ret, op_errno, fd);
+ return 0;
+}
+```