summaryrefslogtreecommitdiffstats
path: root/doc/developer-guide/coding-standard.md
diff options
context:
space:
mode:
authorJeff Darcy <jdarcy@fb.com>2017-06-29 07:00:41 -0700
committerJeff Darcy <jeff@pl.atyp.us>2017-10-16 13:52:34 +0000
commit09f6ae2111779c085fd108436cbc56615ef5f3df (patch)
tree6b066470579b3d2b03d30fa42ea78e4bcf97a053 /doc/developer-guide/coding-standard.md
parent206120126d455417a81a48ae473d49be337e9463 (diff)
project: update coding standard
Change-Id: I1d389c8d97ec1491be675bea2ff90898f1209861 Signed-off-by: Jeff Darcy <jdarcy@fb.com>
Diffstat (limited to 'doc/developer-guide/coding-standard.md')
-rw-r--r--doc/developer-guide/coding-standard.md320
1 files changed, 245 insertions, 75 deletions
diff --git a/doc/developer-guide/coding-standard.md b/doc/developer-guide/coding-standard.md
index 368c5553464..446e3424d16 100644
--- a/doc/developer-guide/coding-standard.md
+++ b/doc/developer-guide/coding-standard.md
@@ -4,8 +4,11 @@ GlusterFS Coding Standards
Structure definitions should have a comment per member
------------------------------------------------------
-Every member in a structure definition must have a comment about its
-purpose. The comment should be descriptive without being overly verbose.
+Every member in a structure definition must have a comment about its purpose.
+The comment should be descriptive without being overly verbose. For pointer
+members, lifecycle concerns for the pointed-to object should be noted. For lock
+members, the relationship between the lock member and the other members it
+protects should be explicit.
*Bad:*
@@ -23,59 +26,112 @@ DBTYPE access_mode; /* access mode for accessing
*/
```
-Declare all variables at the beginning of the function
-------------------------------------------------------
+Use \_typename for struct tags and typename\_t for typedefs
+---------------------------------------------------------
+
+Being consistent here makes it possible to automate navigation from use of a
+type to its true definition (not just the typedef).
+
+*Bad:*
+
+```
+struct thing {...};
+struct thing_t {...};
+typedef struct _thing thing;
+```
+
+*Good:*
+
+```
+typedef struct _thing {...} thing_t;
+```
+
+No double underscores
+---------------------
-All local variables in a function must be declared immediately after the
-opening brace. This makes it easy to keep track of memory that needs to be freed
-during exit. It also helps debugging, since gdb cannot handle variables
-declared inside loops or other such blocks.
+Identifiers beginning with double underscores are supposed to reserved for the
+compiler.
-Always initialize local variables
----------------------------------
+http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf
-Every local variable should be initialized to a sensible default value
-at the point of its declaration. All pointers should be initialized to NULL,
-and all integers should be zero or (if it makes sense) an error value.
+When you need to define inner/outer functions, use a different prefix/suffix.
+*Bad:*
+
+```
+void __do_something (void);
+
+void
+do_something (void)
+{
+ LOCK ();
+ __do_something ();
+ UNLOCK ();
+}
+```
*Good:*
```
-int ret = 0;
-char *databuf = NULL;
-int _fd = -1;
+void do_something_locked (void);
```
-Initialization should always be done with a constant value
+Only use safe pointers in initializers
----------------------------------------------------------
-Never use a non-constant expression as the initialization value for a variable.
+Some pointers, such as `this` in a fop function, can be assumed to be non-NULL.
+However, other parameters and further-derived values might be NULL.
+
+*Good:*
+
+```
+pid_t pid = frame->root->pid;
+```
+
+
+*Bad:*
+```
+data_t *my_data = dict_get (xdata, "fubar");
+```
+
+No giant stack allocations
+--------------------------
+
+Synctasks have small finite stacks. To avoid overflowing these stacks, avoid
+allocating any large data structures on the stack. Use dynamic allocation
+instead.
*Bad:*
```
-pid_t pid = frame->root->pid;
-char *databuf = malloc (1024);
+gf_boolean_t port_inuse[65536]; /* 256KB, this actually happened */
```
+
Validate all arguments to a function
------------------------------------
All pointer arguments to a function must be checked for `NULL`.
-A macro named `VALIDATE` (in `common-utils.h`)
-takes one argument, and if it is `NULL`, writes a log message and
-jumps to a label called `err` after setting op_ret and op_errno
-appropriately. It is recommended to use this template.
+A macro named `GF_VALIDATE_OR_GOTO` (in `common-utils.h`)
+takes two arguments; if the first is `NULL`, it writes a log message and
+jumps to a label specified by the second aergument after setting errno
+appropriately. There are several variants of this function for more
+specific purposes, and their use is recommended.
+*Bad:*
+
+```
+/* top of function */
+ret = dict_get (xdata, ...)
+```
*Good:*
```
-VALIDATE(frame);
-VALIDATE(this);
-VALIDATE(inode);
+/* top of function */
+GF_VALIDATE_OR_GOTO(xdata,out);
+ret = dict_get (xdata, ...)
```
Never rely on precedence of operators
@@ -83,25 +139,34 @@ Never rely on precedence of operators
Never write code that relies on the precedence of operators to execute
correctly. Such code can be hard to read and someone else might not
-know the precedence of operators as accurately as you do.
+know the precedence of operators as accurately as you do. This includes
+precedence of increment/decrement vs. field/subscript. The only exceptions are
+arithmetic operators (which have had defined precedence since before computers
+even existed) and boolean negation.
*Bad:*
```
if (op_ret == -1 && errno != ENOENT)
+++foo->bar /* incrementing foo, or incrementing foo->bar? */
+a && b || !c
```
*Good:*
```
if ((op_ret == -1) && (errno != ENOENT))
+(++foo)->bar
+++(foo->bar)
+(a && b) || !c
+a && (b || !c)
```
Use exactly matching types
--------------------------
Use a variable of the exact type declared in the manual to hold the
-return value of a function. Do not use an ``equivalent'' type.
+return value of a function. Do not use an 'equivalent' type.
*Bad:*
@@ -116,42 +181,56 @@ int len = strlen (path);
size_t len = strlen (path);
```
-Never write code such as `foo->bar->baz`; check every pointer
+Avoid code such as `foo->bar->baz`; check every pointer
-------------------------------------------------------------
-Do not write code that blindly follows a chain of pointer
-references. Any pointer in the chain may be `NULL` and thus
-cause a crash. Verify that each pointer is non-null before following
-it.
+Do not write code that blindly follows a chain of pointer references. Any
+pointer in the chain may be `NULL` and thus cause a crash. Verify that each
+pointer is non-null before following it. Even if `foo->bar` has been checked
+and is known safe, repeating it can make code more verbose and less clear.
+
+This rule includes `[]` as well as `->` because both dereference pointers.
+
+*Bad:*
+
+```
+foo->bar->field1 = value1;
+xyz = foo->bar->field2 + foo->bar->field3 * foo->bar->field4;
+foo->bar[5].baz
+```
-Check return value of all functions and system calls
+*Good:*
+
+```
+my_bar = foo->bar;
+if (!my_bar) ... return;
+my_bar->field1 = value1;
+xyz = my_bar->field2 + my_bar->field3 * my_bar->field4;
+```
+
+Document unchecked return values
----------------------------------------------------
-The return value of all system calls and API functions must be checked
-for success or failure.
+In general, return values should be checked. If a function is being called
+for its side effects and the return value really doesn't matter, an explicit
+cast to void is required (to keep static analyzers happy) and a comment is
+recommended.
*Bad:*
```
close (fd);
+do_important_thing ();
```
-*Good:*
+*Good (or at least OK):*
```
-op_ret = close (_fd);
-if (op_ret == -1) {
- gf_log (this->name, GF_LOG_ERROR,
- "close on file %s failed (%s)", real_path,
- strerror (errno));
- op_errno = errno;
- goto out;
-}
+(void) sleep (1);
```
-
-Gracefully handle failure of malloc
------------------------------------
+Gracefully handle failure of malloc (and other allocation functions)
+--------------------------------------------------------------------
GlusterFS should never crash or exit due to lack of memory. If a
memory allocation fails, the call should be unwound and an error
@@ -176,7 +255,7 @@ int32_t dict_get_int32 (dict_t *this, char *key);
int dict_get_int32 (dict_t *this, char *key, int32_t *val);
```
-Always use the `n' versions of string functions
+Always use the 'n' versions of string functions
-----------------------------------------------
Unless impossible, use the length-limited versions of the string functions.
@@ -199,12 +278,6 @@ No dead or commented code
There must be no dead code (code to which control can never be passed) or
commented out code in the codebase.
-Only one unwind and return per function
----------------------------------------
-
-There must be only one exit out of a function. `UNWIND` and return
-should happen at only point in the function.
-
Function length or Keep functions small
---------------------------------------
@@ -226,20 +299,35 @@ same_owner (posix_lock_t *l1, posix_lock_t *l2)
}
```
-Defining functions as static
-----------------------------
+Define functions as static
+--------------------------
+
+Declare functions as static unless they're exposed via a module-level API for
+use from other modules.
-Define internal functions as static only if you're
-very sure that there will not be a crash(..of any kind..) emanating in
-that function. If there is even a remote possibility, perhaps due to
-pointer derefering, etc, declare the function as non-static. This
-ensures that when a crash does happen, the function name shows up the
-in the back-trace generated by libc. However, doing so has potential
-for polluting the function namespace, so to avoid conflicts with other
-components in other parts, ensure that the function names are
-prepended with a prefix that identify the component to which it
-belongs. For eg. non-static functions in io-threads translator start
-with iot_.
+No nested functions
+-------------------
+
+Nested functions have proven unreliable, e.g. as callbacks in code that uses
+ucontext (green) threads,
+
+Use inline functions instead of macros whenever possible
+--------------------------------------------------------
+
+Inline functions enforce type safety; macros do not. Use macros only for things
+that explicitly need to be type-agnostic (e.g. cases where one might use
+generics or templates in other languages), or that use other preprocessor
+features such as `#` for stringification or `##` for token pasting. In general,
+"static inline" is the preferred form.
+
+Avoid copypasta
+---------------
+
+Code that is copied and then pasted into multiple functions often creates
+maintenance problems later, e.g. updating all but one instance for a subsequent
+change. If you find yourself copying the same "boilerplate" many places,
+consider refactoring to use helper functions (including inline) or macros, or
+code generation.
Ensure function calls wrap around after 80-columns
--------------------------------------------------
@@ -335,13 +423,95 @@ pthread_mutex_lock (&mutex);
pthread_mutex_unlock (&mutex);
```
-*A skeleton fop function:*
+### Always use braces
+
+Even around single statements.
+
+*Bad:*
+
+```
+if (condition) action ();
+
+if (condition)
+ action ();
+```
+
+*Good:*
+
+```
+if (condition) {
+ action ();
+}
+```
+
+### Avoid multi-line conditionals
+
+These can be hard to read and even harder to modify later. Predicate functions
+and helper variables are always better for maintainability.
+
+*Bad:*
+
+```
+if ((thing1 && other_complex_condition (thing1, lots, of, args))
+ || (!thing2 || even_more_complex_condition (thing2))
+ || all_sorts_of_stuff_with_thing3) {
+ return;
+}
+
+```
+
+*Better:*
+
+```
+thing1_ok = predicate1 (thing1, lots, of, args
+thing2_ok = predicate2 (thing2);
+thing3_ok = predicate3 (thing3);
+
+if (!thing1_ok || !thing2_ok || !thing3_ok) {
+ return;
+}
+```
+
+*Best:*
+
+```
+if (thing1 && other_complex_condition (thing1, lots, of, args)) {
+ return;
+}
+if (!thing2 || even_more_complex_condition (thing2)) {
+ /* Note potential for a different message here. */
+ return;
+}
+if (all_sorts_of_stuff_with_thing3) {
+ /* And here too. */
+ return;
+}
+```
+
+### Use 'const' liberally
+
+If a value isn't supposed/expected to change, there's no cost to adding a
+'const' keyword and it will help prevent violation of expectations.
+
+### Avoid global variables (including 'static' auto variables)
+Almost all state in Gluster is contextual and should be contained in the
+appropriate structure reflecting its scope (e.g. call\_frame\_t, call\_stack\_t,
+xlator\_t, glusterfs\_ctx\_t). With dynamic loading and graph switches in play,
+each global requires careful consideration of when it should be initialized or
+reinitialized, when it might _accidentally_ be reinitialized, when its value
+might become stale, and so on. A few global variables are needed to serve as
+'anchor points' for these structures, and more exceptions to the rule might be
+approved in the future, but new globals should not be added to the codebase
+without explicit approval.
+
+## A skeleton fop function
-This is the recommended template for any fop. In the beginning come
-the initializations. After that, the `success' control flow should be
-linear. Any error conditions should cause a `goto` to a single
-point, `out`. At that point, the code should detect the error
-that has occurred and do appropriate cleanup.
+This is the recommended template for any fop. In the beginning come the
+initializations. After that, the 'success' control flow should be linear. Any
+error conditions should cause a `goto` to a label at the end. By convention
+this is 'out' if there is only one such label, but a cascade of such labels is
+allowable to support multi-stage cleanup. At that point, the code should detect
+the error that has occurred and do appropriate cleanup.
```
int32_t