summaryrefslogtreecommitdiffstats
path: root/doc/developer-guide/coding-standard.md
blob: 7c1beddc2d5f55395cedd94bd09af690de1038b9 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
GlusterFS Coding Standards
==========================

Before you get started
----------------------
Before starting with other part of coding standard, install `clang-format`

On Fedora:

$ dnf install clang

On debian/Ubuntu:

$ apt-get install clang

Once you are done with all the local changes, you need to run below set of commands,
before submitting the patch for review.

 $ git add $file # if any
 $ git commit -a -s -m "commit message"
 $ git show --pretty="format:" --name-only | grep -v "contrib/" | egrep "*\.[ch]$" | xargs clang-format -i
 $ git diff # see if there are any changes
 $ git commit -a --amend # get the format changes done
 $ ./submit-for-review.sh



Structure definitions should have a comment per member
------------------------------------------------------

Every member in a structure definition must have a comment about its purpose.
The comment should be descriptive without being overly verbose.  For pointer
members, lifecycle concerns for the pointed-to object should be noted.  For lock
members, the relationship between the lock member and the other members it
protects should be explicit.

*Bad:*

```
gf_lock_t   lock;           /* lock */
```

*Good:*

```
DBTYPE      access_mode;    /* access mode for accessing
                             * the databases, can be
                             * DB_HASH, DB_BTREE
                             * (option access-mode <mode>)
                             */
```

Structure members should be aligned based on the padding requirements
---------------------------------------------------------------------

The compiler will make sure that structure members have optimum alignment,
but at the expense of suboptimal padding. More important is to optimize the
padding. The compiler won't do that for you.

This also will help utilize the memory better

*Bad:*
```
struct bad {
    bool  b; /* 0 */
             /* 1..7 pad */
    void *p; /* 8..15 */
    char  c; /* 16 */
    char  a[16]; /* 17..33 */
             /* 34..39 pad */
    int64_t ii; /* 40..47 */
    int32_t i; /* 48..51 */
             /* 52..55 pad */
    int64_t iii; /* 56..63 */
};
```

*Good:*
```
struct good {
   int64_t ii; /* explicit 64-bit types */
   void *p; /* may be 64- or 32-bit */
   long  l;   /* may be 64- or 32-bit */
   int   i;   /* 32-bit */
   short s;   /* 16-bit */
   char  c;   /* 8-bit */
   bool  b;   /* 8-bit */
   char  a[1024];
);
```
Make sure the items with the most stringent alignment requirements will need
to come earliest (ie, pointers and  perhaps uint64_t etc), and those with less
stringent alignment requirements at the end (uint16/uint8 and char). Also note
that the long array (if any) should be at the end of the structure, regardless
of the type.

Also note, if your structure's overall size is crossing 1k-4k limit, it is
recommended to mention the reason why the particular structure needs so much
memory as a comment at the top.

Use \_typename for struct tags and typename\_t for typedefs
---------------------------------------------------------

Being consistent here makes it possible to automate navigation from use of a
type to its true definition (not just the typedef).

*Bad:*

```
struct thing {...};
struct thing_t {...};
typedef struct _thing thing;
```

*Good:*

```
typedef struct _thing {...} thing_t;
```

No double underscores
---------------------

Identifiers beginning with double underscores are supposed to reserved for the
compiler.

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf

When you need to define inner/outer functions, use a different prefix/suffix.

*Bad:*

```
void __do_something (void);

void
do_something (void)
{
        LOCK ();
        __do_something ();
        UNLOCK ();
}
```

*Good:*

```
void do_something_locked (void);
```

Only use safe pointers in initializers
----------------------------------------------------------

Some pointers, such as `this` in a fop function, can be assumed to be non-NULL.
However, other parameters and further-derived values might be NULL.

*Good:*

```
pid_t pid     = frame->root->pid;
```


*Bad:*

```
data_t *my_data = dict_get (xdata, "fubar");
```

No giant stack allocations
--------------------------

Synctasks have small finite stacks.  To avoid overflowing these stacks, avoid
allocating any large data structures on the stack.  Use dynamic allocation
instead.

*Bad:*

```
gf_boolean_t port_inuse[65536]; /* 256KB, this actually happened */
```

NOTE: Ideal is to limit the stack array to less than 256 bytes.


Character array initializing
----------------------------

It is recommended to keep the character array initializing to empty string.

*Good:*
```
char msg[1024] = "";
```

Not so much recommended, even though it means the same.

```
char msg[1024] = {0,};
```

We recommend above to structure initialization.



Validate all arguments to a function
------------------------------------

All pointer arguments to a function must be checked for `NULL`.
A macro named `GF_VALIDATE_OR_GOTO` (in `common-utils.h`)
takes two arguments; if the first is `NULL`, it writes a log message and
jumps to a label specified by the second aergument after setting errno
appropriately. There are several variants of this function for more
specific purposes, and their use is recommended.

*Bad:*

```
/* top of function */
ret = dict_get (xdata, ...)
```

*Good:*

```
/* top of function */
GF_VALIDATE_OR_GOTO(xdata,out);
ret = dict_get (xdata, ...)
```

Never rely on precedence of operators
-------------------------------------

Never write code that relies on the precedence of operators to execute
correctly.  Such code can be hard to read and someone else might not
know the precedence of operators as accurately as you do.  This includes
precedence of increment/decrement vs. field/subscript.  The only exceptions are
arithmetic operators (which have had defined precedence since before computers
even existed) and boolean negation.

*Bad:*

```
if (op_ret == -1 && errno != ENOENT)
++foo->bar      /* incrementing foo, or incrementing foo->bar? */
a && b || !c
```

*Good:*

```
if ((op_ret == -1) && (errno != ENOENT))
(++foo)->bar
++(foo->bar)
(a && b) || !c
a && (b || !c)
```

Use exactly matching types
--------------------------

Use a variable of the exact type declared in the manual to hold the
return value of a function. Do not use an 'equivalent' type.


*Bad:*

```
int len = strlen (path);
```

*Good:*

```
size_t len = strlen (path);
```

Avoid code such as `foo->bar->baz`; check every pointer
-------------------------------------------------------------

Do not write code that blindly follows a chain of pointer references. Any
pointer in the chain may be `NULL` and thus cause a crash. Verify that each
pointer is non-null before following it.  Even if `foo->bar` has been checked
and is known safe, repeating it can make code more verbose and less clear.

This rule includes `[]` as well as `->` because both dereference pointers.

*Bad:*

```
foo->bar->field1 = value1;
xyz = foo->bar->field2 + foo->bar->field3 * foo->bar->field4;
foo->bar[5].baz
```

*Good:*

```
my_bar = foo->bar;
if (!my_bar) ... return;
my_bar->field1 = value1;
xyz = my_bar->field2 + my_bar->field3 * my_bar->field4;
```

Document unchecked return values
----------------------------------------------------

In general, return values should be checked.  If a function is being called
for its side effects and the return value really doesn't matter, an explicit
cast to void is required (to keep static analyzers happy) and a comment is
recommended.

*Bad:*

```
close (fd);
do_important_thing ();
```

*Good (or at least OK):*

```
(void) sleep (1);
```

Gracefully handle failure of malloc (and other allocation functions)
--------------------------------------------------------------------

GlusterFS should never crash or exit due to lack of memory. If a
memory allocation fails, the call should be unwound and an error
returned to the user.

*Use result args and reserve the return value to indicate success or failure:*

The return value of every functions must indicate success or failure (unless
it is impossible for the function to fail --- e.g., boolean functions). If
the function needs to return additional data, it must be returned using a
result (pointer) argument.

*Bad:*

```
int32_t dict_get_int32 (dict_t *this, char *key);
```

*Good:*

```
int dict_get_int32 (dict_t *this, char *key, int32_t *val);
```

Always use the 'n' versions of string functions
-----------------------------------------------

Unless impossible, use the length-limited versions of the string functions.

*Bad:*

```
strcpy (entry_path, real_path);
```

*Good:*

```
strncpy (entry_path, real_path, entry_path_len);
```

No dead or commented code
-------------------------

There must be no dead code (code to which control can never be passed) or
commented out code in the codebase.

Function length or Keep functions small
---------------------------------------

We live in the UNIX-world where modules do one thing and do it well.
This rule should apply to our functions also. If a function is very long, try splitting it
into many little helper functions. The question is, in a coding
spree, how do we know a function is long and unreadable. One rule of
thumb given by Linus Torvalds is that, a function should be broken-up
if you have 4 or more levels of indentation going on for more than 3-4
lines.

*Example for a helper function:*
```
static int
same_owner (posix_lock_t *l1, posix_lock_t *l2)
{
        return ((l1->client_pid == l2->client_pid) &&
               (l1->transport  == l2->transport));
}
```

Define functions as static
--------------------------

Declare functions as static unless they're exposed via a module-level API for
use from other modules.

No nested functions
-------------------

Nested functions have proven unreliable, e.g. as callbacks in code that uses
ucontext (green) threads,

Use inline functions instead of macros whenever possible
--------------------------------------------------------

Inline functions enforce type safety; macros do not.  Use macros only for things
that explicitly need to be type-agnostic (e.g. cases where one might use
generics or templates in other languages), or that use other preprocessor
features such as `#` for stringification or `##` for token pasting.  In general,
"static inline" is the preferred form.

Avoid copypasta
---------------

Code that is copied and then pasted into multiple functions often creates
maintenance problems later, e.g. updating all but one instance for a subsequent
change.  If you find yourself copying the same "boilerplate" many places,
consider refactoring to use helper functions (including inline) or macros, or
code generation.

Ensure function calls wrap around after 80-columns
--------------------------------------------------

Place remaining arguments on the next line if needed.

Functions arguments and function definition
-------------------------------------------

Place all the arguments of a function definition on the same line
until the line goes beyond 80-cols. Arguments that extend beyind
80-cols should be placed on the next line.

Style issues
------------

### Brace placement

Use K&R/Linux style of brace placement for blocks.

*Good:*

```
int some_function (...)
{
        if (...) {
                /* ... */
        } else if (...) {
                /* ... */
        } else {
                /* ... */
        }

        do {
                /* ... */
        } while (cond);
}
```

### Indentation

Use *eight* spaces for indenting blocks. Ensure that your
file contains only spaces and not tab characters. You can do this
in Emacs by selecting the entire file (`C-x h`) and
running `M-x untabify`.

To make Emacs indent lines automatically by eight spaces, add this
line to your `.emacs`:

```
(add-hook 'c-mode-hook (lambda () (c-set-style "linux")))
```

### Comments

Write a comment before every function describing its purpose (one-line),
its arguments, and its return value. Mention whether it is an internal
function or an exported function.

Write a comment before every structure describing its purpose, and
write comments about each of its members.

Follow the style shown below for comments, since such comments
can then be automatically extracted by doxygen to generate
documentation.

*Good:*

```
/**
* hash_name -hash function for filenames
* @par:  parent inode number
* @name: basename of inode
* @mod:  number of buckets in the hashtable
*
* @return: success: bucket number
*          failure: -1
*
* Not for external use.
*/
```

### Indicating critical sections

To clearly show regions of code which execute with locks held, use
the following format:

```
pthread_mutex_lock (&mutex);
{
        /* code */
}
pthread_mutex_unlock (&mutex);
```

### Always use braces

Even around single statements.

*Bad:*

```
if (condition) action ();

if (condition)
        action ();
```

*Good:*

```
if (condition) {
        action ();
}
```

### Avoid multi-line conditionals

These can be hard to read and even harder to modify later.  Predicate functions
and helper variables are always better for maintainability.

*Bad:*

```
if ((thing1 && other_complex_condition (thing1, lots, of, args))
    || (!thing2 || even_more_complex_condition (thing2))
    || all_sorts_of_stuff_with_thing3) {
    return;
}

```

*Better:*

```
thing1_ok = predicate1 (thing1, lots, of, args
thing2_ok = predicate2 (thing2);
thing3_ok = predicate3 (thing3);

if (!thing1_ok || !thing2_ok || !thing3_ok) {
        return;
}
```

*Best:*

```
if (thing1 && other_complex_condition (thing1, lots, of, args)) {
        return;
}
if (!thing2 || even_more_complex_condition (thing2)) {
        /* Note potential for a different message here. */
        return;
}
if (all_sorts_of_stuff_with_thing3) {
        /* And here too. */
        return;
}
```

### Use 'const' liberally

If a value isn't supposed/expected to change, there's no cost to adding a
'const' keyword and it will help prevent violation of expectations.

### Avoid global variables (including 'static' auto variables)
Almost all state in Gluster is contextual and should be contained in the
appropriate structure reflecting its scope (e.g. call\_frame\_t, call\_stack\_t,
xlator\_t, glusterfs\_ctx\_t).  With dynamic loading and graph switches in play,
each global requires careful consideration of when it should be initialized or
reinitialized, when it might _accidentally_ be reinitialized, when its value
might become stale, and so on.  A few global variables are needed to serve as
'anchor points' for these structures, and more exceptions to the rule might be
approved in the future, but new globals should not be added to the codebase
without explicit approval.

## A skeleton fop function

This is the recommended template for any fop. In the beginning come the
initializations. After that, the 'success' control flow should be linear.  Any
error conditions should cause a `goto` to a label at the end.  By convention
this is 'out' if there is only one such label, but a cascade of such labels is
allowable to support multi-stage cleanup.  At that point, the code should detect
the error that has occurred and do appropriate cleanup.

```
int32_t
sample_fop (call_frame_t *frame, xlator_t *this, ...)
{
        char *            var1     = NULL;
        int32_t           op_ret   = -1;
        int32_t           op_errno = 0;
        DIR *             dir      = NULL;
        struct posix_fd * pfd      = NULL;

        VALIDATE_OR_GOTO (frame, out);
        VALIDATE_OR_GOTO (this, out);

        /* other validations */

        dir = opendir (...);

        if (dir == NULL) {
                op_errno = errno;
                gf_log (this->name, GF_LOG_ERROR,
                        "opendir failed on %s (%s)", loc->path,
                        strerror (op_errno));
                goto out;
        }

        /* another system call */
        if (...) {
                op_errno = ENOMEM;
                gf_log (this->name, GF_LOG_ERROR,
                        "out of memory :(");
                goto out;
        }

        /* ... */

 out:
        if (op_ret == -1) {

          /* check for all the cleanup that needs to be
             done */

                if (dir) {
                        closedir (dir);
                        dir = NULL;
                }

                if (pfd) {
                        FREE (pfd->path);
                        FREE (pfd);
                        pfd = NULL;
                }
        }

        STACK_UNWIND (frame, op_ret, op_errno, fd);
        return 0;
}
```