glusterfs.git/xlators/mount, branch master

mount/fuse: Fix graph-switch when reader-thread-count is set

2020-10-05T11:27:28+00:00

Problem:
The current graph-switch code sets priv->handle_graph_switch to false even
while a graph switch is still in progress, which leads to crashes in some
cases.

Fix:
priv->handle_graph_switch should be set to false only when graph-switch
completes.

fixes: #1539
Change-Id: I5b04f7220a0a6e65c5f5afa3e28d1afe9efcdc31
Signed-off-by: Pranith Kumar K

xlators: prefer libglusterfs time API

2020-09-07T12:56:45+00:00

Prefer timespec_now_realtime() and gf_time() over clock_gettime()
and time(), use gf_tvdiff() and gf_tsdiff() where appropriate,
drop unused time_elapsed() and leftovers in 'struct posix_private'.

Change-Id: Ie1f0229df5b03d0862193ce2b7fb91d27b0981b6
Signed-off-by: Dmitry Antipov 
Updates: #1002

fuse: fetch arbitrary number of groups from /proc/[pid]/status

2020-08-21T14:12:09+00:00

So far Glusterfs has constrained itself to an arbitrary limit (32)
on the number of groups read from /proc/[pid]/status (this was
the number of groups shown there prior to Linux commit
v3.7-9553-g8d238027b87e (v3.8-rc1~74^2~59); since this commit, all
groups are shown).

With this change we'll read groups up to the number Glusterfs
supports in general (64k).

Note: the actual number of groups that are made use of in a
regular Glusterfs setup is still capped at ~93 due to limitations
of the RPC transport. To be able to handle more groups than that,
brick side gid resolution (server.manage-gids option) can be used along
with NIS, LDAP or another such networked directory service (see
https://github.com/gluster/glusterdocs/blob/5ba15a2/docs/Administrator%20Guide/Handling-of-users-with-many-groups.md#limit-in-the-glusterfs-protocol
).

Also adding some diagnostic messages to frame_fill_groups().
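
The parsing step can be sketched roughly as follows. This is only an
illustration, not the code in frame_fill_groups(): the helper name
parse_groups_line() and its cap parameter are hypothetical, and it works
on a text buffer instead of opening /proc/[pid]/status itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: parse the "Groups:" line of a status-style text
 * buffer into gids[], storing at most max entries.
 * Returns the number of gids stored, or -1 if there is no "Groups:" line
 * or a token on it is not a number. */
static int
parse_groups_line(const char *status, gid_t *gids, int max)
{
    const char *p = status;
    int n = 0;

    assert(status != NULL && gids != NULL && max >= 0);

    /* Walk line by line until one starts with "Groups:". */
    while (p && strncmp(p, "Groups:", 7) != 0) {
        p = strchr(p, '\n');
        if (p)
            p++;
    }
    if (!p)
        return -1;
    p += 7;

    /* Read whitespace-separated numbers until end of line or the cap. */
    while (n < max) {
        char *end = NULL;
        unsigned long v;

        while (*p == ' ' || *p == '\t')
            p++;
        if (*p == '\n' || *p == '\0')
            break;

        errno = 0;
        v = strtoul(p, &end, 10);
        if (end == p || errno != 0)
            return -1;
        gids[n++] = (gid_t)v;
        p = end;
    }
    return n;
}
```

The cap is passed in by the caller, so the same routine would serve a
32-entry array as well as a much larger one; groups beyond the cap are
silently ignored in this sketch.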

Change-Id: I271f3dc3e6d3c44d6d989c7a2073ea5f16c26ee0
fixes: #1075
Signed-off-by: Csaba Henk

FreeBSD patches for fuse mount utility

2020-08-19T04:23:45+00:00

Change-Id: Ib2bac85c28905bb8997fbb64db2308f2a6f31720
Fixes: #1376

fuse: change setlk interrupt strategy to 'sync'

2020-07-24T04:35:42+00:00

The setlk interrupt handler uses a 'fork' of the resolved
fuse state from setlk (a copy with some edits) to initiate
its own auxiliary fop. Thus the references stored in the
fuse states of the setlk fop and of its interrupt handler
are shared (apart from the ones edited by the interrupt
handler -- but the bulk of them remain as is). The lifetimes
of these references are tied to the setlk fop, which has
established them by properly claiming their backing
resources. To guarantee the validity of these references in
the interrupt context, we need to make sure that the setlk
fop did not reclaim the fuse state while the interrupt
handler is running.

In other words, the setlk fop needs to wait for the
termination of the interrupt handler, which is accomplished
by the 'sync' strategy of the interrupt API (passing
true for the 'sync' argument of
fuse_interrupt_finish_{fop,interrupt} functions).

Change-Id: I9a6dc76972507be4b7ba8d023cc876e5fddf813f
Updates: #1374
Signed-off-by: Csaba Henk

fuse: fix waiting for interrupt handler

2020-07-24T04:35:42+00:00

With 'sync' strategy, a fop's cbk waits for
the interrupt handler to finish by making a
call to fuse_interrupt_finish_fop() with
sync = true.

The wait is implemented by monitoring an
interrupt_state struct member via a condition
variable. However, due to broken code logic,
the pthread_cond_wait() call is never reached.

This change introduces a new member to the
fuse_interrupt_state_t enum (the type of
aforementioned struct member),
FUSE_INTERRUPT_WAITING_HANDLER, which is then
used for indicating the state of waiting for
the interrupt handler.
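
The waiting scheme can be illustrated with a generic condition-variable
sketch. It is not the fuse-bridge code: all demo_* names are
hypothetical, and DEMO_INTR_WAITING_HANDLER merely plays the role that
FUSE_INTERRUPT_WAITING_HANDLER is described as having above.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the interrupt record and its states. */
typedef enum {
    DEMO_INTR_NONE,            /* handler has not finished yet */
    DEMO_INTR_WAITING_HANDLER, /* fop side is blocked on the handler */
    DEMO_INTR_HANDLED,         /* handler has finished */
} demo_intr_state_t;

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    demo_intr_state_t state;
} demo_intr_record_t;

/* Fop side with 'sync' semantics: return only after the handler is done. */
static void
demo_finish_fop_sync(demo_intr_record_t *rec)
{
    assert(rec != NULL);
    pthread_mutex_lock(&rec->lock);
    /* Record explicitly that we are waiting, then wait on that state. */
    if (rec->state == DEMO_INTR_NONE)
        rec->state = DEMO_INTR_WAITING_HANDLER;
    while (rec->state == DEMO_INTR_WAITING_HANDLER)
        pthread_cond_wait(&rec->cond, &rec->lock);
    pthread_mutex_unlock(&rec->lock);
}

/* Handler side: mark completion and wake up a waiting fop, if any. */
static void
demo_finish_interrupt(demo_intr_record_t *rec)
{
    assert(rec != NULL);
    pthread_mutex_lock(&rec->lock);
    rec->state = DEMO_INTR_HANDLED;
    pthread_cond_broadcast(&rec->cond);
    pthread_mutex_unlock(&rec->lock);
}

/* Thread entry point that runs the handler side. */
static void *
demo_handler_thread(void *arg)
{
    demo_finish_interrupt(arg);
    return NULL;
}
```

Having a dedicated waiting state gives the loop around
pthread_cond_wait() a predicate that is true exactly while the fop has
to block, whichever side reaches the record first.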

Change-Id: I72ab06c37f45ff8f212a6a632bac1f647af05cbd
Updates: #1374
Signed-off-by: Csaba Henk

Make FUSE notification optional at configure time

2020-07-23T10:30:19+00:00

NetBSD FUSE does not implement FUSE notification yet. This change
makes this feature a configure time option so that it can be disabled.

Fixes: #1381
Change-Id: I3d977d8d69b57e1ac6957be84a9ddbb69b100893
Type: Bug
Signed-off-by: Emmanuel Dreyfus manu@netbsd.org

mount/fuse: use cookies to get fuse-interrupt-record instead of xdata

2020-06-18T06:12:07+00:00

Problem:
On executing tests/features/flock_interrupt.t the following error log
appears:
[2020-06-16 11:51:54.631072 +0000] E
[fuse-bridge.c:4791:fuse_setlk_interrupt_handler_cbk] 0-glusterfs-fuse:
interrupt record not found

This happens because fuse-interrupt-record is never sent on the wire by
the getxattr fop, so there is no guarantee that it will be available in
the cbk in case of failures.

Fix:
Wind the getxattr fop with fuse-interrupt-record as cookie and recover it
in the cbk.
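
The cookie idea in general terms, as a sketch only: the demo_* names
below are hypothetical and do not reproduce the STACK_WIND machinery or
the fuse-bridge structures. The point is that the record travels with
the call as an opaque pointer, so the callback gets it back whether or
not the operation succeeded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical interrupt record. */
typedef struct {
    int unique;  /* id of the interrupted request */
    int handled; /* set once the callback has seen the record */
} demo_record_t;

typedef void (*demo_cbk_t)(void *cookie, int op_ret);

/* Hypothetical "wind": hands the cookie back to the callback untouched,
 * together with the result of the operation. */
static void
demo_wind_cookie(demo_cbk_t cbk, void *cookie, int op_ret)
{
    cbk(cookie, op_ret);
}

/* Callback: the record is recovered from the cookie, not from reply data,
 * so it is available on the failure path too. */
static void
demo_handler_cbk(void *cookie, int op_ret)
{
    demo_record_t *rec = cookie;

    assert(rec != NULL);
    (void)op_ret; /* a failed op must not prevent finding the record */
    rec->handled = 1;
}
```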

Fixes: #1310
Change-Id: I4cfff154321a449114fc26e9440db0f08e5c7daa
Signed-off-by: Pranith Kumar K

Indicate timezone offsets in timestamps

2020-06-15T12:41:10+00:00

Logs and other output carrying timestamps
will now have timezone offsets indicated, e.g.:

[2020-03-12 07:01:05.584482 +0000] I [MSGID: 106143] [glusterd-pmap.c:388:pmap_registry_remove] 0-pmap: removing brick (null) on port 49153

To this end,

- gf_time_fmt() now inserts timezone offset via %z strftime(3) template.
- A new utility function has been added, gf_time_fmt_tv(), that
  takes a struct timeval pointer (*tv) instead of a time_t value to
  specify the time. If tv->tv_usec is negative,

  gf_time_fmt_tv(... tv ...)

  is equivalent to

  gf_time_fmt(... tv->tv_sec ...)

  Otherwise it also inserts tv->tv_usec to the formatted string.
- Building timestamps of usec precision has been converted to
  gf_time_fmt_tv, which is necessary because the method of appending
  a period and the usec value to the end of the timestamp does not work
  if the timestamp has zone offset, but it's also beneficial in terms of
  eliminating repetition.
- The buffer passed to gf_time_fmt/gf_time_fmt_tv has been unified to
  be of GF_TIMESTR_SIZE size (256). We need slightly larger buffer space
  to accommodate the zone offset and it's preferable to use a buffer
  which is undisputedly large enough.
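
To illustrate why the usec value has to be placed before the zone offset
instead of being appended at the end, here is a standalone sketch. It is
not gf_time_fmt_tv() itself: demo_time_fmt_tv() and DEMO_TIMESTR_SIZE are
hypothetical names, the format is hard-coded, and UTC is used
unconditionally.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for gmtime_r() under strict -std modes */
#endif
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

#define DEMO_TIMESTR_SIZE 256 /* hypothetical; generously sized buffer */

/* Format *tv as "YYYY-MM-DD HH:MM:SS[.uuuuuu] +zzzz" in UTC.
 * The usec part is left out when tv->tv_usec is negative.
 * Returns the length written, or 0 on failure. */
static size_t
demo_time_fmt_tv(char *dst, size_t len, const struct timeval *tv)
{
    struct tm tm;
    time_t sec;
    size_t pos, zlen;
    int n;

    assert(dst != NULL && tv != NULL && len > 0);
    dst[0] = '\0';

    sec = tv->tv_sec;
    if (gmtime_r(&sec, &tm) == NULL)
        return 0;

    pos = strftime(dst, len, "%Y-%m-%d %H:%M:%S", &tm);
    if (pos == 0)
        return 0;

    /* The fractional part goes right after the seconds... */
    if (tv->tv_usec >= 0) {
        n = snprintf(dst + pos, len - pos, ".%06ld", (long)tv->tv_usec);
        if (n < 0 || (size_t)n >= len - pos) {
            dst[0] = '\0';
            return 0;
        }
        pos += (size_t)n;
    }

    /* ...and only then the zone offset, via the %z template. */
    zlen = strftime(dst + pos, len - pos, " %z", &tm);
    if (zlen == 0) {
        dst[0] = '\0';
        return 0;
    }
    return pos + zlen;
}
```

With this layout the fractional seconds sit next to the seconds field,
which would not be the case if ".usec" were simply appended to a string
that already ends in the zone offset.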

This change does *not* do the following:

- Retaining a method of timestamp creation without timezone offset.
  To my understanding we don't need such backward compatibility
  as the code just emits timestamps to logs and other diagnostic
  texts, and doesn't do any later processing on them that would rely
  on their format. An exception to this, ie. a case where timestamp
  is built for internal use, is graph.c:fill_uuid(). As far as I can
  see, what matters in that case is the uniqueness of the produced
  string, not the format.
- Implementing a single-token (space free) timestamp format.
  While some timestamp formats used to be single-token, now all of
  them will include a space preceding the offset indicator. Again,
  I did not see a use case where this could be significant in terms
  of representation.
- Moving the codebase to a single unified timestamp format and
  dropping the fmt argument of gf_time_fmt/gf_time_fmt_tv.
  While the gf_timefmt_FT format is almost ubiquitous, there are
  a few cases where different formats are used. I'm not convinced
  there is any reason to not use gf_timefmt_FT in those cases too,
  but I did not want to make a decision in this regard.

Change-Id: I0af73ab5d490cca7ed8d07a2ce7ac22a6df2920a
Updates: #837
Signed-off-by: Csaba Henk

cluster/afr: Delay post-op for fsync

2020-06-08T13:49:12+00:00

Problem:
AFR doesn't delay post-op for the fsync fop. For fsync-heavy workloads
this leads to an unnecessary fxattrop/finodelk for every fsync, resulting
in bad performance.

Fix:
Have delayed post-op for fsync. Add a special flag in xdata to indicate
that afr shouldn't delay post-op in cases where either the
process will terminate or a graph-switch will happen. Otherwise it leads
to unnecessary heals when the graph-switch/process-termination
happens before the delayed post-op completes.

Fixes: #1253
Change-Id: I531940d13269a111c49e0510d49514dc169f4577
Signed-off-by: Pranith Kumar K