<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/xlators/mgmt/glusterd/src/glusterd-handler.c, branch v7.0rc2</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>glusterd/svc: update pid of mux volumes from the shd process</title>
<updated>2019-07-24T10:29:17+00:00</updated>
<author>
<name>Mohammed Rafi KC</name>
<email>rkavunga@redhat.com</email>
</author>
<published>2019-06-24T06:30:20+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=47fcbc4c055a7880d2926e918ae1e1f57c7db20d'/>
<id>47fcbc4c055a7880d2926e918ae1e1f57c7db20d</id>
<content type='text'>
For a normal volume, we update the pid from the
process itself when we daemonize, or at the end of
init if it is in no-daemon mode. Along with updating the pid
we also lock the file, to make sure that the process is
running fine.

With brick mux, we were updating the pidfile from glusterd
after an attach/detach request.

There are two problems with this approach.
1) We are not holding a pid lock on any file other than the parent
   process's.
2) There is a chance of race conditions with attach/detach.
   For example, an shd start and a volume stop could race. Let's say
   we are starting an shd and it is attached to a volume.
   While we are trying to link the pid file to the running process,
   the file could already have been deleted by the thread doing the
   volume stop.
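
A minimal sketch of the update-and-lock step described above
(hypothetical helper, not the actual glusterd code):

    #include &lt;fcntl.h&gt;
    #include &lt;stdio.h&gt;
    #include &lt;string.h&gt;
    #include &lt;unistd.h&gt;

    /* Write our pid into the pidfile and keep holding a POSIX lock on
     * it. The lock lives as long as the process does, so a reader that
     * manages to acquire the lock knows the recorded pid is stale. */
    static int
    pidfile_update_and_lock (const char *path)
    {
        char buf[32];
        int fd = open (path, O_CREAT | O_RDWR, 0644);

        if (fd == -1)
            return -1;
        if (lockf (fd, F_TLOCK, 0) == -1) {
            /* lock is held elsewhere: the daemon is already running */
            close (fd);
            return -1;
        }
        ftruncate (fd, 0);
        snprintf (buf, sizeof (buf), "%d\n", getpid ());
        write (fd, buf, strlen (buf));
        return fd; /* keep fd open; closing it would drop the lock */
    }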

Backport of : https://review.gluster.org/#/c/glusterfs/+/22935/
&gt;Change-Id: I29a00352102877ce09ea3f376ca52affceb5cf1a
&gt;Updates: bz#1722541
&gt;Signed-off-by: Mohammed Rafi KC &lt;rkavunga@redhat.com&gt;

Change-Id: I29a00352102877ce09ea3f376ca52affceb5cf1a
Updates: bz#1732668
Signed-off-by: Mohammed Rafi KC &lt;rkavunga@redhat.com&gt;
</content>
</entry>
<entry>
<title>glusterd: do not mark skip_locking as true for geo-rep operations</title>
<updated>2019-07-19T06:18:23+00:00</updated>
<author>
<name>Sanju Rakonde</name>
<email>srakonde@redhat.com</email>
</author>
<published>2019-07-12T10:58:04+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=c79e07f54e251ca2d4b2dabd5960b1ca6f1c3f37'/>
<id>c79e07f54e251ca2d4b2dabd5960b1ca6f1c3f37</id>
<content type='text'>
We need to send the commit req to peers in the case of geo-rep
operations even though they are no-volname operations. In the commit
phase, peers try to set the txn_opinfo, which will fail because
a no-volname operation normally does not require a commit
phase. We mark skip_locking as true for no-volname operations,
but we have to make an exception for geo-rep operations, so that
they can set txn_opinfo in the commit phase.

Please refer to the detailed RCA in the bug: 1730543
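
A hedged sketch of the exception described above (the condition helper
is illustrative, not the exact glusterd code):

    if (is_no_volname_op (op)) {
        /* geo-rep (gsync) operations are the exception: peers still
         * need to set txn_opinfo in the commit phase */
        if (op != GD_OP_GSYNC_SET)
            txn_op_info.skip_locking = _gf_true;
    }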

fixes: bz#1730543

Change-Id: I9f2478b12a281f6e052035c0563c40543493a3fc
Signed-off-by: Sanju Rakonde &lt;srakonde@redhat.com&gt;
(cherry picked from commit b917974ee922d7a2e079692ad7d6f61f900b37b2)
</content>
</entry>
<entry>
<title>glusterd: Show the correct brick status in get-state</title>
<updated>2019-07-15T13:19:32+00:00</updated>
<author>
<name>Mohit Agrawal</name>
<email>moagrawal@redhat.com</email>
</author>
<published>2019-07-03T09:52:38+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=4d84f0afee639c2a9e5e3c83d3ae9d5b8692997c'/>
<id>4d84f0afee639c2a9e5e3c83d3ae9d5b8692997c</id>
<content type='text'>
Problem: get-state does not show the correct brick status if the brick
         status is not Started; it always shows Started if any value
         is set in brickinfo-&gt;status

Solution: Check the value of brickinfo-&gt;status to show the correct
          status in get-state
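
A minimal sketch of the check (illustrative; the real get-state output
code differs):

    /* report the stored status instead of printing Started for any
     * non-zero brickinfo-&gt;status value */
    if (brickinfo-&gt;status == GF_BRICK_STARTED)
        fprintf (fp, "Status: Started\n");
    else
        fprintf (fp, "Status: Stopped\n");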

Change-Id: I12a79619024c2cf59f338220d144f2f034059b3b
fixes: bz#1726905
Signed-off-by: Mohit Agrawal &lt;moagrawal@redhat.com&gt;
(cherry picked from commit af989db23d1db00e087f2b9d3dfc43b13ef17153)
</content>
</entry>
<entry>
<title>glusterd/thin-arbiter: Thin-arbiter integration with GD1</title>
<updated>2019-07-04T07:42:11+00:00</updated>
<author>
<name>Vishal Pandey</name>
<email>vpandey@redhat.com</email>
</author>
<published>2019-04-24T08:07:16+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=08c87ae4208b73f4f183f7b54ebcb373e8bc0ede'/>
<id>08c87ae4208b73f4f183f7b54ebcb373e8bc0ede</id>
<content type='text'>
gluster volume create &lt;VOLNAME&gt; replica 2 thin-arbiter 1 &lt;host1&gt;:&lt;brick1&gt; &lt;host2&gt;:&lt;brick2&gt;
&lt;thin-arbiter-host&gt;:&lt;path-to-store-replica-id-file&gt; [force]

The changes have been made in such a way that the last brick in the bricks
list will be treated as the thin-arbiter.
GD1 will be manipulated to treat the replica count as 2 and continue creating
the volume like any other replica 2 volume, but since thin-arbiter volumes
need ta-brick client xlator entries for each subvolume in the fuse volfile,
volfile generation is modified to inject these entries separately into the
volfile for every subvolume.

A few more additions:
1- Save the volinfo with the new fields ta_bricks list and thin_arbiter_count.
2- Introduce a new option client.ta-brick-port to add remote-port to the
   ta-brick xlator entry in fuse volfiles. The option can be set using the
   following CLI syntax -
   gluster volume set &lt;VOLNAME&gt; client.ta-brick-port &lt;PORTNO.&gt;
3- Volume Info will contain a Thin-Arbiter-path entry to distinguish
   it from other replicate volumes.
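
A hedged sketch of the kind of ta-brick client entry injected per
subvolume into the fuse volfile (the name and option set are
illustrative):

    volume &lt;VOLNAME&gt;-ta-2
        type protocol/client
        option remote-host &lt;thin-arbiter-host&gt;
        option remote-subvolume &lt;path-to-store-replica-id-file&gt;
        option remote-port &lt;PORTNO.&gt;
    end-volume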

Change-Id: Ib434e2313b29716f32476c6c211d282c4ef39406
Updates #687
Signed-off-by: Vishal Pandey &lt;vpandey@redhat.com&gt;
(cherry picked from commit 9b223b15ab69fce4076de036ee162f36a058bcd2)
</content>
</entry>
<entry>
<title>glusterd-volgen.c: remove BD xlator from the graph</title>
<updated>2019-06-18T12:09:09+00:00</updated>
<author>
<name>Yaniv Kaul</name>
<email>ykaul@redhat.com</email>
</author>
<published>2019-05-26T08:18:05+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=2d278f0407ab7d29507dc697653b39d72ddee472'/>
<id>2d278f0407ab7d29507dc697653b39d72ddee472</id>
<content type='text'>
The BD xlator was removed some time ago. Remove it from the graph.
Also remove the caps settings, which only the BD xlator was using,
and the document describing the translator.

Change-Id: Id0adcb2952f4832a5dc6301e726874522e07935d
updates: bz#1193929
Signed-off-by: Yaniv Kaul &lt;ykaul@redhat.com&gt;
</content>
</entry>
<entry>
<title>glusterd/tier: remove tier related code from glusterd</title>
<updated>2019-05-27T07:50:24+00:00</updated>
<author>
<name>Hari Gowtham</name>
<email>hgowtham@redhat.com</email>
</author>
<published>2019-05-02T13:03:34+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=e1cc4275583dfd8ae8d0433587f39854c1851794'/>
<id>e1cc4275583dfd8ae8d0433587f39854c1851794</id>
<content type='text'>
The handler functions are pointed to dummy functions.
The switch-case handling for tier has also been moved to the
default case, to avoid issues if tier is ever reintroduced.

The tier changes in DHT remain as they are.
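
A hedged sketch of the dummy-handler pattern (the name is illustrative,
not the exact glusterd code):

    /* tier is no longer supported; acknowledge the request and do
     * nothing */
    static int
    glusterd_handle_tier_dummy (rpcsvc_request_t *req)
    {
        return 0;
    }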

updates: bz#1693692

Change-Id: I80d80c9a3eb862b4440a36b31ae82b2e9d92e4dc
Signed-off-by: Hari Gowtham &lt;hgowtham@redhat.com&gt;
</content>
</entry>
<entry>
<title>glusterd: Fix coverity defects &amp; put coverity annotations</title>
<updated>2019-05-02T08:06:06+00:00</updated>
<author>
<name>Atin Mukherjee</name>
<email>amukherj@redhat.com</email>
</author>
<published>2019-04-25T06:30:52+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=91c19c50a3c7b336177dda186500568be43626b8'/>
<id>91c19c50a3c7b336177dda186500568be43626b8</id>
<content type='text'>
Along with fixing a few defects, put the required annotations in place
for the defects which are marked ignore/false positive/intentional as
per the coverity defect sheet. This should stop the per-component graph
from showing many defects as open on the coverity glusterfs web page.
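
A hedged example of the annotation style (the event tag and the call
below it are illustrative):

    /* coverity[CHECKED_RETURN] - return value intentionally ignored */
    (void) sys_unlink (tmp_path);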

Updates: bz#789278
Change-Id: I19461dc3603a3bd8f88866a1ab3db43d783af8e4
Signed-off-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
</content>
</entry>
<entry>
<title>glusterd: coverity fixes</title>
<updated>2019-04-26T13:15:36+00:00</updated>
<author>
<name>Atin Mukherjee</name>
<email>amukherj@redhat.com</email>
</author>
<published>2019-04-26T03:17:12+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=13e1fe6dbd39ef88cadce1590b5d4c45b67387cb'/>
<id>13e1fe6dbd39ef88cadce1590b5d4c45b67387cb</id>
<content type='text'>
1400775 - USE_AFTER_FREE
1400742 - Missing Unlock
1400736 - CHECKED_RETURN
1398470 - Missing Unlock

Missing Unlock is the tricky one; we had an annotation added, but
coverity still continued to complain. Added pthread_mutex_unlock to
release the lock before destroying it, to see if it makes coverity
happy.
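
A minimal sketch of the unlock-before-destroy pattern (the mutex shown
is illustrative):

    /* destroying a locked mutex is undefined behaviour and trips
     * coverity's LOCK checker; release it first */
    pthread_mutex_unlock (&amp;conf-&gt;mutex);
    pthread_mutex_destroy (&amp;conf-&gt;mutex);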

Updates: bz#789278
Change-Id: I1d892612a17f805144d96c1b15004a85a1639414
Signed-off-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
</content>
</entry>
<entry>
<title>glusterd: coverity fixes</title>
<updated>2019-04-25T06:39:16+00:00</updated>
<author>
<name>Atin Mukherjee</name>
<email>amukherj@redhat.com</email>
</author>
<published>2019-04-24T16:32:51+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=c827682e4df44ec6aaae3780c325568fb43053ff'/>
<id>c827682e4df44ec6aaae3780c325568fb43053ff</id>
<content type='text'>
Addresses the following:

* CID 1124776:  Resource leaks  (RESOURCE_LEAK) - Variable "aa" going out
of scope leaks the storage it points to in glusterd-volgen.c

* A bunch of CHECKED_RETURN defects in the callers of synctask_barrier_init

* CID 1400755:  Error handling issues  (CHECKED_RETURN) - Calling
"gf_is_service_running" without checking return value in
xlators/mgmt/glusterd/src/glusterd-shd-svc.c: 671 in
glusterd_shdsvc_stop()

* CID 1400745:  Memory - illegal accesses  (USE_AFTER_FREE) - Dereferencing
freed pointer "volinfo" in /xlators/mgmt/glusterd/src/glusterd-shd-svc.c: 460 in glusterd_shdsvc_start()

* CID 1400742:  Program hangs  (LOCK) - adding annotation to fix this
false positive

Updates: bz#789278
Change-Id: I02f16e7eeb8c5cf72f7d0b29d00df4f03b3718b3
Signed-off-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
</content>
</entry>
<entry>
<title>glusterd: provide a way to detach failed node</title>
<updated>2019-04-11T15:15:37+00:00</updated>
<author>
<name>Sanju Rakonde</name>
<email>srakonde@redhat.com</email>
</author>
<published>2019-04-09T08:26:24+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=35dd17e222fe172866e15824e5ac9d242382178c'/>
<id>35dd17e222fe172866e15824e5ac9d242382178c</id>
<content type='text'>
When a gluster node in a trusted storage pool has failed
due to hardware issues, the volume delete operation fails
saying "Not all peers are up" and peer detach for the failed
node fails saying "Brick(s) with peer &lt;peer_ip&gt; exists
in cluster".

The idea here is to use either the replace-brick or the remove-brick
command to remove all the bricks hosted by the failed node and
then re-attempt the peer detach. This change adds this
hint to the peer detach error message.
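
For example (hypothetical volume, brick, and host names):

    # drop the failed node's brick, reducing replica 3 to replica 2,
    # then retry the detach
    gluster volume remove-brick &lt;VOLNAME&gt; replica 2 &lt;failed-host&gt;:&lt;brick&gt; force
    gluster peer detach &lt;failed-host&gt;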

fixes: bz#1697866

Change-Id: I0c58887479d31db603ad8d6535ea9d547880ccc8
Signed-off-by: Sanju Rakonde &lt;srakonde@redhat.com&gt;
</content>
</entry>
</feed>
