<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/tests/bugs, branch v3.13.0beta</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>glusterd: delete source brick only once in reset-brick commit force</title>
<updated>2017-10-31T11:14:18+00:00</updated>
<author>
<name>Atin Mukherjee</name>
<email>amukherj@redhat.com</email>
</author>
<published>2017-10-30T10:25:32+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=0fb8acaa6ff80c43e46deac0ce66b29ae0df0ca4'/>
<id>0fb8acaa6ff80c43e46deac0ce66b29ae0df0ca4</id>
<content type='text'>
While stopping the brick which is to be reset and replaced, the
delete_brick flag was passed as true, which resulted in glusterd freeing
up the source brick before the actual operation. This causes commit
force to fail, as it cannot find the source brickinfo.
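
As an illustrative sketch only (VOLNAME and HOST:/path/to/brick are
placeholders), the reset-brick sequence that exercises this path is:

    gluster volume reset-brick VOLNAME HOST:/path/to/brick start
    gluster volume reset-brick VOLNAME HOST:/path/to/brick \
        HOST:/path/to/brick commit force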

Change-Id: I1aa7508eff7cc9c9b5d6f5163f3bb92736d6df44
BUG: 1507466
Signed-off-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
While stopping the brick which is to be reset and replaced, the
delete_brick flag was passed as true, which resulted in glusterd freeing
up the source brick before the actual operation. This causes commit
force to fail, as it cannot find the source brickinfo.
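
As an illustrative sketch only (VOLNAME and HOST:/path/to/brick are
placeholders), the reset-brick sequence that exercises this path is:

    gluster volume reset-brick VOLNAME HOST:/path/to/brick start
    gluster volume reset-brick VOLNAME HOST:/path/to/brick \
        HOST:/path/to/brick commit force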

Change-Id: I1aa7508eff7cc9c9b5d6f5163f3bb92736d6df44
BUG: 1507466
Signed-off-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tests: Update tier CLI in .t files</title>
<updated>2017-10-30T15:50:44+00:00</updated>
<author>
<name>N Balachandran</name>
<email>nbalacha@redhat.com</email>
</author>
<published>2017-10-23T07:12:32+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=62dbefde4bc5c8d9dcfb10f3be5d349db341bb30'/>
<id>62dbefde4bc5c8d9dcfb10f3be5d349db341bb30</id>
<content type='text'>
Update .t tier tests to use the new tier CLI.
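
As a rough illustration (VOLNAME and the brick path are placeholders,
and the exact option forms may vary per test), the tests move from the
old syntax

    gluster volume attach-tier VOLNAME HOST:/bricks/hot1
    gluster volume detach-tier VOLNAME start

to the newer tier command:

    gluster volume tier VOLNAME attach HOST:/bricks/hot1
    gluster volume tier VOLNAME detach start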

Change-Id: I0e7f1769071108d8266fc86378c4466bcaf96e7d
BUG: 1505253
Signed-off-by: N Balachandran &lt;nbalacha@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Update .t tier tests to use the new tier CLI.
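
As a rough illustration (VOLNAME and the brick path are placeholders,
and the exact option forms may vary per test), the tests move from the
old syntax

    gluster volume attach-tier VOLNAME HOST:/bricks/hot1
    gluster volume detach-tier VOLNAME start

to the newer tier command:

    gluster volume tier VOLNAME attach HOST:/bricks/hot1
    gluster volume tier VOLNAME detach start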

Change-Id: I0e7f1769071108d8266fc86378c4466bcaf96e7d
BUG: 1505253
Signed-off-by: N Balachandran &lt;nbalacha@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Fail open on split-brain</title>
<updated>2017-10-26T18:23:35+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2017-09-04T11:27:25+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=786343abca3474ff01aa1017210112d97cbc4843'/>
<id>786343abca3474ff01aa1017210112d97cbc4843</id>
<content type='text'>
Problem:
Appending to a file in split-brain succeeds. The open is intercepted by
open-behind; when a write comes on the file, open-behind performs
open+write. The open succeeds because afr does not fail it, and the
write then succeeds because write-behind intercepts it. The flush is
also intercepted by write-behind, so the application never gets to know
that the write failed.

Fix:
Fail open on split-brain, so that when open-behind performs open+write
the open fails, which leads to write failure. The application will then
know about this failure.
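
For illustration only (the mount path is a placeholder), an append to a
split-brained file is now expected to surface the failure to the caller,
typically as an I/O error:

    echo data &gt;&gt; /mnt/glusterfs/file
    # bash: /mnt/glusterfs/file: Input/output error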

Change-Id: I4bff1c747c97bb2925d6987f4ced5f1ce75dbc15
BUG: 1294051
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
Appending to a file in split-brain succeeds. The open is intercepted by
open-behind; when a write comes on the file, open-behind performs
open+write. The open succeeds because afr does not fail it, and the
write then succeeds because write-behind intercepts it. The flush is
also intercepted by write-behind, so the application never gets to know
that the write failed.

Fix:
Fail open on split-brain, so that when open-behind performs open+write
the open fails, which leads to write failure. The application will then
know about this failure.
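
For illustration only (the mount path is a placeholder), an append to a
split-brained file is now expected to surface the failure to the caller,
typically as an I/O error:

    echo data &gt;&gt; /mnt/glusterfs/file
    # bash: /mnt/glusterfs/file: Input/output error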

Change-Id: I4bff1c747c97bb2925d6987f4ced5f1ce75dbc15
BUG: 1294051
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>snapshot: Issue with other processes accessing the mounted brick</title>
<updated>2017-10-23T10:05:02+00:00</updated>
<author>
<name>Sunny Kumar</name>
<email>sunkumar@redhat.com</email>
</author>
<published>2017-08-16T08:34:45+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=2b3b3edee2d849b4aee314048987dc995d9679a1'/>
<id>2b3b3edee2d849b4aee314048987dc995d9679a1</id>
<content type='text'>
Added code to unmount the activated snapshot brick during the snapshot
deactivation process, since the mount point for deactivated bricks
should not exist.

Removed code for mounting a newly created snapshot, as newly created
snapshots should not be mounted until they are activated.

Added code for mount point creation and snapshot mount during snapshot
activation.

Added validation during glusterd init so that only those snapshots
whose status is either STARTED or RESTORED are mounted.

During snapshot restore, the mount point for a stopped snap should
exist, as it is required to set extended attributes.

During handshake, after getting updates from a friend, the mount point
should exist for activated snapshots and should not exist for
deactivated snapshots.

While getting snap status we should show relevant information for
deactivated snapshots; after this patch the 'gluster snap status'
command will show output like:

Snap Name : snap1
Snap UUID : snap-uuid

	Brick Path        :   server1:/run/gluster/snaps/snap-vol-name/brick
	Volume Group      :   N/A (Deactivated Snapshot)
	Brick Running     :   No
	Brick PID         :   N/A
	Data Percentage   :   N/A
	LV Size           :   N/A
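
For context, an illustrative snapshot lifecycle to which this behaviour
applies (snap1 and VOLNAME are placeholders):

    gluster snapshot create snap1 VOLNAME no-timestamp
    gluster snapshot activate snap1
    gluster snapshot deactivate snap1
    gluster snapshot status snap1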

Fixes: #276

Change-Id: I65783488e35fac43632615ce1b8ff7b8e84834dc
BUG: 1482023
Signed-off-by: Sunny Kumar &lt;sunkumar@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Added code to unmount the activated snapshot brick during the snapshot
deactivation process, since the mount point for deactivated bricks
should not exist.

Removed code for mounting a newly created snapshot, as newly created
snapshots should not be mounted until they are activated.

Added code for mount point creation and snapshot mount during snapshot
activation.

Added validation during glusterd init so that only those snapshots
whose status is either STARTED or RESTORED are mounted.

During snapshot restore, the mount point for a stopped snap should
exist, as it is required to set extended attributes.

During handshake, after getting updates from a friend, the mount point
should exist for activated snapshots and should not exist for
deactivated snapshots.

While getting snap status we should show relevant information for
deactivated snapshots; after this patch the 'gluster snap status'
command will show output like:

Snap Name : snap1
Snap UUID : snap-uuid

	Brick Path        :   server1:/run/gluster/snaps/snap-vol-name/brick
	Volume Group      :   N/A (Deactivated Snapshot)
	Brick Running     :   No
	Brick PID         :   N/A
	Data Percentage   :   N/A
	LV Size           :   N/A
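
For context, an illustrative snapshot lifecycle to which this behaviour
applies (snap1 and VOLNAME are placeholders):

    gluster snapshot create snap1 VOLNAME no-timestamp
    gluster snapshot activate snap1
    gluster snapshot deactivate snap1
    gluster snapshot status snap1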

Fixes: #276

Change-Id: I65783488e35fac43632615ce1b8ff7b8e84834dc
BUG: 1482023
Signed-off-by: Sunny Kumar &lt;sunkumar@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>glusterd: Mark all brick statuses as stopped when a process goes down in brick multiplexing</title>
<updated>2017-10-11T19:36:26+00:00</updated>
<author>
<name>Sanju Rakonde</name>
<email>srakonde@redhat.com</email>
</author>
<published>2017-10-06T22:03:40+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=9422446d72bc054962d72ace9912ecb885946d49'/>
<id>9422446d72bc054962d72ace9912ecb885946d49</id>
<content type='text'>
In a brick multiplexing environment, if a brick process goes down
(e.g., if it is killed with SIGKILL), only the status of the brick for
which the process originally came up changes to stopped; all other
brick statuses remain started. This happens because the process was
killed abruptly with SIGKILL, so the signal handler was not invoked and
no further cleanup was triggered.

When we then try to start the volume with force, it fails with
"Request timed out": since all the brickinfo-&gt;status values are still
in the started state, glusterd keeps waiting for one of the brick
processes to come up, which is never going to happen because the brick
process was killed.

To resolve this, in the disconnect event we check all the processes to
find which process the disconnected brick belongs to. Once we have the
process, we call glusterd_mark_bricks_stopped_by_proc(), passing the
brick_proc_t object as an argument.

From the glusterd_brick_proc_t we can get all the bricks attached to
that process, but these are duplicate copies. To get the original
brickinfo we read the volinfo from each brick; the volinfo holds the
original brickinfo copies. We then change brickinfo-&gt;status to stopped
for all of those bricks.
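
A rough reproduction sketch (VOLNAME and the brick process PID are
placeholders):

    gluster volume set all cluster.brick-multiplex on
    gluster volume start VOLNAME
    kill -9 PID_OF_A_MULTIPLEXED_BRICK_PROCESS
    gluster volume start VOLNAME force    # previously timed out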

Change-Id: Ifb9054b3ee081ef56b39b2903ae686984fe827e7
BUG: 1499509
Signed-off-by: Sanju Rakonde &lt;srakonde@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In a brick multiplexing environment, if a brick process goes down
(e.g., if it is killed with SIGKILL), only the status of the brick for
which the process originally came up changes to stopped; all other
brick statuses remain started. This happens because the process was
killed abruptly with SIGKILL, so the signal handler was not invoked and
no further cleanup was triggered.

When we then try to start the volume with force, it fails with
"Request timed out": since all the brickinfo-&gt;status values are still
in the started state, glusterd keeps waiting for one of the brick
processes to come up, which is never going to happen because the brick
process was killed.

To resolve this, in the disconnect event we check all the processes to
find which process the disconnected brick belongs to. Once we have the
process, we call glusterd_mark_bricks_stopped_by_proc(), passing the
brick_proc_t object as an argument.

From the glusterd_brick_proc_t we can get all the bricks attached to
that process, but these are duplicate copies. To get the original
brickinfo we read the volinfo from each brick; the volinfo holds the
original brickinfo copies. We then change brickinfo-&gt;status to stopped
for all of those bricks.
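
A rough reproduction sketch (VOLNAME and the brick process PID are
placeholders):

    gluster volume set all cluster.brick-multiplex on
    gluster volume start VOLNAME
    kill -9 PID_OF_A_MULTIPLEXED_BRICK_PROCESS
    gluster volume start VOLNAME force    # previously timed out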

Change-Id: Ifb9054b3ee081ef56b39b2903ae686984fe827e7
BUG: 1499509
Signed-off-by: Sanju Rakonde &lt;srakonde@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>test: Mark test case ./tests/bugs/bug-1371806_1.t as a bad test case</title>
<updated>2017-10-09T18:12:02+00:00</updated>
<author>
<name>Mohit Agrawal</name>
<email>moagrawa@redhat.com</email>
</author>
<published>2017-10-09T08:40:52+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=d561282633801bf3254c057188b7f18c7c6fe7c6'/>
<id>d561282633801bf3254c057188b7f18c7c6fe7c6</id>
<content type='text'>
Problem: Test case ./tests/bugs/bug-1371806_1.t is failing.

Solution: Mark test case ./tests/bugs/bug-1371806_1.t as a bad test case.

BUG: 1499663
Change-Id: Icb3f41d23dcc74cce6fde05ca343c158d5f58cdd
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem: Test case ./tests/bugs/bug-1371806_1.t is failing.

Solution: Mark test case ./tests/bugs/bug-1371806_1.t as a bad test case.

BUG: 1499663
Change-Id: Icb3f41d23dcc74cce6fde05ca343c158d5f58cdd
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>glusterd: fix client io-threads option for replicate volumes</title>
<updated>2017-10-09T06:49:16+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2017-10-03T13:11:11+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=452b9124f452d6c73f72da577a98f17502b1ed2c'/>
<id>452b9124f452d6c73f72da577a98f17502b1ed2c</id>
<content type='text'>
Problem:
Commit ff075a3d6f9b142911d25c27fd209838782bfff0 disabled loading
client-io-threads for replicate volumes (it was set to on by default in
commit e068c1997314046658dd502e9118dab32decf879) due to performance
issues, but in doing so it inadvertently failed to load the xlator even
if the user explicitly enabled the option using the volume set command.
This was despite the volume set returning success.

Fix:
Modify the check in perfxl_option_handler() and add checks in volume
create/add-brick/remove-brick code paths, tying it all to
GD_OP_VERSION_3_12_2.
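
For illustration (VOLNAME is a placeholder), the explicit enablement
that this change now honours is:

    gluster volume set VOLNAME performance.client-io-threads on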

Change-Id: Ib612973a999a7da818cc926f5c2601b1f0794fcf
BUG: 1498570
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
Commit ff075a3d6f9b142911d25c27fd209838782bfff0 disabled loading
client-io-threads for replicate volumes (it was set to on by default in
commit e068c1997314046658dd502e9118dab32decf879) due to performance
issues, but in doing so it inadvertently failed to load the xlator even
if the user explicitly enabled the option using the volume set command.
This was despite the volume set returning success.

Fix:
Modify the check in perfxl_option_handler() and add checks in volume
create/add-brick/remove-brick code paths, tying it all to
GD_OP_VERSION_3_12_2.
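
For illustration (VOLNAME is a placeholder), the explicit enablement
that this change now honours is:

    gluster volume set VOLNAME performance.client-io-threads on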

Change-Id: Ib612973a999a7da818cc926f5c2601b1f0794fcf
BUG: 1498570
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr: heal gfid as a part of entry heal</title>
<updated>2017-10-09T06:23:08+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2017-09-20T06:46:06+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=20fa80057eb430fd72b4fa31b9b65598b8ec1265'/>
<id>20fa80057eb430fd72b4fa31b9b65598b8ec1265</id>
<content type='text'>
Problem:
If a brick crashes after an entry (file or dir) is created but before
gfid is assigned, the good bricks will have pending entry heal xattrs
but the heal won't complete because afr_selfheal_recreate_entry() tries
to create the entry again and it fails with EEXIST.

Fix:
We could have fixed posix_mknod/mkdir etc. to assign the gfid if the file
already exists, but the right thing to do seems to be to trigger a lookup
on the bad brick and let it heal the gfid, instead of winding an
mknod/mkdir in the first place.

Change-Id: I82f76665a7541f1893ef8d847b78af6466aff1ff
BUG: 1493415
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
If a brick crashes after an entry (file or dir) is created but before
gfid is assigned, the good bricks will have pending entry heal xattrs
but the heal won't complete because afr_selfheal_recreate_entry() tries
to create the entry again and it fails with EEXIST.

Fix:
We could have fixed posix_mknod/mkdir etc. to assign the gfid if the file
already exists, but the right thing to do seems to be to trigger a lookup
on the bad brick and let it heal the gfid, instead of winding an
mknod/mkdir in the first place.

Change-Id: I82f76665a7541f1893ef8d847b78af6466aff1ff
BUG: 1493415
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/dht: User xattrs are not healed after brick stop/start</title>
<updated>2017-10-04T09:55:35+00:00</updated>
<author>
<name>Mohit Agrawal</name>
<email>moagrawa@redhat.com</email>
</author>
<published>2017-05-12T15:42:47+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=9b4de61a136b8e5ba7bf0e48690cdb1292d0dee8'/>
<id>9b4de61a136b8e5ba7bf0e48690cdb1292d0dee8</id>
<content type='text'>
Problem: In a distributed volume, the custom extended attribute value for a
         directory does not show the correct value after a brick is
         stopped/started or newly added. If any extended attribute
         (user|acl|quota) is set on a directory while a brick is stopped or
         after it has been added, the value is not updated on that brick
         once it is started.

Solution: First store the hashed subvol (or the subvol that has the internal
          xattr) in the inode ctx and consider it the MDS subvol. When
          updating a custom xattr (user, quota, acl, selinux) on a directory,
          first check the MDS from the inode ctx; if the MDS is not present
          in the inode ctx, return EINVAL to the application, otherwise set
          the xattr on the MDS subvol with an internal xattr value of -1 and
          then try to update the attribute on the other non-MDS subvols as
          well. If the MDS subvol is down, return the error
          "Transport endpoint is not connected". In dht_dir_lookup_cbk|
          dht_revalidate_cbk|dht_discover_complete, call
          dht_call_dir_xattr_heal to heal the custom extended attributes.
          In the case of the gnfs server, if the hashed subvol cannot be
          found based on the loc, wind a call on all subvols to update the
          xattr.

Fix:    1) Save the MDS subvol in the inode ctx
        2) Check whether the MDS subvol is present in the inode ctx
        3) If the MDS subvol is down, unwind with error ENOTCONN; if it is up,
           set the new xattr "GF_DHT_XATTR_MDS" to -1 and wind a call on the
           other subvols.
        4) If the setxattr fop succeeds on a non-MDS subvol, increment the
           internal xattr value by 1
        5) At the time of directory lookup, check the value of the new xattr
           GF_DHT_XATTR_MDS
        6) If the value is not 0 in dht_lookup_dir_cbk (or the other cbk)
           functions, call the heal function to heal the user xattrs
        7) Call syncop_setxattr on hashed_subvol to reset the xattr value to 0
           if the heal is successful on all subvols.

Test: To reproduce the issue, follow the steps below
       1) Create a distributed volume and create a mount point
       2) Create some directories from the mount point: mkdir tmp{1..5}
       3) Kill any one brick of the volume
       4) Set an extended attribute on the directories from the mount point:
          setfattr -n user.foo -v "abc" ./tmp{1..5}
          It will throw the error "Transport endpoint is not connected"
          for those directories whose hashed subvol is down
       5) Start the volume with the force option to start the brick process
       6) Execute the getfattr command on the mount point for the directories
       7) Check the extended attribute on the brick:
          getfattr -n user.foo &lt;volume-location&gt;/tmp{1..5}
          It shows the correct value for those directories on which the
          xattr fop was executed successfully.

Note: The patch resolves the xattr healing problem only for fuse mounts,
      not for nfs mounts.

BUG: 1371806
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;

Change-Id: I4eb137eace24a8cb796712b742f1d177a65343d5
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem: In a distributed volume, the custom extended attribute value for a
         directory does not show the correct value after a brick is
         stopped/started or newly added. If any extended attribute
         (user|acl|quota) is set on a directory while a brick is stopped or
         after it has been added, the value is not updated on that brick
         once it is started.

Solution: First store the hashed subvol (or the subvol that has the internal
          xattr) in the inode ctx and consider it the MDS subvol. When
          updating a custom xattr (user, quota, acl, selinux) on a directory,
          first check the MDS from the inode ctx; if the MDS is not present
          in the inode ctx, return EINVAL to the application, otherwise set
          the xattr on the MDS subvol with an internal xattr value of -1 and
          then try to update the attribute on the other non-MDS subvols as
          well. If the MDS subvol is down, return the error
          "Transport endpoint is not connected". In dht_dir_lookup_cbk|
          dht_revalidate_cbk|dht_discover_complete, call
          dht_call_dir_xattr_heal to heal the custom extended attributes.
          In the case of the gnfs server, if the hashed subvol cannot be
          found based on the loc, wind a call on all subvols to update the
          xattr.

Fix:    1) Save the MDS subvol in the inode ctx
        2) Check whether the MDS subvol is present in the inode ctx
        3) If the MDS subvol is down, unwind with error ENOTCONN; if it is up,
           set the new xattr "GF_DHT_XATTR_MDS" to -1 and wind a call on the
           other subvols.
        4) If the setxattr fop succeeds on a non-MDS subvol, increment the
           internal xattr value by 1
        5) At the time of directory lookup, check the value of the new xattr
           GF_DHT_XATTR_MDS
        6) If the value is not 0 in dht_lookup_dir_cbk (or the other cbk)
           functions, call the heal function to heal the user xattrs
        7) Call syncop_setxattr on hashed_subvol to reset the xattr value to 0
           if the heal is successful on all subvols.

Test: To reproduce the issue, follow the steps below
       1) Create a distributed volume and create a mount point
       2) Create some directories from the mount point: mkdir tmp{1..5}
       3) Kill any one brick of the volume
       4) Set an extended attribute on the directories from the mount point:
          setfattr -n user.foo -v "abc" ./tmp{1..5}
          It will throw the error "Transport endpoint is not connected"
          for those directories whose hashed subvol is down
       5) Start the volume with the force option to start the brick process
       6) Execute the getfattr command on the mount point for the directories
       7) Check the extended attribute on the brick:
          getfattr -n user.foo &lt;volume-location&gt;/tmp{1..5}
          It shows the correct value for those directories on which the
          xattr fop was executed successfully.

Note: The patch resolves the xattr healing problem only for fuse mounts,
      not for nfs mounts.

BUG: 1371806
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;

Change-Id: I4eb137eace24a8cb796712b742f1d177a65343d5
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Make choose-local "reconfigurable"</title>
<updated>2017-09-30T02:15:21+00:00</updated>
<author>
<name>Krutika Dhananjay</name>
<email>kdhananj@redhat.com</email>
</author>
<published>2017-08-11T10:13:58+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=1e2d6537875d16b783e3c50ada7ee61487c6d796'/>
<id>1e2d6537875d16b783e3c50ada7ee61487c6d796</id>
<content type='text'>
With this change, enabling choose-local (i.e., transitioning its state
from "off" to "on") takes effect after the first gfid-lookup on "/"
following execution of the volume-set command.
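
A minimal illustration (VOLNAME and the mount path are placeholders;
any operation that triggers a fresh lookup on the root would do):

    gluster volume set VOLNAME cluster.choose-local on
    stat /mnt/glusterfs    # fresh lookup on "/" after which the new value applies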

Change-Id: Ibab292ba705d993b475cd0303fb3318211fb2500
BUG: 1480525
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
With this change, enabling choose-local (i.e., transitioning its state
from "off" to "on") takes effect after the first gfid-lookup on "/"
following execution of the volume-set command.
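
A minimal illustration (VOLNAME and the mount path are placeholders;
any operation that triggers a fresh lookup on the root would do):

    gluster volume set VOLNAME cluster.choose-local on
    stat /mnt/glusterfs    # fresh lookup on "/" after which the new value applies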

Change-Id: Ibab292ba705d993b475cd0303fb3318211fb2500
BUG: 1480525
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
