Commit message | Author | Age | Files | Lines
...
| * geo-rep: Handle directory sync failure as hard error | Kotresh HR | 2017-04-19 | 2 | -6/+20
| |
| | If directory creation fails, return immediately instead of processing further. Continuing would cause syncing of the entire directory tree to the slave to fail. Hence the master now logs the failure and raises an exception when the failure is a directory failure. Earlier, the master logged the failure and proceeded.
| |
| | > BUG: 1411607
| | > Signed-off-by: Kotresh HR <khiremat@redhat.com>
| | > Reviewed-on: http://review.gluster.org/16364
| | > Smoke: Gluster Build System <jenkins@build.gluster.org>
| | > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | > Reviewed-by: Aravinda VK <avishwan@redhat.com>
| |
| | Change-Id: Iba2a8b5d3d0092e7a9c8a3c2cdf9e6e29c73ddf0
| | BUG: 1441933
| | Signed-off-by: Kotresh HR <khiremat@redhat.com>
| | Reviewed-on: https://review.gluster.org/17051
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Aravinda VK <avishwan@redhat.com>
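The change in behaviour described in this commit can be sketched roughly as follows. This is an illustration only, not the geo-replication code: `DirSyncError`, `sync_entries`, and the `create_entry` callable are hypothetical names invented for the example.

```python
import logging


class DirSyncError(Exception):
    """Raised when a directory could not be created on the slave (hypothetical)."""


def sync_entries(entries, create_entry):
    """Apply entries in order and stop at the first directory failure.

    `entries` is a list of (kind, path) tuples and `create_entry(kind, path)`
    returns True on success. Both are assumptions made for this sketch.
    """
    synced = []
    for kind, path in entries:
        if create_entry(kind, path):
            synced.append(path)
            continue
        logging.error("failed to sync %s %s", kind, path)
        if kind == "dir":
            # Nothing below a missing directory can be synced, so treat
            # this as a hard error instead of carrying on.
            raise DirSyncError(path)
    return synced
```

A failed file entry is still only logged in this sketch; only a failed directory aborts the batch.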
| * doc : release-notes for GlusterFS-3.8.11 (tag: v3.8.11) | Jiffin Tony Thottan | 2017-04-10 | 1 | -0/+26
| |
| | BUG: 1431410
| | Change-Id: Iaf1d9603221bc0c70ad1695f5aa0afc2d651d737
| | Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
| | Reviewed-on: https://review.gluster.org/17028
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Niels de Vos <ndevos@redhat.com>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| * features/shard: Fix vm corruption upon fix-layout | Krutika Dhananjay | 2017-04-10 | 2 | -59/+96
| |
| | Backport of: https://review.gluster.org/17010
| |
| | shard's writev implementation, as part of identifying presence of participant shards that aren't in memory, first sends an MKNOD on these shards, and upon EEXIST error, looks up the shards before proceeding with the writes.
| |
| | The VM corruption was caused when the following happened:
| | 1. DHT had n subvolumes initially.
| | 2. Upon add-brick + fix-layout, the layout of .shard changed although the existing shards under it were yet to be migrated to their new hashed subvolumes.
| | 3. During this time, there were writes on the VM falling in regions of the file whose corresponding shards were already existing under .shard.
| | 4. Sharding xl sent MKNOD on these shards, now creating them in their new hashed subvolumes although there already exist shard blocks for this region with valid data.
| | 5. All subsequent writes were wound on these newly created copies.
| |
| | The net outcome is that both copies of the shard didn't have the correct data. This caused the affected VMs to be unbootable.
| |
| | FIX:
| | For want of better alternatives in DHT, the fix changes shard fops to do a LOOKUP before the MKNOD and upon EEXIST error, perform another lookup.
| |
| | Change-Id: I1a5d3515b42e2e5583c407d1b4aff44d7ce472eb
| | BUG: 1440635
| | RCA'd-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
| | Reported-by: Mahdi Adnan <mahdi.adnan@outlook.com>
| | Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
| | Reviewed-on: https://review.gluster.org/17019
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
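The ordering that the fix describes (LOOKUP, then MKNOD, then a second lookup on EEXIST) can be sketched as below. This is a simplified model, not the shard translator: `resolve_shard` and the `lookup` and `mknod` callables are hypothetical stand-ins for the real fops.

```python
def resolve_shard(name, lookup, mknod):
    """Return the location of shard `name`, creating it only if it is absent.

    `lookup(name)` returns a location or None; `mknod(name)` returns a
    location or raises FileExistsError. Both callables are hypothetical.
    """
    found = lookup(name)      # LOOKUP first: also finds shards not yet migrated
    if found is not None:
        return found
    try:
        return mknod(name)    # the shard really is absent, so create it
    except FileExistsError:
        return lookup(name)   # lost a race with another creator; look again
```

Doing the lookup first means an existing shard that still sits on its old subvolume is found and reused rather than shadowed by a fresh, empty copy.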
| * features/shard: Initialize local->fop in readv | Krutika Dhananjay | 2017-04-10 | 1 | -0/+1
| |
| | Backport of: https://review.gluster.org/17014
| |
| | Change-Id: I4d2f0a3f533009038d48579db5a8a2a048b77ca1
| | BUG: 1440635
| | Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
| | Reviewed-on: https://review.gluster.org/17020
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| * features/worm: Adding implementation for ftruncate | karthik-us | 2017-04-07 | 1 | -2/+38
| |
| | Problem:
| | The ftruncate fop was not handled in the worm feature, so a truncate followed by a write on a worm-retained/worm file returned the EROFS error but still truncated the file, which is not correct.
| |
| | > Change-Id: I1a7e904655210d78bce9e01652ac56f3783b5aed
| | > BUG: 1438810
| | > Signed-off-by: karthik-us <ksubrahm@redhat.com>
| | > Reviewed-on: https://review.gluster.org/16995
| | > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | > Reviewed-by: Niels de Vos <ndevos@redhat.com>
| | > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | > Reviewed-by: Ravishankar N <ravishankar@redhat.com>
| | > Smoke: Gluster Build System <jenkins@build.gluster.org>
| | > Reviewed-by: Amar Tumballi <amarts@redhat.com>
| | > Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
| |
| | (cherry picked from commit c5a4a77848024d2adf8cd4f35d550ba90c174fc7)
| |
| | Change-Id: Ic5e904b5bb3d76954a143f92fbfd8959fec884b8
| | BUG: 1439112
| | Signed-off-by: karthik-us <ksubrahm@redhat.com>
| | Reviewed-on: https://review.gluster.org/17000
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Ravishankar N <ravishankar@redhat.com>
| | Reviewed-by: Niels de Vos <ndevos@redhat.com>
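The intended behaviour (fail with EROFS and leave the data untouched) can be modelled in a few lines. This is a toy model under stated assumptions, not the worm translator: `WormFile` is a hypothetical class and only shrinking truncates are modelled.

```python
import errno


class WormFile:
    """In-memory stand-in for a file that may be under WORM protection."""

    def __init__(self, data, worm=False):
        self.data = bytearray(data)
        self.worm = worm

    def ftruncate(self, size):
        if self.worm:
            # Reject before touching the data, so a failed call has no
            # side effect on the file contents.
            raise OSError(errno.EROFS, "Read-only file system")
        del self.data[size:]
```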
| * cluster/dht: Modify local->loc.gfid in thread safe manner | Pranith Kumar K | 2017-04-07 | 1 | -3/+2
| |
| | Backport of https://review.gluster.org/16986
| |
| | Problem:
| | local->loc.gfid in dht_lookup_directory() will be null-gfid for a fresh lookup. dht_lookup_dir_cbk() updates local->loc.gfid while in another thread dht_lookup_directory() is still winding lookup calls to subvolumes, so there is a chance of a partial gfid being seen by EC. We saw in a 12x(4+2) volume that ec is receiving a loc where the gfid has last 10 bytes matching with the gfid of the directory and the first 4 bytes are all-zeros. This is leading to EC erroring out the lookup with EINVAL, which leads to NFS failing lookup with EIO.
| |
| | snip from gdb:
| |     $37 = (dht_local_t *) 0x7fde5de5b3cc
| |     (gdb) p /x $37->loc.gfid
| |     $39 = {0x3b, 0x82, 0x10, 0x5e, 0x40, 0x65, 0x43, 0x14, 0xa0, 0xc6, 0x8, 0xf5, 0x6c, 0x2c, 0xb8, 0x56}
| |     (gdb) fr 7
| |     state=<optimized out>) at ec-generic.c:837
| |     837 ec_lookup_rebuild(fop->xl->private, fop, cbk);
| |     (gdb) p /x fop->loc[0].gfid
| |     $40 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x43, 0x14, 0xa0, 0xc6, 0x8, 0xf5, 0x6c, 0x2c, 0xb8, 0x56}
| |
| | snip from log:
| |     [2017-01-29 03:22:30.132328] W [MSGID: 122019] [ec-helpers.c:354:ec_loc_gfid_check] 0-butcher-disperse-4: Mismatching GFID's in loc
| |     [2017-01-29 03:22:30.132709] W [MSGID: 112199] [nfs3-helpers.c:3515:nfs3_log_newfh_res] 0-nfs-nfsv3: /linux-4.9.5/Documentation => (XID: b27b9474, MKDIR: NFS: 5(I/O error), POSIX: 5(Input/output error)), FH: exportid 00000000-0000-0000-0000-000000000000, gfid 00000000-0000-0000-0000-000000000000, mountid 00000000-0000-0000-0000-000000000000 [Invalid argument]
| |
| | Fix:
| | update local->loc.gfid in last-call to make sure there are no races.
| |
| | >BUG: 1438411
| | >Change-Id: Ifcb7e911568c1f1f83123da6ff0cf742b91800a0
| | >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
| |
| | BUG: 1438424
| | Change-Id: If039956205cfac5e798c2c90e92a9a47b404e804
| | Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
| | Reviewed-on: https://review.gluster.org/16988
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
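The "update in last-call" idea from the fix can be sketched as follows. This is a simplified model rather than the DHT code: `LookupLocal` and `lookup_cbk` are hypothetical names, and a Python lock stands in for the frame lock.

```python
import threading


class LookupLocal:
    """Per-request state shared by all subvolume callbacks (hypothetical)."""

    def __init__(self, call_count):
        self.lock = threading.Lock()
        self.call_count = call_count   # callbacks still outstanding
        self.pending_gfid = None       # only touched under the lock
        self.loc_gfid = bytes(16)      # what other layers read; null until done


def lookup_cbk(local, gfid):
    """Record the reply; publish the gfid only from the last callback."""
    with local.lock:
        if local.pending_gfid is None:
            local.pending_gfid = gfid
        local.call_count -= 1
        is_last = local.call_count == 0
    if is_last:
        # No winds are in flight any more, so nothing can observe a
        # half-written value.
        local.loc_gfid = local.pending_gfid
    return is_last
```

Until the final callback runs, readers of `loc_gfid` keep seeing the null gfid rather than a partially copied one.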
| * rpcsvc: Add rpchdr and proghdr to iobref before submitting to transport | Poornima G | 2017-04-07 | 4 | -12/+50
| |
| | Backport of https://review.gluster.org/16613
| |
| | Issue:
| | When fio is run on multiple clients (each client writes to its own files) and meanwhile the clients do a readdirp, the client that did a readdirp will then receive the upcalls. In this scenario the client disconnects with an rpc decode failed error.
| |
| | RCA:
| | Upcall calls rpcsvc_request_submit to submit the request to socket.
| | rpcsvc_request_submit currently:
| |
| |     rpcsvc_request_submit () {
| |         iobuf = iobuf_new
| |         iov = iobuf->ptr
| |         fill iobuf to contain xdrised upcall content - proghdr
| |         rpcsvc_callback_submit (..iov..)
| |         ...
| |         if (iobuf)
| |             iobuf_unref (iobuf)
| |     }
| |
| |     rpcsvc_callback_submit (... iov...) {
| |         ...
| |         iobuf = iobuf_new
| |         iov1 = iobuf->ptr
| |         fill iobuf to contain xdrised rpc header - rpchdr
| |         msg.rpchdr = iov1
| |         msg.proghdr = iov
| |         ...
| |         rpc_transport_submit_request (msg)
| |         ...
| |         if (iobuf)
| |             iobuf_unref (iobuf)
| |     }
| |
| | rpcsvc_callback_submit assumes that once rpc_transport_submit_request() returns, the msg has been written to the socket and thus the buffers (rpchdr, proghdr) can be freed, which is not the case. Under especially high workload, rpc_transport_submit_request() may not be able to write to the socket immediately, and hence adds the msg to its own queue and returns success. Thus we have a use-after-free for rpchdr and proghdr. Hence the clients get garbage rpchdr and proghdr and fail to decode the rpc, resulting in a disconnect.
| |
| | To prevent this, we need to add the rpchdr and proghdr to an iobref and send it in msg:
| |
| |     iobref_add (iobref, iobufs)
| |     msg.iobref = iobref;
| |
| | The socket layer takes a ref on msg.iobref if it cannot write to the socket and is adding to the queue. Thus we do not have a use-after-free.
| |
| | Thank You for discussing, debugging and fixing along:
| | Prashanth Pai <ppai@redhat.com>
| | Raghavendra G <rgowdapp@redhat.com>
| | Rajesh Joseph <rjoseph@redhat.com>
| | Kotresh HR <khiremat@redhat.com>
| | Mohammed Rafi KC <rkavunga@redhat.com>
| | Soumya Koduri <skoduri@redhat.com>
| |
| | > Reviewed-on: https://review.gluster.org/16613
| | > Reviewed-by: Prashanth Pai <ppai@redhat.com>
| | > Smoke: Gluster Build System <jenkins@build.gluster.org>
| | > Reviewed-by: soumya k <skoduri@redhat.com>
| | > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | > Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
| |
| | Change-Id: Ifa6bf6f4879141f42b46830a37c1574b21b37275
| | BUG: 1422788
| | Signed-off-by: Poornima G <pgurusid@redhat.com>
| | Reviewed-on: https://review.gluster.org/16638
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Prashanth Pai <ppai@redhat.com>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
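The reference-counting argument in the RCA can be modelled with a toy transport that always queues. Everything here is a hypothetical stand-in (`Buf` for an iobuf, `Transport.submit` for the queuing path of the socket layer); it only illustrates why the queue must hold its own reference.

```python
class Buf:
    """Reference-counted buffer standing in for an iobuf (hypothetical)."""

    def __init__(self, data):
        self.data = data
        self.refs = 1

    def ref(self):
        self.refs += 1

    def unref(self):
        self.refs -= 1
        if self.refs == 0:
            self.data = None   # memory handed back; contents are gone


class Transport:
    """Transport whose socket is always busy, so every message is queued."""

    def __init__(self):
        self.queue = []

    def submit(self, bufs):
        for buf in bufs:
            buf.ref()          # the queue keeps its own reference
        self.queue.append(bufs)

    def flush(self):
        sent = []
        for bufs in self.queue:
            sent.append(b"".join(buf.data for buf in bufs))
            for buf in bufs:
                buf.unref()
        self.queue.clear()
        return sent


def callback_submit(transport, rpchdr, proghdr):
    bufs = [Buf(rpchdr), Buf(proghdr)]
    transport.submit(bufs)
    for buf in bufs:
        buf.unref()            # the caller drops its reference straight away
```

Without the `ref()` in `submit`, the caller's `unref()` would release the buffers while the message is still queued, which corresponds to the use-after-free described above.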
| * cluster/ec: Add/Modify description for eager-lock option | Ashish Pandey | 2017-04-07 | 2 | -6/+17
| |
| | This patch provides description for disperse.eager-lock option for disperse volume. It also modifies the description for cluster.eager-lock option to indicate that this option is only for replica volume.
| |
| | >Change-Id: Ie73298947fcaaa6aaf825978bc2d27ceaff386d2
| | >BUG: 1327171
| | >Signed-off-by: Ashish Pandey <aspandey@redhat.com>
| | >Reviewed-on: http://review.gluster.org/13999
| | >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | >Smoke: Gluster Build System <jenkins@build.gluster.com>
| | >Reviewed-by: Ravishankar N <ravishankar@redhat.com>
| | >CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| | >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| |
| | BUG: 1435645
| | Change-Id: I48b091e002b5c3308d6fbf2feb024a7f2fe08969
| | Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
| | Reviewed-on: https://review.gluster.org/16943
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
| * glusterd: support filesystems with dynamic inode sizes | Niels de Vos | 2017-04-07 | 1 | -0/+11
| |
| | btrfs and zfs are two filesystems that do not have fixed sizes for inodes. Instead of logging an error, skip checking and mark the size as "N/A" like other properties that can not be reported.
| |
| | The error message that was reported by users on the mailinglist shows up like:
| |
| |     [glusterd-utils.c:5458:glusterd_add_inode_size_to_dict] 0-management: could not find (null) to getinode size for /dev/vdb (btrfs): (null) package missing?
| |
| | Cherry picked from commit 12921693b572f642156d3167d1c92d3449dfc8ec:
| | > Change-Id: Ib10b7a3669f2f4221075715d9fd44ce1ffc35324
| | > Reported-by: Arman Khalatyan <arm2arm@gmail.com>
| | > URL: http://lists.gluster.org/pipermail/gluster-users/2017-March/030189.html
| | > BUG: 1433425
| | > Signed-off-by: Niels de Vos <ndevos@redhat.com>
| | > Reviewed-on: https://review.gluster.org/16867
| | > Smoke: Gluster Build System <jenkins@build.gluster.org>
| | > Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
| | > Reviewed-by: Prashanth Pai <ppai@redhat.com>
| | > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| |
| | Change-Id: Ib10b7a3669f2f4221075715d9fd44ce1ffc35324
| | Reported-by: Arman Khalatyan <arm2arm@gmail.com>
| | BUG: 1436412
| | Signed-off-by: Niels de Vos <ndevos@redhat.com>
| | Reviewed-on: https://review.gluster.org/16960
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Zhou Zhengping <johnzzpcrystal@gmail.com>
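The "skip the check and report N/A" behaviour might look like this in outline. The function `inode_size_for` and the `probe` callable are hypothetical; only the two filesystem names and the "N/A" marker come from the commit message.

```python
# Filesystems that, per the commit message, have no fixed inode size.
DYNAMIC_INODE_FS = {"btrfs", "zfs"}


def inode_size_for(fs_type, probe):
    """Return the inode size as a string, or "N/A" when it cannot be reported.

    `probe(fs_type)` is a hypothetical callable that returns an int, or
    None when no tool is available for that filesystem.
    """
    if fs_type in DYNAMIC_INODE_FS:
        return "N/A"           # skip the probe instead of logging an error
    size = probe(fs_type)
    return "N/A" if size is None else str(size)
```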
| * cluster/afr: Undo pending xattrs only on the up bricks | karthik-us | 2017-04-07 | 2 | -1/+90
| |
| | Problem:
| | During a conservative merge, the pending xattr is reset even on a brick that is down. When that brick comes back up, the heal treats it as the source and removes the entries on the other bricks, which leads to data loss.
| |
| | Fix:
| | Undo pending only for the bricks which are up.
| |
| | > Change-Id: I18436fa0bb1faa5f60531b357dea3f6b20446303
| | > BUG: 1433571
| | > Signed-off-by: karthik-us <ksubrahm@redhat.com>
| | > Reviewed-on: https://review.gluster.org/16913
| | > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| | > Smoke: Gluster Build System <jenkins@build.gluster.org>
| | > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | > Reviewed-by: Ravishankar N <ravishankar@redhat.com>
| |
| | (cherry picked from commit f91596e6566c605e70a31a60523d11f78a097c3c)
| |
| | Change-Id: Id20c9ce53ee59f005d977494903247e2a8024ed1
| | BUG: 1436231
| | Signed-off-by: karthik-us <ksubrahm@redhat.com>
| | Reviewed-on: https://review.gluster.org/16956
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Ravishankar N <ravishankar@redhat.com>
| | Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
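As a rough illustration of the fix, the reset can be restricted to reachable bricks. The data layout is an assumption made for the sketch (a plain mapping of brick name to pending count); AFR's real xattr handling is more involved, and `undo_pending` is a hypothetical name.

```python
def undo_pending(pending, brick_up):
    """Clear pending markers only for bricks that are currently up.

    `pending` maps brick name -> pending count and `brick_up` maps brick
    name -> bool. Both structures are simplifications for this sketch.
    """
    return {
        brick: (0 if brick_up.get(brick, False) else count)
        for brick, count in pending.items()
    }
```

A brick missing from `brick_up` is treated as down, so its marker is preserved.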
| * cluster/ec: Metadata healing fails to update the version | Sunil Kumar Acharya | 2017-04-07 | 1 | -7/+5
| |
| | During metadata heal, we were not updating the version even though all the inode attributes were in sync. Updated the code to adjust the version when all the inode attributes are in sync.
| |
| | >BUG: 1425703
| | >Change-Id: I6723be3c5f748b286d4efdaf3c71e9d2087c7235
| | >Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
| | >Reviewed-on: https://review.gluster.org/16772
| | >Smoke: Gluster Build System <jenkins@build.gluster.org>
| | >Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
| | >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| | >CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| |
| | BUG: 1434298
| | Change-Id: I5b74423253138957644b1bfa543d4abb2532c377
| | Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
| | Reviewed-on: https://review.gluster.org/16935
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
| * features/locks: Fix leak of posix_lock_t's client_uid | Xavier Hernandez | 2017-04-07 | 3 | -69/+47
| |
| | > Change-Id: I3bc14998ed6a8841f77a004c24a456331048a521
| | > BUG: 1428510
| | > Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
| | > Reviewed-on: https://review.gluster.org/16838
| | > Smoke: Gluster Build System <jenkins@build.gluster.org>
| | > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | > Reviewed-by: Amar Tumballi <amarts@gmail.com>
| | > Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
| |
| | Change-Id: I3bc14998ed6a8841f77a004c24a456331048a521
| | BUG: 1431592
| | Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
| | Reviewed-on: https://review.gluster.org/16896
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| * afr: do not mention split-brain in log message in read_txn | Ravishankar N | 2017-04-04 | 1 | -3/+2
| |
| | I am seeing a lot of messages in qe/customer logs where read_txn complains that the file is possibly in split-brain because no readable subvol was found, does an inode refresh, and then there is no split-brain message after the inode refresh. This means that a lookup was not issued on the inode to populate 'readable', or it can mean one brick is the source for data and the other for metadata, making readable zero (because readable = intersection of (data, metadata readable)) since commit 7a1c1e290470149696.
| |
| | Since we anyway log actual split-brains post inode-refresh, move this message to DEBUG log level.
| |
| | > Signed-off-by: Ravishankar N <ravishankar@redhat.com>
| | > Reviewed-on: https://review.gluster.org/16879
| | > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| | > Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| | > Smoke: Gluster Build System <jenkins@build.gluster.org>
| | > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| |
| | (cherry picked from commit 71e023fcaab0058f32fedc7b6b702040fdd85f46)
| |
| | Change-Id: Idb88b8ea362515279dc9b246f06b6b646c6d8013
| | BUG: 1434302
| | Reviewed-on: https://review.gluster.org/16933
| | Tested-by: Ravishankar N <ravishankar@redhat.com>
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
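The "readable is the intersection of data-readable and metadata-readable" point is easy to see with sets. The function name and the use of integer brick indices are assumptions for this sketch.

```python
def readable_subvols(data_readable, metadata_readable):
    """Subvolumes that can serve reads: good for both data and metadata."""
    return set(data_readable) & set(metadata_readable)


# Brick 0 is the source for data and brick 1 for metadata: the
# intersection is empty although nothing has been shown to be in
# split-brain, which is why the message was demoted to DEBUG.
no_common_source = readable_subvols({0}, {1})
```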
| * storage/posix: Use granular mutex locks for pgfid update syscalls | Krutika Dhananjay | 2017-04-04 | 3 | -11/+54
| |
| | Backport of: https://review.gluster.org/16869
| |
| | Change-Id: I5c48b3be3f39bb8f951d33e2729522605384d1ff
| | BUG: 1427390
| | Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
| | Reviewed-on: https://review.gluster.org/16893
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| * storage/posix: Use more granular mutex locks for atomic writes | Krutika Dhananjay | 2017-04-04 | 3 | -8/+33
| |
| | Backport of: https://review.gluster.org/16785
| |
| | Change-Id: I64aa561cb76ff9d4597d91fb5aeb64531698936a
| | BUG: 1427390
| | Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
| | Reviewed-on: https://review.gluster.org/16892
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| * features/shard: Pass the correct iatt for cache invalidation | Krutika Dhananjay | 2017-04-04 | 1 | -1/+1
| |
| | Backport of: https://review.gluster.org/16961
| |
| | This fixes a performance issue with shard which was causing the translator to trigger unusually high number of lookups for cache invalidation even when there was no modification to the file.
| |
| | In shard_common_stat_cbk(), it is local->prebuf that contains the aggregated size and block count as opposed to buf which only holds the attributes for the physical copy of base shard. Passing buf for inode_ctx invalidation would always set refresh to true since the file size in inode ctx contains the aggregated size and would never be same as @buf->ia_size. This was leading to every write/read being preceded by a lookup on the base shard even when the file underwent no modification.
| |
| | Change-Id: I85940b4b33e77b98e97e277d880ab35b1496c89a
| | BUG: 1437330
| | Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
| | Reviewed-on: https://review.gluster.org/16968
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
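The size comparison at the heart of this bug can be shown with plain numbers. The 4 MiB shard size and the three-shard file are made-up values for illustration, and `needs_refresh` is a hypothetical reduction of the invalidation check.

```python
SHARD = 4 * 1024 * 1024          # assumed shard block size for the example


def needs_refresh(cached_size, stat_size):
    """The cached inode ctx is considered stale when the sizes differ."""
    return cached_size != stat_size


cached_size = 3 * SHARD          # aggregated size kept in the inode ctx
buf_size = SHARD                 # buf: the physical base shard only
prebuf_size = 3 * SHARD          # local->prebuf: aggregated across shards

stale_with_buf = needs_refresh(cached_size, buf_size)
stale_with_prebuf = needs_refresh(cached_size, prebuf_size)
```

Comparing against the base shard's own size always reports a mismatch for any file larger than one shard, which matches the constant refreshes described in the commit message.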
| * build/packaging: Debian and Ubuntu don't have /usr/libexec | Kaleb S. KEITHLEY | 2017-03-21 | 10 | -18/+18
| |
| | GLUSTERFS_LIBEXECDIR is effectively hard-coded to /usr/libexec/glusterfs in configure(.ac).
| |
| | Debian-based distributions don't have a /usr/libexec/ directory.
| |
| | This issue is partially mitigated by the use of $libexecdir in some of the Makefile.am files, but even so the incorrectly defined GLUSTERFS_LIBEXECDIR results in various things such as gsyncd, glusterfind, eventsd, etc., trying to invoke other scripts and programs from a location that doesn't exist.
| |
| | And once we correctly define GLUSTERFS_LIBEXECDIR, then we might as well use it appropriately.
| |
| | master change https://review.gluster.org/16880
| | master BZ: 1430841
| | release-3.10 change https://review.gluster.org/16881
| | release-3.10 BZ: 1430845
| |
| | Change-Id: If5219cadc51ae316f7ba2e2831d739235c77902d
| | BUG: 1430845
| | Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
| | Reviewed-on: https://review.gluster.org/16882
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Milind Changire <mchangir@redhat.com>
* | nfs: Fix crash bug when mnt3_resolve_subdir_cbk() fails | Shreyas Siravara | 2017-08-30 | 1 | -0/+3
| |
| | Summary:
| | When mnt3_resolve_subdir_cbk() fails (this can happen when we race with gfs_stress.sh), authorized_path is sometimes NULL. We try to run strlen() on this path and as a result we crash. This diff fixes that by checking if path is NULL before dereferencing it. This bug exists in fb-release-3.6.3 & fb-release-3.6.3-stable.
| | - This is a port of D3406533 to release-3.8-fb
| |
| | Test Plan: Run with patch and observe that there is no crash. Prove test for auth code.
| |
| | Reviewers: rwareing, kvigor
| | Reviewed By: kvigor
| | Subscribers: #posix_storage
| |
| | Change-Id: Ib24a9b640b066f72db30e9e08fccc512c0ff7bb6
| | Reviewed-on: https://review.gluster.org/18155
| | Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
* | io-stats: Expose io-thread queue depths | Shreyas Siravara | 2017-08-30 | 9 | -10/+126
| |
| | Summary:
| | - This diff exposes the io-thread queue depths by sending a specialized getxattr() call down to the io-threads translator.
| | - Port of D3086477, D3094145, D3095505 to 3.8
| |
| | Test Plan: Tested on devserver, will run prove tests. Valgrind + ASAN pass as well.
| |
| | Reviewers: rwareing, kvigor
| | Subscribers: dld, moox, dph
| | Differential Revision: https://phabricator.fb.com/D3086477
| |
| | Change-Id: Ia452a4fcdb9173a751c4cb48d739b25c235f6855
| | Reviewed-on: https://review.gluster.org/18143
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
* | libglusterfs: Fix leak in client_t destroy struct | Richard Wareing | 2017-08-30 | 1 | -0/+2
| |
| | Summary:
| | - People tell me freeing authentication data after we are done with it is a good thing to do.
| | - This is a port of D3071688.
| |
| | Test Plan:
| | - Ran valgrind w/ looping FUSE mount requests and watched "definitely lost"; it no longer goes up the more we mount/umount
| |
| | Reviewers: kvigor, sshreyas
| | Subscribers: moox, dld, dph
| |
| | Change-Id: Ia3d4a5bdd431006bd2d39b957cfe27f1ba3ef16e
| | Reviewed-on: https://review.gluster.org/18142
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* | storage/posix: Fix crash bug in posix_make_ancestryfromgfid | Richard Wareing | 2017-08-30 | 4 | -5/+28
| |
| | Summary:
| | - Per title, adding OOPS logging
| | - Clean-up quota related log spew from when older clients connect to 3.6.x quota enabled clusters
| |
| | Test Plan:
| | - Spew: Tested on dev servers, one with old client against dev server w/ 3.6.3_fb
| | - Crash: Canaried on offending node (gfsai040.prn2) and ensure crash no longer happens
| | - Canary on gfsbudev shadow tiers once approved
| | - This is a port of D2894799 to 3.8
| |
| | Reviewers: sshreyas, dph, dld, moox
| |
| | Change-Id: I13e7d6915ee301b8d607d5770ef2261a9ab78493
| | Reviewed-on: https://review.gluster.org/18140
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
* | gNFSd: Auto re-register NFS/Mount programs with rpcbind periodically | Shreyas Siravara | 2017-08-30 | 5 | -9/+112
| |
| | Summary:
| | Every once in a while rpcbind crashes and the NFS endpoints go bye-bye. This diff makes it such that we should almost never encounter the case where we have NFS up and rpcbind down causing bad endpoints and hanging mounts for our customers.
| |
| | Test Plan: Added prove tests + tested on dev server
| |
| | Reviewers: dph, moox, rwareing
| | Reviewed By: rwareing
| | Differential Revision: https://phabricator.fb.com/D2571724
| | Tasks: 8803558
| |
| | Change-Id: I35acb2d731185a7b20020cb57bdd4d879e978df4
| | Signature: t1:2571724:1445555327:3276a4dcc4da71346b09d4aeb46c69dddcc7c5ba
| | Reviewed-on: https://review.gluster.org/17961
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
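Periodic re-registration can be pictured as a simple loop. This is not how gNFSd implements it; `reregister_loop`, the program names, and the interval are all assumptions, and the `register` and `sleep` callables are injected so the sketch can run without a real rpcbind.

```python
import itertools


def reregister_loop(register, programs, interval, sleep, rounds=None):
    """Re-register every program with the port mapper once per interval.

    `register(program)` and `sleep(seconds)` are hypothetical callables;
    `rounds=None` means run forever.
    """
    counter = itertools.count() if rounds is None else range(rounds)
    for _ in counter:
        for program in programs:
            # Registering again is assumed to be harmless when the entry
            # is still present, and restores it after an rpcbind restart.
            register(program)
        sleep(interval)
```

In a real daemon this would run on a timer or background thread, for example `reregister_loop(do_register, ["nfs3", "mount3"], 30, time.sleep)` with a hypothetical `do_register`.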
* | Make a DHT subvolume go read-only when a subvolume crashes | Shreyas Siravara | 2017-08-30 | 4 | -6/+13
| |
| | Summary:
| | When subvolumes crash, users get messages like "No such file or directory" or "I/O Error" when doing operations that are cluster-wide, i.e., operations that touch the subvolume that has crashed. These include operations like mkdir() and rmdir() which are cluster-wide, as well as reads/writes/creates that hash to the dead subvolume.
| |
| | DHT does the right thing by disallowing operations to the subvolume: it is effectively putting the subvolume in "read-only" mode to protect data, but it does not return the correct error. As a result, users of the filesystem think that the data is gone (in the case of "No such file or directory"), or worse, get a blanket error that means nothing (in the case of EIO).
| |
| | DHT sets the errno to ENOENT, which makes sense in the context of DHT (no subvolume entry, hence ENOENT), but the error it should bubble up to the user is EROFS, since it is putting the system in read-only mode.
| |
| | This diff changes the error messages to EROFS so the users get a clearer message of what is going on.
| |
| | Test Plan: Tested by downing a subvolume and checking error codes. Also ran other prove tests to make sure they pass.
| |
| | Change-Id: I20ad6fe31dbd66536db2a69246771ffad0140db3
| | Reviewers: rwareing, dph, moox
| | Reviewed-on: https://review.gluster.org/17952
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
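The error translation described in this summary amounts to a small mapping. The function below is a hypothetical sketch; only the ENOENT and EROFS codes come from the commit message, and the `subvol_down` flag is an assumption about how the condition is detected.

```python
import errno


def user_errno(op_errno, subvol_down):
    """Map the internal errno to the one reported to the user.

    When the target subvolume is down, the internal ENOENT ("no subvolume
    entry") is reported as EROFS; every other error passes through.
    """
    if subvol_down and op_errno == errno.ENOENT:
        return errno.EROFS
    return op_errno
```

A genuine ENOENT on a healthy subvolume is left alone, so missing files are still reported as missing.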
* | configure: add missing square brackets | Kinglong Mee | 2017-08-30 | 1 | -1/+1
| |
| |     $ sh autogen.sh
| |     ... GlusterFS autogen ...
| |     Running aclocal...
| |     configure.ac:939: warning: AC_LANG_CONFTEST: no AC_LANG_SOURCE call detected in body
| |     ../../lib/autoconf/lang.m4:193: AC_LANG_CONFTEST is expanded from...
| |     configure.ac:939: the top level
| |     Running autoheader...
| |     configure.ac:939: warning: AC_LANG_CONFTEST: no AC_LANG_SOURCE call detected in body
| |     ../../lib/autoconf/lang.m4:193: AC_LANG_CONFTEST is expanded from...
| |     configure.ac:939: the top level
| |     Running libtoolize...
| |     Running autoconf...
| |     configure.ac:939: warning: AC_LANG_CONFTEST: no AC_LANG_SOURCE call detected in body
| |     ../../lib/autoconf/lang.m4:193: AC_LANG_CONFTEST is expanded from...
| |     configure.ac:939: the top level
| |     Running automake...
| |     configure.ac:939: warning: AC_LANG_CONFTEST: no AC_LANG_SOURCE call detected in body
| |     ../../lib/autoconf/lang.m4:193: AC_LANG_CONFTEST is expanded from...
| |     configure.ac:939: the top level
| |
| | Change-Id: If5cecd75deb6a54c267eac899ca9e3c37d098193
| | Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
| | Reviewed-on: https://review.gluster.org/18038
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
| | Tested-by: Kinglong Mee <kinglongmee@gmail.com>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* | cluster/afr: SHD should not use did_discovery code paths | Richard Wareing | 2017-08-29 | 3 | -1/+7
| |
| | Summary:
| | - Exempt the SHD from the discover code path
| |
| | Test Plan:
| | - prove -v tests/bugs/fb8149516.t
| | - Make rc and canary on offending host (gfsdataswarm048.prn2)
| |
| | Reviewers: moox, dph, sshreyas
| | Reviewed By: sshreyas
| | Differential Revision: https://phabricator.fb.com/D2491694
| |
| | Change-Id: I5ec3997cf26375e834c3c7c4ea6c174eef957b8b
| | Signed-off-by: Jeff Darcy <jdarcy@fb.com>
| | Reviewed-on: https://review.gluster.org/18141
| | Tested-by: Jeff Darcy <jeff@pl.atyp.us>
| | CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| | Smoke: Gluster Build System <jenkins@build.gluster.org>
| | Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
* | fb-smoke: Minor fixkrad2017-08-291-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Minor bash issues Test Plan: Run remote test Differential Revision: https://phabricator.intern.facebook.com/D5726446 Change-Id: If95d091bf53b2959a36bc89a2ba056d333833a26 Tasks: T20082902 Reviewed-on: https://review.gluster.org/18136 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* | Add negative caching to nfs auth cacheShreyas Siravara2017-08-295-351/+289
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This diff adds the ability to the nfs daemon to cache hosts it has deauthorized for mounts, not just the hosts it has authorized. This allows a host that has been denied to be deauthorized for ttl # of seconds, or until the nfs daemon has restarted. Test Plan: Use the prove tests to maintain the integrity of the auth code. Test manually to see if the correct code path is being hit. Reviewers: dph, rwareing Reviewed By: rwareing Differential Revision: https://phabricator.fb.com/D1947728 Change-Id: I9728e15913e0900ab34311b13b30eba0b91ce33f Reviewed-on: https://review.gluster.org/18134 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> Tested-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
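The negative-caching idea described above can be sketched as a small TTL cache that stores denials as well as approvals. This is an illustration only, not the nfs daemon's C implementation: `AuthCache`, `authorize`, and the `check_exports` callback are hypothetical names, and the injectable clock exists only to make the sketch testable.

```python
import time


class AuthCache:
    """Toy TTL cache that remembers both allowed and denied hosts (hypothetical)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # host -> (allowed, expires_at)

    def lookup(self, host):
        """Return the cached verdict (True/False), or None on a miss or expiry."""
        entry = self.entries.get(host)
        if entry is None:
            return None
        allowed, expires_at = entry
        if self.clock() >= expires_at:
            # Entry is older than the TTL: drop it and report a miss.
            del self.entries[host]
            return None
        return allowed

    def store(self, host, allowed):
        """Cache a verdict, positive or negative, for ttl seconds."""
        self.entries[host] = (allowed, self.clock() + self.ttl)


def authorize(cache, host, check_exports):
    """Consult the cache first; fall back to the (assumed expensive) real check."""
    cached = cache.lookup(host)
    if cached is not None:
        return cached
    allowed = check_exports(host)
    cache.store(host, allowed)  # denials are cached too
    return allowed
```

In this sketch a denied host stays denied until its entry expires or the process restarts (the dictionary lives only in memory), which mirrors the behavior the summary describes.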
* | fb-smoke: Add test_envkrad2017-08-292-23/+171
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: test_env provides test settings for 3.8 Test Plan: Run unit tests Differential Revision: https://phabricator.intern.facebook.com/D5671384 Change-Id: I1bab102b83d5fffe5ffcc568d2f82d19c78c84d5 Reviewed-on: https://review.gluster.org/18080 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* | features/locks: Fix crash bug in connection (lock) clean-up flowRichard Wareing2017-08-282-12/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: - Fixes a bug where bricks can crash after the "clear locks" command is run (by CLI or by revocation code) and sockets are later cleaned up. The crash is a use-after-free: refs to the lock are left in the client-list, so when this list is later traversed the pointers point to garbage. Test Plan: - Ran with monkey-unlock and tested connection clean-ups after lock revocation Reviewers: sshreyas, dph, moox Reviewed By: moox Differential Revision: https://phabricator.fb.com/D2695087 Tasks: 6207062 Change-Id: Iea26efe4bfbadc26431a3c50a0a8bda218bb5219 Signed-off-by: Jeff Darcy <jdarcy@fb.com> Reviewed-on: https://review.gluster.org/18122 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> Tested-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* | cluster/afr: Remove "compatability" code from SHD flowRichard Wareing2017-08-281-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: - After examining some lock revocations, I recommend we remove this (problematic?) code which causes a lock to be held outside of the SHD locking domain. - The theory is that this lock should never conflict with anything outside of the SHD flow since the offsets are so huge. However in practice with lock revocation the lock lives a very long time, and I worry what the other implications of this might be. Test Plan: - Run tests/basic/afr/* (w/ D2706710) Reviewers: dph, moox, sshreyas Reviewed By: sshreyas Differential Revision: https://phabricator.fb.com/D2706717 Change-Id: I1f358f66810d104f28def9d1ac2a4fde3d073c92 Signed-off-by: Jeff Darcy <jdarcy@fb.com> Reviewed-on: https://review.gluster.org/18123 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> Tested-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* | cluster/afr: Delete "special domain" AFR heal flowRichard Wareing2017-08-281-48/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: - Reverts code from 50998ae08c5a767468ee85cb5c53bb5554ff734a, this was originally intended for backward compat w/3.5. This isn't relevant for us, and everywhere we see stuck heals we tend to see this "aaaaaaaaa" guy show up, so let's nuke it. Test Plan: - Run heal prove tests Reviewers: moox, dph, sshreyas Reviewed By: sshreyas Differential Revision: https://phabricator.fb.com/D2689678 Tasks: 6207062 Change-Id: Ie9db3eb6c6d44f6137ebcf964e06965047763ed9 Signed-off-by: Jeff Darcy <jdarcy@fb.com> Reviewed-on: https://review.gluster.org/18121 Tested-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* | fb-smoke: Add fb-smoke, build and build_env to r-3.8krad2017-08-183-0/+285
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The new plan is to keep fb-smoke, build and build_env in every version of gluster. Changes in r-3.6 will be ported to r-3.8 henceforth. We reference fbcode for remote testing. Test Plan: Run unit, asan, valgrind Reviewers: junsongli, sshreyas, jdarcy Reviewed By: jdarcy Subscribers: #posix_storage Differential Revision: https://phabricator.intern.facebook.com/D5653092 Tasks: T20082902 Change-Id: Iebf4cfc1752e97d6f9efe80af88ee06c21103d83 Signature: t1:5653092:1503006640:642f075cba3a7295af42638e100d2e48f426f07a Reviewed-on: https://review.gluster.org/18055 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> Tested-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* | cluster/dht: Fix rebalance bug + better loggingRichard Wareing2017-08-151-4/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: - Fixes edge case where lookup by gfid fails because it's not copied into the inode struct from the loc_t struct during the readdir loop - Improved logging for error conditions Test Plan: - Tested on dev server - Canaried build on <redacted> Reviewers: dph, moox, sshreyas Reviewed By: sshreyas Differential Revision: https://phabricator.fb.com/D2676693 Tasks: 9034954 Change-Id: I7f0160b391c43fc38e679fdb660cee59d2267932 Signed-off-by: Jeff Darcy <jdarcy@fb.com> Reviewed-on: https://review.gluster.org/18040 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* | cluster/afr/shd: Fix leak in PGFID healingRichard Wareing2017-08-011-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: - Fixes leak in PGFID healing flow Test Plan: - Valgrind on dev server Differential Revision: https://phabricator.fb.com/D3090661 Change-Id: Icde6c3ed868034dff77c92f01182dd1e3a4f8a57 Signed-off-by: Jeff Darcy <jdarcy@fb.com> Reviewed-on: https://review.gluster.org/17948 Tested-by: Jeff Darcy <jeff@pl.atyp.us> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* | cluster/afr: Fix case in PGFID healing where NOOP was not being honoredRichard Wareing2017-08-015-7/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: - PGFID healing should not be triggered in the case where there is nothing to do (ret = 2). Instead this return code should be returned to the heal daemon to trigger the reap of the entry. - Reworked shd-pgfid-heal.t to queue up heal naturally instead of synthetically Test Plan: - Run tests/basic/afr/shd-pgfid-heal.t Differential Revision: https://phabricator.fb.com/D2748578 Change-Id: I74300de2b4dce23867f4111548de35f58bf77453 Signed-off-by: Jeff Darcy <jdarcy@fb.com> Reviewed-on: https://review.gluster.org/17936 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* | features/quota: Fix brick crash in quota unlink callbackRichard Wareing2017-08-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: - Get log message to use loc.gfid not loc.inode->gfid Test Plan: - Run prove -v tests/basic/quota*.t Reviewers: dph, moox, sshreyas Reviewed By: sshreyas Signature: t1:2559107:1445311668:61ca5809fa977326d0fb503e874363a29cd31dfe Change-Id: Iad16d7b2102376380eb0f6918111249af370aaeb Reviewed-on: https://review.gluster.org/17938 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> Tested-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* | afr/cluster: PGFID heal supportRichard Wareing2017-07-3112-32/+226
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: PGFID healing enables heals that might otherwise fail for lack of an entry heal to succeed, by performing the entry healing within the same heal flow. It does this by leveraging the PGFID tracking feature of the POSIX xlator, and examining lookup replies for the PGFID attribute. If detected, the pgfid will be decoded and stored for later use in case the heal fails for whatever reason. Cascading heal failures are handled through recursion. This feature is critical for several reasons: 1. General healing predictability - When the SHD attempts to heal a given GFID, it should be able to do so without having to wait for some other dependent heal to take place. 2. Reliability - In some cases the parent directory may require healing, but the required entry in the indices/xattrop directory may not exist (e.g. bugs/crashes etc). Prior to PGFID heal support, some sort of external script would be required to queue up these heals by using FS-specific utilities to look up the parent directory by hardlink, or worse, do a costly full heal to clean them up. 3. Performance - In combination with multi-threaded SHD, this feature will make SHD healing _much_ faster, as directories with large numbers of files to be healed will no longer have to wait for an entry heal to come along: the first file in that directory queued for healing will trigger an entry heal for the directory, and this will allow the other files in that directory to be (immediately) healed in parallel. 
Test Plan: - run prove tests/basic/afr/shd_pgfid_heal.t - run prove tests/basic/afr/shd*.t - run prove tests/basic/afr/gfid*.t Differential Revision: https://phabricator.fb.com/D2546133 Change-Id: I25f586047f8bcafa900c0cc9ee8f0e2128688c73 Signed-off-by: Jeff Darcy <jdarcy@fb.com> Reviewed-on: https://review.gluster.org/17929 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
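The recursive handling of cascading heal failures described in the summary can be sketched roughly as follows. This is a simplified model under stated assumptions, not the AFR code: `heal_with_pgfid`, the `try_heal` and `parent_of` callbacks, and the `max_depth` guard are all hypothetical.

```python
def heal_with_pgfid(gfid, try_heal, parent_of, max_depth=32):
    """Heal gfid; on failure, heal its parent (found via the PGFID) and retry.

    try_heal(gfid)  -> bool                  (assumed: attempts one heal)
    parent_of(gfid) -> parent gfid or None   (assumed: decoded PGFID attribute)
    """
    if try_heal(gfid):
        return True
    parent = parent_of(gfid)
    if parent is None or max_depth == 0:
        # No recorded parent, or the recursion guard tripped: give up.
        return False
    # Cascading failure: the parent itself may need its own parent healed first.
    if not heal_with_pgfid(parent, try_heal, parent_of, max_depth - 1):
        return False
    # The parent entry heal succeeded, so retry the original heal once.
    return try_heal(gfid)
```

The depth guard is an addition for the sketch; the commit message does not say how the real flow bounds its recursion.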
* | server: fix core dumps on upstream test machinesJeff Darcy2017-07-181-1/+5
| | | | | | | | | | | | | | | | | | | | Change-Id: I48f5340507a5fcebe874f498eba737585c1c32a7 Signed-off-by: Jeff Darcy <jdarcy@fb.com> Reviewed-on: https://review.gluster.org/17818 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* | [io-cache] cache statfsDavid Wolinsky2017-07-144-16/+208
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: cache calls to statfs - io-cache must be enabled - then enable statfs caching - also can configure an independent cache time Test Plan: unit test basic/cache.t Reviewers: rwareing, sshreyas Subscribers: rappleye Differential Revision: https://phabricator.fb.com/D2524471 Change-Id: I55e0a773f9e24c2358d6fbbabbaf58bd5bd89ffc Tasks: 8618383 Reviewed-on: https://review.gluster.org/17771 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> Tested-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* | [nfs] exports_auth per (sub) volumeDavid Wolinsky2017-07-138-28/+167
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: - exports_auth changed to a per-volume option - parse exports_auth in nfs3.c - set nfs3_export state for exports_auth - all calls into mnt3_authenticate_request must pass in volname - volname is checked to determine if auth is enabled for that volume Test Plan: manual testing, will look into unit testing Reviewers: rwareing, sshreyas Reviewed By: sshreyas Subscribers: rappleye Differential Revision: https://phabricator.fb.com/D2519423 Tasks: 6863942 Change-Id: Ia9fd92ca5a5bd4cbb57e9ce61075f024ab7dbc27 Signature: t1:2519423:1444775772:24dc39e22684784b75899e97e9d1e294b059a077 Signed-off-by: Jeff Darcy <jdarcy@fb.com> Reviewed-on: https://review.gluster.org/17762 Tested-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* | cluster/afr: Handle gfid-less directories in heal flowRichard Wareing2017-07-127-18/+251
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: - Updates heal flow to handle the case where a directory does not have a gfid assigned. In this case we will remove _only_ empty directories, such that the parent can regain consistency and files within can be correctly healed. - Also adds a test for the case where a file does not have a gfid; this is already handled by the metadata heal flow, but tests were lacking for this code path. Test Plan: - prove -v tests/basic/shd_autofix_nogfid.t - prove -v tests/basic/gfid_unsplit_shd.t Reviewers: dph, moox, sshreyas Reviewed By: sshreyas Differential Revision: https://phabricator.fb.com/D2502067 Tasks: 8549168 Change-Id: I8dd3e6a6d62807cb38aafe597eced3d4b402351b Signed-off-by: Jeff Darcy <jdarcy@fb.com> Reviewed-on: https://review.gluster.org/17750 Tested-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* | tests: test fix for 67279d73Jeff Darcy2017-07-111-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | Accidentally pushed this directly through the branch instead of through Gerrit. Change-Id: Ieedd2f71887cca91a6f1d31bc3cddfc489fc9fa6 Signed-off-by: Jeff Darcy <jdarcy@fb.com> Reviewed-on: https://review.gluster.org/17749 Tested-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* | cluster/afr: SHD should always inspect directory healsRichard Wareing2017-07-112-2/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: - This change ensures SHD always inspects directories which are queued for healing; i.e. it will not exclusively trust the wise-fool algorithm, as there are cases where the change log simply isn't correct (bugs, crashes, etc). Failing to perform the entry heal in these cases will result in data heals failing to take place. - We made a similar change in 3.4.x for similar reasons Test Plan: - Run prove -v tests/basic/shd_force_inspect.t Reviewers: moox, dph, sshreyas Reviewed By: sshreyas Differential Revision: https://phabricator.fb.com/D2492993 Tasks: 8549168 Signature: t1:2492993:1443740894:7cf07168ca09946df9d8f96a3085fe2d3c201543 Change-Id: I2d8e1cbecbbca720cc3ee988d7aae08bea0a5453
* | cluster/afr: GFID unsplit improvementsRichard Wareing2017-07-117-105/+193
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: - Few improvements: handle type mis-matches (e.g. dir/file mis-matches), added in an option to control whether gfid unsplits will happen, ensured entry healing will happen in the gfid mis-match case when the option is enabled. - Added prove test to cover entry healing & type mis-match cases - Enable metadata split-brain resolution by default - Enable gfid split-brain resolution by default - Fix gfid unsplit logging bugs where it was showing null GFIDs instead of the actual chosen GFIDs Test Plan: - run prove -v test/basic/gfid_unsplit* - Ran valgrind to verify leak-free state Reviewers: moox, sshreyas Reviewed By: sshreyas Change-Id: Id67ddc728745ebbbaf7bdd3f9a5549e5a4cc4a20 Signed-off-by: Jeff Darcy <jdarcy@fb.com> Change-Id: I4181233f9ba7f61ccd2ba91f0874eb2ac7cd40b5 Manually-merged-by: Jeff Darcy <jdarcy@fb.com> Signed-off-by: Jeff Darcy <jdarcy@fb.com> Reviewed-on: https://review.gluster.org/17739 Tested-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* | cluster/afr: Adjust gfid unsplit flow for proper correctness w/ AFR2Richard Wareing2017-07-073-8/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: - Prior patch did not re-run the gfid-mismatch flow after doing the unsplit. I think this is prudent to re-validate the unsplit worked as well as allow the code to continue from where it effectively left off. Test Plan: - Run prove -v tests/basic/gfid_unsplit.t Reviewers: dph, moox, sshreyas Reviewed By: sshreyas Change-Id: Ib3ed40f3db38c89090a876d7af3a1b2a303539d5 Signed-off-by: Jeff Darcy <jdarcy@fb.com> Reviewed-on: https://review.gluster.org/17729 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* | cluster/afr: Non-destructive GFID unsplit brain support for v3.6.xRichard Wareing2017-07-065-13/+426
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: - v3.6.3 port of non-destructive GFID unsplit-brain code, almost a re-write for AFR2, but the original behavior lives on. - This feature allows the GlusterFS filesystem to automagically resolve GFID split-brain situations by choosing the authoritative file based on the last modification time. Other policies such as majority or size are also possible but not implemented just yet. - Core feature to Halo Geo-Replication, as this (gfid) form of split-brain is an everyday possibility with async mounts, so there needs to be an automated & scalable method to resolve them via the SHD or optionally in-line by FUSE clients or NFS daemons. - Operational notes: 1. Files or directory entries are supported, you can even write files into a directory and they will not be lost. 2. Streamed writes to a file are fully supported while a split-brain resolution happens, i.e. the writes will not be interrupted while the unsplit takes place. 3. Un-split files (ones which are determined not to be "authoritative") are renamed like so: ".<filename>_<random uuid>" Test Plan: - Run prove -v tests/basic/gfid_unsplit.t - Test output: https://phabricator.fb.com/P20041740 Reviewers: moox, dph, sshreyas Reviewed By: sshreyas Differential Revision: https://phabricator.fb.com/D2479409 Signature: t1:2479409:1443208319:373218aa9758a1b48db23ea5e211ec303fa92e64 Blame Revision: Change-Id: I5b3d2e79fad74b4372c02b86219e8ee98f5e29dc Change-Id: I8ef719bcccb19ab6674647e02b72e1b36155fed9 Signed-off-by: Jeff Darcy <jdarcy@fb.com> Reviewed-on: https://review.gluster.org/17720 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: Jeff Darcy <jeff@pl.atyp.us> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
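The last-modification-time policy and the rename convention from the operational notes can be modeled in a few lines. This is a sketch under stated assumptions, not the AFR2 implementation: `Replica`, `pick_authoritative`, `unsplit_name`, and `plan_unsplit` are hypothetical names, and the tie-breaking (first replica wins on equal mtime) is a choice made here, since the commit message does not specify it.

```python
import uuid
from dataclasses import dataclass


@dataclass
class Replica:
    brick: str    # which brick holds this copy (hypothetical field)
    gfid: str     # the GFID this copy carries
    mtime: float  # last modification time, seconds


def pick_authoritative(replicas):
    """Choose the copy with the latest mtime; ties go to the first one seen."""
    return max(replicas, key=lambda r: r.mtime)


def unsplit_name(filename):
    """Build the ".<filename>_<random uuid>" name used for non-authoritative copies."""
    return f".{filename}_{uuid.uuid4()}"


def plan_unsplit(filename, replicas):
    """Return the winner and a brick -> new-name map for copies to rename."""
    winner = pick_authoritative(replicas)
    renames = {
        r.brick: unsplit_name(filename)
        for r in replicas
        if r.gfid != winner.gfid  # copies already agreeing with the winner stay put
    }
    return winner, renames
```

Renaming rather than deleting the losing copies is what makes the resolution non-destructive: the data remains available under the dot-prefixed name.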
* | Build/test fixes - build_env, tirpc, mem-pool, cleanupJeff Darcy2017-07-068-5/+21
| | | | | | | | | | | | | | | | | | | | | | Differential Revision: https://phabricator.intern.facebook.com/D5376801 Change-Id: I5bf733a395ef2b85065200fa5810ced27ee0d682 Reviewed-on: https://review.gluster.org/17719 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* | tests: forward-porting test fixes from 3.6 branchJeff Darcy2017-07-059-3/+11
| | | | | | | | | | | | | | | | | | | | Change-Id: I4074e7cce8f6782860f849780ab6d0458e92a2ce Signed-off-by: Jeff Darcy <jdarcy@fb.com> Reviewed-on: https://review.gluster.org/17708 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* | cluster/afr: Fix metadata split-brain flow (HOTFIX)Richard Wareing2017-07-051-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: - The metadata heal flow for some reason likes to tinker with the sink states prior to having the source finalized; this broke the policy-based unsplit flow. This patch fixes it by simply setting those children who aren't the favorite as sinks. Test Plan: - Tested against some reported instances Reviewers: moox, sshreyas, dph Reviewed By: sshreyas Differential Revision: https://phabricator.fb.com/D2481527 Signature: t1:2481527:1443215555:1165d8eb5f3dec216ec3ff0795d9837712906b1d Blame Revision: Change-Id: I56f96fdcef32dd4fc5d35958148d0e56d142d5e4 Change-Id: I16aa445a22c3bcd7b589954e2da513ed53822d5b Signed-off-by: Jeff Darcy <jdarcy@fb.com> Reviewed-on: https://review.gluster.org/17682 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> Tested-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* | Revert "cluster/afr: Fix metadata split-brain flow (HOTFIX)"Jeff Darcy2017-07-031-17/+0
| | | | | | | | This reverts commit 992a9f8494a358f828eeef34b46e9f5ccfca1d3b.