diff options
author | Joseph Fernandes <josferna@redhat.com> | 2015-04-24 19:22:44 +0530 |
---|---|---|
committer | Vijay Bellur <vbellur@redhat.com> | 2015-05-06 04:40:55 -0700 |
commit | cb11dd91a6cc296e4a3808364077f4eacb810e48 (patch) | |
tree | 8c048c7e4f2ff13a2a47e893ff5a666082fef94f /tests/basic/tier | |
parent | 306585d2e57aadc7d15951ab1114d49fd9dbf5aa (diff) |
ctr/xlator: Named lookup heal of pre-existing files, before ctr was ON.
Problem: The CTR xlator records file meta (heat/hardlinks)
into the data. This works fine for files which are created
after ctr xlator is switched ON. But for files which were
created before CTR xlator is ON, CTR xlator is not able to
record either of the meta i.e heat or hardlinks. Thus making
those files immune to promotions/demotions.
Solution: The solution that is implemented in this patch is
do ctr-db heal of all those pre-existent files, using named lookup.
For this purpose we use the inode-xlator context variable option
in gluster.
The inode-xlator context variable for ctr xlator will have the
following,
a. A Lock for the context variable
b. A hardlink list: This list represents the successful looked
up hardlinks.
These are the scenarios when the hardlink list is updated:
1) Named-Lookup: Whenever a named lookup happens on a file, in the
wind path we copy all required hardlink and inode information to
ctr_db_record structure, which resides in the frame->local variable.
We dont update the database in wind. During the unwind, we read the
information from the ctr_db_record and ,
Check if the inode context variable is created, if not we create it.
Check if the hard link is there in the hardlink list.
If its not there we add it to the list and send a update to the
database using libgfdb.
Please note: The database transaction can fail(and we ignore) as there
already might be a record in the db. This update to the db is to heal
if its not there.
If its there in the list we ignore it.
2) Inode Forget: Whenever an inode forget hits we clear the hardlink list in
the inode context variable and delete the inode context variable.
Please note: An inode forget may happen for two reason,
a. when the inode is delete.
b. the in-memory inode is evicted from the inode table due to cache limits.
3) create: whenever a create happens we create the inode context variable and
add the hardlink. The database updation is done as usual by ctr.
4) link: whenever a hardlink is created for the inode, we create the inode context
variable, if not present, and add the hardlink to the list.
5) unlink: whenever a unlink happens we delete the hardlink from the list.
6) mknod: same as create.
7) rename: whenever a rename happens we update the hardlink in list. if the hardlink
was not present for updation, we add the hardlink to the list.
What is pending:
1) This solution will only work for named lookups.
2) We dont track afr-self-heal/dht-rebalancer traffic for healing.
Change-Id: Ia4bbaf84128ad6ce8c3ddd70bcfa82894c79585f
BUG: 1212037
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/10370
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Diffstat (limited to 'tests/basic/tier')
-rwxr-xr-x | tests/basic/tier/tier_lookup_heal.t | 72 |
1 files changed, 72 insertions, 0 deletions
diff --git a/tests/basic/tier/tier_lookup_heal.t b/tests/basic/tier/tier_lookup_heal.t new file mode 100755 index 00000000000..2b778f749f9 --- /dev/null +++ b/tests/basic/tier/tier_lookup_heal.t @@ -0,0 +1,72 @@ +#!/bin/bash + +. $(dirname $0)/../../include.rc +. $(dirname $0)/../../volume.rc + +LAST_BRICK=1 +CACHE_BRICK_FIRST=2 +CACHE_BRICK_LAST=3 +PROMOTE_TIMEOUT=5 + +function file_on_fast_tier { + local ret="1" + + s1=$(md5sum $1) + s2=$(md5sum $B0/${V0}${CACHE_BRICK_FIRST}/$1) + + if [ -e $B0/${V0}${CACHE_BRICK_FIRST}/$1 ] && ! [ "$s1" == "$s2" ]; then + echo "0" + else + echo "1" + fi +} + +cleanup + + +TEST glusterd +TEST pidof glusterd + +TEST $CLI volume create $V0 replica 2 $H0:$B0/${V0}{0..$LAST_BRICK} +TEST $CLI volume start $V0 +TEST $GFS --volfile-id=/$V0 --volfile-server=$H0 $M0; + +# Create files before CTR xlator is on. +cd $M0 +TEST stat . +TEST touch file1 +TEST stat file1 + +# gf_file_tb and gf_flink_tb should be empty +ENTRY_COUNT=$(echo "select * from gf_file_tb; select * from gf_flink_tb;" | \ + sqlite3 $B0/${V0}$LAST_BRICK/.glusterfs/${V0}$LAST_BRICK.db | wc -l ) +TEST [ $ENTRY_COUNT -eq 0 ] + + +#Attach tier and switch ON CTR Xlator. +TEST $CLI volume attach-tier $V0 replica 2 $H0:$B0/${V0}$CACHE_BRICK_FIRST $H0:$B0/${V0}$CACHE_BRICK_LAST +TEST $CLI volume set $V0 features.ctr-enabled on +TEST $CLI volume set $V0 cluster.tier-demote-frequency 4 +TEST $CLI volume set $V0 cluster.tier-promote-frequency 4 +TEST $CLI volume set $V0 cluster.read-freq-threshold 0 +TEST $CLI volume set $V0 cluster.write-freq-threshold 0 +TEST $CLI volume set $V0 performance.quick-read off +TEST $CLI volume set $V0 performance.io-cache off + +#The lookup should heal the database. +TEST ls file1 + +# gf_file_tb and gf_flink_tb should NOT be empty +ENTRY_COUNT=$(echo "select * from gf_file_tb; select * from gf_flink_tb;" | \ + sqlite3 $B0/${V0}$LAST_BRICK/.glusterfs/${V0}$LAST_BRICK.db | wc -l ) +TEST [ $ENTRY_COUNT -eq 2 ] + +# Heat-up the file +uuidgen > file1 +TEST $CLI volume rebalance $V0 tier start +sleep 5 + +#Check if the file is promoted +EXPECT_WITHIN $PROMOTE_TIMEOUT "0" file_on_fast_tier file1 + +cleanup |