-rw-r--r-- | doc/stat-prefetch-design.txt | 89 |
1 files changed, 45 insertions, 44 deletions
diff --git a/doc/stat-prefetch-design.txt b/doc/stat-prefetch-design.txt
index 8d3896f2633..06d0ad37e7d 100644
--- a/doc/stat-prefetch-design.txt
+++ b/doc/stat-prefetch-design.txt
@@ -14,74 +14,74 @@ lookup (stat) calls on each directory entry.
    the stat call).
 2. To maintain the correctness, it does lookup-behind - lookup is winded to
    underlying translators after it is unwound to upper translators.
-   A lookup-behind is necessary as inode gets populated in server inode table
-   only in lookup-cbk. Also various translators store their contexts in inode
-   contexts during lookup calls.
+   lookup-behind is necessary as inode gets populated in server inode table
+   only in lookup-cbk and also because various translators store their
+   contexts in inode contexts during lookup calls.
 
 fops to be implemented:
-======================
+=======================
 * lookup
-  Check the dentry cache stored in context of fds opened by the same process
-  on parent inode for basename. If found unwind with cached stat, else wind
-  the lookup call to underlying translators. We also store the stat path in
-  context of inode if the path being looked upon happens to be directory.
-  This stat will be used to fill postparent stat when lookup happens on any of
-  the directory contents.
+  1. check the dentry cache stored in context of fds opened by the same process
+     on parent inode for basename. If found unwind with cached stat, else wind
+     the lookup call to underlying translators.
+  2. stat is stored in the context of inode if the path being looked upon
+     happens to be directory. This stat will be used to fill postparent stat
+     when lookup happens on any of the directory contents.
 
 * readdir
-  1. Cache the direntries returned in readdir_cbk in the context of fd.
-  2. If the readdir is happening on non-expected offsets (means a seekdir/rewinddir
+  1. cache the direntries returned in readdir_cbk in the context of fd.
+  2. if the readdir is happening on non-expected offsets (means a seekdir/rewinddir
      has happened), cache has to be flushed.
-  3. Delete the entry corresponding to basename of path on which fd is opened
+  3. delete the entry corresponding to basename of path on which fd is opened
      from cache stored in parent.
 
 * chmod/fchmod
-  Delete the entry corresponding to basename from cache stored in context of
-  fds opened on parent inode, since these calls change st_mode and ctime of
+  delete the entry corresponding to basename from cache stored in context of
+  fds opened on parent inode, since these calls change st_mode and st_ctime of
   stat.
 
 * chown/fchown
-  Delete the entry corresponding to basename from cache stored in context of
+  delete the entry corresponding to basename from cache stored in context of
   fds opened on parent inode, since these calls change st_uid/st_gid and st_ctime
   of stat.
 
 * truncate/ftruncate
-  Delete the entry corresponding to basename from cache stored in context of
+  delete the entry corresponding to basename from cache stored in context of
   fds opened on parent inode, since these calls change st_size/st_mtime of stat.
 
 * utimens
-  Delete the entry corresponding to basename from cache stored in context of
+  delete the entry corresponding to basename from cache stored in context of
   fds opened on parent inode, since this call changes st_atime/st_mtime of stat.
 
 * readlink
-  Delete the entry corresponding to basename from cache stored in context of fds
+  delete the entry corresponding to basename from cache stored in context of fds
   opened on parent inode, since this call changes st_atime of stat.
 
 * unlink
-  1. Delete the entry corresponding to basename from cache stored in context of
-     fds opened on parent directory containing file being unlinked.
-  2. Delete the entry corresponding to basename of parent directory from cache
-     of its parent directory.
+  1. delete the entry corresponding to basename from cache stored in context of
+     fds, opened on parent directory containing the file being unlinked.
+  2. delete the entry corresponding to basename of parent directory from cache
+     of grand-parent.
 
 * rmdir
-  1. Delete the entry corresponding to basename from cache stored in context of
+  1. delete the entry corresponding to basename from cache stored in context of
      fds opened on parent inode.
-  2. Remove the entire cache from all fds opened on inode corresponding to
+  2. remove the entire cache from all fds opened on inode corresponding to
      directory being removed.
-  3. Delete the entry correspondig to basename of parent from cache stored in
+  3. delete the entry correspondig to basename of parent from cache stored in
      grand-parent.
 
 * readv
-  Delete the entry corresponding to basename from cache stored in context of fds
+  delete the entry corresponding to basename from cache stored in context of fds
   opened on parent inode, since readv changes st_atime of file.
 
 * writev
-  Delete the entry corresponding to basename from cache stored in context of fds
+  delete the entry corresponding to basename from cache stored in context of fds
   opened on parent inode, since writev can possibly change st_size and definitely
   changes st_mtime of file.
 
 * fsync
-  There is a confusion here as to whether fsync updates mtime/ctimes. Disk based
+  there is a confusion here as to whether fsync updates mtime/ctimes. Disk based
   filesystems (atleast ext2) just writes the times stored in inode to disk during
   fsync and not the time at which fsync is being done. But in glusterfs, a
   translator like write-behind actually sends writes during fsync which will
@@ -95,26 +95,27 @@ fops to be implemented:
   2. remove entry corresponding to newname from cache stored in fd contexts of
      newparent.
   3. remove entry corresponding to oldparent from cache stored in
-     old-grand-parent.
+     old-grand-parent, since removing oldname changes st_mtime and st_ctime
+     of oldparent stat.
   4. remove entry corresponding to newparent from cache stored in
-     new-grand-parent.
+     new-grand-parent, since adding newname changes st_mtime and st_ctime
+     of newparent stat.
   5. if oldname happens to be a directory, remove entire cache from all fds
      opened on it.
 
-
 * create/mknod/mkdir/symlink/link
-  Delete entry corresponding to basename of directory in which these operations
-  are happening, from cache stored in context of fds of parent directory. Note
-  that the parent directory containing the cahce is of the directory in which
-  these operations are happening.
+  delete entry corresponding to basename of parent directory in which these
+  operations are happening, from cache stored in context of fds opened on
+  grand-parent, since adding a new entry to a directory changes st_mtime
+  and st_ctime of parent directory.
 
 * setxattr/removexattr
-  Delete the entry corresponding to basename from cache stored in context of fds
-  opened on parent inode, since setxattr changes st_ctime of file.
+  delete the entry corresponding to basename from cache stored in context of
+  fds opened on parent inode, since setxattr changes st_ctime of file.
 
 * setdents
   1. remove entry corresponding to basename of path on which fd is opened from
-     cache stored in parent.
+     cache stored in context of fds opened on parent.
   2. for each of the entry in the direntry list, delete from cache stored in
      context of fd, the entry corresponding to basename of path being passed.
 
@@ -125,20 +126,20 @@ fops to be implemented:
   would've changed st_atime.
 
 * checksum
-  Delete the entry corresponding to basename from cache stored in context of
+  delete the entry corresponding to basename from cache stored in context of
   fds opened on parent inode, since st_atime is changed during this call.
 
 * xattrop/fxattrop
-  Delete the entry corresponding to basename from cache stored in context of fds
+  delete the entry corresponding to basename from cache stored in context of fds
   opened on parent inode, since these calls modify st_ctime of file.
 
 callbacks to be implemented:
-=======================
+============================
 * releasedir
-  Flush the stat-prefetch cache.
+  free the context stored in fd.
 
 * forget
-  Free the stat if the inode corresponds to a directory.
+  dree the stat if the inode corresponds to a directory.
 
 limitations:
 ============