Monitoring your GlusterFS Workload You can monitor the GlusterFS volumes on different parameters. Monitoring volumes helps in capacity planning and performance tuning tasks of the GlusterFS volume. Using these information, you can identify and troubleshoot issues. You can use Volume Top and Profile commands to view the performance and identify bottlenecks/hotspots of each brick of a volume. This helps system administrators to get vital performance information whenever performance needs to be probed. You can also perform statedump of the brick processes and nfs server process of a volume, and also view volume status and volume information.
Running GlusterFS Volume Profile Command GlusterFS Volume Profile command provides an interface to get the per-brick I/O information for each File Operation (FOP) of a volume. The per brick information helps in identifying bottlenecks in the storage system. This section describes how to run GlusterFS Volume Profile command by performing the following operations:
Start Profiling You must start the Profiling to view the File Operation information for each brick. To start profiling: Start profiling using the following command: # gluster volume profile VOLNAME start For example, to start profiling on test-volume: # gluster volume profile test-volume start Profiling started on test-volume When profiling on the volume is started, the following additional options are displayed in the Volume Info: diagnostics.count-fop-hits: on diagnostics.latency-measurement: on
Displaying the I/0 Information You can view the I/O information of each brick. To display I/O information: Display the I/O information using the following command: # gluster volume profile VOLNAME info For example, to see the I/O information on test-volume: # gluster volume profile test-volume info Brick: Test:/export/2 Cumulative Stats: Block 1b+ 32b+ 64b+ Size: Read: 0 0 0 Write: 908 28 8 Block 128b+ 256b+ 512b+ Size: Read: 0 6 4 Write: 5 23 16 Block 1024b+ 2048b+ 4096b+ Size: Read: 0 52 17 Write: 15 120 846 Block 8192b+ 16384b+ 32768b+ Size: Read: 52 8 34 Write: 234 134 286 Block 65536b+ 131072b+ Size: Read: 118 622 Write: 1341 594 %-latency Avg- Min- Max- calls Fop latency Latency Latency ___________________________________________________________ 4.82 1132.28 21.00 800970.00 4575 WRITE 5.70 156.47 9.00 665085.00 39163 READDIRP 11.35 315.02 9.00 1433947.00 38698 LOOKUP 11.88 1729.34 21.00 2569638.00 7382 FXATTROP 47.35 104235.02 2485.00 7789367.00 488 FSYNC ------------------ ------------------ Duration : 335 BytesRead : 94505058 BytesWritten : 195571980
Stop Profiling You can stop profiling the volume, if you do not need profiling information anymore. To stop profiling Stop profiling using the following command: # gluster volume profile VOLNAME stop For example, to stop profiling on test-volume: # gluster volume profile test-volume stop Profiling stopped on test-volume
Running GlusterFS Volume TOP Command GlusterFS Volume Top command allows you to view the glusterfs bricks’ performance metrics like read, write, file open calls, file read calls, file write calls, directory open calls, and directory real calls. The top command displays up to 100 results. This section describes how to run and view the results for the following GlusterFS Top commands:
Viewing Open fd Count and Maximum fd Count You can view both current open fd count (list of files that are currently the most opened and the count) on the brick and the maximum open fd count (count of files that are the currently open and the count of maximum number of files opened at any given point of time, since the servers are up and running). If the brick name is not specified, then open fd metrics of all the bricks belonging to the volume will be displayed. To view open fd count and maximum fd count: View open fd count and maximum fd count using the following command: # gluster volume top VOLNAME open [brick BRICK-NAME] [list-cnt cnt] For example, to view open fd count and maximum fd count on brick server:/export of test-volume and list top 10 open calls: # gluster volume top test-volume open brick server:/export/ list-cnt 10 Brick: server:/export/dir1 Current open fd's: 34 Max open fd's: 209 ==========Open file stats======== open file name call count 2 /clients/client0/~dmtmp/PARADOX/ COURSES.DB 11 /clients/client0/~dmtmp/PARADOX/ ENROLL.DB 11 /clients/client0/~dmtmp/PARADOX/ STUDENTS.DB 10 /clients/client0/~dmtmp/PWRPNT/ TIPS.PPT 10 /clients/client0/~dmtmp/PWRPNT/ PCBENCHM.PPT 9 /clients/client7/~dmtmp/PARADOX/ STUDENTS.DB 9 /clients/client1/~dmtmp/PARADOX/ STUDENTS.DB 9 /clients/client2/~dmtmp/PARADOX/ STUDENTS.DB 9 /clients/client0/~dmtmp/PARADOX/ STUDENTS.DB 9 /clients/client8/~dmtmp/PARADOX/ STUDENTS.DB
Viewing Highest File Read Calls You can view highest read calls on each brick. If brick name is not specified, then by default, list of 100 files will be displayed. To view highest file Read calls: View highest file Read calls using the following command: # gluster volume top VOLNAME read [brick BRICK-NAME] [list-cnt cnt] For example, to view highest Read calls on brick server:/export of test-volume: # gluster volume top test-volume read brick server:/export list-cnt 10 Brick: server:/export/dir1 ==========Read file stats======== read filename call count 116 /clients/client0/~dmtmp/SEED/LARGE.FIL 64 /clients/client0/~dmtmp/SEED/MEDIUM.FIL 54 /clients/client2/~dmtmp/SEED/LARGE.FIL 54 /clients/client6/~dmtmp/SEED/LARGE.FIL 54 /clients/client5/~dmtmp/SEED/LARGE.FIL 54 /clients/client0/~dmtmp/SEED/LARGE.FIL 54 /clients/client3/~dmtmp/SEED/LARGE.FIL 54 /clients/client4/~dmtmp/SEED/LARGE.FIL 54 /clients/client9/~dmtmp/SEED/LARGE.FIL 54 /clients/client8/~dmtmp/SEED/LARGE.FIL
Viewing Highest File Write Calls You can view list of files which has highest file write calls on each brick. If brick name is not specified, then by default, list of 100 files will be displayed. To view highest file Write calls: View highest file Write calls using the following command: # gluster volume top VOLNAME write [brick BRICK-NAME] [list-cnt cnt] For example, to view highest Write calls on brick server:/export of test-volume: # gluster volume top test-volume write brick server:/export list-cnt 10 Brick: server:/export/dir1 ==========Write file stats======== write call count filename 83 /clients/client0/~dmtmp/SEED/LARGE.FIL 59 /clients/client7/~dmtmp/SEED/LARGE.FIL 59 /clients/client1/~dmtmp/SEED/LARGE.FIL 59 /clients/client2/~dmtmp/SEED/LARGE.FIL 59 /clients/client0/~dmtmp/SEED/LARGE.FIL 59 /clients/client8/~dmtmp/SEED/LARGE.FIL 59 /clients/client5/~dmtmp/SEED/LARGE.FIL 59 /clients/client4/~dmtmp/SEED/LARGE.FIL 59 /clients/client6/~dmtmp/SEED/LARGE.FIL 59 /clients/client3/~dmtmp/SEED/LARGE.FIL
Viewing Highest Open Calls on Directories You can view list of files which has highest open calls on directories of each brick. If brick name is not specified, then the metrics of all the bricks belonging to that volume will be displayed. To view list of open calls on each directory View list of open calls on each directory using the following command: # gluster volume top VOLNAME opendir [brick BRICK-NAME] [list-cnt cnt] For example, to view open calls on brick server:/export/ of test-volume: # gluster volume top test-volume opendir brick server:/export list-cnt 10 Brick: server:/export/dir1 ==========Directory open stats======== Opendir count directory name 1001 /clients/client0/~dmtmp 454 /clients/client8/~dmtmp 454 /clients/client2/~dmtmp 454 /clients/client6/~dmtmp 454 /clients/client5/~dmtmp 454 /clients/client9/~dmtmp 443 /clients/client0/~dmtmp/PARADOX 408 /clients/client1/~dmtmp 408 /clients/client7/~dmtmp 402 /clients/client4/~dmtmp
Viewing Highest Read Calls on Directory You can view list of files which has highest directory read calls on each brick. If brick name is not specified, then the metrics of all the bricks belonging to that volume will be displayed. To view list of highest directory read calls on each brick View list of highest directory read calls on each brick using the following command: # gluster volume top VOLNAME readdir [brick BRICK-NAME] [list-cnt cnt] For example, to view highest directory read calls on brick server:/export of test-volume: # gluster volume top test-volume readdir brick server:/export list-cnt 10 Brick: server:/export/dir1==========Directory readdirp stats======== readdirp count directory name 1996 /clients/client0/~dmtmp 1083 /clients/client0/~dmtmp/PARADOX 904 /clients/client8/~dmtmp 904 /clients/client2/~dmtmp 904 /clients/client6/~dmtmp 904 /clients/client5/~dmtmp 904 /clients/client9/~dmtmp 812 /clients/client1/~dmtmp 812 /clients/client7/~dmtmp 800 /clients/client4/~dmtmp
Viewing List of Read Performance on each Brick You can view the read throughput of files on each brick. If brick name is not specified, then the metrics of all the bricks belonging to that volume will be displayed. The output will be the read throughput. ==========Read throughput file stats======== read filename Time through put(MBp s) 2570.00 /clients/client0/~dmtmp/PWRPNT/ -2011-01-31 TRIDOTS.POT 15:38:36.894610 2570.00 /clients/client0/~dmtmp/PWRPNT/ -2011-01-31 PCBENCHM.PPT 15:38:39.815310 2383.00 /clients/client2/~dmtmp/SEED/ -2011-01-31 MEDIUM.FIL 15:52:53.631499 2340.00 /clients/client0/~dmtmp/SEED/ -2011-01-31 MEDIUM.FIL 15:38:36.926198 2299.00 /clients/client0/~dmtmp/SEED/ -2011-01-31 LARGE.FIL 15:38:36.930445 2259.00 /clients/client0/~dmtmp/PARADOX/ -2011-01-31 COURSES.X04 15:38:40.549919 2221.00 /clients/client0/~dmtmp/PARADOX/ -2011-01-31 STUDENTS.VAL 15:52:53.298766 2221.00 /clients/client3/~dmtmp/SEED/ -2011-01-31 COURSES.DB 15:39:11.776780 2184.00 /clients/client3/~dmtmp/SEED/ -2011-01-31 MEDIUM.FIL 15:39:10.251764 2184.00 /clients/client5/~dmtmp/WORD/ -2011-01-31 BASEMACH.DOC 15:39:09.336572 This command will initiate a dd for the specified count and block size and measures the corresponding throughput. To view list of read performance on each brick View list of read performance on each brick using the following command: # gluster volume top VOLNAME read-perf [bs blk-size count count] [brick BRICK-NAME] [list-cnt cnt] For example, to view read performance on brick server:/export/ of test-volume, 256 block size of count 1, and list count 10: # gluster volume top test-volume read-perf bs 256 count 1 brick server:/export/ list-cnt 10 Brick: server:/export/dir1 256 bytes (256 B) copied, Throughput: 4.1 MB/s ==========Read throughput file stats======== read filename Time through put(MBp s) 2912.00 /clients/client0/~dmtmp/PWRPNT/ -2011-01-31 TRIDOTS.POT 15:38:36.896486 2570.00 /clients/client0/~dmtmp/PWRPNT/ -2011-01-31 PCBENCHM.PPT 15:38:39.815310 2383.00 /clients/client2/~dmtmp/SEED/ -2011-01-31 MEDIUM.FIL 15:52:53.631499 2340.00 /clients/client0/~dmtmp/SEED/ -2011-01-31 MEDIUM.FIL 15:38:36.926198 2299.00 /clients/client0/~dmtmp/SEED/ -2011-01-31 LARGE.FIL 15:38:36.930445 2259.00 /clients/client0/~dmtmp/PARADOX/ -2011-01-31 COURSES.X04 15:38:40.549919 2221.00 /clients/client9/~dmtmp/PARADOX/ -2011-01-31 STUDENTS.VAL 15:52:53.298766 2221.00 /clients/client8/~dmtmp/PARADOX/ -2011-01-31 COURSES.DB 15:39:11.776780 2184.00 /clients/client3/~dmtmp/SEED/ -2011-01-31 MEDIUM.FIL 15:39:10.251764 2184.00 /clients/client5/~dmtmp/WORD/ -2011-01-31 BASEMACH.DOC 15:39:09.336572
Viewing List of Write Performance on each Brick You can view list of write throughput of files on each brick. If brick name is not specified, then the metrics of all the bricks belonging to that volume will be displayed. The output will be the write throughput. This command will initiate a dd for the specified count and block size and measures the corresponding throughput. To view list of write performance on each brick: View list of write performance on each brick using the following command: # gluster volume top VOLNAME write-perf [bs blk-size count count] [brick BRICK-NAME] [list-cnt cnt] For example, to view write performance on brick server:/export/ of test-volume, 256 block size of count 1, and list count 10: # gluster volume top test-volume write-perf bs 256 count 1 brick server:/export/ list-cnt 10 Brick: server:/export/dir1 256 bytes (256 B) copied, Throughput: 2.8 MB/s ==========Write throughput file stats======== write filename Time throughput (MBps) 1170.00 /clients/client0/~dmtmp/SEED/ -2011-01-31 SMALL.FIL 15:39:09.171494 1008.00 /clients/client6/~dmtmp/SEED/ -2011-01-31 LARGE.FIL 15:39:09.73189 949.00 /clients/client0/~dmtmp/SEED/ -2011-01-31 MEDIUM.FIL 15:38:36.927426 936.00 /clients/client0/~dmtmp/SEED/ -2011-01-31 LARGE.FIL 15:38:36.933177 897.00 /clients/client5/~dmtmp/SEED/ -2011-01-31 MEDIUM.FIL 15:39:09.33628 897.00 /clients/client6/~dmtmp/SEED/ -2011-01-31 MEDIUM.FIL 15:39:09.27713 885.00 /clients/client0/~dmtmp/SEED/ -2011-01-31 SMALL.FIL 15:38:36.924271 528.00 /clients/client5/~dmtmp/SEED/ -2011-01-31 LARGE.FIL 15:39:09.81893 516.00 /clients/client6/~dmtmp/ACCESS/ -2011-01-31 FASTENER.MDB 15:39:01.797317
Displaying Volume Information You can display information about a specific volume, or all volumes, as needed. To display volume information Display information about a specific volume using the following command: # gluster volume info VOLNAME For example, to display information about test-volume: # gluster volume info test-volume Volume Name: test-volume Type: Distribute Status: Created Number of Bricks: 4 Bricks: Brick1: server1:/exp1 Brick2: server2:/exp2 Brick3: server3:/exp3 Brick4: server4:/exp4 Display information about all volumes using the following command: # gluster volume info all # gluster volume info all Volume Name: test-volume Type: Distribute Status: Created Number of Bricks: 4 Bricks: Brick1: server1:/exp1 Brick2: server2:/exp2 Brick3: server3:/exp3 Brick4: server4:/exp4 Volume Name: mirror Type: Distributed-Replicate Status: Started Number of Bricks: 2 X 2 = 4 Bricks: Brick1: server1:/brick1 Brick2: server2:/brick2 Brick3: server3:/brick3 Brick4: server4:/brick4 Volume Name: Vol Type: Distribute Status: Started Number of Bricks: 1 Bricks: Brick: server:/brick6
Performing Statedump on a Volume Statedump is a mechanism through which you can get details of all internal variables and state of the glusterfs process at the time of issuing the command.You can perform statedumps of the brick processes and nfs server process of a volume using the statedump command. The following options can be used to determine what information is to be dumped: mem - Dumps the memory usage and memory pool details of the bricks. iobuf - Dumps iobuf details of the bricks. priv - Dumps private information of loaded translators. callpool - Dumps the pending calls of the volume. fd - Dumps the open fd tables of the volume. inode - Dumps the inode tables of the volume. To display volume statedump Display statedump of a volume or NFS server using the following command: # gluster volume statedump VOLNAME [nfs] [all|mem|iobuf|callpool|priv|fd|inode] For example, to display statedump of test-volume: # gluster volume statedump test-volume Volume statedump successful The statedump files are created on the brick servers in the /tmp directory or in the directory set using server.statedump-path volume option. The naming convention of the dump file is <brick-path>.<brick-pid>.dump. By defult, the output of the statedump is stored at /tmp/<brickname.PID.dump> file on that particular server. Change the directory of the statedump file using the following command: # gluster volume set VOLNAME server.statedump-path path For example, to change the location of the statedump file of test-volume: # gluster volume set test-volume server.statedump-path /usr/local/var/log/glusterfs/dumps/ Set volume successful You can view the changed path of the statedump file using the following command: # gluster volume info VOLNAME
Displaying Volume Status You can display the status information about a specific volume, brick or all volumes, as needed. Status information can be used to understand the current status of the brick, nfs processes, and overall file system. Status information can also be used to monitor and debug the volume information. You can view status of the volume along with the following details: detail - Displays additional information about the bricks. clients - Displays the list of clients connected to the volume. mem - Displays the memory usage and memory pool details of the bricks. inode - Displays the inode tables of the volume. fd - Displays the open fd (file descriptors) tables of the volume. callpool - Displays the pending calls of the volume. To display volume status Display information about a specific volume using the following command: # gluster volume status [all|VOLNAME [BRICKNAME]] [detail|clients|mem|inode|fd|callpool] For example, to display information about test-volume: # gluster volume status test-volume STATUS OF VOLUME: test-volume BRICK PORT ONLINE PID -------------------------------------------------------- arch:/export/1 24009 Y 22445 -------------------------------------------------------- arch:/export/2 24010 Y 22450 Display information about all volumes using the following command: # gluster volume status all # gluster volume status all STATUS OF VOLUME: volume-test BRICK PORT ONLINE PID -------------------------------------------------------- arch:/export/4 24010 Y 22455 STATUS OF VOLUME: test-volume BRICK PORT ONLINE PID -------------------------------------------------------- arch:/export/1 24009 Y 22445 -------------------------------------------------------- arch:/export/2 24010 Y 22450 Display additional information about the bricks using the following command: # gluster volume status VOLNAME detail For example, to display additional information about the bricks of test-volume: # gluster volume status test-volume details STATUS OF VOLUME: test-volume ------------------------------------------- Brick : arch:/export/1 Port : 24009 Online : Y Pid : 16977 File System : rootfs Device : rootfs Mount Options : rw Disk Space Free : 13.8GB Total Disk Space : 46.5GB Inode Size : N/A Inode Count : N/A Free Inodes : N/A Number of Bricks: 1 Bricks: Brick: server:/brick6 Display the list of clients accessing the volumes using the following command: # gluster volume status VOLNAME clients For example, to display the list of clients connected to test-volume: # gluster volume status test-volume clients Brick : arch:/export/1 Clients connected : 2 Hostname Bytes Read BytesWritten -------- --------- ------------ 127.0.0.1:1013 776 676 127.0.0.1:1012 50440 51200 Display the memory usage and memory pool details of the bricks using the following command: # gluster volume status VOLNAME mem For example, to display the memory usage and memory pool details of the bricks of test-volume: Memory status for volume : test-volume ---------------------------------------------- Brick : arch:/export/1 Mallinfo -------- Arena : 434176 Ordblks : 2 Smblks : 0 Hblks : 12 Hblkhd : 40861696 Usmblks : 0 Fsmblks : 0 Uordblks : 332416 Fordblks : 101760 Keepcost : 100400 Mempool Stats ------------- Name HotCount ColdCount PaddedSizeof AllocCount MaxAlloc ---- -------- --------- ------------ ---------- -------- test-volume-server:fd_t 0 16384 92 57 5 test-volume-server:dentry_t 59 965 84 59 59 test-volume-server:inode_t 60 964 148 60 60 test-volume-server:rpcsvc_request_t 0 525 6372 351 2 glusterfs:struct saved_frame 0 4096 124 2 2 glusterfs:struct rpc_req 0 4096 2236 2 2 glusterfs:rpcsvc_request_t 1 524 6372 2 1 glusterfs:call_stub_t 0 1024 1220 288 1 glusterfs:call_stack_t 0 8192 2084 290 2 glusterfs:call_frame_t 0 16384 172 1728 6 Display the inode tables of the volume using the following command: # gluster volume status VOLNAME inode For example, to display the inode tables of the test-volume: # gluster volume status test-volume inode inode tables for volume test-volume ---------------------------------------------- Brick : arch:/export/1 Active inodes: GFID Lookups Ref IA type ---- ------- --- ------- 6f3fe173-e07a-4209-abb6-484091d75499 1 9 2 370d35d7-657e-44dc-bac4-d6dd800ec3d3 1 1 2 LRU inodes: GFID Lookups Ref IA type ---- ------- --- ------- 80f98abe-cdcf-4c1d-b917-ae564cf55763 1 0 1 3a58973d-d549-4ea6-9977-9aa218f233de 1 0 1 2ce0197d-87a9-451b-9094-9baa38121155 1 0 2 Display the open fd tables of the volume using the following command: # gluster volume status VOLNAME fd For example, to display the open fd tables of the test-volume: # gluster volume status test-volume fd FD tables for volume test-volume ---------------------------------------------- Brick : arch:/export/1 Connection 1: RefCount = 0 MaxFDs = 128 FirstFree = 4 FD Entry PID RefCount Flags -------- --- -------- ----- 0 26311 1 2 1 26310 3 2 2 26310 1 2 3 26311 3 2 Connection 2: RefCount = 0 MaxFDs = 128 FirstFree = 0 No open fds Connection 3: RefCount = 0 MaxFDs = 128 FirstFree = 0 No open fds Display the pending calls of the volume using the following command: # gluster volume status VOLNAME callpool Each call has a call stack containing call frames. For example, to display the pending calls of test-volume: # gluster volume status test-volume Pending calls for volume test-volume ---------------------------------------------- Brick : arch:/export/1 Pending calls: 2 Call Stack1 UID : 0 GID : 0 PID : 26338 Unique : 192138 Frames : 7 Frame 1 Ref Count = 1 Translator = test-volume-server Completed = No Frame 2 Ref Count = 0 Translator = test-volume-posix Completed = No Parent = test-volume-access-control Wind From = default_fsync Wind To = FIRST_CHILD(this)->fops->fsync Frame 3 Ref Count = 1 Translator = test-volume-access-control Completed = No Parent = repl-locks Wind From = default_fsync Wind To = FIRST_CHILD(this)->fops->fsync Frame 4 Ref Count = 1 Translator = test-volume-locks Completed = No Parent = test-volume-io-threads Wind From = iot_fsync_wrapper Wind To = FIRST_CHILD (this)->fops->fsync Frame 5 Ref Count = 1 Translator = test-volume-io-threads Completed = No Parent = test-volume-marker Wind From = default_fsync Wind To = FIRST_CHILD(this)->fops->fsync Frame 6 Ref Count = 1 Translator = test-volume-marker Completed = No Parent = /export/1 Wind From = io_stats_fsync Wind To = FIRST_CHILD(this)->fops->fsync Frame 7 Ref Count = 1 Translator = /export/1 Completed = No Parent = test-volume-server Wind From = server_fsync_resume Wind To = bound_xl->fops->fsync