Monitoring your GlusterFS Workload
You can monitor GlusterFS volumes on different parameters. Monitoring volumes helps with capacity planning and performance tuning of the GlusterFS volume. Using this information, you can identify and troubleshoot issues.
You can use the Volume Top and Profile commands to view performance and identify bottlenecks and hotspots for each brick of a volume, giving system administrators vital performance information whenever it needs to be probed.
You can also perform a statedump of the brick processes and the NFS server process of a volume, and view the volume status and volume information.
Running GlusterFS Volume Profile Command
The GlusterFS Volume Profile command provides an interface to get the per-brick I/O information for each File Operation (FOP) of a volume. The per-brick information helps in identifying bottlenecks in the storage system.
This section describes how to run the GlusterFS Volume Profile command by performing the following operations:
Start Profiling
You must start profiling to view the File Operation information for each brick.
To start profiling:
Start profiling using the following command:
# gluster volume profile VOLNAME start
For example, to start profiling on test-volume:
# gluster volume profile test-volume start
Profiling started on test-volume
When profiling on the volume is started, the following additional options are displayed in the Volume Info:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
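For convenience, this step can be scripted. The following is a minimal bash sketch (test-volume is only an example name; it assumes the gluster CLI shown above is available) that starts profiling and confirms that both diagnostics options are enabled:
#!/bin/bash
# Start profiling on a volume and confirm that the diagnostics
# options shown above were enabled in the volume info.
VOLNAME=test-volume

gluster volume profile "$VOLNAME" start
gluster volume info "$VOLNAME" | grep -E 'diagnostics\.(count-fop-hits|latency-measurement)'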
Displaying the I/O Information
You can view the I/O information of each brick.
To display I/O information:
Display the I/O information using the following command:
# gluster volume profile VOLNAME info
For example, to see the I/O information on test-volume:
# gluster volume profile test-volume info
Brick: Test:/export/2
Cumulative Stats:
Block Size       Read    Write
1b+                 0      908
32b+                0       28
64b+                0        8
128b+               0        5
256b+               6       23
512b+               4       16
1024b+              0       15
2048b+             52      120
4096b+             17      846
8192b+             52      234
16384b+             8      134
32768b+            34      286
65536b+           118     1341
131072b+          622      594
%-latency   Avg-latency   Min-latency   Max-latency   Calls   Fop
---------   -----------   -----------   -----------   -----   --------
     4.82       1132.28         21.00     800970.00    4575   WRITE
     5.70        156.47          9.00     665085.00   39163   READDIRP
    11.35        315.02          9.00    1433947.00   38698   LOOKUP
    11.88       1729.34         21.00    2569638.00    7382   FXATTROP
    47.35     104235.02       2485.00    7789367.00     488   FSYNC
------------------
Duration : 335
BytesRead : 94505058
BytesWritten : 195571980
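Because the cumulative statistics keep growing for as long as profiling is enabled, it is often useful to capture the output at regular intervals and compare snapshots. The following is a minimal bash sketch (the volume name, interval, and output directory are arbitrary example values, not gluster defaults):
#!/bin/bash
# Save a snapshot of the per-brick profile statistics every 5 minutes
# so that successive snapshots can be compared later. Stop with Ctrl+C.
VOLNAME=test-volume
OUTDIR=/var/tmp/gluster-profile        # example location
mkdir -p "$OUTDIR"

while true; do
    gluster volume profile "$VOLNAME" info > "$OUTDIR/profile-$(date +%Y%m%d-%H%M%S).txt"
    sleep 300
done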
Stop Profiling
You can stop profiling the volume if you no longer need the profiling information.
To stop profiling:
Stop profiling using the following command:
# gluster volume profile VOLNAME stop
For example, to stop profiling on test-volume:
# gluster volume profile test-volume stop
Profiling stopped on test-volume
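The start, info, and stop steps can also be combined into a single profiling session. The following bash sketch (the volume name and duration are placeholders) enables profiling, lets the workload run for a fixed period, saves the statistics, and then disables profiling so that the measurement overhead is not left switched on:
#!/bin/bash
# One-shot profiling session: start profiling, wait while the
# workload runs, save the statistics, then stop profiling.
VOLNAME=${1:-test-volume}
DURATION=${2:-600}                     # seconds to keep profiling enabled

gluster volume profile "$VOLNAME" start
sleep "$DURATION"
gluster volume profile "$VOLNAME" info > "profile-${VOLNAME}-$(date +%s).txt"
gluster volume profile "$VOLNAME" stop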
Running GlusterFS Volume TOP Command
The GlusterFS Volume Top command allows you to view the bricks' performance metrics, such as read, write, file open calls, file read calls, file write calls, directory open calls, and directory read calls. The top command displays up to 100 results.
This section describes how to run and view the results for the following GlusterFS Top commands:
Viewing Open fd Count and Maximum fd Count
You can view both the current open fd count (the list of files that are currently open most often on the brick, along with their open counts) and the maximum open fd count (the maximum number of files open at any given point in time since the servers have been up and running). If the brick name is not specified, the open fd metrics of all the bricks belonging to the volume are displayed.
To view open fd count and maximum fd count:
View open fd count and maximum fd count using the following command:
# gluster volume top VOLNAME open [brick BRICK-NAME] [list-cnt cnt]
For example, to view open fd count and maximum fd count on brick server:/export of test-volume and list top 10 open calls:
# gluster volume top test-volume open brick server:/export/ list-cnt 10
Brick: server:/export/dir1
Current open fd's: 34, Max open fd's: 209

==========Open file stats========

open call count     file name
2                   /clients/client0/~dmtmp/PARADOX/COURSES.DB
11                  /clients/client0/~dmtmp/PARADOX/ENROLL.DB
11                  /clients/client0/~dmtmp/PARADOX/STUDENTS.DB
10                  /clients/client0/~dmtmp/PWRPNT/TIPS.PPT
10                  /clients/client0/~dmtmp/PWRPNT/PCBENCHM.PPT
9                   /clients/client7/~dmtmp/PARADOX/STUDENTS.DB
9                   /clients/client1/~dmtmp/PARADOX/STUDENTS.DB
9                   /clients/client2/~dmtmp/PARADOX/STUDENTS.DB
9                   /clients/client0/~dmtmp/PARADOX/STUDENTS.DB
9                   /clients/client8/~dmtmp/PARADOX/STUDENTS.DB
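A steadily growing open fd count can point to clients that never close files. The following is a minimal bash sketch (the volume name, brick path, interval, and log file are example values) that records the open fd summary for one brick over time so that growth is easy to spot:
#!/bin/bash
# Append the current and maximum open fd figures for one brick
# to a log file once a minute. Stop with Ctrl+C.
VOLNAME=test-volume
BRICK=server:/export                   # example brick
LOG=/var/tmp/open-fd.log

while true; do
    echo "=== $(date) ===" >> "$LOG"
    gluster volume top "$VOLNAME" open brick "$BRICK" list-cnt 1 >> "$LOG"
    sleep 60
done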
Viewing Highest File Read Calls
You can view the files with the highest read calls on each brick. If the brick name is not specified, a list of 100 files is displayed by default.
To view highest file Read calls:
View highest file Read calls using the following command:
# gluster volume top VOLNAME read [brick BRICK-NAME] [list-cnt cnt]
For example, to view highest Read calls on brick server:/export of test-volume:
# gluster volume top test-volume read brick server:/export list-cnt 10
Brick: server:/export/dir1

==========Read file stats========

read call count     filename
116 /clients/client0/~dmtmp/SEED/LARGE.FIL
64 /clients/client0/~dmtmp/SEED/MEDIUM.FIL
54 /clients/client2/~dmtmp/SEED/LARGE.FIL
54 /clients/client6/~dmtmp/SEED/LARGE.FIL
54 /clients/client5/~dmtmp/SEED/LARGE.FIL
54 /clients/client0/~dmtmp/SEED/LARGE.FIL
54 /clients/client3/~dmtmp/SEED/LARGE.FIL
54 /clients/client4/~dmtmp/SEED/LARGE.FIL
54 /clients/client9/~dmtmp/SEED/LARGE.FIL
54 /clients/client8/~dmtmp/SEED/LARGE.FIL
Viewing Highest File Write Calls
You can view the list of files that have the highest write calls on each brick. If the brick name is not specified, a list of 100 files is displayed by default.
To view highest file Write calls:
View highest file Write calls using the following command:
# gluster volume top VOLNAME write [brick BRICK-NAME] [list-cnt cnt]
For example, to view highest Write calls on brick server:/export of test-volume:
# gluster volume top test-volume write brick server:/export list-cnt 10
Brick: server:/export/dir1
==========Write file stats========
write call count filename
83 /clients/client0/~dmtmp/SEED/LARGE.FIL
59 /clients/client7/~dmtmp/SEED/LARGE.FIL
59 /clients/client1/~dmtmp/SEED/LARGE.FIL
59 /clients/client2/~dmtmp/SEED/LARGE.FIL
59 /clients/client0/~dmtmp/SEED/LARGE.FIL
59 /clients/client8/~dmtmp/SEED/LARGE.FIL
59 /clients/client5/~dmtmp/SEED/LARGE.FIL
59 /clients/client4/~dmtmp/SEED/LARGE.FIL
59 /clients/client6/~dmtmp/SEED/LARGE.FIL
59 /clients/client3/~dmtmp/SEED/LARGE.FIL
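The read and write lists are often reviewed together to see which files dominate the I/O on a brick. The following bash sketch (the volume name, brick path, and list length are example values) writes both lists into a single report:
#!/bin/bash
# Collect the most-read and most-written files for one brick
# into a single report file.
VOLNAME=test-volume
BRICK=server:/export                   # example brick
REPORT=hot-files-$(date +%Y%m%d).txt

{
    echo "== Top read calls =="
    gluster volume top "$VOLNAME" read brick "$BRICK" list-cnt 10
    echo
    echo "== Top write calls =="
    gluster volume top "$VOLNAME" write brick "$BRICK" list-cnt 10
} > "$REPORT"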
Viewing Highest Open Calls on Directories
You can view the directories that have the highest open calls on each brick. If the brick name is not specified, the metrics of all the bricks belonging to that volume are displayed.
To view the list of open calls on each directory:
View the list of open calls on each directory using the following command:
# gluster volume top VOLNAME opendir [brick BRICK-NAME] [list-cnt cnt]
For example, to view open calls on brick server:/export/ of test-volume:
# gluster volume top test-volume opendir brick server:/export list-cnt 10
Brick: server:/export/dir1
==========Directory open stats========
Opendir count directory name
1001 /clients/client0/~dmtmp
454 /clients/client8/~dmtmp
454 /clients/client2/~dmtmp
454 /clients/client6/~dmtmp
454 /clients/client5/~dmtmp
454 /clients/client9/~dmtmp
443 /clients/client0/~dmtmp/PARADOX
408 /clients/client1/~dmtmp
408 /clients/client7/~dmtmp
402 /clients/client4/~dmtmp
Viewing Highest Read Calls on Directory
You can view the directories that have the highest directory read calls on each brick. If the brick name is not specified, the metrics of all the bricks belonging to that volume are displayed.
To view the list of highest directory read calls on each brick:
View the list of highest directory read calls on each brick using the following command:
# gluster volume top VOLNAME readdir [brick BRICK-NAME] [list-cnt cnt]
For example, to view highest directory read calls on brick server:/export of test-volume:
# gluster volume top test-volume readdir brick server:/export list-cnt 10
Brick: server:/export/dir1
==========Directory readdirp stats========
readdirp count directory name
1996 /clients/client0/~dmtmp
1083 /clients/client0/~dmtmp/PARADOX
904 /clients/client8/~dmtmp
904 /clients/client2/~dmtmp
904 /clients/client6/~dmtmp
904 /clients/client5/~dmtmp
904 /clients/client9/~dmtmp
812 /clients/client1/~dmtmp
812 /clients/client7/~dmtmp
800 /clients/client4/~dmtmp
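The opendir and readdir lists can be collected in the same way to find the busiest directories, for example before deciding how a directory tree should be spread across bricks. A short bash sketch (same placeholder volume and brick names as above):
#!/bin/bash
# Show the busiest directories on a brick by directory open
# and directory read (readdirp) calls.
VOLNAME=test-volume
BRICK=server:/export                   # example brick

for METRIC in opendir readdir; do
    echo "== Top $METRIC calls on $BRICK =="
    gluster volume top "$VOLNAME" "$METRIC" brick "$BRICK" list-cnt 10
    echo
done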
Viewing List of Read Performance on each Brick
You can view the read throughput of files on each brick. If the brick name is not specified, the metrics of all the bricks belonging to that volume are displayed. The output is the read throughput.
==========Read throughput file stats========

read throughput(MBps)   filename                                        Time
2570.00                 /clients/client0/~dmtmp/PWRPNT/TRIDOTS.POT      2011-01-31 15:38:36.894610
2570.00                 /clients/client0/~dmtmp/PWRPNT/PCBENCHM.PPT     2011-01-31 15:38:39.815310
2383.00                 /clients/client2/~dmtmp/SEED/MEDIUM.FIL         2011-01-31 15:52:53.631499
2340.00                 /clients/client0/~dmtmp/SEED/MEDIUM.FIL         2011-01-31 15:38:36.926198
2299.00                 /clients/client0/~dmtmp/SEED/LARGE.FIL          2011-01-31 15:38:36.930445
2259.00                 /clients/client0/~dmtmp/PARADOX/COURSES.X04     2011-01-31 15:38:40.549919
2221.00                 /clients/client0/~dmtmp/PARADOX/STUDENTS.VAL    2011-01-31 15:52:53.298766
2221.00                 /clients/client3/~dmtmp/SEED/COURSES.DB         2011-01-31 15:39:11.776780
2184.00                 /clients/client3/~dmtmp/SEED/MEDIUM.FIL         2011-01-31 15:39:10.251764
2184.00                 /clients/client5/~dmtmp/WORD/BASEMACH.DOC       2011-01-31 15:39:09.336572

This command initiates a dd for the specified count and block size and measures the corresponding throughput.
To view the list of read performance on each brick:
View the list of read performance on each brick using the following command:
# gluster volume top VOLNAME read-perf [bs blk-size count count] [brick BRICK-NAME] [list-cnt cnt]
For example, to view the read performance on brick server:/export/ of test-volume with a 256-byte block size, a count of 1, and a list count of 10:
# gluster volume top test-volume read-perf bs 256 count 1 brick server:/export/ list-cnt 10
Brick: server:/export/dir1
256 bytes (256 B) copied, Throughput: 4.1 MB/s

==========Read throughput file stats========

read throughput(MBps)   filename                                        Time
2912.00                 /clients/client0/~dmtmp/PWRPNT/TRIDOTS.POT      2011-01-31 15:38:36.896486
2570.00                 /clients/client0/~dmtmp/PWRPNT/PCBENCHM.PPT     2011-01-31 15:38:39.815310
2383.00                 /clients/client2/~dmtmp/SEED/MEDIUM.FIL         2011-01-31 15:52:53.631499
2340.00                 /clients/client0/~dmtmp/SEED/MEDIUM.FIL         2011-01-31 15:38:36.926198
2299.00                 /clients/client0/~dmtmp/SEED/LARGE.FIL          2011-01-31 15:38:36.930445
2259.00                 /clients/client0/~dmtmp/PARADOX/COURSES.X04     2011-01-31 15:38:40.549919
2221.00                 /clients/client9/~dmtmp/PARADOX/STUDENTS.VAL    2011-01-31 15:52:53.298766
2221.00                 /clients/client8/~dmtmp/PARADOX/COURSES.DB      2011-01-31 15:39:11.776780
2184.00                 /clients/client3/~dmtmp/SEED/MEDIUM.FIL         2011-01-31 15:39:10.251764
2184.00                 /clients/client5/~dmtmp/WORD/BASEMACH.DOC       2011-01-31 15:39:09.336572
Viewing List of Write Performance on each Brick
You can view the write throughput of files on each brick. If the brick name is not specified, the metrics of all the bricks belonging to that volume are displayed. The output is the write throughput.
This command initiates a dd for the specified count and block size and measures the corresponding throughput.
To view the list of write performance on each brick:
View the list of write performance on each brick using the following command:
# gluster volume top VOLNAME write-perf [bs blk-size count count] [brick BRICK-NAME] [list-cnt cnt]
For example, to view the write performance on brick server:/export/ of test-volume with a 256-byte block size, a count of 1, and a list count of 10:
# gluster volume top test-volume write-perf bs 256 count 1 brick server:/export/ list-cnt 10
Brick: server:/export/dir1
256 bytes (256 B) copied, Throughput: 2.8 MB/s

==========Write throughput file stats========

write throughput(MBps)   filename                                       Time
1170.00                  /clients/client0/~dmtmp/SEED/SMALL.FIL         2011-01-31 15:39:09.171494
1008.00                  /clients/client6/~dmtmp/SEED/LARGE.FIL         2011-01-31 15:39:09.73189
949.00                   /clients/client0/~dmtmp/SEED/MEDIUM.FIL        2011-01-31 15:38:36.927426
936.00                   /clients/client0/~dmtmp/SEED/LARGE.FIL         2011-01-31 15:38:36.933177
897.00                   /clients/client5/~dmtmp/SEED/MEDIUM.FIL        2011-01-31 15:39:09.33628
897.00                   /clients/client6/~dmtmp/SEED/MEDIUM.FIL        2011-01-31 15:39:09.27713
885.00                   /clients/client0/~dmtmp/SEED/SMALL.FIL         2011-01-31 15:38:36.924271
528.00                   /clients/client5/~dmtmp/SEED/LARGE.FIL         2011-01-31 15:39:09.81893
516.00                   /clients/client6/~dmtmp/ACCESS/FASTENER.MDB    2011-01-31 15:39:01.797317
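Because read-perf and write-perf actually run a dd of the requested size on the brick, they add load of their own and are usually sampled only occasionally. The following bash sketch (it reuses the bs 256 count 1 values from the examples above; the volume, brick, and log file names are placeholders) appends one read and one write measurement to a log each time it is run, so it can be scheduled from cron on a quiet volume:
#!/bin/bash
# Record one read-perf and one write-perf sample for a brick.
# Note: these commands generate real I/O (a dd of bs * count bytes).
VOLNAME=test-volume
BRICK=server:/export                   # example brick
LOG=/var/tmp/brick-throughput.log

{
    echo "=== $(date) ==="
    gluster volume top "$VOLNAME" read-perf bs 256 count 1 brick "$BRICK" list-cnt 5
    gluster volume top "$VOLNAME" write-perf bs 256 count 1 brick "$BRICK" list-cnt 5
} >> "$LOG"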
Displaying Volume Status
You can display status information about a specific volume, a specific brick, or all volumes, as needed. Status information can be used to understand the current state of the bricks, the NFS processes, and the overall file system, and to monitor and debug the volume. You can view the status of a volume along with the following details:
detail - Displays additional information about the bricks.
clients - Displays the list of clients connected to the volume.
mem - Displays the memory usage and memory pool details of the bricks.
inode - Displays the inode tables of the volume.
fd - Displays the open fd (file descriptors) tables of the volume.
callpool - Displays the pending calls of the volume.
To display volume status:
Display information about a specific volume using the following command:
# gluster volume status [all|VOLNAME [BRICKNAME]] [detail|clients|mem|inode|fd|callpool]
For example, to display information about test-volume:
# gluster volume status test-volume
STATUS OF VOLUME: test-volume
BRICK PORT ONLINE PID
--------------------------------------------------------
arch:/export/1 24009 Y 22445
--------------------------------------------------------
arch:/export/2 24010 Y 22450
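The ONLINE column is the quickest health indicator, which makes it a common target for simple monitoring scripts. The following is a minimal bash sketch that flags bricks that are not online; it assumes the plain-text layout shown above (brick, port, online flag, PID per row), which may differ in other releases:
#!/bin/bash
# Report any brick of the volume whose ONLINE column is not "Y".
# Relies on the status layout shown above; adjust the column
# numbers if your release formats the output differently.
VOLNAME=test-volume

gluster volume status "$VOLNAME" | awk '
    $2 ~ /^[0-9]+$/ && $3 != "Y" { print "Brick offline: " $1; bad = 1 }
    END { exit bad }'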
Display information about all volumes using the following command:
# gluster volume status all
STATUS OF VOLUME: volume-test
BRICK PORT ONLINE PID
--------------------------------------------------------
arch:/export/4 24010 Y 22455
STATUS OF VOLUME: test-volume
BRICK PORT ONLINE PID
--------------------------------------------------------
arch:/export/1 24009 Y 22445
--------------------------------------------------------
arch:/export/2 24010 Y 22450
Display additional information about the bricks using the following command:
# gluster volume status VOLNAME detail
For example, to display additional information about the bricks of test-volume:
# gluster volume status test-volume detail
STATUS OF VOLUME: test-volume
-------------------------------------------
Brick : arch:/export/1
Port : 24009
Online : Y
Pid : 16977
File System : rootfs
Device : rootfs
Mount Options : rw
Disk Space Free : 13.8GB
Total Disk Space : 46.5GB
Inode Size : N/A
Inode Count : N/A
Free Inodes : N/A
Display the list of clients accessing the volumes using the following command:
# gluster volume status VOLNAME clients
For example, to display the list of clients connected to test-volume:
# gluster volume status test-volume clients
Brick : arch:/export/1
Clients connected : 2
Hostname            BytesRead    BytesWritten
--------            ---------    ------------
127.0.0.1:1013      776          676
127.0.0.1:1012      50440        51200
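To get a quick total across all bricks, the per-brick client counts can be summed. A small bash sketch (assuming the "Clients connected :" lines shown above):
#!/bin/bash
# Sum the "Clients connected" figure reported for every brick.
VOLNAME=test-volume

gluster volume status "$VOLNAME" clients |
    awk -F: '/Clients connected/ { total += $2 } END { print "Total client connections:", total+0 }'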
Display the memory usage and memory pool details of the bricks using the following command:
# gluster volume status VOLNAME mem
For example, to display the memory usage and memory pool details of the bricks of test-volume:
# gluster volume status test-volume mem
Memory status for volume : test-volume
----------------------------------------------
Brick : arch:/export/1
Mallinfo
--------
Arena : 434176
Ordblks : 2
Smblks : 0
Hblks : 12
Hblkhd : 40861696
Usmblks : 0
Fsmblks : 0
Uordblks : 332416
Fordblks : 101760
Keepcost : 100400
Mempool Stats
-------------
Name HotCount ColdCount PaddedSizeof AllocCount MaxAlloc
---- -------- --------- ------------ ---------- --------
test-volume-server:fd_t 0 16384 92 57 5
test-volume-server:dentry_t 59 965 84 59 59
test-volume-server:inode_t 60 964 148 60 60
test-volume-server:rpcsvc_request_t 0 525 6372 351 2
glusterfs:struct saved_frame 0 4096 124 2 2
glusterfs:struct rpc_req 0 4096 2236 2 2
glusterfs:rpcsvc_request_t 1 524 6372 2 1
glusterfs:call_stub_t 0 1024 1220 288 1
glusterfs:call_stack_t 0 8192 2084 290 2
glusterfs:call_frame_t 0 16384 172 1728 6
Display the inode tables of the volume using the following command:
# gluster volume status VOLNAME inode
For example, to display the inode tables of the test-volume:
# gluster volume status test-volume inode
inode tables for volume test-volume
----------------------------------------------
Brick : arch:/export/1
Active inodes:
GFID Lookups Ref IA type
---- ------- --- -------
6f3fe173-e07a-4209-abb6-484091d75499 1 9 2
370d35d7-657e-44dc-bac4-d6dd800ec3d3 1 1 2
LRU inodes:
GFID Lookups Ref IA type
---- ------- --- -------
80f98abe-cdcf-4c1d-b917-ae564cf55763 1 0 1
3a58973d-d549-4ea6-9977-9aa218f233de 1 0 1
2ce0197d-87a9-451b-9094-9baa38121155 1 0 2
Display the open fd tables of the volume using the following command:
# gluster volume status VOLNAME fd
For example, to display the open fd tables of the test-volume:
# gluster volume status test-volume fd
FD tables for volume test-volume
----------------------------------------------
Brick : arch:/export/1
Connection 1:
RefCount = 0 MaxFDs = 128 FirstFree = 4
FD Entry PID RefCount Flags
-------- --- -------- -----
0 26311 1 2
1 26310 3 2
2 26310 1 2
3 26311 3 2
Connection 2:
RefCount = 0 MaxFDs = 128 FirstFree = 0
No open fds
Connection 3:
RefCount = 0 MaxFDs = 128 FirstFree = 0
No open fds
Display the pending calls of the volume using the following command:
# gluster volume status VOLNAME callpool
Each call has a call stack containing call frames.
For example, to display the pending calls of test-volume:
# gluster volume status test-volume callpool
Pending calls for volume test-volume
----------------------------------------------
Brick : arch:/export/1
Pending calls: 2
Call Stack1
UID : 0
GID : 0
PID : 26338
Unique : 192138
Frames : 7
Frame 1
Ref Count = 1
Translator = test-volume-server
Completed = No
Frame 2
Ref Count = 0
Translator = test-volume-posix
Completed = No
Parent = test-volume-access-control
Wind From = default_fsync
Wind To = FIRST_CHILD(this)->fops->fsync
Frame 3
Ref Count = 1
Translator = test-volume-access-control
Completed = No
Parent = repl-locks
Wind From = default_fsync
Wind To = FIRST_CHILD(this)->fops->fsync
Frame 4
Ref Count = 1
Translator = test-volume-locks
Completed = No
Parent = test-volume-io-threads
Wind From = iot_fsync_wrapper
Wind To = FIRST_CHILD (this)->fops->fsync
Frame 5
Ref Count = 1
Translator = test-volume-io-threads
Completed = No
Parent = test-volume-marker
Wind From = default_fsync
Wind To = FIRST_CHILD(this)->fops->fsync
Frame 6
Ref Count = 1
Translator = test-volume-marker
Completed = No
Parent = /export/1
Wind From = io_stats_fsync
Wind To = FIRST_CHILD(this)->fops->fsync
Frame 7
Ref Count = 1
Translator = /export/1
Completed = No
Parent = test-volume-server
Wind From = server_fsync_resume
Wind To = bound_xl->fops->fsync
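When an issue has to be escalated, it is convenient to capture all of these views in one pass. The following bash sketch (the output directory naming is an arbitrary choice) saves each of the status views described above into a timestamped directory that can be attached to a support case:
#!/bin/bash
# Collect every "gluster volume status" view described above into
# one timestamped directory for later analysis.
VOLNAME=${1:-test-volume}
OUTDIR=gluster-status-${VOLNAME}-$(date +%Y%m%d-%H%M%S)
mkdir -p "$OUTDIR"

gluster volume status "$VOLNAME" > "$OUTDIR/status.txt"
for VIEW in detail clients mem inode fd callpool; do
    gluster volume status "$VOLNAME" "$VIEW" > "$OUTDIR/$VIEW.txt"
done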