文章目录
  1. 1. 使用 Jconsole 监控 MyCAT 运行
  2. 2. MyCAT 性能监控

使用 Jconsole 监控 MyCAT 运行

Jconsole 是 Java 自带的性能监控工具,可以监控 Java 程序在运行过程中的 CPU、内存等的使用情况。

如果要使用 Jconsole 来监控 MyCAT 的运行状况,需要添加 MyCAT 的运行参数,在 conf/wrapper.conf 中,需要修改以下参数,

1
2
3
4
5
6
7
8
9
10
11
12
13
14
# Java Additional Parameters
#wrapper.java.additional.1=
wrapper.java.additional.1=-DMYCAT_HOME=.
wrapper.java.additional.2=-server
wrapper.java.additional.3=-XX:MaxPermSize=64M
wrapper.java.additional.4=-XX:+AggressiveOpts
wrapper.java.additional.5=-XX:MaxDirectMemorySize=2G
wrapper.java.additional.6=-Dcom.sun.management.jmxremote
wrapper.java.additional.7=-Dcom.sun.management.jmxremote.port=1984
wrapper.java.additional.8=-Dcom.sun.management.jmxremote.authenticate=false
wrapper.java.additional.9=-Dcom.sun.management.jmxremote.ssl=false
wrapper.java.additional.10=-Xmx4G
wrapper.java.additional.11=-Xms1G
wrapper.java.additional.12=-Djava.rmi.server.hostname=192.168.2.201

其中,前11个参数是 MyCAT 默认的参数,最后一个参数 -Djava.rmi.server.hostname=192.168.2.201 是我本地的 rmi 监听 IP,即使用 Jconsole 远程连接的 IP。以上参数中,与 Jconsole 监控有关的参数还有以下几个,

1
2
3
4
wrapper.java.additional.6=-Dcom.sun.management.jmxremote
wrapper.java.additional.7=-Dcom.sun.management.jmxremote.port=1984
wrapper.java.additional.8=-Dcom.sun.management.jmxremote.authenticate=false
wrapper.java.additional.9=-Dcom.sun.management.jmxremote.ssl=false

其中,-Dcom.sun.management.jmxremote 表示启用远程 jmx 监听;-Dcom.sun.management.jmxremote.port=1984 表示监听端口是1984;-Dcom.sun.management.jmxremote.authenticate=false 表示不启用登陆认证;-Dcom.sun.management.jmxremote.ssl=false 表示不启用 ssl 加密连接。因此,Jconsole 的远程连接地址就是 192.168.2.201:1984

然后,运行 Jconsole,输入连接地址即可登陆开始监控,如下,

登陆之后,可以看到 MyCAT 进程使用CPU、内存的情况,如下,

MyCAT 性能监控

运行 MyCAT 自带的性能测试工具 testtool 来进行此次测试,测试用的表是 travelrecord,共有10个分片,采用主键 mod 10 的算法来执行分片,初始状态是空表,mysql 采用5.7版本,只有1个 mysql 实例,上面有10个数据库,本次测试固定100个连接。

先运行一个10000数据量的插入,得到在我的测试机上,单表插入 tps 大约是1600,如下,

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
$ ./test_stand_insert_perf.sh jdbc:mysql://localhost:8066/TESTDB test test 100 "0-10000"
check JAVA_HOME & java
---------set HOME_DIR------------
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=64M; support was removed in 8.0
create jobs ...
success ful created connections ,total :100
create jobs finished ,begin run test...
success create job count: 10 teset threads: 100
03 21:27:32 finished records :0 failed:0 speed:0.0
03 21:27:33 finished records :1010 failed:0 speed:931.7343173431734
03 21:27:34 finished records :4444 failed:0 speed:2129.372304743651
03 21:27:35 finished records :5858 failed:0 speed:1897.020725388601
03 21:27:36 finished records :7171 failed:0 speed:1754.1585127201565
03 21:27:37 finished records :9617 failed:0 speed:1889.762232265671
finishend:10001 failed:0
used time total:6seconds
tps:1639.5081967213116

再做一个单表查询的测试,基于之前已经插入的表,qps 大约是5000左右,如下,

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
$ ./test_stand_select_perf.sh jdbc:mysql://localhost:8066/TESTDB test test 100 1000 file=travelrecord_select.sql
check JAVA_HOME & java
---------set HOME_DIR------------
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=64M; support was removed in 8.0
execute sql times:1000
create thread 0
create thread 1
create thread 2
create thread 3
......
create thread 99
success create thread count: 100
all thread started,waiting finsh...
finishend:4052 failed:0 qps:5346.46,query time min:0ms,max:99ms,avg:9.99
finishend:8619 failed:0 qps:4972.03,query time min:0ms,max:117ms,avg:13.66
finishend:13011 failed:0 qps:4787.22,query time min:0ms,max:141ms,avg:18.11
finishend:17134 failed:0 qps:4623.36,query time min:0ms,max:141ms,avg:20.28
finishend:21861 failed:0 qps:4635.94,query time min:0ms,max:141ms,avg:20.94
finishend:27917 failed:0 qps:4888.89,query time min:0ms,max:141ms,avg:21.11
finishend:34310 failed:0 qps:5113.82,query time min:0ms,max:141ms,avg:20.06
finishend:40371 failed:0 qps:5237.45,query time min:0ms,max:141ms,avg:19.31
finishend:45614 failed:0 qps:5239.43,query time min:0ms,max:141ms,avg:19.11
finishend:51195 failed:0 qps:5274.61,query time min:0ms,max:141ms,avg:18.97
finishend:55691 failed:0 qps:5200.02,query time min:0ms,max:168ms,avg:19.00
finishend:61247 failed:0 qps:5228.27,query time min:0ms,max:180ms,avg:19.13
finishend:67327 failed:0 qps:5295.13,query time min:0ms,max:180ms,avg:18.89
finishend:73218 failed:0 qps:5338.11,query time min:0ms,max:180ms,avg:18.77
finishend:78833 failed:0 qps:5364.28,query time min:0ms,max:180ms,avg:18.62
finishend:84058 failed:0 qps:5378.86,query time min:0ms,max:180ms,avg:18.60
finishend:89272 failed:0 qps:5423.15,query time min:0ms,max:180ms,avg:18.54
warning many test threads finished ,qps may NOT Accurate ,alive threads:49
finishend:94340 failed:0 qps:5515.03,query time min:0ms,max:180ms,avg:18.38
warning many test threads finished ,qps may NOT Accurate ,alive threads:12
finishend:99205 failed:0 qps:5688.55,query time min:0ms,max:180ms,avg:17.72
warning many test threads finished ,qps may NOT Accurate ,alive threads:0
finishend:100000 failed:0 qps:5727.72,query time min:0ms,max:180ms,avg:17.52
finishend:100000 failed:0 qps:5727.72,query time min:0ms,max:180ms,avg:17.52
finished all,total time :20

其中,参数1000表示,每个线程执行1000次查询,而非总共执行1000次查询。travelrecord_select.sql 的内容如下,

1
sql=select * from travelrecord where id = ${int(1-10000)}

设计一个实验场景,先执行大约10分钟的插入,再 sleep 30秒,最后执行大约10分钟的查询。根据之前的测试数据,大约需要插入90万(15006010)条数据,执行300万(50006010)次查询(每个连接3万次查询),这个估算是根据数据量的增大仍然不影响 tps 和 qps 的基础上来进行的,实际上,数据量大的时候,插入和查询的效率都会受到影响,实验中,执行50万次插入,300万次查询。因此,编写测试脚本 perf-test.sh 如下,

1
2
3
4
5
#!/bin/bash
time ./test_stand_insert_perf.sh jdbc:mysql://localhost:8066/TESTDB test test 100 "1-500000"
sleep 30
time ./test_stand_select_perf.sh jdbc:mysql://localhost:8066/TESTDB test test 100 30000 file=travelrecord_select.sql

但是,MyCAT 的工具似乎有问题,在100个并发连接下,不能执行16000条数据以上的插入,否则会报错如下,

1
2
3
4
5
6
7
8
9
10
create jobs finished ,begin run test...
success create job count: 40 teset threads: 100
03 21:11:10 finished records :0 failed:1414 speed:-1801.2738853503186
03 21:11:11 finished records :0 failed:5757 speed:-3225.2100840336134
warn no connection in pool,create new one
warn no connection in pool,create new one
warn no connection in pool,create new one
warn no connection in pool,create new one
warn no connection in pool,create new one
warn no connection in pool,create new one

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: #HY000
at sun.reflect.GeneratedConstructorAccessor9.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:389)
at com.mysql.jdbc.Util.getInstance(Util.java:372)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:958)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3835)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3771)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2435)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2582)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2535)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1911)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2145)
at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:1777)
at com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1281)
at org.opencloudb.performance.TravelRecordInsertJob.insert(TravelRecordInsertJob.java:71)
at org.opencloudb.performance.TravelRecordInsertJob.run(TravelRecordInsertJob.java:125)
... 3 more
com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: No operations allowed after connection closed.
at sun.reflect.GeneratedConstructorAccessor9.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:389)
at com.mysql.jdbc.Util.getInstance(Util.java:372)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:958)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:937)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:926)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:872)
at com.mysql.jdbc.ConnectionImpl.throwConnectionClosedException(ConnectionImpl.java:1236)
at com.mysql.jdbc.ConnectionImpl.checkClosed(ConnectionImpl.java:1231)
at com.mysql.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:4574)
at org.opencloudb.performance.TravelRecordInsertJob.run(TravelRecordInsertJob.java:130)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

怀疑是并发连接数量到达上线的原因,按官方文档的说法,在 schema.xml 中,增大了 minCon 参数,仍然不起效果,因此,实际运行的是如下脚本,

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
#!/bin/bash
for ((i=0; i<50; ++i))
do
j=`echo "$i*10000" | bc`
start=`echo "$j+1" | bc`
end=`echo "$j+10000" | bc`
# echo "${start} , ${end}"
./test_stand_insert_perf.sh jdbc:mysql://localhost:8066/TESTDB test test 100 "${start}-${end}"
sleep 1
done
sleep 30
time ./test_stand_select_perf.sh jdbc:mysql://localhost:8066/TESTDB test test 100 30000 file=travelrecord_select.sql

travelrecord_select.sql 脚本也需要修改查询范围,如下,

1
sql=select * from travelrecord where id = ${int(1-500000)}

正式执行之前,可以先执行上述脚本一次,然后清空 travelrecord 表,再重启 MyCAT,这主要是用来预热 mysql,并将 MyCAT 重置,最后连接 Jconsole,开始监控。

在运行测试的过程中,使用管理端登陆可以看到进程的信息,如下,

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
mysql> show @@server;
+--------------+-------------+--------------+------------+---------------+---------------+---------+--------+-----------------------+
| UPTIME | USED_MEMORY | TOTAL_MEMORY | MAX_MEMORY | RELOAD_TIME | ROLLBACK_TIME | CHARSET | STATUS | AVG_BUFPOOL_ITEM_SIZE |
+--------------+-------------+--------------+------------+---------------+---------------+---------+--------+-----------------------+
| 4m 35s 925ms | 298996728 | 1063256064 | 3817865216 | 1459690848717 | -1 | latin1 | ON | 95 |
+--------------+-------------+--------------+------------+---------------+---------------+---------+--------+-----------------------+
1 row in set (0.01 sec)
mysql> show @@processor;
+------------+---------+---------+-------------+---------+---------+-------------+--------------+------------+----------+----------+----------+
| NAME | NET_IN | NET_OUT | REACT_COUNT | R_QUEUE | W_QUEUE | FREE_BUFFER | TOTAL_BUFFER | BU_PERCENT | BU_WARNS | FC_COUNT | BC_COUNT |
+------------+---------+---------+-------------+---------+---------+-------------+--------------+------------+----------+----------+----------+
| Processor0 | 7398748 | 7696860 | 0 | 0 | 0 | 1264 | 4000 | 68 | 31 | 12 | 39 |
| Processor1 | 9560214 | 9409852 | 0 | 0 | 0 | 1264 | 4000 | 68 | 31 | 10 | 44 |
| Processor2 | 8904692 | 6975814 | 0 | 0 | 0 | 1264 | 4000 | 68 | 31 | 12 | 34 |
| Processor3 | 6765729 | 8399034 | 0 | 0 | 0 | 1264 | 4000 | 68 | 31 | 6 | 35 |
+------------+---------+---------+-------------+---------+---------+-------------+--------------+------------+----------+----------+----------+
4 rows in set (0.00 sec)
mysql> show @@threadpool;
+------------------+-----------+--------------+-----------------+----------------+------------+
| NAME | POOL_SIZE | ACTIVE_COUNT | TASK_QUEUE_SIZE | COMPLETED_TASK | TOTAL_TASK |
+------------------+-----------+--------------+-----------------+----------------+------------+
| Timer | 2 | 0 | 0 | 579 | 579 |
| BusinessExecutor | 8 | 0 | 0 | 223484 | 223484 |
+------------------+-----------+--------------+-----------------+----------------+------------+
2 rows in set (0.00 sec)
mysql> show @@datanode;
+------+--------------------+-------+-------+--------+------+-------+---------+------------+----------+---------+---------------+
| NAME | DATHOST | INDEX | TYPE | ACTIVE | IDLE | SIZE | EXECUTE | TOTAL_TIME | MAX_TIME | MAX_SQL | RECOVERY_TIME |
+------+--------------------+-------+-------+--------+------+-------+---------+------------+----------+---------+---------------+
| dn0 | localhost1/dbtest0 | 0 | mysql | 0 | 15 | 10000 | 2209 | 0 | 0 | 0 | -1 |
| dn1 | localhost1/dbtest1 | 0 | mysql | 0 | 16 | 10000 | 2275 | 0 | 0 | 0 | -1 |
| dn2 | localhost1/dbtest2 | 0 | mysql | 0 | 16 | 10000 | 2210 | 0 | 0 | 0 | -1 |
| dn3 | localhost1/dbtest3 | 0 | mysql | 0 | 15 | 10000 | 2193 | 0 | 0 | 0 | -1 |
| dn4 | localhost1/dbtest4 | 0 | mysql | 0 | 14 | 10000 | 2203 | 0 | 0 | 0 | -1 |
| dn5 | localhost1/dbtest5 | 0 | mysql | 0 | 16 | 10000 | 2226 | 0 | 0 | 0 | -1 |
| dn6 | localhost1/dbtest6 | 0 | mysql | 0 | 15 | 10000 | 2211 | 0 | 0 | 0 | -1 |
| dn7 | localhost1/dbtest7 | 0 | mysql | 0 | 16 | 10000 | 2231 | 0 | 0 | 0 | -1 |
| dn8 | localhost1/dbtest8 | 0 | mysql | 0 | 15 | 10000 | 2211 | 0 | 0 | 0 | -1 |
| dn9 | localhost1/dbtest9 | 0 | mysql | 0 | 14 | 10000 | 2212 | 0 | 0 | 0 | -1 |
+------+--------------------+-------+-------+--------+------+-------+---------+------------+----------+---------+---------------+
10 rows in set (0.00 sec)

Jconsole 的监控截图如下,这里只做了30分钟的监控,整个测试运行了约26分30秒(包括其中 sleep 的时间),

从上述图表可知,内存的使用虽然波动比较剧烈,但是总体比较稳定,线程数量基本没有变化,CPU 的使用上,插入的时候比查询的时候用的要少,这应该是因为查询的 CPU 消耗比较密集,而插入的瓶颈仍然在 IO 部分。

文章目录
  1. 1. 使用 Jconsole 监控 MyCAT 运行
  2. 2. MyCAT 性能监控

欢迎来到Valleylord的博客!

本博的文章尽量原创。