# arthas

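Arthas is an open-source Java diagnostic tool from Alibaba that is popular with developers.
Arthas works through an interactive command line and offers extensive Tab completion, which makes locating and diagnosing problems more convenient.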

Arthas can help when you are stuck on problems like the following:

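Which jar was this class loaded from? Why are all sorts of class-related Exceptions being thrown?
Why is the code I changed not being executed? Did I forget to commit? Am I on the wrong branch?
I cannot debug the problem in production; is adding logs and redeploying really the only option?
One user's data is processed incorrectly in production, but I cannot debug there and cannot reproduce it offline!
Is there a global view of how the system is running?
Is there a way to monitor the real-time state of the JVM?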

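Arthas 3.1.5 introduces the following new features:

An out-of-the-box Profiler/flame graph feature
Richer options for the grep command
More accurate timing statistics for commands such as monitor/tt/trace
telnet and http share port 3658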

java -jar arthas-boot.jar

To print help information, run `java -jar arthas-boot.jar -h`

# Common arthas commands

$ dashboard

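Field descriptions

ID: the Java-level thread ID. Note that this ID does not map one-to-one to the nativeID in jstack.
NAME: thread name
GROUP: thread group name
PRIORITY: thread priority, a number from 1 to 10; a larger number means a higher priority
STATE: thread state
CPU%: the thread's share of CPU usage. Over a 100 ms sampling window, the CPU usage of all threads is summed, and each thread's share of that sum is computed.
TIME: total running time of the thread, formatted as minutes:seconds
INTERRUPTED: the thread's current interrupt flag
DAEMON: whether the thread is a daemon thread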
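
The CPU% calculation described above can be sketched with the standard `ThreadMXBean` API. This is a minimal illustration, not Arthas's own implementation: the class name `CpuShareSketch` and its helpers are hypothetical, and it assumes the JVM supports thread CPU time measurement.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: per-thread CPU share over a sampling window.
class CpuShareSketch {
    // Turn per-thread CPU deltas (nanoseconds) into percentages of their sum.
    static Map<Long, Double> shares(Map<Long, Long> deltaNanos) {
        long total = 0;
        for (long d : deltaNanos.values()) {
            total += d;
        }
        Map<Long, Double> result = new LinkedHashMap<>();
        for (Map.Entry<Long, Long> e : deltaNanos.entrySet()) {
            result.put(e.getKey(), total == 0 ? 0.0 : 100.0 * e.getValue() / total);
        }
        return result;
    }

    // Read every live thread's CPU time twice, intervalMillis apart, and return the deltas.
    static Map<Long, Long> sample(long intervalMillis) throws InterruptedException {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Long, Long> before = new LinkedHashMap<>();
        for (long id : mx.getAllThreadIds()) {
            before.put(id, mx.getThreadCpuTime(id));
        }
        Thread.sleep(intervalMillis);
        Map<Long, Long> delta = new LinkedHashMap<>();
        for (long id : mx.getAllThreadIds()) {
            long now = mx.getThreadCpuTime(id); // -1 if the thread died or measurement is disabled
            Long then = before.get(id);
            if (now >= 0 && then != null && then >= 0) {
                delta.put(id, now - then);
            }
        }
        return delta;
    }

    public static void main(String[] args) throws InterruptedException {
        // 100 ms matches the sampling window mentioned in the field description.
        shares(sample(100)).forEach((id, pct) -> System.out.printf("thread %d: %.1f%%%n", id, pct));
    }
}
```

Because each value is a share of the summed usage, the percentages add up to 100 whenever any thread used CPU during the window, regardless of how busy the machine is overall.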

thread

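Shows current thread information and thread stacks.

| Parameter | Description |
|---|---|
| id | Thread id |
| [n:] | Select the N busiest threads and print their stacks |
| [b] | Find the thread that is currently blocking other threads |
| [i `<value>`] | Sampling interval for the CPU share statistics, in milliseconds |

Examples: `thread -n 3`, `thread`, `thread 1` (1 is a thread id), `thread -b`, `thread -n 3 -i 1000`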
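
To make the idea behind `thread -b` concrete, here is a rough sketch of finding the thread that the most other threads are waiting on, using the standard `ThreadInfo.getLockOwnerId()` call. The class `BlockerSketch` and the method `topBlocker` are hypothetical names, and the selection rule (most waiters wins) is an assumption made for illustration, not a description of how Arthas decides.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: which thread owns the lock that most threads wait for?
class BlockerSketch {
    // lockOwnerIds[i] is the id of the thread owning the lock that thread i waits on,
    // or -1 when thread i is not waiting on an owned lock.
    // Returns the owner with the most waiters, or -1 if nobody is blocked.
    static long topBlocker(long[] lockOwnerIds) {
        Map<Long, Integer> waiters = new HashMap<>();
        long best = -1;
        int bestCount = 0;
        for (long owner : lockOwnerIds) {
            if (owner < 0) {
                continue;
            }
            int count = waiters.merge(owner, 1, Integer::sum);
            if (count > bestCount) {
                bestCount = count;
                best = owner;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        ThreadInfo[] infos = ManagementFactory.getThreadMXBean().dumpAllThreads(false, false);
        long[] owners = new long[infos.length];
        for (int i = 0; i < infos.length; i++) {
            owners[i] = infos[i].getLockOwnerId();
        }
        System.out.println("top blocker thread id: " + topBlocker(owners));
    }
}
```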

jad

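Decompiles the source of a specified loaded class.

Show only the source code when decompiling: `jad --source-only demo.MathGame`
Decompile a specific method: `jad demo.MathGame main`
Choose a ClassLoader when decompiling: `jad org.apache.log4j.Logger` (if several ClassLoaders have loaded the class, rerun with `-c <hashcode>` to pick one)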

mc

Memory Compiler: compiles .java files into .class files.

Use the -c option to specify the classloader: `mc -c 327a647b /tmp/Test.java`
Use the -d option to specify the output directory: `mc -d /tmp/output /tmp/ClassA.java /tmp/ClassB.java`

redefine

Loads external .class files and redefines classes that the JVM has already loaded.

sc

Shows information about classes loaded in the JVM.

Print detailed class information: `sc -d demo.MathGame`
Fuzzy search: `sc demo.*`
Print the class's Field information: `sc -d -f demo.MathGame`
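
Part of what makes class details useful is knowing where a class was loaded from, which is the "which jar?" question raised in the introduction. As a rough illustration of where that information lives inside the JVM, the sketch below reads a class's code source through the standard `ProtectionDomain` API. The class name `CodeSourceSketch` and the placeholder string are hypothetical; this is not how `sc` itself is implemented.

```java
import java.security.CodeSource;
import java.security.ProtectionDomain;

// Hypothetical sketch: report the jar or directory a class was loaded from.
class CodeSourceSketch {
    static String locationOf(Class<?> type) {
        ProtectionDomain domain = type.getProtectionDomain();
        CodeSource source = domain == null ? null : domain.getCodeSource();
        // Classes loaded by the bootstrap loader (for example java.lang.String) have no code source.
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        System.out.println("String: " + locationOf(String.class));
        System.out.println("this class: " + locationOf(CodeSourceSketch.class));
    }
}
```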

stack

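Prints the call path through which the current method is invoked.

We often know that a method is executed, but it can be reached through many paths, or you may have no idea where it is called from. That is when you need the stack command.

Basic usage: `stack demo.MathGame primeFactors`
Filter by a condition expression: `stack demo.MathGame primeFactors 'params[0]<0' -n 2`
Filter by execution time: `stack demo.MathGame primeFactors '#cost>5'`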
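
To show what "the same method reached through different call paths" means, here is a small self-contained sketch that captures the caller from inside the method using a plain `Throwable` stack trace. The names `CallPathSketch`, `target`, `pathA` and `pathB` are made up for illustration; the stack command obtains this kind of information without any change to the code.

```java
// Hypothetical sketch: one method, two call paths.
class CallPathSketch {
    static String callerOfTarget;

    static void target() {
        // frames[0] is target() itself, frames[1] is whoever called it.
        StackTraceElement[] frames = new Throwable().getStackTrace();
        callerOfTarget = frames[1].getMethodName();
    }

    static void pathA() {
        target();
    }

    static void pathB() {
        target();
    }

    public static void main(String[] args) {
        pathA();
        System.out.println("called from " + callerOfTarget);
        pathB();
        System.out.println("called from " + callerOfTarget);
    }
}
```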

trace

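Shows the call path inside a method and prints the time spent at each node on that path.

The three commands watch/stack/trace all support #cost.

Trace a method: `trace demo.MathGame run`
Filter out JDK methods: `trace -j demo.MathGame run`
Filter by call cost: `trace demo.MathGame run '#cost > 10'`
Trace multiple classes or methods: `trace -E com.test.ClassA|org.test.ClassB method1|method2|method3`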

watch

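Observes method execution data.

The watch command defines four observation points: -b before the method is invoked, -e after the method throws an exception, -s after the method returns, and -f after the method finishes.
Of the four observation points, -b, -e, and -s are off by default and -f is on by default. When an observation point is enabled, the watch expression is evaluated and printed at the corresponding event.
Note the difference between incoming and outgoing method parameters: a parameter may be modified inside the method, so its value can differ before and after. Only at the -b point does params represent the incoming parameters; at all other points it represents the outgoing parameters.
With -b, the observation point is before the method is invoked, so neither a return value nor an exception exists yet.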
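
The difference between incoming and outgoing parameters is easiest to see with a method that modifies its argument. The sketch below is a made-up example (`ParamsInOutSketch` and `drain` are hypothetical names, unrelated to `demo.MathGame`): rendering the argument before and after the call gives different results, just as `params` would differ between a -b observation and an -s or -f observation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: the same parameter looks different at method entry and exit.
class ParamsInOutSketch {
    // Sums the list and empties it as a side effect.
    static int drain(List<Integer> numbers) {
        int sum = 0;
        while (!numbers.isEmpty()) {
            sum += numbers.remove(numbers.size() - 1);
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> arg = new ArrayList<>(Arrays.asList(1, 2, 3));
        String atEntry = arg.toString(); // what an observation before the call would render
        int result = drain(arg);
        String atExit = arg.toString();  // what an observation after return would render
        System.out.println(atEntry + " -> " + atExit + ", returned " + result);
        // prints: [1, 2, 3] -> [], returned 6
    }
}
```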

Observe outgoing parameters and the return value

$ watch demo.MathGame primeFactors "{params,returnObj}" -x 2

Press Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:1) cost in 44 ms.
ts=2018-12-03 19:16:51; [cost=1.280502ms] result=@ArrayList[
    @Object[][
        @Integer[535629513],
    ],
    @ArrayList[
        @Integer[3],
        @Integer[19],
        @Integer[191],
        @Integer[49199],
    ],
]

Observe incoming parameters

$ watch demo.MathGame primeFactors "{params,returnObj}" -x 2 -b

Press Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:1) cost in 50 ms.
ts=2018-12-03 19:23:23; [cost=0.0353ms] result=@ArrayList[
    @Object[][
        @Integer[-1077465243],
    ],
    null,
]

Observe both before invocation and after return

$ watch demo.MathGame primeFactors "{params,target,returnObj}" -x 2 -b -s -n 2

Press Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:1) cost in 46 ms.
ts=2018-12-03 19:29:54; [cost=0.01696ms] result=@ArrayList[
    @Object[][
        @Integer[1544665400],
    ],
    @MathGame[
        random=@Random[java.util.Random@522b408a],
        illegalArgumentCount=@Integer[13038],
    ],
    null,
]
ts=2018-12-03 19:29:54; [cost=4.277392ms] result=@ArrayList[
    @Object[][
        @Integer[1544665400],
    ],
    @MathGame[
        random=@Random[java.util.Random@522b408a],
        illegalArgumentCount=@Integer[13038],
    ],
    @ArrayList[
        @Integer[2],
        @Integer[2],
        @Integer[2],
        @Integer[5],
        @Integer[5],
        @Integer[73],
        @Integer[241],
        @Integer[439],
    ],
]

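The -n 2 option means the command runs only twice.
In this output, the first result is the watch expression evaluated before the method is invoked, and the second is the expression evaluated after the method returns.
Results are printed in the order the events occur, regardless of the order of -s and -b in the command.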

Adjust the value of -x to observe parameter values in detail

$ watch demo.MathGame primeFactors "{params,target}" -x 3

Press Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:1) cost in 58 ms.
ts=2018-12-03 19:34:19; [cost=0.587833ms] result=@ArrayList[
    @Object[][
        @Integer[47816758],
    ],
    @MathGame[
        random=@Random[
            serialVersionUID=@Long[3905348978240129619],
            seed=@AtomicLong[3133719055989],
            multiplier=@Long[25214903917],
            addend=@Long[11],
            mask=@Long[281474976710655],
            DOUBLE_UNIT=@Double[1.1102230246251565E-16],
            BadBound=@String[bound must be positive],
            BadRange=@String[bound must be greater than origin],
            BadSize=@String[size must be non-negative],
            seedUniquifier=@AtomicLong[-3282039941672302964],
            nextNextGaussian=@Double[0.0],
            haveNextNextGaussian=@Boolean[false],
            serialPersistentFields=@ObjectStreamField[][isEmpty=false;size=3],
            unsafe=@Unsafe[sun.misc.Unsafe@2eaa1027],
            seedOffset=@Long[24],
        ],
        illegalArgumentCount=@Integer[13159],
    ],
]

-x is the traversal depth; adjust it to print the detailed contents of parameters and results. The default is 1.

Condition expression example

$ watch demo.MathGame primeFactors "{params[0],target}" "params[0]<0"

Press Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:1) cost in 68 ms.
ts=2018-12-03 19:36:04; [cost=0.530255ms] result=@ArrayList[
    @Integer[-18178089],
    @MathGame[demo.MathGame@41cf53f9],
]

Only invocations that satisfy the condition produce output.

Exception observation example

$ watch demo.MathGame primeFactors "{params[0],throwExp}" -e -x 2

Press Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:1) cost in 62 ms.
ts=2018-12-03 19:38:00; [cost=1.414993ms] result=@ArrayList[
    @Integer[-1120397038],
    java.lang.IllegalArgumentException: number is: -1120397038, need >= 2
    at demo.MathGame.primeFactors(MathGame.java:46)
    at demo.MathGame.run(MathGame.java:24)
    at demo.MathGame.main(MathGame.java:16)
,
]

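-e means the expression is evaluated only when an exception is thrown.
In the expression, the variable that holds the exception is throwExp.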

Filter by cost

$ watch demo.MathGame primeFactors '{params, returnObj}' '#cost>200' -x 2

Press Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:1) cost in 66 ms.
ts=2018-12-03 19:40:28; [cost=2112.168897ms] result=@ArrayList[
    @Object[][
        @Integer[2141897465],
    ],
    @ArrayList[
        @Integer[5],
        @Integer[428379493],
    ],
]

#cost>200 (in ms) means output is printed only when the call takes longer than 200 ms; calls that take less than 200 ms are filtered out.

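Observe fields of the current object

To inspect the fields of the current object before and after the method runs, use the target keyword, which refers to the current object.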

$ watch demo.MathGame primeFactors 'target'

Press Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:1) cost in 52 ms.
ts=2018-12-03 19:41:52; [cost=0.477882ms] result=@MathGame[
    random=@Random[java.util.Random@522b408a],
    illegalArgumentCount=@Integer[13355],
]

Then use target.field_name to access a specific field of the current object

$ watch demo.MathGame primeFactors 'target.illegalArgumentCount'

Press Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:1) cost in 67 ms.
ts=2018-12-03 20:04:34; [cost=131.303498ms] result=@Integer[8]
ts=2018-12-03 20:04:35; [cost=0.961441ms] result=@Integer[8]

Monitor

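Monitors call statistics for a given method, including total calls, average RT, success rate, and more, printed once per statistics cycle (every 5 seconds with -c 5). The command takes a named parameter [c:], the statistics cycle (cycle of output), whose value is an integer.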

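Monitored dimensions

| Item | Description |
|---|---|
| timestamp | Timestamp |
| class | Java class |
| method | Method (constructor or regular method) |
| total | Number of calls |
| success | Number of successful calls |
| fail | Number of failed calls |
| rt | Average RT |
| fail-rate | Failure rate |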

monitor -c 5 demo.MathGame primeFactors
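
The relationship between the monitored items can be sketched as a small accumulator for one statistics cycle. This is an illustrative model only: the class `MonitorWindowSketch` and its methods are hypothetical, and it assumes that rt is the total cost divided by the total number of calls and that fail-rate is fail divided by total, expressed as a percentage.

```java
// Hypothetical sketch: aggregating one monitor cycle.
class MonitorWindowSketch {
    private int total;
    private int success;
    private int fail;
    private double costSumMs;

    // Record one finished call; threw is true when the call ended with an exception.
    void record(double costMs, boolean threw) {
        total++;
        if (threw) {
            fail++;
        } else {
            success++;
        }
        costSumMs += costMs;
    }

    int total() { return total; }
    int success() { return success; }
    int fail() { return fail; }

    // Average response time in milliseconds over all recorded calls (assumed definition of rt).
    double avgRtMs() { return total == 0 ? 0.0 : costSumMs / total; }

    // Failed calls as a percentage of all calls (assumed definition of fail-rate).
    double failRatePercent() { return total == 0 ? 0.0 : 100.0 * fail / total; }

    public static void main(String[] args) {
        MonitorWindowSketch window = new MonitorWindowSketch();
        window.record(10.0, false);
        window.record(30.0, true);
        System.out.printf("total=%d success=%d fail=%d rt=%.2f fail-rate=%.2f%%%n",
                window.total(), window.success(), window.fail(),
                window.avgRtMs(), window.failRatePercent());
    }
}
```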

Time Tunnel(tt)

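Records method invocation information so that you can later inspect the parameters, return value, thrown exception, and other details of each call, as if travelling back through a time tunnel to the moment of the call.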

$ tt -t org.apache.dubbo.demo.provider.DemoServiceImpl sayHello

Classloader

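Shows how many class loaders exist in the current system and how many classes each has loaded, which helps you determine whether there is a class loader leak.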

$ classloader
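
For readers unfamiliar with how class loaders relate to each other, the following sketch walks the parent chain of a loader with the standard `ClassLoader.getParent()` call. The class `LoaderChainSketch` is a hypothetical name, and the literal "bootstrap" is just a label chosen here for the loader that Java represents as null; the sketch does not count loaded classes the way the classloader command does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: list a class loader and its ancestors.
class LoaderChainSketch {
    static List<String> chain(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader current = loader; current != null; current = current.getParent()) {
            names.add(current.getClass().getName());
        }
        // The bootstrap loader is represented by null, so add a label for it explicitly.
        names.add("bootstrap");
        return names;
    }

    public static void main(String[] args) {
        for (String name : chain(ClassLoader.getSystemClassLoader())) {
            System.out.println(name);
        }
    }
}
```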

Web Console

Arthas currently supports a Web Console. After a successful attach, you can open http://127.0.0.1:8563/ directly.

# Out-of-the-box Profiler/flame graph feature

Most people have heard of flame graphs, but many have probably been put off because they seem complicated to use.

# Start the profiler

profiler start

By default a cpu flame graph is generated, that is, event is cpu. Use the --event option to specify a different event.

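profiler getSamples: get the number of samples collected so far
profiler status: show the profiler status, including which event is being sampled and for how long
profiler stop: generate the result in svg format. By default the result is saved in the arthas-output directory under the application's working directory.
View profiler results under arthas-output in a browser: by default arthas uses port 3658, so you can open http://localhost:3658/arthas-output/ to see the profiler results in the arthas-output directory.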

# Richer options for the grep command

sysprop | grep java
sysprop | grep java -n
sysenv | grep -v JAVA
sysenv | grep -e "(?i)(JAVA|sun)" -m 3  -C 2
sysenv | grep JAVA -A2 -B3
thread | grep -m 10 -e  "TIMED_WAITING|WAITING"
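
The commands above combine a pattern with an inverted match (-v) and a maximum number of matches (-m). As a rough model of how those three interact, here is a small sketch using `java.util.regex`. It assumes the -e pattern is interpreted as a Java regular expression, which the `(?i)` flag in the example suggests but which is not verified here; `GrepSketch` and `grep` are hypothetical names, and context options such as -A, -B and -C are not modelled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch: line filtering with a regex, optional inversion, and a match limit.
class GrepSketch {
    static List<String> grep(List<String> lines, String regex, boolean invert, int maxCount) {
        Pattern pattern = Pattern.compile(regex);
        List<String> selected = new ArrayList<>();
        for (String line : lines) {
            if (selected.size() >= maxCount) {
                break; // stop after maxCount selected lines, like -m
            }
            boolean matches = pattern.matcher(line).find();
            if (matches != invert) { // invert flips the selection, like -v
                selected.add(line);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Made-up sample lines standing in for sysenv output.
        List<String> lines = Arrays.asList(
                "JAVA_HOME=/opt/jdk", "PATH=/usr/bin", "sun.boot=x", "java.version=8");
        System.out.println(grep(lines, "(?i)(JAVA|sun)", false, 2));
        System.out.println(grep(lines, "JAVA", true, Integer.MAX_VALUE));
    }
}
```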

# telnet and http share port 3658

By default, the Arthas Telnet port is 3658 and the HTTP port is 8563, which often confuses users. In the new version, port 3658 supports both the Telnet and HTTP protocols.