Kill and crash root-cause analysis
crash | |
Keywords | has die AndroidRuntime F DEBUG am_crash vold UsbDeviceService |
signal 9 | kill caller: pid 20112,tgid 20112,uid 0,sig 9,kill pid 438 (anr) Zygote : Process exited due to signal 9 (kill -9) |
signal 2 | Zygote : Process 4214 exited due to signal 2 (Interrupt) |
signal 6 | (SIGABRT), code -1 (SI_QUEUE), fault addr 'Unable to generate SkSurface. isTextureValid:0 ; D skia : Could not create EGL image, err = (0x3002)EGL_NOT_INITIALIZED ; Abort message: 'Scudo ERROR: corrupted chunk header at address 0x ; ASan can pinpoint the chunk: [0x0044bf1fb1c0,0x0044bf1fb1e0) is a small allocated heap chunk; size: 32 offset: 16 ; for kill -6, (SI_USER from pid x) shows which process sent the kill |
signal 7 | Signal: 7 (SIGBUS), Code: 2 (BUS_ADRERR) |
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR) fault addr | |
OomAdjuster: kill previous app | Empty, cached, B-service, and previous apps are managed together in a single LRU list of size 4; when the list overflows, the process that was added to the list first is killed |
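am_kill : [0,5247,com.jidu.media.hall,0,user request after error cached #3 empty #3 | 1. user request after error: the user closed the app from the ANR dialog, which force-stops the activity. 2. cached #3: the system killed a cached process. 3. empty #3: the system killed a cached (empty) process; ActivityManager: Killing empty for (once the number of empty processes exceeds the limit, an empty process survives at most 30 minutes; processes older than 30 minutes are killed until the number of empty processes drops below the limit). Reference: [LMKD] [Android] Process OomAdj adjustment analysis: the Empty kill flow (4) (进程OomAdj调整分析:Empty被Kill流程(4)), CSDN blog |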
ActivityTaskManager: Force finishing activity | kill -9 / force-stop (adb shell kill -9) |
fd 1024 | fd leak |
Too many receivers, total of 1000 | systemui : android.app.RemoteServiceException$CannotDeliverBroadcastException: can't deliver broadcast |
open failed: ENOENT (No such file or directory) ILL_ILLOPC | |
Read-only file system | SQLiteException: not an error (code 0 SQLITE_OK): Could not open the database in read/write mode. E AndroidRuntime: at android.database.sqlite.SQLiteConnection.nativeOpen(Native Method) |
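libutils.so systemTime( | Crash caused by memory corruption (memory stomping) while reading the system time: https://jira.jiduauto.com/browse/DCD-127999 |
OOM | OutOfMemoryError: Failed to allocate a 40 byte : the process reached its Java heap maximum, so the allocation failed. When the process reaches its VM (address space) maximum, the JNI allocation fails: OutOfMemoryError: Could not allocate JNI Env: Failed anonymous mmap Out of memory. See process maps in the log. Two kinds of kill: oom-kill: (raised on the Java side) ; Out of memory: Kill process (raised by the kernel). How the limits work: an app that sets android:largeHeap="true" in AndroidManifest.xml is limited by dalvik.vm.heapsize (on this system dalvik.vm.heapgrowthlimit == dalvik.vm.heapsize == 512M). If the system does not define the prop dalvik.vm.heapgrowthlimit, every app's maximum memory is dalvik.vm.heapsize. If the manifest declares largeHeap as true, the app only hits OOM when its usage reaches heapsize; otherwise it hits OOM at heapgrowthlimit. dalvik.vm.heaptargetutilization = used memory / heap size; a value that is too large or too small affects how often and how efficiently GC runs, and it is usually set to 0.75. Who throws OutOfMemoryError? References: ART Runtime creation (3): creating the Heap (ART Runtime创建(三)--Heap的创建), Jianshu; Android thread-start OOM (Android 启动线程OOM_android逻辑地址占满), CSDN blog |
lowmemorykiller | When Linux can no longer allocate new memory, it selectively kills processes (OOMKiller). LowMemoryKiller, by contrast, is a memory-reclaim mechanism triggered by memory threshold levels: when available system memory runs low, it selectively kills processes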
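Binder-related crashes | TransactionTooLargeException: ActivityStopInfo carries all the saved bundles; if they are too large, the app crashes
When the async binder buffer is exhausted and kernel memory cannot be allocated, the system kills the process that receives the binder calls | |
If a process receives a synchronous binder request while it is in the frozen state, the process is killed | |
When the number of binders sent to the system exceeds 6000, a non-system app may be killed by the system: | |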
am_kill : [0,4494,com.jidu.media.service,100,Too many Binders sent to SYSTEM] | |
If the process's UID is not SYSTEM_UID, it is killed | |
If an exception is thrown inside BroadcastReceiver.onReceive, AMS kills the process | |
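The am_kill rows in the table above can be triaged mechanically. The following is a minimal sketch, not the platform's own tooling: the field order (user, pid, process name, oom adj, reason) is an assumption read off the samples in this table, and `AmKill`, `parse_am_kill`, and `explain` are hypothetical helper names.

```python
import re
from typing import NamedTuple, Optional


class AmKill(NamedTuple):
    """One am_kill event. Field order is assumed from the samples in the table."""
    user: int
    pid: int
    process: str
    oom_adj: int
    reason: str


# Matches e.g. "am_kill : [0,4494,com.jidu.media.service,100,Too many Binders sent to SYSTEM]"
_AM_KILL = re.compile(r"am_kill\s*:\s*\[(\d+),(\d+),([^,]+),(-?\d+),([^\]]*)\]?")


def parse_am_kill(line: str) -> Optional[AmKill]:
    """Return the parsed event, or None if the line is not an am_kill event."""
    m = _AM_KILL.search(line)
    if m is None:
        return None
    user, pid, process, adj, reason = m.groups()
    return AmKill(int(user), int(pid), process, int(adj), reason.strip())


def explain(reason: str) -> str:
    """Map a kill reason to the cause noted in the table (labels are illustrative)."""
    if reason.startswith("user request after error"):
        return "app closed from the ANR dialog"
    if reason.startswith("cached #") or reason.startswith("empty #"):
        return "cached/empty process trimmed by the system"
    if reason.startswith("Too many Binders"):
        return "non-system process exceeded the binder limit"
    return "unclassified"


if __name__ == "__main__":
    event = parse_am_kill(
        "am_kill : [0,4494,com.jidu.media.service,100,Too many Binders sent to SYSTEM]"
    )
    if event is not None:
        print(event.process, event.pid, explain(event.reason))
```

Feeding it the lines from `logcat -b events` that contain `am_kill` gives a quick per-process summary of why each process was killed.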
ANR |
Rough classification
WindowManager: ANR in: triggered after 5 s when the window cannot obtain focus
Waited 5000ms for FocusEvent(hasFocus=true)) |
Waited 5000ms for KeyEvent (KeyEvent) |
Waited 5000ms for MotionEvent (MotionEvent) |
process is bad:Broadcast of Intent |
The main thread does not finish a Service lifecycle callback within the time limit (foreground 20 s, background 200 s)
startForegroundService() did not then call Service.startForeground() |
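Unable to launch app for process is bad: "process is bad" means the app has crashed too many times (2 consecutive crashes within a short period). mAm.startProcessLocked prints "process is bad" and the process cannot be started; this can also be forcibly controlled in the cleanUpApplicationRecordLocked method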
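The classification above can be applied to ANR reason strings with a small pattern table. This is a sketch under stated assumptions: the patterns come only from the log fragments listed in this section, and the category labels and the `classify_anr` name are hypothetical.

```python
import re

# (pattern, category) pairs; first match wins. Labels are illustrative only.
ANR_PATTERNS = [
    (re.compile(r"Waited \d+ms for FocusEvent"), "input: window has no focus"),
    (re.compile(r"Waited \d+ms for (?:KeyEvent|MotionEvent)"), "input: event dispatch timeout"),
    (re.compile(r"startForegroundService\(\) did not then call"), "service: startForeground() not called"),
    (re.compile(r"process is bad"), "process marked bad after repeated crashes"),
    (re.compile(r"Broadcast of Intent"), "broadcast timeout"),
]


def classify_anr(reason: str) -> str:
    """Return the first matching category for an ANR reason line."""
    for pattern, category in ANR_PATTERNS:
        if pattern.search(reason):
            return category
    return "unknown"


if __name__ == "__main__":
    for line in (
        "Waited 5000ms for FocusEvent(hasFocus=true))",
        "Waited 5000ms for MotionEvent (MotionEvent)",
        "startForegroundService() did not then call Service.startForeground()",
    ):
        print(classify_anr(line), "<-", line)
```

"process is bad" is checked before "Broadcast of Intent" so that a launch failure caused by a crash-looping process is not misreported as a broadcast timeout.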
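Troubleshooting steps
In the ANR trace file, check whether the main thread's call stack (sysTid = main thread) looks normal, e.g. "main" prio=5 tid=1 Blocked
Check whether CPU and memory resources are sufficient
For CPU load, look at the Load averages (>3 abnormal, >5 critical), e.g. Load 32.89 / 23.29 / 13.9 (the average number of processes over the last 1, 5, and 15 minutes, on 8 cores). For pressure stall info, two-digit some avg10 / full avg10 values are abnormal. High kswapd usage means frequent page faults while reading new files, e.g. 173/kswapd0: 0% user + 83% kernel [avg =1 avg60=4 avg300=2 indicates frequent IO]
Check input: Input Dispatcher State at time of last ANR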
Looper : Slow dispatch took 140ms android.fg h=android.os.Handler c=com.android.server.wm.ActivityMetricsLogger$$ExternalSyntheticLambda1@87eeca0 m
Choreographer: Skipped 494 frames! The application may be doing too much work on its main thread.
Binder latency: Slow Binder
dvm_lock_sample: waiting on a lock
lowmemorykiller: Kill 'com.jidu.map'
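To judge how long the main thread was stalled, the Choreographer and Looper lines from the steps above can be converted into milliseconds. A minimal sketch: the 60 Hz refresh rate is an assumption (pass the real panel rate if it differs), and the function names are hypothetical.

```python
import re
from typing import Optional, Tuple

_SKIPPED = re.compile(r"Choreographer: Skipped (\d+) frames")
_SLOW = re.compile(r"Looper\s*:\s*Slow (dispatch|delivery) took (\d+)ms (\S+)")


def skipped_frames_ms(line: str, refresh_hz: float = 60.0) -> Optional[float]:
    """Convert 'Skipped N frames' into stalled time in ms (assumes refresh_hz)."""
    m = _SKIPPED.search(line)
    if m is None:
        return None
    return int(m.group(1)) * 1000.0 / refresh_hz


def slow_looper(line: str) -> Optional[Tuple[str, int, str]]:
    """Return (kind, duration_ms, thread_name) from a Looper slow-log line."""
    m = _SLOW.search(line)
    if m is None:
        return None
    return m.group(1), int(m.group(2)), m.group(3)


if __name__ == "__main__":
    stall = skipped_frames_ms(
        "Choreographer: Skipped 494 frames! The application may be doing "
        "too much work on its main thread."
    )
    print(f"main thread stalled for about {stall:.0f} ms")
    print(slow_looper("Looper : Slow dispatch took 140ms android.fg h=android.os.Handler"))
```

At 60 Hz, 494 skipped frames corresponds to roughly 8.2 s of main-thread stall, which is already past the 5 s input ANR threshold mentioned earlier.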
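The si_code field records where the signal came from:
- SI_USER means the signal was sent by a user process (most likely via kill()), e.g. F DEBUG : signal 6 (SIGABRT), code 0 (SI_USER from pid 565, uid 0), fault addr (for example, under memory-leak pressure a process may be killed with kill -6)
- SI_KERNEL means the signal was generated by the kernel
- SI_QUEUE means the signal was sent by a process calling sigqueue()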
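The si_code values above can be pulled out of a tombstone header line with one regular expression. A minimal sketch, assuming the `signal N (NAME), code N (CODE ...)` layout seen in the examples in this document; `SignalInfo` and `parse_signal_line` are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class SignalInfo:
    signal: int                # e.g. 6
    name: str                  # e.g. "SIGABRT"
    code: int                  # si_code value, e.g. 0
    code_name: str             # e.g. "SI_USER"
    sender_pid: Optional[int]  # present only when the line says "from pid N"
    sender_uid: Optional[int]


_SIGNAL = re.compile(
    r"signal (\d+) \((\w+)\), code (-?\d+) "
    r"\((\w+)(?: from pid (\d+), uid (\d+))?\)"
)


def parse_signal_line(line: str) -> Optional[SignalInfo]:
    """Parse the 'signal ... code ...' header of a native crash dump."""
    m = _SIGNAL.search(line)
    if m is None:
        return None
    sig, name, code, code_name, pid, uid = m.groups()
    return SignalInfo(
        int(sig), name, int(code), code_name,
        int(pid) if pid is not None else None,
        int(uid) if uid is not None else None,
    )


if __name__ == "__main__":
    info = parse_signal_line(
        "F DEBUG : signal 6 (SIGABRT), code 0 (SI_USER from pid 565, uid 0), fault addr"
    )
    if info is not None and info.code_name == "SI_USER":
        print(f"killed by pid {info.sender_pid} (uid {info.sender_uid})")
```

When `code_name` is SI_USER, the sender pid is the first thing to look up in the process list captured with the log.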
https://zhuanlan.zhihu.com/p/77598393
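What is the use of actively killing a process with kill -6 pid so that it aborts/coredumps? (用kill -6 pid 主动杀死进程, 使进程abort/coredump, 有哪些用处?), CSDN blog
SIGHUP | terminates the process | terminal line hangup |
Understanding how process killing is implemented (理解杀进程的实现原理), Gityuan blog (Yuan Huihui's tech blog)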