Reposted from: http://blog.csdn.net/sonikk/article/details/9199865
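1. A program built with NDK_DEBUG=1 runs much slower than one built without it: roughly 2.5 times slower, which is a big difference.
2. With NDK_DEBUG=0, a single call from Java into a native function costs roughly 0.147 to 0.233 ms.
3. System-provided functions are much faster than copying memory by hand; using memcpy or memset, for example, speeds things up considerably.
4. Avoid division where possible. A better approach is to multiply by the reciprocal; the reciprocal can be worked out beforehand on a calculator and written directly into the code.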
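Points 3 and 4 above can be sketched in a few lines of C. This is only an illustration under my own assumptions, not the original author's code: the function names (`clear_pixels`, `copy_pixels`, `average3`) and the constant `RECIP_3_Q16` are hypothetical, and the division being replaced is a divide-by-3 chosen purely as an example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Clear a buffer with one library call instead of a byte-by-byte loop. */
void clear_pixels(uint8_t *dst, size_t n)
{
    assert(dst != NULL);
    memset(dst, 0, n);
}

/* Copy a buffer with one library call instead of a hand-written loop.
 * The two buffers must not overlap (a memcpy requirement). */
void copy_pixels(uint8_t *dst, const uint8_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, n);
}

/* 1/3 scaled by 2^16 and rounded up: ceil(65536 / 3) = 21846.
 * The value is computed by hand once and written into the code. */
#define RECIP_3_Q16 21846u

/* Average of three 8-bit channels without a division.
 * (sum * 21846) >> 16 equals sum / 3 for every sum in 0..765,
 * which is the full range of r + g + b. */
uint8_t average3(uint8_t r, uint8_t g, uint8_t b)
{
    uint32_t sum = (uint32_t)r + g + b;
    return (uint8_t)((sum * RECIP_3_Q16) >> 16);
}
```

The shift-and-multiply form is exact only for the input range it was checked against, so a precomputed reciprocal like this should be verified for the value range actually used.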
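5. Some Android CPUs are very slow at handling the float type. Below are the results of a test program of mine that computes grayscale values. Comparing the timings before and after the optimization shows the effect clearly, and it is immediate.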
- Test conditions: image resolution 500×750
- [pad] device: gt_p6800, dual-core 1.4 GHz
- [phone] device: nexus s, single-core 1.0 GHz
- [Before optimization] ----------------------------------------
- // [pad][debug=0]
- // [time] 14.132000 (ms)
- // [time] 15.066000 (ms)
- // [time] 25.767000 (ms)
- // [phone][debug=0]
- // [time] 175.110992 (ms)
- // [time] 196.481003 (ms)
- // [time] 113.485001 (ms)
- // [time] 162.052994 (ms)
- // [time] 178.488998 (ms) (average = 164.8)
- [After optimization] ----------------------------------------
- // [pad][debug=0]
- // [time] 12.625000 (ms)
- // [phone][debug=0]
- // [time] 48.099998 (ms)
- // [time] 52.244999 (ms)
- // [time] 27.365000 (ms)
- // [time] 34.356998 (ms)
- // [time] 23.222000 (ms)
- // [time] 31.214001 (ms) (average = 35.8)
- Conclusion: speedup after optimization: 164.8 / 35.8 = 4.6, a 360% improvement
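The test program itself is not shown, so the following is only a sketch of what replacing float arithmetic with int arithmetic can look like for a grayscale conversion. The weights 0.299 / 0.587 / 0.114, their integer approximations 77 / 150 / 29, the RGBA8888 byte layout, and all function names are my assumptions, not taken from the original program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Float version: gray = 0.299 R + 0.587 G + 0.114 B. */
uint8_t gray_float(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint8_t)(0.299f * r + 0.587f * g + 0.114f * b);
}

/* Integer version: the weights are scaled by 256 (77 + 150 + 29 = 256),
 * so a right shift by 8 replaces the final division. */
uint8_t gray_int(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint8_t)((77u * r + 150u * g + 29u * b) >> 8);
}

/* Convert a buffer of RGBA8888 pixels (assumed layout: R, G, B, A bytes)
 * into one gray byte per pixel using the integer formula. */
void rgba_to_gray(const uint8_t *rgba, uint8_t *gray, size_t pixel_count)
{
    assert(rgba != NULL && gray != NULL);
    for (size_t i = 0; i < pixel_count; ++i) {
        const uint8_t *p = rgba + 4 * i;
        gray[i] = gray_int(p[0], p[1], p[2]);
    }
}
```

Because the integer weights are rounded, the result can differ from the float version by one gray level, which is usually acceptable for image processing.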
6. Where necessary, use multithreading. It raises CPU utilization and thereby improves performance:
- Test conditions: image resolution 500×750
- [pad] device: gt_p6800, dual-core 1.4 GHz
- [phone] device: nexus s, single-core 1.0 GHz
- [Before optimization] ----------------------------------------
- // [pad][debug=0]
- // [time] 14.132000 (ms)
- // [time] 15.066000 (ms)
- // [time] 25.767000 (ms)
- // [phone][debug=0]
- // [time] 175.110992 (ms)
- // [time] 196.481003 (ms)
- // [time] 113.485001 (ms)
- // [time] 162.052994 (ms)
- // [time] 178.488998 (ms) (average = 164.8)
- [After optimization] ----------------------------------------
- // [pad][debug=0]
- // [time] 19.000000 (ms)
- // [time] 10.667000 (ms)
- // [time] 13.659000 (ms)
- // [time] 15.953000 (ms)
- // [time] 28.219999 (ms)
- // [time] 17.202999 (ms)
- // [phone][debug=0]
- // [time] 192.744003 (ms)
- // [time] 124.049004 (ms)
- // [time] 91.658997 (ms)
- // [time] 88.318001 (ms)
- // [time] 92.111000 (ms)
- // [time] 91.551003 (ms)
- // [time] 31.214001 (ms) (average = 101.7)
- Conclusion: speedup after optimization: 164.8 / 101.7 = 1.62, a 62% improvement
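The multithreaded version is not shown either. One common way to do it, sketched here with POSIX threads, is to split the image into horizontal bands and give each band to a worker thread. Everything in this sketch is an assumption for illustration: the `GrayJob` struct, the function names, the `MAX_THREADS` limit, the RGBA8888 layout, and the integer grayscale weights 77 / 150 / 29.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_THREADS 8

typedef struct {
    const uint8_t *rgba;  /* source pixels, RGBA8888 (assumed layout) */
    uint8_t *gray;        /* destination, one byte per pixel */
    size_t width;         /* image width in pixels */
    size_t row_begin;     /* first row handled by this worker */
    size_t row_end;       /* one past the last row handled */
} GrayJob;

/* Worker: converts the rows [row_begin, row_end) of the image. */
static void *gray_worker(void *arg)
{
    const GrayJob *job = (const GrayJob *)arg;
    for (size_t y = job->row_begin; y < job->row_end; ++y) {
        for (size_t x = 0; x < job->width; ++x) {
            size_t i = y * job->width + x;
            const uint8_t *p = job->rgba + 4 * i;
            job->gray[i] =
                (uint8_t)((77u * p[0] + 150u * p[1] + 29u * p[2]) >> 8);
        }
    }
    return NULL;
}

/* Splits the rows evenly across thread_count workers and waits for them.
 * Returns 0 on success, -1 if a thread could not be created (in that
 * case the rows of the threads that never started are left unconverted). */
int gray_parallel(const uint8_t *rgba, uint8_t *gray,
                  size_t width, size_t height, int thread_count)
{
    assert(rgba != NULL && gray != NULL);
    assert(thread_count >= 1 && thread_count <= MAX_THREADS);

    pthread_t threads[MAX_THREADS];
    GrayJob jobs[MAX_THREADS];
    int started = 0;
    int status = 0;

    for (int t = 0; t < thread_count; ++t) {
        jobs[t].rgba = rgba;
        jobs[t].gray = gray;
        jobs[t].width = width;
        jobs[t].row_begin = height * (size_t)t / (size_t)thread_count;
        jobs[t].row_end = height * (size_t)(t + 1) / (size_t)thread_count;
        if (pthread_create(&threads[t], NULL, gray_worker, &jobs[t]) != 0) {
            status = -1;
            break;
        }
        ++started;
    }
    for (int t = 0; t < started; ++t) {
        pthread_join(threads[t], NULL);
    }
    return status;
}
```

The bands do not overlap, so the workers need no locking. A call such as `gray_parallel(src, dst, 500, 750, 2)` would match the image size and the dual-core pad used in the measurements above.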
7. Combining the int optimization with multithreading improves the results further:
- Test conditions: image resolution 500×750
- [pad] device: gt_p6800, dual-core 1.4 GHz
- [phone] device: nexus s, single-core 1.0 GHz
- [Before optimization] ----------------------------------------
- // [pad][debug=0]
- // [time] 14.132000 (ms)
- // [time] 15.066000 (ms)
- // [time] 25.767000 (ms)
- // [phone][debug=0]
- // [time] 175.110992 (ms)
- // [time] 196.481003 (ms)
- // [time] 113.485001 (ms)
- // [time] 162.052994 (ms)
- // [time] 178.488998 (ms) (average = 164.8)
- [After optimization] ----------------------------------------
- // [pad][debug=0]
- // [time] 7.036000 (ms)
- // [time] 8.646000 (ms)
- // [time] 11.264000 (ms)
- // [time] 15.105000 (ms)
- // [time] 8.559000 (ms)
- // [time] 10.316000 (ms) (average = 10.154333)
- // [phone][debug=0]
- // [time] 35.727001 (ms)
- // [time] 42.995998 (ms)
- // [time] 25.725000 (ms)
- // [time] 19.454000 (ms)
- // [time] 27.643000 (ms)
- // [time] 29.118000 (ms) (average = 30.110)
- Conclusion: speedup after optimization: 164.8 / 30.110 = 5.47, a 447% improvement
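8. For even more speed, use NEON and assembly.
9. Use lookup tables for RGB <-> HSY color space conversion, and preprocess first to eliminate duplicate pixels.
10. Spread the work out over time. The processing does not have to happen at the moment the "process" button is tapped; background threads can do part of the work ahead of time, so that most of it is already computed by the time the result is needed.
11. Compute on a downscaled image, then upscale the result for the blending step.
12. Where real-time responsiveness matters, display a coarse, low-quality result first.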
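The lookup-table idea from point 9 can be illustrated with a much simpler transform than a color space conversion. The RGB <-> HSY formulas are not given here, so this sketch uses a stand-in: a per-channel gain and offset with clamping. The function names and the `gain_q8` parameter (gain scaled by 256) are hypothetical. The point is only the pattern: compute all 256 possible results once, then turn the per-pixel work into a single array read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill a 256-entry table once:
 * lut[v] = clamp(v * gain_q8 / 256 + offset, 0, 255).
 * gain_q8 is the gain scaled by 256, e.g. 512 means a gain of 2.0. */
void build_lut(uint8_t lut[256], int gain_q8, int offset)
{
    assert(lut != NULL);
    assert(gain_q8 >= 0 && gain_q8 <= 65536);
    assert(offset >= -255 && offset <= 255);
    for (int v = 0; v < 256; ++v) {
        int out = ((v * gain_q8) >> 8) + offset;
        if (out < 0) out = 0;
        if (out > 255) out = 255;
        lut[v] = (uint8_t)out;
    }
}

/* Apply the table in place: one array read per value, no arithmetic. */
void apply_lut(uint8_t *values, size_t n, const uint8_t lut[256])
{
    assert(values != NULL && lut != NULL);
    for (size_t i = 0; i < n; ++i) {
        values[i] = lut[values[i]];
    }
}
```

The table costs 256 computations up front no matter how large the image is, so the benefit grows with the pixel count and with the cost of the function being tabulated.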
Below are the timings of the algorithm from my recent work.
Early in the optimization it even took:
[pad] 4.5 s
[phone] 25-30 s
Mid-optimization:
- // [pad] [debug=0]
- // [time] 760.336975 (ms)
- // [time] 681.442993 (ms) // 684.579669 ms
- // [time] 623.708984 (ms) // 624.158042 ms
- // [time] 633.445984 (ms) // 634.257 ms
- // [time] 597.619019 (ms) // 597.847208 ms
- // [time] 627.309998 (ms) // 627.49325 ms
- // [time] 570.181030 (ms) // 570.961041 ms
- // [phone] [debug=0]
- // [time] 4956.109863 (ms) // 4962.556 ms
- // [time] 7306.398926 (ms) // 7313.900167 ms
- // [time] 5068.775879 (ms) // 5071.079875 ms
- // [time] 7162.111816 (ms) // 7163.000041 ms
- // [time] 6936.270996 (ms) // 6937.797708 ms
- // [time] 5545.577148 (ms) // 5549.424749 ms
- // [time] 5578.754883 (ms) // 5579.310792 ms
- // [time] 5208.092773 (ms) // 5214.470333 ms
- // [time] 5067.886230 (ms) // 5068.809999 ms
- // [time] 5506.863770 (ms) // 5507.421083 ms
Late optimization:
- // [pad] [debug=0]
- // [time] 477.792999 (ms) // 483.404708 ms
- // [time] 181.477997 (ms) // 182.157458 ms
- // [time] 179.610001 (ms) // 180.286042 ms
- // [time] 191.210007 (ms) // 191.778 ms
- // [time] 177.595001 (ms) // 179.568376 ms
- // [time] 175.041000 (ms) // 175.818125 ms
- // [phone] [debug=0]
- // [time] 795.302002 (ms) // 802.008249 ms
- // [time] 1108.545044 (ms) // 1113.701375 ms
- // [time] 932.200012 (ms) // 954.378375 ms
- // [time] 939.596008 (ms) // 940.612125 ms
- // [time] 834.835022 (ms) // 835.261416 ms
- // [time] 766.768982 (ms) // 767.523749 ms
- // [time] 539.195007 (ms) // 540.059709 ms
- // [time] 561.531006 (ms) // 562.323375 ms
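As the numbers show, from mid-optimization to late optimization the [phone] became almost 7 times faster, and memory usage dropped considerably as well.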
@sonikk 2013-6-30 13:32:05