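I came across an article on ImportNew (http://www.importnew.com/14841.html) about using the stream operations that Java 8 introduced for collections in place of the older loop-based approach, for example to group a collection or to deduplicate its contents. Code written with Java 8 streams is considerably more readable and easier to understand, which deserves real praise.
Because Java 8 streams make grouping a collection so easy to implement, I wanted to compare the performance of the two approaches and see whether there is a gap between them.
I extended the ImportNew example by adding two fields, country and province, to the original Article class. The examples that follow group articles along two dimensions, country and province, and then compare the performance of each approach.
The Article class: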
import java.util.List;

/**
 * ClassName: Article <br/>
 * Date: May 8, 2018, 10:31:03 AM <br/>
 *
 * @author fenglibin
 */
public class Article {
    private final String title;
    private final String author;
    private final List<String> tags;
    private final String countryCode;
    private final String province;

    public Article(String title, String author, List<String> tags, String countryCode, String province) {
        this.title = title;
        this.author = author;
        this.tags = tags;
        this.countryCode = countryCode;
        this.province = province;
    }

    public String getTitle() {
        return title;
    }

    public String getAuthor() {
        return author;
    }

    public List<String> getTags() {
        return tags;
    }

    public String getCountryCode() {
        return countryCode;
    }

    public String getProvince() {
        return province;
    }
}
Prepare some test data:
private static List<Article> articles = new ArrayList<Article>();
static {
    Article a1 = new Article("Hello World", "Tom", Arrays.asList("Hello", "World", "Tom"), "CN", "GD");
    Article a2 = new Article("Thank you teacher", "Bruce", Arrays.asList("Thank", "you", "teacher", "Bruce"), "CN",
                             "GX");
    Article a3 = new Article("Work is amazing", "Tom", Arrays.asList("Work", "amazing", "Tom"), "CN", "GD");
    Article a4 = new Article("New City", "Lucy", Arrays.asList("New", "City", "Lucy", "Good"), "US", "OT");
    articles.add(a1);
    articles.add(a2);
    articles.add(a3);
    articles.add(a4);
}
Grouping the ordinary way, with a loop:
/**
 * Grouping with a plain for loop: more tedious to write, but much faster in this test.
 */
private static void groupByCountryAndProvince_byNormal() {
    Map<String, Map<String, List<Article>>> result = new HashMap<String, Map<String, List<Article>>>();
    for (Article article : articles) {
        // Look up the per-country map, creating it on first use
        Map<String, List<Article>> pMap = result.get(article.getCountryCode());
        if (pMap == null) {
            pMap = new HashMap<String, List<Article>>();
            result.put(article.getCountryCode(), pMap);
        }
        // Look up the per-province list, creating it on first use
        List<Article> list = pMap.get(article.getProvince());
        if (list == null) {
            list = new ArrayList<Article>();
            pMap.put(article.getProvince(), list);
        }
        list.add(article);
    }
    result.forEach((cc, map) -> {
        System.out.println("Country Code is:" + cc);
        map.forEach((pc, list) -> {
            System.out.println(" Province Code is:" + pc);
            list.forEach((article) -> {
                System.out.println(" Article titile is:" + article.getTitle() + ",author is:"
                                   + article.getAuthor());
            });
        });
    });
}
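As a side note, the two null checks in the loop above can be written more compactly with Map.computeIfAbsent, which has been part of the Map interface since Java 8. The following is only a minimal sketch: the Item class and the class name ComputeIfAbsentGrouping are hypothetical stand-ins I introduce so the example is self-contained, and they are not part of the original test code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ComputeIfAbsentGrouping {

    // Hypothetical, simplified stand-in for Article: only the fields needed for grouping.
    public static final class Item {
        final String title;
        final String countryCode;
        final String province;

        public Item(String title, String countryCode, String province) {
            this.title = title;
            this.countryCode = countryCode;
            this.province = province;
        }
    }

    // Group by country, then by province; nested containers are created on first use.
    public static Map<String, Map<String, List<Item>>> group(List<Item> items) {
        Map<String, Map<String, List<Item>>> result = new HashMap<>();
        for (Item item : items) {
            result.computeIfAbsent(item.countryCode, k -> new HashMap<>())
                  .computeIfAbsent(item.province, k -> new ArrayList<>())
                  .add(item);
        }
        return result;
    }

    // Same country/province combinations as the test data above.
    public static List<Item> sampleItems() {
        return Arrays.asList(
            new Item("Hello World", "CN", "GD"),
            new Item("Thank you teacher", "CN", "GX"),
            new Item("Work is amazing", "CN", "GD"),
            new Item("New City", "US", "OT"));
    }

    public static void main(String[] args) {
        group(sampleItems()).forEach((country, byProvince) ->
            byProvince.forEach((province, list) ->
                System.out.println(country + "/" + province + ": " + list.size())));
    }
}
```

This keeps the loop-based approach but removes most of the boilerplate, so it sits somewhere between the hand-written loop and the stream version in terms of readability.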
Grouping with a sequential stream:
/**
 * Grouping along multiple dimensions with Collectors on a sequential stream:
 * very convenient, but the slowest of the three in this test.
 */
private static void groupByCountryAndProvince() {
    Map<String, Map<String, List<Article>>> result = articles.stream()
        .collect(Collectors.groupingBy(Article::getCountryCode,
                                       Collectors.groupingBy(Article::getProvince)));
    result.forEach((cc, map) -> {
        System.out.println("Country Code is:" + cc);
        map.forEach((pc, list) -> {
            System.out.println(" Province Code is:" + pc);
            list.forEach((article) -> {
                System.out.println(" Article titile is:" + article.getTitle() + ",author is:"
                                   + article.getAuthor());
            });
        });
    });
}
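Since the goal here is two-dimensional statistics, it may be worth noting that the inner groupingBy can take a further downstream collector, for example Collectors.counting(), so that the result holds a count per country and province instead of a list. A minimal sketch, assuming a hypothetical simplified Item class (my own stand-in, not the Article class from the test):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupingCountSketch {

    // Hypothetical, simplified stand-in for Article.
    public static final class Item {
        private final String countryCode;
        private final String province;

        public Item(String countryCode, String province) {
            this.countryCode = countryCode;
            this.province = province;
        }

        public String getCountryCode() {
            return countryCode;
        }

        public String getProvince() {
            return province;
        }
    }

    // Count items per country and province with a nested groupingBy.
    public static Map<String, Map<String, Long>> countByCountryAndProvince(List<Item> items) {
        return items.stream()
            .collect(Collectors.groupingBy(Item::getCountryCode,
                     Collectors.groupingBy(Item::getProvince, Collectors.counting())));
    }

    // Same country/province combinations as the test data above.
    public static List<Item> sampleItems() {
        return Arrays.asList(
            new Item("CN", "GD"),
            new Item("CN", "GX"),
            new Item("CN", "GD"),
            new Item("US", "OT"));
    }

    public static void main(String[] args) {
        countByCountryAndProvince(sampleItems()).forEach((country, counts) ->
            counts.forEach((province, count) ->
                System.out.println(country + "/" + province + ": " + count)));
    }
}
```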
Grouping with a parallel stream:
/**
 * Grouping along multiple dimensions with Collectors on a parallel stream:
 * much faster than the sequential stream in this test.
 * The change is trivial: replace stream() with parallelStream().
 */
private static void groupByCountryAndProvinceParallel() {
    Map<String, Map<String, List<Article>>> result = articles.parallelStream()
        .collect(Collectors.groupingBy(Article::getCountryCode,
                                       Collectors.groupingBy(Article::getProvince)));
    result.forEach((cc, map) -> {
        System.out.println("Country Code is:" + cc);
        map.forEach((pc, list) -> {
            System.out.println(" Province Code is:" + pc);
            list.forEach((article) -> {
                System.out.println(" Article titile is:" + article.getTitle() + ",author is:"
                                   + article.getAuthor());
            });
        });
    });
}
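For parallel streams the JDK also provides Collectors.groupingByConcurrent, which accumulates into a single ConcurrentMap rather than merging per-thread maps. The sketch below shows how the same two-level grouping could look with it. The Item class is a hypothetical simplified stand-in of my own, and whether this variant is actually faster than groupingBy for a given data set was not measured in this post. Note also that groupingByConcurrent is an unordered collector, so the order of elements inside each list is not guaranteed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;

public class ConcurrentGroupingSketch {

    // Hypothetical, simplified stand-in for Article.
    public static final class Item {
        private final String countryCode;
        private final String province;

        public Item(String countryCode, String province) {
            this.countryCode = countryCode;
            this.province = province;
        }

        public String getCountryCode() {
            return countryCode;
        }

        public String getProvince() {
            return province;
        }
    }

    // Two-level grouping on a parallel stream, accumulating into concurrent maps.
    public static ConcurrentMap<String, ConcurrentMap<String, List<Item>>> group(List<Item> items) {
        return items.parallelStream()
            .collect(Collectors.groupingByConcurrent(Item::getCountryCode,
                     Collectors.groupingByConcurrent(Item::getProvince)));
    }

    // Same country/province combinations as the test data above.
    public static List<Item> sampleItems() {
        return Arrays.asList(
            new Item("CN", "GD"),
            new Item("CN", "GX"),
            new Item("CN", "GD"),
            new Item("US", "OT"));
    }

    public static void main(String[] args) {
        group(sampleItems()).forEach((country, byProvince) ->
            byProvince.forEach((province, list) ->
                System.out.println(country + "/" + province + ": " + list.size())));
    }
}
```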
Add the following code and run it:
public static void main(String[] args) {
    long start = System.currentTimeMillis();
    groupByCountryAndProvince();
    long end = System.currentTimeMillis();
    System.out.println("Sequential stream grouping time (ms):" + (end - start) + "\n");
    start = System.currentTimeMillis();
    groupByCountryAndProvinceParallel();
    end = System.currentTimeMillis();
    System.out.println("Parallel stream grouping time (ms):" + (end - start) + "\n");
    start = System.currentTimeMillis();
    groupByCountryAndProvince_byNormal();
    end = System.currentTimeMillis();
    System.out.println("Plain loop grouping time (ms):" + (end - start));
}
The output is as follows:
Country Code is:CN
 Province Code is:GX
 Article titile is:Thank you teacher,author is:Bruce
 Province Code is:GD
 Article titile is:Hello World,author is:Tom
 Article titile is:Work is amazing,author is:Tom
Country Code is:US
 Province Code is:OT
 Article titile is:New City,author is:Lucy
Sequential stream grouping time (ms):70
Country Code is:CN
 Province Code is:GX
 Article titile is:Thank you teacher,author is:Bruce
 Province Code is:GD
 Article titile is:Hello World,author is:Tom
 Article titile is:Work is amazing,author is:Tom
Country Code is:US
 Province Code is:OT
 Article titile is:New City,author is:Lucy
Parallel stream grouping time (ms):5
Country Code is:CN
 Province Code is:GX
 Article titile is:Thank you teacher,author is:Bruce
 Province Code is:GD
 Article titile is:Hello World,author is:Tom
 Article titile is:Work is amazing,author is:Tom
Country Code is:US
 Province Code is:OT
 Article titile is:New City,author is:Lucy
Plain loop grouping time (ms):1
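Repeated runs gave broadly similar numbers. The example shows that streams do make the code much cleaner, but even the parallel stream remains noticeably slower than the plain loop in this test. In real applications, especially under high concurrency, this trade-off deserves careful thought before streams are used; after all, you cannot have your cake and eat it too.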
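One thing to keep in mind when reading these numbers: each method is timed only once, the timed region includes the System.out.println calls, and the first method to run likely also pays one-time costs such as class loading for the stream and lambda machinery. The following is a rough sketch of a somewhat fairer measurement. The Item class, the generated data set of 100,000 items, and the round counts are all my own assumptions and not part of the original test. It warms up all three variants, keeps printing out of the timed region, and averages several runs. No results are listed because they depend on the machine and the JVM; for serious measurements a dedicated harness such as JMH would be the better choice.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class GroupingTimingSketch {

    // Hypothetical, simplified stand-in for Article.
    public static final class Item {
        private final String countryCode;
        private final String province;

        public Item(String countryCode, String province) {
            this.countryCode = countryCode;
            this.province = province;
        }

        public String getCountryCode() { return countryCode; }
        public String getProvince() { return province; }
    }

    // Generate n items spread over 5 countries and 20 provinces (arbitrary choice).
    public static List<Item> generate(int n) {
        List<Item> items = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            items.add(new Item("C" + (i % 5), "P" + (i % 20)));
        }
        return items;
    }

    // Loop-based grouping, written with computeIfAbsent for brevity.
    public static Map<String, Map<String, List<Item>>> byLoop(List<Item> items) {
        Map<String, Map<String, List<Item>>> result = new HashMap<>();
        for (Item item : items) {
            result.computeIfAbsent(item.getCountryCode(), k -> new HashMap<>())
                  .computeIfAbsent(item.getProvince(), k -> new ArrayList<>())
                  .add(item);
        }
        return result;
    }

    public static Map<String, Map<String, List<Item>>> byStream(List<Item> items) {
        return items.stream().collect(Collectors.groupingBy(Item::getCountryCode,
                Collectors.groupingBy(Item::getProvince)));
    }

    public static Map<String, Map<String, List<Item>>> byParallelStream(List<Item> items) {
        return items.parallelStream().collect(Collectors.groupingBy(Item::getCountryCode,
                Collectors.groupingBy(Item::getProvince)));
    }

    // Run the grouping `rounds` times and return the average time per run in nanoseconds.
    public static long averageNanos(Function<List<Item>, Map<String, Map<String, List<Item>>>> grouping,
                                    List<Item> items, int rounds) {
        long total = 0;
        for (int i = 0; i < rounds; i++) {
            long start = System.nanoTime();
            Map<String, Map<String, List<Item>>> result = grouping.apply(items);
            total += System.nanoTime() - start;
            // Use the result so the work cannot be skipped entirely.
            if (result.isEmpty() && !items.isEmpty()) {
                throw new IllegalStateException("unexpected empty result");
            }
        }
        return total / rounds;
    }

    public static void main(String[] args) {
        List<Item> items = generate(100_000);
        // Warm-up: let class loading and JIT compilation happen before measuring.
        for (int i = 0; i < 20; i++) {
            byLoop(items);
            byStream(items);
            byParallelStream(items);
        }
        System.out.println("loop avg ns:            " + averageNanos(GroupingTimingSketch::byLoop, items, 20));
        System.out.println("sequential stream avg ns: " + averageNanos(GroupingTimingSketch::byStream, items, 20));
        System.out.println("parallel stream avg ns:   " + averageNanos(GroupingTimingSketch::byParallelStream, items, 20));
    }
}
```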