I have the following array as input:
String[] input = new String[] {
    "This is a sample string",
    " string ", // additional spaces here cause issues while splitting
    "Another sample string",
    "This is not a sample string"
};
I need to count how often each individual word occurs. The desired output is:
{a=2, not=1, string=4, This=2, is=2, sample=3, Another=1}
So far, I have some working code:
// 1. Convert String[] into a single " " delimited String
String joined = String.join(" ", input);
// 2. Split on " " and then calculate count using Collectors.groupingBy
Map<String, Long> output =
    Arrays
        .stream(joined.split(" "))
        .filter(s -> !s.equals("")) // To deal with empty strings
        .collect(
            Collectors.groupingBy(
                Function.identity(),
                Collectors.counting()
            )
        );
System.out.println(output);
This looks crude to me. Please suggest a better way to do this using the Streams API.
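For reference, one possible direction is to skip the intermediate joined String and flat-map each array element into its words. The following is only a sketch under assumptions: the class name WordFrequency and the helper countWords are hypothetical names introduced for illustration, and words are assumed to be separated by runs of whitespace.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class WordFrequency {

    // Hypothetical helper (not part of the original code): counts how often
    // each whitespace-separated word occurs across all input strings.
    static Map<String, Long> countWords(String[] input) {
        // Assumption: one or more whitespace characters separate words.
        Pattern whitespace = Pattern.compile("\\s+");
        return Arrays.stream(input)
                .map(String::trim)                  // drop leading/trailing spaces
                .flatMap(whitespace::splitAsStream) // one stream element per word
                .filter(word -> !word.isEmpty())    // an all-blank entry yields ""
                .collect(Collectors.groupingBy(
                        Function.identity(),
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        String[] input = new String[] {
            "This is a sample string",
            " string ",
            "Another sample string",
            "This is not a sample string"
        };
        System.out.println(countWords(input));
    }
}
```

The returned map makes no promise about iteration order, so the printed order may differ from the desired output shown earlier. If a stable order matters, the three-argument overload of Collectors.groupingBy accepts a map factory such as TreeMap::new.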