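import numpy as np
# Assumed inputs (the histogram call is not shown in the original); they reproduce the output below.
cnt, bins = np.array([3, 2]), np.array([10., 160., 310.])
# Count-weighted averages of the left and right bin edges bound the true mean.
low = np.average(bins[:-1], weights=cnt)
high = np.average(bins[1:], weights=cnt)
print(f'The average is in the {low}-{high} range.')
# The average is in the 70.0-220.0 range.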
For the median:
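# Find the bin in which the cumulative count first reaches half of the total.
half = cnt.sum() / 2  # assumed definition; `half` is not defined in the original snippet
cnt_cumsum = np.cumsum(cnt)
idx = np.searchsorted(cnt_cumsum, half)
low = bins[idx]
high = bins[idx+1]
print(f'The median is in the {low}-{high} range.')
# The median is in the 10.0-160.0 range.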
Example with 1000 random values and 20 bins:
True data mean: 0.496, median: 0.481
The average is in the 0.471-0.521 range.
The median is in the 0.45-0.5 range.
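A run like this one can be reproduced with the minimal sketch below. The seed, the uniform distribution on [0, 1), and the helper names `mean_bounds` and `median_bounds` are assumptions made for illustration, so the printed numbers will not match the figures above exactly. The median helper also looks at the bin of the upper middle value, which only matters when a bin edge splits the data exactly in half.

```python
import numpy as np

def mean_bounds(cnt, bins):
    """Lower/upper bound on the mean, given counts and bin edges."""
    low = np.average(bins[:-1], weights=cnt)   # every value at its left edge
    high = np.average(bins[1:], weights=cnt)   # every value at its right edge
    return low, high

def median_bounds(cnt, bins):
    """Lower/upper bound on the median, given counts and bin edges."""
    half = cnt.sum() / 2
    cnt_cumsum = np.cumsum(cnt)
    # side='left' gives the bin of the lower middle value, side='right' the
    # bin of the upper middle value; the two differ only when the cumulative
    # count hits exactly half of the total at a bin edge.
    lo_idx = np.searchsorted(cnt_cumsum, half, side='left')
    hi_idx = np.searchsorted(cnt_cumsum, half, side='right')
    return bins[lo_idx], bins[hi_idx + 1]

rng = np.random.default_rng(0)   # assumed seed, not from the original
data = rng.random(1000)          # assumed: uniform values in [0, 1)
cnt, bins = np.histogram(data, bins=20)

mean_lo, mean_hi = mean_bounds(cnt, bins)
med_lo, med_hi = median_bounds(cnt, bins)

print(f'True data mean: {data.mean():.3f}, median: {np.median(data):.3f}')
print(f'The average is in the {mean_lo:.3f}-{mean_hi:.3f} range.')
print(f'The median is in the {med_lo:.3f}-{med_hi:.3f} range.')
```

With equal-width bins the mean range is exactly one bin width wide, so more bins give tighter bounds.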
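No. Once the data are aggregated into a histogram, part of the original information is lost, so you cannot compute the exact mean or median of the original population.
To demonstrate, here are two different arrays (with different means/medians) that give the same counts and bins: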
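The arrays themselves are missing here, so the following is a minimal sketch with hypothetical arrays `a1` and `a2` (illustrative values, not necessarily the original ones). Both share the same minimum, maximum, and per-bin counts, so a 2-bin histogram cannot tell them apart:

```python
import numpy as np

# Hypothetical arrays chosen for illustration: same min, max and per-bin counts.
a1 = np.array([10, 20, 30, 200, 310])
a2 = np.array([10, 140, 150, 170, 310])

cnt1, bins1 = np.histogram(a1, bins=2)
cnt2, bins2 = np.histogram(a2, bins=2)

# Both histograms have counts 3 and 2 with bin edges 10, 160, 310.
print(cnt1, bins1)
print(cnt2, bins2)

# The underlying statistics differ nonetheless.
print(np.mean(a1), np.median(a1))  # 114.0 30.0
print(np.mean(a2), np.median(a2))  # 156.0 150.0
```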
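Approximation
However, you can bound the mean: it must lie between the count-weighted averages of the left and right bin edges.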