Questions tagged [numpy] - Page 1

march_1

Asked: 2025-01-18 20:53:49 +0800 CST

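TypeError: unhashable type 'Series' / 'numpy.ndarray'

5

I am trying to prepare data for a visualization with seaborn, so I need counts of several different session types for a multi-line plot.

With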
session_cnt = df.groupby(df['EVENT_DATETIME'].dt.date, df['CUSTOMER_ID']).agg(session_count=('SESSION_ID', 'nunique'), app_session_cnt=('APP_SESSION_ID', 'nunique')).reset_index()

I got

TypeError: unhashable type: 'Series'

So I did this:

session_cnt = df.groupby(df['EVENT_DATETIME'].dt.date, df['CUSTOMER_ID'].values).agg(session_count=('SESSION_ID', 'nunique'), app_session_cnt=('APP_SESSION_ID', 'nunique')).reset_index()

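but got

TypeError: unhashable type: 'numpy.ndarray'

I would like to understand which column to check when groupby raises a TypeError, because right now I can only guess. Maybe I need to read a good article about this error.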
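One likely cause, offered as a sketch rather than a confirmed diagnosis: `groupby` takes all grouping keys in its first argument, so two keys have to be wrapped in a single list. A second positional argument is bound to a different parameter (`axis` in many pandas versions), and pandas then tries to hash it. That would explain why the error names the type of the second argument rather than a column. The toy frame below is made-up data that only reuses the column names from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "EVENT_DATETIME": pd.to_datetime([
        "2025-01-01 08:00", "2025-01-01 09:00",
        "2025-01-01 10:00", "2025-01-02 08:00",
    ]),
    "CUSTOMER_ID": [1, 1, 2, 1],
    "SESSION_ID": ["s1", "s2", "s3", "s4"],
    "APP_SESSION_ID": ["a1", "a1", "a2", "a3"],
})

# Both keys go into ONE list, passed as the first argument to groupby.
keys = [df["EVENT_DATETIME"].dt.date, df["CUSTOMER_ID"]]
session_cnt = (
    df.groupby(keys)
      .agg(session_count=("SESSION_ID", "nunique"),
           app_session_cnt=("APP_SESSION_ID", "nunique"))
      .reset_index()
)
print(session_cnt)
```

With this shape of call, no column needs to be "checked": the error in the question would come from how the keys are passed, not from the data.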

Kuraga

Asked: 2025-01-08 02:37:00 +0800 CST

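Projecting an index from another array

7

I have an array a of shape (M, N, K), and an array b of shape (M, N) whose integer values lie in [0, K-1].

What is the simplest way to get an array c of shape (M, N) such that c[i, j] == a[i, j, b[i, j]]?

Which part of the indexing guide covers this?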
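Two ways this could be written, as a minimal sketch on random data (the sizes and the variable names `c1`, `c2`, `i`, `j` are chosen here for illustration). The second form is what the NumPy indexing guide calls integer array indexing.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, K = 3, 4, 5
a = rng.random((M, N, K))
b = rng.integers(0, K, size=(M, N))

# Option 1: take_along_axis needs b to have the same number of dimensions
# as a, so add a trailing axis and drop it again afterwards.
c1 = np.take_along_axis(a, b[..., None], axis=-1)[..., 0]

# Option 2: integer array indexing. i has shape (M, 1) and j has shape (1, N);
# together with b they broadcast to (M, N).
i, j = np.ogrid[:M, :N]
c2 = a[i, j, b]
```

Both results have shape (M, N) and satisfy c[i, j] == a[i, j, b[i, j]].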

user28311778

Asked: 2025-01-07 17:28:28 +0800 CST

How do I get the NumPy array I want

6

How can I transform an array like this?

arr = [
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
[20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31],
[30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41],
...,
[130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141],
]

=> [
[ [0, 1], [20, 21], [30, 31], ,.., [130, 131]],
[ [2, 3], [22, 23], [32, 33], .., [132, 133]],
[ [4, 5], [24, 25], [34, 35], .., [134, 135]],
[ [6, 7], [26, 27], [36, 37], .., [136, 137]],
[ [8, 9], [28, 29], [38, 39], .., [138, 139]],
[ [10, 11], [30, 31], [40, 41], .., [140, 141]],
...
]

The best my attempts produced is the following, but it is not the result I want.

[ [0, 1], [2, 3], [4, 5], ,.., [10, 11]],
[ [20, 21], [22, 23], [24, 25], .., [30, 31]],
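The attempt above looks like the output of a plain reshape into pairs. A sketch of one way to get from there to the desired layout is to swap the first two axes afterwards. The array below is a stand-in that keeps only the rows spelled out in the question; the elided rows are left out.

```python
import numpy as np

# Stand-in for the array in the question: each row holds 12 consecutive values.
starts = [0, 20, 30, 130]          # rows hidden behind "..." are omitted
arr = np.array([np.arange(s, s + 12) for s in starts])

# Step 1: split every row into 6 pairs -> shape (rows, 6, 2).
# This is the intermediate result shown in the attempt.
pairs = arr.reshape(arr.shape[0], -1, 2)

# Step 2: put the pair index first -> shape (6, rows, 2).
result = pairs.transpose(1, 0, 2)

print(result[0])   # the first pair of every row
```

`transpose` returns a view, so call `np.ascontiguousarray(result)` if a contiguous copy is needed later.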

Shantanu Gontia

Asked: 2025-01-07 04:29:30 +0800 CST

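NumPy float to half-float conversion: round-to-nearest-even (RNE) when the result is subnormal

7

I am trying to understand how NumPy implements round-to-nearest-even when converting to a lower-precision format, in this case float32 to float16, specifically when the number is normal in float32 but rounds to a subnormal in float16.

Link to the code: https://github.com/numpy/numpy/blob/13a5c4e569269aa4da6784e2ba83107b53f73bc9/numpy/core/src/npymath/halffloat.c#L244-L365

My understanding is as follows.

In float32, a number has the bits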

31	30	29	28	27	26	25	24	23	22	21	20	19	18	17	16	15	14	13	12	11	10	9	8	7	6	5	4	3	2	1	0
s	e0	e1	e2	e3	e4	e5	e6	e7	m0	m1	m2	m3	m4	m5	m6	m7	m8	m9	m10	m11	m12	m13	m14	m15	m16	m17	m18	m19	m20	m21	m22

        /*
         * If the last bit in the half significand is 0 (already even), and
         * the remaining bit pattern is 1000...0, then we do not add one
         * to the bit after the half significand. However, the (113 - f_exp)
         * shift can lose up to 11 bits, so the || checks them in the original.
         * In all other cases, we can just add one.
         */
        if (((f_sig&0x00003fffu) != 0x00001000u) || (f&0x000007ffu)) {
            f_sig += 0x00001000u;
        }

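The code above breaks ties for round-to-nearest-even. I don't understand why, in the second part of the logical OR, we bitwise-AND with 0x0000'07ffu (bits m12-m22) rather than with 0x0000'ffffu (m11-m22).

Once the significand bits have been aligned to float16's subnormal format (which is what the shift before this piece of code does), in the float32 representation above, bits m10 through m22 decide the direction of rounding.

My understanding is that the second part of the OR checks whether the number is above the halfway point and, if so, adds one to the half-significand bit. But isn't it then checking only a subset of the numbers above the halfway point in the original number? In the float16 number, m9 would be the last bit of precision to keep. So we would round up if:

m9 is 1, m10 is 1, and m11-m22 are all 0 (the first part of the OR)
m10 is 1 and at least one of m11-m22 is 1 (putting the number above the halfway point)

If any of m11-m22 is 1, this can be simplified to adding 1 at m10: if m10 is already 1, the addition carries into m9; otherwise m9 is unaffected. However, in the NumPy code, the bits being checked are m12-m22.

I am not sure what I am missing. Is this a special case?

I expected bits m11-m22, rather than m12-m22, to decide whether to add 1.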
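The observable behavior of the conversion can be checked from Python without reading the C code. The sketch below does not explain the 0x07ff mask; it only probes a few float32 inputs around the smallest float16 subnormal (2**-24) and prints the resulting float16 bit patterns. The helper name `half_bits` is made up for this example.

```python
import numpy as np

def half_bits(x):
    """Return the raw float16 bit pattern NumPy produces for a float32 value."""
    as_half = np.array([x], dtype=np.float32).astype(np.float16)
    return int(as_half.view(np.uint16)[0])

tiny = np.float32(2.0 ** -24)                  # smallest positive float16 subnormal
tie_even = np.float32(2.0 ** -25)              # exactly halfway between 0 and tiny
above = np.nextafter(tie_even, np.float32(1))  # one float32 step above that tie
tie_odd = np.float32(3 * 2.0 ** -25)           # halfway between 1*tiny and 2*tiny

print(half_bits(tiny))      # 1: exactly representable
print(half_bits(tie_even))  # 0: tie, rounds to the even neighbor (0)
print(half_bits(above))     # 1: just above the tie, rounds up
print(half_bits(tie_odd))   # 2: tie, rounds to the even neighbor (2)
```

For a float16 subnormal the bit pattern equals the number of 2**-24 steps, which makes the rounding direction easy to read off.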

user2912230

Asked: 2024-12-28 21:20:14 +0800 CST

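NumPy not detected when installing aeneas

6

On Windows with Python 3.13.1, when I run pip install for a package (aeneas), I cannot get past this error: You must install numpy before installing aeneas

I have tried many different approaches, including following older Stack Overflow posts on this issue.

I expected that at least this approach would work: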

python -m venv myenv
.\myenv\Scripts\Activate
pip install numpy
pip list

Package Version
------- -------
numpy   2.2.1
pip     24.3.1

pip install aeneas

Collecting aeneas
  Using cached aeneas-1.7.3.0.tar.gz (5.5 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [3 lines of output]
      [ERRO] You must install numpy before installing aeneas
      [INFO] Try the following command:
      [INFO] $ sudo pip install numpy
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

pip install setuptools
pip install --upgrade pip setuptools wheel
pip install --no-build-isolation aeneas

...You must install numpy before installing aeneas

Ben

Asked: 2024-12-06 03:29:25 +0800 CST

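Efficient custom array creation routines in JAX

6

I am still getting the hang of best practices in jax. My main question is the following:

What is the best practice for implementing custom array creation routines in jax?

For example, I want to implement a function that creates a matrix that is zero everywhere except for ones in a given column. I went with this (Jupyter notebook):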

import numpy as np
import jax.numpy as jnp

def ones_at_col(shape_mat, idx):
    idxs = jnp.arange(shape_mat[1])[None,:]
    mat = jnp.where(idx==idxs, 1, 0)
    mat = jnp.repeat(mat, shape_mat[0], axis=0)
    return mat

shape_mat = (5,10)

print(ones_at_col(shape_mat, 5))

%timeit np.zeros(shape_mat)

%timeit jnp.zeros(shape_mat)

%timeit ones_at_col(shape_mat, 5)

The output is

[[0 0 0 0 0 1 0 0 0 0]
 [0 0 0 0 0 1 0 0 0 0]
 [0 0 0 0 0 1 0 0 0 0]
 [0 0 0 0 0 1 0 0 0 0]
 [0 0 0 0 0 1 0 0 0 0]]
127 ns ± 0.717 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
31.3 µs ± 331 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
123 µs ± 1.79 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

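My function is about 4 times slower than the regular jnp.zeros(), which is not too bad. That tells me I am not doing anything crazy.

But both jax routines are much slower than the equivalent numpy routine. These functions cannot be jitted because they take the shape as an argument and so cannot be traced. I guess that is why they are inherently slow? And I guess that if either of them appeared within the scope of another jitted function, they could be traced and sped up?

Can I do better, or am I pushing the limits of what is possible in jax?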
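On the jitting point, a shape argument does not have to be traced: `jax.jit` accepts `static_argnums`, which treats the marked arguments as compile-time constants (they must be hashable, so a tuple works). The sketch below is one possible rewrite, not a benchmarked recommendation. It also swaps the `where`/`repeat` construction for an indexed update with `.at[...].set(...)`.

```python
from functools import partial

import jax
import jax.numpy as jnp

# shape_mat (argument 0) is static: each distinct shape triggers one compilation.
# idx stays a traced value, so changing it does not recompile.
@partial(jax.jit, static_argnums=0)
def ones_at_col(shape_mat, idx):
    mat = jnp.zeros(shape_mat, dtype=jnp.int32)
    return mat.at[:, idx].set(1)   # functional update, returns a new array

out = ones_at_col((5, 10), 5)
print(out)
```

When timing this in a notebook, it may be fairer to call `ones_at_col(shape_mat, 5).block_until_ready()` once before `%timeit`, so that compilation and asynchronous dispatch are not mixed into the measurement. For arrays this small, per-call dispatch overhead may still dominate compared with numpy.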

BlackPhoenix

Asked: 2024-12-04 17:48:43 +0800 CST

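Mean and median of the distribution given a numpy histogram

7

Suppose you have a numpy histogram computed from some data (which you do not have access to), so you only know the bins and the counts. Is there an efficient way to compute the mean and median of the distribution described by the histogram?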
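Since the raw data is gone, any answer is an estimate that depends on an assumption about how values are spread inside each bin. The sketch below assumes they are spread uniformly within a bin. The mean then uses the bin centers weighted by the counts, and the median is read off a piecewise-linear cumulative distribution. The function name `hist_mean_median` is made up for this example.

```python
import numpy as np

def hist_mean_median(counts, edges):
    """Estimate mean and median from np.histogram output (counts, bin edges).

    Assumes values are uniformly distributed inside every bin.
    """
    counts = np.asarray(counts, dtype=float)
    edges = np.asarray(edges, dtype=float)

    # Mean: each bin contributes its center, weighted by its count.
    centers = 0.5 * (edges[:-1] + edges[1:])
    mean = np.average(centers, weights=counts)

    # Median: the cumulative fraction is known at each bin edge;
    # interpolate linearly to find where it reaches 0.5.
    cdf = np.concatenate(([0.0], np.cumsum(counts))) / counts.sum()
    median = np.interp(0.5, cdf, edges)
    return mean, median

mean, median = hist_mean_median([1, 2, 1], [0.0, 1.0, 2.0, 3.0])
print(mean, median)   # 1.5 1.5
```

If the 0.5 level falls exactly on a run of empty bins, the median is not unique and the interpolated value is just one point of that flat stretch.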

julian2000P

Asked: 2024-12-03 23:21:47 +0800 CST

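Why is my matrix multiplication with numpy so slow?

6

I am trying to multiply two fairly large matrices in numpy. See the 3 methods below. I generated the 3 matrices at random to demonstrate my problem. The first matrix, Y1[:,:,0], is a slice of a larger 3d array. The second is a .copy() of that matrix, and the third is a standalone matrix.

Why is the first multiplication so much slower than the other two?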

import numpy as np
from time import time

Y1 = np.random.uniform(-1, 1, (5000, 1093, 201))
Y2 = Y1[:,:,0].copy()
Y3 = np.random.uniform(-1, 1, (5000, 1093))

W = np.random.uniform(-1, 1, (1093, 30))

# method 1
START = time()
Y1[:,:,0].dot(W)
END = time()
print(f"Method 1 : {END - START}")

# method 2
START = time()
Y2.dot(W)
END = time()
print(f"Method 2 : {END - START}")

# method 3
START = time()
Y3.dot(W)
END = time()
print(f"Method 3 : {END - START}")

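The output times are roughly 34, 0.06 and 0.06 seconds respectively.

I see the difference: while the last two matrices are "real" 2d arrays, the first one is a slice of my larger 3d array.

Is it the subsetting Y1[:,:,0] that makes it so slow? Also, I noticed that creating the copy of Y1 for the matrix Y2 is quite slow as well.

In the end, I am given this 3d array and have to repeatedly compute the matrix product of slices of Y1 with a (possibly different) matrix W. Is there a better/faster way to do this?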
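A plausible explanation, not measured on the asker's machine: Y1[:,:,0] is a strided view in which neighboring elements sit 201 values apart in memory, so it is not contiguous, and the multiplication may not get the fast path that a contiguous array gets. The sketch below uses small made-up sizes to show the layout difference. It also shows one possible workaround when many slices are needed: move the sliced axis to the front once, so that every slice is contiguous.

```python
import numpy as np

rng = np.random.default_rng(0)
Y1 = rng.uniform(-1, 1, (50, 40, 7))   # small stand-in for (5000, 1093, 201)
W = rng.uniform(-1, 1, (40, 3))

view = Y1[:, :, 0]
print(view.flags["C_CONTIGUOUS"], view.strides)   # non-contiguous strided view

# One-time relayout: the sliced axis goes first, giving shape (7, 50, 40).
# This copies the whole array once, which costs time and memory up front.
Yt = np.ascontiguousarray(np.moveaxis(Y1, -1, 0))

k = 0
fast_slice = Yt[k]                                # contiguous 2d block
print(fast_slice.flags["C_CONTIGUOUS"])
result = fast_slice @ W
```

If only a few slices are ever used, copying just those with `np.ascontiguousarray(Y1[:, :, k])` may be cheaper than relaying out the whole array.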

Thanks in advance!

Ben

Asked: 2024-11-26 08:30:40 +0800 CST

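Forming a list of elements from a scalar and a matrix

6

I have a zero-dimensional numpy scalar s and a two-dimensional numpy matrix m. I want to form a matrix of vectors in which every element of m is paired with s, as in the following example: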

import numpy as np

s = np.asarray(5)

m = np.asarray([[1,2],[3,4]])

# Result should be as follows

array([[[5, 1],
        [5, 2]],

       [[5, 3],
        [5, 4]]])

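In other words, I want to vectorize the operation np.asarray([s, m]) element-wise at the lowest level of m. Is there an obvious way to do this in numpy for any multi-dimensional array m?

I am sure this is answered somewhere, but I can't put it into words and can't find it. If you can find it, please feel free to redirect me there.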
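One way this could be done, as a minimal sketch: broadcast s to the shape of m, then stack the two along a new last axis. This should work for m of any number of dimensions, since nothing in it depends on m being 2d.

```python
import numpy as np

s = np.asarray(5)
m = np.asarray([[1, 2], [3, 4]])

# Expand s to m's shape (as a read-only view, no data is copied yet) ...
s_b, m_b = np.broadcast_arrays(s, m)

# ... then pair the elements up along a new trailing axis.
result = np.stack([s_b, m_b], axis=-1)
print(result.shape)   # (2, 2, 2)
```

`result[..., 0]` is s everywhere and `result[..., 1]` is m.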

Mincheol

Asked: 2024-09-19 02:13:15 +0800 CST

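Python NetCDF blank at the prime meridian

5

I am trying to plot a climatology map of GPCC precipitation.

But I found a blank strip (the data is cut off) at the prime meridian.

How can I fix this? I can use CDO, NCO, or Python.

I am sharing the code and data as well.

(Data) https://drive.google.com/drive/folders/1rEHKz5GQlvC3m_Cfzx48cTrPZPU0qAar?usp=sharing

(GPCC metadata) https://opendata.dwd.de/climate_environment/GPCC/html/fulldata-monthly_v2022_doi_download.html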


from netCDF4 import Dataset
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

import xarray as xr
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import calendar

import cartopy.crs as ccrs
from cartopy.mpl.gridliner import LONGITUDE_FORMATTER, LATITUDE_FORMATTER
import cartopy.feature as cfeature
import cftime
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
%matplotlib inline

mpl.rcParams['figure.figsize'] = [8., 6.]

filename = 'D:/ERA5/precip.nc'
ds = xr.open_dataset(filename)
ds

da = ds['precip']
da

def is_jjas(month):
    return (month >= 6) & (month <= 9)

dd = da.sel(time=is_jjas(da['time.month']))

def is_1982(year):
    return (year> 1981)

dn = dd.sel(time=is_1982(dd['time.year']))
dn

JJAS= dn.groupby('time.year').mean('time')

JJAS2 = JJAS.mean(dim='year', keep_attrs=True)
JJAS2

fig, ax = plt.subplots(1, 1, figsize = (16, 8), subplot_kw={'projection': ccrs.PlateCarree()})

cs = plt.contourf(JJAS2.lon, JJAS2.lat, JJAS2, levels=[0, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500],
                  vmin=0, vmax=500, cmap='Blues', extend='both')

# Set the figure title, add lat/lon grid and coastlines
ax.set_title('', fontsize=16)
ax.gridlines(draw_labels=True, linewidth=1, color='gray', alpha=0.5, linestyle='--') 
ax.coastlines(color='black')
ax.set_extent([-20, 30, 0, 30], crs=ccrs.PlateCarree())

cbar = plt.colorbar(cs,fraction=0.05, pad=0.04, extend='both', orientation='horizontal')

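I tried searching on Google and found a CDO approach. If I interpolate the data, its resolution might change. I want to use the GPCC data at the same resolution but without any blank.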
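A common cause of such a strip, assuming the file stores longitudes from 0 to 360 (not checked against the shared data): the grid has no column connecting the last longitude back to the first, so the contour plot leaves that cell empty. Appending one "cyclic" column closes the seam without interpolating or changing the resolution. The sketch below does this with plain numpy on a made-up 0.25 degree grid. The helper name `add_cyclic` is hypothetical, and `cartopy.util.add_cyclic_point` provides comparable functionality.

```python
import numpy as np

def add_cyclic(data, lon):
    """Append the first longitude column at the end, shifted by 360 degrees.

    Assumes longitude is the LAST axis of data and lon is increasing.
    """
    data = np.asarray(data)
    lon = np.asarray(lon, dtype=float)
    data_c = np.concatenate([data, data[..., :1]], axis=-1)
    lon_c = np.append(lon, lon[0] + 360.0)
    return data_c, lon_c

# Made-up grid and field standing in for JJAS2.lon, JJAS2.lat and JJAS2.
lon = np.linspace(0.125, 359.875, 1440)
lat = np.linspace(-89.875, 89.875, 720)
field = np.random.default_rng(0).random((lat.size, lon.size))

field_c, lon_c = add_cyclic(field, lon)
print(field_c.shape, lon_c[-1])
```

The padded arrays would then go to the plotting call in place of the originals, for example `ax.contourf(lon_c, lat, field_c, transform=ccrs.PlateCarree(), ...)`.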
