Richard Asked: 2019-08-26 12:17:11 +0800 CST

Razer Core X eGPU not working on Thinkpad + Lubuntu 18.04

I just connected a Razer Core X with an eGPU to my Thinkpad for the first time. The fans are spinning, but nvidia-smi does not show the eGPU.

What can I do?

linux ubuntu

1 Answer

Best Answer
Richard 2019-08-26T12:17:11+08:00

First, check dmesg | tail -n 200. It may show something like this:

    [ 74.959198] thunderbolt 0000:06:00.0: current switch config:
    [ 74.959201] thunderbolt 0000:06:00.0: Switch: 8086:15da (Revision: 6, TB Version: 2)
    [ 74.959202] thunderbolt 0000:06:00.0: Max Port Number: 11
    [ 74.959203] thunderbolt 0000:06:00.0: Config:
    [ 74.959204] thunderbolt 0000:06:00.0: Upstream Port Number: 1 Depth: 1 Route String: 0x3 Enabled: 1, PlugEventsDelay: 254ms
    [ 74.959205] thunderbolt 0000:06:00.0: unknown1: 0x0 unknown4: 0x0
    [ 74.999560] thunderbolt 0000:06:00.0: 3: reading drom (length: 0x56)
    [ 75.301575] thunderbolt 0000:06:00.0: 3: uid: 0x1279cc9b0ba8400
    [ 75.301686] thunderbolt 0000:06:00.0: Port 0: 8086:15d3 (Revision: 6, TB Version: 1, Type: Port (0x1))
    [ 75.301689] thunderbolt 0000:06:00.0: Max hop id (in/out): 7/7
    [ 75.301692] thunderbolt 0000:06:00.0: Max counters: 8
    [ 75.301694] thunderbolt 0000:06:00.0: NFC Credits: 0x800000
    [ 75.302174] thunderbolt 0000:06:00.0: Port 1: 8086:15d3 (Revision: 6, TB Version: 1, Type: Port (0x1))
    [ 75.302178] thunderbolt 0000:06:00.0: Max hop id (in/out): 15/15
    [ 75.302180] thunderbolt 0000:06:00.0: Max counters: 16
    [ 75.302183] thunderbolt 0000:06:00.0: NFC Credits: 0x7800000
    [ 75.302681] thunderbolt 0000:06:00.0: Port 2: 8086:15d3 (Revision: 6, TB Version: 1, Type: Port (0x1))
    [ 75.302683] thunderbolt 0000:06:00.0: Max hop id (in/out): 15/15
    [ 75.302685] thunderbolt 0000:06:00.0: Max counters: 16
    [ 75.302687] thunderbolt 0000:06:00.0: NFC Credits: 0x0
    [ 75.302689] thunderbolt 0000:06:00.0: 3:3: disabled by eeprom
    [ 75.302691] thunderbolt 0000:06:00.0: 3:4: disabled by eeprom
    [ 75.302692] thunderbolt 0000:06:00.0: 3:5: disabled by eeprom
    [ 75.302806] thunderbolt 0000:06:00.0: Port 6: 8086:15d3 (Revision: 6, TB Version: 1, Type: PCIe (0x100102))
    [ 75.302808] thunderbolt 0000:06:00.0: Max hop id (in/out): 8/8
    [ 75.302809] thunderbolt 0000:06:00.0: Max counters: 2
    [ 75.302811] thunderbolt 0000:06:00.0: NFC Credits: 0x800000
    [ 75.302960] thunderbolt 0000:06:00.0: Port 7: 8086:15d3 (Revision: 6, TB Version: 1, Type: PCIe (0x100101))
    [ 75.302962] thunderbolt 0000:06:00.0: Max hop id (in/out): 8/8
    [ 75.302964] thunderbolt 0000:06:00.0: Max counters: 2
    [ 75.302966] thunderbolt 0000:06:00.0: NFC Credits: 0x800000
    [ 75.302967] thunderbolt 0000:06:00.0: 3:8: disabled by eeprom
    [ 75.302969] thunderbolt 0000:06:00.0: 3:9: disabled by eeprom
    [ 75.302971] thunderbolt 0000:06:00.0: 3:a: disabled by eeprom
    [ 75.302973] thunderbolt 0000:06:00.0: 3:b: disabled by eeprom

This points to a permissions/security problem.

Let's install the Thunderbolt management tools so that we can fix it:

    sudo apt install thunderbolt-tools

Now, let's check whether Thunderbolt sees the dock:

    root@mymachine:~# tbtadm devices
    0-4 Razer Core X non-authorized not in ACL

It does!

Now, let's authorize the dock:

    tbtadm approve 0-4

This prints:

    Authorizing "/sys/bus/thunderbolt/devices/0-4"
    Already in ACL
    system:5 Input/output error

Unplugging and re-plugging the dock and checking dmesg again shows:

    [11187.232181] thunderbolt 0000:06:00.0: PCIe tunnel creation failed

So let's look at Thunderbolt again:

    root@mymachine:~# tbtadm devices
    0-4 Razer Core X non-authorized in ACL

And, indeed, we can see that the dock is connected:

    root@mymachine:~# tbtadm acl
    0XXXXXb0-XXXX-XXXX-ffff-ffffffffffff Razer Core X connected

Let's try authorizing it manually:

    root@mymachine:~# echo '1' > /sys/bus/thunderbolt/devices/0-4/authorized
    -bash: echo: write error: Input/output error

At this point, I suspected the BIOS might be the problem. So, reboot and pull up the BIOS settings. Thunderbolt security was set to "User Authorization", but let's go with the nuclear option, "No Security" (it would probably be good to figure out how to lock this down again later).

Now boot the machine again.

Before plugging in the GPU, make sure you have the Nvidia driver loaded:

    sudo modprobe nvidia-uvm

And try to find the GPU:

    nvidia-smi

Success!

    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 418.87.00    Driver Version: 418.87.00    CUDA Version: 10.1     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  GeForce RTX 208...  Off  | 00000000:3D:00.0 Off |                  N/A |
    | 15%   36C    P0     1W / 250W |      0MiB / 10989MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID   Type   Process name                             Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+

According to this document, the BIOS settings mean:

    No Security: allows Thunderbolt devices to be connected automatically.
    User Authorization: allows Thunderbolt devices to be connected after user authorization.
    Secure Connect: allows Thunderbolt devices to be connected using a saved key that has been approved by the user.
    Display Port and USB: only allows display outputs and USB devices to be connected. Thunderbolt devices are not allowed to connect.
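If you would rather confirm the security level from Linux before rebooting into the BIOS, the kernel exposes the same information under the /sys/bus/thunderbolt/devices path used above. The following is only a minimal sketch: the function name tb_status and its optional directory argument are my own (hypothetical) helpers, and it assumes your kernel exposes the security, authorized and device_name attributes described in the kernel's Thunderbolt admin guide.

```shell
#!/bin/sh
# Sketch: list Thunderbolt domains with their security level and devices with
# their authorization state. tb_status is a hypothetical helper name; the
# optional argument exists only so it can be pointed at a different directory.
tb_status() {
    root="${1:-/sys/bus/thunderbolt/devices}"
    for dev in "$root"/*; do
        # Skip the unexpanded glob when no Thunderbolt entries exist
        [ -d "$dev" ] || continue
        name=$(basename "$dev")
        if [ -r "$dev/security" ]; then
            # Domain entries (e.g. domain0) report the security level:
            # none, user, secure, dponly, ...
            printf '%s security=%s\n' "$name" "$(cat "$dev/security")"
        elif [ -r "$dev/authorized" ]; then
            # Device entries (e.g. 0-4) report 0 when not yet authorized
            devname=$(cat "$dev/device_name" 2>/dev/null || echo unknown)
            printf '%s "%s" authorized=%s\n' "$name" "$devname" "$(cat "$dev/authorized")"
        fi
    done
}

tb_status
```

A domain line showing security=user should correspond to the "User Authorization" BIOS setting and security=none to "No Security". A device line with authorized=0 matches the non-authorized state that tbtadm devices reported for the dock.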