
Question

NotTheDr01ds

Asked: 2023-01-05 14:41:28 +0800 CST

Error when trying to run the GPT detector under Ubuntu 22.04 WSL


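While the OpenAI detector has been useful for identifying content created by ChatGPT and other OpenAI-based models, it has been failing more and more often as usage increases (especially from users on Stack Exchange sites).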

After installing it locally per the project README, I receive the following error when trying to run it from the repo directory with `python -m detector.server ../gpt-2-models/detector-base.pt`:

Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/ntd/src/gpt-2-output-dataset/detector/server.py", line 120, in <module>
    fire.Fire(main)
  File "/home/ntd/src/venv/openai-detector/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/ntd/src/venv/openai-detector/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/ntd/src/venv/openai-detector/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/home/ntd/src/gpt-2-output-dataset/detector/server.py", line 89, in main
    model.load_state_dict(data['model_state_dict'])
  File "/home/ntd/src/venv/openai-detector/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1671, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for RobertaForSequenceClassification:
        Missing key(s) in state_dict: "roberta.embeddings.position_ids".
        Unexpected key(s) in state_dict: "roberta.pooler.dense.weight", "roberta.pooler.dense.bias".

I tried changing to `transformers==2.9.1` based on a comment in this issue, but then `pip install -r requirements.txt` fails as well.

1 Answer


NotTheDr01ds · Answer 1 · 2023-01-05T14:41:28+08:00

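The main issue here appears to be resolved, for me, by using `transformers==2.5.1` (rather than 2.9.1), but I also needed the Rust compiler (and `build-essential`) to build it. Most of this, at least from step 11 onward, probably also applies to non-WSL Ubuntu. However, CUDA has some additional dependencies (I can't say exactly which ones, since I don't have a pure Ubuntu GPU system to test on).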
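For context on the error itself: a strict state-dict load compares the parameter names the model class expects with the names stored in the checkpoint, and reports differences in both directions. Below is a minimal pure-Python sketch of that comparison. It is not PyTorch's actual implementation; the function name `compare_state_dict_keys` and the `classifier.dense.weight` entry are illustrative assumptions, and only the three `roberta.*` names come from the traceback in the question.

```python
def compare_state_dict_keys(expected_keys, checkpoint_keys):
    """Return (missing, unexpected) in the spirit of a strict state-dict load."""
    expected = set(expected_keys)
    found = set(checkpoint_keys)
    missing = sorted(expected - found)      # the model wants these, the checkpoint lacks them
    unexpected = sorted(found - expected)   # the checkpoint has these, the model does not
    return missing, unexpected


# Shortened, illustrative key lists. "classifier.dense.weight" is a made-up
# shared entry; the roberta.* names are the ones shown in the traceback.
model_keys = [
    "roberta.embeddings.position_ids",
    "classifier.dense.weight",
]
checkpoint_keys = [
    "roberta.pooler.dense.weight",
    "roberta.pooler.dense.bias",
    "classifier.dense.weight",
]

missing, unexpected = compare_state_dict_keys(model_keys, checkpoint_keys)
print("Missing key(s):", missing)
print("Unexpected key(s):", unexpected)
```

Seen this way, the error suggests that the installed transformers version builds a model whose parameter names differ from those in the downloaded checkpoint, which would explain why pinning a different library version can make it go away.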

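Here are the full steps I used to install it on Ubuntu 22.04 under WSL. Note that you can simplify this considerably by not setting up a dedicated distribution for the detector, by not setting up a Python venv, or even by skipping both. Honestly, doing both is overkill in terms of "isolation", but all the steps are here, depending on how you want to handle it: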

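1. Registered a new Ubuntu 22.04 WSL distribution by running `ubuntu2204.exe` from PowerShell. It did not previously exist, for the reasons below.
2. Added a username and password when prompted.
3. Ran the usual initial `sudo apt update && sudo apt upgrade -y`.
4. Set the default username in `/etc/wsl.conf`, using my answer here.
5. Exited Ubuntu.
6. Ran `wsl --shutdown`.
7. Created a directory for my "openai-detector" instance: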
```
mkdir D:\WSL\instances\openai-detector
```

8. Copied the just-created Ubuntu 22.04 instance to a new distribution named `openai-detector`:

wsl --import --vhd openai-detector D:\wsl\instances\openai-detector\ $env:localappdata\Packages\CanonicalGroupLimited.Ubuntu22.04LTS_79rhkp1fndgsc\LocalState\ext4.vhdx --version 2

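9. Removed the `ubuntu-22.04` distribution, since I can always create another one on demand when needed (as above). However, do this only if you are sure it is the one you just created and that it contains no files you need. This is an irreversible, destructive operation. Honestly, I get a little nervous every time I do this, since there is a chance I could accidentally use the wrong distribution name. Just ... be careful: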
```
wsl --unregister ubuntu-22.04
```
10. Started the new `openai-detector` distribution created above:
```
wsl ~ -d openai-detector
```

11. Install `rustup` and `build-essential`:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source "$HOME/.cargo/env"
sudo apt install build-essential

12. Set up the virtual environment:

sudo apt install python3-venv
python3 -m venv ~/src/venv/openai-detector
source ~/src/venv/openai-detector/bin/activate

13. Clone the detector and download the model files:

cd ~/src
git clone https://github.com/openai/gpt-2-output-dataset.git
mkdir gpt-2-models
cd gpt-2-models
wget https://openaipublic.azureedge.net/gpt-2/detector-models/v1/detector-base.pt
# and/or
wget https://openaipublic.azureedge.net/gpt-2/detector-models/v1/detector-large.pt

14. Modify the requirements to use Transformers 2.5.1:
```
editor ~/src/gpt-2-output-dataset/requirements.txt
```
Change the `transformers` line to:
```
transformers==2.5.1
```

15. Install the requirements:

pip install wheel
cd ~/src/gpt-2-output-dataset
pip install -r requirements.txt

16. Run it:

python -m detector.server ../gpt-2-models/detector-base.pt
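If you would rather script the requirements edit from the steps above than open an editor, something like the following could work. This is only a sketch: `pin_requirement` is a hypothetical helper of mine, not part of pip or the detector repo, and the sample text is made up for illustration (the real `requirements.txt` may look different). Note that it drops any trailing comment on the line it rewrites.

```python
import re


def pin_requirement(requirements_text, package, version):
    """Return the requirements text with `package` pinned to exactly `version`."""
    # Match the package name at the start of a line, optionally followed by a
    # version specifier. Names that merely start with `package` do not match.
    pattern = re.compile(
        rf"^\s*{re.escape(package)}\s*([=<>!~].*)?$", re.IGNORECASE
    )
    lines = []
    replaced = False
    for line in requirements_text.splitlines():
        if pattern.match(line):
            lines.append(f"{package}=={version}")
            replaced = True
        else:
            lines.append(line)
    if not replaced:
        # The package was not listed at all, so append the pin.
        lines.append(f"{package}=={version}")
    return "\n".join(lines) + "\n"


# Made-up file contents, for illustration only.
sample = "torch>=1.0\ntransformers>=2.0.0\nfire\n"
print(pin_requirement(sample, "transformers", "2.5.1"))
```

To apply it for real, you would read `~/src/gpt-2-output-dataset/requirements.txt`, pass its contents through the function, and write the result back before running `pip install -r requirements.txt`.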

After the initial installation, all that is needed to start it in the future is:

wsl ~ -d openai-detector
cd ~/src/gpt-2-output-dataset
source ~/src/venv/openai-detector/bin/activate
python -m detector.server ../gpt-2-models/detector-base.pt

A local copy of the OpenAI Detector should then be running on `localhost:8080`.
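To confirm that something is actually listening on that port without opening a browser, a quick TCP check along these lines may help. It is a minimal sketch using only the standard library; `is_port_open` is my own helper name, and it only tells you that the port accepts connections, not that the detector itself is healthy.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    # 8080 is the port mentioned above; adjust it if you start the server on another one.
    print("detector reachable:", is_port_open("localhost", 8080))
```

Run it from inside the WSL distribution while the server is up. Checking from the Windows side assumes WSL's localhost forwarding is working, which I have not verified here.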
