AskOverflow.Dev

AskOverflow.Dev Logo AskOverflow.Dev Logo

AskOverflow.Dev Navigation

  • 主页
  • 系统&网络
  • Ubuntu
  • Unix
  • DBA
  • Computer
  • Coding
  • LangChain

Mobile menu

Close
  • 主页
  • 系统&网络
    • 最新
    • 热门
    • 标签
  • Ubuntu
    • 最新
    • 热门
    • 标签
  • Unix
    • 最新
    • 标签
  • DBA
    • 最新
    • 标签
  • Computer
    • 最新
    • 标签
  • Coding
    • 最新
    • 标签
主页 / server / 问题 / 1132033
Accepted
Snappawapa
Snappawapa
Asked: 2023-05-27 07:27:42 +0800 CST2023-05-27 07:27:42 +0800 CST 2023-05-27 07:27:42 +0800 CST

复制时 Mongodb 数据文件损坏

  • 772

我确信有一个简单的解决方案,我正在运行一个 Mongodb 服务:

[Unit]
Description=Mongo server for Open Quartermaster. Version ${version}, using MongoDB tagged to "4".
Documentation=https://github.com/Epic-Breakfast-Productions/OpenQuarterMaster/tree/main/software/Infrastructure
After=docker.service
Wants=network-online.target docker.socket
Requires=docker.socket

[Service]
Type=simple
Restart=always
TimeoutSec=5m

#ExecStartPre=/bin/bash -c "/usr/bin/docker container inspect oqm_mongo 2> /dev/null || "
ExecStartPre=/bin/bash -c "/usr/bin/docker stop -t 10 oqm_infra_mongo || echo 'Could not stop mongo container'"
ExecStartPre=/bin/bash -c "/usr/bin/docker rm oqm_infra_mongo || echo 'Could not remove mongo container'"
ExecStartPre=/bin/bash -c "/usr/bin/docker pull mongo:4"

ExecStart=/bin/bash -c "/usr/bin/docker run \
                                 --name oqm_infra_mongo \
                                 -p=27017:27017 \
                                 -v /data/oqm/db/mongo/:/data/db  \
                                 mongo:4 mongod --replSet rs0"
ExecStartPost=/bin/bash -c "running=\"false\"; \
                            while [ \"$running\" != \"true\" ]; do \
                                sleep 1s; \
                                /usr/bin/docker exec oqm_infra_mongo mongo --eval \"\"; \
                                if [ \"$?\" = \"0\" ]; then \
                                    echo \"Mongo container running and available!\"; \
                                    running=\"true\"; \
                                fi \
                            done \
                            "
ExecStartPost=/bin/bash -c "/usr/bin/docker exec oqm_infra_mongo mongo --eval \"rs.initiate({'_id':'rs0', 'members':[{'_id':0,'host':'localhost:27017'}]})\" || echo 'Probably already initialized.'"

ExecStop=/bin/bash -c "/usr/bin/docker stop -t 10 oqm_infra_mongo || echo 'Could not stop mongo container'"
ExecStopPost=/bin/bash -c "/usr/bin/docker rm oqm_infra_mongo || echo 'Could not remove mongo container'"

[Install]
WantedBy=multi-user.target

我正在让 docker 映射主机的 /data/oqm/db/mongo/ 目录,看起来工作正常;从主机上看,我看到运行时填充的目录。此设置似乎在重新启动之间运行良好并且持续良好。

但是,当我尝试通过将此目录中的数据复制出来然后再复制回来来备份和恢复数据时,当尝试恢复服务时,Mongo barfs 声称数据已损坏。有任何想法吗?

需要明确的是,我的备份和恢复都是在服务停止时发生的,包括:

备份:

  1. 停止服务(使用 systemd)
  2. 复制出数据目录中的文件
  3. 启动服务备份(使用 systemd)

恢复:

  1. 停止服务(使用 systemd)
  2. 核对数据目录中的现有文件
  3. 复制回先前复制出的文件
  4. 启动服务(使用 systemd)(失败)

我正在尝试这种方法,因为它通常在文档中进行了描述:https://www.mongodb.com/docs/manual/core/backups/#back-up-with-cp-or-rsync

错误日志片段:


May 27 09:06:59 oqm-dev bash[41304]: {"t":{"$date":"2023-05-27T13:06:59.275+00:00"},"s":"I",  "c":"STORAGE",  "id":22315,   "ctx":"initandlisten","msg":"Opening WiredTiger","attr":{"config":"create,cache_size=11224M,session_max=33000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000,close_scan_interval=10,close_handle_minimum=250),statistics_log=(wait=0),verbose=[recovery_progress,checkpoint_progress,compact_progress],"}}
May 27 09:06:59 oqm-dev bash[41304]: {"t":{"$date":"2023-05-27T13:06:59.686+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"initandlisten","msg":"WiredTiger message","attr":{"message":"[1685192819:686027][1:0x7fefd1197cc0], txn-recover: [WT_VERB_RECOVERY_PROGRESS] Recovering log 1 through 3"}}
May 27 09:06:59 oqm-dev bash[41304]: {"t":{"$date":"2023-05-27T13:06:59.719+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"initandlisten","msg":"WiredTiger message","attr":{"message":"[1685192819:719823][1:0x7fefd1197cc0], txn-recover: [WT_VERB_RECOVERY_PROGRESS] Recovering log 2 through 3"}}
May 27 09:06:59 oqm-dev bash[41304]: {"t":{"$date":"2023-05-27T13:06:59.749+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"initandlisten","msg":"WiredTiger message","attr":{"message":"[1685192819:749492][1:0x7fefd1197cc0], txn-recover: [WT_VERB_RECOVERY_PROGRESS] Recovering log 3 through 3"}}
May 27 09:06:59 oqm-dev bash[41304]: {"t":{"$date":"2023-05-27T13:06:59.801+00:00"},"s":"E",  "c":"STORAGE",  "id":22435,   "ctx":"initandlisten","msg":"WiredTiger error","attr":{"error":-31804,"message":"[1685192819:801911][1:0x7fefd1197cc0], txn-recover: __recovery_setup_file, 643: metadata corruption: files file:collection-0-6243490866083295563.wt and file:collection-0--9007965794334803376.wt have the same file ID 4: WT_PANIC: WiredTiger library panic"}}
May 27 09:06:59 oqm-dev bash[41304]: {"t":{"$date":"2023-05-27T13:06:59.801+00:00"},"s":"F",  "c":"-",        "id":23089,   "ctx":"initandlisten","msg":"Fatal assertion","attr":{"msgid":50853,"file":"src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp","line":481}}
May 27 09:06:59 oqm-dev bash[41304]: {"t":{"$date":"2023-05-27T13:06:59.802+00:00"},"s":"F",  "c":"-",        "id":23090,   "ctx":"initandlisten","msg":"\n\n***aborting after fassert() failure\n\n"}
May 27 09:06:59 oqm-dev bash[41304]: {"t":{"$date":"2023-05-27T13:06:59.802+00:00"},"s":"F",  "c":"CONTROL",  "id":4757800, "ctx":"initandlisten","msg":"Writing fatal message","attr":{"message":"Got signal: 6 (Aborted).\n"}}

更新:在有人提出建议后,我仔细检查了是否保留了权限,现在肯定会保留它们。不幸的是,仍然遇到同样的问题

backup
  • 1 1 个回答
  • 56 Views

1 个回答

  • Voted
  1. Best Answer
    Snappawapa
    2023-05-28T03:11:19+08:002023-05-28T03:11:19+08:00

    弄清楚了。

    权限可能确实发挥了作用,但真正起作用的是意识到:

    rm -rf "/some/dir/*"!=rm -rf /some/dir*

    更改命令以rm -rf "$root/some/dir"/*达到目的

    https://unix.stackexchange.com/questions/326584/rm-command-in-bash-script-does-not-work-with-variable

    • 0

相关问题

  • 总大小(磁盘)与总大小(媒体)

  • 社区对备份解决方案的意见

  • 无法读取不同 LTO-3 驱动器上的 LTO-3 磁带

  • 使用 TSM 备份时跳过硬链接

  • 使用 rsync 维护名称更改的目录的副本

Sidebar

Stats

  • 问题 205573
  • 回答 270741
  • 最佳答案 135370
  • 用户 68524
  • 热门
  • 回答
  • Marko Smith

    新安装后 postgres 的默认超级用户用户名/密码是什么?

    • 5 个回答
  • Marko Smith

    SFTP 使用什么端口?

    • 6 个回答
  • Marko Smith

    命令行列出 Windows Active Directory 组中的用户?

    • 9 个回答
  • Marko Smith

    什么是 Pem 文件,它与其他 OpenSSL 生成的密钥文件格式有何不同?

    • 3 个回答
  • Marko Smith

    如何确定bash变量是否为空?

    • 15 个回答
  • Martin Hope
    Tom Feiner 如何按大小对 du -h 输出进行排序 2009-02-26 05:42:42 +0800 CST
  • Martin Hope
    Noah Goodrich 什么是 Pem 文件,它与其他 OpenSSL 生成的密钥文件格式有何不同? 2009-05-19 18:24:42 +0800 CST
  • Martin Hope
    Brent 如何确定bash变量是否为空? 2009-05-13 09:54:48 +0800 CST
  • Martin Hope
    cletus 您如何找到在 Windows 中打开文件的进程? 2009-05-01 16:47:16 +0800 CST

热门标签

linux nginx windows networking ubuntu domain-name-system amazon-web-services active-directory apache-2.4 ssh

Explore

  • 主页
  • 问题
    • 最新
    • 热门
  • 标签
  • 帮助

Footer

AskOverflow.Dev

关于我们

  • 关于我们
  • 联系我们

Legal Stuff

  • Privacy Policy

Language

  • Pt
  • Server
  • Unix

© 2023 AskOverflow.DEV All Rights Reserve