为什么编译器在这里错过矢量化？

Question

hubenhau

Asked: 2024-08-08 22:16:43 +0800 CST2024-08-08 22:16:43 +0800 CST 2024-08-08 22:16:43 +0800 CST

尝试使用 A* 解决大型迷宫时出现 SIGSEGV

772

我正在用 C++ 实现 A* 寻路算法来解决迷宫，但是当迷宫尺寸很大（~10,000x10,000 或更大）时遇到了分段错误（SIGSEGV）。

错误发生在第 58 行调用after中~__shared_count()的函数，以及此后调用的数十个解构函数中。shared_ptr_base.hnode = queue.top()~Node()

编辑：错误实际上_Sp_counted_base<_S_atomic>::_M_release()现在发生在函数中，尽管我不知道我改变了什么。

以下是我的代码中应该相关的部分。我是第一次使用共享指针，所以我不太清楚问题是什么或如何修复它。

主程序

#include <iostream>
#include <queue>
#include <random>
#include <vector>
#include <algorithm>
#include <stack>
#include <fstream>

#include "Node.hpp"

constexpr uint16_t maze_size = 10000;

void solve(const uint16_t start_x, const uint16_t start_y, const uint16_t goal_x,
           const uint16_t goal_y,
           const std::vector<std::vector<std::pair<bool, bool> > > &walls) {
    std::vector<std::vector<bool> > visited(maze_size, std::vector<bool>(maze_size, false));

    // Min heap for our queue
    std::priority_queue<Node, std::vector<Node>, std::greater<> > queue;

    Node node(start_x, start_y, goal_x, goal_y);

    visited[node.x][node.y] = true;

    while (!node.solved(goal_x, goal_y)) {
        // Add all valid neighbors to the queue
        {
            const auto x = node.x;
            const auto y = node.y;
            const auto ptr = std::make_shared<Node>(node);
            // Right
            if (x < maze_size - 1 && !walls[x][y].first && !visited[x + 1][y]) {
                queue.emplace(x + 1, y, goal_x, goal_y, ptr);
                visited[x + 1][y] = true;
            }
            // Left
            if (x > 0 && !walls[x - 1][y].first && !visited[x - 1][y]) {
                queue.emplace(x - 1, y, goal_x, goal_y, ptr);
                visited[x - 1][y] = true;
            }
            // Down
            if (y < maze_size - 1 && !walls[x][y].second && !visited[x][y + 1]) {
                queue.emplace(x, y + 1, goal_x, goal_y, ptr);
                visited[x][y + 1] = true;
            }
            // Up
            if (y > 0 && !walls[x][y - 1].second && !visited[x][y - 1]) {
                queue.emplace(x, y - 1, goal_x, goal_y, ptr);
                visited[x][y - 1] = true;
            }
        }
        node = queue.top();
        queue.pop();
    }
}

节点.hpp

#ifndef NODE_HPP
#define NODE_HPP

#include <cstdint>
#include <memory>

class Node {
public:
    uint16_t x, y;
    uint32_t steps;
    uint32_t value;

    std::shared_ptr<Node> parent;

    Node(const uint16_t x, const uint16_t y, const uint16_t goal_x, const uint16_t goal_y,
         const std::shared_ptr<Node> &parent = nullptr) : x(x), y(y), parent(parent) {
        steps = parent ? parent->steps + 1 : 0;
        value = abs(static_cast<int32_t>(goal_x) - static_cast<int32_t>(x)) +
                abs(static_cast<int32_t>(goal_y) - static_cast<int32_t>(y)) +
                steps;
    }

    [[nodiscard]] bool solved(const uint16_t goal_x, const uint16_t goal_y) const {
        return x == goal_x && y == goal_y;
    }

    // Overload the > operator for the priority queue
    bool operator>(const Node &rhs) const {
        if (value == rhs.value) {
            return steps < rhs.steps;
        }
        return value > rhs.value;
    }
};

#endif

Valgrind 输出

valgrind --leak-check=full ./Tests 
==5385== Memcheck, a memory error detector
==5385== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==5385== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==5385== Command: ./Tests
==5385== 
Maze loaded10000
==5385== Stack overflow in thread #1: can't grow stack to 0x1ffe801000
==5385== 
==5385== Process terminating with default action of signal 11 (SIGSEGV)
==5385==  Access not within mapped region at address 0x1FFE801FF8
==5385== Stack overflow in thread #1: can't grow stack to 0x1ffe801000
==5385==    at 0x116230: std::_Sp_counted_ptr_inplace<Node, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() (shared_ptr_base.h:613)
==5385==  If you believe this happened as a result of a stack
==5385==  overflow in your program's main thread (unlikely but
==5385==  possible), you can try to increase the size of the
==5385==  main thread stack using the --main-stacksize= flag.
==5385==  The main thread stack size used in this run was 8388608.
==5385== 
==5385== HEAP SUMMARY:
==5385==     in use at exit: 239,026,864 bytes in 556,422 blocks
==5385==   total heap usage: 3,380,473 allocs, 2,824,051 frees, 374,594,816 bytes allocated
==5385== 
==5385== LEAK SUMMARY:
==5385==    definitely lost: 0 bytes in 0 blocks
==5385==    indirectly lost: 0 bytes in 0 blocks
==5385==      possibly lost: 0 bytes in 0 blocks
==5385==    still reachable: 239,026,864 bytes in 556,422 blocks
==5385==         suppressed: 0 bytes in 0 blocks
==5385== Reachable blocks (those to which a pointer was found) are not shown.
==5385== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==5385== 
==5385== For lists of detected and suppressed errors, rerun with: -s
==5385== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)

您可以尝试使用以下函数加载此迷宫文件来重现错误。抱歉，文件太大了。您可以使用以下函数加载它：

std::vector<std::vector<std::pair<bool, bool> > > readVectorFromFile(const std::string &filename) {
    std::vector<std::vector<std::pair<bool, bool> > > vec;
    std::ifstream inFile(filename, std::ios::binary);
    if (!inFile) {
        std::cerr << "Error opening file for reading: " << filename << std::endl;
        return vec;
    }

    // Read the size of the outer vector
    size_t outerSize;
    inFile.read(reinterpret_cast<char *>(&outerSize), sizeof(outerSize));

    vec.resize(outerSize);

    for (auto &innerVec: vec) {
        // Read the size of each inner vector
        size_t innerSize;
        inFile.read(reinterpret_cast<char *>(&innerSize), sizeof(innerSize));

        innerVec.resize(innerSize);

        // Read each pair in the inner vector
        for (auto &pair: innerVec) {
            inFile.read(reinterpret_cast<char *>(&pair.first), sizeof(bool));
            inFile.read(reinterpret_cast<char *>(&pair.second), sizeof(bool));
        }
    }

    inFile.close();
    return vec;
}

主要内容如下：

int main() {
    auto walls = readVectorFromFile("maze.dat");
    std::cout << "Maze loaded" << std::endl;

    solve(0, 0, maze_size - 1, maze_size - 1, walls);

    return 0;
}

3 个回答

Voted

MSalters · Answer 1 · 2024-08-08T22:29:28+08:00

Best Answer

MSalters

2024-08-08T22:29:28+08:002024-08-08T22:29:28+08:00

您没有明确定义，Node::~Node()但编译器会为您完成。在其中，它调用std::shared_ptr<Node>::~shared_ptr。反过来，它又调用Node::~Node。

这是一个递归调用。根据您的迷宫和查询，这可能会导致调用堆栈溢出。

更简单的解决方案是让solve()管理所有节点的所有权；每个节点都可以有一个非拥有的Node* parent。这种所有权可以像一样简单std::deque<Node>（但不是std::vector<Node>，因为它会重新分配并破坏Node* parent）

[编辑]

在节点对象中使用智能指针并不一定是坏主意。在宽树中，每个父节点有 10 个子节点，您将在深度约为 9 级（十亿个节点）时遇到内存问题，但堆栈可以轻松处理该级别的嵌套。即使是具有十亿个节点的平衡二叉树也不算太糟糕（深度 32）。在这些树的情况下，每个节点都是几个子节点的唯一所有者。这可以用来实现std::vector<std::unique_ptr>> children。

6

pptaszni · Answer 2 · 2024-08-08T22:29:25+08:00

在您的中，Node并不是std::shared_ptr<Node> parent;真正需要的，并且容易产生循环依赖，这显然发生在您的迷宫中（尽管我懒得确认这一点）。删除shared_ptr并只提供有关父 Node 的信息steps：

class Node {
public:
    uint16_t x, y;
    uint32_t steps;
    uint32_t value;

    Node(uint16_t x, uint16_t y, uint16_t goal_x, uint16_t goal_y, int parentSteps = -1) : x(x), y(y) {
        steps = parentSteps + 1;
        value = abs(static_cast<int32_t>(goal_x) - static_cast<int32_t>(x)) +
                abs(static_cast<int32_t>(goal_y) - static_cast<int32_t>(y)) + steps;
    }

    [[nodiscard]] bool solved(const uint16_t goal_x, const uint16_t goal_y) const { return x == goal_x && y == goal_y; }

    // Overload the < operator for the priority queue
    bool operator>(const Node &rhs) const {
        if (value == rhs.value) {
            return steps < rhs.steps;
        }
        return value > rhs.value;
    }
};

然后您可以像这样创建新节点：

queue.emplace(x + 1, y, goal_x, goal_y, node.steps);

Lajos Arpad · Answer 3 · 2024-08-08T23:10:52+08:00

这是您收到的错误：

线程 #1 中的堆栈溢出：无法将堆栈增长至 0x1ffe801000

堆栈溢出就是指堆栈溢出了，我们先来了解一下什么是堆栈。

堆栈是一种后进先出 (LIFO) 的数据结构，也就是说，将元素堆叠在一起，并且总是能够先弹出顶部的元素。

在计算机科学中，有一部分内存称为调用堆栈。

此部分是调用子程序调用的地方。因此，如果您调用一个函数，该函数调用另一个函数，该函数调用另一个函数，那么这些函数将被堆叠到调用堆栈中。这在一定限度内有效，但如果有太多函数调用而没有执行函数（并从堆栈中弹出），那么未完成的函数调用会堆积起来并填满调用堆栈的大小。此时，如果还有另一个函数调用，则会尝试将其推送到调用堆栈中，但由于存储空间不足而失败。这就是发生的事情。

您可能已经意识到，您可以增加调用堆栈的大小，这将消除症状，但潜在的问题仍然存在，只是堆栈的限制更难达到。

因此，解决方案是确保您的算法是迭代的而不是递归的。尝试解决该问题的一种方法是尝试创建一个迭代堆栈（您实际上明确使用堆栈而不是依赖调用堆栈），如果您想释放一个Node，则将其所有引用推送到堆栈中，将它们设置为空，这样就不会发生任何奇怪的事情，释放Node现在不再具有触发析构函数的任何引用，从堆栈顶部弹出您的节点，行为类似，即将其引用推送到堆栈，删除此对象的引用，然后释放它，然后继续堆栈顶部，依此类推，直到没有更多指针要释放。

尝试使用 A* 解决大型迷宫时出现 SIGSEGV

`(表达式，左值) = 右值` 在 C 或 C++ 中是有效的赋值吗？为什么有些编译器会接受/拒绝它？

何时应使用 std::inplace_vector 而不是 std::vector？

在 C++ 中，一个不执行任何操作的空程序需要 204KB 的堆，但在 C 中则不需要

如果 T 既不可构造、不可复制、也不可移动，那么我可以拥有 std::optional<T> 吗？

为什么我可以定义一个 constinit 的 std::string 实例？如果对象需要动态初始化，constinit 不是被禁止的吗？

如何分配以后放置的新“如同新”

PowerBI 目前与 BigQuery 不兼容：Simba 驱动程序与 Windows 更新有关

将 NULL 和 nullptr 传递给模板参数有什么区别？

AdMob：MobileAds.initialize() - 对于某些设备，“java.lang.Integer 无法转换为 java.lang.String”

我正在尝试仅使用海龟随机和数学模块来制作吃豆人游戏

尝试使用 A* 解决大型迷宫时出现 SIGSEGV

3 个回答

相关问题