
Question

Eric Wang

Asked: 2025-03-06 02:40:45 +0800 CST

How do I parse a nested structure presented as a flat list?


To make my question easier to understand, here is a simplified version of some sample input.

['foo', 1, 'a', 'foo', 2, 'foo', 1, 'b', 'foo', -1, 'foo', -1, "bar", 1, "c", "bar", 2, 'baz', 1, 'd', 'baz', -1, "bar", 3, "e", "bar", 4, 'qux', 1, 'stu', 1, 'f', 'stu', -1, 'qux', -1, 'bar', -1]

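(I used "stu" because I ran out of placeholder names.)

These strings are function names (sort of; more on that later). The number after a function name gives the position of the argument that follows. A position of -1 marks the end of the function call.

For example, ['foo',1,'a','foo',2,'b','foo',-1] should be equivalent to foo('a', 'b').

This should also work when nested: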

['foo', 1, 'a', 'foo', 2, 'foo', 1, 'b', 'foo', -1, 'foo', -1]

should be equivalent to foo('a', foo('b')), and

['bar', 1, 'c', 'bar', 2, 'baz', 1, 'd', 'baz', -1, 'bar', 3, 'e', 'bar', 4, 'qux', 1, 'stu', 1, 'f', 'stu',-1, 'qux', -1, 'bar', -1]

should be equivalent to bar('c', baz('d'), 'e', qux(stu('f'))).

The function I want should return a list. For example,

['foo', 1, 'a', 'foo', 2, 'foo', 1, 'b', 'foo', -1, 'foo', -1, 'bar', 1, 'c', 'bar', -1]

should result in

[['foo', 'a', ['foo', 'b']], ['bar', 'c']]
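
The simplified, string-named format described above can be sketched as a small recursive parser. This is only an illustration under stated assumptions: the helper name parse_named and its names parameter (the set of known function names) are hypothetical, and a call is assumed to start only where a known name is followed by position 1.

```python
def parse_named(tokens, names):
    """Parse the simplified, string-named format into nested lists.

    `names` is an assumed set of known function names; it is not part of
    the original question.
    """

    def parse_expr(i):
        # A call starts with a known name followed by position 1.
        if tokens[i] in names and i + 1 < len(tokens) and tokens[i + 1] == 1:
            name = tokens[i]
            call = [name]
            while True:
                # Every separator is the pair (name, position).
                if i + 1 >= len(tokens) or tokens[i] != name:
                    raise SyntaxError(f"expected separator for {name!r} at index {i}")
                pos = tokens[i + 1]
                i += 2
                if pos == -1:            # end of this call
                    return call, i
                if i >= len(tokens):
                    raise SyntaxError(f"missing argument for {name!r}")
                arg, i = parse_expr(i)   # the argument may itself be a call
                call.append(arg)
        # Anything else is a plain value.
        return tokens[i], i + 1

    result, i = [], 0
    while i < len(tokens):
        expr, i = parse_expr(i)
        result.append(expr)
    return result


sample = ['foo', 1, 'a', 'foo', 2, 'foo', 1, 'b', 'foo', -1, 'foo', -1,
          'bar', 1, 'c', 'bar', -1]
print(parse_named(sample, {'foo', 'bar'}))
```

Returning the next index together with the parsed value lets each nesting level report how many tokens it consumed, so no separate pass over the arguments is needed.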

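Now that the problem is clearer: my actual problem is slightly different. All elements of the list are integers. Function names are not strings but sequences of three integers. So ['foo',1,'a','foo',2,'b','foo',-1] is actually [1, 1, 1, 1, 104, 1, 1, 1, 2, 105, 1, 1, 1, -1].

The function name ([1, 1, 1] in the example above) acts as a sequence of dictionary keys. The dictionary (called constructs) looks like this: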

constructs = {
    1: {
        1: {
            1: lambda *chars : print(''.join(chr(char) for char in chars))
        }
    }
}

So, in the end, the example should produce something like

[[lambda *chars : print(''.join(chr(char) for char in chars)), 104, 105]]
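
To make the integer encoding concrete, here is a minimal sketch of the two directions involved: resolving a three-integer key in a nested dictionary, and flattening a non-nested call into tokens. The helper names lookup and encode_call are hypothetical and only illustrate the layout described above; print stands in for the lambda.

```python
def lookup(constructs, key):
    """Walk the nested dict one level per key part; None if a level is missing."""
    node = constructs
    for part in key:
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def encode_call(key, args):
    """Flatten a non-nested call into key/position/argument tokens."""
    tokens = []
    for pos, arg in enumerate(args, start=1):
        tokens += [*key, pos, arg]   # key, argument position, argument
    tokens += [*key, -1]             # position -1 closes the call
    return tokens


constructs = {1: {1: {1: print}}}
print(lookup(constructs, (1, 1, 1)) is print)
print(encode_call((1, 1, 1), [ord(c) for c in 'hi']))
```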

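All the rules about nesting and so on should still apply. I don't know how to implement this reliably and elegantly, so please help!

Thanks in advance.

Edit: I forgot to mention that 0 is always ignored and skipped, and that a function call followed by a 0 is escaped, so it is treated as an argument rather than as a call. So far all of this is implemented to some degree, but it fails when a function is nested inside itself. It is also inefficient and inelegant, with many potential problems, so I am asking Stack Overflow for help writing a better one. Feel free to use it as a starting point!

Edit 2: Here is the code I have tried so far: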

constructs = {
    1: {
        1: {
            1: print,
        }
    }
}

def parse(code: list) -> list:
    if len(code) <= 1:
        return code
    result = []
    in_function = 0
    for i, token in enumerate(code):
        if in_function > 0:
            in_function -= 1
            continue
        if token == 0:
            continue
        if result and result[-1][0][3] != -1:
            if token in constructs and code[i + 1] in constructs[token] and code[i + 2] in constructs[token][code[i + 1]]:
                if i < len(code) - 4 and code[i + 4] == 0:
                    result[-1][-1].append(token)
                else:
                    if code[i + 3] == result[-1][0][3] + 1:
                        result[-1].append([])
                    result[-1][0] = code[i:i + 4]
                    in_function = 3
            else:
                result[-1][-1].append(token)
        else:
            if token in constructs and code[i + 1] in constructs[token] and code[i + 2] in constructs[token][code[i + 1]]:
                if code[i + 3] == 1:
                    result.append([code[i:i + 4], []])
                    in_function = 3
                else:
                    raise SyntaxError(f'function {code[i:i + 3]} has no previous separator {code[i + 3] - 1}')
            else:
                raise SyntaxError(f'function {code[i:i + 3]} was not recognized')
    for i, function in enumerate(result):
        result[i][0] = constructs[result[i][0][0]][result[i][0][1]][result[i][0][2]]
        for j, argument in enumerate(result[i][1:]):
            result[i][j + 1] = parse(argument)
    return result

It works for parse([1, 1, 1, 1, 'Hello world', 1, 1, 1, 2, 'etc', 1, 1, 1, -1]) but not for parse([1, 1, 1, 1, 1, 1, 1, 1, 'Hello world', 1, 1, 1, -1, 1, 1, 1, 2, 'etc', 1, 1, 1, -1]).

2 Answers


blhsing · Answer 1 · 2025-03-06T11:23:01+08:00

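A valid list following these grammar rules can be parsed recursively with a function that, given a valid index, tries to parse the tokens starting at that index as the keys of a function followed by an argument position (with -1 ending the function call), and otherwise falls back to a scalar value.

Besides the parsed object, the function should also return the index of the next token, so that the caller can advance its index by the number of tokens consumed at the current level.

Since the input list may contain one or more such expressions, the function should be called iteratively, appending each returned value to the output list, until the index reaches the end of the input: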

def parse(code):
    def parse_expr(index):
        while index < size and code[index] == 0:
            index += 1  # skip 0s
        try:
            call = []
            while True:
                key1, key2, key3, pos = code[index: (next_index := index + 4)]
                if next_index < size and code[next_index] == 0:
                    raise ValueError  # escape the token
                if not call:
                    call.append(constructs[key1][key2][key3])
                if pos == -1:
                    return call, next_index
                obj, index = parse_expr(next_index)
                call.append(obj)
        except (ValueError, KeyError):
            return code[index], index + 1
    size = len(code)
    index = 0
    result = []
    while index < size:
        obj, index = parse_expr(index)
        result.append(obj)
    return result

so that:

print(parse([1, 1, 1, 1, 'Hello world', 1, 1, 1, 2, 'etc', 1, 1, 1, -1]))
print(parse([1, 1, 1, 1, 1, 1, 1, 1, 'Hello world', 1, 1, 1, -1, 1, 1, 1, 2, 'etc', 1, 1, 1, -1]))
print(parse([1, 1, 1, 1, 'a', 1, 1, 1, 2, 1, 1, 1, 1, 'b', 1, 1, 1, -1, 1, 1, 1, -1, 1, 1, 1, 1, 'c', 1, 1, 1, -1]))

Output:

[[<built-in function print>, 'Hello world', 'etc']]
[[<built-in function print>, [<built-in function print>, 'Hello world'], 'etc']]
[[<built-in function print>, 'a', [<built-in function print>, 'b']], [<built-in function print>, 'c']]

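Demo: https://ideone.com/4ct2KA

The code above assumes that the input is valid, and it ignores the function keys after the first argument because they are redundant. The argument positions are ignored as well, since they always increment from 1, although they can be used for input validation if necessary.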
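
As a rough illustration of that last point, the following sketch keeps the same recursive shape but checks that the repeated keys match and that positions count up from 1. It is not the code above: parse_checked is a hypothetical name, the 0 skip/escape rule is left out for brevity, and a call is assumed to begin only where a known key is followed by position 1.

```python
def parse_checked(code, constructs):
    """Recursive parse that also validates keys and argument positions."""
    size = len(code)

    def is_key(i):
        # True if code[i:i+3] names an entry in the nested dict.
        k = code[i:i + 3]
        try:
            return len(k) == 3 and k[2] in constructs[k[0]][k[1]]
        except (KeyError, TypeError):
            return False

    def parse_expr(i):
        if not (is_key(i) and i + 3 < size and code[i + 3] == 1):
            return code[i], i + 1                  # plain value
        key = code[i:i + 3]
        call = [constructs[key[0]][key[1]][key[2]]]
        expected = 1
        while True:
            if code[i:i + 3] != key or i + 3 >= size:
                raise SyntaxError(f"expected key {key} with a position at index {i}")
            pos = code[i + 3]
            i += 4
            if pos == -1:                          # call is closed
                return call, i
            if pos != expected:
                raise SyntaxError(f"key {key}: expected position {expected}, got {pos}")
            if i >= size:
                raise SyntaxError(f"key {key}: missing argument {pos}")
            arg, i = parse_expr(i)
            call.append(arg)
            expected += 1

    result, i = [], 0
    while i < size:
        obj, i = parse_expr(i)
        result.append(obj)
    return result


constructs = {1: {1: {1: print}}}
print(parse_checked([1, 1, 1, 1, 'a', 1, 1, 1, 2, 'b', 1, 1, 1, -1], constructs))
```

With this variant, an input such as [1, 1, 1, 1, 'a', 1, 1, 1, 3, 'b', 1, 1, 1, -1] raises SyntaxError because position 2 is skipped.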

Eric Wang · Answer 2 · 2025-03-07T06:33:27+08:00

Thanks, blhsing!

So I had a go at it myself, although it is twice as long as blhsing's answer.

def parse(code: list) -> tuple:
    result = []
    stack = []
    i = 0
    while i < len(code):
        key1, key2, key3, pos = code[i: i + 4]
        if key1 == 0 and key2 != 0:
            i += 1
        elif stack:
            if constructs.get(key1, {}).get(key2, {}).get(key3) is not None:
                if (len(code) > i + 4 and code[i + 4] == 0) and (len(code) <= i + 5 or code[i + 5] != 0):
                    result[-1][-1] += [key1, key2, key3, pos]
                    i += 4
                elif pos == 1:
                    stack.append([(key1, key2, key3), pos])
                    result[-1][-1] += [key1, key2, key3, pos]
                    i += 4
                elif (key1, key2, key3) == stack[-1][0]:
                    if pos == stack[-1][1] + 1:
                        stack[-1][1] += 1
                        if len(stack) == 1:
                            result[-1].append([])
                        else:
                            result[-1][-1] += [key1, key2, key3, pos]
                        i += 4
                    elif pos == -1:
                        if len(stack) > 1:
                            result[-1][-1] += [key1, key2, key3, pos]
                        stack.pop()
                        i += 4
                    else:
                        raise SyntaxError(f'function {key1}.{key2}.{key3} at position {i} has no previous position {pos - 1}')
                else:
                    raise SyntaxError(f'function {key1}.{key2}.{key3} at position {i} has no previous position {pos - 1}')
            else:
                result[-1][-1].append(key1)
                i += 1
        else:
            if constructs.get(key1, {}).get(key2, {}).get(key3) is not None:
                if pos == 1:
                    result.append([(key1, key2, key3), []])
                    stack.append([(key1, key2, key3), pos])
                    i += 4
                else:
                    raise SyntaxError(f'function {key1}.{key2}.{key3} at position {i} has no previous separator {pos - 1}')
            else:
                raise SyntaxError(f'function {key1}.{key2}.{key3} at position {i} is not recognized')
    for i, function_call in enumerate(result):
        new_function_call = [function_call[0]]
        for argument in function_call[1:]:
            try:
                new_function_call += parse(argument)[0]
            except Exception:
                new_function_call += argument
        result[i] = new_function_call
    return result, stack

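It also implements the rule that a 0 followed by another 0 is not skipped. The only problem is that an escaped sequence that happens to form a nested function call is treated as a nested function call instead of being skipped.

That is, parse([1, 1, 1, 1, 1, 1, 2, 1, 0, 97, 1, 1, 2, -1, 0, 1, 1, 1, -1]) returns ([[(1, 1, 1), [(1, 1, 2), 97]]], []) instead of ([[1, 1, 1], 1, 1, 2, 1, 97, 1, 1, 2, -1], []).

(I added the stack to the return value so that another function can tell whether any function was not closed properly.)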
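
One way such a follow-up check might look is sketched below. The helper name ensure_closed is hypothetical; it assumes the (result, stack) tuple returned by the parse function in this answer, where each stack entry is a [(key1, key2, key3), pos] pair.

```python
def ensure_closed(parsed):
    """Return the parsed result, or raise if any call was left open."""
    result, stack = parsed
    if stack:
        # Each stack entry is [(key1, key2, key3), last_position_seen].
        open_calls = ", ".join(".".join(map(str, key)) for key, _pos in stack)
        raise SyntaxError(f"unclosed function call(s): {open_calls}")
    return result


# A fully closed parse passes through unchanged.
print(ensure_closed(([[(1, 1, 1), 97]], [])))
```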

Edit: Demo
