
Question

sdbbs

Asked: 2024-03-09 00:38:25 +0800 CST

Splitting/wrapping the long text of column names in Pandas DataFrame plain-text to_string output?


Consider this example:

import pandas as pd

df = pd.DataFrame({
  "LIDSA": [0, 1, 2, 3],
  "CAE": [3, 5, 7, 9],
  "FILA": [1, 2, 3, 4], # 2 is default, so table idx 1 is default
  "VUAMA": [0.5, 1.0, 1.5, 2.0],
})
df_colnames = { # https://stackoverflow.com/q/48243818
  "LIDSA": "Lorem ipsum dolor sit amet",
  "CAE": "Consectetur adipiscing elit",
  "FILA": "Fusce imperdiet libero arcu",
  "VUAMA": "Vitae ultricies augue molestie ac",
}

# "Pandas autodetects the size of your terminal window if you set pd.options.display.width = 0" https://stackoverflow.com/q/11707586
with pd.option_context('display.max_rows', None, 'display.max_columns', None, 'display.width', 0, 'max_colwidth', 20, 'display.float_format', "{:.2f}".format):
  df_str = df.rename(df_colnames,axis=1).to_string()

print(df_str)

The result on terminal stdout, with the terminal 111 characters wide, is:

   Lorem ipsum dolor sit amet  Consectetur adipiscing elit  Fusce imperdiet libero arcu  Vitae ultricies augue
 molestie ac
0                           0                            3                            1
        0.50
1                           1                            5                            2
        1.00
2                           2                            7                            3
        1.50
3                           3                            9                            4
        2.00

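So only the last column got wrapped (and its values along with it). I would like each long column name to be wrapped at 20 characters, with the values laid out accordingly, like this: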

   Lorem ipsum dolor      Consectetur  Fusce imperdiet    Vitae ultricies
            sit amet  adipiscing elit      libero arcu  augue molestie ac
0                  0                3                1               0.50
1                  1                5                2               1.00
2                  2                7                3               1.50
3                  3                9                4               2.00

I thought 'max_colwidth', 20 would do that, but apparently it does not.

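I even tried adding explicit line breaks to the long column names, but they are just rendered as \n and the column name still stays on one line (as also noted in "Linebreaks in pandas column names").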

So, is it possible in Pandas to "word-wrap"/"line-break" long column names for plain-text string output?

2 Answers


mozway · Answer 1 · 2024-03-09T00:50:57+08:00

You can use textwrap.wrap and tabulate:

#  pip install tabulate
from textwrap import wrap
from tabulate import tabulate

df_colnames_wrap = {k: '\n'.join(wrap(v, 20))
                    for k,v in df_colnames.items()}

print(tabulate(df.rename(columns=df_colnames_wrap),
               headers='keys', tablefmt='plain'))

Output:

      Lorem ipsum dolor        Consectetur    Fusce imperdiet      Vitae ultricies
               sit amet    adipiscing elit        libero arcu    augue molestie ac
 0                    0                  3                  1                  0.5
 1                    1                  5                  2                  1
 2                    2                  7                  3                  1.5
 3                    3                  9                  4                  2

With float formatting:

print(tabulate(df.rename(columns=df_colnames_wrap)
                 .convert_dtypes(),
               headers='keys', tablefmt='plain',
               floatfmt='.2f'
              ))

Output:

      Lorem ipsum dolor        Consectetur    Fusce imperdiet      Vitae ultricies
               sit amet    adipiscing elit        libero arcu    augue molestie ac
 0                    0                  3                  1                 0.50
 1                    1                  5                  2                 1.00
 2                    2                  7                  3                 1.50
 3                    3                  9                  4                 2.00
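
The wrapping step in this answer can be checked on its own, without tabulate. Below is a minimal sketch using only the standard library; the column names are copied from the question, and the variable name `wrapped` is introduced here purely for illustration.

```python
from textwrap import wrap

# Long column names copied from the question
df_colnames = {
    "LIDSA": "Lorem ipsum dolor sit amet",
    "CAE": "Consectetur adipiscing elit",
    "FILA": "Fusce imperdiet libero arcu",
    "VUAMA": "Vitae ultricies augue molestie ac",
}

# wrap() breaks at whitespace and returns a list of lines,
# each at most 20 characters long ("wrapped" is an illustrative name)
wrapped = {k: wrap(v, 20) for k, v in df_colnames.items()}

for key, lines in wrapped.items():
    print(key, lines)
```

Each name splits into two lines here, for example 'Lorem ipsum dolor' and 'sit amet'; joining the pieces with '\n' gives the multi-line headers that tabulate then lays out.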

user3369545 · Answer 2 · 2024-03-09T00:48:25+08:00

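Pandas has no built-in way to word-wrap long column names or break them across lines when a DataFrame is converted to a string. The max_colwidth setting only affects the data in the table, not the column headers. If you add your own line breaks to the column names, they do not change how the headers are displayed; you see a literal "\n" in the output instead, which is not what you want.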

To get your column names to wrap onto multiple lines, you have to get a little creative and do it yourself. You need to:

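Write a function that takes a long column name and breaks it into smaller pieces, each short enough (say, no more than 20 characters) to fit on its own line. Run all the column names through this function, then adjust how the DataFrame is displayed so that the multi-line names render correctly. This approach means manually changing the column names to include line breaks where you want them, and then making sure the DataFrame's string representation (when you print it) honors those line breaks. It is mostly a matter of preparing the data and the display settings before actually printing or displaying the DataFrame.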

import pandas as pd

# Original DataFrame
df = pd.DataFrame({
    "LIDSA": [0, 1, 2, 3],
    "CAE": [3, 5, 7, 9],
    "FILA": [1, 2, 3, 4],
    "VUAMA": [0.5, 1.0, 1.5, 2.0],
})

# Dictionary with long column names
df_colnames = {
    "LIDSA": "Lorem ipsum dolor sit amet",
    "CAE": "Consectetur adipiscing elit",
    "FILA": "Fusce imperdiet libero arcu",
    "VUAMA": "Vitae ultricies augue molestie ac",
}

# Custom function to word-wrap text
def word_wrap(text, max_width):
    """
    Word-wrap text at a specified width. Attempts to break lines at word boundaries
    where possible.
    """
    words = text.split()
    lines = []
    current_line = []
    current_length = 0

    for word in words:
        if current_length + len(word) <= max_width:
            current_line.append(word)
            current_length += len(word) + 1  # +1 for space
        else:
            if current_line:  # skip the empty line a too-long first word would otherwise produce
                lines.append(' '.join(current_line))
            current_line = [word]
            current_length = len(word) + 1
    lines.append(' '.join(current_line))  # Add the last line

    return '\n'.join(lines)

# Apply word-wrap to column names
wrapped_colnames = {col: word_wrap(name, 20) for col, name in df_colnames.items()}

# Rename DataFrame columns
df = df.rename(columns=wrapped_colnames)

# Print the DataFrame with modified display settings
with pd.option_context('display.max_rows', None, 'display.max_columns', None, 'display.width', 0, 'max_colwidth', 20, 'display.float_format', "{:.2f}".format):
    print(df.to_string())
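
Since the question reports that to_string() renders embedded line breaks as a literal \n, one possible way to make the output honor them is to lay out the table by hand. The following is a rough sketch under that assumption, using only the standard library; `render_wrapped` and all of its parameters are hypothetical names introduced here, not part of pandas.

```python
from textwrap import wrap

def render_wrapped(headers, index, rows, width=20, float_fmt="{:.2f}"):
    """Render a right-aligned plain-text table with word-wrapped headers.

    Hypothetical helper for illustration; not a pandas API.
    """
    # Wrap every header into lines of at most `width` characters
    wrapped = [wrap(str(h), width) or [""] for h in headers]
    depth = max((len(w) for w in wrapped), default=0)
    # Pad shorter headers with blank lines on top so all end on the same line
    wrapped = [[""] * (depth - len(w)) + w for w in wrapped]

    # Format cells: floats via float_fmt, everything else via str()
    def fmt(value):
        return float_fmt.format(value) if isinstance(value, float) else str(value)

    cells = [[fmt(v) for v in row] for row in rows]
    labels = [str(i) for i in index]

    # Each column is as wide as its widest header line or cell
    widths = [
        max([len(line) for line in w] + [len(r[j]) for r in cells])
        for j, w in enumerate(wrapped)
    ]
    label_width = max((len(s) for s in labels), default=0)

    out = []
    for k in range(depth):  # header lines, with a blank index column
        parts = [" " * label_width]
        parts += [w[k].rjust(widths[j]) for j, w in enumerate(wrapped)]
        out.append("  ".join(parts))
    for label, row in zip(labels, cells):  # data rows
        parts = [label.ljust(label_width)]
        parts += [c.rjust(widths[j]) for j, c in enumerate(row)]
        out.append("  ".join(parts))
    return "\n".join(out)

# Same data as the example DataFrame, written out as plain Python values
headers = [
    "Lorem ipsum dolor sit amet",
    "Consectetur adipiscing elit",
    "Fusce imperdiet libero arcu",
    "Vitae ultricies augue molestie ac",
]
rows = [(0, 3, 1, 0.5), (1, 5, 2, 1.0), (2, 7, 3, 1.5), (3, 9, 4, 2.0)]
table = render_wrapped(headers, range(4), rows)
print(table)
```

With a DataFrame whose columns already carry the long names, the same helper could presumably be fed df.columns, df.index and df.itertuples(index=False). Doing the layout by hand avoids a third-party dependency, at the cost of reimplementing the alignment that to_string() normally handles.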
