AskOverflow.Dev

AskOverflow.Dev Logo AskOverflow.Dev Logo

AskOverflow.Dev Navigation

  • 主页
  • 系统&网络
  • Ubuntu
  • Unix
  • DBA
  • Computer
  • Coding
  • LangChain

Mobile menu

Close
  • 主页
  • 系统&网络
    • 最新
    • 热门
    • 标签
  • Ubuntu
    • 最新
    • 热门
    • 标签
  • Unix
    • 最新
    • 标签
  • DBA
    • 最新
    • 标签
  • Computer
    • 最新
    • 标签
  • Coding
    • 最新
    • 标签
主页 / user-29846737

Julie Johnson's questions

Martin Hope
Julie Johnson
Asked: 2025-04-09 21:56:30 +0800 CST

如何使用 C 语言中的管道正确解析 strace 输出中的系统调用执行时间?

  • 11

我正在研究操作系统编程作业以了解管道。文本任务涉及使用正则表达式从 strace 输出中提取信息并统计系统调用执行时间。

但是,我遇到了时间计算不正确的问题。我怀疑 parse_strace_line 函数未能正确解析 strace 输出行,尤其是时间戳部分(例如 <0.000123>),导致统计信息中的时间值不正确或缺失。该函数可能无法正确处理行格式、行不完整等边缘情况,或者正则表达式模式可能无法准确捕获 strace 输出中的时间分量。

我的环境是在 WSL2 上运行的 Ubuntu。

以下是代码:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>
#include <regex.h>

#define MAX_SYSCALLS 1024  // Maximum distinct syscalls we can track
#define TOP_N 5            // Number of top results to display

/* Syscall tracking structure
 * name: syscall name (e.g., "open", "read")
 * time: accumulated time spent in this syscall (seconds)
 */
typedef struct {
    char name[64];
    double time;
} syscall_stat;

/* Main statistics container
 * stats: array of individual syscall metrics
 * count: number of unique syscalls tracked
 * total_time: sum of all syscall times for percentage calculation
 */
typedef struct {
    syscall_stat stats[MAX_SYSCALLS];
    int count;
    double total_time;
} syscall_stats;

/* Parse a line of strace output with timing information (-T option)
 * line: input string from strace output
 * syscall_name: output parameter for extracted syscall name
 * time: output parameter for extracted duration
 * Returns: 1 if successful, 0 if line doesn't contain timing info
 * 
 * Strace line format examples:
 * openat(AT_FDCWD, "/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 <0.000014>
 * read(0, "hello\n", 1024)               = 6 <0.000010>
 */
int parse_strace_line(char* line, char* syscall_name, double* time) {
    // Filter lines without timing information
    if (strstr(line, " = ") == NULL || strstr(line, "<") == NULL) {
        return 0;
    }

    // Regular expression pattern breakdown:
    // ^(\w+)       Capture group 1: syscall name at line start
    // \(.*?\)      Non-greedy match of arguments between parentheses
    // \s+=\s+.*?   Match return value portion
    // <(\d+\.?\d*)> Capture group 2: time value between <>
    regex_t regex;
    regcomp(&regex, "^(\\w+)\\(.*?\\)\\s+=\\s+.*?<(\\d+\\.?\\d*)>$", REG_EXTENDED);
    regmatch_t matches[3];  // Array for regex capture groups

    if (regexec(&regex, line, 3, matches, 0) != 0) {
        regfree(&regex);
        return 0;
    }

    // Extract syscall name from first capture group
    int name_len = matches[1].rm_eo - matches[1].rm_so;
    strncpy(syscall_name, line + matches[1].rm_so, name_len);
    syscall_name[name_len] = '\0';

    // Convert time string to double
    char* time_str = line + matches[2].rm_so;
    *time = atof(time_str);

    regfree(&regex);
    return 1;
}

/* Update statistics with new syscall data
 * stats: pointer to statistics container
 * name: syscall name to add/update
 * time: duration to accumulate
 * 
 * If the syscall already exists, we sum the time.
 * If new, we add it to the list (until MAX_SYSCALLS).
 */
void add_syscall(syscall_stats *stats, const char *name, double time) {
    // Safety check against array overflow
    if (stats->count >= MAX_SYSCALLS) {
        fprintf(stderr, "Warning: Reached maximum tracked syscalls (%d)\n", MAX_SYSCALLS);
        return;
    }

    // Check existing entries for duplicate syscall
    for (int i = 0; i < stats->count; i++) {
        if (strcmp(stats->stats[i].name, name) == 0) {
            stats->stats[i].time += time;
            stats->total_time += time;
            return;
        }
    }

    // Add new entry if space available
    if (stats->count < MAX_SYSCALLS) {
        strncpy(stats->stats[stats->count].name, name, sizeof(stats->stats[0].name));
        stats->stats[stats->count].time = time;
        stats->total_time += time;
        stats->count++;
    }
}

/* Display top N syscalls by accumulated time
 * stats: statistics container with all data
 * n: number of top results to show
 * 
 * Uses bubble sort for simplicity with small datasets.
 * Prints percentages relative to total tracked time.
 * Flushes output with nulls for clean pipe termination.
 */
void print_top_syscalls(syscall_stats *stats, int n) {
    if (stats->count == 0 || stats->total_time == 0) {
        fprintf(stderr, "Error: No valid syscall data to display\n");
        return;
    }

    // Simple O(n²) sort acceptable for MAX_SYSCALLS=1024
    for (int i = 0; i < stats->count - 1; i++) {
        for (int j = i+1; j < stats->count; j++) {
            // Swap if later entry has higher time
            if (stats->stats[j].time > stats->stats[i].time) {
                syscall_stat temp = stats->stats[i];
                stats->stats[i] = stats->stats[j];
                stats->stats[j] = temp;
            }
        }
    }

    // Determine number of results to display
    n = stats->count < n ? stats->count : n;
    for (int i = 0; i < n; i++) {
        double ratio = (stats->stats[i].time / stats->total_time) * 100;
        printf("%s (%d%%)\n", stats->stats[i].name, (int)ratio);
    }

    /* Null-terminate output for clean pipe communication
     * Some programs may read fixed buffer sizes - this ensures
     * we flush the pipe completely and avoid hanging readers */
    for (int i = 0; i < 80; i++) putchar('\0');
    fflush(stdout);
}

/* Main execution flow:
 * 1. Validate command line arguments
 * 2. Set up parent/child process communication
 * 3. Child: execute strace with command
 * 4. Parent: process and analyze strace output
 */
int main(int argc, char *argv[]) {
    if (argc < 2) {
        fprintf(stderr, "Usage: %s COMMAND [ARGS...]\n", argv[0]);
        fprintf(stderr, "Example: %s ls -l\n", argv[0]);
        return 1;
    }

    // Create pipe for inter-process communication
    int pipefd[2];
    if (pipe(pipefd) == -1) {
        perror("Failed to create pipe");
        return 1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        perror("Process fork failed");
        return 1;
    }

    if (pid == 0) { // Child process - execute strace
        // Redirect stdout to pipe write end
        close(pipefd[0]);
        dup2(pipefd[1], STDOUT_FILENO);
        close(pipefd[1]);

        // Build strace argument array:
        // strace -T [user command] [args...]
        char* strace_argv[argc + 2];
        strace_argv[0] = "/usr/bin/strace";
        strace_argv[1] = "-T";  // Enable timing reports
        for (int i = 1; i < argc; i++) {
            strace_argv[i+1] = argv[i];
        }
        strace_argv[argc+1] = NULL;  // NULL-terminate array

        // Execute strace with clean environment
        execve("/usr/bin/strace", strace_argv, (char *[]){ "PATH=/bin:/usr/bin", NULL });
        perror("Failed to execute strace");
        return 1;
    } else { // Parent process - analyze output
        close(pipefd[1]);  // Close unused write end
        syscall_stats stats = {0};  // Initialize all fields to zero

        char buffer[1024];
        char* line_start = buffer;
        ssize_t bytes_read;

        // Read output in chunks until EOF
        while ((bytes_read = read(pipefd[0], 
               line_start, 
               sizeof(buffer) - (line_start - buffer))) > 0) {
            // Null-terminate the received data
            buffer[bytes_read + (line_start - buffer)] = '\0';
            line_start = buffer;

            // Process complete lines (delimited by newlines)
            char* line_end;
            while ((line_end = strchr(line_start, '\n')) != NULL) {
                *line_end = '\0';  // Temporarily terminate at newline

                char syscall_name[64];
                double time;
                if (parse_strace_line(line_start, syscall_name, &time)) {
                    add_syscall(&stats, syscall_name, time);
                }

                line_start = line_end + 1;  // Move to next line
            }

            // Handle partial line remaining in buffer
            if (line_start != buffer) {
                size_t remaining = buffer + sizeof(buffer) - line_start;
                // Move partial line to buffer start for next read
                memmove(buffer, line_start, remaining);
                line_start = buffer + remaining;
            }
        }

        // Cleanup and display results
        close(pipefd[0]);
        wait(NULL);  // Wait for strace to terminate completely
        print_top_syscalls(&stats, TOP_N);
    }

    return 0;
}

这是要测试的命令行:

root@DimLights:/mnt/f/WSL/Data/OperatingSystems/2025/os2025/sperf# gcc -g -o sperf sperf.c -lm -lrt
root@DimLights:/mnt/f/WSL/Data/OperatingSystems/2025/os2025/sperf# ./sperf echo "Hello WSL"

execve("/bin/echo", ["echo", "Hello WSL"], 0x7ffe6258ee20 /* 1 var */) = 0 <0.000363>
brk(NULL)                               = 0x55d4dee1c000 <0.000104>
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f483b684000 <0.000137>
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory) <0.000115>
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 <0.000133>
fstat(3, {st_mode=S_IFREG|0644, st_size=36191, ...}) = 0 <0.000115>
mmap(NULL, 36191, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f483b67b000 <0.000097>
close(3)                                = 0 <0.000079>
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 <0.000108>
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\220\243\2\0\0\0\0\0"..., 832) = 832 <0.000094>
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784 <0.000101>
fstat(3, {st_mode=S_IFREG|0755, st_size=2125328, ...}) = 0 <0.000096>
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784 <0.000088>
mmap(NULL, 2170256, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f483b469000 <0.000107>
mmap(0x7f483b491000, 1605632, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x28000) = 0x7f483b491000 <0.000117>
mmap(0x7f483b619000, 323584, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1b0000) = 0x7f483b619000 <0.000099>
mmap(0x7f483b668000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1fe000) = 0x7f483b668000 <0.000102>
mmap(0x7f483b66e000, 52624, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f483b66e000 <0.000103>
close(3)                                = 0 <0.000104>
mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f483b466000 <0.000085>
arch_prctl(ARCH_SET_FS, 0x7f483b466740) = 0 <0.000049>
set_tid_address(0x7f483b466a10)         = 1543 <0.000319>
set_robust_list(0x7f483b466a20, 24)     = 0 <0.000098>
rseq(0x7f483b467060, 0x20, 0, 0x53053053) = 0 <0.000123>
mprotect(0x7f483b668000, 16384, PROT_READ) = 0 <0.000096>
mprotect(0x55d4de99a000, 4096, PROT_READ) = 0 <0.000093>
mprotect(0x7f483b6bc000, 8192, PROT_READ) = 0 <0.000103>
prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0 <0.000095>
munmap(0x7f483b67b000, 36191)           = 0 <0.000118>
getrandom("\xc0\x8d\xa3\xbc\x88\x59\xbb\xd3", 8, GRND_NONBLOCK) = 8 <0.000098>
brk(NULL)                               = 0x55d4dee1c000 <0.000113>
brk(0x55d4dee3d000)                     = 0x55d4dee3d000 <0.000097>
fstat(1, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0 <0.000092>
write(1, "Hello WSL\n", 10Received from pipe: Hello WSL
)             = 10 <0.000216>
close(1)                                = 0 <0.000165>
close(2)                                = 0 <0.000092>
exit_group(0)                           = ?
+++ exited with 0 +++
Total count is zero

以下是我尝试过的方法和预期结果:我使用 GDB 调试,发现 while ((line = strchr(line_start, '\n')) != NULL) 循环在测试命令时只运行了一次。我不明白为什么输出显示 +++ exited with 0 +++ 并且 Total count 为零。我还向 AI 寻求帮助,并修改了 parse_strace_line 函数和正则表达式模式的部分内容,但收效甚微。

实际发生的情况:统计数据仍然为空(总数为零),表明解析或数据提取失败。

我的期望:代码应该正确解析来自 strace 输出的系统调用时间戳,累积它们的时间,并计算每个调用的时间与总时间的比率。

作为一名学生,如果您能指出我的代码中的错误并解释如何正确读取系统调用时间并将其转换为比率,我将不胜感激。

c
  • 1 个回答
  • 97 Views

Sidebar

Stats

  • 问题 205573
  • 回答 270741
  • 最佳答案 135370
  • 用户 68524
  • 热门
  • 回答
  • Marko Smith

    重新格式化数字,在固定位置插入分隔符

    • 6 个回答
  • Marko Smith

    为什么 C++20 概念会导致循环约束错误,而老式的 SFINAE 不会?

    • 2 个回答
  • Marko Smith

    VScode 自动卸载扩展的问题(Material 主题)

    • 2 个回答
  • Marko Smith

    Vue 3:创建时出错“预期标识符但发现‘导入’”[重复]

    • 1 个回答
  • Marko Smith

    具有指定基础类型但没有枚举器的“枚举类”的用途是什么?

    • 1 个回答
  • Marko Smith

    如何修复未手动导入的模块的 MODULE_NOT_FOUND 错误?

    • 6 个回答
  • Marko Smith

    `(表达式,左值) = 右值` 在 C 或 C++ 中是有效的赋值吗?为什么有些编译器会接受/拒绝它?

    • 3 个回答
  • Marko Smith

    在 C++ 中,一个不执行任何操作的空程序需要 204KB 的堆,但在 C 中则不需要

    • 1 个回答
  • Marko Smith

    PowerBI 目前与 BigQuery 不兼容:Simba 驱动程序与 Windows 更新有关

    • 2 个回答
  • Marko Smith

    AdMob:MobileAds.initialize() - 对于某些设备,“java.lang.Integer 无法转换为 java.lang.String”

    • 1 个回答
  • Martin Hope
    Fantastic Mr Fox msvc std::vector 实现中仅不接受可复制类型 2025-04-23 06:40:49 +0800 CST
  • Martin Hope
    Howard Hinnant 使用 chrono 查找下一个工作日 2025-04-21 08:30:25 +0800 CST
  • Martin Hope
    Fedor 构造函数的成员初始化程序可以包含另一个成员的初始化吗? 2025-04-15 01:01:44 +0800 CST
  • Martin Hope
    Petr Filipský 为什么 C++20 概念会导致循环约束错误,而老式的 SFINAE 不会? 2025-03-23 21:39:40 +0800 CST
  • Martin Hope
    Catskul C++20 是否进行了更改,允许从已知绑定数组“type(&)[N]”转换为未知绑定数组“type(&)[]”? 2025-03-04 06:57:53 +0800 CST
  • Martin Hope
    Stefan Pochmann 为什么 {2,3,10} 和 {x,3,10} (x=2) 的顺序不同? 2025-01-13 23:24:07 +0800 CST
  • Martin Hope
    Chad Feller 在 5.2 版中,bash 条件语句中的 [[ .. ]] 中的分号现在是可选的吗? 2024-10-21 05:50:33 +0800 CST
  • Martin Hope
    Wrench 为什么双破折号 (--) 会导致此 MariaDB 子句评估为 true? 2024-05-05 13:37:20 +0800 CST
  • Martin Hope
    Waket Zheng 为什么 `dict(id=1, **{'id': 2})` 有时会引发 `KeyError: 'id'` 而不是 TypeError? 2024-05-04 14:19:19 +0800 CST
  • Martin Hope
    user924 AdMob:MobileAds.initialize() - 对于某些设备,“java.lang.Integer 无法转换为 java.lang.String” 2024-03-20 03:12:31 +0800 CST

热门标签

python javascript c++ c# java typescript sql reactjs html

Explore

  • 主页
  • 问题
    • 最新
    • 热门
  • 标签
  • 帮助

Footer

AskOverflow.Dev

关于我们

  • 关于我们
  • 联系我们

Legal Stuff

  • Privacy Policy

Language

  • Pt
  • Server
  • Unix

© 2023 AskOverflow.DEV All Rights Reserve