
Question asked by 244boy, 2022-07-12 22:18:43 +0800 CST

Get the top URLs from logs


I have many log files:

adsfs.demo.com_2022-07-11-0000-0001_cn.tgz 
adsfs.demo.com_2022-07-11-0000-0002_cn.tgz 
adsfs.demo.com_2022-07-11-0000-0003_cn.tgz 
adsfs.demo.com_2022-07-11-0000-0004_cn.tgz 
adsfs.demo.com_2022-07-11-0000-0005_cn.tgz 
...

Their content looks like this:

google 16.122.87.76 12.48.167.135 80 adsfs.demo.com [11/Jul/2022:00:45:03 +0800]  1657471503.000 "GET https://adsfs.demo.com/mp/app/feeds/index.js?age=11&name=jock 1.1" 304 - 395 - - 1 "https://dhfs.demo.com/" "Mozilla/5.0 (Linux; U; Android 11; zh-cn; PDVM00 Build/RKQ1.201217.002) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/90.0.4430.61 Mobile Safari/537.36 HeyTapBrowser/40.7.39.5" "16.11.87.76" "-" 1 - 1

I need to get the top URLs, including their query parameters, from the 8th field, which looks like this:

"GET https://adsfs.demo.com/mp/app/feeds/index.js?age=11&name=jock 1.1"

The result I want looks like this:

https://adsfs.demo.com/mp/app/feeds/index.js?age=11&name=jock 13549
https://adsfs.demo.com/mp/app/feeds/index.js?age=12&name=jock 12541
https://adsfs.demo.com/mp/app/feeds/index.js?age=13&name=rose 1142
https://adsfs.demo.com/mp/app1/index.css?age=11&name=jock 1074
https://adsfs.demo.com/mp/app2/index.html 874
...

I tried this, but it does not seem to be correct:

 zcat * | awk '{print $10, $17}' | awk '{a[$1]+=$10} END{for(i in a){print i, a[i]}}' | sort -rn -k 2 | head

https://adsfs.demo.com/user 0
https://adsfs.demo.com/union/adlogo/o_1512387525231.png 0
https://adsfs.demo.com/union/adlogo/logo_wo_b.png 0
https://adsfs.demo.com/union/adlogo/logo_w_b.png?aaa=aa.png 0
https://adsfs.demo.com/union/adlogo/logo_w_b.png?aa=1.jpg 0
https://adsfs.demo.com/union/adlogo/logo_w_b.png 0
https://adsfs.demo.com/union/adlogo/gdt_logo.png 0
https://adsfs.demo.com/signin 0
https://adsfs.demo.com/res/v2/feeds/mat_pic/202101/05/1000096829_1609822941972.jpg.short.webp?region=cn-north-1&x-ocs-process=image%252fresize%252cm_fix%252cw_640%252ch_320%252ffallback 0
https://adsfs.demo.com/res/v2/feeds/mat_pic/202101/05/1000096829_1609822941972.jpg.short.webp 0

1 Answer


Arnaud Valmary · Answer 1 · 2022-07-13T02:43:43+08:00

Best Answer


This is more complete than my comment: a full awk script and how to call it.

The awk script, ./topurllogs.awk:

#! /usr/bin/awk -f
# Requires GNU awk (gawk): PROCINFO["sorted_in"] is a gawk extension.

BEGIN {
    # Default number of rows to print (equivalent of the head command)
    if (MAX == "") {
        MAX = 10
    }
}
{
    # Sum field 17 per URL (field 10); replaces the second awk call
    h[$10] += $17
}
END {
    # Traversal order (equivalent of the sort command):
    # by array value, numeric, descending
    PROCINFO["sorted_in"] = "@val_num_desc"
    i = 0
    for (e in h) {
        i++
        # Print the URL and its total
        print e, h[e]
        # Stop after the first MAX entries
        if (i >= MAX) {
            break
        }
    }
}
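
The script reads the URL from $10 and the number from $17 in a single pass. As a small check of that field numbering, and of why the two-stage attempt in the question printed 0 for every URL, the sketch below runs both stages on the sample line from the question. The sample line is truncated after its 18th field for brevity, and the variable names (`line`, `stage1`, `attempted`, `adjusted`) are only illustrative.

```shell
#!/bin/sh
# Sample log line from the question, cut after the referrer (fields 1-18 are unchanged).
line='google 16.122.87.76 12.48.167.135 80 adsfs.demo.com [11/Jul/2022:00:45:03 +0800]  1657471503.000 "GET https://adsfs.demo.com/mp/app/feeds/index.js?age=11&name=jock 1.1" 304 - 395 - - 1 "https://dhfs.demo.com/"'

# Stage 1 of the attempted pipeline: keep the URL ($10) and the number in $17.
stage1=$(printf '%s\n' "$line" | awk '{ print $10, $17 }')

# Stage 2 as attempted: its input now has only two fields,
# so $10 is empty and every sum stays at 0.
attempted=$(printf '%s\n' "$stage1" |
    awk '{ a[$1] += $10 } END { for (i in a) print i, a[i] }')

# Stage 2 reading the second field of the stage 1 output instead.
adjusted=$(printf '%s\n' "$stage1" |
    awk '{ a[$1] += $2 } END { for (i in a) print i, a[i] }')

printf 'stage 1:   %s\n' "$stage1"
printf 'attempted: %s\n' "$attempted"
printf 'adjusted:  %s\n' "$adjusted"
```

With awk's default whitespace splitting, the quoted request occupies fields 9 to 11, which is why the URL is $10 even though it sits in the 8th logical field of the log format.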

Make the script executable with:

chmod +x ./topurllogs.awk

Use it like this:

zcat * | ./topurllogs.awk

Or with another MAX value:

zcat * | ./topurllogs.awk -v MAX=8
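
If the awk on a given machine is not GNU awk, PROCINFO["sorted_in"] is not available. A rough equivalent, assuming only POSIX awk, sort and head, is sketched below. The function name `top_urls` is hypothetical, and the sketch keeps the script's assumption that $17 holds the number to add up for each URL.

```shell
#!/bin/sh
# top_urls [MAX]: read log lines on stdin and print the MAX URLs
# (default 10) with the largest totals, largest first.
top_urls() {
    # Sum field 17 per URL (field 10), in no particular order.
    awk '{ h[$10] += $17 } END { for (e in h) print e, h[e] }' |
        # Sort by the total (numeric, descending); ties are ordered by URL.
        LC_ALL=C sort -k2,2nr -k1,1 |
        # Keep only the first MAX rows.
        head -n "${1:-10}"
}
```

It could then be called as `zcat * | top_urls 8`. Unlike the gawk script, this version sorts the full list of URLs before trimming it, which is usually fine for a top-N report.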
