AskOverflow.Dev

hawkeye
Asked: 2019-01-11 02:23:49 +0800 CST

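How to delete all directories in a directory older than 2 weeks, except the latest one that matches a file pattern?

I have the following path:

/dir1/dir2/

In this path I have the following directories containing various (not relevant) application detritus: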

follower1234  1-Dec-2018
follower3456  2-Dec-2018
follower4567  3-Dec-2018
follower7890  9-Jan-2019
follower8901 10-Jan-2019
leader8765    4-Dec-2018
bystander6789 5-Dec-2018

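Assume that today is 10-Jan-2019.

Assume that there can be any number of followerXXXX, leaderXXXX and bystanderXXXX directories.

What I want is to delete all the followerXXXX directories that are more than two weeks old, except the most recent followerXXXX directory.

Now, I can delete all directories older than a particular date, but that isn't my question. I'm adding two additional parameters.

In this case I want to delete: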

follower1234  1-Dec-2018
follower3456  2-Dec-2018
follower4567  3-Dec-2018

but not

follower7890  9-Jan-2019
follower8901 10-Jan-2019
leader8765    4-Dec-2018
bystander6789 5-Dec-2018

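i.e. I want to delete directories that

(a) match a pattern

(b) are older than two weeks

(c) are not the most recent directory matching the pattern (i.e. keep the last one)

My question is: how do I delete all directories in a directory older than 2 weeks, except the latest one that matches a file pattern?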
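To try the answers without touching real data, the example listing can be recreated in a scratch directory. This is only a minimal sketch, assuming GNU `touch` (for `-d`) and `mktemp`; `make_fixture` is a hypothetical helper name, and the noon time of day is an arbitrary choice.

```shell
#!/bin/bash
# make_fixture: hypothetical helper that recreates the question's example
# directories under a fresh temporary directory and prints its path.
make_fixture () {
  local root name day
  root=$(mktemp -d) || return 1
  while read -r name day; do
    mkdir "$root/$name"
    # GNU touch: set the directory's modification time to noon on that day
    touch -d "$day 12:00:00" "$root/$name"
  done <<'EOF'
follower1234 2018-12-01
follower3456 2018-12-02
follower4567 2018-12-03
follower7890 2019-01-09
follower8901 2019-01-10
leader8765 2018-12-04
bystander6789 2018-12-05
EOF
  printf '%s\n' "$root"
}
```

Because the dates are fixed in the past, every directory in the fixture is by now older than two weeks, so an age filter alone matches all of them; only the keep-the-newest rule still tells them apart.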

shell-script directory

5 Answers

  1. Best Answer
    sudodus
    2019-01-11T05:03:30+08:00

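    Introduction

    The question has been modified.

    • My first alternative (the oneliner) does not match the new specification: it spares the newest directory among those that are old enough to be deleted (older than 14 days), not the newest match overall.

    • I made a second alternative (the shellscript) that uses the find -printf time format

      @ seconds since Jan. 1, 1970, 00:00 GMT, with fractional part.

      and subtracts the number of seconds corresponding to 14 days to get a timestamp, seclim, that marks the 'limit-in-seconds' position in the sorted list of directories.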
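The cutoff arithmetic described above can be sketched in isolation. This assumes GNU `date` and `stat`; `is_older_than` is a hypothetical helper for illustration and is not part of the script further down.

```shell
#!/bin/bash
# is_older_than PATH DAYS: succeed if PATH was last modified more than
# DAYS*24 hours ago (hypothetical helper; needs GNU stat for -c %Y).
is_older_than () {
  local path=$1 days=$2
  local secint=$(( 60 * 60 * 24 * days ))      # interval in seconds
  local seclim=$(( $(date '+%s') - secint ))   # cutoff as epoch seconds
  [ "$(stat -c '%Y' -- "$path")" -lt "$seclim" ]
}
```

For example, `is_older_than ./follower1234 14 && echo "old enough"` prints the message only when the directory's modification time lies before the cutoff.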

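    1. Oneliner

    The previous answers are clean and nice, but they do not preserve the newest follower directory. The following command line will do it (it can manage names with spaces, but names with newlines create problems),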

    find . -type d -name "follower*" -printf "%T+ %p\n"|sort|head -n -1 | cut -d ' ' -f 2- | sed -e 's/^/"/' -e 's/$/"/' | xargs echo rm -r
    

    Tested on this directory structure,

    $ find -printf "%T+ %p\n"|sort
    2019-01-10+13:11:40.6279621810 ./follower1
    2019-01-10+13:11:40.6279621810 ./follower1/2/3
    2019-01-10+13:11:40.6279621810 ./follower1/2/dirnam with spaces
    2019-01-10+13:11:40.6279621810 ./follower1/2/name with spaces
    2019-01-10+13:11:56.5968732640 ./follower1/2/file
    2019-01-10+13:13:18.3975675510 ./follower2
    2019-01-10+13:13:19.4016254340 ./follower3
    2019-01-10+13:13:20.4056833250 ./follower4
    2019-01-10+13:13:21.4097412230 ./follower5
    2019-01-10+13:13:22.4137991260 ./follower6
    2019-01-10+13:13:23.4138568040 ./follower7
    2019-01-10+13:13:24.4219149500 ./follower8
    2019-01-10+13:13:25.4259728780 ./follower9
    2019-01-10+13:15:34.4094596830 ./leader1
    2019-01-10+13:15:36.8336011960 .
    2019-01-10+13:15:36.8336011960 ./leader2
    2019-01-10+13:25:03.0751878450 ./follower1/2
    

    like so,

    $ find . -type d -name "follower*" -printf "%T+ %p\n"|sort|head -n -1 | cut -d ' ' -f 2- | sed -e 's/^/"/' -e 's/$/"/' | xargs echo rm -r
    rm -r ./follower1 ./follower2 ./follower3 ./follower4 ./follower5 ./follower6 ./follower7 ./follower8
    

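    So follower9 is excluded because it is the newest follower directory (directories with names that do not start with follower, here leader1, leader2 and 2, are not in the game).

    Now we add the time criterion, -mtime +14, and do another 'dry run' to check that it works as it should, after changing directory to where the real follower directories are,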

    find . -type d -name "follower*" -mtime +14 -printf "%T+ %p\n"|sort|head -n -1 | cut -d ' ' -f 2- | sed -e 's/^/"/' -e 's/$/"/' | xargs echo rm -r
    

    Finally we remove echo and have a command line that can do what we want,

    find . -type d -name "follower*" -mtime +14 -printf "%T+ %p\n"|sort|head -n -1 | cut -d ' ' -f 2- | sed -e 's/^/"/' -e 's/$/"/' | xargs rm -r
    

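    • find looks in the current directory for directories whose names begin with follower and that have not been modified for more than 14 days.
    • After printing and sorting, head -n -1 excludes the newest of the follower directories that passed the age test.
    • The time stamps are cut away, and double quotes are added at the beginning and end of each directory name.
    • Finally, the result is piped via xargs as parameters to rm -r in order to remove the directories that we want to remove.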
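As the introduction notes, filtering with -mtime before head -n -1 spares the newest of the old directories rather than the newest match overall. A sketch of the other order (drop the newest match first, then apply the age test) might look like the following. It assumes bash and GNU find, sort and tail (for `-printf`, `-z`); `list_prunable` is a hypothetical name, and the function only lists candidates instead of deleting them.

```shell
#!/bin/bash
# list_prunable DIR PATTERN [DAYS]: print the directories directly under
# DIR that match PATTERN and are older than DAYS (default 14), always
# sparing the newest match. Dry run only: nothing is deleted.
list_prunable () {
  local dir=$1 pattern=$2 days=${3:-14}
  local cutoff=$(( $(date +%s) - days * 24 * 60 * 60 ))
  find "$dir" -mindepth 1 -maxdepth 1 -type d -name "$pattern" \
       -printf '%T@ %p\0' |
    sort -z -n -r |      # newest first, NUL-separated records
    tail -z -n +2 |      # skip the first record, i.e. the newest match
    while IFS=' ' read -r -d '' stamp path; do
      # %T@ carries a fractional part; compare whole seconds only.
      # Caveat: a name ending in whitespace would be trimmed by read.
      if [ "${stamp%.*}" -lt "$cutoff" ]; then
        printf '%s\n' "$path"
      fi
    done
}
```

Calling `list_prunable . 'follower*'` prints one candidate per line; once the list looks right, the `printf` could be swapped for `rm -r -- "$path"`.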

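    2. Shellscript

    I made a second alternative (the shellscript) that uses

    @      seconds since Jan. 1, 1970, 00:00 GMT, with fractional part.
    

    It also has two options,

    • -n dry run
    • -v verbose

    • I modified the shellscript according to what the OP wants: enter the pattern as a parameter within single quotes, e.g. 'follower*'.

    • I suggest naming the shellscript prune-dirs, because it is more general now (no longer only prune-followers, pruning directories named follower*).

    You are recommended to run the shellscript with both options the first time in order to 'see' what it will do. When the output looks correct, remove -n to make the shellscript remove the directories that are old enough to be removed. So let us call it prune-dirs and make it executable.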

    #!/bin/bash
    
    # date        sign     comment
    # 2019-01-11  sudodus  version 1.1
    # 2019-01-11  sudodus  enter the pattern as a parameter
    # 2019-01-11  sudodus  add usage
    # 2019-01-14  sudodus  version 1.2
    # 2019-01-14  sudodus  check if any parameter to the command to be performed
    
    # Usage
    
    usage () {
     echo "Remove directories found via the pattern (older than 'datint')
    
     Usage:    $0 [options] <pattern>
    Examples: $0 'follower*'
              $0 -v -n 'follower*'  # 'verbose' and 'dry run'
    The 'single quotes' around the pattern are important to avoid that the shell expands
    the wild card (for example the star, '*') before it reaches the shellscript"
     exit
    }
    
    # Manage options and parameters
    
    verbose=false
    dryrun=false
    while [ $# -gt 0 ]   # consume leading options; stop at the pattern
    do
     case "$1" in
      -v) verbose=true; shift ;;
      -n) dryrun=true; shift ;;
      *)  break ;;
     esac
    done
    if [ $# -eq 1 ]
    then
     pattern="$1"
    else
     usage
    fi
    
    # Command to be performed on the selected directories
    
    cmd () {
     echo rm -r "$@"
    }
    
    # Pattern to search for and limit between directories to remove and keep
    
    #pattern='follower*'
    datint=14  # days
    
    tmpdir=$(mktemp -d)
    tmpfil1="$tmpdir"/fil1
    tmpfil2="$tmpdir"/fil2
    
    secint=$((60*60*24*datint))
    seclim=$(date '+%s')
    seclim=$((seclim - secint))
    printf "%s limit-in-seconds\n" $seclim > "$tmpfil1"
    
    if $verbose
    then
     echo "----------------- excluding newest match:"
     find . -type d -name "$pattern" -printf "%T@ %p\n" | sort -n |tail -n1 | cut -d ' ' -f 2- | sed -e 's/^/"/' -e 's/$/"/'
    fi
    
    # exclude the newest match with 'head -n -1'
    # (sort -n compares the timestamps as numbers rather than as text)
    
    find . -type d -name "$pattern" -printf "%T@ %p\n" | sort -n |head -n -1 >> "$tmpfil1"
    
    # put 'limit-in-seconds' in the correct place in the sorted list and remove the timestamps
    
    sort -n "$tmpfil1" | cut -d ' ' -f 2- | sed -e 's/^/"/' -e 's/$/"/' > "$tmpfil2"
    
    if $verbose
    then
     echo "----------------- listing matches with 'limit-in-seconds' in the sorted list:"
     cat "$tmpfil2"
     echo "-----------------"
    fi
    
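    # create 'remove task' for the directories older than 'limit-in-seconds'
    
    params=
    while IFS= read -r filnam   # keep backslashes and surrounding blanks intact
    do
     if [ "${filnam/limit-in-seconds}" != "$filnam" ]
     then
      break   # everything after the marker is newer than the limit
     else
      params="$params $filnam"
     fi
    done < "$tmpfil2"
    # $params is deliberately unquoted: each name already carries its own
    # double quotes, which bash interprets when it runs the generated file
    # (names containing ", $ or a backquote would break this quoting)
    cmd $params > "$tmpfil1"
    cat  "$tmpfil1"
    
    if ! $dryrun && [ -n "$params" ]
    then
     bash "$tmpfil1"
    fi
    rm -r "$tmpdir"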
    
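    • Change the current directory to the directory with the follower subdirectories
    • create the file prune-dirs
    • make it executable
    • and run it with the two options -v -n and the pattern

      cd directory-with-subdirectories-to-be-pruned/
      nano prune-dirs  # copy and paste into the editor and save the file
      chmod +x prune-dirs
      ./prune-dirs -v -n 'follower*'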
      

    Test

    I tested prune-dirs in a directory with the following sub-directories, as seen with find

    $ find . -type d -printf "%T+ %p\n"|sort
    2018-12-01+02:03:04.0000000000 ./follower1234
    2018-12-02+03:04:05.0000000000 ./follower3456
    2018-12-03+04:05:06.0000000000 ./follower4567
    2018-12-04+05:06:07.0000000000 ./leader8765
    2018-12-05+06:07:08.0000000000 ./bystander6789
    2018-12-06+07:08:09.0000000000 ./follower with spaces old
    2019-01-09+10:11:12.0000000000 ./follower7890
    2019-01-10+11:12:13.0000000000 ./follower8901
    2019-01-10+13:15:34.4094596830 ./leader1
    2019-01-10+13:15:36.8336011960 ./leader2
    2019-01-10+14:08:36.2606738580 ./2
    2019-01-10+14:08:36.2606738580 ./2/follower with spaces
    2019-01-10+17:33:01.7615641290 ./follower with spaces new
    2019-01-10+19:47:19.6519169270 .
    

    Usage

    $ ./prune-dirs
    Remove directories found via the pattern (older than 'datint')
    
     Usage:    ./prune-dirs [options] <pattern>
    Examples: ./prune-dirs 'follower*'
              ./prune-dirs -v -n 'follower*'  # 'verbose' and 'dry run'
    The 'single quotes' around the pattern are important to avoid that the shell expands
    the wild card (for example the star, '*') before it reaches the shellscript
    

    Run with -v -n (a verbose dry run)

    $ ./prune-dirs -v -n 'follower*'
    ----------------- excluding newest match:
    "./follower with spaces new"
    ----------------- listing matches with 'limit-in-seconds' in the sorted list:
    "./follower1234"
    "./follower3456"
    "./follower4567"
    "./follower with spaces old"
    "limit-in-seconds"
    "./follower7890"
    "./follower8901"
    "./2/follower with spaces"
    -----------------
    rm -r "./follower1234" "./follower3456" "./follower4567" "./follower with spaces old"
    

    A verbose dry run with a more general pattern

    $ LANG=C ./prune-dirs -v -n '*er*'
    ----------------- excluding newest match:
    "./follower with spaces new"
    ----------------- listing matches with 'limit-in-seconds' in the sorted list:
    "./follower1234"
    "./follower3456"
    "./follower4567"
    "./leader8765"
    "./bystander6789"
    "./follower with spaces old"
    "limit-in-seconds"
    "./follower7890"
    "./follower8901"
    "./leader1"
    "./leader2"
    "./2/follower with spaces"
    -----------------
    rm -r "./follower1234" "./follower3456" "./follower4567" "./leader8765" "./bystander6789" "./follower with spaces old"
    

    Run without any options (a real case, removing directories)

    $ ./prune-dirs 'follower*'
    rm -r "./follower1234" "./follower3456" "./follower4567" "./follower with spaces old"
    

    Run with -v, 'trying again'

    $ LANG=C ./prune-dirs -v 'follower*'
    ----------------- excluding newest match:
    "./follower with spaces new"
    ----------------- listing matches with 'limit-in-seconds' in the sorted list:
    "limit-in-seconds"
    "./follower7890"
    "./follower8901"
    "./2/follower with spaces"
    -----------------
    rm -r
    

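    The shellscript lists no directory 'above' 'limit-in-seconds', and no files are listed for the rm -r command line, so the work was already done (which is the correct result). But if you run the shellscript again several days later, some new directory may be found 'above' 'limit-in-seconds' and be removed.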

    • 7
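  2. Emilio Galarraga
    2019-01-11T03:15:50+08:00

    To complement rowan's answer: you can replace the dot with the path to the directories. Quote the pattern so that the shell passes it to find unexpanded.

    find . -type d -name 'follower*' -mtime +14 -exec rm -rf {} +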
    
    • 5
  3. Stéphane Chazelas
    2019-01-11T09:22:02+08:00

    With zsh:

    (){ n=$#; } follower<->(/)       # count the number of follower<n> dirs
    
    to_remove=(follower<->(/m+13om)) # assumes the dir list is not changed
                                     # since the previous command
    
    (($#to_remove < n)) || to_remove[1]=() # keep the youngest if they're
                                           # all over 2 weeks old
    
    
    
    echo rm -rf $to_remove
    

    (remove echo when happy)

    • <->: any sequence of decimal digits (the unbounded form of a range such as <1-20>).
    • (){code} args: anonymous function, which here stores its number of arguments in $n.
    • (/m+13om): glob qualifiers:
    • /: only select files of type directory (equivalent of find's -type d)
    • m+13: files whose age in whole days is strictly greater than 13 days, so files that are 14 days old or older (equivalent of find's -mtime +13).
    • om: order by modification time (like ls -t, younger files first)

    Note that it's dangerous to rely on directory modification time. Directories are modified when files are added, removed or renamed in them (or when they're touched). Since those directories are numbered, you may want to rely on that numbering instead, so replace om with nOn (numerically Order in reverse (capital O) by name).
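The warning about directory modification times can be checked in a scratch directory. The sketch below is written for bash with GNU `touch`, `stat` and `sort` rather than zsh, and every name in it is a throwaway example. It also shows version sort as one possible way to pick the "newest" directory by its number instead of by timestamp.

```shell
#!/bin/bash
# Demonstrate that adding an entry to a directory updates its mtime.
# Assumes GNU touch and stat; all names here are scratch examples.
demo=$(mktemp -d)
mkdir "$demo/follower1234"
touch -d '2018-12-01 12:00:00' "$demo/follower1234"
before=$(stat -c '%Y' "$demo/follower1234")
: > "$demo/follower1234/new-file"     # creating a file modifies the directory
after=$(stat -c '%Y' "$demo/follower1234")
echo "mtime before: $before, after: $after"

# Picking the "newest" by number instead: version sort puts the highest
# numeric suffix last, independent of any timestamps.
newest_by_name=$(printf '%s\n' follower2 follower100 follower10 | sort -V | tail -n 1)
echo "newest by name: $newest_by_name"
rm -rf "$demo"
```

After the file is created, the directory no longer looks like it dates from December 2018, so an age-based cleanup would skip it even though its name says it is an old one.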

    To have the pattern in a variable, replace follower<-> with $~pattern and set pattern='follower<->' or any other value.

    • 5
  4. rowan
    2019-01-11T02:52:41+08:00

    When I need to delete files or directories based on their age, I use find.

    Before deleting anything, you can run the command a few times to see whether it finds everything you want.

    find . -type d -mtime +14
    # -type c, File is of type c: d  directory
    # -mtime n, File's data was last modified n*24 hours ago.
    

    If it matches all your criteria, you can add -exec rm -r {} + after it:

    find . -type d -mtime +14 -exec rm -r {} +
    

    The reason we use -exec here is that -delete won't work if the directory isn't empty.

    Check out man find for more guidance.
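The difference between -delete and -exec rm -r can be seen on a throwaway directory. This is a small sketch assuming GNU find and touch; the paths are scratch names created with mktemp.

```shell
#!/bin/bash
# Scratch demo (GNU find): -delete cannot remove a non-empty directory,
# while -exec rm -r can.
work=$(mktemp -d)
mkdir "$work/old"
: > "$work/old/file"
touch -d '30 days ago' "$work/old"    # age the directory after filling it

# -delete removes directories the way rmdir does, so this fails and the
# directory survives (the error message is discarded here).
find "$work" -mindepth 1 -type d -mtime +14 -delete 2>/dev/null || true
[ -d "$work/old" ] && survived_delete=yes || survived_delete=no

# rm -r removes the directory together with its contents.
find "$work" -mindepth 1 -maxdepth 1 -type d -mtime +14 -exec rm -r {} +
[ -d "$work/old" ] && survived_rm=yes || survived_rm=no

echo "after -delete: $survived_delete, after rm -r: $survived_rm"
rm -rf "$work"
```

The -maxdepth 1 in the second command keeps find from trying to descend into directories that rm -r is about to remove.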

    • 3
  5. fra-san
    2019-01-11T07:33:09+08:00

    A couple of solutions:

    1. Based on GNU find:

    #!/bin/bash
    
    # The pattern has to be available to subshells
    export patt="$1"
    
    find . -maxdepth 1 -type d -name "${patt}*" -mtime +14 \
      -exec sh -c '[ "$(find . -maxdepth 1 -type d -name "${patt}*" -print0 |
        sort -z -V |
        tail -z -n 1 |
        tr -d "\0")" != "$1" ]' sh {} \; \
      -exec sh -c 'echo rm -r "$1"' sh {} \;
    

    This script is meant to be invoked as:

    ./script name_pattern
    

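    As is, it gives you a dry run. Remove echo from the last -exec action to make it actually delete directories.

    It will:

    • Find all the directories in the current directory that were modified more than 14 days ago (but see the note about -mtime below) and whose name starts with the value of ${patt}; for each one:
    • Make sure (the first -exec) that the found directory is not the last one matching the name pattern when sorted in ascending version order (-V) (so that, for instance, follower100 is placed after follower2); if the test ([) fails, find skips to the next result and does not perform the actions that follow;
    • Remove the found directory (the second -exec).

    Here I am assuming that sorting your directories by name (in version order) is equivalent to sorting them by modification date. This is fine if your latest directory is defined in terms of its name.
    If, instead, your latest directory is the one with the most recent modification time, we have to replace the first -exec ... in the code above with this one: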

      -exec sh -c '[ "$(find . -maxdepth 1 -type d -name "${patt}*" -printf "%T@\n" |
        sed "s/\..*$//" |
        sort -n |
        tail -n 1)" != "$(stat -c "%Y" "$1")" ]' sh {} \; \
    

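    With the inner find, we find all the directories matching the name pattern, print the list of their modification times in seconds since Epoch, cut off the fractional part, sort, take the last one and check that it is not equal to the modification time of the current result of the outer find.

    Note that, with this filter, if all the matching directories are older than 14 days and all have exactly the same modification time, none of them is deleted.


    Notes:

    Limiting the search to the content of the current directory (-maxdepth 1) is not strictly required.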

    You may want to tell sort how to order things, e.g. adding export LC_ALL=C at the beginning of the script (refer to this answer to 'What does "LC_ALL=C" do?' about the issues you may have when sorting, depending on your localization settings).

    Note that, using -mtime +14, files that have been modified between 14 and 15 days ago are skipped even if their modification time is technically older than 14*24 hours from now (refer to man find for details; specifically, the description of -atime n).

    It will work even when names contain spaces, newlines, uncommon and non-printable characters.

    Compatibility: the flip side is that it is not portable: some features used here, notably find's -maxdepth, -print0 and -printf, the stat command, the -V option to sort and the -z option to sort and tail (and I am possibly forgetting some more), are not specified in POSIX.
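The -mtime +14 note above can be illustrated with a directory that is 14 days and 12 hours old. This is a sketch assuming GNU find and touch; the -mmin variant is shown only as one possible way to get a cutoff of exactly 14*24 hours and is not part of the solution above.

```shell
#!/bin/bash
# Scratch demo (GNU find/touch): a directory 14.5 days old is skipped by
# -mtime +14 but matched by a minute-based test.
work=$(mktemp -d)
mkdir "$work/follower1"
age=$(( 14 * 86400 + 12 * 3600 ))                # 14 days 12 hours, in seconds
touch -d "@$(( $(date +%s) - age ))" "$work/follower1"

# -mtime counts whole 24-hour periods, discarding the fraction: 14.5 -> 14,
# and +14 means "strictly more than 14".
by_mtime=$(find "$work" -mindepth 1 -maxdepth 1 -type d -mtime +14)

# -mmin works in minutes, so the same cutoff is 14*24*60 = 20160 minutes.
by_mmin=$(find "$work" -mindepth 1 -maxdepth 1 -type d -mmin +$(( 14 * 24 * 60 )))

echo "-mtime +14 matched: ${by_mtime:-nothing}"
echo "-mmin +20160 matched: ${by_mmin:-nothing}"
rm -rf "$work"
```

With -mtime +14 the directory only becomes a candidate once it is a full 15 days old, which is the gap the note describes.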

    2. Based on shell features

    #!/bin/bash
    # bash rather than sh: the script uses arrays
    
    patt="$1"                 # The name pattern
    test -z "$patt" && exit   # Caution: pattern must not be empty
    
    days=14     # How old has to be a directory to get deleted, in days?
    last=       # The youngest directory
    
    dirs=( "$patt"* )     # Array of files matched by name (note, here we
                          # have everything that matches, not just dirs)
    now="$(date +'%s')"   # Now in seconds since Epoch
    
    threshold=$(( now - days * 24 * 60 * 60 ))
                          # Dirs older than this date (in seconds since
                          # Epoch) are candidates for deletion
    
    # We find the youngest directory in the array,
    # skipping matches that are not directories
    #
    for i in "${!dirs[@]}"; do
      [ -d "${dirs[$i]}" ] || continue
      if  [ -z "$last" ] ||
          [ "$(stat -c '%Y' -- "${dirs[$i]}")" -gt "$(stat -c '%Y' -- "$last")" ]; then
        last="${dirs[$i]}"
      fi
    done
    
    # We delete all the directories in the array that are
    # not the youngest one AND are older than the threshold
    #
    for i in "${!dirs[@]}"; do
      if  [ -d "${dirs[$i]}" ] &&
          [ "${dirs[$i]}" != "$last" ] &&
          [ "$(stat -c '%Y' -- "${dirs[$i]}")" -lt "$threshold" ]; then
        echo rm -rf -- "${dirs[$i]}"
      fi
    done
    

    This script, too, is meant to be invoked as

    ./script name_pattern
    

    Again, it will give you a dry run until you remove echo from echo rm -rf -- "${dirs[$i]}".

    It will:

    • Populate an array with the names of all files, in the current directory, that match the name pattern;
    • Determine the youngest directory in the array;
    • Delete all the directories in the array that 1) are older than 14 days AND 2) are not the youngest one.

    Notes:

    It will target directories older than 14 days from now (unlike find). Thus, these two solutions are not strictly equivalent.
    Also, if all the matching directories are older than the threshold and all have the same modification time, it will delete all but one of them; which one survives is effectively arbitrary.

    Names with uncommon characters are ok, including newlines and non printable ones.

    Compatibility: even this solution relies on some non POSIX features: namely, stat and the %s date format. Ah, and arrays, apparently...

    • 2
