GNU Parallel 是否会启动一批尽可能多的作业(启动的作业数量由 GNU Parallel 内部或/和-j
选项以及给定参数控制),一旦完成,然后启动下一批作业等等?
语境
我想学习如何更好地处理与作业(开始时间、结束时间和运行时间)和 GNU Parallel 相关的时间戳。作为这里的一个例子,我想了解我是否可以在我的自定义日志中使用时间戳,通过自定义log
函数记录,它在执行实际处理命令之前出现,总是在传递给 GNU Parallel 的 for 循环中. 他们能给我实际处理命令的运行时间吗?
细节
在一个 for 循环中,然后与 一起传递给 GNU Parallel --joblog
,我放置了两个命令:第一个命令是一个自定义log
命令,包括一些时间戳,就在第二个命令之前,它执行感兴趣的实际处理。自定义log
命令的时间不是直接感兴趣的——它是另一个日志命令。不幸的是,我不知道该--joblog
选项是如何工作的——正如这里所解释的GNU Parallel——joblog 只记录 for loop 中的第一行命令,它只记录第一条命令。
为了理解我的日志,我用它来显示输出mlr
的前三行--joblog
❯ mlr --itsv --oxtab head -n 3 parallel/parallel.job.4437.3.log
Seq 1
Host :
Starttime 1670106266.417
JobRuntime 0.000
Send 0
Receive 0
Exitval 0
Signal 0
Command log /scratch/pvgis/job.4437.3/Process_2022_12_02_23_15_50_10m_u_component_of_wind_2008.log Action=Metadata, Map=era5_and_land_10m_u_component_of_wind_2008_band_79_merged_scaled.nc, Hours since=946704, Longname=10 metre U wind component, Units=m s**-1
Seq 2
Host :
Starttime 1670106266.419
JobRuntime 0.009
Send 0
Receive 0
Exitval 0
Signal 0
Command log /scratch/pvgis/job.4437.3/Process_2022_12_02_23_15_50_10m_u_component_of_wind_2008.log Action=Metadata, Map=era5_and_land_10m_u_component_of_wind_2008_band_39_merged_scaled.nc, Hours since=946705, Longname=10 metre U wind component, Units=m s**-1
Seq 3
Host :
Starttime 1670106266.422
JobRuntime 0.012
Send 0
Receive 0
Exitval 0
Signal 0
Command log /scratch/pvgis/job.4437.3/Process_2022_12_02_23_15_50_10m_u_component_of_wind_2008.log Action=Metadata, Map=era5_and_land_10m_u_component_of_wind_2008_band_28_merged_scaled.nc, Hours since=946706, Longname=10 metre U wind component, Units=m s**-1
以上并不是gdalmerge_and_clean
我感兴趣的命令的运行时间。不过,我认为记录的开始时间应该在每个记录的行之间有所不同,因为所有执行的命令的运行时间(批量? ) 在传递给 GNU Parallel 的 for 循环迭代中。我想情况并非如此,GNU Parallel 在记录的内容上非常精确,这正是它首先读取的命令的运行时间。
连续记录之间的差异Starttime
(下图显示前 10 行)
mlr --itsv --opprint step -a delta -f Starttime then rename Starttime_delta,Delta then cut -f Starttime,JobRuntime,Delta parallel/parallel.job.4437.3.log |head
是
Starttime JobRuntime Delta
1670106266.417 0.000 0
1670106266.419 0.009 0.0019998550415039062
1670106266.422 0.012 0.003000020980834961
1670106266.424 0.014 0.002000093460083008
1670106266.427 0.013 0.003000020980834961
1670106266.434 0.012 0.006999969482421875
1670106266.439 0.021 0.004999876022338867
1670106266.442 0.019 0.003000020980834961
1670106266.446 0.018 0.004000186920166016
..
依此类推。平均三角洲
mlr --itsv --opprint step -a delta -f Starttime then rename Starttime_delta,Delta then cut -f Starttime,JobRuntime,Delta then stats1 -a mean -f Delta parallel/parallel.job.4437.3.log
是
Delta_mean
0.33402504553451784
这显然与log
命令有关。命令不太可能gdalmerge_and_clean
这么快。
尽管如此,从自定义log
命令中,我可以根据总体开始和结束时间戳计算所有运行的作业的总体持续时间
Action=Processing, Start=2022-12-02 23:15:50
Action=Processing, End=2022-12-04 02:16:43
这非常有用。但是,我想更多地了解在此“处理”期间运行的每一个作业。这就是为什么有一个log
命令在执行实际命令之前记录时间戳的原因gdalmerge_and_clean
。
这些日志行看起来像这样:
..
size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:48:15
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_210_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:48:15
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_211_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:48:15
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_212_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:48:15
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_213_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:48:15
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_214_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:48:15
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_215_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:48:15
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_216_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:48:15
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_217_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:48:15
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_218_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:48:15
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_219_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:48:15
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_220_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:48:15
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_221_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:48:15
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_222_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:48:15
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_223_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:48:15
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_224_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:48:15
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_225_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:48:15
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_226_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:48:15
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_227_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:48:15
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_228_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:48:15
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_229_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:48:15
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_230_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:48:15
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_231_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:48:15
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_232_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:55:45
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_233_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:55:45
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_234_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:55:45
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_235_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:55:45
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_236_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:55:45
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_237_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:55:45
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_238_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:55:45
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_239_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:55:45
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_240_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:55:45
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_241_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:55:45
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_242_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:55:45
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_243_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:55:45
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_244_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:55:45
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_245_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:55:45
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_246_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:55:45
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_247_merged.nc, Pixel
size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:55:45
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_248_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:55:45
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_249_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:55:45
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_250_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:55:45
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_251_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:55:45
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_252_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:55:45
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_253_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:55:45
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_254_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:55:45
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_255_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:56:02
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_256_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:56:02
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_257_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:56:02
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_258_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:56:02
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_259_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:56:02
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_260_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:56:02
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_261_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:56:02
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_262_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:56:02
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_263_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:56:02
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_264_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:56:02
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_265_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:56:02
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_266_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:56:02
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_267_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:56:02
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_268_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:56:02
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_269_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:56:02
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_270_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:56:02
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_271_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:56:02
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_272_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:56:02
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_273_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:56:02
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_274_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:56:02
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_275_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:56:02
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_276_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:56:02
Action=Merge, Output=era5_and_land_10m_u_component_of_wind_2008_band_277_merged.nc, Pixel size=0.099999998304108 -0.100000000000000, Timestamp=2022-12-02 23:56:02
..
再次mlr
用于计算记录的时间戳之间的差异,也许有一些有用的东西?非零差异涉及在不同时刻开始的作业批次的时间戳(我认为这很有用,因为有许多作业具有相同的开始时间,因为它们通过 GNU Parallel 并行运行,对吧?):
mlr --ocsv grep 'Action=Merge, Output' then clean-whitespace then put '$Seconds = localtime2sec($Timestamp)' then step -a delta -f Seconds then cut -f Timestamp,Seconds,Seconds_delta then cat -n then rename n,Job,Seconds_delta,Delta then filter '$Delta != 0' jobs/Process_2022_12_02_23_15_50_10m_u_component_of_wind_2008.log
是
Job,Timestamp,Seconds,Delta
209,2022-12-02 23:48:15,1670024895,625
232,2022-12-02 23:55:45,1670025345,450
255,2022-12-02 23:56:02,1670025362,17
278,2022-12-02 23:56:19,1670025379,17
291,2022-12-02 23:56:20,1670025380,1
301,2022-12-02 23:56:36,1670025396,16
324,2022-12-02 23:56:56,1670025416,20
347,2022-12-02 23:57:11,1670025431,15
370,2022-12-02 23:57:25,1670025445,14
393,2022-12-02 23:57:38,1670025458,13
..
8570,2022-12-03 21:18:20,1670102300,94
8593,2022-12-03 21:19:48,1670102388,88
8616,2022-12-03 21:21:56,1670102516,128
8639,2022-12-03 21:23:54,1670102634,118
8662,2022-12-03 21:25:42,1670102742,108
8685,2022-12-03 21:26:00,1670102760,18
8708,2022-12-03 21:27:12,1670102832,72
8731,2022-12-03 21:28:24,1670102904,72
8754,2022-12-03 21:29:19,1670102959,55
8777,2022-12-03 21:29:59,1670102999,40
也许这些差异确实说明了每个单独的作业在 GNU Parallel-ised for 循环中运行所花费的时间,或多或少。?