I want to create a sed command that removes all of these strange codes from a given document:
sed -n 's/\|®MD-IT¯\|®MD\+BO¯\|®MDNM¯®LL\.8LI,0LI¯\|®LL0LI,0LI¯\|®MD\+IT¯\|®LL.8LI,0LI¯®MDIT¯\|®MDNM¯®FL¯®LL.8LI,0LI¯\|®FL¯®MD-BO¯\|®FL¯®MD-BO¯\|®MD-BO¯\|¯®OF1IN,1IN¯®FC¯®LL1LI,0LI¯\|\|®SF1,1¯\|®FM1FT=0LI,LR=1;\|®MDSU¯®FN1¯\|®MDNM¯¯\|®IV-RTF\|\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\|¯®BF0¯\|®FS1\|-------------------------------------\|¯®FW1\|\|//gp'
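These codes were all created in another application, Nota Bene. I have many files containing them, and I would like to convert the files to plain text, possibly even Markdown.
The problem is that the characters are not being replaced. I tried the same thing in Sublime Text and successfully stripped the document using find-replace (regex). I would still prefer a sed script to Sublime for this task.
I also tried Ed, but it did not find anything to replace either.
Here is a sample nb file as it appears when opened in Sublime Text: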
®SSDEFAULTS¯®LR1¯®JU¯®MD+BO¯®UFTimes New Roman¯®SZ12Pt¯Glossary®MD+BO¯®TS.5IN,1IN,1.5IN,2IN,2.5IN,3IN,3.5IN,4IN,4.5IN,5IN,5.5IN,6IN¯ ®MD-BO¯
®NJ¯®LR1¯®LL.5LI,0LI¯®MD+BO¯®LL0LI,0LI¯®MDNM¯®LR1¯®LL.5LI,0LI¯A fortiori proposition: If X is true, then how much greater is Y true? To move logically from a stronger argument to establish a weaker argument. The weaker argument is sometimes presented by the speaker as the stronger argument.
®LL0LI,0LI¯®LR1¯®LL.5LI,0LI¯®LL0LI,0LI¯®LR1¯®LL.5LI,0LI¯Accusative of motion/direction - Indicates movement to the noun marked by the accusative and is to be distinguished from the accusative of local determination which indicates location without motion (Joüon and Muraoka 2006, 428).
Anadiplosis - A figure of speech in which the word that a colon ends with, or a like sounding word, is the word that begins the next colon ®GC|CI:R#=47;AU=Brown, Raymond E.;YR=1990;TI=New Jerome biblical commentary;PG=245;XT=;F[=;F]=;F#=;ID=;XX=Print;CT=;FL=¯(Brown, Fitzmyer, Murphy, et al. 1990, 245)®GC¯.
®LL0LI,0LI¯®LR1¯®LL.5LI,0LI¯®LL0LI,0LI¯®LR1¯®LL.5LI,0LI¯Anaphoric use of the article - When the article is used to indicate that the word to which it is attached is the one previously mentioned (Williams and Beckman 2007, 36).
®LL0LI,0LI¯®LR1¯®LL.5LI,0LI¯®LL0LI,0LI¯®LR1¯®LL.5LI,0LI¯Anaptyxis - The insertion of a vowel into a word to avoid a consonant cluster.
®LL0LI,0LI¯®LR1¯®LL.5LI,0LI¯®LL0LI,0LI¯®LR1¯®LL.5LI,0LI¯Aoristic perfect - I use the phrase 'aoristic perfect' to refer to one of the ways the qatal form can be rendered into English. Aoristic perfect denotes a past situation the implications of which are no longer felt in the present. The situation may have extended over a period of time and it may have occurred more than once. It may have occurred in the recent or distant past but from the standpoint of the speaker it is to be regarded as a fact having occurred and hence as a fact belonging to the past (Joüon and Muraoka 2006, 337; Driver 1998, 12). The term 'aoristic perfect' and indeed the other categorizations of perfect in this grammar, all relate to the interpretation of qatal verbs in their given contexts. The qatal form in and of itself does not convey these meanings.
®LL0LI,0LI¯®LR1¯®LL.5LI,0LI¯®LL0LI,0LI¯®LR1¯®LL.5LI,0LI¯Beth essentiae - ®LAHebrew¯ÿHá®LAEnglish¯ that is used to indicate the predicate of a clause or a word used predicatively (Joüon and Muraoka 2006, 458).
This is how I would like the text to read:
Glossary
A fortiori proposition: If X is true, then how much greater is Y true? To move logically from a stronger argument to establish a weaker argument. The weaker argument is sometimes presented by the speaker as the stronger argument.
Accusative of motion/direction - Indicates movement to the noun marked by the accusative and is to be distinguished from the accusative of local determination which indicates location without motion (Joüon and Muraoka 2006, 428).
Anadiplosis - A figure of speech in which the word that a colon ends with, or a like sounding word, is the word that begins the next colon (Brown, Fitzmyer, Murphy, et al. 1990, 245).
Anaphoric use of the article - When the article is used to indicate that the word to which it is attached is the one previously mentioned (Williams and Beckman 2007, 36).
Anaptyxis - The insertion of a vowel into a word to avoid a consonant cluster.
Aoristic perfect - I use the phrase 'aoristic perfect' to refer to one of the ways the qatal form can be rendered into English. Aoristic perfect denotes a past situation the implications of which are no longer felt in the present. The situation may have extended over a period of time and it may have occurred more than once. It may have occurred in the recent or distant past but from the standpoint of the speaker it is to be regarded as a fact having occurred and hence as a fact belonging to the past (Joüon and Muraoka 2006, 337; Driver 1998, 12). The term 'aoristic perfect' and indeed the other categorizations of perfect in this grammar, all relate to the interpretation of qatal verbs in their given contexts. The qatal form in and of itself does not convey these meanings.
|> sed -n l Glossary.NB
\256SSDEFAULTS\257\256LR1\257\256JU\257\256MD+BO\257\256UFTimes New R\
oman\257\256SZ12Pt\257Glossary\256MD+BO\257\256TS.5IN,1IN,1.5IN,2IN,2\
.5IN,3IN,3.5IN,4IN,4.5IN,5IN,5.5IN,6IN\257\t\256MD-BO\257\r$
\256NJ\257\256LR1\257\256LL.5LI,0LI\257\256MD+BO\257\256LL0LI,0LI\257\
\256MDNM\257\256LR1\257\256LL.5LI,0LI\257A fortiori proposition: If X\
is true, then how much greater is Y true? To move logically from a s\
tronger argument to establish a weaker argument. The weaker argument \
is sometimes presented by the speaker as the stronger argument.\r$
\256LL0LI,0LI\257\256LR1\257\256LL.5LI,0LI\257\256LL0LI,0LI\257\256LR\
1\257\256LL.5LI,0LI\257Accusative of motion/direction - Indicates mov\
ement to the noun marked by the accusative and is to be distinguished\
from the accusative of local determination which indicates location \
without motion (Jo\374on and Muraoka 2006, 428).\r$
Anadiplosis - A figure of speech in which the word that a colon ends \
with, or a like sounding word, is the word that begins the next colon\
\256GC|CI:R#=47;AU=Brown, Raymond E.;YR=1990;TI=New Jerome biblical \
commentary;PG=245;XT=;F[=;F]=;F#=;ID=;XX=Print;CT=;FL=\257(Brown, Fit\
zmyer, Murphy, et al. 1990,\240245)\256GC\257.\r$
\256LL0LI,0LI\257\256LR1\257\256LL.5LI,0LI\257\256LL0LI,0LI\257\256LR\
1\257\256LL.5LI,0LI\257Anaphoric use of the article - When the articl\
e is used to indicate that the word to which it is attached is the on\
e previously mentioned (Williams and Beckman 2007, 36). \r$
\256LL0LI,0LI\257\256LR1\257\256LL.5LI,0LI\257\256LL0LI,0LI\257\256LR\
1\257\256LL.5LI,0LI\257Anaptyxis - The insertion of a vowel into a wo\
rd to avoid a consonant cluster.\r$
\256LL0LI,0LI\257\256LR1\257\256LL.5LI,0LI\257\256LL0LI,0LI\257\256LR\
1\257\256LL.5LI,0LI\257Aoristic perfect - I use the phrase 'aoristic \
perfect' to refer to one of the ways the qatal form can be rendered i\
nto English. Aoristic perfect denotes a past situation the implicatio\
ns of which are no longer felt in the present. The situation may have\
extended over a period of time and it may have occurred more than on\
ce. It may have occurred in the recent or distant past but from the s\
tandpoint of the speaker it is to be regarded as a fact having occurr\
ed and hence as a fact belonging to the past (Jo\374on and Muraoka 20\
06, 337; Driver 1998, 12). The term 'aoristic perfect' and indeed the\
other categorizations of perfect in this grammar, all relate to the \
interpretation of qatal verbs in their given contexts. The qatal form\
in and of itself does not convey these meanings. \r$
\256LL0LI,0LI\257\256LR1\257\256LL.5LI,0LI\257\256LL0LI,0LI\257\256LR\
1\257\256LL.5LI,0LI\257Beth essentiae - \256LAHebrew\257\377H\341\256\
LAEnglish\257 that is used to indicate the predicate of a clause or a\
word used predicatively (Jo\374on and Muraoka 2006, 458).\r$
\256LL0LI,0LI\257\256LR1\257\256LL.5LI,0LI\257\256LL0LI,0LI\257\256LR\
1\257\256LL.5LI,0LI\257Classic perfect - I use the phrase 'classic pe\
rfect' to refer to one of the ways the qatal form can be rendered int\
o English. Classic perfect refers to the continuing present relevance\
of a past situation from the perspective of the speaker (Comrie 1976\
, 52). By perfect I do not necessarily imply that a previous situatio\
n has resulted in a state but that the situation has implications rel\
evant to the present. The situation is not merely past and over but s\
omehow persists and continues to intrude into the present. Such verbs\
are usually translated into English using the perfect or present ten\
se. I have included under this definition quasi-stative verbs which r\
efer to attributes which were acquired before, but which are assumed \
to continue in some way up to the present moment (Driver 1998, 11; Jo\
\374on and Muraoka 2006, 333; Waltke and O'Connor 1990, 487). In some\
grammars these are treated separately. However, that creates too man\
y functions for the one perfect form. The term 'classic perfect' and \
indeed the other categorizations of perfect in this grammar all relat\
e to the \256MD+IT\257interpretation \256MD-IT\257of qatal verbs in t\
heir given contexts. The qatal form by itself does not convey these m\
eanings.\r$
\256LL0LI,0LI\257\256LR1\257\256LL.5LI,0LI\257\256LL0LI,0LI\257\256LR\
1\257\256LL.5LI,0LI\257Cohortative of praise. The cohortative is ofte\
n used in Psalms to indicate that praise, freely undertaken, has begu\
n. This usage is close to the cohortative of resolve but not identica\
l with it. The emphasis falls not on what the writer is intending to \
do, but what he has already undertaken. \r$
Cohortative of resolve - The cohortative mood normally expresses the \
will of the speaker, but when the speaker has the ability to carry ou\
t what he wants it takes on the coloring of resolve (Van der Merwe et\
al. 1997, 152; Waltke and O'Connor 1990, 573).\r$
\256LL0LI,0LI\257\256LR1\257\256LL.5LI,0LI\257\256LL0LI,0LI\257\256LR\
1\257\256LL.5LI,0LI\257Concluding \256LAHebrew\257\377h\353\377H\351\
\256LAEnglish\257 - A special use of the word \256LAHebrew\257\377h\
\353\377H\351\256LAEnglish\257 found towards the end of several Psalm\
s and approximating in meaning to: the conclusion of the matter is th\
at\205\r$
\256LL0LI,0LI\257\256LR1\257\256LL.5LI,0LI\257\256LL0LI,0LI\257\256LR\
1\257\256LL.5LI,0LI\257Conjunctive waw - Waw used to connect clauses \
sed can also be used as a script (easier to develop): create a file "nb2txt" with:
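As a minimal sketch of what such a file could contain, assuming the codes are delimited by the single bytes octal 256 and 257 (as the `sed -n l` listing in the question suggests): the three rules and the `make_nb2txt` helper are illustrative assumptions, not a complete list of Nota Bene codes.

```shell
#!/bin/sh
# Hypothetical sketch: write a sed script file with one rule per line.
# printf expands the octal escapes \256 and \257 into the raw delimiter
# bytes, so the resulting file does not rely on GNU-only escapes.
make_nb2txt() {
    # $1 = path of the sed script file to create
    {
        printf 's/\256LL[^\257]*\257//g\n'   # LL codes such as LL.5LI,0LI
        printf 's/\256MD[^\257]*\257//g\n'   # MD codes such as MD+BO, MD-IT
        printf 's/\256LR[^\257]*\257//g\n'   # LR codes such as LR1
    } > "$1"
}

# Demo in a temporary directory; LC_ALL=C makes sed work byte by byte.
tmp=$(mktemp -d)
make_nb2txt "$tmp/nb2txt"
printf '\256LR1\257\256LL.5LI,0LI\257Anaptyxis\n' | LC_ALL=C sed -f "$tmp/nb2txt"
rm -r "$tmp"
```

Keeping one rule per line makes it easy to add a rule, rerun the script on a sample file, and see which codes are still left.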
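Your regex uses `\|` (alternation in GNU sed, a literal bar in most other sed implementations) and `\+` (one or more occurrences in GNU sed, a literal plus in most other sed implementations). If you use GNU sed, this pattern will strip anything like ®MD-IT¯ or ®MDDDDDBO¯. If you use a different sed implementation, it probably won't find any matches.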
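To illustrate the point about `\+`: in a basic regular expression an unescaped `+` is an ordinary character, so a code such as MD+BO can be matched literally without any GNU extension, and separate `-e` expressions can stand in for `\|`. The sketch below uses ASCII `<` and `>` as stand-ins for the real delimiters, and the function name is made up for the example.

```shell
#!/bin/sh
# Sketch: portable basic regular expressions only.
# An unescaped + matches a literal plus sign, and two -e expressions
# replace the GNU-specific \| alternation.
strip_bold_codes() {
    sed -e 's/<MD+BO>//g' -e 's/<MD-BO>//g'
}

# ASCII < and > stand in for the real delimiter bytes in this demo.
printf '<MD+BO>Glossary<MD-BO>\n' | strip_bold_codes
```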
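It is better to use extended regular expressions (`sed -E`), which most sed versions have supported for years. I would also suggest removing the empty alternatives (`\|` at the start and end of the pattern), although they do no harm in this case. And the endless `\.\.\.\.\.\.\.\.\.\.\.\.` and `----` runs should be replaced by `\.{42}` and `-{23}`, using the actual number of dots or dashes. Or perhaps use `\.{10,}` to get rid of any run of 10 or more dots.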
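A minimal sketch of the interval form, assuming a sed that accepts `-E` (GNU and BSD sed both do). The threshold of 10 and the function name are only examples:

```shell
#!/bin/sh
# Sketch: with -E, | is alternation and {10,} means "ten or more".
# \. is a literal dot; a dash outside brackets needs no escaping.
strip_leaders() {
    sed -E 's/\.{10,}|-{10,}//g'
}

# A dot leader and a dashed rule are removed; short runs are kept.
printf 'Introduction ............ 3\n' | strip_leaders
printf 'above\n--------------------\nbelow\n' | strip_leaders
printf 'wait... what\n' | strip_leaders
```

Using a lower bound instead of an exact count means the rule still works when the runs in different files have different lengths.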
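From the `sed -n l` listing it is clear that the file contains many occurrences of character 174 (decimal, or 256 octal) and character 175 (decimal, or 257 octal). They are listed as `\256` and `\257`. Read as single-byte characters, they are `\xae` (hex code ae, or octal 256), which is ®, and `\xaf` (hex code af, or octal 257), which is ¯. If you use UTF-8 as the default encoding (common on Linux), ® and ¯ are instead encoded as two bytes each, so a pattern typed with those characters is not the same byte sequence as the one in the file.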
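A small sketch of that mismatch, under the assumption that the pattern was typed in a UTF-8 environment: there ® is the two bytes octal 302 256, while the file holds the lone byte 256. The helper name `strip_with` is hypothetical.

```shell
#!/bin/sh
# Sketch: the same substitution with two different "opening" delimiters.
utf8_open=$(printf '\302\256')   # the registered sign as encoded in UTF-8
byte_open=$(printf '\256')       # the single byte shown in the listing
byte_close=$(printf '\257')

strip_with() {
    # $1 = opening delimiter to look for; the input line comes on stdin.
    # LC_ALL=C makes sed compare raw bytes.
    LC_ALL=C sed "s/${1}LR1${byte_close}//"
}

# The sample line is 10 bytes long including the newline.
printf '\256LR1\257text\n' | strip_with "$utf8_open" | wc -c   # 10 bytes: nothing removed
printf '\256LR1\257text\n' | strip_with "$byte_open" | wc -c   # 5 bytes: code removed
```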
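These appear to be the start and end markers of some internal encoding of the .nb file. Removing every string that starts with `\xae` and ends with `\xaf` seems to bring us closer to what you asked for: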
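One way to sketch this, assuming every code is a run from byte 256 (octal) up to the next byte 257: the bytes are built with `printf` so the command does not depend on the GNU-only `\xae` escape. Stripping the trailing carriage return is an extra step that the `\r$` line endings in the listing suggest. The function name and file names are placeholders.

```shell
#!/bin/sh
# Sketch: delete every run that starts with byte 256 (octal) and ends
# at the next byte 257, then drop a carriage return at the end of line.
strip_nb_codes() {
    open=$(printf '\256')
    close=$(printf '\257')
    cr=$(printf '\r')
    # LC_ALL=C makes sed treat each byte as one character.
    LC_ALL=C sed -e "s/${open}[^${close}]*${close}//g" -e "s/${cr}\$//"
}

# Usage on a real file would look like:
#   strip_nb_codes < Glossary.NB > Glossary.txt
printf '\256LL0LI,0LI\257\256LR1\257Anaptyxis - The insertion of a vowel.\r\n' | strip_nb_codes
```

Text between an opening and a closing GC citation code is kept, because each of the two codes is matched and removed on its own. Other single bytes such as `\374` (ü in the listing) are left as they are, so the output stays in the file's original encoding.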