BIBLIOGRAPHY
Stavrou, Protesilaos. 2020. “Primer on Regular Expressions inside of Emacs.” Protesilaos Stavrou. January 23, 2020. https://protesilaos.com/codelog/2020-01-23-emacs-regexp-primer/.
History
- This material is from 2020, but I am moving it over here.
Primer on regular expressions inside of Emacs
If you are looking into #regular-expressions, see
(Stavrou 2020)
- Stavrou, Protesilaos
- In this video tutorial I show how to use regexp notation to solve practical problems in Emacs. Protesilaos
Emacs has a few ways to operate on regexp matches, such as:
- isearch
- query-replace
- keep-lines
- flush-lines
To make our life easier, we can practice with the built-in regexp-builder or the third-party package visual-regexp. This demo will rely on the latter.
If you have the manual you can run C-h r i regexp to get to the relevant chapter. Do it!
Line boundaries
The caret ^ denotes the beginning of the line.
The dollar sign $ marks the end.
Match all lines that start with a space:
Emacs Emacs Emacs Emacs Emacs
And all that end with a capital S:
emacs emacS emacS emacs emacs emacs emacS emacS
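Narrowing to the headline makes it easier to focus on this part. The point here is to jump between line boundaries with a regexp isearch (isearch-forward-regexp). To go to the lines that end with an S, run isearch-toggle-regexp and type S$ as the pattern. evil probably has something for this too, but Prot is a traditionalist.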
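The two anchors can be tried outside Emacs as well. The following is a minimal sketch using Python's `re` module as a stand-in, since `^` and `$` behave the same way there for single lines. The sample lines are hypothetical, because the original example's leading spaces did not survive in these notes.

```python
import re

# Hypothetical sample lines (assumed for illustration, not from the video).
lines = ["Emacs", " Emacs", "emacs", "emacS", " emacS"]

# ^ anchors the match at the start of the line, $ at the end.
starts_with_space = [line for line in lines if re.search(r"^ ", line)]
ends_with_capital_s = [line for line in lines if re.search(r"S$", line)]

print(starts_with_space)
print(ends_with_capital_s)
```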
Remove or keep lines
Remove the empty lines. Then keep the ones that contain “username”.
<username><![CDATA[name]]></username> emacs emacS emacS emacs emacs emacs emacS emacS
<userName><![CDATA[nom]]></userName> emacs emacS emacS emacs emacs emacs emacS emacS
<username><![CDATA[name]]></username> emacs emacS emacS emacs emacs emacs emacS emacS
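Turning on whitespace-mode makes all the whitespace visible. Wow, I did not know that. It turns out to be bound to SPC t w.
After selecting the region with the mouse, running flush-lines with ^$ removed all the empty lines. Wow! Then running keep-lines with username removes every line that does not contain username.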
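The same two steps can be sketched as list filters. This is only an approximation in Python, not what Emacs runs internally. The blank lines in the sample are assumed, and `re.IGNORECASE` is used on the assumption that Emacs folds case for an all-lowercase pattern under its default settings, so `userName` is kept too.

```python
import re

# Sample text modelled on the exercise; the blank lines are assumed.
text = """<username><![CDATA[name]]></username>

<userName><![CDATA[nom]]></userName>

emacs emacS

<username><![CDATA[name]]></username>
"""

lines = text.splitlines()

# Like flush-lines with ^$: drop every line that is empty.
non_empty = [line for line in lines if not re.search(r"^$", line)]

# Like keep-lines with username: keep only the lines that match.
kept = [line for line in non_empty if re.search(r"username", line, re.IGNORECASE)]

print(kept)
```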
The dot character
The dot or full stop . matches any single character except the newline.
Match these words using their common part ired as a string.
dired fired mired tired wired
> Mark saved where search started [2 times] What is this message? How do I get back to the mark that was set?
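A quick way to check what the dot accepts is to run the pattern over the word list. The sketch below uses Python's `re`, where the dot has the same default meaning. The extra bare word ired at the end is an assumption added to show that a space also counts as a character.

```python
import re

# The word list from the exercise, plus a bare "ired" (assumed, for illustration).
words = "dired fired mired tired wired ired"

# The dot stands for exactly one character, which may be a space.
matches = re.findall(r".ired", words)

# The dot does not match a newline, so nothing is found here.
after_newline = re.findall(r".ired", "\nired")

print(matches)
print(after_newline)
```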
Character sets and ranges
A set of individual characters is marked between brackets [].
Sets can be written as ranges:
Range | Scope |
---|---|
[a-z] | all lowercase alphabetic characters |
[A-Za-z] | all uppercase or lowercase letters |
[a-z0-9] | lowercase letters or digits 0 through 9 |
[abcd1234] | the letters a, b, c, d and the digits 1, 2, 3, 4 |
Match both of those using a character set for the first letter:
emacs Emacs
Match those that end with a number:
Emacs emacs-27 emacs-26 GNU emacs
Let's try it! Here is how it goes... and it works well.
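One possible answer to both exercises, sketched in Python's `re`, where bracket sets and ranges are written the same way. The split of the second example into four candidates is an assumption about how the original lines were laid out.

```python
import re

# A set for the first letter matches both spellings.
first_letter = re.findall(r"[Ee]macs", "emacs Emacs")

# Assumed line layout for the second exercise.
candidates = ["Emacs", "emacs-27", "emacs-26", "GNU emacs"]

# A digit range followed by $ keeps the entries that end with a number.
ends_with_number = [c for c in candidates if re.search(r"[0-9]$", c)]

print(first_letter)
print(ends_with_number)
```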
Difference between postfix operators ?, +, *
“Postfix” means that it comes after a given set and alters its scope.
- ? matches the previous term zero or one time.
- + matches the previous term one or more times.
- * matches the previous term zero or more times, as many as possible.
Match the s optionally:
day days
Use prote followed by a postfix:
prot prote proteeee
How is this done? Prot uses isearch. Any tool is fine. The basics should be done the basic way. One moment.
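To see how the three operators differ, each pattern can be tested against whole words. This is a small sketch in Python's `re`, which uses the same three postfix characters. The helper `full_matches` is a hypothetical name introduced here.

```python
import re

def full_matches(pattern, words):
    """Return the words that the pattern matches from start to end."""
    return [word for word in words if re.fullmatch(pattern, word)]

words = ["prot", "prote", "proteeee"]

optional_s = full_matches(r"days?", ["day", "days"])  # s is optional
zero_or_one = full_matches(r"prote?", words)          # at most one e
one_or_more = full_matches(r"prote+", words)          # at least one e
zero_or_more = full_matches(r"prote*", words)         # any number of e

print(optional_s, zero_or_one, one_or_more, zero_or_more)
```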
Grouped matches
A group is enclosed inside escaped parentheses: \(GROUP\)
Match both of these, including the optional suffix ig:
conf config
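> Let's see!
Try it with the pattern conf\(ig\)? to match both. In other words, you create a group to bundle that part together, and the postfix then applies to the whole group. Useful.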
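The grouping idea carries over to other regexp flavours, with one syntax difference: Python's `re` writes groups with plain parentheses, so the Emacs pattern conf\(ig\)? becomes conf(ig)? in the sketch below.

```python
import re

# Python spelling of the Emacs pattern conf\(ig\)?
pattern = re.compile(r"conf(ig)?")

# For each word, record what the optional group captured (None if it was skipped).
results = [(word, pattern.fullmatch(word).group(1)) for word in ["conf", "config"]]

print(results)
```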
Greedy versus non-greedy
Postfix characters are greedy by default. A “greedy” postfix matches the longest possible part, whereas a “non-greedy” one matches the shortest.
Appending ? to a postfix operator makes it non-greedy (for example, .*? instead of .*).
Using the .* construct, match items both greedily and not:
Hello world Hello world world world world
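The difference shows up clearly on the longer sample line. A minimal sketch in Python's `re`, where .* and .*? have the same greedy and non-greedy meaning:

```python
import re

text = "Hello world world world world"

# Greedy: .* runs as far as it can, so the match ends at the last "world".
greedy = re.search(r"Hello.*world", text).group()

# Non-greedy: .*? stops as early as it can, at the first "world".
non_greedy = re.search(r"Hello.*?world", text).group()

print(greedy)
print(non_greedy)
```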
Multiple groups
Match the alphabetic and numeric parts in two separate groups.
emacs27 emacs26 emacs25 emacs24
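One way to split the two parts, sketched in Python's `re`. In Emacs the equivalent pattern would use escaped parentheses, \([a-z]+\)\([0-9]+\), while Python writes the groups unescaped.

```python
import re

text = "emacs27 emacs26 emacs25 emacs24"

# Group 1 captures the letters, group 2 the digits.
pairs = re.findall(r"([a-z]+)([0-9]+)", text)

print(pairs)
```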
Literal hyphen and dot
Match the hyphen as part of the alphabetic group and the dot as part of the numeric one.
emacs-27.1 emacs-26.3 emacs-25.2
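Inside a bracket set the dot is literal, and a hyphen is literal when it is placed last. A possible solution, again sketched in Python's `re` with unescaped group parentheses (Emacs would write \( and \)):

```python
import re

text = "emacs-27.1 emacs-26.3 emacs-25.2"

# [a-z-] adds a literal hyphen to the letters; [0-9.] adds a literal dot to the digits.
pairs = re.findall(r"([a-z-]+)([0-9.]+)", text)

print(pairs)
```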
Exclude sets
To exclude a set you prepend a caret sign: [^SET]
Match every line except those that start with a capital letter.
GNU Emacs org-mode regexp emacs_lisp Linux guix
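A negated set combined with the line-start anchor handles this exercise. The sketch below uses Python's `re`, which writes negated sets the same way. The division of the sample into six lines is an assumption, since the line breaks were lost in these notes.

```python
import re

# Assumed line layout of the exercise text.
lines = ["GNU Emacs", "org-mode", "regexp", "emacs_lisp", "Linux", "guix"]

# ^[^A-Z] means: the first character of the line is anything but a capital letter.
not_capitalised = [line for line in lines if re.search(r"^[^A-Z]", line)]

print(not_capitalised)
```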
Alternative groups with literal brackets
Use a character set that matches name and nom.
name nom
Then:
- Match the username variants’ [name] or [nom].
- Replace the match with [PROT].
<username><![CDATA[name]]></username> <nameuser><![CDATA[nam]]></nameuser> <userName><![CDATA[nom]]></userName> <nameuser><![CDATA[nome]]></nameuser>
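One way to approach both steps, sketched in Python's `re` rather than with the exact patterns from the video. Literal brackets are escaped as \[ and \] in both flavours, but Python writes the alternation as (?:name|nom) where Emacs would use \(name\|nom\). The tag prefix is captured so that only the username variants are touched.

```python
import re

# Step 1: a character set plus an optional e covers both words.
name_or_nom = [w for w in ["name", "nom"] if re.fullmatch(r"n[ao]me?", w)]

xml_lines = [
    "<username><![CDATA[name]]></username>",
    "<nameuser><![CDATA[nam]]></nameuser>",
    "<userName><![CDATA[nom]]></userName>",
    "<nameuser><![CDATA[nome]]></nameuser>",
]

# Step 2: group 1 keeps the tag and CDATA prefix; \[ and \] are literal brackets.
pattern = re.compile(r"(<user[nN]ame><!\[CDATA)\[(?:name|nom)\]")
replaced = [pattern.sub(r"\1[PROT]", line) for line in xml_lines]

print(name_or_nom)
print("\n".join(replaced))
```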
Either match
To target either set, separate the alternatives with \| (an escaped pipe).
Prepend vr/ to the first group and match on each line.
`(group-0 ((group (:inherit modus-theme-intense-blue))))
`(group-1 ((group (:inherit modus-theme-intense-magenta))))
`(group-2 ((group (:inherit modus-theme-intense-green))))
`(match-0 ((match (:inherit modus-theme-refine-yellow))))
`(match-1 ((match (:inherit modus-theme-refine-yellow))))
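The exercise can be sketched as a single replacement that captures either word and puts vr/ in front of it. This uses Python's `re`, where the alternation is written (group|match) instead of the Emacs form \(group\|match\). Only two of the face specs are included to keep the sample short.

```python
import re

specs = "\n".join([
    "`(group-0 ((group (:inherit modus-theme-intense-blue))))",
    "`(match-0 ((match (:inherit modus-theme-refine-yellow))))",
])

# Anchoring at the line start means only the first group/match on each line is changed.
prefixed = re.sub(r"^`\((group|match)", r"`(vr/\1", specs, flags=re.MULTILINE)

print(prefixed)
```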
Running elisp functions on groups
Run elisp in the replacement text by writing an escaped comma \, followed by an expression inside parentheses: \,(FUNCTION)
Using the .ired pattern from earlier, run a replace command where you must execute the upcase function on the second/middle match. Keep the rest intact.
direddireddired firedfiredfired miredmiredmired tiredtiredtired wiredwiredwired
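In Emacs the replacement text for this exercise could be written as \1\,(upcase \2)\3 with three groups in the search pattern. The closest Python analogue is passing a function to `re.sub`, as in the sketch below. The helper name `upcase_middle` is hypothetical.

```python
import re

text = "direddireddired firedfiredfired miredmiredmired"

def upcase_middle(match):
    """Rebuild the match with only the second group upper-cased."""
    return match.group(1) + match.group(2).upper() + match.group(3)

# Three groups, one per repetition of the .ired pattern.
result = re.sub(r"(.ired)(.ired)(.ired)", upcase_middle, text)

print(result)
```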