BIBLIOGRAPHY

Stavrou, Protesilaos. 2020. “Primer on Regular Expressions inside of Emacs.” Protesilaos Stavrou. January 23, 2020. https://protesilaos.com/codelog/2020-01-23-emacs-regexp-primer/.

History

  • [2025-04-22 Tue 17:45] This is material from 2020, but I am moving it over here.

Primer on regular expressions inside of Emacs

If you are looking into regular expressions (#정규표현식)

(Stavrou 2020)

  • Stavrou, Protesilaos
  • In this video tutorial I show how to use regexp notation to solve practical problems in Emacs.

Emacs has a few ways to operate on regexp matches, such as:

  • isearch
  • query-replace
  • keep-lines
  • flush-lines

To make our life easier, we can practice with the built-in regexp-builder or the third-party package visual-regexp. This demo will rely on the latter.

If you have the manual you can run C-h r i regexp to get to the relevant chapter. Do it!

Line boundaries

The caret ^ denotes the beginning of the line.

The dollar sign $ marks the end.

Match all lines that start with a space:

Emacs Emacs Emacs Emacs Emacs

And all that end with a capital S:

emacs emacS emacS emacs emacs emacs emacS emacS

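Narrowing to the headline makes it easier to focus on this part. The point here is to move between line boundaries with a regexp isearch. To go to the lines that end in a capital S, enter S$ after isearch-toggle-regexp. evil probably has something for this too, but Prot is a traditionalist.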
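The same anchors can be tried from Lisp. This is only a minimal sketch: the helper name my/matching-lines and the sample lines are my own, not from the tutorial. case-fold-search is bound to nil so that S$ only matches a capital S, which is how isearch behaves once the search string contains an uppercase letter.

```elisp
(require 'seq)

(defun my/matching-lines (regexp lines)
  "Return the elements of LINES that match REGEXP, case-sensitively.
Hypothetical helper, written for these notes."
  (let ((case-fold-search nil))         ; otherwise S would also match s
    (seq-filter (lambda (line) (string-match-p regexp line)) lines)))

;; Lines that start with a space.
(my/matching-lines "^ " '("Emacs" " Emacs" "  Emacs"))

;; Lines that end with a capital S.
(my/matching-lines "S$" '("emacs" "emacS" "emacs S"))
```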

Remove or keep lines

Remove the empty lines. Then keep the ones that contain “username”.

<username><![CDATA[name]]></username> emacs emacS emacS emacs emacs emacs emacS emacS

<userName><![CDATA[nom]]></userName> emacs emacS emacS emacs emacs emacs emacS emacS

<username><![CDATA[name]]></username> emacs emacS emacS emacs emacs emacs emacS emacS

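Turning on whitespace-mode makes all the whitespace visible. Wow, I did not know that. It is bound to SPC t w.

After selecting the region with the mouse, flush-lines ^$ removed every empty line. Wow! After that, keep-lines username removes every line that does not contain username.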
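Both commands can also be called from Lisp on a temporary buffer. A rough sketch under my own assumptions: my/flush-then-keep is a hypothetical helper and the sample text is shortened. case-fold-search is bound to t to mirror the interactive default, where an all-lowercase regexp such as username ignores case, so the userName line is kept as well.

```elisp
(defun my/flush-then-keep (text)
  "Return TEXT without empty lines, keeping only lines with \"username\".
Hypothetical helper, written for these notes."
  (with-temp-buffer
    (insert text)
    ;; Mirror the interactive default: a lowercase regexp ignores case.
    (let ((case-fold-search t))
      ;; Delete every empty line in the whole buffer.
      (flush-lines "^$" (point-min) (point-max))
      ;; Delete every line that does not contain the regexp.
      (keep-lines "username" (point-min) (point-max)))
    (buffer-string)))

(my/flush-then-keep
 "<username>name</username>\n\n<userName>nom</userName>\n\nemacs emacS\n")
```

Binding case-fold-search to nil instead should keep only the all-lowercase username line.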

The dot character

The dot or full stop . matches any single character except the newline.

Match these words using their common part ired as a string.

dired fired mired tired wired
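To check the dot outside of isearch, here is a small sketch that collects every match of .ired from a string. The helper my/all-matches is my own and hypothetical; it just loops over string-match.

```elisp
(defun my/all-matches (regexp string)
  "Return a list of every non-overlapping match of REGEXP in STRING.
Hypothetical helper, written for these notes."
  (let ((case-fold-search nil)
        (start 0)
        (matches '()))
    (while (and (< start (length string))
                (string-match regexp string start))
      (push (match-string 0 string) matches)
      ;; Continue after this match; step one character on an empty match.
      (setq start (max (match-end 0) (1+ (match-beginning 0)))))
    (nreverse matches)))

;; The dot stands for the differing first letter of each word.
(my/all-matches ".ired" "dired fired mired tired wired")
```

Because the dot needs one character to consume, a bare "ired" with nothing in front of it does not match.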

> Mark saved where search started [2 times] What is this? How do I retrieve and use the mark that set-mark saved?

Character sets and ranges

A set of individual characters is marked between brackets [].

Sets can be written as ranges:

Range        Scope
[a-z]        all lower case alphabetic characters
[A-Za-z]     all upper or lower case letters
[a-z0-9]     lower case letters or digits 0 through 9
[abcd1234]   letters a, b, c, d and digits 1, 2, 3, 4

Match both of those using a character set for the first letter:

emacs Emacs

Match those that end with a number:

Emacs emacs-27 emacs-26 GNU emacs

Let's try it! Here is how it goes... and it works well.
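For the record, the two exercises written as Lisp predicates. This is a sketch with hypothetical function names of my own; case-fold-search is bound to nil because otherwise a set such as [a-z] would also match capital letters.

```elisp
(defun my/emacs-word-p (string)
  "Return non-nil if STRING contains emacs or Emacs.
Hypothetical helper, written for these notes."
  (let ((case-fold-search nil))
    ;; The set [Ee] covers both spellings of the first letter.
    (string-match-p "[Ee]macs" string)))

(defun my/ends-with-digit-p (string)
  "Return non-nil if STRING ends with a digit.
Hypothetical helper, written for these notes."
  (let ((case-fold-search nil))
    ;; A digit range anchored to the end of the line.
    (string-match-p "[0-9]$" string)))

(mapcar #'my/emacs-word-p '("emacs" "Emacs" "EMACS"))
(mapcar #'my/ends-with-digit-p '("Emacs" "emacs-27" "emacs-26" "GNU emacs"))
```

string-match-p returns the position of the match or nil, so any non-nil result means the string matched.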

Difference between postfix operators ?, +, *

“Postfix” means that it comes after a given set and alters its scope.

  • ? matches the previous term zero or one time.
  • + matches the previous term one or more times.
  • * matches the previous term zero or more times, as many as possible.

Match the s optionally:

day days

Use prote followed by a postfix:

prot prote proteeee

How is this done? Prot uses isearch. Any tool is fine, but the basics should be done the basic way. One moment.
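A sketch that compares the three postfix operators on the sample words. The helper my/whole-match-p is hypothetical and my own; it wraps the regexp in string anchors so that a match only counts when it covers the whole word.

```elisp
(defun my/whole-match-p (regexp string)
  "Return t if REGEXP matches all of STRING, case-sensitively.
Hypothetical helper, written for these notes."
  (let ((case-fold-search nil))
    ;; The anchors at both ends pin the match to the start and end of STRING.
    (and (string-match-p (concat "\\`\\(?:" regexp "\\)\\'") string) t)))

;; The s is optional, so both forms match.
(list (my/whole-match-p "days?" "day")
      (my/whole-match-p "days?" "days"))

;; Which of prot, prote, proteeee does each postfix accept?
(mapcar (lambda (re)
          (list re
                (my/whole-match-p re "prot")
                (my/whole-match-p re "prote")
                (my/whole-match-p re "proteeee")))
        '("prote?" "prote+" "prote*"))
```

With ? only prot and prote should pass, with + only prote and proteeee, and with * all three.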

Grouped matches

A group is enclosed inside escaped parentheses \(GROUP\).

Match both of these, including the optional suffix ig:

conf config

> Let's see, let's see!

Let's try it with conf\(ig\)?. In other words, a group is created to wrap the relevant part. Useful.
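The group can also be read back from Lisp. A minimal sketch, assuming a hypothetical helper name of my own; note that the backslashes are doubled inside a Lisp string.

```elisp
(defun my/conf-suffix (string)
  "Match conf or config in STRING.
Return a cons (WHOLE . SUFFIX); SUFFIX is nil when ig is absent.
Hypothetical helper, written for these notes."
  (let ((case-fold-search nil))
    (when (string-match "conf\\(ig\\)?" string)
      ;; Group 0 is the whole match, group 1 the optional suffix.
      (cons (match-string 0 string) (match-string 1 string)))))

(my/conf-suffix "conf")
(my/conf-suffix "config")
```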

Greedy versus non-greedy

Postfix characters are greedy by default. “Greedy” matches the longest possible part, whereas “non-greedy” matches the shortest.

A non-greedy variant is used when the postfix is followed by ?.

Using the .* construct, match items both greedily and not:

Hello world Hello world world world world
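The difference is easiest to see by comparing what each variant returns for the same string. A small sketch with a hypothetical helper of my own:

```elisp
(defun my/first-match (regexp string)
  "Return the text of the first match of REGEXP in STRING, or nil.
Hypothetical helper, written for these notes."
  (let ((case-fold-search nil))
    (when (string-match regexp string)
      (match-string 0 string))))

;; Greedy: .* runs as far as it can, up to the last world.
(my/first-match "Hello.*world" "Hello world world world world")

;; Non-greedy: .*? stops at the first world.
(my/first-match "Hello.*?world" "Hello world world world world")
```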

Multiple groups

Match the alphabetic and numeric parts in two separate groups.

emacs27 emacs26 emacs25 emacs24
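One way to write the two groups, as a sketch. The regexp and the helper name are my own guess at a solution, not necessarily the one from the video.

```elisp
(defun my/split-name-version (string)
  "Return (LETTERS DIGITS) for a string such as emacs27, or nil.
Hypothetical helper, written for these notes."
  (let ((case-fold-search nil))
    ;; Group 1 takes the letters, group 2 the digits that follow.
    (when (string-match "\\([a-z]+\\)\\([0-9]+\\)" string)
      (list (match-string 1 string) (match-string 2 string)))))

(mapcar #'my/split-name-version '("emacs27" "emacs26" "emacs25" "emacs24"))
```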

Literal hyphen and dot

Match the hyphen as part of the alphabetic group and the dot as part of the numeric one.

emacs-27.1 emacs-26.3 emacs-25.2
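Inside brackets the dot is literal, and a hyphen placed last in the set is literal too. A sketch under that reading of the exercise; the regexp and helper name are my own.

```elisp
(defun my/split-dashed-version (string)
  "Return (NAME-WITH-HYPHEN VERSION) for a string such as emacs-27.1.
Hypothetical helper, written for these notes."
  (let ((case-fold-search nil))
    ;; [a-z-] is letters plus a literal hyphen (last in the set);
    ;; [0-9.] is digits plus a literal dot.
    (when (string-match "\\([a-z-]+\\)\\([0-9.]+\\)" string)
      (list (match-string 1 string) (match-string 2 string)))))

(mapcar #'my/split-dashed-version '("emacs-27.1" "emacs-26.3" "emacs-25.2"))
```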

Exclude sets

To exclude a set you prepend a caret sign: [^SET]

Match every line except those that start with a capital letter.

GNU Emacs org-mode regexp emacs_lisp Linux guix
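A sketch of the exclusion as a filter over the sample words, treating each word as its own line. The helper name is hypothetical; case-fold-search must be nil here, otherwise [^A-Z] would reject lowercase letters as well.

```elisp
(require 'seq)

(defun my/not-capitalized (lines)
  "Return the elements of LINES that do not start with a capital letter.
Hypothetical helper, written for these notes."
  (let ((case-fold-search nil))
    ;; ^ anchors to the line start; [^A-Z] is any character but A to Z.
    (seq-filter (lambda (line) (string-match-p "^[^A-Z]" line)) lines)))

(my/not-capitalized
 '("GNU" "Emacs" "org-mode" "regexp" "emacs_lisp" "Linux" "guix"))
```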

Alternative groups with literal brackets

Use a character set that matches name and nom.

name nom

Then:

  1. Match the username variants’ [name] or [nom].
  2. Replace the match with [PROT].

<username><![CDATA[name]]></username>
<nameuser><![CDATA[nam]]></nameuser>
<userName><![CDATA[nom]]></userName>
<nameuser><![CDATA[nome]]></nameuser>
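My own attempt at the two steps, as a sketch only. I am assuming the nameuser lines are decoys to be left alone, so the regexp is tied to the username or userName tag; both the regexp and the helper name are mine and may differ from what the video does. Literal brackets are escaped as \[ and \].

```elisp
(defun my/replace-cdata-name (line)
  "Replace [name] or [nom] with [PROT] in username lines of LINE.
Hypothetical helper, written for these notes."
  (let ((case-fold-search nil))
    (replace-regexp-in-string
     ;; Group 1 keeps the tag and CDATA prefix; n[ao]me? covers name and nom.
     "\\(<user[Nn]ame><!\\[CDATA\\)\\[n[ao]me?\\]"
     "\\1[PROT]"
     line
     t)))                               ; FIXEDCASE: insert PROT as written

(my/replace-cdata-name "<username><![CDATA[name]]></username>")
(my/replace-cdata-name "<nameuser><![CDATA[nam]]></nameuser>")
```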

Either match

To target either set, use \|.

Prepend vr/ to the first group and match on each line.

`(group-0 ((group (:inherit modus-theme-intense-blue))))
`(group-1 ((group (:inherit modus-theme-intense-magenta))))
`(group-2 ((group (:inherit modus-theme-intense-green))))
`(match-0 ((match (:inherit modus-theme-refine-yellow))))
`(match-1 ((match (:inherit modus-theme-refine-yellow))))
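A sketch of how I would do the replacement from Lisp. It assumes each form sits on its own line and starts with a backquote and an opening parenthesis; the helper name is hypothetical.

```elisp
(defun my/prefix-vr (text)
  "Prepend vr/ to the leading group or match symbol on each line of TEXT.
Hypothetical helper, written for these notes."
  (let ((case-fold-search nil))
    (replace-regexp-in-string
     ;; \| picks either word; ^ limits the match to the start of a line,
     ;; so the inner (group ...) and (match ...) forms are left alone.
     "^`(\\(group\\|match\\)"
     "`(vr/\\1"
     text
     t)))

(my/prefix-vr
 (concat "`(group-0 ((group (:inherit modus-theme-intense-blue))))\n"
         "`(match-0 ((match (:inherit modus-theme-refine-yellow))))"))
```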

Running elisp functions on groups

Run elisp by escaping the comma \, and then following it with a symbol inside parentheses: \,(FUNCTION).

Using the .ired pattern from earlier, run a replace command that executes the upcase function on the second (middle) match. Keep the rest intact.

direddireddired firedfiredfired miredmiredmired tiredtiredtired wiredwiredwired
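In query-replace-regexp the replacement would be typed as \1\,(upcase \2)\3. From Lisp, a function can be passed as the replacement instead. A sketch with a hypothetical helper name of my own:

```elisp
(defun my/upcase-middle (text)
  "Upcase the middle of every three consecutive .ired matches in TEXT.
Hypothetical helper, written for these notes."
  (let ((case-fold-search nil))
    (replace-regexp-in-string
     "\\(.ired\\)\\(.ired\\)\\(.ired\\)"
     (lambda (match)
       ;; MATCH is the matched text; the match data refers to it.
       (concat (match-string 1 match)
               (upcase (match-string 2 match))
               (match-string 3 match)))
     text
     t      ; FIXEDCASE: do not adapt the case of the replacement
     t)))   ; LITERAL: insert the returned string as is

(my/upcase-middle "direddireddired firedfiredfired")
```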