({EmacsConf and hangouts} 2022) "EmacsConf 2022: GRAIL---A Generalized Representation and Aggregation of Information Layers" 2022 EmacsConf and Emacs hangouts - EmacsConf 2022

Question: How can the information layers needed for learning be put to use, and is there something Emacs can do here? That is, something that is possible precisely because it is Emacs.

Description

The human brain receives various signals that it assimilates (filters, splices, corrects, etc.) to build a syntactic structure and its semantic interpretation. This is a complex process that enables human communication. The field of artificial intelligence (AI) is devoted to studying how we generate symbols and derive meaning from such signals and to building predictive models that allow effective human-computer interaction.

For the purpose of this talk we will limit the scope of signals to the domain of language: text and speech. Computational Linguistics (CL), a.k.a. Natural Language Processing (NLP), is a sub-area of AI that tries to interpret them. It involves modeling and predicting complex linguistic structures from these signals. These models tend to rely heavily on a large amount of "raw" (naturally occurring) data and a varying amount of (manually) enriched data, commonly known as "annotations". The models are only as good as the quality of the annotations. Because linguistic phenomena are complex and numerous, a divide-and-conquer approach is common. The upside is that it allows one to focus on one, or a few, related linguistic phenomena. The downside is that the universe of these phenomena keeps expanding, as language is context sensitive and evolves over time. For example, depending on the context, the word "bank" can refer to a financial institution, or the rising ground surrounding a lake, or something else. The verb "google" did not exist before the company came into being.

Manually annotating data can be a very task-specific, labor-intensive endeavor. Owing to this, advances in multiple modalities have happened in silos until recently. Recent advances in computer hardware and machine learning algorithms have opened doors to interpretation of multimodal data. However, the need to piece together such related but disjoint predictions poses a huge challenge.

This brings us to the two questions that we will try to address in this talk:

How can we come up with a unified representation of data and annotations that encompasses arbitrary levels of linguistic information? and,

What role might Emacs play in this process?

Emacs provides a rich environment for editing and manipulating recursive embedded structures found in programming languages. Its view of text, however, is more or less linear: strings broken into words, strings ended by periods, strings identified using delimiters, etc. It does not assume embedded or recursive structure in text. However, the process of interpreting natural language involves operating on such structures. What if we could adapt Emacs to manipulate rich structures derived from text? Unlike programming languages, which are designed to be parsed and interpreted deterministically, interpretation of statements in natural languages has to frequently deal with phenomena such as ambiguity, inconsistency, incompleteness, etc., and can get quite complex.
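
The contrast between a linear view of text and a recursive one can be made concrete with a small sketch. The example below is only an illustration in Python, not code from the talk: the sentence, the bracketed parse, and the helper names `leaves` and `depth` are all assumptions chosen for the example.

```python
# Illustrative only: a hand-built parse for one sentence (not from the talk).
sentence = "The bank approved the loan"

# Linear view: the text is just a flat sequence of words.
tokens = sentence.split()

# Recursive view: nested (label, child, ...) tuples; leaves are plain strings.
tree = ("S",
        ("NP", ("DT", "The"), ("NN", "bank")),
        ("VP", ("VBD", "approved"),
               ("NP", ("DT", "the"), ("NN", "loan"))))


def leaves(node):
    """Return the words under a node, left to right."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:  # node[0] is the label
        words.extend(leaves(child))
    return words


def depth(node):
    """Return the nesting depth of a node (a bare word has depth 0)."""
    if isinstance(node, str):
        return 0
    return 1 + max(depth(child) for child in node[1:])
```

Both views cover the same words, so `leaves(tree)` gives back the flat token list, but only the tree records that "the loan" forms a unit inside the verb phrase.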

We present an architecture (GRAIL) which utilizes the capabilities of Emacs to allow the representation and aggregation of such rich structures in a systematic fashion. Our approach is not tied to Emacs, but uses its many built-in capabilities for creating and evaluating solution prototypes.
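
As a rough illustration of what "representation and aggregation of information layers" could mean in practice, here is a minimal sketch of standoff annotation, where each layer points into the same raw text by character offsets. This is an assumption for illustration, not GRAIL's actual design: the `Span` class, the layer names `pos` and `sense`, the labels, and the `aggregate` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """One annotation: a label attached to text[start:end] (hypothetical type)."""
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str


text = "I sat on the bank of the lake"

# Each layer is assumed to be produced independently (e.g. by separate tools).
layers = {
    "pos": [Span(13, 17, "NOUN"), Span(25, 29, "NOUN")],
    "sense": [Span(13, 17, "riverbank")],
}


def aggregate(text, layers):
    """Merge labels from all layers that cover exactly the same span."""
    merged = {}
    for name, spans in layers.items():
        for span in spans:
            merged.setdefault((span.start, span.end), {})[name] = span.label
    # Return one record per span, ordered by position in the text.
    return [(start, end, text[start:end], labels)
            for (start, end), labels in sorted(merged.items(),
                                               key=lambda item: item[0])]
```

Calling `aggregate(text, layers)` yields one record per annotated span, so the part-of-speech and word-sense labels for "bank" end up side by side while the raw text stays untouched. Keeping layers separate from the text is one common way to let new kinds of annotation be added later without rewriting existing ones.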

Related-Notes

References

{EmacsConf and Emacs hangouts}, eds. 2022. EmacsConf 2022: GRAIL---A Generalized Representation and Aggregation of Information Layers. Directed by {EmacsConf and Emacs hangouts}. https://www.youtube.com/watch?v=q2b3mSOUZcY.