Sehyeog Kim
← Back to Agentic_AI_Theory

๋ชฉ์ฐจ
- Frontend framework: ์‚ฌ์šฉ์ž๊ฐ€ ๋Œ€ํ™”/์ž‘์—…์„ ์š”์ฒญํ•˜๋Š” UI (part1)
- Agent development framework: ์—์ด์ „ํŠธ ๋กœ์ง(๋ฃจํ”„, ์ƒํƒœ, ๋„๊ตฌ ์—ฐ๊ฒฐ)์„ ๋งŒ๋“œ๋Š” ํ”„๋ ˆ์ž„์›Œํฌ (part1)
- Agent memory: ๋Œ€ํ™”/์„ธ์…˜ ์ƒํƒœ์™€ ์žฅ๊ธฐ ๊ธฐ์–ต ์ €์žฅ (part1)
- Agent tools: ๊ฒ€์ƒ‰, DB, ์‚ฌ๋‚ด API ๋“ฑ โ€œํ–‰๋™โ€์„ ์ˆ˜ํ–‰ํ•˜๋Š” ๋„๊ตฌ ๋ฌถ์Œ (part1)
- Agent design patterns: ์‹ฑ๊ธ€ ์—์ด์ „ํŠธ vs ๋ฉ€ํ‹ฐ ์—์ด์ „ํŠธ ๋“ฑ ๊ตฌ์กฐ ํŒจํ„ด (part3)
- Agent runtim: ์—์ด์ „ํŠธ ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜์ด ์‹ค์ œ๋กœ ๋Œ์•„๊ฐ€๋Š” ์‹คํ–‰ ํ™˜๊ฒฝ (part3)
- AI models: ์ถ”๋ก /์˜์‚ฌ๊ฒฐ์ • ์—”์ง„(part 3)
- Model runtime: ๋ชจ๋ธ์„ ์„œ๋น™ํ•˜๋Š” ์ธํ”„๋ผ(๊ด€๋ฆฌํ˜• API/์ปจํ…Œ์ด๋„ˆ/GKE ๋“ฑ) (part 3)

๐ŸŽ Agentic AI - Memory

Agentic AI๋ฅผ ํ•œ ๋ฌธ์žฅ์œผ๋กœ ์ •๋ฆฌํ•˜๋ฉด,ย ์‚ฌ์šฉ์ž ์˜๋„๋ฅผ ์ดํ•ดํ•˜๊ณ  โ†’ ์—ฌ๋Ÿฌ ๋‹จ๊ณ„ ๊ณ„ํš์„ ์„ธ์šฐ๊ณ  โ†’ ๋„๊ตฌ๋ฅผ ํ˜ธ์ถœํ•ด ์‹คํ–‰๊นŒ์ง€ ๋๋‚ด๋Š”ย ์ž์œจ ์‹œ์Šคํ…œ์ด๋‹ค. ๋‹จ์ˆœํžˆ โ€œ๋‹ต๋ณ€์„ ์ƒ์„ฑํ•˜๋Š” ๋ชจ๋ธโ€์ด ์•„๋‹ˆ๋ผ,ย ๊ณ„ํš(Planning)ย ๊ณผย ๋„๊ตฌ(Tools)ย ๋ฅผ ํ†ตํ•ด ์‹ค์ œ ์—…๋ฌด๋ฅผ ์™„๋ฃŒํ•˜๋„๋ก ์„ค๊ณ„๋œ ์•„ํ‚คํ…์ฒ˜์ธ ๊ฒƒ์ด๋‹ค.
์ด ์ž‘์—…์„ ์ง„ํ–‰ํ•˜๊ธฐ ์œ„ํ•ด์„œ๋Š”, ์ด์ „์— ์–ด๋–ค ๋น„์Šทํ•œ ์ž‘์—…์„ ํ–ˆ๋Š”์ง€, ์‚ฌ์šฉ์ž์˜ ํŠน์„ฑ์€ ๋ฌด์—‡์ธ์ง€๋ฅผ agent๊ฐ€ ๋ฏธ๋ฆฌ ์ธ์ง€ํ•˜๊ณ  ์ž‘์—…์„ ํ•˜๋Š”๊ฒƒ์ด ํšจ์œจ์ ์ด๊ณ  ์ด๋ฅผ ์œ„ํ•ด Memory๊ธฐ๋Šฅ์€ ํ•„์ˆ˜์ ์ด๋‹ค.
๐ŸŽ Agentic AI - Memory
(agent memory ๊ธฐ๋Šฅ์ด ์—†๋‹ค๋ฉด ์œ„ ์˜ˆ์‹œ์ฒ˜๋Ÿผ, action, operation๊ณผ์ •์ด ๊ณ„์†ํ•ด์„œ ๋Š˜์–ด๋‚˜๊ณ , ์ง„ํ–‰์ด ๋А๋ ค์ง€๋Š” ์ƒํ™ฉ์ด ๋ฐœ์ƒํ•œ๋‹ค)

๋‹จ๊ธฐ๊ธฐ์–ต


Agent memory๋Š” ์‚ฌ๋žŒ์˜ ๊ธฐ์–ต๊ณผ๋„ ๊ฐ™๋‹ค. ์šฐ๋ฆฌ๊ฐ€ ์–ด๋–ค ์‚ฌ๋žŒ๊ณผ ๋Œ€ํ™”๋ฅผ ์ง„ํ–‰ํ• ๋•Œ, ๊ทธ ์‚ฌ๋žŒ๊ณผ์˜ ๋Œ€ํ™” ํ๋ฆ„๊ณผ ๋‚ด์šฉ์„ ๊ธฐ์–ต์„ ํ•˜์ง€๋งŒ, ์žฅ๊ธฐ์ ์œผ๋กœ๋Š” ๋ชจ๋“  ๋Œ€ํ™”๋‚ด์šฉ์„ ๊ธฐ์–ตํ•  ์ˆ˜๋Š” ์—†๋‹ค.
(์‚ฌ๋žŒ๋“ค์˜ ๋‡Œ์— ์ €์žฅ๊ณต๊ฐ„์˜ ํ•œ๊ณ„๊ฐ€ ์žˆ๊ธฐ ๋•Œ๋ฌธ์ด๋‹ค).

์šฐ๋ฆฌ๊ฐ€ CHATGPT๋ฅผ ์‚ฌ์šฉํ•˜๋ฉด, ๊ฒฝํ—˜์— ์˜ํ•ด์„œ ์•„๋งˆ ๋ชจ๋“  ๋ถ„๋“ค์ด ์•„์‹ค ๊ฒƒ์ด๋‹ค. ์ƒˆ๋กœ์šด ๋Œ€ํ™”์ฐฝ์„ ์—ด๊ฒŒ ๋˜๋ฉด, ๊ธฐ์กด์— ๋Œ€ํ™”ํ–ˆ๋˜ ๊ฒƒ๋“ค๊ณผ ๋ณ„๊ฐœ์˜ ๋‚ด์šฉ์— ๋Œ€ํ•ด์„œ ์ด์•ผ๊ธฐ ํ•  ์ˆ˜ ์žˆ๋‹ค. ์ฆ‰, ๋Œ€ํ™” ๊ธฐ์–ต์ด reset๋˜๋Š” ๊ฒƒ์ด๋‹ค.

์—ฌ๊ธฐ์„œ ์ค‘์š”ํ•œ ๊ฐœ๋… ๋‘๊ฐœ๋ฅผ ์งš๊ณ  ๋„˜์–ด๊ฐ€์ž.
1. Session: ์‚ฌ์šฉ์ž์™€ ์—์ด์ „ํŠธ๊ฐ€ ์ฃผ๊ณ ๋ฐ›๋Š” โ€œ๋Œ€ํ™”์ฐฝโ€
- ํ•œ ์„ธ์…˜ ์•ˆ์—์„œ๋Š” ์ด์ „ ๋ฉ”์„ธ์ง€๋“ค์ด ์ด์–ด์ง„ ๋งฅ๋ฝ์œผ๋กœ (๋‹จ๊ธฐ๊ธฐ์–ต์œผ๋กœ) ๊ธฐ์–ต๋จ.
- ํ•œ ์„ธ์…˜์€ ๋‹ค๋ฅธ ์„ธ์…˜ ๋‚ด์šฉ๊ณผ ์„ž์ด์ง€ ์•Š์Œ, ๋‹ค๋ฅธ๋ง๋กœ ๋‹ค๋ฅธ ์„ธ์…˜์„ ๊ธฐ์–ตํ•˜์ง€ ์•Š์Œ.
2. State: ํ•œ session์—์„œ ๋Œ€ํ™”๋ฅผ ํ†ตํ•ด ์ •๋ฆฌ๋œ โ€œํ˜„์žฌ ์ž‘์—…์˜ ํ•ต์‹ฌ ์ •๋ณดโ€
- state type: ๋ชฉํ‘œ ์ œ์•ฝ, ์ด๋ฏธ ๊ฒฐ์ •๋œ ์‚ฌํ•ญ, ์ง„ํ–‰๋‹จ๊ณ„, ๋‹ค์Œ ํ–‰๋™ ๊ฐ™์€ ๊ตฌ์กฐํ™”๋œ ์š”์•ฝ.

๐ŸŽ Agentic AI - Memory

์œ„ Chatpgt history์—์„œ session์€
- Opencalw Voice call ์„ค์ •
- Twill Definition and Uses
- Openclaw ์ „ํ™”์„ค์ •
- etc.
state๋Š”
- ์‚ฌ์šฉ์ž์˜ ๋ชฉํ‘œ: โ€œํ‘ธ๊พธ์˜ฅ ๊ณตํ•ญ ์ง ๋ณด๊ด€ ์œ„์น˜/๊ฐ€๊ฒฉ ์กฐ์‚ฌโ€
- ์ œ์•ฝ์กฐ๊ฑด: โ€œํ•œ๊ตญ์–ด๋กœ ์š”์•ฝ, ๊ฐ€๊ฒฉ ๋น„๊ต ํฌํ•จโ€
- ์ด๋ฏธ ๊ฒฐ์ •๋œ ์„ ํƒ: โ€œํ›„๋ณด๋Š” ํ”ผํฌํƒ€์ž„ ์šฐ์„ โ€
- ํ˜„์žฌ ์ง„ํ–‰ ๋‹จ๊ณ„: โ€œ1) ํ›„๋ณด ๋ฆฌ์ŠคํŠธ์—… ์™„๋ฃŒ โ†’ 2) ๊ฐ€๊ฒฉ/์œ„์น˜ ๊ฒ€์ฆ ๋‹จ๊ณ„โ€
- ํ•ด์•ผ ํ•  ๋‹ค์Œ ํ–‰๋™: โ€œ๊ตฌ๊ธ€๋งต ์œ„์น˜ ํ™•์ธ / ์ตœ์‹  ํ›„๊ธฐ 3๊ฐœ ์š”์•ฝโ€
- ๋„๊ตฌ ์‚ฌ์šฉ ๊ฒฐ๊ณผ ์š”์•ฝ: โ€œ๊ฒ€์ƒ‰ ๊ฒฐ๊ณผ ํ•ต์‹ฌ๋งŒ 5์ค„๋กœ ์ •๋ฆฌโ€

์—ฌ๊ธฐ์„œ ๊ต‰์žฅํžˆ ์ค‘์š”ํ•œ ์ ์€ ๋‘๊ฐ€์ง€์ด๋‹ค:
- โ€œ๋‹จ๊ธฐ๊ธฐ์–ต์€ โ€˜Session์—์„œโ€™ ๋งฅ๋ฝ์„ ์œ ์ง€ํ•˜๊ธฐ ์œ„ํ•ด ์‹œ์Šคํ…œ์ด ๊ด€๋ฆฌํ•˜๋Š” ๋Œ€ํ™”/์ƒํƒœ(state)์ด๋‹ค.โ€
- โ€œLLM ์ž์ฒด๊ฐ€ ๊ธฐ์–ตํ•˜๋Š” ๊ฒƒ์ด ์•„๋‹ˆ๋ผ, ๋Œ€ํ™” ๊ธฐ๋ก/์ƒํƒœ๋ฅผ ์ปจํ…์ŠคํŠธ๋กœ ์žฌ์ฃผ์ž…ํ•ด์„œ ๊ธฐ์–ตํ•˜๋Š” ๊ฒƒ์ฒ˜๋Ÿผ ๋™์ž‘ํ•œ๋‹ค.โ€
- ๋Œ€ํ™”๋‚ด์šฉ์„ โ€˜๋ฌธ์žโ€™๋กœ ์ „๋ถ€ ๊ธฐ์–ตํ•˜๋Š” ๊ฒƒ์€ ๋น„ํšจ์œจ ์ ์ด๋ฏ€๋กœ, ํ•ต์‹ฌ์ ์ธ state๋ฅผ ์ค‘์‹ฌ์œผ๋กœ ๊ธฐ์–ตํ•œ๋‹ค.
์œ„ ์‚ฌ์ง„์˜ ์˜ˆ์‹œ์—์„œ, โ€˜openclaw voice call ์„ค์ •โ€™ session๋‚ด๋ถ€์—์„œ ๋Œ€ํ™”๋‚ด์šฉ์„ ๊ณ„์†ํ•ด์„œ ๊ธฐ์–ตํ•˜๋Š” ๋Šฅ๋ ฅ์ด ๋ฐ”๋กœ ๋‹จ๊ธฐ๊ธฐ์–ต๋Šฅ๋ ฅ์ธ ๊ฒƒ์ด๋‹ค.

์žฌ๋ฏธ ์žˆ๋Š” ์‚ฌ์‹ค์€ LLM์ด ๊ธฐ์–ต์„ ํ•˜๋Š”๊ฒŒ ์•„๋‹ˆ๋ผ openai, antropic, google ๋ชจ๋“  ํšŒ์‚ฌ๋“ค์ด ์•„๋ž˜์˜ ๊ตฌ์กฐ๋กœ ์„ค๊ณ„๋ฅผ ํ•ด์„œ ๊ทธ๋ ‡๋‹ค.

๋“ค์–ด๊ฐ€๊ธฐ์ „์— context window๋ผ๋Š” ๊ฐœ๋…์„ ๋น ๋ฅด๊ฒŒ ์งš๊ณ  ๋„˜์–ด๊ฐ€์ž. ์ปจํ…์ŠคํŠธ ์œˆ๋„์šฐ(Context Window)๋Š” โ€œ๋Œ€ํ˜• ์–ธ์–ด ๋ชจ๋ธ(LLM)์ด ํ•œ ๋ฒˆ์˜ ๋Œ€ํ™”๋‚˜ ์ž‘์—…์—์„œ ์ดํ•ดํ•˜๊ณ  ๊ธฐ์–ตํ•  ์ˆ˜ ์žˆ๋Š” ์ตœ๋Œ€ ์ž…๋ ฅ ๋ฐ ์ถœ๋ ฅ ํ…์ŠคํŠธ์˜ ์–‘โ€ ์ด๋‹ค.
์•„๋ž˜์˜ LLM์— ๋„ฃ์–ด์ฃผ๋Š” prompt๋ฅผ ๋ณด๋ฉด words, imsage, videos๊ฐ€ ์žˆ๋‹ค. ํ•˜์ง€๋งŒ LLM ์ด ํ˜„์žฌ ๊ฐ๋‹นํ•  ์ˆ˜ ์žˆ๋Š” ์ตœ๋Œ€ ์–‘์ด ์กด์žฌ๋ฅผ ํ•œ๋‹ค.
๐ŸŽ Agentic AI - Memory

์•„๋ž˜ 2026.02์ž๋ฃŒ์— ๋”ฐ๋ฅด๋ฉด Claude, gpt-codex์˜ context window๋Š” 1M, 400K token์ด๋ผ๊ณ  ์ ํ˜€ ์žˆ๋‹ค. ์—ฌ๊ธฐ์„œ ํ† ํฐ์€, ์šฐ๋ฆฌ๊ฐ€ prompt์— ์ž…๋ ฅํ•˜๋Š” ๋ฐ์ดํ„ฐ์˜ ์ข…๋ฅ˜๋Š” (๊ธ€, ์ด๋ฏธ์ง€, ๋น„๋””์˜ค)๋กœ ๋‹ค์–‘ํ•˜์ง€๋งŒ ์ด๋ฅผ LLM์ด ํก์ˆ˜ํ• ๋•Œ๋Š” ํ•˜๋‚˜์˜ ๋ฐ์ดํ„ฐ์ด๋‹ค. ๋”ฐ๋ผ์„œ, ๊ณตํ†ต๋œ ๋‹จ์œ„๋ฅผ ์‚ฌ์šฉํ•˜๊ณ  ๊ทธ ๋‹จ์œ„๊ฐ€ token์ด๋‹ค. ๋‹ค์‹œ ๋งํ•ด์„œ, LLM์ด ํก์ˆ˜ํ•˜๋Š” ๋ฐ์ดํ„ฐ์˜ ๋‹จ์œ„์ด๋‹ค.
(์‰ฝ๊ฒŒ ๋งํ•ด ๊ธ€์„ ๋งŽ์ด ์“ฐ๊ณ , ์ด๋ฏธ์ง€๋ฅผ ๋งŽ์ด ์“ฐ๊ณ , ๋น„๋””์˜ค์˜ ์šฉ๋Ÿ‰์ด ํฌ๋‹ค๋ฉด, ์‚ฌ์šฉ๋˜๋Š” ํ† ํฐ์ด ๋งŽ๋‹ค ๋ผ๊ณ  ์ดํ•ดํ•˜์‹œ๋ฉด ์ข‹๋‹ค.)

๐ŸŽ Agentic AI - Memory
[Nivida blog]
โ€œLLM (Large Language Model) token isย the fundamental unit of text or code that a model processes, analyzes, and generates. Think of tokens as the "Lego bricks" of languageโ€”they are the basic pieces used to break down input text into manageable units before converting them into numerical representations (vectors)โ€


๋‹ค์‹œ chatgpt๊ฐ€ ์–ด๋–ป๊ฒŒ ๋Œ€ํ™”์ฐฝ์˜ ์ด์ „ ๋‚ด์šฉ๋“ค์„ ๊ธฐ์–ตํ•˜๋Š”๊ฐ€? ๋ผ๋Š” ์งˆ๋ฌธ์œผ๋กœ ๋Œ์•„๊ฐ€ ๋ณด๋ฉด, context window ์ฆ‰ ์ž…๋ ฅํ•ด์ฃผ๋Š” prompt ๋‚ด๋ถ€์— ์ด์ „ ๋Œ€ํ™”๋‚ด์šฉ๋“ค์„ ์งš์–ด ๋„ฃ๋Š” ๊ฒƒ์ด๋‹ค.์•„๋ž˜์˜ ์‚ฌ์ง„์„ ๋ณด๋ฉด turn2,3์—์„œ๋Š” turn1,2 ์ฆ‰ ์ด์ „ ๋Œ€ํ™”์˜ ๋‚ด์šฉ๋“ค์„ ๊ฐ™์ด ์งš์–ด ๋„ฃ๋Š”๋‹ค.(์‚ฌ์šฉ์ž์˜ ์งˆ๋ฌธ๊ณผ ๋‹ต๋ณ€๊นŒ์ง€ ์งš์–ด ๋„ฃ๋Š”๋‹ค).
๐ŸŽ Agentic AI - Memory

ํ•˜์ง€๋งŒ, ์ด์•ผ๊ธฐ ํ–ˆ๋‹ค ์‹ถ์ด. context window์˜ ์ œํ•œ ์œ„ ์‚ฌ์ง„์—์„œ๋Š” 200k token์ด ์กด์žฌํ•œ๋‹ค. ๋”ฐ๋ผ์„œ, ๋ชจ๋“  ๋Œ€ํ™”๋ฅผ ๋„ฃ์„ ์ˆ˜ ์—†๋‹ค. ๊ทธ๋ž˜์„œ openai docs์— ๋”ฐ๋ฅด๋ฉด ๋‘๊ฐ€์ง€ ๋ฐฉ์‹ ์„ ์‚ฌ์šฉํ•œ๋‹ค.
Context Engineering - Short-Term Memory Management with Sessions from OpenAI Agents SDK

Trimming


Trimming์€ ์•„์ฃผ ๋‹จ์ˆœํ•˜๊ฒŒ time sequence์— ๋”ฐ๋ผ ๋Œ€ํ™”๋‚ด์šฉ์„ ์ •๋ ฌํ•œํ›„, context windowํ•œ๊ณ„์น˜๊นŒ์ง€๋งŒ ๋‹ด๋Š” ๊ฒƒ์ด๋‹ค. (์ด์ „ ๋‚ด์šฉ์€ ๋‹ค ์ง€์šฐ๋Š” ๊ฒƒ)
๐ŸŽ Agentic AI - Memory

Compaction


Compact (summarizing)์€ ๋Œ€ํ™”๋‚ด์šฉ๋“ค์„ ์ „๋ถ€ ์š”์•ฝํ•œ๋‹ค์Œ์— ํ•˜๋‚˜์˜ compaction block์œผ๋กœ createํ•œ ํ›„์—, Context window์— ์งš์–ด ๋„ฃ์–ด ์ดํ›„ ๋Œ€ํ™”๋ฅผ ์ง„ํ–‰ํ•˜๋Š” ๊ฒƒ์ด๋‹ค.
Compaction

๐ŸŽ Agentic AI - Memory

์žฅ๊ธฐ๊ธฐ์–ต


๋‹จ๊ธฐ๊ธฐ์–ต์ด โ€œํ•œ ์„ธ์…˜ ์•ˆ์—์„œ ์œ ์ง€๋˜๋Š” ๋Œ€ํ™”/์ƒํƒœ(state)โ€๋ผ๋ฉด, ์žฅ๊ธฐ๊ธฐ์–ต์€ ์„ธ์…˜์ด ๋ฐ”๋€Œ์–ด๋„ ๊ณ„์† ๋‚จ์•„์žˆ๋Š” โ€˜์ง€์† ์ €์žฅ(persistent storage)โ€™์ด๋‹ค. ์ฆ‰, ์—์ด์ „ํŠธ๊ฐ€ ๋งค๋ฒˆ ์ฒ˜์Œ๋ถ€ํ„ฐ ๋‹ค์‹œ ๋ฌป๊ณ  ๋ฐฐ์šฐ์ง€ ์•Š๋„๋ก, ์‚ฌ์šฉ์ž ์„ ํ˜ธยทํ”„๋กœ์ ํŠธ ์ง€์‹ยท๊ณผ๊ฑฐ ๊ฒฐ์ •์‚ฌํ•ญยท์ž์ฃผ ์“ฐ๋Š” ๋ฌธ์„œ ๋“ฑ์„ ์ €์žฅํ•ด๋‘๊ณ  ํ•„์š”ํ•  ๋•Œ ๊บผ๋‚ด ์“ฐ๋Š” ๊ตฌ์กฐ๋‹ค.

wHY?


์œ„์—์„œ ์„ค๋ช…ํ–ˆ๋‹ค ์‹ถ์ด, ์ปจํ…์ŠคํŠธ ์œˆ๋„์šฐ๋Š” ๊ฒฐ๊ตญ ์ œํ•œ์ด ์žˆ๊ธฐ ๋•Œ๋ฌธ์—, ๋Œ€ํ™”๊ฐ€ ๊ธธ์–ด์ง€๋ฉด ๊ธธ์–ด์งˆ์ˆ˜๋ก (์š”์•ฝ, trim)์œผ๋กœ โ€œ๋ชจ๋“  ๊ฒƒ์„ ๋Œ€ํ™” ํžˆ์Šคํ† ๋ฆฌ๋กœ ๋“ค๊ณ  ๊ฐ€๋Š” ๋ฐฉ์‹โ€์€ ์˜ค๋ž˜ ๋ชป ๊ฐ„๋‹ค. ๊ทธ๋ž˜์„œ ์žฅ๊ธฐ๊ธฐ์–ต์ด ํ•„์š”ํ•ด์ง„๋‹ค.
- ๊ฐœ์ธํ™”: ์‚ฌ์šฉ์ž์˜ ๋งํˆฌ/์„ ํ˜ธ/๋ชฉํ‘œ๋ฅผ ๊ธฐ์–ตํ•ด์„œ ๋งค๋ฒˆ ์„ค๋ช…์„ ๋ฐ˜๋ณตํ•˜์ง€ ์•Š๊ฒŒ ํ•œ๋‹ค.
- ์ง€์‹ ์œ ์ง€: ํ”„๋กœ์ ํŠธ์˜ ๋ฐฐ๊ฒฝ์ง€์‹, ๋‚ด๋ถ€ ๋ฌธ์„œ, ๊ณผ๊ฑฐ ๊ฒฐ์ •์‚ฌํ•ญ์„ ์ถ•์ ํ•ด โ€œ์—…๋ฌด ์—ฐ์†์„ฑโ€์„ ๋งŒ๋“ ๋‹ค.
- ์ •ํ™•์„ฑ/๊ทผ๊ฑฐ: ๋ชจ๋ธ์ด ํ•™์Šต ๋•Œ ๋ณด์ง€ ๋ชปํ•œ ์ตœ์‹  ์ •๋ณด๋‚˜ ์‚ฌ๋‚ด ์ •๋ณด๋ฅผ, ์ €์žฅ์†Œ์—์„œ ์ฐพ์•„์™€ ๋‹ต๋ณ€์„ โ€œgrounding(๊ทผ๊ฑฐํ™”)โ€ํ•œ๋‹ค.
- ๋น„์šฉ/์†๋„: ๊ธด ๋Œ€ํ™” ์ „์ฒด๋ฅผ ๋„ฃ๋Š” ๋Œ€์‹ , ํ•„์š”ํ•œ ๊ฒƒ๋งŒ ๊ฒ€์ƒ‰ํ•ด ๋„ฃ์œผ๋ฉด ํ† ํฐ ๋น„์šฉ๊ณผ ์ง€์—ฐ์ด ์ค„์–ด๋“ ๋‹ค. (RAG์˜ ๋Œ€ํ‘œ ์žฅ์ )

Method


์žฅ๊ธฐ๊ธฐ์–ต์€ โ€œ์™ธ๋ถ€ ์ €์žฅ์†Œ + ๊ฒ€์ƒ‰/ํšŒ์ˆ˜(retrieval)โ€๋กœ ๊ตฌํ˜„๋˜๋Š” ๊ฒฝ์šฐ๊ฐ€ ๋งŽ๋‹ค. ํฌ๊ฒŒ 3๊ฐˆ๋ž˜๊ฐ€ ์‹ค๋ฌด์—์„œ ๋งŽ์ด ์“ฐ์ธ๋‹ค.

(A) ํŒŒ์ผ ๊ธฐ๋ฐ˜ ๋ฉ”๋ชจ๋ฆฌ (๊ฐ€์žฅ ๋‹จ์ˆœยท์ง๊ด€)

์•„๋ž˜ context engineering์˜ ์˜ˆ์‹œ๋ฅผ ๋ณด๋ฉด, ์‚ฌ์‹ค ์šฐ๋ฆฌ๊ฐ€ agent์—๊ฒŒ โ€œ์˜ค๋Š˜ ๋‚ ์”จ๊ฐ€ ์–ด๋•Œโ€ ๋ผ๊ณ  ์งˆ๋ฌธ์„ ํ•œ๋‹ค๋ฉด, ๋‹จ์ˆœํžˆ ๊ทธ ์งˆ๋ฌธ๋งŒ LLM(๋‡Œ)์— ๋“ค์–ด ๊ฐ€๋Š”๊ฒŒ ์•„๋‹ˆ๋ผ, ํ˜„์žฌ ๊ฐ€์ง€๊ณ  ์žˆ๋Š” ๋„๊ตฌ๋“ค๊ทธ๋ฆฌ๊ณ  ๊ธฐ์–ต๋„ ๊ฐ™์ด prompt์— ๋“ค์–ด๊ฐ„๋‹ค. ๋”ฐ๋ผ์„œ, ์šฐ๋ฆฌ๋Š” memory.mdํŒŒ์ผ์— ์žฅ๊ธฐ๊ธฐ์–ตํ•ด์•ผํ•  ๋‚ด์šฉ๋“ค์„ ์ž์—ฐ์–ด๋กœ ์ž‘์„ฑํ•˜๊ณ , prompt์— ๋„ฃ์–ด์ค€๋‹ค.( Application๋‹จ๊ณ„์—์„œ ์„ค๋ช…๋“œ๋ฆฌ๊ฒ ์ง€๋งŒ, memory.md์˜ ๋‚ด์šฉ์„ ์ž๋™์œผ๋กœ ์ฝ๊ฒŒ ํ•  ์ˆ˜๋„ ์žˆ๊ณ , ์ˆ˜๋™์œผ๋กœ ํŒŒ์ผ์˜ ์œ„์น˜๋ฅผ ์งˆ๋ฌธํ• ๋•Œ ์•Œ๋ ค ์ค„์ˆ˜๋„ ์žˆ๋‹ค.)
- ์žฅ์ : ์ž์—ฐ์–ด๋กœ ์ ๊ธฐ๋งŒ ํ•˜๋ฉด ๋˜๋‹ˆ, ๊ตฌํ˜„์ด ์‰ฝ๊ณ , ์‚ฌ๋žŒ์ด ์ง์ ‘ ์—ด์–ด๋ณด๊ณ  ์ˆ˜์ •ํ•  ์ˆ˜ ์žˆ์–ด ๋””๋ฒ„๊น…์ด ํŽธํ•˜๋‹ค.
- ํ•œ๊ณ„: ๊ทœ๋ชจ๊ฐ€ ์ปค์ง€๋ฉด โ€œํ•„์š”ํ•œ ๋ถ€๋ถ„๋งŒ ๋น ๋ฅด๊ฒŒ ์ฐพ๊ธฐโ€๊ฐ€ ์–ด๋ ค์›Œ์ง„๋‹ค. (๊ฒฐ๊ตญ์— ๋˜๋‹ค๋ฅธ context์ด๋ฏ€๋กœ, ๋งŽ์ด ์“ธ์ˆ˜๊ฐ€ ์—†๋‹ค)
๐ŸŽ Agentic AI - Memory

Claude memory

์‹ค์ œ Claude code docx (Manage Claude's memory - Claude Code Docs)๋ฅผ ์‚ดํŽด๋ณด๋ฉด, session์˜ ๋งจ ์ฒ˜์Œ์— 200์ค„์˜ memory.md๋‚ด์šฉ์ด ์ž๋™ ๊ธฐ์ž…๋œ๋‹ค๊ณ  ๋งํ•œ๋‹ค. ์ฆ‰ ๋ชจ๋“  ๋Œ€ํ™”์—์„œ memory.md๋ฅผ ๋„ฃ์œผ๋ฉด ๋น„ํšจ์œจ์ ์ด๋‹ˆ session์˜ ์ฒ˜์Œ์— ๊ธฐ์ž…ํ•œ๋‹ค.
๐ŸŽ Agentic AI - Memory

(B) ๋ฒกํ„ฐ DB(Vector Database) / ์ž„๋ฒ ๋”ฉ ๊ฒ€์ƒ‰

(Exploring Vector Databases: Pinecone, Chroma, Weaviate, Qdrant, Milvus, PgVector, and Redis)
๐ŸŽ Agentic AI - Memory

์ž ๋‘๋ฒˆ์งธ๋กœ, ์™ธ๋ถ€ ์ €์žฅ๊ณต๊ฐ„์„ ์ด์šฉํ•˜๋Š” memory ์žฅ๊ธฐ๊ธฐ์–ต ๋ฐฉ์‹์ด๋‹ค. ์œ„์‚ฌ์ง„์„ ๋ณด๊ฒŒ ๋˜๋ฉด, ๋‹ค์–‘ํ•œ ์ œํ’ˆ/์˜คํ”ˆ์†Œ์Šค ํ”„๋กœ์ ํŠธ๊ฐ€ ์กด์žฌํ•œ๋‹ค. y์ถ•์€ ์ƒ์—…/open source ๋กœ ๊ตฌ๋ถ„์ด ๋˜์–ด ์žˆ์œผ๋ฉฐ, ์œ„์ชฝ์€ Dedicated = ๋ฒกํ„ฐ ๊ฒ€์ƒ‰์ด ์ฃผ์—…(์ „์šฉ ์—”์ง„), Support = ๋ณธ์—…์€ ๋”ฐ๋กœ ์žˆ๊ณ  ๋ฒกํ„ฐ ๊ฒ€์ƒ‰์€ ์˜ต์…˜(์ถ”๊ฐ€ ๊ธฐ๋Šฅ) ๋กœ ์ดํ•ดํ•˜๋ฉด ๋œ๋‹ค.

โ€œRAG๋Š” โ€˜์™ธ๋ถ€ ์ง€์‹์„ ๊ฒ€์ƒ‰ํ•ด(์ฃผ๋กœ ๋ฒกํ„ฐ ๊ฒ€์ƒ‰) LLM ์ž…๋ ฅ์— ๋ถ™์ธ ๋’ค ์ƒ์„ฑํ•˜๋Š”โ€™ ํŒจํ„ด์ด๋‹ค. ๋ฒกํ„ฐ DB๋Š” RAG์—์„œ ๊ฐ€์žฅ ๋„๋ฆฌ ์“ฐ์ด๋Š” retrieval ์ €์žฅ์†Œ ์ค‘ ํ•˜๋‚˜๋‹ค.โ€

Retrieval Augmented Generation (RAG) Process


๐ŸŽ Agentic AI - Memory
An introduction to RAG and simple/ complex RAG

์‚ฌ์šฉ์ž์—๊ฒŒ ์งˆ๋ฌธ์„ ๋ฐ›๊ธฐ ์ „์—, ๋จผ์ € Vector Database๋ฅผ ๊ตฌ์ถ•ํ•ด์•ผํ•œ๋‹ค. ์ดํ•ด๋ฅผ ์œ„ํ•ด์„œ, ๋ณ‘์›์—์„œ ์‚ฌ์šฉํ•˜๋Š” RAG๋ผ๊ณ  ๊ฐ€์ •ํ•˜์ž.
- A: Raw Data Source
- e.g) ํ™˜์ž์˜ ๋ฐ์ดํ„ฐ์™€ ์น˜๋ฃŒ ๋ฐฉ์‹๋“ค์ด ๋‹ด๊ธด ๋ฌธ์„œ๋“ค์„ ์ค€๋น„ํ•œ๋‹ค.
- B: Information Extraction
- (๋ฐฉ๋Œ€ํ•œ ๋ฐ์ดํ„ฐ) โ†’ ๊ฒ€์ƒ‰ ๊ฐ€๋Šฅํ•œ ํ…์ŠคํŠธ/meta data๋ฅผ ๋ฝ‘์•„๋ƒ„ (pdf โ†’ text, image โ†’ text)
- ํ™˜์ž์˜ ์ง„๋‹จ ๊ฒฐ๊ณผ ๋ฐ์ดํ„ฐ medical image โ†’ text, pdf โ†’ text๋กœ ์ „๋ถ€ ์ „ํ™˜ํ•œ๋‹ค.
- C: Chunking
- pdf, ๋ฌธ์„œ โ†’ ๋ฌธ๋‹จ, ์„น์…˜/์†Œ์ œ๋ชฉ ๋‹จ์œ„
- ์ „ํ™˜๋œ ์ •๋ณด๋ฅผ ์˜๋ฏธ์žˆ๋Š” ๋” ์ž‘์€ ๋ฉ์–ด์ด๋กœ ์ชผ๊ฐ ๋‹ค. (๋„ˆ๋ฌด ํฌ๋ฉด context window์— ๋“ค์–ด๊ฐ€์ง€ ๋ชปํ•จ)
- Embedding
- chunk โ†’ embedding vector (๋ฒกํ„ฐํ™”์ง„ํ–‰)
- e.g) ํ™˜์ž ์ •๋ณด, ์ง„๋‹จ ์ •๋ณด ๋ฉ์–ด๋ฆฌ๋“ค์„ ๋ฒกํ„ฐํ™”(์ˆซ์ž๋กœ ๋‚˜ํƒ€๋ƒ„) ๊ทธ๋ฆฌ๊ณ  database์— ์งš์–ด ๋„ฃ๋Š”๋‹ค.

Example (RAG + LLM)

์‚ฌ์šฉ์ž๊ฐ€ ์งˆ๋ฌธ์„ ํ•˜๋ฉด, ์‹œ์Šคํ…œ์€ ์งˆ๋ฌธ ์ „์ฒด๋ฅผ ์ž„๋ฒ ๋”ฉ(๋ฒกํ„ฐํ™”) ํ•˜๊ณ  ๋ฒกํ„ฐ DB์—์„œ ์˜๋ฏธ์ ์œผ๋กœ ๊ฐ€์žฅ ๊ฐ€๊นŒ์šด chunk(top-k) ๋ฅผ ๊ฒ€์ƒ‰ํ•œ๋‹ค. ๊ทธ๋ฆฌ๊ณ  ๊ฒ€์ƒ‰๋œ chunk์˜ ์›๋ฌธ(๋ฐ ๋ฉ”ํƒ€๋ฐ์ดํ„ฐ)์„ ์ปจํ…์ŠคํŠธ ์œˆ๋„์šฐ์— ์ฒจ๋ถ€ํ•œ ๋’ค, LLM์ด ๊ทธ ๊ทผ๊ฑฐ๋ฅผ ๋ฐ”ํƒ•์œผ๋กœ ๋‹ต๋ณ€์„ ์ƒ์„ฑํ•˜๋„๋ก ํ•œ๋‹ค.
1. ์‚ฌ์šฉ์ž๊ฐ€ ์งˆ๋ฌธ์„ ํ•จ.
e.g.) โ€œ์ด ํ™˜์ž ์ผ€์ด์Šค์—์„œ ์ฐธ๊ณ ํ•ด์•ผ ํ•  ํ”„๋กœํ† ์ฝœ/์ฃผ์˜์‚ฌํ•ญ์„ ๊ทผ๊ฑฐ์™€ ํ•จ๊ป˜ ์ •๋ฆฌํ•ด์ค˜โ€
2. ํ™˜์ž ์š”์•ฝ(์ง„๋‹จ๋ช…/๊ฒ€์‚ฌ๊ฒฐ๊ณผ/๊ธˆ๊ธฐ/๊ณผ๊ฑฐ๋ ฅ)์„ ๊ตฌ์กฐํ™”ํ•ด์„œ ์ฟผ๋ฆฌ๋กœ ๋งŒ๋“ ๋‹ค
3. ๋ณ‘์› ๋‚ด๋ถ€ ์ง€์‹๋ฒ ์ด์Šค(์น˜๋ฃŒ ํ”„๋กœํ† ์ฝœ, ๊ฐ€์ด๋“œ๋ผ์ธ, ์•ฝ๋ฌผ ๊ธˆ๊ธฐ ๋ชฉ๋ก, ์œ ์‚ฌ ์ผ€์ด์Šค ์š”์•ฝ)์—์„œ top-k ๋ฌธ์„œ๋ฅผ ๊ฒ€์ƒ‰ ํ›„, ๊ฒ€์ƒ‰๋œ ๊ทผ๊ฑฐ ๋ฌธ์„œ์˜ ํ•ต์‹ฌ ๋ฌธ๋‹จ์„ ์ปจํ…์ŠคํŠธ์— ๋ถ™์ธ๋‹ค
4. LLM์ด โ€œ๊ทผ๊ฑฐ ๋ฌธ์„œ ๊ธฐ๋ฐ˜์œผ๋กœ ๊ฐ€๋Šฅํ•œ ์˜ต์…˜/์ฃผ์˜์‚ฌํ•ญ/์ถ”๊ฐ€๋กœ ํ™•์ธํ•  ์ •๋ณดโ€๋ฅผ ์ •๋ฆฌํ•œ๋‹ค
5. ์‚ฌ์šฉ์ž์—๊ฒŒ ๋‹ต๋ณ€์„ ์ง„ํ–‰.

  • ์žฅ์ : ๋ฌธ์„œ๋ฅผ โ€œ์˜๋ฏธ ๊ธฐ๋ฐ˜โ€์œผ๋กœ ์ฐพ์„ ์ˆ˜ ์žˆ์–ด, ํ‚ค์›Œ๋“œ๊ฐ€ ์ •ํ™•ํžˆ ์ผ์น˜ํ•˜์ง€ ์•Š์•„๋„ ๊ด€๋ จ ๋‚ด์šฉ์„ ์ž˜ ๋Œ์–ด์˜จ๋‹ค. RAG์—์„œ ๊ฐ€์žฅ ํ”ํ•œ ์ €์žฅ์†Œ ํƒ€์ž…์ด๋‹ค.

Conclusion

Effective context engineering for AI agents
๐ŸŽ Agentic AI - Memory

๋งˆ๋ฌด๋ฆฌ๋กœ, antropic (claude)์˜ context window๋ฅผ ์‚ดํŽด๋ณด๊ณ  ๋งˆ๋ฌด๋ฆฌํ•˜์ž. ๊ฒฐ๊ตญ ๊ธฐ์–ต์„ ํ•˜๋ ค๋ฉด prompt์— ์ž…๋ ฅ์„ ํ•ด์ฃผ์–ด์•ผ ํ•œ๋‹ค. ๋”ฐ๋ผ์„œ prompt engineering์ด๋ผ๋Š” ์ขŒ์ธก์˜ ์šฉ์–ด๋„ ์กด์žฌํ•œ๋‹ค. โ€œํ”„๋กฌํŠธ๋ฅผ ์–ด๋–ป๊ฒŒ ์“ฐ๊ณ , ์กฐ์ž‘ํ•˜๋Š”์ง€โ€

ํ•˜์ง€๋งŒ agentic ai์‹œ๋Œ€์—์„œ ์‚ฌ์šฉํ•ด์•ผํ•  tool, ๊ธฐ์–ตํ•ด์•ผํ•  memory๋Š” ๋งŽ์•„์ง€๊ณ  context window๋Š” ํ•œ์ •์ ์ด๋ฏ€๋กœ, ์–ด๋– ํ•œ ์กฐํ•ฉ์œผ๋กœ tool, memory๋“ค์„ ์ตœ์†Œํ•œ์œผ๋กœ ๋„ฃ์–ด์„œ ์ตœ๋Œ€ํ•œ์˜ ํšจ์œจ์ ์ธ action์„ ํ•  ์ˆ˜ ์žˆ์„๊นŒ? ๋ผ๋Š” context engineering์„ antropic์€ ์ƒˆ๋กญ๊ฒŒ ์ •์˜ํ•œ๋‹ค.

์žฌ๋ฏธ์žˆ๋Š” ์‚ฌ์‹ค: ๋„ˆ๋ฌด ๋งŽ์ด ์งš์–ด ๋„ฃ์–ด๋„ ๋ถ„์‚ฐ๋œ๋‹ค.


๊ธ€์€ ์ปจํ…์ŠคํŠธ๋ฅผ โ€œ๋ฌดํ•œํžˆ ๋„ฃ์œผ๋ฉด ์ข‹๋‹คโ€๊ฐ€ ์•„๋‹ˆ๋ผ, ๋„ฃ์„์ˆ˜๋ก ์ง‘์ค‘๋ ฅ์ด ๋–จ์–ด์งˆ ์ˆ˜ ์žˆ๋Š” ์ž์›์œผ๋กœ ๋ด…๋‹ˆ๋‹ค.
- ํ† ํฐ์ด ๋Š˜์ˆ˜๋ก ๋ชจ๋ธ์ด ์ค‘์š”ํ•œ ์ •๋ณด๋ฅผ ์ž˜ ๋ชป ์ง‘์–ด๋‚ด๋Š” context rot(๋ฌธ๋งฅ ๋ถ€ํŒจ) ๊ฐ™์€ ํ˜„์ƒ์ด ๊ด€์ฐฐ๋œ๋‹ค๊ณ  ์–ธ๊ธ‰ํ•ฉ๋‹ˆ๋‹ค.
- ๊ทผ๋ณธ ์ด์œ ๋Š” ํŠธ๋žœ์Šคํฌ๋จธ๊ฐ€ ํ† ํฐ๋“ค ์‚ฌ์ด์˜ ๊ด€๊ณ„๋ฅผ ๋ชจ๋‘ ๋ณด๋ ค๋‹ค ๋ณด๋‹ˆ(์ƒํ˜ธ attention), ๊ธธ์–ด์งˆ์ˆ˜๋ก โ€œ์ฃผ์˜(Attention) ์˜ˆ์‚ฐโ€์ด ์–‡์•„์ง„๋‹ค๋Š” ์„ค๋ช…์„ ํ•ฉ๋‹ˆ๋‹ค.
๊ทธ๋ž˜์„œ ๋ชฉํ‘œ๋Š” โ€œ๊ฐ€์žฅ ์ž‘์€ ํ† ํฐ์œผ๋กœ ๊ฐ€์žฅ ๋†’์€ ์‹ ํ˜ธ(signal)๋ฅผ ๋งŒ๋“ค๊ธฐโ€์ž…๋‹ˆ๋‹ค
๐ŸŽ Agentic AI - Memory
์ด๋ฒˆ์‹œ๊ฐ„์—๋Š” ์ € context window์—์„œ memory ๋ถ€๋ถ„์€ ์–ด๋””์„œ ๊ฐ€์ ธ์™€์„œ ์ฑ„์›Œ๋„ฃ๋Š”์ง€. DOC1,2๋Š” ์œ ์ €์˜ database์—์„œ ์–ด๋–ป๊ฒŒ ๊ฐ€์ ธ์˜ค๋Š”์ง€์— ๋Œ€ํ•ด์„œ ์•Œ์•„๋ณด์•˜๋‹ค.

โ†’ ๋‹ค์Œ ์‹œ๊ฐ„์€ tool์— ๋Œ€ํ•ด์„œ ์‚ดํŽด๋ณด์ž.