[ AI/anthropic-4 ] Do LLMs Simply Turn Words into Vector Embeddings?

Is language handled simply by turning words into vector embeddings?

To give the conclusion first: probably not.

On the Biology of a Large Language Model

We investigate the internal mechanisms used by Claude 3.5 Haiku — Anthropic's lightweight production model — in a variety of contexts, using our circuit tracing methodology.

transformer-circuits.pub

It is still hard to call the picture complete,
but the results are meaningful nonetheless.

To test this, we collect feature activations on a dataset of paragraphs on a diverse range of topics, with (Claude-generated) translations in French and Chinese. For each paragraph and its translations, we record the set of features which activate anywhere in the context. For each {paragraph, pair of languages, and model layer}, we compute the intersection (i.e., the set of features which activate in both), divided by the union (the set of features which activate in either), to measure the degree of overlap. As a baseline, we compare this with the same "intersection over union" measurement of unrelated paragraphs with the same language pairing.
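The "intersection over union" measurement described in the quote can be sketched in a few lines. This is a minimal illustration, not the paper's code: the function name `feature_iou` and all feature IDs are made up for the example, and treating two empty sets as zero overlap is an assumption chosen here.

```python
from typing import Hashable, Set


def feature_iou(a: Set[Hashable], b: Set[Hashable]) -> float:
    """Intersection over union of two sets of active feature IDs."""
    union = a | b
    if not union:
        # Assumption for this sketch: two empty sets count as no overlap.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical feature IDs, invented for illustration only.
english = {1, 2, 3, 4, 5}          # features active on an English paragraph
french = {3, 4, 5, 6}              # features active on its French translation
unrelated_french = {5, 7, 8, 9}    # features active on a different French paragraph

same_paragraph = feature_iou(english, french)
baseline = feature_iou(english, unrelated_french)
print(f"translation pair: {same_paragraph:.3f}, baseline: {baseline:.3f}")
```

If the overlap for a paragraph and its translation is clearly higher than the baseline for unrelated paragraphs in the same language pair, that points to features that track meaning rather than language.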

Has the model really learned a unified semantic representation that does not depend on language?

English | French | Chinese

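To distinguish surface-level processing from deep understanding:
a simple translation system would only learn word-to-word mappings such as "dog → chien → 狗".

True cross-lingual generalization, by contrast, means the model can represent and manipulate
the concept itself, "a loyal four-legged companion animal", independently of any language.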

The implications of deep semantic representations

A deep semantic representation means that a language-neutral concept space exists inside the model.

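If the same internal features activate when the model processes
"The economy is struggling due to inflation" in English and
"L'économie souffre à cause de l'inflation" in French,
that is strong evidence that the model is not merely translating words
but recognizing the abstract concept of "economic hardship" itself.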

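This is loosely analogous to the human brain, where regions responsible for deeper conceptual understanding,
beyond Broca's area or Wernicke's area, activate regardless of the language being used.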

Testing the tokenization hypothesis

These results show that features at the beginning and end of models are highly language-specific (consistent with the {de, re}-tokenization hypothesis), while features in the middle are more language-agnostic.

"(consistent with the {de, re}-tokenization hypothesis)"

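This part indicates that the results are consistent with a hypothesis the researchers already held.

It is strong evidence that the model goes through a three-stage process of the form
"language → concept → language".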
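One way to picture the quoted finding is to compute the overlap layer by layer and look at where it peaks. The sketch below is only an illustration under stated assumptions: the helper names `iou` and `overlap_by_layer` are hypothetical, and the five layers of feature IDs are toy data invented to mimic the described shape (language-specific at both ends, more shared in the middle), not measurements from the paper.

```python
from typing import List, Set


def iou(a: Set[int], b: Set[int]) -> float:
    """Intersection over union; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap_by_layer(lang_a: List[Set[int]], lang_b: List[Set[int]]) -> List[float]:
    """Per-layer IoU; index i holds the features active at layer i."""
    if len(lang_a) != len(lang_b):
        raise ValueError("both inputs must cover the same layers")
    return [iou(a, b) for a, b in zip(lang_a, lang_b)]


# Hypothetical toy data: 5 layers, feature IDs invented for illustration.
english_layers = [{1, 2}, {3, 4, 10}, {10, 11, 12}, {10, 13, 14}, {15, 16}]
chinese_layers = [{7, 8}, {5, 6, 10}, {10, 11, 13}, {10, 17, 18}, {19, 20}]

profile = overlap_by_layer(english_layers, chinese_layers)
# Index of the layer where the two languages share the most features.
most_shared_layer = max(range(len(profile)), key=profile.__getitem__)
print(profile, most_shared_layer)
```

In this toy profile the overlap is zero at the first and last layers and highest in the middle, which is the qualitative pattern the quoted passage describes.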

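De-tokenization
The process of converting input text from language-specific tokens into a language-neutral representation.
For example, the different tokens "사랑", "love", and "愛" are all converted into the same abstract concept of "affection".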

Re-tokenization
The reverse: the process of converting the language-neutral representation back into concrete output in a specific language.
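The two definitions above can be illustrated with a deliberately simplified toy. This is an analogy only: a real model works with learned features, not lookup tables, and the names `DETOKENIZE`, `RETOKENIZE`, `detokenize`, `retokenize`, and `route`, the concept label "AFFECTION", and the language codes are all assumptions made up for this sketch.

```python
# Toy analogy only: a real model uses learned features, not lookup tables.

# De-tokenization: language-specific token -> language-neutral concept.
DETOKENIZE = {
    "사랑": "AFFECTION",
    "love": "AFFECTION",
    "愛": "AFFECTION",
}

# Re-tokenization: (concept, target language) -> language-specific token.
# The language codes are hypothetical labels for this example.
RETOKENIZE = {
    ("AFFECTION", "ko"): "사랑",
    ("AFFECTION", "en"): "love",
    ("AFFECTION", "zh"): "愛",
}


def detokenize(token: str) -> str:
    """Map a surface token to its shared concept label."""
    return DETOKENIZE[token]


def retokenize(concept: str, lang: str) -> str:
    """Map a concept label back to a token in the requested language."""
    return RETOKENIZE[(concept, lang)]


def route(token: str, target_lang: str) -> str:
    """Language -> concept -> language, in two explicit steps."""
    return retokenize(detokenize(token), target_lang)


print(route("love", "ko"))
```

The point of the design is that no table maps "love" directly to "사랑": every path goes through the shared concept, which is what separates this picture from word-to-word mapping.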

License: Attribution, NonCommercial, NoDerivatives
