Findings (2)
- As a Korean lab, Kakao Brain works in a national context that centers a non-Western language and culture (Korean), which incidentally resists Anglophone defaults.
- Korean folk knowledge and Hangul-script traditions are implicitly part of the training corpus for Korean NLP work.
Gaps (3)
- No public evidence of engagement with Indigenous data sovereignty frameworks such as the CARE Principles for Indigenous Data Governance.
- No acknowledgment of the distinction between national-majority Korean culture and genuinely Indigenous or minority embodied knowledge systems.
- Image-text datasets (e.g. COYO-700M) are large web-scraped corpora with no documented consent or sovereignty consideration for any community whose imagery is captured.
Justification
Homepage unreachable; the assessment rests on public knowledge of Kakao Brain's open releases (KoGPT, minDALL-E, COYO-700M, Karlo). Centering Korean is culturally significant, but it is a national-majority stance, not Indigenous-knowledge stewardship. Large web-scraped multimodal corpora run counter to CARE-aligned data practice, hence a low score. The score is not at the floor because Korean-language centering does displace some Anglophone universalism.