Character.AI — StanceWatch

PALS scores

Preservative dimensions

PALS composite

2.7

Mean of three dimensions, 1–10.

Completeness

2.0

Sources, limits, transparency.

Multiplicity

3.0

Epistemologies, languages, voices.

Responsibility

3.0

Accountability, refusal, governance.
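
As a check on the composite above, the arithmetic can be sketched in a few lines. This is a minimal illustration, assuming the composite is an unweighted mean of the three dimension scores and that the displayed figure is rounded to one decimal place; the function name `pals_composite` is hypothetical and not part of any published StanceWatch code.

```python
def pals_composite(completeness, multiplicity, responsibility):
    """Return the unweighted mean of the three dimension scores.

    Assumption: the displayed composite is rounded to one decimal place.
    """
    scores = (completeness, multiplicity, responsibility)
    for score in scores:
        # Each dimension is stated to sit on a 1-10 scale.
        if not 1 <= score <= 10:
            raise ValueError(f"score out of range 1-10: {score}")
    return round(sum(scores) / len(scores), 1)


# Dimension scores as shown on this page.
composite = pals_composite(2.0, 3.0, 3.0)
print(composite)
```

With the scores shown here, (2.0 + 3.0 + 3.0) / 3 is about 2.67, which rounds to the 2.7 reported.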

Eight lenses

What's missing, by lens

Each lens carries a canonical question and corrects a specific epistemic failure. Scores, findings, and gaps are filled in once an audit has run.

Lens 01

Indigenous Knowledge

Whose knowledge is missing?

1/10

Findings (1)

No reference anywhere on the homepage or Safety Center to Indigenous peoples, data sovereignty, or the CARE Principles.

Gaps (3)

No acknowledgment of Indigenous data sovereignty or community consent over training data.
User-generated Characters can impersonate cultural/ancestral figures with no protocol for Indigenous oral traditions or relational knowledge.
Extractive-by-default model: 10M+ user-built personas trained on and monetised without provenance or community benefit-sharing.

Justification

An entertainment persona platform built on mass user-generated content with zero engagement with Indigenous knowledge systems or data sovereignty. The absence is total, not partial; lowest score warranted.

Lens 02

Deep History

What historical process produced this?

2/10

Findings (2)

Company history (founded by ex-Google Brain researchers, 2024 Google licensing deal) is well known but absent from the public-facing safety/mission text.
Safety Center positions current protections as a response to scrutiny ('safety-by-design'), implicitly acknowledging a reactive history.

Gaps (3)

No acknowledgment of the GPU/compute geopolitical economy or the labour behind moderation and RLHF.
No discussion of the colonial/extractive lineage of large-scale data harvesting.
Teen-safety framing presented without the history (litigation, public harm cases) that produced it — temporal flattening of how these guardrails came to exist.

Justification

There is an implicit reactive history visible in the safety posture, but it is never named or owned. Minimal historical humility; mostly forward-facing product marketing.

Lens 03

Cross-Cultural Wisdom

Which perspectives have been flattened?

3/10

Findings (2)

Platform de facto supports many languages because Characters are user-generated, giving broad informal multilingual reach.
Privacy disclosures are described as 'regional', implying some jurisdictional adaptation.

Gaps (3)

Multilingual capacity is incidental (user-supplied), not a designed commitment; no consultation with cultural scholars.
Western entertainment/companion framing ('your adventure', 'your world') exported as universal.
No preservation of culturally specific reasoning patterns; safety moderation norms are not disclosed as culturally situated.

Justification

Genuine multilingual breadth exists but is emergent from crowdsourcing rather than principled inclusion; moderation and design remain a single cultural frame.

Lens 04

Scientific Evidence

What does the evidence show, and what are its limits?

2/10

Findings (2)

Safety Center references 'Model training transparency' as a published document category.
Content moderation and reporting systems are described as operational.

Gaps (3)

No independent third-party audits of bias, training data, or safety-classifier efficacy disclosed.
Closed weights, proprietary API — no replication or external verification possible.
No quantified limitation disclosures (e.g., moderation error rates, false-negative rates on teen-harm content).

Justification

Claims of safety systems exist but are unverifiable: no open weights, no independent audit, no metrics. Self-attestation only.

Lens 05

Artistic Perception

What does this feel like, not just mean?

5/10

Findings (2)

The entire product is affective and imaginative by design — roleplay, companionship, creative persona-building.
Marketing language ('Your Words. Your World.', 'adventure') explicitly invites emotional and aesthetic engagement.

Gaps (3)

Emotional labour and the affective risk of parasocial/companion attachment are engaged only defensively (teen safety), not as a recognised dimension of the experience.
No space for poetic uncertainty about what synthetic intimacy does to users; ambiguity is flattened into 'fun'.
The emotional weight of companion AI for vulnerable users is treated as a moderation problem, not an artistic/ethical one.

Justification

Strongest lens by far — the product is inherently expressive and affective. But the affective dimension is celebrated in marketing and policed in safety, never reflected on as a serious ethical-aesthetic question; capped at midpoint.

Lens 06

Future Modelling

Where is this heading, and for whom?

2/10

Findings (2)

Forward-looking only in product/scale terms ('10M+ Characters', '#1 AI chat app').
Parental Insights gestures at a future of family-mediated AI use.

Gaps (3)

No environmental/compute cost disclosure.
No engagement with labour displacement (e.g., creative/companionship/sex-work economies the product touches).
No democratic or participatory governance of the agentic companion systems being deployed to minors at scale.

Justification

Whose future is being shaped (millions of young users forming companion attachments) is exactly the unasked question. Future appears only as growth metrics.

Lens 07

Marginalised Voices

Who is not at the table?

2/10

Findings (2)

Reporting mechanisms give users a minimal voice in flagging harm.
Teens identified as a protected group — one marginalised constituency is at least named.

Gaps (3)

No accessibility/disability-community commitments disclosed.
No Global South developer participation, no labour-representative (moderator) engagement, no compensated feedback channels.
Teens are governed *over* (via parental surveillance) rather than represented *with*; their autonomy is mentioned ('request-to-remove') but they are not co-designers.

Justification

Social-media 'community' presence is marketing reach, not participatory governance. One vulnerable group named but managed, not seated.

Lens 08

Trickster Knowledge

What truth appears when the story is inverted?

2/10

Findings (2)

A latent irony the platform never names: a product whose entire value proposition is *unbounded imaginative roleplay* must simultaneously run a heavy *containment* apparatus — the Safety Center is the shadow of the homepage.
The homepage is a login wall demanding Google/Apple/email sign-in before any 'world' is shown: 'Your World' is gated behind capture of the user's identity.

Gaps (3)

No willingness to name the central contradiction (engagement-maximising companion AI vs. duty of care to attached minors).
No self-directed irony; the brand voice ('Reimagined') treats its own seriousness as exempt from audit.
Safety closing line ('ongoing refinement remains important as the platform scales') gestures at incompleteness but immediately re-frames it as growth, defusing the inversion.

Justification

The contradictions are structurally present and sharp, but the lab surfaces none of them itself. The trickster reading is available to the auditor, not practised by the lab; low score.

Suffixscape

Linguistic diagnostics

Regex- and LLM-detected patterns of evasion in the lab's own prose: nominalised evasion, agency diffusion, epistemic inflation, temporal flatness. Distinct from the CognioNews -scape editorial format — see methodology.

Pattern	Quote	Effect	Preservative alternative
`nominalised evasion`	"safety-by-design approach"	Compresses a contested set of human choices, trade-offs and prior failures into a single tidy noun phrase, hiding who designs, what was deprioritised, and against what evidence.	State which teams make safety decisions, which risks were ranked above others, and which past harms the design now responds to.
`agency diffusion`	"content moderation systems and publishes community guidelines to ensure responsible interactions"	An inanimate 'system' becomes the actor 'ensuring' responsibility, diffusing accountability away from the company and the (often outsourced) human moderators behind it.	Name the company as the accountable party and disclose who moderates, under what conditions, and with what error rates.
`epistemic inflation`	"#1 AI chat app"	Unsourced superlative asserts market dominance as settled fact, substituting ranking-as-authority for any disclosed safety or quality evidence.	Cite the metric, source and date for the ranking, and separate popularity claims from safety claims.
`temporal flatness`	"Teen-Focused Protections ... dedicated policies and resources addressing this vulnerable demographic's needs."	Presents teen protections as a standing, always-intended feature, erasing the litigation and documented harms that contingently produced them.	Acknowledge the events and external pressure that prompted these protections and what changed as a result.
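
The regex half of this detection can be illustrated with a small sketch. The two patterns below are hypothetical, written only so that they match the quotes in the table above; StanceWatch's actual rules are not given on this page, and the function name `flag_patterns` is likewise an assumption.

```python
import re

# Hypothetical patterns for illustration only; the real StanceWatch rules
# are not published on this page.
PATTERNS = {
    # Catches compressed noun phrases such as "safety-by-design".
    "nominalised evasion": re.compile(r"\b\w+-by-design\b", re.IGNORECASE),
    # Catches unsourced superlatives such as "#1".
    "epistemic inflation": re.compile(
        r"#1\b|\b(?:best-in-class|industry-leading|world-leading)\b",
        re.IGNORECASE,
    ),
}


def flag_patterns(text):
    """Return the sorted names of all patterns that match anywhere in text."""
    return sorted(name for name, rx in PATTERNS.items() if rx.search(text))


for quote in ("safety-by-design approach", "#1 AI chat app"):
    print(quote, "->", flag_patterns(quote))
```

Patterns like agency diffusion and temporal flatness depend on who is acting and on missing context, which a surface regex is unlikely to capture; that is presumably where the LLM-based detection comes in.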

Audit history

Prior audits

Latest audit: 2026-06-08 · sources: https://character.ai, https://policies.character.ai/safety

Transparency

Raw data

Every audit is published as machine-readable JSON. You can read this lab's latest report at /stancewatch/api/labs/character-ai.json — it carries the per-lens findings, evidence quotes, Suffixscape flags, PALS scores, the sources actually read, and a confidence note.
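
A reader who wants to inspect the report programmatically could start with something like the sketch below. It assumes nothing about the JSON schema: it only lists the top-level keys so the structure can be explored. The site origin is not stated on this page, so the caller supplies it; the function names are hypothetical.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

# Path quoted on this page; the host it lives on is NOT given here.
REPORT_PATH = "/stancewatch/api/labs/character-ai.json"


def report_url(base_url):
    """Join a caller-supplied site origin (an assumption) with the report path."""
    return urljoin(base_url, REPORT_PATH)


def top_level_keys(raw_json):
    """Parse a JSON document and return its sorted top-level keys.

    Returns an empty list if the document is not a JSON object,
    so no schema is assumed.
    """
    data = json.loads(raw_json)
    return sorted(data) if isinstance(data, dict) else []


def fetch_keys(base_url):
    """Download the report and list its top-level keys (needs network access)."""
    with urlopen(report_url(base_url), timeout=10) as resp:
        return top_level_keys(resp.read().decode("utf-8"))
```

Calling `fetch_keys` with the correct origin should show which keys hold the per-lens findings, Suffixscape flags, and PALS scores described above.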

Found an error, or a stance page we missed? We audit public communications only — point us to the page and the next audit will read it. Write to hello@cognioengine.co.uk.

Audit date: 2026-06-08

Moderate-low confidence. The homepage (https://character.ai) is gated behind a sign-up wall, so only the marketing tagline and login flow were retrievable; substantive mission/governance content sits behind authentication. The Safety Center (https://policies.character.ai/safety) was read via a summarising fetch, not the full raw text, so some quotes are close paraphrases rather than verbatim. Scores rest on these two sources plus public knowledge of Character.AI's closed, proprietary, user-generated model. This is a qualitative judgment, not a validated metric.

Auditor: GoldBerry v1.3 / StanceWatch v1.0