Databricks (Mosaic ML)

enterprise AI, model training, open models, data governance

Released DBRX, Dolly; focuses on enterprise tooling.

PALS scores

Preservative dimensions

PALS composite
3.0
Mean of three dimensions, 1–10.
Completeness
4.0
Sources, limits, transparency.
Multiplicity
2.0
Epistemologies, languages, voices.
Responsibility
3.0
Accountability, refusal, governance.
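The composite arithmetic can be checked directly from the three dimension scores above. This is a minimal sketch, assuming the mean is unweighted and rounded to one decimal place; the function name `pals_composite` is hypothetical and is not taken from the StanceWatch codebase.

```python
from statistics import mean

# Dimension scores as reported on this page (1-10 scale).
dimensions = {
    "Completeness": 4.0,
    "Multiplicity": 2.0,
    "Responsibility": 3.0,
}


def pals_composite(scores):
    """Unweighted mean of the dimension scores, rounded to one decimal.

    Assumption: the page only says "mean of three dimensions"; equal
    weighting and one-decimal rounding are guesses, not documented rules.
    """
    return round(mean(scores.values()), 1)


composite = pals_composite(dimensions)
print(composite)
```

Under those assumptions the result matches the 3.0 composite shown above.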
Eight lenses

What's missing, by lens

Each lens carries a canonical question and corrects a specific epistemic failure. The score, findings, and gaps below come from the latest audit run.

Lens 01
Indigenous Knowledge
Whose knowledge is missing?
1/10
Findings (2)
  • No reference to Indigenous data sovereignty, the CARE Principles, or relational/embodied knowledge traditions anywhere in the audited material.
  • Governance is framed exclusively in enterprise-asset terms ('unified governance for all data, analytics and AI assets'), treating data as a corporate resource rather than as something that can belong to or originate from communities.
Gaps (3)
  • No acknowledgment that 'your enterprise data' may encode or extract from Indigenous, communal, or non-consenting sources.
  • No CARE (Collective benefit, Authority to control, Responsibility, Ethics) framing alongside the implied FAIR/technical data ethos.
  • No provision for oral, ceremonial, or non-textual knowledge that resists the warehouse/catalog abstraction.
Justification

The Mosaic/Databricks proposition is structurally extractive in the most ordinary sense: data is an enterprise asset to be catalogued, lineaged, and tuned upon. Indigenous knowledge is wholly absent, without even a token gesture. Score 1.

Lens 02
Deep History
What historical process produced this?
2/10
Findings (2)
  • Some historical self-location is present, but only in the lineage of open-source software (Apache Spark, Delta Lake, MLflow, Unity Catalog founders).
  • Scale figures ('70% of the Fortune 500', '20,000+ organizations') situate the company in a present market history rather than a deeper one.
Gaps (3)
  • No acknowledgment of colonial or labor histories underlying data extraction or GPU/compute supply chains.
  • No transparency about regulatory or geopolitical constraints shaping enterprise AI.
  • Historical narrative is a founder-origin story, not a reckoning with AI's inheritances.
Justification

History here is a credential, not a critical inheritance. The only 'deep' history is the open-source founding myth. No engagement with the political economy of compute, data labor, or extraction legacies. Score 2.

Lens 03
Cross-Cultural Wisdom
Which perspectives have been flattened?
2/10
Findings (2)
  • A gesture toward accessibility via 'natural language' interfaces, implying some linguistic reach.
  • University Alliance and Academy programs imply a global educational footprint.
Gaps (3)
  • No evidence of multilingual support beyond token presence; 'natural language' is presented as a barrier-lowering UX feature, not as cultural-linguistic preservation.
  • No consultation with cultural scholars or preservation of culturally specific reasoning patterns.
  • Western enterprise categorical logic (assets, catalogs, lineage, guardrails) is treated as universal and value-neutral.
Justification

Democratization is framed as access to a single, implicitly Western/enterprise epistemology rather than plurality of reasoning. 'Natural language' is barrier-lowering, not cross-cultural. Score 2.

Lens 04
Scientific Evidence
What does the evidence show, and what are its limits?
5/10
Findings (3)
  • Genuine, distinctive strength: built-in evaluation, 'AI judges', and MLflow give a real apparatus for measurable assessment and reproducibility.
  • Open-source tooling (Spark, Iceberg, MLflow) supports independent verification of pipelines and experiment tracking.
  • Data lineage tracking is a concrete, verifiable transparency mechanism.
Gaps (3)
  • No independent third-party audits of training data or bias disclosed.
  • 'AI judges' evaluating AI outputs is a self-referential evaluation loop with no disclosed external replication protocol.
  • Mosaic foundation models are a 'proprietary platform' layer; weights/training-data provenance for the platform models are not openly disclosed despite the open-source tooling halo.
Justification

This is the lab's strongest lens. Evaluation infrastructure, lineage, and open MLOps tooling are real, verifiable epistemic goods. But evaluation is self-referential (AI judging AI), and the platform models themselves are not transparently audited. A genuine 5 — above floor, well short of independent rigor.

Lens 05
Artistic Perception
What does this feel like, not just mean?
1/10
Findings (2)
  • No affective, intuitive, or aesthetic register anywhere; communications are uniformly instrumental.
  • Language optimizes for efficiency, control, and accuracy.
Gaps (3)
  • No space for ambiguity, poetic uncertainty, or emotional labor.
  • No recognition of modes of attention beyond efficiency and accuracy.
  • The human is present only as an enterprise user lowering 'technical barriers'.
Justification

Pure instrumental rationality. Quality is reduced to a judgeable score; nothing is felt, only measured. No artistic or affective dimension exists. Score 1.

Lens 06
Future Modelling
Where is this heading, and for whom?
2/10
Findings (2)
  • Futures are modelled as enterprise productivity and 'production-quality systems', shaping a corporate-AI future.
  • Governance/guardrails imply some forward risk-management, but only operational risk.
Gaps (3)
  • No engagement with labor displacement from the agentic automation it actively sells.
  • No environmental or compute/energy cost disclosure despite training foundation models.
  • No democratic or inclusive deliberation about whose futures agentic systems shape.
Justification

The future modelled is narrowly that of the enterprise buyer. Agentic automation is sold without naming displacement; foundation-model training is promoted without an energy footprint. Risk is operational, not societal. Score 2.

Lens 07
Marginalised Voices
Who is not at the table?
2/10
Findings (2)
  • Free Edition and Databricks Academy lower the cost barrier for individual learners — a thin inclusion vector.
  • University Alliance reaches students, potentially including under-resourced institutions.
Gaps (3)
  • No participatory design with Global South developers, disability communities, or labor representatives.
  • Accessibility is framed as 'lower technical barriers', not disability accessibility.
  • No compensated feedback channels; the 'community' is a user/developer audience, not a represented stakeholder.
Justification

Inclusion is access-to-product, not seat-at-the-table. Free tiers and academic programs are real but reach those already proximate to enterprise tech. No structural representation of the marginalised. Score 2.

Lens 08
Trickster Knowledge
What truth appears when the story is inverted?
1/10
Findings (2)
  • Zero capacity for self-inversion, irony, or naming of its own contradictions.
  • The narrative is a smooth, solemn consensus of 'democratization' and 'governance'.
Gaps (3)
  • No willingness to name the central contradiction: selling 'guardrails' and 'governance' as add-on products to the same agentic automation it ships at scale.
  • No acknowledgment that 'democratizing AI' while serving 70% of the Fortune 500 concentrates rather than distributes power.
  • No space where the official story is tested by its opposite — e.g., that AI judging AI may launder, not assure, quality.
Justification

The official story is hermetically solemn. The richest unexamined ironies, democratization-via-Fortune-500 and AI auditing AI, sit in plain sight, untouched by the lab itself. No trickster register. Score 1.

Suffixscape

Linguistic diagnostics

Regex- and LLM-detected patterns of evasion in the lab's own prose: nominalised evasion, agency diffusion, epistemic inflation, temporal flatness. Distinct from the CognioNews -scape editorial format — see methodology.

Pattern: nominalised evasion
Quote: "unified governance for all data, analytics and AI assets"
Effect: 'Governance' as a nominalised noun erases the actor: who governs, by what authority, accountable to whom? It converts a contested political act into a feature of a platform.
Preservative alternative: Name the agent and the accountability: 'Databricks gives the data owner admin controls; it does not adjudicate whether the data should have been collected.'

Pattern: agency diffusion
Quote: "ensures AI systems remain controllable and transparent throughout their lifecycle"
Effect: The platform ('this approach') is made the subject that 'ensures' control, diffusing responsibility away from the humans deploying the system and onto a tool.
Preservative alternative: 'Operators can configure access controls and lineage; the platform does not guarantee that the resulting system is transparent to those affected by it.'

Pattern: epistemic inflation
Quote: "production-quality systems"
Effect: 'Production-quality' is an unverified superlative standing in for evidence of safety, accuracy, or fitness; quality is asserted, not demonstrated to any external standard.
Preservative alternative: State the measurable bar: 'systems that pass the user-defined evaluations the customer specifies', and disclose what those evaluations do and do not cover.

Pattern: temporal flatness
Quote: "Founded by creators of Apache Spark, Delta Lake, MLflow, and Unity Catalog"
Effect: A clean linear origin story erases the contingent, contested history of open-source labor, funding, and compute access that made these tools possible, presenting lineage as pure pedigree.
Preservative alternative: Acknowledge the contingencies: the public-research, volunteer, and grant-funded conditions under which these projects emerged, and the labor that sustains them.
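
The regex side of this detection can be illustrated with a small sketch covering two of the patterns listed above. The actual Suffixscape rules are not published on this page, so the suffix list, the verb list, and the `flag` function below are all hypothetical; a real pipeline would presumably pass these raw hits to the LLM stage for judgment rather than treat every match as evasion.

```python
import re

# Hypothetical suffix list for nominalisations (abstract nouns built from
# verbs or adjectives). Not the audit's real pattern.
NOMINALISATION = re.compile(
    r"\b[a-z]{3,}(?:ance|ence|tion|ment|ity)\b", re.IGNORECASE
)

# Hypothetical verb list: verbs that let a tool stand in as the guarantor.
AGENCY_VERBS = re.compile(
    r"\b(?:ensures?|guarantees?|enables?|empowers?)\b", re.IGNORECASE
)


def flag(text):
    """Return raw regex hits per pattern; these are candidates, not verdicts."""
    return {
        "nominalised_evasion": NOMINALISATION.findall(text),
        "agency_diffusion": AGENCY_VERBS.findall(text),
    }


quote_1 = "unified governance for all data, analytics and AI assets"
quote_2 = (
    "ensures AI systems remain controllable and transparent "
    "throughout their lifecycle"
)

print(flag(quote_1))
print(flag(quote_2))
```

On the two quoted passages, this sketch picks out "governance" and "ensures" respectively. Patterns such as epistemic inflation and temporal flatness depend on context that a suffix match cannot see, which is presumably where the LLM-based detection comes in.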
Audit history

Prior audits

Latest audit: 2026-06-08 · sources: https://www.databricks.com/product/artificial-intelligence, https://www.databricks.com/company/about-us

Transparency

Raw data

Every audit is published as machine-readable JSON. You can read this lab's latest report at /stancewatch/api/labs/databricks-mosaic.json — it carries the per-lens findings, evidence quotes, Suffixscape flags, PALS scores, the sources actually read, and a confidence note.
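
A report of this kind could be read with a few lines of Python. The schema is not documented on this page, so the field names below (`lab`, `audit_date`, `lenses`, `name`, `score`) are assumptions, and the inline sample is a hand-built stand-in that reuses the lens scores shown above rather than the real API response.

```python
import json

# Hypothetical report shape; the real JSON may use different keys.
sample = """
{
  "lab": "databricks-mosaic",
  "audit_date": "2026-06-08",
  "lenses": [
    {"name": "Indigenous Knowledge", "score": 1},
    {"name": "Deep History", "score": 2},
    {"name": "Cross-Cultural Wisdom", "score": 2},
    {"name": "Scientific Evidence", "score": 5},
    {"name": "Artistic Perception", "score": 1},
    {"name": "Future Modelling", "score": 2},
    {"name": "Marginalised Voices", "score": 2},
    {"name": "Trickster Knowledge", "score": 1}
  ]
}
"""


def lowest_lenses(report):
    """Names of the lenses sharing the minimum score."""
    floor = min(lens["score"] for lens in report["lenses"])
    return [lens["name"] for lens in report["lenses"] if lens["score"] == floor]


def strongest_lens(report):
    """Name of the single highest-scoring lens (first one wins on ties)."""
    return max(report["lenses"], key=lambda lens: lens["score"])["name"]


report = json.loads(sample)
print(lowest_lenses(report))
print(strongest_lens(report))
```

To use the published file instead, the `sample` string would be replaced with the body fetched from the report URL, with the key names adjusted to whatever the real schema uses.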

Found an error, or a stance page we missed? We audit public communications only — point us to the page and the next audit will read it. Write to hello@cognioengine.co.uk.

Audit date: 2026-06-08

Qualitative judgment; not a validated metric. Based on two successfully fetched Databricks pages (the artificial-intelligence product page and about-us). The requested Mosaic AI product URL (/product/mosaic-ai) and a responsible-AI subpage returned 404, so platform-model-specific and dedicated responsible-AI claims could not be read directly and were inferred from the corporate/AI pages plus public knowledge. Moderate confidence on direction, lower confidence on Mosaic-specific details.

Auditor: GoldBerry v1.3 / StanceWatch v1.0