cyberivy

Databricks in 2026: What the $134 Billion Data Platform Actually Offers

May 18, 2026

With a $134 billion valuation and $5.4 billion in annualised revenue, Databricks is one of the largest independent data and AI platforms, and in 2026 its SAP integration, Agent Bricks, and Lakebase have put it in the spotlight. This article walks through the company, its products, and where its strengths and limits lie.

What this is about

Databricks is one of the most visible players in the data and AI market, yet it tends to be under-represented in mid-market conversations. After its funding round in February 2026, the company is valued at $134 billion, according to CNBC. It raised $7 billion in equity and debt and reported an annualised revenue run rate of $5.4 billion, growing roughly 65 percent year over year. AI product revenue alone passed $1.4 billion. CEO Ali Ghodsi has not ruled out a 2026 IPO. That makes this a good moment to take a sober look at what the company actually offers, where its strengths lie, and where a second look pays off.

What Databricks actually offers

The company was founded in 2013 by six UC Berkeley researchers — Ion Stoica, Ali Ghodsi, Andy Konwinski, Matei Zaharia, Reynold Xin, and Patrick Wendell — the same team that built Apache Spark as an open-source project. Ali Ghodsi is CEO today. The central idea Databricks started with is the Lakehouse: an architectural approach that combines the flexibility of a data lake with the structure and performance of a data warehouse. Instead of running two separate systems, you have one platform for traditional SQL analytics, ETL, streaming, machine learning, and AI.

Today's product umbrella is the Databricks Data Intelligence Platform. The most important building blocks as of 2026:

  • Delta Lake as the open table format for the lakehouse layer.
  • Unity Catalog as central governance: permissions, lineage, tags, audit, and attribute-based access control. Additions in 2026 include synchronisation of personal-data tags from SAP Business Data Cloud.
  • Mosaic AI as the AI platform for model training, fine-tuning, vector search, model serving, and agent development. Via Mosaic AI Model Serving, Claude Opus 4.7 as well as GPT-5.5 and GPT-5.5 Pro are now available as Databricks-hosted models.
  • Agent Bricks (introduced in 2025, expanded in 2026) builds AI agents on a customer's own data, generating synthetic training data and task-specific benchmarks automatically. Databricks names AstraZeneca and Hawaiian Electric as early adopters.
  • Databricks SQL for classical BI workloads.
  • Lakebase, a transactional database built on Postgres and integrated into the platform. With it, Databricks natively covers OLTP for the first time and connects operational with analytical workloads.
  • Databricks Apps for deploying data and AI applications directly on the platform.
  • Unity AI Gateway to centrally govern and secure external LLMs and MCP endpoints.
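
The attribute-based access control listed under Unity Catalog can be pictured with a small toy model. The sketch below is plain Python and not Unity Catalog's API; the tag name, the group names, and the `visible_columns` helper are hypothetical. It only illustrates the idea that a rule is attached to a tag once and then applies to every column carrying that tag.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Column:
    name: str
    tags: frozenset = field(default_factory=frozenset)


# Hypothetical policy: a tag maps to the group allowed to read tagged columns.
TAG_POLICY = {"personal_data": "hr_analysts"}


def visible_columns(columns, user_groups):
    """Return the names of the columns the user may read under TAG_POLICY."""
    allowed = []
    for col in columns:
        required = {TAG_POLICY[t] for t in col.tags if t in TAG_POLICY}
        # The user must hold every group required by the column's tags.
        if required <= set(user_groups):
            allowed.append(col.name)
    return allowed


employees = [
    Column("plant_id"),
    Column("shift_count"),
    Column("employee_name", frozenset({"personal_data"})),
]

print(visible_columns(employees, ["planners"]))
print(visible_columns(employees, ["planners", "hr_analysts"]))
```

A planner sees only the untagged columns, while a member of the hypothetical `hr_analysts` group also sees `employee_name`. A newly tagged column would inherit the rule without a per-table grant.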

Strategically the most important move since 2025 has been the SAP partnership: SAP Databricks integrates the platform natively into SAP Business Data Cloud and makes S/4HANA master data available to Databricks workloads without classical ETL. General availability was announced in 2026.

Why it matters

Data architectures in many companies are historically fragmented: a data warehouse for reports, a data lake for raw data, a few notebook servers for data science, and increasingly a model-serving stack for AI. Databricks bundles these layers — with two real effects: less data movement between systems, and a single governance point for privacy and compliance.

Gartner projects that by the end of 2026, more than 50 percent of enterprises will use a lakehouse architecture as the foundation for analytics and AI — up from less than 15 percent in 2022. Even read with the usual scepticism toward analyst forecasts, the direction is clear.

Anyone working in the SAP ecosystem should evaluate the new SAP-Databricks bundle seriously: the classic ETL pipelines from S/4HANA into an analytical environment are, for many corporates, the most expensive part of a data project, and the bundle removes them. Anyone experimenting with AI agents and treating their own data as a competitive asset finds a ready-made pipeline in Agent Bricks and Mosaic AI. And anyone using Snowflake or BigQuery now faces a direct competitor that handles comparable BI workloads via Databricks SQL.

In plain language

Picture your company as a large kitchen. So far you have had three different storage rooms: one with unsorted deliveries (data lake), one with pre-cooked ingredients (data warehouse), and a separate experimentation table for new recipes (machine learning). The trouble: every cook runs between the rooms, nobody really knows where which ingredient lives, and some tomatoes get inventoried twice. A lakehouse is one big kitchen with clear labelling, an inventory list, access rights, and an AI cook who designs new recipes on demand. Databricks sells exactly that kitchen, plus the kitchen staff.

A practical example

A corporate group with an S/4HANA backend and 80 plants worldwide wants to forecast shift demand per location. Today, material data lives in SAP, production counts in an MES, weather forecasts in a third-party table, and sales forecasts in Salesforce. The old pipeline: every source is copied into a BI data warehouse overnight, the data scientist gets a day-old snapshot the next morning, and builds the model on top of that.

With Databricks inside the SAP BDC setup, SAP data becomes visible in the lakehouse layer without anything being physically copied. Salesforce and MES data land in the same Unity Catalog via native connectors. A Mosaic AI model is trained on the platform, a vector search component enriches the forecast with historical text from the service ticket backlog, and an agent built with Agent Bricks produces a daily briefing for plant management. The point is not that any one step would be technically impossible elsewhere; the point is that all of it happens in one tool with one permission model and one audit trail.
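
To make the data flow in this example concrete, here is a minimal sketch in plain Python. It is not Databricks code: the field names, the sample rows, and the `shifts_needed` rule are all hypothetical, and the rule is a naive stand-in for the trained model. It only shows the shape of the task, which is to bring the four sources together per plant and day and derive a shift figure from the combined record.

```python
from collections import defaultdict

# Hypothetical sample rows; in the scenario above they would come from
# SAP, the MES, a third-party weather table, and Salesforce.
material = [{"plant": "P01", "date": "2026-05-18", "open_orders": 120}]
production = [{"plant": "P01", "date": "2026-05-18", "units_built": 950}]
weather = [{"plant": "P01", "date": "2026-05-18", "temp_c": 31.0}]
sales = [{"plant": "P01", "date": "2026-05-18", "forecast_units": 1100}]


def build_features(*sources):
    """Merge rows from all sources into one record per (plant, date)."""
    merged = defaultdict(dict)
    for rows in sources:
        for row in rows:
            merged[(row["plant"], row["date"])].update(row)
    return dict(merged)


def shifts_needed(record, units_per_shift=400):
    """Placeholder for the model: forecast units divided by shift capacity, rounded up."""
    return -(-record["forecast_units"] // units_per_shift)  # integer ceiling


features = build_features(material, production, weather, sales)
row = features[("P01", "2026-05-18")]
print(row)
print(shifts_needed(row))
```

In the old pipeline each of these four lists would be a nightly copy. In the setup described above they would be governed tables in one catalog, so the join key and the permissions live in one place.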

Scope and limits

Three honest caveats.

First, cost. Databricks is powerful on large workloads, but rarely cheap. The mix of compute hours, DBU multipliers per cluster type, model-serving costs and additional storage fees from the hyperscalers is demanding to manage. Anyone starting without FinOps discipline gets surprises at month-end.
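
A rough way to see how these cost components add up is a small estimate per workload. The sketch below is an assumption-laden illustration, not Databricks pricing: every rate, hour count, and DBU figure is made up, and the `Workload` fields are hypothetical names. It treats the monthly bill as platform charges (hours × DBUs per hour × price per DBU) plus the hyperscaler's compute charge plus storage.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    hours_per_month: float
    dbu_per_hour: float        # DBUs the chosen cluster type consumes per hour
    usd_per_dbu: float         # platform rate for this workload type
    cloud_usd_per_hour: float  # VM cost billed separately by the hyperscaler


def monthly_cost(w: Workload) -> float:
    """Platform charge (DBUs) plus the hyperscaler's compute charge."""
    platform = w.hours_per_month * w.dbu_per_hour * w.usd_per_dbu
    cloud = w.hours_per_month * w.cloud_usd_per_hour
    return platform + cloud


# All numbers below are invented for illustration, not list prices.
workloads = [
    Workload("nightly_etl", 120, 8.0, 0.25, 3.0),
    Workload("bi_warehouse", 300, 12.0, 0.5, 0.0),
]
storage_usd = 400.0

total = sum(monthly_cost(w) for w in workloads) + storage_usd
for w in workloads:
    print(f"{w.name}: {monthly_cost(w):,.2f} USD")
print(f"total: {total:,.2f} USD")
```

Even in this toy version, the bill depends on three independent dials per workload, which is why a per-workload breakdown is a sensible first FinOps step.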

Second, cloud dependence. The platform runs on AWS, Azure, and Google Cloud, but a true on-premises variant does not exist. In strictly regulated industries or under sovereignty constraints, this can be a show-stopper that has to be checked before buying.

Third, vendor lock-in through governance. Unity Catalog, Mosaic AI, and Agent Bricks are tightly interwoven. The open layers (Delta Lake, MLflow, Apache Spark) remain portable. The governance and agent layer is less so. Anyone aiming for long-term flexibility should plan abstraction layers and design data models so that a move to a different lakehouse engine remains possible without data loss.
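
One way to picture such an abstraction layer is a narrow interface that application code depends on, with one adapter per engine behind it. The sketch below is a hypothetical illustration in plain Python: `TableStore`, `InMemoryStore`, and `migrate` are invented names, and the in-memory class stands in for a real adapter that would wrap a lakehouse client.

```python
from typing import Protocol


class TableStore(Protocol):
    """Hypothetical minimal interface the application codes against."""

    def read(self, table: str) -> list[dict]: ...
    def write(self, table: str, rows: list[dict]) -> None: ...


class InMemoryStore:
    """Stand-in engine for this sketch; a real adapter would wrap an engine client."""

    def __init__(self) -> None:
        self._tables: dict[str, list[dict]] = {}

    def read(self, table: str) -> list[dict]:
        return list(self._tables.get(table, []))

    def write(self, table: str, rows: list[dict]) -> None:
        self._tables.setdefault(table, []).extend(rows)


def migrate(source: TableStore, target: TableStore, tables: list[str]) -> int:
    """Copy tables between engines through the shared interface; return rows moved."""
    moved = 0
    for name in tables:
        rows = source.read(name)
        target.write(name, rows)
        moved += len(rows)
    return moved


old_engine, new_engine = InMemoryStore(), InMemoryStore()
old_engine.write("shift_forecast", [{"plant": "P01", "shifts": 3}])
print(migrate(old_engine, new_engine, ["shift_forecast"]))
print(new_engine.read("shift_forecast"))
```

The interface covers only the data itself. Permissions, lineage, and agent definitions do not pass through it, which is the part the caveat above warns is harder to move.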

That said, for most data-intensive corporates in 2026, Databricks is no longer a question of "whether" but of "how deep". Anyone serious about the SAP context or about building AI agents will have a hard time avoiding a fair evaluation.

💡 In plain English

Databricks is a platform on which companies can bring their data together in one place, analyse it, and use it for AI. Instead of running separate systems for reports, data lakes, and machine learning, there is a single environment with central permissions and audit logic. In 2026 the offering also covers transactional databases, AI agents, and a deep SAP integration.

Key Takeaways

  • Databricks was founded in 2013 by the creators of Apache Spark, including today's CEO Ali Ghodsi, and now runs the Data Intelligence Platform with the Lakehouse as its core architecture.
  • In February 2026 the company closed a funding round at a $134 billion valuation, with $5.4 billion in annualised revenue and ~65 percent growth.
  • Most important 2026 building blocks: Delta Lake, Unity Catalog (governance/ABAC), Mosaic AI (including hosted Claude Opus 4.7 and GPT-5.5), Agent Bricks, Databricks SQL, Lakebase (Postgres OLTP), Databricks Apps, Unity AI Gateway.
  • SAP Databricks has been natively integrated into SAP Business Data Cloud since 2025; S/4HANA master data is usable without classical ETL.
  • Strong in the combination of analytics, ML, AI agents, and governance; limits are cost, no true on-premises option, and lock-in via the governance and agent layers.

FAQ

What is Databricks in one sentence?

A cloud-based platform for data and AI that, on top of a lakehouse architecture, combines analytics, ETL, streaming, machine learning, and AI agents in a single environment with central governance.

Who is behind Databricks?

Six UC Berkeley researchers, including CEO Ali Ghodsi and Matei Zaharia. The same team also built Apache Spark as an open-source project. Databricks was founded in 2013.

How does Databricks differ from Snowflake or BigQuery?

Snowflake and BigQuery focus more strongly on data warehousing. Databricks additionally covers machine learning, AI agents, streaming, and with Lakebase even transactional workloads — in one platform with shared governance.

Who should consider Databricks?

Mainly data-intensive enterprises, SAP customers, companies with their own data science or AI team, and anyone building AI agents directly on top of operational data. For small data volumes and tight budgets, leaner alternatives exist.
