Data Lakehouse

A data lakehouse unifies the flexibility and low-cost storage of a data lake with the transactional capabilities and structured data management of a data warehouse, enabling diverse analytics and AI workloads on a single platform.

Context for Technology Leaders

For CIOs and Enterprise Architects, the data lakehouse architecture addresses the long-standing challenge of integrating disparate data environments for advanced analytics and machine learning. By handling both structured and unstructured data on a single platform, it streamlines data governance, reduces operational complexity, and supports agile data strategies. Maintaining one governed copy of the data also simplifies compliance with regulations such as GDPR and CCPA.

Key Principles

  • Open Formats: Uses open, standardized formats, such as the Parquet file format and open table formats like Delta Lake, ensuring interoperability and avoiding vendor lock-in for data storage and processing.
  • Transactional Support: Provides ACID (Atomicity, Consistency, Isolation, Durability) guarantees, enabling reliable updates, deletes, and concurrent operations, which is critical for data integrity.
  • Schema Enforcement: Offers flexible schema evolution while enforcing data quality, balancing agility with the structured data needed for BI and reporting.
  • Separation of Storage & Compute: Decouples data storage from processing engines, allowing independent scaling and cost optimization for diverse workloads.

Related Terms

Data Lake · Data Warehouse · Delta Lake · Data Mesh · Cloud Data Platform