Getting started with Lakehouse in Microsoft Fabric: A how-to guide

March 29, 2024

The launch of Microsoft Fabric has changed how organizations manage large data workloads. With Microsoft Fabric Lakehouse, organizations get a unified data repository that stores and manages both structured and unstructured data. This Lakehouse architecture removes the need to maintain separate data lakes and data warehouses, which makes for a more agile approach to data management.

Because Lakehouse is built on the Apache Spark engine, enterprises can process very large data volumes with the full range of Spark capabilities.

Ready to get started with Lakehouse in Microsoft Fabric? This blog outlines how to get started with Lakehouse and highlights the different ways to create a Lakehouse within the Microsoft Fabric environment.

Introduction to Microsoft Fabric Lakehouse

The Lakehouse in Microsoft Fabric is a data architecture platform for storing, managing, and analyzing structured and unstructured data in a single location. This flexible and scalable solution lets organizations analyze large volumes of data using a variety of tools and frameworks. Its integration with other data management and analytics tools provides a comprehensive solution for addressing data engineering, analytics, and data science challenges.

Fabric’s Lakehouse stores its tables as Delta tables, so data in the Lakehouse is kept in a consistent format. Once users import data, they can create Spark notebooks to write code and explore their data to extract valuable insights for their organization. Users can also use the SQL analytics endpoint to integrate Lakehouse data with other applications for broader analysis.

How to build a Lakehouse in Microsoft Fabric?

Creating a Lakehouse in Microsoft Fabric is quick and easy. Before you start, you need a Fabric workspace. Once the workspace is in place, you can create a Lakehouse in any of the following ways:

Option 1: Data engineering homepage

From the Data Engineering homepage, select the Lakehouse card under the New section.

Further reading: Get started with Data engineering in Microsoft Fabric.

Option 2: Workspace view

While exploring the Data Engineering experience, you also have the option of creating a Lakehouse within the workspace view, accessible through the New dropdown menu.

Option 3: Create page

Under the Data Engineering section, you can access an entry point for creating a Lakehouse on the Create page.
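
The three options above all go through the Fabric portal. For scripted setups, a Lakehouse can also be created programmatically. The following is a minimal sketch, assuming the Fabric REST API's lakehouse creation endpoint (a POST to /v1/workspaces/{workspaceId}/lakehouses). The function name, workspace ID, display name, and token are hypothetical placeholders, and the sketch only builds the request without sending it. Check the current API reference before relying on the exact request shape.

```python
import json
import urllib.request

# Base URL of the Fabric REST API (assumed; verify against the API reference).
FABRIC_API = "https://api.fabric.microsoft.com/v1"


def build_create_lakehouse_request(
    workspace_id: str, display_name: str, token: str
) -> urllib.request.Request:
    """Build (but do not send) a request that creates a Lakehouse in a workspace."""
    url = f"{FABRIC_API}/workspaces/{workspace_id}/lakehouses"
    body = json.dumps({"displayName": display_name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # A Microsoft Entra access token with permission to create items.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical placeholder values, for illustration only.
request = build_create_lakehouse_request(
    workspace_id="00000000-0000-0000-0000-000000000000",
    display_name="sales_lakehouse",
    token="<access-token>",
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect or log the call before anything is created in the workspace.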

How to delete a Lakehouse?

Note that a Lakehouse cannot be deleted while it is referenced elsewhere, for instance in a pipeline or in a Real-Time Analytics workflow.

To delete a Lakehouse:

  1. First, go to the OneLake data hub and locate your Lakehouse.
  2. Then, click on the three dots (…) next to the Lakehouse name and choose the Delete option.

This action will erase the Lakehouse along with its corresponding SQL analytics endpoint and semantic model.
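
Deletion can be scripted in the same spirit as the portal steps. The sketch below assumes the Fabric REST API's delete endpoint for lakehouses (a DELETE to /v1/workspaces/{workspaceId}/lakehouses/{lakehouseId}). The function name, the IDs, and the `referenced_by` argument are hypothetical: the caller supplies the list of items known to reference the Lakehouse, and the helper refuses to build the request while that list is non-empty, mirroring the rule stated above. It does not query Fabric for references itself.

```python
import urllib.request
from typing import Sequence

# Base URL of the Fabric REST API (assumed; verify against the API reference).
FABRIC_API = "https://api.fabric.microsoft.com/v1"


def build_delete_lakehouse_request(
    workspace_id: str,
    lakehouse_id: str,
    token: str,
    referenced_by: Sequence[str] = (),
) -> urllib.request.Request:
    """Build (but do not send) a request that deletes a Lakehouse.

    `referenced_by` is a caller-supplied list of items (pipelines, workflows)
    that still use the Lakehouse; this helper is an illustration only.
    """
    if referenced_by:
        names = ", ".join(referenced_by)
        raise ValueError(f"Lakehouse is still referenced by: {names}")
    url = f"{FABRIC_API}/workspaces/{workspace_id}/lakehouses/{lakehouse_id}"
    return urllib.request.Request(
        url,
        method="DELETE",
        headers={"Authorization": f"Bearer {token}"},
    )


# Hypothetical placeholder values, for illustration only.
request = build_delete_lakehouse_request(
    workspace_id="00000000-0000-0000-0000-000000000000",
    lakehouse_id="11111111-1111-1111-1111-111111111111",
    token="<access-token>",
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```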

How can data engineers interact with the Lakehouse items?

There are several approaches available for a data engineer to interact with both the Lakehouse and the data stored within it:

1. The Lakehouse Explorer

The Lakehouse Explorer is the main interface for working with the Lakehouse. From it, you can load data into your Lakehouse, explore data through the object explorer, set Microsoft Information Protection (MIP) labels, and more.

2. Notebooks

Using notebooks, data engineers can write code to read, transform, and write data directly to the Lakehouse, either as tables or folders.
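
As an illustration of the notebook approach, here is a minimal PySpark sketch that reads a CSV file from the Lakehouse and saves it as a Delta table. It assumes a Fabric notebook with a default Lakehouse attached, where a `spark` session is already available; the function name, the file path `Files/raw/sales.csv`, and the table name `sales` are hypothetical.

```python
from typing import Any


def load_csv_into_table(spark: Any, csv_path: str, table_name: str) -> None:
    """Read a CSV file and save its contents as a Delta table.

    `spark` is the SparkSession that the notebook provides (assumed).
    """
    df = (
        spark.read.format("csv")
        .option("header", "true")       # first row holds column names
        .option("inferSchema", "true")  # let Spark infer column types
        .load(csv_path)
    )
    # "overwrite" keeps the sketch rerunnable; use "append" to add rows instead.
    df.write.format("delta").mode("overwrite").saveAsTable(table_name)


# In a notebook cell (hypothetical path and table name):
# load_csv_into_table(spark, "Files/raw/sales.csv", "sales")
```

Passing the session in as a parameter, rather than relying on the notebook's global, keeps the function easy to reuse and to test outside a notebook.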

3. Pipelines

Data engineers can pull data from different sources and load it into the Lakehouse using data integration tools, such as the pipeline copy tool.

4. Apache Spark job definition

Data engineers can create robust applications and manage the execution of compiled Spark jobs in Java, Scala, and Python.

5. Dataflows Gen2

Using Dataflows Gen2, data engineers can ingest and prepare their data.

How to multitask with Lakehouse?

The multitasking feature uses a browser-style tab layout that lets you move between multiple items without juggling separate windows or losing track of tasks. This makes Lakehouse management more efficient and offers the following capabilities:

1. Preserve running operations

In one tab, you can upload data or start a data loading operation while working on other tasks in separate tabs. Operations in progress stay active as you switch between tabs, so your work is not interrupted.

2. Retain your context

When switching between tabs, your selected objects, data tables, or files will stay open and easily accessible, ensuring your data Lakehouse context is always within reach.

3. Non-blocking list reload

The files and tables list reloads without blocking the interface. You can continue working while the list updates in the background, so you always see up-to-date data without interruption.

4. Clearly defined notifications

The toast notifications specify the source Lakehouse, streamlining the tracking of changes and updates within your multitasking environment.

How accessible is the Lakehouse design?

Microsoft places strong emphasis on accessibility so that the Lakehouse design is inclusive and user-friendly for every user. The following are the key initiatives it has implemented to support accessibility.

Screen reader compatibility

Popular screen readers are fully supported, ensuring effective navigation and interaction for visually impaired users on the platform.

Text reflow

The design adapts to various screen sizes and orientations by reflowing text and content dynamically. This improves viewing and interaction with the application across multiple devices.

Keyboard navigation

Enhanced keyboard navigation allows users to navigate the Lakehouse without using a mouse, which improves the experience for individuals with motor disabilities.

Alternative text for images

Every image now features a descriptive alt text, enabling screen readers to communicate meaningful information effectively.

Form fields and labels

Associated labels have been added to all form fields, making data input easier for all users, including screen reader users.

Get started on your Lakehouse journey with Confiz

Lakehouse in Microsoft Fabric offers a compelling solution for organizations seeking to make the most of their data. Whether you’re a data engineer, analyst, or business leader, Microsoft Fabric Lakehouse offers the tools and capabilities to extract valuable insights from your data. All you need to do is create a Lakehouse and get going. It can be a strong fit for simplifying data management, increasing scalability, and enhancing data security.

Do you need help implementing Lakehouse with Microsoft Fabric or exploring the capabilities of Microsoft Fabric? Contact our experts today at marketing@confiz.com and take the first step in managing your data infrastructure.