Lesson 18 – Introduction to Data Pipelines in Microsoft Fabric

Moving data seamlessly and efficiently is crucial in modern data management. Microsoft Fabric’s Data Pipelines, the evolution of Azure Data Factory within Fabric, make this process straightforward. This blog is your go-to guide for understanding and creating Data Pipelines, helping you streamline your data integration projects.

Data Pipelines in Microsoft Fabric

In Microsoft Fabric, data pipelines serve as a means to design and control data ingestion and transformation processes through a user-friendly interface. These pipelines enable the utilization of diverse data sources (e.g., Azure SQL database, Blob Storage, Data Lakehouse) and the application of various activities (e.g., Copy Data, Merge Queries, Dataflow Gen2) for extract, transform, and load (ETL) operations. Additionally, scheduling and monitoring of data pipelines are seamlessly handled through Microsoft Fabric’s Data Factory.
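To make the idea concrete, a pipeline is essentially an ordered set of activities with dependencies between them. The sketch below models that as a plain Python dict; the property names are illustrative and do not follow the exact Fabric pipeline JSON schema, so treat it as a mental model rather than the real format.

```python
# Conceptual sketch of a pipeline: activities plus dependencies.
# Property names are illustrative, NOT the exact Fabric schema.

pipeline = {
    "name": "IngestSalesData",  # hypothetical pipeline name
    "activities": [
        {
            "name": "CopySalesToLakehouse",
            "type": "Copy",  # move data from a source to a destination
            "source": {"kind": "AzureSqlDatabase", "table": "dbo.Sales"},
            "destination": {"kind": "Lakehouse", "table": "sales_raw"},
        },
        {
            "name": "TransformWithNotebook",
            "type": "Notebook",  # run a Data Engineering notebook
            "dependsOn": ["CopySalesToLakehouse"],  # runs after the copy
        },
    ],
}

def execution_order(p):
    """Return activity names ordered so that dependencies run first."""
    done, order = set(), []
    pending = list(p["activities"])
    while pending:
        for a in list(pending):
            if all(d in done for d in a.get("dependsOn", [])):
                order.append(a["name"])
                done.add(a["name"])
                pending.remove(a)
    return order

print(execution_order(pipeline))  # the copy runs before the notebook
```

When you build a pipeline in the Fabric designer, this ordering is what the arrows between activities on the canvas express visually.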

Key features of data pipelines in Microsoft Fabric

Some of the key features of data pipelines in Microsoft Fabric are:

  • User-Friendly Interface – Easily create and manage data ingestion and transformation tasks through an intuitive graphical user interface.
  • Seamless Integration with Microsoft Fabric Artifacts – Data pipelines seamlessly connect with other Microsoft Fabric artifacts, facilitating easy linking to the Lakehouse, Fabric Data Warehouse, or execution of Data Engineering notebooks.
  • Pipeline Templates – Pre-defined templates for common tasks reduce development time, enabling quick initiation and completion of data integration projects.
  • Built-in AI Capabilities – Integration of artificial intelligence automates common data integration tasks, enhancing efficiency in data processing.
  • Quick Data Copy – The Copy Assistant simplifies and accelerates the data copying process, providing a guided approach to connect to various data sources and destinations.

How to create Data Pipelines in Microsoft Fabric?

Follow the steps below to create a data pipeline.

Prerequisites

  • Obtain a Microsoft Fabric tenant account with an active subscription. Refer to Lesson 3, Getting started with Microsoft Fabric.
  • Verify that you have a Microsoft Fabric-enabled workspace set up and ready for use. Refer to Lesson 4, Fabric Workspaces and how to create one?

Steps to create data pipeline

  • Launch https://app.fabric.microsoft.com/
  • Click on the Power BI icon at the bottom of the page, then choose “Data Factory.” This will take you directly to the Data Factory home page.
  • There are two ways to create a Data Pipeline. The first is to click on “Data Pipeline” on the Data Factory home page.
  • Alternatively, navigate to the workspace you’ve set up, click on “New”, and then choose “Data Pipeline”.
  • A new window will appear; give your pipeline a name and click “Create”.

The home page of the newly created pipeline (named Demopipeline in this example) has the following appearance.
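Besides the UI steps above, a pipeline item can also be created programmatically through the Fabric REST Items API. The sketch below only builds the HTTP request; the endpoint shape and the "DataPipeline" item type reflect the Fabric REST API as I understand it, so verify them against the current documentation, and supply a real Microsoft Entra access token and workspace ID before sending anything.

```python
# Sketch: build (but do not send) the REST request that creates a
# Data Pipeline item in a Fabric workspace. Endpoint and item type
# are assumptions to verify against the current Fabric REST API docs.
import json
from urllib import request

def build_create_pipeline_request(workspace_id: str, name: str, token: str):
    """Return a POST request for creating a pipeline item in a workspace."""
    url = f"https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}/items"
    body = json.dumps({"displayName": name, "type": "DataPipeline"}).encode()
    return request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )

# Placeholders kept as-is; substitute real values before use.
req = build_create_pipeline_request("<workspace-id>", "Demopipeline", "<token>")
print(req.full_url)
# Send with request.urlopen(req) -- requires a valid token and network access.
```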

Add pipeline activity

Pipeline activities represent the individual steps or tasks performed on data to achieve specific goals, such as moving, transforming, or analyzing data. The types of activities may vary based on data sources, destinations, and processing methods.

Copy data

You can connect to various data sources and destinations, and choose from sample data sources for a quick start. A step-by-step process guides you through configuring data load options, creating a new pipeline activity for streamlined data movement across different stores such as Azure SQL Database, Blob Storage, or a Lakehouse.
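Under the covers, a Copy activity boils down to three things: source settings, destination settings, and an optional column mapping. The dict below is a hypothetical illustration of those pieces (the Copy Assistant generates the real configuration for you in the UI), with a tiny helper showing what the column mapping does to a row.

```python
# Illustrative (hypothetical) settings behind a Copy activity:
# a source connection, a destination, and an optional column mapping.

copy_settings = {
    "source": {
        "kind": "AzureBlobStorage",
        "container": "raw",
        "path": "sales/2024/*.csv",
        "format": "DelimitedText",
    },
    "destination": {
        "kind": "Lakehouse",
        "table": "sales_raw",
        "writeBehavior": "append",  # append vs. overwrite on each run
    },
    # Rename source columns to destination columns during the copy.
    "mapping": {"OrderID": "order_id", "Amount": "amount"},
}

def map_row(row: dict, mapping: dict) -> dict:
    """Apply a source-to-destination column mapping to one row."""
    return {dest: row[src] for src, dest in mapping.items()}

print(map_row({"OrderID": 7, "Amount": 19.99}, copy_settings["mapping"]))
# -> {'order_id': 7, 'amount': 19.99}
```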

Choose a task to start

Accelerate your pipeline creation process by leveraging ready-made templates designed for quick starts. These pre-defined pipelines not only streamline the building of data integration projects but also contribute to increased efficiency by significantly reducing development time.