
Azure Fabric

Shashi Shankar

Mar 17, 2023

All you need to know about Microsoft Fabric

What is Microsoft Fabric

Microsoft Fabric is a comprehensive analytics solution designed for enterprises, providing a unified platform that encompasses data movement, data science, real-time analytics, and business intelligence. This all-in-one solution integrates a wide range of services, including data lake management, data engineering, and data integration, creating a single environment for analytics workflows.

Microsoft Fabric, offered as Software as a Service (SaaS), integrates components from Power BI, Azure Synapse Analytics, Azure Data Explorer (code-named Kusto), Azure Data Factory, and Azure Data Lake Storage. This integration enables organizations to harness the full potential of their data assets.

Microsoft Fabric simplifies data management and enhances security, providing a unified experience for seamless collaboration and efficient workflows.

With Microsoft Fabric, IT teams can centrally configure core enterprise capabilities, ensuring consistent permissions across all underlying services. Data sensitivity labels are inherited automatically, so they are applied consistently across all components of the suite.


Components of Microsoft Fabric

Microsoft Fabric integrates a comprehensive array of services including Data Factory, Synapse Data Engineering, Synapse Data Science, Synapse Data Warehousing, Synapse Real-time Analytics, Data Activator, and Power BI Copilot.

Fabric lets organizations use serverless processing to run SQL, Spark, and KQL (Kusto Query Language) queries without managing the underlying compute.

 

Data Factory

  • Azure Data Factory offers robust, pre-configured data orchestration capabilities

  • Simplifies development of flexible data workflows tailored to specific needs

  • Cloud-based service enables design, scheduling, and oversight of data pipelines

  • Supports ingestion, transformation, and loading of data from various sources

  • Sources include real-time streams, databases, data warehouses, and data lakes, holding structured, semi-structured, and unstructured data

  • Data Factory designer provides approximately 300 transformation functionalities

  • Includes AI-powered capabilities for advanced data manipulations


Some key features of Azure Data Factory include:


Data Movement:

  • Azure Data Factory supports smooth data migration across on-premises and cloud-based repositories

  • Compatible with diverse data repositories such as Azure Blob Storage, Azure SQL Database, Azure Synapse Analytics, SQL Server, Amazon S3, among others

  • Facilitates both batch and real-time data movement scenarios

  • Built-in activities for data transformation including mapping, filtering, aggregation, and sorting

  • Leverages distributed processing frameworks like Azure Databricks, HDInsight, or Azure Synapse Analytics for scalable data transformation operations
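
The four built-in transformation types mentioned above (mapping, filtering, aggregation, and sorting) can be illustrated with a minimal Python sketch. This is not Data Factory code: the `orders` records, the field names, and the `transform` function are hypothetical, and a real pipeline would configure these steps in the designer rather than write them by hand.

```python
from collections import defaultdict

# Hypothetical order records, standing in for rows copied from a source system.
orders = [
    {"region": "east", "amount_cents": 1250, "status": "complete"},
    {"region": "west", "amount_cents": 800, "status": "cancelled"},
    {"region": "east", "amount_cents": 2750, "status": "complete"},
    {"region": "west", "amount_cents": 4500, "status": "complete"},
]


def transform(rows):
    # Filter: keep only completed orders.
    completed = [r for r in rows if r["status"] == "complete"]
    # Map: reshape each row and convert cents to currency units.
    mapped = [
        {"region": r["region"], "amount": r["amount_cents"] / 100}
        for r in completed
    ]
    # Aggregate: total amount per region.
    totals = defaultdict(float)
    for r in mapped:
        totals[r["region"]] += r["amount"]
    # Sort: largest total first.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


result = transform(orders)
print(result)
```

The cancelled order is dropped by the filter step, so only three rows contribute to the regional totals.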


Orchestration:

  • Azure Data Factory provides a graphical user interface (GUI) for creating data workflows

  • Users can establish dependencies among activities within the workflows

  • Configurable triggers based on time or events can be set up

  • Centralized monitoring dashboard enables oversight of data pipeline execution
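
To make the idea of dependencies among activities concrete, here is a small Python sketch that orders activities so that each one runs only after the activities it depends on. The activity names and the `execution_order` helper are hypothetical, and the standard-library `graphlib` module is used as a stand-in; it is not how Data Factory schedules work internally.

```python
from graphlib import TopologicalSorter

# Hypothetical activities; each maps to the set of activities it depends on.
dependencies = {
    "copy_sales": set(),
    "copy_customers": set(),
    "join_datasets": {"copy_sales", "copy_customers"},
    "load_warehouse": {"join_datasets"},
}


def execution_order(deps):
    # static_order() yields every activity after all of its dependencies.
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

The two copy activities have no dependency on each other, so an orchestrator is free to run them in either order or in parallel.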


Integration with Azure Services:

  • Azure Data Factory integrates seamlessly with various Azure services

  • Supported services include Azure Synapse Analytics, Azure Machine Learning, Azure Data Lake Storage, Azure SQL Database, among others

  • Integration enables users to leverage additional capabilities

  • Additional capabilities include advanced analytics, machine learning, and storage.


Monitoring and Management:

  • Azure Data Factory provides monitoring views for pipeline, activity, and trigger runs

  • Integration with Azure Monitor supports metrics and alerts on pipeline failures

  • Failed runs can be inspected and rerun from the monitoring view


Security and Compliance:

  • Azure Data Factory ensures data security through encryption, authentication, and authorization mechanisms

  • Security measures apply to both transit and at-rest data

  • Facilitates compliance with GDPR, HIPAA, and ISO standards

  • Supports hybrid data integration by enabling connections to on-premises data sources

  • Utilizes self-hosted integration runtimes for on-premises connectivity

  • Allows organizations to leverage existing infrastructure alongside cloud-based data integration capabilities


Synapse Data Engineering:

  • Azure Synapse Data Engineering enables creation of scalable, reliable, and cost-effective data processing solutions

  • Allows organizations to derive actionable insights from data reservoirs

  • Empowers businesses to accelerate time-to-insight and drive innovation

  • Utilizes big data analytics capabilities

  • Azure Synapse Data Engineering is a Microsoft Azure service

  • Facilitates development, deployment, and management of big data processing and analytics solutions

  • Primarily handles large volumes of structured and unstructured data

  • Enables organizations to ingest, transform, and analyze data at scale


Key features and capabilities of Azure Synapse Data Engineering include:


Data Ingestion:

  • Synapse Data Engineering enables efficient data ingestion from diverse sources

  • Supported sources include databases, data lakes, streaming sources, and external systems

  • Supports both real-time and batch data ingestion mechanisms

  • Ensures continuous and reliable data flow


Data Transformation:

  • Provides powerful tools and frameworks for transforming raw data into insights

  • Users can leverage distributed processing engines like Apache Spark and SQL

  • Supports complex data transformations, aggregations, and calculations on large datasets

  • Offers seamless integration with other Azure services

  • Supported services include Azure Data Lake Storage, Azure SQL Database, Azure Blob Storage, and Azure Data Factory

  • Enables orchestration of data workflows and integration with existing data pipelines
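
As a rough illustration of a SQL aggregation over raw rows, the sketch below uses Python's built-in `sqlite3` module as a local stand-in. It is an assumption made for the sake of a runnable example: Synapse runs SQL and Spark on distributed engines, and the `sales` table, its columns, and the `revenue_by_category` function are all hypothetical.

```python
import sqlite3


def revenue_by_category(rows):
    # An in-memory SQLite database stands in for a real SQL engine.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE sales (category TEXT, quantity INTEGER, unit_price REAL)"
        )
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT category, SUM(quantity * unit_price) AS revenue "
            "FROM sales GROUP BY category ORDER BY revenue DESC"
        )
        return cursor.fetchall()
    finally:
        conn.close()


sample = [("books", 2, 10.0), ("games", 1, 50.0), ("books", 3, 5.0)]
summary = revenue_by_category(sample)
print(summary)
```

The same `GROUP BY` query shape carries over to larger engines; what changes is where the data lives and how the work is distributed.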


Scalability and Performance:

  • Azure Synapse Data Engineering is designed for handling massive workloads

  • Scales dynamically based on demand

  • Leverages distributed computing and parallel processing capabilities

  • Delivers high performance and processing speeds for data-intensive tasks


Security and Compliance:

  • Incorporates robust security features and compliance controls

  • Ensures confidentiality, integrity, and availability of data

  • Supports encryption, access controls, auditing, and compliance certifications

  • Meets regulatory requirements and industry standards


Analytics and Visualization:

  • Integrates with Azure Synapse Analytics and Power BI

  • Provides advanced analytics and data visualization capabilities

  • Users can gain valuable insights from their data

  • Offers built-in analytics tools

  • Enables creation of interactive dashboards and reports for decision-making


Synapse Data Science:

  • Azure Synapse Data Science is a cloud-based service by Microsoft Azure

  • Enables collaboration, building, training, and deployment of machine learning models at scale

  • Streamlines end-to-end process of developing and operationalizing machine learning solutions

  • Covers data preparation, exploration, model training, and deployment stages


Key features and capabilities of Azure Synapse Data Science include:


Unified Analytics Platform:

  • Azure Synapse Data Science offers a unified platform for data preparation, exploration, modeling, and deployment

  • Seamlessly integrates with Azure Synapse Analytics

  • Enables users to leverage both SQL-based analytics and advanced machine learning capabilities within the same environment


Scalable Machine Learning:

  • Offers built-in support for popular machine learning frameworks and libraries

  • Includes TensorFlow, PyTorch, scikit-learn, and Spark ML

  • Enables training of machine learning models at scale

  • Utilizes distributed processing capabilities for handling large datasets and complex algorithms efficiently


Collaborative Development:

  • Facilitates collaboration among data scientists, analysts, and developers

  • Includes features such as shared notebooks, version control, and project management tools

  • Teams can collaborate on data science projects

  • Enables sharing of code and insights

  • Tracks changes to models and experiments


Automated Machine Learning:

  • Includes automated machine learning (AutoML) capabilities

  • Enables quick building and deployment of machine learning models

  • Automates feature engineering, model selection, and hyperparameter tuning

  • Allows users to focus on solving business problems rather than fine-tuning algorithms
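
One of the steps AutoML automates is trying several candidate settings and keeping the one that scores best on held-out data. The sketch below shows that idea in its simplest form, a grid search over a single decision threshold. The data, the candidate values, and the function names are hypothetical, and real AutoML systems search over far larger spaces of features, models, and hyperparameters.

```python
# Hypothetical held-out data: (feature value, true label).
validation = [(0.2, 0), (0.45, 0), (0.55, 1), (0.7, 1)]


def accuracy(threshold, data):
    # Predict 1 when the feature reaches the threshold, otherwise 0.
    correct = sum(1 for x, y in data if (1 if x >= threshold else 0) == y)
    return correct / len(data)


def grid_search(candidates, data):
    # Keep the candidate with the highest accuracy (first one wins on ties).
    return max(candidates, key=lambda t: accuracy(t, data))


candidates = [0.3, 0.5, 0.7]
best = grid_search(candidates, validation)
print(best, accuracy(best, validation))
```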


Model Deployment and Management:

  • Supports seamless deployment and management of machine learning models in production environments

  • Enables deployment as web services or batch scoring jobs

  • Facilitates monitoring of model performance and usage

  • Allows integration into existing applications and workflows


Integration with Azure Services:

  • Integrates with Azure Machine Learning, Azure Data Lake Storage, Azure Databricks, and Azure SQL Database

  • Enables leveraging additional capabilities for data preparation, model training, and deployment

  • Allows taking advantage of scalability, security, and compliance features of the Azure platform


Azure Synapse Analytics:

  • Azure Synapse Analytics, formerly Azure SQL Data Warehouse, is a cloud-based analytics service by Microsoft Azure

  • Brings together big data and data warehousing into a single, unified platform

  • Enables organizations to analyze and gain insights from large volumes of data across various sources


Azure Synapse Analytics is used for several purposes:


Data Warehousing:

  • Serves as a centralized repository for structured and semi-structured data from different sources

  • Enables storing massive amounts of data in a scalable and cost-effective manner

  • Makes data accessible for analytics and reporting purposes


Big Data Analytics:

  • Provides built-in support for processing and analyzing big data

  • Utilizes distributed computing technologies like Apache Spark and serverless SQL pools (formerly SQL on-demand)

  • Enables running complex queries and performing advanced analytics

  • Allows deriving valuable insights from large datasets without specialized infrastructure or expertise


Data Integration:

  • Offers seamless integration with Azure services like Azure Data Lake Storage, Azure Blob Storage, Azure Data Factory, and Azure Machine Learning

  • Enables ingestion, transformation, and integration of data from various sources into the data warehouse

  • Creates a unified view of data for analysis


Advanced Analytics and AI:

  • Includes advanced analytics and artificial intelligence (AI) capabilities

  • Supports predictive modeling, machine learning, and real-time analytics

  • Enables users to leverage built-in algorithms, models, and tools

  • Facilitates predictive analysis, anomaly detection, and sentiment analysis on data


Scalability and Performance:

  • Built on massively parallel processing (MPP) architecture

  • Scales dynamically based on workload demands

  • Provides high performance and low-latency querying

  • Enables quick and efficient analysis of large datasets


Security and Compliance:

  • Includes robust security features and compliance controls

  • Protects sensitive data and ensures regulatory compliance

  • Supports encryption, access controls, auditing, and compliance certifications

  • Meets industry standards and regulatory requirements


Synapse Real-time Analytics:

  • Azure Synapse Real-Time Analytics is a fully-managed, feature-rich big data analytics platform

  • Tailored for processing streaming and time-series data

  • Seamlessly integrated within the entire suite of Fabric products

  • Enables smooth data ingestion, transformation, and advanced visualization workflows

  • Handles structured, semi-structured, and unstructured data

  • Provides flexibility and versatility


Azure Synapse Real-time Analytics is used for several purposes:


Real-time Data Ingestion:

  • Allows ingestion of data from various streaming sources

  • Supported sources include IoT devices, sensors, social media feeds, clickstream data, among others

  • Supports high-throughput, low-latency data ingestion

  • Ensures quick processing and analysis of data


Streaming Data Processing:

  • Provides tools and frameworks for processing streaming data in real-time

  • Users can perform transformations, aggregations, enrichments, and other operations on data streams

  • Utilizes technologies such as Apache Spark Streaming, Azure Stream Analytics, and Azure Functions


Complex Event Processing:

  • Supports complex event processing (CEP) for detecting patterns, trends, and anomalies in streaming data

  • Enables users to define rules, queries, and patterns

  • Identifies meaningful events and triggers actions or alerts in response to specific conditions
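
A simple example of a pattern rule is "raise an alert when a reading stays above a threshold for several consecutive events". The Python sketch below implements that one rule over a list of readings. The function name, the threshold, and the sample temperatures are hypothetical, and a production CEP engine would evaluate such rules continuously over an unbounded stream.

```python
def detect_sustained_high(readings, threshold, run_length):
    """Return the indexes at which `run_length` consecutive readings
    have exceeded `threshold`."""
    alerts = []
    streak = 0
    for i, value in enumerate(readings):
        # Extend the streak on a high reading, reset it otherwise.
        streak = streak + 1 if value > threshold else 0
        if streak == run_length:
            alerts.append(i)
    return alerts


temps = [70, 82, 85, 79, 90, 91, 93, 95]
alerts = detect_sustained_high(temps, threshold=80, run_length=3)
print(alerts)
```

The alert fires once per streak, at the moment the streak reaches the required length, rather than on every later reading in the same streak.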


Real-time Analytics and Visualization:

  • Enables real-time analytics and visualization on streaming data

  • Users can create dashboards, reports, and visualizations

  • Monitors and analyzes data streams in real-time

  • Facilitates gaining insights and making informed decisions quickly


Integration with Azure Services:

  • Seamlessly integrates with other Azure services like Azure Data Lake Storage, Azure Blob Storage, Azure Event Hubs, Azure IoT Hub, and Azure Machine Learning

  • Enables users to leverage additional capabilities for data storage, data processing, and machine learning in their real-time analytics workflows


Scalability and Performance:

  • Built on a scalable and distributed architecture

  • Scales dynamically based on workload demands

  • Provides high-throughput, low-latency processing of data streams

  • Enables users to handle large volumes of data efficiently


Azure Synapse Analytics:

  • Overarching platform for data analytics, including real-time analytics

  • Encompasses all services and tools for data analytics

  • Provides a unified environment for ingesting, processing, analyzing, and visualizing data

  • Supports data from various sources


Azure Stream Analytics:

  • Azure Stream Analytics is a real-time event processing engine

  • Analyzes and processes streaming data from sources like IoT devices, sensors, and social media feeds

  • Supports complex event processing (CEP) and SQL-like queries

  • Capable of filtering, aggregating, and transforming data in real-time
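
Windowed aggregation is a common shape for the SQL-like queries mentioned above: events are grouped into fixed, non-overlapping time buckets (tumbling windows) and summarized per bucket. The following Python sketch computes per-window averages over a small batch of events. It is only an illustration of the concept, with hypothetical names and data, and it processes a finished list rather than a live stream.

```python
from collections import defaultdict


def tumbling_window_avg(events, window_seconds):
    """events: iterable of (timestamp_seconds, value).
    Returns {window_start: average value in that window}."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ts, value in events:
        # Each event belongs to exactly one window.
        start = (ts // window_seconds) * window_seconds
        sums[start] += value
        counts[start] += 1
    return {start: sums[start] / counts[start] for start in sorted(sums)}


events = [(1, 10.0), (4, 20.0), (11, 30.0), (12, 50.0), (25, 5.0)]
averages = tumbling_window_avg(events, 10)
print(averages)
```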


Integration with Azure Event Hubs:

  • Azure Event Hubs is a scalable event ingestion service

  • Can receive and process millions of events per second

  • Seamlessly integrates with Azure Synapse Real-Time Analytics

  • Ingests streaming data from various sources

  • Routes data to downstream processing systems


Integration with Azure IoT Hub:

  • Azure IoT Hub is a managed service for connecting, monitoring, and managing IoT devices

  • Azure Synapse Real-Time Analytics can integrate with Azure IoT Hub

  • Ingests telemetry data from IoT devices in real-time

  • Enables analysis and processing of data


Integration with Azure Databricks:

  • Azure Databricks is a managed Apache Spark-based analytics platform

  • Used for big data processing and machine learning

  • Azure Synapse Real-Time Analytics can leverage Azure Databricks

  • Enables advanced analytics and machine learning on streaming data


Integration with Azure Data Lake Storage:

  • Azure Data Lake Storage is a scalable and secure data lake service

  • Used for storing and managing big data

  • Azure Synapse Real-Time Analytics can integrate with Azure Data Lake Storage

  • Enables storing and analyzing streaming data at scale


Integration with Power BI:

  • Power BI is a business intelligence and analytics platform

  • Provides interactive visualizations and reports

  • Azure Synapse Real-Time Analytics can integrate with Power BI

  • Visualizes and explores streaming data in real-time

  • Enables users to gain insights and make data-driven decisions


Azure Data Explorer (Code-Named Kusto):

  • Azure Data Explorer, code-named Kusto, is a fast, fully managed data analytics service by Microsoft Azure

  • Designed for analyzing large volumes of structured, semi-structured, and unstructured data in real-time

  • Optimized for ad-hoc queries, interactive analytics, and time-series analysis

  • Well-suited for use cases such as log and telemetry analytics, IoT data analysis, and application performance monitoring


Key features of Azure Data Explorer include:

 

Scalability:

  • Built on a distributed, columnar storage architecture

  • Scales horizontally to handle massive volumes of data

  • Capable of ingesting and analyzing petabytes of data in real-time

  • Enables organizations to gain insights from large datasets quickly and efficiently


Query Language:

  • Kusto Query Language (KQL) is a powerful and intuitive query language

  • Used for querying and analyzing data in Azure Data Explorer

  • Supports a wide range of data manipulation and analysis functions

  • Includes filtering, aggregating, joining, and visualizing data
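
To give a feel for a filter-then-aggregate query, the sketch below shows an illustrative KQL query in a comment and an equivalent computation in plain Python. The `Logs` table, its `Level` and `Service` columns, and the sample rows are hypothetical, and the Python code does not call Azure Data Explorer; it only mirrors what the query would compute.

```python
from collections import Counter

# Illustrative KQL (table and column names are hypothetical):
#   Logs
#   | where Level == "Error"
#   | summarize ErrorCount = count() by Service
#   | order by ErrorCount desc

logs = [
    {"Service": "api", "Level": "Error"},
    {"Service": "web", "Level": "Info"},
    {"Service": "api", "Level": "Error"},
    {"Service": "web", "Level": "Error"},
    {"Service": "db", "Level": "Info"},
]


def error_counts(rows):
    # where Level == "Error", then count per Service.
    counts = Counter(r["Service"] for r in rows if r["Level"] == "Error")
    # most_common() returns (service, count) pairs, highest count first.
    return counts.most_common()


top_errors = error_counts(logs)
print(top_errors)
```

Each pipe stage in the KQL comment maps to one step in the Python function, which is a large part of why the language reads as a left-to-right data pipeline.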


Time-Series Analysis:

  • Optimized for time-series analysis

  • Ideal for analyzing data with timestamped events or time-based metrics

  • Provides built-in support for time-based queries

  • Offers windowing functions for advanced analysis

  • Includes advanced time-series analysis techniques
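
A sliding-window moving average is one of the simplest windowing operations used to smooth time-series data. The Python sketch below is a minimal illustration with a hypothetical function name and sample values; it is not the service's own implementation.

```python
from collections import deque


def moving_average(values, window):
    """Average of the most recent `window` values, emitted once the
    window is full."""
    buf = deque(maxlen=window)  # old values fall out automatically
    out = []
    for v in values:
        buf.append(v)
        if len(buf) == window:
            out.append(sum(buf) / window)
    return out


smoothed = moving_average([2, 4, 6, 8, 10], 3)
print(smoothed)
```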


Integration with Azure Services:

  • Azure Data Explorer integrates seamlessly with Azure Monitor, Azure IoT Hub, Azure Data Factory, and Azure Stream Analytics

  • Enables ingestion of data from various sources into Azure Data Explorer for analysis and visualization

  • Users can leverage additional Azure capabilities for data processing and analytics

  • Includes robust security features and compliance controls

  • Protects sensitive data and ensures regulatory compliance

  • Supports encryption, access controls, auditing, and compliance certifications


Cost-Effective:

  • Azure Data Explorer offers a consumption-based pricing model

  • Users pay only for the resources they consume

  • Cost-effective for analyzing large volumes of data

  • Eliminates worries about upfront infrastructure costs or over-provisioning

  • Commonly used for real-time analytics, log and telemetry analytics, monitoring and diagnostics, IoT data analysis, and application performance monitoring

  • Provides a powerful and flexible platform for gaining insights from large datasets

  • Drives data-driven decision-making across the organization


Power BI Copilot:

  • Copilot in Power BI integrates generative AI functionalities

  • Accelerates exploration and dissemination of insights from large datasets

  • Users can articulate specific insights or interrogate data

  • Tool rapidly analyzes and extracts pertinent information

  • Generates visually captivating reports

  • Translates data into actionable insights in real-time

  • Streamlines decision-making and knowledge-sharing workflows

  • Users articulate desired visualizations and insights, leaving the tool to handle the remainder


Data Activator:

  • Data Activator is a monitoring tool in Microsoft Fabric

  • It monitors data in real-time

  • It can detect an alert condition and trigger an action, such as sending an email alert or kicking off a workflow

  • Its configuration does not require any coding

  • It can be configured to take actions when patterns or conditions are detected in changing data.

  • It is currently in preview


Components of automated triggers


Event:

  • All data sources are treated as streams of events.

  • An event is an observation about the state of an object, with some identifier for the object itself, a timestamp, and the values for fields you’re monitoring

  • It also integrates with Power BI for identifying real-time events


Business Objects

  • Business objects are data related to physical entities that are monitored.

  • Examples: a package, an inventory level, the temperature in a temperature-sensitive zone


Triggers

  • Triggers are conditions that are being monitored on a business object


Properties

  • These are the configuration properties for the condition being monitored and the action to be taken
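
The four pieces described above (events, business objects, triggers, and properties) can be modeled in a few lines of Python. This is a conceptual sketch only: Data Activator is configured without code, and the `Event` and `Trigger` classes, their field names, and the temperature threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Event:
    object_id: str             # identifies the business object
    timestamp: float           # when the observation was made
    values: Dict[str, float]   # monitored fields and their values


@dataclass
class Trigger:
    # Properties: which field to watch, the limit, and the action to take.
    field_name: str
    threshold: float
    action: Callable[[Event], None]

    def evaluate(self, event: Event) -> bool:
        value = event.values.get(self.field_name)
        if value is not None and value > self.threshold:
            self.action(event)
            return True
        return False


sent = []  # stands in for an email or workflow system
trigger = Trigger("temperature_c", 8.0, lambda e: sent.append(f"alert:{e.object_id}"))
events = [
    Event("package-1", 0.0, {"temperature_c": 5.0}),
    Event("package-2", 1.0, {"temperature_c": 9.5}),
]
fired = [trigger.evaluate(e) for e in events]
print(fired, sent)
```

Here each package is a business object, each reading is an event, and the trigger's properties decide both when it fires and what it does.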

techiesubnet.com
