Introduction to Snowflake

Shashi Shankar

Mar 17, 2023

Up and Running in a flash with Snowflake Data Lake

What is Snowflake?

Intro
  • Software as a Service

  • Hardware and software managed by the Snowflake team.

  • Data Warehouse hosted on public clouds – AWS, Azure and Google Cloud.

  • No hardware or software to buy or maintain – near-zero setup and maintenance.

  • As a Snowflake customer, you simply sign up, load your data and start querying.

  • Cannot be hosted on a private cloud or on-premises.

  • Separate layers for storage, compute and services. 

  • Data stored in columnar, compressed format in micro-partitions. 

  • Low cost of ownership.

  • Supports a variety of file formats – CSV, JSON, Avro, ORC, Parquet and XML.

  • Supported programming languages – SQL, Python and Java.

  • Tools – Web UI and the SnowSQL command-line interface (CLI).

  • Interfaces – ODBC, JDBC, Python, Spark, Node.js, PHP, .NET and several third-party partner integrations.

  • External tables allow reading data stored outside of Snowflake storage.
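
As a rough illustration of how the file formats listed above come into play when loading data, the sketch below builds a COPY INTO statement in Python. The table name `orders` and the stage name `my_stage` are hypothetical, and the helper only produces the SQL text; running it still requires a Snowflake session through one of the tools or interfaces listed above.

```python
import re

# File format types accepted in the FILE_FORMAT clause, matching the list above.
FILE_TYPES = {"CSV", "JSON", "AVRO", "ORC", "PARQUET", "XML"}

# Conservative check for unquoted, optionally dotted names (an assumption of this sketch).
_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_$.]*$")


def copy_into(table: str, stage: str, file_type: str) -> str:
    """Build a COPY INTO statement that loads staged files into a table."""
    kind = file_type.upper()
    if kind not in FILE_TYPES:
        raise ValueError(f"unsupported file type: {file_type}")
    for name in (table, stage):
        if not _NAME.match(name):
            raise ValueError(f"not a plain identifier: {name}")
    return f"COPY INTO {table} FROM @{stage} FILE_FORMAT = (TYPE = '{kind}');"


# 'orders' and 'my_stage' are made-up names used only for illustration.
print(copy_into("orders", "my_stage", "csv"))
# prints: COPY INTO orders FROM @my_stage FILE_FORMAT = (TYPE = 'CSV');
```

Semi-structured files such as JSON are typically loaded into a VARIANT column, so the target table would need to be defined accordingly.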

Ease of Use 
  • SQL-based data warehouse.

  • Full transactional consistency (ACID).

  • Supports structured and semi-structured data formats – CSV, JSON, Parquet, ORC, XML.

  • Provides data loading and unloading tools, including data pipelines.

  • Snowflake handles all aspects of authentication, configuration, resource management, data protection, availability, optimization, etc.

  • Every workload gets the same copy of data. 

  • Every workload can get its own compute environment for data processing. 

  • Zero-copy cloning.
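
To make zero-copy cloning a little more concrete, here is a minimal Python sketch that builds a CREATE ... CLONE statement. The object names are hypothetical and the helper only generates the SQL text; it is one way to illustrate the idea, not a prescribed workflow.

```python
import re

# Unquoted identifier with up to three dotted parts, e.g. database.schema.table.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*(\.[A-Za-z_][A-Za-z0-9_$]*){0,2}$")


def clone_statement(object_type: str, source: str, target: str) -> str:
    """Build a CREATE ... CLONE statement for a table, schema or database."""
    kind = object_type.upper()
    if kind not in {"TABLE", "SCHEMA", "DATABASE"}:
        raise ValueError(f"unsupported object type: {object_type}")
    for name in (source, target):
        if not _IDENT.match(name):
            raise ValueError(f"not a plain identifier: {name}")
    # The clone references the source's existing storage; no data is copied up front.
    return f"CREATE {kind} {target} CLONE {source};"


# Hypothetical names: clone a production table into a development copy.
print(clone_statement("table", "sales.public.orders", "sales.public.orders_dev"))
# prints: CREATE TABLE sales.public.orders_dev CLONE sales.public.orders;
```

Because the clone initially shares storage with its source, it is a cheap way to give a workload its own writable copy for testing.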

Scalability 

No hardware or software needs to be bought or configured.

Scaling up or down takes only a few clicks.

Enables faster time-to-market.

Supports a variety of workloads (data engineering, data science, analytics, BI, metadata management).

Unlimited storage scalability.

The Elastic Warehouse concept provides easy vertical and horizontal scaling, along with compute isolation between workloads.
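
The vertical and horizontal scaling mentioned above can also be done in SQL rather than through the UI. The Python sketch below builds the two ALTER WAREHOUSE statements involved; the warehouse name `analytics_wh` is hypothetical, the size list is not exhaustive, and multi-cluster settings are assumed to be available in the account's edition.

```python
import re

# A subset of warehouse sizes; larger sizes exist but are left out of this sketch.
SIZES = ("XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE", "XXLARGE")

_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")


def scale_up(warehouse: str, size: str) -> str:
    """Vertical scaling: change the size of a single warehouse."""
    if not _IDENT.match(warehouse):
        raise ValueError(f"not a plain identifier: {warehouse}")
    size = size.upper()
    if size not in SIZES:
        raise ValueError(f"unknown size: {size}")
    return f"ALTER WAREHOUSE {warehouse} SET WAREHOUSE_SIZE = '{size}';"


def scale_out(warehouse: str, min_clusters: int, max_clusters: int) -> str:
    """Horizontal scaling: let a multi-cluster warehouse add or remove clusters."""
    if not _IDENT.match(warehouse):
        raise ValueError(f"not a plain identifier: {warehouse}")
    if not 1 <= min_clusters <= max_clusters:
        raise ValueError("expected 1 <= min_clusters <= max_clusters")
    return (
        f"ALTER WAREHOUSE {warehouse} SET "
        f"MIN_CLUSTER_COUNT = {min_clusters} MAX_CLUSTER_COUNT = {max_clusters};"
    )


# 'analytics_wh' is a made-up warehouse name.
print(scale_up("analytics_wh", "large"))
print(scale_out("analytics_wh", 1, 3))
```

Scaling up helps individual heavy queries, while scaling out helps when many queries run concurrently.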

Availability and Recoverability

Time Travel – a point-in-time rollback feature.

99.9% availability and failover capability.

Automatic backup, plus cross-zone and cross-cloud replication.

DR solutions – database replication and failover.

Multi region replication between Snowflake accounts.

Replication possible across AWS, Azure and Google Cloud.
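
As an illustration of the Time Travel feature listed in this section, the Python sketch below builds a query that reads a table as it was some seconds ago, and a statement that restores that state into a new table. The table names are hypothetical, and how far back you can go depends on the retention period configured for the table.

```python
import re

_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*(\.[A-Za-z_][A-Za-z0-9_$]*){0,2}$")


def _check(name: str, seconds_ago: int) -> None:
    """Validate the inputs shared by both helpers."""
    if not _IDENT.match(name):
        raise ValueError(f"not a plain identifier: {name}")
    if not isinstance(seconds_ago, int) or seconds_ago <= 0:
        raise ValueError("seconds_ago must be a positive integer")


def time_travel_query(table: str, seconds_ago: int) -> str:
    """Read the table as it looked `seconds_ago` seconds before now."""
    _check(table, seconds_ago)
    return f"SELECT * FROM {table} AT(OFFSET => -{seconds_ago});"


def restore_as_of(table: str, restored: str, seconds_ago: int) -> str:
    """Materialize a past state of the table as a new table via a clone."""
    _check(table, seconds_ago)
    _check(restored, seconds_ago)
    return f"CREATE TABLE {restored} CLONE {table} AT(OFFSET => -{seconds_ago});"


# 'orders' and 'orders_restored' are made-up names; 300 seconds = 5 minutes.
print(time_travel_query("orders", 300))
print(restore_as_of("orders", "orders_restored", 300))
```

Restoring into a separate table, rather than overwriting the original, leaves room to compare the two before switching over.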

Security   

End-to-end data encryption.

Customer-managed encryption keys.

Column-level access control through data masking and tokenization.

Authentication options – MFA, federated authentication, SSO and OAuth.

Role-based access control.

Compliant with HIPAA, PCI DSS and other regulatory requirements.

Secure views provide additional security.

Reader accounts provide secure, read-only data sharing.
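
To show how column-level masking ties in with role-based access control, the Python sketch below builds a masking policy that reveals a string column only to selected roles, and the statement that attaches it to a column. The policy, role, table and column names are all hypothetical, and masking policies are assumed to be available in the account's edition.

```python
import re
from typing import Sequence

_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*(\.[A-Za-z_][A-Za-z0-9_$]*){0,2}$")


def _ident(name: str) -> str:
    """Return the name unchanged if it is a plain identifier, else raise."""
    if not _IDENT.match(name):
        raise ValueError(f"not a plain identifier: {name}")
    return name


def masking_policy(policy: str, allowed_roles: Sequence[str], mask: str = "*****") -> str:
    """Build a policy that shows the real value only to the allowed roles."""
    if not allowed_roles:
        raise ValueError("at least one role is required")
    roles = ", ".join(f"'{_ident(r).upper()}'" for r in allowed_roles)
    safe_mask = mask.replace("'", "''")  # escape single quotes in the SQL literal
    return (
        f"CREATE OR REPLACE MASKING POLICY {_ident(policy)} "
        f"AS (val STRING) RETURNS STRING ->\n"
        f"  CASE WHEN CURRENT_ROLE() IN ({roles}) THEN val ELSE '{safe_mask}' END;"
    )


def apply_policy(table: str, column: str, policy: str) -> str:
    """Attach an existing masking policy to one column of a table."""
    return (
        f"ALTER TABLE {_ident(table)} MODIFY COLUMN {_ident(column)} "
        f"SET MASKING POLICY {_ident(policy)};"
    )


# All names below are made up for illustration.
print(masking_policy("email_mask", ["hr_admin"]))
print(apply_policy("employees", "email", "email_mask"))
```

With the policy attached, the same query returns real or masked values depending on the role that runs it, so no separate masked copy of the data is needed.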

Data Sharing

Secure, seamless data sharing both within and outside of the organization.

Column-level security using masking and tokenization.

Data Marketplace.

Performance

Smart use of data and metadata caching.

Clustering of elastic warehouses.

Snowflake-managed materialized views improve performance for complex queries.

Snowflake-managed micro-partitioning and columnar compressed storage provide faster query performance.

techiesubnet.com
