Snowflake Interview Questions


1. What is a Snowflake cloud data warehouse?

Snowflake is an analytic data warehouse delivered as a SaaS offering. It is built on a
new SQL database engine with a unique architecture designed for the cloud. The
solution was first available on AWS as a service for loading and analyzing massive
volumes of data. Its most remarkable feature is the ability to spin up any number of
virtual warehouses, so users can run an unlimited number of independent workloads
against the same data without any risk of contention.

2. Is Snowflake an ETL tool?


Snowflake itself is not a dedicated ETL tool, but it is commonly used as the target of a
three-step ETL process:

1. Extract data from the source and create data files. Data files can be in multiple
formats such as JSON, CSV, and XML.
2. Load the data files to an internal or external stage. Data can be staged in a
Snowflake-managed (internal) location, a Microsoft Azure blob container, or an
Amazon S3 bucket.
3. Copy the data into a Snowflake table using the COPY INTO command.
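The staging and loading steps above can be sketched in Snowflake SQL; the file path, stage, and table names here are hypothetical placeholders:

```sql
-- Upload a local file to an internal named stage (run from SnowSQL).
-- PUT compresses the file to gzip by default, hence the .gz suffix below.
PUT file:///tmp/customers_1.csv @my_stage;

-- Copy the staged file into a target table.
COPY INTO mydb.public.customers
  FROM @my_stage/customers_1.csv.gz
  FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);
```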

3. Explain Snowflake ETL?


ETL stands for Extract, Transform, and Load. It is the process used to extract data from
multiple sources and load it into a target database or data warehouse. The sources can
be third-party applications, databases, flat files, etc.

Snowflake ETL is an approach to applying the ETL process for loading the data into the
Snowflake data warehouse or database. Snowflake ETL also includes extracting the
data from the data sources, doing the necessary transformations, and loading the data
into Snowflake.

3. How is data stored in Snowflake?


Snowflake stores data in multiple micro-partitions, which are internally optimized and
compressed. The data is stored in a columnar format in Snowflake's cloud storage. The
stored data objects are neither directly accessible nor visible to users; they can only
be accessed by running SQL queries in Snowflake.

4. How is Snowflake distinct from AWS?


Snowflake offers storage and compute independently, and its storage cost is similar to
that of plain cloud data storage. AWS addresses this with Redshift Spectrum, which
enables querying data directly on S3, though not as seamlessly as Snowflake.
5. What type of database is Snowflake?
Snowflake is built entirely on SQL. It is a columnar-stored relational database that
works well with Excel, Tableau, and many other tools. Snowflake includes its own query
tool and supports multi-statement transactions, role-based security, and other features
expected of a SQL database.

6. Can AWS glue connect to Snowflake?


Definitely. AWS Glue provides a fully managed environment that connects easily with
Snowflake as a data warehouse service. Together, the two solutions let you handle data
ingestion and transformation with more ease and flexibility.

7. Explain Snowflake editions.


Snowflake offers multiple editions depending on your usage requirements.

1. Standard edition - The introductory-level offering, providing unlimited access to
Snowflake's standard features.
2. Enterprise edition - Offers the Standard edition features and services plus
additional features required by large-scale enterprises.
3. Business Critical edition - Also called Enterprise for Sensitive Data (ESD). It offers
higher levels of data protection for organizations with sensitive data.
4. Virtual Private Snowflake (VPS) - Provides the highest level of security for
organizations dealing with financial activities.

8. Define the Snowflake Cluster


In Snowflake, data partitioning is called clustering; it is defined by specifying cluster
keys on a table. The process of managing clustered data in a table is called re-clustering.
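As a sketch, a cluster key can be declared when creating a table or added later with ALTER TABLE; the table and column names below are hypothetical:

```sql
-- Define a cluster key at creation time.
CREATE TABLE sales (sale_date DATE, region STRING, amount NUMBER)
  CLUSTER BY (sale_date, region);

-- Or add/change the cluster key on an existing table;
-- Snowflake then re-clusters the data automatically in the background.
ALTER TABLE sales CLUSTER BY (sale_date, region);
```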

9. Explain Snowflake architecture


Snowflake's data warehouse is built on cloud infrastructure (originally AWS) and is a
true SaaS offering. No software, hardware, ongoing maintenance, or tuning is needed to
work with Snowflake.

Three main layers make the Snowflake architecture - database storage, query
processing, and cloud services.

1. Database storage - In Snowflake, stored data is reorganized into an internally
optimized, compressed, columnar format.
2. Query processing - Virtual warehouses process the queries in Snowflake.
3. Cloud services - This layer coordinates and handles all activities across
Snowflake, including authentication, metadata management, infrastructure
management, access control, and query parsing and optimization.

10. What are the features of Snowflake?


Unique features of the Snowflake data warehouse are listed below:

1. Database and object cloning
2. Support for XML
3. External tables
4. Hive meta store integration
5. Supports geospatial data
6. Security and data protection
7. Data sharing
8. Search optimization service
9. Table streams on external tables and shared tables
10. Result Caching

11. Why is Snowflake highly successful?


Snowflake is highly successful because of the following reasons:

1. It supports a wide variety of technology areas such as data integration, business
intelligence, advanced analytics, security, and governance.
2. It offers cloud infrastructure and supports advanced design architectures, ideal
for dynamic and fast-moving usage patterns.
3. Snowflake supports built-in features like data cloning, data sharing, separation
of compute and storage, and instantly scalable compute.
4. Snowflake eases data processing.
5. Snowflake provides extendable computing power.
6. Snowflake suits various applications, such as an ODS with staged data, data lakes
with a data warehouse, raw marts, and data marts with conformed and modelled
data.

12. Tell me something about Snowflake AWS?


For managing today’s data analytics, companies rely on a data platform that offers
rapid deployment, compelling performance, and on-demand scalability. Snowflake on
the AWS platform serves as a SQL data warehouse, which makes modern data
warehousing effective, manageable, and accessible to all data users. It enables the
data-driven enterprise with secure data sharing, elasticity, and per-second pricing.
13. Describe Snowflake computing.
Snowflake cloud data warehouse platform provides instant, secure, and governed
access to the entire data network and a core architecture to enable various types of
data workloads, including a single platform for developing modern data applications.

14. What is the schema in Snowflake?


Schemas and databases are used to organize the data stored in Snowflake. A schema is
a logical grouping of database objects such as tables and views. The benefits of using
snowflake schemas are that they provide structured data and use less disk space.

15. What are the benefits of the Snowflake Schema?


1. Because the dimension tables are normalized, it uses less disk space.
2. It provides better data quality through reduced redundancy.


16. Differentiate Star Schema and Snowflake Schema?


Star and snowflake schemas are similar in structure; the difference lies in the
dimensions. In a snowflake schema, the dimension tables are normalized into multiple
related tables, while in a star schema each logical dimension is denormalized into a
single table.

17. What kind of SQL does Snowflake use?


Snowflake supports the most common standardized version of SQL, i.e., ANSI SQL, for
powerful relational database querying.

18. What are the cloud platforms currently supported by Snowflake?


1. Amazon Web Services (AWS)
2. Google Cloud Platform (GCP)
3. Microsoft Azure (Azure)

19. What ETL tools do you use with Snowflake?


The following are popular ETL tools used with Snowflake:

1. Matillion
2. Blendo
3. Hevo Data
4. StreamSets
5. Etleap
6. Apache Airflow
Snowflake Advanced Interview Questions
20. Explain zero-copy cloning in Snowflake?
In Snowflake, zero-copy cloning is a feature that lets us create a copy of tables,
schemas, or databases without replicating the actual data. Zero-copy cloning is
performed with the CLONE keyword. Through it, we can work with live production data
and carry out multiple actions on the clone.
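For illustration, cloning works at the table, schema, or database level; the object names below are hypothetical:

```sql
-- Each clone shares the original micro-partitions until either side changes,
-- so no data is physically copied at clone time.
CREATE TABLE orders_dev CLONE orders;
CREATE SCHEMA analytics_dev CLONE analytics;
CREATE DATABASE prod_copy CLONE prod;
```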

21. Explain “Stage” in the Snowflake?


In Snowflake, a stage is an intermediate area used for uploading files. Snowpipe
detects files as they arrive in the staging area and systematically loads them into
Snowflake.

The following stage types are supported by Snowflake:

1. Table stage
2. User stage
3. Internal named stage
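A minimal sketch of creating stages (the names, bucket, and credentials are placeholders):

```sql
-- Internal named stage, managed by Snowflake.
CREATE STAGE my_int_stage;

-- External stage referencing an S3 bucket.
CREATE STAGE my_s3_stage
  URL = 's3://mybucket/data/'
  CREDENTIALS = (AWS_KEY_ID = '<key>' AWS_SECRET_KEY = '<secret>');

-- Table and user stages exist automatically and are referenced as
-- @%mytable (table stage) and @~ (user stage).
```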

22. Explain data compression in Snowflake?


All data loaded into Snowflake is compressed automatically. Snowflake uses modern
data compression algorithms to compress and store the data. Customers pay for the
compressed data, not the raw data size.

23. How do we secure the data in the Snowflake?


Data security plays a prominent role in every enterprise. Snowflake adopts best-in-class
security standards for encrypting and securing customer accounts and the data stored
in Snowflake. It provides industry-leading key management features at no extra cost.

24. Explain Snowflake Time Travel?


The Snowflake Time Travel tool allows us to access historical data at any moment
within a defined retention period, including data that has since been changed or
deleted. With this tool, we can carry out the following tasks:

1. Restore data-related objects that may have been lost unintentionally.
2. Examine data usage and the changes made to the data within a specific time
period.
3. Duplicate and back up data from key points in the past.
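As an illustration, Time Travel is exposed through the AT | BEFORE clause and UNDROP; the table name and timestamp below are hypothetical:

```sql
-- Query the table as it was 5 minutes ago.
SELECT * FROM orders AT (OFFSET => -60 * 5);

-- Query the table as of a specific point in time.
SELECT * FROM orders AT (TIMESTAMP => '2021-06-01 12:00:00'::TIMESTAMP_LTZ);

-- Restore an accidentally dropped table within the retention period.
UNDROP TABLE orders;
```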
25. What is the database storage layer?
Whenever we load data into Snowflake, it organizes the data into a compressed,
columnar, optimized format. Snowflake manages all aspects of data storage, including
compression, organization, statistics, file size, and other properties. The data objects
stored in Snowflake are not directly accessible or visible; they can be accessed only by
running SQL queries through Snowflake.

26. Explain Fail-safe in Snowflake?


Fail-safe is a feature in Snowflake that helps ensure data protection; it plays a vital
role in Snowflake's data protection lifecycle. Fail-safe provides seven days of additional
storage after the time travel retention period ends.

27. Explain Virtual warehouse?


In Snowflake, a virtual warehouse is one or more compute clusters that enable users to
perform operations such as queries, data loading, and other DML operations. Virtual
warehouses provide users with the necessary resources, such as CPU and temporary
storage, for performing various Snowflake operations.
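A minimal sketch of creating and using a virtual warehouse (the name and settings are illustrative):

```sql
CREATE WAREHOUSE my_wh
  WAREHOUSE_SIZE = 'XSMALL'
  AUTO_SUSPEND = 60      -- suspend after 60 seconds of inactivity
  AUTO_RESUME = TRUE;    -- wake up automatically when a query arrives

USE WAREHOUSE my_wh;
```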

28. Explain Data Shares


Snowflake data sharing allows organizations to share their data securely and
immediately. Secure data sharing enables data to be shared between accounts through
Snowflake secure views and database tables.

29. What are the various ways to access the Snowflake Cloud data warehouse?
We can access the Snowflake data warehouse through:

1. ODBC Drivers
2. JDBC Drivers
3. Web User Interface
4. Python Libraries
5. SnowSQL Command-line Client

30. What are the advantages of Snowflake Compression?


Following are the advantages of Snowflake compression:

1. Storage expenses are lower than raw cloud storage because of compression.
2. No storage expenditure for on-disk caches.
3. Approximately zero storage expenses for data sharing or data cloning.
31. Differentiate Fail-Safe and Time-Travel in Snowflake?

Time-Travel: Depending on the Snowflake edition and the time travel settings
configured for the account or object, users can themselves retrieve historical data and
revert objects to an earlier state.

Fail-Safe: The user has no control over data recovery; it applies only after the time
travel period has completed, and recovery is possible only through Snowflake support,
for a fixed period of 7 days. For example, if time travel is set to six days, database
objects can be retrieved for the transaction time plus 6 days of time travel, followed by
the 7-day fail-safe window.

32. Explain Snowpipe in Snowflake?


Snowpipe is a cost-efficient, continuous service used for loading data into Snowflake.
Snowpipe automatically loads data from files as soon as they become available in a
stage. It eases the data loading process by loading the data in micro-batches, making it
available for analysis.
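A sketch of a Snowpipe definition over an external stage (the pipe, table, and stage names are hypothetical); with AUTO_INGEST enabled, cloud event notifications trigger the load as files land:

```sql
CREATE PIPE my_pipe
  AUTO_INGEST = TRUE
AS
  COPY INTO mydb.public.events
  FROM @my_s3_stage
  FILE_FORMAT = (TYPE = 'JSON');
```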


33. What are the advantages of the Snowpipe?


Following are the Snowpipe advantages:

1. Live insights
2. User-friendly
3. Cost-efficient
4. Resilience

Snowflake Developer Interview Questions


34. Explain Micro Partitions?
Snowflake comes with a robust and unique kind of data partitioning known as
micro-partitioning. Data in Snowflake tables is automatically divided into
micro-partitions; this happens transparently on all Snowflake tables.
35. Explain Columnar database?
A columnar database is the opposite of a conventional row-oriented database: it stores
data in columns instead of rows. This eases analytical query processing and offers
greater performance for analytics workloads, which is why columnar storage underpins
much of modern business intelligence.

36. How to create a Snowflake task?


To create a Snowflake task, we use the CREATE TASK command. Creating a task
requires:

1. The CREATE TASK privilege on the schema.
2. The USAGE privilege on the warehouse referenced in the task definition.
3. A SQL statement or stored procedure to run in the task definition.
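The steps above can be sketched as follows; the task, warehouse, and table names are placeholders:

```sql
CREATE TASK daily_refresh
  WAREHOUSE = my_wh
  SCHEDULE = 'USING CRON 0 2 * * * UTC'  -- run daily at 02:00 UTC
AS
  INSERT INTO sales_summary
  SELECT sale_date, SUM(amount) FROM sales GROUP BY sale_date;

-- Tasks are created suspended; resume one to start its schedule.
ALTER TASK daily_refresh RESUME;
```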

37. How do we create temporary tables?


To create a temporary table, use the following syntax:
CREATE TEMPORARY TABLE mytable (id NUMBER, creation_date DATE);

38. Where do we store data in Snowflake?


Snowflake automatically generates metadata for files in internal or external stages.
This metadata is stored in virtual columns and can be queried with a standard SELECT
statement.

39. Does Snowflake use Indexes?


No, Snowflake does not use indexes. This is one of the reasons Snowflake scales so
well for queries.


42. How do we execute the Snowflake procedure?


Stored procedures let us create modular code comprising complicated business logic
by combining multiple SQL statements with procedural logic. Executing a Snowflake
procedure involves the following steps:

1. Run a SQL statement.
2. Extract the query results.
3. Extract the result-set metadata.

43. Does Snowflake maintain stored procedures?


Yes, Snowflake supports stored procedures. A stored procedure is like a function: it is
created once and used many times. It is created with the CREATE PROCEDURE
command and executed with the CALL command. In Snowflake, stored procedures are
written in JavaScript against the JavaScript API, which lets the procedure execute
database operations like SELECT, UPDATE, and CREATE.
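As a sketch, here is a JavaScript stored procedure that runs a query through the API and returns a value (the procedure and table names are hypothetical):

```sql
CREATE OR REPLACE PROCEDURE row_count(table_name STRING)
  RETURNS FLOAT
  LANGUAGE JAVASCRIPT
AS
$$
  // Argument names are uppercased inside the JavaScript body.
  var stmt = snowflake.createStatement({
    sqlText: "SELECT COUNT(*) FROM IDENTIFIER(?)",
    binds: [TABLE_NAME]
  });
  var rs = stmt.execute();
  rs.next();
  return rs.getColumnValue(1);
$$;

CALL row_count('ORDERS');
```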

44. Is Snowflake OLTP or OLAP?


Snowflake is designed as an Online Analytical Processing (OLAP) database system.
Depending on the usage, it can also be used for OLTP (Online Transaction Processing)
workloads.

45. How is Snowflake distinct from Redshift?


Both Redshift and Snowflake offer on-demand pricing but differ in how features are
packaged. Snowflake prices compute separately from storage, whereas Redshift
bundles the two.

46. What is the use of the Cloud Services layer in Snowflake?


The services layer acts as the brain of Snowflake. It authenticates user sessions,
applies security functions, offers management, performs optimization, and coordinates
all transactions.

47. What is the use of the Compute layer in Snowflake?


In Snowflake, virtual warehouses, which are clusters of compute resources, perform all
the data handling tasks. When executing a query, a virtual warehouse retrieves the
minimum data needed from the storage layer to satisfy the request.


About the Author

Madhuri Yerukala is a Senior Content Creator at MindMajix. She has written about a
range of topics across various technologies, including Splunk, TensorFlow, Selenium,
and CEH. She spends most of her time researching technology and startups.


6. Servicenow Tutorial
Copyright © 2021 Appmajix Technologies Private Limited.


Probable Interview Questions on Snowflake:

Snowflake being relatively new, most interview questions will be based on how you
have implemented and used it in your project.

Some that I can recall are:

1. Snowflake architecture and its various components

2. Data loading and challenges faced

3. Data loading, including the COPY command; how you create stages; the various stage
types - table/internal/external

4. How the stage types differ from each other

5. How to create an external stage referencing an S3 bucket/ADLS

6. Scenarios such as how you handle duplicates, since integrity constraints are not
enforced in Snowflake

7. What warehouses are and how you design them

8. Performance optimization - auto-scaling, caching, and clustering

9. Scenarios where you have used the time travel feature

10. UDFs and how you have used them

11. Restrictions in Snowflake compared to an RDBMS

12. If you have worked on a migration, the challenges faced and how you overcame
them

13. Task and stream implementation scenarios

14. How you have loaded structured/semi-structured files

15. The volume of data you have handled

16. The architecture of your project

17. Orchestration of Snowflake pipelines - the tools used for it

18. How you handle transactions in Snowflake

19. Snowpipe

20. Data unloading

21. Role-based access management - important to have knowledge of this

22. How normal views and materialized views differ in Snowflake

There can be various scenario-based questions depending on what you explain about
your project and how knowledgeable the interviewer is.

With Snowflake, you need basic to intermediate knowledge of cloud technology, and
knowledge of orchestration will be an add-on.

Questions specific to data loading/unloading:

1. The difference between the COPY and PUT commands.

2. The steps involved in creating stages.

3. The different types of stages.

4. Rejected/bad records - how to view them.

5. Given a scenario where you need to load files as soon as they arrive, how would you
implement it?

6. Loading semi-structured data - through a VARIANT column.

7. How Snowflake always loads only the incremental data files, ignoring the ones
already loaded.

8. How you would force-load all the files in a given folder/directory.

9. Can you query data on external storage without loading it, and if so, how?

10. Bulk vs. continuous loading.

