Snowflake Interview Question

Uploaded by

vnc7229
Copyright
© All Rights Reserved

1. Is Snowflake an ETL or ELT tool?

2. A virtual warehouse contains CPU, memory, and temporary storage.


3. What are transient and temporary tables?
4. Joins

5. I have a stored procedure named test that returns true or false to indicate an error. How do
you call this stored procedure?
6. Which cloud service are you using along with Snowflake?
7. UNION and UNION ALL
• UNION: combines the result sets of the queries and removes any duplicates.
• UNION ALL: combines the result sets of the queries without removing duplicates.

SELECT C1, C2, C3 FROM TABLE_1 UNION SELECT C1, C2, C3 FROM TABLE_2;

8. How do you read data from an external table?
SELECT value:"column name" AS "anything" FROM tablename;
9. Online editor:
https://www.programiz.com/sql/online-compiler/
select c.customer_id,c.first_name, c.last_name,o.item,o.amount,s.status
from Customers c
left join Orders o on c.customer_id=o.customer_id
left join Shippings s on c.customer_id=s.customer;

10. http://teachmehana.com/row-store-vs-column-store/
11. Which Snowflake edition have you worked with: Standard, Enterprise, or Business Critical?
11. [{"Employee_ID":1901,"Employee_Name":"James","salary":"$9000"}]
12. [{"Employee_ID":1901,"Employee_Name":"James","salary":"$9000","expenses":{"Fast_Food":"$200","Hotelling":"$50"}}]

10. How does caching work in Snowflake?

11. How does caching work when the underlying table is updated in Snowflake?

Ans: The cached result is not used in this case; Snowflake reads directly from the database storage layer.

13. When we update or delete records, the changes are made at the storage level. Any SELECT query
run after the update or delete therefore goes to the database storage layer through the
compute layer, and that compute is charged.
14. Is the cache shared across multiple users? Yes.
15. For how long is a query result cached? 24 hours.
16. Are there any charges for storing the cache? No. The result cache is held in the cloud services
layer and the warehouse cache on the warehouse's local storage, so neither is billed as table storage.
17. Does AWS have a service similar to Snowflake? Yes, Redshift Spectrum, which queries data
directly from S3.
18. Which SQL version does Snowflake support? It supports standard SQL, i.e. ANSI SQL.
19. Which cloud platforms are supported by Snowflake? AWS, Azure, and GCP.

20. Which ETL tools are used with Snowflake? AWS Glue, Apache Airflow, Hevo Data, Informatica.
21. In Snowflake, zero-copy cloning lets us create a copy of our tables, schemas, and databases
without duplicating the underlying data. To perform a zero-copy clone, we use the CLONE keyword.
22. What is Snowflake Time Travel?
Ans: Time Travel allows us to access historical data (data that has since been changed or
deleted) at any point within a defined retention period.
23. Explain Fail-safe in Snowflake.
Ans: Fail-safe is a Snowflake data protection feature and part of its data protection
lifecycle. It provides a seven-day period, starting when the Time Travel retention period
ends, during which historical data can still be recovered by Snowflake.

https://snowflakemasters.in/snowflake-interview-questions/
24. Is the data at a Snowflake stage actual data or metadata? For an external stage it is metadata:
the stage references the data files present in S3.
25. Does Snowflake use indexes? No. As of this note (24-11-2022) Snowflake does not use indexes,
although index support was being added in newer versions.
26. Is Snowflake OLTP or OLAP? Snowflake is designed as an OLAP database system (for analysing
historical data), although depending on usage it can also be used for OLTP.
27. How many records are displayed by default when we run a SELECT query? 10,000
28. https://medium.com/@sanket.prabhu34/commonly-asked-snowflake-interview-questions-e3863732c53f
29. Explain column-level security in Snowflake. It is implemented through masking policies:
https://docs.snowflake.com/en/user-guide/security-column-intro.html#what-are-masking-policies
30. Can we execute stored procedures one after another rather than on a schedule?
31. https://www.edureka.co/blog/interview-questions/sql-query-interview-questions
32. https://www.interviewbit.com/sql-interview-questions/
33. https://www.kdnuggets.com/2020/11/5-tricky-sql-queries-solved.html
34. 250 SQL Server Interview Questions And Answers For Experienced (codingcompiler.com)
35. Write a query to find the duplicates:
fName    lName    salary   id
Neil     lee      2000     1
adam     young    5000     2
john     meloni   2500     3
adam     young    1900     4
john     meloni   6500     5
Arnold   brent    4500     6
36. Difference Between Snowflake Stored Procedure and UDFs - SP vs UDFs - DWgeek.com
37. What are indexes and what are their types? What are clustered and non-clustered indexes,
and what is a unique index?
38. Write a query to get the second-highest salary using a common table expression (CTE).
39. Parsing JSON in Snowflake from an external stage
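Items 7, 35 and 38 above can be tried out locally. The following is only a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in for Snowflake (an assumption made so the example runs anywhere), the table name emp is hypothetical, and the single-column table_1 and table_2 are simplified versions of the tables in item 7. The SQL shown is plain ANSI-style SQL, so the same statements should carry over with little or no change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Sample data from item 35 (the table name emp is hypothetical)
    CREATE TABLE emp (fName TEXT, lName TEXT, salary INTEGER, id INTEGER);
    INSERT INTO emp VALUES
        ('Neil',   'lee',    2000, 1),
        ('adam',   'young',  5000, 2),
        ('john',   'meloni', 2500, 3),
        ('adam',   'young',  1900, 4),
        ('john',   'meloni', 6500, 5),
        ('Arnold', 'brent',  4500, 6);
    -- Simplified single-column tables for item 7
    CREATE TABLE table_1 (c1 INTEGER);
    CREATE TABLE table_2 (c1 INTEGER);
    INSERT INTO table_1 VALUES (1), (2);
    INSERT INTO table_2 VALUES (2), (3);
""")

# Item 7: UNION removes duplicates, UNION ALL keeps them.
union_rows = conn.execute(
    "SELECT c1 FROM table_1 UNION SELECT c1 FROM table_2 ORDER BY c1"
).fetchall()
union_all_rows = conn.execute(
    "SELECT c1 FROM table_1 UNION ALL SELECT c1 FROM table_2 ORDER BY c1"
).fetchall()

# Item 35: names that appear more than once.
duplicates = conn.execute("""
    SELECT fName, lName, COUNT(*) AS cnt
    FROM emp
    GROUP BY fName, lName
    HAVING COUNT(*) > 1
    ORDER BY fName
""").fetchall()

# Item 38: second-highest salary using a CTE that holds the top salary.
second_highest = conn.execute("""
    WITH top_salary AS (SELECT MAX(salary) AS s FROM emp)
    SELECT MAX(e.salary)
    FROM emp e, top_salary t
    WHERE e.salary < t.s
""").fetchone()[0]

print(union_rows, union_all_rows, duplicates, second_highest)
```

A window function such as DENSE_RANK() inside the CTE is another common way to answer item 38; the MAX-based form is used here only because it runs on any SQLite version.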
--==============================================================================

------------Sent by Digambar on 13-12-2023


Explain exception handling in stored procedures.
Explain the complete setup of Snowpipe.
What is a data warehouse?
What is Snowflake?
What is a task?
What is a stream? Explain the types of streams.
Have you implemented tasks and streams with Snowpipe? Explain.
How do you check the changes happening in data using a stream? Give the query.
What is Time Travel?
How do you restore a dropped table?
Suppose I dropped a table emp and then created a new table emp. Is it possible to restore the
emp table that was dropped, and how?
What are the types of tables in Snowflake? Explain them.
While executing a query, how do you know whether the warehouse is up or not?
How do you check how many partitions are scanned when a query is executed?
What is a cache, and what are the types of cache in Snowflake?
Is it possible to resize a warehouse during query execution?
Explain the architecture of Snowflake.
How is data stored in Snowflake?
What is clustering in Snowflake?
Advantages of zero-copy cloning

--18-12-2023
Clone-related questions:

1) What is cloning?
2) How do you clone a table, schema, and database?
3) Why is cloning called zero-copy cloning?
4) In which situations have we used cloning in our project?
5) Does a cloned object have its own storage?
6) When we make changes to a cloned object, are they reflected in the base table, and vice versa?
7) A new record is inserted into the cloned table. Does it use storage? If so, where is the data
stored?
8) Can we perform DML operations on cloned tables? How does it impact the original tables?

--22-12-2023
What is data caching in Snowflake? What are the types of data caching?
What is the difference between Time Travel and Fail-safe?
Explain Snowflake architecture

--03-01-2024
What is the difference between DELETE and TRUNCATE?
What is the difference between EXECUTE AS CALLER and EXECUTE AS OWNER?
How can we manage metadata in Snowflake?
What is the difference between STUFF() and REPLACE()?
How can we check who deleted a particular table in SQL Server?
What is a cluster key in Snowflake?
What is the maximum file size that we can upload into a stage?
What is data ingestion? What are the ingestion techniques?
Suppose we have tables A and B and apply a CROSS JOIN. What does the result look like?

In which case have you used horizontal scaling in your current project?
What is RBAC?

What is a query tag in Snowflake?


How can we manage metadata?
CTE and subquery: which one is better, and why?
How do we declare a variable in JavaScript in a stored procedure?
How do you call a procedure in a function? We can't.
What is the difference between a view and a secure view?

--From WhatsApp Group


Data mart, Data warehouse, Data lake:

Data Mart: A data mart is a simple form of data warehouse that is designed to serve a specific
business unit or team. It contains a subset of an organization's data and is optimized for the queries
and reports needed by that particular team.

Data Warehouse: A data warehouse is a large, centralized repository of data that is used for
reporting and data analysis. It is designed to handle the massive volumes of data that businesses
generate, and it is optimized for querying and analysis rather than transaction processing.

Data Lake: A data lake is a large, centralized repository of raw data, whether structured,
semi-structured, or unstructured. It is designed to handle the diverse and ever-growing volume of
data that businesses generate, and it is optimized for storing and processing large volumes of data
in its native format.

Time Travel and Syntax: Time Travel is a feature in some databases that allows you to query the
database as it existed at some point in the past. In Snowflake, you can add an AT or BEFORE clause
to the table reference in a SELECT statement to query data as it existed at a specific time. The
syntax is:

SELECT * FROM <table_name> AT(TIMESTAMP => '2022-01-01 12:00:00'::timestamp_ltz);

I have used Time Travel up to 3 days in the past.

Loading specific columns: To load only 5 columns from a staged file that has 1000 columns, you can
use the COPY INTO command with a column list on the target table and a SELECT over the stage that
picks the wanted fields by position ($1, $2, ...). The positions below are placeholders. For example:

COPY INTO my_table (col1, col2, col3, col4, col5)
FROM (SELECT $1, $2, $3, $4, $5 FROM @my_stage)
FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);


Original and cloned tables: If you insert data into table B, which is a cloned table of table A, the data
will not reflect in table A. This is because a clone in Snowflake is a point-in-time copy of a table, and
any changes made to the clone do not affect the original table.
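The behaviour described above can be pictured with a toy model. This is only an illustrative sketch in Python, not Snowflake's implementation: the Table class and its methods are hypothetical names, and the model assumes only that a clone initially shares the original's immutable partitions and that new data written to either table lands in partitions owned by that table alone.

```python
class Table:
    """Toy table: an ordered list of references to immutable partitions."""

    def __init__(self, partitions=None):
        # Copy the list of references, never the partition contents.
        self.partitions = list(partitions or [])

    def insert(self, rows):
        # New rows go into a new partition that belongs only to this table.
        self.partitions.append(tuple(rows))

    def clone(self):
        # "Zero-copy": the clone points at the same partition objects.
        return Table(self.partitions)

    def rows(self):
        return [row for part in self.partitions for row in part]


table_a = Table()
table_a.insert([("emp1",), ("emp2",)])

table_b = table_a.clone()
# Right after cloning, both tables reference the very same partition object.
shared_before_insert = table_b.partitions[0] is table_a.partitions[0]

# Inserting into the clone adds storage for the clone only.
table_b.insert([("emp3",)])

print(table_a.rows())  # table A is unchanged
print(table_b.rows())  # table B sees the old rows plus the new one
```

In this model only the partition added after the clone would count as extra storage for table B, which matches the usual explanation of why a fresh clone costs no additional storage.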

Extracting changed records: To extract the changed records (excluding duplicates) between two files
received on different days, you would first load both files into separate tables. Then you can use
the MINUS operator to find the records that are in one table but not the other, and INTERSECT to
find the records that are present in both. For example:

-- my_csv_format is a placeholder for a named CSV file format;
-- list one $n column per field in the files
CREATE TABLE table1 AS
SELECT $1 AS col1, $2 AS col2 FROM @my_stage_1 (FILE_FORMAT => 'my_csv_format');

CREATE TABLE table2 AS
SELECT $1 AS col1, $2 AS col2 FROM @my_stage_2 (FILE_FORMAT => 'my_csv_format');

-- Records in table1 but not in table2
SELECT * FROM table1
MINUS
SELECT * FROM table2;

-- Records in table2 but not in table1
SELECT * FROM table2
MINUS
SELECT * FROM table1;

-- Records that are in both tables (duplicates excluded)
SELECT * FROM table1
INTERSECT
SELECT * FROM table2;
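The set operators above can be checked on a small data set. The sketch below uses Python's built-in sqlite3 module as a local stand-in for Snowflake (an assumption); SQLite spells MINUS as EXCEPT, and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);   -- day 1 file
    CREATE TABLE table2 (id INTEGER, name TEXT);   -- day 2 file
    INSERT INTO table1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO table2 VALUES (2, 'b'), (3, 'changed'), (4, 'd');
""")

# Records in table1 but not in table2 (EXCEPT is SQLite's name for MINUS).
only_in_table1 = conn.execute(
    "SELECT * FROM table1 EXCEPT SELECT * FROM table2 ORDER BY id"
).fetchall()

# Records in table2 but not in table1: new rows and changed rows.
only_in_table2 = conn.execute(
    "SELECT * FROM table2 EXCEPT SELECT * FROM table1 ORDER BY id"
).fetchall()

# Records that are identical in both tables.
in_both = conn.execute(
    "SELECT * FROM table1 INTERSECT SELECT * FROM table2 ORDER BY id"
).fetchall()

print(only_in_table1, only_in_table2, in_both)
```

Note that a changed record (id 3 here) shows up on both sides of the difference, once with its old values and once with its new values, because the comparison is on whole rows.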

Permanent, transient, and temporary tables:


Permanent Table: A permanent table is stored in a database and persists across sessions until it is
explicitly dropped. It is visible to all users with the appropriate privileges, is protected by
Time Travel and Fail-safe, and is used to store data that needs to be accessed and updated over time.

Transient Table: A transient table also persists until it is explicitly dropped and is visible to
all users with the appropriate privileges, but it has no Fail-safe period and only limited Time
Travel (0 or 1 day). It is useful for data that can be reloaded or regenerated, where lower storage
cost matters more than recoverability.

Temporary Table: A temporary table exists only within the session that created it and is dropped
automatically when the session ends. It is not visible to other users or sessions and has no
Fail-safe period. It is useful for storing temporary results.

Usage of transient table: I have used transient tables in Snowflake for storing intermediate results
while performing complex data transformations. This improves performance because the intermediate
results are stored once, rather than being re-computed for each step of the transformation, and it
avoids Fail-safe storage costs for data that can be rebuilt.
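The session scoping of temporary tables can be observed locally. The sketch below uses Python's built-in sqlite3 module, where each connection plays the role of a session; this is an assumption for illustration only, and SQLite has no equivalent of transient tables, Time Travel, or Fail-safe. The table names are hypothetical.

```python
import os
import sqlite3
import tempfile

# Two connections to the same database file act as two separate sessions.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
session_1 = sqlite3.connect(db_path)
session_2 = sqlite3.connect(db_path)

session_1.execute("CREATE TABLE permanent_t (id INTEGER)")
session_1.execute("CREATE TEMP TABLE temp_t (id INTEGER)")
session_1.commit()


def table_visible(conn, name):
    """Return True if the given connection can query the table."""
    try:
        # name comes only from the fixed strings below, never from user input
        conn.execute(f"SELECT COUNT(*) FROM {name}")
        return True
    except sqlite3.OperationalError:
        return False


permanent_in_session_2 = table_visible(session_2, "permanent_t")
temp_in_session_1 = table_visible(session_1, "temp_t")
temp_in_session_2 = table_visible(session_2, "temp_t")

print(permanent_in_session_2, temp_in_session_1, temp_in_session_2)

# Closing the creating session discards its temporary table.
session_1.close()
session_2.close()
```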

Max size of data warehouse: Snowflake does not set a fixed limit on the amount of data stored;
storage grows with the underlying cloud infrastructure. Compute capacity is set by the virtual
warehouse size, and features such as multi-cluster warehouses depend on the edition.

Max records of data processed: I have processed up to billions of records in a single query in
Snowflake. The exact number depends on the complexity of the query and the resources available in
the virtual warehouse.

Materialized views and secure views:

Materialized View: A materialized view is a precomputed, stored version of a SELECT statement. It is
used to improve the performance of queries by storing the results of the SELECT statement
physically, rather than re-computing them each time the query is run.

Secure View: A secure view is a view, defined by a SELECT statement, whose definition is hidden from
users who do not own it. It is used to provide access to data in a secure manner by allowing users
to query the data without granting them direct access to the underlying tables.

Probable Interview Questions on Snowflake :

Because Snowflake is relatively new, most of the interview questions will be based on how you have
implemented and used it in your project.

Some which I could recall are -


1. Snowflake architecture and its various components

2. Data loading and challenges faced

3. Data loading, including the COPY command, how you create stages, and the various stage types -
table/internal/external.

4. How do stages differ from each other?

5. How to create an external stage referencing an S3 bucket/ADLS

6. Scenarios like how you handle duplicates, as integrity constraints are not enforced in SF.

7. What are warehouses and how do you design them?

8. Performance optimization - Auto Scaling, Caching and clustering

9. Scenarios where you have used time travel feature.

10. UDF and how have you used it.

11. Restrictions in SF as compared to RDBMS database.

12. If you have worked on migration what are challenges faced and how did you overcome it.

13. Task, streams implementation scenario.

14. Structured/semi- structured files how have you loaded them.

15. Volume of data you have handled.

16. Architecture of your project.

17. Orchestration of SF pipelines - tools being used for it.

18. How do you handle transactions in Snowflake?

19. Snowpipe

20. Data Unloading.

21. Role based access management - important to have knowledge on this.

22. How do normal views and materialized views differ in Snowflake?

There can be various scenario-based questions, depending on what you explain about your project and
how knowledgeable the interviewer is.

With Snowflake you need basic to intermediate knowledge of cloud technology, and orchestration
experience is an add-on.

When to use star schema and when to use snowflake schema for designing?
--2024-04-15

35 Most Common SQL Interview Questions 👇👇

1.) Explain the order of execution of a SQL query.

2.) What is the difference between WHERE and HAVING?

3.) What is the use of GROUP BY?

4.) Explain all types of joins in SQL.

5.) What are triggers in SQL?

6.) What is a stored procedure in SQL?

7.) Explain all types of window functions.

(Mainly RANK, ROW_NUMBER, DENSE_RANK, LEAD & LAG)

8.) What is the difference between DELETE and TRUNCATE?

9.) What is the difference between DML, DDL and DCL?

10.) What are aggregate functions and when do we use them? Explain with a few examples.

11.) Which is faster, a CTE or a subquery?

12.) What are constraints and the types of constraints?

13.) Types of keys?

14.) Different types of operators?

15.) Difference between GROUP BY and WHERE?

16.) What are views?

17.) What are the different types of constraints?

18.) What is the difference between varchar and nvarchar?

19.) Similarly, between char and nchar?

20.) What are indexes and their types?

21.) What is an index? Explain its different types.

22.) List the different types of relationships in SQL.

23.) Differentiate between UNION and UNION ALL.

24.) How many types of clauses are there in SQL?

25.) What is the difference between UNION and UNION ALL in SQL?

26.) What are the various types of relationships in SQL?

27.) Difference between a primary key and a secondary key?

28.) What is the difference between WHERE and HAVING?

29.) Find the second-highest salary of an employee.

30.) Write a retention query in SQL.

31.) Write a year-on-year growth query in SQL.

32.) Write a query for a cumulative sum in SQL.

33.) Difference between a function and a stored procedure?

34.) Can we use variables in views?

35.) What are the limitations of views?
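Questions 7, 31 and 32 above all come down to window functions. The following is a minimal sketch of a cumulative sum and a year-on-year growth calculation, using Python's built-in sqlite3 module as a local stand-in (this assumes the bundled SQLite is version 3.25 or later, which introduced window functions). The table yearly_sales and its values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE yearly_sales (yr INTEGER, amount INTEGER);
    INSERT INTO yearly_sales VALUES (2021, 100), (2022, 150), (2023, 120);
""")

rows = conn.execute("""
    SELECT yr,
           amount,
           -- cumulative (running) sum up to and including the current year
           SUM(amount) OVER (ORDER BY yr) AS running_total,
           -- year-on-year growth in percent; NULL for the first year
           ROUND(100.0 * (amount - LAG(amount) OVER (ORDER BY yr))
                 / LAG(amount) OVER (ORDER BY yr), 1) AS yoy_growth_pct
    FROM yearly_sales
    ORDER BY yr
""").fetchall()

for row in rows:
    print(row)
```

LAG looks one row back in the window order, so the first year has no previous value and its growth is NULL (None in Python).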

--2024-04-15

Here you can find essential SQL interview resources.

1. What is the data flow, and how many layers are in our projects?

2. How do you convert JSON to the Snowflake VARIANT data type?

3. What are alternative methods for loading data into Snowflake without using JSON functions?

4. Can you explain the concept of a warehouse in Snowflake?

5. What is the architecture of Snowflake?

6. What is a stream in Snowflake, and what are the columns present in a stream?

7. How are task dependencies managed in Snowflake?

8. What is a Snowpipe in the context of Snowflake?

9. How can you set up error notifications in Snowflake?

10. Is there a specific table for maintaining notification history in Snowflake?

11. What is the purpose of the pattern function in Snowflake?

12. Could you explain the process of data sharing in Snowflake?

13. What are the types of Slowly Changing Dimensions (SCD)?

14. How do you move 100 GB of data into SF? Describe the steps you would follow.

15. What is the maximum size of a file that can be loaded into an S3 bucket?

16. Explain the relationship between AWS and SF.

17. How can you create a table in Oracle with a Time Travel retention period that allows going
back 12 days?

18. Differentiate between a View and a Materialized View (MView).

19. Have you worked with Snowpipe? If so, describe your experience in creating and using
Snowpipe.

20. Explain the concept of a Merge statement in the context of a relational database.
21. What ETL (Extract, Transform, Load) tool would you recommend for data integration tasks,
and what are the key features that make it suitable?

22. Can you explain the key features and advantages of using Snowflake as a cloud data
warehouse platform?

23. What are the key components and architectural considerations when designing a data
solution using Snowflake as the underlying data warehouse?

24. How does Snowflake handle caching, and what role does it play in optimizing query
performance?

25. What strategies or best practices can be employed to enhance the performance of a
Snowflake data warehouse?

26. What mechanisms does Snowflake provide for ensuring fail-safe operations, especially in the
context of data processing and storage?

27. What steps should be followed when extracting data from Snowflake as a source for a data
integration or ETL process?

28. Can you elaborate on the internal storage architecture of Snowflake and how data is
organized within the platform?

29. What methods or tools can be used to schedule and automate tasks or jobs within
Snowflake?

30. What is normalization in the context of database design, and why is it important?

31. Explain the concept of third normalization in database design and its significance.

32. Query to find the second date from the given dates (2020-01-23, 2020-02-21):

33. Provide a SQL query to find the second date from a set of given dates, considering a specific
condition.

34. How can you extract the fourth character from each value in a specific column (col_a)
containing city names like Chennai, AP, and Mumbai?

35. What steps are involved in creating a table in Snowflake, and how can you insert data from a
file into that table?

36. What are the advantages and use cases for SnowSQL, Snowflake's command-line client?

37. Explain the methods or procedures to recover or retrieve records that have been
accidentally deleted from a Snowflake table.

38. Is there any size limit for file loading in Snowpipe?

39. How to handle double quotes in the file format of Snowflake?

40. Can we perform any DML in the clone table, and what happens to storage in Snowflake?

41. How to perform database cloning in Snowflake?

42. Does database cloning in Snowflake include a stage as well?


43. What is the datatype to handle JSON and XML in the stage layer of Snowflake?

44. What is the max size of the VARIANT data type in Snowflake?

45. Scenario to change the warehouse from small to medium in Snowflake?

46. How to select duplicates in Snowflake tables?

47. How will you optimize a query even when it has a cluster key, and the query is still taking
time with a large volume of data in Snowflake?

48. If a file with the same filename gets loaded into S3 in Snowflake, what will happen?

49. If a file with the same filename gets loaded into S3 after deleting the old one, will the
duplicate data get loaded, and how will it load?

50. How to exclude double quotes in the file format of Snowflake?

51. What happens to compute in Snowflake?

52. What is the cost computation for cloning in Snowflake?

53. Limitations in a Reader account in Snowflake?

54. How to change the sequence number in Snowflake?

55. How to alter the autoincrement in Snowflake?

56. Syntax to change the datatype in Snowflake?

57. What are the credits for each warehouse in Snowflake?

58. What are credits in Datawarehouse?

59. Different types of credits in Snowflake?

60. Errors we get when loading files to Snowflake tables?

61. How to eliminate the entire row duplicate in the flat file of Snowflake?

62. How to retain one unique record and delete duplicates in Snowflake tables?

63. How to check the long-running step in SQL at Snowflake?

64. What are the limitations we have in using SQL in Snowflake JavaScript stored procedures?

65. How to integrate and share data between Unix and Snowflake using SnowSQL on Unix?

66. How to load a large volume of data from on-premise to Snowflake?

67. How do you load data into an external stage in S3 in Snowflake?

68. Python in Snowflake?

69. Do Lambda functions work with Snowflake?

70. Use of any ETL tool in Snowflake?

71. The volume of data handled in Snowflake?

72. Provide a method to remove duplicate records from a table in SF.


[Apple - Cloud data engineer (4 + years)]

Round1:

Interview Questions

1. Tell me about yourself.

2. Explain about your previous project.

3. Explain how data populates from source into destination.

4. Difference between Redshift and Snowflake. What are the difficulties you faced in Redshift
compared with Snowflake? (Because I have experience in both.)

5. How did you handle any unexpected failures in the pipeline?

6. How did you handle duplicates in the table?

7. What is your main role in your current project?

8. How did you handle the situation when you were unable to complete a task within the timeline?

9. How did you handle hard deletes in the source system?

10. What is the maximum number of records you have worked with?

11. What are the techniques you used to tune SQL queries?

12. Tell me the list of transformations that you used in your project.

Query section:

1. How did you find the second-largest salary in the table?

2. How did you identify the current month and previous month salaries of employees? Assume we
have a table with dates and salaries.

3. Transform rows into columns. (without pivot)

Snowflake Basic to Intermediate Interview Questions.

1. What are the essential features of Snowflake?

2. Can you explain Snowflake's architecture?


3. What are micro-partitions in Snowflake, and how do they contribute to the platform's data storage
efficiency?

4. Can you explain how virtual warehouses affect the scalability, performance, and cost management
of data processing tasks?

5. Can you discuss how Snowflake’s compatibility with ANSI SQL standards influences the querying
and data manipulation capabilities?

6. Can you explain Snowflake's approach to data security, specifically its always-on encryption?

7. Can you explain Snowflake's support for both ETL and ELT processes?

8. What are all ETL tools you have used with Snowflake?

9. Can you explain how the advanced feature Snowpipe is used for continuous data ingestion?

10. What is Snowflake's approach to OLTP and OLAP?

11. What is the difference between shared-disk and shared-nothing architectures?

12. Define ‘Staging’ in Snowflake.

13. What are the different types of caching in Snowflake?

14. Define the different states of the Snowflake Virtual Warehouse.

15. Can you describe the impact of the different states of virtual warehouses on query performance?

16. How do you create a virtual warehouse?

17. How do you build a Snowflake task that calls a Stored Procedure?
18. You have a JSON data column in a table storing customer feedback with specific keys. Write a
query to extract and display the feedback text and timestamp for a specific customer_id.

19. How do you verify the task history of a Snowflake Task?

20. How do you create a temporary table in Snowflake?

Fact tables and dimension tables are two key components in a dimensional modeling approach used
in data warehousing.

1. Fact Tables:

• Fact tables contain the quantitative data, also known as facts, that are typically
numerical values representing business transactions or events.

• They are usually large tables and store information such as sales amounts, quantities
sold, or revenues.

• Fact tables often have foreign keys that reference the primary keys of dimension
tables, creating relationships between them.

2. Dimension Tables:

• Dimension tables contain descriptive attributes or context for the data stored in the
fact table.

• They provide the necessary context to interpret the data in the fact table.

• Dimension tables are typically smaller in size compared to fact tables.

• Examples of dimension tables include product, customer, time, and location tables.

In summary, fact tables store quantitative data about business processes, while dimension tables
provide the context or descriptive attributes related to that data. They are linked through foreign
key relationships to create a comprehensive and understandable data model for analysis and
reporting purposes.
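As a concrete illustration of the summary above, here is a minimal sketch of one fact table joined to one dimension table. It uses Python's built-in sqlite3 module so it can be run locally; the table names fact_sales and dim_product, their columns, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension table: descriptive attributes (the context)
    CREATE TABLE dim_product (
        product_id   INTEGER PRIMARY KEY,
        product_name TEXT
    );
    -- Fact table: numeric measures plus a foreign key to the dimension
    CREATE TABLE fact_sales (
        sale_id    INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES dim_product(product_id),
        quantity   INTEGER,
        amount     REAL
    );
    INSERT INTO dim_product VALUES (1, 'Laptop'), (2, 'Phone');
    INSERT INTO fact_sales VALUES
        (1, 1, 2, 2000.0),
        (2, 2, 1,  500.0),
        (3, 1, 1, 1000.0);
""")

# Aggregate the facts and use the dimension to label the result.
sales_by_product = conn.execute("""
    SELECT p.product_name,
           SUM(f.quantity) AS total_quantity,
           SUM(f.amount)   AS total_amount
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    GROUP BY p.product_name
    ORDER BY p.product_name
""").fetchall()

print(sales_by_product)
```

The fact table holds only keys and measures, so adding more context (customer, time, location) means adding more dimension tables and foreign keys rather than widening the fact rows with text.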
