Untitled document (1)
● Indexes (Clustered vs. Non-Clustered): A clustered index determines the physical
order in which rows are stored, so a table can have only one. A non-clustered index is
a separate structure that points back to the rows. Both speed up lookups at the cost of
slower writes.
● Execution Plans & Query Profiling: Use EXPLAIN or EXPLAIN ANALYZE to inspect
query plans, identify bottlenecks, and optimize queries.
● Partitioning & Sharding: Partitioning splits a table within a single database; sharding
distributes data across multiple databases for scalability.
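● Normalization & Denormalization: Normalization reduces redundancy and update
anomalies by splitting data into related tables. Denormalization improves read
performance by reducing joins and is often preferred in analytical workloads.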
● CTEs vs. Temporary Tables vs. Derived Tables: CTEs improve readability; temporary
tables store intermediate data; derived tables are subqueries used within queries.
● Materialized Views vs. Regular Views: Materialized views store precomputed results
and must be refreshed; use them when read performance matters more than freshness.
Regular views are virtual and recomputed on access.
● Window Functions & Ranking Queries: ROW_NUMBER(), RANK(), and
DENSE_RANK() number rows within each partition in the given order. ROW_NUMBER()
is always unique, RANK() leaves gaps after ties, and DENSE_RANK() does not.
● Optimizing Joins: Index the join keys, filter rows before joining so the inputs are as
small as possible, and consider hash joins for large datasets.
● Handling Large Data Sets: Use indexing, partitioning, batch processing, and
compression. Avoid full table scans.
● Concurrency & Locking: Use proper isolation levels, indexing, and deadlock detection
to prevent performance issues.
● Recursive CTEs: Use WITH RECURSIVE (plain WITH in SQL Server) to traverse
hierarchical data such as organization trees.
● Pivot & Unpivot: PIVOT converts rows to columns; UNPIVOT does the opposite.
● JSON & XML Handling: Use JSON_VALUE(), JSON_QUERY(), or OPENXML() to query
and manipulate structured data.
● Dynamic SQL: Construct and execute queries at run time using EXEC or
sp_executesql. Prefer sp_executesql with parameters to reduce SQL injection risk.
● LEAD(), LAG(), FIRST_VALUE(), LAST_VALUE(): Used for accessing the next,
previous, first, or last value relative to the current row in analytic queries.
● ROLLUP & CUBE: ROLLUP creates hierarchical subtotals following the listed column
order; CUBE computes aggregates for every combination of the grouping columns.
● String Manipulation & Regex: Use LIKE, CHARINDEX(), PATINDEX(), and regex
functions for text processing.
● Stored Procedures vs. Functions: Procedures perform actions; functions return values
and can be used in queries.
● Triggers: Use sparingly for audit logging and enforcing business rules; can impact
performance.
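● Transactions & Isolation Levels: Isolation levels control which concurrency anomalies
a transaction can observe, from READ UNCOMMITTED (fastest, allows dirty reads)
through READ COMMITTED and REPEATABLE READ to SERIALIZABLE (strictest).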
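● Optimistic vs. Pessimistic Locking: Optimistic locking suits high-read, low-conflict
scenarios and detects conflicts at write time (for example, with a version column);
pessimistic locking prevents conflicts by locking rows up front.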
● Savepoints & Rollback: Savepoints allow partial rollbacks within a transaction.
● ETL vs. ELT: ETL transforms data before loading; ELT loads first and then transforms
within the database.
● Slowly Changing Dimensions (SCDs): Type 1 (overwrite), Type 2 (historical tracking),
Type 3 (limited history).
● Fact tables vs. Dimension Tables: Fact tables store measurable events (such as sales
or other transactions) keyed to dimensions; dimension tables store descriptive attributes.
● Change Data Capture (CDC): Tracks changes using log-based, trigger-based, or
timestamp-based methods.
● SQL on Big Data: Use distributed engines like Spark SQL, Hive, or BigQuery for
large-scale data querying.
● Columnar Databases: Store data by columns for fast analytical queries (e.g., Redshift,
ClickHouse).
● Horizontal vs. Vertical Scaling: Horizontal scaling adds more machines; vertical
scaling upgrades a single machine.
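As a rough illustration of the Indexes and Execution Plans items, the sketch below uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN (SQLite's counterpart to EXPLAIN) to show a plan changing from a scan to an index search. The orders table, its columns, and the index name are hypothetical, and the exact plan text varies by SQLite version.

```python
import sqlite3

# Hypothetical table and data, used only to make the plan change visible.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 50, i * 1.5) for i in range(1000)],
)


def plan_for(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


query = "SELECT total FROM orders WHERE customer_id = 7"

plan_before = plan_for(query)  # no usable index yet: full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan_for(query)   # the planner can now search via the index

print("before:", plan_before)
print("after: ", plan_after)
```

Other engines expose the same idea through EXPLAIN or EXPLAIN ANALYZE, with different output formats.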
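For the Window Functions & Ranking Queries item, the following minimal sketch shows how the three ranking functions treat ties. It assumes SQLite 3.25 or later (for window function support) via Python's sqlite3 module; the scores table and its data are made up for the example.

```python
import sqlite3

# Hypothetical data: ann and bob tie on points within department "a".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (player TEXT, dept TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [("ann", "a", 90), ("bob", "a", 90), ("cat", "a", 80), ("dan", "b", 70)],
)

ranking_rows = conn.execute(
    """
    SELECT player,
           -- player is added as a tie-breaker so ROW_NUMBER() is deterministic
           ROW_NUMBER() OVER (PARTITION BY dept ORDER BY points DESC, player),
           RANK()       OVER (PARTITION BY dept ORDER BY points DESC),
           DENSE_RANK() OVER (PARTITION BY dept ORDER BY points DESC)
    FROM scores
    ORDER BY dept, points DESC, player
    """
).fetchall()

for row in ranking_rows:
    print(row)
```

In this data, RANK() skips from 1 to 3 after the tie, while DENSE_RANK() continues with 2.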
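For the LEAD(), LAG(), FIRST_VALUE(), LAST_VALUE() item, here is a small sketch, again assuming SQLite 3.25+ through Python's sqlite3 module and a hypothetical daily_sales table. Note the explicit frame on LAST_VALUE(): with the default frame it would return the current row's value rather than the last row of the partition.

```python
import sqlite3

# Hypothetical three-day sales series.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (day INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?)", [(1, 100), (2, 150), (3, 120)]
)

offset_rows = conn.execute(
    """
    SELECT day,
           amount,
           LAG(amount)         OVER (ORDER BY day) AS previous_amount,
           LEAD(amount)        OVER (ORDER BY day) AS next_amount,
           FIRST_VALUE(amount) OVER (ORDER BY day) AS first_amount,
           LAST_VALUE(amount)  OVER (
               ORDER BY day
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS last_amount
    FROM daily_sales
    ORDER BY day
    """
).fetchall()

for row in offset_rows:
    print(row)
```

LAG() and LEAD() return NULL (None in Python) where no previous or next row exists.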
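The Recursive CTEs item can be sketched as follows, using WITH RECURSIVE in SQLite via Python's sqlite3 module. The employees table, its columns, and the names in it are assumptions for illustration; the query walks the tree from the root and records each employee's depth.

```python
import sqlite3

# Hypothetical organization tree: manager_id points at the employee's manager.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "ceo", None), (2, "vp", 1), (3, "eng", 2), (4, "ops", 1)],
)

org_rows = conn.execute(
    """
    WITH RECURSIVE chain(id, name, depth) AS (
        -- Anchor member: the root of the tree has no manager.
        SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
        UNION ALL
        -- Recursive member: attach direct reports of rows found so far.
        SELECT e.id, e.name, c.depth + 1
        FROM employees AS e
        JOIN chain AS c ON e.manager_id = c.id
    )
    SELECT name, depth FROM chain ORDER BY depth, name
    """
).fetchall()

for name, depth in org_rows:
    print("  " * depth + name)
```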
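To make the Savepoints & Rollback item concrete, the sketch below performs a partial rollback inside one transaction. It assumes SQLite through Python's sqlite3 module, opened with isolation_level=None so that BEGIN and COMMIT are issued explicitly; the ledger table and savepoint name are hypothetical.

```python
import sqlite3

# isolation_level=None turns off the module's implicit transaction handling,
# so the BEGIN / SAVEPOINT / COMMIT statements below are the only ones issued.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE ledger (entry TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO ledger VALUES ('kept')")

conn.execute("SAVEPOINT before_risky_step")
conn.execute("INSERT INTO ledger VALUES ('discarded')")
# Undo only the work done since the savepoint; the transaction stays open.
conn.execute("ROLLBACK TO SAVEPOINT before_risky_step")

conn.execute("COMMIT")

ledger_entries = [row[0] for row in conn.execute("SELECT entry FROM ledger")]
print(ledger_entries)
```

Only the row inserted before the savepoint survives the commit.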
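One common way to implement the optimistic side of the Optimistic vs. Pessimistic Locking item is a version column that every update checks and increments. The sketch below is one possible approach, not the only one, and assumes SQLite via Python's sqlite3 module; the accounts table and the update_balance helper are hypothetical.

```python
import sqlite3

# Hypothetical table: version counts how many times the row has been updated.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO accounts VALUES (1, 100, 0)")
conn.commit()


def update_balance(conn, account_id, new_balance, expected_version):
    """Apply the update only if the row still has the version the caller read.

    Returns True on success and False if another writer got there first.
    """
    cursor = conn.execute(
        "UPDATE accounts SET balance = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_balance, account_id, expected_version),
    )
    conn.commit()
    return cursor.rowcount == 1


# Two writers both read version 0; only the first update is applied.
first_ok = update_balance(conn, 1, 150, expected_version=0)
stale_ok = update_balance(conn, 1, 90, expected_version=0)
final_row = conn.execute(
    "SELECT balance, version FROM accounts WHERE id = 1"
).fetchone()
print(first_ok, stale_ok, final_row)
```

A caller that gets False would typically re-read the row and retry, whereas a pessimistic approach would lock the row before reading it.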