ankush_kaira
Summary:
• Overall, 10+ years of experience across Big Data technologies, including analysis, design, and development of Big Data solutions using Hadoop, Azure, Python, Data Lake, Scala, and PySpark, as well as database and data warehouse development using MySQL, Oracle, and traditional data warehouses.
• Extensively used Power BI along with other Azure cloud services while ingesting data in different formats and from different sources.
• 10+ years of data warehousing experience, ranging from traditional warehouses to modern platforms such as Azure Synapse and Snowflake.
• 5 years of experience as an Azure Cloud Data Engineer in Microsoft Azure Cloud technologies, including Azure Data Factory (ADF), Azure Data Lake Storage (ADLS), Azure Synapse Analytics (SQL Data Warehouse), Azure SQL Database, Azure Analysis Services, PolyBase, Azure Cosmos DB (NoSQL), Azure Key Vault, Azure DevOps, and Azure HDInsight Big Data technologies like Hadoop, Apache Spark, and Azure Databricks.
• 4 years of experience as a Data Warehouse Developer handling Microsoft Business Intelligence tools.
• Experience in developing pipelines in Spark using Scala and PySpark.
• Experience in building ETL data pipelines in Azure Databricks leveraging PySpark and Spark SQL.
• Extensively worked on Azure Databricks.
• Proficient in Azure Data Factory to perform incremental loads from Azure SQL DB to Azure Synapse (a watermark-based sketch follows this summary).
• Proficient in T-SQL with extensive experience in Microsoft SQL Server.
• Hands-on experience in Azure Cloud Services (PaaS & IaaS), Azure Synapse Analytics, SQL Azure,
Data Factory, Azure Analysis Services, Application Insights, Azure Monitoring, Key Vault, and Azure
Data Lake.
• Experience in building the Orchestration on Azure Data Factory for scheduling purposes.
• Hands-on experience in the Azure cloud, working on App Services, Azure SQL Database, Azure Blob Storage, Azure Functions, Virtual Machines, Azure AD, Azure Data Factory, Event Hub, and Event Queue.
• Experience working with systems processing event-based logs, user logs, etc.
• Experience working with the Azure Logic Apps integration tool.
• Experience with Azure Logic Apps using different triggers.
• Orchestrated data integration pipelines in ADF using various activities like Get Metadata, Lookup, ForEach, Wait, Execute Pipeline, Set Variable, Filter, Until, etc.
• Strong experience in migrating other databases to Snowflake.
• Experience with Snowflake Multi-Cluster Warehouses.
• Experience with MS Azure (Databricks, Data Factory, Data Lake, Azure SQL, Event Hub, etc.)
• Experience in using Snowflake Clone and Time Travel (a brief sketch follows this summary).
• Hands-on working experience developing large-scale data pipelines using Spark and Hive.
• Implemented Security in Web Applications using Azure and deployed Web Applications to Azure.
• Experience working with ARM templates to deploy to production using Azure DevOps.
• Experience in developing very complex mappings, reusable transformations, sessions, and workflows
using the Informatica ETL tool to extract data from various sources and load it into targets.
• Proficient in leveraging cloud-based data solutions, including Microsoft 365, to enable smooth collaboration, data-driven decision-making, and data sharing with cross-functional teams.
• Worked closely with SAS Institute representatives to enhance data analytics capabilities and develop dashboards, visualizations, and custom reports.
• Experience in developing Spark applications using Spark SQL in Databricks for data extraction, transformation, and aggregation from multiple file formats.
• Adept in Microsoft Intune management to ensure secure and compliant data access and device management within cloud-based data environments.
• Implemented production scheduling jobs using Control-M and Airflow.
• Used various file formats like Avro, Parquet, SequenceFile, JSON, ORC, and text for loading data, parsing, gathering, and performing transformations.
• Hands-on experience with Kafka and Flume to load log data from multiple sources directly into HDFS.
• Good experience in Hortonworks and Cloudera for Apache Hadoop distributions.
• Hands-on experience with Confluent Kafka to load data from StreamSets directly into ADLS (a streaming sketch follows this summary).
• Strong experience building data pipelines and performing large-scale data transformations.
• In-Depth knowledge in working with Distributed Computing Systems and parallel processing
techniques to efficiently deal with Big Data.
• Designed and implemented Hive external tables using a shared metastore with static & dynamic partitioning, bucketing, and indexing (a DDL sketch follows this summary).
• Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, and pair RDDs.
• Extensive hands-on experience tuning Spark jobs for performance.
• Experienced in working with structured data using HiveQL and optimizing Hive queries.
• Solid capabilities in exploratory data analysis, statistical analysis, and visualization using R, Python, SQL, and Tableau.
• Running and scheduling workflows using Oozie and Zookeeper, identifying failures, and integrating,
coordinating, and scheduling jobs.
• Knowledge of Database Architecture for OLAP and OLTP Applications, Database designing, Data
Migration, and Data Warehousing Concepts, emphasizing ETL.
• Experience in Data Modeling & Analysis using Dimensional and Relational Data Modeling.
• Experience in using Star Schema and Snowflake Schema for modeling, and using Fact & Dimension tables and Physical & Logical Data Modeling.
• Defining user stories and driving the agile board in JIRA during project execution, participating in
sprint demos and retrospectives.
• Maintained and administered GIT source code repository and GitHub Enterprise.
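Illustrative Sketches:
The sketches below illustrate, in minimal form, a few techniques referenced in the summary above. All table names, columns, topics, connection strings, and credentials are hypothetical placeholders, not values from an actual engagement.

A watermark-based incremental load from Azure SQL DB into Azure Synapse, assuming the Databricks Synapse connector (com.databricks.spark.sqldw) is available:

    # Watermark-based incremental load sketch; all names are placeholders.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("incremental-load").getOrCreate()

    # In practice the watermark would be read from a control table, not hard-coded.
    last_watermark = "2024-01-01 00:00:00"

    # Pull only rows modified since the last successful load.
    src_query = f"(SELECT * FROM dbo.Orders WHERE ModifiedDate > '{last_watermark}') AS src"
    delta = (spark.read.format("jdbc")
        .option("url", "jdbc:sqlserver://<server>.database.windows.net;database=<db>")
        .option("dbtable", src_query)
        .option("user", "<user>")
        .option("password", "<password>")
        .load())

    # Stage the delta in Synapse; a downstream MERGE or stored procedure
    # would upsert it into the target table.
    (delta.write.format("com.databricks.spark.sqldw")
        .option("url", "jdbc:sqlserver://<workspace>.sql.azuresynapse.net;database=<dw>")
        .option("tempDir", "abfss://staging@<account>.dfs.core.windows.net/tmp")
        .option("forwardSparkAzureStorageCredentials", "true")
        .option("dbTable", "stg.Orders")
        .mode("append")
        .save())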
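A Hive external table with partitioning and bucketing, issued through Spark SQL; the path and column names are illustrative:

    # Hive external table DDL sketch with dynamic partitioning and bucketing.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder.appName("hive-ddl")
             .enableHiveSupport()
             .getOrCreate())

    spark.sql("""
        CREATE EXTERNAL TABLE IF NOT EXISTS sales_ext (
            order_id BIGINT,
            amount   DOUBLE
        )
        PARTITIONED BY (order_date STRING)
        CLUSTERED BY (order_id) INTO 32 BUCKETS
        STORED AS ORC
        LOCATION '/data/warehouse/sales_ext'
    """)

    # Nonstrict mode lets the INSERT derive order_date partitions from the data.
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
    spark.sql("""
        INSERT OVERWRITE TABLE sales_ext PARTITION (order_date)
        SELECT order_id, amount, order_date FROM staging_sales
    """)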
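A Structured Streaming job landing Kafka events in ADLS as Parquet; broker, topic, and storage paths are placeholders:

    # Kafka-to-ADLS streaming sketch; broker, topic, and paths are placeholders.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("kafka-to-adls").getOrCreate()

    events = (spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "<broker>:9092")
        .option("subscribe", "user-logs")
        .option("startingOffsets", "latest")
        .load()
        .selectExpr("CAST(value AS STRING) AS payload", "timestamp"))

    query = (events.writeStream.format("parquet")
        .option("path", "abfss://raw@<account>.dfs.core.windows.net/user-logs/")
        .option("checkpointLocation", "abfss://raw@<account>.dfs.core.windows.net/_chk/user-logs/")
        .trigger(processingTime="1 minute")
        .start())
    query.awaitTermination()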
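Snowflake Time Travel and zero-copy cloning through the snowflake-connector-python package; account, credentials, and table names are placeholders:

    # Snowflake Time Travel / Clone sketch; connection values are placeholders.
    import snowflake.connector

    conn = snowflake.connector.connect(
        account="<account>", user="<user>", password="<password>",
        warehouse="<wh>", database="<db>", schema="PUBLIC")
    cur = conn.cursor()

    # Time Travel: query the table as it looked one hour ago.
    cur.execute("SELECT COUNT(*) FROM orders AT(OFFSET => -3600)")
    print(cur.fetchone())

    # Zero-copy clone of that historical state for investigation or rollback.
    cur.execute("CREATE TABLE orders_restore CLONE orders AT(OFFSET => -3600)")

    cur.close()
    conn.close()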
Education:
Bachelor of Engineering in Information Technology | Panjab University, India (2011)
MS in Information Systems | New York University, USA (2013)
Technical Skills:
Azure Services: Azure Data Factory, Airflow, Azure Databricks, Logic Apps, Function Apps, Snowflake, Azure DevOps, Blob Storage
Big Data Technologies: MapReduce, Hive, Python, PySpark, Scala, Kafka, Spark Streaming, Oozie, Sqoop, Zookeeper
Hadoop Distributions: Cloudera, Hortonworks
Languages: Java, SQL, PL/SQL, Python, HiveQL, Scala
Operating Systems: Windows (XP/7/8/10), UNIX, Linux, Ubuntu, CentOS
Build Automation Tools: Ant, Maven
Version Control: Git, GitHub
IDE & Build Tools, Design: Eclipse, Visual Studio
Databases: MS SQL Server 2016/2014/2012, Azure SQL DB, Azure Synapse, MS Excel, MS Access, Oracle 11g/12c, Cosmos DB
Data Analyst | UBS, Weehawken, NJ | Apr 2016 - Apr 2018
Responsibilities:
• Worked on Azure Synapse for building and optimizing end-to-end data analytics solutions with seamless
integration of data warehousing, big data, and data integration capabilities.
• Created and maintained databases for server inventory and performance inventory.
• Leveraged Azure Synapse Analytics while working with data warehousing.
• Developed data marts and a user access tool for ad-hoc reporting.
• Built cubes and dimensions for business intelligence and wrote MDX scripting.
• Developed SSIS jobs to automate report generation and cube refresh processes.
• Utilized SQL Server Reporting Services (SSRS) for report management and delivery.
• Developed stored procedures and triggers for data consistency.
• Leveraged Snowflake data model for external data sharing using Erwin.
• Worked on PL/SQL databases for scripting and database modeling.
• Played a pivotal role in identifying and developing use cases for EDW/DAR applications that align with organizational goals and priorities.
• Utilized business intelligence tools, specifically Power BI, for creating dashboards covering project progress, resource allocation, task completion, and project budgets.
• Environment: Windows Server, MS SQL Server, SSIS, SSAS, SSRS, SQL Profiler, Power BI, C#, SharePoint.
Data Warehouse Developer | CenturyLink, Denver, CO | Feb 2014 - Mar 2016
Responsibilities:
• Developed stored procedures, triggers, and functions for performance enhancement.
• Designed ETL data flows using SSIS for data extraction and transformation.
• Built Cubes and Dimensions using various architectures for Business Intelligence.
• Collaborated with SAS Institute to implement business intelligence solutions, such as Enterprise Data Warehouse (EDW) and Data Analytics and Reporting (DAR) applications.
• Developed dimensional data models using Erwin and implemented slowly changing dimensions (SCD); a generic merge sketch follows this section.
• Developed SSAS Cubes, implemented aggregations, and deployed and processed SSAS objects.
• Created ad hoc reports and performed database queries for Business Intelligence purposes.
• Collaborated effectively in a project-oriented team with excellent communication skills.
• Environment: MS SQL Server, Visual Studio, SSIS, SharePoint, MS Access, Team Foundation Server, Git.
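The SCD work above, sketched generically in PySpark; the table and column names are made up for illustration (the original implementation used Erwin models and SQL Server, not this code):

    # Generic SCD Type 2 sketch; table and column names are illustrative.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("scd2").getOrCreate()

    d = spark.table("dim_customer").filter("is_current = 1").alias("d")
    s = spark.table("stg_customer").alias("s")

    # Rows whose tracked attribute changed since the current dimension version.
    changed = (d.join(s, F.col("d.customer_id") == F.col("s.customer_id"))
                .filter(F.col("d.address") != F.col("s.address")))

    # Close out the old version...
    expired = (changed.select("d.*")
               .withColumn("is_current", F.lit(0))
               .withColumn("end_date", F.current_date()))

    # ...and open a new current version carrying the changed attributes.
    opened = (changed.select("s.*")
              .withColumn("is_current", F.lit(1))
              .withColumn("start_date", F.current_date())
              .withColumn("end_date", F.lit(None).cast("date")))

    # expired and opened would then be unioned with the untouched rows and
    # written back to the dimension (or applied via MERGE where supported).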
Certifications: Microsoft Certified: Azure Data Engineer Associate (H461-5113)