Resume
gshira96@gmail.com | +91 7986144521 | Noida, 201301 (IN)
SUMMARY
5 years of experience in Data Engineering, including:
• Working on data warehousing, data lake, data management & data migration projects involving Big Data.
• Establishing data ingestion pipelines between OLTP source systems such as Oracle ERP and target OLAP systems such as Teradata / Hadoop / ADLS Gen2 using Spark.
• Developing & automating ETL/ELT flows for different data processing scenarios across different data layers.
• Building data models using Microsoft Analysis Services / SAP BO to enable self-serve capabilities.
• Creating interactive BI dashboards to serve data insights to the business.
• Providing production incident support to meet SLAs on time.
• Creating API endpoints that provide data services for ML solutions.

EDUCATION
Punjabi University, Patiala, Punjab
Bachelor of Technology, Electronics & Communication | GPA 6.96 | Jun. 2019
Mount Carmel School, Hoshiarpur, Punjab
12th, Science, 75.20% | Jun. 2015
10th, General, 76.33% | Jun. 2013
TECHNICAL SKILLS
• Hadoop, Teradata & ADLS
• Java, Unix, Python, C#, Flask & Regex
• SQL & HiveQL
• Spark, Sqoop & Databricks
• Oozie & NiFi
• GitHub, Bitbucket, Jenkins & Azure DevOps for CI/CD
• Microsoft Analysis Services
• Power BI
WORK EXPERIENCE
• Software Engineer (Data Engineer)
Microsoft / May 2022 - Present
• Data Engineer I
NCR Corporation Pvt. Ltd. / Jan. 2019 - May 2022
PROJECTS
Dashboard Usage Telemetry
• Technologies: Databricks, Spark, ADLS Gen2, Python, Power BI
• Organization: Microsoft (internal project)
• Team Size: Individual
• Role: Developer
• Description: Surfaces the number of views per dashboard, along with distinct users & groups. Session data was fetched from the Log Analytics API using Python, joined with dashboard metadata fetched from Cosmos DB using Spark, and ingested & processed on ADLS to feed Power BI for visualization.
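The aggregation step of this pipeline can be sketched in plain Python (the record layouts, names, and sample values below are hypothetical, and the production version ran on Spark dataframes rather than lists of dicts):

```python
from collections import defaultdict

# Hypothetical session records, as might be returned by the Log Analytics API
sessions = [
    {"dashboard_id": "d1", "user": "alice", "group": "sales"},
    {"dashboard_id": "d1", "user": "bob",   "group": "sales"},
    {"dashboard_id": "d1", "user": "alice", "group": "sales"},
    {"dashboard_id": "d2", "user": "carol", "group": "hr"},
]

# Hypothetical dashboard metadata, as might be stored in Cosmos DB
metadata = {"d1": "Revenue Overview", "d2": "Headcount"}

def usage_stats(sessions, metadata):
    """Join sessions with metadata and count views, distinct users & groups."""
    stats = defaultdict(lambda: {"views": 0, "users": set(), "groups": set()})
    for s in sessions:
        entry = stats[s["dashboard_id"]]
        entry["views"] += 1            # every session row counts as one view
        entry["users"].add(s["user"])  # sets deduplicate users and groups
        entry["groups"].add(s["group"])
    return {
        metadata[d]: {
            "views": e["views"],
            "distinct_users": len(e["users"]),
            "distinct_groups": len(e["groups"]),
        }
        for d, e in stats.items()
    }

print(usage_stats(sessions, metadata))
```

In the real pipeline the resulting table landed on ADLS and was picked up by Power BI.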
Opportunity Insights
• Technologies: Databricks, Spark, ADLS Gen2, Microsoft Analysis Services, Power BI
• Organization: Microsoft (internal project)
• Team Size: Individual
• Role: Developer
• Description: Shows the number of opportunities won and in progress versus opportunities lost, captures the time taken by the consulting/delivery team to complete a delivery and the packages/services sold to customers, and provides insights on engineers' performance and skills. Technically, opportunity data was ingested into ADLS using Spark on Databricks from a source SQL Server; the final data was fed to an Analysis Services server for data modeling, and a Power BI dashboard was built on top of this data model.
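The core metrics (won/open vs. lost counts and delivery duration) can be illustrated with a small Python sketch; the field names and sample records are hypothetical stand-ins for the real SQL Server schema:

```python
from datetime import date

# Hypothetical opportunity records, as ingested from the source SQL Server
opportunities = [
    {"id": 1, "status": "won",  "opened": date(2023, 1, 2), "delivered": date(2023, 2, 1)},
    {"id": 2, "status": "lost", "opened": date(2023, 1, 5), "delivered": None},
    {"id": 3, "status": "open", "opened": date(2023, 3, 1), "delivered": None},
]

def opportunity_insights(opps):
    """Count won/open vs lost opportunities and average delivery time in days."""
    counts = {"won": 0, "open": 0, "lost": 0}
    durations = []
    for o in opps:
        counts[o["status"]] += 1
        if o["delivered"] is not None:
            # delivery duration = days between opening and delivery
            durations.append((o["delivered"] - o["opened"]).days)
    avg = sum(durations) / len(durations) if durations else None
    return {"counts": counts, "avg_delivery_days": avg}

print(opportunity_insights(opportunities))
```

In the project itself these measures lived in the Analysis Services model and surfaced through Power BI.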
AUTOMATIONS
• Developed a data validation framework using Unix, Python and Spark to validate the data pipeline between Oracle ERP & Hadoop and fix data issues. Given only the database and table names, the code automatically connects to Oracle & Hadoop via a JDBC jar, loads the data into dataframes, performs the comparison, prints summary statistics, and generates Excel reports for rows missing from or extra in the target.
• Built a data sync framework using Unix that syncs production data to lower environments given only the table and database names.
• Created a master script using Unix that automatically creates an Oozie job from HQL scripts supplied in execution order.
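The comparison step of the validation framework can be sketched as follows; this is an illustrative pure-Python version keyed on a hypothetical `id` column, whereas the production code operated on Spark dataframes read over JDBC:

```python
def compare_tables(source_rows, target_rows, key):
    """Compare source and target rows on a key column; report rows missing
    from the target, rows extra in the target, and key-matched mismatches."""
    src = {row[key]: row for row in source_rows}
    tgt = {row[key]: row for row in target_rows}
    missing = [src[k] for k in src.keys() - tgt.keys()]   # in source, not in target
    extra = [tgt[k] for k in tgt.keys() - src.keys()]     # in target, not in source
    mismatched = [k for k in src.keys() & tgt.keys() if src[k] != tgt[k]]
    stats = {"source": len(src), "target": len(tgt),
             "missing": len(missing), "extra": len(extra),
             "mismatched": len(mismatched)}
    return stats, missing, extra

# Hypothetical sample data: row id=2 is missing downstream, id=4 is extra,
# and id=3 disagrees between the two systems.
source = [{"id": 1, "amt": 10}, {"id": 2, "amt": 20}, {"id": 3, "amt": 30}]
target = [{"id": 1, "amt": 10}, {"id": 3, "amt": 99}, {"id": 4, "amt": 40}]
stats, missing, extra = compare_tables(source, target, "id")
print(stats)
```

The framework's final stage wrote the `missing` and `extra` sets out as Excel files for remediation.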
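The Oozie generator's output can be sketched as below: a minimal workflow.xml that chains one Hive action per HQL script in the supplied order. This is a simplified illustration (job-tracker, name-node, and other required action properties are omitted, and all names are hypothetical):

```python
def oozie_workflow(name, hql_scripts):
    """Build a minimal Oozie workflow.xml chaining Hive actions in the
    order the HQL scripts are supplied."""
    parts = [f'<workflow-app name="{name}" xmlns="uri:oozie:workflow:0.5">',
             '  <start to="step_1"/>']
    for i, script in enumerate(hql_scripts, start=1):
        # last action flows to <end/>; earlier ones flow to the next step
        nxt = f"step_{i + 1}" if i < len(hql_scripts) else "end"
        parts += [f'  <action name="step_{i}">',
                  '    <hive xmlns="uri:oozie:hive-action:0.2">',
                  f'      <script>{script}</script>',
                  '    </hive>',
                  f'    <ok to="{nxt}"/>',
                  '    <error to="fail"/>',
                  '  </action>']
    parts += ['  <kill name="fail"><message>Hive action failed</message></kill>',
              '  <end name="end"/>',
              '</workflow-app>']
    return "\n".join(parts)

xml = oozie_workflow("daily_load", ["stage.hql", "transform.hql"])
print(xml)
```

The real master script also handled coordinator scheduling and property files; the point here is only the ordered action chaining.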