
Gurpreet Singh Heera

gshira96@gmail.com || +91 7986144521 || Noida, 201301 (IN)

SUMMARY

5 years of experience in Data Engineering, including:
• Working on data warehousing, data lake solutions, data management & data migration projects that involve Big Data.
• Establishing data ingestion pipelines between OLTP source systems like Oracle ERP and target OLAP systems like Teradata / Hadoop / ADLS Gen2 using Spark.
• Developing & automating ETL/ELT flows for different data processing scenarios through different data layers.
• Building data models using Microsoft Analysis Services / SAP BO for enabling self-serve capabilities.
• Creating interactive BI dashboards for serving data insights to business.
• Providing production incident support to meet SLAs on time.
• Creating API endpoints for providing data services for ML solutions.

EDUCATION

Punjabi University, Patiala, Punjab
Bachelor of Technology, Electronics & Comm.
GPA: 6.96 | Jun. 2019

Mount Carmel School, Hoshiarpur, Punjab
12th, Science, 75.20% | Jun. 2015
10th, General, 76.33% | Jun. 2013

TECHNICAL SKILLS
• Hadoop, Teradata & ADLS
• Java, Unix, Python, C#, Flask & Regex
• SQL & HiveQL
• Spark, Sqoop & Databricks
• Oozie & NiFi
• GitHub, Bitbucket, Jenkins & Azure DevOps for CI/CD
• Microsoft Analysis Services
• Power BI

WORK EXPERIENCE
• Software Engineer (Data Engineer)
Microsoft / May 2022 - Present

• Data Engineer I
NCR Corporation Pvt. Ltd. / Jan 2019 - May 2022

PROJECTS
Dashboard Usage Telemetry
• Technologies: Databricks, Spark, ADLS Gen2, Python, Power BI
• Organization: Microsoft (internal project)
• Team Size: Individual
• Role: Developer
• Description: Surfaces the number of views per dashboard, along with distinct users and groups. Session data was fetched from the Log Analytics API using Python and combined with dashboard metadata fetched from Cosmos DB using Spark, then ingested and processed on ADLS to feed Power BI for visualization.
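A minimal sketch of this pipeline, assuming hypothetical values throughout (the workspace ID, KQL query, Cosmos DB account, column names, and ADLS container below are placeholders, not the production ones):

```python
import requests
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# 1. Fetch dashboard session data from the Log Analytics REST API
#    (workspace ID, bearer token, and KQL query are placeholders).
resp = requests.post(
    "https://api.loganalytics.io/v1/workspaces/<workspace-id>/query",
    headers={"Authorization": "Bearer <token>"},
    json={"query": "PageViews | summarize views = count() by DashboardId, UserId"},
)
table = resp.json()["tables"][0]
cols = [c["name"] for c in table["columns"]]
sessions = spark.createDataFrame(table["rows"], cols)

# 2. Fetch dashboard metadata from Cosmos DB via the Spark connector.
meta = (spark.read.format("cosmos.oltp")
        .option("spark.cosmos.accountEndpoint", "<endpoint>")
        .option("spark.cosmos.accountKey", "<key>")
        .option("spark.cosmos.database", "dashboards")
        .option("spark.cosmos.container", "metadata")
        .load())

# 3. Join views with metadata and land the result on ADLS for Power BI.
(sessions.join(meta, sessions.DashboardId == meta.id)
 .write.mode("overwrite")
 .parquet("abfss://telemetry@<account>.dfs.core.windows.net/usage/"))
```
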
Opportunity Insights
• Technologies: Databricks, Spark, ADLS Gen2, Microsoft Analysis Services, Power BI
• Organization: Microsoft (internal project)
• Team Size: Individual
• Role: Developer
• Description: Shows the number of opportunities won and in progress versus opportunities lost; captures the time taken by the consulting/delivery team to complete a delivery and the packages/services sold to customers; and surfaces insights on engineers' performance and skills. Opportunity data was ingested into ADLS from a source SQL Server using Spark on Databricks; the final data was fed to an Analysis Services server for data modeling, and a Power BI dashboard was built on top of this data model.
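A minimal sketch of the ingestion step, with a hypothetical server, table, columns, and storage account; the Analysis Services model and Power BI report sit on top of the data landed here:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Pull opportunity records from the source SQL Server over JDBC
# (URL, credentials, and column names are placeholders).
opportunities = (spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://<server>:1433;database=<db>")
    .option("dbtable", "dbo.Opportunities")
    .option("user", "<user>")
    .option("password", "<password>")
    .load())

# Keep only the fields the model needs and land them on ADLS Gen2.
(opportunities
 .select("OpportunityId", "Status", "OwnerId", "CreatedDate", "ClosedDate")
 .write.mode("overwrite")
 .parquet("abfss://insights@<account>.dfs.core.windows.net/opportunities/"))
```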

Supply Line Management


• Technologies: Hadoop, Spark, Oozie
• Organization: NCR Corporation (internal project)
• Team Size: Individual
• Role: Developer
• Description: Tracks all raw material orders to measure the performance of raw material providers. Data goes through various transformations and processing using HQL scripts and Spark, and finally lands in Hadoop.

Build Purchase Order Repository in Hadoop


• Technologies: Hadoop, HiveQL, Oozie, Unix and Informatica
• Organization: NCR Corporation (internal project)
• Team Size: Individual
• Role: Developer
• Description: Integrates the purchase order pipeline between Oracle ERP and Hadoop. All transactions performed on ERP tables are captured and brought into Hadoop staging tables as raw data via Informatica CDC. Based on this data, records are inserted into, updated in, or deleted from the main tables to keep them in sync with ERP.
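A minimal sketch of the CDC apply step, assuming hypothetical Hive table names and an Informatica-style operation code (I/U/D) plus a change-sequence column in the staging data; the main-table schema is assumed to match the staging schema minus those two columns:

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

target = spark.table("po.purchase_orders")          # main table
staged = spark.table("po.purchase_orders_cdc_stg")  # raw CDC rows

# Keep only the latest CDC record per purchase order.
latest = Window.partitionBy("po_number").orderBy(F.col("change_seq").desc())
changes = (staged.withColumn("rn", F.row_number().over(latest))
           .filter("rn = 1").drop("rn"))

# Untouched target rows plus the latest inserts/updates; rows whose
# final operation was a delete simply drop out of the result.
synced = (target.join(changes.select("po_number"), "po_number", "left_anti")
          .unionByName(changes.filter("op_code IN ('I', 'U')")
                       .drop("op_code", "change_seq")))

# Rewrite the table so it mirrors the current ERP state.
synced.write.mode("overwrite").saveAsTable("po.purchase_orders_synced")
```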

Oracle Cloud Integration


• Technologies: Hadoop, HiveQL, Oozie, Unix and NiFi
• Organization: NCR Corporation (internal project)
• Team Size: 3
• Role: Lead
• Description: Ingests supply and demand data from Oracle Cloud into Hadoop using NiFi. Built and scheduled code for performing full refreshes and incremental loads, and applied analytics on the final dataset to generate reports for better inventory planning and more efficient production. HiveQL is used for transformations and analytics, while Unix and Oozie handle scheduling.
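A minimal sketch of the incremental-load step, assuming hypothetical Hive table names and a last_update_date watermark column; NiFi handles the actual pull from Oracle Cloud into the staging layer:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

# High-water mark of what is already loaded into the final table.
watermark = (spark.table("scm.supply_demand")
             .agg(F.max("last_update_date"))
             .first()[0])

# Take only staged rows newer than the watermark; a full refresh
# simply skips this filter and overwrites instead of appending.
delta = spark.table("scm.supply_demand_stg")
if watermark is not None:
    delta = delta.filter(F.col("last_update_date") > F.lit(watermark))

delta.write.mode("append").saveAsTable("scm.supply_demand")
```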

Order Management & Tracking


• Technologies: Teradata, MLOAD, SQL, Unix
• Organization: NCR Corporation (internal project)
• Team Size: Individual
• Role: Developer
• Description: Built record activation and deactivation logic using an SCD Type 2 design to track and store orders placed on hold and released from hold. Data was ingested from files using the Teradata MLOAD utility and processed using Teradata BTEQ scripts.
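A minimal sketch of the SCD Type 2 activation/deactivation logic; the production version ran as Teradata BTEQ SQL, and the table and column names here (order_id, hold_status, start_date, end_date, is_active) are placeholders:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

history = spark.table("orders_hold_history")
active = history.filter("is_active = 'Y'")
incoming = spark.table("orders_hold_stg")  # today's hold/release events

# Active rows whose hold status changed in the incoming feed.
changed = (active.alias("c").join(incoming.alias("i"), "order_id")
           .filter("c.hold_status <> i.hold_status"))

# Deactivate: close the old version with an end date.
closed = (changed.select("order_id", "c.hold_status", "c.start_date")
          .withColumn("end_date", F.current_date())
          .withColumn("is_active", F.lit("N")))

# Activate: open a new version carrying the new status.
opened = (changed.select("order_id", "i.hold_status")
          .withColumn("start_date", F.current_date())
          .withColumn("end_date", F.lit(None).cast("date"))
          .withColumn("is_active", F.lit("Y")))

# Closed history + unchanged active rows + both new versions.
result = (history.filter("is_active = 'N'")
          .unionByName(active.join(changed.select("order_id"),
                                   "order_id", "left_anti"))
          .unionByName(closed)
          .unionByName(opened))

result.write.mode("overwrite").saveAsTable("orders_hold_history_new")
```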

AUTOMATIONS
• Developed a data validation framework using Unix, Python and Spark to validate the data pipeline between Oracle ERP and Hadoop and fix data issues. Given just the database and table names, the code automatically connects to Oracle and Hadoop using a JDBC jar, loads the data into dataframes, performs a comparison, shows final statistics, and generates Excel files listing the missing and extra data in the target (see the first sketch below).
• Built a data sync framework using Unix to sync production data to the lower environments by just passing the table and database names.
• Created a master script using Unix that automatically creates an Oozie job from HQL scripts passed in their order of execution (see the second sketch below).
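A minimal sketch of the validation framework's core idea, assuming hypothetical JDBC connection details; the real framework takes only the database and table names as arguments:

```python
import sys
from pyspark.sql import SparkSession

db, table = sys.argv[1], sys.argv[2]
spark = SparkSession.builder.enableHiveSupport().getOrCreate()

# Source side: pull the table from Oracle over JDBC.
oracle_df = (spark.read.format("jdbc")
    .option("url", "jdbc:oracle:thin:@//<host>:1521/<service>")
    .option("dbtable", f"{db}.{table}")
    .option("user", "<user>").option("password", "<password>")
    .load())

# Target side: read the same table from Hadoop via the Hive metastore.
hadoop_df = spark.table(f"{db}.{table}")

# Rows present on one side but not the other.
missing = oracle_df.exceptAll(hadoop_df)  # in Oracle, absent in Hadoop
extra = hadoop_df.exceptAll(oracle_df)    # in Hadoop, absent in Oracle

print(f"source={oracle_df.count()} target={hadoop_df.count()} "
      f"missing={missing.count()} extra={extra.count()}")

# Export mismatches to Excel for review (pandas needs openpyxl here).
missing.toPandas().to_excel(f"{table}_missing.xlsx", index=False)
extra.toPandas().to_excel(f"{table}_extra.xlsx", index=False)
```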
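A minimal sketch of the Oozie job generator; the production version was a Unix shell script, and this Python rendering of the same idea turns each HQL script into a chained Hive action in a generated workflow.xml:

```python
import sys

scripts = sys.argv[1:]  # HQL scripts in execution order

actions = []
for i, script in enumerate(scripts):
    nxt = f"step_{i + 1}" if i + 1 < len(scripts) else "end"
    actions.append(f"""
    <action name="step_{i}">
        <hive xmlns="uri:oozie:hive-action:0.2">
            <job-tracker>${{jobTracker}}</job-tracker>
            <name-node>${{nameNode}}</name-node>
            <script>{script}</script>
        </hive>
        <ok to="{nxt}"/>
        <error to="fail"/>
    </action>""")

workflow = f"""<workflow-app name="generated-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="step_0"/>{''.join(actions)}
    <kill name="fail">
        <message>Failed at ${{wf:errorMessage(wf:lastErrorNode())}}</message>
    </kill>
    <end name="end"/>
</workflow-app>"""

with open("workflow.xml", "w") as f:
    f.write(workflow)
```

Usage would look like, for example: python make_oozie.py load_stage.hql transform.hql publish.hql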
