Resume
gshira96@gmail.com | +91 7986144521 | Noida, 201301 (IN)
SUMMARY
5 years of experience in Data Engineering, including:
• Working on data warehousing, data lake, data management & data migration projects involving Big Data.
• Establishing data ingestion pipelines between OLTP source systems such as Oracle ERP and target OLAP systems such as Teradata / Hadoop / ADLS Gen2 using Spark.
• Developing & automating ETL/ELT flows for different data processing scenarios across different data layers.
• Building data models using Microsoft Analysis Services / SAP BO to enable self-serve capabilities.
• Creating interactive BI dashboards to serve data insights to the business.
• Providing production incident support to meet SLAs on time.
• Creating API endpoints that provide data services for ML solutions.

EDUCATION
Punjabi University, Patiala, Punjab
Bachelor of Technology, Electronics & Communication | GPA 6.96 | Jun. 2019
Mount Carmel School, Hoshiarpur, Punjab
12th, Science, 75.20% | Jun. 2015
10th, General, 76.33% | Jun. 2013
TECHNICAL SKILLS
• Hadoop, Teradata & ADLS
• Java, Unix, Python, C#, Flask & Regex
• SQL & HiveQL
• Spark, Sqoop & Databricks
• Oozie & NiFi
• GitHub, Bitbucket, Jenkins & Azure DevOps for CI/CD
• Microsoft Analysis Services
• Power BI
WORK EXPERIENCE
• Software Engineer (Data Engineer)
Microsoft / May 2022 - Present
• Data Engineer I
NCR Corporation Pvt. Ltd. / Jan. 2019 - May 2022
PROJECTS
Dashboard Usage Telemetry
• Technologies: Databricks, Spark, ADLS Gen2, Python, Power BI
• Organization: Microsoft (internal project)
• Team Size: Individual
• Role: Developer
• Description: Surfaces the number of views per dashboard, along with distinct users & groups. Session data was fetched from the Log Analytics API using Python, joined with dashboard metadata fetched from Cosmos DB using Spark, and ingested & processed on ADLS to feed Power BI for visualization.
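The aggregation step of this pipeline can be sketched in plain Python (the record layouts, names, and sample values below are hypothetical, and the production version ran on Spark dataframes rather than lists of dicts):

```python
from collections import defaultdict

# Hypothetical session records, as might be returned by the Log Analytics API
sessions = [
    {"dashboard_id": "d1", "user": "alice", "group": "sales"},
    {"dashboard_id": "d1", "user": "bob",   "group": "sales"},
    {"dashboard_id": "d1", "user": "alice", "group": "sales"},
    {"dashboard_id": "d2", "user": "carol", "group": "hr"},
]

# Hypothetical dashboard metadata, as might be stored in Cosmos DB
metadata = {"d1": "Revenue Overview", "d2": "Headcount"}

def usage_stats(sessions, metadata):
    """Join sessions with metadata and count views, distinct users & groups."""
    stats = defaultdict(lambda: {"views": 0, "users": set(), "groups": set()})
    for s in sessions:
        entry = stats[s["dashboard_id"]]
        entry["views"] += 1            # every session row counts as one view
        entry["users"].add(s["user"])  # sets deduplicate users and groups
        entry["groups"].add(s["group"])
    return {
        metadata[d]: {
            "views": e["views"],
            "distinct_users": len(e["users"]),
            "distinct_groups": len(e["groups"]),
        }
        for d, e in stats.items()
    }

print(usage_stats(sessions, metadata))
```

In the real pipeline the resulting table landed on ADLS and was picked up by Power BI.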
Opportunity Insights
• Technologies: Databricks, Spark, ADLS Gen2, Microsoft Analysis Services, Power BI
• Organization: Microsoft (internal project)
• Team Size: Individual
• Role: Developer
• Description: Shows the number of opportunities won and in progress versus opportunities lost, captures the time taken by the consulting/delivery team to complete a delivery and the packages/services sold to customers, and provides insights on engineers' performance and skills. Technically, opportunity data was ingested into ADLS using Spark on Databricks from a source SQL Server; the final data was fed to an Analysis Services server for data modeling, and a Power BI dashboard was built on top of this data model.
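The core metrics (won/open vs. lost counts and delivery duration) can be illustrated with a small Python sketch; the field names and sample records are hypothetical stand-ins for the real SQL Server schema:

```python
from datetime import date

# Hypothetical opportunity records, as ingested from the source SQL Server
opportunities = [
    {"id": 1, "status": "won",  "opened": date(2023, 1, 2), "delivered": date(2023, 2, 1)},
    {"id": 2, "status": "lost", "opened": date(2023, 1, 5), "delivered": None},
    {"id": 3, "status": "open", "opened": date(2023, 3, 1), "delivered": None},
]

def opportunity_insights(opps):
    """Count won/open vs lost opportunities and average delivery time in days."""
    counts = {"won": 0, "open": 0, "lost": 0}
    durations = []
    for o in opps:
        counts[o["status"]] += 1
        if o["delivered"] is not None:
            # delivery duration = days between opening and delivery
            durations.append((o["delivered"] - o["opened"]).days)
    avg = sum(durations) / len(durations) if durations else None
    return {"counts": counts, "avg_delivery_days": avg}

print(opportunity_insights(opportunities))
```

In the project itself these measures lived in the Analysis Services model and surfaced through Power BI.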
AUTOMATIONS
• Developed a data validation framework using Unix, Python and Spark to validate the data pipeline between Oracle ERP & Hadoop and fix data issues. Given only the database and table names, the code automatically connects to Oracle & Hadoop via a JDBC jar, loads the data into dataframes, performs the comparison, prints summary statistics, and generates Excel reports for rows missing from or extra in the target.
• Built a data sync framework using Unix that syncs production data to lower environments given only the table and database names.
• Created a master script using Unix that automatically creates an Oozie job from HQL scripts supplied in execution order.
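The comparison step of the validation framework can be sketched as follows; this is an illustrative pure-Python version keyed on a hypothetical `id` column, whereas the production code operated on Spark dataframes read over JDBC:

```python
def compare_tables(source_rows, target_rows, key):
    """Compare source and target rows on a key column; report rows missing
    from the target, rows extra in the target, and key-matched mismatches."""
    src = {row[key]: row for row in source_rows}
    tgt = {row[key]: row for row in target_rows}
    missing = [src[k] for k in src.keys() - tgt.keys()]   # in source, not in target
    extra = [tgt[k] for k in tgt.keys() - src.keys()]     # in target, not in source
    mismatched = [k for k in src.keys() & tgt.keys() if src[k] != tgt[k]]
    stats = {"source": len(src), "target": len(tgt),
             "missing": len(missing), "extra": len(extra),
             "mismatched": len(mismatched)}
    return stats, missing, extra

# Hypothetical sample data: row id=2 is missing downstream, id=4 is extra,
# and id=3 disagrees between the two systems.
source = [{"id": 1, "amt": 10}, {"id": 2, "amt": 20}, {"id": 3, "amt": 30}]
target = [{"id": 1, "amt": 10}, {"id": 3, "amt": 99}, {"id": 4, "amt": 40}]
stats, missing, extra = compare_tables(source, target, "id")
print(stats)
```

The framework's final stage wrote the `missing` and `extra` sets out as Excel files for remediation.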
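The Oozie generator's output can be sketched as below: a minimal workflow.xml that chains one Hive action per HQL script in the supplied order. This is a simplified illustration (job-tracker, name-node, and other required action properties are omitted, and all names are hypothetical):

```python
def oozie_workflow(name, hql_scripts):
    """Build a minimal Oozie workflow.xml chaining Hive actions in the
    order the HQL scripts are supplied."""
    parts = [f'<workflow-app name="{name}" xmlns="uri:oozie:workflow:0.5">',
             '  <start to="step_1"/>']
    for i, script in enumerate(hql_scripts, start=1):
        # last action flows to <end/>; earlier ones flow to the next step
        nxt = f"step_{i + 1}" if i < len(hql_scripts) else "end"
        parts += [f'  <action name="step_{i}">',
                  '    <hive xmlns="uri:oozie:hive-action:0.2">',
                  f'      <script>{script}</script>',
                  '    </hive>',
                  f'    <ok to="{nxt}"/>',
                  '    <error to="fail"/>',
                  '  </action>']
    parts += ['  <kill name="fail"><message>Hive action failed</message></kill>',
              '  <end name="end"/>',
              '</workflow-app>']
    return "\n".join(parts)

xml = oozie_workflow("daily_load", ["stage.hql", "transform.hql"])
print(xml)
```

The real master script also handled coordinator scheduling and property files; the point here is only the ordered action chaining.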