
Anto Raj Mithun

Contact Number: +1 (214) 608-5299


Email Address: armithun30@gmail.com

PROFESSIONAL SUMMARY
 7 years of strong experience in the IT industry as a Senior Data Engineer including Analysis, Design, Development and
Enhancements of Business intelligence / Analytics systems, Data Warehouses, Data Lakes and Delta Lakes.
 Designed and implemented robust data pipelines in Azure Data Factory (ADF) and Azure Databricks, ensuring compliance requirements were met. Migrated legacy ETL/ELT processes to modern Azure cloud solutions built on Azure Data Lake Storage (ADLS Gen2). Strong working experience building ADF pipelines with Linked Services and Datasets to extract and load data from sources such as Azure SQL, ADLS, Blob Storage, and Azure SQL Data Warehouse.
 Developed and implemented a data lake solution on AWS leveraging EMR, Glue, S3, Lambda and Airflow to centralize and analyze large volumes of structured and unstructured data. Built data pipelines, designed data models, and optimized data storage in Amazon Redshift and S3, improving query performance. Tuned Amazon Redshift configurations to reduce data load times.
 Performed data cleansing, transformation, and loading tasks using Python and SQL. Enhanced data quality by refining ETL processes with Python and AWS Glue, reducing data inaccuracies. Ingested data from Amazon S3 into Azure Blob Storage, processed it in Azure Databricks notebooks, and moved the processed data into Azure SQL Data Warehouse.
 Prepared and transformed (cleaned, sorted, merged, joined) the ingested data in Azure Databricks as Notebook activity steps in Data Factory pipelines. Used Delta Live Tables (DLT), which integrates streaming and batch processing, to build reusable transformations and quality checks. Good knowledge of integrating Databricks with different storage systems and databases.
 Experience with Apache NiFi and Airflow for creating and maintaining data pipelines that process large data sets, with lookups configured for data validation and integrity. Leveraged Python libraries (Airflow and NiPyAPI, the NiFi Python API) to automate workflow deployment, management and tracking.
 Developed reusable Python scripts to parse TXT, CSV, Excel, JSON and XML files, cleanse and transform the data, and load it into data lakes and warehouses (a minimal sketch of this pattern follows this list). Developed ETL pipelines for relational and non-relational databases including MS SQL Server, MySQL, PostgreSQL, Oracle, and Teradata.
 Designed and implemented ETL processes using Talend and SSIS to integrate data from various sources into the data warehouse; transferred data using the Bulk Copy Program (BCP) and SSIS. Experience designing data warehouses, business intelligence, analytics, ETL processes, data mining, data mapping, data conversion and data migration using Talend.
 Strong work experience in the migration of ETL processes from SSIS packages to Talend jobs, ensuring seamless data
integration and optimized performance.
 Experience with the Snowflake cloud data warehouse and AWS S3 buckets for integrating data from multiple source systems, including loading nested JSON data into Snowflake tables. Understanding of Snowflake internals and of integrating Snowflake with other data processing and reporting technologies.
 In-depth knowledge of Data Sharing, Multi-Cluster Warehouses and Time Travel in Snowflake. Participated in the development, improvement and maintenance of Snowflake database applications, with experience managing Snowflake databases, schemas and table structures and configuring Snowpipe.
 Develop and optimize complex SQL queries and stored procedures to ensure efficient data transformation and loading.
Conduct performance tuning and optimization of ETL processes to enhance system performance and reduce processing
time.
 Supported data warehouse and business intelligence initiatives by developing and maintaining ETL workflows.
 Highly skilled software professional with experience in software design, development and integration. Advanced knowledge of Oracle SQL, PL/SQL and MS T-SQL. Developed and optimized complex SQL queries, stored procedures, and database functions to support business operations and reporting needs. Conducted performance tuning and optimization of SQL queries using explain plans to enhance system performance and reduce processing time. Created numerous simple to complex queries involving self-joins, correlated sub-queries, CTEs and XML techniques for diverse business requirements. Tuned and optimized queries by altering database design, analyzing query options, and applying indexing strategies. Created shell scripts to invoke SQL scripts and scheduled them with crontab.
 Hands-on experience implementing RICE components (Oracle Reports/Forms, interfaces, conversions and extensions), workflows, OBIEE, Oracle Alerts, lookups, and XML Publisher reports.
 Strong working knowledge of CI/CD; used GitHub for version control.
 Worked in Agile Scrum, estimating deliverables with story points. Used the Jira ticketing tool to track and manage epics, user stories and issues.
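The reusable file-parsing pattern mentioned above (TXT/CSV/Excel/JSON parsed with Python and loaded into a warehouse) can be illustrated with a minimal sketch. The table name, landing folder, and connection string below are hypothetical placeholders rather than the actual project code, and the sketch assumes pandas and SQLAlchemy with a PostgreSQL target.

    # Minimal sketch of a reusable parse-cleanse-load script (hypothetical names/paths).
    import json
    from pathlib import Path

    import pandas as pd
    from sqlalchemy import create_engine


    def read_any(path: Path) -> pd.DataFrame:
        """Parse TXT/CSV, Excel, or JSON files into a DataFrame based on the extension."""
        suffix = path.suffix.lower()
        if suffix in {".csv", ".txt"}:
            return pd.read_csv(path)
        if suffix in {".xls", ".xlsx"}:
            return pd.read_excel(path)
        if suffix == ".json":
            return pd.json_normalize(json.loads(path.read_text()))
        raise ValueError(f"Unsupported file type: {path}")


    def cleanse(df: pd.DataFrame) -> pd.DataFrame:
        """Basic cleansing: normalize column names, drop empty rows, remove duplicates."""
        df = df.rename(columns=lambda c: str(c).strip().lower().replace(" ", "_"))
        return df.dropna(how="all").drop_duplicates()


    if __name__ == "__main__":
        engine = create_engine("postgresql+psycopg2://etl_user:secret@db-host/dw")  # placeholder DSN
        for file in sorted(Path("/data/inbound").glob("*.*")):  # illustrative landing folder
            frame = cleanse(read_any(file))
            # Append into a hypothetical staging table; real jobs would add audit columns.
            frame.to_sql("customer_feed", engine, schema="staging", if_exists="append", index=False)

A production version of such a script would add logging, schema validation, and XML support, but the dispatch-on-extension structure stays the same.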

Education:
 Bachelor of Engineering – Information Technology; Noorul Islam Centre for Higher Education; 2014 to
2018, India

Certifications:

 Complete Python Mastery – Code with Mosh


 Databricks Lakehouse Fundamentals – Databricks
 Databricks for Data Engineering Fundamentals – Databricks

Skillsets:

Specialization: Data Engineering, Data Pipelines, Data Lakes, Data Warehouses, Big Data, Lake House, ETL, Analytics
Data Pipelines: Apache NiFi, Airflow
ETL Tools: Talend, SSIS, SQL Loader, Tableau
Relational & Non-Relational Databases: PostgreSQL, MS SQL Server, Oracle, MySQL, MongoDB
Data Warehouses & Cloud Storage Solutions: Snowflake, AWS Redshift, Hadoop HDFS, ADLS and S3
Cloud Solutions: Microsoft Azure, AWS
Programming Languages: SQL, PL/SQL, Python, Shell Scripting
Version Control: GitHub
Methodologies: Agile and Modified Waterfall
Tracking and Ticketing: JIRA, MS Project

Projects:

Brierley + Partners, Frisco, TX, USA


Aug 2021 to Present
Senior Data Engineer

 Interacted with Business Analysts and end users to gather business requirements.
 Designed and implemented a data lake solution using Apache NiFi, AWS S3, and Redshift to centralize and analyze large
volumes of structured and unstructured data.
 Created NiFi pipelines that deliver text and CSV files to APIs for different services, and used Python (NiPyAPI, the NiFi Python API) to automate workflow deployment, management and tracking. Experience working with Apache NiFi data pipelines to process large data sets and configuring lookups for data validation and integrity.
 Managed large datasets using Python Pandas and performed validations to ensure data quality; conducted performance tuning and optimization of NiFi workflows to enhance system performance and reduce processing time.
 Maintained EC2 instances, RDS instances, and AWS S3 buckets, and integrated MinIO object storage to pull data from external sources for processing.
 Designed and implemented ETL to load data from heterogeneous sources into Oracle target databases.
 Reverse-engineered the reports and identified the data elements (in the source systems), dimensions, facts and measures required for new report enhancements. Experience using RESTful APIs with standard HTTP methods to perform DML operations against a PostgreSQL database; strong knowledge of PostgreSQL database architecture, features, and capabilities.
 Design, development, unit testing, integration, deployment packaging, checkout and scheduling of various components in Azure Data Factory and Azure Databricks across several SDLC cycles, implementing ETL/ELT processes for a very large cloud-based data warehouse; implemented alerts in ADF pipelines to trigger notifications when a pipeline fails.
 Experience with the Snowflake cloud data warehouse and AWS S3 buckets for integrating data from multiple source systems, including loading nested JSON data into Snowflake tables; in-depth knowledge of Snowflake Clone, Time Travel and Multi-Cluster Warehouses.
 Implemented efficient SQL queries in AWS Athena to support data analysis and business intelligence requirements.
 Developed and maintained ETL processes using AWS Glue to extract, transform, and load data from various sources into AWS S3 and Redshift.
 Design and implement data pipelines using Delta Live Tables and Databricks, ensuring high data quality and reliability.
 Used Delta Live Tables to integrate streaming and batch processing, and used DLT's built-in monitoring to gain real-time insight into pipeline health and performance. Created a scalable ETL framework using Delta Live Tables and Delta Lake, reducing ETL development time and using DLT's capabilities for reusable transformations and quality checks (see the sketch after this list).
 Experience ingesting data from Amazon S3 into Azure Blob Storage and processing it with Azure Databricks notebooks.
 Prepared and transformed (cleaned, sorted, merged, joined) the ingested data in Azure Databricks as Notebook activity steps in Data Factory pipelines; good knowledge of integrating Databricks with different storage systems and databases.
 Participated in the development, improvement and maintenance of Snowflake database applications, schemas, data sharing and table structures; in-depth knowledge of configuring Snowpipe and creating table DDL in the Snowflake development database.
 Extensive experience in Designing Data warehouses, Business Intelligence, Analytics, ETL Processes, Data Mining, Data
Mapping, Data conversion, Data Migration using Talend. Implemented data validation and cleansing routines to improve
data quality.
 Used Talend to migrate data from the legacy Oracle database to the product PostgreSQL database, and developed parallel sequence jobs in Talend to extract, transform and load data from different source systems and files into the target database for the new system. Created Talend mappings to populate dimension and fact tables.
 Created and deployed physical objects including custom tables, custom views, stored procedures, and Indexes for the
staging environment.
 Used ETL (SSIS) to develop jobs for extracting, cleaning, transforming and loading data into the data warehouse.
 Experience importing data from Excel spreadsheets, text files, and CSV files into SQL Server databases using SSIS packages. Strong experience designing SSIS packages with tasks such as Execute SQL, Bulk Insert, Data Flow, File System, Send Mail, Execute Package, ActiveX Script, and XML tasks.
 Implemented Event Handlers, Package Configurations, Logging, System, User-defined Variables, and Expressions for SSIS
Packages.
 Experience writing SQL queries, dynamic queries, sub-queries and joins for stored procedures, triggers, user-defined functions, views and cursors. Customized and developed PL/SQL procedures, functions, packages, triggers, tables, views, and materialized views based on client requirements. Used bulk collections, records, tables, and collections (nested tables and arrays) to improve performance and simplify data retrieval by reducing context switching between the SQL and PL/SQL engines. Experience designing and implementing data warehouse schemas for optimal performance and query efficiency.
 Development and maintenance of web applications, using GitHub for version control and collaboration, managing and
organizing repositories, branches, and releases to ensure efficient workflow and version control.
 Involved in fine-tuning existing packages for better performance, providing ongoing support to existing applications, troubleshooting serious errors as they occurred, and troubleshooting database performance issues.
 Experienced in Jira ticketing tool managing different types of issues (stories, tasks, sub-tasks)
 Experienced in Agile Scrum, estimating report deliverables with story points.
 Ability to do performance tuning: query optimization (avoiding loops and correlated subqueries), applying indexes and partition functions.
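As referenced in the Delta Live Tables bullet above, DLT expresses each table as a decorated function and attaches quality rules as expectations. The sketch below is an illustration under assumptions: the source path, table names, and quality rule are invented, and the dlt module plus the implicit spark session are available only inside a Databricks DLT pipeline.

    # Sketch of a two-stage Delta Live Tables pipeline (illustrative names and path).
    import dlt
    from pyspark.sql import functions as F


    @dlt.table(comment="Bronze: raw order events loaded incrementally with Auto Loader.")
    def orders_bronze():
        return (
            spark.readStream.format("cloudFiles")            # Auto Loader for incremental files
            .option("cloudFiles.format", "json")
            .load("abfss://landing@exampleacct.dfs.core.windows.net/orders/")  # placeholder path
        )


    @dlt.table(comment="Silver: cleaned orders with a basic quality check applied.")
    @dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")  # reusable quality rule
    def orders_silver():
        return (
            dlt.read_stream("orders_bronze")
            .withColumn("order_ts", F.to_timestamp("order_ts"))
            .dropDuplicates(["order_id"])
        )

DLT resolves the dependency between the two tables itself and records expectation metrics in the pipeline event log, which is what the monitoring point above refers to.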
Tools & Technologies: Apache NiFi, Talend, SSIS, Python, T-SQL, PL/SQL, Oracle, PostgreSQL, Microsoft Azure, AWS,
DBeaver, PyCharm, Snowflake, Visual Studio, GitHub
Datamorphix.ai, Chennai, India
Jul 2020 to Aug 2021
Data Engineer

 Collaborate with cross-functional teams to gather requirements, design solutions, and deliver features.
 Designed and implemented scalable ETL workflows using Apache NiFi to extract, transform, and load data from various sources into the data warehouse. Used Apache NiFi extensively to move transformed data to AWS S3 buckets and Azure Blob Storage, and conducted performance tuning and optimization of NiFi workflows to enhance system performance and reduce processing time.
 Used Python Pandas to read CSV and Excel files and database data into data frames, process them, and load the data into target systems; developed Python scripts to perform file-check validations and send email notifications to users.
 Experience developing reusable Python scripts to migrate ETL components (config and parameter files) between servers, developing and maintaining backend services in Python, integrating with various third-party APIs, writing Python scripts to parse JSON documents and load the data into the database, and using Python scripts to update database content and manipulate files.
 Developed and maintained ETL processes using AWS Glue to extract, transform, and load data from various sources into AWS S3 and Redshift (a sketch of a typical Glue job follows this list); conducted performance tuning of Redshift clusters to enhance system performance and reduce query execution time, and implemented efficient SQL queries in AWS Athena to support data analysis and business intelligence requirements.
 Developed and maintained data pipelines using Azure Databricks and Delta Lake to process and analyze large datasets; implemented data ingestion pipelines to load data from Azure Data Lake Storage (ADLS) into Azure Databricks. Hands-on experience using Amazon EC2 along with Amazon S3 buckets to upload and retrieve project history.
 Experience in using collections in Python for manipulating and looping through different user-defined objects.
 Created complex SSIS packages using various transformations and tasks such as Sequence Container, Script, For Loop and Foreach Loop Containers, Execute SQL/Package, Send Mail, File System, Conditional Split, Data Conversion, Derived Column, Lookup, Merge Join, Union All, flat file source and destination, OLE DB source and destination, and Excel source and destination.
 Created SSIS packages using appropriate control flow and data flow elements with error handling. Used logging, breakpoints and data viewers for effective package debugging; optimized slow-executing SSIS packages by adjusting control flow and data flow (for example, avoiding blocking transformations and increasing buffer sizes); troubleshot ETL issues in SQL Server Integration Services; and developed data processing scripts for ETL tasks, improving data workflows and reducing processing time.
 Designed and developed Oracle PL/SQL procedures, functions, and database triggers, and was involved in creating and updating packages to meet business requirements; experience writing sub-queries, stored procedures, triggers, cursors, and functions on MySQL and PostgreSQL databases.
 Developed a comprehensive ETL solution using Talend to integrate customer data from various CRM systems into a central data warehouse; managed the migration of on-premises data warehouses to cloud-based solutions using Talend and AWS Redshift, and developed ETL workflows to ensure seamless data transfer and minimal downtime during migration.
 Created records, tables, and collections (nested tables and arrays) to improve query performance by reducing context switching; experience designing and implementing data warehouse schemas for optimal performance and query efficiency. Integrated data from various sources into the data warehouse, ensuring consistency and accuracy, and collaborated on testing SQL queries and database changes, including unit, integration, and performance testing to ensure the reliability and efficiency of SQL code.
 Worked on application development, primarily in Linux environments, and am familiar with common Linux commands.
 Managed version control using Azure Repos and Git, ensuring code quality and maintaining a clean code repository
 Experienced in Jira ticketing tool managing different types of issues (stories, tasks, sub-tasks)
 Experienced in Agile Scrum, estimating report deliverables with story points.
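The AWS Glue bullet above refers to the standard Glue PySpark job structure: resolve job arguments, read a catalog table into a DynamicFrame, transform, and write the result out. The catalog database, table, and bucket names below are assumptions for illustration rather than the project's actual job.

    # Generic AWS Glue PySpark job skeleton (hypothetical catalog/table/bucket names).
    import sys

    from awsglue.context import GlueContext
    from awsglue.dynamicframe import DynamicFrame
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read a source table registered in the Glue Data Catalog.
    source = glue_context.create_dynamic_frame.from_catalog(
        database="sales_db", table_name="raw_orders"
    )

    # Light cleanup using Spark DataFrame semantics, then back to a DynamicFrame.
    cleaned_df = source.toDF().dropDuplicates(["order_id"]).filter("order_total >= 0")
    cleaned = DynamicFrame.fromDF(cleaned_df, glue_context, "cleaned")

    # Write curated Parquet to S3; loading into Redshift would be a separate COPY or
    # Glue connection step in the real pipeline.
    glue_context.write_dynamic_frame.from_options(
        frame=cleaned,
        connection_type="s3",
        connection_options={"path": "s3://example-curated-bucket/orders/"},
        format="parquet",
    )
    job.commit()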

Tools & Technologies: Apache NiFi, Talend, SSIS, Python, PL/SQL, Oracle, PostgreSQL, Microsoft Azure, AWS, DBeaver,
PyCharm

Alliance Healthcare Group, Malaysia


Dec 2019 to Jul 2020
Software Developer

 Design, develop, and maintain interfaces for healthcare claims processing systems, ensuring seamless data exchange and
accurate claims management.
 Strong work experience migrating ETL processes from SSIS packages to Talend jobs while ensuring seamless data integration and optimized performance; conducted a thorough analysis of existing SSIS packages, identified areas for improvement, and implemented equivalent Talend jobs. Participated in the migration of legacy integration processes to Talend, resulting in improved maintainability and scalability.
 Developed ETL workflows to ensure seamless data transfer and minimal downtime during migration.
 Created complex mappings in Talend using tMap, tJoin, tAggregateRow, tDie, tWarn, tLogCatcher, tFileInputMSXML,
toracleinput, toracleoutput, tpostgresqlinput, tpostgresqloutput, tfiledelimited, tfileoutputdelimited,
tmssqloutputbulkexec, tunique, tFlowToIterate, tsort, tFilterRow.
 Used tStatsCatcher, tDie, and tLogRow to create a generic Joblet to store processing stats.
 Created Talend mappings to populate dimension and fact tables, implementing change data capture techniques with slowly growing targets, simple pass-through mappings, and slowly changing dimensions (SCD) Type 1 and Type 2.
 Experience with Talend MDM and performance tuning of mappings; developed comprehensive ETL solutions using Talend to integrate healthcare data from various sources, ensuring compliance with HIPAA standards.
 Assisted in the design and implementation of data models and database schemas.
 Worked with ADF control flow activities such as ForEach, Lookup, Until, Web, Wait, and If Condition, and developed an Azure Data Factory pipeline for processing files received from SFTP locations.
 Working experience with Azure Databricks to organize data processing into notebooks and make the data easy to visualize through dashboards; implemented parameterized ADF pipelines and monitored them.
 Designed and deployed rich graphical visualizations with drill-down and drop-down options and parameters using Tableau (claim adjudication and fraud detection).
 Cleansed dirty data and normalized formatting so data sets are easier to combine, analyze, share, and read in Tableau (a small cleansing sketch follows this list). Built various interactive dashboards for claim fraud detection, manipulated and blended data for dashboard and visualization design, tuned SQL queries to improve performance, identified patterns and meaningful insights through analysis, and resolved data and performance issues related to workbooks and data sources.
 Monitored database performance and data quality, troubleshot and resolved problems, and conducted performance tuning and optimization of interfaces to enhance system performance and reduce processing time.
 Developed PL/SQL triggers and master tables to create primary keys automatically. Created PL/SQL stored procedures,
functions and packages for moving the data from the staging area to the data mart
 Developed and optimized complex SQL queries and stored procedures to ensure efficient data retrieval and processing.
 Created indexes on the tables for faster data retrieval to enhance database performance.
 Involved in data loading using PL/SQL and SQL*Loader calling UNIX scripts to download and manipulate files and performed
SQL, PL/SQL tuning and Application tuning using various tools like EXPLAIN PLAN, SQL*TRACE, TKPROF and AUTOTRACE.
 Extensively involved in using hints to direct the optimizer to choose an optimum query execution plan, used Bulk Collections
for better performance and easy retrieval of data, by reducing context switching between SQL and PL/SQL engines.
 Experience in creating PL/SQL scripts to extract the data from the operational database into simple flat text files using the
UTL_FILE package.
 Help with the analysis of issues raised during QA/UAT/PROD phases and support development teams with performance
tuning and troubleshooting issues.
 Experience in keeping code repositories and versions cataloged within GitHub.
 Attend Stand-ups, Grooming Sessions, Retrospective meetings and update the status of the tasks regularly in Jira.
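The cleansing step called out above (normalizing a claims extract before Tableau reads it) typically amounts to a short script. The file and column names below are invented for illustration and assume pandas; the real claims feed layout was different.

    # Illustrative cleansing pass before publishing a claims extract to Tableau.
    import pandas as pd

    claims = pd.read_csv("claims_raw.csv", dtype=str)  # hypothetical input file

    # Normalize column names so field names stay stable across refreshes.
    claims.columns = [c.strip().lower().replace(" ", "_") for c in claims.columns]

    # Standardize formats Tableau can parse directly: dates, currency, categorical casing.
    claims["service_date"] = pd.to_datetime(claims["service_date"], errors="coerce")
    claims["claim_amount"] = (
        claims["claim_amount"].str.replace(r"[$,]", "", regex=True).astype(float)
    )
    claims["claim_status"] = claims["claim_status"].str.strip().str.upper()

    # Drop exact duplicates and rows missing the claim key, then write the clean extract.
    claims = claims.drop_duplicates().dropna(subset=["claim_id"])
    claims.to_csv("claims_clean.csv", index=False)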
Tools & Technologies: Talend, SSIS, T-SQL, PL/SQL, Oracle, PostgreSQL, SQL*Loader, Microsoft Azure, Tableau, DBeaver,
GitHub

Zen Linen International Pvt Ltd, Chennai, India


Jun 2018 to Dec 2019
Software Developer

 Designed and maintained software applications related to enterprise-level database functionality.


 Key role involvement in writing SQL Queries, Dynamic-queries, sub-queries and joins for generating Stored Procedures, Triggers,
User-defined Functions, Views and Cursors.
 Experienced in database activities like Data Modeling, Database Design, Development, Database creation and Maintenance,
Performance Monitoring and Tuning, Troubleshooting, Normalization and Documentation.
 Created documents and maintained logical & physical data models in compliance with enterprise standards and maintained
corporate metadata definitions for enterprise data stores within a metadata repository.
 Strong knowledge of PostgreSQL database architecture, features, and capabilities; tuned application SQL queries for performance optimization.
 Collaborate with business stakeholders to gather requirements for Oracle Forms and Reports applications.
 Experience in managing Installation and configuration of Oracle Forms/Reports.
 Proficient with Oracle Forms and Reports development tools: designed user interfaces, forms, and reports; developed Oracle Forms modules to capture, validate, and process user input; created various reports using Oracle Report Builder to present data in a structured format; and implemented data validation, error handling, and security mechanisms in applications.
 Perform integration testing to validate the interaction between different modules and systems.
 Prepare deployment packages and installation instructions for deploying Oracle Forms and Reports applications.
 Coordinated with system administrators to deploy applications in development, testing, and production environments, and provided ongoing support and maintenance for deployed applications, including troubleshooting and bug fixing. Responded to user queries and issues promptly to ensure uninterrupted operation, and wrote and optimized in-application SQL statements.
 Experience customizing and maintaining existing Oracle Forms and Reports; very high level of PL/SQL programming (packages, triggers and stored procedures) as part of the product. Used collections and bulk binds to improve performance by minimizing the number of context switches between the PL/SQL and SQL engines.
 Involved in the continuous enhancements and fixing of production problems. Involved in data loading using PL/SQL and
SQL*Loader calling UNIX scripts to download and manipulate files. Performed SQL and PL/SQL tuning and Application tuning using
various tools like EXPLAIN PLAN, SQL*TRACE, TKPROF and AUTOTRACE.
 Extensively involved in using hints to direct the optimizer to choose an optimum query execution plan.
 Partitioned the fact tables and materialized views to enhance the performance. Extensively used bulk collection in PL/SQL objects
for improving performance.
 Ability to do performance tuning: query optimization (avoiding loops and correlated subqueries), applying indexes and partition functions.
 Developed custom screens using Forms 10g based on stored procedures to implement complex business functionality and tab canvases; developed lists of values to populate data under certain conditions based on record groups.
 Experience with unit testing in PL/SQL to ensure the correctness and reliability of database code. Performed root cause analysis on all processes, resolved production issues, and validated all data.
 Created and modified several UNIX Shell Scripts according to the changing needs of the project requirements.
 Developed UNIX shell scripts to monitor crontab jobs and to run stored procedures and functions from UNIX.
 Collaborating with quality assurance (QA) teams to ensure comprehensive test coverage.
 Managing and version-controlling codebase using GitHub.
 Maintaining information about database architecture, relationships, and usage.
 Monitor the performance of the database and ensure optimum performance.
 Collaborating with operations teams to deploy and manage applications.
 Troubleshooting and resolving production issues promptly.
Tools & Technologies: TOAD, Oracle 9i to 11g, SQL Loader, SQL Server, Oracle Forms/Reports Builder, SQL, PL/SQL,
PostgreSQL, Unix Shell Scripting
