Nandana Reddy
Sr. Data Engineer
Technology Summary
Expert in working with big data distributions like Cloudera, Hortonworks and MapR.
Experience in implementing bi-directional batch data ingestion flows using Apache Sqoop.
Expert in building real-time ingestion flows into HDFS and various databases using Flume and Kafka.
Expert in handling optimized data formats like ORC, Parquet, Avro and SequenceFile.
Implemented real-time JSON and Avro streaming into dynamic Hive tables using Kafka.
Worked on implementing various ETL transformations using MapReduce and Pig.
Implemented optimized data pipelines in Hive and Spark for various data transformations.
Extensively worked on Hive optimization techniques, like partitioning and bucketing, to keep jobs stable and performant.
Developed efficient, scalable Spark applications in Python and Scala for ETL purposes (a minimal sketch follows this summary).
Implemented various Spark optimization techniques, such as memory tuning.
Knowledge of using Spark machine learning libraries for data exploration and prediction.
Automated end-to-end jobs on on-prem Hadoop clusters using Apache Oozie and cron.
Implemented automation pipelines in Airflow to orchestrate multiple development platforms.
Developed various integration pipelines using Apache NiFi and Talend.
Developed ingestion pipelines from various RDBMS sources to HDFS and Hive using Talend.
Expert in setting up data lake ingestion in Hive and HBase for historical and incremental loads.
Experience working with NoSQL platforms like HBase, MongoDB and Cassandra.
Experience working in multiple cloud environments like AWS, Azure and GCP.
Expert in working with AWS tools like S3, RDS, Redshift, ElastiCache and DynamoDB.
Implemented various Python notebooks in Azure Databricks for analytics.
Experience ingesting data from Event Hub into Azure SQL for analysis.
Expert in handling Azure tools like Azure Data Factory, Azure Stream Analytics, Azure HDInsight
and Cosmos DB for implementing end-to-end data pipelines.
Exposure to GCP tools like BigQuery, Cloud SQL, Pub/Sub, GCS and Dataproc.
Expert in writing automation scripts in languages like Python and Bash.
Experience working with BI tools like Tableau, QlikView, Domo and Power BI.
Expert in implementing code coverage for application development using Sonar.
Experience working with CI/CD tools like Jenkins, Drone CI, TeamCity and Travis CI.
Working knowledge of containerization and orchestration tools like Docker and Kubernetes.
Developed multiple application modules as Java microservices.
Experience working with build tools like Maven, Gradle and SBT.
Implemented various applications and data pipelines using IDEs like IntelliJ and Eclipse.
Exposure to application development with tools like Snowflake, Druid and Superset.
Exposed application metrics and logs using tools like Kibana and Grafana.
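For illustration, a minimal sketch of the Spark ETL pattern referenced above, assuming PySpark with Hive support enabled; the job, table, column and path names are hypothetical:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # Minimal PySpark ETL sketch: read raw ORC data, aggregate it, and write
    # a date-partitioned Hive table so downstream queries can prune partitions.
    spark = (SparkSession.builder
             .appName("orders-etl")              # illustrative job name
             .enableHiveSupport()
             .getOrCreate())

    raw = spark.read.orc("/data/raw/orders")     # hypothetical HDFS path

    daily = (raw
             .withColumn("order_date", F.to_date("order_ts"))
             .groupBy("order_date", "region")
             .agg(F.sum("amount").alias("total_amount")))

    (daily.write
          .mode("overwrite")
          .partitionBy("order_date")
          .format("orc")
          .saveAsTable("analytics.daily_orders"))

Partitioning by date is one of the Hive optimization techniques mentioned above: queries filtering on order_date scan only the matching partitions.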
Education
Responsibilities
Experience migrating existing legacy applications to optimized Spark data pipelines in Scala and
Python, built for testability and observability.
Experience creating scalable real-time applications for ingesting clickstream data using Kafka
Streams and Spark Streaming (sketched after this section's technology list).
Developed and tuned optimized ETL operations in Hive and Spark scripts.
Worked on Talend integrations to ingest data from various sources into the data lake.
Developed an MVP moving trading data to Snowflake to assess usage and migration benefits.
Implemented cloud integrations with AWS and Azure storage buckets to set up bi-directional
data flows for data migrations.
Automated end-to-end jobs using Oozie on the on-prem cluster and Airflow in the cloud.
Extensively built automations using shell scripts and Python.
Created Jupyter notebooks with PySpark for extensive in-depth data analysis and exploration.
Developed code coverage and test case integrations using Sonar.
Pushed application and data stream logs to the Kibana server for monitoring and alerting.
Worked on migrating data from HDFS to Azure HDInsight and Azure Databricks.
Implemented various microservice modules to expose data through RESTful APIs.
Developed Jenkins pipelines for continuous integration and deployment.
Technologies: Azure, Azure Data Factory, Azure Databricks, SSIS, Jenkins, Power BI, Event Hub, Hive,
Azure SQL, Airflow, HDFS, Kafka, Spark, Scala, Python
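A hedged sketch of the clickstream ingestion pattern from this role, assuming Spark Structured Streaming reading JSON events from Kafka and landing them as Parquet; the broker address, topic, schema fields and paths are placeholders:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StructType, StringType, TimestampType

    spark = SparkSession.builder.appName("clickstream-ingest").getOrCreate()

    # Hypothetical clickstream event schema.
    schema = (StructType()
              .add("user_id", StringType())
              .add("page", StringType())
              .add("event_ts", TimestampType()))

    # Read the Kafka topic, parse the JSON payload, and flatten the fields.
    events = (spark.readStream
              .format("kafka")
              .option("kafka.bootstrap.servers", "broker:9092")
              .option("subscribe", "clickstream")
              .load()
              .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
              .select("e.*"))

    # Land the stream as Parquet; the checkpoint makes the job restartable.
    query = (events.writeStream
             .format("parquet")
             .option("path", "/data/clickstream")
             .option("checkpointLocation", "/chk/clickstream")
             .start())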
Responsibilities
Experience working on multiple projects and Agile teams across analytics and cloud platforms.
Experience building scalable data pipelines on the Azure cloud platform using various tools.
Developed multiple optimized PySpark applications using Azure Databricks.
Developed Azure Data Factory data pipelines that process data using the Cosmos activity.
Implemented reporting statistics on top of real-time data using Power BI.
Developed ETL solutions using SSIS, Azure Data Factory and Azure Databricks.
Expert in continuous integration and deployment using Jenkins.
Developed real-time ingestion pipelines from Event Hub into various downstream tools.
Experience in building ETL solutions using Hive and Spark with Python and Scala.
Expert in optimizing applications built with tools like Spark and Hive.
Developed a custom message consumer to consume data from the Kafka producer and push the
messages to Service Bus and Event Hub (Azure components); a sketch follows this section's technology list.
Developed real time streaming dashboards in Power BI using Stream Analytics.
Developed job automations across different clusters using the Airflow scheduler.
Worked on Talend integration between the on-prem cluster and Azure SQL for data migrations.
Developed code coverage and test case integrations using Sonar and Mockito.
Technologies: Azure, Azure Data Factory, Azure Databricks, Python, Scala, SSIS, Jenkins, Power BI, Event
Hub, Hive, Azure SQL, Cosmos DB, Airflow
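A minimal sketch of the custom message consumer described above, assuming the kafka-python and azure-eventhub client libraries; the topic, hub name and connection string are placeholders (the same pattern applies to Service Bus with its own client library):

    from kafka import KafkaConsumer                    # kafka-python
    from azure.eventhub import EventHubProducerClient, EventData

    # Consume messages from a Kafka topic...
    consumer = KafkaConsumer("trades",
                             bootstrap_servers="broker:9092",
                             group_id="eventhub-bridge")

    # ...and forward each payload to an Azure Event Hub.
    producer = EventHubProducerClient.from_connection_string(
        "<EVENT_HUB_CONNECTION_STRING>", eventhub_name="trades-hub")

    for msg in consumer:
        batch = producer.create_batch()
        batch.add(EventData(msg.value))    # msg.value is the raw message bytes
        producer.send_batch(batch)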
Responsibilities
Experience developing SFTP and NAS integrations to ingest data into HDFS using Python.
Implemented messaging queues and routes using Apache Camel microservices in Java.
Developed batch ingestion jobs from Teradata to HDFS and Hive using Sqoop.
Developed part of the data lake platform using tools like Kafka, Sqoop, Hive, Spark and Oozie.
Implemented end-to-end job automation in Hadoop using Apache Oozie.
Developed transactional system updates in HBase for data lake implementations.
Developed optimized end-to-end ETL operations using Hive and Spark.
Expert in handling complex data issues and in Spark memory optimization and tuning.
Implemented multiple data pipelines in Apache Spark using Python and Scala.
Developed a real-time streaming application to ingest JSON messages using Apache Kafka.
Implemented data security features for data exposed through API endpoints.
Expert in implementing features using scripting languages like Bash and Python.
Worked on implementing CRUD operations in HBase for multiple applications (a sketch follows this section's technology list).
Handled tickets and service calls raised by end users, providing fast resolutions.
Implemented multiple change requests, including new development work.
Coordinated with the offshore team for timely completion of deliverables.
Technologies: CDH, SFTP, NAS, HDFS, Hive, Spark, Scala, Python, Kafka, Shell, Camel, Microservices, Java,
Teradata, Sqoop
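A brief sketch of the HBase CRUD operations mentioned above, assuming the happybase Python client talking to an HBase Thrift server; the host, table and column names are made up for illustration:

    import happybase

    conn = happybase.Connection("hbase-thrift-host")   # hypothetical host
    table = conn.table("customer_profile")             # hypothetical table

    # Create/update: write cells qualified by column family.
    table.put(b"cust#1001", {b"info:name": b"Alice", b"info:tier": b"gold"})

    # Read: fetch the row back as a dict of cells.
    row = table.row(b"cust#1001")
    print(row[b"info:name"])

    # Delete: remove the row entirely.
    table.delete(b"cust#1001")
    conn.close()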
Responsibilities
Extracted data from Teradata and MySQL into HDFS using Sqoop import/export scripts.
Developed Sqoop jobs with incremental load to populate Hive External tables and HDFS.
Expertise in using MapReduce design patterns to convert business data into custom formats.
Experienced with handling compression codecs like LZO, GZIP, and Snappy.
Expert in optimizing Hive performance using partitioning and bucketing.
Experience using Hive dynamic partitioning to work around the Hive locking mechanism.
Developed UDFs in Java as needed for use in Hive queries.
Involved in indexing Hive data using Solr and preparing custom tokenizer formats for querying.
Involved in designing a real time computation engine using Kafka.
Worked on a PoC to stream Spark data to Solr and perform indexing on it.
Experienced in writing build jobs using Maven and integrating them with Jenkins.
Ingested data from AWS S3 cloud buckets for third-party data exchanges.
Experience building optimized data pipelines on Redshift.
Implemented multiple optimized data pipelines in Apache Spark using Python and Scala.
Automated Hive, MapReduce and Spark applications using Apache Oozie.
Ingested multiple feeds from SFTP locations to HDFS and vice versa.
Developed BI reporting dashboards in Tableau to expose daily data trends.
Developed integration pipelines for exporting and importing data using Talend.
Expert in building Bash, shell and Python scripts for various functionalities.
Developed ingestion pipelines for pulling data from AWS S3 buckets to HDFS for further analytics.
Developed Lambda functions to trigger ETL jobs across AWS tools (a sketch follows this section's technology list).
Created Athena tables over existing CSV data using AWS Glue crawlers.
Also worked on L3 production support for existing products.
Technologies: HDP, HDFS, MapReduce, Hive, Pig, HBase, Solr, AWS, Kafka, Spark, Spark Streaming,
Maven, Oozie, SFTP, Python, Scala, Tableau, Talend
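A hedged sketch of the Lambda-triggered ETL pattern mentioned above, assuming an S3 event source and boto3; the Glue job name and argument key are hypothetical:

    import boto3

    glue = boto3.client("glue")

    # Lambda handler: for each new S3 object in the event, start a Glue ETL
    # job run, passing the object's location as a job argument.
    def handler(event, context):
        for record in event.get("Records", []):
            bucket = record["s3"]["bucket"]["name"]
            key = record["s3"]["object"]["key"]
            glue.start_job_run(
                JobName="daily-etl-job",
                Arguments={"--input_path": f"s3://{bucket}/{key}"})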
Responsibilities