Varma_Resume
Hadoop Developer
Professional Summary
7+ years of experience in software development across a variety of industries, including hands-on
experience in Big Data technologies
3 years of comprehensive experience as a Hadoop Developer in all phases of Hadoop and HDFS
development.
Passionate about working on the most cutting-edge Big Data technologies.
Good knowledge of the Hadoop stack, cluster architecture, and cluster monitoring.
Well versed in developing and implementing MapReduce programs using Hadoop to work with
Big Data.
Hands-on experience with Hadoop, HDFS, MapReduce and the Hadoop ecosystem (Pig, Hive, Oozie, Flume
and Sqoop).
Experience with NoSQL databases like HBase and Cassandra
Experience in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
Experience in Hadoop administration activities such as installation and configuration of clusters using
Apache and Cloudera distributions
Wrote custom UDFs to extend Hive and Pig core functionality (a sketch follows this summary)
Developed Pig UDFs to pre-process data for analysis
Experience in setting up standards and processes for Hadoop-based application design
and implementation.
Experienced in creating Oozie workflows for scheduled (cron-style) jobs.
Created alter, insert and delete queries involving lists, sets and maps in DataStax Cassandra.
Experienced with Java API and REST to access HBase data.
Experience in importing and exporting data with Sqoop between HDFS and relational database systems,
including Teradata.
Experience in Object-Oriented Analysis and Design (OOAD) and development of software using UML
methodology; good knowledge of J2EE design patterns.
Experience in managing Hadoop clusters using Cloudera Manager Tool and Ganglia.
Detailed knowledge and experience in designing, developing and testing software solutions using Java
and J2EE technologies.
Hands-on experience in application development using Java, RDBMS, and Linux shell scripting.
Involved in converting business requirements into System Requirements Specifications (SRS)
Good understanding of XML methodologies (XML, XSL, XSD) including Web Services and SOAP
Proficient with application servers such as WebSphere, WebLogic, JBoss and Tomcat.
Extensive experience with SQL, PL/SQL and database concepts.
Expertise in debugging and performance tuning of Oracle and Java applications, with strong knowledge
of Oracle 11g and SQL
Developed core modules in large cross-platform applications using Java, J2EE, Spring, Struts,
Hibernate, JAX-WS Web Services, and JMS.
Experienced with build tools Maven and Ant, and continuous integration tools like Jenkins.
Developed unit test cases using the JUnit, EasyMock and MRUnit testing frameworks.
Experienced with version control systems like SVN and ClearCase.
Hands-on experience with VPN, PuTTY, WinSCP, VNC Viewer, etc.
Ability to adapt to evolving technology, strong sense of responsibility and accomplishment.
Excellent global exposure to various work cultures and client interaction with diverse teams
Oracle Certified Professional Java Programmer (ID: OC1405206)
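The custom UDF work noted above can be illustrated with a minimal Hive UDF sketch; the class name and its upper-casing behavior are hypothetical, not drawn from a specific project.

// Illustrative Hive UDF: upper-cases a string column. Hypothetical example only.
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class UpperCaseUDF extends UDF {
    // Hive calls evaluate() once per row; a null input yields a null output.
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().toUpperCase());
    }
}

Such a UDF would typically be packaged in a jar, registered with ADD JAR, and exposed through CREATE TEMPORARY FUNCTION before use in HiveQL.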
Technical Skills
Hadoop/Big Data/NoSQL Technologies: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, HBase, Oozie, Avro, Zookeeper, Cassandra
Programming Languages: C, Java (JDK 5/JDK 6), SQL, PL/SQL, Python, Shell Script
Work Experience
General Motors, Detroit, MI Nov 2013 – Current
Sr. Hadoop Developer
This project enables GM dealerships to use a Data API to ingest data into the Hadoop environment so that the business
intelligence team can analyze the social reputation of dealerships.
Responsibilities:
Developed simple and complex MapReduce programs in Java for data analysis on different data formats (see the sketch after this list).
Responsible for Installation and configuration of Hive, Pig, Sqoop, Flume and Oozie on the Hadoop
cluster
Developed workflows using Oozie to automate the tasks of loading the data into HDFS and pre-processing
with Pig
Implemented scripts to transmit sysprin information from Oracle to HBase using Sqoop.
Worked on partitioning Hive tables and running the scripts in parallel to reduce their run time.
Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms
Analyzed the data by performing Hive queries and running Pig scripts
Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and
other sources (see the sketch at the end of this section).
Continuously monitored and managed the Hadoop cluster using Ganglia.
Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as
required.
Exported the analyzed data to relational databases using Sqoop for visualization and to generate
reports for the BI team.
Supported setting up the QA environment and updating configurations for implementing scripts with Pig and
Sqoop.
Implemented testing scripts to support test driven development and continuous integration.
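As a brief illustration of the Java MapReduce work described in the first bullet above, the sketch below counts review mentions per dealer; the comma-delimited layout, field positions and class names are assumptions, not project code.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class DealerMentionCount {

    // Emits (dealerId, 1) per record; assumes the dealer id is the first comma-delimited field.
    public static class MentionMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text dealerId = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",");
            if (fields.length > 0 && !fields[0].isEmpty()) {
                dealerId.set(fields[0]);
                context.write(dealerId, ONE);
            }
        }
    }

    // Sums the per-dealer counts.
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "dealer mention count");
        job.setJarByClass(DealerMentionCount.class);
        job.setMapperClass(MentionMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}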
Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Java, SQL, Ganglia, Sqoop, Flume, Oozie, Maven, Eclipse
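The Pig UDF work mentioned in this section can be sketched as follows; the NormalizeField class and its trim/lower-case behavior are illustrative assumptions, not project code.

import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Illustrative Pig EvalFunc UDF that normalizes a text field during pre-processing.
public class NormalizeField extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        return input.get(0).toString().trim().toLowerCase();
    }
}

In a Pig script such a UDF would be registered with REGISTER and then applied inside a FOREACH ... GENERATE statement.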
The project is aimed at periodically collecting reports from customers. The reports include event logs and
system snapshots, which are stored in HDFS. The data is processed to study correlations between different
types of threats and alert managers.
Responsibilities:
Responsible for loading the customers' data and event logs from MSMQ into HBase using the Java API (see the sketch after this list).
Created HBase tables to store variable data formats of input data coming from different portfolios
Involved in adding huge volumes of data as rows and columns in HBase.
Responsible for architecting Hadoop clusters with CDH4 on CentOS, managing with Cloudera Manager.
Involved in initiating and successfully completing a proof of concept on Flume for pre-processing,
showing increased reliability and ease of scalability over traditional MSMQ.
Used Flume to collect log data from different sources and transfer the data to Hive tables using
different SerDes, storing it in JSON, XML and sequence file formats.
Used Hive to find correlations between customers' browser logs across different sites and analyzed them to
build risk profiles for those sites.
End-to-end performance tuning of Hadoop clusters and Hadoop Map/Reduce routines against very large
data sets.
Developed Pig UDFs to pre-process the data for analysis
Monitored Hadoop cluster job performance and performed capacity planning and managed nodes on
Hadoop cluster.
Proficient in using Cloudera Manager, an end-to-end tool for managing Hadoop operations.
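As an illustration of loading event data into HBase through the Java client API (HBase 0.94-era classes consistent with the CDH4 stack above), the sketch below writes one event row; the table name, column family and row-key layout are hypothetical.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class EventLogWriter {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "customer_events");
        try {
            // Row key combines a customer id and an event timestamp for time-ordered scans.
            Put put = new Put(Bytes.toBytes("cust42#20131101T120000"));
            put.add(Bytes.toBytes("log"), Bytes.toBytes("event_type"), Bytes.toBytes("snapshot"));
            put.add(Bytes.toBytes("log"), Bytes.toBytes("payload"), Bytes.toBytes("{\"disk\":\"ok\"}"));
            table.put(put);
        } finally {
            table.close();
        }
    }
}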
Environment: Hadoop, Big Data, HDFS, Pig, Hive, MapReduce, Sqoop, Cloudera Manager, Linux, CDH4,
Flume, HBase
Merck & Co is an independent practice association (IPA) serving some 300,000 health plan members in
northern California. The company contracts with managed care organizations throughout the region -- including
HMOs belonging to Aetna, CIGNA, and Health Net -- to provide care to health plan members through its
provider affiliates. Its network includes about 3,700 primary care and specialty physicians, 36 hospitals, and 15
urgent care centers. The company also provides administrative services for doctors and patients. PriMed, a
management services organization, created Hill Physicians Medical Group in 1984 and still runs the company.
Responsibilities:
Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for
data cleaning and preprocessing.
Installed and configured Cassandra; gained in-depth knowledge of Cassandra architecture, queries, and read and
write paths.
Processed the source data into structured data and stored it in the NoSQL database Cassandra.
Created alter, insert and delete queries involving lists, sets and maps in DataStax Cassandra (see the sketch after this list).
Experienced in managing and reviewing Hadoop log files.
Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
Tested MapReduce programs using MRUnit (see the test sketch at the end of this section).
Loaded and transformed large sets of structured, semi-structured and unstructured data, including Avro,
sequence and XML files.
Involved in loading data from UNIX file system to HDFS.
Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as
MapReduce jobs.
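The Cassandra collection queries mentioned above can be sketched with the DataStax Java driver; the keyspace, table and column names below are hypothetical, and the driver usage is an assumption made for illustration.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class MemberProfileQueries {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("healthplan");

        // Add a map column to an existing table.
        session.execute("ALTER TABLE members ADD attributes map<text, text>");

        // Insert a row with list, set and map values.
        session.execute("INSERT INTO members (member_id, visits, plans, attributes) "
                + "VALUES (42, ['2013-01-05'], {'HMO'}, {'region': 'north'})");

        // Remove one map entry, then delete the whole row.
        session.execute("DELETE attributes['region'] FROM members WHERE member_id = 42");
        session.execute("DELETE FROM members WHERE member_id = 42");

        // close() applies to driver 2.x and later; older driver versions use shutdown().
        cluster.close();
    }
}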
Environment: Hadoop, Big Data, HDFS, MapReduce, Sqoop, Oozie, Pig, Hive, Flume, Linux, Java,
Eclipse, Cassandra
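The MRUnit testing mentioned above can be illustrated with a small, self-contained test; the inline mapper and its expected output are hypothetical stand-ins rather than project code.

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Test;

public class MemberRecordMapperTest {

    // Tiny illustrative mapper: emits (memberId, 1) for each comma-delimited record.
    public static class MemberRecordMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",");
            context.write(new Text(fields[0]), new IntWritable(1));
        }
    }

    @Test
    public void emitsOneCountPerRecord() throws Exception {
        MapDriver<LongWritable, Text, Text, IntWritable> driver =
                MapDriver.newMapDriver(new MemberRecordMapper());
        driver.withInput(new LongWritable(0), new Text("M1001,flu-shot,2012-11-02"))
              .withOutput(new Text("M1001"), new IntWritable(1))
              .runTest();
    }
}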
CAN Insurance provides B2B insurance to its customers. As part of enhancements, we developed ELS
(Enterprise Logging Service) to provide statistics to the support team and implemented a processor to send alerts
to support teams.
Responsibilities:
Developed the application using the Struts framework, which leverages the classical Model-View-Controller (MVC)
architecture; UML diagrams such as use cases, class diagrams, interaction diagrams (sequence and
collaboration) and activity diagrams were used
Gathered business requirements and wrote functional specifications and detailed design documents
Extensively used Core Java, Servlets, JSP and XML
Designed the logical and physical data model, generated DDL scripts, and wrote DML scripts for Oracle
9i database
Implemented the Enterprise Logging Service using JMS and Apache CXF (see the sketch after this list).
Developed unit test cases and used JUnit for unit testing of the application
Implemented Framework Component to consume ELS service.
Involved in designing user screens and validations using HTML, jQuery, Ext JS and JSP as per user
requirements
Implemented JMS producer and consumer using Mule ESB.
Wrote SQL queries, stored procedures, and triggers to perform back-end database operations
Sent email alerts to the support team using BMC msend
Created low-level design documents for the ELS service.
Worked closely with QA, business, and architects to resolve defects quickly and meet
deadlines
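The JMS side of the Enterprise Logging Service described above can be sketched with a plain JMS 1.1 producer; the JNDI names, queue and message payload below are hypothetical, and the CXF front end is omitted.

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.naming.InitialContext;

public class LogEventPublisher {
    public static void main(String[] args) throws Exception {
        // Look up the connection factory and queue configured on the application server.
        InitialContext jndi = new InitialContext();
        ConnectionFactory factory = (ConnectionFactory) jndi.lookup("jms/ConnectionFactory");
        Queue queue = (Queue) jndi.lookup("jms/ELSQueue");

        Connection connection = factory.createConnection();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(queue);
            TextMessage message = session.createTextMessage("{\"level\":\"ERROR\",\"service\":\"claims\"}");
            producer.send(message);
        } finally {
            connection.close();
        }
    }
}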
Environment: Java, Spring Core, Web Services, JMS, JDK, SVN, Maven, Mule ESB, JUnit, WAS 7, jQuery,
Ajax, SAX.
VSoft Technologies Pvt Ltd, Hyderabad, India Aug 2007 – June 2009
Jr Java Developer
VSoft Technologies is a leading provider of core banking and payment solutions. As part of the Coresoft loans team,
we developed modules for ACH clearing and also developed and maintained loan reports for all client
banks.
Responsibilities:
Environment: Java, Servlets, JSP, Hibernate, Junit Testing, Oracle DB, SQL, Jasper Reports, iReport,
Maven, Jenkins.