
RITISH KUMAR SAJJA

Email: ritishkumars2021@gmail.com Phone: (657) 204 2529


LinkedIn: linkedin.com/in/ritish-sajja
Summary:

Passionate data scientist with expertise in training, evaluating, and deploying deep learning
models with TensorFlow, Keras, and PyTorch. Experienced in classical machine learning
(XGBoost, random forest) and in handling complex data for deep learning (instance and
semantic segmentation, object detection).

● 9+ years of experience in MLOps, Machine Learning, Deep Learning, NLP,
Convolutional Neural Networks, Recommender Systems, Statistical Analysis, Feature
Importance Analysis, Reinforcement Learning, and deep learning frameworks such as
TensorFlow and PyTorch.
● Extensively worked on Spark using Python and Scala on clusters for computational
analytics, installed on top of Hadoop.
● Demonstrated expertise in implementing monitoring and logging solutions for machine
learning systems, ensuring proactive identification and resolution of issues, and
optimizing performance for enhanced reliability and scalability.
● Advanced statistical skills, including A/B testing, hypothesis testing, experiment
design, ANOVA, and time series modeling with techniques like ARIMA, GARCH,
and ARCH.
● Designed and deployed AI chatbots powered by LLMs (Large Language Models) such as
OpenAI's GPT or Hugging Face models, using Boto3 to manage AWS resources (e.g., S3
for storing embeddings, Lambda for executing workflows) and LangChain for AI-driven
business logic.
● Proficient in containerization technologies such as Docker and orchestration tools like
Kubernetes, enabling scalable and reliable deployment of machine learning applications
across diverse cloud and on-premises environments.
● Knowledgeable in statistical and analytical tools, including R (ggplot2, caret), for
developing predictive and exploratory data analysis solutions.
● Developed a Python package implementing Circumscribed, Inscribed, and Face-
Centered techniques for Central Composite Design, compiling and organizing a wide
range of code functions into a robust and versatile package that is easy for data
analysts to use.
● Built and deployed a multitude of applications utilizing much of the AWS stack
(including EC2, Route 53, S3, RDS, DynamoDB, SQS, IAM, and EMR), focusing on high
availability, fault tolerance, and auto-scaling.
● Proficient in leveraging Terraform for infrastructure as code and integrating it with
GTS Analytics to formulate algorithms, construct models, and build scalable data
reporting systems.
● Skilled in predictive modeling and machine learning algorithms,
including Regression Models, Decision Trees, Random Forests,
Sentiment Analysis, Naïve Bayes, and Clustering techniques.
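The A/B testing and hypothesis-testing experience above can be illustrated with a minimal two-proportion z-test. This is a hedged, stdlib-only sketch; the sample counts are made up for illustration and do not come from the work described:

```python
from math import sqrt, erf

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates (A/B test)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)           # pooled rate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical experiment: 200/1000 conversions in A vs 240/1000 in B
z, p = two_proportion_z_test(200, 1000, 240, 1000)
print(f"z = {z:.3f}, p = {p:.4f}")
```

With these illustrative numbers the lift is significant at the conventional 0.05 level; in practice a library such as `statsmodels` would typically be used instead of hand-rolled math.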

SKILLS:

Cloud Platforms: AWS (EC2, S3, Lambda, Glue, SageMaker, CloudWatch), Azure, GCP

Big Data Technologies: Hadoop, Spark, HBase, Cassandra, MongoDB, Kafka, Hive, MapReduce

Statistical Tools: Jupyter Notebook, Anaconda, R

CI Tools: Jenkins, GitHub

Container Tools: Docker, Kubernetes, ECS, EKS

Machine Learning: scikit-learn, TensorFlow, Keras, Spark MLlib, XGBoost, PySpark, PyTorch

Version Control Tools: Git, GitHub, GitLab, Bitbucket

Algorithms: Ridge/Lasso, Random Forest, Gradient Boosting, KNN, CNN, SVM

Programming Languages: Python, C++, HTML5, MySQL

Monitoring Tools: Splunk, Nagios, CloudWatch, Prometheus, Grafana

Visualization: Tableau, Datadog, Matplotlib, Seaborn, ggplot2, Azure ML Studio, AWS SageMaker

Databases: Microsoft Synapse, Snowflake, Azure Data Factory

PROFESSIONAL EXPERIENCE:
Senior Data Scientist/Machine Learning:
Morgan Stanley, NY June 2022 – Oct 2024
• Deployed Azure IaaS virtual machines and cloud services into secure VNets and
subnets. Managed and optimized CI/CD pipelines using Azure DevOps for seamless
build and release automation.
• Prepared, trained, deployed, and monitored machine learning models using Azure
Pipelines, ensuring robust model performance and addressing data drift issues.
• Designed reproducible training workflows to reduce variability across model iterations.
• Implemented LLM-based chatbots (using OpenAI GPT, Llama 2, and Mistral
models) integrated with JanusGraph and Gremlin queries, enhancing user
interactions and customer service.
• Developed and deployed AI chatbots using Large Language Models like OpenAI’s
GPT and Hugging Face models.
• Leveraged Databricks Asset Bundles (DAB), Delta Lake, and MLflow for scalable
machine learning pipeline management, reducing downtime and increasing efficiency
by 16%.
• Authored Terraform scripts to automate the deployment of Azure cloud services,
creating reusable templates for multi-tier applications and provisioning cloud
infrastructure.
• Set up Jenkins pipelines integrated with tools like GIT, Nexus, SonarQube, Ansible,
and Docker. Configured additional Docker Slave Nodes for Jenkins CI/CD using
custom Docker Images.
• Worked extensively with Docker for containerized deployments, including Docker
Hub, Docker Compose, Docker Weave, and Docker Trusted Registry.
• Implemented container orchestration with Kubernetes, integrating EFK stack
(Elasticsearch, Fluentd, Kibana) for logging and Prometheus with Grafana for
cluster monitoring and alerting.
• Enforced network policies in K3s clusters using Calico CNI and explored CNCF
container runtimes for performance benchmarking.
• Configured and managed tools like Splunk, Nagios, CloudWatch, and the ELK stack
for system monitoring, log analysis, and visualizations.
• Integrated Adobe Analytics, Conviva, and Datadog to monitor platform performance,
generating insights to enhance stability.
• Conducted benchmarking using tools like Sysbench, JMeter, and Apache Bench to
evaluate container and orchestration platform performance.
• Built connectors for databases, APIs, and web scraping tools to integrate real-time
and historical data into LLM workflows.
• Regularly collaborated with business teams to finalize requirements, define model
monitoring metrics, and review implementation plans in an Agile environment.

• Developed Spark applications in Python using PySpark on distributed environments to
load large numbers of CSV files with differing schemas.

• Worked on reading and writing several formats such as JSON, ORC, and Parquet on
HDFS using PySpark.
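The schema-union idea behind loading CSV files with differing schemas can be sketched in pure stdlib Python. This is not the production PySpark code; in PySpark itself the same effect is achieved with `DataFrame.unionByName(allowMissingColumns=True)`:

```python
import csv
import io

def union_by_name(csv_texts):
    """Union rows from CSV sources with differing headers, filling
    missing columns with None (akin to Spark's unionByName with
    allowMissingColumns=True)."""
    rows, columns = [], []
    for text in csv_texts:
        reader = csv.DictReader(io.StringIO(text))
        for col in reader.fieldnames:
            if col not in columns:
                columns.append(col)          # build the union schema
        rows.extend(reader)
    # normalize every row to the full column set
    return [{col: row.get(col) for col in columns} for row in rows]

# Two hypothetical files sharing only the "id" column
a = "id,name\n1,alice\n"
b = "id,age\n2,30\n"
merged = union_by_name([a, b])
print(merged)
```

Rows from each source keep their own values and get `None` for columns they never declared, which mirrors how Spark pads missing columns with nulls.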
MLOps Engineer/Data Scientist:
Capital One, TX March 2020 – June 2022
• Designed and executed end-to-end MLOps pipelines, seamlessly
integrating machine learning models into production environments.
• Established comprehensive training pipelines for machine learning
projects, optimizing resource allocation and enhancing model training
efficiency.
• Automated deployment pipelines to ensure consistent and reliable
model deployment across diverse environments.
• Implemented release pipelines to expedite deployment cycles and
enhance the agility of the development process.
• Collaborated closely with cross-functional teams to gather
requirements and devise scalable ETL solutions tailored to specific
business needs, ensuring alignment with stakeholder expectations.
• Enabled seamless deployment of SSRS reports to SharePoint
libraries, integrating them into SharePoint document libraries for
enhanced management and accessibility within the SharePoint
environment.
• Collaborated closely with data scientists and software engineers to
troubleshoot and resolve integration challenges.
• Implemented continuous integration and continuous deployment
(CI/CD) processes for smooth model deployment and monitoring.
• Utilized tools like Terraform or Ansible to deploy Infrastructure as
Code, automating the provisioning and management of scalable and
reproducible ML infrastructure.
• Conducted RFM (Recency, Frequency, Monetary) analysis to assess customer behaviors
and value, applying K-Means and Hierarchical Clustering for segmentation.
• Built predictive models using Lasso, Ridge, SVR, and XGBoost to forecast Customer
Lifetime Value (CLV).
• Implemented A/B testing and Multi-armed Bandits to enhance customer experience
and optimize metrics like user engagement and conversion rates.
• Developed machine learning models using algorithms like Linear Regression, Naive
Bayes, Random Forests, KNN, PCA, and ensemble techniques (Bagging, Random
Forest, Gradient Boosting, XGBoost, AdaBoost).
• Productionized ML pipelines on Google Cloud Platform (GCP) using Cloud
Composer, BigQuery, GCP Storage Buckets, and Google Vertex AI for automated
model lifecycle management.
• Evaluated models with Cross-Validation, Log Loss, ROC curves, and used AUC for
feature selection and optimization.
• Leveraged Elastic Search and Kibana for data indexing and visualization.
• Utilized SAS for advanced statistical analysis, including ANOVA, predictive
analytics, and business intelligence.
• Integrated Generative AI techniques into predictive models to enhance accuracy and
versatility.
• Connected relational and non-relational databases via GCP services to support ML
workflows, enabling seamless data integration and deployment.
• Designed and developed analytics, machine learning models, Generative AI
applications, and visualizations for insights and operational efficiency, using Python
and Tableau.
• Automated cloud resource management using Terraform to deploy scalable
Infrastructure as Code (IaC) solutions.
• Executed Spark-Kafka integration to ingest and process real-time data streams.
• Developed Data Flow jobs on GCP to facilitate seamless data movement and
transformation.
• Created data quality scripts using SQL and Hive to ensure data accuracy and
integrity.
• Delivered end-to-end ML pipelines encompassing data exploration, feature
engineering, model building, and performance evaluation.
• Worked in Agile environments, collaborating with cross-functional teams for iterative
and adaptive development.
• Leveraged Microsoft Power Apps to develop low-code solutions for business process
optimization and rapid application deployment.

• Developed a framework for converting existing PowerCenter mappings to PySpark.

• Provided guidance to the development team working on PySpark as an ETL platform.
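The first step of an RFM analysis like the one described above is aggregating Recency, Frequency, and Monetary value per customer before clustering. A minimal stdlib sketch with hypothetical transaction data (customer IDs, dates, and amounts are invented for illustration):

```python
from datetime import date

# Hypothetical transactions: (customer_id, order_date, amount)
transactions = [
    ("c1", date(2024, 1, 5), 120.0),
    ("c1", date(2024, 3, 1), 80.0),
    ("c2", date(2023, 6, 10), 40.0),
    ("c3", date(2024, 2, 20), 300.0),
]

def rfm_table(transactions, today):
    """Aggregate Recency / Frequency / Monetary per customer."""
    rfm = {}
    for cust, d, amt in transactions:
        rec, freq, mon = rfm.get(cust, (None, 0, 0.0))
        days = (today - d).days                      # days since this purchase
        rec = days if rec is None else min(rec, days)  # recency = most recent
        rfm[cust] = (rec, freq + 1, mon + amt)
    return rfm

table = rfm_table(transactions, today=date(2024, 4, 1))
print(table)
```

The resulting per-customer (recency, frequency, monetary) tuples are what a clustering step such as K-Means (e.g., scikit-learn's `KMeans`) would then segment.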

Machine Learning/Data Scientist:


Allstate Inc, TX April 2017 – March 2020
● Managed and processed large datasets, handling missing values, creating dummy
variables, and mitigating data noise.
● Used RMSE score, Confusion matrix, ROC, Cross-validation, and A/B testing to
evaluate model performance in both the simulated environment and the real world.
● Engineered and fine-tuned Optical Character Recognition (OCR) systems to extract
meaningful information from documents and images.
● Built classification models to predict customer defaults, utilizing machine learning
techniques to enhance model accuracy.
● Utilized matplotlib and seaborn to create histograms, bar plots, pie charts, scatter
plots, and box plots for assessing data conditions. Designed and delivered complex
reports and dashboards using Tableau for actionable insights.
● Created dashboards and visual reports to interpret and present findings to stakeholders.
● Partnered with cross-functional teams, including Financial Analysts and Data Engineers,
to drive data-driven decisions.
● Implemented diverse algorithms such as Linear Regression, Ridge, Lasso, Elastic Net,
KNN, Decision Trees, SVM, Random Forest, and XGBoost to build predictive
models.
● Applied data mining techniques to uncover new patterns and propose innovative
solutions for business challenges.
● Achieved an 84% accuracy rate in credit card default prediction through advanced
model optimization techniques.
● Evaluated model performance using metrics like F-Score, AUC/ROC, Confusion
Matrix, and RMSE, ensuring robust and reliable predictions.
● Collaborated with Marketing and other departments to enhance customer retention and
enable product deepening strategies. Provided expertise in consumer and small
business behavior score modeling, driving data-driven decision-making.
● Developed data pipelines using Glue, Lambda, Spark, and Python to streamline data
processing and analysis.
● Leveraged Python, Docker, AWS, Airflow, and Spark to generate actionable data
insights and establish model feedback loops for continuous improvement.
● Deployed models as Python packages, REST APIs, and microservices, ensuring
scalability and seamless integration into production using Kubernetes orchestration.
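The evaluation metrics mentioned above (Confusion Matrix, F-Score) can be sketched in plain Python; the labels below are made-up toy data, and in practice a library such as scikit-learn (`sklearn.metrics`) would compute these:

```python
def confusion_matrix(y_true, y_pred):
    """2x2 confusion counts for binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def f_score(tp, fp, fn):
    """F1: harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy labels for illustration only
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
tp, tn, fp, fn = confusion_matrix(y_true, y_pred)
print(tp, tn, fp, fn, round(f_score(tp, fp, fn), 3))
```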

Data Analyst:
SetuServ Informatics, India June 2015 – February 2017

● Worked with requirements analysts and subject matter experts to identify, understand,
and document business needs for the data flow.
● Worked in close coordination with a variety of Financial, Mortgage, and Credit
Consumer Group business teams in gathering the business requirements.
● Worked with Central Distribution Hub (CDH) team in developing strategies to handle
data from EO (Enterprise Originations) to CDH, then CDH to downstream systems.
● Used the export-import feature to carry out day-to-day migrations of various ETL
Informatica objects.
● Worked with Chief Data Architects to slice the data requirements into work streams and
various components.
● Prepared data mappings, logical data models, class diagrams, and ER diagrams, along
with SQL queries and PL/SQL stored procedures to filter data within the Oracle
database.
● Performed development tasks in ETL Informatica, such as job creation using different
stages and debugging.
● Developed and maintained a data dictionary to create metadata reports for both
technical and business purposes.
● Served as a resource for analytical services using SQL Server and TOAD/Oracle.
● Executed SQL and PL/SQL queries using TOAD and SQL Navigator.
● Identified and documented transformation rules and the data sources needed to
populate and maintain data warehouse content.
● Created the many KPIs required for the dashboards in Excel and Cognos using
formulas, variables, standard business object functions, and merges between multiple
universes fetching information from different underlying data targets.
● Developed and implemented basic SQL queries for testing and data validation reports.
Made use of data warehousing for profiling the data available in an existing database.
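SQL-based data-validation checks of the kind described above can be sketched with the stdlib `sqlite3` module as a stand-in for Oracle/SQL Server; the table name, columns, and rows here are hypothetical:

```python
import sqlite3

# In-memory table with deliberately bad rows (hypothetical schema)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loans (id INTEGER, amount REAL, state TEXT)")
conn.executemany(
    "INSERT INTO loans VALUES (?, ?, ?)",
    [(1, 1000.0, "TX"), (2, None, "NY"), (3, 2500.0, None)],
)

# Data-quality check: count rows with missing required fields
bad = conn.execute(
    "SELECT COUNT(*) FROM loans WHERE amount IS NULL OR state IS NULL"
).fetchone()[0]
print(f"{bad} rows failed validation")
```

The same NULL-count pattern ports directly to Oracle or SQL Server; only the connection layer changes.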

Education: Bachelor’s in Computer Science Engineering, GITAM University, 2014
