Chapter 3
DevOps is a cultural and technical movement that combines software development (Dev) and IT operations (Ops) to
improve collaboration, efficiency, and the delivery of software products. It aims to shorten the software development
lifecycle (SDLC) while delivering features, updates, and fixes frequently and reliably through automation and
collaboration.
1. Collaboration and Communication: DevOps fosters a collaborative culture between developers and operations
teams, breaking down silos to ensure seamless communication.
2. Automation: Automating repetitive tasks such as testing, deployment, and monitoring enhances productivity
and minimizes human error.
o Continuous Integration (CI): Developers regularly integrate code into a shared
repository, followed by automated testing to detect issues early.
o Continuous Delivery (CD): Ensures that the code is always in a deployable state,
automating deployment pipelines to push code changes into production.
3. Infrastructure as Code (IaC): Treating infrastructure as software enables teams to define and manage
resources (e.g., servers, networks) through code, ensuring consistency and repeatability.
4. Monitoring and Feedback: Continuous monitoring and feedback loops provide insights into system
performance and user behavior, allowing teams to improve software iteratively.
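The core idea behind Infrastructure as Code (consistency and repeatability) can be sketched in plain Python: desired infrastructure is declared as data, and an idempotent "apply" step reconciles the actual state toward it. The resource names and the in-memory state below are purely illustrative; real IaC tools such as Terraform or CloudFormation reconcile against actual cloud APIs.

```python
# Minimal illustration of the Infrastructure-as-Code idea:
# desired state is declared as data, and an idempotent apply()
# reconciles actual state with it. Resources here are hypothetical.

desired = {
    "web-server": {"type": "vm", "size": "small"},
    "db-server": {"type": "vm", "size": "large"},
}

actual = {
    "web-server": {"type": "vm", "size": "small"},
    "old-cache": {"type": "vm", "size": "small"},
}

def apply(desired, actual):
    """Reconcile actual state toward desired state; return the plan."""
    plan = []
    for name, spec in desired.items():
        if name not in actual:
            plan.append(("create", name))
            actual[name] = dict(spec)
        elif actual[name] != spec:
            plan.append(("update", name))
            actual[name] = dict(spec)
    for name in list(actual):
        if name not in desired:
            plan.append(("delete", name))
            del actual[name]
    return plan

plan = apply(desired, actual)
print(plan)                    # [('create', 'db-server'), ('delete', 'old-cache')]
print(apply(desired, actual))  # second run is a no-op: []
```

Because the same declaration always converges to the same state, re-running the apply step is safe, which is what gives IaC its repeatability.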
Benefits of DevOps
1. Faster Time-to-Market:
Automation and streamlined processes allow for faster development, testing, and deployment cycles.
2. Improved Collaboration:
Breaking down silos between development and operations ensures that teams work toward shared goals.
3. Higher Quality Software:
Continuous testing and integration reduce bugs and ensure higher-quality code.
4. Enhanced Scalability:
Automation and IaC make it easier to scale systems up or down based on demand.
5. Better Customer Satisfaction:
Faster delivery of features and fixes leads to better user experiences.
DevOps Toolchain
A DevOps toolchain comprises tools used across the SDLC stages to automate processes and improve efficiency. Some
popular tools include:
1. Version Control:
o Git, GitHub, GitLab, Bitbucket
2. CI/CD:
o Jenkins, CircleCI, TravisCI, GitHub Actions
3. Containerization:
o Docker, Podman
4. Container Orchestration:
o Kubernetes, Docker Swarm
5. Configuration Management:
o Ansible, Puppet, Chef, SaltStack
6. Infrastructure as Code (IaC):
o Terraform, AWS CloudFormation
7. Monitoring and Logging:
o Prometheus, Grafana, Splunk, ELK Stack
Challenges in DevOps
1. Cultural Resistance:
Adopting a DevOps mindset requires significant cultural change, which can face resistance from traditional
teams.
2. Tool Overload:
Choosing and managing the right tools can be overwhelming given the wide variety of options available.
3. Complexity in CI/CD Pipelines:
Building and maintaining robust CI/CD pipelines requires expertise and continuous effort.
4. Security Concerns:
Integrating security into DevOps (DevSecOps) is a challenge that demands additional focus and resources.
5. Scalability:
Scaling DevOps practices across large organizations can be complex.
3.1.2 7 Cs of DevOps
1. Continuous Development
2. Continuous Integration
3. Continuous Testing
4. Continuous Deployment/Continuous Delivery
5. Continuous Monitoring
6. Continuous Feedback
7. Continuous Operations
1. Continuous Development
In Continuous Development, code is written in small, continuous increments rather than
all at once. This is important in DevOps because it improves efficiency: every time a
piece of code is created, it is tested, built, and deployed into production. Continuous
Development raises the standard of the code and streamlines the process of repairing
flaws, vulnerabilities, and defects, freeing developers to concentrate on writing
high-quality code.
2. Continuous Integration
Continuous Integration can be explained mainly in 4 stages in DevOps. They are as follows:
1. Getting the SourceCode from SCM
2. Building the code
3. Code quality review
4. Storing the build artifacts
The stages above form the flow of Continuous Integration, and any tool that suits the
requirement can be used at each stage. A popular combination is GitHub for source code
management (SCM): the developer writes code on a local machine and pushes it to the
remote repository, from which anyone with access can pull or clone it and make the
required changes. Maven then builds the code into the required package (WAR, JAR, or
EAR) and runs the JUnit test cases. SonarQube performs the code quality review,
measuring the quality of the source code and generating a report in HTML or PDF
format. Nexus stores the build artifacts produced by Maven. The whole process is
orchestrated by a Continuous Integration tool such as Jenkins.
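The four CI stages can be sketched as a plain sequence of steps. Each function below is a stand-in for the real tool named above (Git, Maven, SonarQube, Nexus); a CI server such as Jenkins runs them in order and stops at the first failure.

```python
# Illustrative sketch of the four Continuous Integration stages.
# Each function stands in for a real tool; this is not a real
# integration with Git/Maven/SonarQube/Nexus.

def checkout():        # stage 1: get source from SCM (e.g. git clone/pull)
    return "source"

def build(source):     # stage 2: compile and package (e.g. mvn package)
    return f"{source}.jar"

def review(artifact):  # stage 3: code quality review (e.g. SonarQube scan)
    return True        # pretend the quality gate passed

def store(artifact):   # stage 4: store build artifacts (e.g. push to Nexus)
    return f"nexus://releases/{artifact}"

def ci_pipeline():
    src = checkout()
    artifact = build(src)
    if not review(artifact):
        raise RuntimeError("quality gate failed")
    return store(artifact)

print(ci_pipeline())  # nexus://releases/source.jar
```

The point of the structure is the early exit: if the build or the quality gate fails, nothing reaches the artifact store.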
3. Continuous Testing
Any organization can adopt continuous testing alongside the agile and DevOps
methodologies. Depending on the requirements, continuous testing can be performed with
automation testing tools such as Testsigma, Selenium, or LambdaTest. These tools help
test the code, catch problems and code smells early, and test faster and more
intelligently. As an added benefit, the entire process can be automated with a
continuous integration platform such as Jenkins.
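The kind of automated check that runs on every commit can be as simple as the following. The function under test is hypothetical; in practice the suite would be executed by a test runner such as pytest, JUnit, or Selenium inside the CI pipeline.

```python
# A minimal example of an automated test a CI server runs on every
# commit. The apply_discount function is hypothetical.

def apply_discount(price: float, percent: float) -> float:
    """Return price after a percentage discount."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

# Checks like these run automatically after each push; a failure
# stops the pipeline before the bug can reach production.
assert apply_discount(100.0, 20) == 80.0
assert apply_discount(19.99, 0) == 19.99
try:
    apply_discount(10.0, 150)
except ValueError:
    pass
else:
    raise AssertionError("expected ValueError for invalid percent")
print("all tests passed")
```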
5. Continuous Monitoring
6. Continuous Feedback
Once the application is released to the market, end users use it and provide
feedback about its performance and any glitches affecting the user experience.
After collecting feedback from multiple end users, the DevOps team analyzes it and
works with the development team to rectify the mistakes made in that piece of code.
This reduces errors and bugs in what is currently being developed, produces more
effective results for end users, and removes unnecessary steps from the deployment
process. Continuous Feedback can thus increase the performance of the application
and reduce bugs in the code, making the application smooth for end users to use.
7. Continuous Operations
CI/CD is a set of practices that automate the building, testing, and deployment of
applications, ensuring they are always in a releasable state. It promotes faster development
cycles, improved software quality, and reduced time to market. A CI/CD (Continuous
Integration/Continuous Deployment) pipeline automates the software development process,
from code integration and testing (CI) to deployment and delivery (CD). In MLOps, it
streamlines machine learning model development and deployment.
1. Version Control:
o All code changes are committed to a shared version control system, such as
Git. This allows teams to track changes, collaborate, and revert to earlier
versions if necessary.
2. Automated Build:
o Every time code is integrated, the system automatically builds the software,
ensuring the code compiles correctly and dependencies are properly managed.
3. Automated Testing:
o Automated testing is crucial in CI/CD to catch bugs early. Unit tests,
integration tests, and acceptance tests are run automatically after each
commit.
4. Staging Environment:
o In Continuous Delivery, after passing initial tests, the code is deployed to a
staging environment. This environment mirrors production but allows for
final checks and approvals before the code is released to users.
5. Deployment Automation:
o The CD pipeline automates deployment to staging and production
environments, reducing human error and ensuring consistency.
6. Monitoring and Alerts:
o Once deployed to production, the system is monitored for performance issues,
bugs, or failures. If any issues arise, alerts are sent to the team to fix them
promptly.
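The run-then-alert behaviour described in the monitoring step can be sketched as a small stage runner: execute each stage in order, and send a notification on success or on the first failure. The notify function is a stand-in for a real alerting channel (Slack, email, PagerDuty, etc.).

```python
# Sketch of a pipeline runner with the alerting behaviour described
# above. notify() is a stand-in for a real alerting integration.

def notify(message):
    print(f"[alert] {message}")

def run_pipeline(stages):
    """Run (name, step) pairs in order; alert on success or failure."""
    name = "<start>"
    try:
        for name, step in stages:
            print(f"running {name}")
            step()
    except Exception as exc:
        notify(f"pipeline failed in {name}: {exc}")
        return False
    notify("pipeline completed successfully")
    return True

ok = run_pipeline([
    ("build",  lambda: None),
    ("test",   lambda: None),
    ("deploy", lambda: None),
])
```

This mirrors the post-success/post-failure hooks that CI servers such as Jenkins expose declaratively.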
Advantages of CI/CD:
An ETL Pipeline (Extract, Transform, Load) is a process used in data engineering to extract data
from various sources, transform it into a usable format, and load it into a target destination, such
as a database, data warehouse, or data lake. ETL pipelines are fundamental for preparing data for
analytics, reporting, and machine learning applications.
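The three ETL phases can be shown end to end in a few lines. The source rows below are made up, and the target is an in-memory SQLite table; real pipelines would read from files, APIs, or databases and load into a warehouse or data lake.

```python
import sqlite3

# Tiny end-to-end ETL sketch: extract rows from an in-memory
# "source", transform them (clean + type-cast), and load them into
# a SQLite table. All data here is illustrative.

source_rows = [
    {"name": " Alice ", "amount": "120.50"},
    {"name": "Bob",     "amount": "75"},
]

def extract():
    return source_rows                      # E: pull raw records

def transform(rows):
    return [                                # T: clean and type-cast
        (r["name"].strip(), float(r["amount"]))
        for r in rows
    ]

def load(rows, conn):                       # L: write to the target store
    conn.execute("CREATE TABLE sales (name TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

conn = sqlite3.connect(":memory:")
load(transform(extract()), conn)
total = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
print(total)  # 195.5
```

Keeping the three phases as separate functions is what lets real ETL frameworks schedule, retry, and monitor each phase independently.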
3.3.1 ETL Pipeline VS CI/CD Pipeline
An MLOps pipeline is the sequence of steps and tools used to develop, deploy, monitor, and
maintain machine learning models in production. The stages of an MLOps pipeline include:
1. Code and Data Versioning:
Version control for code and data.
Collaboration and traceability.
2. Data Preprocessing:
Cleaning, transforming, and feature engineering.
Ensure data quality.
3. Model Training:
Developing and training machine learning models.
Hyperparameter tuning and cross-validation.
4. Model Evaluation:
Assessing model performance using metrics.
Validation data separation.
5. Model Deployment:
Containerization, orchestration, and API endpoints.
Automating deployment via CI/CD.
6. Model Monitoring:
Continuous tracking of model performance.
Alerts for anomalies and drift.
7. Feedback and Iteration:
Incorporate user feedback into model updates.
Iterate on models for improvement.
An MLOps pipeline ensures a systematic and automated approach to managing machine learning
models throughout their lifecycle.
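The train, evaluate, deploy gate at the heart of the pipeline above can be reduced to a toy example. The "model" here is a simple mean predictor and the registry is a dictionary; real pipelines would use a training framework, a model registry, and a serving platform.

```python
# Toy version of the train -> evaluate -> deploy gate in an MLOps
# pipeline. Model, metric, and registry are deliberately minimal.

train_y = [10.0, 12.0, 11.0, 13.0]   # illustrative training targets
valid_y = [11.0, 12.0]               # held-out validation targets

def train(ys):
    return sum(ys) / len(ys)         # "model" = mean of training targets

def evaluate(model, ys):
    return sum((y - model) ** 2 for y in ys) / len(ys)  # MSE

registry = {}

def deploy(model, version):
    registry[version] = model        # stand-in for a model registry

model = train(train_y)
mse = evaluate(model, valid_y)
if mse < 1.0:                        # quality gate before deployment
    deploy(model, "v1")
print(model, round(mse, 2), "v1" in registry)  # 11.5 0.25 True
```

The evaluation threshold is the important piece: a model that fails validation never reaches the registry, just as a failing build never reaches production in CI/CD.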
Jenkins is an open-source automation server used to streamline and automate various tasks in
software development. It is widely known for enabling Continuous Integration (CI) and
Continuous Delivery/Deployment (CD), helping teams integrate code changes, test them, and
deploy applications efficiently. Jenkins is highly extensible with hundreds of plugins to support
building, deploying, and automating any project.
1. Open-Source: Jenkins is free and has an active community contributing to its continuous
development.
2. Platform Independent: Runs on Windows, macOS, and Linux, and supports all major
development environments.
3. Extensibility: Over 1,800 plugins available for integration with various tools and platforms (e.g.,
Docker, Kubernetes, Git, Maven).
4. Distributed Builds: Supports a controller/agent (formerly called "master/slave")
architecture for distributed builds, enabling parallel execution.
5. Pipeline Support: Facilitates complex workflows through pipelines defined in code.
6. Integration: Supports popular tools like Git, JIRA, Docker, Selenium, and Kubernetes.
Jenkins Pipeline is a suite of plugins that support implementing and integrating continuous
delivery pipelines into Jenkins. It allows you to define the entire build, test, and deployment
process of your applications as code. This approach provides greater flexibility, reusability, and
maintainability compared to traditional Jenkins jobs, which were typically defined through the
UI.
Key Concepts
1. Pipeline: The entire process defined in a Jenkinsfile that describes how the software will
be built, tested, and deployed.
2. Jenkinsfile: A text file that contains the definition of a Jenkins Pipeline and is stored in
the version control system along with the application code.
3. Stages: Logical segments within a pipeline that define different parts of the build process,
such as "Build," "Test," and "Deploy."
4. Steps: Individual tasks that are executed within a stage, such as running a shell
command, invoking another job, or sending notifications.
5. Declarative and Scripted Pipelines: Two types of syntax used to define pipelines:
o Declarative Pipelines: A more structured and easier-to-read syntax.
Recommended for most use cases.
o Scripted Pipelines: A more flexible and powerful syntax, using Groovy, allowing
for complex logic and conditions.
pipeline {
    agent any // Run on any available agent
    stages {
        stage('Build') {
            steps {
                echo 'Building...'
                sh 'make' // Run the build command
            }
        }
        stage('Test') {
            steps {
                echo 'Testing...'
                sh 'make test' // Run the tests
            }
        }
        stage('Deploy') {
            steps {
                echo 'Deploying...'
                sh './deploy.sh' // Run the deployment script
            }
        }
    }
    post {
        success {
            echo 'Pipeline completed successfully!'
        }
        failure {
            echo 'Pipeline failed!'
        }
    }
}
A Jenkins Pipeline is a set of steps to define the workflow of a CI/CD process in code. It
consists of:
1. Declarative Pipeline:
o Easier to use and recommended for beginners.
o Example:
pipeline {
    agent any
    stages {
        stage('Build') {
            steps {
                echo 'Building...'
                sh 'make build'
            }
        }
        stage('Test') {
            steps {
                echo 'Testing...'
                sh 'make test'
            }
        }
        stage('Deploy') {
            steps {
                echo 'Deploying...'
                sh 'make deploy'
            }
        }
    }
}
2. Scripted Pipeline:
o More flexible and powerful.
o Written in Groovy scripting language.
o Example:
node {
    stage('Build') {
        echo 'Building...'
        sh 'make build'
    }
    stage('Test') {
        echo 'Testing...'
        sh 'make test'
    }
    stage('Deploy') {
        echo 'Deploying...'
        sh 'make deploy'
    }
}
Advantages of Jenkins
1. Automation: Simplifies repetitive tasks like testing and deployment.
2. Extensibility: Wide range of plugins for seamless integration.
3. Ease of Use: User-friendly interface and dashboards.
4. Scalability: Distributed builds for large-scale projects.
5. Community Support: Extensive documentation and active community.
Challenges of Jenkins
1. Containerization:
2. Serverless:
3. On-Premises Servers:
4. Cloud Services:
1. Amazon SageMaker:
Provides end-to-end ML model development and hosting on AWS.
3. Google AI Platform:
4. Kubernetes:
5. TensorFlow Serving:
6. Apache Spark:
Choose the packaging and deployment type and platform that best fit your project's
requirements and infrastructure.
Batch processing and stream processing are two fundamental approaches for handling data in the
context of MLOps (Machine Learning Operations). Each has its strengths and weaknesses
depending on the use case and requirements of machine learning workflows. Here’s a
comparison of the two approaches:
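The contrast between the two approaches can be shown on the same data: batch processing computes over the full dataset at once, while stream processing updates a running result one record at a time as events arrive. The event values below are arbitrary.

```python
# Batch vs. stream processing on the same (made-up) event values.

events = [4, 8, 15, 16, 23, 42]

# Batch: all data is available up front; one computation over it.
batch_mean = sum(events) / len(events)

# Stream: records arrive one by one; keep a running count and sum,
# so an up-to-date result is available after every event.
count, total = 0, 0.0
for value in events:              # imagine these arriving over time
    count += 1
    total += value
    stream_mean = total / count
print(batch_mean, stream_mean)    # both 18.0 once the stream is drained
```

In MLOps terms, batch suits periodic retraining over historical data, while streaming suits online feature computation and real-time model monitoring.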