Introduction: Tutorials and Final Project

Release Engineering for Machine Learning Applications


(CS4295)

Sebastian Proksch <S.Proksch@tudelft.nl>


Luís Cruz <L.Cruz@tudelft.nl>
REMLA 2022
Agenda

Tutorial classes
Baseline Project
Final Project
Project Outputs
Steering Meetings
Final announcements

REMLA 2022 | Luís Cruz & Sebastian Proksch 2


Tutorials

• 3 tutorial classes. Weeks 3–4.


• Practice concepts taught in the lectures.
• Improve an existing basic ML application (baseline project).
• Focus on a different angle of the application to make it production-ready.
• Docker, Kubernetes
• ML configuration management
• Define Metrics, Instrument App, Logging
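As a rough illustration of the "Define Metrics, Instrument App, Logging" topic, the sketch below counts served predictions in plain Python and renders them in a Prometheus-style text format. The names `PredictionMetrics`, `record`, `render`, the logger name, and the metric name `predictions_total` are hypothetical and not taken from the course material; a real setup would more likely rely on a metrics client library.

```python
import logging
from collections import Counter

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("remla.sketch")  # hypothetical logger name


class PredictionMetrics:
    """Counts predictions per tag (hypothetical helper, not from the course)."""

    def __init__(self) -> None:
        self._per_tag = Counter()

    def record(self, tag: str) -> None:
        # Instrumentation point: call this once per prediction served.
        self._per_tag[tag] += 1
        logger.info("served prediction tag=%s", tag)

    def render(self) -> str:
        # One line per tag, sorted so the output is stable between calls.
        lines = ["# TYPE predictions_total counter"]
        for tag, count in sorted(self._per_tag.items()):
            lines.append(f'predictions_total{{tag="{tag}"}} {count}')
        return "\n".join(lines) + "\n"


metrics = PredictionMetrics()
for tag in ["java", "c#", "java"]:
    metrics.record(tag)
print(metrics.render())
```

The rendered text is what a monitoring system would scrape; the log lines are the per-request trace.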
Baseline Project

• We will give you a basic ML project that you will improve using the content of this course.

• After each class, you should look at the baseline project and see how to apply the concepts from that class to it.



Baseline Project - part 1

ML Project + the items below = ML Application in production

- Pipeline Management
- Automated versioning
- Container build
- Kubernetes setup
- Basic Grafana setup



Baseline Project - part 2
ML Application in production
- Pipeline Management
- Automated data and code versioning
- Container build
- Kubernetes setup
- Basic Grafana setup

Improved ML Application
- Model as artefact
- Inference API
- Shadow deployment
- Automated Testing
- Continuous training
- Drift Detection
- Improved metrics dashboard
- …

Clarification from the lecture: All features in the first list (ML Application in production) must be implemented, but only one feature from the second list (Improved ML Application) is required. The second list is not exhaustive.
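To make one of the listed extensions more concrete, here is a minimal sketch of drift detection on predicted tags. It compares the tag distribution of a reference window with a live window using the total variation distance. The function names, the example windows, and the threshold of 0.2 are illustrative assumptions, not part of the course material.

```python
from __future__ import annotations

from collections import Counter
from typing import Iterable


def tag_distribution(tags: Iterable[str]) -> dict[str, float]:
    """Relative frequency of each tag in a non-empty window of predictions."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}


def total_variation(p: dict[str, float], q: dict[str, float]) -> float:
    """Total variation distance: half the L1 distance between two distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def has_drifted(reference: Iterable[str], live: Iterable[str],
                threshold: float = 0.2) -> bool:
    # The threshold is an arbitrary illustrative value; tune it on real data.
    distance = total_variation(tag_distribution(reference), tag_distribution(live))
    return distance > threshold


# Invented windows: the share of "python" predictions grows from 50% to 80%.
reference = ["java"] * 50 + ["python"] * 50
live = ["java"] * 20 + ["python"] * 80
print(has_drifted(reference, live))
```

A check like this could run periodically and feed an alert or a dashboard panel; it only looks at model outputs, so it does not need ground-truth labels.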



Extension example

[Diagram: the ML Project produces an ML Model, which is served behind a Web API. A Client sends a POST request to the Web API for inference.]
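The inference setup in this example could look roughly like the sketch below, which uses only the Python standard library. The route `/predict`, the JSON fields `question` and `tags`, and the keyword-based `predict_tags` stand-in are all assumptions made for illustration; the stand-in is not the baseline model.

```python
from __future__ import annotations

import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict_tags(question: str) -> list[str]:
    """Stand-in for the trained model: a keyword lookup, NOT the real classifier."""
    keywords = {"java": "java", "c#": "c#", "python": "python"}
    text = question.lower()
    return [tag for word, tag in keywords.items() if word in text]


def handle_predict(raw_body: bytes) -> bytes:
    """Turn a JSON request body into a JSON response body."""
    payload = json.loads(raw_body or b"{}")
    tags = predict_tags(payload.get("question", ""))
    return json.dumps({"tags": tags}).encode("utf-8")


class InferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        if self.path != "/predict":  # hypothetical route
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        body = handle_predict(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port: int = 8080) -> HTTPServer:
    # Binds to localhost only; the port is an arbitrary choice.
    return HTTPServer(("127.0.0.1", port), InferenceHandler)


print(handle_predict(b'{"question": "How do I sort a list in Java?"}').decode())
```

Calling `make_server().serve_forever()` would start the API; keeping the request logic in `handle_predict` makes it testable without opening a socket.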



Extension example
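[Diagram: a Client sends POST requests to a Release Orchestrator (proxy). The orchestrator forwards inference requests to deployments built from different branches (Main branch, Staging branch, Experiment X branch, Experiment Z branch), each consisting of an ML Project, an ML Model, and a Web API. The orchestrator supports Online AB-testing and Shadow AB-testing.]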
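A release orchestrator with shadow mode might be sketched as follows. The client always receives the answer of the main model, while the same request is mirrored to a shadow model whose answer is only recorded. The class name `ReleaseOrchestrator` and the two placeholder models are hypothetical; in a real deployment the models would sit behind separate Web APIs, and the shadow call would typically run asynchronously.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Model = Callable[[str], List[str]]  # question in, tags out


@dataclass
class ReleaseOrchestrator:
    """Proxy that answers from the main model and mirrors traffic to a shadow model."""

    main: Model
    shadow: Model
    disagreements: List[Tuple[str, List[str], List[str]]] = field(default_factory=list)

    def predict(self, question: str) -> List[str]:
        main_tags = self.main(question)
        try:
            shadow_tags = self.shadow(question)
        except Exception:
            # A failing shadow model must never affect the client.
            return main_tags
        if set(shadow_tags) != set(main_tags):
            # Record the disagreement for later analysis; the client never sees it.
            self.disagreements.append((question, main_tags, shadow_tags))
        return main_tags


def main_model(question: str) -> List[str]:
    return ["java"]  # placeholder for the released model


def staging_model(question: str) -> List[str]:
    # Placeholder for the candidate model built from the staging branch.
    return ["python"] if "python" in question.lower() else ["java"]


proxy = ReleaseOrchestrator(main=main_model, shadow=staging_model)
print(proxy.predict("How do I call Python from Java?"))
print(len(proxy.disagreements))
```

The list of disagreements is the raw material for a "shadow mode" monitoring tool: it shows where the candidate would have behaved differently, without exposing users to it.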



Baseline Project

• Dataset of StackOverflow questions (over 100K datapoints)

• ML Task: given a question, assign a tag (e.g., java, c#)

• There is a notebook that already creates a model based on multi-label linear classification.

• Code available here: https://github.com/luiscruz/remla-baseline-project
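To illustrate what multi-label linear classification means for this task, the toy sketch below trains one independent linear classifier (a perceptron) per tag on bag-of-words tokens, so a question can receive several tags. This is not the notebook's code: the class `OneVsRestPerceptron`, the whitespace tokenizer, and the three training questions are invented for illustration.

```python
from __future__ import annotations

from collections import defaultdict


def tokenize(text: str) -> list[str]:
    return text.lower().split()


class OneVsRestPerceptron:
    """One independent linear (perceptron) classifier per tag."""

    def __init__(self, tags: list[str]) -> None:
        self.tags = tags
        self.weights = {tag: defaultdict(float) for tag in tags}
        self.bias = {tag: 0.0 for tag in tags}

    def _score(self, tag: str, tokens: list[str]) -> float:
        # Linear score: bias plus the sum of the weights of the tokens present.
        return self.bias[tag] + sum(self.weights[tag][t] for t in tokens)

    def fit(self, texts: list[str], labels: list[set[str]], epochs: int = 20) -> None:
        for _ in range(epochs):
            for text, true_tags in zip(texts, labels):
                tokens = tokenize(text)
                for tag in self.tags:
                    target = 1 if tag in true_tags else -1
                    predicted = 1 if self._score(tag, tokens) > 0 else -1
                    if predicted != target:
                        # Perceptron update: move the weights towards the target.
                        for t in tokens:
                            self.weights[tag][t] += target
                        self.bias[tag] += target

    def predict(self, text: str) -> list[str]:
        tokens = tokenize(text)
        return [tag for tag in self.tags if self._score(tag, tokens) > 0]


# Invented toy data; the real dataset has over 100K questions.
texts = ["java null pointer exception", "python list comprehension", "call python from java"]
labels = [{"java"}, {"python"}, {"java", "python"}]

model = OneVsRestPerceptron(tags=["java", "python"])
model.fit(texts, labels)
print(model.predict("call python from java"))
```

With so little data the model only fits its training questions; the point is the structure, where each tag has its own weight vector and decision.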



Final Project Description

• Goal: Extend the baseline project to improve it in at least one aspect. E.g., improve and automate the collection of new data.

• Based on the ML application developed in the lab classes, propose a solution that will improve an engineering process of the application.

• Idea should be relevant, novel, and creative.

• The best projects generalise to other ML applications as well. E.g., as a tool, framework, or learning materials.

• It can be specific to Release Engineering. E.g., implement “shadow mode” model releases; create a tool to monitor “shadow mode” models, etc.

• Or specific to ML Engineering. E.g., create a framework that promotes the usage of Scikit-learn Pipelines for both data processing and model training; create a diffing tool for ML artefacts; create a catalog of ML testing examples.
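As one possible starting point for the "diffing tool for ML artefacts" idea, the sketch below compares the metadata of two model versions and reports what was added, removed, or changed. The function `diff_artefacts` and the metadata keys and values are hypothetical; a real tool would also need to handle nested structures and binary artefacts.

```python
from __future__ import annotations

from typing import Any


def diff_artefacts(old: dict[str, Any], new: dict[str, Any]) -> dict[str, dict]:
    """Return added, removed, and changed keys between two flat metadata dicts."""
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = {k: old[k] for k in old.keys() - new.keys()}
    changed = {
        k: (old[k], new[k])
        for k in old.keys() & new.keys()
        if old[k] != new[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Invented metadata for two versions of a model artefact.
v1 = {"penalty": "l1", "C": 10, "train_rows": 100000}
v2 = {"penalty": "l2", "C": 10, "vocabulary_size": 5000}

print(diff_artefacts(v1, v2))
```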



Final Project
• Groups of 4
• ≈4 weeks full time (weeks 5–9)
Clarification from the lecture: “Full time” refers to your available “REMLA time”. There are no other REMLA activities in this period, so you should spend all the time you have for our lecture on the project.

• Each team will have a coach.
• Weekly steering meetings. Feedback and formative assessment.
• Rubrics are currently a work in progress and will be posted online:
• https://se.ewi.tudelft.nl/remla/2022



Project Outputs

Project Codebase
• Solution on top of the lab project.
• Improve Release or ML Engineering processes.
• Publish a tool that helps the community.

Essay
• Clearly explain the underlying engineering problem.
• Explain and motivate the solution.
• We will talk about essay best practices.

Presentation with Q&A
• Live presentation.
• Quick demo.
• Discussion.
• Used to assess differences between teammates.



Steering Meetings

• One of the teachers will meet with each team every week (weeks 5–8).

• Provide feedback.
• Help understand the potential of a given proposal/idea.

• Make sure students are on track and don’t feel lost.



Schedule, Monday to Friday (course week, calendar week: activities)

W1, week 16: Holiday; Intro (Structure, Project)
W2, week 17: Holiday; CDel/CDepl; Testing; Containerization
W3, week 18: Holiday; ML Pipelines; Deployment in Kubernetes; Docker Practice and Kubernetes; Group Registration
W4, week 19: ML Config Management; Monitoring + Continuous Experimentation; Group Assignment; Define Metrics, Instrument App, Logging; Pipeline Extension Proposal
W5, week 20: How to write a paper / How to present; ML Validation (Guest Lecture); Review Current Pipeline
W6, week 21: Holiday; Holiday; Self-Study; First Draft of ToC + Introduction
W7, week 22: Self-Study; Individual Steering Meetings
W8, week 23: Holiday; Individual Steering Meetings
W9, week 24: Presentation
W10, week 25: Essay

Legend: Lecture, Lab, Steering Meetings, Deadline, Lecture date


Final Project - deadlines

• Group registration - May 6 (end of week 3)


• Proposal of extension ideas - May 13 (end of week 4)
• Presentation - June 13 (week 9)
• Final submission (essay and code) - June 21 (week 10)



Communication channels

• Join our Mattermost team: link will be shared next week

• 1st class channel: Mattermost #townsquare
• 2nd class channel: Mattermost DMs
• 3rd class channel: Email
• Once groups are settled, each group creates their own channel w/ profs.



Guest Lectures
Announcement

• We are planning to invite some guest speakers to give a lecture on Data Validation and Code Smell detection for ML applications. (More info soon)

