Netflix Data Science Interview Question
DATA SCIENCE INTERVIEW QUESTIONS
WHAT IS THE DIFFERENCE BETWEEN BATCH AND ONLINE GRADIENT DESCENT?
In batch gradient descent, the model looks at the entire dataset at
once to calculate the gradient (the direction to minimize error) and
update parameters. This means it calculates the average gradient
across all data points and then makes one update.
Pros: More stable and accurate, since it uses all data at each step.
Cons: Can be slow and memory-intensive, especially with large
datasets, since it needs to process all data at once.
Online Gradient Descent
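A minimal sketch contrasting the two update schemes on a toy one-parameter regression (y = w * x). It assumes the usual meaning of online gradient descent, in which the parameter is updated after each individual example rather than once per pass over the data. The function names, learning rate, and data are illustrative only.

```python
def batch_gd(xs, ys, lr=0.02, epochs=200):
    """One update per pass, using the gradient averaged over all points."""
    w = 0.0
    n = len(xs)
    for _ in range(epochs):
        # Average gradient of the squared error (w*x - y)^2 over the whole dataset
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad
    return w


def online_gd(xs, ys, lr=0.02, epochs=200):
    """One update per example (assumed definition of online gradient descent)."""
    w = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            # Gradient from a single data point, applied immediately
            grad = 2 * (w * x - y) * x
            w -= lr * grad
    return w


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # generated from y = 2x, so the target weight is 2

w_batch = batch_gd(xs, ys)
w_online = online_gd(xs, ys)
```

On this noiseless data both variants approach w = 2. The difference is that the batch version needs the full dataset for every update, while the online version only ever touches one example at a time.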
@karunt
WHAT MAKES RELU AN EFFECTIVE ACTIVATION FUNCTION?
Sparse Activation: ReLU outputs zero for any negative input, so for a
given example only a subset of neurons is active. This makes the
network's activations "sparse", which makes computation cheaper and
can help reduce overfitting.
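A small sketch of the sparsity effect, using made-up pre-activation values: ReLU zeroes every non-positive input, and the share of zeroed outputs gives a rough measure of how sparse the layer is.

```python
def relu(x):
    """ReLU: pass positive values through, output zero otherwise."""
    return max(0.0, x)


# Hypothetical pre-activation values for one layer
pre_activations = [-2.0, -0.5, 0.0, 1.5, 3.0, -1.0]
activations = [relu(v) for v in pre_activations]

# Fraction of units that are switched off for this input
sparsity = sum(1 for a in activations if a == 0.0) / len(activations)
```

Here four of the six units output zero, so only the two positive inputs pass a signal to the next layer.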
EXPLAIN THE ANOVA TEST. (FOLLOW-UP: EXPLAIN THE MEANING OF P-VALUES)
The ANOVA test is used to compare the means of three or more groups
to determine if there is a significant difference among them.
How ANOVA works:
1. Null Hypothesis (H₀): All group means are equal (i.e., any observed
differences are due to random variation).
2. Alternative Hypothesis (H₁): At least one group mean differs
from the others.
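The hypotheses above can be sketched in code. One-way ANOVA compares between-group variance to within-group variance through the F statistic, and the p-value is the probability, if H₀ were true, of seeing an F at least as large as the one observed. As an assumption made to keep the example dependency-free, the p-value below is estimated with a permutation test rather than from the F distribution (which is what a library routine such as SciPy's `scipy.stats.f_oneway` uses). The group data are made up.

```python
import random


def f_statistic(groups):
    """One-way ANOVA F: between-group variance over within-group variance."""
    all_values = [v for g in groups for v in g]
    n, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def permutation_p_value(groups, n_permutations=2000, seed=0):
    """Estimate P(F >= observed | H0) by reshuffling the group labels."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [v for g in groups for v in g]
    sizes = [len(g) for g in groups]
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical measurements for three groups; the third is clearly shifted
group_a = [5.1, 4.9, 5.0, 5.2]
group_b = [5.0, 5.1, 4.8, 5.3]
group_c = [6.9, 7.1, 7.0, 7.2]

f_value = f_statistic([group_a, group_b, group_c])
p_value = permutation_p_value([group_a, group_b, group_c])
```

A small p-value (conventionally below 0.05) means the observed differences would be unlikely if all group means were equal, so H₀ is rejected. It does not say which group differs.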
WHAT ARE THE KEY METRICS YOU WOULD CONSIDER WHEN EVALUATING
THE PERFORMANCE OF A RECOMMENDATION ALGORITHM?
Precision@K and Recall@K: These measure the relevance of the top K
recommended items. Precision@K is the proportion of relevant items in
the top K recommendations, while Recall@K measures the proportion of
all relevant items that appear in the top K.
Hit Rate: Measures how often the recommended list contains at least
one item that the user interacts with or rates highly, indicating that the
model is generating some relevant suggestions.
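A minimal sketch of these three metrics, assuming each user's recommendations are an ordered list and the relevant items are a set. The function names, user IDs, and item IDs are hypothetical.

```python
def precision_at_k(recommended, relevant, k):
    """Share of the top-K recommendations that are relevant."""
    top_k = recommended[:k]
    return sum(1 for item in top_k if item in relevant) / k


def recall_at_k(recommended, relevant, k):
    """Share of all relevant items that appear in the top K."""
    if not relevant:
        return 0.0
    top_k = recommended[:k]
    return sum(1 for item in top_k if item in relevant) / len(relevant)


def hit_rate(recs_per_user, relevant_per_user, k):
    """Share of users whose top-K list contains at least one relevant item."""
    hits = 0
    for user, recs in recs_per_user.items():
        relevant = relevant_per_user.get(user, set())
        if any(item in relevant for item in recs[:k]):
            hits += 1
    return hits / len(recs_per_user)


recommended = ["A", "B", "C", "D", "E"]
relevant = {"B", "E", "F"}

p_at_3 = precision_at_k(recommended, relevant, 3)  # only "B" is relevant in the top 3
r_at_3 = recall_at_k(recommended, relevant, 3)     # 1 of the 3 relevant items found

recs_per_user = {"u1": ["A", "B", "C"], "u2": ["D", "E", "F"]}
relevant_per_user = {"u1": {"B"}, "u2": {"Z"}}
hr_at_3 = hit_rate(recs_per_user, relevant_per_user, 3)  # u1 hits, u2 does not
```

Precision@K and Recall@K are computed per user and usually averaged across users, whereas hit rate is already an aggregate over users.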
HOW WOULD YOU BUILD AND TEST A METRIC TO COMPARE TWO USERS’
RANKED LISTS OF MOVIE/TV SHOW PREFERENCES?
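A few metrics to consider:
Kendall’s Tau: Measures how similarly two lists are ranked by comparing
every pair of items. A pair is concordant if both lists put it in the same
order and discordant otherwise; the number of discordant pairs equals
the number of adjacent swaps needed to convert one list into the other.
It’s a good choice when you care about the relative order of preferences
rather than exact placement.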
A few other things to consider: make sure both lists contain the same
items (and so have the same length), and test the metric by swapping
items in a list to see how much the score changes.
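A minimal sketch of Kendall's Tau for two rankings of the same items with no ties, followed by the kind of swap test described above. The function name and the item labels are hypothetical.

```python
from itertools import combinations


def kendall_tau(list_a, list_b):
    """Kendall's Tau for two rankings of the same items (no ties).

    Returns a value in [-1, 1]: 1 for identical order, -1 for reversed order.
    """
    if len(list_a) < 2 or len(set(list_a)) != len(list_a):
        raise ValueError("need at least two distinct items")
    if len(list_a) != len(list_b) or set(list_a) != set(list_b):
        raise ValueError("both lists must rank exactly the same items")

    rank_a = {item: i for i, item in enumerate(list_a)}
    rank_b = {item: i for i, item in enumerate(list_b)}

    concordant = discordant = 0
    for x, y in combinations(list_a, 2):
        # A positive product means both lists order this pair the same way
        if (rank_a[x] - rank_a[y]) * (rank_b[x] - rank_b[y]) > 0:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


user_1 = ["A", "B", "C", "D"]

tau_same = kendall_tau(user_1, ["A", "B", "C", "D"])      # identical lists
tau_one_swap = kendall_tau(user_1, ["A", "C", "B", "D"])  # one adjacent swap
tau_reversed = kendall_tau(user_1, ["D", "C", "B", "A"])  # fully reversed
```

With four items there are six pairs, so a single adjacent swap makes one pair discordant and lowers the score from 1 to (5 - 1) / 6. Checking that the score drops steadily as more swaps are applied is a simple sanity test for the metric.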