Kantar - Consultant Interview Questions
Analytics consultants provide solutions that improve efficiency and solve company issues. They are
responsible for gathering and analyzing business data, making improvement suggestions, and
increasing revenue levels.
The position of a Data Analyst Consultant is one level higher than the Data Analyst due to the
following reasons:
Skill Set: In addition to the necessary technical knowledge, the Consultant must also have
a high level of Consulting skills, that is, the ability to solve problems, collaborative
communication, and effective presentation skills. Although technical skills can be acquired
academically, other soft skills are acquired through experience and practice only.
Domain Knowledge: A Data Analyst is usually an expert in a specific field and can offer
insights related to that field only. However, someone in the Data Analyst Consulting job
would have command over multiple fields and thus is more equipped to handle complex
challenges that a company faces.
Professional Peers: Data Analysts usually work alongside the Development Teams and the
Testing Teams while Data Analyst Consulting requires spending hours with Senior
Management and various Stakeholders to come up with innovative solutions.
Data Analysis
Data can help businesses better understand their customers, improve their advertising campaigns,
personalize their content and improve their bottom lines. The advantages of data are many, but you
can’t access these benefits of data analytics without the proper data analytics tools and processes.
While raw data has a lot of potential, you need data analytics to unlock the power to grow your
business.
Data analytics refers to the process of examining datasets to draw conclusions about the information
they contain. Data analytic techniques enable you to take raw data and uncover patterns to extract
valuable insights from it. Data Scientists and Analysts use data analytics technology and techniques
in their research, and businesses also use it to inform their decisions. Data analysis can help
companies better understand their customers, evaluate their ad campaigns, personalize content,
create content strategies and develop products. Ultimately, businesses can use data analytics to
boost business performance and improve their bottom line.
1. Improved Decision making: Data analytics eliminates much of the guesswork from planning
marketing campaigns, choosing what content to create, developing products and more. It
gives you a 360-degree view of your customers, which means you understand them more
fully, enabling you to better meet their needs. Plus, with modern data analytics technology,
you can continuously collect and analyze new data to update your understanding as
conditions change.
2. More effective marketing: When you understand your audience better, you can market to
them more effectively. Data analytics also gives you useful insights into how your campaigns
are performing so that you can fine-tune them for optimal outcomes.
3. Better customer service: Data analytics provides you with more insights into your customers,
allowing you to tailor customer service to their needs, provide more personalization and
build stronger relationships with them.
4. More efficient operations: Data analytics can help you streamline your processes, save
money and boost your bottom line. When you have an improved understanding of what
your audience wants, you waste less time on creating ads and content that don’t match your
audience’s interests.
Data mining is the process of analyzing a large batch of information to discern trends and
patterns.
Data mining can be used by corporations for everything from learning about what customers
are interested in or want to buy to fraud detection and spam filtering.
Data mining programs break down patterns and connections in data based on what
information users request or provide.
Data mining uses algorithms and various techniques to convert large collections of data into useful
output. The most popular types of data mining techniques include:
Association rules, also referred to as market basket analysis, search for relationships
between variables. These relationships create additional value within the data set because
they link pieces of data together. For example, association rules would search a
company's sales history to see which products are most commonly purchased together; with
this information, stores can plan, promote, and forecast accordingly.
Decision trees are used to classify or predict an outcome based on a set list of criteria or
decisions. A decision tree is used to ask for input of a series of cascading questions that sort
the dataset based on responses given. Sometimes depicted as a tree-like visual, a decision
tree allows for specific direction and user input when drilling deeper into the data.
K-Nearest Neighbor (KNN) is an algorithm that classifies data based on its proximity to other
data. KNN rests on the assumption that data points that are close to each other are more
similar to each other than to other data points. This non-parametric, supervised
technique is used to predict features of a group based on individual data points.
Neural networks process data through the use of nodes. Each node is made up of inputs,
weights, and an output. Data is mapped through supervised learning, and the nodes are
interconnected in a way loosely similar to the human brain. The model can be fit with
threshold values to determine its accuracy.
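As a rough illustration of the association-rule idea described above, the sketch below counts how often pairs of products appear in the same basket (support) and estimates how often one product is bought given another (confidence). The transactions and the helper names `pair_support` and `confidence` are hypothetical, chosen only for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions; each set is one customer's basket.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
    {"milk", "bread"},
]

def pair_support(baskets):
    """Return the fraction of baskets that contain each unordered item pair."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}

def confidence(baskets, antecedent, consequent):
    """Estimate P(consequent in basket | antecedent in basket)."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)

support = pair_support(transactions)
top_pair = max(support, key=support.get)  # most frequently co-purchased pair
print(top_pair, support[top_pair])
print(confidence(transactions, "bread", "butter"))
```

Real market basket analysis usually relies on a dedicated algorithm such as Apriori to avoid enumerating every pair; the brute-force count here is only meant to show what support and confidence measure.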
To be most effective, data analysts generally follow a certain flow of tasks along the data mining
process. Without this structure, an analyst may encounter an issue in the middle of their analysis
that could have easily been prevented had they prepared for it earlier. The data mining process is
usually broken into the following steps.
Step 1: Understand the Business
Before any data is touched, extracted, cleaned, or analyzed, it is important to understand the
underlying entity and the project at hand. What are the goals the company is trying to achieve by
mining data? What is their current business situation? What are the findings of a SWOT analysis?
Before looking at any data, the mining process starts by understanding what will define success at
the end of the process.
Step 2: Understand the Data
Once the business problem has been clearly defined, it's time to start thinking about data. This
includes what sources are available, how the data will be secured and stored, how information will be
gathered, and what the final outcome or analysis may look like. This step also considers what limits
there are on data, storage, security, and collection, and assesses how these constraints will impact the
data mining process.
Step 3: Prepare the Data
It's now time to get our hands on information. Data is gathered, uploaded, extracted, or calculated.
It is then cleaned, standardized, scrubbed for outliers, assessed for mistakes, and checked for
reasonableness. During this stage of data mining, the data may also be checked for size, as an
overly large collection of information may unnecessarily slow computations and analysis.
Step 4: Build the Model
With our clean data set in hand, it's time to crunch the numbers. Data scientists use the types of
data mining above to search for relationships, trends, associations, or sequential patterns. The data
may also be fed into predictive models to assess how previous bits of information may translate into
future outcomes.
Step 5: Evaluate the Results
The data-centered aspect of data mining concludes by assessing the findings of the data model(s).
The outcomes from the analysis may be aggregated, interpreted, and presented to decision-makers
who have largely been excluded from the data mining process to this point. In this step, organizations
can choose to make decisions based on the findings.
Step 6: Implement Change and Monitor
The data mining process concludes with management taking steps in response to the findings of the
analysis. The company may decide the information was not strong enough or the findings were not
relevant enough to change course. Alternatively, the company may strategically pivot based on the
findings. In either case, management reviews the ultimate impact on the business and starts new data
mining cycles by identifying new business problems or opportunities.
Data mining doesn't always guarantee results. A company may perform statistical analysis, draw
conclusions based on strong data, implement changes, and still not reap any benefits. Because of
inaccurate findings, market changes, model errors, or inappropriate data populations, data mining
can only guide decisions; it cannot ensure outcomes.
Financial analysis of data is important for judging whether a business is stable and profitable
enough to warrant a capital investment. Financial analysts focus their analysis on the balance sheet,
cash flow statement, and income statement.
Data mining techniques have been used to extract hidden patterns and predict future trends and
behaviors in financial markets. Advanced statistical, mathematical and artificial intelligence
techniques are typically required for mining such data, especially the high-frequency financial data.
Data Mining techniques related to finance can be utilized on categories which are given below:
Peak Sales
Stockpile
Classification and clustering of customers for targeted marketing: Data mining and marketing work
together to target a specific market and to support marketing decisions. Data mining helps a business
retain profits and margins and decide which product is best suited to each kind of customer.
Excel Questions
TRUE/1 refers to finding the closest (approximate) match and assumes the table is sorted in
ascending order, whereas FALSE/0 refers to an exact match.
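The difference between the two settings can be sketched outside Excel. The snippet below is a minimal Python imitation, not Excel's actual implementation: the `lookup` function, the score thresholds, and the grade labels are all hypothetical. It assumes the keys are sorted ascending, as approximate match requires.

```python
from bisect import bisect_right

# Hypothetical lookup table, sorted ascending by key.
keys = [0, 50, 70, 90]
grades = ["D", "C", "B", "A"]

def lookup(value, keys, results, approximate=True):
    """Mimic VLOOKUP's last argument: TRUE (approximate) or FALSE (exact)."""
    if approximate:
        # Largest key that is less than or equal to value.
        i = bisect_right(keys, value) - 1
        return results[i] if i >= 0 else None  # Excel would show #N/A
    # Exact match only.
    return results[keys.index(value)] if value in keys else None

print(lookup(75, keys, grades))                     # approximate: falls back to key 70
print(lookup(75, keys, grades, approximate=False))  # exact: no key equals 75
```

With an approximate match, 75 resolves to the row for 70; with an exact match it finds nothing, which is why exact match is the safer choice for IDs and names.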
No, it is not case-sensitive. The texts 'ram' and 'RAM' are identical for VLOOKUP.
Use the Advanced Filter option (shortcut key: ALT D F A) or the 'Remove Duplicates' option under the
Data tab.
Use Conditional Formatting to highlight duplicate values, or use the COUNTIF function as shown
below. For example, values are stored in cells D4:D7. Absolute references keep the range fixed
when the formula is filled down.
=COUNTIF($D$4:$D$7,D4)
Apply a filter on the column where you entered the COUNTIF function and select values greater than 1.
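The same counting logic can be sketched in Python for readers who want to check it outside a worksheet. The cell contents below are hypothetical; `Counter` plays the role of COUNTIF evaluated once per row.

```python
from collections import Counter

# Hypothetical contents of cells D4:D7.
values = ["apple", "pear", "apple", "plum"]

counts = Counter(values)
# One count per row, like the COUNTIF helper column.
countif_column = [counts[v] for v in values]
# Keep values whose count is greater than 1, like the filter step.
duplicates = sorted({v for v in values if counts[v] > 1})
print(countif_column)
print(duplicates)
```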
Go to Data tab >> Select Data Validation. Another way to insert a drop down is to enable Developer
tab and Insert Combo box.
Use Pivot Table and select one variable in Row label and the other variable in Column label.
Suppose you need to pull 'Neha' from 'Neha Sharma'. Use MID and FIND functions.
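In Excel, one common form (assuming the name sits in cell A1, which is an assumption of this note) is =MID(A1,1,FIND(" ",A1)-1). The same idea in Python is sketched below, where `str.find` stands in for FIND and slicing stands in for MID.

```python
full_name = "Neha Sharma"

# Locate the first space, then keep everything before it.
space_at = full_name.find(" ")
# If there is no space, fall back to the whole string.
first_name = full_name[:space_at] if space_at != -1 else full_name
print(first_name)
```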
= INDEX(range, row_number)
In this case, we are telling EXCEL to return second value of the range A2:A4. It returns 30.
match_type can request an exact match (0), the largest value that is less than or equal to
lookup_value (1), or the smallest value that is greater than or equal to lookup_value (-1).
In this case, we are asking EXCEL to find the relative position of 30 in the range A2:A4. It returns 2.
Suppose Product and Sales information is stored in columns A and B. You need to look up the
product for a given sales value, so EXCEL must search from right to left, because the sales values
sit to the right of the product names in the range/table.
=INDEX(A2:A5,MATCH(45,B2:B5,0))
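To make the two-step logic concrete, here is a small Python sketch of the same right-to-left lookup. The product names and sales figures are hypothetical, and `index` and `match_exact` are illustrative stand-ins for INDEX and MATCH with match_type 0, not Excel functions.

```python
# Hypothetical worksheet columns: A2:A5 holds products, B2:B5 holds sales.
products = ["Pen", "Book", "Bag", "Lamp"]
sales = [20, 30, 45, 60]

def index(values, position):
    """Like INDEX(range, row_number): positions are 1-based."""
    return values[position - 1]

def match_exact(lookup_value, values):
    """Like MATCH(lookup_value, range, 0): 1-based position of the first exact match."""
    return values.index(lookup_value) + 1  # raises ValueError where Excel shows #N/A

position = match_exact(45, sales)   # find the row of the sales value
print(position)
print(index(products, position))    # return the product from that row
```

MATCH finds the row in the sales column and INDEX reads the product from the same row, so the lookup column does not need to be the leftmost one.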
Basic and Intermediate Statistics
The following questions touch upon some basics and intermediate statistics topics. These topics are
generally taught in undergraduate / graduate courses.
1. What is p-value?
It is the lowest level of significance at which you can reject the null hypothesis. If p-value < 0.05, you
reject the null hypothesis at 5% level of significance.
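As an illustration of the decision rule, the sketch below computes a two-sided p-value for a z statistic using only the standard library. It assumes a standard normal test statistic; the function name `two_sided_p_value` and the sample z values are hypothetical.

```python
import math

def two_sided_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z."""
    return math.erfc(abs(z) / math.sqrt(2))

alpha = 0.05  # 5% level of significance
for z in (1.0, 1.96, 2.5):
    p = two_sided_p_value(z)
    print(f"z={z:.2f}  p={p:.4f}  reject H0 at 5%: {p < alpha}")
```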
Mean is calculated by summing all the values and dividing by the number of observations. Median is
the middle value of the sorted data. Mode is the most frequently occurring value.
3. In which data types MEAN, MEDIAN and MODE are more suitable?
MEAN is suitable for continuous data with no outliers. It is affected by extreme values (Outliers).
MEDIAN is suitable for continuous data with outliers or ordinal data. Mode is suitable for categorical
data (including both nominal and ordinal data).
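A small example shows why the mean is sensitive to outliers while the median and mode are not. The sample below is hypothetical and uses Python's standard `statistics` module.

```python
import statistics

data = [2, 3, 3, 5, 7, 100]  # hypothetical sample with one extreme value

mean_value = statistics.mean(data)      # sum of values / number of observations
median_value = statistics.median(data)  # average of the two middle values here
mode_value = statistics.mode(data)      # most frequently occurring value
print(mean_value, median_value, mode_value)
```

The single value of 100 pulls the mean far above most of the observations, while the median stays close to the bulk of the data.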
2. Stratified sampling
3. Cluster sampling
4. Systematic sampling
The main difference between cluster and stratified sampling is that in stratified sampling all the
strata need to be sampled. In cluster sampling, one first selects a number of clusters at
random and then either samples within each selected cluster or conducts a census of it. Usually
not all clusters are included.
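The contrast can be sketched with a toy population. Everything below (the region names, group sizes, and the two helper functions) is hypothetical and only meant to show which groups end up in each kind of sample.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 4 regions (groups) of 10 people each.
population = {f"region_{r}": [f"r{r}_p{i}" for i in range(10)] for r in range(4)}

def stratified_sample(groups, per_stratum):
    """Sample from every stratum, so all groups are represented."""
    return {name: rng.sample(members, per_stratum) for name, members in groups.items()}

def cluster_sample(groups, n_clusters):
    """Pick a few clusters at random, then take a census of each chosen cluster."""
    chosen = rng.sample(sorted(groups), n_clusters)
    return {name: list(groups[name]) for name in chosen}

strat = stratified_sample(population, per_stratum=3)
clust = cluster_sample(population, n_clusters=2)
print(sorted(strat))  # every region appears
print(sorted(clust))  # only the selected regions appear
```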
Theoretically, we should use the t-test when the sample size (N) is less than 30. In practice, we
always use the t-test, because the t-test and the z-test become equivalent as N tends to infinity.
Box Plot Method - If a value is higher than the 1.5*IQR above the upper quartile (Q3), the value will
be considered as outlier. Similarly, if a value is lower than the 1.5*IQR below the lower quartile (Q1),
the value will be considered as outlier.
Standard Deviation Method - If a value lies more than three standard deviations above or below
the mean, it is considered an outlier.
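Both rules can be sketched in a few lines of Python. The data set and function names below are hypothetical, and the quartiles come from `statistics.quantiles` with its default method, which may differ slightly from the quartile convention used by a given spreadsheet or plotting tool.

```python
import statistics

def iqr_outliers(values):
    """Flag values more than 1.5*IQR below Q1 or above Q3 (box plot rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

def sd_outliers(values, k=3):
    """Flag values more than k standard deviations away from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return [v for v in values if abs(v - mean) > k * sd]

# Hypothetical measurements with one extreme value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 10, 11, 13, 12, 95]
print(iqr_outliers(data))
print(sd_outliers(data))
```

Note that the standard deviation rule can miss outliers in very small samples, because the extreme value itself inflates the standard deviation.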
9. Define Homoscedasticity?
In a linear regression model, there should be homogeneity of variance of the residuals. In other
words, the variance of the residuals is approximately equal for all predicted values of the
dependent variable.
The two analyses, PCA and factor analysis (FA), are very similar, but they differ in how they are
calculated and in their practical usage:
2. The main idea of using PCA is to explain as much of the total variance in the variables as
possible. Whereas, the factor analysis explains the covariances or correlations between the
variables.
3. PCA is used when we need to reduce the number of variables (dimensionality reduction)
whereas FA is used when we need to group variables into some factors.
Use one-way ANOVA to compare the mean of a continuous variable across a categorical variable with
more than two independent categories.
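As a sketch of what one-way ANOVA computes, the function below derives the F statistic by hand: the variation between group means divided by the variation within groups. The groups and the function name `one_way_anova_f` are hypothetical, and a full analysis would also convert F to a p-value using the F distribution.

```python
import statistics

def one_way_anova_f(groups):
    """Return the F statistic: between-group mean square / within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical continuous outcome measured in three independent categories.
groups = [[4, 5, 6], [7, 8, 9], [10, 11, 12]]
f_stat = one_way_anova_f(groups)
print(f_stat)
```

A large F value indicates that the group means differ by more than the within-group spread would explain.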
Why is an eigenvalue greater than 1 used as the rule for retaining components? When the variables
are standardized, the average eigenvalue is 1, so an eigenvalue > 1 means the component explains
more variance than an average component.
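The reason the average is 1 can be written out briefly. This assumes the components are extracted from the correlation matrix R of p standardized variables (the symbols R, p, and λ are notation introduced here, not taken from the text above): every diagonal entry of R is 1, and the eigenvalues of a matrix sum to its trace.

```latex
\sum_{i=1}^{p} \lambda_i = \operatorname{tr}(R) = \sum_{i=1}^{p} r_{ii} = p
\quad\Longrightarrow\quad
\bar{\lambda} = \frac{1}{p}\sum_{i=1}^{p}\lambda_i = 1
```

Each standardized variable contributes one unit of variance, so a component with an eigenvalue above 1 accounts for more variance than any single original variable.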