Fresco
What needs to be considered first before you start a data visualization presentation?
How does visualization of big data help in interpreting information?
What is the purpose of using a dashboard for data visualization?
Data Visualization makes complex data more
Which type of visual aspect could be considered ineffective or not very accurate at conveying qua
Which law of Gestalt psychology explains individual's ability to perceive objects as shapes, letters
A good data visualization
While at a party you notice a group of people in the crowd headed in the same direction. Your t
As per Edward Tufte, good visualizations strive towards
Which of the following is powerful visualization technique for illustrating hierarchical data and p
Which map is a two-dimensional representation of data in which values are represented by color
Which of the following cannot be a geo-dimension?
Interactivity in Visualizations enables users to
A Manufacturing company wants to view the performance of various plants based on Throughput,
Shapes
Law of closure
Encouraging
Law of common
describing Data thoroughly visualizing the data in a single dashboard.
Discrete
Scatter
Length of bar
Gaps between bars
Line
Tree map
HeatMap
Mobile number
Find answers to specific questions
Parallel coordinate plot
Infographics
Law of cont
To add Context
Communicate information to a diverse audience.
Probability distribution
Law of Similarity
Sectors
Area Chart
Continuous
Scatter plot
Line Chart
90
To reset the grid to normal layout, after displaying multiple charts i par(mfcol=c(1,1))
Which parameter could be used to convert stacked barplot into gro beside=TRUE
If one would like to visualize 6 graphs in single view with graphs sca par(mfrow=c(2,3))
Customizing the way your graphs could be arranged can be configur layout()
Barchart represents visual display of category of categorical varia Freq
Which parameter is helpful to swap the chart from x-axis to y-axis? horiz = TRUE
The Angles are measured in degrees to construct arcs and pie charts in D3. 0
What is the command to invoke pie layout in D3 ? d3.layout.pie()
While drawing a pie chart from a given data set, the start angle and end angle autom 1
How will you invoke the arc path generator in d3js ? d3.svg.arc()
How many radians is 360 degrees equal to ? 6.28
I am using d3.geo.mercator() code . What is the significance of using mercator in this Map Projection
__ translated coordinates to pixels . Projection Function
How will you invoke the link method while constructing a tree layout using D3 ? Choo
tree.links(node)
Geographic data stored in Json format is known as _ geojson
Which method of the tree layout reads the data and returns all the objects in form of Node
A tree layout has hierarchical structure. 1
Which type of charts show the change in a given network structure over a period of Alluvial Charts
_________ project library is an initiative to build re-useable charts for d3 js . NVD3
___ is an interactive Java Script toolkit which is used for creating time series charts. Rickshaw
The goal of NVD3 is to keep the traditional cookie cutter approach to build compone FALSE
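The radians answer above (360 degrees is about 6.28) follows from radians = degrees × π / 180. A minimal Python sketch of that arithmetic, using only the standard library. The `slice_angles` helper is hypothetical: it only mimics the start and end angles a pie layout assigns to each value, and it is not D3's API.

```python
import math

def slice_angles(values):
    """Return (start, end) angles in radians for each value, pie-layout style.

    Hypothetical helper for illustration; slices run from 0 to 2*pi.
    """
    total = sum(values)
    angles = []
    start = 0.0
    for v in values:
        end = start + 2 * math.pi * v / total
        angles.append((start, end))
        start = end
    return angles

full_circle = math.radians(360)   # 2 * pi
print(round(full_circle, 2))      # 6.28
angles = slice_angles([1, 1, 2])  # quarter, quarter, half of the circle
```

The last slice always ends at 2π, which is why the start and end angles can be derived automatically from the data.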
Independent variables refers to those variables__ Which acts as an input in the experiment.
__ is an example of Multivariate analysis in which relati partial
Pattern such as group or trend in the data table can not b Incorrect
Multivariate data analysis helps us to__ Both
Use of only one variable to describe the data is known as Uni
What are the features of multivariate random variable? both
Dependent variables refers to those variables__ Whose variable analysed
Multivariate data analysis is application of__ ALL
What is multivariate statistics? ALL
What is box kernel density estimate? Block in the histogram is centered over
the data points
The dividing point between the region where the null hypothesis is rejected and the region where it is not
A failing student is passed by an examiner, it is an example of:
A good way to get a small standard error is to use a
The use of the laws of probability to make inferences and draw statistical conclusions about populations
Confidence interval become narrow by increasing the:
An advertising agency wants to test the hypothesis that the proportion of adults in a country who read a
In which examples could binomial distribution be used?
The analysis of variance is a statistical test that is used to compare how many group means?
Identify the variables that are continuous or discrete
Which of the following is not the purpose of using chi-square distributions?
There may be times when data is supposed to fit a normal distribution, but does not. Which of the follow
Which of the following conditions are satisfied by Poisson random variable
A statistician calculates a 95% confidence interval for Mean when Standard Deviation is known. The conf
An essential component of the Central Limit Theorem is that
Identify which of the following steps would not be included in hypothesis
A door alarm works in 72 out of 100 cases and surveillance camera works in 68 out of 100 cases. What is
Any hypothesis which is tested for the purpose of rejection under the assumption that it is true is called?
The p-value in statistical significance testing should be used to assess how strong a relationship is. For e
0.05
0.14
All
0
All
Probability
Number of accidents
Number success - inDepen
Statistical hypothesis
Rejecting Ho when Ho is correct
Level of significance
Null
1
Type-II error
false positive
Type I
P value
1
Follow up to ANOVA
one way
two or more
Critical value
Type-I error
Large Sample
Inferential statistics
Sample Size
Equal to 25%
Modelling the number of failures in a trial
Two or more
Time and Weight
To test how closely
Outlier and Small
Number dep Probability of an event in an interval is unpredictable
. 20000
ALL
Eliminate all
0.72 0.49
Null
0
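The "Sample Size" answer to the confidence-interval question can be checked numerically. A minimal sketch, assuming a z-interval for the mean with known standard deviation; the values of `sigma` and `n` are made up for illustration.

```python
import math
from statistics import NormalDist

def ci_half_width(sigma, n, confidence=0.95):
    """Half-width of a z-interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return z * sigma / math.sqrt(n)

small_sample = ci_half_width(sigma=10, n=25)    # wider interval
large_sample = ci_half_width(sigma=10, n=100)   # narrower interval
```

Quadrupling the sample size halves the half-width, because the standard error shrinks with the square root of n.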
Do you think heuristic for rule learning and heuristics for decis 0
Now Can you make quick guess where Decision tree will fall int Supervised Learning
ral network
on is linear or non-linear
ns of the data
Confidence limits
_ are the values that mark the boundaries o
Regression is typically carried out to deve 1
Which of the following activities are perfo ALL
Which of the following are Multi-class Clas Is this movie a comedy, a documentary, or a thriller?
Machine learning task of inferring a functio Supervised Learning
Which is the statistical technique used for Regression analysis
Simulations are carried out to develop a m Clustering 0
Which data mining method groups together Affinity
ob analysis
Associate rule is known as _
___ of two points is the average of the two points in Euclidean Space. centroid
The ______ is a visual representation of how the data points are merged to form clusters. Dendrogram
O(N3)
___ is the data point that is closest to the other points in the cluster. clustroid
___ measures the goodness of a cluster Cohesion
A centroid is a valid point in a non-Euclidean space . 0
Convergence
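The clustering answers above distinguish the average of the points (the centroid) from an actual data point that sits closest to the rest of its cluster (commonly called a clustroid). A minimal sketch with made-up 2-D points; the function names are hypothetical, and Euclidean distance is assumed.

```python
import math

def centroid(points):
    """Coordinate-wise mean of the points (meaningful only in Euclidean space)."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[i] for p in points) / n for i in range(dims))

def clustroid(points):
    """The existing point with the smallest total distance to all the others."""
    return min(points, key=lambda p: sum(math.dist(p, q) for q in points))

cluster = [(0.0, 0.0), (4.0, 0.0), (2.0, 1.0), (2.0, 7.0)]
center = centroid(cluster)    # (2.0, 2.0), not one of the data points
medoid = clustroid(cluster)   # (2.0, 1.0), always one of the data points
```

The centroid here is not a member of the cluster, which is why a non-Euclidean space, where averaging is undefined, needs a representative chosen from the data itself.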
The objective function for linear regression is also known as Cost Function. 1
Output variables are known as Feature Variables . false
For different parameters of the hypothesis function we get the same hypothesis funct false
What is the Learning Technique in which the right answer is given for each example in Super
Hypothesis Function
What is the name of the function that takes the input and maps it to the output variab
Cost function in linear regression is also called squared error function. 1
How are the parameters updated during Gradient Descent Process ? simultaneously
Input variables are also known as feature variables 1
Problems that predict real values outputs are called ? Regression Problem
What is the process of dividing each feature by its range called ? Feature Scaling
controls the magnitude of a step taken during Gradient Descent Learning Rate
Mean Normalization
What is the process of subtracting the mean of each variable from its variable called ?
So when a ML Model has high bias, getting more training data will help in improving 0
For an underfit data set the training and the cross validation error will be high 0
For an overfit data set the cross validation error will be much bigger than the training 1
What measures the extent to which the predictions change between various realizati variance
What measure how far the predictions are from the actual values ? Bias
Problems that predict real values outputs are called Regression Problem
sion Problem
on Boundary
Normalization
sion Problem
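Several answers above (feature scaling, mean normalization, learning rate, simultaneous updates, squared error cost) fit together in one small routine. A minimal sketch for a single feature, assuming made-up data and hypothetical function names; the learning rate and step count are arbitrary illustrative choices.

```python
def mean_normalize(xs):
    """Subtract the mean, then divide by the range (max - min)."""
    mean = sum(xs) / len(xs)
    value_range = max(xs) - min(xs)
    return [(x - mean) / value_range for x in xs]

def gradient_descent(xs, ys, learning_rate=0.5, steps=2000):
    """Fit h(x) = theta0 + theta1 * x by minimizing the squared error cost."""
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(steps):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update: both gradients were computed from the old thetas.
        theta0, theta1 = (theta0 - learning_rate * grad0,
                          theta1 - learning_rate * grad1)
    return theta0, theta1

sizes = [1000.0, 1500.0, 2000.0, 2500.0, 3000.0]  # made-up feature values
prices = [200.0, 300.0, 400.0, 500.0, 600.0]      # made-up real-valued targets
scaled = mean_normalize(sizes)                    # values now lie in [-0.5, 0.5]
theta0, theta1 = gradient_descent(scaled, prices)
```

Without scaling, the same learning rate would diverge on the raw sizes, which is the practical motivation for normalizing features first.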
Perform a daily resample/upsample of the data. Do a forward filling of the missing values with limit of 2. What is the value for
16,2011?
How many observations have you seen from Jan 1, 2011 to March 31, 2011?
Perform a monthly resample/downsample of the time series. What is the minimum value for May?
Perform a daily resample/upsample of the data. Do interpolation to fill the data. What is the value for Jan 12, 2011?
Perform a monthly resample/downsample of the time series. What is the maximum value for February?
For the WMT stock open prices time series perform a stationarity test using ADF. What is the p-value?
For the WMT stock open prices time series perform a stationarity test using ADF. What is the value of ADF statistic?
For the XOM stock close prices time series perform a stationarity test using ADF. What is the p-value?
For the XOM stock close prices time series perform a stationarity test using ADF. What is the value of the ADF statistic ?
For the WMT stock open prices time series perform a stationarity test using ADF. How is the time series behaving?
When I upsample my time series and I find many missing values , how do I fill the missing values?
What is the default aggregation function while resampling a time series in pandas?
It is a good practice to apply Forecasting models for non-stationary time series
If the mean and variance of a Time Series is constant over time , it is called a _ Time Series.
I cannot plot resampled Time Series data in Python.
If the p value is > 0.05 during the ADF test of the time series then the series is said to be _
Augmented Dickey-Fuller test cannot be used for identifying if a Time Series is Stationary.
Down sampling is the process of converting a to ____ frequency.
What package in Python provides features to work with Time Zones?
What does freq='T' signify while passing this parameter to the date_range() function ?
In Time Series data , the observations are captured over varying time intervals.
What function in Python helps in creating Date Time index for data that does not have date or time values captured?
What is the function to plot a lag plot for a time series using python?
What is the function to offset the date for daylight saving?
What is the function used for plotting the values of a time series using Python?
In pandas I can combine two time series with different frequencies into a single time series.
I can write my custom aggregation function while resampling my time series in Pandas.
Time Series data is indexed by _?
Auto Correlation Function Plot can be used for determining if a Time Series is stationary or not.
AIC stands for?
0.0028
-3.89
0.99
0.91
It is stationary
all
mean()
1
Stationary
1
Non-Stationary
1
low,low
pytz
not create any timestamp
0
to_datetime()
lag_plot()
Date()
plot.ts()
1
1 0
Datetime Index
1
Akaike information criterion
14
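The resampling questions above (upsample with forward fill and a limit, upsample with interpolation, downsample with an aggregation) can be sketched as follows. This assumes pandas is installed and uses a made-up three-point series, not the hands-on dataset; the `"MS"` (month start) alias is one possible choice for monthly bins.

```python
import pandas as pd

# Made-up sparse series: three observations in January 2011.
index = pd.to_datetime(["2011-01-01", "2011-01-05", "2011-01-09"])
ts = pd.Series([10.0, 50.0, 90.0], index=index)

# Upsample to daily frequency; the new days are NaN until they are filled.
daily_ffill = ts.resample("D").ffill(limit=2)   # carry forward at most 2 days
daily_interp = ts.resample("D").interpolate()   # linear interpolation

# Downsample to monthly bins with an explicit aggregation.
monthly_min = ts.resample("MS").min()
monthly_max = ts.resample("MS").max()
```

With `limit=2`, Jan 2 and Jan 3 receive the Jan 1 value but Jan 4 stays missing, whereas interpolation fills every gap between known observations.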
Identify the unstructured data from the following both image and video
Which preprocessing technique is used for dimensionality svd
Select the correct statements about Nonlinear classificat Kernel tricks are used by Nonlinear classifiers to achieve maximum-m
TF-IDF is a common methodology used in pre-processing 1
The dimensionality reduction technique that efficiently r all
GradientDescent is one of Backward propagation techniqu 1
In a time series , the rate of decay will decide the value of the co Parsimonious
1 Model
A model that is efficient and simple is known as ? Moving Average Model
Time series is a linear combination of white noise process. This i
If there is no decay in the ACF values for any number of lags , wh stationary
Exponential smoothing models can be considered as ARIMA mod 1
The coefficient for the residual error terms can be negative for a 1
ARIMA (1,0,0) is equivalent to AR Model
How will you make a non-stationary time series to stationary ? This conversi
For a moving average model , the expectation of the dependent va constant
For an ARMA process ____ ACF and PACF show a geometrically declining trend
In an ARMA(p,q) series , what do p and q represent ? Lag Terms
ARIMA (0,1,0) is equivalent to Random Walk Model
My time series model is predicting well for the available data b Over-fitting
ARIMA (1,0,1) is equivalent to _ ARMA Model
What is the mechanism used to choose optimal p and q for an
For MA(2) process , the Auto Correlation value for the first two l 1
is a form of Auto Regressive model where the coefficient of first la
AR , MA and ARMA models can handle non-stationary time series 0
What methodology can be used for Estimating the coefficients in OLS and ML
If the ACF follows a geometric decay and the PACF is significant tiA R(p)Series and Independent Variable
Time
Multivariate time series includes lag of _ and _ 1 variable AR(p) process to n variables
What does a p order vector auto regressive process generalize ?
ARIMA (0,0,1) is equivalent to _ MA Model
white noise innovations
What are serially uncorrelated vectors which have variance betwe
What is the range of smoothing constant alpha ? 0 to 1
In vector auto regression , the estimation by ordinary least squar 1
Structural Models have a time component. 1
The coefficient for the residual error terms can be negative for a 1
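Two of the answers above lend themselves to a quick check: differencing turns a random walk back into its shocks, and the exponential smoothing constant alpha ranges from 0 to 1. A minimal standard-library sketch with made-up shocks; the function names are hypothetical.

```python
from itertools import accumulate

def difference(series):
    """First difference y[t] - y[t-1], often used to make a series stationary."""
    return [b - a for a, b in zip(series, series[1:])]

def exponential_smoothing(series, alpha):
    """Simple exponential smoothing with 0 <= alpha <= 1."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must lie between 0 and 1")
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

shocks = [1, -2, 3, 0, -1, 2]                    # stand-in for white noise
walk = list(accumulate(shocks))                  # random walk: cumulative sum
recovered = difference(walk)                     # the shocks after the first
flat = exponential_smoothing(walk, alpha=0.0)    # never leaves the first value
same = exponential_smoothing(walk, alpha=1.0)    # reproduces the input
```

The two extreme values of alpha show why the constant is bounded: 0 ignores new observations entirely and 1 ignores all history.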
Select the correct statements about Nonlinear classificat kernel trick is used in SVM for non-linear classification
What is the tokenized output of the sentence “if you cannOnly', 'do', 'what', 'your', 'heart', 'tells', 'you'
Classification where each data is mapped to more than one Multi-class
The most widely used package for machine learning in py sklearn
In a Term Document Matrix (TDM) each row represents ? document
Cross-validation causes over-fitting 1
In Supervised learning, class labels of the training sample unknown
TF-IDF is a feature extraction technique 1
Which one of the following is not a classification techniq StratifiedShuffleSplit
counts of unique values in the 'label' column
What does the command sentiment_analysis_data['label']
Inverse Document frequency is used in term document m 1
SVM is supervised learning algorithm.
Which NLP technique uses lexical knowledge base to obta lemmatization
To view the first 3 rows of the dataset, which of the fo sentimentanalysisdata.head(3)
Labeled data
Supervised learning differs from unsupervised learning in
Which numerical statistics is used to identify the import tf-idf
An algorithm that counts how many times a word appear Bag-of-words(BOW)
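The bag-of-words and tf-idf answers above can be illustrated without any library. A minimal sketch on a made-up corpus, assuming whitespace tokenization and the plain idf = log(N / document frequency) variant; libraries such as sklearn use smoothed variants, so their numbers will differ.

```python
import math
from collections import Counter

docs = [
    "the movie was great",
    "the movie was boring",
    "great acting great plot",
]  # made-up corpus

tokenized = [doc.split() for doc in docs]

# Bag of words: raw term counts per document.
bow = [Counter(tokens) for tokens in tokenized]

def idf(term):
    """Inverse document frequency: log(N / number of documents containing term)."""
    doc_freq = sum(1 for tokens in tokenized if term in tokens)
    return math.log(len(tokenized) / doc_freq)

def tf_idf(term, doc_index):
    """Term frequency in one document, weighted by the term's rarity."""
    tf = bow[doc_index][term] / len(tokenized[doc_index])
    return tf * idf(term)

common = tf_idf("the", 1)      # appears in 2 of 3 documents
rare = tf_idf("boring", 1)     # appears in 1 of 3 documents
```

Both words occur once in the second document, yet the rarer one scores higher, which is how tf-idf identifies the terms that matter for a document.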
Which preprocessing technique is used to make the data gaussian with zero mean and unit variance?
True Negative is when the predicted instance and the actual is positive.
True Positive is when the predicted instance and the actual instance is not negative.
Cross-validation technique is used to evaluate a classifier by dividing the data set into training set to train the classifier and tes
iris_series = pd.get_dummies(iris['Species'])
How many classes will the following command return(target classes in the dataset) : classes=list(iris['species'].unique())
Ensemble learning is used when you build component classifiers that are more accurate and independent from each other.
Cross-validation technique will provide accurate results when the training set and the testing set are from two different popul
Imagine you have just finished training a decision tree for spam classification and it is showing abnormal bad performance on b
The number of categorical attributes in the original dataset.
What does the command iris['species'].value_counts() return?
Ordinal variables has
clustering is an example
Which of the following is not a technique to process missing values?
Naive Bayes Algorithm is useful for :
Binary
Unique()
Standardization
false
1
1
iris.top(3)
Confusion Matrix
For imbalanced class problems, accuracy metric is not a good idea.
Decision tree
K -Fold wrong
Initialize -> Train - -> Predict-->Evaluate
iris.size,(150,5)
No
sklearn
0
Unstructure data
Decision tree wrong
Multi Class Classification
It will converge quicker than discriminative models like logistic regression AND it requires less training data
Multi Class Classification
Train the classifier
iris_num = iris_data.select_dtypes(include=[numpy.number])
3
1
Naive Bayes Classifier
1 0
None
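The answers above mention the confusion matrix and note that accuracy is a poor metric for imbalanced classes. A minimal sketch with made-up labels shows why; `confusion_counts` is a hypothetical helper, not an sklearn function.

```python
from collections import Counter

def confusion_counts(actual, predicted, positive=1):
    """Return TP, FP, FN and TN counts for a binary problem."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["TP" if a == positive else "FP"] += 1
        else:
            counts["FN" if a == positive else "TN"] += 1
    return counts

# Made-up imbalanced labels: 2 positives out of 20 samples.
actual = [1, 1] + [0] * 18
predicted = [0] * 20   # a classifier that always predicts the majority class

counts = confusion_counts(actual, predicted)
accuracy = (counts["TP"] + counts["TN"]) / len(actual)
recall = counts["TP"] / (counts["TP"] + counts["FN"])
```

The always-negative classifier scores 90% accuracy while finding none of the positives, so the confusion matrix counts are the more informative view.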
Time taken to fix a broken build is measured using __. Build repair rate
Private builds are executed after moving the changes to v 0
Release builds can be triggered On-demand
__ is a .Net build tool. NAnt
Capability of your build system to handle an increase in Build scalability
Complexity of a code is determined based on ___. Cyclomatic Complexity Number
Pipeline break and build break is one and the same 0
_ is a measure of incoming dependencies. Afferent Coupling
Build can be triggered by a version control tool. 1
Teamcity Jenkins
Travis CI
CI pipeline consists of stages
Which is the first code analysis to be executed ? unit test
CCMetrics is used to identify Code Complexity
Committing a code change, when the inspection fails , is per 0
Teamcity is build tool
Efferent Coupling is a measure of Outgoing dependencies
Select a CI tool based on its both
is a code coverage tool Nunit
Which of the tools is not mandated for integration in a CI Terraform
Committing
changes
Staging frequently
Committing changes
builds frequently
helps in differentiating the environments. Configuration file
Jabber is a Messaging plugin
In a normal scenario, software from is deployed to produc Master branch
How are component dependencies injected in a workflow Parallel Run unit test and build package
trigger
component
s and build
Build packagpackage
Which of these is not mandated for a CI pipeline ? Infrastructu ConfiguratioApplication code
It is a proven fact that reducing the CCN of code is known 1
Does CI pipeline need to have all the software development 1 0
Control flow graph is used to calculate _. Cyclomatic Complexity Number
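The Cyclomatic Complexity Number referred to above is computed from a control flow graph with McCabe's formula M = E − N + 2P (edges, nodes, connected components). A minimal sketch on a hand-built graph; the node names and the graph itself are made up for illustration.

```python
def cyclomatic_complexity(edges, nodes, components=1):
    """McCabe's formula M = E - N + 2P on a control flow graph."""
    return len(edges) - len(nodes) + 2 * components

# Control flow graph of a function with one if/else followed by one loop.
nodes = ["entry", "if", "then", "else", "loop", "body", "exit"]
edges = [
    ("entry", "if"),
    ("if", "then"), ("if", "else"),
    ("then", "loop"), ("else", "loop"),
    ("loop", "body"), ("body", "loop"),
    ("loop", "exit"),
]

ccn = cyclomatic_complexity(edges, nodes)   # 8 - 7 + 2 = 3
```

The result matches the usual shortcut of counting decision points (the `if` and the loop condition) and adding one.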
Based on the hands on card “ OLS in Python Statsmodels” What is the adjusted R sq value
Based on the hands on card “ OLS in Python Statsmodels”What is the value of the constant term ?
Based on the hands on card “ OLS in Python Statsmodels” What is the value of the estimated coef for variable
Based on the hands on card “ OLS in Python Statsmodels” What is the value of R sq ?
Based on the hands on card “ OLS in Python Statsmodels” How many observations are there in the dataset ?
Based on the hands on card “MLR Hands On” What is the P>|t| value for the 'INDUS' variable ?
Based on the hands on card “MLR Hands On” What is the standard error for the constant term ?
Based on the hands on card “MLR Hands On” What is the value of the estimated coef for the constant term ?
Based on the hands on card “MLR Hands On” what is the value of R sq ?
Based on the hands on card “MLR Hands On” Perform a correlation among all the independent variables . What
0.484
506
0.731
5.104
36.42
0.741
-0.76923
11
0 1
1
1
0 1
0
Small , Small
sqrt(SSE/n)
1
Increases
0
1
-1 to +1
Correlate
Difference
Multi Collinearity
minimize error
normalization
SSE
Law of Incremental Returns
STANDARDIZATION
1
1
Correlation
SSE R
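The regression answers above refer to SSE, sqrt(SSE/n) and R squared. A minimal sketch that computes all three for a one-variable least-squares fit, using made-up data and the closed-form solution rather than Statsmodels; `ols_fit` is a hypothetical helper.

```python
import math

def ols_fit(xs, ys):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

xs = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up data
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

intercept, slope = ols_fit(xs, ys)
predictions = [intercept + slope * x for x in xs]
y_bar = sum(ys) / len(ys)

sse = sum((y - p) ** 2 for y, p in zip(ys, predictions))   # unexplained variation
sst = sum((y - y_bar) ** 2 for y in ys)                    # total variation
rmse = math.sqrt(sse / len(xs))                            # sqrt(SSE / n)
r_squared = 1 - sse / sst
```

Least squares picks the intercept and slope that minimize SSE, so both the RMSE and R squared follow directly from that single quantity.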
Name the color scheme you get by mixing different tones, shades and tints within a specific hue
What is the white space located inside letters like o and p called?
Name the design concept of making items represented, resemble their real-world counterparts
Which one below is a vector image format?
What do you get when you add black to a hue (color)?
The height of a capital letter measured from the baseline
What’s the measure of purity of the color?
Name the design tool that can be used to establish a spatial hierarchy of the content
Name the design language system developed by google
What’s the space between letters called?
The name of the invisible line marking the height of a font.
Counter
Skeuomorphism
svg
shade
Cap height
Chroma
Material Design
Kerning
Baseline