DevOps Interview Questions
Git
Jenkins
Selenium
Puppet
Chef
Ansible
Nagios
Docker
Monit
ELK (Elasticsearch, Logstash, Kibana)
Collectd/Collectl
Git(GitHub)
The webserver group allows ports 80 and 443 from anywhere in the world, but port 22 only from the
jump box group. The database group allows port 3306 from the webserver group and port 22 from
the jump box group. Any machine added to the webserver group can then store data in the database.
No one can SSH directly to any of your boxes.
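The group rules above can be modelled in a few lines of Python to check which connections they permit. This is only a sketch of the rule logic, not an AWS API call; the group names and the `is_allowed` helper are hypothetical.

```python
# Hypothetical model of the security group rules described above.
# Each destination group lists (port, allowed source) pairs.
WORLD = "world"

RULES = {
    "webserver": [(80, WORLD), (443, WORLD), (22, "jumpbox")],
    "database": [(3306, "webserver"), (22, "jumpbox")],
}


def is_allowed(source, destination, port):
    """Return True if traffic from `source` to `destination` on `port` matches a rule."""
    for allowed_port, allowed_source in RULES.get(destination, []):
        # A rule open to the world matches any source; otherwise the source group must match.
        if port == allowed_port and allowed_source in (WORLD, source):
            return True
    return False


print(is_allowed(WORLD, "webserver", 443))       # public HTTPS traffic
print(is_allowed(WORLD, "webserver", 22))        # direct SSH from the internet
print(is_allowed("webserver", "database", 3306)) # web tier talking to the database
```

Because the rules reference groups rather than individual machines, a new machine placed in the webserver group inherits database access without any rule change.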
Reference architecture
Technical architecture
Deployment operation architecture
5. DevOps Toolchain?
Answer: DevOps Toolchain:
Code: code development and review, source code management tools, code merging
Build: continuous integration tools, build status
Test: continuous testing tools that provide feedback on business risks
Package: artifact repository, application pre-deployment staging
Release: change management, release approvals, release automation
Configure: infrastructure configuration and management, Infrastructure as Code tools
Monitor: applications performance monitoring, end-user experience
Some categories are more essential in a DevOps toolchain than others; especially continuous
integration (e.g. Jenkins) and infrastructure as code (e.g. Puppet).
Source: Wikipedia
In binary: Previously, the client always used to do serialization of the value with complex data,
but with Memcached, you can use the binary option.
You can summarize by saying that the Agile software development methodology focuses on the
development of software, whereas DevOps is responsible for both the development and the
deployment of the software in the safest and most reliable way possible.
9. What are the core roles of DevOps Engineers in terms of development and
Infrastructure?
Answer:
15. What is the difference between Active and Passive check in Nagios?
Answer: First, point out the basic difference between Active and Passive checks. The
major difference is that Active checks are initiated and
performed by Nagios, while Passive checks are performed by external applications.
If your interviewer looks unconvinced by this explanation, you can also mention
some key features of both Active and Passive checks.
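To make the passive side concrete, here is a minimal sketch of how an external application could format a passive service check result for Nagios. It builds a line for the `PROCESS_SERVICE_CHECK_RESULT` external command; the host name, service description, and helper name are hypothetical, and the sketch only builds the line rather than writing it to the Nagios external command file, whose location depends on the installation.

```python
import time

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def passive_service_result(host, service, return_code, output, timestamp=None):
    """Format one passive service check result as a Nagios external command line."""
    if timestamp is None:
        timestamp = int(time.time())
    return (
        f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{return_code};{output}"
    )


# Hypothetical host and service names, with a fixed timestamp for a repeatable example.
line = passive_service_result(
    "web01", "Nightly Backup", OK, "backup finished", timestamp=1700000000
)
print(line)
```

With an active check, Nagios itself would run a plugin on its own schedule; here the external job decides when to report.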
16. What are the key aspects or principles behind DevOps?
Answer: The key aspects or principles behind DevOps are:
Infrastructure as code
Continuous deployment
Automation
Monitoring
Security
18. What testing is necessary to ensure a new service is ready for production?
Answer: DevOps is all about continuous testing throughout the process, starting with
development through to production. Everyone shares the testing responsibility. This ensures that
developers are delivering code that doesn’t have any errors and is of high quality, and it also
helps everyone leverage their time most effectively.
Git is a Distributed Version Control System (DVCS). It can track changes to a file and allows
you to revert to any particular change.
Its distributed architecture provides many advantages over other Version Control Systems (VCS)
like SVN. One major advantage is that it does not rely on a central server to store all the versions
of a project’s files. Instead, every developer “clones” a copy of a repository (the “Local repository”)
and has the full history of the project on their hard drive, so
that when there is a server outage, all you need for recovery is one of your teammates’ local Git
repositories.
There is also a central cloud repository (the “Remote repository”) where developers can push
changes and share them with other teammates.
Revise capability
Improve performance, reliability, or maintainability
Extend life
Reduce cost
Reduce risk and liability
Correct defects
27. What is Chef?
Answer: Begin this answer by defining Chef. It is a powerful automation platform that
transforms infrastructure into code. Chef is a tool for which you write scripts that are used to
automate processes. What processes? Pretty much anything related to IT.
Now you can explain the architecture of Chef. It consists of:
Chef Server: The Chef Server is the central store of your infrastructure’s configuration data. The
Chef Server stores the data necessary to configure your nodes and provides search, a powerful
tool that allows you to dynamically drive node configuration based on data.
Chef Node: A Node is any host that is configured using Chef-client. Chef-client runs on your
nodes, contacting the Chef Server for the information necessary to configure the node. Since a
Node is a machine that runs the Chef-client software, nodes are sometimes referred to as
“clients”.
Chef Workstation: A Chef Workstation is a host you use to modify your cookbooks and other
configuration data.
Now give an example. You can write a manifest on the Puppet Master that creates a file and installs
Apache on all Puppet Agents (slaves) connected to the Puppet Master.
38. What are the advantages of DevOps with respect to Technical and Business
perspective?
Answer:
Technical benefits:
Business benefits:
continuous audit
continuous controls monitoring
continuous transaction inspection
45. What is the one most important thing DevOps helps do?
Answer: The most important thing DevOps helps do is to get the changes into production as
quickly as possible while minimizing risks in software quality assurance and compliance. That is
the primary objective of DevOps. However, there are many other positive side-effects to
DevOps, for example clearer communication and better working relationships between teams,
which create a less stressful working environment.
47. Explain how you can create a backup and copy files in Jenkins.
Answer: The answer to this question is straightforward. To create a backup, all you need to do is
periodically back up your JENKINS_HOME directory. This contains all of your build job
configurations, your slave node configurations, and your build history. To create a backup of
your Jenkins setup, just copy this directory. You can also copy a job directory to clone or
replicate a job, or rename the directory.
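The backup described above amounts to archiving one directory, so it can be sketched with the Python standard library. The function names and the timestamped archive name are assumptions made for illustration; the `jobs/<name>` layout follows the usual JENKINS_HOME structure, and the demo runs against a throwaway temporary directory rather than a real Jenkins installation.

```python
import shutil
import tempfile
import time
from pathlib import Path


def backup_jenkins_home(jenkins_home, backup_dir):
    """Archive the whole JENKINS_HOME directory into backup_dir and return the archive path."""
    home = Path(jenkins_home)
    if not home.is_dir():
        raise FileNotFoundError(f"JENKINS_HOME not found: {home}")
    dest = Path(backup_dir)  # keep this outside JENKINS_HOME
    dest.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = shutil.make_archive(str(dest / f"jenkins-home-{stamp}"), "gztar", root_dir=str(home))
    return Path(archive)


def clone_job(jenkins_home, job_name, new_name):
    """Clone a job by copying its directory under JENKINS_HOME/jobs."""
    jobs = Path(jenkins_home) / "jobs"
    return Path(shutil.copytree(jobs / job_name, jobs / new_name))


# Demo against a temporary, fake JENKINS_HOME.
with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp) / "jenkins_home"
    (home / "jobs" / "demo").mkdir(parents=True)
    (home / "jobs" / "demo" / "config.xml").write_text("<project/>")

    archive = backup_jenkins_home(home, Path(tmp) / "backups")
    archive_created = archive.is_file()

    clone = clone_job(home, "demo", "demo-copy")
    clone_has_config = (clone / "config.xml").is_file()

print(archive.name, archive_created, clone_has_config)
```

In practice a script like this would be run on a schedule, and Jenkins typically needs to reload its configuration from disk before a copied job shows up.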
49. Explain with a use case where DevOps can be used in industry/ real-life?
Answer: Many industries use DevOps, so you can mention any of those use
cases. You can also refer to the example below:
Etsy is a peer-to-peer e-commerce website focused on handmade or vintage items and supplies,
as well as unique factory-manufactured items. Etsy struggled with slow, painful site updates that
frequently caused the site to go down. It affected sales for millions of Etsy’s users who sold
goods through the online market place and risked driving them to the competitor.
With the help of a new technical management team, Etsy transitioned from its waterfall model,
which produced four-hour full-site deployments twice weekly, to a more agile approach. Today,
it has a fully automated deployment pipeline, and its continuous delivery practices have
reportedly resulted in more than 50 deployments a day with fewer disruptions.
50. Explain how would you handle revision (version) control?
Answer: My approach to handling revision control would be to post the code on SourceForge or
GitHub so everyone can view it. Also, I will post the checklist from the last revision to make
sure that any unsolved issues are resolved.
your intervention. RDS is a database management service for structured data only. DynamoDB, on the
other hand, is a NoSQL database service; NoSQL deals with unstructured data.
Redshift is an entirely different service, it is a data warehouse product and is used in data
analysis.
If a backup AWS Direct Connect connection has been configured, in the event of a failure traffic will
switch over to the second one. It is recommended to enable Bidirectional Forwarding
Detection (BFD) when configuring your connections to ensure faster detection and
failover. On the other hand, if you have configured a backup IPsec VPN connection
instead, all VPC traffic will fail over to the backup VPN connection automatically. Traffic
to/from public resources such as Amazon S3 will be routed over the Internet. If
you do not have a backup AWS Direct Connect link or an IPsec VPN link, then Amazon
VPC traffic will be dropped in the event of a failure.
Performing multiple copy operations at one time: if the workstation is powerful
enough, you can initiate multiple cp commands, each from a different terminal,
on the same Snowball device.
Copying from multiple workstations to the same Snowball.
Transferring large files, or creating batches of small files, which reduces the encryption
overhead.
Eliminating unnecessary hops: set things up so that the source machine(s) and the
Snowball are the only machines active on the switch being used, which can
hugely improve performance.
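The batching tip can be sketched as follows: many small files are packed into one tar archive before transfer, so the per-file overhead is paid once. This is a generic illustration using the Python standard library, not a Snowball client call; the helper name and the size threshold are assumptions.

```python
import tarfile
import tempfile
from pathlib import Path


def batch_small_files(paths, archive_path, max_size=1_000_000):
    """Pack files of at most max_size bytes into one tar archive.

    Returns the files that were too large to batch and should be copied on their own.
    Files are stored by base name, so names must be unique within one batch.
    """
    left_out = []
    with tarfile.open(archive_path, "w") as tar:
        for path in map(Path, paths):
            if path.stat().st_size <= max_size:
                tar.add(path, arcname=path.name)
            else:
                left_out.append(path)
    return left_out


# Demo with temporary files: three small files and one larger one.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    small = []
    for i in range(3):
        f = tmp / f"small{i}.txt"
        f.write_text("x" * 10)
        small.append(f)
    big = tmp / "big.bin"
    big.write_bytes(b"\0" * 2048)

    archive = tmp / "batch.tar"
    left_out = batch_small_files(small + [big], archive, max_size=1024)
    with tarfile.open(archive) as tar:
        packed = sorted(tar.getnames())

print(packed, [p.name for p in left_out])
```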
Puppet can run in a stand-alone architecture, where each managed node has its own
complete copy of your configuration info and compiles its own catalog.
In this architecture, managed nodes run the Puppet apply application, usually as a
scheduled task or cron job. You can also run it on demand for initial configuration of a
server or for smaller configuration tasks.
Like the Puppet master application, Puppet apply needs access to several sources of
configuration data, which it uses to compile a catalog for the node it is managing.
Start by explaining flapping. Flapping occurs when a service or host
changes state too frequently, which causes a flood of problem and recovery notifications.
Once you have defined flapping, explain how Nagios detects it. Whenever
Nagios checks the status of a host or service, it checks whether it has started or
stopped flapping. Nagios follows the procedure given below to do that:
Storing the results of the last 21 checks of the host or service
Analyzing the historical check results to determine where state changes/transitions occur
Using the state transitions to determine a percent state change value (a measure of
change) for the host or service
Comparing the percent state change value against low and high flapping thresholds
A host or service is determined to have started flapping when its percent state change first
exceeds a high flapping threshold. A host or service is determined to have stopped
flapping when its percent state change goes below a low flapping threshold.
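The detection procedure above can be sketched in Python. This is a simplified model, not Nagios source code: it counts every transition equally, whereas Nagios weights recent transitions more heavily, and the threshold values used in the demo are illustrative assumptions.

```python
HISTORY_SIZE = 21                   # Nagios keeps the last 21 check results
MAX_TRANSITIONS = HISTORY_SIZE - 1  # so at most 20 state changes are possible


def percent_state_change(states):
    """Unweighted percent state change over the most recent 21 check results."""
    history = list(states)[-HISTORY_SIZE:]
    transitions = sum(1 for prev, cur in zip(history, history[1:]) if prev != cur)
    return 100.0 * transitions / MAX_TRANSITIONS


def update_flapping(is_flapping, percent, low_threshold, high_threshold):
    """Apply the two-threshold rule: start above high, stop below low, else keep state."""
    if not is_flapping and percent > high_threshold:
        return True
    if is_flapping and percent < low_threshold:
        return False
    return is_flapping


# A service that alternates on every check versus one that never changes.
alternating = ["OK", "CRITICAL"] * 10 + ["OK"]
steady = ["OK"] * 21

print(percent_state_change(alternating))
print(percent_state_change(steady))
print(update_flapping(False, percent_state_change(alternating), 5.0, 20.0))
```

Using two thresholds means a value between them leaves the flapping state unchanged, which stops the flapping flag itself from toggling on every check.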
Configuration Item, on the other hand, may or may not have financial values assigned to it. It
will not have any depreciation linked to it. Thus, its life would not be dependent on its financial
value but will depend on the time until that item becomes obsolete for the organization.
Now you can give an example that can showcase the similarity and differences between
both:
1) Similarity:
Server: It is both an asset as well as a CI.
2) Difference:
Building: It is an asset but not a CI.
Document: It is a CI but not an asset
14. What is Git rebase and how can it be used to resolve conflicts in a feature branch before
the merge?
Answer: Start by saying that git rebase is a command that integrates the changes from
another branch into the branch you are currently working on by moving all of the local
commits that are ahead of the rebased branch to the top of the history on that branch.
Once you have defined Git rebase, give an example that shows how it can be used to
resolve conflicts in a feature branch before the merge. If a feature branch was created from master,
and since then the master branch has received new commits, Git rebase can be used to move the
feature branch to the tip of master.
The command effectively will replay the changes made in the feature branch at the tip of the
master, allowing conflicts to be resolved in the process. When done with care, this will allow the
feature branch to be merged into master with relative ease and sometimes as a simple fast-
forward operation.
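As a rough illustration of the replay idea, the toy model below represents each branch as a list of commit labels. It is not Git and ignores file contents and conflicts entirely; the labels and the `rebase` helper are hypothetical. Replayed commits get a trailing apostrophe because a rebase creates new commits rather than moving the old ones.

```python
def rebase(feature, base):
    """Replay the commits unique to `feature` on top of the tip of `base`."""
    # Find where the two histories diverge (the fork point).
    fork = 0
    while fork < min(len(feature), len(base)) and feature[fork] == base[fork]:
        fork += 1
    # Commits after the fork point are re-created on top of the base branch.
    replayed = [commit + "'" for commit in feature[fork:]]
    return base + replayed


master = ["A", "B", "C", "D"]   # master received C and D after the branch point
feature = ["A", "B", "E", "F"]  # feature was created from master at B

rebased = rebase(feature, master)
# After the rebase, master is a prefix of the feature branch,
# so merging the feature branch into master can be a fast-forward.
can_fast_forward = rebased[: len(master)] == master

print(rebased, can_fast_forward)
```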
Continuous Testing is the process of executing automated tests as part of the software delivery
pipeline to obtain immediate feedback on the business risks associated with the latest build. In
this way, each build is tested continuously, allowing Development teams to get fast feedback so
that they can prevent those problems from progressing to the next stage of Software delivery life-
cycle. This dramatically speeds up a developer’s workflow as there’s no need to manually
rebuild the project and re-run all tests after making changes.
25. Which are the top DevOps tools? Which tools have you worked on?
Answer: The most popular DevOps tools are mentioned below:
You can also mention any other tool if you want, but make sure you include the above tools in
your answer.
If you have experience with only some of the above tools, mention those tools and say that you
specialize in them and have an overview of the rest.
etckeeper-commit-post: In this configuration file you can define commands and scripts
that execute after pushing configuration to the Agent.
etckeeper-commit-pre: In this configuration file you can define commands and scripts
that execute before pushing configuration to the Agent.
DevOps brings faster and more frequent release cycles, which allow developers to identify
and resolve issues immediately and to implement new features quickly.
Since DevOps is what makes people do better work by making them wear different hats,
Developers who collaborate with Operations will create software that is easier to operate,
more reliable, and ultimately better for the business.
Continuous integration (CI) tools such as Rational Build Forge, Jenkins and Semaphore merge
all developer copies of the working code into a central version. These tools are important for
larger groups where teams of developers work on the same codebase simultaneously. QA experts
use code analyzers to test software for bugs, security, and performance. If you’ve used HP’s
Fortify Static Code Analyzer, talk about how it identified security vulnerabilities in coding
languages. Also speak about tools like GrammaTech’s CodeSonar that you used to identify
memory leaks, buffer underruns and other defects for C/C++ and Java code. It is essential that
you have an adequate command of the principal languages like Ruby, C#, .NET, Perl, Python,
Java, PHP, and Windows PowerShell, and are comfortable with the associated OS environments:
Windows, Linux, and Unix.
29. Why has DevOps gained prominence over the last few years?
Answer:
Before talking about the growing popularity of DevOps, discuss the current industry
scenario. Begin with some examples of how big players such as Netflix and Facebook are
investing in DevOps to automate and accelerate application deployment and how this has
helped them grow their business. Using Facebook as an example, you would point to
Facebook’s continuous deployment and code ownership models and how these have
helped it scale up while ensuring quality of experience at the same time. Hundreds of lines
of code are implemented without affecting quality, stability, and security.
Your next use case should be Netflix. This streaming and on-demand video company
follows similar practices with fully automated processes and systems. Mention the user
base of these two organizations: Facebook has 2 billion users, while Netflix streams
online content to more than 100 million users worldwide. These are great examples of
how DevOps can help organizations ensure higher success rates for releases, reduce
the lead time between bug fixes, streamline continuous delivery through automation,
and reduce overall manpower costs.
31. Explain how you can minimize the Memcached server outages?
Answer:
When one instance fails, or several of them go down, a larger load is put on the
database server as lost data is reloaded when clients make requests. If your code has been
written to minimize cache stampedes, the impact will be minimal.
Another way is to bring up an instance of Memcached on a new machine using the lost
machine’s IP address.
Code is another option to minimize server outages, as it gives you the liberty to change
the Memcached server list with minimal work.
Setting a timeout value is another option that some Memcached clients implement for
Memcached server outages. When your Memcached server goes down, the client will keep
trying to send a request until the timeout limit is reached.
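One way code can minimize cache stampedes is to let only one caller rebuild a missing entry while the others wait for the result. The sketch below shows that idea with a per-key lock. It uses an in-process dictionary as a stand-in for Memcached, has no expiry, and all names in it are hypothetical.

```python
import threading
import time


class StampedeGuardedCache:
    """Cache wrapper that lets a single caller per key run the expensive loader."""

    def __init__(self, loader):
        self._loader = loader
        self._values = {}   # stand-in for the Memcached store
        self._locks = {}    # one lock per key
        self._locks_guard = threading.Lock()

    def _lock_for(self, key):
        with self._locks_guard:
            return self._locks.setdefault(key, threading.Lock())

    def get(self, key):
        if key in self._values:
            return self._values[key]
        with self._lock_for(key):
            # Re-check after acquiring the lock: another caller may have loaded it already.
            if key not in self._values:
                self._values[key] = self._loader(key)
            return self._values[key]


calls = []


def slow_database_lookup(key):
    """Pretend database query; records each call so the demo can count them."""
    calls.append(key)
    time.sleep(0.05)
    return key.upper()


cache = StampedeGuardedCache(slow_database_lookup)
threads = [threading.Thread(target=cache.get, args=("user:1",)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(calls), cache.get("user:1"))
```

Eight concurrent requests for the same missing key result in a single database lookup, which is the behavior that keeps the database from being flooded after a cache node is lost.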
32. Is continuous delivery related to the dev-ops movement? How so?
Answer: Absolutely. In any organization where there is a separate operations department, and
especially where there is an independent QA or testing function, we see that much of the pain in
getting software delivered is caused by poor communication between these groups, exacerbated
by an underlying cultural divide. Apps are measured according to throughput, and ops are
measured according to stability. Testing gets it in the neck from both sides, and like release
management, is often a political pawn in the fight between apps and ops. The point of dev-ops is
that developers need to learn how to create high-quality, production-ready software, and ops
need to learn that Agile techniques are actually powerful tools to enable effective, low-risk
change management. Ultimately, we’re all trying to achieve the same thing – creating business
value through software – but we need to get better at working together and focusing on this goal
rather than trying to optimize our own domains. Unfortunately, many organizations aren’t set up
in a way that rewards that kind of thinking. According to Forrester.
34. What is the way to secure data for carrying in the cloud?
Answer: Ensure that no one can intercept the information while it moves from one point to
another in the cloud, and that security keys do not leak from any of the
storage locations in the cloud. One option is to segregate your information from other
companies’ information and then encrypt it by means of approved methods.
you to interact with your EC2 instances as if they were within your existing network.
Yes, it can be used for instances with root devices backed by local instance storage. By
using Amazon S3, developers have access to the same highly scalable, reliable,
fast, inexpensive data storage infrastructure that Amazon uses to run its own global
network of web sites. In order to execute systems in the Amazon EC2 environment,
developers use the tools provided to load their Amazon Machine Images (AMIs) into
Amazon S3 and to move them between Amazon S3 and Amazon EC2.
37. What are the top 10 skills a person should have for a DevOps position?
Answer:
40. Which Testing tool are you comfortable with and what are the benefits of that tool?
Answer: Here mention the testing tool that you have worked with and accordingly frame
your answer. I have mentioned an example below:
I have worked on Selenium to ensure high quality and more frequent releases.
Some advantages of Selenium are:
It is free and open-source
It has a large user base and a helpful community
It has cross-browser compatibility (Firefox, Chrome, Internet Explorer, Safari, etc.)
It has great platform compatibility (Windows, Mac OS, Linux, etc.)
It supports multiple programming languages (Java, C#, Ruby, Python, Perl, etc.)
It has fresh and regular repository developments
It supports distributed testing
41. Which open source or community tools do you use to make Puppet more powerful?
Answer: Describe some tools that you have used along with Puppet to do a specific task.
You can refer to the example below:
Changes and requests are ticketed through Jira and we manage requests through an internal
process. Then, we use Git and Puppet’s Code Manager app to manage Puppet code in accordance
with best practices. Additionally, we run all of our Puppet changes through our continuous
integration pipeline in Jenkins using the Beaker testing framework.
43. Describe the most significant gain you made from automating a process through
Puppet?
Answer: “I automated the configuration and deployment of Linux and Windows machines using
Puppet. In addition to shortening the processing time from one week to 10 minutes, I used the
roles and profiles paradigm and documented the purpose of each module in README to ensure
that others could update the module using Git. The modules I wrote are still being used, but
they’ve been improved by my teammates and members of the community.”
We’ve worked hard to try to find everyone who has contributed code to Puppet, but if you
have questions or concerns about a previous contribution you’ve made to Puppet and you don’t
believe you’ve signed a CLA, please sign a CLA or contact us for further information.
In DevOps, developers are required to commit all the changes made in the source code to a
shared repository. Continuous Integration tools like Jenkins pull the code from this shared
repository every time a change is made and deploy it for Continuous Testing, which is
done by tools like Selenium.
In this way, any change in the code is continuously tested, unlike the traditional approach.
1. What are the core roles of DevOps Engineers in terms of development and
Infrastructure?
Answer:
Application development:
Code developing
Code coverage
Unit testing
Packaging
Deployment
With infrastructure:
Continuous Integration
Continuous Testing
Continuous Deployment
Provisioning
Configuration
Orchestration
2. Explain your understanding and expertise on both the software development side and
the technical operations side of an organization you’ve worked for in the past?
Answer: DevOps engineers almost always work in a 24/7 business-critical online environment. I
was adaptable to on-call duties and able to take up real-time, live-system responsibility. I
successfully automated processes to support continuous software deployments. I have experience
with public/private clouds, tools like Chef or Puppet, scripting and automation with tools like
Python and PHP, and a background in Agile.
3. Which scripting languages do you think are most important for a DevOps engineer?
Answer: As far as scripting languages go, the simpler the better. In fact, the language itself isn’t
as important as understanding design patterns and development paradigms such as procedural,
object-oriented, or functional programming.
4. What’s the background of your system?
Answer: Some DevOps jobs require extensive systems knowledge, including server clustering
and highly concurrent systems. As a DevOps engineer, you need to analyze system capabilities
and implement upgrades for efficiency, scalability, and stability, or resilience. It is recommended
that you have a solid knowledge of OSes and supporting technologies, like network security,
virtual private networks, and proxy server configuration.
DevOps relies on virtualization for rapid workload provisioning and allocating compute
resources to new VMs to support the next rollout, so it is useful to have in-depth knowledge
around popular hypervisors. This should ideally include backup, migration, and lifecycle
management tactics to protect, optimize and eventually recover computing resources. Some
environments may emphasize microservices software development tailored for virtual containers.
Operations expertise must include extensive knowledge of systems management tools like
Microsoft System Center, Puppet, Nagios and Chef.
such as a card, and the other is typically something memorized, such as a security code.
It provides the instructions needed for communication between one or more applications.
It makes creating applications easy and allows cloud services to be linked with other
systems.
Containerized data centers: Containerized data centers are the packages that contain a
consistent set of servers, network components, and storage delivered to large warehouse kind of
facilities. Here each deployment is relatively unique.
12. Discuss your experience building bridges between IT Ops, QA, and development?
Answer: DevOps is all about effective communication and collaboration. I’ve been able to deal
with production issues from the development and operations sides, effectively straddling the two
worlds. I’m less interested in finding blame or playing the hero than I am with ensuring that all
of the moving parts come together.
Distributed VCS tools do not necessarily rely on a central server to store all the versions of a
project’s files. Instead, every developer “clones” a copy of a repository and has the full history of
the project on their own hard drive.
16. How is AWS Elastic Beanstalk different than AWS OpsWorks?
Answer:
22. What special training or education did it require for you to become a DevOps engineer?
Answer: DevOps is more of a mindset or philosophy than a skill-set. The typical technical
skills associated with DevOps Engineers today are Linux systems administration, scripting, and
experience with one of the many continuous integration or configuration management tools like
Jenkins and Chef. What it all boils down to is that whatever skill-sets you have, while important,
are not as important as having the ability to learn new skills quickly to meet the needs. It’s all
about pattern recognition and having the ability to merge your experiences with current
requirements. Proficiency in Windows and Linux systems administration, script development, an
understanding of structured programming and object-oriented design, and experience creating
and consuming RESTful APIs would take one a long way.
24. Is there a difference between Agile and DevOps? If yes, please explain?
Answer: As a DevOps engineer, interview questions like this are quite expected. Start by
describing the obvious overlap between DevOps and Agile. Although the implementation of
DevOps is always in sync with Agile methodologies, there is a clear difference between the two.
The principles of Agile are associated with seamless production or development of a piece of
software. On the other hand, DevOps deals with the development, followed by deployment of the
software, ensuring faster turnaround time, minimum errors, and reliability.
Performance is improved by using Linux software raid and striping across four volumes.
1. Which among Puppet, Chef, SaltStack, and Ansible is the best Configuration
Management (CM) tool? Why?
Answer:
This depends on the organization’s needs, so mention a few points about each of these tools. Puppet is the
oldest and most mature CM tool. It is a Ruby-based Configuration Management tool, but
while it has some free features, much of what makes Puppet great is only available in the paid
version. Organizations that don’t need a lot of extras will find Puppet useful, but those needing
more customization will probably need to upgrade to the paid version.
Chef is written in Ruby, so it can be customized by those who know the language. It also
includes free features, plus it can be upgraded from open source to enterprise-level if necessary.
On top of that, it’s a very flexible product.
Ansible is a very secure option since it uses Secure Shell. It’s a simple tool to use, but it does
offer several other services in addition to configuration management. It’s very easy to learn, so
it’s perfect for those who don’t have a dedicated IT staff but still need a configuration
management tool.
SaltStack is a Python-based open-source CM tool made for larger businesses, but its learning
curve is fairly low.
3. What is an MX record?
Answer: An MX record tells senders how to send an email for your domain. When your domain
is registered, it’s assigned several DNS records, which enable your domain to be located on the
Internet. These include MX records, which direct the domain’s mail flow. Each MX record
points to an email server that’s configured to process mail for that domain. There’s typically one
record that points to a primary server, then additional records that point to one or more backup
servers. For users to send and receive an email, their domain’s MX records must point to a server
that can process their mail.
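The primary/backup arrangement is expressed through the preference value carried by each MX record: senders try the server with the lowest value first. A minimal sketch of that ordering follows; the host names are made up, and the records are hard-coded rather than looked up from DNS.

```python
def delivery_order(mx_records):
    """Return mail servers in the order a sender should try them.

    Each record is a (preference, hostname) pair; a lower preference value is tried first.
    Ties are broken alphabetically here for a stable result, although real senders
    usually pick among equal-preference servers at random.
    """
    return [host for _, host in sorted(mx_records)]


# Hypothetical MX records for a domain: one primary and two backups.
records = [
    (20, "backup1.mail.example.com"),
    (10, "primary.mail.example.com"),
    (30, "backup2.mail.example.com"),
]

order = delivery_order(records)
print(order)
```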
Given below is a generic logical flow where everything gets automated for seamless
delivery. However, this flow may vary from organization to organization as per the
requirement.
Developers develop the code and this source code is managed by Version Control System
tools like Git etc.
Developers send this code to the Git repository and any changes made in the code are
committed to this Repository.
Jenkins pulls this code from the repository using the Git plugin and builds it using tools
like Ant or Maven.
Configuration management tools like Puppet deploy and provision the testing environment,
and then Jenkins releases this code to the test environment, on which testing is done using
tools like Selenium.
Once the code is tested, Jenkins sends it for deployment on the production server (even the
production server is provisioned and maintained by tools like Puppet).
After deployment, it is continuously monitored by tools like Nagios.
Docker containers provide a testing environment to test the build features.
5. What is an AMI? How do we implement it?
Answer:
AMI stands for Amazon Machine Image. It is a copy of the root file system.
It provides the information required to launch an instance, which is a copy of the AMI
running as a server in the cloud. It’s easy to launch instances from many different AMIs.
Commodity hardware servers have a BIOS that points to the master boot record on the first
block of a disk.
A disk image, by contrast, can sit anywhere physically on a disk, so Linux can boot from an
arbitrary location on the EBS storage network.
Once you have defined plugins, explain why we need them. Nagios executes a plugin
whenever there is a need to check the status of a host or service. The plugin performs the
check and then simply returns the result to Nagios. Nagios processes the results that it receives
from the plugin and takes the necessary actions.
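A minimal sketch of what such a plugin check might look like, assuming the conventional Nagios
plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The check_usage function and its
thresholds are illustrative choices for this example, not Nagios settings.

```python
# Exit codes that Nagios plugins conventionally return.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_usage(percent_used, warn=80.0, crit=90.0):
    """Map a usage percentage to an (exit_code, status_line) pair.

    The thresholds are illustrative defaults for this sketch.
    """
    if percent_used is None or not 0 <= percent_used <= 100:
        return UNKNOWN, "USAGE UNKNOWN - no valid reading"
    if percent_used >= crit:
        return CRITICAL, f"USAGE CRITICAL - {percent_used:.0f}% used"
    if percent_used >= warn:
        return WARNING, f"USAGE WARNING - {percent_used:.0f}% used"
    return OK, f"USAGE OK - {percent_used:.0f}% used"
```

A real plugin would print the status line and exit with the code, and Nagios would then act on
that result.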
continuous audit
continuous controls monitoring
continuous transaction inspection
10. You have multiple Memcache servers, and one of them fails while holding your data.
Will the client ever try to get key data from that failed server?
Answer: The data in the failed server won’t get removed, but there is a provision for automatic
failover, which you can configure for multiple nodes. Failover can be triggered during any kind
of socket or Memcached server-level error, but not during normal client errors like adding an
existing key.
11. What is the Dogpile effect? How can you prevent it?
Answer: The Dogpile effect is the event when a cache entry expires and the website is hit
by multiple client requests for it at the same time. This effect can be prevented by
using a semaphore lock. In this system, when the value expires, the first process acquires the
lock and starts generating the new value.
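The locking idea can be sketched in a few lines. This is a simplified in-process model, not a
Memcached client: the LockedCache class is hypothetical, it uses a single semaphore for all
keys, and waiting callers simply block until the first one has stored the new value.

```python
import threading
import time


class LockedCache:
    """Tiny in-process cache where only one caller regenerates an expired key."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._data = {}                           # key -> (value, expiry timestamp)
        self._semaphore = threading.Semaphore(1)  # one regenerator at a time

    def get(self, key, regenerate):
        entry = self._data.get(key)
        if entry and entry[1] > time.monotonic():
            return entry[0]                       # fresh value, no lock needed
        with self._semaphore:                     # first caller in regenerates
            entry = self._data.get(key)
            if entry and entry[1] > time.monotonic():
                return entry[0]                   # someone else refreshed it meanwhile
            value = regenerate()
            self._data[key] = (value, time.monotonic() + self.ttl)
            return value
```

With this arrangement, many simultaneous requests for an expired key result in one call to
regenerate rather than one call per request.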
A VPC peering connection is a networking connection between two VPCs that enables
you to route traffic between them using private IP addresses. Instances in either
VPC can communicate with each other as if they are within the same network.
You can create a VPC peering connection between your VPCs, or with a VPC in another
AWS account within a single region.
If you have more than one AWS account within the same region and want to share or
transfer the data, you can peer the VPCs across those accounts to create a file-sharing
network. You can also use a VPC peering connection to allow other VPCs to access the
resources you have in one of your VPCs.
A VPC peering connection can help you to facilitate the transfer of data.
It provides easily configurable options and allows the user to configure the capacity.
It provides the complete control of computing resources and lets the user run the
computing environment according to his requirements.
It provides a fast way to run the instances and quickly boot the system, hence reducing
the overall time.
It provides scalability to the resources and changes its environment according to the
requirement of the user.
It provides a variety of tools to the developers to build failure resilient applications.
22. How do starting, stopping and terminating an instance work?
Answer: Starting and stopping an instance: When an instance is stopped, it performs a normal
shutdown and then transitions to a stopped state. You can start the instance again later,
since all of its Amazon EBS volumes remain attached. While an instance is in the stopped
state, you are not charged for additional instance hours.
Terminating the instance: When an instance is terminated, it performs a normal shutdown, and
then the attached EBS volumes are deleted unless the volume’s deleteOnTermination
attribute is set to false. The instance itself is also deleted, so you cannot start it again
afterward.
24. How do I transfer my existing domain name registration to Amazon Route 53 without
disrupting my existing web traffic?
Answer: You will need to get a list of the DNS record data for your domain name first, it is
generally available in the form of a “zone file” that you can get from your existing DNS
provider. Once you receive the DNS record data, you can use Route 53’s Management Console
or simple web-services interface to create a hosted zone that will store your DNS records for
your domain name and follow its transfer process.
It also includes steps such as updating the nameservers for your domain name to the ones
associated with your hosted zone.
To complete the process you have to contact the registrar with whom you registered your
domain name and follow the transfer process. As soon as your registrar propagates the new
name server delegations, your DNS queries will start to get answered.
25. When should I use a Classic Load Balancer and when should I use an Application load
balancer?
Answer: A Classic Load Balancer is ideal for simple load balancing of traffic across multiple
EC2 instances, while an Application Load Balancer is ideal for microservices or container-based
architectures where there is a need to route traffic to multiple services or load balance across
multiple ports on the same EC2 instance.
27. What do you understand by “Infrastructure as code”? How does it fit into the DevOps
methodology? What purpose does it achieve?
Answer:
Infrastructure as Code (IAC) is an approach in which operations teams automatically manage
and provision IT infrastructure through code, rather than through a manual process.
To deploy faster, companies treat infrastructure like software: as code that can be
managed with the DevOps tools and processes. These tools let you make infrastructure
changes more easily, rapidly, safely and reliably.
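To make the idea concrete, here is a small sketch of infrastructure described as data and
compared against what currently exists. The resource names, attributes and the plan function
are hypothetical and do not correspond to any particular tool.

```python
# Desired infrastructure, declared as data and kept in version control.
# Resource names and attributes here are hypothetical.
desired = {
    "web-1": {"type": "server", "size": "small"},
    "web-2": {"type": "server", "size": "small"},
    "db-1": {"type": "database", "size": "large"},
}

# What currently exists, e.g. as reported by a provider's inventory.
actual = {
    "web-1": {"type": "server", "size": "small"},
    "db-1": {"type": "database", "size": "medium"},
    "old-1": {"type": "server", "size": "small"},
}


def plan(desired, actual):
    """Return the actions needed to make `actual` match `desired`."""
    actions = []
    for name in sorted(desired.keys() | actual.keys()):
        if name not in actual:
            actions.append(("create", name))
        elif name not in desired:
            actions.append(("delete", name))
        elif desired[name] != actual[name]:
            actions.append(("update", name))
    return actions
```

Because the desired state lives in a file, a change to infrastructure becomes a reviewable code
change, and the same plan can be computed repeatedly with the same result.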
GET
HEAD
PUT
POST
PATCH
DELETE
TRACE
CONNECT
OPTIONS
30. Explain how I can vertically scale an Amazon instance?
Answer: This is one of the essential features of AWS and cloud virtualization. Spin up a new,
larger instance, stop that instance, detach its root EBS volume from the server and discard
it. Then stop your live instance and detach its root volume. Note down the unique device ID,
attach that root volume to the new server, and start it again.
This results in a vertically scaled Amazon instance.
The web server group allows ports 80 and 443 from anywhere in the world, but port 22 only from
the jump box group. The database group allows port 3306 from the web server group and port 22
from the jump box group. Any machine added to the web server group can then store data in the
database. No one can SSH directly to any of your boxes.
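The group rules above can be modelled as a small lookup table. This is only a sketch of the
logic, with illustrative group labels; it does not call any AWS API.

```python
ANYWHERE = "anywhere"

# Inbound rules per group: port -> set of allowed sources.
# Group names are illustrative labels for the setup described above.
inbound_rules = {
    "webserver": {80: {ANYWHERE}, 443: {ANYWHERE}, 22: {"jumpbox"}},
    "database": {3306: {"webserver"}, 22: {"jumpbox"}},
}


def is_allowed(source, target_group, port):
    """Return True if traffic from `source` may reach `target_group` on `port`."""
    allowed = inbound_rules.get(target_group, {}).get(port, set())
    return ANYWHERE in allowed or source in allowed
```

For example, is_allowed("internet", "webserver", 22) is False, while
is_allowed("jumpbox", "webserver", 22) is True, which matches the rule that SSH is reachable
only through the jump box.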
31. How can we make sure a new service is ready for the product launch?
Answer:
Backup System
Recovery plans
Load Balancing
Monitoring
Centralized logging
Git
Jenkins
Selenium
Puppet
Chef
Ansible
Nagios
Docker
35. What is the one most important thing DevOps helps do?
Answer: The most important thing DevOps helps do is to get the changes into production as
quickly as possible while minimizing risks in software quality assurance and compliance. That is
the primary objective of DevOps. However, there are many other positive side-effects to
DevOps. For example, clearer communication and better working relationships between teams
which creates a less stressful working environment.
3. How do you find a list of files that have changed in a particular commit?
Answer:
For this answer, rather than just stating the command, explain what exactly the command does.
You can say that, to get a list of the files that changed in a particular commit, you use the
command:
git diff-tree -r
Given the commit hash, this lists all the files that were changed or added in that commit. The
-r flag makes the command list individual files, rather than collapsing them into root directory
names only.
You can also include the point below; it is entirely optional, but it will help in impressing
the interviewer.
The output will also include some extra information, which can be easily suppressed by including
two flags, --no-commit-id and --name-only.
Here --no-commit-id suppresses the commit hashes in the output, and --name-only prints only the
file names, without the file modes, object hashes and status letters of the raw output.
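For scripting, the same command can be assembled and run from Python. This is a sketch under
the assumption that git is installed and on the PATH; diff_tree_command and changed_files are
hypothetical helper names, and the commit argument is whatever hash you want to inspect.

```python
import subprocess


def diff_tree_command(commit, names_only=True):
    """Build the git diff-tree argument list for one commit."""
    cmd = ["git", "diff-tree", "-r"]
    if names_only:
        # Suppress the commit hash line and print file names only.
        cmd += ["--no-commit-id", "--name-only"]
    return cmd + [commit]


def changed_files(commit, repo_dir="."):
    """Run git in repo_dir and return the files touched by the commit."""
    result = subprocess.run(
        diff_tree_command(commit),
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()
```

Keeping the command construction separate from the subprocess call makes it easy to inspect
exactly which flags are being passed.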
4. How do you find out in Git whether a branch has already been merged into master?
Answer:
I suggest you include both of the commands below:
git branch --merged lists the branches that have been merged into the current branch.
git branch --no-merged lists the branches that have not been merged.
8. What is Puppet?
Answer: I advise you to first give a short definition of Puppet. It is a
Configuration Management tool that is used to automate administration tasks.
Now you should describe its architecture and how Puppet manages its Agents. Puppet has a
Master-Slave architecture in which the Slave first has to send a certificate signing request
to the Master, and the Master has to sign that certificate to establish a secure connection
between Puppet Master and Puppet Slave. The Puppet Slave then sends a request to the Puppet
Master, and the Puppet Master pushes the configuration to the Slave.
10. Tell me about a time when you used collaboration and Puppet to help resolve a conflict
within a team?
Answer: Tell them about your past experience with Puppet and how it was useful in resolving
conflicts. You can refer to the example below:
The development team wanted root access on test machines managed by Puppet in order to make
specific configuration changes. We responded by meeting with them weekly to agree on a
process for developers to communicate configuration changes and to empower them to make
many of the changes they needed. Through our joint efforts, we came up with a way for the
developers to change specific configuration values themselves via data abstracted through
Hiera. In fact, we even taught one of the developers how to write Puppet code in
collaboration with us.
16. What are microservices and why do they have an impact on operations?
Answer: Microservices are a product of software architecture and programming practices.
Microservices architectures typically produce smaller, but more numerous, artifacts that
Operations is responsible for regularly deploying and managing. For this reason, microservices
have an important impact on Operations. The term that describes the responsibilities of
deploying microservices is microdeployments. So, what DevOps is really about is bridging the
gap between microservices and microdeployments.
17. What are the reasons against using an RDBMS?
Answer: In a nutshell, if your application is all about storing application entities in a
persistent and consistent way, then an RDBMS could be overkill.
A simple Key-Value storage solution might be perfect for you. Note that the value
is not meant to be a simple element but can be a complex entity in itself!
Another reason could be if you have hierarchical application objects and need some query
capability into them; then most NoSQL solutions might be a fit. With an RDBMS
you can use ORM to achieve the same result, but at the cost of adding extra complexity.
An RDBMS is also not the best solution if you are trying to store large trees or
networks of objects. Depending on your other needs, a Graph Database might suit you.
If you are running in the Cloud and need to run a distributed database for
durability and availability, then you could check Dynamo and Bigtable based
datastores, which are built for this core purpose.
Last but not least, if your data grows too large to be processed on a single machine, you might
look into Hadoop or any other solution that supports distributed Map/Reduce.
19. What is driving the adoption of DevOps in the industry?
Answer: Use of agile and other development processes and methods.
Demand for an increased rate of production releases from application and
business stakeholders.
Wide availability of virtual and cloud infrastructure from both internal and external providers.
Increased usage of data center automation and configuration management tools.
Increased focus on test automation and continuous integration methods.
Best practices on critical issues.
20. Tell us how you have used Docker in your past position?
Answer: Explain how you have used Docker to help rapid deployment. Explain
how you have scripted Docker and used Docker with other tools like
Puppet, Chef, or Jenkins. If you have no past practical experience with Docker but have past
experience with other tools in a similar space, be honest and explain the same. In
this case, it makes sense if you can compare other tools to Docker in terms of functionality.
23. What are the advantages of a NoSQL database over an RDBMS?
Answer: The advantages are:
27. Is there a difference between Agile and DevOps? If yes, please explain.
Answer: As a DevOps engineer, interview questions like this are quite expected. Start
by describing the obvious overlap between DevOps and Agile. Although the implementation of
DevOps is always in sync with Agile methodologies, there is a clear difference
between the two. The principles of Agile are associated with seamless production or
development of a piece of software. On the other hand, DevOps deals with development,
followed by deployment of the software, ensuring faster turnaround time, minimum
errors, and reliability.
30. What are the core operations of DevOps in terms of development and Infrastructure?
Answer:
The core operations of DevOps:
Application development
Code developing
Code coverage
Unit testing
Packaging
Deployment
With infrastructure
Provisioning
Configuration
Orchestration
Deployment
38. What are the components involved in Amazon Web Services?
Answer: There are four components involved, as listed below.
Amazon S3: with this, one can retrieve the key information used in creating the cloud
architecture, and the data produced as a result for a given key can also be stored in this
component.
Amazon EC2: helpful for running a large distributed system on a Hadoop cluster. Automatic
parallelization and job scheduling can be achieved with this component.
Amazon SQS: this component acts as a mediator between different controllers. It is also used
for buffering requests received by the Amazon manager.
Amazon SimpleDB: helps in storing the intermediate status logs and the tasks executed by the
consumers.
1. Jenkins: This is an open source automation server used as a continuous integration tool. We can
build, deploy and run automated tests with Jenkins.
2. GIT: It is a version control tool used for tracking changes in files and software.
3. Docker: This is a popular tool for containerization of services. It is very useful in Cloud based
deployments.
4. Nagios: We use Nagios for monitoring of IT infrastructure.
5. Splunk: This is a powerful tool for log search as well as monitoring production systems.
6. Puppet: We use Puppet to automate our DevOps work so that it is reusable.
DevOps is a very popular trend in Software Development. Some of the main benefits of DevOps
are as follows:
1. Release Velocity: DevOps practices help in increasing the release velocity. We can release code
to production more often and with more confidence.
2. Development Cycle: With DevOps, the complete Development cycle from initial design to
production deployment becomes shorter.
3. Deployment Rollback: In DevOps, we plan for rolling back a deployment in case of failure due
to a bug in code or an issue in production. This gives confidence in releasing features without
worrying about downtime for rollback.
4. Defect Detection: With DevOps approach, we can catch defects much earlier than releasing to
production. It improves the quality of the software.
5. Recovery from Failure: In case of a failure, we can recover very fast with DevOps process.
6. Collaboration: With DevOps, collaboration between development and operations professionals
increases.
7. Performance-oriented: With DevOps, organization follows performance-oriented culture in
which teams become more productive and more innovative.
Amazon Web Services (AWS) provides many tools and features to deploy and manage
applications in AWS. As per DevOps, we treat infrastructure as code. We mainly use the following
two services from AWS for DevOps:
1. CloudFormation: We use AWS CloudFormation to create and deploy AWS resources by using
templates. We can describe our dependencies and pass special parameters in these templates.
CloudFormation can read these templates and deploy the application and resources in AWS
cloud.
2. OpsWorks: AWS provides another service called OpsWorks that is used for configuration
management by utilizing Chef framework. We can automate server configuration, deployment
and management by using OpsWorks. It helps in managing EC2 instances in AWS as well as any
on-premises servers.
5. How will you run a script automatically when a developer commits a change into GIT?
GIT provides the feature to execute custom scripts when certain event occurs in GIT. This
feature is called hooks.
We can write two types of hooks.
1. Client-side hooks
2. Server-side hooks
For this case, we can write a Client-side post-commit hook. This hook will execute a custom
script in which we can add the message and code that we want to run automatically with each
commit.
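As an illustration, the sketch below writes a post-commit hook into a repository. It relies on
two facts about Git: client-side hooks live in the .git/hooks directory, and a hook runs only
if its file is executable. The install_post_commit_hook helper and the echo message are
placeholders for this example.

```python
import os
import stat

# Example hook body; the echo message is a placeholder for real work.
HOOK_SCRIPT = """#!/bin/sh
echo "post-commit hook: commit recorded"
"""


def install_post_commit_hook(repo_dir, script=HOOK_SCRIPT):
    """Write an executable post-commit hook into repo_dir/.git/hooks."""
    hooks_dir = os.path.join(repo_dir, ".git", "hooks")
    os.makedirs(hooks_dir, exist_ok=True)
    hook_path = os.path.join(hooks_dir, "post-commit")
    with open(hook_path, "w") as handle:
        handle.write(script)
    # Git only runs hooks that are marked executable.
    mode = os.stat(hook_path).st_mode
    os.chmod(hook_path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return hook_path
```

Client-side hooks are not copied when a repository is cloned, so each developer needs the hook
installed locally; a server-side hook is the usual choice when the script must run for everyone.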
1. Server Support: With AWS OpsWorks Stacks we can automate operational tasks on any server in
AWS as well as in our own data center.
2. Scalable Automation: We get automated scaling support with AWS OpsWorks Stacks. Each new
instance in AWS can read configuration from OpsWorks. It can even respond to system events in
same way as other instances do.
3. Dashboard: We can create dashboards in OpsWorks to display the status of all the stacks in
AWS.
4. Configuration as Code: AWS OpsWorks Stacks are built on the principle of “Configuration as
Code”. We can define and maintain configurations like application source code. Same
configuration can be replicated on multiple servers and environments.
5. Application Support: OpsWorks supports almost all kinds of applications. So it is universal in
nature.
Once the template is ready and submitted to AWS, CloudFormation will create all the resources
in the template. This helps in automation of building new environments in AWS.
CICD stands for Continuous Integration and Continuous Delivery. These are two different
concepts that are complementary to each other.
Continuous Integration (CI): In CI all the developer work is merged to main branch several times
a day. This helps in reducing integration problems.
In CI we try to minimize the duration for which a branch remains checked out. A developer gets
early feedback on the new code added to main repository by using CI.
Continuous Delivery (CD): In CD, a software team plans to deliver software in short cycles.
They perform development, testing and release in such a short time that incremental changes can
be easily delivered to production.
In CD, as a DevOps we create a repeatable deployment process that can help achieve the
objective of Continuous Delivery.
1. Build Automation: In CI, we create a build environment in which a build can be triggered with
a single command. This automation is done all the way up to deployment to the Production
environment.
2. Main Code Repository: In CI, we maintain a main branch in code repository that stores all the
Production ready code. This is the branch that we can deploy to Production any time.
3. Self-testing build: Every build in CI should be self-tested. It means with every build there is a set
of tests that runs to ensure that changes are of high quality.
4. Every day commits to baseline: Developers commit all of their changes to baseline
every day. This ensures that there is no big pileup of code waiting for integration with the main
repository for a long time.
5. Build every commit to baseline: With Automated Continuous Integration, every time a commit is
made into baseline, a build is triggered. This helps in confirming that every change integrates
correctly.
6. Fast Build Process: One of the requirements of CI is to keep the build process fast so that we can
quickly identify any problem.
7. Production like environment testing: In CI, we maintain a production like environment also
known as pre-production or staging environment, which is very close to Production
environment. We perform testing in this environment to check for any integration issues.
8. Publish Build Results: We publish build results on a common site so that everyone can see these
and take corrective actions.
9. Deployment Automation: The deployment process is automated to the extent that in a build
process we can add the step of deploying the code to a test environment. On this test
environment all the stakeholders can access and test the latest delivery.
1. CI makes the current build constantly available for testing, demo and release purpose.
2. With CI, developers write modular code that works well with frequent code check-ins.
3. In case of a unit test failure or bug, a developer can easily revert to the bug-free state of
the code.
4. There is drastic reduction in chaos on release day with CI practices.
5. With CI, we can detect Integration issues much earlier in the process.
6. Automated testing is one very useful side effect of implementing CI.
7. All the stakeholders including business partners can see the small changes deployed into pre-
production environment. This provides early feedback on the changes to software.
8. Automated CI and testing generates metrics like code-coverage, code complexity that help in
improving the development process.
In Jenkins, it is very important to make the system secure by setting up user authentication and
authorization. To do this we have to do the following:
1. First we have to set up the Security Realm. We can integrate Jenkins with LDAP server to create
user authentication.
2. Second part is to set the authorization for users. This determines which user has access to what
resources.
Chef is an automation tool for keeping infrastructure as code. It has many benefits. Some of
these are as follows:
1. Cloud Deployment: We can use Chef to perform automated deployment in Cloud environment.
2. Multi-cloud support: With Chef we can even use multiple cloud providers for our infrastructure.
3. Hybrid Deployment: Chef supports both Cloud based as well as datacenter-based infrastructure.
4. High Availability: With Chef automation, we can create high availability environment. In case of
hardware failure, Chef can maintain or start new servers in automated way to maintain highly
available environment.
Chef is composed of many components like Chef Server, Client etc. Some of the main
components in Chef are as follows:
1. Client: These are the nodes or individual users that communicate with Chef server.
2. Chef Manage: This is the web console that is used for interacting with Chef Server.
3. Load Balancer: All the Chef server API requests are routed through Load Balancer. It is
implemented in Nginx.
4. Bookshelf: This is the component that stores cookbooks. All the cookbooks are stored in a
repository. It is separate storage from the Chef server.
5. PostgreSQL: This is the data repository for Chef server.
6. Chef Server: This is the hub for configuration data. All the cookbooks and policies are stored in
it. It can scale to the size of any enterprise.
14. What is a Recipe in Chef?
Ansible is a powerful tool for IT Automation for large scale and complex deployments. It
increases the productivity of team. Some of the main benefits of Ansible are as follows:
1. App Deployment: With Ansible, we can deploy apps in a reliable and repeatable way.
2. Configuration Management: Ansible supports the automation of configuration management
across multiple environments.
3. Continuous Delivery: We can release updates with zero downtime with Ansible.
4. Security: We can implement complex security policies with Ansible.
5. Compliance: Ansible helps in verifying an organization’s systems against the rules
and regulations.
6. Provisioning: We can provision new systems and resources for other users with Ansible.
7. Orchestration: Ansible can be used in orchestration of complex deployment in a simple way.
Docker Hub is a cloud-based registry. We can use Docker Hub to link code repositories. We can
even build images and store them in Docker Hub. It also provides links to Docker Cloud to
deploy the images to our hosts.
Docker Hub is a central repository for container image discovery, distribution, change
management, workflow automation and team collaboration.
In DevOps, we use different scripting languages for different purposes. There is no single
language that can work in all the scenarios. Some of the popular scripting languages that we use
are as follows:
1. Bash: On Unix based systems we use Bash shell scripting for automating tasks.
2. Python: For complicated programming and large modules we use Python. We can easily use a
wide variety of standard libraries with Python.
3. Groovy: This is a Java-based scripting language. We need a JVM installed in an environment to
use Groovy. It provides very powerful features.
4. Perl: This is another language that is very useful for text parsing. We use it in web applications.
With MFA, the system becomes more secure and it cannot be easily hacked.
Nagios is open source software to monitor systems, networks and infrastructure. The main
benefits of Nagios are as follows:
1. Monitor: DevOps can configure Nagios to monitor IT infrastructure components, system metrics
and network protocols.
2. Alert: Nagios will send alerts when a critical component in infrastructure fails.
3. Response: DevOps acknowledges alerts and takes corrective actions.
4. Report: Periodically Nagios can publish/send reports on outages, events and SLAs etc.
5. Maintenance: During maintenance windows, we can also disable alerts.
6. Planning: Based on past data, Nagios helps in infrastructure planning and upgrades.
State Stalking is a very useful feature. Though all the users do not use it all the time, it is very
helpful when we want to investigate an issue.
In State Stalking, we can enable stalking on a host. Nagios will monitor the state of the host very
carefully and it will log any changes in the state.
By this we can identify what changes might be causing an issue on the host.
Puppet Enterprise is a DevOps software platform that is used for automation of infrastructure
operations. It runs on Unix as well as on Windows.
The system configuration described in Puppet’s language can be distributed to a target system by
using REST API calls.
In Kubernetes we can create a cluster of servers that are connected to work as a single unit. We
can deploy a containerized application to all the servers in a cluster without specifying the
machine name.
We have to package applications in such a way that they do not depend on a specific host.
Master: There is a master node that is responsible for managing the cluster. The Master performs
the following functions in a cluster.
1. Scheduling Applications
2. Maintaining desired state of applications
3. Scaling applications
4. Applying updates to applications
Nodes: A Node in Kubernetes is responsible for running an application. The Node can be a
Virtual Machine or a Computer in the cluster. There is software called Kubelet on each node.
This software is used for managing the node and communicating with the Master node in cluster.
There is a Kubernetes API that is used by Nodes to communicate with the Master. When we
deploy an application on Kubernetes, we request Master to start application containers on Nodes.
In a Kubernetes cluster, there is a Deployment Controller. This controller monitors the instances
created by Kubernetes in a cluster. Once a node or the machine hosting the node goes down, the
Deployment Controller replaces the affected instances with new instances on another node.
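A toy sketch of this self-healing behaviour, in which the controller keeps the desired number of
application instances running after a node goes down. It is a simplified model with a
hypothetical reconcile function and made-up node names; it does not use the Kubernetes API.

```python
def reconcile(placements, healthy_nodes, desired_replicas):
    """Return a new placement list with `desired_replicas` instances on healthy nodes.

    `placements` is a list of node names, one entry per running instance.
    """
    if not healthy_nodes:
        return []
    # Keep instances whose node is still healthy.
    kept = [node for node in placements if node in healthy_nodes][:desired_replicas]
    # Start replacements on the least-loaded healthy nodes.
    while len(kept) < desired_replicas:
        target = min(sorted(healthy_nodes), key=kept.count)
        kept.append(target)
    return kept
```

For instance, with three instances spread over node-a, node-b and node-c, losing node-b leads to
a replacement instance being placed on one of the two remaining nodes.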
In DevOps approach we release software with high frequency to production. We have to run tests
to gain confidence on the quality of software deliverables.
Running tests manually is a time taking process. Therefore, we first prepare automation tests and
then deliver software. This ensures that we catch any defects early in our process.
Chaos Monkey is a concept made popular by Netflix. In Chaos Monkey, we intentionally try to
shut down the services or create failures. By failing one or more services, we test the reliability
and recovery mechanism of the Production architecture.
It checks whether our applications and deployments have a survival strategy built into them or not.
We use Jenkins to create automated flows to run Automation tests. The first part of test
automation is to develop test strategy and test cases. Once automation test cases are ready for an
application, we have to plug these into each Build run.
In each Build we run Unit tests, Integration tests and Functional tests.
With a Jenkins job, we can automate all these tasks. Once all the automated tests pass, we
consider the build as green. This helps in deployment and release processes to build confidence
on the application software.
32. What are the main services of AWS that you have used?
33. Why is GIT considered better than CVS as a version control system?
GIT is a distributed system. In GIT, any person can create their own branch and start checking in
the code. Once the code is tested, it is merged into the main GIT repo. In between, Dev, QA and
product can validate the implementation of that code.
In CVS, there is a centralized system that maintains all the commits and changes.
GIT is open source software and there are plenty of extensions in GIT for use by our teams.
We need to select an Operating System (OS) to get a specific Virtual Machine (VM). VM
provides full OS to an application for running in a virtualized environment.
A Container just provides the APIs that are required by the application.
This is a tricky question. DevOps is a new concept and in any organization the maturity of
DevOps varies from highly Operations oriented to highly DevOps oriented. In some projects
teams are very mature and practice DevOps in its true form. In some projects, teams rely more on
the Operations team.
As a DevOps person I give first priority to the needs of an organization and project. At some
times I may have to perform a lot of operations work. But with each iteration, I aim to bring
DevOps changes incrementally to an organization.
Over time, organization/project starts seeing results of DevOps practices and embraces it fully.
REST is also known as Representational State Transfer. A REST service is a simple software
functionality that is available over HTTP protocol. It is a lightweight service that is widely
available due to the popularity of HTTP protocol.
Since REST is lightweight, it has very good performance in a software system. It is also one of
the foundations for creating highly scalable systems that provide a service to a large number of
clients.
Another key feature of a REST service is that as long as the interface is kept same, we can
change the underlying implementation. E.g. Clients of REST service can keep calling the same
service while we change the implementation from php to Java.
The Three Ways of DevOps are three basic principles of DevOps culture. These are as follows:
1. The First Way: Systems Thinking: In this principle we see DevOps as a flow of work from left
to right, measured as the time taken from code check-in to the feature being released to the end
customer. In DevOps culture we try to identify the bottlenecks in this flow.
2. The Second Way: Feedback Loops: Whenever there is an issue in production, it is feedback
about the whole development and deployment process. We try to make the feedback loop
more efficient so that teams get the feedback much faster. It is a way of catching defects
much earlier in the process, rather than having them reported by the customer.
3. The Third Way: Continuous Learning: We use the first and second principles to keep
making improvements to the overall process. Over time, we
make the process and our operations highly efficient, automated and error free by continuously
improving them.
Security of a system is one of the most important goals for an organization. We use the following
ways to apply DevOps to security.
1. Automated Security Testing: We automate security testing techniques such as
penetration testing and fuzz testing and integrate them into the software development process.
2. Early Security Checks: We ensure that teams know about the security concerns at the beginning
of a project, rather than at the end of delivery. This is achieved by conducting security training
and knowledge-sharing sessions.
3. Standard Process: In DevOps we try to follow standard deployment and development processes
that have already gone through security audits. This minimizes the introduction of
new security loopholes through changes to the standard process.
If we get an issue in production, we first write an automated test to validate that the issue
happens in the current release. Once the issue is fixed in the release code, we run the same test to
validate that the defect is gone. With each release we keep running these tests so that the
issue does not reappear.
One of the techniques for writing self-testing code is Test Driven Development (TDD).
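The practice described above can be sketched with Python's standard unittest module. The function parse_price and the defect it once had (failing on a thousands separator) are hypothetical and chosen only for illustration.

```python
import unittest


def parse_price(text: str) -> float:
    """Hypothetical function under test: convert a price string to a float.

    The assumed production defect was a crash on thousands separators;
    the fix is to strip them before conversion.
    """
    return float(text.replace(",", "").strip())


class ParsePriceRegressionTest(unittest.TestCase):
    """Written when the defect was reported and kept in the suite for every release."""

    def test_price_with_thousands_separator(self):
        # This input reproduced the assumed production issue before the fix.
        self.assertEqual(parse_price("1,299.50"), 1299.50)

    def test_plain_price_still_works(self):
        # Guards against the fix breaking the common case.
        self.assertEqual(parse_price(" 42 "), 42.0)
```

Running python -m unittest against the file containing this code executes both tests, so the same check can run in every build of the pipeline.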
In DevOps, our aim is to automate all the stages of Deployment Pipeline. With a smooth running
Deployment Pipeline, we can achieve the goal of Continuous Delivery.
1. Image Repositories: In Docker Hub we can push, pull, find and manage Docker Images. It is a big
library that has images from community, official as well as private sources.
2. Automated Builds: Docker Hub can build new images automatically when changes are pushed to the
linked source code repository of the image.
3. Webhooks: Docker Hub can call a webhook when an image is pushed to a repository, which lets
us trigger actions in other systems.
4. GitHub/Bitbucket integration: Docker Hub also provides integration with GitHub and Bitbucket
systems.
44. What are the security benefits of using Container based system?
Some of the main security benefits of using a Container based system are as follows:
In Nagios, we can monitor hosts and services with active checks. In addition, Nagios also supports
Passive checks that are initiated by external applications.
The results of Passive checks are submitted to Nagios. There are two main use cases of Passive
checks:
1. We use Passive checks to monitor asynchronous services that cannot be checked effectively by
polling them with Active checks at regular intervals of time.
2. We can use Passive checks to monitor services or applications that are located behind a firewall.
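As a rough illustration, an external application can submit a passive service result by appending a PROCESS_SERVICE_CHECK_RESULT line to the Nagios external command file. The sketch below only formats and writes that line; the host name, service name and command file path are assumptions and depend on the installation.

```python
import time

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def passive_service_result(host, service, return_code, output, timestamp=None):
    """Format one PROCESS_SERVICE_CHECK_RESULT external command line."""
    if return_code not in (OK, WARNING, CRITICAL, UNKNOWN):
        raise ValueError("return_code must be 0, 1, 2 or 3")
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return (
        f"[{ts}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{return_code};{output}\n"
    )


def submit(command_file, line):
    """Append the command to the external command file (path is site-specific)."""
    with open(command_file, "a") as handle:
        handle.write(line)
```

A nightly job could, for example, call passive_service_result("web01", "nightly-backup", OK, "backup finished") when it completes and pass the result to submit together with the path of the command file configured for that Nagios instance.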
A Docker Container is a lightweight system that can be run on a Linux operating system or a
virtual machine. It is a package of an application and related dependencies that can be run
independently.
Since Docker Container is very lightweight, multiple containers can be run simultaneously on a
single server or virtual machine.
With a Docker Container we can create an isolated system with restricted services and processes.
A Container has private view of the operating system. It has its own process ID space, file
system, and network interface.
We can use the docker rmi command to delete an image from our local system.
If we want to find the IDs of all the Docker images in our local system, we can use the docker images
command.
% docker images
1. Setting up Development Environment: We can use Docker to set the development environment
with the applications on which our code is dependent.
2. Testing Automation Setup: Docker can also help in creating the test automation setup. We
can set up different services and apps with Docker to create the automated testing
environment.
3. Production Deployment: Docker also helps in implementing the Production deployment for an
application. We can use it to create the exact environment and process that will be used for
doing the production deployment.
A Docker Container has its own file system. An application running in a Docker Container
can write to this file system. When the container exits, the data written to the file system remains.
When we restart the same container, the data can be accessed again. The data is lost only when the
container is removed, so data that must outlive the container belongs in a volume.
Docker Questions
Docker is Open Source software. It provides the automation of Linux application deployment in
a software container.
Docker can package software in a complete file system that contains software code, runtime
environment, system tools, & libraries that are required to install and run the software on a
server.
52. What is the difference between Docker image and Docker container?
A Docker image is an immutable file, which is a snapshot of a container. We create an image with
the docker build command. A container is a running instance of an image.
In a Hypervisor environment we first create a Virtual Machine and then install an Operating
System on it. After that we deploy the application. The virtual machine may also be installed on
different hardware configurations.
In a Docker environment, we just deploy the application in Docker. There is no guest OS layer in
this environment. We specify the libraries, and the kernel is provided by the host and shared
through the Docker engine.
Yes. YAML is a superset of JSON. Therefore any JSON file is also a valid YAML file.
If we use a JSON file, then we have to specify it in the docker-compose command as
follows:
% docker-compose -f docker-compose.json up
Yes, theoretically we can run multiple apps on one Docker server. But in practice, it is better to
run different components in separate containers.
This gives us a cleaner environment, and each container can be reused for multiple purposes.
1. Multiple environments on same Host: We can use it to create multiple environments on the
same host server.
2. Preserve Volume Data on Container Creation: Docker compose also preserves the volume data
when we create a container.
3. Recreate the changed Containers: We can also use compose to recreate the changed containers.
4. Variables in Compose file: Docker compose also supports variables in compose file. In this way
we can create variations of our containers.
The most popular use of Docker is in build pipeline. With the use of Docker it is much easier to
automate the development to deployment process in build pipeline.
We use Docker for the complete build flow from development work, test run and deployment to
production environment.
58. What is the role of open source development in the popularity of Docker?
Since Linux was an open source operating system, it opened new opportunities for developers
who want to contribute to open source systems.
One of the very good outcomes of open source software is Docker. It has very powerful features.
Docker has wide acceptance due to its usability as well as its open source approach of integrating
with different systems.
59. What is the difference between Docker commands: up, run and start?
1. Up: We use this command to build, create, start or restart all the services in a
docker-compose.yml file. It also attaches to the containers for a service. This command can also
start linked services.
2. Run: We use this command for ad hoc requests. It starts only the service that we specifically
want to start. We generally use it to run specific tests or administrative tasks.
3. Start: This command is used to start containers that were previously created but are not
currently running. This command does not create new containers.
Docker Swarm is used to create a cluster environment. It can turn a group of Docker engines into
a Single virtual Docker Engine. This creates a system with pooled resources. We can use Docker
Swarm to scale our application.
A Docker Image is the blueprint that is used to create a Docker Container. Whenever we want to
run a container we have to specify the image that we want to run.
There are many Docker images available online for standard software. We can use these images
directly from the source.
The standard set of Docker Images is stored in Docker Hub Registry. We can download these
from this location and use it in our environment.
We can also create our own Docker Image with the software that we want to run as a container.
63. What is a Docker Container?
A Docker Container is a lightweight system that can be run on a Linux operating system or a
virtual machine. It is a package of an application and related dependencies that can be run
independently.
Since Docker Container is very lightweight, multiple containers can be run simultaneously on a
single server or virtual machine.
With a Docker Container we can create an isolated system with restricted services and processes.
A Container has private view of the operating system. It has its own process ID space, file
system, and network interface.
We can use Docker Machine to install Docker Engine on virtual hosts. It also provides
commands to manage virtual hosts.
Some of the popular Docker machine commands enable us to start, stop, inspect and restart a
managed host.
Docker Machine provides a Command Line Interface (CLI), which is very useful in managing
multiple hosts.
1. Old Desktop: If we have an old desktop and we want to run Docker then we use Docker Machine
to run Docker. It is like installing a virtual machine on an old hardware system to run Docker
engine.
2. Remote Hosts: Docker Machine is also used to provision Docker hosts on remote systems. By
using Docker Machine you can install Docker Engine on remote hosts and configure clients on
them.
To create a Container in Docker we have to create a Docker Image. We can also use an existing
Image from Docker Hub Registry.
Yes, a Docker Container can provide process management that can be used to run multiple
processes. There are process supervisors like runit, s6, daemontools etc that can be used to fork
additional processes in a Docker container.
69. What are the objects created by Docker Cloud in Amazon Web Services (AWS) EC2?
1. VPC: Docker Cloud creates a Virtual Private Cloud with the tag name dc-vpc. It also assigns it a
Classless Inter-Domain Routing (CIDR) block of 10.78.0.0/16.
2. Subnet: Docker Cloud creates a subnet in each Availability Zone (AZ). In Docker Cloud, each
subnet is tagged with dc-subnet.
3. Internet Gateway: Docker Cloud also creates an internet gateway with name dc-gateway and
attaches it to the VPC created earlier.
4. Routing Table: Docker Cloud also creates a routing table named dc-route-table in Virtual Private
Cloud. In this Routing Table Docker Cloud associates the subnet with the Internet Gateway.
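The size of the 10.78.0.0/16 block can be checked with Python's standard ipaddress module. The split into /20 subnets below is an assumption made purely for illustration; the text does not state which subnet size Docker Cloud uses.

```python
import ipaddress

# CIDR block of the VPC, as given in the text.
vpc = ipaddress.ip_network("10.78.0.0/16")

# A /16 prefix leaves 16 host bits, so the block spans 2**16 addresses.
total_addresses = vpc.num_addresses

# Assumption for illustration only: carve the block into /20 subnets,
# for example one per Availability Zone.
subnets = list(vpc.subnets(new_prefix=20))


def in_vpc(address: str) -> bool:
    """Return True if the given IPv4 address falls inside the VPC block."""
    return ipaddress.ip_address(address) in vpc


if __name__ == "__main__":
    print(total_addresses)
    print(subnets[0], subnets[1])
    print(in_vpc("10.78.200.7"), in_vpc("10.79.0.1"))
```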
70. How will you take backup of Docker container volumes in AWS S3?
We can use a utility named Dockup provided by Docker Cloud to take backup of Docker
container volumes in S3.
1. Environment: We first define the environment of our application with a Dockerfile. It can be
used to recreate the environment at a later point of time.
2. Services: Then we define the services that make our app in docker-compose.yml. By using this
file we can define how these services can be run together in an environment.
3. Run: The last step is to run the Docker Container. We use docker-compose up to start and run
the application.
Docker storage driver is by default based on a Linux file system. But Docker storage driver also
has provision to plug in any other storage driver that can be used for our environment.
In the Pluggable Storage Driver architecture, we can use multiple kinds of file systems in our
Docker Containers. The docker info command shows the Storage Driver that is set on a Docker
daemon.
We can even plug in shared storage systems with the Pluggable Storage Driver architecture.
73. What are the main security concerns with Docker based containers?
1. Kernel Sharing: In a container-based system, multiple containers share same Kernel. If one
container causes Kernel to go down, it will take down all the containers. In a virtual machine
environment we do not have this issue.
2. Container Leakage: If a malicious user gains access to one container, it can try to access the
other containers on the same host. If a container has security vulnerabilities it can allow the user
to access other containers on same host machine.
3. Denial of Service: If one container occupies the resources of a Kernel then other containers will
starve for resources. It can create a Denial of Service attack like situation.
4. Tampered Images: Sometimes a container image can be tampered. This can lead to further
security concerns. An attacker can try to run a tampered image to exploit the vulnerabilities in
host machines and other containers.
5. Secret Sharing: Generally one container can access other services. To access a service it
requires a Key or Secret. A malicious user can gain access to this secret. Since multiple
containers share the secret, it may lead to further security concerns.
We can use the docker ps -a command to get the list of all the containers in Docker. This command
also returns the status of these containers.
Docker is a very powerful tool. Some of the main benefits of using Docker are as follows:
1. Utilize Developer Skills: With Docker we maximize the use of Developer skills. With Docker there
is less need of build or release engineers. Same Developer can create software and wrap it in
one single file.
2. Standard Application Image: Docker based system allows us to bundle the application software
and Operating system files in a single Application Image that can be deployed independently.
3. Uniform deployment: With Docker we can create one package of our software and deploy it on
different platforms seamlessly.
Prior to Docker, Developers would develop software and pass it to QA for testing and then it is
sent to Build & Release team for deployment.
In Docker workflow, Developer builds an Image after developing and testing the software. This
Image is shipped to Registry. From Registry it is available for deployment to any system. The
development process is simpler since steps for QA and Deployment etc take place before the
Image is built. So Developer gets the feedback early.
77. What is the basic architecture behind Docker?
Docker is built on client server model. Docker server is used to run the images. We use Docker
client to communicate with Docker server.
Additionally there is a Registry that stores Docker Images. Docker Server can directly contact
Registry to download images.
78. What are the popular tasks that you can do with Docker Command line tool?
Docker Command Line (DCL) tool is implemented in Go language. It can compile and run on
most of the common operating systems. Some of the tasks that we can do with Docker Command
Line tool are as follows:
79. What type of applications- Stateless or Stateful are more suitable for Docker
Container?
It is preferable to create Stateless application for Docker Container. We can create a container
out of our application and take out the configurable state parameters from application. Now we
can run same container in Production as well as QA environments with different parameters.
This helps in reusing the same Image in different scenarios. Also a stateless application is much
easier to scale with Docker Containers than a stateful application.
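One common way to take configurable parameters out of the application is to read them from environment variables at startup. The following is a minimal sketch of that idea; the variable names APP_DATABASE_URL and APP_LOG_LEVEL are hypothetical.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    log_level: str


def load_settings(environ=os.environ):
    """Read configuration from the environment instead of baking it into the image."""
    return Settings(
        # Required setting (hypothetical name): fail fast if it is missing.
        database_url=environ["APP_DATABASE_URL"],
        # Optional setting (hypothetical name) with a default.
        log_level=environ.get("APP_LOG_LEVEL", "INFO"),
    )
```

The same image can then be started in QA and in Production with different values, for example by passing them with the -e option of docker run.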
Docker works directly with Linux kernel features. Every Linux distribution runs on the Linux
kernel, and Docker containers share the kernel of the host.
Since the container brings its own user-space libraries and needs only the kernel from the host,
it can run on any of these distributions.
Generally we use Docker on top of a virtual machine to ensure isolation of the application. On a
virtual machine we can get the advantage of security provided by hypervisor. We can implement
different security levels on a virtual machine. And Docker can make use of this to run the
application at different security levels.
So in a way Docker container does not share resources within its own namespace. But the
resources that are not in isolated namespace are shared between containers. These are the Kernel
resources of host machine that have just one copy.
So in the back-end there is same set of resources that Docker Containers share.
83. What is the difference between the ADD and COPY instructions in a Dockerfile?
Both the ADD and COPY instructions of a Dockerfile copy new files from a source location to a
destination in the Container's file system.
The main difference between the two is that ADD can also read files from a URL.
As per Docker documentation, COPY is preferable. Since COPY only supports copying
local files to a Container, its behavior is more predictable than that of ADD.
We use ENTRYPOINT to set the starting point for a command in a Docker Image.
The entrypoint is the command that runs when a container is started from the Image.
E.g. we can define the following entrypoint in a Dockerfile:
ENTRYPOINT ["mycmd"]
We use the ONBUILD instruction to register instructions that execute later, when the
current image is used as the base for another build.
It is used to build a hierarchy of images that have to be built after the parent image is built.
A Docker build of a child image first executes the ONBUILD instructions of the parent and then
executes the other instructions in the child Dockerfile.
86. What is Build cache in Docker?
When we build an Image, Docker will process each line in Dockerfile. It will execute the
commands on each line in the order that is mentioned in the file.
But at each line, before running any command, Docker will check if there is already an existing
image in its cache that can be reused rather than creating a new image.
We can also specify the option --no-cache=true to let Docker know that we do not want to use
the cache. With this option, Docker rebuilds every layer of the Image.
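The cache lookup can be pictured with a small simulation. This is a simplified sketch and not Docker's actual implementation: it keys each step on the parent layer plus the instruction text only, whereas Docker also compares the checksums of files used by ADD and COPY.

```python
import hashlib


def build(instructions, cache, use_cache=True):
    """Simulate a build; return a list of (instruction, reused_from_cache) pairs."""
    parent = ""  # identifier of the layer produced by the previous step
    report = []
    for instruction in instructions:
        # A step can be reused only if the same instruction was already run
        # on top of the same parent layer.
        key = hashlib.sha256((parent + "\n" + instruction).encode()).hexdigest()
        hit = use_cache and key in cache
        if not hit:
            # Stand-in for executing the step and storing the resulting layer.
            cache[key] = instruction
        report.append((instruction, hit))
        parent = key
    return report
```

Because each key includes the parent layer, changing one instruction also invalidates every step after it, which is why frequently changing steps are usually placed near the end of a Dockerfile. Passing use_cache=False corresponds to the --no-cache option.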
1. FROM: We use FROM to set the base image for subsequent instructions. In every valid
Dockerfile, FROM is the first instruction; only ARG instructions, comments and parser directives
may precede it.
2. LABEL: We use LABEL to organize our images as per project, module, licensing etc. We can also
use LABEL to help in automation. In LABEL we specify a key value pair that can be later used for
programmatically handling the Dockerfile.
3. RUN: We use RUN command to execute any instructions in a new layer on top of the current
image. With each RUN command we add something on top of the image and use it in
subsequent steps in Dockerfile.
4. CMD: We use CMD command to provide default values of an executing container. In a
Dockerfile, if we include multiple CMD commands, then only the last instruction is used.
We use EXPOSE command to inform Docker that Container will listen on a specific network
port during runtime.
But these ports on the Container are not accessible from the host by default. We can use -p to
publish a port or a range of ports from the Container.
In a Container we have an isolated environment with namespace for each resource that a kernel
provides. There are mainly six types of namespaces in a Container.
1. UTS Namespace: UTS stands for Unix Timesharing System. In UTS namespace every container
gets its own hostname and domain name.
2. Mount Namespace: This namespace provides its own file system within a container. With this
namespace we get root like / in the file system on which rest of the file structure is based.
3. PID Namespace: This namespace contains all the processes that run within a Container. We can
run the ps command to see the processes that are running within a Docker container.
4. IPC Namespace: IPC stands for Inter Process Communication. This namespace covers shared memory,
semaphores, message queues and similar resources that are shared by processes. The items in this
namespace do not cross the container boundary.
5. User Namespace: This namespace contains the users and groups that are defined within a
container.
6. Network Namespace: With this namespace, container provides its own network resources like-
ports, devices etc. With this namespace, Docker creates an independent network stack within
each container.
Docker provides tools like docker stats and docker events to monitor Docker in production.
Docker stats: When we call docker stats with a container id, we get the CPU, memory usage etc
of a container. It is similar to top command in Linux.
Docker events: docker events is a command that shows the stream of activities going on in
the Docker daemon.
Some of the common Docker events are: attach, commit, die, detach, rename, destroy etc.
We can also use various options to limit or filter the events that we are interested in.
1. Amazon AWS
2. Google Cloud Platform
3. Microsoft Azure
4. IBM Bluemix
92. How can we control the startup order of services in Docker compose?
In Docker compose we can use the depends_on option to control the startup order of services.
With compose, the services will start in the dependency order. Dependencies can be defined in
the options like- depends_on, links, volumes_from, network_mode etc.
The problem with waiting for a container to be ready is that in a Distributed system, some
services or hosts may become unavailable sometimes. Similarly during startup also some
services may also be down.
Therefore, we have to build resiliency in our application. So that even if some services are down
we can continue our work or wait for the service to become available again.
We can use wait-for-it or dockerize tools for building this kind of resiliency.
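The same idea can also be built into the application itself. The sketch below polls a check function until it succeeds or a timeout expires; it illustrates the retry pattern and is not how wait-for-it or dockerize are implemented. The host name db and port 3306 in the usage note are hypothetical.

```python
import socket
import time


def port_is_open(host, port, connect_timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=connect_timeout):
            return True
    except OSError:
        return False


def wait_until(check, timeout=30.0, interval=0.5):
    """Call check() repeatedly; return True on success, False once the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

An application could call wait_until(lambda: port_is_open("db", 3306)) before opening its database connection, and repeat the same call whenever the connection is lost later on.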
94. How will you customize Docker compose file for different environments?
We can define the same service in more than one Compose file, for example a base file and an
override file for each environment. Docker Compose merges these files based on the
following rules:
For single-value options, the new value replaces the old value.
For multi-value options, Compose concatenates both sets of values.
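The two rules can be sketched in a few lines of Python. This is a simplified model with an assumed set of multi-value options; real Compose applies further per-option rules, for example for environment variables and volumes.

```python
# Assumption for this sketch: these options are treated as multi-value lists.
MULTI_VALUE_OPTIONS = {"ports", "expose", "dns"}


def merge_service(base, override):
    """Merge one service definition from a base file with its override."""
    merged = dict(base)
    for key, value in override.items():
        if key in MULTI_VALUE_OPTIONS and key in merged:
            # Multi-value option: keep the base entries and append the new ones.
            merged[key] = list(merged[key]) + list(value)
        else:
            # Single-value option: the override wins.
            merged[key] = value
    return merged


if __name__ == "__main__":
    base = {"image": "web:1.0", "command": "python app.py", "ports": ["80:80"]}
    override = {"command": "python app.py --debug", "ports": ["5678:5678"]}
    print(merge_service(base, override))
```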
We can also use the extends field to extend a service configuration to multiple environments. With
extends, child services can use the common configuration defined by a parent service.
Cloud Computing Questions
1. Flexibility: The businesses that have fluctuating bandwidth demands need the flexibility of Cloud
Computing. If you need high bandwidth, you can scale up your cloud capacity. When you do not
need high bandwidth, you can just scale down. There is no need to be tied into an inflexible
fixed capacity infrastructure.
2. Disaster Recovery: Cloud Computing provides robust backup and recovery solutions that are
hosted in cloud. Due to this there is no need to spend extra resources on homegrown disaster
recovery. It also saves time in setting up disaster recovery.
3. Automatic Software Updates: Most of the Cloud providers give automatic software updates. This
reduces the extra task of installing new software version and always catching up with the latest
software installs.
4. Low Capital Expenditure: In Cloud computing the model is Pay as you Go. This means there is
very little upfront capital expenditure. Payment is variable and based on usage.
5. Collaboration: In a cloud environment, applications can be shared between teams. This
increases collaboration and communication among team members.
6. Remote Work: Cloud solutions provide the flexibility of working remotely. There is no need to be
on site. One can just connect from anywhere and start working.
7. Security: Cloud computing solutions can be more secure than self-managed onsite systems. Data
stored on local servers and computers is prone to security attacks. In Cloud Computing, there are
fewer loose ends. Cloud providers give a secure working environment to their users.
8. Document Control: Once the documents are stored in a common repository, it increases the
visibility and transparency among companies and their clients. Since there is one shared copy,
there are fewer chances of discrepancies.
9. Competitive Pricing: In Cloud computing there are multiple players, so they keep competing
among themselves and provide very good pricing. This comes out much cheaper compared to
other options.
10. Environment Friendly: Cloud computing also saves precious environmental resources by not
blocking resources and bandwidth that are not in use.
In an enterprise system demand for computing resources varies from time to time. In such a
scenario, On-demand computing makes sure that servers and IT resources are provisioned to
handle the increase/decrease in demand.
A cloud provider maintains a pool of resources. The pool of resources contains networks, servers,
storage, applications and services. This pool can serve the varying demand for resources and
computing from various enterprise clients.
There are many concepts, like grid computing, utility computing and autonomic computing, that
are similar to on-demand computing.
1. Infrastructure as a Service (IAAS): IAAS providers give low-level abstractions of physical devices.
Amazon Web Services (AWS) is an example of IAAS. AWS provides EC2 for computing, S3
buckets for storage etc. Mainly the resources in this layer are hardware like memory, processor
speed, network bandwidth etc.
2. Platform as a Service (PAAS): PAAS providers offer managed environments for frameworks like
Rails, Django etc. One good example of PAAS is Google App Engine. These are the environments in
which developers can develop sophisticated software with ease. Developers just focus on developing
software, whereas scaling and performance are handled by the PAAS provider.
3. Software as a Service (SAAS): SAAS providers offer an actual working software application to
clients. Salesforce and GitHub are two good examples of SAAS. They hide the underlying details
of the software and just provide an interface to work on the system. Behind the scenes the
version of the software can be easily changed.
An IAAS provider can give physical, virtual or both kinds of resources. These resources are used
to build cloud.
IAAS provider handles the complexity of maintaining and deploying these services.
IAAS provider also handles security and backup recovery for these services. The main resources
in IAAS are servers, storage, routers, switches and other related hardware etc.
Platform as a service (PaaS) is a kind of cloud computing service. A PaaS provider offers a
platform on which clients can develop, run and manage applications without the need of building
the infrastructure.
In PAAS clients save time by not creating and managing infrastructure environment associated
with the app that they want to develop.
1. It allows development work in higher-level programming with very little complexity.
2. Teams can focus on just the development of the application, which makes the application more
effective.
3. Maintenance and enhancement of the application is much easier.
4. It is suitable for situations in which multiple developers work on a single project but are not
co-located.
Biggest disadvantage of PaaS is that a developer can only use the tools that PaaS provider makes
available. A developer cannot use the full range of conventional tools.
Some PaaS providers lock in the clients in their platform. This also decreases the flexibility of
clients using PaaS.
102. What are the different deployment models in Cloud computing?
Private Cloud: Some companies build their private cloud. A private cloud is a fully functional
platform that is owned, operated and used by only one organization.
Primary reason for private cloud is security. Many companies feel secure in private cloud. The
other reasons for building private cloud are strategic decisions or control of operations.
There is also a concept of Virtual Private Cloud (VPC). In VPC, private cloud is built and
operated by a hosting company. But it is exclusively used by one organization.
Public Cloud: There are cloud platforms by some companies that are open for general public as
well as big companies for use and deployment. E.g. Google Apps, Amazon Web Services etc. The
public cloud providers focus on layers and application like- cloud application, infrastructure
management etc. In this model resources are shared among different organizations.
Hybrid Cloud: The combination of public and private cloud is known as Hybrid cloud. This
approach provides benefits of both the approaches- private and public cloud. So it is very robust
platform. A client gets functionalities and features of both the cloud platforms. By using Hybrid
cloud an organization can create its own cloud as well as they can pass the control of their cloud
to another third party.
Scalability is the ability of a system to handle the increased load on its current hardware and
software resources. In a highly scalable system it is possible to increase the workload without
increasing the resource capacity. Scalability supports any sudden surge in the demand/traffic
with current set of resources.
Elasticity is the ability of a system to handle an increased workload by increasing the
hardware/software resources dynamically. Highly elastic systems can handle increased demand and
traffic by dynamically commissioning and decommissioning resources. Elasticity is an important
characteristic of Cloud Computing applications. Elasticity describes how well your architecture
adapts to the workload in real time.
E.g. if in a system one server can handle 100 users, 2 servers can handle 200 users and 10
servers can handle 1000 users, then the design is scalable. But if for every X additional users you
need 2X the amount of servers, then it is not a scalable design.
Let us say you have just one user logging in every hour on your site. Your one server can handle
this load. But if 1000 users suddenly log in at once, can your system quickly start new web servers
on the fly to handle this load? Your design is elastic if it can handle such a sudden increase in
traffic that quickly.
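The arithmetic of the example can be written down as a short Python sketch. It assumes the figure from the example of 100 users per server; the function names are illustrative only.

```python
import math


def servers_needed(users, users_per_server=100):
    """Servers required under linear scaling; always keep at least one running."""
    return max(1, math.ceil(users / users_per_server))


def scaling_delta(running_servers, users, users_per_server=100):
    """Servers an elastic system would add (positive) or remove (negative)."""
    return servers_needed(users, users_per_server) - running_servers


if __name__ == "__main__":
    # One server is running and 1000 users log in at once.
    print(scaling_delta(running_servers=1, users=1000))
```

A scalable design keeps servers_needed growing in proportion to the number of users, while an elastic design is one that can act on scaling_delta quickly, in both directions.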
104. What is Software as a Service?
Software as a Service is a category of cloud computing in which software is centrally hosted and
licensed on a subscription basis. It is also known as on-demand software. Generally, clients
access the software by using a thin client like a web browser.
Many applications like Google Docs, Microsoft Office etc. provide a SaaS model for their software.
The benefit of SaaS is that a client can add more users on the fly based on its current needs. And
client does not need to install or maintain any software on its premises to use this software.
Cloud computing consists of different types of Datacenters linked in a grid structure. The main
types of Datacenters in Cloud computing are:
1. Containerized Datacenter : As the name suggests, containerized datacenter provides high level
of customization for an organization. These are traditional kind of datacenters. We can choose
the different types of servers, memory, network and other infrastructure resources in this
datacenter. Also we have to plan temperature control, network management and power
management in this kind of datacenter.
2. Low-Density Datacenters : In a Low-density datacenter, we get high level of performance. In
such a datacenter if we increase the density of servers, the issue with power comes. With high
density of servers, the area gets heated. In such a scenario, effective heat and power
management is done. To reach a high level of performance, we have to optimize the number of
servers in the datacenter.
106. Explain the various modes of Software as a Service (SaaS) cloud environment?
Software as a Service (SaaS) is used to offer different kinds of software applications in a Cloud
environment. Generally these are offered on subscription basis. Different modes of SaaS are:
1. Simple multi-tenancy: In this setup, each client gets its own resources, which are not
shared with other clients. It is the more secure option, since no resources are shared. But it
is an inefficient option, since each client needs more money to scale with rising
demand. It also takes time to scale up the application in this mode.
2. Fine-grained multi-tenancy: In this mode, the features provided to each client are the same
and the resources are shared among multiple clients. It is an efficient mode of cloud service, in
which data is kept private to each client but computing resources are shared. It is also easier
and quicker to scale up the SaaS implementation for different clients.
107. What are the important things to care about in Security in a cloud environment?
With growing concern about hacking, every organization wants to make its software systems and
data secure. Since in a cloud computing environment the software and hardware are not on the
premises of the organization, it becomes even more important to implement the best security
practices.
Organizations have to keep their data secure while it is transferred between two locations,
and also while it is stored at a location. Attackers can break into an
application or obtain an unauthorized copy of the data. So it is important to encrypt
the data both in transit and at rest.
There are different types of clients for cloud computing APIs. It is easier to serve different needs
of multiple clients with APIs in cloud computing environment.
1. Identity Management: This aspect defines the different levels of users, roles and their
credentials for accessing services in the cloud.
2. Access Control: In this area, we create multiple levels of permissions and access areas that can
be given to a user or role for accessing a service in the cloud environment.
3. Authentication: In this area, we check the credentials of a user and confirm that they are who
they claim to be. Generally this is done with a password plus multi-factor authentication, such as
verification by a one-time code sent to a cell phone.
4. Authorization: In this aspect, we check the permissions that are given to a user or role. If a
user is authorized to access a service, they are allowed to use it in the cloud environment.
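The difference between authentication and authorization in the list above can be sketched in a few lines of Python. This is a minimal illustration only: the user, role and permission names are hypothetical, the stores are in-memory dictionaries, and a real cloud deployment would rely on the provider's identity service instead.

```python
import hashlib
import hmac

# Fixed salt only to keep the sketch short; use a random per-user salt in practice.
SALT = b"demo-salt"


def hash_password(password: str) -> bytes:
    """Derive a password hash with PBKDF2 (standard library)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), SALT, 100_000)


# Identity management: users, their credentials and their roles (all hypothetical).
USERS = {"alice": hash_password("s3cret")}
USER_ROLES = {"alice": "developer"}

# Access control: permissions granted to each role (hypothetical names).
ROLE_PERMISSIONS = {
    "developer": {"storage:read"},
    "admin": {"storage:read", "storage:write"},
}


def authenticate(user: str, password: str) -> bool:
    """Authentication: is this really the user they claim to be?"""
    stored = USERS.get(user)
    if stored is None:
        return False
    return hmac.compare_digest(stored, hash_password(password))


def authorize(user: str, permission: str) -> bool:
    """Authorization: does the user's role carry this permission?"""
    role = USER_ROLES.get(user)
    return permission in ROLE_PERMISSIONS.get(role, set())
```

A request would normally pass authenticate() first and only then be checked with authorize() for the specific service it wants to use.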
110. What are the main cost factors of cloud based data center?
Costs in a Cloud based data center are different from a traditional data center. Main cost factors
of cloud based data center are as follows:
1. Labor cost: We need skilled staff who can work with the cloud-based datacenter that we have
selected for our operation. Since cloud is a relatively new technology, it may be difficult to find
people with the right skills to run a cloud-based datacenter.
2. Power cost: In some cloud operations, power costs are borne by the client. Since it is a variable
cost, it can increase with the increase in scale and usage.
3. Computing cost: The biggest cost in Cloud environment is the cost that we pay to Cloud provider
for giving us computing resources. This cost is much higher compared to the labor or power
costs.
In a cloud computing environment we pay for the services that we use. So the main criterion for
measuring a cloud-based service is its usage.
For a computing resource, we measure usage in terms of time and the power of the computing
resource.
For a storage resource, we measure usage in terms of bytes stored (gigabytes) and the bandwidth
used in data transfer.
Another important aspect of measuring a cloud service is its availability. A cloud provider has to
specify, in the service level agreement (SLA), the proportion of time for which the service will be
available in the cloud.
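To make an SLA availability figure concrete, the following minimal sketch converts an availability percentage into the downtime it permits. The function name and the 30-day period are assumptions chosen for illustration; actual SLAs define their own measurement periods and exclusions.

```python
def allowed_downtime_minutes(availability_percent: float, days: int = 30) -> float:
    """Return the downtime (in minutes) that an SLA permits over `days` days."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Illustrative availability levels, not taken from any specific provider.
    for sla in (99.0, 99.9, 99.99):
        minutes = allowed_downtime_minutes(sla)
        print(f"{sla}% availability allows about {minutes:.1f} minutes of downtime in 30 days")
```

Each additional "nine" of availability cuts the permitted downtime by a factor of ten, which is why small differences in the SLA percentage matter.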
In a traditional datacenter the cost of increasing the scale of the computing environment is much
higher than in a cloud computing environment. Also, in a traditional datacenter there is not much
benefit in scaling down the operation when demand decreases. Since most of the expenditure is
capital spent on buying servers etc., scaling down only saves power cost, which is small
compared to the other fixed costs.
Also, in a cloud environment there is no need to hire a large number of operations staff to
maintain the datacenter. The cloud provider takes care of maintaining and upgrading the resources
in the cloud environment.
With a traditional datacenter, staffing cost is very high since we have to hire a large number of
technical operations people for the in-house datacenter.
113. How will you optimize availability of your application in a Cloud environment?
Another aspect of a cloud environment is that servers often fail or go down. It is therefore
important to design the application so that we can simply kill a slow or failed server and start
another server to handle the traffic seamlessly.
114. What are the requirements for implementing IaaS strategy in Cloud?
1. Operating System (OS): We need an OS to support hypervisor in IaaS. We can use open source
OS like Linux for this purpose.
2. Networking: We have to define and implement networking topology for IaaS implementation.
We can use public or private network for this.
3. Cloud Model: We have to select the right cloud model for implementing IaaS strategy. It can be
SaaS, PaaS or CaaS.
115. What is the scenario in which public cloud is preferred over private cloud?
In a startup mode often we want to test our idea. In such a scenario it makes sense to setup
application in public cloud.
It is much faster and cheaper to use public cloud over private cloud. Remember security is a
major concern in public cloud.
But with time and changes in technology, even public cloud is very secure.
Cloud Computing is neither a software application nor a hardware service. Cloud computing is a
system architecture that can be used to implement software as well as hardware strategy of an
organization.
Cloud Computing is a highly scalable, highly available and cost effective solution for software
and hardware needs of an application.
Cloud Computing provides great ease of use in running the software in cloud environment. It is
also very fast to implement compared with any other traditional strategy.
117. Why companies now prefer Cloud Computing architecture over Client Server
Architecture?
In client-server architecture there is one-to-one communication between client and server. The
server is often in an in-house datacenter, and clients access that same server from anywhere. If a
client is at a remote location, the communication can have high latency.
In cloud computing there can be multiple servers in the cloud. A cloud controller
directs each request to the right server node. Clients can therefore access a cloud-based
service from any location and be directed to the server nearest to them.
Another reason for choosing cloud computing architecture is high availability. Since there are
multiple servers behind the cloud, even if one server is down, another server can serve the clients
seamlessly.
1. Elasticity: In Cloud Computing system is highly elastic in the sense that it can easily adapt itself
to increase or decrease in load. There is no need to take urgent actions when there is surge in
traffic requests.
2. Self-service provisioning: In Cloud environment users can provision new resources on their own
by just calling some APIs. There is no need to fill forms and order actual hardware from vendors.
3. Automated de-provisioning: In case demand/load decreases, extra resources can be
automatically shut down in Cloud computing environment.
4. Standard Interface: There are standard interfaces to start, stop, suspend or remove an instance
in Cloud environment. Most of the services are accessible via public and standard APIs in Cloud
computing.
5. Usage based Billing: In a Cloud environment, users are charged for their usage of resources.
They can forecast their bill and costs based on the growth they are expecting in their load.
119. How databases in Cloud computing are different from traditional databases?
In a cloud environment, companies often store many different kinds of data, such as
email, images, video, PDFs and graphs. NoSQL databases are often used to store this
data.
A NoSQL database like MongoDB provides storage and retrieval of data that cannot be stored
efficiently in a traditional RDBMS.
A database like Neo4j provides features for storing graph data, such as the social graphs of
Facebook or LinkedIn, in a cloud environment.
Hadoop-based data stores help in storing Big Data. They can handle the very large volumes of
information generated in a large-scale environment.
In a cloud environment, we can create a virtual private network (VPN) that is used solely
by one client. This is a secure network in which data transfer between servers of the same VPN
is very secure.
By using a VPN, an organization uses the public network in a private manner. It increases the
privacy of an organization’s data transfer in a cloud environment.
121. What are the main components of a VPN?
1. Network Access Server (NAS): A NAS is responsible for setting up tunnels in a VPN that is
accessed remotely. It maintains the tunnels that connect clients to the VPN.
2. Firewall: It is the software that creates barrier between VPN and public network. It protects the
VPN from malicious activity that can be done from the outside network.
3. AAA Server: This is an authentication and authorization server that controls the access and
usage of VPN. For each request to use VPN, AAA server checks the user for correct permissions.
4. Encryption: In a VPN, encryption algorithms protect the important private data from malicious
users.
122. How will you secure the application data for transport in a cloud environment?
With the ease of use of a cloud environment comes the important task of keeping data secure. Many
organizations have data that is transferred from their traditional datacenter to a cloud datacenter.
During transit it is important to keep this data secure. One of the best ways to secure data in
transit is to use the HTTPS protocol, which runs HTTP over TLS (the successor to Secure Sockets
Layer, SSL).
Another important point is to keep the data encrypted at all times. This protects data from being
accessed by any unauthorized user during transit.
In Cloud computing scale is not a limit. So there are very large-scale databases available from
cloud providers. Some of these are:
1. Amazon DynamoDB: Amazon Web Services (AWS) provides a NoSQL web service called
DynamoDB that provides highly available and partition tolerant database system. It has a multi-
master design. It uses synchronous replication across multiple datacenters. We can easily
integrate it with MapReduce and Elastic MapReduce of AWS.
2. Google Bigtable: This is a very large-scale, high-performance cloud-based database option from
Google. It is available on Google Cloud. It can be scaled to petabytes. It is a Google proprietary
implementation. In Bigtable, two arbitrary string values (a row key and a column key) and a
timestamp are mapped to an arbitrary byte array. MapReduce jobs are commonly used for
generating and modifying the data stored in Bigtable.
3. Microsoft Azure SQL Database: Microsoft Azure provides cloud based SQL database that can be
scaled very easily for increased demand. It has very good security features and it can be even
used to build multi-tenant apps to service multiple customers in cloud.
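The Bigtable data model mentioned in the list above (row key, column key and timestamp mapped to a byte array) can be pictured with an in-memory dictionary. This is only a sketch of the mapping, not the Bigtable client API; the class name, method names and sample keys are hypothetical.

```python
from typing import Dict, Optional, Tuple

# (row key, column key, timestamp) -> uninterpreted bytes
Cell = Tuple[str, str, int]


class TinyTable:
    """Toy model of a sparse, versioned map; not the real Bigtable API."""

    def __init__(self) -> None:
        self._cells: Dict[Cell, bytes] = {}

    def put(self, row: str, column: str, timestamp: int, value: bytes) -> None:
        """Store one version of a cell."""
        self._cells[(row, column, timestamp)] = value

    def get_latest(self, row: str, column: str) -> Optional[bytes]:
        """Return the value with the highest timestamp for (row, column)."""
        versions = [
            (ts, value)
            for (r, c, ts), value in self._cells.items()
            if r == row and c == column
        ]
        if not versions:
            return None
        return max(versions, key=lambda pair: pair[0])[1]


table = TinyTable()
table.put("com.example/index", "contents:html", 1, b"<html>v1</html>")
table.put("com.example/index", "contents:html", 2, b"<html>v2</html>")
latest = table.get_latest("com.example/index", "contents:html")
```

Keeping the timestamp in the key is what lets several versions of the same cell coexist; a reader picks the newest one unless it asks for an older version.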
124. What are the options for open source NoSQL database in a Cloud environment?
Most of the cloud-computing providers support Open Source NoSQL databases. Some of these
databases are:
1. Apache CouchDB: It is a document based NoSQL database from Apache Open Source. It is
compatible with Couch Replication Protocol. It can communicate in native JSON and can store
binary data very well.
2. HBase: It is a NoSQL database for use with Hadoop based software. It is also available as Open
Source from Apache. It is a scalable and distributed Big Data database.
3. MongoDB: It is an open source database system that offers a flexible data model that can be
used to store various kinds of data. It provides high performance and always-on user
experience.
125. What are the important points to consider before selecting cloud computing?
Cloud computing is a very good option for an organization to scale and outsource its
software/hardware needs. But before selecting a cloud provider it is important to consider
following points:
1. Security: One of the most important points is security of the data. We should ask the cloud
provider about the options to keep data secure in cloud during transit and at rest.
2. Data Integrity: Another important point is to maintain the integrity of data in cloud. It is
essential to keep data accurate and complete in cloud environment.
3. Data Loss: In a cloud environment, there are chances of data loss. So we should know the
provisions to minimize the data loss. It can be done by keeping backup of data in cloud. Also
there should be reliable data recovery options in case of data loss.
4. Compliance: While using a cloud environment one must be aware of the rules and regulations
that have to be followed. There are compliance issues with storing a user’s data at
an external provider’s location/servers.
5. Business Continuity: In case of any disaster, it is important to create business continuity plans so
that we can provide uninterrupted service to our end users.
6. Availability: Another important point is the availability of data and services in a cloud-computing
environment. It is very important to provide high availability for a good customer experience.
7. Storage Cost: Since data is stored in cloud, it may be very cheap to store the data. But the real
cost can come in transfer of data when we have to pay by bandwidth usage. So storage cost of
data in cloud should also include the access cost of data transfer.
8. Computing Cost: One of the highest costs of cloud is computing cost. It can be very high cost
with the increase of scale. So cloud computing options should be wisely considered in
conjunction with computing cost charged for them.
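The storage-cost point in the list above (cheap storage, but transfer billed by bandwidth) comes down to simple arithmetic. The sketch below uses made-up prices purely for illustration; they are not any provider's actual rates.

```python
def monthly_storage_cost(
    stored_gb: float,
    transferred_gb: float,
    price_per_gb_stored: float,
    price_per_gb_transferred: float,
) -> float:
    """Total monthly cost = storage charge + data-transfer charge."""
    return stored_gb * price_per_gb_stored + transferred_gb * price_per_gb_transferred


# Hypothetical prices and volumes, chosen only for illustration.
cost = monthly_storage_cost(
    stored_gb=1000,
    transferred_gb=5000,
    price_per_gb_stored=0.02,
    price_per_gb_transferred=0.09,
)
```

With these assumed numbers the transfer charge is far larger than the storage charge, which is the situation the list item warns about.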
126. What is a System integrator in Cloud computing?
Often an organization does not know all the options available in a Cloud computing
environment. Here comes the role of a System Integrator (SI) who specializes in implementing
Cloud computing environment.
SI creates the strategy of cloud setup. It designs the cloud platform for the use of its client. It
creates the cloud architecture for the business need of client.
SI oversees the overall implementation of cloud strategy and plan. It also guides the client while
choosing the right options in cloud computing platform.
Virtualization is the core of a cloud computing platform. In the cloud we can create virtual
versions of hardware, storage and operating systems that can be used to deploy an application.
A cloud provider gives its clients options to create virtual machines in the cloud.
These virtual machines are much cheaper than buying a few high-end physical machines.
In the cloud we can use multiple cheap virtual machines to implement a resilient software system
that can be scaled easily and quickly, whereas buying an actual high-end machine to
scale the system is costly and time-consuming.
Eucalyptus is open source software for building private and hybrid clouds that are compatible
with Amazon Web Services (AWS).
It stands for Elastic Utility Computing Architecture for Linking Your Programs To Useful
Systems.
We can create our own datacenter in a private cloud by using Eucalyptus. It makes use of
pooling the computing and storage resources to scale up the operations.
In Eucalyptus, we create images of software applications. These images are deployed to create
instances. These instances are used for computing needs.
1. Cloud Controller (CLC): This is the controller that manages virtual resources like servers, network
and storage. It is at the highest level in hierarchy. It is a Java program with web interface for
outside world. It can do resource scheduling as well as system accounting. There is only one CLC
per cloud. It can handle authentication, accounting, reporting and quota management in cloud.
2. Walrus: This is another Java program in Eucalyptus that is equivalent to AWS S3 storage. It
provides persistent storage. It also contains images, volumes and snapshots similar to AWS.
There is only one Walrus in a cloud.
3. Cluster Controller (CC): It is a C program that is the front end for a Eucalyptus cloud cluster. It
can communicate with Storage controller and Node controller. It manages the instance
execution in cloud.
4. Storage Controller (SC): It is a Java program equivalent to EBS in AWS. It can interface with
Cluster Controller and Node Controller to manage persistent data via Walrus.
5. Node Controller (NC): It is a C program that can host a virtual machine instance. It is at the
lowest level in Eucalyptus cloud. It downloads images from Walrus and creates an instance for
computing requirements in cloud.
6. VMWare Broker: It is an optional component in Eucalyptus. It provides AWS compatible
interface to VMWare environment.
Amazon Web Services (AWS) provides an important feature called Auto Scaling.
With Auto Scaling set up, we can automatically provision and start new instances in the AWS cloud
without any human intervention.
Auto Scaling is triggered by load and other metrics. For example, if the load reaches a threshold,
we can configure Auto Scaling to kick in and start a new server to handle the additional load.
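The threshold idea described above can be sketched as a small decision function. This is not the AWS Auto Scaling API; the function name, parameters and limits are assumptions used only to show the logic of sizing a fleet to the current load.

```python
import math


def desired_instances(
    total_load: float,
    capacity_per_instance: float,
    min_instances: int = 1,
    max_instances: int = 10,
) -> int:
    """Return how many instances keep each one at or below its capacity.

    The result is clamped between min_instances and max_instances.
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(total_load / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))


# Hypothetical numbers: each server handles 100 concurrent users.
quiet_hour = desired_instances(total_load=1, capacity_per_instance=100)
sudden_spike = desired_instances(total_load=1000, capacity_per_instance=100)
```

The same function also scales the fleet back down when the load drops, and the upper limit keeps a runaway spike from starting an unbounded number of servers.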
Utility computing is a cloud service model in which provider gives computing resources to users
for using on need basis.
1. Pay per use: The user pays only for what is used. We
pay for the number of servers or instances that we use in the cloud.
2. Easy to Scale: It is easier to scale up the operations in Utility computing. There is no need to plan
for time consuming and costly hardware purchase.
3. Maintenance: In Utility computing maintenance of servers is done by cloud provider. So a user
can focus on its core business. It need not spend time and resources on maintenance of servers
in cloud.
Hypervisor runs on a host machine. Each virtual machine is called Guest machine.
Hypervisor derives its name from term supervisor, which is a traditional name for the kernel of
an operating system.
Hypervisor provides a virtual operating platform to the guest operating system. It manages the
execution of guest OS.
1. Type-1, native or bare-metal hypervisors: A Type-1 hypervisor runs directly on the hardware of
the host machine and controls the guest operating systems from there. Examples of Type-1
hypervisors are: Xen, Oracle VM Server for SPARC, Oracle VM Server for x86, Citrix XenServer,
Microsoft Hyper-V and VMware ESX/ESXi.
2. Type-2, hosted hypervisors: A Type-2 hypervisor runs like a regular computer program on an
operating system, and the guest operating system runs as a process on the host machine. It
abstracts the guest operating system from the host operating system. Examples of Type-2
hypervisors are: VMware Workstation, VMware Player, VirtualBox, Parallels Desktop for Mac and
QEMU.
134. Why Type-1 Hypervisor has better performance than Type-2 Hypervisor?
Type-1 Hypervisor has better performance than Type-2 hypervisor because Type-1 hypervisor
skips the host operating system and it runs directly on host hardware. So it can utilize all the
resources of host machine.
In cloud computing Type-1 hypervisors are more popular since Cloud servers may need to run
multiple operating system images.
CaaS offers business features like desktop call control, unified messaging, and fax via desktop.
CaaS also provides call center automation services such as IVR, ACD, call recording,
multimedia routing and screen sharing.
136. How is Cloud computing different from computing for mobile devices?
Since Mobile devices are getting connected to the Internet in large numbers, we often use Cloud
computing for Mobile devices.
In mobile applications, there can be sudden increase in traffic as well as usage. Even some
applications become viral very soon. This leads to very high load on application.
In such a scenario, it makes sense to use Cloud Computing for mobile devices.
Also mobile devices keep changing over time, it requires standard interfaces of cloud computing
for handling multiple mobile devices.
One of the main reasons for selecting Cloud architecture is scalability of the system. In case of
heavy load, we have to scale up the system so that there is no performance degradation.
While scaling up the system we have to start new instances. To provision new instances we have
to deploy our application on them.
In such a scenario, if we want to save time, it makes sense to automate the deployment process.
Another term for this is Auto-scaling.
With a fully automated deployment process we can start new instances based on automated
triggers that are raised by load reaching a threshold.
Amazon provides a wide range of products in Amazon Web Services for implementing Cloud
computing architecture. In AWS some of the main components are as follows:
1. Amazon EC2: This is used for creating instances and getting computing power to run applications
in AWS.
2. Amazon S3: This is a Simple Storage Service from AWS to store files and media in cloud.
3. Amazon DynamoDB: It is the database solution by AWS in cloud. It can store very large-scale
data to meet needs of even BigData computing.
4. Amazon Route53: This is a cloud based Domain Name System (DNS) service from AWS.
5. Amazon Elastic Load Balancing (ELB): This component can be used to load balance the various
nodes in AWS cloud.
6. Amazon CodeDeploy: This service provides feature to automate the code deployment to any
instance in AWS.
139. What are main components in Google Cloud?
Google Cloud is a newer alternative to AWS, and it provides a number of features of its own.
Some of the main components of Google Cloud are as follows:
1. Compute Engine: This component provides computing power to Google Cloud users.
2. Cloud Storage: As the name suggests this is a cloud storage solution from Google for storing
large files for application use or just serving over the Internet.
3. Cloud Bigtable: It is a Google proprietary database from Google in Cloud. Now users can use this
unique database for creating their applications.
4. Cloud Load Balancing: This is a cloud-based load balancing service from Google.
5. BigQuery: It is a data-warehouse solution from Google in Cloud to perform data analytics of
large scale.
6. Cloud Machine Learning Platform: It is a powerful cloud based machine learning product from
Google to perform machine learning with APIs like- Job Search, Text Analysis, Speech
Recognition, Dynamic translation etc.
7. Cloud IAM: This is an Identity and Access management tool from Google to help administrators
run the security and authorization/authentication policies of an organization.
Microsoft is a relatively new entrant to Cloud computing with Azure cloud offering. Some of the
main products of Microsoft cloud are as follows:
1. Azure Container Service: This is a cloud computing service from Microsoft to run and manage
Docker based containers.
2. StorSimple: It is a Storage solution from Microsoft for Azure cloud.
3. App Service: By using App Services, users can create Apps for mobile devices as well as websites.
4. SQL Database: It is a Cloud based SQL database from Microsoft.
5. DocumentDB: This is a NoSQL database in cloud by Microsoft.
6. Azure Bot Service: We can use Azure Bot Service to create serverless bots that can be scaled up
on demand.
7. Azure IoT Hub: It is a solution for Internet of Things services in cloud by Microsoft.
These days cloud computing is one of the most popular architectures among organizations for
their systems. Following are some of the reasons for the popularity of cloud computing
architecture:
1. IoT: With the Internet of Things, there are many types of machines joining the Internet and
creating various types of interactions. In such a scenario, Cloud Computing serves well to
provide scalable interfaces to communicate between the machines in IoT.
2. Big Data: Another major trend in today’s computing is Big Data. With Big Data there is very large
amount of user / machine data that is generated. Using in-house solution to handle Big Data is
very costly and capital intensive. In Cloud Computing we can handle Big Data very easily since
we do not have to worry about capital costs.
3. Mobile Devices: A large number of users are going to Mobile computing. With a mobile device
users can access a service from any location. To handle wide-variety of mobile devices, standard
interfaces of Cloud Computing are very useful.
4. Viral Content: With the growth of social media, content can go viral, i.e. traffic to a server
can increase exponentially in a very short time. In such a scenario the auto-scaling of
cloud computing architecture can handle the spikes easily.
142. What are the Machine Learning options from Google Cloud?
Google provides a very rich library of Machine Learning options in Google Cloud. Some of
these API are:
1. Google Cloud ML: This is a general purpose Machine Learning API in cloud. We can use pre-
trained models or generate new models for machine learning with this option.
2. Google Cloud Jobs API: It is an API to link Job Seekers with Opportunities. It is mainly for job
search based on skills, demand and location.
3. Google Natural Language API: This API can do text analysis of natural language content. We can
use it for analyzing the content of blogs, websites, books etc.
4. Google Cloud Speech API: It is a Speech Recognition API from Google to handle spoken text. It
can recognize more than 80 languages and their related variants. It can even transcribe the user
speech into written text.
5. Google Cloud Translate API: This API can translate content from one language to another
language in cloud.
6. Google Cloud Vision API: It is a powerful API for Image analysis. It can recognize faces and
objects in an image. It can even categorize images in multiple relevant categories with a simple
REST API call.
In a cloud computing environment we pay by usage, so heavier usage means
higher costs. To optimize the environment we have to keep a balance between
our usage costs and our usage.
For compute, we can consider serverless options like Lambda in AWS, which can be
cheaper than keeping instances running for workloads that run only intermittently.
For storage, if the data is not going to be accessed frequently, we can use the
Glacier option in AWS.
Similarly, when we pay for bandwidth usage, it makes sense to implement a caching strategy so
that we use less bandwidth for the content that is accessed very frequently.
It is a challenging task for an architect in cloud to match the options available in cloud with the
budget that an organization has to run its applications.
Optimizations like server-less computing, load balancing, and storage selection can help in
keeping the Cloud computing costs low with no degradation in User experience.
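The caching strategy mentioned above can be illustrated with Python's standard functools.lru_cache. In this sketch fetch_from_origin is a hypothetical stand-in for a billed download from cloud storage, and the byte counter simulates the bandwidth we would pay for.

```python
from functools import lru_cache

bytes_transferred = 0  # running total of bytes fetched from the simulated origin


def fetch_from_origin(key: str) -> bytes:
    """Stand-in for a billed download from cloud storage (hypothetical)."""
    global bytes_transferred
    payload = f"content of {key}".encode()
    bytes_transferred += len(payload)
    return payload


@lru_cache(maxsize=128)
def fetch_cached(key: str) -> bytes:
    """Serve repeated requests from memory; only a cache miss hits the origin."""
    return fetch_from_origin(key)


# One hundred requests for the same frequently accessed object.
for _ in range(100):
    fetch_cached("logo.png")
```

Only the first request reaches the origin; the remaining requests are served from the cache, so the billed transfer stays at the size of a single download.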
144. Do you think Regulations and Legal Compliance is an important aspect of Cloud
Computing?
Yes, in Cloud Computing we are using resources that are owned by the Cloud provider. Due to
this our data resides on the servers that can be shared by other users of Cloud.
There are regulations and laws for handling user data. We have to ensure that these regulations
are met while selecting and implementing a Cloud computing strategy.
Similarly, if we are in a contract with a client to provide certain Service Level Agreement (SLA)
performance, we have to implement the cloud solution in such a way that there is no breach of
SLA agreement due to Cloud provider’s failures.
For security there are laws that have to be followed irrespective of Cloud or Co-located Data
center. This is in the interest of our end-customer as well as for the benefit of business
continuity.
With Cloud computing architecture we have to do due diligence in selecting Security and
Encryption options in Cloud.
Unix Questions
Including the files that are two levels down in a sub-directory. In Unix we have the rm command to
remove files and sub-directories. The rm command has a -r option that stands for recursive.
The -r option deletes all files in a directory recursively.
My_dir
->Level_1_dir
With the rm -r * command we can delete the file a.txt as well as the sub-directories Level_1_dir
and Level_2_dir.
Command:
rm -r *
The asterisk (*) is a wildcard character that matches files with any name (except hidden files,
whose names start with a dot).
146. What is the difference between the -v and -x options in Bash shell scripts?
In a Bash shell we can specify the -v and -x options on the first line of a script as follows:
#!/bin/bash -xv
(The two options are combined because many systems pass everything after the interpreter path as
a single argument.)
With the -x option, Bash echoes each command (including for, select, case etc.) after substituting
the arguments and variables. So we see the expanded form of each command, which shows all the
actions of the script.
This is very useful for debugging a shell script. With the -v option, Bash echoes every
input line as it is read, before substituting the values of arguments and variables.
With the -v option, if we run a script, the shell prints each line of the file as it reads it,
before executing it. If we use the shell interactively, it shows each command after we press Enter.
In Unix there are many filter commands, such as cat, awk, grep, head, tail, cut etc.
A Filter is a software program that takes an input and produces an output, and it can be used in a
stream operation.
We can mix and match multiple filters to create a complex command that can solve a problem.
Awk and Sed are complex filters that provide fully programmable features.
Even Data scientists use Unix filters to get the overview of data stored in the files.
A Kernel is the main component that can control everything within Unix OS.
It is the first program that is loaded on startup of Unix OS. Once it is loaded it will manage the
rest of the startup process.
Kernel manages memory, scheduling as well as communication with peripherals like printers,
keyboards etc.
But Kernel does not directly interact with a user. For a new task, Kernel will spawn a shell and
user will work in a shell.
Kernel provides many system calls. A software program interacts with Kernel by using system
calls.
Kernel has a protected memory area that cannot be overwritten accidentally by any process.
Shell in Unix is a user interface that is used by a user to access Unix services.
Generally a Unix Shell is a command line interface (CLI) in which users enter commands by
typing or uploading a file.
We use a Shell to run different commands and programs on Unix operating system.
A Shell also has a command interpreter that can take our commands and send these to be
executed by Unix operating system.
Some of the popular Shells on Unix are: Korn shell, BASH, C shell etc.
150. What are the different shells in Unix that you know about?
We use ls -l command to list the files and directories in a directory. With -l option we get long
listing format.
In this format the first character identifies the entry type. The entry type can be one of the
following:
In a multi-tasking environment, the same user can submit more than one task and the operating
system will execute them at the same time.
In a Multi-user environment, more than one user can interact with the operating system at the
same time.
In this example files_to_delete is a file containing the list of files to be deleted. cat command
outputs this file and gives the output to rm command. rm command deletes the files.
An inode is a data structure in Unix that represents a file or a directory on a file system. It
contains information about the file, such as the location of its data on disk, access mode,
ownership and file type.
Each inode has a number that is used in the index table. The Unix kernel uses the inode number to
access the contents of an inode.
154. What is the difference between absolute path and relative path in Unix file system?
Absolute path is the complete path of a file or directory from the root directory. In general root
directory is represented by / symbol. If we are in a directory and want to know the absolute path,
we can use pwd command.
E.g. In a directory structure /var/user/kevin/mail if we are in kevin directory then pwd command
will give absolute path as /var/user/kevin.
The absolute path of the mail folder is /var/user/kevin/mail. From the kevin folder, the relative
path of the mail directory is ./mail.
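The same idea can be sketched in a scratch directory. This is only an illustration: the tree is created with mktemp, and the names base, kevin and mail are assumptions borrowed from the example above.

```shell
#!/bin/sh
# Build a throwaway tree that mirrors the kevin/mail example.
base=$(mktemp -d)
mkdir -p "$base/kevin/mail"

# Absolute path of the kevin directory, as reported by pwd.
abs_kevin=$(cd "$base/kevin" && pwd)

# ./mail is relative to the current directory; after cd, pwd gives
# the absolute form of the same location.
abs_mail=$(cd "$base/kevin" && cd ./mail && pwd)

echo "kevin directory:    $abs_kevin"
echo "./mail resolves to: $abs_mail"

rm -rf "$base"
```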
155. What are the main responsibilities of a Unix Shell?
1. Program Execution: A shell is responsible for executing the commands and script files in
Unix. User can either interactively enter the commands in Command Line Interface called
terminal or they can run a script file containing a program.
2. Environment Setup: A shell can define the environment for a user. We can set many
environment variables in a shell and use the value of these variables in our program.
3. Interpreter: A shell acts as an interpreter for our scripts. It has a built in programming
language that can be used to implement the logic.
4. Pipeline: A shell also can hookup a pipeline of commands. When we run multiple commands
separated by | pipe character, the shell takes the output of a command and passes it to next one in
the pipeline.
5. I/O Redirection: Shell is also responsible for taking input from command line interface (CLI)
and sending the output back to CLI. We use >, <, >> characters for this purpose.
A Unix Shell variable is an internal variable that a shell maintains. It is local to that Shell. It is
not made available to the parent shell or child shell.
To use a Shell variable in a script we use $ sign in front of the variable name.
157. What are the important Shell variables that are initialized on starting a Shell?
There are following important Shell variables that are automatically initialized when a Shell
starts:
user:
term:
home:
path:
If we change the value of these Shell variables then the corresponding environment variable
value is also changed.
158. How will you set the value of Environment variables in Unix?
% setenv MAX_TIME 10
If we just use printenv then it lists all the environment variables and their values.
To use an environment variable in a command we use the prefix $ with the name of variable.
What is the special rule about Shell and Environment variable in Bourne Shell?
In Bourne Shell, there is not much difference between Shell variable and Environment variable.
Once we start a Bourne Shell, it gets the value of environment variables and defines a
corresponding Shell variable. From that time onwards the shell only refers to Shell variable. But
if a change is made to a Shell variable, then we have to explicitly export it to environment so that
other shell or child processes can use it.
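A minimal sketch of the export rule, assuming a POSIX sh. DEMO_MAX_TIME is a hypothetical variable name used only for this illustration.

```shell
#!/bin/sh
# DEMO_MAX_TIME is a made-up name; start from a clean state.
unset DEMO_MAX_TIME

DEMO_MAX_TIME=10                  # plain shell variable, not exported
before=$(sh -c 'echo "${DEMO_MAX_TIME:-unset}"')

export DEMO_MAX_TIME              # now part of the environment
after=$(sh -c 'echo "${DEMO_MAX_TIME:-unset}"')

echo "child shell before export: $before"
echo "child shell after export:  $after"
```

The child shell sees the value only after the variable has been exported.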
159. What is the difference between a System Call and a library function?
System calls are low-level kernel calls. These are handled by the kernel. System calls are
implemented in kernel of Unix. An application has to execute special hardware and system
dependent instruction to run a System call.
A library function is also a low level call but it is implemented in user space. A library call is a
regular function call whose code resides in a shared library.
160. What are the networking commands in Unix that you have used?
Some of the popular networking commands in Unix that we use are as follows:
1. ping: We use this command to test the reachability of a host on an Internet Protocol (IP)
network.
2. telnet: This is another useful command to access another machine on the network. This
command uses the Telnet protocol.
3. traceroute: This is a diagnostic command to display the route and transit delays of packets
across an Internet Protocol (IP) network. On Windows the equivalent command is tracert.
4. ftp: We use ftp commands to transfer files over the network. ftp uses File Transfer Protocol.
5. su: This unix command is used to execute commands with the privileges of another user. It is
also known as switch user, substitute user.
6. ssh: This is a secure command that is preferred over Telnet for connecting to another machine.
It creates a secure channel over an unsecured network. It uses cryptographic protocol to make
the communication secure.
A Pipeline in Unix is a chain of commands that are connected through a stream in such a way
that output of one command becomes input for another command.
% ls -l | grep "abc" | wc -l
In the above example we have created a pipeline of three commands: ls, grep and wc.
First ls -l command is executed and gives the list of files in a directory. Then grep command
searches for any line with the word "abc" in it. Finally wc -l command counts the number of lines
that are returned by grep command.
In general a Pipeline is uni-directional. The data flows from left to right direction.
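A small runnable sketch of such a pipeline, assuming a scratch directory with hypothetical file names. Plain ls is used instead of ls -l so that only file names are searched, not owner or group columns.

```shell
#!/bin/sh
# Scratch directory; two of the three names contain "abc".
workdir=$(mktemp -d)
touch "$workdir/abc_report.txt" "$workdir/notes_abc.log" "$workdir/other.txt"

# ls lists the names, grep keeps the lines containing "abc",
# wc -l counts the lines that grep passed on.
matches=$(ls "$workdir" | grep "abc" | wc -l | tr -d ' ')

echo "entries matching abc: $matches"
rm -rf "$workdir"
```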
We use tee command in a shell to read the input by user (standard input) and write it to screen
(standard output) as well as to a file.
We can use tee command to split the output of a program so that it is visible on command line
interface (CLI) as well as stored on a file for later use.
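A minimal sketch of tee splitting a stream. The log file is a temporary file created just for this illustration.

```shell
#!/bin/sh
logfile=$(mktemp)      # hypothetical log file for this sketch

# tee copies its standard input to the file and to standard output,
# so the next command in the pipeline still receives every line.
shown=$(printf 'line one\nline two\n' | tee "$logfile" | wc -l | tr -d ' ')
saved=$(wc -l < "$logfile" | tr -d ' ')

echo "lines passed on: $shown, lines saved: $saved"
rm -f "$logfile"
```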
We can use wc (word count) command for counting the number of lines and words in a file. The
wc command provides very good options for collecting statistics of a file. Some of these options
are:
In case we give more than one file as input to wc command then it gives statistics for individual
files as well as the total statistics for all files.
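The option list did not survive in this copy, so the sketch below assumes the three standard options -l (lines), -w (words) and -c (bytes). The sample file is a temporary file made up for the illustration.

```shell
#!/bin/sh
sample=$(mktemp)
printf 'one two\nthree\n' > "$sample"

# tr strips the padding that some wc implementations add.
lines=$(wc -l < "$sample" | tr -d ' ')   # number of lines
words=$(wc -w < "$sample" | tr -d ' ')   # number of words
bytes=$(wc -c < "$sample" | tr -d ' ')   # number of bytes

echo "lines=$lines words=$words bytes=$bytes"
rm -f "$sample"
```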
Bash stands for Bourne Again Shell. It is free software written to replace Bourne shell.
#!/bin/bash
% command1; command2
We can use grep command to search for a name or any text in a Unix file.
Grep command can search for a text in one file as well as multiple files.
We can also specify the text to be searched in regular expression pattern.
% grep '^z' *.txt
Above command searches for lines starting with letter z in all the .txt files in current directory.
In Unix, grep is one of the very useful commands. It provides many useful options. Some of the
popular options are:
% grep -v: We use this option to find the lines that do not have the text we are searching.
% grep -A 10: This option displays 10 lines after the match is found.
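A short sketch of these options on a made-up sample file. Note that -A is an extension found in GNU and BSD grep rather than part of POSIX, so its availability is an assumption here.

```shell
#!/bin/sh
sample=$(mktemp)
printf 'zebra\napple\nzoo\nbanana\n' > "$sample"

# Lines that begin with the letter z.
starts_with_z=$(grep -c '^z' "$sample")

# -v inverts the match: lines that do NOT contain "an".
without_an=$(grep -v 'an' "$sample" | wc -l | tr -d ' ')

# -A 1 prints the matching line plus one line after it.
after_apple=$(grep -A 1 'apple' "$sample")

echo "start with z: $starts_with_z"
echo "without 'an': $without_an"
echo "$after_apple"
rm -f "$sample"
```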
167. What is the difference between whoami and who am i commands in Unix?
Both the commands whoami and who am i are used to get the user information in Unix.
When we login as root user on the network, then both whoami and who am i commands will
show the user as root.
But when any other user, let us say john, logs in remotely and runs su - root, whoami will show
root, but who am i will show the original user john.
Superuser is a special user account. It is used for Unix system administration. This user can
access all files on the file system. The Superuser can also run any command on a system.
Most of the users work on their own user accounts. But when they need to run some additional
commands, they can use su to switch to Superuser account.
We can use ps command to check the status of a process in Unix. It is short for Process Status.
On running ps command we get the list of processes that are executing in the Unix environment.
Generally we use ps -ef command. In this e stands for every process and f stands for full format.
This command gives us id of the process. We can use this id to kill the process.
If a file is very big then the contents of the file will not fit in screen, therefore screen will scroll
forward and in the end we just see the last page of information from a file.
With more command we can pause the scrolling of data from a file in display. If we use cat
command with more then we just see the first page of a file first. On pressing enter button, more
command will keep changing the page. In this way it is easier to view information in a file.
When using the cat command to display file contents, large data that does not fit on the screen
would scroll off without pausing, therefore making it difficult to view. On the other hand, using
the more command is more appropriate in such case because it will display file contents one
screen page at a time.
With the combination of these three sets permissions of file in Unix are specified.
E.g. If a file has permissions -rwxr-xr-- , it means that owner has read, write, execute access.
Group has read and execute access. Others have just read access. So the owner or admin has to
specifically grant access to Others to execute the file.
172. We wrote a shell script in Unix but it is not doing anything. What could be the reason?
After writing a shell script we have to give it execute permission so that it can be run in Unix
shell.
We can use chmod command to change the permission of a file in Unix. In general we use
chmod +x to give execute permission to users for executing the shell script.
E.g. chmod +x abc.txt will give execute permission to users for executing the file abc.txt.
With chmod command we can also specify to which user/group the permission should be
granted. The options are:
We use chmod command to change the permissions of a file in Unix. In this command we can
pass the file permissions in the form of a three-digit number.
In this number 755, first digit 7 is the permissions given to owner, second digit 5 is the
permissions of group and third digit 5 is the permissions of all others.
4 = read permission
2 = write permission
1 = execute permission
In our example 755 means, owner has read, write and execute permissions. Group and others
have read and execute permissions.
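The arithmetic behind 755 can be sketched as follows. The script file is a temporary file created only for this illustration.

```shell
#!/bin/sh
script=$(mktemp)        # hypothetical script file

owner=$((4 + 2 + 1))    # read + write + execute = 7
group=$((4 + 1))        # read + execute = 5
others=$((4 + 1))       # read + execute = 5
mode="$owner$group$others"

chmod "$mode" "$script"

# The first ten characters of ls -l show the type and the permissions.
perms=$(ls -l "$script" | cut -c1-10)
echo "mode $mode gives $perms"
rm -f "$script"
```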
174. How can we run a process in background in Unix? How can we kill a process running
in background?
Once we use & option it runs the process in background and prints the process ID. We can note
down this process ID for using it in kill command.
We can also use ps -ef command to get the process ID of processes running in background.
% kill -9 processId
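A minimal sketch of the whole cycle in a script, using sleep as a stand-in for a long-running job. In a script the process ID is read from $! rather than from the message an interactive shell prints.

```shell
#!/bin/sh
sleep 60 &                          # & runs the command in the background
pid=$!                              # $! holds the ID of the last background job
echo "started background process $pid"

kill "$pid"                         # sends SIGTERM; kill -9 "$pid" would force it
wait "$pid" 2>/dev/null || true     # collect the exit status of the killed job

# kill -0 sends no signal; it only checks whether the process exists.
if kill -0 "$pid" 2>/dev/null; then running=yes; else running=no; fi
echo "still running: $running"
```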
We can create a file with Vi editor, cat or any other command. Once the file is created we have
to give read only permissions to file. To change file permission to read only we use following
command:
We use alias in Unix to give a short name to a long command. We can even use it to combine
multiple commands and give a short convenient name.
With this alias we just need to type c for running clear command.
To get the list of all active alias in a shell we can run the alias command without any argument
on command line.
% alias
alias h='history'
alias ki='kill -9'
alias l='last'
In Unix we can redirect the output of command or operation to a file instead of command line
interface (CLI). For this we use redirection operators. These are symbols > and >>.
If we want to append the contents of one file at the end of another file we use following:
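The original commands are missing from this copy, so the following is only a sketch of how > and >> behave, using temporary files with made-up contents.

```shell
#!/bin/sh
first=$(mktemp)
second=$(mktemp)

echo "hello" > "$first"        # > creates or overwrites the file
echo "again" > "$first"        # the earlier content is replaced
echo "world" >> "$first"       # >> appends to the end of the file

echo "tail" > "$second"
cat "$first" >> "$second"      # append one file to the end of another

first_lines=$(wc -l < "$first" | tr -d ' ')
second_lines=$(wc -l < "$second" | tr -d ' ')
echo "first has $first_lines lines, second has $second_lines lines"
rm -f "$first" "$second"
```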
178. What are the main steps taken by a Unix Shell for processing a command?
1. Parse: First step is to parse the command or set of commands given in a Command Line
Interface (CLI). In this step multiple consecutive spaces are replaced by single space. Multiple
commands that are delimited by a symbol are divided into multiple individual actions.
2. Variable: In next step Shell identifies the variables mentioned in commands. Generally any word
prefixed by $ sign is a variable.
3. Command Substitution: In this step, Shell executes the commands that are surrounded by back
quotes and replaces that section with the output from the command.
4. Wild Card: Once these steps are done, Shell replaces the Wild card characters like asterisk * with
the relevant substitution.
5. Execute: Finally, Shell executes all the commands and follows the sequence in which Commands
are given in CLI.
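The steps above can be observed one at a time in a small sketch. The directory and file names are assumptions made for the illustration, and $(...) is used as the modern equivalent of back quotes.

```shell
#!/bin/sh
workdir=$(mktemp -d)
touch "$workdir/a.txt" "$workdir/b.txt" "$workdir/c.log"

# Parse: the unquoted run of spaces only separates the two words.
collapsed=$(echo one      two)

# Variable: $ext is replaced by its value before the command runs.
ext=txt

# Command substitution: the inner pipeline runs first and its output
# replaces the $(...) section.
file_count=$(ls "$workdir" | wc -l | tr -d ' ')

# Wild card: * expands to the matching file names.
set -- "$workdir"/*."$ext"
txt_count=$#

echo "collapsed='$collapsed' files=$file_count txt=$txt_count"
rm -rf "$workdir"
```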
Sometimes when we give write permission on a directory to another user, that user can delete
files in it without the owner knowing about it. To prevent such an accidental deletion we use
the sticky bit.
When we mark a directory with the sticky bit, only the owner of a file, the owner of the
directory or the superuser can delete or rename that file. On most modern systems the sticky bit
is meaningful only on directories.
% chmod +t dirname
When we do ls -l for a directory, the entries with sticky bit are listed with letter t in the end
of permissions.
E.g. % ls -lrt
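A minimal sketch that sets the sticky bit on a scratch directory and shows the change in the permission string. The directory is created with mktemp and is an assumption of this illustration.

```shell
#!/bin/sh
shared=$(mktemp -d)     # hypothetical shared directory
chmod 777 "$shared"     # everyone may create files here
before=$(ls -ld "$shared" | cut -c1-10)

chmod +t "$shared"      # set the sticky bit
after=$(ls -ld "$shared" | cut -c1-10)

echo "before: $before"
echo "after:  $after"
rm -rf "$shared"
```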
180. What are the different outputs from Kill command in Unix?
EPERM denotes that system does not permit the process to be killed.
ESRCH denotes that process with PID mentioned in Kill command does not exist anymore. Or
due to security restrictions we cannot access that process.
In Unix, almost all the popular shells provide options to customize the environment by using
environment variables. To make these customizations permanent we can write these to special
files that are specific to a user in a shell.
Once we write our customizations to these files, we keep on getting same customization when
we open a new shell with same user account.
The special files for storing customization information for different shells at login time are:
182. What are the popular commands for user management in Unix?
1. id: This command gives the active user id with login and groups to which user belongs.
2. who: This command gives the user that is currently logged on system. It also gives the time of
login.
3. last: This command shows the previous logins to the system, with the most recent login first.
4. adduser: We use this command to add a new user.
5. groupadd: We use this command to add a new group in the system.
6. usermod: We use usermod command to add a user to a group or remove a user from a group in Unix.
183. How will you debug a shell script in Unix?
A shell script is a program that can be executed in Unix shell. Sometimes a shell script does not
work as intended. To debug and find the problem in shell script we can use the options provided
by shell to debug the script.
In bash shell there are -x and -v options that can be used while running a script.
With option -v all the input lines are printed by shell as they are read. With option -x all the
simple commands are printed in expanded format, so we can see all the arguments passed to a
command.
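A small sketch of tracing with -x. It assumes sh is used to run the script (bash accepts the same option), and the two-line script is made up for the illustration.

```shell
#!/bin/sh
script=$(mktemp)        # hypothetical script to debug
trace=$(mktemp)
cat > "$script" <<'EOF'
name=world
echo "hello $name"
EOF

# -x writes each command to standard error after expansion,
# so the trace is captured separately from the normal output.
output=$(sh -x "$script" 2> "$trace")
traced=$(grep -c 'hello world' "$trace")

echo "output: $output"
cat "$trace"
rm -f "$script" "$trace"
```

The trace shows the echo command with the variable already replaced by its value.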
184. What is the difference between a Zombie and Orphan process in Unix?
Zombie is a defunct child process in Unix that still has entry in process table.
This happens when a child process has terminated, but the parent process has not yet called wait
to read its exit status.
A Zombie process is different from an Orphan process. An orphan process is a child process
that is still running after its parent process has died. Once a process is orphan it is adopted
by init process. So effectively it is not an orphan.
Therefore if a process exits without cleaning its child processes, they do not become Zombie.
Instead init process adopts these child processes.
A Zombie stays in the process table until its parent reaps it by calling wait.
We can use one of the networking commands in Unix. It is called ping. With ping command we
can ping a remote host.
Ping utility sends packets in an IP network with ICMP protocol. Once the packet goes from
source to destination and comes back it records the time.
We can even specify the number of packets we want to send so that we collect more statistics to
confirm the result.
% ping www.google.com
We can use history command to get the list of commands that were executed in Unix. Since we are
only interested in the last executed command we use tail. We ask for the last two entries
because the final entry is the history command itself.
% history | tail -2
We can use "2>&1" in a command so that all the errors from standard error go to standard
output.
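A minimal sketch of the redirection, assuming a temporary log file. The inner command is made up so that it writes one line to each stream.

```shell
#!/bin/sh
log=$(mktemp)

# The inner command writes one line to standard output and one to
# standard error. The order matters: > "$log" comes first, then 2>&1
# points standard error at the same place.
sh -c 'echo "normal output"; echo "error output" >&2' > "$log" 2>&1

captured=$(wc -l < "$log" | tr -d ' ')
echo "lines captured in log: $captured"
rm -f "$log"
```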
188. How will you find which process is taking most CPU time in Unix?
In Unix, we can use top command to list the CPU time and memory used by various processes.
The top command lists the process IDs and CPU time, memory etc used by top most processes.
Top command keeps refreshing the screen at a specified interval. So we can see over the time
which process is always appearing on the top most row in the result of top command.
189. What is the difference between Soft link and Hard link in Unix?
A soft link is a pointer to a file, directory or a program located in a different location. A hard link
can point to a program or a file but not to a directory.
If we move, delete or rename a file, the soft link will be broken. But a hard link still remains
after moving the file/program.
We use the command ln -s for creating a soft link. But a hard link can be created by ln command
without -s option.
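The difference can be seen in a short sketch. The file names are assumptions, and everything happens inside a scratch directory.

```shell
#!/bin/sh
workdir=$(mktemp -d)
echo "data" > "$workdir/original.txt"

ln "$workdir/original.txt" "$workdir/hard.txt"       # hard link
ln -s "$workdir/original.txt" "$workdir/soft.txt"    # soft (symbolic) link

mv "$workdir/original.txt" "$workdir/renamed.txt"    # rename the original

# The hard link still reaches the data; the soft link now points
# at a name that no longer exists.
hard_content=$(cat "$workdir/hard.txt")
if [ -e "$workdir/soft.txt" ]; then soft_state=ok; else soft_state=broken; fi

echo "hard link content: $hard_content"
echo "soft link state:   $soft_state"
rm -rf "$workdir"
```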
190. How will you find which processes are using a file?
We can use lsof command to find the list of Process IDs of the processes that are accessing a file
in Unix.
Lsof stands for List Open Files.
% lsof /var
It will list the processes that are accessing /var directory in current unix system.
In Unix, nohup command is used to run a command that should keep running after we log out. It
is usually combined with & to run the command in background, but it is different from using
the & option alone.
Nohup stands for No Hangup. A nohup process does not stop even if the Unix user that started
the process has logged out from the system.
But a process started with only the & option may be terminated by the hangup signal (SIGHUP)
when the user that started the process logs off.
192. How will you remove blank lines from a file in Unix?
We can use grep command for this purpose. Grep command gives -v option to exclude lines that
match a pattern.
In an empty line there is nothing from start to end. In Grep command, ^ denotes the start of line
and $ denotes the end of line.
% grep -v '^$' lists the lines that are not empty.
Once we get this result, we can use > operator to write the output to a new file. So exact
command will be:
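The exact command is missing from this copy. A sketch of what it could look like, with temporary files standing in for the real input and output names:

```shell
#!/bin/sh
input=$(mktemp)         # hypothetical input file
output=$(mktemp)        # hypothetical output file
printf 'first\n\nsecond\n\n\nthird\n' > "$input"

# -v drops every line that matches ^$ (an empty line);
# > writes the remaining lines to the new file.
grep -v '^$' "$input" > "$output"

kept=$(wc -l < "$output" | tr -d ' ')
echo "non-empty lines kept: $kept"
rm -f "$input" "$output"
```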
193. How will you find the remote hosts that are connecting to your system on a specific
port in Unix?
We can use netstat command for this purpose. Netstat command lists the statistics about network
connections. We can grep for the port in which we are interested.
We use xargs command to build and execute commands that take input from standard input. It is
generally used in chaining of commands.
Xargs breaks the list of arguments into small sub lists that can be handled by a command.
The above command uses find to get the list of all files in /path directory. Then xargs command
passes this list to rm command so that they can be deleted.
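The command referred to above did not survive in this copy. The following is a sketch of the find and xargs combination under assumed names: a scratch directory replaces /path, and only files ending in .tmp are removed.

```shell
#!/bin/sh
workdir=$(mktemp -d)
touch "$workdir/a.tmp" "$workdir/b.tmp" "$workdir/keep.txt"

# find prints one matching path per line; xargs collects them into
# argument lists and runs rm on each list.
find "$workdir" -type f -name '*.tmp' | xargs rm -f

remaining=$(ls "$workdir" | wc -l | tr -d ' ')
echo "files left: $remaining"
rm -rf "$workdir"
```

This simple form assumes that file names contain no spaces or newlines, since xargs splits its input on whitespace.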