INTERNSHIP REPORT
ON
“DATA SCIENCE”
Submitted in partial fulfilment of the course Internship (21CSI85) for the award of the degree of
BACHELOR OF ENGINEERING IN
Computer Science and Engineering
Submitted by:
NAME: YASHWANTH.R
USN: 1AM21CS221
CERTIFICATE
This is to certify that the Internship titled “Data Science” has been carried out by YASHWANTH.R,
a bonafide student of AMC Engineering Institute of Technology, in partial fulfillment for the award
of the degree of Bachelor of Engineering in Computer Science and Engineering under Visvesvaraya
Technological University, Belagavi, during the year 2022-2023. It is certified that all
corrections/suggestions indicated have been incorporated in the report.
The internship report has been approved as it satisfies the academic requirements in respect of the
Internship prescribed for the course Internship / Professional Practice.
External Viva:
1)
2)
Date: 03-12-23
Place: Rajarajeshwarinagar, Bangalore
USN: 1AM21CS221
NAME: YASHWANTH.R
ACKNOWLEDGEMENT
This Internship is a result of the accumulated guidance, direction and support of several important
persons. We take this opportunity to express our gratitude to all who have helped us to complete
the Internship.
We express our sincere thanks to our Principal for providing us adequate facilities to undertake
this Internship.
We would like to thank our Head of Dept – branch code, for providing us an opportunity to carry
out the Internship and for his valuable guidance and support.
We would like to thank the Software Services team for guiding us during the period of the
internship.
We express our deep and profound gratitude to our guide, Guide name, Assistant/Associate
Prof, for her keen interest and encouragement at every step in completing the Internship.
We would like to thank all the faculty members of our department for the support extended
during the course of Internship.
We would like to thank the non-teaching members of our dept, for helping us during the
Internship.
Last but not least, we would like to thank our parents and friends, without whose constant help the
completion of the Internship would not have been possible.
ABSTRACT
The report begins with an introduction to the [background/context] and a review of relevant
literature. Methodological details are outlined to provide transparency and context for the results.
The main body of the report discusses key findings, [subtopics], and their significance.
Key highlights of the report include [noteworthy findings or conclusions]. Recommendations for
[potential actions or future research] are provided based on the insights gained. The implications
of this research extend to [relevant stakeholders or fields].
Overall, this report contributes to the understanding of [Your Report Topic] and provides
valuable insights for [target audience or industry]. It serves as a foundation for further exploration
and advancements in [related areas].
CONTENTS
1. Company Profile
3. Introduction
4. System Analysis
Macrolytics strives to be the front runner in creativity and innovation in software development
through their well-researched expertise, and to establish themselves as an out-of-the-box software
development company in Bangalore, India. As a software development company, they translate
this software development expertise into value for their customers through their professional
solutions.
They understand that the best desired output can be achieved only by understanding the client's
demands better. Macrolytics works with their clients and helps them define their exact solution
requirements. Clients sometimes find that they have completely redefined their solution or new
application requirement during the brainstorming sessions, and here Macrolytics positions itself
as an IT solutions consulting group comprising high-calibre consultants.
They believe that technology, when used properly, can help any business scale and achieve new
heights of success. It helps improve efficiency, profitability and reliability; to put it in one
sentence, “Technology helps you to delight your customers”, and that is what they want to
achieve.
Macrolytics Technology is a technology organization providing solutions for web design and
development, MySQL, Python programming, HTML, CSS, ASP.NET, CAD and LINQ.
Meeting ever-increasing automation requirements, Macrolytics Technology specializes in
ERP, connectivity, SEO services, conference management, effective web promotion and
tailor-made software products, designing solutions that best suit clients' requirements. The
organization has the right mix of professionals as stakeholders to serve clients to the best of
their capability and at par with industry standards. They have young, enthusiastic, passionate
and creative professionals who develop technological innovations in the fields of mobile
technologies, web applications, and business and enterprise solutions. The motto of the
organization is to “Collaborate with our clients to provide them with the best technological
solution, hence creating a good present and a better future for our clients, which will bring a
cascading positive effect in their business as well”.
Providing a complete suite of technical solutions is not just our tag line; it is our vision for our
clients and for us, and we strive hard to achieve it.
They have a great team of skilled mentors who are always ready to direct their trainees in the best
possible way. To keep the mentors' skills in step with the demands of companies, many skill
development programs are held as well, so that every mentor can continue developing their own
skills and prepare well-rounded, industry-ready trainees. The training areas offered include:
• Python
• Selenium Testing
• Software Training
• CAD Automation
This report delves into the symbiotic relationship between data science and AWS, exploring how
the integration of cutting-edge data science techniques with AWS's robust cloud services
enhances the efficiency, scalability, and accessibility of data-driven solutions. As organizations
increasingly recognize the strategic importance of leveraging their data assets, the synergy
between data science and AWS becomes a pivotal driver for innovation, agility, and competitive
advantage in today's dynamic business landscape.
Throughout this report, we will navigate the key components of data science workflows on AWS,
examining the tools, services, and best practices that facilitate the seamless development,
deployment, and management of data-driven applications. From data ingestion and storage to
advanced analytics and machine learning model deployment, we will explore how AWS enables
organizations to build end-to-end data solutions that optimize decision-making processes and
uncover hidden patterns within their data.
4. SYSTEM ANALYSIS
System analysis is a critical phase in the development of information systems that involves
studying and understanding the current system, identifying problems or areas for improvement,
and defining the requirements for a new or enhanced system. This process is essential for
designing effective and efficient solutions that align with organizational goals. Here's an
overview of key aspects involved in system analysis:
Scope Definition: Clearly defining the boundaries of the system under consideration and
understanding its interfaces with external entities.
Stakeholder Identification: Identifying and involving all relevant stakeholders, including end-
users, managers, and IT professionals.
Problem Identification and Definition:
Identifying the shortcomings of the current system and clearly stating the problems that the new
or enhanced system is expected to solve.
Data Analysis:
Data Collection: Identifying and documenting the data sources, formats, and structures.
Data Modeling: Creating data models (e.g., Entity-Relationship Diagrams) to represent the
relationships between different data entities.
Process Analysis:
Process Modeling: Creating process flow diagrams or flowcharts to represent the workflow and
interactions within the system.
Identifying Bottlenecks: Analyzing the current processes to identify inefficiencies or bottlenecks.
Object Identification: Identifying and modeling objects in the system, along with their attributes
and behaviors.
Use Case Analysis: Identifying and documenting use cases that describe the interactions between
users and the system.
Prototyping and Mockups:
Prototyping: Developing prototypes or mockups to visualize the proposed system and gather
feedback from stakeholders.
User Interface Design: Designing the user interface based on user requirements and usability
principles.
Functional Requirements: Defining the functions and features the system must provide.
Non-functional Requirements: Specifying criteria related to performance, security, and other
quality attributes.
System Analysis Report: Compiling the findings, requirements, and proposed solutions into a
comprehensive report.
Communication: Effectively communicating the analysis results to stakeholders for validation
and feedback.
System analysis is a dynamic and iterative process, often involving collaboration between
analysts, end-users, and other stakeholders. It lays the foundation for successful system design
and development, ensuring that the resulting information system meets the needs of the
organization and its users.
Assignment
Complete four GeeksforGeeks website problems
SNAPSHOTS
NumPy
Key Features:
• N-dimensional array object (ndarray) for fast, memory-efficient numerical computation.
• Vectorized operations and broadcasting that remove the need for explicit Python loops.
• Routines for linear algebra, random number generation, and Fourier transforms.
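As a small illustration of these features, the following sketch (assuming NumPy is installed and imported as np) shows array creation, broadcasting, and vectorized aggregation:

import numpy as np

# Create a 2-D array (ndarray) from a nested Python list
data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])

# Broadcasting: subtract the column means from every row without a loop
centered = data - data.mean(axis=0)

# Vectorized aggregate functions
print("column means:", data.mean(axis=0))
print("overall std dev:", centered.std())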
Pandas
Key Features:
• DataFrame and Series are core data structures for structured data.
• Data cleaning, exploration, and manipulation functionalities.
• Time series data support, merging, joining, and I/O operations.
Use Cases:
• Data cleaning, preprocessing, and exploratory data analysis.
• Essential for statistical analysis, time series analysis, and data wrangling.
Community and Documentation:
• Large and active community support with comprehensive documentation.
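To make the features listed above concrete, the following is a small illustrative sketch; the column names and values are made up for the example and are not taken from the internship datasets:

import pandas as pd

# Illustrative sales data (hypothetical columns and values)
sales = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer": ["A", "B", "A", None],
    "amount": [250.0, None, 120.0, 90.0],
})
customers = pd.DataFrame({"customer": ["A", "B"], "region": ["South", "North"]})

# Data cleaning: drop rows with a missing customer, fill missing amounts with 0
clean = sales.dropna(subset=["customer"]).fillna({"amount": 0.0})

# Merging/joining and a simple group-by aggregation
report = clean.merge(customers, on="customer", how="left")
print(report.groupby("region")["amount"].sum())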
Conclusion:
Python, NumPy, and Pandas together form a powerful trio for programming, scientific
computing, and data manipulation. Python's simplicity, combined with the array manipulation
capabilities of NumPy and the versatile data structures of Pandas, makes them indispensable tools
for a wide range of applications. Whether you're a programmer, data scientist, or analyst, these
tools empower you to efficiently work with data and solve complex problems. The active
communities and extensive documentation further contribute to their widespread adoption and
continuous improvement. As you delve deeper into these tools, you'll unlock even more
possibilities for creativity and innovation in your projects.
SNAPSHOTS
Database design is a crucial step in creating efficient and effective databases. It involves defining
the structure that will store and manage data.
Entity-Relationship (ER) Modeling:
ER modeling is a common technique for database design. It identifies entities, attributes, and
relationships to create a visual representation of the database structure.
Normalization:
Normalization organizes tables to reduce redundancy and avoid update anomalies, typically by
decomposing data into well-structured tables (for example, up to third normal form).
Keys and Indexing:
Primary and foreign keys establish relationships between tables. Indexing enhances query
performance by allowing faster data retrieval.
Data Types and Constraints:
Choosing appropriate data types for columns and applying constraints (e.g., NOT NULL,
UNIQUE, CHECK) ensures data accuracy and consistency.
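The following is a small illustrative sketch of these design ideas. It uses Python's built-in sqlite3 module only so that the example is self-contained; the table names are hypothetical, and the same DDL concepts (primary/foreign keys, data types, NOT NULL/UNIQUE/CHECK constraints) carry over to MySQL:

import sqlite3

# In-memory database used purely for illustration
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# A normalized two-table design: customers and orders linked by a foreign key
cur.execute("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        email       TEXT UNIQUE
    )
""")
cur.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        amount      REAL CHECK (amount >= 0),
        FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
    )
""")
conn.commit()
conn.close()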
MySQL:
MySQL is an open-source relational database management system (RDBMS) widely used for
web applications and various software development projects.
Features:
• MySQL supports ACID properties (Atomicity, Consistency, Isolation, Durability) to
ensure reliability and data integrity.
• It provides a comprehensive set of SQL commands for database manipulation and
querying.
Data Definition Language (DDL):
• DDL statements in MySQL are used to define the database structure, including creating
and altering tables, specifying constraints, and defining indexes.
Data Manipulation Language (DML):
• DML statements are used for data manipulation operations such as SELECT, INSERT,
UPDATE, and DELETE.
Transactions:
• MySQL supports transactions, allowing multiple operations to be treated as a single,
atomic unit, ensuring consistency in the database.
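The following sketch illustrates DML statements executed inside a transaction from Python. It assumes the mysql-connector-python package, a reachable MySQL server, and an existing orders table; the connection details are placeholders:

import mysql.connector

# Placeholder connection details; adjust for the actual server and schema
conn = mysql.connector.connect(
    host="localhost", user="app_user", password="app_password", database="shop"
)
cur = conn.cursor()

try:
    # DML inside a transaction: both statements succeed or neither does
    cur.execute(
        "INSERT INTO orders (customer_id, amount) VALUES (%s, %s)", (1, 250.00)
    )
    cur.execute(
        "UPDATE orders SET amount = amount * 0.9 WHERE customer_id = %s", (1,)
    )
    conn.commit()
except mysql.connector.Error:
    conn.rollback()   # keep the database consistent if anything fails
finally:
    cur.close()
    conn.close()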
User Management:
• MySQL allows the creation and management of multiple users with different levels of
access privileges, enhancing security.
Storage Engines:
• MySQL supports multiple storage engines, such as InnoDB and MyISAM, each with its
own advantages and use cases.
Community and Documentation:
• MySQL has a large and active community providing support and resources. The official
documentation is comprehensive, offering guides, tutorials, and a detailed reference.
Integration with Programming Languages:
• MySQL integrates seamlessly with various programming languages, making it a popular
choice for developers working on diverse projects.
Use Cases:
Web Development:
• MySQL is extensively used in web development for storing and retrieving data from
websites and web applications.
Enterprise Applications:
• MySQL is employed in enterprise-level applications where data integrity and reliability
are paramount.
Data Warehousing:
• It is suitable for data warehousing applications where large volumes of data need to be
efficiently stored and queried.
Assignment
Snapshots
1. Data Collection:
Gather data from various sources, including databases, spreadsheets, cloud services, IoT devices,
and external data providers.
Extraction, Transformation, and Loading (ETL):
Use ETL processes to extract data from source systems, transform it into a consistent format, and
load it into a data warehouse or data mart.
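A minimal ETL sketch in Python/Pandas is shown below; the file names and column names are hypothetical, and a production pipeline would typically load into a warehouse with a bulk loader rather than a flat file:

import pandas as pd

# Extract: read raw data from two hypothetical source files
orders = pd.read_csv("orders.csv")        # e.g. order_id, customer_id, amount, date
customers = pd.read_csv("customers.csv")  # e.g. customer_id, region

# Transform: normalise types, drop bad rows, and join the sources
orders["date"] = pd.to_datetime(orders["date"])
orders = orders.dropna(subset=["customer_id"])
merged = orders.merge(customers, on="customer_id", how="left")

# Load: write the consistent, analysis-ready table to a staging area
merged.to_csv("warehouse/orders_fact.csv", index=False)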
2. Data Storage:
Data Warehouse:
Store structured, organized data in a centralized repository, such as a data warehouse, to facilitate
efficient querying and reporting.
Data Mart:
Create data marts for specific business units or departments to provide focused subsets of data
tailored to their needs.
3. Data Processing:
Data Modeling:
Design data models to represent the relationships between different data entities, ensuring data
integrity and supporting efficient queries.
Data Cubes:
Use multidimensional data cubes to organize data along multiple dimensions, facilitating
multidimensional analysis.
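As a small illustration, a Pandas pivot table can act as a simple in-memory data cube; the dimensions and figures below are made up for the example:

import pandas as pd

# Hypothetical fact table with dimensions (region, product, quarter) and a measure (revenue)
sales = pd.DataFrame({
    "region":  ["North", "North", "South", "South"],
    "product": ["A", "B", "A", "B"],
    "quarter": ["Q1", "Q1", "Q1", "Q2"],
    "revenue": [100.0, 150.0, 90.0, 120.0],
})

# A small "cube": revenue aggregated along two dimensions at once
cube = sales.pivot_table(index="region", columns="quarter",
                         values="revenue", aggfunc="sum", fill_value=0)
print(cube)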
4. Data Analysis:
Query and Reporting:
Develop queries and reports to retrieve and present data in a readable format for analysis.
OLAP (Online Analytical Processing):
Utilize OLAP tools to interactively analyze multidimensional data, enabling users to explore and
drill down into information.
5. Data Visualization:
Dashboards:
Create interactive dashboards that visualize key performance indicators (KPIs) and critical metrics
to facilitate quick decision-making.
Charts and Graphs:
Use various charts, graphs, and visual elements to represent data trends and patterns.
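The following is a small illustrative sketch using Matplotlib (an assumption, since the report does not prescribe a charting library); the KPI values are made up:

import matplotlib.pyplot as plt

# Illustrative KPI values; in a real dashboard these would come from the warehouse
months = ["Jan", "Feb", "Mar", "Apr"]
revenue = [120, 135, 128, 160]
orders = [30, 34, 31, 40]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.plot(months, revenue, marker="o")   # trend line for one KPI
ax1.set_title("Monthly revenue")
ax2.bar(months, orders)                 # bar chart for a second metric
ax2.set_title("Monthly orders")
fig.tight_layout()
plt.show()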
Machine Learning (ML) is a subfield of artificial intelligence (AI) that focuses on the
development of algorithms and statistical models that enable computer systems to perform tasks
without explicit programming. The core idea behind machine learning is to empower machines to
learn patterns and insights from data, improving their performance and decision-making
capabilities over time.
Machine learning can be categorized into four main types:
Supervised Learning:
In supervised learning, the algorithm is trained on a labeled dataset, where each input is
associated with a corresponding output. The goal is to learn a mapping from inputs to outputs,
allowing the algorithm to make predictions on new, unseen data.
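A minimal supervised learning sketch using scikit-learn (an assumed library choice; the toy data below are illustrative) is shown here:

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy labeled dataset: [hours studied, attendance %] -> pass (1) / fail (0)
X = [[2, 60], [4, 70], [6, 80], [8, 90], [1, 50], [7, 85], [3, 65], [9, 95]]
y = [0, 0, 1, 1, 0, 1, 0, 1]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

model = LogisticRegression()
model.fit(X_train, y_train)   # learn the mapping from inputs to labels
print("accuracy on unseen data:", model.score(X_test, y_test))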
Unsupervised Learning:
Unsupervised learning involves working with unlabeled data, where the algorithm aims to
discover inherent patterns or structures without explicit guidance. Clustering and dimensionality
reduction are common tasks in unsupervised learning.
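A minimal clustering sketch, again assuming scikit-learn, where the algorithm groups unlabeled points on its own:

import numpy as np
from sklearn.cluster import KMeans

# Unlabeled 2-D points; no outputs are provided to the algorithm
X = np.array([[1.0, 1.1], [0.9, 1.0], [1.2, 0.8],
              [8.0, 8.2], [7.9, 8.1], [8.3, 7.8]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("cluster labels:", kmeans.labels_)
print("cluster centres:", kmeans.cluster_centers_)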
Semi-supervised Learning:
Semi-supervised learning is a machine learning paradigm that falls between supervised learning
and unsupervised learning. In this approach, the algorithm is trained on a dataset that contains
both labeled and unlabeled examples. While a portion of the data is explicitly labeled with
corresponding outputs, the majority of the data remains unlabeled.
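A minimal semi-supervised sketch using scikit-learn's LabelPropagation, where -1 marks the unlabeled majority (the data are illustrative):

import numpy as np
from sklearn.semi_supervised import LabelPropagation

# Six points, but only two carry labels; -1 marks the unlabeled samples
X = np.array([[1.0], [1.2], [0.8], [8.0], [8.2], [7.9]])
y = np.array([0, -1, -1, 1, -1, -1])

model = LabelPropagation().fit(X, y)
print("inferred labels for all points:", model.transduction_)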
Reinforcement Learning:
Reinforcement learning involves an agent interacting with an environment and learning to make
decisions by receiving feedback in the form of rewards or penalties. The agent aims to maximize
cumulative rewards over time.
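A minimal tabular Q-learning sketch on a toy corridor environment (the environment and parameters are illustrative, not part of the internship material):

import random

# Five states in a row; the agent starts at state 0 and gets +1 reward at state 4.
# Actions: 0 = move left, 1 = move right.
N_STATES, ACTIONS, GOAL = 5, [0, 1], 4
Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, epsilon = 0.1, 0.9, 0.2

for episode in range(500):
    state = 0
    while state != GOAL:
        # Epsilon-greedy selection balances exploration and exploitation
        action = random.choice(ACTIONS) if random.random() < epsilon else Q[state].index(max(Q[state]))
        next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
        reward = 1.0 if next_state == GOAL else 0.0
        # Q-learning update: move the estimate toward reward + discounted future value
        Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
        state = next_state

print("learned action values per state:", Q)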
Networking Services:
Amazon VPC (Virtual Private Cloud): Isolated virtual networks for secure and customizable
cloud environments.
Amazon Route 53: Scalable and highly available Domain Name System (DNS) web service.
Analytics and Machine Learning:
Amazon Redshift: Fully managed data warehouse service for analytics.
Amazon SageMaker: Managed service for building, training, and deploying machine learning
models.
Security and Identity Services:
AWS Identity and Access Management (IAM): Access control and identity management
service.
AWS Key Management Service (KMS): Managed service for creating and controlling
encryption keys.
Management and Monitoring:
Amazon CloudWatch: Monitoring and management service for AWS resources.
AWS CloudTrail: Records AWS API calls for auditing.
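As a small illustration of working with these services from Python, the following sketch uses boto3, the AWS SDK for Python; it assumes AWS credentials and a default region are already configured (for example via aws configure), and the namespace and metric name are placeholders:

import boto3

cloudwatch = boto3.client("cloudwatch")

# Publish a custom metric that a dashboard or alarm could monitor
cloudwatch.put_metric_data(
    Namespace="InternshipDemo",
    MetricData=[{"MetricName": "ProcessedRecords", "Value": 42.0, "Unit": "Count"}],
)

# List the custom metrics recorded in that namespace
response = cloudwatch.list_metrics(Namespace="InternshipDemo")
for metric in response["Metrics"]:
    print(metric["MetricName"])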
Assignment:
Create an EFS file system and connect it to 3 different EC2 instances. Make sure all instances
have different operating systems, for instance Ubuntu, Red Hat Linux, and Amazon Linux 2.
Snapshots: