

MODULAR DEVELOPMENT

TRAINING MODULE 1: SUPPORT SYSTEMS FOR DECISION-MAKING AND DATA MANAGEMENT

AIM

Develop computer applications that perform basic data processing with the Python language,
identifying methods for exploring large volumes of data and both relational and non-relational
(NoSQL) data management systems.

DURATION IN ANY MODALITY OF DELIVERY: 100 hours

Teletraining: Duration of face-to-face tutorials: 0 hours

LEARNING OUTCOMES

Knowledge/cognitive and practical abilities

• Characterization of the Python language and its applications (see the sketch after this list)


- Python language
- Running Python programs
- Objects in Python
- Numeric types and dynamic typing
- Handling of text strings, lists, dictionaries, tuples and files
- Python statements: assignments, expressions and printing results
- Variable tests, syntax rules
- For and while loops
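
A minimal sketch illustrating several of the items above (dynamic typing, lists, dictionaries, tuples, files, and for/while loops). The values and file name are invented for illustration only.

    # Minimal Python sketch: dynamic types, collections, loops and files.
    ages = [31, 25, 47]                      # list
    person = {"name": "Ana", "age": 31}      # dictionary
    point = (3.5, 7.2)                       # tuple

    # for loop over a list
    total = 0
    for age in ages:
        total += age
    print("Average age:", total / len(ages))

    # while loop with a simple condition
    count = 0
    while count < 3:
        print("Iteration", count)
        count += 1

    # writing and reading a text file
    with open("ages.txt", "w") as f:
        for age in ages:
            f.write(f"{age}\n")
    with open("ages.txt") as f:
        lines = [line.strip() for line in f]
    print(person["name"], point, lines)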

• Interpretation of the use of API protocols (see the sketch after this list)


- Use of remote APIs
- Integration of applications with remote APIs
- Examples of application of remote APIs in Python language
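
A minimal sketch of calling a remote API from Python with the requests library. The endpoint URL, query parameters and JSON field names are hypothetical and used only for illustration.

    # Sketch of a remote REST API call with requests (hypothetical endpoint).
    import requests

    response = requests.get(
        "https://api.example.com/v1/measurements",   # hypothetical endpoint
        params={"city": "Madrid", "limit": 10},
        timeout=10,
    )
    response.raise_for_status()                      # fail loudly on HTTP errors

    data = response.json()                           # parse the JSON body
    for item in data.get("results", []):             # hypothetical field name
        print(item)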

• Programming a modular algorithm in the Python language (see the sketch after this list)


- Module programming
- Fundamentals of class programming
- Use of APIs and integration with Python applications
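
A small sketch of module and class programming: a class grouped in a module and reused from an application script. The module name and the Sensor class are hypothetical examples.

    # sensors.py -- a small module grouping related functionality (hypothetical name).
    class Sensor:
        """Minimal class: state (name, readings) plus behaviour (add, average)."""

        def __init__(self, name):
            self.name = name
            self.readings = []

        def add_reading(self, value):
            self.readings.append(value)

        def average(self):
            return sum(self.readings) / len(self.readings) if self.readings else 0.0

    # main.py -- reusing the module from an application:
    # from sensors import Sensor
    if __name__ == "__main__":
        s = Sensor("temperature")
        for v in (21.0, 22.5, 23.1):
            s.add_reading(v)
        print(s.name, s.average())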

• Distinction of basic Cloud concepts


- Principles of cloud computing (Cloud Computing)
- Service engineering: Software as a Service, Platform as a Service, Infrastructure as a Service
- Examples of relevant applications in the industry

• Use of NoSQL DBs and new data models (structured and unstructured)
- Fundamentals of the NoSQL paradigm
- Data distribution and parallel processing
- Main data models in the NoSQL world: key-value, document-oriented, property graphs, knowledge graphs

• Knowledge of Big Data storage and massive processing tools


- Applications based on the management and analysis of large volumes of data

- Architectural foundations of distributed systems


- Main reference architectures
- New data models
- Distributed file systems
- Document stores
- Graph databases

• Evaluation of the methodologies and techniques applied in solving problems and justification of the approaches, decisions and proposals made
- Decision-making support systems
- Data analysis: descriptive, predictive and prescriptive analysis
- Use cases: management and analysis of large volumes of data

• Identification of the key factors of a complex problem in the context of an analytics project.

- Context of the data society/economy and the data-oriented applications paradigm
- Fundamentals of relational databases: SQL language.
- Need for a paradigm shift: NoSQL. The 'one size does not fit all' principle.
- Main data models in the NoSQL world: Key-Value, Document-oriented, Property Graphs and
Knowledge Graphs
- Architectural fundamentals: distributed systems, scalability, parallelism.
Main reference architectures (shared nothing, shared disk, shared memory)

• Distinction and application of new data models


- Distributed file systems: concepts and principles (distribution, replication, horizontal vs. vertical partitioning, specialized file formats)
- Knowledge and use of the Hadoop File System (HDFS), Apache Avro and Apache Parquet; key-value stores: Apache HBase
- Document stores: concepts and principles (replication mechanisms, sharding, spatial queries)
- Immersion in MongoDB and the Aggregation Framework (see the sketch after this list)
- Graph databases: property and knowledge graphs. Concepts and principles: graph modeling, regular queries. Introduction to Neo4j and Cypher
- Knowledge graphs. Concepts and principles: the open / linked data paradigm, RDF and
SPARQL. Introduction to GraphDB
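
A minimal sketch of a MongoDB aggregation pipeline run from Python with pymongo. The connection string, database, collection and field names are hypothetical and assume a local MongoDB instance.

    # Sketch of the Aggregation Framework with pymongo (hypothetical names).
    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")   # assumed local instance
    collection = client["bootcamp"]["posts"]            # hypothetical db/collection

    pipeline = [
        {"$match": {"lang": "es"}},                          # filter documents
        {"$group": {"_id": "$user", "posts": {"$sum": 1}}},  # count per user
        {"$sort": {"posts": -1}},                            # most active first
        {"$limit": 5},
    ]
    for doc in collection.aggregate(pipeline):
        print(doc["_id"], doc["posts"])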

• Identification and analysis of complex problems in the area of data analysis and the approach to their solution
- Main concepts of data processing flows in large-volume systems
- Main phases of managing large volumes of data and associated challenges
- Data engineer roles in the main phases of data management
- Main limitations of traditional data management models
- New data models

• Planning and execution of a data analysis project with a methodological proposal


- Definition of a set of starting data and a series of business needs that require data aggregation, external data capture, an ETL process, data analysis and a final visualization of the results obtained
- Implementation of a distributed file system
- Using Hadoop to store a set of social network activity data.
Storing a data set in an HDFS environment

- Graph modeling: storing a data set in a document-oriented or graph-oriented database.

• Choosing a suitable repository for the problem data and defining a storage strategy.
- Data life cycle: database design, data flow management, architecture of data extraction, loading and transformation systems, and distributed storage and processing systems
- Data management: limits of the relational model and data distribution

Management, personal and social skills

• Effectiveness in solving complex problems when developing the knowledge needed to design prototypes of software solutions in Python in phases, without losing sight of the complexity of the overall problem.

• Ability to analyze the important elements of the development project for a data management solution.
• Development of critical thinking and reasoning of the various techniques to be applied within the framework of the problem to
be solved, balancing the complexity of the solution and its real functioning.

• Identification of the tools to be applied, their cost, and the needs of the required data cycle.
• Development of a positive attitude towards learning and continuous improvement, with the objective of knowing and reviewing
the suppliers of the tools and the installation and updating methods.

• Demonstration of initiative and autonomy in the presentation of prototypes and discussion of problems and solutions to be
discussed in a group, reviewing requirements and their costs.

TRAINING MODULE 2: DATA MANAGEMENT AND PROCESSING

AIM

Identify data management principles for a project with multiple input sources and apply data
model organization techniques from a logical and physical point of view.

DURATION IN ANY MODALITY OF DELIVERY: 80 hours

Teletraining: Duration of face-to-face tutorials: 0 hours

LEARNING OUTCOMES

Knowledge/cognitive and practical abilities


• Critical evaluation of the methodologies and techniques to be applied in solving problems and
justification of the approaches, decisions and proposals made
- Data management fundamentals for a project with multiple data input sources
- Techniques for organizing data models from a logical and physical point of view

• Identification of data flows and ETL (Extract, Transform, Load) (see the sketch after this list)


- Fundamentals of Data Warehousing and Business Intelligence
- OLAP concepts and information extraction
- ETL process: extraction, transformation and loading of data

- Types of flows and operations


- Data cleaning
- Data quality
- Application examples
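
A minimal ETL sketch with pandas covering the three phases above plus a basic data-cleaning step. The source file, column names and the SQLite target table are invented for illustration; SQLite is used only as a stand-in warehouse.

    # Minimal ETL sketch: extract from CSV, transform/clean, load into SQLite.
    import sqlite3
    import pandas as pd

    # Extract
    df = pd.read_csv("sales_raw.csv")            # hypothetical source file

    # Transform: basic data cleaning and quality checks
    df = df.drop_duplicates()
    df = df.dropna(subset=["customer_id", "amount"])
    df["amount"] = df["amount"].astype(float)
    df["sale_date"] = pd.to_datetime(df["sale_date"], errors="coerce")

    # Load into a relational target
    with sqlite3.connect("warehouse.db") as conn:
        df.to_sql("sales_clean", conn, if_exists="replace", index=False)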

• Design of an ETL process and a multidimensional analysis model (see the sketch after this list).


- Multidimensional modeling
- DFM: Dimensional Fact Model
- Star schema and derivatives
- OLAP operators
- Implementation of cubes and OLAP operators in relational environments
- Multidimensional modeling tools
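
A small sketch of an OLAP-style roll-up over a toy fact table with pandas, as a lightweight stand-in for cube operators in a relational environment. The dimensions (region, year) and the sales measure are invented.

    # Sketch of a fact table and an OLAP-style roll-up with pandas pivot_table.
    import pandas as pd

    facts = pd.DataFrame({
        "region": ["North", "North", "South", "South"],
        "year":   [2023, 2024, 2023, 2024],
        "sales":  [120.0, 150.0, 90.0, 110.0],
    })

    # Aggregate the measure over the dimensions (similar to a ROLLUP with totals)
    cube = facts.pivot_table(values="sales", index="region", columns="year",
                             aggfunc="sum", margins=True, margins_name="Total")
    print(cube)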

• Design of a data load into a NoSQL repository and basic data analysis using Spark (see the streaming sketch after this list)
- Design, implementation and maintenance of Data Lake solutions. Concepts and principles (schema-on-write vs. schema-on-read). Data modeling and governance
- Concepts and principles of distributed data processing (declarative vs. non-declarative
solutions)
- Distributed data processing models: Disk-based and main memory-based

- MapReduce and Apache Spark


- Real-time data processing (streaming). Concepts and principles (models, time windows, time
queries). Stream query languages.
Introduction to streaming tools: Apache Kafka, Apache Spark Streaming
- Big Data architectures: Lambda, Kappa and orchestrators. Workflow management tools: Apache Airflow
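
A minimal sketch of real-time processing with Spark Structured Streaming reading from Kafka and counting events per time window. It assumes PySpark with the Kafka connector available and a local broker; the topic name and checkpoint path are hypothetical.

    # Sketch: Spark Structured Streaming over a Kafka topic (hypothetical names).
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, window

    spark = SparkSession.builder.appName("streaming-sketch").getOrCreate()

    events = (spark.readStream
              .format("kafka")
              .option("kafka.bootstrap.servers", "localhost:9092")
              .option("subscribe", "activity")          # hypothetical topic
              .load()
              .selectExpr("CAST(value AS STRING) AS value", "timestamp"))

    # Count events per 1-minute time window
    counts = events.groupBy(window(col("timestamp"), "1 minute")).count()

    query = (counts.writeStream
             .outputMode("complete")
             .format("console")
             .option("checkpointLocation", "/tmp/checkpoint-sketch")
             .start())
    query.awaitTermination()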

• Identification of the key factors of a complex problem in the context of an analytics project (see the Spark sketch after this list).

- ETL design and implementation project with NoSQL tools


- Batch data incorporation process with Apache tools.
- Data analysis and extraction of data for the business model from the data set with Spark
- Data analysis with Apache Spark
- Reading and exporting data
- Data quality review
- Filters and data transformations
- Data processing to obtain summaries and groupings
- Combinations, partitions and reformulation of data.
- Configuration, monitoring and error management of Spark applications
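
A minimal batch-analysis sketch with the Spark DataFrame API: reading data, a simple quality filter, a grouping and an export. The file path and column names are invented for illustration.

    # Sketch of a batch analysis with PySpark (hypothetical file and columns).
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("batch-analysis-sketch").getOrCreate()

    df = spark.read.option("header", True).csv("activity.csv")   # hypothetical file

    clean = (df.dropna(subset=["user", "duration"])              # basic data quality
               .withColumn("duration", F.col("duration").cast("double")))

    summary = (clean.groupBy("user")
                    .agg(F.count("*").alias("events"),
                         F.avg("duration").alias("avg_duration")))

    summary.orderBy(F.desc("events")).show(10)
    summary.write.mode("overwrite").parquet("summary.parquet")   # export results
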
Management, personal and social skills

• Demonstration of a critical, strategic-thinking attitude, presenting data processing schemes and allowing discussion with stakeholders inside and outside the company in order to formulate future-oriented actions.
• Development of design and data analysis activities with social responsibility, intellectual honesty and
scientific integrity.
• Awareness of the need for a responsible attitude committed to results and the limitation of available
resources when making decisions in complex professional environments.

• Assessment of the importance of adaptation to cost, availability, development or implementation time constraints in the review of an initial data management design.

TRAINING MODULE 3: MACHINE LEARNING AND VISUALIZATION

AIM

Apply the fundamentals of machine learning and visualization to analyze the results of data
processing.

DURATION IN ANY MODALITY OF DELIVERY: 130 hours

Teletraining: Duration of face-to-face tutorials: 0 hours

LEARNING OUTCOMES

Knowledge/cognitive and practical abilities


• Identification of the fundamentals of data analysis and machine learning (Machine
Learning)
- Typology of tasks and learning algorithms (supervised, unsupervised, semi-
supervised)
- Main learning methods
- Validation and evaluation of results

• Distinction of classification methods (see the sketch after this list).


- Predictive models
- Unsupervised methods. Hierarchical clustering. Partitional clustering (k-means and derivatives). Dimensionality reduction (PCA and others)
- Supervised methods. K-NN. Decision trees. SVM. Neural networks
- Validation and evaluation of results
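
A compact sketch showing one unsupervised method (k-means) and one supervised method (a decision tree) with a simple hold-out validation, using scikit-learn and its built-in iris dataset.

    # Sketch: unsupervised (k-means) and supervised (decision tree) methods
    # with a train/test validation, on the built-in iris dataset.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.cluster import KMeans
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score

    X, y = load_iris(return_X_y=True)

    # Unsupervised: partitional clustering with k-means
    kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
    print("Cluster sizes:", [int((kmeans.labels_ == k).sum()) for k in range(3)])

    # Supervised: decision tree with hold-out validation
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=0)
    tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
    print("Test accuracy:", accuracy_score(y_test, tree.predict(X_test)))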

• Application of machine learning techniques and the integration of various data sources
- Sentiment and polarity analysis on the collected set of tweets.
- Construction of a profile analysis using unsupervised clustering algorithms.
- Implementation of a polarity analysis (sentiment analysis) on the set of
collected messages.
- Implementation of two alternative approaches to compare the performance obtained: a dictionary-based approach, and a vectorization approach (Word2Vec) with a supervised machine learning model (see the sketch after this list).
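
A toy sketch of the vectorization approach: Word2Vec embeddings averaged per message and fed to a supervised classifier. It assumes gensim 4.x and scikit-learn are available; the messages and labels are invented and far too small for a real evaluation.

    # Sketch: Word2Vec vectorization + supervised classifier (toy data).
    import numpy as np
    from gensim.models import Word2Vec
    from sklearn.linear_model import LogisticRegression

    messages = [
        "i love this course",
        "great content and clear examples",
        "i do not like the platform at all",
        "very bad and slow experience",
    ]
    labels = [1, 1, 0, 0]                       # 1 = positive, 0 = negative
    tokens = [m.split() for m in messages]

    w2v = Word2Vec(sentences=tokens, vector_size=25, window=3, min_count=1, seed=0)

    def message_vector(words):
        # average of the word vectors in the message
        return np.mean([w2v.wv[w] for w in words], axis=0)

    X = np.vstack([message_vector(t) for t in tokens])
    clf = LogisticRegression().fit(X, labels)
    print(clf.predict(X))                       # toy in-sample check only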

• Design, development and evaluation of machine learning methods.


- Data processing

- Machine Learning Fundamentals


- Typology of tasks and learning algorithms
- Validation and evaluation of results

• Design and development of dashboards (see the sketch after this list).


- Data visualization principles.
- Design of control panels and dashboards to define alarms and transmit results
- Integration of visualization with analysis tools and data queries
- Visual and written documentation of the results of data analytics projects
for non-specialized audiences
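
A minimal sketch of a two-panel results figure with matplotlib, used here as a lightweight stand-in for a dashboard panel with a simple alert threshold. The data values are invented; a production dashboard would typically live in one of the tools listed below.

    # Sketch: simple two-panel results figure with an alert threshold (toy data).
    import matplotlib.pyplot as plt

    days = ["Mon", "Tue", "Wed", "Thu", "Fri"]
    events = [120, 135, 150, 90, 160]
    errors = [3, 2, 8, 1, 2]

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 3))

    ax1.bar(days, events, color="steelblue")
    ax1.set_title("Events per day")

    ax2.plot(days, errors, marker="o", color="firebrick")
    ax2.axhline(5, linestyle="--", color="gray")      # simple alert threshold
    ax2.set_title("Errors per day (alert at 5)")

    fig.tight_layout()
    fig.savefig("dashboard_panel.png")                # exportable for a report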

• Using a data visualization tool to design and upload data to a dashboard

- Data visualization tools: Grafana, MS Power BI, Tableau


- Visualization of business queries and results dashboard in data visualization tools

• Choice, application and quality evaluation of a machine learning algorithm for a given problem and data set (see the sketch after this list).
- Text processing (NLP)
- Dictionary-based polarity analysis
- Analysis based on supervised predictive models
- Feature extraction (Word2Vec)
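
A minimal sketch of the dictionary-based polarity approach: each word contributes a score from a small lexicon and the sign of the total decides the polarity. The lexicon and messages are toy examples.

    # Sketch: dictionary-based polarity analysis (toy lexicon and messages).
    LEXICON = {"good": 1, "great": 2, "love": 2,
               "bad": -1, "terrible": -2, "slow": -1}

    def polarity(message):
        words = message.lower().split()
        score = sum(LEXICON.get(w, 0) for w in words)
        if score > 0:
            return "positive"
        if score < 0:
            return "negative"
        return "neutral"

    for msg in ["great course, love it", "terrible and slow platform", "it runs"]:
        print(msg, "->", polarity(msg))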

Management, personal and social skills

• Use of communication skills with stakeholders to show the most relevant aspects of the results obtained in the process and their adaptation to the needs of the project.

• Application of innovative solutions and adaptation to changing environments.
• Capacity for continuous development of projects and communication of results and decisions with visualization techniques and tools
• Coordination and communication with specialists, non-specialists, supervisors and clients
with the use of communication tools for the design of relevant information on the key aspects of the application.

EVALUATION OF LEARNING IN THE TRAINING ACTION

• The evaluation will have a theoretical-practical nature and will be carried out systematically and continuously,
during the development of each module and at the end of the course.

• It may include an initial diagnostic evaluation to detect the starting level of the
student body.

• The evaluation will be carried out using the most appropriate methods and instruments to verify the different learning outcomes and to guarantee their reliability and validity.

• Each evaluation instrument will be accompanied by its corresponding correction and scoring system in which the
measurement criteria to evaluate the results achieved by the students are explained, clearly and unequivocally.

• The final score achieved will be expressed in terms of Pass/No Pass.

