Session 1: Introduction to Big Data
BIG DATA
OUTLINE
• Fundamentals of Big Data
• Big Data Types
• Big Data Technology Components
• Virtualization and How It Supports Distributed Computing
• The Cloud and Big Data
Fundamentals of Big Data
BIG Data Word Cloud
Waves of Data Management
Wave 1: Creating manageable data structures
Wave 2: Web and content management
Wave 3: Managing big data
Defining Big Data
Big Data Management Architecture
Big Data Performance Matters
MapReduce was designed by Google as a way of efficiently executing a set of functions against a large amount of data in batch mode.
Big Table is a sparse, distributed, persistent multidimensional sorted map. It is intended to store huge volumes of data across commodity servers.
[Figure: MapReduce, Hadoop, and Big Table; big data applications]
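The batch-mode idea behind MapReduce can be illustrated with a minimal single-process sketch. This is not Google's or Hadoop's implementation; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are assumptions chosen for illustration, and a real framework would run the map and reduce steps in parallel across many servers.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as a framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a batch of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample batch
counts = word_count(["big data big table", "map reduce big data"])
```

Because each map call looks at only one record and each reduce call at only one key, both phases could be spread over many machines without changing the result.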
Big Data Types
Data Sources of Big Data
Structured Big Data
Computer/Machine Generated:
• Sensor
• Web Log
• Point of Sale
• Financial
Human Generated:
• Input Data
• Click Stream
• Gaming Related
Unstructured Big Data
Computer/Machine Generated, Human Generated
Data access: The data should be available only to those who have a legitimate business need to examine or interact with it. Most core data storage platforms have rigorous security schemes, often augmented with a federated identity capability that provides appropriate access across the many layers of the architecture.
Application access: Application access to data is also relatively straightforward from a technical perspective. Most application programming interfaces (APIs) offer protection from unauthorized usage or access. This level of protection is probably adequate for most big data implementations.
Data encryption: Data encryption is the most challenging aspect of security in a big data environment. In traditional environments, encrypting and decrypting data heavily stresses the systems' resources. The volume, velocity, and variety associated with big data exacerbate this problem. The simplest (brute-force) approach is to provide more and faster computational capability. However, this comes with a steep price tag, especially when you have to accommodate resiliency requirements. A more temperate approach is to identify the data elements requiring this level of security and to encrypt only the necessary items.
Threat detection: The inclusion of mobile devices and social networks greatly increases both the amount of data and the opportunities for security threats. It is therefore important that organizations take a multiperimeter approach to security.
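The "encrypt only the necessary items" approach can be sketched as follows. Everything here is an assumption made for illustration: the field names, the `protect_record` helper, and above all `placeholder_encrypt`, which uses base64 purely so the sketch runs with the standard library. Base64 is NOT encryption; a real deployment would pass in a vetted cipher instead.

```python
import base64


def placeholder_encrypt(value: str) -> str:
    # NOT real encryption: base64 is only a stand-in so this sketch runs
    # without third-party libraries. Swap in a vetted cipher in practice.
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def protect_record(record, sensitive_fields, encrypt=placeholder_encrypt):
    """Return a copy of record with only the sensitive fields transformed."""
    return {
        field: encrypt(str(value)) if field in sensitive_fields else value
        for field, value in record.items()
    }


SENSITIVE_FIELDS = {"card_number"}  # hypothetical field name
sale = {"store_id": 17, "amount": 42.5, "card_number": "4111111111111111"}
protected = protect_record(sale, SENSITIVE_FIELDS)
```

Only `card_number` passes through the (expensive) transformation; the other fields are copied untouched, which is where the saving in computational cost would come from.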
Layer 2: Operational Databases
✓ Atomicity: A transaction is “all or nothing” when it is atomic. If any part of the transaction or
the underlying system fails, the entire transaction fails.
✓ Consistency: Only transactions with valid data will be performed on the database. If the data
is corrupt or improper, the transaction will not complete and the data will not be written to the
database.
✓ Isolation: Multiple, simultaneous transactions will not interfere with each other. All valid
transactions will execute until completed and in the order they were submitted for processing.
✓ Durability: After the data from the transaction is written to the database, it stays there
“forever.”
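Atomicity and consistency can be observed with a small sketch using Python's built-in `sqlite3` module. The `accounts` table, the `transfer` helper, and the sample balances are assumptions for illustration; the `CHECK` constraint stands in for "valid data", and the `with conn:` block commits on success and rolls back on an exception.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()


def transfer(conn, source, target, amount):
    """Move amount between accounts; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back on any exception
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 30)       # valid transfer
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

In the second call the credit to `bob` succeeds first, but the debit violates the constraint, so the whole transaction is rolled back and `bob` keeps only the balance from the first transfer.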
Layer 3: Organizing Data Services and Tools
A distributed file system: Necessary to accommodate the decomposition of data streams and to provide scale and storage capacity
Serialization services: Necessary for persistent data storage and multilanguage remote procedure calls (RPCs)
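The two services above can be sketched together in a few lines. This is a toy model under stated assumptions, not how any particular distributed file system works: the node names, the 16-byte block size, the round-robin placement, and all function names are hypothetical. JSON is used for serialization only because it is a language-neutral format available in the standard library.

```python
import json


def serialize(records):
    """Encode records as newline-delimited JSON, a language-neutral format."""
    lines = (json.dumps(record, sort_keys=True) for record in records)
    return "\n".join(lines).encode("utf-8")


def deserialize(payload):
    """Decode newline-delimited JSON bytes back into a list of records."""
    return [json.loads(line) for line in payload.decode("utf-8").splitlines()]


def split_into_blocks(payload, block_size):
    """Decompose a byte stream into fixed-size blocks."""
    return [payload[i:i + block_size] for i in range(0, len(payload), block_size)]


def place_blocks(blocks, nodes):
    """Assign blocks to nodes round-robin: {node: [(index, block), ...]}."""
    placement = {node: [] for node in nodes}
    for index, block in enumerate(blocks):
        placement[nodes[index % len(nodes)]].append((index, block))
    return placement


def reassemble(placement):
    """Collect blocks from every node and join them in their original order."""
    indexed = [item for stored in placement.values() for item in stored]
    return b"".join(block for _, block in sorted(indexed))


records = [{"sensor": "s1", "reading": 21.5}, {"sensor": "s2", "reading": 19.0}]
payload = serialize(records)
blocks = split_into_blocks(payload, block_size=16)
placement = place_blocks(blocks, ["node-a", "node-b", "node-c"])
restored = deserialize(reassemble(placement))
```

Storing the block index alongside each block is what lets the stream be rebuilt regardless of which node held which piece.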
Virtualization Basics
Benefits of Virtualization
✓ Virtualization of physical resources (such as servers, storage, and networks) enables substantial improvement in the utilization of these resources.
✓ Virtualization enables improved control over the usage and performance of your IT resources.
Isolation: Each virtual machine is isolated from its host physical system and other virtualized machines. Because of this isolation, if one virtual instance crashes, the other virtual machines and the host system aren’t affected. In addition, data isn’t shared between one virtual instance and another.
Encapsulation: A virtual machine can be represented (and even stored) as a single file, so you can identify it easily based on the services it provides. For example, the file containing the encapsulated process could be a complete business service. This encapsulated virtual machine could be presented to an application as a complete entity. Thus, encapsulation could protect each application so that it doesn’t interfere with another application.
SERVER VIRTUALIZATION
APPLICATION VIRTUALIZATION
SYSTEM VIRTUALIZATION
System Virtualization: The administrator builds the disk image from existing system disks.
• The cloud computing paradigm is still evolving and remains a subject of debate among academics, IT vendors, and government and business.
• According to NIST, a system must meet 5 criteria to be counted as part of the cloud family.
5 Characteristics of Cloud Computing: On-Demand Self-Service, Broad Network Access, Resource Pooling, Rapid Elasticity, Measured Service
Cloud Computing Architecture
The physical hardware layer is virtualized to provide a flexible platform and to improve resource utilization. The key to the new enterprise data center is how to combine the virtualization layer and the management layer so that the data center can be managed efficiently and services can be deployed and configured quickly.
What Does Cloud Computing Mean for Service Providers?
5 Key Characteristics of Cloud Computing
• Cloud service providers deliver services through resources pooled in one or more data center locations, each consisting of a number of servers that operate under a multi-tenant mechanism.
• This multi-tenant mechanism allows those computing resources, whether physical or virtual, to be shared by a number of users.
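Multi-tenant resource pooling can be pictured with a minimal sketch. The `ResourcePool` class, its capacity units, and the tenant names are all assumptions for illustration; a real provider's scheduler is far more involved.

```python
class ResourcePool:
    """A shared pool of capacity units (physical or virtual) for many tenants."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.allocations = {}  # tenant -> units currently held

    @property
    def available(self):
        """Units not currently held by any tenant."""
        return self.capacity - sum(self.allocations.values())

    def allocate(self, tenant, units):
        """Give a tenant units from the shared pool if enough remain."""
        if units <= 0 or units > self.available:
            return False
        self.allocations[tenant] = self.allocations.get(tenant, 0) + units
        return True

    def release(self, tenant, units):
        """Return units to the pool so other tenants can reuse them."""
        held = self.allocations.get(tenant, 0)
        returned = min(units, held)
        if returned == held:
            self.allocations.pop(tenant, None)
        else:
            self.allocations[tenant] = held - returned
        return returned


pool = ResourcePool(capacity=10)
pool.allocate("tenant-a", 6)
pool.allocate("tenant-b", 4)
rejected = not pool.allocate("tenant-c", 2)  # pool is exhausted
pool.release("tenant-a", 3)
accepted = pool.allocate("tenant-c", 2)      # freed capacity is reused
```

The same physical capacity serves three tenants over time; no tenant owns a fixed slice of it.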
The 4 Implementations of the Cloud
• Private Cloud
• Private Clouds are normal data centers within an enterprise that have all 4 attributes of the Cloud: Elasticity, Self Service, Pay-By-Use, and Programmability.
• By setting up a Private Cloud, enterprises can consolidate their IT infrastructure.
• They will need fewer IT staff to manage the data center.
• Power bills are reduced because of lower electricity consumption and reduced cooling equipment needs.
4 Deployment Models of Cloud Computing Infrastructure
Community Cloud
• In this model, a cloud infrastructure is shared by several organizations that have common interests, for example in their mission or in the level of security they require.
• A community cloud is therefore a "limited extension" of a private cloud. As with a private cloud, the cloud infrastructure can be managed by one of the participating organizations or by a third party.
The 4 Implementations of the Cloud
• Community Cloud
• A Community Cloud is implemented when a set of businesses have similar requirements and share the same context.
• For example, the federal government in the US may decide to set up a government-specific Community Cloud that can be leveraged by all the states.
• Through this, individual local bodies such as state governments are freed from investing in, maintaining, and managing their own local data centers.
• So, a Community Cloud is a sort of Private Cloud that extends beyond a single organization.
The 4 Implementations of the Cloud
• Public Cloud
• A Public Cloud needs a huge investment, and only well-established companies with deep pockets, such as Microsoft, Amazon, and Google, can afford to set one up.
• A Public Cloud is implemented on thousands of servers running across hundreds of data centers deployed in tens of locations around the world.
• Customers can choose the location where their application is deployed.
4 Deployment Models of Cloud Computing Infrastructure
Hybrid Cloud
• A composition of two or more cloud infrastructures (private, community, or public).
• Although the clouds remain separate entities, they are connected by a technology or mechanism that enables data and application portability between them. One example is load balancing across clouds, which keeps resource allocation at an optimal level.
• According to NIST, the definition and boundaries of cloud computing itself are still taking shape and settling on standards. The market will therefore eventually determine which models endure.
• Nevertheless, there is broad agreement that cloud computing will be the future of computing. The prestigious research firm Gartner Group has also stated that cloud computing is a topic that no stakeholder in the IT world should ignore.
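The load balancing across clouds mentioned above could look, in its simplest form, like the sketch below. It is one possible policy under stated assumptions (fill the private cloud first, send the overflow to the public cloud), not a mechanism described in the source; the workload names, unit sizes, and the `place_workloads` function are hypothetical.

```python
def place_workloads(workloads, private_capacity):
    """Fill the private cloud first and send the overflow to the public cloud.

    workloads: list of (name, units) pairs.
    private_capacity: capacity units available in the private cloud.
    """
    placement = {"private": [], "public": []}
    used = 0
    for name, units in workloads:
        if used + units <= private_capacity:
            placement["private"].append(name)
            used += units
        else:
            placement["public"].append(name)
    return placement


# Hypothetical workloads and capacity
jobs = [("billing", 4), ("analytics", 5), ("reporting", 3), ("archive", 1)]
placement = place_workloads(jobs, private_capacity=10)
```

Here `reporting` does not fit in the remaining private capacity and is placed in the public cloud, while the smaller `archive` job still fits privately. Such a policy only works if data and applications are portable between the two clouds.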
The 4 Implementations of the Cloud
• Hybrid Cloud
• A Hybrid Cloud is a combination of a Private Cloud and a Public Cloud.
• Security plays a critical role in connecting the Private Cloud to the Public Cloud.
• Amazon Web Services has recently announced Virtual Private Cloud (VPC), which securely bridges a Private Cloud and Amazon Web Services.
• Microsoft’s recent Windows AppFabric brings the concept of Hybrid Cloud to Microsoft’s future customers.
Benefits of Cloud Computing