Skip to content

1tangerine1day/Data-mining

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Data-mining HW

Association Rules (FP-Growth)

Tid Items
100 A, C, D
200 B, C, E
300 A, B, C, E
400 B, E

Strong rule
{B, E}->C (2/3)
C->A (2/3)
A->C (2/2)

Sequential Pattern (PrefixSpan)

SID Sequences
100 <(1,5) (2) (3) (4)>
200 <(1) (3) (4) (3,5)>
300 <(1) (2) (3) (4)>
400 <(1) (3) (5)>
500 <(4) (5)>

Some frequent sequential patterns
{1,2,3,4}, {1,3,5}, {4,5}

Classification (decision tree with gini index)

@attribute marital_status {S,M}
@attribute num_children_at_home numeric
@attribute member_card {Basic,Normal,Silver,Gold}
@attribute age numeric
@attribute year_income numeric

Example database:
  {0 M,1 1,2 Silver,3 82,4 40000}
  {0 M,1 2,2 Silver,3 53,4 40000}
  
{0 M,1 1,3 59,4 60000} member_card = basic
{1 3,2 Silver,3 43,4 160000} member_card = basic
{0 M,1 2,3 52,4 120000} member_card = silver 

Clustering(Dbscan)

Input: given a node with two dimensional attributes X and Y 
Output: group these nodes into several clusters. The format is X Y ClusterID 
Example 
Input 2 3 (X=2, Y=3)
Output 2 3 1(X=2 Y=3 ClusterID=1)

There are five test data in the attachments
Each row is a data point with x and y value
Hint: 
Test1-3 should be clustered to 4 clusters

Releases

No releases published

Packages

No packages published
pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy