Data Mining 資 料 探 勘. 分 群 分 析 (Cluster Analysis)
|
|
|
- Cameron Sims
- 10 years ago
- Views:
Transcription
1 Data Mining 資 料 探 勘 Tamkang University 分 群 分 析 (Cluster Analysis) DM MI Wed,, (:- :) (B) Min-Yuh Day 戴 敏 育 Assistant Professor 專 任 助 理 教 授 Dept. of Information Management, Tamkang University 淡 江 大 學 資 訊 管 理 學 系 tku.edu.tw/myday/ - -
2 課 程 大 綱 (Syllabus) 週 次 (Week) 日 期 (Date) 內 容 (Subject/Topics) // 資 料 探 勘 導 論 (IntroducGon to Data Mining) // 關 連 分 析 (AssociaGon Analysis) // 分 類 與 預 測 (ClassificaGon and PredicGon) // 分 群 分 析 (Cluster Analysis) // 個 案 分 析 與 實 作 一 (SAS EM 分 群 分 析 ): Case Study (Cluster Analysis K- Means using SAS EM) // 個 案 分 析 與 實 作 二 (SAS EM 關 連 分 析 ): Case Study (AssociaGon Analysis using SAS EM) // 教 學 行 政 觀 摩 日 (Off- campus study) // 個 案 分 析 與 實 作 三 (SAS EM 決 策 樹 模 型 評 估 ): Case Study (Decision Tree, Model EvaluaGon using SAS EM)
3 課 程 大 綱 (Syllabus) 週 次 (Week) 日 期 (Date) 內 容 (Subject/Topics) // 期 中 報 告 (Midterm Project PresentaGon) // 期 中 考 試 週 (Midterm Exam) // 個 案 分 析 與 實 作 四 (SAS EM 迴 歸 分 析 類 神 經 網 路 ): Case Study (Regression Analysis, ArGficial Neural Network using SAS EM) // 文 字 探 勘 與 網 頁 探 勘 (Text and Web Mining) // 海 量 資 料 分 析 (Big Data AnalyGcs) // 期 末 報 告 (Final Project PresentaGon) // 畢 業 考 試 週 (Final Exam)
4 Outline Cluster Analysis K- Means Clustering Source: Han & Kamber ()
5 Cluster Analysis Used for automagc idengficagon of natural groupings of things Part of the machine- learning family Employ unsupervised learning Learns the clusters of things from past data, then assigns new instances There is not an output variable Also known as segmentagon Source: Turban et al. (), Decision Support and Business Intelligence Systems
6 Cluster Analysis Clustering of a set of objects based on the k-means method. (The mean of each cluster is marked by a +.) Source: Han & Kamber ()
7 Cluster Analysis Clustering results may be used to IdenGfy natural groupings of customers IdenGfy rules for assigning new cases to classes for targegng/diagnosgc purposes Provide characterizagon, definigon, labeling of populagons Decrease the size and complexity of problems for other data mining methods IdenGfy outliers in a specific domain (e.g., rare- event detecgon) Source: Turban et al. (), Decision Support and Business Intelligence Systems
8 Example of Cluster Analysis Point P P(x,y) p a (, ) p b (, ) p c (, ) p d (, ) p e (, ) p f (, ) p g (, ) p h (, ) p i (, ) p j (, )
9 Cluster Analysis for Data Mining Analysis methods StaGsGcal methods (including both hierarchical and nonhierarchical), such as k- means, k- modes, and so on Neural networks (adapgve resonance theory [ART], self- organizing map [SOM]) Fuzzy logic (e.g., fuzzy c- means algorithm) GeneGc algorithms Divisive versus AgglomeraGve methods Source: Turban et al. (), Decision Support and Business Intelligence Systems
10 Cluster Analysis for Data Mining How many clusters? There is not a truly opgmal way to calculate it HeurisGcs are ojen used. Look at the sparseness of clusters. Number of clusters = (n/) / (n: no of data points). Use Akaike informagon criterion (AIC). Use Bayesian informagon criterion (BIC) Most cluster analysis methods involve the use of a distance measure to calculate the closeness between pairs of items Euclidian versus Manhalan (recglinear) distance Source: Turban et al. (), Decision Support and Business Intelligence Systems
11 k- Means Clustering Algorithm k : pre- determined number of clusters Algorithm (Step : determine value of k) Step : Randomly generate k random points as inigal cluster centers Step : Assign each point to the nearest cluster center Step : Re- compute the new cluster centers RepeGGon step: Repeat steps and ungl some convergence criterion is met (usually that the assignment of points to clusters becomes stable) Source: Turban et al. (), Decision Support and Business Intelligence Systems
12 Cluster Analysis for Data Mining - k- Means Clustering Algorithm Step Step Step Source: Turban et al. (), Decision Support and Business Intelligence Systems
13 Similarity and Dissimilarity Between Objects Distances are normally used to measure the similarity or dissimilarity between two data objects Some popular ones include: Minkowski distance: where i = (x i, x i,, x ip ) and j = (x j, x j,, x jp ) are two p- dimensional data objects, and q is a posigve integer If q =, d is Manhalan distance q q p p q q j x i x j x i x j x i x j i d )... ( ), ( =... ), ( p p j x i x j x i x j x i x j i d = Source: Han & Kamber ()
14 Similarity and Dissimilarity Between Objects (Cont.) If q =, d is Euclidean distance: d( i, j) = ProperGes d(i,j) d(i,i) = ( x i d(i,j) = d(j,i) x j + x i d(i,j) d(i,k) + d(k,j) x j x i p p Also, one can use weighted distance, parametric Pearson product moment correlagon, or other disimilarity measures x j ) Source: Han & Kamber ()
15 Euclidean distance vs ManhaMan distance Distance of two point x = (, ) and x (, ) x (, ). x = (, ) Euclidean distance: = ((-) + (-) ) / = ( + ) / = ( + ) / = () / =. Manhattan distance: = (-) + (-) = + =
16 Example The K- Means Clustering Method Assign each objects to most similar center reassign Update the cluster means reassign K= Arbitrarily choose K object as initial cluster center Update the cluster means Source: Han & Kamber ()
17 K- Means Clustering Step by Step Point P P(x,y) p a (, ) p b (, ) p c (, ) p d (, ) p e (, ) p f (, ) p g (, ) p h (, ) p i (, ) p j (, )
18 m = (, ) K- Means Clustering Step : K=, Arbitrarily choose K object as initial cluster center M = (, ) Point P P(x,y) p a (, ) p b (, ) p c (, ) p d (, ) p e (, ) p f (, ) p g (, ) p h (, ) p i (, ) p j (, ) Initial m (, ) Initial m (, )
19 Step : Compute seed points as the centroids of the clusters of the current partition Step : Assign each objects to most similar center m = (, ) M = (, ) Point P P(x,y) m distance m distance Cluster p a (, ).. Cluster p b (, ).. Cluster p c (, ).. Cluster p d (, ).. Cluster p e (, ).. Cluster p f (, ).. Cluster p g (, ).. Cluster p h (, ).. Cluster p i (, ).. Cluster p j (, ).. Cluster K- Means Clustering Initial m (, ) Initial m (, )
20 Step : Compute seed points as the centroids of the clusters of the current partition Step : Assign each objects to most similar center m = (, ) Euclidean distance b(,) ß à m(,) = ((-) + (-) ) / = ( + (-) ) / = ( + ) / M = (, ) K- Means = () / Clustering =. Point P P(x,y) m distance m distance Cluster p a (, ).. Cluster p b (, ).. Cluster p c (, ).. Cluster p d (, ).. Cluster Euclidean distance b(,) ß à m(,) = ((-) + (-) ) / = ( + (-) ) / = ( + ) / = () / =. p e (, ).. Cluster p f (, ).. Cluster p g (, ).. Cluster p h (, ).. Cluster p i (, ).. Cluster p j (, ).. Cluster Initial m (, ) Initial m (, )
21 Step : Update the cluster means, Repeat Step,, stop when no more new assignment m = (.,.) M = (.,.) K- Means Clustering Point P P(x,y) m distance m distance Cluster p a (, ).. Cluster p b (, ).. Cluster p c (, ).. Cluster p d (, ).. Cluster p e (, ).. Cluster p f (, ).. Cluster p g (, ).. Cluster p h (, ).. Cluster p i (, ).. Cluster p j (, ).. Cluster m (.,.) m (.,.)
22 Step : Update the cluster means, Repeat Step,, stop when no more new assignment m = (.,.) M = (..,.) K- Means Clustering Point P P(x,y) m distance m distance Cluster p a (, ).. Cluster p b (, ).. Cluster p c (, ).. Cluster p d (, ).. Cluster p e (, ).. Cluster p f (, ).. Cluster p g (, ).. Cluster p h (, ).. Cluster p i (, ).. Cluster p j (, ).. Cluster m (.,.) m (.,.)
23 stop when no more new assignment K- Means Clustering Point P P(x,y) m distance m distance Cluster p a (, ).. Cluster p b (, ).. Cluster p c (, ).. Cluster p d (, ).. Cluster p e (, ).. Cluster p f (, ).. Cluster p g (, ).. Cluster p h (, ).. Cluster p i (, ).. Cluster p j (, ).. Cluster m (.,.) m (.,.)
24 K- Means Clustering (K=, two clusters) stop when no more new assignment K- Means Clustering Point P P(x,y) m distance m distance Cluster p a (, ).. Cluster p b (, ).. Cluster p c (, ).. Cluster p d (, ).. Cluster p e (, ).. Cluster p f (, ).. Cluster p g (, ).. Cluster p h (, ).. Cluster p i (, ).. Cluster p j (, ).. Cluster m (.,.) m (.,.)
25 Summary Cluster Analysis K- Means Clustering Source: Han & Kamber ()
26 References Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Second EdiGon,, Elsevier Efraim Turban, Ramesh Sharda, Dursun Delen, Decision Support and Business Intelligence Systems, Ninth EdiGon,, Pearson.
Data Mining 5. Cluster Analysis
Data Mining 5. Cluster Analysis 5.2 Fall 2009 Instructor: Dr. Masoud Yaghini Outline Data Structures Interval-Valued (Numeric) Variables Binary Variables Categorical Variables Ordinal Variables Variables
Data Mining for Knowledge Management. Clustering
Data Mining for Knowledge Management Clustering Themis Palpanas University of Trento http://disi.unitn.eu/~themis Data Mining for Knowledge Management Thanks for slides to: Jiawei Han Eamonn Keogh Jeff
Chapter 7. Cluster Analysis
Chapter 7. Cluster Analysis. What is Cluster Analysis?. A Categorization of Major Clustering Methods. Partitioning Methods. Hierarchical Methods 5. Density-Based Methods 6. Grid-Based Methods 7. Model-Based
資 料 倉 儲 (Data Warehousing)
商 業 智 慧 實 務 Prac&ces of Business Intelligence Tamkang University 資 料 倉 儲 (Data Warehousing) 1022BI04 MI4 Wed, 9,10 (16:10-18:00) (B113) Min-Yuh Day 戴 敏 育 Assistant Professor 專 任 助 理 教 授 Dept. of Information
Machine Learning using MapReduce
Machine Learning using MapReduce What is Machine Learning Machine learning is a subfield of artificial intelligence concerned with techniques that allow computers to improve their outputs based on previous
STOCK MARKET TRENDS USING CLUSTER ANALYSIS AND ARIMA MODEL
Stock Asian-African Market Trends Journal using of Economics Cluster Analysis and Econometrics, and ARIMA Model Vol. 13, No. 2, 2013: 303-308 303 STOCK MARKET TRENDS USING CLUSTER ANALYSIS AND ARIMA MODEL
Using Data Mining for Mobile Communication Clustering and Characterization
Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer
Clustering. Danilo Croce Web Mining & Retrieval a.a. 2015/201 16/03/2016
Clustering Danilo Croce Web Mining & Retrieval a.a. 2015/201 16/03/2016 1 Supervised learning vs. unsupervised learning Supervised learning: discover patterns in the data that relate data attributes with
Comparison of K-means and Backpropagation Data Mining Algorithms
Comparison of K-means and Backpropagation Data Mining Algorithms Nitu Mathuriya, Dr. Ashish Bansal Abstract Data mining has got more and more mature as a field of basic research in computer science and
Data Mining Project Report. Document Clustering. Meryem Uzun-Per
Data Mining Project Report Document Clustering Meryem Uzun-Per 504112506 Table of Content Table of Content... 2 1. Project Definition... 3 2. Literature Survey... 3 3. Methods... 4 3.1. K-means algorithm...
ARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING)
ARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING) Gabriela Ochoa http://www.cs.stir.ac.uk/~goc/ OUTLINE Preliminaries Classification and Clustering Applications
Neural Networks Lesson 5 - Cluster Analysis
Neural Networks Lesson 5 - Cluster Analysis Prof. Michele Scarpiniti INFOCOM Dpt. - Sapienza University of Rome http://ispac.ing.uniroma1.it/scarpiniti/index.htm [email protected] Rome, 29
DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS
DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS 1 AND ALGORITHMS Chiara Renso KDD-LAB ISTI- CNR, Pisa, Italy WHAT IS CLUSTER ANALYSIS? Finding groups of objects such that the objects in a group will be similar
Clustering. Data Mining. Abraham Otero. Data Mining. Agenda
Clustering 1/46 Agenda Introduction Distance K-nearest neighbors Hierarchical clustering Quick reference 2/46 1 Introduction It seems logical that in a new situation we should act in a similar way as in
How To Cluster
Data Clustering Dec 2nd, 2013 Kyrylo Bessonov Talk outline Introduction to clustering Types of clustering Supervised Unsupervised Similarity measures Main clustering algorithms k-means Hierarchical Main
DATA MINING TECHNIQUES AND APPLICATIONS
DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,
Exploratory data analysis approaches unsupervised approaches. Steven Kiddle With thanks to Richard Dobson and Emanuele de Rinaldis
Exploratory data analysis approaches unsupervised approaches Steven Kiddle With thanks to Richard Dobson and Emanuele de Rinaldis Lecture overview Page 1 Ø Background Ø Revision Ø Other clustering methods
Social Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
Segmentation of stock trading customers according to potential value
Expert Systems with Applications 27 (2004) 27 33 www.elsevier.com/locate/eswa Segmentation of stock trading customers according to potential value H.W. Shin a, *, S.Y. Sohn b a Samsung Economy Research
Cluster Analysis: Basic Concepts and Algorithms
Cluster Analsis: Basic Concepts and Algorithms What does it mean clustering? Applications Tpes of clustering K-means Intuition Algorithm Choosing initial centroids Bisecting K-means Post-processing Strengths
Information Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli ([email protected])
Machine Learning. Chapter 18, 21. Some material adopted from notes by Chuck Dyer
Machine Learning Chapter 18, 21 Some material adopted from notes by Chuck Dyer What is learning? Learning denotes changes in a system that... enable a system to do the same task more efficiently the next
An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015
An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content
Data Mining and Business Intelligence CIT-6-DMB. http://blackboard.lsbu.ac.uk. Faculty of Business 2011/2012. Level 6
Data Mining and Business Intelligence CIT-6-DMB http://blackboard.lsbu.ac.uk Faculty of Business 2011/2012 Level 6 Table of Contents 1. Module Details... 3 2. Short Description... 3 3. Aims of the Module...
Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining
Data Mining Cluster Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 8 by Tan, Steinbach, Kumar 1 What is Cluster Analysis? Finding groups of objects such that the objects in a group will
Unsupervised learning: Clustering
Unsupervised learning: Clustering Salissou Moutari Centre for Statistical Science and Operational Research CenSSOR 17 th September 2013 Unsupervised learning: Clustering 1/52 Outline 1 Introduction What
An Analysis on Density Based Clustering of Multi Dimensional Spatial Data
An Analysis on Density Based Clustering of Multi Dimensional Spatial Data K. Mumtaz 1 Assistant Professor, Department of MCA Vivekanandha Institute of Information and Management Studies, Tiruchengode,
CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.
Lecture Machine Learning Milos Hauskrecht [email protected] 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht [email protected] 539 Sennott
Data Mining Part 5. Prediction
Data Mining Part 5. Prediction 5.1 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Classification vs. Numeric Prediction Prediction Process Data Preparation Comparing Prediction Methods References Classification
Data Mining for Customer Service Support. Senioritis Seminar Presentation Megan Boice Jay Carter Nick Linke KC Tobin
Data Mining for Customer Service Support Senioritis Seminar Presentation Megan Boice Jay Carter Nick Linke KC Tobin Traditional Hotline Services Problem Traditional Customer Service Support (manufacturing)
Cluster Analysis. Isabel M. Rodrigues. Lisboa, 2014. Instituto Superior Técnico
Instituto Superior Técnico Lisboa, 2014 Introduction: Cluster analysis What is? Finding groups of objects such that the objects in a group will be similar (or related) to one another and different from
Cluster Analysis: Basic Concepts and Methods
10 Cluster Analysis: Basic Concepts and Methods Imagine that you are the Director of Customer Relationships at AllElectronics, and you have five managers working for you. You would like to organize all
COURSE SYLLABUS. Enterprise Information Systems and Business Intelligence
MASTER PROGRAMS Autumn Semester 2008/2009 COURSE SYLLABUS Enterprise Information Systems and Business Intelligence Instructor: Malov Andrew, Master of Computer Sciences, Assistant,[email protected] Organization
Introduction to Clustering
Introduction to Clustering Yumi Kondo Student Seminar LSK301 Sep 25, 2010 Yumi Kondo (University of British Columbia) Introduction to Clustering Sep 25, 2010 1 / 36 Microarray Example N=65 P=1756 Yumi
Cluster analysis Cosmin Lazar. COMO Lab VUB
Cluster analysis Cosmin Lazar COMO Lab VUB Introduction Cluster analysis foundations rely on one of the most fundamental, simple and very often unnoticed ways (or methods) of understanding and learning,
ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION
ISSN 9 X INFORMATION TECHNOLOGY AND CONTROL, 00, Vol., No.A ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION Danuta Zakrzewska Institute of Computer Science, Technical
Machine Learning and Data Analysis overview. Department of Cybernetics, Czech Technical University in Prague. http://ida.felk.cvut.
Machine Learning and Data Analysis overview Jiří Kléma Department of Cybernetics, Czech Technical University in Prague http://ida.felk.cvut.cz psyllabus Lecture Lecturer Content 1. J. Kléma Introduction,
City University of Hong Kong. Information on a Course offered by the Department of Management Sciences with effect from Semester A in 2012 / 2013
City University of Hong Kong Information on a Course offered by the Department of Management Sciences with effect from Semester A in 2012 / 2013 Part I Course Title: Customer Relationship Management with
Statistical Databases and Registers with some datamining
Unsupervised learning - Statistical Databases and Registers with some datamining a course in Survey Methodology and O cial Statistics Pages in the book: 501-528 Department of Statistics Stockholm University
A Study of Web Log Analysis Using Clustering Techniques
A Study of Web Log Analysis Using Clustering Techniques Hemanshu Rana 1, Mayank Patel 2 Assistant Professor, Dept of CSE, M.G Institute of Technical Education, Gujarat India 1 Assistant Professor, Dept
Clustering Connectionist and Statistical Language Processing
Clustering Connectionist and Statistical Language Processing Frank Keller [email protected] Computerlinguistik Universität des Saarlandes Clustering p.1/21 Overview clustering vs. classification supervised
A Comparative Study of clustering algorithms Using weka tools
A Comparative Study of clustering algorithms Using weka tools Bharat Chaudhari 1, Manan Parikh 2 1,2 MECSE, KITRC KALOL ABSTRACT Data clustering is a process of putting similar data into groups. A clustering
Predicting the Risk of Heart Attacks using Neural Network and Decision Tree
Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,
Index Contents Page No. Introduction . Data Mining & Knowledge Discovery
Index Contents Page No. 1. Introduction 1 1.1 Related Research 2 1.2 Objective of Research Work 3 1.3 Why Data Mining is Important 3 1.4 Research Methodology 4 1.5 Research Hypothesis 4 1.6 Scope 5 2.
Reference Books. Data Mining. Supervised vs. Unsupervised Learning. Classification: Definition. Classification k-nearest neighbors
Classification k-nearest neighbors Data Mining Dr. Engin YILDIZTEPE Reference Books Han, J., Kamber, M., Pei, J., (2011). Data Mining: Concepts and Techniques. Third edition. San Francisco: Morgan Kaufmann
CSCI-599 DATA MINING AND STATISTICAL INFERENCE
CSCI-599 DATA MINING AND STATISTICAL INFERENCE Course Information Course ID and title: CSCI-599 Data Mining and Statistical Inference Semester and day/time/location: Spring 2013/ Mon/Wed 3:30-4:50pm Instructor:
There are a number of different methods that can be used to carry out a cluster analysis; these methods can be classified as follows:
Statistics: Rosie Cornish. 2007. 3.1 Cluster Analysis 1 Introduction This handout is designed to provide only a brief introduction to cluster analysis and how it is done. Books giving further details are
Clustering. 15-381 Artificial Intelligence Henry Lin. Organizing data into clusters such that there is
Clustering 15-381 Artificial Intelligence Henry Lin Modified from excellent slides of Eamonn Keogh, Ziv Bar-Joseph, and Andrew Moore What is Clustering? Organizing data into clusters such that there is
CLUSTER ANALYSIS FOR SEGMENTATION
CLUSTER ANALYSIS FOR SEGMENTATION Introduction We all understand that consumers are not all alike. This provides a challenge for the development and marketing of profitable products and services. Not every
Hadoop Operations Management for Big Data Clusters in Telecommunication Industry
Hadoop Operations Management for Big Data Clusters in Telecommunication Industry N. Kamalraj Asst. Prof., Department of Computer Technology Dr. SNS Rajalakshmi College of Arts and Science Coimbatore-49
HIGH DIMENSIONAL UNSUPERVISED CLUSTERING BASED FEATURE SELECTION ALGORITHM
HIGH DIMENSIONAL UNSUPERVISED CLUSTERING BASED FEATURE SELECTION ALGORITHM Ms.Barkha Malay Joshi M.E. Computer Science and Engineering, Parul Institute Of Engineering & Technology, Waghodia. India Email:
UNSUPERVISED MACHINE LEARNING TECHNIQUES IN GENOMICS
UNSUPERVISED MACHINE LEARNING TECHNIQUES IN GENOMICS Dwijesh C. Mishra I.A.S.R.I., Library Avenue, New Delhi-110 012 [email protected] What is Learning? "Learning denotes changes in a system that enable
Subject Description Form
Subject Description Form Subject Code Subject Title COMP417 Data Warehousing and Data Mining Techniques in Business and Commerce Credit Value 3 Level 4 Pre-requisite / Co-requisite/ Exclusion Objectives
Introduction to Data Mining and Machine Learning Techniques. Iza Moise, Evangelos Pournaras, Dirk Helbing
Introduction to Data Mining and Machine Learning Techniques Iza Moise, Evangelos Pournaras, Dirk Helbing Iza Moise, Evangelos Pournaras, Dirk Helbing 1 Overview Main principles of data mining Definition
RYERSON UNIVERSITY Ted Rogers School of Information Technology Management And G. Raymond Chang School of Continuing Education
1.0 PREREQUISITE RYERSON UNIVERSITY Ted Rogers School of Information Technology Management And G. Raymond Chang School of Continuing Education COURSE OF STUDY 2015-2016 (C)ITM 618 - Business Intelligence
City University of Hong Kong. Information on a Course offered by Department of Management Sciences with effect from Semester A in 2010 / 2011
City University of Hong Kong Information on a Course offered by Department of Management Sciences with effect from Semester A in 200 / 20 Part I Course Title: Enterprise Data Mining Course Code: MS4224
A Novel Fuzzy Clustering Method for Outlier Detection in Data Mining
A Novel Fuzzy Clustering Method for Outlier Detection in Data Mining Binu Thomas and Rau G 2, Research Scholar, Mahatma Gandhi University,Kerala, India. [email protected] 2 SCMS School of Technology
Medical Information Management & Mining. You Chen Jan,15, 2013 [email protected]
Medical Information Management & Mining You Chen Jan,15, 2013 [email protected] 1 Trees Building Materials Trees cannot be used to build a house directly. How can we transform trees to building materials?
Credit Card Fraud Detection Using Self Organised Map
International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 4, Number 13 (2014), pp. 1343-1348 International Research Publications House http://www. irphouse.com Credit Card Fraud
Clustering UE 141 Spring 2013
Clustering UE 141 Spring 013 Jing Gao SUNY Buffalo 1 Definition of Clustering Finding groups of obects such that the obects in a group will be similar (or related) to one another and different from (or
Data Mining + Business Intelligence. Integration, Design and Implementation
Data Mining + Business Intelligence Integration, Design and Implementation ABOUT ME Vijay Kotu Data, Business, Technology, Statistics BUSINESS INTELLIGENCE - Result Making data accessible Wider distribution
The SPSS TwoStep Cluster Component
White paper technical report The SPSS TwoStep Cluster Component A scalable component enabling more efficient customer segmentation Introduction The SPSS TwoStep Clustering Component is a scalable cluster
A Relevant Document Information Clustering Algorithm for Web Search Engine
A Relevant Document Information Clustering Algorithm for Web Search Engine Y.SureshBabu, K.Venkat Mutyalu, Y.A.Siva Prasad Abstract Search engines are the Hub of Information, The advances in computing
Quality Assessment in Spatial Clustering of Data Mining
Quality Assessment in Spatial Clustering of Data Mining Azimi, A. and M.R. Delavar Centre of Excellence in Geomatics Engineering and Disaster Management, Dept. of Surveying and Geomatics Engineering, Engineering
Hierarchical Cluster Analysis Some Basics and Algorithms
Hierarchical Cluster Analysis Some Basics and Algorithms Nethra Sambamoorthi CRMportals Inc., 11 Bartram Road, Englishtown, NJ 07726 (NOTE: Please use always the latest copy of the document. Click on this
Title. Introduction to Data Mining. Dr Arulsivanathan Naidoo Statistics South Africa. OECD Conference Cape Town 8-10 December 2010.
Title Introduction to Data Mining Dr Arulsivanathan Naidoo Statistics South Africa OECD Conference Cape Town 8-10 December 2010 1 Outline Introduction Statistics vs Knowledge Discovery Predictive Modeling
Clustering. Adrian Groza. Department of Computer Science Technical University of Cluj-Napoca
Clustering Adrian Groza Department of Computer Science Technical University of Cluj-Napoca Outline 1 Cluster Analysis What is Datamining? Cluster Analysis 2 K-means 3 Hierarchical Clustering What is Datamining?
K-Means Cluster Analysis. Tan,Steinbach, Kumar Introduction to Data Mining 4/18/2004 1
K-Means Cluster Analsis Chapter 3 PPDM Class Tan,Steinbach, Kumar Introduction to Data Mining 4/18/4 1 What is Cluster Analsis? Finding groups of objects such that the objects in a group will be similar
Cluster Analysis: Basic Concepts and Algorithms
8 Cluster Analysis: Basic Concepts and Algorithms Cluster analysis divides data into groups (clusters) that are meaningful, useful, or both. If meaningful groups are the goal, then the clusters should
EXTENDED CENTROID BASED CLUSTERING TECHNIQUE FOR ONLINE SHOPPING FRAUD DETECTION
EXTENDED CENTROID BASED CLUSTERING TECHNIQUE FOR ONLINE SHOPPING FRAUD DETECTION Priya J Rana 1, Jwalant Baria 2 1 ME IT, Department of IT, Parul institute of engineering & Technology, Gujarat, India 2
The Data Mining Process
Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data
Data Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland
Data Mining and Knowledge Discovery in Databases (KDD) State of the Art Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland 1 Conference overview 1. Overview of KDD and data mining 2. Data
Data Mining Clustering (2) Sheets are based on the those provided by Tan, Steinbach, and Kumar. Introduction to Data Mining
Data Mining Clustering (2) Toon Calders Sheets are based on the those provided by Tan, Steinbach, and Kumar. Introduction to Data Mining Outline Partitional Clustering Distance-based K-means, K-medoids,
Cluster Analysis: Advanced Concepts
Cluster Analysis: Advanced Concepts and dalgorithms Dr. Hui Xiong Rutgers University Introduction to Data Mining 08/06/2006 1 Introduction to Data Mining 08/06/2006 1 Outline Prototype-based Fuzzy c-means
Classification and Prediction
Classification and Prediction Slides for Data Mining: Concepts and Techniques Chapter 7 Jiawei Han and Micheline Kamber Intelligent Database Systems Research Lab School of Computing Science Simon Fraser
K-Means Clustering Tutorial
K-Means Clustering Tutorial By Kardi Teknomo,PhD Preferable reference for this tutorial is Teknomo, Kardi. K-Means Clustering Tutorials. http:\\people.revoledu.com\kardi\ tutorial\kmean\ Last Update: July
Clustering. Clustering. What is Clustering? What is Clustering? What is Clustering? Types of Data in Cluster Analysis
What is Clustering? Clustering Tpes of Data in Cluster Analsis Clustering A Categorization of Major Clustering Methods Partitioning Methods Hierarchical Methods What is Clustering? Clustering of data is
MS1b Statistical Data Mining
MS1b Statistical Data Mining Yee Whye Teh Department of Statistics Oxford http://www.stats.ox.ac.uk/~teh/datamining.html Outline Administrivia and Introduction Course Structure Syllabus Introduction to
IMPROVISATION OF STUDYING COMPUTER BY CLUSTER STRATEGIES
INTERNATIONAL JOURNAL OF ADVANCED RESEARCH IN ENGINEERING AND SCIENCE IMPROVISATION OF STUDYING COMPUTER BY CLUSTER STRATEGIES C.Priyanka 1, T.Giri Babu 2 1 M.Tech Student, Dept of CSE, Malla Reddy Engineering
Essential Components of an Integrated Data Mining Tool for the Oil & Gas Industry, With an Example Application in the DJ Basin.
Essential Components of an Integrated Data Mining Tool for the Oil & Gas Industry, With an Example Application in the DJ Basin. Petroleum & Natural Gas Engineering West Virginia University SPE Annual Technical
Map-Reduce Algorithm for Mining Outliers in the Large Data Sets using Twister Programming Model
Map-Reduce Algorithm for Mining Outliers in the Large Data Sets using Twister Programming Model Subramanyam. RBV, Sonam. Gupta Abstract An important problem that appears often when analyzing data involves
An Overview of Knowledge Discovery Database and Data mining Techniques
An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,
資 訊 管 理 個 案. E-commerce: Digital Markets, Digital Goods 1. Facebook, 2. Amazon vs. Walmart (Chap. 10)
Case Study for Information Management 資 訊 管 理 個 案 E-commerce: Digital Markets, Digital Goods 1. Facebook, 2. Amazon vs. Walmart (Chap. 10) 1011CSIM4B10 TLMXB4B Thu 8, 9, 10 (15:10-18:00) B508 Min-Yuh Day
CS Master Level Courses and Areas COURSE DESCRIPTIONS. CSCI 521 Real-Time Systems. CSCI 522 High Performance Computing
CS Master Level Courses and Areas The graduate courses offered may change over time, in response to new developments in computer science and the interests of faculty and students; the list of graduate
CSci 538 Articial Intelligence (Machine Learning and Data Analysis)
CSci 538 Articial Intelligence (Machine Learning and Data Analysis) Course Syllabus Fall 2015 Instructor Derek Harter, Ph.D., Associate Professor Department of Computer Science Texas A&M University - Commerce
CHAPTER 20. Cluster Analysis
CHAPTER 20 Cluster Analysis 20.1 Introduction 20.2 What Is Cluster Analysis? 20.3 Typical requirements 20.4 Types of Data in cluster Analysis 20.5 Interval-scaled Variables 20.6 Binary Variables 20.7 Nominal,Ordinal,
Big Data and Marketing
Big Data and Marketing Professor Venky Shankar Coleman Chair in Marketing Director, Center for Retailing Studies Mays Business School Texas A&M University http://www.venkyshankar.com [email protected]
PERFORMANCE ANALYSIS OF CLUSTERING ALGORITHMS IN DATA MINING IN WEKA
PERFORMANCE ANALYSIS OF CLUSTERING ALGORITHMS IN DATA MINING IN WEKA Prakash Singh 1, Aarohi Surya 2 1 Department of Finance, IIM Lucknow, Lucknow, India 2 Department of Computer Science, LNMIIT, Jaipur,
Analytics on Big Data
Analytics on Big Data Riccardo Torlone Università Roma Tre Credits: Mohamed Eltabakh (WPI) Analytics The discovery and communication of meaningful patterns in data (Wikipedia) It relies on data analysis
Data Mining K-Clustering Problem
Data Mining K-Clustering Problem Elham Karoussi Supervisor Associate Professor Noureddine Bouhmala This Master s Thesis is carried out as a part of the education at the University of Agder and is therefore
COMP3420: Advanced Databases and Data Mining. Classification and prediction: Introduction and Decision Tree Induction
COMP3420: Advanced Databases and Data Mining Classification and prediction: Introduction and Decision Tree Induction Lecture outline Classification versus prediction Classification A two step process Supervised
Principles of Data Mining by Hand&Mannila&Smyth
Principles of Data Mining by Hand&Mannila&Smyth Slides for Textbook Ari Visa,, Institute of Signal Processing Tampere University of Technology October 4, 2010 Data Mining: Concepts and Techniques 1 Differences
Unsupervised Learning and Data Mining. Unsupervised Learning and Data Mining. Clustering. Supervised Learning. Supervised Learning
Unsupervised Learning and Data Mining Unsupervised Learning and Data Mining Clustering Decision trees Artificial neural nets K-nearest neighbor Support vectors Linear regression Logistic regression...
Comparison of Data Mining Techniques used for Financial Data Analysis
Comparison of Data Mining Techniques used for Financial Data Analysis Abhijit A. Sawant 1, P. M. Chawan 2 1 Student, 2 Associate Professor, Department of Computer Technology, VJTI, Mumbai, INDIA Abstract
Data Mining and Soft Computing. Francisco Herrera
Francisco Herrera Research Group on Soft Computing and Information Intelligent Systems (SCI 2 S) Dept. of Computer Science and A.I. University of Granada, Spain Email: [email protected] http://sci2s.ugr.es
