Introduction

 

Data mining is the process of extracting useful information from large pools
of data. The information industry holds enormous amounts of data, which are of
little use until they are converted into meaningful information and analyzed,
for example to detect fraud, understand buyers’ preferences, control the
manufacturing of products and understand the market better.


Data mining helps entrepreneurs know their customers better: their
preferences, the deals they look for, their income and the criteria by which
they like to spend. It also indicates how often a customer makes purchases and
makes it possible to relate different people with similar preferences.

Apart from these, it also assists the corporate sector.

 

Data mining functions are categorized as “Descriptive” and “Classification
and prediction” on the basis of the type of data to be mined.

 

1. Descriptive function
It describes the general properties of the data in the database. It covers:

- Class/concept description
- Mining of frequent patterns
- Mining of associations
- Mining of correlations
- Mining of clusters

CLASS/CONCEPT DESCRIPTION
Class: a category of products sold by the company, for example, clothes.
Concept: a grouping of customers by the money they spend, for example, big
spenders or budget shoppers.

These descriptions can be derived in two ways:

- Data Characterization: summarizing the data of the class under study, called
the ‘target class’.
- Data Discrimination: comparing the target class with a designated
contrasting class.

MINING OF FREQUENT PATTERNS
Patterns that occur frequently in transactional data are termed frequent
patterns.

- Frequent item set: a set of products that frequently appear together, such
as top and bottom wear in the clothing section.
- Frequent subsequence: a sequence of purchases that occurs frequently, such
as buying pet food followed by pet treats.
- Frequent substructure: structural forms such as graphs or trees, which may be
combined with item sets or subsequences.

MINING OF ASSOCIATION
Associations describe items that are generally bought together. With their
help, a retailer can determine how often products are bought together, for
example that a mobile phone is bought with a mobile cover 60 percent of the
time and with a screen guard 40 percent of the time.
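
The percentages above correspond to what is usually called the confidence of an association rule. The sketch below shows one way to compute it. The transaction list and the function names `support` and `confidence` are made up for illustration and chosen so that the figures match the example in the text.

```python
# Hypothetical purchase records; each set is one transaction.
transactions = [
    {"phone", "cover"},
    {"phone", "cover", "screen guard"},
    {"phone", "screen guard"},
    {"phone", "cover"},
    {"phone"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent): support(A and C) / support(A)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

cover_conf = confidence({"phone"}, {"cover"}, transactions)
guard_conf = confidence({"phone"}, {"screen guard"}, transactions)
print(f"phone -> cover: {cover_conf:.0%}")
print(f"phone -> screen guard: {guard_conf:.0%}")
```

In this toy data a cover accompanies the phone in 3 of 5 transactions and a screen guard in 2 of 5, which gives the 60 and 40 percent figures.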

MINING OF CORRELATION
It reveals whether the purchase of one product has a positive, negative or no
effect on the purchase of another.
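
One common way to quantify such an effect is the lift measure, which the text does not name; it is used here as an assumption. A lift above 1 suggests a positive effect, below 1 a negative effect, and close to 1 no effect. The baskets and the function name `lift` are hypothetical.

```python
# Hypothetical baskets used only to illustrate the measure.
baskets = [
    {"pet food", "pet treats"},
    {"pet food", "pet treats"},
    {"pet food", "pet treats"},
    {"tea"},
    {"coffee"},
    {"tea", "pet food"},
]

def lift(a, b, baskets):
    """P(a and b) / (P(a) * P(b)), estimated from the baskets.

    Assumes both items occur at least once; otherwise the division fails.
    """
    n = len(baskets)
    p_a = sum(a in t for t in baskets) / n
    p_b = sum(b in t for t in baskets) / n
    p_ab = sum(a in t and b in t for t in baskets) / n
    return p_ab / (p_a * p_b)

food_treats = lift("pet food", "pet treats", baskets)  # above 1: positive effect
tea_coffee = lift("tea", "coffee", baskets)            # below 1: negative effect
print(food_treats, tea_coffee)
```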

MINING OF CLUSTERS
It is the grouping of similar products together. The items within a cluster
are similar to one another and differ from the items in other clusters.
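
As a rough illustration of clustering, the sketch below groups products by price with a simple one-dimensional k-means loop. The text does not name a clustering algorithm, so k-means is an assumption, and the prices, the starting centers and the function name `kmeans_1d` are hypothetical.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group numbers around the nearest center, then move each center
    to the mean of its group; repeat a fixed number of times."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster happens to be empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

prices = [10, 12, 11, 95, 100, 105]  # hypothetical product prices
centers, clusters = kmeans_1d(prices, centers=[0, 50])
print(centers)
print(clusters)
```

With these inputs the cheap and the expensive products end up in separate clusters. Real data would normally use several attributes and a library implementation.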

2. Classification and prediction
The class label of some items may be unknown. Classification and prediction
is a procedure for finding a model that describes the data classes or
concepts, so that such labels can be predicted.
The derived model can be presented as:
(a) Classification (If-Then) rules
(b) Decision trees
(c) Mathematical formulae
(d) Neural networks

 

FUNCTIONS:

- Classification: deriving a model that distinguishes the classes or concepts
of the data. The model is built from objects whose class labels are well known.
- Prediction: regression analysis is used to predict unknown numerical values
rather than class labels. It is also used to identify sales trends on the
basis of the available data.
- Outlier analysis: data objects that do not comply with the general model of
the available data are outliers.
- Evolution analysis: it describes regularities or trends for objects whose
behavior changes over time.
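
Prediction and outlier analysis as described above can be sketched with a least-squares line. The monthly sales figures, the tolerance of 5 units and the names `fit_line` and `predict` are assumptions made for illustration only.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

months = [1, 2, 3, 4, 5]
sales = [10, 12, 14, 16, 18]  # hypothetical sales per month
slope, intercept = fit_line(months, sales)

def predict(month):
    """Predict a numerical value (sales) for a month from the fitted trend."""
    return slope * month + intercept

forecast = predict(6)

# Outlier analysis: flag new observations far from what the model expects.
tolerance = 5  # assumed threshold, in the same units as sales
observed = [(6, 20.5), (7, 40)]
outliers = [(m, s) for m, s in observed if abs(s - predict(m)) > tolerance]
print(forecast, outliers)
```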

HOW DOES THE CLASSIFICATION WORK?

It incorporates two stages:

- Building the classifier or model
- Using the classifier for classification

 

BUILDING THE CLASSIFIER

- It is the learning step.
- Classification algorithms build the classifier.
- The classifier is built from a training set made up of database tuples and
their associated class labels.
- Each tuple in the training set belongs to a category or class; the tuples
are also known as samples, objects or data points.

USING THE CLASSIFIER

In this step the classifier is used for classification. Test data are used to
estimate the accuracy of the classification rules, and the rules are applied
to new data tuples if the accuracy is considered acceptable.
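
The two stages can be sketched as follows, assuming a very simple learner that derives one If-Then rule per attribute value (the majority class seen in training). The tuples, the 0.7 accuracy threshold and the function names are hypothetical and are not taken from any particular library.

```python
from collections import Counter, defaultdict

def build_classifier(training):
    """Learning step: for each attribute value, keep the majority class label.
    Each entry reads as: IF income == value THEN label."""
    counts = defaultdict(Counter)
    for value, label in training:
        counts[value][label] += 1
    return {value: c.most_common(1)[0][0] for value, c in counts.items()}

def classify(rules, value, default="unknown"):
    """Apply the learned rules to a single attribute value."""
    return rules.get(value, default)

def accuracy(rules, test):
    """Share of test tuples whose known label matches the predicted label."""
    return sum(classify(rules, v) == label for v, label in test) / len(test)

# (income level, loan decision) tuples with known class labels.
training = [("high", "approve"), ("high", "approve"), ("high", "reject"),
            ("low", "reject"), ("low", "reject"), ("medium", "approve")]
test = [("high", "approve"), ("low", "reject"),
        ("medium", "reject"), ("low", "reject")]

rules = build_classifier(training)
score = accuracy(rules, test)

# Use the classifier on new tuples only if the accuracy is acceptable.
predictions = [classify(rules, v) for v in ["high", "low"]] if score >= 0.7 else []
print(rules, score, predictions)
```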

 

 

 

DATA MINING TASK PRIMITIVES

A data mining task can be specified in the form of a query.

- The query is given as input to the computer.
- The query is defined in terms of data mining task primitives.
- These primitives allow the user to communicate interactively with the data
mining system.

 

A query is specified through the following primitives:

 

#Task-relevant data to be mined

This is the part of the database in which the user is interested. It is
composed of:

- database attributes
- data warehouse dimensions of interest

 

#Kind of knowledge to be mined

It specifies the functions to be performed, which are:

            -characterization
            -discrimination
            -association and correlation analysis
            -classification
            -prediction
            -clustering
            -outlier analysis
            -evolution analysis

 

#Background knowledge

It allows data to be mined at multiple levels of abstraction, for example
through concept hierarchies.

 

#Interestingness measures and thresholds for pattern evaluation

They are used to evaluate the patterns discovered by the mining process.

 

#Presentation and visualization of discovered patterns

It refers to the form in which the discovered patterns are displayed, such as
rules, tables, charts, decision trees and graphs.
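
To make the five primitives concrete, the sketch below bundles them into one query object. It is not the syntax of any real data mining query language; the class `MiningQuery`, its field names and all values are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class MiningQuery:
    """One field group per task primitive described above."""
    relevant_attributes: list          # task-relevant data
    kind_of_knowledge: str             # e.g. "association", "clustering"
    concept_hierarchy: dict = field(default_factory=dict)  # background knowledge
    min_support: float = 0.0           # interestingness thresholds
    min_confidence: float = 0.0
    presentation: str = "rules"        # how discovered patterns are shown

def roll_up(value, hierarchy):
    """Generalize a value one level up the concept hierarchy, if known."""
    return hierarchy.get(value, value)

query = MiningQuery(
    relevant_attributes=["item", "customer_city"],
    kind_of_knowledge="association",
    concept_hierarchy={"Delhi": "India", "Mumbai": "India", "Paris": "France"},
    min_support=0.05,
    min_confidence=0.6,
    presentation="table",
)

print(roll_up("Delhi", query.concept_hierarchy))
```

The `roll_up` helper illustrates what mining at a higher level of abstraction means: cities are replaced by their countries before patterns are searched for.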

ISSUES IN DATA MINING

Data aggregation can be complicated because the information is not all
available in one place; it has to be collected from varied sources.

 

The major points of concern are:

(i)Mining methodology and user interest

(ii)Performance issues

(iii)Diverse data type issues

 

The following diagram shows issues in data mining:

[Diagram: issues in data mining]

DATA WAREHOUSE

A data warehouse exhibits the following features in order to support
management’s decision making:

Subject oriented

A data warehouse is subject oriented because it provides information around a
subject, such as sales, customers or products. It does not focus on ongoing
operations but on the analysis of data for decision making.

Integrated

Because the data is collected and integrated from varied sources, it supports
reliable analysis.

Time variant

The data is identified with a particular time period and provides information
from a historical point of view.

Non volatile

The data warehouse is kept separate from the operational database, so new
information does not delete or replace previously stored information.

 

Data warehousing comprises data cleaning, data integration and data
consolidation. Integration follows one of two approaches: query driven or
update driven. The former builds wrappers and integrators (also called
mediators) on top of the source databases, whereas the latter integrates the
data in advance and stores it in a warehouse, where it is available for direct
querying. The update-driven approach is the one followed today.

 

APPLICATION

Data mining is used in:

·      Retail industry

·      Telecommunication

·      Financial data analysis

·      Intrusion detection

·      Biological data analysis

·      Other scientific applications

Data mining in banking/finance

In the financial arena, data mining is used to predict loan payments, analyze
credit policy and detect fraud.

Data mining in marketing

Similarly, in the retail industry it helps in better understanding customers,
products, sales, etc.

Data mining in healthcare

It helps manage large volumes of data, as in bioinformatics, enabling study in
biological fields such as genomics, proteomics and biomedical research.

TRENDS IN DATA MINING

Data mining concepts are constantly evolving. Current trends include:

·      Visualization

·      Exploring the application

·      Web mining

·      Biological mining

·      Privacy protection

·      Distributed data mining

