Executive summary
Data mining supports decision makers by grounding decisions in data that is collected from multiple sources and analyzed with suitable analytical tools. In the first task, the Rattle data mining tool is used to carry out a statistical analysis that estimates how likely a customer is to respond favorably to a bank's marketing campaign for a term deposit. In the pivot analysis, data on technology adoption, population and urbanization across the African continent is used to understand the overall level of technology adoption in different African countries.
Introduction
Business intelligence is an important discipline in the current business environment: it gives organizations broad knowledge of their business so that they can take informed decisions that help them grow. This paper applies knowledge of markets, technology and management to understand their implications for organizational systems. Business intelligence also allows decision makers to solve critical problems by making suitable use of data, and the resulting informed decisions help them gain a sustainable competitive advantage over their competitors. Finally, the paper shows the importance of clear communication with the management team by using a report format, so that the main ideas behind the statistical information are easy to grasp.
Task-1
CRISP-DM Process
CRISP-DM stands for the Cross-Industry Standard Process for Data Mining. It is widely used to make better decisions from the large range of data now available. The process has six stages: business understanding, data understanding, data preparation, modeling, evaluation and deployment. The opportunities and challenges associated with the CRISP-DM process are given below:
Opportunities
· The CRISP-DM process helps the financial industry make predictive decisions, so that the likely outcome of a decision can be estimated in advance.
· Data gathering becomes part of the decision-making process of financial institutions, which helps them handle data in a better way.
· Implementing the CRISP-DM process in the financial sector, for example in loan sanctioning decisions, allows socio-technical changes to be deployed.
· Implementing the CRISP-DM process allows quick decision making based on the data gathered.
Key challenges
· No standard procedure: There is no single standard way of applying CRISP-DM, so decision making remains highly subjective and the decision can vary from person to person.
· Predictive modeling carries a high degree of risk, and using the CRISP-DM process further increases this risk for decision makers.
This task develops a statistical model to predict whether bank customers will respond positively to the bank's marketing campaign for a newly launched product, a term deposit.
PC | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7
SD | 1.21 | 1.06 | 1.04 | 0.98 | 0.94 | 0.91 | 0.76
Variance | 0.21 | 0.16 | 0.15 | 0.13 | 0.12 | 0.11 | 0.08
Cumulative proportion | 0.21 | 0.37 | 0.52 | 0.66 | 0.79 | 0.91 | 1.00
Table 1: Variance explained by the seven principal components
Seven principal components were identified in the data on customers' responses to the bank's marketing campaign, and together they explain 100% of the variance in the data. The first component explains the largest share of the variance (21%), while the seventh explains the smallest (8%). Because no small subset of components accounts for most of the variance, all seven principal components need to be considered when explaining the variance in the available data.
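The link between the standard deviations and the variance proportions in Table 1 can be checked with a short calculation. The sketch below is a minimal illustration in Python, not the Rattle output itself. It assumes that each proportion equals the squared standard deviation divided by the sum of the squared standard deviations, and the function name is illustrative. Because the standard deviations in Table 1 are rounded to two decimals, the recomputed proportions agree with the table only to within about 0.01.

```python
# Standard deviations of PC1..PC7, copied from Table 1 (rounded values).
STD_DEVS = [1.21, 1.06, 1.04, 0.98, 0.94, 0.91, 0.76]


def variance_proportions(std_devs):
    """Return (proportions, cumulative proportions) of variance per component."""
    variances = [sd ** 2 for sd in std_devs]
    total = sum(variances)
    proportions = [v / total for v in variances]
    cumulative = []
    running = 0.0
    for p in proportions:
        running += p
        cumulative.append(running)
    return proportions, cumulative


proportions, cumulative = variance_proportions(STD_DEVS)
for i, (p, c) in enumerate(zip(proportions, cumulative), start=1):
    print(f"PC{i}: proportion={p:.2f}, cumulative={c:.2f}")
```

The cumulative column shows how many components are needed to reach a chosen share of the variance, which is the basis for deciding how many components to keep.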
Figure 1: Variance explained by each of the principal components
The seven variables underlying the principal components are explained below:
· Age: The age of the customer, a numeric variable. The chance that a customer responds positively to the campaign is expected to rise with age, as older customers are more likely to take a term deposit than younger customers, who tend to look for higher returns.
· Balance: The average yearly balance in euros. A higher balance is expected to increase the likelihood of a positive response, as a higher balance makes a term deposit more likely.
· Day: The last contact day of the month. This is expected to have an inverse relationship with the chance that a customer responds positively to the campaign.
· Duration: The duration of the last contact in seconds. A longer duration is expected to increase the chance of a positive response.
· Campaign: The number of contacts performed during this campaign. It is expected to have a positive impact on the customer's response to the term deposit campaign.
· Pdays: The number of days since the customer was last contacted, a numeric variable. A higher pdays value is expected to make a negative response more likely.
· Previous: The number of contacts performed with this client before this campaign. A higher number of contacts is expected to make a positive response more likely.
Table 2 below shows the correlation between each pair of variables; a negative value indicates a negative correlation and a positive value a positive correlation.
Variable | Duration | Day | Pdays | Campaign | Previous | Balance | Age
Duration | 1.00 | -0.03 | -0.01 | -0.09 | -0.00 | 0.02 | -0.01
Day | -0.03 | 1.00 | -0.09 | 0.17 | -0.05 | -0.00 | -0.01
Pdays | -0.15 | -0.09 | 1.00 | -0.08 | 0.54 | 0.01 | -0.01
Campaign | -0.09 | 0.17 | -0.08 | 1.00 | -0.03 | -0.00 | -0.00
Previous | -0.00 | -0.05 | 0.54 | -0.03 | 1.00 | -0.02 | 0.00
Balance | 0.02 | -0.00 | 0.01 | -0.00 | 0.02 | 1.00 | 0.09
Age | -0.01 | -0.01 | -0.01 | -0.00 | 0.00 | 0.09 | 1.00
Table 2: Correlation table
Figure 2: Correlation plot
Figure 2 above plots the correlation of each variable with every other variable of interest. A dark blue ball indicates perfect correlation, which is why the correlation of each variable with itself is shown as a dark blue ball (Tan et al., 2005). A lighter blue ball indicates a positive correlation and a red ball a negative correlation, with the strength of the correlation represented by the intensity of the color. For example, duration is perfectly correlated with itself and is therefore shown as a dark blue ball, whereas duration and campaign are negatively correlated but only weakly, with a correlation of just -0.09.
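To make the meaning of the correlation values concrete, the following is a minimal Python sketch of the Pearson correlation coefficient, which is assumed here to be the measure behind the correlation table and plot. The sample lists are hypothetical values chosen only to produce a perfect positive and a perfect negative correlation; they are not taken from the bank data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical illustration data, not values from the bank data set.
duration = [100, 200, 300, 400]
campaign = [4, 3, 2, 1]

print(pearson(duration, duration))  # a variable against itself
print(pearson(duration, campaign))  # a perfectly inverse relationship
```

A value near zero, such as the -0.09 reported for duration and campaign, means the two variables carry almost no linear information about each other.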
Table 3 below provides the rotation (loadings) of each principal component in the above statistical model.
Factor | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7
Age | -0.01 | 0.13 | 0.69 | -0.21 | -0.65 | 0.16 | 0.05
Balance | 0.02 | 0.19 | 0.67 | 0.14 | 0.68 | -0.12 | 0.01
Day | -0.293 | -0.48 | 0.12 | 0.49 | 0.03 | 0.64 | 0.04
Duration | 0.067 | 0.45 | -0.05 | 0.80 | -0.28 | -0.21 | 0.02
Campaign | -0.28 | -0.58 | 0.16 | 0.12 | -0.14 | -0.70 | 0.08
Pdays | 0.65 | -0.22 | 0.02 | 0.06 | 0.00 | 0.02 | 0.71
Previous | 0.62 | -0.31 | 0.11 | 0.12 | -0.06 | -0.01 | -0.69
Table 3: Rotation for each of the seven principal components
Table 3 above shows the rotation values of the seven variables for each component, which relate to a customer responding positively to the bank's term deposit campaign. The variable representing each principal component was selected on the basis of these values. For example, the first principal component is best explained by the pdays variable, as it has the highest absolute value (0.65, considering both positive and negative rotations). The variable associated with each of the seven principal components can be read from the table in the same way and is given below:
PC | Variable | Variance explained
PC1 | Pdays | 21%
PC2 | Campaign | 16%
PC3 | Age | 15%
PC4 | Duration | 13%
PC5 | Balance | 12%
PC6 | Day | 11%
PC7 | Previous | 8%
Table 4: Principal components and their associated variables
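The mapping from components to variables can be reproduced from the rotation table with a simple rule. The sketch below is an assumption about how such a mapping could be derived, not a description of what Rattle does: it walks through PC1 to PC7 in order and gives each component the variable with the largest absolute loading that has not already been assigned to an earlier component. The function name is illustrative.

```python
VARIABLES = ["age", "balance", "day", "duration", "campaign", "pdays", "previous"]

# Rows follow VARIABLES, columns are PC1..PC7 (values copied from Table 3).
LOADINGS = [
    [-0.01, 0.13, 0.69, -0.21, -0.65, 0.16, 0.05],
    [0.02, 0.19, 0.67, 0.14, 0.68, -0.12, 0.01],
    [-0.293, -0.48, 0.12, 0.49, 0.03, 0.64, 0.04],
    [0.067, 0.45, -0.05, 0.80, -0.28, -0.21, 0.02],
    [-0.28, -0.58, 0.16, 0.12, -0.14, -0.70, 0.08],
    [0.65, -0.22, 0.02, 0.06, 0.00, 0.02, 0.71],
    [0.62, -0.31, 0.11, 0.12, -0.06, -0.01, -0.69],
]


def assign_variables(variables, loadings):
    """Greedily pair each component with its strongest unused variable."""
    assigned = {}
    used = set()
    for pc in range(len(loadings[0])):
        candidates = [
            (abs(loadings[row][pc]), name)
            for row, name in enumerate(variables)
            if name not in used
        ]
        _, best = max(candidates)
        used.add(best)
        assigned[f"PC{pc + 1}"] = best
    return assigned


assignment = assign_variables(VARIABLES, LOADINGS)
for component, variable in assignment.items():
    print(component, variable)
```

Under this rule a variable is used only once, which is why PC6 is paired with day and PC7 with previous even though campaign and pdays have slightly larger absolute loadings on those components.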
Figure 3: Variance explained by each of the principal components
Hence pdays is the most important factor in indicating the probability that a customer responds positively to the marketing campaign, and the component it represents alone explains 21% of the variance in the data.
The clustering of the variables can be investigated further through the figure below.
Figure 4: Cluster correlation
The figure shows a low degree of correlation between duration and age, and also between age and balance. Variables joined by longer lines are less correlated, while variables joined by shorter lines are more highly correlated (Witten et al., 2011). The correlation between previous and pdays is much higher than that between day and campaign, which in turn is higher than that between age and balance; this is also evident from the correlation table and figure.
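The ordering of these pairs can be confirmed directly from the values in Table 2. The short Python sketch below ranks the three pairs discussed above by the absolute size of their correlation; the dictionary and function names are illustrative.

```python
# Correlations for the variable pairs discussed in the text, copied from Table 2.
PAIR_CORRELATIONS = {
    ("pdays", "previous"): 0.54,
    ("day", "campaign"): 0.17,
    ("age", "balance"): 0.09,
}


def rank_by_strength(pairs):
    """Sort variable pairs from strongest to weakest absolute correlation."""
    return sorted(pairs.items(), key=lambda item: abs(item[1]), reverse=True)


ranked = rank_by_strength(PAIR_CORRELATIONS)
for (first, second), value in ranked:
    print(f"{first} / {second}: {value:+.2f}")
```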
Decision tree model
Several factors are important for building a model that predicts how likely customers are to respond positively to the bank's term deposit campaign. As the figure shows, the duration of the last contact in seconds is used as an important variable for the decision. If the duration is less than 382 seconds, there is only a 19% chance that a customer responds positively to the campaign, whereas the chance is 81% if the duration is more than 382 seconds.
Figure 5: Decision tree model
The total sample size is 23736 customers, on which the probability of responding to the bank's marketing campaign has been assessed. The figure shows that when the duration of the last contact is less than 381 seconds, 589 customers respond favorably to the campaign while 18717 respond unfavorably. Other variables, such as the contact mode, similarly affect the final probability that a customer responds to the campaign.
Linear logistic regression
The output of the linear logistic regression is given below and shows the p value for each variable in the model. Variables with a p value higher than 0.05 are not significant for the present model and are neglected:
Coefficients:
                        Estimate   Std. Error  z value  Pr(>|z|)
(Intercept)         -2.342738380  0.254914383   -9.190   < 2e-16 ***
age                  0.001566255  0.003052009    0.513  0.607820
jobblue-collar      -0.318673374  0.102655668   -3.104  0.001907 **
jobentrepreneur     -0.316647948  0.172889041   -1.832  0.067025 .
jobhousemaid        -0.538388738  0.191183831   -2.816  0.004861 **
jobmanagement       -0.102527911  0.102615934   -0.999  0.317726
jobretired           0.376518263  0.133761955    2.815  0.004880 **
jobself-employed    -0.342504442  0.158459442   -2.161  0.030659 *
jobservices         -0.194232947  0.118919971   -1.633  0.102404
jobstudent           0.436806785  0.151637794    2.881  0.003969 **
jobtechnician       -0.130960267  0.096823566   -1.353  0.176194
jobunemployed       -0.095461551  0.154354754   -0.618  0.536275
jobunknown          -0.324410056  0.341731512   -0.949  0.342462
maritalmarried      -0.204017340  0.081121122   -2.515  0.011904 *
maritalsingle        0.067809569  0.092891094    0.730  0.465396
educationsecondary   0.144754469  0.090122466    1.606  0.108231
educationtertiary    0.335025063  0.104212182    3.215  0.001305 **
educationunknown     0.280421030  0.144362709    1.942  0.052080 .
defaultyes          -0.206265166  0.245001364   -0.842  0.399847
balance              0.000007552  0.000006997    1.079  0.280404
housingyes          -0.598551149  0.060724215   -9.857   < 2e-16 ***
loanyes             -0.403288451  0.083548021   -4.827  1.39e-06 ***
contacttelephone    -0.212671512  0.103962989   -2.046  0.040791 *
contactunknown      -1.881500801  0.104719349  -17.967   < 2e-16 ***
day                  0.015389112  0.003459778    4.448  8.67e-06 ***
monthaug            -0.854139270  0.106503112   -8.020  1.06e-15 ***
monthdec             0.745806323  0.234774040    3.177  0.001490 **
monthfeb            -0.244513923  0.121338985   -2.015  0.043891 *
monthjan            -1.482841489  0.168533560   -8.798   < 2e-16 ***
monthjul            -1.011211970  0.105775958   -9.560   < 2e-16 ***
monthjun             0.490577739  0.127371095    3.852  0.000117 ***
monthmar             1.508293633  0.166565279    9.055   < 2e-16 ***
monthmay            -0.541757623  0.098919286   -5.477  4.33e-08 ***
monthnov            -1.054958271  0.115861428   -9.105   < 2e-16 ***
monthoct             0.562862311  0.152434421    3.692  0.000222 ***
monthsep             0.854664108  0.158987534    5.376  7.63e-08 ***
duration             0.004172679  0.000089337   46.707   < 2e-16 ***
campaign            -0.090290211  0.013999226   -6.450  1.12e-10 ***
pdays               -0.001008817  0.000434729   -2.321  0.020310 *
previous             0.013926615  0.008771871    1.588  0.112367
poutcomeother        0.114061243  0.127666809    0.893  0.371627
poutcomesuccess      2.121250059  0.113068828   18.761   < 2e-16 ***
poutcomeunknown     -0.266338419  0.129018678   -2.064  0.038985 *
Hence it is evident from the above output that several factors, such as age, single marital status, management job, services job and previous, can be neglected in the predictive model because their p values are higher than the 0.05 threshold.
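The screening by p value can be expressed in a few lines. The Python sketch below uses a subset of the rows from the coefficient output; the function names are illustrative, and p values reported as "< 2e-16" are stored as the upper bound 2e-16, which is an assumption made only so that they can be compared numerically. It also shows the usual way of reading a logistic regression coefficient, where the exponential of the estimate gives the multiplicative change in the odds per unit of the variable.

```python
import math

ALPHA = 0.05  # significance threshold used in the text

# (estimate, p value) pairs copied from the coefficient output.
# Entries reported as "< 2e-16" are stored as the upper bound 2e-16.
COEFFICIENTS = {
    "age": (0.001566255, 0.607820),
    "jobmanagement": (-0.102527911, 0.317726),
    "jobservices": (-0.194232947, 0.102404),
    "maritalmarried": (-0.204017340, 0.011904),
    "maritalsingle": (0.067809569, 0.465396),
    "housingyes": (-0.598551149, 2e-16),
    "duration": (0.004172679, 2e-16),
    "campaign": (-0.090290211, 1.12e-10),
    "pdays": (-0.001008817, 0.020310),
    "previous": (0.013926615, 0.112367),
}


def split_by_significance(coefficients, alpha=ALPHA):
    """Split terms into (significant, insignificant) name lists by p value."""
    significant = sorted(k for k, (_, p) in coefficients.items() if p < alpha)
    insignificant = sorted(k for k, (_, p) in coefficients.items() if p >= alpha)
    return significant, insignificant


def odds_ratio(estimate, units=1.0):
    """Multiplicative change in the odds for a change of `units` in the variable."""
    return math.exp(estimate * units)


significant, insignificant = split_by_significance(COEFFICIENTS)
print("neglected:", insignificant)
print("odds ratio per extra minute of contact:",
      round(odds_ratio(COEFFICIENTS["duration"][0], units=60), 3))
```

Reading the duration coefficient this way gives a sense of scale: each additional minute of contact multiplies the estimated odds of a positive response by a factor somewhat above one, holding the other variables fixed.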
Task-2
Datafication can be considered the process of turning various aspects of human life into data, so that value can be created by analyzing that data. Key examples are the ways in which Twitter and LinkedIn create value by turning aspects of human life into computerized data. The concept of datafication rests on three key aspects: dematerialization, liquification and density. Dematerialization is the process by which information is separated from the physical resource it relates to. Liquification means that information, once dematerialized from resources and assets, can be easily manipulated and moved, so that resources and assets which were previously linked can be unbundled or re-bundled. Density can be considered the outcome of the value creation process (Shah et al, 2012).
Datafication poses key challenges for individuals and organizations, for whom privacy and security are major concerns. A semantic or causal relationship between variables such as economic, political and social variables is not required; the technology provides an alternative way of tracking trends across these external sources. Individual privacy and security concerns hinder the progress of datafication. Users' daily activities are tracked through various tools, and many aspects of human life are datafied. For example, a person's network of friends is datafied through Facebook, the network of professional links through LinkedIn, location through Foursquare, thoughts through Twitter and music preferences through Spotify. Various aspects of human life are thus datafied to record users' daily activities, on the basis of which decisions can be made in an organizational context.
Activities such as reading books are also tracked by online sites in order to analyze users' reading lists; websites such as Amazon track what users read and make suggestions in line with the reading preferences they show. The analysis done by these sites is so deep that they can measure reading speed and estimate when a reader will finish a book, so that they can make a new offer for the books they sell online. For instance, a website like Amazon may offer the next book in a series at a discount when someone finishes the previous one. Datafication is applied to business organizations as well: commercial vehicles are tracked through GPS devices, and even the tires used on the vehicles are monitored.
With advances in technology, datafication can address several issues that business organizations face in tracking human behavior to support strategic business decisions. At the same time it invades the privacy of individuals and business organizations, which calls its usefulness into question. Individuals and organizations using the internet for various purposes are heavily tracked by websites, and this tracking goes well beyond recording consumers' preferences (Anderson, 2008). These websites have access to consumer details, personal activities, preferences and buying decisions, which allows them to shape their offerings to consumer preferences but also intrudes on the security of individuals and organizations. With access to such key information, a person with ill intentions could easily use it for personal benefit. Further, information is often accessed without the prior knowledge of the users concerned, and consumers' financial details may also be shared, which can lead to financial losses for individuals and business organizations. Overall, datafication compromises the internet security of the individuals and organizations that work online.
Government organizations such as the military and banks also come under the purview of datafication, and tracking these organizations raises even greater security concerns because the information held in their systems is highly critical. Any lapse in security may result in an attack on the information system by external intruders who seek to take critical information and use it for their own benefit. Governments have developed several regulations on the invasion of users' privacy on the internet, so that datafication does not compromise user security and so that users adopt proper information security measures. These regulations impose restrictions on websites that are involved in datafication and compromise users' information security. The government has prepared a list of such websites and blocked access to them from government offices, so that information security is not compromised and the critical information in government systems does not fall into the wrong hands. Further information security arrangements have been made to ensure a higher level of protection, so that no such incidents arise from datafication.
In addition, user training is another important step in this direction. It covers safe browsing, information security and the steps needed to ensure proper information security arrangements in the organization. Users are given detailed manuals and processes to follow while working on government information systems, and responsibilities are assigned to users in case a breach of information security takes place (Tuomi, 1999). Such training is of immense importance in ensuring that users do not fall into the traps set by websites involved in datafication to obtain information from them and compromise information security. Similarly, individuals who purchase products online are tracked for the payment methods they adopt and the passwords for the credit or debit cards they use, and several websites offer a remember-password option so that payments can be made quickly. Such tracking of financial information may compromise it and result in heavy financial losses for users.
Ethical issues are a major concern in datafication, because individuals and organizations are tracked in all their activities, and the data is created and analyzed to generate value for the business organizations involved. Although it has often been argued that this data is useful for the individuals and organizations that use the internet, it remains one of the major ethical concerns that advocates of datafication need to address. The first ethical issue is consumers' low awareness of the tracking that websites carry out on their every action on the internet. Individuals and organizations working online do not have sufficient information about datafication, and they realize it is happening only when they come across advertisements and offers that match their own choices. Tracking consumers' actions without their prior knowledge has ethical implications: users have not agreed to share the information, yet it is obtained from them and used by business organizations for their own benefit, which makes the conduct of the websites involved unethical.
Marketing and advertising companies involved in datafication track consumers' preferences for particular products and services. Based on their past history or searches, consumers are offered similar products, since they are more likely to purchase things they have been searching for on the internet. Advocates of datafication argue that this is of immense importance for consumers as well, since they receive offers on products they like and can find products and services easily without searching in many places (Maitlis, 2005). However, once the advertisements and marketing communication are customized to suit consumer preferences, consumers buy without knowing that the offerings have been customized by marketers, and so they unknowingly buy products of the marketers' choice. Users are also frequently presented with terms and conditions which they accept and which may involve sharing their information with other parties on the internet for commercial benefit. Since consumers are not aware of the consequences of sharing information, it should be the responsibility of websites to make users aware of them.
Task-3
a)
1) Top 10 best performing countries in terms of technology adoption of mobile phones by year and region
The figure below analyzes the 10 top performing countries in terms of mobile technology adoption by year and region. It shows the 10 best countries for the different years, among which Algeria is the best and Tanzania the worst in terms of adoption.
2) Top 10 worst performing countries in terms of technology adoption of Internet by year and region
As shown in the figure below, the 10 worst performing countries in terms of internet adoption on the African continent are highlighted; Comoros is the worst performing country, followed by Djibouti and others.
3) Top 10 best performing countries in terms of technology adoption of landlines per head of population by year and region
The figure below shows the 10 best performing countries in terms of landline adoption. Algeria is again the best performing country, while Tunisia ranks 10th.
4) A summary of key technology adoption factors for each region of Africa for a given year
The figure below summarizes the technology adoption factors by region of the African continent for the given year. The graph indicates that East Africa is the best performing region and South Africa the worst.
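The rankings above all follow the same pattern: filter the records to one year, compute adoption per head of population, and keep the top or bottom entries. The Python sketch below illustrates that pattern under stated assumptions. The record layout, the function names and the sample rows ("Country A" and so on) are hypothetical placeholders and are not taken from the World Bank data used for the figures.

```python
from typing import NamedTuple


class Record(NamedTuple):
    country: str
    region: str
    year: int
    population: int
    subscriptions: int  # mobile, internet or landline, depending on the analysis


def per_head(record):
    """Adoption per head of population for one country and year."""
    return record.subscriptions / record.population


def rank_countries(records, year, n=10, best=True):
    """Return the n best (or worst) countries by adoption per head in a year."""
    rows = [r for r in records if r.year == year]
    rows.sort(key=per_head, reverse=best)
    return [(r.country, round(per_head(r), 3)) for r in rows[:n]]


# Hypothetical placeholder rows, for illustration only.
records = [
    Record("Country A", "North", 2010, 1_000_000, 900_000),
    Record("Country B", "East", 2010, 2_000_000, 500_000),
    Record("Country C", "West", 2010, 500_000, 100_000),
    Record("Country A", "North", 2011, 1_000_000, 950_000),
]

top = rank_countries(records, 2010, n=2)
worst = rank_countries(records, 2010, n=2, best=False)
print(top)
print(worst)
```

Dividing by population before ranking matters because a large country can have many subscriptions in absolute terms while still having low adoption per head.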
b)
Graphic design and functionality for the World Bank regional unit of the African continent
The dashboard prepared for the World Bank regional unit of the African continent is of high importance for determining key trends in population, urbanization and the adoption of technologies such as mobile phones, landlines and the internet. Each dashboard developed above highlights key figures on technology adoption over several years in the different countries of the African continent. These dashboards can be considered an important part of the analysis when decisions are made by the higher authorities in the bank. Some of the key features and functionality of the dashboard adopted by the World Bank regional unit of the African continent are given below:
· eDeployment: Several users access the dashboard for the key information in the system, so a real-time update feature should be provided to keep them informed of the latest information available to the organization. This allows users at different geographic locations to access the same dashboard and discuss the implications and the improvement tools that could be deployed to improve the situation on the African continent. Email, internet and intranet tools can be used to update the information on the dashboard in real time. Enabling eDeployment also makes it possible to update a large number of cubes and dashboards in a much shorter time. Further, data is updated on the systems during lean hours so that the update process does not affect users' work during business hours.
· Extension builder: A hassle-free data transfer process needs to be implemented with the dashboard so that transfers cause no delays and the data remains safe while moving from one system to another. The extension builder functionality enables users to transfer data at high speed without errors or loss of data. Its role is not limited to data transfer; it also helps resolve the compatibility issues that users face when moving data from one system to another.
· Dashboard designer: The dashboard designer tool is of immense importance because it enables users to design the various components of the dashboard. The efficiency of the dashboard can be enhanced through visualization, as the data on the dashboard is real time and properly synchronized. The tool also allows real-time data refresh, so that users can access the data as soon as it is updated (Günnemann et al, 2011).
· Named consumer users: This functionality reduces duplication in the available data: one department enters the data into the system, and the analysis can be accessed by multiple users at different locations. Users can thus synchronize their efforts and obtain the output without entering the data several times. This enhances the efficiency of the organization and reduces duplicated data entry effort.
· Web access server: A web access server would be deployed for the dashboard of the World Bank regional unit of the African continent so that users get faster access to the dashboard without any intervention (Battiti and Andrea, 2010). Access is provided through a URL, so that users can reach the dashboard from any device by logging into the portal, with access rights granted to a limited set of users only. URL-based access gives users fast access, anytime and anywhere, to the critical information on the dashboard.
Hence the various functionalities deployed for the dashboard are useful for its users: they ensure that the data provided is updated in real time, that users have faster access to the data, and that data can be transferred from one system to another without any hassle.
References
Anderson C (2008) The end of theory: the data deluge makes the scientific method obsolete. Wired, http://www.wired.com/science/discoveries/magazine/1607/pb_theory (accessed 25 May 2015).
Maitlis S (2005) The social processes of organizational sensemaking. Academy of Management Journal 48(1), 21–49.
Tuomi I (1999) Data is more than knowledge: implications of the reversed knowledge hierarchy for knowledge management and organizational memory. Journal of Management Information Systems 16(3), 103–117.
Shah S, Horne A and Capellá J (2012) Good data won’t guarantee good decisions. Harvard Business Review 90(4), 23–25.
Günnemann S, Kremer H and Seidl T (2011) An extension of the PMML standard to subspace clustering models. Proceedings of the 2011 workshop on Predictive markup language modeling - PMML '11, p. 48. DOI:10.1145/2023598.2023605. ISBN 9781450308373.
Battiti R and Andrea P (2010) Brain-Computer Evolutionary Multi-Objective Optimization (BC-EMO): a genetic algorithm adapting to the decision maker. IEEE Transactions on Evolutionary Computation 14(15), 671–687.
Tan P-N, Steinbach M and Kumar V (2005) Introduction to Data Mining. ISBN 0-321-32136-7.
Witten IH, Frank E and Hall MA (2011) Data Mining: Practical Machine Learning Tools and Techniques (3 ed.). Elsevier. ISBN 978-0-12-374856-0.