<?xml version="1.0" encoding="utf-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">INFORMATICA</journal-id>
<journal-title-group><journal-title>Informatica</journal-title></journal-title-group>
<issn pub-type="epub">1822-8844</issn>
<issn pub-type="ppub">0868-4952</issn>
<issn-l>0868-4952</issn-l>
<publisher>
<publisher-name>Vilnius University</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">INFO1224</article-id>
<article-id pub-id-type="doi">10.15388/Informatica.2019.221</article-id>
<article-categories><subj-group subj-group-type="heading">
<subject>Research Article</subject></subj-group></article-categories>
<title-group>
<article-title>Fuzzifier Selection in Fuzzy C-Means from Cluster Size Distribution Perspective</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Zhou</surname><given-names>Kaile</given-names></name><email xlink:href="zhoukaile@hfut.edu.cn">zhoukaile@hfut.edu.cn</email><xref ref-type="aff" rid="j_info1224_aff_001">1</xref><xref ref-type="aff" rid="j_info1224_aff_002">2</xref><xref ref-type="corresp" rid="cor1">∗</xref><bio>
<p><bold>K. Zhou</bold> received the BS and PhD degrees from the School of Management, Hefei University of Technology, Hefei, China, in 2010 and 2014, respectively. From 2013 to 2014, he was a visiting scholar in the Eller College of Management, The University of Arizona, Tucson, AZ, USA. He is currently an associate professor with the School of Management, Hefei University of Technology. His research interests include clustering algorithms, data analysis, and smart energy management.</p></bio>
</contrib>
<contrib contrib-type="author">
<name><surname>Yang</surname><given-names>Shanlin</given-names></name><email xlink:href="yangsl@hfut.edu.cn">yangsl@hfut.edu.cn</email><xref ref-type="aff" rid="j_info1224_aff_001">1</xref><xref ref-type="aff" rid="j_info1224_aff_002">2</xref><bio>
<p><bold>S. Yang</bold> is currently a distinguished professor with the School of Management, Hefei University of Technology, Hefei, China. He has authored over 300 refereed journal papers and over 200 conference papers. His research interests include engineering management, information management, and decision support systems. He is a member of the Chinese Academy of Engineering. He is a fellow of the Asian Pacific Industrial Engineering and Management Society. He is also the vice chairman of the China Branch of the Association of Information Systems.</p></bio>
</contrib>
<aff id="j_info1224_aff_001"><label>1</label>School of Management, <institution>Hefei University of Technology</institution>, Hefei 230009, <country>China</country></aff>
<aff id="j_info1224_aff_002"><label>2</label>Key Laboratory of Process Optimization and Intelligent Decision-Making, Ministry of Education, <institution>Hefei University of Technology</institution>, Hefei 230009, <country>China</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>∗</label>Corresponding author.</corresp>
</author-notes>
<pub-date pub-type="ppub"><year>2019</year></pub-date>
<pub-date pub-type="epub"><day>1</day><month>1</month><year>2019</year></pub-date><volume>30</volume><issue>3</issue><fpage>613</fpage><lpage>628</lpage>
<history>
<date date-type="received"><month>8</month><year>2018</year></date>
<date date-type="accepted"><month>3</month><year>2019</year></date>
</history>
<permissions><copyright-statement>© 2019 Vilnius University</copyright-statement><copyright-year>2019</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>Fuzzy c-means (FCM) is a well-known and widely applied fuzzy clustering method. Although many studies have addressed the selection of fuzzifier values in FCM, no single criterion is widely accepted. Moreover, in practical applications, the distributions of many data sets are not uniform. Hence, it is necessary to understand the impact of cluster size distribution on the selection of the fuzzifier value. In this paper, the coefficient of variation (CV) is used to measure the variation of cluster sizes in a data set, and the difference of coefficient of variation (DCV) is the change of variation in cluster sizes after FCM clustering. Then, on the premise that a fuzzifier value is better when FCM clustering with it produces a smaller change in cluster size variation, a criterion for fuzzifier selection in FCM is presented from the cluster size distribution perspective, followed by a fuzzifier selection algorithm called the CSD-m (cluster size distribution for fuzzifier selection) algorithm. We also develop an indicator called the Influence Coefficient of Fuzzifier (<inline-formula id="j_info1224_ineq_001"><alternatives>
<mml:math><mml:mi mathvariant="italic">ICF</mml:mi></mml:math>
<tex-math><![CDATA[$\mathit{ICF}$]]></tex-math></alternatives></inline-formula>) to measure the influence of fuzzifier values on FCM clustering results. Finally, experimental results on 8 synthetic data sets and 4 real-world data sets illustrate the effectiveness of the proposed criterion and CSD-m algorithm. The results also demonstrate that the widely used fuzzifier value <inline-formula id="j_info1224_ineq_002"><alternatives>
<mml:math><mml:mi mathvariant="italic">m</mml:mi><mml:mo>=</mml:mo><mml:mn>2</mml:mn></mml:math>
<tex-math><![CDATA[$m=2$]]></tex-math></alternatives></inline-formula> is not optimal for many data sets with large variation in cluster sizes. Based on the relationship between <inline-formula id="j_info1224_ineq_003"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">CV</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\mathit{CV}_{0}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_info1224_ineq_004"><alternatives>
<mml:math><mml:mi mathvariant="italic">ICF</mml:mi></mml:math>
<tex-math><![CDATA[$\mathit{ICF}$]]></tex-math></alternatives></inline-formula>, we further found that there is a linear correlation between the extent of fuzzifier value influence and the original cluster size distributions.</p>
</abstract>
<kwd-group>
<label>Key words</label>
<kwd>fuzzy c-means</kwd>
<kwd>fuzzifier</kwd>
<kwd>CSD-m algorithm</kwd>
<kwd>cluster size distribution</kwd>
</kwd-group>
<funding-group>
<award-group>
<funding-source xlink:href="https://doi.org/10.13039/501100001809">National Natural Science Foundation of China</funding-source>
<award-id>71822104</award-id>
<award-id>71521001</award-id>
</award-group>
<award-group>
<funding-source xlink:href="https://doi.org/10.13039/501100002858">China Postdoctoral Science Foundation</funding-source>
<award-id>2017M612072</award-id>
</award-group>
<funding-statement>This work is supported by the National Natural Science Foundation of China (Nos. 71822104, 71521001), Anhui Science and Technology Major Project (No. 17030901024), Hong Kong Scholars Program (No. 2017-167), and China Postdoctoral Science Foundation (No. 2017M612072). </funding-statement>
</funding-group>
</article-meta>
</front>
<body>
<sec id="j_info1224_s_001">
<label>1</label>
<title>Introduction</title>
<p>Clustering (Jain, <xref ref-type="bibr" rid="j_info1224_ref_019">2010</xref>; Hartigan, <xref ref-type="bibr" rid="j_info1224_ref_016">1975</xref>; Khemchandani and Pal, <xref ref-type="bibr" rid="j_info1224_ref_022">2019</xref>) is an unsupervised learning process that partitions a given data set into clusters based on similarity/dissimilarity functions, such that data objects in the same cluster are as similar as possible, while those in different clusters are as dissimilar as possible. Various clustering methods have been proposed and applied in many areas (Olde Keizer <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_027">2016</xref>; Benati <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_003">2017</xref>; Truong <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_035">2017</xref>; Pham <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_033">2018</xref>; Motlagh <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_026">2019</xref>; Borg and Boldt, <xref ref-type="bibr" rid="j_info1224_ref_008">2016</xref>; Mokhtari and Salmasnia, <xref ref-type="bibr" rid="j_info1224_ref_025">2015</xref>).</p>
<p>In crisp clustering methods, such as <italic>k</italic>-means (MacQueen, <xref ref-type="bibr" rid="j_info1224_ref_023">1967</xref>; Mehdizadeh <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_024">2017</xref>) or hierarchical clustering (Johnson, <xref ref-type="bibr" rid="j_info1224_ref_020">1967</xref>), each data object is assigned to exactly one cluster. In contrast, fuzzy c-means (FCM) (Bezdek <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_006">1984</xref>; Zhao <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_043">2013</xref>) introduces the concept of membership degree, so that each object can belong to two or more clusters, each with a certain membership degree. FCM is an extension of hard <italic>k</italic>-means clustering, and the rich information conveyed by the membership degree and fuzzifier has further expanded its application areas. The FCM algorithm was first proposed by Dunn and generalized by Bezdek (Dunn, <xref ref-type="bibr" rid="j_info1224_ref_013">1973</xref>; Bezdek, <xref ref-type="bibr" rid="j_info1224_ref_005">1981</xref>), and it has become a popular and widely used fuzzy clustering method in pattern recognition (Ahmed <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_001">2002</xref>; Dembélé and Kastner, <xref ref-type="bibr" rid="j_info1224_ref_012">2003</xref>; Park, <xref ref-type="bibr" rid="j_info1224_ref_032">2009</xref>; Hou <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_018">2007</xref>).</p>
<p>However, the fuzzifier, also known as the weighting exponent or fuzziness parameter, is an important parameter that can significantly influence the performance of FCM clustering (Pal and Bezdek, <xref ref-type="bibr" rid="j_info1224_ref_030">1995</xref>). Considerable research effort has focused on the selection of the fuzzifier, and many suggestions have been proposed (Cannon <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_009">1986</xref>; Hall <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_015">1992</xref>; Shen <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_034">2001</xref>; Ozkan and Turksen, <xref ref-type="bibr" rid="j_info1224_ref_028">2004</xref>, <xref ref-type="bibr" rid="j_info1224_ref_029">2007</xref>; Wu, <xref ref-type="bibr" rid="j_info1224_ref_040">2012</xref>). Nevertheless, there is still no generally accepted criterion and little theoretical guidance for the selection of the fuzzifier in FCM (Fadili <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_014">2001</xref>). In many cases, users select the fuzzifier value subjectively when using FCM clustering.</p>
<p>In addition, the distributions of many data sets are not uniform in practical applications (Wu <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_039">2012</xref>). It has been demonstrated that clustering performance is always affected by data distributions (Xiong <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_041">2009</xref>; Wu <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_038">2009c</xref>). In our previous work (Zhou and Yang, <xref ref-type="bibr" rid="j_info1224_ref_044">2016</xref>), we also found that FCM exhibits a uniform effect similar to that of <italic>k</italic>-means clustering. The clustering results of FCM can be significantly influenced by the cluster size distribution. Therefore, to improve the performance of FCM on data sets with different cluster size distributions, it is important to select an appropriate fuzzifier value. In this study, a new fuzzifier selection criterion and a corresponding algorithm, called the CSD-m algorithm, are proposed from the perspective of cluster size distribution. The cluster size distribution mainly refers to the variation of cluster sizes. First, we use the coefficient of variation (CV) to measure the variation in cluster sizes. Then, the values of DCV, which indicate the change of variation in cluster sizes after FCM clustering, are calculated iteratively with different fuzzifier values within an initial search interval. Finally, the fuzzifier value with the minimum absolute value of DCV is selected as optimal. Our experiments on both synthetic data sets and real-world data sets illustrate the effectiveness of the proposed criterion and CSD-m algorithm. The experimental results also reveal that the widely used fuzzifier value <inline-formula id="j_info1224_ineq_005"><alternatives>
<mml:math><mml:mi mathvariant="italic">m</mml:mi><mml:mo>=</mml:mo><mml:mn>2</mml:mn></mml:math>
<tex-math><![CDATA[$m=2$]]></tex-math></alternatives></inline-formula> is not optimal for many data sets, especially for data sets with large variation in cluster sizes.</p>
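<p>As a rough illustration of the selection rule outlined above, the following sketch picks the candidate fuzzifier whose clustering result changes the CV of cluster sizes the least. It assumes that CV is the sample standard deviation of the cluster sizes divided by their mean, and that DCV is the original CV minus the CV after clustering. The function names and the toy cluster sizes are hypothetical and are not results from this study; the cluster sizes after FCM are taken as given inputs rather than computed here.</p>
<preformat preformat-type="code"><![CDATA[
```python
from statistics import mean, stdev


def coefficient_of_variation(sizes):
    """CV of cluster sizes: sample standard deviation over mean (assumed form)."""
    return stdev(sizes) / mean(sizes)


def select_fuzzifier(original_sizes, sizes_by_m):
    """Return the candidate m with the smallest |DCV| and all DCV values.

    original_sizes: cluster sizes of the original (labelled) data set.
    sizes_by_m: maps each candidate m to the cluster sizes FCM produced.
    """
    cv0 = coefficient_of_variation(original_sizes)
    # Assumed sign convention: DCV = CV before clustering - CV after clustering.
    dcv = {}
    for m, sizes in sizes_by_m.items():
        dcv[m] = cv0 - coefficient_of_variation(sizes)
    best_m = min(dcv, key=lambda m: abs(dcv[m]))
    return best_m, dcv


# Hypothetical sizes for three clusters; larger m pulls the sizes together.
original_sizes = [100, 300, 600]
sizes_by_m = {
    1.5: [110, 310, 580],
    2.0: [250, 330, 420],
    2.5: [320, 333, 347],
}
best_m, dcv = select_fuzzifier(original_sizes, sizes_by_m)
print(best_m)  # 1.5
```
]]></preformat>
<p>Because only the absolute value of DCV enters the selection, the sign convention assumed for DCV does not change which candidate is chosen.</p>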
<p>The fuzzifier, denoted as <italic>m</italic> in FCM, is an important parameter which can significantly influence the performance of FCM clustering. Many studies have addressed fuzzifier selection. Bezdek proposed an interval for the fuzzifier, <inline-formula id="j_info1224_ineq_006"><alternatives>
<mml:math><mml:mn>1.1</mml:mn><mml:mo>⩽</mml:mo><mml:mi mathvariant="italic">m</mml:mi><mml:mo>⩽</mml:mo><mml:mn>5</mml:mn></mml:math>
<tex-math><![CDATA[$1.1\leqslant m\leqslant 5$]]></tex-math></alternatives></inline-formula>, based on experience (Bezdek, <xref ref-type="bibr" rid="j_info1224_ref_005">1981</xref>). Pal and Bezdek presented a heuristic criterion for the selection of the optimal fuzzifier value, and the interval they suggested was <inline-formula id="j_info1224_ineq_007"><alternatives>
<mml:math><mml:mo fence="true" stretchy="false">[</mml:mo><mml:mn>1.5</mml:mn><mml:mo mathvariant="normal">,</mml:mo><mml:mn>2.5</mml:mn><mml:mo fence="true" stretchy="false">]</mml:mo></mml:math>
<tex-math><![CDATA[$[1.5,2.5]$]]></tex-math></alternatives></inline-formula> (Pal and Bezdek, <xref ref-type="bibr" rid="j_info1224_ref_030">1995</xref>). They also pointed out that the midpoint of this interval, namely <inline-formula id="j_info1224_ineq_008"><alternatives>
<mml:math><mml:mi mathvariant="italic">m</mml:mi><mml:mo>=</mml:mo><mml:mn>2</mml:mn></mml:math>
<tex-math><![CDATA[$m=2$]]></tex-math></alternatives></inline-formula>, can be selected when there are no other specific constraints. Some studies (Cannon <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_009">1986</xref>; Hall <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_015">1992</xref>; Shen <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_034">2001</xref>) made suggestions similar to those of Pal and Bezdek (<xref ref-type="bibr" rid="j_info1224_ref_030">1995</xref>). In addition, Bezdek studied the physical interpretation of FCM when <inline-formula id="j_info1224_ineq_009"><alternatives>
<mml:math><mml:mi mathvariant="italic">m</mml:mi><mml:mo>=</mml:mo><mml:mn>2</mml:mn></mml:math>
<tex-math><![CDATA[$m=2$]]></tex-math></alternatives></inline-formula> and pointed out that <inline-formula id="j_info1224_ineq_010"><alternatives>
<mml:math><mml:mi mathvariant="italic">m</mml:mi><mml:mo>=</mml:mo><mml:mn>2</mml:mn></mml:math>
<tex-math><![CDATA[$m=2$]]></tex-math></alternatives></inline-formula> was the best selection (Bezdek, <xref ref-type="bibr" rid="j_info1224_ref_004">1976</xref>). The study of Bezdek <italic>et al.</italic> further demonstrated that the value of <italic>m</italic> should be greater than <inline-formula id="j_info1224_ineq_011"><alternatives>
<mml:math><mml:mi mathvariant="italic">n</mml:mi><mml:mo mathvariant="normal" stretchy="false">/</mml:mo><mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo><mml:mi mathvariant="italic">n</mml:mi><mml:mo>−</mml:mo><mml:mn>2</mml:mn><mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math>
<tex-math><![CDATA[$n/(n-2)$]]></tex-math></alternatives></inline-formula>, where <italic>n</italic> is the total number of sample objects (Bezdek <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_007">1987</xref>). Based on their work on word recognition, Chan and Cheung suggested that the value range of <italic>m</italic> should be <inline-formula id="j_info1224_ineq_012"><alternatives>
<mml:math><mml:mo fence="true" stretchy="false">[</mml:mo><mml:mn>1.25</mml:mn><mml:mo mathvariant="normal">,</mml:mo><mml:mn>1.75</mml:mn><mml:mo fence="true" stretchy="false">]</mml:mo></mml:math>
<tex-math><![CDATA[$[1.25,1.75]$]]></tex-math></alternatives></inline-formula> (Chan and Cheung, <xref ref-type="bibr" rid="j_info1224_ref_010">1992</xref>). However, Choe and Jordan pointed out, based on fuzzy decision theory, that the performance of FCM is not sensitive to the value of <italic>m</italic> (Choe and Jordan, <xref ref-type="bibr" rid="j_info1224_ref_011">1992</xref>). Ozkan and Turksen presented an entropy assessment for <italic>m</italic> considering the uncertainty contained (Ozkan and Turksen, <xref ref-type="bibr" rid="j_info1224_ref_028">2004</xref>). To obtain the uncertainty generated by <italic>m</italic> in FCM, Ozkan and Turksen also identified the lower and upper values of <italic>m</italic> as 1.4 and 2.6, respectively (Ozkan and Turksen, <xref ref-type="bibr" rid="j_info1224_ref_029">2007</xref>). Wu proposed a new guideline for the selection of <italic>m</italic> based on a robust analysis of FCM, and suggested implementing FCM with <inline-formula id="j_info1224_ineq_013"><alternatives>
<mml:math><mml:mi mathvariant="italic">m</mml:mi><mml:mo stretchy="false">∈</mml:mo><mml:mo fence="true" stretchy="false">[</mml:mo><mml:mn>1.5</mml:mn><mml:mo mathvariant="normal">,</mml:mo><mml:mn>4</mml:mn><mml:mo fence="true" stretchy="false">]</mml:mo></mml:math>
<tex-math><![CDATA[$m\in [1.5,4]$]]></tex-math></alternatives></inline-formula> (Wu, <xref ref-type="bibr" rid="j_info1224_ref_040">2012</xref>).</p>
<p>In summary, there is still no widely accepted criterion and little theoretical support for the selection of the fuzzifier in FCM (Pal and Bezdek, <xref ref-type="bibr" rid="j_info1224_ref_030">1995</xref>; Yu <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_042">2004</xref>). In most practical applications, the value of the fuzzifier is selected subjectively by users, and <inline-formula id="j_info1224_ineq_014"><alternatives>
<mml:math><mml:mi mathvariant="italic">m</mml:mi><mml:mo>=</mml:mo><mml:mn>2</mml:mn></mml:math>
<tex-math><![CDATA[$m=2$]]></tex-math></alternatives></inline-formula> is the most common selection (Pal and Bezdek, <xref ref-type="bibr" rid="j_info1224_ref_030">1995</xref>; Cannon <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_009">1986</xref>; Hall <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_015">1992</xref>; Shen <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_034">2001</xref>). However, this selection may not always be optimal, and an inappropriate fuzzifier value can significantly affect the clustering results of FCM. Additionally, few of the above studies have considered the cluster size distribution when studying fuzzifier selection. The characteristics of the cluster size distribution may have an impact on the performance of FCM clustering. The fuzzifier is a key parameter that influences the clustering results of FCM. Furthermore, some studies presented only range intervals of empirical reference values, without a specific criterion or method for selecting the optimal fuzzifier value in practical applications. Therefore, the motivation of this study is to explore and measure the influence of the fuzzifier value on FCM clustering results, and to further investigate fuzzifier selection from a cluster size distribution perspective. The main contributions of this study are as follows. First, the mechanism by which the fuzzifier influences the FCM clustering result is revealed. Second, we point out that the widely used fuzzifier value <inline-formula id="j_info1224_ineq_015"><alternatives>
<mml:math><mml:mi mathvariant="italic">m</mml:mi><mml:mo>=</mml:mo><mml:mn>2</mml:mn></mml:math>
<tex-math><![CDATA[$m=2$]]></tex-math></alternatives></inline-formula> is not optimal for many data sets with large variation in cluster sizes. Third, a criterion and the CSD-m algorithm for fuzzifier selection in FCM are presented from the cluster size distribution perspective.</p>
<p>We note that, for a given data set, “data distribution” typically covers many characteristics, such as shapes, densities and dimensions, whereas the focus of this study is the cluster size distribution of data sets. We therefore use cluster size distribution to denote the variation in cluster sizes of a data set.</p>
<p>The remainder of this paper is organized as follows. The FCM clustering algorithm is briefly reviewed in Section <xref rid="j_info1224_s_002">2</xref>. In Section <xref rid="j_info1224_s_003">3</xref>, we propose the fuzzifier selection criterion from the cluster size distribution perspective and the corresponding algorithm, called the CSD-m algorithm. Experimental results and discussion are presented in Section <xref rid="j_info1224_s_006">4</xref>. Finally, conclusions are drawn in Section <xref rid="j_info1224_s_009">5</xref>.</p>
</sec>
<sec id="j_info1224_s_002">
<label>2</label>
<title>FCM Clustering</title>
<p>The FCM algorithm (Bezdek <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_006">1984</xref>; Bezdek, <xref ref-type="bibr" rid="j_info1224_ref_005">1981</xref>) starts by determining the number of clusters and guessing the initial cluster centres. Every sample point is then assigned a membership degree for each cluster. The cluster centres and the corresponding membership degrees are updated iteratively to minimize the objective function until a stopping criterion is met. The stopping criteria are mainly that the iteration count <italic>t</italic> reaches the maximum number <inline-formula id="j_info1224_ineq_016"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">t</mml:mi></mml:mrow><mml:mrow><mml:mo movablelimits="false">max</mml:mo></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${t_{\max }}$]]></tex-math></alternatives></inline-formula>, or the difference of the cluster centres between two consecutive iterations is within a small enough threshold <italic>ε</italic>, i.e. <inline-formula id="j_info1224_ineq_017"><alternatives>
<mml:math><mml:mo stretchy="false">‖</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="italic">v</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi><mml:mo mathvariant="normal">,</mml:mo><mml:mi mathvariant="italic">t</mml:mi></mml:mrow></mml:msub><mml:mo>−</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="italic">v</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi><mml:mo mathvariant="normal">,</mml:mo><mml:mi mathvariant="italic">t</mml:mi><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo stretchy="false">‖</mml:mo><mml:mo>⩽</mml:mo><mml:mi mathvariant="italic">ε</mml:mi></mml:math>
<tex-math><![CDATA[$\| {v_{i,t}}-{v_{i,t-1}}\| \leqslant \varepsilon $]]></tex-math></alternatives></inline-formula>. The objective function of the FCM algorithm is defined as: 
<disp-formula id="j_info1224_eq_001">
<label>(1)</label><alternatives>
<mml:math display="block"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi mathvariant="italic">J</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">m</mml:mi></mml:mrow></mml:msub><mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo><mml:mi mathvariant="italic">U</mml:mi><mml:mo mathvariant="normal">,</mml:mo><mml:mi mathvariant="italic">V</mml:mi><mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo><mml:mo>=</mml:mo>
<mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mstyle displaystyle="true"><mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi mathvariant="italic">c</mml:mi></mml:mrow></mml:munderover>
<mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mstyle displaystyle="true"><mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle></mml:mrow><mml:mrow><mml:mi mathvariant="italic">j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi mathvariant="italic">n</mml:mi></mml:mrow></mml:munderover><mml:msubsup><mml:mrow><mml:mi mathvariant="italic">μ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi><mml:mi mathvariant="italic">j</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">m</mml:mi></mml:mrow></mml:msubsup><mml:msubsup><mml:mrow><mml:mi mathvariant="italic">d</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi><mml:mi mathvariant="italic">j</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo mathvariant="normal">,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
<tex-math><![CDATA[\[ {J_{m}}(U,V)={\sum \limits_{i=1}^{c}}{\sum \limits_{j=1}^{n}}{\mu _{ij}^{m}}{d_{ij}^{2}},\]]]></tex-math></alternatives>
</disp-formula> 
where <italic>U</italic> is the membership degree matrix. <italic>V</italic> is the matrix of cluster centres. <italic>n</italic> is the total number of data objects in the data set. <italic>c</italic> is the number of clusters. <italic>m</italic> is the fuzzifier. <inline-formula id="j_info1224_ineq_018"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">μ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi><mml:mi mathvariant="italic">j</mml:mi></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\mu _{ij}}$]]></tex-math></alternatives></inline-formula> is the membership degree of the <italic>j</italic>th data object <inline-formula id="j_info1224_ineq_019"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">x</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">j</mml:mi></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${x_{j}}$]]></tex-math></alternatives></inline-formula> to the <italic>i</italic>th cluster <inline-formula id="j_info1224_ineq_020"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">C</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${C_{i}}$]]></tex-math></alternatives></inline-formula>. <inline-formula id="j_info1224_ineq_021"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">v</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${v_{i}}$]]></tex-math></alternatives></inline-formula> is the cluster centre of <inline-formula id="j_info1224_ineq_022"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">C</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${C_{i}}$]]></tex-math></alternatives></inline-formula>. <inline-formula id="j_info1224_ineq_023"><alternatives>
<mml:math><mml:msubsup><mml:mrow><mml:mi mathvariant="italic">d</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi><mml:mi mathvariant="italic">j</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math>
<tex-math><![CDATA[${d_{ij}^{2}}$]]></tex-math></alternatives></inline-formula> is the squared Euclidean distance between <inline-formula id="j_info1224_ineq_024"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">x</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">j</mml:mi></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${x_{j}}$]]></tex-math></alternatives></inline-formula> and the cluster centre <inline-formula id="j_info1224_ineq_025"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">v</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${v_{i}}$]]></tex-math></alternatives></inline-formula>, and <inline-formula id="j_info1224_ineq_026"><alternatives>
<mml:math><mml:msubsup><mml:mrow><mml:mi mathvariant="italic">d</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi><mml:mi mathvariant="italic">j</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mo stretchy="false">‖</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="italic">x</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">j</mml:mi></mml:mrow></mml:msub><mml:mo>−</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="italic">v</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi></mml:mrow></mml:msub><mml:msup><mml:mrow><mml:mo stretchy="false">‖</mml:mo></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math>
<tex-math><![CDATA[${d_{ij}^{2}}=\| {x_{j}}-{v_{i}}{\| ^{2}}$]]></tex-math></alternatives></inline-formula>.</p>
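<p>To make the objective function in Eq. (1) concrete, the following minimal sketch evaluates it for a given membership matrix, set of cluster centres and data set. The function name, argument names and toy data are illustrative assumptions rather than part of the original algorithm description, and plain Python lists are used instead of a numerical library to keep the example self-contained.</p>
<preformat preformat-type="code"><![CDATA[
```python
from typing import Sequence


def fcm_objective(
    memberships: Sequence[Sequence[float]],
    centres: Sequence[Sequence[float]],
    points: Sequence[Sequence[float]],
    m: float = 2.0,
) -> float:
    """Evaluate J_m(U, V) as the double sum in Eq. (1).

    memberships[i][j] is mu_ij (cluster i, object j), centres[i] is v_i
    and points[j] is x_j. All names here are illustrative.
    """
    total = 0.0
    for mu_row, v in zip(memberships, centres):
        for mu, x in zip(mu_row, points):
            # Squared Euclidean distance d_ij^2 = ||x_j - v_i||^2.
            d2 = sum((xk - vk) ** 2 for xk, vk in zip(x, v))
            total += (mu ** m) * d2
    return total


# Toy data: four one-dimensional objects and two cluster centres.
points = [[0.0], [1.0], [9.0], [10.0]]
centres = [[0.5], [9.5]]

# Crisp memberships: each object belongs fully to its nearest centre.
crisp = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
print(fcm_objective(crisp, centres, points, m=2.0))  # 1.0

# Maximally fuzzy memberships: every object is shared equally.
fuzzy = [[0.5, 0.5, 0.5, 0.5], [0.5, 0.5, 0.5, 0.5]]
print(fcm_objective(fuzzy, centres, points, m=2.0))
```
]]></preformat>
<p>With crisp memberships (each membership degree equal to 0 or 1), the value reduces to the within-cluster sum of squared distances minimized by <italic>k</italic>-means, which is one way to see FCM as an extension of hard <italic>k</italic>-means clustering.</p>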
<p>In the iterative procedure, membership degree <inline-formula id="j_info1224_ineq_027"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">μ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi><mml:mi mathvariant="italic">j</mml:mi></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\mu _{ij}}$]]></tex-math></alternatives></inline-formula> and the cluster centres <inline-formula id="j_info1224_ineq_028"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">v</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${v_{i}}$]]></tex-math></alternatives></inline-formula> are updated by: <disp-formula-group id="j_info1224_dg_001">
<disp-formula id="j_info1224_eq_002">
<label>(2)</label><alternatives>
<mml:math display="block"><mml:mtable displaystyle="true" columnalign="right left" columnspacing="0pt"><mml:mtr><mml:mtd class="align-odd"/><mml:mtd class="align-even"><mml:msub><mml:mrow><mml:mi mathvariant="italic">μ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi><mml:mi mathvariant="italic">j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msubsup><mml:mrow><mml:mo largeop="false" movablelimits="false">∑</mml:mo></mml:mrow><mml:mrow><mml:mi mathvariant="italic">k</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi mathvariant="italic">c</mml:mi></mml:mrow></mml:msubsup><mml:msup><mml:mrow><mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo><mml:mstyle displaystyle="false"><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi mathvariant="italic">d</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi><mml:mi mathvariant="italic">j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi mathvariant="italic">d</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">k</mml:mi><mml:mi mathvariant="italic">j</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mstyle><mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mstyle displaystyle="false"><mml:mfrac><mml:mrow><mml:mn>2</mml:mn></mml:mrow><mml:mrow><mml:mi mathvariant="italic">m</mml:mi><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:mfrac></mml:mstyle></mml:mrow></mml:msup></mml:mrow></mml:mfrac></mml:mstyle><mml:mo mathvariant="normal">,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
<tex-math><![CDATA[\[\begin{aligned}{}& {\mu _{ij}}=\frac{1}{{\textstyle\sum _{k=1}^{c}}{(\frac{{d_{ij}}}{{d_{kj}}})^{\frac{2}{m-1}}}},\end{aligned}\]]]></tex-math></alternatives>
</disp-formula>
<disp-formula id="j_info1224_eq_003">
<label>(3)</label><alternatives>
<mml:math display="block"><mml:mtable displaystyle="true" columnalign="right left" columnspacing="0pt"><mml:mtr><mml:mtd class="align-odd"/><mml:mtd class="align-even"><mml:msub><mml:mrow><mml:mi mathvariant="italic">v</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac><mml:mrow><mml:msubsup><mml:mrow><mml:mo largeop="false" movablelimits="false">∑</mml:mo></mml:mrow><mml:mrow><mml:mi mathvariant="italic">j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi mathvariant="italic">n</mml:mi></mml:mrow></mml:msubsup><mml:msubsup><mml:mrow><mml:mi mathvariant="italic">μ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi><mml:mi mathvariant="italic">j</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">m</mml:mi></mml:mrow></mml:msubsup><mml:msub><mml:mrow><mml:mi mathvariant="italic">x</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msubsup><mml:mrow><mml:mo largeop="false" movablelimits="false">∑</mml:mo></mml:mrow><mml:mrow><mml:mi mathvariant="italic">j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi mathvariant="italic">n</mml:mi></mml:mrow></mml:msubsup><mml:msubsup><mml:mrow><mml:mi mathvariant="italic">μ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi><mml:mi mathvariant="italic">j</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">m</mml:mi></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:mstyle><mml:mo mathvariant="normal">,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
<tex-math><![CDATA[\[\begin{aligned}{}& {v_{i}}=\frac{{\textstyle\sum _{j=1}^{n}}{\mu _{ij}^{m}}{x_{j}}}{{\textstyle\sum _{j=1}^{n}}{\mu _{ij}^{m}}},\end{aligned}\]]]></tex-math></alternatives>
</disp-formula>
</disp-formula-group> where <inline-formula id="j_info1224_ineq_029"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">μ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi><mml:mi mathvariant="italic">j</mml:mi></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\mu _{ij}}$]]></tex-math></alternatives></inline-formula> satisfies <disp-formula-group id="j_info1224_dg_002">
<disp-formula id="j_info1224_eq_004">
<label>(4)</label><alternatives>
<mml:math display="block"><mml:mtable displaystyle="true" columnalign="right left" columnspacing="0pt"><mml:mtr><mml:mtd class="align-odd"/><mml:mtd class="align-even"><mml:msub><mml:mrow><mml:mi mathvariant="italic">μ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi><mml:mi mathvariant="italic">j</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">∈</mml:mo><mml:mo fence="true" stretchy="false">[</mml:mo><mml:mn>0</mml:mn><mml:mo mathvariant="normal">,</mml:mo><mml:mn>1</mml:mn><mml:mo fence="true" stretchy="false">]</mml:mo><mml:mo mathvariant="normal">,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
<tex-math><![CDATA[\[\begin{aligned}{}& {\mu _{ij}}\in [0,1],\end{aligned}\]]]></tex-math></alternatives>
</disp-formula>
<disp-formula id="j_info1224_eq_005">
<label>(5)</label><alternatives>
<mml:math display="block"><mml:mtable displaystyle="true" columnalign="right left" columnspacing="0pt"><mml:mtr><mml:mtd class="align-odd"/><mml:mtd class="align-even">
<mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mstyle displaystyle="true"><mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi mathvariant="italic">c</mml:mi></mml:mrow></mml:munderover><mml:msub><mml:mrow><mml:mi mathvariant="italic">μ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi><mml:mi mathvariant="italic">j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo mathvariant="normal">,</mml:mo><mml:mspace width="1em"/><mml:mo>∀</mml:mo><mml:mi mathvariant="italic">j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo mathvariant="normal">,</mml:mo><mml:mo>…</mml:mo><mml:mo mathvariant="normal">,</mml:mo><mml:mi mathvariant="italic">n</mml:mi><mml:mo mathvariant="normal">,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
<tex-math><![CDATA[\[\begin{aligned}{}& {\sum \limits_{i=1}^{c}}{\mu _{ij}}=1,\hspace{1em}\forall j=1,\dots ,n,\end{aligned}\]]]></tex-math></alternatives>
</disp-formula>
<disp-formula id="j_info1224_eq_006">
<label>(6)</label><alternatives>
<mml:math display="block"><mml:mtable displaystyle="true" columnalign="right left" columnspacing="0pt"><mml:mtr><mml:mtd class="align-odd"/><mml:mtd class="align-even"><mml:mn>0</mml:mn><mml:mo mathvariant="normal">&lt;</mml:mo>
<mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mstyle displaystyle="true"><mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle></mml:mrow><mml:mrow><mml:mi mathvariant="italic">j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi mathvariant="italic">n</mml:mi></mml:mrow></mml:munderover><mml:msub><mml:mrow><mml:mi mathvariant="italic">μ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi><mml:mi mathvariant="italic">j</mml:mi></mml:mrow></mml:msub><mml:mo mathvariant="normal">&lt;</mml:mo><mml:mi mathvariant="italic">n</mml:mi><mml:mo mathvariant="normal">,</mml:mo><mml:mspace width="1em"/><mml:mo>∀</mml:mo><mml:mi mathvariant="italic">i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo mathvariant="normal">,</mml:mo><mml:mo stretchy="false">⋯</mml:mo><mml:mspace width="0.1667em"/><mml:mo mathvariant="normal">,</mml:mo><mml:mi mathvariant="italic">c</mml:mi><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
<tex-math><![CDATA[\[\begin{aligned}{}& 0<{\sum \limits_{j=1}^{n}}{\mu _{ij}}<n,\hspace{1em}\forall i=1,\cdots \hspace{0.1667em},c.\end{aligned}\]]]></tex-math></alternatives>
</disp-formula>
</disp-formula-group> The meanings of the symbols in Eq. (<xref rid="j_info1224_eq_002">2</xref>) to Eq. (<xref rid="j_info1224_eq_006">6</xref>) are the same as those in Eq. (<xref rid="j_info1224_eq_001">1</xref>).</p>
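<p>As a minimal sketch of one alternating update in Eq. (2) and Eq. (3), the following NumPy code may help. The function name <italic>fcm_update</italic>, the array layout (objects as rows of X, centres as rows of V) and the use of the Euclidean distance are assumptions made for illustration only.</p>
<preformat>
```python
import numpy as np

def fcm_update(X, V, m):
    """One update of memberships (Eq. 2) and centres (Eq. 3).

    X: (n, p) array of objects, V: (c, p) array of centres, m: fuzzifier.
    Assumes Euclidean distances between centres and objects.
    """
    # d[i, j]: distance between centre v_i and object x_j
    d = np.linalg.norm(V[:, None, :] - X[None, :, :], axis=2)
    d = np.maximum(d, 1e-12)  # guard against division by zero
    # Eq. (2): mu_ij = 1 / sum_k (d_ij / d_kj)^(2 / (m - 1))
    ratio = d[:, None, :] / d[None, :, :]  # ratio[i, k, j] = d_ij / d_kj
    U = 1.0 / np.sum(ratio ** (2.0 / (m - 1.0)), axis=1)
    # Eq. (3): v_i = sum_j mu_ij^m x_j / sum_j mu_ij^m
    W = U ** m
    V_new = (W @ X) / W.sum(axis=1, keepdims=True)
    return U, V_new

# Illustrative data: two tight groups of two points each
X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
V = np.array([[0.0, 0.5], [5.0, 5.5]])
U, V_new = fcm_update(X, V, m=2.0)
print(U.sum(axis=0))  # each column sums to 1, as required by Eq. (5)
```
</preformat>
<p>By construction, the memberships returned by this sketch lie in [0, 1] and sum to one over the clusters for every object, which matches Eq. (4) and Eq. (5).</p>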
<p>The basic FCM algorithm is summarized in Algorithm <xref rid="j_info1224_fig_001">1</xref>.</p>
<fig id="j_info1224_fig_001">
<label>Algorithm 1</label>
<caption>
<p>Fuzzy c-means (FCM)</p>
</caption>
<graphic xlink:href="info1224_g001.jpg"/>
</fig>
<p>The flow chart of the FCM algorithm is shown in Fig. <xref rid="j_info1224_fig_002">1</xref>.</p>
<fig id="j_info1224_fig_002">
<label>Fig. 1</label>
<caption>
<p>Flow chart of FCM clustering.</p>
</caption>
<graphic xlink:href="info1224_g002.jpg"/>
</fig>
</sec>
<sec id="j_info1224_s_003">
<label>3</label>
<title>Fuzzifier Selection Method from Cluster Size Distribution Perspective</title>
<sec id="j_info1224_s_004">
<label>3.1</label>
<title>Measure of Cluster Size Distribution</title>
<p>The coefficient of variation (<inline-formula id="j_info1224_ineq_030"><alternatives>
<mml:math><mml:mi mathvariant="italic">CV</mml:mi></mml:math>
<tex-math><![CDATA[$\mathit{CV}$]]></tex-math></alternatives></inline-formula>) (Papoulis, <xref ref-type="bibr" rid="j_info1224_ref_031">1990</xref>) in statistics can be used as a measure for the variation in cluster sizes of a data set (Xiong <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_041">2009</xref>; Wu <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_038">2009c</xref>). <statement id="j_info1224_stat_001"><label>Definition 1</label>
<title><italic>(Coefficient of Variation,</italic> <inline-formula id="j_info1224_ineq_031"><alternatives>
<mml:math><mml:mi mathvariant="italic">CV</mml:mi></mml:math>
<tex-math><![CDATA[$\mathit{CV}$]]></tex-math></alternatives></inline-formula><italic>).</italic></title>
<p><inline-formula id="j_info1224_ineq_032"><alternatives>
<mml:math><mml:mi mathvariant="italic">CV</mml:mi></mml:math>
<tex-math><![CDATA[$\mathit{CV}$]]></tex-math></alternatives></inline-formula> is the ratio of the standard deviation to the mean of cluster sizes, which is calculated as follows: <disp-formula-group id="j_info1224_dg_003">
<disp-formula id="j_info1224_eq_007">
<label>(7)</label><alternatives>
<mml:math display="block"><mml:mtable displaystyle="true" columnalign="right left" columnspacing="0pt"><mml:mtr><mml:mtd class="align-odd"/><mml:mtd class="align-even"><mml:mover accent="true"><mml:mrow><mml:mi mathvariant="italic">n</mml:mi></mml:mrow><mml:mo stretchy="false">¯</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi mathvariant="italic">c</mml:mi></mml:mrow></mml:mfrac></mml:mstyle>
<mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mstyle displaystyle="true"><mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi mathvariant="italic">c</mml:mi></mml:mrow></mml:munderover><mml:msub><mml:mrow><mml:mi mathvariant="italic">n</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi></mml:mrow></mml:msub><mml:mo mathvariant="normal">,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
<tex-math><![CDATA[\[\begin{aligned}{}& \bar{n}=\frac{1}{c}{\sum \limits_{i=1}^{c}}{n_{i}},\end{aligned}\]]]></tex-math></alternatives>
</disp-formula>
<disp-formula id="j_info1224_eq_008">
<label>(8)</label><alternatives>
<mml:math display="block"><mml:mtable displaystyle="true" columnalign="right left" columnspacing="0pt"><mml:mtr><mml:mtd class="align-odd"/><mml:mtd class="align-even"><mml:mi mathvariant="italic">σ</mml:mi><mml:mo>=</mml:mo><mml:msqrt><mml:mrow><mml:mstyle displaystyle="true"><mml:mfrac><mml:mrow><mml:msubsup><mml:mrow><mml:mo largeop="false" movablelimits="false">∑</mml:mo></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi mathvariant="italic">c</mml:mi></mml:mrow></mml:msubsup><mml:msup><mml:mrow><mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="italic">n</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi></mml:mrow></mml:msub><mml:mo>−</mml:mo><mml:mover accent="true"><mml:mrow><mml:mi mathvariant="italic">n</mml:mi></mml:mrow><mml:mo stretchy="false">¯</mml:mo></mml:mover><mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mi mathvariant="italic">c</mml:mi><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:mfrac></mml:mstyle></mml:mrow></mml:msqrt><mml:mo mathvariant="normal">,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
<tex-math><![CDATA[\[\begin{aligned}{}& \sigma =\sqrt{\frac{{\textstyle\sum _{i=1}^{c}}{({n_{i}}-\bar{n})^{2}}}{c-1}},\end{aligned}\]]]></tex-math></alternatives>
</disp-formula>
<disp-formula id="j_info1224_eq_009">
<label>(9)</label><alternatives>
<mml:math display="block"><mml:mtable displaystyle="true" columnalign="right left" columnspacing="0pt"><mml:mtr><mml:mtd class="align-odd"/><mml:mtd class="align-even"><mml:mi mathvariant="italic">CV</mml:mi><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac><mml:mrow><mml:mi mathvariant="italic">σ</mml:mi></mml:mrow><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi mathvariant="italic">n</mml:mi></mml:mrow><mml:mo stretchy="false">¯</mml:mo></mml:mover></mml:mrow></mml:mfrac></mml:mstyle><mml:mo mathvariant="normal">,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
<tex-math><![CDATA[\[\begin{aligned}{}& \mathit{CV}=\frac{\sigma }{\bar{n}},\end{aligned}\]]]></tex-math></alternatives>
</disp-formula>
</disp-formula-group> where <italic>c</italic> is the number of clusters, <inline-formula id="j_info1224_ineq_033"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">n</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${n_{i}}$]]></tex-math></alternatives></inline-formula> is the number of objects in cluster <inline-formula id="j_info1224_ineq_034"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">C</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${C_{i}}$]]></tex-math></alternatives></inline-formula>, <inline-formula id="j_info1224_ineq_035"><alternatives>
<mml:math><mml:mover accent="true"><mml:mrow><mml:mi mathvariant="italic">n</mml:mi></mml:mrow><mml:mo stretchy="false">¯</mml:mo></mml:mover></mml:math>
<tex-math><![CDATA[$\bar{n}$]]></tex-math></alternatives></inline-formula> is the average size of all the clusters, and <italic>σ</italic> is the standard deviation of the cluster size distribution.</p>
<p><inline-formula id="j_info1224_ineq_036"><alternatives>
<mml:math><mml:mi mathvariant="italic">CV</mml:mi></mml:math>
<tex-math><![CDATA[$\mathit{CV}$]]></tex-math></alternatives></inline-formula> describes how widely the cluster sizes are spread relative to their mean, since it is the ratio of the standard deviation to the mean of the cluster sizes. <inline-formula id="j_info1224_ineq_037"><alternatives>
<mml:math><mml:mi mathvariant="italic">CV</mml:mi></mml:math>
<tex-math><![CDATA[$\mathit{CV}$]]></tex-math></alternatives></inline-formula> is a dimensionless measure, so its values can be compared across data sets with different numbers of objects. Generally, the larger the <inline-formula id="j_info1224_ineq_038"><alternatives>
<mml:math><mml:mi mathvariant="italic">CV</mml:mi></mml:math>
<tex-math><![CDATA[$\mathit{CV}$]]></tex-math></alternatives></inline-formula> value is, the greater the variability is in the data.</p></statement><statement id="j_info1224_stat_002"><label>Definition 2</label>
<title><italic>(DCV).</italic></title>
<p><inline-formula id="j_info1224_ineq_039"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">CV</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\mathit{CV}_{0}}$]]></tex-math></alternatives></inline-formula> is the <inline-formula id="j_info1224_ineq_040"><alternatives>
<mml:math><mml:mi mathvariant="italic">CV</mml:mi></mml:math>
<tex-math><![CDATA[$\mathit{CV}$]]></tex-math></alternatives></inline-formula> value of the original “true” clusters, and <inline-formula id="j_info1224_ineq_041"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">CV</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\mathit{CV}_{1}}$]]></tex-math></alternatives></inline-formula> is the CV value of the partition produced by FCM. DCV is defined as the change in the variation of cluster sizes after FCM clustering (Zhou and Yang, <xref ref-type="bibr" rid="j_info1224_ref_044">2016</xref>; Wu <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_036">2009a</xref>, <xref ref-type="bibr" rid="j_info1224_ref_037">2009b</xref>):
<disp-formula id="j_info1224_eq_010">
<label>(10)</label><alternatives>
<mml:math display="block"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd><mml:mi mathvariant="italic">DCV</mml:mi><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="italic">CV</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>−</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="italic">CV</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
<tex-math><![CDATA[\[ \mathit{DCV}={\mathit{CV}_{0}}-{\mathit{CV}_{1}}.\]]]></tex-math></alternatives>
</disp-formula> 
From the perspective of cluster size distribution, a clustering partition that causes only a minor change in the variation of cluster sizes (i.e., a smaller absolute value of <inline-formula id="j_info1224_ineq_042"><alternatives>
<mml:math><mml:mi mathvariant="italic">DCV</mml:mi></mml:math>
<tex-math><![CDATA[$\mathit{DCV}$]]></tex-math></alternatives></inline-formula>) corresponds to a steady state of the clustering result. Based on this, we propose a criterion for fuzzifier selection in FCM from the cluster size distribution perspective.</p></statement><statement id="j_info1224_stat_003"><label>Criterion 1</label>
<title><italic>(Fuzzifier selection criterion from cluster size distribution perspective).</italic></title>
<p>Within a given range of fuzzifier values, the value for which FCM clustering yields the minimum absolute value of DCV is the optimal selection.</p>
<p>We note that DCV is an indication that the clustering process has reached a steady state; it does not necessarily indicate a better partition. However, when a specific data set is clustered by FCM with different fuzzifier values, the changes in the resulting distribution are mainly reflected in the cluster sizes. Therefore, Criterion 1 can be considered valid to a certain extent.</p></statement></p>
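<p>A small worked example of Eq. (7) to Eq. (10) is sketched below. The function names are illustrative assumptions. The sizes 100 and 900 are those of the two clusters of SD21000 listed in Table 2, while the sizes 300 and 700 are hypothetical values standing in for an FCM partition.</p>
<preformat>
```python
import math

def coefficient_of_variation(sizes):
    """CV of a list of cluster sizes, following Eqs. (7)-(9)."""
    c = len(sizes)
    mean = sum(sizes) / c                                   # Eq. (7)
    sigma = math.sqrt(
        sum((n_i - mean) ** 2 for n_i in sizes) / (c - 1)   # Eq. (8)
    )
    return sigma / mean                                     # Eq. (9)

def dcv(true_sizes, fcm_sizes):
    """DCV = CV0 - CV1, following Eq. (10)."""
    return coefficient_of_variation(true_sizes) - coefficient_of_variation(fcm_sizes)

cv0 = coefficient_of_variation([100, 900])
print(round(cv0, 3))  # 1.131

# Hypothetical FCM result that evens out the two cluster sizes
print(round(dcv([100, 900], [300, 700]), 3))  # 0.566
```
</preformat>
<p>The first printed value agrees with the CV0 reported for SD21000 in Table 2. In the hypothetical second case, DCV is positive because the partition has a smaller variation in cluster sizes than the true clusters.</p>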
</sec>
<sec id="j_info1224_s_005">
<label>3.2</label>
<title>CSD-m Algorithm for Fuzzifier Selection</title>
<p>Based on the fuzzifier selection criterion from the cluster size distribution perspective, we propose a fuzzifier selection algorithm that considers the change of variation in cluster sizes. The algorithm is called the cluster size distribution based fuzzifier <italic>m</italic> selection algorithm (CSD-m algorithm) and is described in Algorithm <xref rid="j_info1224_fig_003">2</xref>.</p>
<fig id="j_info1224_fig_003">
<label>Algorithm 2</label>
<caption>
<p>CSD-m algorithm</p>
</caption>
<graphic xlink:href="info1224_g003.jpg"/>
</fig>
<p>The flow chart of the proposed CSD-m algorithm is shown in Fig. <xref rid="j_info1224_fig_004">2</xref>.</p>
<fig id="j_info1224_fig_004">
<label>Fig. 2</label>
<caption>
<p>Flow chart of the CSD-m algorithm.</p>
</caption>
<graphic xlink:href="info1224_g004.jpg"/>
</fig>
<p>The CSD-m algorithm extends the traditional FCM algorithm with two components: the DCV measure for the change of variation in cluster sizes after FCM clustering, and a search over fuzzifier values in a given interval. Apart from the number of clusters and the initial cluster centres, the CSD-m algorithm also takes the search interval of fuzzifier values as input. This interval can be determined according to the existing suggestions, as discussed in Section <xref rid="j_info1224_s_002">2</xref>. The key steps of the CSD-m algorithm are the calculation of the CV values of the partitions produced by FCM clustering with different fuzzifier values, and the comparison of the absolute DCV values. Through the iterations, the optimal value of the fuzzifier is obtained when <inline-formula id="j_info1224_ineq_043"><alternatives>
<mml:math><mml:mo stretchy="false">|</mml:mo><mml:mi mathvariant="italic">D</mml:mi><mml:mi mathvariant="italic">C</mml:mi><mml:mi mathvariant="italic">V</mml:mi><mml:mo stretchy="false">|</mml:mo></mml:math>
<tex-math><![CDATA[$|DCV|$]]></tex-math></alternatives></inline-formula> reaches its minimum.</p>
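<p>The comparison step described above can be sketched as follows. This is not a transcription of Algorithm 2; it only illustrates the final selection under the assumption that a CV1 value has already been computed for each candidate fuzzifier. The function name and all CV1 values below are hypothetical.</p>
<preformat>
```python
def select_fuzzifier(cv0, cv1_by_m):
    """Return the candidate m whose CV1 gives the smallest |DCV| = |CV0 - CV1|."""
    return min(cv1_by_m, key=lambda m: abs(cv0 - cv1_by_m[m]))

# Hypothetical CV1 values for four candidate fuzzifier values
cv1_by_m = {1.2: 0.95, 1.6: 1.05, 2.0: 1.10, 2.4: 1.20}
best_m = select_fuzzifier(1.131, cv1_by_m)
print(best_m)  # 2.0
```
</preformat>
<p>In this made-up example, the candidate 2.0 is selected because its CV1 value is closest to CV0.</p>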
</sec>
</sec>
<sec id="j_info1224_s_006">
<label>4</label>
<title>Experimental Study</title>
<sec id="j_info1224_s_007">
<label>4.1</label>
<title>Experimental Setup</title>
<p>In the experiments, 8 synthetic data sets and 4 real-world data sets are used to demonstrate the effectiveness of our proposed fuzzifier selection criterion and the CSD-m algorithm. The experiments are implemented in Matlab R2012b. Based on the existing research on fuzzifier selection, the search range of the fuzzifier is set to <inline-formula id="j_info1224_ineq_044"><alternatives>
<mml:math><mml:mo fence="true" stretchy="false">[</mml:mo><mml:mn>1.2</mml:mn><mml:mo mathvariant="normal">,</mml:mo><mml:mn>3.0</mml:mn><mml:mo fence="true" stretchy="false">]</mml:mo></mml:math>
<tex-math><![CDATA[$[1.2,3.0]$]]></tex-math></alternatives></inline-formula>. Taking into account the efficiency of the CSD-m algorithm, we set <inline-formula id="j_info1224_ineq_045"><alternatives>
<mml:math><mml:mi mathvariant="normal">Δ</mml:mi><mml:mi mathvariant="italic">m</mml:mi><mml:mo>=</mml:mo><mml:mn>0.2</mml:mn></mml:math>
<tex-math><![CDATA[$\Delta m=0.2$]]></tex-math></alternatives></inline-formula>. The maximum number of iterations and the termination threshold of FCM are set to their default values, namely 100 and 1e−5, respectively. Because the initial cluster centres in FCM are random, we run the algorithm ten times with each <italic>m</italic> value for each data set and report the average values as the final results.</p>
<p>The synthetic data sets are named SDXYYYY, in which “SD” refers to synthetic data set, “X” refers to the dimension of the data set, and “YYYY” indicates the number of data objects in the data set. The synthetic data sets are randomly generated by using the <inline-formula id="j_info1224_ineq_046"><alternatives>
<mml:math><mml:mi mathvariant="italic">nngenc</mml:mi></mml:math>
<tex-math><![CDATA[$\mathit{nngenc}$]]></tex-math></alternatives></inline-formula> function in Matlab R2012b with different bounds and standard deviation parameters. We control the parameters of <inline-formula id="j_info1224_ineq_047"><alternatives>
<mml:math><mml:mi mathvariant="italic">nngenc</mml:mi></mml:math>
<tex-math><![CDATA[$\mathit{nngenc}$]]></tex-math></alternatives></inline-formula> function so that all of these synthetic data sets have large variation in cluster sizes. The generation parameters of the 8 synthetic data sets are shown in Table <xref rid="j_info1224_tab_001">1</xref>.</p>
<table-wrap id="j_info1224_tab_001">
<label>Table 1</label>
<caption>
<p>Generation parameters of the synthetic data sets.</p>
</caption>
<table>
<thead>
<tr>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">Data set</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">No. of clusters</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">No. of dimensions</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">Cluster centre bounds</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">Std. of each cluster</td>
</tr>
</thead>
<tbody>
<tr>
<td style="vertical-align: top; text-align: left">SD21000</td>
<td style="vertical-align: top; text-align: left">2</td>
<td style="vertical-align: top; text-align: left">2</td>
<td style="vertical-align: top; text-align: left">(2, 4); (4, 4)</td>
<td style="vertical-align: top; text-align: left">0.4; 0.3</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">SD20550</td>
<td style="vertical-align: top; text-align: left">3</td>
<td style="vertical-align: top; text-align: left">2</td>
<td style="vertical-align: top; text-align: left">(1, 1); (2, 3); (4, 2)</td>
<td style="vertical-align: top; text-align: left">0.4; 0.4; 0.4</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">SD21800</td>
<td style="vertical-align: top; text-align: left">4</td>
<td style="vertical-align: top; text-align: left">2</td>
<td style="vertical-align: top; text-align: left">(2, 2); (2, 7); (5, 2); (6, 7)</td>
<td style="vertical-align: top; text-align: left">0.7; 0.8; 0.4; 0.5</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">SD21950</td>
<td style="vertical-align: top; text-align: left">5</td>
<td style="vertical-align: top; text-align: left">2</td>
<td style="vertical-align: top; text-align: left">(2, 2); (2, 6); (6, 2); (6, 6); (4, 4)</td>
<td style="vertical-align: top; text-align: left">0.5; 0.4; 0.4; 0.4; 0.6</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">SD31500</td>
<td style="vertical-align: top; text-align: left">2</td>
<td style="vertical-align: top; text-align: left">3</td>
<td style="vertical-align: top; text-align: left">(2, 2, 2); (4, 4, 3)</td>
<td style="vertical-align: top; text-align: left">0.5; 0.5</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">SD32050</td>
<td style="vertical-align: top; text-align: left">3</td>
<td style="vertical-align: top; text-align: left">3</td>
<td style="vertical-align: top; text-align: left">(2, 2, 2); (4, 4, 3); (5, 3, 2)</td>
<td style="vertical-align: top; text-align: left">0.6; 0.4; 0.4</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">SD32800</td>
<td style="vertical-align: top; text-align: left">4</td>
<td style="vertical-align: top; text-align: left">3</td>
<td style="vertical-align: top; text-align: left">(2, 2, 2); (4, 4, 3); (5, 3, 2); (6, 6, 4)</td>
<td style="vertical-align: top; text-align: left">0.7; 0.4; 0.4; 0.7</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">SD34000</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">5</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">3</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">(2, 2, 2); (4, 4, 3); (5, 3, 2); (6, 6, 4); (6, 7, 2)</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">0.7; 0.4; 0.5; 0.6; 0.5</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The distributions of the 8 synthetic data sets are shown in Fig. <xref rid="j_info1224_fig_005">3</xref>.</p>
<fig id="j_info1224_fig_005">
<label>Fig. 3</label>
<caption>
<p>Distributions of the synthetic data sets.</p>
</caption>
<graphic xlink:href="info1224_g005.jpg"/>
</fig>
<p>The four real-world data sets come from different areas and are taken from the UCI Machine Learning Repository (Bache and Lichman, <xref ref-type="bibr" rid="j_info1224_ref_002">2013</xref>). The <italic>abalone</italic> data set is used to predict the age of abalone from physical measurements. The <italic>balance-scale</italic> data set contains information about balance scale weight and distance. The <italic>breast-cancer</italic> data set includes the original Wisconsin breast cancer related information of 699 instances. The <italic>page-blocks</italic> data set describes the blocks of the page layout of a document that have been detected by a segmentation process.</p>
<p>Some key characteristics of the experimental data sets are summarized in Table <xref rid="j_info1224_tab_002">2</xref>.</p>
<table-wrap id="j_info1224_tab_002">
<label>Table 2</label>
<caption>
<p>Some characteristics of experimental data sets.</p>
</caption>
<table>
<thead>
<tr>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin"/>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">Data sets</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin"># Objects</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin"># Features</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin"># Classes</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">MinSize</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">MaxSize</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">AvgSize</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin"><inline-formula id="j_info1224_ineq_048"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">CV</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\mathit{CV}_{0}}$]]></tex-math></alternatives></inline-formula></td>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="8" style="vertical-align: top; text-align: left">Synthetic data sets</td>
<td style="vertical-align: top; text-align: left">SD21000</td>
<td style="vertical-align: top; text-align: left">1000</td>
<td style="vertical-align: top; text-align: left">2</td>
<td style="vertical-align: top; text-align: left">2</td>
<td style="vertical-align: top; text-align: left">100</td>
<td style="vertical-align: top; text-align: left">900</td>
<td style="vertical-align: top; text-align: left">500</td>
<td style="vertical-align: top; text-align: left">1.131</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">SD20550</td>
<td style="vertical-align: top; text-align: left">550</td>
<td style="vertical-align: top; text-align: left">2</td>
<td style="vertical-align: top; text-align: left">3</td>
<td style="vertical-align: top; text-align: left">50</td>
<td style="vertical-align: top; text-align: left">350</td>
<td style="vertical-align: top; text-align: left">183</td>
<td style="vertical-align: top; text-align: left">0.833</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">SD21800</td>
<td style="vertical-align: top; text-align: left">1800</td>
<td style="vertical-align: top; text-align: left">2</td>
<td style="vertical-align: top; text-align: left">4</td>
<td style="vertical-align: top; text-align: left">200</td>
<td style="vertical-align: top; text-align: left">950</td>
<td style="vertical-align: top; text-align: left">450</td>
<td style="vertical-align: top; text-align: left">0.754</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">SD21950</td>
<td style="vertical-align: top; text-align: left">1950</td>
<td style="vertical-align: top; text-align: left">2</td>
<td style="vertical-align: top; text-align: left">5</td>
<td style="vertical-align: top; text-align: left">100</td>
<td style="vertical-align: top; text-align: left">1200</td>
<td style="vertical-align: top; text-align: left">390</td>
<td style="vertical-align: top; text-align: left">1.176</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">SD31500</td>
<td style="vertical-align: top; text-align: left">1500</td>
<td style="vertical-align: top; text-align: left">3</td>
<td style="vertical-align: top; text-align: left">2</td>
<td style="vertical-align: top; text-align: left">200</td>
<td style="vertical-align: top; text-align: left">1300</td>
<td style="vertical-align: top; text-align: left">750</td>
<td style="vertical-align: top; text-align: left">1.037</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">SD32050</td>
<td style="vertical-align: top; text-align: left">2050</td>
<td style="vertical-align: top; text-align: left">3</td>
<td style="vertical-align: top; text-align: left">3</td>
<td style="vertical-align: top; text-align: left">200</td>
<td style="vertical-align: top; text-align: left">1500</td>
<td style="vertical-align: top; text-align: left">683</td>
<td style="vertical-align: top; text-align: left">1.041</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">SD32800</td>
<td style="vertical-align: top; text-align: left">2800</td>
<td style="vertical-align: top; text-align: left">3</td>
<td style="vertical-align: top; text-align: left">4</td>
<td style="vertical-align: top; text-align: left">200</td>
<td style="vertical-align: top; text-align: left">1500</td>
<td style="vertical-align: top; text-align: left">700</td>
<td style="vertical-align: top; text-align: left">0.849</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">SD34000</td>
<td style="vertical-align: top; text-align: left">4000</td>
<td style="vertical-align: top; text-align: left">3</td>
<td style="vertical-align: top; text-align: left">5</td>
<td style="vertical-align: top; text-align: left">200</td>
<td style="vertical-align: top; text-align: left">2000</td>
<td style="vertical-align: top; text-align: left">800</td>
<td style="vertical-align: top; text-align: left">0.923</td>
</tr>
<tr>
<td rowspan="4" style="vertical-align: top; text-align: left; border-bottom: solid thin">Real-world data sets</td>
<td style="vertical-align: top; text-align: left">abalone</td>
<td style="vertical-align: top; text-align: left">4177</td>
<td style="vertical-align: top; text-align: left">8</td>
<td style="vertical-align: top; text-align: left">29</td>
<td style="vertical-align: top; text-align: left">1</td>
<td style="vertical-align: top; text-align: left">689</td>
<td style="vertical-align: top; text-align: left">144</td>
<td style="vertical-align: top; text-align: left">1.414</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">balance-scale</td>
<td style="vertical-align: top; text-align: left">625</td>
<td style="vertical-align: top; text-align: left">4</td>
<td style="vertical-align: top; text-align: left">3</td>
<td style="vertical-align: top; text-align: left">49</td>
<td style="vertical-align: top; text-align: left">288</td>
<td style="vertical-align: top; text-align: left">208</td>
<td style="vertical-align: top; text-align: left">0.662</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">breast-cancer</td>
<td style="vertical-align: top; text-align: left">699</td>
<td style="vertical-align: top; text-align: left">10</td>
<td style="vertical-align: top; text-align: left">8</td>
<td style="vertical-align: top; text-align: left">17</td>
<td style="vertical-align: top; text-align: left">367</td>
<td style="vertical-align: top; text-align: left">87</td>
<td style="vertical-align: top; text-align: left">1.320</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">pageblocks</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">5473</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">10</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">5</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">28</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">4913</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">1095</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">1.953</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>In Table <xref rid="j_info1224_tab_001">1</xref>, “# objects” represents the total number of data objects in the data set. “# features” is the number of attributes of the data. “# classes” refers to the number of clusters in the data.</p>
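<p>As a quick check on the CV column of Table 1, the sketch below recomputes the coefficient of variation of cluster sizes for SD31500. It assumes that CV is the standard deviation of the cluster sizes divided by their mean, using the sample (n - 1) standard deviation, and that the two SD31500 classes have the sizes listed as minimum and maximum (200 and 1300). The function name is illustrative only.</p>
<preformat preformat-type="code"><![CDATA[
```python
import statistics


def coefficient_of_variation(cluster_sizes):
    """Return CV = sample standard deviation / mean of the cluster sizes."""
    mean_size = statistics.mean(cluster_sizes)
    # statistics.stdev uses the (n - 1) denominator (sample standard deviation).
    return statistics.stdev(cluster_sizes) / mean_size


# Assumption: SD31500 has two classes whose sizes are the reported min and max.
sd31500_sizes = [200, 1300]
cv0 = coefficient_of_variation(sd31500_sizes)
print(round(cv0, 3))  # 1.037
```
]]></preformat>
<p>Under these assumptions the result agrees with the value 1.037 reported for SD31500 in Table 1.</p>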
</sec>
<sec id="j_info1224_s_008">
<label>4.2</label>
<title>Results and Discussion</title>
<p>The clustering results of both the 2-D and 3-D synthetic data sets can be visualized, so we can directly observe the effect of different fuzzifier values on the clustering results. For simplicity, we present only the FCM clustering results with the popular fuzzifier value of <inline-formula id="j_info1224_ineq_049"><alternatives>
<mml:math><mml:mi mathvariant="italic">m</mml:mi><mml:mo>=</mml:mo><mml:mn>2.0</mml:mn></mml:math>
<tex-math><![CDATA[$m=2.0$]]></tex-math></alternatives></inline-formula> on the 8 synthetic data sets, as shown in Fig. <xref rid="j_info1224_fig_006">4</xref>.</p>
<fig id="j_info1224_fig_006">
<label>Fig. 4</label>
<caption>
<p>Clustering partitions of FCM with fuzzifier value <inline-formula id="j_info1224_ineq_050"><alternatives>
<mml:math><mml:mi mathvariant="italic">m</mml:mi><mml:mo>=</mml:mo><mml:mn>2.0</mml:mn></mml:math>
<tex-math><![CDATA[$m=2.0$]]></tex-math></alternatives></inline-formula>.</p>
</caption>
<graphic xlink:href="info1224_g006.jpg"/>
</fig>
<p>The clustering results on four synthetic data sets show that smaller fuzzifier values yield better clustering results. As the fuzzifier value increases, the small clusters in the data sets tend to merge with parts of the larger clusters.</p>
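<p>To make the role of the fuzzifier concrete, the following sketch evaluates the standard FCM membership update for a single object at several values of <italic>m</italic>. The distances (1 and 3) describe a hypothetical object, not data from this study, and the function name is illustrative. It offers one intuition for the behaviour described above, not an analysis of the experiments.</p>
<preformat preformat-type="code"><![CDATA[
```python
def fcm_memberships(distances, m):
    """Standard FCM membership update for one object.

    distances: distances from the object to each cluster center (all > 0).
    m: fuzzifier, m > 1.
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for d_i in distances:
        # u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1))
        denom = sum((d_i / d_j) ** exponent for d_j in distances)
        memberships.append(1.0 / denom)
    return memberships


# Hypothetical object: distance 1 to the nearer center, 3 to the farther one.
for m in (1.2, 2.0, 3.0):
    u_near, u_far = fcm_memberships([1.0, 3.0], m)
    print(m, round(u_near, 3), round(u_far, 3))
```
]]></preformat>
<p>For this object the membership in the nearer cluster is about 1.0 at m = 1.2, 0.9 at m = 2.0 and 0.75 at m = 3.0. Larger fuzzifier values therefore give distant centers more weight, which is consistent with small clusters losing objects to larger neighbours.</p>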
<p>The clustering results of all the experimental data sets with different fuzzifier values are presented in Table <xref rid="j_info1224_tab_003">3</xref>.</p>
<p>Then, based on the <inline-formula id="j_info1224_ineq_051"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">CV</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\mathit{CV}_{1}}$]]></tex-math></alternatives></inline-formula> values in Table <xref rid="j_info1224_tab_003">3</xref>, we calculate the DCV values with different fuzzifier values on all of the 12 experimental data sets. The changes of DCV values on all the experimental data sets with different fuzzifier values are shown in Fig. <xref rid="j_info1224_fig_007">5</xref>.</p>
<table-wrap id="j_info1224_tab_003">
<label>Table 3</label>
<caption>
<p>Clustering results of all the experimental data sets with different fuzzifier values.</p>
</caption>
<table>
<thead>
<tr>
<td style="vertical-align: top; text-align: left; border-top: solid thin"/>
<td rowspan="2" style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">Data sets</td>
<td rowspan="2" style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin"><inline-formula id="j_info1224_ineq_052"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">CV</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\mathit{CV}_{0}}$]]></tex-math></alternatives></inline-formula></td>
<td colspan="10" style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin"><inline-formula id="j_info1224_ineq_053"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">CV</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\mathit{CV}_{1}}$]]></tex-math></alternatives></inline-formula></td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin"/>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin"><inline-formula id="j_info1224_ineq_054"><alternatives>
<mml:math><mml:mi mathvariant="italic">m</mml:mi><mml:mo>=</mml:mo><mml:mn>1.2</mml:mn></mml:math>
<tex-math><![CDATA[$m=1.2$]]></tex-math></alternatives></inline-formula></td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin"><inline-formula id="j_info1224_ineq_055"><alternatives>
<mml:math><mml:mi mathvariant="italic">m</mml:mi><mml:mo>=</mml:mo><mml:mn>1.4</mml:mn></mml:math>
<tex-math><![CDATA[$m=1.4$]]></tex-math></alternatives></inline-formula></td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin"><inline-formula id="j_info1224_ineq_056"><alternatives>
<mml:math><mml:mi mathvariant="italic">m</mml:mi><mml:mo>=</mml:mo><mml:mn>1.6</mml:mn></mml:math>
<tex-math><![CDATA[$m=1.6$]]></tex-math></alternatives></inline-formula></td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin"><inline-formula id="j_info1224_ineq_057"><alternatives>
<mml:math><mml:mi mathvariant="italic">m</mml:mi><mml:mo>=</mml:mo><mml:mn>1.8</mml:mn></mml:math>
<tex-math><![CDATA[$m=1.8$]]></tex-math></alternatives></inline-formula></td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin"><inline-formula id="j_info1224_ineq_058"><alternatives>
<mml:math><mml:mi mathvariant="italic">m</mml:mi><mml:mo>=</mml:mo><mml:mn>2.0</mml:mn></mml:math>
<tex-math><![CDATA[$m=2.0$]]></tex-math></alternatives></inline-formula></td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin"><inline-formula id="j_info1224_ineq_059"><alternatives>
<mml:math><mml:mi mathvariant="italic">m</mml:mi><mml:mo>=</mml:mo><mml:mn>2.2</mml:mn></mml:math>
<tex-math><![CDATA[$m=2.2$]]></tex-math></alternatives></inline-formula></td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin"><inline-formula id="j_info1224_ineq_060"><alternatives>
<mml:math><mml:mi mathvariant="italic">m</mml:mi><mml:mo>=</mml:mo><mml:mn>2.4</mml:mn></mml:math>
<tex-math><![CDATA[$m=2.4$]]></tex-math></alternatives></inline-formula></td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin"><inline-formula id="j_info1224_ineq_061"><alternatives>
<mml:math><mml:mi mathvariant="italic">m</mml:mi><mml:mo>=</mml:mo><mml:mn>2.6</mml:mn></mml:math>
<tex-math><![CDATA[$m=2.6$]]></tex-math></alternatives></inline-formula></td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin"><inline-formula id="j_info1224_ineq_062"><alternatives>
<mml:math><mml:mi mathvariant="italic">m</mml:mi><mml:mo>=</mml:mo><mml:mn>2.8</mml:mn></mml:math>
<tex-math><![CDATA[$m=2.8$]]></tex-math></alternatives></inline-formula></td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin"><inline-formula id="j_info1224_ineq_063"><alternatives>
<mml:math><mml:mi mathvariant="italic">m</mml:mi><mml:mo>=</mml:mo><mml:mn>3.0</mml:mn></mml:math>
<tex-math><![CDATA[$m=3.0$]]></tex-math></alternatives></inline-formula></td>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="8" style="vertical-align: top; text-align: left">Synthetic data sets</td>
<td style="vertical-align: top; text-align: left">SD21000</td>
<td style="vertical-align: top; text-align: left">1.131</td>
<td style="vertical-align: top; text-align: left">1.095</td>
<td style="vertical-align: top; text-align: left">1.081</td>
<td style="vertical-align: top; text-align: left">1.064</td>
<td style="vertical-align: top; text-align: left">1.027</td>
<td style="vertical-align: top; text-align: left">1.001</td>
<td style="vertical-align: top; text-align: left">0.950</td>
<td style="vertical-align: top; text-align: left">0.857</td>
<td style="vertical-align: top; text-align: left">0.713</td>
<td style="vertical-align: top; text-align: left">0.619</td>
<td style="vertical-align: top; text-align: left">0.580</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">SD20550</td>
<td style="vertical-align: top; text-align: left">0.833</td>
<td style="vertical-align: top; text-align: left">0.824</td>
<td style="vertical-align: top; text-align: left">0.824</td>
<td style="vertical-align: top; text-align: left">0.824</td>
<td style="vertical-align: top; text-align: left">0.819</td>
<td style="vertical-align: top; text-align: left">0.819</td>
<td style="vertical-align: top; text-align: left">0.819</td>
<td style="vertical-align: top; text-align: left">0.819</td>
<td style="vertical-align: top; text-align: left">0.814</td>
<td style="vertical-align: top; text-align: left">0.814</td>
<td style="vertical-align: top; text-align: left">0.814</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">SD21800</td>
<td style="vertical-align: top; text-align: left">0.754</td>
<td style="vertical-align: top; text-align: left">0.738</td>
<td style="vertical-align: top; text-align: left">0.736</td>
<td style="vertical-align: top; text-align: left">0.736</td>
<td style="vertical-align: top; text-align: left">0.735</td>
<td style="vertical-align: top; text-align: left">0.732</td>
<td style="vertical-align: top; text-align: left">0.732</td>
<td style="vertical-align: top; text-align: left">0.732</td>
<td style="vertical-align: top; text-align: left">0.732</td>
<td style="vertical-align: top; text-align: left">0.730</td>
<td style="vertical-align: top; text-align: left">0.728</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">SD21950</td>
<td style="vertical-align: top; text-align: left">1.176</td>
<td style="vertical-align: top; text-align: left">1.075</td>
<td style="vertical-align: top; text-align: left">1.072</td>
<td style="vertical-align: top; text-align: left">1.069</td>
<td style="vertical-align: top; text-align: left">1.063</td>
<td style="vertical-align: top; text-align: left">0.640</td>
<td style="vertical-align: top; text-align: left">0.623</td>
<td style="vertical-align: top; text-align: left">0.610</td>
<td style="vertical-align: top; text-align: left">0.600</td>
<td style="vertical-align: top; text-align: left">0.593</td>
<td style="vertical-align: top; text-align: left">0.589</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">SD31500</td>
<td style="vertical-align: top; text-align: left">1.037</td>
<td style="vertical-align: top; text-align: left">1.033</td>
<td style="vertical-align: top; text-align: left">1.033</td>
<td style="vertical-align: top; text-align: left">1.033</td>
<td style="vertical-align: top; text-align: left">1.031</td>
<td style="vertical-align: top; text-align: left">1.030</td>
<td style="vertical-align: top; text-align: left">1.030</td>
<td style="vertical-align: top; text-align: left">1.030</td>
<td style="vertical-align: top; text-align: left">1.020</td>
<td style="vertical-align: top; text-align: left">1.015</td>
<td style="vertical-align: top; text-align: left">1.005</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">SD32050</td>
<td style="vertical-align: top; text-align: left">1.041</td>
<td style="vertical-align: top; text-align: left">0.162</td>
<td style="vertical-align: top; text-align: left">0.162</td>
<td style="vertical-align: top; text-align: left">0.162</td>
<td style="vertical-align: top; text-align: left">0.163</td>
<td style="vertical-align: top; text-align: left">0.168</td>
<td style="vertical-align: top; text-align: left">0.180</td>
<td style="vertical-align: top; text-align: left">0.187</td>
<td style="vertical-align: top; text-align: left">0.188</td>
<td style="vertical-align: top; text-align: left">0.187</td>
<td style="vertical-align: top; text-align: left">0.194</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">SD32800</td>
<td style="vertical-align: top; text-align: left">0.849</td>
<td style="vertical-align: top; text-align: left">0.739</td>
<td style="vertical-align: top; text-align: left">0.790</td>
<td style="vertical-align: top; text-align: left">0.725</td>
<td style="vertical-align: top; text-align: left">0.716</td>
<td style="vertical-align: top; text-align: left">0.704</td>
<td style="vertical-align: top; text-align: left">0.170</td>
<td style="vertical-align: top; text-align: left">0.171</td>
<td style="vertical-align: top; text-align: left">0.174</td>
<td style="vertical-align: top; text-align: left">0.179</td>
<td style="vertical-align: top; text-align: left">0.180</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">SD34000</td>
<td style="vertical-align: top; text-align: left">0.923</td>
<td style="vertical-align: top; text-align: left">0.489</td>
<td style="vertical-align: top; text-align: left">0.308</td>
<td style="vertical-align: top; text-align: left">0.307</td>
<td style="vertical-align: top; text-align: left">0.306</td>
<td style="vertical-align: top; text-align: left">0.306</td>
<td style="vertical-align: top; text-align: left">0.305</td>
<td style="vertical-align: top; text-align: left">0.303</td>
<td style="vertical-align: top; text-align: left">0.301</td>
<td style="vertical-align: top; text-align: left">0.299</td>
<td style="vertical-align: top; text-align: left">0.299</td>
</tr>
<tr>
<td rowspan="4" style="vertical-align: top; text-align: left; border-bottom: solid thin">Real-world data sets</td>
<td style="vertical-align: top; text-align: left">abalone</td>
<td style="vertical-align: top; text-align: left">1.414</td>
<td style="vertical-align: top; text-align: left">0.661</td>
<td style="vertical-align: top; text-align: left">0.564</td>
<td style="vertical-align: top; text-align: left">0.558</td>
<td style="vertical-align: top; text-align: left">0.509</td>
<td style="vertical-align: top; text-align: left">0.511</td>
<td style="vertical-align: top; text-align: left">0.453</td>
<td style="vertical-align: top; text-align: left">0.406</td>
<td style="vertical-align: top; text-align: left">0.355</td>
<td style="vertical-align: top; text-align: left">0.378</td>
<td style="vertical-align: top; text-align: left">0.354</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">balance-scale</td>
<td style="vertical-align: top; text-align: left">0.662</td>
<td style="vertical-align: top; text-align: left">0.183</td>
<td style="vertical-align: top; text-align: left">0.083</td>
<td style="vertical-align: top; text-align: left">0.030</td>
<td style="vertical-align: top; text-align: left">0.023</td>
<td style="vertical-align: top; text-align: left">0.145</td>
<td style="vertical-align: top; text-align: left">0.316</td>
<td style="vertical-align: top; text-align: left">0.211</td>
<td style="vertical-align: top; text-align: left">0.294</td>
<td style="vertical-align: top; text-align: left">0.227</td>
<td style="vertical-align: top; text-align: left">0.287</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">breast-cancer</td>
<td style="vertical-align: top; text-align: left">1.320</td>
<td style="vertical-align: top; text-align: left">0.929</td>
<td style="vertical-align: top; text-align: left">0.966</td>
<td style="vertical-align: top; text-align: left">0.978</td>
<td style="vertical-align: top; text-align: left">0.802</td>
<td style="vertical-align: top; text-align: left">0.747</td>
<td style="vertical-align: top; text-align: left">0.858</td>
<td style="vertical-align: top; text-align: left">0.901</td>
<td style="vertical-align: top; text-align: left">0.850</td>
<td style="vertical-align: top; text-align: left">0.831</td>
<td style="vertical-align: top; text-align: left">0.879</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">pageblocks</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">1.953</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">1.547</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">1.547</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">1.564</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">1.518</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">1.485</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">1.562</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">1.474</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">1.277</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">1.233</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">1.276</td>
</tr>
</tbody>
</table>
</table-wrap>
<fig id="j_info1224_fig_007">
<label>Fig. 5</label>
<caption>
<p>The <italic>m</italic>–<inline-formula id="j_info1224_ineq_064"><alternatives>
<mml:math><mml:mi mathvariant="italic">DCV</mml:mi></mml:math>
<tex-math><![CDATA[$\mathit{DCV}$]]></tex-math></alternatives></inline-formula> relationship on all the experimental data sets.</p>
</caption>
<graphic xlink:href="info1224_g007.jpg"/>
</fig>
<p>According to the criterion of fuzzifier selection, we can see from Fig. <xref rid="j_info1224_fig_007">5</xref> that the optimal fuzzifier values determined by the CSD-m algorithm differ across data sets. Furthermore, the relationship between <italic>m</italic> and the DCV values is not simply linear. Nevertheless, for most data sets with large variation in cluster sizes, smaller fuzzifier values tend to produce better clustering results. Generally, small clusters tend to merge with parts of the large clusters as the fuzzifier value increases, as illustrated in Fig. <xref rid="j_info1224_fig_004">2</xref>.</p>
<p>From the obtained DCV values, the optimal fuzzifier values of the 12 data sets are shown in Fig. <xref rid="j_info1224_fig_008">6</xref>.</p>
<fig id="j_info1224_fig_008">
<label>Fig. 6</label>
<caption>
<p>Optimal fuzzifier values obtained for the experimental data sets.</p>
</caption>
<graphic xlink:href="info1224_g008.jpg"/>
</fig>
<p>As we can see from Fig. <xref rid="j_info1224_fig_008">6</xref>, the widely accepted and applied fuzzifier value in FCM, namely <inline-formula id="j_info1224_ineq_065"><alternatives>
<mml:math><mml:mi mathvariant="italic">m</mml:mi><mml:mo>=</mml:mo><mml:mn>2</mml:mn></mml:math>
<tex-math><![CDATA[$m=2$]]></tex-math></alternatives></inline-formula>, is not an optimal value for most of the data sets. Interestingly, we find that for most of the data sets, the smaller fuzzifier, <inline-formula id="j_info1224_ineq_066"><alternatives>
<mml:math><mml:mi mathvariant="italic">m</mml:mi><mml:mo>=</mml:mo><mml:mn>1.2</mml:mn></mml:math>
<tex-math><![CDATA[$m=1.2$]]></tex-math></alternatives></inline-formula>, is an optimal value.</p>
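<p>The selection step can be sketched directly from the values in Table 3. The code below assumes that DCV is the difference between CV0 and CV1 and picks, for each data set, the fuzzifier with the smallest absolute DCV, taking the first candidate when several tie. The names <monospace>TABLE3</monospace> and <monospace>optimal_fuzzifier</monospace> are illustrative; this is a minimal sketch of the criterion, not the authors' implementation of CSD-m.</p>
<preformat preformat-type="code"><![CDATA[
```python
M_VALUES = [1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 2.6, 2.8, 3.0]

# (CV0, [CV1 for each m in M_VALUES]), copied from Table 3.
TABLE3 = {
    "SD21000": (1.131, [1.095, 1.081, 1.064, 1.027, 1.001, 0.950, 0.857, 0.713, 0.619, 0.580]),
    "SD20550": (0.833, [0.824, 0.824, 0.824, 0.819, 0.819, 0.819, 0.819, 0.814, 0.814, 0.814]),
    "SD21800": (0.754, [0.738, 0.736, 0.736, 0.735, 0.732, 0.732, 0.732, 0.732, 0.730, 0.728]),
    "SD21950": (1.176, [1.075, 1.072, 1.069, 1.063, 0.640, 0.623, 0.610, 0.600, 0.593, 0.589]),
    "SD31500": (1.037, [1.033, 1.033, 1.033, 1.031, 1.030, 1.030, 1.030, 1.020, 1.015, 1.005]),
    "SD32050": (1.041, [0.162, 0.162, 0.162, 0.163, 0.168, 0.180, 0.187, 0.188, 0.187, 0.194]),
    "SD32800": (0.849, [0.739, 0.790, 0.725, 0.716, 0.704, 0.170, 0.171, 0.174, 0.179, 0.180]),
    "SD34000": (0.923, [0.489, 0.308, 0.307, 0.306, 0.306, 0.305, 0.303, 0.301, 0.299, 0.299]),
    "abalone": (1.414, [0.661, 0.564, 0.558, 0.509, 0.511, 0.453, 0.406, 0.355, 0.378, 0.354]),
    "balance-scale": (0.662, [0.183, 0.083, 0.030, 0.023, 0.145, 0.316, 0.211, 0.294, 0.227, 0.287]),
    "breast-cancer": (1.320, [0.929, 0.966, 0.978, 0.802, 0.747, 0.858, 0.901, 0.850, 0.831, 0.879]),
    "pageblocks": (1.953, [1.547, 1.547, 1.564, 1.518, 1.485, 1.562, 1.474, 1.277, 1.233, 1.276]),
}


def optimal_fuzzifier(cv0, cv1_values, m_values=M_VALUES):
    """Return the m whose |DCV| = |CV0 - CV1| is smallest (first one on ties)."""
    # Assumption: DCV is the plain difference CV0 - CV1.
    abs_dcv = [abs(cv0 - cv1) for cv1 in cv1_values]
    best_index = abs_dcv.index(min(abs_dcv))
    return m_values[best_index]


for name, (cv0, cv1_values) in TABLE3.items():
    print(name, optimal_fuzzifier(cv0, cv1_values))
```
]]></preformat>
<p>Under these assumptions, seven of the twelve rows select m = 1.2 and none selects m = 2.0. The values plotted in Fig. 6 may differ where ties are resolved differently.</p>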
<p>As noted, an inappropriate fuzzifier value can significantly influence the clustering results of FCM. From Fig. <xref rid="j_info1224_fig_005">3</xref>, we can also see that the extent to which the clustering partitions are influenced by the fuzzifier value differs across data sets. Therefore, we define an indicator called the Influence Coefficient of Fuzzifier (<inline-formula id="j_info1224_ineq_067"><alternatives>
<mml:math><mml:mi mathvariant="italic">ICF</mml:mi></mml:math>
<tex-math><![CDATA[$\mathit{ICF}$]]></tex-math></alternatives></inline-formula>) based on the change of <inline-formula id="j_info1224_ineq_068"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">CV</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\mathit{CV}_{1}}$]]></tex-math></alternatives></inline-formula> values and the range of the fuzzifier parameter <italic>m</italic>, to measure the influence of <italic>m</italic> on FCM clustering results. The ICF indicator is defined as 
<disp-formula id="j_info1224_eq_011">
<label>(11)</label><alternatives>
<mml:math display="block"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd><mml:mi mathvariant="italic">ICF</mml:mi><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac><mml:mrow><mml:mo stretchy="false">|</mml:mo><mml:mi mathvariant="normal">Δ</mml:mi><mml:msub><mml:mrow><mml:mi mathvariant="italic">CV</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mrow><mml:mi mathvariant="normal">Δ</mml:mi><mml:mi mathvariant="italic">m</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
<tex-math><![CDATA[\[ \mathit{ICF}=\frac{|\Delta {\mathit{CV}_{1}}|}{\Delta m}.\]]]></tex-math></alternatives>
</disp-formula> 
If <inline-formula id="j_info1224_ineq_069"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">CV</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\mathit{CV}_{1}}$]]></tex-math></alternatives></inline-formula> changes substantially as <italic>m</italic> varies, then the value of the <inline-formula id="j_info1224_ineq_070"><alternatives>
<mml:math><mml:mi mathvariant="italic">ICF</mml:mi></mml:math>
<tex-math><![CDATA[$\mathit{ICF}$]]></tex-math></alternatives></inline-formula> indicator is large, which indicates that <italic>m</italic> strongly influences FCM clustering. In contrast, over a similar range of <italic>m</italic>, a smaller <inline-formula id="j_info1224_ineq_071"><alternatives>
<mml:math><mml:mi mathvariant="normal">Δ</mml:mi><mml:msub><mml:mrow><mml:mi mathvariant="italic">CV</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[$\Delta {\mathit{CV}_{1}}$]]></tex-math></alternatives></inline-formula> value indicates the influence of <italic>m</italic> on FCM clustering is relatively small.</p>
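<p>A minimal sketch of Eq. (11) follows. It assumes that the change in CV1 is taken between the two endpoints of the chosen interval of <italic>m</italic>; the text does not state whether endpoints or, for example, the maximum and minimum over the interval are used, so this is one possible reading. The function name is illustrative.</p>
<preformat preformat-type="code"><![CDATA[
```python
def influence_coefficient(cv1_start, cv1_end, m_start, m_end):
    """ICF = |delta CV1| / delta m over the interval [m_start, m_end]."""
    delta_m = m_end - m_start
    if delta_m <= 0:
        raise ValueError("m_end must be greater than m_start")
    # Assumption: delta CV1 is the difference between the interval endpoints.
    return abs(cv1_end - cv1_start) / delta_m


# SD21000 row of Table 3: CV1 = 1.095 at m = 1.2 and 0.580 at m = 3.0.
icf_sd21000 = influence_coefficient(1.095, 0.580, 1.2, 3.0)
print(round(icf_sd21000, 3))  # 0.286
```
]]></preformat>
<p>Because the numerator is an absolute value and the interval length is positive, ICF is never negative under this reading.</p>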
<p>We choose the range of <italic>m</italic> values from 1.2 to 3.0, and then the <inline-formula id="j_info1224_ineq_072"><alternatives>
<mml:math><mml:mi mathvariant="italic">ICF</mml:mi></mml:math>
<tex-math><![CDATA[$\mathit{ICF}$]]></tex-math></alternatives></inline-formula> values on the 12 experimental data sets can be obtained. To discover the different influences of fuzzifier value on different data sets, the relationship between <inline-formula id="j_info1224_ineq_073"><alternatives>
<mml:math><mml:mi mathvariant="italic">ICF</mml:mi></mml:math>
<tex-math><![CDATA[$\mathit{ICF}$]]></tex-math></alternatives></inline-formula> values and <inline-formula id="j_info1224_ineq_074"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">CV</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\mathit{CV}_{0}}$]]></tex-math></alternatives></inline-formula> values is fitted, as shown in Fig. <xref rid="j_info1224_fig_009">7</xref>.</p>
<fig id="j_info1224_fig_009">
<label>Fig. 7</label>
<caption>
<p>Relationship between <inline-formula id="j_info1224_ineq_075"><alternatives>
<mml:math><mml:mi mathvariant="italic">ICF</mml:mi></mml:math>
<tex-math><![CDATA[$\mathit{ICF}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_info1224_ineq_076"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">CV</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\mathit{CV}_{0}}$]]></tex-math></alternatives></inline-formula>.</p>
</caption>
<graphic xlink:href="info1224_g009.jpg"/>
</fig>
<p>From Fig. <xref rid="j_info1224_fig_009">7</xref>, we can see that there exists a linear relationship between <inline-formula id="j_info1224_ineq_077"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">CV</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\mathit{CV}_{0}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_info1224_ineq_078"><alternatives>
<mml:math><mml:mi mathvariant="italic">ICF</mml:mi></mml:math>
<tex-math><![CDATA[$\mathit{ICF}$]]></tex-math></alternatives></inline-formula>. The linear regression equation, <inline-formula id="j_info1224_ineq_079"><alternatives>
<mml:math><mml:mi mathvariant="italic">y</mml:mi><mml:mo>=</mml:mo><mml:mn>0.64</mml:mn><mml:mi mathvariant="italic">x</mml:mi><mml:mo>−</mml:mo><mml:mn>0.47</mml:mn></mml:math>
<tex-math><![CDATA[$y=0.64x-0.47$]]></tex-math></alternatives></inline-formula>, reveals an interesting relationship between the extent of the fuzzifier's influence and the original cluster size distribution. It shows that the fuzzifier value has a relatively small influence on FCM clustering results for data sets with small variation in cluster sizes. However, for data sets with large variation in cluster sizes, the fuzzifier value strongly influences FCM clustering and should be chosen with particular care.</p>
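<p>The reported regression line can be evaluated for a few CV0 values from Table 3 to see what it implies. The sketch assumes that <italic>x</italic> denotes CV0 and <italic>y</italic> denotes ICF, as suggested by Fig. 7; the function name is illustrative.</p>
<preformat preformat-type="code"><![CDATA[
```python
SLOPE = 0.64       # coefficient reported in the text
INTERCEPT = -0.47  # intercept reported in the text


def predicted_icf(cv0):
    """Evaluate the reported fit y = 0.64x - 0.47, taking x = CV0 and y = ICF."""
    return SLOPE * cv0 + INTERCEPT


# CV0 values taken from Table 3.
for name, cv0 in [("balance-scale", 0.662), ("SD21000", 1.131), ("pageblocks", 1.953)]:
    print(name, round(predicted_icf(cv0), 3))
```
]]></preformat>
<p>The fitted line crosses zero near a CV0 of about 0.73, so for data sets with less variation in cluster sizes it is probably best read as predicting an ICF close to zero, since ICF itself cannot be negative.</p>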
<p>We also note that, to a certain extent, the very small clusters in a data set can be regarded as noise and outliers. It has been recognized that outliers can affect the performance of FCM. To address this problem, some existing studies have suggested modifying the Euclidean distance of FCM (Hathaway <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1224_ref_017">2000</xref>; Kersten, <xref ref-type="bibr" rid="j_info1224_ref_021">1999</xref>). However, the focus of this study is the influence of fuzzifier values in FCM. Without modifying the FCM algorithm itself, the small clusters can be effectively identified with an appropriate fuzzifier value using our proposed CSD-m algorithm. Therefore, our method also contributes to the identification of noise and outliers when using traditional FCM clustering.</p>
</sec>
</sec>
<sec id="j_info1224_s_009">
<label>5</label>
<title>Conclusion</title>
<p>The fuzzifier in FCM is an important parameter that can significantly influence the clustering results of FCM. Considering that the distributions of many data sets are not uniform in practical applications, we propose a new criterion and the corresponding algorithm, called the CSD-m algorithm, for the selection of the fuzzifier from the cluster size distribution perspective. The CV and DCV values are used to measure the original variation in cluster sizes and the change in that variation after FCM clustering, respectively. The optimal value of the fuzzifier is obtained when the absolute value of DCV reaches its minimum. The experimental results on both synthetic and real-world data sets demonstrate the effectiveness of our proposed algorithm. We can see that the influence of noise and outliers on the results is limited, which demonstrates the robustness of our model. The results also reveal that the widely used fuzzifier value <inline-formula id="j_info1224_ineq_080"><alternatives>
<mml:math><mml:mi mathvariant="italic">m</mml:mi><mml:mo>=</mml:mo><mml:mn>2</mml:mn></mml:math>
<tex-math><![CDATA[$m=2$]]></tex-math></alternatives></inline-formula> is not always optimal, especially for data sets with large variation in cluster sizes. The contributions of this study are a new algorithm for fuzzifier selection in FCM clustering and a new indicator, <inline-formula id="j_info1224_ineq_081"><alternatives>
<mml:math><mml:mi mathvariant="italic">ICF</mml:mi></mml:math>
<tex-math><![CDATA[$\mathit{ICF}$]]></tex-math></alternatives></inline-formula>, that measures the influence of the fuzzifier value on FCM clustering results. In addition, the extensive experimental results revealed a linear relationship between the extent of fuzzifier value influence (<inline-formula id="j_info1224_ineq_082"><alternatives>
<mml:math><mml:mi mathvariant="italic">ICF</mml:mi></mml:math>
<tex-math><![CDATA[$\mathit{ICF}$]]></tex-math></alternatives></inline-formula>) and the original cluster size distributions (<inline-formula id="j_info1224_ineq_083"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">CV</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\mathit{CV}_{0}}$]]></tex-math></alternatives></inline-formula>).</p>
</sec>
</body>
<back>
<ref-list id="j_info1224_reflist_001">
<title>References</title>
<ref id="j_info1224_ref_001">
<mixed-citation publication-type="journal"><string-name><surname>Ahmed</surname>, <given-names>M.N.</given-names></string-name>, <string-name><surname>Yamany</surname>, <given-names>S.M.</given-names></string-name>, <string-name><surname>Mohamed</surname>, <given-names>N.</given-names></string-name>, <string-name><surname>Farag</surname>, <given-names>A.A.</given-names></string-name>, <string-name><surname>Moriarty</surname>, <given-names>T.</given-names></string-name> (<year>2002</year>). <article-title>A modified fuzzy c-means algorithm for bias field estimation and segmentation of MRI data</article-title>. <source>IEEE Transactions on Medical Imaging</source>, <volume>21</volume>(<issue>3</issue>), <fpage>193</fpage>–<lpage>199</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_002">
<mixed-citation publication-type="other"><string-name><surname>Bache</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Lichman</surname>, <given-names>M.</given-names></string-name> (<year>2013</year>). UCI machine learning repository. Available at: <uri>http://archive.ics.uci.edu/ml</uri> (Accessed March 10, 2019).</mixed-citation>
</ref>
<ref id="j_info1224_ref_003">
<mixed-citation publication-type="journal"><string-name><surname>Benati</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Puerto</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Rodríguez-Chía</surname>, <given-names>A.M.</given-names></string-name> (<year>2017</year>). <article-title>Clustering data that are graph connected</article-title>. <source>European Journal of Operational Research</source>, <volume>261</volume>(<issue>1</issue>), <fpage>43</fpage>–<lpage>53</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_004">
<mixed-citation publication-type="journal"><string-name><surname>Bezdek</surname>, <given-names>J.C.</given-names></string-name> (<year>1976</year>). <article-title>A physical interpretation of fuzzy ISODATA</article-title>. <source>IEEE Transactions on Systems, Man, and Cybernetics</source>, <volume>6</volume>, <fpage>387</fpage>–<lpage>390</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_005">
<mixed-citation publication-type="book"><string-name><surname>Bezdek</surname>, <given-names>J.C.</given-names></string-name> (<year>1981</year>). <source>Pattern Recognition with Fuzzy Objective Function Algorithms</source>. <publisher-name>Springer US</publisher-name>, <publisher-loc>Boston</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_006">
<mixed-citation publication-type="journal"><string-name><surname>Bezdek</surname>, <given-names>J.C.</given-names></string-name>, <string-name><surname>Ehrlich</surname>, <given-names>R.</given-names></string-name>, <string-name><surname>Full</surname>, <given-names>W.</given-names></string-name> (<year>1984</year>). <article-title>FCM: the fuzzy c-means clustering algorithm</article-title>. <source>Computers &amp; Geosciences</source>, <volume>10</volume>(<issue>2–3</issue>), <fpage>191</fpage>–<lpage>203</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_007">
<mixed-citation publication-type="journal"><string-name><surname>Bezdek</surname>, <given-names>J.C.</given-names></string-name>, <string-name><surname>Hathaway</surname>, <given-names>R.J.</given-names></string-name>, <string-name><surname>Sabin</surname>, <given-names>M.J.</given-names></string-name>, <string-name><surname>Tucker</surname>, <given-names>W.T.</given-names></string-name> (<year>1987</year>). <article-title>Convergence theory for fuzzy c-means: counterexamples and repairs</article-title>. <source>IEEE Transactions on Systems, Man, and Cybernetics</source>, <volume>17</volume>(<issue>5</issue>), <fpage>873</fpage>–<lpage>877</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_008">
<mixed-citation publication-type="journal"><string-name><surname>Borg</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Boldt</surname>, <given-names>M.</given-names></string-name> (<year>2016</year>). <article-title>Clustering residential burglaries using modus operandi and spatiotemporal information</article-title>. <source>International Journal of Information Technology &amp; Decision Making</source>, <volume>15</volume>(<issue>1</issue>), <fpage>23</fpage>–<lpage>42</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_009">
<mixed-citation publication-type="journal"><string-name><surname>Cannon</surname>, <given-names>R.L.</given-names></string-name>, <string-name><surname>Dave</surname>, <given-names>J.V.</given-names></string-name>, <string-name><surname>Bezdek</surname>, <given-names>J.C.</given-names></string-name> (<year>1986</year>). <article-title>Efficient implementation of the fuzzy c-means clustering algorithms</article-title>. <source>IEEE Transactions on Pattern Analysis and Machine Intelligence</source>, <volume>PAMI-8</volume>(<issue>2</issue>), <fpage>248</fpage>–<lpage>255</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_010">
<mixed-citation publication-type="journal"><string-name><surname>Chan</surname>, <given-names>K.P.</given-names></string-name>, <string-name><surname>Cheung</surname>, <given-names>Y.S.</given-names></string-name> (<year>1992</year>). <article-title>Clustering of clusters</article-title>. <source>Pattern Recognition</source>, <volume>25</volume>(<issue>2</issue>), <fpage>211</fpage>–<lpage>217</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_011">
<mixed-citation publication-type="chapter"><string-name><surname>Choe</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Jordan</surname>, <given-names>J.B.</given-names></string-name> (<year>1992</year>). <chapter-title>On the optimal choice of parameters in a fuzzy c-means algorithm</chapter-title>. In: <source>[1992 Proceedings] IEEE International Conference on Fuzzy Systems</source>. <publisher-name>IEEE</publisher-name>, <publisher-loc>New York</publisher-loc>, pp. <fpage>349</fpage>–<lpage>354</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_012">
<mixed-citation publication-type="journal"><string-name><surname>Dembélé</surname>, <given-names>D.</given-names></string-name>, <string-name><surname>Kastner</surname>, <given-names>P.</given-names></string-name> (<year>2003</year>). <article-title>Fuzzy C-means method for clustering microarray data</article-title>. <source>Bioinformatics</source>, <volume>19</volume>(<issue>8</issue>), <fpage>973</fpage>–<lpage>980</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_013">
<mixed-citation publication-type="journal"><string-name><surname>Dunn</surname>, <given-names>J.C.</given-names></string-name> (<year>1973</year>). <article-title>A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters</article-title>. <source>Journal of Cybernetics</source>, <volume>3</volume>(<issue>3</issue>), <fpage>32</fpage>–<lpage>57</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_014">
<mixed-citation publication-type="journal"><string-name><surname>Fadili</surname>, <given-names>M.J.</given-names></string-name>, <string-name><surname>Ruan</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Bloyet</surname>, <given-names>D.</given-names></string-name>, <string-name><surname>Mazoyer</surname>, <given-names>B.</given-names></string-name> (<year>2001</year>). <article-title>On the number of clusters and the fuzziness index for unsupervised FCA application to BOLD fMRI time series</article-title>. <source>Medical Image Analysis</source>, <volume>5</volume>(<issue>1</issue>), <fpage>55</fpage>–<lpage>67</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_015">
<mixed-citation publication-type="journal"><string-name><surname>Hall</surname>, <given-names>L.O.</given-names></string-name>, <string-name><surname>Bensaid</surname>, <given-names>A.M.</given-names></string-name>, <string-name><surname>Clarke</surname>, <given-names>L.P.</given-names></string-name>, <string-name><surname>Velthuizen</surname>, <given-names>R.P.</given-names></string-name>, <string-name><surname>Silbiger</surname>, <given-names>M.S.</given-names></string-name>, <string-name><surname>Bezdek</surname>, <given-names>J.C.</given-names></string-name> (<year>1992</year>). <article-title>A comparison of neural network and fuzzy clustering techniques in segmenting magnetic resonance images of the brain</article-title>. <source>IEEE Transactions on Neural Networks</source>, <volume>3</volume>(<issue>5</issue>), <fpage>672</fpage>–<lpage>682</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_016">
<mixed-citation publication-type="book"><string-name><surname>Hartigan</surname>, <given-names>J.A.</given-names></string-name> (<year>1975</year>). <source>Clustering Algorithms</source>. <publisher-name>John Wiley &amp; Sons</publisher-name>, <publisher-loc>New York</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_017">
<mixed-citation publication-type="journal"><string-name><surname>Hathaway</surname>, <given-names>R.J.</given-names></string-name>, <string-name><surname>Bezdek</surname>, <given-names>J.C.</given-names></string-name>, <string-name><surname>Hu</surname>, <given-names>Y.</given-names></string-name> (<year>2000</year>). <article-title>Generalized fuzzy c-means clustering strategies using L<sub>p</sub> norm distances</article-title>. <source>IEEE Transactions on Fuzzy Systems</source>, <volume>8</volume>(<issue>5</issue>), <fpage>576</fpage>–<lpage>582</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_018">
<mixed-citation publication-type="journal"><string-name><surname>Hou</surname>, <given-names>Z.</given-names></string-name>, <string-name><surname>Qian</surname>, <given-names>W.</given-names></string-name>, <string-name><surname>Huang</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Hu</surname>, <given-names>Q.</given-names></string-name>, <string-name><surname>Nowinski</surname>, <given-names>W.L.</given-names></string-name> (<year>2007</year>). <article-title>Regularized fuzzy c-means method for brain tissue clustering</article-title>. <source>Pattern Recognition Letters</source>, <volume>28</volume>(<issue>13</issue>), <fpage>1788</fpage>–<lpage>1794</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_019">
<mixed-citation publication-type="journal"><string-name><surname>Jain</surname>, <given-names>A.K.</given-names></string-name> (<year>2010</year>). <article-title>Data clustering: 50 years beyond K-means</article-title>. <source>Pattern Recognition Letters</source>, <volume>31</volume>(<issue>8</issue>), <fpage>651</fpage>–<lpage>666</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_020">
<mixed-citation publication-type="journal"><string-name><surname>Johnson</surname>, <given-names>S.C.</given-names></string-name> (<year>1967</year>). <article-title>Hierarchical clustering schemes</article-title>. <source>Psychometrika</source>, <volume>32</volume>(<issue>3</issue>), <fpage>241</fpage>–<lpage>254</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_021">
<mixed-citation publication-type="journal"><string-name><surname>Kersten</surname>, <given-names>P.R.</given-names></string-name> (<year>1999</year>). <article-title>Fuzzy order statistics and their application to fuzzy clustering</article-title>. <source>IEEE Transactions on Fuzzy Systems</source>, <volume>7</volume>(<issue>6</issue>), <fpage>708</fpage>–<lpage>712</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_022">
<mixed-citation publication-type="journal"><string-name><surname>Khemchandani</surname>, <given-names>R.</given-names></string-name>, <string-name><surname>Pal</surname>, <given-names>A.</given-names></string-name> (<year>2019</year>). <article-title>Fuzzy semi-supervised weighted linear loss twin support vector clustering</article-title>. <source>Knowledge-Based Systems</source>, <volume>165</volume>, <fpage>132</fpage>–<lpage>148</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_023">
<mixed-citation publication-type="chapter"><string-name><surname>MacQueen</surname>, <given-names>J.</given-names></string-name> (<year>1967</year>). <chapter-title>Some methods for classification and analysis of multivariate observations</chapter-title>. In: <source>Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1: Statistics</source>. <publisher-name>University of California Press</publisher-name>, <publisher-loc>Berkeley</publisher-loc>, pp. <fpage>281</fpage>–<lpage>297</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_024">
<mixed-citation publication-type="journal"><string-name><surname>Mehdizadeh</surname>, <given-names>E.</given-names></string-name>, <string-name><surname>Teimouri</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Zaretalab</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Niaki</surname>, <given-names>S.T.A.</given-names></string-name> (<year>2017</year>). <article-title>A combined approach based on <italic>k</italic>-means and modified electromagnetism-like mechanism for data clustering</article-title>. <source>International Journal of Information Technology &amp; Decision Making</source>, <volume>16</volume>(<issue>5</issue>), <fpage>1279</fpage>–<lpage>1307</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_025">
<mixed-citation publication-type="journal"><string-name><surname>Mokhtari</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Salmasnia</surname>, <given-names>A.</given-names></string-name> (<year>2015</year>). <article-title>An evolutionary clustering-based optimization to minimize total weighted completion time variance in a multiple machine manufacturing system</article-title>. <source>International Journal of Information Technology &amp; Decision Making</source>, <volume>14</volume>(<issue>5</issue>), <fpage>971</fpage>–<lpage>991</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_026">
<mixed-citation publication-type="journal"><string-name><surname>Motlagh</surname>, <given-names>O.</given-names></string-name>, <string-name><surname>Berry</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>O’Neil</surname>, <given-names>L.</given-names></string-name> (<year>2019</year>). <article-title>Clustering of residential electricity customers using load time series</article-title>. <source>Applied Energy</source>, <volume>237</volume>, <fpage>11</fpage>–<lpage>24</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_027">
<mixed-citation publication-type="journal"><string-name><surname>Olde Keizer</surname>, <given-names>M.C.A.</given-names></string-name>, <string-name><surname>Teunter</surname>, <given-names>R.H.</given-names></string-name>, <string-name><surname>Veldman</surname>, <given-names>J.</given-names></string-name> (<year>2016</year>). <article-title>Clustering condition-based maintenance for systems with redundancy and economic dependencies</article-title>. <source>European Journal of Operational Research</source>, <volume>251</volume>(<issue>2</issue>), <fpage>531</fpage>–<lpage>540</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_028">
<mixed-citation publication-type="chapter"><string-name><surname>Ozkan</surname>, <given-names>I.</given-names></string-name>, <string-name><surname>Turksen</surname>, <given-names>I.B.</given-names></string-name> (<year>2004</year>). <chapter-title>Entropy assessment for type-2 fuzziness</chapter-title>. In: <source>2004 IEEE International Conference on Fuzzy Systems (IEEE Cat. No. 04CH37542)</source>. <publisher-name>IEEE</publisher-name>, <publisher-loc>New York</publisher-loc>, pp. <fpage>1111</fpage>–<lpage>1115</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_029">
<mixed-citation publication-type="journal"><string-name><surname>Ozkan</surname>, <given-names>I.</given-names></string-name>, <string-name><surname>Turksen</surname>, <given-names>I.B.</given-names></string-name> (<year>2007</year>). <article-title>Upper and lower values for the level of fuzziness in FCM</article-title>. <source>Information Sciences</source>, <volume>177</volume>(<issue>23</issue>), <fpage>5143</fpage>–<lpage>5152</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_030">
<mixed-citation publication-type="journal"><string-name><surname>Pal</surname>, <given-names>N.R.</given-names></string-name>, <string-name><surname>Bezdek</surname>, <given-names>J.C.</given-names></string-name> (<year>1995</year>). <article-title>On cluster validity for the fuzzy c-means model</article-title>. <source>IEEE Transactions on Fuzzy Systems</source>, <volume>3</volume>(<issue>3</issue>), <fpage>370</fpage>–<lpage>379</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_031">
<mixed-citation publication-type="book"><string-name><surname>Papoulis</surname>, <given-names>A.</given-names></string-name> (<year>1990</year>). <source>Probability and Statistics</source>. <publisher-name>Prentice-Hall</publisher-name>, <publisher-loc>Upper Saddle River</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_032">
<mixed-citation publication-type="journal"><string-name><surname>Park</surname>, <given-names>D.C.</given-names></string-name> (<year>2009</year>). <article-title>Classification of audio signals using Fuzzy c-Means with divergence-based Kernel</article-title>. <source>Pattern Recognition Letters</source>, <volume>30</volume>(<issue>9</issue>), <fpage>794</fpage>–<lpage>798</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_033">
<mixed-citation publication-type="journal"><string-name><surname>Pham</surname>, <given-names>N.V.</given-names></string-name>, <string-name><surname>Pham</surname>, <given-names>L.T.</given-names></string-name>, <string-name><surname>Nguyen</surname>, <given-names>T.D.</given-names></string-name>, <string-name><surname>Ngo</surname>, <given-names>L.T.</given-names></string-name> (<year>2018</year>). <article-title>A new cluster tendency assessment method for fuzzy co-clustering in hyperspectral image analysis</article-title>. <source>Neurocomputing</source>, <volume>307</volume>, <fpage>213</fpage>–<lpage>226</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_034">
<mixed-citation publication-type="chapter"><string-name><surname>Shen</surname>, <given-names>Y.</given-names></string-name>, <string-name><surname>Shi</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Zhang</surname>, <given-names>J.Q.</given-names></string-name> (<year>2001</year>). <chapter-title>Improvement and optimization of a fuzzy C-means clustering algorithm</chapter-title>. In: <source>Proceedings of the 18th IEEE Instrumentation and Measurement Technology Conference. Rediscovering Measurement in the Age of Informatics (Cat. No. 01CH 37188)</source>. <publisher-name>IEEE</publisher-name>, <publisher-loc>New York</publisher-loc>, pp. <fpage>1430</fpage>–<lpage>1433</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_035">
<mixed-citation publication-type="journal"><string-name><surname>Truong</surname>, <given-names>H.Q.</given-names></string-name>, <string-name><surname>Ngo</surname>, <given-names>L.T.</given-names></string-name>, <string-name><surname>Pedrycz</surname>, <given-names>W.</given-names></string-name> (<year>2017</year>). <article-title>Granular fuzzy possibilistic C-means clustering approach to DNA microarray problem</article-title>. <source>Knowledge-Based Systems</source>, <volume>133</volume>, <fpage>53</fpage>–<lpage>65</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_036">
<mixed-citation publication-type="journal"><string-name><surname>Wu</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Chen</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Xiong</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Xie</surname>, <given-names>M.</given-names></string-name> (<year>2009</year>a). <article-title>External validation measures for K-means clustering: a data distribution perspective</article-title>. <source>Expert Systems with Applications</source>, <volume>36</volume>(<issue>3</issue>), <fpage>6050</fpage>–<lpage>6061</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_037">
<mixed-citation publication-type="chapter"><string-name><surname>Wu</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Xiong</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Chen</surname>, <given-names>J.</given-names></string-name> (<year>2009</year>b). <chapter-title>Adapting the right measures for K-means clustering</chapter-title>. In: <source>Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining – KDD 09</source>. <publisher-name>ACM Press</publisher-name>, <publisher-loc>New York</publisher-loc>, pp. <fpage>877</fpage>–<lpage>886</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_038">
<mixed-citation publication-type="journal"><string-name><surname>Wu</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Xiong</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Chen</surname>, <given-names>J.</given-names></string-name> (<year>2009</year>c). <article-title>Towards understanding hierarchical clustering: a data distribution perspective</article-title>. <source>Neurocomputing</source>, <volume>72</volume>(<issue>10–12</issue>), <fpage>2319</fpage>–<lpage>2330</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_039">
<mixed-citation publication-type="journal"><string-name><surname>Wu</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Xiong</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Liu</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Chen</surname>, <given-names>J.</given-names></string-name> (<year>2012</year>). <article-title>A generalization of distance functions for fuzzy c-means clustering with centroids of arithmetic means</article-title>. <source>IEEE Transactions on Fuzzy Systems</source>, <volume>20</volume>(<issue>3</issue>), <fpage>557</fpage>–<lpage>571</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_040">
<mixed-citation publication-type="journal"><string-name><surname>Wu</surname>, <given-names>K.L.</given-names></string-name> (<year>2012</year>). <article-title>Analysis of parameter selections for fuzzy c-means</article-title>. <source>Pattern Recognition</source>, <volume>45</volume>(<issue>1</issue>), <fpage>407</fpage>–<lpage>415</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_041">
<mixed-citation publication-type="journal"><string-name><surname>Xiong</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Wu</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Chen</surname>, <given-names>J.</given-names></string-name> (<year>2009</year>). <article-title>K-means clustering versus validation measures: a data-distribution perspective</article-title>. <source>IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)</source>, <volume>39</volume>(<issue>2</issue>), <fpage>318</fpage>–<lpage>331</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_042">
<mixed-citation publication-type="journal"><string-name><surname>Yu</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Cheng</surname>, <given-names>Q.</given-names></string-name>, <string-name><surname>Huang</surname>, <given-names>H.</given-names></string-name> (<year>2004</year>). <article-title>Analysis of the weighting exponent in the FCM</article-title>. <source>IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)</source>, <volume>34</volume>(<issue>1</issue>), <fpage>634</fpage>–<lpage>639</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_043">
<mixed-citation publication-type="journal"><string-name><surname>Zhao</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Xu</surname>, <given-names>Z.</given-names></string-name>, <string-name><surname>Wang</surname>, <given-names>Z.</given-names></string-name> (<year>2013</year>). <article-title>Intuitionistic fuzzy clustering algorithm based on Boole matrix and association measure</article-title>. <source>International Journal of Information Technology &amp; Decision Making</source>, <volume>12</volume>(<issue>1</issue>), <fpage>95</fpage>–<lpage>118</lpage>.</mixed-citation>
</ref>
<ref id="j_info1224_ref_044">
<mixed-citation publication-type="journal"><string-name><surname>Zhou</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Yang</surname>, <given-names>S.</given-names></string-name> (<year>2016</year>). <article-title>Exploring the uniform effect of FCM clustering: a data distribution perspective</article-title>. <source>Knowledge-Based Systems</source>, <volume>96</volume>, <fpage>76</fpage>–<lpage>83</lpage>.</mixed-citation>
</ref>
</ref-list>
</back>
</article>