{"id":5431,"date":"2020-05-24T00:59:02","date_gmt":"2020-05-23T23:59:02","guid":{"rendered":"http:\/\/mathscitech.org\/articles\/?p=5431"},"modified":"2024-10-15T10:00:03","modified_gmt":"2024-10-15T09:00:03","slug":"fuzzy-classifier","status":"publish","type":"post","link":"https:\/\/mathscitech.org\/articles\/fuzzy-classifier","title":{"rendered":"Fuzzy Classifiers and Quantile Statistics for continuous data monitoring with adaptive thresholds"},"content":{"rendered":"<p><strong>Abstract<\/strong>  This brief note explores the use of fuzzy classifiers, with membership functions chosen using a statistical heuristic (quantile statistics), to monitor time-series metrics.  The time series can arise from environmental measurements, industrial process control data, or sensor system outputs.  We demonstrate implementation using the <a href=\"https:\/\/mathscitech.org\/articles\/computing-toolkits\/r-for-stats\" rel=\"noopener noreferrer\" target=\"_blank\">R language<\/a> on an example dataset (ozone levels in New York City).  
Click here to skip straight to the <a href=\"https:\/\/mathscitech.org\/articles\/fuzzy-classifier#appendix-using-fuzzycluster\">coded solution<\/a>, or read on for the discussion.<\/p>\n<div id=\"attachment_5510\" style=\"width: 534px\" class=\"wp-caption aligncenter\"><img decoding=\"async\" aria-describedby=\"caption-attachment-5510\" loading=\"lazy\" src=\"https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/05\/screenshot.1687.png\" alt=\"\" width=\"524\" height=\"492\" class=\"size-full wp-image-5510\" srcset=\"https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/05\/screenshot.1687.png 524w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/05\/screenshot.1687-300x282.png 300w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/05\/screenshot.1687-150x141.png 150w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/05\/screenshot.1687-400x376.png 400w\" sizes=\"auto, (max-width: 524px) 100vw, 524px\" \/><p id=\"caption-attachment-5510\" class=\"wp-caption-text\">Fuzzy classification into 5 classes using p10 and p90 levels to achieve an 80-20 rule in the outermost classes and graded class membership in the inner three classes.  Comparison with crisp classifier using the same 80-20 rule is shown in the bottom panel of the figure.<\/p><\/div>\n<p><!--more--><\/p>\n<h3>1. Case study: providing automatic classification of urban air quality<\/h3>\n<p><strong>Time-series approach<\/strong><br \/>\nConsider the problem of monitoring ozone levels in an urban environment and issuing a daily text message alert to residents indicating meaningful risk levels.  What should these levels be, and how should the thresholds be set?  If we imagine a scenario where clinical studies on human response do not yet exist on which to base the thresholds, an alternative approach is to provide a statistical classification of ozone levels relative to previous history, i.e. 
classed as VERY-LOW, LOW, NORMAL, HIGH, and VERY-HIGH.  Let&#8217;s see how we might do this using an example dataset from a New York City study.<\/p>\n<p><strong>Looking at the NYC ozone study<\/strong><br \/>\nThe NYC ozone dataset from the New York State Department of Conservation is built into R and measures mean ozone levels (alongside 5 other variables: solar radiation, wind, temperature, month, and day) over 153 days in the summer of 1973 (May 1st to Sep 30th, 1300-1500h, Roosevelt Island station).  Let&#8217;s load it up in R and inspect it.<\/p>\n<p><code><\/p>\n<pre>\r\n### Load airquality dataset\r\nlibrary(datasets)       # built-in datasets\r\nairquality              # print the full data frame\r\nhelp(airquality)        # show dataset details\r\n\r\nairquality$Ozone        # parts per billion (contains NAs)\r\n\r\n### Data visualization\r\nplot(airquality$Ozone)\r\nlines(airquality$Ozone)\r\ntitle(\"Ozone level at Roosevelt Island, NYC\\n mean ppb, 1300h-1500h, May 1-Sep 30, 1973\") \r\nhist(airquality$Ozone)\r\n<\/pre>\n<p><\/code><\/p>\n<div id=\"attachment_5465\" style=\"width: 641px\" class=\"wp-caption aligncenter\"><img decoding=\"async\" aria-describedby=\"caption-attachment-5465\" loading=\"lazy\" src=\"https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1644-1.png\" alt=\"\" width=\"631\" height=\"588\" class=\"size-full wp-image-5465\" srcset=\"https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1644-1.png 631w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1644-1-300x280.png 300w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1644-1-150x140.png 150w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1644-1-400x373.png 400w\" sizes=\"auto, (max-width: 631px) 100vw, 631px\" \/><p id=\"caption-attachment-5465\" class=\"wp-caption-text\">1973 ozone study in NYC over 153 days in the summer.  
Exploratory plot in R using airquality$Ozone in library(datasets)<\/p><\/div>\n<p>To summarize the observations, use boxplot() to show graphically its non-parametric statistics, i.e. the 5-number summary at min (p0), p25, p50, p75, and max (p100) levels.  quantile() shows the numerical values.  Note that by convention the 5-number summary captures the middle 50% of the distribution (between p25 and p75).<\/p>\n<p><code><\/p>\n<pre>\r\nboxplot(airquality$Ozone)\r\nclean <- function(v) { v[!is.na(v)] }      # removes NAs from a vector  (ex: clean(d))\r\ndd <- clean(airquality$Ozone)\r\nquantile(dd)                               # 5-number summary at min, Q1, Q2, Q3, max levels\r\n<\/pre>\n<p><\/code><\/p>\n<div id=\"attachment_5459\" style=\"width: 707px\" class=\"wp-caption aligncenter\"><img decoding=\"async\" aria-describedby=\"caption-attachment-5459\" loading=\"lazy\" src=\"https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1648-2.png\" alt=\"\" width=\"697\" height=\"284\" class=\"size-full wp-image-5459\" srcset=\"https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1648-2.png 697w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1648-2-300x122.png 300w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1648-2-150x61.png 150w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1648-2-400x163.png 400w\" sizes=\"auto, (max-width: 697px) 100vw, 697px\" \/><p id=\"caption-attachment-5459\" class=\"wp-caption-text\">Boxplot shows the 5-number summary: Quartile 1 and Quartile 3 (25th and 75th percentiles) are the box bottom and top, and Quartile 2 (median) is the bar within the box.  By default R draws the whiskers to the most extreme observations within 1.5 times the interquartile range of the box and plots anything beyond as individual points, so min\/max appear either as whisker ends or as outlier points.<\/p><\/div>\n<h3>2. 
Choosing a statistical classifier<\/h3>\n<p>Classifiers are heuristics matching input data to categorical labels, with the \"goodness of fit\" depending on an (often subjective) separate estimate of the correct answer.<\/p>\n<p>In what follows, we use the non-parametric statistics heuristic with p-levels associated with quantiles as a way to capture a fixed % of the distribution.  For time-series data that are, or can be assumed to be, essentially stationary, the selection of thresholds can be done once and adjusted dynamically; for highly varying, seasonal, or other non-stationary time-series, the analysis becomes more complicated, with the central ideas below requiring adaptation.<\/p>\n<p>Let us take as a tenet the use of the 80-20 rule for flagging HIGH and LOW outliers, i.e. 1 in 5 observations should be abnormal, and the remaining 4 considered within usual levels of variation.  This implies setting the quantiles at p10 and p90 levels.  Notice that quantile() is now called with a custom probs vector containing the p10 and p90 levels, and the observations are plotted against the resulting threshold lines.<\/p>\n<p><code><\/p>\n<pre>\r\nqq <- quantile(dd, probs=c(0, 0.1, 0.5, 0.9, 1)) # 5-number summary at custom levels\r\nqq\r\nplot(airquality$Ozone)\r\ncurve(0*x+qq[[2]],add=TRUE)    # add horizontal lines at the p10 and p90 quantile levels\r\ncurve(0*x+qq[[4]],add=TRUE)\r\n<\/pre>\n<p><\/code><\/p>\n<div id=\"attachment_5477\" style=\"width: 653px\" class=\"wp-caption aligncenter\"><img decoding=\"async\" aria-describedby=\"caption-attachment-5477\" loading=\"lazy\" src=\"https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1651.png\" alt=\"\" width=\"643\" height=\"556\" class=\"size-full wp-image-5477\" srcset=\"https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1651.png 643w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1651-300x259.png 300w, 
https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1651-150x130.png 150w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1651-400x346.png 400w\" sizes=\"auto, (max-width: 643px) 100vw, 643px\" \/><p id=\"caption-attachment-5477\" class=\"wp-caption-text\">Using custom levels of p10 and p90 for the upper and lower classes still leaves a broad range of values in the middle class.<\/p><\/div>\n<h3>3. The difference between crisp (conventional) and fuzzy classifiers.<\/h3>\n<p>If this is all we do, then it remains only to formalize the classification around the pre-selected p10 and p90 thresholds.  This is given below, and it should be no surprise that the results show about 80% of the values classed as normal, with about 10% in each of the two classes LOW and HIGH.  The size of the buckets was pre-determined by the tenet of an 80-20 classification outcome.  The only choice was what this meant in terms of the thresholds, and this was provided by the quantile() function at the respective p-levels.<\/p>\n<p><code><\/p>\n<pre>\r\ncc3 <- c(\"LOW\", \"NORMAL\", \"HIGH\")\r\n\r\n# Crisp classifier\r\nclassify <- function(x) {\r\n    if (x < qq[[2]]) cc3[1]\r\n    else if (x > qq[[4]]) cc3[3]\r\n    else cc3[2]\r\n}\r\nb <- sapply(dd, classify)\r\nb\r\ntable(b)               # class counts\r\ntable(b)\/length(dd)    # class proportions\r\n<\/pre>\n<p><\/code><\/p>\n<div id=\"attachment_5469\" style=\"width: 649px\" class=\"wp-caption aligncenter\"><img decoding=\"async\" aria-describedby=\"caption-attachment-5469\" loading=\"lazy\" src=\"https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1655.png\" alt=\"\" width=\"639\" height=\"502\" class=\"size-full wp-image-5469\" srcset=\"https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1655.png 639w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1655-300x236.png 300w, 
https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1655-150x118.png 150w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1655-400x314.png 400w\" sizes=\"auto, (max-width: 639px) 100vw, 639px\" \/><p id=\"caption-attachment-5469\" class=\"wp-caption-text\">Non-fuzzy classifier using the p10 and p90 threshold levels.<\/p><\/div>\n<p>But you should be protesting at the use of such a classifier.  Within the \"normal\" bucket, containing 80% of the distribution, there is a wide range of values, which leads to a problem that bedevils all attempts at crisp classification into a manageable number of levels: values at the top and bottom ends (for example 85 and 18 ppb) are both classed together as normal despite the fact that they are closer to points on the other side of the class boundaries (for example 89 and 12 ppb) that are classed as HIGH and LOW respectively.<\/p>\n<p><strong>Crisp classifiers get around the above limitations by using more complex mathematical models<\/strong><br \/>\nThe discussion so far has taken the simplest possible approach to classification, and the critique of the simple approach above is deserved.  But of course, no one in statistical classification would use the threshold logic as described above without accounting for the distance between points.  One approach would be to find a separating hyperplane by optimizing a distance criterion between the points on either side of that plane (for example, maximizing the margin).  A similar approach is based on choosing centroids of clusters, minimizing the distances of points to their cluster centroid.  
Both of these optimization algorithms get around the above critique and give good results, but they are much more complex mathematically (see the classic reference [Tibshirani\/2009] <a href=\"https:\/\/web.stanford.edu\/~hastie\/ElemStatLearn\/\" rel=\"noopener noreferrer\" target=\"_blank\">Elements of Statistical Learning, free PDF download<\/a> courtesy of the authors & publishers).<\/p>\n<h3>4. The fuzzy classifier: making a naive classification approach work with low computational impact<\/h3>\n<p><strong>The appeal of fuzzy classification is that it preserves the simple approach and resolves the discontinuity problem in a computationally efficient way, using a simple min-max calculation at each point rather than requiring the solution of a combinatorial optimization problem ranging over all pairs of points.<\/strong><\/p>\n<p>The key to fuzzy classification is a definition of set membership that includes <em>degree of membership<\/em> as a mechanism to allow \"fuzzy boundaries\".  A single element is now a member of all classes, but to different degrees.  The final classification has a strength, which matches more closely our intuition that classifications are inherently fuzzy, resisting crisp boundaries.  
<\/p>\n<p>Let's look at how this is implemented.<\/p>\n<p><code><\/p>\n<pre>\r\n# Defining linear membership functions L(x), M(x), H(x) (low, medium, high) based on p10 and p90 quantile levels\r\nx0   <- qq[[2]]\r\nx1   <- qq[[4]]\r\nLy0  <- 1.0\r\nLy1  <- 0.0\r\nHy0  <- 0.0\r\nHy1  <- 1.0\r\nxmin <- qq[[1]]\r\nxmed <- qq[[3]]\r\nxmax <- qq[[5]]\r\n\r\nL <- function(x) {                                       # Low\r\n         if (x < x0) Ly0\r\n    else if (x > x1) Ly1\r\n    else             Ly0 + (x-x0)*(Ly1-Ly0)\/(x1-x0)\r\n}\r\n\r\nH <- function(x) {                                       # High \r\n         if (x < x0) Hy0\r\n    else if (x > x1) Hy1\r\n    else             Hy0 + (x-x0)*(Hy1-Hy0)\/(x1-x0)\r\n}\r\n\r\nM <- function(x) {                                      # Medium\r\n         if (x < xmed) Hy0 + (x-xmin)*(Hy1-Hy0)\/(xmed-xmin)\r\n    else               Hy0 + (x-xmax)*(Ly1-Ly0)\/(xmax-xmed)\r\n}\r\n\r\n# show fuzzy membership functions\r\nx <- 1:max(dd)\r\nplot(c(1,max(dd)),c(0,1))\r\nlines(x,sapply(x,L))\r\nlines(x,sapply(x,H))\r\nlines(x,sapply(x,M))\r\n<\/pre>\n<p><\/code><\/p>\n<div id=\"attachment_5478\" style=\"width: 536px\" class=\"wp-caption aligncenter\"><img decoding=\"async\" aria-describedby=\"caption-attachment-5478\" loading=\"lazy\" src=\"https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1658b.png\" alt=\"\" width=\"526\" height=\"476\" class=\"size-full wp-image-5478\" srcset=\"https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1658b.png 526w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1658b-300x271.png 300w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1658b-150x136.png 150w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1658b-400x362.png 400w\" sizes=\"auto, (max-width: 526px) 100vw, 526px\" \/><p id=\"caption-attachment-5478\" class=\"wp-caption-text\">Fuzzy 
membership functions L(x) - Low, M(x) - Medium, H(x) - High, using the p10 and p90 thresholds.<\/p><\/div>\n<p>Let's talk through what is going on above.  The fuzzy classifier above uses membership functions L(x), M(x), H(x) (low, medium, high) to generate overlapping degrees of membership for each class.  The membership functions are defined over the same domain, allowing classifications that show graded membership.<\/p>\n<p>What happens next is the key bit: the classification of a given point x is made by applying a fuzzification operation across the candidate fuzzy sets.  In this case the fuzzification operator is the simple mathematical maximum, max().  Working an example: the value 70 is classified into the \"HIGH\" bucket based on H(70) returning the highest (max) value from the three choices L(70), M(70), H(70).<\/p>\n<p><code><\/p>\n<pre>\r\n# example:\r\ncc3 <- c(\"LOW\", \"NORMAL\", \"HIGH\")\r\nfv <- c(L(70),M(70),H(70))     # fuzzy vector\r\nfv\r\nmax(fv)\r\nwhich.max(fv)\r\ncc3[which.max(fv)]\r\n<\/pre>\n<p><\/code><\/p>\n<p>We use this notion to define the fuzzy classifier itself, and apply it to the dataset:<\/p>\n<p><code><\/p>\n<pre>\r\n# Fuzzy classifier into 3 classes\r\ncc3 <- c(\"LOW\", \"NORMAL\", \"HIGH\")\r\n\r\nfuzzy_classify3 <- function(x) {\r\n    fv <- c(L(x), M(x), H(x))\r\n    # print(fv)     #debug\r\n    # print(max(fv))  #debug\r\n    m3 <- which.max(fv)\r\n    cc3[m3]\r\n}\r\n\r\nbb <- sapply(dd, fuzzy_classify3)\r\nbb\r\ntable(bb)\r\ntable(bb)\/length(dd)\r\n<\/pre>\n<p><\/code><\/p>\n<div id=\"attachment_5479\" style=\"width: 393px\" class=\"wp-caption aligncenter\"><img decoding=\"async\" aria-describedby=\"caption-attachment-5479\" loading=\"lazy\" src=\"https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1658a.png\" alt=\"\" width=\"383\" height=\"145\" class=\"size-full wp-image-5479\" srcset=\"https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1658a.png 383w, 
https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1658a-300x114.png 300w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1658a-150x57.png 150w\" sizes=\"auto, (max-width: 383px) 100vw, 383px\" \/><p id=\"caption-attachment-5479\" class=\"wp-caption-text\">Example showing fuzzy classification using max over membership functions.<\/p><\/div>\n<div id=\"attachment_5480\" style=\"width: 257px\" class=\"wp-caption aligncenter\"><img decoding=\"async\" aria-describedby=\"caption-attachment-5480\" loading=\"lazy\" src=\"https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1658a2.png\" alt=\"\" width=\"247\" height=\"130\" class=\"size-full wp-image-5480\" srcset=\"https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1658a2.png 247w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1658a2-150x79.png 150w\" sizes=\"auto, (max-width: 247px) 100vw, 247px\" \/><p id=\"caption-attachment-5480\" class=\"wp-caption-text\">Fuzzy classification of NYC ozone levels into 3 classes.<\/p><\/div>\n<p>Notice that the middle class (NORMAL) now occurs only 35% of the time (or 44% of the 80% previously classed as NORMAL), which is a slight problem.  The fuzzy classifier into three clusters no longer preserves the 80-20 rule that led us to use p10 and p90 thresholds in the first place.  <\/p>\n<p>The fix is to use 5 clusters where the outermost two (VERY HIGH and VERY LOW) preserve the 80-20 rule based on the p90 and p10 thresholds, in line with the crisp classifier.  Fuzziness is limited to the middle cluster, now split into three: medium-high and medium-low (now labelled HIGH and LOW), and the reduced MEDIUM class holding the 35% of instances.  
<\/p>\n<p><code><\/p>\n<pre>\r\ncc5 <- c(\"VERY LOW\", \"LOW\", \"MEDIUM\", \"HIGH\", \"VERY HIGH\")\r\n\r\nfuzzy_classify5 <- function(x) {\r\n    fv <- c(L(x), M(x), H(x))\r\n    # print(fv)     #debug\r\n    # print(max(fv))  #debug\r\n    m3 <- which.max(fv)\r\n    m5 <- if (m3==1 &#038;&#038; max(fv)==1) 1 else if (m3==3 &#038;&#038; max(fv)==1) 5 else (1+m3)\r\n    cc5[m5]\r\n}\r\n\r\nbb <- sapply(dd, fuzzy_classify5)\r\nbb\r\ntable(bb)\r\ntable(bb)\/length(dd)\r\n<\/pre>\n<p><\/code><\/p>\n<div id=\"attachment_5510\" style=\"width: 534px\" class=\"wp-caption aligncenter\"><img decoding=\"async\" aria-describedby=\"caption-attachment-5510\" loading=\"lazy\" src=\"https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/05\/screenshot.1687.png\" alt=\"\" width=\"524\" height=\"492\" class=\"size-full wp-image-5510\" srcset=\"https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/05\/screenshot.1687.png 524w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/05\/screenshot.1687-300x282.png 300w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/05\/screenshot.1687-150x141.png 150w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/05\/screenshot.1687-400x376.png 400w\" sizes=\"auto, (max-width: 524px) 100vw, 524px\" \/><p id=\"caption-attachment-5510\" class=\"wp-caption-text\">Fuzzy classification into 5 classes using p10 and p90 levels to achieve an 80-20 rule in the outermost classes and graded class membership in the inner three classes.  
Comparison with crisp classifier using the same 80-20 rule is shown in the bottom panel of the figure.<\/p><\/div>\n<div id=\"attachment_5482\" style=\"width: 643px\" class=\"wp-caption aligncenter\"><img decoding=\"async\" aria-describedby=\"caption-attachment-5482\" loading=\"lazy\" src=\"https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1685.png\" alt=\"\" width=\"633\" height=\"481\" class=\"size-full wp-image-5482\" srcset=\"https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1685.png 633w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1685-300x228.png 300w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1685-150x114.png 150w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1685-400x304.png 400w\" sizes=\"auto, (max-width: 633px) 100vw, 633px\" \/><p id=\"caption-attachment-5482\" class=\"wp-caption-text\">Fuzzy classification into 5 classes.<\/p><\/div>\n<h3>5. Evaluating the outcome<\/h3>\n<p>The fuzzy classifier described above provides a continuous transition in classification strength between classes that share a common boundary.  This can be seen by observing that each membership function is piecewise linear and continuous, hence their pointwise maximum (the max hull) is also continuous.  <\/p>\n<p>The fact that the classification changes themselves are abrupt was not the concern (this is true in any categorical labeling, including in the separating hyperplane solution).  Rather, the concern is the difference in degree of membership between two objects on either side of a class boundary, e.g. 25 (classed LOW) and 26 (classed MEDIUM).  
Using the fuzzy classifier, this inter-cluster membership difference becomes arbitrarily small as the distance between the points shrinks; in the above example it is less than 2%.<\/p>\n<div id=\"attachment_5484\" style=\"width: 287px\" class=\"wp-caption aligncenter\"><img decoding=\"async\" aria-describedby=\"caption-attachment-5484\" loading=\"lazy\" src=\"https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1686.png\" alt=\"\" width=\"277\" height=\"129\" class=\"size-full wp-image-5484\" srcset=\"https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1686.png 277w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2020\/06\/screenshot.1686-150x70.png 150w\" sizes=\"auto, (max-width: 277px) 100vw, 277px\" \/><p id=\"caption-attachment-5484\" class=\"wp-caption-text\">The difference in membership between classes is less than 2% at the class boundary.<\/p><\/div>\n<p>This continuous degree of membership allows both categorical (rules-based) and proportional (level-based) responses to be used as part of an overall fuzzy systems control.<\/p>\n<h3>6. Using fuzzy classifiers - adaptive thresholds, unsupervised vs. supervised learning, and a general framework<\/h3>\n<p>In the application context of the daily text alert, fuzzy classification would provide the autonomous categorization of each alert\/alarm.  The 80-20 tenet would be reflected in the fact that approximately 20% of alerts will be in the VERY HIGH or VERY LOW risk category.<\/p>\n<p>Stepping back from the example context, the general framework for using a fuzzy classifier is as follows:<\/p>\n<p>A. TRAIN the classifier on available history.<br \/>\n  1. Where possible, CHOOSE simple thresholds.  In the above, we used the 80-20 rule and therefore p10\/p90 levels.<br \/>\n  2. 
DEFINE a fuzzy classifier on 5 classes - the outermost two classes preserve the 20% outliers, the inner three classes hold the 80% and provide finer grained classifications of these cases.<\/p>\n<p>B. MONITOR for new inputs<br \/>\n  3. CLASSIFY - classify incoming observations as done above using max() fuzzifier.<br \/>\n  4. RESPOND - decide response with rules which could include alerts or mitigation actions or both<\/p>\n<p>C. UPDATE thresholds dynamically<br \/>\n  5. RECOMPUTE thresholds - add new observations to past history and re-form the membership functions with each observation.<\/p>\n<p>This approach offers:<br \/>\n1. a methodology for continuous monitoring (watchdog), including the use of sensors to instrument a process<br \/>\n2. a way to dynamically adapt the sensitivities, i.e. for the watchdog \"to learn\"<br \/>\n3. a lower barrier to entry -- unsupervised monitoring can start using descriptive statistics adjusted for any application specific tenets.  If specific training data are available, thresholds can be refined using known subsets.<br \/>\n4. a computationally efficient classifier<\/p>\n<p>One should, of course, compare a heuristic as above with the optimized classifier based on a more computationally intensive combinatorial optimization approach.<\/p>\n<h3>7. Possibilities<\/h3>\n<p>For the application programmer, the appeal of fuzzy classifiers lies in the opportunity to define a fuzzy control system while remaining in the categorical abstraction layer, e.g. if HIGH then CLOSE WINDOWS, etc.  This keeps the quantitative basis of the underlying analog sensor (in this case the ozone levels) encapsulated.  As a second example, a fuzzy temperature check could use a thermometer and knowledge of normal body temperatures to classify temperatures into VERY LOW, LOW, NORMAL, ELEVATED, HIGH FEVER levels.  
A user does not need to understand how the heat from our bodies warms mercury which rises (analog thermometer) or how a temperature sensor is digitally sampled.  The layer driving the appropriate response is the fuzzy classification; the number itself is a second, supporting layer of information.<\/p>\n<p>Sensors become interesting for training set purposes.  If the sensors can sample data from a-priori known phenomena or pre-classified events, then the statistics around these known samples can enhance the detection\/classification.  For analog sensors, this opens up myriad possibilities:<\/p>\n<ul>\n<li>car sensors (brakes, auto gear changes)\n<li>light sensors (outside conditions based on changing inside light -- sunny, cloudy, rainy, as well as time of day based on light quality coming into a given south facing room)\n<\/ul>\n<p><strong>Conclusion<\/strong><br \/>\nThe above statistical approach using fuzzy classification and adaptive thresholds is simple, intuitive, and of low computational cost, and can be effective when quick, immediately workable solutions are needed and better solutions can be added iteratively.  The above method can be used with any single-variable data series that is essentially stationary.  
For non-stationary time series and multi-dimensional variables\/classification, more extensive techniques are required.<\/p>\n<hr\/>\n<p><a id=\"appendix-sourcecode\"><\/a><\/p>\n<h3>Appendix 1: Source code in R for the fuzzy classifier<\/h3>\n<p>The exploratory code mirroring the article above is <a href=\"http:\/\/mathscitech.org\/code\/R\/fuzzy_classifier_exploration.r\" rel=\"noopener noreferrer\" target=\"_blank\">here<\/a>.<\/p>\n<p>The final solution code is here, providing a <a href=\"http:\/\/mathscitech.org\/code\/R\/fuzzy_classify.r\" rel=\"noopener noreferrer\" target=\"_blank\">library file<\/a> and an <a href=\"http:\/\/mathscitech.org\/code\/R\/fuzzy_classifier.r\" rel=\"noopener noreferrer\" target=\"_blank\">example usage file<\/a>.<\/p>\n<hr\/>\n<p><a id=\"appendix-using-fuzzycluster\"><\/a><\/p>\n<h3>Appendix 2: Using the fuzzy classifier<\/h3>\n<p>To use the fuzzy classifier, start by loading the <a href=\"http:\/\/mathscitech.org\/code\/R\/fuzzy_classify.r\" rel=\"noopener noreferrer\" target=\"_blank\">fuzzy_classify.r library file<\/a>:<\/p>\n<pre>\r\nsource(\"fuzzy_classify.r\")   # the file must be in the current working directory (check with getwd()), or give its full path\r\n<\/pre>\n<p>The main function is fuzzycluster(d) which runs the fuzzy classifier on dataset d and creates the fuzzy data frame (fdf) holding 9 results needed for analysis, visualization, etc.<\/p>\n<p>Follow the <a href=\"http:\/\/mathscitech.org\/code\/R\/fuzzy_classifier.r\" rel=\"noopener noreferrer\" target=\"_blank\">example user code<\/a> to see how to use the fuzzy classifier library on the ozone data set.<br \/>\n<code><\/p>\n<pre>\r\nsource(\"fuzzy_classify.r\")     #### load FUNCTIONS for fuzzy classifier\r\nlibrary(datasets)       # load built-in datasets\r\nd <- airquality$Ozone   # set time series\r\n### Exploratory data visualization\r\nplot(d)\r\nlines(d)\r\ntitle(\"Ozone level at Roosevelt Island, NY\\n mean ppb, 1300h-1500h, May 1-Sep 30, 1973\") 
\r\nhist(d)\r\nboxplot(d)\r\n\r\n### Run the fuzzy classifier\r\nfdf <- fuzzycluster(d)  # form fuzzy data frame fdf with 9 columns\r\n<\/pre>\n<p><\/code><\/p>\n<p>The key function above is <a href=\"http:\/\/mathscitech.org\/code\/R\/fuzzy_classify.r\" rel=\"noopener noreferrer\" target=\"_blank\">fuzzycluster(d) defined in the library file here<\/a> which runs the fuzzy classifier on dataset vector d and creates the fuzzy data frame (fdf) holding 9 results needed for analysis, visualization, etc.<\/p>\n<ul>\n<li> d - data vector, should be numeric\n<li> dd - cleaned data, with NA removed\n<li> q - quantiles using p10, p90 instead of p25, p75\n<li> bb - fuzzy classifier results: the classification of each value in dd\n<li> vhigh, high, med, low, vlow - sublists showing which values fall in which bucket\n<\/ul>\n<p><code><\/p>\n<pre>\r\n### Review results\r\nplotq(fdf$dd,fdf$q)             # show the cleaned up dataset with thresholds\r\nplot_fmf3(fdf$dd)               # show fuzzy membership functions\r\nfdf$bb                          # show fuzzy bucketization\r\ntable(fdf$bb)                   # summarize classification\r\ntable(fdf$bb)\/length(fdf$dd)    # relative class frequencies\r\ntable(tail(fdf$bb,50))\/50       # performance over the most recent 50 observations\r\nsummary(fdf$vhigh)              # summarize buckets\r\nsummary(fdf$high)\r\nsummary(fdf$med)\r\nsummary(fdf$low)\r\nsummary(fdf$vlow)\r\n<\/pre>\n<p><\/code><\/p>\n<hr\/>\n<p><a id=\"fuzzy-references\"><\/a><\/p>\n<h3>Appendix 3: References<\/h3>\n<ol>\n<p><strong>Papers<\/strong><\/p>\n<li> [Zadeh\/1965] <a href=\"https:\/\/www.sciencedirect.com\/science\/article\/pii\/S001999586590241X\" rel=\"noopener noreferrer\" target=\"_blank\">Fuzzy Sets<\/a>, Lotfi Zadeh, Information & Control, Vol 8, Issue 3, June 1965, Pages 338-353\n<li> [Zadeh\/1972d], <a href=\"http:\/\/www2.eecs.berkeley.edu\/Pubs\/TechRpts\/1972\/ERL-m-342.pdf\" rel=\"noopener noreferrer\" target=\"_blank\">Outline of a New Approach to the Analysis 
of Complex Systems and Decision Processes<\/a>  by Lotfi Zadeh, 1972, Technical Report University of California Berkeley: ERL M342\n<li> [Zadeh, 1977] \u2013 Fuzzy Logic\n<p><strong>Books<\/strong><\/p>\n<li>[Zadeh\/2012] Computing with Words, <a href=\"https:\/\/www.amazon.co.uk\/Computing-Words-Principal-Concepts-Fuzziness\/dp\/3642274722\/\" rel=\"noopener\" target=\"_blank\">\u00a380<\/a>\n<li>[Zadeh\/Aliev\/2019] <a href=\"https:\/\/www.amazon.co.uk\/Fuzzy-Logic-Theory-Applications-Part\/dp\/9813238178\/\" rel=\"noopener\" target=\"_blank\">Fuzzy Logic: Theory & Applications, Vol.1 & 2<\/a>, \u00a3111\n<li>[Cox\/1994] Fuzzy Systems Handbook, <a href=\"https:\/\/www.amazon.co.uk\/gp\/product\/0121942708\/\" rel=\"noopener\" target=\"_blank\">\u00a35.70<\/a>, <a href=\"https:\/\/www.amazon.co.uk\/Fuzzy-Systems-Handbook-Practitioners-Maintaining-dp-0121944557\/dp\/0121944557\/\" rel=\"noopener\" target=\"_blank\">2nd ed (1998), \u00a328<\/a>\n<li>[Heske\/1996] <a href=\"https:\/\/www.amazon.co.uk\/Fuzzy-Logic-Real-World-Design\/dp\/0929392248\/\" rel=\"noopener noreferrer\" target=\"_blank\">Fuzzy Logic for Real World Design<\/a>, Ted & Jill Heske, 1996  <em>Accessible to a secondary school audience.<\/em>\n<li>[Ross\/1994] Fuzzy Logic with Engineering Applications, Tim Ross, <a href=\"https:\/\/www.amazon.co.uk\/gp\/product\/0071136371\/\" rel=\"noopener\" target=\"_blank\">\u00a34.90<\/a>, <a href=\"https:\/\/www.amazon.co.uk\/Fuzzy-Logic-Engineering-Applications-Timothy-dp-1119235863\/dp\/1119235863\/\" rel=\"noopener\" target=\"_blank\">4th ed \u00a353<\/a>\n<li>[Chen\/Pham\/2001] <a href=\"https:\/\/www.amazon.co.uk\/Introduction-Fuzzy-Logic-Control-Systems\/dp\/0367397889\/\" rel=\"noopener\" target=\"_blank\">Introduction to Fuzzy Sets, Fuzzy Logic, and Fuzzy Control Systems<\/a>, (Sale price: \u00a36.12, Regular price: \u00a360) by Guanrong Chen and Trung Tat Pham, 2001, CRC Press, 328pp. <em>Comprehensive treatment of interval methods, i.e. 
Calculus of Fuzzy Rules (CFR)<\/em>\n<li>[Li\/Yen\/1995] <a href=\"https:\/\/www.amazon.co.uk\/Fuzzy-Sets-Decision-Making-Hongxing-Li\/dp\/0849389313\/\" rel=\"noopener\" target=\"_blank\">Fuzzy Sets and Fuzzy Decision-Making<\/a>, by Hong Xing Li and Vincent C. Yen, 1995, CRC Press, 270pp.\n<li>[Nguyen\/Walker\/2023] A First Course in Fuzzy Logic, by Hung T. Nguyen, Carol L. Walker, and Elbert A. Walker, <a href=\"https:\/\/www.amazon.co.uk\/First-Course-Fuzzy-Textbooks-Mathematics\/dp\/1032475943\/\" rel=\"noopener\" target=\"_blank\">4th ed (\u00a340)<\/a>, <a href=\"https:\/\/www.amazon.co.uk\/First-Course-Fuzzy-Logic-Third\/dp\/0849316596\/\" rel=\"noopener\" target=\"_blank\">3rd ed (\u00a320)<\/a>, <a href=\"https:\/\/www.amazon.co.uk\/First-Course-Fuzzy-Nguyen-1996-08-22\/dp\/B01K2K1Q32\/\" rel=\"noopener\" target=\"_blank\">2nd ed<\/a>, 1st ed\n<li>[Nguyen\/Prasad\/Walker\/2002] A First Course in Fuzzy and Neural Control, <a href=\"https:\/\/www.amazon.co.uk\/First-Course-Fuzzy-Neural-Control\/dp\/1584882441\/\" rel=\"noopener\" target=\"_blank\">\u00a332<\/a>\n<p><strong>Related Technologies<\/strong><\/p>\n<li> [Tibshirani\/2009] <a href=\"https:\/\/web.stanford.edu\/~hastie\/ElemStatLearn\/\" rel=\"noopener noreferrer\" target=\"_blank\">Elements of Statistical Learning, 2nd Edition<\/a> (free PDF download courtesy of the authors & publishers), Trevor Hastie, Robert Tibshirani, and Jerome Friedman, 2009, Springer-Verlag.  <a href=\"https:\/\/web.stanford.edu\/~hastie\/ElemStatLearn\/reviews\/Pages%20from%20siam.pdf\" rel=\"noopener noreferrer\" target=\"_blank\">Review of First Edition in SIAM (2002)<\/a> by Michael Chernick.\n<p><strong>Historical<\/strong><\/p>\n<li> [Trillas\/2011] <a href=\"https:\/\/www.sciencedirect.com\/science\/article\/pii\/S1026309811000666\" rel=\"noopener noreferrer\" target=\"_blank\">Lotfi A. Zadeh: On the Man and his Work<\/a>, E. Trillas, Scientia Iranica, Volume 18, Issue 3, June 2011, Pages 574-579\n<li> Lotfi A. 
Zadeh biography (<a href=\"https:\/\/en.wikipedia.org\/wiki\/Lotfi_A._Zadeh\" rel=\"noopener noreferrer\" target=\"_blank\">Wikipedia<\/a>) | Obituary: <a href=\"https:\/\/www.newyorker.com\/tech\/annals-of-technology\/remembering-lotfi-zadeh-the-inventor-of-fuzzy-logic\" rel=\"noopener noreferrer\" target=\"_blank\">Zadeh, Sep 19th, 2017, The New Yorker<\/a> | Zadeh Papers: <a href=\"https:\/\/www2.eecs.berkeley.edu\/Pubs\/Faculty\/zadeh.html\" rel=\"noopener noreferrer\" target=\"_blank\">Zadeh Papers & Technical Reports at Berkeley<\/a>, <a href=\"https:\/\/dl.acm.org\/profile\/81375596015\/publications?Role=author&#038;startPage=0&#038;sortBy=Ppub_asc\" rel=\"noopener noreferrer\" target=\"_blank\">Journal articles at ACM<\/a>\n<li>[Kreinovich\/2011] In the Beginning was the word, and the word was Fuzzy (<a href=\"https:\/\/scholarworks.utep.edu\/cs_techrep\/629\/\" rel=\"noopener\" target=\"_blank\">Online<\/a>) <em>Short expository essay reflecting on Kreinovich's introduction to fuzzy logic in the former USSR and its connection to the dialectic philosophy of Hegel.<\/em>\n<\/ol>\n","protected":false},"excerpt":{"rendered":"<p>Abstract This brief note explores the use of fuzzy classifiers, with membership functions chosen using a statistical heuristic (quantile statistics), to monitor time-series metrics. The time series can arise from environmental measurements, industrial process control data, or sensor system outputs. 
We demonstrate implementation using the [Read More&#8230;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"nf_dc_page":"","footnotes":""},"categories":[8,119,120,100,12],"tags":[],"coauthors":[112],"class_list":["post-5431","post","type-post","status-publish","format-standard","hentry","category-statistics","category-technical","category-software-engineering","category-softwaretools","category-technology","odd"],"views":5365,"_links":{"self":[{"href":"https:\/\/mathscitech.org\/articles\/wp-json\/wp\/v2\/posts\/5431","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mathscitech.org\/articles\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mathscitech.org\/articles\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mathscitech.org\/articles\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mathscitech.org\/articles\/wp-json\/wp\/v2\/comments?post=5431"}],"version-history":[{"count":79,"href":"https:\/\/mathscitech.org\/articles\/wp-json\/wp\/v2\/posts\/5431\/revisions"}],"predecessor-version":[{"id":12074,"href":"https:\/\/mathscitech.org\/articles\/wp-json\/wp\/v2\/posts\/5431\/revisions\/12074"}],"wp:attachment":[{"href":"https:\/\/mathscitech.org\/articles\/wp-json\/wp\/v2\/media?parent=5431"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mathscitech.org\/articles\/wp-json\/wp\/v2\/categories?post=5431"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mathscitech.org\/articles\/wp-json\/wp\/v2\/tags?post=5431"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/mathscitech.org\/articles\/wp-json\/wp\/v2\/coauthors?post=5431"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}