# Near sets

Near sets are disjoint sets that resemble each other. Resemblance between disjoint sets occurs whenever there are observable similarities between the objects in the sets. Similarity is determined by comparing lists of object feature values. Each list of feature values defines an object's description. Comparison of object descriptions provides a basis for determining the extent that disjoint sets resemble each other. Objects that are perceived as similar based on their descriptions are grouped together. These groups of similar objects can provide information and reveal patterns about objects of interest in the disjoint sets. For example, collections of digital images viewed as disjoint sets of points provide a rich hunting ground for near sets.

Near set theory provides methods that can be used to extract resemblance information from objects contained in disjoint sets, i.e., it provides a formal basis for the observation, comparison, and classification of objects. The discovery of near sets begins with choosing the appropriate method to describe observed objects. This is accomplished by the selection of probe functions representing observable object features. A probe function is a mapping from an object to a real number representing a feature value. For example, when comparing fruit such as apples, the redness of an apple (observed object) can be described by a probe function representing colour, and the output of the probe function is a number representing the degree of redness (or whatever colour apple you prefer to eat). Probe functions provide a basis for describing and discerning affinities between objects as well as between groups of similar objects. Objects that have, in some degree, affinities are considered near each other. Similarly, groups of objects (i.e. sets) that have, in some degree, affinities are also considered near each other.

Near sets offer a framework for solving problems based on human perception that arise in areas such as image processing, computer vision as well as engineering and science problems. In near set theory, perception is a combination of the view of perception in psychophysics with a view of perception found in Merleau-Ponty's work. In the context of psychophysics, perception of an object (i.e., in effect, knowledge about an object) depends on signal values gathered by our senses. In this view of perception, our senses are likened to probe functions by considering them as mappings of stimuli to sensations that are a source of values assimilated by the mind. A human sense modelled as a probe measures observable physical characteristics of objects in our environment. The sensed physical characteristics of an object are identified with object features. In Merleau-Ponty's view, an object is perceived to the extent that it can be described. In other words, object description goes hand-in-hand with object perception. It is the mind that identifies relationships between object descriptions to form perceptions of sensed objects. It is also the case that near set theory has been proven to be quite successful in finding solutions to perceptual problems such as measuring image correspondence and segmentation evaluation.

A partition of a set

## History

Example of a rough set

It has been observed that mathematical topics emerge and evolve through interactions among many researchers. This was the case with the discovery of near sets. Work on a perceptual basis for near sets began in 2002, motivated by digital image analysis. It was inspired by a study of the perception of nearness of familiar objects carried out by Z. Pawlak and J.F. Peters.[1] In this context, nearness is interpreted to mean closely corresponding to or resembling an original. This collaboration was important in paving the way toward a description-based approach to exploring the nearness of sets.

Excitement grew after 2002, when it became apparent that it was possible to introduce measures of nearness based on similarities between classes contained in coverings of disjoint sets (e.g., this is possible if we define coverings of sets representing digital images and then look for similarities between the images such as shades of green in one landscape that resemble one or more shades of green in another landscape). In this context the term similarity means resemblance between two or more individual objects or sets of objects and almost equal patterns in compared items. Collaboration between J.F. Peters, A. Skowron, and J. Stepaniuk led to a formal basis for the nearness of objects considered in the context of proximity spaces.[2] Near sets and an approach to defining resemblance between sets was introduced by J.F. Peters in.[3][4]

Example of near sets

Near set theory and its applications grew out of a generalization of the approach to the classification of objects proposed by Z. Pawlak during his work on rough sets in the early 1980s, and E. Orłowska's work on approximation spaces. Briefly, a rough set can be described as follows. Consider a non-empty finite set of objects labelled $O$. The set $O$ can be partitioned into cells (referred to as classes in near set theory) by grouping together objects that have similar descriptions (using one or more probe functions). A set $X\subset O$ is considered rough when it cannot be formed completely by the union of classes from the partition of $O$. The set $X$ is considered rough inasmuch as $X$ cannot be fully described by probe functions selected to describe the individual objects of $O$.

Near sets are considered a generalization of rough sets, since it has been shown that every rough set is a near set but not every near set is a rough set. Near sets grew out of the idea that two or more rough sets can share objects with matching descriptions if they both contain objects belonging to the same class from the partition of $O$. When this occurs, the sets are considered near each other with respect to the classes contained in the partition.

## Definitions

Definition 1: Object

An object is anything that has its origin in the physical world.

An identifying characteristic of an object is that it must have some quantifiable features. The term feature is used in S. Watanabe's sense of the word, i.e., a feature corresponds to an observable property of physical objects. Each feature has a 1-to-many relationship to real-valued functions called probe functions representing the feature. For each feature (such as colour) one or more probe functions can be introduced to represent the feature (such as grayscale, or RGB values). Objects and sets of probe functions form the basis of near set theory and are sometimes referred to as perceptual objects due to the focus on assigning values to perceived object features. A non-empty, finite set of objects is denoted by $O$.

Definition 2: Probe Function

A probe function is a real-valued function, $f:O\to\mathbb R$, representing a feature of an object.

Examples of probe functions are the colour, size, texture, edge-orientation, or weight of an object. Probe functions are used to describe an object to determine the characteristics and perceptual similarity of objects. Perceptual information is always presented with respect to probe functions just as our senses define our perception of the world. For example, our ability to view light in the visible spectrum rather than infra red or microwaves spectra defines our perception of the world just as the selection of probe functions constrains the amount of perceptual information available for feature extraction from a set of objects. The set of all probe functions is denoted by $\mathbb{F}$, and a set of specific probe functions for a given application is denoted by $\mathcal{B}\subseteq\mathbb{F}$

Definition 3: Perceptual System

A perceptual system $\langle O, \mathbb{F} \rangle$ consists of a non-empty set $O$ together with a set $\mathbb{F}$ of real-valued functions.

The notion of a perceptual system admits a wide variety of different interpretations that result from the selection of sample objects contained in a particular sample space $O$. A recent example of a perceptual system is given by D. Hall.[5] Two other examples of perceptual systems are: a set of microscope images together with a set of image processing probe functions, or a set of results from a web query together with some measures (probe functions) indicating, e.g., relevancy of the results.

Definition 4: Object Description

Consider a perceptual system $\langle O, \mathbb{F} \rangle$. The description of an object $x\in O, \phi_i\in \mathcal{B}\subseteq \mathbb{F}$ is given by the vector

$\boldsymbol{\phi}_{\mathcal{B}}(x) = (\phi_1(x),\phi_2(x),\ldots,\phi_i(x),\ldots,\phi_\ell(x)),$

where $l$ is the length of the vector $\boldsymbol{\phi}$, and each $\phi_i$ is a probe function belonging to the set $\mathcal{B}$.

Definition 5: Perceptual Indiscernibility Relation

Let $\langle O, \mathbb{F} \rangle$ be a perceptual system. For every $\mathcal{B}\subseteq\mathbb{F}$ the perceptual indiscernibility relation $\sim_{\mathcal{B}}$ is defined as follows:

$\sim_{\mathcal{B}} = \{(x,y)\in O \times O : \,\parallel \boldsymbol{\phi}_{\mathcal{B}}(x) - \boldsymbol{\phi}_{\mathcal{B}}(y)\parallel_{_2} = 0\},$

where $\parallel\cdot\parallel$ represents the $L^2$ norm. This is a refinement of the original indiscernibility relation given by Pawlak in 1981. Using the perceptual indiscernibility relation, objects with matching descriptions can be grouped to form classes called elementary sets (also called an equivalence class) defined by

$\mathbb{C}_{/\sim_{\mathcal{B}}} = \{o\in O\mid o\sim_{\mathcal{B}}c\,\forall\,c\in\mathbb{C}_{/\sim_{\mathcal{B}}}\}.$

Similarly, a quotient set is the set of all elementary sets defined as

$O_{/\sim_{\mathcal{B}}} = \bigcup\{\mathbb{C}_{/\sim_{\mathcal{B}}}\}.$

Definition 6: Perceptual Tolerance Relation

When dealing with perceptual objects (especially, components in images), it is sometimes necessary to relax the equivalence condition of Defn. 5 to facilitate observation of associations in a perceptual system. This variation is called a perceptual tolerance relation. Let $\langle O, \mathbb{F} \rangle$ be a perceptual system and let $\varepsilon\in\mathbb{R}$. For every$\mathcal{B}\subseteq\mathbb{F}$ the tolerance relation $\cong_{\mathcal{B}}$ is defined as follows:

$\cong_{\mathcal{B},\epsilon} = \{(x,y)\in O \times O : \parallel\boldsymbol{\phi}_{\mathcal{B}}(x) - \boldsymbol{\phi}_{\mathcal{B}}(y)\parallel_{_2} \leq \varepsilon\}.$

For notational convenience, this relation can be written $\cong_{\mathcal{B}}$ instead of$\cong_{\mathcal{B},\varepsilon}$ with the understanding that $\epsilon$ is inherent to the definition of the tolerance.[6]

Tolerance classes are defined in terms of preclasses. Let $A_{\mathcal{B},\varepsilon}$ denote that $A\subset O$ is a perception-based preclass. Given$A_{\mathcal{B},\varepsilon}$, then for all $x,y\in A, x\cong_{\mathcal{B},\epsilon} y$, i.e.,

$A_{\mathcal{B},\varepsilon}\ \mbox{is a preclass} \iff \forall x,y\in A, \parallel \boldsymbol{\phi}_{\mathcal{B}}(x) - \boldsymbol{\phi}_{\mathcal{B}}(y)\parallel_{_2} \leq \varepsilon.$

Let $\mathbb{C}_{\mathcal{B},\varepsilon}$ denote a tolerance class, which, by definition, is a maximal preclass. For $x\in O$, we also use the notation$x_{/_{\cong_{\mathcal{B},\epsilon}}}$ to denote a tolerance class containing $x$. Note, $\cong_{\mathcal{B},\epsilon}$ covers $O$ instead of partitioning $O$ because an object can belong to more than one class. In addition, each pair of objects $x, y$ in $\mathbb{C}_{\mathcal{B},\epsilon}$ must satisfy the condition $\parallel\boldsymbol{\phi}_{\mathcal{B}}(x) -\boldsymbol{\phi}_{\mathcal{B}}(y)\parallel_{_2}\leq\varepsilon$. Next, a covering of $O$ defined by$\cong_{\mathcal{B},\epsilon}$ is the union of all tolerance classes in the covering.

Notice that the tolerance relation $\cong_{\mathcal{B},\epsilon}$ is a generalization of the indiscernibility relation given in Defn. 5 (obtained by setting $\varepsilon = 0$).

Definition 7: Weak Nearness Relation

Let $\langle O, \mathbb{F}\rangle$ be a perceptual system and let $X,Y\subseteq O$. A set $X$ is weakly near to a set $Y$ (denoted $X \underline{\bowtie}_{\mathbb{F}} Y$) within the perceptual system $\langle O, \mathbb{F}\rangle$ iff there are $x \in X$ and $y \in Y$ and there is$\mathcal{B}\subseteq \mathbb{F}$ such that $x \cong_{\mathcal{B}} y$. Notice that the image given in the lead section is actually an example of sets that are weakly near each other (with $\varepsilon = 0$).

Definition 8: Nearness Relation

Let $\langle O, \mathbb{F}\rangle$ be perceptual system and let $X,Y \subseteq O$. A set $X$ is near to a set $Y$ (denoted $X\ \bowtie_{\mathbb{F}}\ Y$)within the perceptual system $\langle O, \mathbb{F}\rangle$ iff there are $\mathbb{F}_1, \mathbb{F}_2 \subseteq \mathbb{F}$ and $f\in \mathbb{F}$ and there are $A \in O_{/\sim_{\mathbb{F}_1}}, B\in O_{/\sim_{\mathbb{F}_2}}, C\in O_{/\sim_{f}}$ such that $A \subseteq X$, $B \subseteq Y$ and $A,B \subseteq C$.

Examples of Defn.'s 7 & 8: (a) Example of Defn. 7, (b) example of $O_{/\sim_{\mathbb{F}_1}}$, (c) example of $O_{/\sim_{\mathbb{F}_2}}$, and (d) example of$O_{/\sim_f}$ showing (together with (b) and (c)) that sets $X$ and $Y$ are near to each other according to Defn. 8.

## Examples

Simple Example

The following simple example highlights the need for a tolerance relation as well as demonstrates the construction of tolerance classes from real data. Consider the 20 objects in the table below with$|\phi(x_i)| = 1$.

Sample Perceptual System
$x_i$ $\phi(x)$ $x_i$ $\phi(x)$ $x_i$ $\phi(x)$ $x_i$ $\phi(x)$
$x_1$ .4518 $x_6$ .6943 $x_{11}$ .4002 $x_{16}$ .6079
$x_2$ .9166 $x_7$ .9246 $x_{12}$ .1910 $x_{17}$ .1869
$x_3$ .1398 $x_8$ .3537 $x_{13}$ .7476 $x_{18}$ .8489
$x_4$ .7972 $x_9$ .4722 $x_{14}$ .4990 $x_{19}$ .9170
$x_5$ .6281 $x_{10}$ .4523 $x_{15}$ .6289 $x_{20}$ .7143

Letting $\varepsilon = 0.1$ gives the following tolerance classes:

\begin{align} O = & \{ \{x_1, x_8, x_{10}, x_{11}\},\{x_1, x_9, x_{10}, x_{11}, x_{14}\},\\ & \{x_2, x_7, x_{18}, x_{19}\},\\ & \{x_3, x_{12}, x_{17}\},\\ & \{x_4, x_{13}, x_{20}\},\{x_4, x_{18}\},\\ & \{x_5, x_6, x_{15}, x_{16}\},\{x_5, x_6, x_{15}, x_{20}\},\\ & \{x_6, x_{13}, x_{20}\}\}. \end{align}

Observe that each object in a tolerance class satisfies the condition $\parallel\boldsymbol{\phi}_{\mathcal{B}}(x) -\boldsymbol{\phi}_{\mathcal{B}}(y)\parallel_2\leq\varepsilon$, and that almost all of the objects appear in more than one class. Moreover, there would be twenty classes if the indiscernibility relation was used since there are no two objects with matching descriptions. Finally, using these objects, the sets

$X = \{x_1, x_9\}$ and $Y = \{x_{11}, x_{14}\},$

are weakly near each other.

Image Processing Example

Example of images that are near each other. (a) and (b) Images from the freely available LeavesDataset (see, e.g., www.vision.caltech.edu/archive.html).

The following example provides a more useful application of near set theory. Let a subimage be defined as a small subset of pixels belonging to a digital image such that the pixels contained in the subimage form a square. Then, let the sets $X$ and $Y$ respectively represent the subimages obtained from two different images, and let $O = \{X \cup Y\}$. Finally, let the description of an object be given by the Green component in the RGB color model. The next step is to find all the tolerance classes using the tolerance relation. Using this information, tolerance classes can be formed containing objects that have similar (within some small $\varepsilon$) values for the Green component in the RGB colour model. Furthermore, images that are near (similar) to each other should have tolerance classes divided among both images (instead of a tolerance classes contained solely in one of the images). For example, the figure accompanying this example shows a subset of the tolerance classes obtained from two leaf images. In this figure, each tolerance class is assigned a separate colour. As can be seen, the two leaves share similar tolerance classes. This example is a first step toward the application of near sets to the image correspondence problem. However, it also highlights a need to measure the degree of nearness of two sets.

## Nearness measure

For some applications it is not sufficient to simply state that two sets are near each other. The practical application of near set theory sometimes requires a method for quantifying the nearness of sets. As a result, a $L_2$ norm-based nearness measure is was developed. Specifically, it was based on the idea that sets can be considered near each other when they have "things" in common. In the context of near sets, the "things" can be quantified by granules of a perceptual system, i.e., the tolerance classes. The simplest example of nearness between sets sharing "things" in common is the case when two sets have similar elements. Defn. 7 can be used to define a Nearness Measure (NM) between two sets $X$ and $Y$. Let $Z = X\cup Y$ and let the notation

$[z_{/\cong_{\mathcal{B}}}]_X = \{z\in z_{/\cong_{\mathcal{B}}}\mid z\in X\},$

denote the portion of the tolerance class $z_{/\cong_{\mathcal{B}}}$ that belongs to $X$, and similarly, use the notation

$[z_{/\cong_{\mathcal{B}}}]_Y = \{z\in z_{/\cong_{\mathcal{B}}}\mid z\in Y\},$

to denote the portion that belongs to $Y$. Further, let the sets $X$ and $Y$ be weakly near each other using Defn. 6. Also, let $Z_{/\cong_{\mathcal{B}}}$ denote a covering of $Z$ defined by $\cong_{\mathcal{B}}$. Then, a $NM_{\cong_{\mathcal{B}}}(X,Y)$ between $X$ and $Y$ is given by

$NM_{\cong_{\mathcal{B}}}(X,Y) = \Biggl ( \sum_{z_{/\cong_{\mathcal{B}}}\in Z_{/\cong_{\mathcal{B}}}} |z_{/\cong_{\mathcal{B}}}| \Biggr)^{-1} \sum_{z_{/\cong_{\mathcal{B}}}\in Z_{/\cong_{\mathcal{B}}}}|z_{/\cong_{\mathcal{B}}}| \frac{ \min (|[z_{/\cong_{\mathcal{B}}}]_X|,|[z_{/\cong_{\mathcal{B}}}]_Y|)}{\max (|[z_{/\cong_{\mathcal{B}}}]_X|,|[z_{/\cong_{\mathcal{B}}}]_Y|)}.$

The idea behind the NM is that sets that are similar should have a similar number of objects in each tolerance class. Thus, for each tolerance class obtained from the covering of $Z=X\cup Y$, the NM counts the number of objects that belong to $X$ and $Y$ and takes the ratio (as a proper fraction) of their cardinalities. Furthermore, each ratio is weighted by the total size of the tolerance class (thus giving importance to the larger classes) and the final result is normalized by dividing by the sum of all the cardinalities. The range of the NM is in the interval [0,1], where a value of 1 is obtained if the sets are equivalent and a value of 0 is obtained if they have no elements in common.

As an example of the degree of nearness between two sets, consider figure below in which each image consists of two sets of objects, $X$ and $Y$. Each colour in the figures corresponds to an elementary set where all the objects in the class share the same description. The idea behind the NM is that the nearness of sets in a perceptual system is based on the cardinality of tolerance classes that they share. Thus, the sets in left side of the figure are closer (more near) to each other in terms of their descriptions than the sets in right side of the figure.

Examples of degree of nearness between two sets: (a) High degree of nearness, and (b) Low degree of nearness.

## Near set evaluation and recognition (NEAR) system

The Near set Evaluation and Recognition (NEAR) system, is a system developed to demonstrate practical applications of near set theory to the problems of image segmentation evaluation and image correspondence. It was motivated by a need for a freely available software tool that can provide results for research and to generate interest in near set theory. The system implements a Multiple Document Interface (MDI) where each separate processing task is performed in its own child frame. The objects (in the near set sense) in this system are subimages of the images being processed and the probe functions (features) are image processing functions defined on the subimages. The system was written in C++ and was designed to facilitate the addition of new processing tasks and probe functions. Currently, the system performs five major tasks, namely, displaying equivalence and tolerance classes for an image, performing segmentation evaluation, measuring the nearness of two images, and displaying the output of processing an image using an individual probe functions.

NEAR system GUI.