STK++ 0.9.13
This tutorial will show how to add a mixture model to STK++.
The design of the Clustering project is based on the idea of plugins, so that you can add your own mixture model to the architecture without much effort. If you want a tighter integration of your mixture model with the library, the Clustering project provides a set of interface classes with predefined behavior that you can use at your convenience.
In short, the main interface to derive from is the IMixture class. This class is well documented, and the pure virtual methods to implement are self-explanatory. The class MixtureComposer estimates the mixture model using only pointers to this interface.
It is thus possible to implement and integrate mixture models into the project with a great deal of freedom. However, if you want a closer integration of the mixture model with the Clustering project, it can be convenient to follow the general design pattern adopted by the project and to implement the interface classes provided.
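The plugin idea described above can be sketched as follows. This is only an illustration of the pattern, not the STK++ API: the names `IMixtureSketch`, `ComposerSketch`, `ConstantMixture`, `logDensity` and `registerMixture` are hypothetical, and the actual pure virtual methods to implement are those documented in the IMixture class.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the role played by the IMixture interface.
// The method names are illustrative and are NOT the STK++ signatures.
class IMixtureSketch
{
  public:
    virtual ~IMixtureSketch() = default;
    // log-density of sample i under component k
    virtual double logDensity(std::size_t i, std::size_t k) const = 0;
    // identifier of the plugged-in model
    virtual std::string name() const = 0;
};

// Hypothetical composer: it only manipulates pointers to the interface
// and never needs to know the concrete type of a registered model.
class ComposerSketch
{
  public:
    void registerMixture(IMixtureSketch* p_mixture)
    {
      assert(p_mixture != nullptr);
      mixtures_.push_back(p_mixture); // non-owning pointer
    }
    // sum the log-densities returned by all registered models
    double logDensity(std::size_t i, std::size_t k) const
    {
      double sum = 0.;
      for (IMixtureSketch const* p : mixtures_) { sum += p->logDensity(i, k); }
      return sum;
    }
    std::size_t nbMixture() const { return mixtures_.size(); }

  private:
    std::vector<IMixtureSketch*> mixtures_;
};

// Toy plugin returning a constant log-density, used only to exercise the sketch.
class ConstantMixture : public IMixtureSketch
{
  public:
    explicit ConstantMixture(double value) : value_(value) {}
    double logDensity(std::size_t, std::size_t) const override { return value_; }
    std::string name() const override { return "constant"; }

  private:
    double value_;
};
```

A new model is added by deriving one more class from the interface and registering a pointer to it; the composer code does not change.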
This tutorial describes, step by step, how to integrate new models into the Clustering project.
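A mixture model on some subset $J \subset \mathbb{R}^d$ is a density of the form
$$ f(\mathbf{x}|\boldsymbol{\theta}) = \sum_{k=1}^K p_k \, h(\mathbf{x}; \boldsymbol{\lambda}_k, \boldsymbol{\alpha}) $$
with $p_k > 0$ for $k=1,\ldots,K$ and $\sum_{k=1}^K p_k = 1$.
The density $h$ can be any density on $J$ depending on the class-specific parameters $\boldsymbol{\lambda}_k$ and on the shared parameters $\boldsymbol{\alpha}$.
The whole set of parameters is thus
$$ \boldsymbol{\theta} = (p_1, \ldots, p_K, \boldsymbol{\lambda}_1, \ldots, \boldsymbol{\lambda}_K, \boldsymbol{\alpha}). $$
For example, the diagonal Gaussian mixture model STK::DiagGaussian_sjk is the most general diagonal Gaussian model and has a density function of the form
$$ f(\mathbf{x}|\boldsymbol{\theta}) = \sum_{k=1}^K p_k \prod_{j=1}^d \frac{1}{\sqrt{2\pi}\,\sigma_{jk}} \exp\left\{ -\frac{(x_j - \mu_{jk})^2}{2\sigma_{jk}^2} \right\}. $$
All the parameters are cluster specific. There are no shared parameters and $\boldsymbol{\alpha}$ is empty.
At the other extreme, the diagonal Gaussian mixture model STK::DiagGaussian_s is the most parsimonious diagonal Gaussian model and has density function
$$ f(\mathbf{x}|\boldsymbol{\theta}) = \sum_{k=1}^K p_k \prod_{j=1}^d \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{(x_j - \mu_{jk})^2}{2\sigma^2} \right\}. $$
In both cases the means are class-specific parameters, i.e. $\boldsymbol{\mu}_k$ belongs to $\boldsymbol{\lambda}_k$.
We will illustrate this tutorial with the intermediate mixture model STK::DiagGaussian_sjsk with density function
$$ f(\mathbf{x}|\boldsymbol{\theta}) = \sum_{k=1}^K p_k \prod_{j=1}^d \frac{1}{\sqrt{2\pi}\,\sigma_j \sigma_k} \exp\left\{ -\frac{(x_j - \mu_{jk})^2}{2\sigma_j^2 \sigma_k^2} \right\}, $$
where the standard deviation of variable $j$ in class $k$ is the product of a variable-specific term $\sigma_j$, shared by all classes, and a class-specific term $\sigma_k$.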
This mixture model was not implemented in versions of the Clustering project prior to 2017/09/05.
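To make the intermediate model concrete, here is a minimal sketch of how its density could be evaluated. It assumes, as the name sjsk suggests, that the standard deviation of variable j in class k is the product of a variable-specific term and a class-specific term. The function names and the use of `std::vector` are assumptions made for this illustration; they are not the STK++ containers or API.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Log-density of one class of a diagonal Gaussian whose standard deviation
// for variable j is sigma_j[j] * sigma_k (ASSUMED sjsk parametrization).
double lnComponentDensitySketch( std::vector<double> const& x
                               , std::vector<double> const& mean_k
                               , std::vector<double> const& sigma_j
                               , double sigma_k)
{
  assert(x.size() == mean_k.size() && x.size() == sigma_j.size());
  const double ln_sqrt_2pi = 0.5 * std::log(2. * std::acos(-1.));
  double sum = 0.;
  for (std::size_t j = 0; j < x.size(); ++j)
  {
    const double sd = sigma_j[j] * sigma_k;       // sd of variable j in this class
    const double z  = (x[j] - mean_k[j]) / sd;    // standardized value
    sum += -ln_sqrt_2pi - std::log(sd) - 0.5 * z * z;
  }
  return sum;
}

// Mixture density: sum over the classes of p_k times the class density.
// means[k] holds the mean vector of class k.
double mixtureDensitySketch( std::vector<double> const& x
                           , std::vector<double> const& pk
                           , std::vector< std::vector<double> > const& means
                           , std::vector<double> const& sigma_j
                           , std::vector<double> const& sigma_k)
{
  assert(pk.size() == means.size() && pk.size() == sigma_k.size());
  double f = 0.;
  for (std::size_t k = 0; k < pk.size(); ++k)
  { f += pk[k] * std::exp(lnComponentDensitySketch(x, means[k], sigma_j, sigma_k[k]));}
  return f;
}
```

Working with the logarithm of each class density and exponentiating only at the end keeps the product over the variables numerically stable when the number of variables is large.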
The first step is to be able to identify the new model as a model recognized by the STK interfaces. This is achieved by updating the files STK_Clust_Util.h and STK_Clust_Util.cpp.
In the first file, we just add a new line in the Mixture enumeration
Before:

    enum Mixture
    {
      Gamma_ajk_bjk_ =0,
      //...
      Gamma_a_bk_, // = 11
      Gaussian_sjk_ =20,
      Gaussian_sk_,
      Gaussian_sj_,
      Gaussian_s_, // = 23
      //...
    }

After:

    enum Mixture
    {
      Gamma_ajk_bjk_ =0,
      //...
      Gamma_a_bk_, // = 11
      Gaussian_sjk_ =20,
      Gaussian_sk_,
      Gaussian_sj_,
      Gaussian_s_,
      Gaussian_sjsk_, // = 24
      //...
    }

In the second file, we update the input/output utility functions by adding the code needed to handle the new enumerator.
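The kind of change involved can be sketched with a pair of conversion functions between model names and enumerators. This is a self-contained illustration under stated assumptions: the function names `stringToMixtureSketch` and `mixtureToStringSketch`, the `unknown_mixture_` value and the string spellings are hypothetical here, and the actual functions to update are those found in STK_Clust_Util.cpp.

```cpp
#include <cassert>
#include <string>

// Reduced copy of the enumeration, limited to the diagonal Gaussian models.
// unknown_mixture_ is an assumption of this sketch.
enum Mixture
{
  Gaussian_sjk_ = 20,
  Gaussian_sk_,
  Gaussian_sj_,
  Gaussian_s_,
  Gaussian_sjsk_,        // = 24, the new model
  unknown_mixture_ = -1
};

// Convert a model name to its enumerator (hypothetical helper).
Mixture stringToMixtureSketch(std::string const& name)
{
  if (name == "Gaussian_sjk")  return Gaussian_sjk_;
  if (name == "Gaussian_sk")   return Gaussian_sk_;
  if (name == "Gaussian_sj")   return Gaussian_sj_;
  if (name == "Gaussian_s")    return Gaussian_s_;
  if (name == "Gaussian_sjsk") return Gaussian_sjsk_;   // line added for the new model
  return unknown_mixture_;
}

// Convert an enumerator to its model name (hypothetical helper).
std::string mixtureToStringSketch(Mixture type)
{
  switch (type)
  {
    case Gaussian_sjk_:  return "Gaussian_sjk";
    case Gaussian_sk_:   return "Gaussian_sk";
    case Gaussian_sj_:   return "Gaussian_sj";
    case Gaussian_s_:    return "Gaussian_s";
    case Gaussian_sjsk_: return "Gaussian_sjsk";        // line added for the new model
    default:             return "unknown";
  }
}
```

Each conversion needs exactly one extra branch per new model, which is why forgetting one of them typically shows up as the new model being reported as unknown.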
The struct STK::ModelParameters encapsulates the parameters of the mixture model. It is a template struct that must be fully specialized for each model and that stores the parameters of the mixture model.
This class structure must also implement:
This structure is defined in file STK_DiagGaussianParameters.h. We don't detail the implementation which can be found in file STK_DiagGaussianParameters.cpp.
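As an illustration of what a full specialization can look like, here is a minimal sketch. It is not the content of STK_DiagGaussianParameters.h: the struct name `ModelParametersSketch`, the member names, the accessors and the use of `std::vector` are assumptions of this example, and the storage layout (one standard deviation per variable, one per class) follows the assumed sjsk parametrization.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reduced enumeration, only what the sketch needs.
enum Mixture { Gaussian_s_ = 23, Gaussian_sjsk_ = 24 };

// Primary template left undefined: every model must provide
// its own full specialization.
template<int Id> struct ModelParametersSketch;

// Full specialization for the sjsk model (hypothetical layout).
template<>
struct ModelParametersSketch<Gaussian_sjsk_>
{
  std::vector< std::vector<double> > mean_; // mean_[k][j], class-specific means
  std::vector<double> sigmaj_;              // variable-specific standard deviations
  std::vector<double> sigmak_;              // class-specific standard deviations

  ModelParametersSketch(std::size_t nbCluster, std::size_t nbVariable)
  : mean_(nbCluster, std::vector<double>(nbVariable, 0.))
  , sigmaj_(nbVariable, 1.)
  , sigmak_(nbCluster, 1.)
  {}

  // mean of variable j in class k
  double mean(std::size_t k, std::size_t j) const { return mean_[k][j]; }
  // standard deviation of variable j in class k
  double sigma(std::size_t k, std::size_t j) const { return sigmaj_[j] * sigmak_[k]; }
};
```

Exposing `mean(k, j)` and `sigma(k, j)` accessors lets generic code read the parameters in the same way for every model, whatever is actually stored.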
The class DiagGaussian_sjsk is a template terminal class derived (recursively) from the class DiagGaussianBase. The template parameter is the type of the array storing the data. This class must implement a run method that updates the values of the parameters.
Moreover, the traits class MixtureTraits must be specialized (it is used by the base classes, which do not know the parameter type).
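The interplay between the recursive derivation and the traits class can be sketched as follows. All names ending in `Sketch` and the `update` method are hypothetical, and the body of `run` is a placeholder (it assigns the pooled standard deviation of the data to every class) rather than the actual parameter update of the model.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Traits class: specialized for each terminal class so that the base
// class can find the parameter type of a model it does not know.
template<class Derived> struct MixtureTraitsSketch;

// Base class using the recursive (CRTP) pattern.
template<class Derived>
class DiagGaussianBaseSketch
{
  public:
    typedef typename MixtureTraitsSketch<Derived>::Parameters Parameters;
    explicit DiagGaussianBaseSketch(std::size_t nbCluster) : param_(nbCluster) {}
    // forward to the run method of the terminal class
    bool update() { return static_cast<Derived&>(*this).run(); }
    Parameters const& param() const { return param_; }

  protected:
    Parameters param_;
};

// Hypothetical, reduced parameter structure.
struct SjskParametersSketch
{
  explicit SjskParametersSketch(std::size_t nbCluster) : sigmak_(nbCluster, 1.) {}
  std::vector<double> sigmak_;
};

template<class Array> class DiagGaussianSjskSketch; // forward declaration

// The specialization must be visible before the terminal class is defined.
template<class Array_>
struct MixtureTraitsSketch< DiagGaussianSjskSketch<Array_> >
{
  typedef Array_ Array;
  typedef SjskParametersSketch Parameters;
};

// Terminal class: the template parameter is the type of the data array.
template<class Array>
class DiagGaussianSjskSketch : public DiagGaussianBaseSketch< DiagGaussianSjskSketch<Array> >
{
  public:
    typedef DiagGaussianBaseSketch< DiagGaussianSjskSketch<Array> > Base;
    DiagGaussianSjskSketch(std::size_t nbCluster, Array const& data)
    : Base(nbCluster), p_data_(&data) {}
    // PLACEHOLDER update: pooled standard deviation for every class
    bool run()
    {
      if (p_data_->empty()) return false;
      const double n = static_cast<double>(p_data_->size());
      double mean = 0., var = 0.;
      for (double v : *p_data_) { mean += v; }
      mean /= n;
      for (double v : *p_data_) { var += (v - mean) * (v - mean); }
      var /= n;
      for (double& s : this->param_.sigmak_) { s = std::sqrt(var); }
      return true;
    }

  private:
    Array const* p_data_;
};
```

The traits specialization is needed because the terminal class is still incomplete when the base class is instantiated, so the base cannot ask the terminal class itself for its parameter type.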
The concrete implementation can be found in the file STK_DiagGaussian_sjsk.h.