Have run() return a struct
@Joseph has added a struct to machinelearning.h that could hold many types of possible algorithm outputs. How far should we integrate this into the whole API?
#include <string>
#include <vector>

/** @brief A generic output struct to fit all kinds of models */
typedef struct runResults_t {
    std::vector<double> likelihoods;  // per-class likelihoods (classifiers)
    std::vector<double> regression;   // regression output parameters
    std::vector<double> progressions; // temporal progress estimates (e.g. hmm)
    std::string likeliest;            // label of the most likely class
} runResults;
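For context, a hypothetical call site if run() filled this struct; `model` and `input` are placeholders, not part of the current API:

#include <iostream>

runResults out = model.run(input);
std::cout << "likeliest: " << out.likeliest << "\n";
for (size_t i = 0; i < out.likelihoods.size(); ++i) {
    std::cout << "class " << i << " likelihood: " << out.likelihoods[i] << "\n";
}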
Activity
- Author Owner
I think GVF could have more types of outputs. @francisco?
- Author Owner
DTW calculates costs, not likelihoods. They have similar meanings, so I wonder what the best term here is.
It is possible that some of these potential outputs could incur additional calculation that would be inefficient if the user doesn't need that output. For example, to get likelihoods from DTW the model would need to keep track of the most likely match for each label. It doesn't do that by default; it is slightly more efficient to just find the lowest cost overall and then look up the label associated with that series.
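One way to keep that bookkeeping opt-in, as a sketch (the config struct and flag name here are hypothetical):

// Hypothetical flag: per-label cost tracking stays off by default,
// so the common path is still a single lowest-cost lookup.
struct dtwConfig {
    bool trackCostsPerLabel = false;
};
// inside run(): only record the best cost per label when the flag is set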
- Michael Zbyszyński mentioned in issue #14 (closed)
- Owner
Yes, GVF has additional parameters on the outcomes structure.
int likeliestGesture;
vector<float> likelihoods;
vector<float> alignments;
vector<vector<float> > dynamics;
vector<vector<float> > scalings;
vector<vector<float> > rotations;
There is an obvious common set of fields too.
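Sketched with inheritance, purely for illustration: a trimmed base holding the shared fields, plus a GVF extension (the struct names are made up here):

#include <vector>

// Shared fields every classifier could fill
struct classifierResults {
    int likeliestGesture;
    std::vector<float> likelihoods;
};

// GVF-specific additions layered on top
struct gvfResults : classifierResults {
    std::vector<float> alignments;
    std::vector<std::vector<float> > dynamics;
    std::vector<std::vector<float> > scalings;
    std::vector<std::vector<float> > rotations;
};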
Edited by Francisco Bernardo
- Owner
I think we can pretty much drop everything we need into this struct while preserving backwards compatibility.
The solution I would choose to avoid undesired additional calculation costs would be to pass a config struct (or class) to the machineLearning instance. The defaults could be the least complex set of parameter values, and could be easily overridden.
- Owner
Developing the idea of passing a config class instance: for the moment I use a simple struct for the XMM wrapper, but we might want to be able to use polymorphism (static or dynamic) and have a generic template or base class for the configuration data, for example:
template <class C>
class machineLearningConfig {
    // do stuff
};

template <class C>
class machineLearning {
private:
    machineLearningConfig<C> config;
public:
    machineLearning() {}
    machineLearning(machineLearningConfig<C>) {
        // do stuff
    }
    // ...
    void updateConfig(machineLearningConfig<C>);
};
Then write some specialized templates and typedef them (I didn't test this but you get the idea). Or, we could use dynamic polymorphism to achieve a similar thing.
There might be some drawbacks which I didn't think of when throwing this idea out, though...
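To make the specialization idea above concrete, a sketch under the same untested assumptions (xmmParams and the `data` member are made up here):

// Hypothetical payload carrying XMM-specific settings
struct xmmParams {
    int numGaussians = 1;
};

// Assuming machineLearningConfig<C> exposes its payload as `data`,
// specialize the generic types and give them friendly names:
typedef machineLearningConfig<xmmParams> xmmConfig;
typedef machineLearning<xmmParams> xmmModel;

// Usage: take the defaults, override only what you need
// xmmConfig cfg;
// cfg.data.numGaussians = 3;
// xmmModel model(cfg);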
- Michael Zbyszyński added In progress label
- Author Owner
@JFrink @jfrin001 Just brought this up again.
- Michael Zbyszyński added Ready and removed In progress labels
- Developer
Or, we could use dynamic polymorphism to achieve a similar thing.
This is what I was thinking of as well. Inheriting from a base class and adding to / overriding it could be a way, or as I stated in this thread, this method could also work.
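For comparison, the dynamic-polymorphism flavour might look roughly like this (all names here are hypothetical, not the current API):

// Hypothetical base: wrappers accept a config through a common interface
class baseConfig {
public:
    virtual ~baseConfig() = default;
};

class xmmConfig : public baseConfig {
public:
    int numGaussians = 1;
};

class baseModel {
public:
    virtual ~baseModel() = default;
    // Each wrapper downcasts to the config type it expects
    virtual void updateConfig(const baseConfig &config) = 0;
};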
Edited by James Frink
- Michael Zbyszyński mentioned in issue #49 (closed)
- Author Owner
Looking at the actual implementations again, I realize that this is something that does need some attention. The current API is a bit confusing.
I think all the regression types seem fine: they return a vector of parameters. But we can do better with the classifiers.
- knn is pretty simple: it just returns a class. That can be a string or a numeric label.
- dtw is potentially more complicated; it has:
std::string label;    // label of best match
std::vector<T> costs; // The costs to match to each example
label could be the index of the best matching series. Users might also want the lowest cost per label (e.g. closest circle, closest triangle, etc.). There are some other things, like distances between labels and length statistics for examples. I think those should definitely be handled by some algorithm-specific function other than run() (see the sketch after this list).
- gmm returns a vector of likelihoods
- hmm returns a vector with likelihoods and something called progressions?
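To sketch what those algorithm-specific getters could look like for dtw (hypothetical names, not the current API):

#include <string>
#include <vector>

class dtw {
public:
    // Default path: run() only returns the best-matching label
    std::string run(const std::vector<double> &input);
    // Opt-in extras, kept out of the default path
    std::vector<double> getCosts() const;              // cost to match each example
    double getLowestCost(const std::string &label) const; // e.g. closest circle
};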
- Owner
Just commented on this in #49 (closed).
Indeed, the result of hmm classification is an interleaved vector of likelihoods / time progressions (normalized estimated position of the "cursor" for each gesture).
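For illustration, de-interleaving that output (assuming likelihood-then-progression order per gesture):

// out = { like0, prog0, like1, prog1, ... } per the description above
std::vector<double> likelihoods, progressions;
for (size_t i = 0; i + 1 < out.size(); i += 2) {
    likelihoods.push_back(out[i]);
    progressions.push_back(out[i + 1]);
}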
I like the idea of having specialized getters for complex data, as mentioned in #49 (closed). We just need to take care of what the default returned data is on each call to run(). For example, a vector of likelihoods is good but not sufficient in some cases: if I train the model with a new training set containing a new label, xmm will output the likelihoods based on the alphabetical order of the labels. That requires the user to get the vector of labels in order to know which likelihood corresponds to which label, each time the labels change in the training set.
- Author Owner
We could have run() return true if it worked, and false otherwise. Then everything else is getX(). That sounds crazy, but that's how GRT works. Here's that GMM in action:
bool predictSuccess = gmm.predict( inputVector );

if( !predictSuccess ){
    cout << "Failed to perform prediction for test sample: " << i << "\n";
    return EXIT_FAILURE;
}

// Get the predicted class label
UINT predictedClassLabel = gmm.getPredictedClassLabel();
VectorFloat classLikelihoods = gmm.getClassLikelihoods();
VectorFloat classDistances = gmm.getClassDistances();
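Translated to our API, and folding in the label-ordering issue above, it might look like this (all getter names hypothetical):

if (!model.run(inputVector)) {
    // handle the failed run
}
std::string likeliest = model.getLikeliest();
std::vector<std::string> labels = model.getLabels();       // e.g. alphabetical order
std::vector<double> likelihoods = model.getLikelihoods();  // same order as labels
// labels[i] pairs with likelihoods[i], so a changed training set stays unambiguous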
- Owner
Sounds good to me. No ambiguity, solves our problems ...
What are the drawbacks, in your opinion?
- Developer
Here's that GMM in action
This looks good to me. However, not all ML models have the same functions, which makes specialization even more undefined if the machineLearning class just appended these getters to any ML model. One way to solve this might be to make these functions throw an error until they are overridden, stating that the "modelName" does not have X getter?
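A minimal sketch of that idea (the base class and error wording are placeholders):

#include <stdexcept>
#include <string>
#include <vector>

class baseModel {
public:
    virtual ~baseModel() = default;
    virtual std::string modelName() const = 0;
    // Getters throw until a model that supports them overrides them
    virtual std::vector<double> getLikelihoods() const {
        throw std::logic_error(modelName() + " does not have a likelihoods getter");
    }
};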
- Michael Zbyszyński mentioned in issue #44