Meta-learning is one of the core components of Modzy’s retraining solution. A well-performing machine learning model often requires training on a very large data set of labeled data points. Humans, in contrast, learn new concepts faster and more efficiently from only a few observations. Children need to see cats and dogs only a few times before they can quickly learn the differences between them.  Is it possible for machine learning models to exhibit the same properties and learn new concepts with only a few training examples? Can they learn about new objects that they have not seen before, using only what they have learned about the objects in the data set during training? The field of meta-learning deals with the design and development of machine learning models that exhibit these properties during inference.

What you need to know

Meta-learning, or representation learning, is an emerging artificial intelligence (AI) research area. In brief, meta-learning deals with learning ‘how to learn’ rather than just training a model to perform one specific task. This field focuses on understanding and adaptation of learning itself, instead of just completing a specific learning task such as detection or classification. A meta learner needs to be able to evaluate its own learning methods and adjust its own learning method according to a set of different tasks ([1]). At Modzy, we are actively working on developing meta-learning models that can learn from a few data points and do not require large training data sets to perform well.

The field of meta-learning has two main focuses: few-shot learning and zero-shot learning. Machine learning models trained for few-shot meta-learning “learn to learn” about the task and objects from minimal datasets with a few sample data points. This is very similar to how kids learn to identify new objects by only seeing a few pictures. Deep learning models trained for zero-shot learning learn how to complete a task or detect, classify, identify, etc. an object without receiving any training examples for that task or object. Supervised learning is a good technical example for understanding meta-learning. In the context of deep neural networks, supervised training naturally leads to generalizable representations at the top hidden layers by taking on properties that make the classification task easier. The last Softmax layer can be replaced by another kind of model, such as a nearest neighbor classifier or clustering utilized for meta-learning.

Modzy Approach to Meta-Learning

Shared representations are useful for handling multiple domains or transferring learned knowledge from one task to another task for which few or no examples are given. At Modzy, we are working on new training methodologies to develop models that learn representative features that are generalizable to different tasks. Further, given that many tasks have little to no labeled training data set associated with them, we are designing few-shot learning and zero-shot learning approaches for training models that are capable of learning or finding patterns during inference based on few or no training examples.

What this means for you

Many of the leading approaches in machine learning are data-hungry. For a deep learning model to reach parity with human-level performance for a specific task, it will require a great amount of computational power and must be trained on a large data set of labeled training examples. However, in many real-world applications, such data sets or computational resources are not. To achieve a human-like learning level, we must leverage techniques such as meta-learning to be able to teach and train our machine learning models to learn new concepts during inference or from one or a few examples during training. At Modzy, we are working on ways to design AI learning models that learn new, abstract, rich and flexible representations that can generalize to new tasks that help the model learn new concepts during inference. Additionally, we are exploring how to train machine learning models that learn and extract such representative features from sparse data sets. The future of AI depends on developing solutions that can work not only on very large data sets but also on sparse data sets with few or no labels available.