What does the term episode mean in meta-learning?

Recall that in meta-learning we have a meta-set, which is a data set of data sets:

$$D_{\text{meta-set}} = \{ D_n \}_{n=1}^{N}$$

where $$D_n$$ is a data set (usually called a task). A task is typically defined by data sampled from a target function for regression, or from N classes for classification. Each data set $$D_n$$ is usually split into a support set (train set) and a query set (test set).

I’ve seen the term episode used in meta-learning, but its meaning has not been clear to me. There are two possible definitions:

1. One episode means sampling a single data set $$D_n$$.
2. One episode means sampling M data sets, i.e. sampling a batch of tasks.

Which one is it?

reference:

Cross Validated Asked on November 12, 2021

In my opinion, the right definition of an episode is a batch of tasks (usually called a meta-batch). For regression, if we have 100 functions $$f_i$$ from some family (e.g. sine functions), then one episode with a meta-batch size of 16 consists of 16 of those functions, each with its own support set and query set.
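To make the regression case concrete, here is a minimal sketch of sampling one episode under this definition. The amplitude, phase, and input ranges, the support/query sizes, and the function names `sample_sine_task` and `sample_episode` are all illustrative assumptions, not something fixed by the definition above.

```python
import math
import random

def sample_sine_task(rng, k_support=10, k_query=15):
    """Sample one task: a sine function plus its support and query sets."""
    # Assumed ranges for the function family; chosen only for illustration.
    amplitude = rng.uniform(0.1, 5.0)
    phase = rng.uniform(0.0, math.pi)
    xs = [rng.uniform(-5.0, 5.0) for _ in range(k_support + k_query)]
    points = [(x, amplitude * math.sin(x + phase)) for x in xs]
    return {"support": points[:k_support], "query": points[k_support:]}

def sample_episode(rng, meta_batch_size=16):
    """One episode = one meta-batch of tasks (the definition argued for here)."""
    return [sample_sine_task(rng) for _ in range(meta_batch_size)]

rng = random.Random(0)
episode = sample_episode(rng)
print(len(episode), len(episode[0]["support"]), len(episode[0]["query"]))
```

With `meta_batch_size=1` the episode contains exactly one task, which is the special case where the two definitions coincide.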

For classification, an episode is still a (meta-)batch of tasks. In this case a task is an N-way, K-shot classification task. For example, a 5-way, 5-shot task has 5 × 5 = 25 examples in the support set, and if K_eval (the number of query examples per class) is 15, it has 5 × 15 = 75 examples in the query set. With a meta-batch size of 16 we then sample 16 tasks, each with 25 + 75 = 100 examples, for a total of 16 × 100 = 1600 examples per meta-batch.
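The counts above can be checked with a short sketch. The toy data set (20 classes with 30 placeholder examples each) and the helper names `sample_task` and `sample_meta_batch` are hypothetical, introduced only for illustration.

```python
import random

def sample_task(data_by_class, n_way=5, k_shot=5, k_eval=15, rng=random):
    """Sample one N-way, K-shot task as (support, query) lists of (example, label)."""
    classes = rng.sample(sorted(data_by_class), n_way)
    support, query = [], []
    for label, cls in enumerate(classes):
        # Draw disjoint support and query examples for this class.
        examples = rng.sample(data_by_class[cls], k_shot + k_eval)
        support += [(x, label) for x in examples[:k_shot]]
        query += [(x, label) for x in examples[k_shot:]]
    return support, query

def sample_meta_batch(data_by_class, meta_batch_size=16, **task_kwargs):
    """One episode under the meta-batch definition: a list of tasks."""
    return [sample_task(data_by_class, **task_kwargs) for _ in range(meta_batch_size)]

# Toy data set: 20 classes, 30 placeholder examples per class (assumed sizes).
rng = random.Random(0)
data_by_class = {c: [f"class{c}_img{i}" for i in range(30)] for c in range(20)}

meta_batch = sample_meta_batch(data_by_class, rng=rng)
support, query = meta_batch[0]
total = sum(len(s) + len(q) for s, q in meta_batch)
print(len(support), len(query), total)  # 25 75 1600
```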

In fact, with this definition one episode is the same as one iteration step. When the meta-batch size is 1, an episode is a single task.

I can't see why we would define an episode as a single task, which is what I thought at some point. In that case we would have two words, task and episode, for the same thing. An episode of learning, by contrast, happens fully within each iteration.

That said, I'd prefer not to use this word at all, since it seems redundant and RL already uses the term, which adds to the confusion in my opinion.

Answered by Charlie Parker on November 12, 2021

Meta-learning conducts a meta-analysis: it looks at multiple analyses (which in turn used different assumptions, datasets, and methods) and tries to explain them with some generalization or perhaps even a meta-model. This general idea has long been used by academics to generalize and learn about a complicated topic. In this setup, an episode would be one of the analyses plus its associated dataset and methods.

Meta-learning in the machine learning community takes many datasets, methods, assumptions, and results and then builds a model to explain all of those results. Early work like Omohundro (1996) looked at episodes as samples drawn from one larger dataset, with each sample modeled. Vilalta and Drissi (2002), in a survey of meta-learning, noted that assumptions (aka "bias") are also part of an analysis. The resulting models were then averaged or combined in some manner, yielding a meta-model. More recent work like Sun et al. (2017) generalizes this by combining results for completely different datasets, assumptions, and models. An excellent recent survey is given by Hospedales, Antoniou, Micaelli, and Storkey (2020).

From these, we can see that an episode is a tuple of (dataset, method(s), assumptions, estimated model/results) which then becomes an observation in the meta-analysis.

Answered by kurtosis on November 12, 2021
