Model-Free Episodic Control

MFEC stands for Model-Free Episodic Control. It is a non-parametric technique for approximating Q-values: it stores the states the agent has visited together with the returns obtained from them, and then uses the k-Nearest Neighbors (k-NN) algorithm to estimate values for states it has not seen.

Non-parametric value estimation

MFEC is characterized by its use of non-parametric methods to approximate Q-values. A Q-value measures the expected future reward of taking a given action in a given state. A non-parametric method does not assume a fixed functional form with a fixed number of parameters. Instead, the estimate is built directly from stored data, so its flexibility grows with the amount of data available. As a result, non-parametric methods can be more robust when the form of the underlying function is not known.

In the case of MFEC, the non-parametric method used is k-Nearest Neighbors. k-Nearest Neighbors is a supervised machine learning algorithm used for both classification and regression problems. Given a query point, it finds the k closest points in the training set under a chosen distance metric, such as Euclidean distance. For regression, the output value for the query point is estimated by averaging the values of those k nearest points.
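The regression case described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function name `knn_regress` and the toy data are assumptions made for the example, not part of MFEC itself.

```python
import math

def knn_regress(train_points, train_values, query, k=3):
    """Average the values of the k training points closest to `query`."""
    if not train_points:
        raise ValueError("training set is empty")
    # Euclidean distance from the query to every stored point.
    distances = [math.dist(p, query) for p in train_points]
    # Indices of the k smallest distances.
    nearest = sorted(range(len(train_points)), key=distances.__getitem__)[:k]
    return sum(train_values[i] for i in nearest) / len(nearest)

# Toy data: three points near the origin and one far-away outlier.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
values = [1.0, 2.0, 3.0, 100.0]
estimate = knn_regress(points, values, query=(0.1, 0.1), k=3)
```

With k=3 the distant point is excluded, so the estimate is the mean of the three nearby values. A brute-force scan like this is linear in the number of stored points; larger memories typically need a spatial index or approximate search.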

Q-value approximation

The Q-value is the expected future reward that an agent will receive for taking a certain action in a certain state and acting well afterwards. The Q-value function is used in many reinforcement learning algorithms to determine the optimal policy for the agent. A policy is a mapping from states to actions, and the optimal policy is the one that maximizes the expected cumulative reward the agent receives.
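A Q-value is an expectation, but a single episode yields one sample of the cumulative reward that follows each step. The sketch below computes those samples from a list of rewards. The discount factor `gamma` and the function name are assumptions introduced for illustration; the text above does not specify how future rewards are weighted.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return the discounted sum of future rewards from each time step."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Work backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

# One short episode with a single reward at the end.
episode_rewards = [0.0, 0.0, 1.0]
returns = discounted_returns(episode_rewards, gamma=0.5)
```

Here the final reward is discounted once per step on the way back to the start of the episode, so earlier steps receive smaller returns.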

MFEC uses non-parametric methods to approximate the Q-value function. It stores the states the agent has visited, each paired with the return that followed, and these stored pairs act as the training dataset for the k-Nearest Neighbors algorithm. When the agent encounters a state that is not in memory, its Q-value is estimated by averaging the stored Q-values of the k nearest states.
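The store-then-look-up procedure can be sketched as a small memory class. This is a simplified illustration under stated assumptions, not a reference implementation: the class name `EpisodicMemory`, the separate table per action, the rule of keeping the highest return seen for a state-action pair, and the default value of 0.0 for an empty table are all choices made for the example.

```python
import math

class EpisodicMemory:
    """Hypothetical per-action store mapping a state to its best return."""

    def __init__(self, actions, k=2):
        self.k = k
        self.tables = {a: {} for a in actions}

    def update(self, state, action, ret):
        table = self.tables[action]
        key = tuple(state)
        # Assumed rule: keep the highest return seen for this pair.
        table[key] = max(table.get(key, -math.inf), ret)

    def estimate(self, state, action):
        table = self.tables[action]
        key = tuple(state)
        if key in table:
            return table[key]      # exact match: use the stored value
        if not table:
            return 0.0             # assumed default for an empty table
        # Unseen state: average the values of the k nearest stored states.
        nearest = sorted(table, key=lambda s: math.dist(s, key))[: self.k]
        return sum(table[s] for s in nearest) / len(nearest)

    def best_action(self, state):
        return max(self.tables, key=lambda a: self.estimate(state, a))

memory = EpisodicMemory(actions=["left", "right"], k=2)
memory.update((0.0, 0.0), "left", 1.0)
memory.update((0.0, 0.0), "left", 0.5)   # lower return, so it is ignored
memory.update((1.0, 0.0), "left", 3.0)
memory.update((0.0, 0.0), "right", 4.0)
q_left = memory.estimate((0.5, 0.0), "left")
q_right = memory.estimate((0.5, 0.0), "right")
action = memory.best_action((0.5, 0.0))
```

For the unseen state, the "left" estimate averages the two stored values, while "right" has only one stored neighbor and returns its value directly. The sketch has no capacity limit; a practical memory would need an eviction rule once it fills up.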

Dimensionality reduction of the state space

MFEC reduces the dimensionality of the state space before anything is stored. The state space is the set of all possible states that the agent can be in, and for raw observations such as images it is far too large to store and search directly. Each observation is therefore mapped to a low-dimensional embedding, and it is the embedding that is stored and compared. This allows for efficient storage and retrieval of high-dimensional data.

In the original MFEC formulation, the embedding is a random projection: the observation is multiplied by a fixed matrix whose entries are drawn from a standard Gaussian distribution, the familiar bell-curve density. The matrix is sampled once and then kept fixed, so the mapping adds no trainable parameters. A learned embedding, such as the latent code of a variational autoencoder, is an alternative.

The low-dimensional embeddings are what represent the agent's states in memory. Random projections approximately preserve relative distances between points (the Johnson-Lindenstrauss lemma), so nearest neighbors found among the embeddings are a reasonable proxy for nearest neighbors in the original high-dimensional space. This reduces the dimensionality of the state space, making the data cheaper to store and faster to search.
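One simple way to realize a mapping from a high-dimensional state to a low-dimensional one is a Gaussian random projection, sketched below with the standard library only. The function names, the dimensions (100 inputs, 8 outputs), and the fixed seed are assumptions chosen for illustration.

```python
import random

def make_projection(input_dim, output_dim, seed=0):
    """Sample a fixed output_dim x input_dim matrix of standard Gaussians."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(input_dim)]
            for _ in range(output_dim)]

def project(matrix, state):
    """Map a high-dimensional state to its low-dimensional embedding."""
    if any(len(row) != len(state) for row in matrix):
        raise ValueError("state length does not match the projection")
    # Each output coordinate is the dot product of one row with the state.
    return tuple(sum(w * x for w, x in zip(row, state)) for row in matrix)

projection = make_projection(input_dim=100, output_dim=8)
state = [1.0] * 100
embedding = project(projection, state)
```

The matrix is sampled once and reused for every state, so identical states always map to the same embedding and can be used as lookup keys in a memory table.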

Advantages of MFEC

There are several advantages of using MFEC to approximate Q-values:

Non-parametric: MFEC does not assume a specific functional form for the Q-value function. This makes it more robust in cases where the form of the underlying function is unknown.
Model-free: MFEC does not learn a model of the environment's dynamics, and it does not train a parametric function approximator by gradient descent. Q-values are read directly from stored experience, so a single rewarding episode can improve the agent's behavior immediately.
Efficient storage: MFEC stores low-dimensional embeddings rather than raw high-dimensional states. This reduces memory use and speeds up nearest-neighbor search.

Applications of MFEC

MFEC is relevant to applications in fields such as robotics, gaming, and finance. Its best-documented use is in gaming, where it was introduced and evaluated on Atari games and 3D navigation tasks, producing agents that learn good behavior from relatively few episodes. In robotics, the same approach can be used to choose actions in decision-making problems based on past experience. In finance, it could in principle be applied to sequential decision problems such as trading, although that use is less established.

MFEC is a promising technique for approximating Q-values in a non-parametric manner. Its main strengths are fast learning from limited experience and a simple, table-like memory, at the cost of storage and lookup time that grow with the amount of experience retained. As research continues in this area, MFEC is likely to become more widely used.