Sunday, April 25, 2021

Minimal Cognitive Architecture

I ran an experiment on a cognitive architecture with a minimal setup (Fig.1).

Code on GitHub

Fig.1

The objectives of the experiment were as follows:
  • Usability checking of BriCA, a brain-inspired computational framework, which performs computation by passing numerical vectors among modules.
  • Curriculum learning
    With the hypothesis that perceptual components in animals or generally intelligent agents are trained independently with tasks, the perceptual component was trained first, then used later.
  • Separating an internal learning environment from the exterior environment
    An internal environment was set up with the hypothesis that the brain is a society of agents, each of which has its own environment.
With these objectives met, the architecture will serve as a design pattern for more complex architectures.

The following software frameworks were used:

Task and reinforcement learning

OpenAI Gym CartPole is an introductory task for reinforcement learning.
Tensorforce provides a sample actor-critic (PPO) code to solve the task.
Actor-critic was chosen because the brain (basal ganglia) is thought to use an actor-critic mechanism.

Training the perceptual module

The observation data from CartPole is just a four-dimensional numeric array, so calling the perceptual module 'VisualComponent' is a bit of an exaggeration; a simple auto-encoder without a CNN was used.
The observation data for training the perceptual module was collected while the observations were fed directly to the motor component, which learns by PPO.  The data could instead be collected while the motor component chooses actions randomly, but that would limit the range of the data, because each episode would fail after only a few actions.
The auto-encoder model is trained with an independent utility and saved to a file to be used in the perceptual module later.  In this 'curriculum learning', I followed the approach used in the set-up of the working memory hackathon held by WBAI and Cerenaut.
The output dimension of the auto-encoder was the same as the input, so no information compression was performed, as the dimension (=four) of the input is low enough (or too low).
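As a rough illustration of an auto-encoder whose code layer is as wide as its input, here is a minimal sketch in plain Python.  It is not the model in the repository: it assumes a purely linear encoder and decoder trained by stochastic gradient descent on random four-dimensional samples, and every name in it (DIM, train_step, train_autoencoder, ...) is made up for this example.

```python
import random

DIM = 4  # CartPole observation size; the code layer is kept at the same width


def matvec(matrix, vector):
    """Multiply a DIM x DIM matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rng, scale=0.5):
    return [[rng.uniform(-scale, scale) for _ in range(DIM)] for _ in range(DIM)]


def mean_error(enc, dec, samples):
    """Mean squared reconstruction error over a list of samples."""
    total = 0.0
    for x in samples:
        y = matvec(dec, matvec(enc, x))
        total += sum((a - b) ** 2 for a, b in zip(y, x)) / DIM
    return total / len(samples)


def train_step(enc, dec, x, lr=0.02):
    """One SGD step on the loss 0.5 * ||dec(enc(x)) - x||^2 (updates in place)."""
    h = matvec(enc, x)                    # code (same width as the input)
    y = matvec(dec, h)                    # reconstruction
    err = [a - b for a, b in zip(y, x)]   # gradient of the loss w.r.t. y
    # Gradient w.r.t. the code, computed before the decoder is modified.
    grad_h = [sum(dec[i][j] * err[i] for i in range(DIM)) for j in range(DIM)]
    for i in range(DIM):
        for j in range(DIM):
            dec[i][j] -= lr * err[i] * h[j]
            enc[i][j] -= lr * grad_h[i] * x[j]


def train_autoencoder(steps=5000, seed=0):
    """Train on synthetic samples; returns (error before, error after)."""
    rng = random.Random(seed)
    # Stand-in data; in the experiment the samples are recorded observations.
    samples = [[rng.uniform(-1.0, 1.0) for _ in range(DIM)] for _ in range(200)]
    enc, dec = random_matrix(rng), random_matrix(rng)
    before = mean_error(enc, dec, samples)
    for _ in range(steps):
        train_step(enc, dec, rng.choice(samples))
    return before, mean_error(enc, dec, samples)
```

Calling train_autoencoder() returns the reconstruction error before and after training.  With no bottleneck, the network is free to approach an identity mapping, which is the sense in which no compression takes place.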

Using BriCA

BriCA is a brain-inspired computational framework.  While it can be used to model signal passing among regions in the brain, it restricts the computation: the information flow is limited to the 'axonal' direction and it has its own time steps.
For one thing, it cannot propagate gradients ('back-prop') beyond module boundaries, which is a motivation for training the auto-encoder based perceptual module in an unsupervised manner.
An issue with its own time steps is the discrepancy with those of the (Gym) environment.  It takes more than one BriCA step to complete an observe-action loop, which would have negative effects on reinforcement learning (especially when the learner is not tuned to such a setup).  Thus, in this experiment, I added a token mechanism for the system to synchronize with the observe-action loop.
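The idea of synchronizing by token can be sketched without BriCA itself.  The following toy scheduler is an assumption-laden illustration, not the BriCA API and not the repository code: the Module class, the message format and the run function are all hypothetical.  Each module sees its input one tick after it was produced, each observation carries a token, and the environment advances only when an action comes back with the current token.

```python
class Module:
    """Toy module: fires once per tick on the message delivered after the previous tick."""

    def __init__(self, fn):
        self.fn = fn
        self.inbox = None
        self.outbox = None

    def fire(self):
        self.outbox = None if self.inbox is None else self.fn(self.inbox)


def run(observations, num_env_steps):
    """Drive a perception -> motor chain; needs num_env_steps + 1 observations."""
    # Perception passes data through; motor applies a fixed rule. Both copy the token.
    perception = Module(lambda m: {"token": m["token"], "data": m["data"]})
    motor = Module(lambda m: {"token": m["token"], "data": 1 if m["data"] > 0 else 0})

    token = 0
    pending = {"token": token, "data": observations[token]}
    log, ticks, stale = [], 0, 0

    while len(log) < num_env_steps:
        ticks += 1
        perception.fire()
        motor.fire()
        action = motor.outbox
        motor.inbox = perception.outbox   # visible to motor on the next tick
        if action is not None:
            if action["token"] == token:
                # The action answers the current observation: advance the
                # (stand-in) environment and issue a new token.
                log.append((token, observations[token], action["data"]))
                token += 1
                pending = {"token": token, "data": observations[token]}
            else:
                stale += 1                # answer to an older observation: ignore
        perception.inbox = pending        # visible to perception on the next tick
    return log, ticks, stale
```

With this two-module chain, run([0.3, -0.2, 0.7, -0.4], 3) needs seven ticks for three environment steps and discards two stale actions, so the learner only ever sees one action per observation even though the modules keep firing in between.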

Internal Environment

The brain (or basal ganglia) is thought to have multiple reinforcement learning modules to control various external and internal actions.  That is, they require their own internal environments that accept module actions and produce their own observations.  In this experiment, the motor component has its own internal environment.  Tensorforce was used as the framework because it enables a flexible learning setup.
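One way to picture a module that owns its environment is the sketch below.  It is only a guess at the pattern, with hypothetical class and method names (InternalEnvironment, RuleLearner, MotorComponent, feed, fire); the learner is a fixed-rule stand-in with an act/observe style interface rather than a real Tensorforce agent.

```python
class InternalEnvironment:
    """What the motor component 'sees'; filled from the module's inputs."""

    def __init__(self):
        self.observation = None
        self.reward = 0.0
        self.done = False
        self.emitted_action = None

    def feed(self, observation, reward, done):
        # Called by the hosting module when upstream data arrives.
        self.observation, self.reward, self.done = observation, reward, done

    def step(self, action):
        # Record the action for output and report the current view.
        self.emitted_action = action
        return self.observation, self.reward, self.done


class RuleLearner:
    """Stand-in for an RL agent: acts by a fixed rule and sums rewards."""

    def __init__(self):
        self.total_reward = 0.0

    def act(self, observation):
        return 1 if sum(observation) > 0 else 0

    def observe(self, reward, done):
        self.total_reward += reward


class MotorComponent:
    def __init__(self):
        self.env = InternalEnvironment()
        self.learner = RuleLearner()
        self.acted = False

    def fire(self, observation, reward, done):
        self.env.feed(observation, reward, done)
        if self.acted:
            # The incoming reward is credited to the previous action.
            self.learner.observe(self.env.reward, self.env.done)
        action = self.learner.act(self.env.observation)
        self.env.step(action)
        self.acted = True
        return self.env.emitted_action
```

The point of the indirection is that the learner never touches the exterior environment: it interacts only with the object held by its own module, and the module decides when that object is refreshed.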

Result

The agent with a learned perceptual model could learn the task, though the scores were not as good as those from the system with the observation fed directly to the motor component.  This is presumably because of information loss in the learned perceptual module.  Since the observation is low-dimensional, the auto-encoder offers no benefit from information compression.

Future Direction

The architecture created in the experiment will serve as a design pattern for future cognitive modeling.  The mind map below (Fig.2) shows possible extensions.

Fig.2