I have been playing around with autoencoder implementations to build a predictor, since the principal function of the neocortex is supposed to be prediction. I tried a simple autoencoder and a sparse autoencoder from a cerenaut repository, and a β-VAE implementation from a Princeton University project repository (see the explanatory article). I chose the β-VAE because I plan to use it to model the association cortex, where CNNs may not be appropriate (this β-VAE implementation uses only Linear layers, not convolutional ones), and because the simple autoencoder may not be powerful enough.
I constructed a predictor from the repository's encoder, decoder, and autoencoder factory, with a single modification to the decoder setting. The predictor differs from the autoencoder only in what the decoder output is trained to match: the autoencoder reconstructs the encoder input, whereas the predictor predicts a different input.
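To make the difference concrete, here is a minimal PyTorch sketch. It is not the code from the repository: the class `LinearAE`, the helper `train`, the layer sizes, and the toy data are all my assumptions for illustration, and it uses a plain autoencoder with Linear layers rather than a β-VAE. The only thing that changes between the two runs is the training target.

```python
import torch
from torch import nn


class LinearAE(nn.Module):
    """Hypothetical encoder/decoder pair built from Linear layers only."""

    def __init__(self, in_dim, hidden_dim, latent_dim, out_dim):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))


def train(model, inputs, targets, steps=200, lr=1e-2):
    """Fit the model so that model(inputs) approximates targets (MSE loss)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    losses = []
    for _ in range(steps):
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return losses


torch.manual_seed(0)
x = torch.rand(64, 16)                # toy stand-in for flattened images
y = torch.roll(x, shifts=4, dims=1)   # toy stand-in for the "other input"

# Autoencoder: the decoder output is trained to match the encoder input.
ae = LinearAE(16, 32, 8, 16)
ae_losses = train(ae, x, x)

# Predictor: same architecture, but the decoder output is trained
# to match a different input (here, the shifted copy y).
pred = LinearAE(16, 32, 8, 16)
pred_losses = train(pred, x, y)
```

In this sketch the shifted copy `y` merely plays the role that rotated images play in the MNIST test; any paired input and target of matching batch size would do.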
The implementation is found here: https://github.com/rondelion/AEPredictor
A test result on an MNIST rotation task (predicting rotated images), after 100 epochs of training, is shown below: