As for attitude control, I have done:
- one-dimensional rotation control with simple learning algorithms (experimenting in Java)
- porting the algorithms to C++ (SigVerse controllers must be written in C++)
- implementing a simple attitude control mechanism (3D, but without learning) with SigVerse
OK, attitude control doesn't require learning, and learning C++ and setting up the environment required a lot of time...
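For the record, the one-dimensional rotation control above can be sketched roughly as follows. This is my own minimal reconstruction, not the actual Java/C++ code: tabular Q-learning where the state is the discretized angle error, the actions are torques {-1, 0, +1}, and the reward is the negative absolute error; the bin count, learning rate, and toy dynamics are placeholders.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstdlib>

const double PI = 3.14159265358979;
constexpr int BINS = 21;    // discretization of the angle error
constexpr int ACTIONS = 3;  // torques: -1, 0, +1

struct QController {
    std::array<std::array<double, ACTIONS>, BINS> q{};  // Q-table, zero-init

    static int bin(double err) {  // map error in [-pi, pi] to a bin index
        double clamped = std::max(-PI, std::min(PI, err));
        int b = static_cast<int>((clamped + PI) / (2 * PI) * BINS);
        return std::min(b, BINS - 1);
    }
    int greedy(int s) const {  // index of the best action in state s
        int best = 0;
        for (int a = 1; a < ACTIONS; ++a)
            if (q[s][a] > q[s][best]) best = a;
        return best;
    }
    // One Q-learning update: Q(s,a) += alpha*(r + gamma*max_a' Q(s',a') - Q(s,a))
    void update(int s, int a, double r, int s2,
                double alpha = 0.1, double gamma = 0.9) {
        q[s][a] += alpha * (r + gamma * q[s2][greedy(s2)] - q[s][a]);
    }
};

// Train on a toy rotational plant where torque directly nudges the error.
void train(QController& c, int episodes = 2000) {
    for (int ep = 0; ep < episodes; ++ep) {
        double err = (std::rand() % 2000 / 1000.0 - 1.0) * PI;  // random start
        for (int t = 0; t < 50; ++t) {
            int s = QController::bin(err);
            int a = (std::rand() % 10 == 0) ? std::rand() % ACTIONS
                                            : c.greedy(s);      // eps-greedy
            double torque = a - 1;                              // -1, 0, +1
            err += torque * 0.1;                                // toy dynamics
            c.update(s, a, -std::fabs(err), QController::bin(err));
        }
    }
}
```

After training, the greedy policy should apply negative torque for a large positive error and vice versa, i.e. it rediscovers proportional-style correction from reward alone.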
Last September, I wrote in my research plan:
The following is a bit more concrete specification for Phase II.

Phase II: Recognizing Spelke's Objects
- Basic Ideas
- Spelke's Object: a coherent, solid, and inert bundle of features of a certain dimension that persists over time.
Features: colors, shapes (jaggedness), texture, visual depth, etc.
- While recognition of Spelke's objects may be preprogrammed, recognized objects become objects of categorization by means of unsupervised learning. In this process, hierarchical (deep) learning would proceed from the categorization of primitive features to the re-categorization of categorized patterns.
- Object recognition will be carried out within spontaneous actions of the robot.
- The robot shall gather information preferentially on 'novel' objects (curiosity-driven behavior) ('novelty' to be defined).
Experiments will be done with the SigVerse robot simulator.
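The plan deliberately leaves 'novelty' to be defined. One common operational choice, which is purely my assumption here and not a commitment of the plan, is the distance from the current sensory feature vector to the nearest already-seen prototype: anything far from everything encountered so far scores as novel.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Candidate novelty measure (an assumption, not the plan's definition):
// Euclidean distance to the nearest stored prototype feature vector.
double novelty(const std::vector<double>& x,
               const std::vector<std::vector<double>>& prototypes) {
    double best = 1e9;  // large sentinel; empty memory => everything is novel
    for (const auto& p : prototypes) {
        double d2 = 0;
        for (size_t i = 0; i < x.size(); ++i)
            d2 += (x[i] - p[i]) * (x[i] - p[i]);
        best = std::min(best, std::sqrt(d2));
    }
    return best;
}
```

A curiosity-driven robot would then weight its reward by this score, so that attention is drawn preferentially to high-novelty objects until they become familiar.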
Robot Basics
- A fish-like robot swimming in a 3D space
The 'aquarium' is a cube/cuboid enclosed by walls.
The robot cannot see transparent walls.
The observer cannot see inside opaque walls.
⇒ Keep the robot from wandering away by making it attracted to objects on the floor.
- There are passively movable objects on the ground.
- Line of Sight and 2D distance sensors (SigVerse)
- If reward is under a threshold, then change direction.
- Change direction for reward increase.
- Accelerate towards the direction of maximal reward.
- Accelerate for reward increase; decelerate (accelerate in reverse) for reward decrease.
- Complexity (variance) in the 2D distance (depth) patterns will give a positive reward (motivation for reaching objects; curiosity; aesthetics).
- Concussion by collision will give a negative reward.
- Getting bait will give a positive reward.
- Learning by rewards (reinforcement learning)
- For example, the robot may learn bumping objects away to get bait.
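The steering rules and reward sources above can be condensed into one decision step. This is a sketch under my own assumptions: the reward weights (0.1 for depth variance, -5 for concussion, +10 for bait) and the threshold are placeholders, not values from the plan.

```cpp
#include <numeric>
#include <vector>

// Reward sources from the list above, with placeholder weights:
// variance of the 2D depth pattern rewards "interesting" views,
// a concussion (collision) is penalized, and bait gives a fixed bonus.
double reward(const std::vector<double>& depth, bool concussion, bool bait) {
    double mean = std::accumulate(depth.begin(), depth.end(), 0.0) / depth.size();
    double var = 0;
    for (double d : depth) var += (d - mean) * (d - mean);
    var /= depth.size();
    return 0.1 * var - (concussion ? 5.0 : 0.0) + (bait ? 10.0 : 0.0);
}

// The steering rules as one decision step over consecutive rewards.
enum Action { KEEP_COURSE, CHANGE_DIRECTION, ACCELERATE, DECELERATE };

Action steer(double r_now, double r_prev, double threshold) {
    if (r_now < threshold) return CHANGE_DIRECTION;  // reward under threshold
    if (r_now > r_prev)    return ACCELERATE;        // reward is increasing
    if (r_now < r_prev)    return DECELERATE;        // reward is decreasing
    return KEEP_COURSE;
}
```

Layered on top of this reactive policy, the reinforcement learner would adjust which actions get taken in which sensory states, e.g. learning to bump objects away to reach bait.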
- Sensory input
- 2D distance/depth (including optical flow)
- Acceleration and rotation (kinesthetic input)
- Concussion (large acceleration)
- Clustering sensory input
- Of course, feature selection (manual or automatic) is key to successful learning.
Check if the sensory pattern learning above yields the recognition of Spelke's objects.
If it does not, then add built-in mechanisms.