rondelion AI: Phase I ⇒ Phase II

I spent too much time on attitude control. It's time to move on.
As for attitude control, I have done:

- One-dimensional rotation control with simple learning algorithms (experimenting in Java)
- Porting the algorithms to C++ (SigVerse controllers must be written in C++)
- Implementing a simple attitude control mechanism (3D, but without learning) with SigVerse

OK, attitude control doesn't require learning, and learning C++ and setting up a C++ environment took a lot of time...

Last September, I wrote in my research plan:

Phase II: Recognizing Spelke's Objects

Basic Ideas

Spelke's Object: a coherent, solid, and inert bundle of features of a certain dimension that persists over time.
Features: colors, shapes (jagginess), texture, visual depth, etc.

While recognition of Spelke's objects may be preprogrammed, recognized objects become objects of categorization by means of unsupervised learning. In this process, hierarchical (deep) learning would proceed from the categorization of primitive features to the re-categorization of categorized patterns.

Object recognition will be carried out within spontaneous actions of the robot.

The robot shall gather information preferentially on 'novel' objects (curiosity-driven behavior) ('novelty' to be defined).

The following is a somewhat more concrete specification for Phase II.
Experiments will be done with the SigVerse robot simulator.

Robot Basics

A fish-like robot swimming in a 3D space

Environment

~~The 'aquarium' is a cube/cuboid enclosed by walls.~~
The robot cannot see transparent walls.
The observer cannot see inside opaque walls.
⇒
Keep the robot from wandering away by making it attracted to objects on the floor.
There are passively movable objects on the floor.

Robot Vision

Line of Sight and 2D distance sensors (SigVerse)

Basic activities

If reward is under a threshold, then change direction.
Change direction for reward increase.
Accelerate towards the direction of maximal reward.
Accelerate for reward increase. Accelerate inversely for reward decrease.
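
A minimal sketch of how two of these rules (change direction when the reward is under a threshold; accelerate with the sign of the reward change) might be combined into one control step. It is written in Python for brevity and does not use the SigVerse API. The function name `basic_activity_step` and the parameters `threshold` and `gain` are assumptions made for illustration, not part of the plan above.

```python
import math
import random


def basic_activity_step(reward, prev_reward, heading, speed,
                        threshold=0.1, gain=1.0, rng=random):
    """Return (new_heading, new_speed) for one control step.

    Hypothetical helper: heading is an angle in radians, speed is a signed
    scalar along the heading. threshold and gain are illustrative values.
    """
    if reward < threshold:
        # Reward is under the threshold: pick a new random direction.
        heading = rng.uniform(-math.pi, math.pi)
    # Accelerate forward when the reward increased and inversely
    # (backward) when it decreased, in proportion to the change.
    speed += gain * (reward - prev_reward)
    return heading, speed
```

For example, `basic_activity_step(0.5, 0.2, heading=0.0, speed=0.0)` keeps the heading and speeds the robot up, while a reward below `threshold` triggers a random turn. A random turn is only one possible way to "change direction"; a turn toward the direction of maximal reward would need a per-direction reward estimate.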

Rewards

Complexity (variance) in the 2D distance (depth) patterns will give a positive reward (motivation for reaching toward objects; curiosity; aesthetics).
Concussion from a collision will give a negative reward.
Getting bait will give a positive reward.
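
The three reward sources above could be combined roughly as follows. This is a sketch under stated assumptions: the depth pattern is taken to be a 2D list of distances, "complexity" is taken to be the population variance of those distances, and the weights `w_complexity`, `collision_penalty`, and `bait_reward` are hypothetical tuning parameters.

```python
from statistics import pvariance


def compute_reward(depth_map, collided, got_bait,
                   w_complexity=1.0, collision_penalty=1.0, bait_reward=1.0):
    """Combine the three reward sources into one scalar (illustrative only)."""
    # Flatten the 2D depth pattern into a single list of distances.
    depths = [d for row in depth_map for d in row]
    # Variance of depth as a stand-in for the complexity of the pattern.
    complexity = pvariance(depths) if depths else 0.0
    total = w_complexity * complexity
    if collided:
        total -= collision_penalty  # concussion gives a negative reward
    if got_bait:
        total += bait_reward        # getting bait gives a positive reward
    return total
```

A flat wall (all distances equal) yields zero complexity reward, while a view with objects at different depths yields a positive one. In practice the variance term would probably need normalizing so that it does not swamp the other two.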

Learning behaviors

Learning by rewards (reinforcement learning)
For example, the robot may learn to bump objects away to get bait.
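
One simple way to try this is tabular Q-learning. The plan above only says "reinforcement learning", so the choice of Q-learning, the discrete state and action labels, and the class name `QLearner` are all assumptions made for this sketch.

```python
import random
from collections import defaultdict


class QLearner:
    """Tabular Q-learning with epsilon-greedy action selection (a sketch)."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=None):
        self.actions = list(actions)
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.rng = random.Random(seed)

    def choose(self, state):
        # Explore with probability epsilon, otherwise act greedily.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard Q-learning update toward reward + discounted best next value.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

With hypothetical labels such as `learner.update("object_ahead", "push", 1.0, "bait_reached")`, the value of pushing when an object is ahead goes up, so the greedy choice in that state becomes `"push"`. The real state would have to be derived from the sensory patterns, which is the hard part.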

Learning sensory patterns

Sensory input
- 2D distance (depth), including optical flow
- Acceleration and rotation (kinesthetic input)
- Concussion (large acceleration)
Clustering sensory input
Of course, feature selection (manual or automatic) is the key to successful learning.
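
As a starting point for clustering, here is a sketch of online (sequential) k-means over feature vectors. The plan does not name a clustering method, so k-means is an assumption, as are the class name `OnlineKMeans` and the idea of feeding it one fixed-length feature vector per time step (for example depth statistics concatenated with acceleration and rotation readings).

```python
import math


class OnlineKMeans:
    """Sequential k-means: one centroid update per incoming sample (a sketch)."""

    def __init__(self, k):
        self.k = k
        self.centroids = []  # list of feature vectors
        self.counts = []     # number of samples assigned to each centroid

    def update(self, x):
        """Assign x to a cluster, move that centroid, and return its index."""
        x = list(x)
        if len(self.centroids) < self.k:
            # Use the first k samples as the initial centroids.
            self.centroids.append(x)
            self.counts.append(1)
            return len(self.centroids) - 1
        # Find the nearest centroid by Euclidean distance.
        i = min(range(self.k), key=lambda j: math.dist(x, self.centroids[j]))
        self.counts[i] += 1
        rate = 1.0 / self.counts[i]
        # Move the centroid toward x so it stays the running mean of its samples.
        self.centroids[i] = [c + rate * (v - c)
                             for c, v in zip(self.centroids[i], x)]
        return i
```

Because it processes one sample at a time, this fits a robot that gathers input while moving. The result depends heavily on which features go into the vector and how they are scaled, which is the feature selection problem again.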

Spelke's Objects

Check if the sensory pattern learning above yields the recognition of Spelke's objects.
If it does not, then add built-in mechanisms.

Monday, January 20, 2014