- In the previous post, I thought I'd use the depth map for object recognition, but in this plan I will use regular optical maps instead: the depth map and the optical map are different modalities, and it is harder to imagine, from the human (phenomenological) point of view, what is going on with depth perception.
- In this plan, I have dropped machine learning and will use more hard-wired approaches, to avoid unpredictable behavior.
Again recapitulating the Phase II part of my research plan here, the following is the new plan for Phase II.

Phase II: Recognizing Spelke's Objects
- Basic Ideas
- Spelke's Object: a coherent, solid, and inert bundle of features of a certain size that persists over time.
Features: colors, shapes (jaggedness), texture, visual depth, etc.
- While recognition of Spelke's objects may be preprogrammed, recognized objects become targets of categorization by means of unsupervised learning. In this process, hierarchical (deep) learning would proceed from the categorization of primitive features to the re-categorization of categorized patterns.
- Object recognition will be carried out in the course of the robot's spontaneous actions.
- The robot shall gather information preferentially on 'novel' objects (curiosity-driven behavior) ('novelty' to be defined).
Robot Basics
- Fish-like robot swimming in a 3D space
- Experiments will be done with the SigVerse robot simulator.
- Keep the robot from wandering away by making it attracted to objects on the floor (see below).
- There are passively movable objects on the ground.
Line of Sight and 2D Depth Sensors (SigVerse)
- Static images
- Optical flow (temporal difference)
- Line of sight depth sensor (to avoid collision)
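The temporal-difference signal can be sketched as a simple per-pixel frame difference, a minimal stand-in for full optical flow (the function name and threshold here are illustrative, not from the plan):

```python
import numpy as np

def temporal_delta(prev_frame, curr_frame, thresh=0.1):
    """Per-pixel temporal difference between two grayscale frames in [0, 1].

    Returns a binary motion mask, a crude proxy for optical-flow magnitude.
    """
    delta = np.abs(curr_frame.astype(float) - prev_frame.astype(float))
    return (delta > thresh).astype(np.uint8)

# A bright square shifts one pixel to the right between frames.
f0 = np.zeros((8, 8)); f0[2:5, 2:5] = 1.0
f1 = np.zeros((8, 8)); f1[2:5, 3:6] = 1.0
mask = temporal_delta(f0, f1)
# Motion shows up only at the leading and trailing edges of the square.
```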
Central visual field (CVF)
- The CVF (gaze) moves randomly (saccades) within the fixed visual field.
- CVF is attracted to information-dense areas.
- Information density is measured by the density of extracted features such as line segments and optical flow.
- CVF is 'bored' with each attractor as time passes.
- The CVF allows high-resolution feature extraction.
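The CVF dynamics above (attraction to information-dense areas plus growing boredom with each attractor) can be sketched as picking the grid cell with the highest density minus an accumulating habituation term; all names and constants here are illustrative assumptions, not part of the plan:

```python
import numpy as np

def next_gaze(density, habituation, boredom_rate=0.3, recovery=0.05):
    """Pick the next CVF target: the most informative, least 'boring' cell.

    density: feature density per grid cell (e.g. line segments, optical flow).
    habituation: accumulated boredom per cell; grows where we look, slowly fades.
    """
    interest = density - habituation
    y, x = np.unravel_index(np.argmax(interest), interest.shape)
    habituation *= (1.0 - recovery)    # boredom everywhere slowly fades
    habituation[y, x] += boredom_rate  # the attended cell becomes more boring
    return (y, x), habituation

density = np.array([[0.1, 0.9], [0.2, 0.3]])
hab = np.zeros_like(density)
gazes = [next_gaze(density, hab)[0] for _ in range(6)]
# The gaze first locks onto the dense cell, then drifts as boredom builds up.
```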
Peripheral visual field
- Information density is measured with low-resolution feature extraction.
Feature extraction
- Line segments (extracted using, e.g., SIFT or Gabor filters)
- Border ownership / figure-ground separation (see below)
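As a sketch of line-segment extraction, a hand-rolled Gabor filter bank can score how strongly each orientation is present in an image (a stand-in for library implementations such as OpenCV's `cv2.getGaborKernel`; all parameter values are illustrative):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def gabor_kernel(size=9, theta=0.0, wavelength=4.0, sigma=2.0):
    """Real part of a Gabor filter tuned to orientation `theta` (radians)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_t = x * np.cos(theta) + y * np.sin(theta)        # rotated coordinate
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))  # Gaussian window
    carrier = np.cos(2 * np.pi * x_t / wavelength)      # oriented sinusoid
    return envelope * carrier

def orientation_energy(image, thetas):
    """Total response energy for each orientation (valid-mode correlation)."""
    energies = []
    for theta in thetas:
        k = gabor_kernel(theta=theta)
        windows = sliding_window_view(image, k.shape)
        resp = np.einsum('ijkl,kl->ij', windows, k)
        energies.append(float(np.abs(resp).sum()))
    return energies

# A vertical bar excites the vertically tuned (theta = 0) filter far more
# than the horizontally tuned one.
img = np.zeros((20, 20)); img[:, 10] = 1.0
e_vert, e_horiz = orientation_energy(img, [0.0, np.pi / 2])
```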
Locomotion
- Randomly change direction.
- Direction change is attracted toward the line of sight.
- If the reward increases after a direction change, accelerate until the next direction change.
- If locomotion decreases the reward, decelerate (with water resistance) (and change direction).
- If the depth sensor predicts a collision, decelerate (with water resistance) and change direction.
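The locomotion rules above can be sketched as one control step over speed and heading; the state layout and all constants are illustrative assumptions:

```python
import math
import random

def locomotion_step(state, reward, prev_reward, collision_predicted,
                    drag=0.9, accel=0.2, turn_prob=0.1):
    """One control step over the fish-like robot's speed and heading.

    state: dict with 'speed' and 'heading' (radians).
    """
    if collision_predicted:
        state['speed'] *= drag                          # decelerate (water resistance)
        state['heading'] = random.uniform(0.0, math.tau)  # and change direction
    elif reward > prev_reward:
        state['speed'] += accel                         # reward rose: accelerate
    elif reward < prev_reward:
        state['speed'] *= drag                          # reward fell: decelerate
        state['heading'] = random.uniform(0.0, math.tau)
    elif random.random() < turn_prob:                   # spontaneous direction change
        state['heading'] = random.uniform(0.0, math.tau)
    return state

s = locomotion_step({'speed': 1.0, 'heading': 0.0},
                    reward=0.5, prev_reward=0.2, collision_predicted=False)
s2 = locomotion_step({'speed': 1.0, 'heading': 0.0},
                     reward=0.0, prev_reward=0.0, collision_predicted=True)
# s accelerates (rising reward); s2 brakes (predicted collision).
```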
Reward
- An increase in information density in the CVF gives a positive reward (curiosity; aesthetics).
- Impact from a collision gives a negative reward.
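A minimal reward function matching these two bullets might look like this (clipping negative density changes to zero is an assumption, since the plan only states that an increase is rewarded; the penalty size is also illustrative):

```python
def curiosity_reward(density_now, density_prev, collided, collision_penalty=1.0):
    """Positive reward when CVF information density rises; negative on collision."""
    r = max(density_now - density_prev, 0.0)  # curiosity term (clipped at zero)
    if collided:
        r -= collision_penalty                # impact term
    return r

r_up = curiosity_reward(0.8, 0.5, collided=False)  # density rose: positive reward
r_hit = curiosity_reward(0.5, 0.5, collided=True)  # collision: negative reward
```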
- Uses figure-ground separation algorithms inspired by visual information processing in the brain. References:
- Border-ownership coding
- Computational Models of Visual Cortex (視覚皮質の計算論的モデル; mostly English slides, Sakai 2014)
- Spelke's objects are recognized as figure-like lumps detected by figure-ground separation algorithms.
- Optical flow may also be used for figure-ground separation.
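As a crude stand-in for the border-ownership models cited above, figure-like lumps can be extracted from a binary "figure" mask with a plain connected-component pass; `min_size` is an illustrative noise threshold:

```python
from collections import deque

import numpy as np

def figure_lumps(feature_mask, min_size=3):
    """Group 'figure' pixels into 4-connected lumps (candidate Spelke objects)."""
    h, w = feature_mask.shape
    seen = np.zeros_like(feature_mask, dtype=bool)
    lumps = []
    for sy in range(h):
        for sx in range(w):
            if feature_mask[sy, sx] and not seen[sy, sx]:
                queue, lump = deque([(sy, sx)]), []
                seen[sy, sx] = True
                while queue:  # breadth-first flood fill
                    y, x = queue.popleft()
                    lump.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w \
                                and feature_mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            queue.append((ny, nx))
                if len(lump) >= min_size:  # drop tiny lumps as noise
                    lumps.append(lump)
    return lumps

mask = np.zeros((6, 6), dtype=bool)
mask[1:3, 1:3] = True   # a 4-pixel lump
mask[4, 4] = True       # an isolated pixel, discarded as noise
lumps = figure_lumps(mask)
# One lump of four pixels survives the size threshold.
```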