Videos of the Experiments
A. Pre-training Experiments
This offline learning experiment evaluates the accuracy of pre-training by generating actions from data alone. The pre-trained neural network was tested to see whether the robot could perform collision avoidance in the experimental environment.
B. Reinforcement Learning Experiments
This is an online learning experiment where the robot’s actions were generated through sequential learning without pre-training. To evaluate the system’s generalization ability, the robot was tested from three starting positions: the lower-left, lower-center, and lower-right of the environment.
C. Experiments with human commonsense
This is a collision avoidance experiment using reinforcement learning with prior human knowledge. Similar to the experiment with reinforcement learning alone, the test was conducted from three different starting positions.
D. Experiments in a dynamic environment
This is a collision avoidance experiment conducted in a dynamic environment that changes continuously. Two types of experiments were performed: one using prior human knowledge and one without.
E. Experiments with faulty input values
This experiment observes the robot’s behavior when physical factors change. The test assumes that one of the four sensors installed on the robot is malfunctioning and assesses whether the robot can still perform collision avoidance. To evaluate the usefulness of pre-training, two experiments were conducted: one with pre-training and one without.
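One way to picture the faulty-input condition is to overwrite one of the four sensor readings before it reaches the neural network. The sketch below assumes the fault is a sensor stuck at a fixed value; the function name `apply_sensor_fault`, the `stuck_value` parameter, and the stuck-at model itself are illustrative assumptions, not details taken from the experiment.

```python
NUM_SENSORS = 4  # the robot described here carries four sensors


def apply_sensor_fault(readings, faulty_index, stuck_value=0.0):
    """Return a copy of the readings with one sensor stuck at a fixed value.

    Assumption: a malfunction is modeled as a stuck-at fault. The real
    experiment may have used a different failure mode.
    """
    if len(readings) != NUM_SENSORS:
        raise ValueError("expected exactly four sensor readings")
    if not 0 <= faulty_index < NUM_SENSORS:
        raise ValueError("faulty_index must be between 0 and 3")
    corrupted = list(readings)  # copy so the caller's data is untouched
    corrupted[faulty_index] = stuck_value
    return corrupted


# Example: the third sensor (index 2) always reports 0.0.
healthy = [0.8, 0.3, 0.6, 0.1]
faulty = apply_sensor_fault(healthy, faulty_index=2)
```

Feeding `faulty` instead of `healthy` to the controller would let the pre-trained and non-pre-trained versions be compared under the same fault.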
F. Experiments with faulty output values
This experiment observes how the robot’s behavior changes when either the left or right motor malfunctions. Two types of experiments were conducted: one with prior knowledge and one without.
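A faulty output can be sketched in a similar spirit by scaling the command sent to one motor. The example below assumes a differential-drive robot with one command per motor and models the malfunction as a gain applied to the affected side (0.0 meaning the motor does not turn at all). The name `apply_motor_fault` and the gain model are hypothetical, chosen only for illustration.

```python
def apply_motor_fault(left_cmd, right_cmd, faulty_side, gain=0.0):
    """Scale the command of the faulty motor by `gain`.

    Assumption: the fault weakens or stops one motor. gain=0.0 is a dead
    motor, gain=0.5 a motor running at half strength.
    """
    if faulty_side == "left":
        return left_cmd * gain, right_cmd
    if faulty_side == "right":
        return left_cmd, right_cmd * gain
    raise ValueError("faulty_side must be 'left' or 'right'")


# Example: the controller asks for straight-ahead motion,
# but the left motor is dead, so the robot would veer left.
left, right = apply_motor_fault(1.0, 1.0, faulty_side="left")
```

Under this model the learner has to discover a new mapping from sensor input to motor commands, which is what the comparison with and without prior knowledge would probe.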
G. Experiments with continuous control parameters
This is a collision avoidance experiment that gives the robot greater freedom of movement. The output layer of the neural network uses a smooth sigmoid function, and the robot's movements are controlled directly by the function's continuous values rather than by a fixed set of actions. Two types of experiments were conducted: one with prior knowledge and one without.
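As a rough illustration of continuous control, the sketch below maps two output-layer activations through a sigmoid to left and right motor speeds. It assumes one output unit per motor and a hypothetical `MAX_SPEED` scale; neither the number of output units nor the speed range is stated in the description above.

```python
import math

MAX_SPEED = 1.0  # hypothetical top speed in arbitrary units


def sigmoid(x):
    """Numerically stable logistic function, with values in (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def motor_speeds(left_activation, right_activation, max_speed=MAX_SPEED):
    """Map raw output activations to continuous motor speeds.

    Assumption: each motor has its own output unit, and the sigmoid value
    is used as a fraction of max_speed.
    """
    return (max_speed * sigmoid(left_activation),
            max_speed * sigmoid(right_activation))


# Example: equal activations give equal speeds (straight motion);
# unequal activations give a smooth turn instead of a fixed-angle one.
straight = motor_speeds(0.0, 0.0)
turn = motor_speeds(2.0, -2.0)
```

Because the sigmoid is smooth, small changes in the network's activations produce small changes in speed, which is what allows finer movements than a discrete action set.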