6.3 Using Numerical Optimization Methods
6.3.1 Choosing the correct Supervisor approach
There are several approaches to using optimization algorithms in Webots. Most approaches need a Supervisor and hence Webots PRO is usually required.
A numerical optimization can usually be decomposed into two separate tasks:
- Running the optimization algorithm: systematic search, random search, Genetic Algorithms (GA), Particle Swarm Optimization (PSO), Simulated Annealing, etc.
- Running the robot behavior with a set of parameters specified by the optimization algorithm.
An important decision is whether these two tasks should be implemented in the same controller or in two separate controllers. Let's discuss both approaches:
Using a single controller
If your simulation needs to evaluate only one robot at a time, e.g. you are optimizing the locomotion gait of a humanoid or the behavior of a single robot, then both tasks can be implemented in the same controller; this results in somewhat simpler code. Here is a pseudo-code example for the systematic optimization of two parameters a and b using only one controller:
#include <webots/robot.h> |
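As a minimal sketch of a systematic (grid) search over two parameters, the following plain C code keeps the search loop separate from the robot evaluation. The names grid_search, evaluate_fn, best_t and example_fitness are hypothetical and not part of the Webots API. In a real controller the callback would run the robot for one trial by stepping the simulation with wb_robot_step() and would return the measured fitness. Here it is replaced by a stand-in function so that the code runs on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: runs the robot behavior with parameters (a, b)
   and returns its fitness (higher is better). */
typedef double (*evaluate_fn)(double a, double b);

/* Best parameter pair found so far, with its fitness. */
typedef struct {
  double a, b, fitness;
} best_t;

/* Stand-in fitness used for illustration only: maximal at a = 1, b = 2.
   A real controller would measure the robot's performance instead. */
double example_fitness(double a, double b) {
  return -(a - 1.0) * (a - 1.0) - (b - 2.0) * (b - 2.0);
}

/* Systematic search: try `steps` evenly spaced values of each parameter
   in [min, max] and keep the combination with the highest fitness. */
best_t grid_search(evaluate_fn evaluate, double a_min, double a_max,
                   double b_min, double b_max, int steps) {
  assert(evaluate != NULL && steps >= 2);
  best_t best = {a_min, b_min, 0.0};
  int have_best = 0;
  for (int i = 0; i < steps; i++) {
    double a = a_min + (a_max - a_min) * i / (steps - 1);
    for (int j = 0; j < steps; j++) {
      double b = b_min + (b_max - b_min) * j / (steps - 1);
      double f = evaluate(a, b); /* one complete robot trial */
      if (!have_best || f > best.fitness) {
        best.a = a;
        best.b = b;
        best.fitness = f;
        have_best = 1;
      }
    }
  }
  return best;
}
```

Calling grid_search(example_fitness, 0.0, 2.0, 0.0, 4.0, 5) evaluates 25 parameter pairs. Because each evaluation is a full robot trial, the number of steps per parameter directly determines the total simulation time.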
Using two distinct types of controllers
If, on the other hand, your simulation requires the simultaneous execution of several robots, e.g. swarm robotics, or if your robot is a DifferentialWheels, then it is advisable to use two distinct types of controllers: one for the optimization algorithm and one for the robot's behavior. The optimization algorithm should go in a Supervisor controller while the robots' behavior can go in a regular (non-Supervisor) controller.
Because these controllers run in separate system processes, they cannot access each other's variables. They must therefore communicate by some other means in order to specify the sets of parameters that need to be evaluated. It is possible, and recommended, to use Webots Emitters and Receivers to exchange information between the Supervisor and the other controllers. In a typical scenario, the Supervisor sends evaluation parameters (e.g., a genotype) to the robot controllers. The robot controllers listen to their Receivers, waiting for a new set of parameters. Upon receipt, a robot controller starts executing the behavior specified by that set of parameters. In this scenario, the Supervisor needs an Emitter and each individual robot needs a Receiver.
Depending on the algorithm's needs, the fitness can be evaluated either in the Supervisor or in the individual robot controllers. If it is evaluated in the robot controllers, then the fitness result needs to be sent back to the Supervisor. This bidirectional communication requires additional Emitters and Receivers: each robot also needs an Emitter and the Supervisor also needs a Receiver.
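The message exchange described above can be sketched with two fixed-size packets, one per direction. The struct layouts, GENOTYPE_SIZE and the helper names below are assumptions chosen for illustration; they are not defined by Webots. The sketch relies on all controllers being compiled with the same struct definitions on the same machine, so that the raw bytes can be copied directly. The Webots calls that would carry these bytes are mentioned only in comments, so the code compiles without the Webots headers.

```c
#include <assert.h>
#include <string.h>

#define GENOTYPE_SIZE 4 /* hypothetical number of parameters per genotype */

/* Supervisor -> robot: one candidate parameter set. */
typedef struct {
  int robot_id;
  double genes[GENOTYPE_SIZE];
} genotype_msg_t;

/* Robot -> Supervisor: the fitness obtained with that parameter set. */
typedef struct {
  int robot_id;
  double fitness;
} fitness_msg_t;

/* Sending side (comment only): wb_emitter_send(emitter, &msg, sizeof(msg));
   Receiving side (comment only): when wb_receiver_get_queue_length() > 0,
   pass wb_receiver_get_data() and wb_receiver_get_data_size() to one of
   the decode functions, then call wb_receiver_next_packet(). */

/* Copy a received packet into msg. Returns 1 on success, 0 if the packet
   is missing or does not have the expected size. */
int decode_genotype(const void *data, int size, genotype_msg_t *msg) {
  assert(msg != NULL);
  if (data == NULL || size != (int)sizeof(*msg))
    return 0;
  memcpy(msg, data, sizeof(*msg));
  return 1;
}

int decode_fitness(const void *data, int size, fitness_msg_t *msg) {
  assert(msg != NULL);
  if (data == NULL || size != (int)sizeof(*msg))
    return 0;
  memcpy(msg, data, sizeof(*msg));
  return 1;
}

/* Supervisor side: record a fitness reply in a per-robot table.
   Returns 0 if the robot id is out of range. */
int store_fitness(const fitness_msg_t *msg, double *fitness, int n_robots) {
  assert(msg != NULL && fitness != NULL);
  if (msg->robot_id < 0 || msg->robot_id >= n_robots)
    return 0;
  fitness[msg->robot_id] = msg->fitness;
  return 1;
}
```

Including a robot identifier in both packets lets several robots share one communication channel: each robot ignores genotypes addressed to others, and the Supervisor can tell which robot a fitness value came from.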
6.3.2 Resetting the robot
When using an optimization algorithm, you will probably need to reset the robot before or after each fitness evaluation. There are several approaches to resetting the robot:
Using the wb_supervisor_field_set_*() and wb_supervisor_simulation_physics_reset() functions
You can easily reset the position, orientation and physics of the robot using the wb_supervisor_field_set_*() and wb_supervisor_simulation_physics_reset() functions. Here is an example:
// get handles to the robot's translation and rotation fields |
Using the wb_supervisor_simulation_revert() function
This function restarts the physics simulation and all controllers from the very beginning. With this method, everything is reset, including the physics, the Servo positions and the controllers. However, this function also restarts the controller that called wb_supervisor_simulation_revert(), which is usually the controller that runs the optimization algorithm; as a consequence, the optimization state is lost. To use this technique, it is therefore necessary to develop functions that can save and restore the complete state of the optimization algorithm. The optimization state should be saved before calling wb_supervisor_simulation_revert() and reloaded when the Supervisor controller restarts. Here is a pseudo-code example:
#include <webots/robot.h> |
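One possible way to save and restore the optimization state is to write it to a file, as in the minimal sketch below. The opt_state_t layout, the population sizes and the function names are hypothetical; a real optimizer would store whatever it needs to resume (population, fitness values, random seed, iteration counters). The Supervisor would call load_state() at startup, initialize a fresh state if it returns 0, and call save_state() just before wb_supervisor_simulation_revert().

```c
#include <assert.h>
#include <stdio.h>

#define POP_SIZE 3      /* hypothetical population size */
#define GENOTYPE_SIZE 2 /* hypothetical number of parameters */

/* Everything the optimizer needs in order to resume after a revert. */
typedef struct {
  int generation;      /* current generation number */
  int next_individual; /* index of the next genotype to evaluate */
  double genes[POP_SIZE][GENOTYPE_SIZE];
  double fitness[POP_SIZE];
} opt_state_t;

/* Write the state to `path`. Returns 1 on success, 0 on failure. */
int save_state(const char *path, const opt_state_t *state) {
  assert(path != NULL && state != NULL);
  FILE *f = fopen(path, "wb");
  if (f == NULL)
    return 0;
  size_t written = fwrite(state, sizeof(*state), 1, f);
  int closed_ok = (fclose(f) == 0);
  return written == 1 && closed_ok;
}

/* Read the state from `path`. Returns 1 on success, or 0 if the file is
   missing or incomplete, which is the case on the very first run. */
int load_state(const char *path, opt_state_t *state) {
  assert(path != NULL && state != NULL);
  FILE *f = fopen(path, "rb");
  if (f == NULL)
    return 0;
  size_t got = fread(state, sizeof(*state), 1, f);
  fclose(f);
  return got == 1;
}
```

The binary dump is only meant to be read back by the same controller binary on the same machine. A text format would be a better choice if the state file must be inspected or shared between platforms.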
By starting and quitting Webots
Finally, the last method is to start and quit the Webots program for each parameter evaluation. This may sound like significant overhead, but in fact the Webots startup time is usually very short compared to the time necessary to evaluate a controller, so this approach makes perfect sense.
For example, Webots can be called from a shell script or from any type of program suitable for running the optimization algorithm. Starting Webots each time reverts the simulation completely, so each robot starts from the same initial state. The drawback of this method is that the optimization algorithm has to be programmed outside of Webots. This external program can be written in any programming language, e.g. shell script, C, PHP, Perl, etc., provided that there is a way to call webots and wait for its termination, as the C standard system() function does. The parameter evaluation, in contrast, must be implemented in a Webots controller.
With this approach, the optimization algorithm and the robot controller(s) run in separate system processes, but they must communicate with each other in order to exchange parameter sets and fitness results. One simple way is to make them communicate through text files. For example, the optimization algorithm can write the genotype values into a text file and then call Webots. When Webots starts, the robot controller reads the genotype file and carries out the parameter evaluation. When the robot controller finishes the evaluation, it writes the fitness result into another text file and then calls the wb_supervisor_simulation_quit() function to terminate Webots. The control flow then returns to the optimization program, which can read the resulting fitness, associate it with the current genotype and proceed with the next genotype.
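The optimizer side of this file-based exchange could look like the following sketch in plain C. The file format (one number per line), the function names and the command argument are assumptions: the exact command line needed to launch Webots on a given world depends on the installation and is deliberately left to the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Write n genotype values to a text file, one per line.
   Returns 1 on success, 0 on failure. */
int write_genotype(const char *path, const double *genes, int n) {
  assert(path != NULL && genes != NULL && n > 0);
  FILE *f = fopen(path, "w");
  if (f == NULL)
    return 0;
  for (int i = 0; i < n; i++)
    fprintf(f, "%.17g\n", genes[i]); /* enough digits to round-trip a double */
  return fclose(f) == 0;
}

/* Read the single fitness value written by the robot controller.
   Returns 1 on success, 0 if the file is missing or malformed. */
int read_fitness(const char *path, double *fitness) {
  assert(path != NULL && fitness != NULL);
  FILE *f = fopen(path, "r");
  if (f == NULL)
    return 0;
  int ok = (fscanf(f, "%lf", fitness) == 1);
  fclose(f);
  return ok;
}

/* One complete evaluation: write the genotype, run Webots and wait for it
   to terminate, then read the fitness. `command` is the caller-supplied
   command line that launches Webots (an assumption of this sketch). */
int evaluate_genotype(const char *command, const char *genotype_path,
                      const char *fitness_path, const double *genes, int n,
                      double *fitness) {
  remove(fitness_path); /* make sure a stale result is never read */
  if (!write_genotype(genotype_path, genes, n))
    return 0;
  if (system(command) == -1) /* blocks until the launched program exits */
    return 0;
  return read_fitness(fitness_path, fitness);
}
```

Deleting the old fitness file before each run means that a crashed or interrupted evaluation is reported as a failure instead of silently reusing the previous result.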
Here is a possible (pseudo-code) implementation for the robot evaluation controller:
#include <webots/robot.h> |
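The file handling on the controller side can be sketched as follows, assuming a text format with one number per line. The function names are hypothetical, and the Webots calls are mentioned only in comments so that the code compiles without the Webots headers. Between the two calls, the controller would run the robot behavior with the loaded parameters and compute its fitness.

```c
#include <assert.h>
#include <stdio.h>

/* Read exactly n genotype values from a text file.
   Returns 1 if all n values were read, 0 otherwise. */
int read_genotype(const char *path, double *genes, int n) {
  assert(path != NULL && genes != NULL && n > 0);
  FILE *f = fopen(path, "r");
  if (f == NULL)
    return 0;
  int count = 0;
  while (count < n && fscanf(f, "%lf", &genes[count]) == 1)
    count++;
  fclose(f);
  return count == n;
}

/* Write the fitness result for the optimization program to pick up.
   Returns 1 on success, 0 on failure. */
int write_fitness(const char *path, double fitness) {
  assert(path != NULL);
  FILE *f = fopen(path, "w");
  if (f == NULL)
    return 0;
  fprintf(f, "%.17g\n", fitness);
  return fclose(f) == 0;
}

/* Intended order of operations inside the controller (comment only):
   1. read_genotype(...) right after wb_robot_init()
   2. run the behavior, stepping the simulation with wb_robot_step()
   3. write_fitness(...)
   4. call wb_supervisor_simulation_quit() to terminate Webots */
```

Writing the fitness file as the last step before quitting matters: the optimization program only resumes once Webots has terminated, so the result must be on disk by then.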
You will find complete examples of simulations using optimization techniques in the Webots distribution: look for the worlds called advanced_particle_swarm_optimization.wbt and advanced_genetic_algorithm.wbt located in the WEBOTS_HOME/projects/samples/curriculum/worlds directory. These examples are described in the Advanced Programming Exercises of Cyberbotics' Robot Curriculum.