Webots Reference Manual - chapter 3 - section 5

3.5 Camera

Derived from Device.

Camera {
  SFFloat    fieldOfView      0.7854
  SFInt32    width            64
  SFInt32    height           64
  SFString   type             "color"
  SFBool     spherical        FALSE
  SFFloat    near             0.01
  SFFloat    maxRange         1.0
  SFVec2f    windowPosition   0 0
  SFFloat    pixelSize        1.0
  SFBool     antiAliasing     FALSE
  SFFloat    colorNoise       0.0
  SFFloat    rangeNoise       0.0
  SFNode     zoom             NULL
}

3.5.1 Description

The Camera node is used to model a robot's on-board camera, a range-finder, or both simultaneously. The resulting image can be displayed in the 3D window. Depending on its setup, the Camera node can model a linear camera, a lidar device, a Microsoft Kinect or even a spherically distorted biological eye.

3.5.2 Field Summary

3.5.3 Camera Type

The camera type can be set up using the type field described above.

Color

The color camera retrieves color information from the OpenGL context of the camera. This information can be obtained with the wb_camera_get_image function, while the red, green and blue (RGB) channels can be extracted from the returned image with the wb_camera_image_get_*-like functions.

Internally, when the camera is refreshed, an OpenGL context is created and the color or depth information is copied into a buffer which can be retrieved through the wb_camera_get_image or the wb_camera_get_range_image functions. The formats of these buffers are respectively BGRA (32 bits) and single-precision float (32 bits). We recommend using the wb_camera_image_get_*-like functions to access the buffer, because the internal format may change.

Range-Finder

The range-finder camera retrieves depth information (in meters) from the OpenGL context of the camera. This information is obtained through the wb_camera_get_range_image function, while depth values can be extracted from the returned image with the wb_camera_range_image_get_depth function.

Internally, when the camera is refreshed, an OpenGL context is created and the z-buffer is copied into a buffer of floats. As the z-buffer contains scaled, non-linear values, an algorithm linearizes the buffer into metric values between near and maxRange. This is the buffer which is accessible through the wb_camera_get_range_image function.
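For illustration, the linearization step can be sketched as below. This is the generic OpenGL perspective depth linearization, with maxRange playing the role of the far plane; the function name is ours and Webots' exact internal algorithm may differ:

```c
#include <math.h>

/* Convert a normalized OpenGL depth value z in [0, 1] into a metric
 * distance between the near and far clipping planes. Here max_range
 * plays the role of the far plane, as described above. This is the
 * standard perspective depth linearization, shown for illustration
 * only; Webots' internal implementation may differ. */
static double linearize_depth(double z, double near, double max_range) {
  return (near * max_range) / (max_range - z * (max_range - near));
}
```

Note that z = 0 maps to near and z = 1 maps to max_range, with precision decreasing away from the near plane, consistent with the range-finder behavior described later in this section.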

Both

This camera type provides both the color data and the range-finder data in the returned buffers, using a single OpenGL context. It was introduced for optimization purposes, mainly for the Microsoft Kinect device, as creating the OpenGL context is costly. The color image and the depth data are obtained with the wb_camera_get_image and wb_camera_get_range_image functions, as described above.

3.5.4 Frustum

The frustum is the truncated pyramid defining what is visible from the camera. Any 3D shape completely outside this frustum is not rendered. Hence, shapes located too close to the camera (between the camera and the near plane) do not appear. The frustum can be displayed with magenta lines by enabling the View|Optional Rendering|Show Camera Frustums menu item. The near parameter defines the position of the near clipping plane (x, y, -near). The fieldOfView parameter defines the horizontal angle of the frustum, while the vertical angle is derived from fieldOfView and the ratio between the height and width parameters.

Unlike most other OpenGL programs, Webots generally has no far clipping plane: a camera can see as far as needed. Nevertheless, a far clipping plane is artificially added in the case of range-finder cameras, i.e., the resulting values are bounded by the maxRange field.

In the case of spherical cameras, the frustum is quite different and difficult to represent. Compared with the frustum described above, the near and far planes become portions of spheres centered at the camera position, and the fieldOfView can be greater than pi.

3.5.5 Noise

White noise can be quickly added to the cameras using the colorNoise and rangeNoise fields (applied respectively to color cameras and range-finder cameras). A value of 0.0 corresponds to an image without noise. For each channel of the image and at each camera refresh, a Gaussian noise sample is computed and added to the channel. This Gaussian noise has a standard deviation corresponding to the noise field times the channel range. The channel range is 256 for a color camera and maxRange for a range-finder camera.

3.5.6 Spherical projection

OpenGL is designed to perform only planar projections. However, spherical projections are very useful for simulating a lidar, a camera pointing at a curved mirror, or a biological eye. Therefore, Webots implements a camera mode rendering spherical projections. It can be enabled simply by setting the spherical field described above to TRUE.

Internally, depending on the field of view, a spherical camera is implemented using between 1 and 6 OpenGL cameras oriented towards the faces of a cube (the activated cameras are displayed as magenta squares when the View|Optional Rendering|Show Camera Frustums menu item is enabled). An algorithm computing the spherical projection is then applied to the result of these sub-cameras.

This mode is therefore costly in terms of performance. If needed, reducing the resolution of the camera and choosing a fieldOfView that minimizes the number of activated sub-cameras helps a lot.

When the camera is spherical, the image returned by the wb_camera_get_image or the wb_camera_get_range_image functions is a 2-dimensional array (s,t) in spherical coordinates.

Let hFov be the horizontal field of view, and let theta be the angle in radians between the (0, 0, -z) relative coordinate and the relative coordinate of the target position along the xz plane relative to the camera. Then s=0 corresponds to a theta angle of -hFov/2, s=(width-1)/2 corresponds to a theta angle of 0, and s=width-1 corresponds to a theta angle of hFov/2.

Similarly, let vFov be the vertical field of view (defined just above), and let phi be the angle in radians between the (0, 0, -z) relative coordinate and the relative coordinate of the target position along the xy plane relative to the camera. Then t=0 corresponds to a phi angle of -vFov/2, t=(height-1)/2 corresponds to a phi angle of 0, and t=height-1 corresponds to a phi angle of vFov/2.
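The mapping above boils down to a linear interpolation between -hFov/2 and +hFov/2 (and likewise for the vertical axis). The following helper illustrates it; the function name is ours, not part of the Webots API:

```c
/* Map the horizontal pixel index s (0 .. width-1) of a spherical
 * camera image to its angle theta in radians: s = 0 maps to -hFov/2,
 * s = width-1 maps to +hFov/2. The same formula maps t, height and
 * vFov to the vertical angle phi. */
static double spherical_pixel_to_angle(int s, int width, double h_fov) {
  return -h_fov / 2.0 + (double)s * h_fov / (double)(width - 1);
}
```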

3.5.7 Camera Functions



NAME

   wb_camera_enable, wb_camera_disable, wb_camera_get_sampling_period - enable and disable camera updates

SYNOPSIS [C++] [Java] [Python] [Matlab]

  #include <webots/camera.h>

  void wb_camera_enable(WbDeviceTag tag, int ms);
  void wb_camera_disable(WbDeviceTag tag);
  int wb_camera_get_sampling_period(WbDeviceTag tag);

DESCRIPTION

The wb_camera_enable() function enables camera updates: a new image is grabbed every ms milliseconds.

wb_camera_disable() turns the camera off, saving computation time.

The wb_camera_get_sampling_period() function returns the period previously passed to the wb_camera_enable() function, or 0 if the device is disabled.



NAME

   wb_camera_get_fov, wb_camera_set_fov - get and set field of view for a camera

SYNOPSIS [C++] [Java] [Python] [Matlab]

  #include <webots/camera.h>

  double wb_camera_get_fov(WbDeviceTag tag);
  void wb_camera_set_fov(WbDeviceTag tag, double fov);

DESCRIPTION

These functions allow the controller to get and set the value for the field of view (fov) of a camera. The original value for this field of view is defined in the Camera node, as fieldOfView. Note that changing the field of view using wb_camera_set_fov() is possible only if the camera device has a CameraZoom node defined in its zoom field. The minimum and maximum values for the field of view are defined in this CameraZoom node.



NAME

   wb_camera_get_width, wb_camera_get_height - get the size of the camera image

SYNOPSIS [C++] [Java] [Python] [Matlab]

  #include <webots/camera.h>

  int wb_camera_get_width(WbDeviceTag tag);
  int wb_camera_get_height(WbDeviceTag tag);

DESCRIPTION

These functions return the width and height of a camera image as defined in the corresponding Camera node.



NAME

   wb_camera_get_near - get the near parameter of the camera device

SYNOPSIS [C++] [Java] [Python] [Matlab]

  #include <webots/camera.h>

  double wb_camera_get_near(WbDeviceTag tag);

DESCRIPTION

This function returns the near parameter of a camera device as defined in the corresponding Camera node.



NAME

   wb_camera_get_type - get the type of the camera

SYNOPSIS [C++] [Java] [Python] [Matlab]

  #include <webots/camera.h>

  int wb_camera_get_type(WbDeviceTag tag);

DESCRIPTION

This function returns the type of the camera as defined by the type field of the corresponding Camera node. The constants defined in camera.h are summarized in table 3.1:

Camera.type       return value
"color"           WB_CAMERA_COLOR
"range-finder"    WB_CAMERA_RANGE_FINDER
"both"            WB_CAMERA_BOTH

Table 3.1: Return values for the wb_camera_get_type() function

language: C++, Java, Python
In the object-oriented APIs, the WB_CAMERA_* constants are available as static integers of the Camera class (for example, Camera::COLOR).


NAME

   wb_camera_get_image, wb_camera_image_get_red, wb_camera_image_get_green, wb_camera_image_get_blue, wb_camera_image_get_grey - get the image data from a camera

SYNOPSIS [C++] [Java] [Python] [Matlab]

  #include <webots/camera.h>

  const unsigned char *wb_camera_get_image(WbDeviceTag tag);
  unsigned char wb_camera_image_get_red(const unsigned char *image, int width, int x, int y);
  unsigned char wb_camera_image_get_green(const unsigned char *image, int width, int x, int y);
  unsigned char wb_camera_image_get_blue(const unsigned char *image, int width, int x, int y);
  unsigned char wb_camera_image_get_grey(const unsigned char *image, int width, int x, int y);

DESCRIPTION

The wb_camera_get_image() function reads the last image grabbed by the camera. The image is coded as a sequence of four bytes per pixel, representing the blue, green, red and alpha levels (BGRA). Pixels are stored in horizontal lines ranging from the top left corner of the image down to the bottom right corner. The memory chunk returned by this function must not be freed, as it is managed by the camera itself. The size in bytes of this memory chunk can be computed as follows:

byte_size = camera_width * camera_height * 4

The internal pixel format of the buffer is BGRA (32 bits). Attempting to read outside the bounds of this chunk will cause an error.
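Given the BGRA layout described above, the accessor macros amount to simple index arithmetic. The following sketch reproduces the logic for illustration; the function names are ours, and real controllers should use the official wb_camera_image_get_* accessors since the internal format may change:

```c
/* Read one channel of pixel (x, y) from a BGRA image buffer:
 * 4 bytes per pixel, rows stored from top to bottom, pixels from
 * left to right. channel: 0 = blue, 1 = green, 2 = red, 3 = alpha. */
static unsigned char bgra_get_channel(const unsigned char *image,
                                      int width, int x, int y, int channel) {
  return image[4 * (y * width + x) + channel];
}

/* Equivalent of the wb_camera_image_get_red() accessor, assuming
 * the BGRA layout: the red byte is at offset 2 within each pixel. */
static unsigned char bgra_get_red(const unsigned char *image,
                                  int width, int x, int y) {
  return bgra_get_channel(image, width, x, y, 2);
}
```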

The wb_camera_image_get_red(), wb_camera_image_get_green() and wb_camera_image_get_blue() macros can be used for directly accessing the pixel RGB levels from the pixel coordinates. The wb_camera_image_get_grey() macro works in a similar way but returns the grey level of the specified pixel by averaging the three RGB components. In the C version, these four macros return an unsigned char in the range [0..255]. Here is a C usage example:

language: C

const unsigned char *image = wb_camera_get_image(camera);
int image_width = wb_camera_get_width(camera);
int image_height = wb_camera_get_height(camera);
for (int x = 0; x < image_width; x++)
  for (int y = 0; y < image_height; y++) {
    int r = wb_camera_image_get_red(image, image_width, x, y);
    int g = wb_camera_image_get_green(image, image_width, x, y);
    int b = wb_camera_image_get_blue(image, image_width, x, y);
    printf("red=%d, green=%d, blue=%d\n", r, g, b);
  }

language: Java
Camera.getImage() returns an array of int (int[]). The length of this array corresponds to the number of pixels in the image, that is the width multiplied by the height of the image. Each int element of the array represents one pixel coded in BGRA (32 bits). For example red is 0x0000ff00, green is 0x00ff0000, etc. The Camera.pixelGetRed(), Camera.pixelGetGreen() and Camera.pixelGetBlue() functions can be used to decode a pixel value for the red, green and blue components. The Camera.pixelGetGrey() function works in a similar way, but returns the grey level of the pixel by averaging the three RGB components. Each of these four functions takes an int pixel argument and returns an int color/grey component in the range [0..255]. Here is an example:

int[] image = camera.getImage();
for (int i=0; i < image.length; i++) {
  int pixel = image[i];
  int r = Camera.pixelGetRed(pixel);
  int g = Camera.pixelGetGreen(pixel);
  int b = Camera.pixelGetBlue(pixel);
  System.out.println("red=" + r + " green=" + g + " blue=" + b);
}

language: Python
getImage() returns a string, which is closely related to the const char * of the C API. The imageGet*-like functions can be used to get the channels of the camera image. Here is an example:

#...
cameraData = camera.getImage()

# get the grey component of the pixel (5,10)
grey = Camera.imageGetGrey(cameraData, camera.getWidth(), 5, 10)

Another way to use the camera in Python is to get the image with getImageArray(), which returns a list<list<list<int>>>. This three-dimensional list can be used directly to access the pixels. Here is an example:

image = camera.getImageArray()
# display the components of each pixel
for x in range(0,camera.getWidth()):
  for y in range(0,camera.getHeight()):
    red   = image[x][y][0]
    green = image[x][y][1]
    blue  = image[x][y][2]
    grey  = (red + green + blue) / 3
    print 'r='+str(red)+' g='+str(green)+' b='+str(blue)

language: Matlab
wb_camera_get_image() returns a 3-dimensional array of uint8. The first two dimensions of the array are the width and the height of the camera's image, the third being the RGB code: 1 for red, 2 for green and 3 for blue. wb_camera_get_range_image() returns a 2-dimensional array of float('single'). The dimensions of the array are the width and the height of the camera's image, and the float values are the metric distance values deduced from the OpenGL z-buffer.

camera = wb_robot_get_device('camera');
wb_camera_enable(camera,TIME_STEP);
half_width = floor(wb_camera_get_width(camera) / 2);
half_height = floor(wb_camera_get_height(camera) / 2);
% color camera image
image = wb_camera_get_image(camera);
red_middle_point = image(half_width,half_height,1); % red color component of the pixel lying in the middle of the image
green_middle_line = sum(image(half_width,:,2));% sum of the green color over the vertical middle line of the image
blue_overall = sum(sum(image(:,:,3))); % sum of the blue color over all the pixels in the image
fprintf('red_middle_point = %d, green_middle_line = %d, blue_overall = %d\n', red_middle_point, green_middle_line, blue_overall);
% range-finder camera image
image = wb_camera_get_range_image(camera);
imagesc(image,[0 1]);
colormap(gray);
drawnow;
distance = min(min(image)) % distance to the closest point seen by the camera



NAME

   wb_camera_get_range_image, wb_camera_range_image_get_depth, wb_camera_get_max_range - get the range image and range data from a range-finder camera

SYNOPSIS [C++] [Java] [Python] [Matlab]

  #include <webots/camera.h>

  const float *wb_camera_get_range_image(WbDeviceTag tag);
  float wb_camera_range_image_get_depth(const float *range_image, int width, int x, int y);
  double wb_camera_get_max_range(WbDeviceTag tag);

DESCRIPTION

The wb_camera_get_range_image() function allows the user to read the contents of the last range image grabbed by a range-finder camera. The range image is computed using the depth buffer produced by the OpenGL rendering. Each pixel corresponds to the distance, expressed in meters, from the object to the plane defined by the equation z = 0 within the coordinate system of the camera. The bounds of the range image are determined by the near clipping plane (defined by the near field) and the far clipping plane (see the maxRange field). The range image is coded as an array of single-precision floating point values, each corresponding to the range value of one pixel of the image. The precision of the range-finder values decreases as objects are located farther from the near clipping plane. Pixels are stored in scan lines running from left to right and from top to bottom. The memory chunk returned by this function shall not be freed, as it is managed internally by the camera. The size in bytes of the range image can be computed as follows:

size = camera_width * camera_height * sizeof(float)

Attempting to read outside the bounds of this memory chunk will cause an error.

The wb_camera_range_image_get_depth() macro is a convenient way to access a range value, directly from its pixel coordinates. The camera_width parameter can be obtained from the wb_camera_get_width() function. The x and y parameters are the coordinates of the pixel in the image.
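The index arithmetic behind this macro is a plain row-major lookup into the float buffer. The following sketch, with a function name of our own choosing, reproduces it for illustration:

```c
/* Depth of pixel (x, y) in a range image of the given width:
 * one float per pixel, scan lines from left to right and from
 * top to bottom, as described above. */
static float range_image_depth(const float *range_image,
                               int width, int x, int y) {
  return range_image[y * width + x];
}
```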

The wb_camera_get_max_range() function returns the value of the maxRange field.

language: Python
The Camera class has two methods for getting the range image: getRangeImage() returns a one-dimensional list of floats, while getRangeImageArray() returns a two-dimensional list of floats. Their contents are identical, but they are accessed differently.


NAME

   wb_camera_save_image - save a camera image in either PNG or JPEG format

SYNOPSIS [C++] [Java] [Python] [Matlab]

  #include <webots/camera.h>

  int wb_camera_save_image(WbDeviceTag tag, const char *filename, int quality);

DESCRIPTION

The wb_camera_save_image() function allows the user to save an image which was previously obtained with the wb_camera_get_image() function. The image is saved in a file in either PNG or JPEG format. The image format is specified by the filename parameter: if filename ends with .png, the image format is PNG; if filename ends with .jpg or .jpeg, the image format is JPEG. Other image formats are not supported. The quality parameter is used only for JPEG images: it defines the JPEG quality of the saved image and should be in the range 1 (worst quality) to 100 (best quality). Low quality JPEG files use less disk space. For PNG images, the quality parameter is ignored.

The return value of the wb_camera_save_image() is 0 in case of success. It is -1 in case of failure (unable to open the specified file or unrecognized image file extension).

release 7.0.2
Copyright © 2012 Cyberbotics Ltd. All rights reserved.