Mobile graphics exam, 2007-10-16, 8-13
1a: With a GPU it is easier to exploit parallel processing,
achieve better image quality, and obtain better energy efficiency.
This is because the GPU is a streaming processor.
Also, the CPU is not needed for graphics, and hence it
can be used for other things.
1b: M3G is an open API, open for everyone, and for every platform.
If you want to reach many customers, this is the way to go.
M3G is targeted towards higher-quality rendering with hardware.
M3G has a full scene-graph which makes it easier to do hierarchical
animation, and to maintain larger worlds.
1c:
Today, the angle-per-pixel is smaller for a mobile
device compared to a desktop display, and hence the
image quality should be higher on a desktop. The
display resolution has increased rapidly for mobiles,
but the physical size remains about the same.
1d:
When texture minification occurs, you access texels with
large distances in between, and hence texture caching is of
little use. Mipmapping, on the other hand, accesses one or
more levels in the mipmap hierarchy, and these levels are
selected such that you visit neighboring texels when walking
to the next pixel on the triangle.
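As a sketch (not part of the exam answer), the level selection can be illustrated as follows: pick the mip level whose texels are roughly the size of the pixel's texel footprint, so stepping to the next pixel visits a neighboring texel. The function and its parameters are illustrative assumptions.

```python
import math

def mip_level(footprint: float, num_levels: int) -> int:
    """Select a mipmap level from the texel footprint of one pixel.

    footprint: approximate number of texels covered per pixel step
    (>= 1 means minification). Level 0 is the full-resolution map.
    """
    if footprint <= 1.0:
        return 0  # magnification: use the base level
    # Each level up halves the resolution, so texels grow by 2x per level.
    level = int(math.floor(math.log2(footprint)))
    return min(level, num_levels - 1)
```

For example, a footprint of 8 texels per pixel selects level 3, where texels are 2^3 times larger, restoring cache-friendly neighboring accesses.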
2a: 1) support for both float and fixed-point data types,
2) eliminate redundant API calls, and eliminate API calls that
are seldom used,
3) a compact, efficient API: full OpenGL is too big, while
OpenGL ES fits in <50 kB!
2b:
earth.addChild(moon);
sun.addChild(earth);
myWorld.addChild(sun);
2c:
First strip: 5 3 4 1 2 0, i.e., start with vertex f, then d, e, etc.
Second strip: 11 10 7 9 8 8 7 6. Notice that
when we use the second "8", we introduce a triangle with
zero area, but it makes it possible to avoid restarting with
a new strip.
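The strip decoding can be sketched in Python (an illustration, not part of the answer): each window of three consecutive indices forms a triangle, and a repeated index produces a degenerate (zero-area) triangle that a rasterizer can reject cheaply.

```python
def strip_to_triangles(indices):
    """Decode a triangle strip into triangles; every second triangle
    swaps two indices to keep a consistent winding order."""
    tris = []
    for i in range(len(indices) - 2):
        a, b, c = indices[i], indices[i + 1], indices[i + 2]
        if i % 2 == 1:
            a, b = b, a
        tris.append((a, b, c))
    return tris

def degenerate(tri):
    # A repeated index means the triangle has zero area.
    return len(set(tri)) < 3

strip1 = [5, 3, 4, 1, 2, 0]
strip2 = [11, 10, 7, 9, 8, 8, 7, 6]
```

Decoding strip2 shows the degenerate triangles introduced by the doubled "8"; they cost almost nothing but keep both halves of the mesh in one strip.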
3a: A mobile device is powered by batteries, and memory accesses
are the operations that use the most energy. Heat must dissipate
into the air; fans cannot be used -- there is no room, and they
are inconvenient for the user.
3b: Texture compression is done on read-only data (images),
and it does not matter much if it takes a few minutes to
compress an image. Buffer compression operates on
the color buffer, depth buffer, or stencil buffer, and
these are read/write buffers.
Texture compression is almost always lossy. This does not
work for buffers without major modifications, and hence
if compression fails, we need a fallback so that the data
can be sent uncompressed. Whether a tile is compressed or
not is indicated by a tile table, where a few bits indicate
how the tile is compressed, whether it is cleared, or
whether it is uncompressed.
Buffer compression does not save memory, only bandwidth.
Texture compression saves both.
3c: You need a mechanism that tracks the accumulated amount of
error in a tile. This could be a four-bit value, for example,
indicating the amount of error so far. When a threshold has been
reached, you fall back to lossless (non-lossy) compression.
3d: Assume 4 bytes each for color, depth, and texel accesses.
d = 6 gives o = 2.5 (approximately).
B = d*Z_r + o*(Z_w + C_w + 2*m*4*T_r)
cost(Z) = d + o = 6 + 2.5 = 8.5
cost(C) = o = 2.5
cost(T) = o * 4 * 2 * m = 2.5 * 4 * 2 * 0.25 = 20 * 0.25 = 5 (with m = 0.25)
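As a check of the arithmetic, the cost terms can be evaluated directly (a sketch; the miss rate m = 0.25 is taken from the final line above, and the factors follow the B formula):

```python
# Per-pixel bandwidth terms from the answer above (a sketch):
# d = depth complexity, o = overdraw, m = texture cache miss rate.
d = 6.0      # depth buffer reads per pixel
o = 2.5      # fragments that pass the depth test (overdraw)
m = 0.25     # texture cache miss rate (taken as 0.25 here)

cost_Z = d + o            # depth reads + depth writes
cost_C = o                # color writes
cost_T = o * 4 * 2 * m    # per the 2*m*4*T_r term in the B formula
print(cost_Z, cost_C, cost_T)  # 8.5 2.5 5.0
```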
4a: Ideally, we would like to have an analytical solution,
where we compute the areas covered by the different triangles,
and which color each area has. Since this includes texturing,
and since many triangles can overlap a pixel, this becomes
a very difficult problem.
4b: texture aliasing, screen-space (edge) aliasing, shader aliasing, and temporal (time) aliasing.
4c: place the sample points at different vertical locations,
preferably with the same vertical distance between them.
4d: See lecture notes.
5a: Two points: (ax,ay) to (bx,by). The line equation
has the form c*X + d*Y + e = 0 (avoiding a and b as parameter
names, since the points are called a and b).
The line can be written in parametric form as:
X(t) = ax + (bx-ax)*t ,
Y(t) = ay + (by-ay)*t , where t is in [0,1].
Rewrite as:
(X - ax)/(bx-ax) = t
(Y - ay)/(by-ay) = t
Setting them equal gives:
(X-ax)*(by-ay) - (Y-ay)*(bx-ax) = 0
This gives c, d, and e.
Another way of doing it:
All points that lie exactly on the line must
fulfil c*X + d*Y + e = 0. This also means that
if you compute the direction from the starting point
(ax,ay) to a point (X,Y) on the line, you get
a direction: (X,Y) - (ax,ay).
This direction must be perpendicular to the normal
of the line.
The normal is simply (bx,by) - (ax,ay) = (bx-ax, by-ay)
rotated 90 degrees, which gives us the normal as:
N = (-(by-ay), bx-ax)
So, the line equation then becomes:
N . ((X,Y) - (ax,ay)) = 0
This again gives c, d, and e.
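The second derivation translates directly into code (a sketch with made-up example points): compute N, expand the dot product to get c, d, e, and verify that both endpoints satisfy the equation.

```python
def line_coefficients(ax, ay, bx, by):
    """Implicit line c*X + d*Y + e = 0 through (ax,ay) and (bx,by).

    The normal N = (-(by-ay), bx-ax) is the edge direction rotated
    90 degrees; e follows from requiring (ax,ay) to lie on the line.
    """
    c = -(by - ay)
    d = bx - ax
    e = -(c * ax + d * ay)
    return c, d, e

def edge_eval(c, d, e, x, y):
    # Zero on the line; the sign tells which side (x,y) is on.
    return c * x + d * y + e

# Example points (arbitrary choice for illustration).
c, d, e = line_coefficients(1.0, 1.0, 4.0, 3.0)
```

This sign property is exactly what edge functions in triangle rasterization use.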
5b: A point in a triangle divides the triangle into
three areas, A0,A1,A2, with the entire triangle area A=A0+A1+A2
The barycentric coordinates are: (A1,A2,A0)/A
See slide 36 and 37 in Lecture L4.pdf.
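The area construction can be sketched as follows (an illustration with an example triangle; the coordinate ordering depends on which sub-area you pair with which vertex): each barycentric coordinate is the area of the sub-triangle opposite a vertex, divided by the full area.

```python
def signed_area2(p, q, r):
    # Twice the signed area of triangle pqr (the factor 2 cancels out).
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

def barycentric(p, v0, v1, v2):
    """Barycentric coordinates of p in triangle (v0, v1, v2),
    as sub-triangle areas divided by the full triangle area."""
    area = signed_area2(v0, v1, v2)
    b0 = signed_area2(p, v1, v2) / area   # sub-area opposite v0
    b1 = signed_area2(v0, p, v2) / area   # sub-area opposite v1
    b2 = signed_area2(v0, v1, p) / area   # sub-area opposite v2
    return b0, b1, b2

v0, v1, v2 = (0.0, 0.0), (4.0, 0.0), (0.0, 4.0)
b = barycentric((1.0, 1.0), v0, v1, v2)
```

The coordinates sum to 1 and reconstruct the point as b0*v0 + b1*v1 + b2*v2, which is why they are used for interpolating vertex attributes.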
5c: See slide 18, L4.pdf.
6a: The delay stream implements a form of depth peeling,
where one layer at a time (in sorted order) is handled,
back to front. It is a multi-pass algorithm, where you store
all visible transparent fragments, and then in each
pass you find the most distant surface for each pixel.
After the delay, you blend to the frame buffer.
If a triangle has unprocessed pixels, it is
reinserted into the delay stream.
From Aila et al's paper:
The depth peeling procedure using one frame buffer and one depth
buffer proceeds as follows:
1. Scan the OIT stream and mark all hidden pixels as processed.
Remove fully processed triangles from the stream.
2. At this point, the remaining surfaces are visible. Peel (a-b)
until the OIT stream is empty:
(a) Clear the depth buffer, scan the OIT stream and store
the most distant depth value for each pixel.
(b) Scan the OIT stream again and for each pixel blend the
surface with a depth value equaling the stored depth
value. Mark processed pixels and remove fully processed
triangles.
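The per-pixel idea behind steps (a) and (b) can be sketched as follows (a simplification with scalar grayscale colors and a per-pixel fragment list; the real delay stream operates on triangle streams, not lists):

```python
def depth_peel_blend(fragments, background):
    """Blend transparent fragments of one pixel back to front by
    repeatedly peeling off the most distant unprocessed fragment.

    fragments: list of (depth, color, alpha); larger depth = farther.
    """
    remaining = list(fragments)
    color = background
    while remaining:
        # Step (a): find the most distant depth among unprocessed fragments.
        far = max(f[0] for f in remaining)
        # Step (b): blend fragments at that depth, then mark them processed.
        for f in [f for f in remaining if f[0] == far]:
            _, c, a = f
            color = a * c + (1.0 - a) * color  # over-blend, back to front
            remaining.remove(f)
    return color
```

Each iteration of the loop corresponds to one peeling pass over the OIT stream; the loop ends when every fragment has been blended.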
6b:
(a*b) uses 2+3=5 fractional bits.
(c*d) uses 6+2=8 fractional bits.
To add these terms, we need to shift a*b so that it
also has 8 fractional bits:
((a*b)<<3) + (c*d)
--> 8 fractional bits in total.
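A worked example (the numeric values are chosen here for illustration): a value with f fractional bits is stored as value * 2^f, products carry the sum of the operands' fractional bits, and the shift aligns the binary points before the add.

```python
def to_fixed(value, frac_bits):
    """Store a real value in fixed point with the given fractional bits."""
    return round(value * (1 << frac_bits))

a = to_fixed(1.5, 2)    # a has 2 fractional bits
b = to_fixed(0.75, 3)   # b has 3 fractional bits
c = to_fixed(2.5, 6)    # c has 6 fractional bits
d = to_fixed(1.25, 2)   # d has 2 fractional bits

ab = a * b              # 2 + 3 = 5 fractional bits
cd = c * d              # 6 + 2 = 8 fractional bits
s = (ab << 3) + cd      # shift a*b up to 8 fractional bits, then add
result = s / (1 << 8)   # interpret: should equal 1.5*0.75 + 2.5*1.25
```

All values here are exactly representable, so the fixed-point result matches the floating-point product sum exactly.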
6c: You may have a lenticular display, where a film of half-cylinders
has been "glued" onto an LCD display, with several pixels under each
cylinder. With two pixels under a cylinder, you get a stereo effect,
but you can have more views than that to provide motion parallax.
With n views, rendering typically costs n times as much.
However, with clever algorithms this can be reduced a lot, because
the images are very similar.