The “Camera” and Geometric Transforms


Introduction

This page describes conventions for specifying projection and view transforms in 3D graphics, especially when using my Geometry Utilities Library, and explains how a commonly used graphics pipeline transforms vertices to help it draw triangles, lines, and other graphics primitives.

Source code for the latest version of the library is available at the Geometry Utilities Library’s project page.

Overview of Transformations

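Many 3D rendering engines use the following transformations: a world transform from an object’s own coordinates to world space, a view transform from world space to eye space, and a projection transform from eye space to clip space. Each is usually given as a matrix.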

As explained later on this page, however, these transformations and matrices are merely for the convenience of the rendering engine; all the graphics pipeline expects is the clip space coordinates of the things it draws. The pipeline uses those coordinates and their transformed window coordinates when rendering things on the screen.
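To illustrate how such matrices can be chained, here is a minimal sketch that combines a world matrix and a view matrix into one matrix and applies it to a point. The helper names (`sketchTranslation`, `sketchMultiplyMat4`, `sketchApplyMat4`) are hypothetical and not part of the library, and the sketch assumes 16-element arrays in column-major order. A projection matrix would be multiplied in the same way.

```javascript
// Hypothetical helpers for illustration; not the library's API.
// Assumption: matrices are 16-element arrays in column-major order.

// Builds a matrix that moves points by (tx, ty, tz).
function sketchTranslation(tx, ty, tz) {
  return [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, tx, ty, tz, 1];
}

// Returns a * b, so that b is applied to a point first, then a.
function sketchMultiplyMat4(a, b) {
  const out = new Array(16).fill(0);
  for (let col = 0; col < 4; col++) {
    for (let row = 0; row < 4; row++) {
      for (let k = 0; k < 4; k++) {
        out[col * 4 + row] += a[k * 4 + row] * b[col * 4 + k];
      }
    }
  }
  return out;
}

// Multiplies a matrix by a 4-element column vector.
function sketchApplyMat4(m, v) {
  const out = [0, 0, 0, 0];
  for (let row = 0; row < 4; row++) {
    for (let col = 0; col < 4; col++) {
      out[row] += m[col * 4 + row] * v[col];
    }
  }
  return out;
}

// World matrix: place the object 2 units along X in world space.
const sketchWorld = sketchTranslation(2, 0, 0);
// View matrix: a "camera" at Z = 5, which shifts the scene by -5 along Z.
const sketchView = sketchTranslation(0, 0, -5);
// World is applied first, then view.
const sketchWorldView = sketchMultiplyMat4(sketchView, sketchWorld);
// The object-space point (1, 1, 1) ends up at (3, 1, -4) in eye space.
const sketchEyePoint = sketchApplyMat4(sketchWorldView, [1, 1, 1, 1]);
```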

Projection Transform

A projection transform (usually in the form of a projection matrix) transforms coordinates in eye space to clip space.

Two commonly used projections in 3D graphics are the perspective projection and orthographic projection, described below. (Other kinds of projections, such as oblique projections, isometric projections, and nonlinear projection functions, are not treated here.)

Perspective Projection

A perspective projection gives the 3D scene a sense of depth. In this projection, closer objects look bigger than more distant objects with the same size, making the projection similar to how our eyes see the world.

**Two rows of spheres, and a drawing of a perspective view volume.**

**Two rows of spheres, and a side drawing of a perspective view volume.**

The 3D scene is contained in a so-called view volume, and only objects contained in the view volume will be visible. The drawings above show the shape of a perspective view volume. Some of the spheres shown lie outside this view volume and would not be visible; the others would be.

The view volume is bounded by six clipping planes: the near, far, left, right, top, and bottom planes.

Note further that:

The perspective projection converts 3D coordinates to 4-element vectors in clip space. However, this is not the whole story, since in general, lines that are parallel in world space will not appear parallel in a perspective projection, so additional math is needed to achieve the perspective effect. This will be explained later.

The following methods define a perspective projection.

MathUtil.mat4perspective(fov, aspect, near, far)

This method returns a 4 × 4 matrix that adjusts the coordinate system for a perspective projection given a field of view and an aspect ratio.
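As a rough illustration of what such a matrix can look like, here is a minimal sketch of a perspective matrix in the style commonly used with OpenGL. The function names are hypothetical, and the conventions are assumptions rather than statements about the library: the field of view is vertical and in degrees, the “camera” looks down the negative Z axis, and the matrix is a 16-element array in column-major order.

```javascript
// Hypothetical sketch; not the library's implementation.
// Assumptions: vertical field of view in degrees, right-handed eye space,
// column-major 16-element result, clip space Z ranging from -W to W.
function sketchPerspective(fovYDegrees, aspect, near, far) {
  // Cotangent of half the field of view.
  const f = 1 / Math.tan((fovYDegrees * Math.PI) / 360);
  const rangeInv = 1 / (near - far);
  return [
    f / aspect, 0, 0, 0,
    0, f, 0, 0,
    0, 0, (near + far) * rangeInv, -1,
    0, 0, 2 * near * far * rangeInv, 0
  ];
}

// Multiplies a column-major 4x4 matrix by a 4-element column vector.
function sketchTransformVec4(m, v) {
  const out = [0, 0, 0, 0];
  for (let row = 0; row < 4; row++) {
    for (let col = 0; col < 4; col++) {
      out[row] += m[col * 4 + row] * v[col];
    }
  }
  return out;
}

const perspSketch = sketchPerspective(90, 1, 1, 100);
// A point 5 units in front of the "camera" in eye space.
const perspClip = sketchTransformVec4(perspSketch, [0, 0, -5, 1]);
// perspClip[3], the clip space W, is 5: the negative of the eye space Z.
```

Under these assumptions, points on the near plane end up with Z/W equal to -1 and points on the far plane with Z/W equal to 1.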

MathUtil.mat4frustum(left, right, bottom, top, near, far)

This method returns a 4 × 4 matrix that adjusts the coordinate system for a perspective projection based on the location of the six clipping planes that bound the view volume. Their positions are chosen so that the result is a perspective projection.
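For comparison, a frustum matrix in the common OpenGL style can be sketched as follows. The function name `sketchFrustum` is hypothetical, and the sketch assumes a column-major 16-element array, a “camera” looking down the negative Z axis, and left, right, bottom, and top measured on the near plane; the library may use different conventions.

```javascript
// Hypothetical sketch; not the library's implementation.
// Assumptions: column-major result; left, right, bottom, and top are
// the X and Y bounds of the view volume on the near clipping plane;
// near and far are positive distances from the "camera".
function sketchFrustum(left, right, bottom, top, near, far) {
  const rl = 1 / (right - left);
  const tb = 1 / (top - bottom);
  const fn = 1 / (far - near);
  return [
    2 * near * rl, 0, 0, 0,
    0, 2 * near * tb, 0, 0,
    (right + left) * rl, (top + bottom) * tb, -(far + near) * fn, -1,
    0, 0, -2 * far * near * fn, 0
  ];
}

// A symmetric view volume whose near plane spans -1..1 in X and Y at
// distance 1, which corresponds to a 90-degree field of view.
const frustumSketch = sketchFrustum(-1, 1, -1, 1, 1, 100);
```

With a symmetric volume like this one, the third column's X and Y entries are zero and the matrix reduces to the field-of-view form of a perspective matrix.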

Orthographic Projection

An orthographic projection is one in which the left and right clipping planes are parallel to each other, and the top and bottom clipping planes are parallel to each other. As a result, the near and far clipping planes have the same size, unlike in a perspective projection, and objects of the same size appear the same size regardless of their distance from the “camera”.

The following methods generate an orthographic projection.

MathUtil.mat4ortho(left, right, bottom, top, near, far)

This method returns a 4 × 4 matrix that adjusts the coordinate system for an orthographic projection.
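The following minimal sketch shows an orthographic matrix in the common OpenGL style and checks that two points at different depths land at the same X and Y. The function names are hypothetical, and the column-major layout and the negative-Z viewing direction are assumptions, not statements about the library.

```javascript
// Hypothetical sketch; not the library's implementation.
// Assumptions: column-major 16-element result, "camera" looking down
// the negative Z axis, near and far given as distances along that axis.
function sketchOrtho(left, right, bottom, top, near, far) {
  const rl = 1 / (right - left);
  const tb = 1 / (top - bottom);
  const fn = 1 / (far - near);
  return [
    2 * rl, 0, 0, 0,
    0, 2 * tb, 0, 0,
    0, 0, -2 * fn, 0,
    -(right + left) * rl, -(top + bottom) * tb, -(far + near) * fn, 1
  ];
}

// Multiplies a column-major 4x4 matrix by a 4-element column vector.
function sketchOrthoTransform(m, v) {
  const out = [0, 0, 0, 0];
  for (let row = 0; row < 4; row++) {
    for (let col = 0; col < 4; col++) {
      out[row] += m[col * 4 + row] * v[col];
    }
  }
  return out;
}

const orthoSketch = sketchOrtho(-2, 2, -1, 1, 1, 10);
// Two points with the same X and Y but different distances.
const orthoNearPoint = sketchOrthoTransform(orthoSketch, [1, 0.5, -3, 1]);
const orthoFarPoint = sketchOrthoTransform(orthoSketch, [1, 0.5, -9, 1]);
// Both have clip space X = 0.5, Y = 0.5, and W = 1; only Z differs.
```

Because W stays at 1, the later division by W changes nothing, which is why distance has no effect on apparent size in this projection.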

MathUtil.mat4ortho2d(left, right, bottom, top)

This method returns a 4 × 4 matrix that adjusts the coordinate system for a two-dimensional orthographic projection. This is a convenience method that is useful for showing a two-dimensional view. The mat4ortho2d method calls mat4ortho and sets near and far to -1 and 1, respectively. This choice of values makes Z coordinates at or near 0 especially appropriate for this projection.

MathUtil.mat4orthoAspect(left, right, bottom, top, near, far, aspect)

This method returns a 4 × 4 matrix that adjusts the coordinate system for an orthographic projection, such that the resulting view isn’t stretched or squished in case the view volume’s aspect ratio and the scene’s aspect ratio are different.
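One plausible way to avoid stretching is to enlarge the view volume along one axis until its aspect ratio matches the scene’s. The sketch below does this while keeping the volume centered. The function `sketchAspectBounds` is hypothetical, and the rule it uses is an assumption; the text above does not state exactly how mat4orthoAspect adjusts the bounds.

```javascript
// Hypothetical sketch; the library's actual adjustment rule may differ.
// Assumptions: left < right, bottom < top, and aspect = width / height
// of the area the scene is drawn in.
function sketchAspectBounds(left, right, bottom, top, aspect) {
  const width = right - left;
  const height = top - bottom;
  const boxAspect = width / height;
  if (aspect > boxAspect) {
    // Drawing area is relatively wider: extend the volume horizontally.
    const halfWidth = (height * aspect) / 2;
    const centerX = (left + right) / 2;
    return { left: centerX - halfWidth, right: centerX + halfWidth, bottom, top };
  }
  // Drawing area is relatively taller (or equal): extend vertically.
  const halfHeight = width / aspect / 2;
  const centerY = (bottom + top) / 2;
  return { left, right, bottom: centerY - halfHeight, top: centerY + halfHeight };
}

// A square view volume shown in an area twice as wide as it is tall.
const aspectBoundsWide = sketchAspectBounds(-1, 1, -1, 1, 2);
// The same volume shown in an area twice as tall as it is wide.
const aspectBoundsTall = sketchAspectBounds(-1, 1, -1, 1, 0.5);
```

The adjusted bounds could then be passed to an ordinary orthographic projection, so that everything in the original view volume stays visible.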

View Transform

The view matrix transforms world space coordinates, shared by every object in a scene, to coordinates in eye space (also called camera space or view space), in which the “camera” is located at the center of the coordinate system: (0, 0, 0). A view matrix essentially rotates the “camera” and moves it to a given position in world space. Specifically:

MathUtil.mat4lookat(eye, lookingAt, up)

This method generates a view matrix from the “camera”’s position (eye), the point it is looking at (lookingAt), and a vector indicating which direction is up (up).
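A view matrix of this kind can be sketched in the style of the classic “look at” construction. All function names below are hypothetical, and the sketch assumes a right-handed coordinate system, a column-major 16-element result, and points given as 3-element arrays; it is not taken from the library’s source.

```javascript
// Hypothetical sketch; not the library's implementation.
// Assumptions: right-handed coordinates, column-major result,
// and an up vector that is not parallel to the viewing direction.
function sketchNormalize(v) {
  const len = Math.hypot(v[0], v[1], v[2]);
  return [v[0] / len, v[1] / len, v[2] / len];
}
function sketchCross(a, b) {
  return [
    a[1] * b[2] - a[2] * b[1],
    a[2] * b[0] - a[0] * b[2],
    a[0] * b[1] - a[1] * b[0]
  ];
}
function sketchDot(a, b) {
  return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

function sketchLookAt(eye, lookingAt, up) {
  // Direction the "camera" faces.
  const forward = sketchNormalize([
    lookingAt[0] - eye[0], lookingAt[1] - eye[1], lookingAt[2] - eye[2]
  ]);
  // The "camera"'s right-hand direction and corrected up direction.
  const side = sketchNormalize(sketchCross(forward, up));
  const trueUp = sketchCross(side, forward);
  // Rotation in the upper-left 3x3 part, translation in the last column.
  return [
    side[0], trueUp[0], -forward[0], 0,
    side[1], trueUp[1], -forward[1], 0,
    side[2], trueUp[2], -forward[2], 0,
    -sketchDot(side, eye), -sketchDot(trueUp, eye), sketchDot(forward, eye), 1
  ];
}

// Multiplies a column-major 4x4 matrix by a 4-element column vector.
function sketchViewTransform(m, v) {
  const out = [0, 0, 0, 0];
  for (let row = 0; row < 4; row++) {
    for (let col = 0; col < 4; col++) {
      out[row] += m[col * 4 + row] * v[col];
    }
  }
  return out;
}

// "Camera" at (0, 0, 5), looking at the origin, with Y as up.
const lookAtSketch = sketchLookAt([0, 0, 5], [0, 0, 0], [0, 1, 0]);
const lookAtEye = sketchViewTransform(lookAtSketch, [0, 0, 5, 1]);
const lookAtTarget = sketchViewTransform(lookAtSketch, [0, 0, 0, 1]);
```

In this example the “camera” position maps to the origin of eye space, and the point being looked at maps to 5 units along the negative Z axis.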

Vertex Coordinates in the Graphics System

The concepts of eye space, camera space, and world space, as well as the use of matrices related to them, such as projection, view, model-view, and world matrices, are merely conventions, which exist for convenience in many 3D graphics libraries.

When a commonly used graphics pipeline (outside of the 3D graphics library concerned) draws a triangle, line, or point, all it really expects is the location of that primitive’s vertices in clip space. A so-called vertex shader communicates those locations to the graphics pipeline using the data accessible to it. Although the vertex shader can use projection, view, and world matrices to help the pipeline find a vertex’s clip space coordinates, it doesn’t have to, and can use a different paradigm for this purpose. For example, the vertex shader can be passed vertex coordinates that are already in clip space and just output those coordinates without transforming them.

As the name suggests, clip space coordinates are used for clipping primitives to the screen. Each clip space vertex is in homogeneous coordinates, consisting of an X, Y, Z, and W coordinate, where the X, Y, and Z are premultiplied by the W. The perspective matrix returned by MathUtil.mat4perspective, for example, sets the clip space W to the negative of the eye space Z coordinate, so that W increases with a point’s distance from the “eye” or “camera”.

To take perspective into account, the clip space X, Y, and Z coordinates are divided by the clip space W, and then converted to window coordinates, which roughly correspond to screen pixels. The window coordinates will have the same range as the current viewport. A viewport is a rectangle whose size and position are generally expressed in pixels.
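The two steps just described, division by W and conversion to window coordinates, can be sketched as follows. The function `sketchClipToWindow` and the viewport object’s shape are hypothetical, and the mapping assumes OpenGL-style conventions in which the divided coordinates range from -1 to 1 and depth is mapped to the range 0 to 1.

```javascript
// Hypothetical sketch of the fixed steps a pipeline performs after
// the vertex shader. Assumptions: OpenGL-style normalized coordinates
// in -1..1, Y pointing up, and a depth range of 0..1.
function sketchClipToWindow(clip, viewport) {
  // Perspective division: divide X, Y, and Z by W.
  const ndcX = clip[0] / clip[3];
  const ndcY = clip[1] / clip[3];
  const ndcZ = clip[2] / clip[3];
  // Viewport transform: map -1..1 to the viewport rectangle.
  return {
    x: viewport.x + ((ndcX + 1) / 2) * viewport.width,
    y: viewport.y + ((ndcY + 1) / 2) * viewport.height,
    depth: (ndcZ + 1) / 2
  };
}

// A clip space vertex with W = 5 and an 800 x 600 pixel viewport.
const windowSketch = sketchClipToWindow(
  [2.5, -2.5, 0, 5],
  { x: 0, y: 0, width: 800, height: 600 }
);
// windowSketch is { x: 600, y: 150, depth: 0.5 }.
```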

For the perspective matrix returned by mat4perspective, dividing the X, Y, and Z coordinates by the clip space W has the effect that as W increases (that is, as a point gets farther from the “eye” or “camera”), the X and Y coordinates are brought closer and closer to the center of the view. This is the perspective effect mentioned earlier: objects appear smaller the more distant they are from the “camera”.
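A small numeric sketch makes the effect concrete. The function `sketchNdcX` is hypothetical and assumes a perspective matrix whose X scale is `focal / aspect` and whose clip space W is the negative of the eye space Z, as in the common OpenGL-style matrix.

```javascript
// Hypothetical sketch; assumes clip X = (focal / aspect) * eye X
// and clip W = -(eye Z), as in a common OpenGL-style perspective matrix.
function sketchNdcX(eyeX, eyeZ, focal, aspect) {
  const clipX = (focal / aspect) * eyeX;
  const clipW = -eyeZ;
  return clipX / clipW; // X after perspective division
}

// The same X offset of 1 unit, seen at distances of 2, 4, and 8 units.
const sketchDistances = [2, 4, 8];
const sketchDividedXs = sketchDistances.map((d) => sketchNdcX(1, -d, 1, 1));
// sketchDividedXs is [0.5, 0.25, 0.125]: each doubling of the distance
// halves the point's offset from the center of the view.
```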

Other Pages

The following pages of mine on CodeProject also discuss the Geometry Utilities Library, formerly the Public-Domain HTML 3D Library:
