Render subsystem

From NeoWiki

The core of the rendering subsystem is the render device. The device is implemented on top of the underlying OS/platform-specific API and provides a unified interface for managing rendering resources and issuing rendering operations. Most render resources are allocated through a render device and are only usable for rendering operations on that specific device.

All rendering operations use vertex and index buffers for describing the geometry. The engine utilizes a flexible vertex format that can use any combination of vertex attributes of any of the types supported. Vertex buffers are created with a vertex declaration describing the attributes used and their layout. Index buffers can be either 16-bit or 32-bit (by default 16-bit for speed and lower memory footprint).

The render subsystem provides the low level functionality to render 3D graphics to the frame buffer (or other render targets). Other higher level subsystems provide abstractions for scene management and 2D GUI overlays. You usually don't have to directly use the low level rendering resources.

For a more detailed description of the methods and classes, see the API documentation. All links to sections in the API documentation are external links in bold style.

Introduction

Pipeline

The rendering pipeline in the engine is built from three layers:

  • Shaders
  • Programs
  • Techniques

Each layer defines the properties of that level of the pipeline for all pipeline targets. A pipeline target identifies the API, shader dialect and hardware properties. When a pipeline is compiled at runtime, the target matching the active backend and hardware is selected. Pipeline targets are identified either by group identifiers such as FIXED (fixed function pipeline), DX9SHADER (DirectX vs/ps shaders generation 1 to 3), GLPROGRAM (OpenGL programmable pipeline up to ARB vertex/fragment programs) and GLSL (OpenGL Shading Language), or by individual shader targets such as FIXED_PS, DX9_VS_1_1 and GL_ARBVP_1. Splitting the pipeline into these layers allows resources to be shared, which minimizes the number of shaders, programs and techniques needed to express all the combinations used in a game.

To further generalize the pipeline and make it more flexible, parameters define the input properties that higher level objects can use to modify the pipeline. Examples of parameters are the textures bound to samplers, transformation matrices and other generic floating-point data.

Shaders

Shaders are the building blocks of the rendering pipeline. A shader represents the raw instructions on how a part of the pipeline is executed. For example, a vertex shader describes how the vertices building up the rendering primitives are transformed, and a pixel shader describes how each fragment (pixel) is calculated. Programmable pipeline shaders define the assembly or high-level code to be executed in the two pipeline stages. The fixed function pipeline is also abstracted into these two parts, where fixed function "shaders" describe the state setup for the transformation and texturing/rasterization stages.

Programs

Programs combine a number of shaders of different types to represent a complete pipeline. In its most basic form, a program links together a vertex shader and a pixel shader to describe both transformation and rasterization. Some program targets like GLSL can combine multiple shaders of each type, each one a building block of that part of the pipeline, and link them together into a complete pipeline.

Techniques

Techniques define input properties to the programs and shaders as well as rasterization states, for example filtering and wrapping modes for the samplers, blending modes for the output of the pipeline, and the frame buffer and z buffer modes.

Parameters

Parameters are declared and named by the shaders. Each shader declares the parameters available in that specific implementation of that part of the pipeline. For example, a vertex shader doing a simple perspective projection of the vertex positions will declare a 4x4 matrix parameter that holds the transformation matrix to use. A pixel shader doing a simple diffuse texture mapping with a 2D texture sampler would define a sampler parameter that holds a reference to the texture to use.

If the shaders are the code of the rendering pipeline, the parameters and their bindings are the data of the pipeline.

Parameters can have explicit bindings, such as a 4x4 matrix of floating point values or a reference to a texture object. Parameters can also have implicit bindings to a bind path that denotes an engine state or other external value. An example of an implicit binding is a transformation matrix bound to the path device.matrix.mvp, which denotes the concatenated model, view and projection matrices of the rendering device.
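
The difference between explicit and implicit bindings can be sketched as follows. This is a minimal illustration and not the engine's API: the Parameter, BindContext and resolve names are hypothetical, the rule that an explicit value takes precedence is an assumption, and only the bind path device.matrix.mvp is taken from the text above.

```cpp
#include <array>
#include <map>
#include <optional>
#include <string>

// Hypothetical types for illustration only; not the engine's actual classes.
using Matrix4 = std::array<float, 16>;

struct Parameter {
    std::string name;
    std::optional<Matrix4> explicitValue; // explicit binding: the value itself
    std::string bindPath;                 // implicit binding: path to engine state
};

// Stand-in for engine state published under bind paths.
using BindContext = std::map<std::string, Matrix4>;

// Resolve a parameter. Assumption: an explicit value takes precedence,
// otherwise the bind path is looked up. Returns std::nullopt when unbound.
std::optional<Matrix4> resolve(const Parameter& p, const BindContext& ctx) {
    if (p.explicitValue) return p.explicitValue;
    auto it = ctx.find(p.bindPath);
    if (it != ctx.end()) return it->second;
    return std::nullopt;
}
```

In this sketch a parameter bound to device.matrix.mvp picks up whatever matrix the context currently holds for that path, so the object using the pipeline never has to set it by hand.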

Resources

Rendering resources are vertex and index buffers, textures, shaders, programs and techniques. The resources must be allocated through the render device they are used with in order to allocate the raw backend resources needed. You must never use a resource created on one render device with another render device, nor use a resource created manually (like a vertex buffer allocated with new and not by the device) for rendering operations.

The engine utilizes a flexible vertex format that can use any combination of vertex attributes of any of the types supported. Vertex buffers are created with a vertex declaration describing the attributes used and their layout. Index buffers can be either 16-bit or 32-bit (by default 16-bit for speed and lower memory footprint).

Textures are usually loaded from an image file but can also be created explicitly and have code-generated data. Textures support videos and other dynamic data that changes over time through the image class feeding the data to the texture. A texture with an image data source tagged as dynamic will have its backend data updated automatically by the engine when needed. Textures can also be created by rendering operations (render-to-texture), and used in the vertex pipeline through vertex sampler parameters.

Device

The render device is the main hub responsible for allocating render resources and processing rendering operations. The engine provides implementations of the render device for DirectX 8, DirectX 9 and OpenGL. Resources allocated through one device cannot be used on another device.

Rendering operations are batched and executed by the device during a call to the flush method. The rendering operations are sorted to minimize hardware state changes and are not guaranteed to be executed in the order submitted to the device. If you need a specific execution order, you can supply an explicit sorting method that overrides the internal sorting (used for rendering 2D GUI overlays with Z ordering, for example).
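
The effect of the default state sorting and of an explicit sorting method can be sketched as below. The RenderOp structure, its fields and the comparison functions are hypothetical stand-ins, not the device's real interface; the sketch only shows how a comparator decides the execution order of a batch.

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical render operation; the fields are assumptions for illustration.
struct RenderOp {
    std::uint32_t stateKey; // identifies the hardware state the op needs (assumed)
    int zOrder;             // used only by the explicit overlay-style sort
    int submitIndex;        // order in which the op was submitted
};

using OpCompare = std::function<bool(const RenderOp&, const RenderOp&)>;

// Default: group ops with equal state together to reduce state changes.
inline bool byState(const RenderOp& a, const RenderOp& b) {
    return a.stateKey < b.stateKey;
}

// Explicit alternative: ascending Z order, as a GUI overlay might require.
inline bool byZOrder(const RenderOp& a, const RenderOp& b) {
    return a.zOrder < b.zOrder;
}

// Sort the queued ops before execution. stable_sort keeps submission order
// among ops that compare equal.
void sortForFlush(std::vector<RenderOp>& ops, const OpCompare& cmp = byState) {
    std::stable_sort(ops.begin(), ops.end(), cmp);
}
```

With the default comparator, operations sharing a state key end up adjacent regardless of submission order; passing byZOrder instead gives a deterministic order at the cost of more state changes.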

Resources

All resources in the rendering subsystem are derived from the Resource base class (which is in turn derived from the base resource class in the core subsystem). All resources are allocated through the render device and can only be used for rendering operations on the device which created them.

Render resources can have their backend data released and restored in certain situations, such as when the render device is recreated (mode and/or fullscreen switching for example) or when other resources need the backend data. Each resource must either be able to upload the original data again or have an external controller re-upload the data from another source. The isLost method is used to query the state of the resource. If the call returns true, the backend data has been released and the resource must be restored before it is used in rendering operations again.
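
A check-and-restore pattern around isLost might look like the sketch below. Only the isLost name comes from the text; the SketchResource class, its release and restore methods and the ensureUsable helper are hypothetical, and the sketch assumes the resource keeps its original data so it can re-upload it on its own.

```cpp
#include <utility>
#include <vector>

// Hypothetical stand-in for a render resource; only isLost is named in the text.
class SketchResource {
public:
    explicit SketchResource(std::vector<unsigned char> source)
        : source_(std::move(source)), backend_(source_), lost_(false) {}

    bool isLost() const { return lost_; }

    // Simulates the device releasing backend data (e.g. on a mode switch).
    void release() { backend_.clear(); lost_ = true; }

    // Re-uploads from the retained original data (assumed restore strategy).
    void restore() { backend_ = source_; lost_ = false; }

    const std::vector<unsigned char>& backend() const { return backend_; }

private:
    std::vector<unsigned char> source_;  // original data kept in system RAM
    std::vector<unsigned char> backend_; // stands in for the backend copy
    bool lost_;
};

// Check before use: restore a lost resource, then report whether it is usable.
bool ensureUsable(SketchResource& r) {
    if (r.isLost()) r.restore();
    return !r.isLost();
}
```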

Buffers

Buffers are used to store vertex and index data. All render geometry is described using vertex/index buffers; the engine does not have any methods to submit direct rendering commands for single primitives. The base functionality is implemented in the Buffer class.

How the buffer data is stored depends on use flags describing how the data will generally be used. As a rule, buffers have a mirror in system RAM used for read/write access, and a backend buffer residing in VRAM used when rendering. When writing to a buffer the system RAM mirror is used, and when the buffer is unlocked the data is uploaded in one operation to the VRAM backend buffer. For more information see the API documentation for the Buffer class.

To read from or write to a buffer you must lock it with a call to lock, passing in the lock type, which is a combination of flags from the Lock enum. A write lock implicitly allows read access, but not the other way around. It is an error to call lock on an already locked buffer. You must match each call to lock with a call to unlock. To help minimize the pitfalls of buffer locking you can use the BufferReadLock object, which automatically detects locked buffers and manages automatic unlocking.
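
The locking rules above lend themselves to a scoped guard. The following is a minimal sketch of that idea, not the implementation of BufferReadLock or Buffer: SketchBuffer, ScopedLock and the LOCK_READ/LOCK_WRITE flags are hypothetical names, and throwing on a double lock is an assumption about how the error is reported.

```cpp
#include <stdexcept>

// Hypothetical lock flags; the real Lock enum may differ.
enum LockFlags { LOCK_READ = 1, LOCK_WRITE = 2 };

// Hypothetical minimal buffer that only tracks its lock state.
class SketchBuffer {
public:
    void lock(int flags) {
        if (locked_) throw std::logic_error("buffer already locked");
        locked_ = true;
        flags_ = flags;
    }
    void unlock() { locked_ = false; flags_ = 0; }
    bool isLocked() const { return locked_; }
    // A write lock implicitly grants read access, but not the reverse.
    bool canRead() const { return locked_ && (flags_ & (LOCK_READ | LOCK_WRITE)) != 0; }
    bool canWrite() const { return locked_ && (flags_ & LOCK_WRITE) != 0; }
private:
    bool locked_ = false;
    int flags_ = 0;
};

// Scoped guard: locks in the constructor and unlocks in the destructor, so
// each lock is matched by one unlock even on early return or exception.
class ScopedLock {
public:
    ScopedLock(SketchBuffer& b, int flags) : buffer_(b) { buffer_.lock(flags); }
    ~ScopedLock() { buffer_.unlock(); }
    ScopedLock(const ScopedLock&) = delete;
    ScopedLock& operator=(const ScopedLock&) = delete;
private:
    SketchBuffer& buffer_;
};
```

Tying the unlock to the guard's lifetime removes the most common mistake, a code path that returns without unlocking.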

Buffers that have been written to are marked as dirty. Before such a buffer can be used in rendering operations the backend buffer must be updated with the new data. Uploading data can be done either when the buffer is unlocked, when the buffer is used in a rendering primitive call, or during the pipeline flush in the backend. By default buffers are uploaded when used in rendering primitives. You can change the system-wide default by calling setUploadPolicy, or override it for a single lock by passing the appropriate flag to the lock method.
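
The three possible upload points can be modelled with a small state machine. This sketch is an assumption about the behaviour described above, not the engine's code: the UploadPolicy enumerators and the DirtyBuffer class are hypothetical, and the policy is stored per buffer here purely to keep the example self-contained.

```cpp
// Hypothetical upload policy; the enumerator names are assumptions.
enum class UploadPolicy { OnUnlock, OnUse, OnFlush };

class DirtyBuffer {
public:
    // Defaults to uploading when used in a rendering primitive, as in the text.
    explicit DirtyBuffer(UploadPolicy p = UploadPolicy::OnUse) : policy_(p) {}

    void write() { dirty_ = true; } // data changed in the system RAM mirror
    void unlock() { maybeUpload(UploadPolicy::OnUnlock); }
    void useInPrimitive() { maybeUpload(UploadPolicy::OnUse); }
    void flush() { maybeUpload(UploadPolicy::OnFlush); }

    bool isDirty() const { return dirty_; }
    int uploads() const { return uploads_; }

private:
    // Upload only if the buffer is dirty and this is the configured stage.
    void maybeUpload(UploadPolicy stage) {
        if (dirty_ && stage == policy_) {
            ++uploads_;
            dirty_ = false;
        }
    }
    UploadPolicy policy_;
    bool dirty_ = false;
    int uploads_ = 0;
};
```

Deferring the upload until use means a buffer written several times before it is drawn is only uploaded once.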

Buffers can have allocated memory for more primitives than is currently used. This is useful for dynamically generated geometry which changes the number of primitives on a per-frame basis. You can then allocate storage for the maximum number of primitives you need and use setNumElements to set the current number of used elements.

Vertex Buffers

Vertex buffers store vertex data in a buffer. The vertex format is flexible and controlled by a VertexDecl declaration object passed when creating the vertex buffer. To create a vertex buffer you call createVertexBuffer on the render device, passing in the buffer type, number of elements and vertex declaration object. You can also optionally pass in a pointer to a memory buffer holding initial data for all the elements in the buffer. If you allocate a NOREADWRITE buffer you must pass in a data pointer.

Access to vertex elements is done through the getElement methods after securing a read/write lock on the buffer. You can iterate elements by typecasting the returned void* to a pointer to a type of the correct size, or use getElementSize to get the size of an element and treat the pointer as a raw byte pointer.
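
Treating the element pointer as a raw byte pointer amounts to stepping through memory one element size at a time. The sketch below shows the idea on a plain array; SketchVertex and collectX are hypothetical, and the data pointer and element size stand in for what getElement and getElementSize would return.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical vertex layout: position at byte offset 0, texture coordinate after it.
struct SketchVertex {
    float pos[3];
    float uv[2];
};

// Reads the x component of every position by stepping a raw byte pointer
// forward by elementSize for each element.
std::vector<float> collectX(const void* data, std::size_t count, std::size_t elementSize) {
    std::vector<float> xs;
    const unsigned char* bytes = static_cast<const unsigned char*>(data);
    for (std::size_t i = 0; i < count; ++i) {
        float x;
        std::memcpy(&x, bytes + i * elementSize, sizeof x); // position starts at offset 0
        xs.push_back(x);
    }
    return xs;
}
```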

Building a vertex declaration is done by creating a VertexDecl object and then declaring the number of elements with a call to setNumElements or by passing it to the constructor. Use getElement to access each VertexElement and define its attributes. Elements must appear in the declaration in offset order, i.e. an element with a higher index cannot have a lower byte offset than an element with a lower index. Also, elements must not overlap, i.e. an element with offset 0 and size 4 requires that no other element has an offset of 0 to 3. For example, defining a vertex format with a 3-dimension vector as position and a 2-dimension texture coordinate layer is done with the following code:

VertexDecl decl( 2 ); //2 elements in declaration
decl.getElement( 0 )->set( VertexElement::POSITION, VertexElement::FLOAT3, 0 ); //Position data, 3-dimension float vector, byte offset 0
decl.getElement( 1 )->set( VertexElement::TEXCOORD0, VertexElement::FLOAT2, 12 ); //Texture coordinate data, 2-dimension float vector, byte offset 12

You can then define a type matching this declaration and typecast element pointers from buffers created with this declaration to pointers of this type:

class Vertex { public: neo::math::Vector3 pos; float uv[2]; };
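
The ordering and overlap rules for declaration elements can be checked mechanically. The following sketch is not part of the engine: ElementLayout and isValidLayout are hypothetical, and element sizes are given in bytes directly (12 for a 3-float vector, 8 for a 2-float vector) instead of being derived from VertexElement types.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical element description; the engine's VertexElement holds more.
struct ElementLayout {
    std::size_t offset; // byte offset within the vertex
    std::size_t size;   // size in bytes
};

// Returns true when elements are in ascending offset order and none overlap.
bool isValidLayout(const std::vector<ElementLayout>& elements) {
    std::size_t end = 0; // first byte not covered by the previous elements
    for (const ElementLayout& e : elements) {
        if (e.offset < end) return false; // out of order or overlapping
        end = e.offset + e.size;
    }
    return true;
}
```

For the declaration above, the layout {0, 12}, {12, 8} passes, while moving the texture coordinate to offset 8 would overlap the position and fail.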

Index Buffers

Index buffers define rendering primitives as indices into a vertex buffer. For example, triangle primitives use three indices per triangle, defining the vertices of the triangle. While the engine supports both 16-bit and 32-bit index buffers, the default is 16-bit to conserve memory and speed up rendering operations. If you find yourself needing 32-bit index buffers, you should consider segmenting your geometry and splitting it into two or more separate vertex buffers, each with 65535 or fewer vertices. This allows you to index the vertex data using 16-bit index buffers and improves the performance of your game.
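
A quick way to decide whether geometry needs splitting is to compare the vertex count against the 16-bit limit. The helper names below are hypothetical, and the chunk count is only a lower bound, since primitives that straddle a split need their shared vertices duplicated in both buffers.

```cpp
#include <cstddef>

// Largest vertex count per buffer, as stated in the text above.
constexpr std::size_t kMaxVerticesPer16BitChunk = 65535;

// True when a vertex buffer of this size can be indexed with 16-bit indices.
constexpr bool fitsIn16BitIndices(std::size_t vertexCount) {
    return vertexCount <= kMaxVerticesPer16BitChunk;
}

// Minimum number of separate vertex buffers so each stays within the limit
// (rounds up; ignores vertices duplicated at chunk boundaries).
constexpr std::size_t chunksNeeded(std::size_t vertexCount) {
    return (vertexCount + kMaxVerticesPer16BitChunk - 1) / kMaxVerticesPer16BitChunk;
}
```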

Index buffers are created through the render device just like vertex buffers, but no declaration is needed, only a flag indicating 16-bit or 32-bit size. The same reasoning regarding uploading, locking and access applies as well. If your index buffer describes a triangle list, you can use the predefined Triangle type for 16-bit index buffer access and Triangle32 for 32-bit index buffer access.

Textures

The engine supports 1D, 2D, 3D and cubemap textures of varying pixel formats. You create a texture object with a call to the render device method createTexture. Before a texture can be used in rendering operations by binding it to a sampler parameter, it must have Image data uploaded. You upload image data with a call to upload on the texture object, defining the texture type and the image data. Images are loaded through image codecs. The engine ships with support for a number of common image formats: PNG, JPEG, TGA and DDS, as well as a simple text-based format TEX for defining compound images (combining six images into a cubemap image for example) and aliases.

The render device determines which pixel formats are supported for the different texture types. If the pixel format is not supported, the device will convert it automatically to the best matching pixel format when the image data is uploaded. The same reasoning applies to image dimensions. If you upload an image of a non-power-of-two dimension and the device only supports power-of-two dimensions, the image will be resized to the nearest lower power-of-two dimension. Other restrictions may apply; for example, cubemap textures must have the same width and height. Mipmaps are normally autogenerated by either the engine or the graphics hardware when uploading image data. You can, however, upload pre-generated mipmap data by defining mipmap levels in the image object. See the API documentation for the Image class for more information.
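
Resizing to the nearest lower power of two can be computed as in the sketch below. The function name floorPowerOfTwo is hypothetical and the engine may compute this differently; the sketch only illustrates the rounding rule described above.

```cpp
#include <cstdint>

// Largest power of two that is less than or equal to n (returns 0 for n == 0).
inline std::uint32_t floorPowerOfTwo(std::uint32_t n) {
    if (n == 0) return 0;
    std::uint32_t p = 1;
    // Double p while the doubled value would still fit within n.
    // Comparing against n / 2 avoids overflow for large n.
    while (p <= n / 2) p *= 2;
    return p;
}
```

Under this rule a 640x480 image would be resized to 512x256 on a device that only supports power-of-two dimensions.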

The most common way to load a texture is to use the resource management subsystem and call the load method on the TextureManager singleton object. The texture returned from this call is created on the current render device.

To manually upload image data to a texture you allocate the Image object, allocate the storage in the image with the given dimensions and pixel format, store the pixel data in the image object data buffer and call upload on the texture object. For more information about dealing with image objects, see the Image & Video subsystem.

Generated images

The engine also supports generating image data in code. An example of this would be a normalization cubemap where each pixel has values for the normalized vector from the center of the cubemap to that pixel. Image generators are registered with the ImageFactory singleton and identified by a "major path". For example, the system image generator, which can produce normalization cubemaps among other images, is identified by the path sys. Any requests to load images in the path /sys/<subpath>/<imagename> will be redirected to this image generator. If there are no generators for a path (or if the registered generator does not generate an image for that specific full path) the image is loaded from the engine file system using the registered image codecs.
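
The routing by major path, including the fallback to the file system, can be sketched as follows. This is not the ImageFactory interface: SketchImageFactory, Generator and the string results are hypothetical, and images are represented by descriptive strings so the routing decision is visible.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical generator: returns an image description for a subpath, or
// std::nullopt when it does not generate an image for that path.
using Generator = std::function<std::optional<std::string>(const std::string&)>;

class SketchImageFactory {
public:
    void registerGenerator(const std::string& majorPath, Generator g) {
        generators_[majorPath] = std::move(g);
    }

    // Returns a description of where the image came from.
    std::string load(const std::string& path) const {
        // The major path is the first component of "/major/rest".
        if (path.size() > 1 && path[0] == '/') {
            std::size_t slash = path.find('/', 1);
            if (slash != std::string::npos) {
                auto it = generators_.find(path.substr(1, slash - 1));
                if (it != generators_.end()) {
                    if (auto image = it->second(path.substr(slash + 1))) return *image;
                }
            }
        }
        return "file:" + path; // fall back to the file system and image codecs
    }

private:
    std::map<std::string, Generator> generators_;
};
```

A generator registered under sys sees every request below /sys/ first and can decline individual paths, in which case the request falls through to the file system.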

Animated images & video

Images can also be dynamic, providing functionality to display video or slideshow animated images on textured objects in the game. A dynamic/animated image returns true on a call to isAnimated, and the engine will automatically update textures that have this image uploaded as data. Videos are loaded through video codecs, and the engine ships with an implementation for the Ogg Theora codec. Video codecs also act as image codecs, so you can transparently load video data onto textures through the same interface as loading images. Using the resource manager to load textures makes it completely transparent; if there is a matching video file when the manager loads the image data for the texture, it will be automatically loaded and updated without the need for any code in the game. See the video test case for an example of this.

Render-to-texture

You can also generate data for a texture by rendering to it. A renderable texture is created with a call to the render device method createRenderTexture. You must define the texture type, pixel format and dimensions with calls to setType, setPixelFormat and setSize before using the render texture either for rendering or as a texture bound to a sampler parameter. The same reasoning about pixel formats and dimensions as when uploading images applies: if the render device does not support the pixel format and/or dimension, it will pick the best match. If you render to a cubemap or 3D texture you must also specify which side (or depth slice) you want to render to with a call to setActiveSurface.

Pipeline

The rendering pipeline is defined by shaders, programs and techniques. Data is passed through vertex and index buffers defining the primitives, and parameters defining the non-varying data such as transformation matrices and sampler bindings (textures). Different generations of hardware, APIs and shader dialects forming the pipeline are identified by pipeline targets. The engine uses two kinds of pipeline targets:

  • Shader targets, identifying a single shader version/dialect. The list of supported shader versions can be found in the Shader::TargetID enumeration.
  • Program targets, identifying a group of shader versions that fall into the same category. The list of program targets can be found in the Program::TargetID enumeration.

The current pipeline targets are selected at runtime by the render device depending on application-given preferences and the hardware capabilities present. The device will select one shader target for each shader type (vertex, pixel, geometry) and one program target.

Resources can provide support and implementations for many (even all) pipeline targets. This is done by having auxiliary data objects for different pipeline identifiers when the resource is in the uncompiled state. The target identifiers are then used when compiling pipeline resources, selecting the correct implementations provided for the resource. The difference between uncompiled and compiled resources can be visualized with the following collaboration diagrams:

Image:Renderresources.png

Shaders

Shaders define the operations (assembly or high-level code, or sampler setup for fixed-function) performed during the different pipeline stages. Traditionally the pipeline is divided into a vertex transformation part and a pixel/fragment rasterization part. Modern hardware has an additional geometry shader stage, which will soon be supported by the engine. Shaders are implemented in the Shader class in the engine.

A shader can support multiple shader targets with different code implementations by having a number of ShaderTarget objects stored, each ShaderTarget providing an implementation for one or more shader target identifiers. Each ShaderTarget has a single source code string defining the shader operations to be performed in that pipeline stage.

When the shader is compiled, it picks the first ShaderTarget having a shader target matching the shader targets selected by the render device. The source and parameters of this ShaderTarget are then assimilated by the shader, compiled and stored for use in whatever format the backend uses. The parameters for the shader are then set to the parameters of the chosen ShaderTarget. Finally, all ShaderTarget objects are discarded to free up resources.
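
The "first match wins" selection can be sketched as below. SketchShaderTarget and pickTarget are hypothetical, and target identifiers are plain strings here, whereas the engine uses the Shader::TargetID enumeration; the sketch only illustrates the selection order.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical simplification of a ShaderTarget.
struct SketchShaderTarget {
    std::vector<std::string> targets; // identifiers this implementation supports
    std::string source;               // source code for this implementation
};

// Returns the first stored implementation that supports any device-selected
// target, or nullptr if none matches. Stored order decides priority.
const SketchShaderTarget* pickTarget(const std::vector<SketchShaderTarget>& stored,
                                     const std::vector<std::string>& deviceTargets) {
    for (const SketchShaderTarget& st : stored) {
        for (const std::string& id : st.targets) {
            if (std::find(deviceTargets.begin(), deviceTargets.end(), id) != deviceTargets.end())
                return &st;
        }
    }
    return nullptr;
}
```

Because the first match is used, the order in which ShaderTarget objects are stored acts as a priority list when several of them support the selected target.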

Programs

Techniques

Parameters

Device

Occlusion Culling

Post-process Effects

Utility classes