Entries tagged as: Developers

Tessellation: Enhance your geometry!

Posted by Nick Haemel on June 18, 2009

As GPUs become more powerful, they are increasingly used as general compute devices, often rivaling or surpassing the CPU. At the same time, modern GPUs keep gaining tools and features that assist general computation, and these features also add high-performance paths that enhance graphics rendering. One such addition is GPU tessellation.

Tessellation, in its purest definition, is the tiling of a plane or surface by smaller sub-surfaces. On the GPU this translates into breaking geometry into smaller, more detailed pieces. ATI has previously done this through TrueForm® with mixed success. A tessellation mechanism can also be implemented using the geometry shader. But the new tessellation engine in ATI Radeon HD Series and FirePro/FireGL V Series graphics hardware automates this process (tessellation is currently not available for OpenGL on any Nvidia hardware). Very little work is needed to get this running in any OpenGL app: just enable the tessellation state in OpenGL and pick a tessellation factor based on how detailed you would like the geometry to be. The application's vertex shaders can also be updated to correct texture coordinates based on the generated geometry.
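To make "breaking geometry into smaller pieces" concrete, here is a minimal CPU-side sketch in Python. It is not the pattern the hardware tessellator emits and it does not use the extension's API. The function name `tessellate_triangle` and the regular barycentric grid are assumptions chosen for illustration. What it shows is how a single tessellation factor controls the amount of generated detail: a factor of n turns one triangle into n*n triangles.

```python
def tessellate_triangle(a, b, c, factor):
    """Split triangle (a, b, c) into factor**2 smaller triangles.

    Vertices are placed on a regular barycentric grid. This is a CPU
    illustration only, not the pattern a hardware tessellator produces.
    """
    def point(i, j):
        # Barycentric weights (w, u, v) for corners (a, b, c); they sum to 1.
        u = i / factor
        v = j / factor
        w = 1.0 - u - v
        return tuple(w * pa + u * pb + v * pc for pa, pb, pc in zip(a, b, c))

    triangles = []
    for i in range(factor):
        for j in range(factor - i):
            # "Upright" triangle in grid cell (i, j).
            triangles.append((point(i, j), point(i + 1, j), point(i, j + 1)))
            # "Inverted" triangle in the same cell, absent on the last diagonal.
            if i + j < factor - 1:
                triangles.append(
                    (point(i + 1, j), point(i + 1, j + 1), point(i, j + 1))
                )
    return triangles


if __name__ == "__main__":
    coarse = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
    for factor in (1, 2, 4, 8):
        print(factor, len(tessellate_triangle(*coarse, factor)))
```

Only the coarse triangle and the factor need to be stored or uploaded; the fine triangles are generated on demand, which is the source of the memory and bandwidth savings discussed in this post.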


This powerful rendering mechanism can both enhance geometry and increase performance. By using tessellation, the same level of detail can be rendered at six times the speed while saving more than 50% of video memory, not to mention the bandwidth saved by uploading significantly less geometry. (These figures are for an 840-triangle original model rendered at an LOD of 1,008,038 triangles, with and without the tessellation engine.) Such a performance boost, in addition to the visual enhancement, can provide a significant advantage for any application that adopts tessellation.

The result of tessellation is deterministic, and therefore well adapted to many CAD situations. But the biggest gains can be seen in digital content creation. Digital content models are often large and can be difficult to render in real-time. With tessellation, significantly smaller model sizes can be used for similar levels of detail. Pre-visualization paths can also make use of tessellation to provide better looking images faster than was previously possible. The example below is a fly-by done with tessellation enabled, showing how tessellation can enhance a landscape in real-time.

This feature can be enabled with the AMD_vertex_shader_tessellator extension located in the OpenGL extension registry.

AMD has also created a white paper detailing how to implement Catmull-Clark subdivision using the tessellation engine. The demo and white paper can be found here. Or explore many of the other possibilities for using tessellation on OpenGL or DirectX.

OpenGL 3.1 Released - Proof is in the Pudding

Posted by Nick Haemel on March 24, 2009

Khronos and the OpenGL ARB have done it! OpenGL 3.1 and GLSL 1.40 have been released on the six-month schedule promised at SIGGRAPH 2008. As promised, most of the legacy features marked as deprecated have been removed. No more display lists. No more immediate mode rendering. No more fixed function pipeline. The cruft accumulated over the last 17 years has been cleaned up to create a simplified and performant 3D graphics API. OpenGL 3.1 really does match the current generation of programmable graphics devices.

In addition to removing deprecated functionality, OpenGL 3.1 adds a bunch of handy new features.

Uniform buffer objects
The first and biggest is support for uniform buffer objects. This new object allows a shader to group uniforms together into a block of uniform memory. New interfaces make updating groups of uniforms easier and much more efficient. These new buffers can also be shared between programs, reducing wasted memory usage and shader uniform-reload time.
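As a rough illustration of grouping uniforms into one block of memory, the Python sketch below packs a small block into bytes using the std140 layout rules. The block itself (`Lighting`, with `light_color`, `light_dir` and `intensity`) is hypothetical and is shown only as a comment; the little-endian byte order is also an assumption about the host. In a real application the resulting bytes would be uploaded once into a buffer object instead of setting each uniform individually.

```python
import struct

# Hypothetical GLSL block this sketch targets (illustration only):
#
#   layout(std140) uniform Lighting {
#       vec4  light_color;  // offset  0, 16 bytes
#       vec3  light_dir;    // offset 16, 12 bytes (vec3 aligns to 16)
#       float intensity;    // offset 28,  4 bytes (fills the vec3's padding)
#   };                      // total size: 32 bytes

def pack_lighting_block(light_color, light_dir, intensity):
    """Pack the hypothetical Lighting block into 32 bytes, std140 offsets."""
    # Eight consecutive little-endian floats happen to match the offsets above.
    return struct.pack("<4f3ff", *light_color, *light_dir, intensity)


if __name__ == "__main__":
    data = pack_lighting_block((1.0, 0.5, 0.25, 1.0), (0.0, -1.0, 0.0), 2.0)
    print(len(data))  # size of the whole block in bytes
```

Because the whole group lives in one buffer, the same bytes can back the block in several programs, which is where the reduced memory use and reload time come from.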

Texture buffer objects and Copy buffers
Texture buffer objects were also introduced into core OpenGL 3.1. This new texture type allows generic buffers to be attached to a texture as a 1D array. Now general buffer data is accessible to shaders through new fetch functions. Additionally, a copy mechanism (GL_ARB_copy_buffer) that allows for direct accelerated buffer-to-buffer copies has been added. This extends the benefits of generic buffer objects and creates interesting opportunities with multithreaded load/execute algorithms.
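The two ideas in this section can be modeled on the CPU with plain byte buffers. The Python sketch below is only an analogy: `texel_fetch` and `copy_buffer_sub_data` are illustrative names, not OpenGL calls, and the RGBA 32-bit float texel format is an assumption. It shows a generic buffer being read as a 1D array of texels, and a byte range being copied directly from one buffer to another.

```python
import struct

TEXEL_SIZE = 16  # bytes per texel, assuming four 32-bit floats (RGBA)

def texel_fetch(buffer, index):
    """Read texel `index` from a byte buffer viewed as a 1D RGBA float array."""
    return struct.unpack_from("<4f", buffer, index * TEXEL_SIZE)

def copy_buffer_sub_data(src, dst, read_offset, write_offset, size):
    """Copy `size` bytes from src to dst (CPU stand-in for a buffer-to-buffer copy)."""
    dst[write_offset:write_offset + size] = src[read_offset:read_offset + size]


if __name__ == "__main__":
    # Two texels of generic data: (0, 1, 2, 3) and (4, 5, 6, 7).
    src = bytearray(struct.pack("<8f", *[float(i) for i in range(8)]))
    dst = bytearray(2 * TEXEL_SIZE)
    # Copy the second texel of src into the first slot of dst.
    copy_buffer_sub_data(src, dst, TEXEL_SIZE, 0, TEXEL_SIZE)
    print(texel_fetch(dst, 0))
```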

Instanced rendering
Instanced rendering has been added to core, allowing apps to draw multiple copies of similar objects without incurring system bandwidth costs (I mentioned this inadvertently in an earlier post).

Other features
Primitive restart, SNORM textures and several other new features were also added.
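Primitive restart is easy to picture with a small sketch: a special index value in the index list ends the current strip and starts a new one, so several strips can be submitted in one draw. The Python below is a CPU illustration with hypothetical names (`split_strips`, and `0xFFFF` as the restart value); it is not how a driver implements the feature.

```python
def split_strips(indices, restart_index):
    """Split one index list into separate strips at each restart index."""
    strips = []
    current = []
    for idx in indices:
        if idx == restart_index:
            # The restart value ends the strip in progress (if any).
            if current:
                strips.append(current)
            current = []
        else:
            current.append(idx)
    if current:
        strips.append(current)
    return strips


if __name__ == "__main__":
    RESTART = 0xFFFF  # assumed restart value for this example
    print(split_strips([0, 1, 2, 3, RESTART, 4, 5, 6], RESTART))
```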

OpenGL is continuing to march forward with progressive revisions bringing new functionality to 3D developers. AMD will follow with full driver support for OpenGL 3.1 shortly.

FirePro OpenGL Developer Tools for tessellation and 10-bit component surfaces

Posted by Tony DeYoung on March 20, 2009

AMD has released Workstation developer tools including OpenGL tessellation example source code for the AMD_vertex_shader_tessellator extension as well as sample OpenGL source code that illustrates how to create and display 10-bit per component surfaces. This code is specifically intended for developers building products targeting professional users of FirePro workstation cards. AMD has also released the AMD Display Library SDK for Windows and Linux that provides access to graphics hardware information on FirePro and Radeon products.

Tags: 3D, Developers

OpenGL 3 - what types of changes to expect from your favorite 3D applications

Posted by Nick Haemel on February 18, 2009

Now that OpenGL 3.0 is well on its way to a desktop nearby, you may be curious about what types of changes to expect from your favorite 3D applications. There are two main categories of improvements for OpenGL 3.0: changes that introduce new tools and changes that allow for performance enhancements. Well, let’s take a look!

FBOs
First, a new buffer binding mechanism called FBOs (framebuffer objects) allows an OpenGL app to do comprehensive, fast and efficient off-screen rendering without creating a new context. Additionally, these FBOs can have floating point buffers attached as render targets. By using floating point buffers, applications can maintain more precision in the final image as effects are applied to a scene. This enables some really cool lighting effects such as lens aberration and blooming, similar to a feature film shot that catches a direct glimpse of the sun. Additionally, object highlights and specular reflections can appear much more realistic.
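The precision argument can be sketched with a little arithmetic. The Python below compares accumulating light contributions into an 8-bit normalized target, which clamps to 1.0 and quantizes to steps of 1/255, against a floating point target that keeps the full value. The function names and the sample contributions are assumptions for illustration; no OpenGL is involved.

```python
def accumulate_8bit(contributions):
    """Sum light into an 8-bit normalized target: clamp to [0, 1], quantize to 1/255."""
    total = 0.0
    for value in contributions:
        total = min(total + value, 1.0)   # values above 1.0 are lost here
        total = round(total * 255) / 255  # stored as one of 256 levels
    return total

def accumulate_float(contributions):
    """Sum light into a floating point target: no clamping between passes."""
    return sum(contributions)


if __name__ == "__main__":
    sun_glare = [0.8, 0.9, 0.7]  # three bright passes hitting the same pixel
    print(accumulate_8bit(sun_glare), accumulate_float(sun_glare))
```

With the fixed-point target, every pixel brighter than 1.0 looks identical, so a later bloom or tone-mapping pass has nothing to work with; the float target still knows how much brighter the pixel really was.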

Transform feedback
Transform feedback, also called stream-out, is another new addition that will revolutionize what is possible on a GPU. Applications can use this feature to assist in physics computations directly on the GPU, preprocess or multi-process vertices, and perform complex math operations. Apps can also make use of transform feedback to efficiently tessellate geometry, adding significantly more detail to objects and scenes without increasing data file sizes on your hard drive. AMD supports a custom extension that offers applications even more control over tessellation.
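The core idea of transform feedback is that the outputs of the vertex stage are captured into a buffer, which can then be used as the input of the next pass. The Python sketch below imitates that loop on the CPU for a tiny particle system; `vertex_stage`, `simulate` and the constant gravity are all hypothetical choices for illustration and say nothing about how a driver implements the feature.

```python
GRAVITY = 9.8  # assumed constant acceleration along -y

def vertex_stage(particle, dt):
    """Per-vertex 'shader': advance one particle by one time step."""
    (x, y), (vx, vy) = particle
    vy -= GRAVITY * dt
    return ((x + vx * dt, y + vy * dt), (vx, vy))

def simulate(particles, steps, dt):
    """Run several passes, feeding each pass's captured output into the next."""
    source = list(particles)
    for _ in range(steps):
        # Capture every output of the stage into a second buffer...
        captured = [vertex_stage(p, dt) for p in source]
        # ...then swap, so the captured buffer is the next pass's input.
        source = captured
    return source


if __name__ == "__main__":
    start = [((0.0, 0.0), (1.0, 0.0))]  # one particle: position, velocity
    print(simulate(start, steps=2, dt=0.5))
```

On the GPU the data never has to return to the CPU between passes, which is what makes physics-style iteration and multi-pass vertex processing practical.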

Vertex array objects
OpenGL 3.0 also offers new ways of storing and referencing geometry, allowing for quicker access. Vertex array objects, or VAOs, make setting up rendering much quicker. New data formats also allow more efficient storage of geometry and texture information. All of these performance enhancements will allow applications to increase model sizes, use more sophisticated shading techniques, and increase overall visual fidelity.

Some applications have shorter development cycles than others. Typical CAD and digital content creation suites are large and complex; it may be a year or more before we see widespread adoption. Game engines may begin to look at the newest version of OpenGL sooner. But the good news is that changes in OpenGL 3.0 have made the API much lighter, allowing developers to achieve faster turnaround. There are many new tools in OpenGL 3.0 that bring exciting new power and flexibility to the 3D graphics arena. AMD is working closely with developers to bring OpenGL 3.0 to the next generation of professional 3D applications.

Instanced rendering (update 2/20/09 - available in the GL_ARB_draw_instanced extension - my mistake for first referencing it as core!)
There are numerous enhancements to OpenGL 3.0 that allow applications to process and render geometry much faster. One available for now as an extension is instanced rendering. This feature allows repeated rendering of some objects, sometimes at little or no additional cost. Imagine rendering hundreds of trees or blades of grass, all essentially the same geometry. This can also be applied for geometry stippling, other repeated patterns or even assist in bone-skinning for objects and characters that have moving joints.
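A minimal CPU sketch of the idea, in Python: the mesh is stored once, and each instance only contributes a small piece of per-instance data that is looked up by instance ID. The names (`draw_instanced`, the 2D tree mesh, the offsets) are hypothetical and chosen for illustration; a real implementation does this per vertex on the GPU.

```python
def draw_instanced(mesh, instance_offsets):
    """Emit one translated copy of `mesh` per instance, selected by instance ID."""
    drawn = []
    for instance_id in range(len(instance_offsets)):
        ox, oy = instance_offsets[instance_id]  # per-instance data lookup
        drawn.append([(x + ox, y + oy) for x, y in mesh])
    return drawn


if __name__ == "__main__":
    tree = [(0.0, 0.0), (1.0, 0.0), (0.5, 2.0)]        # stored once
    offsets = [(float(i) * 3.0, 0.0) for i in range(100)]  # one entry per tree
    forest = draw_instanced(tree, offsets)
    submitted = len(tree) + len(offsets)        # what the app provides
    rendered = sum(len(copy) for copy in forest)  # what ends up drawn
    print(submitted, rendered)
```

The application submits the mesh plus one small record per instance, rather than a full copy of the geometry for every tree.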

Cut-to-the-chase summary of what to expect from OpenGL 3-enabled CAD and DCC apps:
- More realistic and interesting lighting effects
- Faster rendering of objects that repeat
- Improved visual fidelity and faster rendering
- Greater detail in objects and scenes without increasing file sizes

OpenGL 3.0 - A Big Step in the Right Direction

Posted by Nick Haemel on August 18, 2008

There has been much controversy over the direction the Khronos Group/OpenGL ARB has chosen for the next major version of OpenGL. After testing an approach that would have a drastic effect on the API, requiring complete OpenGL application rewrites and not introducing any of the long awaited features modern GPUs are capable of, the choice was made to give programmers what they are really waiting for. And that’s new features now. GL 3.0 takes two important steps to moving open standard graphics forward in a major way. The first is to provide core and ARB extension access to the new and exciting capabilities of hardware. The second is to create a roadmap that allows developers to see what parts of core specifications will be going away in the future, also providing the OpenGL ARB with a way to introduce new features faster.

Over the last few years graphics hardware has made great strides forward. Different vendors have exposed home-grown extensions to give users access to hardware, but these extensions vary between vendors and are not a stable approach to supporting new hardware. GL 3.0 brings these new features under one roof, defining one common and accepted way that all vendors will implement. Now all GL application programmers can get access to things like float color/texture/depth buffers, integer formats, conditional rendering, framebuffer objects, transform feedback, vertex array objects, half-float data types (vertex/pixel), and so much more. With these new features all developers have the tools to add new eye candy and much better optimize render algorithms and performance.

Many of the new features of OpenGL 3.0 provide mechanisms to increase the efficiency and speed of today’s complex scenes. Conditional rendering allows an app to discard geometry that would be occluded during normal rendering. Half float formats can help drastically compress vertex data sets. Vertex array objects make setting up rendering much easier and less error prone. Map buffer range support allows a small portion of a buffer to be mapped, even while it is rendered from, so no more GPU stalls are required. There are also quite a few additions to enhance rendering flexibility. Framebuffer objects provide a fast and simple way to accomplish off-screen rendering. Transform feedback opens the door to a whole new set of complex multi-pass rendering and geometry generation algorithms previously impossible. Integer-in-shader support allows for much more flexible and natural shader code. Several other important features such as instanced rendering and geometry shaders have now been given ARB extension status as well.
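The half-float saving is simple to check on the CPU. The Python sketch below packs the same vertex positions as 32-bit floats and as 16-bit half floats using the standard `struct` module (format codes `f` and `e`). The helper name `pack_positions` and the sample triangle are assumptions for illustration; half floats trade range and precision for size, so this suits data such as normals or small local coordinates better than large world-space positions.

```python
import struct

def pack_positions(vertices, format_char):
    """Pack (x, y, z) tuples as little-endian 'f' (32-bit) or 'e' (16-bit) floats."""
    flat = [component for vertex in vertices for component in vertex]
    return struct.pack("<%d%s" % (len(flat), format_char), *flat)


if __name__ == "__main__":
    triangle = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, 1.0, -2.0)]
    full = pack_positions(triangle, "f")
    half = pack_positions(triangle, "e")
    print(len(full), len(half))  # byte sizes of the two encodings
```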

By introducing the new deprecation model, the ARB has created a way to signal what will be removed in future revisions. This provides enough time for all developers to move code-bases to newer and better methods. Future versions of GL will remove fixed-function rendering, color index mode, immediate mode, client vertex arrays, and other seldom used portions of older specs. This helps to keep the spec lean and mean, also allowing hardware vendors to better optimize performance and maintain quality. A way for OpenGL to gracefully move forward has been long missing.  With the most recent changes, OpenGL now has the tools to keep open standard graphics current and useful for many different flavors of 3D applications.

AMD Stream Computing and Nvidia CUDA - similar but different

Posted by Tony DeYoung on June 19, 2008

CUDA vs AMD Stream Computing - both are APIs that let GPUs like the FireGL or Quadro handle non-graphics compute applications in parallel. The benefit of GPU computing stems from the highly parallel architecture of the GPU, whereby tens to hundreds of parallel operations are performed with each clock cycle, whereas the CPU can at best work on only a small handful of parallel operations per clock cycle. CUDA and Stream differ in one significant way. AMD has published their interfaces for Stream from low to high level. AMD’s Brook+ (the Stream Compiler) is completely open source, while Nvidia’s CUDA is proprietary. The AMD Stream Compute Abstraction Layer (CAL), the run-time driver layer and intermediate language, is also published so developers can program at either the CAL or Brook+ level - their choice. Finally, AMD publishes the instruction set architecture so developers can tune low-level performance.

Publishing APIs at all levels is good for developers and ultimately good for consumers. The move to open standards with the new Heterogeneous Computing Initiative supporting OpenCL (Open Computing Language) will be a good move for both AMD and Nvidia. The idea is that an application developer would write an OpenCL-based stream computing application, and it would run on any GPU or CPU with an OpenCL driver. Both AMD and Nvidia have indicated they want to support this new standard.

OpenCL is still evolving, but AMD’s director of Stream Computing, Patricia Harrell, says that there are enough structural similarities to Brook+ that FireStream programmers shouldn’t find it too difficult to make the transition. It will be a while, but in the not too distant future, our graphics accelerators will suddenly be capable of doing a lot more interesting things.

