AMD announced that its Open Physics Initiative now offers game developers the free, open source Bullet Physics as the default rigid body physics system, combined with Pixelux’s DMM2 (Digital Molecular Matter) material physics engine. Developers can design and interact with the rigid body systems familiar to them and incrementally add DMM objects that bend and break based on real physical properties.
The Free PC version of DMM2 has no license fee for development or production deployment and includes all the features of the premium version, including GPU acceleration. Free PC DMM2 is expected to be made available shortly to interested developers. All of the Bullet Physics implementations described above can be run on any OpenCL- or DirectCompute-capable platform.
gDEBugger is an OpenGL debugger and profiler which runs on Windows, Linux and Mac OS X. The new v5.5 release adds AMD GPU Performance Counters integration, displaying AMD (ATI) graphics hardware and driver performance counters inside gDEBugger’s Performance Graph and Performance Dashboard views, so developers can optimize OpenGL application performance on ATI FirePro and Radeon graphics hardware.
AMD Developer Central has posted the ATI Stream OpenCL Technical Overview Video Series. The series of 5 videos gives ATI Stream developers an overview of the OpenCL API and the OpenCL C programming language.
Here is a summary of the videos by AMD’s Justin Hensley:
The first production release of the ATI Stream SDK with OpenCL 1.0 support is out for Windows XP, Vista, and 7, as well as openSUSE 11.0 and Ubuntu 9.04. The ATI implementation of OpenCL lets developers use combined CPU and GPU power for accelerating applications. This release supports all FirePro workstation cards, as well as the consumer Radeon HD 4XXX, HD 5XXX, and Mobility HD 4XXX series.
What's new in ATI Stream SDK 2.0?
First production release of ATI Stream SDK with OpenCL 1.0 support.
New: Support for OpenCL ICD (Installable Client Driver).
New: Support for atomic functions for 32-bit integers.
New: Microsoft Visual Studio 2008-integrated ATI Stream Profiler performance analysis tool.
Preview: Support for OpenCL / OpenGL interoperability.
Preview: Support for OpenCL / Microsoft DirectX 10 interoperability.
Preview: Support for double-precision floating point basic arithmetic in OpenCL C kernels.
Updated OpenCL runtime to conditionally load ATI CAL runtime libraries to allow execution on compatible CPUs without ATI Catalyst installed.
Updated OpenCL runtime to allow simultaneous use of OpenCL and ATI CAL APIs in a single user application.
Updated cl.hpp from the Khronos OpenCL working group release.
Various OpenCL compiler and runtime fixes and enhancements.
SmallptGPU is a small and simple Path Tracer demo written in OpenCL in order to test the performance of this new standard. Path tracing is essentially a form of ray tracing whereby each ray is recursively traced along a path until it reaches a light emitting source where the light contribution along the path is calculated. This recursive tracing helps for solving the lighting equation more accurately than conventional ray tracing (definition courtesy of Wikipedia).
SmallptGPU was originally written for Linux using the ATI OpenCL SDK beta4. There are now Windows 32- and 64-bit builds as well, in the fourth post from the top of this thread. Since it is OpenCL, the code should work on any platform or implementation.
The following video shows the demo running on a Radeon 4870. You can see the progressive rendering technique in action.
Keep in mind that a Radeon 5970 should be at least 4 times faster. Moreover an OpenCL renderer should scale across as many cards as you can cram onto a board.
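The path tracing and progressive rendering ideas described above can be sketched in a few lines of Python. This is only a toy model, not SmallptGPU's code: the "scene" is an assumption made for illustration, in which every bounce either reaches the light with probability `p_light` or hits a diffuse surface that absorbs part of the energy (`albedo`) before the ray is traced again. All names and parameter values are hypothetical.

```python
import random

def trace_path(rng, p_light=0.25, emitted=4.0, albedo=0.5, depth=0, max_depth=64):
    """Follow one path through a toy scene and return the light it carries.

    At every bounce the ray either reaches the light (probability p_light)
    or hits a diffuse surface, loses energy (albedo) and is traced again.
    """
    if depth >= max_depth:
        return 0.0  # give up on extremely long paths
    if rng.random() < p_light:
        return emitted  # the path reached a light-emitting source
    # Diffuse bounce: attenuate and recurse along the path.
    return albedo * trace_path(rng, p_light, emitted, albedo, depth + 1, max_depth)

def render_pixel(samples, seed=1):
    """Progressively average many paths for one pixel.

    Returns the running estimate after each sample, which is what a
    progressive renderer would display as the image refines.
    """
    rng = random.Random(seed)
    total = 0.0
    estimates = []
    for i in range(1, samples + 1):
        total += trace_path(rng)
        estimates.append(total / i)
    return estimates

estimates = render_pixel(20000)
# Analytic answer for this toy scene: p*L / (1 - (1 - p) * albedo)
expected = 0.25 * 4.0 / (1.0 - 0.75 * 0.5)
print("after 10 samples:   ", estimates[9])
print("after 20000 samples:", estimates[-1])
print("analytic value:     ", expected)
```

Early estimates are noisy and later ones settle toward the analytic value, which is the same behavior as the grainy-then-clean image in a progressive path tracer.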
The ATI Stream Quarterly Newsletter is now online. With the recent release of the OpenCL GPU Beta as part of the ATI Stream SDK v2.0 Beta Program, this quarterly is packed full of OpenCL information and resources.
Here is a summary of what you will find:
OpenCL CPU+GPU Beta Release
Introductory Tutorial to OpenCL with Benedict Gaster
AMD Developer Inside Track: Introduction to OpenCL with Michael Houston
Image Convolution Using OpenCL – A Step-by-Step Tutorial
OpenCL Tutorial – N-Body Simulation
Spotlight Application: Distributed RC5 Encryption with ATI Stream
AMD and SiSoftware Collaborate on OpenCL Industry Benchmark Suite
Tips and Tricks: Porting CUDA Applications to OpenCL
Coming Soon! OpenCL Technical Overview Video Series
Coming in December! CAPS to release AMD CAL/IL Backend for HMPP
Available Now! ATI Stream Development Platforms from Colfax and Exxact
Developer Training Program: OpenCL Course from VizExperts
Related GPGPU benchmarking suite released: The new Sandra 2010 benchmark suite for GPGPU computing enables testing of ATI Stream, CUDA, OpenCL and DX11 Compute Shaders. SiSoft published some initial benchmarks, and the showstopper was the performance of the new Radeon 5870 running OpenCL. Quote from the test results page: “Pummels everything into dust with fantastic performance, power and cost efficiency. The very best!”
AMD has given out more details on its “Fusion” strategy of taking advantage of its combined CPU and GPU strengths.
The first product (codenamed Llano and expected in 2011) will feature Phenom CPU cores tightly integrated with a GPU that supports DirectX11/OpenGL and OpenCL. Eventually this will be replaced by “Bulldozer” which will more tightly link the GPU math hardware into the multithreaded CPU core in a single-chip Accelerated Processing Unit (APU).
The Fusion approach AMD is taking is designed to be friendly to developers and to existing rendering code, with a focus on open standards and compatibility with existing workloads, methods, and programming models.
Recently at SIGGRAPH 2009, Khronos and the ARB announced OpenGL 3.2 and GLSL 1.50. We have continued to advance 3D graphics capability on a 6-month schedule. OpenGL 3.2 adds a few larger pieces of functionality along with many smaller tweaks, while still being compatible with most modern installed GPUs. If you have a 1- or 2-year-old GPU, chances are a driver update will bring you OpenGL 3.2.
The first major landmark in OpenGL 3.2 is geometry shader support. This long awaited shader pipeline stage allows for geometry primitives to be modified on the GPU. This includes generating new primitives from existing ones, modifying in-flight primitives, or removing primitives. With this feature, an app can amplify geometry without changing the stored vertices, implement tessellation schemes, or turn lines/points into volumes. One of the side effects of geometry shaders is that the amount of data handled by the CPU and passed to the GPU for the geometry generated is significantly reduced. This means precious bandwidth, CPU cycles, and memory are conserved.
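To make the amplification idea concrete, here is a small CPU-side emulation in Python of what a geometry shader stage might do when turning points into quads. It is a sketch for illustration only and not GLSL: the function names and the `half_size` parameter are hypothetical, and a real geometry shader would emit the vertices on the GPU instead of building Python lists.

```python
def expand_point_to_quad(point, half_size=0.5):
    """Emit two triangles forming a screen-aligned quad around one point."""
    x, y, z = point
    a = (x - half_size, y - half_size, z)
    b = (x + half_size, y - half_size, z)
    c = (x + half_size, y + half_size, z)
    d = (x - half_size, y + half_size, z)
    return [(a, b, c), (a, c, d)]

def geometry_stage(points, half_size=0.5):
    """Run the per-primitive expansion over every input point."""
    triangles = []
    for p in points:
        triangles.extend(expand_point_to_quad(p, half_size))
    return triangles

# Only three vertices are stored and uploaded by the application...
points = [(0.0, 0.0, 0.0), (2.0, 1.0, 0.0), (-1.0, 3.0, 0.5)]
# ...but six triangles (eighteen vertices) come out of the stage.
triangles = geometry_stage(points)
print(len(points), "points in,", len(triangles), "triangles out")
```

The stored data stays at one vertex per point while the generated geometry is six times larger, which is where the bandwidth and memory savings mentioned above come from.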
OpenGL 3.2 has also added an important feature called sync objects. This feature creates a mechanism which allows the GPU and CPU to stay in sync. Previously the only way to be sure a GPU was finished with a surface or object was to flush the whole pipeline, stalling the GPU and killing performance. With sync objects, applications can be signaled when events on the GPU complete, even while the GPU is still fully saturated. This new functionality will work particularly well at syncing the CPU and GPU, keeping multiple graphics contexts in multiple threads in sync, and at synchronizing multiple GPUs when using extensions like WGL_AMD_GPU_association.
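In OpenGL 3.2 itself, the relevant entry points are `glFenceSync` and `glClientWaitSync`. The Python sketch below is not OpenGL code; it is a conceptual model, under the assumption that the GPU can be pictured as a worker thread draining a command queue. The `ToyGpu` class and its methods are hypothetical names invented for this illustration. The point it shows is that a fence placed in the command stream lets the CPU wait for just the work submitted before it, without draining everything queued afterwards.

```python
import queue
import threading

class ToyGpu:
    """A worker thread standing in for a GPU that executes commands in order."""

    def __init__(self):
        self.commands = queue.Queue()
        self.completed = []
        self.worker = threading.Thread(target=self._run, daemon=True)
        self.worker.start()

    def _run(self):
        while True:
            cmd = self.commands.get()
            if cmd is None:
                break  # shutdown marker
            if isinstance(cmd, threading.Event):
                cmd.set()  # fence reached: everything before it is done
            else:
                self.completed.append(cmd)  # "execute" the command

    def submit(self, name):
        self.commands.put(name)

    def insert_fence(self):
        """Place a fence in the command stream and return it to the caller."""
        fence = threading.Event()
        self.commands.put(fence)
        return fence

    def shutdown(self):
        self.commands.put(None)
        self.worker.join()

gpu = ToyGpu()
gpu.submit("draw A")
gpu.submit("draw B")
fence = gpu.insert_fence()   # roughly the role glFenceSync plays
gpu.submit("draw C")         # later work keeps the "GPU" busy

signaled = fence.wait(timeout=5.0)      # roughly the role of glClientWaitSync
done_at_fence = list(gpu.completed[:2])  # A and B are guaranteed finished here
gpu.shutdown()
print(signaled, done_at_fence)
```

Because commands run in submission order, the fence guarantees "draw A" and "draw B" are finished when the wait returns, while "draw C" may still be pending.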
Multisample textures and samplers are now in OpenGL 3.2, giving applications the option of applying multisample rendering hardware to textures and render buffer objects, instead of only screen space windows. Now the use of off-screen real time rendering can also benefit from multisample rendering. Additionally, shaders can read from each sample of a multisampled texture and apply custom blend schemes.
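As a rough illustration of a custom blend scheme, the Python sketch below resolves the samples of a single pixel in two ways: a plain average, and a weighted average with caller-chosen weights. In GLSL 1.50 a shader would fetch the individual samples with `texelFetch` on a `sampler2DMS`; the Python functions, the sample values, and the weights here are hypothetical and chosen only to show the arithmetic.

```python
def resolve_box(samples):
    """Standard resolve: every sample contributes equally."""
    return sum(samples) / len(samples)

def resolve_custom(samples, weights):
    """Custom resolve: a normalized weighted average of the samples."""
    total_weight = sum(weights)
    return sum(s * w for s, w in zip(samples, weights)) / total_weight

# One edge pixel with 4 samples: one sample sees the background (0.0),
# three see the foreground (1.0).
pixel_samples = [0.0, 1.0, 1.0, 1.0]

box = resolve_box(pixel_samples)
custom = resolve_custom(pixel_samples, [4.0, 2.0, 1.0, 1.0])
print("box resolve:   ", box)
print("custom resolve:", custom)
```

With per-sample access, the application decides how samples are combined instead of being limited to the fixed resolve applied to a window surface.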
With OpenGL 3.2, we have also added the idea of profiles. Two profiles exist in OpenGL 3.2, the core profile and the compatibility profile. Core profiles are ideal for modern applications that want the full performance benefits of a slimmed down API and reduced validation. Compatibility profiles are maintained for larger, older code-bases that need access to new features. AMD plans to support the Compatibility Profile, although other vendors may not. OpenGL 3.2 also adds a significant number of modifications that allow applications to be more easily ported from other 3D APIs. This is particularly important for developers bringing applications to different hardware such as mobile devices or Open Source platforms.
OpenGL 3.2 is proof of the relevance and continued evolution of open standards for 3D graphics. The OpenGL ARB continues to make forward progress, iterating through OpenGL releases that bring new and useful features to the graphics community. You can share suggestions and comments about OpenGL 3.2 with the OpenGL ARB through the official OpenGL 3.2 feedback thread on the OpenGL forum or by leaving comments for me here.
The SIGGRAPH 2009: Beyond Programmable Shading course notes and PDF slide presentations are now available. “Beyond Programmable Shading I” topics include: parallel graphics architectures, parallel programming models for graphics, and game-developer investigations of the use of these new capabilities in future rendering engines. “Beyond Programmable Shading II” topics include volumetric and hair lighting, alternate rendering pipelines including ray tracing and micropolygon rendering, in-frame data structure construction, and complex image processing.
Intel and AMD were the course organizers. The course presenters were all experts on advanced rendering, graphics hardware, and parallel computing for graphics from academia and industry.
As GPUs become more powerful, we see many new applications of how they can be used as general compute devices often rivaling and surpassing the CPU. But at the same time, modern GPUs are augmented with tools and features that assist general computation. These new features add high performance paths that enhance graphics rendering capabilities. One such addition is GPU tessellation.
Tessellation, in its purest definition, is the tiling of a plane or surface by smaller sub-surfaces. On the GPU this translates into breaking geometry into smaller, more detailed pieces. ATI has previously done this through TruForm® with mixed success. A tessellation mechanism can also be implemented using the geometry shader, but the new tessellation engine in ATI Radeon HD Series and FirePro/FireGL V Series graphics hardware automates this process (it is currently not available for OpenGL on any nVidia hardware). Very little work is needed to get this running in any OpenGL app: just enable the tessellation state in OpenGL and pick a tessellation factor based on how detailed you would like the geometry to be. The application's vertex shaders can also be updated to correct texture coordinates based on the generated geometry.
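To give a feel for how a tessellation factor controls detail, here is a minimal Python sketch that uniformly subdivides one triangle on the CPU. It assumes the simplest possible scheme, in which a factor of n splits each edge into n segments and yields n² smaller triangles; the hardware tessellator's actual partitioning is not described here and may differ. The function names are hypothetical.

```python
def tessellate_triangle(a, b, c, factor):
    """Split triangle (a, b, c) into factor**2 smaller triangles."""
    def point(i, j):
        # Barycentric grid point: i steps toward b, j steps toward c.
        u, v = i / factor, j / factor
        w = 1.0 - u - v
        return tuple(w * pa + u * pb + v * pc for pa, pb, pc in zip(a, b, c))

    triangles = []
    for i in range(factor):
        for j in range(factor - i):
            # "Upright" triangle of this grid cell.
            triangles.append((point(i, j), point(i + 1, j), point(i, j + 1)))
            # "Inverted" triangle, absent on the last cell of each row.
            if j < factor - i - 1:
                triangles.append(
                    (point(i + 1, j), point(i + 1, j + 1), point(i, j + 1))
                )
    return triangles

def area_2d(tri):
    """Unsigned area of a triangle given as three (x, y) points."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    return abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)) / 2.0

coarse = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
fine = tessellate_triangle(*coarse, factor=4)
print(len(fine), "triangles, total area", sum(area_2d(t) for t in fine))
```

The triangle count grows with the square of the factor while the application still stores only the three original vertices, so a small control mesh can expand into a very dense one on the GPU.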
This powerful rendering mechanism can both enhance geometry and increase performance. By using tessellation, the same level of detail can be rendered at 6 times the speed while saving more than 50% of video memory, not to mention the bandwidth saved by uploading significantly less geometry (measured on an 840-triangle model rendered at an LOD of 1,008,038 triangles, with and without the tessellation engine). Such a performance boost, in addition to the visual enhancement, can provide a significant advantage for any application that adopts tessellation.
The result of tessellation is deterministic, and therefore well adapted to many CAD situations. But the biggest gains can be seen in digital content creation. Digital content models are often large and can be difficult to render in real-time. With tessellation, significantly smaller model sizes can be used for similar levels of detail. Pre-visualization paths can also make use of tessellation to provide better looking images faster than was previously possible. The example below is a fly-by done with tessellation enabled, showing how tessellation can enhance a landscape in real-time.
AMD has also created a white paper detailing how to implement Catmull-Clark subdivision using the tessellation engine. The demo and white paper can be found here. You can also explore many of the other possibilities for using tessellation on OpenGL or DirectX.
FireUser.com is a community resource for visualization, 3D, video and engineering professionals to learn about the latest acceleration and display technologies, discuss support issues, as well as influence the features and direction of the FireGL and FirePro accelerator line.