I guess I missed the news that AMD bought gDEBugger back in Oct 2010. Today AMD announced the new AMD gDEBugger release which provides developers with the ability to debug OpenCL kernels, running on AMD GPUs, and step through their source code while examining kernel variables and data. This product, which is a plug-in designed to work with Microsoft Visual Studio, includes all of gDEBugger’s previous features and capabilities.
This may seem like esoteric technical news, but gDEBugger has been the leading tool for real-time code optimization with OpenGL developers, and now for OpenCL. This should dramatically speed the development of OpenCL applications.
Side-by-side video comparison of the AMD FirePro V7900 (with GeometryBoost) vs the Nvidia Quadro 4000 running the CATIA CATBench benchmark on identical HP Z600 workstations. Both cards have 2 GB GDDR5 frame buffers and are similarly priced.
The CATBench benchmark uses a number of different models ranging in size from an engine block to an entire nuclear submarine assembly. It performs a set number of pans, zooms and rotations in shaded plus edges graphics mode.
The video speaks for itself - basically a 2:1 performance advantage for the V7900 overall. Where the difference is really apparent is with the complex large models. That is in part GeometryBoost in action.
No, it’s not an application test for workstation graphics users, but it is cool and it is applicable. HotHardware tests two Radeon HD 6870 2GB consumer cards in CrossFire mode with a 5x1 Eyefinity portrait display arrangement - for gaming. From the review:
We set out to ascertain two things at the beginning of this project, to see how well a pair of Radeon HD 6870 cards scaled in CrossFire mode using recent drivers and to see how well the setup performed with a five-screen 5x1 Eyefinity configuration. Our tests showed excellent performance scaling (between 82% and 99%) in CrossFire mode with the applications we used and we achieved playable framerates at an effective resolution of 5400x1920 using Eyefinity 5x1, even in some very taxing games like Alien vs. Predator and Metro 2033, albeit with the quality settings dialed down a notch or two.
A pair of NVIDIA GeForce GTX 580 cards and a single large monitor, which is a setup we know many enthusiasts might currently have, costs around $1300 to $1500 currently. The five 22” Dell screens we used and two PowerColor Radeon HD 6870 2GB Eyefinity 6 Edition cards would be about $1700, which isn’t all that far off.
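The two numbers in that quote are easy to reproduce. Here is a rough sketch of the arithmetic: the 1080x1920 portrait panel size is inferred from the 5400x1920 total, the function names are made up for illustration, and the frame rates are hypothetical placeholders, not figures from the review.

```python
def eyefinity_resolution(panel_w, panel_h, cols, rows):
    """Effective desktop size for a grid of identical panels (bezels ignored)."""
    return panel_w * cols, panel_h * rows


def crossfire_scaling(single_fps, dual_fps):
    """Fractional gain from the second card: 1.0 would be perfect scaling."""
    return dual_fps / single_fps - 1.0


# Five 1920x1080 panels rotated to portrait (1080 wide, 1920 tall), 5x1 layout.
width, height = eyefinity_resolution(1080, 1920, cols=5, rows=1)
print(f"{width}x{height}")

# Hypothetical frame rates, only to show how a scaling percentage is derived.
gain = crossfire_scaling(single_fps=30.0, dual_fps=57.0)
print(f"scaling: {gain:.0%}")
```

A gain of 82% to 99%, as reported, means the second card adds nearly a full card's worth of frames.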
Below is a video capture of Total War: Shogun 2 running across 5 displays at a resolution of 5400x1920.
The AFDS is coming up June 13th and there will be a lot of focus on OpenCL developers and ISVs supporting OpenCL. Below are summaries of just a few of the talks from leading CAD/CAE ISVs as well as a video from AMD senior engineers.
Dassault Systemes Simulia and Acceleware, Abaqus - Accelerating SIMULIA’s Abaqus Solver on AMD GPUs
As part of the mechanical design process, engineers frequently use direct sparse solvers to solve linear systems that arise during the finite element modeling of mechanical systems. These solvers factorize small dense symmetric matrices using a generalized Cholesky decomposition (LDLT). Acceleware and Dassault Systemes have developed an integrated solution that performs the LDLT factorization on AMD GPUs using OpenCL within the Abaqus software package. This presentation introduces the components of a direct sparse solver and outlines the distribution of front sizes for typical models. We then present the factorization performance in both single and double precision. On the AMD FirePro V9800 GPU, we measure the performance at approximately 400 GFlops in single precision and 150 GFlops in double precision. To conclude, we compare AMD timings against examples run on an NVIDIA GPU.
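For readers who have not met the LDLT factorization mentioned in the abstract, here is a minimal CPU-side sketch in plain Python. It is only an illustration of the math on one small dense symmetric matrix, not the Acceleware/Abaqus OpenCL implementation; it does no pivoting and assumes every pivot is nonzero.

```python
def ldlt(a):
    """Factor a symmetric matrix a (list of lists) as L * D * L^T.

    Returns (L, d): L is unit lower triangular, d is the diagonal of D.
    No pivoting, so every d[j] is assumed to be nonzero.
    """
    n = len(a)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        # Diagonal entry: subtract contributions of the columns already done.
        d[j] = a[j][j] - sum(L[j][k] ** 2 * d[k] for k in range(j))
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k] * d[k] for k in range(j))
            L[i][j] = (a[i][j] - s) / d[j]
    return L, d


def reconstruct(L, d):
    """Multiply L * D * L^T back together to check the factorization."""
    n = len(d)
    return [[sum(L[i][k] * d[k] * L[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


A = [[4.0, 12.0, -16.0],
     [12.0, 37.0, -43.0],
     [-16.0, -43.0, 98.0]]
L, d = ldlt(A)
print(L)
print(d)
```

In a direct sparse solver this dense kernel runs many times on many small matrices, which is the part the talk describes moving to the GPU.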
Dassault Systemes - Leveraging Hybrid Computing: Challenges and Opportunities for the PLM Industry
Beyond simple visualization, GPUs are now widely used to accelerate numerical computation. Combining the capabilities of general purpose CPUs and dedicated GPUs in a single unit, the Fusion technology possesses unique potential in areas such as Solid and Surface modeling, Simulation, or Scene Graph processing. Dassault Systemes presents its vision of the problems faced in the PLM industry and how to use hybrid CPU/GPU computing in order to address them.
DEM Solutions, EDEM - Using the AMD FirePro V9800 to Accelerate EDEM Simulations
EDEM is an engineering simulation software platform powered by advanced DEM (Discrete Element Method) simulation technology. It is used in the design, prototyping, and optimization of equipment for handling and processing bulk particle materials such as aggregate ores, coal, grains, tablets, powder, and fibers in industries including mining and mineral processing, metals manufacturing, construction and agricultural machinery, and pharmaceuticals. DEM Solutions has developed a prototype OpenCL implementation of its EDEM simulation engine for the AMD FirePro V9800 that greatly accelerates EDEM compute speed compared to simulation on multi-core CPUs. The performance gains vary with different simulation models but speedups of up to 12 times against current quad-core CPUs were achieved. The prototype GPGPU accelerated solver is being further developed for implementation with new products.
OPTIS, RTLab and VRLab - Highly Parallel Computing in Physics-based Rendering
Light propagation algorithms need high performance parallel computing on either a CPU or a GPU. OPTIS, the world leader in light simulation software, discusses their experience in highly parallel computation and the expected benefits of the AMD Fusion APU for physics-based rendering applications.
Altair, HyperWorks RADIOSS - Altair RADIOSS Solver Porting Using an AMD GPU
This session presents the results of Altair’s experience in porting our Preconditioned Conjugate Gradient (PCG) algorithm from RADIOSS FEA to OpenCL code. To evaluate GPU potential, we focused on the RADIOSS iterative solver used for resolution of implicit linear and nonlinear problems. Using several GPU cards in parallel, along with the high scalability of PCG, we observed significant reductions in time-to-solution. In addition, this approach produced performance improvements that counteracted the inherent drawbacks of the iterative method to reach convergence. AMD’s evolution toward APUs opens new perspectives for both implicit and explicit methods of solving problems. At Altair we will continue to study the capability of such emerging technologies to provide breakthroughs in performance to customers.
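As a quick refresher on what a PCG solver does, here is a small sketch in plain Python. The abstract does not say which preconditioner RADIOSS uses, so the Jacobi (diagonal) preconditioner here is an assumption made for simplicity. The matrix must be symmetric positive definite.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def matvec(a, v):
    return [dot(row, v) for row in a]


def pcg(a, b, tol=1e-10, max_iter=1000):
    """Solve a x = b with Jacobi-preconditioned conjugate gradient.

    Returns (x, iterations). a is a dense list of lists here; a real
    FEA solver would use a sparse format instead.
    """
    n = len(b)
    inv_diag = [1.0 / a[i][i] for i in range(n)]   # Jacobi preconditioner
    x = [0.0] * n
    r = list(b)                                    # residual b - a x, x = 0
    z = [inv_diag[i] * r[i] for i in range(n)]     # preconditioned residual
    p = list(z)                                    # first search direction
    rz = dot(r, z)
    for it in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            return x, it
        ap = matvec(a, p)
        alpha = rz / dot(p, ap)                    # step length along p
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x, max_iter


x, iters = pcg([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
print(x, iters)
```

Nearly all the time goes into the matrix-vector product and the dot products, which are the operations that map well onto several GPUs.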
OpenFOAM - Multi-GPU Calculations in OpenFOAM
We present illustrative applications of the SpeedIt library in CFD simulations, implementing GPU compute in the OpenFOAM environment. We then show how to improve computational efficiency in very large linear systems by employing GPU-based iterative solvers with OpenFOAM. We will also show a series of examples showing how the code performs in real applications and Sparse Matrix-Vector Multiplication (SpMV) in OpenCL.
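SpMV is the workhorse inside iterative solvers like these. The sketch below shows the operation in plain Python using the common compressed sparse row (CSR) layout. The abstract does not say which storage format SpeedIt uses, so CSR is an assumption, and this is a serial reference rather than an OpenCL kernel.

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A x for a matrix A stored in CSR form.

    values  : nonzero entries, row by row
    col_idx : column index of each entry in values
    row_ptr : row_ptr[r] .. row_ptr[r + 1] is the slice of values for row r
    """
    y = []
    for row in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


# A = [[10,  0,  0],
#      [ 0, 20, 30],
#      [40,  0, 50]]
values = [10.0, 20.0, 30.0, 40.0, 50.0]
col_idx = [0, 1, 2, 0, 2]
row_ptr = [0, 1, 3, 5]
print(spmv_csr(values, col_idx, row_ptr, [1.0, 2.0, 3.0]))  # [10.0, 130.0, 190.0]
```

Each row's sum is independent of the others, so a GPU version can hand rows to separate work-items.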
If you haven’t already had enough of the great reviews of the FirePro V7900, 3D Professor adds another to the fold, but in his own unique way. One point he makes pretty clearly is what a significant upgrade this is from past generations, specifically the V7800 (e.g. the FirePro V7900 Cayman Pro GL graphics processor can process over 1450 million triangles per second, whereas the V7800 manages a meagre 700 million), as well as emphasizing price/performance.
Quick Takeaway: After many years of paying premiums for entry level high-end workstations we have solutions (in the FirePro V7900) that are affordable and have the power within to provide the end-users a substantial stable workstation that will last for some time to come. We have over the last few weeks completely stressed this card to its fullest without any complications. This professional graphics card has to offer plenty of room for expansion in whichever way it is utilized; as a home gamer, SoHo Workstation. Or, and more appropriate within the corporate market place as an upgrade to the standalone desktop unit as there is so much power within. The studios, CAD/CAM and DCC market has obtained an important injection of technology which will further advance their current systems and once more reiterating, a sound solid fast system for productivity output. Consequently and once more we have to reiterate the objectivity of price performance.
AMD PowerTune technology for professional graphics (in the new FirePro V5900 and FirePro V7900) helps deliver higher performance that is optimized to the thermal limits of the GPU by dynamically adjusting the clock during runtime based on an internally calculated GPU power assessment. It also improves the mechanism to deal with applications that would otherwise exceed the GPU's thermal design power (TDP). By dynamically managing the engine clock speeds (as opposed to fixed states of low, medium and high) based on calculations which determine the proximity of the GPU to its TDP limit, AMD PowerTune allows for the GPU to run at higher nominal clock speeds in the high state than otherwise possible.
What does this mean in practice?
For applications running in the highest power state, but still below the TDP limit, the GPU dynamically increases engine clock speeds to improve application performance. Most apps fall into this category.
For outlier applications which require TDP containment, the GPU is not necessarily throttled and forced into an intermediate or low power state through a thermal event. PowerTune maintains operation in the highest power state, but dials back on runtime power by modulating the high power state clock to keep the TDP range slightly below the absolute limit. Without PowerTune, applications which exceed the GPU TDP are forced to lower power states and pay a very steep performance penalty as a result of drastically reduced clock speeds and voltages.
In short, PowerTune technology maximizes TDP-constrained performance by enabling higher GPU clock speeds.
Importantly, PowerTune technology in the professional graphics line is optimized for the usage profiles of workstation applications rather than consumer applications. For example, workstation applications typically do a great deal of geometry processing, and significantly less pixel shading and texturing. PowerTune on the professional graphics cards is optimized for productivity, while on the consumer graphics cards, for gaming performance.
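To make the contrast between fixed power states and dynamic clock modulation concrete, here is a toy model in Python. It is not AMD's algorithm. It assumes power scales linearly with engine clock, and the clock states, wattages and function names are hypothetical numbers and labels chosen only for illustration.

```python
def fixed_state_clock(states_mhz, power_at_high_w, tdp_w):
    """Fixed states: step down to the first state whose power fits under TDP."""
    high = max(states_mhz)
    for clock in sorted(states_mhz, reverse=True):
        # Toy assumption: power is proportional to engine clock.
        if power_at_high_w * clock / high <= tdp_w:
            return clock
    return min(states_mhz)


def modulated_clock(high_clock_mhz, power_at_high_w, tdp_w):
    """Dynamic modulation: trim the high-state clock just enough to meet TDP."""
    if power_at_high_w <= tdp_w:
        return high_clock_mhz
    return high_clock_mhz * tdp_w / power_at_high_w


# Hypothetical card: three clock states, and an outlier application that
# would draw 165 W at the top clock against a 150 W TDP.
states = [300.0, 500.0, 725.0]
print(fixed_state_clock(states, power_at_high_w=165.0, tdp_w=150.0))
print(round(modulated_clock(725.0, power_at_high_w=165.0, tdp_w=150.0), 1))
```

In this toy case the fixed-state scheme falls all the way to the middle state, while modulation gives up only a small slice of the top clock. That gap is the "steep performance penalty" described above.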
GeometryBoost in the new FirePro V5900 and FirePro V7900 is a unique hardware capability that processes two primitives per clock cycle. Each graphics engine is assigned to its own shader engine, consisting of up to 10 SIMDs. The result is a doubling in the rate of primitive and vertex processing, as well as back/front culling rates and scan conversion setup. It also doubles early reject rates.
The new 8th generation tessellation engine improves performance up to 3X in both OpenGL 4.1 and DirectX 11.
What this means is incredibly fast geometry performance for professional applications, ensuring smoother handling of large, complex models.
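The triangle rates quoted in the 3D Professor review line up with the two-primitives-per-clock claim. A quick back-of-the-envelope check in Python, assuming peak triangle rate is simply primitives per clock times engine clock: the engine clocks below are derived from the quoted rates, not stated in this post, and one primitive per clock for the V7800 is inferred from the word "doubling".

```python
def peak_primitive_rate(clock_hz, primitives_per_clock):
    """Upper bound on primitives per second under the simple model above."""
    return clock_hz * primitives_per_clock


def implied_clock_hz(primitives_per_second, primitives_per_clock):
    """Engine clock that the quoted triangle rate would imply."""
    return primitives_per_second / primitives_per_clock


# Quoted rates: about 1450 million triangles/s (V7900), 700 million (V7800).
v7900_clock = implied_clock_hz(1450e6, primitives_per_clock=2)
v7800_clock = implied_clock_hz(700e6, primitives_per_clock=1)
print(v7900_clock / 1e6, "MHz")  # 725.0 MHz
print(v7800_clock / 1e6, "MHz")  # 700.0 MHz
```

So under this model most of the jump from 700 to 1450 million triangles per second comes from the second primitive per clock, not from clock speed.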
New to the FirePro V5900 and FirePro V7900 is support for DisplayPort 1.2, which increases the data rate to 5.4 Gbps per lane in High Bit Rate 2 (HBR2) mode, so you can drive a 4K x 2K display with 30 bits per pixel over a single cable. With AMD’s HD3D 3D stereoscopy standard and DisplayPort 1.2, 3D monitors can go beyond full HD while maintaining a true 120Hz refresh rate (more bandwidth than either HDMI 1.4a or DL-DVI, so you could, for example, play Blu-ray 3D in a window).
Another advantage is that, using a DisplayPort 1.2 Multi-Stream Transport (MST) hub, an Eyefinity-enabled card will be able to drive up to six HD resolution displays from just two DisplayPort outputs. So space-constrained designs (e.g. laptops) will no longer be Eyefinity-constrained.
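A rough bandwidth budget shows why both claims are plausible. The Python sketch below assumes a four-lane link with 8b/10b line coding, 3840x2160 for "4K x 2K", 60 Hz refresh and 24 bits per pixel for the HD displays, and it ignores blanking overhead, so treat the numbers as approximate.

```python
def link_payload_gbps(lanes=4, gbps_per_lane=5.4, efficiency=0.8):
    """Usable link bandwidth: raw lane rate times 8b/10b coding efficiency."""
    return lanes * gbps_per_lane * efficiency


def stream_gbps(width, height, bits_per_pixel, refresh_hz):
    """Pixel data rate for one display, ignoring blanking intervals."""
    return width * height * bits_per_pixel * refresh_hz / 1e9


budget = link_payload_gbps()

# One 4K x 2K display at 30 bits per pixel (60 Hz assumed).
four_k = stream_gbps(3840, 2160, 30, 60)
print(f"4K x 2K: {four_k:.2f} of {budget:.2f} Gbps")

# Three 1920x1080 displays sharing one output through an MST hub,
# which is what six displays from two outputs works out to.
three_hd = 3 * stream_gbps(1920, 1080, 24, 60)
print(f"3 x HD:  {three_hd:.2f} of {budget:.2f} Gbps")
```

Both cases fit within a single HBR2 link under these assumptions.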
The new AMD FirePro V7900 is based on the third generation of 40nm GPU (formerly codenamed Cayman) and features 1280 stream processors and 2GB GDDR5 memory. It is a single slot solution with four built-in DisplayPort 1.2 outputs and with the use of the included four active adapters, supports single link DVI displays out of the box. This allows it to drive 4 displays simultaneously (Eyefinity technology). It also includes a stereoscopic 3-pin mini-DIN (with included expansion bracket) and supports Framelock/Genlock using the ATI FirePro S400 synchronization module.
The card supports the new PowerTune power management technology for dynamic clock optimization, and adds GeometryBoost which provides 2X transform and backface culling and 3X tessellation performance in OpenGL and DX11. Drivers support OpenCL 1.1, CAD application-certified OpenGL 4.1, and DirectX 11. Additional professional graphics cards can be linked together using CrossFire Pro to enable CrossFire support for windowed applications, as well as enabling up to 12 simultaneous Eyefinity displays (think video walls and digital signage on the cheap).
Full review on HotHardware: “if you’re looking for a low power, multiple monitor solution for your 3D animation and rendering workloads, definitely check out the new FirePro V7900 and V5900 cards from AMD.”
Also see Icrontic
FireUser.com is a community resource for CAD, visualization, 3D, video and engineering professionals to learn about the latest acceleration and display technologies and news with a focus on the AMD FirePro workstation graphics line.