Last week HP unveiled (with much fanfare) their new Z-series workstation PC powered by Intel’s Nehalem Xeon processors. The rock-star moniker refers both to the zippy tech specs and to the pretty sexy exterior (for a professional workstation), which was designed by BMW Group Designworks. The Z800 can hold as much as 192GB of DDR3 memory, along with up to two Intel Xeon processors, with the 3.2GHz, quad-core W5580 being the top-of-the-line offering. HP claims that, when running at full power, the workstations can deliver performance increases of between 50 and 500 per cent, depending on the application. Obviously the graphics cards here make a huge difference, and HP is offering the ATI FirePro V7750, as well as the ATI FirePro V3700 and ATI FirePro V5700 graphics accelerators, as options. The new HP Z Workstations with FirePro graphics are already certified for more than 35 CAD/CAE and DCC apps such as SolidWorks 2009, 3ds Max 2009, and Adobe Premiere Pro CS4.
3DProfessor.org has just written up a pretty comprehensive review of the Intel Nehalem-EP processor. What makes this particularly interesting is that it benchmarks 64-bit system performance using both the FirePro V8700 and Quadro FX4800, testing in Max 2009, SolidWorks 2007, Cinebench, POV-Ray, and SPECviewperf 64-bit. The basic takeaway is that real-world applications are going to benefit hugely from the combination of Nehalem and pro graphics cards.
Desktop Engineering ran its own benchmarks on the FirePro V8700 (ultra-high-end card with 800 stream processors, 1GB GDDR5 frame buffer, stereo, 2 DisplayPorts), and compared it to the previous top-of-the-line FireGL V8600. They basically confirmed ATI’s own SPECviewperf benchmarks (refreshing to hear for performance numbers). Like the previous generation FireGL, the FirePro V8700 is CAD certified, has a 3-year repair/replacement warranty, and features AutoDetect (which optimizes the driver performance based on the user’s specific software application when running multiple programs simultaneously) and HDR. But the FirePro is smaller (though not small enough), significantly less power hungry, and slightly faster than the FireGL V8600/V8650, while costing significantly less (average US street price of $1,229).
Their conclusion: “A board like the ATI FirePro V8700 is likely overkill for midrange CAD users. But for those who demand the utmost in performance, this board delivers significant speed at an affordable price.” (If you are one of those midrange CAD users and feel left out, check out the FirePro V7750 at $899 retail.)
I was reading through the FirePro V7750 press release from AMD and noticed links to some YouTube “testimonial” videos at the bottom of the release. The first 3 are pretty dry (if you can sit through them, though, they do explain why you need higher and higher powered cards - it’s not just about having the latest toy). In any case, the fourth video, from Troublemaker Studios, is pretty cool visually and makes a compelling argument for using the new line of FirePro cards so that the technology doesn’t get in the way of the art.
OK - I stole the title directly from Engadget because it sums up the new ATI announcement for the FirePro V7750 pretty succinctly. The specs: 320 stream processing units, OpenGL 3 and Shader Model 4.1 support, a unified video decoder for accelerated H.264, AVC, VC-1 and MPEG-2 video formats, 1GB of GDDR3 frame-buffer memory, and a 30-bit display pipeline with two DisplayPort outputs and one Dual Link DVI port, which together generate a multi-monitor desktop more than 5000 pixels wide. Since it is part of the pro line of ATI cards, it has been certified for performance on all the big CAD/DCC applications. And the “showstopper” news: the price at $899.
Everyone needs more realism and everyone is making bigger and more complex models. So the high performance/great price deal is pretty obvious. What I also find pretty interesting is the support for Stream Computing on the card, and the accelerated video encoding (and I assume decoding). This card aims at DCC and visualization pros as much as at the CAD community. Pretty much a no-brainer if you are looking for top performance and don’t want to stress out over price.
As many predicted, AMD showed off OpenCL-accelerated Havok gaming physics at the 2009 Game Developers Conference. Specifically, the demo showed OpenCL-accelerated Havok Cloth, a runtime and toolset that lets developers add physically simulated cloth to their games. The GDC demo, duplicated in these official videos, shows the acceleration applied to realistic motion of garments like skirts, capes, shirts, trousers and coats, as well as other deformable items like hair.
OpenCL is the first cross-platform and open technology designed to exploit the horsepower in multi-core CPUs and GPUs (or other processors).
Khronos and the OpenGL ARB have done it! OpenGL 3.1 and GLSL 1.40 have been released on the 6-month schedule promised at SIGGRAPH 2008. As promised, most of the legacy features marked as deprecated have been removed. No more display lists. No more immediate mode rendering. No more fixed function pipeline. The cruft accumulated over the last 17 years has been cleaned up to create a simplified and performant 3D graphics API. OpenGL 3.1 really does match the current generation of programmable graphics devices.
In addition to removing deprecated functionality, OpenGL 3.1 adds a bunch of handy new features.
Uniform buffer objects
The first and biggest is support for uniform buffer objects. This new object allows a shader to group uniforms together into a block of uniform memory. New interfaces make updating groups of uniforms easier and much more efficient. These new buffers can also be shared between programs, reducing wasted memory usage and shader uniform-reload time.
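The grouping idea can be illustrated on the CPU side. What follows is a minimal Python sketch, not GL or driver code. It assumes the block uses the std140 layout that GLSL 1.40 defines for uniform blocks, and the member names (view_proj, light_dir, intensity, tint) are made up for illustration. It works out where each uniform lands inside the block and packs all of them into one contiguous byte buffer, the kind of payload an application could hand to a buffer object in a single update rather than setting each uniform separately.

```python
import struct

# std140 base alignment and size, in bytes, for a few GLSL types.
STD140 = {
    "float": (4, 4),
    "vec2": (8, 8),
    "vec3": (16, 12),
    "vec4": (16, 16),
    "mat4": (16, 64),  # stored as four vec4 columns
}


def layout_block(members):
    """Return ({name: byte offset}, total size) for a list of (name, type)."""
    offsets = {}
    cursor = 0
    for name, gl_type in members:
        align, size = STD140[gl_type]
        cursor = (cursor + align - 1) // align * align  # round up to alignment
        offsets[name] = cursor
        cursor += size
    # Assumption: pad the whole block to a 16-byte boundary, a conservative
    # choice; a real application would query the size from the driver.
    total = (cursor + 15) // 16 * 16
    return offsets, total


def pack_block(members, values):
    """Pack the float values of every member into one contiguous buffer."""
    offsets, total = layout_block(members)
    data = bytearray(total)
    for name, _ in members:
        floats = values[name]
        # Little-endian 32-bit floats (assumes a little-endian host).
        struct.pack_into("<%df" % len(floats), data, offsets[name], *floats)
    return bytes(data)


# Hypothetical uniform block shared by several shader programs.
MEMBERS = [
    ("view_proj", "mat4"),
    ("light_dir", "vec3"),
    ("intensity", "float"),
    ("tint", "vec4"),
]
```

Note how the float slots into the 4 bytes left over after the vec3: the layout rules, not the declaration order alone, decide the offsets, which is why every program sharing the buffer has to agree on one layout.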
Texture buffer objects and Copy buffers
Texture buffer objects were also introduced into core OpenGL 3.1. This new texture type allows generic buffers to be attached to a texture as a 1D array. Now general buffer data is accessible to shaders through new fetch functions. Additionally, a copy mechanism (GL_EXT_copy_buffer) that allows for direct accelerated buffer-to-buffer copies has been added. This extends the benefits of generic buffer objects and creates interesting opportunities with multithreaded load/execute algorithms.
Instanced rendering
Instanced rendering has been added to core, allowing apps to draw multiple copies of similar objects without incurring system bandwidth costs (I mentioned this inadvertently in an earlier post).
Other features
Primitive restart, SNORM textures and several other new features were also added.
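For readers who have not met primitive restart, here is a small CPU-side illustration in Python. It is a sketch of the concept only, not GL code. A reserved index value in the index buffer ends the current triangle strip and starts a new one, so several strips can be drawn from one index stream. The restart value 0xFFFF and the function names are assumptions chosen for the example.

```python
RESTART = 0xFFFF  # assumed restart marker; the application picks this value


def split_strips(indices, restart=RESTART):
    """Split one index stream into separate strips at each restart marker."""
    strips = [[]]
    for index in indices:
        if index == restart:
            strips.append([])  # marker is consumed, a new strip begins
        else:
            strips[-1].append(index)
    return [strip for strip in strips if strip]


def strip_to_triangles(strip):
    """Expand a triangle strip into triangles, keeping a consistent winding."""
    triangles = []
    for i in range(len(strip) - 2):
        a, b, c = strip[i], strip[i + 1], strip[i + 2]
        # Every second triangle swaps its first two vertices so that all
        # triangles face the same way.
        triangles.append((a, b, c) if i % 2 == 0 else (b, a, c))
    return triangles


def assemble(indices, restart=RESTART):
    """Triangles produced by a strip draw with primitive restart enabled."""
    triangles = []
    for strip in split_strips(indices, restart):
        triangles.extend(strip_to_triangles(strip))
    return triangles
```

Without the marker, the same result needs either one draw call per strip or degenerate triangles stitched between strips.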
OpenGL is continuing to march forward with progressive revisions bringing new functionality to 3D developers. AMD will follow with full driver support for OpenGL 3.1 shortly.
AMD has released Workstation developer tools including OpenGL tessellation example source code for the AMD_vertex_shader_tessellator extension as well as sample OpenGL source code that illustrates how to create and display 10-bit per component surfaces. This code is specifically intended for developers building products targeting professional users of FirePro workstation cards. AMD has also released the AMD Display Library SDK for Windows and Linux that provides access to graphics hardware information on FirePro and Radeon products.
I have seen many concerns, questions, comments, etc. on the progress of OpenGL over the last few years. Some are positive, some not. But what really seems to be lacking is clarity on the how and why.
First, OpenGL is a public and open specification that is available for all to read and program for. It is written by a consortium of companies and individuals within the Khronos Group who have an interest in 3D graphics. Generally speaking, considerable time and effort is volunteered by these groups to help promote an open 3D graphics specification. Modern OpenGL specifications are not the creation of one company or individual. OpenGL is intended to work on a considerable selection of platforms, making code very portable between operating systems. OpenGL is intended to work on many levels of hardware, allowing the same application to run on various graphics chips. I'd argue OpenGL-related specs (OpenGL ES/SC/etc.) are some of the most portable APIs for 3D graphics.
So how exactly does an OpenGL specification get written? The ARB looks at hardware capabilities and corresponding OpenGL extensions to determine candidates for inclusion in a future version. Input on feature candidates is gathered from industry players and individuals. Features are then prioritized by general usefulness and future compatibility, and then added to the core language. Simplified, a new feature goes from Idea --> Vendor/ARB Extension --> Core OpenGL Specification, assuming hardware is capable.
Why use this process? One reason is that extensions provide an efficient proof-of-concept. The idea has been tried. Its weaknesses and strengths are known and can be addressed as the extension is promoted to core. Another reason is that free-form design by committee is cumbersome and error-prone at best. The side-effect of this process is that the core specification may lag behind some hardware and vendor extensions. But the final core specification is more stable and predictable.
A really common question I have seen is "I see feature XYZ in the DX spec, why isn't it in OpenGL?" There are several reasons. First and most importantly, OpenGL is a unique, separate, and independent specification. It does not, and should not, follow any other API. Second, OpenGL makes decisions that are in the best interest of the industry, not of one company.
How can you help?
Given that new OpenGL features are likely to come from extensions, contact IHVs and Khronos members with constructive feedback on what you need and what types of features you think will be helpful. Use vendor-specific extensions, and let us know which ones are most helpful and would make good candidates for core features. OpenGL is our 3D graphics API. We can work together to make it the best API for the future of graphics.
This video presents a side-by-side comparison of identical PCs running the Binomial Tree pricing model used in financial transactions to determine the value of options. The example represents a portfolio of 8000 options.
The image on the right, running 55 times faster than the image on the left, is the RapidMind-enabled version using FireStream technology. What makes this especially interesting is the recent release of the FirePro 2450, which specifically targets the financial markets and traders. Combine that new card, 4 monitors, and Stream technology, and you have some heavy-duty capabilities for playing in the financial markets.
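To give a feel for the workload in the video, here is a minimal Python sketch of a binomial tree option pricer. It assumes the common Cox-Ross-Rubinstein parameterization; the demo's actual model variant, parameters, and code are not given here, so treat the function name and arguments as illustrative. The tree is built to the option's expiry, and values are then rolled back one step at a time to today.

```python
import math


def binomial_price(spot, strike, rate, vol, maturity, steps,
                   kind="call", american=False):
    """Price an option on a Cox-Ross-Rubinstein binomial tree."""
    dt = maturity / steps
    up = math.exp(vol * math.sqrt(dt))       # up-move factor per step
    down = 1.0 / up                          # down-move factor per step
    prob = (math.exp(rate * dt) - down) / (up - down)  # risk-neutral up prob.
    discount = math.exp(-rate * dt)
    sign = 1.0 if kind == "call" else -1.0

    # Payoffs at expiry; node j has seen j up-moves and (steps - j) down-moves.
    values = [
        max(sign * (spot * up ** j * down ** (steps - j) - strike), 0.0)
        for j in range(steps + 1)
    ]

    # Roll back through the tree, one time step at a time.
    for i in range(steps - 1, -1, -1):
        for j in range(i + 1):
            value = discount * (prob * values[j + 1] + (1.0 - prob) * values[j])
            if american:
                # Early exercise: keep the larger of holding and exercising.
                exercise = sign * (spot * up ** j * down ** (i - j) - strike)
                value = max(value, exercise)
            values[j] = value
    return values[0]
```

Each option in a portfolio is priced independently of the others, and every node within one time step can be updated independently as well, which is the kind of data parallelism that stream processors are built for.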
FireUser.com is a community resource for visualization, 3D, video and engineering professionals to learn about the latest acceleration and display technologies, discuss support issues, as well as influence the features and direction of the FireGL and FirePro accelerator line.