Cuda

Tips & tricks, working results, technical support

Re: Cuda

Postby jorismak » Thu Apr 17, 2014 1:46 am

Tried flicking through my AITB EAR Rooms collection, doing 20-second renders. The render speed depends on the program loaded; short tails are faster than longer ones, of course.

I have a Core i7-860 @ 3.5 GHz and a GTX 760.
Overall, if I do a render with a 'Nebula Reverb' instance (4096 DSP buffer) and then a render with my 'Cuda' instance (same 4096 DSP buffer, just with 'OPT FREQD' set to '15CUDA ac 2Mono'), I see that the Cuda instance is consistently about '1x realtime' faster.

So if my Nebula Reverb renders at 2.4x realtime, my Cuda instance will do 3.4x realtime. If my CPU version does 0.7x realtime, my Cuda instance will do 1.6x - 1.7x realtime.
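
For reference, a quick sketch of what those factors mean in wall-clock time. The speeds are the ones quoted above; the rest is simple arithmetic, and it shows why a constant "+1x realtime" gain matters more the slower the baseline is:

```python
# Back-of-envelope check of the reported speedups (numbers from the post,
# not measured here): a constant "+1x realtime" gain saves proportionally
# more wall-clock time the slower the baseline render is.
def render_seconds(audio_seconds, realtime_factor):
    """Wall-clock time to render a clip at a given x-realtime speed."""
    return audio_seconds / realtime_factor

clip = 20.0  # the 20-second test renders mentioned above

# Long-tail case: 2.4x (CPU) vs 3.4x (CUDA)
cpu, cuda = render_seconds(clip, 2.4), render_seconds(clip, 3.4)
print(f"2.4x -> 3.4x: {cpu:.1f}s vs {cuda:.1f}s ({1 - cuda / cpu:.0%} less time)")

# Heavy case: 0.7x (CPU) vs 1.6x (CUDA)
cpu, cuda = render_seconds(clip, 0.7), render_seconds(clip, 1.6)
print(f"0.7x -> 1.6x: {cpu:.1f}s vs {cuda:.1f}s ({1 - cuda / cpu:.0%} less time)")
```
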

Does that sound OK, or did I screw something up somewhere? The TestCuda64.exe program said the device opened OK, and then just sits there.
jorismak
Member
 
Posts: 344
Joined: Fri Nov 16, 2012 4:49 am

Re: Cuda

Postby giancarlo » Thu Apr 17, 2014 6:08 am

No, it's OK... we see similar improvements here.
giancarlo
Founder
 
Posts: 9210
Joined: Mon Sep 21, 2009 10:40 pm
Location: Italy

Re: Cuda

Postby vicnestE » Thu Apr 17, 2014 8:56 am

On realtime playback (using Timp's Marsh Spring MED 6-kernel spring reverb and some other reverb as well), the CUDA one actually has more CPU usage at the start of playback. Both Nebula DSP buffers are 8192, and once stable the CPU usage is the same (both take 0.02%).

Offline rendering with CUDA is faster in a more noticeable way.
vicnestE
Expert
 
Posts: 740
Joined: Sat Mar 27, 2010 5:11 am

Re: Cuda

Postby hwasser » Thu Apr 24, 2014 10:19 am

So how is CUDA working out? I use a GPU IR reverb right now which is really nice because it doesn't hit the CPU.

I'm thinking about getting a Nvidia graphics card.
hwasser
User Level VIII
 
Posts: 88
Joined: Sun Oct 27, 2013 12:44 am

Re: Cuda

Postby jorismak » Thu Apr 24, 2014 11:36 am

Since it only works (kinda) well on reverb programs, and most of what I use Nebula for is console/preamp/tape/EQ, I can't really use it.

And the gains I get with reverbs are not enough to keep working on it. The difference between 2.4x realtime and 3.4x realtime is an improvement, don't get me wrong, and if it's the difference between the project running realtime or not, I'm happy to have it :P. But it's not the _WOW_ factor some people hope it to be.

I also seem to have a problem where I can only put one Cuda instance in a project. If I load a second instance (on the same track or another track), it starts stuttering. If I load a second regular 'Nebula Reverb' instance instead, it works fine.

Mind you, this is with a regular Nvidia GeForce card. I can't try Quadros or Teslas, the real compute-power cards :).

Re: Cuda

Postby hwasser » Thu Apr 24, 2014 12:36 pm

jorismak wrote: Since it only works (kinda) well on reverb programs, and most of what I use Nebula for is console/preamp/tape/EQ, I can't really use it.


Oh okay, that's sad, because I already use a GPU IR reverb that works extremely well: six instances running right now, using 15% GPU power and 10% graphics memory.

I run Nebula for console/tape/EQ myself.

Re: Cuda

Postby jorismak » Thu Apr 24, 2014 12:57 pm

Do you really need CUDA for IR reverbs? It's nice to know how much GPU power and memory it costs, but how much CPU does it save you? Keeping the CUDA card fed with work requires CPU power too :), and I think IR convolution is so computationally inexpensive these days that it isn't a problem.

I have a Reaper test running here: 44.1 kHz, 256-sample ASIO buffer, with 64 channels each playing a simple mono wave file.

Playing those 64 tracks without any effects costs around 25-26% CPU power, with my Core i7 laptop CPU clocking to 2.5 GHz.

Now, if I put a single reverb on a track, with a _stereo_ IR of 4.91 s, and keep adding it to the 2nd track, the 3rd track, etc. until my CPU is full, I can get to track _38_. That's 38 instances of a _long_ stereo reverb tail, with a total of 64 tracks playing, on an (admittedly powerful) laptop CPU.
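
Taking the numbers above at face value, the implied per-instance cost works out like this (a rough sketch: it assumes CPU load is additive and that "full" means roughly 100%, which real schedulers only approximate):

```python
# Rough per-instance cost implied by the Reaper test above.
# Assumptions: load is additive, and the CPU "filling up" means ~100%.
idle_load = 0.25    # 64 dry tracks, as reported
full_load = 1.00    # assumed ceiling when the CPU is "full"
instances = 38      # stereo 4.91 s IR reverbs added before it filled

per_instance = (full_load - idle_load) / instances
print(f"~{per_instance:.1%} of the CPU per long stereo IR instance")
```

That is, about 2% of the CPU per long stereo IR instance, which is consistent with the claim that plain convolution is cheap.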

I'm thinking CUDA would add more CPU overhead to IR convolution than the extra instances it would allow are worth.

It's kinda the same with short-tail Nebula programs. If you have very short kernels (as in, 100 ms or lower), it takes more effort to load the data onto the card, do the calculation and transfer the result back to the CPU than it does to just calculate on the CPU :).

Re: Cuda

Postby hwasser » Thu Apr 24, 2014 3:07 pm

Some long IR reverbs are pretty CPU-heavy. The IR reverb I use doesn't use CUDA; it uses OpenCL, which is more effective.

GPU Impulse Reverb VST is an effect plugin that calculates convolution reverbs by using your graphics card as DSP for realtime reverb calculation with a CPU usage of near 0%.

Low latency, only one ASIO block size
Supports Stereo & True Stereo processing (quad-channel impulse responses)
Supports 16, 24 and 32 bit responses
Supports as many instances as your GPU can handle
2-Band EQ
Adjustable Attack/Release & Length Envelope


For example, OpenCL vs CUDA (Nvidia only) performance:
http://www.extremetech.com/wp-content/u ... Price1.png

Re: Cuda

Postby giancarlo » Thu Apr 24, 2014 7:18 pm

Nebula is not a GPU IR reverb, and our algorithm can't be executed completely in CUDA. CUDA is not the issue, nor is OpenCL. Plus, Nebula executes hundreds of IRs in realtime.

Re: Cuda

Postby jorismak » Fri Apr 25, 2014 11:27 am

We weren't comparing Nebula to (GPU) IR. We just got a bit offtopic :P.

That graph you posted says _nothing_ about the general performance of OpenCL vs CUDA. Trust me, they are about the same. The algorithm you're running, and the hardware you're running it on, have far more to do with performance than the SDK you use :). AMD cards are optimized for OpenCL, and they simply have stronger compute performance these days (it was the other way around in the GTX 5xx and HD 6xxx era). On retail GeForce cards, Nvidia traded compute performance for game performance for a while, which makes perfect sense, as that's what those cards are used for :). If you want compute performance from an Nvidia card, get a Quadro or Tesla :P.

Anyway, I'm still surprised that (you guys think) OpenCL- or CUDA-accelerated convolution is worthwhile these days. Nebula I can understand: as Giancarlo says, Nebula is constantly executing hundreds of IRs in a single instance, blending between them, layering them... and more :P.

Are there free or demo OpenCL/CUDA IR reverbs I can try? If my laptop can run 38 of them with a _long_ stereo tail, to the point where the track count has more of an effect than the IR convolution, I don't think I'd gain a lot by switching to GPU acceleration.
