The Crew
Great, glad you successfully fixed the Lava! It's fun when it lines up, isn't it?


For the shader here, the o2.w doesn't exist in this case. In the declaration section, it's "out float2 o2" and hence only has o2.xy as possibilities.

That means we can't use it in the prime directive without something else. We can always tweak the formula, but there are a lot of variants.

Are you certain that o2 is the right one? The o0 is SV_POSITION0, the actual location of the vertex, and that's the one we'd usually try to fix. The o1 is a 4-parameter texture output, and a likely candidate for alignment as well.

What happens when you blank out each of the pieces?

Acer H5360 (1280x720@120Hz) - ASUS VG248QE with GSync mod - 3D Vision 1&2 - Driver 372.54
GTX 970 - i5-4670K@4.2GHz - 12GB RAM - Win7x64+evilKB2670838 - 4 Disk X25 RAID
SAGER NP9870-S - GTX 980 - i7-6700K - Win10 Pro 1607
Latest 3Dmigoto Release
Bo3b's School for ShaderHackers

#76
Posted 03/18/2015 01:59 PM   
bo3b said:it's "out float2 o2" and hence only has o2.xy as possibilities

Huh, that's good to know. That's probably why I've had trouble when experimenting with some things.

#77
Posted 03/18/2015 02:53 PM   
Thanks bo3b. I'm making good progress. Once I used o0.w, the prime directive worked a treat and fixed the map. Unfortunately, the minimap gets 'fixed' too, and now looks wrong.

But I've been able to identify the various HUD and map parts and give them static depths at will. It seems impossible to make both the map and the HUD look good at the same time, so now I'm going to have a go at using a key toggle to switch between 2 UI presets.

VolnaPC.com - Tips, tweaks, performance comparisons (PhysX card, SLI scaling, etc)

#78
Posted 03/19/2015 11:19 AM   
I recently added a feature to 3Dmigoto that may help (make sure you are running 0.99.49 or later) - try adding a section like this to the d3dx.ini:
[ShaderOverrideMapRoads]
Hash=765c3e296da52533
depth_filter = depth_inactive

Replace the hash with the one from the shader and see if that helps - it means your adjustment will only be used when the depth buffer is not active.

If you find that it adjusts the map instead of the roads, change depth_inactive to depth_active. If you find that one way adjusts both and the other way doesn't adjust either, then this technique won't work.


There's also a new feature to only apply a vertex shader fix when it is used in conjunction with a specific pixel shader. It's still a bit limited (only one pixel shader can be specified, no way to do different things for different pixel shaders), but it works like this (from an early attempt at fixing god rays in Far Cry 4):
[ShaderOverrideSunShafts]
; Vertex shader responsible for direct sun shafts + other celestial objects.
; Only apply adjustment when used with the pixel shader for sun shafts.
Hash = 6831f29e59799e2f
partner = af7b880f07630615

2x Geforce GTX 980 in SLI provided by NVIDIA, i7 6700K 4GHz CPU, Asus 27" VG278HE 144Hz 3D Monitor, BenQ W1070 3D Projector, 120" Elite Screens YardMaster 2, 32GB Corsair DDR4 3200MHz RAM, Samsung 850 EVO 500G SSD, 4x750GB HDD in RAID5, Gigabyte Z170X-Gaming 7 Motherboard, Corsair Obsidian 750D Airflow Edition Case, Corsair RM850i PSU, HTC Vive, Win 10 64bit

Alienware M17x R4 w/ built in 3D, Intel i7 3740QM, GTX 680m 2GB, 16GB DDR3 1600MHz RAM, Win7 64bit, 1TB SSD, 1TB HDD, 750GB HDD

Pre-release 3D fixes, shadertool.py and other goodies: http://github.com/DarkStarSword/3d-fixes
Support me on Patreon: https://www.patreon.com/DarkStarSword or PayPal: https://www.paypal.me/DarkStarSword

#79
Posted 03/19/2015 11:40 AM   
DarkStarSword said:
There's also a new feature to only apply a vertex shader fix when it is used in conjunction with a specific pixel shader. It's still a bit limited (only one pixel shader can be specified, no way to do different things for different pixel shaders), but it works like this (from an early attempt at fixing god rays in Far Cry 4):
[ShaderOverrideSunShafts]
; Vertex shader responsible for direct sun shafts + other celestial objects.
; Only apply adjustment when used with the pixel shader for sun shafts.
Hash = 6831f29e59799e2f
partner = af7b880f07630615


Exactly what I needed. Solved the problem beautifully. Thanks!


#80
Posted 03/20/2015 02:52 AM   
I'd like to have something be at a static depth normally, but go to infinity on keypress. Infinity is separation * convergence, right?

I tried doing

[Key1]
Key = XB_LEFT_THUMB
type = toggle
x = convergence




so that in the shader:

o0.x += separation * hud_depth_user_defined;

my static constant (which happens to be -45000) gets replaced by whatever convergence is. But it seems that doesn't work - you can't pass a variable name from the ini file like that.


So, I guess I could do something like

[Key1]
Key = XB_LEFT_THUMB
type = toggle
x = 123456.789


and then inside the shaderfile, I could do an if/then statement (if x = 123456.789, then x = convergence).

Does that make sense? What is the syntax for an if/then statement?



Or can I replace the "convergence" in x = convergence with a certain number that will always lead to infinity (without diverging the eyes)?
EDIT: actually, x=0 seems to do the trick


#81
Posted 03/20/2015 11:47 AM   
No, the infinity is only partly related to convergence.

You'll want separation set to 100, and it might look like this:

[Key1]
Key = XB_LEFT_THUMB
type = toggle
separation = 100

That sets global separation to 100, which is unlikely to be what you want. For a shader specific operation, you can use the ShaderOverride for a specific shader and set separation=100 there.

If that's not quite right, you can pass in a magic number of some form via iniParams, like .x and look for that in the shader and decide what to do.

Like your latter statement. But, you'll want to do the stereo correction formula where you just add the maximum separation, not correct for separation.

out.x += 100 * (out.w - convergence)

We sometimes skip the convergence there, and make it just

out.x += 100 * (out.w)

That approach is common for fixing skyboxes at wrong depth, where we want to push it to infinity.


If statements as defined by the language:

https://msdn.microsoft.com/en-us/library/windows/desktop/bb509610(v=vs.85).aspx


Example here:

...
if (iniParams.x == 123456.789)
{
out.x += 100 * (out.w);
}
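
For anyone who wants to check the logic outside a shader, here is a minimal sketch of the same magic-number test in plain C. It is not 3Dmigoto code: adjust_x, its parameters and MAGIC_TO_INFINITY are made-up names, and ini_x merely stands in for iniParams.x. One detail that is specific to this C version: the constant carries an f suffix so that both sides of the comparison are floats. Without it, C would compare against a double and the test would never match.

```c
/* Hypothetical magic value; it must match whatever the key binding writes to x. */
#define MAGIC_TO_INFINITY 123456.789f

/* Stand-in for the shader fragment above: x and w play the role of the
 * output position's x and w, ini_x plays the role of iniParams.x. */
static float adjust_x(float x, float w, float ini_x)
{
    if (ini_x == MAGIC_TO_INFINITY) {
        x += 100.0f * w;   /* push to infinity, as in the example above */
    }
    return x;              /* any other value of ini_x leaves x untouched */
}
```

Calling adjust_x(1.0f, 2.0f, 123456.789f) takes the branch, while adjust_x(1.0f, 2.0f, 0.0f) returns x unchanged.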


#82
Posted 03/20/2015 12:15 PM   
Volnaiskra said:I'd like to have something be at a static depth normally, but go to infinity on keypress. Infinity is separation * convergence, right?

I just wanted to clarify where that separation * convergence to move something to infinity comes from.

As Bo3b has rightly said, to move something to infinity it needs to be adjusted by this:
X' = X + separation * W

(Side note: the hidden perspective divide by W will cancel out that multiply by W, so in fact it is only being adjusted by separation once we get into screen coordinates)
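
The side note can be checked with a few lines of arithmetic. This is only a sketch in plain C, not shader code; the function names and sample values are invented for illustration, and it assumes nothing more than the usual convention that screen X is clip-space X divided by W.

```c
/* Screen-space X after the perspective divide, for a vertex whose
 * clip-space X was adjusted with X' = X + separation * W. */
static float screen_x_at_infinity(float x, float w, float separation)
{
    float x_adjusted = x + separation * w;  /* the clip-space adjustment   */
    return x_adjusted / w;                  /* the "hidden" divide by W    */
}

/* How far the adjustment moved the vertex on screen:
 * (X + separation * W) / W - X / W = separation, whatever W is. */
static float screen_shift(float x, float w, float separation)
{
    return screen_x_at_infinity(x, w, separation) - x / w;
}
```

For example, screen_shift(2.0f, 4.0f, 0.5f) comes out as 0.5, the separation itself, and changing w does not change that result beyond rounding.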

However, if the driver is already applying a stereo correction so that it was not already drawn at screen depth, you need to first remove the standard stereo correction, then apply the infinite adjustment instead:
X' = X - separation * (W - convergence) + separation * W


Now apply your high school algebra to simplify the equation:
X' = X - separation * (W - convergence) + separation * W
X' = X - (separation * W) + (separation * convergence) + (separation * W)
X' = X + separation * convergence


So, if the thing you are moving was at screen depth, use separation * W. If the thing you are moving was at some other depth, use separation * convergence.
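
The algebra above can also be confirmed numerically. The sketch below is plain C rather than shader code, and every function name in it is made up for illustration. It checks two things: that the long and the simplified forms agree, and that once the driver adds its own standard correction on top of the simplified form, the net result is X + separation * W.

```c
/* Standard stereo correction, as the driver would apply it in clip space. */
static float driver_correction(float x, float w, float separation, float convergence)
{
    return x + separation * (w - convergence);
}

/* Long form: remove the standard correction, then apply the infinity adjustment. */
static float to_infinity_long(float x, float w, float separation, float convergence)
{
    return x - separation * (w - convergence) + separation * w;
}

/* Simplified form: the W terms cancel, leaving separation * convergence. */
static float to_infinity_short(float x, float separation, float convergence)
{
    return x + separation * convergence;
}
```

With x = 1, w = 8, separation = 0.5 and convergence = 2, both forms give 2, and passing that through driver_correction gives 5, which is x + separation * w.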


#83
Posted 03/20/2015 06:59 PM   
Thanks for the explanations, guys. I appreciate the time you spend explaining things so well.

I understand that with programming in general, if/then statements are relatively CPU expensive and should be avoided when possible. Is that a genuine concern with 3Dmigoto?

If so, then would I be better off doing something like this?


x=0

[Key1]
Key = XB_LEFT_THUMB
type = toggle
x = 1


...
float infinitySeparation = 100 * x
float adjustedSeparation = separation - (separation * x) + infinitySeparation


out.x += adjustedSeparation * (convergence - out.w)
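
As a check on the arithmetic only (not on whether it is worth doing), here is the blend from the snippet above written as plain C next to the if-based version it would replace. The function names are invented for illustration, and x is assumed to be exactly 0 or 1, as set by the key toggle.

```c
/* Branch-free blend: returns separation when x is 0 and 100 when x is 1. */
static float blended_separation(float separation, float x)
{
    float infinity_separation = 100.0f * x;
    return separation - (separation * x) + infinity_separation;
}

/* The same choice written with an if statement. */
static float branched_separation(float separation, float x)
{
    if (x == 1.0f)
        return 100.0f;
    return separation;
}
```

For x = 0 and x = 1 the two functions return the same value, so the choice between them comes down to readability.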


#84
Posted 03/21/2015 12:00 AM   
Volnaiskra said:Thanks for the explanations, guys. I appreciate the time you spend explaining things so well.

I understand that with programming in general, if/then statements are relatively CPU expensive and should be avoided when possible. Is that a genuine concern with 3Dmigoto?

If so, then would I be better off doing something like this?


x=0

[Key1]
Key = XB_LEFT_THUMB
type = toggle
x = 1


...
float infinitySeparation = 100 * x
float adjustedSeparation = separation - (separation * x) + infinitySeparation


out.x += adjustedSeparation * (convergence - out.w)


I am a bit curious... since when did the "if/then/else" block become computationally expensive? Unless you are talking about writing those directly in ASM (which is basically a cmp and a jne/je, etc. plus a ret or jmp back), it's not...
Then again, of course it is more expensive than a+b=c since you execute more instructions, but besides that I don't really understand why they are "expensive"... It's like the debate over which is more expensive, the for() or the while() loop... (FYI the for() makes 3 operations, while in a while() loop you can get away with just two operations... but mostly you still do three)...
I am asking because maybe you know something that I've missed/don't know? ;))

1x Palit RTX 2080Ti Pro Gaming OC(watercooled and overclocked to hell)
3x 3D Vision Ready Asus VG278HE monitors (5760x1080).
Intel i9 9900K (overclocked to 5.3 and watercooled ofc).
Asus Maximus XI Hero Mobo.
16 GB Team Group T-Force Dark Pro DDR4 @ 3600.
Lots of Disks:
- Raid 0 - 256GB Sandisk Extreme SSD.
- Raid 0 - WD Black - 2TB.
- SanDisk SSD PLUS 480 GB.
- Intel 760p 256GB M.2 PCIe NVMe SSD.
Creative Sound Blaster Z.
Windows 10 x64 Pro.
etc


My website with my fixes and OpenGL to 3D Vision wrapper:
http://3dsurroundgaming.com

(If you like some of the stuff that I've done and want to donate something, you can do it with PayPal at tavyhome@gmail.com)

#85
Posted 03/21/2015 01:30 AM   
As the saying goes "premature optimisation is the root of all evil". Write your code as clean and easy to understand as possible first, and only consider micro-optimisations if you have actually measured the performance and identified them as a significant bottleneck. You're only hitting one shader out of hundreds or thousands used by the game - a microscopic performance impact there is not going to make any difference overall.

It is true that if/then/else can mess with the pipelining on the CPU & GPU (GPUs are traditionally a bit worse affected than CPUs), but the hardware already does something called "branch prediction" to try to guess ahead of time which way the if/then/else is going to go, which eliminates any performance impact when it predicts correctly (if it predicts incorrectly, it pays the cost of however many instructions it had to discard from its pipeline, but even that won't be very significant to the overall performance unless it happens a *lot*).


#86
Posted 03/21/2015 04:50 AM   
DarkStarSword said:... and only consider micro-optimisations if you have actually measured the performance and identified them as a significant bottleneck.


I don't have the ability or know-how to do that, which is why I asked you guys. Just trying to make myself aware of best practices before I get too deep in.

Some people have measured the impact of stuff like this in the game engine I use for Spryke, and some things had a large impact. For example, storing data as strings instead of numerals is about 15 times more expensive.

I don't know much about shaders, so I didn't know what sort of impact things like if/else would have here. If you're sure it will be negligible then I won't worry about it


#87
Posted 03/21/2015 06:56 AM   
I'll second DarkStarSword there: write the code exactly the way you want for maximum clarity and readability. In today's computers there is simply no way that anyone will guess a bottleneck up front.

Only make modifications once you have data to back it up. For 99.99% of stuff you'll never need to change it. In general, the only possible way to introduce performance problems is to create loops. Nested loops in something like a pixel shader would be worth a minute of consideration. But even for something like that, I'd still measure it first.

http://en.wikipedia.org/wiki/Program_optimization#When_to_optimize

Your example of the game engine is interesting. I'll bet no one had the slightest idea that strings were that expensive up front, and were busily optimizing their IF statements. Meanwhile the elephant in the room was missed until someone profiled it.

For that string problem, if I were the engine writer, I'd seriously try to improve that code path. That's one where, now that they know there is a problem, it's time to make the code less clear and improve the performance.


#88
Posted 03/21/2015 09:44 AM   
Volnaiskra said:

Some people have measured the impact of stuff like this in the game engine I use for Spryke, and some things had a large impact. For example, storing data as strings instead of numerals is about 15 times more expensive.



Strings are bad in general! Of course they are more human readable than numbers, and that is why programmers tend to use them. But they should be avoided if at all possible, unless you really need to do real-time manipulations on strings, WHICH IS VERY RARE (over the course of the program execution).
String concatenation is one of the most expensive operations you can make.

For example, it is ok to strcat, strcmp, etc. one time during an initialization phase or once in a blue moon; however, if you want to do this every single frame it will become computationally expensive.

The other thing that you need to avoid, and here I agree with bo3b and DarkStarSword, is NESTED for(), while(), do() loops, as much as possible, as they are heavy (CPU wise).

In our case, for example, if for some reason you need to invert a matrix in a Pixel Shader, it is best to do it in the Vertex Shader and pass the result to the Pixel Shader (unless there is a good reason for doing it every single time in a PS). A Vertex Shader is called fewer times than a PS, so theoretically the application will become less expensive.
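
The same idea can be shown on the CPU with a toy C sketch: do the costly step once and reuse the result, instead of repeating it for every pixel. Everything here is hypothetical. The expensive() function is only a stand-in for something like a matrix inverse, and the call counter exists purely to make the difference visible.

```c
/* Counts how many times the "expensive" step runs. */
static int expensive_calls = 0;

/* Stand-in for something costly, e.g. inverting a matrix. */
static float expensive(float v)
{
    expensive_calls++;
    return v * 0.5f;
}

/* Recompute for every pixel: one expensive call per pixel. */
static float shade_per_pixel(float vertex_value, int pixel_count)
{
    float sum = 0.0f;
    for (int i = 0; i < pixel_count; i++)
        sum += expensive(vertex_value);
    return sum;
}

/* Compute once "per vertex" and reuse it: one expensive call in total. */
static float shade_hoisted(float vertex_value, int pixel_count)
{
    float precomputed = expensive(vertex_value);
    float sum = 0.0f;
    for (int i = 0; i < pixel_count; i++)
        sum += precomputed;
    return sum;
}
```

Both functions return the same sum, but for 1000 pixels the first makes 1000 expensive calls and the second makes one. Whether that matters in a real shader is still something to measure first.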

But, first write your code the way you want it to be and only then if you have performance issues OPTIMIZE the code. Also don't go crazy and OVER Optimize stuff as that can lead to other bad things (code base could become less readable, less maintainable).
Most of the time you will notice that your CPU has enough JUICE to actually run your code without any problem. So unless you work in embedded systems, or you want your APP to use as little CPU as possible for some reason, you will notice that 99% of the time you will not need to go crazy with optimizations.

If you have Visual Studio Ultimate you can easily "track" the performance using the built-in tools. You will notice that most of the time it is a code block that is computationally expensive rather than one instruction in particular, and that block tends to be called every x milliseconds.

Regarding the IF/ELSE statement you can quickly and cleanly optimize it like this:

Instead of:

bool FuncA(void)
{
    if (a == b)
        return true;
    else
        return false;
}


Use something like:

bool FuncA(void)
{
    bool ret = false;

    if (a == b)
        ret = true;

    return ret;
}


This is just a quick example of little things you can do basically.


#89
Posted 03/21/2015 10:37 AM   
Actually, not to be mean, but your example of the If statement there is nearly a textbook example of what NOT to do.

If statements in the CPU are essentially free today, there is no point in optimizing them away.

More to the point, you have wildly obfuscated that code, and I'll bet $100 the compiler generates exactly the same code.

Maybe we disagree, but I think the second example is nearly unreadable, the first is clean and clear. I would *strongly* recommend going with the former.
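
To make the comparison concrete, here are the two variants side by side as a self-contained C sketch. The only changes are assumptions made so that it compiles on its own: the functions are renamed, and a and b are passed as int parameters instead of being read from elsewhere. The sketch shows that the two behave identically. It does not show what machine code a particular compiler emits, which is something you would have to inspect yourself.

```c
#include <stdbool.h>

/* First variant: return straight from the if/else. */
static bool func_a_direct(int a, int b)
{
    if (a == b)
        return true;
    else
        return false;
}

/* Second variant: set a flag, then return it. */
static bool func_a_flag(int a, int b)
{
    bool ret = false;

    if (a == b)
        ret = true;

    return ret;
}
```

For any pair of inputs the two functions return the same value, so the choice between them is about readability only.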


Concentrate on the principle that nothing you can do while you are typing has anything whatsoever to do with performance. Keep it vanilla. Do not prematurely optimize. Just don't.


#90
Posted 03/21/2015 12:04 PM   