Bo3b's School For Shaderhackers
helifax said: Hi Bo3b, I was wondering if there is something I can do to decrease the time it takes the wrapper to dump the shaders?

In any Frostbite 3 game it takes 15 minutes to load a game... I am currently only using the "export_hlsl=2" option.
Normally it dumps around 20k shaders when a level loads...
If you can think of anything I can do to decrease this insane time, please let me know!

Cheers!

I haven't profiled this part of the code, but I don't think there is anything to be done there, at least not easily. The code does an unbelievable amount of string management, and I'm sure it could be improved. As a general rule though, I put most of my effort into fixing Decompiler bugs. When I get back to that code, I'll do a quick profile to see if there is any obvious quick fix, but it'll be a while before I can get to it.

It is worth taking a look at the log though. No need for full debug, but check in particular for shaders that are causing exceptions. Exceptions are super expensive in Windows apps, so if they are happening a lot, that would be bad.

Also, if you don't need the CS/DS/HS, it might be worth commenting out that section in the code, as I know those tend to decompile poorly. The project is set up for VS2013. If you are set up for doing one-off builds, you could edit the DecompileHLSL.cpp file.

Let me know if there seem to be a lot of errors in the log, or any obvious wasted effort.
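
For the log check, a rough Python sketch along these lines might help. It only assumes the log is plain text and that problem lines contain the word "exception"; the sample lines below are made up for illustration and are not real wrapper output, so adjust the keyword to whatever your log actually shows.

```python
from collections import Counter

def count_exception_lines(lines, keyword="exception"):
    """Count lines containing the keyword (case-insensitive), grouped by exact text."""
    hits = Counter()
    for line in lines:
        if keyword in line.lower():
            hits[line.strip()] += 1
    return hits

# Hypothetical log lines, only to show the shape of the result.
sample_log = [
    "shader dumped ok",
    "Exception while decompiling shader",
    "shader dumped ok",
    "Exception while decompiling shader",
]

# Print the most frequent offenders first.
for text, count in count_exception_lines(sample_log).most_common(10):
    print(count, text)
```

To run it on a real log, pass an open file object instead of `sample_log`. If one message shows up thousands of times, that is probably where the load time is going.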

Acer H5360 (1280x720@120Hz) - ASUS VG248QE with GSync mod - 3D Vision 1&2 - Driver 372.54
GTX 970 - i5-4670K@4.2GHz - 12GB RAM - Win7x64+evilKB2670838 - 4 Disk X25 RAID
SAGER NP9870-S - GTX 980 - i7-6700K - Win10 Pro 1607
Latest 3Dmigoto Release
Bo3b's School for ShaderHackers

Posted 10/25/2016 10:25 AM   
bo3b said:
helifax said: Hi Bo3b, I was wondering if there is something I can do to decrease the time it takes the wrapper to dump the shaders?

In any Frostbite 3 game it takes 15 minutes to load a game... I am currently only using the "export_hlsl=2" option.
Normally it dumps around 20k shaders when a level loads...
If you can think of anything I can do to decrease this insane time, please let me know!

Cheers!

I haven't profiled this part of the code, but I don't think there is anything to be done there, at least not easily. The code does an unbelievable amount of string management, and I'm sure it could be improved. As a general rule though, I put most of my effort into fixing Decompiler bugs. When I get back to that code, I'll do a quick profile to see if there is any obvious quick fix, but it'll be a while before I can get to it.

It is worth taking a look at the log though. No need for full debug, but check in particular for shaders that are causing exceptions. Exceptions are super expensive in Windows apps, so if they are happening a lot, that would be bad.

Also, if you don't need the CS/DS/HS, it might be worth commenting out that section in the code, as I know those tend to decompile poorly. The project is set up for VS2013. If you are set up for doing one-off builds, you could edit the DecompileHLSL.cpp file.

Let me know if there seem to be a lot of errors in the log, or any obvious wasted effort.


Thx Bo3b,

I did look in the log and saw a couple of exceptions, after which I realised that the HLSL had decompiling issues, so I switched to ASM shaders.
The dumping is now instant (as there is no string parsing involved), so this issue is "solved" in this case ;)
I can definitely see where the complication and time consumption come from though ^_^

Big thank you again for all the help!!!

(I am writing some scripts for the Frostbite 3 engine, for future use, and having to filter through 100-200k shaders is definitely something that can be done manually ;) But switching to ASM solves this issue ;) So, I'll stick with it ^_^)

1x Palit RTX 2080Ti Pro Gaming OC(watercooled and overclocked to hell)
3x 3D Vision Ready Asus VG278HE monitors (5760x1080).
Intel i9 9900K (overclocked to 5.3 and watercooled ofc).
Asus Maximus XI Hero Mobo.
16 GB Team Group T-Force Dark Pro DDR4 @ 3600.
Lots of Disks:
- Raid 0 - 256GB Sandisk Extreme SSD.
- Raid 0 - WD Black - 2TB.
- SanDisk SSD PLUS 480 GB.
- Intel 760p 256GB M.2 PCIe NVMe SSD.
Creative Sound Blaster Z.
Windows 10 x64 Pro.
etc


My website with my fixes and OpenGL to 3D Vision wrapper:
http://3dsurroundgaming.com

(If you like some of the stuff that I've done and want to donate something, you can do it with PayPal at tavyhome@gmail.com)

Posted 10/25/2016 11:43 AM   
helifax said:
@mx-2:
- Thanks for that code. Didn't try it yet, but it definitely makes sense;) Big thanks for your reply!!!

I tested my inverse matrix code and fixed some bugs. If you (or anybody else) would like to try my code in a fix, the updated version can be found on GitHub: https://github.com/mx-2/3d-fix/blob/master/_tools_/inverseMatrix.asm
helifax said:
1) The HLSL for Matrix Inversion works ;) but 3DMigoto just decides to make a low beep when I put in the HASH of the Compute shader. So it doesn't work with the game. It doesn't like that HASH for some reason, no matter what I do.
Thus, I had to put the matrix inverse in the shader code :)

2) The CS not working IS NOT A DRIVER ISSUE. Actually it is the CS shaders that need fixing! I haven't made a FULL fix for them, but I see where things go wrong! Believe it or not, the driver is actually working as it should ;) It only affects the LEFT eye. What we know is that for the Left eye we say "(-1) * separation". I expect that the CS doesn't like the NEGATIVE value of the position and just discards it or does weird things with it!

@DHR:
I managed to hack it to some degree, but it is not a proper fix ;)
Sadly, I don't know much about Compute shaders in 3D Vision. I know DSS is the expert, as he always helped me with them before. If you want, we can try to see what is wrong, but without a proper understanding I don't think we can come up with the true formula ;)

I see you made an inverse matrix in ASM... the important thing is that it is working (injected or direct code). The CS also has clipping in the left eye that needs fixing... with more time I will take a look at whether the CS needs to be fixed in another spot.

MY WEB

Helix Mod - Making 3D Better

My 3D Screenshot Gallery

Like my fixes? you can donate to Paypal: dhr.donation@gmail.com

Posted 10/25/2016 10:47 PM   
I added the code snippets to the wiki for future reference. Thanks to mx-2 and Helifax for sharing the code.

http://wiki.bo3b.net/index.php?title=Canonical_Stereo_Code


I included the preferred approach, which is to use DarkStarSword's matrix inversion compute shader. This is preferred because it only runs once per frame, not every vertex, or worse, every pixel.

If you can, move any inversion code out of Pixel Shaders. Running inversion code every pixel is costly, and not generally necessary.

If you are using HelixMod, it has built-in CPU-based matrix inversions, which run once per frame, and those should be used there. http://wiki.bo3b.net/index.php?title=HelixMod_Feature_List#InverseMatrix.2C_InverseMatrix1
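
To put rough numbers on the once-per-frame vs per-vertex vs per-pixel point, here is a small Python sketch. The function name and the example figures (a 1920x1080 full-screen pass drawn as a 4-vertex quad, rendered once per eye, no overdraw) are illustrative assumptions, not measurements from any game.

```python
def inversions_per_frame(width, height, vertex_count, stereo=True):
    """Estimate how many times a matrix inversion runs per frame,
    depending on where the inversion code lives."""
    eyes = 2 if stereo else 1  # assumes the pass is drawn once per eye in 3D
    return {
        "per_frame": 1,                      # compute shader or CPU-side inversion
        "per_vertex": vertex_count * eyes,   # inversion inside the vertex shader
        "per_pixel": width * height * eyes,  # inversion inside the pixel shader
    }

counts = inversions_per_frame(1920, 1080, vertex_count=4)
for where, n in counts.items():
    print(f"{where}: {n}")
```

Under those assumptions the pixel shader version runs the inversion millions of times per frame, while the other two run it a handful of times, which is why the placement can matter when GPU bound.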


Posted 10/28/2016 01:15 AM   
bo3b said: If you can, move any inversion code out of Pixel Shaders. Running inversion code every pixel is costly, and not generally necessary.

Interesting. So let's say I need to do an inversion of a VPM in a pixel shader (fairly common for shadows): would it be worthwhile for me to actually include the matrix.hlsl shader in the VS, perform the inversion there, and then pass the inverted matrix through to the PS through some texcoord outputs? Or does this only apply if we're not using the custom shader, and are actually taking the long route of calculating the inversion inside the shader?

3D Gaming Rig: CPU: i7 7700K @ 4.9Ghz | Mobo: Asus Maximus Hero VIII | RAM: Corsair Dominator 16GB | GPU: 2 x GTX 1080 Ti SLI | 3xSSDs for OS and Apps, 2 x HDD's for 11GB storage | PSU: Seasonic X-1250 M2| Case: Corsair C70 | Cooling: Corsair H115i Hydro cooler | Displays: Asus PG278QR, BenQ XL2420TX & BenQ HT1075 | OS: Windows 10 Pro + Windows 7 dual boot

Like my fixes? Donations can be made to: www.paypal.me/DShanz or rshannonca@gmail.com
Like electronic music? Check out: www.soundcloud.com/dj-ryan-king

Posted 10/28/2016 02:14 AM   
I've tried to fix the skybox in the original Deus Ex, as well as just disabling some fake lights.

However, there are only about 5 VS and 5 PS. So when I toggle through them, the skybox shader also selects geometry that is not in the skybox. (This is with HelixMod.)

Is this because it's a DX7 or 8 game that's converted to DX9 using GMDX?

Also, what is a good game to learn to fix with 3Dmigoto DX11? I basically just want to use someone's fix as a tool to see how they fixed the game while I refix it.

If that makes sense.

Thanks.

I'm ishiki, forum screwed up my name.

9900K @5.0 GHZ, 16GBDDR4@4233MHZ, 2080 Ti

Posted 10/28/2016 02:49 AM   
DJ-RK said:
bo3b said: If you can, move any inversion code out of Pixel Shaders. Running inversion code every pixel is costly, and not generally necessary.

Interesting. So let's say I need to do an inversion of a VPM in a pixel shader (fairly common for shadows): would it be worthwhile for me to actually include the matrix.hlsl shader in the VS, perform the inversion there, and then pass the inverted matrix through to the PS through some texcoord outputs? Or does this only apply if we're not using the custom shader, and are actually taking the long route of calculating the inversion inside the shader?

As always, without actually measuring it, we can't say for sure if it makes a difference. If we are CPU bound for example, then this would not matter at all. But, if we are in a scenario where we expect to be GPU bound, it might matter. Even in that case though, it's going to depend upon what the game is doing as to whether this matters.

However, we know that in at least some scenarios it will impact frame rate, so unless there is a compelling reason, we would want to move this code out of that inner loop of pixels.

So, putting the inversion in the VS and passing it as a texcoord would be better, because that is only run for every vertex, not every pixel. Probably, unless for some bizarre game reason there are more vertices than pixels.


The way to determine whether this matters is to check your performance both with the fix active, and without. If there is no measurable impact on frame-rate, then it doesn't matter. Keep everything as close to the same as possible, just comment out the fixing code, but keep the shader active, or tie it to a hot-key to see an on/off live.

Something close you can try that is super easy would be to use the F9 'show original', which will disable all the fixes while held down. If that showed an impact, it would be worth digging further.
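
If you log frame times for both cases, a rough Python sketch like this can turn them into a comparison. The function name and the sample numbers are hypothetical; use whatever frame-time capture tool you already have and feed in your own measurements.

```python
from statistics import mean

def fps_impact(frame_ms_fix_on, frame_ms_fix_off):
    """Return (fps with fix, fps without fix, percent of fps lost to the fix)."""
    fps_on = 1000.0 / mean(frame_ms_fix_on)
    fps_off = 1000.0 / mean(frame_ms_fix_off)
    loss_percent = (fps_off - fps_on) / fps_off * 100.0
    return fps_on, fps_off, loss_percent

# Made-up frame times in milliseconds, captured in the same scene.
fix_on = [20.0, 20.0, 20.0]
fix_off = [16.0, 16.0, 16.0]

fps_on, fps_off, loss = fps_impact(fix_on, fix_off)
print(f"fix on: {fps_on:.1f} fps, fix off: {fps_off:.1f} fps, loss: {loss:.1f}%")
```

If the loss is within the run-to-run noise of your capture, the fix is not measurably costing anything and there is no reason to restructure it.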


Posted 10/28/2016 11:35 PM   
ishiki said: I've tried to fix the skybox in the original Deus Ex, as well as just disabling some fake lights.

However, there are only about 5 VS and 5 PS. So when I toggle through them, the skybox shader also selects geometry that is not in the skybox. (This is with HelixMod.)

Is this because it's a DX7 or 8 game that's converted to DX9 using GMDX?

Also, what is a good game to learn to fix with 3Dmigoto DX11? I basically just want to use someone's fix as a tool to see how they fixed the game while I refix it.

If that makes sense.

Thanks.

That's not too surprising for an older game like original Deus Ex. Having a single shader that does multiple jobs happens a lot with older games. I don't think this is related to the conversion using GMDX, but it's possible.

Maybe take a look at the HelixMod feature to make alternate shaders: http://wiki.bo3b.net/index.php?title=HelixMod_Feature_List#Overriding_an_Individual_Instance_of_a_Shader

There may be a way for you to separate those pieces out to different shaders, so you can fix just the skybox.


For DX11 example games, there are a lot. Take a look at the two big repositories of game fixes:

3Dmigoto: https://github.com/bo3b/3Dmigoto

DarkStarSword: https://github.com/DarkStarSword/3d-fixes



DarkStarSword's fixes tend to be better commented, and the GitHub check-in comments can also be very helpful. The flip side is that DarkStarSword is also a genius, so his fixes tend to do really unbelievably complex things that are hard to understand.

Skimming the list of games there, the two I'd recommend starting with are these complete fixes, which don't include much in the way of advanced features and other weird game complexity.


https://github.com/bo3b/3Dmigoto/tree/master/Alien

https://github.com/bo3b/3Dmigoto/tree/master/Mordor


Incomplete, but more straightforward.

https://github.com/DarkStarSword/3d-fixes/tree/master/ABZU


More complicated, using texture overrides, but fairly clear:

https://github.com/DarkStarSword/3d-fixes/tree/master/Mad%20Max


Posted 10/28/2016 11:48 PM   
Hi all,

while working on my Risen 3 fix, I noticed that ambient occlusion shadows look slightly different in each eye when looking at the shadowed surface from a non-perpendicular angle. However, they are basically at the correct depth, so modifying the shaders makes it worse. This results in some highlighted or "floating" shadows.

I also saw this issue in the Witcher 3 (with fix) HBAO+ shadows, so I believe it is unfixable. Am I correct? If yes, can somebody please explain the reason for this to me?
mx-2 said:Hi all,

while working on my Risen 3 fix, I noticed that ambient occlusion shadows look slightly different in each eye when looking at the shadowed surface from a non-perpendicular angle. However, they are basically at the correct depth, so modifying the shaders makes it worse. This results in some highlighted or "floating" shadows.

I also saw this issue in the Witcher 3 (with fix) HBAO+ shadows, so I believe it is unfixable. Am I correct? If yes, can somebody please explain the reason for this to me?


Read this thread: https://forums.geforce.com/default/topic/897529/3d-hbao-normal-map-artefact-fix
Which will lead you here: https://github.com/bo3b/3Dmigoto/commit/6fcb2354435b866d3697b588c982d3f44c0ec552

It seems very hard to do. I don't know if the same concept can be applied to SSAO. It certainly would help my Dark Souls 2 and Skyrim SE fixes.


By the way, I need help here with Skyrim Special Edition, in case someone that can help didn't read that thread:

https://forums.geforce.com/default/topic/969381/3d-vision/the-elder-scrolls-v-skyrim-special-edition/post/5008074/#5008074

CPU: Intel Core i7 7700K @ 4.9GHz
Motherboard: Gigabyte Aorus GA-Z270X-Gaming 5
RAM: GSKILL Ripjaws Z 16GB 3866MHz CL18
GPU: MSI GeForce RTX 2080Ti Gaming X Trio
Monitor: Asus PG278QR
Speakers: Logitech Z506
Donations account: masterotakusuko@gmail.com

Posted 10/30/2016 06:28 PM   
masterotaku said:
Read this thread: https://forums.geforce.com/default/topic/897529/3d-hbao-normal-map-artefact-fix
Which will lead you here: https://github.com/bo3b/3Dmigoto/commit/6fcb2354435b866d3697b588c982d3f44c0ec552

Thank you for the link, masterotaku. I found the correct shader, which looks similar to the linked GitHub shader, and tried to fix it with DarkStarSword's pattern. Based on the shader headers, I guess that this game uses SSAO, not HBAO+.

The first correction works and improves it a bit, but then the shader multiplies something with a 3x3 view matrix and the fixes don't work for the other texture samples. Sometimes they need a positive and sometimes a negative fix to improve it a bit, but it does not result in a complete fix. I tried different things, like multiplying the fix with the matrix as well, but without success.

Does anybody have any ideas? It also seems that the fix needs a distance dependency, but I'm not sure about this. This effect may be caused by the erroneous correction.

PS83BFC6D9:
[code]
//
// Generated by Microsoft (R) HLSL Shader Compiler 9.29.952.3111
//
// Parameters:
//
//   float2 DepthTextureSize;
//   float SSAORadiusAdjust;
//
//
// Registers:
//
//   Name             Reg   Size
//   ---------------- ----- ----
//   SSAORadiusAdjust c0       1
//   DepthTextureSize c1       1
//
//
// Default values:
//
//   SSAORadiusAdjust
//     c0   = { 5000, 0, 0, 0 };
//
//   DepthTextureSize
//     c1   = { 0, 0, 0, 0 };
//
// preshader
// rcp c3.x, c0.x
// rcp r0.x, c1.x
// mul c4.x, r0.x, c1.y
// approximately 3 instructions used
//
// Generated by Microsoft (R) HLSL Shader Compiler 9.29.952.3111
//
// Parameters:
//
//   sampler2D DepthTextureAO_ss;
//   sampler2D DepthTexture_ss;
//   sampler2D DizzerTexture_ss;
//   float2 NearFar;
//   sampler2D NormalTexture_ss;
//   float SSAODepthRange;
//   float SSAOIntensity;
//   float SSAORadius;
//   float4x4 View;
//
//
// Registers:
//
//   Name              Reg   Size
//   ----------------- ----- ----
//   View              c0       3
//   NearFar           c5       1
//   SSAOIntensity     c6       1
//   SSAORadius        c7       1
//   SSAODepthRange    c8       1
//   DepthTexture_ss   s0       1
//   NormalTexture_ss  s1       1
//   DepthTextureAO_ss s2       1
//   DizzerTexture_ss  s3       1
//
//
// Default values:
//
//   View
//     c0   = { 0, 0, 0, 0 };
//     c1   = { 0, 0, 0, 0 };
//     c2   = { 0, 0, 0, 0 };
//
//   NearFar
//     c5   = { 0, 0, 0, 0 };
//
//   SSAOIntensity
//     c6   = { 0, 0, 0, 0 };
//
//   SSAORadius
//     c7   = { 400, 0, 0, 0 };
//
//   SSAODepthRange
//     c8   = { 50000, 0, 0, 0 };
//
// SSAO
    ps_3_0
    def c9, 2, -1, 1, 0
    def c10, 0.0833333358, 0, -0.533333361, 0.533333361
    def c11, 65536, 256, 1, 0
    def c12, -1.13183296, 1.13183296, 0, 0.00390625
    def c13, -0.316227764, 0.316227764, 0.282958239, -0.282958239
    def c14, -0.30949223, 0.30949223, -0.948683321, 0.948683321
    def c15, -0.400000006, 0.400000006, 0.773730576, -0.773730576
    def c16, 0.257247865, -0.257247865, -0.428746462, 0.428746462
    def c17, -0.0915737078, 0.0915737078, -0.320507973, 0.320507973
    def c200, 1000000, 0, 0.0625, 1 // 0.000001
    dcl_texcoord v0.xy // screen pos, correct
    dcl_texcoord1 v1.xy // ??
    dcl_2d s0
    dcl_2d s1
    dcl_2d s2
    dcl_2d s3
    dcl_2d s13
    texld_pp r0, v1, s3 // Dizzer, looks ok
    mad_pp r0, r0, c9.x, c9.y
    texld r1, v0, s0 // "standard" shadow mask
    mul r1.y, r1.x, c5.y
    mul r1.z, r1.y, c3.x
    mov_sat r1.z, r1.z
    mul r1.z, r1.z, c7.x
    mov_pp r2.z, c9.z
    mad r1.w, r1.y, c3.x, r2.z
    rcp r1.y, r1.y
    mul_pp r1.z, r1.w, r1.z
    mul r1.y, r1.y, r1.z
    rcp r1.z, r1.z
    mul r1.z, r1.z, c8.x
    mul_pp r0, r0, r1.y
    mul r3, r0.ywyw, c16.xxyy
    mul_pp r4, r0.xzxz, c4.x
    mad r3, r4.zwzw, c16.zzww, r3
    add r3, r3, v0.xyxy
    texld r5, r3, s2 // Sample AO depth (W) buffer
    texld r6, r3.zwzw, s2
    mad r3, r3, c9.x, c9.y
    mov r5.z, r6.x
    add r6.xy, -r1.x, r5.xzzw

    texld r31, c200.z, s13
    mul r28.x, r5.z, c5.y
    add r31.w, -r31.y, r28.x
    mul r31.w, r31.w, r31.x
    mul r31.w, r31.w, c200.w
    add r6.x, r6.x, -r31.w // negative correction -> a bit better

    mad_sat r0.xz, r6.xyyw, r1.z, c9.x
    mad r7, v0.xyxy, c9.x, c9.y
    mul r7, r1.x, r7
    mad_pp r3, r3.zwxy, r5.zzxx, -r7.zwxy
    mov_pp r6.zw, r3
    nrm_pp r5.xyz, r6.zwxw
    mov r3.z, r6.y
    nrm_pp r6.xyz, r3
    texld r3, v0, s1 // Sample normals
    mad_pp r2.xyw, r3.xyzz, c9.x, c9.y
    cmp_pp r1.y, -r2.w, c9.w, c9.z
    cmp_pp r1.w, r2.w, -c9.w, -c9.z
    add_pp r1.y, r1.w, r1.y
    dp2add_pp r1.w, r2, r2, c9.w
    add_pp r2.xy, r2, r2
    add_pp r3.xy, r1.w, c9.zyzw
    mul_pp r1.y, r1.y, r3.y
    rcp r1.w, r3.x
    mul_pp r3.z, r1.w, r1.y
    mul_pp r3.xy, r1.w, r2

    // View matrix, coordinate system change ??
    dp3 r1.y, r3, c1
    mov r8.y, -r1.y
    dp3 r8.x, r3, c0
    dp3 r8.z, r3, c2
    dp3_sat r2.y, r8, r6
    dp3_sat r2.x, r8, r5
    mul r1.yw, r0.xzzx, r2.xyzx
    mad r2.xy, r2.yxzw, -r0.zxzw, r2
    mad r0.xz, r0, r2.xyyw, r1.yyww
    dp2add r0.x, r0.xzzw, c10.x, c10.y
    mul r3, r0.ywyw, c17.xxyy
    mad r3, r4.zwzw, c17.zzww, r3
    add r3, r3, v0.xyxy
    texld r5, r3, s2
    texld r6, r3.zwzw, s2
    mad r3, r3, c9.x, c9.y
    mov r5.z, r6.x
    add r6.xy, -r1.x, r5.xzzw

    // Below fixes dont work very well:
    // Some need positive correction, some need negative correction,
    // some need no correction for best (but not completely fixed) results
    texld r31, c200.z, s13
    mul r28.x, r5.z, c5.y
    add r31.w, -r31.y, r28.x
    mul r31.w, r31.w, r31.x
    mul r31.w, r31.w, c200.w
    add r6.x, r6.x, -r31.w

    mad_pp r3, r3, r5.xxzz, -r7
    mad_sat r1.yw, r6.xxzy, r1.z, c9.x
    mov r5.z, r6.y
    mov_pp r5.xy, r3.zwzw
    mov_pp r6.zw, r3.xyxy
    nrm_pp r3.xyz, r6.zwxw
    dp3_sat r2.x, r8, r3
    nrm_pp r3.xyz, r5
    dp3_sat r2.y, r8, r3
    mul r3.xy, r1.wyzw, r2.yxzw
    mad r2.xy, r2.yxzw, -r1.wyzw, r2
    mad r1.yw, r1, r2.xxzy, r3.xxzy
    dp2add r0.x, r1.ywzw, c10.x, r0.x
    mul r3, r0.ywyw, c10.zzww
    mad r3, r4.zwzw, c15.xxyy, r3
    add r3, r3, v0.xyxy
    texld r5, r3, s2
    texld r6, r3.zwzw, s2
    mad r3, r3, c9.x, c9.y
    mov r5.z, r6.x
    add r6.xy, -r1.x, r5.xzzw

    texld r31, c200.z, s13
    mul r28.x, r5.z, c5.y
    add r31.w, -r31.y, r28.x
    mul r31.w, r31.w, r31.x
    mul r31.w, r31.w, c200.w
    add r6.x, r6.x, r31.w

    mad_pp r3, r3, r5.xxzz, -r7
    mad_sat r1.yw, r6.xxzy, r1.z, c9.x
    mov r5.z, r6.y
    mov_pp r5.xy, r3.zwzw
    mov_pp r6.zw, r3.xyxy
    nrm_pp r3.xyz, r6.zwxw
    dp3_sat r2.x, r8, r3
    nrm_pp r3.xyz, r5
    dp3_sat r2.y, r8, r3
    mul r3.xy, r1.wyzw, r2.yxzw
    mad r2.xy, r2.yxzw, -r1.wyzw, r2
    mad r1.yw, r1, r2.xxzy, r3.xxzy
    dp2add r0.x, r1.ywzw, c10.x, r0.x
    mul r3, r0.ywyw, c15.zzww
    mad r3, r4.zwzw, c14.xxyy, r3
    add r3, r3, v0.xyxy
    mad r5, r3, c9.x, c9.y
    texld r6, r3, s2
    texld r3, r3.zwzw, s2
    mov r6.z, r3.x
    mad_pp r3, r5.zwxy, r6.zzxx, -r7.zwxy
    add r5.xy, -r1.x, r6.xzzw

    texld r31, c200.z, s13
    mul r28.x, r6.z, c5.y
    add r31.w, -r31.y, r28.x
    mul r31.w, r31.w, r31.x
    mul r31.w, r31.w, c200.w
    add r5.x, r5.x, -r31.w

    mov_pp r5.zw, r3
    nrm_pp r6.xyz, r5.zwxw
    dp3_sat r2.x, r8, r6
    mov r3.z, r5.y
    mad_sat r1.yw, r5.xxzy, r1.z, c9.x
    nrm_pp r5.xyz, r3
    dp3_sat r2.y, r8, r5
    mul r3.xy, r1.wyzw, r2.yxzw
    mad r2.xy, r2.yxzw, -r1.wyzw, r2
    mad r1.yw, r1, r2.xxzy, r3.xxzy
    dp2add r0.x, r1.ywzw, c10.x, r0.x
    mul r3, r0.ywyw, c14.zzww
    mul r5, r0.ywyw, c13.zzww
    mad r5, r4, c12.xxyy, r5
    mad r3, r4.zwzw, c13.xxyy, r3
    add r3, r3, v0.xyxy
    add r4, r5, v0.xyxy
    mad r5, r3, c9.x, c9.y
    texld r6, r3, s2
    texld r3, r3.zwzw, s2
    mov r6.z, r3.x
    mad_pp r3, r5.zwxy, r6.zzxx, -r7.zwxy
    add r5.xy, -r1.x, r6.xzzw

    texld r31, c200.z, s13
    mul r28.x, r6.z, c5.y
    add r31.w, -r31.y, r28.x
    mul r31.w, r31.w, r31.x
    mul r31.w, r31.w, c200.w
    add r5.x, r5.x, r31.w

    mov_pp r5.zw, r3
    nrm_pp r6.xyz, r5.zwxw
    dp3_sat r2.x, r8, r6
    mov r3.z, r5.y
    mad_sat r0.yz, r5.xxyw, r1.z, c9.x
    nrm_pp r5.xyz, r3
    dp3_sat r2.y, r8, r5
    mul r1.yw, r0.xzzy, r2.xyzx
    mad r2.xy, r2.yxzw, -r0.zyzw, r2
    mad r0.yz, r0, r2.xxyw, r1.xyww
    dp2add r0.x, r0.yzzw, c10.x, r0.x
    mad r3, r4, c9.x, c9.y
    texld r5, r4, s2
    texld r4, r4.zwzw, s2
    mov r5.z, r4.x
    mad_pp r3, r3.zwxy, r5.zzxx, -r7.zwxy
    add r4.xy, -r1.x, r5.xzzw

    texld r31, c200.z, s13
    mul r28.x, r5.z, c5.y
    add r31.w, -r31.y, r28.x
    mul r31.w, r31.w, r31.x
    mul r31.w, r31.w, c200.w
    add r4.x, r4.x, r31.w

    mul r0.yzw, r1.x, c11.xxyz
    frc r0.yzw, r0
    mad oC0.yzw, r0.xyyz, -c12.xzww, r0
    mov_pp r4.zw, r3
    nrm_pp r5.xyz, r4.zwxw
    dp3_sat r1.x, r8, r5
    mov r3.z, r4.y
    mad_sat r0.yz, r4.xxyw, r1.z, c9.x
    nrm_pp r4.xyz, r3
    dp3_sat r1.y, r8, r4
    mul r1.zw, r0.xyzy, r1.xyyx
    mad r1.xy, r1.yxzw, -r0.zyzw, r1
    mad r0.yz, r0, r1.xxyw, r1.xzww
    dp2add r0.x, r0.yzzw, c10.x, r0.x
    mad_sat oC0.x, c6.x, -r0.x, r2.z

    // This is the buggy part
    //mov oC0.x, v0.y // v0: screen Pos
    //mov oC0.x, v1.x // ??

// approximately 191 instruction slots used (15 texture, 176 arithmetic)
[/code]
masterotaku said:
Read this thread: https://forums.geforce.com/default/topic/897529/3d-hbao-normal-map-artefact-fix

Which will lead you here: https://github.com/bo3b/3Dmigoto/commit/6fcb2354435b866d3697b588c982d3f44c0ec552


Thank you for the link, masterotaku.

I found the correct shader, which looks similar to the linked GitHub shader, and tried to fix it with DarkStarSword's pattern. Based on the shader headers, I guess that this game uses SSAO, not HBAO+.

The first correction works and improves it a bit, but then the shader multiplies something by a 3x3 view matrix and the fixes don't work for the other texture samples. Some of them need a positive correction and some a negative one to improve things a bit, but neither results in a complete fix.

I tried different things, like multiplying the fix by the matrix as well, but without success. Does anybody have any ideas?

It also seems that the fix needs a distance dependency, but I'm not sure about this. The effect may be caused by the erroneous correction.

PS83BFC6D9:
//
// Generated by Microsoft (R) HLSL Shader Compiler 9.29.952.3111
//
// Parameters:
//
// float2 DepthTextureSize;
// float SSAORadiusAdjust;
//
//
// Registers:
//
// Name Reg Size
// ---------------- ----- ----
// SSAORadiusAdjust c0 1
// DepthTextureSize c1 1
//
//
// Default values:
//
// SSAORadiusAdjust
// c0 = { 5000, 0, 0, 0 };
//
// DepthTextureSize
// c1 = { 0, 0, 0, 0 };
//

// preshader
// rcp c3.x, c0.x
// rcp r0.x, c1.x
// mul c4.x, r0.x, c1.y

// approximately 3 instructions used
//
// Generated by Microsoft (R) HLSL Shader Compiler 9.29.952.3111
//
// Parameters:
//
// sampler2D DepthTextureAO_ss;
// sampler2D DepthTexture_ss;
// sampler2D DizzerTexture_ss;
// float2 NearFar;
// sampler2D NormalTexture_ss;
// float SSAODepthRange;
// float SSAOIntensity;
// float SSAORadius;
// float4x4 View;
//
//
// Registers:
//
// Name Reg Size
// ----------------- ----- ----
// View c0 3
// NearFar c5 1
// SSAOIntensity c6 1
// SSAORadius c7 1
// SSAODepthRange c8 1
// DepthTexture_ss s0 1
// NormalTexture_ss s1 1
// DepthTextureAO_ss s2 1
// DizzerTexture_ss s3 1
//
//
// Default values:
//
// View
// c0 = { 0, 0, 0, 0 };
// c1 = { 0, 0, 0, 0 };
// c2 = { 0, 0, 0, 0 };
//
// NearFar
// c5 = { 0, 0, 0, 0 };
//
// SSAOIntensity
// c6 = { 0, 0, 0, 0 };
//
// SSAORadius
// c7 = { 400, 0, 0, 0 };
//
// SSAODepthRange
// c8 = { 50000, 0, 0, 0 };
//

// SSAO

ps_3_0
def c9, 2, -1, 1, 0
def c10, 0.0833333358, 0, -0.533333361, 0.533333361
def c11, 65536, 256, 1, 0
def c12, -1.13183296, 1.13183296, 0, 0.00390625
def c13, -0.316227764, 0.316227764, 0.282958239, -0.282958239
def c14, -0.30949223, 0.30949223, -0.948683321, 0.948683321
def c15, -0.400000006, 0.400000006, 0.773730576, -0.773730576
def c16, 0.257247865, -0.257247865, -0.428746462, 0.428746462
def c17, -0.0915737078, 0.0915737078, -0.320507973, 0.320507973
def c200, 1000000, 0, 0.0625, 1 // 0.000001
dcl_texcoord v0.xy // screen pos, correct
dcl_texcoord1 v1.xy // ??
dcl_2d s0
dcl_2d s1
dcl_2d s2
dcl_2d s3
dcl_2d s13

texld_pp r0, v1, s3 // Dizzer, looks ok
mad_pp r0, r0, c9.x, c9.y
texld r1, v0, s0 // "standard" shadow mask
mul r1.y, r1.x, c5.y
mul r1.z, r1.y, c3.x
mov_sat r1.z, r1.z
mul r1.z, r1.z, c7.x
mov_pp r2.z, c9.z
mad r1.w, r1.y, c3.x, r2.z
rcp r1.y, r1.y
mul_pp r1.z, r1.w, r1.z
mul r1.y, r1.y, r1.z
rcp r1.z, r1.z
mul r1.z, r1.z, c8.x
mul_pp r0, r0, r1.y
mul r3, r0.ywyw, c16.xxyy
mul_pp r4, r0.xzxz, c4.x
mad r3, r4.zwzw, c16.zzww, r3
add r3, r3, v0.xyxy

texld r5, r3, s2 // Sample AO depth (W) buffer
texld r6, r3.zwzw, s2
mad r3, r3, c9.x, c9.y
mov r5.z, r6.x
add r6.xy, -r1.x, r5.xzzw

texld r31, c200.z, s13
mul r28.x, r5.z, c5.y
add r31.w, -r31.y, r28.x
mul r31.w, r31.w, r31.x
mul r31.w, r31.w, c200.w
add r6.x, r6.x, -r31.w // negative correction -> a bit better

mad_sat r0.xz, r6.xyyw, r1.z, c9.x
mad r7, v0.xyxy, c9.x, c9.y
mul r7, r1.x, r7
mad_pp r3, r3.zwxy, r5.zzxx, -r7.zwxy
mov_pp r6.zw, r3
nrm_pp r5.xyz, r6.zwxw
mov r3.z, r6.y
nrm_pp r6.xyz, r3
texld r3, v0, s1 // Sample normals
mad_pp r2.xyw, r3.xyzz, c9.x, c9.y
cmp_pp r1.y, -r2.w, c9.w, c9.z
cmp_pp r1.w, r2.w, -c9.w, -c9.z
add_pp r1.y, r1.w, r1.y
dp2add_pp r1.w, r2, r2, c9.w
add_pp r2.xy, r2, r2
add_pp r3.xy, r1.w, c9.zyzw
mul_pp r1.y, r1.y, r3.y
rcp r1.w, r3.x
mul_pp r3.z, r1.w, r1.y
mul_pp r3.xy, r1.w, r2

// View matrix, coordinate system change ??
dp3 r1.y, r3, c1
mov r8.y, -r1.y
dp3 r8.x, r3, c0
dp3 r8.z, r3, c2

dp3_sat r2.y, r8, r6
dp3_sat r2.x, r8, r5
mul r1.yw, r0.xzzx, r2.xyzx
mad r2.xy, r2.yxzw, -r0.zxzw, r2
mad r0.xz, r0, r2.xyyw, r1.yyww
dp2add r0.x, r0.xzzw, c10.x, c10.y
mul r3, r0.ywyw, c17.xxyy
mad r3, r4.zwzw, c17.zzww, r3
add r3, r3, v0.xyxy
texld r5, r3, s2
texld r6, r3.zwzw, s2
mad r3, r3, c9.x, c9.y
mov r5.z, r6.x
add r6.xy, -r1.x, r5.xzzw

// The fixes below don't work very well:
// Some need positive correction, some need negative correction,
// some need no correction for best (but not completely fixed) results

texld r31, c200.z, s13
mul r28.x, r5.z, c5.y
add r31.w, -r31.y, r28.x
mul r31.w, r31.w, r31.x
mul r31.w, r31.w, c200.w
add r6.x, r6.x, -r31.w

mad_pp r3, r3, r5.xxzz, -r7
mad_sat r1.yw, r6.xxzy, r1.z, c9.x
mov r5.z, r6.y
mov_pp r5.xy, r3.zwzw
mov_pp r6.zw, r3.xyxy
nrm_pp r3.xyz, r6.zwxw
dp3_sat r2.x, r8, r3
nrm_pp r3.xyz, r5
dp3_sat r2.y, r8, r3
mul r3.xy, r1.wyzw, r2.yxzw
mad r2.xy, r2.yxzw, -r1.wyzw, r2
mad r1.yw, r1, r2.xxzy, r3.xxzy
dp2add r0.x, r1.ywzw, c10.x, r0.x
mul r3, r0.ywyw, c10.zzww
mad r3, r4.zwzw, c15.xxyy, r3
add r3, r3, v0.xyxy
texld r5, r3, s2
texld r6, r3.zwzw, s2
mad r3, r3, c9.x, c9.y
mov r5.z, r6.x
add r6.xy, -r1.x, r5.xzzw

texld r31, c200.z, s13
mul r28.x, r5.z, c5.y
add r31.w, -r31.y, r28.x
mul r31.w, r31.w, r31.x
mul r31.w, r31.w, c200.w
add r6.x, r6.x, r31.w

mad_pp r3, r3, r5.xxzz, -r7
mad_sat r1.yw, r6.xxzy, r1.z, c9.x
mov r5.z, r6.y
mov_pp r5.xy, r3.zwzw
mov_pp r6.zw, r3.xyxy
nrm_pp r3.xyz, r6.zwxw
dp3_sat r2.x, r8, r3
nrm_pp r3.xyz, r5
dp3_sat r2.y, r8, r3
mul r3.xy, r1.wyzw, r2.yxzw
mad r2.xy, r2.yxzw, -r1.wyzw, r2
mad r1.yw, r1, r2.xxzy, r3.xxzy
dp2add r0.x, r1.ywzw, c10.x, r0.x
mul r3, r0.ywyw, c15.zzww
mad r3, r4.zwzw, c14.xxyy, r3
add r3, r3, v0.xyxy
mad r5, r3, c9.x, c9.y
texld r6, r3, s2
texld r3, r3.zwzw, s2
mov r6.z, r3.x
mad_pp r3, r5.zwxy, r6.zzxx, -r7.zwxy
add r5.xy, -r1.x, r6.xzzw

texld r31, c200.z, s13
mul r28.x, r6.z, c5.y
add r31.w, -r31.y, r28.x
mul r31.w, r31.w, r31.x
mul r31.w, r31.w, c200.w
add r5.x, r5.x, -r31.w

mov_pp r5.zw, r3
nrm_pp r6.xyz, r5.zwxw
dp3_sat r2.x, r8, r6
mov r3.z, r5.y
mad_sat r1.yw, r5.xxzy, r1.z, c9.x
nrm_pp r5.xyz, r3
dp3_sat r2.y, r8, r5
mul r3.xy, r1.wyzw, r2.yxzw
mad r2.xy, r2.yxzw, -r1.wyzw, r2
mad r1.yw, r1, r2.xxzy, r3.xxzy
dp2add r0.x, r1.ywzw, c10.x, r0.x
mul r3, r0.ywyw, c14.zzww
mul r5, r0.ywyw, c13.zzww
mad r5, r4, c12.xxyy, r5
mad r3, r4.zwzw, c13.xxyy, r3
add r3, r3, v0.xyxy
add r4, r5, v0.xyxy
mad r5, r3, c9.x, c9.y
texld r6, r3, s2
texld r3, r3.zwzw, s2
mov r6.z, r3.x
mad_pp r3, r5.zwxy, r6.zzxx, -r7.zwxy
add r5.xy, -r1.x, r6.xzzw

texld r31, c200.z, s13
mul r28.x, r6.z, c5.y
add r31.w, -r31.y, r28.x
mul r31.w, r31.w, r31.x
mul r31.w, r31.w, c200.w
add r5.x, r5.x, r31.w

mov_pp r5.zw, r3
nrm_pp r6.xyz, r5.zwxw
dp3_sat r2.x, r8, r6
mov r3.z, r5.y
mad_sat r0.yz, r5.xxyw, r1.z, c9.x
nrm_pp r5.xyz, r3
dp3_sat r2.y, r8, r5
mul r1.yw, r0.xzzy, r2.xyzx
mad r2.xy, r2.yxzw, -r0.zyzw, r2
mad r0.yz, r0, r2.xxyw, r1.xyww
dp2add r0.x, r0.yzzw, c10.x, r0.x
mad r3, r4, c9.x, c9.y
texld r5, r4, s2
texld r4, r4.zwzw, s2
mov r5.z, r4.x
mad_pp r3, r3.zwxy, r5.zzxx, -r7.zwxy
add r4.xy, -r1.x, r5.xzzw

texld r31, c200.z, s13
mul r28.x, r5.z, c5.y
add r31.w, -r31.y, r28.x
mul r31.w, r31.w, r31.x
mul r31.w, r31.w, c200.w
add r4.x, r4.x, r31.w

mul r0.yzw, r1.x, c11.xxyz
frc r0.yzw, r0
mad oC0.yzw, r0.xyyz, -c12.xzww, r0

mov_pp r4.zw, r3
nrm_pp r5.xyz, r4.zwxw
dp3_sat r1.x, r8, r5
mov r3.z, r4.y
mad_sat r0.yz, r4.xxyw, r1.z, c9.x
nrm_pp r4.xyz, r3
dp3_sat r1.y, r8, r4
mul r1.zw, r0.xyzy, r1.xyyx
mad r1.xy, r1.yxzw, -r0.zyzw, r1
mad r0.yz, r0, r1.xxyw, r1.xzww
dp2add r0.x, r0.yzzw, c10.x, r0.x
mad_sat oC0.x, c6.x, -r0.x, r2.z // This is the buggy part

//mov oC0.x, v0.y // v0: screen Pos
//mov oC0.x, v1.x // ??
// approximately 191 instruction slots used (15 texture, 176 arithmetic)
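
For anyone reading along, the six-instruction block I repeat after each pair of AO depth samples boils down to the arithmetic below. This is only a Python sketch of those instructions, not a fix. It assumes that s13 holds the stereo parameters with separation in x and convergence in y, and that multiplying the sampled depth by NearFar.y (c5.y) gives a linear depth; the function and parameter names are made up for illustration.

```python
def stereo_correction(sample_depth, far, separation, convergence, scale=1.0):
    """Mirror the correction block from the listing (names are hypothetical).

    Assumptions: `separation` and `convergence` stand for r31.x and r31.y read
    from s13, `far` stands for c5.y, and `scale` stands for c200.w (1 here).
    """
    linear_depth = sample_depth * far      # mul r28.x, r5.z, c5.y
    offset = linear_depth - convergence    # add r31.w, -r31.y, r28.x
    offset = offset * separation           # mul r31.w, r31.w, r31.x
    return offset * scale                  # mul r31.w, r31.w, c200.w


if __name__ == "__main__":
    # A sample behind the convergence plane gives a positive offset,
    # a sample exactly at convergence gives (close to) zero.
    print(stereo_correction(0.5, 100.0, 0.02, 10.0))
    print(stereo_correction(0.1, 100.0, 0.02, 10.0))
```

The result is then added to or subtracted from r6.x, r5.x or r4.x depending on the sample pair. One thing that stands out when writing it this way: in the first block the depth fed in is r5.z (the second sample of the pair), while the result is applied to r6.x, which was computed from r5.x (the first sample), and the .y difference is left untouched. I can't say from this alone whether that explains the mixed signs.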
I remember DSS making a python script for HelixMod that allows you to put different screenshots in different modes, like seen here:

http://helixmod.blogspot.co.uk/2016/11/event0.html


Yet now, when I search for it, I can't find it :( I was thinking, maybe add a link to it in the guide section on HelixMod? :)
In the meantime, can anyone PLEASE point me to the web page where I can find it?

Thank you!

1x Palit RTX 2080Ti Pro Gaming OC(watercooled and overclocked to hell)
3x 3D Vision Ready Asus VG278HE monitors (5760x1080).
Intel i9 9900K (overclocked to 5.3 and watercooled ofc).
Asus Maximus XI Hero Mobo.
16 GB Team Group T-Force Dark Pro DDR4 @ 3600.
Lots of Disks:
- Raid 0 - 256GB Sandisk Extreme SSD.
- Raid 0 - WD Black - 2TB.
- SanDisk SSD PLUS 480 GB.
- Intel 760p 256GB M.2 PCIe NVMe SSD.
Creative Sound Blaster Z.
Windows 10 x64 Pro.
etc


My website with my fixes and OpenGL to 3D Vision wrapper:
http://3dsurroundgaming.com

(If you like some of the stuff that I've done and want to donate something, you can do it with PayPal at tavyhome@gmail.com)

Posted 12/03/2016 12:07 AM   
This is the post you are looking for: https://forums.geforce.com/default/topic/766890/3d-vision/bo3bs-school-for-shaderhackers/post/4923723/#4923723

2x Geforce GTX 980 in SLI provided by NVIDIA, i7 6700K 4GHz CPU, Asus 27" VG278HE 144Hz 3D Monitor, BenQ W1070 3D Projector, 120" Elite Screens YardMaster 2, 32GB Corsair DDR4 3200MHz RAM, Samsung 850 EVO 500G SSD, 4x750GB HDD in RAID5, Gigabyte Z170X-Gaming 7 Motherboard, Corsair Obsidian 750D Airflow Edition Case, Corsair RM850i PSU, HTC Vive, Win 10 64bit

Alienware M17x R4 w/ built in 3D, Intel i7 3740QM, GTX 680m 2GB, 16GB DDR3 1600MHz RAM, Win7 64bit, 1TB SSD, 1TB HDD, 750GB HDD

Pre-release 3D fixes, shadertool.py and other goodies: http://github.com/DarkStarSword/3d-fixes
Support me on Patreon: https://www.patreon.com/DarkStarSword or PayPal: https://www.paypal.me/DarkStarSword

Posted 12/03/2016 04:25 AM   
@mx-2 my main experience with AO shaders has been that there are often a lot of places that need to be adjusted, and it will often look much worse than when you started until they are all adjusted simultaneously, when suddenly it all looks perfect. I've fixed SSAO shaders in a couple of Unity games, and as I recall it was along the same lines as fixing HBAO+, but there are more variations of SSAO than there are of HBAO+, so the details may vary.

What game is that shader from?


Posted 12/03/2016 04:35 AM   