3Dmigoto now open-source...
WOW, I never realized how much work was involved. Keep it up guys and thank you.

Posted 06/13/2015 02:24 AM   
helifax said:So....
I am using 1.1.18 version of the wrapper and I found a shader that is related to Hairworks in Witcher3.
When I select it everything renders perfectly (ini file is set to skip that shader).
I dump the shader, however when I say main() {return;} and restart the game it doesn't seem to make any difference. If I select it again, it gets disabled...

So, I expect it's either not dumped properly or not loaded properly?
Also, how can I make the wrapper just SKIP a shader (with a hashcode)?

Big thx!

Edit: Ok, I found out about the [ShaderOverride] ^_^


Hairworks uses tessellation, so it uses a few extra shader types that we don't have support for in 3Dmigoto yet (namely hull & domain shaders). It still has a vertex shader & pixel shader, but I'm not sure exactly how the vertex shader is used since the vertex positions will be determined later in the pipeline. The pixel shader should work much the same as what we are used to.

It might be worth adding support for these types of shaders (and geometry shaders) to 3Dmigoto while I'm hooking up the compute shaders code.

In Far Cry 4, hairworks was used for simulated fur and glitched at a *very* specific distance from the camera (at a guess... W==1?), which enabling this could potentially allow us to fix as well. What exactly is the issue with hairworks in Witcher 3? (edit: ok, I see you posted more detail in the other thread... I'll take a look)
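(For reference, the [ShaderOverride] approach helifax mentions looks roughly like this in d3dx.ini; a sketch only, where the section name is arbitrary and the hash is a placeholder example value, not a real Hairworks shader hash:)

```ini
; Sketch: skip a shader by hash. Section name is arbitrary;
; the hash below is a placeholder, not a verified Hairworks hash.
[ShaderOverrideHairworks]
Hash = 4f47213227634ea5
Handling = skip
```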

2x Geforce GTX 980 in SLI provided by NVIDIA, i7 6700K 4GHz CPU, Asus 27" VG278HE 144Hz 3D Monitor, BenQ W1070 3D Projector, 120" Elite Screens YardMaster 2, 32GB Corsair DDR4 3200MHz RAM, Samsung 850 EVO 500G SSD, 4x750GB HDD in RAID5, Gigabyte Z170X-Gaming 7 Motherboard, Corsair Obsidian 750D Airflow Edition Case, Corsair RM850i PSU, HTC Vive, Win 10 64bit

Alienware M17x R4 w/ built in 3D, Intel i7 3740QM, GTX 680m 2GB, 16GB DDR3 1600MHz RAM, Win7 64bit, 1TB SSD, 1TB HDD, 750GB HDD

Pre-release 3D fixes, shadertool.py and other goodies: http://github.com/DarkStarSword/3d-fixes
Support me on Patreon: https://www.patreon.com/DarkStarSword or PayPal: https://www.paypal.me/DarkStarSword

Posted 06/13/2015 06:06 AM   
I'm having trouble making toggle work with 1.1.18

When using version 1.0.1, I load up the game, the shader is on by default and when I press the key it's disabled. So it works like intended.

But when using version 1.1.18, the shader is instead off by default, and if I press the key the game crashes.

Any ideas?

1080 Ti - i7 5820k - 16Gb RAM - Win 10 version 1607 - ASUS VG236H (1920x1080@120Hz)

Posted 06/13/2015 07:45 PM   
said:I'm having trouble making toggle work with 1.1.18

When using version 1.0.1, I load up the game, the shader is on by default and when I press the key it's disabled. So it works like intended.

But when using version 1.1.18, the shader is instead off by default, and if I press the key the game crashes.

Any ideas?

I don't fully understand the scenario here. There isn't really an on or off by default for the shaders here. You choose the constants that you want, and can set the default however you prefer.

Shouldn't crash, so I'm not sure what's up with that. You can enable the logging with calls=1 and debug=1 and unbuffered=1 to get a good log that might help me see.


I need a LOT more detail though. What game, what OS, what hardware, what GPU, what driver? Please add your system details to your signature so we don't have to keep asking for this information.
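The logging options mentioned go in the [Logging] section of d3dx.ini; something like this (a sketch using the option names from the post):

```ini
[Logging]
; Verbose log of every D3D11 call, flushed immediately so a crash
; doesn't lose the tail of the log:
calls=1
debug=1
unbuffered=1
```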

Acer H5360 (1280x720@120Hz) - ASUS VG248QE with GSync mod - 3D Vision 1&2 - Driver 372.54
GTX 970 - i5-4670K@4.2GHz - 12GB RAM - Win7x64+evilKB2670838 - 4 Disk X25 RAID
SAGER NP9870-S - GTX 980 - i7-6700K - Win10 Pro 1607
Latest 3Dmigoto Release
Bo3b's School for ShaderHackers

Posted 06/14/2015 02:54 AM   
said:
I don't fully understand the scenario here. There isn't really an on or off by default for the shaders here. You choose the constants that you want, and can set the default however you prefer.

Shouldn't crash, so I'm not sure what's up with that. You can enable the logging with calls=1 and debug=1 and unbuffered=1 to get a good log that might help me see.


I need a LOT more detail though. What game, what OS, what hardware, what GPU, what driver? Please add your system details to your signature so we don't have to keep asking for this information.


Sorry I've updated my signature now.

The thing is, I'm using exactly the same fixed shaders (I put an on/off toggle function in them) but get different results (described in my post above) depending on which .dll I use (1.0.1 vs 1.1.18).

Here is my d3d11_log.txt: http://download1322.mediafire.com/5dpmv99554bg/0r65aaddrr4lv00/d3d11_log.txt


Posted 06/14/2015 10:53 AM   
said:
helifax said:So....
I am using 1.1.18 version of the wrapper and I found a shader that is related to Hairworks in Witcher3.
When I select it everything renders perfectly (ini file is set to skip that shader).
I dump the shader, however when I say main() {return;} and restart the game it doesn't seem to make any difference. If I select it again, it gets disabled...

So, I expect it's either not dumped properly or not loaded properly?
Also, how can I make the wrapper just SKIP a shader (with a hashcode)?

Big thx!

Edit: Ok, I found out about the [ShaderOverride] ^_^


Hairworks uses tessellation, so it uses a few extra shader types that we don't have support for in 3Dmigoto yet (namely hull & domain shaders). It still has a vertex shader & pixel shader, but I'm not sure exactly how the vertex shader is used since the vertex positions will be determined later in the pipeline. The pixel shader should work much the same as what we are used to.

It might be worth adding support for these types of shaders (and geometry shaders) to 3Dmigoto while I'm hooking up the compute shaders code.

In Far Cry 4, hairworks was used for simulated fur and glitched at a *very* specific distance from the camera (at a guess... W==1?), which enabling this could potentially allow us to fix as well. What exactly is the issue with hairworks in Witcher 3? (edit: ok, I see you posted more detail in the other thread... I'll take a look)



Don't know where to reply so I'll reply here :)

Hairworks works correctly in Witcher 3 except for some shadows that appear wrong, and in some positions it becomes transparent. Just skipping those 2 shaders fixes the animal hair and Geralt's beard (the hair still suffers from the above).

Skipping those 2 shaders seems to do the trick though ;))

1x Palit RTX 2080Ti Pro Gaming OC(watercooled and overclocked to hell)
3x 3D Vision Ready Asus VG278HE monitors (5760x1080).
Intel i9 9900K (overclocked to 5.3 and watercooled ofc).
Asus Maximus XI Hero Mobo.
16 GB Team Group T-Force Dark Pro DDR4 @ 3600.
Lots of Disks:
- Raid 0 - 256GB Sandisk Extreme SSD.
- Raid 0 - WD Black - 2TB.
- SanDisk SSD PLUS 480 GB.
- Intel 760p 256GB M.2 PCIe NVMe SSD.
Creative Sound Blaster Z.
Windows 10 x64 Pro.
etc


My website with my fixes and OpenGL to 3D Vision wrapper:
http://3dsurroundgaming.com

(If you like some of the stuff that I've done and want to donate something, you can do it with PayPal at tavyhome@gmail.com)

Posted 06/14/2015 11:11 AM   
said:
said:
I don't fully understand the scenario here. There isn't really an on or off by default for the shaders here. You choose the constants that you want, and can set the default however you prefer.

Shouldn't crash, so I'm not sure what's up with that. You can enable the logging with calls=1 and debug=1 and unbuffered=1 to get a good log that might help me see.


I need a LOT more detail though. What game, what OS, what hardware, what GPU, what driver? Please add your system details to your signature so we don't have to keep asking for this information.
Sorry I've updated my signature now.

The thing is, I'm using exactly the same fixed shaders (I put an on/off toggle function in them) but get different results (described in my post above) depending on which .dll I use (1.0.1 vs 1.1.18).

Here is my d3d11_log.txt

OK, that log suggests that x=1.0 to start with by default, then is correctly toggled to 0.0 on the override key. The Key1 with x=0.0 to start is conflicting with the x=1.0 in Constants.

I'll have to think about that control flow; it's not clear what it should do in a case like that, but I'm not too surprised it might start off wrong. I'd expect 1.0.1 to use the same code as 1.1.18, but I'm not sure if DarkStarSword changed those handlers recently.

If it crashes right after that 'h' keypress, then that is a bug in our handler that I'll need to look at. Does it crash consistently there?
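To illustrate the conflict being described, a hypothetical d3dx.ini fragment (key name and values are examples only, and the exact key-section syntax here is my assumption):

```ini
; x defaults to 1.0 at startup...
[Constants]
x = 1.0

; ...but this toggle lists 0.0 first, so it's ambiguous which value
; should win before the key is ever pressed.
[Key1]
Key = h
x = 0.0
type = toggle
```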


Posted 06/14/2015 12:05 PM   
said:
If it crashes right after that 'h' keypress, then that is a bug in our handler that I'll need to look at. Does it crash consistently there?


Yep, it crashes every time I press h.


Posted 06/14/2015 12:08 PM   
bo3b said:I'd expect 1.0.1 to use the same code as 1.1.18, but I'm not sure if DarkStarSword changed those handlers recently.

If it crashes right after that 'h' keypress, then that is a bug in our handler that I'll need to look at. Does it crash consistently there?

I don't think I've changed anything in the input code since 1.0.1, but I haven't been testing it very heavily either. Could have introduced a regression somewhere?

Probably a stupid question, but is 3D enabled in the control panel? I got a crash the other day on a keypress (IIRC it was F10 to reload shaders) after a fresh driver install and had forgotten to enable 3D. I haven't tried to debug it very far, but I think the traceback pointed to the iniparams update code.


Posted 06/14/2015 12:44 PM   
In the log there is an error -140: NVAPI_STEREO_NOT_INITIALIZED. So something is definitely off/busted with 3D.

In general, I've been trying to make it work even in 2D, but a half-working state might fall off the rails, e.g. 3D is enabled but the driver returns an error.


Posted 06/14/2015 12:50 PM   
I see that bo3b just pushed out a new release to fix a crash in Dirt Rally:

https://github.com/bo3b/3Dmigoto/releases/download/0.99.50-alpha/3Dmigoto-1.1.21.zip

This release includes a new "fake_o0=1" option in [ShaderOverride] sections that will assign a dummy render target to o0 when the shader is encountered. This is used to influence the driver heuristics for shaders that only run with a depth target set, forcing them to run in stereo. This has the potential to resolve various "one-eye" or "mono depth-buffer" issues in games, and is used to fix MSAA hairworks in The Witcher 3.
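In ini terms, that looks something like the following (a sketch; the hash is a placeholder, not a real Witcher 3 shader hash):

```ini
; Assign a dummy render target to o0 so the driver heuristics
; treat this depth-only shader as stereo. Hash is a placeholder.
[ShaderOverrideDepthOnly]
Hash = 4f47213227634ea5
fake_o0 = 1
```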

But how do you know which shader to use that on? You could probably guess by examining ShaderUsage.txt to find pixel shaders that have a <DepthTarget> but no <RenderTarget> set. I haven't tried this myself, so I'm not sure how reliable it is or if there might be any side effects. Rather, I used frame analysis to track down the problematic shaders.
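If you did want to guess from ShaderUsage.txt, a scan along these lines would be the idea (a sketch: the `<PixelShader>` wrapper tag, `hash` attribute, and the embedded sample layout are my assumptions; only the `<DepthTarget>`/`<RenderTarget>` tag names come from the post):

```python
import re

def find_depth_only_pixel_shaders(usage_text):
    """Return hashes of pixel shaders that reference a <DepthTarget>
    but no <RenderTarget> (candidates for fake_o0=1)."""
    results = []
    # Assumed layout: each shader is a <PixelShader hash="..."> ... </PixelShader> block.
    for m in re.finditer(r'<PixelShader hash="([0-9a-f]+)">(.*?)</PixelShader>',
                         usage_text, re.S):
        shader_hash, body = m.groups()
        if '<DepthTarget' in body and '<RenderTarget' not in body:
            results.append(shader_hash)
    return results

# Hypothetical sample in the assumed format:
sample = '''
<PixelShader hash="4f47213227634ea5">
  <DepthTarget hash="aabbccdd"/>
</PixelShader>
<PixelShader hash="cfaabd2f8c15ff34">
  <RenderTarget hash="11223344"/>
  <DepthTarget hash="aabbccdd"/>
</PixelShader>
'''
print(find_depth_only_pixel_shaders(sample))  # ['4f47213227634ea5']
```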

Speaking of which, this release contains a number of improvements to the frame analysis feature that I've been working on:

- Can now change frame analysis options when specific shaders or render targets (not UAVs or textures yet) are encountered during the frame, either only for a single draw call (default), or permanently from that point on (with 'persist' keyword). This is useful to enable dump_rt_dds only when necessary and save time and disk space when not. I used this to get more detail on the problems with hairworks in the Witcher 3, and this allowed me to dump out the dds files I was interested in a matter of seconds rather than minutes.

- Depth buffers will now dump correctly (were all blank before)

- MSAA textures should now dump correctly (though some still don't appear to be dumping, such as the MSAA8 hairworks depth buffers in Witcher 3)

- Added 'mono' option if you don't need the second perspective (ie, just to identify when an effect is drawn and save some space & time)
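As a d3dx.ini sketch of the per-shader frame analysis options described above (option and keyword names as I understand them from this post; treat them as approximate, and the hash as a placeholder):

```ini
; Only dump render targets as .dds for draw calls using this shader;
; add the 'persist' keyword to keep the option on from that point on.
[ShaderOverrideHairworksPS]
Hash = 4f47213227634ea5
analyse_options = dump_rt_dds
```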

Additionally, I've updated my ddsinfo.py script to convert several DDS formats dumped with this feature to PNG so they can be opened with any image viewer. I've only added a few formats so far (ones that I've actually encountered), but the code is now structured so that adding more is pretty straightforward:

https://github.com/DarkStarSword/3d-fixes/blob/master/ddsinfo.py

If anyone wants to use the script with a dds file that it can't convert yet, send me a copy of the dds (or at least the information printed in the terminal window when the script is used so I can see which format it is).

Edit: The ddsinfo.py script requires Python 3.4, NumPy and python-pillow:
https://www.python.org/ftp/python/3.4.3/python-3.4.3.msi
http://sourceforge.net/projects/numpy/files/NumPy/1.9.2/numpy-1.9.2-win32-superpack-python3.4.exe/download
https://pypi.python.org/packages/3.4/P/Pillow/Pillow-2.8.2.win32-py3.4.exe
Right now it must be run from a command prompt, like
C:\Users\dss>c:\Python34\python.exe c:\path\to\3d-fixes\ddsinfo.py "c:\GOG Games\The Witcher 3 Wild Hunt\bin\x64\FrameAnalysis-2015-06-18-130330\000740-D-vs-cfaabd2f8c15ff34-ps-4f47213227634ea5.dds"


Posted 06/19/2015 02:34 AM   
I haven't dug into this very far yet, but I just came across a flickering shader in Lichdom that turned out to be due to a precision issue like we see in MS's disassembler, but this was a HLSL shader:


https://github.com/DarkStarSword/3d-fixes/commit/53a4609e0c03e8819accbe87647fa4013552d44d


I was under the impression that the decompiler got the floats from the binary, but looking at the code it seems that is not the case?


Posted 07/09/2015 10:32 AM   
said:I haven't dug into this very far yet, but I just came across a flickering shader in Lichdom that turned out to be due to a precision issue like we see in MS's disassembler, but this was a HLSL shader:

https://github.com/DarkStarSword/3d-fixes/commit/53a4609e0c03e8819accbe87647fa4013552d44d

I was under the impression that the decompiler got the floats from the binary, but looking at the code it seems that is not the case?

No, it's a hybrid model at present. The original Decompiler was all text parsing, and at some point Chiri added the James-Jones cross compiler which decodes the binary. But Chiri didn't finish the conversion to using JJ, so there is a lot of code that still does text parsing of the disassembled code.

I also have not got back in to convert that to using the JJ codes, primarily because I don't like to rock the boat unless it's necessary. For anything new that I've added, I've tried to stay toward JJ code, but it's not always possible.

Worse- Witcher3 and Batman both demonstrate that the new trend is to strip the headers of reflection data, which means that the James-Jones code fails, and the only thing possible is text parsing. Bit of a quandary therefore, because JJ is much better when reflection data is available, but it's not clear that's still viable. And we are half-way here and half-way there.


For some specific instructions like this mul, I'm pretty sure that it already uses the James-Jones binary for those. The opcode_mul section calls out to fixImm(op2, instr->asOperands[1]) for example, and fixImm does use the expected afImmediates[0..3] for actual C float values. As opposed to other instructions that use the op1, op2 type text values from sscanf.

Probably/possibly the problem here is the output format? Chiri used the E notation for reasons I never have understood, and in this case it also truncates it to a 9 digit number (sprintf format %.9e). Based on your discovery there, that seems bad.

I have thought many times about switching away from exponential notation for all the constants because I think it makes the HLSL harder to read, but I hesitate to do stuff like that without understanding the ramifications. Since we have a clear bug, that's enough to justify a change now, I think.


I had never quite understood Helix's comment that the disassembler was 'crap', but this makes it clear.


Edit: yeah, this E format output is nearly certain to be the problem, but for reasons that don't make sense to me. The default precision for E format is 6 digits, which is exactly what we see in the output. The part I don't get is that we are specifying %.9E, and it's still only printing 6 digits instead of the expected 9.

Edit2: I'm thinking it's likely to be the fragile and scary applySwizzle routine. For l value constants, that routine converts the string literal back to floats, then outputs it again. Looks to me like it's using only %e format, which would be the default of 6 digits.

If you want me to take a stab at fixing this, I'd be happy to take a closer look.

Acer H5360 (1280x720@120Hz) - ASUS VG248QE with GSync mod - 3D Vision 1&2 - Driver 372.54
GTX 970 - i5-4670K@4.2GHz - 12GB RAM - Win7x64+evilKB2670838 - 4 Disk X25 RAID
SAGER NP9870-S - GTX 980 - i7-6700K - Win10 Pro 1607
Latest 3Dmigoto Release
Bo3b's School for ShaderHackers

Posted 07/09/2015 11:34 AM   
I don't think that the %.9e is the problem here (that number was %.6e anyway, which happens to be the default of %e)... though I do agree that the scientific notation is not great and already changed that to %.9g (which uses fixed notation when the exponent is in range and falls back to %.9e otherwise) the other day but hadn't pushed the change yet. Scientific notation is necessary for very small or very large numbers like 1.4e-45, but annoying for numbers like 0, 1, 0.5, 0.0625, etc. %.9g is a quick way to get the best of both worlds (I have an algorithm in my float_to_hex.py that does better).

I've done fairly extensive testing while looking at the assembler precision issues, and determined that .9g is sufficient precision for any 32bit float such that reinterpreting the string as a float will exactly reproduce the original value. Any less precision than that will lead to rounding errors, and more is unnecessary (though harmless).

Edit: Just saw your edit. I'll take a closer look at applySwizzle and see if it is the culprit.

2x Geforce GTX 980 in SLI provided by NVIDIA, i7 6700K 4GHz CPU, Asus 27" VG278HE 144Hz 3D Monitor, BenQ W1070 3D Projector, 120" Elite Screens YardMaster 2, 32GB Corsair DDR4 3200MHz RAM, Samsung 850 EVO 500G SSD, 4x750GB HDD in RAID5, Gigabyte Z170X-Gaming 7 Motherboard, Corsair Obsidian 750D Airflow Edition Case, Corsair RM850i PSU, HTC Vive, Win 10 64bit

Alienware M17x R4 w/ built in 3D, Intel i7 3740QM, GTX 680m 2GB, 16GB DDR3 1600MHz RAM, Win7 64bit, 1TB SSD, 1TB HDD, 750GB HDD

Pre-release 3D fixes, shadertool.py and other goodies: http://github.com/DarkStarSword/3d-fixes
Support me on Patreon: https://www.patreon.com/DarkStarSword or PayPal: https://www.paypal.me/DarkStarSword

Posted 07/09/2015 05:15 PM   
OK, good, I was going to suggest that we use the %.9g format as well, as a good compromise for readability. But wasn't certain that was the best option. If you think that's the way to go, that's good enough for me. :->

Pretty sure the quick fix is to just change applySwizzle output to use %.9g instead of %e. That should give a nice bump in readability, and solve the precision problem too.


A possible larger fix would be to have applySwizzle not convert the text back to float just to convert it back to text again. fixImm converts the l(x,y,z,w) into the full text version; the swizzling is just snipping out the unused parameters to make them match the instruction. Converting them back and forth is not necessary, just an artifact of how it was written.

The risk here is fairly high though, this routine is used everywhere, and it's easy to break a different code path. e.g. stuff like treating the numbers as bitfields instead of floats.

For what it's worth, my recommendation would be to just go with %.9g as a safe clear win, as this routine is used only when dumping shaders and is thus not time critical.


Edit: Ah, I see you already checked in that change to just switch them to %.9g. Works for me. :->

Acer H5360 (1280x720@120Hz) - ASUS VG248QE with GSync mod - 3D Vision 1&2 - Driver 372.54
GTX 970 - i5-4670K@4.2GHz - 12GB RAM - Win7x64+evilKB2670838 - 4 Disk X25 RAID
SAGER NP9870-S - GTX 980 - i7-6700K - Win10 Pro 1607
Latest 3Dmigoto Release
Bo3b's School for ShaderHackers

Posted 07/10/2015 01:18 AM   