Crysis 3 - 3D fix discussion
Thanks DarkStarSword!! Now it's working... I missed changing the second value for "y2".

In Lichdom, did you find an issue with water reflections where the water has some type of tessellation? I didn't find anything related in your GitHub.

If you look at the screenshot, one reflection renders OK (in the right bush), but the other does not (left bush).

Image
Attachments

Crysis315_99.jps

MY WEB

Helix Mod - Making 3D Better

My 3D Screenshot Gallery

Like my fixes? you can donate to Paypal: dhr.donation@gmail.com

#31
Posted 07/22/2015 02:31 AM   
The worst of that reflection looks like a fake environmental map reflection. If it's the same as Lichdom, I fixed that by applying a 1/2 stereo adjustment on one of the texcoord outputs:


https://github.com/DarkStarSword/3d-fixes/blob/master/Lichdom%20Battlemage/ShaderFixes/3981845112a28c3d-ds.txt


https://github.com/DarkStarSword/3d-fixes/blob/master/Lichdom%20Battlemage/ShaderFixes/27a0e398808011b9-vs_replace.txt


That said, these reflections don't look as simple as Lichdom since they appear to contain a true reflection as well (which may need an adjustment as well? It looks like it is hovering on the surface?). I'm not sure - but it looks like the reflection of the rock on the right is doubled in the left eye, which might possibly need a StereoMode override to fix.

I can take a look at these tonight and see if I can work anything out - where can I find this reflection in the game?

2x Geforce GTX 980 in SLI provided by NVIDIA, i7 6700K 4GHz CPU, Asus 27" VG278HE 144Hz 3D Monitor, BenQ W1070 3D Projector, 120" Elite Screens YardMaster 2, 32GB Corsair DDR4 3200MHz RAM, Samsung 850 EVO 500G SSD, 4x750GB HDD in RAID5, Gigabyte Z170X-Gaming 7 Motherboard, Corsair Obsidian 750D Airflow Edition Case, Corsair RM850i PSU, HTC Vive, Win 10 64bit

Alienware M17x R4 w/ built in 3D, Intel i7 3740QM, GTX 680m 2GB, 16GB DDR3 1600MHz RAM, Win7 64bit, 1TB SSD, 1TB HDD, 750GB HDD

Pre-release 3D fixes, shadertool.py and other goodies: http://github.com/DarkStarSword/3d-fixes
Support me on Patreon: https://www.patreon.com/DarkStarSword or PayPal: https://www.paypal.me/DarkStarSword

#32
Posted 07/22/2015 05:19 AM   
Yes, it has a true reflection and the surface has some kind of tessellation (it's a hard one)... Also, when you look up a little from that water, all reflections are OK due to the true reflection movement. It also looks the same after changing the WATER quality and/or SHADING quality.

Also, for that water I already fixed the halo in the DS (1fbb6f74a9449b02-ds).


That water is at the 5th checkpoint of the chapter "Welcome to the Jungle".
You can use my savegame:

https://s3.amazonaws.com/dhr/SaveGames.zip

#33
Posted 07/22/2015 11:17 AM   
I am so happy that you guys managed to master this engine. Look at this video from Sniper Ghost Warrior 3:

https://www.youtube.com/watch?v=sgRdJmBa95o

This is what I've been able to come up with for the reflections so far (on 'high' water quality, not 'very high'). It's a lot better than before, but could still use more work. I probably won't have time to look at this any more before I go on holidays (maybe a little on Saturday), but hopefully it will be enough to get you started:

https://github.com/DarkStarSword/3d-fixes/commit/0f68c48ab804e9cfd6c260f209dd596d12665f9b

It moves the environment reflection to water depth and adjusts the screen space reflections. The former should probably be pushed deeper and the latter needs more experimentation to see if it can be improved any further - there's a bunch of places v5 is used and I have switched some of them to adj_v5 / iadj_v5 where it seemed to help, but I haven't tried all the possibilities yet. Plus a fudge factor might be able to help.

I'm not sure, but I think there might be a decompiler bug making the environment map reflections brighter than they should be. I didn't look into this at all, but it's probably worth double checking.

This fix won't work on 'very high' yet, as I don't think we can define new outputs in the assembler yet (Flugan - correct me if I'm wrong). Also note that the vertex + pixel shaders change if there is a ripple being drawn.

#35
Posted 07/23/2015 05:08 PM   
It is possible to add an output as far as I can tell
before
dcl_output o4.xyzw
dcl_output o5.xyzw
dcl_temps 4
...
mov o4.xyz, r0.xyzx
mov o4.w, l(0)
...
mad o5.z, r0.x, l(0.250000), l(0.750000)
mul o5.w, v2.w, cb2[4].x
mov o5.xy, cb2[4].wyww
ret

after
dcl_output o4.xyzw
dcl_output o5.xyzw
dcl_output o6.xyzw
dcl_temps 4
...
mov o4.xyz, r0.xyzx
mov o5.xyz, r0.xyzx
mov o4.w, l(0)
mov o5.w, l(0)
...
mad o6.z, r0.x, l(0.250000), l(0.750000)
mul o6.w, v2.w, cb2[4].x
mov o6.xy, cb2[4].wyww
ret

Lichdom shaderfix: ffd5491b00d77e9b-vs_replace.txt
out float4 o4 : TEXCOORD3,
- out float4 o5 : COLOR0)
+ out float4 o5 : TEXCOORD4,
+ out float4 o6 : COLOR0)
{
float4 r0,r1,r2,r3;
...
o4.xyz = AmbientObjectCol.www * r0.xyz;
o4.w = 0.000000000e+000;
+ o5.xyz = AmbientObjectCol.www * r0.xyz;
+ o5.w = 0.000000000e+000;
...
- o5.z = r0.x * 2.500000000e-001 + 7.500000000e-001;
- o5.w = AmbientObjectCol.x * v2.w;
- o5.xy = AmbientObjectCol.wy;
+ o6.z = r0.x * 2.500000000e-001 + 7.500000000e-001;
+ o6.w = AmbientObjectCol.x * v2.w;
+ o6.xy = AmbientObjectCol.wy;
return;
}
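The renumbering in the before/after listings above is mechanical, so it could in principle be scripted. The following is only a rough Python sketch of that idea: `insert_output` is a hypothetical helper, not part of the assembler or of 3DMigoto, and it works purely on the text of the listing. It shifts every output register at or above the new index up by one and adds the matching `dcl_output` line; the instructions that write the new output still have to be added by hand, and it says nothing about whether the reassembled shader links correctly to the next stage.

```python
import re

# Matches output registers such as o4 or o5 (but not the "o" in "dcl_output").
OUTPUT_REG = re.compile(r"\bo(\d+)\b")
DCL_OUTPUT = re.compile(r"^\s*dcl_output\s+o(\d+)\b")


def insert_output(asm_lines, new_index, mask="xyzw"):
    """Make room for a new output register o<new_index> in an assembly listing.

    Hypothetical helper for illustration only: every oN with N >= new_index
    becomes o(N+1), and a dcl_output line for the new register is added
    after the last declaration with a lower index.
    """
    def bump(match):
        n = int(match.group(1))
        return f"o{n + 1}" if n >= new_index else match.group(0)

    shifted = [OUTPUT_REG.sub(bump, line) for line in asm_lines]

    # The new declaration goes right after the last lower-numbered dcl_output.
    position = None
    for i, line in enumerate(shifted):
        m = DCL_OUTPUT.match(line)
        if m and int(m.group(1)) < new_index:
            position = i + 1
    if position is None:
        raise ValueError("no lower-numbered dcl_output to anchor the new declaration")

    new_decl = f"dcl_output o{new_index}.{mask}"
    return shifted[:position] + [new_decl] + shifted[position:]


if __name__ == "__main__":
    before = [
        "dcl_output o4.xyzw",
        "dcl_output o5.xyzw",
        "dcl_temps 4",
        "mov o4.xyz, r0.xyzx",
        "mov o4.w, l(0)",
        "mad o5.z, r0.x, l(0.250000), l(0.750000)",
        "mul o5.w, v2.w, cb2[4].x",
        "mov o5.xy, cb2[4].wyww",
        "ret",
    ]
    for line in insert_output(before, 5):
        print(line)
```

Run on the "before" listing with a new index of 5, this produces the declarations and renumbered `o6` writes of the "after" listing, minus the two new `mov o5` instructions.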

Thanks to everybody using my assembler it warms my heart.
To have a critical piece of code that everyone can enjoy!
What more can you ask for?

donations: ulfjalmbrant@hotmail.com

#36
Posted 07/23/2015 08:23 PM   
@DarkStarSword
Using those 4 fixed shaders with WATER = HIGH and PARTICLE = HIGH, it looks a little better... but when you start to move while looking at the water, the fix is gone (disabled), and when you look up and down with the water in front of you, there is some black stuff over the water that moves.


Also, I tested with VERY HIGH settings, and the pixel shaders related to the water have an HLSL error message at the bottom (I attached the related shaders):

/*~~~~~~~~~~~~~~~~~~~~~~~~~~~~ HLSL errors ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
wrapper1349(165,13-82):warning X3206: 'SampleGrad': implicit truncation of vector type


and

/*~~~~~~~~~~~~~~~~~~~~~~~~~~~~ HLSL errors ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
wrapper1349(188,24-35):warning X4121: gradient-based operations must be moved out of flow control to prevent divergence. Performance may improve by using a non-gradient operation


These HLSL errors are also present in the fixed PS for HIGH settings... Like you say, I think there is a decompiler bug in those shaders.
Attachments

VERY_HIGH.zip.jpg

#37
Posted 07/23/2015 10:32 PM   
@DHR: In general, the warnings are just that, warnings; they rarely cause any problems with the shaders. If you want to remove the warning, change the SampleGrad line in the 9b3a shader to:

r3.xyzw = ReflSampler.SampleGrad(ReflSampler_s, r3.xy, r2.xy, float2(0,0)).xyzw;


I cannot tell if the decompile has any errors. If you want me to take a look at these, I am happy to, but you must leave the ASM at the bottom of the file. When you delete it, I have nothing to compare against. That's why I generate that in the first place. Out of curiosity, why do you delete those? They are commented out and will have no impact on the HLSL.

Acer H5360 (1280x720@120Hz) - ASUS VG248QE with GSync mod - 3D Vision 1&2 - Driver 372.54
GTX 970 - i5-4670K@4.2GHz - 12GB RAM - Win7x64+evilKB2670838 - 4 Disk X25 RAID
SAGER NP9870-S - GTX 980 - i7-6700K - Win10 Pro 1607
Latest 3Dmigoto Release
Bo3b's School for ShaderHackers

#38
Posted 07/24/2015 12:26 AM   
@DHR: For the 5b48 vertex shader, I do think that Decompile code is wrong. I'm still puzzling out why this happened, but here is the replacement code, and it will make the output identical to the original ASM at the bottom of the file.

// Water VS 1
// 9b3ac1b07bf3efc1 de67fe4f68dba921

cbuffer cb3 : register(b3)
{
float4 cb3[5];
}

cbuffer cb2 : register(b2)
{
float4 cb2[7];
}


cbuffer PER_BATCH : register(b0)
{
float4 VS_SunColor : packoffset(c0);
}

cbuffer PER_INSTANCE : register(b1)
{
row_major float3x4 ObjWorldMatrix : packoffset(c0);
}

cbuffer PER_FRAME : register(b2)
{
float4 g_VS_WorldViewPos : packoffset(c6);
}

cbuffer PER_MATERIAL : register(b3)
{
float4 MatSpecColor : packoffset(c1);
float3 __0bendDetailFrequency__1bendDetailLeafAmplitude__2bendDetailBranchAmplitude__3 : packoffset(c2);
float4 __0AnimFrequency__1AnimAmplitudeWav0__2AnimPhase__3AnimAmplitudeWav2 : packoffset(c3);
float2 __0Tilling__1DetailTilling__2__3 : packoffset(c4);
float VertexWaveScale : packoffset(c6);
}

Texture2D<float4> StereoParams : register(t125);
Texture1D<float4> IniParams : register(t120);

void main(
float4 v0 : POSITION0,
float2 v1 : TEXCOORD0,
float4 v2 : COLOR0,
out float4 o0 : TEXCOORD0,
out float4 o1 : TEXCOORD1,
out float4 o2 : TEXCOORD2,
out float4 o3 : TEXCOORD3,
out float4 o4 : TEXCOORD4)
{
float4 r0,r1;
uint4 bitmask, uiDest;
float4 fDest;

r0.x = 1;
r0.z = cb3[4].x;
r0.xyzw = v1.xyxy * r0.xxzz;
r0.xyzw = cb3[4].xxyy * r0.xyzw;
o0.xyzw = float4(1,1,2,2) * r0.xyzw;
o1.xyzw = float4(1,1,0.858578682,1);
r0.xyz = v0.xyz;
r0.w = 1;
r1.x = dot(ObjWorldMatrix._m00_m01_m02_m03, r0.xyzw);
r1.y = dot(ObjWorldMatrix._m10_m11_m12_m13, r0.xyzw);
r1.z = dot(ObjWorldMatrix._m20_m21_m22_m23, r0.xyzw);
r0.xyz = cb2[6].xyz + r1.xyz;
r1.xyz = cb2[6].xyz + -r0.xyz;
o4.xyz = r0.xyz;

// r0.x = 0 < r1.z;
// r0.x = r1.z > 0;
// o2.xyz = r1.xyz;
// r0.x = ((int)r0.x ? -1 : 0) + ((int)r0.y ? 1 : 0);
// o2.w = (int)r0.x;

int x, y;
x = (0 < r1.z) ? -1 : 0;
y = (r1.z < 0) ? -1 : 0;
o2.xyz = r1.xyz;
r0.x = -x + y;
o2.w = r0.x;

r0.xyz = cb3[1].xyz * VS_SunColor.xyz;
o3.xyz = VS_SunColor.www * r0.xyz;
o3.w = 1;
o4.w = v0.z;
return;
}

/*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
//
// Generated by Microsoft (R) HLSL Shader Compiler 9.30.960.8229
//
// using 3Dmigoto v1.1.34 on Thu Jul 23 18:56:25 2015
//
//
// Buffer Definitions:
//
// cbuffer PER_BATCH
// {
//
// float4 VS_SunColor; // Offset: 0 Size: 16
//
// }
//
// cbuffer PER_INSTANCE
// {
//
// row_major float3x4 ObjWorldMatrix; // Offset: 0 Size: 48
//
// }
//
// cbuffer PER_FRAME
// {
//
// float4 g_VS_WorldViewPos; // Offset: 96 Size: 16
//
// }
//
// cbuffer PER_MATERIAL
// {
//
// float4 MatSpecColor; // Offset: 16 Size: 16
// float3 __0bendDetailFrequency__1bendDetailLeafAmplitude__2bendDetailBranchAmplitude__3;// Offset: 32 Size: 12 [unused]
// float4 __0AnimFrequency__1AnimAmplitudeWav0__2AnimPhase__3AnimAmplitudeWav2;// Offset: 48 Size: 16 [unused]
// float2 __0Tilling__1DetailTilling__2__3;// Offset: 64 Size: 8
// float VertexWaveScale; // Offset: 96 Size: 4 [unused]
//
// }
//
//
// Resource Bindings:
//
// Name Type Format Dim Slot Elements
// ------------------------------ ---------- ------- ----------- ---- --------
// PER_BATCH cbuffer NA NA 0 1
// PER_INSTANCE cbuffer NA NA 1 1
// PER_FRAME cbuffer NA NA 2 1
// PER_MATERIAL cbuffer NA NA 3 1
//
//
//
// Input signature:
//
// Name Index Mask Register SysValue Format Used
// -------------------- ----- ------ -------- -------- ------- ------
// POSITION 0 xyzw 0 NONE float xyz
// TEXCOORD 0 xy 1 NONE float xy
// COLOR 0 xyzw 2 NONE float
//
//
// Output signature:
//
// Name Index Mask Register SysValue Format Used
// -------------------- ----- ------ -------- -------- ------- ------
// TEXCOORD 0 xyzw 0 NONE float xyzw
// TEXCOORD 1 xyzw 1 NONE float xyzw
// TEXCOORD 2 xyzw 2 NONE float xyzw
// TEXCOORD 3 xyzw 3 NONE float xyzw
// TEXCOORD 4 xyzw 4 NONE float xyzw
//
vs_5_0
dcl_globalFlags refactoringAllowed
dcl_constantbuffer cb0[1], immediateIndexed
dcl_constantbuffer cb1[3], immediateIndexed
dcl_constantbuffer cb2[7], immediateIndexed
dcl_constantbuffer cb3[5], immediateIndexed
dcl_input v0.xyz
dcl_input v1.xy
dcl_output o0.xyzw
dcl_output o1.xyzw
dcl_output o2.xyzw
dcl_output o3.xyzw
dcl_output o4.xyzw
dcl_temps 2
mov r0.x, l(1.000000)
mov r0.z, cb3[4].x
mul r0.xyzw, r0.xxzz, v1.xyxy
mul r0.xyzw, r0.xyzw, cb3[4].xxyy
mul o0.xyzw, r0.xyzw, l(1.000000, 1.000000, 2.000000, 2.000000)
mov o1.xyzw, l(1.000000,1.000000,0.858579,1.000000)
mov r0.xyz, v0.xyzx
mov r0.w, l(1.000000)
dp4 r1.x, cb1[0].xyzw, r0.xyzw
dp4 r1.y, cb1[1].xyzw, r0.xyzw
dp4 r1.z, cb1[2].xyzw, r0.xyzw
add r0.xyz, r1.xyzx, cb2[6].xyzx
add r1.xyz, -r0.xyzx, cb2[6].xyzx
mov o4.xyz, r0.xyzx
lt r0.x, l(0.000000), r1.z
lt r0.y, r1.z, l(0.000000)
mov o2.xyz, r1.xyzx
iadd r0.x, -r0.x, r0.y
itof o2.w, r0.x
mul r0.xyz, cb0[0].xyzx, cb3[1].xyzx
mul o3.xyz, r0.xyzx, cb0[0].wwww
mov o3.w, l(1.000000)
mov o4.w, v0.z
ret
// Approximately 24 instruction slots used

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/

#39
Posted 07/24/2015 01:49 AM   
Ok, the "HLSL Error" message scared me a bit...

I don't delete anything... they are dumped like that.


Edit: sorry bo3b!... I just noticed that d3dx.ini had "export_hlsl=1"... I just changed it to 2.

#40
Posted 07/24/2015 01:55 AM   
Flugan said: It is possible to add an output as far as I can tell
Great :)

I was under the impression that we couldn't do this with assembly yet because we reuse the output signature (OSGN) and input signature (ISGN) sections from the original shader binary, and those sections would need to change as it is the part that contains the semantics which are used by the hardware to link VS/DS/GS outputs to PS inputs - so, we could add an extra output but it wouldn't make it into the next shader.

I also had some concerns that the MS disassembler may not be giving us enough information in the dcl_ lines to recreate these sections and we may have to parse the comments it generates (DX9 used dcl_texcoordX oY.xyz, which includes the semantic type, index, register and mask, but DX11 only uses dcl_output oY.xyz, which lacks the semantic and index).

This isn't an issue with HLSL shaders as the compiler will generate new ISGN and OSGN sections.
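To illustrate the "parse the comments" idea, here is a small Python sketch that reads the signature tables the disassembler prints in its comment header and maps each register back to its semantic and index. It is not taken from the assembler or from 3DMigoto; `parse_signature` and `semantics_by_register` are hypothetical names. It assumes whitespace-separated columns as in the listing posted earlier, contiguous masks, and numeric registers, so system-value outputs with non-numeric registers are simply skipped.

```python
from typing import NamedTuple


class SignatureEntry(NamedTuple):
    semantic: str
    index: int
    mask: str
    register: int
    sys_value: str
    fmt: str
    used: str  # empty when the "Used" column is blank


def parse_signature(listing, section="Output signature:"):
    """Collect the rows of one signature table from a disassembly comment header."""
    entries = []
    in_section = False
    for raw in listing.splitlines():
        text = raw.strip().lstrip("/").strip()
        if text.endswith("signature:"):
            # A new table starts; only keep reading if it is the one we want.
            in_section = (text == section)
            continue
        if not in_section:
            continue
        if not raw.lstrip().startswith("//"):
            break  # the comment header ended, the instructions begin
        fields = text.split()
        if len(fields) < 6 or not fields[1].isdigit() or not fields[3].isdigit():
            continue  # blank line, column header, ruler, or non-numeric register
        entries.append(SignatureEntry(
            semantic=fields[0],
            index=int(fields[1]),
            mask=fields[2],
            register=int(fields[3]),
            sys_value=fields[4],
            fmt=fields[5],
            used=fields[6] if len(fields) > 6 else "",
        ))
    return entries


def semantics_by_register(entries):
    """Map a register number to its semantic name, e.g. 4 -> 'TEXCOORD4'."""
    return {e.register: f"{e.semantic}{e.index}" for e in entries}
```

With a table like this, a `dcl_output o4.xyzw` line could be paired with `TEXCOORD4`, which is the information the DX11 `dcl_` line itself no longer carries.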

#41
Posted 07/24/2015 02:00 AM   
bo3b said: @DHR: For the 5b48 vertex shader, I do think that Decompile code is wrong. I'm still puzzling out why this happened, but here is the replacement code, and it will make the output identical to the original ASM at the bottom of the file.
Looks like the same thing I hit in FC4 the other day. Did you see my theory that this may be due to the differences between boolean comparisons in asm vs HLSL:

https://github.com/bo3b/3Dmigoto/commit/3e6a0a5daf91262efbb673c4d2621a7406f58d1a
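The boolean mismatch can be reproduced outside a shader. The Python sketch below models the `lt` / `iadd` / `itof` sequence from the water vertex shader under two conventions: the assembly one, where a true comparison sets all bits (-1 as a signed integer), and the HLSL one, where a true comparison converts to 1 when stored in a numeric register. The "naive" function is a hypothetical literal translation written for illustration, not the Decompiler's actual output.

```python
def asm_lt(a, b):
    """Shader assembly `lt`: all bits set (-1 as a signed int) when a < b, else 0."""
    return -1 if a < b else 0


def hlsl_lt(a, b):
    """HLSL `a < b` stored in a numeric register: true converts to 1, false to 0."""
    return 1 if a < b else 0


def o2w_asm(z):
    """What the original assembly computes for o2.w."""
    x = asm_lt(0.0, z)      # lt r0.x, l(0.000000), r1.z
    y = asm_lt(z, 0.0)      # lt r0.y, r1.z, l(0.000000)
    return float(-x + y)    # iadd r0.x, -r0.x, r0.y ; itof o2.w, r0.x


def o2w_naive_hlsl(z):
    """Hypothetical literal translation that keeps HLSL's 0/1 booleans."""
    x = hlsl_lt(0.0, z)
    y = hlsl_lt(z, 0.0)
    return float(-x + y)


def o2w_replacement(z):
    """The replacement code: ternaries that rebuild the -1/0 pattern by hand."""
    x = -1 if 0.0 < z else 0
    y = -1 if z < 0.0 else 0
    return float(-x + y)


if __name__ == "__main__":
    for z in (2.0, -2.0, 0.0):
        print(z, o2w_asm(z), o2w_naive_hlsl(z), o2w_replacement(z))
```

In this model the assembly yields the sign of r1.z, the literal 0/1 translation yields the opposite sign, and the hand-written ternaries match the assembly again.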

#42
Posted 07/24/2015 03:07 AM   
DHR said: @DarkStarSword
Using those 4 fixed shaders with WATER = HIGH and PARTICLE = HIGH, it looks a little better... but when you start to move while looking at the water, the fix is gone (disabled), and when you look up and down with the water in front of you, there is some black stuff over the water that moves.

There might be some more shaders to find there. I found that there are at least two pixel shaders used for the water (one constructs the reflection, the second draws the surface), and these would switch out in certain circumstances (at least when touching the water the shaders switch). It also seems like some quality settings may share vertex/domain shaders, but other settings may use two separate vertex/domain shaders - that was why I moved the halo fix into the pixel shader.

It probably also needs more experimentation to find the best fix. Unfortunately I don't think I'll have another chance to look at this for a few weeks, but hopefully you or Mike can find a solution.


Regarding frame analysis in Crysis 3: I'm finding that most attempts don't work. It creates the analysis directory but doesn't write any files into it. If I hit F8 repeatedly, eventually one of the attempts does work. This may be some interaction with the Origin overlay: say, if the overlay calls Present() as well as the game, we would start analysis on the game's Present(), then immediately stop it on Origin's Present(). I need to investigate further, but this won't happen until I get back.
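
To make the Present() theory concrete, here is a small Python sketch. It is only a toy model of the hypothesis, not 3Dmigoto's code: the event names, the draws_captured helper, and the rule "start on the first Present() after F8, stop on the next one" are all assumptions made for illustration.

```python
def draws_captured(events, armed_at):
    """Count the draw calls a frame analysis would see.

    Assumed rule (hypothetical): analysis starts at the first Present
    at or after index armed_at and stops at the very next Present.
    """
    started = False
    captured = 0
    for event in events[armed_at:]:
        if event.startswith("present"):
            if started:
                return captured  # second Present ends the analysis
            started = True       # first Present begins the analysis
        elif started:
            captured += 1        # a draw call inside the analysed window
    return captured


# One frame as it might look with an overlay that also calls Present().
frame = ["draw", "draw", "draw", "present:game", "present:overlay"]
events = frame * 3

# F8 pressed while the game is drawing: starts on the game's Present,
# stops on the overlay's Present, so nothing is captured.
print(draws_captured(events, armed_at=0))

# F8 lands in the short gap between the two Presents: a full frame is captured.
print(draws_captured(events, armed_at=4))
```

If the model is right, the gap between the two Present() calls is tiny compared to the rest of the frame, which would match only the occasional F8 press producing files.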

2x Geforce GTX 980 in SLI provided by NVIDIA, i7 6700K 4GHz CPU, Asus 27" VG278HE 144Hz 3D Monitor, BenQ W1070 3D Projector, 120" Elite Screens YardMaster 2, 32GB Corsair DDR4 3200MHz RAM, Samsung 850 EVO 500G SSD, 4x750GB HDD in RAID5, Gigabyte Z170X-Gaming 7 Motherboard, Corsair Obsidian 750D Airflow Edition Case, Corsair RM850i PSU, HTC Vive, Win 10 64bit

Alienware M17x R4 w/ built in 3D, Intel i7 3740QM, GTX 680m 2GB, 16GB DDR3 1600MHz RAM, Win7 64bit, 1TB SSD, 1TB HDD, 750GB HDD

Pre-release 3D fixes, shadertool.py and other goodies: http://github.com/DarkStarSword/3d-fixes
Support me on Patreon: https://www.patreon.com/DarkStarSword or PayPal: https://www.paypal.me/DarkStarSword

#43
Posted 07/24/2015 03:16 AM   
DarkStarSword said:
bo3b said:@DHR: For the 5b48 vertex shader, I do think that Decompile code is wrong. I'm still puzzling out why this happened, but here is the replacement code, and it will make the output identical to the original ASM at the bottom of the file.
Looks like the same thing I hit in FC4 the other day. Did you see my theory that this may be due to the differences between boolean comparisons in asm vs HLSL:
https://github.com/bo3b/3Dmigoto/commit/3e6a0a5daf91262efbb673c4d2621a7406f58d1a

Hadn't seen that check-in before, but yep, that's exactly what I'm seeing with this vertex shader.

In general, the Decompiler does try to keep track of the difference between a boolean state and a numeric value, and it tries to use -1 (0xffffffff) for the true state. For example, in _iadd it does:

sprintf(buffer, "  %s = (%s ? -1 : 0) + (%s ? 1 : 0);\n", writeTarget(op1), ci(convertToInt(op2)).c_str(), ci(convertToInt(op3)).c_str());


Where it uses the ternary to provide the correct output of -1 or 0.

Which... has a bug. The second ternary should also be using -1 in the boolean case, to provide the 0xffffffff bitfield that we need. The idea is sound though: take the HLSL notion of a boolean test (0/1) and use it to generate the binary pattern we want.


Now a curious part I cannot seem to quite get around is that if I fix that bug, the code is still generated wrong.

If I do the logical conversion:
r0.x = 0 < r1.z;
r0.y = r1.z < 0;
r0.x = -((int)r0.x ? -1 : 0) + ((int)r0.y ? -1 : 0);
o2.w = r0.x;


I still get:
lt r0.x, l(0.000000), r1.z
and r0.x, r0.x, l(1)
lt r0.y, r1.z, l(0.000000)
iadd r0.x, r0.y, r0.x
itof o2.w, r0.x


At first it looked as though I had lost the minus sign for the iadd, with integer 1 being used instead, and I was ready to blame the fxc compiler.

No, that's right. The and instruction forcing the value to 1, followed by an iadd without the negation, gives the same end result, since the value could only ever be -1 or 0 to start with.
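
A quick way to convince yourself that the two forms agree is to model both in Python. This is only a sketch: the helper names are made up, registers are modelled as 32-bit patterns, and the HLSL comparison is assumed to yield 0/1 before the (int) cast.

```python
MASK = 0xFFFFFFFF


def to_signed(bits):
    """Interpret a 32-bit pattern as a signed two's-complement integer."""
    bits &= MASK
    return bits - (1 << 32) if bits & 0x80000000 else bits


def lt(a, b):
    """Model of the asm lt: all bits set (0xffffffff) when a < b, else 0."""
    return MASK if a < b else 0


def hlsl_expr(z):
    """The hand-written HLSL: -((int)x ? -1 : 0) + ((int)y ? -1 : 0)."""
    x = 1 if 0 < z else 0
    y = 1 if z < 0 else 0
    return -(-1 if x else 0) + (-1 if y else 0)


def fxc_asm(z):
    """The compiled sequence: lt, and with 1, lt, iadd (no negation)."""
    r0x = lt(0, z)
    r0x &= 1                   # and r0.x, r0.x, l(1)
    r0y = lt(z, 0)
    r0x = (r0y + r0x) & MASK   # iadd r0.x, r0.y, r0.x
    return to_signed(r0x)      # value that itof would convert to float


for z in (-2.0, 0.0, 3.5):
    print(z, hlsl_expr(z), fxc_asm(z))
```

Both versions work out to the sign of r1.z: 1 for positive, -1 for negative, 0 for zero.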

Edit: OK, I see what's happened here now. This is a bug in _IADD generation, two bugs actually.

If you replace the output line, this generates different code than the original, but it's still correct.

bad:   r0.x = ((int)r0.x ? -1 : 0) + ((int)r0.y ? 1 : 0);

good: r0.x = -((int)r0.x ? -1 : 0) + ((int)r0.y ? -1 : 0);
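
To see what the bad line actually does to the result, the two expressions can be compared directly. A minimal Python sketch, assuming x and y are the 0/1 results of the two comparisons (the function names are just for illustration):

```python
def bad_iadd(x, y):
    """The line the Decompiler emitted: (x ? -1 : 0) + (y ? 1 : 0)."""
    return (-1 if x else 0) + (1 if y else 0)


def good_iadd(x, y):
    """The corrected line: -(x ? -1 : 0) + (y ? -1 : 0)."""
    return -(-1 if x else 0) + (-1 if y else 0)


def apply_to(iadd, z):
    """Feed the two comparisons from the shader (0 < z, z < 0) into iadd."""
    return iadd(0 < z, z < 0)


for z in (-2.0, 0.0, 3.5):
    print(z, apply_to(bad_iadd, z), apply_to(good_iadd, z))
```

In this model the bad line gives the opposite sign to the good one for any non-zero input, so whatever reads o2.w would see the value flipped.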



Edit2: That _ITOF example you ran into suggests we have this upside down in the Decompiler, because to make that work, it should have tested for a boolean input and then generated the ternary. Rather than do that in every possible spot, it seems like it would be better to put the ternary assignment at each of the test instructions like _LT, _EQ and so on.

Like:

r0.x = (0 < r1.z) ? asint(-1) : 0;



That would be a gigantic, risky change though, so I'll try to think that one through and test it against prior fixes.
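
As a rough illustration of that idea, here is a Python sketch of an emitter that puts the ternary on the comparison itself. It is not the Decompiler's code (which is C++); emit_compare and the OPS table are hypothetical names, and only a few comparison opcodes are covered.

```python
# Hypothetical mapping from asm comparison opcodes to HLSL operators.
OPS = {"lt": "<", "ge": ">=", "eq": "==", "ne": "!="}


def emit_compare(opcode, dst, lhs, rhs):
    """Emit one HLSL line whose result is already the -1/0 bit pattern."""
    return f"  {dst} = ({lhs} {OPS[opcode]} {rhs}) ? asint(-1) : 0;\n"


print(emit_compare("lt", "r0.x", "0", "r1.z"), end="")
```

With the pattern produced at the test instruction, later instructions like _IADD and _ITOF would not need their own boolean special cases, which is the appeal. The risk is that every existing fix was generated under the old assumption.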

Acer H5360 (1280x720@120Hz) - ASUS VG248QE with GSync mod - 3D Vision 1&2 - Driver 372.54
GTX 970 - i5-4670K@4.2GHz - 12GB RAM - Win7x64+evilKB2670838 - 4 Disk X25 RAID
SAGER NP9870-S - GTX 980 - i7-6700K - Win10 Pro 1607
Latest 3Dmigoto Release
Bo3b's School for ShaderHackers

#44
Posted 07/24/2015 03:56 AM   
DHR said:Ok, the "HLSL Error" message scared me a bit...

I don't delete anything....they are dumped like that.


Edit: sorry bo3b!...I just noticed that d3dx.ini has "export_hlsl=1"...I just changed it to 2.

Ah, right-o. If you regenerate the file or have the ASM available, I'm happy to take a look at the shaders that seem to be in error.


DHR said:Yes, have a true reflection and the surface have some kind of tesselation (it's a hard one)...Also, when you look a little up to that water all reflections are ok due the true reflection movement. Also changing WATER quality or/and SHADING quality looks the same.

If the 5b48 VS above is active here, that could easily explain the symptoms. The output would be either '1' or 0, and in the wrong order. If it's in use, try my hand-fixed version, as its output will be identical to the original.


#45
Posted 07/24/2015 04:07 AM   