Bo3b's School For Shaderhackers
Quick question [I might sound like an idiot]: what "would" have a stereo view? As I understand it, a model would just have a single image of all its parts/textures stretched onto it. Just curious why there is a distinction between stereo/mono.
-----------------
I'm mainly interested because I had a funny idea for a project. I want to see if I can switch Elizabeth into a different version of herself [using one of the different models], as a fun little quirk for multiple playthroughs of Bioshock Infinite. It probably can't be done, but I would like to try. I was just looking through my files and saw that image where helix swapped Moxxi with Elizabeth and it got me really curious.
Not to mention I have a growing interest in models since I think it will be cool with sfm / potentially VR. I kind of wanted to mess with something unique.

Co-founder/Web host of helixmod.blog.com

Donations for web hosting @ paypal -eqzitara@yahoo.com
or
https://www.patreon.com/user?u=791918

Posted 01/27/2016 01:14 AM   
A texture can be stereo if it was previously used as a stereo render target or depth target (there's no real distinction between these internally, only how they are used and whether their contents came from the CPU or GPU). For instance, if you dump out the textures of a lighting shader you will find a stereo depth buffer and a mono shadow map among them (and likely also stereo normal, specularity and colour buffers). Similarly, a 3D mirror shader will have a stereo colour buffer in one of its texture slots.

Being able to dump these as stereo is very handy: it shows whether any that should be stereo are mono (or vice versa), so we know which ones need to be forced, and it lets us check whether a broken effect was broken by a particular shader or had already been broken earlier in the frame.

Unfortunately the NVIDIA API doesn't give us a way to tell whether a resource was mono or stereo, so 3DMigoto has to dump everything as stereo unless you tell it otherwise, which results in mono resources ending up with either nothing or VRAM garbage in the other eye.
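
Because every dump comes out as stereo, it can help to sanity-check the second eye of each dumped image. The following is a minimal Python sketch of that check and is not part of 3DMigoto: it assumes the dump has already been decoded into rows of grey values laid out side by side (first eye in the left half, second eye in the right half), and `classify_dump` is a hypothetical helper name.

```python
def classify_dump(pixels):
    """Guess what the second eye of a side-by-side dump contains.

    `pixels` is a list of rows; each row is a list of grey values with the
    first eye in the left half and the second eye in the right half
    (an assumed layout for this sketch).
    """
    half = len(pixels[0]) // 2
    left = [row[:half] for row in pixels]
    right = [row[half:] for row in pixels]

    # A mono resource dumped as stereo may leave the second eye empty.
    if all(value == 0 for row in right for value in row):
        return "mono (second eye empty)"
    # Same content in both eyes: no per-eye difference to speak of.
    if left == right:
        return "identical in both eyes"
    # Real stereo content and leftover garbage both end up here.
    return "differs between eyes (stereo, or garbage in second eye)"
```

An empty second half points to a mono resource. A half that differs still needs eyeballing, since garbage and genuine stereo content both land in the last bucket.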

2x Geforce GTX 980 in SLI provided by NVIDIA, i7 6700K 4GHz CPU, Asus 27" VG278HE 144Hz 3D Monitor, BenQ W1070 3D Projector, 120" Elite Screens YardMaster 2, 32GB Corsair DDR4 3200MHz RAM, Samsung 850 EVO 500G SSD, 4x750GB HDD in RAID5, Gigabyte Z170X-Gaming 7 Motherboard, Corsair Obsidian 750D Airflow Edition Case, Corsair RM850i PSU, HTC Vive, Win 10 64bit

Alienware M17x R4 w/ built in 3D, Intel i7 3740QM, GTX 680m 2GB, 16GB DDR3 1600MHz RAM, Win7 64bit, 1TB SSD, 1TB HDD, 750GB HDD

Pre-release 3D fixes, shadertool.py and other goodies: http://github.com/DarkStarSword/3d-fixes
Support me on Patreon: https://www.patreon.com/DarkStarSword or PayPal: https://www.paypal.me/DarkStarSword

Posted 01/27/2016 04:55 AM   
DarkStarSword said:
DJ-RK said: So that's great about all that being sorted out, however that doesn't seem to improve the matter of the environmental lighting/shadows. I didn't bother to try modifying any of those shaders now that I've made the above adjustment, but glancing at the broken effects, they don't look any different than before, so I have my doubts that they've been affected by this... yet.
Sometimes directional shadows follow the same pattern as point lights, and sometimes they don't. They could well be using a different format shadow map - it's at least fairly common for there to be several of different sizes if the game is using "cascaded shadows".


Ok, so coming back to this one, and pretty sure I've narrowed things down a bit, but still need a little bit of help.

I'm pretty sure I've determined that the fix needs to be done in the following PS (38EA4D18):

//
// Generated by Microsoft (R) HLSL Shader Compiler 9.29.952.3111
//
// Parameters:
//
// float4 CBScreen__packed1;
// float4 CBViewProjection__packed4;
// sampler2D SSPoint__tDepthMap;
// sampler2D SSShadowDepth__tShadowMapCombine;
// float4 __tShadowMapCombine__invsize;
// row_major float4x4 fProj;
//
// struct
// {
// float4 __packed0;
// float4 __packed1;
// float4 fShadowRange;
// row_major float4x4 fShadowProjectNear;
// row_major float4x4 fShadowProjectMiddle;
// row_major float4x4 fShadowProjectFar;
// float4 __packed15;
// float4 __packed16;
// float4 fShadowMapSize;
// row_major float3x4 fShadowRangeMat;
// float4 __packed21;
// float4 __packed22;
//
// } sShadowReceiveParam;
//
//
// Registers:
//
// Name Reg Size
// -------------------------------- ----- ----
// sShadowReceiveParam c1 23
// fProj c24 4
// CBScreen__packed1 c28 1
// CBViewProjection__packed4 c29 1
// __tShadowMapCombine__invsize c30 1
// SSPoint__tDepthMap s0 1
// SSShadowDepth__tShadowMapCombine s1 1
//

ps_3_0
def c220, 0, 0, 0.0625, 0
dcl_2d s13
def c0, 0.5, 1, 0.00392156886, 1.53787005e-005
def c31, 1, 3, -0.5, -1.5
def c32, 1, 3, -0.5, -2.5
def c33, 0.25, 16.0804024, 8, -5
def c34, -2, 3, 0, 0
def c35, -0.497500002, 0, 1, -1
dcl_texcoord v0.xyz
dcl vPos.xy
dcl_2d s0
dcl_2d s1
add r0.xy, c0.x, vPos
mul r0.xy, r0, c28.zwzw
texld r0, r0, s0
dp3 r0.x, r0, c0.yzww
add r0.x, r0.x, c26.z
rcp r0.x, r0.x
mul r0.x, r0.x, c27.z
if_lt r0.x, c3.z
mad r0.yzw, r0.x, v0.xxyz, c29.xxyz
// mov r21.xyz, r0.yzw
// mul r22, c175, r21.y
// mad r22, c174, r21.x, r22
// mad r22, c176, r21.z, r22
// add r22, r22, c177
// texldl r24, c220.z, s13
// add r24.y, r22.w, -r24.y
// mul r24.x, r24.x, r24.y
// add r22.x, r22.x, -r24.x
// rcp r22.w, r22.w
// mul r22.xyz, r22.xyz, r22.w
// mul r21, c171, r22.y
// mad r21, c170, r22.x, r21
// mad r21, c172, r22.z, r21
// add r21, r21, c173
// rcp r21.w, r21.w
// mul r21.xyz, r21.xyz, r21.w
// mov r0.yzw, r21.xyz
mul r1.xyz, r0.z, c5
mad r1.xyz, r0.y, c4, r1
mad r1.xyz, r0.w, c6, r1
add r1.xyz, r1, c7
mul r2.xyz, r0.z, c9
mad r2.xyz, r0.y, c8, r2
mad r2.xyz, r0.w, c10, r2
add r2.xyz, r2, c11
mul r3.xyz, r0.z, c13
mad r3.xyz, r0.y, c12, r3
mad r0.yzw, r0.w, c14.xxyz, r3.xxyz
add r0.yzw, r0, c15.xxyz
mad r3.xy, r1, c31, c31.z
mad r3.zw, r2.xyxy, c31.xyxy, c31
mad r4.xy, r0.yzzw, c32, c32.zwzw
add r4.zw, r3_abs.xyxy, c35.x
cmp r4.zw, r4, c35.y, c35.z
mul r1.w, r4.w, r4.z
add r4.zw, r3_abs, c35.x
cmp r4.zw, r4, c35.y, c35.z
mul r2.w, r4.w, r4.z
cmp r3.zw, -r2.w, r4_abs.xyxy, r3_abs
cmp r3.xy, -r1.w, r3.zwzw, r3_abs
cmp r4.xyz, -r2.w, r0.yzww, r2
cmp r4.xyz, -r1.w, r4, r1
cmp r1.xyz, -r1.w, r0.yzww, r2
add r2.xy, r4, -c30
mov r2.zw, c35.y
texldl r2, r2, s1
mov r0.yzw, c35
mad r5, c30.xyxy, r0.wywz, r4.xyxy
mul r6, r5.xyxx, c35.zzyy
texldl r6, r6, s1
mul r5, r5.zwxx, c35.zzyy
texldl r5, r5, s1
mad r7, c30.xyxy, r0.ywyz, r4.xyxy
mul r8, r7.xyxx, c35.zzyy
texldl r8, r8, s1
mov r4.w, c35.y
texldl r9, r4.xyww, s1
mul r7, r7.zwxx, c35.zzyy
texldl r7, r7, s1
mad r10, c30.xyxy, r0.zwzy, r4.xyxy
mul r11, r10.xyxx, c35.zzyy
texldl r11, r11, s1
mul r10, r10.zwxx, c35.zzyy
texldl r10, r10, s1
add r12.xy, r4, c30
mov r12.zw, c35.y
texldl r12, r12, s1
mov r2.y, r6.x
mov r2.z, r5.x
add r2.xyz, -r2, r4.z
cmp r5.xyz, r2, c35.z, c35.y
mov r8.y, r9.x
mov r8.z, r7.x
add r6.xyz, r4.z, -r8
cmp r6.xyz, r6, c35.z, c35.y
mov r11.y, r10.x
mov r11.z, r12.x
add r7.xyz, r4.z, -r11
cmp r7.xyz, r7, c35.z, c35.y
mul r3.zw, r4.xyxy, c18.xyxy
frc r3.zw, r3
add r4.xyz, r5, r6
cmp r2.xyz, r2, -c35.z, -c35.y
add r2.xyz, r2, r7
mad r2.xyz, r2, r3.z, r4
add r2.y, r2.y, r2.x
add r2.x, -r2.x, r2.z
mad r2.x, r2.x, r3.w, r2.y
add r4.xy, r1, -c30
mov r4.zw, c35.y
texldl r4, r4, s1
mad r5, c30.xyxy, r0.wywz, r1.xyxy
mul r6, r5.xyxx, c35.zzyy
texldl r6, r6, s1
mul r5, r5.zwxx, c35.zzyy
texldl r5, r5, s1
mad r7, c30.xyxy, r0.ywyz, r1.xyxy
mul r8, r7.xyxx, c35.zzyy
texldl r8, r8, s1
mov r1.w, c35.y
texldl r9, r1.xyww, s1
mul r7, r7.zwxx, c35.zzyy
texldl r7, r7, s1
mad r10, c30.xyxy, r0.zwzy, r1.xyxy
mul r11, r10.xyxx, c35.zzyy
texldl r11, r11, s1
mul r10, r10.zwxx, c35.zzyy
texldl r10, r10, s1
add r12.xy, r1, c30
mov r12.zw, c35.y
texldl r12, r12, s1
mov r4.y, r6.x
mov r4.z, r5.x
add r0.yzw, r1.z, -r4.xxyz
cmp r2.yzw, r0, c35.z, c35.y
mov r8.y, r9.x
mov r8.z, r7.x
add r4.xyz, r1.z, -r8
cmp r4.xyz, r4, c35.z, c35.y
mov r11.y, r10.x
mov r11.z, r12.x
add r5.xyz, r1.z, -r11
cmp r5.xyz, r5, c35.z, c35.y
mul r1.xy, r1, c18
frc r1.xy, r1
add r2.yzw, r2, r4.xxyz
cmp r0.yzw, r0, -c35.z, -c35.y
add r0.yzw, r0, r5.xxyz
mad r0.yzw, r0, r1.x, r2
add r0.z, r0.z, r0.y
add r0.y, -r0.y, r0.w
mad r0.y, r0.y, r1.y, r0.z
mul r0.y, r0.y, c33.x
max r0.z, r3.x, r3.y
mad_sat r0.z, r0.z, -c33.y, c33.z
mad r0.w, r2.x, c33.x, -r0.y
mad r0.y, r0.z, r0.w, r0.y
else
mov r0.y, c35.y
endif
mov r0.w, c3.w
mad r0.z, r0.x, r0.w, -c23.y
mad r0.x, r0.x, r0.w, -c0.y
mul_sat r0.x, r0.x, c33.w
mad r0.w, r0.x, c34.x, c34.y
mul r0.x, r0.x, r0.x
mul r0.x, r0.x, r0.w
cmp r0.x, r0.z, r0.x, c35.y
mul oC0, r0.x, r0.y

// approximately 161 instruction slots used (37 texture, 124 arithmetic)


Like I mentioned before, there is a near identical shader in the RE6 fix. When I first tested this, I thought applying the fix was simply making the shadows disappear, but after a bit more testing and some different variations of the fix, I've learned that fixing in this PS actually does get the shadows stereoized properly. However, their position moves relative to the camera (running around or changing the view causes the shadows to move as well).

Being fairly certain that c29 is the camera position, and based on the above observation I've tried separating the following line:
mad r0.yzw, r0.x, v0.xxyz, c29.xxyz

into 2 separate lines:
mul r0.yzw, r0.x, v0.xxyz
add r0.yzw, r0.yzw, c29.xyz

and performing the stereo correction between them, and that didn't work.
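
One thing worth ruling out with the split form is the swizzle. The sketch below models ps_3_0 operand handling in Python under the assumption that a source swizzle is expanded to four lanes first (a short swizzle repeating its last component) and the destination write mask then picks lanes by position. That is consistent with the compiler's own output in the listing, which writes `add r0.yzw, r0, c15.xxyz` rather than `c15.xyz`. The register values are made up and the helper names are hypothetical.

```python
LANES = "xyzw"


def swizzle(reg, sw):
    """Expand a source swizzle to 4 lanes; a short swizzle repeats its last component."""
    sw = sw + sw[-1] * (4 - len(sw))
    return [reg[LANES.index(c)] for c in sw]


def add_masked(dest, mask, a, b):
    """Lane-wise add of two already-swizzled vectors, writing only the masked lanes."""
    out = list(dest)
    for lane in mask:
        i = LANES.index(lane)
        out[i] = a[i] + b[i]
    return out


r0 = [9.0, 1.0, 2.0, 3.0]      # r0.yzw holds a made-up position (1, 2, 3)
c29 = [10.0, 20.0, 30.0, 0.0]  # made-up camera position (10, 20, 30)

# add r0.yzw, r0, c29.xxyz  (swizzle style used by the compiler output)
good = add_masked(r0, "yzw", swizzle(r0, "xyzw"), swizzle(c29, "xxyz"))

# add r0.yzw, r0.yzw, c29.xyz  (the split form quoted above)
split = add_masked(r0, "yzw", swizzle(r0, "yzw"), swizzle(c29, "xyz"))

print(good)   # position + camera, lane for lane
print(split)  # lanes shifted by one
```

Under that assumption the two forms give different results, so the split line would need to keep the original `c29.xxyz` swizzle and an unswizzled `r0` source to match the `mad` it replaces.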

I also tried following your suggestion to me on another game, which was to attempt to 'convert correction value based on depth alone to world-space then subtract it from the coordinate', as per the following:

//
// Generated by Microsoft (R) HLSL Shader Compiler 9.29.952.3111
//
// Parameters:
//
// float4 CBScreen__packed1;
// float4 CBViewProjection__packed4;
// sampler2D SSPoint__tDepthMap;
// sampler2D SSShadowDepth__tShadowMapCombine;
// float4 __tShadowMapCombine__invsize;
// row_major float4x4 fProj;
//
// struct
// {
// float4 __packed0;
// float4 __packed1;
// float4 fShadowRange;
// row_major float4x4 fShadowProjectNear;
// row_major float4x4 fShadowProjectMiddle;
// row_major float4x4 fShadowProjectFar;
// float4 __packed15;
// float4 __packed16;
// float4 fShadowMapSize;
// row_major float3x4 fShadowRangeMat;
// float4 __packed21;
// float4 __packed22;
//
// } sShadowReceiveParam;
//
//
// Registers:
//
// Name Reg Size
// -------------------------------- ----- ----
// sShadowReceiveParam c1 23
// fProj c24 4
// CBScreen__packed1 c28 1
// CBViewProjection__packed4 c29 1
// __tShadowMapCombine__invsize c30 1
// SSPoint__tDepthMap s0 1
// SSShadowDepth__tShadowMapCombine s1 1
//

ps_3_0
def c220, 0, 1, 0.0625, 0
dcl_2d s13
def c0, 0.5, 1, 0.00392156886, 1.53787005e-005
def c31, 1, 3, -0.5, -1.5
def c32, 1, 3, -0.5, -2.5
def c33, 0.25, 16.0804024, 8, -5
def c34, -2, 3, 0, 0
def c35, -0.497500002, 0, 1, -1
dcl_texcoord v0.xyz
dcl vPos.xy
dcl_2d s0
dcl_2d s1
add r0.xy, c0.x, vPos
mul r0.xy, r0, c28.zwzw
texld r0, r0, s0
dp3 r0.x, r0, c0.yzww
add r0.x, r0.x, c26.z
rcp r0.x, r0.x
mul r0.x, r0.x, c27.z

mov r30.xyzw, c220.xxxy
mov r30.x, r0.x

if_lt r0.x, c3.z
// mad r0.yzw, r0.x, v0.xxyz, c29.xxyz

mul r0.yzw, r0.x, v0.xxyz
texldl r24, c220.z, s13
add r30.x, r30.x, -r24.y
mul r30.x, r24.x, r30.x

mul r21, c171, r30.y
mad r21, c170, r30.x, r21
mad r21, c172, r30.z, r21
add r21, r21, c173
// rcp r21.w, r21.w
// mul r21.xyz, r21.xyz, r21.w

add r0.yzw, r0.yzw, -r21.xyz
add r0.yzw, r0.yzw, c29.xyz

mul r1.xyz, r0.z, c5
mad r1.xyz, r0.y, c4, r1
mad r1.xyz, r0.w, c6, r1
add r1.xyz, r1, c7
mul r2.xyz, r0.z, c9
mad r2.xyz, r0.y, c8, r2
mad r2.xyz, r0.w, c10, r2
add r2.xyz, r2, c11
mul r3.xyz, r0.z, c13
mad r3.xyz, r0.y, c12, r3
mad r0.yzw, r0.w, c14.xxyz, r3.xxyz
add r0.yzw, r0, c15.xxyz
mad r3.xy, r1, c31, c31.z
mad r3.zw, r2.xyxy, c31.xyxy, c31
mad r4.xy, r0.yzzw, c32, c32.zwzw
add r4.zw, r3_abs.xyxy, c35.x
cmp r4.zw, r4, c35.y, c35.z
mul r1.w, r4.w, r4.z
add r4.zw, r3_abs, c35.x
cmp r4.zw, r4, c35.y, c35.z
mul r2.w, r4.w, r4.z
cmp r3.zw, -r2.w, r4_abs.xyxy, r3_abs
cmp r3.xy, -r1.w, r3.zwzw, r3_abs
cmp r4.xyz, -r2.w, r0.yzww, r2
cmp r4.xyz, -r1.w, r4, r1
cmp r1.xyz, -r1.w, r0.yzww, r2
add r2.xy, r4, -c30
mov r2.zw, c35.y
texldl r2, r2, s1
mov r0.yzw, c35
mad r5, c30.xyxy, r0.wywz, r4.xyxy
mul r6, r5.xyxx, c35.zzyy
texldl r6, r6, s1
mul r5, r5.zwxx, c35.zzyy
texldl r5, r5, s1
mad r7, c30.xyxy, r0.ywyz, r4.xyxy
mul r8, r7.xyxx, c35.zzyy
texldl r8, r8, s1
mov r4.w, c35.y
texldl r9, r4.xyww, s1
mul r7, r7.zwxx, c35.zzyy
texldl r7, r7, s1
mad r10, c30.xyxy, r0.zwzy, r4.xyxy
mul r11, r10.xyxx, c35.zzyy
texldl r11, r11, s1
mul r10, r10.zwxx, c35.zzyy
texldl r10, r10, s1
add r12.xy, r4, c30
mov r12.zw, c35.y
texldl r12, r12, s1
mov r2.y, r6.x
mov r2.z, r5.x
add r2.xyz, -r2, r4.z
cmp r5.xyz, r2, c35.z, c35.y
mov r8.y, r9.x
mov r8.z, r7.x
add r6.xyz, r4.z, -r8
cmp r6.xyz, r6, c35.z, c35.y
mov r11.y, r10.x
mov r11.z, r12.x
add r7.xyz, r4.z, -r11
cmp r7.xyz, r7, c35.z, c35.y
mul r3.zw, r4.xyxy, c18.xyxy
frc r3.zw, r3
add r4.xyz, r5, r6
cmp r2.xyz, r2, -c35.z, -c35.y
add r2.xyz, r2, r7
mad r2.xyz, r2, r3.z, r4
add r2.y, r2.y, r2.x
add r2.x, -r2.x, r2.z
mad r2.x, r2.x, r3.w, r2.y
add r4.xy, r1, -c30
mov r4.zw, c35.y
texldl r4, r4, s1
mad r5, c30.xyxy, r0.wywz, r1.xyxy
mul r6, r5.xyxx, c35.zzyy
texldl r6, r6, s1
mul r5, r5.zwxx, c35.zzyy
texldl r5, r5, s1
mad r7, c30.xyxy, r0.ywyz, r1.xyxy
mul r8, r7.xyxx, c35.zzyy
texldl r8, r8, s1
mov r1.w, c35.y
texldl r9, r1.xyww, s1
mul r7, r7.zwxx, c35.zzyy
texldl r7, r7, s1
mad r10, c30.xyxy, r0.zwzy, r1.xyxy
mul r11, r10.xyxx, c35.zzyy
texldl r11, r11, s1
mul r10, r10.zwxx, c35.zzyy
texldl r10, r10, s1
add r12.xy, r1, c30
mov r12.zw, c35.y
texldl r12, r12, s1
mov r4.y, r6.x
mov r4.z, r5.x
add r0.yzw, r1.z, -r4.xxyz
cmp r2.yzw, r0, c35.z, c35.y
mov r8.y, r9.x
mov r8.z, r7.x
add r4.xyz, r1.z, -r8
cmp r4.xyz, r4, c35.z, c35.y
mov r11.y, r10.x
mov r11.z, r12.x
add r5.xyz, r1.z, -r11
cmp r5.xyz, r5, c35.z, c35.y
mul r1.xy, r1, c18
frc r1.xy, r1
add r2.yzw, r2, r4.xxyz
cmp r0.yzw, r0, -c35.z, -c35.y
add r0.yzw, r0, r5.xxyz
mad r0.yzw, r0, r1.x, r2
add r0.z, r0.z, r0.y
add r0.y, -r0.y, r0.w
mad r0.y, r0.y, r1.y, r0.z
mul r0.y, r0.y, c33.x
max r0.z, r3.x, r3.y
mad_sat r0.z, r0.z, -c33.y, c33.z
mad r0.w, r2.x, c33.x, -r0.y
mad r0.y, r0.z, r0.w, r0.y
else
mov r0.y, c35.y
endif
mov r0.w, c3.w
mad r0.z, r0.x, r0.w, -c23.y
mad r0.x, r0.x, r0.w, -c0.y
mul_sat r0.x, r0.x, c33.w
mad r0.w, r0.x, c34.x, c34.y
mul r0.x, r0.x, r0.x
mul r0.x, r0.x, r0.w
cmp r0.x, r0.z, r0.x, c35.y
mul oC0, r0.x, r0.y

// approximately 161 instruction slots used (37 texture, 124 arithmetic)


I'm not sure, but I think that made it look a little bit better, though it is still behaving the same way.

So I've tried everything that I've learned and it feels like it's really close, but I'm out of options here. Is there anything else you can see/suggest in this shader?
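
For reference, the "correction based on depth alone" idea can be sketched numerically in Python. The offset formula, separation * (depth - convergence), is the one used in the commented-out block of the first listing (`add r24.y, r22.w, -r24.y` followed by `mul r24.x, r24.x, r24.y`). The matrix below is made up, since the thread does not say what c170-c173 hold, and the function names are hypothetical.

```python
def stereo_offset(separation, convergence, depth):
    """Per-eye horizontal correction: separation * (depth - convergence)."""
    return separation * (depth - convergence)


def transform_rows(vec, rows):
    """vec.x*row0 + vec.y*row1 + vec.z*row2 + vec.w*row3 (the mul/mad/mad/add pattern)."""
    return [sum(vec[i] * rows[i][j] for i in range(4)) for j in range(4)]


# Made-up matrix: identity rotation with a translation of (100, 200, 300).
rows = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [100.0, 200.0, 300.0, 1.0],
]

offset = stereo_offset(0.5, 1.0, 3.0)

# Treated as a point (w = 1), as in "add r21, r21, c173": translation is included.
as_point = transform_rows([offset, 0.0, 0.0, 1.0], rows)

# Treated as a direction (w = 0): only the rotated offset remains.
as_direction = transform_rows([offset, 0.0, 0.0, 0.0], rows)

print(as_point[:3], as_direction[:3])
```

The offset vanishes at the convergence depth and flips sign with the eye. Whether the point/direction difference matters in this shader depends on what the real matrix contains, which this sketch cannot tell.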

3D Gaming Rig: CPU: i7 7700K @ 4.9Ghz | Mobo: Asus Maximus Hero VIII | RAM: Corsair Dominator 16GB | GPU: 2 x GTX 1080 Ti SLI | 3xSSDs for OS and Apps, 2 x HDD's for 11GB storage | PSU: Seasonic X-1250 M2| Case: Corsair C70 | Cooling: Corsair H115i Hydro cooler | Displays: Asus PG278QR, BenQ XL2420TX & BenQ HT1075 | OS: Windows 10 Pro + Windows 7 dual boot

Like my fixes? Donations can be made to: www.paypal.me/DShanz or rshannonca@gmail.com
Like electronic music? Check out: www.soundcloud.com/dj-ryan-king

Posted 01/29/2016 08:41 AM   
I think something is preventing that shader from running for both eyes (or something later in the frame is throwing away the left eye's buffer), as evidenced by:

def c220, 0, 1, 0.0625, 0.5
dcl_2d s3
...

texldl r31, c220.z, s3
if_lt r31.x, c220.x
mov oC0, c220.x
endif


does nothing, while:

if_gt r31.x, c220.x
mov oC0, c220.x
endif


removes shadows from both eyes. If this shader were running in stereo, then each of these should have disabled the shadows in a single eye.

I tried checking for mono render targets or depth buffers, but the search came up empty (unless it's one of those 256x256 format=12 surfaces that give me an out of memory error when I try to stereoise them).

This is where I miss being able to use 3DMigoto's frame analysis to pinpoint the exact problem. Best guess is we will need to find a driver profile that makes this shader run for both eyes. I tried the RE6 profile as well as setting all the bits in StereoTextureEnable that we know are used to stereoise things (I'll PM you the bit definitions), but we don't have any info about the other settings, so at this point it's guesswork.

Edit: Rookie mistake... notice which sampler I used?
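
The single-eye test in the post above depends on the sampled separation value having a different sign in each eye. Here is a minimal Python sketch of that logic, not shader code. The numeric values are made up, the sign convention (negative in the left eye) is an assumption, and the "not stereo" case is a hypothetical illustration of what happens when the sampled value is the same in both eyes.

```python
# Sketch of the single-eye test: overwrite the output only when the sign of
# the sampled value matches the comparison.

ZERO = 0.0  # stands in for c220.x


def branch_taken(separation, compare):
    """Return True if the debug branch (mov oC0, 0) would run for this eye."""
    if compare == "lt":   # models: if_lt r31.x, c220.x
        return separation < ZERO
    if compare == "gt":   # models: if_gt r31.x, c220.x
        return separation > ZERO
    raise ValueError("compare must be 'lt' or 'gt'")


def eyes_affected(compare, left_value, right_value):
    """Report which eyes would have their output overwritten."""
    return {
        "left": branch_taken(left_value, compare),
        "right": branch_taken(right_value, compare),
    }


# Assumed values: separation is negative in the left eye, positive in the right.
stereo = (-0.05, 0.05)
# Hypothetical case: the sampled value is the same positive number in both eyes.
not_stereo = (0.3, 0.3)

print(eyes_affected("lt", *stereo))      # one eye only
print(eyes_affected("gt", *stereo))      # the other eye only
print(eyes_affected("lt", *not_stereo))  # neither eye
print(eyes_affected("gt", *not_stereo))  # both eyes
```

The last two cases reproduce the symptom described in the post (one test does nothing, the other hits both eyes), which is what you would see whenever the value being compared is not actually different per eye.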

2x Geforce GTX 980 in SLI provided by NVIDIA, i7 6700K 4GHz CPU, Asus 27" VG278HE 144Hz 3D Monitor, BenQ W1070 3D Projector, 120" Elite Screens YardMaster 2, 32GB Corsair DDR4 3200MHz RAM, Samsung 850 EVO 500G SSD, 4x750GB HDD in RAID5, Gigabyte Z170X-Gaming 7 Motherboard, Corsair Obsidian 750D Airflow Edition Case, Corsair RM850i PSU, HTC Vive, Win 10 64bit

Alienware M17x R4 w/ built in 3D, Intel i7 3740QM, GTX 680m 2GB, 16GB DDR3 1600MHz RAM, Win7 64bit, 1TB SSD, 1TB HDD, 750GB HDD

Pre-release 3D fixes, shadertool.py and other goodies: http://github.com/DarkStarSword/3d-fixes
Support me on Patreon: https://www.patreon.com/DarkStarSword or PayPal: https://www.paypal.me/DarkStarSword

Posted 01/29/2016 11:25 AM   
Hmmm, ok, I do get the logic behind that, and when I would perform the stereo fix on some shaders related to this effect, the light/shadows would only render in one eye, which seems to support that.

However, does this screenshot debunk that at all? I took this when I was editing the shader earlier and noticed the shadows looked perfectly stereoized (just going all over the place as I moved/looked around).

Image

And thanks for that info via PM. I'll try to play around with that and see what I might come up with.
Attachments

DDDA08_85.jps

3D Gaming Rig: CPU: i7 7700K @ 4.9Ghz | Mobo: Asus Maximus Hero VIII | RAM: Corsair Dominator 16GB | GPU: 2 x GTX 1080 Ti SLI | 3xSSDs for OS and Apps, 2 x HDD's for 11GB storage | PSU: Seasonic X-1250 M2| Case: Corsair C70 | Cooling: Corsair H115i Hydro cooler | Displays: Asus PG278QR, BenQ XL2420TX & BenQ HT1075 | OS: Windows 10 Pro + Windows 7 dual boot

Like my fixes? Donations can be made to: www.paypal.me/DShanz or rshannonca@gmail.com
Like electronic music? Check out: www.soundcloud.com/dj-ryan-king

Posted 01/29/2016 11:51 AM   
Got it solved. I'd made a couple of rookie mistakes that cost me some time. The matrix copy from the lantern shader only works while that shader is active, and thanks to the draw order it was always a frame out of date. The vertex shader for directional lighting has the inverse view-projection matrix, but for some reason copying it with Helix Mod wasn't working, so I passed it to the pixel shader through extra texcoord outputs instead. (Edit: I spotted a typo in my DX9Settings.ini file that would probably explain the matrix copy not working, so feel free to use that method if you prefer, or if you need to copy the matrix to other shaders. Just keep in mind this shader has the inverse matrix, whereas the other shader you were copying from has the forward matrix.)

Vertex Shader 66938957:
// directional shadows

//
// Generated by Microsoft (R) HLSL Shader Compiler 9.29.952.3111
//
// Parameters:
//
// float4 CBViewProjection__packed4;
// float4 CBViewProjection__packed5;
// float2 fScreenHalfPixelOffset;
// row_major float4x4 fViewProjI;
//
//
// Registers:
//
// Name Reg Size
// ------------------------- ----- ----
// fViewProjI c1 4
// fScreenHalfPixelOffset c5 1
// CBViewProjection__packed4 c6 1
// CBViewProjection__packed5 c7 1
//

vs_3_0
def c0, 0, 1, 0, 0
dcl_position v0
dcl_position o0
dcl_texcoord o1

// Matrix copy with Helix Mod is not working from this shader for some reason
// (and the matrix copied from VS6581573B is a frame out of date and only valid
// if that shader is active), so add extra outputs to copy the inverse view
// projection matrix to the pixel shader instead:
dcl_texcoord1 o2
dcl_texcoord2 o3
dcl_texcoord3 o4
dcl_texcoord4 o5

mov o2, c1
mov o3, c2
mov o4, c3
mov o5, c4

mul r0, c2, v0.y
mad r0, v0.x, c1, r0
add r0, r0, c4
rcp r0.w, r0.w
mad r0.xyz, r0, r0.w, -c6
dp3 r0.w, c7, r0
rcp r0.w, r0.w
mul o1.xyz, r0.w, r0
add o0.x, -c5.x, v0.x
add o0.y, c5.y, v0.y
mov o0.zw, c0.xyxy
mov o1.w, c0.x


Pixel shader 38EA4D18:
// directional shadows (shows streaks when hunted)

//
// Generated by Microsoft (R) HLSL Shader Compiler 9.29.952.3111
//
// Parameters:
//
// float4 CBScreen__packed1;
// float4 CBViewProjection__packed4;
// sampler2D SSPoint__tDepthMap;
// sampler2D SSShadowDepth__tShadowMapCombine;
// float4 __tShadowMapCombine__invsize;
// row_major float4x4 fProj;
//
// struct
// {
// float4 __packed0;
// float4 __packed1;
// float4 fShadowRange;
// row_major float4x4 fShadowProjectNear;
// row_major float4x4 fShadowProjectMiddle;
// row_major float4x4 fShadowProjectFar;
// float4 __packed15;
// float4 __packed16;
// float4 fShadowMapSize;
// row_major float3x4 fShadowRangeMat;
// float4 __packed21;
// float4 __packed22;
//
// } sShadowReceiveParam;
//
//
// Registers:
//
// Name Reg Size
// -------------------------------- ----- ----
// sShadowReceiveParam c1 23
// fProj c24 4
// CBScreen__packed1 c28 1
// CBViewProjection__packed4 c29 1
// __tShadowMapCombine__invsize c30 1
// SSPoint__tDepthMap s0 1
// SSShadowDepth__tShadowMapCombine s1 1
//

ps_3_0
def c0, 0.5, 1, 0.00392156886, 1.53787005e-005
def c31, 1, 3, -0.5, -1.5
def c32, 1, 3, -0.5, -2.5
def c33, 0.25, 16.0804024, 8, -5
def c34, -2, 3, 0, 0
def c35, -0.497500002, 0, 1, -1
dcl_texcoord v0.xyz
dcl vPos.xy
dcl_2d s0
dcl_2d s1

def c220, 0, 1, 0.0625, 0.86
dcl_2d s13

// Extra inputs with inverse view-projection matrix from vertex shader:
dcl_texcoord1 v2
dcl_texcoord2 v3
dcl_texcoord3 v4
dcl_texcoord4 v5

add r0.xy, c0.x, vPos
mul r0.xy, r0, c28.zwzw
texld r0, r0, s0
dp3 r0.x, r0, c0.yzww
add r0.x, r0.x, c26.z
rcp r0.x, r0.x
mul r0.x, r0.x, c27.z
if_lt r0.x, c3.z
mad r0.yzw, r0.x, v0.xxyz, c29.xxyz

// World-space correction using depth from r0.x. This method skips the need to
// convert to view or projection space first

// Use a coordinate like (correction, 0, 0, 0). The w=0 is important to
// suppress any position translations in the inverse view-projection matrix
// since we only want to know how much to adjust the world-space coordinate by,
// and we are not adjusting a full coordinate.
mov r30, c220.x

// Typical projection-space correction formula, subtracted:
texldl r31, c220.z, s13
add r31.w, r0.x, -r31.y
mul r30.x, -r31.w, r31.x

// Multiply by the inverse view-projection matrix passed from the vertex shader:
mul r29, r30.x, v2
mad r29, r30.y, v3, r29
mad r29, r30.z, v4, r29
mad r29, r30.w, v5, r29

// Finally adjust the world-space position by adding the correction amount.
// Note the trap here - mask is .yzw. Since that lacks a .x, we need to pad the
// first position in the swizzle since that will be ignored:
add r0.yzw, r0, r29.xxyz

mul r1.xyz, r0.z, c5
mad r1.xyz, r0.y, c4, r1
mad r1.xyz, r0.w, c6, r1
add r1.xyz, r1, c7
mul r2.xyz, r0.z, c9
mad r2.xyz, r0.y, c8, r2
mad r2.xyz, r0.w, c10, r2
add r2.xyz, r2, c11
mul r3.xyz, r0.z, c13
mad r3.xyz, r0.y, c12, r3
mad r0.yzw, r0.w, c14.xxyz, r3.xxyz
add r0.yzw, r0, c15.xxyz
mad r3.xy, r1, c31, c31.z
mad r3.zw, r2.xyxy, c31.xyxy, c31
mad r4.xy, r0.yzzw, c32, c32.zwzw
add r4.zw, r3_abs.xyxy, c35.x
cmp r4.zw, r4, c35.y, c35.z
mul r1.w, r4.w, r4.z
add r4.zw, r3_abs, c35.x
cmp r4.zw, r4, c35.y, c35.z
mul r2.w, r4.w, r4.z
cmp r3.zw, -r2.w, r4_abs.xyxy, r3_abs
cmp r3.xy, -r1.w, r3.zwzw, r3_abs
cmp r4.xyz, -r2.w, r0.yzww, r2
cmp r4.xyz, -r1.w, r4, r1
cmp r1.xyz, -r1.w, r0.yzww, r2
add r2.xy, r4, -c30
mov r2.zw, c35.y
texldl r2, r2, s1
mov r0.yzw, c35
mad r5, c30.xyxy, r0.wywz, r4.xyxy
mul r6, r5.xyxx, c35.zzyy
texldl r6, r6, s1
mul r5, r5.zwxx, c35.zzyy
texldl r5, r5, s1
mad r7, c30.xyxy, r0.ywyz, r4.xyxy
mul r8, r7.xyxx, c35.zzyy
texldl r8, r8, s1
mov r4.w, c35.y
texldl r9, r4.xyww, s1
mul r7, r7.zwxx, c35.zzyy
texldl r7, r7, s1
mad r10, c30.xyxy, r0.zwzy, r4.xyxy
mul r11, r10.xyxx, c35.zzyy
texldl r11, r11, s1
mul r10, r10.zwxx, c35.zzyy
texldl r10, r10, s1
add r12.xy, r4, c30
mov r12.zw, c35.y
texldl r12, r12, s1
mov r2.y, r6.x
mov r2.z, r5.x
add r2.xyz, -r2, r4.z
cmp r5.xyz, r2, c35.z, c35.y
mov r8.y, r9.x
mov r8.z, r7.x
add r6.xyz, r4.z, -r8
cmp r6.xyz, r6, c35.z, c35.y
mov r11.y, r10.x
mov r11.z, r12.x
add r7.xyz, r4.z, -r11
cmp r7.xyz, r7, c35.z, c35.y
mul r3.zw, r4.xyxy, c18.xyxy
frc r3.zw, r3
add r4.xyz, r5, r6
cmp r2.xyz, r2, -c35.z, -c35.y
add r2.xyz, r2, r7
mad r2.xyz, r2, r3.z, r4
add r2.y, r2.y, r2.x
add r2.x, -r2.x, r2.z
mad r2.x, r2.x, r3.w, r2.y
add r4.xy, r1, -c30
mov r4.zw, c35.y
texldl r4, r4, s1
mad r5, c30.xyxy, r0.wywz, r1.xyxy
mul r6, r5.xyxx, c35.zzyy
texldl r6, r6, s1
mul r5, r5.zwxx, c35.zzyy
texldl r5, r5, s1
mad r7, c30.xyxy, r0.ywyz, r1.xyxy
mul r8, r7.xyxx, c35.zzyy
texldl r8, r8, s1
mov r1.w, c35.y
texldl r9, r1.xyww, s1
mul r7, r7.zwxx, c35.zzyy
texldl r7, r7, s1
mad r10, c30.xyxy, r0.zwzy, r1.xyxy
mul r11, r10.xyxx, c35.zzyy
texldl r11, r11, s1
mul r10, r10.zwxx, c35.zzyy
texldl r10, r10, s1
add r12.xy, r1, c30
mov r12.zw, c35.y
texldl r12, r12, s1
mov r4.y, r6.x
mov r4.z, r5.x
add r0.yzw, r1.z, -r4.xxyz
cmp r2.yzw, r0, c35.z, c35.y
mov r8.y, r9.x
mov r8.z, r7.x
add r4.xyz, r1.z, -r8
cmp r4.xyz, r4, c35.z, c35.y
mov r11.y, r10.x
mov r11.z, r12.x
add r5.xyz, r1.z, -r11
cmp r5.xyz, r5, c35.z, c35.y
mul r1.xy, r1, c18
frc r1.xy, r1
add r2.yzw, r2, r4.xxyz
cmp r0.yzw, r0, -c35.z, -c35.y
add r0.yzw, r0, r5.xxyz
mad r0.yzw, r0, r1.x, r2
add r0.z, r0.z, r0.y
add r0.y, -r0.y, r0.w
mad r0.y, r0.y, r1.y, r0.z
mul r0.y, r0.y, c33.x
max r0.z, r3.x, r3.y
mad_sat r0.z, r0.z, -c33.y, c33.z
mad r0.w, r2.x, c33.x, -r0.y
mad r0.y, r0.z, r0.w, r0.y
else
mov r0.y, c35.y
endif
mov r0.w, c3.w
mad r0.z, r0.x, r0.w, -c23.y
mad r0.x, r0.x, r0.w, -c0.y
mul_sat r0.x, r0.x, c33.w
mad r0.w, r0.x, c34.x, c34.y
mul r0.x, r0.x, r0.x
mul r0.x, r0.x, r0.w
cmp r0.x, r0.z, r0.x, c35.y
mul oC0, r0.x, r0.y

// approximately 161 instruction slots used (37 texture, 124 arithmetic)
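
The world-space correction in the pixel shader above can be sketched outside the shader. This is a minimal Python illustration under stated assumptions, not the Helix Mod implementation: the matrix values are made up, the function names are hypothetical, and separation and convergence stand in for the `.x` and `.y` values sampled from `s13`. It shows why the offset uses `w = 0`: with a row-major matrix multiplied as a sum of rows (the `mul`/`mad` chain on `v2`..`v5`), the fourth row carries the translation, and `w = 0` drops it.

```python
def stereo_correction(depth, separation, convergence):
    """x offset used by the shader: -(depth - convergence) * separation."""
    return -(depth - convergence) * separation


def row_vector_times_matrix(vec, rows):
    """Compute vec[0]*rows[0] + ... + vec[3]*rows[3], like the mul/mad chain."""
    return [sum(vec[i] * rows[i][j] for i in range(4)) for j in range(4)]


def correct_world_position(world_xyz, depth, separation, convergence, inv_rows):
    """Add the transformed offset to a world-space position."""
    # (correction, 0, 0, 0): w = 0 so only the direction is transformed.
    offset = [stereo_correction(depth, separation, convergence), 0.0, 0.0, 0.0]
    delta = row_vector_times_matrix(offset, inv_rows)
    # The shader keeps world xyz in r0.yzw, hence its `.xxyz` swizzle padding;
    # here the position is a plain xyz list, so the indices line up directly.
    return [world_xyz[i] + delta[i] for i in range(3)]


# Made-up stand-in for the inverse view-projection matrix (translation in row 3).
inv_view_proj = [
    [2.0, 0.0, 0.0, 0.0],
    [0.0, 3.0, 0.0, 0.0],
    [0.0, 0.0, 4.0, 0.0],
    [10.0, 20.0, 30.0, 1.0],
]

print(row_vector_times_matrix([1.0, 0.0, 0.0, 0.0], inv_view_proj))  # no translation
print(row_vector_times_matrix([1.0, 0.0, 0.0, 1.0], inv_view_proj))  # translation added
print(correct_world_position([1.0, 2.0, 3.0], 5.0, 0.1, 1.0, inv_view_proj))
```

Comparing the first two printed vectors shows the translation row only contributes when `w` is nonzero, which is the reason the shader comment insists on `w = 0` for an adjustment amount rather than a full coordinate.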


Posted 01/29/2016 02:01 PM   
Awesome, that totally did it! Glad that I was on the right track trying the world-space correction on that last attempt, just slightly off, and I didn't know about that gotcha with the matrix copy (man, things can get really tricky).

I think that was literally the final piece of the puzzle, and from what testing I've been able to do I think this game might now be fully fixed/3D ready. Just doing a little bit more testing and I should be able to put out a full release today!

Thanks PROfessor! :)


Posted 01/29/2016 04:46 PM   
Hi all,

I'm currently trying to fix Risen 2 with HelixMod (it's a DX9 title).
I've managed to fix all the simple things like the HUD, the sky and the shadows from the sun.
Now I've run into problems with the lighting, for now with point lights, which are handled very differently from the shadows I already fixed.

Without any fixes, point lights are at the wrong 3D position. The weird thing is that the light effect gets worse when approaching the light source, but when getting very close, the light "jumps" to the correct position. Hunting the pixel shader shows cyan spheres around the light source at the correct 3D position, so I think I've got the right shader.

Destereoizing o0 in the vertex shader puts the light spheres at screen depth and fixes the jumping problem. But now I have no idea how to correct the effect. I've tried different things (see comments inside the shaders) without success. The main problems are definitely the "jumping" issue (some kind of overflow?) and that I'm unable to move the effect relative to the screen.
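
For readers unfamiliar with the trick, removing the stereo offset from o0 can be sketched as follows. This is a minimal Python illustration that assumes the driver shifts the clip-space x coordinate by `separation * (w - convergence)`. The function names and numeric values are hypothetical. The vertex shader's `mad r20.x, r30.x, -r30.w, r20.x` corresponds to `destereoize_x` here.

```python
def driver_stereo_x(clip_x, clip_w, separation, convergence):
    """Assumed driver adjustment: x' = x + separation * (w - convergence)."""
    return clip_x + separation * (clip_w - convergence)


def destereoize_x(clip_x, clip_w, separation, convergence):
    """Pre-subtract the same amount in the vertex shader."""
    return clip_x - separation * (clip_w - convergence)


# Made-up values; separation is assumed to flip sign between the eyes.
clip_x, clip_w, convergence = 0.25, 8.0, 2.0
for separation in (-0.05, 0.05):
    shifted = driver_stereo_x(clip_x, clip_w, separation, convergence)
    cancelled = driver_stereo_x(
        destereoize_x(clip_x, clip_w, separation, convergence),
        clip_w, separation, convergence,
    )
    print(separation, shifted, cancelled)
```

Under that assumption the two adjustments cancel, so both eyes end up with the same x and the geometry is drawn at screen depth, which matches the behaviour described above.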

I'm looking forward to any suggestions or corrections of the procedure I described in the comments.


Finally, here are the shaders. There are some similar pixel shaders for some lights, but they are identical up to the position where I put in the fix.
The already fixed shadow shader can be found here: https://github.com/mx-2/3d-fix/blob/master/Risen2/ShaderOverride/PixelShaders/A4F49828.txt



VS06C99936 - hunting disables all point lights
//
// Generated by Microsoft (R) HLSL Shader Compiler 9.29.952.3111
//
// Parameters:
//
// float3 EyePos;
// float4x4 ViewProj;
//
//
// Registers:
//
// Name Reg Size
// ------------ ----- ----
// ViewProj c0 4
// EyePos c4 1
//
//
// Default values:
//
// ViewProj
// c0 = { 0, 0, 0, 0 };
// c1 = { 0, 0, 0, 0 };
// c2 = { 0, 0, 0, 0 };
// c3 = { 0, 0, 0, 0 };
//
// EyePos
// c4 = { 0, 0, 0, 0 };
//

// Torch and candle lights (point without shadow) and global lighting - exteriors

// Jumping/flickering when moving was fixed by destereorizing o0!

vs_3_0

// Helix sampler
dcl_2d s1
def c200, -500, 2, 0.0625, 1

def c5, 1, 0, 0, 0
dcl_position v0
dcl_texcoord v1
dcl_texcoord1 v2
dcl_texcoord2 v3
dcl_texcoord3 v4
dcl_texcoord4 v5
dcl_position o0
dcl_texcoord o1
dcl_texcoord1 o2.xyz
dcl_texcoord2 o3
dcl_texcoord3 o4
dp3 r0.x, v0, v0
rsq r0.x, r0.x
rcp r0.y, r0.x
mad r0.xzw, v0.xyyz, r0.x, -v0.xyyz
slt r0.y, c5.x, r0.y
mad r0.xyz, r0.y, r0.xzww, v0
mul r0.xyz, r0, v4.x
mov r0.w, v0.w
dp4 r1.x, r0, v1
dp4 r1.y, r0, v2
dp4 r1.z, r0, v3
mov r1.w, v0.w
//dp4 o0.x, r1, c0
//dp4 o0.y, r1, c1
//dp4 o0.z, r1, c2
//dp4 o0.w, r1, c3

// Moves light spheres relative to world, texture inside sphere does not change
// -> r1 is in world coordinates before transformation
//add r1.x, r1.x, c200.x

dp4 r20.x, r1, c0
dp4 r20.y, r1, c1
dp4 r20.z, r1, c2
dp4 r20.w, r1, c3

texldl r30, c200.z, s1
add r30.w, r20.w, -r30.y
mad r20.x, r30.x, -r30.w, r20.x // now o0 2D
// moves respective to screen, moves when changing view direction, range: +/- 200
// -> screen coordinates
//add r20.x, r20.x, c200.x
mov o0, r20

texldl r30, c200.z, s1
add r30.w, r1.w, -r30.y
//mad r1.x, r30.x, -r30.w, r1.x // modifying r1 -> o2 does not help

add o2.xyz, r1, -c4
//mov o2.xyz, r1 // Idea: subtract c4 in pixel shader after fix, pass c4 in o4, did not help
//mov o4, c4
mov r0.x, v1.w
mov r0.y, v2.w
mov r0.z, v3.w
add o3.xyz, r0, -c4 // disabling o3 makes it brighter
rcp o3.w, v4.x
mov o1, v5

// approximately 23 instruction slots used


PS92EAC913 - hunting shows cyan spheres around point lights
//
// Generated by Microsoft (R) HLSL Shader Compiler 9.29.952.3111
//
// Parameters:
//
// float3 CamDir;
// sampler2D DepthSampler;
// sampler2D DiffuseLookupSampler;
// float2 NearFar;
// sampler2D NormalSampler;
// float4 ScreenToTexCoord;
// sampler2D ShadowMaskSampler;
// sampler2D SpecularLookupSampler;
//
//
// Registers:
//
// Name Reg Size
// --------------------- ----- ----
// CamDir c0 1
// NearFar c1 1
// ScreenToTexCoord c2 1
// DepthSampler s0 1
// NormalSampler s1 1
// ShadowMaskSampler s2 1
// DiffuseLookupSampler s3 1
// SpecularLookupSampler s4 1
//
//
// Default values:
//
// CamDir
// c0 = { 0, 0, 0, 0 };
//
// NearFar
// c1 = { 0, 0, 0, 0 };
//
// ScreenToTexCoord
// c2 = { 0, 0, 0, 0 };
//

// c220 = InverseViewProjection from VS

// Lights - torches on walls - no shadows - exteriors

ps_3_0
def c200, 1, 0, 0.0625, 0
def c201, -900000, 0, 0, 0
def c210, 0, 1, 0, 1
def c3, 0.5, 0.99609375, 0.124511719, 0.000244140625
def c4, 2, -1, 3, 1
dcl_texcoord v0
dcl_texcoord1 v1.xyz
dcl_texcoord2_pp v2
dcl_texcoord3 v3
dcl vPos.xy
dcl_2d s0
dcl_2d s1
dcl_2d s2
dcl_2d s3
dcl_2d s4
dcl_2d s13

mad r0.xy, vPos, c2, c2.zwzw
texld r1, r0, s0 // sample screen depth to r1
mul r0.z, r1.x, c1.y

// Fix should go here

// Fix idea: pos.x += sep * (1 - c * conv / depth)
mov r30, v1.xyz
texld r29, c200.z, s13
rcp r28.x, r1.x // 1 / depth
mul r29.w, c200.y, r28.x // c / depth
mul r29.w, r29.w, r29.y // c * conv / depth
add r29.w, r29.w, -c200.x // (.) - 1, i.e. -(1 - (.))
//mad r30.x, r29.x, -r29.w, r30.x // += sep * (.)
mul r29.w, r29.w, r29.x

// Use inverse matrix to put x offset from screen coordinates to world coordinates.
// Movement of x pos causes jumping (again)!!

// Test: should move relative to screen, but does not - see below
texld r29, c200.z, s13
mul r29.w, c201.x, r29.x

// Inverse view projection (viewProj) - from parent VS
// moves somehow relative to screen, but LR changes when looking WE or NS

// Inverse view projection (worldViewProj) - from another VS
// moves NW when transforming [+x 0 0 0]
mad r30.x, r29.w, c220.x, r30.x
mad r30.y, r29.w, c221.x, r30.y
mad r30.z, r29.w, c222.x, r30.z
mad r30.w, r29.w, c223.x, r30.w

// changing r0 here: ghost images

//add r30.x, r30.x, c200.x // Moves in WE dir
//add r30.y, r30.y, c200.x // Moves up/down, + is down
//add r30.z, r30.z, c200.x // Moves in NS dir, + is south

nrm_pp r1.xyz, r30
//nrm_pp r1.xyz, v1
dp3 r0.w, c0, r1
rcp r0.w, r0.w
mul r0.z, r0.w, r0.z
mad_pp r2.xyz, r1, -r0.z, v2
dp3_pp r0.z, r2, r2
rsq_pp r0.z, r0.z
mad_pp r3.xyz, r2, r0.z, -r1
mul_pp r2.xyz, r0.z, r2
rcp_pp r0.z, r0.z
mul_sat_pp r0.z, r0.z, v2.w
nrm_pp r4.xyz, r3
texld_pp r3, r0, s1
texld_pp r5, r0, s2
mad_pp r0.xyw, r3.xyzz, c4.x, c4.y
mov_pp r3.y, r3.w
nrm_pp r6.xyz, r0.xyww
dp3_sat_pp r4.z, r4, r6
dp3_pp r0.y, r2, -r1
dp3_pp r0.x, r2, r6
dp3_pp r1.w, -r1, r6
mad_pp r4.xy, r0, c3.x, c3.x
mul_pp r0.y, r5.y, c3.y
mad r2.xyz, r4, c3.z, r0.y
add_pp r1.xyz, r2, c3.w
texld_pp r2, r1.ywzw, s3
mul_pp r0.x, r0.x, r2.w
texld_pp r2, r1.xwzw, s3
texld_pp r4, r1.xwzw, s4
mov_pp r3.x, r1.z
texld_pp r1, r3, s4
mul_pp r1.w, r4.y, r1.x
max_pp r1.xyz, r0.x, r2
mul r1, r1, v0
mul_pp r0.x, r0.z, r0.z
mul_pp r0.y, r0.x, c4.z
dp2add_pp r0.x, r0.x, r0.z, -r0.y
add_pp r0.x, r0.x, c4.w
mul_pp r0.x, r5.x, r0.x
mul_pp oC0, r0.x, r1
//mov oC0, c222
//mov oC0, c210

// approximately 50 instruction slots used (7 texture, 43 arithmetic)


PSBAABE331 - hunting shows cyan spheres around point lights
//
// Generated by Microsoft (R) HLSL Shader Compiler 9.29.952.3111
//
// Parameters:
//
// float3 CamDir;
// sampler2D DepthSampler;
// float2 NearFar;
// sampler2D NormalSampler;
// float4 ScreenToTexCoord;
// sampler2D ShadowMaskSampler;
//
//
// Registers:
//
// Name Reg Size
// ----------------- ----- ----
// CamDir c0 1
// NearFar c1 1
// ScreenToTexCoord c2 1
// DepthSampler s0 1
// NormalSampler s1 1
// ShadowMaskSampler s2 1
//
//
// Default values:
//
// CamDir
// c0 = { 0, 0, 0, 0 };
//
// NearFar
// c1 = { 0, 0, 0, 0 };
//
// ScreenToTexCoord
// c2 = { 0, 0, 0, 0 };
//

// Point lights without shadows - exteriors

ps_3_0
def c3, 3, 1, 2, -1
def c4, 0.5, 0, 0, 0
def c200, -0.1, 5000000, 0.0625, 1
def c210, 0, 1 ,0 ,1 // green for testing
dcl_texcoord v0.xyz // color
dcl_texcoord1 v1.xyz // jumping
dcl_texcoord2_pp v2 // bloom?, time dependent
dcl_texcoord3 v3 // eye pos - from parent VS
dcl vPos.xy
dcl_2d s0
dcl_2d s1
dcl_2d s2
dcl_2d s13

// code moved down
//nrm r0.xyz, v1
//dp3 r0.w, c0, r0
//rcp r0.w, r0.w

// get depth from transformed vPos
mad r1.xy, vPos, c2, c2.zwzw
texld r2, r1, s0 // remove this line: no flickering, brighter and worse lighting

// Fix should go here
mov r0, v1.xyz
texldl r29, c200.z, s13
rcp r28.x, r2.x
mov r28.x, r2.x // linear depth seems to be correct
mul r29.w, c200.y, r28.x // c / depth
mul r29.w, r29.w, r29.y // * conv
add r29.w, c200.w, -r29.w // 1 - (.)
mul r29.w, r29.w, r29.x // * sep

// Use inverse matrix to put x offset from screen coordinates to world coordinates.

// Test, set r29
mov r29.w, c200.x

// Inverse view projection - moves somehow relative to screen, but LR changes when looking WE or NS?
mad r0.x, r29.w, c220.x, r0.x
mad r0.y, r29.w, c221.x, r0.y
mad r0.z, r29.w, c222.x, r0.z
mad r0.w, r29.w, c223.x, r0.w

// Moves relative to world
//add r0.x, r0.x, c200.x

//mov r0, v1.xyz
//add r0.xyz, r0, -v3 // subtract eye pos (removed in VS), doesn't work (nrm instruction fails - what to do with missing w coordinate from v1???)

// this is the moved code from above
nrm r0.xyz, r0//v1 // does not work with v1.xyz or a register
dp3 r0.w, c0, r0 // transform w with cam pos
rcp r0.w, r0.w

// Moves relative to world, range: +/- 0.5
//add r0.x, c200.y, r0.x // moves in WE dir, + is W
//add r0.y, c200.y, r0.y // moves up/down, + is down
//add r0.z, c200.y, r0.z // moves in NS dir, + is S
//add r0.w, c200.y, r0.w // scales light, - makes them larger, centered to player

mul r1.z, r2.x, c1.y
mul r0.w, r0.w, r1.z
mad_pp r0.xyz, r0, -r0.w, v2
dp3_pp r0.w, r0, r0
rsq_pp r0.w, r0.w
mul_pp r0.xyz, r0.w, r0
rcp_pp r0.w, r0.w
mul_sat_pp r0.w, r0.w, v2.w

// r0.x changes color here
// r1.x produces stripes in x dir
//add r1.x, r1.x, c200.x

// normals and shadow mask
texld_pp r2, r1, s1
texld_pp r1, r1, s2
mad_pp r1.yzw, r2.xxyz, c3.z, c3.w
nrm_pp r2.xyz, r1.yzww
dp3_pp r0.x, r0, r2
mad_pp r0.x, r0.x, c4.x, c4.x
mul_pp r0.y, r0.w, r0.w
mul_pp r0.z, r0.y, c3.x
dp2add_pp r0.y, r0.y, r0.w, -r0.z
add_pp r0.y, r0.y, c3.y
mul_pp r0.y, r1.x, r0.y
mul_pp r0.x, r0.x, r0.y
// mul oC0.xyz, r0.x, v0
// mov oC0.w, c4.y
//nrm oC0, v1
//mov oC0, c0
//mov oC0, c210
//mov oC0, c223.x
//mov oC0.xyz, v0.xyz
//mov oC0.w, c3.y
//mov oC0.xyz, v3
//mov oC0, v1.xyz
//mov r30, v3
//add oC0, v1.xyz, -r30
//mov oC0.w, c200.w
// approximately 32 instruction slots used (3 texture, 29 arithmetic)
Your fixes in the pixel shaders aren't quite at the right spot - there's some extra scaling applied to the value read from the depth buffer to get it into world coordinates before it is multiplied by a 3D coordinate:

mad r0.xy, vPos, c2, c2.zwzw		// Screen position (using vPos means no halo issues to worry about in the VS)
texld r1, r0, s0 // sample screen depth to r1
mul r0.z, r1.x, c1.y // Scaling depth buffer...
nrm_pp r1.xyz, v1
dp3 r0.w, c0, r1
rcp r0.w, r0.w
mul r0.z, r0.w, r0.z // Scaling depth buffer...

//mad_pp r2.xyz, r1, -r0.z, v2 // Multiplying 3D coordinate by depth value and adding offset

// Split that instruction in two:
mul r20.xyz, r1, -r0.z

// Fix should possibly go here

// 2nd half of split instruction:
add r2.xyz, r20, v2

// Fix might alternatively need to go here

dp3_pp r0.z, r2, r2
...


I can see from your comments that you are making good observations, so see how you go with that. I'm pretty sure I have Risen 2 on my Steam account, so if you get stuck I can give you a hand.
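
To make the split instruction easier to reason about, here is a minimal Python sketch of what that part of the pixel shader appears to compute. It rests on assumptions that are not confirmed anywhere in this thread: that the depth sample is linear depth normalised by the far plane (NearFar.y in c1.y), that v1 is the interpolated world position minus the eye position, and that v2.xyz is the light position minus the eye position. The function name and its `fix` parameter are hypothetical and only mark where a correction could be inserted between the two halves.

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return [c / length for c in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def light_vector(ray, cam_dir, depth01, far, light_minus_eye, fix=(0.0, 0.0, 0.0)):
    """Mirror the split mad: scale the view ray, optionally fix, then add the offset."""
    ray_n = normalize(ray)                           # nrm_pp r1.xyz, v1
    # Distance along the ray = linear depth / cos(angle to camera direction).
    dist = depth01 * far / dot(cam_dir, ray_n)       # mul, dp3, rcp, mul -> r0.z
    scaled = [-c * dist for c in ray_n]              # mul r20.xyz, r1, -r0.z
    scaled = [s + f for s, f in zip(scaled, fix)]    # first candidate spot for a fix
    return [s + o for s, o in zip(scaled, light_minus_eye)]  # add r2.xyz, r20, v2

if __name__ == "__main__":
    # Eye at the origin looking down +z, surface point at (1, 2, 10), light at (3, 2, 8).
    print(light_vector([1.0, 2.0, 10.0], [0.0, 0.0, 1.0], 0.1, 100.0, [3.0, 2.0, 8.0]))
```

Under those assumptions the result is the vector from the surface point to the light, which is why an error in either the ray (v1) or the depth lookup shifts the lit area.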

mx-2 said:Without any fixes, point lights are at the wrong 3D position. The weird thing is that the light effect gets worse as I approach the light source, but when I get very close, the light "jumps" to the correct position.
This phenomenon means that the fix in the vertex shader is not quite right yet. I see that you have unstereoised the position instead - did that remove the jumping? Unstereoising the position is essentially a shortcut to move the effect back to screen depth so you can concentrate on the pixel shader sooner, but it is generally not the correct fix, as it may cause the lights to clip from a distance. This is very obvious if you set the light pixel shader to output solid white: you will see that the spheres around point lights are no longer lined up with the light in 3D. The vertex shader may have an additional output that you need to adjust to match the adjustment that the driver made to the position. This could be a simple halo type issue, or it might be in a different coordinate system (Unity games need a view-space correction, and Demonicon needed a world-space correction).

mx-2 said:Hunting the pixel shader shows cyan spheres around the light source at the correct 3D position, so I think I've got the right shader.
Yep, that's the right shader :)
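
For anyone following along, the adjustment the driver makes to the position, and the "unstereoise" shortcut that undoes it, can be sketched in a few lines of Python. The formula is the usual 3D Vision one, x' = x + separation * (w - convergence). The function names are made up for illustration; `separation` is assumed to be already signed per eye, which seems to be what the stereo sampler's x component holds in the shaders above (with convergence in y).

```python
def stereo_shift(w, separation, convergence):
    """Horizontal clip-space shift for one eye (separation is signed per eye)."""
    return separation * (w - convergence)

def stereoise_x(x, w, separation, convergence):
    # What the driver does to the output position of a vertex shader.
    return x + stereo_shift(w, separation, convergence)

def unstereoise_x(x, w, separation, convergence):
    # The shortcut: mad r20.x, r30.x, -r30.w, r20.x with r30.w = w - convergence.
    return x - stereo_shift(w, separation, convergence)

if __name__ == "__main__":
    sep, conv = 0.05, 4.0
    for w in (1.0, 4.0, 20.0):
        left = stereoise_x(0.0, w, -sep, conv)
        right = stereoise_x(0.0, w, +sep, conv)
        print(f"w={w:5.1f}  left x={left:+.3f}  right x={right:+.3f}")
```

The shift is zero when w equals the convergence (screen depth) and grows with distance. Any other output that encodes the same position but does not receive this shift ends up out of step with o0.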

2x Geforce GTX 980 in SLI provided by NVIDIA, i7 6700K 4GHz CPU, Asus 27" VG278HE 144Hz 3D Monitor, BenQ W1070 3D Projector, 120" Elite Screens YardMaster 2, 32GB Corsair DDR4 3200MHz RAM, Samsung 850 EVO 500G SSD, 4x750GB HDD in RAID5, Gigabyte Z170X-Gaming 7 Motherboard, Corsair Obsidian 750D Airflow Edition Case, Corsair RM850i PSU, HTC Vive, Win 10 64bit

Alienware M17x R4 w/ built in 3D, Intel i7 3740QM, GTX 680m 2GB, 16GB DDR3 1600MHz RAM, Win7 64bit, 1TB SSD, 1TB HDD, 750GB HDD

Pre-release 3D fixes, shadertool.py and other goodies: http://github.com/DarkStarSword/3d-fixes
Support me on Patreon: https://www.patreon.com/DarkStarSword or PayPal: https://www.paypal.me/DarkStarSword

Posted 02/03/2016 09:27 AM   
Thank you very much for your help, DarkStarSword.

I've removed the destereorization in the VS and moved the fix into the PS, and it now works well at medium to long distances from the light source.

Because the destereorization in the VS was what fixed the jumping problem, that problem is now back with the fixed PS. I did some experiments with the vertex shader, but without success so far.
What is the reason for that jumping? Do you have any experience with it?

VS06C99936
//
// Generated by Microsoft (R) HLSL Shader Compiler 9.29.952.3111
//
// Parameters:
//
// float3 EyePos;
// float4x4 ViewProj;
//
//
// Registers:
//
// Name Reg Size
// ------------ ----- ----
// ViewProj c0 4
// EyePos c4 1
//
//
// Default values:
//
// ViewProj
// c0 = { 0, 0, 0, 0 };
// c1 = { 0, 0, 0, 0 };
// c2 = { 0, 0, 0, 0 };
// c3 = { 0, 0, 0, 0 };
//
// EyePos
// c4 = { 0, 0, 0, 0 }; <- seems to be zero ?
//

// Torch and candle lights (point without shadow) and global lighting - exteriors

// Jump/flickering of torches when moving was removed by destereorizing o0!
// Global light has some minor flickering

vs_3_0

// Helix sampler
dcl_2d s1
def c200, 50, -0.5, 0.0625, 1

def c5, 1, 0, 0, 0
dcl_position v0 // jumping: x<0, y<0, z<0, w==const
dcl_texcoord v1 // Red
dcl_texcoord1 v2 // Green
dcl_texcoord2 v3 // Blue
dcl_texcoord3 v4 // extreme Red
dcl_texcoord4 v5 // input light color, passed to PS

dcl_position o0 // 3D position
dcl_texcoord o1 // color
dcl_texcoord1 o2.xyz // this is jumping if output as PS color
dcl_texcoord2 o3
dcl_texcoord3 o4
dp3 r0.x, v0, v0
rsq r0.x, r0.x
rcp r0.y, r0.x
mad r0.xzw, v0.xyyz, r0.x, -v0.xyyz
slt r0.y, c5.x, r0.y
mad r0.xyz, r0.y, r0.xzww, v0
mul r0.xyz, r0, v4.x
mov r0.w, v0.w
dp4 r1.x, r0, v1
dp4 r1.y, r0, v2
dp4 r1.z, r0, v3
mov r1.w, v0.w
dp4 o0.x, r1, c0
dp4 o0.y, r1, c1
dp4 o0.z, r1, c2
dp4 o0.w, r1, c3

//add r1.x, r1.x, c200.x // moves in WE dir, + is W
add o2.xyz, r1, -c4
mov r0.x, v1.w
mov r0.y, v2.w
mov r0.z, v3.w

// fix here with *= (c200.y = [-2 -1 -0.5 0.5 1 2]) does not work
add o3.xyz, r0, -c4 // what is o3 ?, time dependent, disabling makes it brighter

// fix here with *= (c200.y = [-2 -1 -0.5 0.5 1 2]) does not work either

//add r0.xyz, r0, -c4
//rcp r0.w, v4.x // set w before fix, still doesn't work

//texldl r30, c200.z, s1
//add r30.w, r0.w, -r30.y
//mul r30.w, r30.w, c200.y
//mad r0.x, r30.x, -r30.w, r0.x
//mov o3, r0

rcp o3.w, v4.x
mov o1, v5 // o1 is light color

// approximately 23 instruction slots used


PS92EAC913 - fixed
//
// Generated by Microsoft (R) HLSL Shader Compiler 9.29.952.3111
//
// Parameters:
//
// float3 CamDir;
// sampler2D DepthSampler;
// sampler2D DiffuseLookupSampler;
// float2 NearFar;
// sampler2D NormalSampler;
// float4 ScreenToTexCoord;
// sampler2D ShadowMaskSampler;
// sampler2D SpecularLookupSampler;
//
//
// Registers:
//
// Name Reg Size
// --------------------- ----- ----
// CamDir c0 1
// NearFar c1 1
// ScreenToTexCoord c2 1
// DepthSampler s0 1
// NormalSampler s1 1
// ShadowMaskSampler s2 1
// DiffuseLookupSampler s3 1
// SpecularLookupSampler s4 1
//
//
// Default values:
//
// CamDir
// c0 = { 0, 0, 0, 0 };
//
// NearFar
// c1 = { 0, 0, 0, 0 };
//
// ScreenToTexCoord
// c2 = { 0, 0, 0, 0 };
//

// c220 = InverseViewProjection from parent VS

// Lights - torches on walls, problem with light spheres
// Fix works now, still jumping -> vs
// TODO: The lights are somehow different between the eyes
// if looking nearly parallel onto the lighted surface. Maybe caused by jumping issue.

ps_3_0
def c200, 1, 0.005, 0.0625, 0 // 0.005 is correct
def c201, -300, 0, 0, 0 // + moves left
def c210, 0, 1, 0, 1 // green for test
def c3, 0.5, 0.99609375, 0.124511719, 0.000244140625
def c4, 2, -1, 3, 1
dcl_texcoord v0 // color
dcl_texcoord1 v1.xyz // jumping if output as color
dcl_texcoord2_pp v2 // ???, used in code after fix, mostly one color, view dependent
dcl_texcoord3 v3
dcl vPos.xy
dcl_2d s0
dcl_2d s1
dcl_2d s2
dcl_2d s3
dcl_2d s4
dcl_2d s13

mad r0.xy, vPos, c2, c2.zwzw // Screen position
// (using vPos means no halo issues to worry about in the VS)
texld r1, r0, s0 // sample screen depth to r1
rcp r28.x, r1.x // store 1 / depth
mul r0.z, r1.x, c1.y // Scaling depth buffer
nrm_pp r1.xyz, v1
dp3 r0.w, c0, r1
rcp r0.w, r0.w
mul r0.z, r0.w, r0.z // Scaling depth buffer

//mad_pp r2.xyz, r1, -r0.z, v2 // Multiplying 3D coordinate by depth value and adding offset
// Split that instruction in two:
mul r20.xyz, r1, -r0.z

//add r20.x, c201.x, r20.x // moves relative to world, WE dir, range: +/- 200
//mov r29.w, c201.x

texld r29, c200.z, s13
mul r29.w, c200.y, r28.x // 0.005 / depth
mul r29.w, r29.w, r29.y // 0.005 * conv / depth
add r29.w, r29.w, -c200.x // (.) - 1, i.e. -(1 - (.))
mul r29.w, r29.w, r29.x // * sep

// x now moves relative to screen, only use c220 of matrix
mul r30.x, r29.w, c220.x
mul r30.y, r29.w, c220.y
mul r30.z, r29.w, c220.z
mul r30.w, r29.w, c220.w

// 2nd half of split instruction:
// now r20 + r30.xyz + v2
add r2.xyz, r20, v2
// apply correction
add r2.xyz, r2.xyz, r30.xyz

dp3_pp r0.z, r2, r2
rsq_pp r0.z, r0.z
mad_pp r3.xyz, r2, r0.z, -r1
mul_pp r2.xyz, r0.z, r2
rcp_pp r0.z, r0.z
mul_sat_pp r0.z, r0.z, v2.w
nrm_pp r4.xyz, r3
texld_pp r3, r0, s1
texld_pp r5, r0, s2
mad_pp r0.xyw, r3.xyzz, c4.x, c4.y
mov_pp r3.y, r3.w
nrm_pp r6.xyz, r0.xyww
dp3_sat_pp r4.z, r4, r6
dp3_pp r0.y, r2, -r1
dp3_pp r0.x, r2, r6
dp3_pp r1.w, -r1, r6
mad_pp r4.xy, r0, c3.x, c3.x
mul_pp r0.y, r5.y, c3.y
mad r2.xyz, r4, c3.z, r0.y
add_pp r1.xyz, r2, c3.w
texld_pp r2, r1.ywzw, s3
mul_pp r0.x, r0.x, r2.w
texld_pp r2, r1.xwzw, s3
texld_pp r4, r1.xwzw, s4
mov_pp r3.x, r1.z
texld_pp r1, r3, s4
mul_pp r1.w, r4.y, r1.x
max_pp r1.xyz, r0.x, r2
mul r1, r1, v0
mul_pp r0.x, r0.z, r0.z
mul_pp r0.y, r0.x, c4.z
dp2add_pp r0.x, r0.x, r0.z, -r0.y
add_pp r0.x, r0.x, c4.w
mul_pp r0.x, r5.x, r0.x
mul_pp oC0, r0.x, r1
//mov oC0, c222
//mov oC0, c210
//mov oC0, v2
//mov oC0, v1.xyz
//mov oC0.w, c200.x

// approximately 50 instruction slots used (7 texture, 43 arithmetic)
mx-2 said:As the destereorization in the VS fixed the jumping problem, this problem is now back again in the fixed PS. I did some experiments with the Vertex Shader but without success yet.
What is the reason for that jumping? Do you have any experience with it?
I don't fully understand why it appears the way it does, but it comes about due to misaligned coordinates from the vertex shader - I've seen it in Unity 4 games where we had to apply a view-space correction to one of the outputs, and Demonicon where I had to apply a world-space correction to one of the outputs. You have made a very good observation in the shader:

...
//add r1.x, r1.x, c200.x // moves in WE dir, + is W
add o2.xyz, r1, -c4
...


Based on that observation, the fix will very likely be to apply a world-space stereo correction to that output. You are aiming to get the result of the vertex shader to be basically the same as unstereoising it, but with any clipping issues fixed.

Posted 02/04/2016 04:30 AM   
DarkStarSword said:I don't fully understand why it appears the way it does, but it comes about due to misaligned coordinates from the vertex shader - I've seen it in Unity 4 games where we had to apply a view-space correction to one of the outputs, and Demonicon where I had to apply a world-space correction to one of the outputs.


That finally did it - see code below!

Now I'll try to fix the remaining shaders.

//dp4 o0.x, r1, c0
//dp4 o0.y, r1, c1
//dp4 o0.z, r1, c2
//dp4 o0.w, r1, c3

// Transform r1 from world to screen space for output position.
// Put result into temporary register (for o2) and o0.
dp4 r20.x, r1, c0
dp4 r20.y, r1, c1
dp4 r20.z, r1, c2
dp4 r20.w, r1, c3
mov o0, r20

// Since o2 is related to r1, it needs the stereo correction too.
// Stereo correction must be applied in screen space.
texldl r30, c200.z, s1
add r30.w, r20.w, -r30.y
mad r20.x, r30.x, r30.w, r20.x

// Transform corrected r1 back to world space for o2.
// This finally fixes the jumping issue!
dp4 r21.x, r20, c220
dp4 r21.y, r20, c221
dp4 r21.z, r20, c222
dp4 r21.w, r20, c223

// Removing c4 does not change anything.
add o2.xyz, r21, -c4
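
The round trip in that fix (world to clip space, stereo shift on x, back to world space) can be checked numerically with a small Python sketch. Everything here is illustrative: the matrix values are invented, `corrected_world` is a hypothetical name, and c220..c223 are assumed to hold the rows of the inverse of the ViewProj matrix in c0..c3, as the shader comments suggest.

```python
def mat_vec(m, v):
    """Row-wise dot products, like four dp4 instructions."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def invert(m):
    """Gauss-Jordan inverse of a 4x4 matrix with partial pivoting."""
    n = 4
    a = [list(row) + [1.0 if r == c else 0.0 for c in range(n)]
         for r, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [x / p for x in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]

def corrected_world(world, view_proj, inv_view_proj, sep, conv):
    clip = mat_vec(view_proj, world)      # dp4 r20.x..w, r1, c0..c3
    clip[0] += sep * (clip[3] - conv)     # texldl / add / mad on r20.x
    return mat_vec(inv_view_proj, clip)   # dp4 r21.x..w, r20, c220..c223

# Made-up stand-in for ViewProj (rows play the role of c0..c3).
VIEW_PROJ = [
    [1.0, 0.0, 0.0, -2.0],
    [0.0, 1.5, 0.0, 0.0],
    [0.0, 0.0, 1.001, -0.1],
    [0.0, 0.0, 1.0, 0.0],
]

if __name__ == "__main__":
    inv = invert(VIEW_PROJ)
    print(corrected_world([1.0, 2.0, 10.0, 1.0], VIEW_PROJ, inv, 0.05, 4.0))
```

Because only clip-space x changes, the world-space result differs from the input by the stereo shift times the first column of the inverse matrix, so o2 follows the same shift the driver applies to o0.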
Great to hear :)

Posted 02/04/2016 06:17 PM   
Hi DarkStarSword,

I have a small question:

- I have a VS that is responsible for some lens flares.
- Some of these flares are 2D while others are already 3D. If I correct the 2D ones, the 3D ones (which already work by default) get double depth.

I looked for a PS to tell them apart, but both kinds use the same PS:
<VertexShader hash="9dffde562be1ed0f">
<CalledPixelShaders>87cce28400ba4cf3 </CalledPixelShaders>
<Register id=0 handle=000000003312D190>4bcf8033</Register>


How can I tell the 2D flares from the 3D ones? Any ideas?
And once I can tell them apart, how can I apply the stereo correction only to the 2D ones?

Many thanks in advance!

1x Palit RTX 2080Ti Pro Gaming OC(watercooled and overclocked to hell)
3x 3D Vision Ready Asus VG278HE monitors (5760x1080).
Intel i9 9900K (overclocked to 5.3 and watercooled ofc).
Asus Maximus XI Hero Mobo.
16 GB Team Group T-Force Dark Pro DDR4 @ 3600.
Lots of Disks:
- Raid 0 - 256GB Sandisk Extreme SSD.
- Raid 0 - WD Black - 2TB.
- SanDisk SSD PLUS 480 GB.
- Intel 760p 256GB M.2 PCIe NVMe SSD.
Creative Sound Blaster Z.
Windows 10 x64 Pro.
etc


My website with my fixes and OpenGL to 3D Vision wrapper:
http://3dsurroundgaming.com

(If you like some of the stuff that I've done and want to donate something, you can do it with PayPal at tavyhome@gmail.com)

Posted 02/05/2016 01:59 AM   
Sounds like a driver heuristic issue. Depth buffer filtering might help:

[ShaderOverrideLensFlare]
hash=9dffde562be1ed0f
depth_filter=depth_inactive

You can try normalising it such that W==convergence to neutralise the driver's stereo correction:

...
stereo = StereoParams.Load(0);
o0.x += stereo.x * (o0.w - stereo.y);

// Normalise the effect to W==convergence to neutralise the driver's stereo
// correction. Only do this while stereo is enabled, convergence is not zero
// and the effect is in front of the camera (otherwise bad things happen):
if (stereo.y && o0.w > 0) {
    o0 = o0 / o0.w * stereo.y;
}
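
To see why normalising to W==convergence cancels a second correction, here is a minimal numeric sketch in Python. It assumes the driver's automatic correction has the same form as the manual one in the snippet, x + separation * (w - convergence). The function names and the sample values are made up for illustration.

```python
def stereo_shift(x, w, separation, convergence):
    """Stereo correction of the form x + separation * (w - convergence)."""
    return x + separation * (w - convergence)


def manual_fix_and_normalise(pos, separation, convergence):
    """Mirror the HLSL snippet: correct x, then rescale so that w == convergence."""
    x, y, z, w = pos
    x = stereo_shift(x, w, separation, convergence)
    # Same guard as the shader: stereo on, convergence non-zero, in front of camera.
    if convergence and w > 0:
        x, y, z, w = (c / w * convergence for c in (x, y, z, w))
    return (x, y, z, w)


# Hypothetical values: a vertex at w = 5 with separation 0.05, convergence 1.0.
separation, convergence = 0.05, 1.0
fixed = manual_fix_and_normalise((0.4, 0.2, 0.9, 5.0), separation, convergence)

# If the driver now applies its own correction, the (w - convergence) term is
# zero, so x is left as it was.
after_driver = stereo_shift(fixed[0], fixed[3], separation, convergence)
print(fixed, after_driver)
```

Scaling all four components by the same factor leaves x/w, y/w and z/w unchanged, so the flare keeps the screen position given by the manual correction, whether or not the driver then applies its own.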


Posted 02/05/2016 02:38 AM   