Bo3b's School For Shaderhackers
Yup, tried that ;) The problem is that it corrects some of the ambient light but not the actual light casts. They are still 2D :( It was the first thing I actually tried :)
Maybe I'll try it again lol... but I remember trying it like 5 times in different places ;))

1x Palit RTX 2080Ti Pro Gaming OC(watercooled and overclocked to hell)
3x 3D Vision Ready Asus VG278HE monitors (5760x1080).
Intel i9 9900K (overclocked to 5.3 and watercooled ofc).
Asus Maximus XI Hero Mobo.
16 GB Team Group T-Force Dark Pro DDR4 @ 3600.
Lots of Disks:
- Raid 0 - 256GB Sandisk Extreme SSD.
- Raid 0 - WD Black - 2TB.
- SanDisk SSD PLUS 480 GB.
- Intel 760p 256GB M.2 PCIe NVMe SSD.
Creative Sound Blaster Z.
Windows 10 x64 Pro.
etc


My website with my fixes and OpenGL to 3D Vision wrapper:
http://3dsurroundgaming.com

(If you like some of the stuff that I've done and want to donate something, you can do it with PayPal at tavyhome@gmail.com)

Posted 04/20/2016 10:35 PM   
Maybe it's not the correct PS?
Like you say, that VS is related to 30 PS... probably all "lights" related.

Try using "marking_mode=pink" instead of skip... that "pink" option is sometimes very helpful for spotting shaders that are difficult to catch with skip.
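
For reference, a minimal sketch of where that setting lives, assuming a 3Dmigoto-style d3dx.ini (the OpenGL wrapper discussed in this thread may name or place the option differently, or not offer it at all):

```ini
; Assumed 3Dmigoto-style d3dx.ini layout, shown for illustration only.
[Hunting]
; Enable shader hunting so shaders can be cycled and marked.
hunting=1
; Paint the currently selected shader hot pink instead of skipping it.
marking_mode=pink
```

With "skip" the selected shader simply disappears, which is hard to notice for subtle lighting passes; with "pink" everything it draws stands out.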

MY WEB

Helix Mod - Making 3D Better

My 3D Screenshot Gallery

Like my fixes? you can donate to Paypal: dhr.donation@gmail.com

Posted 04/20/2016 10:40 PM   
DHR said: Maybe it's not the correct PS?
Like you say, that VS is related to 30 PS... probably all "lights" related.

Try using "marking_mode=pink" instead of skip... that "pink" option is sometimes very helpful for spotting shaders that are difficult to catch with skip.


Will try that as well ;) and see how it goes ;)
I have a feeling all 30 PS are somehow related. I saw around 15 that control the TILES of the lighting system in one way or another...
The current implementation is WAAAAY better than the default (which is 2D) but it is not quite right yet ;)


Posted 04/20/2016 10:50 PM   
So...
I have this Pixel shader in GLSL:
// WORLD AMBIENT REFLECTIONS

#version 430 core
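// #extension directives must come before any declarations, so this one sits above the stereo uniforms.
#extension GL_ARB_shader_clock : enable
uniform float g_pixelEnabled;
uniform float g_eye;
uniform float g_eye_separation;
uniform float g_convergence;
uniform vec4 g_custom_params;
uniform vec4 g_screeninfo;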
void clip( float v ) { if ( v < 0.0 ) { discard; } }
void clip( vec2 v ) { if ( any( lessThan( v, vec2( 0.0 ) ) ) ) { discard; } }
void clip( vec3 v ) { if ( any( lessThan( v, vec3( 0.0 ) ) ) ) { discard; } }
void clip( vec4 v ) { if ( any( lessThan( v, vec4( 0.0 ) ) ) ) { discard; } }

float saturate( float v ) { return clamp( v, 0.0, 1.0 ); }
vec2 saturate( vec2 v ) { return clamp( v, 0.0, 1.0 ); }
vec3 saturate( vec3 v ) { return clamp( v, 0.0, 1.0 ); }
vec4 saturate( vec4 v ) { return clamp( v, 0.0, 1.0 ); }

vec4 tex2D( sampler2D image, vec2 texcoord ) { return texture( image, texcoord.xy ); }
vec4 tex2D( sampler2DShadow image, vec3 texcoord ) { return vec4( texture( image, texcoord.xyz ) ); }
vec4 tex2DARRAY( sampler2DArray image, vec3 texcoord ) { return texture( image, texcoord.xyz ); }

vec4 tex2D( sampler2D image, vec2 texcoord, vec2 dx, vec2 dy ) { return textureGrad( image, texcoord.xy, dx, dy ); }
vec4 tex2D( sampler2DShadow image, vec3 texcoord, vec2 dx, vec2 dy ) { return vec4( textureGrad( image, texcoord.xyz, dx, dy ) ); }
vec4 tex2DARRAY( sampler2DArray image, vec3 texcoord, vec2 dx, vec2 dy ) { return textureGrad( image, texcoord.xyz, dx, dy ); }

vec4 texCUBE( samplerCube image, vec3 texcoord ) { return texture( image, texcoord.xyz ); }
vec4 texCUBE( samplerCubeShadow image, vec4 texcoord ) { return vec4( texture( image, texcoord.xyzw ) ); }
vec4 texCUBEARRAY( samplerCubeArray image, vec4 texcoord ) { return texture( image, texcoord.xyzw ); }

vec4 tex1Dproj( sampler1D image, vec2 texcoord ) { return textureProj( image, texcoord ); }
vec4 tex2Dproj( sampler2D image, vec3 texcoord ) { return textureProj( image, texcoord ); }
vec4 tex3Dproj( sampler3D image, vec4 texcoord ) { return textureProj( image, texcoord ); }

vec4 tex1Dbias( sampler1D image, vec4 texcoord ) { return texture( image, texcoord.x, texcoord.w ); }
vec4 tex2Dbias( sampler2D image, vec4 texcoord ) { return texture( image, texcoord.xy, texcoord.w ); }
vec4 tex2DARRAYbias( sampler2DArray image, vec4 texcoord ) { return texture( image, texcoord.xyz, texcoord.w ); }
vec4 tex3Dbias( sampler3D image, vec4 texcoord ) { return texture( image, texcoord.xyz, texcoord.w ); }
vec4 texCUBEbias( samplerCube image, vec4 texcoord ) { return texture( image, texcoord.xyz, texcoord.w ); }
vec4 texCUBEARRAYbias( samplerCubeArray image, vec4 texcoord, float bias ) { return texture( image, texcoord.xyzw, bias); }

vec4 tex1Dlod( sampler1D image, vec4 texcoord ) { return textureLod( image, texcoord.x, texcoord.w ); }
vec4 tex2Dlod( sampler2D image, vec4 texcoord ) { return textureLod( image, texcoord.xy, texcoord.w ); }
vec4 tex2DARRAYlod( sampler2DArray image, vec4 texcoord ) { return textureLod( image, texcoord.xyz, texcoord.w ); }
vec4 tex3Dlod( sampler3D image, vec4 texcoord ) { return textureLod( image, texcoord.xyz, texcoord.w ); }
vec4 texCUBElod( samplerCube image, vec4 texcoord ) { return textureLod( image, texcoord.xyz, texcoord.w ); }
vec4 texCUBEARRAYlod( samplerCubeArray image, vec4 texcoord, float lod ) { return textureLod( image, texcoord.xyzw, lod ); }

vec4 tex2DGatherRed( sampler2D image, vec2 texcoord ) { return textureGather( image, texcoord, 0 ); }
vec4 tex2DGatherGreen( sampler2D image, vec2 texcoord ) { return textureGather( image, texcoord, 1 ); }
vec4 tex2DGatherBlue( sampler2D image, vec2 texcoord ) { return textureGather( image, texcoord, 2 ); }
vec4 tex2DGatherAlpha( sampler2D image, vec2 texcoord ) { return textureGather( image, texcoord, 3 ); }

vec4 tex2DGatherOffsetRed( sampler2D image, vec2 texcoord, const ivec2 v0 ) { return textureGatherOffset( image, texcoord, v0, 0 ); }
vec4 tex2DGatherOffsetGreen( sampler2D image, vec2 texcoord, const ivec2 v0 ) { return textureGatherOffset( image, texcoord, v0, 1 ); }
vec4 tex2DGatherOffsetBlue( sampler2D image, vec2 texcoord, const ivec2 v0 ) { return textureGatherOffset( image, texcoord, v0, 2 ); }
vec4 tex2DGatherOffsetAlpha( sampler2D image, vec2 texcoord, const ivec2 v0 ) { return textureGatherOffset( image, texcoord, v0, 3 ); }

#define tex2DGatherOffsetsRed( image, texcoord, v0, v1, v2, v3 ) textureGatherOffsets( image, texcoord, ivec2[]( v0, v1, v2, v3 ), 0 )
#define tex2DGatherOffsetsGreen( image, texcoord, v0, v1, v2, v3 ) textureGatherOffsets( image, texcoord, ivec2[]( v0, v1, v2, v3 ), 1 )
#define tex2DGatherOffsetsBlue( image, texcoord, v0, v1, v2, v3 ) textureGatherOffsets( image, texcoord, ivec2[]( v0, v1, v2, v3 ), 2 )
#define tex2DGatherOffsetsAlpha( image, texcoord, v0, v1, v2, v3 ) textureGatherOffsets( image, texcoord, ivec2[]( v0, v1, v2, v3 ), 3 )

float asfloat ( uint x ) { return uintBitsToFloat( x ); }
float asfloat ( int x ) { return intBitsToFloat( x ); }
uint asuint ( float x ) { return floatBitsToUint( x ); }
vec2 asfloat ( uvec2 x ) { return uintBitsToFloat( x ); }
vec2 asfloat ( ivec2 x ) { return intBitsToFloat( x ); }
uvec2 asuint ( vec2 x ) { return floatBitsToUint( x ); }
vec3 asfloat ( uvec3 x ) { return uintBitsToFloat( x ); }
vec3 asfloat ( ivec3 x ) { return intBitsToFloat( x ); }
uvec3 asuint ( vec3 x ) { return floatBitsToUint( x ); }
vec4 asfloat ( uvec4 x ) { return uintBitsToFloat( x ); }
vec4 asfloat ( ivec4 x ) { return intBitsToFloat( x ); }
uvec4 asuint ( vec4 x ) { return floatBitsToUint( x ); }
float fmax3 ( float f1, float f2, float f3 ) { return max( f1, max( f2, f3 ) ); }
float fmin3 ( float f1, float f2, float f3 ) { return min( f1, min( f2, f3 ) ); }
vec4 sqr ( vec4 x ) { return ( x * x ); }
vec3 sqr ( vec3 x ) { return ( x * x ); }
vec2 sqr ( vec2 x ) { return ( x * x ); }
float sqr ( float x ) { return ( x * x ); }
float ApproxLog2 ( float f ) {
    return ( float( asuint( f ) ) / ( 1 << 23 ) - 127 );
}
float ApproxExp2 ( float f ) {
    return asfloat( uint( ( f + 127 ) * ( 1 << 23 ) ) );
}
uint packR8G8B8A8 ( vec4 value ) {
    value = saturate( value );
    return ( ( ( uint( value.x * 255.0 ) ) << 24 ) | ( ( uint( value.y * 255.0 ) ) << 16 ) | ( ( uint( value.z * 255.0 ) ) << 8 ) | ( uint( value.w * 255.0 ) ) );
}
vec4 unpackR8G8B8A8 ( uint value ) {
    return vec4( ( value >> 24 ) & 0xFF, ( value >> 16 ) & 0xFF, ( value >> 8 ) & 0xFF, value & 0xFF ) / 255.0;
}
uint packR10G10B10 ( vec3 value ) {
    value = saturate( value );
    return ( ( uint( value.x * 1023.0 ) << 20 ) | ( uint( value.y * 1023.0 ) << 10 ) | ( uint( value.z * 1023.0 ) ) );
}
vec3 unpackR10G10B10 ( uint value ) {
    return vec3( ( value >> 20 ) & 0x3FF, ( value >> 10 ) & 0x3FF, ( value ) & 0x3FF ) / 1023.0;
}
uint packRGBE ( vec3 value ) {
    const float sharedExp = ceil( ApproxLog2( max( max( value.r, value.g ), value.b ) ) );
    return packR8G8B8A8( saturate( vec4( value / ApproxExp2( sharedExp ), ( sharedExp + 128.0 ) / 255.0 ) ) );
}
vec3 unpackRGBE ( uint value ) {
    const vec4 rgbe = unpackR8G8B8A8( value );
    return rgbe.rgb * ApproxExp2( rgbe.a * 255.0 - 128.0 );
}
vec2 screenPosToTexcoord ( vec2 pos, vec4 bias_scale ) { return ( pos * bias_scale.zw + bias_scale.xy ); }
vec2 screenPosToTexcoord ( vec2 pos, vec4 bias_scale, vec4 resolution_scale ) { return ( ( pos * bias_scale.zw + bias_scale.xy ) * resolution_scale.xy ); }
float GetLuma ( vec3 c ) {
    return dot( c, vec3( 0.2126, 0.7152, 0.0722 ) );
}
vec3 environmentBRDF ( float NdotV, float smoothness, vec3 f0 ) {
    const float t1 = 0.095 + smoothness * ( 0.6 + 4.19 * smoothness );
    const float t2 = NdotV + 0.025;
    const float t3 = 9.5 * smoothness * NdotV;
    const float a0 = t1 * t2 * ApproxExp2( 1 - 14 * NdotV );
    const float a1 = 0.4 + 0.6 * (1 - ApproxExp2( -t3 ) );
    return mix( vec3( a0 ), vec3( a1 ), f0.xyz );
}
struct lightingInput_t {
    vec3 albedo;
    vec3 colorMask;
    vec3 specular;
    float smoothness;
    float maskSSS;
    float thickness;
    vec3 normal;
    vec3 normalTS;
    vec3 normalSSS;
    vec3 lightmap;
    vec3 emissive;
    float alpha;
    float ssdoDiffuseMul;
    vec3 view;
    vec3 position;
    vec4 texCoords;
    vec4 fragCoord;
    mat3x3 invTS;
    vec3 ambient_lighting;
    vec3 diffuse_lighting;
    vec3 specular_lighting;
    vec3 output_lighting;
    uint albedo_packed;
    uint specular_packed;
    uint diffuse_lighting_packed;
    uint specular_lighting_packed;
    uint normal_packed;
    uvec4 ticksDecals;
    uvec4 ticksProbes;
    uvec4 ticksLights;
};
float GetLinearDepth ( float ndcZ, vec4 projectionMatrixZ, float rcpFarZ, bool bFirstPersonArmsRescale ) {
float linearZ = projectionMatrixZ.w / ( ndcZ + projectionMatrixZ.z );
if ( bFirstPersonArmsRescale ) {
linearZ *= linearZ < 1.0 ? 10.0 : 1.0;
}
return linearZ * rcpFarZ;
}
vec2 OctWrap ( vec2 v ) {
return ( 1.0 - abs( v.yx ) ) * vec2( ( v.x >= 0.0 ? 1.0 : -1.0 ), ( v.y >= 0.0 ? 1.0 : -1.0 ) );
}
vec3 NormalOctDecode ( vec2 encN, bool expand_range ) {
if ( expand_range ) {
encN = encN * 2.0 - 1.0;
}
vec3 n;
n.z = 1.0 - abs( encN.x ) - abs( encN.y );
n.xy = n.z >= 0.0 ? encN.xy : OctWrap( encN.xy );
n = normalize( n );
return n;
}
vec2 SmoothnessDecode ( float s ) {
const float expanded_s = s * 2.0 - 1.0;
return vec2( sqr( expanded_s ), expanded_s > 0 ? 1.0 : 0.0 );
}
uniform vec4 _fa_freqHigh [3];
uniform vec4 _fa_freqLow [9];
uniform lightparms1_ubo { vec4 lightparms1[4096]; };
uniform lightparms2_ubo { vec4 lightparms2[4096]; };
uniform lightparms3_ubo { vec4 lightparms3[4096]; };
uniform sampler2D samp_tex2;
uniform sampler2D samp_tex0;
uniform sampler2D samp_tex1;
uniform samplerCubeArray samp_envprobesmaparray;
struct clusternumlights_t {
int offset ;
int numItems ;
};
struct light_t {
vec3 pos ;
uint lightParms ;
vec4 posShadow ;
vec4 falloffR ;
vec4 projS ;
vec4 projT ;
vec4 projQ ;
uvec4 scaleBias ;
vec4 boxMin ;
vec4 boxMax ;
vec4 areaPlane ;
uint lightParms2 ;
uint colorPacked ;
float specMultiplier ;
float shadowBleedReduce ;
};
layout( std430 ) buffer clusternumlights_b {
restrict readonly clusternumlights_t clusternumlights[ ];
};
layout( std430 ) buffer clusterlightsid_b {
restrict readonly uint clusterlightsid[ ];
};

in vec4 gl_FragCoord;

out vec4 out_FragColor0;

void main() {
{
vec2 tc = screenPosToTexcoord( gl_FragCoord.xy, _fa_freqHigh[0 ] );
{
{
vec4 spec = tex2Dlod( samp_tex2, vec4( tc.xy, 0, 0 ) );
float smoothness = SmoothnessDecode( spec.w ).x;
float depth = tex2Dlod( samp_tex0, vec4( tc.xy, 0, 0 ) ).x;
out_FragColor0 = vec4( 0, 0, 0, 0 );
if(g_pixelEnabled < 1.0)
{out_FragColor0= vec4(0.0);}
if( dot( spec.xyz, vec3( 1 ) ) > 0 && depth < 1.0 ) {
lightingInput_t inputs = lightingInput_t( vec3(0,0,0), vec3(0,0,0), vec3(0,0,0), 0, 0, 0, vec3(0,0,0), vec3(0,0,1), vec3(0,0,1), vec3(0,0,0), vec3(0,0,0), 1, 1, vec3(0,0,0), vec3(0,0,0), vec4(0,0,0,0), vec4(0,0,0,0), mat3x3( vec3(1,0,0), vec3(0,1,0), vec3(0,0,1) ), vec3(0,0,0), vec3(0,0,0), vec3(0,0,0), vec3(0,0,0), 0, 0, 0, 0, 0, uvec4(0,0,0,0), uvec4(0,0,0,0), uvec4(0,0,0,0) );
vec3 frustumVecX0 = mix( _fa_freqLow[0 ].xyz, _fa_freqLow[1 ].xyz, tc.x * _fa_freqLow[2 ].z);
vec3 frustumVecX1 = mix( _fa_freqLow[3 ].xyz, _fa_freqLow[4 ].xyz, tc.x * _fa_freqLow[2 ].z);
vec3 frustumVec = mix( frustumVecX1.xyz, frustumVecX0.xyz, tc.y* _fa_freqLow[2 ].w);
inputs.albedo = vec3( 1 );
inputs.fragCoord.xy = vec2( gl_FragCoord.xy.xy );
vec4 wPosM = vec4( tc.xy, depth, 1.0 );
float zLinear = GetLinearDepth( depth, _fa_freqLow[5 ], 1.0, false );
vec3 world_pos = _fa_freqLow[6 ].xyz + frustumVec.xyz * zLinear;

// _fa_freqLow[5] is passed to GetLinearDepth as projectionMatrixZ. Is it InverseProjection0, or something else in this shader?
// The correction below works, but only at certain viewing angles.
world_pos.x -= g_eye * g_eye_separation * (zLinear - g_convergence) * _fa_freqLow[4].x;

inputs.fragCoord.z = depth;
inputs.position = world_pos.xyz;

//inputs.position.x -= g_eye * g_eye_separation * (zLinear - g_convergence);


inputs.view = normalize( _fa_freqLow[6 ].xyz - world_pos.xyz );
inputs.normal = NormalOctDecode( tex2Dlod( samp_tex1, vec4( tc.xy, 0, 0 ) ).xy, false );
inputs.specular = spec.xyz * spec.xyz;
inputs.smoothness = smoothness;
{
uint skipStaticModel = _fa_freqHigh[1 ].x > 0 ? ( 1 << 3 ) : 0;
uint clusterOffset;
{
{
float NUM_CLUSTERS_X = 16;
float NUM_CLUSTERS_Y = 8;
float NUM_CLUSTERS_Z = 24;
vec3 clusterCoordinate;
clusterCoordinate.xy = screenPosToTexcoord( inputs.fragCoord.xy, _fa_freqHigh[2 ] ) * _fa_freqLow[2 ].zw;
clusterCoordinate.xy = floor( clusterCoordinate.xy * vec2( NUM_CLUSTERS_X, NUM_CLUSTERS_Y ) );
float curr_z = GetLinearDepth( inputs.fragCoord.z, _fa_freqLow[5 ], 1.0, false );
float slice = log2( max( 1.0, curr_z / _fa_freqLow[7 ].z ) ) * _fa_freqLow[7 ].w;
clusterCoordinate.z = min( NUM_CLUSTERS_Z - 1, floor( slice ) );
clusterOffset = uint( clusterCoordinate.x + clusterCoordinate.y * NUM_CLUSTERS_X + clusterCoordinate.z * NUM_CLUSTERS_X * NUM_CLUSTERS_Y );
}
};
{
{
inputs.albedo_packed = packR10G10B10( inputs.albedo.xyz );
inputs.specular_packed = packR10G10B10( inputs.specular.xyz );
uint lightsMin = 0;
uint lightsMax = 0;
uint decalsMin = 0;
uint decalsMax = 0;
uint probesMin = 0;
uint probesMax = 0;
{
{
int MAX_LIGHTS_PER_CLUSTER = 256;
clusternumlights_t cluster = clusternumlights[ clusterOffset ];
int dataOffset = cluster.offset & ( ~ ( cluster.offset >> 31 ) );
int numItems = cluster.numItems & ( ~ ( cluster.numItems >> 31 ) );
lightsMin = ( dataOffset >> 16 ) * MAX_LIGHTS_PER_CLUSTER;
lightsMax = lightsMin + ( ( numItems ) & 0x000000FF );
decalsMin = lightsMin;
decalsMax = decalsMin + ( ( numItems >> 8 ) & 0x000000FF );
probesMin = lightsMin;
probesMax = probesMin + ( ( numItems >> 16 ) & 0x000000FF );
}
};
vec3 ambient = vec3( 0 );
ambient += inputs.emissive;
inputs.smoothness = dot( inputs.emissive.xyz, vec3( 1 ) ) > 0.0 ? -inputs.smoothness : inputs.smoothness;
ambient = mix( GetLuma( ambient.xyz ).xxx, ambient.xyz, _fa_freqLow[8 ].www );
inputs.ambient_lighting = ambient;
inputs.diffuse_lighting_packed = packRGBE( ambient );
float probes_dst_alpha = 1.0;
for ( uint probeIdx = probesMin; probeIdx < probesMax; ) {
uint divergentProbeId = ( clusterlightsid[ probeIdx ].x >> 24 ) & 0x00000FFF;
uint probe_id;
{
probe_id = divergentProbeId; if ( probe_id >= divergentProbeId ) {
++probeIdx;
}
};
{ light_t lightParms;
{
int lightParms1Size = 4;
int lightParms2Size = 4;
int lightParms3Size = 3;
lightParms.pos = lightparms1[ lightParms1Size * ( probe_id ) + 0 ].xyz;
lightParms.lightParms = asuint( lightparms1[ lightParms1Size * ( probe_id ) + 0 ].w );
lightParms.posShadow = lightparms1[ lightParms1Size * ( probe_id ) + 1 ];
lightParms.falloffR = lightparms1[ lightParms1Size * ( probe_id ) + 2 ];
lightParms.projS = lightparms2[ lightParms2Size * ( probe_id ) + 0 ];
lightParms.projT = lightparms2[ lightParms2Size * ( probe_id ) + 1 ];
lightParms.projQ = lightparms2[ lightParms2Size * ( probe_id ) + 2 ];
lightParms.scaleBias.x = asuint( lightparms1[ lightParms1Size * ( probe_id ) + 3 ].x );
lightParms.scaleBias.y = asuint( lightparms1[ lightParms1Size * ( probe_id ) + 3 ].y );
lightParms.scaleBias.z = asuint( lightparms1[ lightParms1Size * ( probe_id ) + 3 ].z );
lightParms.scaleBias.w = asuint( lightparms1[ lightParms1Size * ( probe_id ) + 3 ].w );
lightParms.boxMin = lightparms3[ lightParms3Size * ( probe_id ) + 0 ];
lightParms.boxMax = lightparms3[ lightParms3Size * ( probe_id ) + 1 ];
lightParms.areaPlane = lightparms3[ lightParms3Size * ( probe_id ) + 2 ];
lightParms.lightParms2 = asuint( lightparms2[ lightParms2Size * ( probe_id ) + 3 ].x );
lightParms.colorPacked = asuint( lightparms2[ lightParms2Size * ( probe_id ) + 3 ].y );
lightParms.specMultiplier = lightparms2[ lightParms2Size * ( probe_id ) + 3 ].z;
lightParms.shadowBleedReduce = lightparms2[ lightParms2Size * ( probe_id ) + 3 ].w;
};
vec3 projTC = vec3( ( inputs.position.x * lightParms.projS.x + ( inputs.position.y * lightParms.projS.y + ( inputs.position.z * lightParms.projS.z + lightParms.projS.w ) ) ),
( inputs.position.x * lightParms.projT.x + ( inputs.position.y * lightParms.projT.y + ( inputs.position.z * lightParms.projT.z + lightParms.projT.w ) ) ),
( inputs.position.x * lightParms.projQ.x + ( inputs.position.y * lightParms.projQ.y + ( inputs.position.z * lightParms.projQ.z + lightParms.projQ.w ) ) ) );
projTC.xy /= projTC.z;
projTC.z = inputs.position.x * lightParms.falloffR.x + ( inputs.position.y * lightParms.falloffR.y + ( inputs.position.z * lightParms.falloffR.z + lightParms.falloffR.w ) );
float clip_min = 1.0 / 255.0; if ( fmin3( projTC.x, projTC.y, projTC.z ) <= clip_min || fmax3( projTC.x, projTC.y, projTC.z ) >= 1.0 - clip_min ) {
continue;
}
uint light_parms = lightParms.lightParms;
vec3 light_color = unpackRGBE( lightParms.colorPacked );
float light_spec_multiplier = lightParms.specMultiplier;
vec3 light_position = lightParms.pos;
float light_probe_innerfalloff = ( float( ( light_parms >> 16 ) & 0xFF ) / 255.0 );
vec3 norm_pos_cube = projTC.xyz * 2.0 - vec3( 1.0 );
vec3 norm_pos_sphere = norm_pos_cube * sqrt( vec3( 1.0 ) - norm_pos_cube.yzx * norm_pos_cube.yzx * 0.5 - norm_pos_cube.zxy * norm_pos_cube.zxy * 0.5 + ( norm_pos_cube.yzx * norm_pos_cube.yzx * norm_pos_cube.zxy * norm_pos_cube.zxy / 3.0 ) );
float light_attenuation = saturate( 1 - ( length( norm_pos_sphere.xyz ) - light_probe_innerfalloff ) / ( 1.0 - light_probe_innerfalloff + 1e-6 ) );
light_attenuation *= light_attenuation;
light_attenuation *= lightParms.boxMin.w * probes_dst_alpha;
probes_dst_alpha *= ( 1 - light_attenuation ); if( light_attenuation <= clip_min * 4 ) {
continue;
}
vec3 light_color_final = light_color;
light_color_final = mix( GetLuma( light_color_final.xyz ).xxx, light_color_final.xyz, _fa_freqLow[8 ].www );
{
float light_probe_id = float( ( light_parms >> 8 ) & 0xFF );
vec3 diffEnvProbe = vec3( 0 );
vec3 specEnvProbe = vec3( 0 );
vec3 normal = inputs.normal.xyz;
normal.x = dot( lightParms.posShadow.xyzw.xy, inputs.normal.xy );
normal.y = dot( lightParms.posShadow.xyzw.zw, inputs.normal.xy );
vec3 R = reflect( -inputs.view, inputs.normal );
vec3 bmax = ( lightParms.boxMax.xyz.xyz - inputs.position.xyz ) / R;
vec3 bmin = ( lightParms.boxMin.xyz.xyz - inputs.position.xyz ) / R;
vec3 bminmax = max( bmax, bmin );
float intersection_dist = fmin3( bminmax.x, bminmax.y, bminmax.z) ;
vec3 intersection_pos = inputs.position.xyz + R.xyz * intersection_dist;
vec3 LRtmp = normalize( intersection_pos - light_position );
vec3 LR = LRtmp.xyz;
LR.x = dot( lightParms.posShadow.xyzw.xy, LRtmp.xy );
LR.y = dot( lightParms.posShadow.xyzw.zw, LRtmp.xy );
float num_mips = 6;
float mip_level = num_mips - num_mips * abs( inputs.smoothness );
specEnvProbe = texCUBEARRAYlod( samp_envprobesmaparray, vec4( LR.xyz, light_probe_id ), mip_level ).xyz;
float NdotV = saturate( dot( inputs.view.xyz, inputs.normal ) );
vec3 envBRDF = environmentBRDF( NdotV, abs( inputs.smoothness ), unpackR10G10B10( inputs.specular_packed ) );
specEnvProbe *= envBRDF * light_spec_multiplier;
diffEnvProbe = mix( GetLuma( diffEnvProbe.xyz ).xxx, diffEnvProbe.xyz, _fa_freqLow[8 ].www );
specEnvProbe = mix( GetLuma( specEnvProbe.xyz ).xxx, specEnvProbe.xyz, _fa_freqLow[8 ].www );
inputs.specular_lighting_packed = packRGBE( specEnvProbe * light_color_final * saturate( light_attenuation ) + unpackRGBE( inputs.specular_lighting_packed ) );
}; if( probes_dst_alpha <= clip_min * 4 ) {
break;
}
};
}
inputs.diffuse_lighting = unpackRGBE( inputs.diffuse_lighting_packed ) * unpackR10G10B10( inputs.albedo_packed );
inputs.specular_lighting = unpackRGBE( inputs.specular_lighting_packed );
inputs.output_lighting = inputs.diffuse_lighting + inputs.specular_lighting;
};
}
};
out_FragColor0.rgb = inputs.specular_lighting.xyz;
if(g_pixelEnabled < 1.0)
{out_FragColor0= vec4(0.0);}

}
}
};
}
}


The above correction seems to work only when looking at the reflections at 0/180 degree angles. At a 90 degree angle the stereo separation ends up top/bottom instead of left/right ;))

I am running out of ideas :( Can anyone help me out with it? :)

Thank you in advance!

Edit: See line 244

Posted 05/14/2016 04:56 PM   
It's a world coordinate so you need to convert it to the right coordinate system - do you have access to the inverse view-projection matrix? Or failing that the inverse view matrix?

Since it looks like you already know the linear depth it should be something like this (I doubt this syntax is right for GLSL, but I'm sure you can fix that up):
vec4 adjustment_clip_space = vec4(g_eye * g_eye_separation * (zLinear - g_convergence), 0, 0, 0);
vec4 adjustment_world_space = mul(adjustment_clip_space, inverse_view_projection);
world_pos.xyz -= adjustment_world_space.xyz;

The multiply might be the other way around depending on whether the matrix is row-major or column-major.
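
To see why an X-only edit such as `world_pos.x -= ...` can look right at some camera angles and wrong at others, here is a small CPU-side sketch in Python (not shader code). It assumes a hypothetical camera that only yaws about the Y axis; `camera_right` and `stereo_shift_world` are made-up names for illustration, and the real game may use a different axis convention. The only point is that the stereo shift lives along the camera's right vector, which lines up with world X only at particular yaw angles.

```python
import math

def camera_right(yaw_deg):
    """World-space right vector of a hypothetical camera yawing about the Y axis."""
    yaw = math.radians(yaw_deg)
    return (math.cos(yaw), 0.0, -math.sin(yaw))

def stereo_shift_world(eye, separation, convergence, z_linear, yaw_deg):
    """Stereo shift rotated from view space into world space.

    In view space the shift is purely along X: (magnitude, 0, 0).
    Rotating that by the inverse view rotation is the same as scaling
    the camera's right vector by the magnitude.
    """
    magnitude = eye * separation * (z_linear - convergence)
    right = camera_right(yaw_deg)
    return tuple(magnitude * c for c in right)

# Made-up stereo parameters, only to compare directions at different yaws.
for yaw in (0, 90, 180):
    shift = stereo_shift_world(eye=1.0, separation=0.1, convergence=1.0,
                               z_linear=3.0, yaw_deg=yaw)
    print(yaw, tuple(round(c, 6) for c in shift))
```

Under these assumptions the shift lies along world X at yaw 0 and 180 (sign aside), but along a different axis at yaw 90, so an X-only correction would push the pixel the wrong way there.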

2x Geforce GTX 980 in SLI provided by NVIDIA, i7 6700K 4GHz CPU, Asus 27" VG278HE 144Hz 3D Monitor, BenQ W1070 3D Projector, 120" Elite Screens YardMaster 2, 32GB Corsair DDR4 3200MHz RAM, Samsung 850 EVO 500G SSD, 4x750GB HDD in RAID5, Gigabyte Z170X-Gaming 7 Motherboard, Corsair Obsidian 750D Airflow Edition Case, Corsair RM850i PSU, HTC Vive, Win 10 64bit

Alienware M17x R4 w/ built in 3D, Intel i7 3740QM, GTX 680m 2GB, 16GB DDR3 1600MHz RAM, Win7 64bit, 1TB SSD, 1TB HDD, 750GB HDD

Pre-release 3D fixes, shadertool.py and other goodies: http://github.com/DarkStarSword/3d-fixes
Support me on Patreon: https://www.patreon.com/DarkStarSword or PayPal: https://www.paypal.me/DarkStarSword

Posted 05/14/2016 06:32 PM   
DarkStarSword said: It's a world coordinate so you need to convert it to the right coordinate system - do you have access to the inverse view-projection matrix? Or failing that the inverse view matrix?

Since it looks like you already know the linear depth it should be something like this (I doubt this syntax is right for GLSL, but I'm sure you can fix that up):
vec4 adjustment_clip_space = vec4(g_eye * g_eye_separation * (zLinear - g_convergence), 0, 0, 0);
vec4 adjustment_world_space = mul(adjustment_clip_space, inverse_view_projection);
world_pos.xyz -= adjustment_world_space.xyz;

The multiply might be the other way around depending on whether the matrix is row-major or column-major.


Awesome, big thanks for the info ;) Unfortunately, everything I have access to is in the shader defined above :(

I see some matrices are passed through "_fa_freqLow" and "_fa_freqHigh", but I have no idea what is what :(( Damn...
I'll play around a bit and see if I can find one or spot anything ;))

Posted 05/14/2016 06:54 PM   
In that case you might want to think about using the corners of the view frustum instead and derive what you need from that. I've only done this a few times (e.g. in The Last Tinker: City of Colors), so the below maths may be a little off, but it should be something like this:

// Frustum corners appear to be in _fa_freqLow[0,1,3,4], let's use 0 & 1 as the top two corners:
vec3 camera_horizontal_world = _fa_freqLow[1].xyz - _fa_freqLow[0].xyz;

// In this case I think that the frustum is already a unit distance from the
// camera, otherwise it would need to be rescaled.

// Find the width of the frustum using Pythagoras:
float frustum_width = sqrt(dot3(camera_horizontal_world.xyz, camera_horizontal_world.xyz));

// Calculate the horizontal unit vector:
vec3 camera_horizontal_world_normalized = camera_horizontal_world / frustum_width;

// Calculate the stereo adjustment magnitude, using frustum_width/2 in place of
// tan(fov_horizontal / 2) normally obtained from the inverse projection matrix:
float adjustment_magnitude = g_eye * g_eye_separation * (zLinear - g_convergence) * frustum_width / 2;

// World space adjustment is just the magnitude * horizontal:
world_pos.xyz -= adjustment_magnitude * camera_horizontal_world_normalized;
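
The step that uses frustum_width / 2 in place of tan(fov_horizontal / 2) can be sanity-checked on the CPU. The Python sketch below builds two top frustum corners at unit distance for a hypothetical camera, then repeats the arithmetic from the snippet above. The `make_top_corners` helper, the 90 degree FOV, the Y-up axis convention and the stereo values are all assumptions for illustration, not values taken from the game.

```python
import math

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def scale(v, s):
    return tuple(x * s for x in v)

def length(v):
    return math.sqrt(sum(x * x for x in v))

def make_top_corners(yaw_deg, fov_h_deg, half_height=0.5):
    """Hypothetical top-left / top-right frustum corner vectors at unit distance."""
    yaw = math.radians(yaw_deg)
    forward = (math.sin(yaw), 0.0, math.cos(yaw))
    right = (math.cos(yaw), 0.0, -math.sin(yaw))
    up = (0.0, 1.0, 0.0)
    half_width = math.tan(math.radians(fov_h_deg) / 2.0)
    top = add(forward, scale(up, half_height))
    return sub(top, scale(right, half_width)), add(top, scale(right, half_width))

def stereo_adjustment(corner_left, corner_right, eye, separation,
                      convergence, z_linear):
    """Same arithmetic as the shader snippet, returning (adjustment, width)."""
    horizontal = sub(corner_right, corner_left)
    frustum_width = length(horizontal)
    horizontal_unit = scale(horizontal, 1.0 / frustum_width)
    magnitude = (eye * separation * (z_linear - convergence)
                 * frustum_width / 2.0)
    return scale(horizontal_unit, magnitude), frustum_width

left, right = make_top_corners(yaw_deg=90, fov_h_deg=90)
adjustment, width = stereo_adjustment(left, right, eye=1.0, separation=0.1,
                                      convergence=1.0, z_linear=3.0)
print(width / 2.0, math.tan(math.radians(45)))  # these two should agree
print(adjustment)
```

Note that normalising the horizontal vector and then multiplying by frustum_width / 2 cancel out, so the adjustment is just half the corner difference scaled by the stereo term. Keeping the steps separate, as in the snippet above, mainly makes the intent easier to read.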

Posted 05/15/2016 12:42 PM   
[quote="DarkStarSword"]In that case you might want to think about using the corners of the view frustum instead and derive what you need from that. [...][/quote]

OMG! AWESOME!!! It was literally plug and play. Now, reading through it, I kinda understand what is going on there ;)

Now, I also have another shader that is a bit more complicated than the one above. It is responsible for environment reflections and does some reprojection inside. I managed to take it from broken 3D to rendering in 2D, but now I am completely at a loss. I tried the above methods and none worked. The shader is different though, so I am unsure what I need to do to it...

If you have a few free minutes, any help (YET AGAIN ^_^) would be INVALUABLE!
[code] // AMBIENT REFLECTIONS 2 #version 430 core uniform float g_pixelEnabled; uniform float g_eye; uniform float g_eye_separation; uniform float g_convergence; uniform vec4 g_custom_params; uniform vec4 g_screeninfo; #extension GL_ARB_shader_clock : enable void clip( float v ) { if ( v < 0.0 ) { discard; } } void clip( vec2 v ) { if ( any( lessThan( v, vec2( 0.0 ) ) ) ) { discard; } } void clip( vec3 v ) { if ( any( lessThan( v, vec3( 0.0 ) ) ) ) { discard; } } void clip( vec4 v ) { if ( any( lessThan( v, vec4( 0.0 ) ) ) ) { discard; } } float saturate( float v ) { return clamp( v, 0.0, 1.0 ); } vec2 saturate( vec2 v ) { return clamp( v, 0.0, 1.0 ); } vec3 saturate( vec3 v ) { return clamp( v, 0.0, 1.0 ); } vec4 saturate( vec4 v ) { return clamp( v, 0.0, 1.0 ); } vec4 tex2D( sampler2D image, vec2 texcoord ) { return texture( image, texcoord.xy ); } vec4 tex2D( sampler2DShadow image, vec3 texcoord ) { return vec4( texture( image, texcoord.xyz ) ); } vec4 tex2DARRAY( sampler2DArray image, vec3 texcoord ) { return texture( image, texcoord.xyz ); } vec4 tex2D( sampler2D image, vec2 texcoord, vec2 dx, vec2 dy ) { return textureGrad( image, texcoord.xy, dx, dy ); } vec4 tex2D( sampler2DShadow image, vec3 texcoord, vec2 dx, vec2 dy ) { return vec4( textureGrad( image, texcoord.xyz, dx, dy ) ); } vec4 tex2DARRAY( sampler2DArray image, vec3 texcoord, vec2 dx, vec2 dy ) { return textureGrad( image, texcoord.xyz, dx, dy ); } vec4 texCUBE( samplerCube image, vec3 texcoord ) { return texture( image, texcoord.xyz ); } vec4 texCUBE( samplerCubeShadow image, vec4 texcoord ) { return vec4( texture( image, texcoord.xyzw ) ); } vec4 texCUBEARRAY( samplerCubeArray image, vec4 texcoord ) { return texture( image, texcoord.xyzw ); } vec4 tex1Dproj( sampler1D image, vec2 texcoord ) { return textureProj( image, texcoord ); } vec4 tex2Dproj( sampler2D image, vec3 texcoord ) { return textureProj( image, texcoord ); } vec4 tex3Dproj( sampler3D image, vec4 texcoord ) { return textureProj( 
image, texcoord ); } vec4 tex1Dbias( sampler1D image, vec4 texcoord ) { return texture( image, texcoord.x, texcoord.w ); } vec4 tex2Dbias( sampler2D image, vec4 texcoord ) { return texture( image, texcoord.xy, texcoord.w ); } vec4 tex2DARRAYbias( sampler2DArray image, vec4 texcoord ) { return texture( image, texcoord.xyz, texcoord.w ); } vec4 tex3Dbias( sampler3D image, vec4 texcoord ) { return texture( image, texcoord.xyz, texcoord.w ); } vec4 texCUBEbias( samplerCube image, vec4 texcoord ) { return texture( image, texcoord.xyz, texcoord.w ); } vec4 texCUBEARRAYbias( samplerCubeArray image, vec4 texcoord, float bias ) { return texture( image, texcoord.xyzw, bias); } vec4 tex1Dlod( sampler1D image, vec4 texcoord ) { return textureLod( image, texcoord.x, texcoord.w ); } vec4 tex2Dlod( sampler2D image, vec4 texcoord ) { return textureLod( image, texcoord.xy, texcoord.w ); } vec4 tex2DARRAYlod( sampler2DArray image, vec4 texcoord ) { return textureLod( image, texcoord.xyz, texcoord.w ); } vec4 tex3Dlod( sampler3D image, vec4 texcoord ) { return textureLod( image, texcoord.xyz, texcoord.w ); } vec4 texCUBElod( samplerCube image, vec4 texcoord ) { return textureLod( image, texcoord.xyz, texcoord.w ); } vec4 texCUBEARRAYlod( samplerCubeArray image, vec4 texcoord, float lod ) { return textureLod( image, texcoord.xyzw, lod ); } vec4 tex2DGatherRed( sampler2D image, vec2 texcoord ) { return textureGather( image, texcoord, 0 ); } vec4 tex2DGatherGreen( sampler2D image, vec2 texcoord ) { return textureGather( image, texcoord, 1 ); } vec4 tex2DGatherBlue( sampler2D image, vec2 texcoord ) { return textureGather( image, texcoord, 2 ); } vec4 tex2DGatherAlpha( sampler2D image, vec2 texcoord ) { return textureGather( image, texcoord, 3 ); } vec4 tex2DGatherOffsetRed( sampler2D image, vec2 texcoord, const ivec2 v0 ) { return textureGatherOffset( image, texcoord, v0, 0 ); } vec4 tex2DGatherOffsetGreen( sampler2D image, vec2 texcoord, const ivec2 v0 ) { return textureGatherOffset( 
image, texcoord, v0, 1 ); } vec4 tex2DGatherOffsetBlue( sampler2D image, vec2 texcoord, const ivec2 v0 ) { return textureGatherOffset( image, texcoord, v0, 2 ); } vec4 tex2DGatherOffsetAlpha( sampler2D image, vec2 texcoord, const ivec2 v0 ) { return textureGatherOffset( image, texcoord, v0, 3 ); } #define tex2DGatherOffsetsRed( image, texcoord, v0, v1, v2, v3 ) textureGatherOffsets( image, texcoord, ivec2[]( v0, v1, v2, v3 ), 0 ) #define tex2DGatherOffsetsGreen( image, texcoord, v0, v1, v2, v3 ) textureGatherOffsets( image, texcoord, ivec2[]( v0, v1, v2, v3 ), 1 ) #define tex2DGatherOffsetsBlue( image, texcoord, v0, v1, v2, v3 ) textureGatherOffsets( image, texcoord, ivec2[]( v0, v1, v2, v3 ), 2 ) #define tex2DGatherOffsetsAlpha( image, texcoord, v0, v1, v2, v3 ) textureGatherOffsets( image, texcoord, ivec2[]( v0, v1, v2, v3 ), 3 ) float asfloat ( uint x ) { return uintBitsToFloat( x ); } float asfloat ( int x ) { return intBitsToFloat( x ); } vec2 asfloat ( uvec2 x ) { return uintBitsToFloat( x ); } vec2 asfloat ( ivec2 x ) { return intBitsToFloat( x ); } vec3 asfloat ( uvec3 x ) { return uintBitsToFloat( x ); } vec3 asfloat ( ivec3 x ) { return intBitsToFloat( x ); } vec4 asfloat ( uvec4 x ) { return uintBitsToFloat( x ); } vec4 asfloat ( ivec4 x ) { return intBitsToFloat( x ); } int firstBitLow ( uint value ) { return findLSB( value ); } vec4 sqr ( vec4 x ) { return ( x * x ); } vec3 sqr ( vec3 x ) { return ( x * x ); } vec2 sqr ( vec2 x ) { return ( x * x ); } float sqr ( float x ) { return ( x * x ); } vec4 MatrixMul ( vec3 pos, mat4x4 mat ) { return vec4( ( pos.x * mat[0].x + ( pos.y * mat[0].y + ( pos.z * mat[0].z + mat[0].w ) ) ), ( pos.x * mat[1].x + ( pos.y * mat[1].y + ( pos.z * mat[1].z + mat[1].w ) ) ), ( pos.x * mat[2].x + ( pos.y * mat[2].y + ( pos.z * mat[2].z + mat[2].w ) ) ), ( pos.x * mat[3].x + ( pos.y * mat[3].y + ( pos.z * mat[3].z + mat[3].w ) ) ) ); } vec3 MatrixMul ( vec3 pos, mat3x4 mat ) { return vec3( ( pos.x * mat[0].x + ( pos.y * 
mat[0].y + ( pos.z * mat[0].z + mat[0].w ) ) ), ( pos.x * mat[1].x + ( pos.y * mat[1].y + ( pos.z * mat[1].z + mat[1].w ) ) ), ( pos.x * mat[2].x + ( pos.y * mat[2].y + ( pos.z * mat[2].z + mat[2].w ) ) ) ); } vec4 MatrixMul ( vec3 pos, vec4 matX, vec4 matY, vec4 matZ, vec4 matW ) { return vec4( ( pos.x * matX.x + ( pos.y * matX.y + ( pos.z * matX.z + matX.w ) ) ), ( pos.x * matY.x + ( pos.y * matY.y + ( pos.z * matY.z + matY.w ) ) ), ( pos.x * matZ.x + ( pos.y * matZ.y + ( pos.z * matZ.z + matZ.w ) ) ), ( pos.x * matW.x + ( pos.y * matW.y + ( pos.z * matW.z + matW.w ) ) ) ); } vec3 MatrixMul ( vec3 pos, vec4 matX, vec4 matY, vec4 matZ ) { return vec3( ( pos.x * matX.x + ( pos.y * matX.y + ( pos.z * matX.z + matX.w ) ) ), ( pos.x * matY.x + ( pos.y * matY.y + ( pos.z * matY.z + matY.w ) ) ), ( pos.x * matZ.x + ( pos.y * matZ.y + ( pos.z * matZ.z + matZ.w ) ) ) ); } vec4 MatrixMul ( mat4x4 m, vec4 v ) { return m * v; } vec4 MatrixMul ( vec4 v, mat4x4 m ) { return v * m; } vec3 MatrixMul ( mat3x3 m, vec3 v ) { return m * v; } vec3 MatrixMul ( vec3 v, mat3x3 m ) { return v * m; } vec2 MatrixMul ( mat2x2 m, vec2 v ) { return m * v; } vec2 MatrixMul ( vec2 v, mat2x2 m ) { return v * m; } float ApproxExp2 ( float f ) { return asfloat( uint( ( f + 127 ) * ( 1 << 23 ) ) ); } vec2 screenPosToTexcoord ( vec2 pos, vec4 bias_scale ) { return ( pos * bias_scale.zw + bias_scale.xy ); } vec2 screenPosToTexcoord ( vec2 pos, vec4 bias_scale, vec4 resolution_scale ) { return ( ( pos * bias_scale.zw + bias_scale.xy ) * resolution_scale.xy ); } vec4 tex2DFetch ( sampler2D image, ivec2 texcoord, int mip ) { return texelFetch( image, texcoord, mip ); } vec3 environmentBRDF ( float NdotV, float smoothness, vec3 f0 ) { const float t1 = 0.095 + smoothness * ( 0.6 + 4.19 * smoothness ); const float t2 = NdotV + 0.025; const float t3 = 9.5 * smoothness * NdotV; const float a0 = t1 * t2 * ApproxExp2( 1 - 14 * NdotV ); const float a1 = 0.4 + 0.6 * (1 - ApproxExp2( -t3 ) ); return mix( vec3( a0 
), vec3( a1 ), f0.xyz ); } struct gbuffer_t { vec3 world_pos; vec3 view; vec3 normal; vec3 specular; vec3 reflection; float smoothness; float depth; vec3 phongEnvBRDF; }; float GetLinearDepth ( float ndcZ, vec4 projectionMatrixZ, float rcpFarZ, bool bFirstPersonArmsRescale ) { float linearZ = projectionMatrixZ.w / ( ndcZ + projectionMatrixZ.z ); if ( bFirstPersonArmsRescale ) { linearZ *= linearZ < 1.0 ? 10.0 : 1.0; } return linearZ * rcpFarZ; } vec3 GetWindowPosZ ( vec3 viewPos, vec4 projection ) { return vec3( vec2( 0.5 ) + projection.xy * ( viewPos.xy / viewPos.z ), ( projection.w / viewPos.z ) + projection.z ) ; } vec3 GetViewPos ( vec3 winPos, vec4 inverseProjection0, vec4 inverseProjection1 ) { return vec3( inverseProjection0.xy * winPos.xy + inverseProjection0.zw, inverseProjection1.z ) / ( inverseProjection1.x * winPos.z + inverseProjection1.y ); } vec2 OctWrap ( vec2 v ) { return ( 1.0 - abs( v.yx ) ) * vec2( ( v.x >= 0.0 ? 1.0 : -1.0 ), ( v.y >= 0.0 ? 1.0 : -1.0 ) ); } vec3 NormalOctDecode ( vec2 encN, bool expand_range ) { if ( expand_range ) { encN = encN * 2.0 - 1.0; } vec3 n; n.z = 1.0 - abs( encN.x ) - abs( encN.y ); n.xy = n.z >= 0.0 ? encN.xy : OctWrap( encN.xy ); n = normalize( n ); return n; } vec2 SmoothnessDecode ( float s ) { const float expanded_s = s * 2.0 - 1.0; return vec2( sqr( expanded_s ), expanded_s > 0 ? 
1.0 : 0.0 ); } uniform vec4 _fa_freqHigh [8]; uniform vec4 _fa_freqLow [13]; uniform sampler2D samp_tex0; uniform sampler2D samp_tex4; uniform sampler2D samp_tex1; uniform sampler2D samp_tex3; uniform sampler2D samp_tex5; in vec4 gl_FragCoord; out vec4 out_FragColor0; out vec4 out_FragColor1; void main() { { { vec2 tc = screenPosToTexcoord( gl_FragCoord.xy.xy, _fa_freqHigh[0 ] ); { { out_FragColor0 = vec4( 0 ); if(g_pixelEnabled < 1.0) {out_FragColor0= vec4(0.0);} out_FragColor1 = vec4( 0 ); gbuffer_t scene; { { vec3 normal = NormalOctDecode( tex2DFetch( samp_tex0, ivec2( gl_FragCoord.xy.xy ).xy, 0 ).xy, false ); vec4 spec = tex2DFetch( samp_tex4, ivec2( gl_FragCoord.xy.xy ).xy, 0 ); scene.normal = normal.xyz; mat3x3 invProjT = mat3x3( _fa_freqLow[0 ].xyz, _fa_freqLow[1 ].xyz, _fa_freqLow[2 ].xyz ); scene.normal = normalize( MatrixMul( invProjT, scene.normal.xyz ) ); vec2 decodedSmoothness = SmoothnessDecode( spec.w ); scene.smoothness = decodedSmoothness.y > 0? decodedSmoothness.x : 0; scene.specular = sqr( tex2DFetch( samp_tex4, ivec2( gl_FragCoord.xy.xy ).xy, 0 ).xyz ); scene.depth = tex2DFetch( samp_tex1, ivec2( gl_FragCoord.xy.xy ).xy, 0 ).x; float zLinear = GetLinearDepth( scene.depth, _fa_freqLow[3 ], 1.0, true ); scene.world_pos = GetViewPos( vec3( tc * _fa_freqLow[4 ].zw, scene.depth ), _fa_freqLow[5 ], _fa_freqLow[6 ] ); // Makes it 2d scene.world_pos.x -= g_eye * g_eye_separation * (zLinear - g_convergence); // THIS DOESN'T WORK. 
Because "invProjT" is only the Projection and doesn't contain the View :( vec3 adjustment_clip_space = vec3(g_eye * g_eye_separation * (zLinear - g_convergence), 0, 0); vec3 adjustment_world_space = adjustment_clip_space * invProjT; scene.world_pos.xyz -= adjustment_world_space.xyz; scene.view = normalize( - scene.world_pos.xyz ); scene.reflection = ( reflect( -scene.view.xyz, scene.normal.xyz ) ); float NdotV = saturate( dot( scene.view.xyz, scene.normal.xyz ) ); scene.phongEnvBRDF = environmentBRDF( NdotV, scene.smoothness, ( scene.specular ) ); } } float smoothness_threshold = _fa_freqHigh[1 ].x; float VdovR_threshold = 0.0; if ( scene.smoothness > smoothness_threshold && dot( scene.view.xyz, scene.reflection.xyz ) <= VdovR_threshold && scene.depth < 1.0 ) { uvec2 idxPix = uvec2( ivec2( gl_FragCoord.xy.xy ).xy) & 3; float jitter = float( ( ( ( 2068378560 * ( 1 - ( idxPix.x >> 1 ) ) + 1500172770 * ( idxPix.x >> 1 ) ) >> ( ( idxPix.y + ( ( idxPix.x & 1 ) << 2 ) ) << 2 ) ) + int( _fa_freqHigh[2 ].w ) ) & 0xF ) / 15.0; jitter -= 1.0; int max_steps = 16; float stride = 1; vec3 ray_start_vs = scene.world_pos; vec3 ray_end_vs = ray_start_vs.xyz + scene.reflection * GetLinearDepth( scene.depth, _fa_freqLow[3 ], 1, true ); vec4 hit_color; { { hit_color = vec4( 0 ); vec3 ray_start = GetWindowPosZ( ray_start_vs, _fa_freqLow[7 ] ) * vec3( _fa_freqLow[4 ].xy, 1 ); vec3 ray_end = GetWindowPosZ( ray_end_vs, _fa_freqLow[7 ] ) * vec3( _fa_freqLow[4 ].xy, 1 ); vec3 ray_step = ( stride / float( max_steps ) ) * ( ray_end.xyz - ray_start.xyz ) / length( ray_end.xy - ray_start.xy ); vec3 ray_pos = ray_start.xyz + ray_step.xyz * jitter; float z_thickness = abs( ray_step.z ); int hit = 0; vec3 best_hit = ray_pos; float prev_scene_z = ray_start.z; for ( int curr_step = 0; curr_step < max_steps ; curr_step += 4 ) { vec4 scene_z4 = 1 - vec4( tex2Dlod( samp_tex3, vec4( ray_pos.xy + ray_step.xy * float( curr_step + 1 ), 0, 0 ) ).x, tex2Dlod( samp_tex3, vec4( ray_pos.xy + ray_step.xy * 
float( curr_step + 2 ), 0, 0 ) ).x, tex2Dlod( samp_tex3, vec4( ray_pos.xy + ray_step.xy * float( curr_step + 3 ), 0, 0 ) ).x, tex2Dlod( samp_tex3, vec4( ray_pos.xy + ray_step.xy * float( curr_step + 4 ), 0, 0 ) ).x ); vec4 curr_step4 = vec4( 1, 2, 3, 4 ) + vec4( curr_step ); uvec4 z_test = uvec4( lessThan( abs( ( ray_pos.zzzz + ray_step.zzzz * curr_step4 ) - scene_z4 - z_thickness.xxxx ), z_thickness.xxxx ) ); uint z_mask = ( z_test[ 0 ] << 0 ) | ( z_test[ 1 ] << 1 ) | ( z_test[ 2 ] << 2 ) | ( z_test[ 3 ] << 3 ); if ( z_mask > 0 ) { int first_hit = firstBitLow( z_mask ); prev_scene_z = first_hit > 0 ? scene_z4[ max( 0, first_hit - 1 ) ] : prev_scene_z; best_hit = ray_pos + ray_step * float( curr_step + first_hit + 1 ); float z_after = scene_z4[ first_hit ] - best_hit.z; float z_before = prev_scene_z - best_hit.z + ray_step.z; float w = saturate( z_after / ( z_after - z_before ) ); vec3 prev_ray_pos = best_hit.xyz - ray_step.xyz; best_hit = prev_ray_pos * w + best_hit * ( 1.0 - w ); hit = curr_step + int( first_hit ); break; } prev_scene_z = scene_z4.w; } if ( hit > 0 ) { vec4 worldPosM = MatrixMul( vec3( best_hit.xy * _fa_freqLow[4 ].zw, best_hit.z ), _fa_freqLow[8 ], _fa_freqLow[9 ], _fa_freqLow[10 ], _fa_freqLow[11 ] ); worldPosM /= worldPosM.w; vec4 tc_reproj = MatrixMul( worldPosM.xyz, _fa_freqHigh[3 ], _fa_freqHigh[4 ], _fa_freqHigh[5 ], _fa_freqHigh[6 ] ); tc_reproj.xy /= tc_reproj.w; tc_reproj.xy = tc_reproj.xy * 0.5 + 0.5; tc_reproj.xy *= _fa_freqLow[12 ].xy; vec4 finalRefl = vec4( tex2Dlod( samp_tex5, vec4( tc_reproj.xy, 0, 0 ) ).xyz, 1 ); vec2 dist = ( tc_reproj.xy * _fa_freqLow[12 ].zw ) * 2 - 1; float edge_atten = ( smoothstep( 0.0, 0.5, saturate( 1 - dot( dist.xy, dist.xy ) ) ) ); edge_atten *= ( saturate( ( 1 - ( float( hit ) / 16.0 ) ) * 4.0 ) ); hit_color = vec4( finalRefl.xyz, edge_atten ); } } }; float smoothness_mask = saturate( saturate( 1- ( ( 1.0 - scene.smoothness ) / ( 1.0 - smoothness_threshold ) ) ) ); hit_color.w = saturate( hit_color.w * 
smoothness_mask * _fa_freqHigh[7 ].x ); out_FragColor0.rgb = max( vec3( 0 ), hit_color.w * hit_color.xyz * scene.phongEnvBRDF ); if(g_pixelEnabled < 1.0) {out_FragColor0= vec4(0.0);} out_FragColor1.r = saturate( hit_color.w ); } } }; }; } } [/code] At line 212, I managed to make it 2D again. I am not sure if is the right way, but the result is definitely a 2D projection now. BIG BIG BIG THANK YOU AGAIN!!!
DarkStarSword said: In that case you might want to think about using the corners of the view frustum instead and derive what you need from that. I've only done this a few times (e.g. in The Last Tinker: City of Colors), so the below maths may be a little off, but it should be something like this:

// Frustum corners appear to be in _fa_freqLow[0,1,3,4], let's use 0 & 1 as the top two corners:
vec3 camera_horizontal_world = _fa_freqLow[1].xyz - _fa_freqLow[0].xyz;

// In this case I think that the frustum is already a unit distance from the
// camera, otherwise it would need to be rescaled.


// Find the width of the frustum using Pythagoras
// (dot() is the GLSL built-in; dot3() does not exist in GLSL):
float frustum_width = sqrt(dot(camera_horizontal_world.xyz, camera_horizontal_world.xyz));

// Calculate the horizontal unit vector:
vec3 camera_horizontal_world_normalized = camera_horizontal_world / frustum_width;

// Calculate the stereo adjustment magnitude, using frustum_width/2 in place of
// tan(fov_horizontal / 2) normally obtained from the inverse projection matrix:
float adjustment_magnitude = g_eye * g_eye_separation * (zLinear - g_convergence) * frustum_width / 2;

// World space adjustment is just the magnitude * horizontal:
world_pos.xyz -= adjustment_magnitude * camera_horizontal_world_normalized;




OMG! AWESOME!!! It was literally Plug and PLAY. Now reading through it I kinda understand what is going on there;)
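
For anyone following along, the frustum-corner adjustment quoted above can be checked numerically. This is a minimal sketch in plain Python rather than GLSL, and everything in it is an assumption chosen for illustration: the camera basis, the 90 degree horizontal FOV, the separation and convergence values, and the helper names `top_corners` and `stereo_adjustment`. It builds the two top frustum corners on a plane one unit in front of the camera, then confirms that half the distance between them equals tan(fov_horizontal / 2) and that the normalized difference recovers the camera's horizontal axis.

```python
import math

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def scale(v, s):
    return tuple(x * s for x in v)

def length(v):
    return math.sqrt(sum(x * x for x in v))

def top_corners(right, up, forward, fov_h_deg, aspect):
    """Top-left and top-right frustum corners on the plane one unit along 'forward'.
    Hypothetical helper: the game supplies these corners in its own uniforms."""
    half_w = math.tan(math.radians(fov_h_deg) / 2.0)
    half_h = half_w / aspect
    top_centre = add(forward, scale(up, half_h))
    return (sub(top_centre, scale(right, half_w)),
            add(top_centre, scale(right, half_w)))

def stereo_adjustment(corner_a, corner_b, eye, separation, convergence, z_linear):
    """Same arithmetic as the quoted snippet: magnitude times the horizontal unit vector."""
    horizontal = sub(corner_b, corner_a)
    frustum_width = length(horizontal)
    horizontal_unit = scale(horizontal, 1.0 / frustum_width)
    magnitude = eye * separation * (z_linear - convergence) * frustum_width / 2.0
    return scale(horizontal_unit, magnitude), frustum_width

# Assumed camera, yawed 30 degrees about the vertical axis.
yaw = math.radians(30.0)
RIGHT = (math.cos(yaw), 0.0, -math.sin(yaw))
UP = (0.0, 1.0, 0.0)
FORWARD = (math.sin(yaw), 0.0, math.cos(yaw))

corner_a, corner_b = top_corners(RIGHT, UP, FORWARD, fov_h_deg=90.0, aspect=16.0 / 9.0)
adjustment, width = stereo_adjustment(corner_a, corner_b, eye=1.0, separation=0.05,
                                      convergence=1.0, z_linear=5.0)
print("frustum_width / 2 =", width / 2.0)
print("world-space adjustment =", adjustment)
```

The adjustment comes out along the camera's right vector regardless of how the camera is rotated, which is why this works where a fixed x offset in world space would not.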

Now, I also have another shader that is a bit more complicated than the above.
It is responsible for environment reflections and is doing some reprojections inside. I managed to make it render from broken 3D in 2D. But now.... I am completely at a loss. I tried the above methods and none worked. The shader is different though... So, I am unsure what I need to do to it...

If you have a few minutes free any help (YET AGAIN ^_^) will be INVALUABLE!

// AMBIENT REFLECTIONS 2
#version 430 core
uniform float g_pixelEnabled;
uniform float g_eye;
uniform float g_eye_separation;
uniform float g_convergence;
uniform vec4 g_custom_params;
uniform vec4 g_screeninfo;
#extension GL_ARB_shader_clock : enable
void clip( float v ) { if ( v < 0.0 ) { discard; } }
void clip( vec2 v ) { if ( any( lessThan( v, vec2( 0.0 ) ) ) ) { discard; } }
void clip( vec3 v ) { if ( any( lessThan( v, vec3( 0.0 ) ) ) ) { discard; } }
void clip( vec4 v ) { if ( any( lessThan( v, vec4( 0.0 ) ) ) ) { discard; } }

float saturate( float v ) { return clamp( v, 0.0, 1.0 ); }
vec2 saturate( vec2 v ) { return clamp( v, 0.0, 1.0 ); }
vec3 saturate( vec3 v ) { return clamp( v, 0.0, 1.0 ); }
vec4 saturate( vec4 v ) { return clamp( v, 0.0, 1.0 ); }

vec4 tex2D( sampler2D image, vec2 texcoord ) { return texture( image, texcoord.xy ); }
vec4 tex2D( sampler2DShadow image, vec3 texcoord ) { return vec4( texture( image, texcoord.xyz ) ); }
vec4 tex2DARRAY( sampler2DArray image, vec3 texcoord ) { return texture( image, texcoord.xyz ); }

vec4 tex2D( sampler2D image, vec2 texcoord, vec2 dx, vec2 dy ) { return textureGrad( image, texcoord.xy, dx, dy ); }
vec4 tex2D( sampler2DShadow image, vec3 texcoord, vec2 dx, vec2 dy ) { return vec4( textureGrad( image, texcoord.xyz, dx, dy ) ); }
vec4 tex2DARRAY( sampler2DArray image, vec3 texcoord, vec2 dx, vec2 dy ) { return textureGrad( image, texcoord.xyz, dx, dy ); }

vec4 texCUBE( samplerCube image, vec3 texcoord ) { return texture( image, texcoord.xyz ); }
vec4 texCUBE( samplerCubeShadow image, vec4 texcoord ) { return vec4( texture( image, texcoord.xyzw ) ); }
vec4 texCUBEARRAY( samplerCubeArray image, vec4 texcoord ) { return texture( image, texcoord.xyzw ); }

vec4 tex1Dproj( sampler1D image, vec2 texcoord ) { return textureProj( image, texcoord ); }
vec4 tex2Dproj( sampler2D image, vec3 texcoord ) { return textureProj( image, texcoord ); }
vec4 tex3Dproj( sampler3D image, vec4 texcoord ) { return textureProj( image, texcoord ); }

vec4 tex1Dbias( sampler1D image, vec4 texcoord ) { return texture( image, texcoord.x, texcoord.w ); }
vec4 tex2Dbias( sampler2D image, vec4 texcoord ) { return texture( image, texcoord.xy, texcoord.w ); }
vec4 tex2DARRAYbias( sampler2DArray image, vec4 texcoord ) { return texture( image, texcoord.xyz, texcoord.w ); }
vec4 tex3Dbias( sampler3D image, vec4 texcoord ) { return texture( image, texcoord.xyz, texcoord.w ); }
vec4 texCUBEbias( samplerCube image, vec4 texcoord ) { return texture( image, texcoord.xyz, texcoord.w ); }
vec4 texCUBEARRAYbias( samplerCubeArray image, vec4 texcoord, float bias ) { return texture( image, texcoord.xyzw, bias); }

vec4 tex1Dlod( sampler1D image, vec4 texcoord ) { return textureLod( image, texcoord.x, texcoord.w ); }
vec4 tex2Dlod( sampler2D image, vec4 texcoord ) { return textureLod( image, texcoord.xy, texcoord.w ); }
vec4 tex2DARRAYlod( sampler2DArray image, vec4 texcoord ) { return textureLod( image, texcoord.xyz, texcoord.w ); }
vec4 tex3Dlod( sampler3D image, vec4 texcoord ) { return textureLod( image, texcoord.xyz, texcoord.w ); }
vec4 texCUBElod( samplerCube image, vec4 texcoord ) { return textureLod( image, texcoord.xyz, texcoord.w ); }
vec4 texCUBEARRAYlod( samplerCubeArray image, vec4 texcoord, float lod ) { return textureLod( image, texcoord.xyzw, lod ); }

vec4 tex2DGatherRed( sampler2D image, vec2 texcoord ) { return textureGather( image, texcoord, 0 ); }
vec4 tex2DGatherGreen( sampler2D image, vec2 texcoord ) { return textureGather( image, texcoord, 1 ); }
vec4 tex2DGatherBlue( sampler2D image, vec2 texcoord ) { return textureGather( image, texcoord, 2 ); }
vec4 tex2DGatherAlpha( sampler2D image, vec2 texcoord ) { return textureGather( image, texcoord, 3 ); }

vec4 tex2DGatherOffsetRed( sampler2D image, vec2 texcoord, const ivec2 v0 ) { return textureGatherOffset( image, texcoord, v0, 0 ); }
vec4 tex2DGatherOffsetGreen( sampler2D image, vec2 texcoord, const ivec2 v0 ) { return textureGatherOffset( image, texcoord, v0, 1 ); }
vec4 tex2DGatherOffsetBlue( sampler2D image, vec2 texcoord, const ivec2 v0 ) { return textureGatherOffset( image, texcoord, v0, 2 ); }
vec4 tex2DGatherOffsetAlpha( sampler2D image, vec2 texcoord, const ivec2 v0 ) { return textureGatherOffset( image, texcoord, v0, 3 ); }

#define tex2DGatherOffsetsRed( image, texcoord, v0, v1, v2, v3 ) textureGatherOffsets( image, texcoord, ivec2[]( v0, v1, v2, v3 ), 0 )
#define tex2DGatherOffsetsGreen( image, texcoord, v0, v1, v2, v3 ) textureGatherOffsets( image, texcoord, ivec2[]( v0, v1, v2, v3 ), 1 )
#define tex2DGatherOffsetsBlue( image, texcoord, v0, v1, v2, v3 ) textureGatherOffsets( image, texcoord, ivec2[]( v0, v1, v2, v3 ), 2 )
#define tex2DGatherOffsetsAlpha( image, texcoord, v0, v1, v2, v3 ) textureGatherOffsets( image, texcoord, ivec2[]( v0, v1, v2, v3 ), 3 )

float asfloat ( uint x ) { return uintBitsToFloat( x ); }
float asfloat ( int x ) { return intBitsToFloat( x ); }
vec2 asfloat ( uvec2 x ) { return uintBitsToFloat( x ); }
vec2 asfloat ( ivec2 x ) { return intBitsToFloat( x ); }
vec3 asfloat ( uvec3 x ) { return uintBitsToFloat( x ); }
vec3 asfloat ( ivec3 x ) { return intBitsToFloat( x ); }
vec4 asfloat ( uvec4 x ) { return uintBitsToFloat( x ); }
vec4 asfloat ( ivec4 x ) { return intBitsToFloat( x ); }
int firstBitLow ( uint value ) { return findLSB( value ); }
vec4 sqr ( vec4 x ) { return ( x * x ); }
vec3 sqr ( vec3 x ) { return ( x * x ); }
vec2 sqr ( vec2 x ) { return ( x * x ); }
float sqr ( float x ) { return ( x * x ); }
vec4 MatrixMul ( vec3 pos, mat4x4 mat ) {
return vec4( ( pos.x * mat[0].x + ( pos.y * mat[0].y + ( pos.z * mat[0].z + mat[0].w ) ) ),
( pos.x * mat[1].x + ( pos.y * mat[1].y + ( pos.z * mat[1].z + mat[1].w ) ) ),
( pos.x * mat[2].x + ( pos.y * mat[2].y + ( pos.z * mat[2].z + mat[2].w ) ) ),
( pos.x * mat[3].x + ( pos.y * mat[3].y + ( pos.z * mat[3].z + mat[3].w ) ) ) );
}
vec3 MatrixMul ( vec3 pos, mat3x4 mat ) {
return vec3( ( pos.x * mat[0].x + ( pos.y * mat[0].y + ( pos.z * mat[0].z + mat[0].w ) ) ),
( pos.x * mat[1].x + ( pos.y * mat[1].y + ( pos.z * mat[1].z + mat[1].w ) ) ),
( pos.x * mat[2].x + ( pos.y * mat[2].y + ( pos.z * mat[2].z + mat[2].w ) ) ) );
}
vec4 MatrixMul ( vec3 pos, vec4 matX, vec4 matY, vec4 matZ, vec4 matW ) {
return vec4( ( pos.x * matX.x + ( pos.y * matX.y + ( pos.z * matX.z + matX.w ) ) ),
( pos.x * matY.x + ( pos.y * matY.y + ( pos.z * matY.z + matY.w ) ) ),
( pos.x * matZ.x + ( pos.y * matZ.y + ( pos.z * matZ.z + matZ.w ) ) ),
( pos.x * matW.x + ( pos.y * matW.y + ( pos.z * matW.z + matW.w ) ) ) );
}
vec3 MatrixMul ( vec3 pos, vec4 matX, vec4 matY, vec4 matZ ) {
return vec3( ( pos.x * matX.x + ( pos.y * matX.y + ( pos.z * matX.z + matX.w ) ) ),
( pos.x * matY.x + ( pos.y * matY.y + ( pos.z * matY.z + matY.w ) ) ),
( pos.x * matZ.x + ( pos.y * matZ.y + ( pos.z * matZ.z + matZ.w ) ) ) );
}
vec4 MatrixMul ( mat4x4 m, vec4 v ) {
return m * v;
}
vec4 MatrixMul ( vec4 v, mat4x4 m ) {
return v * m;
}
vec3 MatrixMul ( mat3x3 m, vec3 v ) {
return m * v;
}
vec3 MatrixMul ( vec3 v, mat3x3 m ) {
return v * m;
}
vec2 MatrixMul ( mat2x2 m, vec2 v ) {
return m * v;
}
vec2 MatrixMul ( vec2 v, mat2x2 m ) {
return v * m;
}
float ApproxExp2 ( float f ) {
return asfloat( uint( ( f + 127 ) * ( 1 << 23 ) ) );
}
vec2 screenPosToTexcoord ( vec2 pos, vec4 bias_scale ) { return ( pos * bias_scale.zw + bias_scale.xy ); }
vec2 screenPosToTexcoord ( vec2 pos, vec4 bias_scale, vec4 resolution_scale ) { return ( ( pos * bias_scale.zw + bias_scale.xy ) * resolution_scale.xy ); }
vec4 tex2DFetch ( sampler2D image, ivec2 texcoord, int mip ) { return texelFetch( image, texcoord, mip ); }
vec3 environmentBRDF ( float NdotV, float smoothness, vec3 f0 ) {
const float t1 = 0.095 + smoothness * ( 0.6 + 4.19 * smoothness );
const float t2 = NdotV + 0.025;
const float t3 = 9.5 * smoothness * NdotV;
const float a0 = t1 * t2 * ApproxExp2( 1 - 14 * NdotV );
const float a1 = 0.4 + 0.6 * (1 - ApproxExp2( -t3 ) );
return mix( vec3( a0 ), vec3( a1 ), f0.xyz );
}
struct gbuffer_t {
vec3 world_pos;
vec3 view;
vec3 normal;
vec3 specular;
vec3 reflection;
float smoothness;
float depth;
vec3 phongEnvBRDF;
};
float GetLinearDepth ( float ndcZ, vec4 projectionMatrixZ, float rcpFarZ, bool bFirstPersonArmsRescale ) {
float linearZ = projectionMatrixZ.w / ( ndcZ + projectionMatrixZ.z );
if ( bFirstPersonArmsRescale ) {
linearZ *= linearZ < 1.0 ? 10.0 : 1.0;
}
return linearZ * rcpFarZ;
}
vec3 GetWindowPosZ ( vec3 viewPos, vec4 projection ) {
return vec3( vec2( 0.5 ) + projection.xy * ( viewPos.xy / viewPos.z ), ( projection.w / viewPos.z ) + projection.z ) ;
}
vec3 GetViewPos ( vec3 winPos, vec4 inverseProjection0, vec4 inverseProjection1 ) {
return vec3( inverseProjection0.xy * winPos.xy + inverseProjection0.zw, inverseProjection1.z ) / ( inverseProjection1.x * winPos.z + inverseProjection1.y );
}
vec2 OctWrap ( vec2 v ) {
return ( 1.0 - abs( v.yx ) ) * vec2( ( v.x >= 0.0 ? 1.0 : -1.0 ), ( v.y >= 0.0 ? 1.0 : -1.0 ) );
}
vec3 NormalOctDecode ( vec2 encN, bool expand_range ) {
if ( expand_range ) {
encN = encN * 2.0 - 1.0;
}
vec3 n;
n.z = 1.0 - abs( encN.x ) - abs( encN.y );
n.xy = n.z >= 0.0 ? encN.xy : OctWrap( encN.xy );
n = normalize( n );
return n;
}
vec2 SmoothnessDecode ( float s ) {
const float expanded_s = s * 2.0 - 1.0;
return vec2( sqr( expanded_s ), expanded_s > 0 ? 1.0 : 0.0 );
}
uniform vec4 _fa_freqHigh [8];
uniform vec4 _fa_freqLow [13];
uniform sampler2D samp_tex0;
uniform sampler2D samp_tex4;
uniform sampler2D samp_tex1;
uniform sampler2D samp_tex3;
uniform sampler2D samp_tex5;

in vec4 gl_FragCoord;

out vec4 out_FragColor0;
out vec4 out_FragColor1;

void main() {
{
{
vec2 tc = screenPosToTexcoord( gl_FragCoord.xy.xy, _fa_freqHigh[0 ] );
{
{
out_FragColor0 = vec4( 0 );
if(g_pixelEnabled < 1.0)
{out_FragColor0= vec4(0.0);}

out_FragColor1 = vec4( 0 );
gbuffer_t scene;
{
{
vec3 normal = NormalOctDecode( tex2DFetch( samp_tex0, ivec2( gl_FragCoord.xy.xy ).xy, 0 ).xy, false );
vec4 spec = tex2DFetch( samp_tex4, ivec2( gl_FragCoord.xy.xy ).xy, 0 );
scene.normal = normal.xyz;
mat3x3 invProjT = mat3x3( _fa_freqLow[0 ].xyz, _fa_freqLow[1 ].xyz, _fa_freqLow[2 ].xyz );
scene.normal = normalize( MatrixMul( invProjT, scene.normal.xyz ) );
vec2 decodedSmoothness = SmoothnessDecode( spec.w );
scene.smoothness = decodedSmoothness.y > 0? decodedSmoothness.x : 0;
scene.specular = sqr( tex2DFetch( samp_tex4, ivec2( gl_FragCoord.xy.xy ).xy, 0 ).xyz );
scene.depth = tex2DFetch( samp_tex1, ivec2( gl_FragCoord.xy.xy ).xy, 0 ).x;
float zLinear = GetLinearDepth( scene.depth, _fa_freqLow[3 ], 1.0, true );

scene.world_pos = GetViewPos( vec3( tc * _fa_freqLow[4 ].zw, scene.depth ), _fa_freqLow[5 ], _fa_freqLow[6 ] );

// Makes it 2d
scene.world_pos.x -= g_eye * g_eye_separation * (zLinear - g_convergence);

// THIS DOESN'T WORK. Because "invProjT" is only the Projection and doesn't contain the View :(
vec3 adjustment_clip_space = vec3(g_eye * g_eye_separation * (zLinear - g_convergence), 0, 0);
vec3 adjustment_world_space = adjustment_clip_space * invProjT;
scene.world_pos.xyz -= adjustment_world_space.xyz;


scene.view = normalize( - scene.world_pos.xyz );
scene.reflection = ( reflect( -scene.view.xyz, scene.normal.xyz ) );
float NdotV = saturate( dot( scene.view.xyz, scene.normal.xyz ) );
scene.phongEnvBRDF = environmentBRDF( NdotV, scene.smoothness, ( scene.specular ) );
}
}
float smoothness_threshold = _fa_freqHigh[1 ].x;
float VdovR_threshold = 0.0;
if ( scene.smoothness > smoothness_threshold && dot( scene.view.xyz, scene.reflection.xyz ) <= VdovR_threshold && scene.depth < 1.0 ) {
uvec2 idxPix = uvec2( ivec2( gl_FragCoord.xy.xy ).xy) & 3;
float jitter = float( ( ( ( 2068378560 * ( 1 - ( idxPix.x >> 1 ) ) + 1500172770 * ( idxPix.x >> 1 ) ) >> ( ( idxPix.y + ( ( idxPix.x & 1 ) << 2 ) ) << 2 ) ) + int( _fa_freqHigh[2 ].w ) ) & 0xF ) / 15.0;
jitter -= 1.0;
int max_steps = 16;
float stride = 1;
vec3 ray_start_vs = scene.world_pos;
vec3 ray_end_vs = ray_start_vs.xyz + scene.reflection * GetLinearDepth( scene.depth, _fa_freqLow[3 ], 1, true );
vec4 hit_color;
{
{
hit_color = vec4( 0 );
vec3 ray_start = GetWindowPosZ( ray_start_vs, _fa_freqLow[7 ] ) * vec3( _fa_freqLow[4 ].xy, 1 );
vec3 ray_end = GetWindowPosZ( ray_end_vs, _fa_freqLow[7 ] ) * vec3( _fa_freqLow[4 ].xy, 1 );
vec3 ray_step = ( stride / float( max_steps ) ) * ( ray_end.xyz - ray_start.xyz ) / length( ray_end.xy - ray_start.xy );
vec3 ray_pos = ray_start.xyz + ray_step.xyz * jitter;
float z_thickness = abs( ray_step.z );
int hit = 0;
vec3 best_hit = ray_pos;
float prev_scene_z = ray_start.z;
for ( int curr_step = 0; curr_step < max_steps ; curr_step += 4 ) {
vec4 scene_z4 = 1 - vec4( tex2Dlod( samp_tex3, vec4( ray_pos.xy + ray_step.xy * float( curr_step + 1 ), 0, 0 ) ).x,
tex2Dlod( samp_tex3, vec4( ray_pos.xy + ray_step.xy * float( curr_step + 2 ), 0, 0 ) ).x,
tex2Dlod( samp_tex3, vec4( ray_pos.xy + ray_step.xy * float( curr_step + 3 ), 0, 0 ) ).x,
tex2Dlod( samp_tex3, vec4( ray_pos.xy + ray_step.xy * float( curr_step + 4 ), 0, 0 ) ).x );
vec4 curr_step4 = vec4( 1, 2, 3, 4 ) + vec4( curr_step );
uvec4 z_test = uvec4( lessThan( abs( ( ray_pos.zzzz + ray_step.zzzz * curr_step4 ) - scene_z4 - z_thickness.xxxx ), z_thickness.xxxx ) );
uint z_mask = ( z_test[ 0 ] << 0 ) | ( z_test[ 1 ] << 1 ) | ( z_test[ 2 ] << 2 ) | ( z_test[ 3 ] << 3 );
if ( z_mask > 0 ) {
int first_hit = firstBitLow( z_mask );
prev_scene_z = first_hit > 0 ? scene_z4[ max( 0, first_hit - 1 ) ] : prev_scene_z;
best_hit = ray_pos + ray_step * float( curr_step + first_hit + 1 );
float z_after = scene_z4[ first_hit ] - best_hit.z;
float z_before = prev_scene_z - best_hit.z + ray_step.z;
float w = saturate( z_after / ( z_after - z_before ) );
vec3 prev_ray_pos = best_hit.xyz - ray_step.xyz;
best_hit = prev_ray_pos * w + best_hit * ( 1.0 - w );
hit = curr_step + int( first_hit );
break;
}
prev_scene_z = scene_z4.w;
}
if ( hit > 0 ) {
vec4 worldPosM = MatrixMul( vec3( best_hit.xy * _fa_freqLow[4 ].zw, best_hit.z ), _fa_freqLow[8 ], _fa_freqLow[9 ], _fa_freqLow[10 ], _fa_freqLow[11 ] );
worldPosM /= worldPosM.w;
vec4 tc_reproj = MatrixMul( worldPosM.xyz, _fa_freqHigh[3 ], _fa_freqHigh[4 ], _fa_freqHigh[5 ], _fa_freqHigh[6 ] );
tc_reproj.xy /= tc_reproj.w;
tc_reproj.xy = tc_reproj.xy * 0.5 + 0.5;
tc_reproj.xy *= _fa_freqLow[12 ].xy;
vec4 finalRefl = vec4( tex2Dlod( samp_tex5, vec4( tc_reproj.xy, 0, 0 ) ).xyz, 1 );
vec2 dist = ( tc_reproj.xy * _fa_freqLow[12 ].zw ) * 2 - 1;
float edge_atten = ( smoothstep( 0.0, 0.5, saturate( 1 - dot( dist.xy, dist.xy ) ) ) );
edge_atten *= ( saturate( ( 1 - ( float( hit ) / 16.0 ) ) * 4.0 ) );
hit_color = vec4( finalRefl.xyz, edge_atten );
}
}
};
float smoothness_mask = saturate( saturate( 1- ( ( 1.0 - scene.smoothness ) / ( 1.0 - smoothness_threshold ) ) ) );
hit_color.w = saturate( hit_color.w * smoothness_mask * _fa_freqHigh[7 ].x );
out_FragColor0.rgb = max( vec3( 0 ), hit_color.w * hit_color.xyz * scene.phongEnvBRDF );
if(g_pixelEnabled < 1.0)
{out_FragColor0= vec4(0.0);}

out_FragColor1.r = saturate( hit_color.w );
}
}
};
};
}
}

At line 212, I managed to make it 2D again. I am not sure if it is the right way, but the result is definitely a 2D projection now.
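
To see what the offset on line 212 does at different depths, here is a small Python sketch. It assumes the usual stereo formula, where clip-space x is shifted by eye * separation * (w - convergence), and it assumes `zLinear` stands in for clip-space w. The separation and convergence numbers are made up for illustration.

```python
def stereo_offset(eye, separation, convergence, depth):
    """Clip-space x offset before the perspective divide.
    eye is -1 for the left eye and +1 for the right eye."""
    return eye * separation * (depth - convergence)

def screen_parallax(eye, separation, convergence, depth):
    """Offset after the perspective divide, assuming w equals depth."""
    return stereo_offset(eye, separation, convergence, depth) / depth

SEPARATION = 0.06   # illustrative value only
CONVERGENCE = 2.0   # illustrative value only

for depth in (0.5, 2.0, 10.0, 1000.0):
    print(depth, screen_parallax(+1, SEPARATION, CONVERGENCE, depth))
```

The offset is zero at the convergence depth, flips sign in front of it, and the on-screen parallax tends towards the separation value as depth grows. Subtracting the same expression undoes the shift for that eye, which is consistent with the shader collapsing back to 2D.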

BIG BIG BIG THANK YOU AGAIN!!!

Posted 05/15/2016 04:10 PM   
helifax said: OMG! AWESOME!!! It was literally Plug and PLAY. Now reading through it I kinda understand what is going on there;)
Awesome =D

helifax said: Now, I also have another shader that is a bit more complicated than the above.
It is responsible for environment reflections and is doing some reprojections inside. I managed to make it render from broken 3D in 2D. But now.... I am completely at a loss. I tried the above methods and none worked. The shader is different though... So, I am unsure what I need to do to it...
Haha:
scene.world_pos = GetViewPos( vec3( tc * _fa_freqLow[4 ].zw, scene.depth ), _fa_freqLow[5 ], _fa_freqLow[6 ] );

They called the variable world_pos, but I'm pretty sure that is a view-space coordinate, which would be why your attempt was able to make it 2D :)
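
One way to convince yourself that this is a view-space position is to port `GetViewPos` and `GetWindowPosZ` to Python and check that they invert each other with the camera at the origin. The sketch below does that. The packing of the two inverse-projection vectors in `inverse_packing` is my own assumption, derived so that the round trip works; the real contents of `_fa_freqLow[5]`, `[6]` and `[7]` are not known from the thread. It also ignores the resolution scaling by `_fa_freqLow[4]`.

```python
def get_window_pos_z(view_pos, projection):
    """Python port of GetWindowPosZ from the shader."""
    px, py, pz, pw = projection
    x, y, z = view_pos
    return (0.5 + px * (x / z), 0.5 + py * (y / z), pw / z + pz)

def get_view_pos(win_pos, inv0, inv1):
    """Python port of GetViewPos from the shader."""
    wx, wy, wz = win_pos
    denom = inv1[0] * wz + inv1[1]
    return ((inv0[0] * wx + inv0[2]) / denom,
            (inv0[1] * wy + inv0[3]) / denom,
            inv1[2] / denom)

def inverse_packing(projection):
    """Hypothetical packing that makes get_view_pos undo get_window_pos_z."""
    px, py, pz, pw = projection
    inv0 = (pw / px, pw / py, -0.5 * pw / px, -0.5 * pw / py)
    inv1 = (1.0, -pz, pw, 0.0)
    return inv0, inv1

PROJECTION = (0.8, 1.2, 0.1, -0.2)   # made-up values
INV0, INV1 = inverse_packing(PROJECTION)

point = (0.3, -0.2, 4.0)
window = get_window_pos_z(point, PROJECTION)
print(window, get_view_pos(window, INV0, INV1))

# A point straight ahead of the camera lands in the middle of the window.
print(get_window_pos_z((0.0, 0.0, 7.0), PROJECTION))
```

Any point on the z axis maps to the window centre (0.5, 0.5), so the origin of this space is the camera itself. That matches the shader computing `scene.view` as `normalize( - scene.world_pos.xyz )` without subtracting a camera position.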

If this is an environment reflection you might just try moving it to infinity by subtracting or adding separation, or separation*convergence. You may need to use invProjT[0].x since it's a view-space coordinate, though I'm a little confused as to why that is defined as a 3x3 matrix and not a 4x4 matrix. Even simple reflections can vary a lot from game to game, so you might need to experiment a little to see what works best.
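
The reason for scaling by the inverse projection term can be checked with a few lines of Python. This sketch assumes a symmetric perspective projection where clip x is `P00 * view_x` and clip w is the positive view depth. The FOV, separation and convergence values are illustrative, and whether `invProjT[0].x` really holds `1 / P00` in this shader is not established here.

```python
import math

def project(view_x, view_z, p00):
    """Return (clip_x, clip_w) for a symmetric perspective projection.
    Assumes clip_w equals the positive view-space depth."""
    return p00 * view_x, view_z

def clip_space_correction(clip_x, clip_w, eye, separation, convergence):
    """Standard stereo shift applied in clip space."""
    return clip_x + eye * separation * (clip_w - convergence)

def view_space_correction(view_x, view_z, eye, separation, convergence, inv_p00):
    """The same shift applied in view space, scaled by the inverse projection term."""
    return view_x + eye * separation * (view_z - convergence) * inv_p00

P00 = 1.0 / math.tan(math.radians(60.0) / 2.0)   # assumed 60 degree horizontal FOV
INV_P00 = 1.0 / P00

view_x, view_z = 0.75, 6.0
clip_x, clip_w = project(view_x, view_z, P00)
expected = clip_space_correction(clip_x, clip_w, 1.0, 0.05, 1.5)

shifted_x = view_space_correction(view_x, view_z, 1.0, 0.05, 1.5, INV_P00)
actual, _ = project(shifted_x, view_z, P00)
print(expected, actual)
```

Both paths give the same clip-space x, so a view-space shift without the `1 / P00` factor would be off by the projection scale. Flip the sign of the shift to remove a correction instead of applying one.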

I am a little concerned though - I haven't studied the shader in detail yet, but I notice it appears to be casting a ray, which looks like it might actually be a screen-space reflection. If so, these have been a real PITA in other games and I've never been happy with any of my "fixes" for these (though in some games like MGSV and Trine 3 they are actually decent out of the box). You might have a better chance though since the GLSL gives you a much clearer view of the ray casting loop than any of the disassembled shaders I've looked at.
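
For reference, the core of the ray casting loop can be boiled down to a one-dimensional Python sketch. This is a simplification under my own assumptions, not a port: it walks one sample per step instead of four at a time, skips the jitter and the hit refinement, and returns a 1-based step index. The hit test is the one from the shader, `abs( ray_z - scene_z - thickness ) < thickness`, which accepts a ray depth between `scene_z` and `scene_z + 2 * thickness`.

```python
def march(scene_depth, ray_start_z, ray_step_z, max_steps=16):
    """Walk along a row of depth samples and return the first step that hits.
    Returns 0 when nothing is hit within max_steps."""
    thickness = abs(ray_step_z)
    for step in range(1, max_steps + 1):
        if step >= len(scene_depth):
            break
        ray_z = ray_start_z + ray_step_z * step
        # Same comparison as the shader's z_test.
        if abs(ray_z - scene_depth[step] - thickness) < thickness:
            return step
    return 0

# Made-up depth row: far background, then a nearer wall from sample 6 onwards.
depth_row = [0.9] * 6 + [0.52] * 11

print(march(depth_row, ray_start_z=0.1, ray_step_z=0.05))
print(march([2.0] * 17, ray_start_z=0.1, ray_step_z=0.05))
```

The ray only registers a hit once it has passed just behind the wall's depth. Since the loop samples whatever depth buffer the current eye rendered, the ray start and direction have to be in agreement with that eye's view, which is part of what makes these reflections awkward to fix.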

Which game is this BTW?

2x Geforce GTX 980 in SLI provided by NVIDIA, i7 6700K 4GHz CPU, Asus 27" VG278HE 144Hz 3D Monitor, BenQ W1070 3D Projector, 120" Elite Screens YardMaster 2, 32GB Corsair DDR4 3200MHz RAM, Samsung 850 EVO 500G SSD, 4x750GB HDD in RAID5, Gigabyte Z170X-Gaming 7 Motherboard, Corsair Obsidian 750D Airflow Edition Case, Corsair RM850i PSU, HTC Vive, Win 10 64bit

Alienware M17x R4 w/ built in 3D, Intel i7 3740QM, GTX 680m 2GB, 16GB DDR3 1600MHz RAM, Win7 64bit, 1TB SSD, 1TB HDD, 750GB HDD

Pre-release 3D fixes, shadertool.py and other goodies: http://github.com/DarkStarSword/3d-fixes
Support me on Patreon: https://www.patreon.com/DarkStarSword or PayPal: https://www.paypal.me/DarkStarSword

Posted 05/15/2016 05:24 PM   
Yeah I was a little confused myself why they would call WORLD POS and then get VIEW POS...
This is from DOOM (2016) and this shader controls the ambient reflections. It definitely LOOKS like Screen Space Reflections though. The behaviour kinda looks like the one in Rise of the Tomb Raider (only here it is always broken).

So now I am wondering, should I make it 2D first and then try to make it 3D from there? I think this is the way, otherwise it will always be off. Going to try to experiment a bit;)
Big thank you again!

I hope I can fix this one as well;) The previous shader was for normal reflections but if this one is broken I have to disable both:(( Which is a pity:(

Posted 05/15/2016 05:33 PM   
This is possibly old news to people, but figured it wouldn't hurt to point this out.

CryEngine has been open-sourced. It's on GitHub.


https://github.com/CRYTEK-CRYENGINE/CRYENGINE



The particularly interesting part is the inclusion of all of their cfx files, so we can see exactly what they are doing in the shaders, including variables.


https://github.com/CRYTEK-CRYENGINE/CRYENGINE/tree/release/Engine/Shaders/HWScripts/CryFX

Acer H5360 (1280x720@120Hz) - ASUS VG248QE with GSync mod - 3D Vision 1&2 - Driver 372.54
GTX 970 - i5-4670K@4.2GHz - 12GB RAM - Win7x64+evilKB2670838 - 4 Disk X25 RAID
SAGER NP9870-S - GTX 980 - i7-6700K - Win10 Pro 1607
Latest 3Dmigoto Release
Bo3b's School for ShaderHackers

Posted 05/25/2016 11:47 AM   
Hi guys, So, I am again attempting to solve the mystery of the compute shaders in Frostbyte 3 engine. I have this compute shader that controls some lights (No other VS/PS controls this as I looked through all shaders). [code] // // Generated by Microsoft (R) HLSL Shader Compiler 6.3.9600.16384 // // using 3Dmigoto v1.2.39 on Mon Jun 13 19:31:45 2016 // // // Buffer Definitions: // // cbuffer cbPunctualShadowLightInfo // { // // struct PunctualShadowLightInfo // { // // struct BaseLightInfo // { // // float3 pos; // Offset: 0 // float invSqrAttenuationRadius;// Offset: 12 // float3 color; // Offset: 16 // float attenuationOffset; // Offset: 28 // float3 matrixForward; // Offset: 32 // float diffuseScale; // Offset: 44 // float3 matrixUp; // Offset: 48 // float specularScale; // Offset: 60 // float3 matrixLeft; // Offset: 64 // float shadowDimmer; // Offset: 76 // float angleScale; // Offset: 80 // float angleOffset; // Offset: 84 // float2 unused; // Offset: 88 // // } baseLight; // Offset: 0 // // struct IESShadowInfo // { // // float enable; // Offset: 96 // float textureIndex; // Offset: 100 // float2 unused; // Offset: 104 // // } iesShadow; // Offset: 96 // // struct ShadowLightInfo // { // // float4 shadowMatrix1; // Offset: 112 // float4 shadowMatrix2; // Offset: 128 // float4 shadowMatrix3; // Offset: 144 // float4 shadowMatrix4; // Offset: 160 // float4 shadowMapAtlasParam[6];// Offset: 176 // float4 shadowMapIndex[2]; // Offset: 272 // float shadowType; // Offset: 304 // float quality; // Offset: 308 // float shadowAngleScale; // Offset: 312 // float shadowAngleOffset; // Offset: 316 // // } shadow; // Offset: 112 // // struct VolumetricShadowInfo // { // // float enable; // Offset: 320 // float volumeShadowMapIndex;// Offset: 324 // float invAttenuationRadius;// Offset: 328 // float tanAngle; // Offset: 332 // // } vShadow; // Offset: 320 // // } g_lightInfoPunctualShadow[128]; // Offset: 0 Size: 43008 // // } // // cbuffer cb0 // { // // float4x4 
Hi guys,

So, I am again attempting to solve the mystery of the compute shaders in the Frostbite 3 engine.

I have this compute shader that controls some lights (no other VS/PS controls them; I looked through all the shaders).
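A note on reading the buffer definitions in the listing that follows: each constant-buffer register holds one float4 (16 bytes), so a byte offset from the comments maps to a register index plus an x/y/z/w component. The Python sketch below does that arithmetic for a few offsets taken from the listing. The helper names (`register_of`, `light_field`) are made up for illustration. It assumes cbPunctualShadowLightInfo is bound to slot 2 (cb2), as the Resource Bindings table of the dump indicates, and that every field offset is 4-byte aligned.

```python
# Sketch only: maps byte offsets from the "Buffer Definitions" comments
# to constant-buffer registers. Helper names are hypothetical.

REGISTER_BYTES = 16      # one constant-buffer register is a float4
COMPONENTS = "xyzw"      # 4 bytes per component


def register_of(byte_offset):
    """Return (register index, component letter) for a 4-byte aligned offset."""
    reg, rem = divmod(byte_offset, REGISTER_BYTES)
    return reg, COMPONENTS[rem // 4]


# g_lightInfoPunctualShadow[128] is listed with "Size: 43008".
ARRAY_BYTES = 43008
LIGHT_COUNT = 128
stride_bytes = ARRAY_BYTES // LIGHT_COUNT        # bytes per light entry
stride_regs = stride_bytes // REGISTER_BYTES     # registers per light entry


def light_field(light_index, byte_offset, buffer_name="cb2"):
    """Register reference for a field of one light (cb2 is an assumption)."""
    reg, comp = register_of(byte_offset)
    return f"{buffer_name}[{light_index * stride_regs + reg}].{comp}"


if __name__ == "__main__":
    print(stride_bytes, stride_regs)   # 336 21
    print(light_field(0, 76))          # shadowDimmer of light 0 -> cb2[4].w
    print(light_field(0, 304))         # shadowType of light 0   -> cb2[19].x
    print(light_field(1, 0))           # pos of light 1          -> cb2[21].x
    print(register_of(136))            # tileCountX in cb0       -> (8, 'z')
```

If the arithmetic holds, one light entry spans 21 registers, so a field at byte offset N of light i should appear in the assembly as cb2[i*21 + N/16]. That can help when matching assembly operands to the named fields.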

//
// Generated by Microsoft (R) HLSL Shader Compiler 6.3.9600.16384
//
// using 3Dmigoto v1.2.39 on Mon Jun 13 19:31:45 2016
//
//
// Buffer Definitions:
//
// cbuffer cbPunctualShadowLightInfo
// {
//
//   struct PunctualShadowLightInfo
//   {
//
//       struct BaseLightInfo
//       {
//
//           float3 pos;                    // Offset:    0
//           float invSqrAttenuationRadius; // Offset:   12
//           float3 color;                  // Offset:   16
//           float attenuationOffset;       // Offset:   28
//           float3 matrixForward;          // Offset:   32
//           float diffuseScale;            // Offset:   44
//           float3 matrixUp;               // Offset:   48
//           float specularScale;           // Offset:   60
//           float3 matrixLeft;             // Offset:   64
//           float shadowDimmer;            // Offset:   76
//           float angleScale;              // Offset:   80
//           float angleOffset;             // Offset:   84
//           float2 unused;                 // Offset:   88
//
//       } baseLight;                       // Offset:    0
//
//       struct IESShadowInfo
//       {
//
//           float enable;                  // Offset:   96
//           float textureIndex;            // Offset:  100
//           float2 unused;                 // Offset:  104
//
//       } iesShadow;                       // Offset:   96
//
//       struct ShadowLightInfo
//       {
//
//           float4 shadowMatrix1;          // Offset:  112
//           float4 shadowMatrix2;          // Offset:  128
//           float4 shadowMatrix3;          // Offset:  144
//           float4 shadowMatrix4;          // Offset:  160
//           float4 shadowMapAtlasParam[6]; // Offset:  176
//           float4 shadowMapIndex[2];      // Offset:  272
//           float shadowType;              // Offset:  304
//           float quality;                 // Offset:  308
//           float shadowAngleScale;        // Offset:  312
//           float shadowAngleOffset;       // Offset:  316
//
//       } shadow;                          // Offset:  112
//
//       struct VolumetricShadowInfo
//       {
//
//           float enable;                  // Offset:  320
//           float volumeShadowMapIndex;    // Offset:  324
//           float invAttenuationRadius;    // Offset:  328
//           float tanAngle;                // Offset:  332
//
//       } vShadow;                         // Offset:  320
//
//   } g_lightInfoPunctualShadow[128];      // Offset:    0 Size: 43008
//
// }
//
// cbuffer cb0
// {
//
// float4x4 invViewProjectionMatrix; // Offset: 0 Size: 64
// float4 g_exposureMultipliers; // Offset: 64 Size: 16
// float localIblMipmapBias; // Offset: 80 Size: 4 [unused]
// float screenAspectRatio; // Offset: 84 Size: 4 [unused]
// float2 invResolution; // Offset: 88 Size: 8
// float4 shadowMapSizeAndInvSize; // Offset: 96 Size: 16
// uint forceSplitLighting; // Offset: 112 Size: 4 [unused]
// uint sssScatteringEnables; // Offset: 116 Size: 4 [unused]
// float volumetricShadowmapHalfTexelOffset;// Offset: 120 Size: 4 [unused]
// float volumetricShadowmapOneMinusHalfTexelOffset;// Offset: 124 Size: 4 [unused]
// float volumetricShadowmapInvMaxCount;// Offset: 128 Size: 4 [unused]
// float dynamicAOFactor; // Offset: 132 Size: 4
// uint tileCountX; // Offset: 136 Size: 4
// uint pad1; // Offset: 140 Size: 4 [unused]
// float4x3 g_normalBasisTransforms[6];// Offset: 144 Size: 288
//
// }
//
// Resource bind info for g_lightCullInput
// {
//
// uint4 $Element; // Offset: 0 Size: 16
//
// }
//
// Resource bind info for g_lightIndexInput
// {
//
// uint $Element; // Offset: 0 Size: 4
//
// }
//
// Resource bind info for g_compactTileGridBuffer
// {
//
// uint $Element; // Offset: 0 Size: 4
//
// }
//
//
// Resource Bindings:
//
// Name Type Format Dim Slot Elements
// ------------------------------ ---------- ------- ----------- ---- --------
// g_linearSampler sampler NA NA 0 1
// g_linearLongitudeWrapSampler sampler NA NA 2 1
// g_shadowmapSampler sampler_c NA NA 3 1
// g_gbufferTexture0 texture float4 2d 0 1
// g_gbufferTexture1 texture float4 2d 1 1
// g_gbufferTexture2 texture float4 2d 2 1
// g_depthTexture texture float 2d 6 1
// g_iesTextureArray texture float 2darray 9 1
// g_diffuseOcclusionTexture texture float 2d 10 1
// g_lightCullInput texture struct r/o 19 1
// g_lightIndexInput texture struct r/o 20 1
// g_shadowmapTexture texture float4 2darray 21 1
// g_compactTileGridBuffer texture struct r/o 24 1
// g_outputTexture0 UAV float4 2d 0 1
// cb0 cbuffer NA NA 0 1
// cbPunctualShadowLightInfo cbuffer NA NA 2 1
//
//
//
// Input signature:
//
// Name Index Mask Register SysValue Format Used
// -------------------- ----- ------ -------- -------- ------- ------
// no Input
//
// Output signature:
//
// Name Index Mask Register SysValue Format Used
// -------------------- ----- ------ -------- -------- ------- ------
// no Output
cs_5_0
dcl_globalFlags refactoringAllowed
dcl_immediateConstantBuffer { { 1.000000, 0, 0, 0},
{ 0, 1.000000, 0, 0},
{ 0, 0, 1.000000, 0},
{ 0, 0, 0, 1.000000} }
dcl_constantbuffer cb2[2688], dynamicIndexed
dcl_constantbuffer cb0[27], dynamicIndexed
dcl_sampler s0, mode_default
dcl_sampler s2, mode_default
dcl_sampler s3, mode_comparison
dcl_resource_texture2d (float,float,float,float) t0
dcl_resource_texture2d (float,float,float,float) t1
dcl_resource_texture2d (float,float,float,float) t2
dcl_resource_texture2d (float,float,float,float) t6
dcl_resource_texture2darray (float,float,float,float) t9
dcl_resource_texture2d (float,float,float,float) t10
dcl_resource_structured t19, 16
dcl_resource_structured t20, 4
dcl_resource_texture2darray (float,float,float,float) t21
dcl_resource_structured t24, 4
dcl_uav_typed_texture2d (float,float,float,float) u0
dcl_input vThreadIDInGroupFlattened
dcl_input vThreadGroupID.x
dcl_input vThreadIDInGroup.xy
//dcl_temps 25
dcl_temps 27
dcl_resource_texture2d (float,float,float,float) t125

dcl_tgsm_raw g0, 4
dcl_tgsm_raw g1, 4
dcl_thread_group 16, 16, 1
ld_structured_indexable(structured_buffer, stride=4)(mixed,mixed,mixed,mixed) r0.x, vThreadGroupID.x, l(0), t24.xxxx
ushr r1.x, r0.x, l(16)
and r1.yzw, r0.xxxx, l(0, 0x0000ffff, 0x0000ffff, 0x0000ffff)
imad r0.xyzw, r1.xyzw, l(16, 16, 16, 16), vThreadIDInGroup.xyyy
if_z vThreadIDInGroupFlattened.x
imad r1.x, r1.w, cb0[8].z, r1.x
ld_structured_indexable(structured_buffer, stride=16)(mixed,mixed,mixed,mixed) r1.xy, r1.x, l(0), t19.xyxx
ushr r1.y, r1.y, l(16)
store_raw g0.x, l(0), r1.x
store_raw g1.x, l(0), r1.y
endif
sync_g_t
utof r1.xy, r0.xwxx
add r1.zw, r1.xxxy, l(0.000000, 0.000000, 0.500000, 0.500000)
mul r1.zw, r1.zzzw, cb0[5].zzzw
ftoi r2.xy, r1.xyxx
mov r2.zw, l(0,0,0,0)
ld_indexable(texture2d)(float,float,float,float) r3.xyzw, r2.xyww, t0.xyzw
ld_indexable(texture2d)(float,float,float,float) r4.xyzw, r2.xyww, t1.xyzw
ld_indexable(texture2d)(float,float,float,float) r1.xy, r2.xyww, t2.yzxw
ld_indexable(texture2d)(float,float,float,float) r2.z, r2.xyzw, t6.yzxw
mul r4.w, r4.w, l(6.000000)
round_ne r4.w, r4.w
ftou r4.w, r4.w
mad r5.xy, r3.xyxx, l(2.000000, 2.000000, 0.000000, 0.000000), l(-1.000000, -1.000000, 0.000000, 0.000000)
dp2 r3.x, r5.xyxx, r5.xyxx
min r3.x, r3.x, l(1.000000)
add r3.x, -r3.x, l(1.000000)
sqrt r5.z, r3.x
imul null, r3.x, r4.w, l(3)
dp3 r6.x, r5.xyzx, cb0[r3.x + 9].xyzx
dp3 r6.y, r5.xyzx, cb0[r3.x + 10].xyzx
dp3 r6.z, r5.xyzx, cb0[r3.x + 11].xyzx
add r3.x, -r3.z, l(1.000000)
mul r3.y, r3.w, l(3.000000)
round_ne r3.y, r3.y
ftoi r3.y, r3.y
ieq r3.y, r3.y, l(1)
movc r1.x, r3.y, l(0), r1.x
add r3.y, -r1.x, l(1.000000)
mul r3.yzw, r3.yyyy, r4.xxyz
mul r1.y, r1.y, r1.y
mul r4.w, r1.y, l(0.160000)
mad r4.xyz, -r1.yyyy, l(0.160000, 0.160000, 0.160000, 0.000000), r4.xyzx
mad r4.xyz, r1.xxxx, r4.xyzx, r4.wwww
dp3 r1.x, r4.xyzx, l(0.330000, 0.330000, 0.330000, 0.000000)
mul_sat r1.x, r1.x, l(50.000000)
mul r1.y, r3.x, r3.x
mad r5.xy, r1.zwzz, l(2.000000, 2.000000, 0.000000, 0.000000), l(-1.000000, -1.000000, 0.000000, 0.000000)
mul r2.xy, r5.xyxx, l(1.000000, -1.000000, 0.000000, 0.000000)
mov r2.w, l(1.000000)

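// This is where the fix is needed: the position is rebuilt from screen coordinates with invViewProjectionMatrix (cb0[0..3])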
dp4 r5.x, r2.xyzw, cb0[0].xyzw
dp4 r5.y, r2.xyzw, cb0[1].xyzw
dp4 r5.z, r2.xyzw, cb0[2].xyzw
dp4 r2.x, r2.xyzw, cb0[3].xyzw

// Attempted stereo correction using StereoParams (t125: x = separation, y = convergence)
ld_indexable(texture2d)(float,float,float,float) r26.xyzw, l(0, 0, 0, 0), t125.xyzw
add r26.w, r2.x, -r26.y
mul r26.w, r26.w, r26.x
//mul r26.w, r26.w, cb2[81].x
mul r26.w, r26.w, l(5.5)
//add r5.x, r5.x, -r26.w

div r2.x, l(1.000000, 1.000000, 1.000000, 1.000000), r2.x
mul r7.xyz, r2.xxxx, r5.xyzx
dp3 r2.y, -r7.xyzx, -r7.xyzx
rsq r2.y, r2.y
mul r8.xyz, r2.yyyy, -r7.xyzx
dp3 r2.z, r6.xyzx, r8.xyzx
mov_sat r2.w, r2.z
sample_l_indexable(texture2d)(float,float,float,float) r1.z, r1.zwzz, t10.yzxw, s0, l(0.000000)
ld_raw r1.w, l(0), g0.xxxx
ld_raw r4.w, l(0), g1.xxxx
iadd r4.w, r1.w, r4.w
add r2.z, |r2.z|, l(0.000010)
add r8.xyz, -r4.xyzx, r1.xxxx
max r1.x, r1.y, l(0.002000)
mul r1.x, r1.x, r1.x
mad r5.w, -r2.z, r1.x, r2.z
mad r5.w, r5.w, r2.z, r1.x
sqrt r5.w, r5.w
mad r6.w, r3.x, l(-0.337748349), l(1.000000)
add r8.w, -r2.z, l(1.000000)
mul r9.x, r8.w, r8.w
mul r9.x, r9.x, r9.x
mul r8.w, r8.w, r9.x
mov r7.w, l(1.000000)
mov r9.xyz, l(0,0,0,0)
mov r10.xyz, l(0,0,0,0)
mov r9.w, r1.w
loop
uge r10.w, r9.w, r4.w
breakc_nz r10.w
ld_structured_indexable(structured_buffer, stride=4)(mixed,mixed,mixed,mixed) r10.w, r9.w, l(0), t20.xxxx
imul null, r11.x, r10.w, l(21)
mad r11.yzw, -r5.xxyz, r2.xxxx, cb2[r11.x + 0].xxyz
dp3 r12.x, r11.yzwy, r11.yzwy
rsq r12.y, r12.x
mul r12.yzw, r11.yyzw, r12.yyyy
add r13.x, r12.x, cb2[r11.x + 1].w
max r13.x, r13.x, l(0.000100)
div r13.x, l(1.000000, 1.000000, 1.000000, 1.000000), r13.x
mul r12.x, r12.x, cb2[r11.x + 0].w
mad r12.x, -r12.x, r12.x, l(1.000000)
max r12.x, r12.x, l(0.000000)
mul r12.x, r12.x, r12.x
mul r12.x, r12.x, r13.x
dp3 r13.x, cb2[r11.x + 2].xyzx, r12.yzwy
mad_sat r13.y, r13.x, cb2[r11.x + 5].x, cb2[r11.x + 5].y
mul r13.y, r13.y, r13.y
mul r12.x, r12.x, r13.y
dp3_sat r13.y, r6.xyzx, r12.yzwy
mul r12.x, r12.x, r13.y
lt r13.z, l(0.000000), r12.x
if_nz r13.z
mad r14.xyz, -r7.xyzx, r2.yyyy, r12.yzwy
dp3 r13.w, r14.xyzx, r14.xyzx
rsq r13.w, r13.w
mul r14.xyz, r13.wwww, r14.xyzx
dp3_sat r13.w, r12.yzwy, r14.xyzx
dp3_sat r14.x, r6.xyzx, r14.xyzx
add r14.y, -r13.w, l(1.000000)
mul r14.z, r14.y, r14.y
mul r14.z, r14.z, r14.z
mul r14.y, r14.y, r14.z
mad r14.yzw, r8.xxyz, r14.yyyy, r4.xxyz
mad r15.x, -r13.y, r1.x, r13.y
mad r15.x, r15.x, r13.y, r1.x
sqrt r15.x, r15.x
mul r15.x, r2.z, r15.x
mad r15.x, r13.y, r5.w, r15.x
div r15.x, l(0.500000), r15.x
mad r15.y, r14.x, r1.x, -r14.x
mad r14.x, r15.y, r14.x, l(1.000000)
mul r14.x, r14.x, r14.x
div r14.x, r1.x, r14.x
mul r14.x, r14.x, r15.x
mul r14.xyz, r14.xxxx, r14.yzwy
mul r13.w, r13.w, r13.w
dp2 r13.w, r13.wwww, r3.xxxx
mad r13.w, r3.x, l(0.500000), r13.w
add r13.y, -r13.y, l(1.000000)
mul r14.w, r13.y, r13.y
mul r14.w, r14.w, r14.w
mul r13.y, r13.y, r14.w
add r13.w, r13.w, l(-1.000000)
mad r13.y, r13.w, r13.y, l(1.000000)
mad r13.w, r13.w, r8.w, l(1.000000)
mul r13.y, r13.w, r13.y
mul r13.y, r6.w, r13.y
mul r13.w, r12.x, cb2[r11.x + 2].w
mul r15.xyz, r13.wwww, cb2[r11.x + 1].xyzx
mul r15.xyz, r13.yyyy, r15.xyzx
mul r12.x, r12.x, cb2[r11.x + 3].w
mul r16.xyz, r12.xxxx, cb2[r11.x + 1].xyzx
mul r14.xyz, r14.xyzx, r16.xyzx
else
mov r15.xyz, l(0,0,0,0)
mov r14.xyz, l(0,0,0,0)
endif
lt r12.x, l(0.000000), cb2[r11.x + 6].x
and r12.x, r12.x, r13.z
if_nz r12.x
dp3 r12.x, cb2[r11.x + 4].xyzx, -r12.yzwy
dp3 r13.y, cb2[r11.x + 3].xyzx, -r12.yzwy
dp3 r12.y, cb2[r11.x + 2].xyzx, -r12.yzwy
mad r16.y, r12.y, l(0.500000), l(0.500000)
min r12.y, |r12.x|, |r13.y|
max r12.z, |r12.x|, |r13.y|
div r12.z, l(1.000000, 1.000000, 1.000000, 1.000000), r12.z
mul r12.y, r12.z, r12.y
mul r12.z, r12.y, r12.y
mad r12.w, r12.z, l(0.0208350997), l(-0.085133)
mad r12.w, r12.z, r12.w, l(0.180141)
mad r12.w, r12.z, r12.w, l(-0.330299497)
mad r12.z, r12.z, r12.w, l(0.999866)
mul r12.w, r12.z, r12.y
lt r13.w, |r12.x|, |r13.y|
mad r12.w, r12.w, l(-2.000000), l(1.57079637)
and r12.w, r13.w, r12.w
mad r12.y, r12.y, r12.z, r12.w
lt r12.z, r12.x, -r12.x
and r12.z, r12.z, l(0xc0490fdb)
add r12.y, r12.z, r12.y
min r12.z, r12.x, r13.y
max r12.x, r12.x, r13.y
lt r12.z, r12.z, -r12.z
ge r12.x, r12.x, -r12.x
and r12.x, r12.x, r12.z
movc r12.x, r12.x, -r12.y, r12.y
mul r16.x, r12.x, l(0.159154937)
mov r16.z, cb2[r11.x + 6].y
sample_l_indexable(texture2darray)(float,float,float,float) r12.x, r16.xyzx, t9.xyzw, s2, l(0.000000)
else
mov r12.x, l(1.000000)
endif
ne r12.y, l(0.000000), cb2[r11.x + 19].x
and r12.y, r12.y, r13.z
if_nz r12.y
mad_sat r12.y, r13.x, cb2[r11.x + 19].z, cb2[r11.x + 19].w
mul r12.y, r12.y, r12.y
lt r12.z, l(0.000000), r12.y
if_nz r12.z
eq r12.z, l(2.000000), cb2[r11.x + 19].x
if_nz r12.z
mov r13.xyz, -r11.yzwy
max r12.z, |r11.z|, |r11.y|
max r12.z, |r11.w|, r12.z
lt r16.xy, |r11.zwzz|, |r11.yyyy|
and r12.w, r16.y, r16.x
if_nz r12.w
lt r12.w, l(0.000000), r13.x
movc r13.x, r12.w, r13.z, r11.w
and r12.w, r12.w, l(0x3f800000)
else
lt r16.xy, |r11.ywyy|, |r11.zzzz|
and r11.z, r16.y, r16.x
if_nz r11.z
lt r11.z, l(0.000000), r13.y
movc r13.w, r11.z, r13.z, r11.w
movc r12.w, r11.z, l(3.000000), l(2.000000)
mov r13.xy, r13.xwxx
else
lt r11.z, l(0.000000), r13.z
movc r13.x, r11.z, r11.y, r13.x
movc r12.w, r11.z, l(5.000000), l(4.000000)
endif
endif
div r11.yz, r13.xxyx, r12.zzzz
mad r13.xy, r11.yzyy, l(0.500000, -0.500000, 0.000000, 0.000000), l(0.500000, 0.500000, 0.000000, 0.000000)
mad r11.y, -r12.z, cb2[r11.x + 9].z, cb2[r11.x + 9].w
div r11.y, r11.y, r12.z
mov r11.z, l(-1)

// Disabled some stupid cut-off
//else
//dp4 r16.x, r7.xyzw, cb2[r11.x + 7].xyzw
//dp4 r16.y, r7.xyzw, cb2[r11.x + 8].xyzw
//dp4 r11.w, r7.xyzw, cb2[r11.x + 9].xyzw
//dp4 r12.z, r7.xyzw, cb2[r11.x + 10].xyzw
//div r12.z, l(1.000000, 1.000000, 1.000000, 1.000000), r12.z
//mul r16.xy, r12.zzzz, r16.xyxx
//mul r11.y, r11.w, r12.z
//mad r13.xy, r16.xyxx, l(0.500000, -0.500000, 0.000000, 0.000000), l(0.500000, 0.500000, 0.000000, 0.000000)
//mad r11.w, -r11.w, r12.z, l(1.000000)
//max r12.z, |r16.y|, |r16.x|
//max r11.w, r11.w, r12.z
//ge r11.z, l(1.000000), r11.w
//mov r12.w, l(0)

endif
ftou r11.w, r12.w
and r12.z, r11.w, l(3)
ushr r11.w, r11.w, l(2)
imad r10.w, r10.w, l(21), r11.w
dp4 r10.w, cb2[r10.w + 17].xyzw, icb[r12.z + 0].xyzw
ge r11.w, r10.w, l(0.000000)
and r11.z, r11.z, r11.w
ftou r10.w, r10.w
ftou r11.w, cb2[r11.x + 19].y
ieq r11.w, r11.w, l(1)
if_nz r11.w
utof r16.z, r10.w
mad r12.zw, cb0[6].xxxx, r13.xxxy, l(0.000000, 0.000000, 0.500000, 0.500000)
round_ni r17.xy, r12.zwzz
add r12.zw, r12.zzzw, -r17.xxxy
mul r16.xy, r17.xyxx, cb0[6].zzzz
gather4_c_aoffimmi_indexable(-2,-2,0)(texture2darray)(float,float,float,float) r17.xyzw, r16.xyzx, t21.xyzw, s3.x, r11.y
add r18.xyzw, -r12.zzzz, l(1.000000, 2.000000, 3.000000, 5.000000)
mul r19.xyzw, r17.wzxy, r18.xyxy
add r17.zw, r19.yyyw, r19.xxxz
gather4_c_aoffimmi_indexable(0,-2,0)(texture2darray)(float,float,float,float) r19.xyzw, r16.xyzx, t21.xyzw, s3.x, r11.y
mad r17.zw, r19.zzzy, l(0.000000, 0.000000, 2.000000, 2.000000), r17.zzzw
mad r17.zw, r19.wwwx, l(0.000000, 0.000000, 2.000000, 2.000000), r17.zzzw
gather4_c_aoffimmi_indexable(2,-2,0)(texture2darray)(float,float,float,float) r20.xyzw, r16.xyzx, t21.xyzw, s3.x, r11.y
add r21.xyzw, r12.zzzz, l(1.000000, 4.000000, 3.000000, 2.000000)
mul r11.w, r12.z, r20.y
mad r17.zw, r20.zzzy, r12.zzzz, r17.zzzw
mad r17.zw, r20.wwwx, r21.xxxx, r17.zzzw
gather4_c_aoffimmi_indexable(-2,0,0)(texture2darray)(float,float,float,float) r22.xyzw, r16.xyzx, t21.xyzw, s3.x, r11.y
mad r13.w, -r12.z, l(2.000000), l(2.000000)
mul r19.zw, r13.wwww, r22.wwwx
mad r14.w, r12.z, l(-2.000000), l(4.000000)
mad r19.zw, r22.zzzy, r14.wwww, r19.zzzw
mad r14.w, r17.y, r18.z, r19.z
mad r14.w, r17.x, r13.w, r14.w
mad r15.w, r22.z, r18.z, r19.w
mad r15.w, r22.w, r13.w, r15.w
gather4_c_indexable(texture2darray)(float,float,float,float) r23.xyzw, r16.xyzx, t21.xyzw, s3.x, r11.y
mad r14.w, r23.z, r21.y, r14.w
mad r14.w, r23.w, r18.w, r14.w
mad r15.w, r23.y, r21.y, r15.w
mad r15.w, r23.x, r18.w, r15.w
add r17.xy, -r12.zwzz, l(4.000000, 1.000000, 0.000000, 0.000000)
mad r14.w, r19.y, r21.z, r14.w
mad r14.w, r19.x, r17.x, r14.w
mad r15.w, r23.z, r21.z, r15.w
mad r15.w, r23.w, r17.x, r15.w
gather4_c_aoffimmi_indexable(2,0,0)(texture2darray)(float,float,float,float) r19.xyzw, r16.xyzx, t21.xyzw, s3.x, r11.y
mad r16.w, r12.z, l(2.000000), l(2.000000)
dp2 r18.w, r19.zzzz, r12.zzzz
add r14.w, r14.w, r18.w
mad r14.w, r19.w, r16.w, r14.w
dp2 r19.y, r19.yyyy, r12.zzzz
add r15.w, r15.w, r19.y
mad r15.w, r19.x, r16.w, r15.w
mad r11.w, r11.w, l(2.000000), r14.w
mad r20.x, r20.x, r21.w, r11.w
add r11.w, r15.w, r18.w
mad r20.y, r19.w, r21.w, r11.w
add r17.zw, r17.zzzw, r20.xxxy
gather4_c_aoffimmi_indexable(-2,2,0)(texture2darray)(float,float,float,float) r20.xyzw, r16.xyzx, t21.xyzw, s3.x, r11.y
mul r24.xyzw, r18.xyxy, r20.wzxy
add r18.xy, r24.ywyy, r24.xzxx
mad r11.w, r22.y, r18.z, r18.x
mad r11.w, r22.x, r13.w, r11.w
mad r14.w, r20.z, r18.z, r18.y
mad r13.w, r20.w, r13.w, r14.w
gather4_c_aoffimmi_indexable(0,2,0)(texture2darray)(float,float,float,float) r18.xyzw, r16.xyzx, t21.xyzw, s3.x, r11.y
mad r11.w, r18.z, l(2.000000), r11.w
mad r11.w, r18.w, l(2.000000), r11.w
mad r13.w, r18.y, l(2.000000), r13.w
mad r13.w, r18.x, l(2.000000), r13.w
mad r11.w, r23.y, r21.z, r11.w
mad r11.w, r23.x, r17.x, r11.w
mad r13.w, r18.z, r21.z, r13.w
mad r13.w, r18.w, r17.x, r13.w
gather4_c_aoffimmi_indexable(2,2,0)(texture2darray)(float,float,float,float) r16.xyzw, r16.xyzx, t21.xyzw, s3.x, r11.y
mul r14.w, r12.z, r16.z
mad r11.w, r16.z, r12.z, r11.w
mad r11.w, r16.w, r21.x, r11.w
mad r12.z, r16.y, r12.z, r13.w
mad r12.z, r16.x, r21.x, r12.z
add r11.w, r11.w, r19.y
mad r16.x, r19.x, r21.w, r11.w
mad r11.w, r14.w, l(2.000000), r12.z
mad r16.y, r16.w, r21.w, r11.w
add r16.xy, r16.xyxx, r17.zwzz
mul r11.w, r12.w, r16.y
mad r11.w, r16.x, r17.y, r11.w
mul_sat r11.w, r11.w, l(0.0163934417)
else
utof r13.z, r10.w
sample_c_lz_indexable(texture2darray)(float,float,float,float) r10.w, r13.xyzx, t21.xxxx, s3, r11.y
lt r11.y, r11.y, l(1.000000)
movc r11.w, r11.y, r10.w, l(1.000000)
endif
add r10.w, r11.w, l(-1.000000)
mul r10.w, r10.w, cb2[r11.x + 4].w
mad r10.w, r12.y, r10.w, l(1.000000)
movc r10.w, r11.z, r10.w, l(1.000000)
mul r12.x, r10.w, r12.x
endif
endif
mad r10.xyz, r15.xyzx, r12.xxxx, r10.xyzx
mad r9.xyz, r14.xyzx, r12.xxxx, r9.xyzx
iadd r9.w, r9.w, l(1)
endloop
add r1.x, r1.z, l(-1.000000)
mad r1.x, cb0[8].y, r1.x, l(1.000000)
mul r2.xyz, r1.xxxx, r10.xyzx
add r1.x, r1.z, r2.w
mad r1.y, r1.y, l(-16.000000), l(-1.000000)
exp r1.y, r1.y
log r1.x, r1.x
mul r1.x, r1.x, r1.y
exp r1.x, r1.x
add r1.x, r1.z, r1.x
add_sat r1.x, r1.x, l(-1.000000)
mul r1.xyz, r1.xxxx, r9.xyzx
mul r2.xyz, r2.xyzx, r3.yzwy
mul r1.xyz, r1.xyzx, l(0.318309873, 0.318309873, 0.318309873, 0.000000)
mad r1.xyz, r2.xyzx, l(0.318309873, 0.318309873, 0.318309873, 0.000000), r1.xyzx
mul r1.xyz, r1.xyzx, cb0[4].zzzz
min r1.xyz, r1.xyzx, l(65504.000000, 65504.000000, 65504.000000, 0.000000)
mov r1.w, l(0)
store_uav_typed u0.xyzw, r0.xyzw, r1.xyzw
ret
// Approximately 367 instruction slots used


I have already attempted a fix. I can see that something gets pushed into stereo, but it is far from the proper result.
Does anyone have any clues?
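
A note on the math behind the attempt, as a hedged sketch rather than a confirmed fix: the 3D Vision driver offsets clip-space x by separation * (w - convergence), so a shader that rebuilds a position from screen coordinates has to undo that offset when it unprojects. When a point (ndc.xyz, 1) is multiplied by an inverse view-projection matrix, the w of the result is 1 / clip.w, so in the assembly above r2.x still holds the reciprocal depth at the point where the attempt reads it; the actual depth only appears after the `div`. The helper below is hypothetical (`UnprojectMono` and its parameter names are not from the game). It assumes that `invViewProjectionMatrix` is a plain inverse of a standard perspective view-projection and that `StereoParams.Load(0)` returns separation in x and convergence in y, as 3Dmigoto provides them.

```hlsl
// Hypothetical helper, untested in this game.
// ndc    : (x, y, depth, 1) with x, y in [-1, 1], as built in r2 before the four dp4s
// invVP  : cb0 invViewProjectionMatrix
// stereo : StereoParams.Load(0).xy -> x = separation, y = convergence
float3 UnprojectMono(float4 ndc, float4x4 invVP, float2 stereo)
{
    // The w of the unprojected point equals 1 / clip.w (reciprocal depth).
    float invW = dot(ndc, invVP._m03_m13_m23_m33);

    // Driver formula in clip space: x += separation * (w - convergence).
    // Divided by w to get NDC: separation * (1 - convergence / w). Undo it here.
    ndc.x -= stereo.x * (1 - stereo.y * invW);

    float4 h;
    h.x = dot(ndc, invVP._m00_m10_m20_m30);
    h.y = dot(ndc, invVP._m01_m11_m21_m31);
    h.z = dot(ndc, invVP._m02_m12_m22_m32);
    // Assumption: _m03 is zero for a standard projection, so the change to
    // ndc.x does not move w; it is recomputed here only for safety.
    h.w = dot(ndc, invVP._m03_m13_m23_m33);

    return h.xyz / h.w;
}
```

In the assembly this would mean running the dp4 against cb0[3] first, adjusting the NDC x in r2.x with that result, and only then running the dp4s against cb0[0..2]. Whether an extra scale factor like the 5.5 above is still needed after that is something only testing in game can tell.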

Here is the HLSL version. The decompile is broken, as you can see, but it helps a bit in figuring out what is what:
// ---- Created with 3Dmigoto v1.2.39 on Mon Jun 13 19:31:45 2016

/*
cbuffer cbPunctualShadowLightInfo : register(b2)
{

struct
{

struct
{
float3 pos;
float invSqrAttenuationRadius;
float3 color;
float attenuationOffset;
float3 matrixForward;
float diffuseScale;
float3 matrixUp;
float specularScale;
float3 matrixLeft;
float shadowDimmer;
float angleScale;
float angleOffset;
float2 unused;
} baseLight;


struct
{
float enable;
float textureIndex;
float2 unused;
} iesShadow;


struct
{
float4 shadowMatrix1;
float4 shadowMatrix2;
float4 shadowMatrix3;
float4 shadowMatrix4;
float4 shadowMapAtlasParam[6];
float4 shadowMapIndex[2];
float shadowType;
float quality;
float shadowAngleScale;
float shadowAngleOffset;
} shadow;


struct
{
float enable;
float volumeShadowMapIndex;
float invAttenuationRadius;
float tanAngle;
} vShadow;

} g_lightInfoPunctualShadow[128] : packoffset(c0);

}

cbuffer cb0 : register(b0)
{
float4x4 invViewProjectionMatrix : packoffset(c0);
float4 g_exposureMultipliers : packoffset(c4);
float localIblMipmapBias : packoffset(c5);
float screenAspectRatio : packoffset(c5.y);
float2 invResolution : packoffset(c5.z);
float4 shadowMapSizeAndInvSize : packoffset(c6);
uint forceSplitLighting : packoffset(c7);
uint sssScatteringEnables : packoffset(c7.y);
float volumetricShadowmapHalfTexelOffset : packoffset(c7.z);
float volumetricShadowmapOneMinusHalfTexelOffset : packoffset(c7.w);
float volumetricShadowmapInvMaxCount : packoffset(c8);
float dynamicAOFactor : packoffset(c8.y);
uint tileCountX : packoffset(c8.z);
uint pad1 : packoffset(c8.w);
float4x3 g_normalBasisTransforms[6] : packoffset(c9);
}

SamplerState g_linearSampler_s : register(s0);
SamplerState g_linearLongitudeWrapSampler_s : register(s2);
SamplerComparisonState g_shadowmapSampler_s : register(s3);
Texture2D<float4> g_gbufferTexture0 : register(t0);
Texture2D<float4> g_gbufferTexture1 : register(t1);
Texture2D<float4> g_gbufferTexture2 : register(t2);
Texture2D<float> g_depthTexture : register(t6);
Texture2DArray<float> g_iesTextureArray : register(t9);
Texture2D<float> g_diffuseOcclusionTexture : register(t10);
StructuredBuffer<g_lightCullInput> g_lightCullInput : register(t19);
StructuredBuffer<g_lightIndexInput> g_lightIndexInput : register(t20);
Texture2DArray<float4> g_shadowmapTexture : register(t21);
StructuredBuffer<g_compactTileGridBuffer> g_compactTileGridBuffer : register(t24);


// 3Dmigoto declarations
#define cmp -
Texture1D<float4> IniParams : register(t120);
Texture2D<float4> StereoParams : register(t125);


void main()
{
const float4 icb[] = { { 1.000000, 0, 0, 0},
{ 0, 1.000000, 0, 0},
{ 0, 0, 1.000000, 0},
{ 0, 0, 0, 1.000000} };
// Needs manual fix for instruction:
// unknown dcl_: dcl_resource_structured t19, 16
// Needs manual fix for instruction:
// unknown dcl_: dcl_resource_structured t20, 4
// Needs manual fix for instruction:
// unknown dcl_: dcl_resource_structured t24, 4
// Needs manual fix for instruction:
// unknown dcl_: dcl_uav_typed_texture2d (float,float,float,float) u0
float4 r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,r10,r11,r12,r13,r14,r15,r16,r17,r18,r19,r20,r21,r22,r23,r24;
uint4 bitmask, uiDest;
float4 fDest;

// Needs manual fix for instruction:
// unknown dcl_: dcl_tgsm_raw g0, 4
// Needs manual fix for instruction:
// unknown dcl_: dcl_tgsm_raw g1, 4
// Needs manual fix for instruction:
// unknown dcl_: dcl_thread_group 16, 16, 1
// Known bad code for instruction (needs manual fix):
ld_structured_indexable(structured_buffer, stride=4)(mixed,mixed,mixed,mixed) r0.x, vThreadGroupID.x, l(0), t24.xxxx
r0.x = g_linearSampler[]..swiz;
r1.x = (uint)r0.x >> 16;
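r1.yzw = (int3)r0.xxx & int3(0xffff,0xffff,0xffff); // mask restored from the asm: and r1.yzw, r0.xxxx, l(0, 0x0000ffff, 0x0000ffff, 0x0000ffff)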
r0.xyzw = mad((int4)r1.xyzw, int4(16,16,16,16), (int4)vThreadIDInGroup.xyyy);
if (vThreadIDInGroupFlattened.x == 0) {
r1.x = mad((int)r1.w, tileCountX, (int)r1.x);
// Known bad code for instruction (needs manual fix):
ld_structured_indexable(structured_buffer, stride=16)(mixed,mixed,mixed,mixed) r1.xy, r1.x, l(0), t19.xyxx
r1.x = g_linearSampler[]..swiz;
r1.y = g_linearSampler[]..swiz;
r1.y = (uint)r1.y >> 16;
// No code for instruction (needs manual fix):
store_raw g0.x, l(0), r1.x
// No code for instruction (needs manual fix):
store_raw g1.x, l(0), r1.y
}
GroupMemoryBarrierWithGroupSync();
r1.xy = (uint2)r0.xw;
r1.zw = float2(0.5,0.5) + r1.xy;
r1.zw = invResolution.xy * r1.zw;
r2.xy = (int2)r1.xy;
r2.zw = float2(0,0);
r3.xyzw = g_gbufferTexture0.Load(r2.xyw).xyzw;
r4.xyzw = g_gbufferTexture1.Load(r2.xyw).xyzw;
r1.xy = g_gbufferTexture2.Load(r2.xyw).yz;
r2.z = g_depthTexture.Load(r2.xyz).x;
r4.w = 6 * r4.w;
r4.w = round(r4.w);
r4.w = (uint)r4.w;
r5.xy = r3.xy * float2(2,2) + float2(-1,-1);
r3.x = dot(r5.xy, r5.xy);
r3.x = min(1, r3.x);
r3.x = 1 + -r3.x;
r5.z = sqrt(r3.x);
r3.x = (int)r4.w * 3;
r6.x = dot(r5.xyz, g_normalBasisTransforms[r4.w]._m00_m10_m20);
r6.y = dot(r5.xyz, g_normalBasisTransforms[r4.w]._m01_m11_m21);
r6.z = dot(r5.xyz, g_normalBasisTransforms[r4.w]._m02_m12_m22);
r3.x = 1 + -r3.z;
r3.y = 3 * r3.w;
r3.y = round(r3.y);
r3.y = (int)r3.y;
r3.y = cmp((int)r3.y == 1);
r1.x = r3.y ? 0 : r1.x;
r3.y = 1 + -r1.x;
r3.yzw = r4.xyz * r3.yyy;
r1.y = r1.y * r1.y;
r4.w = 0.159999996 * r1.y;
r4.xyz = -r1.yyy * float3(0.159999996,0.159999996,0.159999996) + r4.xyz;
r4.xyz = r1.xxx * r4.xyz + r4.www;
r1.x = dot(r4.xyz, float3(0.330000013,0.330000013,0.330000013));
r1.x = saturate(50 * r1.x);
r1.y = r3.x * r3.x;
r5.xy = r1.zw * float2(2,2) + float2(-1,-1);
r2.xy = float2(1,-1) * r5.xy;
r2.w = 1;
r5.x = dot(r2.xyzw, invViewProjectionMatrix._m00_m10_m20_m30);
r5.y = dot(r2.xyzw, invViewProjectionMatrix._m01_m11_m21_m31);
r5.z = dot(r2.xyzw, invViewProjectionMatrix._m02_m12_m22_m32);
r2.x = dot(r2.xyzw, invViewProjectionMatrix._m03_m13_m23_m33);
r2.x = 1 / r2.x;
r7.xyz = r5.xyz * r2.xxx;
r2.y = dot(-r7.xyz, -r7.xyz);
r2.y = rsqrt(r2.y);
r8.xyz = -r7.xyz * r2.yyy;
r2.z = dot(r6.xyz, r8.xyz);
r2.w = saturate(r2.z);
r1.z = g_diffuseOcclusionTexture.SampleLevel(g_linearSampler_s, r1.zw, 0).x;
// No code for instruction (needs manual fix):
ld_raw r1.w, l(0), g0.xxxx
// No code for instruction (needs manual fix):
ld_raw r4.w, l(0), g1.xxxx
r4.w = (int)r1.w + (int)r4.w;
r2.z = 9.99999975e-006 + abs(r2.z);
r8.xyz = r1.xxx + -r4.xyz;
r1.x = max(0.00200000009, r1.y);
r1.x = r1.x * r1.x;
r5.w = -r2.z * r1.x + r2.z;
r5.w = r5.w * r2.z + r1.x;
r5.w = sqrt(r5.w);
r6.w = r3.x * -0.337748349 + 1;
r8.w = 1 + -r2.z;
r9.x = r8.w * r8.w;
r9.x = r9.x * r9.x;
r8.w = r9.x * r8.w;
r7.w = 1;
r9.xyz = float3(0,0,0);
r10.xyz = float3(0,0,0);
r9.w = r1.w;
while (true) {
r10.w = cmp((uint)r9.w >= (uint)r4.w);
if (r10.w != 0) break;
// Known bad code for instruction (needs manual fix):
ld_structured_indexable(structured_buffer, stride=4)(mixed,mixed,mixed,mixed) r10.w, r9.w, l(0), t20.xxxx
r10.w = g_linearSampler[]..swiz;
r11.x = (int)r10.w * 21;
r11.yzw = -r5.xyz * r2.xxx + g_lightInfoPunctualShadow[r11.x].baseLight.pos.xyz;
r12.x = dot(r11.yzw, r11.yzw);
r12.y = rsqrt(r12.x);
r12.yzw = r12.yyy * r11.yzw;
r13.x = g_lightInfoPunctualShadow[r11.x].baseLight.attenuationOffset + r12.x;
r13.x = max(9.99999975e-005, r13.x);
r13.x = 1 / r13.x;
r12.x = g_lightInfoPunctualShadow[r11.x].baseLight.invSqrAttenuationRadius * r12.x;
r12.x = -r12.x * r12.x + 1;
r12.x = max(0, r12.x);
r12.x = r12.x * r12.x;
r12.x = r13.x * r12.x;
r13.x = dot(g_lightInfoPunctualShadow[r11.x].baseLight.matrixForward.xyz, r12.yzw);
r13.y = saturate(r13.x * g_lightInfoPunctualShadow[r11.x].baseLight.angleScale + g_lightInfoPunctualShadow[r11.x].baseLight.angleOffset);
r13.y = r13.y * r13.y;
r12.x = r13.y * r12.x;
r13.y = saturate(dot(r6.xyz, r12.yzw));
r12.x = r13.y * r12.x;
r13.z = cmp(0 < r12.x);
if (r13.z != 0) {
r14.xyz = -r7.xyz * r2.yyy + r12.yzw;
r13.w = dot(r14.xyz, r14.xyz);
r13.w = rsqrt(r13.w);
r14.xyz = r14.xyz * r13.www;
r13.w = saturate(dot(r12.yzw, r14.xyz));
r14.x = saturate(dot(r6.xyz, r14.xyz));
r14.y = 1 + -r13.w;
r14.z = r14.y * r14.y;
r14.z = r14.z * r14.z;
r14.y = r14.z * r14.y;
r14.yzw = r8.xyz * r14.yyy + r4.xyz;
r15.x = -r13.y * r1.x + r13.y;
r15.x = r15.x * r13.y + r1.x;
r15.x = sqrt(r15.x);
r15.x = r15.x * r2.z;
r15.x = r13.y * r5.w + r15.x;
r15.x = 0.5 / r15.x;
r15.y = r14.x * r1.x + -r14.x;
r14.x = r15.y * r14.x + 1;
r14.x = r14.x * r14.x;
r14.x = r1.x / r14.x;
r14.x = r15.x * r14.x;
r14.xyz = r14.yzw * r14.xxx;
r13.w = r13.w * r13.w;
r13.w = dot(r13.ww, r3.xx);
r13.w = r3.x * 0.5 + r13.w;
r13.y = 1 + -r13.y;
r14.w = r13.y * r13.y;
r14.w = r14.w * r14.w;
r13.y = r14.w * r13.y;
r13.w = -1 + r13.w;
r13.y = r13.w * r13.y + 1;
r13.w = r13.w * r8.w + 1;
r13.y = r13.y * r13.w;
r13.y = r13.y * r6.w;
r13.w = g_lightInfoPunctualShadow[r11.x].baseLight.diffuseScale * r12.x;
r15.xyz = g_lightInfoPunctualShadow[r11.x].baseLight.color.xyz * r13.www;
r15.xyz = r15.xyz * r13.yyy;
r12.x = g_lightInfoPunctualShadow[r11.x].baseLight.specularScale * r12.x;
r16.xyz = g_lightInfoPunctualShadow[r11.x].baseLight.color.xyz * r12.xxx;
r14.xyz = r16.xyz * r14.xyz;
} else {
r15.xyz = float3(0,0,0);
r14.xyz = float3(0,0,0);
}
r12.x = cmp(0 < g_lightInfoPunctualShadow[r11.x].iesShadow.enable);
r12.x = r12.x ? r13.z : 0;
if (r12.x != 0) {
r12.x = dot(g_lightInfoPunctualShadow[r11.x].baseLight.matrixLeft.xyz, -r12.yzw);
r13.y = dot(g_lightInfoPunctualShadow[r11.x].baseLight.matrixUp.xyz, -r12.yzw);
r12.y = dot(g_lightInfoPunctualShadow[r11.x].baseLight.matrixForward.xyz, -r12.yzw);
r16.y = r12.y * 0.5 + 0.5;
r12.y = min(abs(r13.y), abs(r12.x));
r12.z = max(abs(r13.y), abs(r12.x));
r12.z = 1 / r12.z;
r12.y = r12.y * r12.z;
r12.z = r12.y * r12.y;
r12.w = r12.z * 0.0208350997 + -0.0851330012;
r12.w = r12.z * r12.w + 0.180141002;
r12.w = r12.z * r12.w + -0.330299497;
r12.z = r12.z * r12.w + 0.999866009;
r12.w = r12.y * r12.z;
r13.w = cmp(abs(r12.x) < abs(r13.y));
r12.w = r12.w * -2 + 1.57079637;
r12.w = r13.w ? r12.w : 0;
r12.y = r12.y * r12.z + r12.w;
r12.z = cmp(r12.x < -r12.x);
r12.z = r12.z ? -3.141593 : 0;
r12.y = r12.y + r12.z;
r12.z = min(r13.y, r12.x);
r12.x = max(r13.y, r12.x);
r12.z = cmp(r12.z < -r12.z);
r12.x = cmp(r12.x >= -r12.x);
r12.x = r12.x ? r12.z : 0;
r12.x = r12.x ? -r12.y : r12.y;
r16.x = 0.159154937 * r12.x;
r16.z = g_lightInfoPunctualShadow[r11.x].iesShadow.textureIndex;
r12.x = g_iesTextureArray.SampleLevel(g_linearLongitudeWrapSampler_s, r16.xyz, 0).x;
} else {
r12.x = 1;
}
r12.y = cmp(0.000000 != g_lightInfoPunctualShadow[r11.x].shadow.shadowType);
r12.y = r12.y ? r13.z : 0;
if (r12.y != 0) {
r12.y = saturate(r13.x * g_lightInfoPunctualShadow[r11.x].shadow.shadowAngleScale + g_lightInfoPunctualShadow[r11.x].shadow.shadowAngleOffset);
r12.y = r12.y * r12.y;
r12.z = cmp(0 < r12.y);
if (r12.z != 0) {
r12.z = cmp(2.000000 == g_lightInfoPunctualShadow[r11.x].shadow.shadowType);
if (r12.z != 0) {
r13.xyz = -r11.yzw;
r12.z = max(abs(r11.y), abs(r11.z));
r12.z = max(r12.z, abs(r11.w));
r16.xy = cmp(abs(r11.zw) < abs(r11.yy));
r12.w = r16.y ? r16.x : 0;
if (r12.w != 0) {
r12.w = cmp(0 < r13.x);
r13.x = r12.w ? r13.z : r11.w;
r12.w = r12.w ? 1.000000 : 0;
} else {
r16.xy = cmp(abs(r11.yw) < abs(r11.zz));
r11.z = r16.y ? r16.x : 0;
if (r11.z != 0) {
r11.z = cmp(0 < r13.y);
r13.w = r11.z ? r13.z : r11.w;
r12.w = r11.z ? 3 : 2;
r13.xy = r13.xw;
} else {
r11.z = cmp(0 < r13.z);
r13.x = r11.z ? r11.y : r13.x;
r12.w = r11.z ? 5 : 4;
}
}
r11.yz = r13.xy / r12.zz;
r13.xy = r11.yz * float2(0.5,-0.5) + float2(0.5,0.5);
r11.y = -r12.z * g_lightInfoPunctualShadow[r11.x].shadow.shadowMatrix3.z + g_lightInfoPunctualShadow[r11.x].shadow.shadowMatrix3.w;
r11.y = r11.y / r12.z;
r11.z = -1;
} else {
r16.x = dot(r7.xyzw, g_lightInfoPunctualShadow[r11.x].shadow.shadowMatrix1.xyzw);
r16.y = dot(r7.xyzw, g_lightInfoPunctualShadow[r11.x].shadow.shadowMatrix2.xyzw);
r11.w = dot(r7.xyzw, g_lightInfoPunctualShadow[r11.x].shadow.shadowMatrix3.xyzw);
r12.z = dot(r7.xyzw, g_lightInfoPunctualShadow[r11.x].shadow.shadowMatrix4.xyzw);
r12.z = 1 / r12.z;
r16.xy = r16.xy * r12.zz;
r11.y = r12.z * r11.w;
r13.xy = r16.xy * float2(0.5,-0.5) + float2(0.5,0.5);
r11.w = -r11.w * r12.z + 1;
r12.z = max(abs(r16.x), abs(r16.y));
r11.w = max(r12.z, r11.w);
r11.z = cmp(1 >= r11.w);
r12.w = 0;
}
r11.w = (uint)r12.w;
r12.z = (int)r11.w & 3;
r11.w = (uint)r11.w >> 2;
r10.w = mad((int)r10.w, 21, (int)r11.w);
r10.w = dot(g_lightInfoPunctualShadow[r10.w].shadow.shadowMapIndex[0].xyzw, icb[r12.z+0].xyzw);
r11.w = cmp(r10.w >= 0);
r11.z = r11.z ? r11.w : 0;
r10.w = (uint)r10.w;
r11.w = (uint)g_lightInfoPunctualShadow[r11.x].shadow.quality;
r11.w = cmp((int)r11.w == 1);
if (r11.w != 0) {
r16.z = (uint)r10.w;
r12.zw = shadowMapSizeAndInvSize.xx * r13.xy + float2(0.5,0.5);
r17.xy = floor(r12.zw);
r12.zw = -r17.xy + r12.zw;
r16.xy = shadowMapSizeAndInvSize.zz * r17.xy;
r17.xyzw = g_shadowmapTexture.GatherCmp(g_shadowmapSampler_s, r16.xyz, r11.y, int2(-2,-2)).xyzw;
r18.xyzw = float4(1,2,3,5) + -r12.zzzz;
r19.xyzw = r18.xyxy * r17.wzxy;
r17.zw = r19.xz + r19.yw;
r19.xyzw = g_shadowmapTexture.GatherCmp(g_shadowmapSampler_s, r16.xyz, r11.y, int2(0,-2)).xyzw;
r17.zw = r19.zy * float2(2,2) + r17.zw;
r17.zw = r19.wx * float2(2,2) + r17.zw;
r20.xyzw = g_shadowmapTexture.GatherCmp(g_shadowmapSampler_s, r16.xyz, r11.y, int2(2,-2)).xyzw;
r21.xyzw = float4(1,4,3,2) + r12.zzzz;
r11.w = r20.y * r12.z;
r17.zw = r20.zy * r12.zz + r17.zw;
r17.zw = r20.wx * r21.xx + r17.zw;
r22.xyzw = g_shadowmapTexture.GatherCmp(g_shadowmapSampler_s, r16.xyz, r11.y, int2(-2,0)).xyzw;
r13.w = -r12.z * 2 + 2;
r19.zw = r22.wx * r13.ww;
r14.w = r12.z * -2 + 4;
r19.zw = r22.zy * r14.ww + r19.zw;
r14.w = r17.y * r18.z + r19.z;
r14.w = r17.x * r13.w + r14.w;
r15.w = r22.z * r18.z + r19.w;
r15.w = r22.w * r13.w + r15.w;
r23.xyzw = g_shadowmapTexture.GatherCmp(g_shadowmapSampler_s, r16.xyz, r11.y).xyzw;
r14.w = r23.z * r21.y + r14.w;
r14.w = r23.w * r18.w + r14.w;
r15.w = r23.y * r21.y + r15.w;
r15.w = r23.x * r18.w + r15.w;
r17.xy = float2(4,1) + -r12.zw;
r14.w = r19.y * r21.z + r14.w;
r14.w = r19.x * r17.x + r14.w;
r15.w = r23.z * r21.z + r15.w;
r15.w = r23.w * r17.x + r15.w;
r19.xyzw = g_shadowmapTexture.GatherCmp(g_shadowmapSampler_s, r16.xyz, r11.y, int2(2,0)).xyzw;
r16.w = r12.z * 2 + 2;
r18.w = dot(r19.zz, r12.zz);
r14.w = r18.w + r14.w;
r14.w = r19.w * r16.w + r14.w;
r19.y = dot(r19.yy, r12.zz);
r15.w = r19.y + r15.w;
r15.w = r19.x * r16.w + r15.w;
r11.w = r11.w * 2 + r14.w;
r20.x = r20.x * r21.w + r11.w;
r11.w = r18.w + r15.w;
r20.y = r19.w * r21.w + r11.w;
r17.zw = r20.xy + r17.zw;
r20.xyzw = g_shadowmapTexture.GatherCmp(g_shadowmapSampler_s, r16.xyz, r11.y, int2(-2,2)).xyzw;
r24.xyzw = r20.wzxy * r18.xyxy;
r18.xy = r24.xz + r24.yw;
r11.w = r22.y * r18.z + r18.x;
r11.w = r22.x * r13.w + r11.w;
r14.w = r20.z * r18.z + r18.y;
r13.w = r20.w * r13.w + r14.w;
r18.xyzw = g_shadowmapTexture.GatherCmp(g_shadowmapSampler_s, r16.xyz, r11.y, int2(0,2)).xyzw;
r11.w = r18.z * 2 + r11.w;
r11.w = r18.w * 2 + r11.w;
r13.w = r18.y * 2 + r13.w;
r13.w = r18.x * 2 + r13.w;
r11.w = r23.y * r21.z + r11.w;
r11.w = r23.x * r17.x + r11.w;
r13.w = r18.z * r21.z + r13.w;
r13.w = r18.w * r17.x + r13.w;
r16.xyzw = g_shadowmapTexture.GatherCmp(g_shadowmapSampler_s, r16.xyz, r11.y, int2(2,2)).xyzw;
r14.w = r16.z * r12.z;
r11.w = r16.z * r12.z + r11.w;
r11.w = r16.w * r21.x + r11.w;
r12.z = r16.y * r12.z + r13.w;
r12.z = r16.x * r21.x + r12.z;
r11.w = r19.y + r11.w;
r16.x = r19.x * r21.w + r11.w;
r11.w = r14.w * 2 + r12.z;
r16.y = r16.w * r21.w + r11.w;
r16.xy = r17.zw + r16.xy;
r11.w = r16.y * r12.w;
r11.w = r16.x * r17.y + r11.w;
r11.w = saturate(0.0163934417 * r11.w);
} else {
r13.z = (uint)r10.w;
r10.w = g_shadowmapTexture.SampleCmpLevelZero(g_shadowmapSampler_s, r13.xyz, r11.y).x;
r11.y = cmp(r11.y < 1);
r11.w = r11.y ? r10.w : 1;
}
r10.w = -1 + r11.w;
r10.w = g_lightInfoPunctualShadow[r11.x].baseLight.shadowDimmer * r10.w;
r10.w = r12.y * r10.w + 1;
r10.w = r11.z ? r10.w : 1;
r12.x = r12.x * r10.w;
}
}
r10.xyz = r15.xyz * r12.xxx + r10.xyz;
r9.xyz = r14.xyz * r12.xxx + r9.xyz;
r9.w = (int)r9.w + 1;
}
r1.x = -1 + r1.z;
r1.x = dynamicAOFactor * r1.x + 1;
r2.xyz = r10.xyz * r1.xxx;
r1.x = r2.w + r1.z;
r1.y = r1.y * -16 + -1;
r1.y = exp2(r1.y);
r1.x = log2(r1.x);
r1.x = r1.y * r1.x;
r1.x = exp2(r1.x);
r1.x = r1.x + r1.z;
r1.x = saturate(-1 + r1.x);
r1.xyz = r9.xyz * r1.xxx;
r2.xyz = r3.yzw * r2.xyz;
r1.xyz = float3(0.318309873,0.318309873,0.318309873) * r1.xyz;
r1.xyz = r2.xyz * float3(0.318309873,0.318309873,0.318309873) + r1.xyz;
r1.xyz = g_exposureMultipliers.zzz * r1.xyz;
r1.xyz = min(float3(65504,65504,65504), r1.xyz);
r1.w = 0;
// No code for instruction (needs manual fix):
store_uav_typed u0.xyzw, r0.xyzw, r1.xyzw
return;
}
*/

Posted 06/13/2016 07:01 PM   
@helifax

I think the fix is the same as the one used in the PS for shadows/lights. If you compare the PS with the CS, they are almost the same; both use "invViewProjectionMatrix". You should try it.

BUT, I don't know how to do the inverse of the "invViewProjectionMatrix" in ASM :(

Posted 06/14/2016 01:10 AM   
Thx DHR;)

I had a feeling that was the case;)
Nothing to worry about;) I have a plan:

- In HLSL, compute the inverse and build the shader.
- In-game, find it again and DUMP it.
- The dumped shader should now have the modified HLSL (from above) and also the RE-BUILT ASM code, which will include everything, including the inverse of the invViewProjectionMatrix;)
- Then it's just a matter of comparing the two to see what is different;)
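
For the first step of the plan, a general 4x4 inverse can be written directly in HLSL, since HLSL has no built-in matrix inverse. The following is only a minimal sketch using the standard cofactor (adjugate) expansion. The function name `inverse4x4` and the variable `viewProjectionMatrix` are made up for illustration, and it assumes the matrix is invertible (there is no guard against a zero determinant):

```hlsl
// Hypothetical helper (not from the game's shader): inverse of a 4x4 matrix
// via the adjugate divided by the determinant.
// Assumption: det != 0, which should hold for a view-projection matrix.
float4x4 inverse4x4(float4x4 m)
{
    float a00 = m[0][0], a01 = m[0][1], a02 = m[0][2], a03 = m[0][3];
    float a10 = m[1][0], a11 = m[1][1], a12 = m[1][2], a13 = m[1][3];
    float a20 = m[2][0], a21 = m[2][1], a22 = m[2][2], a23 = m[2][3];
    float a30 = m[3][0], a31 = m[3][1], a32 = m[3][2], a33 = m[3][3];

    // 2x2 sub-determinants of the top two rows (b00..b05)
    // and of the bottom two rows (b06..b11).
    float b00 = a00 * a11 - a01 * a10;
    float b01 = a00 * a12 - a02 * a10;
    float b02 = a00 * a13 - a03 * a10;
    float b03 = a01 * a12 - a02 * a11;
    float b04 = a01 * a13 - a03 * a11;
    float b05 = a02 * a13 - a03 * a12;
    float b06 = a20 * a31 - a21 * a30;
    float b07 = a20 * a32 - a22 * a30;
    float b08 = a20 * a33 - a23 * a30;
    float b09 = a21 * a32 - a22 * a31;
    float b10 = a21 * a33 - a23 * a31;
    float b11 = a22 * a33 - a23 * a32;

    float det = b00 * b11 - b01 * b10 + b02 * b09
              + b03 * b08 - b04 * b07 + b05 * b06;

    // Adjugate, listed in the same [row][column] order used to read m.
    float4x4 adj = float4x4(
        a11 * b11 - a12 * b10 + a13 * b09,
        a02 * b10 - a01 * b11 - a03 * b09,
        a31 * b05 - a32 * b04 + a33 * b03,
        a22 * b04 - a21 * b05 - a23 * b03,

        a12 * b08 - a10 * b11 - a13 * b07,
        a00 * b11 - a02 * b08 + a03 * b07,
        a32 * b02 - a30 * b05 - a33 * b01,
        a20 * b05 - a22 * b02 + a23 * b01,

        a10 * b10 - a11 * b08 + a13 * b06,
        a01 * b08 - a00 * b10 - a03 * b06,
        a30 * b04 - a31 * b02 + a33 * b00,
        a21 * b02 - a20 * b04 - a23 * b00,

        a11 * b07 - a10 * b09 - a12 * b06,
        a00 * b09 - a01 * b07 + a02 * b06,
        a31 * b01 - a30 * b03 - a32 * b00,
        a20 * b03 - a21 * b01 + a22 * b00);

    return adj / det;
}

// Possible use inside the shader (variable name is illustrative only):
// float4x4 viewProjectionMatrix = inverse4x4(invViewProjectionMatrix);
```

Because the inverse of a transposed matrix is the transpose of the inverse, the same function should work whichever way the game packs the matrix, as long as the result is then used with the same multiplication order as the original.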

I'll give it a go and see if it works Oo:))

Thanks!

Posted 06/14/2016 09:29 AM   
bo3b said: This is possibly old news to people, but figured it wouldn't hurt to point this out.

CryEngine has been open-sourced. It's on GitHub.

https://github.com/CRYTEK-CRYENGINE/CRYENGINE

The particularly interesting part is the inclusion of all of their cfx files, so we can see exactly what they are doing in the shaders, including variables.

https://github.com/CRYTEK-CRYENGINE/CRYENGINE/tree/release/Engine/Shaders/HWScripts/CryFX

Any new hope for Ryse then?
The first Crysis would be nice too :D.
Finishing it a second time, but in 3D, would be nice.

I will wait and see if a shaderhacker takes a look and fixes them.

I need to take the time to go through your school to see how you do it, and why not start fixing the games I want myself, if I can.
The Witcher 1 fix wasn't perfect, and I would also look at the Telltale games.

Posted 06/14/2016 06:05 PM   