Dolphin Emulator
masterotaku said: Basically, I could fix water reflections and refraction in Zelda Twilight Princess, but it had to be with the 3D Vision option disabled in Dolphin so I could use the separation and convergence variables in 3Dmigoto (they are 0 otherwise).

It might be possible to use the arbitrary resource copying feature I added to 3DMigoto to copy these from the geometry shaders, where Dolphin adds them, to whichever shader you need. Looking at the Dolphin source code, though, I'm not sure the vertex shader is called for both eyes when using their method - I'm pretty sure it's split into two views in the geometry shader instead.

masterotaku said: I would be fine with that if that big geometry problem didn't happen (see the last screenshot of my post). It's something that depends on the camera position. The Zelda TP screenshots were carefully taken with the right position to not make the bug appear.

That's very strange - I've never seen anything like that. At a complete guess it might be something to do with the StereoCutoff or StereoCutoffDepthNear profile settings, or another heuristic?

masterotaku said:
r2.x = dot(cproj[0].xyzw, r1.xyzw);
r2.x -= separation * (dot(cproj[3].xyzw, r1.xyzw) - convergence); // <-- The line I added, after loading the separation and convergence variables at the start of the main method.
r2.y = dot(cproj[1].xyzw, r1.xyzw);
r0.y = dot(cproj[3].xyzw, r1.xyzw);
r0.z = dot(cproj[2].xyzw, r1.xyzw);

Can you post the complete vertex shader and geometry shader for that effect?

2x Geforce GTX 980 in SLI provided by NVIDIA, i7 6700K 4GHz CPU, Asus 27" VG278HE 144Hz 3D Monitor, BenQ W1070 3D Projector, 120" Elite Screens YardMaster 2, 32GB Corsair DDR4 3200MHz RAM, Samsung 850 EVO 500G SSD, 4x750GB HDD in RAID5, Gigabyte Z170X-Gaming 7 Motherboard, Corsair Obsidian 750D Airflow Edition Case, Corsair RM850i PSU, HTC Vive, Win 10 64bit

Alienware M17x R4 w/ built in 3D, Intel i7 3740QM, GTX 680m 2GB, 16GB DDR3 1600MHz RAM, Win7 64bit, 1TB SSD, 1TB HDD, 750GB HDD

Pre-release 3D fixes, shadertool.py and other goodies: http://github.com/DarkStarSword/3d-fixes
Support me on Patreon: https://www.patreon.com/DarkStarSword or PayPal: https://www.paypal.me/DarkStarSword

#16
Posted 11/16/2015 12:58 AM   
Posting super fast before going to work.

I'm not sure the problem is easy to spot, but here is the vertex shader that draws the ground and many other elements (this isn't the code I posted before; that was the water reflections vertex shader):

cbuffer _Globals : register(b0)
{
float4 cproj[4] : packoffset(c0);
float4 cmtrl[4] : packoffset(c4);
float4 clights[40] : packoffset(c8);
float4 ctexmtx[24] : packoffset(c48);
float4 ctrmtx[64] : packoffset(c72);
float4 cnmtx[32] : packoffset(c136);
float4 cpostmtx[64] : packoffset(c168);
float4 cDepth : packoffset(c232);
float4 cPLOffset[13] : packoffset(c233);
}

Texture2D<float4> StereoParams : register(t125);
Texture1D<float4> IniParams : register(t120);

void main(
float3 v0 : NORMAL0,
float4 v1 : COLOR0,
float2 v2 : TEXCOORD0,
float4 v3 : BLENDINDICES0,
float4 v4 : POSITION0,
out float4 o0 : SV_Position0,
out float4 o1 : COLOR0,
out float4 o2 : COLOR1,
out float4 o3 : TEXCOORD0,
out float4 o4 : TEXCOORD1,
out float4 o5 : TEXCOORD2,
out float4 o6 : TEXCOORD3)
{
float4 r0,r1,r2;
uint4 bitmask, uiDest;
float4 fDest;

r0.x = 255 * v3.x;
r0.x = (int)r0.x;
r1.x = dot(ctrmtx[r0.x].xyzw, v4.xyzw);
r0.yzw = (int3)r0.xxx + int3(1,2,-32);
r1.y = dot(ctrmtx[r0.y].xyzw, v4.xyzw);
r1.z = dot(ctrmtx[r0.z].xyzw, v4.xyzw);
r1.w = 1;
r2.x = dot(cproj[0].xyzw, r1.xyzw);
r2.y = dot(cproj[1].xyzw, r1.xyzw);
r0.y = dot(cproj[3].xyzw, r1.xyzw);
r0.z = dot(cproj[2].xyzw, r1.xyzw);
o5.xy = r1.xy;
o6.w = r1.z;
o0.xy = r0.yy * cDepth.zw + r2.xy;
r1.x = cDepth.y * r0.z;
o5.zw = r0.zy;
r0.z = cDepth.x + -1;
r0.z = r0.z * r0.y + r1.x;
o0.z = -r0.z;
o0.w = r0.y;
o1.xyzw = v1.xyzw;
o2.xyzw = v1.xyzw;
r1.xy = v2.xy;
r1.zw = float2(1,1);
r2.x = dot(r1.xyww, ctexmtx[0].xyzw);
r2.y = dot(r1.xyzw, ctexmtx[1].xyzw);
r2.z = 1;
r0.y = dot(cpostmtx[61].xyz, r2.xyz);
o3.x = cpostmtx[61].w + r0.y;
r0.y = dot(cpostmtx[62].xyz, r2.xyz);
r0.z = dot(cpostmtx[63].xyz, r2.xyz);
o3.z = cpostmtx[63].w + r0.z;
o3.y = cpostmtx[62].w + r0.y;
r1.x = dot(v4.xyzw, ctexmtx[3].xyzw);
r1.y = dot(v4.xyzw, ctexmtx[4].xyzw);
r1.z = dot(v4.xyzw, ctexmtx[5].xyzw);
r0.y = dot(cpostmtx[61].xyz, r1.xyz);
o4.x = cpostmtx[61].w + r0.y;
r0.y = dot(cpostmtx[62].xyz, r1.xyz);
r0.z = dot(cpostmtx[63].xyz, r1.xyz);
o4.z = cpostmtx[63].w + r0.z;
o4.y = cpostmtx[62].w + r0.y;
r0.y = (int)r0.x >= 32;
r0.x = r0.y ? r0.w : r0.x;
r1.x = dot(cnmtx[r0.x].xyz, v0.xyz);
r0.xy = (int2)r0.xx + int2(1,2);
r1.y = dot(cnmtx[r0.x].xyz, v0.xyz);
r1.z = dot(cnmtx[r0.y].xyz, v0.xyz);
r0.x = dot(r1.xyz, r1.xyz);
r0.x = rsqrt(r0.x);
o6.xyz = r1.xyz * r0.xxx;
return;
}


Even with this shader skipped, shadows are still wrong when the camera rotates, and there is still one more vertex shader:

cbuffer _Globals : register(b0)
{
float4 cproj[4] : packoffset(c0);
float4 cmtrl[4] : packoffset(c4);
float4 clights[40] : packoffset(c8);
float4 ctexmtx[24] : packoffset(c48);
float4 ctrmtx[64] : packoffset(c72);
float4 cnmtx[32] : packoffset(c136);
float4 cpostmtx[64] : packoffset(c168);
float4 cDepth : packoffset(c232);
float4 cPLOffset[13] : packoffset(c233);
}

Texture2D<float4> StereoParams : register(t125);
Texture1D<float4> IniParams : register(t120);

void main(
float3 v0 : NORMAL0,
float4 v1 : COLOR0,
float2 v2 : TEXCOORD0,
float4 v3 : BLENDINDICES0,
float4 v4 : POSITION0,
out float4 o0 : SV_Position0,
out float4 o1 : COLOR0,
out float4 o2 : COLOR1,
out float4 o3 : TEXCOORD0,
out float4 o4 : TEXCOORD1,
out float4 o5 : TEXCOORD2,
out float4 o6 : TEXCOORD3,
out float4 o7 : TEXCOORD4)
{
float4 r0,r1,r2;
uint4 bitmask, uiDest;
float4 fDest;

r0.x = 255 * v3.x;
r0.x = (int)r0.x;
r1.x = dot(ctrmtx[r0.x].xyzw, v4.xyzw);
r0.yzw = (int3)r0.xxx + int3(1,2,-32);
r1.y = dot(ctrmtx[r0.y].xyzw, v4.xyzw);
r1.z = dot(ctrmtx[r0.z].xyzw, v4.xyzw);
r1.w = 1;
r2.x = dot(cproj[0].xyzw, r1.xyzw);
r2.y = dot(cproj[1].xyzw, r1.xyzw);
r0.y = dot(cproj[3].xyzw, r1.xyzw);
r0.z = dot(cproj[2].xyzw, r1.xyzw);
o6.xy = r1.xy;
o7.w = r1.z;
o0.xy = r0.yy * cDepth.zw + r2.xy;
r1.x = cDepth.y * r0.z;
o6.zw = r0.zy;
r0.z = cDepth.x + -1;
r0.z = r0.z * r0.y + r1.x;
o0.z = -r0.z;
o0.w = r0.y;
o1.xyzw = v1.xyzw;
o2.xyzw = v1.xyzw;
r1.z = 1;
r2.xy = v2.xy;
r2.zw = float2(1,1);
r1.x = dot(r2.xyww, ctexmtx[0].xyzw);
r1.y = dot(r2.xyww, ctexmtx[1].xyzw);
r0.y = dot(cpostmtx[61].xyz, r1.xyz);
o3.x = cpostmtx[61].w + r0.y;
r0.y = dot(cpostmtx[62].xyz, r1.xyz);
r0.z = dot(cpostmtx[63].xyz, r1.xyz);
o3.z = cpostmtx[63].w + r0.z;
o3.y = cpostmtx[62].w + r0.y;
r1.x = dot(v4.xyzw, ctexmtx[3].xyzw);
r1.y = dot(v4.xyzw, ctexmtx[4].xyzw);
r1.z = dot(v4.xyzw, ctexmtx[5].xyzw);
r0.y = dot(cpostmtx[61].xyz, r1.xyz);
o4.x = cpostmtx[61].w + r0.y;
r0.y = dot(cpostmtx[62].xyz, r1.xyz);
r0.z = dot(cpostmtx[63].xyz, r1.xyz);
o4.z = cpostmtx[63].w + r0.z;
o4.y = cpostmtx[62].w + r0.y;
r1.x = dot(r2.xyww, ctexmtx[6].xyzw);
r1.y = dot(r2.xyzw, ctexmtx[7].xyzw);
r1.z = 1;
r0.y = dot(cpostmtx[61].xyz, r1.xyz);
o5.x = cpostmtx[61].w + r0.y;
r0.y = dot(cpostmtx[62].xyz, r1.xyz);
r0.z = dot(cpostmtx[63].xyz, r1.xyz);
o5.z = cpostmtx[63].w + r0.z;
o5.y = cpostmtx[62].w + r0.y;
r0.y = (int)r0.x >= 32;
r0.x = r0.y ? r0.w : r0.x;
r1.x = dot(cnmtx[r0.x].xyz, v0.xyz);
r0.xy = (int2)r0.xx + int2(1,2);
r1.y = dot(cnmtx[r0.x].xyz, v0.xyz);
r1.z = dot(cnmtx[r0.y].xyz, v0.xyz);
r0.x = dot(r1.xyz, r1.xyz);
r0.x = rsqrt(r0.x);
o7.xyz = r1.xyz * r0.xxx;
return;
}

CPU: Intel Core i7 7700K @ 4.9GHz
Motherboard: Gigabyte Aorus GA-Z270X-Gaming 5
RAM: GSKILL Ripjaws Z 16GB 3866MHz CL18
GPU: Gainward Phoenix 1080 GLH
Monitor: Asus PG278QR
Speakers: Logitech Z506
Donations account: masterotakusuko@gmail.com

#17
Posted 11/16/2015 06:43 AM   
I was actually interested in the water shader - since you already fixed it with 3D Vision Automatic I wanted to take a look and see if it might be possible to make the fix work with Dolphin's 3D instead.

BTW, to hunt geometry shaders you will need to enable the hunting keys for them in the d3dx.ini. We don't have them enabled by default because a) most games don't need them, b) it's getting harder and harder to pick keys that are still free in most games, and c) I never quite got around to adding them to the template ;-) Find where the existing hunting keys are defined and add something like this (set the keys to whatever you prefer):
previous_geometryshader = <
next_geometryshader = >
mark_geometryshader = .

#18
Posted 11/16/2015 07:17 AM   
OK, I'll post the water shader and do what you said when I'm home, which should be 9 or 10 hours from now.

#19
Posted 11/16/2015 08:25 AM   
There are only two geometry shaders in Zelda Twilight Princess, and both of them just affect the minimap.

Here's the complete code of the water vertex shader (there are more shaders like this one that can be fixed the same way):

// Water reflections, Kakariko.
cbuffer _Globals : register(b0)
{
float4 cproj[4] : packoffset(c0);
float4 cmtrl[4] : packoffset(c4);
float4 clights[40] : packoffset(c8);
float4 ctexmtx[24] : packoffset(c48);
float4 ctrmtx[64] : packoffset(c72);
float4 cnmtx[32] : packoffset(c136);
float4 cpostmtx[64] : packoffset(c168);
float4 cDepth : packoffset(c232);
float4 cPLOffset[13] : packoffset(c233);
}

Texture2D<float4> StereoParams : register(t125);
Texture1D<float4> IniParams : register(t120);

void main(
float3 v0 : NORMAL0,
float4 v1 : COLOR0,
float2 v2 : TEXCOORD0,
float4 v3 : BLENDINDICES0,
float4 v4 : POSITION0,
out float4 o0 : SV_Position0,
out float4 o1 : COLOR0,
out float4 o2 : COLOR1,
out float4 o3 : TEXCOORD0,
out float4 o4 : TEXCOORD1,
out float4 o5 : TEXCOORD2,
out float4 o6 : TEXCOORD3)
{
float4 r0,r1,r2;
uint4 bitmask, uiDest;
float4 fDest;

float4 iniparams = IniParams.Load(0);
float4 stereo = StereoParams.Load(0);
float separation = stereo.x;
float convergence = stereo.y;

r0.x = 255 * v3.x;
r0.x = (int)r0.x;
r1.x = dot(ctrmtx[r0.x].xyzw, v4.xyzw);
r0.yzw = (int3)r0.xxx + int3(1,2,-32);
r1.y = dot(ctrmtx[r0.y].xyzw, v4.xyzw);
r1.z = dot(ctrmtx[r0.z].xyzw, v4.xyzw);
r1.w = 1;
r2.x = dot(cproj[0].xyzw, r1.xyzw);
r2.x -= separation * (dot(cproj[3].xyzw, r1.xyzw) - convergence); // Added line: stereo correction, x -= separation * (w - convergence)
r2.y = dot(cproj[1].xyzw, r1.xyzw);
r0.y = dot(cproj[3].xyzw, r1.xyzw);
r0.z = dot(cproj[2].xyzw, r1.xyzw);
o5.xy = r1.xy;
o6.w = r1.z;
o0.xy = r0.yy * cDepth.zw + r2.xy;
r1.x = cDepth.y * r0.z;
o5.zw = r0.zy;
r0.z = cDepth.x + -1;
r0.z = r0.z * r0.y + r1.x;
o0.z = -r0.z;
o0.w = r0.y;
o1.xyzw = v1.xyzw;
o2.xyzw = v1.xyzw;
r1.x = dot(v4.xyzw, ctexmtx[0].xyzw);
r1.y = dot(v4.xyzw, ctexmtx[1].xyzw);
r1.z = dot(v4.xyzw, ctexmtx[2].xyzw);
r0.y = dot(cpostmtx[61].xyz, r1.xyz);
o3.x = cpostmtx[61].w + r0.y;
r0.y = dot(cpostmtx[62].xyz, r1.xyz);
r0.z = dot(cpostmtx[63].xyz, r1.xyz);
o3.z = cpostmtx[63].w + r0.z;
o3.y = cpostmtx[62].w + r0.y;
r1.xy = v2.xy;
r1.zw = float2(1,1);
r2.x = dot(r1.xyww, ctexmtx[3].xyzw);
r2.y = dot(r1.xyzw, ctexmtx[4].xyzw);
r2.z = 1;
r0.y = dot(cpostmtx[61].xyz, r2.xyz);
o4.x = cpostmtx[61].w + r0.y;
r0.y = dot(cpostmtx[62].xyz, r2.xyz);
r0.z = dot(cpostmtx[63].xyz, r2.xyz);
o4.z = cpostmtx[63].w + r0.z;
o4.y = cpostmtx[62].w + r0.y;
r0.y = (int)r0.x >= 32;
r0.x = r0.y ? r0.w : r0.x;
r1.x = dot(cnmtx[r0.x].xyz, v0.xyz);
r0.xy = (int2)r0.xx + int2(1,2);
r1.y = dot(cnmtx[r0.x].xyz, v0.xyz);
r1.z = dot(cnmtx[r0.y].xyz, v0.xyz);
r0.x = dot(r1.xyz, r1.xyz);
r0.x = rsqrt(r0.x);
o6.xyz = r1.xyz * r0.xxx;
return;
}

/*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
//
// Generated by Microsoft (R) HLSL Shader Compiler 9.30.9200.20789
//
// using 3Dmigoto v1.2.9 on Mon Nov 16 20:59:50 2015
//
//
// Buffer Definitions:
//
// cbuffer $Globals
// {
//
// float4 cproj[4]; // Offset: 0 Size: 64
// float4 cmtrl[4]; // Offset: 64 Size: 64 [unused]
// float4 clights[40]; // Offset: 128 Size: 640 [unused]
// float4 ctexmtx[24]; // Offset: 768 Size: 384
// float4 ctrmtx[64]; // Offset: 1152 Size: 1024
// float4 cnmtx[32]; // Offset: 2176 Size: 512
// float4 cpostmtx[64]; // Offset: 2688 Size: 1024
// float4 cDepth; // Offset: 3712 Size: 16
// float4 cPLOffset[13]; // Offset: 3728 Size: 208 [unused]
//
// }
//
//
// Resource Bindings:
//
// Name Type Format Dim Slot Elements
// ------------------------------ ---------- ------- ----------- ---- --------
// $Globals cbuffer NA NA 0 1
//
//
//
// Input signature:
//
// Name Index Mask Register SysValue Format Used
// -------------------- ----- ------ -------- -------- ------- ------
// NORMAL 0 xyz 0 NONE float xyz
// COLOR 0 xyzw 1 NONE float xyzw
// TEXCOORD 0 xy 2 NONE float xy
// BLENDINDICES 0 xyzw 3 NONE float x
// POSITION 0 xyzw 4 NONE float xyzw
//
//
// Output signature:
//
// Name Index Mask Register SysValue Format Used
// -------------------- ----- ------ -------- -------- ------- ------
// SV_Position 0 xyzw 0 POS float xyzw
// COLOR 0 xyzw 1 NONE float xyzw
// COLOR 1 xyzw 2 NONE float xyzw
// TEXCOORD 0 xyz 3 NONE float xyz
// TEXCOORD 1 xyz 4 NONE float xyz
// TEXCOORD 2 xyzw 5 NONE float xyzw
// TEXCOORD 3 xyzw 6 NONE float xyzw
//
vs_5_0
dcl_globalFlags refactoringAllowed
dcl_constantbuffer cb0[233], dynamicIndexed
dcl_input v0.xyz
dcl_input v1.xyzw
dcl_input v2.xy
dcl_input v3.x
dcl_input v4.xyzw
dcl_output_siv o0.xyzw, position
dcl_output o1.xyzw
dcl_output o2.xyzw
dcl_output o3.xyz
dcl_output o4.xyz
dcl_output o5.xyzw
dcl_output o6.xyzw
dcl_temps 3
mul r0.x, v3.x, l(255.000000)
ftoi r0.x, r0.x
dp4 r1.x, cb0[r0.x + 72].xyzw, v4.xyzw
iadd r0.yzw, r0.xxxx, l(0, 1, 2, -32)
dp4 r1.y, cb0[r0.y + 72].xyzw, v4.xyzw
dp4 r1.z, cb0[r0.z + 72].xyzw, v4.xyzw
mov r1.w, l(1.000000)
dp4 r2.x, cb0[0].xyzw, r1.xyzw
dp4 r2.y, cb0[1].xyzw, r1.xyzw
dp4 r0.y, cb0[3].xyzw, r1.xyzw
dp4 r0.z, cb0[2].xyzw, r1.xyzw
mov o5.xy, r1.xyxx
mov o6.w, r1.z
mad o0.xy, r0.yyyy, cb0[232].zwzz, r2.xyxx
mul r1.x, r0.z, cb0[232].y
mov o5.zw, r0.zzzy
add r0.z, l(-1.000000), cb0[232].x
mad r0.z, r0.z, r0.y, r1.x
mov o0.z, -r0.z
mov o0.w, r0.y
mov o1.xyzw, v1.xyzw
mov o2.xyzw, v1.xyzw
dp4 r1.x, v4.xyzw, cb0[48].xyzw
dp4 r1.y, v4.xyzw, cb0[49].xyzw
dp4 r1.z, v4.xyzw, cb0[50].xyzw
dp3 r0.y, cb0[229].xyzx, r1.xyzx
add o3.x, r0.y, cb0[229].w
dp3 r0.y, cb0[230].xyzx, r1.xyzx
dp3 r0.z, cb0[231].xyzx, r1.xyzx
add o3.z, r0.z, cb0[231].w
add o3.y, r0.y, cb0[230].w
mov r1.xy, v2.xyxx
mov r1.zw, l(0,0,1.000000,1.000000)
dp4 r2.x, r1.xyww, cb0[51].xyzw
dp4 r2.y, r1.xyzw, cb0[52].xyzw
mov r2.z, l(1.000000)
dp3 r0.y, cb0[229].xyzx, r2.xyzx
add o4.x, r0.y, cb0[229].w
dp3 r0.y, cb0[230].xyzx, r2.xyzx
dp3 r0.z, cb0[231].xyzx, r2.xyzx
add o4.z, r0.z, cb0[231].w
add o4.y, r0.y, cb0[230].w
ige r0.y, r0.x, l(32)
movc r0.x, r0.y, r0.w, r0.x
dp3 r1.x, cb0[r0.x + 136].xyzx, v0.xyzx
iadd r0.xy, r0.xxxx, l(1, 2, 0, 0)
dp3 r1.y, cb0[r0.x + 136].xyzx, v0.xyzx
dp3 r1.z, cb0[r0.y + 136].xyzx, v0.xyzx
dp3 r0.x, r1.xyzx, r1.xyzx
rsq r0.x, r0.x
mul o6.xyz, r0.xxxx, r1.xyzx
ret
// Approximately 52 instruction slots used

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
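
The only change to the water shader above is the line that adjusts r2.x, and its arithmetic can be checked outside the shader. The following is a minimal Python sketch of that one line, assuming `separation` and `convergence` hold the values the shader reads from StereoParams (x and y) and that `dot(cproj[3].xyzw, r1.xyzw)` is the clip-space w of the vertex. The function name and the sample numbers are illustrative only and do not come from the game or from 3Dmigoto.

```python
def corrected_clip_x(clip_x: float, clip_w: float,
                     separation: float, convergence: float) -> float:
    """Mirror of the added HLSL line:
        r2.x -= separation * (dot(cproj[3].xyzw, r1.xyzw) - convergence);
    clip_x stands for r2.x and clip_w for the dot product with cproj[3]."""
    return clip_x - separation * (clip_w - convergence)


if __name__ == "__main__":
    # Hypothetical stereo values, chosen only to show the behaviour.
    sep, conv = 0.05, 2.0
    for w in (1.0, 2.0, 10.0):
        # The shift changes sign on either side of w == convergence.
        print(f"w={w}: shifted x = {corrected_clip_x(0.0, w, sep, conv)}")
```

With `separation` equal to 0 the function returns `clip_x` unchanged, which is consistent with the fix doing nothing when the stereo variables read as 0. A vertex whose w equals `convergence` is also left unshifted.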

#20
Posted 11/16/2015 10:18 PM   
I must be missing something - from the Dolphin source code it looks like there should be geometry shaders for everything in 3D. Maybe they are only used when Dolphin 3D is enabled?

#21
Posted 11/16/2015 11:26 PM   