[b]@DarkStarSword:[/b]
Thank you for your explanations. This clears up the dead ends I was running into. :)
Crazy that two constant registers can't be read in one instruction.
And I totally overlooked the HelixMod Feature List. I should have done some more reading before starting to hack my first shader. ;)
[b]@bo3b:[/b]
I really enjoyed your tutorials, thank you. :) They opened my eyes to HelixMod. I had already seen some HLSL shaders a long time ago (getting Bioshock working on SM 2.0 cards), but the ASM shader code was a bit strange to me.
Now I understand how the ASM code works and what has to be done for the 3D Vision correction.
It's sad that the developers don't add this code directly, as it would be much easier for them.
They don't have to search through hundreds of shaders, and they know exactly how every single shader works.
Desktop-PC
i7 870 @ 3.8GHz + MSI GTX1070 Gaming X + 16GB RAM + Win10 64Bit Home + AW2310+3D-Vision
@Bo3b:
I thought I could adapt the moon fix from "Alan Wake" to "Alan Wake A N". I thought this would be an easy one to start with. But it doesn't work correctly, and it seems it doesn't really work in Alan Wake either; there it is just veiled by clouds and fog.
The moon is not a fixed skybox texture but a sprite that always faces the viewport (like enemies in the old DOOM games). So the moon shifts depth with its position on the screen and has different angles in the left and right eye:
[img]http://abload.de/img/alan_wakes_american_n5js8l.jpg[/img]
If you remember, Remedy claimed that "AW" would be 3D Vision compatible (it was not and still is not). They tried to make compatible shaders, and some of them seem to have made it into "AWAN". Some code and a 3D stereo sampler can be found in the pixel shaders, but it really does weird things to the moon. I killed this code in the PS and tried to fix it in the VS:
[code]def c220,-0.1527, 0, 0.0625, 0 //3D stereo params, x = moon depth correction
dcl_2d s1 // stereo texture sampler
...
// mul o2.xyz, r0.x, r1
mul r3.xyz, r0.x, r1
texldl r5, c220.z, s1// retrieve values from stereo texture
//(W - Convergence)
// add r5.w,r3.w,-r5.y // r3.w=0 ?
// add r5.w,c220.x,-r5.y // c220.x moon-depth
mov r5.w,c220.x
//MultiplyAdd Xnew(r3.x) = Xold(r3.x) + Separation(r5.x) * (Depth r5.w))
mad r3.x, r5.x, r5.w, r3.x
mov o2.xyz, r3.xyz[/code]
This is the code and it works pretty well:
[img]http://abload.de/img/alan_wakes_american_np2skf.jpg[/img]
The problem is that I can't get the convergence factored in. If I do (line 8 or 9 of the code above), the moon shifts depth unnaturally with convergence.
As the moon is at the far end like the stars, there is nearly no change with convergence, so I'm thinking of leaving it as it is.
Or maybe you have a hint?
PS: Don't mind the square stars, I only changed the PS for a better perception of the depth-difference between stars and moon ;) .
@Flint Eastwood: Great job! That is looking like a great fix. I always find it super interesting that we can fix stuff like this without any access to the source code, and the developers of the original game were unable to get it right, and they even tried.
For the moon convergence, it's fine to leave it as is. Depending upon the game and how it's constructed, we sometimes use only the separation value and ignore the convergence, especially for full distance stuff like the skybox or moon. To a large degree, we are just trying to get something that is playable, not necessarily something 'correct'. The reason for that is because we have no control over the developer, and a lot of times they cheat the model in ways that we cannot work around.
For the -15% parameter, I assume that you just found that from experimentation? It's an odd combination that we don't usually use for skybox/moon fixes.
Going with the VS is the right spot for this fix. But usually we'll just add in a maximum separation to push it to full depth. So normally I'd expect c220.x to be 1.0. Having it be negative also suggests it was initially popping out of screen instead. There are lots of variants though, so really whatever works is good.
Be sure that changing convergence doesn't affect it, and that it looks right on both low separation and high separation, and it should be good to go. Great stuff.
Acer H5360 (1280x720@120Hz) - ASUS VG248QE with GSync mod - 3D Vision 1&2 - Driver 372.54
GTX 970 - i5-4670K@4.2GHz - 12GB RAM - Win7x64+evilKB2670838 - 4 Disk X25 RAID
SAGER NP9870-S - GTX 980 - i7-6700K - Win10 Pro 1607 Latest 3Dmigoto Release Bo3b's School for ShaderHackers
If I'm understanding your code right the moon is initially in stereo, but too far back and you're trying to bring it closer?
If your solution is working well enough you probably should just leave it at that, but if you wanted to experiment a bit further here's a couple of ideas - I can't say if these will work in this case, just something you might experiment with:
- Try undoing the stereo correction (do a subtract in the last step instead of an add) to bring it back to screen depth then apply a new depth adjustment. You might try just adding the separation to X (or a fixed percentage of separation) which often works well for skybox & related items that should be at infinite depth.
Example: https://github.com/DarkStarSword/3d-fixes/commit/2f97eb02543f481ca9b6f594544e512c36ecb11e
- Instead of the usual depth adjustment, try multiplying the entire output coordinate by a value less than 1. Thanks to the perspective divide this won't change the 2D XY screen coordinates or scale (as you are scaling W by the same amount), but will reduce the depth of the object as it will change the result of the stereo correction applied by the driver (I saw this mentioned in some nVidia whitepaper).
Example (This is in HLSL with 3Dmigoto, not ASM): https://github.com/bo3b/3Dmigoto/commit/d1a5b6357ed723ff7964d70266e4b6e709652f29
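To make the second idea concrete, here is a minimal Python sketch of the arithmetic (not shader code). It assumes the driver's correction is x' = x + separation * (W - convergence), as discussed in this thread; the function names and all numeric values are made up for illustration.

```python
def driver_stereo_x(x, w, separation, convergence):
    """Clip-space X after the assumed stereo correction for one eye:
    x' = x + separation * (w - convergence)."""
    return x + separation * (w - convergence)


def screen_x(x, w):
    """Perspective divide: clip-space X to normalised screen X."""
    return x / w


def scale_position(pos, k):
    """Multiply all four components (x, y, z, w) by the same factor k."""
    return tuple(c * k for c in pos)


separation, convergence = 0.05, 2.0   # illustrative values only
pos = (3.0, 1.0, 9.5, 10.0)           # hypothetical vertex shader output
scaled = scale_position(pos, 0.5)     # "multiply by a value less than 1"

# The mono screen position is unchanged by the scale...
mono_before = screen_x(pos[0], pos[3])
mono_after = screen_x(scaled[0], scaled[3])

# ...but the on-screen stereo offset shrinks, because W is now smaller.
offset_before = screen_x(
    driver_stereo_x(pos[0], pos[3], separation, convergence), pos[3]
) - mono_before
offset_after = screen_x(
    driver_stereo_x(scaled[0], scaled[3], separation, convergence), scaled[3]
) - mono_after

print("mono screen x:", mono_before, mono_after)
print("stereo offset:", offset_before, offset_after)
```

The scale factor cancels in x / w, which is why the 2D position and size stay put while the depth changes.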
[quote="bo3b"]For the -15% parameter, I assume that you just found that from experimentation? It's an odd combination that we don't usually use for skybox/moon fixes.[/quote]
You are right about the experimentation. The strange thing is that o2 only has XYZ, and my parameter c220.x stands in as the only W coordinate. I guess that there is still something in the PS that shifts the X of the moon.
Small noob question(for my comprehension): The sky at infinite depth - is W = 1 or W = flt_max ? But it shouldn't be a negative value, right?
[quote="DarkStarSword"]If I'm understanding your code right the moon is initially in stereo, but too far back and you're trying to bring it closer?[/quote]
Yes. Initially stereo was done by Remedy in the PS, but the moon was too far away. There was something like a depth value. I changed this so that the moon was placed correctly when centered on the screen. But when you turned the view and the moon was at the edge of the screen, the depth and angle were wrong, as you can see in the first shot. I guess there is still something in the PS that makes my strange negative value work.
At the moment, if I shift convergence to really unplayable values, there is a very slight difference visible between the stars and the moon. So I will take another look at the moon, but if I don't find anything, I will leave it as it is now. :)
Folks I have a few more questions. What does the w parameter signify in a 3d coordinate space? Can you please also explain what the following instruction means?
[code]
mov r1.yz, v1.xyww
[/code]
[quote="Flint Eastwood"]Small noob question(for my comprehension): The sky at infinite depth - is W = 1 or W = flt_max ? But it shouldn't be a negative value, right?[/quote]
No, it shouldn't be negative. There must be something else happening for you to have found you needed a negative adjustment, though it's hard to say what exactly.
As for where W=infinite, consider three possible definitions of infinite depth:
Reality: An object is sufficiently far away that the rays of light from the object are parallel. In this case you could consider the images of the object in the left and right eye to be the same distance apart as the eyes themselves (or more to the point - the pupils). For a typical person this will be around 6.4cm.
Stereo image: An object at infinite depth should be drawn the same physical distance apart as the eyes to appear to be at infinity. Failing that, it should be drawn at some user customisable distance apart - this is the separation value in 3D Vision, so you wouldn't use convergence at all if trying to use this definition of an infinite depth.
Object within a game scene: The furthest away any object can be drawn is at the far clipping plane, which is an arbitrary value defined by the game (and not something we know). EDIT: So long as it is large enough, objects at the far clipping plane will be approximately (maybe slightly less than) separation distance apart, as (W - convergence) in the stereo correction formula will be almost the same as the W used in the perspective divide if W is a large number, leaving only separation. As a mathematician would put it, "As W approaches positive infinity, the stereo adjustment approaches separation".
If we wanted to place an object in the game scene with a W equal to infinity it depends on which definition we were using. If we tried to give it a W value that would put it at 100% separation, we would quickly run into this problem if we plug it into nVidia's stereo formula:
[code]
(EDIT: Added perspective divide)
separation * (W - convergence) / W = separation
separation * (W - convergence) = separation * W
W - convergence = W
[/code]
EDIT: So, that's not actually solvable unless convergence=0 (which gives a very lame 3D image as *everything* is at infinity). So the best thing to do is either use the value of the far clipping plane (which is a guess since that can vary from game to game), or ignore the original depth and convergence altogether and just use separation (EDIT: which I guess might need to be multiplied by W if W was not 1 originally)
A couple of other things to consider - the convergence value defines what objects are rendered at screen depth (as the formula gives 0 when W = convergence), however if an object is not rendered in world, but rather directly in screen space (e.g. UI elements), the W value will typically be 1 which stops the perspective divide from altering their position.
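A quick numeric check of the limit argument above, as a Python sketch. It assumes the on-screen stereo shift is separation * (W - convergence) / W, exactly as written in the formula; the function name and the separation and convergence values are invented for illustration.

```python
def stereo_offset_on_screen(w, separation, convergence):
    """On-screen stereo shift after the perspective divide:
    separation * (w - convergence) / w."""
    return separation * (w - convergence) / w


separation, convergence = 0.05, 2.0  # illustrative values only

# Walk W from in front of the convergence plane out towards "infinity".
for w in (1.0, 2.0, 10.0, 100.0, 10000.0):
    offset = stereo_offset_on_screen(w, separation, convergence)
    print(f"W = {w:>8}: offset = {offset:.6f}")
```

The offset is negative (pop-out) for W below convergence, zero at W = convergence, and creeps up towards separation without ever reaching it while convergence is positive.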
[quote="eroc_remag"]Folks I have a few more questions. What does the w parameter signify in a 3d coordinate space?[/quote]
There's a couple of things to know about this W value:
- It turns a three dimensional Cartesian coordinate into a four dimensional Homogeneous coordinate representing the same point. You might be interested to read up on this topic on Wikipedia, but don't worry if it goes over your head - for our purposes we only really need to know that W is depth*.
- The reason it's used in computer graphics is that it allows multiple transformation matrices to be multiplied together to combine their individual operation into a single matrix. If you've heard of a "model view projection" matrix that is a single matrix that is the product of three separate matrices, and this trick only works if the matrices and the coordinates they operate on have an extra dimension.
- It is used in the "perspective divide". If you take the output coordinate from a vertex shader, to find its screen X and Y coordinates you need to divide X and Y by W. If W is depth*, then as objects get further away their W value gets larger and as a consequence their size gets smaller thanks to this division.
- For certain calculations (fog & depth buffer), DirectX requires that the projection matrix be set up such that the W value of any coordinate passed through it will be equivalent to view-space Z - this is why it is the depth*.
* W is only depth after a coordinate has been multiplied by the projection matrix. Before that happens W will probably just be 1 (for some advanced fixes where view-space coordinates are involved, you might need to use Z instead of W).
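The perspective divide from the list above can be sketched in a few lines of Python. The coordinates are hypothetical vertex shader outputs chosen only to show the effect.

```python
def perspective_divide(pos):
    """Turn a homogeneous (x, y, z, w) output position into
    normalised device coordinates by dividing x, y and z by w."""
    x, y, z, w = pos
    return (x / w, y / w, z / w)


# Two hypothetical vertices with the same X and Y but different depth (W).
near_vertex = (1.0, 1.0, 1.9, 2.0)
far_vertex = (1.0, 1.0, 19.9, 20.0)

near_ndc = perspective_divide(near_vertex)
far_ndc = perspective_divide(far_vertex)

# The far vertex lands much closer to the centre of the screen,
# which is what makes distant objects look smaller.
print("near:", near_ndc[:2])
print("far: ", far_ndc[:2])
```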
[quote]Can you please also explain what the following instruction means?
[code]
mov r1.yz, v1.xyww
[/code][/quote]
That copies v1.y into r1.y and v1.w into r1.z. In shader assembly the source swizzle is positional: it is applied first to build a four component value (here v1.x, v1.y, v1.w, v1.w), and the destination mask then selects which positions of that value are written. Since the mask is .yz, only the second and third entries of the swizzle (y and w) are used, and the x at the start and the w on the end are ignored (I've always found it a bit odd that swizzles get padded with components that will just be ignored, but then again I've seen the useless instructions that x86 and ppc compilers often produce so it's no real surprise either).
One thing to keep in mind is that in the middle of a shader the x,y,z and w components of a register won't necessarily represent xyzw coordinates and may just be used as a convenient place to store numbers by the compiler. In that snippet v1 is an input with a defined meaning, but for whatever reason the compiler decided to store its y and w components in the y and z components of r1. Sometimes you may need to trace how a value flows through a shader to determine where to find it.
EDIT: Edited a few things after considering the implications of the perspective divide on the stereo correction formula.
2x Geforce GTX 980 in SLI provided by NVIDIA, i7 6700K 4GHz CPU, Asus 27" VG278HE 144Hz 3D Monitor, BenQ W1070 3D Projector, 120" Elite Screens YardMaster 2, 32GB Corsair DDR4 3200MHz RAM, Samsung 850 EVO 500G SSD, 4x750GB HDD in RAID5, Gigabyte Z170X-Gaming 7 Motherboard, Corsair Obsidian 750D Airflow Edition Case, Corsair RM850i PSU, HTC Vive, Win 10 64bit
[quote="DarkStarSword"]* W is only depth after a coordinate has been multiplied by the projection matrix. Before that happens W will probably just be 1 (for some advanced fixes where view-space coordinates are involved, you might need to use Z instead of W).[/quote]
Something like this seems to happen for my Moon-problem.
I can't find out, why the moon was shifted with convergence so badly.
Anyway, I found a cheap workaround to get the moon working perfectly with convergence and separation. I multiply the convergence value by a correction constant.
It's more of a hack than a proper fix - but as long as it works, I'm happy. ;)
[code]def c220,-0.1533,-0.000192, 0.0625, 0 // 3D stereo params
dcl_2d s1 // stereo texture sampler
...
// mul o2.xyz, r0.x, r1
mul r3.xyz, r0.x, r1
texldl r5, c220.z, s1 // retrieve values from stereo texture
mul r5.y, r5.y, c220.y // weird Convergence-correction
add r5.w,c220.x,-r5.y //(W - Convergence)
//MultiplyAdd Xnew(r3.x) = Separation(r5.x) * (W-Conv r5.w)) + Xold(r3.x)
mad r3.x, r5.x, r5.w, r3.x
mov o2.xyz, r3.xyz // mov temp register to output
[/code]
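For anyone who finds the asm hard to follow, this is a Python sketch of the arithmetic in the snippet above. It assumes, as the comments in the snippet suggest, that r5.x holds separation and r5.y holds convergence after the stereo texture lookup; the function name and the sample inputs are hypothetical.

```python
def moon_x(x_old, separation, convergence,
           depth_const=-0.1533, conv_scale=-0.000192):
    """Model of the three arithmetic instructions in the snippet:
    x_new = x_old + separation * (depth_const - convergence * conv_scale)."""
    scaled_conv = convergence * conv_scale    # mul r5.y, r5.y, c220.y
    depth_term = depth_const - scaled_conv    # add r5.w, c220.x, -r5.y
    return x_old + separation * depth_term    # mad r3.x, r5.x, r5.w, r3.x


# Illustrative inputs only; real values come from the stereo texture.
print(moon_x(0.0, 1.0, 0.0))
print(moon_x(0.0, 1.0, 100.0))
```

Because conv_scale is tiny and negative, convergence only nudges the fixed depth term slightly upwards, which matches the observation that the moon barely moves with convergence.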
Some people on this thread may be interested in this - I'm working on a fix for Eleusis at the moment and added a UI depth adjustment, but the game has a drop shadow around the hand cursor in-game, which was moved to the right by the depth adjustment and became misaligned from the cursor and clipped.
In order to solve this, I've come up with a new formula for doing a UI depth adjustment that works by multiplying all four components of the output position by an amount that has an equivalent result to the UI formula:
position = position * -convergence / (UI_depth - 1)
Where UI_depth is the percentage of separation that we are adjusting the UI to. The only restriction is that this cannot adjust the UI to infinite depth as that will give a divide by zero - I've generally found that an infinite UI adjustment looks wrong anyway, so I usually use a maximum of 99.5% of separation (a value I came up with ages ago experimenting in Skyrim).
There's a bit more explanation on the commit where I made this change, as well as the asm instructions necessary to do this calculation:
https://github.com/DarkStarSword/3d-fixes/commit/af442575fa010e3ba0ee98e9defbc81d30c3080a
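A small Python check of why the scaling formula above gives the same result as a direct UI depth adjustment. It assumes the UI vertex originally has W = 1, that the driver's correction is x' = x + separation * (W - convergence), and that UI_depth is expressed as a fraction (0.995 for 99.5%). The function names and numbers are made up for illustration.

```python
def ui_scale(convergence, ui_depth):
    """Factor to multiply all four position components by:
    -convergence / (ui_depth - 1)."""
    if ui_depth == 1.0:
        raise ValueError("infinite UI depth would divide by zero")
    return -convergence / (ui_depth - 1.0)


def stereo_offset_on_screen(w, separation, convergence):
    """On-screen stereo shift applied by the driver after the divide."""
    return separation * (w - convergence) / w


separation, convergence = 0.05, 2.0   # illustrative values only
ui_depth = 0.5                        # place the UI at 50% of separation

# A UI vertex with W = 1 ends up with W equal to the scale factor.
new_w = 1.0 * ui_scale(convergence, ui_depth)
offset = stereo_offset_on_screen(new_w, separation, convergence)

print("new W:", new_w)
print("offset:", offset, "target:", separation * ui_depth)
```

Since X, Y, Z and W are all scaled together, the 2D position is untouched and only the driver's own stereo correction changes, so anything drawn with the same shader (such as the drop shadow) moves consistently.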
Thanks for the explanation DarkStarSword. I'll read up more on Homogeneous coordinate systems to better understand this. I didn't understand this part though-> W is depth*. Could you please elaborate on this?
Edit: Found this excellent article that explains Homogeneous coordinate space. http://www.tomdalling.com/blog/modern-opengl/explaining-homogenous-coordinates-and-projective-geometry/
That's an excellent article you found on Homogeneous coordinates :)
To take a vertex of an object and find where it is on the screen, the vertex's coordinate is multiplied first by a model matrix (AKA world matrix) to apply any local transformations (like rotating a character to face the right direction), then by the view matrix (AKA camera matrix) to move it to a point relative to the camera, then finally by the projection matrix which applies the FOV and perspective such that once it is translated back into three dimensional coordinates it will be at the right position on the screen.
Before the coordinate is multiplied by the projection matrix it will be in view space relative to the camera's position (at 0,0,0) with the Z dimension lined up with the direction the camera is pointing. This means that the value of the Z component will be the distance of the object into the scene, or the depth (not quite the same as the distance from the camera since it only counts distance in one dimension).
A typical projection matrix in a game will usually look like this:
[code]
[ h  0  0        0 ]    h = cot(horizontal_fov / 2)
[ 0  v  0        0 ]    v = cot(vertical_fov / 2)
[ 0  0  q        1 ]    near & far are the minimum and maximum draw distance
[ 0  0  -q*near  0 ]    q = far / (far - near)
[/code]
The fourth column is 0,0,1,0. This means when you multiply a coordinate by the projection matrix, the resulting W component will be the value of the original Z component, as X*0 + Y*0 + Z*1 + W*0 = Z. So, before multiplying the coordinate by the projection matrix depth into the scene is Z, and afterwards it is W (note that the stereo correction formula is different in view space as well, so you can't simply do a correction using view-space Z instead of projection-space W, but I don't want to get into too much detail on that yet as that is an advanced topic for later lessons).
From a pure maths standpoint it isn't required that the final column be 0,0,1,0, so it is possible that some games may use a different projection matrix and view space Z won't necessarily equal projection space W, however this is unlikely because DirectX depends on these being equal for certain calculations, as explained in this MSDN article under "A W-Friendly Projection Matrix":
http://msdn.microsoft.com/en-us/library/windows/desktop/bb147302(v=vs.85).aspx
That's an excellent article you found on Homogeneous coordinates :)
2x Geforce GTX 980 in SLI provided by NVIDIA, i7 6700K 4GHz CPU, Asus 27" VG278HE 144Hz 3D Monitor, BenQ W1070 3D Projector, 120" Elite Screens YardMaster 2, 32GB Corsair DDR4 3200MHz RAM, Samsung 850 EVO 500G SSD, 4x750GB HDD in RAID5, Gigabyte Z170X-Gaming 7 Motherboard, Corsair Obsidian 750D Airflow Edition Case, Corsair RM850i PSU, HTC Vive, Win 10 64bit
[quote="bo3b"]
For this one, I'd start by making sure the code is still executing by putting a mov r1, c200.w instruction to set it all to zero and make sure it's still being assembled and running. With F10 reloads, it's easy to incrementally try things.[/quote]
After a long break I started working on the shader from scratch and it worked. Yippee! Actually, watching your lesson #6 helped, Bo3b. Now I understand why that code is put there and what it does. I had put off watching it for too long, but now I have a good understanding of how the stereo projection works. You have done an excellent job of explaining the whole thing in simple words!
I do have a question though. In the tutorial you say that setting the following line is important for the fix to get activated:
[code]abs r0.w, c200.y[/code]
Can you please explain why that is so?
And can you also explain this line of code please?
[code]if_ne r0.w, -r0.w[/code]
This triggers the stereo-specific shader code. Now why/how would this condition be false when stereo is disabled? I couldn't find details of this instruction on MSDN's PS or VS instructions page.
To add to Eroc's questions,
In the verbose canonical shader code it is stated:
// At this point r0 is the output position, correctly
// placed, but without stereo
Now, as I understand it, that applies to the r0 in the lines that follow:
texldl r30, c200.z, s0
add r30.w, r0.w, -r30.y
mul r30.z, r30.x, r30.w
add r0.x, r0.x, r30.z
And it is up to me to
1) move this correct placement into r0 before this text block
2) move the corrected r0.x out again after it.
Main question is though: how do I find the parameter(s) that holds this correct placement? In other words, at what point do I need to break into the code to make this change?
I've been going through the old Rome II fix by Mike, hoping to find some pattern, but I haven't found one yet.
[quote="eroc_remag"]I do have a question though. In the tutorial you say that setting the following line is important for the fix to get activated:
[code]abs r0.w, c200.y[/code]
Can you please explain why that is so?[/quote]
I can't recall the context of this in the lessons, so Bo3b might have to answer. In general terms abs is "absolute value", so that instruction will behave the same as mov if c200.y is positive, but if c200.y is negative it will store the positive version of it in r0.w.
[quote]And can you also explain this line of code please?
[code]if_ne r0.w, -r0.w[/code]
This triggers the stereo-specific shader code. Now why/how would this condition be false when stereo is disabled? I couldn't find details of this instruction on MSDN's PS or VS instructions page.[/quote]
I also don't recall this exact form from the lessons, but assuming that the separation value was copied into r0.w just above it, it will do as you describe: the only time separation == -separation is when separation is zero, which is when stereo is disabled. Since this is a "not equals" test, the code inside the if block will only run when stereo is enabled.
Generally we would use a more explicit test for not equal to zero - I would expect the above form to be something that a compiler would do or someone trying to be tricky:
[code]
def c220, 0, 1, 0.0625, 0.5
...
if_ne r0.w, c220.x
...
endif
[/code]
Most stereo adjustments don't need to be explicitly enabled with a test like this, as the maths usually works out that they have no effect when 3D is disabled (because they are almost always multiplied by separation, which is zero when 3D is disabled). It is good practice to do this if you are disabling an effect though, so it will still show up when 3D is disabled.
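As a rough illustration of the two points above, here is a small Python sketch of the logic (not shader code). It assumes that separation is exactly zero when 3D is disabled, as described in this thread; the function names and the "disable an effect" helper are hypothetical.

```python
def stereo_enabled(separation):
    """Mirror of `if_ne r0.w, -r0.w`: a number differs from its own
    negation unless it is zero, so this is a "not equal to zero" test."""
    return separation != -separation

def disable_effect_in_stereo(effect_value, separation):
    """Hypothetical use of the test: blank an effect only while 3D is on,
    so it still shows up when 3D is disabled."""
    if stereo_enabled(separation):
        return 0.0
    return effect_value

print(stereo_enabled(0.0))                 # separation is zero: test fails
print(stereo_enabled(0.05))                # non-zero separation: test passes
print(disable_effect_in_stereo(0.8, 0.0))  # effect survives with 3D off
```

An adjustment that is multiplied by separation needs no such guard, because the whole term collapses to zero by itself when separation is zero.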
[quote="Muizer"]// At this point r0 is the output position, correctly
// placed, but without stereo
Now, as I understand it, that applies to the r0 in the lines that follow:
texldl r30, c200.z, s0
add r30.w, r0.w, -r30.y
mul r30.z, r30.x, r30.w
add r0.x, r0.x, r30.z
And it is up to me to
1) move this correct placement into r0 before this text block
2) move the corrected r0.x out again after it.[/quote]
This is an example where it was determined that r0 contained the position. If you determined that say, r6 had the position instead you would replace the three occurrences of r0 in that code with r6.
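For anyone who finds the maths easier to follow outside of assembly, here is a Python sketch of what I understand those four instructions to compute. It assumes that the texldl fills r30.x with separation and r30.y with convergence, which matches how the formula is described elsewhere in this thread; the function name is made up.

```python
def apply_stereo_correction(position, separation, convergence):
    """position is an (x, y, z, w) tuple holding the output position."""
    x, y, z, w = position
    depth_term = w - convergence      # add r30.w, r0.w, -r30.y
    offset = separation * depth_term  # mul r30.z, r30.x, r30.w
    return (x + offset, y, z, w)      # add r0.x, r0.x, r30.z

# A vertex whose W equals convergence sits at screen depth and is not moved.
print(apply_stereo_correction((0.0, 0.0, 5.0, 2.0), 0.5, 2.0))
# A deeper vertex (larger W) is pushed sideways in proportion to separation.
print(apply_stereo_correction((0.0, 0.0, 5.0, 4.0), 0.5, 2.0))
```

Only X and W of the position are read and only X is written, which is why swapping r0 for another register means changing exactly those three operands.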
[quote]Main question is though: how do I find the parameter(s) that holds this correct placement? In other words, at what point do I need to break into the code to make this change?[/quote]
A lot of it is just experimentation - if you suspect that a parameter might hold the correct placement add the prime directive and see what happens. Here are two common techniques I have used for this type of experimentation:
1. If I'm working on a vertex shader I find all the lines that set output registers and replace the output register with a temporary register. To keep it simple I use a convention of adding 10 or 20 to the register number, so the final digit will match the original output register - check if the shader is already using temporary registers starting with r1N or r2N first as you don't want to use a temporary register that is already used in the shader.
I then add a series of mov instructions to the end of the shader to copy these temporary registers to the output registers:
[code]
mov o0, r20
mov o1.xy, r21 // Be sure to use the same output mask as the dcl_texcoord o1.xy
mov o2, r22
...
[/code]
Now, just above those mov instructions I add the prime directive and try it on each one, reload shaders in the game and see what changes. Sometimes an effect will disappear or start flickering all over the place - that tells me that the particular output I tried is unlikely to be the correct one. Sometimes I'll get lucky and find the correct adjustment using this technique. Other times I'll notice that an effect is still broken, but has switched eyes (e.g. a halo in the left eye moved to the right and the halo in the right eye moved to the left - can be hard to see if you aren't looking for it), telling me that I may need to add a multiply by 0.5 to the prime directive.
Instead of the prime directive you might also try setting each output to zero then reloading shaders and looking for what part of the effect disappears - that can tell you a lot about what each of the outputs do, which might help you determine which might be relevant and which are definitely unrelated.
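If you would rather not do the register renaming by hand, the steps of this first technique can be sketched as a small Python helper. To be clear, this is a hypothetical illustration and not one of the scripts from the 3d-fixes repository: it assumes simple vs_3_0 style text, treats every `oN` token outside of `dcl_` lines as an output write, and uses the "add 20" convention described above.

```python
import re

def redirect_outputs(shader_text, offset=20):
    """Redirect writes to oN into temporaries r(N+offset) and append mov
    instructions that copy the temporaries back to the real outputs."""
    lines = shader_text.splitlines()
    used_temps = {int(n) for n in re.findall(r"\br(\d+)\b", shader_text)}

    # Remember each declared output and its mask, e.g. ".xy" or "".
    masks = {}
    for line in lines:
        decl = re.match(r"\s*dcl_\w+\s+o(\d+)(\.[xyzw]+)?", line)
        if decl:
            masks[int(decl.group(1))] = decl.group(2) or ""

    # Refuse to reuse a temporary the shader already depends on.
    clash = sorted(n + offset for n in masks if n + offset in used_temps)
    if clash:
        raise ValueError("temporaries already in use: %s" % clash)

    body = []
    for line in lines:
        if line.lstrip().startswith("dcl_"):
            body.append(line)  # declarations keep the real output register
        else:
            body.append(re.sub(r"\bo(\d+)\b",
                               lambda m: "r%d" % (int(m.group(1)) + offset),
                               line))
    # Copy the temporaries to the outputs, using the declared masks.
    body.extend("mov o%d%s, r%d" % (n, masks[n], n + offset)
                for n in sorted(masks))
    return "\n".join(body)

sample = "\n".join([
    "vs_3_0",
    "dcl_position o0",
    "dcl_texcoord o1.xy",
    "dcl_position v0",
    "mov o0, r1",
    "mov o1.xy, r2",
])
print(redirect_outputs(sample))
```

The stereo adjustment you want to try would then go just above the appended mov lines, operating on whichever r2N you suspect.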
2. As you become more familiar with shader assembly you will start to spot patterns. For example:
[code]
vs_3_0
...
dcl_position o3 // Note which output register contains the position
...
dcl_position v0 // Input position - unimportant
...
mov o3, r7 // The output position is set from r7
...
XXX rX, rX, r7 // and r7 is *READ* again later on in this shader!
...
XXX r7, rX, rX // r7 is being *WRITTEN* - stop searching if you see this
[/code]
In this pattern we know that r7 contains the output position as it is copied to the output position register. Now, if the value of r7 is *read* again below (or above!) that point that is a really good indication that you will very likely need to add the prime directive just after the 'mov o3, r7' line.
I've added an example of another line that *writes* to r7 below that point - a line like that is not part of the pattern you are looking for, but signifies that r7 will no longer contain the position and is unimportant below that line (likewise you would look for something similar above the 'mov o3, r7' to find the first line where r7 contains the position, then look for reads between the two).
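The search described above (find the register copied into the position output, then look for later reads until it is overwritten) can be sketched in Python. This is only a rough, hypothetical helper: it assumes one instruction per line, a plain `mov oN, rK` for the position, and it only looks below the mov, not above it.

```python
import re

def operands(line):
    """Return (destination, [sources]) register names for one instruction."""
    code = line.split("//")[0].replace(",", " ").split()
    if len(code) < 2:
        return None, []
    regs = [tok.lstrip("-").split(".")[0] for tok in code[1:]]
    return regs[0], regs[1:]

def find_position_reads(shader_text):
    """Return (temp_register, [line indices that read it after the mov])."""
    lines = shader_text.splitlines()
    out_reg = None
    for line in lines:
        m = re.match(r"\s*dcl_position\s+(o\d+)", line)
        if m:
            out_reg = m.group(1)
            break
    if out_reg is None:
        return None, []
    for i, line in enumerate(lines):
        dest, srcs = operands(line)
        if line.split()[:1] == ["mov"] and dest == out_reg and srcs:
            temp = srcs[0]
            reads = []
            for j in range(i + 1, len(lines)):
                d, s = operands(lines[j])
                if temp in s:
                    reads.append(j)  # the position is read again here
                if d == temp:
                    break            # overwritten: no longer the position
            return temp, reads
    return None, []

sample_vs = "\n".join([
    "vs_3_0",
    "dcl_position o3",
    "dcl_position v0",
    "mov o3, r7",
    "add r1, r2, r7",
    "mul r7, r1, r2",
    "add r3, r7, r1",
])
print(find_position_reads(sample_vs))
```

A non-empty list of reads is the hint that a stereo adjustment is probably needed right after the mov; the last add in the sample is ignored because r7 has been overwritten by then.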
If r7 is *read* above the 'mov o3, r7' line (extremely common in Unity games) it's a little more involved as you want to adjust every use of it *except* o3 - here's an example of this from one of my earlier fixes:
https://github.com/DarkStarSword/3d-fixes/commit/f4669d43271fbc02923029c6b108754d9b228c30
Nowadays I use a script (available in my 3d-fixes repository) to automatically fix this pattern (no guarantees that it won't break something else), which makes changes like this:
https://github.com/DarkStarSword/3d-fixes/blob/b84476c1d40b29c5ebb73a4dc56f2bf575a1ea4f/DreadOut/ShaderOverride/VertexShaders/14E1AD4F.txt
Woot! Got a result!
What I did was:
- find dcl_position and see which o number is attached. In this case o0
- find the spots in the shader where it gets its input to w and x coordinates
- work out the prime directive with that input, using placeholder variables where necessary.
Does that make sense or am I just incredibly lucky?
The result is still ever so slightly off, but it's nearly there. Above all, it's the first time I see any movement at all!
I thought I could adapt the moon fix from "Alan Wake" to "Alan Wake A N". I thought this would be an easy start, but it doesn't work correctly, and it seems it doesn't really work in Alan Wake either; there it is just veiled by clouds and fog.
The moon is not a fixed skybox texture but a sprite that always faces the viewport (like enemies in the old DOOM games). So the moon shifts depth with its position on the screen and has different angles in the left and right eye:
If you remember, Remedy claimed that "AW" would be 3D Vision compatible (it was not and is not). They tried to make compatible shaders, and some of them seem to have carried over into "AWAN". Some code and a 3D stereo sampler are found in the pixel shaders, but they do really weird things to the moon. I killed this code in the PS and tried to fix it in the VS:
This is the code and it works pretty well:
The problem is that I can't factor the convergence in. If I do (line 8 or 9), the moon shifts depth unnaturally with convergence.
As the moon is at the far end like the stars, there is nearly no change with convergence, so I'm thinking of leaving it as it is.
Or maybe you have a hint?
PS: Don't mind the square stars; I only changed the PS for a better perception of the depth difference between stars and moon ;)
For the moon convergence, it's fine to leave it as is. Depending upon the game and how it's constructed, we sometimes use only the separation value and ignore the convergence, especially for full distance stuff like the skybox or moon. To a large degree, we are just trying to get something that is playable, not necessarily something 'correct'. The reason for that is because we have no control over the developer, and a lot of times they cheat the model in ways that we cannot work around.
For the -15% parameter, I assume that you just found that from experimentation? It's an odd combination that we don't usually use for skybox/moon fixes.
Going with the VS is the right spot for this fix. But usually we'll just add in a maximum separation to push it to full depth. So normally I'd expect c200.x to be 1.0. Having it be negative also suggests it was initially popping out of screen instead. There are lots of variants though, so really whatever works is good.
Be sure that changing convergence doesn't affect it, and that it looks right on both low separation and high separation, and it should be good to go. Great stuff.
Acer H5360 (1280x720@120Hz) - ASUS VG248QE with GSync mod - 3D Vision 1&2 - Driver 372.54
GTX 970 - i5-4670K@4.2GHz - 12GB RAM - Win7x64+evilKB2670838 - 4 Disk X25 RAID
SAGER NP9870-S - GTX 980 - i7-6700K - Win10 Pro 1607
Latest 3Dmigoto Release
Bo3b's School for ShaderHackers
If your solution is working well enough you probably should just leave it at that, but if you wanted to experiment a bit further here's a couple of ideas - I can't say if these will work in this case, just something you might experiment with:
- Try undoing the stereo correction (do a subtract in the last step instead of an add) to bring it back to screen depth then apply a new depth adjustment. You might try just adding the separation to X (or a fixed percentage of separation) which often works well for skybox & related items that should be at infinite depth.
Example: https://github.com/DarkStarSword/3d-fixes/commit/2f97eb02543f481ca9b6f594544e512c36ecb11e
- Instead of the usual depth adjustment, try multiplying the entire output coordinate by a value less than 1. Thanks to the perspective divide this won't change the 2D XY screen coordinates or scale (as you are scaling W by the same amount), but will reduce the depth of the object as it will change the result of the stereo correction applied by the driver (I saw this mentioned in some nVidia whitepaper).
Example (This is in HLSL with 3Dmigoto, not ASM): https://github.com/bo3b/3Dmigoto/commit/d1a5b6357ed723ff7964d70266e4b6e709652f29
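To show why the second idea works, here is a small Python sketch of the arithmetic. It assumes the driver applies the usual correction of separation * (W - convergence) to X before the perspective divide, as discussed in this thread; the numbers and function names are invented for illustration.

```python
def screen_xy(position):
    """Perspective divide: 2D screen position of an (x, y, z, w) coordinate."""
    x, y, z, w = position
    return (x / w, y / w)

def stereo_offset_on_screen(w, separation, convergence):
    """Assumed driver correction, expressed after the perspective divide."""
    return separation * (w - convergence) / w

def scale_position(position, factor):
    """Multiply all four components of the output coordinate."""
    return tuple(c * factor for c in position)

pos = (4.0, 2.0, 8.0, 8.0)
scaled = scale_position(pos, 0.5)

# Same 2D position, because X, Y and W were all scaled together...
print(screen_xy(pos), screen_xy(scaled))
# ...but a smaller W means a smaller stereo offset, i.e. less depth.
print(stereo_offset_on_screen(pos[3], 1.0, 2.0),
      stereo_offset_on_screen(scaled[3], 1.0, 2.0))
```

The scale factor is the only knob here, so it is easy to experiment with different values and reload the shader.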
2x Geforce GTX 980 in SLI provided by NVIDIA, i7 6700K 4GHz CPU, Asus 27" VG278HE 144Hz 3D Monitor, BenQ W1070 3D Projector, 120" Elite Screens YardMaster 2, 32GB Corsair DDR4 3200MHz RAM, Samsung 850 EVO 500G SSD, 4x750GB HDD in RAID5, Gigabyte Z170X-Gaming 7 Motherboard, Corsair Obsidian 750D Airflow Edition Case, Corsair RM850i PSU, HTC Vive, Win 10 64bit
Alienware M17x R4 w/ built in 3D, Intel i7 3740QM, GTX 680m 2GB, 16GB DDR3 1600MHz RAM, Win7 64bit, 1TB SSD, 1TB HDD, 750GB HDD
Pre-release 3D fixes, shadertool.py and other goodies: http://github.com/DarkStarSword/3d-fixes
Support me on Patreon: https://www.patreon.com/DarkStarSword or PayPal: https://www.paypal.me/DarkStarSword
You are right about the experimentation. The strange thing is that o2 only has XYZ, and my parameter c220.x is the only W coordinate. I guess that there is still something in the PS that shifts the X of the moon.
Small noob question (for my comprehension): for the sky at infinite depth, is W = 1 or W = flt_max? But it shouldn't be a negative value, right?
Yes. Initially stereo was done by Remedy in the PS, but the moon was too far away. There was something like a depth value. I changed this so that the moon was placed correctly when centered on the screen. But when you turned the view and the moon was at the edge of the screen, the depth and angle were wrong, as you can see in the first shot. I guess there is still something in the PS that makes my strange negative value work.
At the moment, if I shift convergence to really unplayable values, there is a very slight difference visible between the stars and the moon. So I will take another look at the moon, but if I don't find anything, I will leave it as it is now. :)
No, it shouldn't be negative. There must be something else happening for you to have found you needed a negative adjustment, though it's hard to say what exactly.
As for where W=infinite, consider three possible definitions of infinite depth:
Reality: An object is sufficiently far away that the rays of light from the object are parallel. In this case you could consider the images of the object in the left and right eye to be the same distance apart as the eyes themselves (or more to the point - the pupils). For a typical person this will be around 6.4cm.
Stereo image: An object at infinite depth should be drawn the same physical distance apart as the eyes to appear to be at infinity. Failing that, it should be drawn at some user customisable distance apart - this is the separation value in 3D Vision, so you wouldn't use convergence at all if trying to use this definition of an infinite depth.
Object within a game scene: The furthest away any object can be drawn is at the far clipping plane, which is an arbitrary value defined by the game (and not something we know). EDIT: So long as it is large enough, objects at the far clipping plane will be approximately (maybe slightly less than) separation distance apart, as (W - convergence) in the stereo correction formula will be almost the same as the W in the perspective divide if W is a large number, leaving only separation. As a mathematician would put it: "As W approaches positive infinity, the stereo adjustment approaches separation".
If we wanted to place an object in the game scene with a W equal to infinity it depends on which definition we were using. If we tried to give it a W value that would put it at 100% separation, we would quickly run into this problem if we plug it into nVidia's stereo formula:
EDIT: So, that's not actually solvable unless convergence=0 (which gives a very lame 3D image as *everything* is at infinity). So the best thing to do is either use the value of the far clipping plane (which is a guess since that can vary from game to game), or ignore the original depth and convergence altogether and just use separation (EDIT: which I guess might need to be multiplied by W if W was not 1 originally)
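A quick numeric check of that reasoning, as a Python sketch. It assumes the on-screen stereo offset is separation * (W - convergence) / W, i.e. the stereo correction followed by the perspective divide; the separation and convergence values are arbitrary.

```python
def stereo_offset_on_screen(w, separation, convergence):
    """Assumed stereo correction after the perspective divide."""
    return separation * (w - convergence) / w

separation, convergence = 0.1, 1.5

# At W == convergence the object sits at screen depth (zero offset).
print(stereo_offset_on_screen(convergence, separation, convergence))

# As W grows the offset creeps up towards separation but never reaches it,
# because reaching it exactly would require convergence to be zero.
for w in (10.0, 1000.0, 1000000.0):
    print(w, stereo_offset_on_screen(w, separation, convergence))
```

This is why adding plain separation (ignoring W and convergence) is a practical stand-in for "infinite depth" on skybox-like objects.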
A couple of other things to consider - the convergence value defines what objects are rendered at screen depth (as the formula gives 0 when W = convergence), however if an object is not rendered in world, but rather directly in screen space (e.g. UI elements), the W value will typically be 1 which stops the perspective divide from altering their position.
There's a couple of things to know about this W value:
- It turns a three dimensional Cartesian coordinate into a four dimensional Homogeneous coordinate representing the same point. You might be interested to read up on this topic on Wikipedia, but don't worry if it goes over your head - for our purposes we only really need to know that W is depth*.
- The reason it's used in computer graphics is that it allows multiple transformation matrices to be multiplied together to combine their individual operation into a single matrix. If you've heard of a "model view projection" matrix that is a single matrix that is the product of three separate matrices, and this trick only works if the matrices and the coordinates they operate on have an extra dimension.
- It is used in the "perspective divide". If you take the output coordinate from a vertex shader, to find its screen X and Y coordinates you need to divide X and Y by W. If W is depth*, then as objects get further away their W value gets larger and as a consequence their size gets smaller thanks to this division.
- For certain calculations (fog & depth buffer), DirectX requires that the projection matrix be set up such that the W value of any coordinate passed through it will be equivalent to world space Z - this is why it is the depth*
* W is only depth after a coordinate has been multiplied by the projection matrix. Before that happens W will probably just be 1 (for some advanced fixes where view-space coordinates are involved, you might need to use Z instead of W).
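The point about combining matrices can be checked with a few lines of Python. This is just a sketch with hand-rolled 4x4 helpers and made-up model and view matrices (a uniform scale and a translation), using the same row-vector convention as the projection matrix earlier in the thread.

```python
def mat_mul(a, b):
    """Multiply two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def vec_mul(v, m):
    """Multiply a 4-component row vector by a 4x4 matrix."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]

def scale(s):
    return [[s, 0.0, 0.0, 0.0],
            [0.0, s, 0.0, 0.0],
            [0.0, 0.0, s, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def translation(tx, ty, tz):
    # The translation lives in the fourth row, which only has an effect
    # because the coordinate carries W = 1 as an extra component.
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [tx, ty, tz, 1.0]]

model = scale(2.0)                 # hypothetical model matrix
view = translation(0.0, 0.0, 10.0) # hypothetical view matrix
combined = mat_mul(model, view)    # one matrix doing both steps

vertex = [1.0, 2.0, 3.0, 1.0]
step_by_step = vec_mul(vec_mul(vertex, model), view)
all_at_once = vec_mul(vertex, combined)
print(step_by_step, all_at_once)
```

A 3x3 matrix could not express the translation at all, which is the practical reason for the fourth dimension.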
That copies v1.x into r1.y and v1.y into r1.z. The extra ww on the end is ignored in that case since the destination register has only specified two components in its swizzle (I've always found it a bit odd that shader compilers keep adding these extra components in source swizzles that will just be ignored, but then again I've seen the useless instructions that x86 and ppc compilers often produce so it's no real surprise either).
One thing to keep in mind is that in the middle of a shader the x,y,z and w components of a register won't necessarily represent xyzw coordinates and may just be used as a convenient place to store numbers by the compiler. In that snippet v1.xy will match some xy coordinates since that is an input with a defined meaning, but for whatever reason the compiler decided to store them in the y and z components of r1. Sometimes you may need to trace how a value flows through a shader to determine where to find it.
EDIT: Edited a few things after considering the implications of the perspective divide on the stereo correction formula.
Something like this seems to be happening with my moon problem.
I can't work out why the moon was shifted so badly with convergence.
Anyway, I found a cheap workaround to get the moon working perfectly with convergence and separation: I multiply the convergence constant by a correction constant.
It's more of a hack than a proper fix, but as long as it works, I'm happy. ;)
In order to solve this, I've come up with a new formula for doing a UI depth adjustment that works by multiplying all four components of the output position by an amount that has an equivalent result to the UI formula:
position = position * -convergence / (UI_depth - 1)
Where UI_depth is the fraction of separation that we are adjusting the UI to (so 0.995 for 99.5%). The only restriction is that this cannot adjust the UI to infinite depth (UI_depth = 1), as that will give a divide by zero. I've generally found that an infinite UI adjustment looks wrong anyway, so I usually use a maximum of 99.5% of separation (a value I came up with ages ago experimenting in Skyrim).
There's a bit more explanation on the commit where I made this change, as well as the asm instructions necessary to do this calculation:
https://github.com/DarkStarSword/3d-fixes/commit/af442575fa010e3ba0ee98e9defbc81d30c3080a
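Here is a small Python sketch of why scaling all four components can work. It is only an illustration under two assumptions that are not stated in the post: the UI vertex starts with w = 1, and the driver then applies the usual stereo correction x' = x + separation * (w - convergence). The function names are made up for this example.

```python
def adjust_ui_depth(position, convergence, ui_depth):
    """Scale all four components as in the formula above (ui_depth is a fraction)."""
    if ui_depth == 1:
        raise ValueError("infinite UI depth would divide by zero")
    scale = -convergence / (ui_depth - 1)
    return tuple(component * scale for component in position)


def driver_stereo(position, separation, convergence):
    """Assumed driver-side correction: x' = x + separation * (w - convergence)."""
    x, y, z, w = position
    return (x + separation * (w - convergence), y, z, w)


def screen_x(position):
    """Perspective divide for the x component."""
    return position[0] / position[3]


ui_vertex = (0.25, 0.5, 0.0, 1.0)  # assumed UI vertex with w = 1
separation, convergence, ui_depth = 0.06, 1.2, 0.995

adjusted = adjust_ui_depth(ui_vertex, convergence, ui_depth)
shift = screen_x(driver_stereo(adjusted, separation, convergence)) - screen_x(ui_vertex)
print(shift, separation * ui_depth)
```

Under those assumptions the on-screen shift comes out as ui_depth times separation, and because x, y, z and w are all scaled together, the position after the perspective divide is otherwise unchanged.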
Edit: Found this excellent article that explains Homogeneous coordinate space. http://www.tomdalling.com/blog/modern-opengl/explaining-homogenous-coordinates-and-projective-geometry/
To take a vertex of an object and find where it is on the screen, the vertex's coordinate is multiplied first by a model matrix (AKA world matrix) to apply any local transformations (like rotating a character to face the right direction), then by the view matrix (AKA camera matrix) to move it to a point relative to the camera, then finally by the projection matrix, which applies the FOV and perspective such that once it is converted back into three dimensional coordinates it will be at the right position on the screen.
Before the coordinate is multiplied by the projection matrix it will be in view space, relative to the camera's position (at 0,0,0), with the Z axis lined up with the direction the camera is pointing. This means that the value of the Z component will be the distance of the object into the scene, or the depth (not quite the same as the distance from the camera, since it only counts distance along one axis).
A typical projection matrix in a game will usually look like this:
The fourth column is 0,0,1,0. This means when you multiply a coordinate by the projection matrix, the resulting W component will be the value of the original Z component, as X*0 + Y*0 + Z*1 + W*0 = Z. So, before multiplying the coordinate by the projection matrix depth into the scene is Z, and afterwards it is W (note that the stereo correction formula is different in view space as well, so you can't simply do a correction using view-space Z instead of projection-space W, but I don't want to get into too much detail on that yet as that is an advanced topic for later lessons).
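To make the W = Z relationship concrete, here is a minimal Python sketch. It assumes the left-handed, row-vector perspective layout documented for D3DXMatrixPerspectiveFovLH; a particular game's matrix may differ, and the function names and numbers are only for illustration.

```python
import math


def perspective_fov_lh(fov_y, aspect, z_near, z_far):
    """Left-handed perspective matrix in the D3DX row-vector layout (assumed)."""
    y_scale = 1.0 / math.tan(fov_y / 2.0)
    x_scale = y_scale / aspect
    q = z_far / (z_far - z_near)
    return [
        [x_scale, 0.0, 0.0, 0.0],
        [0.0, y_scale, 0.0, 0.0],
        [0.0, 0.0, q, 1.0],
        [0.0, 0.0, -z_near * q, 0.0],
    ]


def transform(vector, matrix):
    """Row vector times matrix, as in vector * projection."""
    return [sum(vector[r] * matrix[r][c] for r in range(4)) for c in range(4)]


projection = perspective_fov_lh(math.radians(60.0), 16.0 / 9.0, 0.1, 1000.0)
view_position = [1.0, 2.0, 5.0, 1.0]  # view space, depth Z = 5
clip_position = transform(view_position, projection)

print([row[3] for row in projection])  # the fourth column
print(clip_position[3])                # W after projection
```

The fourth column is what produces W, so with this layout the projected W is exactly the view-space Z of the input.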
From a pure maths standpoint it isn't required that the final column be 0,0,1,0, so it is possible that some games may use a different projection matrix and view space Z won't necessarily equal projection space W, however this is unlikely because DirectX depends on these being equal for certain calculations, as explained in this MSDN article under "A W-Friendly Projection Matrix":
http://msdn.microsoft.com/en-us/library/windows/desktop/bb147302(v=vs.85).aspx
After a long break I started working on the shader from scratch and it worked. Yippee! Actually, watching your lesson #6 helped, Bo3b. Now I understand why that code is put there and what it does. I had put off watching it for too long, but now I have a good understanding of how the stereo projection works. You have done an excellent job of explaining the whole thing in simple words!
I do have a question though. In the tutorial you say that setting the following line is important for the fix to get activated:
Can you please explain why that is so?
And can you also explain this line of code please?
This triggers the stereo-specific shader code. Now why/how would this condition be false when stereo is disabled? I couldn't find details of this instruction on MSDN's PS or VS instructions page.
In the verbose canonical shader code it is stated:
// At this point r0 is the output position, correctly
// placed, but without stereo
Now, as I understand it, that applies to the r0 in the lines that follow:
texldl r30, c200.z, s0
add r30.w, r0.w, -r30.y
mul r30.z, r30.x, r30.w
add r0.x, r0.x, r30.z
And it is up to me to
1) move this correct placement into r0 before this text block
2) move the corrected r0.x out again after it.
Main question is though: how do I find the parameter(s) that holds this correct placement? In other words, at what point do I need to break into the code to make this change?
I've been going through the old Rome II fix by Mike, hoping to find some pattern, but I haven't found one yet.
I can't recall the context of this in the lessons, so Bo3b might have to answer. In general terms abs is "absolute value", so that instruction will behave the same as mov if c200.y is positive, but if c200.y is negative it will store the positive version of it in r0.w.
I also don't recall this exact form from the lessons, but assuming that just above that the separation value was copied into r0.w it will do as you describe, as the only time separation == -separation is if separation is zero, which is when stereo is disabled, and since this is a "not equals" test, the code inside the if block will only run when stereo is enabled.
Generally we would use a more explicit test for not equal to zero - I would expect the above form to be something that a compiler would do or someone trying to be tricky:
Most stereo adjustments don't need to be explicitly enabled with a test like this, as the maths usually works out that they have no effect when 3D is disabled (because they are almost always multiplied by separation, which is zero when 3D is disabled). It is good practice to do this if you are disabling an effect though, so it will still show up when 3D is disabled.
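As a rough illustration of the "not equal to its own negative" test, here is the same comparison written in Python rather than shader assembly. `stereo_enabled` is a made-up name, and the separation values are arbitrary.

```python
def stereo_enabled(separation):
    """True unless separation is zero: only zero is equal to its own negative."""
    return separation != -separation


for separation in (0.0, 0.06, -0.06):
    print(separation, stereo_enabled(separation), separation != 0)
```

Both tests agree for every value, which is why the more explicit comparison against zero is easier to read.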
This is an example where it was determined that r0 contained the position. If you determined that say, r6 had the position instead you would replace the three occurrences of r0 in that code with r6.
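For anyone who finds it easier to read as ordinary code, the four instructions quoted earlier (texldl, add, mul, add) can be mirrored in Python. This is only a sketch: it assumes r30.x holds separation and r30.y holds convergence after the texldl, and `prime_directive` is just a name chosen for this example.

```python
def prime_directive(position, separation, convergence):
    """Mirror of the quoted asm; position plays the role of r0 (or r6, etc.)."""
    x, y, z, w = position
    # texldl r30, c200.z, s0   -> assumed: r30.x = separation, r30.y = convergence
    depth_offset = w - convergence      # add r30.w, r0.w, -r30.y
    shift = separation * depth_offset   # mul r30.z, r30.x, r30.w
    return (x + shift, y, z, w)         # add r0.x, r0.x, r30.z


position = (0.5, 0.2, 0.9, 10.0)
print(prime_directive(position, separation=0.05, convergence=2.0))
print(prime_directive(position, separation=0.0, convergence=2.0))
```

Only x changes, and the shift is multiplied by separation, so with separation at zero (3D disabled) the position passes through untouched.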
A lot of it is just experimentation - if you suspect that a parameter might hold the correct placement add the prime directive and see what happens. Here are two common techniques I have used for this type of experimentation:
1. If I'm working on a vertex shader, I find all the lines that set output registers and replace the output register with a temporary register. To keep it simple I use a convention of adding 10 or 20 to the register number, so the final digit will match the original output register. First check whether the shader is already using temporary registers starting with r1N or r2N, as you don't want to use a temporary register that is already used in the shader.
I then add a series of mov instructions to the end of the shader to copy these temporary registers to the output registers:
Now, just above those mov instructions I add the prime directive and try it on each one, reload shaders in the game and see what changes. Sometimes an effect will disappear or start flickering all over the place - that tells me that the particular output I tried is unlikely to be the correct one. Sometimes I'll get lucky and find the correct adjustment using this technique. Other times I'll notice that an effect is still broken, but has switched eyes (e.g. a halo in the left eye moved to the right and the halo in the right eye moved to the left - can be hard to see if you aren't looking for it), telling me that I may need to add a multiply by 0.5 to the prime directive.
Instead of the prime directive you might also try setting each output to zero then reloading shaders and looking for what part of the effect disappears - that can tell you a lot about what each of the outputs do, which might help you determine which might be relevant and which are definitely unrelated.
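The register shuffling in this first technique is mechanical enough to sketch as a small Python helper. This is a hypothetical illustration, not how shadertool.py does it: it assumes a vs_3_0 style shader where outputs are named oN, uses the "add 20" convention, and does nothing special about write masks. The sample shader is made up.

```python
import re

OUTPUT_REG = re.compile(r"\bo(\d+)\b")
TEMP_REG = re.compile(r"\br(\d+)\b")


def redirect_outputs(asm, offset=20):
    """Redirect writes to oN into r(N + offset) and append 'mov oN, r...' lines."""
    lines = [line.strip() for line in asm.strip().splitlines()]
    used_temps = {int(n) for line in lines for n in TEMP_REG.findall(line)}
    rewritten, outputs = [], set()
    for line in lines:
        # Leave the version line, declarations and comments untouched.
        if line.startswith(("vs_", "dcl_", "//")):
            rewritten.append(line)
            continue
        for n in map(int, OUTPUT_REG.findall(line)):
            if n + offset in used_temps:
                raise ValueError(f"r{n + offset} is already used by the shader")
            outputs.add(n)
        rewritten.append(
            OUTPUT_REG.sub(lambda m: f"r{int(m.group(1)) + offset}", line)
        )
    # Copy the temporaries to the real outputs at the very end.
    rewritten += [f"mov o{n}, r{n + offset}" for n in sorted(outputs)]
    return "\n".join(rewritten)


sample = """
vs_3_0
dcl_position o0
dcl_texcoord o1
dp4 o0.x, c0, v0
dp4 o0.y, c1, v0
dp4 o0.z, c2, v0
dp4 o0.w, c3, v0
mov o1, v1
"""
print(redirect_outputs(sample))
```

The stereo adjustment being tested would then go just above the appended mov lines, applied to one temporary register at a time.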
2. As you become more familiar with shader assembly you will start to spot patterns. For example:
In this pattern we know that r7 contains the output position, as it is copied to the output position register. Now, if the value of r7 is *read* again below (or above!) that point, that is a good indication that you will likely need to add the prime directive just after the 'mov o3, r7' line.
I've added an example of another line that *writes* to r7 below that point - a line like that is not part of the pattern you are looking for, but signifies that r7 will no longer contain the position and is unimportant below that line (likewise you would look for something similar above the 'mov o3, r7' to find the first line where r7 contains the position, then look for reads between the two).
If r7 is *read* above the 'mov o3, r7' line (extremely common in Unity games) it's a little more involved as you want to adjust every use of it *except* o3 - here's an example of this from one of my earlier fixes:
https://github.com/DarkStarSword/3d-fixes/commit/f4669d43271fbc02923029c6b108754d9b228c30
Nowadays I use a script (available in my 3d-fixes repository) to automatically fix this pattern (no guarantees that it won't break something else), which makes changes like this:
https://github.com/DarkStarSword/3d-fixes/blob/b84476c1d40b29c5ebb73a4dc56f2bf575a1ea4f/DreadOut/ShaderOverride/VertexShaders/14E1AD4F.txt
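The "read below the mov" half of this second pattern can be spotted with a few lines of Python. This is a simplified sketch and not the script from the repository: it only handles a plain 'mov oN, rM' copy, treats any later write to the register (even a partial one) as the end of its life as the position, and ignores reads above the mov. The sample instructions are invented.

```python
import re


def find_position_reads(asm, position_output="o3"):
    """Return (temp register, later lines that read it), or None if no match."""
    lines = [line.strip() for line in asm.strip().splitlines() if line.strip()]
    copy = re.compile(rf"mov {re.escape(position_output)}, (r\d+)")
    for index, line in enumerate(lines):
        match = copy.fullmatch(line)
        if match:
            break
    else:
        return None
    temp = re.compile(rf"\b{match.group(1)}\b")
    reads = []
    for line in lines[index + 1:]:
        parts = line.split(None, 1)
        if len(parts) < 2:
            continue  # instructions without operands, e.g. endif
        dest, _, sources = parts[1].partition(",")
        if temp.search(sources):
            reads.append(line)
        if temp.search(dest):
            break  # overwritten: the register no longer holds the position
    return match.group(1), reads


sample = """
mov o3, r7
mul r1.xy, r7.xwzw, c8.x
mov o4, r1
mov r7, c9
add r2, r7, c10
"""
print(find_position_reads(sample))
```

Any line it reports is a candidate for needing the stereo-corrected position, so the prime directive would go right after the 'mov o3, r7' line in that case.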
What I did was:
- find dcl_position and see which o number is attached. In this case o0
- find the spots in the shader where it gets its input to w and x coordinates
- work out the prime directive with that input, using placeholder variables where necessary.
Does that make sense or am I just incredibly lucky?
The result is still ever so slightly off, but it's nearly there. Above all, it's the first time I've seen any movement at all!