OMFG!!!!!!!!! I've got it!!!!!!!!
I had to disable MSAA to make it work! I enabled it a year ago and completely forgot about it.
soo.. I'm aware of the resolve_msaa flag, but I'm not entirely sure where it's supposed to go.
When I add it like in the example below, the buffer clears correctly, but nothing is drawn to it.
[code]
[ResourceDepthBuffer]
;max_copies_per_frame=1
[ResourceLakePositionBuffer]
;format = R8G8B8A8_UNORM
;max_copies_per_frame=1
[ResourceBB]
[ShaderOverride-SkyPS]
hash = 5fb7805badf885b7
ResourceDepthBuffer = copy oD
[CustomShaderPostprocess]
vs = ShaderFixes\postProcess.vs.hlsl
ps = ShaderFixes\postProcess.ps.hlsl
blend = disable
x1=rt_width
y1=rt_height
ps-t104 = bb
ps-t105 = ResourceDepthBuffer
ps-t106 = ResourceLakePositionBuffer
o0 = bb
draw = 6, 0
post ps-t104 = null
post ps-t105 = null
post ps-t106 = null
[ResourceBackupo0]
[CustomShaderPositionBuffer]
blend = disable
ps = ShaderFixes\position.ps.hlsl
ResourceLakePositionBuffer = copy_desc resolve_msaa o0
;ResourceBackupo0 = ref o0
o0 = ResourceLakePositionBuffer
draw = from_caller
;post o0 = ResourceBackupo0
[ShaderOverride-Water1-vs]
hash = 665483756892af90
run = CustomShaderPositionBuffer
[ShaderOverride-Water2-vs]
hash = d140ce5685b3cd1d
run = CustomShaderPositionBuffer
[CustomShaderClearRT]
blend = disable
vs = ShaderFixes\clear_rt.vs.hlsl
ps = ShaderFixes\clear_rt.ps.hlsl
;ResourceBackupo0 = ref o0
o0 = ResourceLakePositionBuffer
draw = 6, 0
;post o0 = ResourceBackupo0
[Present]
run = CustomShaderPostprocess
run = CustomShaderClearRT
ResourceLakePositionBuffer = null
ResourceDepthBuffer = null
[/code]
The resolve_msaa flag takes the place of a copy operation (use it where you would use "ref" or "copy"), and will turn a Texture2DMS into a Texture2D or a Texture2DMSArray into a Texture2DArray.
Note that the reason I haven't said much about this flag before is that the MSDN documentation on the underlying function it uses indicates it requires hardware support for the specific texture format being resolved, but I have not been able to find any documentation on which hardware supports which formats, or whether there are any minimum guarantees - basically, I have no idea how reliable it is. I know for a fact that it doesn't work with any depth buffer formats on a 680m, and it seems to work fine for r8g8b8a8 formats, but those are my only two data points.
The alternative to resolve_msaa that should work on any hardware and any format (except the stencil side of a depth/stencil buffer) is to use a custom shader that takes a Texture2DMS as input and averages all samples into a Texture2D output. I haven't tried this, but it should be possible using copy_desc and overriding msaa=1 and msaa_quality=0 in the destination resource section to make a Texture2D with the same width & height as the original Texture2DMS.
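For what it's worth, a minimal sketch of that idea might look something like the below - entirely untested, and the file name, section names and the t100 slot are all made up. The format override is also my assumption, since a raw depth format can't be bound as a render target:
[code]
; d3dx.ini (untested sketch): destination copies the source's description,
; but forced to be single-sampled and re-typed so it can be a render target:
[ResourceResolvedDepth]
msaa = 1
msaa_quality = 0
format = R32_FLOAT

[CustomShaderResolveMSAA]
vs = ShaderFixes\postProcess.vs.hlsl
ps = ShaderFixes\resolve_msaa.ps.hlsl
ResourceResolvedDepth = copy_desc ResourceDepthBuffer
ps-t100 = ResourceDepthBuffer
o0 = ResourceResolvedDepth
draw = 6, 0
post ps-t100 = null
[/code]
[code]
// resolve_msaa.ps.hlsl (untested sketch):
// average every sample of the MSAA texture into one texel of the output
Texture2DMS<float> source : register(t100);

float main(float4 pos : SV_Position) : SV_Target0
{
	uint width, height, samples;
	source.GetDimensions(width, height, samples);

	float sum = 0;
	for (uint i = 0; i < samples; i++)
		sum += source.Load(int2(pos.xy), i);
	return sum / samples;
}
[/code]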
Alternatively, you should be able to use the multi-sampled versions in your custom shaders by declaring them as Texture2DMS - at least on my system this seems to work even if the texture is just a Texture2D. The only caveat is that the shader needs to either be a pixel shader or use shader model 5. Custom shaders are always compiled for shader model 5, but if you are injecting into an existing shader from the game you may sometimes need a ShaderOverride section to override the model, which should be fine for HLSL shaders, but won't work with assembly shaders.
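For example, in a pixel shader (t105 just matching the slot from your config; this reads a single sample rather than averaging):
[code]
Texture2DMS<float> DepthBufferMS : register(t105);
...
// Load() on a Texture2DMS takes a texel position plus a sample index
// (pos here being the pixel shader's SV_Position input):
float depth = DepthBufferMS.Load(int2(pos.xy), 0);
[/code]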
Here are the links to my current approach:
[url]https://mega.nz/#!21kCGJyS!4iS9yZwVTIO7Rh_Or6QNXkjlKweo1ctr17rODUp1BFg[/url]
[url]https://mega.nz/#!bpMz3CTK!rbIEXIuTfT76X6UdoTHUYDORjQkHXbqH-zZLuWxYcvE[/url] <- another try
Single Player -> Single Race -> Rally, Finland, Tupasentie is the best place to go.
p.s. do not submerge, or you will go blind :)
Send me what you've got and I'll take a look in game. Never mind, I hadn't refreshed my browser window.
I hadn't refreshed either, but if you would like to take a look at why the reflections are out of place, please be my guest.
d140ce5685b3cd1d-vs_replace.txt
3ce369598f34925e-ps_replace.txt
are the shaders containing the most recent approach.
And thank you for the tip - for the moment I will focus on the reflections without MSAA, as I need to get them working first. If the reflections aren't going to work, there is no point in fixing MSAA.
I have no idea what I'm doing wrong, here's another try:
[url]https://mega.nz/#!fhUjXLLR!A9NmkpNwWm0EnueuXMCtDSY-BqNLraYpe4Zg4GAuoZ8[/url]
Is there anyone here who could spare some time and help me with those damned reflections?
Edit: Oh, I see you have already started using resolve_msaa - that is why o0 is no longer compatible with oD, since oD is still MSAA while o0 is not. I guess you still want the depth buffer assigned, so maybe ignore the below. The depth buffer can't be resolved with resolve_msaa, but you could use a custom shader to do that job - provided you only need the depth side and not the stencil side (3DMigoto currently lacks a way to access the stencil side as a render target or shader resource).
Your debug shader isn't working with MSAA because there is a render or depth target assigned that is not compatible with your ResourceLakePositionBuffer (e.g. different size, number of samples, etc.). It's a good idea to look at what the SBS custom shader section does to see what may be necessary, since I already added enough to it to deal with almost any state the game may be in. Omitting some of what it does is usually fine, but only if you know it is safe in the particular game you are working on. Explicitly unbind all the other render and depth targets before assigning yours by adding this to your CustomShaderPositionBuffer:
[code]
o1 = null
o2 = null
o3 = null
o4 = null
o5 = null
o6 = null
o7 = null
oD = null
[/code]
You should do this before the o0= line to make sure you have unbound the incompatible render target before binding your own. I believe in this case it is just the depth buffer causing problems, so you could get away with only doing oD = null. (No need to back these up - the CustomShader section does that for you. It's only textures you generally need to back up yourself, if you are potentially replacing something the game may have assigned.)
BTW copy_desc and resolve_msaa won't work together - they are mutually exclusive options, since they set completely different copy operations and combining them would need two separate passes. 3DMigoto should really trigger a warning if you try to combine them, but I never expected anyone to try. What you end up with is effectively just a copy_desc with a sample count and quality override (which should be done explicitly, not this way), and you won't actually be doing the MSAA resolve pass at all.
I'm not sure why you would combine them, though... if you are doing resolve_msaa you are doing a full copy, so you don't need copy_desc, and if you want copy_desc without the full copy you should be using the override options in the resource section (msaa=1, msaa_quality=0).
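In other words, adapting your own sections: drop resolve_msaa from the copy and move the override into the resource section:
[code]
[ResourceLakePositionBuffer]
msaa = 1
msaa_quality = 0

[CustomShaderPositionBuffer]
; ...rest of the section as before, but the copy loses resolve_msaa:
ResourceLakePositionBuffer = copy_desc o0
[/code]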
Edit: This may be a load of BS - the function still looks suspect, but this game seems to be using its own stereo renderer, so the property that convergence == screen depth may not hold. This also means that [Present] is the wrong place to hook in to draw an overlay on the screen in stereo. There is no right place to do this - frame analysis (analyse_options=log clear_rt dump_rt_jps mono) shows the first 50% of draw calls are for the left eye and the remaining 50% are for the right. You'd have to find something to hook into right at the end of each eye, but the final shader there is used many times. Maybe I can add something to 3DMigoto to get the back buffer for each eye... it depends on whether nvapi will play nice, but it wouldn't solve all the issues anyway, since your references would not be for the correct eye. It also means that max_executions/copies_per_frame may limit said operation to one eye only, and that StereoParams is useless if you needed to do a stereo correction (you would have to find the equivalent information from the game instead). Fortunately, as you said, that is not your goal, since the game already renders fine in stereo, but it does mean you will have to be a little careful if you want your mod to keep working in stereo...
Is your linearizeDepth function meant to be accurate, or is an approximation good enough? I don't believe it is accurate. 3D Vision has the property that something at screen depth is at exactly linear depth == convergence, so we can adjust convergence to find the approximate linear depth of any object in the scene. Putting the back of the car at screen depth takes a convergence of approximately 2.9, but your linearizeDepth function places a linear depth of 2.9 several meters in front of the car.
Rather than hooking in extra post-processing or debug shaders at the present call, I suggest hooking in at:
[code]
[ShaderOverrideBeforeHUD]
hash = ab94b148ac6530c6
[/code]
I haven't checked what that shader is, but it appears to be run exactly once per eye, just before the HUD starts being rendered - hooking in there will allow you to do things in stereo.
Also, if you want this to work in stereo you will need to try to get rid of all your max_*_per_frame limits - leaving them in will give you a mono depth buffer in stereo, since the copy only happens for the first eye. There may be other creative solutions you could employ (e.g. copy by reference if you expect something to happen many times, and do the full copy once, right when you need it). I was also lately thinking about adding something to 3DMigoto to require that a certain number of draw calls have happened in between before allowing another copy/execution, which would probably work here... and I have plans to add conditional logic and expressions to 3DMigoto in the near future, which could potentially be used to implement your own copy limit and reset it when encountering that shader...
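A sketch of the copy-by-reference idea against your sections (ResourceDepthRef is a name I just made up, and this assumes a cheap reference can stand in until the once-per-eye shader runs):
[code]
[ResourceDepthRef]

[ShaderOverride-SkyPS]
hash = 5fb7805badf885b7
; cheap, fine to run many times per frame:
ResourceDepthRef = ref oD

[ShaderOverrideBeforeHUD]
hash = ab94b148ac6530c6
; one full copy per eye, only when it is actually needed:
ResourceDepthBuffer = copy ResourceDepthRef
[/code]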
You shouldn't need max_copies_per_frame on ResourceDepthBuffer2 at all - the shader you are copying that from is only run once per eye.
Hmmm... Why do you have max_executions_per_frame in a resource section? That's only for custom shaders...
I think... I really need to go and document some of my features better, and keep working to make the Ini Parser warn about more problems ;-)
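Just for the record, since the parser doesn't warn yet - each key only means something in its own section type (not that you should need either in this game, per the above):
[code]
; resource sections take max_copies_per_frame:
[ResourceDepthBuffer]
max_copies_per_frame = 1

; custom shader sections take max_executions_per_frame:
[CustomShaderPostprocess]
max_executions_per_frame = 1
[/code]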
This might work better for linear depth. You can optimise it if you want, but it's a starting point showing how to turn a Z buffer value into a linear depth using the projection matrix, which will work in a lot of games... except those that use a reverse Z projection, which will hit a divide by zero if you try this... and of course there's the usual quirk that the arguments to mul() may be swapped around in some games:
[code]
float linearizeDepth(float depth)
{
	// Unproject the Z buffer value back to view-space (the divide by
	// tmp.w is essential), then push it forward through the projection
	// again: for a typical projection matrix clip.w == view-space Z,
	// which is the linear depth:
	float4 tmp = mul(inverseProj, float4(0, 0, depth, 1));
	return mul(projection, tmp / tmp.w).w;
}
[/code]
Alternatively, check what calculations the game does and copy that (light shaders are usually a good place to look).
In other news it looks like this game implements the nvidia formula (or equivalent), so convergence matches what we know it to mean. Separation will always be positive, but the leftEye value in the CameraParamsConstantBuffer can be used to negate it for the left eye.
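So in a shader, something along these lines (the field names are my guesses at whatever the game's CameraParamsConstantBuffer actually calls them):
[code]
// Hypothetical sketch: the game's separation is always positive,
// so use its own leftEye flag to flip the sign for the left eye:
float separation = cameraParams.separation * (cameraParams.leftEye ? -1.0 : 1.0);
[/code]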
BTW you are fetching the depth buffer after drawing part of the water... In 2D that means you will be using a frame-old depth buffer, but the way this engine works, in 3D it means you will be using the depth buffer from the other eye. Is there a reason you need the depth buffer from that point? If you need the depth of the water surface, I would think it would be easier to just pass that from the vertex shader to the pixel shader, since that is what you are drawing. Just copying the currently assigned depth buffer seems to work well enough for me... though I guess you wanted to reduce the number of copy operations?
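Passing the surface depth through would just be one extra interpolator, in the same style as the other outputs below (TEXCOORD11 assumed to be a free slot):
[code]
// Water vertex shader - add an output...
out float waterViewDepth : TEXCOORD11,
...
waterViewDepth = mul(modelView, v0).z; // view-space depth of the water surface

// ...and read it back with a matching input in the pixel shader:
float waterViewDepth : TEXCOORD11,
[/code]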
Instead of doing this:
[code]
float4 v10 : TEXCOORD8, //viewPosition
float3 v11 : TEXCOORD9, //viewNormal
float3 v12 : TEXCOORD10) //csPos
{
...
float4 viewPosition = v10;
float3 viewNormal = v11;
float3 csPosition = v12;
[/code]
Just do this:
[code]
float4 viewPosition : TEXCOORD8,
float3 viewNormal : TEXCOORD9,
float3 csPosition : TEXCOORD10)
{
[/code]
And instead of this:
[code]
out float4 o10 : TEXCOORD8, //viewPosition
out float3 o11 : TEXCOORD9, //viewNormal
out float3 o12 : TEXCOORD10) //csPosition
...
o10 = mul(modelView, v0); //viewPosition
o11 = mul((float3x3)modelView, v1); //viewNormal
o12 = position.xyz / position.w; //csPosition
[/code]
Just do this:
[code]
out float4 viewPosition : TEXCOORD8,
out float3 viewNormal : TEXCOORD9,
out float3 csPosition : TEXCOORD10)
...
viewPosition = mul(modelView, v0);
viewNormal = mul((float3x3)modelView, v1);
csPosition = position.xyz / position.w;
[/code]
3DMigoto only names these after the registers because it has no way to guess better names - there's no need for you to follow suit, especially when you clearly have come up with better names.
Don't do this:
[code]
static const float2 cb_depthBufferSize = float2(1920,1080);
[/code]
Do this:
[code]
float2 cb_depthBufferSize;
DepthBuffer.GetDimensions(cb_depthBufferSize.x, cb_depthBufferSize.y);
[/code]
There's a lot wrong with these reflections, and to be honest... I'm not even sure I'd bother following the paper (though that's probably just me - the stereo crosshair I implemented is conceptually pretty similar to screen space reflections since it also does ray tracing on the depth buffer and I didn't follow any guide to get that to work).
The fact that the implementation in the paper you are following is written in GLSL means you have to worry about the fundamental differences between the DX and OpenGL coordinate systems, and they are using camera-space with positive Z coming out of the screen towards their face like a madman, instead of view-space with positive Z going towards infinity where any sensible person would put it... IMO you would be better off finding an HLSL implementation to use as a starting point.
And "cs" is a bad acronym... They are using it for "Camera-Space", but it could easily be misinterpreted as "Clip-Space", a totally different thing. And generally it's a bad idea to use something called "Camera-Space", since the camera transformation is actually a negated view transformation (the camera transformation places the camera in the world, the view transformation brings the world to the camera), so it's not clear what coordinate system they actually expect.
You have also used the View-Projection matrix where you should have used the Projection matrix, and that multiplication is around the wrong way (compared to what worked to get the linear depth).
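For example, matching what worked in the linearizeDepth snippet earlier:
[code]
// Projection matrix (not View-Projection), with the matrix as the
// first argument to mul(), as in the linearizeDepth example:
float4 clipPos = mul(projection, viewPosition);
[/code]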
There are problems with the coordinate scale used to get to pixels, though tbh I can't see how that is supposed to work in the paper either (maybe I missed it, but IIRC this might have been one of the differences between GL and DX).
Even after fixing those issues, the reflections are still way off.