3D Vision CPU Bottleneck: Gathering Information thread.
Indeed I am not a software developer. You software guys seem to have a unique view of the world :)
I hope you are right bo3b, I look forward to the day when this gets fixed.
@masterotaku, thanks for the update.
So, from my non-developer understanding, this is what I think is happening:
A game has one main thread, and that main thread passes workload onto other threads. If the main game thread gets encumbered, then it can't hand off work to other threads as efficiently, which is apparently what is happening with 3D Vision. According to nVidia, the driver is making the main game thread idle (perhaps I am using the wrong terminology). But this means that CPU usage as a whole decreases, and the CPU is then unable to feed work to the GPU effectively.
So this results in both the CPU workload and the GPU workload decreasing (relative to the doubled workload that 3D Vision puts on the GPU) when 3D Vision is enabled, leading to FPS degraded well below what 3D Vision's already substantial performance impact should give.
This is why we see the problem manifest itself as a "core limit" - there is no limit; there is simply not enough work from the CPU to fill all cores, so the work fits on fewer of them - dependent on the game, of course.
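To illustrate the idea, here is a minimal sketch of my understanding (hypothetical code, not anything from a real driver or game): if the main thread stalls, the workers it feeds go idle too, and whole-CPU usage drops without any core being "limited".
[code]
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

std::mutex m;
std::condition_variable cv;
std::queue<int> jobs;   // per-frame work items handed out by the main thread
bool done = false;

void worker() {
    for (;;) {
        std::unique_lock<std::mutex> lock(m);
        // Blocked here, the thread uses ~0% CPU - it is idle, not "limited".
        cv.wait(lock, [] { return !jobs.empty() || done; });
        if (jobs.empty()) return;   // done, and no work left
        jobs.pop();
        lock.unlock();
        // ... heavy per-frame work would run here ...
    }
}

int main() {
    std::vector<std::thread> workers;
    for (int i = 0; i < 4; ++i) workers.emplace_back(worker);

    for (int frame = 0; frame < 600; ++frame) {
        // Simulated stall in the main thread (standing in for whatever the
        // driver makes it wait on). While it sleeps, it cannot enqueue jobs,
        // so every worker idles as well and total CPU usage drops.
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
        {
            std::lock_guard<std::mutex> lock(m);
            for (int i = 0; i < 4; ++i) jobs.push(frame);
        }
        cv.notify_all();
    }
    {
        std::lock_guard<std::mutex> lock(m);
        done = true;
    }
    cv.notify_all();
    for (auto& t : workers) t.join();
}
[/code]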
Windows 10 64-bit, Intel 7700K @ 5.1GHz, 16GB 3600MHz CL15 DDR4 RAM, 2x GTX 1080 SLI, Asus Maximus IX Hero, Sound Blaster ZxR, PCIe Quad SSD, Oculus Rift CV1, DLP Link PGD-150 glasses, ViewSonic PJD6531w 3D DLP Projector @ 1280x800 120Hz native / 2560x1600 120Hz DSR 3D Gaming.
Testing with The Witcher 3 and driver 378.66, it seems that "d3d11.dll" (the 3D fix) is lowering both CPU and GPU usage. The file "dxgi.dll" (ReShade) contributes to lowering the usage of both even further.
Testing at 1280x720 and WITHOUT both DLLs I get around 90-99% GPU usage. CPU usage never rises above 60-65% in any circumstance (I assume that is normal in this game when using my 6 cores). If I enable Hyper-Threading I don't get more FPS; it just lowers the CPU usage of all the cores, but the result is the same.
When using "d3d11.dll", usage drops by around 10% on both CPU and GPU, and as a result my FPS drops by around 25. If I put dxgi.dll together with d3d11.dll into the game folder, GPU and CPU usage drop even further (70-75% GPU usage and 45-50% CPU usage).
I have tested only one area of the game, but I assume what happens must be similar in other areas; it's just a matter of more CPU- or GPU-demanding zones. Of course, NOT using the DLLs and raising the resolution and details results in lower CPU usage, because the bottleneck then moves to the GPU.
[quote="Duerf"]Testing with The Witcher 3 and driver 378.66 and it seems that "d3d11.dll" (3d fix) is lowering the CPU and GPU use. The file "dxgi.dll" (reshade) also contributes to lower down even more the usage of both, gpu and cpu.
Testing with 1280x720 and WHITHOUT both dll's I get around 90-99% GPU usage. CPU usage never raise 60-65% in any circumstance (I assume that it is normal in this game when using my 6 cores). If I enable Hyperthreading I don't get more fps, it just lower the cpu usage of all the cores but the result is the same.
When using "d3d11.dll" the gpu usage lower down around 10% both, cpu and gpu, and as result my fps drop around 25 fps. I put dxgi.dll together with d3d1.dll into the game folder it lower down even more gpu and cpu (70-75% gpu usage and 45-50% cpu usage).
I have tested only in one area of the game, but I asssume that what happens must be similar in other areas, just a matter of more cpu or gpu demanding zones. Of course NOT using dlls and raising the resolution and details gives as a result a lower CPU usage because the bottleneck is always created in the GPU.
[/quote]
Funny... I run "The Witcher 3" at 45-50 FPS in 3D Surround (5760x1080 in 3D) and haven't seen that issue.
Also, try not to mix wrappers together, hmm? This discussion is not about fixes that might lower performance DUE TO THE SHADERS BEING CHANGED AND ADDITIONAL GPU CYCLES BEING USED :) This is about just "raw" 2D vs 3D Vision usage (without any fix). We already fixed a performance problem in 3DMigoto & 3D Vision when the "stereo2mono" function is used (which was NOT a bug, but the BUS being saturated due to the HUGE amount of data that needed to be sent "across").
@RAGEdemon:
That is NOT HOW THREADS work at all ;) It's quite the opposite:
- You have a MAIN thread (AKA your App).
- You have other threads that do the "heavy" stuff.
- The MAIN thread WAITS for other threads to finish their JOB (see the sketch after this list).
- The other threads CAN'T constantly be "spinning". (If you are interested: Mutexes, Critical Sections, Semaphores, etc.)
- Sometimes, a Thread is required to SLEEP in order to process new and valid Data.
- 3D Vision IS LOCKED at 120Hz and 60Hz (for the glasses), hence the BIGGER delays - which can translate into "lower" CPU usage. Again, NOT A BUG, but a SYNCHRONISATION mechanism.
- The data that needs to be shared between the CPU and GPU(s) is also increased, which can lead to BUS saturation - leaving both the CPU and GPU at lower usage until the data is sent... bla bla bla...
- Threads are really complicated stuff, and when you are "bound" to present 2 frames at the same time at regular intervals... well... the complexity is exponential...
- The biggest problem we "fail" to see is that 3D Vision is REVERSE-ENGINEERING stereo 3D into games that were coded & tested to work in 2D. (And as we see, some perform POORLY in 2D, let alone 3D.) It's very easy to "point fingers" in general. There are always logical and real limitations to different things, but it's easy to just say "this or that doesn't work" without trying to understand WHY the result is like this. Perhaps if more people were interested in seeing how 3D Vision "ticks", we would understand A LOT more about it ;)
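To make the "MAIN thread WAITS" point concrete, here is a minimal sketch (my own illustration, not code from any game or from 3DMigoto) of a main thread that dispatches heavy jobs and then blocks until they finish:
[code]
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

std::mutex m;
std::condition_variable cv;
int pending = 0;              // heavy jobs still running this frame

void heavy_job() {
    // ... the "heavy" stuff runs here, on a worker thread ...
    std::lock_guard<std::mutex> lock(m);
    if (--pending == 0)
        cv.notify_one();      // last worker wakes the waiting main thread
}

int main() {
    for (int frame = 0; frame < 100; ++frame) {
        {
            std::lock_guard<std::mutex> lock(m);
            pending = 4;
        }
        // (A real engine reuses a thread pool; fresh threads keep it short.)
        std::vector<std::thread> workers;
        for (int i = 0; i < 4; ++i) workers.emplace_back(heavy_job);

        // The MAIN thread blocks here at ~0% CPU until the workers finish -
        // exactly the kind of "low usage" that is synchronisation, not a bug.
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [] { return pending == 0; });
        lock.unlock();

        for (auto& t : workers) t.join();
    }
}
[/code]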
@bo3b:
Please, can you do me a favour? Just put a "while(1)" in a thread in 3DMigoto, since people want to see their CPU usage at 99% or something ^_^. That would be fun :))
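The joke is technically sound, by the way - a sketch of what such a change would do (hypothetical, obviously not real 3DMigoto code): one spinning thread pegs a core at ~100% while doing nothing useful, which is exactly why raw CPU% says very little by itself.
[code]
#include <atomic>
#include <chrono>
#include <thread>

std::atomic<bool> quit{false};

int main() {
    // The "while(1)" thread: burns 100% of one core doing no useful work.
    std::thread spinner([] {
        while (!quit.load()) { /* spin */ }
    });

    std::this_thread::sleep_for(std::chrono::seconds(5)); // "the game runs"

    quit = true;
    spinner.join();
}
[/code]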
1x Palit RTX 2080Ti Pro Gaming OC (watercooled and overclocked to hell)
3x 3D Vision Ready Asus VG278HE monitors (5760x1080).
Intel i9 9900K (overclocked to 5.3 and watercooled ofc).
Asus Maximus XI Hero Mobo.
16 GB Team Group T-Force Dark Pro DDR4 @ 3600.
Lots of Disks:
- Raid 0 - 256GB Sandisk Extreme SSD.
- Raid 0 - WD Black - 2TB.
- SanDisk SSD PLUS 480 GB.
- Intel 760p 256GB M.2 PCIe NVMe SSD.
Creative Sound Blaster Z.
Windows 10 x64 Pro.
etc
helifax mate, I did say many pages back that, in the hope of preventing hurt feelings and thread derailment, I would not be replying to your posts in this specific thread, but I feel bad when you talk to me here and I don't respond. It just seems disrespectful.
I am not a professional developer, so I don't know the inner workings of code. My wife will vouch that I'm a stupid guy who is good at some things and horrible at others. You have just said that I don't have a clue about CPU threads etc. That's fine :). That said, it was evident that you also didn't fully grasp the test philosophy or the inner workings of threads and the CPU, as I attempted to explain things such as eliminating any GPU bottleneck, setting affinity, the meaning of thread core usage, isolation on physical vs virtual cores, etc., a few times before things turned sour.
We all love ya man, this stupid thread and 'non-bug' isn't worth losing you over. I am sorry, but it's best that I keep doing as before and don't respond here at all to you - I hope you understand, and no disrespect intended.
Out of said respect, and for your understanding, I'll quickly cover the points in your previous post because you seem to have put a lot of effort into it:
i. It's not a sync/lag issue, because you can disable VSync with D3DOverrider, taking your 3D Vision FPS well over 120, yet still have exactly the same FPS in trouble areas.
ii. The official word from nVidia's driver development team is that the exact cause is 'multiple' items: "[color="green"]High cross GPU transfer time, Wait on CPU thread until we complete the copy, Game thread spending more time on CPU on case of stereo[/color]" (see the sketch after this list).
iii. Perhaps this makes more sense to you professional developers out there, but to me this sounds like the primary game thread is the problem, not the waiting on other threads that the game thread sends work to.
iv. bo3b tested the impact of 3DMigoto on the CPU, and he said that it is not worth considering... something like 1% was said, but I might be mistaken. It is interesting that Duerf got his result, and it is worth investigating. If I recall correctly, I did test it with TW3, and fixed vs non-fixed performance was identical except for maybe 1 or 2 FPS, which is within the margin of error.
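For point ii, this is roughly what "wait on CPU thread until we complete the copy" looks like at the API level - a hypothetical D3D11 snippet (my illustration; the driver's internal cross-GPU copies are not visible to us): mapping a staging resource blocks the calling CPU thread until the GPU finishes.
[code]
#include <d3d11.h>

// stagingTex must be created with D3D11_USAGE_STAGING and
// D3D11_CPU_ACCESS_READ, and match gpuTex in size and format.
void read_back_blocking(ID3D11DeviceContext* ctx,
                        ID3D11Texture2D* stagingTex,
                        ID3D11Texture2D* gpuTex)
{
    ctx->CopyResource(stagingTex, gpuTex);  // queue the GPU -> staging copy

    D3D11_MAPPED_SUBRESOURCE mapped = {};
    // Without D3D11_MAP_FLAG_DO_NOT_WAIT, Map() stalls this CPU thread until
    // the GPU has finished the copy; while stalled, the thread reads as idle.
    HRESULT hr = ctx->Map(stagingTex, 0, D3D11_MAP_READ, 0, &mapped);
    if (SUCCEEDED(hr)) {
        // ... read mapped.pData / mapped.RowPitch here ...
        ctx->Unmap(stagingTex, 0);
    }
}
[/code]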
All the best!
[quote="RAGEdemon"]but to me this sounds like the primary game thread is the problem, not the waiting on other threads that the game thread sends the work to.[/quote]
You should read this, if you haven't already from the previous times I linked it:
https://msdn.microsoft.com/en-us/library/ee417693(VS.85,loband).aspx
DirectX 12 supposedly addresses this.
BTW, Intel revealed a new performance tool at GDC; it's like Perfmon, GPUView and FCAT.
https://www.youtube.com/watch?v=oMUKoFQLRYI
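For anyone who doesn't click through, the gist of that MSDN article is that D3D11 lets worker threads record commands on deferred contexts while only one thread submits to the GPU. A rough sketch of the pattern (mine, heavily simplified from the article's idea):
[code]
#include <d3d11.h>

// Worker thread: record rendering commands on a deferred context.
ID3D11CommandList* record_work(ID3D11Device* device)
{
    ID3D11DeviceContext* deferred = nullptr;
    if (FAILED(device->CreateDeferredContext(0, &deferred)))
        return nullptr;

    // ... issue draw/state calls on 'deferred' here, off the main thread ...

    ID3D11CommandList* cmdList = nullptr;
    deferred->FinishCommandList(FALSE, &cmdList);
    deferred->Release();
    return cmdList;
}

// Main thread: the only place commands are actually submitted to the GPU.
void submit(ID3D11DeviceContext* immediate, ID3D11CommandList* cmdList)
{
    if (cmdList) {
        immediate->ExecuteCommandList(cmdList, TRUE);
        cmdList->Release();
    }
}
[/code]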
Now that I have an i7 processor, I did a little bit of testing with Battlefield 1 to check CPU usage, and the results are interesting.
I don't know if BF1 is the best game to test, because over 3 cores the game will always keep 1 core or thread in reserve; it is very possible that other games do it too, but in this one you can see it written on screen.
I measured CPU usage using MSI Afterburner (the same data I also get from HWiNFO64) and Task Manager, which for some reason gives me different data.
It gets interesting when you reach 4 cores. Even though the game says it is using only 3 threads, there is an obvious performance improvement in 2D; in 3D, however, the result is the same.
When you go beyond the 4-thread mark, you see that with every step CPU usage gets lower and lower in 3D while FPS stays the same (the difference you see is normal variation in MP, even though the server is empty).
What is the conclusion? No matter how many threads you have over 3, the FPS is still the same (in the 64-68 range).
PS: It is also visible that in 2D, HT doesn't bring any improvement.
Threads | In use | 3D  | CPU% (MSI) | CPU% (Task Mgr) | FPS
3       | 3      | No  | 100%       | 100%            | 125
3       | 3      | Yes | 100%       | 100%            | 68
4       | 3      | No  | 99%        | 99%             | 154
4       | 3      | Yes | 78%        | 99%             | 66
8       | 6      | No  | 76%        | 99%             | 151
8       | 6      | Yes | 55%        | 73%             | 69
8       | 7      | No  | 83%        | 100%            | 147
8       | 7      | Yes | 52%        | 77%             | 64
As Antimalware did me the honour of kicking in during the 3-core 3D run, I redid that measurement:
3       | 3      | No  | 100%       | 100%            | 122
3       | 3      | Yes | 95%        | 100%            | 67
I took screenshots but can't upload them here for some reason, so I put them on Google Drive:
https://drive.google.com/file/d/0B_pg0KRXApEnVUxBcU9IUExWNTA/view?usp=sharing
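For anyone repeating this kind of test: the "Threads" column above was controlled via CPU affinity. A small hypothetical helper (the function name limit_to_n_cores is mine) showing the programmatic equivalent of Task Manager's affinity dialog:
[code]
#include <windows.h>

// Pin the current process to its first n available logical processors -
// roughly what "3 threads" vs "8 threads" means in the table above.
bool limit_to_n_cores(int n)
{
    DWORD_PTR processMask = 0, systemMask = 0;
    if (!GetProcessAffinityMask(GetCurrentProcess(), &processMask, &systemMask))
        return false;

    DWORD_PTR newMask = 0;
    int picked = 0;
    for (DWORD_PTR bit = 1; bit != 0 && picked < n; bit <<= 1) {
        if (systemMask & bit) {  // only pick processors that actually exist
            newMask |= bit;
            ++picked;
        }
    }
    return newMask != 0 &&
           SetProcessAffinityMask(GetCurrentProcess(), newMask) != 0;
}
[/code]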
I am amazed you can push above 60 FPS while 3D is enabled. It should lock at 60 FPS.
One interesting thing to mention is that MSI Afterburner measures the delta time between 2 consecutive swap-chain presents ;)
This isn't necessarily what the GPU presents. (Remember you have that "render ahead limit".)
The GPU can easily discard frames to stay in sync with the 3D / 60 FPS monitor.
As a matter of fact, if you do this test again on SOMA + my wrapper, you will see that MSI reports a lot more FPS than you actually see ;) The best tool for these measurements - developed by Nvidia - is ShadowPlay (FPS overlay): it doesn't measure the time between swap-chains but what the hardware actually sends to the monitor.
If you can make a quick test with that one as well, it would be interesting to see whether you get something else ;)
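To show what "delta time between 2 consecutive swap-chain presents" means, here is a minimal sketch (my illustration of the idea, not Afterburner's actual hook) of a timer that would be called once per Present():
[code]
#include <chrono>
#include <cstdio>

// Call on_present() once per Present() call. Frames the driver later drops
// to stay in sync with the 120Hz/60Hz display are still counted here, which
// is why this kind of overlay can read higher than what you actually see.
struct PresentTimer {
    std::chrono::steady_clock::time_point last = std::chrono::steady_clock::now();

    void on_present() {
        auto now = std::chrono::steady_clock::now();
        double ms = std::chrono::duration<double, std::milli>(now - last).count();
        last = now;
        if (ms > 0.0)
            std::printf("frame time: %.2f ms (~%.0f FPS)\n", ms, 1000.0 / ms);
    }
};
[/code]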
I made some changes in the settings. As I play in multiplayer, I have render-ahead at 0 and triple buffering off.
I would be happy if I could keep the FPS over 60, but it is not possible :(
Intel i7 8086K
Gigabyte GTX 1080Ti Aorus Extreme
DDR4 2x8gb 3200mhz Cl14
TV LG OLED65E6V
Windows 10 64bits
@D-Man11
DX12 titles such as DX:MD and ROTTR show that DX12 + 3D Vision has unfortunately solved nothing.
In fact, I have read some posts describing horrible DX12 performance compared to DX11.
Update:
[color="green"]Hi Shahzad,
The did get an update from the development lead on this issue. Due to other higher priority tasks and that this issue would need they weren't able to allocate resource to start work on this issue. He assured me they have not forgotten about this issue and that he will assign resource as soon as he can. Apparently this will need considerable amount of analysis and optimization so it could take more time.
Best regards,
Ray[/color]