3Dmigoto now open-source...
bo3b said:Edit: Ah, I see you already checked in that change to just switch them to %.9g. Works for me. :->
The one I checked in is just the one I did the other day that changes fixImm(). I have another patch ready to go to switch applySwizzle() and ParseBufferDefinitions(), but I want to do a smokescreen test on a bunch of shaders before I push that and confirm that it fixes my original issue.

2x Geforce GTX 980 in SLI provided by NVIDIA, i7 6700K 4GHz CPU, Asus 27" VG278HE 144Hz 3D Monitor, BenQ W1070 3D Projector, 120" Elite Screens YardMaster 2, 32GB Corsair DDR4 3200MHz RAM, Samsung 850 EVO 500G SSD, 4x750GB HDD in RAID5, Gigabyte Z170X-Gaming 7 Motherboard, Corsair Obsidian 750D Airflow Edition Case, Corsair RM850i PSU, HTC Vive, Win 10 64bit

Alienware M17x R4 w/ built in 3D, Intel i7 3740QM, GTX 680m 2GB, 16GB DDR3 1600MHz RAM, Win7 64bit, 1TB SSD, 1TB HDD, 750GB HDD

Pre-release 3D fixes, shadertool.py and other goodies: http://github.com/DarkStarSword/3d-fixes
Support me on Patreon: https://www.patreon.com/DarkStarSword or PayPal: https://www.paypal.me/DarkStarSword

Posted 07/10/2015 03:40 AM   
said:
bo3b said:Edit: Ah, I see you already checked in that change to just switch them to %.9g. Works for me. :->
The one I checked in is just the one I did the other day that changes fixImm(). I have another patch ready to go to switch applySwizzle() and ParseBufferDefinitions(), but I want to do a smokescreen test on a bunch of shaders before I push that and confirm that it fixes my original issue.

Cool, yeah I noticed that was for fixImm. Your change there is actually not necessary unless I've missed something, because I don't think it ever gets output. Maybe you have a case where there is value to having the internal representation be %g though.

Could be some weird edge cases that crop up by changing stuff like this though. For example, if %.9g outputs "1" instead of "1.00000E+000" then when other code reads that "1" as a %f, will it _always_ do the right thing? I don't have that level of confidence in the Microsoft CRT.

I have a medium-to-strong preference for keeping the internal representation at %.9e as it was, because we know for certain that it is lossless storage, and we have several years of experience with it working OK. Then we fix the output as desired.

Acer H5360 (1280x720@120Hz) - ASUS VG248QE with GSync mod - 3D Vision 1&2 - Driver 372.54
GTX 970 - i5-4670K@4.2GHz - 12GB RAM - Win7x64+evilKB2670838 - 4 Disk X25 RAID
SAGER NP9870-S - GTX 980 - i7-6700K - Win10 Pro 1607
Latest 3Dmigoto Release
Bo3b's School for ShaderHackers

Posted 07/10/2015 05:41 AM   
That's definitely not just the internal representation - here's some diffs showing before and after from that commit (these also include fixes):
https://github.com/DarkStarSword/3d-fixes/commit/b18c16c9385a3da04174331b0472803bda9d4410
https://github.com/DarkStarSword/3d-fixes/commit/30c050fd043c605aa136adf490372dcd6215b6c6
https://github.com/DarkStarSword/3d-fixes/commit/81923e157d7c2f8835a77df289618023a77a3dc3

There's still a few numbers using scientific notation in that, which are due to applySwizzle() being called on the result of fixImm().

For the decompiler this should all be fine since it's strongly typed and 1 == 1.0. scanf("%f") will treat 1, 1.0 and 1.0e+0 exactly the same (that's part of the C standard). The assembler is a different matter as l(1) != l(1.0), so we wouldn't want to produce floats with no decimal point there.
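
A minimal Python sketch of the two points above, in Python only because float_to_hex.py is Python. Here `float()` stands in for scanf("%f"), and `format_asm_float` is a hypothetical helper, not a 3DMigoto function. It assumes that a literal containing a decimal point or an exponent is read as a float by the assembler, which the thread does not confirm.

```python
import math
import struct

def f32_bits(text):
    """Parse decimal text and return the 32-bit float bit pattern."""
    return struct.unpack('<I', struct.pack('<f', float(text)))[0]

def format_asm_float(value):
    """Hypothetical helper: %.9g, but never a bare integer such as "1"."""
    text = '%.9g' % value
    # Assumption: '.' or an exponent is enough to mark the literal as a float.
    if math.isfinite(value) and '.' not in text and 'e' not in text:
        text += '.0'
    return text

# All three spellings parse to the same bits, so the decompiler side is safe.
print({hex(f32_bits(s)) for s in ('1', '1.0', '1.0e+0')})

# %.9g drops the decimal point for whole numbers; the helper restores it.
print('%.9g' % 1.0, format_asm_float(1.0))
```

Appending ".0" only when the text is a bare integer leaves every other %.9g result untouched, so the readability gain is kept.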


Posted 07/10/2015 08:50 AM   
Those are good examples. It really drives home how much better the non-E format is for regular numbers.

In those cases, the single l(x) values are treated differently by applySwizzle, and passed through unchanged. This has always seemed a little weird to me, and this really suggests that we ought to fix that up too.

I really think that we ought to set fixImm and convertHexToFloat to use the known good internal representation and unify it around the idea that the output is where the conversion to human readable happens.


I don't share your confidence that Microsoft follows the C standard that well; there are many examples I've run across where they just drop the ball. I'm not sure where the standard was made, but the Microsoft C compiler doesn't support modern stuff very well.

This might not be one, but I just don't trust them to do right, and the problem is that if we/they miss we generate very subtle bugs in the decompiled output. (just like this nasty 6 digit v. 9 digit problem.)


Tell you what, since I'm most concerned about these problems, let me go ahead and make these changes to unify this code and make this format distinction clearer and less of an ad hoc solution. This will be a good improvement to the overall output HLSL code quality.


Question though- I commented on one of your check-in lines. The %.9g format shows rounding of the value from
2.500000037e-002 to 0.0250000004. This seems bad from our goal of ensuring bit-perfect output.

%g is new to me, and I see the documentation, but figure you have a good grasp of relative accuracy. I know and agree that %.9e is valid as a full-precision format, and that's likely why Chiri started with that. I don't understand the ramifications of that %g conversion though, so if you tell me what to use, I'll plug those in.


Posted 07/10/2015 09:32 AM   
I've done my smokescreen test of changing everything to %.9g and I'm happy with the result - the original precision issue is fixed and numbers are much easier to read now :)

The only part I haven't tested is the change I made in ParseBufferDefinitions (need to find a game that uses the default values).

There's only a few cases where the representation is arguably worse (though still accurate) - since you noted one I'll come back to this in a sec.

I'm quite happy for you to look at improving the internal representation, but my changes are enough for now. Really it seems like we should store them internally as 32bit floats up until we actually output them using either %.9g or the algorithm in my float_to_hex.py.

bo3b said:I don't share your confidence that Microsoft follows the C standard that well; there are many examples I've run across where they just drop the ball. I'm not sure where the standard was made, but the Microsoft C compiler doesn't support modern stuff very well.
That's part of the reason I've done the smokescreen - to look for any edge cases where they may round differently to glibc that may require additional precision to compensate for. So far the results have all been identical to my tests in python, including for edge cases that fail on %.8g but succeed on %.9g. There's always additional cases we can try in case I've missed something, but at the moment I'm fairly convinced that they got this part right :)
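
A rough sketch of what such a round-trip check could look like in Python. This is not the actual test that was run; `round_trips` and `smoke_test` are hypothetical names. It formats a 32-bit pattern with a given number of significant digits, parses the text back, and compares bit patterns.

```python
import random
import struct

def bits_to_float(bits):
    """Reinterpret a 32-bit pattern as a float."""
    return struct.unpack('<f', struct.pack('<I', bits))[0]

def float_to_bits(value):
    """Round to single precision and return the bit pattern."""
    return struct.unpack('<I', struct.pack('<f', value))[0]

def round_trips(bits, digits):
    """True if printing with `digits` significant digits recreates `bits`."""
    text = '%.*g' % (digits, bits_to_float(bits))
    return float_to_bits(float(text)) == bits

def smoke_test(samples=20000, seed=1):
    """Return the sampled bit patterns that fail to round-trip at 9 digits."""
    rng = random.Random(seed)
    failures = []
    for _ in range(samples):
        bits = rng.getrandbits(32)
        if (bits >> 23) & 0xff == 0xff:
            continue  # skip infinities and NaNs
        if not round_trips(bits, 9):
            failures.append(bits)
    return failures

print(smoke_test())                  # expected to be an empty list
# 0x447a0001 is the float just above 1000.0: eight digits lose it, nine keep it.
print(round_trips(0x447a0001, 8), round_trips(0x447a0001, 9))
```

Running the same loop against the Microsoft CRT instead of Python's formatter is what would actually settle the concern; the sketch only shows the shape of the check.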

Question though- I commented on one of your check-in lines. The %.9g format shows rounding of the value from
2.500000037e-002 to 0.0250000004. This seems bad from our goal of ensuring bit-perfect output.
That's actually perfectly fine. What you have to understand here is that there are certain numbers that cannot be represented precisely in base 2 floating point, just like there are numbers that cannot be represented accurately in base 10, such as 1/3 (0.333333...), pi (3.14159...), sqrt(2) (1.41421...) and so on. The thing is, there are some numbers that we can represent exactly in base 10 that can't be represented exactly in base 2, such as 0.1 and 0.025. These numbers cause a repeater in base 2 which will show up if they are printed with more precision than necessary. In this case:
ian@draal~/c/3d-fixes [i] (master)> ./float_to_hex.py 2.500000037e-002 0.0250000004 0x3ccccccd 0.025
from              float       check  double              check
----              -----       -----  ------              -----
2.500000037e-002  0x3ccccccd  0.025  0x3f9999999ff4e091  0.02500000037
0.0250000004      0x3ccccccd  0.025  0x3f999999a078d190  0.0250000004
0x3ccccccd        0.025       True   5.039740005e-315    True
0.025             0x3ccccccd  0.025  0x3f9999999999999a  0.025

You can see here that the developer most likely just used 0.025, which can't be represented exactly in base 2 causing a repeater (ccccc...). The rounding will always work out to recreate the original value (so long as the precision is at least large enough to represent their original value) - anything after that is just an artefact of rounding to fit in a 24bit mantissa of a 32bit float (the 'd' at the end).
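
The repeater can be reproduced with exact rational arithmetic. A small Python sketch (`binary_digits` is a hypothetical helper) that expands 1/40 = 0.025 in base 2 and compares it with what a 32-bit float actually stores:

```python
from fractions import Fraction
import struct

def binary_digits(value, count):
    """First `count` binary digits after the point, for 0 <= value < 1."""
    digits = []
    for _ in range(count):
        value *= 2
        digit = int(value)      # integer part is the next bit, 0 or 1
        digits.append(str(digit))
        value -= digit
    return ''.join(digits)

# 0.025 is exactly 1/40; its binary expansion never terminates.
print(binary_digits(Fraction(1, 40), 29))

# The float stores a rounded copy of that expansion.
bits = struct.unpack('<I', struct.pack('<f', 0.025))[0]
stored = struct.unpack('<f', struct.pack('<I', bits))[0]
print(hex(bits))
print(Fraction(stored) == Fraction(1, 40))   # the stored value is not exactly 1/40
```

The repeating 1100 groups are what show up as the run of c digits in the hex pattern, with the final digit changed by rounding.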

The algorithm in my float_to_hex.py works on the principle of finding the minimum precision necessary to recreate the original binary value exactly, and in this case it stops at %.3f when it finds that 0.025 (the check column) creates the same repeater as the original. We could potentially implement this same algorithm in 3DMigoto to clean up some of these cases, but %.9g will work.
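
The minimum-precision idea can be sketched as follows. This is an illustration of the principle only, not the code in float_to_hex.py; `shortest_literal` is a hypothetical name, and it steps through %g precisions rather than %f.

```python
import struct

def float_to_bits(value):
    """Round to single precision and return the bit pattern."""
    return struct.unpack('<I', struct.pack('<f', value))[0]

def bits_to_float(bits):
    """Reinterpret a 32-bit pattern as a float."""
    return struct.unpack('<f', struct.pack('<I', bits))[0]

def shortest_literal(bits):
    """Shortest %g text (1 to 9 digits) that parses back to `bits`.

    Intended for finite values only.
    """
    value = bits_to_float(bits)
    for digits in range(1, 10):
        text = '%.*g' % (digits, value)
        try:
            if float_to_bits(float(text)) == bits:
                return text
        except OverflowError:
            # A short rounding of a huge value can exceed the float range.
            continue
    return '%.9g' % value

print('%.9g' % bits_to_float(0x3ccccccd))   # what plain %.9g prints
print(shortest_literal(0x3ccccccd))          # the cleaned-up literal
```

Because the loop stops at the first precision that reproduces the original bits, the result is never less accurate than %.9g, only shorter.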


Posted 07/10/2015 10:27 AM   
One case where my algorithm would definitely help is for certain numbers like 0.0001. This number's base 10 exponent (-4) will cause %.9g to switch to scientific notation regardless of anything else, however the rounding is not pretty:
9.99999975e-05

Whereas the rounding with a precision of %.6e is slightly clearer since it rounded in a "nicer" direction, and this particular number doesn't need a larger precision:
1.000000e-04

This is just an artefact of the way the rounding went for this particular value and both of these will parse back to the same binary representation as 0.0001:
ian@draal~/c/3d-fixes [i] (master)> ./float_to_hex.py 0.0001 9.99999975e-05 1.000000e-04
from            float       check   double              check
----            -----       -----   ------              -----
0.0001          0x38d1b717  0.0001  0x3f1a36e2eb1c432d  0.0001
9.99999975e-05  0x38d1b717  0.0001  0x3f1a36e2e01d833c  0.0000999999975
1.000000e-04    0x38d1b717  0.0001  0x3f1a36e2eb1c432d  0.0001


But this is one case where my algorithm will clearly do better since it returns "0.0001"
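
The same comparison can be reproduced without float_to_hex.py. A short Python sketch (`f32_bits` is a hypothetical helper) that parses each spelling to single precision and prints both formats for the stored value:

```python
import struct

def f32_bits(text):
    """Parse decimal text and return the 32-bit float bit pattern."""
    return struct.unpack('<I', struct.pack('<f', float(text)))[0]

# Three different spellings, one bit pattern.
for literal in ('0.0001', '9.99999975e-05', '1.000000e-04'):
    print(literal, hex(f32_bits(literal)))

# The stored value sits just below 1e-4, so nine digits push %g into
# scientific notation, while seven digits round back up to 1e-04.
value = struct.unpack('<f', struct.pack('<I', 0x38d1b717))[0]
print('%.9g' % value)
print('%.6e' % value)
```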


Posted 07/10/2015 10:58 AM   
Question though- I commented on one of your check-in lines. The %.9g format shows rounding of the value from
2.500000037e-002 to 0.0250000004. This seems bad from our goal of ensuring bit-perfect output.
That's actually perfectly fine. What you have to understand here is that there are certain numbers that cannot be represented precisely in base 2 floating point, just like there are numbers that cannot be represented accurately in base 10, such as 1/3 (0.333333...), pi (3.14159...), sqrt(2) (1.41421...) and so on. The thing is, there are some numbers that we can represent exactly in base 10 that can't be represented exactly in base 2, such as 0.1 and 0.025. These numbers cause a repeater in base 2 which will show up if they are printed with more precision than necessary.

Hey cool, in all the time I've been doing computers, some 0x42160000 years, I've never run across the idea of repeating numeric patterns in binary. Makes sense, it's just funny that never came up until now. Thanks for the explanation.

I'm quite happy for you to look at improving the internal representation, but my changes are enough for now. Really it seems like we should store them internally as 32bit floats up until we actually output them using either %.9g or the algorithm in my float_to_hex.py.

Can't argue with that. At some point I'd like to refactor the Decompiler into smaller and more cohesive subroutines, so that would be a good time to keep everything as floats.


Posted 07/10/2015 12:20 PM   
My current code for handling floats:
string convertF(DWORD original) {
    char buf[80];
    char buf2[80];

    // Reinterpret the raw 32-bit pattern as a float.
    float fOriginal = reinterpret_cast<float &>(original);
    // Print once in scientific notation to read off the decimal exponent.
    // Relies on a three-digit exponent, as in 2.500000037E-002.
    sprintf(buf2, "%.9E", fOriginal);
    int len = strlen(buf2);
    if (buf2[len - 4] == '-') {
        // Negative exponent: widen the fixed-point precision so the
        // significant digits survive the leading zeros.
        int exp = atoi(buf2 + len - 3);
        switch (exp) {
        case 1:
            sprintf(buf, "%.9f", fOriginal);
            break;
        case 2:
            sprintf(buf, "%.10f", fOriginal);
            break;
        case 3:
            sprintf(buf, "%.11f", fOriginal);
            break;
        case 4:
            sprintf(buf, "%.12f", fOriginal);
            break;
        case 5:
            sprintf(buf, "%.13f", fOriginal);
            break;
        case 6:
            sprintf(buf, "%.14f", fOriginal);
            break;
        default:
            // Too small for fixed point: keep scientific notation.
            sprintf(buf, "%.9E", fOriginal);
            break;
        }
    } else {
        // Zero or positive exponent: both cases currently use %.8f.
        int exp = atoi(buf2 + len - 3);
        switch (exp) {
        case 0:
            sprintf(buf, "%.8f", fOriginal);
            break;
        default:
            sprintf(buf, "%.8f", fOriginal);
            break;
        }
    }
    string sLiteral(buf);
    // Parse the literal back and log any value that does not round-trip.
    DWORD newDWORD = strToDWORD(sLiteral);
    if (newDWORD != original) {
        if (failFile == NULL)
            failFile = fopen("debug.txt", "wb");
        FILE *f = failFile;
        fprintf(f, "%s\n", sLiteral.c_str());
        fprintf(f, "o:%08X\n", original);
        fprintf(f, "n:%08X\n", newDWORD);
        fprintf(f, "\n");
    }
    return sLiteral;
}

And strToDWORD
DWORD strToDWORD(string s) {
    // Special values, matched by their printed spelling.
    if (s == "-1.#IND0000")
        return 0xFFC00000;
    if (s == "1.#INF0000")
        return 0x7F800000;
    if (s == "-1.#INF0000")
        return 0xFF800000;
    if (s == "-1.#QNAN000")
        return 0xFFC10000;
    // Hex literals are already raw bit patterns.
    if (s.substr(0, 2) == "0x") {
        DWORD decimalValue;
        sscanf_s(s.c_str(), "0x%x", &decimalValue);
        return decimalValue;
    }
    // Anything with a decimal point is a float: return its bit pattern.
    if (s.find('.') < s.size()) {
        float f = (float)atof(s.c_str());
        DWORD* pF = (DWORD*)&f;
        return *pF;
    }
    // Otherwise treat the text as an integer.
    return atoi(s.c_str());
}

Currently gives binary accurate results when comparing to original binary.
The code is only applied when the original MS literal is not binary accurate to begin with.

Thanks to everybody using my assembler it warms my heart.
To have a critical piece of code that everyone can enjoy!
What more can you ask for?

donations: ulfjalmbrant@hotmail.com

Posted 07/10/2015 03:10 PM   
I've just pushed up a new feature to 3DMigoto that I know people have been looking forward to. Can anyone guess what this does:

[ShaderOverrideHUD]
Hash=xxx
x2=ps-t0
[TextureOverrideCrosshair]
Hash=xxx
filter_index=2

float4 stereo = StereoParams.Load(0);
float4 tex_filter = IniParams.Load(int2(2,0));
if (tex_filter.x == 2) {
    o0.x += stereo.x * 0.9;
}


Expect this in 3DMigoto 1.1.34 ;-)


Posted 07/15/2015 01:28 PM   
said:I've just pushed up a new feature to 3DMigoto that I know people have been looking forward to. Can anyone guess what this does:

[ShaderOverrideHUD]
Hash=xxx
x2=ps-t0
[TextureOverrideCrosshair]
Hash=xxx
filter_index=2

float4 stereo = StereoParams.Load(0);
float4 tex_filter = IniParams.Load(int2(2,0));
if (tex_filter.x == 2) {
    o0.x += stereo.x * 0.9;
}


Expect this in 3DMigoto 1.1.34 ;-)


Applying different values to a shader based on a "predefined"/selection texture id ? ^_^

1x Palit RTX 2080Ti Pro Gaming OC(watercooled and overclocked to hell)
3x 3D Vision Ready Asus VG278HE monitors (5760x1080).
Intel i9 9900K (overclocked to 5.3 and watercooled ofc).
Asus Maximus XI Hero Mobo.
16 GB Team Group T-Force Dark Pro DDR4 @ 3600.
Lots of Disks:
- Raid 0 - 256GB Sandisk Extreme SSD.
- Raid 0 - WD Black - 2TB.
- SanDisk SSD PLUS 480 GB.
- Intel 760p 256GB M.2 PCIe NVMe SSD.
Creative Sound Blaster Z.
Windows 10 x64 Pro.
etc


My website with my fixes and OpenGL to 3D Vision wrapper:
http://3dsurroundgaming.com

(If you like some of the stuff that I've done and want to donate something, you can do it with PayPal at tavyhome@gmail.com)

Posted 07/15/2015 01:38 PM   
Yeah, it works similar to the texture filtering in Helix mod, but with a few key differences:

- Instead of DefinedTexturesVS, a texture is considered defined if there is a TextureOverride section for it.

- There is no ValForDefined/ValNotDefined - the value passed in is 0 for undefined textures or 1 for defined textures.

- The filter_index works exactly the same as the [TEXnnnnnnnn] Index=X in Helix mod - that is, it provides a method to override the value passed into the shader, which is useful when different textures need different adjustments.

- The texture slot has to be explicitly specified (avoids problems with a texture being left assigned in a slot not used by the current shader). It will usually be ps-t0 (that is, slot 0 in the pixel shader), but if you need to test a different texture slot or even a texture assigned to a different shader in the pipeline you can. You could even test multiple slots at once if you needed to by passing them into different parameters of the IniParams resource (which now has more space - x,y,z,w,x1,y1,...,z7,w7). Use either ShaderUsage.txt or frame analysis to find the texture slot & hash.
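
To make the multi-slot point concrete, here is a rough d3dx.ini sketch. The section names and hashes are made-up placeholders, and the exact key spelling (in particular using vs-t0 for a vertex shader slot, by analogy with ps-t0) is an assumption that should be checked against the d3dx.ini shipped with the release. Treat it as an illustration of the description above, not a tested config.

```ini
; Hypothetical example: section names and hashes are placeholders.
[ShaderOverrideHUD]
Hash = 0123456789abcdef
; Pass the filter result for three different slots into separate
; IniParams components so the shader can test each one independently.
x = ps-t0
y = ps-t1
x1 = vs-t0

[TextureOverrideDamageNumbers]
Hash = 89abcdef
; No filter_index: the shader sees 1 when this texture is bound.

[TextureOverrideCrosshair]
Hash = 01234567
; With filter_index: the shader sees 2 instead of 1.
filter_index = 2
```

Any texture bound to one of those slots that has no TextureOverride section would show up as 0 in the corresponding parameter.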

Posted 07/15/2015 01:55 PM   
Really awesome features there! Looking really nice!!

Posted 07/15/2015 02:40 PM   
Very nice news
Thanks a lot DarkStarSword!!

MY WEB

Helix Mod - Making 3D Better

My 3D Screenshot Gallery

Like my fixes? you can donate to Paypal: dhr.donation@gmail.com

Posted 07/15/2015 04:06 PM   
Awesome! This will allow for different types of fixes, as well as auto-convergence - is that right?

Posted 07/16/2015 03:41 PM   
pirateguybrush said:Awesome! This will allow for different types of fixes, as well as auto-convergence - is that right?
It will allow for the same type of texture filtering as Helix mod which will help with UI adjustments, but auto-convergence isn't in yet (do you mean scene detection?). I've got more plans for this which will end up supporting scene detection and should work better if a key binding is also setting the same parameter, but that will take a bit more work + testing to get right. On the one hand I want scene detection to avoid adjusting the menus in Lichdom (adjusting part of the UI for damage numbers), but on the other I really want to focus on getting it out the door since it's very close, and I can always update it later.

Posted 07/17/2015 05:21 PM   