FormatBaseN() returns different values in Lucee compared to ACF

Description

This only affects negative numbers

Lucee
FormatBaseN( -17998321, 2 ) => -1000100101010000111110001

ACF
FormatBaseN(-17998321, 2) => 11111110111011010101111000001111
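
The two outputs above differ only in how a negative input is rendered. As a minimal sketch (the helper names `to_base`, `format_sign_magnitude` and `format_twos_complement_32` are hypothetical, not part of Lucee or ACF), both conventions can be reproduced like this, assuming a 32-bit width for the two's complement form:

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def to_base(n: int, radix: int) -> str:
    """Format a non-negative integer in the given radix (2..36)."""
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, rem = divmod(n, radix)
        out.append(DIGITS[rem])
    return "".join(reversed(out))

def format_sign_magnitude(n: int, radix: int) -> str:
    """Minus sign followed by the magnitude (matches the Lucee output reported above)."""
    return ("-" if n < 0 else "") + to_base(abs(n), radix)

def format_twos_complement_32(n: int, radix: int) -> str:
    """Digits of the 32-bit two's complement bit pattern, read as unsigned
    (matches the ACF output reported above). The 32-bit width is an assumption."""
    return to_base(n & 0xFFFFFFFF, radix)

print(format_sign_magnitude(-17998321, 2))
print(format_twos_complement_32(-17998321, 2))
```

For non-negative inputs the two helpers return the same string, which is consistent with the report that only negative numbers are affected.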

Activity

Pothys - MitrahSoft 9 February 2024 at 15:39

I've checked this issue with Lucee version 6.0.1.67-SNAPSHOT. Negative numbers passed to formatBaseN() now return the expected results, so the issue is resolved.

Michael Offner 8 February 2024 at 09:10

Adam Cameron 8 May 2023 at 10:29

I was chatting with about this on the Slack channel, and he asked me to transfer it here. So…


AC: Why might converting decimal -1234 to hex result in 0xfffffb2e rather than -0x4D2

 ChatGPT: When converting decimal -1234 to hexadecimal as a signed value (a minus sign followed by the magnitude), the result is -0x4D2. However, if you take its 32-bit two's complement bit pattern and convert that as an unsigned value, you get 0xfffffb2e.
This happens because when a decimal number is converted to hexadecimal, it is first converted to its binary representation. In the case of -1234, the binary representation in 32-bit two's complement form is 0b11111111111111111111101100101110.
If you interpret this binary number as an unsigned integer, you get 4294966062 (0xfffffb2e in hexadecimal). However, if you interpret it as a 32-bit two's complement signed integer, you get -1234 (whose bit pattern is still 0xfffffb2e in hexadecimal).
Therefore, it is important to specify whether you want to convert the decimal number to an unsigned or a signed hexadecimal representation, depending on the context of your problem.

That explains the result. Still thinking / researching as to whether I reckon that's what CF ought to be doing.

NB: it's been behaving that way since at least CF5.
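
To make the signed/unsigned point concrete, here is a small sketch using Python's standard `struct` module. It packs -1234 into four bytes and reads the same bytes back both ways. The big-endian byte order is only chosen so the hex dump reads left to right. It says nothing about how CF or Lucee store the value internally.

```python
import struct

# Pack -1234 as a signed 32-bit integer (big-endian), giving 4 raw bytes.
raw = struct.pack(">i", -1234)

# Read the very same bytes back as unsigned and as signed.
as_unsigned = struct.unpack(">I", raw)[0]
as_signed = struct.unpack(">i", raw)[0]

print(raw.hex())     # fffffb2e
print(as_unsigned)   # 4294966062
print(as_signed)     # -1234
```

The bit pattern is identical in both reads. Only the interpretation changes.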


ZS: one question that comes to mind: does it round-trip backwards?

AC: https://trycf.com/gist/b9e3f5d1cb56467218aa2518357e6e33/lucee5?theme=monokai

// cf / lucee
writeDump(formatBaseN(-1234,16)); // fffffb2e / -4d2
writeDump(formatBaseN(1234,16)); // 4d2
writeDump(inputBaseN("fffffb2e", 16)); // 4294966062
writeDump(inputBaseN("-4d2", 16)); // -1234

Re round trip... it would seem reasonable to me that formatBaseN and inputBaseN follow the same rules re signed/unsigned ints
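
A rough sketch of the round-trip question, with hypothetical helpers (`input_base_n` and `as_signed_32` are illustrative names, not the CFML functions): a plain parse of the two's complement string gives a large positive number, and only an extra 32-bit signed reinterpretation recovers the original value. Whether inputBaseN should do that reinterpretation is exactly the open question, so the second helper is an assumption about one possible rule, not a description of either engine.

```python
def input_base_n(text: str, radix: int) -> int:
    """Plain parse: digits are taken at face value, an optional leading '-' negates."""
    return int(text, radix)

def as_signed_32(value: int) -> int:
    """Reinterpret the low 32 bits of value as a two's complement signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

# Sign-and-magnitude output round-trips with a plain parse.
print(input_base_n("-4d2", 16))                    # -1234

# Two's complement output does not, unless the parser applies the same 32-bit rule.
print(input_base_n("fffffb2e", 16))                # 4294966062
print(as_signed_32(input_base_n("fffffb2e", 16)))  # -1234
```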

ZS: I just quickly cross ref-ed those two functions https://github.com/lucee/lucee-docs/pull/1369


AC:

PHP:
https://onlinephp.io/c/83cd3d

echo dechex(-1234); // fffffffffffffb2e
echo PHP_EOL;
echo dechex(1234); // 4d2

JS:

(-1234).toString(16)
'-4d2'
(1234).toString(16)
'4d2'

Ruby:

irb(main):001:0> -1234.to_s(16)
=> "-4d2"

Python:

>>> print(hex(-1234))
-0x4d2

Groovy:

groovy> Integer.toHexString(-1234)
Result: fffffb2e

Java (via CFML...):
(https://trycf.com/gist/909a4e1cbc53f26b32b6d5de30c3d72a/lucee5?theme=monokai )

Long = createObject("java", "java.lang.Long")
l = createObject("java", "java.lang.Long").init(-1234)
n = -1234
writeDump([
    l = [
        type = l.getClass().getName(), // java.lang.Long
        hex = Long.toHexString(l) // fffffffffffffb2e
    ],
    n = [
        type = n.getClass().getName(), // java.lang.Double
        hex = Long.toHexString(n) // fffffffffffffb2e
    ]
])

So. I'd say there's basis for Lucee doing it the way that it is, but there is also basis for the way CF does it, and it's always done it that way, so... it's right. I think that raw Java being the same also kinda provides the important precedent here.

Clojure is interesting though:

user=> (Integer/toString -1234 16)
"-4d2"

Given it's a JVM language 'n' all. Indicates perhaps that my Java code is not the final word on how it works in Java. There's no doubt some (un)signed idiosyncrasy that we ought to be considering.
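
One factor that may explain the spread of results above: in Java, `Integer.toHexString` and `Long.toHexString` format the two's complement bits as unsigned, while `Integer.toString(int, radix)` (the method the Clojure call invokes) formats a signed value with a leading minus. The width of the type also matters, which is why the 64-bit results carry eight extra `f` digits. A small sketch, with a hypothetical helper `twos_complement_hex`, shows the width effect:

```python
def twos_complement_hex(n: int, bits: int) -> str:
    """Hex digits of n's two's complement bit pattern at the given width."""
    mask = (1 << bits) - 1
    return format(n & mask, "x")

print(twos_complement_hex(-1234, 32))  # fffffb2e          (32-bit int)
print(twos_complement_hex(-1234, 64))  # fffffffffffffb2e  (64-bit long)
print(format(-1234, "x"))              # -4d2              (signed formatting)
```

So a two's complement rendering is only well defined once a bit width is fixed. The sign-and-magnitude rendering does not depend on width.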

I'd try to rope Sean Corfield into this, as he's the brains of the outfit around here :wink:

I think there's def a Q for Mark Takata in here too, as to whether the round trip via inputBaseN is working predictably / by design on CF.


Do with this info what you will. I can see arguments both ways on this, and the Lucee round-trip is more robust, but this perhaps indicates an issue with CF’s inputBaseN function rather than an issue with formatBaseN. If CF’s decision on formatBaseN is informed / by design, then I think that’s how Lucee ought to behave as well.

Zac Spitzer 8 May 2023 at 08:40

[java] [script] -> JUnitReport: C:\\work\\lucee6\\test/reports/junit-test-results-6.0.0.387-SNAPSHOT.xml
[java] [script] Failed: test.functions.FormatBaseN
[java] [script] Format to base binary(2)
[java] [script] Expected [11111111111111111111101100101110] but received [-10011010010]
[java] [script] C:\\work\\lucee6\\test\\functions\\FormatBaseN.cfc:8
[java] [script] C:\\work\\lucee6\\test\\_testRunner.cfc:275
[java] [script] C:\\work\\lucee6\\test\\run-tests.cfm:234
[java] [script]
[java] [script] Failed: test.functions.FormatBaseN
[java] [script] Format to base hexadecimal(16)
[java] [script] Expected [fffffb2e] but received [-4d2]
[java] [script] C:\\work\\lucee6\\test\\functions\\FormatBaseN.cfc:26
[java] [script] C:\\work\\lucee6\\test\\_testRunner.cfc:275
[java] [script] C:\\work\\lucee6\\test\\run-tests.cfm:234

I think the ACF is wrong for the second one? https://trycf.com/gist/2dd7f3de3692f85bd5b104957fea79aa/acf2021?theme=monokai

Fixed

Created 7 December 2021 at 07:16
Updated 18 September 2024 at 10:11
Resolved 9 February 2024 at 15:39
