Determining the Maximum Decimal Digits at Compile-Time




Introduction

Suppose you want to convert an integer value to its decimal string representation, e.g., 42 to "42". In C, you have to know how big to make the string buffer. Specifically, given some integral type T, you need to know how many decimal digits comprise max(T) (and min(T)). Using sizeof alone doesn’t help since that gives you the number of bytes, not the number of decimal digits.

In general, the number of decimal digits d required to represent an integer of b bits is:

d = ceil(b * log10(2))
  = ceil(b * .3010299)
  = (unsigned)(b * .3010299 + 1)
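
As a quick illustration of the formula, here is a minimal sketch that evaluates it at run-time. The function name digits_for_bits is made up for this example, and the constant is the same rounded value of log10(2) used above.

```c
// Hypothetical helper: decimal digits needed for an unsigned value of
// `bits` bits, using the floating-point formula (evaluated at run-time).
static unsigned digits_for_bits( unsigned bits ) {
  // log10(2) is about .3010299. The conversion to unsigned truncates, so
  // adding 1 first rounds the (never whole) product up.
  return (unsigned)(bits * .3010299 + 1);
}
```

For example, digits_for_bits(32) yields 10, which matches the 10 digits of 4294967295.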

However, even if you implement a macro like:

#define MAX_DEC_INT_DIGITS(TYPE) \
  ((unsigned)(sizeof(TYPE) * CHAR_BIT * .3010299 + 1))

you can’t use it where the value must be known at compile-time, such as when declaring an array, because the expression isn’t an integer constant expression:

char buf[ MAX_DEC_INT_DIGITS(int) ];  // error

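Actually, in C, this would work inside a function if your compiler supports variable length arrays (VLAs); but, in general, you don’t want to use VLAs. C++ doesn’t support VLAs; however, C++11 and later permit floating-point arithmetic in constant expressions, so a C++11 compiler would accept this declaration as written. The rest of this article is about C.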

You might ask:

If sizeof is a compile-time operator, why can’t the value be calculated at compile-time?

Because, in C, the size of a fixed-size array must be an integer constant expression, and floating-point arithmetic isn’t allowed in one: a floating constant may appear only as the immediate operand of a cast. So the question is: can multiplying by .3010299 be approximated using integer math? It turns out, yes.



The Trick

The trick is to realize that 1233 / 4096 = .30102539 which is a close approximation of .3010299. Integer division by 4096 is the same as right-shifting by 12. Therefore, the macro can become:

#define MAX_DEC_INT_DIGITS(TYPE) \
  (((sizeof(TYPE) * CHAR_BIT * 1233) >> 12) + 1)

It’s easy to check since TYPE will only ever be one of the integer types and the number of bits will typically only be one of 8, 16, 32, or 64. If you do the math, that works out to 3, 5, 10, and 20 — which is correct. Almost.
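
One way to double-check those numbers is to compare the integer approximation against a digit count obtained by repeated division. The sketch below does that for the four common widths. The names approx_digits, count_digits, and check_approximation are made up for this example, and it assumes only that unsigned long long is at least 64 bits wide, which C99 guarantees.

```c
#include <assert.h>

// Hypothetical helper: the integer approximation of bits * log10(2), plus 1.
static unsigned approx_digits( unsigned bits ) {
  return (unsigned)((bits * 1233ul) >> 12) + 1;
}

// Hypothetical helper: counts the decimal digits of n by repeated division.
static unsigned count_digits( unsigned long long n ) {
  unsigned digits = 1;
  while ( n >= 10 ) {
    n /= 10;
    ++digits;
  }
  return digits;
}

// Asserts that the approximation equals the real digit count of the
// largest unsigned value for each of the common widths.
static void check_approximation( void ) {
  static const unsigned widths[] = { 8, 16, 32, 64 };
  for ( unsigned i = 0; i < sizeof widths / sizeof widths[0]; ++i ) {
    unsigned const bits = widths[i];
    // Largest value representable in `bits` bits; 64 is special-cased
    // because shifting by the full width would be undefined.
    unsigned long long const max =
      bits == 64 ? 0xFFFFFFFFFFFFFFFFull : (1ull << bits) - 1;
    assert( approx_digits( bits ) == count_digits( max ) );
  }
}
```

Calling check_approximation() from main completes silently when the approximation agrees with the counted digits for every width.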

For signed integer types, there needs to be +1 to account for the minus sign — so you need to add 1 only if TYPE is signed. We can implement an IS_SIGNED_TYPE macro like:

#define IS_SIGNED_TYPE(TYPE)    (!IS_UNSIGNED_TYPE(TYPE))
#define IS_UNSIGNED_TYPE(TYPE)  ((TYPE)-1 > 0)

That is, if -1 cast to TYPE is greater than 0, it means TYPE is unsigned; and the ! of that means TYPE is signed. Now the macro can be:

#define MAX_DEC_INT_DIGITS(TYPE)                \
  (((sizeof(TYPE) * CHAR_BIT * 1233) >> 12) + 1 \
    + IS_SIGNED_TYPE(TYPE))

and, because this is an integer constant expression, the compiler can calculate it at compile-time.
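
To tie this back to the original problem, here is a minimal sketch of sizing a buffer with the macro. It assumes a C11 compiler (for static_assert), uses the form of the macro that includes both the + 1 term and the sign term, and adds one more byte for the terminating null character since the macro counts only digits and the sign. The names int_buf and int_to_dec are hypothetical.

```c
#include <assert.h>   // static_assert (C11)
#include <limits.h>   // CHAR_BIT, INT_MIN
#include <stdio.h>    // snprintf

#define IS_UNSIGNED_TYPE(TYPE)  ((TYPE)-1 > 0)
#define IS_SIGNED_TYPE(TYPE)    (!IS_UNSIGNED_TYPE(TYPE))

// Digits needed for the largest value, plus 1 for a minus sign if signed.
#define MAX_DEC_INT_DIGITS(TYPE)                \
  (((sizeof(TYPE) * CHAR_BIT * 1233) >> 12) + 1 \
    + IS_SIGNED_TYPE(TYPE))

// (unsigned)-1 wraps around to UINT_MAX, which is greater than 0.
static_assert( IS_UNSIGNED_TYPE(unsigned), "unsigned is unsigned" );
// (int)-1 stays -1, which is not greater than 0.
static_assert( IS_SIGNED_TYPE(int), "int is signed" );

// The macro is an integer constant expression, so it can size an ordinary
// (non-VLA) array, even at file scope. The extra 1 is for the null character.
static char int_buf[ MAX_DEC_INT_DIGITS(int) + 1 ];

// Hypothetical helper: formats n into the static buffer and returns it.
static char const* int_to_dec( int n ) {
  snprintf( int_buf, sizeof int_buf, "%d", n );
  return int_buf;
}
```

Note that whether plain char is signed is implementation-defined, so IS_SIGNED_TYPE(char) can differ between platforms.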



Conclusion

Using an integer approximation for a floating-point calculation along with some clever macros can allow you to generate constant integer expressions that can be evaluated at compile-time.


