Coder Perfect

64 bit ntohl() in C++?

Problem

According to the man pages, htonl() can only be used for values up to 32 bits. (In reality, ntohl() is defined for unsigned long, which is 32 bits on my platform. I assume that if unsigned long were 8 bytes, it would work for 64-bit ints.)

My problem is that I need to convert 64-bit integers (in my case, unsigned long long) from big-endian to little-endian. Right now, I need to do that specific conversion. But it would be even nicer if the function (like ntohl()) did NOT convert my 64-bit value when the target platform is big-endian. (I’d rather avoid adding my own preprocessor magic to do this.)

What can I use? I would like something standard if it exists, but I am open to implementation suggestions. I have seen this type of conversion done in the past using unions. I suppose I could have a union with an unsigned long long and a char[8], then swap the bytes around accordingly. (Obviously, this would break on big-endian platforms.)

Asked by Tom

Solution #1

Documentation: see man htobe64 on Linux (glibc >= 2.9) or FreeBSD.

Unfortunately, during an attempt in 2009 to produce a single (non-kernel-API) libc standard, OpenBSD, FreeBSD, and glibc (Linux) did not get along so well.

This short piece of preprocessor code is currently in use:

#if defined(__linux__)
#  include <endian.h>
#elif defined(__FreeBSD__) || defined(__NetBSD__)
#  include <sys/endian.h>
#elif defined(__OpenBSD__)
#  include <sys/types.h>
#  define be16toh(x) betoh16(x)
#  define be32toh(x) betoh32(x)
#  define be64toh(x) betoh64(x)
#endif

It has been tested on Linux and OpenBSD and should hide the differences. It provides the Linux/FreeBSD-style macros on those four platforms.

Use example:

  #include <stdint.h>    // For 'uint64_t'

  uint64_t  host_int = 123;
  uint64_t  big_endian;

  big_endian = htobe64( host_int );
  host_int = be64toh( big_endian );

This approach is the most “standard C library”-like way currently available.
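
As a rough illustration of how these macros might be used when writing a value into a byte buffer, here is a minimal sketch for Linux with glibc 2.9 or later only (other platforms would need the include logic shown earlier). The helper names to_wire, from_wire, and wire_demo are made up for this example; htobe64() and be64toh() are the real library macros.

```cpp
#include <endian.h>   // htobe64(), be64toh(); this sketch assumes Linux/glibc >= 2.9
#include <cassert>
#include <cstdint>
#include <cstring>    // std::memcpy

// Hypothetical helper: store a host-order value into an 8-byte buffer
// in big-endian (network) order.
inline void to_wire(std::uint64_t host_value, unsigned char out[8]) {
    const std::uint64_t big_endian = htobe64(host_value);
    std::memcpy(out, &big_endian, sizeof big_endian);
}

// Hypothetical helper: read an 8-byte big-endian buffer back into host order.
inline std::uint64_t from_wire(const unsigned char in[8]) {
    std::uint64_t big_endian;
    std::memcpy(&big_endian, in, sizeof big_endian);
    return be64toh(big_endian);
}

// Small self-check of the two helpers.
inline void wire_demo() {
    unsigned char buf[8];
    to_wire(0x0102030405060708ULL, buf);
    assert(buf[0] == 0x01 && buf[7] == 0x08);  // most significant byte first
    assert(from_wire(buf) == 0x0102030405060708ULL);
}
```

The buffer contents are the same on little-endian and big-endian hosts, because the macros swap only when the host order differs from big-endian.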

Answered by Nanno Langstraat

Solution #2

http://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html is a good place to start.

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

uint64_t
ntoh64(const uint64_t *input)
{
    uint64_t rval;
    uint8_t *data = (uint8_t *)&rval;

    data[0] = *input >> 56;
    data[1] = *input >> 48;
    data[2] = *input >> 40;
    data[3] = *input >> 32;
    data[4] = *input >> 24;
    data[5] = *input >> 16;
    data[6] = *input >> 8;
    data[7] = *input >> 0;

    return rval;
}

uint64_t
hton64(const uint64_t *input)
{
    return (ntoh64(input));
}

int
main(void)
{
    uint64_t ull;

    ull = 1;
    printf("%"PRIu64"\n", ull);

    ull = ntoh64(&ull);
    printf("%"PRIu64"\n", ull);

    ull = hton64(&ull);
    printf("%"PRIu64"\n", ull);

    return 0;
}

On a little-endian machine, the following output will be displayed:

1
72057594037927936
1

You can test this against ntohl() if you drop the upper four bytes.
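
In the spirit of the linked article, the same shift-based idea can be applied directly to a raw byte buffer, so the host byte order never has to be inspected. The following is only a sketch; the names load_be64, store_be64, and load_store_demo are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: assemble a host-order value from 8 big-endian bytes.
inline std::uint64_t load_be64(const unsigned char *p) {
    std::uint64_t v = 0;
    for (int i = 0; i < 8; ++i) {
        v = (v << 8) | p[i];   // p[0] ends up in the most significant byte
    }
    return v;
}

// Hypothetical helper: write a host-order value as 8 big-endian bytes.
inline void store_be64(std::uint64_t v, unsigned char *p) {
    for (int i = 0; i < 8; ++i) {
        p[i] = static_cast<unsigned char>(v >> (56 - 8 * i));
    }
}

// Small self-check: the value 0x0102 occupies the last two wire bytes.
inline void load_store_demo() {
    const unsigned char wire[8] = {0, 0, 0, 0, 0, 0, 0x01, 0x02};
    assert(load_be64(wire) == 0x0102);

    unsigned char out[8];
    store_be64(0x0102, out);
    assert(out[6] == 0x01 && out[7] == 0x02);
}
```

Because the shifts operate on values rather than on memory layout, this code behaves identically on big-endian and little-endian hosts.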

You can also transform this into a nice C++ templated function that works with any size integer:

template <typename T>
static inline T
hton_any(const T &input)
{
    T output(0);
    const std::size_t size = sizeof(input);
    uint8_t *data = reinterpret_cast<uint8_t *>(&output);

    for (std::size_t i = 0; i < size; i++) {
        data[i] = input >> ((size - i - 1) * 8);
    }

    return output;
}

Now you’re 128-bit safe too!

Answered by user442585

Solution #3

Use the following union to determine your endianness:

union {
    unsigned long long ull;
    char c[8];
} x;
x.ull = 0x0123456789abcdef; // may need special suffix for ULL.

The contents of x.c[] can then be examined to determine where each byte went.
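
As a sketch of that inspection step, the version below copies the value into a byte array with std::memcpy instead of reading a different union member, which sidesteps the union-access question in C++. It assumes a 64-bit unsigned long long, and the function names are made up for this example.

```cpp
#include <cassert>
#include <cstring>

// Returns the byte stored at the lowest address of a known 64-bit pattern.
// 0xef means least significant byte first (little-endian);
// 0x01 means most significant byte first (big-endian).
// Assumes unsigned long long is exactly 64 bits wide.
inline unsigned char first_byte_in_memory() {
    const unsigned long long probe = 0x0123456789abcdefULL;
    unsigned char bytes[sizeof probe];
    std::memcpy(bytes, &probe, sizeof probe);
    return bytes[0];
}

inline bool host_is_little_endian() { return first_byte_in_memory() == 0xef; }
inline bool host_is_big_endian()    { return first_byte_in_memory() == 0x01; }

// Holds on pure little-endian or big-endian hosts; a mixed-endian
// machine would fail this check.
inline void detect_demo() {
    assert(host_is_little_endian() || host_is_big_endian());
}
```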

To perform the conversion, I’d first use the detection code to determine the platform’s endianness, then construct my own function to perform the swaps.

You could make it dynamic so that it runs on any platform (detect once, then use a switch inside your conversion code to choose the appropriate conversion), but if you’re only going to use one platform, I’d just do the detection in a separate program and then write a simple conversion routine, making sure to document that it only runs (or has been tested) on that platform.

Here’s some code I put together to demonstrate the concept. It has been tested, though not thoroughly, and should be enough to get you started.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TYP_INIT 0
#define TYP_SMLE 1
#define TYP_BIGE 2

static unsigned long long cvt(unsigned long long src) {
    static int typ = TYP_INIT;
    unsigned char c;
    union {
        unsigned long long ull;
        unsigned char c[8];
    } x;

    if (typ == TYP_INIT) {
        x.ull = 0x01;
        typ = (x.c[7] == 0x01) ? TYP_BIGE : TYP_SMLE;
    }

    if (typ == TYP_SMLE)
        return src;

    x.ull = src;
    c = x.c[0]; x.c[0] = x.c[7]; x.c[7] = c;
    c = x.c[1]; x.c[1] = x.c[6]; x.c[6] = c;
    c = x.c[2]; x.c[2] = x.c[5]; x.c[5] = c;
    c = x.c[3]; x.c[3] = x.c[4]; x.c[4] = c;
    return x.ull;
}

int main (void) {
    unsigned long long ull = 1;
    ull = cvt (ull);
    printf ("%llu\n",ull);
    return 0;
}

Keep in mind that this only checks for pure big/little endian. If the bytes are stored in some strange order, such as {5,2,3,1,0,7,6,4}, cvt() will be a little more complicated. Such an architecture doesn’t deserve to exist, but I’m not ruling out anything from our friends in the microprocessor industry :-)

Also keep in mind that this is technically undefined behavior, since you’re not supposed to access a union member through any field other than the last one written. It will probably work with most implementations, but if you’re a purist, you should probably just bite the bullet and use macros to define your own routines, such as:

// Assumes 64-bit unsigned long long.
// Each term masks one byte with bitwise & (not logical &&) and moves it
// to the mirrored position.
unsigned long long switchOrderFn (unsigned long long in) {
    in  = (in & 0xff00000000000000ULL) >> 56
        | (in & 0x00ff000000000000ULL) >> 40
        | (in & 0x0000ff0000000000ULL) >> 24
        | (in & 0x000000ff00000000ULL) >> 8
        | (in & 0x00000000ff000000ULL) << 8
        | (in & 0x0000000000ff0000ULL) << 24
        | (in & 0x000000000000ff00ULL) << 40
        | (in & 0x00000000000000ffULL) << 56;
    return in;
}
#ifdef ULONG_IS_NET_ORDER
    #define switchOrder(n) (n)
#else
    #define switchOrder(n) switchOrderFn(n)
#endif
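
A swap routine like this is easy to get subtly wrong (for instance, it must use the bitwise & rather than the logical &&), so a quick check is worthwhile. The sketch below restates the function under the made-up name swap_bytes64 and checks it against known patterns; like the answer, it assumes a 64-bit unsigned long long.

```cpp
#include <cassert>

// Reverse the byte order of a 64-bit value with masks and shifts.
inline unsigned long long swap_bytes64(unsigned long long in) {
    return ((in & 0xff00000000000000ULL) >> 56)
         | ((in & 0x00ff000000000000ULL) >> 40)
         | ((in & 0x0000ff0000000000ULL) >> 24)
         | ((in & 0x000000ff00000000ULL) >> 8)
         | ((in & 0x00000000ff000000ULL) << 8)
         | ((in & 0x0000000000ff0000ULL) << 24)
         | ((in & 0x000000000000ff00ULL) << 40)
         | ((in & 0x00000000000000ffULL) << 56);
}

inline void swap_demo() {
    // Every byte moves to the mirrored position.
    assert(swap_bytes64(0x0102030405060708ULL) == 0x0807060504030201ULL);
    assert(swap_bytes64(1ULL) == 0x0100000000000000ULL);
    // Swapping twice must give back the original value.
    assert(swap_bytes64(swap_bytes64(0x1122334455667788ULL))
           == 0x1122334455667788ULL);
}
```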

Answered by paxdiablo

Solution #4

#include <endian.h>    // __BYTE_ORDER __LITTLE_ENDIAN
#include <byteswap.h>  // bswap_64()

uint64_t value = 0x1122334455667788;

#if __BYTE_ORDER == __LITTLE_ENDIAN
value = bswap_64(value);  // Compiler builtin GCC/Clang
#endif

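According to zhaorufei (see her/his comment), endian.h is not a C++ standard header, and the macros __BYTE_ORDER and __LITTLE_ENDIAN may be undefined. Because undefined macros are treated as 0 in an #if, both sides of the comparison would then be 0 and the test would succeed regardless of the platform’s actual byte order, so the #if statement is not reliable.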

Please update this answer if you want to share your elegant C++ trick for detecting endianness.
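
One way to avoid the undefined-macro pitfall is to check that the macros exist before comparing them. The sketch below does not use endian.h at all; it relies on the __BYTE_ORDER__, __ORDER_LITTLE_ENDIAN__, and __ORDER_BIG_ENDIAN__ macros that GCC and Clang predefine, and deliberately stops the build on any other compiler. The names kHostIsLittleEndian, to_big_endian64, and to_big_endian_demo are made up for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
    defined(__ORDER_BIG_ENDIAN__)
#  if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
constexpr bool kHostIsLittleEndian = true;
#  elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
constexpr bool kHostIsLittleEndian = false;
#  else
#    error "Unsupported (mixed) byte order"
#  endif
#else
#  error "Byte-order macros are not defined; add detection for this compiler"
#endif

// Hypothetical helper: host order to big-endian, swapping only when needed.
// __builtin_bswap64 is a GCC/Clang builtin.
inline std::uint64_t to_big_endian64(std::uint64_t v) {
    return kHostIsLittleEndian ? __builtin_bswap64(v) : v;
}

inline void to_big_endian_demo() {
    const std::uint64_t be = to_big_endian64(0x0102030405060708ULL);
    unsigned char bytes[8];
    std::memcpy(bytes, &be, sizeof be);
    // In memory, a big-endian value starts with its most significant byte.
    assert(bytes[0] == 0x01 && bytes[7] == 0x08);
}
```

Failing loudly with #error is a design choice here: a silent fallback would reintroduce the same unpredictability the comment warns about.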

Moreover, the macro bswap_64() is available with the GCC and Clang compilers but not with the Visual C++ compiler. The following snippet may serve as a starting point for portable source code:

#ifdef _MSC_VER
  #include <stdlib.h>
  #define bswap_16(x) _byteswap_ushort(x)
  #define bswap_32(x) _byteswap_ulong(x)
  #define bswap_64(x) _byteswap_uint64(x)
#else
  #include <byteswap.h>  // bswap_16 bswap_32 bswap_64
#endif

See also more portable source code: Cross-platform _byteswap_uint64.

A generic hton(), named htonT() below, for 16 bits, 32 bits, 64 bits, and more…

#include <endian.h>   // __BYTE_ORDER __LITTLE_ENDIAN
#include <algorithm>  // std::reverse()

template <typename T>
constexpr T htonT (T value) noexcept
{
#if __BYTE_ORDER == __LITTLE_ENDIAN
  char* ptr = reinterpret_cast<char*>(&value);
  std::reverse(ptr, ptr + sizeof(T));
#endif
  return value;
}

The same function written as a single return expression, as C++11 constexpr requires:

template <typename T>
constexpr T htonT (T value, char* ptr=0) noexcept
{
  return 
#if __BYTE_ORDER == __LITTLE_ENDIAN
    ptr = reinterpret_cast<char*>(&value), 
    std::reverse(ptr, ptr + sizeof(T)),
#endif
    value;
}

Using -Wall -Wextra -pedantic, there are no compilation warnings on clang-3.5 and GCC-4.9 (see compilation and run output on coliru).

However, the above code does not allow creating constexpr variables such as the following, because reinterpret_cast is not permitted in a constant expression:

constexpr int32_t hton_six = htonT( int32_t(6) );

Finally, we need to separate (specialize) the functions for 16/32/64 bits, but we can still keep generic functions. (The complete snippet is on Coliru.)

The C++11 snippet below uses the std::enable_if trait to exploit Substitution Failure Is Not An Error (SFINAE).

template <typename T>
constexpr typename std::enable_if<sizeof(T) == 2, T>::type
htonT (T value) noexcept
{
   return  ((value & 0x00FF) << 8)
         | ((value & 0xFF00) >> 8);
}

template <typename T>
constexpr typename std::enable_if<sizeof(T) == 4, T>::type
htonT (T value) noexcept
{
   return  ((value & 0x000000FF) << 24)
         | ((value & 0x0000FF00) <<  8)
         | ((value & 0x00FF0000) >>  8)
         | ((value & 0xFF000000) >> 24);
}

template <typename T>
constexpr typename std::enable_if<sizeof(T) == 8, T>::type
htonT (T value) noexcept
{
   return  ((value & 0xFF00000000000000ull) >> 56)
         | ((value & 0x00FF000000000000ull) >> 40)
         | ((value & 0x0000FF0000000000ull) >> 24)
         | ((value & 0x000000FF00000000ull) >>  8)
         | ((value & 0x00000000FF000000ull) <<  8)
         | ((value & 0x0000000000FF0000ull) << 24)
         | ((value & 0x000000000000FF00ull) << 40)
         | ((value & 0x00000000000000FFull) << 56);
}
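
The test code and generated assembly further down also call htonT() on a one-byte type, which needs an overload that is not shown above. The following is a minimal sketch, assuming the natural choice of returning the value unchanged. The four-byte overload is repeated only so that the block compiles on its own, and the static_assert lines illustrate that the results can be used in constant expressions.

```cpp
#include <cstdint>
#include <type_traits>

// Assumed overload (not shown in the answer): a single byte has no byte
// order, so it is returned unchanged.
template <typename T>
constexpr typename std::enable_if<sizeof(T) == 1, T>::type
htonT (T value) noexcept
{
   return value;
}

// Same four-byte overload as above, repeated for self-containment.
template <typename T>
constexpr typename std::enable_if<sizeof(T) == 4, T>::type
htonT (T value) noexcept
{
   return  ((value & 0x000000FF) << 24)
         | ((value & 0x0000FF00) <<  8)
         | ((value & 0x00FF0000) >>  8)
         | ((value & 0xFF000000) >> 24);
}

// Evaluated at compile time; a failure here is a compile error.
static_assert(htonT(std::uint8_t(0x42)) == 0x42,
              "one byte is returned unchanged");
static_assert(htonT(std::uint32_t(0x11223344)) == 0x44332211,
              "four bytes are reversed");
```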

Alternatively, an even shorter version can be based on the bswap_*() macros and the C++14 alias std::enable_if_t<xxx>, which is shorthand for std::enable_if<xxx>::type:

template <typename T>
constexpr typename std::enable_if_t<sizeof(T) == 2, T>
htonT (T value) noexcept
{
    return bswap_16(value);  // __bswap_constant_16
}

template <typename T>
constexpr typename std::enable_if_t<sizeof(T) == 4, T>
htonT (T value) noexcept
{
    return bswap_32(value);  // __bswap_constant_32
}

template <typename T>
constexpr typename std::enable_if_t<sizeof(T) == 8, T>
htonT (T value) noexcept
{
    return bswap_64(value);  // __bswap_constant_64
}

Test code using runtime variables:

std::uint8_t uc = 'B';                  std::cout <<std::setw(16)<< uc <<'\n';
uc = htonT( uc );                       std::cout <<std::setw(16)<< uc <<'\n';

std::uint16_t us = 0x1122;              std::cout <<std::setw(16)<< us <<'\n';
us = htonT( us );                       std::cout <<std::setw(16)<< us <<'\n';

std::uint32_t ul = 0x11223344;          std::cout <<std::setw(16)<< ul <<'\n';
ul = htonT( ul );                       std::cout <<std::setw(16)<< ul <<'\n';

std::uint64_t uL = 0x1122334455667788; std::cout <<std::setw(16)<< uL <<'\n';
uL = htonT( uL );                      std::cout <<std::setw(16)<< uL <<'\n';

Test code using constexpr variables:

constexpr uint8_t  a1 = 'B';               std::cout<<std::setw(16)<<a1<<'\n';
constexpr auto     b1 = htonT(a1);         std::cout<<std::setw(16)<<b1<<'\n';

constexpr uint16_t a2 = 0x1122;            std::cout<<std::setw(16)<<a2<<'\n';
constexpr auto     b2 = htonT(a2);         std::cout<<std::setw(16)<<b2<<'\n';

constexpr uint32_t a4 = 0x11223344;        std::cout<<std::setw(16)<<a4<<'\n';
constexpr auto     b4 = htonT(a4);         std::cout<<std::setw(16)<<b4<<'\n';

constexpr uint64_t a8 = 0x1122334455667788;std::cout<<std::setw(16)<<a8<<'\n';
constexpr auto     b8 = htonT(a8);         std::cout<<std::setw(16)<<b8<<'\n';

Output:

               B
               B
            1122
            2211
        11223344
        44332211
1122334455667788
8877665544332211

The generated code, as shown by the online C++ compiler gcc.godbolt.org:

g++-4.9.2 -std=c++14 -O3

std::enable_if<(sizeof (unsigned char))==(1), unsigned char>::type htonT<unsigned char>(unsigned char):
    movl    %edi, %eax
    ret
std::enable_if<(sizeof (unsigned short))==(2), unsigned short>::type htonT<unsigned short>(unsigned short):
    movl    %edi, %eax
    rolw    $8, %ax
    ret
std::enable_if<(sizeof (unsigned int))==(4), unsigned int>::type htonT<unsigned int>(unsigned int):
    movl    %edi, %eax
    bswap   %eax
    ret
std::enable_if<(sizeof (unsigned long))==(8), unsigned long>::type htonT<unsigned long>(unsigned long):
    movq    %rdi, %rax
    bswap   %rax
    ret

clang++-3.5.1 -std=c++14 -O3

std::enable_if<(sizeof (unsigned char))==(1), unsigned char>::type htonT<unsigned char>(unsigned char): # @std::enable_if<(sizeof (unsigned char))==(1), unsigned char>::type htonT<unsigned char>(unsigned char)
    movl    %edi, %eax
    retq

std::enable_if<(sizeof (unsigned short))==(2), unsigned short>::type htonT<unsigned short>(unsigned short): # @std::enable_if<(sizeof (unsigned short))==(2), unsigned short>::type htonT<unsigned short>(unsigned short)
    rolw    $8, %di
    movzwl  %di, %eax
    retq

std::enable_if<(sizeof (unsigned int))==(4), unsigned int>::type htonT<unsigned int>(unsigned int): # @std::enable_if<(sizeof (unsigned int))==(4), unsigned int>::type htonT<unsigned int>(unsigned int)
    bswapl  %edi
    movl    %edi, %eax
    retq

std::enable_if<(sizeof (unsigned long))==(8), unsigned long>::type htonT<unsigned long>(unsigned long): # @std::enable_if<(sizeof (unsigned long))==(8), unsigned long>::type htonT<unsigned long>(unsigned long)
    bswapq  %rdi
    movq    %rdi, %rax
    retq

Note: my original answer was not C++11-constexpr compliant.

This response is CC0 1.0 Universal Public Domain.

Answered by oHo

Solution #5

Some BSD systems have betoh64, which does what you need.

Answered by Francis

Post is based on https://stackoverflow.com/questions/809902/64-bit-ntohl-in-c