unsorted C snippets for small/fast static apps

technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#46 Post by technosaurus »

I recently removed most of the __builtin_*(...) wrappers because there is no standardized way to check for them (yet another thing to suggest to the C standards board ... Clang's __has_builtin() would be a good standard). I do plan on putting them back in, but I wanted to have a fallback for unsupported browsers as well as older versions of compilers like gcc-4.2.1.
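A minimal sketch of that fallback idea, assuming only that __has_builtin is either provided by the compiler (Clang, newer gcc) or can be stubbed out to 0. The names popcount32 and popcount32_fallback are made up for illustration and are not from the original code:

```c
#include <stdint.h>

/* Compilers without __has_builtin (e.g. old gcc) answer "no" to every query. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* Portable bit-twiddling fallback (hypothetical helper name). */
static inline unsigned popcount32_fallback(uint32_t x)
{
    x = x - ((x >> 1) & 0x55555555u);
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;
    return (x * 0x01010101u) >> 24;
}

/* Uses the builtin when the compiler says it has one, else the fallback. */
static inline unsigned popcount32(uint32_t x)
{
#if __has_builtin(__builtin_popcount)
    return (unsigned)__builtin_popcount(x);
#else
    return popcount32_fallback(x);
#endif
}
```

On a compiler that has the builtin but not __has_builtin, this simply takes the slower path, which is the safe direction to fail in.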

In case anyone else wants to do something similar, this is how to grok 90% of them.
1. Create a wrapper around the __builtin_*(...)

Code: Select all

v4hi pmulhrw(v4hi a, v4hi b){return __builtin_ia32_pmulhrw(a,b);}
2. Compile it with -S to get the assembly output (or use gcc.godbolt.org)

Code: Select all

pmulhrw:
        movdq2q %xmm1, %mm0
        movdq2q %xmm0, %mm1
        pmulhrw %mm0, %mm1
        movq2dq %mm1, %xmm0
        ret
3. Grok the assembly for the appropriate line(s) of code into inline asm (it helps to know the platform's calling convention, so you can tell which lines are just there to move the input parameters and return values)
For this case it is really just:

Code: Select all

         pmulhrw %mm0, %mm1
Which becomes this inline asm:

Code: Select all

v4hi __not_builtin_pmulhrw(v4hi a, v4hi b){__asm("pmulhrw %1, %0":"+y"(a):"y"(b));return a;}
Note that the registers get replaced with %0 and %1, which are the operand numbers in order, and that instead of using "r" for a general-purpose register, I used "y" for an MMX register according to https://gcc.gnu.org/onlinedocs/gcc-5.3. ... aints.html
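The same operand numbering can be tried on any x86 box with an ordinary integer instruction. This is only a sketch: add_asm is a made-up name, it uses the "r" constraint rather than "y", and it falls back to plain C on other targets so it still builds there:

```c
/* Adds b to a with a single instruction on x86; plain C elsewhere. */
static inline int add_asm(int a, int b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* %0 is the first operand listed (a, read and written: "+r"),
       %1 is the second (b, read only: "r").
       "cc" tells the compiler the flags register is clobbered. */
    __asm__("addl %1, %0" : "+r"(a) : "r"(b) : "cc");
    return a;
#else
    return a + b; /* fallback so the sketch builds on other targets */
#endif
}
```

In AT&T syntax the destination comes last, so "addl %1, %0" means a += b, just like "pmulhrw %1, %0" above writes its result into the first operand.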
Check out my [url=https://github.com/technosaurus]github repositories[/url]. I may eventually get around to updating my [url=http://bashismal.blogspot.com]blogspot[/url].

technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#47 Post by technosaurus »

I have had a few projects where I needed to share code between C and javascript. Rather than having to update 2 separate files or run a build process to generate both, I came up with some hacks to allow the code to be valid in both:

http://stackoverflow.com/a/35012334/1162141

Code: Select all

/* C comment ends with the '/' on next line but js comment is open  *\
/ //BEGIN C Block
#define function int
/* This ends the original js comment, but we add an opening '/*' for C  */

/*Most compilers can build K&R style C with parameters like this:*/
function volume(x,y,z)/**\
/int x,y,z;/**/
{
  return x*y*z;
}

/**\
/
#undef function
#define var const char**
#define new (const char*[])
#define Array(...)  {__VA_ARGS__}
/**/

var cars = new Array("Ford", "Chevy", "Dodge");

/* Or a more readable version *\
/// BEGIN C Block
#undef var
#undef new
/* END C Block */
You can do something similar for Java by using the "??/" trigraph for the '\'
and setting up some macros and structs with function pointers as they did here.

technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#48 Post by technosaurus »

musl libc uses some funky #include + macro hackery to map enums to strings for strerror. Although it is pretty clever, it's not quite obvious what it is doing since the data is in a separate file, so here is the simplified version:

Code: Select all

#define TAG_MAP { \
	_MAP(TAG_BODY,"body"), \
	_MAP(TAG_HEAD,"head"), \
	_MAP(TAG_HTML,"html"), \
	_MAP(TAG_UNKNOWN,"unknown"), \
}

#define _MAP(x,y) x
enum tags TAG_MAP;
#undef _MAP
#define _MAP(x,y) y
const char *tagstrings[] = TAG_MAP;
#undef _MAP
//usage: printf("%s\n",tagstrings[TAG_HTML]);
This could be extended for any amount of tabular data
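As a rough sketch of the "more tabular data" idea, here is a three-column table. Note that this uses a common variant of the trick in which the per-row macro is passed in as a parameter instead of redefining _MAP each time; the color table and all names in it are invented for the example:

```c
#include <stddef.h>

/* One row per color: enum name, display string, RGB value (made-up data). */
#define COLOR_TABLE(X) \
    X(COLOR_RED,   "red",   0xFF0000) \
    X(COLOR_GREEN, "green", 0x00FF00) \
    X(COLOR_BLUE,  "blue",  0x0000FF)

/* Each "view" of the table picks one column out of every row. */
#define AS_ENUM(id, name, rgb) id,
#define AS_NAME(id, name, rgb) name,
#define AS_RGB(id, name, rgb)  rgb,

enum color { COLOR_TABLE(AS_ENUM) COLOR_COUNT };
static const char *const color_names[] = { COLOR_TABLE(AS_NAME) };
static const unsigned    color_rgb[]   = { COLOR_TABLE(AS_RGB) };

/* Look up a name by enum value; returns NULL when out of range. */
const char *color_name(int c)
{
    return (c >= 0 && c < COLOR_COUNT) ? color_names[c] : NULL;
}

/* Look up the RGB value for a valid enum value. */
unsigned color_value(enum color c)
{
    return color_rgb[c];
}
```

Adding a row to COLOR_TABLE updates the enum and both arrays at once, so they cannot drift out of sync.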

technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#49 Post by technosaurus »

... and some more macro hackery

this allows you to reduce multiple 3-line #ifdefs to a single line or even inline them in your functions

Code: Select all

#define PASTE_(x,y) x##y
#define PASTE(x,y) PASTE_(x,y)
#define ENABLED(...) __VA_ARGS__
#define DISABLED(...)
#define NOT_DISABLED ENABLED
#define NOT_ENABLED DISABLED
#define IF_ENABLED(x,...) x(__VA_ARGS__)
#define IF_NOT_ENABLED(x,...) PASTE(NOT_,x)(__VA_ARGS__)
example

Code: Select all

#define PNG_SUPPORT ENABLED
#define JPG_SUPPORT DISABLED
void init(void){
  IF_ENABLED(PNG_SUPPORT, init_png();)
  IF_ENABLED(JPG_SUPPORT, init_jpg();)
  return;
}

int main(void){
	puts("supported types:\n"
		IF_ENABLED(PNG_SUPPORT,     "\tpng supported\n")
		IF_ENABLED(JPG_SUPPORT,     "\tjpeg supported\n")
		IF_NOT_ENABLED(JPG_SUPPORT, IF_NOT_ENABLED(PNG_SUPPORT, "\tnone supported\n"))
	);
}
vs. the traditional way

Code: Select all

#define PNG_SUPPORT
#define JPG_SUPPORT
void init(void){
#ifdef PNG_SUPPORT
  init_png();
#endif
#ifdef JPG_SUPPORT
  init_jpg();
#endif
  return;
}

int main(void){
  puts("supported types:\n"
#ifdef PNG_SUPPORT
    "\tpng supported\n"
#endif
#ifdef JPG_SUPPORT
    "\tjpeg supported\n"
#endif
#if !defined(JPG_SUPPORT) && !defined(PNG_SUPPORT)
    "\tnone supported\n"
#endif	
  );
}
It works for multiple commands as well:

Code: Select all

IF_ENABLED(PNG_SUPPORT,
int *getRGBfromPNG(void *buf, void *return_data){
  //etc...
})

Ibidem
Posts: 549
Joined: Wed 26 May 2010, 03:31
Location: State of Jefferson

#50 Post by Ibidem »

Well, I've been poking at bqc.
So far, I've implemented _socketcall() (looking at musl src/internal/syscall.h to figure out how) and almost all the socketcall wrappers.
I've also discovered a small (*cough*) problem.
With GCC 5.3.x (stock for Alpine Linux) on i386 and the standard flags (-nostdlib -nostartfiles), apparently the argc/argv initialization doesn't work; for example, if I run ./get google.com /index.html it thinks argc is 0.
(I hacked a debug line in to check that.)
Hardcoding the host/url seems to result in a 'working' binary.

Attaching a patch (git format-patch) to fix what I can figure out.
Attachments
socketcall.patch.gz
gzipped _socketcall() implementation, along with socketcall-based networking functions
(2.08 KiB) Downloaded 261 times

technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#51 Post by technosaurus »

Ibidem wrote:Well, I've been poking at bqc.
So far, I've implemented _socketcall() (looking at musl src/internal/syscall.h to figure out how) and almost all the socketcall wrappers.
I've also discovered a small (*cough*) problem.
With GCC 5.3.x (stock for Alpine Linux) on i386 and the standard flags (-nostdlib -nostartfiles), apparently the argc/argv initialization doesn't work; for example, if I run ./get google.com /index.html it thinks argc is 0.
(I hacked a debug line in to check that.)
Hardcoding the host/url seems to result in a 'working' binary.

Attaching a patch (git format-patch) to fix what I can figure out.
Thanks for the socketcall patch - it has been high on the todo list for a while. Is the patch public-domain/any-license like the rest of the code?

Feel free to submit issues and pull requests on github if you are comfortable with it... it's easier for me to keep track of.

As for the argc issue, I discovered (and documented?) that it needs optimization turned on or gcc will screw up the stack pointer in _start() before it can be used for argc/argv... I tried an alternative method using a dummy struct parameter to _start(), but it suffered similar problems. I also tried to declare an array of long on the stack and set argc using its address -1 and argv -2, but optimizations screwed with that too. That part of the code is _really_ hard to make fully "portable" because in their infinite wisdom, "they" decided to always pass the _start() parameters on the stack instead of using the system's default calling convention, so you can't just do void _start(long argc, char **argv){...}, and neither gcc nor clang has a builtin way to get the stack pointer AFAIK (I looked).

technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#52 Post by technosaurus »

here is an update on my anti-ifdef macros to enable boolean logic

Code: Select all

#define PASTE_(x,y) x##y
#define PASTE(x,y) PASTE_(x,y)
#define PASTE3_(x,y,z) x##y##z
#define PASTE3(x,y,z) PASTE3_(x,y,z)
#define Y(...) __VA_ARGS__
#define N(...)
#define IF(x) x //alternate method similar to IF_NOT()

#define NOT_N Y
#define NOT_Y N
#define IF_NOT(x) PASTE(NOT_,x)
#define NOT(x) PASTE(NOT_,x)

#define N_OR_N N
#define N_OR_Y Y
#define Y_OR_N Y
#define Y_OR_Y Y
#define OR(x,y) PASTE3(x,_OR_,y)

#define N_AND_N N
#define N_AND_Y N
#define Y_AND_N N
#define Y_AND_Y Y
#define AND(x,y) PASTE3(x,_AND_,y)

#define N_XOR_N N
#define N_XOR_Y Y
#define Y_XOR_N Y
#define Y_XOR_Y N
#define XOR(x,y) PASTE3(x,_XOR_,y)

#define N_NOR_N Y
#define N_NOR_Y N
#define Y_NOR_N N
#define Y_NOR_Y N
#define NOR(x,y) PASTE3(x,_NOR_,y)

#define N_NAND_N Y
#define N_NAND_Y Y
#define Y_NAND_N Y
#define Y_NAND_Y N
#define NAND(x,y) PASTE3(x,_NAND_,y)

#define N_XNOR_N Y
#define N_XNOR_Y N
#define Y_XNOR_N N
#define Y_XNOR_Y Y
#define XNOR(x,y) PASTE3(x,_XNOR_,y)

#define IF2(x,y,z) PASTE3(x,y,z)

//HACK: #error requires its own line and _Pragma support is sketchy
#define ERROR(x) char PASTE(PASTE(ERROR_on_line__,__LINE__),PASTE(__XXX_,x))[-1];
NOTES:
This version uses Y and N instead of ENABLED and DISABLED, so it may have more naming conflicts
Usage:
//in your config.h
#define FOO Y
#define BAR N
#define BAZ Y

In your code:
AND(FOO,BAR)(/*do stuff if both FOO and BAR are enabled*/)
or
IF2(FOO,_AND_,BAR)( /*do stuff if both FOO and BAR are enabled*/ )
... note the parentheses instead of curly braces in both cases.
They can also be combined:

Code: Select all

OR(BAZ,AND(FOO,BAR))(
  /*do stuff if both FOO and BAR are enabled or BAZ is enabled*/ 
)
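For anyone who wants to poke at it, here is a small self-contained sketch. It repeats a trimmed copy of the macros so it compiles on its own; the FOO/BAR/BAZ settings and the compiled_paths() function are hypothetical, chosen only to show which guarded statements survive preprocessing:

```c
/* Trimmed copy of the macros from the post, so this block stands alone. */
#define PASTE_(x,y) x##y
#define PASTE(x,y) PASTE_(x,y)
#define PASTE3_(x,y,z) x##y##z
#define PASTE3(x,y,z) PASTE3_(x,y,z)
#define Y(...) __VA_ARGS__
#define N(...)

#define NOT_N Y
#define NOT_Y N
#define NOT(x) PASTE(NOT_,x)

#define N_OR_N N
#define N_OR_Y Y
#define Y_OR_N Y
#define Y_OR_Y Y
#define OR(x,y) PASTE3(x,_OR_,y)

#define N_AND_N N
#define N_AND_Y N
#define Y_AND_N N
#define Y_AND_Y Y
#define AND(x,y) PASTE3(x,_AND_,y)

#define N_XOR_N N
#define N_XOR_Y Y
#define Y_XOR_N Y
#define Y_XOR_Y N
#define XOR(x,y) PASTE3(x,_XOR_,y)

/* Hypothetical config.h contents */
#define FOO Y
#define BAR N
#define BAZ Y

/* Returns a bitmask of which guarded statements were compiled in. */
int compiled_paths(void)
{
    int mask = 0;
    AND(FOO,BAZ)(         mask |= 1; )  /* Y and Y: kept */
    OR(BAR,AND(FOO,BAZ))( mask |= 2; )  /* N or (Y and Y): kept */
    NOT(BAR)(             mask |= 4; )  /* not N: kept */
    XOR(FOO,BAZ)(         mask |= 8; )  /* Y xor Y: dropped */
    return mask;
}
```

Running the file through the preprocessor only (gcc -E) shows that the XOR line disappears completely, so nothing for it reaches the compiler.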

Ibidem
Posts: 549
Joined: Wed 26 May 2010, 03:31
Location: State of Jefferson

#53 Post by Ibidem »

technosaurus wrote:
Ibidem wrote:Well, I've been poking at bqc.
So far, I've implemented _socketcall() (looking at musl src/internal/syscall.h to figure out how) and almost all the socketcall wrappers.
I've also discovered a small (*cough*) problem.
With GCC 5.3.x (stock for Alpine Linux) on i386 and the standard flags (-nostdlib -nostartfiles), apparently the argc/argv initialization doesn't work; for example, if I run ./get google.com /index.html it thinks argc is 0.
(I hacked a debug line in to check that.)
Hardcoding the host/url seems to result in a 'working' binary.

Attaching a patch (git format-patch) to fix what I can figure out.
Thanks for the socketcall patch - it has been high on the todo list for a while. Is the patch public-domain/any-license like the rest of the code?

Feel free to submit issues and pull requests on github if you are comfortable with it... it's easier for me to keep track of.
PD/any-license, yes.

Unfortunately, github.com doesn't play nicely with Links, which I'm currently using pretty much exclusively.
technosaurus wrote:As for the argc issue, I discovered (and documented?) that it needs optimization turned on or gcc will screw up the stack pointer in _start() before it can be used for argc/argv...
Yes, but none of -Os -O[0-3] work in this case, with or without frame-pointers.
I do realize that this is not something that *can* be done portably, but I can't figure out any way.
technosaurus wrote:I tried an alternative method using a dummy struct parameter to _start(), but it suffered similar problems. I also tried to declare an array of long on the stack and set argc using its address -1 and argv -2, but optimizations screwed with that too. That part of the code is _really_ hard to make fully "portable" because in their infinite wisdom, "they" decided to always pass the _start() parameters on the stack instead of using the system's default calling convention, so you can't just do void _start(long argc, char **argv){...}, and neither gcc nor clang has a builtin way to get the stack pointer AFAIK (I looked).
By the way: can you add 'ushort' to the typedefs?

technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#54 Post by technosaurus »

Here is a really stupid trick that I figured out just to see how much I could abuse the preprocessor.

Make a file with the contents:

Code: Select all

#if __COUNTER__ < MAX_COUNT
#include __FILE__
__FILE__
#endif
Then name the file whatever text you would like to repeat (it could be one letter or a long string). You can make symlinks/hardlinks to this file for different strings.

Then in your C file you can do something like this:

Code: Select all

static const char s[] = 
#define MAX_COUNT 198 //don't exceed max inclusion depth
#include "0"
#undef MAX_COUNT
#define MAX_COUNT 256
#include "0"
#undef MAX_COUNT
;
Now you have a string with 256 '0's (if you named the file "fizzbuzz" and included it instead, the string would be 8x longer).

A more appropriate use of this would be to unroll a loop MAX_COUNT times, but what fun is that - Bonus points to anyone who can implement a compile time fizzbuzz using this method.

technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#55 Post by technosaurus »

long division in binary

Code: Select all

struct div_t{
  int quot;
  int rem;
};

struct div_t div(int dividend, int divisor){
    _Bool dividend_is_negative = (dividend < 0),
        divisor_is_negative = (divisor < 0),
        result_is_negative = divisor_is_negative ^ dividend_is_negative;
    unsigned quotient =0, shift, shifted;
    //if (divisor_is_negative) divisor = -divisor;
    divisor ^= -divisor_is_negative;
    divisor += divisor_is_negative;
    //if (dividend_is_negative) dividend = -dividend;
    dividend ^= -dividend_is_negative;
    dividend += dividend_is_negative;
    shifted = divisor;
    //shift divisor so its MSB is same as dividend's - minimize loops
    //if no builtin clz, then shift divisor until its >= dividend
    //such as: while (shifted<dividend) shifted<<=1;
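    //note: __builtin_clz(0) is undefined, so neither operand may be 0 here
    int clz_diff = __builtin_clz(divisor)-__builtin_clz(dividend);
    //clamp shift to 0 to avoid undefined behavior; the test has to be done
    //on a signed value because an unsigned difference is never negative
    shift = clz_diff & -(clz_diff > 0);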
    shifted<<=shift;
    do{
        unsigned tmp;
        quotient <<=1;
        tmp = (unsigned long) (shifted <= dividend);
        quotient |= tmp;
        //if (tmp) dividend -= shifted;
		dividend -= shifted & -tmp;
        shifted >>=1;
    }while (shifted >= divisor);
    //if (result_is_negative) quotient=-quotient;
    //if (dividend_is_negative) remainder=-remainder; (as in C, the remainder takes the dividend's sign)
    quotient ^= -result_is_negative;
    dividend ^= -dividend_is_negative;
    quotient += result_is_negative;
    dividend += dividend_is_negative;
    return (struct div_t){quotient, dividend};
}

Since integer division is so slow, I wanted to see if bitops would/could be faster... it's pretty close, but not useful for platforms with a native divide instruction ... maybe for the jcore j1 from [url]http://j-core.org[/url]
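For experimenting on a target without __builtin_clz, the same shift-and-subtract idea can be sketched for unsigned operands only, using the "shift the divisor up until it no longer fits" fallback mentioned in the code comments. The names udivmod_t and udivmod_shift are made up, and returning quotient 0 for a zero divisor is an arbitrary choice for this sketch, not something the original code does:

```c
typedef struct { unsigned quot, rem; } udivmod_t;

/* Unsigned shift-and-subtract division, no clz and no divide instruction. */
udivmod_t udivmod_shift(unsigned n, unsigned d)
{
    udivmod_t r = { 0, n };
    unsigned shifted = d;
    int steps = 1;

    if (d == 0)
        return r; /* assumption: report quot 0, rem n instead of trapping */

    /* Line the divisor up under the dividend. Comparing against n/2
       guarantees that the shift itself can never overflow. */
    while (shifted <= (n >> 1)) {
        shifted <<= 1;
        ++steps;
    }

    while (steps--) {
        unsigned fits = (shifted <= r.rem); /* 1 if this bit is set */
        r.quot = (r.quot << 1) | fits;
        r.rem -= shifted & -fits;           /* branchless conditional subtract */
        shifted >>= 1;
    }
    return r;
}
```

Signed division can be layered on top by stripping the signs first and fixing up the quotient and remainder afterwards, which is what the code above does inline.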

technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

sse2 string functions

#56 Post by technosaurus »

Code: Select all

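#include <emmintrin.h> //SSE2 intrinsics
#include <stddef.h>
#include <stdint.h>
typedef uint64_t u64; //presumably what the author's u64 typedef looks like

//NOTE: the unaligned 16-byte loads below can read past the terminating nul
size_t strlen_sse2(const char *s){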
  const __m128i *vp =((__m128i*)s)-4, all0 = (__m128i){0};
  __m128i v0,v1,v2,v3,v;
  do{
    vp+=4;
    v = v0 = _mm_cmpeq_epi8(_mm_loadu_si128(vp+0),all0);
    v|= v1 = _mm_cmpeq_epi8(_mm_loadu_si128(vp+1),all0);
    v|= v2 = _mm_cmpeq_epi8(_mm_loadu_si128(vp+2),all0);
    v|= v3 = _mm_cmpeq_epi8(_mm_loadu_si128(vp+3),all0);
  }while(!(_mm_movemask_epi8(v)));
  u64 m = (u64)_mm_movemask_epi8(v0) | ((u64)_mm_movemask_epi8(v1)<<16) | 
    ((u64)_mm_movemask_epi8(v2)<<32) | ((u64)_mm_movemask_epi8(v3)<<48);
  return (char*)vp - s + __builtin_ctzll(m);
}

int strcmp_sse2(const char *s0, const char *s1){
  const __m128i *lp =(__m128i*)s0, *rp=(__m128i*)s1,
          all0 = (__m128i){0}, all1 = _mm_set1_epi8(~0);
  __m128i l, r, tmp;
  unsigned m=1;
  size_t i = 0;
  do{
    l = _mm_loadu_si128 (lp+i);
    r = _mm_loadu_si128 (rp+i);
    m =_mm_movemask_epi8(_mm_cmpeq_epi8(l,all0)|_mm_xor_si128(_mm_cmpeq_epi8(l,r),all1));
    ++i;
  }while(!m);
  return ((union{__m128i v;char c[16];})(l-r)).c[__builtin_ctz(m)];
}

int strcasecmp_sse2(const char *s0, const char *s1){
  __m128i *l =(__m128i*)s0, *r=(__m128i*)s1,
          all0 = (__m128i){0}, all1 = (__m128i){-1,-1},
          allA = _mm_set1_epi8('A'-1), allZ = _mm_set1_epi8('Z'+1),
          all32 = _mm_set1_epi8(1<<5), lcl, lcr, tmp;
  unsigned m;
  size_t i = 0;
  do{
    lcl = _mm_loadu_si128 (l+i);
    lcr = _mm_loadu_si128 (r+i);
    tmp = _mm_cmpeq_epi8(lcl,all0);
    lcl |= (_mm_cmpgt_epi8(lcl,allA) & _mm_cmplt_epi8(lcl,allZ) & all32);
    lcr |= (_mm_cmpgt_epi8(lcr,allA) & _mm_cmplt_epi8(lcr,allZ) & all32);
    tmp |= (_mm_cmpeq_epi8(lcl,lcr) ^ all1);
    ++i;
  }while(!(m=_mm_movemask_epi8(tmp)));
  return ((union{__m128i v;char c[16];})(lcl-lcr)).c[__builtin_ctz(m)];
}

Moose On The Loose
Posts: 965
Joined: Thu 24 Feb 2011, 14:54

#57 Post by Moose On The Loose »

technosaurus wrote:long division in binary
... code that works just fine ...
On a lot of machines, Booth's method written as nested while loops is a bit faster. You subtract without making the test at all. If the remainder goes negative, you drop out of the subtracting loop and into a loop that adds until the remainder goes positive again. It works better on machines where you have to subtract to compare.
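The subtract-first, fix-up-later idea is what textbooks usually call non-restoring division. Whether that is exactly the variant meant here is an assumption on my part; the sketch below uses one loop that looks only at the sign of the running remainder, rather than the two nested loops described, and udiv_nonrestoring is a made-up name:

```c
#include <stdint.h>

/* Non-restoring division: no compare against the divisor, only a sign
   check on the running remainder. Returns the quotient and stores the
   remainder through rem_out. */
uint32_t udiv_nonrestoring(uint32_t n, uint32_t d, uint32_t *rem_out)
{
    int64_t rem = n; /* signed, and wide enough to hold d << 31 */
    uint32_t q = 0;

    if (d == 0) { /* assumption: quot 0, rem n instead of trapping */
        *rem_out = n;
        return 0;
    }

    for (int i = 31; i >= 0; --i) {
        int64_t step = (int64_t)d << i;
        if (rem >= 0) { rem -= step; q += (uint32_t)1 << i; }
        else          { rem += step; q -= (uint32_t)1 << i; }
    }

    /* A single correction at the end if we finished below zero. */
    if (rem < 0) { rem += d; q -= 1; }

    *rem_out = (uint32_t)rem;
    return q;
}
```

The invariant n == q*d + rem holds after every step, which is why a single correction at the end is enough.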

On an 8051-like processor, you can do a very different sort of longhand divide using the processor's multiply and a little cleverness to correctly guess the next "digit". The code only needs to do the outer loop a little more often than the number of bytes in the numbers. The subtracting away for one digit of the answer takes a loop or can be unrolled.

technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#58 Post by technosaurus »

This post is mainly so I can refer back to it, probably not of interest to anyone who isn't building a c library or implementing a high speed interpreted language.

In order to make my entry code (_start:) more portable and in preparation for adding setjmp/longjmp, I needed a way to get (and set) specific registers. It only took a few minutes to get it working in gcc. After a couple of days of beating clang with progressively larger and larger sticks, I have come up with these:

Code: Select all

#define getreg(reg) ({ \
     register void *tmp __asm__ (reg); \
     __asm__ __volatile__("":"+r"(tmp)); \
     tmp; \
})

#define setreg(reg,val) do{ \
     register void *tmp __asm__ (reg) = val; \
     __asm__ __volatile__("":"+r"(tmp)); \
} while(0)

#define save_reg(reg, loc) do{ \
     register intptr_t tmp __asm__ (reg); \
     __asm__ __volatile__("":"+r"(tmp)); \
     *(intptr_t*)loc = tmp; \
} while(0)

#define restore_reg(reg, loc) do{ \
     register void *tmp __asm__ (reg) = *(long**)loc; \
     __asm__ __volatile__("":"+r"(tmp)); \
} while(0)
Now I can easily get the stack pointer using:
void *get_sp(void){return getreg("sp");}

And hopefully I can use save_reg() and restore_reg() along with computed gotos to implement setjmp/longjmp (yes, I know compiler documentation warns about using computed gotos this way, but that's why I needed to save/restore registers)

This will allow me to use X-macros to implement the typically assembly-language-only parts of C for the 40+ architectures supported by some version of linux (mainline, uclinux, various forks...).

technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#59 Post by technosaurus »

(almost) minimal, tested aes implementation

Code: Select all

#include <stdint.h>
#if defined (_MSC_VER)
#define VEC(x) __declspec(intrin_type,align(16))
#else
#define VEC(x) __attribute__ ((__vector_size__ (16), __may_alias__))
#endif


#ifdef INITSBOX
static uint8_t sbox[256], inv_sbox[256];
#else
static const uint8_t sbox[256] = {
	0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5,  0x30, 0x01, 0x67, 0x2b, 0xfe, 0xd7, 0xab, 0x76,
	0xca, 0x82, 0xc9, 0x7d, 0xfa, 0x59, 0x47, 0xf0,  0xad, 0xd4, 0xa2, 0xaf, 0x9c, 0xa4, 0x72, 0xc0,
	0xb7, 0xfd, 0x93, 0x26, 0x36, 0x3f, 0xf7, 0xcc,  0x34, 0xa5, 0xe5, 0xf1, 0x71, 0xd8, 0x31, 0x15,
	0x04, 0xc7, 0x23, 0xc3, 0x18, 0x96, 0x05, 0x9a,  0x07, 0x12, 0x80, 0xe2, 0xeb, 0x27, 0xb2, 0x75,
	0x09, 0x83, 0x2c, 0x1a, 0x1b, 0x6e, 0x5a, 0xa0,  0x52, 0x3b, 0xd6, 0xb3, 0x29, 0xe3, 0x2f, 0x84,
	0x53, 0xd1, 0x00, 0xed, 0x20, 0xfc, 0xb1, 0x5b,  0x6a, 0xcb, 0xbe, 0x39, 0x4a, 0x4c, 0x58, 0xcf,
	0xd0, 0xef, 0xaa, 0xfb, 0x43, 0x4d, 0x33, 0x85,  0x45, 0xf9, 0x02, 0x7f, 0x50, 0x3c, 0x9f, 0xa8,
	0x51, 0xa3, 0x40, 0x8f, 0x92, 0x9d, 0x38, 0xf5,  0xbc, 0xb6, 0xda, 0x21, 0x10, 0xff, 0xf3, 0xd2,
	0xcd, 0x0c, 0x13, 0xec, 0x5f, 0x97, 0x44, 0x17,  0xc4, 0xa7, 0x7e, 0x3d, 0x64, 0x5d, 0x19, 0x73,
	0x60, 0x81, 0x4f, 0xdc, 0x22, 0x2a, 0x90, 0x88,  0x46, 0xee, 0xb8, 0x14, 0xde, 0x5e, 0x0b, 0xdb,
	0xe0, 0x32, 0x3a, 0x0a, 0x49, 0x06, 0x24, 0x5c,  0xc2, 0xd3, 0xac, 0x62, 0x91, 0x95, 0xe4, 0x79,
	0xe7, 0xc8, 0x37, 0x6d, 0x8d, 0xd5, 0x4e, 0xa9,  0x6c, 0x56, 0xf4, 0xea, 0x65, 0x7a, 0xae, 0x08,
	0xba, 0x78, 0x25, 0x2e, 0x1c, 0xa6, 0xb4, 0xc6,  0xe8, 0xdd, 0x74, 0x1f, 0x4b, 0xbd, 0x8b, 0x8a,
	0x70, 0x3e, 0xb5, 0x66, 0x48, 0x03, 0xf6, 0x0e,  0x61, 0x35, 0x57, 0xb9, 0x86, 0xc1, 0x1d, 0x9e,
	0xe1, 0xf8, 0x98, 0x11, 0x69, 0xd9, 0x8e, 0x94,  0x9b, 0x1e, 0x87, 0xe9, 0xce, 0x55, 0x28, 0xdf,
	0x8c, 0xa1, 0x89, 0x0d, 0xbf, 0xe6, 0x42, 0x68,  0x41, 0x99, 0x2d, 0x0f, 0xb0, 0x54, 0xbb, 0x16
},inv_sbox[256] = {
/*       0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,  0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f*/
/*0x00*/ 0x52, 0x09, 0x6a, 0xd5, 0x30, 0x36, 0xa5, 0x38,  0xbf, 0x40, 0xa3, 0x9e, 0x81, 0xf3, 0xd7, 0xfb,
/*0x01*/ 0x7c, 0xe3, 0x39, 0x82, 0x9b, 0x2f, 0xff, 0x87,  0x34, 0x8e, 0x43, 0x44, 0xc4, 0xde, 0xe9, 0xcb,
/*0x02*/ 0x54, 0x7b, 0x94, 0x32, 0xa6, 0xc2, 0x23, 0x3d,  0xee, 0x4c, 0x95, 0x0b, 0x42, 0xfa, 0xc3, 0x4e,
/*0x03*/ 0x08, 0x2e, 0xa1, 0x66, 0x28, 0xd9, 0x24, 0xb2,  0x76, 0x5b, 0xa2, 0x49, 0x6d, 0x8b, 0xd1, 0x25,
/*0x04*/ 0x72, 0xf8, 0xf6, 0x64, 0x86, 0x68, 0x98, 0x16,  0xd4, 0xa4, 0x5c, 0xcc, 0x5d, 0x65, 0xb6, 0x92,
/*0x05*/ 0x6c, 0x70, 0x48, 0x50, 0xfd, 0xed, 0xb9, 0xda,  0x5e, 0x15, 0x46, 0x57, 0xa7, 0x8d, 0x9d, 0x84,
/*0x06*/ 0x90, 0xd8, 0xab, 0x00, 0x8c, 0xbc, 0xd3, 0x0a,  0xf7, 0xe4, 0x58, 0x05, 0xb8, 0xb3, 0x45, 0x06,
/*0x07*/ 0xd0, 0x2c, 0x1e, 0x8f, 0xca, 0x3f, 0x0f, 0x02,  0xc1, 0xaf, 0xbd, 0x03, 0x01, 0x13, 0x8a, 0x6b,
/*0x08*/ 0x3a, 0x91, 0x11, 0x41, 0x4f, 0x67, 0xdc, 0xea,  0x97, 0xf2, 0xcf, 0xce, 0xf0, 0xb4, 0xe6, 0x73,
/*0x09*/ 0x96, 0xac, 0x74, 0x22, 0xe7, 0xad, 0x35, 0x85,  0xe2, 0xf9, 0x37, 0xe8, 0x1c, 0x75, 0xdf, 0x6e,
/*0x0a*/ 0x47, 0xf1, 0x1a, 0x71, 0x1d, 0x29, 0xc5, 0x89,  0x6f, 0xb7, 0x62, 0x0e, 0xaa, 0x18, 0xbe, 0x1b,
/*0x0b*/ 0xfc, 0x56, 0x3e, 0x4b, 0xc6, 0xd2, 0x79, 0x20,  0x9a, 0xdb, 0xc0, 0xfe, 0x78, 0xcd, 0x5a, 0xf4,
/*0x0c*/ 0x1f, 0xdd, 0xa8, 0x33, 0x88, 0x07, 0xc7, 0x31,  0xb1, 0x12, 0x10, 0x59, 0x27, 0x80, 0xec, 0x5f,
/*0x0d*/ 0x60, 0x51, 0x7f, 0xa9, 0x19, 0xb5, 0x4a, 0x0d,  0x2d, 0xe5, 0x7a, 0x9f, 0x93, 0xc9, 0x9c, 0xef,
/*0x0e*/ 0xa0, 0xe0, 0x3b, 0x4d, 0xae, 0x2a, 0xf5, 0xb0,  0xc8, 0xeb, 0xbb, 0x3c, 0x83, 0x53, 0x99, 0x61,
/*0x0f*/ 0x17, 0x2b, 0x04, 0x7e, 0xba, 0x77, 0xd6, 0x26,  0xe1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0c, 0x7d
};
#endif
static const uint8_t sh [16] = {0,5,10,15,4,9,14,3,8,13,2,7,12,1,6,11}
,ish[16] = { 0,13,10,7, 4,1,14,11, 8,5,2,15, 12,9,6,3 }
,rcon [] = {0x01, 0x02, 0x04, 0x08,0x10, 0x20, 0x40, 0x80,0x1b, 0x36, 0x6c, 0xd8};


void aes128(void *data, const void *skey){
	union aes128  { uint8_t c[16]; uint8_t rc[4][4]; uint32_t i[4]; uint64_t VEC(16) m; }
		*state = (union aes128*)data, //reuse input memory since we write to it
		key = *(union aes128*)skey,   //but locally copy key (it gets reused)
		tmp; //need a tmp copy of state to avoid code complexity of ShiftRows
	for(unsigned long i=0;;i++){
		//combine SubBytes (sbox[]), AddRoundKey (^) and ShiftRows (sh[j])
		for(unsigned long j = 0; j < 16; ++j)
			tmp.c[j] = sbox[state->c[sh[j]] ^ key.c[sh[j]]];
		//ComputeRoundKey
		for (unsigned long j=1; j<=4; ++j) key.rc[0][j-1] ^= sbox[ key.rc[3][(j&3)] ];
		key.c[0] ^= rcon[i];
		//for (j=4;j<16;j++) key.c[j] ^= key.c[j-4]; //for slow uint32_t
		for (unsigned long j=1;j<4;j++) key.i[j] ^= key.i[j-1];
		if ( i == 9 ) break;
		//mix columns
		for(unsigned long j = 0; j < 4; ++j){
			uint8_t e = 0, s[4];
			for (unsigned long k=0;k<4;++k) e^=tmp.rc[j][k];
			for (unsigned long k=0;k<4;++k) s[k]=tmp.rc[j][((k+1)&3)];
			for (unsigned long k=0;k<4;++k){
				uint8_t t = tmp.rc[j][k]^s[k];
				state->rc[j][k] = tmp.rc[j][k]^e^((t<<1)^(0x1b &-(t>0x7f)));
			}
		}
	}
	//add final round key to tmp for output
	//for(i=0;i<4;i++) state->i[i]=tmp.i[i]^key.i[i];  //fast 32 bit
	//for(i=0;i<16;i++) state->c[i]=tmp.c[i]^key.c[i]; //for 8 bit only
	state->m = tmp.m ^ key.m; //smallest and fastest with simd
}


#define xtime(x) (((x) & 0x80) ? (((x) << 1) ^ 0x1b) : ((x)<<1))
void inv_aes(void *data, const void *skey){
	uint8_t state[16], key[11][16], *in = data;
	for(unsigned long i=0; i < 16; i++)
		key[0][i] = ((uint8_t *)skey)[i];
	for(unsigned long i = 1; i <= 10; i++) { 	/*Generate Round Keys*/
		for (unsigned long j=1; j<=4; ++j)
			key[i][j-1] = key[i-1][j-1] ^ sbox[ key[i-1][12+(j&3)] ];
		key[i][0] ^= rcon[i-1];
		for (unsigned long j=4;j<16;j++)
			key[i][j] = key[i-1][j] ^ key[i][j-4];
	}
	for(unsigned long i=0; i < 16; i++) //addRoundKey(10)
		in[i] ^= key[10][i];
	for(unsigned long i = 0; i < 10; i++){ //do rounds
		if (i) for(unsigned long j = 0; j < 16; j+=4){ //inv_mixColumns();
			uint8_t e=0;
			for(unsigned long k=0;k<4;++k)
				e ^= state[j+k];
			for (unsigned long k=0;k<4;++k)
				in[j+k] = e ^ state[j+k] ^ xtime(state[j+k] ^ state[j+((k+1)&3)])
					^ xtime(xtime(xtime(e) ^ state[j+k] ^ state[j+((k+2)&3)]) );
		}
		for(unsigned long j = 0; j < 16; j++) //inv_shiftRows+inv_subBytes+addRoundKey(9-i)
			state[j] = inv_sbox[ in[ish[j]] ] ^ key[9-i][j];
	}
	//for(unsigned long i=0;i<16;++i)in[i]=state[i];
	*(uint64_t VEC(16) *)in=*(uint64_t VEC(16) *)state;
}
#undef xtime


#ifdef INITSBOX
#define ROTL8(x,shift) ((uint8_t) ((x) << (shift)) | ((x) >> (8 - (shift))))
void init_aes(void) {
	uint8_t p = 1, q = 1;
		do {
		p = p ^ (p << 1) ^ (p & 0x80 ? 0x1B : 0);
		q ^= q << 1;
		q ^= q << 2;
		q ^= q << 4;
		q ^= q & 0x80 ? 0x09 : 0;
		uint8_t xformed = q ^ ROTL8(q, 1) ^ ROTL8(q, 2) ^ ROTL8(q, 3) ^ ROTL8(q, 4);
		sbox[p] = xformed ^ 0x63;
		inv_sbox[xformed ^ 0x63] = p;
	} while (p != 1);
	sbox[0] = 0x63;
	inv_sbox[0x63]=0;
}
#undef ROTL8
#else
#define init_aes(...)
#endif

#ifdef TESTAES
#include <stdio.h>
int main(){
	uint8_t key[16] = { 0x2b,0x7e,0x15,0x16,0x28,0xae,0xd2,0xa6, 0xab,0xf7,0x15,0x88,0x09,0xcf,0x4f,0x3c };
	uint8_t in[16] = { 0x6b,0xc1,0xbe,0xe2,0x2e,0x40,0x9f,0x96, 0xe9,0x3d,0x7e,0x11,0x73,0x93,0x17,0x2a };
	init_aes();
	aes128(in,key);
	size_t i;
	for (i=0;i<16;++i) printf("%02x",0xFF&(unsigned)in[i]);
	puts("");
	inv_aes(in,key);
	for (i=0;i<16;++i) printf("%02x",0xFF&(unsigned)in[i]);
	puts("");
	aes128(in,key);
	for (i=0;i<16;++i) printf("%02x",0xFF&(unsigned)in[i]);
	puts("");
/* Output should be:
3ad77bb40d7a3660a89ecaf32466ef97
6bc1bee22e409f96e93d7e117393172a
3ad77bb40d7a3660a89ecaf32466ef97
*/
}
#endif
I think I can reduce this a bit further, but it's at a good stopping point.

minimal (as yet untested) crc32

Code: Select all

uint32_t crc32( uint32_t crc, const uint8_t *ptr, size_t len){
    const uint32_t lut[] = {
        0x00000000, 0x1db71064, 0x3b6e20c8, 0x26d930ac,
        0x76dc4190, 0x6b6b51f4, 0x4db26158, 0x5005713c,
        0xedb88320, 0xf00f9344, 0xd6d6a3e8, 0xcb61b38c,
        0x9b64c2b0, 0x86d3d2d4, 0xa00ae278, 0xbdbdf21c
    };
    const uint8_t *end = ptr + len;
    for( ; ptr<end; ++ptr ) {
        crc = ( crc >> 4 ) ^ lut[ (crc & 0xf)^(*ptr & 0xf) ];
        crc = ( crc >> 4 ) ^ lut[ (crc & 0xf)^(*ptr >> 4) ];
    }
    return crc;
}
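Since the function above is marked untested, here is one way to check it, sketched under the assumption that it is meant to produce the usual zlib/PNG CRC-32: the caller starts from all ones and inverts the result at the end. The function body is copied under the made-up name crc32_nibble, and crc32_ieee is a hypothetical wrapper:

```c
#include <stddef.h>
#include <stdint.h>

/* Same nibble-at-a-time update as above, copied under a different name. */
uint32_t crc32_nibble(uint32_t crc, const uint8_t *ptr, size_t len)
{
    static const uint32_t lut[16] = {
        0x00000000, 0x1db71064, 0x3b6e20c8, 0x26d930ac,
        0x76dc4190, 0x6b6b51f4, 0x4db26158, 0x5005713c,
        0xedb88320, 0xf00f9344, 0xd6d6a3e8, 0xcb61b38c,
        0x9b64c2b0, 0x86d3d2d4, 0xa00ae278, 0xbdbdf21c
    };
    const uint8_t *end = ptr + len;
    for ( ; ptr < end; ++ptr) {
        crc = (crc >> 4) ^ lut[(crc & 0xf) ^ (*ptr & 0xf)]; /* low nibble */
        crc = (crc >> 4) ^ lut[(crc & 0xf) ^ (*ptr >> 4)];  /* high nibble */
    }
    return crc;
}

/* One-shot helper: initial value of all ones, inverted on the way out. */
uint32_t crc32_ieee(const void *buf, size_t len)
{
    return ~crc32_nibble(0xFFFFFFFFu, (const uint8_t *)buf, len);
}
```

The standard check value for this CRC is 0xCBF43926 for the nine ASCII bytes "123456789", so crc32_ieee("123456789", 9) should return that. To checksum data in several pieces, pass the un-inverted running value back in as crc and invert only once at the end.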

technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#60 Post by technosaurus »

Back when C11 came around, I thought _Generic expressions would bring some of the templating power of more bloated languages to C. But due to the legacy of the C preprocessor, _Generic selection is done after the preprocessing phase, so it can't effectively use macros internally. That makes it pretty damn useless; it could be replaced by a static inline trampoline function on any decent optimizing compiler. Fortunately gcc has had an implementation-specific alternative since before C11 was even drafted ... and it really isn't that implementation-specific: both clang and icc support it. It turns out you can do a better job at "Generic" with __typeof__, __builtin_types_compatible_p, and __builtin_choose_expr.

Here are some C macros I stubbed out to account for up to 8 different type combinations. If you need more, just follow the pattern to extend it out. (Note that it could be made more generic by making a specifically tailored NARGS-style wrapper.)
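A small usage sketch for the single-argument family, assuming gcc or clang. It repeats the G1, G1x2 and G1x3 definitions so the block stands alone; the abs_* helpers and the generic_abs and type_id macros are invented for the example:

```c
#define G1(val, typ, expr, ...) __builtin_choose_expr( \
	__builtin_types_compatible_p(__typeof__(val),typ), expr, __VA_ARGS__)
#define G1x2(v1,t1,e1, ...) G1(v1,t1,e1,G1(v1,__VA_ARGS__))
#define G1x3(v1,t1,e1, ...) G1(v1,t1,e1,G1x2(v1,__VA_ARGS__))

/* Hypothetical per-type helpers */
static int    abs_int(int x)       { return x < 0 ? -x : x; }
static long   abs_long(long x)     { return x < 0 ? -x : x; }
static double abs_double(double x) { return x < 0 ? -x : x; }

/* Pick a function by the static type of x; the last argument is the
   default, used here for anything that is neither int nor long. */
#define generic_abs(x) \
	G1x2(x, int, abs_int, long, abs_long, abs_double)(x)

/* 1 = int, 2 = long, 3 = double, 0 = anything else */
#define type_id(x) G1x3(x, int, 1, long, 2, double, 3, 0)
```

The last argument of each chain acts as the default case. Because the whole thing is built from ordinary macros, the type/expression pairs can themselves come from other macros, which is the part _Generic makes awkward.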

Code: Select all

#define G1(val, typ, expr, ...) __builtin_choose_expr( \
	__builtin_types_compatible_p(__typeof__(val),typ), expr, __VA_ARGS__)
#define G1x2(v1,t1,e1, ...) G1(v1,t1,e1,G1(v1,__VA_ARGS__))
#define G1x3(v1,t1,e1, ...) G1(v1,t1,e1,G1x2(v1,__VA_ARGS__))
#define G1x4(v1,t1,e1, ...) G1(v1,t1,e1,G1x3(v1,__VA_ARGS__))
#define G1x5(v1,t1,e1, ...) G1(v1,t1,e1,G1x4(v1,__VA_ARGS__))
#define G1x6(v1,t1,e1, ...) G1(v1,t1,e1,G1x5(v1,__VA_ARGS__))
#define G1x7(v1,t1,e1, ...) G1(v1,t1,e1,G1x6(v1,__VA_ARGS__))
#define G1x8(v1,t1,e1, ...) G1(v1,t1,e1,G1x7(v1,__VA_ARGS__))

/* no __builtin_constant_p() around the test: it is true for any constant, even 0, so e would always be chosen */
#define G2(v1, v2, t1, t2, e, ...) \
	__builtin_choose_expr( \
		__builtin_types_compatible_p(__typeof__(v1),t1)&& \
		__builtin_types_compatible_p(__typeof__(v2),t2) \
	, e, __VA_ARGS__)
#define G2x2(v1,v2,x1,y1,e,...) G2(v1,v2,x1,y1,e,G2(v1,v2,__VA_ARGS__))
#define G2x3(v1,v2,x1,y1,e,...) G2(v1,v2,x1,y1,e,G2x2(v1,v2,__VA_ARGS__))
#define G2x4(v1,v2,x1,y1,e,...) G2(v1,v2,x1,y1,e,G2x3(v1,v2,__VA_ARGS__))
#define G2x5(v1,v2,x1,y1,e,...) G2(v1,v2,x1,y1,e,G2x4(v1,v2,__VA_ARGS__))
#define G2x6(v1,v2,x1,y1,e,...) G2(v1,v2,x1,y1,e,G2x5(v1,v2,__VA_ARGS__))
#define G2x7(v1,v2,x1,y1,e,...) G2(v1,v2,x1,y1,e,G2x6(v1,v2,__VA_ARGS__))
#define G2x8(v1,v2,x1,y1,e,...) G2(v1,v2,x1,y1,e,G2x7(v1,v2,__VA_ARGS__))


/* G3: pick e when (v1,v2,v3) have types (t1,t2,t3), else fall through */
#define G3(v1,v2,v3,t1,t2,t3,e,...) \
	__builtin_choose_expr( \
		__builtin_types_compatible_p(__typeof__(v1),t1)&& \
		__builtin_types_compatible_p(__typeof__(v2),t2)&& \
		__builtin_types_compatible_p(__typeof__(v3),t3), \
		e, __VA_ARGS__ )
#define G3x2(v1,v2,v3,t1,t2,t3,e,...) \
	G3(v1,v2,v3,t1,t2,t3,e,G3(v1,v2,v3,__VA_ARGS__))
#define G3x3(v1,v2,v3,t1,t2,t3,e,...) \
	G3(v1,v2,v3,t1,t2,t3,e,G3x2(v1,v2,v3,__VA_ARGS__))
#define G3x4(v1,v2,v3,t1,t2,t3,e,...) \
	G3(v1,v2,v3,t1,t2,t3,e,G3x3(v1,v2,v3,__VA_ARGS__))
#define G3x5(v1,v2,v3,t1,t2,t3,e,...) \
	G3(v1,v2,v3,t1,t2,t3,e,G3x4(v1,v2,v3,__VA_ARGS__))
#define G3x6(v1,v2,v3,t1,t2,t3,e,...) \
	G3(v1,v2,v3,t1,t2,t3,e,G3x5(v1,v2,v3,__VA_ARGS__))
#define G3x7(v1,v2,v3,t1,t2,t3,e,...) \
	G3(v1,v2,v3,t1,t2,t3,e,G3x6(v1,v2,v3,__VA_ARGS__))
#define G3x8(v1,v2,v3,t1,t2,t3,e,...) \
	G3(v1,v2,v3,t1,t2,t3,e,G3x7(v1,v2,v3,__VA_ARGS__))

/* G4: pick e when (v1..v4) have types (t1..t4), else fall through */
#define G4(v1,v2,v3,v4,t1,t2,t3,t4,e,...) \
	__builtin_choose_expr( \
		__builtin_types_compatible_p(__typeof__(v1),t1)&& \
		__builtin_types_compatible_p(__typeof__(v2),t2)&& \
		__builtin_types_compatible_p(__typeof__(v3),t3)&& \
		__builtin_types_compatible_p(__typeof__(v4),t4), \
		e, __VA_ARGS__ )
#define G4x2(v1,v2,v3,v4,t1,t2,t3,t4,e,...) \
	G4(v1,v2,v3,v4,t1,t2,t3,t4,e,G4(v1,v2,v3,v4,__VA_ARGS__))
#define G4x3(v1,v2,v3,v4,t1,t2,t3,t4,e,...) \
	G4(v1,v2,v3,v4,t1,t2,t3,t4,e,G4x2(v1,v2,v3,v4,__VA_ARGS__))
#define G4x4(v1,v2,v3,v4,t1,t2,t3,t4,e,...) \
	G4(v1,v2,v3,v4,t1,t2,t3,t4,e,G4x3(v1,v2,v3,v4,__VA_ARGS__))
#define G4x5(v1,v2,v3,v4,t1,t2,t3,t4,e,...) \
	G4(v1,v2,v3,v4,t1,t2,t3,t4,e,G4x4(v1,v2,v3,v4,__VA_ARGS__))
#define G4x6(v1,v2,v3,v4,t1,t2,t3,t4,e,...) \
	G4(v1,v2,v3,v4,t1,t2,t3,t4,e,G4x5(v1,v2,v3,v4,__VA_ARGS__))
#define G4x7(v1,v2,v3,v4,t1,t2,t3,t4,e,...) \
	G4(v1,v2,v3,v4,t1,t2,t3,t4,e,G4x6(v1,v2,v3,v4,__VA_ARGS__))
#define G4x8(v1,v2,v3,v4,t1,t2,t3,t4,e,...) \
	G4(v1,v2,v3,v4,t1,t2,t3,t4,e,G4x7(v1,v2,v3,v4,__VA_ARGS__))
usage:

Code: Select all

/* needs <math.h> and <complex.h>; 5 type/function pairs plus the default */
#define GACOS(x) G1x5(x, \
    long double complex, cacosl, \
    double complex, cacos, \
    float complex, cacosf, \
    long double, acosl, \
    float, acosf, \
    acos )(x)
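
The NARGS-style wrapper mentioned above could look something like the sketch below, which picks G1, G1x2 or G1x3 from the argument count so the caller never has to count pairs. It assumes gcc or clang. GSEL, G1_PICK, G1_BAD_ARITY, FMT and fmt_demo are hypothetical names, and only the first three G1 helpers are repeated so the block compiles on its own.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Same shape as the G1 helpers above, repeated so this sketch stands alone. */
#define G1(val, typ, expr, ...) __builtin_choose_expr( \
	__builtin_types_compatible_p(__typeof__(val), typ), expr, __VA_ARGS__)
#define G1x2(v1, t1, e1, ...) G1(v1, t1, e1, G1(v1, __VA_ARGS__))
#define G1x3(v1, t1, e1, ...) G1(v1, t1, e1, G1x2(v1, __VA_ARGS__))

/* Hypothetical picker: the caller's arguments push the helper names to the
 * right, so the name that lands in the NAME slot depends on the argument
 * count. 4 args -> G1, 6 -> G1x2, 8 -> G1x3. Any other count selects the
 * undefined name G1_BAD_ARITY, which fails to compile. */
#define G1_PICK(_1, _2, _3, _4, _5, _6, _7, _8, NAME, ...) NAME
#define GSEL(...) G1_PICK(__VA_ARGS__, \
	G1x3, G1_BAD_ARITY, G1x2, G1_BAD_ARITY, \
	G1, G1_BAD_ARITY, G1_BAD_ARITY, G1_BAD_ARITY)(__VA_ARGS__)

/* Hypothetical use: choose a printf format from the type of x. */
#define FMT(x) GSEL(x, int, "%d", long, "%ld", double, "%f", "<?>")

void fmt_demo(void)
{
	char buf[32];

	snprintf(buf, sizeof buf, FMT(42L), 42L);
	assert(strcmp(buf, "42") == 0);
	assert(strcmp(FMT(1.5), "%f") == 0);
	assert(strcmp(FMT('x'), "%d") == 0);   /* character constants are int in C */
	assert(strcmp(FMT(1.5f), "<?>") == 0); /* float is not listed: fallback */
}
```

Extending it to G1x8 means adding more placeholder parameters to G1_PICK and more names to the list in GSEL, following the same pattern.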

#61 Post by technosaurus »

Some practice porting C code to use vector extensions (from https://github.com/easyaspi314/xxhash-c ... sh32-ref.c)

Code: Select all

#include <stddef.h> /* size_t, NULL */
#include <stdint.h> /* uint8_t, uint32_t */
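/* 4 x uint32_t lanes; aligned(1) and may_alias because input can be any byte buffer */
typedef uint32_t u32x4t __attribute__ ((__vector_size__ (16), __aligned__ (1), __may_alias__));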
#define PRIME32_1 0x9E3779B1U  /* 0b10011110001101110111100110110001 */
#define PRIME32_2 0x85EBCA77U  /* 0b10000101111010111100101001110111 */
#define PRIME32_3 0xC2B2AE3DU  /* 0b11000010101100101010111000111101 */
#define PRIME32_4 0x27D4EB2FU  /* 0b00100111110101001110101100101111 */
#define PRIME32_5 0x165667B1U  /* 0b00010110010101100110011110110001 */
#define XXH_rotl32(v, a) (((v) << (a)) | ((v) >> (32 - (a))))

uint32_t XXH32(void const * input, size_t  length, uint32_t const seed){
	uint8_t const *data = (uint8_t const *) input;
    uint32_t const *udata;
    uint32_t hash = seed + PRIME32_5;
	u32x4t const * u32x4data = (u32x4t const *) input;
	size_t remaining = length;

	if (input != NULL){ /* Don't dereference a null pointer.*/
		if (remaining >= 16) {
			u32x4t lanes = { PRIME32_1+PRIME32_2,PRIME32_2,0,-PRIME32_1 };
			lanes += seed;
			while (remaining >= 16) {
				lanes += *u32x4data++ * PRIME32_2;
				lanes = XXH_rotl32(lanes, 13);
				lanes *= PRIME32_1;
				remaining -= 16;
			}
			lanes = XXH_rotl32(lanes, ((u32x4t){1,7,12,18}));
			hash = lanes[0]+lanes[1]+lanes[2]+lanes[3];
		} else { /* Not enough data for main loop, put something in there instead.*/
			hash = seed + PRIME32_5;
		}
		hash += (uint32_t) length;
		/* Process the remaining data. */
		udata = (uint32_t const *) u32x4data;
		while (remaining >= 4) {
			hash += *udata++ * PRIME32_3;
			hash  = XXH_rotl32(hash, 17);
			hash *= PRIME32_4;
			remaining -= 4;
		}
		data = (uint8_t const *)udata;
		while (remaining != 0) {
			hash += (uint32_t) *data++ * PRIME32_5;
			hash  = XXH_rotl32(hash, 11);
			hash *= PRIME32_1;
			--remaining;
		}
	}
	hash ^= hash >> 15;
	hash *= PRIME32_2;
	hash ^= hash >> 13;
	hash *= PRIME32_3;
	hash ^= hash >> 16;
	return hash;
}

  
/*
 *  xxHash - Fast Hash algorithm
 *  Copyright (C) 2012-2019, Yann Collet
 *  Copyright (C) 2019, easyaspi314 (Devin)
 *
 *  BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
 *
 *  Redistribution and use in source and binary forms, with or without
 *  modification, are permitted provided that the following conditions are
 *  met:
 *
 *  * Redistributions of source code must retain the above copyright
 *  notice, this list of conditions and the following disclaimer.
 *  * Redistributions in binary form must reproduce the above
 *  copyright notice, this list of conditions and the following disclaimer
 *  in the documentation and/or other materials provided with the
 *  distribution.
 *
 *  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
 *  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
 *  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
 *  A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
 *  OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 *  SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
 *  LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
 *  DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
 *  THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 *  (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
 *  OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 *
 *  You can contact the author at :
 *  - xxHash homepage: http://www.xxhash.com
 *  - xxHash source repository : https://github.com/Cyan4973/xxHash */
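
The step in the port that looks least like the scalar code is the final merge, where each lane is rotated by its own amount. A minimal sketch of just that step follows, assuming a gcc or clang that accepts a vector as the shift count. u32x4, rotl_lanes, rotl32, merge_lanes and rotl_demo are hypothetical names, and the 32 is spelled out as a vector so the sketch does not rely on mixing scalars with vectors.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t u32x4 __attribute__((__vector_size__(16)));

/* Rotate every lane of v left by the matching lane of n (counts 1..31). */
static inline u32x4 rotl_lanes(u32x4 v, u32x4 n)
{
	const u32x4 width = {32, 32, 32, 32};
	return (v << n) | (v >> (width - n));
}

/* Scalar reference for one lane (n must be 1..31). */
static inline uint32_t rotl32(uint32_t v, unsigned n)
{
	return (v << n) | (v >> (32 - n));
}

/* Rotate the lanes by {1,7,12,18} and add them up, as in the XXH32 merge. */
static inline uint32_t merge_lanes(u32x4 lanes)
{
	const u32x4 counts = {1, 7, 12, 18};
	lanes = rotl_lanes(lanes, counts);
	return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}

/* The vector version must agree with four scalar rotates. */
void rotl_demo(void)
{
	u32x4 v = {0x80000001u, 2u, 3u, 4u};
	uint32_t expect = rotl32(v[0], 1) + rotl32(v[1], 7)
	                + rotl32(v[2], 12) + rotl32(v[3], 18);
	assert(merge_lanes(v) == expect);
}
```

Checking the vector path against a plain scalar rotate like this is a cheap way to catch a wrong shift count before comparing whole hashes.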
