Optimize sdarray.cpp to use a g++ builtin instead of naive counting.
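
The message does not name the builtin. As a rough illustration only, assuming the counting in question is a per-word population count (number of set bits), the change might look something like the sketch below. The function names `popcount_naive` and `popcount_builtin` are hypothetical and do not come from sdarray.cpp. `__builtin_popcountll` is the GCC builtin for `unsigned long long` operands.

```cpp
#include <cstdint>

// Hypothetical "before": examine one bit per loop iteration.
// Assumption: this stands in for the naive counting the commit refers to.
inline int popcount_naive(std::uint64_t x) {
    int count = 0;
    while (x != 0) {
        count += static_cast<int>(x & 1u);  // add the lowest bit
        x >>= 1;                            // move on to the next bit
    }
    return count;
}

// Hypothetical "after": delegate to the GCC builtin, which the compiler
// can lower to a single instruction when the target supports one
// (for example with -mpopcnt or -march=native on x86-64).
inline int popcount_builtin(std::uint64_t x) {
    return __builtin_popcountll(x);
}
```

Both functions return the same value for every input, so under this assumption the swap changes speed, not results. Without a target flag that enables a hardware popcount instruction, GCC falls back to a library routine, which is still correct but may be slower.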