Fastest way to determine if an integer's square root is an integer

Question

I'm looking for the fastest way to determine if a long value is a perfect square (i.e. its square root is another integer).

I've done it the easy way, by using the built-in Math.sqrt() function, but I'm wondering if there is a way to do it faster by restricting yourself to the integer-only domain. Maintaining a lookup table is impractical (since there are about 2^31.5 integers whose square is less than 2^63).

Here is the very simple and straightforward way I'm doing it now:

    public final static boolean isPerfectSquare(long n)
    {
      if (n < 0)
        return false;

      long tst = (long)(Math.sqrt(n) + 0.5);
      return tst*tst == n;
    }

Note: I'm using this function in many Project Euler problems, so no one else will ever have to maintain this code. And this kind of micro-optimization could actually make a difference, since part of the challenge is to do every algorithm in less than a minute, and this function will need to be called millions of times in some problems.

I've tried the different solutions to the problem:

- After exhaustive testing, I found that adding 0.5 to the result of Math.sqrt() is not necessary, at least not on my machine.
- The fast inverse square root was faster, but it gave incorrect results for n >= 410881. However, as suggested by BobbyShaftoe, we can use the FISR hack for n < 410881.
- Newton's method was a good bit slower than Math.sqrt(). This is probably because Math.sqrt() uses something similar to Newton's method, but implemented in the hardware so it's much faster than in Java. Also, Newton's method still required use of doubles.
- A modified Newton's method, which used a few tricks so that only integer math was involved, required some hacks to avoid overflow (I want this function to work with all positive 64-bit signed integers), and it was still slower than Math.sqrt().
- Binary chop was even slower. This makes sense because the binary chop will on average require 16 passes to find the square root of a 64-bit number.
- According to John's tests, using or statements is faster in C++ than using a switch, but in Java and C# there appears to be no difference between or and switch.
- I also tried making a lookup table (as a private static array of 64 boolean values). Then instead of either switch or or statement, I would just say if(lookup[(int)(n&0x3F)]) { test } else return false;. To my surprise, this was (just slightly) slower. This is because array bounds are checked in Java.

User · Answer

Just for the record, another approach is to use the prime decomposition. If every prime in the decomposition appears with an even exponent, then the number is a perfect square. So what you want is to see if a number can be decomposed as a product of squares of prime numbers. Of course, you don't need to obtain such a decomposition, just to see if it exists.

First build a table of squares of prime numbers which are lower than 2^32. This is far smaller than a table of all integers up to this limit.

A solution would then be like this:

    boolean isPerfectSquare(long number)
    {
        if (number < 0) return false;
        if (number < 2) return true;

        for (int i = 0; ; i++)
        {
            long square = squareTable[i];
            if (square > number) return false;
            while (number % square == 0)
            {
                number /= square;
            }
            if (number == 1) return true;
        }
    }

I guess it's a bit cryptic. At every step it checks whether the square of a prime number divides the input number. If it does, it divides the number by that square for as long as possible, to remove the square from the prime decomposition. If this process reaches 1, then the input number was a product of squares of prime numbers. If the square becomes larger than the number itself, then there is no way this square, or any larger square, can divide it, so the number cannot be a product of squares of prime numbers.

Given that sqrt is done in hardware nowadays and that prime numbers have to be computed here, I guess this solution is way slower. But it should give better results than the solution with sqrt, which won't work over 2^54, as mrzl says in his answer.
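The answer leaves squareTable undefined. A minimal sketch of one way to build it, assuming a plain sieve of Eratosthenes and a much smaller prime limit than 2^32 so that it runs quickly; the class name PrimeSquareCheck, the method buildSquareTable and the explicit exception when the table runs out are all illustrative choices, not part of the answer:

```java
import java.util.ArrayList;
import java.util.List;

final class PrimeSquareCheck {
    /** Squares of all primes p <= primeLimit, in ascending order (simple sieve). */
    static long[] buildSquareTable(int primeLimit) {
        boolean[] composite = new boolean[primeLimit + 1];
        List<Long> squares = new ArrayList<>();
        for (int p = 2; p <= primeLimit; p++) {
            if (composite[p]) continue;
            squares.add((long) p * p);
            // Mark multiples of p starting at p*p; smaller ones are already marked.
            for (long m = (long) p * p; m <= primeLimit; m += p) {
                composite[(int) m] = true;
            }
        }
        long[] table = new long[squares.size()];
        for (int i = 0; i < table.length; i++) table[i] = squares.get(i);
        return table;
    }

    /** Same loop as in the answer, but it fails loudly if the table is too small. */
    static boolean isPerfectSquare(long number, long[] squareTable) {
        if (number < 0) return false;
        if (number < 2) return true;
        for (long square : squareTable) {
            if (square > number) return false;
            while (number % square == 0) number /= square;
            if (number == 1) return true;
        }
        throw new IllegalArgumentException("square table too small for this input");
    }
}
```

With a table built from primes up to 1100, inputs up to about 1.2 million are covered; larger inputs need a proportionally larger table.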

User · Answer

You'll have to do some benchmarking. The best algorithm will depend on the distribution of your inputs.

Your algorithm may be nearly optimal, but you might want to do a quick check to rule out some possibilities before calling your square root routine. For example, look at the last digit of your number in hex by doing a bit-wise "and". Perfect squares can only end in 0, 1, 4, or 9 in base 16. So for 75% of your inputs (assuming they are uniformly distributed) you can avoid a call to the square root in exchange for some very fast bit twiddling.

Kip benchmarked the following code implementing the hex trick. When testing numbers 1 through 100,000,000, this code ran twice as fast as the original.

    public final static boolean isPerfectSquare(long n)
    {
        if (n < 0)
            return false;

        switch((int)(n & 0xF))
        {
        case 0: case 1: case 4: case 9:
            long tst = (long)Math.sqrt(n);
            return tst*tst == n;

        default:
            return false;
        }
    }

When I tested the analogous code in C++, it actually ran slower than the original. However, when I eliminated the switch statement, the hex trick once again made the code twice as fast.

    int isPerfectSquare(int n)
    {
        int h = n & 0xF;  // h is the last hex "digit"
        if (h > 9)
            return 0;
        // Use lazy evaluation to jump out of the if statement as soon as possible
        if (h != 2 && h != 3 && h != 5 && h != 6 && h != 7 && h != 8)
        {
            int t = (int) floor( sqrt((double) n) + 0.5 );
            return t*t == n;
        }
        return 0;
    }

Eliminating the switch statement had little effect on the C# code.
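The claim that squares can only end in 0, 1, 4 or 9 in base 16 is easy to confirm by enumeration, since k*k mod 16 depends only on k mod 16. A small sketch (the class and method names are made up for illustration):

```java
import java.util.TreeSet;

final class HexDigitResidues {
    /** Distinct values of k*k mod 16; k only needs to range over 0..15. */
    static TreeSet<Integer> squaresMod16() {
        TreeSet<Integer> residues = new TreeSet<>();
        for (int k = 0; k < 16; k++) {
            residues.add((k * k) & 0xF);
        }
        return residues;
    }

    public static void main(String[] args) {
        System.out.println(squaresMod16()); // prints [0, 1, 4, 9]
    }
}
```

Four of sixteen residues survive, which is where the 75% rejection rate for uniformly distributed inputs comes from.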

User · Answer

I figured out a method that works about 35% faster than your 6bits+Carmack+sqrt code, at least with my CPU (x86) and programming language (C/C++). Your results may vary, especially because I don't know how the Java factor will play out.

My approach is threefold.

First, filter out obvious answers. This includes negative numbers and looking at the last 4 bits. (I found looking at the last six didn't help.) I also answer yes for 0. (In reading the code below, note that my input is int64 x.)

    if( x < 0 || (x&2) || ((x & 7) == 5) || ((x & 11) == 8) )
        return false;
    if( x == 0 )
        return true;

Next, check if it's a square modulo 255 = 3 * 5 * 17. Because that's a product of three distinct primes, only about 1/8 of the residues coprime to 255 are squares (counting the multiples of 3, 5 and 17 as well, 54 of the 255 residues are). However, in my experience, calling the modulo operator (%) costs more than the benefit one gets, so I use bit tricks involving 255 = 2^8-1 to compute the residue. (For better or worse, I am not using the trick of reading individual bytes out of a word, only bitwise-and and shifts.)

    int64 y = x;
    y = (y & 4294967295LL) + (y >> 32);
    y = (y & 65535) + (y >> 16);
    y = (y & 255) + ((y >> 8) & 255) + (y >> 16);
    // At this point, y is between 0 and 511.  More code can reduce it farther.

To actually check if the residue is a square, I look up the answer in a precomputed table.

    if( bad255[y] )
        return false;
    // However, I just use a table of size 512

Finally, try to compute the square root using a method similar to Hensel's lemma. (I don't think it's applicable directly, but it works with some modifications.) Before doing that, I divide out all powers of 4 with a binary search:

    if((x & 4294967295LL) == 0)
        x >>= 32;
    if((x & 65535) == 0)
        x >>= 16;
    if((x & 255) == 0)
        x >>= 8;
    if((x & 15) == 0)
        x >>= 4;
    if((x & 3) == 0)
        x >>= 2;

At this point, for our number to be a square, it must be 1 mod 8.

    if((x & 7) != 1)
        return false;

The basic structure of Hensel's lemma is the following. (Note: untested code; if it doesn't work, try t=2 or 8.)

    int64 t = 4, r = 1;
    t <<= 1; r += ((x - r * r) & t) >> 1;
    t <<= 1; r += ((x - r * r) & t) >> 1;
    t <<= 1; r += ((x - r * r) & t) >> 1;
    // Repeat until t is 2^33 or so.  Use a loop if you want.

The idea is that at each iteration, you add one bit onto r, the "current" square root of x; each square root is accurate modulo a larger and larger power of 2, namely t/2. At the end, r and t/2-r will be square roots of x modulo t/2. (Note that if r is a square root of x, then so is -r. This is true even modulo numbers, but beware, modulo some numbers, things can have even more than 2 square roots; notably, this includes powers of 2.) Because our actual square root is less than 2^32, at that point we can actually just check if r or t/2-r are real square roots. In my actual code, I use the following modified loop:

    int64 r, t, z;
    r = start[(x >> 3) & 1023];
    do {
        z = x - r * r;
        if( z == 0 )
            return true;
        if( z < 0 )
            return false;
        t = z & (-z);
        r += (z & t) >> 1;
        if( r > (t >> 1) )
            r = t - r;
    } while( t <= (1LL << 33) );

The speedup here is obtained in three ways: precomputed start value (equivalent to about 10 iterations of the loop), earlier exit of the loop, and skipping some t values. For the last part, I look at z = x - r * r, and set t to be the largest power of 2 dividing z with a bit trick. This allows me to skip t values that wouldn't have affected the value of r anyway. The precomputed start value in my case picks out the "smallest positive" square root modulo 8192.

Even if this code doesn't work faster for you, I hope you enjoy some of the ideas it contains. Complete, tested code follows, including the precomputed tables.

    typedef signed long long int int64;

    int start[1024] =
    {1,3,1769,5,1937,1741,7,1451,479,157,9,91,945,659,1817,11,
    1983,707,1321,1211,1071,13,1479,405,415,1501,1609,741,15,339,1703,203,
    129,1411,873,1669,17,1715,1145,1835,351,1251,887,1573,975,19,1127,395,
    1855,1981,425,453,1105,653,327,21,287,93,713,1691,1935,301,551,587,
    257,1277,23,763,1903,1075,1799,1877,223,1437,1783,859,1201,621,25,779,
    1727,573,471,1979,815,1293,825,363,159,1315,183,27,241,941,601,971,
    385,131,919,901,273,435,647,1493,95,29,1417,805,719,1261,1177,1163,
    1599,835,1367,315,1361,1933,1977,747,31,1373,1079,1637,1679,1581,1753,1355,
    513,1539,1815,1531,1647,205,505,1109,33,1379,521,1627,1457,1901,1767,1547,
    1471,1853,1833,1349,559,1523,967,1131,97,35,1975,795,497,1875,1191,1739,
    641,1149,1385,133,529,845,1657,725,161,1309,375,37,463,1555,615,1931,
    1343,445,937,1083,1617,883,185,1515,225,1443,1225,869,1423,1235,39,1973,
    769,259,489,1797,1391,1485,1287,341,289,99,1271,1701,1713,915,537,1781,
    1215,963,41,581,303,243,1337,1899,353,1245,329,1563,753,595,1113,1589,
    897,1667,407,635,785,1971,135,43,417,1507,1929,731,207,275,1689,1397,
    1087,1725,855,1851,1873,397,1607,1813,481,163,567,101,1167,45,1831,1205,
    1025,1021,1303,1029,1135,1331,1017,427,545,1181,1033,933,1969,365,1255,1013,
    959,317,1751,187,47,1037,455,1429,609,1571,1463,1765,1009,685,679,821,
    1153,387,1897,1403,1041,691,1927,811,673,227,137,1499,49,1005,103,629,
    831,1091,1449,1477,1967,1677,697,1045,737,1117,1737,667,911,1325,473,437,
    1281,1795,1001,261,879,51,775,1195,801,1635,759,165,1871,1645,1049,245,
    703,1597,553,955,209,1779,1849,661,865,291,841,997,1265,1965,1625,53,
    1409,893,105,1925,1297,589,377,1579,929,1053,1655,1829,305,1811,1895,139,
    575,189,343,709,1711,1139,1095,277,993,1699,55,1435,655,1491,1319,331,
    1537,515,791,507,623,1229,1529,1963,1057,355,1545,603,1615,1171,743,523,
    447,1219,1239,1723,465,499,57,107,1121,989,951,229,1521,851,167,715,
    1665,1923,1687,1157,1553,1869,1415,1749,1185,1763,649,1061,561,531,409,907,
    319,1469,1961,59,1455,141,1209,491,1249,419,1847,1893,399,211,985,1099,
    1793,765,1513,1275,367,1587,263,1365,1313,925,247,1371,1359,109,1561,1291,
    191,61,1065,1605,721,781,1735,875,1377,1827,1353,539,1777,429,1959,1483,
    1921,643,617,389,1809,947,889,981,1441,483,1143,293,817,749,1383,1675,
    63,1347,169,827,1199,1421,583,1259,1505,861,457,1125,143,1069,807,1867,
    2047,2045,279,2043,111,307,2041,597,1569,1891,2039,1957,1103,1389,231,2037,
    65,1341,727,837,977,2035,569,1643,1633,547,439,1307,2033,1709,345,1845,
    1919,637,1175,379,2031,333,903,213,1697,797,1161,475,1073,2029,921,1653,
    193,67,1623,1595,943,1395,1721,2027,1761,1955,1335,357,113,1747,1497,1461,
    1791,771,2025,1285,145,973,249,171,1825,611,265,1189,847,1427,2023,1269,
    321,1475,1577,69,1233,755,1223,1685,1889,733,1865,2021,1807,1107,1447,1077,
    1663,1917,1129,1147,1775,1613,1401,555,1953,2019,631,1243,1329,787,871,885,
    449,1213,681,1733,687,115,71,1301,2017,675,969,411,369,467,295,693,
    1535,509,233,517,401,1843,1543,939,2015,669,1527,421,591,147,281,501,
    577,195,215,699,1489,525,1081,917,1951,2013,73,1253,1551,173,857,309,
    1407,899,663,1915,1519,1203,391,1323,1887,739,1673,2011,1585,493,1433,117,
    705,1603,1111,965,431,1165,1863,533,1823,605,823,1179,625,813,2009,75,
    1279,1789,1559,251,657,563,761,1707,1759,1949,777,347,335,1133,1511,267,
    833,1085,2007,1467,1745,1805,711,149,1695,803,1719,485,1295,1453,935,459,
    1151,381,1641,1413,1263,77,1913,2005,1631,541,119,1317,1841,1773,359,651,
    961,323,1193,197,175,1651,441,235,1567,1885,1481,1947,881,2003,217,843,
    1023,1027,745,1019,913,717,1031,1621,1503,867,1015,1115,79,1683,793,1035,
    1089,1731,297,1861,2001,1011,1593,619,1439,477,585,283,1039,1363,1369,1227,
    895,1661,151,645,1007,1357,121,1237,1375,1821,1911,549,1999,1043,1945,1419,
    1217,957,599,571,81,371,1351,1003,1311,931,311,1381,1137,723,1575,1611,
    767,253,1047,1787,1169,1997,1273,853,1247,413,1289,1883,177,403,999,1803,
    1345,451,1495,1093,1839,269,199,1387,1183,1757,1207,1051,783,83,423,1995,
    639,1155,1943,123,751,1459,1671,469,1119,995,393,219,1743,237,153,1909,
    1473,1859,1705,1339,337,909,953,1771,1055,349,1993,613,1393,557,729,1717,
    511,1533,1257,1541,1425,819,519,85,991,1693,503,1445,433,877,1305,1525,
    1601,829,809,325,1583,1549,1991,1941,927,1059,1097,1819,527,1197,1881,1333,
    383,125,361,891,495,179,633,299,863,285,1399,987,1487,1517,1639,1141,
    1729,579,87,1989,593,1907,839,1557,799,1629,201,155,1649,1837,1063,949,
    255,1283,535,773,1681,461,1785,683,735,1123,1801,677,689,1939,487,757,
    1857,1987,983,443,1327,1267,313,1173,671,221,695,1509,271,1619,89,565,
    127,1405,1431,1659,239,1101,1159,1067,607,1565,905,1755,1231,1299,665,373,
    1985,701,1879,1221,849,627,1465,789,543,1187,1591,923,1905,979,1241,181};

    bool bad255[512] =
    {0,0,1,1,0,1,1,1,1,0,1,1,1,1,1,0,0,1,1,0,1,0,1,1,1,0,1,1,1,1,0,1,
     1,1,0,1,0,1,1,1,1,1,1,1,1,1,1,1,1,0,1,0,1,1,1,0,1,1,1,1,0,1,1,1,
     0,1,0,1,1,0,0,1,1,1,1,1,0,1,1,1,1,0,1,1,0,0,1,1,1,1,1,1,1,1,0,1,
     1,1,1,1,0,1,1,1,1,1,0,1,1,1,1,0,1,1,1,0,1,1,1,1,0,0,1,1,1,1,1,1,
     1,1,1,1,1,1,1,0,0,1,1,1,1,1,1,1,0,0,1,1,1,1,1,0,1,1,0,1,1,1,1,1,
     1,1,1,1,1,1,0,1,1,0,1,0,1,1,0,1,1,1,1,1,1,1,1,1,1,1,0,1,1,0,1,1,
     1,1,1,0,0,1,1,1,1,1,1,1,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,1,1,1,
     1,0,1,1,1,0,1,1,1,1,0,1,1,1,1,1,0,1,1,1,1,1,0,1,1,1,1,1,1,1,1,
     0,0,1,1,0,1,1,1,1,0,1,1,1,1,1,0,0,1,1,0,1,0,1,1,1,0,1,1,1,1,0,1,
     1,1,0,1,0,1,1,1,1,1,1,1,1,1,1,1,1,0,1,0,1,1,1,0,1,1,1,1,0,1,1,1,
     0,1,0,1,1,0,0,1,1,1,1,1,0,1,1,1,1,0,1,1,0,0,1,1,1,1,1,1,1,1,0,1,
     1,1,1,1,0,1,1,1,1,1,0,1,1,1,1,0,1,1,1,0,1,1,1,1,0,0,1,1,1,1,1,1,
     1,1,1,1,1,1,1,0,0,1,1,1,1,1,1,1,0,0,1,1,1,1,1,0,1,1,0,1,1,1,1,1,
     1,1,1,1,1,1,0,1,1,0,1,0,1,1,0,1,1,1,1,1,1,1,1,1,1,1,0,1,1,0,1,1,
     1,1,1,0,0,1,1,1,1,1,1,1,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,1,1,1,
     1,0,1,1,1,0,1,1,1,1,0,1,1,1,1,1,0,1,1,1,1,1,0,1,1,1,1,1,1,1,1,
     0,0};

    inline bool square( int64 x ) {
        // Quickfail
        if( x < 0 || (x&2) || ((x & 7) == 5) || ((x & 11) == 8) )
            return false;
        if( x == 0 )
            return true;

        // Check mod 255 = 3 * 5 * 17, for fun
        int64 y = x;
        y = (y & 4294967295LL) + (y >> 32);
        y = (y & 65535) + (y >> 16);
        y = (y & 255) + ((y >> 8) & 255) + (y >> 16);
        if( bad255[y] )
            return false;

        // Divide out powers of 4 using binary search
        if((x & 4294967295LL) == 0)
            x >>= 32;
        if((x & 65535) == 0)
            x >>= 16;
        if((x & 255) == 0)
            x >>= 8;
        if((x & 15) == 0)
            x >>= 4;
        if((x & 3) == 0)
            x >>= 2;

        if((x & 7) != 1)
            return false;

        // Compute sqrt using something like Hensel's lemma
        int64 r, t, z;
        r = start[(x >> 3) & 1023];
        do {
            z = x - r * r;
            if( z == 0 )
                return true;
            if( z < 0 )
                return false;
            t = z & (-z);
            r += (z & t) >> 1;
            if( r > (t >> 1) )
                r = t - r;
        } while( t <= (1LL << 33) );

        return false;
    }
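The mod-255 step can be reproduced without hard-coding the table. The following Java sketch is a port for illustration only (the answer itself is C/C++); the class name Mod255 and its methods are assumptions. It folds a non-negative long with the same shifts and masks, and regenerates a 512-entry table equivalent in purpose to bad255 by marking every index whose value mod 255 is a square residue:

```java
import java.util.Arrays;

final class Mod255 {
    /** Folds a non-negative long into 0..511 while preserving its value mod 255. */
    static int fold(long x) {
        long y = x;
        // 2^32, 2^16 and 2^8 are all congruent to 1 mod 255, so adding the
        // high part to the low part leaves the residue unchanged.
        y = (y & 0xFFFFFFFFL) + (y >>> 32);
        y = (y & 0xFFFF) + (y >>> 16);
        y = (y & 0xFF) + ((y >>> 8) & 0xFF) + (y >>> 16);
        return (int) y;
    }

    /** bad[i] is true when i mod 255 is not the residue of any square. */
    static boolean[] buildBad255() {
        boolean[] bad = new boolean[512];
        Arrays.fill(bad, true);
        for (int k = 0; k < 255; k++) {
            int r = (k * k) % 255;
            bad[r] = false;
            bad[r + 255] = false;
            if (r + 510 < 512) bad[r + 510] = false;
        }
        return bad;
    }

    /** Number of distinct square residues among 0..254. */
    static int countSquareResidues() {
        boolean[] bad = buildBad255();
        int count = 0;
        for (int i = 0; i < 255; i++) if (!bad[i]) count++;
        return count;
    }
}
```

The table has 512 entries rather than 255 because the folded value is only reduced to the range 0..511, not fully reduced mod 255.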

User · Answer

Here is the simplest and most concise way, although I do not know how it compares in terms of CPU cycles. This works great if you only wish to know if the root is a whole number. If you really care if it is an integer, you can also figure that out. Here is a simple (and pure) function:

    private static final MathContext precision = new MathContext(20);

    private static final Function<Long, Boolean> isRootWhole = (n) -> {
        long digit = n % 10;
        if (digit == 2 || digit == 3 || digit == 7 || digit == 8) {
            return false;
        }
        return new BigDecimal(n).sqrt(precision).scale() == 0;
    };

If you do not need micro-optimization, this answer is better in terms of simplicity and maintainability. If you will be calculating negative numbers, you will need to handle that accordingly, and send the absolute value into the function. I have included a minor optimization because no perfect square has a last digit of 2, 3, 7, or 8, due to quadratic residues mod 10.

On my CPU, a run of this algorithm on 0 - 10,000,000 took an average of 1000 - 1100 nanoseconds per calculation. If you are performing fewer calculations, the earlier calculations take a bit longer.

I had a negative comment that my previous edit did not work for large numbers. The OP mentioned Longs, and the largest perfect square that is a Long is 9223372030926249001, so this method works for all Longs.

User · Answer

It's been pointed out that the last d digits of a perfect square can only take on certain values. The last d digits (in base b) of a number n is the same as the remainder when n is divided by b^d, i.e. in C notation n % pow(b, d).

This can be generalized to any modulus m, i.e. n % m can be used to rule out some percentage of numbers from being perfect squares. The modulus you are currently using is 64, which allows 12, i.e. 19% of remainders, as possible squares. With a little coding I found the modulus 110880, which allows only 2016, i.e. 1.8% of remainders, as possible squares. So depending on the cost of a modulus operation (i.e. division) and a table lookup versus a square root on your machine, using this modulus might be faster.

By the way, if Java has a way to store a packed array of bits for the lookup table, don't use it. 110880 32-bit words is not much RAM these days and fetching a machine word is going to be faster than fetching a single bit.
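The residue counts quoted above can be reproduced by brute force. A minimal sketch, assuming an unpacked boolean[] lookup table as the answer recommends; the names ResidueTable, build, count and mightBeSquare are hypothetical:

```java
final class ResidueTable {
    /** table[r] is true when r is the remainder of some square modulo m. */
    static boolean[] build(int m) {
        boolean[] isResidue = new boolean[m];
        for (long k = 0; k < m; k++) {
            isResidue[(int) ((k * k) % m)] = true;
        }
        return isResidue;
    }

    /** Number of remainders modulo m that a perfect square can leave. */
    static int count(int m) {
        int total = 0;
        for (boolean b : build(m)) if (b) total++;
        return total;
    }

    /** Cheap pre-filter: false means n is certainly not a perfect square. */
    static boolean mightBeSquare(long n, boolean[] table) {
        return n >= 0 && table[(int) (n % table.length)];
    }
}
```

A true result from mightBeSquare still has to be confirmed with a real square root; the table only rules candidates out.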

User · Answer

Not sure if this is the fastest way, but this is something I stumbled upon a long time ago in high school, when I was bored and playing with my calculator during math class. At that time, I was really amazed this was working.

    public static boolean isIntRoot(int number) {
        return isIntRootHelper(number, 1);
    }

    private static boolean isIntRootHelper(int number, int index) {
        if (number == index) {
            return true;
        }
        if (number < index) {
            return false;
        }
        else {
            return isIntRootHelper(number - 2 * index, index + 1);
        }
    }
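Why this works: after subtracting 2*1, 2*2, ..., 2*(k-1) the remaining value is n - k(k-1), and that equals k exactly when n = k^2. Here is an iterative sketch of the same idea, written as a loop to avoid deep recursion for large inputs (the class name is made up; like the recursive version, it reports 0 as not a square):

```java
final class OddStepSquareCheck {
    /** Iterative form of the recursive helper: subtract 2*index while possible. */
    static boolean isIntRoot(int number) {
        int index = 1;
        while (number > index) {
            number -= 2 * index; // after this step: original - index*(index+1)
            index++;
        }
        // Reached when number <= index; equality means original == index^2.
        return number == index;
    }
}
```

The loop runs about sqrt(n) times, so this is a curiosity rather than a competitor to the Math.sqrt-based checks discussed elsewhere in this thread.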

User · Answer

Project Euler is mentioned in the tags, and many of the problems in it require checking numbers much larger than 2^64. Most of the optimizations mentioned above don't work easily when you are working with an 80 byte buffer.

I used java.math.BigInteger and a slightly modified version of Newton's method, one that works better with integers. The problem was that exact squares n^2 converged to (n-1) instead of n, because n^2-1 = (n-1)(n+1) and the final error was just one step below the final divisor, so the algorithm terminated. It was easy to fix by adding one to the original argument before computing the error. (Add two for cube roots, etc.)

One nice attribute of this algorithm is that you can immediately tell if the number is a perfect square: the final error (not correction) in Newton's method will be zero. A simple modification also lets you quickly calculate floor(sqrt(x)) instead of the closest integer. This is handy with several Euler problems.
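The answer does not show its code, so the following is not the author's variant. It is a sketch of the textbook integer Newton iteration for floor(sqrt(n)) on BigInteger, started from a power of two above the root so that the sequence decreases until it stops; the class and method names are assumptions:

```java
import java.math.BigInteger;

final class BigSquares {
    /** floor(sqrt(n)) by integer Newton iteration; n must be non-negative. */
    static BigInteger floorSqrt(BigInteger n) {
        if (n.signum() < 0) throw new ArithmeticException("negative input");
        if (n.signum() == 0) return BigInteger.ZERO;
        // 2^ceil(bitLength/2) is strictly greater than sqrt(n).
        BigInteger x = BigInteger.ONE.shiftLeft((n.bitLength() + 1) / 2);
        while (true) {
            BigInteger y = x.add(n.divide(x)).shiftRight(1);
            if (y.compareTo(x) >= 0) {
                return x; // the estimate stopped decreasing: x is the floor root
            }
            x = y;
        }
    }

    /** n is a perfect square exactly when the floor root squares back to n. */
    static boolean isPerfectSquare(BigInteger n) {
        if (n.signum() < 0) return false;
        BigInteger r = floorSqrt(n);
        return r.multiply(r).equals(n);
    }
}
```

Because the iteration yields the floor root directly, the off-by-one behaviour described above does not arise in this form; the perfect-square test is a single multiply and compare at the end.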

User · Answer

I like the idea of using an almost correct method on some of the input. Here is a version with a higher "offset". The code seems to work and passes my simple test case.

Just replace your

    if(n < 410881L){...}

code with this one:

    if (n < 11043908100L) {
        // John Carmack hack, converted to Java.
        // See: http://www.codemaestro.com/reviews/9
        int i;
        float x2, y;

        x2 = n * 0.5F;
        y = n;
        i = Float.floatToRawIntBits(y);
        // using the magic number from
        // http://www.lomont.org/Math/Papers/2003/InvSqrt.pdf
        // since it is more accurate
        i = 0x5f375a86 - (i >> 1);
        y = Float.intBitsToFloat(i);
        y = y * (1.5F - (x2 * y * y));
        y = y * (1.5F - (x2 * y * y)); // Newton iteration, more accurate

        sqrt = Math.round(1.0F / y);
    } else {
        // Carmack hack gives incorrect answer for n >= 11043908100.
        sqrt = (long) Math.sqrt(n);
    }

User · Answer

I'm not sure if it would be faster, or even accurate, but you could use John Carmack's Magical Square Root algorithm to solve the square root faster. You could probably easily test this for all possible 32-bit integers, and validate that you actually got correct results, as it's only an approximation. However, now that I think about it, using doubles is approximating also, so I'm not sure how that would come into play.

User · Answer

The following simplification of maaartinus's solution appears to shave a few percentage points off the runtime, but I'm not good enough at benchmarking to produce a benchmark I can trust:

    long goodMask; // 0xC840C04048404040 computed below
    {
        for (int i=0; i<64; ++i) goodMask |= Long.MIN_VALUE >>> (i*i);
    }

    public boolean isSquare(long x) {
        // This tests if the 6 least significant bits are right.
        // Moving the to be tested bit to the highest position saves us masking.
        if (goodMask << x >= 0) return false;
        // Remove an even number of trailing zeros, leaving at most one.
        x >>= (Long.numberOfTrailingZeros(x) & (-2));
        // Repeat the test on the 6 least significant remaining bits.
        if (goodMask << x >= 0 | x <= 0) return x == 0;
        // Do it in the classical way.
        // The correctness is not trivial as the conversion from long to double is lossy!
        final long tst = (long) Math.sqrt(x);
        return tst * tst == x;
    }

It would be worth checking how omitting the first test,

    if (goodMask << x >= 0) return false;

would affect performance.
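The mask constant can be checked independently. In Java a long shift uses only the low six bits of the shift distance, so Long.MIN_VALUE >>> (i*i) sets the bit at position 63 - (i*i mod 64), and goodMask << x moves the bit for x mod 64 into the sign position. A small sketch that recomputes the mask and exposes the six-bit test on its own (the class and method names are illustrative):

```java
final class Low6BitFilter {
    /** One bit per residue mod 64, stored from the top bit downwards. */
    static long computeGoodMask() {
        long mask = 0;
        for (int i = 0; i < 64; i++) {
            // The shift distance is taken mod 64, so this marks (i*i) mod 64.
            mask |= Long.MIN_VALUE >>> (i * i);
        }
        return mask;
    }

    private static final long GOOD_MASK = computeGoodMask();

    /** True when the low 6 bits of x match those of some perfect square. */
    static boolean mayBeSquare(long x) {
        // Shifting left by x (mod 64) puts the bit for residue x into the sign bit.
        return (GOOD_MASK << x) < 0;
    }
}
```

Twelve bits of the mask are set, one for each of the twelve square residues mod 64.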

User · Answer

An integer problem deserves an integer solution. Thus:

Do binary search on the (non-negative) integers to find the greatest integer t such that t^2 <= n. Then test whether t^2 = n exactly. This takes time O(log n).

If you don't know how to binary search the positive integers because the set is unbounded, it's easy. You start by computing your increasing function f (above, f(t) = t^2 - n) on powers of two. When you see it turn positive, you've found an upper bound. Then you can do a standard binary search.
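A sketch of this recipe for long inputs, under one assumption the answer does not spell out: comparisons are done as t <= n / t instead of t * t <= n, so that nothing overflows near 2^63. The class name is hypothetical:

```java
final class BinarySearchSquare {
    static boolean isPerfectSquare(long n) {
        if (n < 0) return false;
        if (n < 2) return true;
        // Gallop over powers of two until hi*hi > n (tested without overflow).
        long hi = 1;
        while (hi <= n / hi) {
            hi <<= 1;
        }
        long lo = hi >> 1; // invariant from here on: lo*lo <= n < hi*hi
        while (hi - lo > 1) {
            long mid = lo + (hi - lo) / 2;
            if (mid <= n / mid) {
                lo = mid;
            } else {
                hi = mid;
            }
        }
        // lo is floor(sqrt(n)); lo*lo <= n, so this product cannot overflow.
        return lo * lo == n;
    }
}
```

Each comparison costs a 64-bit division, which is consistent with the question's observation that binary chop was slower than Math.sqrt().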

User · Answer

Regarding the Carmack method, it seems like it would be quite easy just to iterate once more, which should double the number of digits of accuracy. It is, after all, an extremely truncated iterative method: Newton's, with a very good first guess.

Regarding your current best, I see two micro-optimizations:

- move the check vs. 0 after the check using mod255
- rearrange the dividing out powers of four to skip all the checks for the usual (75%) case.

I.e.:

    // Divide out powers of 4 using binary search

    if((n & 0x3L) == 0) {
      n >>=2;

      if((n & 0xffffffffL) == 0)
        n >>= 32;
      if((n & 0xffffL) == 0)
          n >>= 16;
      if((n & 0xffL) == 0)
          n >>= 8;
      if((n & 0xfL) == 0)
          n >>= 4;
      if((n & 0x3L) == 0)
          n >>= 2;
    }

Even better might be a simple

    while ((n & 0x03L) == 0) n >>= 2;

Obviously, it would be interesting to know how many numbers get culled at each checkpoint. I rather doubt the checks are truly independent, which makes things tricky.

User · Answer

Here is a divide and conquer solution.

If the square root of a natural number (number) is a natural number (solution), you can easily determine a range for solution based on the number of digits of number:

- number has 1 digit: solution in range = 1 - 4
- number has 2 digits: solution in range = 3 - 10
- number has 3 digits: solution in range = 10 - 40
- number has 4 digits: solution in range = 30 - 100
- number has 5 digits: solution in range = 100 - 400

Notice the repetition?

You can use this range in a binary search approach to see if there is a solution for which:

    number == solution * solution

Here is the code. This is my class SquareRootChecker:

    public class SquareRootChecker {

        private long number;
        private long initialLow;
        private long initialHigh;

        public SquareRootChecker(long number) {
            this.number = number;

            initialLow = 1;
            initialHigh = 4;
            if (Long.toString(number).length() % 2 == 0) {
                initialLow = 3;
                initialHigh = 10;
            }
            for (long i = 0; i < Long.toString(number).length() / 2; i++) {
                initialLow *= 10;
                initialHigh *= 10;
            }
            if (Long.toString(number).length() % 2 == 0) {
                initialLow /= 10;
                initialHigh /= 10;
            }
        }

        public boolean checkSquareRoot() {
            return findSquareRoot(initialLow, initialHigh, number);
        }

        private boolean findSquareRoot(long low, long high, long number) {
            long check = low + (high - low) / 2;
            if (high >= low) {
                if (number == check * check) {
                    return true;
                }
                else if (number < check * check) {
                    high = check - 1;
                    return findSquareRoot(low, high, number);
                }
                else {
                    low = check + 1;
                    return findSquareRoot(low, high, number);
                }
            }
            return false;
        }

    }

And here is an example of how to use it:

    long number = 1234567;
    long square = number * number;
    SquareRootChecker squareRootChecker = new SquareRootChecker(square);
    System.out.println(square + ": " + squareRootChecker.checkSquareRoot()); // Prints "1524155677489: true"

    long notSquare = square + 1;
    squareRootChecker = new SquareRootChecker(notSquare);
    System.out.println(notSquare + ": " + squareRootChecker.checkSquareRoot()); // Prints "1524155677490: false"

User · Answer

I checked all of the possible results when the last n bits of a square are observed. By successively examining more bits, up to 5/6th of inputs can be eliminated. I actually designed this to implement Fermat's factorization algorithm, and it is very fast there.

    public static boolean isSquare(final long val) {
       if ((val & 2) == 2 || (val & 7) == 5) {
         return false;
       }
       if ((val & 11) == 8 || (val & 31) == 20) {
         return false;
       }

       if ((val & 47) == 32 || (val & 127) == 80) {
         return false;
       }

       if ((val & 191) == 128 || (val & 511) == 320) {
         return false;
       }

       // if((val & a == b) || (val & c == d){
       //   return false;
       // }

       if (!modSq[(int) (val % modSq.length)]) {
            return false;
       }

       final long root = (long) Math.sqrt(val);
       return root * root == val;
    }

The last bit of pseudocode (the commented-out test) can be used to extend the tests to eliminate more values. The tests above are for k = 0, 1, 2, 3:

- a is of the form (3 << 2k) - 1
- b is of the form (2 << 2k)
- c is of the form (2 << (2k + 2)) - 1
- d is of the form (5 << 2k), which gives the constants 5, 20, 80 and 320 used in the code

It first tests whether the value is a square residue modulo powers of two, then it tests based on a final modulus, then it uses Math.sqrt to do a final test. I came up with the idea from the top post, and attempted to extend upon it. I appreciate any comments or suggestions.

Update: Using the test by a modulus (modSq) and a modulus base of 44352, my test runs in 96% of the time of the one in the OP's update for numbers up to 1,000,000,000.
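The four hard-coded tests follow one pattern per k, so they can be generated in a loop and checked by brute force. In this sketch the constants are written as a = (3 << 2k) - 1, b = 2 << 2k, c = (2 << (2k + 2)) - 1 and d = 5 << 2k, which reproduces the values in the code above for k = 0..3; the class and method names are made up for illustration:

```java
final class LowBitPatternTests {
    /** True when one of the mask tests for k = 0..3 rules val out as a square. */
    static boolean rejected(long val) {
        for (int k = 0; k < 4; k++) {
            long a = (3L << (2 * k)) - 1;     // 2, 11, 47, 191
            long b = 2L << (2 * k);           // 2, 8, 32, 128
            long c = (2L << (2 * k + 2)) - 1; // 7, 31, 127, 511
            long d = 5L << (2 * k);           // 5, 20, 80, 320
            if ((val & a) == b || (val & c) == d) {
                return true;
            }
        }
        return false;
    }

    /** Brute-force sanity check: no square r*r with 0 <= r <= maxRoot is rejected. */
    static boolean anySquareRejected(int maxRoot) {
        for (long r = 0; r <= maxRoot; r++) {
            if (rejected(r * r)) return true;
        }
        return false;
    }
}
```

Both tests for a given k look at val after dividing out 4^k: the first rejects quotients that are 2 or 3 mod 4, the second rejects quotients that are 5 mod 8, and neither can be a square.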

1291  191 61 1065 1605 721 781 1735 875 1377 1827 1353 539 1777 429 1959 1483  1921 643 617 389 1809 947 889 981 1441 483 1143 293 817 749 1383 1675  63 1347 169 827 1199 1421 583 1259 1505 861 457 1125 143 1069 807 1867  2047 2045 279 2043 111 307 2041 597 1569 1891 2039 1957 1103 1389 231 2037  65 1341 727 837 977 2035 569 1643 1633 547 439 1307 2033 1709 345 1845  1919 637 1175 379 2031 333 903 213 1697 797 1161 475 1073 2029 921 1653  193 67 1623 1595 943 1395 1721 2027 1761 1955 1335 357 113 1747 1497 1461  1791 771 2025 1285 145 973 249 171 1825 611 265 1189 847 1427 2023 1269  321 1475 1577 69 1233 755 1223 1685 1889 733 1865 2021 1807 1107 1447 1077  1663 1917 1129 1147 1775 1613 1401 555 1953 2019 631 1243 1329 787 871 885  449 1213 681 1733 687 115 71 1301 2017 675 969 411 369 467 295 693  1535 509 233 517 401 1843 1543 939 2015 669 1527 421 591 147 281 501  577 195 215 699 1489 525 1081 917 1951 2013 73 1253 1551 173 857 309  1407 899 663 1915 1519 1203 391 1323 1887 739 1673 2011 1585 493 1433 117  705 1603 1111 965 431 1165 1863 533 1823 605 823 1179 625 813 2009 75  1279 1789 1559 251 657 563 761 1707 1759 1949 777 347 335 1133 1511 267  833 1085 2007 1467 1745 1805 711 149 1695 803 1719 485 1295 1453 935 459  1151 381 1641 1413 1263 77 1913 2005 1631 541 119 1317 1841 1773 359 651  961 323 1193 197 175 1651 441 235 1567 1885 1481 1947 881 2003 217 843  1023 1027 745 1019 913 717 1031 1621 1503 867 1015 1115 79 1683 793 1035  1089 1731 297 1861 2001 1011 1593 619 1439 477 585 283 1039 1363 1369 1227  895 1661 151 645 1007 1357 121 1237 1375 1821 1911 549 1999 1043 1945 1419  1217 957 599 571 81 371 1351 1003 1311 931 311 1381 1137 723 1575 1611  767 253 1047 1787 1169 1997 1273 853 1247 413 1289 1883 177 403 999 1803  1345 451 1495 1093 1839 269 199 1387 1183 1757 1207 1051 783 83 423 1995  639 1155 1943 123 751 1459 1671 469 1119 995 393 219 1743 237 153 1909  1473 1859 1705 1339 337 909 953 1771 1055 349 1993 613 1393 557 729 1717  511 1533 1257 
1541 1425 819 519 85 991 1693 503 1445 433 877 1305 1525  1601 829 809 325 1583 1549 1991 1941 927 1059 1097 1819 527 1197 1881 1333  383 125 361 891 495 179 633 299 863 285 1399 987 1487 1517 1639 1141  1729 579 87 1989 593 1907 839 1557 799 1629 201 155 1649 1837 1063 949  255 1283 535 773 1681 461 1785 683 735 1123 1801 677 689 1939 487 757  1857 1987 983 443 1327 1267 313 1173 671 221 695 1509 271 1619 89 565  127 1405 1431 1659 239 1101 1159 1067 607 1565 905 1755 1231 1299 665 373  1985 701 1879 1221 849 627 1465 789 543 1187 1591 923 1905 979 1241 181    bool bad255 512     0 0 1 1 0 1 1 1 1 0 1 1 1 1 1 0 0 1 1 0 1 0 1 1 1 0 1 1 1 1 0 1   1 1 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 1 1 1 0 1 1 1 1 0 1 1 1   0 1 0 1 1 0 0 1 1 1 1 1 0 1 1 1 1 0 1 1 0 0 1 1 1 1 1 1 1 1 0 1   1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 0 1 1 1 0 1 1 1 1 0 0 1 1 1 1 1 1   1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 0 0 1 1 1 1 1 0 1 1 0 1 1 1 1 1   1 1 1 1 1 1 0 1 1 0 1 0 1 1 0 1 1 1 1 1 1 1 1 1 1 1 0 1 1 0 1 1   1 1 1 0 0 1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 1   1 0 1 1 1 0 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 1 1 1   0 0 1 1 0 1 1 1 1 0 1 1 1 1 1 0 0 1 1 0 1 0 1 1 1 0 1 1 1 1 0 1   1 1 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 1 1 1 0 1 1 1 1 0 1 1 1   0 1 0 1 1 0 0 1 1 1 1 1 0 1 1 1 1 0 1 1 0 0 1 1 1 1 1 1 1 1 0 1   1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 0 1 1 1 0 1 1 1 1 0 0 1 1 1 1 1 1   1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 0 0 1 1 1 1 1 0 1 1 0 1 1 1 1 1   1 1 1 1 1 1 0 1 1 0 1 0 1 1 0 1 1 1 1 1 1 1 1 1 1 1 0 1 1 0 1 1   1 1 1 0 0 1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 1   1 0 1 1 1 0 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 1 1 1   0 0    inline bool square  int64 x            Quickfail     if  x  lt  0     x 2       x   7     5       x   11     8            return false      if  x    0           return true          Check mod 255   3   5   17  for fun     int64 y   x      y    y   4294967295LL     y  gt  gt  32       y    y   65535     y  gt  gt  16       y    y   255      y  gt  gt  
8    255     y  gt  gt  16       if  bad255 y            return false          Divide out powers of 4 using binary search     if  x   4294967295LL     0          x  gt  gt   32      if  x   65535     0          x  gt  gt   16      if  x   255     0          x  gt  gt   8      if  x   15     0          x  gt  gt   4      if  x   3     0          x  gt  gt   2       if  x   7     1          return false          Compute sqrt using something like Hensel s lemma     int64 r  t  z      r   start  x  gt  gt  3    1023       do           z   x - r   r          if  z    0               return true          if  z  lt  0               return false          t   z    -z           r     z   t   gt  gt  1          if  r  gt   t   gt  gt  1                r   t - r        while  t  lt    1LL  lt  lt  33          return false
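For readers who want to see where a table like bad255 could come from, here is a minimal Java sketch. It is not the answer's code: the class name Mod255Filter and its methods are hypothetical, and the table is generated at class-load time instead of being written out. It folds a non-negative long into the range 0..511 with the same shifts and masks, and marks every index whose residue mod 255 is not a square.

```java
final class Mod255Filter {
    // BAD[i] is true when i (taken mod 255) is not the residue of any square.
    static final boolean[] BAD = new boolean[512];

    static {
        java.util.Arrays.fill(BAD, true);
        for (int r = 0; r < 255; r++) {
            int sq = (r * r) % 255;
            // The folded value can be sq, sq + 255 or sq + 510.
            for (int i = sq; i < 512; i += 255) {
                BAD[i] = false;
            }
        }
    }

    // Folds a non-negative long into [0, 511] without changing it mod 255.
    // This works because 2^8, 2^16 and 2^32 are all congruent to 1 mod 255.
    static int fold(long x) {
        long y = x;
        y = (y & 0xFFFFFFFFL) + (y >> 32);
        y = (y & 0xFFFF) + (y >> 16);
        y = (y & 0xFF) + ((y >> 8) & 0xFF) + (y >> 16);
        return (int) y;
    }

    // False means x is certainly not a square; true means it still might be.
    static boolean mayBeSquare(long x) {
        return x >= 0 && !BAD[fold(x)];
    }

    // Number of residues in 0..254 that are squares mod 255.
    static int countSquareResidues() {
        int count = 0;
        for (int i = 0; i < 255; i++) {
            if (!BAD[i]) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("square residues mod 255: " + countSquareResidues());
        System.out.println("mayBeSquare(2) = " + mayBeSquare(2));
    }
}
```

A filter like this only rejects; any value that passes still needs an exact square-root check afterwards.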

User · Answer

I'm not sure if it would be faster, or even accurate, but you could use John Carmack's Magical Square Root algorithm to solve the square root faster. You could probably easily test this for all possible 32 bit integers, and validate that you actually got correct results, as it's only an approximation. However, now that I think about it, using doubles is approximating also, so I'm not sure how that would come into play.

User · Answer

An integer problem deserves an integer solution. Thus:

Do binary search on the (non-negative) integers to find the greatest integer t such that t^2 <= n. Then test whether t^2 = n exactly. This takes time O(log n).

If you don't know how to binary search the positive integers because the set is unbounded, it's easy. You start by computing your increasing function f (above, f(t) = t^2 - n) on powers of two. When you see it turn positive, you've found an upper bound. Then you can do standard binary search.
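The two phases described above (doubling to find an upper bound, then an ordinary binary search) can be sketched in Java as follows. This is an illustration under assumptions, not code from the answer: the class name UnboundedBinarySearch and the constant MAX_ROOT are hypothetical, and the cap at floor(sqrt(Long.MAX_VALUE)) is added here only to keep t * t from overflowing a signed 64-bit long.

```java
final class UnboundedBinarySearch {
    // floor(sqrt(Long.MAX_VALUE)); squaring anything larger overflows a long.
    static final long MAX_ROOT = 3037000499L;

    static boolean isPerfectSquare(long n) {
        if (n < 0) return false;

        // Phase 1: evaluate f(t) = t*t - n on powers of two until it turns
        // positive (or until the overflow cap is reached).
        long hi = 1;
        while (hi < MAX_ROOT && hi * hi <= n) {
            hi = Math.min(hi * 2, MAX_ROOT);
        }

        // Phase 2: standard binary search for the greatest t with t*t <= n.
        long lo = 0; // invariant: lo*lo <= n
        while (lo < hi) {
            long mid = lo + (hi - lo + 1) / 2;
            if (mid * mid <= n) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return lo * lo == n;
    }

    public static void main(String[] args) {
        System.out.println(isPerfectSquare(144)); // a square
        System.out.println(isPerfectSquare(145)); // not a square
    }
}
```

Both phases take a number of steps proportional to the bit length of n, which is where the O(log n) bound comes from.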

User · Answer

You'll have to do some benchmarking. The best algorithm will depend on the distribution of your inputs.

Your algorithm may be nearly optimal, but you might want to do a quick check to rule out some possibilities before calling your square root routine. For example, look at the last digit of your number in hex by doing a bit-wise "and." Perfect squares can only end in 0, 1, 4, or 9 in base 16. So for 75% of your inputs (assuming they are uniformly distributed) you can avoid a call to the square root in exchange for some very fast bit twiddling.

Kip benchmarked the following code implementing the hex trick. When testing numbers 1 through 100,000,000, this code ran twice as fast as the original.

    public final static boolean isPerfectSquare(long n)
    {
        if (n < 0)
            return false;

        switch((int)(n & 0xF))
        {
        case 0: case 1: case 4: case 9:
            long tst = (long)Math.sqrt(n);
            return tst*tst == n;

        default:
            return false;
        }
    }

When I tested the analogous code in C++, it actually ran slower than the original. However, when I eliminated the switch statement, the hex trick once again made the code twice as fast.

    int isPerfectSquare(int n)
    {
        int h = n & 0xF;  // h is the last hex "digit"
        if (h > 9)
            return 0;
        // Use lazy evaluation to jump out of the if statement as soon as possible
        if (h != 2 && h != 3 && h != 5 && h != 6 && h != 7 && h != 8)
        {
            int t = (int) floor( sqrt((double) n) + 0.5 );
            return t*t == n;
        }
        return 0;
    }

Eliminating the switch statement had little effect on the C# code.

User · Answer

The sqrt call is not perfectly accurate, as has been mentioned, but it's interesting and instructive that it doesn't blow away the other answers in terms of speed. After all, the sequence of assembly language instructions for a sqrt is tiny. Intel has a hardware instruction, which isn't used by Java I believe because it doesn't conform to IEEE.

So why is it slow? Because Java is actually calling a C routine through JNI, and it's actually slower to do so than to call a Java subroutine, which itself is slower than doing it inline. This is very annoying, and Java should have come up with a better solution, i.e. building in floating point library calls if necessary. Oh well.

In C++, I suspect all the complex alternatives would lose on speed, but I haven't checked them all. What I did, and what Java people will find useful, is a simple hack, an extension of the special case testing suggested by A. Rex. Use a single long value as a bit array, which isn't bounds checked. That way, you have 64 bit boolean lookup.

    typedef unsigned long long UVLONG;
    UVLONG pp1, pp2;

    void init2() {
      for (int i = 0; i < 64; i++) {
        for (int j = 0; j < 64; j++)
          if (isPerfectSquare(i * 64 + j)) {
            // 1ULL rather than 1: shifting a plain int by 32 or more is undefined
            pp1 |= (1ULL << j);
            pp2 |= (1ULL << i);
            break;
          }
      }
      cout << "pp1=" << pp1 << "," << pp2 << "\n";
    }

    inline bool isPerfectSquare5(UVLONG x) {
      // Test the bit of pp1 selected by the low 6 bits of x
      return pp1 & (1ULL << (x & 0x3F)) ? isPerfectSquare(x) : false;
    }

The routine isPerfectSquare5 runs in about 1/3 the time on my core2 duo machine. I suspect that further tweaks along the same lines could reduce the time further on average, but every time you check, you are trading off more testing for more eliminating, so you can't go too much farther on that road.

Certainly, rather than having a separate test for negative, you could check the high 6 bits the same way.

Note that all I'm doing is eliminating possible squares, but when I have a potential case I have to call the original, inlined isPerfectSquare.

The init2 routine is called once to initialize the static values of pp1 and pp2. Note that in my implementation in C++, I'm using unsigned long long, so since you're signed, you'd have to use the >>> operator.

There is no intrinsic need to bounds check the array, but Java's optimizer has to figure this stuff out pretty quickly, so I don't blame them for that.

User · Answer

I ran my own analysis of several of the algorithms in this thread and came up with some new results. You can see those old results in the edit history of this answer, but they're not accurate, as I made a mistake, and wasted time analyzing several algorithms which aren't close. However, pulling lessons from several different answers, I now have two algorithms that crush the "winner" of this thread. Here's the core thing I do differently than everyone else:

    // This is faster because a number is divisible by 2^4 or more only 6% of the time
    // and more than that a vanishingly small percentage.
    while((x & 0x3) == 0) x >>= 2;
    // This is effectively the same as the switch-case statement used in the original
    // answer.
    if((x & 0x7) != 1) return false;

This simple line, which most of the time adds one or two very fast instructions, greatly simplifies the switch-case statement into one if statement. However, it can add to the runtime if many of the tested numbers have significant power-of-two factors.

The algorithms below are as follows:

- Internet - Kip's posted answer
- Durron - My modified answer using the one-pass answer as a base
- DurronTwo - My modified answer using the two-pass answer (by @JohnnyHeggheim), with some other slight modifications.

Here is a sample runtime if the numbers are generated using Math.abs(java.util.Random.nextLong()):

     0% Scenario{vm=java, trial=0, benchmark=Internet} 39673.40 ns; σ=378.78 ns @ 3 trials
    33% Scenario{vm=java, trial=0, benchmark=Durron} 37785.75 ns; σ=478.86 ns @ 10 trials
    67% Scenario{vm=java, trial=0, benchmark=DurronTwo} 35978.10 ns; σ=734.10 ns @ 10 trials

    benchmark   us
     Internet 39.7
       Durron 37.8
    DurronTwo 36.0

    vm: java
    trial: 0

And here is a sample runtime if it's run on the first million longs only:

     0% Scenario{vm=java, trial=0, benchmark=Internet} 2933380.84 ns; σ=56939.84 ns @ 10 trials
    33% Scenario{vm=java, trial=0, benchmark=Durron} 2243266.81 ns; σ=50537.62 ns @ 10 trials
    67% Scenario{vm=java, trial=0, benchmark=DurronTwo} 3159227.68 ns; σ=10766.22 ns @ 3 trials

    benchmark   ms
     Internet 2.93
       Durron 2.24
    DurronTwo 3.16

    vm: java
    trial: 0

As you can see, DurronTwo does better for large inputs, because it gets to use the magic trick very often, but on the first million longs it gets clobbered by the first algorithm and Math.sqrt because the numbers are so much smaller. Meanwhile, the simpler Durron is a huge winner because it never has to divide by 4 many times in the first million numbers.

Here's Durron:

    public final static boolean isPerfectSquareDurron(long n) {
        if(n < 0) return false;
        if(n == 0) return true;

        long x = n;
        // This is faster because a number is divisible by 16 only 6% of the time
        // and more than that a vanishingly small percentage.
        while((x & 0x3) == 0) x >>= 2;
        // This is effectively the same as the switch-case statement used in the original
        // answer.
        if((x & 0x7) == 1) {

            long sqrt;
            if(x < 410881L)
            {
                int i;
                float x2, y;

                x2 = x * 0.5F;
                y  = x;
                i  = Float.floatToRawIntBits(y);
                i  = 0x5f3759df - ( i >> 1 );
                y  = Float.intBitsToFloat(i);
                y  = y * ( 1.5F - ( x2 * y * y ) );

                sqrt = (long)(1.0F/y);
            } else {
                sqrt = (long) Math.sqrt(x);
            }
            return sqrt*sqrt == x;
        }
        return false;
    }

And DurronTwo:

    public final static boolean isPerfectSquareDurronTwo(long n) {
        if(n < 0) return false;
        // Needed to prevent infinite loop
        if(n == 0) return true;

        long x = n;
        while((x & 0x3) == 0) x >>= 2;
        if((x & 0x7) == 1) {
            long sqrt;
            if (x < 41529141369L) {
                int i;
                float x2, y;

                x2 = x * 0.5F;
                y = x;
                i = Float.floatToRawIntBits(y);
                // using the magic number from
                // http://www.lomont.org/Math/Papers/2003/InvSqrt.pdf
                // since it is more accurate
                i = 0x5f375a86 - (i >> 1);
                y = Float.intBitsToFloat(i);
                y = y * (1.5F - (x2 * y * y));
                y = y * (1.5F - (x2 * y * y)); // Newton iteration, more accurate
                sqrt = (long) ((1.0F/y) + 0.2);
            } else {
                // Carmack hack gives incorrect answer for n >= 41529141369.
                sqrt = (long) Math.sqrt(x);
            }
            return sqrt*sqrt == x;
        }
        return false;
    }

And my benchmark harness (requires Google caliper 0.1-rc5):

    public class SquareRootBenchmark {
        public static class Benchmark1 extends SimpleBenchmark {
            private static final int ARRAY_SIZE = 10000;
            long[] trials = new long[ARRAY_SIZE];

            @Override
            protected void setUp() throws Exception {
                Random r = new Random();
                for (int i = 0; i < ARRAY_SIZE; i++) {
                    trials[i] = Math.abs(r.nextLong());
                }
            }

            public int timeInternet(int reps) {
                int trues = 0;
                for(int i = 0; i < reps; i++) {
                    for(int j = 0; j < ARRAY_SIZE; j++) {
                        if(SquareRootAlgs.isPerfectSquareInternet(trials[j])) trues++;
                    }
                }

                return trues;
            }

            public int timeDurron(int reps) {
                int trues = 0;
                for(int i = 0; i < reps; i++) {
                    for(int j = 0; j < ARRAY_SIZE; j++) {
                        if(SquareRootAlgs.isPerfectSquareDurron(trials[j])) trues++;
                    }
                }

                return trues;
            }

            public int timeDurronTwo(int reps) {
                int trues = 0;
                for(int i = 0; i < reps; i++) {
                    for(int j = 0; j < ARRAY_SIZE; j++) {
                        if(SquareRootAlgs.isPerfectSquareDurronTwo(trials[j])) trues++;
                    }
                }

                return trues;
            }
        }

        public static void main(String... args) {
            Runner.main(Benchmark1.class, args);
        }
    }

UPDATE: I've made a new algorithm that is faster in some scenarios and slower in others; I've gotten different benchmarks based on different inputs. If we calculate modulo 0xFFFFFF = 3 x 3 x 5 x 7 x 13 x 17 x 241, we can eliminate 97.82% of numbers that cannot be squares. This can be (sort of) done in one line, with 5 bitwise operations:

    if (!goodLookupSquares[(int) ((n & 0xFFFFFFL) + ((n >> 24) & 0xFFFFFFL) + (n >> 48))]) return false;

The resulting index is either 1) the residue, 2) the residue + 0xFFFFFF, or 3) the residue + 0x1FFFFFE. Of course, we need to have a lookup table for residues modulo 0xFFFFFF, which is about a 3 MB file (in this case stored as ASCII text decimal numbers, not optimal but clearly improvable with a ByteBuffer and so forth). But since that is precalculation it doesn't matter so much. You can find the file here (or generate it yourself):

    public final static boolean isPerfectSquareDurronThree(long n) {
        if(n < 0) return false;
        if(n == 0) return true;

        long x = n;
        while((x & 0x3) == 0) x >>= 2;
        if((x & 0x7) == 1) {
            if (!goodLookupSquares[(int) ((n & 0xFFFFFFL) + ((n >> 24) & 0xFFFFFFL) + (n >> 48))]) return false;
            long sqrt;
            if(x < 410881L)
            {
                int i;
                float x2, y;

                x2 = x * 0.5F;
                y  = x;
                i  = Float.floatToRawIntBits(y);
                i  = 0x5f3759df - ( i >> 1 );
                y  = Float.intBitsToFloat(i);
                y  = y * ( 1.5F - ( x2 * y * y ) );

                sqrt = (long)(1.0F/y);
            } else {
                sqrt = (long) Math.sqrt(x);
            }
            return sqrt*sqrt == x;
        }
        return false;
    }

I load it into a boolean array like this:

    private static boolean[] goodLookupSquares = null;

    public static void initGoodLookupSquares() throws Exception {
        Scanner s = new Scanner(new File("24residues_squares.txt"));

        // Fix: the largest index the one-line lookup can produce for a non-negative
        // long is 0xFFFFFF + 0xFFFFFF + 0x7FFF, so the array must be this long.
        // (With the original size of 0x1FFFFFE the third store below is out of bounds.)
        goodLookupSquares = new boolean[0x1FFFFFE + 0x8000];

        while(s.hasNextLine()) {
            int residue = Integer.valueOf(s.nextLine());
            goodLookupSquares[residue] = true;
            goodLookupSquares[residue + 0xFFFFFF] = true;
            // Only residues up to 0x7FFF can occur with the double offset.
            if (residue + 0x1FFFFFE < goodLookupSquares.length)
                goodLookupSquares[residue + 0x1FFFFFE] = true;
        }

        s.close();
    }

Example runtime. It beat Durron (version one) in every trial I ran.

     0% Scenario{vm=java, trial=0, benchmark=Internet} 40665.77 ns; σ=566.71 ns @ 10 trials
    33% Scenario{vm=java, trial=0, benchmark=Durron} 38397.60 ns; σ=784.30 ns @ 10 trials
    67% Scenario{vm=java, trial=0, benchmark=DurronThree} 36171.46 ns; σ=693.02 ns @ 10 trials

      benchmark   us
       Internet 40.7
         Durron 38.4
    DurronThree 36.2

    vm: java
    trial: 0
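To make the one-line residue computation easier to check, here is a small Java sketch of just the folding step. The class name Mod24Fold and the method foldIndex are hypothetical and not part of the answer. The fold relies on 2^24 being congruent to 1 modulo 0xFFFFFF, so adding the three 24-bit chunks of n leaves the residue unchanged.

```java
final class Mod24Fold {
    static final long MASK = 0xFFFFFFL; // 2^24 - 1

    // Adds the three 24-bit chunks of a non-negative long. The result is
    // congruent to n modulo 0xFFFFFF but is not fully reduced.
    static int foldIndex(long n) {
        return (int) ((n & MASK) + ((n >> 24) & MASK) + (n >> 48));
    }

    public static void main(String[] args) {
        long n = 123456789012345678L;
        System.out.println(foldIndex(n) % 0xFFFFFF == n % 0xFFFFFF);
        // Largest index any non-negative long can produce.
        System.out.println(foldIndex(Long.MAX_VALUE));
    }
}
```

For Long.MAX_VALUE the index reaches 0xFFFFFF + 0xFFFFFF + 0x7FFF, so a table indexed this way needs at least 0x1FFFFFE + 0x8000 entries, or one more reduction step before the lookup.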

User · Answer

Newton's Method with integer arithmetic

If you wish to avoid non-integer operations you could use the method below. It basically uses Newton's Method modified for integer arithmetic.

    /**
     * Test if the given number is a perfect square.
     * @param n Must be greater than 0 and less
     *    than Long.MAX_VALUE.
     * @return <code>true</code> if n is a perfect
     *    square, or <code>false</code> otherwise.
     */
    public static boolean isSquare(long n)
    {
        long x1 = n;
        long x2 = 1L;

        while (x1 > x2)
        {
            x1 = (x1 + x2) / 2L;
            x2 = n / x1;
        }

        return x1 == x2 && n % x1 == 0L;
    }

This implementation cannot compete with solutions that use Math.sqrt. However, its performance can be improved by using the filtering mechanisms described in some of the other posts.

User · Answer

I'm pretty late to the party, but I hope to provide a better answer; shorter and (assuming my benchmark is correct) also much faster.

    long goodMask; // 0xC840C04048404040 computed below
    {
        for (int i=0; i<64; ++i) goodMask |= Long.MIN_VALUE >>> (i*i);
    }

    public boolean isSquare(long x) {
        // This tests if the 6 least significant bits are right.
        // Moving the to be tested bit to the highest position saves us masking.
        if (goodMask << x >= 0) return false;
        final int numberOfTrailingZeros = Long.numberOfTrailingZeros(x);
        // Each square ends with an even number of zeros.
        if ((numberOfTrailingZeros & 1) != 0) return false;
        x >>= numberOfTrailingZeros;
        // Now x is either 0 or odd.
        // In binary each odd square ends with 001.
        // Postpone the sign test until now; handle zero in the branch.
        if ((x&7) != 1 | x <= 0) return x == 0;
        // Do it in the classical way.
        // The correctness is not trivial as the conversion from long to double is lossy!
        final long tst = (long) Math.sqrt(x);
        return tst * tst == x;
    }

The first test catches most non-squares quickly. It uses a 64-item table packed in a long, so there's no array access cost (indirection and bounds checks). For a uniformly random long, there's an 81.25% probability of ending here.

The second test catches all numbers having an odd number of twos in their factorization. The method Long.numberOfTrailingZeros is very fast as it gets JIT-ed into a single i86 instruction.

After dropping the trailing zeros, the third test handles numbers ending with 011, 101, or 111 in binary, which are not perfect squares. It also takes care of negative numbers and handles 0.

The final test falls back to double arithmetic. As double has only a 53-bit mantissa, the conversion from long to double includes rounding for big values. Nonetheless, the test is correct (unless the proof is wrong).

Trying to incorporate the mod255 idea wasn't successful.
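The mask constant and the 81.25% figure can be checked with a short Java sketch. The class name SquareMask64 and its methods are hypothetical helpers written for this illustration; only the loop that builds the mask is taken from the answer.

```java
final class SquareMask64 {
    // Bit (63 - k) is set when k is the low 6 bits of some square.
    static final long GOOD_MASK = buildGoodMask();

    static long buildGoodMask() {
        long mask = 0L;
        for (int i = 0; i < 64; i++) {
            // A long shift uses only the low 6 bits of its count,
            // so this shifts by (i * i) mod 64.
            mask |= Long.MIN_VALUE >>> (i * i);
        }
        return mask;
    }

    // True when the low 6 bits of x match those of some square.
    // Shifting left by (x mod 64) moves the relevant bit into the sign position.
    static boolean passesLow6BitTest(long x) {
        return (GOOD_MASK << x) < 0;
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(GOOD_MASK));
        // Number of the 64 possible low-6-bit patterns that survive the test.
        System.out.println(Long.bitCount(GOOD_MASK));
    }
}
```

Twelve of the 64 bit patterns survive, so 52/64 = 81.25% of uniformly random inputs are rejected by this first test.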

User · Answer

If you do a binary chop to try to find the "right" square root, you can fairly easily detect if the value you've got is close enough to tell:

    (n+1)^2 = n^2 + 2n + 1
    (n-1)^2 = n^2 - 2n + 1

So having calculated n^2, the options are:

- n^2 = target: done, return true
- n^2 + 2n + 1 > target > n^2: you're close, but it's not perfect: return false
- n^2 - 2n + 1 < target < n^2: ditto
- target < n^2 - 2n + 1: binary chop on a lower n
- target > n^2 + 2n + 1: binary chop on a higher n

(Sorry, this uses n as your current guess, and target for the parameter. Apologies for the confusion!)

I don't know whether this will be faster or not, but it's worth a try.

EDIT: The binary chop doesn't have to take in the whole range of integers, either: (2^x)^2 = 2^(2x), so once you've found the top set bit in your target (which can be done with a bit-twiddling trick; I forget exactly how) you can quickly get a range of potential answers. Mind you, a naive binary chop is still only going to take up to 31 or 32 iterations.
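A possible Java rendering of this idea is sketched below. It is an illustration under assumptions, not the answerer's code: the class name BinaryChop and the constant MAX_ROOT are hypothetical, Long.numberOfLeadingZeros stands in for the "bit-twiddling trick" that finds the top set bit, and the cap on the upper bound is added only so that n * n cannot overflow.

```java
final class BinaryChop {
    // floor(sqrt(Long.MAX_VALUE)); squaring anything larger overflows a long.
    static final long MAX_ROOT = 3037000499L;

    static boolean isPerfectSquare(long target) {
        if (target < 0) return false;
        if (target < 2) return true; // 0 and 1

        // If target has b bits, its root lies in [2^k, 2^(k+1) - 1], k = (b-1)/2.
        int bits = 64 - Long.numberOfLeadingZeros(target);
        int k = (bits - 1) / 2;
        long lo = 1L << k;
        long hi = Math.min((1L << (k + 1)) - 1, MAX_ROOT);

        while (lo <= hi) {
            long n = lo + (hi - lo) / 2; // current guess
            long sq = n * n;
            if (sq == target) return true;
            if (sq < target) {
                // n^2 < target < (n+1)^2: close, but not a perfect square.
                if (target - sq < 2 * n + 1) return false;
                lo = n + 1;
            } else {
                // (n-1)^2 < target < n^2: same conclusion.
                if (sq - target < 2 * n - 1) return false;
                hi = n - 1;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isPerfectSquare(1_000_000L)); // 1000^2
        System.out.println(isPerfectSquare(1_000_001L));
    }
}
```

The near-miss checks let the search stop as soon as a guess lands next to the true root, and the starting range already halves the number of bits to search.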

User · Answer

Project Euler is mentioned in the tags and many of the problems in it require checking numbers >> 2^64. Most of the optimizations mentioned above don't work easily when you are working with an 80 byte buffer.

I used java BigInteger and a slightly modified version of Newton's method, one that works better with integers. The problem was that exact squares n^2 converged to (n-1) instead of n because n^2-1 = (n-1)(n+1) and the final error was just one step below the final divisor and the algorithm terminated. It was easy to fix by adding one to the original argument before computing the error. (Add two for cube roots, etc.)

One nice attribute of this algorithm is that you can immediately tell if the number is a perfect square: the final error (not correction) in Newton's method will be zero. A simple modification also lets you quickly calculate floor(sqrt(x)) instead of the closest integer. This is handy with several Euler problems.
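Since the answer gives no code, here is one way an integer Newton iteration on BigInteger might look. This is the textbook variant that starts above the root and stops when the iterate no longer decreases; it is not necessarily the modification described above, and the class name BigSquares and its methods are hypothetical.

```java
import java.math.BigInteger;

final class BigSquares {
    // Returns floor(sqrt(n)) for n >= 0 using Newton's method on integers.
    static BigInteger floorSqrt(BigInteger n) {
        if (n.signum() < 0) {
            throw new ArithmeticException("square root of a negative number");
        }
        if (n.signum() == 0) {
            return BigInteger.ZERO;
        }
        // Start from a power of two that is at least sqrt(n).
        BigInteger x = BigInteger.ONE.shiftLeft((n.bitLength() + 1) / 2);
        while (true) {
            // Newton step: y = (x + n / x) / 2, using integer division.
            BigInteger y = x.add(n.divide(x)).shiftRight(1);
            if (y.compareTo(x) >= 0) {
                return x; // the iterates stopped decreasing
            }
            x = y;
        }
    }

    // n is a perfect square exactly when the final error n - r^2 is zero.
    static boolean isPerfectSquare(BigInteger n) {
        if (n.signum() < 0) return false;
        BigInteger r = floorSqrt(n);
        return r.multiply(r).equals(n);
    }

    public static void main(String[] args) {
        BigInteger big = BigInteger.TEN.pow(40);
        System.out.println(isPerfectSquare(big));
        System.out.println(isPerfectSquare(big.add(BigInteger.ONE)));
    }
}
```

Because the iteration yields floor(sqrt(n)) directly, the perfect-square test reduces to one multiplication and a comparison.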

User · Answer

Square root of a number, given that the number is a perfect square.

The loop runs about sqrt(n) times, so the complexity is O(sqrt(n)), not O(log n).

    /**
     * Calculate the square root if the given number is a perfect square.
     *
     * Approach: the sum of the first n odd numbers equals n*n, so counting
     * how many odd numbers are needed to reach the input gives its square
     * root, given that the input is a perfect square.
     *
     * @param number
     * @return squareRoot
     */
    public static int calculateSquareRoot(int number) {
        int sum = 1;
        int count = 1;
        int squareRoot = 1;
        while (sum < number) {
            count += 2;
            sum += count;
            squareRoot++;
        }
        return squareRoot;
    }

User · Answer

I figured out a method that works  35  faster than your 6bits Carmack sqrt code  at least with my CPU  x86  and programming language  C C      Your results may vary  especially because I don t know how the Java factor will play out   My approach is threefold    First  filter out obvious answers   This includes negative numbers and looking at the last 4 bits    I found looking at the last six didn t help    I also answer yes for 0    In reading the code below  note that my input is int64 x    if  x  lt  0     x 2       x   7     5       x   11     8        return false  if  x    0       return true   Next  check if it s a square modulo 255   3   5   17   Because that s a product of three distinct primes  only about 1 8 of the residues mod 255 are squares   However  in my experience  calling the modulo operator     costs more than the benefit one gets  so I use bit tricks involving 255   2 8-1 to compute the residue    For better or worse  I am not using the trick of reading individual bytes out of a word  only bitwise-and and shifts   int64 y   x  y    y   4294967295LL     y  gt  gt  32    y    y   65535     y  gt  gt  16   y    y   255      y  gt  gt  8    255     y  gt  gt  16      At this point  y is between 0 and 511   More code can reduce it farther   To actually check if the residue is a square  I look up the answer in a precomputed table  if  bad255 y        return false     However  I just use a table of size 512   Finally  try to compute the square root using a method similar to Hensel s lemma    I don t think it s applicable directly  but it works with some modifications    Before doing that  I divide out all powers of 2 with a binary search  if  x   4294967295LL     0      x  gt  gt   32  if  x   65535     0      x  gt  gt   16  if  x   255     0      x  gt  gt   8  if  x   15     0      x  gt  gt   4  if  x   3     0      x  gt  gt   2  At this point  for our number to be a square  it must be 1 mod 8  if  x   7     1      return false  The basic 
structure of Hensel s lemma is the following    Note  untested code  if it doesn t work  try t 2 or 8   int64 t   4  r   1  t  lt  lt   1  r      x - r   r    t   gt  gt  1  t  lt  lt   1  r      x - r   r    t   gt  gt  1  t  lt  lt   1  r      x - r   r    t   gt  gt  1     Repeat until t is 2 33 or so   Use a loop if you want  The idea is that at each iteration  you add one bit onto r  the  current  square root of x  each square root is accurate modulo a larger and larger power of 2  namely t 2   At the end  r and t 2-r will be square roots of x modulo t 2    Note that if r is a square root of x  then so is -r   This is true even modulo numbers  but beware  modulo some numbers  things can have even more than 2 square roots  notably  this includes powers of 2    Because our actual square root is less than 2 32  at that point we can actually just check if r or t 2-r are real square roots   In my actual code  I use the following modified loop  int64 r  t  z  r   start  x  gt  gt  3    1023   do       z   x - r   r      if  z    0           return true      if  z  lt  0           return false      t   z    -z       r     z   t   gt  gt  1      if  r  gt   t  gt  gt  1            r   t - r    while  t  lt    1LL  lt  lt  33     The speedup here is obtained in three ways  precomputed start value  equivalent to  10 iterations of the loop   earlier exit of the loop  and skipping some t values   For the last part  I look at z   r - x   x  and set t to be the largest power of 2 dividing z with a bit trick   This allows me to skip t values that wouldn t have affected the value of r anyway   The precomputed start value in my case picks out the  smallest positive  square root modulo 8192     Even if this code doesn t work faster for you  I hope you enjoy some of the ideas it contains   Complete  tested code follows  including the precomputed tables   typedef signed long long int int64   int start 1024     1 3 1769 5 1937 1741 7 1451 479 157 9 91 945 659 1817 11  1983 707 
1321 1211 1071 13 1479 405 415 1501 1609 741 15 339 1703 203  129 1411 873 1669 17 1715 1145 1835 351 1251 887 1573 975 19 1127 395  1855 1981 425 453 1105 653 327 21 287 93 713 1691 1935 301 551 587  257 1277 23 763 1903 1075 1799 1877 223 1437 1783 859 1201 621 25 779  1727 573 471 1979 815 1293 825 363 159 1315 183 27 241 941 601 971  385 131 919 901 273 435 647 1493 95 29 1417 805 719 1261 1177 1163  1599 835 1367 315 1361 1933 1977 747 31 1373 1079 1637 1679 1581 1753 1355  513 1539 1815 1531 1647 205 505 1109 33 1379 521 1627 1457 1901 1767 1547  1471 1853 1833 1349 559 1523 967 1131 97 35 1975 795 497 1875 1191 1739  641 1149 1385 133 529 845 1657 725 161 1309 375 37 463 1555 615 1931  1343 445 937 1083 1617 883 185 1515 225 1443 1225 869 1423 1235 39 1973  769 259 489 1797 1391 1485 1287 341 289 99 1271 1701 1713 915 537 1781  1215 963 41 581 303 243 1337 1899 353 1245 329 1563 753 595 1113 1589  897 1667 407 635 785 1971 135 43 417 1507 1929 731 207 275 1689 1397  1087 1725 855 1851 1873 397 1607 1813 481 163 567 101 1167 45 1831 1205  1025 1021 1303 1029 1135 1331 1017 427 545 1181 1033 933 1969 365 1255 1013  959 317 1751 187 47 1037 455 1429 609 1571 1463 1765 1009 685 679 821  1153 387 1897 1403 1041 691 1927 811 673 227 137 1499 49 1005 103 629  831 1091 1449 1477 1967 1677 697 1045 737 1117 1737 667 911 1325 473 437  1281 1795 1001 261 879 51 775 1195 801 1635 759 165 1871 1645 1049 245  703 1597 553 955 209 1779 1849 661 865 291 841 997 1265 1965 1625 53  1409 893 105 1925 1297 589 377 1579 929 1053 1655 1829 305 1811 1895 139  575 189 343 709 1711 1139 1095 277 993 1699 55 1435 655 1491 1319 331  1537 515 791 507 623 1229 1529 1963 1057 355 1545 603 1615 1171 743 523  447 1219 1239 1723 465 499 57 107 1121 989 951 229 1521 851 167 715  1665 1923 1687 1157 1553 1869 1415 1749 1185 1763 649 1061 561 531 409 907  319 1469 1961 59 1455 141 1209 491 1249 419 1847 1893 399 211 985 1099  1793 765 1513 1275 367 1587 263 1365 1313 925 247 1371 1359 109 1561 
1291  191 61 1065 1605 721 781 1735 875 1377 1827 1353 539 1777 429 1959 1483  1921 643 617 389 1809 947 889 981 1441 483 1143 293 817 749 1383 1675  63 1347 169 827 1199 1421 583 1259 1505 861 457 1125 143 1069 807 1867  2047 2045 279 2043 111 307 2041 597 1569 1891 2039 1957 1103 1389 231 2037  65 1341 727 837 977 2035 569 1643 1633 547 439 1307 2033 1709 345 1845  1919 637 1175 379 2031 333 903 213 1697 797 1161 475 1073 2029 921 1653  193 67 1623 1595 943 1395 1721 2027 1761 1955 1335 357 113 1747 1497 1461  1791 771 2025 1285 145 973 249 171 1825 611 265 1189 847 1427 2023 1269  321 1475 1577 69 1233 755 1223 1685 1889 733 1865 2021 1807 1107 1447 1077  1663 1917 1129 1147 1775 1613 1401 555 1953 2019 631 1243 1329 787 871 885  449 1213 681 1733 687 115 71 1301 2017 675 969 411 369 467 295 693  1535 509 233 517 401 1843 1543 939 2015 669 1527 421 591 147 281 501  577 195 215 699 1489 525 1081 917 1951 2013 73 1253 1551 173 857 309  1407 899 663 1915 1519 1203 391 1323 1887 739 1673 2011 1585 493 1433 117  705 1603 1111 965 431 1165 1863 533 1823 605 823 1179 625 813 2009 75  1279 1789 1559 251 657 563 761 1707 1759 1949 777 347 335 1133 1511 267  833 1085 2007 1467 1745 1805 711 149 1695 803 1719 485 1295 1453 935 459  1151 381 1641 1413 1263 77 1913 2005 1631 541 119 1317 1841 1773 359 651  961 323 1193 197 175 1651 441 235 1567 1885 1481 1947 881 2003 217 843  1023 1027 745 1019 913 717 1031 1621 1503 867 1015 1115 79 1683 793 1035  1089 1731 297 1861 2001 1011 1593 619 1439 477 585 283 1039 1363 1369 1227  895 1661 151 645 1007 1357 121 1237 1375 1821 1911 549 1999 1043 1945 1419  1217 957 599 571 81 371 1351 1003 1311 931 311 1381 1137 723 1575 1611  767 253 1047 1787 1169 1997 1273 853 1247 413 1289 1883 177 403 999 1803  1345 451 1495 1093 1839 269 199 1387 1183 1757 1207 1051 783 83 423 1995  639 1155 1943 123 751 1459 1671 469 1119 995 393 219 1743 237 153 1909  1473 1859 1705 1339 337 909 953 1771 1055 349 1993 613 1393 557 729 1717  511 1533 1257 
1541 1425 819 519 85 991 1693 503 1445 433 877 1305 1525  1601 829 809 325 1583 1549 1991 1941 927 1059 1097 1819 527 1197 1881 1333  383 125 361 891 495 179 633 299 863 285 1399 987 1487 1517 1639 1141  1729 579 87 1989 593 1907 839 1557 799 1629 201 155 1649 1837 1063 949  255 1283 535 773 1681 461 1785 683 735 1123 1801 677 689 1939 487 757  1857 1987 983 443 1327 1267 313 1173 671 221 695 1509 271 1619 89 565  127 1405 1431 1659 239 1101 1159 1067 607 1565 905 1755 1231 1299 665 373  1985 701 1879 1221 849 627 1465 789 543 1187 1591 923 1905 979 1241 181

    bool bad255[512] =
    {0,0,1,1,0,1,1,1,1,0,1,1,1,1,1,0,0,1,1,0,1,0,1,1,1,0,1,1,1,1,0,1,
     1,1,0,1,0,1,1,1,1,1,1,1,1,1,1,1,1,0,1,0,1,1,1,0,1,1,1,1,0,1,1,1,
     0,1,0,1,1,0,0,1,1,1,1,1,0,1,1,1,1,0,1,1,0,0,1,1,1,1,1,1,1,1,0,1,
     1,1,1,1,0,1,1,1,1,1,0,1,1,1,1,0,1,1,1,0,1,1,1,1,0,0,1,1,1,1,1,1,
     1,1,1,1,1,1,1,0,0,1,1,1,1,1,1,1,0,0,1,1,1,1,1,0,1,1,0,1,1,1,1,1,
     1,1,1,1,1,1,0,1,1,0,1,0,1,1,0,1,1,1,1,1,1,1,1,1,1,1,0,1,1,0,1,1,
     1,1,1,0,0,1,1,1,1,1,1,1,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,1,1,1,
     1,0,1,1,1,0,1,1,1,1,0,1,1,1,1,1,0,1,1,1,1,1,0,1,1,1,1,1,1,1,1,
     0,0,1,1,0,1,1,1,1,0,1,1,1,1,1,0,0,1,1,0,1,0,1,1,1,0,1,1,1,1,0,1,
     1,1,0,1,0,1,1,1,1,1,1,1,1,1,1,1,1,0,1,0,1,1,1,0,1,1,1,1,0,1,1,1,
     0,1,0,1,1,0,0,1,1,1,1,1,0,1,1,1,1,0,1,1,0,0,1,1,1,1,1,1,1,1,0,1,
     1,1,1,1,0,1,1,1,1,1,0,1,1,1,1,0,1,1,1,0,1,1,1,1,0,0,1,1,1,1,1,1,
     1,1,1,1,1,1,1,0,0,1,1,1,1,1,1,1,0,0,1,1,1,1,1,0,1,1,0,1,1,1,1,1,
     1,1,1,1,1,1,0,1,1,0,1,0,1,1,0,1,1,1,1,1,1,1,1,1,1,1,0,1,1,0,1,1,
     1,1,1,0,0,1,1,1,1,1,1,1,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,1,1,1,
     1,0,1,1,1,0,1,1,1,1,0,1,1,1,1,1,0,1,1,1,1,1,0,1,1,1,1,1,1,1,1,
     0,0};

    inline bool square( int64 x ) {
        // Quickfail
        if( x < 0 || (x&2) || ((x & 7) == 5) || ((x & 11) == 8) )
            return false;
        if( x == 0 )
            return true;

        // Check mod 255 = 3 * 5 * 17, for fun
        int64 y = x;
        y = (y & 4294967295LL) + (y >> 32);
        y = (y & 65535) + (y >> 16);
        y = (y & 255) + ((y >> 8) & 255) + (y >> 16);
        if( bad255[y] )
            return false;

        // Divide out powers of 4 using binary search
        if((x & 4294967295LL) == 0)
            x >>= 32;
        if((x & 65535) == 0)
            x >>= 16;
        if((x & 255) == 0)
            x >>= 8;
        if((x & 15) == 0)
            x >>= 4;
        if((x & 3) == 0)
            x >>= 2;

        if((x & 7) != 1)
            return false;

        // Compute sqrt using something like Hensel's lemma
        int64 r, t, z;
        r = start[(x >> 3) & 1023];
        do {
            z = x - r * r;
            if( z == 0 )
                return true;
            if( z < 0 )
                return false;
            t = z & (-z);
            r += (z & t) >> 1;
            if( r > (t >> 1) )
                r = t - r;
        } while( t <= (1LL << 33) );

        return false;
    }

User · Answer

If you want speed, given that your integers are of finite size, I suspect that the quickest way would involve partitioning the inputs by size (e.g. into categories by the largest bit set), then checking the value against an array of perfect squares within that range.

User · Answer

I like the idea of using an almost-correct method on some of the input. Here is a version with a higher "offset". The code seems to work and passes my simple test case.

Just replace your

    if (n < 410881L) {...}

code with this one:

    if (n < 11043908100L) {
        // John Carmack hack, converted to Java.
        // See: http://www.codemaestro.com/reviews/9
        int i;
        float x2, y;

        x2 = n * 0.5F;
        y = n;
        i = Float.floatToRawIntBits(y);
        // using the magic number from
        // http://www.lomont.org/Math/Papers/2003/InvSqrt.pdf
        // since it is more accurate
        i = 0x5f375a86 - (i >> 1);
        y = Float.intBitsToFloat(i);
        y = y * (1.5F - (x2 * y * y));
        y = y * (1.5F - (x2 * y * y)); // Newton iteration, more accurate

        sqrt = Math.round(1.0F / y);
    } else {
        // Carmack hack gives incorrect answer for n >= 11043908100.
        sqrt = (long) Math.sqrt(n);
    }

User · Answer

You'll have to do some benchmarking. The best algorithm will depend on the distribution of your inputs.

Your algorithm may be nearly optimal, but you might want to do a quick check to rule out some possibilities before calling your square root routine. For example, look at the last digit of your number in hex by doing a bit-wise "and". Perfect squares can only end in 0, 1, 4, or 9 in base 16. So for 75% of your inputs (assuming they are uniformly distributed) you can avoid a call to the square root in exchange for some very fast bit twiddling.

Kip benchmarked the following code implementing the hex trick. When testing numbers 1 through 100,000,000, this code ran twice as fast as the original.

    public final static boolean isPerfectSquare(long n)
    {
        if (n < 0)
            return false;

        switch((int)(n & 0xF))
        {
        case 0: case 1: case 4: case 9:
            long tst = (long)Math.sqrt(n);
            return tst*tst == n;

        default:
            return false;
        }
    }

When I tested the analogous code in C++, it actually ran slower than the original. However, when I eliminated the switch statement, the hex trick once again made the code twice as fast.

    int isPerfectSquare(int n)
    {
        int h = n & 0xF;  // h is the last hex "digit"
        if (h > 9)
            return 0;
        // Use lazy evaluation to jump out of the if statement as soon as possible
        if (h != 2 && h != 3 && h != 5 && h != 6 && h != 7 && h != 8)
        {
            int t = (int) floor( sqrt((double) n) + 0.5 );
            return t*t == n;
        }
        return 0;
    }

Eliminating the switch statement had little effect on the C# code.
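The claim that squares can only end in 0, 1, 4, or 9 in base 16 is easy to check by brute force, since k*k mod 16 depends only on k mod 16. The following is a minimal sketch; the class name HexResidueSketch, its method names, and the bit-mask constant are illustrative choices, not code from any of the answers.

```java
import java.util.TreeSet;

final class HexResidueSketch {
    /** Collects every value that k*k can take modulo 16. */
    static TreeSet<Integer> squareResiduesMod16() {
        TreeSet<Integer> residues = new TreeSet<>();
        for (int k = 0; k < 16; k++) {
            residues.add((k * k) & 0xF);
        }
        return residues;
    }

    /** Cheap pre-filter: a false result means n is definitely not a perfect square. */
    static boolean mayBeSquare(long n) {
        // Bits 0, 1, 4 and 9 of 0x213 are set, one per possible last hex digit.
        return n >= 0 && ((0x213 >>> (int) (n & 0xF)) & 1) != 0;
    }
}
```

A true result from mayBeSquare only means the number survived the filter; the square root check is still needed afterwards.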

User · Answer

It should be much faster to use Newton's method to calculate the Integer Square Root, then square this number and check, as you do in your current solution. Newton's method is the basis for the Carmack solution mentioned in some other answers. You should be able to get a faster answer since you're only interested in the integer part of the root, allowing you to stop the approximation algorithm sooner.

Another optimization that you can try: if the Digital Root of a number is not 1, 4, 7, or 9, the number is not a perfect square. This can be used as a quick way to eliminate roughly 56% of your inputs (five of the nine possible digital roots) before applying the slower square root algorithm.
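The digital root filter can be sketched in a few lines, because the digital root of a positive number equals 1 + (n - 1) mod 9, so no digit summing is needed. The class and method names below are made up for illustration, and zero is treated as a square separately.

```java
final class DigitalRootSketch {
    /** Digital root of a positive number (repeated digit sum), via 1 + (n - 1) mod 9. */
    static int digitalRoot(long n) {
        return (int) (1 + (n - 1) % 9);
    }

    /** Pre-filter: a false result means n cannot be a perfect square. */
    static boolean mayBeSquare(long n) {
        if (n < 0) return false;
        if (n == 0) return true; // 0 = 0 * 0, handled separately
        int d = digitalRoot(n);
        return d == 1 || d == 4 || d == 7 || d == 9;
    }
}
```

Note that the modulo by 9 is itself a division, so whether this filter pays off compared with the bit-mask tests is something only a benchmark can settle.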

User · Answer

Don't know about fastest, but the simplest is to take the square root in the normal fashion, multiply the result by itself, and see if it matches your original value. Since we're talking integers here, the fastest would probably involve a collection where you can just do a lookup.

User · Answer

This is a rework from decimal to binary of the old Marchant calculator algorithm (sorry, I don't have a reference), in Ruby, adapted specifically for this question:

    def isexactsqrt(v)
        value = v.abs
        residue = value
        root = 0
        onebit = 1
        onebit <<= 8 while (onebit < residue)
        onebit >>= 2 while (onebit > residue)
        while (onebit > 0)
            x = root + onebit
            if (residue >= x) then
                residue -= x
                root = x + onebit
            end
            root >>= 1
            onebit >>= 2
        end
        return (residue == 0)
    end

Here's a workup of something similar (please don't vote me down for coding style/smells or clunky O/O - it's the algorithm that counts, and C++ is not my home language). In this case, we're looking for residue == 0:

    #include <iostream>

    using namespace std;
    typedef unsigned long long int llint;

    class ISqrt {           // Integer Square Root
        llint value;        // Integer whose square root is required
        llint root;         // Result: floor(sqrt(value))
        llint residue;      // Result: value-root*root
        llint onebit, x;    // Working bit, working value

    public:

        ISqrt(llint v = 2) {    // Constructor
            Root(v);            // Take the root
        }

        llint Root(llint r) {   // Resets and calculates new square root
            value = r;          // Store input
            residue = value;    // Initialise for subtracting down
            root = 0;           // Clear root accumulator

            onebit = 1;                         // Calculate start value of counter
            onebit <<= (8*sizeof(llint)-2);     // Set up counter bit as the highest power of 4 that fits
            while (onebit > residue) { onebit >>= 2; }  // Shift down until just < value

            while (onebit > 0) {
                x = root + onebit;          // Will check root+1bit (root bit corresponding to onebit is always zero)
                if (residue >= x) {         // Room to subtract?
                    residue -= x;           // Yes - deduct from residue
                    root = x + onebit;      // and step root
                }
                root >>= 1;
                onebit >>= 2;
            }
            return root;
        }

        llint Residue() {           // Returns residue from last calculation
            return residue;
        }
    };

    int main() {
        llint big, i, q, r, v, delta;
        big = 0; big = (big-1);             // Kludge for "big number"
        ISqrt b;                            // Make q sqrt generator
        for ( i = big; i > 0 ; i /= 7 ) {   // for several numbers
            q = b.Root(i);                  // Get the square root
            r = b.Residue();                // Get the residue
            v = q*q+r;                      // Recalc original value
            delta = v-i;                    // And diff, hopefully 0
            cout << i << ": " << q << " ++ " << r << " V: " << v << " Delta: " << delta << "\n";
        }
        return 0;
    }

User · Answer

Considering general bit lengths (though I have used a specific type here), I tried to design a simple algorithm as below. A simple and obvious check for 0, 1, 2 or < 0 is required initially. The following is simple in the sense that it doesn't try to use any existing maths functions. Most of the operators can be replaced with bit-wise operators. I haven't tested it with any benchmark data though. I'm not an expert at maths or computer algorithm design in particular, so I would love to have you point out problems. I know there is a lot of room for improvement.

    #include <stdio.h>

    int main()
    {
        unsigned int c1 = 0, c2 = 0;
        unsigned int x = 0;
        unsigned int p = 0;
        int k1 = 0;
        scanf("%u", &p);

        /* Start from roughly half of p (rounded up). */
        if (p % 2 == 0) {
            x = p / 2;
        }
        else {
            x = (p / 2) + 1;
        }
        /* Halve x until x*x no longer exceeds p; c1 keeps the last value that was too big. */
        while (x)
        {
            if ((x * x) > p) {
                c1 = x;
                x = x / 2;
            } else {
                c2 = x;
                break;
            }
        }
        if ((p % 2) != 0)
            c2++;

        /* Linear scan between the lower and upper bounds. */
        while (c2 < c1)
        {
            if ((c2 * c2) == p) {
                k1 = 1;
                break;
            }
            c2++;
        }
        if (k1)
            printf("\n Perfect square for %u", c2);
        else
            printf("\n Not perfect but nearest to :%u :", c2);
        return 0;
    }

User · Answer

Here is the simplest and most concise way, although I do not know how it compares in terms of CPU cycles. This works great if you only wish to know if the root is a whole number. If you really care if it is an integer, you can also figure that out. Here is a simple (and pure) function:

    private static final MathContext precision = new MathContext(20);

    private static final Function<Long, Boolean> isRootWhole = (n) -> {
        long digit = n % 10;
        if (digit == 2 || digit == 3 || digit == 7 || digit == 8) {
            return false;
        }
        return new BigDecimal(n).sqrt(precision).scale() == 0;
    };

If you do not need micro-optimization, this answer is better in terms of simplicity and maintainability. If you will be calculating negative numbers, you will need to handle that accordingly, and send the absolute value into the function. I have included a minor optimization because no perfect square has a last decimal digit of 2, 3, 7, or 8 (these are not quadratic residues mod 10).

On my CPU, a run of this algorithm on 0 - 10,000,000 took an average of 1000 - 1100 nanoseconds per calculation. If you are performing a smaller number of calculations, the earlier calculations take a bit longer.

I had a negative comment that my previous edit did not work for large numbers. The OP mentioned Longs, and the largest perfect square that is a Long is 9223372030926249001, so this method works for all Longs.

User · Answer

I was thinking about the horrible times I've spent in my Numerical Analysis course. And then I remembered there was this function circling around the 'net from the Quake source code:

    float Q_rsqrt( float number )
    {
      long i;
      float x2, y;
      const float threehalfs = 1.5F;

      x2 = number * 0.5F;
      y  = number;
      i  = * ( long * ) &y;  // evil floating point bit level hacking
      i  = 0x5f3759df - ( i >> 1 ); // wtf?
      y  = * ( float * ) &i;
      y  = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
      // y  = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed

      #ifndef Q3_VM
      #ifdef __linux__
        assert( !isnan(y) ); // bk010122 - FPE?
      #endif
      #endif
      return y;
    }

This basically calculates the reciprocal of a square root, using Newton's approximation function (can't remember the exact name).

It should be usable and might even be faster; it's from one of the phenomenal id Software games!

It's written in C, but it should not be too hard to reuse the same technique in Java once you get the idea.

I originally found it at: http://www.codemaestro.com/reviews/9

Newton's method explained at Wikipedia: http://en.wikipedia.org/wiki/Newton%27s_method

You can follow the link for more explanation of how it works, but if you don't care much, then this is roughly what I remember from reading the blog and from taking the Numerical Analysis course:

the * ( long * ) &y is basically a fast convert-to-long function so integer operations can be applied on the raw bytes.
the 0x5f3759df - ( i >> 1 ); line is a pre-calculated seed value for the approximation function.
the * ( float * ) &i converts the value back to floating point.
the y = y * ( threehalfs - ( x2 * y * y ) ) line basically iterates the value over the function again.

The approximation function gives more precise values the more you iterate the function over the result. In Quake's case, one iteration is "good enough", but if it isn't for you, then you can add as many iterations as you need.

This should be faster because it reduces the number of division operations done in naive square rooting down to a simple divide by 2 (actually a * 0.5F multiply operation) and replaces it with a small fixed number of multiplication operations instead.

User · Answer

"I want this function to work with all positive 64-bit signed integers"

Math.sqrt() works with doubles as input parameters, so you won't get accurate results for integers bigger than 2^53.
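The 2^53 limit comes from the 53-bit significand of a double: above it, not every long has an exact double representation. A minimal sketch of that conversion loss follows; the class and method names are hypothetical. It only shows that the input to Math.sqrt() may already be rounded, which by itself does not prove that the final tst*tst == n comparison gives a wrong answer.

```java
final class DoublePrecisionSketch {
    /** Returns true if n survives a round trip through double unchanged. */
    static boolean exactAsDouble(long n) {
        return (long) (double) n == n;
    }

    public static void main(String[] args) {
        long limit = 1L << 53;
        System.out.println(exactAsDouble(limit));     // 2^53 is representable
        System.out.println(exactAsDouble(limit + 1)); // 2^53 + 1 is rounded to a neighbour
    }
}
```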

User · Answer

Here is a divide and conquer solution.

If the square root of a natural number (number) is a natural number (solution), you can easily determine a range for solution based on the number of digits of number:

number has 1 digit: solution in range = 1 - 4
number has 2 digits: solution in range = 3 - 10
number has 3 digits: solution in range = 10 - 40
number has 4 digits: solution in range = 30 - 100
number has 5 digits: solution in range = 100 - 400

Notice the repetition?

You can use this range in a binary search approach to see if there is a solution for which:

    number == solution * solution

Here is the code. Here is my class SquareRootChecker:

    public class SquareRootChecker {

        private long number;
        private long initialLow;
        private long initialHigh;

        public SquareRootChecker(long number) {
            this.number = number;

            initialLow = 1;
            initialHigh = 4;
            if (Long.toString(number).length() % 2 == 0) {
                initialLow = 3;
                initialHigh = 10;
            }
            for (long i = 0; i < Long.toString(number).length() / 2; i++) {
                initialLow *= 10;
                initialHigh *= 10;
            }
            if (Long.toString(number).length() % 2 == 0) {
                initialLow /= 10;
                initialHigh /= 10;
            }
        }

        public boolean checkSquareRoot() {
            return findSquareRoot(initialLow, initialHigh, number);
        }

        private boolean findSquareRoot(long low, long high, long number) {
            long check = low + (high - low) / 2;
            if (high >= low) {
                if (number == check * check) {
                    return true;
                }
                else if (number < check * check) {
                    high = check - 1;
                    return findSquareRoot(low, high, number);
                }
                else {
                    low = check + 1;
                    return findSquareRoot(low, high, number);
                }
            }
            return false;
        }

    }

And here is an example on how to use it:

    long number = 1234567;
    long square = number * number;
    SquareRootChecker squareRootChecker = new SquareRootChecker(square);
    System.out.println(square + ": " + squareRootChecker.checkSquareRoot()); // Prints "1524155677489: true"

    long notSquare = square + 1;
    squareRootChecker = new SquareRootChecker(notSquare);
    System.out.println(notSquare + ": " + squareRootChecker.checkSquareRoot()); // Prints "1524155677490: false"

User · Answer

If you do a binary chop to try to find the "right" square root, you can fairly easily detect if the value you've got is close enough to tell:

    (n+1)^2 = n^2 + 2n + 1
    (n-1)^2 = n^2 - 2n + 1

So having calculated n^2, the options are:

    n^2 = target: done, return true
    n^2 + 2n + 1 > target > n^2: you're close, but it's not perfect: return false
    n^2 - 2n + 1 < target < n^2: ditto
    target < n^2 - 2n + 1: binary chop on a lower n
    target > n^2 + 2n + 1: binary chop on a higher n

(Sorry, this uses n as your current guess, and target for the parameter. Apologies for the confusion!)

I don't know whether this will be faster or not, but it's worth a try.

EDIT: The binary chop doesn't have to take in the whole range of integers, either: (2^x)^2 = 2^(2x), so once you've found the top set bit in your target (which can be done with a bit-twiddling trick; I forget exactly how) you can quickly get a range of potential answers. Mind you, a naive binary chop is still only going to take up to 31 or 32 iterations.
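As a rough illustration, the naive binary chop over the whole root range might look like the sketch below. The class name and the fixed upper bound of 3037000499 (the largest root whose square still fits in a signed 64-bit long) are choices made here, not part of the answer, and the early "close but not perfect" exits described above are left out for brevity.

```java
final class BinaryChopSketch {
    /** floor(sqrt(Long.MAX_VALUE)); squaring anything larger would overflow a long. */
    private static final long MAX_ROOT = 3037000499L;

    static boolean isPerfectSquare(long target) {
        if (target < 0) return false;
        long low = 0;
        long high = MAX_ROOT;
        while (low <= high) {
            long n = low + (high - low) / 2; // current guess
            long sq = n * n;                 // safe because n <= MAX_ROOT
            if (sq == target) return true;
            if (sq < target) {
                low = n + 1;                 // chop on a higher n
            } else {
                high = n - 1;                // chop on a lower n
            }
        }
        return false;
    }
}
```

Because the range holds about 2^31.5 candidates, the loop runs at most 32 times, which matches the iteration estimate in the answer.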

User · Answer

Just for the record, another approach is to use the prime decomposition. If every exponent in the decomposition is even, then the number is a perfect square. So what you want is to see if a number can be decomposed as a product of squares of prime numbers. Of course, you don't need to obtain such a decomposition, just to see if it exists.

First build a table of squares of prime numbers which are lower than 2^32. This is far smaller than a table of all integers up to this limit.

A solution would then be like this:

    boolean isPerfectSquare(long number)
    {
        if (number < 0) return false;
        if (number < 2) return true;

        for (int i = 0; ; i++)
        {
            long square = squareTable[i];
            if (square > number) return false;
            while (number % square == 0)
            {
                number /= square;
            }
            if (number == 1) return true;
        }
    }

I guess it's a bit cryptic. What it does is check at every step whether the square of a prime number divides the input number. If it does, then it divides the number by the square for as long as possible, to remove this square from the prime decomposition. If this process brings us to 1, then the input number was a product of squares of prime numbers. If the square becomes larger than the number itself, then there is no way this square, or any larger square, can divide it, so the number cannot be a product of squares of prime numbers.

Given that sqrt is nowadays done in hardware, and given the need to compute prime numbers here, I guess this solution is way slower. But it should give better results than a solution with sqrt, which won't work over 2^54, as mrzl says in his answer.

User · Answer

This is the fastest Java implementation I could come up with, using a combination of techniques suggested by others in this thread.

Mod-256 test
Inexact mod-3465 test (avoids integer division at the cost of some false positives)
Floating-point square root, round and compare with input value

I also experimented with these modifications but they did not help performance:

Additional mod-255 test
Dividing the input value by powers of 4
Fast Inverse Square Root (to work for high values of N it needs 3 iterations, enough to make it slower than the hardware square root function.)

    public class SquareTester {

        public static boolean isPerfectSquare(long n) {
            if (n < 0) {
                return false;
            } else {
                switch ((byte) n) {
                case -128: case -127: case -124: case -119: case -112:
                case -111: case -103: case  -95: case  -92: case  -87:
                case  -79: case  -71: case  -64: case  -63: case  -60:
                case  -55: case  -47: case  -39: case  -31: case  -28:
                case  -23: case  -15: case   -7: case    0: case    1:
                case    4: case    9: case   16: case   17: case   25:
                case   33: case   36: case   41: case   49: case   57:
                case   64: case   65: case   68: case   73: case   81:
                case   89: case   97: case  100: case  105: case  113:
                case  121:
                    long i = (n * INV3465) >>> 52;
                    if (! good3465[(int) i]) {
                        return false;
                    } else {
                        long r = round(Math.sqrt(n));
                        return r*r == n;
                    }
                default:
                    return false;
                }
            }
        }

        private static int round(double x) {
            return (int) Double.doubleToRawLongBits(x + (double) (1L << 52));
        }

        /** 3465<sup>-1</sup> modulo 2<sup>64</sup> */
        private static final long INV3465 = 0x8ffed161732e78b9L;

        private static final boolean[] good3465 =
            new boolean[0x1000];

        static {
            for (int r = 0; r < 3465; ++ r) {
                int i = (int) ((r * r * INV3465) >>> 52);
                good3465[i] = good3465[i+1] = true;
            }
        }

    }

User · Answer

"I'm looking for the fastest way to determine if a long value is a perfect square (i.e. its square root is another integer)."

The answers are impressive, but I failed to see a simple check:

check whether the last decimal digit of the long (the first digit from the right) is a member of the set (0, 1, 4, 5, 6, 9). If it is not, then it cannot possibly be a perfect square.

e.g.

4567 - cannot be a perfect square.

User · Answer

Calculating square roots by Newton's method is horrendously fast ... provided that the starting value is reasonable. However, there is no reasonable starting value available for free, and in practice we end up with bisection and log(2^64) behaviour.

To be really fast we need a fast way to get at a reasonable starting value, and that means we need to descend into machine language. If a processor provides an instruction that counts the leading zeroes (LZCNT on x86, for example), we can use that to get a starting value with half the significant bits. With care we can find a fixed number of Newton steps that will always suffice, thus forgoing the need to loop and giving very fast execution.

A second solution is going via the floating point facility, which may have a fast sqrt calculation (like the x87 coprocessor). Even an excursion via exp() and log() may be faster than Newton degenerated into a binary search. There is a tricky aspect to this: a processor-dependent analysis of whether, and what, refinement is necessary afterwards.

A third solution solves a slightly different problem, but is well worth mentioning because the situation is described in the question. If you want to calculate a great many square roots for numbers that differ slightly, you can use Newton iteration if you never reinitialise the starting value, but just leave it where the previous calculation left off. I've used this with success in at least one Euler problem.
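In Java there is no need to drop to machine language for the starting value: Long.numberOfLeadingZeros gives the bit length, and JIT compilers typically map it to a single instruction. The sketch below is one way to combine that with an integer Newton iteration. The class and method names are invented for illustration, and it keeps a loop rather than the fixed number of steps the answer suggests, since that step count would need its own analysis.

```java
final class NewtonIsqrtSketch {
    /** floor(sqrt(n)) for n >= 0, using integer Newton steps from a power-of-two start. */
    static long isqrt(long n) {
        if (n < 0) throw new IllegalArgumentException("n must be non-negative");
        if (n < 2) return n;
        int bits = 64 - Long.numberOfLeadingZeros(n); // bit length of n, 2..63
        long x = 1L << ((bits + 1) / 2);              // start at or above the true root
        while (true) {
            long y = (x + n / x) >>> 1;               // Newton step for x*x = n
            if (y >= x) return x;                     // stopped decreasing: x is the floor
            x = y;
        }
    }

    static boolean isPerfectSquare(long n) {
        if (n < 0) return false;
        long r = isqrt(n);
        return r * r == n; // r <= 3037000499, so r * r cannot overflow
    }
}
```

Starting from a value no smaller than the true root is what makes the "stop when the iterate no longer decreases" test valid; a start below the root would need a different termination rule.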

User · Answer

I'm not sure if it would be faster, or even accurate, but you could use John Carmack's Magical Square Root algorithm to solve the square root faster. You could probably easily test this for all possible 32-bit integers, and validate that you actually got correct results, as it's only an approximation. However, now that I think about it, using doubles is approximating also, so I'm not sure how that would come into play.

User · Answer

The sqrt call is not perfectly accurate, as has been mentioned, but it's interesting and instructive that it doesn't blow away the other answers in terms of speed. After all, the sequence of assembly language instructions for a sqrt is tiny. Intel has a hardware instruction, which I believe Java doesn't use because it doesn't conform to IEEE.

So why is it slow? Because Java is actually calling a C routine through JNI, and it's actually slower to do so than to call a Java subroutine, which itself is slower than doing it inline. This is very annoying, and Java should have come up with a better solution, i.e. building in floating-point library calls if necessary. Oh well.

In C++, I suspect all the complex alternatives would lose on speed, but I haven't checked them all. What I did, and what Java people will find useful, is a simple hack, an extension of the special-case testing suggested by A. Rex. Use a single long value as a bit array, which isn't bounds checked. That way, you have a 64-bit boolean lookup.

    typedef unsigned long long UVLONG;
    UVLONG pp1, pp2;

    void init2() {
      for (int i = 0; i < 64; i++) {
        for (int j = 0; j < 64; j++)
          if (isPerfectSquare(i * 64 + j)) {
            // Shift a 64-bit one: a plain int 1 cannot be shifted by 32 or more.
            pp1 |= (UVLONG(1) << j);
            pp2 |= (UVLONG(1) << i);
            break;
          }
      }
      cout << "pp1=" << pp1 << "," << pp2 << "\n";
    }

    inline bool isPerfectSquare5(UVLONG x) {
      return pp1 & (UVLONG(1) << (x & 0x3F)) ? isPerfectSquare(x) : false;
    }

The routine isPerfectSquare5 runs in about 1/3 the time on my Core 2 Duo machine. I suspect that further tweaks along the same lines could reduce the time further on average, but every time you check, you are trading off more testing for more eliminating, so you can't go too much farther on that road.

Certainly, rather than having a separate test for negative, you could check the high 6 bits the same way.

Note that all I'm doing is eliminating possible squares, but when I have a potential case I have to call the original, inlined isPerfectSquare.

The init2 routine is called once to initialize the static values of pp1 and pp2. Note that in my implementation in C++, I'm using unsigned long long, so since you're signed, you'd have to use the >>> operator.

There is no intrinsic need to bounds check the array, but Java's optimizer has to figure this stuff out pretty quickly, so I don't blame them for that.
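For Java readers, the same 64-entry bit-array idea can be sketched roughly as follows. This is only an illustration, not the code that was benchmarked above: the names LowBitsFilter, MASK and mightBeSquare are made up for the example, and the mask is built from the residues of i*i modulo 64 rather than by calling isPerfectSquare.

```java
final class LowBitsFilter {
    // Bit r of MASK is set when some square is congruent to r modulo 64.
    static final long MASK = buildMask();

    private static long buildMask() {
        long mask = 0L;
        for (long i = 0; i < 64; i++) {
            mask |= 1L << ((i * i) & 63);
        }
        return mask;
    }

    // Cheap filter: false means "certainly not a square".
    static boolean mightBeSquare(long x) {
        return x >= 0 && ((MASK >>> (x & 63)) & 1L) != 0;
    }

    // Full test: only survivors of the filter pay for the square root.
    static boolean isPerfectSquare(long x) {
        if (!mightBeSquare(x)) {
            return false;
        }
        long r = (long) Math.sqrt((double) x);
        return r * r == x;
    }
}
```

Only 12 of the 64 possible low-bit patterns survive the filter, so most non-squares never reach Math.sqrt.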

User · Answer

If you do a binary chop to try to find the "right" square root, you can fairly easily detect if the value you've got is close enough to tell:

    (n+1)^2 = n^2 + 2n + 1
    (n-1)^2 = n^2 - 2n + 1

So having calculated n^2, the options are:

- n^2 = target: done, return true
- n^2 + 2n + 1 > target > n^2: you're close, but it's not perfect: return false
- n^2 - 2n + 1 < target < n^2: ditto
- target < n^2 - 2n + 1: binary chop on a lower n
- target > n^2 + 2n + 1: binary chop on a higher n

(Sorry, this uses n as your current guess, and target for the parameter. Apologies for the confusion!)

I don't know whether this will be faster or not, but it's worth a try.

EDIT: The binary chop doesn't have to take in the whole range of integers, either: (2^x)^2 = 2^(2x), so once you've found the top set bit in your target (which can be done with a bit-twiddling trick; I forget exactly how) you can quickly get a range of potential answers. Mind you, a naive binary chop is still only going to take up to 31 or 32 iterations.
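A minimal Java sketch of this binary chop, under a few assumptions of mine: the class name BinaryChop and the constant MAX_ROOT are hypothetical, and the "top set bit" is found with Long.numberOfLeadingZeros. The explicit "close but not perfect" cases are not tested separately here; they fall out when the search interval becomes empty.

```java
final class BinaryChop {
    // floor(sqrt(Long.MAX_VALUE)); squaring anything up to this cannot overflow.
    private static final long MAX_ROOT = 3037000499L;

    static boolean isPerfectSquare(long target) {
        if (target < 0) return false;
        if (target == 0) return true;

        // If the top set bit of target is at position b, the root lies in
        // [2^(b/2), 2^(b/2 + 1)), because (2^x)^2 = 2^(2x).
        int topBit = 63 - Long.numberOfLeadingZeros(target);
        long lo = 1L << (topBit / 2);
        long hi = Math.min((1L << (topBit / 2 + 1)) - 1, MAX_ROOT);

        while (lo <= hi) {
            long n = lo + (hi - lo) / 2;   // current guess
            long sq = n * n;
            if (sq == target) return true;
            if (sq < target) {
                lo = n + 1;                // chop on a higher n
            } else {
                hi = n - 1;                // chop on a lower n
            }
        }
        return false;                      // target lies strictly between two squares
    }
}
```

Because the starting interval spans only one power of two, the loop needs at most about 32 iterations even for the largest inputs.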

User · Answer

Don't know about fastest, but the simplest is to take the square root in the normal fashion, multiply the result by itself, and see if it matches your original value.

Since we're talking integers here, the fastest would probably involve a collection where you can just make a lookup.

User · Answer

Maybe the best algorithm for the problem is a fast integer square root algorithm: https://stackoverflow.com/a/51585204/5191852

There, Kde claims that three iterations of Newton's method would be sufficient for an accuracy of ±1 for 32-bit integers. Certainly, more iterations are needed for 64-bit integers, maybe 6 or 7.
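As a rough illustration of an integer-only Newton iteration (not the code from the linked answer), here is a small Java sketch. The class and method names are hypothetical, and instead of a fixed iteration count it simply loops until the estimate stops decreasing, which is the usual stopping rule when starting from a guess at or above the true root.

```java
final class NewtonIsqrt {
    // Returns floor(sqrt(n)) for n >= 0 using only integer arithmetic.
    static long isqrt(long n) {
        if (n < 0) throw new IllegalArgumentException("negative input");
        if (n < 2) return n;

        // Start from a power of two that is guaranteed to be >= sqrt(n).
        int bits = 64 - Long.numberOfLeadingZeros(n);
        long x = 1L << ((bits + 1) / 2);

        while (true) {
            long y = (x + n / x) >>> 1;   // Newton step for f(x) = x^2 - n
            if (y >= x) return x;         // no further decrease: x is the floor root
            x = y;
        }
    }

    static boolean isPerfectSquare(long n) {
        if (n < 0) return false;
        long r = isqrt(n);
        return r * r == n;
    }
}
```

The number of steps depends on how good the starting guess is, which is why the quoted iteration counts grow with the bit width.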

User · Answer

If you want speed, given that your integers are of finite size, I suspect that the quickest way would involve (a) partitioning the parameters by size (e.g. into categories by largest bit set), then (b) checking the value against an array of perfect squares within that range.

User · Answer

The following simplification of maaartinus's solution appears to shave a few percentage points off the runtime, but I'm not good enough at benchmarking to produce a benchmark I can trust:

    long goodMask; // 0xC840C04048404040 computed below
    {
        for (int i = 0; i < 64; ++i) goodMask |= Long.MIN_VALUE >>> (i * i);
    }

    public boolean isSquare(long x) {
        // This tests if the 6 least significant bits are right.
        // Moving the to be tested bit to the highest position saves us masking.
        if (goodMask << x >= 0) return false;
        // Remove an even number of trailing zeros, leaving at most one.
        x >>= (Long.numberOfTrailingZeros(x) & (-2));
        // Repeat the test on the 6 least significant remaining bits.
        if (goodMask << x >= 0 | x <= 0) return x == 0;
        // Do it in the classical way.
        // The correctness is not trivial as the conversion from long to double is lossy!
        final long tst = (long) Math.sqrt(x);
        return tst * tst == x;
    }

It would be worth checking how omitting the first test,

    if (goodMask << x >= 0) return false;

would affect performance.

User · Answer

Just for the record, another approach is to use the prime decomposition. If every exponent in the decomposition is even, then the number is a perfect square. So what you want is to see if a number can be decomposed as a product of squares of prime numbers. Of course, you don't need to obtain such a decomposition, just to see if it exists.

First build a table of squares of prime numbers which are lower than 2^32. This is far smaller than a table of all integers up to this limit.

A solution would then be like this:

    boolean isPerfectSquare(long number) {
        if (number < 0) return false;
        if (number < 2) return true;

        for (int i = 0; ; i++) {
            long square = squareTable[i];
            if (square > number) return false;
            while (number % square == 0) {
                number /= square;
            }
            if (number == 1) return true;
        }
    }

I guess it's a bit cryptic. At every step it checks whether the square of a prime number divides the input number. If it does, it divides the number by the square for as long as possible, to remove this square from the prime decomposition. If this process brings us to 1, then the input number was a product of squares of prime numbers. If the square becomes larger than the number itself, then there is no way this square, or any larger square, can divide it, so the number cannot be a product of squares of prime numbers.

Given that sqrt is nowadays done in hardware, and given the need to compute prime numbers here, I guess this solution is way slower. But it should give better results than a solution with sqrt, which won't work over 2^54, as mrzl says in his answer.

User · Answer

I was thinking about the horrible times I've spent in my Numerical Analysis course.

And then I remembered: there was this function circling around the net from the Quake source code:

    float Q_rsqrt( float number )
    {
      long i;
      float x2, y;
      const float threehalfs = 1.5F;

      x2 = number * 0.5F;
      y  = number;
      i  = * ( long * ) &y;  // evil floating point bit level hacking
      i  = 0x5f3759df - ( i >> 1 ); // wtf?
      y  = * ( float * ) &i;
      y  = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
      y  = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed

      #ifndef Q3_VM
      #ifdef __linux__
        assert( !isnan(y) ); // bk010122 - FPE?
      #endif
      #endif
      return y;
    }

This basically calculates an inverse square root (1/sqrt(x)), using Newton's approximation function (can't remember the exact name).

It should be usable and might even be faster; it's from one of the phenomenal id Software games!

It's written in C, but it should not be too hard to reuse the same technique in Java once you get the idea.

I originally found it at: http://www.codemaestro.com/reviews/9

Newton's method explained at Wikipedia: http://en.wikipedia.org/wiki/Newton%27s_method

You can follow the link for more explanation of how it works, but if you don't care much, then this is roughly what I remember from reading the blog and from taking the Numerical Analysis course:

- the * ( long * ) &y is basically a fast convert-to-long function so integer operations can be applied on the raw bytes.
- the 0x5f3759df - ( i >> 1 ); line is a pre-calculated seed value for the approximation function.
- the * ( float * ) &i converts the value back to floating point.
- the y = y * ( threehalfs - ( x2 * y * y ) ) line basically iterates the value over the function again.

The approximation function gives more precise values the more you iterate the function over the result. In Quake's case, one iteration is "good enough", but if it isn't for you, then you could add as many iterations as you need.

This should be faster because it reduces the number of division operations done in naive square rooting down to a simple divide by 2 (actually a * 0.5F multiply operation) and replaces them with a small, fixed number of multiplication operations instead.
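To show how the same trick might carry over to Java, here is a hedged sketch. Java has no pointer casts, so this version assumes Float.floatToRawIntBits and Float.intBitsToFloat as the stand-ins for the two reinterpreting casts; the class and method names are invented for the example.

```java
final class FastInvSqrt {
    // Approximates 1 / sqrt(number) for positive, finite inputs.
    static float invSqrt(float number) {
        float x2 = number * 0.5f;
        int i = Float.floatToRawIntBits(number); // reinterpret the float's bits as an int
        i = 0x5f3759df - (i >> 1);               // magic-constant seed for the estimate
        float y = Float.intBitsToFloat(i);       // reinterpret the bits back as a float
        y = y * (1.5f - x2 * y * y);             // 1st Newton iteration
        y = y * (1.5f - x2 * y * y);             // 2nd Newton iteration
        return y;
    }
}
```

The result is only an approximation, so for a perfect-square test the candidate root (number * invSqrt(number), rounded) would still have to be squared and compared exactly, and single precision cannot represent large long values anyway.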

User · Answer

Thinking in terms of general bit lengths (though I have used a specific type here), I tried to design a simple algorithm, shown below. A simple and obvious check for 0, 1, 2 or < 0 is required initially. It is simple in the sense that it doesn't try to use any existing maths functions. Most of the operators can be replaced with bit-wise operators. I haven't tested it with any benchmark data, though. I'm not an expert in maths or in computer algorithm design in particular, so I would love to have problems pointed out. I know there is plenty of room for improvement.

    #include <stdio.h>

    int main()
    {
        unsigned int c1 = 0, c2 = 0;
        unsigned int x = 0;
        unsigned int p = 0;
        int k1 = 0;
        scanf("%u", &p);

        /* Start from roughly half of p and keep halving until x*x <= p. */
        if (p % 2 == 0) {
            x = p / 2;
        }
        else {
            x = (p / 2) + 1;
        }
        while (x)
        {
            /* Widen before multiplying so x*x cannot wrap around. */
            if ((unsigned long long)x * x > p) {
                c1 = x;
                x = x / 2;
            } else {
                c2 = x;
                break;
            }
        }
        if ((p % 2) != 0)
            c2++;

        /* Linear scan between the lower bound c2 and the upper bound c1. */
        while (c2 < c1)
        {
            if ((c2 * c2) == p) {
                k1 = 1;
                break;
            }
            c2++;
        }
        if (k1)
            printf("\n Perfect square for %u", c2);
        else
            printf("\n Not perfect but nearest to :%u :", c2);
        return 0;
    }

User · Answer

Not sure if this is the fastest way, but this is something I stumbled upon (a long time ago in high school) when I was bored and playing with my calculator during math class. At that time, I was really amazed this was working...

    public static boolean isIntRoot(int number) {
        return isIntRootHelper(number, 1);
    }

    private static boolean isIntRootHelper(int number, int index) {
        if (number == index) {
            return true;
        }
        if (number < index) {
            return false;
        }
        else {
            return isIntRootHelper(number - 2 * index, index + 1);
        }
    }

User · Answer

It should be much faster to use Newton's method to calculate the integer square root, then square this number and check, as you do in your current solution. Newton's method is the basis for the Carmack solution mentioned in some other answers. You should be able to get a faster answer since you're only interested in the integer part of the root, allowing you to stop the approximation algorithm sooner.

Another optimization that you can try: if the digital root of a number is not 1, 4, 7, or 9, the number is not a perfect square. This can be used as a quick way to eliminate roughly 55% of your inputs (five of the nine possible digital roots) before applying the slower square root algorithm.
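The digital-root filter can be sketched in Java as below. The names are hypothetical, and the digital root is computed with the standard identity 1 + (n - 1) % 9 for positive n rather than by summing decimal digits.

```java
final class DigitalRootFilter {
    // Returns false only when n is certainly not a perfect square.
    static boolean mightBeSquare(long n) {
        if (n < 0) return false;
        if (n == 0) return true;
        int digitalRoot = (int) (1 + (n - 1) % 9);
        return digitalRoot == 1 || digitalRoot == 4
            || digitalRoot == 7 || digitalRoot == 9;
    }
}
```

A true result is not a proof: 10 has digital root 1 but is not a square, so survivors still need the full square root check.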

User · Answer

I'm not sure if it would be faster, or even accurate, but you could use John Carmack's Magical Square Root algorithm to solve the square root faster. You could probably easily test this for all possible 32-bit integers, and validate that you actually got correct results, as it's only an approximation. However, now that I think about it, using doubles is approximating also, so I'm not sure how that would come into play.

User · Answer

This is the fastest Java implementation I could come up with, using a combination of techniques suggested by others in this thread.

- Mod-256 test
- Inexact mod-3465 test (avoids integer division at the cost of some false positives)
- Floating-point square root, round and compare with input value

I also experimented with these modifications but they did not help performance:

- Additional mod-255 test
- Dividing the input value by powers of 4
- Fast Inverse Square Root (to work for high values of N it needs 3 iterations, enough to make it slower than the hardware square root function.)

    public class SquareTester {

        public static boolean isPerfectSquare(long n) {
            if (n < 0) {
                return false;
            } else {
                switch ((byte) n) {
                case -128: case -127: case -124: case -119: case -112:
                case -111: case -103: case  -95: case  -92: case  -87:
                case  -79: case  -71: case  -64: case  -63: case  -60:
                case  -55: case  -47: case  -39: case  -31: case  -28:
                case  -23: case  -15: case   -7: case    0: case    1:
                case    4: case    9: case   16: case   17: case   25:
                case   33: case   36: case   41: case   49: case   57:
                case   64: case   65: case   68: case   73: case   81:
                case   89: case   97: case  100: case  105: case  113:
                case  121:
                    long i = (n * INV3465) >>> 52;
                    if (! good3465[(int) i]) {
                        return false;
                    } else {
                        long r = round(Math.sqrt(n));
                        return r*r == n;
                    }
                default:
                    return false;
                }
            }
        }

        private static int round(double x) {
            return (int) Double.doubleToRawLongBits(x + (double) (1L << 52));
        }

        /** 3465^-1 modulo 2^64 */
        private static final long INV3465 = 0x8ffed161732e78b9L;

        private static final boolean[] good3465 =
            new boolean[0x1000];

        static {
            for (int r = 0; r < 3465; ++ r) {
                int i = (int) ((r * r * INV3465) >>> 52);
                good3465[i] = good3465[i+1] = true;
            }
        }

    }

User · Answer

If speed is a concern, why not partition off the most commonly used set of inputs and their values to a lookup table, and then do whatever optimized magic algorithm you have come up with for the exceptional cases?

User · Answer

> I want this function to work with all positive 64-bit signed integers

Math.sqrt() works with doubles as input parameters, so you won't get accurate results for integers bigger than 2^53.
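One defensive way to live with that precision limit is sketched below in Java. The idea, which is my own illustration rather than something proposed in this answer, is to treat the floating-point root only as an estimate and test its immediate neighbours with exact long arithmetic; the class name and MAX_ROOT constant are hypothetical.

```java
final class SafeSqrtCheck {
    // floor(sqrt(Long.MAX_VALUE)); larger candidates would overflow when squared.
    private static final long MAX_ROOT = 3037000499L;

    static boolean isPerfectSquare(long n) {
        if (n < 0) return false;
        // The cast to double may round n once it exceeds 2^53,
        // so r is only guaranteed to be close to the true root.
        long r = (long) Math.sqrt((double) n);
        long from = Math.max(0L, r - 1);
        long to = Math.min(MAX_ROOT, r + 1);
        for (long c = from; c <= to; c++) {
            if (c * c == n) return true;   // exact comparison in long arithmetic
        }
        return false;
    }
}
```

The rounding itself is easy to observe: (double) ((1L << 53) + 1) and (double) (1L << 53) are the same value.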

User · Answer

Regarding the Carmack method, it seems like it would be quite easy just to iterate once more, which should double the number of digits of accuracy. It is, after all, an extremely truncated iterative method: Newton's, with a very good first guess.

Regarding your current best, I see two micro-optimizations:

- move the check vs. 0 after the check using mod255
- rearrange the dividing out powers of four to skip all the checks for the usual (75%) case.

I.e.:

    // Divide out powers of 4 using binary search

    if ((n & 0x3L) == 0) {
      n >>= 2;

      if ((n & 0xffffffffL) == 0)
        n >>= 32;
      if ((n & 0xffffL) == 0)
        n >>= 16;
      if ((n & 0xffL) == 0)
        n >>= 8;
      if ((n & 0xfL) == 0)
        n >>= 4;
      if ((n & 0x3L) == 0)
        n >>= 2;
    }

Even better might be a simple

    while ((n & 0x03L) == 0) n >>= 2;

Obviously, it would be interesting to know how many numbers get culled at each checkpoint. I rather doubt the checks are truly independent, which makes things tricky.

User · Answer

It's been pointed out that the last d digits of a perfect square can only take on certain values. The last d digits (in base b) of a number n are the same as the remainder when n is divided by b^d, i.e. in C notation n % pow(b, d).

This can be generalized to any modulus m, i.e. n % m can be used to rule out some percentage of numbers from being perfect squares. The modulus you are currently using is 64, which allows 12, i.e. 19% of remainders, as possible squares. With a little coding I found the modulus 110880, which allows only 2016, i.e. 1.8% of remainders, as possible squares. So depending on the cost of a modulus operation (i.e. division) and a table lookup versus a square root on your machine, using this modulus might be faster.

By the way, if Java has a way to store a packed array of bits for the lookup table, don't use it. 110880 32-bit words is not much RAM these days, and fetching a machine word is going to be faster than fetching a single bit.
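The residue counts quoted above can be reproduced with a few lines of Java. This is a sketch of the "little coding" involved, not the author's program; the class and method names are made up.

```java
final class ResidueCounter {
    // Counts how many remainders modulo m can be produced by a perfect square.
    static int countSquareResidues(int m) {
        boolean[] seen = new boolean[m];
        int count = 0;
        for (long r = 0; r < m; r++) {
            int residue = (int) ((r * r) % m);   // long arithmetic avoids overflow
            if (!seen[residue]) {
                seen[residue] = true;
                count++;
            }
        }
        return count;
    }
}
```

Calling countSquareResidues(64) gives 12 and countSquareResidues(110880) gives 2016, matching the 19% and 1.8% figures; the same boolean array can double as the lookup table for the filter.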

User · Answer

Square root of a number, given that the number is a perfect square.

The complexity is O(sqrt(n)), since the loop runs once per unit of the root.

    /**
     * Calculate square root if the given number is a perfect square.
     *
     * Approach: the sum of the first n odd numbers is equal to n*n, so
     * counting how many odd numbers are needed to reach the input gives
     * its square root, given that the input is a perfect square.
     *
     * @param number
     * @return squareRoot
     */
    public static int calculateSquareRoot(int number) {

        int sum = 1;
        int count = 1;
        int squareRoot = 1;
        while (sum < number) {
            count += 2;
            sum += count;
            squareRoot++;
        }
        return squareRoot;
    }
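The same odd-number sum can be turned into a yes/no perfect-square test. The Java sketch below is my own variation on the routine above (hypothetical names, and a long accumulator so the running sum cannot overflow for large int inputs).

```java
final class OddSum {
    // 1 + 3 + 5 + ... + (2k - 1) == k * k, so number is a perfect square
    // exactly when some partial sum of odd numbers lands on it.
    static boolean isPerfectSquare(int number) {
        if (number < 0) return false;
        long sum = 0;
        long odd = 1;
        while (sum < number) {
            sum += odd;
            odd += 2;
        }
        return sum == number;
    }
}
```

It needs about sqrt(n) additions, so it is a curiosity rather than a competitor to the hardware square root.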

User · Answer

> I'm looking for the fastest way to determine if a long value is a perfect square (i.e. its square root is another integer).

The answers are impressive, but I failed to see a simple check: check whether the last decimal digit of the long (the first digit from the right) is a member of the set (0, 1, 4, 5, 6, 9). If it is not, then it cannot possibly be a perfect square.

E.g. 4567 cannot be a perfect square.

User · Answer

For performance, you very often have to make some compromises. Others have described various methods; however, you noted that Carmack's hack was faster up to certain values of N. So you should check n, and if it is less than that number N, use Carmack's hack; otherwise use some other method described in the answers here.

User · Answer

Calculating square roots by Newton's method is horrendously fast, provided that the starting value is reasonable. However, there is no reasonable starting value, and in practice we end up with bisection and log(2^64) behaviour.

To be really fast we need a fast way to get at a reasonable starting value, and that means we need to descend into machine language. If a processor provides an instruction that counts the leading zeroes (such as LZCNT or BSR on x86), we can use that to get a starting value with half the significant bits. With care we can find a fixed number of Newton steps that will always suffice, thus forgoing the need to loop and giving very fast execution.

A second solution is going via the floating-point facility, which may have a fast sqrt calculation (like the i87 coprocessor). Even an excursion via exp() and log() may be faster than Newton degenerated into a binary search. There is a tricky aspect to this: a processor-dependent analysis of whether, and what, refinement is necessary afterwards.

A third solution solves a slightly different problem, but is well worth mentioning because the situation is described in the question. If you want to calculate a great many square roots for numbers that differ slightly, you can use Newton iteration if you never reinitialise the starting value, but just leave it where the previous calculation left off. I've used this with success in at least one Euler problem.
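The third idea, keeping the previous root as the next starting value, might look like this in Java. It is a sketch under my own assumptions: the class name WarmStartSqrt and its single field are hypothetical, and the stopping rule is the standard "stop when the estimate no longer decreases" test rather than a fixed step count.

```java
final class WarmStartSqrt {
    // Last root found; reused as the starting value for the next call.
    private long guess = 1;

    // Returns floor(sqrt(n)) for n >= 0.
    long isqrt(long n) {
        if (n < 0) throw new IllegalArgumentException("negative input");
        if (n == 0) return 0;

        // One Newton step from any positive guess lands at or above the true root.
        long x = (guess + n / guess) >>> 1;
        while (true) {
            long y = (x + n / x) >>> 1;
            if (y >= x) break;   // estimate stopped decreasing: x is the floor root
            x = y;
        }
        guess = x;               // leave the value where this calculation ended
        return x;
    }
}
```

When successive inputs are close together, the loop typically exits after one or two steps, which is where the speedup comes from.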

User · Answer

I figured out a method that works ~35% faster than your 6bits+Carmack+sqrt code, at least with my CPU (x86) and programming language (C/C++). Your results may vary, especially because I don't know how the Java factor will play out.

My approach is threefold:

First, filter out obvious answers. This includes negative numbers and looking at the last 4 bits. (I found looking at the last six didn't help.) I also answer yes for 0. (In reading the code below, note that my input is int64 x.)

    if( x < 0 || (x&2) || ((x & 7) == 5) || ((x & 11) == 8) )
        return false;
    if( x == 0 )
        return true;

Next, check if it's a square modulo 255 = 3 * 5 * 17. Because that's a product of three distinct primes, only about 1/8 of the residues mod 255 are squares. However, in my experience, calling the modulo operator (%) costs more than the benefit one gets, so I use bit tricks involving 255 = 2^8-1 to compute the residue. (For better or worse, I am not using the trick of reading individual bytes out of a word, only bitwise-and and shifts.)

    int64 y = x;
    y = (y & 4294967295LL) + (y >> 32);
    y = (y & 65535) + (y >> 16);
    y = (y & 255) + ((y >> 8) & 255) + (y >> 16);
    // At this point, y is between 0 and 511.  More code can reduce it farther.

To actually check if the residue is a square, I look up the answer in a precomputed table.

    if( bad255[y] )
        return false;
    // However, I just use a table of size 512

Finally, try to compute the square root using a method similar to Hensel's lemma. (I don't think it's applicable directly, but it works with some modifications.) Before doing that, I divide out all powers of 2 with a binary search:

    if((x & 4294967295LL) == 0)
        x >>= 32;
    if((x & 65535) == 0)
        x >>= 16;
    if((x & 255) == 0)
        x >>= 8;
    if((x & 15) == 0)
        x >>= 4;
    if((x & 3) == 0)
        x >>= 2;

At this point, for our number to be a square, it must be 1 mod 8.

    if((x & 7) != 1)
        return false;

The basic structure of Hensel's lemma is the following. (Note: untested code; if it doesn't work, try t=2 or 8.)

    int64 t = 4, r = 1;
    t <<= 1; r += ((x - r * r) & t) >> 1;
    t <<= 1; r += ((x - r * r) & t) >> 1;
    t <<= 1; r += ((x - r * r) & t) >> 1;
    // Repeat until t is 2^33 or so.  Use a loop if you want.

The idea is that at each iteration, you add one bit onto r, the "current" square root of x; each square root is accurate modulo a larger and larger power of 2, namely t/2. At the end, r and t/2-r will be square roots of x modulo t/2. (Note that if r is a square root of x, then so is -r. This is true even modulo numbers, but beware, modulo some numbers, things can have even more than 2 square roots; notably, this includes powers of 2.) Because our actual square root is less than 2^32, at that point we can actually just check if r or t/2-r are real square roots. In my actual code, I use the following modified loop:

    int64 r, t, z;
    r = start[(x >> 3) & 1023];
    do {
        z = x - r * r;
        if( z == 0 )
            return true;
        if( z < 0 )
            return false;
        t = z & (-z);
        r += (z & t) >> 1;
        if( r > (t >> 1) )
            r = t - r;
    } while( t <= (1LL << 33) );

The speedup here is obtained in three ways: precomputed start value (equivalent to ~10 iterations of the loop), earlier exit of the loop, and skipping some t values. For the last part, I look at z = x - r * r, and set t to be the largest power of 2 dividing z with a bit trick. This allows me to skip t values that wouldn't have affected the value of r anyway. The precomputed start value in my case picks out the "smallest positive" square root modulo 8192.

Even if this code doesn't work faster for you, I hope you enjoy some of the ideas it contains. Complete, tested code follows, including the precomputed tables.

    typedef signed long long int int64;

    int start[1024] =
    {1,3,1769,5,1937,1741,7,1451,479,157,9,91,945,659,1817,11,
    1983,707,1321,1211,1071,13,1479,405,415,1501,1609,741,15,339,1703,203,
    129,1411,873,1669,17,1715,1145,1835,351,1251,887,1573,975,19,1127,395,
    1855,1981,425,453,1105,653,327,21,287,93,713,1691,1935,301,551,587,
    257,1277,23,763,1903,1075,1799,1877,223,1437,1783,859,1201,621,25,779,
    1727,573,471,1979,815,1293,825,363,159,1315,183,27,241,941,601,971,
    385,131,919,901,273,435,647,1493,95,29,1417,805,719,1261,1177,1163,
    1599,835,1367,315,1361,1933,1977,747,31,1373,1079,1637,1679,1581,1753,1355,
    513,1539,1815,1531,1647,205,505,1109,33,1379,521,1627,1457,1901,1767,1547,
    1471,1853,1833,1349,559,1523,967,1131,97,35,1975,795,497,1875,1191,1739,
    641,1149,1385,133,529,845,1657,725,161,1309,375,37,463,1555,615,1931,
    1343,445,937,1083,1617,883,185,1515,225,1443,1225,869,1423,1235,39,1973,
    769,259,489,1797,1391,1485,1287,341,289,99,1271,1701,1713,915,537,1781,
    1215,963,41,581,303,243,1337,1899,353,1245,329,1563,753,595,1113,1589,
    897,1667,407,635,785,1971,135,43,417,1507,1929,731,207,275,1689,1397,
    1087,1725,855,1851,1873,397,1607,1813,481,163,567,101,1167,45,1831,1205,
    1025,1021,1303,1029,1135,1331,1017,427,545,1181,1033,933,1969,365,1255,1013,
    959,317,1751,187,47,1037,455,1429,609,1571,1463,1765,1009,685,679,821,
    1153,387,1897,1403,1041,691,1927,811,673,227,137,1499,49,1005,103,629,
    831,1091,1449,1477,1967,1677,697,1045,737,1117,1737,667,911,1325,473,437,
    1281,1795,1001,261,879,51,775,1195,801,1635,759,165,1871,1645,1049,245,
    703,1597,553,955,209,1779,1849,661,865,291,841,997,1265,1965,1625,53,
    1409,893,105,1925,1297,589,377,1579,929,1053,1655,1829,305,1811,1895,139,
    575,189,343,709,1711,1139,1095,277,993,1699,55,1435,655,1491,1319,331,
    1537,515,791,507,623,1229,1529,1963,1057,355,1545,603,1615,1171,743,523,
    447,1219,1239,1723,465,499,57,107,1121,989,951,229,1521,851,167,715,
    1665,1923,1687,1157,1553,1869,1415,1749,1185,1763,649,1061,561,531,409,907,
    319,1469,1961,59,1455,141,1209,491,1249,419,1847,1893,399,211,985,1099,
    1793,765,1513,1275,367,1587,263,1365,1313,925,247,1371,1359,109,1561,1291,
    191,61,1065,1605,721,781,1735,875,1377,1827,1353,539,1777,429,1959,1483,
    1921,643,617,389,1809,947,889,981,1441,483,1143,293,817,749,1383,1675,
    63,1347,169,827,1199,1421,583,1259,1505,861,457,1125,143,1069,807,1867,
    2047,2045,279,2043,111,307,2041,597,1569,1891,2039,1957,1103,1389,231,2037,
    65,1341,727,837,977,2035,569,1643,1633,547,439,1307,2033,1709,345,1845,
    1919,637,1175,379,2031,333,903,213,1697,797,1161,475,1073,2029,921,1653,
    193,67,1623,1595,943,1395,1721,2027,1761,1955,1335,357,113,1747,1497,1461,
    1791,771,2025,1285,145,973,249,171,1825,611,265,1189,847,1427,2023,1269,
    321,1475,1577,69,1233,755,1223,1685,1889,733,1865,2021,1807,1107,1447,1077,
    1663,1917,1129,1147,1775,1613,1401,555,1953,2019,631,1243,1329,787,871,885,
    449,1213,681,1733,687,115,71,1301,2017,675,969,411,369,467,295,693,
    1535,509,233,517,401,1843,1543,939,2015,669,1527,421,591,147,281,501,
    577,195,215,699,1489,525,1081,917,1951,2013,73,1253,1551,173,857,309,
    1407,899,663,1915,1519,1203,391,1323,1887,739,1673,2011,1585,493,1433,117,
    705,1603,1111,965,431,1165,1863,533,1823,605,823,1179,625,813,2009,75,
    1279,1789,1559,251,657,563,761,1707,1759,1949,777,347,335,1133,1511,267,
    833,1085,2007,1467,1745,1805,711,149,1695,803,1719,485,1295,1453,935,459,
    1151,381,1641,1413,1263,77,1913,2005,1631,541,119,1317,1841,1773,359,651,
    961,323,1193,197,175,1651,441,235,1567,1885,1481,1947,881,2003,217,843,
    1023,1027,745,1019,913,717,1031,1621,1503,867,1015,1115,79,1683,793,1035,
    1089,1731,297,1861,2001,1011,1593,619,1439,477,585,283,1039,1363,1369,1227,
    895,1661,151,645,1007,1357,121,1237,1375,1821,1911,549,1999,1043,1945,1419,
    1217,957,599,571,81,371,1351,1003,1311,931,311,1381,1137,723,1575,1611,
    767,253,1047,1787,1169,1997,1273,853,1247,413,1289,1883,177,403,999,1803,
    1345,451,1495,1093,1839,269,199,1387,1183,1757,1207,1051,783,83,423,1995,
    639,1155,1943,123,751,1459,1671,469,1119,995,393,219,1743,237,153,1909,
    1473,1859,1705,1339,337,909,953,1771,1055,349,1993,613,1393,557,729,1717,
    511,1533,1257,1541,1425,819,519,85,991,1693,503,1445,433,877,1305,1525,
    1601,829,809,325,1583,1549,1991,1941,927,1059,1097,1819,527,1197,1881,1333,
    383,125,361,891,495,179,633,299,863,285,1399,987,1487,1517,1639,1141,
    1729,579,87,1989,593,1907,839,1557,799,1629,201,155,1649,1837,1063,949,
    255,1283,535,773,1681,461,1785,683,735,1123,1801,677,689,1939,487,757,
    1857,1987,983,443,1327,1267,313,1173,671,221,695,1509,271,1619,89,565,
    127,1405,1431,1659,239,1101,1159,1067,607,1565,905,1755,1231,1299,665,373,
    1985,701,1879,1221,849,627,1465,789,543,1187,1591,923,1905,979,1241,181};

    bool bad255[512] =
    {0,0,1,1,0,1,1,1,1,0,1,1,1,1,1,0,0,1,1,0,1,0,1,1,1,0,1,1,1,1,0,1,
     1,1,0,1,0,1,1,1,1,1,1,1,1,1,1,1,1,0,1,0,1,1,1,0,1,1,1,1,0,1,1,1,
     0,1,0,1,1,0,0,1,1,1,1,1,0,1,1,1,1,0,1,1,0,0,1,1,1,1,1,1,1,1,0,1,
     1,1,1,1,0,1,1,1,1,1,0,1,1,1,1,0,1,1,1,0,1,1,1,1,0,0,1,1,1,1,1,1,
     1,1,1,1,1,1,1,0,0,1,1,1,1,1,1,1,0,0,1,1,1,1,1,0,1,1,0,1,1,1,1,1,
     1,1,1,1,1,1,0,1,1,0,1,0,1,1,0,1,1,1,1,1,1,1,1,1,1,1,0,1,1,0,1,1,
     1,1,1,0,0,1,1,1,1,1,1,1,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,1,1,1,
     1,0,1,1,1,0,1,1,1,1,0,1,1,1,1,1,0,1,1,1,1,1,0,1,1,1,1,1,1,1,1,
     0,0,1,1,0,1,1,1,1,0,1,1,1,1,1,0,0,1,1,0,1,0,1,1,1,0,1,1,1,1,0,1,
     1,1,0,1,0,1,1,1,1,1,1,1,1,1,1,1,1,0,1,0,1,1,1,0,1,1,1,1,0,1,1,1,
     0,1,0,1,1,0,0,1,1,1,1,1,0,1,1,1,1,0,1,1,0,0,1,1,1,1,1,1,1,1,0,1,
     1,1,1,1,0,1,1,1,1,1,0,1,1,1,1,0,1,1,1,0,1,1,1,1,0,0,1,1,1,1,1,1,
     1,1,1,1,1,1,1,0,0,1,1,1,1,1,1,1,0,0,1,1,1,1,1,0,1,1,0,1,1,1,1,1,
     1,1,1,1,1,1,0,1,1,0,1,0,1,1,0,1,1,1,1,1,1,1,1,1,1,1,0,1,1,0,1,1,
     1,1,1,0,0,1,1,1,1,1,1,1,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,1,1,1,
     1,0,1,1,1,0,1,1,1,1,0,1,1,1,1,1,0,1,1,1,1,1,0,1,1,1,1,1,1,1,1,
     0,0};

    inline bool square( int64 x ) {
        // Quickfail
        if( x < 0 || (x&2) || ((x & 7) == 5) || ((x & 11) == 8) )
            return false;
        if( x == 0 )
            return true;

        // Check mod 255 = 3 * 5 * 17, for fun
        int64 y = x;
        y = (y & 4294967295LL) + (y >> 32);
        y = (y & 65535) + (y >> 16);
        y = (y & 255) + ((y >> 8) & 255) + (y >> 16);
        if( bad255[y] )
            return false;

        // Divide out powers of 4 using binary search
        if((x & 4294967295LL) == 0)
            x >>= 32;
        if((x & 65535) == 0)
            x >>= 16;
        if((x & 255) == 0)
            x >>= 8;
        if((x & 15) == 0)
            x >>= 4;
        if((x & 3) == 0)
            x >>= 2;

        if((x & 7) != 1)
            return false;

        // Compute sqrt using something like Hensel's lemma
        int64 r, t, z;
        r = start[(x >> 3) & 1023];
        do {
            z = x - r * r;
            if( z == 0 )
                return true;
            if( z < 0 )
                return false;
            t = z & (-z);
            r += (z & t) >> 1;
            if( r > (t >> 1) )
                r = t - r;
        } while( t <= (1LL << 33) );

        return false;
    }

User · Answer

It's been pointed out that the last d digits of a perfect square can only take on certain values. The last d digits (in base b) of a number n are the same as the remainder when n is divided by b^d, i.e. in C notation n % pow(b, d). This can be generalized to any modulus m, i.e. n % m can be used to rule out some percentage of numbers from being perfect squares. The modulus you are currently using is 64, which allows 12, i.e. 19% of remainders, as possible squares. With a little coding I found the modulus 110880, which allows only 2016, i.e. 1.8% of remainders, as possible squares. So depending on the cost of a modulus operation (i.e. division) and a table lookup versus a square root on your machine, using this modulus might be faster.

By the way, if Java has a way to store a packed array of bits for the lookup table, don't use it. 110880 32-bit words is not much RAM these days, and fetching a machine word is going to be faster than fetching a single bit.
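The residue counts quoted above can be reproduced with a few lines of Java. This is only a sketch: the class and method names (`SquareResidues`, `table`, `count`, `mayBeSquare`) are made up for illustration, and the table is a plain `boolean[]` rather than packed bits, following the advice above.

```java
final class SquareResidues {
    // table[r] is true when r is the remainder of some square modulo m.
    static boolean[] table(int m) {
        boolean[] isResidue = new boolean[m];
        for (long r = 0; r < m; r++) {
            isResidue[(int) (r * r % m)] = true;
        }
        return isResidue;
    }

    // Number of remainders that a perfect square can leave modulo m.
    static int count(boolean[] isResidue) {
        int n = 0;
        for (boolean b : isResidue) {
            if (b) n++;
        }
        return n;
    }

    // Cheap pre-filter: false means n is certainly not a perfect square.
    static boolean mayBeSquare(long n, boolean[] isResidue) {
        return n >= 0 && isResidue[(int) (n % isResidue.length)];
    }
}
```

A value that passes `mayBeSquare` still needs the real square-root test; the table only rejects.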

User · Answer

I checked all of the possible results when the last n bits of a square are observed. By successively examining more bits, up to 5/6 of inputs can be eliminated. I actually designed this to implement Fermat's factorization algorithm, and it is very fast there.

    public static boolean isSquare(final long val) {
        if ((val & 2) == 2 || (val & 7) == 5) {
            return false;
        }
        if ((val & 11) == 8 || (val & 31) == 20) {
            return false;
        }

        if ((val & 47) == 32 || (val & 127) == 80) {
            return false;
        }

        if ((val & 191) == 128 || (val & 511) == 320) {
            return false;
        }

        // if ((val & a) == b || (val & c) == d) {
        //     return false;
        // }

        if (!modSq[(int) (val % modSq.length)]) {
            return false;
        }

        final long root = (long) Math.sqrt(val);
        return root * root == val;
    }

The last bit of pseudocode can be used to extend the tests to eliminate more values. The tests above are for k = 0, 1, 2, 3.

a is of the form (3 << 2k) - 1
b is of the form (2 << 2k)
c is of the form (2 << 2k + 2) - 1
d is of the form 5 << 2k (that is 5, 20, 80, 320 in the tests above)

It first tests whether the value is a square residue modulo powers of two, then it tests against a final modulus, then it uses Math.sqrt to do a final test. I came up with the idea from the top post, and attempted to extend upon it. I appreciate any comments or suggestions.

Update: Using the test by a modulus (modSq) and a modulus base of 44352, my test runs in 96% of the time of the one in the OP's update for numbers up to 1,000,000,000.
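The a, b, c, d pattern can also be generated for any k instead of being written out by hand. The following is a rough sketch, not the answer's own code: `MaskFilters`, `rejectedAtLevel` and `rejected` are hypothetical names, and d is taken as 5 << 2k so that it matches the constants 5, 20, 80, 320 used in the tests.

```java
final class MaskFilters {
    // True when x is ruled out by the k-th pair of tests (k = 0 gives the masks 2 and 7).
    // Level k rejects x = 4^k * m with m = 2 or 3 (mod 4), or m = 5 (mod 8).
    static boolean rejectedAtLevel(long x, int k) {
        long a = (3L << (2 * k)) - 1;
        long b = 2L << (2 * k);
        long c = (2L << (2 * k + 2)) - 1;
        long d = 5L << (2 * k);
        return (x & a) == b || (x & c) == d;
    }

    // Applies levels 0 .. levels-1; keep levels small enough that the shifts stay inside a long.
    static boolean rejected(long x, int levels) {
        for (int k = 0; k < levels; k++) {
            if (rejectedAtLevel(x, k)) return true;
        }
        return false;
    }
}
```

Level k removes a fraction (5/8) / 4^k of uniformly random inputs, and these fractions sum to 5/6, which is where the figure quoted above comes from.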

User · Answer

Maybe the best algorithm for the problem is a fast integer square root algorithm: https://stackoverflow.com/a/51585204/5191852. There, Kde claims that three iterations of Newton's method would be sufficient for an accuracy of ±1 for 32-bit integers. Certainly, more iterations are needed for 64-bit integers, maybe 6 or 7.

User · Answer

I'm pretty late to the party, but I hope to provide a better answer: shorter and (assuming my benchmark is correct) also much faster.

    long goodMask; // 0xC840C04048404040 computed below
    {
        for (int i=0; i<64; ++i) goodMask |= Long.MIN_VALUE >>> (i*i);
    }

    public boolean isSquare(long x) {
        // This tests if the 6 least significant bits are right.
        // Moving the to be tested bit to the highest position saves us masking.
        if (goodMask << x >= 0) return false;
        final int numberOfTrailingZeros = Long.numberOfTrailingZeros(x);
        // Each square ends with an even number of zeros.
        if ((numberOfTrailingZeros & 1) != 0) return false;
        x >>= numberOfTrailingZeros;
        // Now x is either 0 or odd.
        // In binary each odd square ends with 001.
        // Postpone the sign test until now; handle zero in the branch.
        if ((x&7) != 1 | x <= 0) return x == 0;
        // Do it in the classical way.
        // The correctness is not trivial as the conversion from long to double is lossy!
        final long tst = (long) Math.sqrt(x);
        return tst * tst == x;
    }

The first test catches most non-squares quickly. It uses a 64-item table packed in a long, so there's no array access cost (indirection and bounds checks). For a uniformly random long, there's an 81.25% probability of ending here.

The second test catches all numbers having an odd number of twos in their factorization. The method Long.numberOfTrailingZeros is very fast as it gets JIT-ed into a single i86 instruction.

After dropping the trailing zeros, the third test handles numbers ending with 011, 101, or 111 in binary, which are not perfect squares. It also takes care of negative numbers and handles 0.

The final test falls back to double arithmetic. As double has only a 53-bit mantissa, the conversion from long to double includes rounding for big values. Nonetheless, the test is correct (unless the proof is wrong).

Trying to incorporate the mod255 idea wasn't successful.
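To see where the constant 0xC840C04048404040 comes from, the mask can be recomputed on its own. This is a small sketch with made-up names (`GoodMask`, `compute`, `passesLowSixBits`); it relies on Java using only the low six bits of the shift distance when the left operand is a long.

```java
final class GoodMask {
    // Sets bit (63 - r) for every residue r that a square can leave modulo 64.
    static long compute() {
        long mask = 0;
        for (int i = 0; i < 64; ++i) {
            mask |= Long.MIN_VALUE >>> (i * i); // shift distance is reduced mod 64
        }
        return mask;
    }

    // Moves the bit for (x mod 64) into the sign position; a set bit makes the result negative.
    static boolean passesLowSixBits(long x) {
        return (compute() << x) < 0;
    }
}
```

Exactly 12 of the 64 bits are set, so 52 out of 64 residues (81.25%) are rejected by this first test.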

User · Answer

For performance, you very often have to make some compromises. Others have described various methods; however, you noted that Carmack's hack was faster up to certain values of N. So you should check n, and if it is less than that number N, use Carmack's hack; otherwise use some other method described in the answers here.

User · Answer

If you want speed, given that your integers are of finite size, I suspect that the quickest way would involve (a) partitioning the parameters by size (e.g. into categories by largest bit set), then checking the value against an array of perfect squares within that range.
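One way to read this suggestion, sketched for 32-bit ints only (a full table for 64-bit values is impractical, as the question notes). The names `SquareTable`, `SQUARES` and `BUCKET_START` are hypothetical, and the binary search inside each bucket is my assumption about how "checking against an array" might be done.

```java
import java.util.Arrays;

final class SquareTable {
    private static final int MAX_ROOT = 46340; // floor(sqrt(Integer.MAX_VALUE))
    private static final int[] SQUARES = new int[MAX_ROOT + 1];
    // BUCKET_START[b] = index of the first square whose bit length is at least b.
    private static final int[] BUCKET_START = new int[33];

    static {
        for (int r = 0; r <= MAX_ROOT; r++) {
            SQUARES[r] = r * r;
        }
        int idx = 0;
        for (int b = 0; b <= 32; b++) {
            while (idx <= MAX_ROOT && bitLength(SQUARES[idx]) < b) idx++;
            BUCKET_START[b] = idx;
        }
    }

    // Position of the highest set bit plus one; 0 for n == 0.
    private static int bitLength(int n) {
        return 32 - Integer.numberOfLeadingZeros(n);
    }

    static boolean isPerfectSquare(int n) {
        if (n < 0) return false;
        int b = bitLength(n); // partition by largest bit set
        // Search only among the squares that have the same bit length as n.
        return Arrays.binarySearch(SQUARES, BUCKET_START[b], BUCKET_START[b + 1], n) >= 0;
    }
}
```

Whether this beats Math.sqrt depends on cache behaviour, so it would need benchmarking like everything else in this thread.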

User · Answer

You should get rid of the 2-power part of N right from the start.

2nd Edit: The magical expression for m below should be

    m = N - (N & (N-1));

and not as written. End of 2nd edit.

    m = N & (N-1); // the lowest bit of N
    N /= m;
    byte = N & 0x0F;
    if ((m % 2) || (byte != 1 && byte != 9))
        return false;

1st Edit: Minor improvement:

    m = N & (N-1); // the lowest bit of N
    N /= m;
    if ((m % 2) || ((N & 0x07) != 1))
        return false;

End of 1st edit.

Now continue as usual. This way, by the time you get to the floating point part, you have already got rid of all the numbers whose 2-power part is odd (about half), and then you only consider 1/8 of what's left. I.e. you run the floating point part on 6% of the numbers.
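Here is how I read the idea in Java, as a sketch rather than a translation of the pseudocode above: the lowest set bit is taken with `n & -n` (which equals the corrected expression `N - (N & (N-1))`), and the "2-power part is odd" test is interpreted as "the exponent of 2 is odd". `TwoPowerFilter` and its method names are made up for illustration.

```java
final class TwoPowerFilter {
    // Lowest set bit of n; the same value as n - (n & (n - 1)).
    static long lowestSetBit(long n) {
        return n & -n;
    }

    // False means n is certainly not a perfect square; true means "still a candidate".
    static boolean mayBeSquare(long n) {
        if (n <= 0) return n == 0;
        int k = Long.numberOfTrailingZeros(lowestSetBit(n)); // exponent of 2 in n
        if ((k & 1) != 0) return false;                      // odd exponent: not a square
        long odd = n >>> k;                                  // strip the power of two
        return (odd & 7) == 1;                               // odd squares are 1 modulo 8
    }
}
```

Only values that survive this filter would go on to the floating-point square root.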

User · Answer

This is a rework from decimal to binary of the old Marchant calculator algorithm (sorry, I don't have a reference), in Ruby, adapted specifically for this question:

    def isexactsqrt(v)
        value = v.abs
        residue = value
        root = 0
        onebit = 1
        onebit <<= 8 while (onebit < residue)
        onebit >>= 2 while (onebit > residue)
        while (onebit > 0)
            x = root + onebit
            if (residue >= x) then
                residue -= x
                root = x + onebit
            end
            root >>= 1
            onebit >>= 2
        end
        return (residue == 0)
    end

Here's a workup of something similar (please don't vote me down for coding style/smells or clunky O/O - it's the algorithm that counts, and C++ is not my home language). In this case, we're looking for residue == 0:

    #include <iostream>

    using namespace std;
    typedef unsigned long long int llint;

    class ISqrt {           // Integer Square Root
        llint value;        // Integer whose square root is required
        llint root;         // Result: floor(sqrt(value))
        llint residue;      // Result: value-root*root
        llint onebit, x;    // Working bit, working value

    public:

        ISqrt(llint v = 2) {    // Constructor
            Root(v);            // Take the root
        };

        llint Root(llint r) {   // Resets and calculates new square root
            value = r;          // Store input
            residue = value;    // Initialise for subtracting down
            root = 0;           // Clear root accumulator

            onebit = 1;                             // Calculate start value of counter
            onebit <<= (8*sizeof(llint)-2);         // Set up counter bit as greatest odd power of 2
            while (onebit > residue) {onebit >>= 2; };  // Shift down until just < value

            while (onebit > 0) {
                x = root + onebit;          // Will check root+1bit (root bit corresponding to onebit is always zero)
                if (residue >= x) {         // Room to subtract?
                    residue -= x;           // Yes - deduct from residue
                    root = x + onebit;      // and step root
                };
                root >>= 1;
                onebit >>= 2;
            };
            return root;
        };

        llint Residue() {           // Returns residue from last calculation
            return residue;
        };
    };

    int main() {
        llint big, i, q, r, v, delta;
        big = 0; big = (big-1);             // Kludge for "big number"
        ISqrt b;                            // Make q sqrt generator
        for ( i = big; i > 0 ; i /= 7 ) {   // for several numbers
            q = b.Root(i);                  // Get the square root
            r = b.Residue();                // Get the residue
            v = q*q+r;                      // Recalc original value
            delta = v-i;                    // And diff, hopefully 0
            cout << i << ": " << q << " ++ " << r << " V: " << v << " Delta: " << delta << "\n";
        };
        return 0;
    };
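Since the question is about Java, here is a rough port of the same bit-by-bit method for non-negative long values. It is a sketch under my own assumptions: the class name `BitwiseSqrt` is made up, and `root` is shifted with `>>>` because the very first step can momentarily set the sign bit of a signed long.

```java
final class BitwiseSqrt {
    // True when value is a perfect square, using only shifts, adds and compares.
    static boolean isExactSquare(long value) {
        if (value < 0) return false;
        long residue = value;
        long root = 0;
        long onebit = 1L << 62;                 // highest power of 4 that fits in a long
        while (onebit > residue) onebit >>= 2;  // shift down until onebit <= value
        while (onebit > 0) {
            long x = root + onebit;
            if (residue >= x) {
                residue -= x;
                root = x + onebit;              // may reach 2^63 on the first pass
            }
            root >>>= 1;                        // unsigned shift repairs the sign bit
            onebit >>= 2;
        }
        return residue == 0;                    // root now holds floor(sqrt(value))
    }
}
```

As the later benchmarks in this thread suggest, a loop like this is unlikely to beat Math.sqrt on hardware with a fast floating-point unit; it is mostly useful where floating point is unavailable.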

User · Answer

You'll have to do some benchmarking. The best algorithm will depend on the distribution of your inputs.

Your algorithm may be nearly optimal, but you might want to do a quick check to rule out some possibilities before calling your square root routine. For example, look at the last digit of your number in hex by doing a bit-wise "and". Perfect squares can only end in 0, 1, 4, or 9 in base 16. So for 75% of your inputs (assuming they are uniformly distributed) you can avoid a call to the square root in exchange for some very fast bit twiddling.

Kip benchmarked the following code implementing the hex trick. When testing numbers 1 through 100,000,000, this code ran twice as fast as the original.

    public final static boolean isPerfectSquare(long n)
    {
        if (n < 0)
            return false;

        switch((int)(n & 0xF))
        {
        case 0: case 1: case 4: case 9:
            long tst = (long)Math.sqrt(n);
            return tst*tst == n;

        default:
            return false;
        }
    }

When I tested the analogous code in C++, it actually ran slower than the original. However, when I eliminated the switch statement, the hex trick once again made the code twice as fast.

    int isPerfectSquare(int n)
    {
        int h = n & 0xF;  // h is the last hex "digit"
        if (h > 9)
            return 0;
        // Use lazy evaluation to jump out of the if statement as soon as possible
        if (h != 2 && h != 3 && h != 5 && h != 6 && h != 7 && h != 8)
        {
            int t = (int) floor( sqrt((double) n) + 0.5 );
            return t*t == n;
        }
        return 0;
    }

Eliminating the switch statement had little effect on the C# code.

User · Answer

"I want this function to work with all positive 64-bit signed integers"

Math.sqrt() works with doubles as input parameters, so you won't get accurate results for integers bigger than 2^53.
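The rounding is easy to observe: 2^53 + 1 is the first long that a double cannot represent, so converting it to double and back gives 2^53. One possible way to stay safe, shown below purely as a sketch, is to treat Math.sqrt as an estimate and correct it with integer arithmetic. The class `SqrtPrecision` and the correction loops are my assumption, not something proposed in this answer.

```java
final class SqrtPrecision {
    private static final long MAX_ROOT = 3037000499L; // floor(sqrt(Long.MAX_VALUE))

    // Shows the loss: true when n does not survive a round trip through double.
    static boolean losesPrecision(long n) {
        return (long) (double) n != n;
    }

    // floor(sqrt(n)) for n >= 0: start from the double estimate, then fix it up exactly.
    static long floorSqrt(long n) {
        if (n < 0) throw new IllegalArgumentException("negative input");
        long r = (long) Math.sqrt((double) n);
        while (r > MAX_ROOT || r * r > n) r--;                 // estimate too high
        while (r < MAX_ROOT && (r + 1) * (r + 1) <= n) r++;    // estimate too low
        return r;
    }

    static boolean isPerfectSquare(long n) {
        if (n < 0) return false;
        long r = floorSqrt(n);
        return r * r == n;
    }
}
```

Other answers in this thread argue that the plain `(long) Math.sqrt(x)` test is in fact correct for all longs; the correction loops simply make that independent of any such proof.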

User · Answer

Don't know about fastest, but the simplest is to take the square root in the normal fashion, multiply the result by itself, and see if it matches your original value.

Since we're talking integers here, the fastest would probably involve a collection where you can just make a lookup.

User · Answer

Newton's Method with integer arithmetic

If you wish to avoid non-integer operations you could use the method below. It basically uses Newton's Method modified for integer arithmetic.

    /**
     * Test if the given number is a perfect square.
     * @param n Must be greater than 0 and less
     *    than Long.MAX_VALUE.
     * @return <code>true</code> if n is a perfect
     *    square, or <code>false</code> otherwise.
     */
    public static boolean isSquare(long n)
    {
        long x1 = n;
        long x2 = 1L;

        while (x1 > x2)
        {
            x1 = (x1 + x2) / 2L;
            x2 = n / x1;
        }

        return x1 == x2 && n % x1 == 0L;
    }

This implementation cannot compete with solutions that use Math.sqrt. However, its performance can be improved by using the filtering mechanisms described in some of the other posts.

User · Answer

I ran my own analysis of several of the algorithms in this thread and came up with some new results. You can see the old results in the edit history of this answer, but they're not accurate, as I made a mistake and wasted time analyzing several algorithms which aren't close. However, pulling lessons from several different answers, I now have two algorithms that crush the "winner" of this thread. Here's the core thing I do differently than everyone else:

    // This is faster because a number is divisible by 2^4 or more only 6% of the time
    // and more than that a vanishingly small percentage.
    while((x & 0x3) == 0) x >>= 2;
    // This is effectively the same as the switch-case statement used in the original
    // answer.
    if((x & 0x7) != 1) return false;

This simple line, which most of the time adds one or two very fast instructions, greatly simplifies the switch-case statement into one if statement. However, it can add to the runtime if many of the tested numbers have significant power-of-two factors.

The algorithms below are as follows:

Internet - Kip's posted answer
Durron - My modified answer using the one-pass answer as a base
DurronTwo - My modified answer using the two-pass answer (by JohnnyHeggheim), with some other slight modifications.

Here is a sample runtime if the numbers are generated using Math.abs(java.util.Random.nextLong()):

     0% Scenario{vm=java, trial=0, benchmark=Internet} 39673.40 ns; σ=378.78 ns @ 3 trials
    33% Scenario{vm=java, trial=0, benchmark=Durron} 37785.75 ns; σ=478.86 ns @ 10 trials
    67% Scenario{vm=java, trial=0, benchmark=DurronTwo} 35978.10 ns; σ=734.10 ns @ 10 trials

    benchmark   us
     Internet 39.7
       Durron 37.8
    DurronTwo 36.0

    vm: java
    trial: 0

And here is a sample runtime if it's run on the first million longs only:

     0% Scenario{vm=java, trial=0, benchmark=Internet} 2933380.84 ns; σ=56939.84 ns @ 10 trials
    33% Scenario{vm=java, trial=0, benchmark=Durron} 2243266.81 ns; σ=50537.62 ns @ 10 trials
    67% Scenario{vm=java, trial=0, benchmark=DurronTwo} 3159227.68 ns; σ=10766.22 ns @ 3 trials

    benchmark   ms
     Internet 2.93
       Durron 2.24
    DurronTwo 3.16

    vm: java
    trial: 0

As you can see, DurronTwo does better for large inputs, because it gets to use the magic trick very very often, but gets clobbered compared to the first algorithm and Math.sqrt because the numbers are so much smaller. Meanwhile, the simpler Durron is a huge winner because it never has to divide by 4 many many times in the first million numbers.

Here's Durron:

    public final static boolean isPerfectSquareDurron(long n) {
        if(n < 0) return false;
        if(n == 0) return true;

        long x = n;
        // This is faster because a number is divisible by 16 only 6% of the time
        // and more than that a vanishingly small percentage.
        while((x & 0x3) == 0) x >>= 2;
        // This is effectively the same as the switch-case statement used in the original
        // answer.
        if((x & 0x7) == 1) {

            long sqrt;
            if(x < 410881L)
            {
                int i;
                float x2, y;

                x2 = x * 0.5F;
                y  = x;
                i  = Float.floatToRawIntBits(y);
                i  = 0x5f3759df - ( i >> 1 );
                y  = Float.intBitsToFloat(i);
                y  = y * ( 1.5F - ( x2 * y * y ) );

                sqrt = (long)(1.0F/y);
            } else {
                sqrt = (long) Math.sqrt(x);
            }
            return sqrt*sqrt == x;
        }
        return false;
    }

And DurronTwo:

    public final static boolean isPerfectSquareDurronTwo(long n) {
        if(n < 0) return false;
        // Needed to prevent infinite loop
        if(n == 0) return true;

        long x = n;
        while((x & 0x3) == 0) x >>= 2;
        if((x & 0x7) == 1) {
            long sqrt;
            if (x < 41529141369L) {
                int i;
                float x2, y;

                x2 = x * 0.5F;
                y = x;
                i = Float.floatToRawIntBits(y);
                // using the magic number from
                // http://www.lomont.org/Math/Papers/2003/InvSqrt.pdf
                // since it is more accurate
                i = 0x5f375a86 - (i >> 1);
                y = Float.intBitsToFloat(i);
                y = y * (1.5F - (x2 * y * y));
                y = y * (1.5F - (x2 * y * y)); // Newton iteration, more accurate
                sqrt = (long) ((1.0F/y) + 0.2);
            } else {
                // Carmack hack gives incorrect answer for n >= 41529141369.
                sqrt = (long) Math.sqrt(x);
            }
            return sqrt*sqrt == x;
        }
        return false;
    }

And my benchmark harness (requires Google caliper 0.1-rc5):

    public class SquareRootBenchmark {
        public static class Benchmark1 extends SimpleBenchmark {
            private static final int ARRAY_SIZE = 10000;
            long[] trials = new long[ARRAY_SIZE];

            @Override
            protected void setUp() throws Exception {
                Random r = new Random();
                for (int i = 0; i < ARRAY_SIZE; i++) {
                    trials[i] = Math.abs(r.nextLong());
                }
            }

            public int timeInternet(int reps) {
                int trues = 0;
                for(int i = 0; i < reps; i++) {
                    for(int j = 0; j < ARRAY_SIZE; j++) {
                        if(SquareRootAlgs.isPerfectSquareInternet(trials[j])) trues++;
                    }
                }

                return trues;
            }

            public int timeDurron(int reps) {
                int trues = 0;
                for(int i = 0; i < reps; i++) {
                    for(int j = 0; j < ARRAY_SIZE; j++) {
                        if(SquareRootAlgs.isPerfectSquareDurron(trials[j])) trues++;
                    }
                }

                return trues;
            }

            public int timeDurronTwo(int reps) {
                int trues = 0;
                for(int i = 0; i < reps; i++) {
                    for(int j = 0; j < ARRAY_SIZE; j++) {
                        if(SquareRootAlgs.isPerfectSquareDurronTwo(trials[j])) trues++;
                    }
                }

                return trues;
            }
        }

        public static void main(String... args) {
            Runner.main(Benchmark1.class, args);
        }
    }

UPDATE: I've made a new algorithm that is faster in some scenarios and slower in others; I've gotten different benchmarks based on different inputs. If we calculate modulo 0xFFFFFF = 3 x 3 x 5 x 7 x 13 x 17 x 241, we can eliminate 97.82% of numbers that cannot be squares. This can be (sort of) done in one line, with 5 bitwise operations:

    if (!goodLookupSquares[(int) ((n & 0xFFFFFFl) + ((n >> 24) & 0xFFFFFFl) + (n >> 48))]) return false;

The resulting index is either 1) the residue, 2) the residue + 0xFFFFFF, or 3) the residue + 0x1FFFFFE. Of course, we need to have a lookup table for residues modulo 0xFFFFFF, which is about a 3mb file (in this case stored as ascii text decimal numbers, not optimal but clearly improvable with a ByteBuffer and so forth). But since that is precalculation it doesn't matter so much. You can find the file here (or generate it yourself):

    public final static boolean isPerfectSquareDurronThree(long n) {
        if(n < 0) return false;
        if(n == 0) return true;

        long x = n;
        while((x & 0x3) == 0) x >>= 2;
        if((x & 0x7) == 1) {
            if (!goodLookupSquares[(int) ((n & 0xFFFFFFl) + ((n >> 24) & 0xFFFFFFl) + (n >> 48))]) return false;
            long sqrt;
            if(x < 410881L)
            {
                int i;
                float x2, y;

                x2 = x * 0.5F;
                y  = x;
                i  = Float.floatToRawIntBits(y);
                i  = 0x5f3759df - ( i >> 1 );
                y  = Float.intBitsToFloat(i);
                y  = y * ( 1.5F - ( x2 * y * y ) );

                sqrt = (long)(1.0F/y);
            } else {
                sqrt = (long) Math.sqrt(x);
            }
            return sqrt*sqrt == x;
        }
        return false;
    }

I load it into a boolean array like this:

    private static boolean[] goodLookupSquares = null;

    public static void initGoodLookupSquares() throws Exception {
        Scanner s = new Scanner(new File("24residues_squares.txt"));

        // Sized so that the largest index written below (residue + 0x1FFFFFE) fits;
        // an array of length 0x1FFFFFE would overflow on the third store.
        goodLookupSquares = new boolean[0x2FFFFFD];

        while(s.hasNextLine()) {
            int residue = Integer.valueOf(s.nextLine());
            goodLookupSquares[residue] = true;
            goodLookupSquares[residue + 0xFFFFFF] = true;
            goodLookupSquares[residue + 0x1FFFFFE] = true;
        }

        s.close();
    }

Example runtime. It beat Durron (version one) in every trial I ran.

     0% Scenario{vm=java, trial=0, benchmark=Internet} 40665.77 ns; σ=566.71 ns @ 10 trials
    33% Scenario{vm=java, trial=0, benchmark=Durron} 38397.60 ns; σ=784.30 ns @ 10 trials
    67% Scenario{vm=java, trial=0, benchmark=DurronThree} 36171.46 ns; σ=693.02 ns @ 10 trials

      benchmark   us
       Internet 40.7
         Durron 38.4
    DurronThree 36.2

    vm: java
    trial: 0
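The one-line index works because 2^24 leaves remainder 1 modulo 0xFFFFFF, so adding the three 24-bit "digits" of n gives a number congruent to n. The sketch below checks that and builds the residue table in memory instead of reading a file. `Mod24` and its methods are hypothetical names, and reducing the folded index with two conditional subtractions (instead of tripling the table) is my own variation.

```java
final class Mod24 {
    static final int M = 0xFFFFFF; // 2^24 - 1 = 3 * 3 * 5 * 7 * 13 * 17 * 241

    // Sum of the 24-bit digits of a non-negative n; congruent to n modulo M.
    static int fold(long n) {
        return (int) ((n & 0xFFFFFFL) + ((n >> 24) & 0xFFFFFFL) + (n >> 48));
    }

    // isResidue[r] is true when some square leaves remainder r modulo M.
    static boolean[] buildTable() {
        boolean[] isResidue = new boolean[M];
        for (long r = 0; r < M; r++) {
            isResidue[(int) (r * r % M)] = true;
        }
        return isResidue;
    }

    // False means n (assumed non-negative) is certainly not a perfect square.
    static boolean mayBeSquare(long n, boolean[] isResidue) {
        int idx = fold(n);          // at most 2 * M + 0x7FFF
        if (idx >= M) idx -= M;
        if (idx >= M) idx -= M;
        return isResidue[idx];
    }
}
```

The table ends up with 365904 allowed residues out of 16777215, which matches the 97.82% rejection rate quoted above.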

User · Answer

It ought to be possible to pack the "cannot be a perfect square if the last X digits are N" much more efficiently than that! I'll use Java 32-bit ints, and produce enough data to check the last 16 bits of the number - that's 2048 hexadecimal int values.

...

Ok. Either I have run into some number theory that is a little beyond me, or there is a bug in my code. In any case, here is the code:

    public static void main(String[] args) {
        final int BITS = 16;

        BitSet foo = new BitSet();

        for(int i = 0; i < (1<<BITS); i++) {
            int sq = (i*i);
            sq = sq & ((1<<BITS)-1);
            foo.set(sq);
        }

        System.out.println("int[] mayBeASquare = {");

        for(int i = 0; i < 1<<(BITS-5); i++) {
            int kk = 0;
            for(int j = 0; j < 32; j++) {
                if(foo.get((i << 5) | j)) {
                    kk |= 1<<j;
                }
            }
            System.out.print("0x" + Integer.toHexString(kk) + ", ");
            if(i%8 == 7) System.out.println();
        }
        System.out.println("};");
    }

and here are the results:

(ed: elided for poor performance in prettify.js; view revision history to see.)

User · Answer

If you do a binary chop to try to find the "right" square root, you can fairly easily detect if the value you've got is close enough to tell:

(n+1)^2 = n^2 + 2n + 1
(n-1)^2 = n^2 - 2n + 1

So having calculated n^2, the options are:

n^2 = target: done, return true
n^2 + 2n + 1 > target > n^2: you're close, but it's not perfect: return false
n^2 - 2n + 1 < target < n^2: ditto
target < n^2 - 2n + 1: binary chop on a lower n
target > n^2 + 2n + 1: binary chop on a higher n

(Sorry, this uses n as your current guess, and target for the parameter. Apologies for the confusion!)

I don't know whether this will be faster or not, but it's worth a try.

EDIT: The binary chop doesn't have to take in the whole range of integers, either: (2^x)^2 = 2^(2x), so once you've found the top set bit in your target (which can be done with a bit-twiddling trick; I forget exactly how) you can quickly get a range of potential answers. Mind you, a naive binary chop is still only going to take up to 31 or 32 iterations.
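The cases above translate fairly directly into Java. This is a sketch under a few assumptions of mine: the class name `BinaryChop` is invented, the top set bit is found with `Long.numberOfLeadingZeros`, and the upper bound is capped at 3037000499 (the largest root whose square fits in a long) to keep `mid * mid` from overflowing.

```java
final class BinaryChop {
    static boolean isPerfectSquare(long target) {
        if (target < 0) return false;
        if (target < 2) return true;
        int bits = 64 - Long.numberOfLeadingZeros(target); // index of the top set bit, plus one
        // A root of a number with this many bits lies in [2^floor((bits-1)/2), 2^ceil(bits/2)).
        long lo = 1L << ((bits - 1) / 2);
        long hi = Math.min((1L << ((bits + 1) / 2)) - 1, 3037000499L);
        while (lo <= hi) {
            long mid = (lo + hi) >>> 1;
            long sq = mid * mid;
            if (sq == target) return true;
            if (sq < target) {
                // mid^2 < target < (mid+1)^2: close, but not a perfect square
                if (target - sq < 2 * mid + 1) return false;
                lo = mid + 1;
            } else {
                // (mid-1)^2 < target < mid^2: ditto
                if (sq - target < 2 * mid - 1) return false;
                hi = mid - 1;
            }
        }
        return false;
    }
}
```

Narrowing the range by the top bit roughly halves the number of iterations compared with searching every possible root.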
