Float and double datatype in Java

Question

The float data type is a single-precision 32-bit IEEE 754 floating point and the double data type is a double-precision 64-bit IEEE 754 floating point. What does it mean? And when should I use float instead of double or vice versa?

User · Answer

This example illustrates how to extract the sign (the leftmost bit), exponent (the 8 following bits) and mantissa (the 23 rightmost bits) from a float in Java.

int bits = Float.floatToIntBits(-0.005f);
int sign = bits >>> 31;
int exp = (bits >>> 23 & ((1 << 8) - 1)) - ((1 << 7) - 1);
int mantissa = bits & ((1 << 23) - 1);
System.out.println(sign + " " + exp + " " + mantissa + " " +
  Float.intBitsToFloat((sign << 31) | (exp + ((1 << 7) - 1)) << 23 | mantissa));

The same approach can be used for doubles (11-bit exponent and 52-bit mantissa).

long bits = Double.doubleToLongBits(-0.005);
long sign = bits >>> 63;
long exp = (bits >>> 52 & ((1 << 11) - 1)) - ((1 << 10) - 1);
long mantissa = bits & ((1L << 52) - 1);
System.out.println(sign + " " + exp + " " + mantissa + " " +
  Double.longBitsToDouble((sign << 63) | (exp + ((1 << 10) - 1)) << 52 | mantissa));

Credit: http://s-j.github.io/java-float/
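
As an additional sketch (not part of the credited post), the three extracted fields can be checked against the IEEE 754 formula for normal values, value = (-1)^sign * 2^exp * (1 + mantissa / 2^23). The class name FloatFields and the method rebuild are hypothetical names chosen for illustration, and the helper assumes a finite, non-zero, non-subnormal input.

```java
// Illustrative only: FloatFields and rebuild are hypothetical names.
final class FloatFields {
    // Rebuilds a normal float from its sign bit, unbiased exponent and 23-bit mantissa
    // using value = (-1)^sign * 2^exp * (1 + mantissa / 2^23).
    static float rebuild(int sign, int exp, int mantissa) {
        double fraction = 1.0 + mantissa / (double) (1 << 23); // implicit leading 1
        double magnitude = Math.scalb(fraction, exp);          // fraction * 2^exp, exact
        return (float) (sign == 0 ? magnitude : -magnitude);
    }

    public static void main(String[] args) {
        float value = -0.005f;
        int bits = Float.floatToIntBits(value);
        int sign = bits >>> 31;
        int exp = ((bits >>> 23) & 0xFF) - 127; // remove the exponent bias of 127
        int mantissa = bits & 0x7FFFFF;         // low 23 bits
        // The arithmetic reconstruction should match the original value.
        System.out.println(rebuild(sign, exp, mantissa) == value);
    }
}
```

Doing the arithmetic in double keeps every intermediate step exact, so the final cast back to float recovers the original bit pattern.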

User · Answer

In everyday programming we rarely use float. If we can ensure that the results stay within the range and precision of the float data type, we can choose float to save memory. Generally, we use double for two reasons:

If we want a floating-point literal to be a float, we must explicitly suffix it with F or f, because by default every floating-point literal is treated as a double. This adds a burden on the programmer. If we use double, we don't need to add any suffix.

float is a single-precision data type, meaning it occupies 4 bytes and keeps only about 7 significant decimal digits, so in large computations rounding error builds up quickly. double occupies 8 bytes and keeps about 15-16 significant digits, so its results are far more accurate.

Both float and double were designed especially for scientific calculations, where approximation errors are acceptable. If accuracy is the primary concern, it is recommended to use the BigDecimal class instead of float or double.

Source - Float and double datatypes in Java
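
To put the memory argument in numbers, here is a small sketch that uses the standard constants Float.BYTES and Double.BYTES. The class and method names are hypothetical, and the figures cover only the element payload, not the JVM's per-array header.

```java
// Illustrative only: ArrayFootprint and its methods are hypothetical names.
final class ArrayFootprint {
    // Payload size in bytes of n float elements (array header not included).
    static long floatPayloadBytes(long n) {
        return n * Float.BYTES;   // Float.BYTES is 4
    }

    // Payload size in bytes of n double elements (array header not included).
    static long doublePayloadBytes(long n) {
        return n * Double.BYTES;  // Double.BYTES is 8
    }

    public static void main(String[] args) {
        long n = 1_000_000;
        System.out.println("float[]  payload: " + floatPayloadBytes(n) + " bytes");
        System.out.println("double[] payload: " + doublePayloadBytes(n) + " bytes");
    }
}
```

The saving only matters when there are many values, for example in large arrays; for a handful of local variables it is negligible.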

User · Answer

A float gives you approx. 6-7 decimal digits of precision while a double gives you approx. 15-16. Also, the range of numbers is larger for double.

A double needs 8 bytes of storage space while a float needs just 4 bytes.
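
A minimal sketch of what those digit counts mean in practice: above 2^24 a float can no longer represent every integer, while a double still can. The class and method names are hypothetical.

```java
// Illustrative only: PrecisionDemo and its methods are hypothetical names.
final class PrecisionDemo {
    // True when adding 1 to x is lost to rounding in float arithmetic.
    static boolean floatLosesIncrement(float x) {
        return x + 1f == x;
    }

    // True when adding 1 to x is lost to rounding in double arithmetic.
    static boolean doubleLosesIncrement(double x) {
        return x + 1.0 == x;
    }

    public static void main(String[] args) {
        System.out.println(floatLosesIncrement(16_777_216f));   // true: 2^24 + 1 is not a float
        System.out.println(doubleLosesIncrement(16_777_216.0)); // false: double has room to spare
        System.out.println(Float.MAX_VALUE);   // 3.4028235E38
        System.out.println(Double.MAX_VALUE);  // 1.7976931348623157E308
    }
}
```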

User · Answer

The Wikipedia page on it is a good place to start.

To sum up:

float is represented in 32 bits, with 1 sign bit, 8 bits of exponent, and 23 bits of the significand (or what follows from a scientific-notation number: 2.33728*10^12; 33728 is the significand).

double is represented in 64 bits, with 1 sign bit, 11 bits of exponent, and 52 bits of significand.

By default, Java uses double to represent its floating-point numerals (so a literal 3.14 is typed double). It's also the data type that will give you a much larger number range, so I would strongly encourage its use over float.

There may be certain libraries that actually force your usage of float, but in general, unless you can guarantee that your result will be small enough to fit in float's prescribed range, it's best to opt for double.

If you require accuracy (for instance, you can't have a decimal value that is inaccurate, like 1/10 + 2/10), or you're doing anything with currency (for example, representing $10.33 in the system), then use a BigDecimal, which can support an arbitrary amount of precision and handle situations like that elegantly.
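
A short sketch of the 1/10 + 2/10 case, comparing double arithmetic with java.math.BigDecimal built from strings. The class name ExactDecimal and the helper add are hypothetical; the second call uses made-up currency amounts purely for illustration.

```java
import java.math.BigDecimal;

// Illustrative only: ExactDecimal and add are hypothetical names.
final class ExactDecimal {
    // Adds two decimal strings exactly; the String constructor avoids
    // inheriting the binary rounding error of a double literal.
    static BigDecimal add(String a, String b) {
        return new BigDecimal(a).add(new BigDecimal(b));
    }

    public static void main(String[] args) {
        System.out.println(0.1 + 0.2);             // 0.30000000000000004
        System.out.println(add("0.1", "0.2"));     // 0.3
        System.out.println(add("10.33", "0.67"));  // 11.00
    }
}
```

Constructing the BigDecimal from a String rather than from a double is the important design choice here: new BigDecimal(0.1) would faithfully capture the already-rounded binary value.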

User · Answer

According to the IEEE standards, float is a 32-bit representation of a real number while double is a 64-bit representation.

In Java programs we mostly see the double data type used. That is to avoid overflows, since the range of numbers that double can accommodate is greater than the range of float.

Also, when high precision is required, the use of double is encouraged. A few library methods that were implemented a long time ago still require the float data type (only because they were implemented using float, nothing else).

But if you are certain that your program only needs small numbers and that no overflow will occur with float, then using float will noticeably reduce memory use, as a float requires half the memory of a double.
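
The overflow point can be illustrated with a small sketch: squaring 1e20 exceeds the largest float (about 3.4e38) but fits comfortably in a double. The class and method names are hypothetical.

```java
// Illustrative only: OverflowDemo and its methods are hypothetical names.
final class OverflowDemo {
    // Squares x in single precision; overflows to Infinity above roughly 1.8e19.
    static float squareAsFloat(float x) {
        return x * x;
    }

    // Squares x in double precision.
    static double squareAsDouble(double x) {
        return x * x;
    }

    public static void main(String[] args) {
        System.out.println(squareAsFloat(1e20f));  // Infinity
        System.out.println(squareAsDouble(1e20));  // 1.0E40
    }
}
```

Note that floating-point overflow does not throw an exception in Java; the result silently becomes Infinity, which is why the range matters.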

User · Answer

Floating-point numbers, also known as real numbers, are used when evaluating expressions that require fractional precision. For example, calculations such as square root, or transcendentals such as sine and cosine, result in a value whose precision requires a floating-point type. Java implements the standard (IEEE 754) set of floating-point types and operators. There are two kinds of floating-point types, float and double, which represent single- and double-precision numbers, respectively. Their width and ranges are shown here:

Name     Width in Bits   Range
double   64              1.7e-308 to 1.7e+308
float    32              3.4e-038 to 3.4e+038

float

The type float specifies a single-precision value that uses 32 bits of storage. Single precision is faster on some processors and takes half as much space as double precision, but will become imprecise when the values are either very large or very small. Variables of type float are useful when you need a fractional component, but don't require a large degree of precision.

Here are some example float variable declarations:

float hightemp, lowtemp;

double

Double precision, as denoted by the double keyword, uses 64 bits to store a value. Double precision is actually faster than single precision on some modern processors that have been optimized for high-speed mathematical calculations. The standard math functions, such as sin(), cos(), and sqrt(), return double values. When you need to maintain accuracy over many iterative calculations, or are manipulating large-valued numbers, double is the best choice.
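
As a rough sketch of the point about iterative calculations, the following adds 0.1 a million times in each type and compares the drift from the exact answer of 100000. The class and method names are hypothetical, and the exact error values depend on the rounding of each step, so none are quoted here.

```java
// Illustrative only: AccumulationDemo and its methods are hypothetical names.
final class AccumulationDemo {
    // Adds 0.1f to a float accumulator n times.
    static float sumFloat(int n) {
        float sum = 0f;
        for (int i = 0; i < n; i++) {
            sum += 0.1f;
        }
        return sum;
    }

    // Adds 0.1 to a double accumulator n times.
    static double sumDouble(int n) {
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += 0.1;
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        double exact = 100_000.0;
        System.out.println("float error:  " + Math.abs(sumFloat(n) - exact));
        System.out.println("double error: " + Math.abs(sumDouble(n) - exact));
    }
}
```

On a typical run the float error is many orders of magnitude larger than the double error, because each addition rounds to far fewer significant digits.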

User · Answer

You should use double instead of float for precise calculations, and float instead of double when less accuracy is acceptable. A float holds an IEEE 754 single-precision floating-point number, while a double holds an IEEE 754 double-precision floating-point number, which lets it store and compute numbers more accurately. Hope this helps.

User · Answer

This will give an error:

public class MyClass {
    public static void main(String[] args) {
        float a = 0.5;
    }
}

MyClass.java:3: error: incompatible types: possible lossy conversion from double to float
        float a = 0.5;

This will work perfectly fine:

public class MyClass {
    public static void main(String[] args) {
        double a = 0.5;
    }
}

This will also work perfectly fine:

public class MyClass {
    public static void main(String[] args) {
        float a = (float) 0.5;
    }
}

Reason: Java by default stores real numbers as double to ensure higher precision.

Double takes more space but is more precise during computation, and float takes less space but is less precise.

User · Answer

Java seems to have a bias towards using double for computations nonetheless.

Case in point: the program I wrote earlier today. The methods didn't work when I used float, but now work great when I substituted float with double (in the NetBeans IDE).

package palettedos;
import java.util.*;

class Palettedos {
    private static Scanner Z = new Scanner(System.in);
    public static final double pi = 3.142;

    public static void main(String[] args) {
        Palettedos A = new Palettedos();
        System.out.println("Enter the base and height of the triangle respectively");
        int base = Z.nextInt();
        int height = Z.nextInt();
        System.out.println("Enter the radius of the circle");
        int radius = Z.nextInt();
        System.out.println("Enter the length of the square");
        long length = Z.nextInt();
        double tArea = A.calculateArea(base, height);
        double cArea = A.calculateArea(radius);
        long sqArea = A.calculateArea(length);
        System.out.println("The area of the triangle is\t" + tArea);
        System.out.println("The area of the circle is\t" + cArea);
        System.out.println("The area of the square is\t" + sqArea);
    }

    double calculateArea(int base, int height) {
        double triArea = 0.5 * base * height;
        return triArea;
    }

    double calculateArea(int radius) {
        double circArea = pi * radius * radius;
        return circArea;
    }

    long calculateArea(long length) {
        long squaArea = length * length;
        return squaArea;
    }
}
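
The answer does not say how the float version failed. One plausible cause, and this is an assumption, is that the literal 0.5 and the constant pi remained doubles, so assigning the result to a float was rejected as a lossy conversion. Under that assumption, a float version compiles once the literals carry the f suffix. The class name FloatAreas and its members are hypothetical.

```java
// Illustrative only: FloatAreas is a hypothetical float-based variant of the
// area methods, assuming the original failure was a double-to-float conversion error.
final class FloatAreas {
    static final float PI = 3.142f;  // f suffix makes the literal a float

    static float triangleArea(int base, int height) {
        // 0.5f keeps the whole expression in float; a plain 0.5 would make it a double.
        return 0.5f * base * height;
    }

    static float circleArea(int radius) {
        return PI * radius * radius;
    }

    public static void main(String[] args) {
        System.out.println("The area of the triangle is\t" + triangleArea(3, 4));
        System.out.println("The area of the circle is\t" + circleArea(2));
    }
}
```

So the language does not prevent float arithmetic; it only refuses to narrow a double to a float implicitly.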
