[java] Float and double datatype in Java

The float data type is a single-precision 32-bit IEEE 754 floating point and the double data type is a double-precision 64-bit IEEE 754 floating point.

What does it mean? And when should I use float instead of double or vice-versa?

This question is related to java floating-point double ieee-754

The answer is


This will give error:

public class MyClass {
    public static void main(String args[]) {
        float a = 0.5;
    }
}

/MyClass.java:3: error: incompatible types: possible lossy conversion from double to float float a = 0.5;

This will work perfectly fine

public class MyClass {
    public static void main(String args[]) {
        double a = 0.5;
    }
}

This will also work perfectly fine

public class MyClass {
    public static void main(String args[]) {
        float a = (float)0.5;
    }
}

Reason : Java by default stores real numbers as double to ensure higher precision.

Double takes more space but more precise during computation and float takes less space but less precise.


Floating-point numbers, also known as real numbers, are used when evaluating expressions that require fractional precision. For example, calculations such as square root, or transcendentals such as sine and cosine, result in a value whose precision requires a floating-point type. Java implements the standard (IEEE–754) set of floatingpoint types and operators. There are two kinds of floating-point types, float and double, which represent single- and double-precision numbers, respectively. Their width and ranges are shown here:


   Name     Width in Bits   Range 
    double  64              1 .7e–308 to 1.7e+308
    float   32              3 .4e–038 to 3.4e+038


float

The type float specifies a single-precision value that uses 32 bits of storage. Single precision is faster on some processors and takes half as much space as double precision, but will become imprecise when the values are either very large or very small. Variables of type float are useful when you need a fractional component, but don't require a large degree of precision.

Here are some example float variable declarations:

float hightemp, lowtemp;


double

Double precision, as denoted by the double keyword, uses 64 bits to store a value. Double precision is actually faster than single precision on some modern processors that have been optimized for high-speed mathematical calculations. All transcendental math functions, such as sin( ), cos( ), and sqrt( ), return double values. When you need to maintain accuracy over many iterative calculations, or are manipulating large-valued numbers, double is the best choice.


According to the IEEE standards, float is a 32 bit representation of a real number while double is a 64 bit representation.

In Java programs we normally mostly see the use of double data type. It's just to avoid overflows as the range of numbers that can be accommodated using the double data type is more that the range when float is used.

Also when high precision is required, the use of double is encouraged. Few library methods that were implemented a long time ago still requires the use of float data type as a must (that is only because it was implemented using float, nothing else!).

But if you are certain that your program requires small numbers and an overflow won't occur with your use of float, then the use of float will largely improve your space complexity as floats require half the memory as required by double.


This example illustrates how to extract the sign (the leftmost bit), exponent (the 8 following bits) and mantissa (the 23 rightmost bits) from a float in Java.

int bits = Float.floatToIntBits(-0.005f);
int sign = bits >>> 31;
int exp = (bits >>> 23 & ((1 << 8) - 1)) - ((1 << 7) - 1);
int mantissa = bits & ((1 << 23) - 1);
System.out.println(sign + " " + exp + " " + mantissa + " " +
  Float.intBitsToFloat((sign << 31) | (exp + ((1 << 7) - 1)) << 23 | mantissa));

The same approach can be used for double’s (11 bit exponent and 52 bit mantissa).

long bits = Double.doubleToLongBits(-0.005);
long sign = bits >>> 63;
long exp = (bits >>> 52 & ((1 << 11) - 1)) - ((1 << 10) - 1);
long mantissa = bits & ((1L << 52) - 1);
System.out.println(sign + " " + exp + " " + mantissa + " " +
  Double.longBitsToDouble((sign << 63) | (exp + ((1 << 10) - 1)) << 52 | mantissa));

Credit: http://s-j.github.io/java-float/


Java seems to have a bias towards using double for computations nonetheless:

Case in point the program I wrote earlier today, the methods didn't work when I used float, but now work great when I substituted float with double (in the NetBeans IDE):

package palettedos;
import java.util.*;

class Palettedos{
    private static Scanner Z = new Scanner(System.in);
    public static final double pi = 3.142;

    public static void main(String[]args){
        Palettedos A = new Palettedos();
        System.out.println("Enter the base and height of the triangle respectively");
        int base = Z.nextInt();
        int height = Z.nextInt();
        System.out.println("Enter the radius of the circle");
        int radius = Z.nextInt();
        System.out.println("Enter the length of the square");
        long length = Z.nextInt();
        double tArea = A.calculateArea(base, height);
        double cArea = A.calculateArea(radius);
        long sqArea = A.calculateArea(length);
        System.out.println("The area of the triangle is\t" + tArea);
        System.out.println("The area of the circle is\t" + cArea);
        System.out.println("The area of the square is\t" + sqArea);
    }

    double calculateArea(int base, int height){
        double triArea = 0.5*base*height;
        return triArea;
    }

    double calculateArea(int radius){
        double circArea = pi*radius*radius;
        return circArea;
    }

    long calculateArea(long length){
        long squaArea = length*length;
        return squaArea;
    }
}

In regular programming calculations, we don’t use float. If we ensure that the result range is within the range of float data type then we can choose a float data type for saving memory. Generally, we use double because of two reasons:-

  • If we want to use the floating-point number as float data type then method caller must explicitly suffix F or f, because by default every floating-point number is treated as double. It increases the burden to the programmer. If we use a floating-point number as double data type then we don’t need to add any suffix.
  • Float is a single-precision data type means it occupies 4 bytes. Hence in large computations, we will not get a complete result. If we choose double data type, it occupies 8 bytes and we will get complete results.

Both float and double data types were designed especially for scientific calculations, where approximation errors are acceptable. If accuracy is the most prior concern then, it is recommended to use BigDecimal class instead of float or double data types. Source:- Float and double datatypes in Java


A float gives you approx. 6-7 decimal digits precision while a double gives you approx. 15-16. Also the range of numbers is larger for double.

A double needs 8 bytes of storage space while a float needs just 4 bytes.


You should use double instead of float for precise calculations, and float instead of double when using less accurate calculations. Float contains only decimal numbers, but double contains an IEEE754 double-precision floating point number, making it easier to contain and computate numbers more accurately. Hope this helps.


Examples related to java

Under what circumstances can I call findViewById with an Options Menu / Action Bar item? How much should a function trust another function How to implement a simple scenario the OO way Two constructors How do I get some variable from another class in Java? this in equals method How to split a string in two and store it in a field How to do perspective fixing? String index out of range: 4 My eclipse won't open, i download the bundle pack it keeps saying error log

Examples related to floating-point

Convert list or numpy array of single element to float in python Convert float to string with precision & number of decimal digits specified? Float and double datatype in Java C convert floating point to int Convert String to Float in Swift How do I change data-type of pandas data frame to string with a defined format? How to check if a float value is a whole number Convert floats to ints in Pandas? Converting Float to Dollars and Cents Format / Suppress Scientific Notation from Python Pandas Aggregation Results

Examples related to double

Round up double to 2 decimal places Convert double to float in Java Float and double datatype in Java Rounding a double value to x number of decimal places in swift How to round a Double to the nearest Int in swift? Swift double to string How to get the Power of some Integer in Swift language? Difference between int and double How to cast the size_t to double or int C++ How to get absolute value from double - c-language

Examples related to ieee-754

Float and double datatype in Java Ranges of floating point datatype in C? Double precision - decimal places biggest integer that can be stored in a double Formatting doubles for output in C#