[java] Difference between declaring variables before or in loop?

I have always wondered if, in general, declaring a throw-away variable before a loop, as opposed to repeatedly inside the loop, makes any (performance) difference? A (quite pointless) example in Java:

a) declaration before loop:

double intermediateResult;
for(int i=0; i < 1000; i++){
    intermediateResult = i;
    System.out.println(intermediateResult);
}

b) declaration (repeatedly) inside loop:

for(int i=0; i < 1000; i++){
    double intermediateResult = i;
    System.out.println(intermediateResult);
}

Which one is better, a or b?

I suspect that repeated variable declaration (example b) creates more overhead in theory, but that compilers are smart enough so that it doesn't matter. Example b has the advantage of being more compact and limiting the scope of the variable to where it is used. Still, I tend to code according example a.

Edit: I am especially interested in the Java case.

This question is related to java performance loops variables initialization

The answer is


Even if I know my compiler is smart enough, I won't like to rely on it, and will use the a) variant.

The b) variant makes sense to me only if you desperately need to make the intermediateResult unavailable after the loop body. But I can't imagine such desperate situation, anyway....

EDIT: Jon Skeet made a very good point, showing that variable declaration inside a loop can make an actual semantic difference.


I would always use A (rather than relying on the compiler) and might also rewrite to:

for(int i=0, double intermediateResult=0; i<1000; i++){
    intermediateResult = i;
    System.out.println(intermediateResult);
}

This still restricts intermediateResult to the loop's scope, but doesn't redeclare during each iteration.


It depends on the language and the exact use. For instance, in C# 1 it made no difference. In C# 2, if the local variable is captured by an anonymous method (or lambda expression in C# 3) it can make a very signficant difference.

Example:

using System;
using System.Collections.Generic;

class Test
{
    static void Main()
    {
        List<Action> actions = new List<Action>();

        int outer;
        for (int i=0; i < 10; i++)
        {
            outer = i;
            int inner = i;
            actions.Add(() => Console.WriteLine("Inner={0}, Outer={1}", inner, outer));
        }

        foreach (Action action in actions)
        {
            action();
        }
    }
}

Output:

Inner=0, Outer=9
Inner=1, Outer=9
Inner=2, Outer=9
Inner=3, Outer=9
Inner=4, Outer=9
Inner=5, Outer=9
Inner=6, Outer=9
Inner=7, Outer=9
Inner=8, Outer=9
Inner=9, Outer=9

The difference is that all of the actions capture the same outer variable, but each has its own separate inner variable.


I suspect a few compilers could optimize both to be the same code, but certainly not all. So I'd say you're better off with the former. The only reason for the latter is if you want to ensure that the declared variable is used only within your loop.


I tested for JS with Node 4.0.0 if anyone is interested. Declaring outside the loop resulted in a ~.5 ms performance improvement on average over 1000 trials with 100 million loop iterations per trial. So I'm gonna say go ahead and write it in the most readable / maintainable way which is B, imo. I would put my code in a fiddle, but I used the performance-now Node module. Here's the code:

var now = require("../node_modules/performance-now")

// declare vars inside loop
function varInside(){
    for(var i = 0; i < 100000000; i++){
        var temp = i;
        var temp2 = i + 1;
        var temp3 = i + 2;
    }
}

// declare vars outside loop
function varOutside(){
    var temp;
    var temp2;
    var temp3;
    for(var i = 0; i < 100000000; i++){
        temp = i
        temp2 = i + 1
        temp3 = i + 2
    }
}

// for computing average execution times
var insideAvg = 0;
var outsideAvg = 0;

// run varInside a million times and average execution times
for(var i = 0; i < 1000; i++){
    var start = now()
    varInside()
    var end = now()
    insideAvg = (insideAvg + (end-start)) / 2
}

// run varOutside a million times and average execution times
for(var i = 0; i < 1000; i++){
    var start = now()
    varOutside()
    var end = now()
    outsideAvg = (outsideAvg + (end-start)) / 2
}

console.log('declared inside loop', insideAvg)
console.log('declared outside loop', outsideAvg)

I suspect a few compilers could optimize both to be the same code, but certainly not all. So I'd say you're better off with the former. The only reason for the latter is if you want to ensure that the declared variable is used only within your loop.


I've always thought that if you declare your variables inside of your loop then you're wasting memory. If you have something like this:

for(;;) {
  Object o = new Object();
}

Then not only does the object need to be created for each iteration, but there needs to be a new reference allocated for each object. It seems that if the garbage collector is slow then you'll have a bunch of dangling references that need to be cleaned up.

However, if you have this:

Object o;
for(;;) {
  o = new Object();
}

Then you're only creating a single reference and assigning a new object to it each time. Sure, it might take a bit longer for it to go out of scope, but then there's only one dangling reference to deal with.


It depends on the language and the exact use. For instance, in C# 1 it made no difference. In C# 2, if the local variable is captured by an anonymous method (or lambda expression in C# 3) it can make a very signficant difference.

Example:

using System;
using System.Collections.Generic;

class Test
{
    static void Main()
    {
        List<Action> actions = new List<Action>();

        int outer;
        for (int i=0; i < 10; i++)
        {
            outer = i;
            int inner = i;
            actions.Add(() => Console.WriteLine("Inner={0}, Outer={1}", inner, outer));
        }

        foreach (Action action in actions)
        {
            action();
        }
    }
}

Output:

Inner=0, Outer=9
Inner=1, Outer=9
Inner=2, Outer=9
Inner=3, Outer=9
Inner=4, Outer=9
Inner=5, Outer=9
Inner=6, Outer=9
Inner=7, Outer=9
Inner=8, Outer=9
Inner=9, Outer=9

The difference is that all of the actions capture the same outer variable, but each has its own separate inner variable.


I made a simple test:

int b;
for (int i = 0; i < 10; i++) {
    b = i;
}

vs

for (int i = 0; i < 10; i++) {
    int b = i;
}

I compiled these codes with gcc - 5.2.0. And then I disassembled the main () of these two codes and that's the result:

1º:

   0x00000000004004b6 <+0>:     push   rbp
   0x00000000004004b7 <+1>:     mov    rbp,rsp
   0x00000000004004ba <+4>:     mov    DWORD PTR [rbp-0x4],0x0
   0x00000000004004c1 <+11>:    jmp    0x4004cd <main+23>
   0x00000000004004c3 <+13>:    mov    eax,DWORD PTR [rbp-0x4]
   0x00000000004004c6 <+16>:    mov    DWORD PTR [rbp-0x8],eax
   0x00000000004004c9 <+19>:    add    DWORD PTR [rbp-0x4],0x1
   0x00000000004004cd <+23>:    cmp    DWORD PTR [rbp-0x4],0x9
   0x00000000004004d1 <+27>:    jle    0x4004c3 <main+13>
   0x00000000004004d3 <+29>:    mov    eax,0x0
   0x00000000004004d8 <+34>:    pop    rbp
   0x00000000004004d9 <+35>:    ret

vs

   0x00000000004004b6 <+0>: push   rbp
   0x00000000004004b7 <+1>: mov    rbp,rsp
   0x00000000004004ba <+4>: mov    DWORD PTR [rbp-0x4],0x0
   0x00000000004004c1 <+11>:    jmp    0x4004cd <main+23>
   0x00000000004004c3 <+13>:    mov    eax,DWORD PTR [rbp-0x4]
   0x00000000004004c6 <+16>:    mov    DWORD PTR [rbp-0x8],eax
   0x00000000004004c9 <+19>:    add    DWORD PTR [rbp-0x4],0x1
   0x00000000004004cd <+23>:    cmp    DWORD PTR [rbp-0x4],0x9
   0x00000000004004d1 <+27>:    jle    0x4004c3 <main+13>
   0x00000000004004d3 <+29>:    mov    eax,0x0
   0x00000000004004d8 <+34>:    pop    rbp
   0x00000000004004d9 <+35>:    ret 

Which are exaclty the same asm result. isn't a proof that the two codes produce the same thing?


Well, you could always make a scope for that:

{ //Or if(true) if the language doesn't support making scopes like this
    double intermediateResult;
    for (int i=0; i<1000; i++) {
        intermediateResult = i;
        System.out.println(intermediateResult);
    }
}

This way you only declare the variable once, and it'll die when you leave the loop.


From a performance perspective, outside is (much) better.

public static void outside() {
    double intermediateResult;
    for(int i=0; i < Integer.MAX_VALUE; i++){
        intermediateResult = i;
    }
}

public static void inside() {
    for(int i=0; i < Integer.MAX_VALUE; i++){
        double intermediateResult = i;
    }
}

I executed both functions 1 billion times each. outside() took 65 milliseconds. inside() took 1.5 seconds.


Well I ran your A and B examples 20 times each, looping 100 million times.(JVM - 1.5.0)

A: average execution time: .074 sec

B: average execution time : .067 sec

To my surprise B was slightly faster. As fast as computers are now its hard to say if you could accurately measure this. I would code it the A way as well but I would say it doesn't really matter.


As a general rule, I declare my variables in the inner-most possible scope. So, if you're not using intermediateResult outside of the loop, then I'd go with B.


Even if I know my compiler is smart enough, I won't like to rely on it, and will use the a) variant.

The b) variant makes sense to me only if you desperately need to make the intermediateResult unavailable after the loop body. But I can't imagine such desperate situation, anyway....

EDIT: Jon Skeet made a very good point, showing that variable declaration inside a loop can make an actual semantic difference.


I think it depends on the compiler and is hard to give a general answer.


I would always use A (rather than relying on the compiler) and might also rewrite to:

for(int i=0, double intermediateResult=0; i<1000; i++){
    intermediateResult = i;
    System.out.println(intermediateResult);
}

This still restricts intermediateResult to the loop's scope, but doesn't redeclare during each iteration.


Even if I know my compiler is smart enough, I won't like to rely on it, and will use the a) variant.

The b) variant makes sense to me only if you desperately need to make the intermediateResult unavailable after the loop body. But I can't imagine such desperate situation, anyway....

EDIT: Jon Skeet made a very good point, showing that variable declaration inside a loop can make an actual semantic difference.


A) is a safe bet than B).........Imagine if you are initializing structure in loop rather than 'int' or 'float' then what?

like

typedef struct loop_example{

JXTZ hi; // where JXTZ could be another type...say closed source lib 
         // you include in Makefile

}loop_example_struct;

//then....

int j = 0; // declare here or face c99 error if in loop - depends on compiler setting

for ( ;j++; )
{
   loop_example loop_object; // guess the result in memory heap?
}

You are certainly bound to face problems with memory leaks!. Hence I believe 'A' is safer bet while 'B' is vulnerable to memory accumulation esp working close source libraries.You can check usinng 'Valgrind' Tool on Linux specifically sub tool 'Helgrind'.


Well I ran your A and B examples 20 times each, looping 100 million times.(JVM - 1.5.0)

A: average execution time: .074 sec

B: average execution time : .067 sec

To my surprise B was slightly faster. As fast as computers are now its hard to say if you could accurately measure this. I would code it the A way as well but I would say it doesn't really matter.


In my opinion, b is the better structure. In a, the last value of intermediateResult sticks around after your loop is finished.

Edit: This doesn't make a lot of difference with value types, but reference types can be somewhat weighty. Personally, I like variables to be dereferenced as soon as possible for cleanup, and b does that for you,


This is a gotcha in VB.NET. The Visual Basic result won't reinitialize the variable in this example:

For i as Integer = 1 to 100
    Dim j as Integer
    Console.WriteLine(j)
    j = i
Next

' Output: 0 1 2 3 4...

This will print 0 the first time (Visual Basic variables have default values when declared!) but i each time after that.

If you add a = 0, though, you get what you might expect:

For i as Integer = 1 to 100
    Dim j as Integer = 0
    Console.WriteLine(j)
    j = i
Next

'Output: 0 0 0 0 0...

It's an interesting question. From my experience there is an ultimate question to consider when you debate this matter for a code:

Is there any reason why the variable would need to be global?

It makes sense to only declare the variable once, globally, as opposed to many times locally, because it is better for organizing the code and requires less lines of code. However, if it only needs to be declared locally within one method, I would initialize it in that method so it is clear that the variable is exclusively relevant to that method. Be careful not to call this variable outside the method in which it is initialized if you choose the latter option--your code won't know what you're talking about and will report an error.

Also, as a side note, don't duplicate local variable names between different methods even if their purposes are near-identical; it just gets confusing.


It is language dependent - IIRC C# optimises this, so there isn't any difference, but JavaScript (for example) will do the whole memory allocation shebang each time.


The following is what I wrote and compiled in .NET.

double r0;
for (int i = 0; i < 1000; i++) {
    r0 = i*i;
    Console.WriteLine(r0);
}

for (int j = 0; j < 1000; j++) {
    double r1 = j*j;
    Console.WriteLine(r1);
}

This is what I get from .NET Reflector when CIL is rendered back into code.

for (int i = 0; i < 0x3e8; i++)
{
    double r0 = i * i;
    Console.WriteLine(r0);
}
for (int j = 0; j < 0x3e8; j++)
{
    double r1 = j * j;
    Console.WriteLine(r1);
}

So both look exactly same after compilation. In managed languages code is converted into CL/byte code and at time of execution it's converted into machine language. So in machine language a double may not even be created on the stack. It may just be a register as code reflect that it is a temporary variable for WriteLine function. There are a whole set optimization rules just for loops. So the average guy shouldn't be worried about it, especially in managed languages. There are cases when you can optimize manage code, for example, if you have to concatenate a large number of strings using just string a; a+=anotherstring[i] vs using StringBuilder. There is very big difference in performance between both. There are a lot of such cases where the compiler cannot optimize your code, because it cannot figure out what is intended in a bigger scope. But it can pretty much optimize basic things for you.


I would always use A (rather than relying on the compiler) and might also rewrite to:

for(int i=0, double intermediateResult=0; i<1000; i++){
    intermediateResult = i;
    System.out.println(intermediateResult);
}

This still restricts intermediateResult to the loop's scope, but doesn't redeclare during each iteration.


The following is what I wrote and compiled in .NET.

double r0;
for (int i = 0; i < 1000; i++) {
    r0 = i*i;
    Console.WriteLine(r0);
}

for (int j = 0; j < 1000; j++) {
    double r1 = j*j;
    Console.WriteLine(r1);
}

This is what I get from .NET Reflector when CIL is rendered back into code.

for (int i = 0; i < 0x3e8; i++)
{
    double r0 = i * i;
    Console.WriteLine(r0);
}
for (int j = 0; j < 0x3e8; j++)
{
    double r1 = j * j;
    Console.WriteLine(r1);
}

So both look exactly same after compilation. In managed languages code is converted into CL/byte code and at time of execution it's converted into machine language. So in machine language a double may not even be created on the stack. It may just be a register as code reflect that it is a temporary variable for WriteLine function. There are a whole set optimization rules just for loops. So the average guy shouldn't be worried about it, especially in managed languages. There are cases when you can optimize manage code, for example, if you have to concatenate a large number of strings using just string a; a+=anotherstring[i] vs using StringBuilder. There is very big difference in performance between both. There are a lot of such cases where the compiler cannot optimize your code, because it cannot figure out what is intended in a bigger scope. But it can pretty much optimize basic things for you.


A co-worker prefers the first form, telling it is an optimization, preferring to re-use a declaration.

I prefer the second one (and try to persuade my co-worker! ;-)), having read that:

  • It reduces scope of variables to where they are needed, which is a good thing.
  • Java optimizes enough to make no significant difference in performance. IIRC, perhaps the second form is even faster.

Anyway, it falls in the category of premature optimization that rely in quality of compiler and/or JVM.


From a performance perspective, outside is (much) better.

public static void outside() {
    double intermediateResult;
    for(int i=0; i < Integer.MAX_VALUE; i++){
        intermediateResult = i;
    }
}

public static void inside() {
    for(int i=0; i < Integer.MAX_VALUE; i++){
        double intermediateResult = i;
    }
}

I executed both functions 1 billion times each. outside() took 65 milliseconds. inside() took 1.5 seconds.


I made a simple test:

int b;
for (int i = 0; i < 10; i++) {
    b = i;
}

vs

for (int i = 0; i < 10; i++) {
    int b = i;
}

I compiled these codes with gcc - 5.2.0. And then I disassembled the main () of these two codes and that's the result:

1º:

   0x00000000004004b6 <+0>:     push   rbp
   0x00000000004004b7 <+1>:     mov    rbp,rsp
   0x00000000004004ba <+4>:     mov    DWORD PTR [rbp-0x4],0x0
   0x00000000004004c1 <+11>:    jmp    0x4004cd <main+23>
   0x00000000004004c3 <+13>:    mov    eax,DWORD PTR [rbp-0x4]
   0x00000000004004c6 <+16>:    mov    DWORD PTR [rbp-0x8],eax
   0x00000000004004c9 <+19>:    add    DWORD PTR [rbp-0x4],0x1
   0x00000000004004cd <+23>:    cmp    DWORD PTR [rbp-0x4],0x9
   0x00000000004004d1 <+27>:    jle    0x4004c3 <main+13>
   0x00000000004004d3 <+29>:    mov    eax,0x0
   0x00000000004004d8 <+34>:    pop    rbp
   0x00000000004004d9 <+35>:    ret

vs

   0x00000000004004b6 <+0>: push   rbp
   0x00000000004004b7 <+1>: mov    rbp,rsp
   0x00000000004004ba <+4>: mov    DWORD PTR [rbp-0x4],0x0
   0x00000000004004c1 <+11>:    jmp    0x4004cd <main+23>
   0x00000000004004c3 <+13>:    mov    eax,DWORD PTR [rbp-0x4]
   0x00000000004004c6 <+16>:    mov    DWORD PTR [rbp-0x8],eax
   0x00000000004004c9 <+19>:    add    DWORD PTR [rbp-0x4],0x1
   0x00000000004004cd <+23>:    cmp    DWORD PTR [rbp-0x4],0x9
   0x00000000004004d1 <+27>:    jle    0x4004c3 <main+13>
   0x00000000004004d3 <+29>:    mov    eax,0x0
   0x00000000004004d8 <+34>:    pop    rbp
   0x00000000004004d9 <+35>:    ret 

Which are exaclty the same asm result. isn't a proof that the two codes produce the same thing?


A co-worker prefers the first form, telling it is an optimization, preferring to re-use a declaration.

I prefer the second one (and try to persuade my co-worker! ;-)), having read that:

  • It reduces scope of variables to where they are needed, which is a good thing.
  • Java optimizes enough to make no significant difference in performance. IIRC, perhaps the second form is even faster.

Anyway, it falls in the category of premature optimization that rely in quality of compiler and/or JVM.


A) is a safe bet than B).........Imagine if you are initializing structure in loop rather than 'int' or 'float' then what?

like

typedef struct loop_example{

JXTZ hi; // where JXTZ could be another type...say closed source lib 
         // you include in Makefile

}loop_example_struct;

//then....

int j = 0; // declare here or face c99 error if in loop - depends on compiler setting

for ( ;j++; )
{
   loop_example loop_object; // guess the result in memory heap?
}

You are certainly bound to face problems with memory leaks!. Hence I believe 'A' is safer bet while 'B' is vulnerable to memory accumulation esp working close source libraries.You can check usinng 'Valgrind' Tool on Linux specifically sub tool 'Helgrind'.


Well I ran your A and B examples 20 times each, looping 100 million times.(JVM - 1.5.0)

A: average execution time: .074 sec

B: average execution time : .067 sec

To my surprise B was slightly faster. As fast as computers are now its hard to say if you could accurately measure this. I would code it the A way as well but I would say it doesn't really matter.


this is the better form

double intermediateResult;
int i = byte.MinValue;

for(; i < 1000; i++)
{
intermediateResult = i;
System.out.println(intermediateResult);
}

1) in this way declared once time both variable, and not each for cycle. 2) the assignment it's fatser thean all other option. 3) So the bestpractice rule is any declaration outside the iteration for.


It depends on the language and the exact use. For instance, in C# 1 it made no difference. In C# 2, if the local variable is captured by an anonymous method (or lambda expression in C# 3) it can make a very signficant difference.

Example:

using System;
using System.Collections.Generic;

class Test
{
    static void Main()
    {
        List<Action> actions = new List<Action>();

        int outer;
        for (int i=0; i < 10; i++)
        {
            outer = i;
            int inner = i;
            actions.Add(() => Console.WriteLine("Inner={0}, Outer={1}", inner, outer));
        }

        foreach (Action action in actions)
        {
            action();
        }
    }
}

Output:

Inner=0, Outer=9
Inner=1, Outer=9
Inner=2, Outer=9
Inner=3, Outer=9
Inner=4, Outer=9
Inner=5, Outer=9
Inner=6, Outer=9
Inner=7, Outer=9
Inner=8, Outer=9
Inner=9, Outer=9

The difference is that all of the actions capture the same outer variable, but each has its own separate inner variable.


I had this very same question for a long time. So I tested an even simpler piece of code.

Conclusion: For such cases there is NO performance difference.

Outside loop case

int intermediateResult;
for(int i=0; i < 1000; i++){
    intermediateResult = i+2;
    System.out.println(intermediateResult);
}

Inside loop case

for(int i=0; i < 1000; i++){
    int intermediateResult = i+2;
    System.out.println(intermediateResult);
}

I checked the compiled file on IntelliJ's decompiler and for both cases, I got the same Test.class

for(int i = 0; i < 1000; ++i) {
    int intermediateResult = i + 2;
    System.out.println(intermediateResult);
}

I also disassembled code for both the case using the method given in this answer. I'll show only the parts relevant to the answer

Outside loop case

Code:
  stack=2, locals=3, args_size=1
     0: iconst_0
     1: istore_2
     2: iload_2
     3: sipush        1000
     6: if_icmpge     26
     9: iload_2
    10: iconst_2
    11: iadd
    12: istore_1
    13: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
    16: iload_1
    17: invokevirtual #3                  // Method java/io/PrintStream.println:(I)V
    20: iinc          2, 1
    23: goto          2
    26: return
LocalVariableTable:
        Start  Length  Slot  Name   Signature
           13      13     1 intermediateResult   I
            2      24     2     i   I
            0      27     0  args   [Ljava/lang/String;

Inside loop case

Code:
      stack=2, locals=3, args_size=1
         0: iconst_0
         1: istore_1
         2: iload_1
         3: sipush        1000
         6: if_icmpge     26
         9: iload_1
        10: iconst_2
        11: iadd
        12: istore_2
        13: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
        16: iload_2
        17: invokevirtual #3                  // Method java/io/PrintStream.println:(I)V
        20: iinc          1, 1
        23: goto          2
        26: return
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
           13       7     2 intermediateResult   I
            2      24     1     i   I
            0      27     0  args   [Ljava/lang/String;

If you pay close attention, only the Slot assigned to i and intermediateResult in LocalVariableTable is swapped as a product of their order of appearance. The same difference in slot is reflected in other lines of code.

  • No extra operation is being performed
  • intermediateResult is still a local variable in both cases, so there is no difference access time.

BONUS

Compilers do a ton of optimization, take a look at what happens in this case.

Zero work case

for(int i=0; i < 1000; i++){
    int intermediateResult = i;
    System.out.println(intermediateResult);
}

Zero work decompiled

for(int i = 0; i < 1000; ++i) {
    System.out.println(i);
}

I would always use A (rather than relying on the compiler) and might also rewrite to:

for(int i=0, double intermediateResult=0; i<1000; i++){
    intermediateResult = i;
    System.out.println(intermediateResult);
}

This still restricts intermediateResult to the loop's scope, but doesn't redeclare during each iteration.


There is a difference in C# if you are using the variable in a lambda, etc. But in general the compiler will basically do the same thing, assuming the variable is only used within the loop.

Given that they are basically the same: Note that version b makes it much more obvious to readers that the variable isn't, and can't, be used after the loop. Additionally, version b is much more easily refactored. It is more difficult to extract the loop body into its own method in version a. Moreover, version b assures you that there is no side effect to such a refactoring.

Hence, version a annoys me to no end, because there's no benefit to it and it makes it much more difficult to reason about the code...


It's an interesting question. From my experience there is an ultimate question to consider when you debate this matter for a code:

Is there any reason why the variable would need to be global?

It makes sense to only declare the variable once, globally, as opposed to many times locally, because it is better for organizing the code and requires less lines of code. However, if it only needs to be declared locally within one method, I would initialize it in that method so it is clear that the variable is exclusively relevant to that method. Be careful not to call this variable outside the method in which it is initialized if you choose the latter option--your code won't know what you're talking about and will report an error.

Also, as a side note, don't duplicate local variable names between different methods even if their purposes are near-identical; it just gets confusing.


In my opinion, b is the better structure. In a, the last value of intermediateResult sticks around after your loop is finished.

Edit: This doesn't make a lot of difference with value types, but reference types can be somewhat weighty. Personally, I like variables to be dereferenced as soon as possible for cleanup, and b does that for you,


It is language dependent - IIRC C# optimises this, so there isn't any difference, but JavaScript (for example) will do the whole memory allocation shebang each time.


This is a gotcha in VB.NET. The Visual Basic result won't reinitialize the variable in this example:

For i as Integer = 1 to 100
    Dim j as Integer
    Console.WriteLine(j)
    j = i
Next

' Output: 0 1 2 3 4...

This will print 0 the first time (Visual Basic variables have default values when declared!) but i each time after that.

If you add a = 0, though, you get what you might expect:

For i as Integer = 1 to 100
    Dim j as Integer = 0
    Console.WriteLine(j)
    j = i
Next

'Output: 0 0 0 0 0...

As a general rule, I declare my variables in the inner-most possible scope. So, if you're not using intermediateResult outside of the loop, then I'd go with B.


This is a gotcha in VB.NET. The Visual Basic result won't reinitialize the variable in this example:

For i as Integer = 1 to 100
    Dim j as Integer
    Console.WriteLine(j)
    j = i
Next

' Output: 0 1 2 3 4...

This will print 0 the first time (Visual Basic variables have default values when declared!) but i each time after that.

If you add a = 0, though, you get what you might expect:

For i as Integer = 1 to 100
    Dim j as Integer = 0
    Console.WriteLine(j)
    j = i
Next

'Output: 0 0 0 0 0...

There is a difference in C# if you are using the variable in a lambda, etc. But in general the compiler will basically do the same thing, assuming the variable is only used within the loop.

Given that they are basically the same: Note that version b makes it much more obvious to readers that the variable isn't, and can't, be used after the loop. Additionally, version b is much more easily refactored. It is more difficult to extract the loop body into its own method in version a. Moreover, version b assures you that there is no side effect to such a refactoring.

Hence, version a annoys me to no end, because there's no benefit to it and it makes it much more difficult to reason about the code...


Tried the same thing in Go, and compared the compiler output using go tool compile -S with go 1.9.4

Zero difference, as per the assembler output.


A co-worker prefers the first form, telling it is an optimization, preferring to re-use a declaration.

I prefer the second one (and try to persuade my co-worker! ;-)), having read that:

  • It reduces scope of variables to where they are needed, which is a good thing.
  • Java optimizes enough to make no significant difference in performance. IIRC, perhaps the second form is even faster.

Anyway, it falls in the category of premature optimization that rely in quality of compiler and/or JVM.


As a general rule, I declare my variables in the inner-most possible scope. So, if you're not using intermediateResult outside of the loop, then I'd go with B.


Well I ran your A and B examples 20 times each, looping 100 million times.(JVM - 1.5.0)

A: average execution time: .074 sec

B: average execution time : .067 sec

To my surprise B was slightly faster. As fast as computers are now its hard to say if you could accurately measure this. I would code it the A way as well but I would say it doesn't really matter.


I think it depends on the compiler and is hard to give a general answer.


It depends on the language and the exact use. For instance, in C# 1 it made no difference. In C# 2, if the local variable is captured by an anonymous method (or lambda expression in C# 3) it can make a very signficant difference.

Example:

using System;
using System.Collections.Generic;

class Test
{
    static void Main()
    {
        List<Action> actions = new List<Action>();

        int outer;
        for (int i=0; i < 10; i++)
        {
            outer = i;
            int inner = i;
            actions.Add(() => Console.WriteLine("Inner={0}, Outer={1}", inner, outer));
        }

        foreach (Action action in actions)
        {
            action();
        }
    }
}

Output:

Inner=0, Outer=9
Inner=1, Outer=9
Inner=2, Outer=9
Inner=3, Outer=9
Inner=4, Outer=9
Inner=5, Outer=9
Inner=6, Outer=9
Inner=7, Outer=9
Inner=8, Outer=9
Inner=9, Outer=9

The difference is that all of the actions capture the same outer variable, but each has its own separate inner variable.


I suspect a few compilers could optimize both to be the same code, but certainly not all. So I'd say you're better off with the former. The only reason for the latter is if you want to ensure that the declared variable is used only within your loop.


Well, you could always make a scope for that:

{ //Or if(true) if the language doesn't support making scopes like this
    double intermediateResult;
    for (int i=0; i<1000; i++) {
        intermediateResult = i;
        System.out.println(intermediateResult);
    }
}

This way you only declare the variable once, and it'll die when you leave the loop.


I've always thought that if you declare your variables inside of your loop then you're wasting memory. If you have something like this:

for(;;) {
  Object o = new Object();
}

Then not only does the object need to be created for each iteration, but there needs to be a new reference allocated for each object. It seems that if the garbage collector is slow then you'll have a bunch of dangling references that need to be cleaned up.

However, if you have this:

Object o;
for(;;) {
  o = new Object();
}

Then you're only creating a single reference and assigning a new object to it each time. Sure, it might take a bit longer for it to go out of scope, but then there's only one dangling reference to deal with.


I tested for JS with Node 4.0.0 if anyone is interested. Declaring outside the loop resulted in a ~.5 ms performance improvement on average over 1000 trials with 100 million loop iterations per trial. So I'm gonna say go ahead and write it in the most readable / maintainable way which is B, imo. I would put my code in a fiddle, but I used the performance-now Node module. Here's the code:

var now = require("../node_modules/performance-now")

// declare vars inside loop
function varInside(){
    for(var i = 0; i < 100000000; i++){
        var temp = i;
        var temp2 = i + 1;
        var temp3 = i + 2;
    }
}

// declare vars outside loop
function varOutside(){
    var temp;
    var temp2;
    var temp3;
    for(var i = 0; i < 100000000; i++){
        temp = i
        temp2 = i + 1
        temp3 = i + 2
    }
}

// for computing average execution times
var insideAvg = 0;
var outsideAvg = 0;

// run varInside a million times and average execution times
for(var i = 0; i < 1000; i++){
    var start = now()
    varInside()
    var end = now()
    insideAvg = (insideAvg + (end-start)) / 2
}

// run varOutside a million times and average execution times
for(var i = 0; i < 1000; i++){
    var start = now()
    varOutside()
    var end = now()
    outsideAvg = (outsideAvg + (end-start)) / 2
}

console.log('declared inside loop', insideAvg)
console.log('declared outside loop', outsideAvg)

My practice is following:

  • if type of variable is simple (int, double, ...) I prefer variant b (inside).
    Reason: reducing scope of variable.

  • if type of variable is not simple (some kind of class or struct) I prefer variant a (outside).
    Reason: reducing number of ctor-dtor calls.


It is language dependent - IIRC C# optimises this, so there isn't any difference, but JavaScript (for example) will do the whole memory allocation shebang each time.


In my opinion, b is the better structure. In a, the last value of intermediateResult sticks around after your loop is finished.

Edit: This doesn't make a lot of difference with value types, but reference types can be somewhat weighty. Personally, I like variables to be dereferenced as soon as possible for cleanup, and b does that for you,


A co-worker prefers the first form, telling it is an optimization, preferring to re-use a declaration.

I prefer the second one (and try to persuade my co-worker! ;-)), having read that:

  • It reduces scope of variables to where they are needed, which is a good thing.
  • Java optimizes enough to make no significant difference in performance. IIRC, perhaps the second form is even faster.

Anyway, it falls in the category of premature optimization that rely in quality of compiler and/or JVM.


I had this very same question for a long time. So I tested an even simpler piece of code.

Conclusion: For such cases there is NO performance difference.

Outside loop case

int intermediateResult;
for(int i=0; i < 1000; i++){
    intermediateResult = i+2;
    System.out.println(intermediateResult);
}

Inside loop case

for(int i=0; i < 1000; i++){
    int intermediateResult = i+2;
    System.out.println(intermediateResult);
}

I checked the compiled file on IntelliJ's decompiler and for both cases, I got the same Test.class

for(int i = 0; i < 1000; ++i) {
    int intermediateResult = i + 2;
    System.out.println(intermediateResult);
}

I also disassembled code for both the case using the method given in this answer. I'll show only the parts relevant to the answer

Outside loop case

Code:
  stack=2, locals=3, args_size=1
     0: iconst_0
     1: istore_2
     2: iload_2
     3: sipush        1000
     6: if_icmpge     26
     9: iload_2
    10: iconst_2
    11: iadd
    12: istore_1
    13: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
    16: iload_1
    17: invokevirtual #3                  // Method java/io/PrintStream.println:(I)V
    20: iinc          2, 1
    23: goto          2
    26: return
LocalVariableTable:
        Start  Length  Slot  Name   Signature
           13      13     1 intermediateResult   I
            2      24     2     i   I
            0      27     0  args   [Ljava/lang/String;

Inside loop case

Code:
      stack=2, locals=3, args_size=1
         0: iconst_0
         1: istore_1
         2: iload_1
         3: sipush        1000
         6: if_icmpge     26
         9: iload_1
        10: iconst_2
        11: iadd
        12: istore_2
        13: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
        16: iload_2
        17: invokevirtual #3                  // Method java/io/PrintStream.println:(I)V
        20: iinc          1, 1
        23: goto          2
        26: return
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
           13       7     2 intermediateResult   I
            2      24     1     i   I
            0      27     0  args   [Ljava/lang/String;

If you pay close attention, only the Slot assigned to i and intermediateResult in LocalVariableTable is swapped as a product of their order of appearance. The same difference in slot is reflected in other lines of code.

  • No extra operation is being performed
  • intermediateResult is still a local variable in both cases, so there is no difference access time.

BONUS

Compilers do a ton of optimization, take a look at what happens in this case.

Zero work case

for(int i=0; i < 1000; i++){
    int intermediateResult = i;
    System.out.println(intermediateResult);
}

Zero work decompiled

for(int i = 0; i < 1000; ++i) {
    System.out.println(i);
}

I think it depends on the compiler and is hard to give a general answer.


It is language dependent - IIRC C# optimises this, so there isn't any difference, but JavaScript (for example) will do the whole memory allocation shebang each time.


I think it depends on the compiler and is hard to give a general answer.


Tried the same thing in Go, and compared the compiler output using go tool compile -S with go 1.9.4

Zero difference, as per the assembler output.


this is the better form

double intermediateResult;
int i = byte.MinValue;

for(; i < 1000; i++)
{
intermediateResult = i;
System.out.println(intermediateResult);
}

1) in this way declared once time both variable, and not each for cycle. 2) the assignment it's fatser thean all other option. 3) So the bestpractice rule is any declaration outside the iteration for.


I suspect a few compilers could optimize both to be the same code, but certainly not all. So I'd say you're better off with the former. The only reason for the latter is if you want to ensure that the declared variable is used only within your loop.


As a general rule, I declare my variables in the inner-most possible scope. So, if you're not using intermediateResult outside of the loop, then I'd go with B.


My practice is following:

  • if type of variable is simple (int, double, ...) I prefer variant b (inside).
    Reason: reducing scope of variable.

  • if type of variable is not simple (some kind of class or struct) I prefer variant a (outside).
    Reason: reducing number of ctor-dtor calls.


In my opinion, b is the better structure. In a, the last value of intermediateResult sticks around after your loop is finished.

Edit: This doesn't make a lot of difference with value types, but reference types can be somewhat weighty. Personally, I like variables to be dereferenced as soon as possible for cleanup, and b does that for you,


Even if I know my compiler is smart enough, I won't like to rely on it, and will use the a) variant.

The b) variant makes sense to me only if you desperately need to make the intermediateResult unavailable after the loop body. But I can't imagine such desperate situation, anyway....

EDIT: Jon Skeet made a very good point, showing that variable declaration inside a loop can make an actual semantic difference.


Examples related to java

Under what circumstances can I call findViewById with an Options Menu / Action Bar item? How much should a function trust another function How to implement a simple scenario the OO way Two constructors How do I get some variable from another class in Java? this in equals method How to split a string in two and store it in a field How to do perspective fixing? String index out of range: 4 My eclipse won't open, i download the bundle pack it keeps saying error log

Examples related to performance

Why is 2 * (i * i) faster than 2 * i * i in Java? What is the difference between spark.sql.shuffle.partitions and spark.default.parallelism? How to check if a key exists in Json Object and get its value Why does C++ code for testing the Collatz conjecture run faster than hand-written assembly? Most efficient way to map function over numpy array The most efficient way to remove first N elements in a list? Fastest way to get the first n elements of a List into an Array Why is "1000000000000000 in range(1000000000000001)" so fast in Python 3? pandas loc vs. iloc vs. at vs. iat? Android Recyclerview vs ListView with Viewholder

Examples related to loops

How to increment a letter N times per iteration and store in an array? Angular 2 Cannot find control with unspecified name attribute on formArrays What is the difference between i = i + 1 and i += 1 in a 'for' loop? Prime numbers between 1 to 100 in C Programming Language Python Loop: List Index Out of Range JavaScript: Difference between .forEach() and .map() Why does using from __future__ import print_function breaks Python2-style print? Creating an array from a text file in Bash Iterate through dictionary values? C# Wait until condition is true

Examples related to variables

When to create variables (memory management) How to print a Groovy variable in Jenkins? What does ${} (dollar sign and curly braces) mean in a string in Javascript? How to access global variables How to initialize a variable of date type in java? How to define a variable in a Dockerfile? Why does foo = filter(...) return a <filter object>, not a list? How can I pass variable to ansible playbook in the command line? How do I use this JavaScript variable in HTML? Static vs class functions/variables in Swift classes?

Examples related to initialization

"error: assignment to expression with array type error" when I assign a struct field (C) How to set default values in Go structs How to declare an ArrayList with values? Initialize array of strings Initializing a dictionary in python with a key value and no corresponding values Declare and Initialize String Array in VBA VBA (Excel) Initialize Entire Array without Looping Default values and initialization in Java Initializing array of structures C char array initialization