[c] What is a bus error?

What does the "bus error" message mean, and how does it differ from a segfault?

This question is related to c unix segmentation-fault bus-error

The answer is


It depends on your OS, CPU, Compiler, and possibly other factors.

In general it means the CPU bus could not complete a command, or suffered a conflict, but that could mean a whole range of things depending on the environment and code being run.

-Adam


I agree with all the answers above. Here are my 2 cents regarding the BUS error:

A BUS error need not arise from the instructions within the program's code. This can happen when you are running a binary and during the execution, the binary is modified (overwritten by a build or deleted, etc.).

Verifying if this is the case

A simple way to check if this is the cause is to launch a couple of instances of the same binary from a build output directory and run a build after they start. Both running instances would crash with a SIGBUS error shortly after the build has finished and replaced the binary (the one that both instances are currently running).

Underlying Reason

This is because the OS loads memory pages on demand, and in some cases the binary might not be entirely loaded in memory. These crashes occur when the OS tries to fetch the next page from the same binary, but the binary has changed since the last time it was read.


A segfault is accessing memory that you're not allowed to access. It's read-only, you don't have permission, etc...

A bus error is trying to access memory that can't possibly be there. You've used an address that's meaningless to the system, or the wrong kind of address for that operation.
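As a rough sketch of how to tell the two apart at run time: POSIX attaches an `si_code` to both signals, and the codes line up fairly well with the distinction above. The helper name `fault_reason` is made up for this illustration; the constants and their descriptions come from the POSIX `<signal.h>` page.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>

/* Hypothetical helper: turns a (signal, si_code) pair into the
 * description POSIX gives for that code. */
const char *fault_reason(int signo, int code)
{
    if (signo == SIGBUS) {
        switch (code) {
        case BUS_ADRALN: return "invalid address alignment";
        case BUS_ADRERR: return "nonexistent physical address";
        case BUS_OBJERR: return "object-specific hardware error";
        }
    } else if (signo == SIGSEGV) {
        switch (code) {
        case SEGV_MAPERR: return "address not mapped to object";
        case SEGV_ACCERR: return "invalid permissions for mapped object";
        }
    }
    return "unknown";
}
```

In a handler installed with `sigaction` and `SA_SIGINFO`, you would pass `info->si_signo` and `info->si_code` to it. Which code a given platform reports for a given bug still varies.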


I believe the kernel raises SIGBUS when an application exhibits data misalignment on the data bus. I think that since most[?] modern compilers for most processors pad / align the data for the programmers, the alignment troubles of yore are (at least) mitigated, and hence one does not see SIGBUS too often these days (AFAIK).

From: Here


To add to what blxtd answered above, bus errors also occur when your process cannot access the memory of a particular 'variable'.

for (j = 0; i < n; j++) {
    for (i =0; i < m; i++) {
        a[n+1][j] += a[i][j];
    }
}

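Notice the 'inadvertent' use of variable 'i' in the condition of the first 'for' loop? Because the outer loop never tests j, j can keep growing past the bounds of the array. That's what is causing the bus error in this case.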
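For comparison, here is a sketch of the loop with each counter tested in its own loop. I'm assuming the intent was to add up each column; the function name `column_sums`, the flat row-major layout, and the separate `totals` array are my own choices for the illustration, not the original code.

```c
#include <stddef.h>

/* Hypothetical helper: sums each column of an m-row, n-column matrix
 * stored row by row in `a`, writing the n results into `totals`. */
void column_sums(const int *a, size_t m, size_t n, int *totals)
{
    for (size_t j = 0; j < n; j++) {        /* outer loop tests j */
        totals[j] = 0;
        for (size_t i = 0; i < m; i++) {    /* inner loop tests i */
            totals[j] += a[i * n + j];
        }
    }
}
```

With both conditions checking the right counter, every index stays inside the arrays.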


A specific example of a bus error I just encountered while programming C on OS X:

#include <string.h>
#include <stdio.h>

int main(void)
{
    char buffer[120];
    fgets(buffer, sizeof buffer, stdin);
    strcat("foo", buffer);
    return 0;
}

In case you don't remember the docs, strcat appends the second argument to the first by modifying the first argument (flip the arguments and it works fine). Here the first argument is a string literal, which must not be modified. On Linux this gives a segmentation fault (as expected), but on OS X it gives a bus error. Why? I really don't know.
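A minimal sketch of a safe variant, assuming the goal was to end up with "foo" followed by the input: the destination is a writable array rather than a string literal, and the append is bounded. The name `append_input` is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: appends `input` to the NUL-terminated string
 * already in `dest`, never writing more than dest_size bytes in total. */
void append_input(char *dest, size_t dest_size, const char *input)
{
    size_t used = strlen(dest);
    if (used + 1 < dest_size) {
        /* strncat copies at most this many characters, then adds the NUL */
        strncat(dest, input, dest_size - used - 1);
    }
}
```

Usage would look like `char dest[128] = "foo"; append_input(dest, sizeof dest, buffer);`. Input that does not fit is silently truncated.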


My reason for a bus error on Mac OS X was that I tried to allocate about 1 MB on the stack. This worked well in one thread, but with OpenMP it led to a bus error, because Mac OS X has a very limited stack size for non-main threads.


I just found out the hard way that on an ARMv7 processor you can write some code that gives you a segmentation fault when unoptimized, but it gives you a bus error when compiled with -O2 (optimize more).

I am using the GCC ARM gnueabihf cross compiler from Ubuntu 64 bit.


I was getting a bus error when the root filesystem was 100% full.


It normally means an un-aligned access.

An attempt to access memory that isn't physically present would also give a bus error, but you won't see this if you're using a processor with an MMU and an OS that's not buggy, because you won't have any non-existent memory mapped to your process's address space.
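If you suspect an unaligned access, a quick check on the pointer before dereferencing it can confirm it. This is only a sketch; `is_aligned` is a name I made up, and it assumes the usual flat address space where converting a pointer to `uintptr_t` gives the byte address.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if p is a multiple of `alignment`
 * bytes, 0 otherwise. `alignment` must be nonzero. */
int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}
```

For example, `is_aligned(ptr, _Alignof(uint32_t))` tells you whether `ptr` is safe to read as a `uint32_t` on a strict-alignment CPU.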


One classic instance of a bus error occurs on certain architectures, such as the SPARC (at least some SPARCs, maybe this has been changed), when you do a misaligned access. For instance:

unsigned char data[6];
*(unsigned int *) (data + 2) = 0xdeadf00d;

This snippet tries to write the 32-bit integer value 0xdeadf00d to an address that is (most likely) not properly aligned, and will generate a bus error on architectures that are "picky" in this regard. The Intel x86 is, by the way, not such an architecture; it would allow the access (albeit execute it more slowly).
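The usual portable way around this is to copy the bytes with memcpy instead of casting the pointer, which works whatever the alignment of the destination. A small sketch, with the helper names `store_u32` and `load_u32` being my own:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: writes a 32-bit value to dst in host byte order.
 * dst may have any alignment. */
void store_u32(unsigned char *dst, uint32_t value)
{
    memcpy(dst, &value, sizeof value);
}

/* Hypothetical helper: reads a 32-bit value back from src. */
uint32_t load_u32(const unsigned char *src)
{
    uint32_t value;
    memcpy(&value, src, sizeof value);
    return value;
}
```

With these, `store_u32(data + 2, 0xdeadf00d)` does the same job as the cast above without relying on the CPU tolerating a misaligned store.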


You can also get SIGBUS when a code page cannot be paged in for some reason.


mmap minimal POSIX 7 example

"Bus error" happens when the kernel sends SIGBUS to a process.

A minimal example that produces it because ftruncate was forgotten:

#include <fcntl.h> /* O_ constants */
#include <unistd.h> /* ftruncate */
#include <sys/mman.h> /* mmap */

int main() {
    int fd;
    int *map;
    int size = sizeof(int);
    char *name = "/a";

    shm_unlink(name);
    fd = shm_open(name, O_RDWR | O_CREAT, (mode_t)0600);
    /* THIS is the cause of the problem. */
    /*ftruncate(fd, size);*/
    map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    /* This is what generates the SIGBUS. */
    *map = 0;
}

Run with:

gcc -std=c99 main.c -lrt
./a.out

Tested in Ubuntu 14.04.

POSIX describes SIGBUS as:

Access to an undefined portion of a memory object.

The mmap spec says that:

References within the address range starting at pa and continuing for len bytes to whole pages following the end of an object shall result in delivery of a SIGBUS signal.

And shm_open says that it generates objects of size 0:

The shared memory object has a size of zero.

So at *map = 0 we are touching past the end of the allocated object.
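For contrast, here is a sketch of the same sequence with the object sized before it is mapped. To keep it easy to try, the helper takes any read-write file descriptor rather than calling shm_open itself; the name `write_int_via_mmap` and the read-back parameter are assumptions for the illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: grows the object behind fd to hold one int,
 * maps it, stores `value`, and reads it back through the mapping.
 * Returns 0 on success, -1 on failure. */
int write_int_via_mmap(int fd, int value, int *readback)
{
    /* Give the object a real size first; this is the missing step. */
    if (ftruncate(fd, sizeof(int)) == -1) {
        perror("ftruncate");
        return -1;
    }
    int *map = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                    MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        perror("mmap");
        return -1;
    }
    *map = value;        /* now inside the object, so no SIGBUS */
    *readback = *map;
    return munmap(map, sizeof(int));
}
```

Passing the descriptor returned by shm_open in the example above should behave the same way, since the store then lands inside the object rather than past its end.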

Unaligned stack memory accesses in ARMv8 aarch64

Misaligned access was already mentioned for SPARC in another answer, but here I will provide a more reproducible example.

All you need is a freestanding aarch64 program:

.global _start
_start:
asm_main_after_prologue:
    /* misalign the stack off its 16-byte boundary */
    add sp, sp, #-4
    /* access the stack */
    ldr w0, [sp]

    /* exit syscall in case SIGBUS does not happen */
    mov x0, 0
    mov x8, 93
    svc 0

That program then raises SIGBUS on Ubuntu 18.04 aarch64, Linux kernel 4.15.0 in a ThunderX2 server machine.

Unfortunately, I can't reproduce it on QEMU v4.0.0 user mode, and I'm not sure why.

The fault appears to be optional and controlled by the SCTLR_ELx.SA and SCTLR_EL1.SA0 fields. I have summarized the related docs a bit further here.


A typical buffer overflow that results in a bus error is:

{
    char buf[255];
    sprintf(buf,"%s:%s\n", ifname, message);
}

Here, if the formatted result (ifname, message, and the separator characters) is longer than buf, sprintf writes past the end of the buffer, which can give a bus error.
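A bounded version, as a sketch: snprintf never writes more than the size it is given, and its return value tells you whether the output was cut short. The wrapper name `format_line` is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: formats "ifname:message\n" into buf.
 * Returns 1 if the whole line fit, 0 if it was truncated or failed. */
int format_line(char *buf, size_t buf_size,
                const char *ifname, const char *message)
{
    int n = snprintf(buf, buf_size, "%s:%s\n", ifname, message);
    /* n is the length the full output needs, not counting the NUL */
    return n >= 0 && (size_t)n < buf_size;
}
```

The caller can then decide what to do with a truncated line instead of corrupting whatever sits next to buf.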

