[c++] Compiling an application for use in highly radioactive environments

Here are huge amount of replies, but I'll try to sum up my ideas about this.

Something crashes or does not work correctly could be result of your own mistakes - then it should be easily to fix when you locate the problem. But there is also possibility of hardware failures - and that's difficult if not impossible to fix in overall.

I would recommend first to try to catch the problematic situation by logging (stack, registers, function calls) - either by logging them somewhere into file, or transmitting them somehow directly ("oh no - I'm crashing").

Recovery from such error situation is either reboot (if software is still alive and kicking) or hardware reset (e.g. hw watchdogs). Easier to start from first one.

If problem is hardware related - then logging should help you to identify in which function call problem occurs and that can give you inside knowledge of what is not working and where.

Also if code is relatively complex - it makes sense to "divide and conquer" it - meaning you remove / disable some function calls where you suspect problem is - typically disabling half of code and enabling another half - you can get "does work" / "does not work" kind of decision after which you can focus into another half of code. (Where problem is)

If problem occurs after some time - then stack overflow can be suspected - then it's better to monitor stack point registers - if they constantly grows.

And if you manage to fully minimize your code until "hello world" kind of application - and it's still failing randomly - then hardware problems are expected - and there needs to be "hardware upgrade" - meaning invent such cpu / ram / ... -hardware combination which would tolerate radiation better.

Most important thing is probably how you get your logs back if machine fully stopped / resetted / does not work - probably first thing bootstap should do - is a head back home if problematic situation is entcovered.

If it's possible in your environment also to transmit a signal and receive response - you could try out to construct some sort of online remote debugging environment, but then you must have at least of communication media working and some processor/ some ram in working state. And by remote debugging I mean either GDB / gdb stub kind of approach or your own implementation of what you need to get back from your application (e.g. download log files, download call stack, download ram, restart)

Examples related to c++

Method Call Chaining; returning a pointer vs a reference? How can I tell if an algorithm is efficient? Difference between opening a file in binary vs text How can compare-and-swap be used for a wait-free mutual exclusion for any shared data structure? Install Qt on Ubuntu #include errors detected in vscode Cannot open include file: 'stdio.h' - Visual Studio Community 2017 - C++ Error How to fix the error "Windows SDK version 8.1" was not found? Visual Studio 2017 errors on standard headers How do I check if a Key is pressed on C++

Examples related to c

conflicting types for 'outchar' Can't compile C program on a Mac after upgrade to Mojave Program to find largest and second largest number in array Prime numbers between 1 to 100 in C Programming Language In c, in bool, true == 1 and false == 0? How I can print to stderr in C? Visual Studio Code includePath "error: assignment to expression with array type error" when I assign a struct field (C) Compiling an application for use in highly radioactive environments How can you print multiple variables inside a string using printf?

Examples related to gcc

Can't compile C program on a Mac after upgrade to Mojave Compiling an application for use in highly radioactive environments Make Error 127 when running trying to compile code How to Install gcc 5.3 with yum on CentOS 7.2? How does one set up the Visual Studio Code compiler/debugger to GCC? How do I set up CLion to compile and run? CMake error at CMakeLists.txt:30 (project): No CMAKE_C_COMPILER could be found How to printf a 64-bit integer as hex? Differences between arm64 and aarch64 Fatal error: iostream: No such file or directory in compiling C program using GCC

Examples related to embedded

Compiling an application for use in highly radioactive environments Sorting 1 million 8-decimal-digit numbers with 1 MB of RAM What does this GCC error "... relocation truncated to fit..." mean? How do you implement a class in C? Understanding Linux /proc/id/maps Using floats with sprintf() in embedded C What is the difference between C and embedded C? Unit Testing C Code

Examples related to fault-tolerance

Compiling an application for use in highly radioactive environments