Why is the gets function so dangerous that it should not be used

Question

When I try to compile C code that uses the gets() function with GCC, I get this warning:

(.text+0x34): warning: the `gets' function is dangerous and should not be used.

I remember this has something to do with stack protection and security, but I'm not sure exactly why.

How can I remove this warning and why is there such a warning about using gets()?

If gets() is so dangerous then why can't we remove it?

User · Answer

Why is gets() dangerous

The first internet worm (the Morris Internet Worm) escaped about 30 years ago (1988-11-02), and it used gets() and a buffer overflow as one of its methods of propagating from system to system. The basic problem is that the function doesn't know how big the buffer is, so it continues reading until it finds a newline or encounters EOF, and may overflow the bounds of the buffer it was given.

You should forget you ever heard that gets() existed.

The C11 standard ISO/IEC 9899:2011 eliminated gets() as a standard function, which is A Good Thing (it was formally marked as 'obsolescent' and 'deprecated' in ISO/IEC 9899:1999/Cor.3:2007, Technical Corrigendum 3 for C99, and then removed in C11). Sadly, it will remain in libraries for many years (meaning 'decades') for reasons of backwards compatibility. If it were up to me, the implementation of gets() would become:

char *gets(char *buffer)
{
    assert(buffer != 0);
    abort();
    return 0;
}

Given that your code will crash anyway, sooner or later, it is better to head the trouble off sooner rather than later. I'd be prepared to add an error message:

fputs("obsolete and dangerous function gets() called\n", stderr);

Modern versions of the Linux compilation system generate warnings if you link gets(), and also for some other functions that have security problems (mktemp(), ...).

Alternatives to gets()

fgets()

As everyone else said, the canonical alternative to gets() is fgets() specifying stdin as the file stream.

char buffer[BUFSIZ];

while (fgets(buffer, sizeof(buffer), stdin) != 0)
{
    /* ...process line of data... */
}

What no-one else yet mentioned is that gets() does not include the newline but fgets() does. So, you might need to use a wrapper around fgets() that deletes the newline:

char *fgets_wrapper(char *buffer, size_t buflen, FILE *fp)
{
    if (fgets(buffer, buflen, fp) != 0)
    {
        size_t len = strlen(buffer);
        if (len > 0 && buffer[len-1] == '\n')
            buffer[len-1] = '\0';
        return buffer;
    }
    return 0;
}

Or, better:

char *fgets_wrapper(char *buffer, size_t buflen, FILE *fp)
{
    if (fgets(buffer, buflen, fp) != 0)
    {
        buffer[strcspn(buffer, "\n")] = '\0';
        return buffer;
    }
    return 0;
}

Also, as caf points out in a comment and paxdiablo shows in his answer, with fgets() you might have data left over on a line. My wrapper code leaves that data to be read next time; you can readily modify it to gobble the rest of the line of data if you prefer:

        if (len > 0 && buffer[len-1] == '\n')
            buffer[len-1] = '\0';
        else
        {
            int ch;
            while ((ch = getc(fp)) != EOF && ch != '\n')
                ;
        }

The residual problem is how to report the three different result states: EOF or error, line read and not truncated, and partial line read but data was truncated.

This problem doesn't arise with gets() because it doesn't know where your buffer ends and merrily tramples beyond the end, wreaking havoc on your beautifully tended memory layout, often messing up the return stack (a Stack Overflow) if the buffer is allocated on the stack, or trampling over the control information if the buffer is dynamically allocated, or copying data over other precious global (or module) variables if the buffer is statically allocated. None of these is a good idea; they epitomize the phrase 'undefined behaviour'.

There is also the TR 24731-1 (Technical Report from the C Standard Committee), which provides safer alternatives to a variety of functions, including gets(). The relevant section reads:

§6.5.4.1 The gets_s function

Synopsis

#define __STDC_WANT_LIB_EXT1__ 1
#include <stdio.h>
char *gets_s(char *s, rsize_t n);

Runtime-constraints

s shall not be a null pointer. n shall neither be equal to zero nor be greater than RSIZE_MAX. A new-line character, end-of-file, or read error shall occur within reading n-1 characters from stdin.25)

3 If there is a runtime-constraint violation, s[0] is set to the null character, and characters are read and discarded from stdin until a new-line character is read, or end-of-file or a read error occurs.

Description

4 The gets_s function reads at most one less than the number of characters specified by n from the stream pointed to by stdin, into the array pointed to by s. No additional characters are read after a new-line character (which is discarded) or after end-of-file. The discarded new-line character does not count towards number of characters read. A null character is written immediately after the last character read into the array.

5 If end-of-file is encountered and no characters have been read into the array, or if a read error occurs during the operation, then s[0] is set to the null character, and the other elements of s take unspecified values.

Recommended practice

6 The fgets function allows properly-written programs to safely process input lines too long to store in the result array. In general this requires that callers of fgets pay attention to the presence or absence of a new-line character in the result array. Consider using fgets (along with any needed processing based on new-line characters) instead of gets_s.

25) The gets_s function, unlike gets, makes it a runtime-constraint violation for a line of input to overflow the buffer to store it. Unlike fgets, gets_s maintains a one-to-one relationship between input lines and successful calls to gets_s. Programs that use gets expect such a relationship.

The Microsoft Visual Studio compilers implement an approximation to the TR 24731-1 standard, but there are differences between the signatures implemented by Microsoft and those in the TR.

The C11 standard, ISO/IEC 9899:2011, includes TR 24731 in Annex K as an optional part of the library. Unfortunately, it is seldom implemented on Unix-like systems.

getline() (POSIX)

POSIX 2008 also provides a safe alternative to gets() called getline(). It allocates space for the line dynamically, so you end up needing to free it. It therefore removes the limitation on line length. It also returns the length of the data that was read, or -1 (and not EOF!), which means that null bytes in the input can be handled reliably. There is also a 'choose your own single-character delimiter' variation called getdelim(); this can be useful if you are dealing with the output from find -print0, where the ends of the file names are marked with an ASCII NUL '\0' character, for example.
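As a rough illustration of the getline() behaviour described above, the sketch below reads every line of a stream regardless of length and uses the returned length to strip the newline. The helper name longest_line is an assumption made for this example, and the code assumes a POSIX.1-2008 system; it is not taken from the answer.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for POSIX.1-2008 declarations such as getline() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>           /* ssize_t */

/* Hypothetical helper (the name is an assumption): returns the length of the
   longest line in `in`, not counting the trailing newline.
   Returns 0 for an empty stream. */
size_t longest_line(FILE *in)
{
    char *line = NULL;   /* getline() allocates and grows this buffer */
    size_t cap = 0;      /* current capacity of `line` */
    ssize_t len;
    size_t longest = 0;

    assert(in != NULL);
    while ((len = getline(&line, &cap, in)) != -1) {
        if (len > 0 && line[len - 1] == '\n')
            line[--len] = '\0';      /* strip the newline using the returned length */
        if ((size_t)len > longest)
            longest = (size_t)len;
    }
    free(line);          /* a single free() covers every reallocation */
    return longest;
}
```

Because the buffer is reused across calls and only freed once, the caller never has to guess a maximum line length up front.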

User · Answer

gets() is dangerous because it is possible for the user to crash the program by typing too much into the prompt. It can't detect the end of available memory, so if you allocate an amount of memory too small for the purpose, it can cause a seg fault and crash. Sometimes it seems very unlikely that a user will type 1000 letters into a prompt meant for a person's name, but as programmers, we need to make our programs bulletproof. (It may also be a security risk if a user can crash a system program by sending too much data.)

fgets() allows you to specify how many characters are taken out of the standard input buffer, so they don't overrun the variable.

User · Answer

fgets().

To read from the stdin:

char string[512];

fgets(string, sizeof(string), stdin); /* no buffer overflows here, you're safe! */

User · Answer

Because gets doesn't do any kind of check while getting bytes from stdin and putting them somewhere. A simple example:

char array1[] = "12345";
char array2[] = "67890";

gets(array1);

Now, first of all, you are allowed to input as many characters as you want; gets won't care about it. Secondly, the bytes beyond the size of the array in which you put them (in this case array1) will overwrite whatever they find in memory, because gets will write them. In the previous example this means that if you input "abcdefghijklmnopqrts", it may (unpredictably) also overwrite array2 or whatever else is nearby.

The function is unsafe because it assumes consistent input. NEVER USE IT!
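For contrast, here is a minimal sketch of the same array1/array2 setup read with fgets(), which is told the size of the destination. The function name bounded_read_demo and its parameters are assumptions made for this illustration; it reads from any FILE * so it can be tried on a file as well as stdin.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical demo (names are assumptions): repeats the array1/array2 setup,
   but reads with fgets() so the write is bounded by sizeof array1.
   Copies what was stored into `stored` and returns 1 if array2 is untouched,
   0 if it changed, -1 on EOF or read error. */
int bounded_read_demo(FILE *in, char *stored, size_t stored_size)
{
    char array1[] = "12345";   /* 6 bytes: five characters plus '\0' */
    char array2[] = "67890";

    assert(in != NULL && stored != NULL && stored_size > 0);
    if (fgets(array1, sizeof array1, in) == NULL)
        return -1;

    /* fgets() stored at most sizeof array1 - 1 characters and a '\0'. */
    snprintf(stored, stored_size, "%s", array1);
    return strcmp(array2, "67890") == 0;
}
```

Fed the 20-character line from the example, only the first five characters are stored; the remainder stays in the stream for the next read.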

User · Answer

I read recently, in a USENET post to comp.lang.c, that gets() is getting removed from the Standard. WOOHOO!

"You'll be happy to know that the committee just voted (unanimously, as it turns out) to remove gets() from the draft as well."

User · Answer

You can't remove API functions without breaking the API. If you did, many applications would no longer compile or run at all.

This is the reason that one reference gives:

"Reading a line that overflows the array pointed to by s results in undefined behavior. The use of fgets() is recommended."

User · Answer

In C11 (ISO/IEC 9899:201x), gets() has been removed. (It's deprecated in ISO/IEC 9899:1999/Cor.3:2007(E).)

In addition to fgets(), C11 introduces a new safe alternative, gets_s():

C11 K.3.5.4.1 The gets_s function

#define __STDC_WANT_LIB_EXT1__ 1
#include <stdio.h>
char *gets_s(char *s, rsize_t n);

However, in the Recommended practice section, fgets() is still preferred:

"The fgets function allows properly-written programs to safely process input lines too long to store in the result array. In general this requires that callers of fgets pay attention to the presence or absence of a new-line character in the result array. Consider using fgets (along with any needed processing based on new-line characters) instead of gets_s."
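Since Annex K is optional, code that wants gets_s() generally has to check whether the library provides it. The sketch below tests the standard __STDC_LIB_EXT1__ macro and otherwise falls back to fgets(); the wrapper name read_stdin_line is an assumption made for this example. Note that the two branches differ on over-long lines: gets_s() treats them as a runtime-constraint violation, while the fgets() branch truncates and leaves the rest in stdin.

```c
#define __STDC_WANT_LIB_EXT1__ 1   /* request Annex K declarations if available */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper (the name is an assumption): reads one line from stdin
   without its newline. Returns buf on success, NULL on EOF or error. */
char *read_stdin_line(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
#ifdef __STDC_LIB_EXT1__
    /* Annex K is implemented: an over-long line is a runtime-constraint violation. */
    return gets_s(buf, size);
#else
    /* No Annex K: fgets() keeps the newline, so strip it; an over-long line
       is truncated and the remainder stays in stdin. */
    if (fgets(buf, (int)size, stdin) == NULL)
        return NULL;
    buf[strcspn(buf, "\n")] = '\0';
    return buf;
#endif
}
```

On most Unix-like C libraries the fallback branch is the one that gets compiled, which matches the standard's own recommendation to prefer fgets().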

User · Answer

You should not use gets since it has no way to stop a buffer overflow. If the user types in more data than can fit in your buffer, you will most likely end up with corruption or worse.

In fact, ISO have actually taken the step of removing gets from the C standard (as of C11, though it was deprecated in C99) which, given how highly they rate backward compatibility, should be an indication of how bad that function was.

The correct thing to do is to use the fgets function with the stdin file handle since you can limit the characters read from the user.

But this also has its problems, such as: extra characters entered by the user will be picked up the next time around; and there's no quick notification that the user entered too much data.

To that end, almost every C coder at some point in their career will write a more useful wrapper around fgets as well. Here's mine:

#include <stdio.h>
#include <string.h>

#define OK       0
#define NO_INPUT 1
#define TOO_LONG 2
static int getLine (char *prmpt, char *buff, size_t sz) {
    int ch, extra;

    // Get line with buffer overrun protection.
    if (prmpt != NULL) {
        printf ("%s", prmpt);
        fflush (stdout);
    }
    if (fgets (buff, sz, stdin) == NULL)
        return NO_INPUT;

    // If it was too long, there'll be no newline. In that case, we flush
    // to end of line so that excess doesn't affect the next call.
    if (buff[strlen(buff)-1] != '\n') {
        extra = 0;
        while (((ch = getchar()) != '\n') && (ch != EOF))
            extra = 1;
        return (extra == 1) ? TOO_LONG : OK;
    }

    // Otherwise remove newline and give string back to caller.
    buff[strlen(buff)-1] = '\0';
    return OK;
}

with some test code:

// Test program for getLine().

int main (void) {
    int rc;
    char buff[10];

    rc = getLine ("Enter string> ", buff, sizeof(buff));
    if (rc == NO_INPUT) {
        printf ("No input\n");
        return 1;
    }

    if (rc == TOO_LONG) {
        printf ("Input too long\n");
        return 1;
    }

    printf ("OK [%s]\n", buff);

    return 0;
}

It provides the same protections as fgets in that it prevents buffer overflows, but it also notifies the caller as to what happened and clears out the excess characters so that they do not affect your next input operation.

Feel free to use it as you wish; I hereby release it under the "do what you damn well want to" licence :-)

User · Answer

The C gets function is dangerous and has been a very costly mistake. Tony Hoare singles it out for specific mention in his talk "Null References: The Billion Dollar Mistake":

http://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare

The whole hour is worth watching but for his comments view from 30 minutes on with the specific gets criticism around 39 minutes.

Hopefully this whets your appetite for the whole talk, which draws attention to how we need more formal correctness proofs in languages and how language designers should be blamed for the mistakes in their languages, not the programmer. This seems to have been the whole dubious reason for designers of bad languages to push the blame to programmers in the guise of 'programmer freedom'.

User · Answer

I would like to extend an earnest invitation to any C library maintainers out there who are still including gets in their libraries "just in case anyone is still depending on it": Please replace your implementation with the equivalent of

char *gets(char *str)
{
    strcpy(str, "Never use gets!");
    return str;
}

This will help make sure nobody is still depending on it. Thank you.

User · Answer

In order to use gets safely, you have to know exactly how many characters you will be reading, so that you can make your buffer large enough. You will only know that if you know exactly what data you will be reading.

Instead of using gets, you want to use fgets, which has the signature

char* fgets(char *string, int length, FILE * stream);

(fgets, if it reads an entire line, will leave the '\n' in the string; you'll have to deal with that.)

gets remained an official part of the language up to the 1999 ISO C standard, but it was officially removed by the 2011 standard. Most C implementations still support it, but at least gcc issues a warning for any code that uses it.
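The '\n' that fgets leaves in the string is also what tells you whether the whole line fit in the buffer. The following is a minimal sketch of one way to use it, assuming a hypothetical read_line function and line_status codes invented for this example: it strips the newline, and when a line is too long it discards the remainder and says so.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result codes and function name, chosen for this sketch. */
enum line_status { LINE_EOF, LINE_OK, LINE_TRUNCATED };

enum line_status read_line(char *buf, size_t size, FILE *in)
{
    size_t len;
    int ch;

    assert(buf != NULL && size > 1 && in != NULL);
    if (fgets(buf, (int)size, in) == NULL)
        return LINE_EOF;                 /* end of file or read error */

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';             /* newline present: the whole line fit */
        return LINE_OK;
    }
    if (len < size - 1)
        return LINE_OK;                  /* typically a final line with no newline */

    /* Buffer is full and no newline was stored: look at what follows. */
    ch = getc(in);
    if (ch == '\n' || ch == EOF)
        return LINE_OK;                  /* the line fit exactly */
    while ((ch = getc(in)) != EOF && ch != '\n')
        ;                                /* discard the rest of the long line */
    return LINE_TRUNCATED;
}
```

With a 5-byte buffer, the line "0123456789" comes back as "0123" with LINE_TRUNCATED, and the next call starts cleanly on the following line.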
