What is the correct way of reading from a TCP socket in C/C++?

Question

Here's my code:

    // Not all headers are relevant to the code snippet.
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <netdb.h>
    #include <cstdlib>
    #include <cstring>
    #include <unistd.h>

    char *buffer;
    stringstream readStream;
    bool readData = true;

    while (readData)
    {
        cout << "Receiving chunk... ";

        // Read a bit at a time, eventually "end" string will be received.
        bzero(buffer, BUFFER_SIZE);
        int readResult = read(socketFileDescriptor, buffer, BUFFER_SIZE);
        if (readResult < 0)
        {
            THROW_VIMRID_EX("Could not read from socket.");
        }

        // Concatenate the received data to the existing data.
        readStream << buffer;

        // Continue reading while end is not found.
        readData = readStream.str().find("end;") == string::npos;

        cout << "Done (length: " << readStream.str().length() << ")" << endl;
    }

It's a little bit of C and C++, as you can tell. The BUFFER_SIZE is 256. Should I just increase the size? If so, what to? Does it matter?

I know that if "end" is not received for whatever reason, this will be an endless loop, which is bad, so if you could suggest a better way, please also do so.

User · Answer

Just to add to things from several of the posts above:

read() -- at least on my system -- returns ssize_t. This is like size_t, except that it is signed. On my system, it's a long, not an int. You might get compiler warnings if you store the result in an int, depending on your system, your compiler, and which warnings you have turned on.

User · Answer

Where are you allocating memory for your buffer? The line where you invoke bzero invokes undefined behavior, since buffer does not point to any valid region of memory.

    char *buffer = new char[BUFFER_SIZE];
    // do processing

    // don't forget to release
    delete[] buffer;

User · Answer

Without knowing your full application it is hard to say what the best way to approach the problem is, but a common technique is to use a header which starts with a fixed-length field that denotes the length of the rest of your message.

Assume that your header consists only of a 4-byte integer which denotes the length of the rest of your message. Then simply do the following:

    // This assumes buffer is at least x bytes long,
    // and that the socket is blocking.
    void ReadXBytes(int socket, unsigned int x, void* buffer)
    {
        unsigned int bytesRead = 0;
        ssize_t result;
        while (bytesRead < x)
        {
            // Cast to char* because arithmetic on void* is not standard.
            result = read(socket, static_cast<char*>(buffer) + bytesRead, x - bytesRead);
            if (result < 1)
            {
                // Throw your error.
            }

            bytesRead += result;
        }
    }

Then later in the code:

    unsigned int length = 0;
    char* buffer = 0;
    // we assume that sizeof(length) will return 4 here.
    ReadXBytes(socketFileDescriptor, sizeof(length), (void*)(&length));
    buffer = new char[length];
    ReadXBytes(socketFileDescriptor, length, (void*)buffer);

    // Then process the data as needed.

    delete [] buffer;

This makes a few assumptions:

- ints are the same size on the sender and receiver.
- Endianness is the same on both the sender and receiver.
- You have control of the protocol on both sides.
- When you send a message you can calculate the length up front.

Since it is common to want to explicitly know the size of the integer you are sending across the network, define them in a header file and use them explicitly, such as:

    // These typedefs will vary across different platforms
    // such as linux, win32, OS/X etc, but the idea
    // is that a Int8 is always 8 bits, and a UInt32 is always
    // 32 bits regardless of the platform you are on.
    // These vary from compiler to compiler, so you have to
    // look them up in the compiler documentation.
    typedef char Int8;
    typedef short int Int16;
    typedef int Int32;

    typedef unsigned char UInt8;
    typedef unsigned short int UInt16;
    typedef unsigned int UInt32;

This would change the above to:

    UInt32 length = 0;
    char* buffer = 0;

    ReadXBytes(socketFileDescriptor, sizeof(length), (void*)(&length));
    buffer = new char[length];
    ReadXBytes(socketFileDescriptor, length, (void*)buffer);

    // process

    delete [] buffer;

I hope this helps.
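The answer above only shows the receiving side. As a rough sketch of the matching sender, assuming the 4-byte length is sent in network byte order (which removes the same-endianness assumption, provided the receiver calls ntohl on the length before using it), something like the following could work. The names `write_all` and `send_framed` are hypothetical helpers for illustration, not part of the answer.

```c
#include <arpa/inet.h>   /* htonl */
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Write exactly len bytes, looping over short writes.
 * Returns 0 on success, -1 on error (errno is left as set by write). */
int write_all(int fd, const void *data, size_t len)
{
    const unsigned char *p = data;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;            /* interrupted by a signal: retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: send a 4-byte big-endian length, then the payload. */
int send_framed(int fd, const void *payload, uint32_t len)
{
    uint32_t wire_len = htonl(len);  /* host to network byte order */
    if (write_all(fd, &wire_len, sizeof wire_len) != 0)
        return -1;
    return write_all(fd, payload, len);
}
```

A caller would use it as `send_framed(socketFileDescriptor, data, dataLength)`. Like read, write may transfer fewer bytes than requested, which is why the loop is there.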

User · Answer

Several pointers:

You need to handle a return value of 0, which tells you that the remote host closed the socket.

For nonblocking sockets, you also need to check an error return value (-1) and make sure that errno isn't EAGAIN or EWOULDBLOCK, which is expected when no data is available yet.

You definitely need better error handling: you're potentially leaking the buffer pointed to by 'buffer'. Which, I noticed, you don't allocate anywhere in this code snippet.

Someone else made a good point about how your buffer isn't a null-terminated C string if your read() fills the entire buffer. That is indeed a problem, and a serious one.

Your buffer size is a bit small, but should work as long as you don't try to read more than 256 bytes, or whatever you allocate for it.

If you're worried about getting into an infinite loop when the remote host sends you a malformed message (a potential denial of service attack), then you should use select() with a timeout on the socket to check for readability, and only read if data is available, and bail out if select() times out.

Something like this might work for you:

    fd_set read_set;
    struct timeval timeout;

    timeout.tv_sec = 60; // Time out after a minute
    timeout.tv_usec = 0;

    FD_ZERO(&read_set);
    FD_SET(socketFileDescriptor, &read_set);

    int r = select(socketFileDescriptor + 1, &read_set, NULL, NULL, &timeout);

    if (r < 0) {
        // Handle the error
    }

    if (r == 0) {
        // Timeout - handle that. You could try waiting again, close the socket...
    }

    if (r > 0) {
        // The socket is ready for reading - call read() on it.
    }

Depending on the volume of data you expect to receive, the way you scan the entire message repeatedly for the "end;" token is very inefficient. This is better done with a state machine (the states being 'e'->'n'->'d'->';') so that you only look at each incoming character once.

And seriously, you should consider finding a library to do all this for you. It's not easy getting it right.
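The state-machine idea can be sketched roughly as follows. This is only an illustration under the assumption that the terminator is the literal "end;" from the question; `end_scanner` and `end_scanner_feed` are made-up names. The scanner keeps its state between chunks, so a token split across two read() calls is still found and each byte is inspected once.

```c
#include <stddef.h>

/* Scanner state: how many leading characters of "end;" have matched so far. */
struct end_scanner {
    size_t matched;
};

/* Feed one received chunk to the scanner.
 * Returns the offset just past the terminator within this chunk,
 * or -1 if the terminator has not been completed yet. */
long end_scanner_feed(struct end_scanner *s, const char *data, size_t len)
{
    static const char token[] = "end;";
    const size_t token_len = sizeof token - 1;

    for (size_t i = 0; i < len; i++) {
        if (data[i] == token[s->matched]) {
            s->matched++;            /* advance to the next state */
        } else if (data[i] == token[0]) {
            s->matched = 1;          /* mismatch, but this byte restarts a match */
        } else {
            s->matched = 0;          /* mismatch: back to the start state */
        }
        if (s->matched == token_len) {
            s->matched = 0;          /* reset so the scanner can be reused */
            return (long)(i + 1);
        }
    }
    return -1;
}
```

The simple fallback on mismatch is only correct because no proper prefix of "end;" reappears inside it other than the first character; a token with repeated prefixes would need a proper failure table.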

User · Answer

For any non-trivial application (i.e. the application must receive and handle different kinds of messages with different lengths), the solution to your particular problem isn't necessarily just a programming solution: it's a convention, i.e. a protocol.

In order to determine how many bytes you should pass to your read call, you should establish a common prefix, or header, that your application receives. That way, when a socket first has reads available, you can make decisions about what to expect.

A binary example might look like this:

    #include <stdint.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <arpa/inet.h>

    enum MessageType {
        MESSAGE_FOO,
        MESSAGE_BAR,
    };

    struct MessageHeader {
        uint32_t type;
        uint32_t length;
    };

    /**
     * Attempts to continue reading a `socket` until `bytes` number
     * of bytes are read. Returns truthy on success, falsy on failure.
     *
     * Similar to @grieve's ReadXBytes.
     */
    int readExpected(int socket, void *destination, size_t bytes)
    {
        /*
         * Can't increment a void pointer, as incrementing
         * is done by the width of the pointed-to type -
         * and void doesn't have a width
         *
         * You can in GCC but it's not very portable
         */
        char *destinationBytes = destination;
        while (bytes) {
            ssize_t readBytes = read(socket, destinationBytes, bytes);
            if (readBytes < 1)
                return 0;
            destinationBytes += readBytes;
            bytes -= readBytes;
        }
        return 1;
    }

    int main(int argc, char **argv)
    {
        int selectedFd;

        // use `select` or `poll` to wait on sockets
        // received a message on `selectedFd`, start reading

        char *fooMessage;
        struct {
            uint32_t a;
            uint32_t b;
        } barMessage;

        struct MessageHeader received;
        if (!readExpected(selectedFd, &received, sizeof(received))) {
            // handle error; the header is unusable, so stop here
            return 1;
        }
        // handle network/host byte order differences maybe
        received.type = ntohl(received.type);
        received.length = ntohl(received.length);

        switch (received.type) {
            case MESSAGE_FOO:
                // "foo" sends an ASCII string or something
                fooMessage = calloc(received.length + 1, 1);
                if (fooMessage && readExpected(selectedFd, fooMessage, received.length))
                    puts(fooMessage);
                free(fooMessage);
                break;
            case MESSAGE_BAR:
                // "bar" sends a message of a fixed size
                if (readExpected(selectedFd, &barMessage, sizeof(barMessage))) {
                    barMessage.a = ntohl(barMessage.a);
                    barMessage.b = ntohl(barMessage.b);
                    printf("a + b = %u\n", (unsigned)(barMessage.a + barMessage.b));
                }
                break;
            default:
                puts("Malformed type received");
                // kick the client out probably
        }
    }

You can likely already see one disadvantage of using a binary format: for each attribute wider than a char that you read, you will have to ensure its byte order is correct using the ntohl or ntohs functions. An alternative is to use byte-encoded messages, such as simple ASCII or UTF-8 strings, which avoid byte-order issues entirely but require extra effort to parse and validate.

There are two final considerations for network data in C.

The first is that some C types do not have fixed widths. For example, the width of the humble int is implementation-defined: the standard only guarantees at least 16 bits, and long is 32 bits on some platforms and 64 bits on others. Good, portable code should have network data use fixed-width types, like those defined in stdint.h.

The second is struct padding. A struct with members of different widths will add data in between some members to maintain memory alignment, making the struct faster to use in the program but sometimes producing confusing results.

    #include <stdio.h>
    #include <stdint.h>

    int main()
    {
        struct A {
            char a;
            uint32_t b;
        } A;

        printf("sizeof(A) = %zu\n", sizeof(A));
    }

In this example, its actual width won't be 1 char + 4 uint32_t = 5 bytes, it'll be 8:

    mharrison@mharrison-KATANA:~$ gcc -o padding padding.c
    mharrison@mharrison-KATANA:~$ ./padding
    sizeof(A) = 8

This is because 3 bytes are added after char a to make sure uint32_t b is memory-aligned.

So if you write a struct A, then attempt to read a char and a uint32_t on the other side, you'll get char a, and a uint32_t where the first three bytes are garbage and the last byte is the first byte of the actual integer you wrote. Either document your data format explicitly as C struct types or, better yet, document any padding bytes they might contain.
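One way to sidestep both padding and byte order is to never write the struct itself and instead copy each field into a byte buffer by hand. The following is a minimal sketch of that idea for the struct A from the padding example; the 5-byte big-endian wire layout and the names `A_WIRE_SIZE`, `a_encode` and `a_decode` are assumptions made for illustration.

```c
#include <stdint.h>

/* Same fields as struct A in the padding example; only the in-memory
 * layout contains padding, the wire layout below does not. */
struct A {
    char a;
    uint32_t b;
};

/* Assumed wire format: 1 byte for a, then b as 4 big-endian bytes. */
#define A_WIRE_SIZE 5

void a_encode(const struct A *in, unsigned char out[A_WIRE_SIZE])
{
    out[0] = (unsigned char)in->a;
    out[1] = (unsigned char)(in->b >> 24);
    out[2] = (unsigned char)(in->b >> 16);
    out[3] = (unsigned char)(in->b >> 8);
    out[4] = (unsigned char)(in->b);
}

void a_decode(const unsigned char in[A_WIRE_SIZE], struct A *out)
{
    out->a = (char)in[0];
    out->b = ((uint32_t)in[1] << 24) |
             ((uint32_t)in[2] << 16) |
             ((uint32_t)in[3] << 8)  |
              (uint32_t)in[4];
}
```

The sender would write the A_WIRE_SIZE bytes produced by a_encode, and the receiver would read exactly that many bytes and pass them to a_decode. Because the shifts operate on values rather than on memory, the result is the same on little-endian and big-endian hosts.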

User · Answer

This is an article that I always refer to when working with sockets: THE WORLD OF SELECT(). It will show you how to reliably use select() and contains some other useful links at the bottom for further info on sockets.

User · Answer

1. Others (especially dirkgently) have noted that buffer needs to be allocated some memory space. For smallish values of N (say, N <= 4096), you can also allocate it on the stack:

    #define BUFFER_SIZE 4096
    char buffer[BUFFER_SIZE];

This saves you the worry of ensuring that you delete[] the buffer should an exception be thrown. But remember that stacks are finite in size (so are heaps, but stacks are finiter), so you don't want to put too much there.

2. On a -1 return code, you should not simply return immediately (throwing an exception immediately is even more sketchy). There are certain normal conditions that you need to handle, if your code is to be anything more than a short homework assignment. For example, EAGAIN may be returned in errno if no data is currently available on a non-blocking socket. Have a look at the man page for read(2).
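As a sketch of what handling those normal conditions might look like, the wrapper below separates "got data", "peer closed", "no data yet" and "real error" instead of treating every -1 as fatal. The enum and the name `read_some` are hypothetical; only read(2) and the errno values are real.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

enum read_status { READ_OK, READ_EOF, READ_WOULD_BLOCK, READ_ERROR };

/* Hypothetical wrapper around read(2).
 * On READ_OK, *got holds the number of bytes stored in buf. */
enum read_status read_some(int fd, void *buf, size_t cap, size_t *got)
{
    for (;;) {
        ssize_t n = read(fd, buf, cap);
        if (n > 0) {
            *got = (size_t)n;
            return READ_OK;
        }
        *got = 0;
        if (n == 0)
            return READ_EOF;          /* remote host closed the connection */
        if (errno == EINTR)
            continue;                 /* interrupted by a signal: retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return READ_WOULD_BLOCK;  /* non-blocking socket, no data yet */
        return READ_ERROR;            /* genuine failure, inspect errno */
    }
}
```

On READ_WOULD_BLOCK the caller would typically go back to select() or poll() and wait until the socket is readable again, rather than throwing.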

User · Answer

If you actually create the buffer as per dirk's suggestion, then:

    int readResult = read(socketFileDescriptor, buffer, BUFFER_SIZE);

may completely fill the buffer, possibly overwriting the terminating zero character which you depend on when extracting to a stringstream. You need:

    int readResult = read(socketFileDescriptor, buffer, BUFFER_SIZE - 1);
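A small sketch of the same idea in plain C, where the terminator is written at the position read() reports instead of relying on a prior bzero(). The function name `read_chunk` is made up for this example.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read at most cap - 1 bytes into buf and
 * NUL-terminate exactly after the bytes received.
 * Returns what read() returned: >0 bytes read, 0 on EOF, -1 on error. */
ssize_t read_chunk(int fd, char *buf, size_t cap)
{
    if (cap == 0)
        return -1;                   /* no room even for the terminator */

    ssize_t n = read(fd, buf, cap - 1);
    if (n >= 0)
        buf[n] = '\0';               /* terminate after the last byte read */
    return n;
}
```

This still treats the data as text: if the peer can send bytes with the value 0, the returned length should be used (for example by appending exactly n bytes to the stream) rather than the terminator.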
