C switch statement limitations - why

Question

When writing a switch statement  there appears to be two limitations on what you can switch on in case statements   For example  and yes  I know  if you re doing this sort of thing it probably means your object-oriented  OO  architecture is iffy - this is just a contrived example       Type t   typeof int      switch  t         case typeof int         Console WriteLine  int           break       case typeof string         Console WriteLine  string           break       default        Console WriteLine  unknown           break        Here the switch   statement fails with  A value of an integral type expected  and the case statements fail with  A constant value is expected    Why are these restrictions in place  and what is the underlying justification  I don t see any reason why the switch statement has to succumb to static analysis only  and why the value being switched on has to be integral  that is   primitive   What is the justification

User · Answer

By the way  VB  having the same underlying architecture  allows much more flexible Select Case statements  the above code would work in VB  and still produces efficient code where this is possible so the argument by techical constraint has to be considered carefully

User · Answer

This is not a reason why  but the C  specification section 8 7 2 states the following      The governing type of a switch statement is established by the switch expression  If the type of the switch expression is sbyte  byte  short  ushort  int  uint  long  ulong  char  string  or an enum-type  then that is the governing type of the switch statement  Otherwise  exactly one user-defined implicit conversion    6 4  must exist from the type of the switch expression to one of the following possible governing types  sbyte  byte  short  ushort  int  uint  long  ulong  char  string  If no such implicit conversion exists  or if more than one such implicit conversion exists  a compile-time error occurs    The C  3 0 specification is located at   http   download microsoft com download 3 8 8 388e7205-bc10-4226-b2a8-75351c669b09 CSharp 20Language 20Specification doc

User · Answer

While on the topic  according to Jeff Atwood  the switch statement is a programming atrocity  Use them sparingly    You can often accomplish the same task using a table  For example   var table   new Dictionary lt Type  string gt           typeof int    it s an int          typeof string    it s a string         Type someType   typeof int   Console WriteLine table someType

User · Answer

It s important not to confuse the C  switch statement with the CIL switch instruction   The CIL switch is a jump table  that requires an index into a set of jump addresses   This is only useful if the C  switch s cases are adjacent   case 3  blah  break  case 4  blah  break  case 5  blah  break    But of little use if they aren t   case 10  blah  break  case 200  blah  break  case 3000  blah  break     You d need a table  3000 entries in size  with only 3 slots used   With non-adjacent expressions  the compiler may start to perform linear if-else-if-else checks   With larger non- adjacent expression sets  the compiler may start with a binary tree search  and finally if-else-if-else the last few items   With expression sets containing clumps of adjacent items  the compiler may binary tree search  and finally a CIL switch   This is full of  mays   amp   mights   and it is dependent on the compiler  may differ with Mono or Rotor    I replicated your results on my machine using adjacent cases      total time to execute a 10 way switch  10000 iterations  ms    25 1383   approximate time per 10 way switch  ms                         0 00251383      total time to execute a 50 way switch  10000 iterations  ms    26 593   approximate time per 50 way switch  ms                         0 0026593      total time to execute a 5000 way switch  10000 iterations  ms    23 7094   approximate time per 5000 way switch  ms                         0 00237094      total time to execute a 50000 way switch  10000 iterations  ms    20 0933   approximate time per 50000 way switch  ms                         0 00200933   Then I also did using non-adjacent case expressions      total time to execute a 10 way switch  10000 iterations  ms    19 6189   approximate time per 10 way switch  ms                         0 00196189      total time to execute a 500 way switch  10000 iterations  ms    19 1664   approximate time per 500 way switch  ms                         0 00191664      total time to execute a 5000 way switch  10000 iterations  ms    19 5871   approximate time per 5000 way switch  ms                         0 00195871      A non-adjacent 50 000 case switch statement would not compile     An expression is too long or complex to compile near  ConsoleApplication1 Program Main string           What s funny here  is that the binary tree search appears a little  probably not statistically  quicker than the CIL switch instruction   Brian  you ve used the word  constant   which has a very definite meaning from a computational complexity theory perspective  While the simplistic adjacent integer example may produce CIL that is considered O 1   constant   a sparse example is O log n   logarithmic   clustered examples lie somewhere in between  and small examples are O n   linear    This doesn t even address the String situation  in which a static Generic Dictionary lt string int32 gt  may be created  and will suffer definite overhead on first use  Performance here will be dependent on the performance of Generic Dictionary   If you check the C  Language Specification  not the CIL spec  you ll find  15 7 2 The switch statement  makes no mention of  constant time  or that the underlying implementation even uses the CIL switch instruction  be very careful of assuming such things    At the end of the day  a C  switch against an integer expression on a modern system is a sub-microsecond operation  and not normally worth worrying about     Of course these times will depend on machines and conditions  I wouldn   t pay attention to these timing tests  the microsecond durations we   re talking about are dwarfed by any    real    code being run  and you must include some    real code    otherwise the compiler will optimise the branch away   or jitter in the system  My answers are based on using IL DASM to examine the CIL created by the C  compiler  Of course  this isn   t final  as the actual instructions the CPU runs are then created by the JIT   I have checked the final CPU instructions actually executed on my x86 machine  and can confirm a simple adjacent set switch doing something like     jmp     ds 300025F0 eax 4    Where a binary tree search is full of     cmp     ebx  79Eh   jg      3000352B   cmp     ebx  654h   jg      300032BB         cmp     ebx  0F82h   jz      30005EEE

User · Answer

C  8 allows you to solve this problem elegantly and compactly using an switch expression   public string GetTypeName object obj        return obj switch               int i   gt   Int32           string s   gt   String                 gt   Unknown               gt  throw new ArgumentNullException nameof obj              As a result  you get   Console WriteLine GetTypeName obj  1                 Int32 Console WriteLine GetTypeName obj   string           String Console WriteLine GetTypeName obj  1 2               Unknown Console WriteLine GetTypeName obj  null              System ArgumentNullException   You can read more about the new feature here

User · Answer

I suppose there is no fundamental reason why the compiler couldn t automatically translate your switch statement into   if  t    typeof int           elseif  t    typeof string                 But there isn t much gained by that   A case statement on integral types allows the compiler to make a number of optimizations    There is no duplication  unless you duplicate case labels  which the compiler detects    In your example t could match multiple types due to inheritance   Should the first match be executed  All of them  The compiler can choose to implement a switch statement over an integral type by a jump table to avoid all the comparisons   If you are switching on an enumeration that has integer values 0 to 100 then it creates an array with 100 pointers in it  one for each switch statement  At runtime it simply looks up the address from the array based on the integer value being switched on   This makes for much better runtime performance than performing 100 comparisons

User · Answer

Judah s answer above gave me an idea   You can  fake  the OP s switch behavior above using a Dictionary lt Type  Func lt T gt    Dictionary lt Type  Func lt object  string   string gt  gt  typeTable   new Dictionary lt Type  Func lt object  string  string gt  gt     typeTable Add typeof int    o  s    gt                                                return string Format   0    1    s  o ToString                               This allows you to associate behavior with a type in the same style as the switch statement   I believe it has the added benefit of being keyed instead of a switch-style jump table when compiled to IL

User · Answer

I think Henk nailed it with the  no sttatic access to the type system  thing  Another option is that there is no order to types where as numerics and strings can be  Thus a type switch would can t build a binary search tree  just a linear search

User · Answer

Microsoft finally heard you   Now with C  7 you can   switch shape    case Circle c      WriteLine   circle with radius  c Radius         break  case Rectangle s when  s Length    s Height       WriteLine    s Length  x  s Height  square        break  case Rectangle r      WriteLine    r Length  x  r Height  rectangle        break  default      WriteLine   lt unknown shape gt         break  case null      throw new ArgumentNullException nameof shape

User · Answer

I don t see any reason why the switch statement has to succomb to static analysis only   True  it doesn t have to  and many languages do in fact use dynamic switch statements  This means however that reordering the  case  clauses can change the behaviour of the code   There s some interesting info behind the design decisions that went into  switch  in here  Why is the C  switch statement designed to not allow fall-through  but still require a break   Allowing dynamic case expressions can lead to monstrosities such as this PHP code   switch  true        case a    5                      break      case b    10                      break      which frankly should just use the if-else statement

User · Answer

I have virtually no knowledge of C   but I suspect that either switch was simply taken as it occurs in other languages without thinking about making it more general or the developer decided that extending it was not worth it   Strictly speaking you are absolutely right that there is no reason to put these restrictions on it  One might suspect that the reason is that for the allowed cases the implementation is very efficient  as suggested by Brian Ensink  44921    but I doubt the implementation is very efficient  w r t  if-statements  if I use integers and some random cases  e g  345  -4574 and 1234203   And in any case  what is the harm in allowing it for everything  or at least more  and saying that it is only efficient for specific cases  such as  almost  consecutive numbers    I can  however  imagine that one might want to exclude types because of reasons such as the one given by lomaxx  44918    Edit   Henk  44970   If Strings are maximally shared  strings with equal content will be pointers to the same memory location as well  Then  if you can make sure that the strings used in the cases are stored consecutively in memory  you can very efficiently implement the switch  i e  with execution in the order of 2 compares  an addition and two jumps

User · Answer

This is my original post  which sparked some debate    because it is wrong      The switch statement is not the same   thing as a big if-else statement     Each case must be unique and evaluated   statically   The switch statement does   a constant time branch regardless of   how many cases you have   The if-else   statement evaluates each condition   until it finds one that is true      In fact  the C  switch statement is not always a constant time branch   In some cases the compiler will use a CIL switch statement which is indeed a constant time branch using a jump table   However  in sparse cases as pointed out by Ivan Hamilton the compiler may generate something else entirely   This is actually quite easy to verify by writing various C  switch statements  some sparse  some dense  and looking at the resulting CIL with the ildasm exe tool

User · Answer

I agree with this comment that using a table driven approach is often better   In C  1 0 this was not possible because it didn t have generics and anonymous delegates  New versions of C  have the scaffolding to make this work  Having a notation for object literals is also helps

User · Answer

According to the switch statement documentation if there is an unambiguous way to implicitly convert the the object to an integral type  then it will be allowed   I think you are expecting a behavior where for each case statement it would be replaced with if  t    typeof int    but that would open a whole can of worms when you get to overload that operator   The behavior would change when implementation details for the switch statement changed if you wrote your    override incorrectly   By reducing the comparisons to integral types and string and those things that can be reduced to integral types  and are intended to  they avoid potential issues

User · Answer

wrote             The switch statement does a constant time branch regardless of how many cases you have        Since the language allows the string type to be used in a switch statement I presume the compiler is unable to generate code for a constant time branch implementation for this type and needs to generate an if-then style    mweerden - Ah I see  Thanks   I do not have a lot of experience in C  and  NET but it seems the language designers do not allow static access to the type system except in narrow circumstances  The typeof keyword returns an object so this is accessible at run-time only

User · Answer

The first reason that comes to mind is historical   Since most C  C    and Java programmers are not accustomed to having such freedoms  they do not demand them   Another  more valid  reason is that the language complexity would increase   First of all  should the objects be compared with  Equals   or with the    operator  Both are valid in some cases  Should we introduce new syntax to do this  Should we allow the programmer to introduce their own comparison method   In addition  allowing to switch on objects would break underlying assumptions about the switch statement  There are two rules governing the switch statement that the compiler would not be able to enforce if objects were allowed to be switched on  see the C  version 3 0 language specification    8 7 2     That the values of switch labels are constant That the values of switch labels are distinct  so that only one switch block can be selected for a given switch-expression    Consider this code example in the hypothetical case that non-constant case values were allowed   void DoIt         String foo    bar       Switch foo  foo      void Switch String val1  String val2        switch   bar                    The compiler will not know that val1 and val2 are not distinct         case val1                 Is this case block selected              break          case val2                 Or this one              break          case  bar                  Or perhaps this one              break            What will the code do  What if the case statements are reordered  Indeed  one of the reasons why C  made switch fall-through illegal is that the switch statements could be arbitrarily rearranged   These rules are in place for a reason - so that the programmer can  by looking at one case block  know for certain the precise condition under which the block is entered  When the aforementioned switch statement grows into 100 lines or more  and it will   such knowledge is invaluable

User · Answer

Mostly  those restrictions are in place because of language designers  The underlying justification may be compatibility with languange history  ideals  or simplification of compiler design   The compiler may  and does  choose to    create a big if-else statement  use a MSIL switch instruction  jump table  build a Generic Dictionary lt string int32 gt   populate it on first use  and call Generic Dictionary lt  gt   TryGetValue   for a index to pass to a MSIL switch instruction  jump table   use a combination of if-elses  amp  MSIL  switch  jumps   The switch statement IS NOT a constant time branch  The compiler may find short-cuts  using hash buckets  etc   but more complicated cases will generate more complicated MSIL code with some cases branching out earlier than others    To handle the String case  the compiler will end up  at some point  using a Equals b   and possibly a GetHashCode      I think it would be trival for the compiler to use any object that satisfies these constraints   As for the need for static case expressions    some of those optimisations  hashing  caching  etc  would not be available if the case expressions weren t deterministic  But we ve already seen that sometimes the compiler just picks the simplistic if-else-if-else road anyway     Edit  lomaxx - Your understanding of the  typeof  operator is not correct  The  typeof  operator is used to obtain the System Type object for a type  nothing to do with its supertypes or interfaces   Checking run-time compatibility of an object with a given type is the  is  operator s job  The use of  typeof  here to express an object is irrelevant

[c#] C# switch statement limitations - why?

Examples related to c#

Examples related to switch-statement