If I have a Python list that has many duplicates, and I want to iterate through each item but not through the duplicates, is it best to use a set (as in set(mylist)), or to find another way to create a list without duplicates? I was thinking of just looping through the list and checking for duplicates, but I figured that's essentially what set() does when it's initialized.
So if mylist = [3,1,5,2,4,4,1,4,2,5,1,3] and I really just want to loop through [1,2,3,4,5] (order doesn't matter), should I use set(mylist) or something else?
An alternative is possible in the last example: since the list contains every integer between its min and max value, I could loop through range(min(mylist), max(mylist)) instead of through set(mylist). Should I generally try to avoid using set in this case? Also, would finding the min and max be slower than just creating the set?
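For concreteness, here is a sketch of both approaches on the example list (note that range excludes its stop value, so a +1 is needed for the range version to include the maximum):

```python
mylist = [3, 1, 5, 2, 4, 4, 1, 4, 2, 5, 1, 3]

# Option 1: deduplicate with a set (iteration order not guaranteed)
unique_via_set = set(mylist)

# Option 2: exploit the fact that the list contains every integer
# between its min and max (range excludes its stop value, hence +1)
unique_via_range = list(range(min(mylist), max(mylist) + 1))

print(sorted(unique_via_set))  # [1, 2, 3, 4, 5]
print(unique_via_range)        # [1, 2, 3, 4, 5]
```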
In the case of the last example, the set is faster:
from numpy.random import random_integers
ids = random_integers(1000, size=1000000)

def set_loop(mylist):
    idlist = []
    for id in set(mylist):
        idlist.append(id)
    return idlist

def list_loop(mylist):
    idlist = []
    for id in range(min(mylist), max(mylist)):
        idlist.append(id)
    return idlist

%timeit set_loop(ids)
#1 loops, best of 3: 232 ms per loop
%timeit list_loop(ids)
#1 loops, best of 3: 408 ms per loop
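Note that list_loop as written silently drops the largest id, because range excludes its stop value. A corrected sketch:

```python
def list_loop_fixed(mylist):
    # range excludes its stop value, so add 1 to include max(mylist)
    return list(range(min(mylist), max(mylist) + 1))

print(list_loop_fixed([3, 1, 5, 2, 4]))  # [1, 2, 3, 4, 5]
```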
If the list is very large, looping over it twice will take a lot of time, and worse, the second time you are looping over a set, not a list, and iterating over a set is slower than iterating over a list. I think you need the combined power of a generator and a set.
def first_test():
    def loop_one_time(my_list):
        # create a set to keep track of the items already seen
        iterated_items = set()
        # iterating over a list is faster than iterating over a set
        for value in my_list:
            # membership checks on a set are very fast, no matter
            # the size of the set
            if value not in iterated_items:
                iterated_items.add(value)  # mark this item as seen
                yield value
    mylist = [3,1,5,2,4,4,1,4,2,5,1,3]
    for v in loop_one_time(mylist): pass

def second_test():
    mylist = [3,1,5,2,4,4,1,4,2,5,1,3]
    s = set(mylist)
    for v in s: pass

import timeit
print(timeit.timeit('first_test()', setup='from __main__ import first_test', number=10000))
print(timeit.timeit('second_test()', setup='from __main__ import second_test', number=10000))
Output:
0.024003583388435043
0.010424674188938422
Note: with this technique, the original order of the items is preserved.
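To see the order guarantee, the generator can be pulled out and checked directly (same logic as loop_one_time above, under a standalone name):

```python
def dedupe_in_order(items):
    # yield each item the first time it is seen, preserving order
    seen = set()
    for value in items:
        if value not in seen:
            seen.add(value)
            yield value

mylist = [3, 1, 5, 2, 4, 4, 1, 4, 2, 5, 1, 3]
print(list(dedupe_in_order(mylist)))  # [3, 1, 5, 2, 4]
```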
While a set may be what you want structure-wise, the question is what is faster. A list is faster. Your example code doesn't accurately compare set vs. list, because you're converting from a list to a set inside set_loop, and you're creating the list you'll be looping through inside list_loop. The set and the list you iterate through should be constructed and in memory ahead of time, and simply looped through, to see which data structure is faster to iterate over:
ids_list = list(range(1000000))
ids_set = set(ids_list)

def f(x):
    for i in x:
        pass
%timeit f(ids_set)
#1 loops, best of 3: 214 ms per loop
%timeit f(ids_list)
#1 loops, best of 3: 176 ms per loop
For simplicity's sake: newList = list(set(oldList))
But there are better options out there if you'd like to get speed/ordering/optimization instead: http://www.peterbe.com/plog/uniqifiers-benchmark
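One option from that family worth knowing: since Python 3.7, dicts preserve insertion order, so dict.fromkeys gives an order-preserving one-liner for deduplication:

```python
oldList = [3, 1, 5, 2, 4, 4, 1, 4, 2, 5, 1, 3]

# dict keys are unique and (since Python 3.7) keep insertion order
newList = list(dict.fromkeys(oldList))
print(newList)  # [3, 1, 5, 2, 4]
```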
set is what you want, so you should use set. Trying to be clever introduces subtle bugs, like forgetting to add one to max(mylist)! Code defensively. Worry about what's faster only once you've determined that it's too slow.
range(min(mylist), max(mylist) + 1) # <-- don't forget to add 1
Source: Stackoverflow.com