Python - difference between two strings

Question

I'd like to store a lot of words in a list. Many of these words are very similar. For example, I have the word afrykanerskojezyczny and many words like afrykanerskojezycznym, afrykanerskojezyczni, nieafrykanerskojezyczni. What is an effective (fast and giving a small diff size) solution to find the difference between two strings and restore the second string from the first one and the diff?

User · Answer

You can look into the regex module (the fuzzy section). I don't know if you can get the actual differences, but at least you can specify the allowed number of different types of changes, such as insertions, deletions, and substitutions:

    import regex

    sequence = 'afrykanerskojezyczny'
    queries = ['afrykanerskojezycznym', 'afrykanerskojezyczni',
               'nieafrykanerskojezyczni']

    for q in queries:
        # {e<=2} allows at most two errors (insertions, deletions, substitutions)
        m = regex.search(r'(%s){e<=2}' % q, sequence)
        print('match' if m else 'nomatch')

User · Answer

I like the ndiff answer, but if you want to split it all into a list of only the changes, you could do something like:

    import difflib

    case_a = 'afrykbnerskojezyczny'
    case_b = 'afrykanerskojezycznym'

    # Keep only the added ("+ ") and deleted ("- ") entries
    output_list = [li for li in difflib.ndiff(case_a, case_b) if li[0] != ' ']
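Note that a list holding only the changes drops the unchanged characters, so on its own it is not enough to rebuild either string. As a minimal sketch (the variable names `delta`, `restored_a` and `restored_b` are only illustrative), keeping the full ndiff output lets the standard-library `difflib.restore` recover either input:

```python
import difflib

case_a = 'afrykbnerskojezyczny'
case_b = 'afrykanerskojezycznym'

# Keep every ndiff entry, including the unchanged ("  ") ones.
delta = list(difflib.ndiff(case_a, case_b))

# restore(delta, 1) yields the first sequence, restore(delta, 2) the second.
restored_a = ''.join(difflib.restore(delta, 1))
restored_b = ''.join(difflib.restore(delta, 2))

print(restored_a == case_a, restored_b == case_b)
```

The full delta is longer than the second string itself, so this illustrates correctness of the round trip rather than a small diff size.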

User · Answer

You can use ndiff in the difflib module to do this. It has all the information necessary to convert one string into another string.

A simple example:

    import difflib

    cases = [('afrykanerskojezyczny', 'afrykanerskojezycznym'),
             ('afrykanerskojezyczni', 'nieafrykanerskojezyczni'),
             ('afrykanerskojezycznym', 'afrykanerskojezyczny'),
             ('nieafrykanerskojezyczni', 'afrykanerskojezyczni'),
             ('nieafrynerskojezyczni', 'afrykanerskojzyczni'),
             ('abcdefg', 'xac')]

    for a, b in cases:
        print('{} => {}'.format(a, b))
        # i is the index into the ndiff output, not into a or b
        for i, s in enumerate(difflib.ndiff(a, b)):
            if s[0] == ' ':
                continue
            elif s[0] == '-':
                print('Delete "{}" from position {}'.format(s[-1], i))
            elif s[0] == '+':
                print('Add "{}" to position {}'.format(s[-1], i))
        print()

prints:

    afrykanerskojezyczny => afrykanerskojezycznym
    Add "m" to position 20

    afrykanerskojezyczni => nieafrykanerskojezyczni
    Add "n" to position 0
    Add "i" to position 1
    Add "e" to position 2

    afrykanerskojezycznym => afrykanerskojezyczny
    Delete "m" from position 20

    nieafrykanerskojezyczni => afrykanerskojezyczni
    Delete "n" from position 0
    Delete "i" from position 1
    Delete "e" from position 2

    nieafrynerskojezyczni => afrykanerskojzyczni
    Delete "n" from position 0
    Delete "i" from position 1
    Delete "e" from position 2
    Add "k" to position 7
    Add "a" to position 8
    Delete "e" from position 16

    abcdefg => xac
    Add "x" to position 0
    Delete "b" from position 2
    Delete "d" from position 4
    Delete "e" from position 5
    Delete "f" from position 6
    Delete "g" from position 7
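To restore the second string from the first one plus a small diff, one option is to store only the non-equal operations reported by `difflib.SequenceMatcher.get_opcodes()`. The sketch below is one possible approach, not the answer's own method; the helper names `make_diff` and `apply_diff` and the `(start, end, replacement)` tuple layout are assumptions made for illustration:

```python
import difflib


def make_diff(a, b):
    """Return a list of (start, end, replacement) edits that turn a into b."""
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [(i1, i2, b[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != 'equal']  # unchanged runs are implied by the base string


def apply_diff(a, diff):
    """Rebuild the second string from the first string and its diff."""
    pieces = []
    pos = 0
    for start, end, replacement in diff:
        pieces.append(a[pos:start])   # copy the unchanged run before the edit
        pieces.append(replacement)    # insert the new text (may be empty)
        pos = end                     # skip the replaced or deleted slice of a
    pieces.append(a[pos:])            # copy whatever follows the last edit
    return ''.join(pieces)


base = 'afrykanerskojezyczny'
for word in ['afrykanerskojezycznym', 'afrykanerskojezyczni',
             'nieafrykanerskojezyczni']:
    diff = make_diff(base, word)
    print(word, diff, apply_diff(base, diff) == word)
```

Indices refer to positions in the first string, so the edits have to be applied in order, left to right, as `apply_diff` does.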

User · Answer

What you are asking for is a specialized form of compression. xdelta3 was designed for this particular kind of compression, and there's a Python binding for it, but you could probably get away with using zlib directly. You'd want to use zlib.compressobj and zlib.decompressobj with the zdict parameter set to your "base word", e.g. afrykanerskojezyczny. Caveats: zdict is only supported in Python 3.3 and higher, and it's easiest to code if you have the same "base word" for all your diffs, which may or may not be what you want.
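A minimal sketch of the zlib approach, assuming a single shared base word and UTF-8 encoding. The helper names `compress_with_base` and `decompress_with_base` are hypothetical; only `zlib.compressobj`, `zlib.decompressobj` and their `zdict` parameter come from the standard library:

```python
import zlib

BASE_WORD = 'afrykanerskojezyczny'  # assumed shared base word for all diffs


def compress_with_base(word, base=BASE_WORD):
    """Compress word using base as a preset dictionary; returns bytes."""
    comp = zlib.compressobj(level=9, zdict=base.encode('utf-8'))
    return comp.compress(word.encode('utf-8')) + comp.flush()


def decompress_with_base(data, base=BASE_WORD):
    """Recover the word; the same base must be supplied as the dictionary."""
    decomp = zlib.decompressobj(zdict=base.encode('utf-8'))
    return (decomp.decompress(data) + decomp.flush()).decode('utf-8')


for word in ['afrykanerskojezycznym', 'afrykanerskojezyczni',
             'nieafrykanerskojezyczni']:
    packed = compress_with_base(word)
    print(word, len(packed), decompress_with_base(packed) == word)
```

The zlib container adds a fixed header and checksum to every result, so for words this short it is worth measuring the packed sizes against the plain strings before committing to this approach.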

User · Answer

The answer to my comment above on the original question makes me think this is all he wants:

    loopnum = 0
    word = 'afrykanerskojezyczny'
    wordlist = ['afrykanerskojezycznym', 'afrykanerskojezyczni',
                'nieafrykanerskojezyczni']

    for i in wordlist:
        # Overwrite the entry at the current index with the base word
        wordlist[loopnum] = word
        loopnum += 1

This will do the following: for every value in wordlist, set that value of the wordlist to the original word.

All you have to do is put this piece of code where you need to change wordlist, making sure you store the words you need to change in wordlist, and that the original word is correct.

Hope this helps.
