I have two strings like
string1="abc def ghi"
and
string2="def ghi abc"
How to get that this two string are same without breaking the words?
This question is related to
python
string
python-2.7
string-comparison
I think difflib is a good library to do this job
>>>import difflib
>>> diff = difflib.Differ()
>>> a='he is going home'
>>> b='he is goes home'
>>> list(diff.compare(a,b))
[' h', ' e', ' ', ' i', ' s', ' ', ' g', ' o', '+ e', '+ s', '- i', '- n', '- g', ' ', ' h', ' o', ' m', ' e']
>>> list(diff.compare(a.split(),b.split()))
[' he', ' is', '- going', '+ goes', ' home']
If you want to know if both the strings are equal, you can simply do
print string1 == string2
But if you want to know if they both have the same set of characters and they occur same number of times, you can use collections.Counter
, like this
>>> string1, string2 = "abc def ghi", "def ghi abc"
>>> from collections import Counter
>>> Counter(string1) == Counter(string2)
True
If you want a really simple answer:
s_1 = "abc def ghi"
s_2 = "def ghi abc"
flag = 0
for i in s_1:
if i not in s_2:
flag = 1
if flag == 0:
print("a == b")
else:
print("a != b")
Something like this:
if string1 == string2:
print 'they are the same'
update: if you want to see if each sub-string may exist in the other:
elem1 = [x for x in string1.split()]
elem2 = [x for x in string2.split()]
for item in elem1:
if item in elem2:
print item
You can use simple loops to check two strings are equal. .But ideally you can use something like return s1==s2
s1 = 'hello'
s2 = 'hello'
a = []
for ele in s1:
a.append(ele)
for i in range(len(s2)):
if a[i]==s2[i]:
a.pop()
if len(a)>0:
return False
else:
return True
This is a pretty basic example, but after the logical comparisons (==) or string1.lower() == string2.lower()
, maybe can be useful to try some of the basic metrics of distances between two strings.
You can find examples everywhere related to these or some other metrics, try also the fuzzywuzzy package (https://github.com/seatgeek/fuzzywuzzy).
import Levenshtein
import difflib
print(Levenshtein.ratio('String1', 'String2'))
print(difflib.SequenceMatcher(None, 'String1', 'String2').ratio())
I am going to provide several solutions and you can choose the one that meets your needs:
1) If you are concerned with just the characters, i.e, same characters and having equal frequencies of each in both the strings, then use:
''.join(sorted(string1)).strip() == ''.join(sorted(string2)).strip()
2) If you are also concerned with the number of spaces (white space characters) in both strings, then simply use the following snippet:
sorted(string1) == sorted(string2)
3) If you are considering words but not their ordering and checking if both the strings have equal frequencies of words, regardless of their order/occurrence, then can use:
sorted(string1.split()) == sorted(string2.split())
4) Extending the above, if you are not concerned with the frequency count, but just need to make sure that both the strings contain the same set of words, then you can use the following:
set(string1.split()) == set(string2.split())
If you just need to check if the two strings are exactly same,
text1 = 'apple'
text2 = 'apple'
text1 == text2
The result will be
True
If you need the matching percentage,
import difflib
text1 = 'Since 1958.'
text2 = 'Since 1958'
output = str(int(difflib.SequenceMatcher(None, text1, text2).ratio()*100))
Matching percentage output will be,
'95'
>>> s1="abc def ghi"
>>> s2="def ghi abc"
>>> s1 == s2 # For string comparison
False
>>> sorted(list(s1)) == sorted(list(s2)) # For comparing if they have same characters.
True
>>> sorted(list(s1))
[' ', ' ', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']
>>> sorted(list(s2))
[' ', ' ', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']
open both of the files then compare them by splitting its word contents;
log_file_A='file_A.txt'
log_file_B='file_B.txt'
read_A=open(log_file_A,'r')
read_A=read_A.read()
print read_A
read_B=open(log_file_B,'r')
read_B=read_B.read()
print read_B
File_A_set = set(read_A.split(' '))
File_A_set = set(read_B.split(' '))
print File_A_set == File_B_set
Try to covert both strings to upper or lower case. Then you can use ==
comparison operator.
For that, you can use default difflib in python
from difflib import SequenceMatcher
def similar(a, b):
return SequenceMatcher(None, a, b).ratio()
then call similar() as
similar(string1, string2)
it will return compare as ,ratio >= threshold to get match result
Seems question is not about strings equality, but of sets equality. You can compare them this way only by splitting strings and converting them to sets:
s1 = 'abc def ghi'
s2 = 'def ghi abc'
set1 = set(s1.split(' '))
set2 = set(s2.split(' '))
print set1 == set2
Result will be
True
Equality in direct comparing:
string1 = "sample"
string2 = "sample"
if string1 == string2 :
print("Strings are equal with text : ", string1," & " ,string2)
else :
print ("Strings are not equal")
Equality in character sets:
string1 = 'abc def ghi'
string2 = 'def ghi abc'
set1 = set(string1.split(' '))
set2 = set(string2.split(' '))
print set1 == set2
if string1 == string2 :
print("Strings are equal with text : ", string1," & " ,string2)
else :
print ("Strings are not equal")
Source: Stackoverflow.com