One can use the textract library. It take care of both "doc" as well as "docx"
import textract
text = textract.process("path/to/file.extension")
You can even use 'antiword' (sudo apt-get install antiword) and then convert doc to first into docx and then read through docx2txt.
antiword filename.doc > filename.docx
Ultimately, textract in the backend is using antiword.