I know this is an old issue but I just had to do this for a project at work, and I am very surprised that nobody has thought of this solution yet: Just open the .pdf with Microsoft word.
The code is a lot easier to work with when you are trying to extract data from a .docx because it opens in Microsoft Word. Excel and Word play well together because they are both Microsoft programs. In my case, the file of question had to be a .pdf file. Here's the solution I came up with:
Yes you could just convert the .pdf file to a .docx file but this is a much simpler solution in my opinion.