[python] How to solve the memory error in Python

Assuming your example text is representative of all the text, one line would consume about 75 bytes on my machine:

In [3]: sys.getsizeof('usedfor zipper fasten_coat')
Out[3]: 75

Doing some rough math:

75 bytes * 8,000,000 lines / 1024 / 1024 = ~572 MB

So roughly 572 meg to store the strings alone for one of these files. Once you start adding in additional, similarly structured and sized files, you'll quickly approach your virtual address space limits, as mentioned in @ShadowRanger's answer.

If upgrading your python isn't feasible for you, or if it only kicks the can down the road (you have finite physical memory after all), you really have two options: write your results to temporary files in-between loading in and reading the input files, or write your results to a database. Since you need to further post-process the strings after aggregating them, writing to a database would be the superior approach.