In Python, one can use statsmodels.iolib.foreign.genfromdta to read Stata datasets. There is also a wrapper around this function, statsmodels.datasets.webuse, which reads a Stata file directly from the web.
However, both of the above rely on pandas.io.stata.StataReader.data, which is a legacy function and has been deprecated. The newer pandas.read_stata function should be used instead.
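A minimal sketch of the pandas.read_stata route follows. So that it runs on its own, it first writes a tiny dataset with DataFrame.to_stata; the file name example.dta and the column names are made up for illustration, and in practice the .dta file would already exist.

```python
import os
import tempfile

import pandas as pd

# Build a tiny example dataset so the sketch is self-contained
# (hypothetical data; a real .dta file would normally already exist).
original = pd.DataFrame({"id": [1, 2, 3], "score": [2.5, 3.0, 4.5]})

with tempfile.TemporaryDirectory() as tmpdir:
    dta_path = os.path.join(tmpdir, "example.dta")  # hypothetical file name
    original.to_stata(dta_path, write_index=False)

    # read_stata loads the whole file and returns a regular DataFrame.
    df = pd.read_stata(dta_path)

print(df)
```

Integer columns may come back with a smaller integer dtype than they were written with, since the writer stores them in the narrowest Stata type that fits; the values themselves are unchanged.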
According to the pandas source file stata.py, as of version 0.23.0, the following are supported:
As others have noted, the pandas.DataFrame.to_csv method can then be used to save the data to disk. A related function, numpy.savetxt, can also save the data as a text file.
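The two export options might look like the sketch below. The DataFrame here is a stand-in for one returned by pandas.read_stata, and the file names are hypothetical. Note that numpy.savetxt works on a plain numeric array, so the column names have to be passed separately as a header.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Stand-in for a DataFrame obtained from pd.read_stata(...).
df = pd.DataFrame({"id": [1, 2, 3], "score": [2.5, 3.0, 4.5]})

with tempfile.TemporaryDirectory() as tmpdir:
    csv_path = os.path.join(tmpdir, "example.csv")  # hypothetical file name
    txt_path = os.path.join(tmpdir, "example.txt")  # hypothetical file name

    # DataFrame.to_csv writes the column names as a header row.
    df.to_csv(csv_path, index=False)

    # numpy.savetxt only sees the numeric array, so supply the header
    # by hand; comments="" stops NumPy from prefixing it with "# ".
    np.savetxt(
        txt_path,
        df.to_numpy(),
        delimiter=",",
        header=",".join(df.columns),
        comments="",
    )

    with open(csv_path) as fh:
        csv_text = fh.read()
    with open(txt_path) as fh:
        txt_text = fh.read()

print(csv_text)
print(txt_text)
```

For mixed-type data (strings, dates, categoricals), to_csv is the safer choice, since savetxt expects a homogeneous array.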
EDIT:
The following details come from help dtaversion in Stata 15.1:
Stata version    .dta file format
----------------------------------------
1                102
2, 3             103
4                104
5                105
6                108
7                110 and 111
8, 9             112 and 113
10, 11           114
12               115
13               117
14 and 15        118 (# of variables <= 32,767)
15               119 (# of variables > 32,767, Stata/MP only)
----------------------------------------
file formats 103, 106, 107, 109, and 116
were never used in any official release.
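To find out which row of this table applies to a given file, the format number can be read from the start of the file. The sketch below is not part of pandas or Stata; dta_format_version is a hypothetical helper, and it assumes the layout where formats 117 and later begin with the text <stata_dta><header><release> followed by a three-digit number, while older formats store the number in the first byte.

```python
# Text that opens a .dta file in formats 117 and later (assumed layout).
NEW_STYLE_PREFIX = b"<stata_dta><header><release>"


def dta_format_version(header: bytes) -> int:
    """Return the .dta format number from the first bytes of a file.

    Hypothetical helper: pass it at least the first 31 bytes of the file.
    """
    if not header:
        raise ValueError("empty header")
    if header.startswith(NEW_STYLE_PREFIX):
        # New-style files spell the release out as three ASCII digits.
        start = len(NEW_STYLE_PREFIX)
        return int(header[start:start + 3].decode("ascii"))
    # Old-style files store the release as a single leading byte.
    return header[0]


# Hand-made headers standing in for real files.
old_style = bytes([114]) + b"\x02\x01\x00"
new_style = b"<stata_dta><header><release>118</release>"

print(dta_format_version(old_style))
print(dta_format_version(new_style))
```

With a real file, the header would come from something like open(path, "rb").read(64).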