Yahoo is the simplest option for getting preliminary free data. The link described in eckesicle's answer can easily be used from Python code, but you first need all the tickers. I'll use the NYSE for this example, but the same approach works for other exchanges as well.
I used this wiki page to collect all company tickers with the following script (I'm not a very experienced Python programmer, so apologies if the code isn't very efficient):
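    import string
    import urllib.request
    from bs4 import BeautifulSoup

    url_part1 = 'http://en.wikipedia.org/wiki/Companies_listed_on_the_New_York_Stock_Exchange_'

    def download_page(url, out_file):
        """Write the ticker from each row of the page's second table to out_file."""
        print(url)
        with urllib.request.urlopen(url) as page:
            soup = BeautifulSoup(page.read(), 'html.parser')
        # The listing is the second table on the page.
        for row in soup('table')[1]('tr'):
            tds = row('td')
            # Rows without enough <td> cells (e.g. the header row) are skipped.
            if len(tds) > 1:
                # The ticker symbol is in the second column.
                out_file.write(tds[1].get_text(strip=True) + '\n')

    with open('stock_names.txt', 'w') as f:
        # One page for names starting with a digit, then one page per letter.
        for suffix in ['0-9'] + list(string.ascii_uppercase):
            download_page(url_part1 + '(' + suffix + ')', f)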
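If you want to check which pages the script will visit before hitting the network, the URL construction can be pulled out into its own function. This is only a sketch: `index_page_urls` is a name I made up for illustration, and it assumes the same page naming scheme as above, i.e. one `(0-9)` page followed by one page per letter.

```python
import string

# Same prefix as in the ticker script.
URL_PREFIX = 'http://en.wikipedia.org/wiki/Companies_listed_on_the_New_York_Stock_Exchange_'

def index_page_urls(prefix=URL_PREFIX):
    """Return the index page URLs: '(0-9)' followed by '(A)' through '(Z)'."""
    suffixes = ['0-9'] + list(string.ascii_uppercase)
    return [prefix + '(' + suffix + ')' for suffix in suffixes]

# Print the list so it can be inspected by eye; nothing is downloaded here.
for url in index_page_urls():
    print(url)
```

Keeping this step separate also makes it easy to swap in a different prefix for another exchange, provided its pages follow the same pattern.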
To download the price history for each ticker I used another, quite similar script:
    import urllib.error
    import urllib.request

    url_part1 = 'http://ichart.finance.yahoo.com/table.csv?s='
    url_part2 = '&d=0&e=28&f=2010&g=d&a=3&b=12&c=1996&ignore=.csv'
    output_dir = 'C:\\Users\\Nitay\\Desktop\\Historical Data\\'

    print("Starting")
    with open('stock_names.txt', 'r') as f:
        # One ticker per line; ignore blank lines.
        tickers = [line.strip() for line in f if line.strip()]
    print("About %d tickers will be downloaded" % len(tickers))

    count = 1
    for ticker in tickers:
        url = url_part1 + ticker + url_part2
        try:
            # urlopen raises HTTPError (a subclass of URLError) on a 404
            response = urllib.request.urlopen(url)
        except urllib.error.URLError:
            # No data for this ticker (or the request failed): skip it
            continue
        print("Downloading ticker %s (%d out of %d)" % (ticker, count, len(tickers)))
        count = count + 1
        # The response body is bytes, so the file is opened in binary mode
        with open(output_dir + ticker + '.csv', 'wb') as history_file:
            history_file.write(response.read())
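The date range in the script is baked into the query string. As far as I understand the format, `a`, `b`, `c` are the start month (zero-based), day and year, `d`, `e`, `f` are the same for the end date, and `g=d` asks for daily rows; treat that reading as an assumption rather than documented behaviour. Under that assumption, a small helper (the name `history_url` is mine, and `IBM` is just an example ticker) can build the URL from ordinary dates:

```python
from datetime import date

BASE_URL = 'http://ichart.finance.yahoo.com/table.csv?s='

def history_url(ticker, start, end):
    """Build the download URL for daily data between two dates.

    Assumes months are zero-based in the query string (January = 0).
    """
    return (BASE_URL + ticker
            + '&d=%d&e=%d&f=%d' % (end.month - 1, end.day, end.year)
            + '&g=d'
            + '&a=%d&b=%d&c=%d' % (start.month - 1, start.day, start.year)
            + '&ignore=.csv')

# Same range as the hard-coded query string: 12 April 1996 to 28 January 2010.
print(history_url('IBM', date(1996, 4, 12), date(2010, 1, 28)))
```

With these two dates the helper reproduces the hard-coded `url_part1 + ticker + url_part2` string, so it can replace that concatenation if you want to experiment with other ranges.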
Note that the major downside of this method is that the available data differs between companies: companies with no data for the requested dates (newly listed ones, for example) will return a 404 page.
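Since a 404 is the expected outcome for some tickers, it may be worth recording which ones were skipped instead of silently ignoring them. The following is a minimal sketch, not the script I actually ran: `fetch_history` and `download_all` are hypothetical names, and `make_url` stands for whatever function builds the download URL for a ticker.

```python
import urllib.error
import urllib.request

def fetch_history(url):
    """Download one CSV file; urlopen raises HTTPError on a 404."""
    with urllib.request.urlopen(url) as response:
        return response.read()

def download_all(tickers, make_url, fetch=fetch_history):
    """Return (histories, missing).

    histories maps each ticker to its CSV bytes; missing lists the
    tickers whose request came back as a 404.
    """
    histories = {}
    missing = []
    for ticker in tickers:
        try:
            histories[ticker] = fetch(make_url(ticker))
        except urllib.error.HTTPError as err:
            if err.code != 404:
                raise  # anything other than "no data" is a real problem
            missing.append(ticker)
    return histories, missing
```

The `fetch` parameter is there so the loop can be tried with a stand-in function before pointing it at the real server; afterwards `missing` tells you which companies need a different date range.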
Also keep in mind that this method is only good for preliminary data. If you really want to test your algorithm, you should pay a bit and use a trusted data supplier such as CSIData or others.