Regarding Fred the Fantastic's answer:
Not every JPEG marker in the C0-CF range is an SOF marker; I excluded DHT (C4), JPG (C8, reserved for JPEG extensions) and DAC (CC). Note that I haven't looked into whether frames other than C0 and C2 can even be parsed in this manner. However, the other ones seem to be fairly rare (I personally haven't encountered any besides C0 and C2).
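The marker check described above can be written as a small predicate. This is only a sketch: the names NON_SOF_MARKERS and is_sof_marker are made up for illustration and are not part of the revised code.

```python
# Marker bytes in the C0-CF range that do not start a frame.
NON_SOF_MARKERS = (0xC4, 0xC8, 0xCC)  # DHT, C8 (reserved), DAC


def is_sof_marker(marker):
    """Return True if the byte following 0xFF denotes an SOFn marker.

    Hypothetical helper, for illustration only.
    """
    return 0xC0 <= marker <= 0xCF and marker not in NON_SOF_MARKERS


# The thirteen marker bytes that remain after the exclusions.
sof_markers = [hex(m) for m in range(0xC0, 0xD0) if is_sof_marker(m)]
```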
Either way, this solves the problem with Bangles.jpg that Malandy mentioned in the comments (DHT erroneously parsed as SOF).
The other problem mentioned, with 1431588037-WgsI3vK.jpg, is due to imghdr only being able to detect the APP0 (JFIF) and APP1 (Exif) headers.
This can be fixed by adding a more lax test to imghdr (e.g. simply FFD8, or maybe FFD8FF?) or something much more complex (possibly even data validation). Taking the more complex approach, I've only found issues with three cases: APP14 (FFEE, Adobe); DQT (FFDB) as the first marker; and APP2 with an embedded ICC_PROFILE.
Revised code below; I also altered the call to imghdr.what() slightly:

    import struct
    import imghdr


    def test_jpeg(h, f):
        """Recognise JPEG headers that imghdr's built-in test misses."""
        # SOI APP2 + ICC_PROFILE
        if h[0:4] == b'\xff\xd8\xff\xe2' and h[6:17] == b'ICC_PROFILE':
            return 'jpeg'
        # SOI APP14 + Adobe
        if h[0:4] == b'\xff\xd8\xff\xee' and h[6:11] == b'Adobe':
            return 'jpeg'
        # SOI DQT
        if h[0:4] == b'\xff\xd8\xff\xdb':
            return 'jpeg'

    imghdr.tests.append(test_jpeg)


    def get_image_size(fname):
        '''Determine the image type of fhandle and return its size.
        from draco'''
        with open(fname, 'rb') as fhandle:
            head = fhandle.read(24)
            if len(head) != 24:
                return
            # Pass the bytes already read instead of the file name.
            what = imghdr.what(None, head)
            if what == 'png':
                check = struct.unpack('>i', head[4:8])[0]
                if check != 0x0d0a1a0a:
                    return
                width, height = struct.unpack('>ii', head[16:24])
            elif what == 'gif':
                width, height = struct.unpack('<HH', head[6:10])
            elif what == 'jpeg':
                try:
                    fhandle.seek(0)  # Read 0xff next
                    size = 2
                    ftype = 0
                    # Walk the segments until an SOFn marker is found;
                    # C4, C8 and CC lie in the C0-CF range but are not SOF.
                    while not 0xc0 <= ftype <= 0xcf or ftype in (0xc4, 0xc8, 0xcc):
                        fhandle.seek(size, 1)
                        byte = fhandle.read(1)
                        while ord(byte) == 0xff:
                            byte = fhandle.read(1)
                        ftype = ord(byte)
                        size = struct.unpack('>H', fhandle.read(2))[0] - 2
                    # We are at a SOFn block
                    fhandle.seek(1, 1)  # Skip `precision' byte.
                    height, width = struct.unpack('>HH', fhandle.read(4))
                except Exception:  # IGNORE:W0703
                    return
            else:
                return
            return width, height
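
To see why the DHT exclusion matters, the same marker walk can be tried on a synthetic byte stream in which a DHT segment precedes SOF0. This is a sketch rather than a drop-in replacement: jpeg_size and the hand-built segments are made up for illustration, the DHT payload is dummy data, and there is no error handling.

```python
import io
import struct


def jpeg_size(stream):
    """Return (width, height) from the first SOFn segment of a JPEG stream.

    Hypothetical helper, for illustration only.
    """
    stream.seek(2)  # skip SOI (FFD8)
    while True:
        byte = stream.read(1)
        while byte == b'\xff':  # skip the 0xFF prefix and any fill bytes
            byte = stream.read(1)
        marker = ord(byte)
        length = struct.unpack('>H', stream.read(2))[0]
        if 0xC0 <= marker <= 0xCF and marker not in (0xC4, 0xC8, 0xCC):
            stream.seek(1, 1)  # skip the precision byte
            height, width = struct.unpack('>HH', stream.read(4))
            return width, height
        stream.seek(length - 2, 1)  # jump over the segment payload


soi = b'\xff\xd8'
# DHT segment: length field (6) followed by 4 dummy payload bytes.
dht = b'\xff\xc4' + struct.pack('>H', 6) + b'\x00' * 4
# SOF0 segment: length, precision, height, width, one component.
sof0 = b'\xff\xc0' + struct.pack('>HBHHB', 11, 8, 480, 640, 1) + b'\x01\x11\x00'

size = jpeg_size(io.BytesIO(soi + dht + sof0))  # (640, 480)
```

Without the exclusion tuple, the loop would stop at the C4 segment and read the dummy payload as the dimensions.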
Note: I posted this as a full answer instead of a comment, since I'm not yet allowed to comment.