I'm starting a new application and looking at using an ORM -- in particular, SQLAlchemy.
Say I've got a column 'foo' in my database and I want to increment it. In straight sqlite, this is easy:
db = sqlite3.connect('mydata.sqlitedb')
cur = db.cursor()
cur.execute('update table stuff set foo = foo + 1')
I figured out the SQLAlchemy SQL-builder equivalent:
engine = sqlalchemy.create_engine('sqlite:///mydata.sqlitedb')
md = sqlalchemy.MetaData(engine)
table = sqlalchemy.Table('stuff', md, autoload=True)
upd = table.update(values={table.c.foo:table.c.foo+1})
engine.execute(upd)
This is slightly slower, but there's not much in it.
Here's my best guess for a SQLAlchemy ORM approach:
# snip definition of Stuff class made using declarative_base
# snip creation of session object
for c in session.query(Stuff):
c.foo = c.foo + 1
session.flush()
session.commit()
This does the right thing, but it takes just under fifty times as long as the other two approaches. I presume that's because it has to bring all the data into memory before it can work with it.
Is there any way to generate the efficient SQL using SQLAlchemy's ORM? Or using any other python ORM? Or should I just go back to writing the SQL by hand?
This question is related to
python
orm
sqlalchemy
If it is because of the overhead in terms of creating objects, then it probably can't be sped up at all with SA.
If it is because it is loading up related objects, then you might be able to do something with lazy loading. Are there lots of objects being created due to references? (IE, getting a Company object also gets all of the related People objects).
session.query(Clients).filter(Clients.id == client_id_list).update({'status': status})
session.commit()
Try this =)
Withough testing, I'd try:
for c in session.query(Stuff).all():
c.foo = c.foo+1
session.commit()
(IIRC, commit() works without flush()).
I've found that at times doing a large query and then iterating in python can be up to 2 orders of magnitude faster than lots of queries. I assume that iterating over the query object is less efficient than iterating over a list generated by the all() method of the query object.
[Please note comment below - this did not speed things up at all].
If it is because of the overhead in terms of creating objects, then it probably can't be sped up at all with SA.
If it is because it is loading up related objects, then you might be able to do something with lazy loading. Are there lots of objects being created due to references? (IE, getting a Company object also gets all of the related People objects).
Here's an example of how to solve the same problem without having to map the fields manually:
from sqlalchemy import Column, ForeignKey, Integer, String, Date, DateTime, text, create_engine
from sqlalchemy.exc import IntegrityError
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
from sqlalchemy.orm.attributes import InstrumentedAttribute
engine = create_engine('postgres://postgres@localhost:5432/database')
session = sessionmaker()
session.configure(bind=engine)
Base = declarative_base()
class Media(Base):
__tablename__ = 'media'
id = Column(Integer, primary_key=True)
title = Column(String, nullable=False)
slug = Column(String, nullable=False)
type = Column(String, nullable=False)
def update(self):
s = session()
mapped_values = {}
for item in Media.__dict__.iteritems():
field_name = item[0]
field_type = item[1]
is_column = isinstance(field_type, InstrumentedAttribute)
if is_column:
mapped_values[field_name] = getattr(self, field_name)
s.query(Media).filter(Media.id == self.id).update(mapped_values)
s.commit()
So to update a Media instance, you can do something like this:
media = Media(id=123, title="Titular Line", slug="titular-line", type="movie")
media.update()
If it is because of the overhead in terms of creating objects, then it probably can't be sped up at all with SA.
If it is because it is loading up related objects, then you might be able to do something with lazy loading. Are there lots of objects being created due to references? (IE, getting a Company object also gets all of the related People objects).
Here's an example of how to solve the same problem without having to map the fields manually:
from sqlalchemy import Column, ForeignKey, Integer, String, Date, DateTime, text, create_engine
from sqlalchemy.exc import IntegrityError
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
from sqlalchemy.orm.attributes import InstrumentedAttribute
engine = create_engine('postgres://postgres@localhost:5432/database')
session = sessionmaker()
session.configure(bind=engine)
Base = declarative_base()
class Media(Base):
__tablename__ = 'media'
id = Column(Integer, primary_key=True)
title = Column(String, nullable=False)
slug = Column(String, nullable=False)
type = Column(String, nullable=False)
def update(self):
s = session()
mapped_values = {}
for item in Media.__dict__.iteritems():
field_name = item[0]
field_type = item[1]
is_column = isinstance(field_type, InstrumentedAttribute)
if is_column:
mapped_values[field_name] = getattr(self, field_name)
s.query(Media).filter(Media.id == self.id).update(mapped_values)
s.commit()
So to update a Media instance, you can do something like this:
media = Media(id=123, title="Titular Line", slug="titular-line", type="movie")
media.update()
If it is because of the overhead in terms of creating objects, then it probably can't be sped up at all with SA.
If it is because it is loading up related objects, then you might be able to do something with lazy loading. Are there lots of objects being created due to references? (IE, getting a Company object also gets all of the related People objects).
There are several ways to UPDATE using sqlalchemy
1) for c in session.query(Stuff).all():
c.foo += 1
session.commit()
2) session.query().\
update({"foo": (Stuff.foo + 1)})
session.commit()
3) conn = engine.connect()
stmt = Stuff.update().\
values(Stuff.foo = (Stuff.foo + 1))
conn.execute(stmt)
session.query(Clients).filter(Clients.id == client_id_list).update({'status': status})
session.commit()
Try this =)
Withough testing, I'd try:
for c in session.query(Stuff).all():
c.foo = c.foo+1
session.commit()
(IIRC, commit() works without flush()).
I've found that at times doing a large query and then iterating in python can be up to 2 orders of magnitude faster than lots of queries. I assume that iterating over the query object is less efficient than iterating over a list generated by the all() method of the query object.
[Please note comment below - this did not speed things up at all].
There are several ways to UPDATE using sqlalchemy
1) for c in session.query(Stuff).all():
c.foo += 1
session.commit()
2) session.query().\
update({"foo": (Stuff.foo + 1)})
session.commit()
3) conn = engine.connect()
stmt = Stuff.update().\
values(Stuff.foo = (Stuff.foo + 1))
conn.execute(stmt)
Source: Stackoverflow.com