Kristof Beyls
2015-May-10  18:21 UTC
[LLVMdev] http://llvm.org/perf/ instability: some clues
Daniel, Tobias, Renato and I have been looking into the potential
underlying reason why http://llvm.org/perf/ is unstable, and have found
some clues. I want to share them here to give people with more experience
in the frameworks used by LNT (flask, sqlalchemy, wsgi, ...) a chance to
check whether our reasoning below seems plausible.
 
Daniel noticed the following backtrace in the log after
http://llvm.org/perf started giving "Internal Server Error"
again:
2015-05-08 22:57:05,309 ERROR: Exception on /db_default/v4/nts/287/graph [GET] [in /opt/venv/perf/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py:1423]
Traceback (most recent call last):
  File "/opt/venv/perf/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1817, in wsgi_app
    response = self.full_dispatch_request()
  File "/opt/venv/perf/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1477, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/opt/venv/perf/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1381, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/opt/venv/perf/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1475, in full_dispatch_request
    rv = self.dispatch_request()
  File "/opt/venv/perf/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1461, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/opt/venv/perf/lib/python2.7/site-packages/LNT-0.4.1dev-py2.7.egg/lnt/server/ui/decorators.py", line 67, in wrap
    result = f(**args)
  File "/opt/venv/perf/lib/python2.7/site-packages/LNT-0.4.1dev-py2.7.egg/lnt/server/ui/views.py", line 385, in v4_run_graph
    ts = request.get_testsuite()
  File "/opt/venv/perf/lib/python2.7/site-packages/LNT-0.4.1dev-py2.7.egg/lnt/server/ui/app.py", line 76, in get_testsuite
    testsuites = self.get_db().testsuite
  File "/opt/venv/perf/lib/python2.7/site-packages/LNT-0.4.1dev-py2.7.egg/lnt/server/ui/app.py", line 55, in get_db
    self.db = current_app.old_config.get_database(g.db_name, echo=echo)
  File "/opt/venv/perf/lib/python2.7/site-packages/LNT-0.4.1dev-py2.7.egg/lnt/server/config.py", line 148, in get_database
    return lnt.server.db.v4db.V4DB(db_entry.path, self, echo=echo)
  File "/opt/venv/perf/lib/python2.7/site-packages/LNT-0.4.1dev-py2.7.egg/lnt/server/db/v4db.py", line 108, in __init__
    .filter_by(id = lnt.testing.PASS).first()
  File "/opt/venv/perf/lib/python2.7/site-packages/SQLAlchemy-0.9.6-py2.7.egg/sqlalchemy/orm/query.py", line 2334, in first
    ret = list(self[0:1])
  File "/opt/venv/perf/lib/python2.7/site-packages/SQLAlchemy-0.9.6-py2.7.egg/sqlalchemy/orm/query.py", line 2201, in __getitem__
    return list(res)
  File "/opt/venv/perf/lib/python2.7/site-packages/SQLAlchemy-0.9.6-py2.7.egg/sqlalchemy/orm/query.py", line 2405, in __iter__
    return self._execute_and_instances(context)
  File "/opt/venv/perf/lib/python2.7/site-packages/SQLAlchemy-0.9.6-py2.7.egg/sqlalchemy/orm/query.py", line 2418, in _execute_and_instances
    close_with_result=True)
  File "/opt/venv/perf/lib/python2.7/site-packages/SQLAlchemy-0.9.6-py2.7.egg/sqlalchemy/orm/query.py", line 2409, in _connection_from_session
    **kw)
  File "/opt/venv/perf/lib/python2.7/site-packages/SQLAlchemy-0.9.6-py2.7.egg/sqlalchemy/orm/session.py", line 846, in connection
    close_with_result=close_with_result)
  File "/opt/venv/perf/lib/python2.7/site-packages/SQLAlchemy-0.9.6-py2.7.egg/sqlalchemy/orm/session.py", line 850, in _connection_for_bind
    return self.transaction._connection_for_bind(engine)
  File "/opt/venv/perf/lib/python2.7/site-packages/SQLAlchemy-0.9.6-py2.7.egg/sqlalchemy/orm/session.py", line 315, in _connection_for_bind
    conn = bind.contextual_connect()
  File "/opt/venv/perf/lib/python2.7/site-packages/SQLAlchemy-0.9.6-py2.7.egg/sqlalchemy/engine/base.py", line 1737, in contextual_connect
    self.pool.connect(),
  File "/opt/venv/perf/lib/python2.7/site-packages/SQLAlchemy-0.9.6-py2.7.egg/sqlalchemy/pool.py", line 332, in connect
    return _ConnectionFairy._checkout(self)
  File "/opt/venv/perf/lib/python2.7/site-packages/SQLAlchemy-0.9.6-py2.7.egg/sqlalchemy/pool.py", line 630, in _checkout
    fairy = _ConnectionRecord.checkout(pool)
  File "/opt/venv/perf/lib/python2.7/site-packages/SQLAlchemy-0.9.6-py2.7.egg/sqlalchemy/pool.py", line 433, in checkout
    rec = pool._do_get()
  File "/opt/venv/perf/lib/python2.7/site-packages/SQLAlchemy-0.9.6-py2.7.egg/sqlalchemy/pool.py", line 945, in _do_get
    (self.size(), self.overflow(), self._timeout))
TimeoutError: QueuePool limit of size 5 overflow 10 reached, connection timed out, timeout 30
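 
For readers unfamiliar with the final error: it reports that the pool allows
5 pooled connections plus 10 overflow connections, and that a caller waited
30 seconds without obtaining one. The sketch below is a toy model of that
behaviour in plain Python. It is not SQLAlchemy's implementation; ToyPool and
its methods are hypothetical names, and the timeout is shortened so the
example runs quickly.

```python
import queue


class ToyPool:
    """Toy stand-in for a bounded connection pool (not SQLAlchemy's code)."""

    def __init__(self, size=5, max_overflow=10, timeout=30.0):
        self.timeout = timeout
        # Each token represents permission to hold one connection.
        self._tokens = queue.Queue()
        for _ in range(size + max_overflow):
            self._tokens.put(object())

    def checkout(self):
        """Wait up to `timeout` seconds for a free slot, else raise."""
        try:
            return self._tokens.get(timeout=self.timeout)
        except queue.Empty:
            raise TimeoutError(
                "pool exhausted, timed out after %ss" % self.timeout)

    def checkin(self, token):
        """Give a slot back so that another caller can use it."""
        self._tokens.put(token)


pool = ToyPool(size=5, max_overflow=10, timeout=0.05)
held = [pool.checkout() for _ in range(15)]  # all 15 slots are now in use

try:
    pool.checkout()                          # the 16th caller has to wait
    exhausted = False
except TimeoutError:
    exhausted = True

pool.checkin(held.pop())                     # one holder finishes
recovered = pool.checkout() is not None      # a checkout succeeds again
```

In this model the error only appears when all 15 slots are held at once and
none is returned within the timeout, whether because holders are slow or
because they never check their connection back in.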
 
After browsing through the sqlalchemy documentation and bits of the LNT
implementation,
it seems so far that the following pieces may be the key parts that cause
the problem
shown in the log.
 
The SQLAlchemy documentation seems to recommend using one sqlalchemy session
per web request. Looking at the following pieces of LNT, I got the
impression that a session is instead shared between many or all requests:
 
ui/app.py shows that Request.get_db() basically caches the result of
get_database from "config":
...
class Request(flask.Request):
...
    def get_db(self):
...
        if self.db is None:
            echo = bool(self.args.get('db_log') or self.form.get('db_log'))
            self.db = current_app.old_config.get_database(g.db_name, echo=echo)
...
        return self.db
 
config.py shows that get_database returns a V4DB object by calling a
constructor:
...
    def get_database(self, name, echo=False):
...
        # Instantiate the appropriate database version.
        if db_entry.db_version == '0.4':
            return lnt.server.db.v4db.V4DB(db_entry.path, self,
                                           db_entry.baseline_revision,
                                           echo)
...
 
This constructor is in db/v4db.py:
...
class V4DB(object):
...
    def __init__(self, path, config, baseline_revision=0, echo=False):
...
        self.session = sqlalchemy.orm.sessionmaker(self.engine)()
...
        # Add several shortcut aliases.
        self.add = self.session.add
        self.commit = self.session.commit
        self.query = self.session.query
        self.rollback = self.session.rollback
...
 
 
 
It seems like a single session object is created in this constructor, and
that it will ultimately be shared across all Requests. It seems that the
request.get_db method should instead create a new session for each request,
and close that session when the request is finalized, which probably needs
to be done by hooking into something Flask-specific.
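 
The "one session per request, closed when the request ends" pattern can be
sketched without any framework. This is only an illustration in plain
Python: FakeSession, request_scope and handle_request are hypothetical
names, not LNT, Flask or SQLAlchemy code. In a Flask application the close
step would typically be registered with a hook such as teardown_request.

```python
from contextlib import contextmanager


class FakeSession:
    """Hypothetical stand-in for a database session holding one connection."""

    open_count = 0  # number of sessions currently holding a connection

    def __init__(self):
        FakeSession.open_count += 1
        self.closed = False

    def close(self):
        if not self.closed:
            self.closed = True
            FakeSession.open_count -= 1


@contextmanager
def request_scope(session_factory):
    """Create a session for one request and always close it afterwards."""
    session = session_factory()
    try:
        yield session
    finally:
        # Runs on success and on error, so the connection is always returned.
        session.close()


def handle_request(fail=False):
    """Hypothetical view function that uses one session per call."""
    with request_scope(FakeSession) as session:
        if fail:
            raise RuntimeError("view raised")
        return session


ok_session = handle_request()
try:
    handle_request(fail=True)
except RuntimeError:
    pass
```

After both calls no session is left open, including the one whose view
raised an exception.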
 
The self.add and following lines in the constructor show that it will
probably be non-trivial to refactor the code so that there is no longer a
single session per v4db object.
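 
A small illustration of why such aliases complicate the refactoring: an
alias like self.add = self.session.add stores a bound method, which keeps
pointing at the session that existed when the constructor ran. The Session
and DB classes below are hypothetical, minimal stand-ins and not the LNT
classes.

```python
class Session:
    """Hypothetical minimal session; only records what was added to it."""

    def __init__(self, name):
        self.name = name
        self.added = []

    def add(self, obj):
        self.added.append(obj)


class DB:
    def __init__(self):
        self.session = Session("first")
        # Shortcut alias in the style of the V4DB constructor: this stores a
        # bound method, which keeps a reference to the first session.
        self.add = self.session.add


db = DB()
first = db.session
db.session = Session("second")  # swap in a fresh session, e.g. per request
db.add("row")                   # the alias still writes to the old session
```

Here "row" ends up in the first session and the second session stays empty,
so every alias would have to become a delegating method (or be re-bound)
before sessions could be swapped per request.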
 
We're not sure whether making separate sessions per Request is going to
solve the http://llvm.org/perf instability problems, but that's the best
idea we've got so far.
 
Thanks,
 
Kristof
 
Daniel Dunbar
2015-May-10  21:09 UTC
[LLVMdev] http://llvm.org/perf/ instability: some clues
Hi Kristof,

It has been a long time since I looked at this, but aren't we just caching
the DB on the request object, thus ensuring there is one opened DB, and
hence session, per request? The V4DB is essentially just wrapping the
session.

 - Daniel
Chris Matthews
2015-May-11  05:48 UTC
[LLVMdev] http://llvm.org/perf/ instability: some clues
Daniel is correct.

I have been dealing with issues like this a lot recently with our internal
LNT. Assuming this is the same issue, I'll share what I have figured out so
far.

I run LNT from gunicorn, proxied with apache. We were getting a lot of
timeouts on submissions, and exhausted database connections. In our case, we
are using Postgres, and LNT would easily use all 100 connections Postgres
allows by default. Apache likes to mask these things by timing you out
early.

I traced our issue back to submitRun being extremely slow on about 1 in 5
submissions. I get a feeling it was the submissions for particular machines,
possibly some of the older machines, though I could not reproduce it
reliably. I found that with the timeout removed, some submissions were
taking 200s to process, and a few were as long as 400s. I did this by
putting print statements in our production server and watching the submit
requests being processed. All this time is spent generating the field
changes; I did not dig in any further than that.

What was happening is that as we were running more runs, we had more and
more submissions taking that long, and eventually the server would start
throwing errors like this, because all database connections were in use
doing submissions.

My solution: I jacked our timeouts way up (1000s) and upped our database
connections even further. We do now process all our submissions, and I only
feel sort of dirty when I think about it.

The real solution: we have to find out what is going on in field change
generation that is sucking up so much time!

Submit run is interesting: it is the only place in LNT where we write into
the database, and we are doing a lot of processing of the data when it
comes in.

Maybe we should track this with a PR? I am willing to help track this down.
I've found the lack of request logging makes it sort of tricky to nail down
how often this happens; the only insights I have are from adding print
statements to suspect code paths.
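
The print-statement timing described in this message could be made a little
more systematic with a small timing decorator that logs how long each
handler takes. This is only a sketch using the Python standard library: the
timed decorator, the logger name and the placeholder submit_run function are
hypothetical and are not part of LNT.

```python
import logging
import time
from functools import wraps

log = logging.getLogger("request_timing")  # hypothetical logger name


def timed(threshold_s=1.0):
    """Log the duration of each call; warn when it exceeds threshold_s."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            try:
                return func(*args, **kwargs)
            finally:
                # Logged even if the handler raises, so failures are timed too.
                elapsed = time.monotonic() - start
                level = (logging.WARNING if elapsed >= threshold_s
                         else logging.INFO)
                log.log(level, "%s took %.3fs", func.__name__, elapsed)
        return wrapper
    return decorator


@timed(threshold_s=0.01)
def submit_run(payload):
    """Placeholder for a slow submission handler."""
    time.sleep(0.02)  # simulate expensive processing
    return len(payload)
```

With a handler attached to the logger, each call produces one record, and
slow calls stand out at WARNING level, which would make it easier to see how
often long submissions occur.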