
Using Multiple Cursors In A Nested Loop In Sqlite3 From Python-2.7

I’ve been having problems using multiple cursors on a single sqlite database within a nested loop. I found a solution that works for me, but it’s limited and I haven’t seen t

Solution 1:

This looks like you are hitting issue 10513, fixed in Python 2.7.13, 3.5.3 and 3.6.0b1.

There was a bug in the way transactions were handled, where all cursor states were reset in certain circumstances. This led to curOuter starting from the beginning again.

The workaround is to upgrade or, until you can upgrade, to avoid using a cursor across transaction commits. By calling curOuter.fetchall() you achieved the latter.
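
A minimal sketch of that workaround, for illustration only. The table layout, the sample ids and the body of retrieve_shared_connections are assumptions made up for this example (the question only names the tables and the helper); the point is that the outer result set is fully materialized before any commit happens, so no open SELECT cursor has to survive a commit.

```python
import sqlite3


def retrieve_shared_connections(conn_id):
    # Hypothetical stand-in for the helper named in the question.
    return [conn_id * 10, conn_id * 10 + 1]


db = sqlite3.connect(":memory:")
# Assumed schema: only the column names used in the answers are known.
db.execute("CREATE TABLE myConnections (id INTEGER)")
db.execute(
    "CREATE TABLE sharedConnections (IdConnectedToMe INTEGER, IdShared INTEGER)"
)
db.executemany("INSERT INTO myConnections (id) VALUES (?)", [(1,), (2,), (3,)])
db.commit()

cur_outer = db.cursor()
cur_inner = db.cursor()

# fetchall() drains the outer SELECT, so nothing is left open across commits.
outer_rows = cur_outer.execute("SELECT id FROM myConnections").fetchall()

for (conn_id,) in outer_rows:
    for sc in retrieve_shared_connections(conn_id):
        cur_inner.execute(
            "INSERT INTO sharedConnections (IdConnectedToMe, IdShared) VALUES (?, ?)",
            (conn_id, sc),
        )
    db.commit()  # committing inside the loop no longer touches an open cursor

inserted = db.execute("SELECT COUNT(*) FROM sharedConnections").fetchone()[0]
```

The trade-off is memory: the whole outer result set is held in a Python list, which is fine for a list of ids but may not be for very large tables.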

Solution 2:

You could build up a list of rows to insert in the inner loop and then cursor.executemany() outside the loop. This doesn't answer the multiple cursor question but may be a workaround for you.

curOuter = db.cursor()
rows = []
for row in curOuter.execute('SELECT * FROM myConnections'):
    id = row[0]
    scList = retrieve_shared_connections(id)
    for sc in scList:
        rows.append((id, sc))
curOuter.executemany('''INSERT INTO sharedConnections(IdConnectedToMe, IdShared) VALUES (?,?)''', rows)
db.commit()

Better yet, select only the id column from myConnections:

curOuter.execute('SELECT id FROM myConnections')

Solution 3:

While building an in-memory list seems to be the best solution, I've found that using explicit transactions reduces the number of duplicates returned in the outer query. That would make it something like:

with db:
    curOuter = db.cursor()
    for row in curOuter.execute('SELECT * FROM myConnections'):    
        id  = row[0]
        with db:
            curInner = db.cursor()  
            scList = retrieve_shared_connections(id)  
            for sc in scList:  
                curInner.execute('''INSERT INTO sharedConnections(IdConnectedToMe, IdShared) VALUES (?,?)''', (id,sc))

Solution 4:

This is a bit older, I see. But when I stumbled upon this question, I wondered whether sqlite3 still has such issues in Python 2.7. Let's see:

#!/usr/bin/python
import sqlite3
import argparse
from datetime import datetime

DBFILE = 'nested.sqlite'
MAX_A = 1000
MAX_B = 10000

parser = argparse.ArgumentParser(description='Nested SQLite cursors in Python')
parser.add_argument('step', type=int)
args = parser.parse_args()

connection = sqlite3.connect(DBFILE)
connection.row_factory = sqlite3.Row
t0 = datetime.now()

if args.step == 0:
    # set up test database
    cursor = connection.cursor()
    cursor.execute("""DROP TABLE IF EXISTS A""")
    cursor.execute("""DROP TABLE IF EXISTS B""")
    # intentionally omitting primary keys
    cursor.execute("""CREATE TABLE A ( K INTEGER )""")
    cursor.execute("""CREATE TABLE B ( K INTEGER, L INTEGER )""")
    cursor.executemany("""INSERT INTO A ( K ) VALUES ( ? )""", 
        [ (i,) for i in range(0, MAX_A) ])
    connection.commit()
    for row in cursor.execute("""SELECT COUNT(*) CNT FROM A"""):
        print row['CNT']

if args.step == 1:
    # do the nested SELECT and INSERT
    read = connection.cursor()
    write = connection.cursor()
    for row in read.execute("""SELECT * FROM A"""):
        bs = [ ( row['K'], i ) for i in range(0, MAX_B) ]
        for b in bs: # with .executemany() it would be twice as fast ;)
            write.execute("""INSERT INTO B ( K, L ) VALUES ( ?, ? )""", b)
    connection.commit()
    for row in connection.cursor().execute("""SELECT COUNT(*) CNT FROM B"""):
        print row['CNT']

elif args.step == 2:
    connection = sqlite3.connect(DBFILE)
    connection.row_factory = sqlite3.Row
    control = connection.cursor()
    ca = cb = 0  # will count along our expectation
    for row in control.execute("""SELECT * FROM B ORDER BY K ASC, L ASC"""):
        assert row['K'] == ca and row['L'] == cb
        cb += 1
        if cb == MAX_B:
            cb = 0
            ca += 1
    assert ca == MAX_A and cb == 0
    for row in connection.cursor().execute("""SELECT COUNT(*) CNT FROM B"""):
        print row['CNT']

print datetime.now() - t0

Output is

$ ./nested.py 0
1000
0:00:04.465695
$ ./nested.py 1
10000000
0:00:27.726074
$ ./nested.py 2
10000000
0:00:19.137563

This test was done using

$ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13) [GCC 4.8.2] on linux2
>>> import sqlite3
>>> sqlite3.version
'2.6.0'
>>> sqlite3.sqlite_version
'3.8.2'

The situation changes when we commit in packages, e.g. by indenting the connection.commit() in step 1 of the above test script so that it runs inside the loop. The behavior is quite strange, because only the second commit to the write cursor resets the read cursor, exactly as shown in the OP. After fiddling with the code above, I assume that the OP did not do a single commit as shown in the example code, but committed in packages.
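
To check whether a given interpreter shows this reset, one rough approach is to count how many rows the read cursor yields while committing inside the loop. The sketch below is an illustration under assumptions, not the original test: it reuses the table names A and B from the script above, shrinks the sizes, uses an in-memory database, and adds a safety cap (not in the original) so that a resetting cursor cannot loop forever.

```python
import sqlite3

OUTER_ROWS = 5   # rows in A (small on purpose; the script above uses MAX_A)
INNER_ROWS = 2   # rows written to B per row of A

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE A ( K INTEGER )")
connection.execute("CREATE TABLE B ( K INTEGER, L INTEGER )")
connection.executemany(
    "INSERT INTO A ( K ) VALUES ( ? )", [(i,) for i in range(OUTER_ROWS)]
)
connection.commit()

read = connection.cursor()
write = connection.cursor()

visited = 0
for (k,) in read.execute("SELECT K FROM A"):
    visited += 1
    if visited > 10 * OUTER_ROWS:
        break  # safety cap: stop if the read cursor keeps starting over
    write.executemany(
        "INSERT INTO B ( K, L ) VALUES ( ?, ? )",
        [(k, i) for i in range(INNER_ROWS)],
    )
    connection.commit()  # commit "in packages", once per outer row

# If the read cursor was reset by a commit, more rows than exist were visited.
cursor_was_reset = visited != OUTER_ROWS
```

On an interpreter that includes the fix for issue 10513, visited should equal OUTER_ROWS; a larger count suggests the read cursor was restarted by one of the commits.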

Remark: Drawing the cursors read and write from separate connections to support packaged commit, as suggested in an answer to another question, does not work because the commits will run against a foreign lock.
