psycopg2: insert multiple rows with one query












103 votes
I need to insert multiple rows with one query (the number of rows is not constant), so I need to execute a query like this one:



INSERT INTO t (a, b) VALUES (1, 2), (3, 4), (5, 6);


The only way I know is



args = [(1,2), (3,4), (5,6)]
args_str = ','.join(cursor.mogrify("%s", (x, )) for x in args)
cursor.execute("INSERT INTO t (a, b) VALUES "+args_str)


but I want some simpler way.










python postgresql psycopg2

asked Nov 15 '11 at 10:09 by Sergey Fedoseev; edited Nov 17 '11 at 14:28




















13 Answers


















182 votes














I built a program that inserts multiple rows into a server that was located in another city.

I found out that using this method was about 10 times faster than executemany(). In my case tup is a tuple containing about 2000 rows. It took about 10 seconds when using this method:



          args_str = ','.join(cur.mogrify("(%s,%s,%s,%s,%s,%s,%s,%s,%s)", x) for x in tup)
          cur.execute("INSERT INTO table VALUES " + args_str)


          and 2 minutes when using this method:



          cur.executemany("INSERT INTO table VALUES(%s,%s,%s,%s,%s,%s,%s,%s,%s)", tup)
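For reference, a minimal end-to-end sketch of this approach on Python 3, where mogrify() returns bytes and needs a decode before joining; the connection string and the two-column table t(a, b) are assumptions for illustration:

import psycopg2

conn = psycopg2.connect("dbname=test")  # assumed connection string
cur = conn.cursor()

tup = [(1, 2), (3, 4), (5, 6)]
# mogrify() returns bytes on Python 3, so decode each piece before joining
args_str = ','.join(cur.mogrify("(%s,%s)", x).decode('utf-8') for x in tup)
cur.execute("INSERT INTO t (a, b) VALUES " + args_str)
conn.commit()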





answered Apr 13 '12 at 19:53 by ant32; edited Mar 9 '17 at 11:40 by Alex Riley

• (10) Still very relevant almost two years later. My experience today suggests that the more rows you want to push, the better the execute strategy performs. I saw a speedup of around 100x thanks to this! – Rob Watts, Jan 22 '14 at 21:16

• (3) Perhaps executemany runs a commit after each insert. If you instead wrap the whole thing in a transaction, maybe that would expedite things? – Richard, Apr 12 '15 at 17:06

• (4) Just confirmed this improvement myself. From what I've read, psycopg2's executemany doesn't do anything optimal, it just loops and does many execute statements. Using this method, a 700-row insert to a remote server went from 60 s to <2 s. – Nelson, Apr 27 '15 at 23:22

• (2) Maybe I'm being paranoid, but concatenating the query with a + seems like it could open it up to SQL injection; I feel like @Clodoaldo Neto's execute_values() solution is safer. – Will Munn, Jan 17 '18 at 11:55

• (4) In case someone encounters the error [TypeError: sequence item 0: expected str instance, bytes found], use args_str = ','.join(cur.mogrify("(%s,%s)", x).decode("utf-8") for x in tup) instead. – mrt, Sep 3 '18 at 23:08





















113 votes














          New execute_values method in Psycopg 2.7:



          data = [(1,'x'), (2,'y')]
          insert_query = 'insert into t (a, b) values %s'
psycopg2.extras.execute_values(
    cursor, insert_query, data, template=None, page_size=100
)


          The pythonic way of doing it in Psycopg 2.6:



          data = [(1,'x'), (2,'y')]
          records_list_template = ','.join(['%s'] * len(data))
          insert_query = 'insert into t (a, b) values {}'.format(records_list_template)
          cursor.execute(insert_query, data)


          Explanation: If the data to be inserted is given as a list of tuples like in



          data = [(1,'x'), (2,'y')]


          then it is already in the exact required format as





          1. the values syntax of the insert clause expects a list of records as in



            insert into t (a, b) values (1, 'x'),(2, 'y')



          2. Psycopg adapts a Python tuple to a Postgresql record.



          The only necessary work is to provide a records list template to be filled by psycopg



          # We use the data list to be sure of the template length
          records_list_template = ','.join(['%s'] * len(data))


          and place it in the insert query



          insert_query = 'insert into t (a, b) values {}'.format(records_list_template)


          Printing the insert_query outputs



          insert into t (a, b) values %s,%s


          Now to the usual Psycopg arguments substitution



          cursor.execute(insert_query, data)


          Or just testing what will be sent to the server



          print (cursor.mogrify(insert_query, data).decode('utf8'))


          Output:



          insert into t (a, b) values (1, 'x'),(2, 'y')
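Putting the 2.7 method together, a minimal runnable sketch could look like this (the connection string and the table t(a, b) are assumptions for illustration):

import psycopg2
import psycopg2.extras

conn = psycopg2.connect("dbname=test")  # assumed connection string
cursor = conn.cursor()

data = [(1, 'x'), (2, 'y')]
# the single %s placeholder is expanded by execute_values into one record
# per tuple; page_size caps how many records go into each statement
psycopg2.extras.execute_values(
    cursor,
    "INSERT INTO t (a, b) VALUES %s",
    data,
    page_size=100,
)
conn.commit()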





answered Jun 22 '15 at 16:50 by Clodoaldo Neto; edited May 18 '17 at 12:20


• How does the performance of this method compare with cur.copy_from? – Michael Goldshteyn, Mar 3 '16 at 22:05

• Here's a gist with a benchmark. copy_from scales to about 6.5x faster on my machine with 10M records. – Joseph Sheedy, Jul 13 '16 at 0:28

• Looks nice - I think you have a stray , at the end of your initial definition of insert_query (unless you were trying to make it a tuple?) and are missing an s after the % for %s, also in the initial definition of insert_query. – deadcode, May 18 '17 at 1:22

• @deadcode Fixed, thanks. – Clodoaldo Neto, May 18 '17 at 12:20

• (1) Using execute_values I was able to take my system from 1k records a minute up to 128k records a minute. – Conrad.Dean, Nov 15 '18 at 15:17



















38 votes














          Update with psycopg2 2.7:



The classic executemany() is about 60 times slower than @ant32's implementation (called "folded"), as explained in this thread: https://www.postgresql.org/message-id/20170130215151.GA7081%40deb76.aryehleib.com



          This implementation was added to psycopg2 in version 2.7 and is called execute_values():



          from psycopg2.extras import execute_values
execute_values(cur,
    "INSERT INTO test (id, v1, v2) VALUES %s",
    [(1, 2, 3), (4, 5, 6), (7, 8, 9)])




          Previous Answer:



          To insert multiple rows, using the multirow VALUES syntax with execute() is about 10x faster than using psycopg2 executemany(). Indeed, executemany() just runs many individual INSERT statements.



@ant32's code works perfectly in Python 2. But in Python 3, cursor.mogrify() returns bytes, cursor.execute() takes either bytes or strings, and ','.join() expects str instances.

So in Python 3 you may need to modify @ant32's code by adding .decode('utf-8'):



          args_str = ','.join(cur.mogrify("(%s,%s,%s,%s,%s,%s,%s,%s,%s)", x).decode('utf-8') for x in tup)
          cur.execute("INSERT INTO table VALUES " + args_str)


          Or by using bytes (with b'' or b"") only:



          args_bytes = b','.join(cur.mogrify("(%s,%s,%s,%s,%s,%s,%s,%s,%s)", x) for x in tup)
          cur.execute(b"INSERT INTO table VALUES " + args_bytes)








23 votes














            A snippet from Psycopg2's tutorial page at Postgresql.org (see bottom):




            A last item I would like to show you is how to insert multiple rows using a dictionary. If you had the following:




namedict = ({"first_name": "Joshua", "last_name": "Drake"},
            {"first_name": "Steven", "last_name": "Foo"},
            {"first_name": "David", "last_name": "Bar"})



            You could easily insert all three rows within the dictionary by using:




            cur = conn.cursor()
            cur.executemany("""INSERT INTO bar(first_name,last_name) VALUES (%(first_name)s, %(last_name)s)""", namedict)


It doesn't save much code, but it definitely looks better.
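As the comment below notes, executemany() still runs one INSERT per row. If you want the dictionary convenience combined with a single multi-row statement, execute_values() (psycopg2 >= 2.7) also accepts dictionaries when given a named template; a sketch under that assumption:

from psycopg2.extras import execute_values

# assumes psycopg2 >= 2.7 and the bar table from the snippet above;
# with a mapping argslist, the template uses named placeholders
execute_values(cur,
    "INSERT INTO bar (first_name, last_name) VALUES %s",
    namedict,
    template="(%(first_name)s, %(last_name)s)")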







• (29) This will run many individual INSERT statements. Useful, but not the same as a single multi-VALUEd insert. – Craig Ringer, Apr 9 '13 at 2:26



















21 votes














            cursor.copy_from is the fastest solution I've found for bulk inserts by far. Here's a gist I made containing a class named IteratorFile which allows an iterator yielding strings to be read like a file. We can convert each input record to a string using a generator expression. So the solution would be



args = [(1,2), (3,4), (5,6)]
f = IteratorFile(("{}\t{}".format(x[0], x[1]) for x in args))
cursor.copy_from(f, 'table_name', columns=('a', 'b'))


For this trivial size of args it won't make much of a speed difference, but I see big speedups when dealing with thousands of rows. It will also be more memory efficient than building a giant query string: an iterator only ever holds one input record in memory at a time, whereas building the query string can at some point exhaust the memory of your Python process or of Postgres.
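A related sketch that skips the gist's IteratorFile class: build the tab-separated payload in an in-memory io.StringIO buffer and hand it to copy_from. Unlike the iterator approach it holds the whole payload in memory, but it needs no helper class (table_name and columns a, b follow the example above):

import io

args = [(1, 2), (3, 4), (5, 6)]
# copy_from expects tab-separated columns and newline-terminated rows by default
buf = io.StringIO("".join("{}\t{}\n".format(a, b) for a, b in args))
cursor.copy_from(buf, 'table_name', columns=('a', 'b'))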







• (2) Here is a benchmark comparing copy_from/IteratorFile with a query builder solution. copy_from scales to about 6.5x faster on my machine with 10M records. – Joseph Sheedy, Sep 30 '16 at 17:49

• (3) Do you have to dick around with escaping strings and timestamps etc? – CpILL, Jun 13 '17 at 9:35

• Yes, you'll have to make sure you have well formed TSV records. – Joseph Sheedy, Jun 14 '17 at 2:51



















6 votes














All of these techniques are called "extended inserts" in Postgres terminology, and as of the 24th of November 2016, it's still a ton faster than psycopg2's executemany() and all the other methods listed in this thread (which I tried before coming to this answer).



Here's some code which doesn't use cur.mogrify and is nice and simple to get your head around:



valueSQL = ['%s', '%s', '%s']  # as many as you have columns
sqlrows = []
rowsPerInsert = 3  # more means faster, but with diminishing returns..
for row in getSomeData:
    # row == [1, 'a', 'yolo', ... ]
    sqlrows += row
    if (len(sqlrows) // len(valueSQL)) % rowsPerInsert == 0:
        # sqlrows == [ 1, 'a', 'yolo', 2, 'b', 'swag', 3, 'c', 'selfie' ]
        insertSQL = 'INSERT INTO "twitter" VALUES ' + ','.join(['(' + ','.join(valueSQL) + ')'] * rowsPerInsert)
        cur.execute(insertSQL, sqlrows)
        con.commit()
        sqlrows = []

# flush any leftover rows that didn't fill a whole batch
if sqlrows:
    insertSQL = 'INSERT INTO "twitter" VALUES ' + ','.join(['(' + ','.join(valueSQL) + ')'] * (len(sqlrows) // len(valueSQL)))
    cur.execute(insertSQL, sqlrows)
    con.commit()


            But it should be noted that if you can use copy_from(), you should use copy_from ;)








• Bringing up from the dead, but what happens in the situation of the last few rows? I assume you actually run that final clause again on the last remaining rows, in the case you have an uneven number of rows? – mcpeterson, Nov 23 '16 at 23:03

• Correct, sorry, I must have forgotten to do that when I wrote the example - that's pretty stupid of me. Not doing so wouldn't have given people an error, which makes me worried how many people copy/pasted the solution and went about their business... Anyway, very grateful mcpeterson - thank you! – J.J, Nov 24 '16 at 1:49



















3 votes














To nicely insert a list of rows into the DB, using a user-given batch size, with psycopg2:



def get_batch(iterable, size=100):
    for i in range(0, len(iterable), size):
        yield iterable[i: i + size]


def insert_rows_batch(table, rows, batch_size=500, target_fields=None):
    """
    A utility method to insert a batch of tuples (rows) into a table.
    NOTE: Handle the data type of the fields in rows yourself, as per
    your table columns' types.

    :param table: Name of the target table
    :type table: str

    :param rows: The rows to insert into the table
    :type rows: iterable of tuples

    :param batch_size: The size of the batch of rows to insert at a time
    :type batch_size: int

    :param target_fields: The names of the columns to fill in the table
    :type target_fields: iterable of strings
    """
    conn = cur = None
    if target_fields:
        target_fields = ", ".join(target_fields)
        target_fields = "({})".format(target_fields)
    else:
        target_fields = ''

    conn = get_conn()  # get connection using psycopg2
    if conn:
        cur = conn.cursor()
        count = 0

        for mini_batch in get_batch(rows, batch_size):
            mini_batch_size = len(mini_batch)
            count += mini_batch_size
            # one %s per row; psycopg2 adapts each tuple to a record
            record_template = ','.join(["%s"] * mini_batch_size)
            sql = "INSERT INTO {0} {1} VALUES {2};".format(
                table,
                target_fields,
                record_template)
            cur.execute(sql, mini_batch)
            conn.commit()
            print("Loaded {} rows into {} so far".format(count, table))
        print("Done loading. Loaded a total of {} rows".format(count))
    if cur:
        cur.close()
    if conn:
        conn.close()


            If you want UPSERT (Insert+Update) as well in postgres with batches: postgres_utilities
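For illustration, a hypothetical call to the helper above (get_conn() and a table my_table(a int, b text) are assumptions, not part of the answer):

rows = [(1, 'x'), (2, 'y'), (3, 'z')]
insert_rows_batch("my_table", rows, batch_size=500, target_fields=("a", "b"))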








1 vote














Another nice and efficient approach is to pass the rows for insertion as a single argument, which is an array of JSON objects.



E.g. passing the argument:



[{"id": 18, "score": 1}, {"id": 19, "score": 5}]


It is an array, which may contain any number of objects inside.
Then your SQL looks like:



INSERT INTO links (parent_id, child_id, score)
SELECT 123, (r->>'id')::int, (r->>'score')::int
FROM json_array_elements($1::json) AS r


Notice: your Postgres must be new enough to support json (9.3 or later for json_array_elements).
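A sketch of the Python side of this approach with psycopg2 (the links table and the constant 123 come from the SQL above; psycopg2 uses %s placeholders rather than $1, and json.dumps serializes the whole row list into the single parameter):

import json

rows = [{"id": 18, "score": 1}, {"id": 19, "score": 5}]
# the whole row set travels as one JSON parameter, so the statement
# needs exactly one placeholder no matter how many rows there are
cur.execute(
    """
    INSERT INTO links (parent_id, child_id, score)
    SELECT 123, (r->>'id')::int, (r->>'score')::int
    FROM json_array_elements(%s::json) AS r
    """,
    (json.dumps(rows),),
)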







0 votes














I've been using ant32's answer above for several years. However, I've found that it throws an error in Python 3 because mogrify returns a byte string.



Converting explicitly to byte strings is a simple solution for making the code Python 3 compatible.



                args_str = b','.join(cur.mogrify("(%s,%s,%s,%s,%s,%s,%s,%s,%s)", x) for x in tup) 
                cur.execute(b"INSERT INTO table VALUES " + args_str)






0 votes














                  If you're using SQLAlchemy, you don't need to mess with hand-crafting the string because SQLAlchemy supports generating a multi-row VALUES clause for a single INSERT statement:



rows = []
for i, name in enumerate(rawdata):
    row = {
        'id': i,
        'name': name,
        'valid': True,
    }
    rows.append(row)

if len(rows) > 0:  # INSERT fails if no rows
    insert_query = SQLAlchemyModelName.__table__.insert().values(rows)
    session.execute(insert_query)






• Under the hood SQLAlchemy uses psycopg2's executemany() for calls like this, and therefore this answer will have severe performance issues for large queries. See the execute method: docs.sqlalchemy.org/en/latest/orm/session_api.html. – sage88, Jan 5 '17 at 22:48

• (2) I don't think that's the case. It's been a bit since I looked at this, but IIRC, this is actually building a single insertion statement in the insert_query line. Then, session.execute() is just calling psycopg2's execute() statement with a single massive string. So the "trick" is building the entire insertion statement object first. I am using this to insert 200,000 rows at a time and saw massive performance increases using this code compared to the normal executemany(). – Jeff Widman, Jan 6 '17 at 1:13

• (1) The SQLAlchemy doc you linked to has a section that shows exactly how this works and even says: "It is essential to note that passing multiple values is NOT the same as using traditional executemany() form". So it is explicitly calling out that this works. – Jeff Widman, Jan 6 '17 at 1:23

• (1) I stand corrected. I didn't notice your usage of the values() method (without it SQLAlchemy just does executemany). I would say edit the answer to include a link to that doc so that I can change my vote, but obviously you've already included it. Perhaps mention that this is not the same thing as calling an insert() with execute() with a list of dicts? – sage88, Jan 6 '17 at 6:15



















-1 votes














If you want to insert multiple rows within one insert statement (assuming you are not using an ORM), the easiest way so far for me would be to use a list of dictionaries. Here is an example:



t = [{'id': 1, 'start_date': '2015-07-19 00:00:00', 'end_date': '2015-07-20 00:00:00', 'campaignid': 6},
     {'id': 2, 'start_date': '2015-07-19 00:00:00', 'end_date': '2015-07-20 00:00:00', 'campaignid': 7},
     {'id': 3, 'start_date': '2015-07-19 00:00:00', 'end_date': '2015-07-20 00:00:00', 'campaignid': 8}]

conn.execute("""insert into campaign_dates
                    (id, start_date, end_date, campaignid)
                values (%(id)s, %(start_date)s, %(end_date)s, %(campaignid)s);""",
             t)


                  As you can see only one query will be executed:



                  INFO sqlalchemy.engine.base.Engine insert into campaign_dates (id, start_date, end_date, campaignid) values (%(id)s, %(start_date)s, %(end_date)s, %(campaignid)s);
                  INFO sqlalchemy.engine.base.Engine [{'campaignid': 6, 'id': 1, 'end_date': '2015-07-20 00:00:00', 'start_date': '2015-07-19 00:00:00'}, {'campaignid': 7, 'id': 2, 'end_date': '2015-07-20 00:00:00', 'start_date': '2015-07-19 00:00:00'}, {'campaignid': 8, 'id': 3, 'end_date': '2015-07-20 00:00:00', 'start_date': '2015-07-19 00:00:00'}]
                  INFO sqlalchemy.engine.base.Engine COMMIT







• Showing logging from the sqlalchemy engine is NOT a demonstration of only running a single query; it just means that the sqlalchemy engine ran one command. Under the hood this is using psycopg2's executemany, which is very inefficient. See the execute method: docs.sqlalchemy.org/en/latest/orm/session_api.html. – sage88, Jan 5 '17 at 23:06



















-2 votes














Finally, in SQLAlchemy 1.2, a new implementation was added that uses psycopg2.extras.execute_batch() instead of executemany when you initialize your engine with use_batch_mode=True, like:



engine = create_engine(
    "postgresql+psycopg2://scott:tiger@host/dbname",
    use_batch_mode=True)


                  http://docs.sqlalchemy.org/en/latest/changelog/migration_12.html#change-4109



Then anyone who has to use SQLAlchemy won't bother trying different combinations of sqla, psycopg2, and direct SQL together.







-3 votes














Using aiopg - the snippet below works perfectly fine:



# items = [10, 11, 12, 13]
# gid = 1
tup = [(gid, pid) for pid in items]
args_str = ",".join([str(s) for s in tup])
# insert into group values (1, 10), (1, 11), (1, 12), (1, 13)
yield from cur.execute("INSERT INTO group VALUES " + args_str)






• (8) This method is not safe from SQL injections. As the psycopg2 documentation states (which aiopg links to): 'Never, never, NEVER use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string. Not even at gunpoint.' – Michał Pawłowski, Sep 10 '15 at 16:13













                    Your Answer






                    StackExchange.ifUsing("editor", function () {
                    StackExchange.using("externalEditor", function () {
                    StackExchange.using("snippets", function () {
                    StackExchange.snippets.init();
                    });
                    });
                    }, "code-snippets");

                    StackExchange.ready(function() {
                    var channelOptions = {
                    tags: "".split(" "),
                    id: "1"
                    };
                    initTagRenderer("".split(" "), "".split(" "), channelOptions);

                    StackExchange.using("externalEditor", function() {
                    // Have to fire editor after snippets, if snippets enabled
                    if (StackExchange.settings.snippets.snippetsEnabled) {
                    StackExchange.using("snippets", function() {
                    createEditor();
                    });
                    }
                    else {
                    createEditor();
                    }
                    });

                    function createEditor() {
                    StackExchange.prepareEditor({
                    heartbeatType: 'answer',
                    autoActivateHeartbeat: false,
                    convertImagesToLinks: true,
                    noModals: true,
                    showLowRepImageUploadWarning: true,
                    reputationToPostImages: 10,
                    bindNavPrevention: true,
                    postfix: "",
                    imageUploader: {
                    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
                    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
                    allowUrls: true
                    },
                    onDemand: true,
                    discardSelector: ".discard-answer"
                    ,immediatelyShowMarkdownHelp:true
                    });


                    }
                    });














                    draft saved

                    draft discarded


















                    StackExchange.ready(
                    function () {
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f8134602%2fpsycopg2-insert-multiple-rows-with-one-query%23new-answer', 'question_page');
                    }
                    );

                    Post as a guest















                    Required, but never shown

























                    13 Answers
                    13






                    active

                    oldest

                    votes








                    13 Answers
                    13






                    active

                    oldest

                    votes









                    active

                    oldest

                    votes






                    active

                    oldest

                    votes









                    182














                    I built a program that inserts multiple lines to a server that was located in another city.



                    I found out that using this method was about 10 times faster than executemany. In my case tup is a tuple containing about 2000 rows. It took about 10 seconds when using this method:



                    args_str = ','.join(cur.mogrify("(%s,%s,%s,%s,%s,%s,%s,%s,%s)", x) for x in tup)
                    cur.execute("INSERT INTO table VALUES " + args_str)


                    and 2 minutes when using this method:



                    cur.executemany("INSERT INTO table VALUES(%s,%s,%s,%s,%s,%s,%s,%s,%s)", tup)





                    share|improve this answer





















                    • 10





                      Still very relevant almost two years later. An experience today suggests that as the number of rows you want to push increases, the better it is to use the execute strategy. I saw speedup of around 100x thanks to this!

                      – Rob Watts
                      Jan 22 '14 at 21:16








                    • 3





                      Perhaps executemany runs a commit after each insert. If you instead wrap the whole thing in a transaction, maybe that would expedite things?

                      – Richard
                      Apr 12 '15 at 17:06






                    • 4





                      Just confirmed this improvement myself. From what I've read psycopg2's executemany doesn't do anything optimal, just loops and does many execute statements. Using this method, a 700 row insert to a remote server went from 60s to <2s.

                      – Nelson
                      Apr 27 '15 at 23:22






                    • 2





                      Maybe I'm being paranoid, but concatenating the query with a + seems like it could open up to sql injection, I feel like @Clodoaldo Neto execute_values() solution is safer.

                      – Will Munn
                      Jan 17 '18 at 11:55








                    • 4





                      in case someone encounters the following error: [TypeError: sequence item 0: expected str instance, bytes found] run this command instead [args_str = ','.join(cur.mogrify("(%s,%s)", x).decode("utf-8") for x in tup)]

                      – mrt
                      Sep 3 '18 at 23:08


















                    182














                    I built a program that inserts multiple lines to a server that was located in another city.



                    I found out that using this method was about 10 times faster than executemany. In my case tup is a tuple containing about 2000 rows. It took about 10 seconds when using this method:



                    args_str = ','.join(cur.mogrify("(%s,%s,%s,%s,%s,%s,%s,%s,%s)", x) for x in tup)
                    cur.execute("INSERT INTO table VALUES " + args_str)


                    and 2 minutes when using this method:



                    cur.executemany("INSERT INTO table VALUES(%s,%s,%s,%s,%s,%s,%s,%s,%s)", tup)





                    share|improve this answer





















                    • 10





                      Still very relevant almost two years later. An experience today suggests that as the number of rows you want to push increases, the better it is to use the execute strategy. I saw speedup of around 100x thanks to this!

                      – Rob Watts
                      Jan 22 '14 at 21:16








                    • 3





                      Perhaps executemany runs a commit after each insert. If you instead wrap the whole thing in a transaction, maybe that would expedite things?

                      – Richard
                      Apr 12 '15 at 17:06






                    • 4





                      Just confirmed this improvement myself. From what I've read psycopg2's executemany doesn't do anything optimal, just loops and does many execute statements. Using this method, a 700 row insert to a remote server went from 60s to <2s.

                      – Nelson
                      Apr 27 '15 at 23:22






                    • 2





                      Maybe I'm being paranoid, but concatenating the query with a + seems like it could open up to sql injection, I feel like @Clodoaldo Neto execute_values() solution is safer.

                      – Will Munn
                      Jan 17 '18 at 11:55








                    • 4





                      in case someone encounters the following error: [TypeError: sequence item 0: expected str instance, bytes found] run this command instead [args_str = ','.join(cur.mogrify("(%s,%s)", x).decode("utf-8") for x in tup)]

                      – mrt
                      Sep 3 '18 at 23:08
















                    182












                    182








                    182







                    I built a program that inserts multiple lines to a server that was located in another city.



                    I found out that using this method was about 10 times faster than executemany. In my case tup is a tuple containing about 2000 rows. It took about 10 seconds when using this method:



                    args_str = ','.join(cur.mogrify("(%s,%s,%s,%s,%s,%s,%s,%s,%s)", x) for x in tup)
                    cur.execute("INSERT INTO table VALUES " + args_str)


                    and 2 minutes when using this method:



                    cur.executemany("INSERT INTO table VALUES(%s,%s,%s,%s,%s,%s,%s,%s,%s)", tup)





                    share|improve this answer















                    I built a program that inserts multiple lines to a server that was located in another city.



                    I found out that using this method was about 10 times faster than executemany. In my case tup is a tuple containing about 2000 rows. It took about 10 seconds when using this method:



                    args_str = ','.join(cur.mogrify("(%s,%s,%s,%s,%s,%s,%s,%s,%s)", x) for x in tup)
                    cur.execute("INSERT INTO table VALUES " + args_str)


                    and 2 minutes when using this method:



                    cur.executemany("INSERT INTO table VALUES(%s,%s,%s,%s,%s,%s,%s,%s,%s)", tup)






                    share|improve this answer














                    share|improve this answer



                    share|improve this answer








                    edited Mar 9 '17 at 11:40









                    Alex Riley

                    80k26160163




                    80k26160163










                    answered Apr 13 '12 at 19:53









                    ant32ant32

                    1,949184




                    1,949184








                    • 10





                      Still very relevant almost two years later. An experience today suggests that as the number of rows you want to push increases, the better it is to use the execute strategy. I saw speedup of around 100x thanks to this!

                      – Rob Watts
                      Jan 22 '14 at 21:16








                    • 3





                      Perhaps executemany runs a commit after each insert. If you instead wrap the whole thing in a transaction, maybe that would expedite things?

                      – Richard
                      Apr 12 '15 at 17:06






                    • 4





                      Just confirmed this improvement myself. From what I've read psycopg2's executemany doesn't do anything optimal, just loops and does many execute statements. Using this method, a 700 row insert to a remote server went from 60s to <2s.

                      – Nelson
                      Apr 27 '15 at 23:22






                    • 2





                      Maybe I'm being paranoid, but concatenating the query with a + seems like it could open up to sql injection, I feel like @Clodoaldo Neto execute_values() solution is safer.

                      – Will Munn
                      Jan 17 '18 at 11:55








                    • 4





                      in case someone encounters the following error: [TypeError: sequence item 0: expected str instance, bytes found] run this command instead [args_str = ','.join(cur.mogrify("(%s,%s)", x).decode("utf-8") for x in tup)]

                      – mrt
                      Sep 3 '18 at 23:08
















                    • 10





                      Still very relevant almost two years later. An experience today suggests that as the number of rows you want to push increases, the better it is to use the execute strategy. I saw speedup of around 100x thanks to this!

                      – Rob Watts
                      Jan 22 '14 at 21:16








                    • 3





                      Perhaps executemany runs a commit after each insert. If you instead wrap the whole thing in a transaction, maybe that would expedite things?

                      – Richard
                      Apr 12 '15 at 17:06






                    • 4





                      Just confirmed this improvement myself. From what I've read psycopg2's executemany doesn't do anything optimal, just loops and does many execute statements. Using this method, a 700 row insert to a remote server went from 60s to <2s.

                      – Nelson
                      Apr 27 '15 at 23:22






                    • 2





                      Maybe I'm being paranoid, but concatenating the query with a + seems like it could open up to sql injection, I feel like @Clodoaldo Neto execute_values() solution is safer.

                      – Will Munn
                      Jan 17 '18 at 11:55








                    • 4





                      in case someone encounters the following error: [TypeError: sequence item 0: expected str instance, bytes found] run this command instead [args_str = ','.join(cur.mogrify("(%s,%s)", x).decode("utf-8") for x in tup)]

                      – mrt
                      Sep 3 '18 at 23:08










                    10




                    10





                    Still very relevant almost two years later. An experience today suggests that as the number of rows you want to push increases, the better it is to use the execute strategy. I saw speedup of around 100x thanks to this!

                    – Rob Watts
                    Jan 22 '14 at 21:16







                    Still very relevant almost two years later. An experience today suggests that as the number of rows you want to push increases, the better it is to use the execute strategy. I saw speedup of around 100x thanks to this!

                    – Rob Watts
                    Jan 22 '14 at 21:16






                    3




                    3





                    Perhaps executemany runs a commit after each insert. If you instead wrap the whole thing in a transaction, maybe that would expedite things?

                    – Richard
                    Apr 12 '15 at 17:06





                    Perhaps executemany runs a commit after each insert. If you instead wrap the whole thing in a transaction, maybe that would expedite things?

                    – Richard
                    Apr 12 '15 at 17:06




                    4




                    4





                    Just confirmed this improvement myself. From what I've read psycopg2's executemany doesn't do anything optimal, just loops and does many execute statements. Using this method, a 700 row insert to a remote server went from 60s to <2s.

                    – Nelson
                    Apr 27 '15 at 23:22





                    Just confirmed this improvement myself. From what I've read psycopg2's executemany doesn't do anything optimal, just loops and does many execute statements. Using this method, a 700 row insert to a remote server went from 60s to <2s.

                    – Nelson
                    Apr 27 '15 at 23:22




                    2




                    2





                    Maybe I'm being paranoid, but concatenating the query with a + seems like it could open up to sql injection, I feel like @Clodoaldo Neto execute_values() solution is safer.

                    – Will Munn
                    Jan 17 '18 at 11:55







                    Maybe I'm being paranoid, but concatenating the query with a + seems like it could open up to sql injection, I feel like @Clodoaldo Neto execute_values() solution is safer.

                    – Will Munn
                    Jan 17 '18 at 11:55






                    4




                    4





                    in case someone encounters the following error: [TypeError: sequence item 0: expected str instance, bytes found] run this command instead [args_str = ','.join(cur.mogrify("(%s,%s)", x).decode("utf-8") for x in tup)]

                    – mrt
                    Sep 3 '18 at 23:08







                    in case someone encounters the following error: [TypeError: sequence item 0: expected str instance, bytes found] run this command instead [args_str = ','.join(cur.mogrify("(%s,%s)", x).decode("utf-8") for x in tup)]

                    – mrt
                    Sep 3 '18 at 23:08















                    113














                    New execute_values method in Psycopg 2.7:



                    data = [(1,'x'), (2,'y')]
                    insert_query = 'insert into t (a, b) values %s'
                    psycopg2.extras.execute_values (
                    cursor, insert_query, data, template=None, page_size=100
                    )


                    The pythonic way of doing it in Psycopg 2.6:



                    data = [(1,'x'), (2,'y')]
                    records_list_template = ','.join(['%s'] * len(data))
                    insert_query = 'insert into t (a, b) values {}'.format(records_list_template)
                    cursor.execute(insert_query, data)


                    Explanation: If the data to be inserted is given as a list of tuples like in



                    data = [(1,'x'), (2,'y')]


                    then it is already in the exact required format as





                    1. the values syntax of the insert clause expects a list of records as in



                      insert into t (a, b) values (1, 'x'),(2, 'y')



                    2. Psycopg adapts a Python tuple to a Postgresql record.



                    The only necessary work is to provide a records list template to be filled by psycopg



                    # We use the data list to be sure of the template length
                    records_list_template = ','.join(['%s'] * len(data))


                    and place it in the insert query



                    insert_query = 'insert into t (a, b) values {}'.format(records_list_template)


                    Printing the insert_query outputs



                    insert into t (a, b) values %s,%s


                    Now to the usual Psycopg arguments substitution



                    cursor.execute(insert_query, data)


                    Or just testing what will be sent to the server



                    print (cursor.mogrify(insert_query, data).decode('utf8'))


                    Output:



                    insert into t (a, b) values (1, 'x'),(2, 'y')





                    share|improve this answer


























                    • How does the performance of this method compare with cur.copy_from?

                      – Michael Goldshteyn
                      Mar 3 '16 at 22:05











                    • Here's a gist with a benchmark. copy_from scales to about 6.5X faster on my machine with 10M records.

                      – Joseph Sheedy
                      Jul 13 '16 at 0:28











                    • Looks nice - I think you have a stray , at the end of your initial definition of insert_query (unless you were trying to make it a tuple?) and are missing a s after the % for %s also in the initial definition of insert_query.

                      – deadcode
                      May 18 '17 at 1:22











                    • @deadcode Fixed, Thanks.

                      – Clodoaldo Neto
                      May 18 '17 at 12:20






                    • 1





                      using execute_values I was able to get my system running at 1k records a minute up to 128k records a minute

                      – Conrad.Dean
                      Nov 15 '18 at 15:17
















                    113














                    New execute_values method in Psycopg 2.7:



                    data = [(1,'x'), (2,'y')]
                    insert_query = 'insert into t (a, b) values %s'
                    psycopg2.extras.execute_values (
                    cursor, insert_query, data, template=None, page_size=100
                    )


                    The pythonic way of doing it in Psycopg 2.6:



                    data = [(1,'x'), (2,'y')]
                    records_list_template = ','.join(['%s'] * len(data))
                    insert_query = 'insert into t (a, b) values {}'.format(records_list_template)
                    cursor.execute(insert_query, data)


                    Explanation: If the data to be inserted is given as a list of tuples like in



                    data = [(1,'x'), (2,'y')]


                    then it is already in the exact required format as





                    1. the values syntax of the insert clause expects a list of records as in



                      insert into t (a, b) values (1, 'x'),(2, 'y')



                    2. Psycopg adapts a Python tuple to a Postgresql record.



                    The only necessary work is to provide a records list template to be filled by psycopg



                    # We use the data list to be sure of the template length
                    records_list_template = ','.join(['%s'] * len(data))


                    and place it in the insert query



                    insert_query = 'insert into t (a, b) values {}'.format(records_list_template)


                    Printing the insert_query outputs



                    insert into t (a, b) values %s,%s


                    Now to the usual Psycopg arguments substitution



                    cursor.execute(insert_query, data)


                    Or just testing what will be sent to the server



                    print (cursor.mogrify(insert_query, data).decode('utf8'))


                    Output:



                    insert into t (a, b) values (1, 'x'),(2, 'y')





                    share|improve this answer


























                    • How does the performance of this method compare with cur.copy_from?

                      – Michael Goldshteyn
                      Mar 3 '16 at 22:05











                    • Here's a gist with a benchmark. copy_from scales to about 6.5X faster on my machine with 10M records.

                      – Joseph Sheedy
                      Jul 13 '16 at 0:28











                    • Looks nice - I think you have a stray , at the end of your initial definition of insert_query (unless you were trying to make it a tuple?) and are missing a s after the % for %s also in the initial definition of insert_query.

                      – deadcode
                      May 18 '17 at 1:22











                    • @deadcode Fixed, Thanks.

                      – Clodoaldo Neto
                      May 18 '17 at 12:20






                    • 1





                      using execute_values I was able to get my system running at 1k records a minute up to 128k records a minute

                      – Conrad.Dean
                      Nov 15 '18 at 15:17














                    113












                    113








                    113







                    New execute_values method in Psycopg 2.7:



                    data = [(1,'x'), (2,'y')]
                    insert_query = 'insert into t (a, b) values %s'
                    psycopg2.extras.execute_values (
                    cursor, insert_query, data, template=None, page_size=100
                    )


                    The pythonic way of doing it in Psycopg 2.6:



                    data = [(1,'x'), (2,'y')]
                    records_list_template = ','.join(['%s'] * len(data))
                    insert_query = 'insert into t (a, b) values {}'.format(records_list_template)
                    cursor.execute(insert_query, data)


                    Explanation: If the data to be inserted is given as a list of tuples like in



                    data = [(1,'x'), (2,'y')]


                    then it is already in the exact required format as





                    1. the values syntax of the insert clause expects a list of records as in



                      insert into t (a, b) values (1, 'x'),(2, 'y')



                    2. Psycopg adapts a Python tuple to a Postgresql record.



                    The only necessary work is to provide a records list template to be filled by psycopg



                    # We use the data list to be sure of the template length
                    records_list_template = ','.join(['%s'] * len(data))


                    and place it in the insert query



                    insert_query = 'insert into t (a, b) values {}'.format(records_list_template)


                    Printing the insert_query outputs



                    insert into t (a, b) values %s,%s


                    Now to the usual Psycopg arguments substitution



                    cursor.execute(insert_query, data)


                    Or just testing what will be sent to the server



                    print (cursor.mogrify(insert_query, data).decode('utf8'))


                    Output:



                    insert into t (a, b) values (1, 'x'),(2, 'y')





                    share|improve this answer















                    New execute_values method in Psycopg 2.7:



                    data = [(1,'x'), (2,'y')]
                    insert_query = 'insert into t (a, b) values %s'
                    psycopg2.extras.execute_values (
                    cursor, insert_query, data, template=None, page_size=100
                    )


                    The pythonic way of doing it in Psycopg 2.6:



                    data = [(1,'x'), (2,'y')]
                    records_list_template = ','.join(['%s'] * len(data))
                    insert_query = 'insert into t (a, b) values {}'.format(records_list_template)
                    cursor.execute(insert_query, data)


                    Explanation: If the data to be inserted is given as a list of tuples like in



                    data = [(1,'x'), (2,'y')]


                    then it is already in the exact required format as





                    1. the values syntax of the insert clause expects a list of records as in



                      insert into t (a, b) values (1, 'x'),(2, 'y')



                    2. Psycopg adapts a Python tuple to a Postgresql record.



                    The only necessary work is to provide a records list template to be filled by psycopg



                    # We use the data list to be sure of the template length
                    records_list_template = ','.join(['%s'] * len(data))


                    and place it in the insert query



                    insert_query = 'insert into t (a, b) values {}'.format(records_list_template)


                    Printing the insert_query outputs



                    insert into t (a, b) values %s,%s


                    Now to the usual Psycopg arguments substitution



                    cursor.execute(insert_query, data)


                    Or just testing what will be sent to the server



                    print (cursor.mogrify(insert_query, data).decode('utf8'))


                    Output:



                    insert into t (a, b) values (1, 'x'),(2, 'y')






                    share|improve this answer














                    share|improve this answer



                    share|improve this answer








                    edited May 18 '17 at 12:20

























                    answered Jun 22 '15 at 16:50









                    Clodoaldo NetoClodoaldo Neto

                    73.3k14129184




                    73.3k14129184













                    • How does the performance of this method compare with cur.copy_from?

                      – Michael Goldshteyn
                      Mar 3 '16 at 22:05











                    • Here's a gist with a benchmark. copy_from scales to about 6.5X faster on my machine with 10M records.

                      – Joseph Sheedy
                      Jul 13 '16 at 0:28











                    • Looks nice - I think you have a stray , at the end of your initial definition of insert_query (unless you were trying to make it a tuple?) and are missing a s after the % for %s also in the initial definition of insert_query.

                      – deadcode
                      May 18 '17 at 1:22











                    • @deadcode Fixed, Thanks.

                      – Clodoaldo Neto
                      May 18 '17 at 12:20






                    • 1





                      using execute_values I was able to get my system running at 1k records a minute up to 128k records a minute

                      – Conrad.Dean
                      Nov 15 '18 at 15:17



















                    • How does the performance of this method compare with cur.copy_from?

                      – Michael Goldshteyn
                      Mar 3 '16 at 22:05











                    • Here's a gist with a benchmark. copy_from scales to about 6.5X faster on my machine with 10M records.

                      – Joseph Sheedy
                      Jul 13 '16 at 0:28











                    • Looks nice - I think you have a stray , at the end of your initial definition of insert_query (unless you were trying to make it a tuple?) and are missing a s after the % for %s also in the initial definition of insert_query.

                      – deadcode
                      May 18 '17 at 1:22











                    • @deadcode Fixed, Thanks.

                      – Clodoaldo Neto
                      May 18 '17 at 12:20






                    • 1





                      using execute_values I was able to get my system running at 1k records a minute up to 128k records a minute

                      – Conrad.Dean
                      Nov 15 '18 at 15:17

















                    How does the performance of this method compare with cur.copy_from?

                    – Michael Goldshteyn
                    Mar 3 '16 at 22:05





                    How does the performance of this method compare with cur.copy_from?

                    – Michael Goldshteyn
                    Mar 3 '16 at 22:05













                    Here's a gist with a benchmark. copy_from scales to about 6.5X faster on my machine with 10M records.

                    – Joseph Sheedy
                    Jul 13 '16 at 0:28





                    Here's a gist with a benchmark. copy_from scales to about 6.5X faster on my machine with 10M records.

                    – Joseph Sheedy
                    Jul 13 '16 at 0:28













                    Looks nice - I think you have a stray , at the end of your initial definition of insert_query (unless you were trying to make it a tuple?) and are missing a s after the % for %s also in the initial definition of insert_query.

                    – deadcode
                    May 18 '17 at 1:22





                    Looks nice - I think you have a stray , at the end of your initial definition of insert_query (unless you were trying to make it a tuple?) and are missing a s after the % for %s also in the initial definition of insert_query.

                    – deadcode
                    May 18 '17 at 1:22













                    @deadcode Fixed, Thanks.

                    – Clodoaldo Neto
                    May 18 '17 at 12:20





                    @deadcode Fixed, Thanks.

                    – Clodoaldo Neto
                    May 18 '17 at 12:20




                    1




                    1





                    using execute_values I was able to get my system running at 1k records a minute up to 128k records a minute

                    – Conrad.Dean
                    Nov 15 '18 at 15:17





                    using execute_values I was able to get my system running at 1k records a minute up to 128k records a minute

                    – Conrad.Dean
                    Nov 15 '18 at 15:17











38

Update with psycopg2 2.7:

The classic executemany() is about 60 times slower than @ant32's implementation (called "folded"), as explained in this thread: https://www.postgresql.org/message-id/20170130215151.GA7081%40deb76.aryehleib.com

This implementation was added to psycopg2 in version 2.7 and is called execute_values():

    from psycopg2.extras import execute_values
    execute_values(cur,
                   "INSERT INTO test (id, v1, v2) VALUES %s",
                   [(1, 2, 3), (4, 5, 6), (7, 8, 9)])

Previous Answer:

To insert multiple rows, using the multirow VALUES syntax with execute() is about 10x faster than using psycopg2 executemany(). Indeed, executemany() just runs many individual INSERT statements.

@ant32's code works perfectly in Python 2. But in Python 3, cursor.mogrify() returns bytes, cursor.execute() takes either bytes or strings, and ','.join() expects a str instance.

So in Python 3 you may need to modify @ant32's code by adding .decode('utf-8'):

    args_str = ','.join(cur.mogrify("(%s,%s,%s,%s,%s,%s,%s,%s,%s)", x).decode('utf-8') for x in tup)
    cur.execute("INSERT INTO table VALUES " + args_str)

Or by using bytes (with b'' or b"") only:

    args_bytes = b','.join(cur.mogrify("(%s,%s,%s,%s,%s,%s,%s,%s,%s)", x) for x in tup)
    cur.execute(b"INSERT INTO table VALUES " + args_bytes)

edited Nov 24 '18 at 9:12
answered Aug 19 '16 at 8:48 by Antoine Dusséaux
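execute_values() also accepts a page_size argument, which controls how many rows are folded into each statement (psycopg2 defaults it to 100). A minimal, self-contained sketch, assuming a local test database and table (the connection string and table here are made up for illustration):

    import psycopg2
    from psycopg2.extras import execute_values

    conn = psycopg2.connect("dbname=test")  # hypothetical connection
    cur = conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS test (id int, v1 int, v2 int)")

    rows = [(i, 2 * i, 3 * i) for i in range(1000)]

    # Fold up to 500 rows into each multi-row INSERT statement.
    execute_values(cur,
                   "INSERT INTO test (id, v1, v2) VALUES %s",
                   rows,
                   page_size=500)

    conn.commit()
    cur.close()
    conn.close()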























23

A snippet from Psycopg2's tutorial page at Postgresql.org (see bottom):

A last item I would like to show you is how to insert multiple rows using a dictionary. If you had the following:

    namedict = ({"first_name":"Joshua", "last_name":"Drake"},
                {"first_name":"Steven", "last_name":"Foo"},
                {"first_name":"David", "last_name":"Bar"})

You could easily insert all three rows within the dictionary by using:

    cur = conn.cursor()
    cur.executemany("""INSERT INTO bar(first_name,last_name) VALUES (%(first_name)s, %(last_name)s)""", namedict)

It doesn't save much code, but it definitely looks better.

edited Oct 2 '15 at 1:22 by Eric Leschinski
answered Dec 14 '11 at 11:06 by ptrn

• 29

  This will run many individual INSERT statements. Useful, but not the same as a single multi-VALUEd insert.

  – Craig Ringer
  Apr 9 '13 at 2:26
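If you like the dictionary style but want the single multi-VALUES statement Craig Ringer describes, psycopg2 2.7's execute_values() accepts a template with named placeholders when the items are mappings. A sketch, assuming the same bar table as above:

    from psycopg2.extras import execute_values

    namedict = ({"first_name": "Joshua", "last_name": "Drake"},
                {"first_name": "Steven", "last_name": "Foo"},
                {"first_name": "David", "last_name": "Bar"})

    # One INSERT whose multi-row VALUES clause is built from the dicts
    # via the named-placeholder template.
    execute_values(cur,
                   "INSERT INTO bar (first_name, last_name) VALUES %s",
                   namedict,
                   template="(%(first_name)s, %(last_name)s)")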











21

cursor.copy_from is the fastest solution I've found for bulk inserts by far. Here's a gist I made containing a class named IteratorFile, which allows an iterator yielding strings to be read like a file. We can convert each input record to a string using a generator expression. So the solution would be

    args = [(1,2), (3,4), (5,6)]
    f = IteratorFile(("{}\t{}".format(x[0], x[1]) for x in args))
    cursor.copy_from(f, 'table_name', columns=('a', 'b'))

For this trivial size of args it won't make much of a speed difference, but I see big speedups when dealing with thousands of rows. It is also more memory efficient than building a giant query string: an iterator only ever holds one input record in memory at a time, whereas building the whole query string can at some point exhaust memory in your Python process or in Postgres.

edited Sep 30 '16 at 20:25
answered Jun 9 '15 at 1:06 by Joseph Sheedy

• 2

  Here is a benchmark comparing copy_from/IteratorFile with a query builder solution. copy_from scales to about 6.5X faster on my machine with 10M records.

  – Joseph Sheedy
  Sep 30 '16 at 17:49

• 3

  do you have to dick around with escaping strings and timestamps etc?

  – CpILL
  Jun 13 '17 at 9:35

• Yes, you'll have to make sure you have well-formed TSV records.

  – Joseph Sheedy
  Jun 14 '17 at 2:51
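The gist itself isn't reproduced in this thread. A minimal sketch of such a file-like wrapper, written here from scratch (the real gist may differ), together with the COPY text-format escaping the last two comments allude to:

    class IteratorFile(object):
        """File-like wrapper over an iterator of lines, usable with cursor.copy_from."""

        def __init__(self, it):
            self._it = it
            self._buf = ''

        def read(self, size=-1):
            # Pull lines from the iterator until the request can be satisfied.
            while size < 0 or len(self._buf) < size:
                try:
                    self._buf += next(self._it) + '\n'
                except StopIteration:
                    break
            if size < 0:
                data, self._buf = self._buf, ''
            else:
                data, self._buf = self._buf[:size], self._buf[size:]
            return data

        def readline(self):
            # copy_from may also call readline(); serve one line at a time.
            while '\n' not in self._buf:
                try:
                    self._buf += next(self._it) + '\n'
                except StopIteration:
                    break
            idx = self._buf.find('\n')
            if idx < 0:
                line, self._buf = self._buf, ''
            else:
                line, self._buf = self._buf[:idx + 1], self._buf[idx + 1:]
            return line

    def copy_escape(value):
        # Escape one field for COPY's text format: backslash first,
        # then tab and newline; psycopg2's default NULL marker is \N.
        if value is None:
            return r'\N'
        return str(value).replace('\\', '\\\\').replace('\t', '\\t').replace('\n', '\\n')

Each record would then be built by joining copy_escape()-ed fields with tabs before feeding it through the wrapper.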











6

All of these techniques are called "extended inserts" in Postgres terminology, and as of the 24th of November 2016 it's still a ton faster than psycopg2's executemany() and all the other methods listed in this thread (which I tried before coming to this answer).

Here's some code which doesn't use cur.mogrify and is nice and simple to get your head around:

    valueSQL = [ '%s', '%s', '%s', ... ] # as many as you have columns.
    sqlrows = []
    rowsPerInsert = 3 # more means faster, but with diminishing returns..
    for row in getSomeData:
        # row == [1, 'a', 'yolo', ... ]
        sqlrows += row
        if ( len(sqlrows)/len(valueSQL) ) % rowsPerInsert == 0:
            # sqlrows == [ 1, 'a', 'yolo', 2, 'b', 'swag', 3, 'c', 'selfie' ]
            insertSQL = 'INSERT INTO "twitter" VALUES ' + ','.join(['(' + ','.join(valueSQL) + ')']*rowsPerInsert)
            cur.execute(insertSQL, sqlrows)
            con.commit()
            sqlrows = []

    # flush any leftover rows that didn't fill a whole batch
    if sqlrows:
        leftoverRows = len(sqlrows)//len(valueSQL)
        insertSQL = 'INSERT INTO "twitter" VALUES ' + ','.join(['(' + ','.join(valueSQL) + ')']*leftoverRows)
        cur.execute(insertSQL, sqlrows)
        con.commit()

But it should be noted that if you can use copy_from(), you should use copy_from ;)

edited Nov 24 '16 at 1:47
answered May 4 '15 at 15:28 by J.J

• Bringing up from the dead, but what happens in the situation of the last few rows? I assume you actually run that final clause again on the last remaining rows, in case the number of rows isn't a multiple of rowsPerInsert?

  – mcpeterson
  Nov 23 '16 at 23:03

• Correct, sorry, I must have forgotten to do that when I wrote the example - that's pretty stupid of me. Not doing so wouldn't have given people an error, which makes me worried how many people copy/pasted the solution and went about their business... Anyway, very grateful mcpeterson - thank you!

  – J.J
  Nov 24 '16 at 1:49
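A compact variant of the same batching idea, chunking the row stream with itertools so that leftover rows are handled naturally. This is a sketch, not the answer's own code: it assumes three columns as in the answer's example, and reuses the answer's getSomeData, cur, and con names:

    from itertools import islice

    def chunks(iterable, size):
        """Yield successive lists of up to `size` items from any iterable."""
        it = iter(iterable)
        while True:
            chunk = list(islice(it, size))
            if not chunk:
                return
            yield chunk

    for chunk in chunks(getSomeData, 1000):
        # One "(%s,%s,%s)" group per row in this chunk, however many remain.
        placeholders = ','.join(['(%s,%s,%s)'] * len(chunk))
        flat = [v for row in chunk for v in row]  # flatten rows into one argument list
        cur.execute('INSERT INTO "twitter" VALUES ' + placeholders, flat)
        con.commit()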











3

To insert a list of rows into the DB nicely, in batches of a user-given size, with psycopg2:

    def get_batch(iterable, size=100):
        for i in range(0, len(iterable), size):
            yield iterable[i: i + size]

    def insert_rows_batch(table, rows, batch_size=500, target_fields=None):
        """
        A utility method to insert a batch of tuples (rows) into a table.
        NOTE: Handle the data types of the fields in rows yourself, as per
        your table columns' types.

        :param table: Name of the target table
        :type table: str

        :param rows: The rows to insert into the table
        :type rows: iterable of tuples

        :param batch_size: The size of the batch of rows to insert at a time
        :type batch_size: int

        :param target_fields: The names of the columns to fill in the table
        :type target_fields: iterable of strings
        """
        conn = cur = None
        if target_fields:
            target_fields = ", ".join(target_fields)
            target_fields = "({})".format(target_fields)
        else:
            target_fields = ''

        conn = get_conn()  # get connection using psycopg2
        if conn:
            cur = conn.cursor()
            count = 0

            for mini_batch in get_batch(rows, batch_size):
                mini_batch_size = len(mini_batch)
                count += mini_batch_size
                record_template = ','.join(["%s"] * mini_batch_size)
                sql = "INSERT INTO {0} {1} VALUES {2};".format(
                    table,
                    target_fields,
                    record_template)
                cur.execute(sql, mini_batch)
                conn.commit()
                print("Loaded {} rows into {} so far".format(count, table))
            print("Done loading. Loaded a total of {} rows".format(count))
        if cur: cur.close()
        if conn: conn.close()

If you want UPSERT (Insert+Update) as well in postgres with batches: postgres_utilities

edited Jan 7 at 5:42
answered Apr 6 '18 at 11:03 by MANU
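A hypothetical usage sketch (get_conn() is the answer's own helper, not shown; the table and data here are made up):

    rows = [(1, 'alice'), (2, 'bob'), (3, 'carol')]
    insert_rows_batch('users', rows, batch_size=2, target_fields=('id', 'name'))

This works because psycopg2 adapts each Python tuple in the batch to a parenthesized row, so a template of one plain %s per row is enough.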























1

Another nice and efficient approach is to pass the rows for insertion as a single argument: an array of JSON objects.

E.g. you pass the argument:

    [ {"id": 18, "score": 1}, {"id": 19, "score": 5} ]

It is an array, which may contain any number of objects inside. Then your SQL looks like:

    INSERT INTO links (parent_id, child_id, score)
    SELECT 123, (r->>'id')::int, (r->>'score')::int
    FROM json_array_elements(%s::json) AS r

Notice: your Postgres must be new enough to support JSON (9.3+ for json_array_elements).

answered Oct 18 '15 at 11:04 by Dan Key
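A usage sketch with psycopg2, serializing the row list into one JSON text parameter (the links table and the literal parent_id 123 follow the answer's example):

    import json

    rows = [{"id": 18, "score": 1}, {"id": 19, "score": 5}]
    cur.execute(
        """
        INSERT INTO links (parent_id, child_id, score)
        SELECT 123, (r->>'id')::int, (r->>'score')::int
        FROM json_array_elements(%s::json) AS r
        """,
        (json.dumps(rows),),  # the whole row set travels as one parameter
    )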























0

I've been using ant32's answer above for several years. However, I've found that it throws an error in Python 3, because mogrify returns a byte string.

Converting explicitly to byte strings is a simple solution for making the code Python 3 compatible:

    args_str = b','.join(cur.mogrify("(%s,%s,%s,%s,%s,%s,%s,%s,%s)", x) for x in tup)
    cur.execute(b"INSERT INTO table VALUES " + args_str)

answered Jul 27 '17 at 1:29 by jprockbelly























                                                    0














                                                    If you're using SQLAlchemy, you don't need to mess with hand-crafting the string because SQLAlchemy supports generating a multi-row VALUES clause for a single INSERT statement:



                                                    rows = 
                                                    for i, name in enumerate(rawdata):
                                                    row = {
                                                    'id': i,
                                                    'name': name,
                                                    'valid': True,
                                                    }
                                                    rows.append(row)
                                                    if len(rows) > 0: # INSERT fails if no rows
                                                    insert_query = SQLAlchemyModelName.__table__.insert().values(rows)
                                                    session.execute(insert_query)





edited Oct 19 '17 at 2:47; answered Apr 27 '16 at 2:48 by Jeff Widman

• Under the hood SQLAlchemy uses psycopg2's executemany() for calls like this and therefore this answer will have severe performance issues for large queries. See the execute method docs.sqlalchemy.org/en/latest/orm/session_api.html.
  – sage88, Jan 5 '17 at 22:48

• I don't think that's the case. It's been a bit since I looked at this, but IIRC, this is actually building a single insertion statement in the insert_query line. Then, session.execute() is just calling psycopg2's execute() statement with a single massive string. So the "trick" is building the entire insertion statement object first. I am using this to insert 200,000 rows at a time and saw massive performance increases using this code compared to the normal executemany().
  – Jeff Widman, Jan 6 '17 at 1:13

• The SQLAlchemy doc you linked to has a section that shows exactly how this works and even says: "It is essential to note that passing multiple values is NOT the same as using traditional executemany() form". So it is explicitly calling out that this works.
  – Jeff Widman, Jan 6 '17 at 1:23

• I stand corrected. I didn't notice your usage of the values() method (without it SQLAlchemy just does executemany). I would say edit the answer to include a link to that doc so that I can change my vote, but obviously you've already included it. Perhaps mention that this is not the same thing as calling an insert() with execute() with a list of dicts?
  – sage88, Jan 6 '17 at 6:15
                                                    -1














If you want to insert multiple rows within one insert statement (assuming you are not using an ORM), the easiest way so far for me is to use a list of dictionaries. Here is an example:

 t = [{'id': 1, 'start_date': '2015-07-19 00:00:00', 'end_date': '2015-07-20 00:00:00', 'campaignid': 6},
      {'id': 2, 'start_date': '2015-07-19 00:00:00', 'end_date': '2015-07-20 00:00:00', 'campaignid': 7},
      {'id': 3, 'start_date': '2015-07-19 00:00:00', 'end_date': '2015-07-20 00:00:00', 'campaignid': 8}]

 conn.execute("""insert into campaign_dates
                 (id, start_date, end_date, campaignid)
                 values (%(id)s, %(start_date)s, %(end_date)s, %(campaignid)s);""",
              t)

As you can see, only one statement appears in the log:

INFO sqlalchemy.engine.base.Engine insert into campaign_dates (id, start_date, end_date, campaignid) values (%(id)s, %(start_date)s, %(end_date)s, %(campaignid)s);
INFO sqlalchemy.engine.base.Engine [{'campaignid': 6, 'id': 1, 'end_date': '2015-07-20 00:00:00', 'start_date': '2015-07-19 00:00:00'}, {'campaignid': 7, 'id': 2, 'end_date': '2015-07-20 00:00:00', 'start_date': '2015-07-19 00:00:00'}, {'campaignid': 8, 'id': 3, 'end_date': '2015-07-20 00:00:00', 'start_date': '2015-07-19 00:00:00'}]
INFO sqlalchemy.engine.base.Engine COMMIT
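If you like this dict-per-row style but want to avoid the executemany() behavior flagged in the comment below, psycopg2 2.7+ also provides psycopg2.extras.execute_batch(), which accepts the same named placeholders; a sketch, assuming a raw psycopg2 cursor cur and the list t from above:

from psycopg2.extras import execute_batch

# batches many parameter sets into few server round trips
# instead of issuing one statement per row
execute_batch(cur,
              """insert into campaign_dates
                 (id, start_date, end_date, campaignid)
                 values (%(id)s, %(start_date)s, %(end_date)s, %(campaignid)s);""",
              t)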





answered Nov 16 '15 at 15:21 by Alex

• Showing logging from the sqlalchemy engine is NOT a demonstration of only running a single query, it just means that the sqlalchemy engine ran one command. Under the hood this is using psycopg2's executemany, which is very inefficient. See the execute method docs.sqlalchemy.org/en/latest/orm/session_api.html.
  – sage88, Jan 5 '17 at 23:06





                                                    -2














Finally, in SQLAlchemy 1.2, a new implementation was added that uses psycopg2.extras.execute_batch() instead of executemany when you initialize your engine with use_batch_mode=True, like:

engine = create_engine(
    "postgresql+psycopg2://scott:tiger@host/dbname",
    use_batch_mode=True)

http://docs.sqlalchemy.org/en/latest/changelog/migration_12.html#change-4109

Then anyone who has to use SQLAlchemy won't bother trying different combinations of SQLAlchemy, psycopg2, and direct SQL.
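(Note that SQLAlchemy 1.3 deprecated use_batch_mode in favor of the more general executemany_mode flag, if I read the changelog correctly, so on newer versions the equivalent would be something like:)

engine = create_engine(
    "postgresql+psycopg2://scott:tiger@host/dbname",
    executemany_mode="batch")  # or "values" to route through execute_values()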






answered Jan 2 '18 at 2:01 by user2189731
                                                            -3














Using aiopg - the snippet below works perfectly fine:

    # items = [10, 11, 12, 13]
    # gid = 1
    tup = [(gid, pid) for pid in items]
    args_str = ",".join([str(s) for s in tup])
    # insert into group values (1, 10), (1, 11), (1, 12), (1, 13)
    yield from cur.execute("INSERT INTO group VALUES " + args_str)





answered Jul 16 '15 at 10:41 by Nihal Sharma

• This method is not safe from SQL injection. As the psycopg2 documentation states (which aiopg links to): 'Never, never, NEVER use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string. Not even at gunpoint.'
  – Michał Pawłowski, Sep 10 '15 at 16:13


