Saturday, 15 June 2013

SQL Concurrent Upsert (Insert/Update) without duplicates and Thread Safe



This blog entry discusses how to do a concurrent and safe upsert using T-SQ. Depending on whether there is an entry in the table it will either do an insert or an update. In this example there will be the primary key which will be unique. The primary key will be used to determine the uniqueness of the record and decide accordingly to do an insert or update. This also is handy for testing because 2 inserts of the same id will fail.

Here I will discuss how to handle whether to do an update or insert for a record concurrently and thread-safe. As you are well aware there is no upsert in T-SQL. There is however in SQL 2008 and greater a merge function. I will demonstrate that as well and show them both

First for SQL 2005 and earlier versions.

The code below is run on two or more sql query windows to prove that

DECLARE @ev_id as int
DECLARE @ev_description as varchar(50)

DECLARE @UpdateFlag as bit

SET @ev_id = 1999
SET @ev_description = 'THREAD 1'


WHILE 1=1       
BEGIN
      BEGIN TRANSACTION

      SET @UpdateFlag = 0
     
      SET @ev_id = @ev_id + 1
     
      INSERT INTO [dbo].[Event]
         ([ID]
         ,[Description]
         ,[TimeStamp])          
      SELECT @ev_id, @ev_description, getUTCDate()
      WHERE NOT EXISTS (SELECT * FROM [Event] WITH (UPDLOCK, HOLDLOCK) WHERE [ID] = @ev_id)

      IF @@ROWCOUNT > 0
      BEGIN
            SET @UpdateFlag = 1
      END
      ELSE
      BEGIN
            UPDATE [MicroangeloConccurentInsert].[dbo].[Event]
                  SET [ID] = @ev_id
                        ,[Description] = @ev_description
                        ,[TimeStamp] = getUTCDate()
            WHERE ID = @ev_id

--------------------
-----START TO BE DISCUSSED
--------------------
           
IF @@ROWCOUNT = 0
            BEGIN
                  INSERT INTO [dbo].[Event]
                     ([ID]
                     ,[Description]
                     ,[TimeStamp])          
                  SELECT @ev_id, @ev_description, getUTCDate()
                  WHERE NOT EXISTS (SELECT * FROM [Event] WITH (UPDLOCK, HOLDLOCK) WHERE [ID] = @ev_id)
                  IF @@ROWCOUNT = 0
                  BEGIN
                        UPDATE [MicroangeloConccurentInsert].[dbo].[Event]
                              SET [ID] = @ev_id
                                    ,[Description] = @ev_description
                                    ,[TimeStamp] = getUTCDate()
                        WHERE ID = @ev_id
                  END
            END

--------------------
-----END TO BE DISCUSSED
--------------------

      END
     
      COMMIT TRANSACTION
     
      if @ev_id = 20100
      begin
           
            BEGIN TRANSACTION
           
            DELETE FROM [Event] WHERE [ID] BETWEEN 2000 AND 20100
            PRINT 'DELETED ROWS'
            set @ev_id = 1999

            COMMIT TRANSACTION           
      end
           
     
      PRINT @UpdateFlag      
END

To be discussed (What does that mean)

Under normal circumstances the code between the comments “to be discussed” would not be needed. But because I am doing a delete then I need to check whether to upsert again. Ideally this would be in a while loop with while rowcount = 0 try inserting and or updating.

Now for the merge equivalent.

DECLARE @ev_id as int
DECLARE @ev_description as varchar(50)

DECLARE @UpdateFlag as bit

SET @ev_id = 1999
SET @ev_description = 'THREAD 1'


WHILE 1=1       
BEGIN
      BEGIN TRANSACTION

      SET @UpdateFlag = 0
      SET @ev_id = @ev_id + 1

      MERGE [dbo].[Event] as t
            USING ( VALUES (@ev_id) )
                              as s (ev_id)
            ON    (t.[id] = s.ev_id)
            WHEN MATCHED THEN
                  UPDATE SET [Description] = @ev_description,    
                                 [TimeStamp] = getUTCDate()
            WHEN NOT MATCHED THEN
                  INSERT
                     ([ID]
                     ,[Description]
                     ,[TimeStamp])
                  VALUES
                     (@ev_id
                     ,@ev_description
                     ,getUTCDate()) ;
                    
      COMMIT TRANSACTION
     
      if @ev_id = 5100
      begin
           
            BEGIN TRANSACTION
           
            DELETE FROM [Event] WHERE [ID] BETWEEN 2000 AND 20100
            PRINT 'DELETED ROWS'
            set @ev_id = 1999

            COMMIT TRANSACTION           
      end
     
END

As you can see the merge is much simpler. The only problem here is that you don’t know if you are doing an insert or an update. The main reason why I have used the first version is because I needed to insert a different data base on whether I was inserting or updating. Otherwise for 99% of all cases the second version is a better solution

So there you have it a multi threaded concurrent insert or update (upsert) and its thread safe.

Enjoy

1 comment:

  1. I think there is a need to find out and look for some more information about Upsert.These are actually the best operations one can apply to solve some complex IT problems.

    SSIS Upsert

    ReplyDelete