How to Smoothly Insert Millions of Rows in SQL Server

A few years ago, I needed to insert about 20 million rows into a table. The data I needed to insert was a combination of two tables, and of course, I wanted the process to be smooth and not lock tables for too long. Since the data did not come from files (usually CSV), I could not use the BULK INSERT statement. The main idea of this method is to split the data insertion into small SQL portions. To demonstrate the method, I prepared a simple database. Imagine we have a Library project: the table dbo.Person stores our clients, dbo.Book stores books, and dbo.RentHistory stores books that were rented.

(Diagram: Library database structure)

Our tables contain the following number of rows:

  • dbo.Person – 1 000 000 Rows
  • dbo.Book – 20 Rows
Let’s assume that each library visitor rents each book, and we need to record this in the table dbo.RentHistory. It is easy to calculate that we need to insert 20 million rows.
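A quick sanity check of that arithmetic, assuming both tables are already populated (the CAST to BIGINT is just a precaution in case the product ever outgrows INT):

```sql
-- Expected size of dbo.RentHistory: every person rents every book
SELECT CAST(p.cnt AS BIGINT) * b.cnt AS ExpectedRows  -- 1,000,000 * 20 = 20,000,000
FROM (SELECT COUNT(1) AS cnt FROM dbo.Person) AS p
CROSS JOIN (SELECT COUNT(1) AS cnt FROM dbo.Book) AS b
```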

Direct Way – Not Recommended

Then our SQL statement will be the following:

INSERT INTO [dbo].[RentHistory]
([PersonId]
,[BookId]
,[RentStartDate]
,[RentEndDate])
SELECT
person.Id AS PersonId
,book.Id AS BookId
,GETDATE()
,DATEADD(month, 1, GETDATE())
FROM dbo.Person AS person
CROSS JOIN dbo.Book AS book


But this way, nothing prevents long table locks: the whole 20-million-row insert runs as a single transaction. As I mentioned above, I suggest splitting one SQL statement into multiple smaller ones.
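One way to observe this locking behavior for yourself (a sketch, run from a second session while the big INSERT is executing) is to query the sys.dm_tran_locks DMV; once lock escalation kicks in, the single-statement insert typically holds an exclusive table-level (OBJECT) lock:

```sql
-- Run from another session while the single big INSERT is in flight.
-- An X request_mode on the OBJECT resource means the whole table is locked.
SELECT request_session_id,
       resource_type,
       request_mode,    -- e.g. X after lock escalation
       request_status
FROM sys.dm_tran_locks
WHERE resource_type = 'OBJECT'
  AND resource_database_id = DB_ID()
  AND resource_associated_entity_id = OBJECT_ID('dbo.RentHistory')
```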

Use Small Portions of Data – Recommended

Using the PRINT command, you generate chunks of INSERT statements. Each statement is followed by the GO batch separator, which lets each small batch commit in its own implicit transaction. You can always control the size of the chunk and stop or pause the migration.

DECLARE @personCount INT, @bookCount INT
SELECT @personCount = COUNT(1) FROM dbo.Person
SELECT @bookCount = COUNT(1) FROM dbo.Book
DECLARE @i INT, @pageSize INT, @totalRentRows INT = @personCount * @bookCount
SET @i = 0
SET @pageSize = 100
WHILE @i < @totalRentRows
BEGIN
    -- ROW_NUMBER() starts at 1, so BETWEEN @i + 1 AND @i + @pageSize
    -- yields non-overlapping chunks: 1-100, 101-200, ...
    PRINT '
    INSERT INTO dbo.RentHistory (PersonId, BookId, RentStartDate, RentEndDate)
    SELECT PersonId, BookId, GETDATE(), DATEADD(month, 1, GETDATE())
    FROM (
    SELECT
            ROW_NUMBER() OVER (ORDER BY person.Id, book.Id) AS TempId
            ,person.Id AS PersonId
            ,book.Id AS BookId
    FROM dbo.Person AS person
    CROSS JOIN dbo.Book AS book
    ) SUB1
    WHERE TempId BETWEEN ' + CAST(@i + 1 AS VARCHAR(20)) + ' AND ' + CAST(@i + @pageSize AS VARCHAR(20)) + '
    PRINT ''Page ' + CAST(@i / @pageSize AS VARCHAR(20)) + ' of ' + CAST(@totalRentRows / @pageSize AS VARCHAR(20)) + '''' + '
    GO
    '
    SET @i = @i + @pageSize
END


Here is an example of the first two insert statements:

INSERT INTO dbo.RentHistory (PersonId, BookId, RentStartDate, RentEndDate)
SELECT PersonId, BookId, GETDATE(), DATEADD(month, 1, GETDATE())
FROM (
SELECT
        ROW_NUMBER() OVER (ORDER BY person.Id, book.Id) AS TempId
        ,person.Id AS PersonId
        ,book.Id AS BookId
FROM dbo.Person AS person
CROSS JOIN dbo.Book AS book
) SUB1
WHERE TempId BETWEEN 1 AND 100
PRINT 'Page 0 of 200000'
GO

INSERT INTO dbo.RentHistory (PersonId, BookId, RentStartDate, RentEndDate)
SELECT PersonId, BookId, GETDATE(), DATEADD(month, 1, GETDATE())
FROM (
SELECT
        ROW_NUMBER() OVER (ORDER BY person.Id, book.Id) AS TempId
        ,person.Id AS PersonId
        ,book.Id AS BookId
FROM dbo.Person AS person
CROSS JOIN dbo.Book AS book
) SUB1
WHERE TempId BETWEEN 101 AND 200
PRINT 'Page 1 of 200000'
GO



(Screenshot: generated INSERT statements for inserting millions of rows in MS SQL)

Now you need to copy the generated SQL statements from the ‘Messages’ tab and start your migration. I am pretty sure there are plenty of methods for solving this kind of SQL migration, and this one is worth considering in your daily routine.
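After the migration finishes, a couple of sanity checks (assuming dbo.RentHistory started out empty) can confirm that the chunk boundaries neither skipped nor duplicated any rows:

```sql
-- Total rows should equal 1,000,000 persons * 20 books = 20,000,000
SELECT COUNT_BIG(1) AS TotalRows FROM dbo.RentHistory

-- Overlapping chunk ranges would show up here as duplicate (PersonId, BookId) pairs
SELECT PersonId, BookId, COUNT(1) AS Cnt
FROM dbo.RentHistory
GROUP BY PersonId, BookId
HAVING COUNT(1) > 1  -- expect zero rows
```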

Video Demonstration
