How to Break an Existing Single Database File Into Multiple Files

By Lori Brown | Performance Tuning

Jul 26

I occasionally come across some pretty good-sized databases that are set up with a single data file.  We recently worked with a client to break up their single data file into multiple data files so that we could spread them over several different LUNs and take advantage of the improved performance of using the files in parallel.  The concept is much like setting up tempdb with one file per core (up to eight).

Since most people don’t think about using multiple files until a database has grown large enough to be a problem, I think most don’t realize that breaking up a database can be done at any time; you just need enough space for the new files.  Here is a bit of a demo on how to do this.
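If you want to find candidates for this kind of work first, a quick query against sys.master_files can list databases that still have only one data file.  This is just a sketch; the MB conversion assumes the default 8 KB page size.

```sql
-- Sketch: find databases that still have a single data file (candidates for splitting)
SELECT DB_NAME(database_id) AS database_name,
       COUNT(*) AS data_file_count,
       SUM(size) / 128 AS total_data_mb   -- size is reported in 8 KB pages
FROM sys.master_files
WHERE type_desc = 'ROWS'
GROUP BY database_id
HAVING COUNT(*) = 1
ORDER BY total_data_mb DESC;
```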

Create a database with one file and load it with a couple of large tables

USE [master]
GO
CREATE DATABASE [ADMIN]
 ON  PRIMARY 
( NAME = N'ADMIN', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\DATA\ADMIN.mdf' , SIZE = 102400KB , FILEGROWTH = 102400KB )
 LOG ON 
( NAME = N'ADMIN_log', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\DATA\ADMIN_log.ldf' , SIZE = 102400KB , FILEGROWTH = 102400KB )
GO
ALTER DATABASE [ADMIN] MODIFY FILE ( NAME = N'ADMIN', SIZE = 409600KB )
GO
-- ********************************************
-- Use the below code to load random data into tables
-- Change the value in the WHILE @Loop to control how many records are created
-- ********************************************
USE ADMIN
GO

CREATE TABLE [dbo].[RandomLoad1](
	[NumberColumn] [int] NULL,
	[BitColumn] [bit] NULL,
	[VarcharColumn] [varchar](20) NULL,
	[VarcharColumn100] [varchar](100) NULL,
	[CharColumn] [char](1) NULL,
	[DateColumn] [date] NULL
) ON [PRIMARY]

GO
CREATE TABLE [dbo].[RandomLoad2](
	[NumberColumn] [int] NULL,
	[BitColumn] [bit] NULL,
	[VarcharColumn] [varchar](20) NULL,
	[VarcharColumn100] [varchar](100) NULL,
	[CharColumn] [char](1) NULL,
	[DateColumn] [date] NULL
) ON [PRIMARY]

GO
CREATE TABLE [dbo].[RandomLoad3](
	[NumberColumn] [int] NULL,
	[BitColumn] [bit] NULL,
	[VarcharColumn] [varchar](20) NULL,
	[VarcharColumn100] [varchar](100) NULL,
	[CharColumn] [char](1) NULL,
	[DateColumn] [date] NULL
) ON [PRIMARY]

GO
-- This will populate a table with random records
DECLARE @Loop INT
DECLARE @varchar1 VARCHAR(20)
DECLARE @varchar2 VARCHAR(100)
DECLARE @Char1 CHAR(1)
DECLARE @Num1 INT
DECLARE @Bit1 BIT
DECLARE @Date1 DATE
DECLARE @length INT

SET @Loop = 0

WHILE @Loop < 1000000  -- number of records to generate   5000   10000  500000  1000000
BEGIN
    -- Generate Number
    SET @Num1 = ROUND(RAND() * 100000, 0)

    -- Generate Bit
    SET @Bit1 = CRYPT_GEN_RANDOM(1) % 2

    -- Generate Varchar (printable ASCII, up to 20 characters)
    SET @varchar1 = ''
    SET @length = CAST(RAND() * 20 AS INT)
    WHILE @length <> 0
    BEGIN
        SET @varchar1 = @varchar1 + CHAR(CAST(RAND() * 94 + 32 AS INT))
        SET @length = @length - 1
    END

    -- Generate Varchar 100 (printable ASCII, up to 100 characters)
    SET @varchar2 = ''
    SET @length = CAST(RAND() * 100 AS INT)
    WHILE @length <> 0
    BEGIN
        SET @varchar2 = @varchar2 + CHAR(CAST(RAND() * 94 + 32 AS INT))
        SET @length = @length - 1
    END

    -- Generate Char
    SET @Char1 = LEFT(NEWID(), 1)

    -- Generate Date (within a year either side of today)
    SET @Date1 = CAST(GETDATE() + (365 * 2 * RAND() - 365) AS DATE)

    -- Change the table name and rerun this batch to load RandomLoad2 and RandomLoad3
    INSERT INTO [RandomLoad1] VALUES (@Num1, @Bit1, @varchar1, @varchar2, @Char1, @Date1)
    SET @Loop = @Loop + 1
END
GO

 

As you load data, you can seed some known values into the tables at different places among the random rows.  This lets you run some test queries later if you are interested in them.

INSERT INTO RandomLoad1 ([NumberColumn],[BitColumn],[VarcharColumn],[VarcharColumn100],[CharColumn],[DateColumn])
VALUES (1001, 1, 'U2', 'And I still haven''t found what I''m looking for.', 'A', '2017-05-21')
INSERT INTO RandomLoad1 ([NumberColumn],[BitColumn],[VarcharColumn],[VarcharColumn100],[CharColumn],[DateColumn])
VALUES (2001, 0, 'ZZ Top', 'Cause every girl''s crazy bout a sharp dressed man.', 'B', '2015-11-11')
INSERT INTO RandomLoad1 ([NumberColumn],[BitColumn],[VarcharColumn],[VarcharColumn100],[CharColumn],[DateColumn])
VALUES (3001, 1, 'Cyndi Lauper', 'Girls just wanna have fun!', 'C', '2012-08-08')
INSERT INTO RandomLoad1 ([NumberColumn],[BitColumn],[VarcharColumn],[VarcharColumn100],[CharColumn],[DateColumn])
VALUES (4001, 1, 'Def Leppard', 'Rock of ages, still a rollin!', 'D', '2018-10-15')
INSERT INTO RandomLoad2 ([NumberColumn],[BitColumn],[VarcharColumn],[VarcharColumn100],[CharColumn],[DateColumn])
VALUES (5001, 1, 'Flock of Seagulls', 'I just ran.  I ran so far away.', 'E', '1984-02-14')
INSERT INTO RandomLoad2 ([NumberColumn],[BitColumn],[VarcharColumn],[VarcharColumn100],[CharColumn],[DateColumn])
VALUES (6001, 0, 'Styx', 'Domo Arigato Mister Roboto!', 'F', '1999-12-31')
INSERT INTO RandomLoad3 ([NumberColumn],[BitColumn],[VarcharColumn],[VarcharColumn100],[CharColumn],[DateColumn])
VALUES (7001, 1, 'Huey Lewis&the News', 'They say the heart of rock n roll is still beating.', 'G', '2003-06-02')
INSERT INTO RandomLoad3 ([NumberColumn],[BitColumn],[VarcharColumn],[VarcharColumn100],[CharColumn],[DateColumn])
VALUES (8001, 0, 'Twisted Sister', 'We''re not gonna take it anymore!', 'H', '2008-03-05')
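With those seed rows in place, a quick sanity-check query (the values match the seeds above) confirms the data landed where you expect:

```sql
-- Look up one of the seeded rows among the random data
SELECT NumberColumn, VarcharColumn, VarcharColumn100, DateColumn
FROM dbo.RandomLoad1
WHERE NumberColumn = 1001
  AND VarcharColumn = 'U2';
```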

 

Here is how the single database file is filled.
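You can check the fill level yourself with a query like this against sys.database_files.  FILEPROPERTY reports sizes in 8 KB pages, so dividing by 128 gives MB:

```sql
USE [ADMIN]
GO
-- Show size and used space for every file in the current database
SELECT name AS logical_name,
       physical_name,
       size / 128 AS size_mb,
       FILEPROPERTY(name, 'SpaceUsed') / 128 AS used_mb,
       (size - FILEPROPERTY(name, 'SpaceUsed')) / 128 AS free_mb
FROM sys.database_files;
```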

 

Now I want to add more files so that I can break the main .mdf up, spread the files around, and take advantage of the parallel I/O.
Add 3 new files…

-- Add files.  You can either add a new filegroup and new files or simply add files to an existing filegroup
USE [master]
GO
ALTER DATABASE [ADMIN] ADD FILE ( NAME = N'ADMIN2', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\DATA\ADMIN2.ndf' , SIZE = 204800KB , FILEGROWTH = 102400KB) TO FILEGROUP [PRIMARY]
GO
ALTER DATABASE [ADMIN] ADD FILE ( NAME = N'ADMIN3', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\DATA\ADMIN3.ndf' , SIZE = 204800KB , FILEGROWTH = 102400KB) TO FILEGROUP [PRIMARY]
GO
ALTER DATABASE [ADMIN] ADD FILE ( NAME = N'ADMIN4', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\DATA\ADMIN4.ndf' , SIZE = 204800KB , FILEGROWTH = 102400KB) TO FILEGROUP [PRIMARY]
GO

 

Here is how space is used in them before moving data to the new files.

Now we run DBCC SHRINKFILE with the EMPTYFILE option, which automatically starts distributing data to the new files.

--Distribute the primary database .mdf file and empty it
USE [ADMIN]
GO
DBCC SHRINKFILE (N'ADMIN', EMPTYFILE)
GO

 

This will actually run in the background, so if your workload is light you can run it while users are connected.  Otherwise, do this during a maintenance window when use is light or nonexistent.  You can also stop and restart the process.  So, if you can only let this go for 5 hours on one day and have to work on it again another day, you can do that by simply stopping the above query and starting it up again later.
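If you want to watch the shrink from another session, a sketch like this against sys.dm_exec_requests shows progress; the command names filtered on are the ones shrink operations typically report, but verify them on your version:

```sql
-- Watch a running DBCC SHRINKFILE from another session
SELECT session_id,
       command,
       percent_complete,
       estimated_completion_time / 1000 / 60 AS est_minutes_remaining
FROM sys.dm_exec_requests
WHERE command IN ('DbccFilesCompact', 'DbccSpaceReclaim');
```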

 

I ran into an error at the end of the shrinkfile telling me it could not move all contents of the file.  This happens because there is still system information in the first file that cannot be moved, and it is expected behavior.  Here is how the files are filled after the shrinkfile.  You can see that the data has been distributed fairly evenly to the new files.

I decided to shrink the original database file to match the size of the other files so that I would not see any strange usage with one file being larger than the others.  It is important to try to keep them all the same size so that SQL Server’s proportional fill algorithm stripes new data evenly across them.

USE [ADMIN]
GO
DBCC SHRINKFILE (N'ADMIN', 200)
GO

Now, I loaded more random data to see how it would be stored.  You can see that proportional fill is working.
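To see proportional fill in action without a screenshot, you can count rows per file with the undocumented %%physloc%% virtual column and sys.fn_PhysLocCracker.  This is fine for a test database, but don’t rely on undocumented features in production code:

```sql
-- Count how many rows of RandomLoad1 live in each data file
SELECT pl.file_id,
       COUNT(*) AS rows_in_file
FROM dbo.RandomLoad1
CROSS APPLY sys.fn_PhysLocCracker(%%physloc%%) AS pl
GROUP BY pl.file_id
ORDER BY pl.file_id;
```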

I actually performed this type of project a couple of years ago on a live database that was roughly 1 TB.  It took several days and did cause the users to experience some slowness.  So, test things well so that you know what to expect before you break up your database.

If you are wondering how much performance will improve, check out this post from Paul Randal ( Blog | Twitter ).  While a little old, it is still very good:  https://www.sqlskills.com/blogs/paul/benchmarking-do-multiple-data-files-make-a-difference/

For more information about blog posts, concepts and definitions, further explanations, or questions you may have…please contact us at SQLRx@sqlrx.com.  We will be happy to help!  Leave a comment and feel free to track back to us.  We love to talk tech with anyone in our SQL family!


About the Author

Lori is an avid runner, cross fitter and SQL enthusiast. She has been working for SQLRX for 15 years and has been working with SQL in general for 25 years. Yup...she is an old hand at this stuff.
