Do you have quite large MDF files on your database? By large, I mean hundreds of gigabytes (or larger). Have you ever noticed that your SQL Server disk stall metrics for these data files are much higher than the storage latency metrics exhibited on the underlying operating system layer? It could be that your SQL Server data files are being hammered too hard and you don’t have enough data files to help the SQL Server storage engine distribute the load. We do this for tempdb, right? Why don’t we do this enough for our user databases as well? It’s easy for a brand-new database from day zero, but what about existing databases that have grown out of control with a single data file attached? Let me show you how to adjust this for existing databases without an outage!

I’ll be taking the SQL Server database generated by the HammerDB benchmark, appropriately named tpcc_test. It’s just a backed up and restored copy of the (also appropriately named) database tpcc_single to represent a single MDF file present. If you’ve never used HammerDB to stress-test a relational database, I have a how-to guide on my site here.

I did a restore onto the F: drive of this server. The MDF size is rather substantial at 420GB. The log size is inconsequential for the purposes of this task.

Take a look at the current file system size versus the internal fill size. Record the internal fill size, as that’s what we’ll be using to help with some math shortly.

Now, how many files do you need? Here’s my standard DBA response – it depends. For the purposes of this test, I’m going to add seven NDF data files to complement the primary MDF. Your mileage may vary based on your workload.

How large do these files need to be? Take your internal fill size on the MDF, divide by the number of datafiles you want total, and round up a smidge to the nearest megabyte. If the database is quite active with a lot of data being added, you might want to add a bit more just to have it balance by the time we’re done with this task.

(Thanks Glenn Berry for the diagnostic queries!)

Now is the time to determine if these should be placed on the same underlying disk or if they should be distributed to more disks. I’m doing that on this server so that I can leverage the disk controller queues to more widely distribute the storage activity and see better overall performance when under pressure. I’ve attached three more virtual disks to this virtual machine, and the four are spread on the four disk controllers on this virtual machine. All disks are formatted with a 64KB NTFS allocation unit size with the large FRS flag enabled, and this test lab is on all-flash storage. Let’s place two of these files on each disk with the fourth holding the last NDF with the MDF.

See the little trick that I did here? Set the maximum file size of each of the new NDF files to be that percentage of the current main file space consumed.

Note: This could also be a good time to think about additional file groups with more data files. You could elect to relocate items such as hot and highly active tables, nonclustered indexes, auditing tables, etc. However, if you’re tied to the database design, such as with a third-party application vended database, you might be stuck with the architecture of a single primary file group. That’s the aspect I’m taking with this test, as the HammerDB database is technically from a third-party piece of software. The file structure on disk changes but very little changes outside of this adjustment. In a perfect world, the primary file group is reserved for metadata and all user data goes into one or more new file groups, but that’s a different topic for discussion, as I know almost all of us work with databases we’ve inherited.

Now here’s the fun part. We’re going to use the little-known command to empty the MDF file. In theory, it’ll drain the file completely, but since we capped the file growth of the NDF files that we attached to limit their growth rate, this command will run until we have filled the NDFs, then fail. It will take a while to perform this operation, but it keeps the database online while it runs. You might have a bit of a performance hit from the CPU and I/O impact of this running process, so be wise in when you chose to run this operation.

You can watch the process work with the internal used space query listed above. You’ll notice the primary MDF is draining while the other NDF files are slowly filling up.

Again, it does take a good amount of time for this process to finish. Give it time – it will eventually complete. When it finally ‘completes’, the EMPTYFILE command will fail because of the maximum file size limit on the NDF files.

FYI – Index optimization routines can cause a circular block with this process that I have encountered twice while running this process, so keep an eye out if you see this occur.

At this point, your internal data consumed space rates are almost identical.

Verify that your autogrow rates are now the same between the data files. Remove the file growth limit on the NDF files.

Perform a one-time shrink of the primary MDF file to shrink it down to where it is the same size as the others.

It will be most likely be very slightly out of balance. SQL Server does a pretty good job with the round robin algorithm for internal file fill rates, and it will finish balancing and then round robin filling internally as you continue to add data through normal usage. If you need to increase the size on some of the files so they match, go for it, and SQL Server will fill accordingly so it finally balances.

You might also check if you need to enable autogrowth for all files at the same rate if you have not already done so. This could mean that the database users might need to be punted from the database to enable the change, so do this during a maintenance window.

Sample your SQL Server disk stall rates and see what different you experience once this goes live.

Also, check if you get an improvement in SQL Server database backup times. If you can write as fast as you read and transmit, you might just get a bump in speed! Note that this is a backup to NUL test, so I’m testing just read performance and not actually taking a real backup. My normal network-based backup target cannot write as fast as it can read from flash disks, so I wouldn’t see any real improvement in backup times in my test lab.

I hope this helps you improve the operational efficiency of your SQL Server database data files, and you get better performance as a result. Let me know how this works for you!