Alternative to UNION

Question

Bobby P 271

I have a UNION query that combines information from a History table and the current Master table. We are talking about a few million rows here. The problem is the amount of [tempdb] space the UNION is allocating and using. So we thought that if we loaded a new staging table, first from the History and then from the Master, that might "dumb" this down a little for the SQL Server optimizer and reduce the use of [tempdb] resources. We loaded the History OK, and it created a few million rows. Now we are trying to load the Master data and it is taking FOREVER; as I type this it has been running for 20 minutes. I think the UNION definitely helped by eliminating dupes.

Is there any alternative here? Like is there any way of getting around this UNION that seems to be hogging soooooo much [tempdb] space??

Please let me know your thoughts.

Thanks in advance for your review; I am hoping for a good, quality answer.

Answer 1

Erland Sommarskog 133.9K MVP Volunteer Moderator

Without knowing anything about the tables or the queries it is impossible to give any precise help, so I can only discuss this in general terms.

The UNION operator will by default perform a DISTINCT operation. Consider this:

DECLARE @a TABLE (a int NOT NULL)
DECLARE @b TABLE (a int NOT NULL)

INSERT @a (a) VALUES(1), (1), (2), (3)
INSERT @b (a) VALUES(3), (4), (5), (5)

SELECT a FROM @a
UNION 
SELECT a FROM @b

SELECT a FROM @a
UNION ALL
SELECT a FROM @b

The first SELECT returns 1, 2, 3, 4, 5. That is the distinct values. On the other hand, the second query returns 1, 1, 2, 3, 3, 4, 5, 5. That is, all rows.

The UNION operation itself typically does not require a lot of tempdb space. But the operation of weeding out the duplicates certainly can require a lot of disk space. Particularly, if the memory grant for the query is too low. (The memory grant is a result of the optimizer's estimates.)
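
As a rough way to check whether the memory grant is the issue, the sketch below lists the grants of currently running requests. It assumes you have VIEW SERVER STATE permission and run it from a second session while the UNION query is executing; the choice of columns is only a suggestion.

```sql
-- Sketch: inspect memory grants of requests that are executing right now.
-- Run from a separate session while the UNION query is running.
SELECT mg.session_id,
       mg.requested_memory_kb,   -- what the query asked for
       mg.granted_memory_kb,     -- what it actually received
       mg.used_memory_kb,        -- what it is using at the moment
       mg.max_used_memory_kb,    -- peak usage so far
       mg.ideal_memory_kb        -- what it would need to keep everything in memory
FROM sys.dm_exec_query_memory_grants AS mg
WHERE mg.session_id <> @@SPID    -- skip this monitoring session
ORDER BY mg.requested_memory_kb DESC;
```

If granted_memory_kb is far below ideal_memory_kb, a spill to tempdb becomes more likely. The actual execution plan should also show spill warnings on the Sort or Hash operators when that happens.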

Introduction of this staging table does not sound like a fruitful approach. What would be a good way to deal with this query is more difficult to say without further knowledge. Possibly, use UNION ALL and have other conditions filtering out lots of rows before applying DISTINCT. But then again, this means that these operators will have to work on more rows, so I am not very optimistic about that approach.
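
A minimal sketch of that last idea, filtering each branch first and only then applying DISTINCT over a UNION ALL. The table variables, the column names and the IsActive filter are hypothetical stand-ins, since the real tables and conditions are not known.

```sql
-- Hypothetical stand-ins for the History and Master tables.
DECLARE @History TABLE (ProviderID int NOT NULL, City varchar(30) NOT NULL, IsActive bit NOT NULL);
DECLARE @Master  TABLE (ProviderID int NOT NULL, City varchar(30) NOT NULL, IsActive bit NOT NULL);

INSERT @History (ProviderID, City, IsActive) VALUES (1, 'Oslo', 1), (2, 'Bergen', 0), (3, 'Narvik', 1);
INSERT @Master  (ProviderID, City, IsActive) VALUES (1, 'Oslo', 1), (3, 'Narvik', 1), (4, 'Bodo', 1);

-- Filter inside each branch so that fewer rows reach the duplicate removal.
SELECT DISTINCT u.ProviderID, u.City
FROM (
    SELECT ProviderID, City FROM @History WHERE IsActive = 1
    UNION ALL
    SELECT ProviderID, City FROM @Master  WHERE IsActive = 1
) AS u;
```

Whether this helps depends entirely on how selective the filters are. Without them, DISTINCT over UNION ALL does the same work as a plain UNION.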

Bobby P 271 Reputation points

2026-04-23T14:15:03.5533333+00:00

I am thinking along the lines of using an INNER JOIN instead. Again...I think the purpose of this UNION and its HAVING clause...

HAVING COUNT(*) > 1

Is to isolate existing Provider Data so then it can see if any of the fields have changed.

I am trying to put a band-aid on a muddy greasy pig to try and address the [tempdb] abuse. A quick fix maybe...perhaps...And then when we get this to where we think it's using less [tempdb] and it's stable, then to Re-Engineer this.

Erland Sommarskog 133.9K Reputation points MVP Volunteer Moderator

2026-04-23T20:42:46.6766667+00:00

Maybe a join works. I can think of many UNION queries where a join makes little sense, but I don't know anything about this one, so I cannot say yea or nay.
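
To make the join idea concrete, here is a minimal sketch of finding existing providers whose fields have changed. It assumes one row per ProviderID in each table, which may well not hold for a real History table. All table and column names are hypothetical, and the EXISTS ... EXCEPT comparison is just one common way to compare columns so that NULLs are treated as equal.

```sql
-- Hypothetical stand-ins for the History and Master tables.
DECLARE @History TABLE (ProviderID int NOT NULL PRIMARY KEY, ProviderName varchar(50) NULL, City varchar(30) NULL);
DECLARE @Master  TABLE (ProviderID int NOT NULL PRIMARY KEY, ProviderName varchar(50) NULL, City varchar(30) NULL);

INSERT @History (ProviderID, ProviderName, City)
VALUES (1, 'Alpha Clinic', 'Oslo'), (2, 'Beta Clinic', 'Bergen');
INSERT @Master (ProviderID, ProviderName, City)
VALUES (1, 'Alpha Clinic', 'Oslo'), (2, 'Beta Clinic', 'Narvik'), (3, 'Gamma Clinic', 'Bodo');

-- Providers present in both tables where at least one compared column differs.
SELECT m.ProviderID, m.ProviderName, m.City
FROM @Master AS m
INNER JOIN @History AS h
        ON h.ProviderID = m.ProviderID
WHERE EXISTS (SELECT m.ProviderName, m.City
              EXCEPT
              SELECT h.ProviderName, h.City);
```

With this sample data only ProviderID 2 comes back: provider 1 is unchanged and provider 3 has no History row. With an index on the join key in both tables, a join like this may avoid the large sort or hash that duplicate removal needs, but that depends on the real schema.
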

Lakshmi Narayana Garikapati 1,310 Reputation points Microsoft External Staff Moderator

2026-05-04T11:42:16.46+00:00

Hi Bobby P,

Following up to confirm whether the earlier responses addressed your query. If not, please share your questions and we'll assist further.

Thanks,

Lakshmi

Answer 2

A UNION itself is not the root cause; tempdb pressure comes from how the underlying queries are executed (large sorts, hashes, spills, temp tables, etc.). Without the actual query, only general UNION behavior can be discussed, but a few concrete options apply:

Use UNION ALL if duplicates do not need to be removed

UNION removes duplicates, which forces a sort or hash operation and can consume a lot of tempdb. If duplicates do not need to be eliminated, replacing UNION with UNION ALL avoids that deduplication step and reduces tempdb usage:

SELECT col1, col2, ...
FROM History
UNION ALL
SELECT col1, col2, ...
FROM Master;

The documentation notes that using ALL keeps duplicates and generally runs faster because it avoids the extra work of removing duplicates.

Push work into a single UNION operation instead of staging

UNION is designed to combine multiple result sets in one operation, and the engine can usually optimize that better than manual staging into a separate table followed by a second insert. A single UNION (or UNION ALL) over the two source tables is usually preferable to:

1. Insert History into a staging table
2. Then insert Master into the same staging table

because each insert can still require tempdb for sorts, indexes, and constraints on the staging table.

Ensure the UNION query is as simple as possible

The UNION operator requires:

- Same number of output columns
- Same order
- Same or compatible data types

Keeping the projection narrow (only needed columns) and avoiding unnecessary expressions reduces row width and tempdb usage.

Investigate the specific code paths that fill tempdb

From the related tempdb guidance, tempdb can be filled by:

- Very large intermediate result sets
- Hash or sort spills
- Spool operators

The recommendation is to identify the specific statements that cause tempdb to fill up (for example, those failing with error 1105) and focus tuning efforts there rather than assuming UNION alone is the problem.
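
One possible starting point for that investigation is the sketch below, which sums the tempdb pages allocated by currently running tasks per session. It assumes VIEW SERVER STATE permission. Internal objects cover sorts, hashes and spools, while user objects cover temp tables and table variables.

```sql
-- Sketch: tempdb allocations by currently running tasks, per session.
-- Page counts are in 8 KB pages, so * 8 / 1024 gives megabytes.
SELECT t.session_id,
       SUM(t.internal_objects_alloc_page_count) * 8 / 1024 AS internal_alloc_mb,
       SUM(t.user_objects_alloc_page_count)     * 8 / 1024 AS user_alloc_mb
FROM sys.dm_db_task_space_usage AS t
INNER JOIN sys.dm_exec_sessions AS s
        ON s.session_id = t.session_id
WHERE s.is_user_process = 1          -- ignore system sessions
GROUP BY t.session_id
ORDER BY internal_alloc_mb DESC;
```

Run it from a second session while the problem query is executing. A large internal_alloc_mb for that session points at sort, hash or spool activity rather than at explicit temp tables.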

If deduplication is required, UNION has to sort or hash the rows, and with millions of rows that work can spill to tempdb when the memory grant is too small. In that case, the main levers are:

- Reducing the number of rows before the UNION (filtering earlier)
- Reducing row width (fewer columns)
- Using UNION ALL plus a separate, more targeted deduplication strategy if possible
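
As an illustration of the last point, here is a minimal sketch that replaces UNION with UNION ALL plus a NOT EXISTS check. It assumes that neither table contains duplicates on its own and that the compared columns are NOT NULL. The table and column names are hypothetical.

```sql
-- Hypothetical stand-ins for the History and Master tables.
DECLARE @History TABLE (ProviderID int NOT NULL PRIMARY KEY, City varchar(30) NOT NULL);
DECLARE @Master  TABLE (ProviderID int NOT NULL PRIMARY KEY, City varchar(30) NOT NULL);

INSERT @History (ProviderID, City) VALUES (1, 'Oslo'), (2, 'Bergen'), (3, 'Tromso');
INSERT @Master  (ProviderID, City) VALUES (2, 'Bergen'), (3, 'Narvik'), (4, 'Bodo');

-- All Master rows, plus only those History rows that have no identical Master row.
SELECT m.ProviderID, m.City
FROM @Master AS m
UNION ALL
SELECT h.ProviderID, h.City
FROM @History AS h
WHERE NOT EXISTS (SELECT 1
                  FROM @Master AS m
                  WHERE m.ProviderID = h.ProviderID
                    AND m.City = h.City);
```

Under those assumptions this returns the same five rows a UNION of the two tables would. The duplicate check becomes a lookup against Master, which an index on the compared columns can support, instead of a sort or hash over the combined rows.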
