SQL Server Deadlock Troubleshooting: How to Find, Read, and Fix Deadlocks in Production

You know the pattern. The application starts throwing errors. Users call the help desk. Someone checks the logs and sees the dreaded message: Transaction (Process ID X) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Your ticketing system is down. The business is screaming. And you have no idea which two queries are fighting each other.

I have been there. After 20+ years on SQL Server production systems, I can tell you that deadlocks are not mysterious. They follow predictable patterns, and once you can read a deadlock graph, you can fix them — and prevent the next one.

What a Deadlock Actually Is

A deadlock happens when two (or more) processes each hold a lock that the other process needs. Process A holds a lock on Table 1 and wants Table 2. Process B holds a lock on Table 2 and wants Table 1. Neither can proceed. SQL Server's deadlock monitor detects the cycle, picks one process as the victim, rolls it back, and lets the other continue.

The victim selection is not random — SQL Server picks the transaction that costs the least to roll back. The application sees a 1205 error. If you are not handling that error and retrying, the user sees a failure.

Deadlocks are a locking problem, not a bug in your application logic. The fix lives in your query design, indexes, and transaction scope.

Capturing Deadlock Graphs with Extended Events

Free · 2 Minutes

How healthy is your database, really?

Get your free database health score — spot risks before they become incidents.

Get my health score

Search results are full of advice to enable Trace Flag 1222 or 1204. That is legacy advice. The right tool is the system_health Extended Event session, which SQL Server has been running automatically since 2008 R2. It captures every deadlock graph without any configuration.

SELECT
    xdr.value('(event/@timestamp)[1]', 'datetime2') AS deadlock_time,
    xdr.query('.') AS deadlock_graph
FROM (
    SELECT CAST(target_data AS XML) AS target_data
    FROM sys.dm_xe_session_targets t
    JOIN sys.dm_xe_sessions s ON t.event_session_address = s.address
    WHERE s.name = 'system_health'
      AND t.target_name = 'ring_buffer'
) AS data
CROSS APPLY target_data.nodes('//RingBufferTarget/event[@name="xml_deadlock_report"]') AS XEventData(xdr)
ORDER BY deadlock_time DESC;

If you need persistent capture, create a dedicated session that writes to file targets. But for immediate troubleshooting, the ring buffer query above gets you the last ~100 deadlock graphs instantly.

Reading the Deadlock XML

When you click on a deadlock graph in SSMS, it renders as a visual diagram with processes (ovals) and resources (rectangles). Arrows show who owns what and who is waiting for what. The process marked with the X is the victim.

In the raw XML, the key elements are:

<process> — each participating transaction. Contains the SPID, the SQL text (in <inputbuf>), login name, and lock mode being requested.
<resource-list> — what each process is waiting on. Includes the object name, index name, and lock mode.
<owner-list> — what each process currently holds. Compare owner-list vs. waiter-list across processes to see the cycle.

The lock modes matter. S (shared) and X (exclusive) are the most common. A read operation holds S locks; a write holds X. X is incompatible with everything. That is why a reader and a writer can deadlock.

The 5 Most Common Deadlock Patterns

1. Opposite-order updates. Process A updates Table1 then Table2. Process B updates Table2 then Table1. This is the textbook deadlock. Fix: standardize update order across all transactions touching the same tables.

2. Key lookup deadlocks. A nonclustered index covers the WHERE clause but not the SELECT columns, forcing a key lookup to the clustered index. Concurrent writers on the clustered index create a cycle. Fix: add INCLUDE columns to the nonclustered index to eliminate the lookup.

3. Cascading FK deadlocks. A delete on a parent table triggers shared lock escalation down to child tables via foreign keys, while another transaction is inserting into those child tables. Fix: create indexes on FK columns in child tables.

4. Page split deadlocks. A high-insert workload on a fragmented index causes page splits. The split operation acquires locks on adjacent pages, intersecting with reader locks. Fix: set a fill factor appropriate to your insert pattern (80-90% is common).

5. Parallelism deadlocks. A single query running in parallel can deadlock with itself when parallel threads compete for exchange operator locks. Fix: reduce MAXDOP on the problem query, or set a cost threshold for parallelism.

War Story: The Nightly Batch That Deadlocked 200+ Times an Hour

A client came to us with a nightly batch job that processed payment records. By midnight it was generating 200+ deadlocks per hour and taking 6 hours to complete instead of 45 minutes.

The deadlock graphs told the story immediately. The batch was doing full table scans on a 12-million-row table because a composite index was missing a column from the WHERE clause. Without index coverage, SQL Server was escalating from row locks to page locks to table locks. The fix was a single index. Batch runtime dropped from 6 hours to 38 minutes. Deadlocks dropped to zero.

The lesson: most deadlocks have a missing index at their root. The index is not just a performance problem — it is a locking problem.

Prevention Strategies

Index coverage. Missing indexes are the most common root cause. Run the missing index DMVs and look for high-impact suggestions on your most-deadlocked tables.

Shorten transaction scope. Every lock is held for the duration of the transaction. Keep transactions tight — acquire locks late, release early.

Read Committed Snapshot Isolation (RCSI). RCSI changes the read behavior so readers use row versioning from tempdb instead of acquiring shared locks. Enable it at the database level:

ALTER DATABASE YourDatabase SET READ_COMMITTED_SNAPSHOT ON;

Note: RCSI increases tempdb usage. Size tempdb appropriately before enabling in production.

Lock hints as a last resort. NOLOCK reads dirty data. Use it only when dirty reads are explicitly acceptable. It is not a general-purpose deadlock fix.

Common Mistakes

Using NOLOCK everywhere. Trading deadlocks for dirty reads and phantom data. The application now returns incorrect results silently.

Retrying without understanding the pattern. Application-level retry on 1205 errors is correct. But retry without fixing the root cause just moves the pain.

Terminating SPIDs instead of fixing queries. Stopping the blocked process releases locks temporarily. Five minutes later the same deadlock pattern fires again.

Next Steps

If you have an active deadlock problem, start with the Extended Events query above against your production system. Pull the last 20 deadlock graphs, read the XML, and identify the pattern. Chances are you will see the same two queries appearing repeatedly. That is your fix target.

If you want expert eyes on your deadlock graphs and a systematic plan to eliminate them, our database performance tuning service includes a full deadlock analysis as part of the initial engagement.

Not sure where to start? Take our free SQL Server assessment — it gives you a prioritized list of your biggest performance and reliability risks, with deadlock exposure as a first-class signal.

Blocking and Deadlock Diagnosis SQL Server Index Tuning Performance Monitoring with DMVs