SQL Server Performance Monitoring with DMVs and Extended Events

It was 2:47 AM. A developer pinged me: "SQL Server is crawling. Everything is slow. Nothing changed."

Classic. I RDP'd in, opened Task Manager, saw SQL Server pegged at 94% CPU. Activity Monitor was spinning and not showing me anything useful. I could have spent the next two hours rebooting services and hoping for the best.

Instead, I ran three DMV queries. In four minutes, I had the culprit: a reporting query someone had scheduled to run every 60 seconds, with a missing index causing a full table scan on a 40 million row table. Query was gone. CPU dropped to 12%. I went back to bed.

That is what built-in tooling does when you know how to use it. No third-party monitoring required.

The 10 DMVs Every Production DBA Should Know

Dynamic Management Views are SQL Server's built-in instrumentation layer. They expose real-time data about what the engine is actually doing. Here are the ones I use weekly:

1. sys.dm_exec_requests — What Is Running Right Now

SELECT
    r.session_id,
    r.status,
    r.blocking_session_id,
    r.wait_type,
    r.wait_time / 1000.0 AS wait_seconds,
    r.cpu_time / 1000.0 AS cpu_seconds,
    r.total_elapsed_time / 1000.0 AS elapsed_seconds,
    r.logical_reads,
    SUBSTRING(st.text, (r.statement_start_offset/2)+1,
        ((CASE r.statement_end_offset WHEN -1 THEN DATALENGTH(st.text)
          ELSE r.statement_end_offset END - r.statement_start_offset)/2)+1) AS statement_text
FROM sys.dm_exec_requests r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) st
WHERE r.session_id > 50
ORDER BY r.cpu_time DESC;

This is your first stop when production is hurting. Shows active queries, what they are waiting on, and how long they have been running.

2. sys.dm_exec_query_stats — Historical Query Performance

SELECT TOP 20
    qs.execution_count,
    qs.total_worker_time / qs.execution_count AS avg_cpu_microseconds,
    qs.total_logical_reads / qs.execution_count AS avg_logical_reads,
    qs.total_elapsed_time / qs.execution_count AS avg_elapsed_microseconds,
    SUBSTRING(st.text, (qs.statement_start_offset/2)+1,
        ((CASE qs.statement_end_offset WHEN -1 THEN DATALENGTH(st.text)
          ELSE qs.statement_end_offset END - qs.statement_start_offset)/2)+1) AS query_text
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
ORDER BY qs.total_worker_time DESC;

Shows queries that have been the worst offenders since the last plan cache clear.

3. sys.dm_os_wait_stats — What SQL Server Is Actually Waiting On

SELECT TOP 20
    wait_type,
    waiting_tasks_count,
    wait_time_ms / 1000.0 AS wait_time_seconds,
    max_wait_time_ms / 1000.0 AS max_wait_seconds,
    signal_wait_time_ms / 1000.0 AS signal_wait_seconds,
    (wait_time_ms - signal_wait_time_ms) / 1000.0 AS resource_wait_seconds
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN (
    'SLEEP_TASK', 'BROKER_TO_FLUSH', 'BROKER_TASK_STOP', 'CLR_AUTO_EVENT',
    'REQUEST_FOR_DEADLOCK_SEARCH', 'RESOURCE_QUEUE', 'SERVER_IDLE_CHECK',
    'SLEEP_DBSTARTUP', 'SLEEP_DBRECOVER', 'SLEEP_MASTERDBREADY',
    'SLEEP_MASTERMDREADY', 'SLEEP_MASTERUPGRADED', 'SLEEP_MSDBSTARTUP',
    'SLEEP_TEMPDBSTARTUP', 'SNI_HTTP_ACCEPT', 'SP_SERVER_DIAGNOSTICS_SLEEP',
    'SQLTRACE_BUFFER_FLUSH', 'WAITFOR', 'XE_DISPATCHER_WAIT', 'XE_TIMER_EVENT'
)
ORDER BY wait_time_ms DESC;

Wait stats are cumulative since SQL Server started. More on how to interpret these below — it is where most performance investigations actually start.

4 through 10: The Rest of the Core Set

sys.dm_exec_sessions — all connected sessions, login time, last request time, memory usage
sys.dm_os_memory_clerks — memory consumers by component; find memory pressure fast
sys.dm_db_index_usage_stats — which indexes are being used (seeks/scans/lookups) vs. ignored
sys.dm_db_missing_index_details — indexes SQL Server wishes existed
sys.dm_io_virtual_file_stats — I/O per database file; stall time tells you if storage is the bottleneck
sys.dm_tran_locks — active lock holders and waiters; pair with dm_exec_requests to trace blocking chains
sys.dm_os_ring_buffers — low-level memory and scheduler diagnostics; use when everything else looks clean and you are still stuck

Wait Stats Analysis: Reading What SQL Server Is Telling You

Free · 2 Minutes

How healthy is your database, really?

Get your free database health score — spot risks before they become incidents.

Get my health score

Wait stats are the fastest way to understand a server's health at a high level. The top wait types tell you exactly where time is being lost:

CXPACKET / CXCONSUMER — parallelism contention. Look at cost threshold for parallelism and max degree of parallelism settings.
LCK_M_* — locking waits. Go to dm_exec_requests filtered to blocking_session_id > 0.
PAGEIOLATCH_SH / PAGEIOLATCH_EX — data pages being read from disk. Either missing indexes causing full scans, or storage is too slow.
WRITELOG — transaction log writes stalling. Log disk I/O is the bottleneck. Separate log files to faster storage.
RESOURCE_SEMAPHORE — queries waiting for memory grants. Sort spills to disk. Index or rewrite the query.
SOS_SCHEDULER_YIELD — CPU pressure; queries yielding the scheduler.

Snapshot dm_os_wait_stats before and after a load period, subtract the delta, and sort by wait time. That delta tells you exactly what the server was struggling with during that window.

Want a quick read on where your database stands right now? The 2AM Database Test is a 2-minute self-assessment that surfaces your biggest risk areas.

Extended Events vs. SQL Trace and Profiler

If you are still running SQL Profiler against a production server, stop. It is deprecated, it is heavyweight, and it adds CPU overhead that nobody accounts for in the performance budget.

Extended Events (XE) replaced Trace as the production-safe way to capture query-level telemetry. The system_health session runs by default on every SQL Server instance — it captures deadlocks, error events, and wait info without any configuration from you.

Here is a lightweight XE session for capturing slow queries in production:

-- Capture queries exceeding 2 seconds
CREATE EVENT SESSION [SlowQueryCapture] ON SERVER
ADD EVENT sqlserver.sql_statement_completed (
    ACTION (sqlserver.sql_text, sqlserver.database_name, sqlserver.username)
    WHERE duration > 2000000
),
ADD EVENT sqlserver.rpc_completed (
    ACTION (sqlserver.sql_text, sqlserver.database_name)
    WHERE duration > 2000000
)
ADD TARGET package0.ring_buffer (SET max_memory = 51200),
ADD TARGET package0.event_file (SET filename = N'C:\XELogsSlowQueries.xel', max_file_size = 100);

ALTER EVENT SESSION [SlowQueryCapture] ON SERVER STATE = START;

Building a Lightweight Monitoring Dashboard

You do not need a third-party tool for monitoring history. A SQL Agent job that snapshots key DMV data into a local table every 5 minutes gives you enough history to spot trends:

CREATE TABLE dba_monitoring.wait_stats_snapshots (
    snapshot_time  DATETIME2       DEFAULT GETUTCDATE(),
    wait_type      NVARCHAR(60),
    waiting_tasks  BIGINT,
    wait_time_ms   BIGINT,
    signal_wait_ms BIGINT
);

INSERT INTO dba_monitoring.wait_stats_snapshots
    (wait_type, waiting_tasks, wait_time_ms, signal_wait_ms)
SELECT wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN ('SLEEP_TASK', 'BROKER_TO_FLUSH', 'REQUEST_FOR_DEADLOCK_SEARCH', 'RESOURCE_QUEUE', 'WAITFOR');

Query deltas between snapshots to see wait patterns over time. This is a poor man's APM — one that understands SQL Server internals better than most commercial tools and costs exactly nothing.

Common Mistakes

Over-relying on Activity Monitor. It is a one-second snapshot of a moving system. Use DMVs — point-in-time data, repeatable, scriptable.

Ignoring wait stats. Most performance investigations start at the wrong level: jumping straight to individual queries before understanding the server's overall wait profile. Wait stats first, then drill down. You will cut your diagnostic time in half.

Not baselining. A wait stat number means nothing without context. Capture dm_os_wait_stats during a normal business day. That is your baseline. Everything after is measured against it.

Running Profiler in production. Already covered. Stop it.

The tools above handle 90% of the production incidents I get called into. CPU spike, blocking chain, I/O bottleneck, memory pressure — DMVs and XE surface all of it. No agent software, no licensing, no 3am calls to a vendor support line.

If you want to know where your database actually stands right now, the 2AM Database Test is a 2-minute self-assessment that tells you exactly what to tackle first. Free, no commitment, no sales call.

Already past the diagnostic phase and need hands-on help? Database Performance Tuning — fixed scope, no retainer, problems solved.

SQL Server Query Optimization SQL Server Index Tuning SQL Server Deadlock Troubleshooting Always On AGs Configuration

SQL Server Performance Monitoring: DMVs and Extended Events Every Production DBA Needs

The 10 DMVs Every Production DBA Should Know

1. sys.dm_exec_requests — What Is Running Right Now

2. sys.dm_exec_query_stats — Historical Query Performance

3. sys.dm_os_wait_stats — What SQL Server Is Actually Waiting On

4 through 10: The Rest of the Core Set

Wait Stats Analysis: Reading What SQL Server Is Telling You

Extended Events vs. SQL Trace and Profiler

Building a Lightweight Monitoring Dashboard

Common Mistakes

Related Articles

Get your free database health score

The 10 DMVs Every Production DBA Should Know

1. sys.dm_exec_requests — What Is Running Right Now

2. sys.dm_exec_query_stats — Historical Query Performance

3. sys.dm_os_wait_stats — What SQL Server Is Actually Waiting On

4 through 10: The Rest of the Core Set

Wait Stats Analysis: Reading What SQL Server Is Telling You

Extended Events vs. SQL Trace and Profiler

Building a Lightweight Monitoring Dashboard

Common Mistakes

Related Articles

Get your free database health score

Related Posts