In this article I’ll write nothing new regarding monitoring, but rather how to use what SQL Server already provide us and make it as light as possible; when working in “high OLTP workload” environments monitoring should be almost “invisible”, and by “high OLTP workload” I mean a server running at 50000/60000 Batches/sec
with the load made almost by stored procedures in the average CPU time from few ms to few tens of ms. Just for an example, the image below shows 10 minutes sampling taken at about 1/4 maximum load of one customer, with the first column showing the CPU time of every stored procedure over the total SQL Server CPU usage; so, if for example SQL was at 40% the first sp was responsible for the 19.8% of that load, and so on..
The above report is obtained from a table populated every minute with a simple query over the sys.dm_exec_procedure_stats
SELECT dateadd(mi, datediff(mi, 0, getdate()), 0), sql_handle, plan_handle, total_elapsed_time, total_worker_time, total_logical_reads, total_logical_writes, execution_count FROM sys.dm_exec_procedure_stats
In addition to this, on our servers we usually capture every minute a snapshot over the sys.dm_exec_requests to get a photo of the running statements, and use it also in realtime when there is something that “smells” strange ; for this purpose we initially used the famous sp_WhoIsActive, very powerfull, but sometimes, when there were several process running or some blocking conditions, it happened that it was lasting too much or, never ending; sometimes it caused also an excessive latch contention on the tempdb and definitively slowed down the instance.
After digging a bit into the issue it came out to be the high usage of string manipulation functions which caused heavy usage of the TempDb and CPU time, that’s why the idea of something “lighter” and with only the functionality needed for our purpose.
Basycally we use a statement like this
SELECT s.host_name, r.session_id as 'session', r.blocking_session_id as [blocked by], CAST (SUBSTRING(st.text, (r.statement_start_offset/2)+1, ((CASE r.statement_end_offset WHEN -1 THEN DATALENGTH(st.text) ELSE r.statement_end_offset END - r.statement_start_offset)/2) + 1) as text) AS statement_text, ISNULL(cast(OBJECT_SCHEMA_NAME(st.objectid, st.dbid) + '.' + OBJECT_NAME(st.objectid, st.dbid) as varchar(100)), 'AdHoc Statement') as object_name, s.login_name, CASE WHEN LEFT(s.program_name, 8) = 'SQLAgent' then (SELECT 'SQLAgent Job: ' + b.name from msdb.dbo.sysjobs b WHERE (SUBSTRING(MASTER.dbo.FN_VARBINTOHEXSTR(CONVERT(VARBINARY(16), b.JOB_ID)),1,10)) = SUBSTRING(s.PROGRAM_NAME,30,10)) ELSE s.program_name END AS [program_name], r.start_time as request_start_time,r.last_wait_type,r.wait_time, r.cpu_time, r.total_elapsed_time, r.logical_reads, r.reads as physical_reads, r.status as request_status FROM sys.dm_exec_sessions s left join sys.dm_exec_requests r on s.session_id = r.session_id CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS st WHERE r.session_id > 50 and r.session_id @@SPID AND last_wait_type 'SP_SERVER_DIAGNOSTICS_SLEEP' ORDER BY cpu_time DESC
for cehcking in realtime what is currently running, with an added
OUTER APPLY sys.dm_exec_text_query_plan( r.plan_handle, r.statement_start_offset, r.statement_end_offset) tp
when we want to take a look at the Execution Plan ( CAST (tp.query_plan as xml) AS plan_xml in the fields list).
Second, we check for blocking processes with something like this
SELECT s.session_id, r.blocking_session_id AS [blocked by], CAST(SUBSTRING(st.text, (r.statement_start_offset / 2) + 1, ((CASE r.statement_end_offset WHEN - 1 THEN DATALENGTH(st.text) ELSE r.statement_end_offset END - r.statement_start_offset) / 2) + 1) AS text) AS statement_text, ISNULL(cast(OBJECT_SCHEMA_NAME(st.objectid, st.dbid) + '.' + OBJECT_NAME(st.objectid, st.dbid) AS varchar(100)), 'AdHoc Statement') AS object_name, s.login_name, CASE WHEN LEFT(s.program_name, 8) = 'SQLAgent' then (SELECT 'SQLAgent Job: ' + b.name from msdb.dbo.sysjobs b WHERE (SUBSTRING(MASTER.dbo.FN_VARBINTOHEXSTR(CONVERT(VARBINARY(16), b.JOB_ID)),1,10)) = SUBSTRING(s.PROGRAM_NAME,30,10)) ELSE s.program_name END AS [program_name], r.start_time AS request_start_time, r.last_wait_type, r.cpu_time, r.total_elapsed_time, r.logical_reads, r.reads AS physical_reads, r.status AS request_status, r.wait_time, r.command, master.dbo.fn_varbintohexstr(r.plan_handle) AS plan_handle FROM sys.dm_exec_sessions s LEFT JOIN sys.dm_exec_requests r ON r.session_id = s.session_id OUTER APPLY sys.dm_exec_sql_text(r.sql_handle) AS st Where S.Session_ID In ( Select Blocking_Session_ID From Sys.DM_Exec_Requests (READUNCOMMITTED) Where Blocking_Session_ID 0 )
Finally, becasue TempDb is a precious resource we introduced a check for long running transactions which can block the checkpoint process and cause a huge increase of the TLog size and, why not, exhaust the disk space. Just remember that on the TempDb the checkpoint happens when the TLog is at 70% usage and any open (long) transaction delay the process, and could start an “avalanche effect”; several times it does not depend on how much data You write in the TempDb but rather how much that transaction last
This script can also be changed for checking a different database name
SELECT ISNULL(cast(tst.session_id as varchar),'Internal') as SessionID, tdt.database_transaction_log_bytes_reserved/1024 AS [TlogKB], tat.transaction_id AS [Transacton ID], tat.name AS [TRANSACTION Name], tat.transaction_begin_time AS [TRANSACTION BEGIN TIME], DATEDIFF(mi, tat.transaction_begin_time, GETDATE()) AS [Elapsed TIME (in MIN)], CASE tat.transaction_type WHEN 1 THEN 'Read/write' WHEN 2 THEN 'Read-only' WHEN 3 THEN 'System' WHEN 4 THEN 'Distributed' END AS [TRANSACTION Type], CASE tat.transaction_state WHEN 0 THEN 'The transaction has not been completely initialized yet.' WHEN 1 THEN 'The transaction has been initialized but has not started.' WHEN 2 THEN 'The transaction is active.' WHEN 3 THEN 'The transaction has ended. This is used for read-only transactions.' WHEN 4 THEN 'The commit process has been initiated on the distributed transaction. This is for distributed transactions only. The distributed transaction is still active but further processing cannot take place.' WHEN 5 THEN 'The transaction is in a prepared state and waiting resolution.' WHEN 6 THEN 'The transaction has been committed.' WHEN 7 THEN 'The transaction is being rolled back.' WHEN 8 THEN 'The transaction has been rolled back.' END AS [TRANSACTION Description], CAST (SUBSTRING(t.text, (r.statement_start_offset/2)+1, ((CASE r.statement_end_offset WHEN -1 THEN DATALENGTH(t.text) ELSE r.statement_end_offset END - r.statement_start_offset)/2) + 1) as text) AS statement_text FROM sys.dm_tran_active_transactions tat INNER JOIN sys.dm_tran_database_transactions tdt on tat.transaction_id=tdt.transaction_id LEFT JOIN sys.dm_tran_session_transactions AS tst ON tdt.transaction_id = tst.transaction_id LEFT OUTER JOIN sys.dm_exec_requests AS r ON tst.session_id = r.session_id OUTER APPLY sys.dm_exec_sql_text(r.plan_handle) AS t WHERE tdt.database_id=DB_ID('tempdb') and tdt.database_transaction_log_bytes_reserved > 0 ORDER BY 2 DESC
Putting it all together I wrote a sp with 3 input parameters and named it sp_WhatIsRunning
USE master GO CREATE PROC [dbo].[sp_WhatIsRunning] ( @ShowPlan Bit = 0, @ShowLock Bit = 0, @ShowTempLog Bit = 0 ) AS BEGIN SET NOCOUNT ON SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED IF (@ShowPlan = 0) BEGIN SELECT s.host_name, r.session_id as 'session', r.blocking_session_id as [blocked by], CAST (SUBSTRING(st.text, (r.statement_start_offset/2)+1, ((CASE r.statement_end_offset WHEN -1 THEN DATALENGTH(st.text) ELSE r.statement_end_offset END - r.statement_start_offset)/2) + 1) as text) AS statement_text, ISNULL(cast(OBJECT_SCHEMA_NAME(st.objectid, st.dbid) + '.' + OBJECT_NAME(st.objectid, st.dbid) as varchar(100)), 'AdHoc Statement') as object_name, s.login_name, CASE WHEN LEFT(s.program_name, 8) = 'SQLAgent' then (SELECT 'SQLAgent Job: ' + b.name from msdb.dbo.sysjobs b WHERE (SUBSTRING(MASTER.dbo.FN_VARBINTOHEXSTR(CONVERT(VARBINARY(16), b.JOB_ID)),1,10)) = SUBSTRING(s.PROGRAM_NAME,30,10)) ELSE s.program_name END AS [program_name], r.start_time as request_start_time,r.last_wait_type,r.wait_time, r.cpu_time, r.total_elapsed_time, r.logical_reads, r.reads as physical_reads, r.status as request_status FROM sys.dm_exec_sessions s left JOIN sys.dm_exec_requests r ON s.session_id = r.session_id CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS st WHERE r.session_id > 50 AND r.session_id @@SPID AND last_wait_type 'SP_SERVER_DIAGNOSTICS_SLEEP' ORDER BY cpu_time DESC END ELSE BEGIN SELECT s.host_name, r.session_id as 'session', r.blocking_session_id as [blocked by], CAST (SUBSTRING(st.text, (r.statement_start_offset/2)+1, ((CASE r.statement_end_offset WHEN -1 THEN DATALENGTH(st.text) ELSE r.statement_end_offset END - r.statement_start_offset)/2) + 1) as text) AS statement_text, ISNULL(cast(OBJECT_SCHEMA_NAME(st.objectid, st.dbid) + '.' + OBJECT_NAME(st.objectid, st.dbid) as varchar(100)), 'AdHoc Statement') as object_name, s.login_name, CASE WHEN LEFT(s.program_name, 8) = 'SQLAgent' then (SELECT 'SQLAgent Job: ' + b.name from msdb.dbo.sysjobs b WHERE (SUBSTRING(MASTER.dbo.FN_VARBINTOHEXSTR(CONVERT(VARBINARY(16), b.JOB_ID)),1,10)) = SUBSTRING(s.PROGRAM_NAME,30,10)) ELSE s.program_name END AS [program_name], r.start_time as request_start_time,r.last_wait_type,r.wait_time, r.cpu_time, r.total_elapsed_time, r.logical_reads, r.reads as physical_reads, r.status as request_status, CAST (tp.query_plan as xml) AS plan_xml from sys.dm_exec_sessions s left join sys.dm_exec_requests r on s.session_id = r.session_id CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS st OUTER APPLY sys.dm_exec_text_query_plan( r.plan_handle, r.statement_start_offset, r.statement_end_offset) tp WHERE r.session_id > 50 and r.session_id @@SPID AND last_wait_type 'SP_SERVER_DIAGNOSTICS_SLEEP' ORDER BY cpu_time DESC END IF (@ShowLock=1) BEGIN SELECT s.session_id, r.blocking_session_id AS [blocked by], CAST(SUBSTRING(st.text, (r.statement_start_offset / 2) + 1, ((CASE r.statement_end_offset WHEN - 1 THEN DATALENGTH(st.text) ELSE r.statement_end_offset END - r.statement_start_offset) / 2) + 1) AS text) AS statement_text, ISNULL(cast(OBJECT_SCHEMA_NAME(st.objectid, st.dbid) + '.' + OBJECT_NAME(st.objectid, st.dbid) AS varchar(100)), 'AdHoc Statement') AS object_name, s.login_name, CASE WHEN LEFT(s.program_name, 8) = 'SQLAgent' then (SELECT 'SQLAgent Job: ' + b.name from msdb.dbo.sysjobs b WHERE (SUBSTRING(MASTER.dbo.FN_VARBINTOHEXSTR(CONVERT(VARBINARY(16), b.JOB_ID)),1,10)) = SUBSTRING(s.PROGRAM_NAME,30,10)) ELSE s.program_name END AS [program_name] , r.start_time AS request_start_time, r.last_wait_type, r.cpu_time, r.total_elapsed_time, r.logical_reads, r.reads AS physical_reads, r.status AS request_status, r.wait_time, r.command, master.dbo.fn_varbintohexstr(r.plan_handle) AS plan_handle FROM sys.dm_exec_sessions s LEFT JOIN sys.dm_exec_requests r ON r.session_id = s.session_id OUTER APPLY sys.dm_exec_sql_text(r.sql_handle) AS st Where S.Session_ID In ( SELECT Blocking_Session_ID FROM Sys.DM_Exec_Requests (READUNCOMMITTED) WHERE Blocking_Session_ID 0 ) END IF (@ShowTempLog =1) BEGIN SELECT ISNULL(cast(tst.session_id as varchar),'Internal') as SessionID, tdt.database_transaction_log_bytes_reserved/1024 AS [TlogKB], tat.transaction_id AS [Transacton ID], tat.name AS [TRANSACTION Name], tat.transaction_begin_time AS [TRANSACTION BEGIN TIME], DATEDIFF(mi, tat.transaction_begin_time, GETDATE()) AS [Elapsed TIME (in MIN)], CASE tat.transaction_type WHEN 1 THEN 'Read/write' WHEN 2 THEN 'Read-only' WHEN 3 THEN 'System' WHEN 4 THEN 'Distributed' END AS [TRANSACTION Type], CASE tat.transaction_state WHEN 0 THEN 'The transaction has not been completely initialized yet.' WHEN 1 THEN 'The transaction has been initialized but has not started.' WHEN 2 THEN 'The transaction is active.' WHEN 3 THEN 'The transaction has ended. This is used for read-only transactions.' WHEN 4 THEN 'The commit process has been initiated on the distributed transaction. This is for distributed transactions only. The distributed transaction is still active but further processing cannot take place.' WHEN 5 THEN 'The transaction is in a prepared state and waiting resolution.' WHEN 6 THEN 'The transaction has been committed.' WHEN 7 THEN 'The transaction is being rolled back.' WHEN 8 THEN 'The transaction has been rolled back.' END AS [TRANSACTION Description], CAST (SUBSTRING(t.text, (r.statement_start_offset/2)+1, ((CASE r.statement_end_offset WHEN -1 THEN DATALENGTH(t.text) ELSE r.statement_end_offset END - r.statement_start_offset)/2) + 1) as text) AS statement_text FROM sys.dm_tran_active_transactions tat INNER JOIN sys.dm_tran_database_transactions tdt on tat.transaction_id=tdt.transaction_id LEFT JOIN sys.dm_tran_session_transactions AS tst ON tdt.transaction_id = tst.transaction_id LEFT OUTER JOIN sys.dm_exec_requests AS r ON tst.session_id = r.session_id OUTER APPLY sys.dm_exec_sql_text(r.plan_handle) AS t WHERE tdt.database_id=DB_ID('tempdb') and tdt.database_transaction_log_bytes_reserved > 0 ORDER BY 2 DESC END END
You can call it simply with
- exec sp_WhatIsRunning to get a list of running processes
- exec sp_WhatIsRunning 1 to add the Execution Plan to the output
- exec sp_WhatIsRunning 0,1 to show the list of running together with blocking/blocked processes
- exec sp_WhatIsRunning 0,0,1 to show the list of running processes together with the open transactions list on the TempDb
- exec sp_WhatIsRunning 1,1,1 to show all; running processes, blocked/blocking and TempDb active transactions
This is just a starting point, we created the sp this way because this is what we need for realtime monitoring, but the it can be expanded with additional fields and/or visualization options.