sql server | Blog di Mauro Munzi

Posts tagged ‘sql server’

SQL Server Transaction cache can cause several negative side effects

Published: 12 August 2019 in performance, sql, sql server
Tags: ACCESS_METHODS_ACCESSOR_CACHE, compile locks, lock manager, OBJSTORE_LOCK_MANAGER, OBJSTORE_XACT_CACHE, sql server, sys.dm_os_memory_clercks, transaction cache

Since installing SQL Server 2016 Service Pack 1 at some of our customers with high-end machines, we started seeing a couple of performance-related issues that sometimes caused the server to slow down and, in some circumstances, almost hang. By high-end I mean boxes with more than 48 physical cores and between 1 TB and 2 TB of RAM, running at 40K batches/sec or more with several thousand SQL transactions/sec.

We never had this kind of issue with either 2014 or 2016 RTM, and in that period we were introducing further optimizations in both the web code and the stored procedures to reduce resource utilization even more.

The issues we faced were basically:

A huge amount of latch contention of type ACCESS_METHODS_ACCESSOR_CACHE, sometimes accounting for the highest share of waits for the day, with individual waits of many seconds.
Repeated episodes of heavy COMPILE locks, lasting 30-40 seconds and even more.
Increased CPU consumption under the same load and usage pattern we were used to.
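As a rough sketch of how the first two symptoms can be observed (this is not necessarily how they were measured here), the cumulative latch statistics and the currently waiting tasks can be queried as follows. The `LIKE N'%COMPILE%'` filter is an assumption about how compile locks appear in `resource_description`.

```sql
-- Cumulative latch waits since the last restart (or since the stats were cleared).
SELECT
    latch_class,
    waiting_requests_count,
    wait_time_ms,
    max_wait_time_ms,
    -- share of the total latch wait time on the instance
    CAST(100.0 * wait_time_ms / NULLIF(SUM(wait_time_ms) OVER (), 0)
         AS decimal(5, 2)) AS pct_of_latch_wait
FROM sys.dm_os_latch_stats
WHERE waiting_requests_count > 0
ORDER BY wait_time_ms DESC;

-- Sessions currently waiting on a lock whose resource mentions COMPILE
-- (assumption: compile locks show up this way in resource_description).
SELECT
    session_id,
    wait_type,
    wait_duration_ms,
    blocking_session_id,
    resource_description
FROM sys.dm_os_waiting_tasks
WHERE wait_type LIKE N'LCK[_]M[_]%'
  AND resource_description LIKE N'%COMPILE%'
ORDER BY wait_duration_ms DESC;
```

The first query is cumulative, so comparing two snapshots taken some minutes apart gives a better picture than a single run.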

We started looking at the performance counters and found that the plan cache was always very small, around 1 GB for both procedure and ad hoc plans, with higher rates of Compilations/sec and Recompilations/sec than in the past. We looked for an increased number of statistics update events but found nothing: there was no apparent reason for the continuous evictions from the plan cache. The COMPILE locks and the ACCESS_METHODS_ACCESSOR_CACHE latch were the result of the engine's “hard work” on the plan cache.
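For readers who prefer a query to the performance counters, a minimal sketch of how the plan cache size per object type could be checked is shown below. It assumes that `sys.dm_exec_cached_plans` is an acceptable stand-in for the counters mentioned above; on a very busy box the scan itself has a cost.

```sql
-- Plan cache footprint by plan type (Proc, Adhoc, Prepared, ...).
SELECT
    objtype,
    COUNT_BIG(*)                                      AS plan_count,
    SUM(CAST(size_in_bytes AS bigint)) / 1048576      AS size_mb,
    SUM(CASE WHEN usecounts = 1 THEN 1 ELSE 0 END)    AS single_use_plans
FROM sys.dm_exec_cached_plans
GROUP BY objtype
ORDER BY size_mb DESC;
```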

So I started looking at RING_BUFFER_RESOURCE_MONITOR, thinking something like “ok, if it is a memory issue I should find something about it there…”, and found this:

[Image: image005-2]

OK, the system and process indicators were healthy, but the pool indicator was saying that some pool was under pressure. How can this happen if the plan cache is already very small and the buffer pool is fine? That was my question…
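A minimal sketch of a query that extracts the three indicators from the resource monitor ring buffer is shown below. The XML element names are those I would expect on SQL Server 2012 and later, so treat them as an assumption to verify on your build.

```sql
-- Resource monitor notifications with process, system and pool indicators.
SELECT
    DATEADD(second, (rb.[timestamp] - si.ms_ticks) / 1000, GETDATE()) AS approx_time,
    rb.record_xml.value('(/Record/ResourceMonitor/Notification)[1]', 'varchar(64)') AS notification,
    rb.record_xml.value('(/Record/ResourceMonitor/IndicatorsProcess)[1]', 'int')    AS indicators_process,
    rb.record_xml.value('(/Record/ResourceMonitor/IndicatorsSystem)[1]', 'int')     AS indicators_system,
    rb.record_xml.value('(/Record/ResourceMonitor/IndicatorsPool)[1]', 'int')       AS indicators_pool
FROM (
    SELECT [timestamp], CAST(record AS xml) AS record_xml
    FROM sys.dm_os_ring_buffers
    WHERE ring_buffer_type = N'RING_BUFFER_RESOURCE_MONITOR'
) AS rb
CROSS JOIN sys.dm_os_sys_info AS si
ORDER BY rb.[timestamp] DESC;
```

A non-zero pool indicator with process and system indicators at zero is the pattern described above.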

So I “asked” sys.dm_os_memory_clerks for more details, and it told me:

[Image: image006-2]
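The kind of query behind such an output could look like the sketch below. It simply aggregates `pages_kb` per clerk type; the `TOP (15)` cut-off is an arbitrary choice for illustration.

```sql
-- Largest memory clerks on the instance, in MB.
SELECT TOP (15)
    [type],
    SUM(pages_kb) / 1024 AS pages_mb
FROM sys.dm_os_memory_clerks
GROUP BY [type]
ORDER BY SUM(pages_kb) DESC;
```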

So what??? 60 GB for the _XACT_CACHE and 14 GB for the _LOCK_MANAGER? This is a 2 TB box, so we know that the cache pressure limit is around 107 GB (75% of visible memory from 0 to 4 GB + 10% from 4 GB to 64 GB + 5% above 64 GB) and that notifications are raised when a single cache store reaches 62.5% of that figure, i.e. about 67 GB, so we were roughly in that range. In the image you can also see that the _OBJCP and _SQLCP cache stores are very small.
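The arithmetic above can be sketched in T-SQL as follows. The value of `@visible_gb` is an assumption (the full 2 TB); the real figure depends on how much memory is actually visible to the instance.

```sql
-- Assumption: the whole 2 TB (2048 GB) is visible to SQL Server.
DECLARE @visible_gb decimal(12, 2) = 2048;

SELECT
    p.limit_gb         AS cache_pressure_limit_gb,
    p.limit_gb * 0.625 AS single_store_threshold_gb
FROM (
    SELECT limit_gb =
          0.75 * IIF(@visible_gb < 4, @visible_gb, 4)                                   -- 0 to 4 GB
        + 0.10 * IIF(@visible_gb <= 4, 0, IIF(@visible_gb < 64, @visible_gb, 64) - 4)   -- 4 to 64 GB
        + 0.05 * IIF(@visible_gb <= 64, 0, @visible_gb - 64)                            -- above 64 GB
) AS p;
```

With 2048 GB this gives 3 + 6 + 99.2 = 108.2 GB for the limit and about 67.6 GB for the single-store threshold, close to the rounded figures quoted above.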

We issued a DBCC FREESYSTEMCACHE('ALL') and after a while the result was:

The plan cache for object and ad hoc plans quickly grew to the values we had always seen in the past, the other cache stores greatly reduced their footprint, and for some days we had no notifications in the ring buffer, no latch contention and no compile locks.

To check the behavior of the _XACT_CACHE and _LOCK_MANAGER stores we started collecting the output of sys.dm_os_memory_clerks every hour and quickly saw typical memory-leak behavior.
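A minimal sketch of such an hourly collection is shown below. The table `dbo.MemoryClerkHistory` is a hypothetical name, and the `LIKE` patterns are used on purpose so that the stores are matched whatever prefix the clerk type carries on your build. Scheduling (for example with a SQL Server Agent job) is left out.

```sql
-- Hypothetical history table, created once.
IF OBJECT_ID(N'dbo.MemoryClerkHistory', N'U') IS NULL
    CREATE TABLE dbo.MemoryClerkHistory (
        collected_at datetime2(0) NOT NULL,
        clerk_type   nvarchar(60) NOT NULL,
        pages_mb     bigint       NOT NULL
    );

-- One snapshot: run this every hour.
INSERT INTO dbo.MemoryClerkHistory (collected_at, clerk_type, pages_mb)
SELECT
    SYSDATETIME(),
    [type],
    SUM(pages_kb) / 1024
FROM sys.dm_os_memory_clerks
WHERE [type] LIKE N'%XACT_CACHE'
   OR [type] LIKE N'%LOCK_MANAGER'
   OR [type] IN (N'CACHESTORE_OBJCP', N'CACHESTORE_SQLCP')
GROUP BY [type];
```

Plotting `pages_mb` over `collected_at` per `clerk_type` gives the kind of graph described in this post.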

This is the output from a second box with 1 TB of RAM, with the collection started just before a DBCC FREESYSTEMCACHE: the plan cache slowly reaches a steady value and then, when OBJSTORE_XACT_CACHE reaches around 25 GB, SQL Server starts kicking out plans and leaving less room for them. We then found that the dips in the graph coincided with our most evident issues.

This is the graph from the 2 TB box, which shows that as soon as OBJSTORE_XACT_CACHE shrinks, the plan cache starts growing.

We decided to open a case with Microsoft support because it clearly looked like a memory leak: we had never faced this issue before installing SP1 for SQL Server 2016, and I never saw anything similar during 15 years in the SQL Support Team…

Microsoft told us that nothing had changed in that area with SP1 and suggested enabling Trace Flag 3920 to disable the transaction cache completely. I'm not happy to disable it completely, so for now we run DBCC FREESYSTEMCACHE('Transactions') once a week and we no longer have those issues.

However, I still suspect some regression introduced by SP1…

Please feel free to contact me for further details or if you experienced a similar issue.

Temp Tables vs Table Variables deep dive (ENG)

Published: 18 October 2018 in sql, sql server
Tags: fn_dblog, latch, sql server, sys.fn_dblog, table variable, table variable vs temp table, table variables vs temp tables, temp table, temp table vs table variable, temp tables vs table variables, tempdb

… when You can’t use InMemory

More and more has been written about temporary tables and table variables and the pros and cons of one over the other, and some myths have been debunked (…both live in tempdb). To make it short, there are a couple of points in favor of temp tables:

They have statistics, while for table variables (including In-Memory ones) SQL Server always estimates 1 row.
They are accessible from stored procedures called inside the one that created the temp table.

Point 1 becomes negligible for objects with few rows, while for a large number of rows we can use:

OPTION (RECOMPILE), but take particular care with very frequent calls (even every second), because you may spend more CPU continuously recompiling the statements than running them.
Trace flag 2453, but when it is enabled globally the setting is server wide, so the risk is to fall back into the situation mentioned above for every stored procedure that uses table variables.
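To illustrate the first option, here is a small sketch. The table variable `@Ids` and the join against `sys.all_objects` are hypothetical and chosen only so that the script runs anywhere. Comparing the estimated row counts in the two actual execution plans should show the effect of the hint.

```sql
-- Hypothetical table variable filled with a few thousand rows.
DECLARE @Ids TABLE (Id int NOT NULL PRIMARY KEY);

INSERT INTO @Ids (Id)
SELECT TOP (5000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;

-- Compiled without knowing the row count: the estimate for @Ids is typically 1 row.
SELECT COUNT(*)
FROM @Ids AS i
JOIN sys.all_objects AS o ON o.object_id = i.Id;

-- Recompiled at execution time: the optimizer sees the actual number of rows in @Ids.
SELECT COUNT(*)
FROM @Ids AS i
JOIN sys.all_objects AS o ON o.object_id = i.Id
OPTION (RECOMPILE);
```

The second statement pays a compilation on every execution, which is exactly the CPU cost to watch with very frequent calls.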

Given this, let’s look at a more “practical” approach to the comparison, which takes into account two key aspects of tempdb performance in “High Workload Scenarios”:

PAGELATCH contention, EX and SH, on PFS pages (page 1 and every 8,088 pages, i.e. 64 MB) and on GAM and SGAM pages (pages 2 and 3, and every 511,232 pages, i.e. 4 GB).
Tempdb Tlog traffic, generated by the log records written, which translates into MB/s.
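As a sketch of how the first point can be observed on a live system, the query below lists the tasks currently waiting on a PAGELATCH in tempdb and classifies the page using the intervals given above. It assumes that `resource_description` has the usual `dbid:fileid:pageid` form for page latches.

```sql
-- Current PAGELATCH waiters on tempdb (database id 2), with the allocation page type.
SELECT
    wt.session_id,
    wt.wait_type,
    wt.wait_duration_ms,
    wt.resource_description,
    CASE
        WHEN p.page_id = 0                                     THEN 'file header'
        WHEN p.page_id = 1 OR p.page_id % 8088 = 0             THEN 'PFS'
        WHEN p.page_id = 2 OR p.page_id % 511232 = 0           THEN 'GAM'
        WHEN p.page_id = 3 OR (p.page_id - 1) % 511232 = 0     THEN 'SGAM'
        ELSE 'other'
    END AS page_type
FROM sys.dm_os_waiting_tasks AS wt
CROSS APPLY (
    -- last part of dbid:fileid:pageid; TRY_CAST avoids errors on unexpected formats
    SELECT TRY_CAST(PARSENAME(REPLACE(wt.resource_description, ':', '.'), 1) AS bigint) AS page_id
) AS p
WHERE wt.wait_type LIKE N'PAGELATCH[_]%'
  AND wt.resource_description LIKE N'2:%';
```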

To show the differences I’ll use two stored procedures, one that creates a temporary table and one that creates a table variable, inserting 5 rows into each; their structure and average number of rows are very similar to those of our environments. An Extended Events session and a query on the tempdb Tlog will show the differences.


-- Test procedure 1: temporary table with a clustered index and 5 rows
CREATE PROCEDURE [dbo].[proc_TestTemp]
AS
BEGIN
    SET NOCOUNT ON
    CREATE TABLE #Table1(
        [Fld1] [bigint] NULL,
        [Fld2] [int] NULL,
        [Fld3] [int] NULL,
        [Fld4] [tinyint] NULL,
        [Fld5] [decimal](9, 2) NULL,
        [Fld6] [decimal](9, 2) NULL,
        [Fld7] [tinyint] NULL,
        [Fld8] [varchar](15),
        INDEX [IX_Fld1] CLUSTERED ([Fld1] ASC))

    INSERT INTO #Table1 VALUES (1,100000,200000,35,9.20,9.20, 3, 'Test')
    INSERT INTO #Table1 VALUES (10,100000,200000,35,9.20,9.20, 3, 'Test')
    INSERT INTO #Table1 VALUES (100,100000,200000,35,9.20,9.20, 3, 'Test')
    INSERT INTO #Table1 VALUES (1000,100000,200000,35,9.20,9.20, 3, 'Test')
    INSERT INTO #Table1 VALUES (1000,100000,200000,35,9.20,9.20, 3, 'Test')
END
GO

-- Test procedure 2: same structure and rows, using a table variable
CREATE PROCEDURE [dbo].[proc_TestVar]
AS
BEGIN
    SET NOCOUNT ON
    DECLARE @Table1 TABLE (
        [Fld1] [bigint] NULL,
        [Fld2] [int] NULL,
        [Fld3] [int] NULL,
        [Fld4] [tinyint] NULL,
        [Fld5] [decimal](9, 2) NULL,
        [Fld6] [decimal](9, 2) NULL,
        [Fld7] [tinyint] NULL,
        [Fld8] [varchar](15),
        INDEX [IX_Fld1] CLUSTERED ([Fld1] ASC))

    INSERT INTO @Table1 VALUES (1,100000,200000,35,9.20,9.20, 3, 'Test')
    INSERT INTO @Table1 VALUES (10,100000,200000,35,9.20,9.20, 3, 'Test')
    INSERT INTO @Table1 VALUES (100,100000,200000,35,9.20,9.20, 3, 'Test')
    INSERT INTO @Table1 VALUES (1000,100000,200000,35,9.20,9.20, 3, 'Test')
    INSERT INTO @Table1 VALUES (1000,100000,200000,35,9.20,9.20, 3, 'Test')
END
GO

The Extended Events session uses the sqlserver.latch_acquired event, filtered on database id 2 (tempdb) and on the session from which I’m running the stored procedures.


-- session_id 57 is the SPID of the test connection: change it to your own
CREATE EVENT SESSION [Latch]
ON SERVER
ADD EVENT sqlserver.latch_acquired(
    ACTION(sqlserver.session_id,sqlserver.sql_text)
    WHERE ([package0].[equal_uint64]([database_id],(2))
        AND [sqlserver].[session_id]=(57)))
ADD TARGET package0.event_file(SET filename=N'Latch'),
ADD TARGET package0.ring_buffer(SET max_memory=(40960))
WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,
    MAX_DISPATCH_LATENCY=1 SECONDS, MAX_EVENT_SIZE=0 KB,
    MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=OFF,STARTUP_STATE=OFF)
GO

The Tlog query is something like:


SELECT
    fd.[Current LSN],
    fd.Operation,
    fd.AllocUnitName,
    fd.[Transaction Name],
    fd.[Transaction ID]
FROM sys.fn_dblog(NULL, NULL) AS fd

Putting it all together, and issuing a checkpoint before starting, let’s check the results with the temp table…


USE tempdb
GO
-- the checkpoint clears the inactive part of the tempdb log, so fn_dblog only shows the test
CHECKPOINT
GO
ALTER EVENT SESSION Latch ON SERVER STATE = start
GO
exec [TestDb].[dbo].[proc_TestTemp]
GO
ALTER EVENT SESSION Latch ON SERVER STATE = stop
GO
SELECT
    fd.[Current LSN],
    fd.Operation,
    fd.AllocUnitName,
    fd.[Transaction Name],
    fd.[Transaction ID]
FROM sys.fn_dblog(NULL, NULL) AS fd

The Tlog query shows 38 log records (138 at the first execution).

The Extended Events session shows a total of 51 acquired latches.
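One way to count the captured events, sketched under the assumption that the event files were written to the default log directory with the `Latch` prefix, is to read them back with `sys.fn_xe_file_target_read_file`. Adjust the file pattern if your files live elsewhere.

```sql
-- Number of events captured in the session's .xel files, per event name.
SELECT
    f.[object_name] AS event_name,
    COUNT(*)        AS event_count
FROM sys.fn_xe_file_target_read_file(N'Latch*.xel', NULL, NULL, NULL) AS f
GROUP BY f.[object_name];
```

The ring_buffer target is not used here because its content is no longer available once the session has been stopped.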

Let’s remove the just created files and repeat the same test with the Table Variable.

This time the Tlog query shows 23 records (240 at the first execution, meaning that caching a table variable creates more records).

The session now shows a total of 11 acquired latches.

[Image: Cattura]

Basically, table variables require far fewer latches and Tlog records than temp tables, which with thousands of calls per minute can make a great difference. If we change the Tlog query to extract SUM([Log Record Length]), we find that the procedure using the temp table writes 5552 bytes, while the other writes 2632 bytes.
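The modified Tlog query could look like the sketch below: the same `sys.fn_dblog` call, aggregated instead of listed. Run it in tempdb right after each test, with a CHECKPOINT before the test as in the script above.

```sql
USE tempdb;

-- Number of log records and total bytes written since the last checkpoint.
SELECT
    COUNT(*)                    AS log_records,
    SUM(fd.[Log Record Length]) AS log_bytes
FROM sys.fn_dblog(NULL, NULL) AS fd;
```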

Let’s now check what happens to the “Log Flushes/sec” and “Log Bytes Flushed/sec” performance counters of the “Databases” object, tempdb instance. For this purpose I used SQL Load Generator to generate (only) around 130 batches/sec with the two stored procedures.
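The same two counters can also be sampled from T-SQL. In `sys.dm_os_performance_counters` the “/sec” counters are cumulative, so this sketch takes two samples and divides the difference by the interval; the 10-second interval is an arbitrary choice.

```sql
-- First sample of the tempdb log counters.
DECLARE @before TABLE (counter_name nvarchar(128) PRIMARY KEY, cntr_value bigint);

INSERT INTO @before (counter_name, cntr_value)
SELECT RTRIM(counter_name), cntr_value
FROM sys.dm_os_performance_counters
WHERE [object_name] LIKE N'%:Databases%'
  AND instance_name = N'tempdb'
  AND counter_name IN (N'Log Flushes/sec', N'Log Bytes Flushed/sec');

WAITFOR DELAY '00:00:10';

-- Second sample: the delta divided by the interval is the average rate per second.
SELECT
    b.counter_name,
    (pc.cntr_value - b.cntr_value) / 10.0 AS avg_per_sec
FROM sys.dm_os_performance_counters AS pc
JOIN @before AS b
    ON b.counter_name = RTRIM(pc.counter_name)
WHERE pc.[object_name] LIKE N'%:Databases%'
  AND pc.instance_name = N'tempdb';
```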

This is the result with the temp table:

16 log flushes per second and around 650 KB/sec of Tlog traffic.

With the table variable, on the other hand, we have:

Half the Log Flushes/sec (8) and Log Bytes Flushed/sec more than halved, at 300 KB/sec.

So, from a practical point of view, when we can’t enable the In-Memory feature, it is better to start with table variables, particularly when we work with a small number of rows and several calls per minute (or second); and where performance favors temp tables and the call rate is not so high, a RECOMPILE on the statement should eliminate any difference.

Temp Tables vs Table Variables deep dive

Published: 7 October 2018 in sql, sql server
Tags: fn_dblog, latch, sql server, sys.fn_dblog, table variable, table variable vs temp table, table variables vs temp tables, temp table, temp table vs table variable, temp tables vs table variables, tempdb

… when You can’t use InMemory

Everything and more has already been written about temporary tables and table variables, and about the advantages and disadvantages of the former over the latter, and some “strange” myths have been debunked (both live in tempdb). In the end, to make it short, a couple of points remain in favor of temp tables:

They have statistics, while for table variables (including In-Memory ones) SQL Server always estimates 1 row.
They are visible from stored procedures called by the one that created the temp table.

Point 1 becomes negligible for objects with few rows, while the disadvantages with many rows can easily be overcome with:

OPTION (RECOMPILE), but pay attention when there are many calls (even every second), otherwise more CPU is wasted recompiling the statements than executing them.
Trace flag 2453, but when it is enabled globally the setting applies to the whole server, and as far as CPU is concerned we fall back into the previous point for all stored procedures that use table variables.

With these premises, let’s now look at a much more practical approach to the comparison, covering two fundamental aspects of tempdb performance when dealing with “High Workload Scenarios”:

PAGELATCH contention, EX and SH, on PFS pages (page 1 and every 8,088 pages, i.e. 64 MB) and on GAM and SGAM pages (pages 2 and 3, and every 511,232 pages, i.e. 4 GB).
Tlog traffic, generated by the log records written, which in the end translates into MB/s.

To demonstrate the differences I will use two stored procedures that create a temporary table and a table variable respectively, inserting 5 rows into each; their structure and average number of rows reflect a typical situation in our environments. An Extended Events session and a query on the tempdb Tlog will show the differences.

CREATE PROCEDURE [dbo].[proc_TestTemp]
AS
BEGIN
SET NOCOUNT ON
CREATE TABLE #Table1(
[Fld1] [bigint] NULL,
[Fld2] [int] NULL,
[Fld3] [int] NULL,
[Fld4] [tinyint] NULL,
[Fld5] [decimal](9, 2) NULL,
[Fld6] [decimal](9, 2) NULL,
[Fld7] [tinyint] NULL,
[Fld8] [varchar](15) ,
INDEX [IX_Fld1] CLUSTERED ([Fld1] ASC))       

INSERT INTO #Table1 VALUES (1,100000,200000,35,9.20,9.20, 3, 'Prova')
INSERT INTO #Table1 VALUES (10,100000,200000,35,9.20,9.20, 3, 'Prova')
INSERT INTO #Table1 VALUES (100,100000,200000,35,9.20,9.20, 3, 'Prova')
INSERT INTO #Table1 VALUES (1000,100000,200000,35,9.20,9.20, 3, 'Prova')
INSERT INTO #Table1 VALUES (1000,100000,200000,35,9.20,9.20, 3, 'Prova')
END
GO       

CREATE PROCEDURE [dbo].[proc_TestVar]
AS
BEGIN
SET NOCOUNT ON
DECLARE @Table1 TABLE (
[Fld1] [bigint] NULL,
[Fld2] [int] NULL,
[Fld3] [int] NULL,
[Fld4] [tinyint] NULL,
[Fld5] [decimal](9, 2) NULL,
[Fld6] [decimal](9, 2) NULL,
[Fld7] [tinyint] NULL,
[Fld8] [varchar](15),
INDEX [IX_Fld1] CLUSTERED ([Fld1] ASC))

INSERT INTO @Table1 VALUES (1,100000,200000,35,9.20,9.20, 3, 'Prova')
INSERT INTO @Table1 VALUES (10,100000,200000,35,9.20,9.20, 3, 'Prova')
INSERT INTO @Table1 VALUES (100,100000,200000,35,9.20,9.20, 3, 'Prova')
INSERT INTO @Table1 VALUES (1000,100000,200000,35,9.20,9.20, 3, 'Prova')
INSERT INTO @Table1 VALUES (1000,100000,200000,35,9.20,9.20, 3, 'Prova')
END
GO

The Extended Events session uses the sqlserver.latch_acquired event, filtered on database id 2 (tempdb) and on the session from which the two stored procedures are executed.

CREATE EVENT SESSION [Latch]
ON SERVER
ADD EVENT sqlserver.latch_acquired(
ACTION(sqlserver.session_id,sqlserver.sql_text)
WHERE ([package0].[equal_uint64]([database_id],(2))
AND [sqlserver].[session_id]=(57)))
ADD TARGET package0.event_file(SET filename=N'Latch'),
ADD TARGET package0.ring_buffer(SET max_memory=(40960))
WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,
MAX_DISPATCH_LATENCY=1 SECONDS, MAX_EVENT_SIZE=0 KB,
MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=OFF,STARTUP_STATE=OFF)
GO

The Tlog query is something like:

SELECT
    fd.[Current LSN],
    fd.Operation,
    fd.AllocUnitName,
    fd.[Transaction Name],
    fd.[Transaction ID]
FROM sys.fn_dblog(NULL, NULL) AS fd

Putting it all together, and running a checkpoint before starting, let’s see what happens with the temporary table…

USE tempdb
GO
CHECKPOINT
GO
ALTER EVENT SESSION Latch ON SERVER STATE = start
GO
exec [TestDb].[dbo].[proc_TestTemp]
GO
ALTER EVENT SESSION Latch ON SERVER STATE = stop
GO
SELECT
    fd.[Current LSN],
    fd.Operation,
    fd.AllocUnitName,
    fd.[Transaction Name],
    fd.[Transaction ID]
FROM sys.fn_dblog(NULL, NULL) AS fd

The transaction log query tells us that 38 log records were written (138 at the first execution).

The Extended Events session shows a total of 51 acquired latches.

Let’s now delete the Extended Events session files and repeat the same test for the stored procedure that uses the table variable.

This time the Tlog query shows 23 records written (240 at the first execution, so caching a table variable generates more records).

The Extended Events session shows a total of only 11 acquired latches.

[Image: Cattura]

So, in essence, table variables require far fewer latches and Tlog records, which with thousands of calls per minute can make a considerable difference. If the transaction log query is changed to return SUM([Log Record Length]), the stored procedure that uses the temporary table writes 5552 bytes, while the other writes 2632 bytes.

Let’s now see what happens to the “Log Flushes/sec” and “Log Bytes Flushed/sec” performance counters of the “Databases” object, tempdb instance. For this purpose I configured the SQL Load Generator tool to generate (only) around 130 batches per second with the two stored procedures just shown.

With the stored procedure that uses the temporary table we get the following result:

16 log flushes per second and around 650 KB/sec written to the Tlog.

Moving to the version with the table variable we get instead:

Log Flushes/sec are halved to 8 and Log Bytes Flushed/sec are more than halved, at 300 KB/sec.

From a practical point of view, then, when In-Memory cannot be enabled it is better to rely on table variables first, especially when dealing with few rows and many calls per minute; and in cases where the performance difference favors temporary tables, try a RECOMPILE, which very often levels the performance of the two objects.
