Monday , December 23 2024

Data Collector, Management Data Warehouse, Perfmon

 

Data Collector is a feature introduced in SQL Server 2008. A nice technology to monitor and report the performance of SQL Server systems. We can use Data Collectors in several different ways.

You can monitor using Perfmon at “t” moment or create a baseline using “Data Collector Set”.

Baseline creation can usually be useful to detect the problem when there is a problem in the system.

For example, you have created a data collector set for the Buffer Cache Hit Ratio(Indicates how much of the queries are read from the buffer cache. You can find the details of the counter in the table at the end of the article.)

And there was a problem in the database at “t” moment.

You have checked the Buffer Cache Hit Ratio ratio from “Data Collector Set” at “t” and you saw that the rate fell to 70% at the time of the problem. This means that there is an insufficiency of memory at the time of the problem. In this way, the baselines you create with data collectors can help you to solve the problem.

You should read the article “How To Calculate IOPS and Throughput of Your Database Server Using Data Collector” to monitor the counter counts from Perfmon and create baseline with data collector sets.

Another use of Data Collectors is the Management Data Warehouse. With Management Data Warehouse, we can create a baseline via SSMS using data collectors. But in its default configuration, baselines for CPU, Disk, Memory and Network are created. It is also possible to create custom baseline but I can say that it is not easy.

If only the Disk and Cpu monitoring is enough for you, I recommend you to use Utility Control Point.

Below you will find articles about Utility Control Point.

How To Monitor SQL Server With Utility Control Point“,

How To Manage Utility Control Point“,

How To Add an Instance To The Utility Control Point

 

Let’s examine the installation of Management Data Warehouse with an example.

From the Management tab on SSMS, right click on Data Collection and click Tasks and Configure Management Data Warehouse.

 

In the screen that appears, we are moving forward by selecting “Do not show this page again”

Management Data Warehouse needs a database to store the information it collects. On the next screen, it wants us to identify the database where it will save the information. We can select an existing database or we can create a new database for this process.

 

The next screen asks the user we want to authorize for the Management Data Warehouse. I’ve given admin rights to the TestLogin user. You can create a new login by clicking the New Login button.

The mdw_reader authorization is required for reporting, the mdw_writer authority is required to upload data, and the mdw_admin authorization is required to do both.

 

Next, click on Configure Data Collection as below to create our Data Collection Sets.

 

On the screen that appears,

In the Server name section, we select the instance of our MDW database that was created in the previous step.

In the database name section, we select our MDW database.

We need to select the file to keep before the data is transferred to the MDW database for Cache directory.

I created this folder on the C disk by creating a folder named Temp.

At the bottom, it asks for the data collector sets we want to activate. We select System Data Collection Sets and Transaction Performance Collection Sets and complete the process by clicking finish on the last screen.

 

You can find reports from the Management> Data Collection> Reports> Management Data Warehouse tabs as follows.

 

In addition to the Management Data Warehouse, you can create a baseline with data collector sets using the counter you want, as described in the article “How To Calculate IOPS and Throughput of Your Database Server Using Data Collector“.

 

Below you can find some important counter and best practice.

ObjectCounterCounter DescriptionThe Range Of Values That Alert Can Be Generated:
MemoryAvailable MBytesThe amount of free memory in the system< 3 GB
MemoryPages Input/SecNumber of pages accessed from disk per second (hard page fault)> 20(Average)
Paging File%UsagePaging File usage rate> 70%
ProcessorPrivileged TimeThe time that Windows Kernel processes spend on the CPU. Indicates a high amount of I/O or interruptions. You should look at DPC Time and Interrupt Time and Context Switches.> 30%(Alert can be generated if it is above this value for two minutes)
ProcessorProcessor TimeCPU Percentage> 90%(Alert can be generated if it is above this value for two minutes)
SystemContext Switches/secThe process of “context switches” is to change the thread that the processor processes. It should not exceed 6000 per CPU.> 384000
SystemProcessor Queue LengthNumber of processors waiting to run in the processor queue> 640
PhysicalDiskAvg. Disk sec/ReadAverage read time per second> 0.30
PhysicalDiskAvg. Disk sec/WriteAverage write time per second> 0.30
PhysicalDisk% Idle TimePercentage of unused disk< 50%(Alert can be generated if it is above this value for two minutes)
SQLServer:Access MethodsForwarded Records/secIf the data does not fit on the page during the update, it should be marked with the pointer to another page.

 

It only happens in the heap tables. Cluster index create-drop can be done.e

>(SQLServer:SQL Statistics:Batch Requests/sec)/10
SQLServer:Access MethodsFreeSpace Scans/secThis is the process of performing a blank page scan for row to be inserted in the heap tables.>(SQLServer:SQL Statistics:Batch Requests/sec)/10
SQLServer:Access MethodsIndex Searches/sec,Full Scans/secIndex search and table scan numbers per second. It means you need index.(Index Searches/sec) /  (Full Scans/sec) < 100
SQLServer:Access MethodsPage Splits/secThe number of pages divided in seconds. Fillfactor will fix the problem.>(SQLServer:SQL Statistics:Batch Requests/sec)/5
SQLServer:Access MethodsScan Point Revalidations/secThe number of scan points re-verified per second during the scan operation to continue>10
SQLServer:Access MethodsWorktables Created/secThe number of tables created in tempdb per second (Table spool, Index spool, etc.)> 400
SQLServer:Buffer ManagerBuffer cache hit ratioThis is the rate at which SQL Server can find the data that it wants from memory.< 97%
SQLServer:Buffer ManagerLazy writes/secThe number of dirty pages written to disk per second to free buffer cache.>20
SQLServer:Buffer ManagerPage life expectancyThe amount of time data pages stay in buffer in seconds.< 300
SQLServer:Buffer ManagerPage lookups/secIn seconds, the number of requests sent to find page in buffer pool. If this value is 100 times greater than the number of batch Requests per second, query plans are inefficient. High I/O queries should be tuned.(Page lookups/sec) / (Batch Requests/sec) > 100
SQLServer:LatchesLatch Waits/secThe number of waiting latch requests per second(Total Latch Wait Time) / (Latch Waits/Sec) > 10
SQLServer:LocksLock Requests/secThe number of requests that request lock per second(Lock Request/sec) / (Batch Requests/sec) > 500
SQLServer:LocksLock Timeouts (timeout > 0)/secTimeout due to lock<>0
SQLServer:LocksNumber of Deadlocks/secThe number of deadlock per second>0
SQLServer:Memory ManagerMemory Grants PendingThe number of connections waiting for memory<>0
SQLServer:SQL StatisticsSQL Compilations/secThe number of queries compiled per second>(Batch Requests/Sec)/5
SQLServer:SQL StatisticsSQL Re-Compilations/secThe number of queries recompiled per second.>(SQL Compilations/sec)/10
SQLServer:DatabaseLog Flush Waits/secIndicates the average wait time of SQL Server when writing to the database’s log.

 

Increasing values indicate that a long time has elapsed when writing to the log.

 

This results in concurrency problems and loss of performance in the system.

Increasing values are bad.
Network InterfaceBytes Received/sec

 

Bytes Sent/sec

It shows the amount of data received and sent from the network in bytes per second.

 

Sudden increases can be the source of the problem if they occur at the same time as performance issues.

Increasing values are bad.
SQLServer:Buffer ManagerCheckpoint pages/secIndicates the number of times the SQL Server’s checkpoint operation occurred in seconds.

 

If the Checkpoint process occurs frequently, then there are too many changes. This means more IO. IO can cause a bottleneck.

 

Sudden increases can be the source of the problem if they occur at the same time as performance issues.

Increasing values are bad.

 

Loading

About dbtut

We are a team with over 10 years of database management and BI experience. Our Expertises: Oracle, SQL Server, PostgreSQL, MySQL, MongoDB, Elasticsearch, Kibana, Grafana.

Leave a Reply

Your email address will not be published. Required fields are marked *