Read this awesome article from Oracle-Base
If you put 10 Oracle performance gurus in the same room they will all say database statistics are vital for the cost-based optimizer to choose the correct execution plan for a query, but they will all have a different opinion on how to gather those statistics. A couple of quotes that stand out in my mind are:
- "You don't necessarily need up to date statistics. You need statistics that are representative of your data." - Graham Wood.Meaning, the age of the statistics in your system is not a problem as long as they are still representative of your data. So just looking at the
LAST_ANALYZED
column of theDBA_TABLES
view is not an indication of valid stats on your system. - "Do you want the optimizer to give you the best performance, or consistent performance?" - Anjo KolkMeaning, regularly changing your stats potentially introduces change. Change is not always a good thing.
Neither of these experts are suggesting you never update your stats, just pointing out that in doing so you are altering information the optimizer uses to determine which execution plan is the most efficient. In altering that information it is not unlikely the optimizer may make a different decision. Hopefully it will be the correct decision, but maybe it wont. If you gather statistics for all tables every night, your system will potentially act differently every day. This is the fundamental paradox of gathering statistics.
So what should our statistics strategy be? Here are some suggestions.
- Automatic Optimizer Statistics Collection: From 10g onward the database automatically gathers statistics on a daily basis. The default statistics job has come under a lot of criticism over the years, but its value depends on the type of systems you are managing. Most of that criticism has come from people discussing edge cases, like large data warehouses. If you are managing lots of small databases that have relatively modest performance requirements, you can pretty much let Oracle do its own thing where stats are concerned. If you have any specific problems, deal with them on a case by case basis.
- Mixed Approach: You rely on the automatic job for the majority of stats collection, but you have specific tables or schemas that have very specific stats requirements. In these cases you can either set the preferences for the objects in question, or lock the stats for the specific tables/schemas to prevent the job from changing them, then devise a custom solution for those tables/schemas.
- Manual: You disable the automatic stats collection completely and devise a custom solution for the whole of the database.
Which one of these approaches you take should be decided on a case-by-case basis. Whichever route you take, you will be using the
DBMS_STATS
package to manage your stats.
Regardless of the approach you take, you need to consider system and fixed object statistics for every database, as these are not gathered by the automatic job.
DBMS_STATS
The
DBMS_STATS
package was introduced in Oracle 8i and is Oracle's preferred method of gathering statistics. Oracle list a number of benefits to using it including parallel execution, long term storage of statistics and transfer of statistics between servers.
The functionality of the DBMS_STATS package varies greatly between database versions, as do the default parameter settings and the quality of the statistics they generate. It is worth spending some time checking the documentation relevant to your version.
Table and Index Stats
Table statistics can be gathered for the database, schema, table or partition.
EXEC DBMS_STATS.gather_database_stats; EXEC DBMS_STATS.gather_database_stats(estimate_percent => 15); EXEC DBMS_STATS.gather_database_stats(estimate_percent => 15, cascade => TRUE); EXEC DBMS_STATS.gather_schema_stats('SCOTT'); EXEC DBMS_STATS.gather_schema_stats('SCOTT', estimate_percent => 15); EXEC DBMS_STATS.gather_schema_stats('SCOTT', estimate_percent => 15, cascade => TRUE); EXEC DBMS_STATS.gather_table_stats('SCOTT', 'EMPLOYEES'); EXEC DBMS_STATS.gather_table_stats('SCOTT', 'EMPLOYEES', estimate_percent => 15); EXEC DBMS_STATS.gather_table_stats('SCOTT', 'EMPLOYEES', estimate_percent => 15, cascade => TRUE); EXEC DBMS_STATS.gather_dictionary_stats;
The
ESTIMATE_PERCENT
parameter was often used when gathering stats from large segments to reduce the sample size and therefore the overhead of the operation. In Oracle 9i upwards, we also had the option of letting Oracle determine the sample size using theAUTO_SAMPLE_SIZE
constant, but this got a bad reputation because the selected sample size was sometimes inappropriate, making the resulting statistics questionable.
In Oracle 11g, the
AUTO_SAMPLE_SIZE
constant is the preferred (and default) sample size as the mechanism for determining the actual sample size has been improved. In addition, the statistics estimate based on the auto sampling are near to 100% accurate and much faster to gather than in previous versions
The
CASCADE
parameter determines if statistics should be gathered for all indexes on the table currently being analyzed. Prior to Oracle 10g, the default was FALSE, but in 10g upwards it defaults to AUTO_CASCADE
, which means Oracle determines if index stats are necessary.
As a result of these modifications to the behavior in the stats gathering, in Oracle 11g upwards, the basic defaults for gathering table stats are satisfactory for most tables.
Index statistics can be gathered explicitly using the
GATHER_INDEX_STATS
procedure.EXEC DBMS_STATS.gather_index_stats('SCOTT', 'EMPLOYEES_PK'); EXEC DBMS_STATS.gather_index_stats('SCOTT', 'EMPLOYEES_PK', estimate_percent => 15);
The current statistics information is available from the data dictionary views for the specific objects (DBA, ALL and USER views). Some of these view were added in later releases.
- DBA_TABLES
- DBA_TAB_STATISTICS
- DBA_TAB_PARTITIONS
- DBA_TAB_SUB_PARTITIONS
- DBA_TAB_COLUMNS
- DBA_TAB_COL_STATISTICS
- DBA_PART_COL_STATISTICS
- DBA_SUBPART_COL_STATISTICS
- DBA_INDEXES
- DBA_IND_STATISTICS
- DBA_IND_PARTITIONS
- DBA_IND_SUBPARTIONS
Histogram information is available from the following views.
- DBA_TAB_HISTOGRAMS
- DBA_PART_HISTOGRAMS
- DBA_SUBPART_HISTOGRAMS
Table, column and index statistics can be deleted using the relevant delete procedures.
EXEC DBMS_STATS.delete_database_stats; EXEC DBMS_STATS.delete_schema_stats('SCOTT'); EXEC DBMS_STATS.delete_table_stats('SCOTT', 'EMP'); EXEC DBMS_STATS.delete_column_stats('SCOTT', 'EMP', 'EMPNO'); EXEC DBMS_STATS.delete_index_stats('SCOTT', 'EMP_PK'); EXEC DBMS_STATS.delete_dictionary_stats;
System Stats
Introduced in Oracle 9iR1, the
GATHER_SYSTEM_STATS
procedure gathers statistics relating to the performance of your systems I/O and CPU. Giving the optimizer this information makes its choice of execution plan more accurate, since it is able to weigh the relative costs of operations using both the CPU and I/O profiles of the system.
There are two possible types of system statistics:
- Noworkload: All databases come bundled with a default set of noworkload statistics, but they can be replaced with more accurate information. When gathering noworkload stats, the database issues a series of random I/Os and tests the speed of the CPU. As you can imagine, this puts a load on your system during the gathering phase.
EXEC DBMS_STATS.gather_system_stats;
- Workload: When initiated using the start/stop or interval parameters, the database uses counters to keep track of all system operations, giving it an accurate idea of the performance of the system. If workload statistics are present, they will be used in preference to noworkload statistics.
-- Manually start and stop to sample a representative time (several hours) of system activity. EXEC DBMS_STATS.gather_system_stats('start'); EXEC DBMS_STATS.gather_system_stats('stop'); -- Sample from now until a specific number of minutes. DBMS_STATS.gather_system_stats('interval', interval => 180);
Your current system statistics can be displayed by querying the
AUX_STATS$
table.SELECT pname, pval1 FROM sys.aux_stats$ WHERE sname = 'SYSSTATS_MAIN'; PNAME PVAL1 ------------------------------ ---------- CPUSPEED CPUSPEEDNW 1074 IOSEEKTIM 10 IOTFRSPEED 4096 MAXTHR MBRC MREADTIM SLAVETHR SREADTIM 9 rows selected. SQL>
If you are running 11.2.0.1 or 11.2.0.2 then check out MOS Note: 9842771.8.
The
DELETE_SYSTEM_STATS
procedure will delete all workload stats and replace previously gathered noworkload stats with the default values.EXEC DBMS_STATS.delete_system_stats;
You only need to update your system statistics when something major has happened to your systems hardware or workload profile.
There are two schools of thought about system stats. One side avoid the use of system statistics altogether, favoring the default noworkload stats. The other side suggests providing accurate system statistics. The problem with the latter, is it is very difficult to decide what represents an accurate set of system statistics. Most people seem to favor investigation of systems using a variety of methods, including gathering system stats into a stats table, then manually setting the system statistics using the
SET_SYSTEM_STATS
procedure.EXEC DBMS_STATS.set_system_stats('iotfrspeed', 4096);
The available parameter names can be found here.
I would say, if in doubt, use the defaults.
Fixed Object Stats
Introduced in Oracle 10gR1, the
GATHER_FIXED_OBJECTS_STATS
procedure gathers statistics on the X$
tables, which sit underneath theV$
dynamic performance views. The X$
tables are not really tables at all, but a window on to the memory structures in the Oracle kernel. Fixed object stats are not gathered automatically, so you need to gather them manually at a time when the database is in a representative level of activity.EXEC DBMS_STATS.gather_fixed_objects_stats;
Major changes to initialization parameters or system activity should signal you to gather fresh stats, but under normal running this does not need to be done on a regular basis.
The stats are removed using the
DELETE_FIXED_OBJECTS_STATS
procedure.EXEC DBMS_STATS.delete_fixed_objects_stats;
Locking Stats
To prevent statistics being overwritten, you can lock the stats at schema, table or partition level.
EXEC DBMS_STATS.lock_schema_stats('SCOTT'); EXEC DBMS_STATS.lock_table_stats('SCOTT', 'EMP'); EXEC DBMS_STATS.lock_partition_stats('SCOTT', 'EMP', 'EMP_PART1');
If you need to replace the stats, they must be unlocked.
EXEC DBMS_STATS.unlock_schema_stats('SCOTT'); EXEC DBMS_STATS.unlock_table_stats('SCOTT', 'EMP'); EXEC DBMS_STATS.unlock_partition_stats('SCOTT', 'EMP', 'EMP_PART1');
Locking stats can be very useful to prevent automated jobs from changing them. This is especially useful with tables used for ETL processes. If the stats are gathered when the tables are empty, they will not reflect the real quantity of data during the load process. Instead, either gather stats each time the data is loaded, or gather them once on a full table and lock them.
Transfering Stats
It is possible to transfer statistics between servers allowing consistent execution plans between servers with varying amounts of data. First the statistics must be collected into a statistics table. In the following examples the statistics for the APPSCHEMA user are collected into a new table, STATS_TABLE, which is owned by DBASCHEMA.
EXEC DBMS_STATS.create_stat_table('DBASCHEMA','STATS_TABLE'); EXEC DBMS_STATS.export_schema_stats('APPSCHEMA','STATS_TABLE',NULL,'DBASCHEMA');
This table can then be transfered to another server using your preferred method (Export/Import, SQL*Plus COPY etc.) and the stats imported into the data dictionary as follows.
EXEC DBMS_STATS.import_schema_stats('APPSCHEMA','STATS_TABLE',NULL,'DBASCHEMA'); EXEC DBMS_STATS.drop_stat_table('DBASCHEMA','STATS_TABLE');
Setting Preferences
Since Oracle 10g, many of the default values of parameters for the
DBMS_STATS
procedures have changed from being hard coded to using preferences. In Oracle 10g, these preferences could be altered using the SET_PARAM
procedure.EXEC DBMS_STATS.set_param('DEGREE', '5');In 11g, the
SET_PARAM
procedure was deprecated in favor of a layered approach to preferences. The four levels of preferences are amended with the following procedures.SET_GLOBAL_PREFS
: Used to set global preferences, including some specific to the automatic stats collection job.SET_DATABASE_PREFS
: Sets preferences for the whole database.SET_SCHEMA_PREFS
: Sets preferences for a specific schema.SET_TABLE_PREFS
: Sets preferences for a specific table.
No comments:
Post a Comment