Presentation Slides can be downloaded from the AskTom website.
[Updated : 8:58am PDT]
What happens when a SQL Statement is issued?
- Syntax Check -- is it SQL?
- Semantic Check -- What is being queried? What Objects? Dictionary check.
- Shared Pool Check
(steps 1-3 == Soft parse)
ways to reduce soft parse:
ways to reduce soft parse:
- Enable JDBC Statement cacheing
- Bind vars
- PL/SQL functions
[Updated : 9:13am PDT]
Cost Based Optimizer introduced in 1992 in Oracle 7.0 #retroITfacts
Not really considered the default optimizer until version 9.0
Understanding how Optimizer work
Query Transformation
Did you know: There is an ANY and ALL sql set operator? Check it out!
Cost Based Optimizer introduced in 1992 in Oracle 7.0 #retroITfacts
Not really considered the default optimizer until version 9.0
Understanding how Optimizer work
Query Transformation
- Rewrite query text to allows it to be processed more efficiently Oracle does this
- YOU write code for readabilty and 'syntactically pleasing' -- says Mr Kyte
- translate statement into semantically equivalent SQL that can be processed
Did you know: There is an ANY and ALL sql set operator? Check it out!
[Updated: 9:27am PDT]
OR Expansion
'OR is the worst thing in the world. " Mr Kyte
Star Query Transformation
Frequency histograms:
(skipping pre-11g histogram discussion, as not relevant to me)
When to gather stats?
OR Expansion
'OR is the worst thing in the world. " Mr Kyte
- Transforms queries that contain OR predicates into the form of a UNION ALL query with two or more branches.
- Without the transformation Optimizer treats OR predicates as a single unit
- Can't use index on either column. <- THIS SHOULD BE IN ALL CAPS. IT"S IMPORTANT.
Star Query Transformation
- Turns query inside out, with joins back to
Optimizer Dump : 10053 trace file
All above is how Optimizer Query Transformation happens automatically!
[Updated: 9:40am PDT]
Understanding Optimizer Statistics
Who is responsible for Statistics in the database?
How to Gather Stats
Who is responsible for Statistics in the database?
- Developers or DBA?
- Tom Says 'Developers'
- Does DBA know what the application code?
- Does DBA know that unindexed column contains highly skewed data and is used by joins to other tables?
- Does DBA know that table is truncated at 3am but has more than a million rows by 3pm?
- Constraints impact optimizer immensely!
How to Gather Stats
- use DBMS_STATS
- never, ever EVER use 'analyze'
- Cost Base optimizer expects stats from DBMS_STATS, analyze has not been maintained.
- Incremental statistics (new in 11g)
- generate global stats from partition level stats
- control by the parameter INCREMENTAL (default is false)
- Concurrent stats gathering
- ability to gather stats on multiple options concurrently under a GATHER_SCHEMA_STATS command
- controlled by CONCURRENT (default is false)
(all this in the slides, but worth repeating, it's great foundational information)
[Updated: 9:57am PDT]
How to Gather Stats
- SET_SCHEMA_PREF updates default collection settings
- SET_GLOABL_PREF same as schema, but for global
- DON'T TOUCH ESTIMATE PERCENT IN 11G and beyond
- DON'T TOUCH METHOD OPT
- PUBLISH
- allows you to collect stats and validate if optimization is better before and after.
- need ot update session parameter to use new schema.
- STALE_PERCENT
- DEFAULT 10%
- should be updated or altered based activity pattern
Hierarchy for Parameter values
- parameter values from command-line
- table preferences
- global preferences
How to gather stats?
- 11g on words use AUTO_SAMPLE_SIZE
- New hash based algorithm
- speed and resources of 10% sample
- accuracy of 100% sample
- specifying ESTIMATE_PERCENT other than AUTO_SAMPLE_SIZE will use 10g algorthms
- NDV and NUM_ROWS are most imporant values when gathering stats
What information is gathered with DBMS_STATS?
Histograms- Number of rows
- number of blocks
- average row length
- number of distinct values
- number of nulls in column
Frequency histograms:
- only if the number of distinct values in column (NDV) is less than 254.
- Cardinatly estimate will be VERY accurate
Hight balenaced histograms
- if NDV for a column is greater than 254
- "What if you were like me in High School? Almost popular"
Top Frequency (new in 12c)
- creates a freqency histogram for 99% of values
- 1% must not happen often, so we'll give them a small cardinality.
- ONLY created with AUTO_SAMPLE_SIZE
Hybrid Histograms
- similar to hight balanced histograms as created when NDV > 254
- stores actual frequency of bucket endpoints in histograms
- no values allow to spill over other buckets
- more endpoint values can be squeezed into histogram
[Updated 10:25am PDT]
(skipping pre-11g histogram discussion, as not relevant to me)
Histograms in 11g and beyond
[Updated: 11:123am PST]- Adaptive Cursor Sharing, w/ Bind Peeking
- you can have BOTH plans
- plans are marked as bind sensitive
- create NEW plan if binds are not equivalent
- Look in v$sql to see info on adaptive cursor
- Nearly Popular Values
- the value is classifed as non-popular but density calculation is not accurate
- could use dynamic sampling hint
- /*+ dyanamic_sampling(table_name 2) */
- will get more accurate cardinality
- Good Hint (tells the optimizer how to better calculate information)
- Bad Hint (tells the optimizer how to do this)
Additional stats in 11g
- Column group stats
- useful when multiple column for the same table are used in WHERE clause
- DBMS_STATS.GATHER_EXTENDED_STATS ()
- 12c will do something like this automatically
- DBMS_STATS.SEED_COL_USAGE() will monitor for x seconds to help identify candidates for column group stats
- DBMS_STATS.REPORT_COL_USAGE()
- Expression Stats
- useful when column is used as part of a complex expression in WHERE clause
- Both are automatically maintained when stats are gathered on table
When to gather stats?
- Oracle database automatically collects by default.
- After a large data load
- if trickle loading into a partition table
- use DBMS_STATS.COPY_TABLE_STATS
"A BIG plan against a small table, is usually okay. Small plan against a big table is always a disaster."
- Tom Kyte
Stats Gathering Performance
- Three Parallel options
- Intra object using parallel execution
- used with DEGREE parameter in GATHER_*_STATS
- up to 2x + 1 for parallel server (reader & writer + query coordinator)
- inter object using concurrency
- used with CONCURRENCY parameter in GATHER_*_STATS
- combination of the two
- Incremental stats gathering for partitioned tables
[Updated: 11:45am PST]
When NOT to gather Stats?
- when data volume changes dramatically over time
- when is a good time to gather stats?
- when data has a representative data volumne
- lock stats to to ensure stats gathering job does not overwrite
- When table seen as part of an ETL process
- written once, read once, and truncated or deleted
- Use dynamic sampling instead
Other Types of Stats
- Dictionary Stats
- for EM , quest tools, etc
- Fixed Object Stats
- used for objects that depend on v$sessions
- collect stats after major changes, database upgrades, patches or refresh(?)
- System Stats
- "don't play around with them"
- maybe tinker them when using Oracle on Engineered systems
- Automatic Dynamic Sampling
[Updated 11:58am PST]
2 types of execution Plans
- Explain plan execution plan
- what MIGHT happen
- Row Source Execution plan
- what actually DID happen
How to get execution plan:
dbms_xplan.display for nicely formated explain plans
use display_cursor
Explain plans don't always tell the truth
- AUTOTRACE TRUE
- SQL Developer
What's a good plan for an Optimizer?
- What is Cost?
- Represents units of work and resources
- uses CPU and I/O as units of work
- Cost is really an estimate of CPU and I/O
- What is performance?
- fastest possible response time to query
- exploit what ever resources are available to make this happen fastest
- Optimizer does not focus on resources necessary to execute plan
Cardinality is the most important calculation!
Cardinality
- estimate of the number of rows that will be returned by each operation
- check cardinatlity estimates
- /*+ gather_plan_statistics */
- select * from table(dbms_display_cursor(format=>'ALLSTATS LAST')
- will show estiamted cardinality and actuall cardinality
- check cardinality with SQL Monitor (Enterprise Manager)
- will show estimated and actual rows
- is row source plan
- use /*+ monitor */
Look at SLIDE #25, for Solutions for incorrect cardinality estimates.
LUNCH BREAK
[Updated: 12:31pm PST]
Comments
Post a Comment