Chapter 13: Query Processing
- Overview
- Measures of Query Cost
- Selection Operation
- Sorting
- Join Operation
- Other Operations
- Evaluation of Expressions
Basic Steps in Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation
- Parsing and translation
  - Translate the query into its internal form, which is then translated into relational algebra.
  - The parser checks syntax and verifies that the referenced relations exist.
- Evaluation
  - The query-execution engine takes a query-evaluation plan, executes that plan, and returns the answers to the query.
Basic Steps in Query Processing: Optimization
- A relational-algebra expression may have many equivalent expressions
  - E.g., σbalance<2500(Πbalance(account)) is equivalent to Πbalance(σbalance<2500(account))
- Each relational-algebra operation can be evaluated using one of several different algorithms
  - Correspondingly, a relational-algebra expression can be evaluated in many ways.
- An annotated expression specifying a detailed evaluation strategy is called an evaluation plan.
  - E.g., we can use an index on balance to find accounts with balance < 2500,
  - or we can perform a complete relation scan and discard accounts with balance ≥ 2500
- Query optimization: among all equivalent evaluation plans, choose the one with lowest cost.
  - Cost is estimated using statistical information from the database catalog
    - e.g., number of tuples in each relation, size of tuples, etc.
- In this chapter we study
  - How to measure query costs
  - Algorithms for evaluating relational-algebra operations
  - How to combine algorithms for individual operations in order to evaluate a complete expression
- In Chapter 14
  - We study how to optimize queries, that is, how to find an evaluation plan with the lowest estimated cost
Measures of Query Cost
- Cost is generally measured as total elapsed time for answering a query
  - Many factors contribute to time cost
    - disk accesses, CPU, or even network communication
- Typically disk access is the predominant cost, and is also relatively easy to estimate. It is measured by taking into account
  - Number of seeks * average-seek-cost
  - Number of blocks read * average-block-read-cost
  - Number of blocks written * average-block-write-cost
    - Cost to write a block is greater than cost to read a block
      - data is read back after being written to ensure that the write was successful
- For simplicity we just use the number of block transfers from disk and the number of seeks as the cost measures
  - tT – time to transfer one block
  - tS – time for one seek
  - Cost for b block transfers plus S seeks: b * tT + S * tS
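A minimal sketch of this cost model in Python; the timing constants are illustrative assumptions, not values from the text:

    # Disk-cost model: b block transfers plus S seeks.
    T_TRANSFER = 0.0001   # tT: seconds to transfer one block (assumed value)
    T_SEEK = 0.004        # tS: seconds for one seek (assumed value)

    def disk_cost(block_transfers: int, seeks: int) -> float:
        """Estimated cost = b * tT + S * tS (CPU and output-write costs ignored)."""
        return block_transfers * T_TRANSFER + seeks * T_SEEK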
- We ignore CPU costs for simplicity
  - Real systems do take CPU cost into account
- We do not include the cost of writing output to disk in our cost formulae
- Several algorithms can reduce disk I/O by using extra buffer space
  - The amount of real memory available for buffering depends on other concurrent queries and OS processes, and is known only during execution
    - We often use worst-case estimates, assuming only the minimum amount of memory needed for the operation is available
- Required data may already be buffer resident, avoiding disk I/O
  - But this is hard to take into account for cost estimation
Selection Operation
- File scan – search algorithms that locate and retrieve records that fulfill a selection condition.
- Algorithm A1 (linear search). Scan each file block and test all records to see whether they satisfy the selection condition.
  - Cost estimate = br block transfers + 1 seek
    - br denotes the number of blocks containing records of relation r
  - If the selection is on a key attribute, we can stop on finding the record
    - cost = (br / 2) block transfers + 1 seek
  - Linear search can be applied regardless of
    - selection condition, or
    - ordering of records in the file, or
    - availability of indices
- A2 (binary search). Applicable if the selection is an equality comparison on the attribute on which the file is ordered.
  - Assume that the blocks of a relation are stored contiguously
  - Cost estimate (number of disk blocks to be scanned):
    - cost of locating the first tuple by a binary search on the blocks
      - ⌈log2(br)⌉ * (tT + tS)
    - If there are multiple records satisfying the selection
      - Add the transfer cost of the number of blocks containing records that satisfy the selection condition
      - We will see how to estimate this cost in Chapter 14
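A small sketch of the A1 and A2 cost estimates under this model; the function and parameter names are hypothetical:

    import math

    def linear_search_cost(br, tT, tS, equality_on_key=False):
        """A1: scan the whole file, or on average half of it for an equality on a key."""
        transfers = br / 2 if equality_on_key else br
        return transfers * tT + 1 * tS

    def binary_search_cost(br, tT, tS, matching_blocks=1):
        """A2: binary search over contiguously stored blocks, plus extra transfers
        for any additional blocks holding matching records."""
        locate_first = math.ceil(math.log2(br)) * (tT + tS)
        return locate_first + (matching_blocks - 1) * tT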
Selections Using Indices
- Index scan – search algorithms that use an index
  - the selection condition must be on the search key of the index.
- A3 (primary index on candidate key, equality). Retrieve a single record that satisfies the corresponding equality condition
  - Cost = (hi + 1) * (tT + tS), where hi is the height of the index
- A4 (primary index on nonkey, equality). Retrieve multiple records.
  - Records will be on consecutive blocks
    - Let b = number of blocks containing matching records
  - Cost = hi * (tT + tS) + tS + tT * b
- A5 (equality on search key of secondary index).
  - Retrieve a single record if the search key is a candidate key
    - Cost = (hi + 1) * (tT + tS)
  - Retrieve multiple records if the search key is not a candidate key
    - each of n matching records may be on a different block
    - Cost = (hi + n) * (tT + tS)
      - Can be very expensive!
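The same formulas as a code sketch (helper names are hypothetical; hi is the index height):

    def a3_primary_index_key(hi, tT, tS):
        """A3: primary index, equality on a candidate key -- one record."""
        return (hi + 1) * (tT + tS)

    def a4_primary_index_nonkey(hi, b, tT, tS):
        """A4: primary index, equality on a nonkey -- b consecutive blocks of matches."""
        return hi * (tT + tS) + tS + tT * b

    def a5_secondary_index(hi, n, tT, tS):
        """A5: secondary index, equality on a nonkey -- n matches, possibly one block each."""
        return (hi + n) * (tT + tS)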
Selections Involving Comparisons
- Can implement selections of the form σA≤V(r) or σA≥V(r) by using
  - a linear file scan or binary search,
  - or by using indices in the following ways:
- A6 (primary index, comparison). (Relation is sorted on A)
  - For σA≥V(r), use the index to find the first tuple ≥ v and scan the relation sequentially from there
  - For σA≤V(r), just scan the relation sequentially till the first tuple > v; do not use the index
- A7 (secondary index, comparison).
  - For σA≥V(r), use the index to find the first index entry ≥ v and scan the index sequentially from there, to find pointers to records.
  - For σA≤V(r), just scan the leaf pages of the index finding pointers to records, till the first entry > v
  - In either case, retrieve the records that are pointed to
    - requires an I/O for each record
    - a linear file scan may be cheaper
Implementation of Complex Selections
- Conjunction: σθ1∧θ2∧...∧θn(r)
- A8 (conjunctive selection using one index).
  - Select a combination of θi and algorithms A1 through A7 that results in the least cost for σθi(r).
  - Test other conditions on the tuple after fetching it into the memory buffer.
- A9 (conjunctive selection using multiple-key index).
  - Use an appropriate composite (multiple-key) index if available.
- A10 (conjunctive selection by intersection of identifiers).
  - Requires indices with record pointers.
  - Use the corresponding index for each condition, and take the intersection of all the obtained sets of record pointers.
  - Then fetch the records from the file
  - If some conditions do not have appropriate indices, apply the test in memory.
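A minimal sketch of the A10 approach; the index structures and helper names are assumptions for illustration:

    def conjunctive_selection_by_intersection(rid_sets, fetch_record, remaining_tests):
        """Intersect the record-pointer (rid) sets obtained from the indexed conditions,
        fetch only those records, then apply any non-indexed conditions in memory."""
        rids = set.intersection(*rid_sets) if rid_sets else set()
        for rid in rids:
            record = fetch_record(rid)            # one fetch per surviving rid
            if all(test(record) for test in remaining_tests):
                yield record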
Algorithms for Complex Selections
- Disjunction: σθ1∨θ2∨...∨θn(r)
- A11 (disjunctive selection by union of identifiers).
  - Applicable if all conditions have available indices.
    - Otherwise use a linear scan.
  - Use the corresponding index for each condition, and take the union of all the obtained sets of record pointers.
  - Then fetch the records from the file
- Negation: σ¬θ(r)
  - Use a linear scan on the file
  - If very few records satisfy ¬θ, and an index is applicable to ¬θ
    - Find the satisfying records using the index and fetch them from the file
Sorting
- We may build an index on the relation, and then use the index to read the relation in sorted order. This may lead to one disk block access for each tuple.
- For relations that fit in memory, techniques like quicksort can be used. For relations that don't fit in memory, external sort-merge is a good choice.
External Sort-Merge
- Create sorted runs. Let i be 0 initially. (M denotes the number of blocks that fit in main memory.)
  Repeatedly do the following till the end of the relation:
  (a) Read M blocks of the relation into memory
  (b) Sort the in-memory blocks
  (c) Write the sorted data to run Ri; increment i.
  Let the final value of i be N.
- Merge the runs (next slide).
- Merge the runs (N-way merge). We assume (for now) that N < M.
- Use N blocks of memory to buffer input runs, and 1 block to buffer output. Read the first block of each run into its buffer page.
- repeat
  - Select the first record (in sort order) among all buffer pages
  - Write the record to the output buffer. If the output buffer is full, write it to disk.
  - Delete the record from its input buffer page.
    If the buffer page becomes empty, read the next block (if any) of the run into the buffer.
- until all input buffer pages are empty
- If N ≥ M, several merge passes are required.
  - In each pass, contiguous groups of M - 1 runs are merged.
  - A pass reduces the number of runs by a factor of M - 1, and creates runs longer by the same factor.
    - E.g., if M = 11 and there are 90 runs, one pass reduces the number of runs to 9, each 10 times the size of the initial runs
  - Repeated passes are performed till all runs have been merged into one.
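A compact sketch of the two phases; for brevity the runs are kept as in-memory lists here (a real implementation writes each run to disk and streams it back during the merge):

    import heapq

    def create_sorted_runs(records, run_size):
        """Phase 1: fill memory (run_size records stands in for M blocks of tuples),
        sort it, and emit it as a run."""
        runs, buffer = [], []
        for rec in records:
            buffer.append(rec)
            if len(buffer) == run_size:
                runs.append(sorted(buffer))
                buffer = []
        if buffer:
            runs.append(sorted(buffer))
        return runs

    def merge_runs(runs):
        """Phase 2: N-way merge -- repeatedly emit the smallest record among
        the fronts of all runs (heapq.merge does exactly this)."""
        return heapq.merge(*runs)

    # Usage: sorted_stream = merge_runs(create_sorted_runs(data, run_size=1000))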
External Sort-Merge: Cost Analysis
- Cost analysis:
  - Total number of merge passes required: ⌈logM–1(br / M)⌉
  - Block transfers for initial run creation as well as in each pass is 2br
    - for the final pass, we don't count the write cost
      - we ignore the final write cost for all operations, since the output of an operation may be sent to the parent operation without being written to disk
    - Thus the total number of block transfers for external sorting is
      br (2 ⌈logM–1(br / M)⌉ + 1)
  - Seeks: next slide
- Cost of seeks
  - During run generation: one seek to read each run and one seek to write each run
    - 2 ⌈br / M⌉
  - During the merge phase
    - Buffer size: bb (read/write bb blocks at a time)
    - Need 2 ⌈br / bb⌉ seeks for each merge pass
      - except the final one, which does not require a write
    - Total number of seeks:
      2 ⌈br / M⌉ + ⌈br / bb⌉ (2 ⌈logM–1(br / M)⌉ – 1)
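The same formulas as a small sketch (variable names mirror the text; bb is the merge buffer size in blocks):

    import math

    def external_sort_cost(br, M, bb):
        """Block transfers and seeks for external sort-merge (final write not counted)."""
        passes = math.ceil(math.log(br / M, M - 1))
        transfers = br * (2 * passes + 1)
        seeks = 2 * math.ceil(br / M) + math.ceil(br / bb) * (2 * passes - 1)
        return transfers, seeks

    # Example: br = 90, M = 11, bb = 1 gives one merge pass,
    # 270 block transfers and 2*9 + 90*1 = 108 seeks.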
Join Operation
- Several different algorithms to implement joins
  - Nested-loop join
  - Block nested-loop join
  - Indexed nested-loop join
  - Merge-join
  - Hash-join
- Choice is based on cost estimates
- Examples use the following information
  - Number of records: customer 10,000; depositor 5,000
  - Number of blocks: customer 400; depositor 100
Nested-Loop Join
- To compute the theta join r ⋈θ s:
      for each tuple tr in r do begin
          for each tuple ts in s do begin
              test pair (tr, ts) to see if they satisfy the join condition θ
              if they do, add tr • ts to the result.
          end
      end
- r is called the outer relation and s the inner relation of the join.
- Requires no indices and can be used with any kind of join condition.
- Expensive, since it examines every pair of tuples in the two relations.
- In the worst case, if there is enough memory only to hold one block of each relation, the estimated cost (sketched in code below) is
      nr * bs + br block transfers, plus nr + br seeks
- If the smaller relation fits entirely in memory, use that as the inner relation.
  - Reduces cost to br + bs block transfers and 2 seeks
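A small helper expressing the worst-case formula (the function name is hypothetical; the numbers in the comments match the example that follows):

    def nested_loop_join_cost(nr, br, bs):
        """Worst case (one memory block per relation), with r as the outer relation:
        nr * bs + br block transfers and nr + br seeks."""
        return nr * bs + br, nr + br

    # nested_loop_join_cost(5000, 100, 400)   -> (2000100, 5100)   depositor as outer
    # nested_loop_join_cost(10000, 400, 100)  -> (1000400, 10400)  customer as outer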
- Assuming worst-case memory availability, the cost estimate is
  - with depositor as the outer relation:
    - 5000 * 400 + 100 = 2,000,100 block transfers,
    - 5000 + 100 = 5100 seeks
  - with customer as the outer relation:
    - 10000 * 100 + 400 = 1,000,400 block transfers and 10,400 seeks
- If the smaller relation (depositor) fits entirely in memory, the cost estimate will be 500 block transfers.
- The block nested-loop algorithm (next slide) is preferable.
Block Nested-Loop Join
- Variant of nested-loop join in which every block of the inner relation is paired with every block of the outer relation.
      for each block Br of r do begin
          for each block Bs of s do begin
              for each tuple tr in Br do begin
                  for each tuple ts in Bs do begin
                      check if (tr, ts) satisfy the join condition
                      if they do, add tr • ts to the result.
                  end
              end
          end
      end
- Worst-case estimate: br * bs + br block transfers + 2 * br seeks
  - Each block in the inner relation s is read once for each block in the outer relation (instead of once for each tuple in the outer relation)
- Best case: br + bs block transfers + 2 seeks.
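A minimal runnable version of the pseudocode above, with Python lists of tuples standing in for disk blocks (block formation and buffering are simplified assumptions):

    def block_nested_loop_join(r_blocks, s_blocks, theta):
        """Pair every block of the inner relation s with every block of the outer
        relation r, so each inner block is read once per outer block rather than
        once per outer tuple."""
        for Br in r_blocks:            # outer relation, one block at a time
            for Bs in s_blocks:        # inner relation, one block at a time
                for tr in Br:
                    for ts in Bs:
                        if theta(tr, ts):
                            yield tr + ts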
- Improvements to nested-loop and block nested-loop algorithms:
  - In block nested-loop, use M - 2 disk blocks as the blocking unit for the outer relation, where M = memory size in blocks; use the remaining two blocks to buffer the inner relation and the output
    - Cost = ⌈br / (M-2)⌉ * bs + br block transfers + 2 ⌈br / (M-2)⌉ seeks
  - If the equi-join attribute forms a key of the inner relation, stop the inner loop on the first match
  - Scan the inner loop forward and backward alternately, to make use of the blocks remaining in the buffer (with LRU replacement)
  - Use an index on the inner relation if available (next slide)
Indexed Nested-Loop Join
- Index lookups can replace file scans if
  - the join is an equi-join or natural join, and
  - an index is available on the inner relation's join attribute
    - Can construct an index just to compute a join.
- For each tuple tr in the outer relation r, use the index to look up tuples in s that satisfy the join condition with tuple tr.
- Worst case: the buffer has space for only one page of r, and, for each tuple in r, we perform an index lookup on s.
- Cost of the join: br (tT + tS) + nr * c
  - where c is the cost of traversing the index and fetching all matching s tuples for one tuple of r
  - c can be estimated as the cost of a single selection on s using the join condition.
- If indices are available on the join attributes of both r and s, use the relation with fewer tuples as the outer relation.
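A sketch of the idea with a Python dict standing in for the index on s's join attribute (a real system would probe an existing B+-tree or hash index rather than build a map):

    from collections import defaultdict

    def indexed_nested_loop_join(r, s, r_key, s_key):
        """For each tuple of the outer relation r, probe an index on the inner
        relation s instead of scanning s."""
        index = defaultdict(list)
        for ts in s:
            index[s_key(ts)].append(ts)
        for tr in r:
            for ts in index.get(r_key(tr), []):   # index lookup replaces the inner scan
                yield tr + ts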
Example of Nested-Loop Join Costs
- Compute depositor ⋈ customer, with depositor as the outer relation.
- Let customer have a primary B+-tree index on the join attribute customer-name, which contains 20 entries in each index node.
- Since customer has 10,000 tuples, the height of the tree is 4, and one more access is needed to find the actual data.
- depositor has 5000 tuples.
- Cost of block nested-loop join
  - 400 * 100 + 100 = 40,100 block transfers + 2 * 100 = 200 seeks
    - assuming worst-case memory
    - may be significantly less with more memory
- Cost of indexed nested-loop join
  - 100 + 5000 * 5 = 25,100 block transfers and seeks
  - CPU cost is likely to be less than that of block nested-loop join
Merge-Join
- Sort both relations on their join attribute (if not already sorted on the join attributes).
- Merge the sorted relations to join them
- The join step is similar to the merge stage of the sort-merge algorithm.
- The main difference is the handling of duplicate values in the join attribute: every pair with the same value on the join attribute must be matched
- Detailed algorithm in the book (a sketch follows below)
- Can be used only for equi-joins and natural joins
- Each block needs to be read only once (assuming all tuples for any given value of the join attributes fit in memory)
- Thus the cost of merge-join is:
      br + bs block transfers + ⌈br / bb⌉ + ⌈bs / bb⌉ seeks
  - plus the cost of sorting if the relations are unsorted.
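The duplicate-handling logic as a minimal in-memory sketch; it assumes both inputs are already sorted Python lists of tuples, and the helper names are illustrative:

    def merge_join(r, s, r_key, s_key):
        """Merge two relations sorted on their join attributes; every r tuple in a
        run of equal keys is paired with every matching s tuple."""
        i, j = 0, 0
        while i < len(r) and j < len(s):
            if r_key(r[i]) < s_key(s[j]):
                i += 1
            elif r_key(r[i]) > s_key(s[j]):
                j += 1
            else:
                key = r_key(r[i])
                j_start = j
                while j < len(s) and s_key(s[j]) == key:
                    j += 1
                s_run = s[j_start:j]              # assumed to fit in memory
                while i < len(r) and r_key(r[i]) == key:
                    for ts in s_run:
                        yield r[i] + ts
                    i += 1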
- Hybrid merge-join: applicable if one relation is sorted and the other has a secondary B+-tree index on the join attribute
  - Merge the sorted relation with the leaf entries of the B+-tree.
  - Sort the result on the addresses of the unsorted relation's tuples
  - Scan the unsorted relation in physical address order and merge with the previous result, to replace addresses by the actual tuples
    - Sequential scan is more efficient than random lookup
Hash-Join
- Applicable for equi-joins and natural joins.
- A hash function h is used to partition the tuples of both relations
- h maps JoinAttrs values to {0, 1, ..., n}, where JoinAttrs denotes the common attributes of r and s used in the natural join.
  - r0, r1, ..., rn denote partitions of r tuples
    - Each tuple tr ∈ r is put in partition ri, where i = h(tr[JoinAttrs]).
  - s0, s1, ..., sn denote partitions of s tuples
    - Each tuple ts ∈ s is put in partition si, where i = h(ts[JoinAttrs]).
- Note: in the book, ri is denoted as Hri, si is denoted as Hsi, and n is denoted as nh.
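A minimal sketch of the partitioning step (the build and probe phases are omitted; the function name and the use of Python's built-in hash are illustrative assumptions):

    def partition(relation, join_key, n):
        """Split tuples into partitions 0..n by hashing the join-attribute value.
        Tuples of r and s that can join always land in partitions with the same index."""
        parts = [[] for _ in range(n + 1)]
        for t in relation:
            parts[hash(join_key(t)) % (n + 1)].append(t)
        return parts

    # r_parts = partition(r, r_key, n); s_parts = partition(s, s_key, n)
    # Each pair (r_parts[i], s_parts[i]) can then be joined independently,
    # e.g. by building an in-memory hash index on the smaller partition.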
