## Selection Operation

- Consider the query to find the assets and branch names of all banks that have depositors living in Port Chester. In relational algebra, this is
  $$\Pi_{bname,\,assets}\bigl(\sigma_{ccity = \text{"Port Chester"}}(customer \bowtie deposit \bowtie branch)\bigr)$$
- This expression constructs a huge relation, $customer \bowtie deposit \bowtie branch$, of which we are only interested in a few tuples.

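- We are also interested in only two attributes of this relation.
- We only want tuples for which *ccity = “Port Chester”*.
- Since *ccity* is an attribute of *customer*, the selection can be applied to *customer* before the joins. Thus we can rewrite our query as:
  $$\Pi_{bname,\,assets}\bigl(\sigma_{ccity = \text{"Port Chester"}}(customer) \bowtie deposit \bowtie branch\bigr)$$
- This should considerably reduce the size of the intermediate relation.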
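
The effect of moving the selection ahead of the joins can be checked on toy data. The following is a minimal sketch, not a real query engine: relations are plain lists of dictionaries, the rows are made up, and the attribute names `cname` and `bname` are assumptions introduced for the illustration (only *ccity*, *balance* and the relation names come from the text).

```python
# Toy relations as lists of dicts; all rows below are made up for illustration.
customer = [
    {"cname": "Ann", "ccity": "Port Chester"},
    {"cname": "Bob", "ccity": "Rye"},
    {"cname": "Cy", "ccity": "Port Chester"},
    {"cname": "Dee", "ccity": "Harrison"},
]
deposit = [
    {"cname": "Ann", "bname": "Downtown", "balance": 500},
    {"cname": "Bob", "bname": "Downtown", "balance": 2000},
    {"cname": "Cy", "bname": "Uptown", "balance": 1500},
    {"cname": "Dee", "bname": "Uptown", "balance": 300},
]
branch = [
    {"bname": "Downtown", "assets": 9000000},
    {"bname": "Uptown", "assets": 4000000},
]


def natural_join(r, s):
    """Join two relations on every attribute name they share."""
    out = []
    for t in r:
        for u in s:
            common = t.keys() & u.keys()
            if all(t[a] == u[a] for a in common):
                out.append({**t, **u})
    return out


def select(pred, r):
    """Keep only the tuples that satisfy the predicate."""
    return [t for t in r if pred(t)]


def project(attrs, r):
    """Projection with duplicate elimination (set semantics)."""
    return {tuple(t[a] for a in attrs) for t in r}


def in_port_chester(t):
    return t["ccity"] == "Port Chester"


# Late selection: build the full three-way join, then filter.
late_intermediate = natural_join(natural_join(customer, deposit), branch)
late = project(["bname", "assets"], select(in_port_chester, late_intermediate))

# Early selection: filter customer first, then join.
early_intermediate = natural_join(
    natural_join(select(in_port_chester, customer), deposit), branch
)
early = project(["bname", "assets"], early_intermediate)

print(len(late_intermediate), len(early_intermediate))  # intermediate sizes
print(early == late)  # both plans give the same answer
```

On this data the late plan materializes 4 joined tuples and the early plan only 2, while both return the same set of (branch name, assets) pairs. With realistic relation sizes the gap would be far larger.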

**Suggested Rule for Optimization:**

- Perform select operations as early as possible.
- If our original query were restricted further to customers with a balance over \$1000, the whole selection could no longer be applied directly to the *customer* relation as above.
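- The new relational algebra query is
  $$\Pi_{bname,\,assets}\bigl(\sigma_{ccity = \text{"Port Chester"} \,\wedge\, balance > 1000}(customer \bowtie deposit \bowtie branch)\bigr)$$
- The selection cannot be applied to *customer* alone, as *balance* is an attribute of *deposit*.
- We can still rewrite as
  $$\Pi_{bname,\,assets}\bigl(\bigl(\sigma_{ccity = \text{"Port Chester"} \,\wedge\, balance > 1000}(customer \bowtie deposit)\bigr) \bowtie branch\bigr)$$
- If we look further at the subquery (the selection over $customer \bowtie deposit$), we can split the selection predicate in two:
  $$\sigma_{ccity = \text{"Port Chester"}}\bigl(\sigma_{balance > 1000}(customer \bowtie deposit)\bigr)$$
- This rewriting gives us a chance to use our “perform selections early” rule again.
- We can now rewrite our subquery as:
  $$\sigma_{ccity = \text{"Port Chester"}}(customer) \bowtie \sigma_{balance > 1000}(deposit)$$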
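
As a sanity check, the three forms of the subquery can be compared on toy data: one conjunctive selection over the join, a cascade of two selections over the join, and the form in which the *ccity* test is applied to *customer* and the *balance* test to *deposit* before joining. This is a sketch under stated assumptions: the rows are invented, and `cname` and `bname` are hypothetical attribute names.

```python
# Made-up rows; cname and bname are assumed attribute names.
customer = [
    {"cname": "Ann", "ccity": "Port Chester"},
    {"cname": "Bob", "ccity": "Rye"},
    {"cname": "Cy", "ccity": "Port Chester"},
    {"cname": "Dee", "ccity": "Harrison"},
    {"cname": "Eve", "ccity": "Port Chester"},
]
deposit = [
    {"cname": "Ann", "bname": "Downtown", "balance": 500},
    {"cname": "Bob", "bname": "Downtown", "balance": 2000},
    {"cname": "Cy", "bname": "Uptown", "balance": 1500},
    {"cname": "Dee", "bname": "Uptown", "balance": 300},
    {"cname": "Eve", "bname": "Downtown", "balance": 3000},
]


def natural_join(r, s):
    """Join two relations on every attribute name they share."""
    return [
        {**t, **u}
        for t in r
        for u in s
        if all(t[a] == u[a] for a in t.keys() & u.keys())
    ]


def city_ok(t):
    return t["ccity"] == "Port Chester"


def balance_ok(t):
    return t["balance"] > 1000


def canon(r):
    """Order-independent form of a relation, so results can be compared."""
    return sorted(tuple(sorted(t.items())) for t in r)


joined = natural_join(customer, deposit)

# (a) One conjunctive selection applied after the join.
combined = [t for t in joined if city_ok(t) and balance_ok(t)]

# (b) The same predicate split into a cascade of two selections.
cascaded = [t for t in [u for u in joined if balance_ok(u)] if city_ok(t)]

# (c) Each selection pushed down to the relation that owns its attribute.
small_customer = [t for t in customer if city_ok(t)]
small_deposit = [t for t in deposit if balance_ok(t)]
pushed = natural_join(small_customer, small_deposit)

print(canon(combined) == canon(cascaded) == canon(pushed))
print(len(joined), len(small_customer), len(small_deposit))
```

All three forms select the same tuples here, but form (c) joins two three-tuple inputs instead of filtering a five-tuple join result. The pushdown is valid only because each conjunct refers to attributes of a single relation.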

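**Second Transformational Rule:**

- Replace expressions of the form $\sigma_{P_1 \wedge P_2}(e)$ by $\sigma_{P_1}(\sigma_{P_2}(e))$, where $P_1$ and $P_2$ are predicates and *e* is a relational algebra expression.
- Generally,
  $$\sigma_{P_1}(\sigma_{P_2}(e)) = \sigma_{P_2}(\sigma_{P_1}(e)) = \sigma_{P_1 \wedge P_2}(e)$$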
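
One way an optimizer might apply this rule is as a rewrite over an expression tree. The sketch below is illustrative only: the node classes (`Rel`, `Pred`, `And`, `Select`, `Join`) and the function names are hypothetical, and predicates are kept as opaque strings rather than being evaluated.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Rel:
    name: str  # a base relation, e.g. customer


@dataclass(frozen=True)
class Pred:
    text: str  # an atomic predicate, kept as an opaque string


@dataclass(frozen=True)
class And:
    left: "Cond"
    right: "Cond"


@dataclass(frozen=True)
class Select:
    cond: "Cond"
    child: "Expr"


@dataclass(frozen=True)
class Join:
    left: "Expr"
    right: "Expr"


Cond = Union[Pred, And]
Expr = Union[Rel, Select, Join]


def cascade(cond, child):
    """Turn sigma_{P1 AND P2}(child) into sigma_{P1}(sigma_{P2}(child))."""
    if isinstance(cond, And):
        return cascade(cond.left, cascade(cond.right, child))
    return Select(cond, child)


def split_conjunctions(e):
    """Apply the rule to every selection in the expression tree."""
    if isinstance(e, Select):
        return cascade(e.cond, split_conjunctions(e.child))
    if isinstance(e, Join):
        return Join(split_conjunctions(e.left), split_conjunctions(e.right))
    return e  # a base relation is left unchanged


city = Pred('ccity = "Port Chester"')
rich = Pred("balance > 1000")
base = Join(Rel("customer"), Rel("deposit"))

query = Select(And(city, rich), base)
rewritten = split_conjunctions(query)
expected = Select(city, Select(rich, base))

print(rewritten == expected)
```

After this rewrite each `Select` node carries a single atomic predicate, so a later pass can consider moving each one toward the relation whose attributes it mentions, as in the *customer* and *deposit* example above.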
