You all may have used IN, NOT IN, EXISTS, and NOT EXISTS or have seen at one point in their IT career. Many believe SQL keyword IN is similar to EXIST and SQL keyword NOT IN is equal to NOT EXISTS. Are they similar or different? Which is better for performance. Also, learn when to use one versus other. As a bonus, I will show your their JOIN equivalent.
SYNTAX: IN
Based on rule optimizer:
EXISTS is much faster than IN, when the sub-query results is very large.
IN is faster than EXISTS, when the sub-query results is very small.
Based on cost optimizer:
There is no difference.
After Oracle 10g, you dont see any execution plan difference because Optimizer has evolved a lot since than.
SYNTAX: NOT IN
SYNTAX: IN
SELECT * FROM customers WHERE Customerid IN (SELECT customerid FROM orders);SYNTAX: EXISTS
SELECT * FROM Customers WHERE EXISTS ( SELECT * FROM Orders WHERE Orders.CustomerID = Customers.ID);SYNTAX: ALTERNATIVE TO EXISTS
SELECT * FROM Customers WHERE EXISTS ( SELECT 1 FROM Orders WHERE Orders.CustomerID = Customers.ID);SYNTAX: JOIN
SELECT Customers.* FROM Customers INNER JOIN Orders ON Customers.ID = Orders.CustomerID;EXIST keyword evaluates TRUE or FALSE, but IN keyword compares all value in the corresponding sub query column. This is why IN is less efficient than EXISTS. When you don't know whether the record is exist or not, always choose EXISTS. Before Oracle 10g, one should be careful on what to use because Oracle may get bad execution plan.
Based on rule optimizer:
EXISTS is much faster than IN, when the sub-query results is very large.
IN is faster than EXISTS, when the sub-query results is very small.
Based on cost optimizer:
There is no difference.
After Oracle 10g, you dont see any execution plan difference because Optimizer has evolved a lot since than.
SYNTAX: NOT IN
SELECT * FROM Customers WHERE ID NOT IN ( SELECT CustomerID FROM Orders);SYNTAX: NOT EXISTS
SELECT * FROM Customers WHERE NOT EXISTS ( SELECT NULL FROM Orders WHERE Orders.CustomerID = Customers.ID);
A NULL cannot be compared to any other value, not even another NULL. Therefore, a NOT IN operation would fail if the result set being probed returns a NULL. In such a case, the results of a NOT IN query is 0 rows while a NOT EXISTS query would still show the rows present in the one table but not in the other table.SYNTAX:JOIN
SELECT Customers.* FROM Customers LEFT JOIN Orders ON orders.ID = customers.CustomerID AND Orders.ID is NULL;Exists statement works faster than than with most Joins. Exists will give you a single record and will save the time also. In case of joins the number of records will be more and all the records must be used. This may not be the case on all scenario, therefore running execution plan before executing is a good start to find which performs better.
Interested in working with me? I can be reached at pbaniya04[at]gmail.com for any questions, consulting opportunities or you may drop a line to say HELLO. Thank your again for visiting my blog and looking forward to serving you more.
Have a Database-ious Day!
Have a Database-ious Day!
No comments